

Decision Problems Concerning Prime Words and Languages of the PCP

Marjo Lipponen
Turku Centre for Computer Science

TUCS Technical Report No 27, June 1996
ISBN 951-650-783-2
ISSN 1239-1891

Abstract

This paper investigates properties of prime words and prime languages obtained from the Post Correspondence Problem. We show that the properties of being a prime word or a finite prime language are decidable. We also present other characterization methods.

TUCS Research Group: Mathematical Structures of Computer Science

To be presented at the 8th International Conference on Automata and Formal Languages, Salgotarjan, Hungary, July 29 - August 2, 1996.

1 Prime solutions of the PCP

It has become customary, starting from [14], to consider three types of solutions for an instance of the Post Correspondence Problem that are in some sense simpler than the other solutions. A solution is termed F-prime if no (nonempty) final subword can be removed such that what remains is still a solution. In the same way, S-prime and P-prime solutions correspond to removing a subword and a scattered subword, respectively.

More specifically, by an instance of the Post Correspondence Problem we mean a pair $(g, h)$ of nonerasing morphisms $g, h : \Sigma^* \to \Delta^*$, where $\Sigma$ and $\Delta$ are finite alphabets, and by its solutions the words in the equality set

\[ E(g, h) = \{ w \in \Sigma^+ \mid g(w) = h(w) \}. \]

Hence a solution is a nonempty word whose images under $g$ and $h$ coincide. If $g(a) = h(a)$ for some $a \in \Sigma$, we call such an instance $(g, h)$ trivial, and if there exists $u \in \Delta^+$ such that $g(a) = u^{i_a}$ and $h(a) = u^{j_a}$ for all $a \in \Sigma$, then $(g, h)$ is called periodic.

The sets of F-prime, S-prime and P-prime solutions, based on removing a final subword, a subword or a scattered subword, are defined by

\[ F(g, h) = \{ w \in E(g, h) \mid \mathrm{fin}(w) \cap E(g, h) = \{w\} \}, \]
\[ S(g, h) = \{ w \in E(g, h) \mid \mathrm{sub}(w) \cap E(g, h) = \{w\} \}, \]
\[ P(g, h) = \{ w \in E(g, h) \mid \mathrm{scatsub}(w) \cap E(g, h) = \{w\} \}, \]

where

\[ \mathrm{fin}(w) = \{ v \mid w = vx \text{ for some } x \in \Sigma^* \}, \]
\[ \mathrm{sub}(w) = \{ v_1 v_2 \mid w = v_1 x v_2 \text{ for some } v_1, v_2, x \in \Sigma^* \}, \]
\[ \mathrm{scatsub}(w) = \{ v_1 \cdots v_k \mid w = x_1 v_1 \cdots x_k v_k x_{k+1} \text{ for some } x_i, v_i \in \Sigma^* \}. \]

The study of prime words, initiated in [9] and continued in [3-7], is an extension of that of prime solutions: every word that is an F-, S- or P-prime solution for some instance $(g, h)$ of the Post Correspondence Problem is called an F-, S- or P-word, respectively. Since every P-prime is an S-prime and every S-prime is an F-prime, prime words also form an increasing hierarchy. The details of this study can be found in [4, 5].

Prime languages can be defined in two ways, based on equality or inclusion. We say that a language $L$ is an F-, S- or P-language if, for some instance $(g, h)$, $L$ either equals or is a subset of the set $F(g, h)$, $S(g, h)$ or $P(g, h)$, respectively.
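The definitions above can be tried out on small inputs. The following is a minimal sketch, not taken from the report: the function names and the sample instance `g`, `h` are hypothetical, morphisms are represented as Python dictionaries from letters to strings, and `scatsub` enumerates all subsequences, so it is only practical for short words.

```python
from itertools import combinations

def fin(w):
    """Prefixes of w: what remains after deleting a final subword x (w = v x)."""
    return {w[:i] for i in range(len(w) + 1)}

def sub(w):
    """Words v1 v2 with w = v1 x v2: delete one contiguous subword x."""
    n = len(w)
    return {w[:i] + w[j:] for i in range(n + 1) for j in range(i, n + 1)}

def scatsub(w):
    """Scattered subwords (subsequences) of w; exponential in len(w)."""
    n = len(w)
    return {"".join(w[k] for k in idx)
            for r in range(n + 1)
            for idx in combinations(range(n), r)}

def image(m, w):
    """Image of the word w under the morphism m (a dict letter -> string)."""
    return "".join(m[a] for a in w)

def is_solution(g, h, w):
    """w belongs to E(g, h): w is nonempty and g(w) = h(w)."""
    return w != "" and image(g, w) == image(h, w)

def is_prime(g, h, w, reduce_set):
    """w is a solution and no other word of reduce_set(w) is a solution.

    Pass fin, sub or scatsub as reduce_set to test F-, S- or P-primality.
    """
    return is_solution(g, h, w) and all(
        not is_solution(g, h, v) for v in reduce_set(w) if v != w
    )

# Hypothetical instance over the alphabet {1, 2}, chosen for illustration.
g = {"1": "a", "2": "bb"}
h = {"1": "ab", "2": "b"}
```

With this instance, `is_prime(g, h, "12", scatsub)` holds, while `"1212"` is a solution that is not F-prime because its prefix `"12"` is already a solution.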

Though the Post Correspondence Problem [10] is most famous for its use in undecidability proofs [11], many properties of prime words and languages turn out to be decidable. We study the membership problems of prime words and languages as well as other methods to characterize them. The last section is devoted to the binary alphabet. The results presented in this paper are based on a recently published dissertation [6]. For further details of formal language theory we refer to [12].

2 Decision problems

We start by showing that for any word $w$ we can decide whether $w$ is a P-prime, an S-prime or an F-prime solution for some instance $(g, h)$.

Theorem 2.1 Each of the properties of being a P-word, an S-word or an F-word is decidable.

Proof: To check whether a given word $w$ is a P-word we have to consider the set $\mathrm{scatsub}(w)$ of its scattered subwords, which is always finite. Using the result of Makanin [8] and the possibility of translating an inequality into several systems of equations as in [1] and [2], we can test whether the following system has a solution, i.e., whether there are $g$ and $h$ such that

\[ g(w) = h(w), \qquad g(v) \neq h(v) \ \text{ for all } v \in \mathrm{scatsub}(w) \setminus \{w, \lambda\}, \]

where $\lambda$ is the empty word. If so, then $w$ is a P-prime solution for this instance $(g, h)$; otherwise $w$ is not a P-word. For S-words and F-words we similarly use the sets $\mathrm{sub}(w)$ and $\mathrm{fin}(w)$, respectively. □

This first result actually tells us very little about the nature of the three types of prime words, since Makanin's general algorithm is very complicated. This is why we seek other ways to characterize these words.

We start with two important notions. The basic Parikh vector $\psi'(w)$ is obtained from the Parikh vector $\psi(w)$ by dividing it by the greatest common divisor of its components. Parikh vectors can also be used to make comparisons between words. We say that a word $u$ is Parikh shorter than $v$ if $\psi(u) \le \psi(v)$ componentwise.
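The two notions just introduced are easy to compute. The sketch below is illustrative only: the function names are hypothetical, the alphabet is passed explicitly as a string of letters in a fixed order, and "Parikh shorter" is read as a componentwise less-than-or-equal comparison, which is an assumption about the relation intended in the text.

```python
from functools import reduce
from math import gcd

def parikh(w, alphabet):
    """Parikh vector of w: number of occurrences of each letter of alphabet."""
    return tuple(w.count(a) for a in alphabet)

def basic_parikh(w, alphabet):
    """Parikh vector divided by the gcd of its components.

    For the empty word the gcd is 0, and the zero vector is returned unchanged.
    """
    v = parikh(w, alphabet)
    d = reduce(gcd, v, 0)
    return v if d == 0 else tuple(x // d for x in v)

def parikh_shorter(u, v, alphabet):
    """Assumed reading: parikh(u) <= parikh(v) in every component."""
    return all(x <= y for x, y in zip(parikh(u, alphabet), parikh(v, alphabet)))
```

For instance, over the alphabet `"123"` the word `311132223` has Parikh vector `(3, 3, 3)` and basic Parikh vector `(1, 1, 1)`.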

The following result was established in [5]. It gives us an effective tool for characterizing prime words.

Lemma 2.2 For each word $w$ we can effectively find an instance of the Post Correspondence Problem such that for all words $w'$ which are Parikh shorter than $w$, $w'$ is also a solution if and only if $\psi'(w') = \psi'(w)$.

A word is said to be ratioprimitive (resp. subratioprimitive) if none of its proper prefixes (resp. subwords) has the same basic Parikh vector as the whole word.

Theorem 2.3 A nonempty word $w$ is an F-word if and only if it is ratioprimitive.

Theorem 2.3 gives an effective algorithm for F-words: for a given word it is easy to check, even in polynomial time, whether any of its prefixes has the same basic Parikh vector as the whole word. For S-words we have found only a partial algorithm.

Theorem 2.4 Every subratioprimitive word is an S-word.

The converse of the previous theorem does not hold; for instance, the word 311132223 is an S-word but not subratioprimitive. For P-words the algorithm is also partial. The following theorem shows what kind of words can appear as P-words. The first result follows directly from Lemma 2.2 and the second one is due to [9].

Theorem 2.5 The word $w$ is a P-word if
1. $\psi(w) = \psi'(w)$, or
2. $w = a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n}$ where $\Sigma = \{a_1, \ldots, a_n\}$ and $n \ge 2$.

These results are not exhaustive either; for instance, the words 121233, 122313, 122133, 122331, 121323 are all P-words. The results can be improved, however, with a new restriction. We say that a word $w$ is periodicity forcing if every instance $(g, h)$ for which $w$ is in $E(g, h)$ is periodic or trivial.

Lemma 2.6 If $w$ is periodicity forcing and $\psi(w) \neq \psi'(w)$ (resp. $w$ is not subratioprimitive) then $w$ is not a P-word (resp. an S-word).
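The ratioprimitivity test behind Theorem 2.3, and the sufficient condition of Theorem 2.4, can be sketched directly from the definitions. This is a minimal illustration with hypothetical function names; it assumes a nonempty input word, takes "proper" prefixes and subwords to be nonempty and different from the whole word, and reads "subword" as a contiguous factor.

```python
from functools import reduce
from math import gcd

def basic_parikh(w, alphabet):
    """Parikh vector of w divided by the gcd of its components."""
    v = tuple(w.count(a) for a in alphabet)
    d = reduce(gcd, v, 0)
    return v if d == 0 else tuple(x // d for x in v)

def is_ratioprimitive(w, alphabet):
    """No proper nonempty prefix shares the basic Parikh vector of w."""
    target = basic_parikh(w, alphabet)
    return all(basic_parikh(w[:i], alphabet) != target
               for i in range(1, len(w)))

def is_subratioprimitive(w, alphabet):
    """No proper nonempty contiguous subword shares the basic Parikh vector."""
    target = basic_parikh(w, alphabet)
    n = len(w)
    return all(basic_parikh(w[i:j], alphabet) != target
               for i in range(n)
               for j in range(i + 1, n + 1)
               if j - i < n)
```

On the example from the text, `311132223` passes the prefix test but fails the subword test, because its factor `132` already has basic Parikh vector `(1, 1, 1)`. The prefix test as written recomputes each vector from scratch; a running count would make it linear in the word length for a fixed alphabet.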

Unfortunately, periodicity forcing words have not been characterized any better than prime words; see [6] for details. On the other hand, the converse of Lemma 2.6 does not hold; for instance, the word 1212123123, which is not subratioprimitive, is not an S-word either (see [4]) even though it is not periodicity forcing.

The following result on prime languages can be viewed as an extension of Theorem 2.1.

Theorem 2.7 For finite languages each of the properties of being a P-, an S- and an F-language in the inclusion sense is decidable.

Proof: In order to decide whether a given language $L = \{w_1, \ldots, w_n\}$ is a P-language we apply Makanin's result [8] to the finite system

\[
\begin{aligned}
g(w_1) &= h(w_1), \\
&\;\;\vdots \\
g(w_n) &= h(w_n), \\
g(u_1) &\neq h(u_1) \quad \text{for all } u_1 \in \mathrm{scatsub}(w_1) \setminus \{w_1, \lambda\}, \\
&\;\;\vdots \\
g(u_n) &\neq h(u_n) \quad \text{for all } u_n \in \mathrm{scatsub}(w_n) \setminus \{w_n, \lambda\}.
\end{aligned}
\tag{1}
\]

If this system has a solution, that is, if the equations and inequalities hold for some $g$ and $h$, then $L$ is a subset of $P(g, h)$; otherwise, $L$ cannot be a P-language. For S- or F-languages we similarly use the sets sub and fin. □

What is the situation with infinite languages? Since P-languages are always finite (see [14]), no infinite language is a P-language, so this property is decidable in the infinite case as well. On the other hand, for S- and F-languages we can no longer apply the previous argument, since the system of equations is infinite. By Ehrenfeucht's conjecture (see [13] for details) every language $L$ possesses a finite subset $D$, called a test set, such that whenever $g$ and $h$ are two morphisms defined on $\Sigma^*$ and satisfying $g(w) = h(w)$ for every $w$ in $D$, then $g(w) = h(w)$ holds for every $w$ in $L$. The construction of $D$ is not effective in general but is effective, for instance, for context-free languages. Even with this in mind we cannot be sure that all the members of $L$ are F- or S-prime solutions, even if this is the case for $D$.

In the same context [13], a more general result was proved: every system of word equations possesses a finite subsystem equivalent to the original system. Hence, if we construct a system similar to (1), the previous result implies that there exists a finite subsystem which has exactly the same solutions, and Makanin's result is again applicable. However, the construction of the finite subsystem is not effective here either. Nor do we have any obvious subclasses for which the construction is effective, since it is possible that the subsystem is not the same as the test set for the given language.

For prime languages defined in the equality sense, the argument in Theorem 2.7 is also not sufficient. If the system of equations fails to have a solution, then the language cannot be a prime language in this sense either; otherwise, we do not know whether the language under examination contains all or only some of the prime solutions of $P(g, h)$, $S(g, h)$ or $F(g, h)$.

3 Binary case

In this section we consider prime words and prime languages over a binary alphabet. It seems that in many cases the results form an exception compared with larger alphabets. The instances also seem to be much more limited, even though we lack exact evidence. Hence this section concentrates more on conjectures than on actual results. The first conjecture deals with P-words.

Conjecture 3.1 In a binary alphabet $w$ is a P-word if and only if

\[ \psi(w) = \psi'(w) \quad \text{or} \quad w \in a^+ b^+ \cup b^+ a^+. \tag{2} \]

It actually looks like any word which does not satisfy (2) is periodicity forcing. If this could be proved, Conjecture 3.1 would be a straightforward consequence of Lemma 2.6.

Another interesting question is the hierarchy of P-words and S-words. In [4, 5] we proved that in alphabets with at least three letters the inclusion is strict; in a binary alphabet, however, they seem to be equal.

Conjecture 3.2 Let $g$ and $h$ be morphisms over a binary alphabet. Then $P(g, h) = S(g, h)$.

Here it would suffice to show that there are no other equality sets generated by two words, apart from the sets $\{a, b\}$ and $\{a^i b, b a^i\}$, $i \ge 1$. This is also closely connected with the following conjecture on prime languages (with at least two words) defined in the equality sense.

Conjecture 3.3 In the binary alphabet $\{a, b\}$ the only P- and S-languages are
1. $\{a^i b, b a^i\}$, $i \ge 1$, and
2. $c(\{a^i b^j\})$ with $\gcd(i, j) = 1$,
and the only F-languages are, in addition to 1.,
2'. $\{ w \in \Sigma^+ \mid \psi'(w) = (i, j) \text{ and } w \text{ is ratioprimitive} \}$ for some $i, j \ge 1$.

Here the set $c(\{a^i b^j\})$ with $\gcd(i, j) = 1$ consists of all the words which are permutations of $a^i b^j$. For instance, $c(\{a^1 b^3\}) = \{abbb, babb, bbab, bbba\}$.

For prime languages with at least three words we have better results.

Theorem 3.4 The properties of being a P- or an S-language are both decidable for languages with cardinality at least three over a binary alphabet.

Theorem 3.5 If $L$ is a finite language with cardinality at least three over a binary alphabet then $L$ is not an F-language.

The first result also holds for prime languages defined in the inclusion sense, but the second one does not. In fact the inclusion definition carries much more information about prime languages.

Theorem 3.6 $L \subseteq \{a, b\}^*$ is an F-language (in the inclusion sense) if and only if its members are ratioprimitive and have the same basic Parikh vector.

Proof: The "only if" part follows from Theorem 2.3 and the fact that in a binary alphabet all the members of the equality set must have the same basic Parikh vector. On the other hand, if the words of $L$ have the same basic Parikh vector then they must be solutions for some periodic instance $(g, h)$, whereas ratioprimitiveness guarantees that they are, indeed, F-prime solutions. By Lemma 2.2 any such instance $(g, h)$ can be effectively constructed. □

For S- and P-languages the following theorem expresses a sufficient condition, but we conjecture that the condition is also necessary.

Theorem 3.7 $L \subseteq \{a, b\}^*$ is a P- and an S-language (in the inclusion sense) if either its members have the same basic Parikh vector and $\psi(w) = \psi'(w)$ for each $w \in L$, or $L \subseteq \{a^i b, b a^i\}$ (or symmetrically $L \subseteq \{a b^i, b^i a\}$), $i \ge 1$.

In larger alphabets, however, we lack any similar knowledge of equality sets. Thus the only extension of Theorem 3.6 we can prove is that a given language $L$ is an F-language only if its members are ratioprimitive.
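For finite languages over the binary alphabet, the permutation set used in Conjecture 3.3 and the criterion of Theorem 3.6 can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, words are Python strings over the letters `a` and `b`, all words are assumed nonempty, and the criterion is applied exactly as stated in Theorem 3.6 rather than derived from any instance $(g, h)$.

```python
from itertools import permutations
from math import gcd

def perm_closure(word):
    """All distinct rearrangements of the letters of word."""
    return {"".join(p) for p in permutations(word)}

def basic_parikh_ab(w):
    """Basic Parikh vector of w over the ordered alphabet (a, b)."""
    i, j = w.count("a"), w.count("b")
    d = gcd(i, j)
    return (i, j) if d == 0 else (i // d, j // d)

def is_ratioprimitive_ab(w):
    """No proper nonempty prefix has the basic Parikh vector of w."""
    target = basic_parikh_ab(w)
    return all(basic_parikh_ab(w[:k]) != target for k in range(1, len(w)))

def is_f_language_inclusion(language):
    """Criterion of Theorem 3.6 for a finite set of nonempty binary words:
    all words share one basic Parikh vector and each is ratioprimitive."""
    vectors = {basic_parikh_ab(w) for w in language}
    return len(vectors) <= 1 and all(is_ratioprimitive_ab(w) for w in language)
```

For example, `perm_closure("abbb")` reproduces the four words listed for $a^1 b^3$, and `{"ab", "abab"}` fails the criterion because `abab` has the prefix `ab` with the same basic Parikh vector. Enumerating permutations grows factorially, so `perm_closure` is meant for short words only.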

References

[1] K. Culik II, J. Karhumäki: On the equality sets for homomorphisms on free monoids with two generators, RAIRO Inform. Theor. 14 (1980) 349-369.
[2] K. Culik II, J. Karhumäki: Systems of equations over a free monoid and Ehrenfeucht's Conjecture, Discrete Math. 43 (1983) 139-153.
[3] M. Lipponen: Primitive words and languages associated to PCP, EATCS Bull. 53 (1994) 217-226.
[4] M. Lipponen: Post Correspondence Problem: words possible as primitive solutions, Proc. 22nd ICALP, Springer LNCS 944 (1995) 63-74.
[5] M. Lipponen: On F-prime solutions of the Post Correspondence Problem, 2nd Internat. Conf. on Developments in Language Theory, Magdeburg, 1995, to appear.
[6] M. Lipponen: On primitive solutions of the Post Correspondence Problem, TUCS Dissertations No. 1 (1996).
[7] M. Lipponen, Gh. Paun: Strongly prime PCP words, Discrete Appl. Math. 63 (1995) 193-197.
[8] G.S. Makanin: The problem of solvability of equations in a free semigroup (in Russian), Mat. Sb. 103 No. 145 (1977) 148-236.
[9] A. Mateescu, A. Salomaa: PCP-prime words and primality types, RAIRO Inform. Theor. 27 (1993) 57-70.
[10] E. Post: A variant of a recursively unsolvable problem, Bull. Amer. Math. Soc. 53 (1946) 264-268.
[11] G. Rozenberg, A. Salomaa: Cornerstones of Undecidability, Prentice Hall (1994).
[12] G. Rozenberg, A. Salomaa (ed.): Handbook of Formal Languages, I-III, Springer-Verlag, forthcoming.
[13] A. Salomaa: The Ehrenfeucht conjecture: a proof for language theorists, EATCS Bull. 27 (1985) 71-82.
[14] A. Salomaa, K. Salomaa, Sheng Yu: Primality types of instances of the Post Correspondence Problem, EATCS Bull. 44 (1991) 226-241.

Turku Centre for Computer Science Lemminkaisenkatu 14 FIN-20520 Turku Finland http://www.tucs.abo. University of Turku Department of Mathematical Sciences Abo Akademi University Department of Computer Science Institute for Advanced Management Systems Research Turku School of Economics and Business Administration Institute of Information Systems Science