CDMTCS Research Report Series. Quasiperiods, Subword Complexity and the Smallest Pisot Number

Ronny Polley, itcampus Software- und Systemhaus GmbH, Leipzig
Ludwig Staiger, Martin-Luther-Universität Halle-Wittenberg

CDMTCS-479, March 2015, revised September 2016
Centre for Discrete Mathematics and Theoretical Computer Science

Ronny Polley
IT Sonix custom development GmbH, Georgiring 3, D-04103 Leipzig, Germany
email: r.polley@itsonix.eu

Ludwig Staiger
Martin-Luther-Universität Halle-Wittenberg, Institut für Informatik, von-Seckendorff-Platz 1, D-06099 Halle (Saale), Germany
email: ludwig.staiger@informatik.uni-halle.de

Abstract

A quasiperiod of a finite or infinite string/word is a word whose occurrences cover every part of the string. A word or an infinite string is referred to as quasiperiodic if it has a quasiperiod. It is obvious that a quasiperiodic infinite string cannot have every word as a subword (factor). Therefore, the question arises how large the set of subwords of a quasiperiodic infinite string can be [Mar04]. Here we show that, on the one hand, the maximal subword complexity of quasiperiodic infinite strings and, on the other hand, the size of the sets of maximally complex quasiperiodic infinite strings are both intimately related to the smallest Pisot number $t_P$ (also known as the plastic constant). We provide an exact estimate on the maximal subword complexity for quasiperiodic infinite words.

Keywords: quasiperiodic ω-words, subword complexity, Hausdorff measure

Contents

1 Notation
2 Quasiperiodicity
2.1 General properties
3 Hausdorff Dimension and Hausdorff Measure
3.1 General properties
3.2 The Hausdorff measure of $P_{aba}^\omega$ and $P_{aabaa}^\omega$
3.3 The Hausdorff measure of $\mathrm{suff}(P_{aba}^\omega)$ and $\mathrm{suff}(P_{aabaa}^\omega)$
4 Subword Complexity
4.1 The subword complexity of quasiperiodic ω-words
4.2 Quasiperiods of maximal subword complexity

In his tutorial [Mar04] Solomon Marcus discussed some open questions on quasiperiodic infinite words. Soon after its publication Levé and Richomme answered some of the open problems (see [LR04]). In connection with Marcus' Question 2 they presented a quasiperiodic infinite word (with quasiperiod aba) of exponential subword complexity, and they posed the new question of what the maximal complexity of a quasiperiodic infinite word is.

In a recent paper [PS10] we estimated the maximal asymptotic (in the sense of [Sta12]) subword complexity of quasiperiodic infinite words. More precisely, it is shown in [PS10] that every quasiperiodic infinite word ξ has at most $f(\xi,n) \le O(1)\cdot t_P^n$ factors (subwords) of length n, where $t_P$ is the smallest Pisot number, that is, the unique positive root of the polynomial $t^3 - t - 1$. Moreover, the general construction of Section 5 in [Sta93] yields quasiperiodic infinite words achieving this bound. In fact, Levé's and Richomme's example [LR04] also meets this asymptotic upper bound $O(1)\cdot t_P^n$.

Surprisingly, it turned out in [PS10] that there are also infinite words meeting this bound that have a different word, aabaa, as quasiperiod. Moreover, it was shown that all other quasiperiods yield infinite words asymptotically below this bound.

The aim of this paper is to compare the two maximal quasiperiods aba and aabaa in order to determine which of them yields infinite words of greater complexity. We compare them in two respects:

1. Which one of the words $q \in \{aba, aabaa\}$ generates the larger set (ω-language) of infinite words having q as quasiperiod, and
2. which one of the words aba or aabaa generates an ω-word $\xi_q$ having a maximal subword function $f(\xi_q, n)$?

As a measure of ω-languages in Item 1 we use the Hausdorff dimension and Hausdorff measure of a subset of the Cantor space of infinite words (ω-words). We obtain that, when neglecting the fixed prefix q of quasiperiodic ω-words having the quasiperiod q, the sets of ω-words having quasiperiod aba or aabaa have the same Hausdorff dimension $\log_r t_P$ and the same Hausdorff measure $t_P$, both values showing the close connection to the smallest Pisot number.

A difference between these quasiperiods appears when we consider the subword function $f(\xi,n)$. It turns out that aabaa is the quasiperiod having the maximally achievable subword complexity for quasiperiodic ω-words.

1 Notation

In this section we introduce the notation used throughout the paper. By $\mathbb{N} = \{0, 1, 2, \ldots\}$ we denote the set of natural numbers. Let X be an alphabet of cardinality $|X| = r \ge 2$. By $X^*$ we denote the set of finite words on X, including the empty word e, and $X^\omega$ is the set of infinite strings (ω-words) over X. Subsets of $X^*$ will be referred to as languages and subsets of $X^\omega$ as ω-languages.

For $w \in X^*$ and $\eta \in X^* \cup X^\omega$ let $w \cdot \eta$ be their concatenation. This concatenation product extends in an obvious way to subsets $L \subseteq X^*$ and $B \subseteq X^* \cup X^\omega$. For a language L let $L^* := \bigcup_{i \in \mathbb{N}} L^i$, and by $L^\omega := \{w_1 \cdots w_i \cdots : w_i \in L \setminus \{e\}\}$ we denote the set of infinite strings formed by concatenating words in L. Furthermore $|w|$ is the length of the word $w \in X^*$ and $\mathrm{pref}(B)$ is the set of all finite prefixes of strings in $B \subseteq X^* \cup X^\omega$.

We shall abbreviate $w \in \mathrm{pref}(\eta)$ ($\eta \in X^* \cup X^\omega$) by $w \sqsubseteq \eta$, and we write $w \sqsubset \eta$ if, in addition, $w \ne \eta$. We denote by $B/w := \{\eta : w \cdot \eta \in B\}$ the left derivative of the set $B \subseteq X^* \cup X^\omega$. As usual, a language $L \subseteq X^*$ is regular provided it is accepted by a finite automaton. An equivalent condition is that its set of left derivatives $\{L/w : w \in X^*\}$ is finite.

The sets of infixes of B or η are $\mathrm{infix}(B) := \bigcup_{w \in X^*} \mathrm{pref}(B/w)$ and $\mathrm{infix}(\eta) := \bigcup_{w \in X^*} \mathrm{pref}(\{\eta\}/w)$, respectively. Similarly, $\mathrm{suff}(B) := \bigcup_{w \in X^*} B/w$ is the set of suffixes of elements of B.

In the sequel we assume the reader to be familiar with basic facts of language theory.

2 Quasiperiodicity

2.1 General properties

A finite or infinite word $\eta \in X^* \cup X^\omega$ is referred to as quasiperiodic with quasiperiod $q \in X^* \setminus \{e\}$ provided for every $j < |\eta| \in \mathbb{N} \cup \{\infty\}$ there is a prefix $u_j \sqsubseteq \eta$ of length $j - |q| < |u_j| \le j$ such that $u_j \cdot q \sqsubseteq \eta$, that is, for every $w \sqsubset \eta$ the relation $u_{|w|} \sqsubseteq w \sqsubset u_{|w|} \cdot q$ is valid (cf. [LR04, Mar04, Mou00]).

Next we introduce the finite language $P_q$ ($L(q)$ in [Mou00]) which generates the set of quasiperiodic ω-words having quasiperiod q. We set

$$P_q := \{v : e \sqsubset v \sqsubseteq q \sqsubset v \cdot q\} = \{v : \exists w\,(w \sqsubset q \wedge v \cdot w = q)\}. \tag{1}$$

Corollary 4 in [PS10] yields the following characterisation of ω-words having quasiperiod $q \in X^* \setminus \{e\}$:

$$\xi \text{ has quasiperiod } q \text{ if and only if } \mathrm{pref}(\xi) \subseteq \mathrm{pref}(P_q^\omega). \tag{2}$$

We list some further properties of the set of quasiperiodic ω-words which will be useful in the sequel.

Proposition 1 There is a $V \subseteq \mathrm{infix}(P_q^{|q|})$ such that

$$P_q^\omega = \{\xi : \mathrm{pref}(\xi) \subseteq \mathrm{pref}(P_q^\omega)\}, \tag{3}$$

$$P_q^\omega = q \cdot V^\omega. \tag{4}$$

Proof. Since $P_q$ is finite, Eq. (3) follows from Eq. (2). For the proof of the second identity observe that every word in $P_q^{|q|}$ starts with the quasiperiod q. Then the assertion follows from the identity $P_q^\omega = (P_q^k)^\omega$, $k \ge 1$, and the rotation property $(W_1 \cdot W_2)^\omega = W_1 \cdot (W_2 \cdot W_1)^\omega$ where $W_1, W_2 \subseteq X^*$.
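The definition of $P_q$ in Eq. (1) and the covering condition for quasiperiods can be tried out on small examples. The following Python sketch is only an illustration: the function names `quasiperiod_generators` and `has_quasiperiod` are hypothetical and not taken from the paper, and the covering test handles finite words only.

```python
from itertools import product


def quasiperiod_generators(q: str) -> set[str]:
    """P_q from Eq. (1): non-empty prefixes v of q such that q is a prefix of v.q."""
    return {q[:k] for k in range(1, len(q) + 1) if (q[:k] + q).startswith(q)}


def has_quasiperiod(w: str, q: str) -> bool:
    """Check whether the occurrences of q cover every position of the finite word w."""
    covered_up_to = 0  # all positions < covered_up_to are covered so far
    for start in range(len(w) - len(q) + 1):
        if start > covered_up_to:
            return False  # position covered_up_to lies in no occurrence of q
        if w.startswith(q, start):
            covered_up_to = start + len(q)
    return len(w) > 0 and covered_up_to == len(w)


for q in ("aba", "aabaa"):
    gens = sorted(quasiperiod_generators(q), key=len)
    print(q, gens)
    # Every product of words from P_q, followed by q, is covered by q.
    for k in range(1, 4):
        for factors in product(gens, repeat=k):
            assert has_quasiperiod("".join(factors) + q, q)
```

For q = aabaa the computed set agrees with $P_{aabaa} = \{aab, aaba, aabaa\}$ as used in the text.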

Eq. (3) shows that the set of quasiperiodic ω-words $P_q^\omega$ belongs to the class of ω-languages $F \subseteq X^\omega$ satisfying the property $F = \{\xi : \xi \in X^\omega \wedge \mathrm{pref}(\xi) \subseteq \mathrm{pref}(F)\}$. Those ω-languages are the ones closed in the Cantor topology of the set $X^\omega$ (see [Sta97, Tho90]). Moreover, if $\mathrm{pref}(F)$ is a regular language, these ω-languages F can be accepted by every deterministic finite automaton $B = (X, S, s_0, \delta, S_F)$ accepting $\mathrm{pref}(F)$ (see Fig. 1):

$$\xi \in F \leftrightarrow \forall w\,(w \in \mathrm{pref}(\xi) \to \delta(s_0, w) \in S_F). \tag{5}$$

[Figure 1: Deterministic partial automaton accepting $P_{aba}^\omega$, all states being final.]

The proof of Eq. (4) yields a rather large language V. Using the rotation property successively, starting with $P_q$, yields a more concise version of V. We exhibit this for the set $P_{aabaa} = \{aab, aaba, aabaa\}$.

Example 2

$$P_{aabaa}^\omega = \{aab, aaba, aabaa\}^\omega = (aab \cdot \{e, a, aa\})^\omega = aab \cdot \{aab, aaab, aaaab\}^\omega = aabaa \cdot \{baa, abaa, aabaa\}^\omega$$

Observe that the resulting language $\{baa, abaa, aabaa\}$ is prefix-free. Moreover, $\{baa, abaa, aabaa\} = R(aabaa)$ in terms of [Mou00]. In the same way one obtains $P_{aba}^\omega = aba \cdot \{ba, aba\}^\omega$.

3 Hausdorff Dimension and Hausdorff Measure

3.1 General properties

First, we briefly describe the basic formulae needed for the definition of Hausdorff measure and Hausdorff dimension of a subset of $X^\omega$. For more background see the textbooks [Edg08, Fal90] or Section 1 in [MS94]. In the setting of languages and ω-languages this can be read as follows (see [MS94, Sta93]). For $F \subseteq X^\omega$, $r = |X| \ge 2$ and $0 \le \alpha \le 1$ the equation

$$\mathrm{IL}_\alpha(F) := \lim_{l \to \infty} \inf\Big\{ \sum_{w \in W} r^{-\alpha |w|} : F \subseteq W \cdot X^\omega \wedge \forall w\,(w \in W \to |w| \ge l) \Big\} \tag{6}$$

defines the α-dimensional metric outer measure on $X^\omega$. The measure $\mathrm{IL}_\alpha$ satisfies the following properties (see [MS94, Sta93, Sta15]).

Proposition 3 Let $F \subseteq X^\omega$, $V \subseteq X^*$ and $\alpha \in [0,1]$.

1. If $\mathrm{IL}_\alpha(F) < \infty$, then $\mathrm{IL}_{\alpha+\varepsilon}(F) = 0$ for all $\varepsilon > 0$.
2. The scaling property $\mathrm{IL}_\alpha(w \cdot F) = r^{-\alpha |w|} \cdot \mathrm{IL}_\alpha(F)$ holds.

Then the Hausdorff dimension of F is defined as

$$\dim F := \sup\{\alpha : \alpha = 0 \vee \mathrm{IL}_\alpha(F) = \infty\} = \inf\{\alpha : \mathrm{IL}_\alpha(F) = 0\}.$$

It should be mentioned that dim is countably stable and invariant under scaling, that is, for $F_i \subseteq X^\omega$ we have

$$\dim \bigcup_{i \in \mathbb{N}} F_i = \sup\{\dim F_i : i \in \mathbb{N}\} \quad\text{and}\quad \dim w \cdot F_0 = \dim F_0. \tag{7}$$

If $\alpha = \dim F$ then we call $\mathrm{IL}_\alpha(F)$ the Hausdorff measure of F.

We have the following relation between a language of finite words V and the Hausdorff dimension of its ω-power $V^\omega$.

Proposition 4 ([Sta93, Eq. (6.2)]) Let $V \subseteq X^*$ and $V^\omega$ be non-empty. Then

$$\dim V^\omega = \limsup_{n \to \infty} \frac{1}{n} \log_r |V^* \cap X^n|.$$

3.2 The Hausdorff measure of $P_{aba}^\omega$ and $P_{aabaa}^\omega$

In Section 4.1 in [PS10] the value $\limsup_{n \to \infty} \frac{1}{n} \log_r |P_w^* \cap X^n|$, for $w \in \{aba, aabaa\}$, was found to be $\log_r t_P$, where $t_P$ is the smallest Pisot number, that is, the (unique) positive root of the polynomial $t^3 - t - 1$. Thus Proposition 4 shows that the Hausdorff dimension of $P_{aba}^\omega$ and $P_{aabaa}^\omega$ is $\log_r t_P$.

Concerning the Hausdorff measure of $P_{aba}^\omega$ and $P_{aabaa}^\omega$, we consider the identities $P_{aabaa}^\omega = aabaa \cdot \{baa, abaa, aabaa\}^\omega$ and $P_{aba}^\omega = aba \cdot \{ba, aba\}^\omega$ derived in Example 2. Here the languages $\{baa, abaa, aabaa\}$ and $\{ba, aba\}$ are prefix-free. Therefore we can apply Theorem 4 of [Sta05], which gives the following general formula for the Hausdorff measure of $V^\omega$ for prefix-free languages $V \subseteq X^*$.

Theorem 5 ([Sta05, Theorem 4]) Let $V \subseteq X^*$ be prefix-free and $\alpha = \dim V^\omega$. Then

$$\mathrm{IL}_\alpha(V^\omega) = \begin{cases} 0, & \text{if } \sum_{u \in V} r^{-\alpha |u|} < 1, \text{ and} \\ \inf\Big\{ \Big( \sum_{wv \in V} r^{-\alpha |v|} \Big)^{-1} : w \in \mathrm{pref}(V) \Big\}, & \text{if } \sum_{u \in V} r^{-\alpha |u|} = 1. \end{cases}$$
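The hypothesis $\sum_{u \in V} r^{-\alpha|u|} = 1$ of Theorem 5 can be checked numerically for the two prefix-free languages from Example 2. The sketch below assumes the binary alphabet $X = \{a, b\}$ (so r = 2) and approximates $t_P$ by bisection. The names `smallest_pisot` and `weight_sum` are hypothetical helpers introduced only for this illustration.

```python
import math


def smallest_pisot(tol: float = 1e-14) -> float:
    """Approximate the unique positive root of t^3 - t - 1 by bisection on [1, 2]."""
    lo, hi = 1.0, 2.0  # the polynomial is -1 at t = 1 and 5 at t = 2
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid**3 - mid - 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def weight_sum(words, alpha: float, r: int = 2) -> float:
    """Sum of r^(-alpha * |u|) over all words u of the given finite language."""
    return sum(r ** (-alpha * len(u)) for u in words)


T_P = smallest_pisot()
ALPHA = math.log(T_P, 2)  # Hausdorff dimension log_r t_P for r = 2 (assumption)

for V in (["ba", "aba"], ["baa", "abaa", "aabaa"]):
    # Both sums should equal 1 up to rounding, since t_P^3 = t_P + 1.
    print(V, weight_sum(V, ALPHA))
```

Both sums reduce to identities in $t_P$: $t_P^{-2} + t_P^{-3} = 1$ and $t_P^{-3} + t_P^{-4} + t_P^{-5} = 1$ follow from $t_P^3 = t_P + 1$, so the check does not actually depend on the choice r = 2.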

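Observe that $\sum_{v \in V} r^{-\alpha |v|} > 1$ implies $\alpha < \dim V^\omega$ (see e.g. Proposition 3 in [Sta05]).

Example 6 For $V = \{ba, aba\}$ we have

$$\begin{array}{l|cccc} w \in \mathrm{pref}(V) \setminus V & e & a & ab & b \\ \hline \sum_{wv \in V} r^{-\alpha |v|} & 1 & t_P^{-2} & t_P^{-1} & t_P^{-1} \end{array}$$

and for $V = \{baa, abaa, aabaa\}$

$$\begin{array}{l|ccccccc} w \in \mathrm{pref}(V) \setminus V & e & a & aa & aab & aaba & b & ba \\ \hline \sum_{wv \in V} r^{-\alpha |v|} & 1 & t_P^{-3} + t_P^{-4} & t_P^{-3} & t_P^{-2} & t_P^{-1} & t_P^{-2} & t_P^{-1} \end{array}$$

Since $t_P^{-3} + t_P^{-4} = t_P^{-1} < 1$, we obtain $\mathrm{IL}_\alpha(\{ba, aba\}^\omega) = \mathrm{IL}_\alpha(\{baa, abaa, aabaa\}^\omega) = 1$ and, in view of the scaling property Proposition 3(2), $\mathrm{IL}_\alpha(P_{aba}^\omega) = t_P^{-3}$ and $\mathrm{IL}_\alpha(P_{aabaa}^\omega) = t_P^{-5}$.

3.3 The Hausdorff measure of $\mathrm{suff}(P_{aba}^\omega)$ and $\mathrm{suff}(P_{aabaa}^\omega)$

The values $\mathrm{IL}_\alpha(P_{aabaa}^\omega) = t_P^{-5}$ and $\mathrm{IL}_\alpha(P_{aba}^\omega) = t_P^{-3}$, however, do not seem to represent the real size of the sets $P_{aba}^\omega$ and $P_{aabaa}^\omega$: all ω-words in $P_{aba}^\omega$ start with aba and all ω-words in $P_{aabaa}^\omega$ start with the longer word aabaa. Thus, in view of the scaling property Proposition 3(2), these prefixes contribute the factors $t_P^{-3}$ and $t_P^{-5}$, respectively, to the Hausdorff measure, whereas according to Example 6 the tails $\{ba, aba\}^\omega$ and $\{baa, abaa, aabaa\}^\omega$ both have Hausdorff measure 1.

In order to eliminate this scaling effect on the Hausdorff measures we consider instead the sets $\mathrm{suff}(P_q^\omega)$ of all tails (suffixes) of ω-words in $P_q^\omega$. In view of Eq. (7) we have $\dim \mathrm{suff}(P_q^\omega) = \dim \bigcup_{w \in X^*} P_q^\omega / w = \dim P_q^\omega$.

The following proposition enables us to derive a representation of $\mathrm{suff}(P_q^\omega)$ suitable for calculating its Hausdorff measure.

Proposition 7 Let the ω-language $F \subseteq X^\omega$ satisfy the condition $F = \{\xi : \xi \in X^\omega \wedge \mathrm{pref}(\xi) \subseteq \mathrm{pref}(F)\}$ and let its set of left derivatives $\{F/w : w \in X^*\}$ be finite. Then $\{\mathrm{suff}(F)/w : w \in X^*\}$ is also finite and $\mathrm{suff}(F) = \{\xi : \xi \in X^\omega \wedge \mathrm{pref}(\xi) \subseteq \mathrm{infix}(F)\}$.

Both of the assumptions in Proposition 7 are essential. First consider the assumption that the set of left derivatives be finite. The ω-language $E = \{a^\omega\} \cup \bigcup_{n \in \mathbb{N}} a^n b \cdot \{a,b\}^n \cdot a^\omega$ has infinitely many left derivatives, for $E/a^n b \ne E/a^m b$ unless $n = m$. We have $\mathrm{infix}(E) = \{a,b\}^*$ but $\mathrm{suff}(E) \ne \{a,b\}^\omega$. Next, $F = \{a,b\}^* \cdot a^\omega \subset \{a,b\}^\omega$ has $F/w = F/v$ for $w, v \in \{a,b\}^*$, but F does not satisfy the condition $F = \{\xi : \xi \in X^\omega \wedge \mathrm{pref}(\xi) \subseteq \mathrm{pref}(F)\}$. Here we have $\mathrm{suff}(F) = F \subset \{a,b\}^\omega = \{\xi : \mathrm{pref}(\xi) \subseteq \{a,b\}^*\}$.

Proof of Proposition 7. Let $V \subseteq X^*$ be finite such that $\{F/w : w \in X^*\} = \{F/w : w \in V\}$. Then in view of $\{\mathrm{suff}(F)/w : w \in X^*\} \subseteq \{\bigcup_{v \in V'} F/v : V' \subseteq V\}$ the set of left derivatives of $\mathrm{suff}(F)$ is obviously finite.

Concerning the second assertion, the inclusion "$\subseteq$" follows from $\mathrm{pref}(\{\xi\}/w) \subseteq \mathrm{infix}(F)$ whenever $w \in X^*$ and $\xi \in F$. Let now $\mathrm{pref}(\zeta) \subseteq \mathrm{infix}(F)$. Then for every $v \in \mathrm{pref}(\zeta)$ there is a $w_v$ with $v \in \mathrm{pref}(F/w_v)$. Since the set $\{F/w : w \in X^*\}$ is finite, there is a $w \in \mathrm{pref}(F)$ such that the language $W_{\zeta,w} := \{v : v \in \mathrm{pref}(\zeta) \wedge F/w_v = F/w\}$ is infinite. Then $\mathrm{pref}(W_{\zeta,w}) = \mathrm{pref}(\zeta)$, and from the assumption $F = \{\xi : \mathrm{pref}(\xi) \subseteq \mathrm{pref}(F)\}$ we obtain $w \cdot \zeta \in F$, that is, $\zeta \in \mathrm{suff}(F)$.

Proposition 7 yields $\mathrm{suff}(P_q^\omega) = \{\xi : \xi \in X^\omega \wedge \mathrm{pref}(\xi) \subseteq \mathrm{infix}(P_q^\omega)\}$. In Table 1 the partial automata $B_{aba} = (\{a,b\}, \{s_0, s_1, s_2, s_3\}, s_0, \delta_{aba})$ and $B_{aabaa} = (\{a,b\}, \{z_0, z_1, z_2, z_3, z_4, z_5, z_6\}, z_0, \delta_{aabaa})$ accepting the languages $\mathrm{infix}(P_{aba}^\omega)$ and $\mathrm{infix}(P_{aabaa}^\omega)$, respectively, are given.

$$\begin{array}{c|cccc} B_{aba} & s_0 & s_1 & s_2 & s_3 \\ \hline a & s_1 & s_2 & - & s_1 \\ b & s_3 & s_3 & s_3 & - \end{array} \qquad \begin{array}{c|ccccccc} B_{aabaa} & z_0 & z_1 & z_2 & z_3 & z_4 & z_5 & z_6 \\ \hline a & z_1 & z_2 & z_3 & z_4 & - & z_6 & z_2 \\ b & z_5 & z_5 & z_5 & z_5 & z_5 & - & - \end{array}$$

Table 1: Partial automata $B_{aba}$ and $B_{aabaa}$ accepting the languages $\mathrm{infix}(P_{aba}^\omega)$ and $\mathrm{infix}(P_{aabaa}^\omega)$, respectively (a dash marks an undefined transition)

In view of Proposition 7 these automata also accept the ω-languages $\mathrm{suff}(P_{aba}^\omega)$ and $\mathrm{suff}(P_{aabaa}^\omega)$. Moreover, replacing the initial states $s_0$ by $s_1$ and $z_0$ by $z_2$, the modified automata accept the ω-languages $\{ba, aba\}^\omega$ and $\{baa, abaa, aabaa\}^\omega$, respectively. Consequently, the automaton $B_{aba}$ accepts the ω-language $\{a, ba\} \cdot \{ba, aba\}^\omega$ and the automaton $B_{aabaa}$ accepts $\{aa, baa, abaa\} \cdot \{baa, abaa, aabaa\}^\omega$.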
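The automata of Table 1 can be simulated directly. The Python sketch below encodes the transition functions as dictionaries; the placement of the undefined transitions is an assumption, inferred from the languages the automata are stated to accept. It then counts the words of length n that can be read from the initial state. The helper name `count_accepted` is hypothetical.

```python
# Transition tables following Table 1 (all states final, missing entries undefined).
B_ABA = {
    "s0": {"a": "s1", "b": "s3"},
    "s1": {"a": "s2", "b": "s3"},
    "s2": {"b": "s3"},
    "s3": {"a": "s1"},
}

B_AABAA = {
    "z0": {"a": "z1", "b": "z5"},
    "z1": {"a": "z2", "b": "z5"},
    "z2": {"a": "z3", "b": "z5"},
    "z3": {"a": "z4", "b": "z5"},
    "z4": {"b": "z5"},
    "z5": {"a": "z6"},
    "z6": {"a": "z2"},
}


def count_accepted(delta: dict, start: str, n: int) -> int:
    """Number of words of length n readable from `start` in a deterministic partial automaton."""
    counts = {start: 1}  # state -> number of words of the current length ending there
    for _ in range(n):
        nxt: dict = {}
        for state, c in counts.items():
            for target in delta[state].values():
                nxt[target] = nxt.get(target, 0) + c
        counts = nxt
    return sum(counts.values())


print([count_accepted(B_ABA, "s0", n) for n in range(10)])
print([count_accepted(B_AABAA, "z0", n) for n in range(10)])
```

Because the automata are deterministic and all states are final, counting paths is the same as counting accepted words, so the printed values are the numbers of infixes of each length.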
Using $\mathrm{IL}_\alpha(\{ba, aba\}^\omega) = \mathrm{IL}_\alpha(\{baa, abaa, aabaa\}^\omega) = 1$, $\alpha = \log_r t_P$, the scaling property of Proposition 3, and the facts that the unions are disjoint and $\mathrm{IL}_\alpha$ is an outer measure, we obtain $\mathrm{IL}_\alpha(\mathrm{suff}(P_{aba}^\omega)) = t_P^{-1} + t_P^{-2} = t_P$ and also $\mathrm{IL}_\alpha(\mathrm{suff}(P_{aabaa}^\omega)) = t_P^{-2} + t_P^{-3} + t_P^{-4} = t_P$.

In summary, if we neglect the influence of the prefixes, both maximal quasiperiods behave in the same way with respect to Hausdorff measure and Hausdorff dimension. Our results also support the close connection between the smallest Pisot number $t_P$ and the sets of quasiperiodic ω-words of largest complexity, $P_{aba}^\omega$ and $P_{aabaa}^\omega$.

4 Subword Complexity

4.1 The subword complexity of quasiperiodic ω-words

In this section we recall some results from [PS10] on the subword complexity function $f(\xi, n)$ for quasiperiodic ω-words. If $\xi \in X^\omega$ is quasiperiodic with quasiperiod q then Eq. (2) shows $\mathrm{infix}(\xi) \subseteq \mathrm{infix}(P_q^\omega)$. Thus

$$f(\xi, n) \le |\mathrm{infix}(P_q^\omega) \cap X^n| \quad\text{for } \xi \in P_q^\omega. \tag{8}$$

Similarly to the proof of Proposition 5.5 in [Sta93] let $\xi_q := \prod_{v \in P_q^* \setminus \{e\}} v$, where the order of the factors $v \in P_q^* \setminus \{e\}$ is an arbitrary but fixed well-order, e.g. the length-lexicographical order. This implies $\mathrm{infix}(\xi_q) = \mathrm{infix}(P_q^\omega)$. Consequently, the tight upper bound on the subword complexity of quasiperiodic ω-words having a certain quasiperiod q is

$$f_q(n) := |\mathrm{infix}(P_q^\omega) \cap X^n|.$$

The following facts are known from the theory of formal power series (cf. [BP85, SS78]). As $\mathrm{infix}(P_q^\omega)$ is a regular language, the power series $\sum_{n \in \mathbb{N}} f_q(n) \cdot t^n$ is a rational series and, therefore, $f_q$ satisfies a recurrence relation

$$f_q(n + k) = \sum_{i=0}^{k-1} m_i \cdot f_q(n + i) \tag{9}$$

with integer coefficients $m_i \in \mathbb{Z}$. Thus $f_q(n) = \sum_{i=0}^{k'-1} g_i(n) \cdot \lambda_i^n$ where $k' \le k$, the $\lambda_i$ are the pairwise distinct roots of the polynomial $\chi_q(t) = t^k - \sum_{i=0}^{k-1} m_i \cdot t^i$, and the $g_i$ are polynomials of degree not larger than k.

The growth of $f_q(n)$ mainly depends on the (positive) root $\lambda_q$ of largest modulus among the $\lambda_i$ and the corresponding polynomial $g_i$. Using Corollary 4 in [Sta85] (see also Eq. (8) in [PS10]) one can show, without explicitly inspecting the polynomials $\chi_q(t)$, that the polynomial $g_i$ corresponding to the maximal root $\lambda_q$ is constant.

Lemma 8 ([PS10, Lemma 16]) Let $q \in X^* \setminus \{e\}$. Then there are constants $c_{q,1}, c_{q,2} > 0$ and a $\lambda_q \ge 1$ such that

$$c_{q,1} \cdot \lambda_q^n \le |\mathrm{infix}(P_q^\omega) \cap X^n| \le c_{q,2} \cdot \lambda_q^n.$$

The quasiperiods aba and aabaa yield the largest value of $\lambda_q$ among all quasiperiods.
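For small n the function $f_q(n) = |\mathrm{infix}(P_q^\omega) \cap X^n|$ can also be computed by brute force straight from the definition of $P_q$, for any quasiperiod q. The following Python sketch is an illustration only, with hypothetical function names. It enumerates all products of words from $P_q$ until their length reaches $n + |q|$, which is long enough for every factor of length n to occur in one of them, and collects the factors of length n. Its running time is exponential in n.

```python
def quasiperiod_generators(q: str) -> set[str]:
    """P_q from Eq. (1): non-empty prefixes v of q such that q is a prefix of v.q."""
    return {q[:k] for k in range(1, len(q) + 1) if (q[:k] + q).startswith(q)}


def subword_count(q: str, n: int) -> int:
    """Brute-force value of f_q(n), the number of length-n infixes of P_q^omega."""
    gens = quasiperiod_generators(q)
    target = n + len(q)  # a factor starts inside some generator, i.e. at offset < |q|
    frontier, complete = {""}, set()
    while frontier:
        longer = set()
        for w in frontier:
            for v in gens:
                u = w + v
                if len(u) >= target:
                    complete.add(u)  # long enough, stop extending this product
                else:
                    longer.add(u)
        frontier = longer
    return len({u[i:i + n] for u in complete for i in range(len(u) - n + 1)})


for q in ("ab", "aba", "aabaa"):
    print(q, [subword_count(q, n) for n in range(11)])
```

The quasiperiod ab is included only as a contrast: $P_{ab} = \{ab\}$ generates the single periodic word $(ab)^\omega$, whose subword complexity is bounded.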

Lemma 9 ([PS10, Lemma 18]) Let X be an arbitrary alphabet containing at least the two letters a, b. Then the maximal value $\lambda_q$ is obtained for $q = aba$ or $q = aabaa$. This value is $\lambda_{aba} = \lambda_{aabaa} = t_P$, where $t_P$ is the positive root of the polynomial $t^3 - t - 1$.

Remark 10 The bound in Lemma 9 is independent of the size of the alphabet X.

4.2 Quasiperiods of maximal subword complexity

We have seen that the quasiperiods aba and aabaa yield quasiperiodic ω-words of maximal asymptotic subword complexity. In this section we investigate which one of these two quasiperiods $q \in \{aba, aabaa\}$ yields ω-words $\xi_q \in \{a,b\}^\omega$ of larger subword complexity $f(\xi_q, n) = f_q(n)$.

From the deterministic automata $B_{aba}$ and $B_{aabaa}$ (see Table 1) accepting the languages $\mathrm{infix}(P_{aba}^\omega)$ and $\mathrm{infix}(P_{aabaa}^\omega)$, respectively, we obtain the adjacency matrices

$$A_{aba} = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & \mathbf{0} & \mathbf{1} & \mathbf{1} \\ 0 & \mathbf{1} & \mathbf{0} & \mathbf{0} \\ 0 & \mathbf{0} & \mathbf{1} & \mathbf{0} \end{pmatrix} \quad\text{and}\quad A_{aabaa} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & \mathbf{0} & \mathbf{1} & \mathbf{0} & \mathbf{1} & \mathbf{0} \\ 0 & 0 & \mathbf{0} & \mathbf{0} & \mathbf{1} & \mathbf{1} & \mathbf{0} \\ 0 & 0 & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1} & \mathbf{0} \\ 0 & 0 & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1} \\ 0 & 0 & \mathbf{1} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \end{pmatrix}. \tag{10}$$

Remark 11 Here the boldface entries are irreducible square submatrices which correspond to the automata accepting the ω-languages $\{ba, aba\}^\omega$ and $\{baa, abaa, aabaa\}^\omega$, respectively, obtained from $B_{aba}$ and $B_{aabaa}$ by replacing the initial states $s_0$ by $s_1$ and $z_0$ by $z_2$.

Then we obtain $f_q(n) = |\mathrm{infix}(P_q^\omega) \cap X^n|$ as the vector-matrix-vector product $f_q(n) = (1, 0, \ldots, 0) \cdot A_q^n \cdot (1, \ldots, 1)^{\top}$ (see Chapter II.9 in [SS78] or [MS94]).

The characteristic polynomials of $A_{aba}$ and $A_{aabaa}$ are

$$\chi_{aba}(t) = t \cdot (t^3 - t - 1) \quad\text{and}\quad \chi_{aabaa}(t) = t^2 \cdot (t^3 - t - 1) \cdot (t^2 + 1) = t^7 - t^4 - t^3 - t^2. \tag{11}$$

Thus the sequence $(f_{aba}(n))_{n \in \mathbb{N}}$ satisfies the recurrence relation

$$f_{aba}(n + 4) = f_{aba}(n + 2) + f_{aba}(n + 1)$$

and $(f_{aabaa}(n))_{n \in \mathbb{N}}$ satisfies

$$f_{aabaa}(n + 7) = f_{aabaa}(n + 4) + f_{aabaa}(n + 3) + f_{aabaa}(n + 2).$$

Since the polynomial $\chi_{aba}(t)$ divides $\chi_{aabaa}(t)$, both sequences $(f_{aba}(n))_{n \in \mathbb{N}}$ and $(f_{aabaa}(n))_{n \in \mathbb{N}}$ satisfy the recurrence relation

$$f_q(n + 7) = f_q(n + 4) + f_q(n + 3) + f_q(n + 2).$$

The initial values are (1, 2, 3, 4, 5, 7, 9) for $q = aba$ (see also [LR04]) and (1, 2, 3, 4, 6, 8, 10) for $q = aabaa$. This already gives evidence that the sequence $(f_{aabaa}(n))_{n \in \mathbb{N}}$ grows faster than $(f_{aba}(n))_{n \in \mathbb{N}}$.

In order to calculate the growth of $(f_q(n))_{n \in \mathbb{N}}$, where $q \in \{aba, aabaa\}$, more accurately, we use standard methods for recurrence relations (see [GKP94] or Chapter 3 of [Hal67]).

The non-zero roots of the polynomials $\chi_{aba}(t)$ and $\chi_{aabaa}(t)$ are the roots $t_P, t_1, t_2$ of $t^3 - t - 1$ and, for $\chi_{aabaa}(t)$ additionally, $i$ and $-i$, where $i = \sqrt{-1}$ is the imaginary unit. The roots $t_P, t_1, t_2$ satisfy the relations $t_P + t_1 + t_2 = 0$, $t_P \cdot t_1 \cdot t_2 = 1$, $t_P > 1$ and $|t_1| = |t_2| < 1$.

Since both characteristic polynomials have only simple non-zero roots, $f_{aba}(n)$ and $f_{aabaa}(n)$ satisfy the following identities (cf. [BR88, GKP94, SS78]).

$$f_{aba}(n) = \gamma_1 t_P^n + \gamma_2 t_1^n + \gamma_3 t_2^n, \quad n \ge 1, \text{ and} \tag{12}$$

$$f_{aabaa}(n) = \gamma'_1 t_P^n + \gamma'_2 t_1^n + \gamma'_3 t_2^n + \gamma_4 i^n + \gamma_5 (-i)^n, \quad n \ge 2. \tag{13}$$

For the function $f_{aabaa}(n)$ the following initial conditions hold.

$$\begin{aligned} f_{aabaa}(2) = 3 &= \gamma'_1 t_P^2 + \gamma'_2 t_1^2 + \gamma'_3 t_2^2 + \gamma_4 i^2 + \gamma_5 (-i)^2 \\ f_{aabaa}(3) = 4 &= \gamma'_1 t_P^3 + \gamma'_2 t_1^3 + \gamma'_3 t_2^3 + \gamma_4 i^3 + \gamma_5 (-i)^3 \\ f_{aabaa}(4) = 6 &= \gamma'_1 t_P^4 + \gamma'_2 t_1^4 + \gamma'_3 t_2^4 + \gamma_4 i^4 + \gamma_5 (-i)^4 \\ f_{aabaa}(5) = 8 &= \gamma'_1 t_P^5 + \gamma'_2 t_1^5 + \gamma'_3 t_2^5 + \gamma_4 i^5 + \gamma_5 (-i)^5 \\ f_{aabaa}(6) = 10 &= \gamma'_1 t_P^6 + \gamma'_2 t_1^6 + \gamma'_3 t_2^6 + \gamma_4 i^6 + \gamma_5 (-i)^6 \end{aligned} \tag{14}$$

Then $f_{aabaa}(5) - f_{aabaa}(3) - f_{aabaa}(2) = 1$ and $f_{aabaa}(6) - f_{aabaa}(4) - f_{aabaa}(3) = 0$, in view of $t^3 = t + 1$ for $t \in \{t_P, t_1, t_2\}$, imply

$$2i \cdot (\gamma_4 - \gamma_5) + (\gamma_4 + \gamma_5) = 1 \quad\text{and}\quad i \cdot (\gamma_4 - \gamma_5) - 2 \cdot (\gamma_4 + \gamma_5) = 0 \tag{15}$$

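which in turn yields $\gamma_4 + \gamma_5 = \frac{1}{5}$ and $\gamma_4 - \gamma_5 = -\frac{2i}{5}$. Thus we may reduce the number of equations in Eq. (14) to three.

$$\begin{aligned} f_{aabaa}(2) = 3 &= \gamma'_1 t_P^2 + \gamma'_2 t_1^2 + \gamma'_3 t_2^2 - \tfrac{1}{5} \\ f_{aabaa}(3) = 4 &= \gamma'_1 t_P^3 + \gamma'_2 t_1^3 + \gamma'_3 t_2^3 - \tfrac{2}{5} \\ f_{aabaa}(4) = 6 &= \gamma'_1 t_P^4 + \gamma'_2 t_1^4 + \gamma'_3 t_2^4 + \tfrac{1}{5} \end{aligned} \tag{16}$$

And for $f_{aba}(n)$ we obtain the following three equations from the initial conditions.

$$\begin{aligned} f_{aba}(1) = 2 &= \gamma_1 t_P + \gamma_2 t_1 + \gamma_3 t_2 \\ f_{aba}(2) = 3 &= \gamma_1 t_P^2 + \gamma_2 t_1^2 + \gamma_3 t_2^2 \\ f_{aba}(3) = 4 &= \gamma_1 t_P^3 + \gamma_2 t_1^3 + \gamma_3 t_2^3 \end{aligned} \tag{17}$$

To solve these for the values of $\gamma_1$ and $\gamma'_1$, respectively, we use Cramer's rule.

$$\gamma_1 = \frac{\begin{vmatrix} 2 & t_1 & t_2 \\ 3 & t_1^2 & t_2^2 \\ 4 & t_1^3 & t_2^3 \end{vmatrix}}{\begin{vmatrix} t_P & t_1 & t_2 \\ t_P^2 & t_1^2 & t_2^2 \\ t_P^3 & t_1^3 & t_2^3 \end{vmatrix}} \quad\text{and}\quad \gamma'_1 = \frac{\begin{vmatrix} \frac{16}{5} & t_1^2 & t_2^2 \\ \frac{22}{5} & t_1^3 & t_2^3 \\ \frac{29}{5} & t_1^4 & t_2^4 \end{vmatrix}}{\begin{vmatrix} t_P^2 & t_1^2 & t_2^2 \\ t_P^3 & t_1^3 & t_2^3 \\ t_P^4 & t_1^4 & t_2^4 \end{vmatrix}}$$

The following auxiliary consideration simplifies the calculation of the determinants. Here we use the identities $t_1 + t_2 = -t_P$ and $t_1 \cdot t_2 = t_P^{-1}$, which hold for the roots $t_P, t_1, t_2$ of $t^3 - t - 1$.

$$\begin{aligned} \begin{vmatrix} x & 1 & 1 \\ y & t_1 & t_2 \\ z & t_1^2 & t_2^2 \end{vmatrix} &= (t_2 - t_1) \cdot \begin{vmatrix} x & 1 & 0 \\ y & t_1 & 1 \\ z & t_1^2 & t_2 + t_1 \end{vmatrix} = (t_2 - t_1) \cdot \begin{vmatrix} x & 1 & 0 \\ y & 0 & 1 \\ z & -t_1 t_2 & t_2 + t_1 \end{vmatrix} \\ &= (t_2 - t_1) \cdot \frac{y \cdot t_P^2 + z \cdot t_P + x}{t_P} = (t_2 - t_1) \cdot \big( x \cdot t_P^2 + y \cdot t_P + (z - x) \big) \end{aligned} \tag{18}$$

Extracting common factors, applying the auxiliary Eq. (18) and reducing to lowest terms, we obtain from the Cramer's rule quotients above

$$\gamma_1 = \frac{2 t_P^2 + 3 t_P + 2}{2 t_P + 3} \approx 1.6787356, \text{ and} \tag{19}$$

$$\gamma'_1 = \frac{13 t_P^2 + 16 t_P + 9}{5 \cdot (2 t_P + 3)} \approx 1.876608. \tag{20}$$

Since $|t_1| = |t_2| < 1$ we have $|\gamma_2 t_1^n + \gamma_3 t_2^n| < \frac{1}{2}$ and $|\gamma'_2 t_1^n + \gamma'_3 t_2^n \pm \frac{2}{5}| < \frac{1}{2}$ for sufficiently large $n \in \mathbb{N}$. Taking $|\gamma_4 i^n + \gamma_5 (-i)^n| \le \frac{2}{5}$ into account, Eqs. (12) and (13) show that for these $n \in \mathbb{N}$ the values of $f_{aba}(n)$ and $f_{aabaa}(n)$ are the integers closest to $\gamma_1 \cdot t_P^n$ and $\gamma'_1 \cdot t_P^n$, respectively.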
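The closing statement can be checked numerically over a finite range. The sketch below is a plausibility check, not a proof. It extends both sequences with the common recurrence $f_q(n+7) = f_q(n+4) + f_q(n+3) + f_q(n+2)$ from the initial values given in Section 4.2, and compares them with the rounded values of $\gamma_1 t_P^n$ and $\gamma'_1 t_P^n$, using the closed forms of Eqs. (19) and (20). The helper names and the bisection routine are assumptions made for this illustration.

```python
def smallest_pisot(tol: float = 1e-14) -> float:
    """Approximate the unique positive root of t^3 - t - 1 by bisection on [1, 2]."""
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid**3 - mid - 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def extend(initial: list[int], count: int) -> list[int]:
    """Extend seven initial values using f(n+7) = f(n+4) + f(n+3) + f(n+2)."""
    f = list(initial)
    while len(f) < count:
        n = len(f) - 7
        f.append(f[n + 4] + f[n + 3] + f[n + 2])
    return f


T = smallest_pisot()
GAMMA_ABA = (2 * T * T + 3 * T + 2) / (2 * T + 3)            # Eq. (19)
GAMMA_AABAA = (13 * T * T + 16 * T + 9) / (5 * (2 * T + 3))  # Eq. (20)

F_ABA = extend([1, 2, 3, 4, 5, 7, 9], 25)
F_AABAA = extend([1, 2, 3, 4, 6, 8, 10], 25)

for n in (5, 10, 15, 20):
    print(n, F_ABA[n], GAMMA_ABA * T**n, F_AABAA[n], GAMMA_AABAA * T**n)
```

In this range the rounded values agree with the sequences from n = 1 on for aba and from n = 2 on for aabaa, which matches the index ranges of Eqs. (12) and (13).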

Acknowledgement

The authors thank the referees for critical and helpful comments that improved the paper.

References

[BP85] Jean Berstel and Dominique Perrin. Theory of codes, volume 117 of Pure and Applied Mathematics. Academic Press Inc., Orlando, FL, 1985.

[BR88] Jean Berstel and Christophe Reutenauer. Rational series and their languages, volume 12 of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Berlin, 1988.

[Edg08] Gerald Edgar. Measure, topology, and fractal geometry. Undergraduate Texts in Mathematics. Springer-Verlag, New York, second edition, 2008.

[Fal90] Kenneth Falconer. Fractal geometry. John Wiley & Sons Ltd., Chichester, 1990.

[GKP94] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete mathematics. Addison-Wesley Publishing Company, Reading, MA, second edition, 1994.

[Hal67] Marshall Hall, Jr. Combinatorial theory. Blaisdell Publishing Co., Waltham, Mass., 1967.

[LR04] Florence Levé and Gwénaël Richomme. Quasiperiodic infinite words: Some answers (column: Formal language theory). Bulletin of the EATCS, 84:128-138, 2004.

[Mar04] Solomon Marcus. Quasiperiodic infinite words (column: Formal language theory). Bulletin of the EATCS, 82:170-174, 2004.

[Mou00] Laurent Mouchard. Normal forms of quasiperiodic strings. Theoret. Comput. Sci., 249(2):313-324, 2000.

[MS94] Wolfgang Merzenich and Ludwig Staiger. Fractals, dimension, and formal languages. RAIRO Inform. Théor. Appl., 28(3-4):361-386, 1994.

[PS10] Ronny Polley and Ludwig Staiger. The maximal subword complexity of quasiperiodic infinite words. In Ian McQuillan and Giovanni Pighizzini, editors, DCFS, volume 31 of Electr. Proc. Theor. Comput. Sci. (EPTCS), pages 169-176, 2010.

[SS78] Arto Salomaa and Matti Soittola. Automata-theoretic aspects of formal power series. Springer-Verlag, New York, 1978.

[Sta85] Ludwig Staiger. The entropy of finite-state ω-languages. Problems Control Inform. Theory/Problemy Upravlen. Teor. Inform., 14(5):383-392, 1985.

[Sta93] Ludwig Staiger. Kolmogorov complexity and Hausdorff dimension. Inform. and Comput., 103(2):159-194, 1993.

[Sta97] Ludwig Staiger. ω-languages. In Grzegorz Rozenberg and Arto Salomaa, editors, Handbook of Formal Languages, volume 3, pages 339-387. Springer-Verlag, Berlin, 1997.

[Sta05] Ludwig Staiger. Infinite iterated function systems in Cantor space and the Hausdorff measure of ω-power languages. Internat. J. Found. Comput. Sci., 16(4):787-802, 2005.

[Sta12] Ludwig Staiger. Asymptotic subword complexity. In Henning Bordihn, Martin Kutrib, and Bianca Truthe, editors, Languages Alive, volume 7300 of Lecture Notes in Computer Science, pages 236-245. Springer-Verlag, Heidelberg, 2012.

[Sta15] Ludwig Staiger. On the Hausdorff measure of regular ω-languages in Cantor space. Discrete Mathematics & Theoretical Computer Science, 17(1):357-368, 2015.

[Tho90] Wolfgang Thomas. Automata on infinite objects. In Jan van Leeuwen, editor, Handbook of Theoretical Computer Science, Vol. B, pages 133-191. Elsevier, Amsterdam, 1990.