The Frequent Paucity of Trivial Strings

Jack H. Lutz
Department of Computer Science
Iowa State University
Ames, IA 50011, USA
lutz@cs.iastate.edu

Abstract

A 1976 theorem of Chaitin can be used to show that arbitrarily dense sets of lengths n have a paucity of trivial strings (only a bounded number of strings of length n having trivially low plain Kolmogorov complexities). We use the probabilistic method to give a new proof of this fact. This proof is much simpler than previously published proofs, and it gives a tighter paucity bound.

1 Background

A string of binary data is trivial if, like a string of all zeros, it contains negligible information beyond that implicit in its length. This notion of triviality has been made precise in several different ways, and these have been useful in the foundations of Kolmogorov complexity [6], information-theoretic characterizations of decidability and polynomial-time decidability [2, 8], formal language theory [4], and the theory of K-trivial sequences [7, 3].

These applications share several common features. Each uses some version of Kolmogorov complexity to quantify the information content of a string. Each parametrizes its triviality notion by a nonnegative integer c, defining a string to be c-trivial if its information content is within c bits of a triviality criterion. Most crucially, the key to each of these applications is a paucity theorem, stating that there are many lengths n at which there is a paucity (at most a fixed multiple of $2^c$) of c-trivial strings of length n.

The first such paucity theorem, reported in 1969, was proved by Meyer [6]. Chaitin subsequently strengthened Meyer's proof, slightly relaxing his triviality notion and obtaining the following.

Theorem 1 (Chaitin [2]). There is a constant $a \in \mathbb{N}$ such that, for all $n, d \in \mathbb{N}$, at most $2^{d+a}$ strings $x \in \{0,1\}^n$ satisfy $C(x) \le d + C(n)$.
Here $C(x)$ is the plain Kolmogorov complexity of $x$, the minimum number of bits required to program a fixed universal Turing machine to print the string $x$, and $C(n) = C(s_n)$, where $s_0, s_1, \ldots$ is a standard enumeration of $\{0,1\}^*$. (Thorough treatments of $C(x)$ appear in [5, 7, 3].)

This note concerns paucity theorems involving $\log n$, rather than $C(n)$, as a triviality criterion. Since $C(n)$ is usually close to $\log n$, one such paucity theorem can be derived from Theorem 1, as we now show. Logarithms here are base-2.

We will use the (Schnirelmann) density of a set $L \subseteq \mathbb{N}$, which is
\[ \sigma(L) = \inf_{m \in \mathbb{Z}^+} \frac{|L_{<m}|}{m}, \]
where we write $L_{<m} = L \cap \{0, \ldots, m-1\}$ [9]. Intuitively, the condition $n \in L$ holds frequently if $\sigma(L) > 0$. This is clearly a stronger condition than the assertion that $L$ is infinite.

To relate the triviality criteria $\log n$ and $C(n)$, define the set
\[ L(r) = \{ n \in \mathbb{N} \mid C(n) + r \ge \log n \} \]
for each $r \in \mathbb{N}$.

Observation 2. For each $r \in \mathbb{N}$, $\sigma(L(r)) \ge 1 - 2^{1-r}$.

Proof. For each $m \in \mathbb{Z}^+$, the complement $L(r)^c$ of $L(r)$ satisfies
\[ (L(r)^c)_{<m} = \{ n < m \mid C(n) < (\log n) - r \} \subseteq \{ n < m \mid C(n) < (\log m) - r \}, \]
so
\[ |(L(r)^c)_{<m}| \le \bigl|\{0,1\}^{<(\log m)-r}\bigr| < 2^{1-r+\log m} = 2^{1-r} m. \]
It follows that
\[ \sigma(L(r)) = \inf_{m \in \mathbb{Z}^+} \frac{|L(r)_{<m}|}{m} \ge \inf_{m \in \mathbb{Z}^+} \frac{m - 2^{1-r} m}{m} = 1 - 2^{1-r}. \]
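Because $\sigma(L)$ is an infimum over all $m$, no finite computation can certify a positive lower bound, but inspecting a finite prefix of $L$ does yield an upper bound and makes the definition concrete. The sketch below is illustrative only (the helper name is ours, and the stand-in sets replace $L(r)$, since $C(n)$ is uncomputable); it also shows why density is a stronger frequency condition than infinitude: a single early gap, as with the odd numbers missing 0, already forces the density to 0.

```python
from fractions import Fraction

def density_upper_bound(L, M):
    """inf over 1 <= m <= M of |L ∩ {0,...,m-1}| / m.

    Only a finite prefix is inspected, so this is an upper bound
    on the true Schnirelmann density σ(L)."""
    members = set(L)
    count = 0                    # running value of |L ∩ {0,...,m-1}|
    best = Fraction(1)
    for m in range(1, M + 1):
        if (m - 1) in members:   # does m-1 belong to L?
            count += 1
        best = min(best, Fraction(count, m))
    return best

evens = range(0, 1000, 2)   # infinite-set stand-in: σ = 1/2
odds = range(1, 1000, 2)    # 0 is missing, so already |L_{<1}|/1 = 0
print(density_upper_bound(evens, 1000))  # 1/2
print(density_upper_bound(odds, 1000))   # 0
```

Exact rational arithmetic via `Fraction` keeps the infimum free of rounding error.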
We now have the following easy consequence of Theorem 1.

Theorem 3 (very frequent paucity theorem). The constant $a$ of Theorem 1 has the property that, for all $c, r \in \mathbb{N}$, the set of nonnegative integers $n$ for which at most $2^{c+a+r}$ strings $x \in \{0,1\}^n$ satisfy $C(x) \le c + \log n$ has density at least $1 - 2^{1-r}$.

Proof. Let $a \in \mathbb{N}$ be as in Theorem 1, and let $c, r \in \mathbb{N}$. For each $n \in \mathbb{N}$, define the sets
\[ B_n = \{ x \in \{0,1\}^n \mid C(x) \le c + \log n \} \]
and
\[ B'_n = \{ x \in \{0,1\}^n \mid C(x) \le c + r + C(n) \}, \]
and let
\[ L_c = \{ n \in \mathbb{N} \mid |B_n| \le 2^{c+a+r} \}. \]
It suffices to show that $\sigma(L_c) \ge 1 - 2^{1-r}$.

Let $n \in L(r)$. Then $C(n) + r \ge \log n$, so $B_n \subseteq B'_n$. Applying Theorem 1 with $d = c + r$, we have $|B'_n| \le 2^{c+r+a}$, whence $|B_n| \le 2^{c+r+a}$. Hence $n \in L_c$. We have now shown that $L(r) \subseteq L_c$. It follows by Observation 2 that $\sigma(L_c) \ge \sigma(L(r)) \ge 1 - 2^{1-r}$.

The proofs of Theorem 1 and Meyer's earlier paucity theorem are somewhat involved. Part of this is because these early proofs were aimed at proving more, namely that (I) for every $c \in \mathbb{N}$ there are at most $2^{c+a}$ infinite binary sequences that are c-trivial in the sense that every nonempty prefix $x$ of such a sequence satisfies $C(x) \le c + \log |x|$; and (II) every such c-trivial sequence is decidable. It is clear that (I) follows immediately from Theorem 1, and it is now well understood that (II) follows directly from (I), because every isolated infinite branch of a decidable tree is decidable [3].

In the 1990s, Li and Vitányi proved the following paucity theorem.

Theorem 4 (Li and Vitányi [4]). There is a constant $a \in \mathbb{N}$ such that, for every $c \in \mathbb{N}$, there exist infinitely many lengths $n$ for which at most $2^{c+a}$ strings $x \in \{0,1\}^n$ satisfy $C(x) \le c + \log n$.
Theorem 4 is weaker than Theorem 3, because it only tells us that the paucity of trivial strings occurs at infinitely many lengths. Li and Vitányi's proof of Theorem 4 is simpler than the proof of Theorem 1 (hence simpler than the proof of Theorem 3), even when one discounts the parts of the proof of Theorem 1 devoted to (I) and (II). However, even Li and Vitányi's simplified proof is nontrivial.

2 Result

The purpose of this note is to give a very simple proof of a frequent paucity theorem. Our theorem's frequency condition is as strong as that of Theorem 3. However, our theorem improves on earlier paucity theorems in a significant respect: while the proofs of Theorems 1, 3, and 4 require the constant $a$ to be as large as the number of bits required to encode a nontrivial Turing machine, our simple proof shows that it suffices to take $a = 1$.

Our simple proof has a simple intuition. As in the proof of Theorem 3, let $B_n = \{ x \in \{0,1\}^n \mid C(x) \le c + \log n \}$. We want to show that $|B_n|$ is often small. Well, the sets $B_n$ are disjoint, and each of their elements has a program of length less than $c + \log m$ whenever $n < m$, so the average of the first $m$ values of $|B_n|$ is
\[ \frac{1}{m}\sum_{n=0}^{m-1}|B_n| = \frac{1}{m}\biggl|\bigcup_{n=0}^{m-1}B_n\biggr| \le \frac{1}{m}\bigl|\{0,1\}^{<c+\log m}\bigr| < \frac{1}{m}\,2^{c+1+\log m} = 2^{c+1}, \]
so $|B_n| \le 2^{c+1}$ must hold frequently! The details follow.

Theorem 5. Let $c \in \mathbb{N}$.

1 (frequent paucity). The set of nonnegative integers $n$ for which at most $2^{c+1}$ strings $x \in \{0,1\}^n$ satisfy $C(x) \le c + \log n$ has density at least $(2^{c+1} + 1)^{-1}$.

2 (very frequent paucity). For every $r \in \mathbb{N}$, the set of nonnegative integers $n$ for which at most $2^{c+r}$ strings $x \in \{0,1\}^n$ satisfy $C(x) \le c + \log n$ has density at least $1 - 2^{1-r}$.
Proof. Let $c, r \in \mathbb{N}$, and let $d = 2^{c+r}$. For each $n \in \mathbb{N}$, let
\[ B_n = \{ x \in \{0,1\}^n \mid C(x) \le c + \log n \}, \]
noting that $B_0 = \emptyset$, and let
\[ L = \{ n \in \mathbb{N} \mid |B_n| \le d \}. \]
Let $m \in \mathbb{Z}^+$, and let $l = |L_{<m}|$. Consider the average
\[ \mu = \frac{1}{m}\sum_{n=0}^{m-1}|B_n|. \]
We have
\[ \mu \le \frac{1}{m}\bigl|\{0,1\}^{<c+\log m}\bigr| < \frac{1}{m}\,2^{c+1+\log m} = 2^{c+1} \]
and, since each of the $m - l$ integers $n < m$ outside $L$ contributes $|B_n| \ge d + 1$,
\[ \mu \ge \frac{1}{m}(m-l)(d+1), \]
whence
\[ 2^{c+1} m > (m-l)(d+1). \qquad (*) \]

1. If $r = 1$, then $d = 2^{c+1}$, and ($*$) says that
\[ dm > (m-l)(d+1), \]
whence $l > \frac{m}{d+1}$. Since this holds for all $m \in \mathbb{Z}^+$, it follows that $\sigma(L) \ge \frac{1}{d+1} = (2^{c+1}+1)^{-1}$.

2. More generally, for $r \in \mathbb{N}$, ($*$) implies that
\[ 2^{c+1} m > (m-l)\,2^{c+r}, \]
whence $l > (1 - 2^{1-r})m$. Since this holds for all $m \in \mathbb{Z}^+$, it follows that $\sigma(L) \ge 1 - 2^{1-r}$.
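The counting in this proof uses only two facts: fewer than $2^{c+1} m$ strings lie in $B_0 \cup \cdots \cup B_{m-1}$ (at most one per program of length below $c + \log m$), and every length outside $L$ contributes at least $d + 1$ of them. As a sanity check, the sketch below (illustrative only; the function name is ours, and it restricts to $r \ge 1$, where the bound is nontrivial) lets an adversary spend that entire program budget ruining as many lengths as possible, and confirms that the density bounds of Theorem 5 survive even this worst case.

```python
def worst_case_good_fraction(c, r, m):
    """Adversarially distribute the fewer-than-2^(c+1)*m 'short programs'
    so as to push as many lengths n < m as possible above the paucity
    threshold d = 2^(c+r); return the fraction of lengths that
    nevertheless stay in L (i.e., have |B_n| <= d)."""
    d = 2 ** (c + r)
    budget = 2 ** (c + 1) * m - 1      # strictly fewer than 2^(c+1)*m programs
    bad = min(m, budget // (d + 1))    # ruining one length costs d+1 programs
    return (m - bad) / m

# Theorem 5.2: the density of L is at least 1 - 2^(1-r).
for c in range(4):
    for r in range(1, 5):
        for m in (1, 10, 1000):
            assert worst_case_good_fraction(c, r, m) > 1 - 2 ** (1 - r)

# Theorem 5.1 (r = 1, d = 2^(c+1)): density at least 1/(2^(c+1) + 1).
for c in range(4):
    for m in (1, 10, 1000):
        assert worst_case_good_fraction(c, 1, m) > 1 / (2 ** (c + 1) + 1)

print("bounds hold")  # prints "bounds hold"
```

The adversary's best strategy is exactly the extremal case of the inequality $(m-l)(d+1) \le \mu m < 2^{c+1} m$ behind ($*$), so the check exercises the same arithmetic as the proof.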
3 Conclusion

The simplicity of the above proof is the main contribution of this note. Its simplicity arises from its use of the first-moment probabilistic method [1, 9]: rather than deal with the cardinalities $|B_n|$ individually, it examines their average. It is an open question whether the probabilistic method can similarly simplify the proof of Theorem 1.

A brief remark on pedagogy: Li and Vitányi's Kolmogorov complexity characterization of regular languages [4, 5] yields a simple and intuitive method for proving that languages are not regular. A possible obstacle to teaching this method in undergraduate theory courses has been that the characterization theorem relies on the (seemingly) difficult Theorem 4. The simple proof here removes that obstacle.

Acknowledgments

I thank the referees for extremely useful observations. This research was supported in part by National Science Foundation Grant 0652569. Part of this work was done during a sabbatical at Caltech and the Isaac Newton Institute for Mathematical Sciences at the University of Cambridge.

References

[1] N. Alon and J.H. Spencer. The Probabilistic Method (third edition). John Wiley & Sons, 2008.

[2] Gregory J. Chaitin. Information-theoretic characterizations of recursive infinite strings. Theoretical Computer Science, 2:45–48, 1976.

[3] Rodney G. Downey and Denis R. Hirschfeldt. Algorithmic Randomness and Complexity. Springer, 2010.

[4] Ming Li and Paul M.B. Vitányi. A new approach to formal language theory by Kolmogorov complexity. SIAM Journal on Computing, 24:398–410, 1995.

[5] Ming Li and Paul M.B. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications (third edition). Springer, 2008.
[6] Donald W. Loveland. A variant of the Kolmogorov concept of complexity. Information and Control, 15:510–526, 1969.

[7] André Nies. Computability and Randomness. Oxford University Press, New York, NY, USA, 2009.

[8] Pekka Orponen, Ker-I Ko, Uwe Schöning, and Osamu Watanabe. Instance complexity. Journal of the ACM, 41:96–121, 1994.

[9] Terence Tao and Van Vu. Additive Combinatorics. Cambridge University Press, 2006.