Entropy 2011, 13, 595-611; doi:10.3390/e13030595

OPEN ACCESS
entropy
ISSN 1099-4300
www.mdpi.com/journal/entropy

Article

Entropy Measures vs. Kolmogorov Complexity

Andreia Teixeira 1,2,*, Armando Matos 1,3, André Souto 1,2 and Luís Antunes 1,2

1 Computer Science Department, Faculty of Sciences, University of Porto, Rua Campo Alegre 1021/1055, 4169-007 Porto, Portugal; E-Mails: acm@dcc.fc.up.pt (A.M.); andresouto@dcc.fc.up.pt (A.S.); lfa@dcc.fc.up.pt (L.A.)
2 Instituto de Telecomunicações, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
3 Laboratório de Inteligência Artificial e Ciência de Computadores, Rua Campo Alegre 1021/1055, 4169-007 Porto, Portugal

* Author to whom correspondence should be addressed; E-Mail: andreiasofia@ncc.up.pt; Tel.: +35-0-40-93.

Received: 4 January 2011; in revised form: 5 February 2011 / Accepted: 6 February 2011 / Published: 3 March 2011

Abstract: Kolmogorov complexity and Shannon entropy are conceptually different measures. However, for any recursive probability distribution, the expected value of Kolmogorov complexity equals its Shannon entropy, up to a constant. We study if a similar relationship holds for Rényi and Tsallis entropies of order $\alpha$, showing that it only holds for $\alpha = 1$. Regarding a time-bounded analogue relationship, we show that, for some distributions, we have a similar result. We prove that, for the universal time-bounded distribution $m^t(x)$, Tsallis and Rényi entropies converge if and only if $\alpha$ is greater than 1. We also establish the uniform continuity of these entropies.

Keywords: Kolmogorov complexity; Shannon entropy; Rényi entropy; Tsallis entropy

1. Introduction

The Kolmogorov complexity $K(x)$ measures the amount of information contained in an individual object (usually a string) $x$, by the size of the smallest program that generates it. It naturally characterizes a probability distribution over $\Sigma^*$ (the set of all finite binary strings), assigning a probability of $c \cdot 2^{-K(x)}$ to any string $x$, where $c$ is a constant. This probability distribution is called the universal probability distribution and it is denoted by $m$.

The Shannon entropy $H(X)$ of a random variable $X$ is a measure of its average uncertainty. It is the smallest number of bits required, on average, to describe the output of the random variable $X$. Kolmogorov complexity and Shannon entropy are conceptually different, as the former is based on the length of programs and the latter on probability distributions. However, for any recursive probability distribution (i.e., distributions that are computable by a Turing machine), the expected value of the Kolmogorov complexity equals the Shannon entropy, up to a constant term depending only on the distribution (see [1]).

Several information measures or entropies have been introduced since Shannon's seminal paper [2]. We are interested in two different generalizations of Shannon entropy:

- Rényi entropy [3], an additive measure based on a specific form of mean of the elementary information gain: instead of using the arithmetic mean, Rényi used the Kolmogorov-Nagumo mean associated with the function $f(x) = c_1 b^{(1-\alpha)x} + c_2$, where $c_1$ and $c_2$ are constants, $b$ is a real greater than 1 and $\alpha$ is a non-negative parameter;
- Tsallis entropy [4], a non-additive measure, often called a non-extensive measure, in which the probabilities are scaled by a positive power $\alpha$ that may reinforce either the large (if $\alpha > 1$) or the small (if $\alpha < 1$) probabilities.

Let $R_\alpha(P)$ and $T_\alpha(P)$ denote, respectively, the Rényi and Tsallis entropies associated with the probability distribution $P$. Both are continuous functions of the parameter $\alpha$ and both are (quite different) generalizations of the Shannon entropy, in the sense that $R_1(P) = T_1(P) = H(P)$ (see [5]).

It is well known (see [6]) that for any recursive probability distribution $P$ over $\Sigma^*$, the average value of $K(x)$ and the Shannon entropy $H(P)$ are close, in the sense that

$$0 \le \sum_x P(x)K(x) - H(P) \le K(P) \tag{1}$$

where $K(P)$ is the length of the shortest program that describes the distribution $P$. We study if this property also holds for Rényi and Tsallis entropies. The answer is no. If we replace $H$ by $R$ or by $T$, the inequalities (1) are no longer true (unless $\alpha = 1$). We also analyze the validity of the relationship (1) when Kolmogorov complexity is replaced by its time-bounded version, proving that it holds for distributions such that the cumulative probability distribution is computable in an allotted time. So, for these distributions, Shannon entropy equals the expected value of the time-bounded Kolmogorov complexity. We also study the convergence of Tsallis and Rényi entropies of the universal time-bounded distribution $m^t(x) = c \cdot 2^{-K^t(x)}$, proving that both entropies converge if and only if $\alpha > 1$. Finally, we prove the uniform continuity of Tsallis, Rényi and Shannon entropies.

The rest of this paper is organized as follows. In the next section, we present the notions and results that will be used. In Section 3, we study if the inequalities (1) can be generalized for Rényi and Tsallis entropies and we also establish a similar relationship for the time-bounded Kolmogorov complexity. In Section 4, we analyze the entropies of the universal time-bounded distribution. In Section 5, we establish the uniform continuity of the entropies, a mathematical result that may be useful in further research.

2. Preliminaries

$\Sigma^* = \{0,1\}^*$ is the set of all finite binary strings. The empty string is denoted by $\epsilon$. $\Sigma^n$ is the set of strings of length $n$ and $|x|$ denotes the length of the string $x$. Strings are lexicographically ordered. The logarithm of $x$ in base 2 is denoted by $\log(x)$. The real interval between $a$ and $b$, including $a$ and excluding $b$, is represented by $[a, b)$. A sequence of real numbers $r_n$ is denoted by $(r_n)_{n \in \mathbb{N}}$.

2.1. Kolmogorov Complexity

Kolmogorov complexity was introduced independently, in the 1960's, by Solomonoff [7], Kolmogorov [8], and Chaitin [9]. Only essential definitions and basic results are given here; for further details see [1]. The model of computation used is the prefix-free Turing machine, i.e., Turing machines with a prefix-free domain. A set of strings $A$ is prefix-free if no string in $A$ is a prefix of another string of $A$. Kraft's inequality guarantees that for any prefix-free set $A$, $\sum_{x \in A} 2^{-|x|} \le 1$. In this work all resource bounds $t$ considered are time constructible, i.e., there is a Turing machine whose running time is exactly $t(n)$ on every input of size $n$.

Definition 2.1. (Kolmogorov complexity) Let $U$ be a fixed prefix-free universal Turing machine. For any two strings $x, y \in \Sigma^*$, the Kolmogorov complexity of $x$ given $y$ is

$$K(x|y) = \min_p \{|p| : U(p, y) = x\},$$

where $U(p, y)$ is the output of the program $p$ with auxiliary input $y$ when it is run in the machine $U$. For any time constructible $t$, the $t$-time-bounded Kolmogorov complexity of $x$ given $y$ is

$$K^t(x|y) = \min_p \{|p| : U(p, y) = x \text{ in at most } t(|x|) \text{ steps}\}.$$

The default value for $y$, the auxiliary input, is the empty string $\epsilon$; for simplicity, we denote $K(x|\epsilon)$ and $K^t(x|\epsilon)$ by $K(x)$ and $K^t(x)$, respectively. The choice of the universal Turing machine affects the running time of a program by, at most, a logarithmic factor and the program length by, at most, a constant number of extra bits.

Definition 2.2. Let $c$ be a non-negative integer. We say that $x \in \Sigma^n$ is $c$-Kolmogorov random if $K(x) \ge n - c$.

Proposition 2.3. For all strings $x, y \in \Sigma^*$, we have:

- $K(x) \le K^t(x) + O(1) \le |x| + 2\log(|x|) + O(1)$;
- $K(x|y) \le K(x) + O(1)$ and $K^t(x|y) \le K^t(x) + O(1)$;
- There are at least $2^n(1 - 2^{-c})$ $c$-Kolmogorov random strings of length $n$.

As Solovay [10] observed, for infinitely many $x$, the time-bounded version of Kolmogorov complexity equals the unbounded version. Formally, we have:

Theorem 2.4. (Solovay [10]) For all time-bounds $t(n) \ge n + O(1)$ we have $K^t(x) = K(x) + O(1)$ for infinitely many $x$.

As a consequence of this result, there is a string $x$ of arbitrarily large Kolmogorov complexity such that $K^t(x) = K(x) + O(1)$.
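The prefix-free condition, Kraft's inequality and the counting bound of Proposition 2.3 can be checked on small finite sets. The following is a minimal Python sketch, not part of the original paper; the function names `is_prefix_free`, `kraft_sum` and `min_random_strings` are illustrative assumptions, and exact rational arithmetic is used so that the Kraft sum can be compared with 1 without rounding.

```python
from fractions import Fraction


def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of a different codeword."""
    words = sorted(set(codewords))
    # After sorting, every extension of a word follows it contiguously,
    # so it is enough to compare each word with its immediate successor.
    return all(not nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))


def kraft_sum(codewords):
    """Exact Kraft sum: the sum over x in A of 2^(-|x|)."""
    return sum(Fraction(1, 2 ** len(w)) for w in set(codewords))


def min_random_strings(n, c):
    """Counting lower bound 2^n (1 - 2^-c) on the number of c-Kolmogorov
    random strings of length n (assumes 0 <= c <= n)."""
    return 2 ** n - 2 ** (n - c)


if __name__ == "__main__":
    good = ["0", "10", "110", "111"]   # prefix-free
    bad = ["0", "1", "01"]             # "0" is a prefix of "01"
    for name, code in (("good", good), ("bad", bad)):
        print(name, is_prefix_free(code), kraft_sum(code))
    print(min_random_strings(10, 3))
```

Kraft's inequality is only a necessary condition: a set whose Kraft sum exceeds 1 cannot be prefix-free, but a sum of at most 1 does not by itself certify prefix-freeness.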

Definition 2.5. A semi-measure over a discrete set $\mathcal{X}$ is a function $f : \mathcal{X} \to [0, 1]$ such that $\sum_{x \in \mathcal{X}} f(x) \le 1$. We say that a semi-measure is a measure if the equality holds. A semi-measure is constructive if it is semi-computable from below, i.e., for each $x$, there is a Turing machine that produces a monotone increasing sequence of rationals converging to $f(x)$.

An important constructive semi-measure based on Kolmogorov complexity is defined by $m(x) = 2^{-K(x)}$. This semi-measure dominates any other constructive semi-measure $\mu$ (see [11,12]), in the sense that there is a constant $c_\mu = 2^{-K(\mu)}$ such that, for all $x$, $m(x) \ge c_\mu \mu(x)$. For this reason, this semi-measure is called universal. Since it is natural to consider time-bounds on Kolmogorov complexity, we can define $m^t(x)$, a time-bounded version of $m(x)$.

Definition 2.6. We say that a function $f$ is computable in time $t$ if there is a Turing machine that on the input $x$ computes the output $f(x)$ in exactly $t(|x|)$ steps.

Definition 2.7. The $t$-time-bounded universal distribution is $m^t(x) = c \cdot 2^{-K^t(x)}$, where $c$ is a real number such that $\sum_{x \in \Sigma^*} m^t(x) = 1$.

In [1], the authors proved that $m^{t'}$ dominates distributions computable in time $t$, where $t'$ is a time-bound that only depends on $t$. Formally:

Theorem 2.8. (Claim 7.6.1 in [1]) If $\mu^*$, the cumulative probability distribution of $\mu$, is computable in time $t(n)$, then for all $x \in \Sigma^*$, $m^{t'}(x) \ge 2^{-K^{t'}(\mu)} \mu(x)$, where $t'(n) = n\,t(n)\log(n\,t(n))$.

2.2. Entropies

Information Theory was introduced in 1948 by C.E. Shannon [2]. Shannon entropy quantifies the uncertainty of the results of an experiment; it quantifies the average number of bits necessary to describe an outcome from an event.

Definition 2.9. (Shannon entropy [2]) Let $\mathcal{X}$ be a finite or infinitely countable set and let $X$ be a random variable taking values in $\mathcal{X}$ with distribution $P$. The Shannon entropy of the random variable $X$ is

$$H(X) = -\sum_{x \in \mathcal{X}} P(x) \log P(x) \tag{2}$$

Definition 2.10. (Rényi entropy [3]) Let $\mathcal{X}$ be a finite or infinitely countable set and let $X$ be a random variable taking values in $\mathcal{X}$ with distribution $P$ and let $\alpha \ne 1$ be a non-negative real number. The Rényi entropy of order $\alpha$ of the random variable $X$ is

$$R_\alpha(X) = \frac{1}{1-\alpha} \log\left(\sum_{x \in \mathcal{X}} P(x)^\alpha\right) \tag{3}$$

Definition 2.11. (Tsallis entropy [4]) Let $\mathcal{X}$ be a finite or infinitely countable set and let $X$ be a random variable taking values in $\mathcal{X}$ with distribution $P$ and let $\alpha \ne 1$ be a non-negative real number. The Tsallis entropy of order $\alpha$ of the random variable $X$ is

$$T_\alpha(X) = \frac{1}{\alpha-1}\left(1 - \sum_{x \in \mathcal{X}} P(x)^\alpha\right) \tag{4}$$
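Definitions (2), (3) and (4) translate directly into code for distributions with finite support. The sketch below is an illustration added here, not taken from the paper; the function names are assumptions, and `math.log2` is used to match the base-2 logarithm of the Preliminaries. With that base, evaluating (4) near $\alpha = 1$ gives $H(P)\ln 2$, i.e., the Shannon entropy expressed in natural units, which is why the check for the Tsallis case rescales by $\ln 2$.

```python
import math


def shannon(p):
    """Shannon entropy (2) in bits; zero-probability terms contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def renyi(p, alpha):
    """Renyi entropy (3) of order alpha (alpha >= 0, alpha != 1), in bits."""
    return math.log2(sum(q ** alpha for q in p if q > 0)) / (1 - alpha)


def tsallis(p, alpha):
    """Tsallis entropy (4) of order alpha (alpha >= 0, alpha != 1)."""
    return (1 - sum(q ** alpha for q in p if q > 0)) / (alpha - 1)


if __name__ == "__main__":
    p = [0.5, 0.25, 0.125, 0.125]
    print("H(P)          =", shannon(p))
    for alpha in (0.5, 0.999999, 1.000001, 2.0):
        print(f"alpha={alpha}: R={renyi(p, alpha):.6f}  T={tsallis(p, alpha):.6f}")
    # Near alpha = 1 the Renyi value approaches H(P) and the Tsallis value
    # approaches H(P) * ln 2 when H is measured in bits.
    print("H(P) * ln 2   =", shannon(p) * math.log(2))
```

For the uniform distribution on four outcomes, `renyi([0.25] * 4, 2)` evaluates to 2 bits, the same as its Shannon entropy, which is a quick sanity check of the normalization.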

It is easy to prove that

$$\lim_{\alpha \to 1} R_\alpha(X) = \lim_{\alpha \to 1} T_\alpha(X) = H(X) \tag{5}$$

Given the conceptual differences in the definitions of Kolmogorov complexity and Shannon entropy, it is interesting that under some weak restrictions on the distribution of the strings, they are related, in the sense that the value of Shannon entropy equals the expected value of Kolmogorov complexity, up to a constant term that only depends on the distribution.

Theorem 2.12. Let $P(x)$ be a recursive probability distribution such that $H(P) < \infty$. Then,

$$0 \le \sum_x P(x)K(x) - H(P) \le K(P) \tag{6}$$

Proof. (Sketch, see [1] for details.) The first inequality follows directly from the Noiseless Coding Theorem, which, for these distributions, states that $H(P) \le \sum_x P(x)K(x)$. Since $m$ is universal, $m(x) \ge 2^{-K(P)} P(x)$ for all $x$, which is equivalent to $\log P(x) \le K(P) - K(x)$. Thus, we have:

$$\sum_x P(x)K(x) - H(P) = \sum_x P(x)\left(K(x) + \log P(x)\right) \tag{7}$$
$$\le \sum_x P(x)\left(K(x) + K(P) - K(x)\right) \tag{8}$$
$$= K(P) \tag{9}$$

3. Kolmogorov Complexity and Entropy: How Close?

Since Rényi and Tsallis entropies are generalizations of Shannon entropy, we now study if Theorem 2.12 can be generalized for these entropies. Then, we prove that for distributions such that the cumulative probability distribution is computable in time $t(n)$, Shannon entropy equals the expected value of the $t$-time-bounded Kolmogorov complexity.

First, we observe that the interval $[0, K(P)]$ of the inequalities of Theorem 2.12 is tight up to a constant term that only depends on the universal Turing machine chosen as reference. A similar study has been included in [1] and in [13]. The following examples illustrate the tightness of this interval. We present a probability distribution that satisfies:

$$\sum_x P(x)K(x) - H(P) = K(P) - O(1), \text{ with } K(P) \approx n \tag{10}$$

and a probability distribution that satisfies:

$$\sum_x P(x)K(x) - H(P) = O(1) \text{ and } K(P) \approx n. \tag{11}$$

Example 3.1. Fix $x_0 \in \Sigma^n$. Consider the distribution concentrated in $x_0$, i.e.,

$$P_n(x) = \begin{cases} 1 & \text{if } x = x_0 \\ 0 & \text{otherwise} \end{cases}$$

Notice that describing this distribution is equivalent to describing $x_0$. So, $K(P_n) = K(x_0) + O(1)$. On the other hand, $\sum_x P_n(x)K(x) - H(P_n) = K(x_0)$. Thus, if $x_0$ is $c$-Kolmogorov random, i.e., $K(x_0) \ge n - c$, then $K(P_n) \approx n$ and $\sum_x P_n(x)K(x) - H(P_n) \approx n$.

Example 3.2. Let $y$ be a string of length $n$ that is $c$-Kolmogorov random, i.e., $K(y) \ge n - c$, and consider the following probability distribution over $\Sigma^*$:

$$P_n(x) = \begin{cases} 0.y & \text{if } x = x_0 \\ 1 - 0.y & \text{if } x = x_1 \\ 0 & \text{otherwise} \end{cases}$$

where $0.y$ represents the real number between 0 and 1 whose binary representation is $y$. Notice that we can choose $x_0$ and $x_1$ such that $K(x_0) = K(x_1) \le c'$, where $c'$ is a constant greater than 1 and hence does not depend on $n$. Thus,

1. $K(P_n) \approx n - c$, since describing $P_n$ is essentially equivalent to describing $x_0$, $x_1$ and $y$;
2. $\sum_x P_n(x)K(x) = (0.y)K(x_0) + (1 - 0.y)K(x_1) \le 0.y \times c' + (1 - 0.y) \times c' = c'$;
3. $H(P_n) = -0.y \log 0.y - (1 - 0.y)\log(1 - 0.y) \le 1$.

Thus,

$$0 \le \sum_x P_n(x)K(x) - H(P_n) \le c' \ll K(P_n) \tag{12}$$

and

$$K(P_n) \approx n - c \tag{13}$$

Now we address the question of whether an analogue of Theorem 2.12 holds for Rényi and Tsallis entropies. We show that the Shannon entropy is the only entropy that verifies simultaneously both inequalities of Theorem 2.12 and thus is the only one suitable to deal with information.

For every $\varepsilon > 0$, $0 < \varepsilon' < 1$, and any probability distribution $P$ with finite support (see [5]), we have:

$$R_{1+\varepsilon}(P) \le H(P) \le R_{1-\varepsilon'}(P) \tag{14}$$

Thus,

1. For $\alpha \ge 1$, $0 \le \sum_x P(x)K(x) - R_\alpha(P)$;
2. For $\alpha \le 1$, $\sum_x P(x)K(x) - R_\alpha(P) \le K(P)$.

It is known that for a given probability distribution with finite support, the Rényi and Tsallis entropies are monotonic increasing functions of each other with respect to $\alpha$ (see [14]). Thus, for every $\varepsilon > 0$ and $0 < \varepsilon' < 1$, we also have a similar relationship for the Tsallis entropy, i.e.,

$$T_{1+\varepsilon}(P) \le H(P) \le T_{1-\varepsilon'}(P) \tag{15}$$

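Hence, it follows that:

1. For $\alpha \ge 1$, $0 \le \sum_x P(x)K(x) - T_\alpha(P)$;
2. For $\alpha \le 1$, $\sum_x P(x)K(x) - T_\alpha(P) \le K(P)$.

In the next result we show that the inequalities above are, in general, false for different values of $\alpha$.

Proposition 3.3. There are recursive probability distributions $P$ such that:

1. $\sum_x P(x)K(x) - R_\alpha(P) > K(P)$, where $\alpha > 1$;
2. $\sum_x P(x)K(x) - R_\alpha(P) < 0$, where $\alpha < 1$;
3. $\sum_x P(x)K(x) - T_\alpha(P) > K(P)$, where $\alpha > 1$;
4. $\sum_x P(x)K(x) - T_\alpha(P) < 0$, where $\alpha < 1$.

Proof. For $x \in \Sigma^n$, consider the following probability distribution:

$$P_n(x) = \begin{cases} 1/2 & \text{if } x = 0^n \\ 2^{-n} & \text{if } x = 1x',\ x' \in \{0,1\}^{n-1} \\ 0 & \text{otherwise} \end{cases}$$

It is clear that this distribution is recursive. We use this distribution for some specific $n$ to prove all items.

1. First observe that:

$$H(P_n) = -\sum_x P_n(x)\log P_n(x) \tag{16}$$
$$= -\left(\frac{1}{2}\log\frac{1}{2} + 2^{n-1}\,\frac{1}{2^n}\log\frac{1}{2^n}\right) \tag{17}$$
$$= -\left(-\frac{1}{2} - \frac{2^{n-1}}{2^n}\,n\right) \tag{18}$$
$$= \frac{n+1}{2} \tag{19}$$

Notice also that to describe $P_n$ it is sufficient to give $n$, so $K(P_n) \le c\log n$, where $c$ is a real number. By Theorem 2.12, we have

$$\sum_x P_n(x)K(x) - H(P_n) \ge 0 \tag{20}$$

which implies that:

$$\sum_x P_n(x)K(x) \ge \frac{n+1}{2} \tag{21}$$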

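On the other hand, by definition:

$$R_\alpha(P_n) = \frac{1}{1-\alpha}\log\sum_x P_n(x)^\alpha \tag{22}$$
$$= \frac{1}{1-\alpha}\log\left(\frac{1}{2^\alpha} + 2^{n-1}\,\frac{1}{2^{n\alpha}}\right) \tag{23}$$
$$= \frac{1}{1-\alpha}\left(\log\left(2^{(n-1)\alpha} + 2^{n-1}\right) - n\alpha\right) \tag{24}$$

To prove that $\sum_x P_n(x)K(x) - R_\alpha(P_n) > K(P_n)$, it is sufficient to prove that:

$$\lim_{n\to\infty}\left(\sum_x P_n(x)K(x) - R_\alpha(P_n) - K(P_n)\right) > 0 \tag{25}$$

i.e., that the limit of the following expression is bigger than 0:

$$\frac{n+1}{2} - \frac{1}{1-\alpha}\left(\log\left(2^{(n-1)\alpha} + 2^{n-1}\right) - n\alpha\right) - c\log n \tag{26}$$

But, since $\alpha > 1$,

$$\lim_{n\to\infty}\left(\frac{n+1}{2} - \frac{1}{1-\alpha}\log\left(2^{(n-1)\alpha} + 2^{n-1}\right) + \frac{n\alpha}{1-\alpha} - c\log n\right) \tag{27}$$
$$\ge \lim_{n\to\infty}\left(\frac{n+1}{2} + \frac{1}{\alpha-1}\log\left(2^{(n-1)\alpha}\right) - \frac{n\alpha}{\alpha-1} - c\log n\right) \tag{28}$$
$$= \lim_{n\to\infty}\left(\frac{n+1}{2} - \frac{\alpha}{\alpha-1} - c\log n\right) = +\infty \tag{29}$$

2. To prove this item we use the other inequality of Theorem 2.12:

$$\sum_x P_n(x)K(x) - H(P_n) \le K(P_n) \tag{30}$$

which implies that

$$\sum_x P_n(x)K(x) \le \frac{n+1}{2} + c\log n \tag{31}$$

So, since $\alpha < 1$,

$$\sum_x P_n(x)K(x) - R_\alpha(P_n) \le \frac{n+1}{2} + c\log n - \frac{1}{1-\alpha}\left(\log\left(2^{(n-1)\alpha} + 2^{n-1}\right) - n\alpha\right) \tag{32}$$
$$\le \frac{n+1}{2} + c\log n - \frac{1}{1-\alpha}\log\left(2^{n-1}\right) + \frac{n\alpha}{1-\alpha} \tag{33}$$
$$= \frac{n+1}{2} + c\log n - n + \frac{1}{1-\alpha} \tag{34}$$

Thus, taking $n$ sufficiently large, the conclusion follows.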

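3. The Tsallis entropy of order $\alpha$ of distribution $P_n$ is

$$T_\alpha(P_n) = \frac{1}{\alpha-1}\left(1 - \sum_x P_n(x)^\alpha\right) \tag{35}$$
$$= \frac{1}{\alpha-1}\left(1 - \frac{1}{2^\alpha} - \frac{2^{n-1}}{2^{n\alpha}}\right) \tag{36}$$
$$= \frac{2^\alpha - 1}{2^\alpha(\alpha-1)} - \frac{2^{n(1-\alpha)}}{2(\alpha-1)} \tag{37}$$

Using inequality (21), we get

$$\sum_x P_n(x)K(x) - T_\alpha(P_n) = \sum_x P_n(x)K(x) - \frac{2^\alpha - 1}{2^\alpha(\alpha-1)} + \frac{2^{n(1-\alpha)}}{2(\alpha-1)} \tag{38}$$
$$\ge \frac{n+1}{2} - \frac{2^\alpha - 1}{2^\alpha(\alpha-1)} + \frac{2^{n(1-\alpha)}}{2(\alpha-1)} \tag{39}$$

Since $\alpha > 1$, for $n$ sufficiently large,

$$\frac{n+1}{2} - \frac{2^\alpha - 1}{2^\alpha(\alpha-1)} + \frac{2^{n(1-\alpha)}}{2(\alpha-1)} \ge \frac{n+1}{2} - \frac{2^\alpha - 1}{2^\alpha(\alpha-1)} \tag{40}$$
$$> c\log n \ge K(P_n) \tag{41}$$

4. Using inequality (31), we get

$$\sum_x P_n(x)K(x) - T_\alpha(P_n) \le \frac{n+1}{2} + c\log n - \frac{2^\alpha - 1}{2^\alpha(\alpha-1)} + \frac{2^{n(1-\alpha)}}{2(\alpha-1)} \tag{42}$$

Since $\alpha < 1$, for $n$ sufficiently large, we conclude that:

$$\frac{n+1}{2} + c\log n + \frac{2^\alpha - 1}{2^\alpha(1-\alpha)} - \frac{2^{n(1-\alpha)}}{2(1-\alpha)} < 0 \tag{43}$$

We generalize this result, obtaining the following theorem, whose proof is similar to that of the previous proposition.

Theorem 3.4. For every $\Delta > 0$ and $\alpha > 1$ there are recursive probability distributions $P$ such that:

1. $\sum_x P(x)K(x) - R_\alpha(P) \ge (K(P))^\alpha$;
2. $\sum_x P(x)K(x) - R_\alpha(P) \ge K(P) + \Delta$;
3. $\sum_x P(x)K(x) - T_\alpha(P) \ge (K(P))^\alpha$;
4. $\sum_x P(x)K(x) - T_\alpha(P) \ge K(P) + \Delta$.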
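The closed forms (19), (24) and (37) for the distribution $P_n$ used in the proof of Proposition 3.3 can be compared against a direct evaluation of the definitions. The following Python sketch is an added illustration, not part of the paper. Since $K$ is not computable, the script only evaluates the bounds appearing in the proof, and it does so under the purely hypothetical choice $c = 1$ in $K(P_n) \le c\log n$; all function names are assumptions.

```python
import math


def p_n(n):
    """Probabilities of P_n: 1/2 on 0^n and 2^-n on each of the 2^(n-1) strings 1x'."""
    return [0.5] + [2.0 ** -n] * (2 ** (n - 1))


def shannon_direct(p):
    return -sum(q * math.log2(q) for q in p if q > 0)


def renyi_direct(p, alpha):
    return math.log2(sum(q ** alpha for q in p)) / (1 - alpha)


def tsallis_direct(p, alpha):
    return (1 - sum(q ** alpha for q in p)) / (alpha - 1)


def shannon_closed(n):
    """Equation (19)."""
    return (n + 1) / 2


def renyi_closed(n, alpha):
    """Equation (24)."""
    return (math.log2(2 ** ((n - 1) * alpha) + 2 ** (n - 1)) - n * alpha) / (1 - alpha)


def tsallis_closed(n, alpha):
    """Equation (37)."""
    return ((2 ** alpha - 1) / (2 ** alpha * (alpha - 1))
            - 2 ** (n * (1 - alpha)) / (2 * (alpha - 1)))


if __name__ == "__main__":
    C = 1.0  # hypothetical constant in K(P_n) <= C * log2(n); not from the paper
    for n in (10, 20, 40):
        k_bound = C * math.log2(n)
        # alpha > 1: lower bound on sum P K - R_alpha - K(P_n), cf. (26)
        gap_renyi = shannon_closed(n) - renyi_closed(n, 2.0) - k_bound
        # alpha < 1: upper bound on sum P K - T_alpha, cf. (42)
        gap_tsallis = shannon_closed(n) + k_bound - tsallis_closed(n, 0.5)
        print(n, gap_renyi, gap_tsallis)
```

Under that assumed constant, the first printed quantity grows roughly like $n/2$ while the second becomes strongly negative, which matches the direction of items 1 and 4 of the proposition.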

The previous results show that only the Shannon entropy satisfies the inequalities of Theorem 2.12. Now, we analyze if a similar relationship holds in a time-bounded Kolmogorov complexity scenario. If, instead of considering $K(P)$ and $K(x)$ in the inequalities of Theorem 2.12, we use their time-bounded versions and impose some computational restrictions on the distributions, we obtain a similar result. Notice that, for the class of distributions mentioned in the following theorem, the entropy equals (up to a constant) the expected value of time-bounded Kolmogorov complexity.

Theorem 3.5. Let $P$ be a probability distribution over $\Sigma^n$ such that $P^*$, the cumulative probability distribution of $P$, is computable in time $t(n)$. Setting $t'(n) = O\left(n\,t(n)\log(n\,t(n))\right)$, we have

$$0 \le \sum_x P(x)K^{t'}(x) - H(P) \le K^{t'}(P) \tag{44}$$

Proof. The first inequality follows directly from the similar inequality of Theorem 2.12 and from the fact that $K^{t'}(x) \ge K(x)$. From Theorem 2.8, if $P$ is a probability distribution such that $P^*$ is computable in time $t(n)$, then for all $x \in \Sigma^n$ and $t'(n) = n\,t(n)\log(n\,t(n))$, $K^{t'}(x) + \log P(x) \le K^{t'}(P)$. Then, summing over all $x$, we get:

$$\sum_x P(x)\left(K^{t'}(x) + \log P(x)\right) \le \sum_x P(x)K^{t'}(P) \tag{45}$$

which is equivalent to $\sum_x P(x)K^{t'}(x) - H(P) \le K^{t'}(P)$.

4. On the Entropies of the Time-Bounded Universal Distribution

If the time that a program can use to produce a string is bounded, we get the so-called time-bounded universal distribution, $m^t(x) = c \cdot 2^{-K^t(x)}$. In this section, we study the convergence of the three entropies with respect to this distribution.

Theorem 4.1. The Shannon entropy of the distribution $m^t$ diverges.

Proof. If $x \ge 2$ then $f(x) = x\,2^{-x}$ is a decreasing function. Let $A$ be the set of strings such that $-\log m^t(x) \ge 2$; this set is recursively enumerable.

$$H(m^t) = -\sum_{x \in \Sigma^*} m^t(x)\log m^t(x) \tag{46}$$
$$\ge -\sum_{x \in A} m^t(x)\log m^t(x) \tag{47}$$
$$= \sum_{x \in A} c\,2^{-K^t(x)}\left(K^t(x) - \log c\right) \tag{48}$$
$$= -c\log c\sum_{x \in A} 2^{-K^t(x)} + c\sum_{x \in A} K^t(x)\,2^{-K^t(x)} \tag{49}$$

So, if we prove that $\sum_{x \in A} K^t(x)\,2^{-K^t(x)}$ diverges, the result follows. Assume, by contradiction, that $\sum_{x \in A} K^t(x)\,2^{-K^t(x)} < d$ for some $d \in \mathbb{R}$. Then, considering

$$r(x) = \begin{cases} \frac{1}{d}K^t(x)\,2^{-K^t(x)} & \text{if } x \in A \\ 0 & \text{otherwise} \end{cases} \tag{50}$$

we conclude that $r$ is a semi-measure. Thus, there is a constant $c'$ such that, for all $x$, $r(x) \le c'\,m(x)$. Hence, for $x \in A$, we have

$$\frac{1}{d}K^t(x)\,2^{-K^t(x)} \le c'\,2^{-K(x)} \tag{51}$$

So, $K^t(x) \le c'\,d\,2^{K^t(x) - K(x)}$, which is a contradiction since, by Theorem 2.4, $A$ contains infinitely many strings $x$ of time-bounded Kolmogorov complexity larger than a constant such that $K^t(x) = K(x) + O(1)$. The contradiction follows from the assumption that the sum $\sum_{x \in A} K^t(x)\,2^{-K^t(x)}$ converges. So, $H(m^t)$ diverges.

Now we show that, similarly to the behavior of Rényi and Tsallis entropies of the universal distribution $m$ (see [15]), we have $R_\alpha(m^t) < \infty$ iff $\alpha > 1$ and $T_\alpha(m^t) < \infty$ iff $\alpha > 1$. First observe that:

1. If $\alpha > 1$, $T_\alpha(P) \le \frac{1}{\alpha-1} + R_\alpha(P)$;
2. If $\alpha < 1$, $T_\alpha(P) \ge \frac{1}{\alpha-1} + R_\alpha(P)$.

Theorem 4.2. The Tsallis entropy of order $\alpha$ of the time-bounded universal distribution $m^t$ converges iff $\alpha > 1$.

Proof. From Theorem 8 of [15], we have that $\sum_{x \in \Sigma^*} (m(x))^\alpha$ converges if $\alpha > 1$. Since $m^t$ is a probability distribution, there is a constant $\lambda$ such that, for all $x$, $m^t(x) \le \lambda\,m(x)$. So, $(m^t(x))^\alpha \le (\lambda\,m(x))^\alpha$, which implies that

$$\sum_{x \in \Sigma^*} (m^t(x))^\alpha \le \lambda^\alpha \sum_{x \in \Sigma^*} (m(x))^\alpha \tag{52}$$

from which we conclude that for $\alpha > 1$, $T_\alpha(m^t)$ converges.

For $\alpha < 1$, the proof is analogous to the proof of Theorem 4.1. Suppose that $\sum_{x \in \Sigma^*} (m^t(x))^\alpha < d$ for some $d \in \mathbb{R}$. Hence, $r(x) = \frac{1}{d}(m^t(x))^\alpha$ is a constructive semi-measure. Then, there is a constant $\tau$ such that for all $x \in \Sigma^*$,

$$r(x) = \frac{1}{d}\left(c\,2^{-K^t(x)}\right)^\alpha \le \tau\,2^{-K(x)} \tag{53}$$

which is equivalent to

$$\frac{c^\alpha}{d\tau} \le 2^{\alpha K^t(x) - K(x)} \tag{54}$$

By Theorem 2.4, there are infinitely many strings $x$ such that $K^t(x) = K(x) + O(1)$. Then it would follow that for these strings $\frac{c^\alpha}{d\tau} \le 2^{(\alpha-1)K(x)}$, which is false since these particular strings can have arbitrarily large Kolmogorov complexity.

Theorem 4.3. The Rényi entropy of order $\alpha$ of the time-bounded universal distribution $m^t$ converges iff $\alpha > 1$.

Proof. For $\alpha > 1$, since $\sum_x 2^{-K^t(x)} < +\infty$, we have $\sum_x \left(2^{-K^t(x)}\right)^\alpha < \infty$. Thus, $R_\alpha(m^t)$ converges.

For $\alpha < 1$, suppose without loss of generality that $\alpha$ is rational (otherwise take another rational slightly larger than $\alpha$). Assume that $\sum_x \left(2^{-K^t(x)}\right)^\alpha < \infty$. Then by universality of $m$ and since $\left(2^{-K^t(x)}\right)^\alpha$ is computable, we would have $m(x) \ge \left(2^{-K^t(x)}\right)^\alpha$ up to a multiplicative constant, which, by taking logarithms, is equivalent to $K^t(x) \ge \frac{1}{\alpha}K(x) + O(1)$. Since $1/\alpha > 1$, this would contradict Solovay's Theorem (Theorem 2.4).

5. Uniform Continuity of the Entropies

Shannon, Rényi, and Tsallis entropies are very useful measures in Information Theory and Physics. In this section we look at these entropies as functions of the corresponding probability distribution, and establish an important property: all three entropies are uniformly continuous. We think that this mathematical result may be useful in future research on this topic. It should be mentioned that the uniform continuity of the Tsallis entropy for certain values of $\alpha$ has already been proved in [16].

5.1. Tsallis Entropy

In order to prove the uniform continuity of the Tsallis entropy, we need some technical lemmas.

Lemma 5.1. Let $0 \le b < a \le 1$ and $0 < \alpha < 1$. We have

$$(a - b)^\alpha \ge a^\alpha - b^\alpha \tag{55}$$

Proof. Consider the function $f(x) = x^\alpha$ for $0 < \alpha < 1$ and $x \ge 0$. So, $f'(x) = \alpha x^{\alpha-1}$ is positive for all $x > 0$ and $f''(x) = \alpha(\alpha-1)x^{\alpha-2}$ is negative. We want to prove that:

$$f(a - b) > f(a) - f(b) \tag{56}$$

If we shift $a$ and $b$ to the left,

$$\begin{cases} a' = a - e \\ b' = b - e \end{cases} \tag{57}$$

it is easy to show that the value of $f(a') - f(b')$ (for $e \ge 0$) is greater than $f(a) - f(b)$. In particular, if $e = b$,

$$f(a') - f(b') = f(a - b) - 0 = f(a - b) > f(a) - f(b) \tag{58}$$

which means that $(a - b)^\alpha > a^\alpha - b^\alpha$.

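Lemma 5.2. Let $0 \le b < a \le 1$ and $\alpha > 1$. We have

$$a^\alpha - b^\alpha \le \alpha(a - b) \tag{59}$$

Proof. Consider the function $f(x) = x^\alpha$ for $\alpha > 1$ and $x \ge 0$. Let $0 \le b < a \le 1$. The function $f$ is continuous in $[b, a]$ and differentiable in $(b, a)$. So, by Lagrange's Theorem,

$$\exists c \in (b, a) : a^\alpha - b^\alpha = f'(c)(a - b) \tag{60}$$

Note that $f'(c) = \alpha c^{\alpha-1} < \alpha$. Thus, $f'(c)(a - b) < \alpha(a - b)$.

Theorem 5.3. Let $P$ and $Q$ be two probability distributions over $\Sigma^n$. Let $T_\alpha(P)$ and $T_\alpha(Q)$ be the Tsallis entropy of distributions $P$ and $Q$, respectively, where $\alpha > 0$, $\alpha \ne 1$. Then,

$$\forall \varepsilon > 0,\ \exists \delta > 0,\ \forall P, Q : \max_x |P(x) - Q(x)| \le \delta \Rightarrow |T_\alpha(P) - T_\alpha(Q)| \le \varepsilon \tag{61}$$

Proof. If $\alpha < 1$, we have:

$$|T_\alpha(P) - T_\alpha(Q)| = \left|\frac{1}{\alpha-1} + \frac{1}{1-\alpha}\sum_x P(x)^\alpha - \frac{1}{\alpha-1} - \frac{1}{1-\alpha}\sum_x Q(x)^\alpha\right| \tag{62}$$
$$= \frac{1}{1-\alpha}\left|\sum_x P(x)^\alpha - \sum_x Q(x)^\alpha\right| \tag{63}$$
$$\le \frac{1}{1-\alpha}\sum_x \left|P(x)^\alpha - Q(x)^\alpha\right| \tag{64}$$
$$\le \frac{1}{1-\alpha}\sum_x \left|P(x) - Q(x)\right|^\alpha, \text{ by Lemma 5.1} \tag{65}$$
$$\le \frac{1}{1-\alpha}\sum_x \delta^\alpha = \frac{2^n\delta^\alpha}{1-\alpha} \tag{66}$$

Thus, it is sufficient to consider $\delta = \left(\frac{(1-\alpha)\varepsilon}{2^n}\right)^{1/\alpha}$.

If $\alpha > 1$, we have:

$$|T_\alpha(P) - T_\alpha(Q)| = \left|\frac{1}{\alpha-1} - \frac{1}{\alpha-1}\sum_x P(x)^\alpha - \frac{1}{\alpha-1} + \frac{1}{\alpha-1}\sum_x Q(x)^\alpha\right| \tag{67}$$
$$= \frac{1}{\alpha-1}\left|\sum_x P(x)^\alpha - \sum_x Q(x)^\alpha\right| \tag{68}$$
$$\le \frac{1}{\alpha-1}\sum_x \left|P(x)^\alpha - Q(x)^\alpha\right| \tag{69}$$
$$\le \frac{\alpha}{\alpha-1}\sum_x \left|P(x) - Q(x)\right|, \text{ by Lemma 5.2} \tag{70}$$
$$\le \frac{\alpha}{\alpha-1}\sum_x \delta = \frac{\alpha\,2^n\,\delta}{\alpha-1} \tag{71}$$

Thus, it is sufficient to consider $\delta = \frac{(\alpha-1)\,\varepsilon}{\alpha\,2^n}$.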
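The two choices of $\delta$ obtained in the proof of Theorem 5.3 can be exercised numerically on a small alphabet. The sketch below is an added illustration under stated assumptions, not the authors' code: `delta_for` encodes the two formulas above for distributions over $2^n$ strings, and `perturb` is a hypothetical helper that builds one particular $Q$ with $\max_x |P(x) - Q(x)| \le \delta$ by moving mass from the most likely outcome to the least likely one. It checks a single $Q$, so it illustrates the bound rather than proving it.

```python
def tsallis(p, alpha):
    """Tsallis entropy of order alpha (alpha > 0, alpha != 1)."""
    return (1 - sum(q ** alpha for q in p if q > 0)) / (alpha - 1)


def delta_for(eps, n, alpha):
    """delta from the proof of Theorem 5.3 for distributions over 2^n strings."""
    size = 2 ** n
    if alpha < 1:
        return ((1 - alpha) * eps / size) ** (1 / alpha)
    return (alpha - 1) * eps / (alpha * size)


def perturb(p, delta):
    """Hypothetical helper: shift at most delta of probability mass from the
    largest entry of p to the smallest one, keeping the total equal to 1."""
    q = list(p)
    hi = max(range(len(q)), key=q.__getitem__)
    lo = min(range(len(q)), key=q.__getitem__)
    shift = min(delta, q[hi])
    q[hi] -= shift
    q[lo] += shift
    return q


if __name__ == "__main__":
    n, eps = 3, 0.01
    weights = range(1, 2 ** n + 1)
    p = [w / sum(weights) for w in weights]   # a non-uniform P over 8 outcomes
    for alpha in (0.5, 2.0):
        d = delta_for(eps, n, alpha)
        q = perturb(p, d)
        print(alpha, d, abs(tsallis(p, alpha) - tsallis(q, alpha)))
```

The printed differences stay well below $\varepsilon$ for this $P$, which is expected since the bounds (66) and (71) sum the worst case over all $2^n$ outcomes while the perturbation changes only two of them.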

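5.2. Shannon Entropy

Lemma 5.4. The function $x\log x$ is uniformly continuous in $[0, 1]$.

Proof. The function is continuous in $[0, 1]$ and this is a compact set, so the function is uniformly continuous.

Theorem 5.5. Let $P$ and $Q$ be two probability distributions over $\Sigma^n$. Let $H(P)$ and $H(Q)$ be the Shannon entropy of distributions $P$ and $Q$, respectively. Then,

$$\forall \varepsilon > 0,\ \exists \delta > 0,\ \forall P, Q : \max_x |P(x) - Q(x)| \le \delta \Rightarrow |H(P) - H(Q)| \le \varepsilon \tag{72}$$

Proof. By Lemma 5.4, we have that:

$$\forall \gamma > 0,\ \exists \beta > 0,\ \forall x, y : |x - y| \le \beta \Rightarrow |x\log x - y\log y| \le \gamma \tag{73}$$

So,

$$|H(P) - H(Q)| = \left|-\sum_x P(x)\log P(x) + \sum_x Q(x)\log Q(x)\right| \tag{74}$$
$$= \left|\sum_x \left(P(x)\log P(x) - Q(x)\log Q(x)\right)\right| \tag{75}$$
$$\le \sum_x \left|P(x)\log P(x) - Q(x)\log Q(x)\right| \tag{76}$$
$$\le \sum_x \gamma = 2^n\gamma \tag{77}$$

It is sufficient to consider $\gamma = \frac{\varepsilon}{2^n}$ and $\delta = \beta$.

5.3. Rényi Entropy

Lemma 5.6. Let $a$ and $b$ be two real numbers greater than or equal to 1. Then,

$$|\log a - \log b| \le 2|a - b| \tag{78}$$

Proof. Consider the function $f(x) = \log x$ and $x \ge 1$. Let $a, b \ge 1$. The function $f$ is continuous in $[a, b]$ and differentiable in $(a, b)$. So, by Lagrange's Theorem,

$$\exists c \in (a, b) : |\log a - \log b| = |f'(c)|\,|a - b| \tag{79}$$

Note that $f'(c) = \frac{1}{c\ln 2} < 2$. Thus, $|f'(c)|\,|a - b| < 2|a - b|$.

Lemma 5.7. The function $x^\alpha$ is uniformly continuous in $[0, 1]$. Thus,

$$\forall \gamma > 0,\ \exists \beta > 0,\ \forall x, y : |x - y| \le \beta \Rightarrow |x^\alpha - y^\alpha| \le \gamma \tag{80}$$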

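Proof. The function is continuous in $[0, 1]$ and this is a compact set, so the function is uniformly continuous.

Theorem 5.8. Let $P$ and $Q$ be two probability distributions over $\Sigma^n$. Let $R_\alpha(P)$ and $R_\alpha(Q)$ be the Rényi entropy of distributions $P$ and $Q$, respectively, where $\alpha > 0$, $\alpha \ne 1$. Then,

$$\forall \varepsilon > 0,\ \exists \delta > 0,\ \forall P, Q : \max_x |P(x) - Q(x)| \le \delta \Rightarrow |R_\alpha(P) - R_\alpha(Q)| \le \varepsilon \tag{81}$$

Proof. If $\alpha < 1$: Consider $\gamma = \frac{\varepsilon(1-\alpha)}{2^{n+1}}$. By Lemma 5.7, we know that there is $\beta$ such that if $|x - y| \le \beta$ then $|x^\alpha - y^\alpha| \le \gamma$. So, consider $\delta = \beta$. We have to show that if $\max_x |P(x) - Q(x)| \le \delta$ then $|R_\alpha(P) - R_\alpha(Q)| \le \varepsilon$:

$$|R_\alpha(P) - R_\alpha(Q)| = \left|\frac{1}{1-\alpha}\log\sum_x P(x)^\alpha - \frac{1}{1-\alpha}\log\sum_x Q(x)^\alpha\right| \tag{82}$$
$$= \frac{1}{1-\alpha}\left|\log\sum_x P(x)^\alpha - \log\sum_x Q(x)^\alpha\right| \tag{83}$$
$$\le \frac{2}{1-\alpha}\left|\sum_x \left(P(x)^\alpha - Q(x)^\alpha\right)\right|, \text{ by Lemma 5.6} \tag{84}$$
$$\le \frac{2}{1-\alpha}\sum_x \left|P(x)^\alpha - Q(x)^\alpha\right| \tag{85}$$
$$\le \frac{2}{1-\alpha}\sum_x \frac{\varepsilon(1-\alpha)}{2^{n+1}}, \text{ by Lemma 5.7} \tag{86}$$
$$= \frac{2}{1-\alpha}\,2^n\,\frac{\varepsilon(1-\alpha)}{2^{n+1}} = \varepsilon \tag{87}$$

If $\alpha > 1$: One of the $P(x)$ is at least $1/2^n$, so that $\sum_x P(x)^\alpha \ge 1/2^{\alpha n}$ and $\frac{d}{dx}\log x = \frac{1}{x\ln 2} \le \frac{2^{\alpha n}}{\ln 2}$ on the relevant range; assume that $\max_x |P(x) - Q(x)| \le \delta$. We have

$$\left|\log\sum_x P(x)^\alpha - \log\sum_x Q(x)^\alpha\right| \le \frac{2^{\alpha n}}{\ln 2}\left|\sum_x P(x)^\alpha - \sum_x Q(x)^\alpha\right| \tag{88}$$

But, by Lemma 5.2, we have

$$\left|\sum_x P(x)^\alpha - \sum_x Q(x)^\alpha\right| \le \sum_x \left|P(x)^\alpha - Q(x)^\alpha\right| \le 2^n\,\delta\,\alpha \tag{89}$$

Thus,

$$\left|\log\sum_x P(x)^\alpha - \log\sum_x Q(x)^\alpha\right| \le \frac{\alpha\,2^{n(\alpha+1)}\,\delta}{\ln 2} \tag{90}$$

We get

$$|R_\alpha(P) - R_\alpha(Q)| \le \frac{\alpha\,2^{n(\alpha+1)}}{\ln 2\,(\alpha-1)}\,\delta \tag{91}$$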

It is sufficient to consider $\delta = \frac{\ln 2\,(\alpha-1)}{\alpha\,2^{n(\alpha+1)}}\,\varepsilon$.

6. Conclusions

We proved that, among the three entropies we have studied, Shannon entropy is the only one that satisfies the relationship with the expected value of Kolmogorov complexity stated in [1], by exhibiting a probability distribution for which the relationship fails for some values of $\alpha$ of Tsallis and Rényi entropies. Furthermore, under the assumption that the cumulative probability distribution is computable in an allotted time, a time-bounded version of the same relationship holds for the Shannon entropy. Since it is natural to define a probability distribution based on time-bounded Kolmogorov complexity, we studied the convergence of this distribution under Shannon entropy and its two generalizations: Tsallis and Rényi entropies. We also proved that the three entropies considered in this work are uniformly continuous.

Acknowledgements

This work was partially supported by the project CSI (PTDC/EIA-CCO/09995/008), the grants SFRH/BD/3334/007 and SFRH/BD/849/006 from Fundação para a Ciência e Tecnologia (FCT) and funds granted to Instituto de Telecomunicações and Laboratório de Inteligência Artificial e Ciência de Computadores through the Programa de Financiamento Plurianual.

References

1. Li, M.; Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications, 3rd ed.; Springer Publishing Company, Incorporated: New York, NY, USA, 2008.
2. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379-423, 623-656.
3. Rényi, A. On measures of entropy and information. In Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability, Statistical Laboratory of the University of California, Berkeley, CA, USA, 20 June-30 July 1960; pp. 547-561.
4. Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479-487.
5. Cachin, C. Entropy Measures and Unconditional Security in Cryptography, 1st ed.; ETH Series in Information Security and Cryptography; Hartung-Gorre Verlag: Konstanz, Germany, 1997.
6. Cover, T.; Gacs, P.; Gray, M. Kolmogorov's contributions to information theory and algorithmic complexity. Ann. Probab. 1989, 17, 840-865.
7. Solomonoff, R. A formal theory of inductive inference; Part I. Inform. Contr. 1964, 7, 1-22.
8. Kolmogorov, A.N. Three approaches to the quantitative definition of information. Probl. Inform. Transm. 1965, 1, 1-7.
9. Chaitin, G.J. On the length of programs for computing finite binary sequences. J. ACM 1966, 13, 547-569.
10. Solovay, R.M. Draft of a paper (or series of papers) on Chaitin's work. IBM Thomas J. Watson Research Center, Yorktown Heights, New York, NY, USA. Unpublished manuscript, 1975.
11. Levin, L. Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Prob. Peredachi Inf. 1974, 10, 206-210.
12. Gacs, P. On the symmetry of algorithmic information. Soviet Mathematics Doklady 1974, 15, 1477-1480.
13. Grunwald, P.; Vitanyi, P. Shannon information and Kolmogorov complexity. arXiv 2004, arXiv:cs/0410002v1.
14. Ramshaw, J. Thermodynamic stability conditions for the Tsallis and Rényi entropies. Phys. Lett. A 1995, 198, 119-121.
15. Tadaki, K. The Tsallis entropy and the Shannon entropy of a universal probability. In Proceedings of the International Symposium on Information Theory, Toronto, Canada, 6-11 July 2008; abs/0805.0154, pp. 2111-2115.
16. Zhang, Z. Uniform estimates on the Tsallis entropies. Lett. Math. Phys. 2007, 80, 171-181.

© 2011 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).