7 June 2002

Physics Letters A 298 (2002) 369–374
www.elsevier.com/locate/pla

Thermostatistics based on Kolmogorov–Nagumo averages: unifying framework for extensive and nonextensive generalizations

Marek Czachor a,b, Jan Naudts b,*

a Katedra Fizyki Teoretycznej i Metod Matematycznych, Politechnika Gdańska, 80-952 Gdańsk, Poland
b Departement Natuurkunde, Universiteit Antwerpen UIA, Universiteitsplein 1, B-2610 Antwerpen, Belgium

Received 24 January 2002; received in revised form April 2002; accepted April 2002
Communicated by C.R. Doering

Abstract

We show that extensive thermostatistics based on Rényi entropy and Kolmogorov–Nagumo averages can be expressed in terms of Tsallis' nonextensive thermostatistics. We use this correspondence to generalize thermostatistics to a large class of Kolmogorov–Nagumo means and suitably adapted definitions of entropy. As an application, we reanalyze linguistic data discussed in a paper by Montemurro. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 05.20.Gg; 05.70.Ce

Keywords: Nonextensive thermostatistics; Rényi entropy; Nonlinear averages; Zipf–Mandelbrot law

Generalized averages of the form

⟨x⟩_φ = φ^{-1}( Σ_k p_k φ(x_k) ),   (1)

where φ is an arbitrary continuous and strictly monotonic function, were introduced into statistics by Kolmogorov [1] and Nagumo [2], and further generalized by de Finetti [3], Jessen [4], Kitagawa [5], Aczél [6] and many others. Their first applications in information theory can be found in the seminal papers by Rényi [7,8], who employed them to define a one-parameter family of measures of information (α-entropies)

I_α = φ_α^{-1}( Σ_k p_k φ_α( log_b(1/p_k) ) ) = (1/(1−α)) log_b Σ_k p_k^α.   (2)

The Kolmogorov–Nagumo (KN) function is here φ_α(x) = b^{(1−α)x}, a choice motivated by a theorem [9] stating that only affine or exponential φ satisfy

⟨x + C⟩_φ = ⟨x⟩_φ + C,   (3)

where C is a constant. The random variable

I_k = log_b(1/p_k)   (4)

represents the amount of information received by learning that an event of

* Corresponding author.
E-mail addresses: mczachor@pg.gda.pl (M. Czachor), jan.naudts@ua.ac.be (J. Naudts).
probability p_k took place [10,11];

0375-9601/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0375-9601(02)00540-6
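As a concrete numerical illustration of definitions (1)–(4), the following Python sketch (the function names are ours, not from the Letter) evaluates the KN average for an arbitrary φ and recovers the Rényi entropy by KN-averaging the information random variable I_k = −log_b p_k with φ_α(x) = b^{(1−α)x}:

```python
import math

def kn_average(values, probs, phi, phi_inv):
    """Kolmogorov-Nagumo average <x>_phi = phi^{-1}(sum_k p_k phi(x_k)), Eq. (1)."""
    return phi_inv(sum(p * phi(x) for p, x in zip(probs, values)))

def renyi_entropy(probs, alpha, b=math.e):
    """Renyi alpha-entropy, Eq. (2), obtained as the KN average of the
    information random variable I_k = -log_b p_k with phi_alpha(x) = b^((1-alpha)x)."""
    phi = lambda x: b ** ((1 - alpha) * x)          # KN function of Eq. (2)
    phi_inv = lambda y: math.log(y, b) / (1 - alpha)
    info = [-math.log(p, b) for p in probs]          # I_k of Eq. (4)
    return kn_average(info, probs, phi, phi_inv)
```

For the uniform distribution over N events this returns log_b N for every α, and for nonuniform p it reproduces the closed form (1/(1−α)) log_b Σ_k p_k^α on the right-hand side of (2).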
b specifies the units of information (b = 2 corresponds to bits); below we use b = e, which is more common in the physics literature. α-entropies were also derived in a purely pragmatic manner in [12], as measures of information for concrete information-theoretic problems.

The above derivation of I_α clearly shows the two elements which led Rényi to the idea of α-entropy: (1) one needs a generalized average, and (2) the random variable one averages is the logarithmic measure of information. The latter has a well-known heuristic explanation which goes back to Hartley [13]: to uniquely specify a single element of a set containing N elements one needs log_2 N bits; but if one splits the set into n subsets containing, respectively, N_1, ..., N_n elements (Σ_i N_i = N), then in order to specify only in which subset the element of interest is located it is enough to have log_2 N − log_2 N_i = log_2(N/N_i) bits of information. The latter construction ignores the information encoded in correlations between the subsets. For this reason the sum of the pieces of information characterizing the subsets is typically larger than the information characterizing the entire set. The idea is used in data-compression algorithms and is essential for the argument we present below. In particular, we shall see that for systems with long-range correlations a natural candidate is I_k = ln_α(1/p_k), where ln_α(·) is the deformed logarithm [14] (ln_1(·) = ln(·)).

Although α-entropies are occasionally used in statistical physics [15,16], it seems the same cannot be said of KN-averages. Thinking of the original motivation behind generalized entropies, one may wonder whether this is not logically inconsistent. Constructing statistical physics with α-entropies, one should consistently apply KN-averaging to all random variables, internal energy included. Applying the procedure to thermostatistics, one may expect to arrive at a one-parameter family of equilibrium states which, in the limit α → 1, reproduce
Boltzmann–Gibbs statistics.

During the past ten years it has become quite clear that there is a need for some generalization of standard thermostatistics, as exemplified by the ongoing efforts in Tsallis' q-thermodynamics [17]. Systems with long-range correlations, memory effects or fractal boundaries are well described by q ≠ 1 Tsallis-type equilibria. The gradual development of this theory has made it clear that there is indeed a link between generalized entropies and generalized averages. However, the averages one uses in Tsallis statistics are the standard linear ones, expressed in terms of the so-called escort probabilities (see (14) below). So there is no direct link to KN-averages.

In what follows we present a thermostatistical theory based on KN-averages. The idea is to use maximum entropy principles in which the KN-averages are applied on an equal footing to entropies and constraints. As we shall see, there is a link between such a theory and Tsallis' thermostatistics. Actually, many technical developments obtained within the Tsallis scheme have a straightforward application in the new framework. An important difference with respect to the Tsallis theory is that we can obtain both nonextensive and extensive generalizations, so one may expect the formalism to have a still wider scope of applications.

Rényi's definition of entropy (2) becomes more natural if one notices that KN-averages are invariant under φ(x) → Aφ(x) + B, and one replaces φ_α by

φ_α(x) = (e^{(1−α)x} − 1)/(1−α) = ln_α( exp(x) ).   (5)

The α-entropy written with the help of the modified KN-function (5) is

φ_α^{-1}( Σ_k p_k φ_α(I_k) ) = φ_α^{-1}( Σ_k p_k φ_α(−ln p_k) )   (6)
= φ_α^{-1}( (Σ_k p_k^α − 1)/(1−α) ) = φ_α^{-1}( Σ_k p_k ln_α(1/p_k) ) = I_α.   (7)

It is interesting that in the course of the calculation of I_α the expression for the Havrda–Charvát–Daróczy–Tsallis entropy [15,17–19],

S_α(p) = (Σ_k p_k^α − 1)/(1−α),   (8)

arises. This shows that in the context of KN-means there is an intrinsic relation between I_α and S_α:

φ_α(I_α) = S_α.   (9)

Let us note that the formula

φ_α(I_k) = ln_α(1/p_k)   (10)

may hold also for other pairs (φ_α, I_k), with φ_α
not given by (5), and may remain valid even for measures of information different from the Hartley–Shannon–Wiener
random variable I_k = −ln p_k. The key assumption of the present Letter is that the generalized theory is characterized by the properties (9) and (10). One can see (10) as a definition of I_k in case φ_α is given, or as a constraint on φ_α if I_k is given. In particular, the choice φ_α(x) = x determines Tsallis' thermostatistics as one of the generalized thermostatistic formalisms under consideration here.

Generalized thermodynamics is obtained by maximizing I_α under the constraint of fixed internal energy

β_0⟨H⟩_{φ_α} = φ_α^{-1}( Σ_k p_k φ_α(β_0E_k) ) = β_0U,   (11)

where β_0 is a constant needed to make the averaged energy dimensionless. Equivalently, the problem may be reformulated as maximizing S_α, given by (8), under the constraint

Σ_k p_k φ_α(β_0E_k) = φ_α(β_0U).   (12)

This problem is of the type originally considered by Tsallis [17]. However, since then the formalism of nonextensive thermostatistics has evolved. In particular, one has learned [20] that the optimization problem should be reparametrized, for the following reasons. The standard thermodynamic relation for the temperature T is

1/T = dS/dU,   (13)

with S and U, respectively, the entropy and the energy calculated using the equilibrium averages. In generalized thermostatistics this definition of temperature is not necessarily correct. Recently it has been shown [21,22] that (13) is valid if the entropy is additive, and must be modified in all other cases. The reparametrization of nonextensive thermostatistics, by the introduction of escort probabilities, is such that the energy U becomes generically an increasing function of some (unphysical) temperature T′ (see, e.g., Proposition 3.5 of [23]), which is then related to the physical temperature T. The reparametrization is done by means of the q → 1/q duality [20,24]. The escort probabilities ρ_k are defined by

ρ_k = p_k^α / Σ_l p_l^α.   (14)

Clearly one also has the inverse relation

p_k = ρ_k^q / Σ_l ρ_l^q,   (15)

with q = 1/α. The above optimization problem is now equivalent to maximizing S_q(ρ) under the constraint

Σ_k ρ_k^q φ_{1/q}(β_0E_k) / Σ_l ρ_l^q = φ_{1/q}(β_0U).   (16)

This is so because S_q(ρ) is maximal if and only if S_{1/q}(p) is maximal (see [24]). The latter optimization problem is of the type studied in the new-style nonextensive thermostatistics [20]. The free energy F is defined by

β_0F = Σ_k ρ_k^q φ_α(β_0E_k) / Σ_l ρ_l^q − β_0T′ S_q(ρ).   (17)

Minima of β_0F, if they exist [23,25], are realized for distributions of the form [20]

ρ_k ∼ [1 + a x_k]^{−1/(q−1)}, if 1 < q,   (18)

or

ρ_k ∼ [1 − a x_k]_+^{1/(1−q)}, if 0 < q < 1.   (19)

Here x_k = φ_α(β_0E_k), and [x]_+ equals x if x is positive, zero otherwise. Expression (18), with 1/(q−1) replaced by 1 + κ, is called the kappa-distribution or generalized Lorentzian distribution [26]. There are several reasons why this distribution is of interest. In the first place, the Gibbs distribution, which determines the equilibrium average in the standard setting of thermodynamics [27], is obtained in the limit κ → +∞, or q → 1. The kappa-distribution is frequently used; for example, in plasma physics it describes an excess of highly energetic particles [28]. Typical for distribution (19) is that the probabilities p_k are identically zero whenever a x_k ≥ 1. This cutoff for high values of E_k is of interest in many areas of physics. In astrophysics it has been used [29] to describe stellar systems with finite average mass. A statistical description of an electron captured in a Coulomb potential requires the cutoff to mask scattering states [24,30]. In standard statistical mechanics the treatment of vanishing probabilities requires infinite energies, which leads to ambiguities. These can be avoided if distributions of the type (19) are used.
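A small Python sketch (our notation, not from the Letter) of the escort map (14) makes the q ↔ 1/q duality concrete: applying the map with exponent α and then again with exponent q = 1/α returns the original distribution, which is the content of the inverse relation (15).

```python
def escort(p, alpha):
    """Escort distribution rho_k = p_k^alpha / sum_l p_l^alpha, Eq. (14)."""
    w = [x ** alpha for x in p]
    s = sum(w)
    return [x / s for x in w]

p = [0.5, 0.3, 0.2]
rho = escort(p, 2.0)            # escort probabilities for alpha = 2
p_back = escort(rho, 1 / 2.0)   # inverse relation (15) with q = 1/alpha
```

Since ρ_k ∝ p_k^α, raising ρ_k to the power q = 1/α restores proportionality to p_k, so the two normalized maps are mutual inverses.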
The formulas that follow are based on results already found at many places in the literature, e.g., in [23]. The equilibrium average is the KN-average with p_k given by

p_k = (1/Z_1) [1 + a(1−α)(x_k − u)]_+^{1/(α−1)},   (20)

with x_k = φ_α(β_0E_k) and the normalization constants given by

Z_n = Σ_k [1 + a(1−α)(x_k − u)]_+^{(n(1−α)+α)/(α−1)},  n = 0, 1,   (21)

the normalization constant of (20) being the case n = 1. The unknown parameters a > 0 and u have to be fixed in such a way that (12) holds. This condition can be written as

φ_α(β_0U) = u + (1/((1−α)a)) (Z_0/Z_1 − 1).   (22)

The entropy I_α follows from (8) with (20). One obtains

φ_α(I_α) = (1/(1−α)) (Z_0/Z_1^α − 1).   (23)

The temperature T′ is given by (cf. Eq. (4) in [23])

T′ = Z_0^{(1+α)/α} / (aα Z_1²).   (24)

The set of Eqs. (20)–(24) is what is needed for applications.

Let us finally return to the specific case of Rényi's entropy, i.e., I_k and φ_α given, respectively, by (4) and (5). This choice is particularly interesting, since only then are the following three conditions satisfied:

⟨β_0H + β_0E⟩_{φ_α} = ⟨β_0H⟩_{φ_α} + β_0E,   (25)

⟨β_0H_{A+B}⟩_{φ_α} = ⟨β_0H_A⟩_{φ_α} + ⟨β_0H_B⟩_{φ_α},   (26)

I_α(A + B) = I_α(A) + I_α(B),   (27)

where A and B are two uncorrelated noninteracting systems. Condition (25), when combined with the explicit form of the equilibrium state, means that the equilibrium does not depend on the origin of the energy scale. The remaining two conditions imply that we have a one-parameter family of extensive generalizations of the Boltzmann–Gibbs statistics, the latter being recovered in the limit α → 1. For α = 1/q we obtain the well-known Tsallis-type kappa-distributions, but with the energies βE_k replaced by φ_α(β_0E_k). In general, the equilibrium probabilities are not of the product form (there is one exception, see below). The product form is of course also absent in the standard formalism when there are correlations between subsystems. Nevertheless, if the correlations are not too strong, then the system in equilibrium is still extensive. This is expressed by stating that the so-called thermodynamic limit exists. We expect that also in the present formalism the thermodynamic
limit exists. We have checked this statement for the Curie–Weiss model [33].

Consider now the case a = 1/(1 + (1−α)u) in (20). This is a remarkable case, because the equilibrium distribution (20) becomes exponential. Indeed, one verifies that

p_k = (1/Z) e^{−β_0E_k}, with Z = Σ_k e^{−β_0E_k}.   (28)

The internal energy equals

β_0U = (1/(1−α)) ln( (1/Z) Σ_k e^{−αβ_0E_k} ) = I_α − ln Z.   (29)

This means that for each system there exists a particular temperature at which the equilibrium state is factorizable.

Still assuming that φ_α is given by (5), one can easily calculate the thermodynamic temperature T, as given by (13). One finds

β_0T = (dβ_0U/da) / (dS_α/da) = (1/(aα)) [1 + a(1−α)(φ_α(β_0U) − u)] / [1 + (1−α)φ_α(β_0U)].   (30)

This expression can be used to eliminate a from (20). With some effort one obtains

p_k ∼ [1 + (1−α)φ_α(β_0U) + (1−α)(φ_α(β_0E_k) − φ_α(β_0U))/(αβ_0T)]^{1/(α−1)}.   (31)

Using the definition (5) of φ_α one obtains

p_k = A [1 − λ + λ e^{(1−α)β_0(E_k−U)}]^{1/(α−1)},   (32)

where A is the appropriate normalization constant and λ = 1/(αβ_0T). From this result it is immediately
clear that the Boltzmann–Gibbs distribution follows in the limit α → 1. From (30) one sees that the special temperature T for which a = 1/(1 + (1−α)u) holds is 1/(αβ_0). Formula (32) shows also that β_0 controls the cross-over between different regimes of the temperature dependence of p_k, and is the energy analog of the cross-over time t_0 used in [32]. This clarifies the meaning of β_0.

It is quite remarkable that (32) is exactly the probability density postulated in [31] on an ad hoc basis in order to improve theoretical fits to experimental protein-folding data [32]. Although the original motivation of [31] was to interpret (32) as a signature of nonextensivity, we have derived it on the basis of Rényi's entropy, which is extensive. However, as shown in [31], the long-tail data are still better described by a small deviation from (32). An appropriate nonextensive (although inequivalent to Tsallis') departure from Rényi's φ_α is given by

φ_{αρ}(x) = ln_α( exp_ρ(x) ) = (1/(1−α)) { [1 + (1−ρ)x]^{(1−α)/(1−ρ)} − 1 }.   (33)

The corresponding equilibrium distribution is of the form (assuming U = 0; see [33])

p_k = A [1 − λ + λ(1 + (1−ρ)β_0E_k)^{(1−α)/(1−ρ)}]^{1/(α−1)}.   (34)

In the limit ρ → 1 this expression coincides with (32). The exponent ρ controls the tail of the distribution p_k; see Fig. 1.

Recently, Montemurro [34] analyzed linguistic data using (32) and further generalizations proposed in [31]. From our point of view (34) is the obvious generalization of (32). Therefore we have reanalyzed some of the fits made in [34]. Of particular interest are the compound data from a corpus of 2606 books in English, containing 448,359 different words. Surprisingly, a convincing 4-parameter fit to all 448,359 data points is possible using (34) in the limit α → 0, assuming E_k = k; see Fig. 2. The fitting parameters are ρ = 0.568, β_0 = 1/3086, and λ = 2630. The slope of the tail equals 1/(1−ρ) ≈ 2.32, in agreement with the value 2.3 mentioned in [34]. The root mean square error of the fit in the log–log plot equals 0.27.

Fig. 1. Log–log plot of p_k, with E_k = k, β_0 = 1/100, α = 0 and λ = 500, not normalized (A = 1), for different values of ρ (dotted, short-dashed, solid and long-dashed curves).

Fig. 2. Log–log plot of the frequency of words as a function of their ranking. Comparison of the experimental data (solid line) and the fitted curve (dotted line).

Let us summarize the results. We present a formalism of thermostatistics based on nonlinear KN-averages. The entropy is maximized under the constraint that the nonlinear average of the energy E equals a given value U. If the energy does not fluctuate, then linear and nonlinear averages coincide and our approach reduces to the standard one. However, in interesting systems the energy does fluctuate, in which case we obtain new results. Our formalism simultaneously generalizes the Boltzmann–Gibbs and Tsallis theories. As opposed to the Tsallis case, which is always nonextensive, the KN-approach allows for a family of extensive generalizations, which however lead to equilibrium states sharing many properties with Tsallis' q-distributions,
via the relation between I_α and S_α. The extensive case corresponds to the choice φ_α(x) = ln_α(exp x), since then the average information coincides with Rényi's entropy. As proved by Rényi, his entropy, together with that of Shannon [10], are the only additive entropies. As shown in [21,22], additivity of the entropy is a requirement for the physical temperature T to be defined by the usual thermodynamic relation (13).

The formalism generalizes to other, nonexponential choices of φ, provided the information measure is adapted in such a way that (9) and (10) still hold. In this more general context the entropy is no longer additive. In a natural way Tsallis' entropy appears as a tool for calculating equilibrium averages. This offers the opportunity to reuse the knowledge accumulated in Tsallis-like thermostatistics. A tempting question is whether in each of the many applications of Tsallis' thermostatistics one can find a natural KN-average which maps the problem into the present formalism.

In [33] we discuss the present formalism from a more fundamental point of view and give explicit examples. Here we mention only that the results for a two-level system and the Curie–Weiss model are satisfactory. Of course, more complicated examples should be studied.

For the sake of completeness let us mention that Rényi's entropy has already been studied [16] in relation with the escort probabilities (14). One of the conclusions of that paper is that the same results are obtained as in Tsallis' thermostatistics, which is not a surprise, since Rényi's entropy and Tsallis' entropy are monotonic functions of each other. The cross-over property of our p_k, a consequence of the KN-averaged constraints of our formalism, is absent for the distributions found in [16], since their constraints employ linear averages.

Acknowledgements

We are grateful to Dr. Montemurro for making available his numerical data. One of the authors (M.C.) wishes to thank the NATO for a research fellowship enabling his stay at the
Universiteit Antwerpen.

References

[1] A. Kolmogorov, Atti R. Accad. Naz. Lincei 12 (1930) 388.
[2] M. Nagumo, Japan. J. Math. 7 (1930) 71.
[3] B. de Finetti, Giornale di Istituto Italiano del Attuarii 2 (1931) 369.
[4] B. Jessen, Acta Sci. Math. 5 (1931) 108.
[5] T. Kitagawa, Proc. Phys.-Math. Soc. Japan 16 (1934) 117.
[6] J. Aczél, Bull. Amer. Math. Soc. 54 (1948) 392.
[7] A. Rényi, MTA III Oszt. Közl. 10 (1960) 251, reprinted in [35], pp. 526–552.
[8] A. Rényi, in: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1961, pp. 547–561, reprinted in [35], pp. 565–580.
[9] G.H. Hardy, J.E. Littlewood, G. Pólya, Inequalities, Cambridge, 1934, Theorem 89.
[10] C.E. Shannon, Bell System Tech. J. 27 (1948) 379;
C.E. Shannon, Bell System Tech. J. 27 (1948) 623.
[11] N. Wiener, Cybernetics, Wiley, New York, 1948.
[12] A. Rényi, Rev. Inst. Internat. Stat. 33 (1965) 1, reprinted in [35], pp. 304–318.
[13] R.V. Hartley, Bell System Tech. J. 7 (1928) 535.
[14] C. Tsallis, Quim. Nova 17 (1994) 468;
E.P. Borges, J. Phys. A 31 (1998) 5281.
[15] A. Wehrl, Rev. Mod. Phys. 50 (1978) 221.
[16] E.K. Lenzi, R.S. Mendes, L.R. da Silva, Physica A 280 (2000) 337.
[17] C. Tsallis, J. Stat. Phys. 52 (1988) 479.
[18] J. Havrda, F. Charvát, Kybernetika 3 (1967) 30.
[19] Z. Daróczy, Inform. Control 16 (1970) 36.
[20] C. Tsallis, R.S. Mendes, A.R. Plastino, Physica A 261 (1998) 534.
[21] S. Abe, S. Martinez, F. Pennini, A. Plastino, Phys. Lett. A 281 (2001) 126;
S. Martinez, F. Pennini, A. Plastino, Physica A 295 (2001) 246;
S. Martinez, F. Pennini, A. Plastino, Physica A 295 (2001) 416.
[22] R. Toral, cond-mat/0106060.
[23] J. Naudts, Rev. Math. Phys. 12 (2000) 1305.
[24] J. Naudts, Chaos Solitons Fractals 13 (2002) 445.
[25] J. Naudts, M. Czachor, in: S. Abe, Y. Okamoto (Eds.), Nonextensive Statistical Mechanics and its Applications, Lecture Notes in Physics, Vol. 560, Springer, 2001, pp. 243–252.
[26] A.V. Milovanov, L.M. Zelenyi, Nonlinear Processes Geophys. 7 (2000) 211.
[27] E.T. Jaynes, Phys. Rev. 106 (1957) 620.
[28] N. Meyer-Vernet, M. Moncuquet, S. Hoang, Icarus 116 (1995) 202.
[29] A.R. Plastino, A. Plastino, Phys. Lett. A 174 (1993) 384.
[30] L.S. Lucena, L.R. da Silva, C. Tsallis, Phys. Rev. E 51 (1995) 6247.
[31] C. Tsallis, G. Bemski, R.S. Mendes, Phys. Lett. A 257 (1999) 93.
[32] R.H. Austin et al., Phys. Rev. Lett. 32 (1974) 403.
[33] J. Naudts, M. Czachor, cond-mat/0110077.
[34] M. Montemurro, Physica A 300 (2001) 567.
[35] Selected Papers of Alfréd Rényi, Vol. 2, Akadémiai Kiadó, Budapest, 1976.