BOUNDS ON DIVERGENCE IN NONEXTENSIVE STATISTICAL MECHANICS

J. Indones. Math. Soc. Vol. 19, No. 2 (2013), pp. 89-97.

Takuya Yamano
Department of Information Sciences, Faculty of Science, Kanagawa University, 2946, 6-233 Tsuchiya, Hiratsuka, Kanagawa 259-1293, Japan
yamano@amy.hi-ho.ne.jp

Abstract. We focus on an important property of the generalization of the Kullback-Leibler divergence used in nonextensive statistical mechanics, i.e., its bounds. We explicitly show upper and lower bounds on it in terms of existing familiar divergences, based on the finite range of the probability distribution ratio. This provides a link between the observed distribution functions based on histograms of events and parameterized distance measures in the physical sciences. The characterizing parameter ranges q < 0 and q > 1 are excluded from consideration when a bounded divergence is required.

Key words: Information divergences, Relative entropies, Generalized divergences.

Abstrak (English translation of the Indonesian abstract). This paper focuses on an important property of the generalization of the Kullback-Leibler divergence used in nonextensive statistical mechanics, namely its bounds. Upper and lower bounds are shown explicitly in terms of well-known divergences, based on the finite range of the probability distribution ratio. This provides a link between observed distribution functions based on histograms of events and parameterized distance measures in the physical sciences. The parameters q < 0 and q > 1 are rejected as bounded divergences.

Kata kunci (key words): Information divergence, relative entropy, generalized divergence.

2000 Mathematics Subject Classification: 62B10, 94A15, 94A17.
Received: 03-12-2011, revised: 28-10-2012, accepted: 10-06-2013.

1. INTRODUCTION

The current upsurge of interest in divergence measures determined by two probability distributions is due both to their usefulness and to their necessity for practical discrimination between different states, and also for discovering how much those states differ from each other. Such scenarios appear in many areas which use statistical methods. Especially in statistical physics, the H-theorem is the notion most relevant to divergence measures, which probe proximity to a stationary distribution in the course of a given dynamics. Usually it is specified in terms of the Kullback-Leibler divergence (or relative entropy) [11]. For Markovian processes, however, the validity of the proof of the H-theorem is shown for a wide class of divergences (the Csiszár-Morimoto f-divergence) [2, 15]. For specific forms of the generalized divergence, it is presented in [1, 18, 24]. Historically, an attempt to build a parameterized divergence measure in a statistical mechanics context was presented in [14] without using a notion of averaged information; there, instead of the term divergence, the phrase "relative degradation function of nth order" was used.

The numerous properties that arise upon generalization of the conventional relative entropy are becoming an interesting research topic in their own right, because generalizations of a conventional measure provide insight into the original one. Among them, the ranges or bounds of divergences can be regarded as fundamental, since they contain structural properties reflecting the geometric manifold governed by the parameter used in the generalization. Also, the availability of bounds for the distance is important in physics and in statistical inference tests, where the bound can be used to give an estimate for specific states (usually equilibrium states). Therefore, upper and lower bounds for distance measures in general can be useful information and provide a clue for interpreting the meaning of the parameter.

The purpose of the present article is to provide such bounds for the Tsallis relative entropy, which have not been presented in the literature [23] so far. The approach here is based on the fact that a detector which produces statistical distributions of occurred events has a finite dynamic range and consequently has a finite probability distribution ratio.
Therefore, the bounds obtained must be more relevant from an observational point of view.

Our presentation proceeds as follows. First, we review the definition of the Tsallis relative entropy. Then we consider its bounds in terms of the usual relative divergence, which can provide a degree of change by the parameter q of generalization. We next consider the upper bound by the $l_1$-norm and the lower bound in terms of it (the so-called Pinsker-like inequality). Bounds by the $\chi^2$-divergence and by Hellinger's distance are presented as simple applications of inequalities that hold for generic f-divergences. We summarize our consideration and present a discussion in the last section.

2. THE NONEXTENSIVE RELATIVE ENTROPY

The nonextensive relative entropy was introduced in [22] in the context of consistent testing, and some of its properties were investigated in [1]. This generalization keeps the nonextensive thermostatistics picture [18] and belongs to the relative information of type s, which was proposed in [17]. At present, the generalization of the relative entropy which is consistent with the nonextensive entropy of Tsallis [21] is obtained by taking the linear mean (the so-called f-divergence [2]) of the corresponding distance measure f between two probability distributions [22],

$$D_q(p\|r) := \sum_i r_i f(t_i), \qquad f(t_i) := \frac{t_i^q - t_i}{q-1} \quad (q \in \mathbb{R}), \tag{1}$$

where $t_i$ denotes the ratio of the two probability distributions, i.e., $p_i/r_i$, throughout this paper. When $\mathrm{supp}\, p \not\subset \mathrm{supp}\, r$, where $\mathrm{supp}\, p = \{\omega \in \Omega;\ p(\omega) > 0\}$ in a σ-finite measure space Ω, the divergences become infinite, and this applies to the later considerations [25]. Alternatively, $D_q(p\|r)$ is produced by taking a biased average of the quantity $(p_i^{1-q} - r_i^{1-q})/(1-q)$ with weights $p_i^q$. This can be expressed as

$$D_q(p\|r) = \sum_i p_i^q \left( \frac{p_i^{1-q} - r_i^{1-q}}{1-q} \right).$$

3. BOUNDS IN TERMS OF USUAL KULLBACK-LEIBLER DIVERGENCE

We first describe our setting. In discrete cases, it is common to regard a histogram of observed values as a probability distribution associated with the system under study. This means that a value measured in a single measurement falls into one of the finite bins of a detector which has a finite dynamic range. The measuring apparatus consists of a limited number of bins, and therefore the probability distribution also has a finite support reflecting the dynamic range. It is therefore highly probable that when we compare two different probability distributions constructed in that manner, their ratio has a finite range within the identical bin. Let us call this quantity a ratio range hereafter. The ratio range becomes null if there is no detected event for the distribution r in the ith bin. Furthermore, we set the minimum and the maximum values u and U, respectively, on this ratio range for the ith bin: $0 < u \le p_i/r_i \le U < \infty$.

Under this setting, we consider bounds for the generalized relative entropy $D_q(p\|r)$, for which we can use Theorem 6 provided in [4]. It was proved that when $f \in C^2(u, U)$ and when $t f''(t)$ is bounded from below and from above by constants $m \in \mathbb{R}$ and $M \in \mathbb{R}$, respectively, an inequality holds in terms of the Kullback-Leibler divergence $D_{KL}(p\|r)$ from p to r. More concretely, for the general f-divergences $D_f(p\|r)$ we know the inequality

$$m D_{KL}(p\|r) \le D_f(p\|r) \le M D_{KL}(p\|r). \tag{2}$$

Note that $D_{KL}(p\|r)$ is the best-known measure of the f-divergence class and is obtained if we choose $f(t) = t \log t$ $(t > 0)$. For the divergence of Tsallis, we have $t f''(t) = q t^{q-1}$, so that

$$\sup_{t \in (u,U)} t f''(t) = \begin{cases} q U^{q-1} & (q > 1) \\ q u^{q-1} & (q < 1), \end{cases} \tag{3}$$

$$\inf_{t \in (u,U)} t f''(t) = \begin{cases} q u^{q-1} & (q > 1) \\ q U^{q-1} & (q < 1). \end{cases} \tag{4}$$
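As a numerical illustration of Eqs. (1)-(4), the following minimal sketch computes the Tsallis divergence for a pair of small histograms and checks that it lies between $m D_{KL}$ and $M D_{KL}$ as Eq. (2) asserts. The function names and the example distributions are hypothetical choices made for illustration only, and the constants m and M are taken as the smaller and larger endpoint values of $q t^{q-1}$ on the ratio range, which assumes q > 0.

```python
import math

def tsallis_divergence(p, r, q):
    """D_q(p||r) of Eq. (1); assumes q != 1 and every r_i > 0."""
    return sum(ri * ((pi / ri) ** q - pi / ri) / (q - 1.0) for pi, ri in zip(p, r))

def kl_divergence(p, r):
    """D_KL(p||r), the f-divergence generated by f(t) = t log t; assumes p_i > 0."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))

def ratio_range(p, r):
    """Minimum u and maximum U of the bin-wise ratio p_i / r_i."""
    ratios = [pi / ri for pi, ri in zip(p, r)]
    return min(ratios), max(ratios)

def kl_sandwich(p, r, q):
    """Return (m * D_KL, D_q, M * D_KL) with m, M from Eqs. (3)-(4)."""
    u, U = ratio_range(p, r)
    # t f''(t) = q t^(q-1) is monotone in t, so its extremes sit at u and U.
    ends = (q * u ** (q - 1.0), q * U ** (q - 1.0))
    d_kl = kl_divergence(p, r)
    return min(ends) * d_kl, tsallis_divergence(p, r, q), max(ends) * d_kl

# Hypothetical three-bin histograms (illustrative values only).
p = [0.5, 0.3, 0.2]
r = [0.4, 0.4, 0.2]
for q in (0.5, 2.0):
    lower, d_q, upper = kl_sandwich(p, r, q)
    print(f"q={q}: {lower:.6f} <= {d_q:.6f} <= {upper:.6f}")
```

For these example histograms the ratio range is u = 0.75 and U = 1.25, and both values of q satisfy the two-sided bound.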

Therefore, we obtain the following inequalities,

$$q u^{q-1} D_{KL}(p\|r) \le D_q(p\|r) \le q U^{q-1} D_{KL}(p\|r), \quad (q > 1), \tag{5}$$

$$q U^{q-1} D_{KL}(p\|r) \le D_q(p\|r) \le q u^{q-1} D_{KL}(p\|r), \quad (q < 1). \tag{6}$$

4. UPPER BOUNDS IN TERMS OF VARIATIONAL DISTANCE

The variational distance is also one of the f-divergences, since we can choose $f(t) = |t - 1|$ with $t \in \mathbb{R}_+$. Dragomir showed in [5] that the following inequality holds on [u, U] if f is absolutely continuous and if $f' \in L_\infty[u, U]$,

$$0 \le D_f(p\|r) \le \|f'\|_\infty V(p, r), \tag{7}$$

where $\|f'\|_\infty := \operatorname{ess\,sup}_{t \in (u,U)} |f'(t)|$ and $V(p, r) = \sum_i |p_i - r_i|$ is the variational distance ($l_1$-norm). In the range $0 < u \le p_i/r_i \le U < \infty$ for each i, this quantity for the divergence of Tsallis is found to be

$$\|f'\|_\infty = \begin{cases} \dfrac{q U^{q-1} - 1}{q-1} & (0 < q < u^{1-q} < 1 \ \text{or}\ 2 < q(u^{q-1} + U^{q-1}) \ \text{or}\ 1 < u^{1-q} < q) \\[2mm] \dfrac{q u^{q-1} - 1}{1-q} & (U^{1-q} < q < 1 \ \text{or}\ 2 > q(u^{q-1} + U^{q-1}) \ \text{or}\ 1 < q < U^{1-q}) \\[2mm] \dfrac{q u^{q-1} - 1}{q-1} & (q < 0). \end{cases} \tag{8}$$

Therefore, we obtain the corresponding inequalities by substituting these into Eq. (7).

5. LOWER BOUNDS BY PINSKER TYPE INEQUALITY

The Pinsker inequality provides a lower bound on the Kullback-Leibler divergence in terms of the variational distance $V(p, r) = \sum_i |p_i - r_i|$ as $D_{KL}(p\|r) \ge V^2/2$ [16]. However, for other divergences, the corresponding inequality with higher order in the variational distance was not known until recently. We present it by using the recent progress on the fourth-order extended Pinsker inequality proved in [9], which gives a lower bound for the f-divergence measures. Under a certain condition (for details see Theorem 7 in [9]) the following bound holds,

$$D_f(p\|r) \ge \frac{f''(1)}{2} V^2 + \frac{1}{72} \left[ 3 f^{(4)}(1) - \frac{4 \left( f^{(3)}(1) \right)^2}{f''(1)} \right] V^4, \tag{9}$$

where the coefficients must be positive and are best possible in the sense that there exist no larger constants. Applying this bound to the divergence of Tsallis, we obtain

$$D_q(p\|r) \ge \frac{1}{2} q V^2 - \frac{1}{72} q (q-2)(q+1) V^4, \quad (q > 0). \tag{10}$$

When $q \to 1$, we recover the 4th order Pinsker's inequality for the Kullback-Leibler divergence, i.e. $D_{KL}(p\|r) \ge V^2/2 + V^4/36$, whose proof was provided in [10]. Note that the above bound is valid when q > 0, since the coefficient of $V^2$, which is governed by $f''(1) = q$, must be positive. The refinement of the Pinsker inequality with best possible coefficients up to eighth order with respect to V has been obtained [20, 8], but the connection with physics remains unexplained so far, while the positivity $D_q(p\|r) \ge 0$ (information inequality or Gibbs inequality) has a clear physical interpretation in terms of the second law of thermodynamics.

6. BOUNDS IN TERMS OF HELLINGER'S DISTANCE

Hellinger's distance also belongs to the f-divergence class and is obtained if we set $f(t) = (\sqrt{t} - 1)^2/2$ with $t \in \mathbb{R}_+$. It is shown in [6] that for $f \in C^2$ on (u, U) with the range $0 < u \le 1 \le U < \infty$, if there exist m and M such that $m \le t^{3/2} f''(t) \le M$, then the following inequality holds,

$$4 m\, h^2(p\|r) \le D_f(p\|r) \le 4 M\, h^2(p\|r). \tag{11}$$

We apply these bounds to our present consideration. For the divergence of Tsallis, we have $t^{3/2} f''(t) = q t^{q - 1/2}$. Therefore, M and m are determined as

$$\sup_{t \in (u,U)} t^{3/2} f''(t) = \begin{cases} q U^{q - 1/2} & (q > \tfrac{1}{2}) \\ q u^{q - 1/2} & (q < \tfrac{1}{2}), \end{cases} \tag{12}$$

$$\inf_{t \in (u,U)} t^{3/2} f''(t) = \begin{cases} q u^{q - 1/2} & (q > \tfrac{1}{2}) \\ q U^{q - 1/2} & (q < \tfrac{1}{2}). \end{cases} \tag{13}$$

We then have the inequalities,

$$4 q u^{q - 1/2} h^2(p\|r) \le D_q(p\|r) \le 4 q U^{q - 1/2} h^2(p\|r), \quad (q > \tfrac{1}{2}), \tag{14}$$

$$4 q U^{q - 1/2} h^2(p\|r) \le D_q(p\|r) \le 4 q u^{q - 1/2} h^2(p\|r), \quad (q < \tfrac{1}{2}). \tag{15}$$

7. BOUNDS IN TERMS OF $\chi^2$-DIVERGENCE

It is useful to give bounds on $D_q(p\|r)$ in terms of the $\chi^2$-divergence because it provides a bound for the mixing time of Markov chains [3]. When we set $f(t) = (t - 1)^2$ on $[0, \infty)$, the $\chi^2$-divergence $D_{\chi^2}(p\|r) = \sum_i (p_i - r_i)^2 / r_i$ is also found to be an f-divergence. The Kullback-Leibler divergence is asymmetric under the exchange of the two probability distributions p and r. The difference between the two orderings quantifies to what extent the symmetry breaks. It was shown that the absolute value of the difference of that measure from p to r and from r to p is bounded from above in terms of the $\chi^2$-divergence [7],

$$\left| D_{KL}(p\|r) - D_{KL}(r\|p) \right| \le \frac{U - u}{4 u U} D_{\chi^2}(p\|r), \tag{16}$$

where p and r satisfy the range $0 < u \le p_i/r_i \le U < \infty$ for each i. The derivation of Eq. (16) comes from a trapezoid inequality, which holds for any f-divergence (with f(1) = 0) and was obtained in [7],

$$\left| D_f(p\|r) - \frac{1}{2} D_{f^\#}(p\|r) \right| \le \frac{1}{8} (\Gamma - \gamma) D_{\chi^2}(p\|r), \tag{17}$$

where the function $f^\#$ is defined for $t \in (0, \infty)$ as $f^\#(t) = (t - 1) f'(t)$ and $f''(t)$ is assumed to be bounded by $\gamma = \inf_{t \in [u,U)} f''(t)$ and $\Gamma = \sup_{t \in [u,U)} f''(t)$. This inequality provides bounds for the difference of various f-divergences and enables us to evaluate and compare them in terms of the characteristics (the infimum and the supremum) of the second derivative of the function f. We now apply it to the Tsallis divergence $D_q(p\|r)$. We have

$$f_q^\#(t) = \frac{1}{q-1} (t - 1)(q t^{q-1} - 1). \tag{18}$$

With this $f_q^\#(t_i)$, where $t_i = p_i/r_i$, we obtain for the second term of Eq. (17),

$$\begin{aligned} D_{f_q^\#}(p\|r) &= \sum_i r_i (t_i - 1) \frac{q\, t_i^{q-1} - 1}{q-1} \\ &= q \left\{ \sum_i p_i \frac{t_i^{q-1} - 1}{q-1} - \sum_i r_i \frac{t_i^{q-1} - 1}{q-1} \right\} \\ &= q \left\{ D_q(p\|r) + D_{2-q}(r\|p) \right\}, \end{aligned} \tag{19}$$

where the second equality uses $q t^{q-1} - 1 = q(t^{q-1} - 1) + (q - 1)$ together with $\sum_i (p_i - r_i) = 0$. We note that when $q \to 1$, we have the Jeffreys divergence $D_{f_q^\#}(p\|r) \to D_{KL}(p\|r) + D_{KL}(r\|p) = \sum_i (p_i - r_i) \log(p_i/r_i)$, which is also one of the f-divergences. Then, we have for the l.h.s. of the inequality Eq. (17),

$$\left| D_q(p\|r) - \frac{1}{2} D_{f_q^\#}(p\|r) \right| = \frac{1}{2} \left| (2 - q) D_q(p\|r) - q D_{2-q}(r\|p) \right|. \tag{20}$$

Since $f''(t) = q t^{q-2}$ and $0 < u \le t \le U < \infty$, we have

$$\begin{cases} q u^{q-2} \le f''(t) \le q U^{q-2} & (q > 2) \\ q U^{q-2} \le f''(t) \le q u^{q-2} & (0 < q < 2). \end{cases} \tag{21}$$

Noting the relation

$$D_q(r\|p) = \frac{q}{1-q} D_{1-q}(p\|r), \tag{22}$$

and using Γ and γ obtained from Eq. (21) and applying them to the inequality Eq. (17), we finally have the inequalities

$$\left| (2 - q) \left\{ D_q(p\|r) - \frac{q}{q-1} D_{q-1}(p\|r) \right\} \right| \le \begin{cases} \dfrac{q}{4} (U^{q-2} - u^{q-2}) D_{\chi^2}(p\|r) & (q > 2) \\[2mm] \dfrac{q}{4} (u^{q-2} - U^{q-2}) D_{\chi^2}(p\|r) & (0 < q < 2). \end{cases} \tag{23}$$

When one needs bounds for the l.h.s. of Eq. (23) in terms of other f-divergences, one needs to know the corresponding bounds on the $\chi^2$-divergence. Taneja and Kumar [19] provided upper bounds on the $\chi^2$-divergence in terms of the Kullback-Leibler divergence and of Hellinger's distance $h^2(p\|r) := \sum_i (\sqrt{p_i} - \sqrt{r_i})^2/2$ as $D_{\chi^2}(p\|r) \le 2 U^2 D_{KL}(r\|p)$ and as $D_{\chi^2}(p\|r) \le 8 \sqrt{U^3}\, h^2(p\|r)$, respectively. With these inequalities, the inequality Eq. (23) can be bounded by using the Kullback-Leibler divergence and $h^2(p\|r)$. In any case, the upper bound in terms of the $\chi^2$-divergence is tighter than those in terms of the others.

8. AN UPPER BOUND ON AN OVERLAP BETWEEN DIVERGENCES

We are concerned here with how much the values of divergences differ from each other when we measure them with the identical reference distribution r. Namely, we want to know an upper bound on an overlap between the quantities $D_q(p\|r)$ and $D_q(p'\|r)$ for a given q. To this end we define the following quantity,

$$\int_0^U \int_0^U f(t)\, g(u)\, |t - u|^{-\lambda}\, dt\, du, \tag{24}$$

where the two functions are respectively defined as $f(t) = (t^q - t)/(q-1)$ and $g(u) = (u^q - u)/(q-1)$ with $t = p/r$ and $u = p'/r$. The case λ = 0 corresponds to the usual overlap of the functions f and g (or the inner product of two real functions). Otherwise it can give a normalized overlap tempered by λ. We are concerned with the bound on it. Recalling that for α, β > 1 and 0 < λ < n with the relation 1/α + λ/n + 1/β = 2, the Hardy-Littlewood-Sobolev inequality [12] reads

$$\left| \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(x)\, |x - y|^{-\lambda}\, g(y)\, dx\, dy \right| \le C(n, \lambda, p)\, \|f\|_\alpha \|g\|_\beta, \tag{25}$$

where the sharp constant $C(n, \lambda, p)$, independent of the functions f and g, is given for α = β = 2n/(2n − λ) as

$$\pi^{\lambda/2}\, \frac{\Gamma(n/2 - \lambda/2)}{\Gamma(n - \lambda/2)} \left\{ \frac{\Gamma(n/2)}{\Gamma(n)} \right\}^{\lambda/n - 1}. \tag{26}$$

Our problem is the special case obtained by setting n = 1 and f = g, with the form $(t^q - t)/(q-1)$. The desired upper bound is the product of the value of Eq. (26) and that of $\|f\|^2_{2/(2-\lambda)}$ given in the Appendix.

9. SUMMARY AND CONCLUDING REMARKS

We have presented the fundamental bounds on the generalized KL divergence used in nonextensive statistical mechanics in terms of several known divergences. The bounds for the parameterized divergence are indispensable, since without them most (if not all) of the nonextensive structures in a system would not be fully understood.
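As an illustration of the bounds summarized above, the following minimal sketch evaluates the Pinsker-type lower bound of Eq. (10) and the Hellinger bounds of Eqs. (14)-(15) for a pair of small histograms. The function names and example values are hypothetical, chosen only for illustration. The sketch does not verify the condition of Theorem 7 in [9] under which Eq. (10) holds, so it shows how the right-hand sides are computed rather than giving a general proof.

```python
import math

def tsallis_divergence(p, r, q):
    """Tsallis relative entropy of Eq. (1); assumes q != 1 and every r_i > 0."""
    return sum(ri * ((pi / ri) ** q - pi / ri) / (q - 1.0) for pi, ri in zip(p, r))

def variational_distance(p, r):
    """V(p, r) = sum_i |p_i - r_i| (l1-norm)."""
    return sum(abs(pi - ri) for pi, ri in zip(p, r))

def hellinger_sq(p, r):
    """h^2(p||r) = sum_i (sqrt(p_i) - sqrt(r_i))^2 / 2."""
    return sum((math.sqrt(pi) - math.sqrt(ri)) ** 2 for pi, ri in zip(p, r)) / 2.0

def pinsker4_lower(p, r, q):
    """Right-hand side of Eq. (10): q V^2 / 2 - q (q - 2)(q + 1) V^4 / 72."""
    v = variational_distance(p, r)
    return 0.5 * q * v ** 2 - q * (q - 2.0) * (q + 1.0) * v ** 4 / 72.0

def hellinger_sandwich(p, r, q):
    """Lower and upper bounds of Eqs. (14)-(15); assumes q > 0."""
    ratios = [pi / ri for pi, ri in zip(p, r)]
    u, U = min(ratios), max(ratios)
    # t^(3/2) f''(t) = q t^(q - 1/2) is monotone in t, so its extremes sit at u and U.
    ends = (q * u ** (q - 0.5), q * U ** (q - 0.5))
    h2 = hellinger_sq(p, r)
    return 4.0 * min(ends) * h2, 4.0 * max(ends) * h2

# Hypothetical three-bin histograms (illustrative values only).
p = [0.5, 0.3, 0.2]
r = [0.4, 0.4, 0.2]
for q in (0.25, 2.0):
    d_q = tsallis_divergence(p, r, q)
    lower, upper = hellinger_sandwich(p, r, q)
    print(f"q={q}: Pinsker-type {pinsker4_lower(p, r, q):.6f} <= D_q = {d_q:.6f}; "
          f"Hellinger {lower:.6f} <= D_q <= {upper:.6f}")
```

Taking the smaller and larger endpoint values, instead of branching on q > 1/2 or q < 1/2, reproduces both Eq. (14) and Eq. (15) with a single code path.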
Our starting assumption was the existence of maximum and minimum values for the ratio of probability distributions, which originates from the finite dynamic range of the measuring apparatus for events associated with the physical system. The bounds depend on the parameter range that bears nonextensivity. A natural question then arises concerning the parameter range, i.e., which value of q we should use.

Regarding this question, we resort to the criterion given as an inequality which holds for generic f-divergences [13],

$$0 \le D_f(p\|r) \le \lim_{u \to +0} \left\{ f(u) + u f(u^{-1}) \right\}, \qquad u = \frac{p}{r}. \tag{27}$$

Since we have $f(u) = (u^q - u)/(q-1)$ for the Tsallis divergence, the upper bound after taking the limit in the above becomes

$$\begin{cases} \dfrac{1}{1-q} & (0 < q < 1) \\[2mm] +\infty & (q > 1). \end{cases} \tag{28}$$

When q < 0 it does not give a bound. Therefore, the usable range is found to be limited to 0 < q < 1 in order to guarantee the feasibility of measuring the distance between two distribution functions.

10. APPENDIX

When 0 < q < 1 the norm $\|f\|^2_{2/(2-\lambda)}$ is calculated to be

$$\frac{1}{(1-q)^2} \left( \frac{(\lambda - 2)\, U^{1-q}\, (U - U^q)^{\frac{4-\lambda}{2-\lambda}}\ {}_2F_1\!\left[ 1, \dfrac{(2-q)(2-\lambda) + 2}{(1-q)(2-\lambda)};\ \dfrac{(2-q)(2-\lambda) + 2q}{(1-q)(2-\lambda)};\ U^{1-q} \right]}{2q + 2 - \lambda} \right)^{2-\lambda}, \tag{29}$$

where ${}_2F_1(a, b; c; d)$ is the hypergeometric function.

Acknowledgements. The author thanks the organizers of ICREM5 at ITB Bandung held on 22-24 Oct. 2011.

REFERENCES

[1] Borland, L., Plastino, A.R., and Tsallis, C., Information gain within generalized thermostatistics, J. Math. Phys., 39 (1998), 6490-6501; Erratum 40 (1999), 2196.
[2] Csiszár, I., Information-type measures of difference of probability distributions and indirect observations, Studia Math. Hungarica, 2 (1967), 299-318; ibid 2 (1967), 329-339.
[3] Diaconis, P., and Stroock, D., Geometric bounds for eigenvalues of Markov chains, Ann. Appl. Prob., 1 (1991), 36-61; Fill, J.A., Eigenvalue Bounds on Convergence to Stationarity for Nonreversible Markov Chains, with an Application to the Exclusion Process, ibid 1 (1991), 62-87.
[4] Dragomir, S.S., RGMIA Monographs: Inequalities for Csiszár f-divergences in Information Theory, (1999) Ch. II 1.
[5] Dragomir, S.S., RGMIA Monographs: Inequalities for Csiszár f-divergences in Information Theory, (1999) Ch. II 3.
[6] Dragomir, S.S., RGMIA Monographs: Inequalities for Csiszár f-divergences in Information Theory, (1999) Ch. II 2, Theorem 6.
[7] Dragomir, S.S., Gluščević, V., and Pearce, C.E.M., New approximation for f-divergence via trapezoid and midpoint inequalities, RGMIA Res. Rep. Coll., 14 (2002), 1-8.
[8] Fedotov, A., Harremoës, P., and Topsøe, F., Refinements of Pinsker Inequality, IEEE Trans. Inform. Theory, 49 (2003), 1491-1498.
[9] Gilardoni, G.L., On Pinsker's type inequalities and Csiszár's f-divergence. Part I: Second and Fourth-order inequalities (Preprint arXiv:cs/0603097v2).

[10] Kraft, O., and Schmitz, N., A note on Hoeffding inequality, J. Amer. Statist. Assoc., 64 (1969), 907-912.
[11] Kullback, S., and Leibler, R.A., On Information and Sufficiency, Ann. Math. Stat., 22 (1951), 79-86; Kullback, S., Information Theory and Statistics (Wiley, New York, 1959).
[12] Lieb, E.H., and Loss, M., Analysis, Graduate Studies in Mathematics Vol. 14, Second Edition, AMS, 2001.
[13] Liese, F., and Vajda, I., Convex statistical distances, vol. 95 of Teubner-Texte zur Mathematik, BSB BG Teubner Verlagsgesellschaft, Leipzig, 1987.
[14] Lindhard, J., On the theory of measurement and its consequences in statistical dynamics, Det Kongelige Danske Videnskabernes Selskab Matematisk-fysiske Meddelelser, 39 (1974), 1-39.
[15] Morimoto, T., Markov processes and the H-Theorem, J. Phys. Soc. Jpn., 18 (1963), 328-331.
[16] Pinsker, M.S., Information and Information Stability of Random Variables and Processes, ed. A. Feinstein, Holden-Day, San Francisco, 1964.
[17] Sharma, B.D., and Autar, R., Relative-Information Functions and Their Type (α, β) Generalizations, Metrika, 21 (1974), 41-50.
[18] Shiino, M., H-Theorem with Generalized Relative Entropies and the Tsallis Statistics, J. Phys. Soc. Jpn., 67 (1998), 3658-3660.
[19] Taneja, I.J., and Kumar, P., Relative Information of Type s, Csiszar f-divergence, and Information Inequalities, Information Sciences 166 (2004), 105-125.
[20] Topsøe, F., Bounds for entropy and divergence of distributions over a two-element set, J. Ineq. Pure Appl. Math., 2 (2001), Article 25.
[21] Tsallis, C., Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys., 52 (1988), 479-487.
[22] Tsallis, C., Generalized entropy-based criterion for consistent testing, Phys. Rev. E, 58 (1998), 1442-1445.
[23] Tsallis, C., Introduction to Nonextensive Statistical Mechanics: Approaching a Complex World, Springer, New York, 2009.
[24] Yamano, T., H-theorems based upon a generalized, power law-like divergence, Phys. Lett. A., 374 (2010), 3116-3118.
[25] For mathematical rigor, we mention the following for the continuous case. When the ratio of distribution functions p and r appears, we suppose that P is absolutely continuous with respect to R ($P \ll R$), where P and R are probability measures with a σ-finite measure µ, and p = dP/dµ and r = dR/dµ are the respective Radon-Nikodym derivatives. If $P \ll R$ is not satisfied, the divergence is $+\infty$.