ON A CLASS OF PERIMETER-TYPE DISTANCES OF PROBABILITY DISTRIBUTIONS

KYBERNETIKA, VOLUME 32 (1996), NUMBER 4, PAGES 389-393

FERDINAND ÖSTERREICHER

The class $I_{f_p}$, $p \in (1,\infty]$, of $f$-divergences investigated in this paper generalizes an $f$-divergence introduced by the author in [9] and applied there and by Reschenhofer and Bomze [11] in different areas of hypothesis testing. The main result of the present paper ensures that, for every $p \in (1,\infty)$, the square root of the corresponding divergence defines a distance on the set of probability distributions. Thus it generalizes the corresponding statement for $p = 2$ made in connection with Example 4 by Kafka, Österreicher and Vincze in [6].

From the earlier literature on the subject, the maximal powers of $f$-divergences defining a distance are known for the following classes. For the class of Hellinger divergences given in terms of

$$f^{(s)}(u) = 1 + u - (u^s + u^{1-s}), \quad s \in (0,1),$$

Csiszár and Fischer [3] have already shown that the maximal power is $\min(s, 1-s)$. For the following two classes the maximal power coincides with their parameter. The class given in terms of

$$f_{(\alpha)}(u) = |1 - u^{\alpha}|^{1/\alpha}, \quad \alpha \in (0,1],$$

was investigated by Boekee [2]. The previous class and this one have the special case $s = \alpha = \frac{1}{2}$ in common. This famous case is attributed to Matusita [8]. The class given by

$$\varphi_\alpha(u) = |1 - u|^{1/\alpha} (1 + u)^{1 - 1/\alpha}, \quad \alpha \in (0,1],$$

and investigated in [6], Example 3, contains the well-known special case $\alpha = \frac{1}{2}$ introduced by Vincze [13].

1. INTRODUCTION

Let $(\Omega, \mathcal{A})$ be a nondegenerate measurable space (i.e. $|\mathcal{A}| > 2$ and hence $|\Omega| > 1$) and let $\mathcal{M}_1(\Omega, \mathcal{A})$ be the set of probability distributions on $(\Omega, \mathcal{A})$. Furthermore let $\mathcal{F}$ be the set of convex functions $f : \mathbb{R}_+ \to \mathbb{R}$ which are continuous at 0. And let the function $f^* \in \mathcal{F}$ be defined by

$$f^*(u) = u \cdot f\left(\frac{1}{u}\right) \quad \text{for } u \in (0,\infty).$$

Remark 1. Owing to the continuity of $f$ and $f^*$ at 0 and by setting $0 \cdot f\left(\frac{0}{0}\right) = 0$ for all $f \in \mathcal{F}$ it holds

$$x \cdot f^*\left(\frac{y}{x}\right) = y \cdot f\left(\frac{x}{y}\right) \quad \text{for all } x, y \in \mathbb{R}_+.$$
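The conjugate $f^*$ and the identity of Remark 1 can be checked numerically. The following is a minimal Python sketch, not part of the paper: the helper names `conjugate` and `hellinger` are illustrative, and $g(u) = (u-1)^2$ is an arbitrary convex function, chosen here only because it is not equal to its own conjugate.

```python
import math

def conjugate(f):
    """Return the function f*(u) = u * f(1/u), defined for u > 0."""
    return lambda u: u * f(1.0 / u)

def hellinger(s):
    """Hellinger-class function f^(s)(u) = 1 + u - (u^s + u^(1-s)), s in (0, 1)."""
    return lambda u: 1.0 + u - (u ** s + u ** (1.0 - s))

f = hellinger(0.3)
f_star = conjugate(f)

# The Hellinger class is self-conjugate: f*(u) = f(u) for all u > 0.
for u in (0.1, 0.5, 2.0, 7.5):
    assert math.isclose(f_star(u), f(u), abs_tol=1e-12)

# Identity of Remark 1, x * f*(y/x) = y * f(x/y), for strictly positive x, y,
# tested with a convex function g that is NOT self-conjugate (assumed example).
g = lambda u: (u - 1.0) ** 2
g_star = conjugate(g)
for x, y in ((0.2, 0.7), (3.0, 1.5)):
    assert math.isclose(x * g_star(y / x), y * g(x / y), rel_tol=1e-12)
```

The sketch covers only strictly positive arguments; the boundary conventions at 0 stated in Remark 1 are not modelled.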

Definition (cf. Csiszár [4] and Ali and Silvey [1]). Let $Q_0, Q_1 \in \mathcal{M}_1(\Omega, \mathcal{A})$. Then

$$I_f(Q_1, Q_0) = \int f\left(\frac{q_1}{q_0}\right) \cdot q_0 \, d\mu$$

is called the $f$-divergence of $Q_0$ and $Q_1$. (As usual, $q_1$ and $q_0$ denote the Radon-Nikodym derivatives of $Q_1$ and $Q_0$ with respect to a dominating $\sigma$-finite measure $\mu$.)

In the sequel we briefly restate those results from [6] which are basic for the statement and proof of the main result of this paper. For further information on $f$-divergences we refer to the monograph [7] by Liese and Vajda and the paper [12] of Vajda and Österreicher.

Provided

(f1) $f(1) = 0$ and $f$ is strictly convex at 1, and

(f2) $f^*(u) = f(u)$,

it holds

(M1) $I_f(Q_1, Q_0) \ge 0$ with equality iff $Q_0 = Q_1$, for all $Q_0, Q_1 \in \mathcal{M}_1(\Omega, \mathcal{A})$,

(M2) $I_f(Q_1, Q_0) = I_f(Q_0, Q_1)$ for all $Q_0, Q_1 \in \mathcal{M}_1(\Omega, \mathcal{A})$,

respectively. If, in addition to (f1) and (f2), there exists an $\alpha \in (0,1]$ such that

(f3,$\alpha$) the function $h(u) = \dfrac{(1 - u^{\alpha})^{1/\alpha}}{f(u)}$, $u \in [0,1)$, is (not necessarily strictly) decreasing,

then, according to [6], Theorems 1 and 2, the power

$$\rho_\alpha(Q_0, Q_1) = [I_f(Q_1, Q_0)]^{\alpha}$$

of the $f$-divergence satisfies the triangle inequality

(M3) $\rho_\alpha(Q_0, Q_1) \le \rho_\alpha(Q_0, Q_2) + \rho_\alpha(Q_2, Q_1)$ for all $Q_0, Q_1, Q_2 \in \mathcal{M}_1(\Omega, \mathcal{A})$.

Remark 2. Note that by virtue of Jensen's inequality

$$\frac{f(u) + f^*(u)}{1 + u} = \frac{1}{1+u} \, f(u) + \frac{u}{1+u} \, f\left(\frac{1}{u}\right) \ge f(1).$$

Therefore (f1) and (f2) imply $f(u) > 0$ for all $u \in \mathbb{R}_+ \setminus \{1\}$ and hence $f(0) \in (0,\infty)$. Moreover, it can easily be seen that, provided (f3,$\beta$) is satisfied for $\beta = \alpha \in (0,1]$, it is also satisfied for every $\beta \in (0,\alpha]$.

The following Remark is a consequence of [6], Propositions 5 and 6.
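Before turning to that remark, the definition and the properties (M1) to (M3) can be illustrated on a finite sample space, where the integral reduces to a sum. This is a minimal sketch under stated assumptions, not the author's construction: the distributions are random and strictly positive, the function used is the Hellinger case $f(u) = (1 - \sqrt{u})^2$ with $\alpha = \frac{1}{2}$, and the names `f_divergence`, `hellinger_half`, `random_distribution` and `rho` are hypothetical.

```python
import itertools
import math
import random

def f_divergence(f, q1, q0):
    """Discrete I_f(Q1, Q0) = sum_i f(q1_i / q0_i) * q0_i, assuming all q0_i > 0."""
    return sum(f(a / b) * b for a, b in zip(q1, q0))

def hellinger_half(u):
    """Hellinger case f(u) = (1 - sqrt(u))^2, which satisfies (f1) and (f2)."""
    return (1.0 - math.sqrt(u)) ** 2

def random_distribution(rng, n):
    """Strictly positive probability vector of length n (illustrative helper)."""
    w = [rng.uniform(0.05, 1.0) for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

def rho(qa, qb, alpha=0.5):
    """Power rho_alpha(Qa, Qb) = I_f(Qb, Qa)^alpha of the divergence."""
    return f_divergence(hellinger_half, qb, qa) ** alpha

rng = random.Random(0)
dists = [random_distribution(rng, 4) for _ in range(6)]

for qa, qb in itertools.permutations(dists, 2):
    d_ab = f_divergence(hellinger_half, qa, qb)
    d_ba = f_divergence(hellinger_half, qb, qa)
    assert d_ab > 0.0                               # (M1) for distinct distributions
    assert math.isclose(d_ab, d_ba, rel_tol=1e-9)   # (M2) symmetry

for qa, qb, qc in itertools.permutations(dists, 3):
    assert rho(qa, qb) <= rho(qa, qc) + rho(qc, qb) + 1e-12   # (M3)
```

A random check of this kind can only fail to find a counterexample; it does not replace the proof in [6].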

Remark 3. Let (f1) and (f2) hold true and let $\alpha_0 \in (0,1]$ be the maximal $\alpha$ for which (f3,$\alpha$) is satisfied. Then the following statement concerning $\alpha_0$ can be made. Let $k_0, k_1, c_0, c_1 \in (0,\infty)$ be such that

$$f(0) \cdot (1 + u) - f(u) \sim c_0 \cdot u^{k_0} \quad \text{for } u \downarrow 0$$

and

$$f(u) \sim c_1 \cdot |u - 1|^{k_1} \quad \text{for } u \uparrow 1,$$

then $k_0 \le 1$, $k_1 \ge 1$ and $\alpha_0 \le \min\left(k_0, \frac{1}{k_1}\right) \le 1$.

2. THE MAIN RESULT

First we are going to show that $f$-divergences can be defined in terms of the following class of functions

$$f_p(u) = \begin{cases} (1 + u^p)^{1/p} - 2^{1/p - 1} \cdot (1 + u) & \text{for } p \in (1,\infty), \\[4pt] \dfrac{|1 - u|}{2} & \text{for } p = \infty, \end{cases} \qquad u \in \mathbb{R}_+,$$

which satisfies $\lim_{p \to \infty} f_p(u) = f_\infty(u)$.

Lemma 1. $f_p \in \mathcal{F}$ and satisfies (f1) and (f2) for all $p \in (1,\infty]$.

Proof. Since this assertion is obvious for the case $p = \infty$, let us assume $p \in (1,\infty)$ from now on. For this case

$$\lim_{u \downarrow 0} f_p(u) = f_p(0) = 1 - 2^{1/p - 1} \in (0,\infty), \qquad f_p(1) = 0,$$

(1) $f_p'(u) = (1 + u^p)^{1/p - 1} \cdot u^{p-1} - 2^{1/p - 1}$, and hence $f_p'(1) = 0$, and

(2) $f_p''(u) = (p - 1) \cdot (1 + u^p)^{1/p - 2} \cdot u^{p-2} > 0$ for all $u \in (0,\infty)$, and hence $f_p''(1) = (p - 1) \cdot 2^{1/p - 2}$.

Therefore $f_p$ is an element of $\mathcal{F}$ satisfying (f1). The validity of $f_p^*(u) = f_p(u)$ is obvious. $\square$

Remark 3 provides an upper bound for the subset of those $\alpha \in (0,1]$ for which (f3,$\alpha$) may hold.

Remark 4. Owing to

$$f_p(0) \cdot (1 + u) - f_p(u) = 1 + u - (1 + u^p)^{1/p} \sim u \quad \text{for } u \downarrow 0,$$

$$f_p(u) \sim (p - 1) \cdot 2^{1/p - 3} \cdot (u - 1)^2 \quad \text{for } u \uparrow 1$$

(the latter being a consequence of (1) and (2)), the maximal $\alpha \in (0,1]$ satisfying (f3,$\alpha$), if there is any, must be $\alpha_0 \le \frac{1}{2}$.
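The statements of Lemma 1 and Remark 4 lend themselves to a quick numerical sanity check. The sketch below is an illustration only: the function names `f_p`, `f_p_prime` and `f_p_second`, the sample values of $p$ and $u$, and the finite-difference step and tolerances are assumptions of this example, not taken from the paper.

```python
import math

def f_p(u, p):
    """f_p(u) = (1 + u^p)^(1/p) - 2^(1/p - 1) * (1 + u); |1 - u| / 2 for p = inf."""
    if math.isinf(p):
        return abs(1.0 - u) / 2.0
    return (1.0 + u ** p) ** (1.0 / p) - 2.0 ** (1.0 / p - 1.0) * (1.0 + u)

def f_p_prime(u, p):
    """First derivative according to formula (1)."""
    return (1.0 + u ** p) ** (1.0 / p - 1.0) * u ** (p - 1.0) - 2.0 ** (1.0 / p - 1.0)

def f_p_second(u, p):
    """Second derivative according to formula (2)."""
    return (p - 1.0) * (1.0 + u ** p) ** (1.0 / p - 2.0) * u ** (p - 2.0)

h = 1e-6  # step for central differences (assumed value)
for p in (1.5, 2.0, 3.0, 10.0):
    assert abs(f_p(1.0, p)) < 1e-12                      # (f1): f_p(1) = 0
    assert abs(f_p_prime(1.0, p)) < 1e-12                # f_p'(1) = 0
    assert math.isclose(f_p(0.0, p), 1.0 - 2.0 ** (1.0 / p - 1.0))
    for u in (0.2, 0.7, 1.3, 4.0):
        # (f2): u * f_p(1/u) = f_p(u)
        assert math.isclose(u * f_p(1.0 / u, p), f_p(u, p), abs_tol=1e-12)
        # formulas (1) and (2) against central differences
        num1 = (f_p(u + h, p) - f_p(u - h, p)) / (2.0 * h)
        assert math.isclose(num1, f_p_prime(u, p), abs_tol=1e-7)
        num2 = (f_p_prime(u + h, p) - f_p_prime(u - h, p)) / (2.0 * h)
        assert math.isclose(num2, f_p_second(u, p), rel_tol=1e-5, abs_tol=1e-7)
        assert f_p_second(u, p) > 0.0                    # convexity
    # Remark 4: behaviour near 0 (k_0 = 1) and near 1 (k_1 = 2)
    u0 = 1e-6
    assert math.isclose(f_p(0.0, p) * (1.0 + u0) - f_p(u0, p), u0, rel_tol=1e-2)
    u1 = 1.0 - 1e-3
    approx = (p - 1.0) * 2.0 ** (1.0 / p - 3.0) * (u1 - 1.0) ** 2
    assert math.isclose(f_p(u1, p), approx, rel_tol=1e-2)
```

With $k_0 = 1$ and $k_1 = 2$, Remark 3 gives $\min(k_0, 1/k_1) = \frac{1}{2}$, which is the bound stated in Remark 4.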

Interpretation of the $f$-divergences under consideration. Let

$$R(Q_0, Q_1) = \operatorname{co}\{(Q_0(A), Q_1(A^c)) : A \in \mathcal{A}\}$$

be the risk set of the testing problem $(Q_0, Q_1) \in \mathcal{M}_1(\Omega, \mathcal{A})^2$ (whereby "co" means "the convex hull of"). Then the corresponding $f$-divergence

$$I_{f_p}(Q_1, Q_0) = \begin{cases} \displaystyle\int (q_1^p + q_0^p)^{1/p} \, d\mu - 2^{1/p} & \text{for } p \in (1,\infty), \\[6pt] \displaystyle\frac{1}{2} \int |q_1 - q_0| \, d\mu & \text{for } p = \infty \end{cases}$$

can be interpreted as the difference of the arc lengths of the lower boundary of the risk set and the diagonal

$$D = \{(x, y) \in [0,1]^2 : x + y = 1\},$$

both measured in terms of the $\ell_p$ norm in $\mathbb{R}^2$. We call the arc length in question the $\ell_p$ arc length, since it coincides for $p = 2$ with the ordinary arc length. For further reading on the geometric point of view we refer to Feldman and Österreicher [5] and the contribution [10] of the author.

For the limiting case $p = \infty$ the corresponding $f$-divergence $I_{f_\infty}(Q_1, Q_0)$ is half of the well-known variation distance. For $p = 2$ it has been shown in [6] that the square root of the corresponding $f$-divergence $I_{f_2}(Q_1, Q_0)$ is also a distance. In the sequel we are going to show the following generalization of the latter, which may be conjectured from Remark 4.

Theorem. For every $p \in (1,\infty)$ the square root of the $f$-divergence $I_{f_p}(Q_1, Q_0)$ defines a distance on $\mathcal{M}_1(\Omega, \mathcal{A})$.

By virtue of Lemma 1 and [6], Theorems 1 and 2, the proof is reduced to that of the following Lemma.

Lemma 2. Let $p \in (1,\infty)$. Then the function

$$h_p(u) = \frac{(1 - \sqrt{u})^2}{f_p(u)}, \quad u \in [0,1),$$

is (strictly) decreasing.

Proof. Because of

$$h_p'(u) = \frac{1 - \sqrt{u}}{\sqrt{u} \cdot f_p(u)^2} \cdot \varphi_p(u)$$

with

$$\varphi_p(u) = -\left[ f_p(u) + (\sqrt{u} - u) \cdot f_p'(u) \right] = 2^{1/p - 1} \cdot (1 + \sqrt{u}) - (1 + u^p)^{1/p - 1} \cdot (1 + u^{p - 1/2})$$

it suffices to show $\varphi_p(u) < 0$ for all $u \in (0,1)$. Owing to $\varphi_p(1) = 0$ it suffices to show that $\varphi_p$ is increasing on $(0,1)$, i.e. that the functions $\psi_p$ defined by $\psi_p(u) = \sqrt{u} \cdot \varphi_p'(u)$ satisfy

$$\psi_p(u) = 2^{1/p - 2} - (1 + u^p)^{1/p - 2} \cdot u^{p-1} \cdot \left[ (p - 1) \cdot (1 - \sqrt{u}) + \frac{1 + u^p}{2} \right] > 0$$

for all $u \in (0,1)$. Because of $\psi_p(1) = 0$ this, however, follows from

$$\psi_p'(u) = -(p - 1) \cdot \left(p - \tfrac{1}{2}\right) \cdot (1 + u^p)^{1/p - 3} \cdot u^{p-2} \cdot (1 - \sqrt{u}) \cdot (1 - u^p) < 0,$$

which is obvious. $\square$

(Received July 18, 1995.)

REFERENCES

[1] S. M. Ali and S. D. Silvey: A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. Ser. B 28 (1966), 131-142.
[2] D. E. Boekee: A generalization of the Fisher information measure. Delft University Press, Delft 1977.
[3] I. Csiszár and J. Fischer: Informationsentfernungen im Raum der Wahrscheinlichkeitsverteilungen. Magyar Tud. Akad. Mat. Kutató Int. Közl. 7 (1962), 159-180.
[4] I. Csiszár: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 85-107.
[5] D. Feldman and F. Österreicher: Divergenzen von Wahrscheinlichkeitsverteilungen - integralgeometrisch betrachtet. Acta Math. Acad. Sci. Hungar. 37 (1981), 4, 329-337.
[6] P. Kafka, F. Österreicher and I. Vincze: On powers of f-divergences defining a distance. Studia Sci. Math. Hungar. 26 (1991), 415-422.
[7] F. Liese and I. Vajda: Convex Statistical Distances. Teubner-Texte zur Mathematik, Band 95, Leipzig 1987.
[8] K. Matusita: Decision rules based on the distance for problems of fit, two samples and estimation. Ann. Math. Stat. 26 (1955), 631-640.
[9] F. Österreicher: The construction of least favourable distributions is traceable to a minimal perimeter problem. Studia Sci. Math. Hungar. 17 (1982), 341-351.
[10] F. Österreicher: The risk set of a testing problem - A vivid statistical tool. In: Trans. of the Eleventh Prague Conference, Academia, Prague 1992, Vol. A, pp. 175-188.
[11] E. Reschenhofer and I. M. Bomze: Length tests for goodness of fit. Biometrika 78 (1991), 207-216.
[12] I. Vajda and F. Österreicher: Statistical information and discrimination. IEEE Trans. Inform. Theory 39 (1993), 3, 1036-1039.
[13] I. Vincze: On the concept and measure of information contained in an observation. In: Contributions to Probability (J. Gani and V. K. Rohatgi, eds.), Academic Press, New York 1981, pp. 207-214.

Dr. Ferdinand Österreicher, Institut für Mathematik, Universität Salzburg, Hellbrunnerstraße 34, 5020 Salzburg. Austria.