On the Chi square and higher-order Chi distances for approximating f-divergences


Frank Nielsen, Senior Member, IEEE, and Richard Nock, Nonmember

Abstract—We report closed-form formulas for calculating the Chi square and higher-order Chi distances between statistical distributions belonging to the same exponential family with affine natural space, and instantiate those formulas for the Poisson and isotropic Gaussian families. We then describe an analytic formula for the f-divergences based on Taylor expansions and relying on an extended class of Chi-type distances.

Index Terms—statistical divergences, chi square distance, Kullback-Leibler divergence, Taylor series, exponential families.

I. INTRODUCTION

A. Statistical divergences: f-divergences

Measuring the similarity or dissimilarity between two probability measures is met ubiquitously in signal processing. Some usual distances are the Pearson χ²_P and Neyman χ²_N chi square distances and the Kullback-Leibler divergence [1], defined respectively by

χ²_P(X₁ : X₂) = ∫ (x₂(x) − x₁(x))² / x₁(x) dν(x),   (1)

χ²_N(X₁ : X₂) = ∫ (x₁(x) − x₂(x))² / x₂(x) dν(x),   (2)

KL(X₁ : X₂) = ∫ x₁(x) log(x₁(x)/x₂(x)) dν(x),   (3)

Frank Nielsen is with Sony Computer Science Laboratories, Inc., 3-14-13 Higashi Gotanda, 141-0022 Shinagawa-ku, Tokyo, Japan, nielsen@csl.sony.co.jp. Richard Nock is with UAG CEREGMIA, Martinique, France, rnock@martinique.univ-ag.fr.

where X₁ and X₂ are probability measures absolutely continuous with respect to a reference measure ν, and x₁ and x₂ denote their Radon-Nikodym densities, respectively. Those dissimilarity measures M are termed divergences, in contrast with metric distances, since they are oriented distances (i.e., M(X₁ : X₂) ≠ M(X₂ : X₁)) that do not satisfy the triangular inequality. In the 1960's, many of those divergences were unified under the generic framework of f-divergences [2], I_f, defined for an arbitrary functional f:

I_f(X₁ : X₂) = ∫ x₁(x) f(x₂(x)/x₁(x)) dν(x) ≥ 0,   (4)

where f is a convex function f : (0, ∞) ⊆ dom(f) → [0, ∞], often normalized so that f(1) = 0. (Beware that the χ²_N and χ²_P definitions are sometimes inverted in the literature. This may stem from the alternative definition of f-divergences Ĩ_f(X₁ : X₂) = ∫ x₂(x) f(x₁(x)/x₂(x)) dν(x) = I_f(X₂ : X₁).) Those f-divergences can always be symmetrized by taking

S_f(X₁ : X₂) = I_f(X₁ : X₂) + I_{f*}(X₁ : X₂),   (5)

with f*(u) = u f(1/u), so that I_{f*}(X₁ : X₂) = I_f(X₂ : X₁). See Table I for a list of common f-divergences with their corresponding generators f. In information theory, f-divergences are characterized as the unique family of convex divergences satisfying the information monotonicity property [3]: given any transition probability τ that transforms the measures X₁ and X₂ into X₁τ and X₂τ (a Markov morphism), we have I_f(X₁ : X₂) ≥ I_f(X₁τ : X₂τ), with equality if and only if the transition τ is induced by a sufficient statistic.

Note that f-divergences may evaluate to infinity (that is, I_f is unbounded) when the integral diverges, even if x₁, x₂ > 0 on the support X. For example, let X = (0, 1) be the unit interval, and take the two densities (with respect to the Lebesgue measure ν_L) x₁(x) = 1 and x₂(x) = c e^{−1/x}, with c = 1/∫₀¹ e^{−1/x} dx ≈ 1/0.148 the normalizing constant. Consider the Kullback-Leibler divergence (the f-divergence obtained for the generator f(u) = −log u):

KL(X₁ : X₂) = ∫₀¹ x₁(x) log(x₁(x)/x₂(x)) dν_L(x)   (6)

= −log c + ∫₀¹ (1/x) dν_L(x) = +∞.   (7)
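To make the generator/divergence pairing of Eq. 4 concrete, here is a minimal numerical sketch (our illustration, not from the paper, assuming SciPy is available; the helper names gauss and f_divergence are hypothetical). It evaluates I_f by quadrature for two univariate Gaussian densities, using the Pearson and Kullback-Leibler generators of Table I below:

    import math
    from scipy.integrate import quad

    def gauss(mu, sigma):
        """Univariate Gaussian density x -> p(x)."""
        return lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

    def f_divergence(x1, x2, f, lo=-20.0, hi=20.0):
        """I_f(X1 : X2) = integral of x1(x) f(x2(x)/x1(x)) dnu(x), Eq. 4, by quadrature."""
        value, _err = quad(lambda x: x1(x) * f(x2(x) / x1(x)), lo, hi)
        return value

    x1, x2 = gauss(0.0, 1.0), gauss(0.5, 1.0)
    print(f_divergence(x1, x2, lambda u: (u - 1.0) ** 2))  # Pearson chi2_P: exp(0.25) - 1 ~ 0.284
    print(f_divergence(x1, x2, lambda u: -math.log(u)))    # KL: 0.5**2 / 2 = 0.125

For unit-variance Gaussians these quadratures agree with the closed forms derived later in Section II, which is a useful cross-check of the conventions.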

Total variation (metric): ½ ∫ |p(x) − q(x)| dν(x); generator f(u) = ½ |u − 1|
Squared Hellinger: ∫ (√p(x) − √q(x))² dν(x); generator f(u) = (√u − 1)²
Pearson χ²_P: ∫ (q(x) − p(x))² / p(x) dν(x); generator f(u) = (u − 1)²
Neyman χ²_N: ∫ (p(x) − q(x))² / q(x) dν(x); generator f(u) = (1 − u)² / u
Pearson-Vajda χᵏ_{λ,P}: ∫ (q(x) − λp(x))ᵏ / p^{k−1}(x) dν(x); generator f(u) = (u − λ)ᵏ
Pearson-Vajda |χ|ᵏ_{λ,P}: ∫ |q(x) − λp(x)|ᵏ / p^{k−1}(x) dν(x); generator f(u) = |u − λ|ᵏ
Kullback-Leibler: ∫ p(x) log(p(x)/q(x)) dν(x); generator f(u) = −log u
reverse Kullback-Leibler: ∫ q(x) log(q(x)/p(x)) dν(x); generator f(u) = u log u
α-divergence: (4/(1−α²)) (1 − ∫ p^{(1−α)/2}(x) q^{(1+α)/2}(x) dν(x)); generator f(u) = (4/(1−α²)) (1 − u^{(1+α)/2})
Jensen-Shannon: ∫ (p(x) log(2p(x)/(p(x)+q(x))) + q(x) log(2q(x)/(p(x)+q(x)))) dν(x); generator f(u) = (u + 1) log(2/(1+u)) + u log u

TABLE I. SOME COMMON f-DIVERGENCES I_f(P : Q) WITH THEIR CORRESPONDING GENERATORS: EXCEPT THE TOTAL VARIATION, f-DIVERGENCES ARE NOT METRICS [4].

B. Stochastic approximations of f-divergences

To bypass the integral evaluation of I_f in Eq. 4 (often mathematically intractable), we can carry out a stochastic integration:

Î_f(X₁ : X₂) = (1/(2n)) Σ_{i=1}^{n} ( f(x₂(s_i)/x₁(s_i)) + (x₁(t_i)/x₂(t_i)) f(x₂(t_i)/x₁(t_i)) ),   (8)

with s₁, ..., s_n and t₁, ..., t_n sampled IID from X₁ and X₂, respectively. Those approximations, although converging to the true values when n → ∞, are time consuming and yield poor results in practice, especially when the dimension of the observation space X is large. We therefore concentrate on obtaining exact or arbitrarily fine approximation formulas for f-divergences by considering a restricted class of exponential families.
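The following sketch (ours, under the reading of Eq. 8 given above: samples from X₁ contribute the plain plug-in term and samples from X₂ contribute an importance-reweighted term) estimates the KL divergence between two Poisson distributions; poisson_pmf, sample_poisson, and mc_f_divergence are our own hypothetical helpers:

    import math, random

    def poisson_pmf(lam, k):
        return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

    def sample_poisson(lam, rng):
        # Knuth's multiplication method; adequate for small rates.
        L, k, p = math.exp(-lam), 0, 1.0
        while p > L:
            p *= rng.random()
            k += 1
        return k - 1

    def mc_f_divergence(f, lam1, lam2, n, seed=0):
        """Stochastic estimate of I_f(X1 : X2) following Eq. 8."""
        rng = random.Random(seed)
        total = 0.0
        for _ in range(n):
            s, t = sample_poisson(lam1, rng), sample_poisson(lam2, rng)
            us = poisson_pmf(lam2, s) / poisson_pmf(lam1, s)
            ut = poisson_pmf(lam2, t) / poisson_pmf(lam1, t)
            total += f(us) + f(ut) / ut   # 1/ut = x1(t)/x2(t) reweights the X2 sample
        return total / (2 * n)

    print(mc_f_divergence(lambda u: -math.log(u), 0.6, 0.3, 10**5))  # ~0.116 (exact: 0.1159)

Both terms are unbiased estimates of I_f, so their average converges as n → ∞, but slowly — which is precisely the motivation for the closed forms below.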

C. Exponential families

Let ⟨x, y⟩ denote an inner product on X; for vector spaces X, it is the scalar product ⟨x, y⟩ = xᵀy. An exponential family [6] is a set of probability measures E_F = {P_θ}_θ dominated by a measure ν, having their Radon-Nikodym densities p_θ expressed canonically as

p_θ(x) = exp(⟨t(x), θ⟩ − F(θ) + k(x)),   (9)

for θ belonging to the natural parameter space Θ = {θ ∈ ℝ^D : ∫_{x∈X} p_θ(x) dν(x) = 1}. Since log ∫_{x∈X} p_θ(x) dν(x) = log 1 = 0, it follows that F(θ) = log ∫ exp(⟨t(x), θ⟩ + k(x)) dν(x). For full regular families [6], it can be proved that the function F is strictly convex and differentiable over the open convex set Θ. The function F characterizes the family and bears different names in the literature (partition function, log-normalizer, or cumulant function), and the parameter θ (the natural parameter) defines the member P_θ of the family E_F. Let D = dim(Θ) denote the dimension of Θ, the order of the family. The map k(x) : X → ℝ is an auxiliary function defining a carrier measure ξ with dξ(x) = e^{k(x)} dν(x). In practice, we often consider the Lebesgue measure ν_L defined over the Borel σ-algebra E = B(ℝ^d) of ℝ^d for continuous distributions (e.g., Gaussian), or the counting measure ν_c defined on the power set σ-algebra E = 2^X for discrete distributions (e.g., the Poisson or multinomial families). The term t(x) is a measurable mapping called the sufficient statistic [6]. Table II shows the canonical decomposition for the Poisson and isotropic Gaussian families. Notice that the Kullback-Leibler divergence between members X₁ ∼ E_F(θ₁) and X₂ ∼ E_F(θ₂) of the same exponential family amounts to computing a Bregman divergence on the swapped natural parameters [8]: KL(X₁ : X₂) = B_F(θ₂ : θ₁), where B_F(θ : θ′) = F(θ) − F(θ′) − (θ − θ′)ᵀ∇F(θ′), and ∇F denotes the gradient of F.
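As a quick illustration of the identity KL(X₁ : X₂) = B_F(θ₂ : θ₁), here is a sketch of ours for the Poisson family, where θ = log λ and F(θ) = e^θ (the helper names are hypothetical):

    import math

    def bregman(F, grad_F, theta, theta_prime):
        """B_F(theta : theta') = F(theta) - F(theta') - (theta - theta') * grad_F(theta')."""
        return F(theta) - F(theta_prime) - (theta - theta_prime) * grad_F(theta_prime)

    def kl_poisson(lam1, lam2):
        # Poisson family (Table II): F(theta) = exp(theta), so grad F = exp as well,
        # and KL(X1 : X2) = B_F(theta2 : theta1) with swapped natural parameters.
        return bregman(math.exp, math.exp, math.log(lam2), math.log(lam1))

    # Matches the direct formula lam2 - lam1 + lam1 * log(lam1/lam2):
    print(kl_poisson(0.6, 0.3))   # ~0.1159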

II. χ² AND HIGHER-ORDER χᵏ DISTANCES

A. A closed-form formula

When X₁ and X₂ belong to the same restricted exponential family E_F, we obtain the following result:

Lemma 1: The Pearson/Neyman Chi square distance between X₁ ∼ E_F(θ₁) and X₂ ∼ E_F(θ₂) is given by

χ²_P(X₁ : X₂) = e^{F(2θ₂−θ₁) − (2F(θ₂)−F(θ₁))} − 1,   (10)

χ²_N(X₁ : X₂) = e^{F(2θ₁−θ₂) − (2F(θ₁)−F(θ₂))} − 1,   (11)

provided that 2θ₂ − θ₁ and 2θ₁ − θ₂ belong to the natural parameter space Θ. This implies that the chi square distances are then always bounded. The proof relies on the following lemma:

Lemma 2: The integral I_{p,q} = ∫ x₁(x)^p x₂(x)^q dν(x), for X₁ ∼ E_F(θ₁) and X₂ ∼ E_F(θ₂) and p, q ∈ ℝ with p + q = 1, converges and equals

I_{p,q} = e^{F(pθ₁+qθ₂) − (pF(θ₁)+qF(θ₂))},   (12)

provided the natural parameter space Θ is affine.

Proof: Let us calculate the integral I_{p,q}:

I_{p,q} = ∫ exp(p(⟨t(x), θ₁⟩ − F(θ₁) + k(x))) exp(q(⟨t(x), θ₂⟩ − F(θ₂) + k(x))) dν(x)

= ∫ e^{⟨t(x), pθ₁+qθ₂⟩ − (pF(θ₁)+qF(θ₂)) + k(x)} dν(x)   (using p + q = 1)

= e^{F(pθ₁+qθ₂) − (pF(θ₁)+qF(θ₂))} ∫ p_{pθ₁+qθ₂}(x) dν(x).

When pθ₁ + qθ₂ ∈ Θ, we have ∫ p_{pθ₁+qθ₂}(x) dν(x) = 1, hence the result.

To prove Lemma 1, we rewrite

χ²_P(X₁ : X₂) = ∫ (x₂²(x)/x₁(x) − 2x₂(x) + x₁(x)) dν(x)   (13)

= ∫ x₂²(x)/x₁(x) dν(x) − 1,   (14)

and apply Lemma 2 for p = −1 and q = 2 (checking that p + q = 1). The closed-form formula for the Neyman chi square follows from the fact that χ²_N(X₁ : X₂) = χ²_P(X₂ : X₁). Thus when the natural parameter space Θ is affine, the Pearson/Neyman Chi square distances and their symmetrization χ²_P + χ²_N between members of the same exponential family are available in closed form. Examples of such families are the Poisson, binomial, multinomial, and isotropic Gaussian families, to name a few. Let us call those families affine exponential families for short. The canonical decompositions of the usual affine exponential families are reported in Table II. Note that a formula for the α-divergences between members of the same exponential family was reported in [8] for α ∈ [0, 1]: in that case, αθ₁ + (1 − α)θ₂ always belongs to the open convex natural space Θ, whereas here p ranges over all of ℝ.
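A minimal sketch (ours) of Lemma 1 for a generic affine exponential family: the caller supplies the log-normalizer F and the natural parameters (chi2_pearson and chi2_neyman are hypothetical helper names):

    import math

    def chi2_pearson(F, theta1, theta2):
        """chi^2_P(X1 : X2) = exp(F(2*theta2 - theta1) - (2F(theta2) - F(theta1))) - 1, Eq. 10."""
        return math.exp(F(2 * theta2 - theta1) - (2 * F(theta2) - F(theta1))) - 1

    def chi2_neyman(F, theta1, theta2):
        """chi^2_N(X1 : X2) = chi^2_P(X2 : X1), Eq. 11."""
        return chi2_pearson(F, theta2, theta1)

    # Poisson family (Table II): theta = log(lambda), F(theta) = exp(theta).
    print(chi2_pearson(math.exp, math.log(1.0), math.log(2.0)))   # e - 1 ~ 1.718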

B. The Poisson and isotropic Gaussian cases

As reported in Table II, the Poisson and isotropic Gaussian exponential families have affine natural parameter spaces Θ:

Poi(λ): p(x|λ) = λˣ e^{−λ} / x!, for λ > 0 and x ∈ {0, 1, 2, ...}
Nor_I(µ): p(x|µ) = (2π)^{−d/2} e^{−(x−µ)ᵀ(x−µ)/2}, for µ ∈ ℝ^d and x ∈ ℝ^d

Poisson: θ = log λ, Θ = ℝ, F(θ) = e^θ, k(x) = −log x!, t(x) = x, ν = ν_c
Iso. Gaussian: θ = µ, Θ = ℝ^d, F(θ) = ½θᵀθ + (d/2) log 2π, k(x) = −½xᵀx, t(x) = x, ν = ν_L

TABLE II. EXAMPLES OF EXPONENTIAL FAMILIES WITH AFFINE NATURAL SPACE Θ. ν_c DENOTES THE COUNTING MEASURE AND ν_L THE LEBESGUE MEASURE.

The Poisson family. For P₁ ∼ Poi(λ₁) and P₂ ∼ Poi(λ₂), we have:

χ²_P(λ₁ : λ₂) = exp(λ₂²/λ₁ − 2λ₂ + λ₁) − 1.   (15)

To illustrate this formula with a numerical example, consider X₁ ∼ Poi(1) and X₂ ∼ Poi(2). It comes that χ²_P(P₁ : P₂) = e − 1 ≈ 1.718.

The isotropic Normal family. For N₁ ∼ Nor_I(µ₁) and N₂ ∼ Nor_I(µ₂), we have according to Table II χ²_P(µ₁ : µ₂) = e^{(µ₂−µ₁)ᵀ(µ₂−µ₁)} − 1. In that case the χ² distance is symmetric:

χ²_P(µ₁ : µ₂) = e^{(µ₂−µ₁)ᵀ(µ₂−µ₁)} − 1 = χ²_N(µ₁ : µ₂).   (16)
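The Poisson instantiation is easy to cross-check numerically; the sketch below (our code) compares the closed form of Eq. 15 against a direct truncated summation of Eq. 1 under the counting measure:

    import math

    def pmf(lam, k):
        return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

    def chi2_pearson_direct(lam1, lam2, kmax=100):
        # Truncated sum of Eq. 1; the tail is negligible for small rates.
        return sum((pmf(lam2, k) - pmf(lam1, k)) ** 2 / pmf(lam1, k)
                   for k in range(kmax + 1))

    def chi2_pearson_closed(lam1, lam2):
        return math.exp(lam2 ** 2 / lam1 - 2 * lam2 + lam1) - 1   # Eq. 15

    print(chi2_pearson_direct(1.0, 2.0))   # ~1.71828
    print(chi2_pearson_closed(1.0, 2.0))   # e - 1 ~ 1.71828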

C. Extensions to higher-order Vajda χᵏ divergences

The higher-order Pearson-Vajda χᵏ_P and |χ|ᵏ_P distances [5], defined by

χᵏ_P(X₁ : X₂) = ∫ (x₂(x) − x₁(x))ᵏ / x₁(x)^{k−1} dν(x),   (17)

|χ|ᵏ_P(X₁ : X₂) = ∫ |x₂(x) − x₁(x)|ᵏ / x₁(x)^{k−1} dν(x),   (18)

are f-divergences for the generators (u − 1)ᵏ and |u − 1|ᵏ, respectively (with χᵏ_P(X₁ : X₂) ≤ |χ|ᵏ_P(X₁ : X₂)). When k = 1, χ¹_P(X₁ : X₂) = ∫ (x₂(x) − x₁(x)) dν(x) = 0 (i.e., this divergence is never discriminative), and |χ|¹_P(X₁, X₂) is twice the total variation distance (the only metric f-divergence [4]). χ⁰_P is the unit constant. Observe that the χᵏ_P distance may be negative for odd k (a signed distance), but |χ|ᵏ_P may not. We can compute the χᵏ_P term explicitly by performing the binomial expansion:

Lemma 3: The (signed) χᵏ_P distance between members X₁ ∼ E_F(θ₁) and X₂ ∼ E_F(θ₂) of the same affine exponential family is (for k ∈ ℕ) always bounded and equal to

χᵏ_P(X₁ : X₂) = Σ_{j=0}^{k} (−1)^{k−j} (k choose j) e^{F((1−j)θ₁+jθ₂) − ((1−j)F(θ₁)+jF(θ₂))}.   (19)

Proof:

χᵏ_P(X₁ : X₂) = ∫ (x₂(x) − x₁(x))ᵏ / x₁^{k−1}(x) dν(x)   (20)

= ∫ Σ_{j=0}^{k} (k choose j) (−1)^{k−j} x₂^j(x) x₁^{k−j}(x) / x₁^{k−1}(x) dν(x)   (21)

= Σ_{j=0}^{k} (−1)^{k−j} (k choose j) ∫ x₁^{1−j}(x) x₂^j(x) dν(x).   (22)

The proof then follows from Lemma 2, which shows that I_{1−j,j}(X₁ : X₂) = ∫ x₁^{1−j}(x) x₂^j(x) dν(x) = e^{F((1−j)θ₁+jθ₂) − ((1−j)F(θ₁)+jF(θ₂))}.

For Poisson/Normal distributions, we get:

χᵏ_P(λ₁ : λ₂) = Σ_{j=0}^{k} (k choose j) (−1)^{k−j} e^{λ₁^{1−j} λ₂^j − ((1−j)λ₁ + jλ₂)},   (23)

χᵏ_P(µ₁ : µ₂) = Σ_{j=0}^{k} (k choose j) (−1)^{k−j} e^{j(j−1)(µ₂−µ₁)ᵀ(µ₂−µ₁)/2}.   (24)

Observe that for λ₁ = λ₂ = λ, we have χᵏ_P(λ : λ) = Σ_{j=0}^{k} (−1)^{k−j} (k choose j) e^{λ−λ} = (1 − 1)ᵏ = 0 for k ≥ 1, as expected. The χᵏ_P value is always bounded. As a sanity check, consider the binomial expansion for k = 2: we have χ²_P(λ₁ : λ₂) = (2 choose 0) e^{λ₁−λ₁} − (2 choose 1) e^{λ₂−λ₂} + (2 choose 2) e^{λ₂²/λ₁ + λ₁ − 2λ₂} = e^{λ₂²/λ₁ − 2λ₂ + λ₁} − 1, in accordance with Eq. 15.

Consider a numerical example: let λ₁ = 0.6 and λ₂ = 0.3. Then χ²_P ≈ 0.16, χ³_P ≈ −0.03, χ⁴_P ≈ 0.04, χ⁵_P ≈ −0.02, χ⁶_P ≈ 0.02, χ⁷_P ≈ −0.01, χ⁸_P ≈ 0.01, etc. This numerical example illustrates the alternating sign of those χᵏ-type signed distances.
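The alternating signs can be reproduced with a direct implementation of Eq. 23; the following is a sketch of ours (chi_k_poisson is a hypothetical helper name), with printed values as we computed them, approximate in the last digit:

    import math

    def chi_k_poisson(lam1, lam2, k):
        """Signed chi^k_P distance between Poi(lam1) and Poi(lam2), Eq. 23."""
        return sum((-1) ** (k - j) * math.comb(k, j)
                   * math.exp(lam1 ** (1 - j) * lam2 ** j - ((1 - j) * lam1 + j * lam2))
                   for j in range(k + 1))

    for k in range(2, 9):
        print(k, round(chi_k_poisson(0.6, 0.3, k), 4))
    # approx: 0.1618, -0.0305, 0.0427, -0.0208, 0.0183, -0.0126, 0.0100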

III. f-DIVERGENCES FROM TAYLOR SERIES

Recall that the f-divergence defined for a generator f is I_f(X₁ : X₂) = ∫ x₁(x) f(x₂(x)/x₁(x)) dν(x). Assuming f analytic, we use the Taylor expansion about a point λ: f(x) = f(λ) + f′(λ)(x − λ) + ½ f″(λ)(x − λ)² + ... = Σ_{i=0}^{∞} (1/i!) f^{(i)}(λ)(x − λ)^i, the power series expansion of f, for λ ∈ int(dom(f^{(i)})) for all i.

Lemma 4 (extends Theorem 1 of [5]): When bounded, the f-divergence I_f can be expressed as a power series of higher-order Chi distances:

I_f(X₁ : X₂) = ∫ x₁(x) Σ_{i=0}^{∞} (1/i!) f^{(i)}(λ) (x₂(x)/x₁(x) − λ)^i dν(x) = Σ_{i=0}^{∞} (1/i!) f^{(i)}(λ) χ^i_{λ,P}(X₁ : X₂),   (25)

where in the second equality we have swapped the integral and the sum according to the Fubini theorem (since we assumed that I_f < ∞), and χ^i_{λ,P}(X₁ : X₂) is a generalization of the χ^i_P defined by (see Table I)

χ^i_{λ,P}(X₁ : X₂) = ∫ (x₂(x) − λx₁(x))^i / x₁(x)^{i−1} dν(x),   (26)

with χ⁰_{λ,P}(X₁ : X₂) = 1 by convention.

Choosing λ = 1 ∈ int(dom(f^{(i)})), we approximate the f-divergence as follows (Theorem 1 of [5]):

| I_f(X₁ : X₂) − Σ_{k=0}^{s} (f^{(k)}(1)/k!) χᵏ_P(X₁ : X₂) | ≤ (1/(s+1)!) ‖f^{(s+1)}‖_∞ (M − m)^{s+1},   (27)

where ‖f^{(s+1)}‖_∞ = sup_{t∈[m,M]} |f^{(s+1)}(t)| and m ≤ x₂(x)/x₁(x) ≤ M. Notice that by assuming this boundedness of the density ratio, we ensure that I_f < ∞.

Choosing λ = 0 (whenever 0 ∈ int(dom(f^{(i)}))) and affine exponential families, we get the f-divergence in a much simpler analytic expression:

I_f(X₁ : X₂) = Σ_{i=0}^{∞} (f^{(i)}(0)/i!) I_{1−i,i}(θ₁ : θ₂),   (28)

I_{1−i,i}(θ₁ : θ₂) = e^{F(iθ₂+(1−i)θ₁) − (iF(θ₂)+(1−i)F(θ₁))}.   (29)
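The λ = 0 expansion terminates for polynomial generators, which gives a quick check of Eqs. 28-29. The sketch below (ours; the helper names are hypothetical) recovers the Pearson χ²_P between Poisson laws from the derivatives of f(u) = (u − 1)² at 0, the computation that Example 1 below carries out analytically:

    import math

    def I_1mi_i_poisson(lam1, lam2, i):
        """I_{1-i,i}(theta1 : theta2) of Eq. 29 for the Poisson family."""
        return math.exp(lam1 ** (1 - i) * lam2 ** i - ((1 - i) * lam1 + i * lam2))

    def f_div_series(derivs_at_zero, lam1, lam2):
        """Eq. 28 with a finite list of derivatives f^{(i)}(0) (exact for polynomials)."""
        return sum(d / math.factorial(i) * I_1mi_i_poisson(lam1, lam2, i)
                   for i, d in enumerate(derivs_at_zero))

    # f(u) = (u-1)^2: f(0) = 1, f'(0) = -2, f''(0) = 2, all higher derivatives 0.
    print(f_div_series([1.0, -2.0, 2.0], 1.0, 2.0))        # ~1.7183
    print(math.exp(2.0 ** 2 / 1.0 - 2 * 2.0 + 1.0) - 1)    # e - 1, closed form of Eq. 15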

Lemma 5: The bounded f-divergences between members of the same affine exponential family can be computed as an equivalent power series whenever f is analytic.

Corollary 1: A second-order Taylor expansion yields I_f(X₁ : X₂) ≈ f(1) + f′(1) χ¹_N(X₁ : X₂) + ½ f″(1) χ²_N(X₁ : X₂). Since f(1) = 0 (f can always be renormalized) and χ¹_N(X₁ : X₂) = 0, it follows that

I_f(X₁ : X₂) ≈ ½ f″(1) χ²_N(X₁ : X₂),   (30)

and reciprocally χ²_N(X₁ : X₂) ≈ (2/f″(1)) I_f(X₁ : X₂) (where f″(1) > 0 follows from the strict convexity of the generator). When f(u) = u log u, this yields the well-known approximation [1]:

χ²_P(X₁ : X₂) ≈ 2 KL(X₁ : X₂).   (31)

For affine exponential families, we then plug in the closed-form formula of Lemma 1 to get a simple approximation formula for I_f. For example, consider the Jensen-Shannon divergence (Table I), with f″(u) = 1/u − 1/(u+1) and f″(1) = ½. It follows that I_JS(X₁ : X₂) ≈ ¼ χ²_N(X₁ : X₂). (For Poisson distributions with λ₁ = 5 and λ₂ = 5.1, we get about 1.5% relative error.)

A. Example 1: χ² revisited

Let us start with a sanity check for the χ² distance between Poisson distributions. The Pearson chi square distance is the f-divergence for f(u) = (u − 1)², with f′(u) = 2(u − 1), f″(u) = 2, and f^{(i)}(u) = 0 for i > 2. Thus f^{(0)}(0) = 1, f^{(1)}(0) = −2, f^{(2)}(0) = 2, and f^{(i)}(0) = 0 for i > 2. Recall that I_{1−i,i}(θ₁ : θ₂) = e^{F(iθ₂+(1−i)θ₁) − (iF(θ₂)+(1−i)F(θ₁))} = exp(λ₁^{1−i} λ₂^i − iλ₂ − (1−i)λ₁), and note that I_{1−i,i}(λ, λ) = e⁰ = 1 for all i. Thus we get I_f(X₁ : X₂) = I_{1,0} − 2 I_{0,1} + I_{−1,2}, with I_{1,0} = I_{0,1} = e⁰ = 1 and I_{−1,2} = e^{λ₂²/λ₁ − 2λ₂ + λ₁}. Thus we obtain I_f(X₁ : X₂) = −1 + e^{λ₂²/λ₁ − 2λ₂ + λ₁}, in accordance with Eq. 15.

B. Example 2: Kullback-Leibler divergence

By choosing f(u) = −log u, we obtain the Kullback-Leibler divergence (see Table I). We have f^{(i)}(u) = (−1)^i (i−1)! u^{−i}, and hence f^{(i)}(1)/i! = (−1)^i / i for i ≥ 1 (with f(1) = 0). Since χ¹_{1,P}(X₁ : X₂) = 0, it follows that

KL(X₁ : X₂) = Σ_{i=2}^{∞} ((−1)^i / i) χ^i_{1,P}(X₁ : X₂).   (32)
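A sketch (our code, with hypothetical helper names) of the truncated series of Eq. 32 for Poisson distributions; chi_k_poisson restates Eq. 23 (here χ^i_{1,P} = χ^i_P) so the block stands alone. The truncations approach the closed-form Bregman value discussed next:

    import math

    def chi_k_poisson(lam1, lam2, k):
        return sum((-1) ** (k - j) * math.comb(k, j)
                   * math.exp(lam1 ** (1 - j) * lam2 ** j - ((1 - j) * lam1 + j * lam2))
                   for j in range(k + 1))

    def kl_truncated(lam1, lam2, s):
        """First s terms of KL(X1 : X2) = sum_{i>=2} (-1)^i / i * chi^i_{1,P}, Eq. 32."""
        return sum((-1) ** i / i * chi_k_poisson(lam1, lam2, i) for i in range(2, s + 1))

    for s in (2, 3, 4, 10, 15):
        print(s, round(kl_truncated(0.6, 0.3, s), 4))  # approx 0.0809, 0.0911, 0.1018, 0.1135, 0.115
    print(0.3 - 0.6 + 0.6 * math.log(2.0))             # exact KL ~ 0.1159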

Note that for the KL divergence between members of the same exponential family, the divergence can be expressed in a simpler closed form using a Bregman divergence [8] on the swapped natural parameters. For example, consider Poisson distributions with λ₁ = 0.6 and λ₂ = 0.3: the Kullback-Leibler divergence computed from the equivalent Bregman divergence yields KL ≈ 0.1158, the stochastic evaluation of Eq. 8 with n = 10⁶ yields KL ≈ 0.1156, and the KL divergence obtained by truncating Eq. 32 to the first s terms yields the following sequence: 0.081 (s = 2), 0.091 (s = 3), 0.102 (s = 4), 0.1135 (s = 10), 0.1150 (s = 15), etc.

IV. CONCLUDING REMARKS

We investigated the calculation of statistical f-divergences between members of the same exponential family with affine natural space. We first reported a generic closed-form formula for the Pearson/Neyman χ² and Vajda χᵏ-type distances, and instantiated that formula for the Poisson and isotropic Gaussian affine exponential families. We then considered the Taylor expansion of the generator f at any given point λ to deduce an analytic expression of f-divergences using the Pearson-Vajda χᵏ_{λ,P} distances. A second-order Taylor approximation yielded a fast estimation of f-divergences. This framework shall find potential applications in signal processing and when designing inequality bounds between divergences. A Java™ package that illustrates the lemmas numerically is available online.

REFERENCES

[1] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York, NY, USA: Wiley-Interscience, 1991.
[2] F. Österreicher, "Distances based on the perimeter of the risk set of a testing problem," Austrian Journal of Statistics, vol. 42, no. 1, pp. 3–19, 2013.
[3] S. Amari and H. Nagaoka, Methods of Information Geometry. AMS, Oxford University Press, 2000.
[4] M. Khosravifard, D. Fooladivanda, and T. A. Gulliver, "Confliction of the convexity and metric properties in f-divergences," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E90-A, no. 9, 2007.
[5] N. Barnett, P. Cerone, S. Dragomir, and A. Sofo, "Approximating Csiszár f-divergence by the use of Taylor's formula with integral remainder," Mathematical Inequalities & Applications, vol. 5, no. 3, 2002.
[6] L. D. Brown, Fundamentals of Statistical Exponential Families: with Applications in Statistical Decision Theory. Hayward, CA, USA: Institute of Mathematical Statistics, 1986.
[7] A. Cichocki and S.-i. Amari, "Families of alpha-, beta-, and gamma-divergences: Flexible and robust measures of similarities," Entropy, vol. 12, no. 6, pp. 1532–1568, 2010.
[8] F. Nielsen and S. Boltz, "The Burbea-Rao and Bhattacharyya centroids," IEEE Transactions on Information Theory, vol. 57, no. 8, pp. 5455–5466, 2011.
