Journée Interdisciplinaire Mathématiques Musique


Music Information Geometry
Arnaud Dessein (1,2) and Arshia Cont (1)
(1) Institute for Research and Coordination of Acoustics and Music (IRCAM), Paris, France
(2) Japanese-French Laboratory for Informatics, Tokyo, Japan
IRMA, Strasbourg, April 7th 2011
arnaud.dessein@ircam.fr


Outline
1 Introduction
  A bit of history about science and music
  Motivations towards information geometry


Where do we come from?
Pythagoras: relation between string length and produced sound, Pythagorean tuning. "There is geometry in the humming of the strings, there is music in the spacing of the spheres."
Helmholtz: Helmholtz resonator, harmonics and frequency spectrum of sounds.
But also, indirectly, Fourier, Shannon, etc.


What do we need?
Figure: Levels of representation of audio, waveform and spectrogram representations.
Develop a comprehensive framework to quantify, process, and represent the information contained in audio signals.
Fill in the gap between signal and symbolic representations.

Outline
2 Background
  Exponential families


What is information geometry?
Statistical differentiable manifold. Under certain assumptions, a parametric statistical model $S = \{p_\xi : \xi \in \Xi\}$ of probability distributions defined on $X$ forms a differentiable manifold.
Example: $p_\xi(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$ for all $x \in X = \mathbb{R}$, with $\xi = (\mu, \sigma^2) \in \Xi = \mathbb{R} \times \mathbb{R}_{++}$.
Fisher information metric [Rao, 1945, Chentsov, 1982]. Under certain assumptions, the Fisher information matrix defines the unique Riemannian metric $g$ on $S$:
$g_{ij}(\xi) = \int_{x \in X} \partial_i \log p_\xi(x) \, \partial_j \log p_\xi(x) \, p_\xi(x) \, dx.$
Dual affine connections [Chentsov, 1982, Amari & Nagaoka, 2000]. Under certain assumptions, there is a unique family of dual affine connections $\{\nabla^{(\alpha)}, \nabla^{(-\alpha)}\}_{\alpha \in \mathbb{R}}$ on $(S, g)$, called the α-connections.
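To make the metric concrete, it can be checked numerically on the Gaussian example above, for which the closed form is $g(\xi) = \mathrm{diag}(1/\sigma^2, 1/(2\sigma^4))$. A minimal sketch (the grid, step sizes, and tolerances are illustrative choices):

```python
import numpy as np

# Numerical check of the Fisher information metric for the univariate
# Gaussian family, xi = (mu, sigma^2).
# Closed form: g(xi) = diag(1/sigma^2, 1/(2 sigma^4)).
mu, s2 = 1.0, 2.0
x = np.linspace(mu - 40.0, mu + 40.0, 400001)
dx = x[1] - x[0]

def log_p(m, v):
    return -0.5 * np.log(2.0 * np.pi * v) - (x - m) ** 2 / (2.0 * v)

p = np.exp(log_p(mu, s2))
eps = 1e-5
# Partial derivatives of log p with respect to mu and sigma^2
# (central finite differences).
d = [(log_p(mu + eps, s2) - log_p(mu - eps, s2)) / (2 * eps),
     (log_p(mu, s2 + eps) - log_p(mu, s2 - eps)) / (2 * eps)]
# Riemann-sum approximation of the defining integral for each g_ij.
g = np.array([[np.sum(di * dj * p) * dx for dj in d] for di in d])
assert np.allclose(g, np.diag([1 / s2, 1 / (2 * s2 ** 2)]), atol=1e-3)
```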

How to use information geometry from a computational viewpoint?
Exponential family. $p_\theta(x) = \exp\left(\theta^T T(x) - F(\theta) + C(x)\right)$ for all $x \in X$.
- $\theta$: natural parameters, a vector belonging to a convex open set $\Theta$.
- $F$: log-normalizer, a real-valued, strictly convex smooth function on $\Theta$.
- $C$: carrier measure, a real-valued function on $X$.
- $T$: sufficient statistic, a vector-valued function on $X$.
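As a sanity check of this parameterization, the Gaussian example can be put in exponential-family form with $T(x) = (x, x^2)$, $C(x) = 0$, $\theta = (\mu/\sigma^2, -1/(2\sigma^2))$ and $F(\theta) = -\theta_1^2/(4\theta_2) + \tfrac{1}{2}\log(\pi/(-\theta_2))$. A minimal sketch (the specific $\mu$ and $\sigma^2$ values are arbitrary):

```python
import numpy as np

# The univariate Gaussian N(mu, sigma^2) in exponential-family form
# p_theta(x) = exp(theta^T T(x) - F(theta) + C(x)), with T(x) = (x, x^2)
# and C(x) = 0.
mu, s2 = 0.7, 1.8
theta = np.array([mu / s2, -1.0 / (2.0 * s2)])

def F(theta):
    # Log-normalizer of the Gaussian in natural coordinates.
    return -theta[0] ** 2 / (4.0 * theta[1]) + 0.5 * np.log(np.pi / -theta[1])

x = np.linspace(-3.0, 3.0, 7)
p_expfam = np.exp(theta[0] * x + theta[1] * x ** 2 - F(theta))
p_gauss = np.exp(-(x - mu) ** 2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
assert np.allclose(p_expfam, p_gauss)  # the two densities coincide
```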

Figure: A taxonomy of exponential families (probability measures split into parametric and non-parametric; among parametric families, exponential families include the Bernoulli, binomial, multinomial, beta, gamma, Dirichlet, Weibull, Poisson, exponential, Rayleigh, and Gaussian distributions, with one, two, or more parameters, univariate or multivariate; non-exponential examples include the uniform, Cauchy, and Lévy skew α-stable distributions).


We consider a statistical manifold $S = \{p_\theta : \theta \in \Theta\}$ equipped with $g$ and the dual exponential and mixture connections $\nabla^{(1)}$ and $\nabla^{(-1)}$.
$(S, g, \nabla^{(1)}, \nabla^{(-1)})$ possesses two dual affine coordinate systems: the natural parameters $\theta$ and the expectation parameters $\eta = \nabla F(\theta)$.
This is a dually flat geometry with a Hessian structure ($g = \nabla^2 F$), generated by the potential $F$ together with its conjugate potential $F^*$ defined by the Legendre-Fenchel transform $F^*(\eta) = \sup_{\theta \in \Theta} \{\theta^T \eta - F(\theta)\}$, which verifies $\nabla F^* = (\nabla F)^{-1}$, so that $\theta = \nabla F^*(\eta)$.
It generalizes the self-dual Euclidean geometry, with notably two canonically associated Bregman divergences $B_F$ and $B_{F^*}$ instead of the self-dual Euclidean distance, but also dual geodesics, a generalized Pythagorean theorem, and dual projections.
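The Legendre duality can be verified on a concrete family. A minimal sketch using the Bernoulli family (an illustrative choice, not one discussed on the slides), where $F(\theta) = \log(1 + e^\theta)$, $\eta = \nabla F(\theta)$ is the sigmoid, and $F^*(\eta) = \eta \log \eta + (1-\eta)\log(1-\eta)$ with $\nabla F^*$ the logit:

```python
import math

# Legendre duality for the Bernoulli family.
def F(t): return math.log1p(math.exp(t))          # log-normalizer
def gradF(t): return 1.0 / (1.0 + math.exp(-t))   # eta = sigmoid(theta)
def Fstar(e): return e * math.log(e) + (1 - e) * math.log(1 - e)
def gradFstar(e): return math.log(e / (1 - e))    # logit

theta = 0.8
eta = gradF(theta)
# grad F* inverts grad F, so theta = grad F*(eta).
assert abs(gradFstar(eta) - theta) < 1e-12
# Fenchel equality at the optimum: F(theta) + F*(eta) = theta * eta.
assert abs(F(theta) + Fstar(eta) - theta * eta) < 1e-12
```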


Bregman divergence. $B_F(\theta, \theta') = F(\theta) - F(\theta') - (\theta - \theta')^T \nabla F(\theta')$.
These are the canonical divergences of dually flat spaces, in bijection with exponential families [Amari & Nagaoka, 2000, Banerjee et al., 2005]: $D_{\mathrm{KL}}(p_\theta \,\|\, p_{\theta'}) = B_F(\theta' \,\|\, \theta) = B_{F^*}(\eta \,\|\, \eta')$.
In general they satisfy neither symmetry nor the triangle inequality, but they carry an information-theoretic interpretation.
Generic algorithms handle many such generalized distances [Banerjee et al., 2005, Cayton, 2008, Cayton, 2009, Nielsen & Nock, 2009, Nielsen et al., 2009, Garcia et al., 2009]:
- Centroid computation and hard clustering (k-means).
- Parameter estimation and soft clustering (expectation-maximization).
- Proximity queries in ball trees (nearest neighbors and range search).
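The centroid computation above exploits a key property from [Banerjee et al., 2005]: the minimizer of $\sum_i B_F(x_i, c)$ over $c$ is the arithmetic mean of the $x_i$, for any $F$. A minimal numerical check, with $F(x) = \sum_k x_k \log x_k$ (yielding the generalized KL divergence); the random data are illustrative:

```python
import numpy as np

# Generic Bregman divergence B_F(x, y) = F(x) - F(y) - <x - y, grad F(y)>,
# here for F(x) = sum x log x on positive vectors.
def F(x): return float(np.sum(x * np.log(x)))
def gradF(x): return np.log(x) + 1.0

def bregman(x, y):
    return F(x) - F(y) - float(np.dot(x - y, gradF(y)))

rng = np.random.default_rng(0)
pts = rng.uniform(0.1, 1.0, size=(20, 5))
mean = pts.mean(axis=0)
loss = lambda c: sum(bregman(p, c) for p in pts)
# The arithmetic mean is optimal: any perturbed candidate does worse.
for _ in range(100):
    c = mean * np.exp(rng.normal(0.0, 0.1, size=5))  # stays positive
    assert loss(mean) <= loss(c) + 1e-12
```

This follows from the identity $\sum_i B_F(x_i, c) = \sum_i B_F(x_i, \bar{x}) + n\, B_F(\bar{x}, c)$, which is why k-means-style updates carry over unchanged to any Bregman divergence.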

Outline
3 A system based on information geometry
  General architecture
  Sound descriptors modeling
  Temporal modeling


How to design an audio system based on information geometry?
Audio stream decomposition (on-line) scheme:
1. Represent the incoming audio stream with short-time sound descriptors $d_j$.
2. Model these descriptors as probability distributions $p_{\theta_j}$ from a given exponential family.
3. Use the framework of computational information geometry on these distributions.
In particular, this makes it possible to define the notion of similarity in an information setup through divergences. There is an important need for temporal modeling.
Potential applications [Cont et al., 2011]: audio content analysis; segmentation of audio streams; automatic structure discovery of audio signals; sound processing and synthesis.
Figure: Schema of the general architecture of the system (auditory scene → short-time sound representation $d_j$ → sound descriptors modeling $p_{\theta_j}$ → temporal modeling).


How to model sounds?
Computation of a sound descriptor $d_j$:
- Fourier or constant-Q transforms for information on the spectral content.
- Mel-frequency cepstral coefficients for information on the timbre.
- Many other possibilities.
Modeling with a probability distribution $p_{\theta_j}$ from an exponential family:
- Categorical distributions.
- Many other possibilities.
Figure: Sound descriptors modeling.
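A minimal sketch of one such descriptor pipeline, assuming (as a simplification, not the authors' exact front end) that a frame's normalized magnitude spectrum is read directly as the parameters of a categorical distribution, which can then be compared through a divergence:

```python
import numpy as np

# Model each short-time frame by the categorical distribution obtained
# from its normalized magnitude spectrum, and compare frames with the
# KL divergence. The flooring constant is a hypothetical choice.
def frame_to_categorical(frame, floor=1e-12):
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    p = mag + floor                  # floor avoids zero probabilities
    return p / p.sum()               # categorical parameters (sum to 1)

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

t = np.arange(1024) / 44100.0
a440 = np.sin(2 * np.pi * 440.0 * t)
a880 = np.sin(2 * np.pi * 880.0 * t)
p, q = frame_to_categorical(a440), frame_to_categorical(a880)
# A frame is closer (in KL) to itself than to a different pitch.
assert kl(p, p) < 1e-9 and kl(p, q) > 0.1
```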


How to take time into account?
Model formation: from signal to symbol.
- Assumption of quasi-stationary audio chunks.
- Change detection adapted from CuSum [Basseville & Nikiforov, 1993].
Figure: Model formation at time t.
Factor oracle: from symbol to syntax (and from genetics to music!).
- Forward transitions: factors of the original sequence.
- Backward links: suffix relations, common context.
Figure: Factor oracle of the word abbbaab.
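The factor oracle can be built on-line with the standard construction of Allauzen, Crochemore and Raffinot; a minimal sketch on the word from the figure:

```python
# On-line factor oracle construction: forward transitions recognize at
# least every factor of the word, and backward suffix links encode
# repeated-context relations.
def factor_oracle(word):
    n = len(word)
    trans = [dict() for _ in range(n + 1)]  # trans[i][c] -> target state
    sfx = [-1] * (n + 1)                    # backward suffix links
    for i in range(1, n + 1):
        c = word[i - 1]
        trans[i - 1][c] = i
        k = sfx[i - 1]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i                 # add forward transition
            k = sfx[k]
        sfx[i] = 0 if k == -1 else trans[k][c]
    return trans, sfx

def accepts(trans, s):
    state = 0
    for c in s:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True

w = "abbbaab"
trans, sfx = factor_oracle(w)
# Every factor of the word is accepted (the oracle may accept more).
assert all(accepts(trans, w[i:j])
           for i in range(len(w)) for j in range(i, len(w) + 1))
```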

Outline
4 Applications
  Audio segmentation
  Music similarity analysis
  Musical structure discovery
  Query by similarity
  Audio recombination by concatenative synthesis
  Computer-assisted improvisation

Audio segmentation
Figure: Segmentation of the 1st Piano Sonata, 1st Movement, 1st Theme, Beethoven.
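Such a segmentation rests on the CuSum-style change detection mentioned earlier. A minimal one-sided CuSum sketch on a plain scalar stream (the drift and threshold values are hypothetical, and the actual system adapts the statistic to exponential-family models rather than raw samples):

```python
# One-sided CuSum: accumulate evidence of an upward mean shift and
# report a change when it exceeds a threshold.
def cusum(stream, target=0.0, drift=0.5, threshold=4.0):
    g, changes = 0.0, []
    for t, x in enumerate(stream):
        g = max(0.0, g + (x - target) - drift)  # accumulated evidence
        if g > threshold:
            changes.append(t)   # report a change and restart
            g = 0.0
    return changes

# Quiet segment followed by a clear upward shift at index 20.
stream = [0.1, -0.2, 0.0, 0.1] * 5 + [2.0, 2.1, 1.9, 2.2, 2.0] * 4
changes = cusum(stream)
assert changes and changes[0] >= 20  # first alarm only after the shift
```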

Music similarity analysis
Figure: Similarity analysis of the 1st Piano Sonata, 3rd Movement, Beethoven.

Musical structure discovery
Figure: Structure discovery of the 1st Piano Sonata, 3rd Movement, Beethoven.

Query by similarity
Figure: Query by similarity of the 1st Theme over the entire 1st Piano Sonata, 1st Movement, Beethoven.

Audio recombination by concatenative synthesis
Figure: Audio recombination of African drums by concatenative synthesis of congas.

Computer-assisted improvisation
Figure: Computer-assisted improvisation, Fabrizio Cassol and Philippe Leclerc.


What we (don't) have
Summary and perspectives: representations, descriptors modeling, temporal modeling, temporality of events.

Representations: many possibilities; combinations of descriptors; complex representations.

Descriptors modeling: exponential families and Bregman divergences; mixture models of a given exponential family; other geometries, divergences, metrics.

Temporal modeling: on-line segmentation and factor oracle; on-line clustering and equivalence between symbols; overlap between symbols and other temporal models.

Temporality of events: assumption of quasi-stationarity; non-stationarity modeling; time series.


Resources on information geometry:
- National research group: IRCAM, Ecole Polytechnique, Thales, etc.
- Brillouin seminar; IGAIA.
Thank you very much for your attention! Questions?
This work was supported by a doctoral fellowship from the UPMC (EDITE) and by a grant from the JST-CNRS ICT (Improving the VR Experience).

Bibliography
Amari, S.-i. & Nagaoka, H. (2000). Methods of information geometry, volume 191 of Translations of Mathematical Monographs. American Mathematical Society.
Banerjee, A., Merugu, S., Dhillon, I. S., & Ghosh, J. (2005). Clustering with Bregman divergences. Journal of Machine Learning Research, 6.
Basseville, M. & Nikiforov, V. (1993). Detection of abrupt changes: Theory and application. Englewood Cliffs, NJ, USA: Prentice-Hall.
Cayton, L. (2008). Fast nearest neighbor retrieval for Bregman divergences. In Proceedings of the 25th International Conference on Machine Learning, volume 307. Helsinki, Finland.
Cayton, L. (2009). Efficient Bregman range search. In Advances in Neural Information Processing Systems, volume 22. Curran Associates, Inc.
Chentsov, N. N. (1982). Statistical decision rules and optimal inference, volume 53 of Translations of Mathematical Monographs. American Mathematical Society.

Cont, A., Dubnov, S., & Assayag, G. (2011). On the information geometry of audio streams with applications to similarity computing. IEEE Transactions on Audio, Speech and Language Processing, 19. To appear.
Garcia, V., Nielsen, F., & Nock, R. (2009). Levels of details for Gaussian mixture models. In Proceedings of the 9th Asian Conference on Computer Vision, ACCV 2009. Xi'an, China.
Nielsen, F. & Nock, R. (2009). Sided and symmetrized Bregman centroids. IEEE Transactions on Information Theory, 55(6).
Nielsen, F., Piro, P., & Barlaud, M. (2009). Tailored Bregman ball trees for effective nearest neighbors. In Proceedings of the 25th European Workshop on Computational Geometry (EuroCG). Brussels, Belgium.
Rao, C. R. (1945). Information and accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society, 37.


More information

Information geometry of Bayesian statistics

Information geometry of Bayesian statistics Information geometry of Bayesian statistics Hiroshi Matsuzoe Department of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology, Nagoya 466-8555, Japan Abstract.

More information

Information geometry of mirror descent

Information geometry of mirror descent Information geometry of mirror descent Geometric Science of Information Anthea Monod Department of Statistical Science Duke University Information Initiative at Duke G. Raskutti (UW Madison) and S. Mukherjee

More information

Legendre transformation and information geometry

Legendre transformation and information geometry Legendre transformation and information geometry CIG-MEMO #2, v1 Frank Nielsen École Polytechnique Sony Computer Science Laboratorie, Inc http://www.informationgeometry.org September 2010 Abstract We explain

More information

June 21, Peking University. Dual Connections. Zhengchao Wan. Overview. Duality of connections. Divergence: general contrast functions

June 21, Peking University. Dual Connections. Zhengchao Wan. Overview. Duality of connections. Divergence: general contrast functions Dual Peking University June 21, 2016 Divergences: Riemannian connection Let M be a manifold on which there is given a Riemannian metric g =,. A connection satisfying Z X, Y = Z X, Y + X, Z Y (1) for all

More information

CS Lecture 19. Exponential Families & Expectation Propagation

CS Lecture 19. Exponential Families & Expectation Propagation CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces

More information

Belief Propagation, Information Projections, and Dykstra s Algorithm

Belief Propagation, Information Projections, and Dykstra s Algorithm Belief Propagation, Information Projections, and Dykstra s Algorithm John MacLaren Walsh, PhD Department of Electrical and Computer Engineering Drexel University Philadelphia, PA jwalsh@ece.drexel.edu

More information

Bayesian Models in Machine Learning

Bayesian Models in Machine Learning Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of

More information

Information Geometry: Background and Applications in Machine Learning

Information Geometry: Background and Applications in Machine Learning Geometry and Computer Science Information Geometry: Background and Applications in Machine Learning Giovanni Pistone www.giannidiorestino.it Pescara IT), February 8 10, 2017 Abstract Information Geometry

More information

AN AFFINE EMBEDDING OF THE GAMMA MANIFOLD

AN AFFINE EMBEDDING OF THE GAMMA MANIFOLD AN AFFINE EMBEDDING OF THE GAMMA MANIFOLD C.T.J. DODSON AND HIROSHI MATSUZOE Abstract. For the space of gamma distributions with Fisher metric and exponential connections, natural coordinate systems, potential

More information

Bias-Variance Trade-Off in Hierarchical Probabilistic Models Using Higher-Order Feature Interactions

Bias-Variance Trade-Off in Hierarchical Probabilistic Models Using Higher-Order Feature Interactions - Trade-Off in Hierarchical Probabilistic Models Using Higher-Order Feature Interactions Simon Luo The University of Sydney Data61, CSIRO simon.luo@data61.csiro.au Mahito Sugiyama National Institute of

More information

Single Channel Signal Separation Using MAP-based Subspace Decomposition

Single Channel Signal Separation Using MAP-based Subspace Decomposition Single Channel Signal Separation Using MAP-based Subspace Decomposition Gil-Jin Jang, Te-Won Lee, and Yung-Hwan Oh 1 Spoken Language Laboratory, Department of Computer Science, KAIST 373-1 Gusong-dong,

More information

Information Geometric Structure on Positive Definite Matrices and its Applications

Information Geometric Structure on Positive Definite Matrices and its Applications Information Geometric Structure on Positive Definite Matrices and its Applications Atsumi Ohara Osaka University 2010 Feb. 21 at Osaka City University 大阪市立大学数学研究所情報幾何関連分野研究会 2010 情報工学への幾何学的アプローチ 1 Outline

More information

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Frank Nielsen 1 and Gaëtan Hadjeres 1 École Polytechnique, France/Sony Computer Science Laboratories, Japan Frank.Nielsen@acm.org

More information

F -Geometry and Amari s α Geometry on a Statistical Manifold

F -Geometry and Amari s α Geometry on a Statistical Manifold Entropy 014, 16, 47-487; doi:10.3390/e160547 OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Article F -Geometry and Amari s α Geometry on a Statistical Manifold Harsha K. V. * and Subrahamanian

More information

A Geometric View of Conjugate Priors

A Geometric View of Conjugate Priors Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence A Geometric View of Conjugate Priors Arvind Agarwal Department of Computer Science University of Maryland College

More information

arxiv: v1 [cs.cg] 14 Sep 2007

arxiv: v1 [cs.cg] 14 Sep 2007 Bregman Voronoi Diagrams: Properties, Algorithms and Applications Frank Nielsen Jean-Daniel Boissonnat Richard Nock arxiv:0709.2196v1 [cs.cg] 14 Sep 2007 Abstract The Voronoi diagram of a finite set of

More information

Statistical Machine Learning from Data

Statistical Machine Learning from Data Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Support Vector Machines Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique

More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. A Superharmonic Prior for the Autoregressive Process of the Second Order

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. A Superharmonic Prior for the Autoregressive Process of the Second Order MATHEMATICAL ENGINEERING TECHNICAL REPORTS A Superharmonic Prior for the Autoregressive Process of the Second Order Fuyuhiko TANAKA and Fumiyasu KOMAKI METR 2006 30 May 2006 DEPARTMENT OF MATHEMATICAL

More information

Maximum likelihood estimation

Maximum likelihood estimation Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization

More information

Algorithms for Variational Learning of Mixture of Gaussians

Algorithms for Variational Learning of Mixture of Gaussians Algorithms for Variational Learning of Mixture of Gaussians Instructors: Tapani Raiko and Antti Honkela Bayes Group Adaptive Informatics Research Center 28.08.2008 Variational Bayesian Inference Mixture

More information

Information geometry of the power inverse Gaussian distribution

Information geometry of the power inverse Gaussian distribution Information geometry of the power inverse Gaussian distribution Zhenning Zhang, Huafei Sun and Fengwei Zhong Abstract. The power inverse Gaussian distribution is a common distribution in reliability analysis

More information

A geometric view of conjugate priors

A geometric view of conjugate priors Mach Learn (2010) 81: 99 113 DOI 10.1007/s10994-010-5203-x A geometric view of conjugate priors Arvind Agarwal Hal Daumé III Received: 30 April 2010 / Accepted: 20 June 2010 / Published online: 22 July

More information

Machine Learning Overview

Machine Learning Overview Machine Learning Overview Sargur N. Srihari University at Buffalo, State University of New York USA 1 Outline 1. What is Machine Learning (ML)? 2. Types of Information Processing Problems Solved 1. Regression

More information

Advanced Machine Learning & Perception

Advanced Machine Learning & Perception Advanced Machine Learning & Perception Instructor: Tony Jebara Topic 6 Standard Kernels Unusual Input Spaces for Kernels String Kernels Probabilistic Kernels Fisher Kernels Probability Product Kernels

More information

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction

More information

MANY digital speech communication applications, e.g.,

MANY digital speech communication applications, e.g., 406 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 2, FEBRUARY 2007 An MMSE Estimator for Speech Enhancement Under a Combined Stochastic Deterministic Speech Model Richard C.

More information

Brief Introduction of Machine Learning Techniques for Content Analysis

Brief Introduction of Machine Learning Techniques for Content Analysis 1 Brief Introduction of Machine Learning Techniques for Content Analysis Wei-Ta Chu 2008/11/20 Outline 2 Overview Gaussian Mixture Model (GMM) Hidden Markov Model (HMM) Support Vector Machine (SVM) Overview

More information

An Information Geometry Perspective on Estimation of Distribution Algorithms: Boundary Analysis

An Information Geometry Perspective on Estimation of Distribution Algorithms: Boundary Analysis An Information Geometry Perspective on Estimation of Distribution Algorithms: Boundary Analysis Luigi Malagò Department of Electronics and Information Politecnico di Milano Via Ponzio, 34/5 20133 Milan,

More information

On the Chi square and higher-order Chi distances for approximating f-divergences

On the Chi square and higher-order Chi distances for approximating f-divergences c 2013 Frank Nielsen, Sony Computer Science Laboratories, Inc. 1/17 On the Chi square and higher-order Chi distances for approximating f-divergences Frank Nielsen 1 Richard Nock 2 www.informationgeometry.org

More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. An Extension of Least Angle Regression Based on the Information Geometry of Dually Flat Spaces

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. An Extension of Least Angle Regression Based on the Information Geometry of Dually Flat Spaces MATHEMATICAL ENGINEERING TECHNICAL REPORTS An Extension of Least Angle Regression Based on the Information Geometry of Dually Flat Spaces Yoshihiro HIROSE and Fumiyasu KOMAKI METR 2009 09 March 2009 DEPARTMENT

More information

Latent Dirichlet Allocation Introduction/Overview

Latent Dirichlet Allocation Introduction/Overview Latent Dirichlet Allocation Introduction/Overview David Meyer 03.10.2016 David Meyer http://www.1-4-5.net/~dmm/ml/lda_intro.pdf 03.10.2016 Agenda What is Topic Modeling? Parametric vs. Non-Parametric Models

More information

Riemannian Metric Learning for Symmetric Positive Definite Matrices

Riemannian Metric Learning for Symmetric Positive Definite Matrices CMSC 88J: Linear Subspaces and Manifolds for Computer Vision and Machine Learning Riemannian Metric Learning for Symmetric Positive Definite Matrices Raviteja Vemulapalli Guide: Professor David W. Jacobs

More information

PMR Learning as Inference

PMR Learning as Inference Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning

More information

Patch matching with polynomial exponential families and projective divergences

Patch matching with polynomial exponential families and projective divergences Patch matching with polynomial exponential families and projective divergences Frank Nielsen 1 and Richard Nock 2 1 École Polytechnique and Sony Computer Science Laboratories Inc Frank.Nielsen@acm.org

More information

Sound Recognition in Mixtures

Sound Recognition in Mixtures Sound Recognition in Mixtures Juhan Nam, Gautham J. Mysore 2, and Paris Smaragdis 2,3 Center for Computer Research in Music and Acoustics, Stanford University, 2 Advanced Technology Labs, Adobe Systems

More information

Conditional Gradient (Frank-Wolfe) Method

Conditional Gradient (Frank-Wolfe) Method Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties

More information

The Information Bottleneck Revisited or How to Choose a Good Distortion Measure

The Information Bottleneck Revisited or How to Choose a Good Distortion Measure The Information Bottleneck Revisited or How to Choose a Good Distortion Measure Peter Harremoës Centrum voor Wiskunde en Informatica PO 94079, 1090 GB Amsterdam The Nederlands PHarremoes@cwinl Naftali

More information

Clustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014.

Clustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014. Clustering K-means Machine Learning CSE546 Carlos Guestrin University of Washington November 4, 2014 1 Clustering images Set of Images [Goldberger et al.] 2 1 K-means Randomly initialize k centers µ (0)

More information

Non-Negative Matrix Factorization And Its Application to Audio. Tuomas Virtanen Tampere University of Technology

Non-Negative Matrix Factorization And Its Application to Audio. Tuomas Virtanen Tampere University of Technology Non-Negative Matrix Factorization And Its Application to Audio Tuomas Virtanen Tampere University of Technology tuomas.virtanen@tut.fi 2 Contents Introduction to audio signals Spectrogram representation

More information

SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS. Emad M. Grais and Hakan Erdogan

SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS. Emad M. Grais and Hakan Erdogan SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS Emad M. Grais and Hakan Erdogan Faculty of Engineering and Natural Sciences, Sabanci University, Orhanli

More information

Dominant Feature Vectors Based Audio Similarity Measure

Dominant Feature Vectors Based Audio Similarity Measure Dominant Feature Vectors Based Audio Similarity Measure Jing Gu 1, Lie Lu 2, Rui Cai 3, Hong-Jiang Zhang 2, and Jian Yang 1 1 Dept. of Electronic Engineering, Tsinghua Univ., Beijing, 100084, China 2 Microsoft

More information

Information geometry connecting Wasserstein distance and Kullback Leibler divergence via the entropy-relaxed transportation problem

Information geometry connecting Wasserstein distance and Kullback Leibler divergence via the entropy-relaxed transportation problem https://doi.org/10.1007/s41884-018-0002-8 RESEARCH PAPER Information geometry connecting Wasserstein distance and Kullback Leibler divergence via the entropy-relaxed transportation problem Shun-ichi Amari

More information

Single Channel Music Sound Separation Based on Spectrogram Decomposition and Note Classification

Single Channel Music Sound Separation Based on Spectrogram Decomposition and Note Classification Single Channel Music Sound Separation Based on Spectrogram Decomposition and Note Classification Hafiz Mustafa and Wenwu Wang Centre for Vision, Speech and Signal Processing (CVSSP) University of Surrey,

More information

arxiv: v1 [cs.lg] 22 Oct 2018

arxiv: v1 [cs.lg] 22 Oct 2018 The Bregman chord divergence arxiv:1810.09113v1 [cs.lg] Oct 018 rank Nielsen Sony Computer Science Laboratories Inc, Japan rank.nielsen@acm.org Richard Nock Data 61, Australia The Australian National &

More information

Mean-field equations for higher-order quantum statistical models : an information geometric approach

Mean-field equations for higher-order quantum statistical models : an information geometric approach Mean-field equations for higher-order quantum statistical models : an information geometric approach N Yapage Department of Mathematics University of Ruhuna, Matara Sri Lanka. arxiv:1202.5726v1 [quant-ph]

More information

Statistical physics models belonging to the generalised exponential family

Statistical physics models belonging to the generalised exponential family Statistical physics models belonging to the generalised exponential family Jan Naudts Universiteit Antwerpen 1. The generalized exponential family 6. The porous media equation 2. Theorem 7. The microcanonical

More information

The Karcher Mean of Points on SO n

The Karcher Mean of Points on SO n The Karcher Mean of Points on SO n Knut Hüper joint work with Jonathan Manton (Univ. Melbourne) Knut.Hueper@nicta.com.au National ICT Australia Ltd. CESAME LLN, 15/7/04 p.1/25 Contents Introduction CESAME

More information

NMF WITH SPECTRAL AND TEMPORAL CONTINUITY CRITERIA FOR MONAURAL SOUND SOURCE SEPARATION. Julian M. Becker, Christian Sohn and Christian Rohlfing

NMF WITH SPECTRAL AND TEMPORAL CONTINUITY CRITERIA FOR MONAURAL SOUND SOURCE SEPARATION. Julian M. Becker, Christian Sohn and Christian Rohlfing NMF WITH SPECTRAL AND TEMPORAL CONTINUITY CRITERIA FOR MONAURAL SOUND SOURCE SEPARATION Julian M. ecker, Christian Sohn Christian Rohlfing Institut für Nachrichtentechnik RWTH Aachen University D-52056

More information

Functional Bregman Divergence and Bayesian Estimation of Distributions Béla A. Frigyik, Santosh Srivastava, and Maya R. Gupta

Functional Bregman Divergence and Bayesian Estimation of Distributions Béla A. Frigyik, Santosh Srivastava, and Maya R. Gupta 5130 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 54, NO. 11, NOVEMBER 2008 Functional Bregman Divergence and Bayesian Estimation of Distributions Béla A. Frigyik, Santosh Srivastava, and Maya R. Gupta

More information

LECTURE 11: EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS

LECTURE 11: EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS LECTURE : EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS HANI GOODARZI AND SINA JAFARPOUR. EXPONENTIAL FAMILY. Exponential family comprises a set of flexible distribution ranging both continuous and

More information

Natural Gradient for the Gaussian Distribution via Least Squares Regression

Natural Gradient for the Gaussian Distribution via Least Squares Regression Natural Gradient for the Gaussian Distribution via Least Squares Regression Luigi Malagò Department of Electrical and Electronic Engineering Shinshu University & INRIA Saclay Île-de-France 4-17-1 Wakasato,

More information

Hessian Riemannian Gradient Flows in Convex Programming

Hessian Riemannian Gradient Flows in Convex Programming Hessian Riemannian Gradient Flows in Convex Programming Felipe Alvarez, Jérôme Bolte, Olivier Brahic INTERNATIONAL CONFERENCE ON MODELING AND OPTIMIZATION MODOPT 2004 Universidad de La Frontera, Temuco,

More information

Density Modeling and Clustering Using Dirichlet Diffusion Trees

Density Modeling and Clustering Using Dirichlet Diffusion Trees p. 1/3 Density Modeling and Clustering Using Dirichlet Diffusion Trees Radford M. Neal Bayesian Statistics 7, 2003, pp. 619-629. Presenter: Ivo D. Shterev p. 2/3 Outline Motivation. Data points generation.

More information

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007

More information

CRYSTALLIZATION SONIFICATION OF HIGH-DIMENSIONAL DATASETS

CRYSTALLIZATION SONIFICATION OF HIGH-DIMENSIONAL DATASETS Proceedings of the 22 International Conference on Auditory Display, Kyoto, Japan, July 2 5, 22 CRYSTALLIZATION SONIFICATION OF HIGH-DIMENSIONAL DATASETS T. Hermann Faculty of Technology Bielefeld University,

More information

A CUSUM approach for online change-point detection on curve sequences

A CUSUM approach for online change-point detection on curve sequences ESANN 22 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges Belgium, 25-27 April 22, i6doc.com publ., ISBN 978-2-8749-49-. Available

More information

Geometry of Goodness-of-Fit Testing in High Dimensional Low Sample Size Modelling

Geometry of Goodness-of-Fit Testing in High Dimensional Low Sample Size Modelling Geometry of Goodness-of-Fit Testing in High Dimensional Low Sample Size Modelling Paul Marriott 1, Radka Sabolova 2, Germain Van Bever 2, and Frank Critchley 2 1 University of Waterloo, Waterloo, Ontario,

More information

LECTURE 15: COMPLETENESS AND CONVEXITY

LECTURE 15: COMPLETENESS AND CONVEXITY LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other

More information

arxiv: v2 [math.st] 22 Aug 2018

arxiv: v2 [math.st] 22 Aug 2018 GENERALIZED BREGMAN AND JENSEN DIVERGENCES WHICH INCLUDE SOME F-DIVERGENCES TOMOHIRO NISHIYAMA arxiv:808.0648v2 [math.st] 22 Aug 208 Abstract. In this paper, we introduce new classes of divergences by

More information

COMS 4771 Lecture Course overview 2. Maximum likelihood estimation (review of some statistics)

COMS 4771 Lecture Course overview 2. Maximum likelihood estimation (review of some statistics) COMS 4771 Lecture 1 1. Course overview 2. Maximum likelihood estimation (review of some statistics) 1 / 24 Administrivia This course Topics http://www.satyenkale.com/coms4771/ 1. Supervised learning Core

More information

Lecture 6: Gaussian Mixture Models (GMM)

Lecture 6: Gaussian Mixture Models (GMM) Helsinki Institute for Information Technology Lecture 6: Gaussian Mixture Models (GMM) Pedram Daee 3.11.2015 Outline Gaussian Mixture Models (GMM) Models Model families and parameters Parameter learning

More information

Gentle Introduction to Infinite Gaussian Mixture Modeling

Gentle Introduction to Infinite Gaussian Mixture Modeling Gentle Introduction to Infinite Gaussian Mixture Modeling with an application in neuroscience By Frank Wood Rasmussen, NIPS 1999 Neuroscience Application: Spike Sorting Important in neuroscience and for

More information

Characterizing the Region of Entropic Vectors via Information Geometry

Characterizing the Region of Entropic Vectors via Information Geometry Characterizing the Region of Entropic Vectors via Information Geometry John MacLaren Walsh Department of Electrical and Computer Engineering Drexel University Philadelphia, PA jwalsh@ece.drexel.edu Thanks

More information

PROPERTIES OF THE EMPIRICAL CHARACTERISTIC FUNCTION AND ITS APPLICATION TO TESTING FOR INDEPENDENCE. Noboru Murata

PROPERTIES OF THE EMPIRICAL CHARACTERISTIC FUNCTION AND ITS APPLICATION TO TESTING FOR INDEPENDENCE. Noboru Murata ' / PROPERTIES OF THE EMPIRICAL CHARACTERISTIC FUNCTION AND ITS APPLICATION TO TESTING FOR INDEPENDENCE Noboru Murata Waseda University Department of Electrical Electronics and Computer Engineering 3--

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices

A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices Ryota Tomioka 1, Taiji Suzuki 1, Masashi Sugiyama 2, Hisashi Kashima 1 1 The University of Tokyo 2 Tokyo Institute of Technology 2010-06-22

More information

1. Geometry of the unit tangent bundle

1. Geometry of the unit tangent bundle 1 1. Geometry of the unit tangent bundle The main reference for this section is [8]. In the following, we consider (M, g) an n-dimensional smooth manifold endowed with a Riemannian metric g. 1.1. Notations

More information

Kernel Learning with Bregman Matrix Divergences

Kernel Learning with Bregman Matrix Divergences Kernel Learning with Bregman Matrix Divergences Inderjit S. Dhillon The University of Texas at Austin Workshop on Algorithms for Modern Massive Data Sets Stanford University and Yahoo! Research June 22,

More information

Bregman divergence and density integration Noboru Murata and Yu Fujimoto

Bregman divergence and density integration Noboru Murata and Yu Fujimoto Journal of Math-for-industry, Vol.1(2009B-3), pp.97 104 Bregman divergence and density integration Noboru Murata and Yu Fujimoto Received on August 29, 2009 / Revised on October 4, 2009 Abstract. In this

More information

Learning Methods for Online Prediction Problems. Peter Bartlett Statistics and EECS UC Berkeley

Learning Methods for Online Prediction Problems. Peter Bartlett Statistics and EECS UC Berkeley Learning Methods for Online Prediction Problems Peter Bartlett Statistics and EECS UC Berkeley Course Synopsis A finite comparison class: A = {1,..., m}. 1. Prediction with expert advice. 2. With perfect

More information

A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement

A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement Simon Leglaive 1 Laurent Girin 1,2 Radu Horaud 1 1: Inria Grenoble Rhône-Alpes 2: Univ. Grenoble Alpes, Grenoble INP,

More information

Lecture 1 October 9, 2013

Lecture 1 October 9, 2013 Probabilistic Graphical Models Fall 2013 Lecture 1 October 9, 2013 Lecturer: Guillaume Obozinski Scribe: Huu Dien Khue Le, Robin Bénesse The web page of the course: http://www.di.ens.fr/~fbach/courses/fall2013/

More information

Two Further Gradient BYY Learning Rules for Gaussian Mixture with Automated Model Selection

Two Further Gradient BYY Learning Rules for Gaussian Mixture with Automated Model Selection Two Further Gradient BYY Learning Rules for Gaussian Mixture with Automated Model Selection Jinwen Ma, Bin Gao, Yang Wang, and Qiansheng Cheng Department of Information Science, School of Mathematical

More information