Journée Interdisciplinaire Mathématiques Musique


Journée Interdisciplinaire Mathématiques Musique. Music Information Geometry. Arnaud Dessein (1,2) and Arshia Cont (1). (1) Institute for Research and Coordination of Acoustics and Music, Paris, France. (2) Japanese-French Laboratory for Informatics, Tokyo, Japan. IRMA, Strasbourg, April 7th 2011. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 1/21

Outline Introduction 1 Introduction 2 3 4 5 arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 2/21

Outline Introduction A bit of history about science and music Motivations towards information geometry 1 Introduction A bit of history about science and music Motivations towards information geometry 2 3 4 5 arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 3/21

Where do we come from? Introduction A bit of history about science and music Motivations towards information geometry Pythagoras (c. 570–495 BC): relation between string length and produced sound, Pythagorean tuning. "There is geometry in the humming of the strings, there is music in the spacing of the spheres." arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 4/21

Where do we come from? Introduction A bit of history about science and music Motivations towards information geometry Pythagoras (c. 570–495 BC): relation between string length and produced sound, Pythagorean tuning. "There is geometry in the humming of the strings, there is music in the spacing of the spheres." Helmholtz (1821–1894): Helmholtz resonator, harmonics and frequency spectrum of sounds. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 4/21

Where do we come from? Introduction A bit of history about science and music Motivations towards information geometry Pythagoras (c. 570–495 BC): relation between string length and produced sound, Pythagorean tuning. "There is geometry in the humming of the strings, there is music in the spacing of the spheres." Helmholtz (1821–1894): Helmholtz resonator, harmonics and frequency spectrum of sounds. But also indirectly Fourier, Shannon, etc. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 4/21

What do we need? Introduction A bit of history about science and music Motivations towards information geometry Figure: Levels of representation of audio, waveform and spectrogram representations. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 5/21

What do we need? Introduction A bit of history about science and music Motivations towards information geometry Figure: Levels of representation of audio, waveform and spectrogram representations. Develop a comprehensive framework that makes it possible to quantify, process and represent the information contained in audio signals. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 5/21

What do we need? Introduction A bit of history about science and music Motivations towards information geometry Figure: Levels of representation of audio, waveform and spectrogram representations. Develop a comprehensive framework that makes it possible to quantify, process and represent the information contained in audio signals. Bridge the gap between signal and symbolic representations. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 5/21

Outline Introduction Background Exponential families 1 Introduction 2 Background Exponential families 3 4 5 arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 6/21

What is information geometry? Background Exponential families Statistical differentiable manifold. Under certain assumptions, a parametric statistical model S = {p_ξ : ξ ∈ Ξ} of probability distributions defined on X forms a differentiable manifold. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 7/21

What is information geometry? Background Exponential families Statistical differentiable manifold. Under certain assumptions, a parametric statistical model S = {p_ξ : ξ ∈ Ξ} of probability distributions defined on X forms a differentiable manifold. Example: p_ξ(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)) for all x ∈ X = ℝ, with ξ = [µ, σ²] ∈ Ξ = ℝ × ℝ₊₊. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 7/21

What is information geometry? Background Exponential families Statistical differentiable manifold. Under certain assumptions, a parametric statistical model S = {p_ξ : ξ ∈ Ξ} of probability distributions defined on X forms a differentiable manifold. Example: p_ξ(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)) for all x ∈ X = ℝ, with ξ = [µ, σ²] ∈ Ξ = ℝ × ℝ₊₊. Fisher information metric [Rao, 1945, Chentsov, 1982]. Under certain assumptions, the Fisher information matrix defines the unique Riemannian metric g on S: g_ij(ξ) = ∫_{x∈X} ∂_i log p_ξ(x) ∂_j log p_ξ(x) p_ξ(x) dx. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 7/21

What is information geometry? Background Exponential families Statistical differentiable manifold. Under certain assumptions, a parametric statistical model S = {p_ξ : ξ ∈ Ξ} of probability distributions defined on X forms a differentiable manifold. Example: p_ξ(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)) for all x ∈ X = ℝ, with ξ = [µ, σ²] ∈ Ξ = ℝ × ℝ₊₊. Fisher information metric [Rao, 1945, Chentsov, 1982]. Under certain assumptions, the Fisher information matrix defines the unique Riemannian metric g on S: g_ij(ξ) = ∫_{x∈X} ∂_i log p_ξ(x) ∂_j log p_ξ(x) p_ξ(x) dx. Dual affine connections [Chentsov, 1982, Amari & Nagaoka, 2000]. Under certain assumptions, there is a unique family of dual affine connections {∇^(α), ∇^(−α)}_{α∈ℝ} on (S, g) called α-connections. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 7/21
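For the Gaussian example above, the Fisher information matrix can be written out explicitly; this worked instance is added here for concreteness and is not part of the original slides.

```latex
% Fisher metric of the univariate Gaussian family in the coordinates xi = [mu, sigma^2].
\[
  g(\xi) =
  \begin{pmatrix}
    \dfrac{1}{\sigma^{2}} & 0 \\
    0 & \dfrac{1}{2\sigma^{4}}
  \end{pmatrix},
  \qquad
  ds^{2} = \frac{d\mu^{2}}{\sigma^{2}} + \frac{\bigl(d\sigma^{2}\bigr)^{2}}{2\sigma^{4}}.
\]
```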

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. θ: natural parameters, vector belonging to a convex open set Θ. F: log-normalizer, real-valued, strictly convex smooth function on Θ. C: carrier measure, real-valued function on X. T: sufficient statistic, vector-valued function on X. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21
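As a concrete check of these definitions, here is a minimal numerical sketch (not from the original slides; the function names are mine) putting the univariate Gaussian into this natural-parameter form, with T(x) = (x, x²) and C(x) = 0:

```python
import numpy as np

def gaussian_natural_params(mu, sigma2):
    """Natural parameters theta = (mu / sigma^2, -1 / (2 sigma^2)) of N(mu, sigma^2)."""
    return np.array([mu / sigma2, -1.0 / (2.0 * sigma2)])

def gaussian_log_normalizer(theta):
    """Log-normalizer F(theta) = -theta_1^2 / (4 theta_2) + (1/2) log(-pi / theta_2)."""
    return -theta[0] ** 2 / (4.0 * theta[1]) + 0.5 * np.log(-np.pi / theta[1])

# Check that exp(theta^T T(x) - F(theta)) recovers the usual Gaussian density.
mu, sigma2, x = 1.0, 0.5, 0.3
theta = gaussian_natural_params(mu, sigma2)
p_expfam = np.exp(theta @ np.array([x, x ** 2]) - gaussian_log_normalizer(theta))
p_usual = np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
assert np.isclose(p_expfam, p_usual)
```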

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. A taxonomy of probability measures. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. Figure: A taxonomy of exponential families (univariate and multivariate; e.g. Bernoulli, binomial, multinomial, Poisson, exponential, Rayleigh, Gaussian, beta, gamma, Dirichlet, Weibull), as opposed to non-exponential families such as the uniform, Cauchy and Lévy skew α-stable distributions. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. We consider a statistical manifold S = {p_θ : θ ∈ Θ} equipped with g and the dual exponential and mixture connections ∇^(1) and ∇^(−1). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. We consider a statistical manifold S = {p_θ : θ ∈ Θ} equipped with g and the dual exponential and mixture connections ∇^(1) and ∇^(−1). (S, g, ∇^(1), ∇^(−1)) possesses two dual affine coordinate systems, the natural parameters θ and the expectation parameters η = ∇F(θ). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. We consider a statistical manifold S = {p_θ : θ ∈ Θ} equipped with g and the dual exponential and mixture connections ∇^(1) and ∇^(−1). (S, g, ∇^(1), ∇^(−1)) possesses two dual affine coordinate systems, the natural parameters θ and the expectation parameters η = ∇F(θ). Dually flat geometry, Hessian structure (g = ∇²F), generated by the potential F together with its conjugate potential F* defined by the Legendre-Fenchel transform: F*(η) = sup_{θ∈Θ} θᵀη − F(θ), which verifies ∇F* = (∇F)⁻¹ so that θ = ∇F*(η). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21
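A small worked example of this Legendre duality, added here for illustration (not on the slide): for the Bernoulli family the log-normalizer is F(θ) = log(1 + e^θ), and

```latex
\begin{align*}
  \eta &= \nabla F(\theta) = \frac{e^{\theta}}{1 + e^{\theta}}, \\
  F^{*}(\eta) &= \sup_{\theta \in \mathbb{R}} \bigl( \theta\eta - F(\theta) \bigr)
              = \eta \log \eta + (1 - \eta) \log (1 - \eta), \\
  \nabla F^{*}(\eta) &= \log \frac{\eta}{1 - \eta} = \theta ,
  \qquad \text{so that } \nabla F^{*} = (\nabla F)^{-1} .
\end{align*}
```

Here F* is the negative entropy of the Bernoulli distribution with mean η, consistent with the conjugate-potential interpretation.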

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. We consider a statistical manifold S = {p_θ : θ ∈ Θ} equipped with g and the dual exponential and mixture connections ∇^(1) and ∇^(−1). (S, g, ∇^(1), ∇^(−1)) possesses two dual affine coordinate systems, the natural parameters θ and the expectation parameters η = ∇F(θ). Dually flat geometry, Hessian structure (g = ∇²F), generated by the potential F together with its conjugate potential F* defined by the Legendre-Fenchel transform: F*(η) = sup_{θ∈Θ} θᵀη − F(θ), which verifies ∇F* = (∇F)⁻¹ so that θ = ∇F*(η). Generalizes the self-dual Euclidean geometry, with notably two canonically associated Bregman divergences B_F and B_F* instead of the self-dual Euclidean distance, but also dual geodesics, a generalized Pythagorean theorem and dual projections. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. Bregman divergence. B_F(θ, θ′) = F(θ) − F(θ′) − (θ − θ′)ᵀ ∇F(θ′). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. Bregman divergence. B_F(θ, θ′) = F(θ) − F(θ′) − (θ − θ′)ᵀ ∇F(θ′). Canonical divergences of dually flat spaces, bijection with exponential families [Amari & Nagaoka, 2000, Banerjee et al., 2005]: D_KL(p_θ ‖ p_θ′) = B_F(θ′, θ) = B_F*(η, η′). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. Bregman divergence. B_F(θ, θ′) = F(θ) − F(θ′) − (θ − θ′)ᵀ ∇F(θ′). Canonical divergences of dually flat spaces, bijection with exponential families [Amari & Nagaoka, 2000, Banerjee et al., 2005]: D_KL(p_θ ‖ p_θ′) = B_F(θ′, θ) = B_F*(η, η′). Neither symmetry nor the triangle inequality holds in general, but there is an information-theoretic interpretation. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21
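A minimal numerical sketch of this divergence (added here, not from the slides): a generic Bregman divergence from a generator F and its gradient, instantiated with the two standard generators that recover the squared Euclidean distance and the generalized Kullback-Leibler divergence.

```python
import numpy as np

def bregman_divergence(F, grad_F, theta, theta_prime):
    """B_F(theta, theta') = F(theta) - F(theta') - <theta - theta', grad F(theta')>."""
    theta = np.asarray(theta, dtype=float)
    theta_prime = np.asarray(theta_prime, dtype=float)
    return F(theta) - F(theta_prime) - (theta - theta_prime) @ grad_F(theta_prime)

# F(x) = 0.5 ||x||^2 yields half the squared Euclidean distance.
F_euc, grad_euc = lambda x: 0.5 * x @ x, lambda x: x
# F(x) = sum_i x_i log x_i yields the generalized Kullback-Leibler divergence.
F_kl, grad_kl = lambda x: np.sum(x * np.log(x)), lambda x: np.log(x) + 1.0

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.3, 0.3, 0.4])
print(bregman_divergence(F_euc, grad_euc, p, q))  # 0.5 * ||p - q||^2
print(bregman_divergence(F_kl, grad_kl, p, q))    # sum p log(p/q) since both sum to 1
```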

Background Exponential families How to use information geometry from a computational viewpoint? Exponential family. p_θ(x) = exp(θᵀ T(x) − F(θ) + C(x)) for all x ∈ X. Bregman divergence. B_F(θ, θ′) = F(θ) − F(θ′) − (θ − θ′)ᵀ ∇F(θ′). Canonical divergences of dually flat spaces, bijection with exponential families [Amari & Nagaoka, 2000, Banerjee et al., 2005]: D_KL(p_θ ‖ p_θ′) = B_F(θ′, θ) = B_F*(η, η′). Neither symmetry nor the triangle inequality holds in general, but there is an information-theoretic interpretation. Generic algorithms that handle many generalized distances [Banerjee et al., 2005, Cayton, 2008, Cayton, 2009, Nielsen & Nock, 2009, Nielsen et al., 2009, Garcia et al., 2009]: Centroid computation and hard clustering (k-means). Parameter estimation and soft clustering (expectation-maximization). Proximity queries in ball trees (nearest-neighbors and range search). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 8/21
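As an illustration of the first item (centroid computation and hard clustering), a sketch of Bregman k-means in the spirit of Banerjee et al. (2005); it relies on the fact that the right-sided Bregman centroid of a cluster is its arithmetic mean, whatever the generator F. This is my own minimal implementation, not the one used in the presented system.

```python
import numpy as np

def bregman_kmeans(X, k, F, grad_F, n_iter=50, seed=0):
    """Hard clustering of the rows of X under the Bregman divergence B_F(x, mu)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: B_F(x, mu) = F(x) - F(mu) - <x - mu, grad F(mu)>.
        div = np.array([[F(x) - F(m) - (x - m) @ grad_F(m) for m in centroids] for x in X])
        labels = div.argmin(axis=1)
        # Update step: the optimal representative is the arithmetic mean (Banerjee et al.).
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else centroids[j] for j in range(k)])
    return labels, centroids

# Toy usage with the squared Euclidean generator, which recovers ordinary k-means.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(1.0, 0.1, (20, 2))])
labels, centroids = bregman_kmeans(X, k=2, F=lambda x: 0.5 * x @ x, grad_F=lambda x: x)
```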

Outline Introduction General architecture Sound descriptors modeling Temporal modeling 1 Introduction 2 3 General architecture Sound descriptors modeling Temporal modeling 4 5 arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 9/21

General architecture Sound descriptors modeling Temporal modeling How to design an audio system based on information geometry? Audio stream decomposition (on-line) Scheme: 1 Represent the incoming audio stream with short-time sound descriptors d_j. 2 Model these descriptors as probability distributions p_θj from a given exponential family. 3 Use the framework of computational information geometry on these distributions. Auditory scene Short-time sound representation d_j Sound descriptors modeling p_θj Temporal modeling Figure: Schema of the general architecture of the system. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 10/21

General architecture Sound descriptors modeling Temporal modeling How to design an audio system based on information geometry? Audio stream decomposition (on-line) Scheme: 1 Represent the incoming audio stream with short-time sound descriptors d_j. 2 Model these descriptors as probability distributions p_θj from a given exponential family. 3 Use the framework of computational information geometry on these distributions. In particular, it makes it possible to define the notion of similarity in an information setting through divergences. Auditory scene Short-time sound representation d_j Sound descriptors modeling p_θj Temporal modeling Figure: Schema of the general architecture of the system. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 10/21

General architecture Sound descriptors modeling Temporal modeling How to design an audio system based on information geometry? Audio stream decomposition (on-line) Scheme: 1 Represent the incoming audio stream with short-time sound descriptors d_j. 2 Model these descriptors as probability distributions p_θj from a given exponential family. 3 Use the framework of computational information geometry on these distributions. In particular, it makes it possible to define the notion of similarity in an information setting through divergences. Important need for temporal modeling. Auditory scene Short-time sound representation d_j Sound descriptors modeling p_θj Temporal modeling Figure: Schema of the general architecture of the system. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 10/21

General architecture Sound descriptors modeling Temporal modeling How to design an audio system based on information geometry? Audio stream decomposition (on-line) Scheme: 1 Represent the incoming audio stream with short-time sound descriptors d_j. 2 Model these descriptors as probability distributions p_θj from a given exponential family. 3 Use the framework of computational information geometry on these distributions. In particular, it makes it possible to define the notion of similarity in an information setting through divergences. Important need for temporal modeling. Potential applications [Cont et al., 2011]: Audio content analysis. Segmentation of audio streams. Automatic structure discovery of audio signals. Sound processing and synthesis. Auditory scene Short-time sound representation d_j Sound descriptors modeling p_θj Temporal modeling Figure: Schema of the general architecture of the system. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 10/21

How to model sounds? Introduction General architecture Sound descriptors modeling Temporal modeling Computation of a sound descriptor d_j: Fourier or constant-Q transforms for information on the spectral content. Mel-frequency cepstral coefficients for information on the timbre. Many other possibilities. Figure: Sound descriptors modeling. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 11/21

How to model sounds? Introduction General architecture Sound descriptors modeling Temporal modeling Computation of a sound descriptor d_j: Fourier or constant-Q transforms for information on the spectral content. Mel-frequency cepstral coefficients for information on the timbre. Many other possibilities. Modeling with a probability distribution p_θj from an exponential family: Categorical distributions. Many other possibilities. Figure: Sound descriptors modeling. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 11/21
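One concrete instance of this modeling step, as a sketch under my own assumptions (the actual descriptors and frame sizes of the system may differ): normalize a short-time magnitude spectrum into a categorical distribution over frequency bins, and compare frames with the Kullback-Leibler divergence.

```python
import numpy as np

def frame_to_categorical(frame, n_fft=1024, eps=1e-12):
    """Model a windowed short-time frame as a categorical distribution over frequency bins."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n=n_fft))
    spectrum = spectrum + eps            # avoid zero bins before normalizing
    return spectrum / spectrum.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence between two categorical distributions."""
    return float(np.sum(p * np.log(p / q)))

# Toy usage: two quasi-stationary frames of a 440 Hz sine, then a noise frame.
rng = np.random.default_rng(0)
t = np.arange(1024) / 44100.0
p1 = frame_to_categorical(np.sin(2 * np.pi * 440 * t))
p2 = frame_to_categorical(np.sin(2 * np.pi * 440 * t) + 0.01 * rng.standard_normal(1024))
p3 = frame_to_categorical(rng.standard_normal(1024))
print(kl_divergence(p1, p2), kl_divergence(p1, p3))  # small vs. large divergence
```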

How to take time into account? General architecture Sound descriptors modeling Temporal modeling Model formation: from signal to symbol. Assumption of quasi-stationary audio chunks. Change detection adapted from CuSum [Basseville & Nikiforov, 1993]. Figure: Model formation at time t. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 12/21
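To fix ideas, here is a textbook one-sided CUSUM sketch on a scalar score sequence; this is not the exact adaptation from Basseville & Nikiforov used in the presented system, and the scores are assumed to behave like log-likelihood ratios, negative on average before a change and positive after it.

```python
import numpy as np

def cusum_alarm(scores, threshold):
    """Return the first alarm time of a one-sided CUSUM, or None if no change is detected."""
    g = 0.0
    for t, s in enumerate(scores):
        g = max(0.0, g + s)       # accumulate evidence, reset when it drops below zero
        if g > threshold:
            return t
    return None

# Toy usage: the score distribution shifts from mean -0.5 to mean +0.5 at t = 50.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(-0.5, 1.0, 50), rng.normal(+0.5, 1.0, 50)])
print(cusum_alarm(scores, threshold=5.0))
```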

How to take time into account? General architecture Sound descriptors modeling Temporal modeling Model formation: from signal to symbol. Assumption of quasi-stationary audio chunks. Change detection adapted from CuSum [Basseville & Nikiforov, 1993]. Figure: Model formation at time t. Factor oracle: from symbol to syntax (and from genetics to music!). Forward transitions: original sequence factors. Backward links: suffix relations, common context. Figure: Factor oracle of the word abbbaab. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 12/21
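The factor oracle admits a simple on-line construction; below is a sketch of the classical incremental algorithm of Allauzen, Crochemore and Raffinot, applied to the word of the figure (forward transitions encode factors, suffix links encode repeated-suffix relations).

```python
def factor_oracle(word):
    """Build the factor oracle of `word`: forward transitions and suffix (backward) links."""
    transitions = [{} for _ in range(len(word) + 1)]   # transitions[i][c] -> target state
    suffix = [-1] * (len(word) + 1)                    # suffix link of each state
    for i, c in enumerate(word, start=1):
        transitions[i - 1][c] = i                      # internal forward transition
        k = suffix[i - 1]
        while k > -1 and c not in transitions[k]:
            transitions[k][c] = i                      # external forward transition
            k = suffix[k]
        suffix[i] = 0 if k == -1 else transitions[k][c]
    return transitions, suffix

# The word used on the slide:
transitions, suffix = factor_oracle("abbbaab")
print(suffix)   # suffix links of states 0..7: [-1, 0, 0, 2, 3, 1, 1, 2]
```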

Outline Introduction Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation 1 Introduction 2 3 4 Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation 5 arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 13/21

Audio segmentation Introduction Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation Figure: Segmentation of the 1st Piano Sonata, 1st Movement, 1st Theme, Beethoven. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 14/21

Music similarity analysis Introduction Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation Figure: Similarity analysis of the 1st Piano Sonata, 3rd Movement, Beethoven. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 15/21

Musical structure discovery Introduction Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation Figure: Structure discovery of the 1st Piano Sonata, 3rd Movement, Beethoven. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 16/21

Query by similarity Introduction Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation Figure: Query by similarity of the 1st Theme over the entire 1st Piano Sonata, 1st Movement, Beethoven. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 17/21

Audio recombination by concatenative synthesis Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation Figure: Audio recombination of African drums by concatenative synthesis of congas. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 18/21

Computer-assisted improvisation Audio segmentation Music similarity analysis Musical structure discovery Query by similarity Audio recombination by concatenative synthesis Computer-assisted improvisation Figure: Computer-assisted improvisation, Fabrizio Cassol and Philippe Leclerc. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 19/21

Outline Introduction 1 Introduction 2 3 4 5 arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 20/21

What we (don't) have Introduction Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

What we (don't) have Introduction Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. Many possibilities. Combinations of descriptors. Complex representations. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

What we (don't) have Introduction Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. Exponential families and Bregman divergences. Mixture models of a given exponential family. Other geometries, divergences, metrics. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

What we (don't) have Introduction Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. On-line segmentation and factor oracle. On-line clustering and equivalence between symbols. Overlap between symbols and other temporal models. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

What we (don't) have Introduction Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. Assumption of quasi-stationarity. Non-stationarity modeling. Time series. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

What we (don't) have Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. Resources on IG: http://imtr.ircam.fr/imtr/music_information_geometry National research group: IRCAM, Ecole Polytechnique, Thales, etc. Brillouin seminar: http://www.informationgeometry.org/seminar/seminarbrillouin.html IGAIA 2012. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

What we (don't) have Summary and perspectives. Representations. Descriptors modeling. Temporal modeling. Temporality of events. Resources on IG: http://imtr.ircam.fr/imtr/music_information_geometry National research group: IRCAM, Ecole Polytechnique, Thales, etc. Brillouin seminar: http://www.informationgeometry.org/seminar/seminarbrillouin.html IGAIA 2012. Thank you very much for your attention! Questions? This work was supported by a doctoral fellowship from the UPMC (EDITE) and by a grant from the JST-CNRS ICT (Improving the VR Experience). arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 21/21

Bibliography I Amari, S.-i. & Nagaoka, H. (2000). Methods of information geometry, volume 191 of Translations of Mathematical Monographs. American Mathematical Society. Banerjee, A., Merugu, S., Dhillon, I. S., & Ghosh, J. (2005). Clustering with Bregman divergences. Journal of Machine Learning Research, 6, 1705–1749. Basseville, M. & Nikiforov, V. (1993). Detection of abrupt changes: Theory and application. Englewood Cliffs, NJ, USA: Prentice-Hall, Inc. Cayton, L. (2008). Fast nearest neighbor retrieval for Bregman divergences. In Proceedings of the 25th International Conference on Machine Learning, volume 307, Helsinki, Finland. Cayton, L. (2009). Efficient Bregman range search. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, & A. Culotta (Eds.), Advances in Neural Information Processing Systems, volume 22 (pp. 243–251). Curran Associates, Inc. Chentsov, N. N. (1982). Statistical decision rules and optimal inference, volume 53 of Translations of Mathematical Monographs. American Mathematical Society. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 22/21

Bibliography II Cont, A., Dubnov, S., & Assayag, G. (2011). On the information geometry of audio streams with applications to similarity computing. IEEE Transactions on Audio, Speech and Language Processing, 19. To appear. Garcia, V., Nielsen, F., & Nock, R. (2009). Levels of details for Gaussian mixture models. In Proceedings of the 9th Asian Conference on Computer Vision, ACCV 2009 (pp. 514–525). Xi'an, China. Nielsen, F. & Nock, R. (2009). Sided and symmetrized Bregman centroids. IEEE Transactions on Information Theory, 55(6), 2882–2904. Nielsen, F., Piro, P., & Barlaud, M. (2009). Tailored Bregman ball trees for effective nearest neighbors. In Proceedings of the 25th European Workshop on Computational Geometry (EuroCG) (pp. 29–32). Brussels, Belgium. Rao, C. R. (1945). Information and accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society, 37, 81–91. arnaud.dessein@ircam.fr April 7th 2011 Journée Interdisciplinaire Mathématiques Musique 23/21