TIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION


13th International Conference on Digital Audio Effects
Romain Hennequin, Roland Badeau and Bertrand David
Telecom ParisTech
September 8, 2010

Introduction
- Decomposition of musical spectrograms (on a basis of notes)
- Decomposition based on Non-negative Matrix Factorization (NMF)
- Time-dependent parametric templates are introduced into decomposition methods: parametric harmonic atoms make it possible to model slight pitch variations
- Potential applications: multipitch estimation/transcription, source separation


Contents, part 1: Principle, Issues, Proposed solution

Principle of NMF
Low-rank approximation:
$V \approx \hat{V} = WH$, that is, $\hat{V}_{ft} = \sum_{r=1}^{R} W_{fr} H_{rt}$
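The low-rank approximation above can be illustrated with a minimal NumPy sketch. The slides do not say which algorithm or cost is used for plain NMF, so the Lee-Seung multiplicative updates for the Euclidean cost are an assumption here. The function name `nmf` and the toy data are hypothetical.

```python
import numpy as np

def nmf(V, R, n_iter=200, eps=1e-12, seed=0):
    """Approximate a non-negative V (F x T) by W @ H, with W: F x R and H: R x T."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, R)) + eps
    H = rng.random((R, T)) + eps
    for _ in range(n_iter):
        # Lee-Seung multiplicative updates (Euclidean cost, assumed here);
        # they keep W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy example: a non-negative matrix of exact rank 2
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(V, R=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each column of `W` plays the role of one fixed spectral atom and each row of `H` its activation over time, which is why a single atom cannot follow a pitch that moves from frame to frame.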

Issues with NMF
Pitch variations: a low-rank approximation cannot model variations over time, such as slight pitch variations (vibrato, ...).

Issues with NMF
[Figure: original spectrogram and NMF spectrogram with R = 1; axes: time (frames), frequency (kHz).]
Note with vibrato: decomposition with a single atom.

Issues with NMF
[Figure: original spectrogram and NMF spectrogram with R = 3; axes: time (frames), frequency (kHz).]
Note with vibrato: decomposition with 3 atoms.

Proposed solution
What does an atom look like in a musical spectrogram?
- In a musical spectrogram, most of the (non-percussive) elements are instrument notes, which are generally harmonic tones.
- The parameters of interest are generally the fundamental frequency of these tones and the shape of the amplitudes of the harmonics.
Proposed method: a parametric model of the spectrogram with harmonic atoms.

Contents, part 2: Parametric spectrogram, Parametric atoms, Algorithm

Parametric spectrogram
Time-varying atoms in NMF:
$\hat{V}_{ft} = \sum_{r=1}^{R} W_{fr} H_{rt} \quad\longrightarrow\quad \hat{V}_{ft} = \sum_{r=1}^{R} W_{fr}^{\theta_{rt}} H_{rt}$
$\theta_{rt}$ is a time-varying parameter associated with each atom. In this paper, $\theta_{rt}$ is the fundamental frequency $f_0^{rt}$ of each atom.

Parametric atoms
Parametric harmonic atom construction:
$W_{fr}^{f_0^{rt}} = \sum_{k=1}^{n_h(f_0^{rt})} a_k \, g(f - k f_0^{rt})$
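As a rough illustration of this construction, the sketch below builds one harmonic atom on a frequency grid. The slide does not define the peak shape $g$, so a Gaussian of standard deviation `width` (in Hz) is assumed. The function name `harmonic_atom`, the grid and the $1/k$ amplitudes are hypothetical choices for the example.

```python
import numpy as np

def harmonic_atom(freqs, f0, amps, width=20.0):
    """W_f = sum_k a_k g(f - k f0), with an assumed Gaussian g of std `width` Hz."""
    atom = np.zeros_like(freqs, dtype=float)
    for k, a_k in enumerate(amps, start=1):
        # n_h depends on f0: stop once harmonics leave the frequency range
        if k * f0 > freqs[-1]:
            break
        atom += a_k * np.exp(-0.5 * ((freqs - k * f0) / width) ** 2)
    return atom

freqs = np.linspace(0.0, 5000.0, 513)   # frequency axis in Hz (hypothetical grid)
amps = 1.0 / np.arange(1, 31)           # hypothetical decaying amplitudes a_k
atom = harmonic_atom(freqs, f0=440.0, amps=amps)
peak_freq = freqs[np.argmax(atom)]      # frequency bin with the largest value
```

Changing `f0` slides all the peaks together, which is what lets a single atom follow a vibrato once `f0` is allowed to vary per frame.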

Parametric spectrogram
Hypotheses of the model:
- The harmonic part of notes is assumed to be stationary within an analysis frame.
- Interference between harmonics is assumed to be negligible.
- Classical NMF hypothesis of positive summation of parts.

Algorithm
Learnt parameters:
$\hat{V}_{ft} = \sum_{r=1}^{R} \underbrace{\sum_{k=1}^{n_h} a_k \, g(f - k f_0^{rt})}_{W_{fr}^{f_0^{rt}}} h_{rt}$
A divergence between $V$ and $\hat{V}$ is to be minimized w.r.t.:
- $f_0^{rt}$: the fundamental frequency of each atom at each frame
- $a_k$: the amplitudes of the harmonics (atoms share the same set of amplitudes)
- $h_{rt}$: the activation of each atom at each frame
Cost function: $C(f_0^{rt}, a_k, h_{rt}) = D(V_{ft} \,|\, \hat{V}_{ft})$
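The model and cost can be sketched as follows. The slides only say that "a divergence" is minimized, so the generalized Kullback-Leibler divergence is used here as one assumed choice. The Gaussian peak shape $g$ is also an assumption, and the number of harmonics is fixed to `len(amps)` for simplicity. Function names and test values are hypothetical.

```python
import numpy as np

def parametric_spectrogram(freqs, f0, amps, h, width=20.0):
    """V_hat[f, t] = sum_r h[r, t] * sum_k a_k g(f - k f0[r, t]), Gaussian g assumed."""
    k = np.arange(1, len(amps) + 1)
    centers = k[:, None, None] * f0[None, :, :]        # centers[k, r, t] = k * f0[r, t]
    # peaks[f, k, r, t] = g(f - k f0[r, t])
    peaks = np.exp(-0.5 * ((freqs[:, None, None, None] - centers[None]) / width) ** 2)
    atoms = np.einsum("k,fkrt->frt", amps, peaks)      # time-varying atoms W[f, r, t]
    return np.einsum("frt,rt->ft", atoms, h)

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence, summed over all time-frequency bins."""
    return np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)

freqs = np.linspace(0.0, 5000.0, 257)
amps = 1.0 / np.arange(1, 9)                           # shared amplitudes a_k
f0_true = np.array([[220.0, 222.0, 220.0, 218.0],      # atom 1: slight pitch variation
                    [330.0, 330.0, 330.0, 330.0]])     # atom 2: constant pitch
h = np.ones_like(f0_true)
V = parametric_spectrogram(freqs, f0_true, amps, h)
cost_true = kl_divergence(V, V)
cost_detuned = kl_divergence(V, parametric_spectrogram(freqs, f0_true + 5.0, amps, h))
```

The cost vanishes at the true parameters and grows when every fundamental frequency is detuned by 5 Hz, which is the quantity the estimation of $f_0^{rt}$ tries to drive down.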

Algorithm
Minimization:
- Global optimization w.r.t. $f_0^{rt}$ is impossible (numerous local minima in $C$).
- One atom is introduced for each MIDI note. Optimization thus becomes local (fine estimate of $f_0^{rt}$).
- Minimization is achieved with multiplicative update rules.
Remark: the proposed method is no longer a rank-reduction method, but it still reduces the data dimension.
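The initialization with one atom per MIDI note can be sketched as below, together with a multiplicative update of the activations only. The slides do not give the update rules, so this uses the standard Kullback-Leibler multiplicative update with the atoms held fixed. It is not the authors' rule for $f_0^{rt}$ or $a_k$, and the Gaussian peak shape and all names are assumptions.

```python
import numpy as np

def midi_to_hz(m):
    """Equal-tempered frequency of MIDI note m (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((np.asarray(m, dtype=float) - 69.0) / 12.0)

def harmonic_atoms(freqs, f0, amps, width=10.0):
    """atoms[f, r, t] = sum_k a_k g(f - k f0[r, t]), with an assumed Gaussian g."""
    k = np.arange(1, len(amps) + 1)
    centers = k[:, None, None] * f0[None, :, :]
    peaks = np.exp(-0.5 * ((freqs[:, None, None, None] - centers[None]) / width) ** 2)
    return np.einsum("k,fkrt->frt", amps, peaks)

def update_activations(V, atoms, h, eps=1e-12):
    """One KL multiplicative update of h[r, t], atoms[f, r, t] held fixed."""
    V_hat = np.einsum("frt,rt->ft", atoms, h) + eps
    num = np.einsum("frt,ft->rt", atoms, V / V_hat)
    den = atoms.sum(axis=0) + eps
    return h * num / den

freqs = np.linspace(0.0, 5000.0, 1025)
notes = np.arange(60, 72)                               # one atom per MIDI note (C4 to B4)
T = 5
f0 = np.repeat(midi_to_hz(notes)[:, None], T, axis=1)   # initial f0, constant over frames
amps = 1.0 / np.arange(1, 11)
atoms = harmonic_atoms(freqs, f0, amps)

h_true = np.zeros((len(notes), T))
h_true[4] = 1.0                                         # only E4 (MIDI 64) sounds
V = np.einsum("frt,rt->ft", atoms, h_true)

h = np.ones_like(h_true)
for _ in range(200):
    h = update_activations(V, atoms, h)
detected = notes[np.argmax(h.sum(axis=1))]              # note with the largest total activation
```

Starting each $f_0^{rt}$ at the nominal frequency of its MIDI note is what makes the later refinement local: each atom only needs to move by a fraction of a semitone.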

Contents, part 3: Decomposition, Improvement, Estimated frequency, Real signals

Decomposition of a synthetic spectrogram
[Figure: original power spectrogram; axes: time (frames), frequency (kHz).]
Spectrogram of the first bars of J. S. Bach's first prelude played by a synthesizer.

Obtained decomposition
[Figure: activation map; axes: frames, semitones.]
Activations for each MIDI note.

Obtained decomposition
- Notes appear at the right place with decreasing amplitudes.
- Numerous atoms are activated at onset time.
- Notes are activated at the octave, twelfth and double octave of the right note (notes with many common partials).

Improvement
Onset: a few standard NMF atoms can be used to model onsets:
$\hat{V}_{ft} = \sum_{r=1}^{R} W_{fr}^{\theta_{rt}} H_{rt} + \sum_{k=1}^{K} A_{fk} B_{kt}$
Octaves, twelfths, ...: add constraints to the cost function:
- Sparsity constraints on activations
- Decorrelation constraints (between activations of octaves, ...)
- Smoothness constraints on amplitudes
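A minimal sketch of the extended model and of a penalized cost is given below. The slides name the constraints without giving their form, so the sparsity term is written here as an L1 penalty on the activations with a hypothetical weight `lam`, and the divergence is again assumed to be the generalized Kullback-Leibler divergence. The decorrelation and smoothness terms are not sketched. All names and the random test data are hypothetical.

```python
import numpy as np

def extended_model(atoms, H, A, B):
    """V_hat[f, t] = sum_r atoms[f, r, t] H[r, t] + sum_k A[f, k] B[k, t]."""
    harmonic_part = np.einsum("frt,rt->ft", atoms, H)   # time-varying harmonic atoms
    onset_part = A @ B                                  # K standard NMF atoms for onsets
    return harmonic_part + onset_part

def penalized_cost(V, V_hat, H, lam=0.1, eps=1e-12):
    """Generalized KL divergence plus an assumed L1 sparsity penalty on H."""
    kl = np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)
    return kl + lam * np.sum(H)

rng = np.random.default_rng(0)
F, R, T, K = 64, 3, 10, 2
atoms = rng.random((F, R, T))
H = rng.random((R, T))
A = rng.random((F, K))
B = rng.random((K, T))
V = rng.random((F, T))

V_hat = extended_model(atoms, H, A, B)
cost_plain = penalized_cost(V, V_hat, H, lam=0.0)
cost_sparse = penalized_cost(V, V_hat, H, lam=0.1)
```

With a penalty of this kind, spurious activations at octaves and twelfths cost something even when they help the fit slightly, which pushes the decomposition toward fewer active notes per frame.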

Obtained decomposition
[Figure: activation map; axes: frames, semitones.]
Activations for each MIDI note.

Time/frequency representation
[Figure: activation map; axes: frames, semitones.]
Activations centered on estimated frequency for each MIDI note: vibrato appears.
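Placing activations on a semitone axis at the estimated frequency amounts to converting each $f_0^{rt}$ to a fractional MIDI pitch. The sketch below uses the standard equal-temperament conversion (A4 = MIDI 69 = 440 Hz). The `f0_track` is a hypothetical synthetic vibrato, not data from the slides.

```python
import numpy as np

def hz_to_midi(f0):
    """Fractional MIDI pitch of a frequency in Hz (A4 = MIDI 69 = 440 Hz)."""
    return 69.0 + 12.0 * np.log2(np.asarray(f0, dtype=float) / 440.0)

frames = np.arange(160)
# Hypothetical estimated f0 track: A4 with a vibrato of +/- 0.3 semitone, period 20 frames
f0_track = 440.0 * 2.0 ** (0.3 * np.sin(2.0 * np.pi * frames / 20.0) / 12.0)

pitch = hz_to_midi(f0_track)                  # position on the semitone axis, per frame
deviation = pitch - np.round(pitch.mean())    # offset from the nominal MIDI note
```

Plotting each activation at `pitch` rather than at the integer note number is what makes the vibrato visible as an oscillation around the note.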

Issues with real signals
[Figure: activation map; axes: frames, semitones.]
Activations for each MIDI note (piano sound).

Issues with real signals
- The model of the amplitudes of harmonics is quite rough.
- Issues with onsets and octaves are more pronounced.
- Noisy components (breath, ...).
- Some instruments are not perfectly harmonic (piano, ...).

Summary
- New way of decomposing musical spectrograms with slight pitch variations in their constituent elements.
- Parametric, thus flexible, model.
Perspectives
Improve the decomposition to make it better suited to real data:
- Better modeling of harmonic amplitudes
- Supervised learning of amplitudes
- Better onset and noise modeling

Any questions?