Temporal models with low-rank spectrograms
Temporal models with low-rank spectrograms. Cédric Févotte, Institut de Recherche en Informatique de Toulouse (IRIT). IEEE MLSP, Aalborg, Sep. 2018.
Outline
- NMF for spectral unmixing: generalities; Itakura-Saito NMF
- Low-rank time-frequency synthesis: analysis vs synthesis; generative model; maximum joint likelihood estimation; variants
- NMF with transform learning: principle; optimisation
Collaborators
- Low-rank time-frequency synthesis: Matthieu Kowalski (Paris-Saclay University)
- NMF with transform learning: Dylan Fagot, Herwig Wendt (CNRS, IRIT, Toulouse)
NMF for audio spectral unmixing (Smaragdis and Brown, 2003). A power spectrogram S (frequency × time) is computed from the original temporal signal x(t): y_fn is the short-time Fourier transform (STFT) of x(t), and s_fn = |y_fn|² is the power spectrogram. NMF extracts recurring spectral patterns from the data by solving min_{W,H} D(S | WH) = Σ_fn d(s_fn | [WH]_fn). Successful applications in audio source separation and music transcription.
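As an illustration, the pipeline above can be sketched in a few lines of Python. The toy signal, the rank K and the squared-Euclidean measure of fit are arbitrary choices for the sketch, and scipy's `stft` stands in for the short-time Fourier transform:

```python
import numpy as np
from scipy.signal import stft

# Toy signal: two successive sinusoids (arbitrary choice for the sketch)
fs = 8000
t = np.arange(fs) / fs
x = np.concatenate([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 660 * t)])

# Power spectrogram s_fn = |y_fn|^2 from the STFT
_, _, Y = stft(x, fs=fs, nperseg=512)
S = np.abs(Y) ** 2  # shape (F, N), nonnegative

# Generic NMF fit D(S | WH) = sum_fn d(s_fn | [WH]_fn); the squared-Euclidean
# d is used here purely for illustration
rng = np.random.default_rng(0)
K = 4
W = rng.random((S.shape[0], K))
H = rng.random((K, S.shape[1]))
fit = np.sum((S - W @ H) ** 2)
```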
NMF for audio spectral unmixing (Smaragdis and Brown, 2003). All factors are nonnegative, so the resulting reconstruction is additive. Figure: input music passage decomposed as X ≈ WH into non-negative components; reproduced from (Smaragdis, 2013).
NMF for audio spectral unmixing (Smaragdis and Brown, 2003). Open questions: What is the right time-frequency transform S? What is the right measure of fit D(S | WH)? NMF approximates the spectrogram by a sum of rank-one spectrograms. How to reconstruct temporal components? What about phase?
Itakura-Saito NMF (Févotte, Bertin, and Durrieu, 2009)
Gaussian low-rank variance model of the complex-valued STFT: y_fn ∼ N_c(0, [WH]_fn)
Log-likelihood equivalent to the Itakura-Saito (IS) divergence: −log p(Y | WH) = D_IS(|Y|² | WH) + cst
Zero-mean assumption: E[x(t)] = 0 implies E[y_fn] = 0 by linearity.
Underlies a Gaussian composite model (GCM): y_fn = Σ_k z_kfn, with z_kfn ∼ N_c(0, w_fk h_kn)
Latent STFT components can be estimated a posteriori by Wiener filtering: ẑ_kfn = E[z_kfn | Y, W, H] = (w_fk h_kn / [WH]_fn) y_fn
Inverse-STFT of {ẑ_kfn}_fn produces temporal components ĉ_k(t) such that x(t) = Σ_k ĉ_k(t).
Related work: (Benaroya et al., 2003; Abdallah and Plumbley, 2004; Parry and Essa, 2007)
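A minimal numerical check of the Wiener-filter decomposition, on synthetic factors W, H and an STFT drawn from the model itself (all sizes arbitrary): the component estimates are conservative, i.e. they sum back exactly to the observed STFT.

```python
import numpy as np

rng = np.random.default_rng(1)
F, N, K = 129, 60, 3

# Nonnegative factors, assumed already estimated by IS-NMF
W = rng.random((F, K)) + 1e-3
H = rng.random((K, N)) + 1e-3
V = W @ H  # low-rank variance [WH]_fn

# Complex STFT drawn from the generative model y_fn ~ N_c(0, [WH]_fn)
Y = np.sqrt(V / 2) * (rng.standard_normal((F, N)) + 1j * rng.standard_normal((F, N)))

# Wiener-filter posterior means of the K latent components
Z = np.stack([(np.outer(W[:, k], H[k]) / V) * Y for k in range(K)])

# The decomposition is conservative: components sum back to the observed STFT
assert np.allclose(Z.sum(axis=0), Y)
```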
The Itakura-Saito divergence (Itakura and Saito, 1968; Gray et al., 1980)
Figure: divergence d(x | y) as a function of y for x = 1 (Itakura-Saito, generalised KL, squared Euclidean).
IS divergence: d_IS(x | y) = x/y − log(x/y) − 1. Nonconvex in y.
Scale invariance: d_IS(λx | λy) = d_IS(x | y). Very relevant for spectral data with high dynamic range. In comparison, d_Euc(λx | λy) = λ² d_Euc(x | y) and d_KL(λx | λy) = λ d_KL(x | y).
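The scale properties quoted above are easy to verify numerically; the sketch below implements the three divergences pointwise (x, y and λ are arbitrary):

```python
import numpy as np

def d_is(x, y):
    """Itakura-Saito: d_IS(x|y) = x/y - log(x/y) - 1."""
    return x / y - np.log(x / y) - 1

def d_kl(x, y):
    """Generalised Kullback-Leibler: d_KL(x|y) = x log(x/y) - x + y."""
    return x * np.log(x / y) - x + y

def d_euc(x, y):
    """Squared Euclidean: d_Euc(x|y) = (x - y)^2."""
    return (x - y) ** 2

x, y, lam = 3.0, 1.5, 100.0
assert np.isclose(d_is(lam * x, lam * y), d_is(x, y))               # scale-invariant
assert np.isclose(d_kl(lam * x, lam * y), lam * d_kl(x, y))         # scales with lambda
assert np.isclose(d_euc(lam * x, lam * y), lam ** 2 * d_euc(x, y))  # scales with lambda^2
```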
Optimisation for IS-NMF (Févotte and Idier, 2011)
Objective: min_{W,H} D_IS(S | WH) = Σ_fn [ s_fn/[WH]_fn − log(s_fn/[WH]_fn) − 1 ] = Σ_fn [ s_fn/[WH]_fn + log [WH]_fn ] + cst
State of the art: block-coordinate descent on (W, H) with majorisation-minimisation (MM); updates of W and H are equivalent by transposition of S; MM leads to multiplicative updates with linear complexity per iteration:
h_kn ← h_kn (Σ_f w_fk s_fn [WH]_fn⁻²) / (Σ_f w_fk [WH]_fn⁻¹)
Nonconvex problem (because of bilinearity and the divergence), so initialisation matters.
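A compact sketch of IS-NMF with MM updates, on random data. Note: to guarantee a monotone decrease of the objective, the MM-derived exponent γ(β) = 1/(2 − β) = 1/2 for β = 0 is applied to the multiplicative ratio (the exponent-free update shown above is the common heuristic form):

```python
import numpy as np

def d_is_total(S, V):
    """Total Itakura-Saito divergence D_IS(S | V)."""
    R = S / V
    return np.sum(R - np.log(R) - 1)

def is_nmf(S, K, n_iter=100, seed=0):
    """IS-NMF by majorisation-minimisation; the MM exponent 1/2 (beta = 0)
    guarantees a non-increasing objective (Fevotte & Idier, 2011)."""
    rng = np.random.default_rng(seed)
    F, N = S.shape
    W = rng.random((F, K)) + 1e-3
    H = rng.random((K, N)) + 1e-3
    obj = []
    for _ in range(n_iter):
        V = W @ H
        H *= ((W.T @ (S * V**-2)) / (W.T @ V**-1)) ** 0.5
        V = W @ H
        W *= (((S * V**-2) @ H.T) / (V**-1 @ H.T)) ** 0.5
        obj.append(d_is_total(S, W @ H))
    return W, H, np.array(obj)

rng = np.random.default_rng(42)
S = rng.random((20, 30)) + 0.1
W, H, obj = is_nmf(S, K=4, n_iter=50)
assert np.all(np.diff(obj) <= 1e-8)  # monotone decrease of the objective
```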
Piano toy example (MIDI numbers: 61, 65, 68, 72). Figure: three representations of the data. Demo available online.
Piano toy example: IS-NMF on the power spectrogram with K = 8. Figure: dictionary W, coefficients H, and reconstructed components for each k. Pitch estimates shown against the true values (61, 65, 68, 72).
Piano toy example: KL-NMF on the magnitude spectrogram with K = 8. Figure: dictionary W, coefficients H, and reconstructed components for each k. Pitch estimates shown against the true values (61, 65, 68, 72).
Follow-up on IS-NMF
- penalised versions promoting sparsity or dynamics (Lefèvre, Bach, and Févotte, 2011a; Févotte, 2011; Févotte, Le Roux, and Hershey, 2013)
- model order selection (Tan and Févotte, 2013)
- online/incremental variants (Dessein et al., 2010; Lefèvre et al., 2011b)
- Bayesian approaches (Hoffman et al., 2010; Dikmen and Févotte, 2011; Turner and Sahani, 2014)
- full-covariance models (Liutkus et al., 2011; Yoshii et al., 2013)
- improved phase models (Badeau, 2011; Magron, Badeau, and David, 2017)
- multichannel variants (Ozerov and Févotte, 2010; Sawada et al., 2013; Kounades-Bastian et al., 2016; Leglaive et al., 2016)
Figure: multichannel NMF. Sources S = WH pass through mixing system A; noise is added to form mixture X. Problem: estimate W, H and A from X.
Outline
- NMF for spectral unmixing: generalities; Itakura-Saito NMF
- Low-rank time-frequency synthesis: analysis vs synthesis; generative model; maximum joint likelihood estimation; variants
- NMF with transform learning: principle; optimisation
Analysis vs synthesis. IS-NMF is a generative model of the STFT but not of the raw signal itself. Low-rank time-frequency synthesis (LRTFS) fills this gap. The STFT is an analysis transform: y_fn = Σ_t x(t) φ_fn(t). LRTFS is a synthesis model: x(t) = Σ_fn α_fn φ_fn(t) + e(t). Figure: two Gabor atoms φ_fn(t).
Low-rank time-frequency synthesis (LRTFS) (Févotte and Kowalski, 2014)
Gaussian low-rank variance model of the synthesis coefficients: x(t) = Σ_fn α_fn φ_fn(t) + e(t), with α_fn ∼ N_c(0, [WH]_fn) and e(t) ∼ N_c(0, λ).
LRTFS is a generative model of the raw signal x(t).
As in IS-NMF, the synthesis coefficients have a latent composite structure: α_fn = Σ_k z_kfn, z_kfn ∼ N_c(0, w_fk h_kn).
Given {ẑ_kfn}_fn, temporal components can be reconstructed as ĉ_k(t) = Σ_fn ẑ_kfn φ_fn(t).
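To make the generative model concrete, the sketch below samples a signal from LRTFS, using scipy's inverse STFT as a stand-in for the synthesis operator Φ; all sizes, the rank K and the noise level λ are arbitrary choices:

```python
import numpy as np
from scipy.signal import istft

rng = np.random.default_rng(0)
F, N, K = 129, 40, 2  # F = nperseg // 2 + 1 for a one-sided STFT grid
nperseg = 256

# Low-rank variance [WH]_fn of the synthesis coefficients
W = rng.random((F, K))
H = rng.random((K, N))
V = W @ H

# Draw alpha_fn ~ N_c(0, [WH]_fn)
alpha = np.sqrt(V / 2) * (rng.standard_normal((F, N)) + 1j * rng.standard_normal((F, N)))

# Synthesis x(t) = sum_fn alpha_fn phi_fn(t) + e(t), with the inverse STFT
# playing the role of the synthesis operator Phi
_, x = istft(alpha, nperseg=nperseg)
lam = 1e-3
x = x + np.sqrt(lam) * rng.standard_normal(x.shape)
```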
Relation to sparse Bayesian learning (SBL)
Generative signal model in vector/matrix form: x = Φα + e, where x and e are vectors of signal and residual time samples (size T), α is the vector of synthesis coefficients α_fn (size FN), and Φ is the time-frequency dictionary (size T × FN).
Synthesis coefficients model in vector/matrix form: p(α | v) = N_c(α | 0, diag(v)), with v the vector of variance coefficients v_fn = [WH]_fn (size FN).
Similar to sparse Bayesian learning (Tipping, 2001; Wipf and Rao, 2004), except that the variance parameters are tied together by the low-rank structure WH.
Maximum joint likelihood estimation in LRTFS
Optimise C(α, W, H) := −log p(x, α | W, H, λ) = (1/λ) ‖x − Φα‖² + Σ_fn [ |α_fn|²/[WH]_fn + log [WH]_fn ] + cst
Block coordinate descent on (α, W, H):
- Optimisation of α: min_α (1/λ) ‖x − Φα‖² + Σ_fn |α_fn|²/[WH]_fn, a ridge-type regression solved with complex-valued FISTA (Chaâri et al., 2011; Florescu et al., 2014).
- Optimisation of W, H: min_{W,H} Σ_fn d_IS(|α_fn|² | [WH]_fn), i.e. IS-NMF with majorisation-minimisation (Févotte and Idier, 2011).
Maximum joint likelihood estimation in LRTFS
Algorithm 1: Block coordinate descent for LRTFS
Set L ≥ ‖Φ‖² (gradient inverse step size); set α⁽⁰⁾ = Φᴴx (STFT)
repeat
  % Update W and H with NMF: {W⁽ⁱ⁺¹⁾, H⁽ⁱ⁺¹⁾} = arg min_{W,H} Σ_fn d_IS(|α⁽ⁱ⁾_fn|² | [WH]_fn)
  % Update α with FISTA:
  repeat
    % Gradient descent: z⁽ⁱ⁾ = α⁽ⁱ⁾ + (1/L) Φᴴ(x − Φα⁽ⁱ⁾)
    % Shrink: α⁽ⁱ⁺¹⁾_fn = ( [WH]⁽ⁱ⁺¹⁾_fn / ([WH]⁽ⁱ⁺¹⁾_fn + λ/L) ) z⁽ⁱ⁾_fn
    % Accelerate with momentum
  until convergence
until convergence
Complexity is one NMF per update of the synthesis coefficients. Efficient multiplication by Φ or Φᴴ using the LTFAT time-frequency toolbox.
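The inner coefficient update can be sketched as follows. For simplicity an orthonormal DFT matrix stands in for Φ (so L = ‖Φ‖² = 1) and the momentum (FISTA) acceleration is omitted, leaving plain ISTA iterations of the gradient and shrink steps; the prior variances v stand in for [WH], assumed given by the NMF step:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 256
x = rng.standard_normal(T)

# Orthonormal DFT matrix as a stand-in dictionary Phi (so ||Phi||^2 = 1)
Phi = np.fft.fft(np.eye(T)) / np.sqrt(T)
L = 1.0
lam = 0.1

# Prior variances v_fn = [WH]_fn, assumed given by the NMF step
v = rng.random(T) + 0.1

alpha = Phi.conj().T @ x  # initialise with the analysis coefficients
for _ in range(100):
    z = alpha + (1 / L) * Phi.conj().T @ (x - Phi @ alpha)  # gradient step
    alpha = (v / (v + lam / L)) * z                          # LRTFS shrink
```

For an orthonormal dictionary the iterations reach the closed-form ridge solution α = v/(v + λ) Φᴴx, which gives a quick sanity check.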
Noisy piano example. Figure: (a) noisy signal (time in seconds); (b) IS-NMF decomposition into temporal components c_1(t), …, c_10(t), with audio examples.
Noisy piano example. Figure: (a) noisy signal (time in seconds); (b) LRTFS decomposition into temporal components c_1(t), …, c_10(t), with audio examples.
Remarks about LRTFS
Real-valued signals (Févotte and Kowalski, 2018): x(t) was previously assumed complex-valued for simplicity. In practice, x(t) is real-valued and Φ, α have Hermitian symmetry: x(t) = Σ_{f=1}^{F/2} Σ_{n=1}^{N} 2 Re[α_fn φ_fn(t)] + e(t). More difficult to address, but leads to essentially the same algorithm.
Multi-layer representations (Févotte and Kowalski, 2014, 2015): LRTFS allows for multi-resolution hybrid representations x = Φ_a α_a + Φ_b α_b + e, where Φ_a and Φ_b are time-frequency dictionaries with possibly different resolutions, and α_a and α_b each have their own structure, either low-rank or sparse. Not possible with standard NMF!
Remarks about LRTFS
Compressive sensing (Févotte and Kowalski, 2018): a signal x of size T is sensed through a linear operator A of size S × T: b = Ax + e = AΦα + e. Thanks to LRTFS, low-rankness can be used instead of sparsity: estimate α from b under α_fn ∼ N_c(0, [WH]_fn), with a similar algorithm.
Figure: output SNR versus number of measurements S (in % of T), for recovery with LRTFS, SBL, LASSO (optimal hyperparameter) and two oracles (spectrogram; WH from IS-NMF), on a toy piano signal and a song excerpt (Mamavatu by S. Raman: acoustic guitar, percussion, drums).
Outline
- NMF for spectral unmixing: generalities; Itakura-Saito NMF
- Low-rank time-frequency synthesis: analysis vs synthesis; generative model; maximum joint likelihood estimation; variants
- NMF with transform learning: principle; optimisation
NMF with transform learning (Fagot, Wendt, and Févotte, 2018). Figure: windowed segments X of the original temporal signal x(t) are mapped through the Fourier transform U, and |UX|² is factorised as WH.
NMF with transform learning (Fagot, Wendt, and Févotte, 2018). Figure: same pipeline, but with the fixed Fourier transform replaced by an adaptive transform U learnt from the data.
NMF with transform learning (Fagot, Wendt, and Févotte, 2018)
The power spectrogram S can be written as S = |U_FT X|², where X, of size F × N, contains adjacent windowed segments of x(t), and U_FT, of size F × F, is the orthogonal Fourier matrix.
Traditional NMF: min_{W,H} D(|U_FT X|² | WH) s.t. W, H ≥ 0
NMF with transform learning (TL-NMF): min_{W,H,U} D(|UX|² | WH) s.t. W, H ≥ 0, U orthogonal
Can be interpreted as a one-layer factorising network. Inspired by the sparsifying transform of (Ravishankar and Bresler, 2013): min_U ‖UX‖₁ s.t. U square invertible.
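The TL-NMF objective is straightforward to evaluate for a given orthogonal transform. In the sketch below an orthonormal DCT-II matrix plays the role of U (a real-valued stand-in for U_FT), the signal and the window are arbitrary, and W, H are random nonnegative factors:

```python
import numpy as np

def d_is_total(S, V):
    """Total Itakura-Saito divergence D_IS(S | V)."""
    R = S / V
    return np.sum(R - np.log(R) - 1)

rng = np.random.default_rng(0)
F, N, K = 32, 100, 4

# X: adjacent windowed segments of a toy signal, stacked as columns
x = np.sin(2 * np.pi * 0.1 * np.arange(F * N)) + 0.1 * rng.standard_normal(F * N)
X = x.reshape(N, F).T * np.hanning(F)[:, None]

# Orthonormal DCT-II matrix as a real-valued orthogonal transform U
k, n = np.arange(F)[:, None], np.arange(F)[None, :]
U = np.sqrt(2 / F) * np.cos(np.pi * (n + 0.5) * k / F)
U[0, :] /= np.sqrt(2)
assert np.allclose(U @ U.T, np.eye(F))  # U is orthogonal

# TL-NMF objective D_IS(|UX|^2 | WH) for given random factors
W = rng.random((F, K)) + 1e-3
H = rng.random((K, N)) + 1e-3
S = (U @ X) ** 2 + 1e-12  # small floor avoids log(0)
obj = d_is_total(S, W @ H)
```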
NMF with transform learning (Fagot, Wendt, and Févotte, 2018)
Objective: min_{W,H,U} D_IS(|UX|² | WH) + λ‖H‖₁ s.t. W, H ≥ 0, ‖w_k‖₁ = 1, UUᵀ = I
For simplicity, U is a real-valued orthogonal matrix. Imposing sparsity on H is of practical importance.
Block coordinate descent on (U, W, H):
- Update of (W, H) with majorisation-minimisation.
- Update of U: projected gradient descent with line search (Manton, 2002), or a Jacobi algorithm with U decomposed as a product of Givens rotations (Wendt, Fagot, and Févotte, 2018).
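The Jacobi parameterisation relies on the fact that multiplying U by a Givens rotation preserves orthogonality, so U can be updated one coordinate plane at a time. A minimal sketch of this constraint-preserving step (plane indices and angle arbitrary):

```python
import numpy as np

def givens(F, p, q, theta):
    """F x F Givens rotation acting in the (p, q) coordinate plane."""
    G = np.eye(F)
    c, s = np.cos(theta), np.sin(theta)
    G[p, p] = c; G[q, q] = c
    G[p, q] = -s; G[q, p] = s
    return G

F = 8
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((F, F)))  # current orthogonal transform U
U = givens(F, 2, 5, 0.3) @ Q                      # one Jacobi-style update
assert np.allclose(U @ U.T, np.eye(F))            # orthogonality is preserved
```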
Learnt atoms from the piano toy example. Figure: the 30 most active atoms u_1, …, u_30 learnt with TL-NMF (random initialisation).
Learnt atoms from the piano toy example. Figure: selected learnt atoms displayed in pairs; some atoms form pairs in phase quadrature.
Temporal decomposition with TL-NMF. Figure: eight temporal components c_1(t), …, c_8(t) over time (seconds), with audio examples.
Summary
- IS-NMF is a generative model of the STFT.
- LRTFS is a generative model of the signal itself, with a low-rank variance structure on the synthesis coefficients.
- TL-NMF learns a short-time transform together with the factorisation.
- Both LRTFS and TL-NMF take the raw signal as input.
Available PhD & postdoc positions in machine learning & signal processing within ERC project FACTORY.
References
S. A. Abdallah and M. D. Plumbley. Polyphonic transcription by nonnegative sparse coding of power spectra. In Proc. International Symposium on Music Information Retrieval (ISMIR), Barcelona, Spain, Oct. 2004.
R. Badeau. Gaussian modeling of mixtures of non-stationary signals in the time-frequency domain (HR-NMF). In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011.
L. Benaroya, R. Gribonval, and F. Bimbot. Non negative sparse representation for Wiener based source separation with a single sensor. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hong Kong, 2003.
L. Chaâri, J.-C. Pesquet, A. Benazza-Benyahia, and P. Ciuciu. A wavelet-based regularized reconstruction algorithm for SENSE parallel MRI with applications to neuroimaging. Medical Image Analysis, 15(2):185-201, 2011.
L. Daudet and B. Torrésani. Hybrid representations for audiophonic signal encoding. Signal Processing, 82(11), 2002.
A. Dessein, A. Cont, and G. Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In Proc. International Society for Music Information Retrieval Conference (ISMIR), 2010.
O. Dikmen and C. Févotte. Nonnegative dictionary learning in the exponential noise model for adaptive music signal representation. In Advances in Neural Information Processing Systems (NIPS), Granada, Spain, Dec. 2011.
D. Fagot, H. Wendt, and C. Févotte. Nonnegative matrix factorization with transform learning. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2018.
C. Févotte. Majorization-minimization algorithm for smooth Itakura-Saito nonnegative matrix factorization. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011.
C. Févotte and J. Idier. Algorithms for nonnegative matrix factorization with the beta-divergence. Neural Computation, 23(9), Sep. 2011.
C. Févotte and M. Kowalski. Low-rank time-frequency synthesis. In Advances in Neural Information Processing Systems (NIPS), Dec. 2014.
C. Févotte and M. Kowalski. Hybrid sparse and low-rank time-frequency signal decomposition. In Proc. European Signal Processing Conference (EUSIPCO), Nice, France, Sep. 2015.
C. Févotte and M. Kowalski. Estimation with low-rank time-frequency synthesis models. IEEE Transactions on Signal Processing, 66(15), Aug. 2018.
C. Févotte, N. Bertin, and J.-L. Durrieu. Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis. Neural Computation, 21(3):793-830, Mar. 2009.
C. Févotte, J. Le Roux, and J. R. Hershey. Non-negative dynamical system with application to speech and audio. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013.
A. Florescu, E. Chouzenoux, J.-C. Pesquet, P. Ciuciu, and S. Ciochina. A majorize-minimize memory gradient method for complex-valued inverse problems. Signal Processing, 2014.
R. M. Gray, A. Buzo, A. H. Gray, and Y. Matsuyama. Distortion measures for speech processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), Aug. 1980.
M. Hoffman, D. Blei, and P. Cook. Bayesian nonparametric matrix factorization for recorded music. In Proc. 27th International Conference on Machine Learning (ICML), Haifa, Israel, 2010.
F. Itakura and S. Saito. Analysis synthesis telephony based on the maximum likelihood method. In Proc. 6th International Congress on Acoustics, Tokyo, Japan, Aug. 1968.
D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot, and R. Horaud. A variational EM algorithm for the separation of time-varying convolutive audio mixtures. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(8), Aug. 2016.
A. Lefèvre, F. Bach, and C. Févotte. Itakura-Saito nonnegative matrix factorization with group sparsity. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011a.
A. Lefèvre, F. Bach, and C. Févotte. Online algorithms for nonnegative matrix factorization with the Itakura-Saito divergence. In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Mohonk, NY, Oct. 2011b.
S. Leglaive, R. Badeau, and G. Richard. Multichannel audio source separation with probabilistic reverberation priors. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(12), Dec. 2016.
A. Liutkus, R. Badeau, and G. Richard. Gaussian processes for underdetermined source separation. IEEE Transactions on Signal Processing, 59(7), July 2011.
P. Magron, R. Badeau, and B. David. Phase-dependent anisotropic Gaussian model for audio source separation. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
J. H. Manton. Optimization algorithms exploiting unitary constraints. IEEE Transactions on Signal Processing, 50(3):635-650, Mar. 2002.
A. Ozerov and C. Févotte. Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Transactions on Audio, Speech and Language Processing, 18(3), Mar. 2010.
R. M. Parry and I. Essa. Phase-aware non-negative spectrogram factorization. In Proc. International Conference on Independent Component Analysis and Signal Separation (ICA), London, UK, Sep. 2007.
S. Ravishankar and Y. Bresler. Learning sparsifying transforms. IEEE Transactions on Signal Processing, 61(5), Mar. 2013.
H. Sawada, H. Kameoka, S. Araki, and N. Ueda. Multichannel extensions of non-negative matrix factorization with complex-valued data. IEEE Transactions on Audio, Speech, and Language Processing, 21(5), May 2013.
P. Smaragdis. About this non-negative business. WASPAA keynote slides, 2013.
P. Smaragdis and J. C. Brown. Non-negative matrix factorization for polyphonic music transcription. In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct. 2003.
V. Y. F. Tan and C. Févotte. Automatic relevance determination in nonnegative matrix factorization with the beta-divergence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(7), July 2013.
M. E. Tipping. Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research, 1:211-244, 2001.
R. E. Turner and M. Sahani. Time-frequency analysis as probabilistic inference. IEEE Transactions on Signal Processing, 62(23), Dec. 2014.
H. Wendt, D. Fagot, and C. Févotte. Jacobi algorithm for nonnegative matrix factorization with transform learning. In Proc. European Signal Processing Conference (EUSIPCO), Sep. 2018.
D. P. Wipf and B. D. Rao. Sparse Bayesian learning for basis selection. IEEE Transactions on Signal Processing, 52(8), Aug. 2004.
K. Yoshii, R. Tomioka, D. Mochihashi, and M. Goto. Infinite positive semidefinite tensor factorization for source separation of mixture signals. In Proc. International Conference on Machine Learning (ICML), 2013.
AN ITERATIVE HARD THRESHOLDING APPROACH TO l 0 SPARSE HELLINGER NMF Ken O Hanlon and Mark B. Sandler Centre for Digital Music Queen Mary University of London ABSTRACT Performance of Non-negative Matrix
More informationOn Spectral Basis Selection for Single Channel Polyphonic Music Separation
On Spectral Basis Selection for Single Channel Polyphonic Music Separation Minje Kim and Seungjin Choi Department of Computer Science Pohang University of Science and Technology San 31 Hyoja-dong, Nam-gu
More informationAn Efficient Posterior Regularized Latent Variable Model for Interactive Sound Source Separation
An Efficient Posterior Regularized Latent Variable Model for Interactive Sound Source Separation Nicholas J. Bryan Center for Computer Research in Music and Acoustics, Stanford University Gautham J. Mysore
More informationGaussian Processes for Audio Feature Extraction
Gaussian Processes for Audio Feature Extraction Dr. Richard E. Turner (ret26@cam.ac.uk) Computational and Biological Learning Lab Department of Engineering University of Cambridge Machine hearing pipeline
More informationEXPLOITING LONG-TERM TEMPORAL DEPENDENCIES IN NMF USING RECURRENT NEURAL NETWORKS WITH APPLICATION TO SOURCE SEPARATION
EXPLOITING LONG-TERM TEMPORAL DEPENDENCIES IN NMF USING RECURRENT NEURAL NETWORKS WITH APPLICATION TO SOURCE SEPARATION Nicolas Boulanger-Lewandowski Gautham J. Mysore Matthew Hoffman Université de Montréal
More informationSparse Time-Frequency Transforms and Applications.
Sparse Time-Frequency Transforms and Applications. Bruno Torrésani http://www.cmi.univ-mrs.fr/~torresan LATP, Université de Provence, Marseille DAFx, Montreal, September 2006 B. Torrésani (LATP Marseille)
More informationAuthor manuscript, published in "7th International Symposium on Computer Music Modeling and Retrieval (CMMR 2010), Málaga : Spain (2010)"
Author manuscript, published in "7th International Symposium on Computer Music Modeling and Retrieval (CMMR 2010), Málaga : Spain (2010)" Notes on nonnegative tensor factorization of the spectrogram for
More informationACOUSTIC SCENE CLASSIFICATION WITH MATRIX FACTORIZATION FOR UNSUPERVISED FEATURE LEARNING. Victor Bisot, Romain Serizel, Slim Essid, Gaël Richard
ACOUSTIC SCENE CLASSIFICATION WITH MATRIX FACTORIZATION FOR UNSUPERVISED FEATURE LEARNING Victor Bisot, Romain Serizel, Slim Essid, Gaël Richard LTCI, CNRS, Télćom ParisTech, Université Paris-Saclay, 75013,
More informationA diagonal plus low-rank covariance model for computationally efficient source separation
A diagonal plus low-ran covariance model for computationally efficient source separation Antoine Liutus, Kazuyoshi Yoshii To cite this version: Antoine Liutus, Kazuyoshi Yoshii. A diagonal plus low-ran
More informationAN EM ALGORITHM FOR AUDIO SOURCE SEPARATION BASED ON THE CONVOLUTIVE TRANSFER FUNCTION
AN EM ALGORITHM FOR AUDIO SOURCE SEPARATION BASED ON THE CONVOLUTIVE TRANSFER FUNCTION Xiaofei Li, 1 Laurent Girin, 2,1 Radu Horaud, 1 1 INRIA Grenoble Rhône-Alpes, France 2 Univ. Grenoble Alpes, GIPSA-Lab,
More informationMultichannel Audio Modeling with Elliptically Stable Tensor Decomposition
Multichannel Audio Modeling with Elliptically Stable Tensor Decomposition Mathieu Fontaine, Fabian-Robert Stöter, Antoine Liutkus, Umut Simsekli, Romain Serizel, Roland Badeau To cite this version: Mathieu
More informationMusic separation guided by cover tracks: designing the joint NMF model
Music separation guided by cover tracks: designing the joint NMF model Nathan Souviraà-Labastie, Emmanuel Vincent, Frédéric Bimbot To cite this version: Nathan Souviraà-Labastie, Emmanuel Vincent, Frédéric
More informationEUSIPCO
EUSIPCO 213 1569744273 GAMMA HIDDEN MARKOV MODEL AS A PROBABILISTIC NONNEGATIVE MATRIX FACTORIZATION Nasser Mohammadiha, W. Bastiaan Kleijn, Arne Leijon KTH Royal Institute of Technology, Department of
More informationA State-Space Approach to Dynamic Nonnegative Matrix Factorization
1 A State-Space Approach to Dynamic Nonnegative Matrix Factorization Nasser Mohammadiha, Paris Smaragdis, Ghazaleh Panahandeh, Simon Doclo arxiv:179.5v1 [cs.lg] 31 Aug 17 Abstract Nonnegative matrix factorization
More informationDictionary learning methods and single-channel source separation
Dictionary learning methods and single-channel source separation Augustin Lefèvre November 29, 2012 Acknowledgements Ministere de la Recherche Willow team European Research Council Sierra team TSI Telecom
More informationNotes on Nonnegative Tensor Factorization of the Spectrogram for Audio Source Separation: Statistical Insights and Towards Self-Clustering of the Spatial Cues Cédric Févotte 1, and Alexey Ozerov 2 1 CNRS
More informationNonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis
LETTER Communicated by Andrzej Cichocki Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis Cédric Févotte fevotte@telecom-paristech.fr Nancy Bertin nbertin@telecom-paristech.fr
More informationPROBLEM of online factorization of data matrices is of
1 Online Matrix Factorization via Broyden Updates Ömer Deniz Akyıldız arxiv:1506.04389v [stat.ml] 6 Jun 015 Abstract In this paper, we propose an online algorithm to compute matrix factorizations. Proposed
More informationNON-NEGATIVE MATRIX FACTORIZATION FOR SINGLE-CHANNEL EEG ARTIFACT REJECTION
NON-NEGATIVE MATRIX FACTORIZATION FOR SINGLE-CHANNEL EEG ARTIFACT REJECTION Cécilia Damon Antoine Liutkus Alexandre Gramfort Slim Essid Institut Mines-Telecom, TELECOM ParisTech - CNRS, LTCI 37, rue Dareau
More informationCOLLABORATIVE SPEECH DEREVERBERATION: REGULARIZED TENSOR FACTORIZATION FOR CROWDSOURCED MULTI-CHANNEL RECORDINGS. Sanna Wager, Minje Kim
COLLABORATIVE SPEECH DEREVERBERATION: REGULARIZED TENSOR FACTORIZATION FOR CROWDSOURCED MULTI-CHANNEL RECORDINGS Sanna Wager, Minje Kim Indiana University School of Informatics, Computing, and Engineering
More informationAnalysis of polyphonic audio using source-filter model and non-negative matrix factorization
Analysis of polyphonic audio using source-filter model and non-negative matrix factorization Tuomas Virtanen and Anssi Klapuri Tampere University of Technology, Institute of Signal Processing Korkeakoulunkatu
More informationSpeech enhancement based on nonnegative matrix factorization with mixed group sparsity constraint
Speech enhancement based on nonnegative matrix factorization with mixed group sparsity constraint Hien-Thanh Duong, Quoc-Cuong Nguyen, Cong-Phuong Nguyen, Thanh-Huan Tran, Ngoc Q. K. Duong To cite this
More informationTIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION
TIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION 13 th International Conference on Digital Audio Effects Romain Hennequin, Roland Badeau and Bertrand David Telecom
More informationPROJET - Spatial Audio Separation Using Projections
PROJET - Spatial Audio Separation Using Proections Derry Fitzgerald, Antoine Liutkus, Roland Badeau To cite this version: Derry Fitzgerald, Antoine Liutkus, Roland Badeau. PROJET - Spatial Audio Separation
More informationUnderdetermined Instantaneous Audio Source Separation via Local Gaussian Modeling
Underdetermined Instantaneous Audio Source Separation via Local Gaussian Modeling Emmanuel Vincent, Simon Arberet, and Rémi Gribonval METISS Group, IRISA-INRIA Campus de Beaulieu, 35042 Rennes Cedex, France
More informationVariational inference
Simon Leglaive Télécom ParisTech, CNRS LTCI, Université Paris Saclay November 18, 2016, Télécom ParisTech, Paris, France. Outline Introduction Probabilistic model Problem Log-likelihood decomposition EM
More informationIndependent Component Analysis and Unsupervised Learning. Jen-Tzung Chien
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent voices Nonparametric likelihood
More informationOn monotonicity of multiplicative update rules for weighted nonnegative tensor factorization
On monotonicity of multiplicative update rules for weighted nonnegative tensor factorization Alexey Ozerov, Ngoc Duong, Louis Chevallier To cite this version: Alexey Ozerov, Ngoc Duong, Louis Chevallier.
More informationNon-Negative Tensor Factorisation for Sound Source Separation
ISSC 2005, Dublin, Sept. -2 Non-Negative Tensor Factorisation for Sound Source Separation Derry FitzGerald, Matt Cranitch φ and Eugene Coyle* φ Dept. of Electronic Engineering, Cor Institute of Technology
More informationSUPERVISED NON-EUCLIDEAN SPARSE NMF VIA BILEVEL OPTIMIZATION WITH APPLICATIONS TO SPEECH ENHANCEMENT
SUPERVISED NON-EUCLIDEAN SPARSE NMF VIA BILEVEL OPTIMIZATION WITH APPLICATIONS TO SPEECH ENHANCEMENT Pablo Sprechmann, 1 Alex M. Bronstein, 2 and Guillermo Sapiro 1 1 Duke University, USA; 2 Tel Aviv University,
More informationVariational Bayesian EM algorithm for modeling mixtures of non-stationary signals in the time-frequency domain (HR-NMF)
Variational Bayesian EM algorithm for modeling mixtures of non-stationary signals in the time-frequency domain HR-NMF Roland Badeau, Angélique Dremeau To cite this version: Roland Badeau, Angélique Dremeau.
More informationAudio Source Separation Based on Convolutive Transfer Function and Frequency-Domain Lasso Optimization
Audio Source Separation Based on Convolutive Transfer Function and Frequency-Domain Lasso Optimization Xiaofei Li, Laurent Girin, Radu Horaud To cite this version: Xiaofei Li, Laurent Girin, Radu Horaud.
More informationREVIEW OF SINGLE CHANNEL SOURCE SEPARATION TECHNIQUES
REVIEW OF SINGLE CHANNEL SOURCE SEPARATION TECHNIQUES Kedar Patki University of Rochester Dept. of Electrical and Computer Engineering kedar.patki@rochester.edu ABSTRACT The paper reviews the problem of
More informationIndependent Component Analysis and Unsupervised Learning
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent
More informationDrum extraction in single channel audio signals using multi-layer non negative matrix factor deconvolution
Drum extraction in single channel audio signals using multi-layer non negative matrix factor deconvolution Clément Laroche, Hélène Papadopoulos, Matthieu Kowalski, Gaël Richard To cite this version: Clément
More information(3) where the mixing vector is the Fourier transform of are the STFT coefficients of the sources I. INTRODUCTION
1830 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 7, SEPTEMBER 2010 Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model Ngoc Q.
More informationVARIATIONAL BAYESIAN EM ALGORITHM FOR MODELING MIXTURES OF NON-STATIONARY SIGNALS IN THE TIME-FREQUENCY DOMAIN (HR-NMF)
VARIATIONAL BAYESIAN EM ALGORITHM FOR MODELING MIXTURES OF NON-STATIONARY SIGNALS IN THE TIME-FREQUENCY DOMAIN HR-NMF Roland Badeau, Angélique Drémeau Institut Mines-Telecom, Telecom ParisTech, CNRS LTCI
More informationPhase-dependent anisotropic Gaussian model for audio source separation
Phase-dependent anisotropic Gaussian model for audio source separation Paul Magron, Roland Badeau, Bertrand David To cite this version: Paul Magron, Roland Badeau, Bertrand David. Phase-dependent anisotropic
More informationSingle-channel source separation using non-negative matrix factorization
Single-channel source separation using non-negative matrix factorization Mikkel N. Schmidt Technical University of Denmark mns@imm.dtu.dk www.mikkelschmidt.dk DTU Informatics Department of Informatics
More information10ème Congrès Français d Acoustique
1ème Congrès Français d Acoustique Lyon, 1-16 Avril 1 Spectral similarity measure invariant to pitch shifting and amplitude scaling Romain Hennequin 1, Roland Badeau 1, Bertrand David 1 1 Institut TELECOM,
More informationReal-time polyphonic music transcription with non-negative matrix factorization and beta-divergence
Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence Arnaud Dessein, Arshia Cont, Guillaume Lemaitre To cite this version: Arnaud Dessein, Arshia Cont, Guillaume
More informationSource Separation Tutorial Mini-Series III: Extensions and Interpretations to Non-Negative Matrix Factorization
Source Separation Tutorial Mini-Series III: Extensions and Interpretations to Non-Negative Matrix Factorization Nicholas Bryan Dennis Sun Center for Computer Research in Music and Acoustics, Stanford University
More informationSHIFT-INVARIANT DICTIONARY LEARNING FOR SPARSE REPRESENTATIONS: EXTENDING K-SVD
SHIFT-INVARIANT DICTIONARY LEARNING FOR SPARSE REPRESENTATIONS: EXTENDING K-SVD Boris Mailhé, Sylvain Lesage,Rémi Gribonval and Frédéric Bimbot Projet METISS Centre de Recherche INRIA Rennes - Bretagne
More informationIN real-world audio signals, several sound sources are usually
1066 IEEE RANSACIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 3, MARCH 2007 Monaural Sound Source Separation by Nonnegative Matrix Factorization With emporal Continuity and Sparseness Criteria
More informationEnforcing Harmonicity and Smoothness in Bayesian Non-negative Matrix Factorization Applied to Polyphonic Music Transcription.
Enforcing Harmonicity and Smoothness in Bayesian Non-negative Matrix Factorization Applied to Polyphonic Music Transcription. Nancy Bertin*, Roland Badeau, Member, IEEE and Emmanuel Vincent, Member, IEEE
More informationEnvironmental Sound Classification in Realistic Situations
Environmental Sound Classification in Realistic Situations K. Haddad, W. Song Brüel & Kjær Sound and Vibration Measurement A/S, Skodsborgvej 307, 2850 Nærum, Denmark. X. Valero La Salle, Universistat Ramon
More informationBayesian Hierarchical Modeling for Music and Audio Processing at LabROSA
Bayesian Hierarchical Modeling for Music and Audio Processing at LabROSA Dawen Liang (LabROSA) Joint work with: Dan Ellis (LabROSA), Matt Hoffman (Adobe Research), Gautham Mysore (Adobe Research) 1. Bayesian
More informationMatrix Factorization for Speech Enhancement
Matrix Factorization for Speech Enhancement Peter Li Peter.Li@nyu.edu Yijun Xiao ryjxiao@nyu.edu 1 Introduction In this report, we explore techniques for speech enhancement using matrix factorization.
More informationNon-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs Paris Smaragdis TR2004-104 September
More informationSHIFTED AND CONVOLUTIVE SOURCE-FILTER NON-NEGATIVE MATRIX FACTORIZATION FOR MONAURAL AUDIO SOURCE SEPARATION. Tomohiko Nakamura and Hirokazu Kameoka,
SHIFTED AND CONVOLUTIVE SOURCE-FILTER NON-NEGATIVE MATRIX FACTORIZATION FOR MONAURAL AUDIO SOURCE SEPARATION Tomohiko Nakamura and Hirokazu Kameoka, Graduate School of Information Science and Technology,
More informationDiffuse noise suppression with asynchronous microphone array based on amplitude additivity model
Diffuse noise suppression with asynchronous microphone array based on amplitude additivity model Yoshikazu Murase, Hironobu Chiba, Nobutaka Ono, Shigeki Miyabe, Takeshi Yamada, and Shoji Makino University
More informationMULTIPITCH ESTIMATION AND INSTRUMENT RECOGNITION BY EXEMPLAR-BASED SPARSE REPRESENTATION. Ikuo Degawa, Kei Sato, Masaaki Ikehara
MULTIPITCH ESTIMATION AND INSTRUMENT RECOGNITION BY EXEMPLAR-BASED SPARSE REPRESENTATION Ikuo Degawa, Kei Sato, Masaaki Ikehara EEE Dept. Keio University Yokohama, Kanagawa 223-8522 Japan E-mail:{degawa,
More informationSound Recognition in Mixtures
Sound Recognition in Mixtures Juhan Nam, Gautham J. Mysore 2, and Paris Smaragdis 2,3 Center for Computer Research in Music and Acoustics, Stanford University, 2 Advanced Technology Labs, Adobe Systems
More informationOn the use of a spatial cue as prior information for stereo sound source separation based on spatially weighted non-negative tensor factorization
Mitsufuji and Roebel EURASIP Journal on Advances in Signal Processing 2014, 2014:40 RESEARCH Open Access On the use of a spatial cue as prior information for stereo sound source separation based on spatially
More informationSparseness Constraints on Nonnegative Tensor Decomposition
Sparseness Constraints on Nonnegative Tensor Decomposition Na Li nali@clarksonedu Carmeliza Navasca cnavasca@clarksonedu Department of Mathematics Clarkson University Potsdam, New York 3699, USA Department
More information502 IEEE SIGNAL PROCESSING LETTERS, VOL. 23, NO. 4, APRIL 2016
502 IEEE SIGNAL PROCESSING LETTERS, VOL. 23, NO. 4, APRIL 2016 Discriminative Training of NMF Model Based on Class Probabilities for Speech Enhancement Hanwook Chung, Student Member, IEEE, Eric Plourde,
More informationMultichannel Extensions of Non-negative Matrix Factorization with Complex-valued Data
1 Multichannel Extensions of Non-negative Matrix Factorization with Complex-valued Data Hiroshi Sawada, Senior Member, IEEE, Hirokazu Kameoka, Member, IEEE, Shoko Araki, Senior Member, IEEE, Naonori Ueda,
More informationScalable audio separation with light kernel additive modelling
Scalable audio separation with light kernel additive modelling Antoine Liutkus, Derry Fitzgerald, Zafar Rafii To cite this version: Antoine Liutkus, Derry Fitzgerald, Zafar Rafii. Scalable audio separation
More informationHarmonic/Percussive Separation Using Kernel Additive Modelling
Author manuscript, published in "IET Irish Signals & Systems Conference 2014 (2014)" ISSC 2014 / CIICT 2014, Limerick, June 26 27 Harmonic/Percussive Separation Using Kernel Additive Modelling Derry FitzGerald
More informationLearning the β-divergence in Tweedie Compound Poisson Matrix Factorization Models
Learning the β-divergence in Tweedie Compound Poisson Matrix Factorization Models Umut Şimşekli umut.simsekli@boun.edu.tr Dept. of Computer Engineering, Boğaziçi University, 34342 Bebek, Istanbul, Turkey
More informationConstrained Nonnegative Matrix Factorization with Applications to Music Transcription
Constrained Nonnegative Matrix Factorization with Applications to Music Transcription by Daniel Recoskie A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the
More informationConsistent Wiener Filtering for Audio Source Separation
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Consistent Wiener Filtering for Audio Source Separation Le Roux, J.; Vincent, E. TR2012-090 October 2012 Abstract Wiener filtering is one of
More information