Temporal models with low-rank spectrograms


1 Temporal models with low-rank spectrograms. Cédric Févotte, Institut de Recherche en Informatique de Toulouse (IRIT). IEEE MLSP, Aalborg, Sep. 2018.

2 Outline
  NMF for spectral unmixing: generalities, Itakura-Saito NMF
  Low-rank time-frequency synthesis: analysis vs synthesis, generative model, maximum joint likelihood estimation, variants
  NMF with transform learning: principle, optimisation

3 Collaborators
  Low-rank time-frequency synthesis: Matthieu Kowalski (Paris-Saclay University)
  NMF with transform learning: Dylan Fagot, Herwig Wendt (CNRS, IRIT, Toulouse)

4 NMF for audio spectral unmixing (Smaragdis and Brown, 2003)
Figure: the power spectrogram S (frequency x time) of the original temporal signal x(t) is approximated by the product WH.
y_{fn}: short-time Fourier transform (STFT) of the temporal signal x(t); s_{fn} = |y_{fn}|^2: power spectrogram.
NMF extracts recurring spectral patterns from the data by solving min_{W,H} D(S | WH) = Σ_{fn} d(s_{fn} | [WH]_{fn}).
Successful applications in audio source separation and music transcription.
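As a purely illustrative aside (not code from the talk), the following NumPy/SciPy sketch computes the power spectrogram s_{fn} = |y_{fn}|^2 from a signal x(t) and evaluates a measure of fit D(S | WH) for candidate factors; the window length, rank and the squared Euclidean divergence are arbitrary choices of this example.

```python
import numpy as np
from scipy.signal import stft

def power_spectrogram(x, fs=16000, nperseg=512):
    """STFT y_fn of x(t) and its power spectrogram s_fn = |y_fn|^2."""
    _, _, Y = stft(x, fs=fs, nperseg=nperseg)
    return Y, np.abs(Y) ** 2

def nmf_fit(S, W, H, d):
    """Measure of fit D(S | WH) = sum_fn d(s_fn | [WH]_fn) for an elementwise divergence d."""
    return np.sum(d(S, W @ H))

# Toy usage with random data and the squared Euclidean divergence.
rng = np.random.default_rng(0)
x = rng.standard_normal(16000)
Y, S = power_spectrogram(x)
F, N = S.shape
K = 10
W, H = rng.random((F, K)), rng.random((K, N))
print(nmf_fit(S, W, H, lambda s, v: (s - v) ** 2))
```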

5 NMF for audio spectral unmixing (Smaragdis and Brown, 2003)
All factors are positive-valued, so the resulting reconstruction is additive.
Figure: an input music passage spectrogram X decomposed into non-negative factors W and H, with the individual components shown over frequency (Hz) and time (sec). Reproduced from (Smaragdis, 2013).

6 NMF for audio spectral unmixing (Smaragdis and Brown, 2003)
Figure: power spectrogram S ≈ WH, computed from the original temporal signal x(t) by a frequency transform.
What is the right time-frequency transform S? What is the right measure of fit D(S | WH)?
NMF approximates the spectrogram by a sum of rank-one spectrograms: how can temporal components be reconstructed? What about phase?

7 Itakura-Saito NMF (Févotte, Bertin, and Durrieu, 2009)
Gaussian low-rank variance model of the complex-valued STFT: y_{fn} ~ N_c(0, [WH]_{fn})
The log-likelihood is equivalent to the Itakura-Saito (IS) divergence: -log p(Y | WH) = D_IS(|Y|^2 | WH) + cst
Zero-mean assumption: E[x(t)] = 0 implies E[y_{fn}] = 0 by linearity.
Underlies a Gaussian composite model (GCM): y_{fn} = Σ_k z_{kfn}, with z_{kfn} ~ N_c(0, w_{fk} h_{kn})
Latent STFT components can be estimated a posteriori by Wiener filtering: ẑ_{kfn} = E[z_{kfn} | Y, W, H] = (w_{fk} h_{kn} / [WH]_{fn}) y_{fn}
The inverse STFT of {ẑ_{kfn}}_{fn} produces temporal components ĉ_k(t) such that x(t) = Σ_k ĉ_k(t).
Related work by (Benaroya et al., 2003; Abdallah and Plumbley, 2004; Parry and Essa, 2007)
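A minimal NumPy sketch of the Wiener-filter reconstruction step, assuming the complex STFT Y and fitted factors W, H are already available; scipy.signal.istft is used for the inverse transform and its parameters must match those of the analysis STFT (the values below are placeholders).

```python
import numpy as np
from scipy.signal import istft

def wiener_components(Y, W, H, fs=16000, nperseg=512, eps=1e-12):
    """Posterior-mean (Wiener) reconstruction of the K temporal components.

    Y: complex STFT of the signal, shape (F, N); W: (F, K); H: (K, N).
    The STFT parameters passed to istft must match those used for analysis.
    """
    V = W @ H + eps                                  # low-rank variances [WH]_fn
    components = []
    for k in range(W.shape[1]):
        gain = np.outer(W[:, k], H[k, :]) / V        # w_fk h_kn / [WH]_fn
        Z_k = gain * Y                               # Wiener estimate of z_kfn
        _, c_k = istft(Z_k, fs=fs, nperseg=nperseg)  # inverse STFT of the k-th component
        components.append(c_k)
    return components                                # these sum (approximately) to x(t)
```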

13 The Itakura-Saito divergence (Itakura and Saito, 1968; Gray et al., 1980)
Figure: divergence d(x | y) as a function of y for x = 1 (Itakura-Saito, generalised KL, squared Euclidean).
IS divergence: d_IS(x | y) = x/y - log(x/y) - 1. Nonconvex in y.
Scale-invariance: d_IS(λx | λy) = d_IS(x | y). Very relevant for spectral data with high dynamic range.
In comparison, d_Euc(λx | λy) = λ^2 d_Euc(x | y) and d_KL(λx | λy) = λ d_KL(x | y).
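A tiny numerical check of the definition and of the scale-invariance property (the function name d_is is ours, introduced for this illustration):

```python
import numpy as np

def d_is(x, y):
    """Itakura-Saito divergence d_IS(x | y) = x/y - log(x/y) - 1 (elementwise)."""
    r = x / y
    return r - np.log(r) - 1.0

x, y, lam = 2.0, 0.5, 1e3
print(d_is(x, y))                  # some positive value
print(d_is(lam * x, lam * y))      # identical value: the divergence is scale-invariant
print(d_is(x, x))                  # 0.0: the divergence vanishes when y = x
```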

14 Optimisation for IS-NMF (Févotte and Idier, 2011)
Objective: min_{W,H} D_IS(S | WH) = Σ_{fn} [ s_{fn}/[WH]_{fn} - log(s_{fn}/[WH]_{fn}) - 1 ] = Σ_{fn} [ s_{fn}/[WH]_{fn} + log [WH]_{fn} ] + cst
State of the art: block-coordinate descent on (W, H) with majorisation-minimisation (MM). The updates of W and H are equivalent by transposition of S, and MM leads to multiplicative updates with linear complexity per iteration:
h_{kn} ← h_{kn} (Σ_f w_{fk} s_{fn} [WH]_{fn}^{-2}) / (Σ_f w_{fk} [WH]_{fn}^{-1})
The problem is nonconvex (because of the bilinearity and the divergence), so initialisation matters.
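A minimal NumPy sketch of these multiplicative MM updates, written from the formulas on this slide rather than from the speaker's reference implementation; the small eps flooring is an addition for numerical safety.

```python
import numpy as np

def is_nmf(S, K, n_iter=200, eps=1e-12, seed=0):
    """IS-NMF of a nonnegative matrix S via multiplicative MM updates."""
    rng = np.random.default_rng(seed)
    F, N = S.shape
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        V = W @ H + eps
        H *= (W.T @ (S / V**2)) / (W.T @ (1.0 / V))   # h_kn update
        V = W @ H + eps
        W *= ((S / V**2) @ H.T) / ((1.0 / V) @ H.T)   # w_fk update (same rule, transposed)
    return W, H
```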

15 Piano toy example (MIDI numbers: 61, 65, 68, 72). Figure: three representations of the data. A demo is available online.

16 Piano toy example: IS-NMF on the power spectrogram with K = 8. Figure: dictionary W, coefficients H and reconstructed components for each k. Pitch estimates (true values: 61, 65, 68, 72).

17 Piano toy example: KL-NMF on the magnitude spectrogram with K = 8. Figure: dictionary W, coefficients H and reconstructed components for each k. Pitch estimates (true values: 61, 65, 68, 72).

18 Follow-up on IS-NMF
  penalised versions promoting sparsity or dynamics (Lefèvre, Bach, and Févotte, 2011a; Févotte, 2011; Févotte, Le Roux, and Hershey, 2013)
  model order selection (Tan and Févotte, 2013)
  online/incremental variants (Dessein et al., 2010; Lefèvre et al., 2011b)
  Bayesian approaches (Hoffman et al., 2010; Dikmen and Févotte, 2011; Turner and Sahani, 2014)
  full-covariance models (Liutkus et al., 2011; Yoshii et al., 2013)
  improved phase models (Badeau, 2011; Magron, Badeau, and David, 2017)
  multichannel variants (Ozerov and Févotte, 2010; Sawada et al., 2013; Kounades-Bastian et al., 2016; Leglaive et al., 2016)
Figure: the multichannel NMF problem: sources S, modelled by NMF (W, H), pass through a mixing system A with additive noise to produce the mixture X; estimate W, H and A from X.

19 Outline
  NMF for spectral unmixing: generalities, Itakura-Saito NMF
  Low-rank time-frequency synthesis: analysis vs synthesis, generative model, maximum joint likelihood estimation, variants
  NMF with transform learning: principle, optimisation

20 Analysis vs synthesis
IS-NMF is a generative model of the STFT but not of the raw signal itself. Low-rank time-frequency synthesis (LRTFS) fills this gap.
The STFT is an analysis transform: y_{fn} = Σ_t x(t) φ_{fn}(t).
LRTFS is a synthesis model: x(t) = Σ_{fn} α_{fn} φ_{fn}(t) + e(t).
Figure: two Gabor atoms φ_{fn}(t).

21 Low-rank time-frequency synthesis (LRTFS) (Févotte and Kowalski, 2014)
Gaussian low-rank variance model of the synthesis coefficients:
x(t) = Σ_{fn} α_{fn} φ_{fn}(t) + e(t), with α_{fn} ~ N_c(0, [WH]_{fn}) and e(t) ~ N_c(0, λ).
LRTFS is a generative model of the raw signal x(t).
As in IS-NMF, the synthesis coefficients have a latent composite structure: α_{fn} = Σ_k z_{kfn}, z_{kfn} ~ N_c(0, w_{fk} h_{kn}).
Given {ẑ_{kfn}}_{fn}, temporal components can be reconstructed as ĉ_k(t) = Σ_{fn} ẑ_{kfn} φ_{fn}(t).
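To make the generative reading concrete, the sketch below draws coefficients α_{fn} ~ N_c(0, [WH]_{fn}) and synthesises a signal with a small Gabor-like dictionary of windowed complex exponentials; this toy frame is our own simplification and stands in for the proper time-frequency dictionaries (e.g. from LTFAT) used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
T, F, N, K = 4096, 32, 32, 4            # signal length, frequency bins, time frames, rank
hop, win_len = T // N, 256
t = np.arange(T)

def atom(f, n):
    """Gabor-like atom: Gaussian window centred on frame n times a complex exponential."""
    centre = n * hop
    window = np.exp(-0.5 * ((t - centre) / (win_len / 4)) ** 2)
    return window * np.exp(2j * np.pi * f * (t - centre) / (2 * F))

# Low-rank variances [WH]_fn and coefficients alpha_fn ~ N_c(0, [WH]_fn).
W, H = rng.random((F, K)), rng.random((K, N))
V = W @ H
alpha = np.sqrt(V / 2) * (rng.standard_normal((F, N)) + 1j * rng.standard_normal((F, N)))

# Synthesis: x(t) = sum_fn alpha_fn phi_fn(t) + e(t), with e(t) ~ N_c(0, lambda).
lam = 1e-3
x = sum(alpha[f, n] * atom(f, n) for f in range(F) for n in range(N))
x = x + np.sqrt(lam / 2) * (rng.standard_normal(T) + 1j * rng.standard_normal(T))
print(x.shape, x.dtype)                 # (4096,) complex128
```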

22 Relation to sparse Bayesian learning (SBL)
Generative signal model in vector/matrix form: x = Φα + e.
x, e: vectors of signal and residual time samples (size T); α: vector of synthesis coefficients α_{fn} (size FN); Φ: time-frequency dictionary (size T x FN).
Synthesis coefficient model in vector/matrix form: p(α | v) = N_c(α | 0, diag(v)), where v is the vector of variance coefficients v_{fn} = [WH]_{fn} (size FN).
Similar to sparse Bayesian learning (Tipping, 2001; Wipf and Rao, 2004), except that the variance parameters are tied together by the low-rank structure WH.
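A few illustrative lines spelling out this vector/matrix form: the LRTFS prior is an SBL-style diagonal Gaussian whose FN variance parameters are tied to the rank-K product WH (all dimensions below are made up for the example).

```python
import numpy as np

F, N, K = 16, 8, 3
rng = np.random.default_rng(0)
W, H = rng.random((F, K)), rng.random((K, N))

# SBL would treat v as FN free hyperparameters; LRTFS ties them to a rank-K structure.
v = (W @ H).ravel()                      # variance vector v_fn = [WH]_fn, size FN
prior_cov = np.diag(v)                   # p(alpha | v) = N_c(alpha | 0, diag(v))
print(prior_cov.shape)                   # (128, 128), i.e. (FN, FN)
```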

23 Maximum joint likelihood estimation in LRTFS
Optimise C(α, W, H) := -log p(x, α | W, H, λ) = (1/λ) ||x - Φα||_2^2 + Σ_{fn} [ |α_{fn}|^2 / [WH]_{fn} + log [WH]_{fn} ] + cst
Block coordinate descent over (α, W, H):
Optimisation of α: min_α (1/λ) ||x - Φα||_2^2 + Σ_{fn} |α_{fn}|^2 / [WH]_{fn}, a ridge regression solved with complex-valued FISTA (Chaâri et al., 2011; Florescu et al., 2014).
Optimisation of W, H: min_{W,H} Σ_{fn} d_IS(|α_{fn}|^2 | [WH]_{fn}), i.e. IS-NMF with majorisation-minimisation (Févotte and Idier, 2011).

26 Maximum joint likelihood estimation in LRTFS
Algorithm 1: Block coordinate descent for LRTFS
  Set L ≥ ||Φ||_2^2 (inverse gradient step size)
  Set α^(0) = Φ^H x (STFT)
  repeat
    % Update W and H with NMF
    {W^(i+1), H^(i+1)} = arg min_{W,H} Σ_{fn} d_IS(|α_{fn}^(i)|^2 | [WH]_{fn})
    % Update α with FISTA
    repeat
      % Gradient descent
      z^(i) = α^(i) + (1/L) Φ^H (x - Φα^(i))
      % Shrink
      α_{fn}^(i+1) = ( [WH]_{fn}^(i+1) / ([WH]_{fn}^(i+1) + λ/L) ) z_{fn}^(i)
      % Accelerate with momentum
    until convergence
  until convergence
Complexity: one NMF per update of the synthesis coefficients. Efficient multiplication by Φ or Φ^H using the LTFAT time-frequency toolbox.
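A compact NumPy sketch of this alternating scheme, in an ISTA-style form without the momentum acceleration; it assumes Φ is given as an explicit T x FN matrix and reuses the is_nmf helper sketched earlier, whereas a practical implementation would use fast analysis/synthesis operators (e.g. LTFAT) instead of dense matrix products.

```python
import numpy as np

def lrtfs_mjle(x, Phi, F, N, K, lam=1e-2, n_outer=20, n_inner=30, eps=1e-12):
    """Block coordinate descent for LRTFS maximum joint likelihood (ISTA variant)."""
    L = np.linalg.norm(Phi, 2) ** 2              # squared spectral norm of Phi
    alpha = Phi.conj().T @ x                     # initial coefficients (analysis / STFT)
    for _ in range(n_outer):
        # NMF step on the power of the current synthesis coefficients
        S = np.abs(alpha.reshape(F, N)) ** 2 + eps
        W, H = is_nmf(S, K)                      # IS-NMF sketch from above
        V = (W @ H).ravel() + eps
        # Proximal-gradient (gradient + shrink) steps on alpha
        for _ in range(n_inner):
            z = alpha + Phi.conj().T @ (x - Phi @ alpha) / L
            alpha = (V / (V + lam / L)) * z
    return alpha.reshape(F, N), W, H
```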

27 Noisy piano example. Figure: (a) noisy signal; (b) IS-NMF decomposition into temporal components c_1(t), ..., c_10(t) (with audio examples).

28 Noisy piano example. Figure: (a) noisy signal; (b) LRTFS decomposition into temporal components c_1(t), ..., c_10(t) (with audio examples).

29 Remarks about LRTFS
Real-valued signals (Févotte and Kowalski, 2018): x(t) was previously assumed complex-valued for simplicity. In practice x(t) is real-valued and Φ, α have Hermitian symmetry: x(t) = Σ_{f=1}^{F/2} Σ_{n=1}^{N} 2 Re[α_{fn} φ_{fn}(t)] + e(t). This case is more difficult to address but leads to essentially the same algorithm.
Multi-layer representations (Févotte and Kowalski, 2014, 2015): LRTFS allows for multi-resolution hybrid representations x = Φ_a α_a + Φ_b α_b + e, where Φ_a and Φ_b are time-frequency dictionaries with possibly different resolutions, and α_a and α_b have their own structure, either low-rank or sparse. Not possible with standard NMF!

30 Remarks about LRTFS
Compressive sensing (Févotte and Kowalski, 2018): a signal x of size T is sensed through a linear operator A of size S x T: b = Ax + e = AΦα + e. Thanks to LRTFS, low-rankness can be used instead of sparsity: estimate α from b under α_{fn} ~ N_c(0, [WH]_{fn}), with a similar algorithm.
Figure: output SNR versus number of measurements S (in % of T) for a toy piano signal and a song excerpt (Mamavatu by S. Raman: acoustic guitar, percussion, drums); recovery with LRTFS, SBL, LASSO (optimal hyperparameter) and two oracles (spectrogram, and WH from IS-NMF).

31 Outline
  NMF for spectral unmixing: generalities, Itakura-Saito NMF
  Low-rank time-frequency synthesis: analysis vs synthesis, generative model, maximum joint likelihood estimation, variants
  NMF with transform learning: principle, optimisation

32 NMF with transform learning (Fagot, Wendt, and Févotte, 2018). Figure: the matrix X of windowed segments of the original temporal signal x(t) is passed through the Fourier transform, and |U_FT X|^2 is factorised as WH.

33 NMF with transform learning (Fagot, Wendt, and Févotte, 2018). Figure: the same pipeline with an adaptive transform U in place of the fixed Fourier transform: |UX|^2 is factorised as WH.

34 NMF with transform learning (Fagot, Wendt, and Févotte, 2018)
The power spectrogram S can be written as S = |U_FT X|^2, where X (size F x N) contains adjacent windowed segments of x(t) and U_FT (size F x F) is the orthogonal Fourier matrix.
Traditional NMF: min_{W,H} D(|U_FT X|^2 | WH) s.t. W, H ≥ 0
NMF with transform learning (TL-NMF): min_{W,H,U} D(|UX|^2 | WH) s.t. W, H ≥ 0, U orthogonal
Can be interpreted as a one-layer factorising network.
Inspired by the sparsifying transform of (Ravishankar and Bresler, 2013): min_U ||UX||_1 s.t. U square and invertible
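An illustrative NumPy sketch of the quantities involved: building the matrix X of adjacent windowed segments and evaluating the fit D_IS(|UX|^2 | WH) for a fixed orthogonal transform and for a random orthogonal candidate. The orthogonal DCT matrix is used here as a real-valued stand-in for U_FT, the sine window and sizes are arbitrary, and is_nmf refers to the earlier sketch.

```python
import numpy as np
from scipy.fft import dct

def segment(x, F, hop):
    """Stack adjacent windowed frames of x(t) into the columns of X (size F x N)."""
    win = np.sin(np.pi * (np.arange(F) + 0.5) / F)        # simple sine window
    n_frames = (len(x) - F) // hop + 1
    return np.stack([win * x[i*hop : i*hop + F] for i in range(n_frames)], axis=1)

def d_is_mat(S, V, eps=1e-12):
    """Itakura-Saito fit D_IS(S | V) summed over all entries."""
    R = (S + eps) / (V + eps)
    return np.sum(R - np.log(R) - 1.0)

rng = np.random.default_rng(0)
x = rng.standard_normal(8192)
F, hop, K = 64, 32, 5
X = segment(x, F, hop)

U_fixed = dct(np.eye(F), axis=0, norm='ortho')            # real orthogonal stand-in for U_FT
W, H = is_nmf(np.abs(U_fixed @ X) ** 2, K)                # traditional NMF on |U_FT X|^2
print("fixed transform objective:", d_is_mat(np.abs(U_fixed @ X) ** 2, W @ H))

# TL-NMF additionally optimises over orthogonal U; any orthogonal candidate can be scored:
Q, _ = np.linalg.qr(rng.standard_normal((F, F)))          # random orthogonal U
print("random U objective:", d_is_mat(np.abs(Q @ X) ** 2, W @ H))
```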

38 NMF with transform learning (Fagot, Wendt, and Févotte, 2018)
Objective: min_{W,H,U} D_IS(|UX|^2 | WH) + λ ||H||_1 s.t. W, H ≥ 0, ||w_k||_1 = 1, UU^T = I
For simplicity, U is a real-valued orthogonal matrix. Imposing sparsity on H is of practical importance.
Block coordinate descent over (U, W, H):
  update of (W, H) with majorisation-minimisation;
  update of U: projected gradient descent with line search (Manton, 2002), or a Jacobi algorithm in which U is decomposed as a product of Givens rotations (Wendt, Fagot, and Févotte, 2018).
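A rough sketch of one possible U update in the spirit of the Givens-rotation (Jacobi) idea: a single plane rotation of rows (i, j) is scanned over a grid of angles and kept if it decreases the objective. This is our simplified illustration, not the published algorithm, and it reuses d_is_mat from the previous sketch.

```python
import numpy as np

def givens_update(U, X, V, i, j, n_angles=64):
    """Try Givens rotations of rows (i, j) of U and keep the best one.

    V = W @ H is the current low-rank variance fit; the objective is
    D_IS(|U X|^2 | V), evaluated with the d_is_mat helper above.
    """
    best_U, best_obj = U, d_is_mat(np.abs(U @ X) ** 2, V)
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        R = np.eye(U.shape[0])
        c, s = np.cos(theta), np.sin(theta)
        R[i, i], R[i, j], R[j, i], R[j, j] = c, -s, s, c   # rotation in the (i, j) plane
        U_try = R @ U                                      # remains orthogonal
        obj = d_is_mat(np.abs(U_try @ X) ** 2, V)
        if obj < best_obj:
            best_U, best_obj = U_try, obj
    return best_U, best_obj
```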

39 Learnt atoms from the piano toy example. Figure: the most active atoms u_1, ..., u_30 learnt with TL-NMF (random initialisation).

40 Learnt atoms from the piano toy example. Figure: selected atom pairs; some atoms form pairs in phase quadrature.

41 Temporal decomposition with TL-NMF. Figure: the eight reconstructed components over time (s). Audio examples: c_1(t), ..., c_8(t).

42 Summary
IS-NMF is a generative model of the STFT.
LRTFS is a generative model of the signal itself, with a low-rank variance structure on the synthesis coefficients.
TL-NMF learns a short-time transform together with the factorisation.
Both LRTFS and TL-NMF take the raw signal as input.
PhD and postdoc positions in machine learning and signal processing are available within the ERC project FACTORY.

44 References I
S. A. Abdallah and M. D. Plumbley. Polyphonic transcription by nonnegative sparse coding of power spectra. In Proc. International Symposium on Music Information Retrieval (ISMIR), Barcelona, Spain, Oct. 2004.
R. Badeau. Gaussian modeling of mixtures of non-stationary signals in the time-frequency domain (HR-NMF). In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011.
L. Benaroya, R. Gribonval, and F. Bimbot. Non negative sparse representation for Wiener based source separation with a single sensor. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hong Kong, 2003.
L. Chaâri, J.-C. Pesquet, A. Benazza-Benyahia, and P. Ciuciu. A wavelet-based regularized reconstruction algorithm for SENSE parallel MRI with applications to neuroimaging. Medical Image Analysis, 15(2):185-201, 2011.
L. Daudet and B. Torrésani. Hybrid representations for audiophonic signal encoding. Signal Processing, 82(11), 2002.
A. Dessein, A. Cont, and G. Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In Proc. International Society for Music Information Retrieval Conference (ISMIR), 2010.
O. Dikmen and C. Févotte. Nonnegative dictionary learning in the exponential noise model for adaptive music signal representation. In Advances in Neural Information Processing Systems (NIPS), Granada, Spain, Dec. 2011.

45 References II
D. Fagot, H. Wendt, and C. Févotte. Nonnegative matrix factorization with transform learning. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2018.
C. Févotte. Majorization-minimization algorithm for smooth Itakura-Saito nonnegative matrix factorization. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011.
C. Févotte and J. Idier. Algorithms for nonnegative matrix factorization with the beta-divergence. Neural Computation, 23(9), Sep. 2011.
C. Févotte and M. Kowalski. Low-rank time-frequency synthesis. In Advances in Neural Information Processing Systems (NIPS), Dec. 2014.
C. Févotte and M. Kowalski. Hybrid sparse and low-rank time-frequency signal decomposition. In Proc. European Signal Processing Conference (EUSIPCO), Nice, France, Sep. 2015.
C. Févotte and M. Kowalski. Estimation with low-rank time-frequency synthesis models. IEEE Transactions on Signal Processing, 66(15), Aug. 2018.
C. Févotte, N. Bertin, and J.-L. Durrieu. Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis. Neural Computation, 21(3):793-830, Mar. 2009.

46 References III
C. Févotte, J. Le Roux, and J. R. Hershey. Non-negative dynamical system with application to speech and audio. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013.
A. Florescu, E. Chouzenoux, J.-C. Pesquet, P. Ciuciu, and S. Ciochina. A majorize-minimize memory gradient method for complex-valued inverse problems. Signal Processing, 103, 2014.
R. M. Gray, A. Buzo, A. H. Gray, and Y. Matsuyama. Distortion measures for speech processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), Aug. 1980.
M. Hoffman, D. Blei, and P. Cook. Bayesian nonparametric matrix factorization for recorded music. In Proc. 27th International Conference on Machine Learning (ICML), Haifa, Israel, 2010.
F. Itakura and S. Saito. Analysis synthesis telephony based on the maximum likelihood method. In Proc. 6th International Congress on Acoustics, pages C-17-C-20, Tokyo, Japan, Aug. 1968.
D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot, and R. Horaud. A variational EM algorithm for the separation of time-varying convolutive audio mixtures. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(8), Aug. 2016.
A. Lefèvre, F. Bach, and C. Févotte. Itakura-Saito nonnegative matrix factorization with group sparsity. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011.
A. Lefèvre, F. Bach, and C. Févotte. Online algorithms for nonnegative matrix factorization with the Itakura-Saito divergence. In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Mohonk, NY, Oct. 2011.

47 References IV
S. Leglaive, R. Badeau, and G. Richard. Multichannel audio source separation with probabilistic reverberation priors. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(12), Dec. 2016.
A. Liutkus, R. Badeau, and G. Richard. Gaussian processes for underdetermined source separation. IEEE Transactions on Signal Processing, 59(7), July 2011.
P. Magron, R. Badeau, and B. David. Phase-dependent anisotropic Gaussian model for audio source separation. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
J. H. Manton. Optimization algorithms exploiting unitary constraints. IEEE Transactions on Signal Processing, 50(3):635-650, Mar. 2002.
A. Ozerov and C. Févotte. Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Transactions on Audio, Speech and Language Processing, 18(3), Mar. 2010.
R. M. Parry and I. Essa. Phase-aware non-negative spectrogram factorization. In Proc. International Conference on Independent Component Analysis and Signal Separation (ICA), London, UK, Sep. 2007.
S. Ravishankar and Y. Bresler. Learning sparsifying transforms. IEEE Transactions on Signal Processing, 61(5), Mar. 2013.
H. Sawada, H. Kameoka, S. Araki, and N. Ueda. Multichannel extensions of non-negative matrix factorization with complex-valued data. IEEE Transactions on Audio, Speech, and Language Processing, 21(5), May 2013.
P. Smaragdis. About this non-negative business. WASPAA keynote slides, 2013.

48 References V
P. Smaragdis and J. C. Brown. Non-negative matrix factorization for polyphonic music transcription. In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct. 2003.
V. Y. F. Tan and C. Févotte. Automatic relevance determination in nonnegative matrix factorization with the beta-divergence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(7), July 2013.
M. E. Tipping. Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research, 1, 2001.
R. E. Turner and M. Sahani. Time-frequency analysis as probabilistic inference. IEEE Transactions on Signal Processing, 62(23), Dec. 2014.
H. Wendt, D. Fagot, and C. Févotte. Jacobi algorithm for nonnegative matrix factorization with transform learning. In Proc. European Signal Processing Conference (EUSIPCO), Sep. 2018.
D. P. Wipf and B. D. Rao. Sparse Bayesian learning for basis selection. IEEE Transactions on Signal Processing, 52(8), Aug. 2004.
K. Yoshii, R. Tomioka, D. Mochihashi, and M. Goto. Infinite positive semidefinite tensor factorization for source separation of mixture signals. In Proc. International Conference on Machine Learning (ICML), 2013.
