Matrix Co-Factorization on Compressed Sensing

Jiho Yoo^1 and Seungjin Choi^{1,2}
^1 Department of Computer Science
^2 Division of IT Convergence Engineering
Pohang University of Science and Technology
San 31 Hyoja-dong, Nam-gu, Pohang, Korea
{zentasis, seungjin}@postech.ac.kr

Abstract

In this paper we address the problem of matrix factorization on compressively-sampled measurements which are obtained by random projections. While this approach improves the scalability of matrix factorization, its performance is not satisfactory. We present a matrix co-factorization method where compressed measurements and a small number of uncompressed measurements are jointly decomposed, sharing a factor matrix. We evaluate the performance of three matrix factorization methods in terms of Cramér-Rao bounds: (1) matrix factorization on uncompressed data (MF); (2) matrix factorization on compressed data (CS-MF); (3) matrix co-factorization on compressed and uncompressed data (CS-MCF). Numerical experiments demonstrate that CS-MCF improves the performance of CS-MF, emphasizing the useful behavior of exploiting side information (a small number of uncompressed measurements).

1 Introduction

Matrix factorization is a decomposition method where a data matrix $X \in \mathbb{R}^{M \times N}$ is approximated as a product of two or more factor matrices which uncover the latent structure of the data. The 2-factor decomposition seeks latent factor matrices $U \in \mathbb{R}^{M \times K}$ and $V \in \mathbb{R}^{N \times K}$ which minimize the objective function

    J_{\text{MF}} = \frac{1}{2} \| X - U V^\top \|_F^2,    (1)

where $\|\cdot\|_F$ denotes the Frobenius norm. When the columns of $X$ are treated as data points in $M$-dimensional space, $U$ and $V$ are referred to as the basis matrix and the encoding matrix, respectively. Exemplary matrix factorization methods include principal component analysis (PCA) [Jolliffe, 2002], nonnegative matrix factorization (NMF) [Lee and Seung, 1999], and probabilistic matrix factorization (PMF) [Srebro et al., 2005], to name a few. Matrix factorization has established itself as a powerful technique in various applications such as collaborative prediction [Koren et al., 2009], document clustering [Xu et al., 2003; Yoo and Choi, 2010], music transcription [Smaragdis and Brown, 2003], and brain wave analysis [Lee and Choi, 2009].

In some of these applications, such as collaborative prediction, the data matrix is sparse, since most of its entries are missing or unobserved. Matrix factorization methods can then be applied to large-scale problems without much extra care, since only the observed elements are used for learning. On the other hand, the data matrix is dense in applications where spectrograms of music or brain wave data are analyzed, and in such applications it is not easy to handle large-scale data.

Compressed sensing is a new sensing/sampling paradigm which ensures near-optimal recovery of sparse signals from a small number of linear measurements [Donoho, 2006; Candes and Tao, 2006]. Compressed sensing is a promising approach to handling a large-scale dense data matrix, since it allows the processing (learning or data analysis) to be performed in the compressed domain, in which the compressively-sampled matrix is much smaller in size than the original input data matrix, rather than in the ambient space. Let $Y \in \mathbb{R}^{P \times N}$ be the compressively-sampled data matrix obtained by $Y = \Phi X$, in which $\Phi \in \mathbb{R}^{P \times M}$ is a linear sensing matrix, often given by a random projection. A direct application of the compressed sensing framework to matrix factorization, referred to as CS-MF, yields the objective function

    J_{\text{CS-MF}} = \frac{1}{2} \| Y - \Phi U V^\top \|_F^2.    (2)
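As a concrete illustration of the measurement model $Y = \Phi X$ used in (2), the following sketch (ours, not the authors' code; the sizes are illustrative, since the paper leaves them generic) forms a random Gaussian sensing matrix and compresses a dense low-rank data matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: M ambient dim, N samples, K latent dim, P < M compressed dim.
M, N, K, P = 100, 100, 20, 50

# Low-rank ground truth X = U V^T plus a small amount of noise.
U_true = rng.normal(size=(M, K))
V_true = rng.normal(size=(N, K))
X = U_true @ V_true.T + 0.01 * rng.normal(size=(M, N))

# Random projection (sensing matrix) and compressed measurements.
Phi = rng.normal(size=(P, M))
Y = Phi @ X                      # P x N, smaller than the M x N input

print(X.shape, Y.shape)          # (100, 100) (50, 100)
```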
CS-MF determines $U$ and $V$ which minimize (2), improving scalability since $Y$ and $\Phi$ together require less storage than $X$. Recently, nonnegative matrix factorization (NMF) in the compressed domain, referred to as CS-NMF, was presented in [O'Grady and Rickard, 2008], where (2) is minimized with nonnegativity constraints imposed on the factor matrices $U$ and $V$ as well as on the data matrix $X$. In contrast to the standard compressed sensing framework, where $U$ is known in advance, CS-MF and CS-NMF estimate both $U$ and $V$ in the compressed domain, which degrades the reconstruction performance.

Matrix co-factorization (MCF) is a technique that decomposes two or more data matrices jointly, sharing some factor matrices across the factorizations. In addition to the data matrix, side information matrices are also considered in the joint decomposition, including label information [Yu et al., 2005; Lee et al., 2010], link information [Zhu et al., 2007], inter-subject variations [Lee and Choi, 2009], and spectrograms of music [Yoo et al., 2010]. MCF has also been applied to general entity-relationship models [Singh and Gordon, 2008; Yoo and Choi, 2009] to learn the characteristics of each entity from the relationships.

In this paper we present a method of MCF on compressively-sampled data, referred to as CS-MCF, where we jointly decompose the compressed data matrix and a small portion of the uncompressed original data in order to improve the performance of matrix factorization in the compressed domain. We compute the Cramér-Rao bound (CRB) for matrix factorization on uncompressed data (MF), CS-MF, and CS-MCF in the case of Gaussian likelihood, showing that CS-MCF indeed improves the performance over CS-MF in terms of reconstruction quality. We consider alternating least squares (ALS) algorithms for MF, CS-MF, and CS-MCF. In addition, we also consider multiplicative updates, as in NMF, comparing NMF, CS-NMF (NMF in the compressed domain), and CS-NMCF (nonnegative matrix co-factorization in the compressed domain with a small amount of uncompressed information).

2 Matrix Co-Factorization on Compressively Sampled Data

Compressed sensing [Donoho, 2006; Candes and Tao, 2006] is a new framework which performs measurement and compression simultaneously to reduce the computation required in the measurement process. The fundamental theoretical results of compressed sensing are based on the sparsity of the representation of the given data, which makes it possible to reduce the required number of measurements below the amount dictated by the classical Shannon-Nyquist sampling theorem. Compressed sensing assumes that the basis which sparsely represents the given data is known; however, an appropriate representation basis is usually not known in advance. In such cases we have to learn a representation basis from the data, yet the problem of learning a representation basis from compressed measurements has not been studied much.

Matrix factorization in the compressed domain considers a decomposition of the compressively-sampled data matrix $Y = \Phi X$, where $\Phi \in \mathbb{R}^{P \times M}$ is a random projection. Thus, given $Y$ and $\Phi$, CS-MF seeks $Y \approx \Phi U V^\top$, determining factor matrices $U$ and $V$ which minimize (2). In contrast to the compressed sensing framework, where the basis matrix $U$ is known in advance, CS-MF has to estimate both $U$ and $V$ from the compressed data $Y$. We believe that this explains the poor reconstruction performance of CS-MF in our experiments.

Now we explain matrix co-factorization in the compressed domain, which exploits partial uncompressed data. We partition the data matrix $X \in \mathbb{R}^{M \times N}$ into two sub-matrices $X_c \in \mathbb{R}^{M \times C}$ and $X_u \in \mathbb{R}^{M \times (N-C)}$. Assuming that $C > (N-C)$, the sub-matrix $X_c$ is compressed by a random projection $\Phi$, leading to $Y = \Phi X_c$ with a slight abuse of notation. CS-MCF considers a joint decomposition of the compressed data matrix $Y$ and the partial uncompressed data $X_u$, minimizing the following objective function:

    J_{\text{CS-MCF}} = \frac{1}{2} \| Y - \Phi U V^\top \|_F^2 + \frac{\lambda}{2} \| X_u - U W^\top \|_F^2,    (3)

where the factor matrix $U \in \mathbb{R}^{M \times K}$ is shared, capturing the common characteristics in representing both $Y$ and $X_u$, and $V \in \mathbb{R}^{C \times K}$ and $W \in \mathbb{R}^{(N-C) \times K}$ capture individual characteristics. The parameter $\lambda$ controls the balance between the two decompositions.

Figure 1: Pictorial illustration of CS-MCF. The input data matrix $X$ is partitioned into $X_c$ (to be compressed) and $X_u$ (partial uncompressed data). Then $Y = \Phi X_c$ and $X_u$ are jointly decomposed, sharing the factor matrix $U$, leading to $Y \approx \Phi U V^\top$ and $X_u \approx U W^\top$.
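To make the objective (3) concrete, here is a minimal sketch (ours; the function name and argument names are illustrative) that evaluates $J_{\text{CS-MCF}}$ for given factors:

```python
import numpy as np

def j_cs_mcf(Y, Phi, X_u, U, V, W, lam):
    """Objective (3): 0.5*||Y - Phi U V^T||_F^2 + 0.5*lam*||X_u - U W^T||_F^2."""
    r_c = Y - Phi @ U @ V.T        # residual of the compressed part
    r_u = X_u - U @ W.T            # residual of the uncompressed part
    return 0.5 * np.sum(r_c**2) + 0.5 * lam * np.sum(r_u**2)
```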
A pictorial illustration of CS-MCF is given in Figure 1.

2.1 ALS Updates

ALS updates for $U$, $V$, and $W$ are easily derived by solving

    \frac{\partial J_{\text{CS-MCF}}}{\partial U} = 0, \quad \frac{\partial J_{\text{CS-MCF}}}{\partial V} = 0, \quad \frac{\partial J_{\text{CS-MCF}}}{\partial W} = 0,

for $U$, $V$, $W$, respectively, where the corresponding gradients of (3) are

    \frac{\partial J}{\partial U} = -\Phi^\top Y V + \Phi^\top \Phi U V^\top V - \lambda X_u W + \lambda U W^\top W,
    \frac{\partial J}{\partial V} = -Y^\top \Phi U + V U^\top \Phi^\top \Phi U,
    \frac{\partial J}{\partial W} = -X_u^\top U + W U^\top U,

leading to the ALS updates

    \mathrm{vec}(U) \leftarrow \left( (V^\top V) \otimes (\Phi^\top \Phi) + \lambda (W^\top W) \otimes I_M \right)^{-1} \mathrm{vec}(\Phi^\top Y V + \lambda X_u W),
    V \leftarrow Y^\top \Phi U (U^\top \Phi^\top \Phi U)^{-1},
    W \leftarrow X_u^\top U (U^\top U)^{-1},

where $I_M$ is the identity matrix of size $M \times M$, $\mathrm{vec}(U)$ is the vec operator which creates a column vector from the matrix $U$ by stacking the columns of $U$, and $\otimes$ denotes the Kronecker product (tensor product).
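A minimal NumPy sketch of one round of these ALS updates (our illustration under the notation of Section 2; no convergence checks or regularization are included):

```python
import numpy as np

def als_step(Y, Phi, X_u, U, V, W, lam):
    """One round of the ALS updates for CS-MCF (Section 2.1)."""
    M, K = U.shape

    # Update U: solve ((V^T V) kron (Phi^T Phi) + lam (W^T W) kron I_M) vec(U)
    #           = vec(Phi^T Y V + lam X_u W), with column-stacking vec.
    A = np.kron(V.T @ V, Phi.T @ Phi) + lam * np.kron(W.T @ W, np.eye(M))
    b = (Phi.T @ Y @ V + lam * X_u @ W).flatten(order="F")
    U = np.linalg.solve(A, b).reshape((M, K), order="F")

    # Update V and W by ordinary least squares.
    PhiU = Phi @ U
    V = Y.T @ PhiU @ np.linalg.inv(PhiU.T @ PhiU)
    W = X_u.T @ U @ np.linalg.inv(U.T @ U)
    return U, V, W
```

Solving the $MK \times MK$ system densely is the costly step of this update; in practice one would exploit the Kronecker structure or use an iterative solver.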

Table 1: Algorithm description for ALS updates and multiplicative updates (all divisions are element-wise).

ALS updates:
    MF:      $U \leftarrow X V (V^\top V)^{-1}$;  $V \leftarrow X^\top U (U^\top U)^{-1}$
    CS-MF:   $\mathrm{vec}(U) \leftarrow ((V^\top V) \otimes (\Phi^\top \Phi))^{-1} \mathrm{vec}(\Phi^\top Y V)$;  $V \leftarrow Y^\top \Phi U (U^\top \Phi^\top \Phi U)^{-1}$
    CS-MCF:  $\mathrm{vec}(U) \leftarrow ((V^\top V) \otimes (\Phi^\top \Phi) + \lambda (W^\top W) \otimes I_M)^{-1} \mathrm{vec}(\Phi^\top Y V + \lambda X_u W)$;
             $V \leftarrow Y^\top \Phi U (U^\top \Phi^\top \Phi U)^{-1}$;  $W \leftarrow X_u^\top U (U^\top U)^{-1}$

Multiplicative updates:
    NMF:     $U \leftarrow U \odot \frac{X V}{U V^\top V}$;  $V \leftarrow V \odot \frac{X^\top U}{V U^\top U}$
    CS-NMF:  $U \leftarrow U \odot \frac{\Phi^\top Y V}{\Phi^\top \Phi U V^\top V}$;  $V \leftarrow V \odot \frac{Y^\top \Phi U}{V U^\top \Phi^\top \Phi U}$
    CS-NMCF: $U \leftarrow U \odot \frac{\Phi^\top Y V + \lambda X_u W}{\Phi^\top \Phi U V^\top V + \lambda U W^\top W}$;
             $V \leftarrow V \odot \frac{Y^\top \Phi U}{V U^\top \Phi^\top \Phi U}$;  $W \leftarrow W \odot \frac{X_u^\top U}{W U^\top U}$

2.2 Multiplicative Updates

As in NMF, we derive multiplicative updates for $U$, $V$, $W$ which iteratively minimize (3) with nonnegativity constraints imposed on the factor matrices, given a nonnegative data matrix. Our derivation follows the technique used in [Yoo and Choi, 2008], where the gradient of an objective function $J$ is decomposed as

    \frac{\partial J}{\partial \Theta} = \left[ \frac{\partial J}{\partial \Theta} \right]^+ - \left[ \frac{\partial J}{\partial \Theta} \right]^-,

where $\left[ \frac{\partial J}{\partial \Theta} \right]^+ > 0$ and $\left[ \frac{\partial J}{\partial \Theta} \right]^- > 0$, and then multiplicative updates for the parameters $\Theta$ are given by

    \Theta \leftarrow \Theta \odot \frac{\left[ \frac{\partial J}{\partial \Theta} \right]^-}{\left[ \frac{\partial J}{\partial \Theta} \right]^+},

where $\odot$ denotes the Hadamard product (element-wise product) and the division is done in an element-wise manner. Applying this technique to the minimization of (3) yields

    U \leftarrow U \odot \frac{\Phi^\top Y V + \lambda X_u W}{\Phi^\top \Phi U V^\top V + \lambda U W^\top W},
    V \leftarrow V \odot \frac{Y^\top \Phi U}{V U^\top \Phi^\top \Phi U},
    W \leftarrow W \odot \frac{X_u^\top U}{W U^\top U}.

ALS and multiplicative updating algorithms for MF and CS-MF are derived in the same way and are summarized in Table 1.

3 Cramér-Rao Bounds

We calculate the Cramér-Rao bound (CRB) to evaluate the performance of matrix factorization, CS-MF, and CS-MCF. The CRB [Kay, 1993] provides a lower bound on the covariance of unbiased estimators of a deterministic parameter, in the following form:

    E\left[ (\hat\theta - \theta)(\hat\theta - \theta)^\top \right] \succeq I^{-1},    (4)

where $\hat\theta$ is the estimated parameter of the model, $\theta$ is the true value of the parameter, $A \succeq B$ denotes that $A - B$ is positive semi-definite, and $I$ is the Fisher information matrix, each element of which is computed as

    I_{ij} = -E\left[ \frac{\partial^2 \log p(x|\theta)}{\partial \theta_i \partial \theta_j} \right],    (5)

where $p(x|\theta)$ is the likelihood. In the case of matrix factorization, the factor matrices $U \in \mathbb{R}^{M \times K}$ and $V \in \mathbb{R}^{N \times K}$ are the model parameters, so $\theta$ becomes

    \theta = [\mathrm{vec}(U^\top)^\top, \mathrm{vec}(V^\top)^\top]^\top = [u_1^\top, u_2^\top, \ldots, u_M^\top, v_1^\top, v_2^\top, \ldots, v_N^\top]^\top,

where $u_i$ and $v_j$ are the column vectors consisting of the elements in the $i$-th row of $U$ and the $j$-th row of $V$, respectively. The Fisher information matrix then has the block structure

    I = \begin{bmatrix} I_U & I_{UV} \\ I_{UV}^\top & I_V \end{bmatrix},

whose blocks are in turn composed of $K \times K$ sub-blocks $I_{u_i u_{i'}}$, $I_{u_i v_j}$, and $I_{v_j v_{j'}}$, each computed for the corresponding pair of parameters; for example,

    [I_{u_i}]_{k,k'} = [I_{u_i u_i}]_{k,k'} = -E\left[ \frac{\partial^2 \log p(X|U,V)}{\partial u_{ik} \partial u_{ik'}} \right],    (6), (7)

where $[A]_{i,j}$ denotes the $(i,j)$-th element of the matrix $A$. The Fisher information matrix of CS-MCF has a different parametrization, with the matrices $U$, $V$, and $W$:

    I = \begin{bmatrix} I_U & I_{UV} & I_{UW} \\ I_{UV}^\top & I_V & I_{VW} \\ I_{UW}^\top & I_{VW}^\top & I_W \end{bmatrix},
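A compact NumPy sketch of the CS-NMCF multiplicative updates of Section 2.2 (our illustration; the small constant `eps` added to the denominators for numerical safety is ours, not discussed in the paper):

```python
import numpy as np

def cs_nmcf_step(Y, Phi, X_u, U, V, W, lam, eps=1e-12):
    """One round of the multiplicative updates of Section 2.2 (CS-NMCF).

    Assumes Y, Phi, X_u, U, V, W are all nonnegative, so that the
    numerator/denominator split of the gradient stays positive and the
    updates preserve nonnegativity.
    """
    num_U = Phi.T @ Y @ V + lam * X_u @ W
    den_U = Phi.T @ Phi @ U @ (V.T @ V) + lam * U @ (W.T @ W) + eps
    U = U * num_U / den_U

    num_V = Y.T @ Phi @ U
    den_V = V @ (U.T @ Phi.T @ Phi @ U) + eps
    V = V * num_V / den_V

    num_W = X_u.T @ U
    den_W = W @ (U.T @ U) + eps
    W = W * num_W / den_W
    return U, V, W
```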

Table 2: Likelihood, Fisher information matrix (FIM), and Cramér-Rao lower bound for matrix factorization (MF), matrix factorization on compressed measurements (CS-MF), and matrix co-factorization on compressed measurements (CS-MCF). The graphical models relate the factors to the observations: $X$ is generated from $(U, V)$ in MF, $Y$ from $(U, V)$ in CS-MF, and $(Y, X_u)$ from $(U, V, W)$ in CS-MCF. Since $\Sigma$ and $\Sigma_a$ are scalar multiples of the identity, identity matrices of conforming size are understood in expressions such as $W^\top \Sigma_a^{-1} W$.

MF:
    Likelihood: $p(X|U,V) = \prod_j N(x_j | U v_j, \Sigma)$
    FIM: $I_U = (V^\top \Sigma^{-1} V) \otimes I_M$;  $I_V = (U^\top \Sigma^{-1} U) \otimes I_N$;
         $I_{UV}$ has $K \times K$ blocks $[I_{UV}]_{ij} = \sigma^{-2} v_j u_i^\top$
    CRB ($U$ known): $[I_V]^{-1}$;  CRB ($U$ unknown): inverse of the full $2 \times 2$ block FIM

CS-MF:
    Likelihood: $p(Y|U,V) = \prod_j N(y_j | \Phi U v_j, \Sigma_b)$
    FIM: $I_U = (V^\top V) \otimes (\Phi^\top \Sigma_b^{-1} \Phi)$;  $I_V = (U^\top \Phi^\top \Sigma_b^{-1} \Phi U) \otimes I_N$;
         $I_{UV}$ has blocks $[I_{UV}]_{ij} = v_j [\Phi^\top \Sigma_b^{-1} \Phi U]_{i\cdot}$
    CRB ($U$ known): $[I_V]^{-1}$;  CRB ($U$ unknown): inverse of the full $2 \times 2$ block FIM

CS-MCF:
    Likelihood: $p(Y, X_u|U,V,W) = \prod_j N(y_j | \Phi U v_j, \Sigma_b) \prod_k N(x_{u,k} | U w_k, \Sigma_a)$
    FIM: $I_U = (V^\top V) \otimes (\Phi^\top \Sigma_b^{-1} \Phi) + (W^\top \Sigma_a^{-1} W) \otimes I_M$;
         $I_V = (U^\top \Phi^\top \Sigma_b^{-1} \Phi U) \otimes I_C$;  $I_W = (U^\top \Sigma_a^{-1} U) \otimes I_{N-C}$;
         $I_{UV}$ has blocks $v_j [\Phi^\top \Sigma_b^{-1} \Phi U]_{i\cdot}$;  $I_{UW}$ has blocks $w_k [\Sigma_a^{-1} U]_{i\cdot}$;  $I_{VW} = 0$
    CRB ($U$ known): inverse of the $(V, W)$ block FIM;  CRB ($U$ unknown): inverse of the full $3 \times 3$ block FIM

Each element of these blocks is computed in a way similar to (6) and (7). The CRBs for matrix factorization, CS-MF, and CS-MCF are computed in the following subsections. We use the maximum likelihood estimator (MLE), which is known to be asymptotically unbiased and efficient under mild regularity conditions [Kay, 1993]. The analysis is based on these asymptotic properties, assuming that a sufficient number of samples is available. The models, the computed Fisher information matrices, and the form of the CRB are summarized in Table 2.

Since the solution of matrix factorization is not uniquely determined, a direct comparison between the estimated parameters and the ground-truth values is meaningless. Instead of comparing the parameters themselves, we use the expected reconstruction error

    E_{ij} = E\left[ (\hat{x}_{ij} - \tilde{x}_{ij})^2 \right] = E\left[ (u_i^\top v_j - \tilde{u}_i^\top \tilde{v}_j)^2 \right],

where $\tilde{x}_{ij}$, $\tilde{u}_i$, and $\tilde{v}_j$ are the ground-truth values of the parameters, and $\hat{x}_{ij} = u_i^\top v_j$ is the corresponding element of the input matrix reconstructed from the estimated parameters $u_i$ and $v_j$. Although the factor matrices are not uniquely determined, the reconstruction error $E_{ij}$ is the same for all possible decompositions. The lower bound on the reconstruction error is expressed via the variances of the estimators:

    E_{ij} = E\left[ (u_i^\top v_j - \tilde{u}_i^\top v_j + \tilde{u}_i^\top v_j - \tilde{u}_i^\top \tilde{v}_j)^2 \right]
           = E\left[ v_j^\top (u_i - \tilde{u}_i)(u_i - \tilde{u}_i)^\top v_j \right] + E\left[ \tilde{u}_i^\top (v_j - \tilde{v}_j)(v_j - \tilde{v}_j)^\top \tilde{u}_i \right] + 2 E\left[ v_j^\top (u_i - \tilde{u}_i)(v_j - \tilde{v}_j)^\top \tilde{u}_i \right].

If we assume that the estimator is unbiased, the last term above becomes zero, and the remaining terms are lower-bounded via the computed CRB,

    E_{ij} \ge E\left[ v_j^\top [I^{-1}]_{u_i} v_j \right] + \tilde{u}_i^\top [I^{-1}]_{v_j} \tilde{u}_i,    (8)

where $[I^{-1}]_{u_i}$ denotes the block of the inverse Fisher information matrix at the position of $I_{u_i}$.
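To get from (8) to the bound (9) below, the first expectation is expanded around the ground truth; a short derivation (our addition, using the standard identity for the expectation of a quadratic form) is:

```latex
% For an unbiased estimator v_j with mean \tilde{v}_j and covariance
% \mathrm{Cov}(v_j), the quadratic-form identity gives
%   E[v_j^\top A v_j] = \tilde{v}_j^\top A \tilde{v}_j + \operatorname{Tr}(A \,\mathrm{Cov}(v_j)).
% With A = [I^{-1}]_{u_i} (positive semi-definite) and the CRB
% \mathrm{Cov}(v_j) \succeq [I^{-1}]_{v_j}:
\begin{aligned}
E\!\left[ v_j^\top [I^{-1}]_{u_i} v_j \right]
  &= \tilde{v}_j^\top [I^{-1}]_{u_i} \tilde{v}_j
     + \operatorname{Tr}\!\left( [I^{-1}]_{u_i}\, \mathrm{Cov}(v_j) \right) \\
  &\ge \tilde{v}_j^\top [I^{-1}]_{u_i} \tilde{v}_j
     + \operatorname{Tr}\!\left( [I^{-1}]_{u_i}\, [I^{-1}]_{v_j} \right).
\end{aligned}
```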

The error bound then becomes

    E_{ij} \ge \tilde{v}_j^\top [I^{-1}]_{u_i} \tilde{v}_j + \operatorname{Tr}\left( [I^{-1}]_{u_i} [I^{-1}]_{v_j} \right) + \tilde{u}_i^\top [I^{-1}]_{v_j} \tilde{u}_i.    (9)

Note that if we assume the basis matrix $U$ is known in advance, we only have to deal with the parameter $V$. With a computation similar to the above, the lower bound on the reconstruction error takes the form

    E_{ij} \ge \tilde{u}_i^\top [I^{-1}]_{v_j} \tilde{u}_i,    (10)

which is smaller than the bound for the unknown-basis case by the amount of the terms corresponding to $U$, namely $\tilde{v}_j^\top [I^{-1}]_{u_i} \tilde{v}_j + \operatorname{Tr}( [I^{-1}]_{u_i} [I^{-1}]_{v_j} )$.

3.1 Cramér-Rao Bounds for Matrix Factorization

The Fisher information matrix is computed from the likelihood of the model. If we assume the noise is additive and Gaussian, the model is written as

    X = U V^\top + N,

where $N_{ij} \sim N(0, \sigma^2)$, and $N(\mu, \sigma^2)$ is the Gaussian distribution with mean $\mu$ and variance $\sigma^2$. The likelihood of the model then becomes

    p(X|U,V) = \prod_j N(x_j | U v_j, \sigma^2 I_M).

Calculation of $I_V$. The derivative of the log-likelihood with respect to $v_j$ is

    \frac{\partial \log p(X|U,V)}{\partial v_j} = U^\top \Sigma^{-1} x_j - U^\top \Sigma^{-1} U v_j,

where $\Sigma = \sigma^2 I_M$. If we differentiate again with respect to $v_{j'}$, the result is zero for $j' \neq j$; if $j' = j$, then

    \frac{\partial^2 \log p(X|U,V)}{\partial v_j \partial v_j^\top} = -U^\top \Sigma^{-1} U.

Therefore, the part of the Fisher information matrix for $V$ is

    I_V = (U^\top \Sigma^{-1} U) \otimes I_N.

Calculation of $I_U$. In this case, we use the likelihood of the model in the form

    p(X|U,V) = \prod_i N(\bar{x}_i | V u_i, \sigma^2 I_N),

where $\bar{x}_i$ represents the $i$-th row of the input matrix (as a column vector). The first derivative with respect to $u_i$ is

    \frac{\partial \log p(X|U,V)}{\partial u_i} = V^\top \Sigma^{-1} \bar{x}_i - V^\top \Sigma^{-1} V u_i,    (11)

where, with a slight abuse of notation, $\Sigma = \sigma^2 I_N$ in this row-wise view. Again, the second derivative is zero for $i' \neq i$, and for $i' = i$,

    \frac{\partial^2 \log p(X|U,V)}{\partial u_i \partial u_i^\top} = -V^\top \Sigma^{-1} V.

Therefore, the Fisher information matrix for $U$ is

    I_U = (V^\top \Sigma^{-1} V) \otimes I_M.

Calculation of $I_{UV}$. To compute $I_{UV}$, we differentiate (11) with respect to $v_j$, which gives

    \frac{\partial^2 \log p(X|U,V)}{\partial u_i \partial v_j^\top} = \frac{1}{\sigma^2} \left( x_{ij} I_K - u_i^\top v_j I_K - v_j u_i^\top \right).

Since $E[x_{ij}] I_K = u_i^\top v_j I_K$, the first two terms vanish when we take the expectation. Therefore, the Fisher information for $U$ and $V$ consists of the $K \times K$ blocks

    [I_{UV}]_{ij} = \sigma^{-2} v_j u_i^\top.

3.2 Cramér-Rao Bound for CS-MF

We model CS-MF with the following two-step procedure:

    X = U V^\top + N_a, \qquad Y = \Phi X + N_b,

where $[N_a]_{i,j} \sim N(0, \sigma_a^2)$ is the representation error and $[N_b]_{i,j} \sim N(0, \sigma_b^2)$ is the measurement error. The likelihood is

    p(Y|U,V) = \int p(Y|X)\, p(X|U,V)\, dX = \prod_j N(y_j | \Phi U v_j, \Sigma_b),

where $\Sigma_b = \sigma_a^2 \Phi \Phi^\top + \sigma_b^2 I_P$ and $\Phi \in \mathbb{R}^{P \times M}$.

Calculation of $I_V$. The derivative of the log-likelihood with respect to $v_j$ is

    \frac{\partial \log p(Y|U,V)}{\partial v_j} = U^\top \Phi^\top \Sigma_b^{-1} y_j - U^\top \Phi^\top \Sigma_b^{-1} \Phi U v_j.

Differentiating again with respect to $v_{j'}$ gives zero when $j' \neq j$, and if $j' = j$, then

    \frac{\partial^2 \log p(Y|U,V)}{\partial v_j \partial v_j^\top} = -U^\top \Phi^\top \Sigma_b^{-1} \Phi U.

Therefore, the Fisher information matrix for $V$ is

    I_V = (U^\top \Phi^\top \Sigma_b^{-1} \Phi U) \otimes I_N.
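As a numerical illustration of these formulas, the following sketch (ours; the small sizes are chosen arbitrarily) forms the per-row Fisher information block of MF for known ground-truth factors and evaluates the known-$U$ bound (10) for one entry:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K, sigma2 = 8, 10, 3, 0.1
U_true = rng.normal(size=(M, K))
V_true = rng.normal(size=(N, K))

# Per-row blocks of the MF Fisher information (Section 3.1):
# I_{v_j} = U^T Sigma^{-1} U is the same K x K block for every j,
# and I_V is block-diagonal in these blocks.
I_vj = U_true.T @ U_true / sigma2

# Known-U bound (10) on the reconstruction error of entry (i, j):
# E_ij >= u_i^T [I^{-1}]_{v_j} u_i.
i, j = 0, 0
u_i = U_true[i]
bound_ij = u_i @ np.linalg.inv(I_vj) @ u_i
print(f"CRB on E_ij with U known: {bound_ij:.4f}")
```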

Calculation of $I_U$. The derivative with respect to $u_i$ is written as

    \frac{\partial \log p(Y|U,V)}{\partial u_i} = \sum_j \left( v_j\, y_j^\top \Sigma_b^{-1} \phi_i - v_j\, \phi_i^\top \Sigma_b^{-1} \Phi U v_j \right),    (12)

where $\phi_i$ denotes the $i$-th column of $\Phi$. The derivative of this with respect to $u_{i'}$ simplifies to

    \frac{\partial^2 \log p(Y|U,V)}{\partial u_i \partial u_{i'}^\top} = -(\Phi^\top \Sigma_b^{-1} \Phi)_{i i'}\, V^\top V,

which leads to the Fisher information matrix for $U$:

    I_U = (V^\top V) \otimes (\Phi^\top \Sigma_b^{-1} \Phi).

Calculation of $I_{UV}$. If we differentiate (12) with respect to $v_j$, we obtain

    \frac{\partial^2 \log p(Y|U,V)}{\partial u_i \partial v_j^\top} = (\phi_i^\top \Sigma_b^{-1} y_j) I_K - (\phi_i^\top \Sigma_b^{-1} \Phi U v_j) I_K - v_j\, \phi_i^\top \Sigma_b^{-1} \Phi U.

The first two terms disappear when we take the expectation, so the Fisher information for $U$ and $V$ consists of the $K \times K$ blocks

    [I_{UV}]_{ij} = v_j\, [\Phi^\top \Sigma_b^{-1} \Phi U]_{i\cdot}.

The theoretical bound of CS-MF is larger than that of matrix factorization. The main difference in the computed Fisher information matrices is the change from the term $\Sigma^{-1}$ of matrix factorization to the term $\Phi^\top \Sigma_b^{-1} \Phi$. Note that $\Sigma^{-1}$ in matrix factorization is $\Sigma^{-1} = \sigma_a^{-2} I_M$, whereas $\Phi^\top \Sigma_b^{-1} \Phi$ in CS-MF is

    \Phi^\top \Sigma_b^{-1} \Phi = \Phi^\top (\sigma_a^2 \Phi \Phi^\top + \sigma_b^2 I_P)^{-1} \Phi.

This is usually smaller than the corresponding term in matrix factorization. For example, if we assume $\sigma_b^2 = 0$ and each element of the sensing matrix is drawn independently from $N(0,1)$, we can use the approximations $\Phi^\top \Phi \approx P I_M$ and $\Phi \Phi^\top \approx M I_P$, such that

    \Phi^\top \Sigma_b^{-1} \Phi \approx \Phi^\top (\sigma_a^2 M I_P)^{-1} \Phi = \frac{1}{M \sigma_a^2} \Phi^\top \Phi \approx \frac{P}{M} \sigma_a^{-2} I_M,

which is $P/M$ times smaller than the corresponding term $\sigma_a^{-2} I_M$ of matrix factorization. Since the CRB is calculated from the inverse of the Fisher information matrix, CS-MF has a larger CRB than matrix factorization.

3.3 Cramér-Rao Bound for CS-MCF

The relations in CS-MCF are modeled as in the previous section:

    X_c = U V^\top + N_a, \qquad Y = \Phi X_c + N_b, \qquad X_u = U W^\top + N_c,

where $[N_a]_{i,j} \sim N(0, \sigma_a^2)$, $[N_c]_{i,j} \sim N(0, \sigma_a^2)$, and $[N_b]_{i,j} \sim N(0, \sigma_b^2)$. In this case we have three parameters $U$, $V$, and $W$, so the Fisher information matrix is

    I = \begin{bmatrix} I_U & I_{UV} & I_{UW} \\ I_{UV}^\top & I_V & I_{VW} \\ I_{UW}^\top & I_{VW}^\top & I_W \end{bmatrix}.

The likelihood is

    p(Y, X_u|U,V,W) = p(Y|U,V)\, p(X_u|U,W) = \prod_j N(y_j | \Phi U v_j, \Sigma_b) \prod_l N(x_{u,l} | U w_l, \Sigma_a),

where $\Sigma_b = \sigma_a^2 \Phi \Phi^\top + \sigma_b^2 I_P$ and $\Sigma_a = \sigma_a^2 I_M$.

Calculation of $I_U$. Differentiating the log-likelihood with respect to $u_i$ gives

    \frac{\partial \log p(Y, X_u|U,V,W)}{\partial u_i} = \sum_j \left( v_j\, y_j^\top \Sigma_b^{-1} \phi_i - v_j\, \phi_i^\top \Sigma_b^{-1} \Phi U v_j \right) + \sigma_a^{-2} \left( W^\top \bar{x}_{u,i} - W^\top W u_i \right),    (13)

where $\bar{x}_{u,i}$ denotes the $i$-th row of $X_u$ (as a column vector). Differentiating again with respect to $u_{i'}$ yields

    \frac{\partial^2 \log p(Y, X_u|U,V,W)}{\partial u_i \partial u_i^\top} = -(\Phi^\top \Sigma_b^{-1} \Phi)_{ii}\, V^\top V - \sigma_a^{-2} W^\top W

for $i' = i$, and

    \frac{\partial^2 \log p(Y, X_u|U,V,W)}{\partial u_i \partial u_{i'}^\top} = -(\Phi^\top \Sigma_b^{-1} \Phi)_{i i'}\, V^\top V

for $i' \neq i$. Therefore, the Fisher information matrix for $U$ becomes

    I_U = (V^\top V) \otimes (\Phi^\top \Sigma_b^{-1} \Phi) + \sigma_a^{-2} (W^\top W) \otimes I_M.

Calculation of $I_V$, $I_W$, $I_{UV}$, and $I_{UW}$. These blocks are computed in a similar way as for matrix factorization and CS-MF, in the following forms:

    I_V = (U^\top \Phi^\top \Sigma_b^{-1} \Phi U) \otimes I_C,
    I_W = (U^\top \Sigma_a^{-1} U) \otimes I_{N-C},
    [I_{UV}]_{ij} = v_j\, [\Phi^\top \Sigma_b^{-1} \Phi U]_{i\cdot},
    [I_{UW}]_{il} = w_l\, [\Sigma_a^{-1} U]_{i\cdot}.
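The $P/M$ scaling above is easy to verify numerically; a quick check (our illustration) with a Gaussian sensing matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
P, M, sigma_a2 = 50, 100, 1.0

Phi = rng.normal(size=(P, M))
# With sigma_b^2 = 0: G = Phi^T (sigma_a^2 Phi Phi^T)^{-1} Phi.
G = Phi.T @ np.linalg.inv(sigma_a2 * (Phi @ Phi.T)) @ Phi
# The mean of the diagonal of G matches the predicted (P/M) / sigma_a^2
# (in fact tr(G) = P / sigma_a^2 exactly); G is approximately that times I_M.
print(np.mean(np.diag(G)), P / (M * sigma_a2))
```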

7 SNR (db) U-known MF CS-MF CS-MCF U-unknown MF CS-MF CS-MCF SNR (db) U-known NMF CS-NMF CS-NMCF U-unknown NMF CS-NMF CS-NMCF 0 and variance 1. For the multiplicative update algorithms which have the nonnegativity constraints, each element is sampled from a uniform distriution etween 0 and 1. For the co-factorizations, 20 out of 100 columns were chosen for the uncompressed prior. As a measure of the reconstruction performances, we used signal-to-noise ratio (SNR) 2 F SNR = 10log 10, UV 2 F (a) Figure 2: SNR computed from the CRB (dark) and SNR achieved from the algorithms (light), for the (a) ALS algorithms with Gaussian data () Multiplicative algorithms with nonnegative data. In each graph, left three are for the case of known representation U, and right three are for the case of unknownu. Calculation ofi V W Fisher information matrix forv andw,i V W, ecomes zero matrix of dimension CK (N C)K ecause the term W anishes in the derivative with respect tov j. The Fisher information matrix of CS-MCF is the comination of the Fisher information matrix of matrix factorization and Fisher information matrix of CS-MF. The parameter V in matrix factorization and CS-MF is divided into V and W in CS-MCF. The I V and I UV in CS-MCF follows the form of the corresponding parts of CS-MF, ut the I W and I UW follows the form of the corresponding parts of matrix factorization. Moreover, I U is the direct comination of the I U of matrix factorization andi U of CS-MF. In the previous section, we showed that in general the CRB of matrix factorization is smaller than the CRB of CS-MF, so this kind of comination leads smaller CRB than that of CS-MF. 4 Numerical Experiments For the numerical experiments, we computed the ounds of the reconstruction performances using the equation (9), and compared them with the actual reconstruction performances of the algorithms listed in the Tale 1. We also computed the ounds for the cases where we assume that the parameter U is known in advance (10). In that cases, the actual reconstruction performances were measured ased on the modified algorithms, where the parameter U is fixed with the groundtruth values. To compute the CRB, we have to know the ground-truth value of the parameters U, V, and W, which makes it difficult to use the real-world data where the ground-truth factor matrices are unknown. Therefore, we used synthetic data for the experiments. The factor matrices U R , V R and the sensing matrix Φ R are randomly generated, and the input matrices were drawn from them with additive Gaussian noise. For the ALS-ased algorithms, each element of the factor matrices and the sensing matrix is sampled from the Gaussian distriution with mean () where is the true value of the input matrix, U and V are the factor matrices otained y running the algorithms. The SNRs were averaged over the 100 different trials (Figure 2). The ounds were etter for the cases where the matrix U is known in advance, than the cases where the matrix U is unknown. This indicates that the estimation of oth U and V is more difficult than the estimation of V only. The ounds of CS-MF and CS-NMF were seriously degraded compared to the ounds of MF and NMF, as well as the actual performances, especially for the cases that the matrix U is unknown. However, the co-factorization ased methods showed much improved ounds, as well as the performances, compared to CS-MF and CS-NMF. 
The use of the uncompressed prior as side information actually helps to improve the reconstruction performance when learning from compressed data. As an illustrative example, we ran the algorithms on compressively-sampled data from the cameraman image (Figure 3). We divided the image into 8-by-8 patches to build an input matrix and compressed each column to a smaller size; a portion of the data was randomly selected to be used for the uncompressed prior. CS-MCF brought a much better reconstruction result than CS-MF.

5 Conclusions

We have presented a method for matrix co-factorization in the compressed domain, where compressively-sampled data and partial uncompressed data are jointly decomposed, sharing a factor matrix. We calculated CRBs for three matrix factorization methods (MF, CS-MF, CS-MCF) in the case of Gaussian likelihood, showing that CS-MCF improves the reconstruction performance over CS-MF. Numerical experiments on images confirmed the better performance of CS-MCF over CS-MF.

Acknowledgments: This work was supported by the NIPA-MSRA Creative IT/SW Research Project, the Converging Research Center Program funded by Korea MEST (No. 2010K001171), the NIPA ITRC Support Program, and the NRF WCU Program.

References

[Candes and Tao, 2006] E. Candes and T. Tao. Near optimal signal recovery from random projections: Universal encoding strategies? IEEE Transactions on Information Theory, 52(12):5406-5425, 2006.

Figure 3: Reconstruction example of an image: (a) original image; (b) the reconstruction result of CS-MF; (c) the reconstruction result of CS-MCF; (d) the reconstruction result of CS-NMF; (e) the reconstruction result of CS-NMCF.

[Donoho, 2006] D. L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289-1306, 2006.

[Jolliffe, 2002] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, 2nd edition, 2002.

[Kay, 1993] S. M. Kay. Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall, 1993.

[Koren et al., 2009] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30-37, 2009.

[Lee and Choi, 2009] H. Lee and S. Choi. Group nonnegative matrix factorization for EEG classification. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), Clearwater Beach, Florida, 2009.

[Lee and Seung, 1999] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788-791, 1999.

[Lee et al., 2010] H. Lee, J. Yoo, and S. Choi. Semi-supervised nonnegative matrix factorization. IEEE Signal Processing Letters, 17(1):4-7, 2010.

[O'Grady and Rickard, 2008] P. D. O'Grady and S. T. Rickard. Non-negative matrix factorisation of compressively sampled non-negative signals. In Proceedings of the 8th IMA International Conference on Mathematics in Signal Processing, Cirencester, UK, 2008.

[Singh and Gordon, 2008] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Las Vegas, Nevada, 2008.

[Smaragdis and Brown, 2003] P. Smaragdis and J. C. Brown. Non-negative matrix factorization for polyphonic music transcription. In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, 2003.

[Srebro et al., 2005] N. Srebro, J. D. M. Rennie, and T. Jaakkola. Maximum-margin matrix factorization. In Advances in Neural Information Processing Systems (NIPS), volume 17. MIT Press, 2005.

[Xu et al., 2003] W. Xu, X. Liu, and Y. Gong. Document clustering based on non-negative matrix factorization. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Toronto, Canada, 2003.

[Yoo and Choi, 2008] J. Yoo and S. Choi. Orthogonal nonnegative matrix factorization: Multiplicative updates on Stiefel manifolds. In Proceedings of the 9th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL), Daejeon, Korea, 2008.

[Yoo and Choi, 2009] J. Yoo and S. Choi. Weighted nonnegative matrix co-tri-factorization for collaborative prediction. In Proceedings of the 1st Asian Conference on Machine Learning (ACML), Nanjing, China, 2009.

[Yoo and Choi, 2010] J. Yoo and S. Choi. Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on Stiefel manifolds. Information Processing and Management, 46(5), 2010.

[Yoo et al., 2010] J. Yoo, M. Kim, K. Kang, and S. Choi. Nonnegative matrix partial co-factorization for drum source separation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, Texas, 2010.

[Yu et al., 2005] K. Yu, S. Yu, and V. Tresp. Multi-label informed latent semantic indexing. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Salvador, Brazil, 2005.

[Zhu et al., 2007] S. Zhu, K. Yu, Y. Chi, and Y. Gong. Combining content and link for classification using matrix factorization. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Amsterdam, The Netherlands, 2007.


More information

Two-Dimensional Linear Discriminant Analysis

Two-Dimensional Linear Discriminant Analysis Two-Dimensional Linear Discriminant Analysis Jieping Ye Department of CSE University of Minnesota jieping@cs.umn.edu Ravi Janardan Department of CSE University of Minnesota janardan@cs.umn.edu Qi Li Department

More information

Collaborative Filtering via Ensembles of Matrix Factorizations

Collaborative Filtering via Ensembles of Matrix Factorizations Collaborative Ftering via Ensembles of Matrix Factorizations Mingrui Wu Max Planck Institute for Biological Cybernetics Spemannstrasse 38, 72076 Tübingen, Germany mingrui.wu@tuebingen.mpg.de ABSTRACT We

More information

Simplicial Nonnegative Matrix Tri-Factorization: Fast Guaranteed Parallel Algorithm

Simplicial Nonnegative Matrix Tri-Factorization: Fast Guaranteed Parallel Algorithm Simplicial Nonnegative Matrix Tri-Factorization: Fast Guaranteed Parallel Algorithm Duy-Khuong Nguyen 13, Quoc Tran-Dinh 2, and Tu-Bao Ho 14 1 Japan Advanced Institute of Science and Technology, Japan

More information

MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION

MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION Ehsan Hosseini Asl 1, Jacek M. Zurada 1,2 1 Department of Electrical and Computer Engineering University of Louisville, Louisville,

More information

Large-scale Ordinal Collaborative Filtering

Large-scale Ordinal Collaborative Filtering Large-scale Ordinal Collaborative Filtering Ulrich Paquet, Blaise Thomson, and Ole Winther Microsoft Research Cambridge, University of Cambridge, Technical University of Denmark ulripa@microsoft.com,brmt2@cam.ac.uk,owi@imm.dtu.dk

More information

Class Mean Vector Component and Discriminant Analysis for Kernel Subspace Learning

Class Mean Vector Component and Discriminant Analysis for Kernel Subspace Learning 1 Class Mean Vector Component and Discriminant Analysis for Kernel Suspace Learning Alexandros Iosifidis Department of Engineering, ECE, Aarhus University, Denmark alexandros.iosifidis@eng.au.dk arxiv:1812.05988v1

More information

Gaussian Processes for Big Data. James Hensman

Gaussian Processes for Big Data. James Hensman Gaussian Processes for Big Data James Hensman Overview Motivation Sparse Gaussian Processes Stochastic Variational Inference Examples Overview Motivation Sparse Gaussian Processes Stochastic Variational

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

AN ENHANCED INITIALIZATION METHOD FOR NON-NEGATIVE MATRIX FACTORIZATION. Liyun Gong 1, Asoke K. Nandi 2,3 L69 3BX, UK; 3PH, UK;

AN ENHANCED INITIALIZATION METHOD FOR NON-NEGATIVE MATRIX FACTORIZATION. Liyun Gong 1, Asoke K. Nandi 2,3 L69 3BX, UK; 3PH, UK; 213 IEEE INERNAIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEP. 22 25, 213, SOUHAMPON, UK AN ENHANCED INIIALIZAION MEHOD FOR NON-NEGAIVE MARIX FACORIZAION Liyun Gong 1, Asoke K. Nandi 2,3

More information

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Jeff Schneider The Robotics Institute

More information

TIGHT BOUNDS FOR THE FIRST ORDER MARCUM Q-FUNCTION

TIGHT BOUNDS FOR THE FIRST ORDER MARCUM Q-FUNCTION TIGHT BOUNDS FOR THE FIRST ORDER MARCUM Q-FUNCTION Jiangping Wang and Dapeng Wu Department of Electrical and Computer Engineering University of Florida, Gainesville, FL 3611 Correspondence author: Prof.

More information