Linear dimensionality reduction for data analysis

Nicolas Gillis. Joint work with Robert Luce, François Glineur, Stephen Vavasis, Robert Plemmons, and Gabriella Casalino.

The setup: dimensionality reduction for data analysis

Given a set of $n$ data points $m_j$ ($j = 1, 2, \ldots, n$), we would like to understand the underlying structure of this data. A fundamental and powerful tool is linear dimensionality reduction: find a set of $r$ basis vectors $u_k$ ($1 \le k \le r$) so that

$m_j \approx \sum_{k=1}^r u_k v_{kj}$ for all $j$,

for some weights $v_{kj}$. This is equivalent to a low-rank approximation of the matrix $M$:

$M = [m_1 \, m_2 \, \ldots \, m_n] \approx [u_1 \, u_2 \, \ldots \, u_r] \, [v_1 \, v_2 \, \ldots \, v_n] = UV.$
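As an illustration of the unconstrained case, here is a minimal sketch of the best rank-$r$ approximation in the Frobenius norm, obtained from the truncated SVD (Eckart-Young). The function name and the synthetic data are illustrative, not from the talk.

```python
# Minimal sketch: best rank-r approximation of M in the Frobenius norm via truncated SVD.
import numpy as np

def truncated_svd_approx(M, r):
    """Return U (m x r) and V (r x n) such that U @ V is the best rank-r approximation of M."""
    Um, s, Vt = np.linalg.svd(M, full_matrices=False)
    U = Um[:, :r] * s[:r]          # absorb the singular values into the basis vectors
    V = Vt[:r]                     # weights of each data point in that basis
    return U, V

rng = np.random.default_rng(0)
M = rng.standard_normal((100, 50))
U, V = truncated_svd_approx(M, r=5)
print(np.linalg.norm(M - U @ V, "fro"))   # approximation error ||M - UV||_F
```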

Constrained Low-Rank Matrix Approximations

How do we measure the error $M - UV$? For example, PCA/truncated SVD uses $\|X\|_F^2 = \sum_{i,j} X_{ij}^2$.

Which constraints should the factors satisfy, $U \in \Omega_U$ and $V \in \Omega_V$? For example, PCA imposes no constraints, while NMF requires $U \ge 0$ and $V \ge 0$.

Goal of this presentation: show some applications, present several models, and discuss some algorithms.

Recommender systems

In some cases, some entries are missing/unknown. For example, we would like to predict how much someone is going to like a movie based on their movie preferences (e.g., 1 to 5 stars). The data form a partially observed users-by-movies rating matrix, with "?" marking the unknown entries.

Huge potential in electronic commerce sites (movies, books, music, ...): good recommendations increase the propensity of a purchase.

Low-rank matrix approximations

The behavior of users is modeled as a linear combination of "feature users" (related to age, sex, culture, etc.):

$\underbrace{M(:, j)}_{\text{user } j} \approx \sum_{k=1}^r \underbrace{U(:, k)}_{\text{feature user } k} \underbrace{V(k, j)}_{\text{weights}}.$

Equivalently, movie ratings are modeled as linear combinations of "feature movies" (related to the genres: child oriented, serious vs. escapist, thriller, romantic, actors, etc.):

$\underbrace{M(i, :)}_{\text{movie } i} \approx \sum_{k=1}^r \underbrace{U(i, k)}_{\text{weights}} \underbrace{V(k, :)}_{\text{genre } k}.$

For example, using a rank-2 factorization of the Netflix dataset, female vs. male and serious vs. escapist behaviors were extracted. Koren, Bell, Volinsky, Matrix Factorization Techniques for Recommender Systems, IEEE Computer, 2009. Winners of the Netflix prize ($1,000,000).

PCA with weights and missing data

$\inf_{U \in \mathbb{R}^{m \times r},\, V \in \mathbb{R}^{n \times r}} \|M - UV^T\|_W^2 = \sum_{i,j} W_{ij} (M - UV^T)_{ij}^2,$  (WLRA)

where $W \ge 0$ is a weight matrix.

NP-hard in general (G., Glineur 2011). Ill-posed, e.g., $M = \begin{pmatrix} 1 & ? \\ 0 & 1 \end{pmatrix}$.

Convexification using the nuclear norm, $\min_X \|X\|_* + \|M - X\|_W^2$, recovers the solution from $O(nr \log^2 n)$ observations (Fazel 2002; Candès and Recht 2009), assuming incoherence.

Alternating/local minimization can lead to optimal solutions under similar assumptions (Keshavan, Montanari, Oh 2009; Jain, Netrapalli, Sanghavi 2013; Bhojanapalli, Neyshabur, Srebro 2016; Ge, Lee, Ma 2016). Riemannian optimization techniques (Boumal, Absil 2011).

See the YouTube video "Linear Inverse Problems" by Ankur Moitra (MIT).
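As an illustration, here is a minimal alternating least squares sketch for (WLRA) in the special case of a 0/1 weight matrix $W$ (i.e., matrix completion). It is not one of the cited algorithms; the function name, iteration count, and synthetic data are illustrative choices.

```python
# Minimal sketch: alternating least squares for weighted low-rank approximation (WLRA)
# with a 0/1 weight matrix W marking the observed entries of M.
import numpy as np

def wlra_als(M, W, r, n_iter=100, seed=0):
    """Heuristic for min_{U,V} sum_ij W_ij (M - U V^T)_ij^2 with W in {0,1}."""
    m, n = M.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    for _ in range(n_iter):
        # For each column j, fit V[j] by least squares on the observed entries of M(:, j).
        for j in range(n):
            obs = W[:, j] > 0
            if obs.any():
                V[j] = np.linalg.lstsq(U[obs], M[obs, j], rcond=None)[0]
        # Symmetrically, fit each row of U on the observed entries of M(i, :).
        for i in range(m):
            obs = W[i, :] > 0
            if obs.any():
                U[i] = np.linalg.lstsq(V[obs], M[i, obs], rcond=None)[0]
    return U, V

# Usage on a synthetic rank-2 matrix with roughly half of the entries observed.
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
W = (rng.random(M.shape) < 0.5).astype(float)
U, V = wlra_als(M, W, r=2)
print(np.linalg.norm(M - U @ V.T) / np.linalg.norm(M))   # relative error on all entries
```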

Background subtraction in a video sequence

Robust PCA

$\min_{X \in \mathbb{R}^{m \times n},\, U \in \mathbb{R}^{m \times r},\, V \in \mathbb{R}^{r \times n}} \|M - X\|_0 + \gamma \|X - UV^T\|.$  (RPCA)

Very similar developments as for PCA with missing data:

Convexification using the $\ell_1$ and nuclear norms, $\min_X \|X\|_* + \|M - X\|_1$ (Chandrasekaran, Sanghavi, Parrilo, Willsky 2011; Candès, Li, Ma, Wright 2011), provably recovers the solutions if the model holds ($M$ is low-rank + sparse + some noise).

Alternating minimization can lead to optimal solutions under similar assumptions (Netrapalli, Niranjan, Sanghavi, Anandkumar 2014).
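As an illustration of the low-rank + sparse model, here is a minimal alternating sketch: it is not one of the cited algorithms (in particular, not principal component pursuit); the rank r, the threshold tau, and the function name are illustrative choices.

```python
# Minimal alternating sketch: split M into a rank-r part L and a sparse part S by
# alternating a truncated SVD with entrywise hard-thresholding of the residual.
import numpy as np

def robust_pca_alternate(M, r, tau, n_iter=50):
    """Heuristic decomposition M ~ L + S with rank(L) <= r and S sparse."""
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # Low-rank step: best rank-r approximation of M - S (truncated SVD).
        Um, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (Um[:, :r] * s[:r]) @ Vt[:r]
        # Sparse step: keep only the entries of the residual larger than tau.
        R = M - L
        S = np.where(np.abs(R) > tau, R, 0.0)
    return L, S
```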

Complexity of Robust PCA

$\min_{u \in \mathbb{R}^m,\, v \in \mathbb{R}^n} \|M - uv^T\|_1 = \sum_{i,j} |M_{ij} - u_i v_j|.$  (rank-one RPCA)

NP-hard problem (G. and Vavasis 2018).

If $M$ is binary, $M \in \{0,1\}^{m \times n}$, any optimal solution $(u^*, v^*)$ can be assumed to be binary, that is, $(u^*, v^*) \in \{0,1\}^m \times \{0,1\}^n$.

MAX-CUT can be reduced to the binary $\ell_1$-norm rank-one approximation problem.

Blind hyperspectral unmixing

Figure: Urban hyperspectral image, 162 spectral bands and 307-by-307 pixels.

Problem. Identify the materials and classify the pixels.
Model. Linear mixing model.

Linear mixing model

Blind hyperspectral unmixing

Basis elements allow us to recover the different endmembers: $U \ge 0$.
Abundances of the endmembers in each pixel: $V \ge 0$.

Urban hyperspectral image

Figure: Decomposition of the Urban dataset.

Nonnegative Matrix Factorization (NMF)

Given a matrix $M \in \mathbb{R}_+^{p \times n}$ and a factorization rank $r < \min(p, n)$, find $U \in \mathbb{R}^{p \times r}$ and $V \in \mathbb{R}^{r \times n}$ such that

$\min_{U \ge 0,\, V \ge 0} \|M - UV\|_F^2 = \sum_{i,j} (M - UV)_{ij}^2.$  (NMF)

NMF is a linear dimensionality reduction technique for nonnegative data:

$\underbrace{M(:, i)}_{\ge 0} \approx \sum_{k=1}^r \underbrace{U(:, k)}_{\ge 0} \underbrace{V(k, i)}_{\ge 0}$ for all $i$.

Why nonnegativity?
Interpretability: nonnegativity constraints lead to easily interpretable factors (and a sparse and part-based representation).
Many applications: image processing, text mining, hyperspectral unmixing, community detection, clustering, etc.

Application 2: topic recovery and document classification

Basis elements allow us to recover the different topics;
Weights allow us to assign each text to its corresponding topics.

Standard NMF Algorithms

We would like to solve

$\min_{U \in \mathbb{R}^{m \times r},\, V \in \mathbb{R}^{r \times n}} \|M - UV\|_F^2$ such that $U \ge 0$, $V \ge 0$,  (NMF)

which is NP-hard in general (Vavasis, 2009).

Most NMF algorithms use a two-block coordinate descent scheme, since the subproblems are convex nonnegative least squares (NNLS) problems (a minimal sketch of one such scheme is given below):
(0) Select initial matrices $(U, V)$. Then repeat the following two steps:
(i) Fix $V$: find a new $U \ge 0$ such that $\|M - UV\|_F^2$ is reduced.
(ii) Fix $U$: find a new $V \ge 0$ such that $\|M - UV\|_F^2$ is reduced.

So far, there are no optimality guarantees for alternating optimization applied to NMF (under suitable conditions). Is there another way?
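Here is a minimal sketch of such a two-block scheme using the classical multiplicative updates of Lee and Seung (one of many possible NNLS-type updates, not necessarily the one used in the talk); the small constant eps and the iteration count are illustrative choices.

```python
# Minimal sketch: two-block coordinate descent for NMF via multiplicative updates.
# M is assumed nonnegative (entrywise).
import numpy as np

def nmf_multiplicative(M, r, n_iter=200, eps=1e-9, seed=0):
    """Return U >= 0, V >= 0 with M ~ U @ V."""
    m, n = M.shape
    rng = np.random.default_rng(seed)
    U = rng.random((m, r))
    V = rng.random((r, n))
    for _ in range(n_iter):
        # (i) Fix V, decrease ||M - UV||_F^2 over U >= 0.
        U *= (M @ V.T) / (U @ V @ V.T + eps)
        # (ii) Fix U, decrease ||M - UV||_F^2 over V >= 0.
        V *= (U.T @ M) / (U.T @ U @ V + eps)
    return U, V
```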

Separability Assumption

Separability of $M$: there exists an index set $K$ with $|K| = r$ and $V \ge 0$ such that $M = \underbrace{M(:, K)}_{U} V$.

[AGKM12] Arora, Ge, Kannan, Moitra, Computing a Nonnegative Matrix Factorization — Provably, STOC 2012.

Applications

In hyperspectral imaging, this is the pure-pixel assumption: for each material, there is a pure pixel containing only that material.
[M+14] Ma et al., A Signal Processing Perspective on Hyperspectral Unmixing: Insights from Remote Sensing, IEEE Signal Processing Magazine 31(1):67-81, 2014.

In document classification: for each topic, there is a pure word used only by that topic (an anchor word).
[A+13] Arora et al., A Practical Algorithm for Topic Modeling with Provable Guarantees, ICML 2013.

Time-resolved Raman spectra analysis: each substance has a peak in its spectrum while the other spectra are (close to) zero.
[L+16] Luce et al., Using Separable Nonnegative Matrix Factorization for the Analysis of Time-Resolved Raman Spectra, 2016.

Others: video summarization, foreground-background separation.
[ESV12] Elhamifar, Sapiro, Vidal, See All by Looking at a Few: Sparse Modeling for Finding Representative Objects, CVPR 2012.
[KSK13] Kumar, Sindhwani, Near-separable Non-negative Matrix Factorization with l1- and Bregman Loss Functions, SIAM Data Mining.

Geometric Interpretation

The columns of $U$ are the vertices of the convex hull of the columns of $M$:

$M(:, j) = \sum_{k=1}^r U(:, k) V(k, j)$ for all $j$, where $\sum_{k=1}^r V(k, j) = 1$ and $V \ge 0$.

With noise, the equality only holds approximately: $M(:, j) \approx \sum_{k=1}^r U(:, k) V(k, j)$.

Successive Projection Algorithm (SPA)

0: Initially $K = \emptyset$.
For $i = 1 : r$
  1: Find $j^* = \arg\max_j \|M(:, j)\|$.
  2: $K = K \cup \{j^*\}$.
  3: $M \leftarrow (I - uu^T) M$ where $u = M(:, j^*) / \|M(:, j^*)\|_2$.
end
This is modified Gram-Schmidt with column pivoting.

Theorem. If $\epsilon \le O\!\left(\frac{\sigma_{\min}(U)}{\sqrt{r}\,\kappa^2(U)}\right)$, SPA satisfies

$\max_{1 \le k \le r} \|U(:, k) - M(:, K(k))\| \le O\!\left(\epsilon\, \kappa^2(U)\right).$

Advantages. Extremely fast, no parameter.
Drawbacks. Requires $U$ to be full rank; the bound is weak.

[GV14] G., Vavasis, Fast and Robust Recursive Algorithms for Separable Nonnegative Matrix Factorization, IEEE Trans. Pattern Anal. Mach. Intell. 36(4), 2014.
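Here is a minimal sketch of the SPA steps listed above, together with a small synthetic usage example; the data construction and variable names are illustrative, not from the talk.

```python
# Minimal sketch of SPA: greedily pick the column of largest norm, then project all
# columns onto the orthogonal complement of the selected one.
import numpy as np

def spa(M, r):
    """Return r column indices K such that M(:, K) approximates the separable basis U."""
    R = M.astype(float).copy()
    K = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))   # step 1: column of largest norm
        K.append(j)                                      # step 2: add it to the index set
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)                          # step 3: project onto orthogonal complement
    return K

# Usage on a synthetic separable matrix M = [U, U V'] (the columns of U are hidden among those of M).
rng = np.random.default_rng(0)
U = rng.random((50, 5))
Vp = rng.random((5, 40)); Vp /= Vp.sum(axis=0)           # convex combinations of the columns of U
M = np.hstack([U, U @ Vp])
print(sorted(spa(M, 5)))                                  # should recover the indices 0..4
```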

Numerical results for the Urban HSI

Minimum-volume NMF: Relaxing separability

Separable NMF can be written as

$\min_{K,\, V \ge 0} \|M - M(:, K) V\|_F^2$ such that $|K| = r$.

Minimum-volume NMF relaxes this to

$\min_{U \ge 0,\, V \ge 0} \mathrm{vol}(U)$ such that $\|M - UV\|_F^2 \le \epsilon$,

where $\mathrm{vol}(U) = \det(U^T U)$ and $V(:, j) \in \Delta^r$ for all $j$.

Open problems: efficient algorithms for min-vol NMF, robustness to noise.

Fu, Huang, Sidiropoulos, Ma, Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications, IEEE Signal Processing Magazine, 2019.

Sequential NMF with underapproximations

It is possible to solve NMF sequentially, solving at each step

$\min_{u \ge 0,\, v \ge 0} \|M - uv^T\|_F^2$ such that $uv^T \le M$ (i.e., $M - uv^T \ge 0$).

NMU is yet another linear dimensionality reduction technique:
Like PCA/SVD, it is sequential and well-posed.
Like NMF, it leads to a separation by parts; moreover, the additional underapproximation constraints enhance this property.
In the presence of pure pixels, the NMU recursion is able to detect materials individually.

G., Glineur, Using Underapproximations for Sparse Nonnegative Matrix Factorization, Pattern Recognition, 2010.
G., Plemmons, Dimensionality Reduction, Classification, and Spectral Mixture Analysis using Nonnegative Underapproximation, Optical Engineering, 2011.
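As an illustration, here is a minimal sketch of one rank-one underapproximation step. It is not the Lagrangian algorithm of the cited papers: it simply alternates exact coordinate-wise updates, exploiting that for fixed $v \ge 0$ each $u_i$ minimizes a 1-D quadratic over the interval $[0, \min_j M_{ij}/v_j]$, so it equals the least-squares value projected onto that interval. The function name, initialization, and iteration count are illustrative.

```python
# Minimal sketch: rank-one nonnegative underapproximation (NMU), M assumed nonnegative.
import numpy as np

def nmu_rank_one(M, n_iter=200):
    """Return u, v >= 0 with u v^T <= M (entrywise), approximating the rank-one NMU step."""
    m, n = M.shape
    v = np.ones(n)
    u = np.zeros(m)
    for _ in range(n_iter):
        # Exact update of u for fixed v: project the least-squares value onto [0, bound].
        pos = v > 0
        bound = np.min(M[:, pos] / v[pos], axis=1) if pos.any() else np.full(m, np.inf)
        u = np.clip((M @ v) / max(v @ v, 1e-12), 0.0, bound)
        # Symmetric update of v for fixed u.
        pos = u > 0
        bound = np.min(M[pos, :] / u[pos, None], axis=0) if pos.any() else np.full(n, np.inf)
        v = np.clip((u @ M) / max(u @ u, 1e-12), 0.0, bound)
    return u, v

# Sequential use: factor out r rank-one terms from the nonnegative residual.
# R = M.copy()
# for k in range(r):
#     u, v = nmu_rank_one(R)
#     R = R - np.outer(u, v)   # stays nonnegative thanks to the underapproximation constraint
```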

Urban hyperspectral image

Sparse low-rank matrix approximations

Decompose a low-rank matrix with known coefficient sparsity:

$M = UV$, $\mathrm{rank}(M) = \mathrm{rank}(U) = r$, $\|V(:, j)\|_0 \le k = r - s < r$ for all $j$.

Many existing theoretical results (see, e.g., [Gribonval 16]) and algorithms (dictionary learning). But:
Not many results specific to the low-rank case.
Only two deterministic identifiability results [Elad 06, Georgiev 05].
Not much in the NMF case except $\ell_1$ regularization.

Identifiability with sparsity: example

Example: $p = 3$, $r = 3$, sparsity $s = 1$, $n = 9$.
(Figure: the same data points admit a first and a second decomposition.)

Identifiability results

Theorem. Let $M = UV$ where $\mathrm{rank}(U) = \mathrm{rank}(M) = r$ and each column of $V$ has at least $s$ zeros. The factorization $(U, V)$ is essentially unique if, on each hyperplane spanned by all but one column of $U$, there are $\frac{r(r-2)}{s} + 1$ data points with spark $r$.

[CG18] Cohen, G., Identifiability of Low-Rank Sparse Component Analysis, arXiv, 2018.

Geometric intuition

Example: $p = 3$, $r = 3$, sparsity $s = 1$, $n = 9$.
(Figure: data points and the unique decomposition.)

Sparsity in action

Spectral unmixing, $r = 6$, $s = 4$.

Sparsity is another way to obtain identifiability for matrix decompositions, but it leads to hard combinatorial problems to solve...

Conclusion

1 Low-rank matrix approximations are useful and widely used linear models in data analysis and machine learning.
2 Except for PCA, most of these models lead to difficult non-convex optimization problems.
3 However, under appropriate assumptions, solutions with optimality guarantees can be recovered (using convexification, standard optimization schemes, or dedicated algorithms).
4 This is a very active area of research, with many open questions (e.g., uniqueness issues, drawing the line between easy and difficult instances), new models, and applications.

Thank you for your attention! Code and papers available on https://sites.google.com/site/nicolasgillis/
