Divide-and-Conquer Matrix Factorization

1 Divide-and-Conquer Matrix Factorization
Lester Mackey (Stanford University)
Collaborators: Ameet Talwalkar (UCLA) and Michael I. Jordan (UC Berkeley)
December 14, 2015

2 Motivation: Large-scale Matrix Completion
Goal: Estimate a matrix $L_0 \in \mathbb{R}^{m \times n}$ given a subset of its entries.
[Figure: a ratings matrix with mostly missing "?" entries.]
Examples:
- Collaborative filtering: How will user i rate movie j? (Netflix: 40 million users, 200K movies and television shows)
- Ranking on the web: Is URL j relevant to user i? (Google News: millions of articles, 1 billion users)
- Link prediction: Is user i friends with user j? (Facebook: 1.5 billion users)

3 Motivation: Large-scale Matrix Completion
Goal: Estimate a matrix $L_0 \in \mathbb{R}^{m \times n}$ given a subset of its entries.
State-of-the-art MC algorithms:
- Strong estimation guarantees
- Plagued by expensive subroutines (e.g., truncated SVD)
This talk: divide-and-conquer approaches for scaling up any MC algorithm while maintaining strong estimation guarantees.

4 Exact Matrix Completion
Goal: Estimate a matrix $L_0 \in \mathbb{R}^{m \times n}$ given a subset of its entries.

5 Noisy Matrix Completion
Goal: Given entries from a matrix $M = L_0 + Z \in \mathbb{R}^{m \times n}$, where $Z$ is entrywise noise and $L_0$ has rank $r \ll m, n$, estimate $L_0$.
Good news: $L_0$ has only $(m + n)r \ll mn$ degrees of freedom.
Factored form: $L_0 = A B^\top$ for $A \in \mathbb{R}^{m \times r}$ and $B \in \mathbb{R}^{n \times r}$.
Bad news: Not all low-rank matrices can be recovered.
Question: What can go wrong?

6 What can go wrong?
An entire column missing: no hope of recovering that column!
Solution: Uniform observation model. Assume that the set of $s$ observed entries $\Omega$ is drawn uniformly at random: $\Omega \sim \mathrm{Unif}(m, n, s)$.
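
A minimal sketch of this sampling scheme (numpy assumed; the helper name sample_mask is mine and is reused by the later sketches):

```python
import numpy as np

def sample_mask(m, n, s, rng=None):
    """Draw Omega ~ Unif(m, n, s): s entry indices sampled uniformly
    at random without replacement, returned as a 0/1 observation mask."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(m * n, size=s, replace=False)
    mask = np.zeros(m * n)
    mask[idx] = 1.0
    return mask.reshape(m, n)
```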

7 What can go wrong?
Bad spread of information: if $L = e_1 e_1^\top$ (a single nonzero entry), $L$ can only be recovered if $L_{11}$ is observed.
Solution: Incoherence with the standard basis (Candès and Recht, 2009). A matrix $L = U \Sigma V^\top \in \mathbb{R}^{m \times n}$ with $\mathrm{rank}(L) = r$ is incoherent if its singular vectors are not too skewed,
  $\max_i \|U U^\top e_i\|^2 \le \mu r / m$ and $\max_i \|V V^\top e_i\|^2 \le \mu r / n$,
and not too cross-correlated,
  $\|U V^\top\|_\infty \le \sqrt{\mu r / (mn)}$.
(In this literature, it's good to be incoherent.)
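
To make the definition concrete, here is a small sketch (the function name coherence and its tolerance are mine, not from the talk) that computes the smallest µ for which all three bounds hold, using the fact that $\|U U^\top e_i\|^2$ equals the squared norm of the i-th row of $U$:

```python
import numpy as np

def coherence(L, tol=1e-10):
    """Smallest mu satisfying the three incoherence bounds for L = U S V^T."""
    m, n = L.shape
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    r = int((s > tol * s[0]).sum())                  # numerical rank
    U, V = U[:, :r], Vt[:r, :].T
    mu_U = (m / r) * np.max(np.sum(U**2, axis=1))    # max_i ||U U^T e_i||^2 * m/r
    mu_V = (n / r) * np.max(np.sum(V**2, axis=1))
    mu_UV = (m * n / r) * np.max(np.abs(U @ V.T))**2 # ||U V^T||_inf^2 * mn/r
    return max(mu_U, mu_V, mu_UV)
```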

8 How do we estimate $L_0$?
First attempt:
  minimize over $A$: $\mathrm{rank}(A)$  subject to  $\sum_{(i,j) \in \Omega} (A_{ij} - M_{ij})^2 \le \Delta^2$.
Problem: Computationally intractable!
Solution: Solve a convex relaxation (Fazel, Hindi, and Boyd, 2001; Candès and Plan, 2010):
  minimize over $A$: $\|A\|_*$  subject to  $\sum_{(i,j) \in \Omega} (A_{ij} - M_{ij})^2 \le \Delta^2$,
where $\|A\|_* = \sum_k \sigma_k(A)$ is the trace/nuclear norm of $A$.
Questions: Will the nuclear norm heuristic successfully recover $L_0$? Can nuclear norm minimization scale to large MC problems?
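
The tailored solvers cited two slides below are all built around the prox operator of the nuclear norm, singular value thresholding. The following is a minimal proximal-gradient sketch of the Lagrangian form of the heuristic, my simplification rather than the exact constrained solver; mask is a 0/1 observation matrix and M is assumed zero-filled off $\Omega$:

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: the prox operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nuclear_norm_complete(M, mask, lam=1.0, n_iters=200):
    """Proximal gradient on 0.5*||P_Omega(A - M)||_F^2 + lam*||A||_*,
    a Lagrangian-form stand-in for the constrained heuristic above.
    M must be zero-filled at unobserved entries."""
    A = np.zeros_like(M)
    for _ in range(n_iters):
        grad = mask * (A - M)   # gradient of the smooth term
        A = svt(A - grad, lam)  # unit step is valid: the gradient is 1-Lipschitz
    return A
```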

9 Noisy Nuclear Norm Heuristic: Does it work?
Yes, with high probability.
Typical Theorem: If $L_0$ with rank $r$ is incoherent, $s \gtrsim r n \log^2(n)$ entries of $M \in \mathbb{R}^{m \times n}$ are observed uniformly at random, and $\hat{L}$ solves the noisy nuclear norm heuristic, then
  $\|\hat{L} - L_0\|_F \le \Delta \, f(m, n)$
with high probability when $\|M - L_0\|_F \le \Delta$.
See Candès and Plan (2010); Mackey, Talwalkar, and Jordan (2014b); Keshavan, Montanari, and Oh (2010); Negahban and Wainwright (2010).
Implies exact recovery in the noiseless setting ($\Delta = 0$).

10 Noisy Nuclear Norm Heuristic: Does it scale?
Not quite...
Standard interior point methods (Candès and Recht, 2009): $O(|\Omega|(m + n)^3 + |\Omega|^2 (m + n)^2 + |\Omega|^3)$.
More efficient, tailored algorithms:
- Singular Value Thresholding (SVT) (Cai, Candès, and Shen, 2010)
- Augmented Lagrange Multiplier (ALM) (Lin, Chen, Wu, and Ma, 2009a)
- Accelerated Proximal Gradient (APG) (Toh and Yun, 2010)
All require a rank-k truncated SVD on every iteration.
Take-away: These provably accurate MC algorithms are too expensive for large-scale or real-time matrix completion.
Question: How can we scale up a given matrix completion algorithm and still retain estimation guarantees?

11 Divide-Factor-Combine (DFC)
Our solution: divide and conquer.
1. Divide M into submatrices.
2. Factor each submatrix in parallel.
3. Combine submatrix estimates to estimate $L_0$.
Advantages:
- Submatrix completion is often much cheaper than completing M.
- Multiple submatrix completions can be carried out in parallel.
- DFC works with any base MC algorithm.
- With the right choice of division and recombination, DFC yields estimation guarantees comparable to those of the base algorithm.

12 DFC-Proj: Partition and Project
1. Randomly partition $M$ into $t$ column submatrices: $M = [C_1 \; C_2 \; \cdots \; C_t]$, where each $C_i \in \mathbb{R}^{m \times l}$.
2. Complete the submatrices in parallel to obtain $[\hat{C}_1 \; \hat{C}_2 \; \cdots \; \hat{C}_t]$.
   - Reduced cost: expect a t-fold speed-up per iteration.
   - Parallel computation: pay the cost of one cheaper MC.
3. Project the submatrices onto a single low-dimensional column space: estimate the column space of $L_0$ with the column space of $\hat{C}_1$,
   $\hat{L}^{\mathrm{proj}} = \hat{C}_1 \hat{C}_1^{+} [\hat{C}_1 \; \hat{C}_2 \; \cdots \; \hat{C}_t]$.
   - A common technique for randomized low-rank approximation (Frieze, Kannan, and Vempala, 1998).
   - Minimal cost: $O(mk^2 + lk^2)$, where $k = \mathrm{rank}(\hat{L}^{\mathrm{proj}})$.
4. Ensemble: project onto the column space of each $\hat{C}_j$ and average (see the sketch after this list).
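
A compact sketch of steps 1-3 (base_complete stands for any base MC routine; the ensemble variant of step 4 would additionally average over projections onto each Ĉ_j):

```python
import numpy as np

def dfc_proj(M, mask, base_complete, t=4, rng=None):
    """DFC-Proj sketch: divide the columns into t groups, run base_complete
    on each group, then project every submatrix estimate onto the column
    space of the first estimate."""
    rng = np.random.default_rng(rng)
    m, n = M.shape
    blocks = np.array_split(rng.permutation(n), t)
    # Factor step: independent subproblems (parallel in practice, serial here)
    C_hat = [base_complete(M[:, b], mask[:, b]) for b in blocks]
    # Combine step: L_proj = C1 C1^+ [C1 ... Ct]
    proj = C_hat[0] @ np.linalg.pinv(C_hat[0])
    L = np.empty((m, n))
    for b, Ch in zip(blocks, C_hat):
        L[:, b] = proj @ Ch
    return L
```

For instance, dfc_proj(M, mask, nuclear_norm_complete, t=10) wires in the earlier nuclear-norm sketch as the base algorithm.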

13 DFC: Does it work?
Yes, with high probability.
Theorem (Mackey, Talwalkar, and Jordan, 2014b): If $L_0$ with rank $r$ is incoherent and $s = \omega(r^2 n \log^2(n) / \epsilon^2)$ entries of $M \in \mathbb{R}^{m \times n}$ are observed uniformly at random, then $l = o(n)$ random columns suffice to have
  $\|\hat{L}^{\mathrm{proj}} - L_0\|_F \le (2 + \epsilon) \, \Delta \, f(m, n)$
with high probability when $\|M - L_0\|_F \le \Delta$ and the noisy nuclear norm heuristic is used as the base algorithm.
- Can sample a vanishingly small fraction of columns ($l/n \to 0$).
- Implies exact recovery in the noiseless setting ($\Delta = 0$).
- Analysis streamlined by a matrix Bernstein inequality.

14 DFC: Does it work? Proof ideas
1. If $L_0$ is incoherent (has good spread of information), its partitioned submatrices are incoherent w.h.p.
2. Each submatrix has sufficiently many observed entries w.h.p., so submatrix completion succeeds.
3. A random submatrix captures the full column space of $L_0$ w.h.p., so column projection succeeds. (Analysis builds on the randomized $\ell_2$ regression work of Drineas, Mahoney, and Muthukrishnan, 2008.)

15 DFC Noisy Recovery Error
[Figure: RMSE vs. % of revealed entries for DFC-Proj-10%, DFC-Proj-Ens-10%, and the base MC algorithm (APG), with m = 10K and r = 10.]

16 DFC Speed-up
[Figure: MC time (s) vs. m for DFC-Proj-10%, DFC-Proj-Ens-10%, and the base algorithm (APG), on random matrices with r = 0.001m and 4% of entries revealed.]

17 Application: Collaborative Filtering
Task: Given a sparsely observed matrix of user-item ratings, predict the unobserved ratings.
Issues: full-rank rating matrix; noisy, non-uniform observations.
The data: Netflix Prize dataset. 100 million ratings in {1, ..., 5}; 17,770 movies; 480,189 users.

18 Application: Collaborative Filtering
Task: Predict unobserved user-item ratings.
[Table: Netflix results. Methods: APG, DFC-Proj-25%, DFC-Proj-10%, DFC-Proj-Ens-25%, DFC-Proj-Ens-10%; columns: RMSE and time (s).]

19 Robust Matrix Factorization
Goal: Given a matrix $M = L_0 + S_0 + Z$, where $L_0$ is low-rank, $S_0$ is sparse, and $Z$ is entrywise noise, recover $L_0$ (Chandrasekaran, Sanghavi, Parrilo, and Willsky, 2009; Candès, Li, Ma, and Wright, 2011; Zhou, Li, Wright, Candès, and Ma, 2010).
Example: background modeling / foreground activity detection.
[Figure: video frames M decomposed into a static background L_0 and a sparse foreground S (Candès, Li, Ma, and Wright, 2011).]

20 Robust Matrix Factorization
Goal: Given a matrix $M = L_0 + S_0 + Z$, where $L_0$ is low-rank, $S_0$ is sparse, and $Z$ is entrywise noise, recover $L_0$ (Chandrasekaran, Sanghavi, Parrilo, and Willsky, 2009; Candès, Li, Ma, and Wright, 2011; Zhou, Li, Wright, Candès, and Ma, 2010).
- $S_0$ can be viewed as an outlier/gross corruption matrix.
- Ordinary PCA breaks down in this setting.
- Harder than MC: the outlier locations are unknown.
- More expensive than MC: dense, fully observed matrices.

21 How do we recover $L_0$?
First attempt:
  minimize over $L, S$: $\mathrm{rank}(L) + \lambda \, \mathrm{card}(S)$  subject to  $\|M - L - S\|_F \le \Delta$.
Problem: Computationally intractable!
Solution: Convex relaxation:
  minimize over $L, S$: $\|L\|_* + \lambda \|S\|_1$  subject to  $\|M - L - S\|_F \le \Delta$,
where $\|S\|_1 = \sum_{ij} |S_{ij}|$ is the entrywise $\ell_1$ norm of $S$.
Question (does it work?): Will noisy Principal Component Pursuit (PCP) recover $L_0$?
Question (is it efficient?): Can noisy PCP scale to large RMF problems?
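
Both prox operators behind the tailored PCP solvers are cheap to state. The sketch below is a minimal alternating-minimization stand-in for noisy PCP, my simplification of the Lagrangian form, reusing svt from the earlier matrix-completion sketch; the talk's actual solvers are the APG and ALM methods on a later slide:

```python
import numpy as np

def soft(X, tau):
    """Entrywise soft-thresholding: the prox operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def pcp_alternating(M, lam_L=1.0, lam_S=None, n_iters=100):
    """Exact alternating minimization of
    0.5*||M - L - S||_F^2 + lam_L*||L||_* + lam_S*||S||_1 over L and S."""
    if lam_S is None:
        lam_S = lam_L / np.sqrt(max(M.shape))  # mirrors lambda = 1/sqrt(n)
    L, S = np.zeros_like(M), np.zeros_like(M)
    for _ in range(n_iters):
        L = svt(M - S, lam_L)    # exact minimizer in L (svt from the MC sketch)
        S = soft(M - L, lam_S)   # exact minimizer in S
    return L, S
```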

22 Noisy Principal Component Pursuit: Does it work?
Yes, with high probability.
Theorem (Zhou, Li, Wright, Candès, and Ma, 2010): Suppose $L_0$ with rank $r = O(m / \log^2 n)$ is incoherent and $S_0 \in \mathbb{R}^{m \times n}$ contains $s \le c \, mn$ non-zero entries with uniformly distributed locations. If $\hat{L}$ minimizes
  $\|L\|_* + \lambda \|S\|_1$  subject to  $\|M - L - S\|_F \le \Delta$
with $\lambda = 1/\sqrt{n}$, then
  $\|\hat{L} - L_0\|_F \le \Delta \, f(m, n)$
with high probability when $\|M - L_0 - S_0\|_F \le \Delta$.
See also Agarwal, Negahban, and Wainwright (2011).

23 Noisy Principal Component Pursuit: Is it efficient?
Not quite...
Standard interior point methods: $O(n^6)$ (Chandrasekaran, Sanghavi, Parrilo, and Willsky, 2009).
More efficient, tailored algorithms:
- Accelerated Proximal Gradient (APG) (Lin, Ganesh, Wright, Wu, Chen, and Ma, 2009b)
- Augmented Lagrange Multiplier (ALM) (Lin, Chen, Wu, and Ma, 2009a)
These require a rank-k truncated SVD on every iteration; best case $\mathrm{SVD}(m, n, k) = O(mnk)$.
Idea: Leverage the divide-and-conquer techniques developed for MC in the RMF setting.

24 DFC: Does it work?
Yes, with high probability.
Theorem (Mackey, Talwalkar, and Jordan, 2014b): If $L_0$ with rank $r$ is incoherent, and $S_0 \in \mathbb{R}^{m \times n}$ contains $s \le c \, mn$ non-zero entries with uniformly distributed locations, then
  $l = O(r^2 \log^2(n) / \epsilon^2)$
random columns suffice to have
  $\|\hat{L}^{\mathrm{proj}} - L_0\|_F \le (2 + \epsilon) \, \Delta \, f(m, n)$
with high probability when $\|M - L_0 - S_0\|_F \le \Delta$ and noisy principal component pursuit is used as the base algorithm.
- Can sample a polylogarithmic number of columns.
- Implies exact recovery in the noiseless setting ($\Delta = 0$).
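
Composing the earlier sketches illustrates DFC in this RMF setting: the (sketched) PCP routine becomes the base algorithm, and since RMF inputs are fully observed the mask is all ones:

```python
import numpy as np

def base_rmf(X, mask):
    """Base algorithm for DFC: keep only the low-rank part from the PCP
    sketch. RMF inputs are fully observed, so the mask argument is ignored."""
    return pcp_alternating(X)[0]

# L_hat = dfc_proj(M, np.ones_like(M), base_rmf, t=10)
```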

25 DFC Estimation Error
[Figure: RMF RMSE vs. % of outliers for DFC-Proj-10%, DFC-Proj-Ens-10%, and the base algorithm (APG), with m = 1K and r = 10.]

26 DFC Speed-up
[Figure: RMF time (s) vs. m for DFC-Proj-10%, DFC-Proj-Ens-10%, and the base algorithm (APG), on random matrices with r = 0.01m and 10% of entries corrupted.]

27 Application: Video Background Modeling
Task: Each video frame forms one column of a matrix M; decompose M into a stationary background $L_0$ and moving foreground objects $S_0$.
[Figure: example frames of M and their decomposition into L_0 and S_0.]
Challenges:
- Video is noisy.
- Foreground corruption is often clustered, not uniform.

28 Application: Video Background Modeling
Example: significant foreground variation.
Specs: 1 minute of airport surveillance (Li, Huang, Gu, and Tian, 2004); 1000 frames.
Base algorithm: half an hour. DFC: 7 minutes.

29 Application: Video Background Modeling
Example: changes in illumination.
Specs: 1.5 minutes of lobby surveillance (Li, Huang, Gu, and Tian, 2004); 1546 frames.
Base algorithm: 1.5 hours. DFC: 8 minutes.

30 Future Directions
New applications and datasets: practical problems with large-scale or real-time requirements.

31 Example: Large-scale Affinity Estimation
Goal: Estimate semantic similarity between pairs of datapoints.
Motivation: Assign class labels to datapoints based on similarity.
Examples from computer vision:
- Image tagging: tree vs. firefighter vs. Tony Blair
- Video / multimedia content detection: wedding vs. concert
- Face clustering
Application: content detection on 9K YouTube videos, 20 classes.
- Baseline: Low-Rank Representation (Liu, Lin, and Yu, 2010). Strong guarantees, but 1.5 days to run.
- Divide and conquer (Talwalkar, Mackey, Mu, Chang, and Jordan, 2013): comparable guarantees and comparable performance in 1 hour (5 subproblems).

32 Future Directions
- New applications and datasets: practical problems with large-scale or real-time requirements.
- New divide-and-conquer strategies: other ways to reduce computation while preserving accuracy.

33 DFC-Nys: Generalized Nyström Decomposition
1. Choose a random column submatrix $C \in \mathbb{R}^{m \times l}$ and a random row submatrix $R \in \mathbb{R}^{d \times n}$ from $M$; call their intersection $W$:
   $M = \begin{bmatrix} W & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$, $\quad C = \begin{bmatrix} W \\ M_{21} \end{bmatrix}$, $\quad R = \begin{bmatrix} W & M_{12} \end{bmatrix}$.
2. Recover the low-rank components of $C$ and $R$ in parallel to obtain $\hat{C}$ and $\hat{R}$.
3. Recover $L_0$ from $\hat{C}$, $\hat{R}$, and their intersection $\hat{W}$:
   $\hat{L}^{\mathrm{nys}} = \hat{C} \hat{W}^{+} \hat{R}$.
   - The generalized Nyström method (Goreinov, Tyrtyshnikov, and Zamarashkin, 1997).
   - Minimal cost: $O(mk^2 + lk^2 + dk^2)$, where $k = \mathrm{rank}(\hat{L}^{\mathrm{nys}})$.
4. Ensemble: run $p$ times in parallel and average the estimates. (A sketch of steps 1-3 follows.)
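
A sketch of steps 1-3 under the same assumptions as the earlier DFC-Proj code (base_complete is any base low-rank recovery routine; Ŵ is read off the recovered column block):

```python
import numpy as np

def dfc_nys(M, mask, base_complete, l, d, rng=None):
    """DFC-Nys sketch: recover a random column block C and row block R with
    any base routine, then combine via the generalized Nystrom formula
    L_nys = C_hat W_hat^+ R_hat."""
    rng = np.random.default_rng(rng)
    m, n = M.shape
    cols = rng.choice(n, size=l, replace=False)
    rows = rng.choice(m, size=d, replace=False)
    C_hat = base_complete(M[:, cols], mask[:, cols])  # factor C ...
    R_hat = base_complete(M[rows, :], mask[rows, :])  # ... and R in parallel
    W_hat = C_hat[rows, :]                            # intersection estimate
    return C_hat @ np.linalg.pinv(W_hat) @ R_hat
```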

34 Future Directions
- New applications and datasets: practical problems with large-scale or real-time requirements.
- New divide-and-conquer strategies: other ways to reduce computation while preserving accuracy; more extensive use of ensembling.
- New theory: analyze the statistical implications of divide-and-conquer algorithms (the trade-off between statistical and computational efficiency; the impact of ensembling); develop a suite of matrix concentration inequalities to aid in the analysis of randomized algorithms with matrix data.

35 Concentration Inequalities
Matrix concentration: just as scalar concentration bounds $P\{|X - \mathbb{E} X| \ge t\} \le \delta$, matrix concentration bounds $P\{\lambda_{\max}(X - \mathbb{E} X) \ge t\} \le \delta$.
Non-asymptotic control of random matrices with complex distributions.
Applications:
- Matrix completion from sparse random measurements (Gross, 2011; Recht, 2011; Negahban and Wainwright, 2010; Mackey, Talwalkar, and Jordan, 2014b)
- Randomized matrix multiplication and factorization (Drineas, Mahoney, and Muthukrishnan, 2008; Hsu, Kakade, and Zhang, 2011)
- Convex relaxation of robust or chance-constrained optimization (Nemirovski, 2007; So, 2011; Cheung, So, and Wang, 2011)
- Random graph analysis (Christofides and Markström, 2008; Oliveira, 2009)

36 Concentration Inequalities
Matrix concentration: $P\{\lambda_{\max}(X - \mathbb{E} X) \ge t\} \le \delta$.
Difficulty: matrix multiplication is not commutative, so in general $e^{X+Y} \ne e^X e^Y \ne e^Y e^X$.
Past approaches (Ahlswede and Winter, 2002; Oliveira, 2009; Tropp, 2011):
- Rely on deep results from matrix analysis.
- Apply to sums of independent matrices and matrix martingales.
Our work (Mackey, Jordan, Chen, Farrell, and Tropp, 2014a; Paulin, Mackey, and Tropp, 2015):
- Stein's method of exchangeable pairs (1972), as advanced by Chatterjee (2007) for scalar concentration.
- Improved exponential tail inequalities (Hoeffding, Bernstein, bounded differences).
- Polynomial moment inequalities (Khintchine, Rosenthal).
- Dependent sums and more general matrix functionals.

37 Example: Matrix Bounded Differences Inequality
Corollary (Paulin, Mackey, and Tropp, 2015): Suppose $Z = (Z_1, \ldots, Z_n)$ has independent coordinates, and
  $\big(H(z_1, \ldots, z_j, \ldots, z_n) - H(z_1, \ldots, z_j', \ldots, z_n)\big)^2 \preceq A_j^2$
for all $j$ and all values $z_1, \ldots, z_n, z_j'$. Define the boundedness parameter $\sigma^2 := \big\| \sum_{j=1}^n A_j^2 \big\|$. If each $A_j$ is $d \times d$, then, for all $t \ge 0$,
  $P\{\lambda_{\max}(H(Z) - \mathbb{E} H(Z)) \ge t\} \le d \, e^{-t^2 / (2\sigma^2)}$.
- Improves prior results in the literature (e.g., Tropp, 2011).
- Useful for analyzing multiclass classifier performance (Machart and Ralaivola, 2012), crowdsourcing accuracy (Dalvi, Dasgupta, Kumar, and Rastogi, 2013), and convergence in non-differentiable optimization (Zhou and Hu, 2014).
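
As a sanity check, the bound can be simulated for a matrix Rademacher series $H(Z) = \sum_j Z_j B_j$, where flipping one sign changes $H$ by $2 B_j$, so $A_j^2 = 4 B_j^2$. This is a self-contained numerical sketch, not from the talk; the summands and dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, trials = 50, 5, 2000

# Symmetric summands B_j; sigma^2 = lambda_max(sum_j 4 B_j^2).
B = rng.standard_normal((n, d, d))
B = (B + B.transpose(0, 2, 1)) / 2
sigma2 = np.linalg.eigvalsh(4 * np.einsum('jab,jbc->ac', B, B)).max()

def lam_max_dev():
    z = rng.choice([-1.0, 1.0], size=n)
    return np.linalg.eigvalsh(np.einsum('j,jab->ab', z, B)).max()  # E H(Z) = 0

# The corollary predicts P{lambda_max >= t} <= 0.01 at this t.
t = np.sqrt(2 * sigma2 * np.log(d / 0.01))
emp = np.mean([lam_max_dev() >= t for _ in range(trials)])
print(f"empirical tail {emp:.4f} vs. bound 0.01")
```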

38 Future Directions
- New applications and datasets: practical problems with large-scale or real-time requirements.
- New divide-and-conquer strategies: other ways to reduce computation while preserving accuracy; more extensive use of ensembling.
- New theory: statistical implications of divide-and-conquer algorithms (statistical vs. computational efficiency; impact of ensembling); a suite of matrix concentration inequalities for analyzing randomized algorithms with matrix data.

39 The End. Thanks!
Divide-Factor-Combine (Project): $P_\Omega(M)$ → $P_\Omega(C_1), P_\Omega(C_2), \ldots, P_\Omega(C_t)$ → $\hat{C}_1, \hat{C}_2, \ldots, \hat{C}_t$ → $\hat{L}^{\mathrm{proj}}$
Divide-Factor-Combine (Nyström): $P_\Omega(M)$ → $P_\Omega(C), P_\Omega(R)$ → $\hat{C}, \hat{R}$ → $\hat{L}^{\mathrm{nys}}$

40 References I
Agarwal, A., Negahban, S., and Wainwright, M. J. Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions. In International Conference on Machine Learning, 2011.
Ahlswede, R. and Winter, A. Strong converse for identification via quantum channels. IEEE Trans. Inform. Theory, 48(3), 2002.
Cai, J. F., Candès, E. J., and Shen, Z. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 2010.
Candès, E. J. and Recht, B. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6), 2009.
Candès, E. J., Li, X., Ma, Y., and Wright, J. Robust principal component analysis? Journal of the ACM, 58(3):1-37, 2011.
Candès, E. J. and Plan, Y. Matrix completion with noise. Proceedings of the IEEE, 98(6), 2010.
Chandrasekaran, V., Sanghavi, S., Parrilo, P. A., and Willsky, A. S. Sparse and low-rank matrix decompositions. In Allerton Conference on Communication, Control, and Computing, 2009.
Chandrasekaran, V., Parrilo, P. A., and Willsky, A. S. Latent variable graphical model selection via convex optimization. Preprint.
Chatterjee, S. Stein's method for concentration inequalities. Probab. Theory Related Fields, 138, 2007.
Cheung, S.-S., So, A. M.-C., and Wang, K. Chance-constrained linear matrix inequalities with dependent perturbations: a safe tractable approximation approach. Preprint, 2011.
Christofides, D. and Markström, K. Expansion properties of random Cayley graphs and vertex transitive graphs via matrix martingales. Random Struct. Algorithms, 32(1):88-100, 2008.
Dalvi, N., Dasgupta, A., Kumar, R., and Rastogi, V. Aggregating crowdsourced binary ratings. In Proceedings of the 22nd International Conference on World Wide Web (WWW '13), 2013.
Drineas, P., Mahoney, M. W., and Muthukrishnan, S. Relative-error CUR matrix decompositions. SIAM Journal on Matrix Analysis and Applications, 30, 2008.
Fazel, M., Hindi, H., and Boyd, S. P. A rank minimization heuristic with application to minimum order system approximation. In Proceedings of the 2001 American Control Conference, 2001.

41 References II
Frieze, A., Kannan, R., and Vempala, S. Fast Monte-Carlo algorithms for finding low-rank approximations. In Foundations of Computer Science, 1998.
Goreinov, S. A., Tyrtyshnikov, E. E., and Zamarashkin, N. L. A theory of pseudoskeleton approximations. Linear Algebra and its Applications, 261(1-3):1-21, 1997.
Gross, D. Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform. Theory, 57(3), 2011.
Hsu, D., Kakade, S. M., and Zhang, T. Dimension-free tail inequalities for sums of random matrices. arXiv preprint, 2011.
Keshavan, R. H., Montanari, A., and Oh, S. Matrix completion from noisy entries. Journal of Machine Learning Research, 99, 2010.
Li, L., Huang, W., Gu, I. Y. H., and Tian, Q. Statistical modeling of complex backgrounds for foreground object detection. IEEE Transactions on Image Processing, 13(11), 2004.
Lin, Z., Chen, M., Wu, L., and Ma, Y. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical Report UILU-ENG, 2009a.
Lin, Z., Ganesh, A., Wright, J., Wu, L., Chen, M., and Ma, Y. Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. UIUC Technical Report UILU-ENG, 2009b.
Liu, G., Lin, Z., and Yu, Y. Robust subspace segmentation by low-rank representation. In International Conference on Machine Learning, 2010.
Machart, P. and Ralaivola, L. Confusion matrix stability bounds for multiclass classification. Preprint, 2012.
Mackey, L., Jordan, M. I., Chen, R. Y., Farrell, B., and Tropp, J. A. Matrix concentration inequalities via the method of exchangeable pairs. The Annals of Probability, 42(3), 2014a.
Mackey, L., Talwalkar, A., and Jordan, M. I. Distributed matrix completion and robust factorization. Journal of Machine Learning Research, 2014b. In press.

42 References III
Min, K., Zhang, Z., Wright, J., and Ma, Y. Decomposing background topics from keywords by principal component pursuit. In Conference on Information and Knowledge Management.
Negahban, S. and Wainwright, M. J. Restricted strong convexity and weighted matrix completion: Optimal bounds with noise. arXiv preprint, 2010.
Nemirovski, A. Sums of random symmetric matrices and quadratic optimization under orthogonality constraints. Math. Program., 109, 2007.
Oliveira, R. I. Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv preprint, 2009.
Paulin, D., Mackey, L., and Tropp, J. A. Efron-Stein inequalities for random matrices. The Annals of Probability, to appear, 2015.
Recht, B. Simpler approach to matrix completion. J. Mach. Learn. Res., 12, 2011.
So, A. M.-C. Moment inequalities for sums of random matrices and their applications in optimization. Math. Program., 130(1), 2011.
Stein, C. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proc. 6th Berkeley Symp. Math. Statist. Probab., Univ. California Press, Berkeley, 1972.
Talwalkar, A., Mackey, L., Mu, Y., Chang, S.-F., and Jordan, M. I. Distributed low-rank subspace segmentation. 2013.
Toh, K. and Yun, S. An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. Pacific Journal of Optimization, 6(3), 2010.
Tropp, J. A. User-friendly tail bounds for sums of random matrices. Found. Comput. Math., 2011.
Zhou, E. and Hu, J. Gradient-based adaptive stochastic search for non-differentiable optimization. IEEE Transactions on Automatic Control, 59(7), 2014.
Zhou, Z., Li, X., Wright, J., Candès, E. J., and Ma, Y. Stable principal component pursuit. In IEEE International Symposium on Information Theory (ISIT), 2010.
