Scalable Subspace Clustering


Scalable Subspace Clustering
René Vidal
Center for Imaging Science, Laboratory for Computational Sensing and Robotics, Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University

High-Dimensional Data
In many areas we deal with high-dimensional data:
- Computer vision
- Medical imaging
- Medical robotics
- Signal processing
- Bioinformatics

Low-Dimensional Manifolds
- Face clustering and classification
- Lossy image representation
- Motion segmentation
- Dynamic texture (DT) segmentation
- Video segmentation

Subspace Clustering Problem
Given a set of points lying in multiple subspaces, identify:
- The number of subspaces and their dimensions
- A basis for each subspace
- The segmentation of the data points
Challenges:
- Model selection
- Nonconvex
- Combinatorial
More challenges:
- Noise
- Outliers
- Missing entries

Subspace Clustering Problem: Challenges
Even more challenges:
- Angles between subspaces are small
- Nearby points are in different subspaces
[Figures: percentage of subspace pairs vs. subspace angle (degrees), and percentage of data points vs. number of nearest neighbors, for Hopkins 155 and Extended YaleB]

Prior Work: Sparse and Low-Rank Methods
Approach:
- Data are self-expressive
- Global affinity by convex optimization
Representative methods:
- Sparse Subspace Clustering (SSC) (Elhamifar-Vidal 09, 10, 13; Candes-Soltanolkotabi 12, 13; Wang-Xu 13)
- Low-Rank Subspace Clustering (LRR and LRSC) (Costeira-Kanade 98; Kanatani 01; Vidal 08; Liu et al. 10, 13; Wei-Lin 10; Favaro-Vidal 11, 13)
- Least Squares Regression (LSR) (Lu 12)
- Sparse + Low-Rank (Luo 11; Wang 13)
- Sparse + Frobenius: Elastic Net (EnSC) (Dyer 13; You 16)

Prior Work: Sparse and Low-Rank Methods
General model: $\min_{C,E} f(C) + \lambda g(E)$ s.t. $X = XC + E$

Method | f | g
Sparse Subspace Clustering (SSC) | $\ell_1$ | $\ell_1$, $\ell_2^2$
Least Squares Regression (LSR) | $\ell_2^2$ | $\ell_2^2$
Elastic Net Subspace Clustering (EnSC) | $\ell_1 + \ell_2^2$ | $\ell_2^2$
Low Rank Representation (LRR) | nuclear | $\ell_{2,1}$
Low Rank Subspace Clustering (LRSC) | nuclear | $\ell_1$, $\ell_2^2$

Advantages:
- Convex optimization
- Broad theoretical results
- Robust to noise/corruptions
Disadvantages / open problems:
- Restricted to low-dimensional subspaces
- Missing entries
- Scalability: can only handle on the order of 10,000 data points

Talk Outline
- Sparse Subspace Clustering by Basis Pursuit (SSC-BP)
  - Theoretical guarantees for noiseless, noisy, and corrupted data
  - Great performance for applications with ~1,000 data points
- Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit (SSC-OMP)
  - Theoretical guarantees for noiseless data
  - Scalable to 600,000 data points
- Scalable Elastic Net Subspace Clustering (EnSC)
  - Theoretical guarantees for noiseless data
  - New active-set algorithm that is scalable to 600,000 data points
References:
- E. Elhamifar and R. Vidal. Sparse Subspace Clustering. CVPR 2009.
- E. Elhamifar and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory and Applications. TPAMI 2013.
- C. You, D. Robinson, R. Vidal. Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit. CVPR 2016.
- C. You, C.-G. Li, D. Robinson, R. Vidal. Scalable Elastic Net Subspace Clustering. CVPR 2016.

Sparse Subspace Clustering by Basis Pursuit
Ehsan Elhamifar and René Vidal
Computer Science, Northeastern University; Center for Imaging Science, Johns Hopkins University

Sparse Subspace Clustering: Spectral Clustering
Spectral clustering:
- Represent data points as nodes in a graph G
- Connect nodes i and j with weight $c_{ij}$
- Infer clusters from the Laplacian of G
How to define a subspace-preserving affinity matrix C?
- $c_{ij} \neq 0$ for points in the same subspace
- $c_{ij} = 0$ for points in different subspaces
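Once a coefficient matrix C is in hand, the spectral clustering step is standard. A minimal sketch, assuming numpy/scikit-learn; the symmetrization W = |C| + |C|ᵀ is the common choice in the SSC literature, and `clusters_from_coefficients` is a hypothetical helper name:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def clusters_from_coefficients(C, n_clusters):
    """Segment data from self-expressive coefficients C (N x N):
    build a symmetric nonnegative affinity and run spectral clustering."""
    W = np.abs(C) + np.abs(C).T  # symmetric, nonnegative affinity
    sc = SpectralClustering(n_clusters=n_clusters, affinity='precomputed')
    return sc.fit_predict(W)     # length-N array of cluster labels
```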

Sparse Subspace Clustering: Intuition
Data in a union of subspaces (UoS) are self-expressive:
$x_j = \sum_{i=1}^{N} c_{ij} x_i \;\Rightarrow\; x_j = Xc_j \;\Rightarrow\; X = XC$
Data in a UoS admit a subspace-preserving representation:
$c_{ij} \neq 0 \Rightarrow x_i$ and $x_j$ belong to the same subspace
Under what conditions is the solution to P0 subspace-preserving?
$P_0: \min_{c_j} \|c_j\|_0$ s.t. $x_j = Xc_j,\; c_{jj} = 0$
E. Elhamifar and R. Vidal. Sparse Subspace Clustering. CVPR 2009.
E. Elhamifar and R. Vidal. Clustering Disjoint Subspaces via Sparse Representation. ICASSP 2010.
E. Elhamifar and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory and Applications. TPAMI 2013.

Sparse Subspace Clustering: Noiseless Data
Under what conditions on the subspaces and the data is the solution to P1 subspace-preserving?
Point by point: $\min_{c_j} \|c_j\|_1$ s.t. $x_j = Xc_j,\; c_{jj} = 0$
All points: $\min_{C} \|C\|_1$ s.t. $X = XC,\; \mathrm{diag}(C) = 0$
Theorem 1: P1 gives a subspace-preserving representation if the subspaces are independent, i.e., $\dim\big(\bigoplus_{i=1}^{n} S_i\big) = \sum_{i=1}^{n} \dim(S_i)$.
E. Elhamifar and R. Vidal. Sparse Subspace Clustering. CVPR 2009.
E. Elhamifar and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory and Applications. TPAMI 2013.
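For concreteness, a per-point sketch of P1 using cvxpy; this is an illustrative implementation, not the authors' code, and `ssc_bp_coefficients` is a hypothetical name (X is assumed D x N with one data point per column):

```python
import numpy as np
import cvxpy as cp

def ssc_bp_coefficients(X):
    """Solve P1 column by column: min ||c||_1 s.t. x_j = Xc, c_jj = 0."""
    D, N = X.shape
    C = np.zeros((N, N))
    for j in range(N):
        c = cp.Variable(N)
        prob = cp.Problem(cp.Minimize(cp.norm1(c)),
                          [X @ c == X[:, j], c[j] == 0])
        prob.solve()
        C[:, j] = c.value
    return C
```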

Sparse Subspace Clustering: Noiseless Data
Independence may be too restrictive, e.g., for articulated motions.
Theorem 2: P1 gives a subspace-preserving representation if the subspaces are sufficiently separated and the data are well distributed inside the subspaces, i.e., if for all $i = 1, \dots, n$ the incoherence is smaller than the inradius: $\mu_i = \max_{j \neq i} \cos(\theta_{ij}) < r_i$.
Theorem 3: Let $n$ $d$-dimensional subspaces be drawn independently, uniformly at random, with $\rho d + 1$ points per subspace drawn independently, uniformly at random. Then P1 gives a subspace-preserving representation with high probability if the subspace dimension $d$ is small relative to the ambient dimension $D$: $d < \frac{c^2(\rho)\log\rho}{12\log N}\, D$.
E. Elhamifar and R. Vidal. Clustering Disjoint Subspaces via Sparse Representation. ICASSP 2010.
E. Elhamifar and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory and Applications. TPAMI 2013.
M. Soltanolkotabi and E. Candes. A geometric analysis of subspace clustering with outliers. Annals of Statistics, 40(4):2195-2238, 2012.

Sparse Subspace Clustering: Noisy Data
Under what conditions on the subspaces and the data is the solution to LASSO subspace-preserving?
Noiseless (P1): $\min_{C} \|C\|_1$ s.t. $X = XC,\; \mathrm{diag}(C) = 0$
Noisy (LASSO): $\min_{C} \|C\|_1 + \frac{\lambda}{2}\|X - XC\|_F^2$ s.t. $\mathrm{diag}(C) = 0$
Theorem 4: LASSO gives a subspace-preserving representation if the subspaces are sufficiently separated, the data are well distributed, the noise is small enough, and the LASSO parameter is well chosen, i.e., $\mu_i < r_i$, the noise level is below $\frac{r_i(r_i - \mu_i)}{3r_i^2 + 8r_i + 2}$, and $\lambda$ lies within a suitable range.
Y.-X. Wang and H. Xu. Noisy sparse subspace clustering. ICML 2013.
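In the noisy setting the per-point program is a standard LASSO, so off-the-shelf solvers apply. A sketch with scikit-learn; note that sklearn's Lasso scales the quadratic term by 1/(2·n_samples), so its `alpha` corresponds to the slide's λ only up to a rescaling, and the `alpha` value here is a hypothetical choice:

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_ssc_coefficients(X, alpha=0.01):
    """Noisy SSC: solve a LASSO per point, with the point itself
    removed from the dictionary to enforce c_jj = 0."""
    D, N = X.shape
    C = np.zeros((N, N))
    for j in range(N):
        Xj = np.delete(X, j, axis=1)
        model = Lasso(alpha=alpha, fit_intercept=False)
        model.fit(Xj, X[:, j])
        C[np.arange(N) != j, j] = model.coef_
    return C
```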

Experiments on Face Clustering
- Faces under varying illumination lie near a 9-dimensional subspace
- Extended Yale B dataset: 38 subjects, 64 images per subject, D = 2,016-dimensional data
Clustering error:
- SSC: < 2.0% error for 2 subjects
- SSC: < 11.0% error for 10 subjects
[Figure: clustering error (%) vs. number of subjects for SSC, LRSC, LRR-H, LRR, SCC, and LSA]
E. Elhamifar and R. Vidal. Sparse Subspace Clustering: Algorithm, Theory, and Applications. TPAMI 2013.

Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit
Chong You, Daniel Robinson, and René Vidal
Center for Imaging Science, Johns Hopkins University; Applied Mathematics and Statistics, Johns Hopkins University

Prior Work: Overview
Prior methods either handle arbitrary subspaces only at small scale (around 1K data points) or scale to around 1M data points only under an independent-subspace assumption. This work targets arbitrary subspaces at the 1M-point scale.
[1] E. Elhamifar and R. Vidal. Sparse Subspace Clustering. CVPR 2009.
[2] G. Liu, Z. Lin, Y. Yu. Robust Subspace Segmentation by Low-Rank Representation. ICML 2010.
[3] C. Lu et al. Robust and Efficient Subspace Segmentation via Least Squares Regression. ECCV 2012.
[4] X. Chen and D. Cai. Large Scale Spectral Clustering with Landmark-based Representation. AAAI 2011.
[5] X. Peng, L. Zhang, Z. Yi. Scalable Sparse Subspace Clustering. CVPR 2013.
[6] A. Adler, M. Elad, Y. Hel-Or. Linear-Time Subspace Clustering via Bipartite Graph Modeling. TNNLS 2015.

Sparse Subspace Clustering (SSC)
[1] Elhamifar and Vidal. Sparse Subspace Clustering. CVPR 2009.
[2] Dyer et al. Greedy Feature Selection for Subspace Clustering. JMLR 2013.

SSC by Orthogonal Matching Pursuit (SSC-OMP)
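The transcript does not carry this slide's body; the idea, per the accompanying CVPR 2016 paper, is to replace the ℓ1 program with orthogonal matching pursuit: each point is greedily approximated by at most k other points. A minimal sketch, assuming unit-norm columns (`k` and `tol` are illustrative parameters, not values from the slides):

```python
import numpy as np

def ssc_omp_coefficients(X, k=10, tol=1e-6):
    """SSC-OMP sketch: express each column of X as a k-sparse
    combination of the other columns via orthogonal matching pursuit."""
    D, N = X.shape
    C = np.zeros((N, N))
    for j in range(N):
        x = X[:, j]
        residual, support, c = x.copy(), [], np.array([])
        for _ in range(k):
            if np.linalg.norm(residual) < tol:
                break
            corr = np.abs(X.T @ residual)
            corr[j] = -np.inf          # never select the point itself
            corr[support] = -np.inf    # nor already-selected points
            support.append(int(np.argmax(corr)))
            # orthogonal step: re-fit all coefficients on the support
            c, *_ = np.linalg.lstsq(X[:, support], x, rcond=None)
            residual = x - X[:, support] @ c
        C[support, j] = c
    return C
```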

Guaranteed Correct Connections: Deterministic Model
[3] M. Soltanolkotabi and E. Candes. A geometric analysis of subspace clustering with outliers. Annals of Statistics, 40(4):2195-2238, 2012.

Guaranteed Correct Connections: Deterministic Model
SSC-OMP gives correct connections if [condition shown as an equation on the slide; not transcribed]

Guaranteed Correct Connections: Random Model
[3] M. Soltanolkotabi and E. Candes. A geometric analysis of subspace clustering with outliers. Annals of Statistics, 40(4):2195-2238, 2012.

Synthetic Experiments

Experiment on Extended Yale B

Experiment on MNIST

Conclusion

Scalable Elastic Net Subspace Clustering
Chong You, Chun-Guang Li*, Daniel Robinson, and René Vidal
Center for Imaging Science, Johns Hopkins University; *SICE, Beijing University of Posts and Telecommunications; Applied Mathematics and Statistics, Johns Hopkins University

Motivation

Elastic Net Subspace Clustering (EnSC)
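The transcript carries only the title; for reference, the per-point elastic-net program underlying EnSC, which matches the f = ℓ1 + ℓ2², g = ℓ2² entry in the earlier table (constants follow the CVPR 2016 paper up to notation; λ ∈ [0, 1] trades off the ℓ1 and ℓ2² penalties, γ > 0 weighs the data fit):

$$\min_{c_j}\; \lambda \|c_j\|_1 + \frac{1-\lambda}{2}\|c_j\|_2^2 + \frac{\gamma}{2}\|x_j - Xc_j\|_2^2 \quad \text{s.t.}\quad c_{jj} = 0$$

Setting λ = 1 recovers the LASSO program of SSC-BP, while λ = 0 gives the LSR/ridge regime.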

Scalable Elastic Net Subspace Clustering
Prior methods:
- ADMM
- Interior point
- Solution path
- Proximal gradient method
- etc.

Geometry of the Elastic Net Solution

Correct Connections vs. Connectivity

Guaranteed Correct Connections

Oracle Guided Active Set (ORGEN) Algorithm
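The slide body is not transcribed. As context, the active-set idea from the CVPR 2016 paper is to solve the elastic-net program on a small working set and use an "oracle point" computed from the residual to decide which points to bring in next. Below is a rough sketch under those assumptions; the subproblem solver (plain proximal gradient) and the support-update rule are simplifications, so treat it as illustrative rather than the authors' algorithm:

```python
import numpy as np

def elastic_net_prox(X, x, lam, gamma, n_iter=500):
    """Proximal gradient for: lam*||c||_1 + (1-lam)/2*||c||_2^2
                              + gamma/2*||x - Xc||_2^2."""
    c = np.zeros(X.shape[1])
    L = gamma * np.linalg.norm(X, 2) ** 2 + (1 - lam)  # Lipschitz constant
    for _ in range(n_iter):
        grad = gamma * X.T @ (X @ c - x) + (1 - lam) * c
        z = c - grad / L
        c = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return c

def orgen_point(X, j, lam=0.9, gamma=50.0, n_outer=10, init_size=20):
    """Oracle-guided active-set loop for data point j (illustrative)."""
    N = X.shape[1]
    x = X[:, j]
    corr = np.abs(X.T @ x)
    corr[j] = -np.inf
    T = sorted(np.argsort(-corr)[:init_size].tolist())  # initial working set
    c_full = np.zeros(N)
    for _ in range(n_outer):
        c_T = elastic_net_prox(X[:, T], x, lam, gamma)
        delta = gamma * (x - X[:, T] @ c_T)  # oracle point from the residual
        # points outside T violating the optimality condition |<x_i, delta>| <= lam
        viol = [i for i in range(N)
                if i != j and i not in T and abs(X[:, i] @ delta) > lam]
        c_full[:] = 0.0
        c_full[T] = c_T
        if not viol:  # optimal over the full dictionary: stop
            break
        T = sorted(set(t for t, v in zip(T, c_T) if abs(v) > 1e-10) | set(viol))
    return c_full
```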

Experiments
database | # data | ambient dim. | # clusters
Coil-100 | 7,200 | 1,024 | 100
PIE | 11,554 | 1,024 | 68
MNIST | 70,000 | 500 | 10
CovType | 581,012 | 54 | 7

Experiments (clustering accuracy; "-" means the method did not run at this scale)
database | # data | SSC-BP | SSC-OMP | EnSC
Coil-100 | 7,200 | 57.10% | 42.93% | 69.24%
PIE | 11,554 | 41.94% | 24.06% | 52.98%
MNIST | 70,000 | - | 93.07% | 93.79%
CovType | 581,012 | - | 48.76% | 53.52%

Experiments (running time)
database | # data | SSC-BP | SSC-OMP | EnSC
Coil-100 | 7,200 | 127 min | 3 min | 3 min
PIE | 11,554 | 412 min | 5 min | 13 min
MNIST | 70,000 | - | 6 min | 28 min
CovType | 581,012 | - | 783 min | 1,452 min

Conclusion

Conclusions
Many problems in computer vision can be posed as subspace clustering problems:
- Spatial and temporal video segmentation
- Face clustering under varying illumination
These problems can be solved using:
- Sparse Subspace Clustering by Basis Pursuit (SSC-BP)
- Sparse Subspace Clustering by Orthogonal Matching Pursuit (SSC-OMP)
- Elastic Net Subspace Clustering (EnSC)
These algorithms are provably correct when:
- Subspaces are sufficiently separated
- Data are well distributed within each subspace
- The subspace dimension is small relative to the ambient dimension
SSC-OMP and EnSC are scalable to ~1M data points.

Acknowledgements
Vision Lab @ Johns Hopkins University: http://www.vision.jhu.edu
Thank You!