arXiv:1711.00126v1 [cs.LG] 31 Oct 2017

ACCELERATED SPARSE SUBSPACE CLUSTERING

Abolfazl Hashemi and Haris Vikalo

Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, USA

ABSTRACT

State-of-the-art algorithms for sparse subspace clustering perform spectral clustering on a similarity matrix typically obtained by representing each data point as a sparse combination of other points using either basis pursuit (BP) or orthogonal matching pursuit (OMP). BP-based methods are often computationally prohibitive in practice, while the performance of OMP-based schemes is unsatisfactory, especially in settings where data points are highly similar. In this paper, we propose a novel algorithm that exploits an accelerated variant of orthogonal least-squares to efficiently find the underlying subspaces. We show that under certain conditions the proposed algorithm returns a subspace-preserving solution. Simulation results illustrate that the proposed method compares favorably with the BP-based method in terms of running time while being significantly more accurate than OMP-based schemes.

Index Terms: sparse subspace clustering, accelerated orthogonal least squares, scalable algorithm, large-scale data.

1. INTRODUCTION

Massive amounts of data collected by recent information systems give rise to new challenges in the fields of signal processing, machine learning, and data analysis. One such challenge is to develop fast and accurate algorithms that find low-dimensional structures in large-scale high-dimensional data sets. The task of extracting such low-dimensional structures is encountered in many practical applications including motion segmentation and face clustering in computer vision [1, 2], image representation and compression in image clustering [3, 4], and hybrid system identification in systems theory [5]. In these settings, the data can be thought of as being a collection of points lying on a union of low-dimensional subspaces.
The goal of subspace clustering is to organize data points into several clusters so that each cluster contains only the points from the same subspace. Subspace clustering has drawn significant attention over the past decade [6]. Among various approaches to subspace clustering, methods that rely on spectral clustering [7] to analyze the similarity matrix representing the relations among data points have received much attention due to their simplicity, theoretical rigour, and superior performance. These methods assume that the data is self-expressive [8], i.e., each data point can be represented by a linear combination of the other points in the union of subspaces. This motivates the search for a so-called subspace preserving similarity matrix, which establishes stronger connections among the points originating from the same subspace. To form such a similarity matrix, the sparse subspace clustering (SSC) method in [8, 9] employs a sparse reconstruction algorithm referred to as basis pursuit (BP) that aims to minimize an $\ell_1$-norm objective by means of convex optimization approaches such as interior point [10] or the alternating direction method of multipliers (ADMM) [11]. In [12, 13], orthogonal matching pursuit (OMP) is used to greedily build the similarity matrix. Low rank subspace clustering approaches in [14-17] rely on convex optimization techniques with $\ell_2$-norm and nuclear norm regularizations and find the singular value decomposition (SVD) of the data so as to build the similarity matrix. Finally, [18] presents an algorithm that constructs the similarity matrix through thresholding the correlations among the data points.

Performance of self-expressiveness-based subspace clustering schemes was analyzed in various settings. It was shown in [8, 9] that when the subspaces are disjoint (independent), the BP-based method is subspace preserving. The works [19, 20] take a geometric point of view to further study the performance of the BP-based SSC algorithm in the setting of intersecting subspaces and in the presence of outliers.
These results are extended to the OMP-based SSC in [12, 13].

Sparse subspace clustering of large-scale data is computationally challenging. The computational complexity of the state-of-the-art BP-based method in [8] and the low rank representation methods [14-17] is often prohibitive in practical applications. On the other hand, current scalable SSC algorithms, e.g., [12, 13], may produce poor clustering solutions, especially in scenarios where the subspaces are not well separated. In this paper, we address these challenges by proposing a novel self-expressiveness-based algorithm for subspace clustering that exploits a fast variant of orthogonal least-squares (OLS) to efficiently form a similarity matrix by finding a sparse representation for each data point. We analyze the performance of the proposed scheme and show that in the scenarios where the subspaces are independent, the proposed algorithm always finds a solution that is subspace-preserving. Simulation studies illustrate that our proposed SSC algorithm significantly outperforms the state-of-the-art method [8] in terms of runtime while providing essentially the same or better clustering accuracy. The results further illustrate that, unlike the methods in [8, 12, 13], when the subspaces are dependent our proposed scheme finds a subspace preserving solution.

The rest of the paper is organized as follows. Section 2 formally states the subspace clustering problem and reviews some relevant concepts. In Section 3, we introduce the accelerated sparse subspace clustering algorithm and analyze its performance. Section 4 presents the simulation results, while the concluding remarks are stated in Section 5.

2. PROBLEM FORMULATION

First, we briefly summarize the notation used in the paper and then formally introduce the SSC problem. Bold capital letters denote matrices while bold lowercase letters represent vectors. For a matrix $\mathbf{A}$, $A_{ij}$ denotes the $(i,j)$ entry of $\mathbf{A}$, and $\mathbf{a}_j$ is the $j$th column of $\mathbf{A}$. Additionally, $\mathbf{A}_S$ is the submatrix of $\mathbf{A}$ that contains the columns of $\mathbf{A}$ indexed by the set $S$. $\mathcal{L}_S$ denotes the subspace spanned by the columns of $\mathbf{A}_S$. $\mathbf{P}_S^\perp = \mathbf{I} - \mathbf{A}_S \mathbf{A}_S^\dagger$ is the projection operator onto the orthogonal complement of $\mathcal{L}_S$, where $\mathbf{A}_S^\dagger = (\mathbf{A}_S^\top \mathbf{A}_S)^{-1} \mathbf{A}_S^\top$ denotes the Moore-Penrose pseudoinverse of $\mathbf{A}_S$ and $\mathbf{I}$ is the identity matrix. Further, let $[n] = \{1, \dots, n\}$, let $\mathbf{1}$ be the vector of all ones, and let $\mathcal{U}(0, q)$ denote the uniform distribution on $[0, q]$.

The SSC problem is detailed next. Let $\{\mathbf{y}_i\}_{i=1}^N$ be a collection of data points in $\mathbb{R}^D$ and let $\mathbf{Y} = [\mathbf{y}_1, \dots, \mathbf{y}_N] \in \mathbb{R}^{D \times N}$ be the data matrix representing the data points. The data points are drawn from a union of $n$ subspaces $\{\mathcal{S}_i\}_{i=1}^n$ with dimensions $\{d_i\}_{i=1}^n$. Without loss of generality, we assume that the columns of $\mathbf{Y}$, i.e., the data points, are normalized vectors with unit $\ell_2$ norm. The goal of subspace clustering is to partition $\{\mathbf{y}_i\}_{i=1}^N$ into $n$ groups so that the points that belong to the same subspace are assigned to the same cluster. In the sparse subspace clustering (SSC) framework [8], one assumes that the data points satisfy the self-expressiveness property formally stated below.

Definition 1.
A collection of data points $\{\mathbf{y}_i\}_{i=1}^N$ satisfies the self-expressiveness property if each data point has a linear representation in terms of the other points in the collection, i.e., there exists a representation matrix $\mathbf{C}$ such that

$$\mathbf{Y} = \mathbf{Y}\mathbf{C}, \qquad \mathrm{diag}(\mathbf{C}) = \mathbf{0}. \tag{1}$$

Notice that since each point in $\mathcal{S}_i$ can be written in terms of at most $d_i$ points in $\mathcal{S}_i$, SSC aims to find a sparse subspace preserving $\mathbf{C}$, as formalized next.

Definition 2. A representation matrix $\mathbf{C}$ is subspace preserving if for all $j, l \in [N]$ it holds that

$$C_{lj} \neq 0 \implies \mathbf{y}_j, \mathbf{y}_l \in \mathcal{S}_i \text{ for some subspace } \mathcal{S}_i. \tag{2}$$

The MATLAB implementation of the proposed algorithm is available at https://github.com/realabolfazl/assc.

The task of finding a subspace preserving $\mathbf{C}$ leads to the optimization problem [8]

$$\min_{\mathbf{c}_j} \|\mathbf{c}_j\|_0 \quad \text{s.t.} \quad \mathbf{y}_j = \mathbf{Y}\mathbf{c}_j, \quad C_{jj} = 0, \tag{3}$$

where $\mathbf{c}_j$ is the $j$th column of $\mathbf{C}$. Given a subspace preserving solution $\mathbf{C}$, one constructs a similarity matrix $\mathbf{W} = |\mathbf{C}| + |\mathbf{C}|^\top$ for the data points. The graph normalized Laplacian of the similarity matrix $\mathbf{W}$ is then used as an input to a spectral clustering algorithm [7], which in turn produces clustering assignments.

3. ACCELERATED OLS FOR SUBSPACE CLUSTERING

In this section, we develop a novel self-expressiveness-based algorithm for the subspace clustering problem and analyze its performance. We propose to find an approximate solution to the problem

$$\min_{\mathbf{c}_j} \|\mathbf{c}_j\|_0 \quad \text{s.t.} \quad \|\mathbf{y}_j - \mathbf{Y}\mathbf{c}_j\|_2^2 \le \epsilon, \quad C_{jj} = 0, \tag{4}$$

by employing a low-complexity variant of the orthogonal least-squares (OLS) algorithm [21] so as to find a sparse representation for each data point and thus construct $\mathbf{C}$. Note that in (4), $\epsilon$ is a small predefined parameter that is used as the stopping criterion of the proposed algorithm. The OLS algorithm, which has drawn much attention in recent years [22-26], is a greedy heuristic that iteratively reconstructs sparse signals by identifying one nonzero signal component at a time. The complexity of using classical OLS [21] to find a subspace preserving $\mathbf{C}$, although lower than that of the BP-based SSC method [8], might be prohibitive in applications involving large-scale data.
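As a small illustration of the similarity-matrix step described in Section 2, the sketch below builds $\mathbf{W} = |\mathbf{C}| + |\mathbf{C}|^\top$ and a normalized Laplacian for a toy representation matrix. The helper names, the 4-point matrix, and the choice of the symmetric normalization $\mathbf{I} - \mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}$ are assumptions made for illustration; the text only states that the normalized Laplacian of $\mathbf{W}$ is passed to spectral clustering [7].

```python
import math

def similarity_matrix(C):
    """W = |C| + |C|^T for a square representation matrix C (list of rows)."""
    n = len(C)
    return [[abs(C[i][j]) + abs(C[j][i]) for j in range(n)] for i in range(n)]

def normalized_laplacian(W):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2} (assumed variant).

    Nodes with zero degree keep a 1 on the diagonal and zeros elsewhere.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[(1.0 if i == j else 0.0) - inv_sqrt[i] * W[i][j] * inv_sqrt[j]
             for j in range(n)] for i in range(n)]

# Toy representation matrix (hypothetical): points 0 and 1 represent each
# other, as do points 2 and 3, so C is subspace preserving for two groups.
C = [[0.0, 1.0, 0.0, 0.0],
     [-1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.5],
     [0.0, 0.0, 2.0, 0.0]]
W = similarity_matrix(C)
Lap = normalized_laplacian(W)
```

Because this toy $\mathbf{C}$ is subspace preserving, $\mathbf{W}$ is block diagonal, which is the structure that spectral clustering relies on to separate the groups.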
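To make OLS practical in such large-scale settings, we propose to use a fast variant of OLS, referred to as accelerated OLS (AOLS) [27], that significantly improves both the running time and the accuracy of classical OLS. AOLS replaces the aforementioned single component selection strategy by a procedure where $L$ indices are selected at each iteration, leading to significant improvements in both computational cost and accuracy. To enable significant gains in speed, AOLS efficiently builds a collection of orthogonal vectors $\{\mathbf{u}_{m_1}, \dots, \mathbf{u}_{m_L}\}_{m=1}^{T}$ that represent the basis of the subspace that includes the approximation of the sparse signal. Here $T < N$ is the maximum number of iterations, which depends on the threshold parameter $\epsilon$.

In order to use AOLS for the SSC problem, consider the task of finding a sparse representation for $\mathbf{y}_j$. Let $\mathcal{A}_j \subset [N] \setminus \{j\}$ be the set containing indices of data points with nonzero coefficients in the representation of $\mathbf{y}_j$. That is, for all $l \in \mathcal{A}_j$, $C_{lj} \neq 0$. The proposed algorithm for sparse subspace clustering, referred to as accelerated sparse subspace clustering (ASSC), finds $\mathcal{A}_j$ in an iterative fashion (see Algorithm 1).

Algorithm 1 Accelerated Sparse Subspace Clustering
Input: $\mathbf{Y}$, $L$, $\epsilon$, $T$
Output: clustering assignment vector $\mathbf{s}$
for $j = 1, \dots, N$
    Initialize $\mathbf{r}_0 = \mathbf{y}_j$, $i = 0$, $\mathcal{A}_j = \emptyset$, $\mathbf{t}_l^{(0)} = \mathbf{y}_l$ for all $l \in [N] \setminus \{j\}$
    while $\|\mathbf{r}_i\|_2^2 \ge \epsilon$ and $i < T$
        Select $\{s_1, \dots, s_L\}$ corresponding to the $L$ largest terms $(\mathbf{y}_l^\top \mathbf{r}_i / \mathbf{y}_l^\top \mathbf{t}_l^{(i)})^2 \|\mathbf{t}_l^{(i)}\|_2^2$
        $\mathcal{A}_j \leftarrow \mathcal{A}_j \cup \{s_1, \dots, s_L\}$
        $i \leftarrow i + 1$
        Perform (6) $L$ times to update $\{\mathbf{u}_{m_1}, \dots, \mathbf{u}_{m_L}\}_{m=1}^{i}$ and $\mathbf{r}_i$
        $\mathbf{t}_l^{(i)} = \mathbf{t}_l^{(i-1)} - \sum_{k=1}^{L} \frac{\mathbf{t}_l^{(i-1)\top} \mathbf{u}_{i_k}}{\|\mathbf{u}_{i_k}\|_2^2} \mathbf{u}_{i_k}$ for all $l \in [N] \setminus \{j\}$
    end while
    $\mathbf{c}_j = \mathbf{Y}_{\mathcal{A}_j}^\dagger \mathbf{y}_j$
end for
$\mathbf{W} = |\mathbf{C}| + |\mathbf{C}|^\top$
Apply spectral clustering on the normalized Laplacian of $\mathbf{W}$ to obtain $\mathbf{s}$

In particular, starting with $\mathcal{A}_j = \emptyset$, in the $i$th iteration we identify $L$ data points $\{\mathbf{y}_{s_1}, \dots, \mathbf{y}_{s_L}\}$ for the representation of $\mathbf{y}_j$. The indices $\{s_1, \dots, s_L\} \subset [N] \setminus (\mathcal{A}_j \cup \{j\})$ correspond to the $L$ largest terms $(\mathbf{y}_l^\top \mathbf{r}_{i-1} / \mathbf{y}_l^\top \mathbf{t}_l^{(i-1)})^2 \|\mathbf{t}_l^{(i-1)}\|_2^2$, where $\mathbf{r}_i = \mathbf{P}_{\mathcal{A}_j}^\perp \mathbf{y}_j$ denotes the residual vector in the $i$th iteration, with $\mathbf{r}_0 = \mathbf{y}_j$, and

$$\mathbf{t}_l^{(i)} = \mathbf{t}_l^{(i-1)} - \sum_{k=1}^{L} \frac{\mathbf{t}_l^{(i-1)\top} \mathbf{u}_{i_k}}{\|\mathbf{u}_{i_k}\|_2^2} \mathbf{u}_{i_k} \tag{5}$$

is the projection of $\mathbf{y}_l$ onto the orthogonal complement of the span of the orthogonal vectors $\{\mathbf{u}_{m_1}, \dots, \mathbf{u}_{m_L}\}_{m=1}^{i}$. Once $\{\mathbf{y}_{s_1}, \dots, \mathbf{y}_{s_L}\}$ are selected, we use the assignment

$$\mathbf{u}_{i_k} = \frac{\mathbf{y}_{s_k}^\top \mathbf{r}_i}{\mathbf{y}_{s_k}^\top \mathbf{t}_{s_k}^{(i-1)}} \mathbf{t}_{s_k}^{(i-1)}, \qquad \mathbf{r}_i \leftarrow \mathbf{r}_i - \mathbf{u}_{i_k}, \tag{6}$$

$L$ times (with $\mathbf{r}_i$ initialized to $\mathbf{r}_{i-1}$) to obtain $\mathbf{r}_i$ and $\{\mathbf{u}_{m_1}, \dots, \mathbf{u}_{m_L}\}_{m=1}^{i}$ that are required for subsequent iterations. This procedure is continued until $\|\mathbf{r}_i\|_2^2 < \epsilon$ for some iteration $i \le T$, or the algorithm reaches the predefined maximum number of iterations $T$. Then the vector of coefficients $\mathbf{c}_j$ used for representing $\mathbf{y}_j$ is computed as the least-squares solution $\mathbf{c}_j = \mathbf{Y}_{\mathcal{A}_j}^\dagger \mathbf{y}_j$. Finally, having found the $\mathbf{c}_j$'s, we construct $\mathbf{W} = |\mathbf{C}| + |\mathbf{C}|^\top$ and apply spectral clustering on its normalized Laplacian to obtain the clustering solution.

3.1. Performance Guarantee for ASSC

In this section, we analyze performance of the ASSC algorithm under the scenario that data points are noiseless and drawn from a union of independent subspaces, as defined next.

Definition 3. Let $\{\mathcal{S}_i\}_{i=1}^n$ be a collection of subspaces with dimensions $\{d_i\}_{i=1}^n$. Define $\sum_{i=1}^n \mathcal{S}_i = \{\sum_{i=1}^n \mathbf{y}_i : \mathbf{y}_i \in \mathcal{S}_i\}$. Then, $\{\mathcal{S}_i\}_{i=1}^n$ is called independent if and only if $\dim(\sum_{i=1}^n \mathcal{S}_i) = \sum_{i=1}^n d_i$.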
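To make the per-point loop of Algorithm 1 concrete, here is a minimal pure-Python sketch written under several assumptions. The function names (`represent`, `deflate`, `solve`) are hypothetical. The deflation uses the direction $\mathbf{t}_{s_k}$ directly, which spans the same line as $\mathbf{u}_{i_k}$ in (6). All remaining candidates are deflated right after each selected point so that the stored directions stay mutually orthogonal when $L > 1$. The least-squares step solves the normal equations by plain Gaussian elimination. It is not the authors' MATLAB implementation. The toy data consists of two subspaces in $\mathbb{R}^4$ that are independent in the sense of Definition 3 but not orthogonal.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def deflate(v, q):
    """Remove from v its component along the nonzero direction q."""
    alpha = dot(v, q) / dot(q, q)
    return [vi - alpha * qi for vi, qi in zip(v, q)]

def solve(G, b):
    """Solve G x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(G, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(M[k][c]))
        M[c], M[p] = M[p], M[c]
        for k in range(c + 1, n):
            f = M[k][c] / M[c][c]
            M[k] = [mk - f * mc for mk, mc in zip(M[k], M[c])]
    x = [0.0] * n
    for c in reversed(range(n)):
        x[c] = (M[c][n] - sum(M[c][k] * x[k] for k in range(c + 1, n))) / M[c][c]
    return x

def represent(Y, j, L=1, eps=1e-12, T=None, tol=1e-12):
    """Return {index: coefficient} representing Y[j] by other points of Y."""
    N = len(Y)
    T = N - 1 if T is None else T
    r = list(Y[j])                                    # r_0 = y_j
    t = {l: list(Y[l]) for l in range(N) if l != j}   # t_l^(0) = y_l
    support, i = [], 0
    while dot(r, r) >= eps and i < T and t:
        # Score (y_l^T r / y_l^T t_l)^2 ||t_l||^2 for every remaining candidate.
        scores = sorted(((dot(Y[l], r) / dot(Y[l], tl)) ** 2 * dot(tl, tl), l)
                        for l, tl in t.items())
        i += 1
        for _, s in reversed(scores[-L:]):            # the L largest terms
            q = t.pop(s, None)
            if q is None or dot(q, q) < tol:
                continue                              # already in the selected span
            support.append(s)
            r = deflate(r, q)                         # r <- r - u, with u along t_s
            # Deflate the other candidates; drop those now inside the span.
            t = {l: v for l, v in ((l, deflate(tl, q)) for l, tl in t.items())
                 if dot(v, v) >= tol}
    A = [Y[s] for s in support]
    G = [[dot(p, q) for q in A] for p in A]           # normal equations
    c = solve(G, [dot(p, Y[j]) for p in A]) if A else []
    return dict(zip(support, c))

a, b = 1 / math.sqrt(2), 1 / math.sqrt(3)
Y = [[1, 0, 0, 0], [0, 1, 0, 0], [a, a, 0, 0],        # subspace 1: span{e1, e2}
     [a, 0, a, 0], [0, 0, 0, 1], [b, 0, b, b]]        # subspace 2: span{e1 + e3, e4}
coeffs = represent(Y, j=0)                            # represent y_0 = e1
```

In this example a point from the second subspace can tie for the largest score in the first iteration, so it may enter the support. Once the residual vanishes, the least-squares step assigns it a numerically zero coefficient, and only points 1 and 2 from the first subspace carry weight.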
Theorem 1 states our main theoretical result about the performance of the proposed ASSC algorithm.

Theorem 1. Let $\{\mathbf{y}_i\}_{i=1}^N$ be a collection of noiseless data points drawn from a union of independent subspaces $\{\mathcal{S}_i\}_{i=1}^n$. Then, the representation matrix $\mathbf{C}$ returned by the ASSC algorithm is subspace preserving.

The proof of Theorem 1, omitted for brevity, relies on the observation that in order to select new representation points, ASSC finds data points that are highly correlated with the current residual vector. Since the subspaces are independent, if ASSC chooses a point that is drawn from a different subspace, its corresponding coefficient will be zero once ASSC meets a terminating criterion (e.g., the $\ell_2$-norm of the residual vector becomes less than $\epsilon$ or $T = N$). Hence, only the points that are drawn from the same subspace will have nonzero coefficients in the final sparse representation.

Remark: It has been shown in [8, 12, 13] that if subspaces are independent, SSC-BP and SSC-OMP schemes are also subspace preserving. However, as we illustrate in our simulation results, ASSC is very robust with respect to dependencies among the data points across different subspaces, while in those settings SSC-BP and SSC-OMP struggle to produce a subspace preserving matrix $\mathbf{C}$. Further theoretical analysis of this setting is left to future work.

4. SIMULATION RESULTS

To evaluate performance of the ASSC algorithm, we compare it to that of the BP-based [8, 9] and OMP-based [12, 13] SSC schemes, referred to as SSC-BP and SSC-OMP, respectively. For SSC-BP, two implementations based on ADMM and interior point methods are available from the authors of [8, 9]. The interior point implementation of SSC-BP is more accurate than the ADMM implementation, while the ADMM implementation tends to produce a sub-optimal solution in a few iterations. However, the interior point implementation is very slow even for relatively small problems. Therefore, in our simulation studies we use the ADMM implementation of SSC-BP that is provided by the authors of [8, 9].
Our scheme is tested for $L = 1$ and $L = 2$. We consider the following two scenarios: (1) a random model where the subspaces are, with high probability, near-independent; and (2) the setting where we used hybrid dictionaries [25] to generate similar data points across different subspaces, which in turn implies that the independence assumption no longer holds. In both scenarios, we randomly generate $n = 5$ subspaces, each of dimension $d = 6$, in an ambient space of dimension $D = 9$.
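The synthetic setup can be sketched as follows. This is an illustrative generator rather than the authors' code: the function names, the seed, the Gram-Schmidt construction of random subspaces, and the re-normalization after the perturbation are assumptions. The sampling rules themselves (points uniform on the unit sphere of each subspace, and a perturbation $Q\mathbf{1}_D$ with $Q \sim \mathcal{U}(0, 1)$ in the second scenario) follow the description given with the experiments.

```python
import math
import random

def orthonormal_basis(d, D, rng):
    """Gram-Schmidt on Gaussian vectors: orthonormal basis of a random d-dim subspace of R^D."""
    basis = []
    while len(basis) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(D)]
        for q in basis:
            c = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-8:                       # skip (rare) degenerate draws
            basis.append([vi / norm for vi in v])
    return basis

def unit(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]

def sample_union(n=5, d=6, D=9, points_per_subspace=50, perturb=False, seed=0):
    """Sample unit-norm points from a union of n random d-dim subspaces of R^D."""
    rng = random.Random(seed)
    points, labels = [], []
    for k in range(n):
        basis = orthonormal_basis(d, D, rng)
        for _ in range(points_per_subspace):
            # Gaussian coefficients in an orthonormal basis, then normalize:
            # uniform on the unit sphere of the subspace.
            coeff = [rng.gauss(0.0, 1.0) for _ in range(d)]
            y = unit([sum(c * q[i] for c, q in zip(coeff, basis)) for i in range(D)])
            if perturb:
                shift = rng.uniform(0.0, 1.0)          # Q ~ U(0, 1)
                y = unit([yi + shift for yi in y])     # add Q * 1_D, re-normalize (assumption)
            points.append(y)
            labels.append(k)
    return points, labels

points, labels = sample_union(points_per_subspace=50, perturb=True)
```

Adding the same shift to every coordinate pushes all points toward the all-ones direction, which is what makes points from different subspaces similar in the second scenario.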

[Figure 1 panels: (a) Subspace preserving rate, (b) Subspace preserving error, (c) Clustering accuracy, (d) Running time (sec)]
Fig. 1: Performance comparison of ASSC, SSC-OMP [12, 13], and SSC-BP [8, 9] on synthetic data with no perturbation. The points are drawn from 5 subspaces of dimension 6 in ambient dimension 9. Each subspace contains the same number of points and the overall number of points is varied from 250 to 5000.

[Figure 2 panels: (a) Subspace preserving rate, (b) Subspace preserving error, (c) Clustering accuracy, (d) Running time (sec)]
Fig. 2: Performance comparison of ASSC, SSC-OMP [12, 13], and SSC-BP [8, 9] on synthetic data with perturbation terms $Q \sim \mathcal{U}(0, 1)$. The points are drawn from 5 subspaces of dimension 6 in ambient dimension 9. Each subspace contains the same number of points and the overall number of points is varied from 250 to 5000.

Each subspace contains $N_i$ sample points where we vary $N_i$ from 50 to 1000; therefore, the total number of data points, $N = \sum_{i=1}^n N_i$, is varied from 250 to 5000. The results are averaged over 20 independent instances. For scenario (1), we generate data points by uniformly sampling from the unit sphere. For the second scenario, after sampling a data point, we add a perturbation term $Q\mathbf{1}_D$ where $Q \sim \mathcal{U}(0, 1)$.

In addition to comparing the algorithms in terms of their clustering accuracy and running time, we use the following metrics defined in [8, 9] that quantify the subspace preserving property of the representation matrix $\mathbf{C}$ returned by each algorithm: (i) the subspace preserving rate, defined as the fraction of points whose representations are subspace-preserving, and (ii) the subspace preserving error, defined as the fraction of $\ell_1$ norms of the representation coefficients associated with points from other subspaces, i.e., $\sum_j \left( \sum_{i \in \mathcal{O}} |C_{ij}| / \|\mathbf{c}_j\|_1 \right)$, where $\mathcal{O} \subset [N]$ represents the set of data points from other subspaces. The results for scenarios (1) and (2) are illustrated in Fig. 1 and Fig. 2, respectively.
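The two subspace preserving metrics can be computed as in the sketch below. The function name, the tolerance `tol`, and the division of the error by $N$ (so that it is an average between 0 and 1) are assumptions made for illustration; `C[i][j]` is taken to be the coefficient of point $i$ in the representation of point $j$, and `labels` holds the ground-truth subspace of each point.

```python
def subspace_preserving_metrics(C, labels, tol=1e-10):
    """Return (subspace preserving rate, subspace preserving error)."""
    N = len(labels)
    preserved, error = 0, 0.0
    for j in range(N):
        col = [abs(C[i][j]) for i in range(N)]          # |c_j| entrywise
        total = sum(col)                                 # l1 norm of c_j
        wrong = sum(col[i] for i in range(N) if labels[i] != labels[j])
        if wrong <= tol:
            preserved += 1                               # column j is subspace preserving
        if total > 0:
            error += wrong / total                       # l1 mass on other subspaces
    return preserved / N, error / N                      # averaging by N is an assumption

# Toy example (hypothetical): only column 1 puts weight on the wrong subspace.
labels = [0, 0, 1, 1]
C = [[0.0, 0.5, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, -1.0],
     [0.0, 0.0, 2.0, 0.0]]
rate, err = subspace_preserving_metrics(C, labels)
```

Here three of the four columns are subspace preserving, and column 1 places half of its $\ell_1$ mass on a point from the other subspace.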
As we see in Fig. 1, ASSC is nearly as fast as SSC-OMP and orders of magnitude faster than SSC-BP, while ASSC achieves better subspace preserving rate, subspace preserving error, and clustering accuracy compared to the competing schemes. Regarding the second scenario, we observe that the performance of SSC-OMP is severely deteriorated while ASSC still outperforms both SSC-BP and SSC-OMP in terms of accuracy. Further, similar to the first scenario, the running time of ASSC is similar to that of SSC-OMP, while both methods are much faster than SSC-BP. Overall, as Fig. 1 and Fig. 2 illustrate, the ASSC algorithm, especially with $L = 2$, is superior to the other schemes and is essentially as fast as the SSC-OMP method.

5. CONCLUSION

In this paper, we proposed a novel algorithm for clustering high dimensional data lying on a union of subspaces. The proposed algorithm, referred to as accelerated sparse subspace clustering (ASSC), employs a computationally efficient variant of the orthogonal least-squares algorithm to construct a similarity matrix under the assumption that each data point can be written as a sparse linear combination of other data points in the subspaces. ASSC then performs spectral clustering on the similarity matrix to find the clustering solution. We analyzed the performance of the proposed scheme and provided a theorem stating that if the subspaces are independent, the similarity matrix generated by ASSC is subspace-preserving. In simulations, we demonstrated that the proposed algorithm is orders of magnitude faster than the BP-based SSC scheme [8, 9] and essentially delivers the same or better clustering solution. The results also show that ASSC outperforms the state-of-the-art OMP-based method [12, 13], especially in scenarios where the data points across different subspaces are similar. As part of the future work, it would be of interest to extend our results and analyze performance of ASSC in the general setting where the subspaces are arbitrary and not necessarily independent.
Moreover, it would be beneficial to develop distributed implementations for further acceleration of ASSC.

6. REFERENCES

[1] A. Y. Yang, J. Wright, Y. Ma, and S. S. Sastry, "Unsupervised segmentation of natural images via lossy data compression," Computer Vision and Image Understanding, vol. 110, no. 2, pp. 212-225, 2008.
[2] R. Vidal, R. Tron, and R. Hartley, "Multiframe motion segmentation with missing data using PowerFactorization and GPCA," International Journal of Computer Vision, vol. 79, no. 1, pp. 85-105, 2008.
[3] J. Ho, M.-H. Yang, J. Lim, K.-C. Lee, and D. Kriegman, "Clustering appearances of objects under varying illumination conditions," in Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, vol. 1, pp. I-I, IEEE, 2003.
[4] W. Hong, J. Wright, K. Huang, and Y. Ma, "Multiscale hybrid linear models for lossy image representation," IEEE Transactions on Image Processing, vol. 15, no. 12, pp. 3655-3671, 2006.
[5] R. Vidal, S. Soatto, Y. Ma, and S. Sastry, "An algebraic geometric approach to the identification of a class of linear hybrid systems," in Decision and Control, 2003. Proceedings. 42nd IEEE Conference on, vol. 1, pp. 167-172, IEEE, 2003.
[6] R. Vidal, "Subspace clustering," IEEE Signal Processing Magazine, vol. 28, no. 2, pp. 52-68, 2011.
[7] A. Y. Ng, M. I. Jordan, Y. Weiss, et al., "On spectral clustering: Analysis and an algorithm," in Proceedings of the Advances in Neural Information Processing Systems (NIPS), vol. 14, pp. 849-856, 2001.
[8] E. Elhamifar and R. Vidal, "Sparse subspace clustering," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2790-2797, IEEE, 2009.
[9] E. Elhamifar and R. Vidal, "Sparse subspace clustering: Algorithm, theory, and applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2765-2781, 2013.
[10] S.-J. Kim, K. Koh, M. Lustig, S. Boyd, and D. Gorinevsky, "An interior-point method for large-scale $\ell_1$-regularized least squares," IEEE Journal of Selected Topics in Signal Processing, vol. 1, no. 4, pp. 606-617, 2007.
[11] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends® in Machine Learning, vol. 3, no. 1, pp. 1-122, Jan. 2011.
[12] E. L. Dyer, A. C. Sankaranarayanan, and R. G. Baraniuk, "Greedy feature selection for subspace clustering," The Journal of Machine Learning Research, vol. 14, no. 1, pp. 2487-2517, 2013.
[13] C. You, D. Robinson, and R. Vidal, "Scalable sparse subspace clustering by orthogonal matching pursuit," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3918-3927, 2016.
[14] C.-Y. Lu, H. Min, Z.-Q. Zhao, L. Zhu, D.-S. Huang, and S. Yan, "Robust and efficient subspace segmentation via least squares regression," Computer Vision - ECCV 2012, pp. 347-360, 2012.
[15] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, "Robust recovery of subspace structures by low-rank representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 171-184, 2013.
[16] P. Favaro, R. Vidal, and A. Ravichandran, "A closed form solution to robust subspace estimation and clustering," in Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pp. 1801-1807, IEEE, 2011.
[17] R. Vidal and P. Favaro, "Low rank subspace clustering (LRSC)," Pattern Recognition Letters, vol. 43, pp. 47-61, 2014.
[18] R. Heckel and H. Bölcskei, "Robust subspace clustering via thresholding," IEEE Transactions on Information Theory, vol. 61, no. 11, pp. 6320-6342, 2015.
[19] M. Soltanolkotabi and E. J. Candes, "A geometric analysis of subspace clustering with outliers," The Annals of Statistics, pp. 2195-2238, Aug. 2012.
[20] M. Soltanolkotabi, E. Elhamifar, E. J. Candes, et al., "Robust subspace clustering," The Annals of Statistics, vol. 42, no. 2, pp. 669-699, Apr. 2014.
[21] S. Chen, S. A. Billings, and W. Luo, "Orthogonal least squares methods and their application to non-linear system identification," International Journal of Control, vol. 50, no. 5, pp. 1873-1896, Nov. 1989.
[22] A. Hashemi and H. Vikalo, "Recovery of sparse signals via branch and bound least-squares," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 4760-4764, IEEE, 2017.
[23] L. Rebollo-Neira and D. Lowe, "Optimized orthogonal matching pursuit approach," IEEE Signal Processing Letters, vol. 9, no. 4, pp. 137-140, Apr. 2002.
[24] A. Hashemi and H. Vikalo, "Sparse linear regression via generalized orthogonal least-squares," in Proceedings of IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 1305-1309, IEEE, Dec. 2016.
[25] C. Soussen, R. Gribonval, J. Idier, and C. Herzet, "Joint k-step analysis of orthogonal matching pursuit and orthogonal least squares," IEEE Transactions on Information Theory, vol. 59, no. 5, pp. 3158-3174, May 2013.
[26] C. Herzet, A. Drémeau, and C. Soussen, "Relaxed recovery conditions for OMP/OLS by exploiting both coherence and decay," IEEE Transactions on Information Theory, vol. 62, no. 1, pp. 459-470, 2016.
[27] A. Hashemi and H. Vikalo, "Sampling requirements and accelerated schemes for sparse linear regression with orthogonal least-squares," arXiv preprint, 2016.
