Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition


Qiuwei Li, Colorado School of Mines
Gongguo Tang, Colorado School of Mines
Arye Nehorai, Washington University in St. Louis

CONTENTS
1.1 Introduction
1.2 Notations and Problem Setup
1.3 The Method of Augmented Lagrange Multipliers
1.4 Implementation
1.5 Numerical Simulations
    1.5.1 Convergence Behavior
    1.5.2 Comparison of Final Relative Error
    1.5.3 Comparison of the Matrices Ê and Ŝ
    1.5.4 Examination of the Effect of the Block Sparsity s
1.6 Conclusions
References

1.1 Introduction

The classical principal component analysis (PCA) [Jol02] is arguably one of the most important tools in high-dimensional data analysis. However, classical PCA is not robust to gross corruptions of even only a few entries of the underlying low-rank matrix L containing the principal components. Robust principal component analysis (RPCA) [CSPW09, CLMW11, ZLW+10] models the gross corruptions as a sparse matrix S superimposed on the low-rank matrix L. The authors of [CSPW09, CLMW11, ZLW+10] presented various incoherence conditions under which the low-rank matrix L and the sparse matrix S can be accurately decomposed by solving a convex program, the principal component pursuit:

    min_{L,S}  ‖L‖_* + λ‖S‖_1   subject to   D = L + S,        (1.1)

where D is the observation matrix, λ is a tuning parameter, and ‖·‖_* and ‖·‖_1 denote the nuclear norm and the sum of the absolute values of the matrix entries, respectively. Many algorithms have been designed to solve the resulting convex program efficiently, e.g., singular value thresholding [CCS08], the alternating direction method [YY09], the accelerated proximal gradient (APG) method [LGW+09], and the augmented Lagrange multiplier method [LCWM09]. Some of these algorithms also apply stably when the observation matrix is corrupted by small, entry-wise noise [ZLW+10]. The applications of RPCA include video surveillance [CLMW11], face recognition [PGW+12, LLY10], latent semantic indexing [MZWM10], image retrieval and management [ZYM10], and computer vision [ZGLM12, WGS+11], to name a few.
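As a rough illustration (ours, not part of the chapter), the solvers cited above are built from two proximal "shrinkage" operators: entry-wise soft-thresholding for the ℓ1 term and soft-thresholding of the singular values for the nuclear-norm term. A minimal numpy sketch, with illustrative function names:

```python
import numpy as np

def soft_threshold(X, tau):
    """Entry-wise soft-thresholding: proximity operator of tau*||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Soft-threshold the singular values: proximity operator of tau*||.||_* [CCS08]."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```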

We remark that, in certain applications, the sparse matrix S is actually of more interest than the low-rank matrix, which contains the principal components. However, RPCA is not robust to column/row outliers, i.e., the case where the corruption matrix is column/row block-sparse.

Many data sets arranged in matrix format are of low rank due to the correlations and connections within the data samples. The ubiquity of low-rank matrices is also suggested by the success and popularity of principal component analysis in many applications [Jol02]. Examples of special interest are network traffic matrices [AJW+10] and the data matrix formed in face recognition [WYG+09], where the columns correspond to vectorized versions of face images. When a few columns of the data matrix are generated by mechanisms different from the rest of the columns, the existence of these outlying columns tends to destroy the low-rank structure of the data matrix. A decomposition that enforces the low-rankness of one part and the block-sparsity of the other part would separate the principal components from the outliers.

In this work, we design algorithms for low-rank and block-sparse matrix decomposition as a robust version of principal component analysis that is insensitive to column/row outliers. We use a convex program to separate the low-rank part and the block-sparse part of the observed matrix. The decomposition involves the following model:

    D = A + E.        (1.2)

The observation matrix D is now the sum of the low-rank component A and the block-sparse component E. The block-sparse matrix E contains mostly zero columns, with several non-zero ones corresponding to outliers. In order to eliminate ambiguity, the columns of the low-rank matrix A corresponding to the outlier columns are assumed to be zero. We demonstrate that the following convex program recovers the low-rank matrix A and the block-sparse matrix E:

    min_{A,E}  ‖A‖_* + κ(1−λ)‖A‖_{2,1} + κλ‖E‖_{2,1}   subject to   D = A + E,        (1.3)

where ‖·‖_* and ‖·‖_{2,1} denote, respectively, the nuclear norm and the ℓ1 norm of the vector formed by taking the ℓ2 norms of the columns of the underlying matrix. Note that the extra term κ(1−λ)‖A‖_{2,1} ensures that the recovered A has exactly zero columns corresponding to the outliers.

We design efficient algorithms to solve the convex program (1.3) based on the augmented Lagrange multiplier (ALM) method. The ALM method was successfully employed to perform low-rank and sparse matrix decomposition for large-scale problems [LCWM09]. The challenges here are the special structure of the ℓ2,1 norm and the presence of the extra term κ(1−λ)‖A‖_{2,1}. We demonstrate the validity of (1.3) for the decomposition and assess the performance of the ALM algorithm using numerical simulations. In future work, we will test the algorithms by applying them to face recognition and network traffic anomaly detection problems. Due to the ubiquity of low-rank matrices and the importance of outlier detection, we expect low-rank and block-sparse matrix decomposition to have wide applications, especially in computer vision and data mining.

The paper is organized as follows. In Section 1.2, we introduce the notations and the problem setup. Section 1.3 is devoted to the method of augmented Lagrange multipliers for solving the convex program (1.3). We provide implementation details in Section 1.4. Numerical experiments demonstrating the effectiveness of our method are reported in Section 1.5. Section 1.6 summarizes our conclusions.

1.2 Notations and Problem Setup

In this section, we introduce the notations and the problem setup used throughout the paper. For any matrix A ∈ R^{n×p}, we use A_{ij} to denote its (i, j)th element and A_j to denote its jth column. The usual ℓp norm, p ≥ 1, is denoted by ‖·‖_p when the argument is a vector. The nuclear norm ‖·‖_*, the Frobenius norm ‖·‖_F, and the ℓ2,1 norm ‖·‖_{2,1} of a matrix A are defined as follows:

    ‖A‖_* = Σ_i σ_i(A),        (1.4)
    ‖A‖_F = ( Σ_{i,j} A_{ij}² )^{1/2},        (1.5)
    ‖A‖_{2,1} = Σ_j ‖A_j‖_2,        (1.6)

where σ_i(A) is the ith largest singular value of A.

We consider the following model:

    D = A + E.        (1.7)

Here E has at most s non-zero columns; these non-zero columns are the outliers in the data matrix D. The corresponding columns of A are assumed to be zero. In addition, A has low rank, i.e., rank(A) ≤ r. We use the following optimization problem to recover A and E from the observation D:

    min_{A,E}  ‖A‖_* + κ(1−λ)‖A‖_{2,1} + κλ‖E‖_{2,1}   subject to   D = A + E.        (1.8)

This procedure separates the outliers in E from the principal components in A. As we will see in the numerical examples, simply enforcing the sparsity of E and the low-rankness of A would not separate A and E well.

1.3 The Method of Augmented Lagrange Multipliers

The augmented Lagrange multiplier (ALM) method is a general procedure for solving the equality-constrained optimization problem

    min f(x)   subject to   h(x) = 0,        (1.9)

where f : R^n → R and h : R^n → R^m. The procedure defines the augmented Lagrangian function

    L(x, λ; μ) = f(x) + ⟨λ, h(x)⟩ + (μ/2)‖h(x)‖_2²,        (1.10)

where λ ∈ R^m is the Lagrange multiplier vector and μ is a positive scalar. The ALM iteratively updates x and the Lagrange multiplier vector λ as shown in Table 1.1. To apply the general ALM to problem (1.8), we define

    f(X) = ‖A‖_* + κ(1−λ)‖A‖_{2,1} + κλ‖E‖_{2,1},        (1.11)
    h(X) = D − A − E,        (1.12)

with X = (A, E).
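For concreteness, the three matrix norms in (1.4)–(1.6) can be evaluated in a few lines of numpy. This sketch is ours and the function names are illustrative:

```python
import numpy as np

def nuclear_norm(A):
    # (1.4): sum of the singular values
    return np.sum(np.linalg.svd(A, compute_uv=False))

def frobenius_norm(A):
    # (1.5): square root of the sum of squared entries
    return np.sqrt(np.sum(A ** 2))

def l21_norm(A):
    # (1.6): sum of the l2 norms of the columns
    return np.sum(np.linalg.norm(A, axis=0))
```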

The corresponding augmented Lagrangian function is

    L(A, E, Y; μ) = ‖A‖_* + κ(1−λ)‖A‖_{2,1} + κλ‖E‖_{2,1} + ⟨Y, D − A − E⟩ + (μ/2)‖D − A − E‖_F²,        (1.13)

where Y denotes the Lagrange multiplier. Given Y = Y_k and μ = μ_k, one key step in applying the general ALM method is to solve min_{A,E} L(A, E, Y_k; μ_k). We adopt an alternating procedure: for fixed E = E_k, solve

    A_{k+1} = argmin_A L(A, E_k, Y_k; μ_k),        (1.14)

and for fixed A = A_{k+1}, solve

    E_{k+1} = argmin_E L(A_{k+1}, E, Y_k; μ_k).        (1.15)

We first address (1.15), which is equivalent to solving

    min_E  κλ‖E‖_{2,1} + (μ_k/2)‖D − A_{k+1} − E + Y_k/μ_k‖_F².        (1.16)

Denote G = D − A_{k+1} + Y_k/μ_k. The following lemma gives the closed-form solution of the minimization with respect to E.

Lemma 1. The solution to

    min_E  η‖E‖_{2,1} + (1/2)‖E − G‖_F²        (1.17)

is given by

    E_j = G_j max( 0, 1 − η/‖G_j‖_2 )        (1.18)

for j = 1, ..., p.

Proof: Note that the objective function expands as

    η‖E‖_{2,1} + (1/2)‖E − G‖_F² = Σ_{j=1}^{p} ( η‖E_j‖_2 + (1/2)‖E_j − G_j‖_2² ).        (1.19)

TABLE 1.1 General augmented Lagrange multiplier method
1. initialize: Given ρ > 1, μ_0 > 0, starting point x^s_0 and λ_0; k ← 0.
2. while not converged do
3.   Approximately solve x_{k+1} = argmin_x L(x, λ_k; μ_k) using starting point x^s_k.
4.   Set starting point x^s_{k+1} = x_{k+1}.
5.   Update the Lagrange multiplier vector λ_{k+1} = λ_k + μ_k h(x_{k+1}).
6.   Update μ_{k+1} = ρ μ_k.
7. end while
8. output: x ← x_k, λ ← λ_k.
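To make Table 1.1 concrete, the following is a minimal generic ALM loop in Python. It is our sketch, not the chapter's code; `h` and `solve_subproblem` are caller-supplied functions, and the default parameter values are illustrative:

```python
import numpy as np

def alm(h, solve_subproblem, x0, lam0, mu0=1.0, rho=1.5, max_iter=100, tol=1e-8):
    """Generic ALM loop in the spirit of Table 1.1 (illustrative sketch)."""
    x, lam, mu = x0, lam0, mu0
    for _ in range(max_iter):
        x = solve_subproblem(x, lam, mu)   # step 3: minimize L(., lam; mu), warm-started at x
        r = h(x)                           # constraint residual h(x)
        lam = lam + mu * r                 # step 5: multiplier update
        mu = rho * mu                      # step 6: penalty update
        if np.linalg.norm(r) < tol:        # stop once the constraint is (nearly) satisfied
            break
    return x, lam
```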

Now we can minimize with respect to each column E_j separately. Denote

    h(e) = η‖e‖_2 + (1/2)‖e − g‖_2² = η‖e‖_2 + (1/2)[ ‖e‖_2² − 2⟨e, g⟩ + ‖g‖_2² ].        (1.20)

First consider ‖g‖_2 ≤ η. The Cauchy–Schwarz inequality leads to

    h(e) ≥ η‖e‖_2 + (1/2)[ ‖e‖_2² − 2‖e‖_2‖g‖_2 + ‖g‖_2² ] = (1/2)‖e‖_2² + (η − ‖g‖_2)‖e‖_2 + (1/2)‖g‖_2²        (1.21)
         ≥ (1/2)‖g‖_2²,        (1.22)

where the last inequality uses the fact that (1.21), viewed as a function of ‖e‖_2, is increasing on [0, ∞) when ‖g‖_2 ≤ η. The value h(e) = ‖g‖_2²/2 is achieved by e = 0, which is also the unique minimizer. Therefore, if ‖g‖_2 ≤ η, the minimum of h(e) is achieved by the unique solution e = 0.

In the second case, when ‖g‖_2 > η, setting the derivative of h(e) with respect to e to zero yields

    ( η/‖e‖_2 + 1 ) e = g.        (1.23)

Taking the ℓ2 norm of both sides yields

    ‖e‖_2 = ‖g‖_2 − η > 0.        (1.24)

Plugging ‖e‖_2 back into (1.23) gives

    e = g ( 1 − η/‖g‖_2 ).        (1.25)

In conclusion, the minimum of (1.17) is achieved by E with

    E_j = G_j max( 0, 1 − η/‖G_j‖_2 )        (1.26)

for j = 1, 2, ..., p. ∎

We denote the operator solving (1.17) by T_η(·). Thus E = T_η(G) sets a column of E to zero whenever the ℓ2 norm of the corresponding column of G is less than η, and otherwise scales that column down by the factor 1 − η/‖G_j‖_2.
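As a concrete rendering of Lemma 1 (our sketch, not the chapter's code), the column-shrinkage operator T_η(·) is a few lines of numpy:

```python
import numpy as np

def column_shrink(G, eta):
    """T_eta(G): zero out columns with ||G_j||_2 < eta, shrink the rest (Lemma 1)."""
    norms = np.maximum(np.linalg.norm(G, axis=0), 1e-12)   # ||G_j||_2, guarded against zero
    scale = np.maximum(0.0, 1.0 - eta / norms)              # max(0, 1 - eta/||G_j||_2)
    return G * scale[np.newaxis, :]
```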

Now we turn to solving

    A_{k+1} = argmin_A L(A, E_k, Y_k; μ_k),        (1.27)

which can be rewritten as

    A_{k+1} = argmin_A { ‖A‖_* + κ(1−λ)‖A‖_{2,1} + (μ_k/2)‖A − G‖_F² },        (1.28)

where G = D − E_k + Y_k/μ_k. Without the extra κ(1−λ)‖A‖_{2,1} term, a closed-form solution would simply be given by soft-thresholding the singular values [CCS08]. Unfortunately, a closed-form solution to (1.28) is not available. We therefore use the Douglas/Peaceman–Rachford (DR) monotone operator splitting method [CP07, FSM07, FSM09] to solve (1.28) iteratively. Define f_1(A) = κ(1−λ)‖A‖_{2,1} + (μ_k/2)‖A − G‖_F² and f_2(A) = ‖A‖_*. For any β > 0 and any sequence α_j ∈ (0, 2), the DR iteration for (1.28) is expressed as

    A^{(j+1/2)} = prox_{βf_2}( A^{(j)} ),        (1.29)
    A^{(j+1)} = A^{(j)} + α_j ( prox_{βf_1}( 2A^{(j+1/2)} − A^{(j)} ) − A^{(j+1/2)} ).        (1.30)

Here, for any proper, lower semi-continuous, convex function f, the proximity operator prox_f(·) gives the unique point prox_f(x) achieving the infimum of the function

    z ↦ (1/2)‖x − z‖_2² + f(z).        (1.31)

The following lemma gives the explicit expressions for the proximity operators involved in our DR iteration, namely for f(A) = ‖A‖_* and f(A) = κ(1−λ)‖A‖_{2,1} + (μ_k/2)‖A − G‖_F².

Lemma 2. For f(·) = ‖·‖_*, suppose the singular value decomposition of A is A = UΣV^T; then the proximity operator is

    prox_{βf}(A) = U S_β(Σ) V^T,        (1.32)

where the non-negative soft-thresholding operator is defined as

    S_ν(x) = max(0, x − ν),   x ≥ 0, ν > 0,        (1.33)

for a scalar x and extended entry-wise to vectors and matrices. For f(·) = η‖·‖_{2,1} + (μ/2)‖· − G‖_F², the proximity operator is

    prox_{βf}(A) = T_{βη/(1+βμ)}( (A + βμG)/(1 + βμ) ).        (1.34)

With these preparations, we present the algorithm for solving (1.8) using the ALM, named RPCA-LBD, in Table 1.2. Note that instead of alternating between (1.14) and (1.15) many times until convergence, the RPCA-LBD algorithm in Table 1.2 executes them only once per outer iteration. This inexact version greatly reduces the computational burden and yields sufficiently accurate results for appropriately tuned parameters. The same strategy was adopted in [LCWM09] to perform low-rank and sparse matrix decomposition.
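Putting Lemma 2 into the iteration (1.29)–(1.30) gives the inner solver for the A-subproblem. The following numpy sketch is our illustrative rendering, not the authors' implementation; the defaults for β, α, and the iteration count are example choices:

```python
import numpy as np

def column_shrink(G, eta):
    norms = np.maximum(np.linalg.norm(G, axis=0), 1e-12)
    return G * np.maximum(0.0, 1.0 - eta / norms)[np.newaxis, :]

def prox_nuclear(A, beta):
    # (1.32)-(1.33): soft-threshold the singular values of A
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - beta, 0.0)) @ Vt

def prox_f1(A, G, eta, mu, beta):
    # (1.34): prox of beta*(eta*||.||_{2,1} + (mu/2)*||. - G||_F^2)
    return column_shrink((A + beta * mu * G) / (1.0 + beta * mu),
                         beta * eta / (1.0 + beta * mu))

def dr_solve_A(G, eta, mu, beta=0.2, alpha=1.0, n_iter=20):
    """DR iteration (1.29)-(1.30) for the subproblem (1.28)."""
    A = G.copy()
    for _ in range(n_iter):
        A_half = prox_nuclear(A, beta)                                        # (1.29)
        A = A + alpha * (prox_f1(2 * A_half - A, G, eta, mu, beta) - A_half)  # (1.30)
    return A_half
```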

1.4 Implementation

Parameter selection and initialization: The tuning parameters in (1.8) are chosen as κ = 1.1 together with a fixed empirical value of λ. We do not have a systematic way to select these parameters, but these empirical values work well in all tested cases. For the DR iteration (1.29)–(1.30), we use β = 0.2 and α_j ≡ 1. The parameters for the ALM method are μ_0 = 30/‖sign(D)‖_2 and ρ = 1.1. The low-rank matrix A, the block-sparse matrix E, and the Lagrange multiplier Y are initialized to D, 0, and 0, respectively. The outer loop in Table 1.2 terminates when it reaches the maximal iteration number 500 or a preset error tolerance; the error in the outer loop is computed as ‖D − A_k − E_k‖_F / ‖D‖_F. The inner DR loop has maximal iteration number 20 and its own tolerance; the error in the inner loop is the Frobenius norm of the difference between successive iterates A_{k+1}^{(j)}.

Performing the SVD: The major computational cost in the RPCA-LBD algorithm is the singular value decomposition (SVD) in the DR iteration (line 07 in Table 1.2). We usually do not need to compute the full SVD, because only the singular values greater than β are used. We use PROPACK [Lar98] to compute the first (largest) few singular values. In order both to save computation and to ensure accuracy, we need to predict, for each iteration, the number of singular values of A_{k+1}^{(j)} that exceed β. We adopt the following rule [LCWM09]:

    sv_{k+1} = { svn_k + 1,           if svn_k < sv_k;
                 min(svn_k + 10, d),  if svn_k = sv_k, }

where sv_0 = 10 and

    svn_k = { svp_k,                if maxgap_k ≤ 2;
              min(svp_k, maxid_k),  if maxgap_k > 2. }

Here d = min(n, p), sv_k is the predicted number of singular values greater than β, and svp_k is the number of singular values, among the sv_k computed ones, that are larger than β. In addition, maxgap_k and maxid_k are, respectively, the largest ratio between successive singular values of A_{k+1}^{(j)} (arranged in decreasing order) and the corresponding index.

TABLE 1.2 Block-sparse and low-rank matrix decomposition via ALM: RPCA-LBD(D, κ, λ)
Input: Data matrix D ∈ R^{n×p} and the parameters κ, λ.
Output: The low-rank part A and the block-sparse part E.
01. initialize: A_0 ← D; E_0 ← 0; Y_0 ← 0; μ_0 = 30/‖sign(D)‖_2; ρ > 1, β > 0, α ∈ (0, 2), k ← 0.
02. while not converged do
03.   \\ Lines 04–11 solve A_{k+1} = argmin_A L(A, E_k, Y_k; μ_k).
04.   G = D − E_k + (1/μ_k) Y_k.
05.   j ← 0, A_{k+1}^{(0)} = G.
06.   while not converged do
07.     A_{k+1}^{(j+1/2)} = U S_β(Σ) V^T, where UΣV^T is the SVD of A_{k+1}^{(j)}.
08.     A_{k+1}^{(j+1)} = A_{k+1}^{(j)} + α ( T_{βκ(1−λ)/(1+βμ_k)}( (2A_{k+1}^{(j+1/2)} − A_{k+1}^{(j)} + βμ_k G)/(1 + βμ_k) ) − A_{k+1}^{(j+1/2)} ).
09.     j ← j + 1.
10.   end while
11.   A_{k+1} = A_{k+1}^{(j+1/2)}.
12.   G = D − A_{k+1} + (1/μ_k) Y_k.
13.   E_{k+1} = T_{κλ/μ_k}(G).
14.   Y_{k+1} = Y_k + μ_k (D − A_{k+1} − E_{k+1}).
15.   μ_{k+1} = ρ μ_k.
16.   k ← k + 1.
17. end while
18. output: A ← A_k, E ← E_k.
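An end-to-end numpy sketch of Table 1.2 is given below. This is our illustrative re-implementation, not the authors' code: it uses full SVDs rather than PROPACK partial SVDs (so it only suits small matrices), and `lam=0.61` and `tol` are placeholders, since the chapter's exact λ and tolerances are not reproduced above.

```python
import numpy as np

def column_shrink(G, eta):
    norms = np.maximum(np.linalg.norm(G, axis=0), 1e-12)
    return G * np.maximum(0.0, 1.0 - eta / norms)[np.newaxis, :]

def prox_nuclear(A, beta):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - beta, 0.0)) @ Vt

def rpca_lbd(D, kappa=1.1, lam=0.61, beta=0.2, alpha=1.0, rho=1.1,
             max_outer=500, max_inner=20, tol=1e-7):
    """Illustrative RPCA-LBD (Table 1.2); lam and tol are placeholder values."""
    A, E, Y = D.copy(), np.zeros_like(D), np.zeros_like(D)
    mu = 30.0 / np.linalg.norm(np.sign(D), 2)     # mu_0 = 30 / ||sign(D)||_2
    eta = kappa * (1.0 - lam)                     # weight on ||A||_{2,1}
    normD = np.linalg.norm(D, 'fro')
    for _ in range(max_outer):
        # lines 04-11: inexact A-update by DR splitting
        G = D - E + Y / mu
        Aj = G.copy()
        for _ in range(max_inner):
            A_half = prox_nuclear(Aj, beta)
            M = (2 * A_half - Aj + beta * mu * G) / (1.0 + beta * mu)
            Aj = Aj + alpha * (column_shrink(M, beta * eta / (1.0 + beta * mu)) - A_half)
        A = A_half
        # lines 12-13: E-update by column shrinkage
        E = column_shrink(D - A + Y / mu, kappa * lam / mu)
        # lines 14-16: multiplier and penalty updates
        Y = Y + mu * (D - A - E)
        mu = rho * mu
        if np.linalg.norm(D - A - E, 'fro') < tol * normD:
            break
    return A, E
```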

1.5 Numerical Simulations

Throughout this section, we consider only square matrices with n = p. The square matrix D is generated as follows.

Construction of the matrix D:
- Construct the matrix A as the product of two Gaussian matrices with dimensions n × r and r × n, so that rank(A) = r. Normalize the columns of A.
- Generate a support S of size s as a random subset of {1, ..., p}. Let the columns of E corresponding to the support S be sampled from the Gaussian distribution. Normalize the non-zero columns of E.
- Set the columns of A corresponding to S to zero.
- Construct D as the sum of A and E.

1.5.1 Convergence Behavior

In this subsection, we focus on the convergence behavior of RPCA-LBD and compare it with the classical RPCA solved by the ALM algorithm proposed in [LCWM09]. To illustrate the convergence behavior of the two algorithms, we apply them to the same matrix D, run the same number of iterations K (chosen as 20 here), and observe the evolution of the reconstruction errors ρ_A(k), ρ_E(k), ρ_L(k), and ρ_S(k), defined as follows.

DEFINITION 1.1 Let Â(k) and Ê(k) be the approximations to A and E after k iterations of the algorithm RPCA-LBD, and let L̂(k) and Ŝ(k) be the approximations to A and E after k iterations of the algorithm RPCA. Then

    ρ_A(k) = ‖Â(k) − A‖_F / ‖A‖_F,        (1.35)
    ρ_E(k) = ‖Ê(k) − E‖_F / ‖E‖_F,        (1.36)
    ρ_L(k) = ‖L̂(k) − A‖_F / ‖A‖_F,        (1.37)
    ρ_S(k) = ‖Ŝ(k) − E‖_F / ‖E‖_F.        (1.38)
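For reference, the data-generation recipe above and the relative-error metric of Definition 1.1 can be sketched as follows (our code; the names are illustrative):

```python
import numpy as np

def make_test_matrix(n, r, s, seed=0):
    """Generate D = A + E as described above (square case, p = n)."""
    rng = np.random.default_rng(seed)
    p = n
    A = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))   # rank-r factor product
    A /= np.linalg.norm(A, axis=0, keepdims=True)                   # normalize columns of A
    S = rng.choice(p, size=s, replace=False)                        # outlier support
    E = np.zeros((n, p))
    E[:, S] = rng.standard_normal((n, s))
    E[:, S] /= np.linalg.norm(E[:, S], axis=0, keepdims=True)       # normalize outlier columns
    A[:, S] = 0.0                                                   # zero A on the outlier support
    return A + E, A, E

def relative_error(X_hat, X):
    """E.g. rho_A(k) = ||A_hat(k) - A||_F / ||A||_F as in Definition 1.1."""
    return np.linalg.norm(X_hat - X, 'fro') / np.linalg.norm(X, 'fro')
```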

FIGURE 1.1: Reconstruction error versus iteration number k for n = 100, p = 100, r = 4, s = 4 (RPCA vs. RPCA-LBD).

Case I: n = 100, p = 100, r = 4, s = 4. Figures 1.1 and 1.2 show the reconstruction errors versus the iteration number k for this case, using RPCA-LBD and RPCA.

Case II: n = 200, p = 200, r = 8, s = 8. Figures 1.3 and 1.4 show the reconstruction errors versus the iteration number k for this case, using RPCA-LBD and RPCA.

Case III: n = 300, p = 300, r = 12, s = 12. Figures 1.5 and 1.6 show the reconstruction errors versus the iteration number k for this case, using RPCA-LBD and RPCA.

FIGURE 1.2: Reconstruction error versus iteration number k for n = 100, p = 100, r = 4, s = 4 (RPCA vs. RPCA-LBD).

FIGURE 1.3: Reconstruction error versus iteration number k for n = 200, p = 200, r = 8, s = 8 (RPCA vs. RPCA-LBD).

FIGURE 1.4: Reconstruction error versus iteration number k for n = 200, p = 200, r = 8, s = 8 (RPCA vs. RPCA-LBD).

FIGURE 1.5: Reconstruction error versus iteration number k for n = 300, p = 300, r = 12, s = 12 (RPCA vs. RPCA-LBD).

FIGURE 1.6: Reconstruction error versus iteration number k for n = 300, p = 300, r = 12, s = 12 (RPCA vs. RPCA-LBD).

Comment 5.1: As seen from the simulations, in all cases (n = 100, 200, 300) the proposed algorithm RPCA-LBD outperforms the classical algorithm RPCA in terms of the evolution of the four indicators ρ_A(k), ρ_E(k), ρ_L(k), and ρ_S(k). For example, comparing the evolution of ρ_A(k) and ρ_L(k), we see that RPCA stagnates near a poor solution after about 9 iterations in all three cases. In contrast, the proposed RPCA-LBD does not stagnate as quickly and therefore approaches a much better approximation of the true low-rank matrix A, again in all three cases. Recall that ρ_A(k) and ρ_L(k) are the relative errors between the respective approximations and the true low-rank matrix A at the kth iteration. From the simulations, the smallest relative error the classical RPCA achieves is on the order of 10⁻¹, whereas the proposed RPCA-LBD achieves a relative error that is orders of magnitude smaller. Since the classical RPCA stagnates within only 9 iterations in all three cases, 20 iterations suffice for a meaningful comparison of the evolution of the four indicators of the two algorithms. Here, r = round(0.04n) and s = round(0.04n) for all three cases.

1.5.2 Comparison of Final Relative Error

The previous subsection gave an overall picture of the convergence behavior of the proposed algorithm RPCA-LBD and the classical algorithm RPCA. In this subsection, we compare the final results the two algorithms achieve in terms of the final relative recovery errors ρ_A, ρ_E, ρ_L, and ρ_S, defined as follows.

DEFINITION 1.2 Let Â and Ê be the final approximations to A and E obtained by running the algorithm RPCA-LBD until convergence, and let L̂ and Ŝ be the final approximations to A and E obtained by running the algorithm RPCA until convergence. Then the final relative recovery errors are defined as

    ρ_A = ‖Â − A‖_F / ‖A‖_F,        (1.39)
    ρ_E = ‖Ê − E‖_F / ‖E‖_F,        (1.40)
    ρ_L = ‖L̂ − A‖_F / ‖A‖_F,        (1.41)
    ρ_S = ‖Ŝ − E‖_F / ‖E‖_F.        (1.42)

In Table 1.3, we compare the four final relative recovery errors of RPCA-LBD and RPCA, namely ρ_A, ρ_E, ρ_L, and ρ_S, for n = 100, 200, 300, 400, 500, 600, 700, 800, and 900.

Comment 5.2: The rank of the low-rank matrix is r = round(0.04n), and the number of non-zero columns of the block-sparse matrix is s = round(0.04n). We see that RPCA-LBD accurately recovers both the low-rank matrix A and the block-sparse matrix E: the relative recovery errors ρ_A and ρ_E achieved by RPCA-LBD are orders of magnitude smaller than the corresponding errors ρ_L and ρ_S achieved by RPCA.

TABLE 1.3 Comparison of our algorithm (RPCA-LBD) and RPCA: final relative recovery errors ρ_A, ρ_E, ρ_L, and ρ_S for n = 100, 200, ..., 900.

1.5.3 Comparison of the Matrices Ê and Ŝ

In this subsection, we visually compare the reconstructed block-sparse matrices Ê and Ŝ produced by the proposed algorithm RPCA-LBD and the classical algorithm RPCA, respectively. We display only the reconstructed block-sparse matrices because Ê and Ŝ approximate the block-sparse matrix E, whose image gives an immediate visual impression, whereas Â and L̂ approximate the low-rank matrix A, which here is the product of two Gaussian matrices and is less informative to display. In the following cases, we consider n = p, r = round(0.03n), and s = round(0.3n).

Case I: n = 100, p = 100, r = 3, s = 30. Figure 1.7 shows the original E together with the estimates recovered by RPCA-LBD and by the classical RPCA. Our algorithm RPCA-LBD recovers the block-sparse matrix E nearly perfectly, while RPCA performs badly; the relative error ρ_E achieved by RPCA-LBD is orders of magnitude smaller than ρ_S achieved by RPCA.

Case II: n = 200, p = 200, r = 6, s = 60. Figure 1.8 shows the original E and the recovered estimates. Again, RPCA-LBD recovers the block-sparse matrix E nearly perfectly, while RPCA performs badly.

Case III: n = 300, p = 300, r = 9, s = 90. Figure 1.9 shows the original E and the recovered estimates, with the same conclusion.

Comment 5.3: In the above cases, we set the block sparsity of matrix E to s = round(0.3n), which is already very high. As a result, the classical algorithm RPCA essentially fails to recover the block-sparse matrix E in all cases, whereas RPCA-LBD recovers it almost exactly: the relative error achieved by RPCA-LBD is orders of magnitude smaller, while the relative error achieved by RPCA is about 30%, which is very high. Hence, the proposed algorithm RPCA-LBD is more robust to the block sparsity s.

FIGURE 1.7: For the case n = 100, p = 100, r = 3, s = 30: (a) the original E; (b) the estimate recovered by RPCA-LBD; (c) the estimate recovered by RPCA.

FIGURE 1.8: For the case n = 200, p = 200, r = 6, s = 60: (a) the original E; (b) the estimate recovered by RPCA-LBD; (c) the estimate recovered by RPCA.

FIGURE 1.9: For the case n = 300, p = 300, r = 9, s = 90: (a) the original E; (b) the estimate recovered by RPCA-LBD; (c) the estimate recovered by RPCA.
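The three-panel layout of Figures 1.7–1.9 can be reproduced with a short matplotlib sketch like the one below (ours; the argument names are illustrative):

```python
import matplotlib.pyplot as plt

def show_block_sparse(E_true, E_lbd, E_rpca):
    """Display the true E next to the RPCA-LBD and RPCA estimates."""
    fig, axes = plt.subplots(1, 3, figsize=(12, 4))
    titles = ("original E", "RPCA-LBD estimate", "RPCA estimate")
    for ax, M, title in zip(axes, (E_true, E_lbd, E_rpca), titles):
        ax.imshow(M, cmap="gray", aspect="auto")
        ax.set_title(title)
        ax.set_axis_off()
    plt.tight_layout()
    plt.show()
```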

FIGURE 1.10: Final relative error versus block sparsity s, varying from 10 to 50, with n = 100, p = 100, r = 4 (RPCA vs. RPCA-LBD).

1.5.4 Examination of the Effect of the Block Sparsity s

Figures 1.10 and 1.11 show the final relative errors achieved by RPCA and RPCA-LBD for the low-rank and block-sparse components as the block sparsity s varies from 10 to 50, with n = 100, p = 100, r = 4.

Comment 5.4: We let the block sparsity s of the matrix E vary over the range [10, 50]. Figures 1.10 and 1.11 show that both RPCA-LBD and RPCA degrade as the block sparsity s becomes higher. The classical algorithm RPCA fails to recover the block-sparse matrix E and the low-rank matrix A for all tested values of the block sparsity. RPCA-LBD, however, recovers them exactly when s = 10 or 20. For s = 30, 40, 50, although RPCA-LBD can no longer recover the matrices A and E exactly, its final relative error is still much smaller than that achieved by RPCA, which again implies that the proposed algorithm RPCA-LBD is more robust to the block sparsity s.
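The sweep behind Figures 1.10–1.11 can be scripted along the following lines. This is our sketch; it reuses `make_test_matrix` and `rpca_lbd` from the earlier illustrative code:

```python
import numpy as np

def sparsity_sweep(decompose, n=100, r=4, s_values=(10, 20, 30, 40, 50), seed=0):
    """For each block sparsity s, decompose a fresh test matrix and record the
    final relative errors of the low-rank and block-sparse estimates."""
    errors = {}
    for s in s_values:
        D, A, E = make_test_matrix(n, r, s, seed=seed)   # defined in Section 1.5
        A_hat, E_hat = decompose(D)
        errors[s] = (np.linalg.norm(A_hat - A, 'fro') / np.linalg.norm(A, 'fro'),
                     np.linalg.norm(E_hat - E, 'fro') / np.linalg.norm(E, 'fro'))
    return errors

# example usage: errs = sparsity_sweep(rpca_lbd)
```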

FIGURE 1.11: Final relative error versus block sparsity s, varying from 10 to 50, with n = 100, p = 100, r = 4 (RPCA vs. RPCA-LBD).

1.6 Conclusions

In this work we proposed a convex program for accurate low-rank and block-sparse matrix decomposition. We solved the convex program using the augmented Lagrange multiplier method, and we used the Douglas/Peaceman–Rachford (DR) monotone operator splitting method to solve a subproblem arising within it. We demonstrated the accuracy of our program and compared its results with those given by robust principal component analysis. The proposed algorithm is potentially useful for outlier detection.

References

[AJW+10] A. Abdelkefi, Y. Jiang, W. Wang, A. Aslebo, and O. Kvittem. Robust traffic anomaly detection with principal component pursuit. In Proceedings of the ACM CoNEXT Student Workshop, pages 10:1–10:2, New York, NY, USA, 2010. ACM.
[CCS08] J.-F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 2010.
[CLMW11] E. J. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Journal of the ACM, 58(3):1–37, June 2011.
[CP07] P. L. Combettes and J.-C. Pesquet. A Douglas–Rachford splitting approach to nonsmooth convex variational signal recovery. IEEE Journal of Selected Topics in Signal Processing, 1(4), 2007.
[CSPW09] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky. Sparse and low-rank matrix decompositions. In 47th Annual Allerton Conference on Communication, Control, and Computing, 2009.
[FSM07] M. J. Fadili, J.-L. Starck, and F. Murtagh. Inpainting and zooming using sparse representations. The Computer Journal, 52:64–79, 2007.

[FSM09] M. J. Fadili, J.-L. Starck, and F. Murtagh. Inpainting and zooming using sparse representations. The Computer Journal, 52(1):64–79, 2009.
[Jol02] I. T. Jolliffe. Principal Component Analysis. Springer Series in Statistics. Springer, 2002.
[Lar98] R. M. Larsen. Lanczos bidiagonalization with partial reorthogonalization. Technical Report DAIMI PB-357, Department of Computer Science, Aarhus University, 1998. Code available at ~rmunk/PROPACK/.
[LCWM09] Z. Lin, M. Chen, L. Wu, and Y. Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical Report UILU-ENG-09-2215, November 2009.
[LGW+09] Z. Lin, A. Ganesh, J. Wright, L. Wu, M. Chen, and Y. Ma. Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. In Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2009.
[LLY10] G. Liu, Z. Lin, and Y. Yu. Robust subspace segmentation by low-rank representation. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010.
[MZWM10] K. Min, Z. Zhang, J. Wright, and Y. Ma. Decomposing background topics from keywords by principal component pursuit. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, 2010. ACM.
[PGW+12] Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma. RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2012.
[WGS+11] L. Wu, A. Ganesh, B. Shi, Y. Matsushita, Y. Wang, and Y. Ma. Robust photometric stereo via low-rank matrix completion and recovery. In Computer Vision – ACCV 2010. Springer.
[WYG+09] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), February 2009.
[YY09] X. Yuan and J. Yang. Sparse and low-rank matrix decomposition via alternating direction methods. Preprint, 2009.
[ZGLM12] Z. Zhang, A. Ganesh, X. Liang, and Y. Ma. TILT: Transform invariant low-rank textures. International Journal of Computer Vision, 99(1):1–24, 2012.
[ZLW+10] Z. Zhou, X. Li, J. Wright, E. Candès, and Y. Ma. Stable principal component pursuit. In Proceedings of the 2010 IEEE International Symposium on Information Theory (ISIT), 2010.
[ZYM10] G. Zhu, S. Yan, and Y. Ma. Image tag refinement towards low-rank, content-tag prior and error sparsity. In Proceedings of the International Conference on Multimedia, 2010. ACM.


More information

Sparse Solutions of an Undetermined Linear System

Sparse Solutions of an Undetermined Linear System 1 Sparse Solutions of an Undetermined Linear System Maddullah Almerdasy New York University Tandon School of Engineering arxiv:1702.07096v1 [math.oc] 23 Feb 2017 Abstract This work proposes a research

More information

arxiv: v1 [cs.cv] 17 Nov 2015

arxiv: v1 [cs.cv] 17 Nov 2015 Robust PCA via Nonconvex Rank Approximation Zhao Kang, Chong Peng, Qiang Cheng Department of Computer Science, Southern Illinois University, Carbondale, IL 6901, USA {zhao.kang, pchong, qcheng}@siu.edu

More information

GROUP-SPARSE SUBSPACE CLUSTERING WITH MISSING DATA

GROUP-SPARSE SUBSPACE CLUSTERING WITH MISSING DATA GROUP-SPARSE SUBSPACE CLUSTERING WITH MISSING DATA D Pimentel-Alarcón 1, L Balzano 2, R Marcia 3, R Nowak 1, R Willett 1 1 University of Wisconsin - Madison, 2 University of Michigan - Ann Arbor, 3 University

More information

arxiv: v3 [math.oc] 19 Oct 2017

arxiv: v3 [math.oc] 19 Oct 2017 Gradient descent with nonconvex constraints: local concavity determines convergence Rina Foygel Barber and Wooseok Ha arxiv:703.07755v3 [math.oc] 9 Oct 207 0.7.7 Abstract Many problems in high-dimensional

More information

Low-Rank Matrix Recovery

Low-Rank Matrix Recovery ELE 538B: Mathematics of High-Dimensional Data Low-Rank Matrix Recovery Yuxin Chen Princeton University, Fall 2018 Outline Motivation Problem setup Nuclear norm minimization RIP and low-rank matrix recovery

More information

Frank-Wolfe Method. Ryan Tibshirani Convex Optimization

Frank-Wolfe Method. Ryan Tibshirani Convex Optimization Frank-Wolfe Method Ryan Tibshirani Convex Optimization 10-725 Last time: ADMM For the problem min x,z f(x) + g(z) subject to Ax + Bz = c we form augmented Lagrangian (scaled form): L ρ (x, z, w) = f(x)

More information

Combining Sparsity with Physically-Meaningful Constraints in Sparse Parameter Estimation

Combining Sparsity with Physically-Meaningful Constraints in Sparse Parameter Estimation UIUC CSL Mar. 24 Combining Sparsity with Physically-Meaningful Constraints in Sparse Parameter Estimation Yuejie Chi Department of ECE and BMI Ohio State University Joint work with Yuxin Chen (Stanford).

More information

Low Rank Matrix Completion Formulation and Algorithm

Low Rank Matrix Completion Formulation and Algorithm 1 2 Low Rank Matrix Completion and Algorithm Jian Zhang Department of Computer Science, ETH Zurich zhangjianthu@gmail.com March 25, 2014 Movie Rating 1 2 Critic A 5 5 Critic B 6 5 Jian 9 8 Kind Guy B 9

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Compressed Sensing and Robust Recovery of Low Rank Matrices

Compressed Sensing and Robust Recovery of Low Rank Matrices Compressed Sensing and Robust Recovery of Low Rank Matrices M. Fazel, E. Candès, B. Recht, P. Parrilo Electrical Engineering, University of Washington Applied and Computational Mathematics Dept., Caltech

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Noisy Signal Recovery via Iterative Reweighted L1-Minimization

Noisy Signal Recovery via Iterative Reweighted L1-Minimization Noisy Signal Recovery via Iterative Reweighted L1-Minimization Deanna Needell UC Davis / Stanford University Asilomar SSC, November 2009 Problem Background Setup 1 Suppose x is an unknown signal in R d.

More information

Information Retrieval

Information Retrieval Introduction to Information CS276: Information and Web Search Christopher Manning and Pandu Nayak Lecture 13: Latent Semantic Indexing Ch. 18 Today s topic Latent Semantic Indexing Term-document matrices

More information

A new method on deterministic construction of the measurement matrix in compressed sensing

A new method on deterministic construction of the measurement matrix in compressed sensing A new method on deterministic construction of the measurement matrix in compressed sensing Qun Mo 1 arxiv:1503.01250v1 [cs.it] 4 Mar 2015 Abstract Construction on the measurement matrix A is a central

More information

Grassmann Averages for Scalable Robust PCA Supplementary Material

Grassmann Averages for Scalable Robust PCA Supplementary Material Grassmann Averages for Scalable Robust PCA Supplementary Material Søren Hauberg DTU Compute Lyngby, Denmark sohau@dtu.dk Aasa Feragen DIKU and MPIs Tübingen Denmark and Germany aasa@diku.dk Michael J.

More information