Supplementary Materials: Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications


Fanhua Shang, Member, IEEE, James Cheng, Yuanyuan Liu, Zhi-Quan Luo, Fellow, IEEE, and Zhouchen Lin, Senior Member, IEEE

In this supplementary material, we give the detailed proofs of some theorems, lemmas and properties. We also provide the stopping criterion of our algorithms and the details of Algorithm 2. In addition, we present two new ADMM algorithms for the image recovery application together with their pseudo-codes, and some additional experimental results on both synthetic and real-world datasets.

NOTATIONS

$\mathbb{R}^{l}$ denotes the $l$-dimensional Euclidean space, and the set of all $m\times n$ matrices with real entries is denoted by $\mathbb{R}^{m\times n}$ (without loss of generality, we assume $m\geq n$ in this paper). $\langle X,Y\rangle=\mathrm{Tr}(X^{T}Y)=\sum_{ij}X_{ij}Y_{ij}$, where $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix. We assume that the singular values of $X\in\mathbb{R}^{m\times n}$ are ordered as $\sigma_{1}(X)\geq\sigma_{2}(X)\geq\cdots\geq\sigma_{r}(X)>\sigma_{r+1}(X)=\cdots=\sigma_{n}(X)=0$, where $r=\mathrm{rank}(X)$. Then the SVD of $X$ is denoted by $X=U\Sigma V^{T}$, where $\Sigma=\mathrm{diag}(\sigma_{1},\ldots,\sigma_{n})$. $I_{n}$ denotes an identity matrix of size $n\times n$.

Definition 5. For any vector $x\in\mathbb{R}^{l}$, its $l_{p}$-norm for $0<p<\infty$ is defined as

    $\|x\|_{l_{p}}=\Big(\sum_{i=1}^{l}|x_{i}|^{p}\Big)^{1/p},$

where $x_{i}$ is the $i$-th element of $x$. When $p=1$, the $l_{1}$-norm of $x$ is $\|x\|_{l_{1}}=\sum_{i}|x_{i}|$, which is convex, while the $l_{p}$-norm of $x$ is a quasi-norm when $0<p<1$, which is non-convex and violates the triangle inequality. In addition, the $l_{2}$-norm of $x$ is $\|x\|_{l_{2}}=\sqrt{\sum_{i}x_{i}^{2}}$. The above definition can be naturally extended from vectors to matrices in the following form:

    $\|S\|_{l_{p}}=\Big(\sum_{i,j}|s_{i,j}|^{p}\Big)^{1/p}.$

Definition 6. The Schatten-$p$ norm ($0<p<\infty$) of a matrix $X\in\mathbb{R}^{m\times n}$ is defined as follows:

    $\|X\|_{S_{p}}=\Big(\sum_{i=1}^{n}\sigma_{i}^{p}(X)\Big)^{1/p},$

where $\sigma_{i}(X)$ denotes the $i$-th largest singular value of $X$.

F. Shang, J. Cheng and Y. Liu are with the Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong ({fhshang, jcheng, yyliu}@cse.cuhk.edu.hk). Z.-Q. Luo is with the Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, China, and the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA (luozq@cuhk.edu.cn and luozq@umn.edu). Z. Lin is with the Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing 100871, P.R. China, and the Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai 200240, P.R. China (zlin@pku.edu.cn).
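To make Definitions 5 and 6 concrete, the following minimal NumPy sketch (our illustration, not part of the authors' code release; the helper names lp_norm and schatten_p_norm are ours) evaluates the $l_p$ (quasi-)norm of a vector and the Schatten-$p$ norm of a matrix via its singular values. For $p=1$ and $p=2$ the latter reduces to the nuclear and Frobenius norms, respectively:

    import numpy as np

    def lp_norm(x, p):
        # l_p (quasi-)norm of a vector (Definition 5); a quasi-norm when 0 < p < 1.
        return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

    def schatten_p_norm(X, p):
        # Schatten-p norm (Definition 6): the l_p norm of the singular value vector.
        return lp_norm(np.linalg.svd(X, compute_uv=False), p)

    X = np.random.randn(5, 4)
    assert np.isclose(schatten_p_norm(X, 1), np.linalg.norm(X, 'nuc'))  # p = 1: nuclear norm
    assert np.isclose(schatten_p_norm(X, 2), np.linalg.norm(X, 'fro'))  # p = 2: Frobenius norm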

In the following, we list some special cases of the Schatten-$p$ norm ($0<p<\infty$). When $0<p<1$, the Schatten-$p$ norm is a quasi-norm, and it is non-convex and violates the triangle inequality. When $p=1$, the Schatten-1 norm, also known as the nuclear norm or trace norm of $X$, is defined as

    $\|X\|_{*}=\sum_{i=1}^{n}\sigma_{i}(X).$

When $p=2$, the Schatten-2 norm is more commonly called the Frobenius norm, defined as

    $\|X\|_{F}=\sqrt{\sum_{i,j}X_{i,j}^{2}}=\sqrt{\sum_{i=1}^{n}\sigma_{i}^{2}(X)}.$

Note that the Frobenius norm is the induced norm of the $l_{2}$-norm on matrices.

APPENDIX A: PROOF OF LEMMA 2

To prove Lemma 2, we first define the doubly stochastic matrix, and give the following lemma.

Definition 7. A square matrix is doubly stochastic if its elements are non-negative real numbers, and the sum of the elements of each row or column is equal to 1.

Lemma 7. Let $P\in\mathbb{R}^{n\times n}$ be a doubly stochastic matrix, and suppose that

    $0\leq x_{1}\leq x_{2}\leq\cdots\leq x_{n},\qquad y_{1}\geq y_{2}\geq\cdots\geq y_{n}\geq 0;$    (26)

then

    $\sum_{i,j=1}^{n}p_{ij}x_{i}y_{j}\geq\sum_{k=1}^{n}x_{k}y_{k}.$

The proof of Lemma 7 is essentially similar to that of the lemma in [1]; thus we only give the following proof sketch for this lemma.

Proof: Using (26), there exist non-negative numbers $\alpha_{i}$ and $\beta_{j}$ for all $1\leq i,j\leq n$ such that

    $x_{k}=\sum_{1\leq i\leq k}\alpha_{i},\qquad y_{k}=\sum_{k\leq j\leq n}\beta_{j},\qquad \text{for all } k=1,\ldots,n.$

Let $\delta_{ij}$ denote the Kronecker delta (i.e., $\delta_{ij}=1$ if $i=j$, and $\delta_{ij}=0$ otherwise); then we have

    $\sum_{k=1}^{n}x_{k}y_{k}-\sum_{i,j=1}^{n}p_{ij}x_{i}y_{j}=\sum_{1\leq i,j\leq n}(\delta_{ij}-p_{ij})x_{i}y_{j}=\sum_{1\leq r,s\leq n}\alpha_{r}\beta_{s}\sum_{r\leq i\leq n,\,1\leq j\leq s}(\delta_{ij}-p_{ij}).$

If $r\leq s$, by the lemma in [1] we know that

    $\sum_{r\leq i\leq n,\,1\leq j\leq s}(\delta_{ij}-p_{ij})\leq 0.$

The same result can be obtained in a similar way for $r>s$. Therefore, we have $\sum_{k=1}^{n}x_{k}y_{k}\leq\sum_{i,j=1}^{n}p_{ij}x_{i}y_{j}$.

Proof of Lemma 2:
Proof: Using the properties of the trace, we know that

    $\mathrm{Tr}(X^{T}Y)=\sum_{i,j=1}^{n}(U_{X})_{ij}^{2}\,\lambda_{i}\tau_{j}.$

Note that $U_{X}$ is a unitary matrix, i.e., $U_{X}^{T}U_{X}=U_{X}U_{X}^{T}=I_{n}$, which implies that $((U_{X})_{ij}^{2})$ is a doubly stochastic matrix. By Lemma 7, we have $\mathrm{Tr}(X^{T}Y)\geq\sum_{i=1}^{n}\lambda_{i}\tau_{i}$.
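As a quick sanity check on Lemma 7 (an illustration of ours, under the ordering of the sequences as reconstructed above), one can sample a doubly stochastic matrix as a convex combination of permutation matrices (Birkhoff's theorem) and verify the inequality numerically:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 6
    # A convex combination of permutation matrices is doubly stochastic (Birkhoff).
    perms = [np.eye(n)[rng.permutation(n)] for _ in range(10)]
    w = rng.random(10)
    w /= w.sum()
    P = sum(wi * Pi for wi, Pi in zip(w, perms))

    x = np.sort(rng.random(n))           # 0 <= x_1 <= ... <= x_n
    y = np.sort(rng.random(n))[::-1]     # y_1 >= ... >= y_n >= 0
    assert x @ P @ y >= x @ y - 1e-12    # sum_ij p_ij x_i y_j >= sum_k x_k y_k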

APPENDIX B: PROOFS OF THEOREMS 1 AND 2

Proof of Theorem 1:
Proof: Using Lemma 3, for any factor matrices $U\in\mathbb{R}^{m\times d}$ and $V\in\mathbb{R}^{n\times d}$ with the constraint $X=UV^{T}$, we have

    $\frac{(\|U\|_{*}+\|V\|_{*})^{2}}{4}\geq\|X\|_{S_{1/2}}.$

On the other hand, let $U^{\star}=L_{X}\Sigma_{X}^{1/2}$ and $V^{\star}=R_{X}\Sigma_{X}^{1/2}$, where $X=L_{X}\Sigma_{X}R_{X}^{T}$ is the SVD of $X$ as in Lemma 3; then we have $X=U^{\star}(V^{\star})^{T}$ and $\|X\|_{S_{1/2}}=(\|U^{\star}\|_{*}+\|V^{\star}\|_{*})^{2}/4$. Therefore, we have

    $\min_{U,V:\,X=UV^{T}}\frac{(\|U\|_{*}+\|V\|_{*})^{2}}{4}=\|X\|_{S_{1/2}}.$

This completes the proof.

Proof of Theorem 2:
Proof: Using Lemma 4, for any $U\in\mathbb{R}^{m\times d}$ and $V\in\mathbb{R}^{n\times d}$ with the constraint $X=UV^{T}$, we have

    $\frac{\|U\|_{F}^{2}+2\|V\|_{*}}{3}\geq\|X\|_{S_{2/3}}^{2/3}.$

On the other hand, let $U^{\star}=L_{X}\Sigma_{X}^{1/3}$ and $V^{\star}=R_{X}\Sigma_{X}^{2/3}$, where $X=L_{X}\Sigma_{X}R_{X}^{T}$ is the SVD of $X$ as in Lemma 4; then we have $X=U^{\star}(V^{\star})^{T}$ and $\|X\|_{S_{2/3}}=\big[(\|U^{\star}\|_{F}^{2}+2\|V^{\star}\|_{*})/3\big]^{3/2}$. Thus, we have

    $\min_{U,V:\,X=UV^{T}}\frac{\|U\|_{F}^{2}+2\|V\|_{*}}{3}=\|X\|_{S_{2/3}}^{2/3}.$

This completes the proof.

APPENDIX C: PROOF OF PROPERTY 4

Proof: The proof of Property 4 involves some properties of the $l_{p}$-norm, which we recall as follows. For any vector $x\in\mathbb{R}^{n}$ and $0<p_{2}\leq p_{1}\leq 1$, the following inequalities hold:

    $\|x\|_{l_{1}}\leq\|x\|_{l_{p_{1}}},\qquad \|x\|_{l_{p_{1}}}\leq\|x\|_{l_{p_{2}}}\leq n^{\frac{1}{p_{2}}-\frac{1}{p_{1}}}\|x\|_{l_{p_{1}}}.$

Let $X$ be an $m\times n$ matrix of rank $r$, and denote its compact SVD by $X=U_{m\times r}\Sigma_{r\times r}V_{n\times r}^{T}$. By Theorems 1 and 2, and the properties of the $l_{p}$-norm mentioned above, we have

    $\|X\|_{*}=\|\mathrm{diag}(\Sigma_{r\times r})\|_{l_{1}}\leq\|\mathrm{diag}(\Sigma_{r\times r})\|_{l_{1/2}}=\|X\|_{\text{D-N}}\leq\mathrm{rank}(X)\,\|X\|_{*},$
    $\|X\|_{*}=\|\mathrm{diag}(\Sigma_{r\times r})\|_{l_{1}}\leq\|\mathrm{diag}(\Sigma_{r\times r})\|_{l_{2/3}}=\|X\|_{\text{F-N}}\leq\sqrt{\mathrm{rank}(X)}\,\|X\|_{*}.$

In addition,

    $\|X\|_{\text{F-N}}=\|\mathrm{diag}(\Sigma_{r\times r})\|_{l_{2/3}}\leq\|\mathrm{diag}(\Sigma_{r\times r})\|_{l_{1/2}}=\|X\|_{\text{D-N}}.$

Therefore, we have

    $\|X\|_{*}\leq\|X\|_{\text{F-N}}\leq\|X\|_{\text{D-N}}\leq\mathrm{rank}(X)\,\|X\|_{*}.$

This completes the proof.
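The closed-form factorizations used in the proofs of Theorems 1 and 2 are easy to check numerically. The sketch below (our illustration, with the $S_{1/2}$ and $S_{2/3}$ expressions as reconstructed above) builds $U^{\star}$ and $V^{\star}$ from the SVD and confirms that they attain the Schatten quasi-norm values:

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((8, 5)) @ rng.standard_normal((5, 6))  # rank <= 5
    L, s, Rt = np.linalg.svd(X, full_matrices=False)
    R = Rt.T
    nuc = lambda A: np.linalg.norm(A, 'nuc')

    # Theorem 1: U* = L Sigma^{1/2}, V* = R Sigma^{1/2} attains ||X||_{S_{1/2}}.
    U1, V1 = L * s**0.5, R * s**0.5
    assert np.allclose(U1 @ V1.T, X)
    assert np.isclose((nuc(U1) + nuc(V1))**2 / 4, np.sum(s**0.5)**2)

    # Theorem 2: U* = L Sigma^{1/3}, V* = R Sigma^{2/3} attains ||X||_{S_{2/3}}.
    U2, V2 = L * s**(1/3), R * s**(2/3)
    assert np.allclose(U2 @ V2.T, X)
    assert np.isclose((np.linalg.norm(U2, 'fro')**2 + 2 * nuc(V2)) / 3, np.sum(s**(2/3)))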

APPENDIX D: SOLVING (18) VIA ADMM

Similar to Algorithm 1, we also propose an efficient algorithm based on the alternating direction method of multipliers (ADMM) to solve (18), whose augmented Lagrangian function is given by

    $\mathcal{L}_{\mu}(U,V,L,S,\widehat{V},Y_{1},Y_{2},Y_{3})=\frac{\lambda(\|U\|_{F}^{2}+2\|\widehat{V}\|_{*})}{3}+\|\mathcal{P}_{\Omega}(S)\|_{l_{2/3}}^{2/3}+\langle Y_{1},\widehat{V}-V\rangle+\langle Y_{2},UV^{T}-L\rangle+\langle Y_{3},L+S-D\rangle+\frac{\mu}{2}\big(\|UV^{T}-L\|_{F}^{2}+\|\widehat{V}-V\|_{F}^{2}+\|L+S-D\|_{F}^{2}\big),$

where $Y_{1}\in\mathbb{R}^{n\times d}$, $Y_{2}\in\mathbb{R}^{m\times n}$ and $Y_{3}\in\mathbb{R}^{m\times n}$ are the matrices of Lagrange multipliers.

Algorithm 2: ADMM for solving the S+L$_{2/3}$ problem (18)
Input: $D\in\mathbb{R}^{m\times n}$, the given rank $d$ and $\lambda$.
Initialize: $\mu_{0}$, $\rho>1$, $k=0$, and $\epsilon$.
1: while not converged do
2:   while not converged do
3:     Update $U_{k+1}$ and $V_{k+1}$ by (39) and (40), respectively.
4:     Update $\widehat{V}_{k+1}$ by $\widehat{V}_{k+1}=\mathcal{D}_{2\lambda/(3\mu_{k})}(V_{k+1}-Y_{1}^{k}/\mu_{k})$.
5:     Update $L_{k+1}$ by (43) and $S_{k+1}$ by the corresponding update in the main paper, respectively.
6:   end while // Inner loop
7:   Update the multipliers $Y_{1}^{k+1}$, $Y_{2}^{k+1}$ and $Y_{3}^{k+1}$ by $Y_{1}^{k+1}=Y_{1}^{k}+\mu_{k}(\widehat{V}_{k+1}-V_{k+1})$, $Y_{2}^{k+1}=Y_{2}^{k}+\mu_{k}(U_{k+1}V_{k+1}^{T}-L_{k+1})$, and $Y_{3}^{k+1}=Y_{3}^{k}+\mu_{k}(L_{k+1}+S_{k+1}-D)$.
8:   Update $\mu_{k+1}$ by $\mu_{k+1}=\rho\mu_{k}$.
9:   $k\leftarrow k+1$.
10: end while // Outer loop
Output: $U_{k+1}$ and $V_{k+1}$.

Update of $U_{k+1}$ and $V_{k+1}$: For updating $U_{k+1}$ and $V_{k+1}$, we consider the following optimization problems:

    $\min_{U}\ \frac{\lambda\|U\|_{F}^{2}}{3}+\frac{\mu_{k}}{2}\|UV_{k}^{T}-L_{k}+\mu_{k}^{-1}Y_{2}^{k}\|_{F}^{2},$    (37)
    $\min_{V}\ \frac{\mu_{k}}{2}\|\widehat{V}_{k}-V+\mu_{k}^{-1}Y_{1}^{k}\|_{F}^{2}+\frac{\mu_{k}}{2}\|U_{k+1}V^{T}-L_{k}+\mu_{k}^{-1}Y_{2}^{k}\|_{F}^{2},$    (38)

and their optimal solutions can be given by

    $U_{k+1}=\mu_{k}P_{k}V_{k}\Big(\frac{2\lambda}{3}I_{d}+\mu_{k}V_{k}^{T}V_{k}\Big)^{-1},$    (39)
    $V_{k+1}=\big[\widehat{V}_{k}+\mu_{k}^{-1}Y_{1}^{k}+P_{k}^{T}U_{k+1}\big]\big(I_{d}+U_{k+1}^{T}U_{k+1}\big)^{-1},$    (40)

where $P_{k}=L_{k}-\mu_{k}^{-1}Y_{2}^{k}$.

Update of $\widehat{V}_{k+1}$: To update $\widehat{V}_{k+1}$, we fix the other variables and solve the following optimization problem:

    $\min_{\widehat{V}}\ \frac{2\lambda}{3}\|\widehat{V}\|_{*}+\frac{\mu_{k}}{2}\|\widehat{V}-V_{k+1}+Y_{1}^{k}/\mu_{k}\|_{F}^{2}.$    (41)

Similar to (23) and (24) in this paper, the closed-form solution of (41) can also be obtained by the SVT operator [2], defined as follows.

Definition 8. Let $Y$ be a matrix of size $m\times n$ ($m\geq n$), and $U_{Y}\Sigma_{Y}V_{Y}^{T}$ be its SVD. Then the singular value thresholding (SVT) operator $\mathcal{D}_{\tau}$ is defined as [2]:

    $\mathcal{D}_{\tau}(Y)=U_{Y}\mathcal{S}_{\tau}(\Sigma_{Y})V_{Y}^{T},$

where $\mathcal{S}_{\tau}(x)=\max(|x|-\tau,0)\,\mathrm{sgn}(x)$ is the soft shrinkage operator [3], [4], [5].

Update of $L_{k+1}$: For updating $L_{k+1}$, we consider the following optimization problem:

    $\min_{L}\ \|U_{k+1}V_{k+1}^{T}-L+\mu_{k}^{-1}Y_{2}^{k}\|_{F}^{2}+\|L+S_{k}-D+\mu_{k}^{-1}Y_{3}^{k}\|_{F}^{2}.$    (42)

Since (42) is a least squares problem, its closed-form solution is given by

    $L_{k+1}=\frac{1}{2}\big(U_{k+1}V_{k+1}^{T}+\mu_{k}^{-1}Y_{2}^{k}-S_{k}+D-\mu_{k}^{-1}Y_{3}^{k}\big).$    (43)

Together with the update scheme of $S_{k+1}$, as stated in the main paper, we develop an efficient ADMM algorithm to solve the Frobenius/nuclear hybrid norm penalized RPCA problem (18), as outlined in Algorithm 2.
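For reference, a minimal NumPy implementation of the SVT operator $\mathcal{D}_{\tau}$ of Definition 8 (our sketch, not the authors' released code) is:

    import numpy as np

    def soft_shrink(x, tau):
        # Soft shrinkage operator: S_tau(x) = max(|x| - tau, 0) * sgn(x).
        return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

    def svt(Y, tau):
        # Singular value thresholding D_tau(Y): soft-threshold the singular values.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        return U @ np.diag(soft_shrink(s, tau)) @ Vt

In Algorithm 2 this operator is applied with threshold $2\lambda/(3\mu_{k})$ in the $\widehat{V}$-update, i.e., svt(V - Y1/mu, 2*lam/(3*mu)) under the reconstruction above.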

APPENDIX E: PROOF OF THEOREM 3

In this part, we first prove the boundedness of the multipliers and of some variables of Algorithm 1, and then we analyze the convergence of Algorithm 1. To prove the boundedness, we first give the following lemma.

Lemma 8 ([6]). Let $\mathcal{H}$ be a real Hilbert space endowed with an inner product $\langle\cdot,\cdot\rangle$ and a corresponding norm $\|\cdot\|$, and let $y\in\partial\|x\|$, where $\partial f(x)$ denotes the subgradient of $f(x)$. Then $\|y\|^{*}=1$ if $x\neq 0$, and $\|y\|^{*}\leq 1$ if $x=0$, where $\|\cdot\|^{*}$ is the dual norm of $\|\cdot\|$. For instance, the dual norm of the nuclear norm is the spectral norm $\|\cdot\|_{2}$, i.e., the largest singular value.

Lemma 9 (Boundedness). Let $Y_{1}^{k+1}=Y_{1}^{k}+\mu_{k}(\widehat{U}_{k+1}-U_{k+1})$, $Y_{2}^{k+1}=Y_{2}^{k}+\mu_{k}(\widehat{V}_{k+1}-V_{k+1})$, $Y_{3}^{k+1}=Y_{3}^{k}+\mu_{k}(U_{k+1}V_{k+1}^{T}-L_{k+1})$ and $Y_{4}^{k+1}=Y_{4}^{k}+\mu_{k}(L_{k+1}+S_{k+1}-D)$. Suppose that $\mu_{k}$ is non-decreasing and $\sum_{k=0}^{\infty}\mu_{k+1}/\mu_{k}^{4/3}<\infty$; then the sequences $\{U_{k},V_{k}\}$, $\{\widehat{U}_{k},\widehat{V}_{k}\}$, $\{Y_{1}^{k},Y_{2}^{k},Y_{3}^{k}\}$, $\{Y_{4}^{k}/\mu_{k-1}^{1/3}\}$, $\{L_{k}\}$ and $\{S_{k}\}$ produced by Algorithm 1 are all bounded.

Proof: Let $\mathbf{U}_{k}:=(U_{k},V_{k})$, $\widehat{\mathbf{V}}_{k}:=(\widehat{U}_{k},\widehat{V}_{k})$ and $\mathbf{Y}^{k}:=(Y_{1}^{k},Y_{2}^{k},Y_{3}^{k},Y_{4}^{k})$. By the first-order optimality conditions of the augmented Lagrangian function of (17) with respect to $\widehat{U}$ and $\widehat{V}$ (i.e., problems (23) and (24)), we have $0\in\partial_{\widehat{U}}\mathcal{L}_{\mu_{k}}(\mathbf{U}_{k+1},\widehat{\mathbf{V}}_{k+1},L_{k+1},S_{k+1},\mathbf{Y}^{k})$ and $0\in\partial_{\widehat{V}}\mathcal{L}_{\mu_{k}}(\mathbf{U}_{k+1},\widehat{\mathbf{V}}_{k+1},L_{k+1},S_{k+1},\mathbf{Y}^{k})$, i.e.,

    $-Y_{1}^{k+1}\in\lambda\,\partial\|\widehat{U}_{k+1}\|_{*}\quad\text{and}\quad -Y_{2}^{k+1}\in\lambda\,\partial\|\widehat{V}_{k+1}\|_{*},$

where $Y_{1}^{k+1}=\mu_{k}(\widehat{U}_{k+1}-U_{k+1}+Y_{1}^{k}/\mu_{k})$ and $Y_{2}^{k+1}=\mu_{k}(\widehat{V}_{k+1}-V_{k+1}+Y_{2}^{k}/\mu_{k})$. By Lemma 8, we obtain $\|Y_{1}^{k+1}\|_{2}\leq\lambda$ and $\|Y_{2}^{k+1}\|_{2}\leq\lambda$, which implies that the sequences $\{Y_{1}^{k}\}$ and $\{Y_{2}^{k}\}$ are bounded.

Next we prove the boundedness of $\{Y_{4}^{k}/\mu_{k-1}^{1/3}\}$. Using (27), it is easy to show that $\mathcal{P}_{\Omega^{c}}(Y_{4}^{k+1})=0$, and hence $\|Y_{4}^{k+1}\|_{F}=\|\mathcal{P}_{\Omega}(Y_{4}^{k+1})\|_{F}$. Let $A=D-L_{k+1}-\mu_{k}^{-1}Y_{4}^{k}$ and $\Phi(S):=\|\mathcal{P}_{\Omega}(S)\|_{l_{1/2}}^{1/2}$. To bound $\{\mathcal{P}_{\Omega}(Y_{4}^{k})/\mu_{k-1}^{1/3}\}$, we consider the following two cases.

Case 1: $|a_{ij}|$ exceeds the thresholding value of the half-thresholding operator in Proposition 1, $(i,j)\in\Omega$. Then, by Proposition 1, $[S_{k+1}]_{ij}\neq 0$. The first-order optimality condition of (27) implies that $[\nabla\Phi(S_{k+1})]_{ij}+[Y_{4}^{k+1}]_{ij}=0$, where $[\nabla\Phi(S_{k+1})]_{ij}$ denotes the gradient of the penalty $\Phi(S)$ at $[S_{k+1}]_{ij}$. Since $\partial|s_{ij}|^{1/2}/\partial s_{ij}=\mathrm{sign}(s_{ij})/(2\sqrt{|s_{ij}|})$, then $|[Y_{4}^{k+1}]_{ij}|=1/(2\sqrt{|[S_{k+1}]_{ij}|})$. Using Proposition 1, we have

    $\frac{|[Y_{4}^{k+1}]_{ij}|}{\mu_{k}^{1/3}}=\frac{1}{2\mu_{k}^{1/3}\sqrt{|[S_{k+1}]_{ij}|}}=\frac{1}{2\mu_{k}^{1/3}\sqrt{\frac{2|a_{ij}|}{3}\big(1+\cos\big(\frac{2\pi}{3}-\frac{2}{3}\phi_{\gamma}(a_{ij})\big)\big)}},$

which is bounded, where $\gamma=2/\mu_{k}$.

Case 2: otherwise, using Proposition 1, we have $[S_{k+1}]_{ij}=0$. Since

    $[Y_{4}^{k+1}/\mu_{k}]_{ij}=[L_{k+1}+S_{k+1}-D+\mu_{k}^{-1}Y_{4}^{k}]_{ij}=[L_{k+1}-D+\mu_{k}^{-1}Y_{4}^{k}]_{ij}=-a_{ij},$

then $|[Y_{4}^{k+1}]_{ij}|=\mu_{k}|a_{ij}|$ is of order at most $\mu_{k}^{1/3}$, and $\|Y_{4}^{k+1}\|_{F}/\mu_{k}^{1/3}=\|\mathcal{P}_{\Omega}(Y_{4}^{k+1})\|_{F}/\mu_{k}^{1/3}=O(\sqrt{|\Omega|})$. Therefore, $\{Y_{4}^{k}/\mu_{k-1}^{1/3}\}$ is bounded.

By the iterative scheme of Algorithm 1, we have

    $\mathcal{L}_{\mu_{k}}(\mathbf{U}_{k+1},\widehat{\mathbf{V}}_{k+1},L_{k+1},S_{k+1},\mathbf{Y}^{k})\leq\mathcal{L}_{\mu_{k}}(\mathbf{U}_{k},\widehat{\mathbf{V}}_{k},L_{k},S_{k},\mathbf{Y}^{k})=\mathcal{L}_{\mu_{k-1}}(\mathbf{U}_{k},\widehat{\mathbf{V}}_{k},L_{k},S_{k},\mathbf{Y}^{k-1})+\alpha_{k}\sum_{j=1}^{3}\|Y_{j}^{k}-Y_{j}^{k-1}\|_{F}^{2}+\beta_{k}\frac{\|Y_{4}^{k}-Y_{4}^{k-1}\|_{F}^{2}}{\mu_{k-1}^{2/3}},$

where $\alpha_{k}=\frac{\mu_{k-1}+\mu_{k}}{2\mu_{k-1}^{2}}$ and $\beta_{k}=\frac{\mu_{k-1}+\mu_{k}}{2\mu_{k-1}^{4/3}}$, and the above equality holds due to the definition of the augmented Lagrangian function $\mathcal{L}_{\mu}(U,V,L,S,\widehat{U},\widehat{V},\{Y_{i}\})$. Since $\mu_{k}$ is non-decreasing and $\sum_{k=1}^{\infty}\mu_{k}/\mu_{k-1}^{4/3}<\infty$, then

    $\sum_{k=1}^{\infty}\alpha_{k}=\sum_{k=1}^{\infty}\frac{\mu_{k-1}+\mu_{k}}{2\mu_{k-1}^{2}}\leq\sum_{k=1}^{\infty}\frac{\mu_{k}}{\mu_{k-1}^{2}}<\infty,\qquad \sum_{k=1}^{\infty}\beta_{k}=\sum_{k=1}^{\infty}\frac{\mu_{k-1}+\mu_{k}}{2\mu_{k-1}^{4/3}}\leq\sum_{k=1}^{\infty}\frac{\mu_{k}}{\mu_{k-1}^{4/3}}<\infty.$

Since $\{\|Y_{4}^{k}\|_{F}/\mu_{k-1}^{1/3}\}$ is bounded, and $\mu_{k}=\rho\mu_{k-1}$ with $\rho>1$, then $\{\|Y_{4}^{k-1}\|_{F}/\mu_{k-1}^{1/3}\}$ is also bounded, which implies that $\{\|Y_{4}^{k}-Y_{4}^{k-1}\|_{F}^{2}/\mu_{k-1}^{2/3}\}$ is bounded. Then $\{\mathcal{L}_{\mu_{k}}(\mathbf{U}_{k+1},\widehat{\mathbf{V}}_{k+1},L_{k+1},S_{k+1},\mathbf{Y}^{k})\}$ is upper-bounded due to the boundedness of the sequences of all the Lagrange multiplier terms, i.e., $\{Y_{1}^{k}\}$, $\{Y_{2}^{k}\}$, $\{Y_{3}^{k}\}$ and $\{\|Y_{4}^{k}-Y_{4}^{k-1}\|_{F}^{2}/\mu_{k-1}^{2/3}\}$. Consequently,

    $\lambda\big(\|\widehat{U}_{k}\|_{*}+\|\widehat{V}_{k}\|_{*}\big)+\|\mathcal{P}_{\Omega}(S_{k})\|_{l_{1/2}}^{1/2}=\mathcal{L}_{\mu_{k-1}}(\mathbf{U}_{k},\widehat{\mathbf{V}}_{k},L_{k},S_{k},\mathbf{Y}^{k-1})-\sum_{i=1}^{4}\frac{\|Y_{i}^{k}\|_{F}^{2}-\|Y_{i}^{k-1}\|_{F}^{2}}{2\mu_{k-1}}$

is upper-bounded (note that the above equality holds due to the definition of $\mathcal{L}_{\mu}(U,V,L,S,\widehat{U},\widehat{V},\{Y_{i}\})$); thus $\{S_{k}\}$, $\{\widehat{U}_{k}\}$ and $\{\widehat{V}_{k}\}$ are all bounded. Similarly, by $U_{k}=\widehat{U}_{k}-[Y_{1}^{k}-Y_{1}^{k-1}]/\mu_{k-1}$, $V_{k}=\widehat{V}_{k}-[Y_{2}^{k}-Y_{2}^{k-1}]/\mu_{k-1}$, $L_{k}=U_{k}V_{k}^{T}-[Y_{3}^{k}-Y_{3}^{k-1}]/\mu_{k-1}$ and the boundedness of $\{\widehat{U}_{k}\}$, $\{\widehat{V}_{k}\}$, $\{Y_{i}^{k}\}_{i=1,2,3}$ and $\{Y_{4}^{k}/\mu_{k-1}^{1/3}\}$, the sequences $\{U_{k}\}$, $\{V_{k}\}$ and $\{L_{k}\}$ are also bounded. This means that each bounded sequence must have a convergent subsequence, due to the Bolzano-Weierstrass theorem.

Proof of Theorem 3:
Proof: (I) Note that $\widehat{U}_{k+1}-U_{k+1}=[Y_{1}^{k+1}-Y_{1}^{k}]/\mu_{k}$, $\widehat{V}_{k+1}-V_{k+1}=[Y_{2}^{k+1}-Y_{2}^{k}]/\mu_{k}$, $U_{k+1}V_{k+1}^{T}-L_{k+1}=[Y_{3}^{k+1}-Y_{3}^{k}]/\mu_{k}$ and $L_{k+1}+S_{k+1}-D=[Y_{4}^{k+1}-Y_{4}^{k}]/\mu_{k}$. Due to the boundedness of $\{Y_{1}^{k}\}$, $\{Y_{2}^{k}\}$, $\{Y_{3}^{k}\}$ and $\{Y_{4}^{k}/\mu_{k-1}^{1/3}\}$, the non-decreasing property of $\{\mu_{k}\}$, and $\sum_{k=0}^{\infty}\mu_{k+1}/\mu_{k}^{4/3}<\infty$, we have

    $\sum_{k=0}^{\infty}\|\widehat{U}_{k+1}-U_{k+1}\|_{F}=\sum_{k=0}^{\infty}\frac{\|Y_{1}^{k+1}-Y_{1}^{k}\|_{F}}{\mu_{k}}<\infty,\qquad \sum_{k=0}^{\infty}\|\widehat{V}_{k+1}-V_{k+1}\|_{F}=\sum_{k=0}^{\infty}\frac{\|Y_{2}^{k+1}-Y_{2}^{k}\|_{F}}{\mu_{k}}<\infty,$
    $\sum_{k=0}^{\infty}\|L_{k+1}-U_{k+1}V_{k+1}^{T}\|_{F}=\sum_{k=0}^{\infty}\frac{\|Y_{3}^{k+1}-Y_{3}^{k}\|_{F}}{\mu_{k}}<\infty,\qquad \sum_{k=0}^{\infty}\|L_{k+1}+S_{k+1}-D\|_{F}=\sum_{k=0}^{\infty}\frac{\|Y_{4}^{k+1}-Y_{4}^{k}\|_{F}}{\mu_{k}}\leq\sum_{k=0}^{\infty}\frac{C}{\mu_{k}^{2/3}}<\infty,$

which implies that

    $\lim_{k\to\infty}\|\widehat{U}_{k+1}-U_{k+1}\|_{F}=0,\quad \lim_{k\to\infty}\|\widehat{V}_{k+1}-V_{k+1}\|_{F}=0,\quad \lim_{k\to\infty}\|L_{k+1}-U_{k+1}V_{k+1}^{T}\|_{F}=0,\quad \lim_{k\to\infty}\|L_{k+1}+S_{k+1}-D\|_{F}=0.$

Hence, $\{(U_{k},V_{k},\widehat{U}_{k},\widehat{V}_{k},L_{k},S_{k})\}$ approaches a feasible solution. In the following, we prove that the sequences $\{U_{k}\}$ and $\{V_{k}\}$ are Cauchy sequences.

Using $Y_{1}^{k}=Y_{1}^{k-1}+\mu_{k-1}(\widehat{U}_{k}-U_{k})$, $Y_{2}^{k}=Y_{2}^{k-1}+\mu_{k-1}(\widehat{V}_{k}-V_{k})$ and $Y_{3}^{k}=Y_{3}^{k-1}+\mu_{k-1}(U_{k}V_{k}^{T}-L_{k})$, the first-order optimality condition of (19) with respect to $U$ gives $U_{k+1}(V_{k}^{T}V_{k}+I_{d})=\widehat{U}_{k}+\mu_{k}^{-1}Y_{1}^{k}+(L_{k}-\mu_{k}^{-1}Y_{3}^{k})V_{k}$, and hence

    $U_{k+1}-U_{k}=\Big[\frac{Y_{1}^{k}-Y_{1}^{k-1}}{\mu_{k-1}}+\frac{Y_{1}^{k}}{\mu_{k}}+\Big(\frac{Y_{3}^{k-1}-Y_{3}^{k}}{\mu_{k-1}}-\frac{Y_{3}^{k}}{\mu_{k}}\Big)V_{k}\Big]\big(V_{k}^{T}V_{k}+I_{d}\big)^{-1}.$    (44)

Similarly, the first-order optimality condition of (20) with respect to $V$ gives $V_{k+1}(U_{k+1}^{T}U_{k+1}+I_{d})=\widehat{V}_{k}+\mu_{k}^{-1}Y_{2}^{k}+(L_{k}^{T}-\mu_{k}^{-1}(Y_{3}^{k})^{T})U_{k+1}$, and hence

    $V_{k+1}-V_{k}=\Big[V_{k}(U_{k}-U_{k+1})^{T}U_{k+1}+\frac{Y_{2}^{k}-Y_{2}^{k-1}}{\mu_{k-1}}+\frac{Y_{2}^{k}}{\mu_{k}}+\Big(\frac{(Y_{3}^{k-1}-Y_{3}^{k})^{T}}{\mu_{k-1}}-\frac{(Y_{3}^{k})^{T}}{\mu_{k}}\Big)U_{k+1}\Big]\big(U_{k+1}^{T}U_{k+1}+I_{d}\big)^{-1}.$    (45)

Recall that $\mu_{k}=\rho\mu_{k-1}$. By (44), we have

    $\sum_{k=0}^{\infty}\|U_{k+1}-U_{k}\|_{F}\leq\sum_{k=0}^{\infty}\frac{\vartheta_{1}}{\mu_{k}}<\infty,$

where the constant $\vartheta_{1}$ is defined as

    $\vartheta_{1}=\max_{k}\Big\{\big[\rho\|Y_{1}^{k}-Y_{1}^{k-1}\|_{F}+\|Y_{1}^{k}\|_{F}+\big(\rho\|Y_{3}^{k}-Y_{3}^{k-1}\|_{F}+\|Y_{3}^{k}\|_{F}\big)\|V_{k}\|_{F}\big]\,\|(V_{k}^{T}V_{k}+I_{d})^{-1}\|_{F},\ k=1,2,\ldots\Big\}.$

In addition, by (45),

    $\sum_{k=0}^{\infty}\|V_{k+1}-V_{k}\|_{F}\leq\sum_{k=0}^{\infty}\Big(\vartheta_{2}\|U_{k+1}-U_{k}\|_{F}+\frac{\vartheta_{3}}{\mu_{k}}\Big)<\infty,$

where the constants $\vartheta_{2}$ and $\vartheta_{3}$ are defined as

    $\vartheta_{2}=\max_{k}\big\{\|V_{k}\|_{F}\|U_{k+1}\|_{F}\,\|(U_{k+1}^{T}U_{k+1}+I_{d})^{-1}\|_{F},\ k=1,2,\ldots\big\},$
    $\vartheta_{3}=\max_{k}\Big\{\big[\rho\|Y_{2}^{k}-Y_{2}^{k-1}\|_{F}+\|Y_{2}^{k}\|_{F}+\big(\rho\|Y_{3}^{k}-Y_{3}^{k-1}\|_{F}+\|Y_{3}^{k}\|_{F}\big)\|U_{k+1}\|_{F}\big]\,\|(U_{k+1}^{T}U_{k+1}+I_{d})^{-1}\|_{F},\ k=1,2,\ldots\Big\}.$

Consequently, both $\{U_{k}\}$ and $\{V_{k}\}$ are convergent sequences. Moreover, it is not difficult to verify that $\{U_{k}\}$ and $\{V_{k}\}$ are both Cauchy sequences. Similarly, $\{\widehat{U}_{k}\}$, $\{\widehat{V}_{k}\}$, $\{S_{k}\}$ and $\{L_{k}\}$ are also Cauchy sequences. Practically, the stopping criterion of Algorithm 1 is satisfied within a finite number of iterations.

(II) Let $(U^{\star},V^{\star},\widehat{U}^{\star},\widehat{V}^{\star},L^{\star},S^{\star})$ be a critical point of (17), and $\Phi(S)=\|\mathcal{P}_{\Omega}(S)\|_{l_{1/2}}^{1/2}$. Applying Fermat's rule as in [7], we then obtain

    $0\in\lambda\,\partial\|\widehat{U}^{\star}\|_{*}+Y_{1}^{\star}\quad\text{and}\quad 0\in\lambda\,\partial\|\widehat{V}^{\star}\|_{*}+Y_{2}^{\star},$
    $0\in\partial\Phi(S^{\star})+\mathcal{P}_{\Omega}(Y_{4}^{\star})\quad\text{and}\quad \mathcal{P}_{\Omega^{c}}(Y_{4}^{\star})=0,$
    $L^{\star}=U^{\star}(V^{\star})^{T},\quad \widehat{U}^{\star}=U^{\star},\quad \widehat{V}^{\star}=V^{\star},\quad L^{\star}+S^{\star}=D.$

Applying Fermat's rule to (27), we have

    $0\in\partial\Phi(S_{k+1})+\mathcal{P}_{\Omega}(Y_{4}^{k+1}),\qquad \mathcal{P}_{\Omega^{c}}(Y_{4}^{k+1})=0.$    (46)

In addition, the first-order optimality conditions for (23) and (24) are given by

    $0\in\lambda\,\partial\|\widehat{U}_{k+1}\|_{*}+Y_{1}^{k+1},\qquad 0\in\lambda\,\partial\|\widehat{V}_{k+1}\|_{*}+Y_{2}^{k+1}.$    (47)

Since $\{U_{k}\}$, $\{V_{k}\}$, $\{\widehat{U}_{k}\}$, $\{\widehat{V}_{k}\}$, $\{L_{k}\}$ and $\{S_{k}\}$ are Cauchy sequences, then

    $\lim_{k\to\infty}\|U_{k+1}-U_{k}\|=0,\quad \lim_{k\to\infty}\|V_{k+1}-V_{k}\|=0,\quad \lim_{k\to\infty}\|\widehat{V}_{k+1}-\widehat{V}_{k}\|=0,\quad \lim_{k\to\infty}\|L_{k+1}-L_{k}\|=0,\quad \lim_{k\to\infty}\|\widehat{U}_{k+1}-\widehat{U}_{k}\|=0,\quad \lim_{k\to\infty}\|S_{k+1}-S_{k}\|=0.$

Let $U^{\star}$, $V^{\star}$, $\widehat{U}^{\star}$, $\widehat{V}^{\star}$, $S^{\star}$ and $L^{\star}$ be their limit points, respectively. Together with the results in (I), we then have $\widehat{U}^{\star}=U^{\star}$, $\widehat{V}^{\star}=V^{\star}$, $L^{\star}=U^{\star}(V^{\star})^{T}$ and $L^{\star}+S^{\star}=D$. Using (46) and (47), the following holds:

    $0\in\lambda\,\partial\|\widehat{U}^{\star}\|_{*}+Y_{1}^{\star}\quad\text{and}\quad 0\in\lambda\,\partial\|\widehat{V}^{\star}\|_{*}+Y_{2}^{\star},$
    $0\in\partial\Phi(S^{\star})+\mathcal{P}_{\Omega}(Y_{4}^{\star})\quad\text{and}\quad \mathcal{P}_{\Omega^{c}}(Y_{4}^{\star})=0,$
    $L^{\star}=U^{\star}(V^{\star})^{T},\quad \widehat{U}^{\star}=U^{\star},\quad \widehat{V}^{\star}=V^{\star},\quad L^{\star}+S^{\star}=D.$

Therefore, any accumulation point $(U^{\star},V^{\star},\widehat{U}^{\star},\widehat{V}^{\star},L^{\star},S^{\star})$ of the sequence $\{(U_{k},V_{k},\widehat{U}_{k},\widehat{V}_{k},L_{k},S_{k})\}$ generated by Algorithm 1 satisfies the KKT conditions of problem (17). That is, the sequence asymptotically satisfies the KKT conditions of (17). In particular, whenever the sequence $\{(U_{k},V_{k},L_{k},S_{k})\}$ converges, it converges to a critical point of (15). This completes the proof.

APPENDIX F: STOPPING CRITERION

For problem (15), the KKT conditions are

    $0\in\lambda\,\partial\|U^{\star}\|_{*}+\widehat{Y}^{\star}V^{\star}\quad\text{and}\quad 0\in\lambda\,\partial\|V^{\star}\|_{*}+(\widehat{Y}^{\star})^{T}U^{\star},$
    $0\in\partial\Phi(S^{\star})+\mathcal{P}_{\Omega}(\widehat{Y}^{\star})\quad\text{and}\quad \mathcal{P}_{\Omega^{c}}(\widehat{Y}^{\star})=0,$    (48)
    $L^{\star}=U^{\star}(V^{\star})^{T},\quad \mathcal{P}_{\Omega}(L^{\star})+\mathcal{P}_{\Omega}(S^{\star})=\mathcal{P}_{\Omega}(D).$

Using (48), we have

    $-\widehat{Y}^{\star}\in\lambda\,\partial\|U^{\star}\|_{*}\,(V^{\star})^{\dagger}\quad\text{and}\quad -\widehat{Y}^{\star}\in\big[(U^{\star})^{T}\big]^{\dagger}\,\lambda\big(\partial\|V^{\star}\|_{*}\big)^{T},$    (49)

where $(V^{\star})^{\dagger}$ is the pseudo-inverse of $V^{\star}$. The two conditions hold if and only if

    $\lambda\,\partial\|U^{\star}\|_{*}\,(V^{\star})^{\dagger}\cap\big[(U^{\star})^{T}\big]^{\dagger}\,\lambda\big(\partial\|V^{\star}\|_{*}\big)^{T}\neq\emptyset.$

Recalling the equivalence relationship between (15) and (17), the KKT conditions of (17) are given by

    $0\in\lambda\,\partial\|\widehat{U}^{\star}\|_{*}+Y_{1}^{\star}\quad\text{and}\quad 0\in\lambda\,\partial\|\widehat{V}^{\star}\|_{*}+Y_{2}^{\star},$
    $0\in\partial\Phi(S^{\star})+\mathcal{P}_{\Omega}(\widehat{Y}^{\star})\quad\text{and}\quad \mathcal{P}_{\Omega^{c}}(\widehat{Y}^{\star})=0,$    (50)
    $L^{\star}=U^{\star}(V^{\star})^{T},\quad \widehat{U}^{\star}=U^{\star},\quad \widehat{V}^{\star}=V^{\star},\quad L^{\star}+S^{\star}=D.$

Hence, we use the following conditions as the stopping criteria for Algorithm 1:

    $\max\{\epsilon_{1}/\|D\|_{F},\ \epsilon_{2}\}<\epsilon,$

where $\epsilon_{1}=\max\{\|U_{k}V_{k}^{T}-L_{k}\|_{F},\ \|L_{k}+S_{k}-D\|_{F}\}$ and $\epsilon_{2}=\max\{\|\widehat{U}_{k}-U_{k}\|_{F}/\|U_{k}\|_{F},\ \|\widehat{V}_{k}-V_{k}\|_{F}/\|V_{k}\|_{F}\}$.
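A direct NumPy transcription of this stopping test (our sketch; the residual list follows the reconstruction above, and the helper name stopping_criterion is ours) could look as follows:

    import numpy as np

    def stopping_criterion(U, V, U_hat, V_hat, L, S, D, eps=1e-5):
        # Relative feasibility/consistency residuals used as the stopping test.
        eps1 = max(np.linalg.norm(U @ V.T - L), np.linalg.norm(L + S - D))
        eps2 = max(np.linalg.norm(U_hat - U) / np.linalg.norm(U),
                   np.linalg.norm(V_hat - V) / np.linalg.norm(V))
        return max(eps1 / np.linalg.norm(D), eps2) < eps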

Algorithm 3: Solving image recovery problem (51) via ADMM
Input: $\mathcal{P}_{\Omega}(D)$, the given rank $d$, and $\lambda$.
Initialize: $\mu_{0}$, $\mu_{\max}$, $\rho>1$, $k=0$, and $\epsilon$.
1: while not converged do
2:   $U_{k+1}=\big[(L_{k}+Y_{3}^{k}/\mu_{k})V_{k}+\widehat{U}_{k}-Y_{1}^{k}/\mu_{k}\big](V_{k}^{T}V_{k}+I_{d})^{-1}$
3:   $V_{k+1}=\big[(L_{k}+Y_{3}^{k}/\mu_{k})^{T}U_{k+1}+\widehat{V}_{k}-Y_{2}^{k}/\mu_{k}\big](U_{k+1}^{T}U_{k+1}+I_{d})^{-1}$
4:   $\widehat{U}_{k+1}=\mathcal{D}_{\lambda/(2\mu_{k})}(U_{k+1}+Y_{1}^{k}/\mu_{k})$, and $\widehat{V}_{k+1}=\mathcal{D}_{\lambda/(2\mu_{k})}(V_{k+1}+Y_{2}^{k}/\mu_{k})$
5:   $L_{k+1}=\mathcal{P}_{\Omega}\big(\frac{D+\mu_{k}U_{k+1}V_{k+1}^{T}-Y_{3}^{k}}{1+\mu_{k}}\big)+\mathcal{P}_{\Omega^{c}}\big(U_{k+1}V_{k+1}^{T}-Y_{3}^{k}/\mu_{k}\big)$
6:   $Y_{1}^{k+1}=Y_{1}^{k}+\mu_{k}(U_{k+1}-\widehat{U}_{k+1})$, $Y_{2}^{k+1}=Y_{2}^{k}+\mu_{k}(V_{k+1}-\widehat{V}_{k+1})$, and $Y_{3}^{k+1}=Y_{3}^{k}+\mu_{k}(L_{k+1}-U_{k+1}V_{k+1}^{T})$
7:   $\mu_{k+1}=\min(\rho\mu_{k},\mu_{\max})$
8:   $k\leftarrow k+1$
9: end while
Output: $U_{k+1}$ and $V_{k+1}$.

APPENDIX G: ALGORITHMS FOR IMAGE RECOVERY

In this part, we propose two efficient ADMM algorithms (as outlined in Algorithms 3 and 4) to solve the following D-N and F-N penalty regularized least squares problems for matrix completion:

    $\min_{U,V,L}\ \frac{\lambda}{2}\big(\|U\|_{*}+\|V\|_{*}\big)+\frac{1}{2}\|\mathcal{P}_{\Omega}(L)-\mathcal{P}_{\Omega}(D)\|_{F}^{2},\quad \mathrm{s.t.}\ L=UV^{T},$    (51)
    $\min_{U,V,L}\ \frac{\lambda}{3}\big(\|U\|_{F}^{2}+2\|V\|_{*}\big)+\frac{1}{2}\|\mathcal{P}_{\Omega}(L)-\mathcal{P}_{\Omega}(D)\|_{F}^{2},\quad \mathrm{s.t.}\ L=UV^{T}.$    (52)

Similar to (17) and (18), we also introduce the matrices $\widehat{U}$ and $\widehat{V}$ as auxiliary variables in (51) and (52), and obtain the following equivalent formulation of (51):

    $\min_{U,V,\widehat{U},\widehat{V},L}\ \frac{\lambda}{2}\big(\|\widehat{U}\|_{*}+\|\widehat{V}\|_{*}\big)+\frac{1}{2}\|\mathcal{P}_{\Omega}(L)-\mathcal{P}_{\Omega}(D)\|_{F}^{2},\quad \mathrm{s.t.}\ L=UV^{T},\ U=\widehat{U},\ V=\widehat{V}.$    (53)

The augmented Lagrangian function of (53) is

    $\mathcal{L}_{\mu}=\frac{\lambda}{2}\big(\|\widehat{U}\|_{*}+\|\widehat{V}\|_{*}\big)+\frac{1}{2}\|\mathcal{P}_{\Omega}(L)-\mathcal{P}_{\Omega}(D)\|_{F}^{2}+\langle Y_{1},U-\widehat{U}\rangle+\langle Y_{2},V-\widehat{V}\rangle+\langle Y_{3},L-UV^{T}\rangle+\frac{\mu}{2}\big(\|U-\widehat{U}\|_{F}^{2}+\|V-\widehat{V}\|_{F}^{2}+\|L-UV^{T}\|_{F}^{2}\big),$

where $Y_{i}$ ($i=1,2,3$) are the matrices of Lagrange multipliers.

Updating $U_{k+1}$ and $V_{k+1}$: By fixing the other variables at their latest values, and removing the terms that do not depend on $U$ and $V$ (and adding some proper terms that do not depend on $U$ and $V$), the optimization problems with respect to $U$ and $V$ are formulated as follows:

    $\min_{U}\ \|U-\widehat{U}_{k}+Y_{1}^{k}/\mu_{k}\|_{F}^{2}+\|L_{k}-UV_{k}^{T}+Y_{3}^{k}/\mu_{k}\|_{F}^{2},$    (54)
    $\min_{V}\ \|V-\widehat{V}_{k}+Y_{2}^{k}/\mu_{k}\|_{F}^{2}+\|L_{k}-U_{k+1}V^{T}+Y_{3}^{k}/\mu_{k}\|_{F}^{2}.$    (55)

Since both (54) and (55) are smooth convex optimization problems, their closed-form solutions are given by

    $U_{k+1}=\big[(L_{k}+Y_{3}^{k}/\mu_{k})V_{k}+\widehat{U}_{k}-Y_{1}^{k}/\mu_{k}\big]\big(V_{k}^{T}V_{k}+I_{d}\big)^{-1},$    (56)
    $V_{k+1}=\big[(L_{k}+Y_{3}^{k}/\mu_{k})^{T}U_{k+1}+\widehat{V}_{k}-Y_{2}^{k}/\mu_{k}\big]\big(U_{k+1}^{T}U_{k+1}+I_{d}\big)^{-1}.$    (57)
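In code, the closed-form updates (56) and (57) amount to two $d\times d$ linear solves; a minimal NumPy sketch (ours, with all inputs assumed given and the helper name update_U_V hypothetical) is:

    import numpy as np

    def update_U_V(L, U_hat, V_hat, V, Y1, Y2, Y3, mu):
        # Closed-form solutions (56) and (57), written as solves against (G^T G + I_d).
        I = np.eye(V.shape[1])
        P = L + Y3 / mu
        # (56): U_{k+1} = [P V + U_hat - Y1/mu] (V^T V + I)^{-1}
        U_new = np.linalg.solve(V.T @ V + I, (P @ V + U_hat - Y1 / mu).T).T
        # (57): V_{k+1} = [P^T U_{k+1} + V_hat - Y2/mu] (U^T U + I)^{-1}
        V_new = np.linalg.solve(U_new.T @ U_new + I, (P.T @ U_new + V_hat - Y2 / mu).T).T
        return U_new, V_new

Solving against the symmetric positive-definite matrix $(G^{T}G+I_{d})$ instead of forming its explicit inverse is the standard numerically stable choice here.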

Algorithm 4: Solving image recovery problem (52) via ADMM
Input: $\mathcal{P}_{\Omega}(D)$, the given rank $d$, and $\lambda$.
Initialize: $\mu_{0}$, $\mu_{\max}$, $\rho>1$, $k=0$, and $\epsilon$.
1: while not converged do
2:   $U_{k+1}=\big(\mu_{k}L_{k}+Y_{2}^{k}\big)V_{k}\big(\mu_{k}V_{k}^{T}V_{k}+\frac{2\lambda}{3}I_{d}\big)^{-1}$
3:   $V_{k+1}=\big[(L_{k}+Y_{2}^{k}/\mu_{k})^{T}U_{k+1}+\widehat{V}_{k}-Y_{1}^{k}/\mu_{k}\big](U_{k+1}^{T}U_{k+1}+I_{d})^{-1}$
4:   $\widehat{V}_{k+1}=\mathcal{D}_{2\lambda/(3\mu_{k})}(V_{k+1}+Y_{1}^{k}/\mu_{k})$
5:   $L_{k+1}=\mathcal{P}_{\Omega}\big(\frac{D+\mu_{k}U_{k+1}V_{k+1}^{T}-Y_{2}^{k}}{1+\mu_{k}}\big)+\mathcal{P}_{\Omega^{c}}\big(U_{k+1}V_{k+1}^{T}-Y_{2}^{k}/\mu_{k}\big)$
6:   $Y_{1}^{k+1}=Y_{1}^{k}+\mu_{k}(V_{k+1}-\widehat{V}_{k+1})$ and $Y_{2}^{k+1}=Y_{2}^{k}+\mu_{k}(L_{k+1}-U_{k+1}V_{k+1}^{T})$
7:   $\mu_{k+1}=\min(\rho\mu_{k},\mu_{\max})$
8:   $k\leftarrow k+1$
9: end while
Output: $U_{k+1}$ and $V_{k+1}$.

Updating $\widehat{U}_{k+1}$ and $\widehat{V}_{k+1}$: By keeping all other variables fixed, $\widehat{U}_{k+1}$ is updated by solving the following problem:

    $\min_{\widehat{U}}\ \frac{\lambda}{2}\|\widehat{U}\|_{*}+\frac{\mu_{k}}{2}\|U_{k+1}-\widehat{U}+Y_{1}^{k}/\mu_{k}\|_{F}^{2}.$    (58)

To solve (58), the SVT operator [2] is applied as follows:

    $\widehat{U}_{k+1}=\mathcal{D}_{\lambda/(2\mu_{k})}\big(U_{k+1}+Y_{1}^{k}/\mu_{k}\big).$    (59)

Similarly, $\widehat{V}_{k+1}$ is given by

    $\widehat{V}_{k+1}=\mathcal{D}_{\lambda/(2\mu_{k})}\big(V_{k+1}+Y_{2}^{k}/\mu_{k}\big).$    (60)

Updating $L_{k+1}$: By fixing all other variables, the optimal $L_{k+1}$ is the solution of the following problem:

    $\min_{L}\ \frac{1}{2}\|\mathcal{P}_{\Omega}(L)-\mathcal{P}_{\Omega}(D)\|_{F}^{2}+\frac{\mu_{k}}{2}\Big\|L-U_{k+1}V_{k+1}^{T}+\frac{Y_{3}^{k}}{\mu_{k}}\Big\|_{F}^{2}.$    (61)

Since (61) is a smooth convex optimization problem, it is easy to show that its optimal solution is

    $L_{k+1}=\mathcal{P}_{\Omega}\Big(\frac{D+\mu_{k}U_{k+1}V_{k+1}^{T}-Y_{3}^{k}}{1+\mu_{k}}\Big)+\mathcal{P}_{\Omega^{c}}\big(U_{k+1}V_{k+1}^{T}-Y_{3}^{k}/\mu_{k}\big),$    (62)

where $\mathcal{P}_{\Omega^{c}}$ is the complementary operator of $\mathcal{P}_{\Omega}$, i.e., $\mathcal{P}_{\Omega^{c}}(D)_{ij}=0$ if $(i,j)\in\Omega$, and $\mathcal{P}_{\Omega^{c}}(D)_{ij}=D_{ij}$ otherwise.

Based on the description above, we develop an efficient ADMM algorithm for solving the double nuclear norm minimization problem (51), as outlined in Algorithm 3. Similarly, we also present an efficient ADMM algorithm to solve (52), as outlined in Algorithm 4.
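The $L$-update (62) mixes a shrunken average on the observed entries with the plain low-rank reconstruction elsewhere. A small NumPy sketch (our illustration; the boolean array mask is assumed to indicate the observed set $\Omega$) is:

    import numpy as np

    def update_L(D, UVt, Y, mu, mask):
        # Closed-form solution (62); mask selects the observed set Omega.
        L_obs = (D + mu * UVt - Y) / (1.0 + mu)   # P_Omega part
        L_unobs = UVt - Y / mu                    # P_Omega^c part
        return np.where(mask, L_obs, L_unobs)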

APPENDIX H: MORE EXPERIMENTAL RESULTS

In this paper, we compared both of our methods with several state-of-the-art methods, such as LMaFit [8], RegL1 [9], Unifying [10], factEN [11], RPCA [6], PSVT [12], WNNM [13], and LpSq [14]. The Matlab code of the proposed methods is available online.

Convergence Behavior: Fig. 1 illustrates the evolution of the relative squared error (RSE), i.e., $\|\widehat{L}-L\|_{F}/\|L\|_{F}$, and of the stopping criterion over the iterations on corrupted matrices of size 1,000 x 1,000 with outlier ratio 5%. From the results, it is clear that both the stopping tolerance and the RSE values of our two methods decrease quickly, and they converge within only a small number of iterations, usually within 50 iterations.

Fig. 1. Convergence behavior of our S+L$_{1/2}$ and S+L$_{2/3}$ methods for the cases of matrix rank 5, 10 and 20: (a) stopping criterion, S+L$_{1/2}$ (left) and S+L$_{2/3}$ (right); (b) RSE, S+L$_{1/2}$ (left) and S+L$_{2/3}$ (right).

Robustness: Like the other non-convex methods such as PSVT and Unifying, the most important parameter of our methods is the rank parameter $d$. To verify the robustness of our methods with respect to $d$, we report the RSE results of PSVT, Unifying and our methods on corrupted matrices with outlier ratio 10% in Fig. 2(a), in which we also present the results of the baseline method, LpSq [14]. It is clear that both our methods perform much more robustly than PSVT and Unifying, and consistently yield much better solutions than the other methods in all settings. To verify the robustness of both our methods with respect to another important parameter (i.e., the regularization parameter $\lambda$), we also report the RSE results of our methods on corrupted matrices with outlier ratio 10% in Fig. 2(b). Note that the rank parameter of both our methods is computed by our rank estimation procedure. From the results, one can see that both our methods demonstrate very robust performance over a wide range of the regularization parameter.

Fig. 2. Comparison of the RSE results of LpSq [14], PSVT [12], Unifying [10] and our methods on corrupted matrices with different rank parameters (a) and regularization parameters (b).

Text Removal: We report the text removal results of our methods with the rank parameter varying from 10 to 40, as shown in Fig. 3, where the rank of the original image is 10. We also present the results of the baseline method, LpSq [14]. The results show that our methods significantly outperform the other methods in terms of RSE and F-measure, and they perform much more robustly than Unifying with respect to the rank parameter.

Moving Object Detection: We present the detailed descriptions of five surveillance video sequences, the Bootstrap, Hall, Lobby, Mall and WaterSurface data sets, in Table 1. Moreover, Fig. 4 illustrates the foreground and background separation results on the Hall, Mall, Lobby and WaterSurface data sets.

TABLE 1: Information of the surveillance video sequences used in our experiments.

    Dataset       Size (frames)        Description
    Bootstrap     [120, 160] (1000)    Crowded scene
    Hall          [144, 176] (1000)    Crowded scene
    Lobby         [128, 160] (1000)    Dynamic foreground
    Mall          [256, 320] (600)     Crowded scene
    WaterSurface  [128, 160] (500)     Dynamic background

Fig. 3. The text removal results (including RSE and F-measure) of LpSq [14], Unifying [10] and our methods with different rank parameters.

Image Alignment: We also conducted some image alignment experiments on the Dummy images used in [15]. Figs. 5(b) and 5(c) show the results of the alignment, low-rank and sparse estimations by both our methods on the input images, some of which are shown in Fig. 5(a). We can observe that both our methods are able to align the facial images nicely despite illumination variations and occlusions, and correctly detect and remove them.

Image Inpainting: In this part, we first report the average PSNR results of the two proposed methods (i.e., D-N and F-N) with different ratios of randomly missing pixels, from 95% down to 80%, as shown in Fig. 6. Since the methods in [16], [17] are very slow, we only present the average inpainting results of APG [18] and WNNM [13], both of which use the fast SVD strategy and need to compute only a partial SVD instead of the full one; thus, they are usually much faster than the methods in [16], [17]. Considering that only a small fraction of pixels is randomly selected, we conducted 50 independent runs and report the average PSNR and standard deviation (std). The results show that both our methods consistently outperform APG [18] and WNNM [13] in all the settings. This experiment actually shows that both our methods have an even greater advantage over existing methods in cases where the number of observed pixels is very limited, e.g., 5% observed pixels.

As suggested in [17], we set $d=9$ for our two methods and for TNNR [17]. To evaluate the robustness of our two methods with respect to their rank parameter, we report the average PSNR and standard deviation of the two proposed methods with the rank parameter $d$ varying from 7 to 15, as shown in Fig. 7. Moreover, we also present the average inpainting results of TNNR [17] over 50 independent runs. It is clear that the two proposed methods perform much more robustly than TNNR with respect to the rank parameter.

REFERENCES

[1] L. Mirsky, "A trace inequality of John von Neumann," Monatsh. Math., vol. 79, pp. 303-306, 1975.
[2] J. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM J. Optim., vol. 20, no. 4, pp. 1956-1982, 2010.
[3] I. Daubechies, M. Defrise, and C. De Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Commun. Pur. Appl. Math., vol. 57, no. 11, pp. 1413-1457, 2004.
[4] D. L. Donoho, "De-noising by soft-thresholding," IEEE Trans. Inform. Theory, vol. 41, no. 3, pp. 613-627, May 1995.

Fig. 4. Background and foreground separation results of different algorithms on the Hall, Mall, Lobby and WaterSurface data sets. One frame with missing data from each sequence (top) and its manual segmentation (bottom) are shown in (a). The results of the different algorithms are presented from (b) to (f), respectively: (b) RegL1 [9]; (c) Unifying [10]; (d) factEN [11]; (e) S+L$_{1/2}$; (f) S+L$_{2/3}$. The top panel is the recovered background and the bottom panel is the segmentation.

[5] E. T. Hale, W. Yin, and Y. Zhang, "Fixed-point continuation for l1-minimization: Methodology and convergence," SIAM J. Optim., vol. 19, no. 3, pp. 1107-1130, 2008.
[6] Z. Lin, M. Chen, and L. Wu, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Univ. Illinois, Urbana-Champaign, Tech. Rep., Mar. 2009.
[7] F. Wang, W. Cao, and Z. Xu, "Convergence of multi-block Bregman ADMM for nonconvex composite problems," arXiv preprint, 2015.
[8] Y. Shen, Z. Wen, and Y. Zhang, "Augmented Lagrangian alternating direction method for matrix separation based on low-rank factorization," Optim. Method Softw., vol. 29, no. 2, pp. 239-263, 2014.
[9] Y. Zheng, G. Liu, S. Sugimoto, S. Yan, and M. Okutomi, "Practical low-rank matrix approximation under robust L1-norm," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2012.

Fig. 5. Aligning the Dummy images used in [15]. (a) Ten out of the 100 original occluded images. (b) and (c): From top to bottom, the aligned, low-rank and sparse results of our S+L$_{1/2}$ and S+L$_{2/3}$ methods, respectively.

[10] R. Cabral, F. De la Torre, J. Costeira, and A. Bernardino, "Unifying nuclear norm and bilinear factorization approaches for low-rank matrix decomposition," in Proc. IEEE Int. Conf. Comput. Vis., 2013.
[11] E. Kim, M. Lee, and S. Oh, "Elastic-net regularization of singular values for robust subspace learning," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015.
[12] T. Oh, Y. Tai, J. Bazin, H. Kim, and I. Kweon, "Partial sum minimization of singular values in robust PCA: Algorithm and applications," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 4, pp. 744-758, Apr. 2016.
[13] S. Gu, Q. Xie, D. Meng, W. Zuo, X. Feng, and L. Zhang, "Weighted nuclear norm minimization and its applications to low level vision," Int. J. Comput. Vis., vol. 121, no. 2, pp. 183-208, 2017.
[14] F. Nie, H. Wang, X. Cai, H. Huang, and C. Ding, "Robust matrix completion via joint Schatten p-norm and Lp-norm minimization," in Proc. IEEE Int. Conf. Data Mining, 2012.
[15] Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, "RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 11, pp. 2233-2246, Nov. 2012.
[16] C. Lu, J. Tang, S. Yan, and Z. Lin, "Generalized nonconvex nonsmooth low-rank minimization," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2014.
[17] Y. Hu, D. Zhang, J. Ye, X. Li, and X. He, "Fast and accurate matrix completion via truncated nuclear norm regularization," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 9, pp. 2117-2130, Sep. 2013.
[18] K.-C. Toh and S. Yun, "An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems," Pac. J. Optim., vol. 6, pp. 615-640, 2010.

Fig. 6. The average PSNR and standard deviation of APG [18], WNNM [13] and both our methods for image inpainting vs. the fraction of observed pixels, on Images 1 to 8 (best viewed in colors).

Fig. 7. The average PSNR and standard deviation of TNNR [17] and both our methods for image inpainting vs. the rank parameter, on Images 1 to 8 (best viewed in colors).


More information

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong 2014 Workshop

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Low-rank matrix recovery via nonconvex optimization Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

Low Rank Matrix Completion Formulation and Algorithm

Low Rank Matrix Completion Formulation and Algorithm 1 2 Low Rank Matrix Completion and Algorithm Jian Zhang Department of Computer Science, ETH Zurich zhangjianthu@gmail.com March 25, 2014 Movie Rating 1 2 Critic A 5 5 Critic B 6 5 Jian 9 8 Kind Guy B 9

More information

arxiv: v2 [cs.na] 24 Mar 2015

arxiv: v2 [cs.na] 24 Mar 2015 Volume X, No. 0X, 200X, X XX doi:1934/xx.xx.xx.xx PARALLEL MATRIX FACTORIZATION FOR LOW-RANK TENSOR COMPLETION arxiv:1312.1254v2 [cs.na] 24 Mar 2015 Yangyang Xu Department of Computational and Applied

More information

Variational Image Restoration

Variational Image Restoration Variational Image Restoration Yuling Jiao yljiaostatistics@znufe.edu.cn School of and Statistics and Mathematics ZNUFE Dec 30, 2014 Outline 1 1 Classical Variational Restoration Models and Algorithms 1.1

More information

Multiple Similarities Based Kernel Subspace Learning for Image Classification

Multiple Similarities Based Kernel Subspace Learning for Image Classification Multiple Similarities Based Kernel Subspace Learning for Image Classification Wang Yan, Qingshan Liu, Hanqing Lu, and Songde Ma National Laboratory of Pattern Recognition, Institute of Automation, Chinese

More information

ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis

ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis Lecture 7: Matrix completion Yuejie Chi The Ohio State University Page 1 Reference Guaranteed Minimum-Rank Solutions of Linear

More information

Multisets mixture learning-based ellipse detection

Multisets mixture learning-based ellipse detection Pattern Recognition 39 (6) 731 735 Rapid and brief communication Multisets mixture learning-based ellipse detection Zhi-Yong Liu a,b, Hong Qiao a, Lei Xu b, www.elsevier.com/locate/patcog a Key Lab of

More information

Block Coordinate Descent for Regularized Multi-convex Optimization

Block Coordinate Descent for Regularized Multi-convex Optimization Block Coordinate Descent for Regularized Multi-convex Optimization Yangyang Xu and Wotao Yin CAAM Department, Rice University February 15, 2013 Multi-convex optimization Model definition Applications Outline

More information

Robust Tensor Factorization Using R 1 Norm

Robust Tensor Factorization Using R 1 Norm Robust Tensor Factorization Using R Norm Heng Huang Computer Science and Engineering University of Texas at Arlington heng@uta.edu Chris Ding Computer Science and Engineering University of Texas at Arlington

More information

Fantope Regularization in Metric Learning

Fantope Regularization in Metric Learning Fantope Regularization in Metric Learning CVPR 2014 Marc T. Law (LIP6, UPMC), Nicolas Thome (LIP6 - UPMC Sorbonne Universités), Matthieu Cord (LIP6 - UPMC Sorbonne Universités), Paris, France Introduction

More information

An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems

An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems Kim-Chuan Toh Sangwoon Yun March 27, 2009; Revised, Nov 11, 2009 Abstract The affine rank minimization

More information

Nonnegative Matrix Factorization Clustering on Multiple Manifolds

Nonnegative Matrix Factorization Clustering on Multiple Manifolds Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Nonnegative Matrix Factorization Clustering on Multiple Manifolds Bin Shen, Luo Si Department of Computer Science,

More information

An Effective Tensor Completion Method Based on Multi-linear Tensor Ring Decomposition

An Effective Tensor Completion Method Based on Multi-linear Tensor Ring Decomposition An Effective Tensor Completion Method Based on Multi-linear Tensor Ring Decomposition Jinshi Yu, Guoxu Zhou, Qibin Zhao and Kan Xie School of Automation, Guangdong University of Technology, Guangzhou,

More information

Reweighted Nuclear Norm Minimization with Application to System Identification

Reweighted Nuclear Norm Minimization with Application to System Identification Reweighted Nuclear Norm Minimization with Application to System Identification Karthi Mohan and Maryam Fazel Abstract The matrix ran minimization problem consists of finding a matrix of minimum ran that

More information