Penalty Decomposition Methods for Rank and $\ell_0$-Norm Minimization


1 IPAM, October 12, 2010 p. 1/54
Penalty Decomposition Methods for Rank and $\ell_0$-Norm Minimization
Zhaosong Lu, Simon Fraser University
Joint work with Yong Zhang (Simon Fraser)

2 IPAM, October 12, 2010 p. 2/54
Outline of Talk
- Rank and $\ell_0$-norm minimization problems
- Technical preliminaries
- PD methods for rank minimization
- PD methods for $\ell_0$-norm minimization
- Numerical results

3 IPAM, October 12, 2010 p. 3/54
Rank minimization
Rank minimization:
$\min_X \{ f(X) : \mathrm{rank}(X) \le r,\ X \in \mathcal{X} \cap \Omega \}$,
$\min_X \{ f(X) + \nu\,\mathrm{rank}(X) : X \in \mathcal{X} \cap \Omega \}$
for some $r, \nu \ge 0$, where $\mathcal{X}$ is a closed convex set, $\Omega$ is a closed unitarily invariant set in $\mathbb{R}^{m \times n}$, and $f : \mathbb{R}^{m \times n} \to \mathbb{R}$ is a continuously differentiable function.
Applications: combinatorial optimization; nonconvex QP; image recovery; nearest low-rank correlation matrix; etc.

4 IPAM, October 12, 2010 p. 4/54
$\ell_0$-norm minimization
$\ell_0$-norm minimization:
$\min_x \{ f(x) : \|x_J\|_0 \le r,\ x \in \mathcal{X} \}$,
$\min_x \{ f(x) + \nu \|x_J\|_0 : x \in \mathcal{X} \}$
for some integer $r \ge 0$ and $\nu \ge 0$ controlling the sparsity of the solution, where $\mathcal{X}$ is a closed convex set in $\mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function, and $\|x_J\|_0$ denotes the cardinality of the subvector formed by the entries of $x$ indexed by $J$.
Applications: compressed sensing; sparse logistic regression; sparse inverse covariance selection; etc.

5 IPAM, October 12, 2010 p. 5/54
Technical preliminaries
Proposition: Let $\|\cdot\|$ be a unitarily invariant norm on $\mathbb{R}^{m \times n}$, and let $F : \mathbb{R}^{m \times n} \to \mathbb{R}$ be a unitarily invariant function. Suppose that $\mathcal{X} \subseteq \mathbb{R}^{m \times n}$ is a unitarily invariant set. Let $A \in \mathbb{R}^{m \times n}$ be given, $q = \min(m,n)$, and let $\phi$ be a non-decreasing function on $[0,\infty)$. Suppose that $U \Sigma(A) V^T$ is the singular value decomposition of $A$. Then $X^* = U D(x^*) V^T$ is an optimal solution of the problem
$\min\ F(X) + \phi(\|X - A\|)$ s.t. $X \in \mathcal{X}$,
where $x^* \in \mathbb{R}^q$ is an optimal solution of the problem
$\min\ F(D(x)) + \phi(\|D(x) - \Sigma(A)\|)$ s.t. $D(x) \in \mathcal{X}$.

6 IPAM, October 12, 2010 p. 6/54
Technical preliminaries (cont'd)
Corollary 1: Let $\nu \ge 0$ and $A \in \mathbb{R}^{m \times n}$ be given, and let $q = \min(m,n)$. Suppose that $\mathcal{X} \subseteq \mathbb{R}^{m \times n}$ is a unitarily invariant set, and $U \Sigma(A) V^T$ is the singular value decomposition of $A$. Then $X^* = U D(x^*) V^T$ is an optimal solution of the problem
$\min\{ \nu\,\mathrm{rank}(X) + \tfrac12 \|X - A\|_F^2 : X \in \mathcal{X} \}$,
where $x^* \in \mathbb{R}^q$ is an optimal solution of the problem
$\min\{ \nu \|x\|_0 + \tfrac12 \|x - \sigma(A)\|_2^2 : D(x) \in \mathcal{X} \}$.
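In the unconstrained case $\mathcal{X} = \mathbb{R}^{m \times n}$, Corollary 1 amounts to hard-thresholding the singular values of $A$ at $\sqrt{2\nu}$. A minimal numerical sketch of that special case (an illustration added here, not code from the talk):

```python
import numpy as np

def rank_prox(A, nu):
    """Corollary 1 with X = R^{m x n}: minimize nu*rank(X) + 0.5*||X - A||_F^2
    by hard-thresholding the singular values of A at sqrt(2*nu)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_kept = np.where(0.5 * s**2 > nu, s, 0.0)   # keep sigma_i only if it beats the rank penalty
    return (U * s_kept) @ Vt

A = np.random.randn(8, 5)
print(np.linalg.matrix_rank(rank_prox(A, nu=0.5)))
```

Closed-form subproblem solvers of this kind are what the PD methods below rely on for their $Y$-updates.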

7 IPAM, October 12, 2010 p. 7/54
Technical preliminaries (cont'd)
Corollary 2: Let $r \ge 0$ and $A \in \mathbb{R}^{m \times n}$ be given, and let $q = \min(m,n)$. Suppose that $\mathcal{X} \subseteq \mathbb{R}^{m \times n}$ is a unitarily invariant set, and $U \Sigma(A) V^T$ is the singular value decomposition of $A$. Then $X^* = U D(x^*) V^T$ is an optimal solution of the problem
$\min\{ \|X - A\|_F : \mathrm{rank}(X) \le r,\ X \in \mathcal{X} \}$,
where $x^* \in \mathbb{R}^q$ is an optimal solution of the problem
$\min\{ \|x - \sigma(A)\|_2 : \|x\|_0 \le r,\ D(x) \in \mathcal{X} \}$.

8 IPAM, October 12, 2010 p. 8/54
Technical preliminaries (cont'd)
Corollary 3: Let $\nu \ge 0$ and $A \in \mathbb{R}^{m \times n}$ be given, and let $q = \min(m,n)$. Suppose that $U \Sigma(A) V^T$ is the singular value decomposition of $A$. Then $X^* = U D(x^*) V^T$ is an optimal solution of the problem
$\min_X\ \nu \|X\|_* + \tfrac12 \|X - A\|_F^2$,
where $x^* \in \mathbb{R}^q$ is an optimal solution of the problem
$\min_x\ \nu \|x\|_1 + \tfrac12 \|x - \sigma(A)\|_2^2$.

9 IPAM, October 12, 2010 p. 9/54
Technical preliminaries (cont'd)
Corollary 4: Let $r \ge 0$ and $A \in \mathbb{R}^{m \times n}$ be given, and let $q = \min(m,n)$. Suppose that $U \Sigma(A) V^T$ is the singular value decomposition of $A$. Then $X^* = U D(x^*) V^T$ is an optimal solution of the problem
$\min\{ \|X - A\|_F : \|X\|_* \le r \}$,
where $x^* \in \mathbb{R}^q$ is an optimal solution of the problem
$\min\{ \|x - \sigma(A)\|_2 : \|x\|_1 \le r \}$.

10 IPAM, October 12, 2010 p. 10/54
Technical preliminaries (cont'd)
Proposition: Let $\mathcal{X}_i \subseteq \mathbb{R}$ and $\phi_i : \mathbb{R} \to \mathbb{R}$ for $i = 1,\dots,n$ be given. Suppose that $r$ is a positive integer and $0 \in \mathcal{X}_i$ for all $i$. Consider the following $\ell_0$-norm minimization problem:
$\min\big\{ \phi(x) := \sum_{i=1}^n \phi_i(x_i) : \|x\|_0 \le r,\ x \in \mathcal{X}_1 \times \cdots \times \mathcal{X}_n \big\}$.  (1)
Let $\tilde{x}_i \in \mathrm{Argmin}\{ \phi_i(x_i) : x_i \in \mathcal{X}_i \}$ and let $I^* \subseteq \{1,\dots,n\}$ be the index set corresponding to the $r$ largest values of $\{v_i^*\}_{i=1}^n$, where $v_i^* = \phi_i(0) - \phi_i(\tilde{x}_i)$ for $i = 1,\dots,n$. Then $x^*$ is an optimal solution of problem (1), where $x^*$ is defined as follows:
$x_i^* = \tilde{x}_i$ if $i \in I^*$; $x_i^* = 0$ otherwise, for $i = 1,\dots,n$.
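A short code sketch of this proposition (illustration only; the quadratic $\phi_i$ in the demo is just one admissible choice, with $0 \in \mathcal{X}_i = \mathbb{R}$):

```python
import numpy as np

def l0_constrained_separable(phi, x_tilde, r):
    """Slide-10 proposition: phi[i] is the i-th separable term (a callable), x_tilde[i] a
    minimizer of phi[i] over X_i; keep the r components with the largest savings
    v_i = phi_i(0) - phi_i(x_tilde_i) and set the rest to zero."""
    v = np.array([phi[i](0.0) - phi[i](x_tilde[i]) for i in range(len(x_tilde))])
    keep = np.argsort(-v)[:r]              # indices of the r largest savings
    x_star = np.zeros(len(x_tilde))
    x_star[keep] = np.asarray(x_tilde)[keep]
    return x_star

# demo with phi_i(t) = 0.5*(t - a_i)^2 and X_i = R, so x_tilde_i = a_i
a = np.array([3.0, -0.2, 1.5, 0.1, -2.5])
phi = [lambda t, ai=ai: 0.5 * (t - ai) ** 2 for ai in a]
print(l0_constrained_separable(phi, a, r=2))   # keeps the two entries of largest magnitude
```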

11 IPAM, October 12, 2010 p. 11/54
Technical preliminaries (cont'd)
Proposition: Let $\mathcal{X}_i \subseteq \mathbb{R}$ and $\phi_i : \mathbb{R} \to \mathbb{R}$ for $i = 1,\dots,n$ be given. Suppose that $\nu \ge 0$ and $0 \in \mathcal{X}_i$ for all $i$. Consider the following $\ell_0$-norm minimization problem:
$\min\big\{ \nu \|x\|_0 + \sum_{i=1}^n \phi_i(x_i) : x \in \mathcal{X}_1 \times \cdots \times \mathcal{X}_n \big\}$.  (2)
Let $\tilde{x}_i \in \mathrm{Argmin}\{ \phi_i(x_i) : x_i \in \mathcal{X}_i \}$ and $v_i^* = \phi_i(0) - \nu - \phi_i(\tilde{x}_i)$ for $i = 1,\dots,n$. Then $x^*$ is an optimal solution of problem (2), where $x^*$ is defined as follows:
$x_i^* = \tilde{x}_i$ if $v_i^* \ge 0$; $x_i^* = 0$ otherwise, for $i = 1,\dots,n$.
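The penalized counterpart reduces to the same componentwise test; a sketch under the same illustrative quadratic $\phi_i$:

```python
import numpy as np

def l0_penalized_separable(phi, x_tilde, nu):
    """Slide-11 proposition: keep x_tilde_i whenever it pays for the l0 penalty,
    i.e. whenever v_i = phi_i(0) - nu - phi_i(x_tilde_i) >= 0."""
    v = np.array([phi[i](0.0) - nu - phi[i](x_tilde[i]) for i in range(len(x_tilde))])
    return np.where(v >= 0, np.asarray(x_tilde, dtype=float), 0.0)

a = np.array([3.0, -0.2, 1.5, 0.1, -2.5])
phi = [lambda t, ai=ai: 0.5 * (t - ai) ** 2 for ai in a]
print(l0_penalized_separable(phi, a, nu=1.0))  # here this is hard thresholding at |a_i| >= sqrt(2*nu)
```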

12 IPAM, October 12, 2010 p. 12/54
PD methods for rank minimization
Consider
$\min_X \{ f(X) : \mathrm{rank}(X) \le r,\ X \in \mathcal{X} \cap \Omega \}$,  (3)
$\min_X \{ f(X) + \nu\,\mathrm{rank}(X) : X \in \mathcal{X} \cap \Omega \}$  (4)
for some $r, \nu \ge 0$, where $\mathcal{X}$ is a closed convex set, $\Omega$ is a closed unitarily invariant set in $\mathbb{R}^{m \times n}$, and $f : \mathbb{R}^{m \times n} \to \mathbb{R}$ is a continuously differentiable function.
Assumption: Problems (3) and (4) are feasible, and moreover, at least one feasible solution, denoted by $X^{\mathrm{feas}}$, is known.

13 IPAM, October 12, 2010 p. 13/54
PD methods for rank minimization
$\min_X \{ f(X) : \mathrm{rank}(X) \le r,\ X \in \mathcal{X} \cap \Omega \}$
$\Longleftrightarrow\ \min_{X,Y} \{ f(X) : X - Y = 0,\ X \in \mathcal{X},\ Y \in \mathcal{Y} \}$,  (5)
where $\mathcal{Y} := \{ Y \in \Omega : \mathrm{rank}(Y) \le r \}$.
Given $\varrho > 0$, define:
$Q_\varrho(X,Y) := f(X) + \tfrac{\varrho}{2} \|X - Y\|_F^2$,
$\tilde{Q}_\varrho(X,U,V) := Q_\varrho(X, UV)$ for all $X \in \mathbb{R}^{m \times n}$, $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{r \times n}$.

14 IPAM, October 12, 2010 p. 14/54
PD method for (5) (asymmetric matrices): Let $\{\epsilon_k\}$ be a positive decreasing sequence. Let $\varrho_0 > 0$, $\sigma > 1$ be given. Choose an arbitrary $Y_0^0 \in \mathcal{Y}$ and a constant $\Upsilon \ge \max\{ f(X^{\mathrm{feas}}),\ \min_{X \in \mathcal{X}} Q_{\varrho_0}(X, Y_0^0) \}$. Set $k = 0$.
1) Set $l = 0$ and apply the BCD method to find an approximate solution $(X^k, Y^k) \in \mathcal{X} \times \mathcal{Y}$ for the penalty subproblem
$\min\{ Q_{\varrho_k}(X,Y) : X \in \mathcal{X},\ Y \in \mathcal{Y} \}$  (6)
by performing steps 1a)-1d):
1a) Solve $X_{l+1}^k \in \mathrm{Arg}\min_{X \in \mathcal{X}} Q_{\varrho_k}(X, Y_l^k)$.
1b) Solve $Y_{l+1}^k \in \mathrm{Arg}\min_{Y \in \mathcal{Y}} Q_{\varrho_k}(X_{l+1}^k, Y)$.
1c) Set $(X^k, Y^k) := (X_{l+1}^k, Y_{l+1}^k)$. If $(X^k, Y^k)$ satisfies
$\mathrm{dist}\big( -\nabla_X Q_{\varrho_k}(X^k, Y^k),\ N_{\mathcal{X}}(X^k) \big) \le \epsilon_k$,  (7)
$\| \nabla_U \tilde{Q}_{\varrho_k}(X^k, U^k, V^k) + Z_Y^k (V^k)^T \|_F \le \epsilon_k$,  (8)
$\| \nabla_V \tilde{Q}_{\varrho_k}(X^k, U^k, V^k) + (U^k)^T Z_Y^k \|_F \le \epsilon_k$  (9)

15 IPAM, October 12, 2010 p. 15/54
for some $Z_Y^k \in N_\Omega(Y^k)$, $U^k \in \mathbb{R}^{m \times r}$, $V^k \in \mathbb{R}^{r \times n}$ such that
$(U^k)^T U^k = I$, $Y^k = U^k V^k$,  (10)
then go to step 2).
1d) Set $l \leftarrow l + 1$ and go to step 1a).
2) Set $\varrho_{k+1} := \sigma \varrho_k$.
3) If $\min_{X \in \mathcal{X}} Q_{\varrho_{k+1}}(X, Y^k) > \Upsilon$, set $Y_0^{k+1} := X^{\mathrm{feas}}$. Otherwise, set $Y_0^{k+1} := Y^k$.
4) Set $k \leftarrow k + 1$ and go to step 1).
end
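To make the scheme concrete, the following sketch instantiates steps 1a)-1b) and 2) for a matrix-completion-type problem, with the illustrative choices $f(X) = \tfrac12\|P_\Theta(X - M)\|_F^2$ (a least-squares fit to the observed entries rather than the talk's hard equality constraints), $\mathcal{X} = \Omega = \mathbb{R}^{m \times n}$, and $\mathcal{Y} = \{Y : \mathrm{rank}(Y) \le r\}$; fixed iteration counts stand in for the $\epsilon_k$-based tests, so this is not the talk's exact implementation:

```python
import numpy as np

def pd_rank_constrained(M, mask, r, rho=0.1, sigma=10.0, outer=15, inner=30):
    """Sketch of steps 1a)-1b) and 2) for f(X) = 0.5*||P_Theta(X - M)||_F^2,
    X = Omega = R^{m x n}, Y = {Y : rank(Y) <= r}; mask is the 0/1 indicator of Theta."""
    Y = mask * M                                 # start from the observed entries, zeros elsewhere
    for _ in range(outer):
        for _ in range(inner):
            # 1a) X-step: entrywise minimizer of 0.5*mask*(X - M)^2 + (rho/2)*(X - Y)^2
            X = (mask * M + rho * Y) / (mask + rho)
            # 1b) Y-step: project X onto {rank <= r} by a truncated SVD
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            Y = (U[:, :r] * s[:r]) @ Vt[:r]
        rho *= sigma                             # 2) increase the penalty parameter
    return Y

m = n = 40
M = np.random.randn(m, 5) @ np.random.randn(5, n)        # rank-5 ground truth
mask = (np.random.rand(m, n) < 0.5).astype(float)        # roughly 50% of the entries observed
X_rec = pd_rank_constrained(M, mask, r=5)
print(np.linalg.norm(X_rec - M) / np.linalg.norm(M))     # relative recovery error
```

Both block updates have closed forms here: the $X$-step is an entrywise weighted average, and the $Y$-step is a truncated SVD, as in Corollary 2 with $\mathcal{X} = \mathbb{R}^{m \times n}$.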

16 IPAM, October 12, 2010 p. 16/54
Theorem (outer iterations): Assume that $\epsilon_k \to 0$. Let $\{(X^k, Y^k)\}$ be generated by the above PD method, and let $\{(U^k, V^k, Z_Y^k)\}$ be the associated sequence satisfying (7)-(10). Suppose that the level set $\mathcal{X}_\Upsilon := \{ X \in \mathcal{X} : f(X) \le \Upsilon \}$ is compact. Then:
(a) $\{(X^k, Y^k, U^k, V^k)\}$ is bounded;
(b) Suppose that $\{(X^k, Y^k, U^k, V^k)\}_{k \in K}$ converges to $(X^*, Y^*, U^*, V^*)$. Then $(X^*, Y^*)$ is a feasible point of problem (5). Moreover, if
$\big\{ \big( d_X - d_U V^* - U^* d_V,\ d_U V^* + U^* d_V - d_Y \big) : d_X \in T_{\mathcal{X}}(X^*),\ d_U \in \mathbb{R}^{m \times r},\ d_V \in \mathbb{R}^{r \times n},\ d_Y \in T_\Omega(X^*) \big\} = \mathbb{R}^{m \times n} \times \mathbb{R}^{m \times n}$
holds, then $\{(Z_X^k, Z_Y^k)\}_{k \in K}$ is bounded, where $Z_X^k := \varrho_k (X^k - Y^k)$, and each cluster point $(Z_X^*, Z_Y^*)$ of $\{(Z_X^k, Z_Y^k)\}_{k \in K}$ together with $(X^*, U^*, V^*)$ satisfies
$-\nabla f(X^*) - Z_X^* \in N_{\mathcal{X}}(X^*)$, $(Z_X^* - Z_Y^*)(V^*)^T = 0$, $(U^*)^T (Z_X^* - Z_Y^*) = 0$, $X^* - U^* V^* = 0$, $Z_Y^* \in N_\Omega(X^*)$.

17 IPAM, October 12, 2010 p. 17/54
PD methods for rank minimization (cont'd)
Remark: The above cluster point $(X^*, U^*, V^*, Z_X^*, Z_Y^*)$ satisfies the KKT conditions of the following reformulation of (5) (or, equivalently, (3)):
$\min_{X,U,V} \{ f(X) : X - UV = 0,\ UV \in \Omega,\ X \in \mathcal{X},\ U \in \mathbb{R}^{m \times r},\ V \in \mathbb{R}^{r \times n} \}$.
Theorem (inner iterations): Suppose that the following condition holds for any $\bar{U} \in \mathbb{R}^{m \times r}$, $\bar{V} \in \mathbb{R}^{r \times n}$ such that $\bar{U}^T \bar{U} = I$ and $\bar{Y} = \bar{U} \bar{V} \in \Omega$:
$\{ d_U \bar{V} + \bar{U} d_V - d_Y : d_U \in \mathbb{R}^{m \times r},\ d_V \in \mathbb{R}^{r \times n},\ d_Y \in T_\Omega(\bar{Y}) \} = \mathbb{R}^{m \times n}$.
Then the approximate solution $(X^k, Y^k) \in \mathcal{X} \times \mathcal{Y}$ for problem (6) satisfying (7)-(10) can be found by the BCD method described in steps 1a)-1d) within a finite number of iterations.

18 IPAM, October 12, 2010 p. 18/54
PD methods for rank minimization (cont'd)
Penalty decomposition method for (4): Let $\varrho_0 > 0$, $\sigma > 1$ be given, and let $P_\varrho(X,Y) := f(X) + \nu\,\mathrm{rank}(Y) + \tfrac{\varrho}{2}\|X - Y\|_F^2$. Choose an arbitrary $Y_0^0 \in \Omega$ and a constant $\Upsilon$ such that
$\Upsilon \ge \max\{ f(X^{\mathrm{feas}}) + \nu\,\mathrm{rank}(X^{\mathrm{feas}}),\ \min_{X \in \mathcal{X}} P_{\varrho_0}(X, Y_0^0) \}$. Set $k = 0$.
1) Set $l = 0$ and apply the BCD method to find an approximate solution $(X^k, Y^k) \in \mathcal{X} \times \Omega$ for the penalty subproblem
$\min\{ P_{\varrho_k}(X,Y) : X \in \mathcal{X},\ Y \in \Omega \}$
by performing steps 1a)-1c):
1a) Solve $X_{l+1}^k \in \mathrm{Arg}\min_{X \in \mathcal{X}} P_{\varrho_k}(X, Y_l^k)$.
1b) Solve $Y_{l+1}^k \in \mathrm{Arg}\min_{Y \in \Omega} P_{\varrho_k}(X_{l+1}^k, Y)$.
1c) Set $l \leftarrow l + 1$ and go to step 1a).
2) Set $\varrho_{k+1} := \sigma \varrho_k$.
3) If $\min_{X \in \mathcal{X}} P_{\varrho_{k+1}}(X, Y^k) > \Upsilon$, set $Y_0^{k+1} := X^{\mathrm{feas}}$. Otherwise, set $Y_0^{k+1} := Y^k$.
4) Set $k \leftarrow k + 1$ and go to step 1).

19 IPAM, October 12, 2010 p. 19/54
PD methods for $\ell_0$-norm minimization
Consider the $\ell_0$-norm minimization problems:
$\min_x \{ f(x) : \|x_J\|_0 \le r,\ x \in \mathcal{X} \}$,  (11)
$\min_x \{ f(x) + \nu \|x_J\|_0 : x \in \mathcal{X} \}$  (12)
for some integer $r \ge 0$ and $\nu \ge 0$ controlling the sparsity of the solution, where $\mathcal{X}$ is a closed convex set in $\mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function, and $\|x_J\|_0$ denotes the cardinality of the subvector formed by the entries of $x$ indexed by $J$.
Assumption: Problems (11) and (12) are feasible, and moreover, at least one feasible solution, denoted by $x^{\mathrm{feas}}$, is known.

20 IPAM, October 12, 2010 p. 20/54
PD methods for $\ell_0$-norm minimization
For simplicity, assume $J = \{1,2,\dots,n\}$. Define
$\mathcal{X}_M = \{ D(x) : x \in \mathcal{X} \}$, $f_M(X) = f(d(X))$ for all $X \in \mathcal{D}^n$,
where $\mathcal{D}^n$ denotes the set of $n \times n$ diagonal matrices and $d(X)$ the vector of diagonal entries of $X$. Observe:
$\min_x \{ f(x) : \|x\|_0 \le r,\ x \in \mathcal{X} \}\ \Longleftrightarrow\ \min_X \{ f_M(X) : \mathrm{rank}(X) \le r,\ X \in \mathcal{X}_M \}$,  (13)
which can be suitably solved by the above PD method. Define
$\mathcal{Y}_M := \{ Y \in S^n : \mathrm{rank}(Y) \le r \}$,
$Q_\varrho(X,Y) := f_M(X) + \tfrac{\varrho}{2} \|X - Y\|_F^2$,
$\tilde{Q}_\varrho(X,U,D) := Q_\varrho(X, U D U^T)$ for all $X \in \mathcal{D}^n$, $U \in \mathbb{R}^{n \times r}$, $D \in \mathcal{D}^r$.

21 IPAM, October 12, 2010 p. 21/54
Penalty decomposition method for (13): Let $\{\epsilon_k\}$ be a positive decreasing sequence. Let $\varrho_0 > 0$, $\sigma > 1$ be given. Choose an arbitrary $Y_0^0 \in \mathcal{Y}_M$ and a constant $\Upsilon \ge \max\{ f(x^{\mathrm{feas}}),\ \min_{X \in \mathcal{X}_M} Q_{\varrho_0}(X, Y_0^0) \}$. Set $k = 0$.
1) Set $l = 0$ and apply the BCD method to find an approximate solution $(X^k, Y^k) \in \mathcal{X}_M \times \mathcal{Y}_M$ for the penalty subproblem
$\min\{ Q_{\varrho_k}(X,Y) : X \in \mathcal{X}_M,\ Y \in \mathcal{Y}_M \}$
by performing steps 1a)-1d):
1a) Solve $X_{l+1}^k \in \mathrm{Arg}\min_{X \in \mathcal{X}_M} Q_{\varrho_k}(X, Y_l^k)$.
1b) Solve $Y_{l+1}^k \in \mathrm{Arg}\min_{Y \in \mathcal{Y}_M} Q_{\varrho_k}(X_{l+1}^k, Y)$.
1c) Set $(X^k, Y^k) := (X_{l+1}^k, Y_{l+1}^k)$. If $(X^k, Y^k)$ satisfies
$\mathrm{dist}\big( -\nabla_X Q_{\varrho_k}(X^k, Y^k),\ N_{\mathcal{X}_M}(X^k) \big) \le \epsilon_k$,
$\| \nabla_U \tilde{Q}_{\varrho_k}(X^k, U^k, D^k) \|_F \le \epsilon_k$,  (14)
$\| \nabla_D \tilde{Q}_{\varrho_k}(X^k, U^k, D^k) \|_F \le \epsilon_k$

22 IPAM, October 12, 2010 p. 22/54
for some $U^k \in \mathbb{R}^{n \times r}$, $D^k \in \mathcal{D}^r$ such that
$(U^k)^T U^k = I$, $Y^k = U^k D^k (U^k)^T$,  (15)
then go to step 2).
1d) Set $l \leftarrow l + 1$ and go to step 1a).
2) Set $\varrho_{k+1} := \sigma \varrho_k$.
3) If $\min_{X \in \mathcal{X}_M} Q_{\varrho_{k+1}}(X, Y^k) > \Upsilon$, set $Y_0^{k+1} := D(x^{\mathrm{feas}})$. Otherwise, set $Y_0^{k+1} := Y^k$.
4) Set $k \leftarrow k + 1$ and go to step 1).
end

23 IPAM, October 12, 2010 p. 23/54
Theorem: Assume that $\epsilon_k \to 0$. Let $\{(X^k, Y^k, U^k, D^k)\}$ be generated by the above PD method satisfying (14) and (15). Suppose that $\mathcal{X}_\Upsilon := \{ X \in \mathcal{X}_M : f_M(X) \le \Upsilon \}$ is compact. Then:
(a) $\{(X^k, Y^k, U^k, D^k)\}$ is bounded;
(b) Suppose that $\{(X^k, Y^k, U^k, D^k)\}_{k \in K}$ converges to $(X^*, Y^*, U^*, D^*)$. Then $X^* = Y^*$ and $X^*$ is a feasible point of problem (13). Moreover, if the following condition holds:
$\{ d_X - d_U D^* (U^*)^T - U^* d_D (U^*)^T - U^* D^* d_U^T : d_X \in T_{\mathcal{X}_M}(X^*),\ d_U \in \mathbb{R}^{n \times r},\ d_D \in \mathcal{D}^r \} = S^n$,
then $\{Z^k\}_{k \in K}$ is bounded, where $Z^k := \varrho_k (X^k - Y^k)$, and each cluster point $Z^*$ of $\{Z^k\}_{k \in K}$ together with $(X^*, U^*, D^*)$ satisfies
$-\nabla f_M(X^*) - Z^* \in N_{\mathcal{X}_M}(X^*)$, $Z^* U^* D^* = 0$, $D\big( (U^*)^T Z^* U^* \big) = 0$, $X^* - U^* D^* (U^*)^T = 0$.

24 IPAM, October 12, 2010 p. 24/54
PD methods for $\ell_0$-norm minimization
Remark: The above cluster point $(X^*, U^*, D^*, Z^*)$ satisfies the KKT conditions of the following reformulation of (13) (or, equivalently, (11)):
$\min_{X,U,D} \{ f_M(X) : X - U D U^T = 0,\ X \in \mathcal{X}_M,\ U \in \mathbb{R}^{n \times r},\ D \in \mathcal{D}^r \}$.
Goal: Transform the above PD method into one involving vector operations only. Define
$\mathcal{Y} = \{ y \in \mathbb{R}^n : \|y\|_0 \le r \}$,
$q_\varrho(x,y) = f(x) + \tfrac{\varrho}{2} \|x - y\|_2^2$ for all $x, y \in \mathbb{R}^n$.

25 IPAM, October 12, 2010 p. 25/54
Penalty decomposition method for (11): Let $\{\epsilon_k\}$ be a positive decreasing sequence. Let $\varrho_0 > 0$, $\sigma > 1$ be given. Choose an arbitrary $y_0^0 \in \mathcal{Y}$ and a constant $\Upsilon \ge \max\{ f(x^{\mathrm{feas}}),\ \min_{x \in \mathcal{X}} q_{\varrho_0}(x, y_0^0) \}$. Set $k = 0$.
1) Set $l = 0$ and apply the BCD method to find an approximate solution $(x^k, y^k) \in \mathcal{X} \times \mathcal{Y}$ for the penalty subproblem
$\min\{ q_{\varrho_k}(x,y) : x \in \mathcal{X},\ y \in \mathcal{Y} \}$
by performing steps 1a)-1d):
1a) Solve $x_{l+1}^k \in \mathrm{Arg}\min_{x \in \mathcal{X}} q_{\varrho_k}(x, y_l^k)$.
1b) Solve $y_{l+1}^k \in \mathrm{Arg}\min_{y \in \mathcal{Y}} q_{\varrho_k}(x_{l+1}^k, y)$.
1c) Set $(x^k, y^k) := (x_{l+1}^k, y_{l+1}^k)$. If $(x^k, y^k)$ satisfies
$\mathrm{dist}\big( -\nabla_x q_{\varrho_k}(x^k, y^k),\ N_{\mathcal{X}}(x^k) \big) \le \epsilon_k$,
then go to step 2).
1d) Set $l \leftarrow l + 1$ and go to step 1a).

26 IPAM, October 12, 2010 p. 26/54
2) Set $\varrho_{k+1} := \sigma \varrho_k$.
3) If $\min_{x \in \mathcal{X}} q_{\varrho_{k+1}}(x, y^k) > \Upsilon$, set $y_0^{k+1} := x^{\mathrm{feas}}$. Otherwise, set $y_0^{k+1} := y^k$.
4) Set $k \leftarrow k + 1$ and go to step 1).
end
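A sketch of this vector PD scheme for the illustrative instance $f(x) = \tfrac12\|Ax - b\|_2^2$ with $\mathcal{X} = \mathbb{R}^n$ (the talk allows any smooth $f$ and closed convex $\mathcal{X}$; this $f$ and the fixed iteration counts are assumptions made only for the demo):

```python
import numpy as np

def pd_l0_least_squares(A, b, r, rho=0.1, sigma=10.0, outer=15, inner=30):
    """Sketch of the vector PD method for (11) with f(x) = 0.5*||Ax - b||_2^2 and X = R^n.
    x-step: ridge-type linear solve; y-step: keep the r largest-magnitude entries of x."""
    n = A.shape[1]
    y = np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(outer):
        for _ in range(inner):
            # 1a) x-step: minimize 0.5*||Ax - b||^2 + (rho/2)*||x - y||^2
            x = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * y)
            # 1b) y-step: project x onto {y : ||y||_0 <= r}
            y = np.zeros(n)
            keep = np.argsort(-np.abs(x))[:r]
            y[keep] = x[keep]
        rho *= sigma                             # 2) increase the penalty parameter
    return y

A = np.random.randn(60, 100)
x_true = np.zeros(100)
x_true[:5] = np.random.randn(5)
b = A @ x_true
print(np.linalg.norm(pd_l0_least_squares(A, b, r=5) - x_true))
```

The $y$-step is just the projection onto $\{y : \|y\|_0 \le r\}$, i.e. keeping the $r$ largest-magnitude entries of $x$, which is the slide-10 proposition with $\phi_i(y_i) = \tfrac{\varrho}{2}(y_i - x_i)^2$.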

27 IPAM, October 12, 2010 p. 27/54
Theorem: Assume that $\epsilon_k \to 0$. Let $\{(x^k, y^k)\}$ be generated by the above PD method, and let $J_k = \{j_1^k, \dots, j_r^k\}$ be a set of $r$ distinct indices such that $(y^k)_j = 0$ for all $j \notin J_k$. Suppose that $\mathcal{X}_\Upsilon := \{ x \in \mathcal{X} : f(x) \le \Upsilon \}$ is compact. Then:
(a) $\{(x^k, y^k)\}$ is bounded;
(b) Suppose $(x^*, y^*)$ is a cluster point of $\{(x^k, y^k)\}$. Then $x^* = y^*$ and $x^*$ is a feasible point of problem (11). Moreover, there exists a subsequence $K$ such that $\{(x^k, y^k)\}_{k \in K} \to (x^*, y^*)$ and $J_k = J$ for some index set $J$ when $k \in K$ is sufficiently large. Furthermore, if
$\{ d_x + d_d : d_x \in T_{\mathcal{X}}(x^*),\ d_d \in \mathbb{R}^n,\ (d_d)_j = 0\ \forall j \notin J \} = \mathbb{R}^n$
holds, then $\{ z^k := \varrho_k (x^k - y^k) \}_{k \in K}$ is bounded and each cluster point $z^*$ of $\{z^k\}_{k \in K}$ together with $x^*$ satisfies
$-\nabla f(x^*) - z^* \in N_{\mathcal{X}}(x^*)$, $z_j^* = 0\ \forall j \in J$.  (16)

28 IPAM, October 12, 2010 p. 28/54
Remark: The optimality condition (16) is generally stronger than a natural optimality condition for (11). Let $I^* = \{ j : x_j^* \ne 0 \}$. Suppose that $x^*$ is a local minimum of (11). Then $x^*$ is clearly a local minimum of
$\min_{x \in \mathcal{X}} \{ f(x) : x_j = 0\ \forall j \notin I^* \}$.
Assume that the constraint qualification
$\{ d_x + d_d : d_x \in T_{\mathcal{X}}(x^*),\ (d_d)_j = 0\ \forall j \notin I^* \} = \mathbb{R}^n$
holds at $x^*$. Then there exists $z^* \in \mathbb{R}^n$ such that
$-\nabla f(x^*) - z^* \in N_{\mathcal{X}}(x^*)$, $z_j^* = 0\ \forall j \in I^*$.  (17)
Clearly, when $I^* \subseteq J$, (16) is generally stronger than (17). For example, when $r = n$ and $J = \{1,\dots,n\}$, problem (11) reduces to $\min_x \{ f(x) : x \in \mathcal{X} \}$ and (16) becomes $-\nabla f(x^*) \in N_{\mathcal{X}}(x^*)$. But (17) clearly cannot guarantee this when $I^* \ne \{1,\dots,n\}$.

29 IPAM, October 12, 2010 p. 29/54
Matrix completion
Test I: recover a low-rank matrix $M \in \mathbb{R}^{m \times n}$ based on a subset of entries of $M$:
$\min_{X \in \mathbb{R}^{m \times n}} \mathrm{rank}(X)$ s.t. $X_{ij} = M_{ij}$, $(i,j) \in \Theta$,
where $\Theta$ is a subset of index pairs $(i,j)$.
Instances: randomly generate 50 copies of $M$ and $\Theta$ with $m = n = 40$, $p = 800$ for each $r = 1,\dots,10$ by the same procedure as described by Ma, Goldfarb and Chen (2008).

30 IPAM, October 12, 2010 p. 30/54
Matrix completion
Initialization: $X^{\mathrm{feas}} \in \mathbb{R}^{m \times n}$ such that $X^{\mathrm{feas}}_{ij} = M_{ij}$ for all $(i,j) \in \Theta$ and $X^{\mathrm{feas}}_{ij} = 0$ for all $(i,j) \notin \Theta$; $Y_0^0 = X^{\mathrm{feas}}$; $\varrho_0 = 0.1$ and $\sigma = 10$.
Inner termination criterion:
$\dfrac{|Q_{\varrho_k}(X_l^k, Y_l^k) - Q_{\varrho_k}(X_{l-1}^k, Y_{l-1}^k)|}{\max(|Q_{\varrho_k}(X_l^k, Y_l^k)|, 1)}$ below a prescribed tolerance.
Outer termination criterion: $\max_{ij} |X_{ij}^k - Y_{ij}^k| \le 10^{-5}$.

31 IPAM, October 12, 2010 p. 31/54
Matrix completion
Relative error:
$\mathrm{rel\_err} := \dfrac{\|X - M\|_F}{\|M\|_F}$,
where $X$ is an approximate recovery of $M$. As in Recht, Fazel and Parrilo (2007) and Candès and Recht (2008), we say a matrix $M$ is successfully recovered by $X$ if the relative error is below a prescribed threshold.
Remark: The recoverability of $M$ depends on two ratios:
SR = $p/(mn)$ (fraction of sampled entries),
FR = $r(m+n-r)/p$ (degrees of freedom of a rank-$r$ matrix per sample).
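The three quantities on this slide as code (illustration only):

```python
import numpy as np

def completion_ratios(X, M, p, r):
    """rel_err, SR and FR exactly as defined on this slide."""
    m, n = M.shape
    rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
    SR = p / (m * n)                  # fraction of entries that are sampled
    FR = r * (m + n - r) / p          # degrees of freedom of a rank-r matrix per sample
    return rel_err, SR, FR
```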

32 IPAM, October 12, 2010 p. 32/54
Matrix completion
Table 1: Numerical results for m = 40, n = 40 and p = 800.
Columns: Rank and FR for each problem; NS, rel_err and Time for each of FPCA, LMaFit and PD.

33 IPAM, October 12, 2010 p. 33/54
Matrix completion
Test II: recover a high-rank matrix $M \in \mathbb{R}^{n \times n}$, most of whose singular values are nearly zero, by a low-rank matrix based on a subset of entries $\{M_{ij}\}_{(i,j) \in \Theta}$.
Instances: randomly generate 50 instances for each sampling ratio varying from 0.5 to 0.9, with the singular values given by $\sigma_i = i^{-4}$ for all $i$.

34 IPAM, October 12, 2010 p. 34/54
Matrix completion
Table 2: Numerical results for n = 40 and $\sigma_i = i^{-4}$.
Columns: SR and Rank for each problem; Rank and rel_err for each of FPCA, LMaFit and PD.

35 IPAM, October 12, 2010 p. 35/54
Matrix completion
Test III (grayscale image inpainting problem): fill in the missing pixel values of the image at given pixel locations.
(a) original image; (b) rank-40 image; (c) 50% masked original image; (d) recovered image by PD.

36 IPAM, October 12, 2010 p. 36/54
Matrix completion
Test III (grayscale image inpainting problem, cont'd):
(e) 50% masked rank-40 image; (f) recovered image by PD; (g) 6.3% masked rank-40 image; (h) recovered image by PD.

37 IPAM, October 12, 2010 p. 37/54
Nearest low-rank correlation
Nearest low-rank correlation problem:
$\min_{X \in S^n} \tfrac12 \|X - C\|_F^2$ s.t. $\mathrm{diag}(X) = e$, $\mathrm{rank}(X) \le r$, $X \succeq 0$,
for some correlation matrix $C \in S_+^n$ and some integer $r \in [1, n]$.
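One way the PD splitting could be instantiated for this problem (a sketch under assumptions of my own: the $X$-block carries the objective and the diagonal constraint, the $Y$-block carries the PSD and rank constraints; the talk does not spell out these subproblem formulas):

```python
import numpy as np

def pd_low_rank_correlation(C, r, rho=0.1, sigma=10.0, outer=15, inner=30):
    """Sketch of a PD splitting for the nearest low-rank correlation problem:
    the X-block keeps the objective and diag(X) = e, the Y-block keeps {Y psd, rank(Y) <= r}."""
    n = C.shape[0]
    Y = np.eye(n)
    for _ in range(outer):
        for _ in range(inner):
            # X-step: entrywise minimizer of 0.5*||X - C||^2 + (rho/2)*||X - Y||^2 with unit diagonal
            X = (C + rho * Y) / (1.0 + rho)
            np.fill_diagonal(X, 1.0)
            # Y-step: nearest PSD matrix of rank at most r (keep the r largest nonnegative eigenvalues)
            w, Q = np.linalg.eigh((X + X.T) / 2)
            w = np.maximum(w, 0.0)
            w[:-r] = 0.0                          # eigh sorts ascending: zero all but the r largest
            Y = (Q * w) @ Q.T
        rho *= sigma
    return Y

B = np.random.randn(20, 20)
C = B @ B.T
d = np.sqrt(np.diag(C))
C = C / np.outer(d, d)                            # a random correlation matrix
print(np.linalg.matrix_rank(pd_low_rank_correlation(C, r=3), tol=1e-8))
```

Both subproblems are cheap here: the $X$-step is an entrywise average with the diagonal reset to one, and the $Y$-step is an eigenvalue truncation, in the spirit of the technical preliminaries above.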

38 IPAM, October 12, 2010 p. 38/54
Nearest low-rank correlation
Table 3: Comparison of Major and PD (Iter, Obj, Time for each method) on problems P1 with n = 100 and n = 500, for several rank bounds r.

39 IPAM, October 12, 2010 p. 39/54
Nearest low-rank correlation
Table 4: Comparison of Major and PD (Iter, Obj, Time for each method) on problems P2 with n = 100 and n = 500, for several rank bounds r.

40 IPAM, October 12, 2010 p. 40/54
Nearest low-rank correlation
Table 5: Comparison of Major and PD (Iter, Obj, Time for each method) on problems P3 with n = 100 and n = 500, for several rank bounds r.

41 IPAM, October 12, 2010 p. 41/54
Nearest low-rank correlation
Table 6: Comparison of Major and PD (Iter, Obj, Time for each method) on problems P4 with n = 100 and n = 500, for several rank bounds r.

42 IPAM, October 12, 2010 p. 42/54
Sparse logistic regression
Given $n$ samples $z_i$ with $p$ features and $n$ binary outcomes $b_i$, let $a_i = b_i z_i$ for $i = 1,\dots,n$. The average logistic loss function is
$l_{\mathrm{avg}}(v,w) := \sum_{i=1}^n \theta(w^T a_i + v b_i)/n$ for $v \in \mathbb{R}$ and $w \in \mathbb{R}^p$,
where $\theta$ is the logistic loss function $\theta(t) := \log(1 + \exp(-t))$.
The sparse logistic regression problem:
$\min_{v,w} \{ l_{\mathrm{avg}}(v,w) : \|w\|_0 \le r \}$,
where the integer $r \in [1, p]$ controls the sparsity of the solution.
The $\ell_1$-norm regularization problem:
$\min_{v,w}\ l_{\mathrm{avg}}(v,w) + \lambda \|w\|_1$,
where $\lambda \ge 0$ is a regularization parameter.
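A small sketch of the average logistic loss as defined on this slide (illustration only):

```python
import numpy as np

def l_avg(v, w, Z, b):
    """Average logistic loss from this slide: a_i = b_i * z_i and
    l_avg(v, w) = (1/n) * sum_i log(1 + exp(-(w^T a_i + v*b_i)))."""
    a = Z * b[:, None]                       # rows of Z are the samples z_i
    t = a @ w + v * b
    return np.mean(np.logaddexp(0.0, -t))    # numerically stable log(1 + exp(-t))

Z = np.random.randn(200, 10)
b = np.where(np.random.rand(200) < 0.5, 1.0, -1.0)
print(l_avg(0.0, np.zeros(10), Z, b))        # equals log(2) when (v, w) = 0
```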

43 IPAM, October 12, 2010 p. 43/54
Sparse logistic regression
Given any model variables $(v,w)$ and a sample vector $z \in \mathbb{R}^p$, the outcome predicted by $(v,w)$ for $z$ is given by
$\phi(z) = \mathrm{sgn}(w^T z + v)$, where $\mathrm{sgn}(t) = +1$ if $t > 0$ and $-1$ otherwise.
The error rate of $(v,w)$ for predicting the outcomes $b_1,\dots,b_n$:
$\mathrm{Error} := \Big( \sum_{i=1}^n \|\phi(z_i) - b_i\|_0 / n \Big) \times 100\%$.
Goal: compare the quality of solutions of similar sparsity obtained by our PD method ($\ell_0$) and the IPM method ($\ell_1$) of Kim et al. (2007).
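And the corresponding prediction rule and error rate (illustration only):

```python
import numpy as np

def error_rate(v, w, Z, b):
    """Error rate from this slide: phi(z) = sgn(w^T z + v) with sgn(t) = +1 if t > 0, -1 otherwise;
    return the percentage of samples whose predicted sign differs from b_i."""
    pred = np.where(Z @ w + v > 0, 1.0, -1.0)
    return 100.0 * np.mean(pred != b)
```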

44 IPAM, October 12, 2010 p. 44/54
Sparse logistic regression
Table 7: Computational results on three real data sets (Colon, Ionosphere, Advertisements).
Columns: Features p, Samples n, $\lambda/\lambda_{\max}$, r; $l_{\mathrm{avg}}$ and Error (%) for each of IPM and PD.

45 IPAM, October 12, 2010 p. 45/54
Sparse logistic regression
Table 8: Average computational time on six random problems.
Columns: size (n, p); time for r = 0.1p, 0.3p, 0.5p, 0.7p, 0.9p.

46 IPAM, October 12, 2010 p. 46/54
Sparse inverse covariance selection
Sparse inverse covariance selection:
$\max_{X \succ 0}\ \log\det X - \langle \Sigma, X \rangle$
s.t. $\sum_{(i,j) \in \bar{\Omega}} \|X_{ij}\|_0 \le r$, $X_{ij} = 0$, $(i,j) \in \Omega$,  (18)
where $\bar{\Omega} = \{ (i,j) : (i,j) \notin \Omega,\ i \ne j \}$, and $r \in [1, |\bar{\Omega}|]$ is an integer controlling the sparsity of the solution.
The $\ell_1$-norm regularization:
$\max_{X \succ 0}\ \log\det X - \langle \Sigma, X \rangle - \sum_{(i,j) \in \bar{\Omega}} \rho_{ij} |X_{ij}|$
s.t. $X_{ij} = 0$, $(i,j) \in \Omega$,  (19)
where $\{\rho_{ij}\}_{(i,j) \in \bar{\Omega}}$ is a set of regularization parameters.
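In a PD treatment of (18), one block update is a Euclidean projection onto the sparsity constraints; a sketch of just that projection step (an illustration with $\Omega$ and $\bar{\Omega}$ supplied as boolean masks, and with both the fixed zero pattern and the cardinality budget placed in this step for the purpose of the demo; not code from the talk):

```python
import numpy as np

def project_sparsity_pattern(X, Omega, Omega_bar, r):
    """Frobenius projection onto {Y : Y_ij = 0 on Omega, at most r nonzeros on Omega_bar}:
    zero the Omega entries, keep the r largest-magnitude entries indexed by Omega_bar,
    and leave the remaining (diagonal) entries untouched.  Omega and Omega_bar are boolean
    masks, assumed symmetric; r is assumed even so symmetric pairs are kept or dropped together."""
    Y = X.copy()
    Y[Omega] = 0.0
    idx = np.flatnonzero(Omega_bar)                 # flat indices of the free off-diagonal entries
    if idx.size > r:
        order = np.argsort(-np.abs(X.flat[idx]))    # sort those entries by decreasing magnitude
        Y.flat[idx[order[r:]]] = 0.0                # zero all but the r largest
    return Y
```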

47 IPAM, October 12, 2010 p. 47/54
Sparse inverse covariance selection
Test I (random data): compare the solution quality of the $\ell_0$ problem and its $\ell_1$ regularization.
Data: randomly generate $\Sigma$ and $\Omega$ in the same manner as in d'Aspremont et al. (2007) and L. (2008).
Normalized entropy loss:
$\mathrm{Loss} := \tfrac{1}{p} \big( \langle \Sigma_t, X \rangle - \log\det(\Sigma_t X) - p \big)$.
Apply the PPA (Wang, Sun and Toh (2009)) to solve the $\ell_1$ regularization problem with $\rho_{\bar{\Omega}} = 0.01$, $0.05$, $0.1$ and $0.5$ and obtain an approximate solution $X$; set $r = \sum_{(i,j) \in \bar{\Omega}} \|X_{ij}\|_0$ for the $\ell_0$-norm problem so that the solution given by the PD method is at least as sparse as $X$.
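The normalized entropy loss as code (illustration only):

```python
import numpy as np

def entropy_loss(Sigma_t, X):
    """Normalized entropy loss from this slide: (1/p) * ( <Sigma_t, X> - logdet(Sigma_t X) - p ),
    with <Sigma_t, X> = trace(Sigma_t X) computed as an elementwise sum (both matrices symmetric)."""
    p = X.shape[0]
    _, logdet = np.linalg.slogdet(Sigma_t @ X)
    return (np.sum(Sigma_t * X) - logdet - p) / p

S = np.random.randn(5, 5)
Sigma_t = S @ S.T + 5 * np.eye(5)
print(entropy_loss(Sigma_t, np.linalg.inv(Sigma_t)))   # approximately 0 at X = Sigma_t^{-1}
```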

48 IPAM, October 12, 2010 p. 48/54
Sparse inverse covariance selection
Table 9: Computational results for δ = 10%.
Columns: problem data (p, $\bar{\Omega}$, $\rho_{\bar{\Omega}}$, r); Likelihood, Loss and Time for each of PPA and PD.

49 IPAM, October 12, 2010 p. 49/54
Sparse inverse covariance selection
Table 10: Computational results for δ = 50%.
Columns: problem data (p, $\bar{\Omega}$, $\rho_{\bar{\Omega}}$, r); Likelihood, Loss and Time for each of PPA and PD.

50 IPAM, October 12, 2010 p. 50/54
Sparse inverse covariance selection
Table 11: Computational results for δ = 100%.
Columns: problem data (p, $\bar{\Omega}$, $\rho_{\bar{\Omega}}$, r); Likelihood, Loss and Time for each of PPA and PD.

51 IPAM, October 12, 2010 p. 51/54
Sparse inverse covariance selection
Test II (random data): compare the sparsity recovery ability of the $\ell_0$ problem and its $\ell_1$ regularization.
Data: set $p = 30$ and randomly generate the true and sample covariance matrices $\Sigma_t$ and $\Sigma$ in the same manner as in d'Aspremont (2007);
set $\Omega = \{ (i,j) : (\Sigma_t)^{-1}_{ij} = 0,\ |i - j| \ge 15 \}$;
set $\rho_{ij} = \rho_{\bar{\Omega}}$ for all $(i,j) \in \bar{\Omega}$, where $\rho_{\bar{\Omega}}$ is the smallest value such that the total number of nonzero off-diagonal entries of the approximate solution obtained by the PPA when applied to the $\ell_1$ problem equals $\sum_{(i,j) \in \bar{\Omega}} \|(\Sigma_t)^{-1}_{ij}\|_0$;
set $r = \sum_{(i,j) \in \bar{\Omega}} \|(\Sigma_t)^{-1}_{ij}\|_0$ for the $\ell_0$ problem.

52 IPAM, October 12, 2010 p. 52/54
Sparse inverse covariance selection
(a) Original inverse $(\Sigma_t)^{-1}$; (b) Noisy inverse $\Sigma^{-1}$; (c) Approximate solution of (19); (d) Approximate solution of (18).

53 IPAM, October 12, 2010 p. 53/54
Sparse inverse covariance selection
Test III (real data): compare the solution quality of the $\ell_0$ problem and its $\ell_1$ regularization.
Data: pre-process two gene expression data sets by the same procedure as described by Li and Toh (2010) to obtain $\Sigma$;
set $\Omega = \emptyset$ and $\rho_{ij} = \rho_{\bar{\Omega}}$ for some $\rho_{\bar{\Omega}} > 0$;
choose $r$ so that the solution given by the PD method when applied to the $\ell_0$ problem is at least as sparse as the one obtained by the PPA when applied to the $\ell_1$ problem.

54 IPAM, October 12, 2010 p. 54/54
Sparse inverse covariance selection
Table 12: Computational results on two real data sets (Lymph, Leukemia).
Columns: Genes p, Samples n, $\rho_{\bar{\Omega}}$, r; Likelihood, Loss and Time for each of PPA and PD.
