TRANSFORMED SCHATTEN-1 ITERATIVE THRESHOLDING ALGORITHMS FOR LOW RANK MATRIX COMPLETION


TRANSFORMED SCHATTEN-1 ITERATIVE THRESHOLDING ALGORITHMS FOR LOW RANK MATRIX COMPLETION

SHUAI ZHANG, PENGHANG YIN, AND JACK XIN

(The work was partially supported by two NSF DMS grants. Department of Mathematics, University of California, Irvine, CA, 92697, USA. szhang3@uci.edu, penghany@uci.edu, jxin@math.uci.edu.)

Abstract. We study a non-convex low-rank promoting penalty function, the transformed Schatten-1 (TS), and its applications in matrix completion. The TS penalty, as a matrix quasi-norm defined on its singular values, interpolates the rank and the nuclear norm through a nonnegative parameter $a \in (0,+\infty)$. We consider the unconstrained TS regularized low-rank matrix recovery problem and develop a fixed point representation for its global minimizer. The TS thresholding functions are in closed analytical form for all parameter values. The TS threshold values differ in the subcritical (supercritical) parameter regime, where the TS threshold functions are continuous (discontinuous). We propose TS iterative thresholding algorithms and compare them with some state-of-the-art algorithms on matrix completion test problems. For problems with known rank, a fully adaptive TS iterative thresholding algorithm consistently performs the best under different conditions, where the ground truth matrices are generated by multivariate Gaussian, (0,1) uniform and Chi-square distributions. For problems with unknown rank, TS algorithms with an additional rank estimation procedure approach the level of IRucL-q, an iterative reweighted algorithm, non-convex in nature and best in performance.

Key words. Transformed Schatten-1 penalty, fixed point representation, closed form thresholding function, iterative thresholding algorithms, matrix completion.

AMS subject classifications. 90C26, 90C46.

1. Introduction
Low rank matrix completion problems arise in many applications such as collaborative filtering in recommender systems [4, 7], minimum order system and low-dimensional Euclidean embedding problems in control theory [4, 5], network localization [8], and others [26]. The mathematical problem is
$$\min_{X \in \mathbb{R}^{m\times n}} \mathrm{rank}(X) \quad \text{s.t.} \quad X \in L, \tag{1.1}$$
where $L$ is a convex set. In this paper, we are interested in methods for solving the affine rank minimization problem (ARMP)
$$\min_{X \in \mathbb{R}^{m\times n}} \mathrm{rank}(X) \quad \text{s.t.} \quad \mathcal{A}(X) = b \in \mathbb{R}^{p}, \tag{1.2}$$
where the linear transformation $\mathcal{A}: \mathbb{R}^{m\times n} \to \mathbb{R}^{p}$ and the vector $b \in \mathbb{R}^{p}$ are given. The matrix completion problem
$$\min_{X \in \mathbb{R}^{m\times n}} \mathrm{rank}(X) \quad \text{s.t.} \quad X_{i,j} = M_{i,j}, \ (i,j) \in \Omega, \tag{1.3}$$
is a special case of (1.2), where $X$ and $M$ are both $m\times n$ matrices and $\Omega$ is a subset of index pairs $\{(i,j)\}$. The optimization problems above are known to be NP-hard. Many alternative penalties have been utilized as proxies for finding low rank solutions, in both the constrained and unconstrained settings:
$$\min_{X \in \mathbb{R}^{m\times n}} F(X) \quad \text{s.t.} \quad \mathcal{A}(X) = b \tag{1.4}$$
and
$$\min_{X \in \mathbb{R}^{m\times n}} \frac12\|\mathcal{A}(X) - b\|_2^2 + \lambda F(X). \tag{1.5}$$
A small numerical sketch of how such a completion instance is set up is given below.
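To make the setting of (1.3)–(1.5) concrete, the following short numpy sketch (our illustration; all names, sizes and parameter values are arbitrary choices, not taken from the paper) builds a synthetic completion instance: a random rank-$r$ ground truth $M = M_L M_R^T$, a uniformly sampled observation set $\Omega$, and the observed values $b$.

```python
import numpy as np

def make_completion_instance(m=100, n=100, r=5, sr=0.4, seed=0):
    """Rank-r ground truth M = M_L @ M_R.T plus a uniformly sampled mask Omega (fraction sr)."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((m, r)) @ rng.standard_normal((n, r)).T   # rank(M) <= r
    p = int(round(sr * m * n))                                        # number of observations
    flat = rng.choice(m * n, size=p, replace=False)
    mask = np.zeros(m * n, dtype=bool)
    mask[flat] = True
    mask = mask.reshape(m, n)                                         # Omega as a boolean mask
    b = M[mask]                                                       # observed entries A(M) = b
    return M, mask, b

M, mask, b = make_completion_instance()
print(M.shape, int(mask.sum()), np.linalg.matrix_rank(M))
```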

The penalty function $F(\cdot)$ is defined on the singular values of the matrix $X$, typically $F(X) = \sum_i f(\sigma_i)$, where $\sigma_i$ is the $i$-th largest singular value of $X$, arranged in descending order. The Schatten $p$-norm (nuclear norm at $p=1$) results when $f(x) = x^p$, $p \in (0,1]$; at $p = 0$ ($p = 2$), $F$ is the rank (Frobenius norm). Recovering the rank under suitable conditions for $p \in (0,1]$ has been extensively studied in theory and algorithms [2, 3, 4, 9, 2, 2, 23, 24, 28]. Non-convex penalty based methods have shown better performance on hard problems [2, 24]. There is also a novel method for solving the constrained problem (1.4) from the perspective of gauge duality [32, 33].

Recently, a class of $\ell_1$ based non-convex penalties, the transformed $\ell_1$ (TL), has been found effective and robust for compressed sensing problems [3, 3]. TL interpolates $\ell_0$ and $\ell_1$, similarly to the $\ell_p$ quasi-norm ($p \in (0,1)$). In the entire range of the interpolation parameter, TL enjoys a closed form iterative thresholding function, which is available for $\ell_p$ only at a few specific values such as $p = 0, 1, 1/2, 2/3$; see [, 5, 7, 29]. This feature allows TL to perform fast and robust sparse minimization over a much wider range than the $\ell_p$ quasi-norm. Moreover, the TL penalty possesses unbiasedness and Lipschitz continuity besides sparsity [2, 22]. It is the goal of this paper to extend the TL penalty to TS (transformed Schatten-1) for low rank matrix completion and compare it with state-of-the-art methods in the literature.

The rest of the paper is organized as follows. In section 2, we present the transformed Schatten-1 function (TS), the TS regularized minimization problems, and a derivation of the thresholding representation of the global minimum. In section 3, we propose two thresholding algorithms (TS-s and TS-s2) based on a fixed point equation of the global minimum. In section 4, we compare the TS algorithms with some state-of-the-art algorithms through numerical experiments in low rank matrix recovery and image inpainting. Concluding remarks are in section 5.

1.1. Notation
Here we set the notation for this paper. Two kinds of inner products are used in the following sections, one between vectors and the other between matrices:
$$(x,y) = \sum_i x_i y_i \ \text{ for vectors } x, y; \qquad \langle X, Y\rangle = \mathrm{tr}(Y^T X) = \sum_{i,j} X_{i,j} Y_{i,j} \ \text{ for matrices } X, Y.$$
Assume the matrix $X \in \mathbb{R}^{m\times n}$ has $r$ positive singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$. Let us introduce some common matrix norms or quasi-norms (a small numerical sketch of these quantities follows below):
Nuclear norm: $\|X\|_* = \sum_{i=1}^r \sigma_i$;
Schatten $p$ quasi-norm: $\|X\|_p = \left(\sum_{i=1}^r \sigma_i^p\right)^{1/p}$, for $p \in (0,1)$;
Frobenius norm: $\|X\|_F = \left(\sum_{i=1}^r \sigma_i^2\right)^{1/2}$; also $\|X\|_F^2 = \langle X, X\rangle = \sum_{i,j} X_{i,j}^2$;
Ky Fan $k$-norm: $\|X\|_{F_k} = \sum_{i=1}^k \sigma_i$, for $1 \le k \le r$;
Induced $L_2$ norm: $\|X\|_{L_2} = \max_{\|v\|_2 = 1}\|Xv\|_2 = \sigma_1$.
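All of the quantities above are functions of the singular values. The sketch below (our own convenience helper; the default values of p and k are only illustrative) computes them with numpy.

```python
import numpy as np

def matrix_norms(X, p=0.5, k=3):
    """Nuclear, Schatten-p, Frobenius, Ky Fan k and induced L2 norms via the singular values."""
    s = np.linalg.svd(X, compute_uv=False)        # singular values in descending order
    return {
        "nuclear": s.sum(),
        "schatten_p": (s ** p).sum() ** (1.0 / p),
        "frobenius": np.sqrt((s ** 2).sum()),     # equals np.linalg.norm(X, 'fro')
        "ky_fan_k": s[:k].sum(),
        "induced_L2": s[0],
    }
```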

Define the function $\mathrm{vec}(\cdot)$ to unfold a matrix columnwise into a vector. It is then clear that $\|\mathrm{vec}(X)\|_2 = \|X\|_F$, where the norm on the left-hand side is the vector $\ell_2$ norm. Define the shrinkage identity matrix $I^s_k \in \mathbb{R}^{m\times n}$ as follows:
$$I^s_k(i,i) = 1 \ \text{for the first } k \text{ diagonal elements}; \qquad I^s_k(i,j) = 0 \ \text{otherwise}. \tag{1.6}$$
The operator $\mathrm{tr}_k(\cdot)$ is defined as the first-$k$ partial trace of a matrix,
$$\mathrm{tr}_k(X) = \sum_{i=1}^k X_{i,i}. \tag{1.7}$$
The following matrix functions will be used in the proofs of the next section, and we record them here for reference:
$$\begin{aligned}
C_\lambda(X) &= \tfrac12\|\mathcal{A}(X) - b\|_2^2 + \lambda T(X);\\
C_{\lambda,\mu}(X,Z) &= \mu\left\{C_\lambda(X) - \tfrac12\|\mathcal{A}(X) - \mathcal{A}(Z)\|_2^2\right\} + \tfrac12\|X - Z\|_F^2\\
&= \mu\lambda T(X) + \tfrac{\mu}{2}\|b\|_2^2 - \tfrac{\mu}{2}\|\mathcal{A}(Z)\|_2^2 - \mu(\mathcal{A}(X),\, b - \mathcal{A}(Z)) + \tfrac12\|X - Z\|_F^2;\\
B_\mu(Z) &= Z + \mu\,\mathcal{A}^*(b - \mathcal{A}(Z)).
\end{aligned} \tag{1.8}$$

2. TS minimization and thresholding representation
First, let us introduce the transformed Schatten-1 penalty function (TS), based on the singular values of a matrix:
$$T(X) = \sum_{i=1}^{\mathrm{rank}(X)} \rho_a(\sigma_i), \tag{2.1}$$
where $\rho_a(\cdot)$ is a linear-to-linear rational function with parameter $a \in (0,\infty)$ [3, 3],
$$\rho_a(|x|) = \frac{(a+1)|x|}{a + |x|}. \tag{2.2}$$
With the change of the parameter $a$, TL interpolates the $\ell_0$ and $\ell_1$ norms:
$$\lim_{a\to 0^+}\rho_a(x) = \mathbb{1}_{\{x\neq 0\}}, \qquad \lim_{a\to +\infty}\rho_a(x) = |x|.$$
In Fig. 2.1, level lines of TL on the plane are shown at small and large values of the parameter $a$, resembling those of $\ell_1$ (for large $a$), $\ell_{1/2}$ (for moderate $a$), and $\ell_0$ (for small $a$). We shall focus on the TS regularized problem
$$\min_{X\in\mathbb{R}^{m\times n}} \frac12\|\mathcal{A}(X) - b\|_2^2 + \lambda T(X), \tag{2.3}$$
where the linear transform $\mathcal{A}: \mathbb{R}^{m\times n}\to\mathbb{R}^p$ can be determined by $p$ given matrices $A_1,\dots,A_p \in \mathbb{R}^{m\times n}$, that is, $\mathcal{A}(X) = (\langle A_1, X\rangle, \dots, \langle A_p, X\rangle)^T$. (A small numerical sketch of the penalty $T(X)$ follows below.)
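As a quick illustration of the definitions (2.1)–(2.2), the following sketch evaluates $\rho_a$ and $T(X)$ with numpy; the function names are ours, and the check at the end only exercises the two limits $a \to 0^+$ (rank) and $a \to +\infty$ (nuclear norm).

```python
import numpy as np

def rho_a(x, a):
    """Transformed l1 function rho_a(|x|) = (a+1)|x| / (a+|x|), applied entrywise."""
    x = np.abs(np.asarray(x, dtype=float))
    return (a + 1.0) * x / (a + x)

def ts_penalty(X, a):
    """TS penalty T(X) of (2.1): rho_a summed over the singular values of X."""
    sigma = np.linalg.svd(X, compute_uv=False)
    return rho_a(sigma, a).sum()

# Two limiting checks: small a approaches rank(X), large a approaches the nuclear norm.
X = np.random.randn(40, 5) @ np.random.randn(5, 30)
print(ts_penalty(X, 1e-6), np.linalg.matrix_rank(X))
print(ts_penalty(X, 1e6), np.linalg.svd(X, compute_uv=False).sum())
```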

Fig. 2.1: Level lines of TL with different parameters $a$: panel (a) shows the $\ell_1$ level lines for comparison, and panels (b)–(d) show TL for decreasing values of $a$. For a large parameter $a$ the graph looks almost the same as that of $\ell_1$ (panel a), while for small values of $a$ it tends toward the coordinate axes.

2.1. Overview of TL minimization
To set the stage for the discussion of the TS regularized problem (2.3), we review the following results on one-dimensional TL optimization [3]. Let us consider the unconstrained TL regularized problem:
$$\min_{x\in\mathbb{R}^n} \frac12\|Ax - y\|_2^2 + \lambda P_a(x), \tag{2.4}$$
where the matrix $A\in\mathbb{R}^{m\times n}$ and the vector $y\in\mathbb{R}^m$ are given, $P_a(x) = \sum_i \rho_a(|x_i|)$, and the function $\rho_a(\cdot)$ is as in (2.2). In this subsection on TL minimization, we overload the operator $B_\mu(\cdot)$ to act on a vector $x$, instead of on a matrix as in (1.8):
$$B_\mu(x) = x + \mu A^T(y - Ax). \tag{2.5}$$
In Theorem 2.1 below, a closed form expression is given for the proximal operator $\mathrm{prox}_{\lambda\rho_a}$ of the univariate TL regularization problem, where
$$\mathrm{prox}_{\lambda\rho_a}(x) = \arg\min_{y\in\mathbb{R}} \ \frac12(y - x)^2 + \lambda\rho_a(|y|).$$
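As a bridge to the closed-form results of the next subsection, here is a minimal proximal-gradient sketch for problem (2.4), built around the operator $B_\mu$ of (2.5). It is our own illustration: the proximal map is passed in as an argument, to be instantiated either with soft-thresholding (the $\ell_1$ case, shown as an example) or with the closed-form TL thresholding of Theorem 2.1 below.

```python
import numpy as np

def prox_gradient(A, y, lam, prox, mu=None, iters=500):
    """Iterate x_{k+1} = prox(B_mu(x_k), lam*mu) with B_mu(x) = x + mu*A^T(y - A x)."""
    m, n = A.shape
    if mu is None:
        mu = 0.99 / np.linalg.norm(A, 2) ** 2       # step size mu < ||A||_2^{-2}
    x = np.zeros(n)
    for _ in range(iters):
        x = prox(x + mu * A.T @ (y - A @ x), lam * mu)
    return x

# Example proximal map: soft-thresholding, i.e. the l1 (a -> infinity) limit of TL.
soft_threshold = lambda w, t: np.sign(w) * np.maximum(np.abs(w) - t, 0.0)
```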

The proximal operator of a convex function is usually intended to solve a small convex regularization problem, which often admits a closed-form formula or an efficient specialized numerical method. However, for non-convex functions, such as $\ell_p$ with $p \in (0,1)$, the related proximal operators do not have closed form solutions in general. There are many iterative algorithms to approximate an optimal solution, but they need more computing time and sometimes converge only to local optima or stationary points. In this subsection, we show that for the TL function there indeed exists a closed-form formula for its optimal solution. Differently from other thresholding operators, TL has two threshold value formulas, depending on the regularization parameter $\lambda$ and the TL parameter $a$. We present them here with the same notation as [3]:
$$t^*_2 = \lambda\,\frac{a+1}{a} \ \text{(sub-critical parameter)}, \qquad t^*_3 = \sqrt{2\lambda(a+1)} - \frac{a}{2} \ \text{(super-critical parameter)}. \tag{2.6}$$
The inequality $t^*_3 \le t^*_2$ holds, and the equality is realized if and only if $\lambda = \frac{a^2}{2(a+1)}$; see [3]. Let $\mathrm{sgn}(\cdot)$ be the standard signum function with $\mathrm{sgn}(0) = 0$, and
$$h_\lambda(x) = \mathrm{sgn}(x)\left\{\frac{2}{3}(a + |x|)\cos\!\left(\frac{\varphi(x)}{3}\right) - \frac{2a}{3} + \frac{|x|}{3}\right\} \tag{2.7}$$
with $\varphi(x) = \arccos\!\left(1 - \frac{27\lambda a(a+1)}{2(a+|x|)^3}\right)$. In general, $|h_\lambda(x)| \le |x|$; see [3].

Theorem 2.1. ([3]) The optimal solution of $y^* = \arg\min_{y\in\mathbb{R}}\{\frac12(y-x)^2 + \lambda\rho_a(|y|)\}$ is a thresholding function of the form
$$y^* = \begin{cases} 0, & |x| \le t^*,\\ h_\lambda(x), & |x| > t^*,\end{cases} \tag{2.8}$$
where $h_\lambda(\cdot)$ is defined in (2.7), and the threshold parameter $t^*$ depends on $\lambda$ as follows:
1. if $\lambda \le \frac{a^2}{2(a+1)}$ (sub-critical and critical), $t^* = t^*_2 = \lambda\frac{a+1}{a}$;
2. if $\lambda > \frac{a^2}{2(a+1)}$ (super-critical), $t^* = t^*_3 = \sqrt{2\lambda(a+1)} - \frac{a}{2}$.

According to the above theorem, we introduce the thresholding operator $g_{\lambda,a}(\cdot)$ on $\mathbb{R}$,
$$g_{\lambda,a}(w) = \begin{cases} 0, & \text{if } |w| \le t^*,\\ h_\lambda(w), & \text{if } |w| > t^*,\end{cases} \tag{2.9}$$
where $t^*$ is the threshold value in Theorem 2.1 and $h_\lambda(\cdot)$ is as in (2.7). In [3], the authors proved that when $\lambda < \frac{a^2}{2(a+1)}$ the TL thresholding function is continuous, like the soft-thresholding function [8, 9], while if $\lambda > \frac{a^2}{2(a+1)}$ the TL thresholding function has a jump discontinuity at the threshold value, similar to the half-thresholding function [29]. Among different thresholding schemes, it is believed that a continuous formula is more stable, while a discontinuous formula separates nonzero and trivial coefficients more efficiently and sometimes converges faster. (A short numerical sketch of $g_{\lambda,a}$ is given below.)
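The closed forms (2.6)–(2.9) translate directly into code. The sketch below is our reading of those formulas (not the authors' released implementation), vectorized with numpy; the `clip` guards `arccos` against round-off just outside $[-1,1]$.

```python
import numpy as np

def tl1_threshold(w, lam, a):
    """Closed-form TL thresholding g_{lam,a}(w) of (2.9), applied entrywise to w."""
    w = np.asarray(w, dtype=float)
    if lam <= a * a / (2.0 * (a + 1.0)):            # sub-critical: continuous, t = t2*
        t = lam * (a + 1.0) / a
    else:                                           # super-critical: jump at t = t3*
        t = np.sqrt(2.0 * lam * (a + 1.0)) - a / 2.0
    absw = np.abs(w)
    # h_lambda(w) of (2.7)
    c = 1.0 - 27.0 * lam * a * (a + 1.0) / (2.0 * (a + absw) ** 3)
    phi = np.arccos(np.clip(c, -1.0, 1.0))
    h = np.sign(w) * (2.0 / 3.0 * (a + absw) * np.cos(phi / 3.0) - 2.0 * a / 3.0 + absw / 3.0)
    return np.where(absw > t, h, 0.0)

# Small entries are set to zero; large ones survive with a mild shrinkage (|h(w)| <= |w|).
print(tl1_threshold([-3.0, -0.2, 0.05, 1.5], lam=0.3, a=1.0))
```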

We have the following representation theorem for the TL regularized problem (2.4).

Theorem 2.2. ([3]) If $x^* = (x^*_1, x^*_2, \dots, x^*_n)^T$ is a TL regularized solution of (2.4) with $a$ and $\lambda$ positive constants, and $0 < \mu < \|A\|_2^{-2}$, then, letting
$$t^* = t^*_2\,\mathbb{1}_{\{\lambda\mu \le \frac{a^2}{2(a+1)}\}} + t^*_3\,\mathbb{1}_{\{\lambda\mu > \frac{a^2}{2(a+1)}\}}$$
(with $t^*_2$, $t^*_3$ as in (2.6) with $\lambda$ replaced by $\lambda\mu$), the optimal solution satisfies the fixed point equation
$$x^*_i = g_{\lambda\mu,a}\big([B_\mu(x^*)]_i\big), \qquad i = 1,\dots,n. \tag{2.10}$$
In the following, we extend this result to TS low rank matrix completion and propose two thresholding algorithms based on it.

2.2. TS thresholding representation theory
Here we assume $m \le n$. For a matrix $X \in \mathbb{R}^{m\times n}$ with rank equal to $r$, its singular value vector $\sigma = (\sigma_1,\dots,\sigma_m)$ is arranged as $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0 = \sigma_{r+1} = \dots = \sigma_m$. The singular value decomposition (SVD) is $X = U D V^T$, where $U = (U_{i,j})_{m\times m}$ and $V = (V_{i,j})_{n\times n}$ are unitary matrices, and $D = \mathrm{Diag}(\sigma) \in \mathbb{R}^{m\times n}$ is diagonal. In [3], Ky Fan proved the dominance theorem and derived the following Ky Fan k-norm inequality.

Lemma 2.1. (Ky Fan k-norm inequality) For a matrix $X \in \mathbb{R}^{m\times n}$ with SVD $X = U D V^T$, where the diagonal elements of $D$ are arranged in decreasing order, we have
$$\langle X, I^s_k\rangle \le \langle D, I^s_k\rangle, \quad \text{that is,} \quad \mathrm{tr}_k(X) \le \mathrm{tr}_k(D) = \|X\|_{F_k}, \qquad k = 1,2,\dots,m.$$
The inequalities become equalities if and only if $X = D$. Here the matrix $I^s_k$ and the operator $\mathrm{tr}_k(\cdot)$ are defined in section 1.1. Another proof of this inequality, without using the dominance theorem, is given in the appendix for the reader's convenience, making the paper self-contained.

Theorem 2.3. Consider a matrix $Y \in \mathbb{R}^{m\times n}$ that admits a singular value decomposition of the form $Y = U\,\mathrm{Diag}(\sigma)\,V^T$, where $\sigma = (\sigma_1,\dots,\sigma_m)$. Then the unique global minimizer of $\min_{X\in\mathbb{R}^{m\times n}} \frac12\|X - Y\|_F^2 + \lambda T(X)$ is
$$X_s = G_{\lambda,a}(Y) = U\,\mathrm{Diag}(g_{\lambda,a}(\sigma))\,V^T, \tag{2.11}$$
where $g_{\lambda,a}(\cdot)$ is defined in (2.9) and applied entrywise to $\sigma$.

Proof. First, due to the unitary invariance of the Frobenius norm and $Y = U\,\mathrm{Diag}(\sigma)V^T$, we have
$$\frac12\|X - Y\|_F^2 + \lambda T(X) = \frac12\|U^T X V - \mathrm{Diag}(\sigma)\|_F^2 + \lambda T(U^T X V).$$
So
$$X_s = \arg\min_{X\in\mathbb{R}^{m\times n}} \frac12\|X - Y\|_F^2 + \lambda T(X) = U\left\{\arg\min_{X\in\mathbb{R}^{m\times n}} \frac12\|X - \mathrm{Diag}(\sigma)\|_F^2 + \lambda T(X)\right\}V^T. \tag{2.12}$$

7 S. Zhang, P. Yin, J. Xin 7 Next we want to show: argmin 2 X Diag(σ) 2 F +λt (X) X R m n = argmin {D R m n is diagonal} 2 D Diag(σ) 2 F +λt (D) (2.3) For any X R m n, suppose it admits SVD: X = U x Diag(σ x )V T x. Denote D x = Diag(σ x ) and D y = Diag(σ). We can rewrite diagonal matrix D y as D y = m σ i Ii s, where σ i = σ i σ i+ for i =,2,...,m, and σ m = σ m. So simply, i matrix I s i is defined in section.. So: i m σ i = σ k. Note that the shrinkage identity i=k X,D y = X, m σ i Ii s = m X, σ i Ii s i m D x, σ i Ii s = D x,d y, i where we used Lemma 2. for the inequality. The equality holds if and only if X = D x. Thus we have Also due to T (X) = T (D x ), X D y 2 F = X 2 F + D y 2 F 2 X,D y D x 2 F + D y 2 F 2 D x,d y = D x D y 2 F. 2 X D y 2 F +λt (X) 2 D x D y 2 F +λt (D x ). Only when X = D x is a diagonal matrix, the above will become equality. So we proved equation (2.3). Denote a diagonal matrix D R m n as D = Diag(d). Then: 2 D Diag(σ) 2 F +λt (D) = m i= i { } 2 d i σ i 2 2 +λρ a ( d i ) By Theorem 2., we have g λ,a (σ i ) = argmin{ d 2 d σ i 2 2 +λρ a ( d ) }. It follows that argmin {D R m n and D is diagonal} 2 D Diag(σ) 2 F +λ T (D) = argmin 2 X Diag(σ) 2 F +λ T (X) X R m n = Diag(g λ,a (σ)). (2.4) In view of (2.2), the matrix X s = UDiag(g λ,a (σ))v T is a global minimizer, which will be denoted as G λ,a (Y ). Let X be another optimal solution for the problem min X R m n 2 X Y 2 F +λt (X) and denote X = U T X V. Then X will be an optimal solution of argmin 2 X X R m n Diag(σ) 2 F +λt (X). Based on the above proof and Theorem 2., we have X =

8 8 TS for Low Rank Matrix Completion Diag(g λ,a (σ)). Since U and V are unitary, X = U X V T = X s. We proved that the matrix G λ,a (Y ) is the unique optimal solution for the optimization problem. The proof is complete. Lemma 2.2. For any fixed λ >, µ > and matrix Z R m n, let X s = G λµ,a (B µ (Z)), then X s is the unique global minimizer of the problem min C λ,µ(x,z), where the X R m n matrix function C λ,µ (X,Z) is defined in (.8) of section.. Proof. First, we will rewrite the formula of C λ,µ (X,Z). Note that A (X) and A (Z) are vectors in space R p. Thus in the formula of C λ,µ (X,Z), there exist norms and inner products for both matrices and vectors. By definition, Thus if we fix matrix Z, C λ,µ (X,Z) = 2 X 2 F X,Z + 2 Z 2 F +λµt (X)+ µ 2 b 2 2 µ(a (X),b A (Z)) µ 2 A (Z) 2 2 = 2 X 2 F + 2 Z 2 F + µ 2 b 2 2 µ 2 A (Z) 2 2 +λµt (X) X,Z +µa (b A (Z)) = 2 X B µ(z) 2 F +λµt (X) 2 B µ(z) 2 F + 2 Z 2 F + µ 2 b 2 2 µ 2 A (Z) 2 2 (2.5) argmin C λ,µ (X,Z) = argmin 2 X B µ(z) 2 F +λµt (X) (2.6) X R m n X R m n Then by Theorem 2.3, X s is the unique global minimizer. Theorem 2.4. For fixed parameters, λ > and < µ < A 2 2. If X is a global minimizer for problem C λ (X), then X is the unique global minimizer for problem min C λ,µ(x,x ). X R m n Proof. C λ,µ (X,X ) = µ{ 2 A (X) b 2 2 +λt (X)} + 2 { X X 2 F µ A (X) A (X ) 2 2} µ{ 2 A (X) b 2 2 +λt (X) } = µc λ (X) µc λ (X ) = C λ,µ (X,X ) The first inequality is due to the fact: A (X) A (X ) 2 2 = Avec(X) Avec(X ) 2 2 A 2 2 vec(x X ) 2 2 A 2 2 X X 2 F (2.7) (2.8) From the above inequalities, we know that X is an optimal solution for min X R m n C λ,µ (X,X ). The uniqueness of X follows from Lemma 2.2. By the above Theorems and Lemmas, if X is a global minimizer of C λ (X), it is also the unique global minimizer of C λ,µ (X,Z) with Z = X, which has the closed form solution formula. Thus we arrive at the following fixed point equation for this global minimizer X : X = G λµ,a (B µ (X )). (2.9) Suppose the SVD for matrix B µ (X ) is U Diag(σ b )V T, then X = U Diag(g λµ,a (σ b ))V T, which means that the singular values of X satisfy σi = g λµ,a(σb,i ), for i =,...,m.
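Theorem 2.3 and the fixed point relation $X^* = G_{\lambda\mu,a}(B_\mu(X^*))$ suggest the basic iteration $X^{n+1} = G_{\lambda\mu,a}(B_\mu(X^n))$, i.e. the TS-IT scheme described at the start of the next section. The sketch below is a minimal, dense-SVD illustration of that iteration for the matrix completion operator $\mathcal{A}(X) = X_\Omega$ with a fixed $\lambda$; it reuses `tl1_threshold` from the earlier sketch, whereas the paper's experiments use an approximate Monte Carlo SVD and the adaptive parameter rules of section 3.

```python
import numpy as np

def ts_matrix_threshold(Y, lam, a):
    """G_{lam,a}(Y) = U Diag(g_{lam,a}(sigma)) V^T of Theorem 2.3 (dense SVD version)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(tl1_threshold(s, lam, a)) @ Vt

def ts_iterative_thresholding(mask, b, shape, lam=0.1, a=1.0, mu=0.99, iters=300):
    """Basic TS-IT: X_{n+1} = G_{lam*mu,a}(X_n + mu*A*(b - A(X_n))) for A(X) = X_Omega.
    For a sampling operator, ||A||_2 = 1, so any mu in (0,1) is admissible."""
    X = np.zeros(shape)
    X[mask] = b                        # observed entries as the initial guess
    for _ in range(iters):
        R = np.zeros(shape)
        R[mask] = b - X[mask]          # A*(b - A(X)) places the residual back on Omega
        X = ts_matrix_threshold(X + mu * R, lam * mu, a)
    return X
```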

3. TS thresholding algorithms
Next we utilize the fixed point equation $X^* = G_{\lambda\mu,a}(B_\mu(X^*))$ derived at the end of section 2 to build two thresholding algorithms for the TS regularized problem (2.3). As in [3, 3], from the equation $X^* = G_{\lambda\mu,a}(B_\mu(X^*)) = U\,\mathrm{Diag}(g_{\lambda\mu,a}(\sigma))V^T$, we replace the optimal matrix $X^*$ by $X^{k+1}$ on the left and by $X^k$ on the right at the $k$-th iteration step:
$$X^{k+1} = G_{\lambda\mu,a}(B_\mu(X^k)) = U^k\,\mathrm{Diag}\big(g_{\lambda\mu,a}(\sigma^k)\big)V^{k,T}, \tag{3.1}$$
where the unitary matrices $U^k$, $V^k$ and the singular values $\sigma^k$ come from the SVD of the matrix $B_\mu(X^k)$. The operator $g_{\lambda\mu,a}(\cdot)$ is defined as in (2.9),
$$g_{\lambda\mu,a}(w) = \begin{cases} 0, & \text{if } |w| < t^*,\\ h_{\lambda\mu}(w), & \text{if } |w| \ge t^*.\end{cases} \tag{3.2}$$
Recall that the thresholding parameter $t^*$ is
$$t^* = \begin{cases} t^*_2 = \lambda\mu\,\frac{a+1}{a}, & \text{if } \lambda \le \frac{a^2}{2(a+1)\mu};\\[2pt] t^*_3 = \sqrt{2\lambda\mu(a+1)} - \frac{a}{2}, & \text{if } \lambda > \frac{a^2}{2(a+1)\mu}.\end{cases} \tag{3.3}$$
With an initial matrix $X^0$, we obtain an iterative algorithm, called the TS iterative thresholding (IT) algorithm; it is the basic TS iterative scheme. Next, two adaptive and more efficient IT algorithms (TS-s and TS-s2) are introduced.

3.1. Semi-Adaptive Thresholding Algorithm TS-s
We begin by formulating an optimality condition for the regularization parameter $\lambda$, which serves as the basis for the parameter selection and updating in this semi-adaptive algorithm. Suppose the optimal solution matrix $X^*$ has rank $r$, by prior knowledge or estimation. Here we still assume $m \le n$. For any $\mu$, denote $B_\mu(X) = X + \mu\,\mathcal{A}^*(b - \mathcal{A}(X))$ and let $\{\sigma_i\}_{i=1}^m$ be the $m$ non-negative singular values of $B_\mu(X)$. Suppose that $X^*$ is the optimal solution matrix of (2.3), and denote the singular values of the matrix $B_\mu(X^*)$ by $\sigma^*_1 \ge \sigma^*_2 \ge \dots \ge \sigma^*_m$. Then, by the fixed point equation, the following inequalities hold:
$$\sigma^*_i > t^* \quad \forall\, i \in \{1,2,\dots,r\}, \qquad \sigma^*_j \le t^* \quad \forall\, j \in \{r+1, r+2, \dots, m\}, \tag{3.4}$$
where $t^*$ is our threshold value. Recall that $t^*_3 \le t^* \le t^*_2$. So
$$\sigma^*_r \ge t^* \ge t^*_3 = \sqrt{2\lambda\mu(a+1)} - \frac{a}{2}, \qquad \sigma^*_{r+1} \le t^* \le t^*_2 = \lambda\mu\,\frac{a+1}{a}. \tag{3.5}$$
It follows that
$$\lambda_1 := \frac{a\,\sigma^*_{r+1}}{\mu(a+1)} \ \le\ \lambda\ \le\ \frac{(a + 2\sigma^*_r)^2}{8(a+1)\mu} =: \lambda_2, \qquad \text{i.e.} \quad \lambda \in [\lambda_1, \lambda_2].$$
The above estimate helps to set an optimal regularization parameter. One choice of $\lambda^*$ is:
(I) when $\lambda_1 \le \frac{a^2}{2(a+1)\mu}$, set $\lambda^* = \lambda_1$; then $\lambda^* \le \frac{a^2}{2(a+1)\mu}$, and thus the thresholding value is $t^* = t^*_2$;
(II) when $\lambda_1 > \frac{a^2}{2(a+1)\mu}$, set $\lambda^* = \lambda_2$; then $\lambda^* > \frac{a^2}{2(a+1)\mu}$, and thus we choose the thresholding value $t^* = t^*_3$.

In practice, we approximate $B_\mu(X^*)$ by $B_\mu(X^n)$ in the above formulas; that is, the $i$-th singular value $\sigma^*_i$ is approximated by $\sigma^n_i$ at the $n$-th iteration step. Thus we have $\lambda_{1,n} = \frac{a\,\sigma^n_{r+1}}{\mu(a+1)}$ and $\lambda_{2,n} = \frac{(a + 2\sigma^n_r)^2}{8(a+1)\mu}$. We choose the parameter $\lambda$ at the $n$-th step as
$$\lambda_n = \begin{cases} \lambda_{1,n}, & \text{if } \lambda_{1,n} \le \frac{a^2}{2(a+1)\mu},\\[2pt] \lambda_{2,n}, & \text{if } \lambda_{1,n} > \frac{a^2}{2(a+1)\mu}.\end{cases} \tag{3.6}$$
This way, we obtain an adaptive iterative algorithm without pre-setting the regularization parameter $\lambda$. The TL parameter $a$ is still free and needs to be selected beforehand; thus the algorithm is overall semi-adaptive, called TS-s for short and summarized in Algorithm 1.

Algorithm 1: TS-s thresholding algorithm
Initialize: given $X^0$ and parameters $\mu$ and $a$.
while NOT converged do
1. $Y^n = B_\mu(X^n) = X^n - \mu\,\mathcal{A}^*(\mathcal{A}(X^n) - b)$, and compute the SVD of $Y^n$ as $Y^n = U\,\mathrm{Diag}(\sigma)\,V^T$;
2. determine the value of $\lambda_n$ by (3.6), then obtain the thresholding value $t^*_n$ by (3.3);
3. $X^{n+1} = G_{\lambda_n\mu,a}(Y^n) = U\,\mathrm{Diag}(g_{\lambda_n\mu,a}(\sigma))\,V^T$;
then $n \leftarrow n+1$.
end while

3.2. Adaptive Thresholding Algorithm TS-s2
Differently from TS-s, where the parameter $a$ needs to be determined manually, here at each iterative step we choose $a = a_n$ such that the equality $\lambda_n = \frac{a_n^2}{2(a_n+1)\mu}$ holds. The threshold value $t^*$ is then given by a single formula with $t^* = t^*_3 = t^*_2$. Putting $\lambda$ at the critical value $\frac{a^2}{2(a+1)\mu}$, the parameter $a$ is expressed as
$$a = \lambda\mu + \sqrt{(\lambda\mu)^2 + 2\lambda\mu}. \tag{3.7}$$
The threshold value is
$$t^* = \lambda\mu\,\frac{a+1}{a} = \frac{\lambda\mu}{2} + \frac{\sqrt{(\lambda\mu)^2 + 2\lambda\mu}}{2}. \tag{3.8}$$
Let $X^*$ be the TS optimal solution and $\sigma^*$ the singular values of the matrix $B_\mu(X^*)$. Then we have the following inequalities:
$$\sigma^*_i > t^* \quad \forall\, i \in \{1,2,\dots,r\}, \qquad \sigma^*_j \le t^* \quad \forall\, j \in \{r+1, r+2, \dots, m\}. \tag{3.9}$$
So, for the parameter $\lambda$, we have
$$\frac{2(\sigma^*_{r+1})^2}{1 + 2\sigma^*_{r+1}} \ \le\ \lambda\ \le\ \frac{2(\sigma^*_r)^2}{1 + 2\sigma^*_r}.$$
Once the value of $\lambda$ is determined, the parameter $a$ is given by (3.7). In the iterative method, we approximate the optimal solution $X^*$ by $X^n$, and further use the singular values $\{\sigma^n_i\}_i$ of $B_\mu(X^n)$ in place of those of $B_\mu(X^*)$. The resulting parameter selection is
$$\lambda_n = \frac{2(\sigma^n_{r+1})^2}{1 + 2\sigma^n_{r+1}}, \qquad a_n = \lambda_n\mu + \sqrt{(\lambda_n\mu)^2 + 2\lambda_n\mu}. \tag{3.10}$$
In this algorithm (TS-s2 for short), only the parameter $\mu$ is fixed, satisfying the inequality $\mu \in (0, \|\mathcal{A}\|^{-2})$. The scheme is summarized in Algorithm 2, and a short code sketch of both parameter rules follows after it.

Algorithm 2: TS-s2 thresholding algorithm
Initialize: given $X^0$ and the parameter $\mu$.
while NOT converged do
1. $Y^n = X^n - \mu\,\mathcal{A}^*(\mathcal{A}(X^n) - b)$, and compute the SVD of $Y^n$ as $Y^n = U\,\mathrm{Diag}(\sigma)\,V^T$;
2. determine the values of $\lambda_n$ and $a_n$ by (3.10), then update the threshold value $t^*_n = \lambda_n\mu\,\frac{a_n+1}{a_n}$;
3. $X^{n+1} = G_{\lambda_n\mu,a_n}(Y^n) = U\,\mathrm{Diag}(g_{\lambda_n\mu,a_n}(\sigma))\,V^T$;
then $n \leftarrow n+1$.
end while
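The parameter rules (3.6) and (3.7)–(3.10) amount to a few lines of bookkeeping per iteration. The following sketch is our paraphrase of those updates for the matrix completion operator $\mathcal{A}(X) = X_\Omega$ (it assumes the target rank $r$ is given and reuses `tl1_threshold` from the earlier sketch); it is meant to show the selection logic, not to reproduce the authors' MATLAB code.

```python
import numpy as np

def ts_s_lambda(sigma, r, mu, a):
    """Semi-adaptive TS-s rule (3.6): lambda_n from the singular values of B_mu(X^n)."""
    lam1 = a * sigma[r] / (mu * (a + 1.0))                            # from sigma_{r+1} <= t2*
    lam2 = (a + 2.0 * sigma[r - 1]) ** 2 / (8.0 * (a + 1.0) * mu)     # from sigma_r >= t3*
    return lam1 if lam1 <= a * a / (2.0 * (a + 1.0) * mu) else lam2

def ts_s2_params(sigma, r, mu):
    """Adaptive TS-s2 rule (3.10): lambda_n from sigma_{r+1}, then a_n at the critical value (3.7)."""
    s = max(sigma[r], 1e-12)                                          # sigma_{r+1}, 0-based index
    lam = 2.0 * s * s / (1.0 + 2.0 * s)
    a = lam * mu + np.sqrt((lam * mu) ** 2 + 2.0 * lam * mu)
    return lam, a

def ts_s2_step(X, mask, b, r, mu=0.99):
    """One TS-s2 iteration for the completion operator A(X) = X_Omega."""
    Y = X.copy()
    Y[mask] += mu * (b - X[mask])                                     # Y = B_mu(X)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    lam, a = ts_s2_params(s, r, mu)
    return U @ np.diag(tl1_threshold(s, lam * mu, a)) @ Vt
```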

4. Numerical experiments
In this section, we present numerical experiments to illustrate the effectiveness of our algorithms, the semi-adaptive TS-s and the adaptive TS-s2, compared with several state-of-the-art solvers on matrix completion problems. The comparison solvers are: LMaFit [28], FPCA [23], sirls-q [24], IRucLq-M [2], and LRGeomCG [34].

LMaFit solves a low-rank factorization model instead of computing SVDs, which usually take a big chunk of the computation time; part of its code is written in C, as is LRGeomCG's. So, once this method converges, it is the fastest method among all the comparisons. All other codes are implemented in the Matlab environment and involve SVDs approximated by fast Monte Carlo algorithms [, ]. FPCA is a nuclear norm minimization code, while sirls-q and IRucLq-M are iterative reweighted least squares algorithms for Schatten-q quasi-norm optimization. The LRGeomCG algorithm approaches matrix completion through Riemannian optimization: it minimizes the least-squares distance on the sampling set over the Riemannian manifold of fixed-rank matrices. When the rank information is known or well approximated, this method is efficient and accurate, as shown in the experiments below, especially for standard Gaussian matrices; a drawback of LRGeomCG, however, is that the rank of the manifold is fixed, so it is hard for it to handle unknown rank cases.

In our TS algorithms, the MC SVD algorithm [] is implemented at each iteration step, as in FPCA. We also tried other fast SVD approximation algorithms, but MC SVD is the most suitable one, satisfying both the speed and accuracy requirements within one iterative algorithm. (The TS matlab codes are publicly available for download.)

All our tests were performed on a Lenovo desktop with 16 GB of RAM and a quad-core Intel i7-4770 processor with CPU at 3.4 GHz, under a 64-bit Ubuntu system. We tested and compared these solvers on low rank matrix completion problems under various conditions, including multivariate Gaussian, uniform and $\chi^2$ distributions. We also tested the algorithms on grayscale image recovery from partial observations (image inpainting).

4.1. Implementation details
In the following series of tests, we generated random matrices $M = M_L M_R^T \in \mathbb{R}^{m\times n}$, where the matrices $M_L$ and $M_R$ are in $\mathbb{R}^{m\times r}$ and $\mathbb{R}^{n\times r}$ respectively. By setting the parameter $r$ to be small, we obtain a low rank matrix $M$ with rank at most $r$. After this step, we uniformly random-sampled a subset $\Omega$ with $p$ entries from $M$. The following quantities help to quantify the difficulty of a recovery problem (a short code sketch of them is given below):
SR (sampling ratio): $SR = p/(mn)$.
FR (freedom ratio): $FR = r(m+n-r)/p$, the number of degrees of freedom of a rank-$r$ matrix divided by the number of measurements. According to [23], if $FR > 1$, there are infinitely many matrices of rank $r$ with the given entries.
$r_m$ (maximum rank with which the matrix can be recovered): $r_m = \left\lfloor \frac{m+n-\sqrt{(m+n)^2 - 4p}}{2}\right\rfloor$, defined as the largest rank such that $FR \le 1$.

The TS thresholding algorithms do not guarantee a global minimum in general, similar to non-convex schemes in one-dimensional compressed sensing problems. Indeed, we observe that TS thresholding with random starts may get stuck at local minima, especially when the freedom ratio FR is high, i.e. when the matrix completion problem is difficult. A good initial matrix $X^0$ is important for thresholding algorithms. In our numerical experiments, instead of choosing $X^0 = 0$ or a random matrix, we set $X^0$ equal to the matrix whose entries agree with $M$ on $\Omega$ and are zero elsewhere. The stopping criterion is
$$\frac{\|X^{n+1} - X^n\|_F}{\max\{\|X^n\|_F, 1\}} \le \mathrm{tol},$$
where $X^{n+1}$ and $X^n$ are the numerical results from two consecutive iterative steps, and tol is a moderately small number. In all the following experiments, we fix $\mathrm{tol} = 10^{-6}$, with a cap on the maximum number of iteration steps. We also use the relative error
$$\mathrm{rel.err} = \frac{\|X_{opt} - M\|_F}{\|M\|_F} \tag{4.1}$$
to estimate the closeness of $X_{opt}$ to $M$, where $X_{opt}$ is the optimal solution produced by the numerical algorithms.
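For reference, the difficulty measures and the error metric just defined are computed as follows (a small helper of ours; the 5e-3 success threshold quoted later in the parameter tests is passed in only as an example default).

```python
import numpy as np

def sampling_ratio(p, m, n):
    """SR = p / (m n)."""
    return p / (m * n)

def freedom_ratio(r, m, n, p):
    """FR = r (m + n - r) / p: degrees of freedom of a rank-r matrix over the number of samples."""
    return r * (m + n - r) / p

def max_recoverable_rank(m, n, p):
    """r_m: the largest rank with FR <= 1 (floor of the smaller root of r (m + n - r) = p)."""
    return int(np.floor((m + n - np.sqrt((m + n) ** 2 - 4.0 * p)) / 2.0))

def rel_err(X_opt, M, success_tol=5e-3):
    """Relative error (4.1) and the success flag rel.err < success_tol."""
    err = np.linalg.norm(X_opt - M, 'fro') / np.linalg.norm(M, 'fro')
    return err, err < success_tol
```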

Rank estimation. For thresholding algorithms, the rank $r$ is the most important parameter, especially for our TS methods, where the thresholding value $t^*$ is determined based on $r$. If the true rank $r$ is unknown, we adopt the rank decreasing estimation method (also called the maximum eigengap method) as in [2, 28], thereby extending both the TS-s and TS-s2 schemes to work with an overestimated initial rank parameter $K$. In the following tests, unless otherwise specified, we set $K = 1.5\,r$.

The idea behind this estimation method is as follows. Suppose that at step $n$ our current matrix is $X$. The eigenvalues of $X^T X$ are arranged in descending order, and $\lambda_{r_{min}} \ge \lambda_{r_{min}+1} \ge \dots \ge \lambda_{K+1} > 0$ are the $r_{min}$-th through $(K+1)$-th eigenvalues of $X^T X$, where $r_{min}$ is a manually specified minimum rank estimate. We then compute the quotient sequence $\bar\lambda_i = \lambda_i/\lambda_{i+1}$, $i = r_{min},\dots,K$, and let
$$\hat K = \arg\max_{r_{min}\le i\le K} \bar\lambda_i,$$
the index corresponding to the maximal element of $\{\bar\lambda_i\}$. If the eigenvalue gap indicator $\tau = \bar\lambda_{\hat K}(K - r_{min} + 1)/\sum_{i\ne \hat K}\bar\lambda_i$ exceeds a preset threshold, we adjust our rank estimate from $K$ to $\hat K$. During the numerical simulations, we perform this adjustment only once for each problem. In most cases this adjustment is quite satisfactory, and the adjusted estimate is very close to the true rank $r$. (A small code sketch of this estimator is given below.)

Choice of a: optimal parameter testing for TS-s. A major difference between TS-s and TS-s2 is the choice of the parameter $a$, which influences the behaviour of the penalty function $\rho_a(\cdot)$ of TS. When $a$ tends to zero, the function $T(X)$ approaches the rank. We tested TS-s on small-size low rank matrix completion problems with several values of $a$, spanning from well below 1 to well above 1, for both the known-rank scheme and the scheme with rank estimation. In these tests, $M = M_L M_R^T$ is a random matrix, where $M_L$ and $M_R$ are generated with i.i.d. standard normal entries, and the rank $r$ of $M$ varies up to 22. For each value of $a$, we conducted repeated independent tests with different $M$ and sample index sets $\Omega$, and we declared $M$ to be recovered successfully if the relative error (4.1) was less than $5\times 10^{-3}$. The test results for the known-rank scheme and the rank-estimation scheme are both shown in Figure 4.1. The success rate curves of the rank-estimation scheme are not as clustered as those of the known-rank scheme; in order to clearly identify the optimal parameter $a$, we omitted the curve of the smallest $a$ in the right figure, as it always lies below all the others. The vertical red dotted line there indicates the position where $FR = 0.6$.

It is interesting to see that for the known-rank scheme, the parameter $a = 1$ is the optimal strategy, which coincides with the optimal parameter setting in [3]: when using a thresholding algorithm under the transformed L1 (TL) or transformed Schatten-1 (TS) quasi-norm, it is usually optimal to set $a = 1$ when the sparsity or rank information is given. For the scheme with rank estimation, the picture is more complicated. Based on our tests, the best choice of $a$ depends on whether FR is below or above 0.6, and for all the following tests TS-s with rank estimation uses the value of $a$ selected by this FR = 0.6 rule. In applications where FR is not available, we suggest the choice made for $FR \ge 0.6$, since its performance is also acceptable when $FR < 0.6$.
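A compact version of the maximum eigengap estimator just described is sketched below (our reading of the procedure of [2, 28]; the acceptance threshold for the indicator τ is not legible in this copy, so it is exposed as a parameter with an assumed default).

```python
import numpy as np

def estimate_rank(X, K, r_min=1, tau_threshold=10.0):
    """Rank-decreasing (maximum eigengap) estimate from an overestimate K.

    Sorts the eigenvalues of X^T X in descending order, forms the quotients
    lambda_i / lambda_{i+1} for i = r_min..K, and accepts the position of the
    largest quotient as the new rank estimate when the gap indicator tau
    exceeds tau_threshold (the default threshold value is our assumption)."""
    ev = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
    lam = ev[r_min - 1:K + 1]                                 # lambda_{r_min}, ..., lambda_{K+1}
    quot = lam[:-1] / np.maximum(lam[1:], np.finfo(float).tiny)
    k_hat = int(np.argmax(quot))                              # offset of the largest eigengap
    tau = quot[k_hat] * (K - r_min + 1) / max(quot.sum() - quot[k_hat], np.finfo(float).tiny)
    return (r_min + k_hat, tau) if tau > tau_threshold else (K, tau)
```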

Fig. 4.1: Optimal parameter test for the semi-adaptive method TS-s: frequency of success versus rank for several values of $a$, with the rank known a priori (left) and with rank estimation (right); in the right panel FR changes from 0.23 to 0.78 along with the rank.

4.2. Completion of Random Matrices
The ground truth matrix $M$ is generated as the matrix product of two low rank matrices $M_L$ and $M_R$, of dimensions $m\times r$ and $n\times r$ respectively, with $r \ll \min(m,n)$. In the following experiments, except where clearly stated otherwise, $M_L$ and $M_R$ are generated from the multivariate normal distribution $\mathcal{N}(\mu, \Sigma)$, with $\mu = 0$ and $\Sigma = \{(1-\mathrm{cov})\,I_{(i=j)} + \mathrm{cov}\}_{r\times r}$ determined by the parameter cov. Thus the matrix $M = M_L M_R^T$ has rank at most $r$. It is known that successful recovery is related to FR: the higher FR is, the harder it is to recover the original low rank matrix.

In the first batch of tests, we varied the rank $r$ and fixed all other parameters, i.e. the matrix size $(m,n)$ and the sampling rate (SR); thus FR changed along with the rank. It is observed that the performance of TS-s and TS-s2 is very different, due to adopting single or double thresholds. TS-s2 uses only one (smooth) thresholding scheme with changing parameter $a$; it converges faster than TS-s when the rank is known (see the known-rank results below). On the other hand, TS-s utilizes two (smooth and discontinuous) thresholding schemes and is more robust in the case of an overestimated rank; TS-s outperforms TS-s2 when rank estimation is used in lieu of the true rank value (see the rank-estimation results below). The IRucL-q method is found to be very robust under varied covariance and rank estimation, yet it underperforms the TS methods at high FR, even with more computing time. Although the TS methods rely on the same rank estimation method as IRucL-q, IRucL-q achieves the best results in the absence of the true rank value. A possible reason is that in the IRucL-q iterations the singular values of the matrix $X$ are computed more accurately, whereas in TS the singular values are computed by a fast Monte Carlo method at every iteration; due to the random sampling of the Monte Carlo method, there are more errors, especially at the beginning stage of the iteration, and the resulting matrices $X^n$ may cause a less accurate rank estimation.

Matrix completion with known rank. In this subsection, we implemented all six algorithms under the condition that the true rank value is given. They are TS-s, TS-s2, sirls-q, IRucL-q, LMaFit and

15 S. Zhang, P. Yin, J. Xin 5 LRGeomCG. We skipped FPCA since rank is always adaptively estimated there. Gaussian matrices with different ranks. In these tests, matrix M = M L M T R was generated under uncorrelated normal distribution with µ =. We conducted tests both on low dimensional matrices with m = n = (Table 4.) and high dimensional matrices with m = n = (Table 4.2). Tests on non-square matrices with m n show similar results. In Table 4., rank r varies from 5 to 8, while FR increases from.2437 up to.89. For lower rank (less than 5), LMaFit is the best algorithm with low relative errors and fast convergence speed. Part of the reason is that this method does not involve SVD (singular value decomposition) operations during iteration. LRGeomCG approaches the performance of LMaFit when r. However, as FR values are above.7, it became hard for LMaFit to find truth low rank matrix M. Its performance is not as good as stated in paper [34] with possible reason that we generate M with mean µ equal to, instead of in [34]. We also tested LRGeomCG with µ = where it has very small relative error and also fast convergence rate. It is also noticed that in Table 4., the two TS algorithms performed very well and remained stable for different FR values. At similar order of accuracy, the TLs are faster than IRucL-q. For large size matrices (m = n = ), rank r is varied from 5 to, see table 4.2. The sirls-q and LMaFit only worked for lower FR. IRucL-q can still produce satisfactory results with relative error around 3, but its iterations took longer time. In [2], it was carried out by high speed-performance CPU with many cores. Here we used an ordinary processor with only 4 cores and 8 threads. It is believed that with a better machine, IRucL-q will be much faster, since parallel computing is embedded in its codes. As seen in the table, LRGeomCG is always convergent and achieves almost same accuracy with TS-s and TS-s2. However, its computation time grows fast with increasing rank. A little difference between the two TS algorithms begins to emerge when the matrix size is large. Although when rank is given, they all performed better than other schemes, adaptive TS-s2 is a little faster than semi-adaptive TS-s. It is believed by choosing optimal parameter a, TS-s will be improved. The parameter a is related to matrix M, i.e. how it is generated, its inner structure, and dimension. In TS-s2, the value of parameter a does not need to be manually determined. Gaussian Matrices with Different Covariance. In this subsection, the rank r, the sampling rate, and the freedom ratio FR are fixed. We varied parameter cov to generate covariance matrices of multivariate normal distribution. In Table 4.3, we chose two rank values, r = 5 and r = 8. It is harder to recover the original matrix M when it is more coherent. IRucL-q does better in this regime. Its mean computing time and relative errors are less influenced by the changing cov. Results on large size matrices are shown in Table 4.4. TS-s2 scheme is much better than TS-s, both in relative error and computing time. In small size matrix experiments, TS-s2 is the best among comparisons. In Table 4.4, we fixed rank = 3 with cov among {.,...,.7}. TS-s2 is still satisfactory both in accuracy and speed for low covariance (i.e cov.6). However, for cov.7, relative errors increased from 6 to around 4. It is also observed that IRucL-q algorithm is very stable and robust under covariance change. Matrices from other distributions. 
We also compare algorithms with other distributions, including (,) uniform distribution and Chi-square distribution with k

16 6 TS for Low Rank Matrix Completion Table 4.: Comparison of TS-s, TS-s2, sirls-q, IRucL-q, LMaFit and LRGeomCG on recovery of uncorrelated multivariate Gaussian matrices at known rank, m = n =, SR =.4, with stopping criterion tol = 6. Problem TS-s TS-s2 sirls-q* rank FR rel.err time rel.err time rel.err time e e e e e e e e e e e-7.2.3e e e e e-5.33.e e e e e e e e e e e e-4. Problem IRucL-q LMaFit LRGeomCG rank FR rel.err time rel.err time rel.err time e e-6.2.3e e e e e e e e e e e e e e e e e e e e e e e e-.34.9e e e e e e e-.94 * Notes:. The sirls-q iterations did not converge when rank > 4 and FR.65. Comparison is skipped over this range. Results for rank (,2,3) are skipped as they differ little from rank 4. Similar rank samplings occur in Tables 4.(2-3), 4.(5-6). 2. Matrix M is generated from multivariate normal distribution with mean µ =, instead of. = (degree of freedom). All other parameters are same as Table 4.. The results are displayed at Table 4.5 (uniform distribution) and Table 4.6 (Chi-square distribution). Only partial numerical results are showed here with rank r = 7,8,9,,4,5. From these two tables, two TS algorithms have satisfying relative errors and stable performance, same as IRuccL-q. For these two non-gaussian distributions, it becomes harder to successfully recover low rank matrix for LMaFit and LRGeomCG, especially when rank r > Matrix completion with rank estimation We conducted numerical experiments on rank estimation schemes. The initial rank estimation is given as.5r, which is a commonly used overestimate. FPCA [23] is included for comparison, while LRGeomCG and sirls-q are excluded. FPCA is a fast and robust iterative algorithm based on nuclear norm regularization. We considered two classes of matrices: uncorrelated Gaussian matrices with changing rank; correlated Gaussian matrices with fixed rank (r = 5,). The results are

17 S. Zhang, P. Yin, J. Xin 7 Table 4.2: Numerical experiments on recovery of uncorrelated multivariate Gaussian matrices at known rank, m = n =, SR =.3. Problem TS-s TS-s2 sirls-q rank FR rel.err time rel.err time rel.err time e e e e e e e e e e e e Problem IRucL-q LMaFit LRGeomCG rank FR rel.err time rel.err time rel.err time e e e e e e e e e e e e Table 4.3: Numerical experiments on multivariate Gaussian matrices with varying covariance at known rank, m = n =, SR =.4. Problem TS-s TS-s2 sirls-q rank cor rel.err time rel.err time rel.err time e e e e e e e e e e e e e e e e e e e e e- 6.8 Problem IRucL-q LMaFit LRGeomCG rank cor rel.err time rel.err time rel.err time e e-2.7.2e e e e e-5.7.e e e e e e e-.25.7e e e e e e e-.29 shown in Table 4.7 and Table 4.8. It is interesting that under rank estimation, the semi-adaptive TS-s fared much better than TS-s2. In low rank and low covariance cases, TS-s is the best in terms of accuracy and computing time among comparisons. However, in the regime of high covariance and rank, it became harder for TS methods to perform efficient recovery. IRucL-q did the best, being both stable and robust. In the most difficult case, at rank = 5 and FR approximately equal to.7, IRucL-q can still obtain an accurate result with relative error around Image inpainting

18 8 TS for Low Rank Matrix Completion Table 4.4: Numerical experiments on multivariate Gaussian matrices with varying covariance at known rank, m = n =, SR =.4. Problem TS-s TS-s2 sirls-q rank cor rel.err time rel.err time rel.err time e e e e e e e e e e e e e e e e e e e e e- 5.3 Problem IRucL-q LMaFit LRGeomCG rank cor rel.err time rel.err time rel.err time e e e e e e e e e e e e e e e e e e e e e Table 4.5: Comparison with random matrices generated from (,) uniform distribution. Rank r is given and m = n =, SR =.4, with stopping criterion tol = 6. Problem TS-s TS-s2 sirls-q* rank FR rel.err time rel.err time rel.err time e e e e e e e e e e e e e-5.8.2e-5.58 Problem IRucL-q LMaFit LRGeomCG rank FR rel.err time rel.err time rel.err time e e e e e e e e e e e e e e-.8.24e e e-.6.7e-.76 As in [2, 28], we conducted grayscale image inpainting experiments to recover low rank images from partial observations, and compare with IRcuL-q and LMaFit algorithms. The boat image (see Figure 4.2) is used to produce ground truth as in [2] with rank equal to 4 and at resolution. Different levels of noisy disturbances

19 S. Zhang, P. Yin, J. Xin 9 Table 4.6: Comparison with random matrices generated from Chi-square distribution with k = (degree of freedom). Rank r is given and m = n =, SR =.4, with stopping criterion tol = 6. Problem TS-s TS-s2 sirls-q* rank FR rel.err time rel.err time rel.err time e e e e e e e e e e e e e e-5.73 Problem IRucL-q LMaFit LRGeomCG rank FR rel.err time rel.err time rel.err time e e-6.4.8e e e e e e e e e e e e-.5.46e e e e-.57 Table 4.7: Numerical experiments for low rank matrix completion algorithms under rank estimation. True matrices are uncorrelated multivariate Gaussian, m = n =, SR =.4. Problem TS-s TS-s2 FPCA IRucL-q LMaFit rank FR rel.err time rel.err time rel.err time rel.err time rel.err time e e e-.9.84e e e e e e e e e e e e e e e e e e e e e e e e e e e-.2 Table 4.8: Numerical experiments on low rank matrix completion algorithms under rank estimation. True matrices are multivariate Gaussian with different covariance, m = n =, and SR =.4. Problem TS-s TS-s2 FPCA IRucL-q LMaFit rank cor rel.err time rel.err time rel.err time rel.err time rel.err time e e e e e e e e e e e e e e e-2..5.e e-.4.2e e e e e-.4.2e e e e e e e e-2.

Fig. 4.2: Image inpainting experiments with SR = 0.3: original image, noisy sampled image, and recoveries by TS-s2, IRucL-q, LMaFit-inc and LMaFit-fix.

As in [2, 28], we conducted grayscale image inpainting experiments to recover low rank images from partial observations, and compared with the IRucL-q and LMaFit algorithms. The boat image (see Figure 4.2) is used to produce the ground truth as in [2], with rank equal to 40 and a fixed resolution. Different levels of noisy disturbance are added to the original image $M_o$ by the formula
$$M = M_o + \sigma\,\frac{\|M_o\|_F}{\|\varepsilon\|_F}\,\varepsilon,$$
where the matrix $\varepsilon$ has i.i.d. standard Gaussian entries. Here we only applied the scheme TS-s2. For IRucL-q, we followed the setting in [2] by choosing $\alpha = 0.9$ and $\lambda = 10^{-2}\sigma$. Both the fixed rank (LMaFit-fix) and the increasing rank (LMaFit-inc) schemes are implemented for LMaFit. We took the fixed rank $r = 40$ for TS-s2, LMaFit-fix and IRucL-q. Computational results are in Table 4.9, with sampling ratios varying among $\{0.3, 0.4, 0.5\}$ and noise strengths $\sigma$ in $\{0.01, 0.05, 0.1, 0.15, 0.2, 0.25\}$. The performance of each algorithm is measured in CPU time, PSNR (peak signal-to-noise ratio), and MSE (mean squared error). Here we focus more on the PSNR values and mark the top two in bold for each experiment. We observed that IRucL-q and TS-s2 fared about the same, and either one is better than LMaFit in most cases.

5. Conclusion
We presented the transformed Schatten-1 penalty (TS) and derived the closed form thresholding representation formula for global minimizers of the TS regularized rank minimization problem. We studied two adaptive iterative TS schemes (TS-s and TS-s2) computationally for matrix completion, in comparison with several state-of-the-art methods, in particular IRucL-q. In the case of low rank matrix recovery under known rank, TS-s2 performs the best in accuracy and computational speed. In low rank matrix recovery under rank estimation, TS-s is almost on par with IRucL-q, except when both the matrix covariance and the rank rise to a certain level. In

21 S. Zhang, P. Yin, J. Xin 2 Table 4.9: Numerical experiments on boat image inpainting with algorithms TS, IRcuL-q and LMaFit under different sampling ratio and noise levels. Problem TS-s2 IRucL-q LMaFit-inc LMaFit-fix SR σ Time PSNR MSE Time PSNR MSE Time PSNR MSE Time PSNR MSE e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e-2 future work, we shall study rank estimation techniques to further improve on TS-s and explore other applications for TS penalty. Acknowledgments. The authors would like to thank Prof. Wotao Yin for his helpful suggestions on low rank matrix completion methods and numerical experiments. We also thank the anonymous referees for their constructive comments Appendix A. Proof of Ky Fan k-norm inequality. Proof. Since X = UDiag(σ)V T, the (j,k)-th entry of matrix X is X j,k = m σ i U j,i V k,i. i= Thus, we have tr k (X) = k j= = m i=j= X j,j = k j=i= m σ i U j,i V j,i k σ i U j,i V j,i = m i= σ i w (k) i, (.) where the weight w (k) i for the singular value σ i is defined as: w (k) i = k U j,i V j,i, i =,2,...,m. (.2) j=

22 22 TS for Low Rank Matrix Completion Notice that, w (k) i k U j,i V j,i U(:,i) 2 V (:,i) 2, (.3) j= where U(:,i) and V (:,i) are the i-th column vectors for U and V. {w (k) i }, m i= w (k) i m i=j= k j= k U j,i V j,i = k m U j,i V j,i j=i= U(j,:) 2 V (j,:) 2 k, Also for weights (.4) where U(j,:) and V (j,:) are the j-th row vectors for U and V, respectively. All the m weights are bounded by, with absolute sum at most k m. Note that σ i s are in decreasing order. By equation (.), we have, for all k =,2,...,m, tr k (X) m i= σ i w (k) i k σ i = tr k (D) = X F k. i= Next, we prove the second part of the lemma equality condition, by mathematical induction. Suppose that for a given matrix X, tr k (X) = tr k (D), k =,...,m. Here, it is convenient to define X i = σ i U i Vi T, where V i (U i ) is the i-th column vector of V (U). Then matrix X can be decomposed as the sum of r rank- matrices, X = r X i. When k =, according to tr (X) = tr (D) and the proof above, we know that w () = and w () = for i = 2,...,m. i By the definition of weights w (k) i in (.2), we have w () = U, V, =. Since U and V are both unitary matrices, we have: U, = V, = ±; U,j = U j, = V,j = V j, = for j. Then vectors U (V ) is the first standard basis vector in space R m (R n ). The matrix X = σ U V T is diagonal σ X =... m n i= For any index i, i k, suppose that U i,i = V i,i = ±; U i,j = U j,i = V i,j = V j,i = for any index j i. (.5)

23 S. Zhang, P. Yin, J. Xin 23 Then matrix X i = σ i U i Vi T, with i k, is diagonal and can be expressed as... X i = σ i (i-th row)... Under those conditions, let us consider the case with index i = k. Clearly, we have tr k (X) = tr k (D). Similarly as before, thanks to the formula (.) and inequalities (.3) and (.4), it is true that m n w (k) i = for i =,...,k; and w (k) i = for i > k. Furthermore, by definition (.2), w (k) k = k U j,k V j,k = U k,k V k,k =. This is because U j,k = V j,k = for index j < k, by the assumption (.5). Thus vectors U k and V k are also standard basis vectors with the k-th entry to be ±. Then... X k = σ k U k Vk T = σ k (k-th row) Finally, we prove that all matrices {X i } i=,,r are diagonal. So the original matrix X = r X i is equal to the diagonal matrix D. The other direction is obvious. We finish i= the proof. j=... m n REFERENCES [] T. Blumensath, Accelerated iterative hard thresholding, Signal Processing, 92(3), pp , 22. [2] J. Cai, E. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 2(4): , 2. [3] E. Candès and T. Tao. The power of convex relaxation: Near-optimal matrix completion. Information Theory, IEEE Transactions on, 56(5):253 28, 2. [4] E. Candès, and B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math., 9 (29), pp [5] W. Cao, J. Sun, and Z. Xu, Fast image deconvolution using closed-form thresholding formulas of L q,(q = /2,2/3) regularization, Journal of Visual Communication and Image Representation, 24(), pp. 3 4, 23. [6] Y. Chen, A. Jalali, S. Sanghavi, and C. Caramanis. Low-rank matrix recovery from errors and erasures. Information Theory, IEEE Transactions on, 59(7): , 23. [7] I. Daubechies, R. DeVore, M. Fornasier, C. Gunturk, Iteratively reweighted least squares minimization for sparse recovery, Comm. Pure Applied Math, 63(), pp. 38, 2. [8] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on pure and applied mathematics, 57():43-457, 24.


More information

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Zhaosong Lu October 5, 2012 (Revised: June 3, 2013; September 17, 2013) Abstract In this paper we study

More information

Sparse PCA with applications in finance

Sparse PCA with applications in finance Sparse PCA with applications in finance A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon 1 Introduction

More information

SPARSE signal representations have gained popularity in recent

SPARSE signal representations have gained popularity in recent 6958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 10, OCTOBER 2011 Blind Compressed Sensing Sivan Gleichman and Yonina C. Eldar, Senior Member, IEEE Abstract The fundamental principle underlying

More information

Stability and Robustness of Weak Orthogonal Matching Pursuits

Stability and Robustness of Weak Orthogonal Matching Pursuits Stability and Robustness of Weak Orthogonal Matching Pursuits Simon Foucart, Drexel University Abstract A recent result establishing, under restricted isometry conditions, the success of sparse recovery

More information

Subspace Projection Matrix Completion on Grassmann Manifold

Subspace Projection Matrix Completion on Grassmann Manifold Subspace Projection Matrix Completion on Grassmann Manifold Xinyue Shen and Yuantao Gu Dept. EE, Tsinghua University, Beijing, China http://gu.ee.tsinghua.edu.cn/ ICASSP 2015, Brisbane Contents 1 Background

More information

Augmented Lagrangian alternating direction method for matrix separation based on low-rank factorization

Augmented Lagrangian alternating direction method for matrix separation based on low-rank factorization Augmented Lagrangian alternating direction method for matrix separation based on low-rank factorization Yuan Shen Zaiwen Wen Yin Zhang January 11, 2011 Abstract The matrix separation problem aims to separate

More information

Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization

Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization Shuyang Ling Department of Mathematics, UC Davis Oct.18th, 2016 Shuyang Ling (UC Davis) 16w5136, Oaxaca, Mexico Oct.18th, 2016

More information

Sparse Approximation via Penalty Decomposition Methods

Sparse Approximation via Penalty Decomposition Methods Sparse Approximation via Penalty Decomposition Methods Zhaosong Lu Yong Zhang February 19, 2012 Abstract In this paper we consider sparse approximation problems, that is, general l 0 minimization problems

More information

Robust Principal Component Pursuit via Alternating Minimization Scheme on Matrix Manifolds

Robust Principal Component Pursuit via Alternating Minimization Scheme on Matrix Manifolds Robust Principal Component Pursuit via Alternating Minimization Scheme on Matrix Manifolds Tao Wu Institute for Mathematics and Scientific Computing Karl-Franzens-University of Graz joint work with Prof.

More information

High Dimensional Covariance and Precision Matrix Estimation

High Dimensional Covariance and Precision Matrix Estimation High Dimensional Covariance and Precision Matrix Estimation Wei Wang Washington University in St. Louis Thursday 23 rd February, 2017 Wei Wang (Washington University in St. Louis) High Dimensional Covariance

More information

ACCELERATED LINEARIZED BREGMAN METHOD. June 21, Introduction. In this paper, we are interested in the following optimization problem.

ACCELERATED LINEARIZED BREGMAN METHOD. June 21, Introduction. In this paper, we are interested in the following optimization problem. ACCELERATED LINEARIZED BREGMAN METHOD BO HUANG, SHIQIAN MA, AND DONALD GOLDFARB June 21, 2011 Abstract. In this paper, we propose and analyze an accelerated linearized Bregman (A) method for solving the

More information

Bindel, Fall 2009 Matrix Computations (CS 6210) Week 8: Friday, Oct 17

Bindel, Fall 2009 Matrix Computations (CS 6210) Week 8: Friday, Oct 17 Logistics Week 8: Friday, Oct 17 1. HW 3 errata: in Problem 1, I meant to say p i < i, not that p i is strictly ascending my apologies. You would want p i > i if you were simply forming the matrices and

More information

Conditions for Robust Principal Component Analysis

Conditions for Robust Principal Component Analysis Rose-Hulman Undergraduate Mathematics Journal Volume 12 Issue 2 Article 9 Conditions for Robust Principal Component Analysis Michael Hornstein Stanford University, mdhornstein@gmail.com Follow this and

More information

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725 Proximal Gradient Descent and Acceleration Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: subgradient method Consider the problem min f(x) with f convex, and dom(f) = R n. Subgradient method:

More information

Conditional Gradient (Frank-Wolfe) Method

Conditional Gradient (Frank-Wolfe) Method Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties

More information

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = 30 MATHEMATICS REVIEW G A.1.1 Matrices and Vectors Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = a 11 a 12... a 1N a 21 a 22... a 2N...... a M1 a M2... a MN A matrix can

More information

Sparse Solutions of an Undetermined Linear System

Sparse Solutions of an Undetermined Linear System 1 Sparse Solutions of an Undetermined Linear System Maddullah Almerdasy New York University Tandon School of Engineering arxiv:1702.07096v1 [math.oc] 23 Feb 2017 Abstract This work proposes a research

More information

arxiv: v2 [cs.na] 24 Mar 2015

arxiv: v2 [cs.na] 24 Mar 2015 Volume X, No. 0X, 200X, X XX doi:1934/xx.xx.xx.xx PARALLEL MATRIX FACTORIZATION FOR LOW-RANK TENSOR COMPLETION arxiv:1312.1254v2 [cs.na] 24 Mar 2015 Yangyang Xu Department of Computational and Applied

More information

Lecture 9: September 28

Lecture 9: September 28 0-725/36-725: Convex Optimization Fall 206 Lecturer: Ryan Tibshirani Lecture 9: September 28 Scribes: Yiming Wu, Ye Yuan, Zhihao Li Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These

More information

Sparse Solutions of Linear Systems of Equations and Sparse Modeling of Signals and Images: Final Presentation

Sparse Solutions of Linear Systems of Equations and Sparse Modeling of Signals and Images: Final Presentation Sparse Solutions of Linear Systems of Equations and Sparse Modeling of Signals and Images: Final Presentation Alfredo Nava-Tudela John J. Benedetto, advisor 5/10/11 AMSC 663/664 1 Problem Let A be an n

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Low-rank matrix recovery via nonconvex optimization Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

Constructing Explicit RIP Matrices and the Square-Root Bottleneck

Constructing Explicit RIP Matrices and the Square-Root Bottleneck Constructing Explicit RIP Matrices and the Square-Root Bottleneck Ryan Cinoman July 18, 2018 Ryan Cinoman Constructing Explicit RIP Matrices July 18, 2018 1 / 36 Outline 1 Introduction 2 Restricted Isometry

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

Rank Determination for Low-Rank Data Completion

Rank Determination for Low-Rank Data Completion Journal of Machine Learning Research 18 017) 1-9 Submitted 7/17; Revised 8/17; Published 9/17 Rank Determination for Low-Rank Data Completion Morteza Ashraphijuo Columbia University New York, NY 1007,

More information

Compressive Sensing of Sparse Tensor

Compressive Sensing of Sparse Tensor Shmuel Friedland Univ. Illinois at Chicago Matheon Workshop CSA2013 December 12, 2013 joint work with Q. Li and D. Schonfeld, UIC Abstract Conventional Compressive sensing (CS) theory relies on data representation

More information

Recovering any low-rank matrix, provably

Recovering any low-rank matrix, provably Recovering any low-rank matrix, provably Rachel Ward University of Texas at Austin October, 2014 Joint work with Yudong Chen (U.C. Berkeley), Srinadh Bhojanapalli and Sujay Sanghavi (U.T. Austin) Matrix

More information

Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition

Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition Gongguo Tang and Arye Nehorai Department of Electrical and Systems Engineering Washington University in St Louis

More information

Low-Rank Matrix Recovery

Low-Rank Matrix Recovery ELE 538B: Mathematics of High-Dimensional Data Low-Rank Matrix Recovery Yuxin Chen Princeton University, Fall 2018 Outline Motivation Problem setup Nuclear norm minimization RIP and low-rank matrix recovery

More information

Sparse Covariance Selection using Semidefinite Programming

Sparse Covariance Selection using Semidefinite Programming Sparse Covariance Selection using Semidefinite Programming A. d Aspremont ORFE, Princeton University Joint work with O. Banerjee, L. El Ghaoui & G. Natsoulis, U.C. Berkeley & Iconix Pharmaceuticals Support

More information

Solving Corrupted Quadratic Equations, Provably

Solving Corrupted Quadratic Equations, Provably Solving Corrupted Quadratic Equations, Provably Yuejie Chi London Workshop on Sparse Signal Processing September 206 Acknowledgement Joint work with Yuanxin Li (OSU), Huishuai Zhuang (Syracuse) and Yingbin

More information

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 11, NOVEMBER On the Performance of Sparse Recovery

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 11, NOVEMBER On the Performance of Sparse Recovery IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 11, NOVEMBER 2011 7255 On the Performance of Sparse Recovery Via `p-minimization (0 p 1) Meng Wang, Student Member, IEEE, Weiyu Xu, and Ao Tang, Senior

More information

A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations

A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations Chuangchuang Sun and Ran Dai Abstract This paper proposes a customized Alternating Direction Method of Multipliers

More information

Linear and Multilinear Algebra. Linear maps preserving rank of tensor products of matrices

Linear and Multilinear Algebra. Linear maps preserving rank of tensor products of matrices Linear maps preserving rank of tensor products of matrices Journal: Manuscript ID: GLMA-0-0 Manuscript Type: Original Article Date Submitted by the Author: -Aug-0 Complete List of Authors: Zheng, Baodong;

More information

Supplementary Materials for Riemannian Pursuit for Big Matrix Recovery

Supplementary Materials for Riemannian Pursuit for Big Matrix Recovery Supplementary Materials for Riemannian Pursuit for Big Matrix Recovery Mingkui Tan, School of omputer Science, The University of Adelaide, Australia Ivor W. Tsang, IS, University of Technology Sydney,

More information

Fixed point and Bregman iterative methods for matrix rank minimization

Fixed point and Bregman iterative methods for matrix rank minimization Math. Program., Ser. A manuscript No. (will be inserted by the editor) Shiqian Ma Donald Goldfarb Lifeng Chen Fixed point and Bregman iterative methods for matrix rank minimization October 27, 2008 Abstract

More information

Sparse and low-rank decomposition for big data systems via smoothed Riemannian optimization

Sparse and low-rank decomposition for big data systems via smoothed Riemannian optimization Sparse and low-rank decomposition for big data systems via smoothed Riemannian optimization Yuanming Shi ShanghaiTech University, Shanghai, China shiym@shanghaitech.edu.cn Bamdev Mishra Amazon Development

More information

Compressed sensing. Or: the equation Ax = b, revisited. Terence Tao. Mahler Lecture Series. University of California, Los Angeles

Compressed sensing. Or: the equation Ax = b, revisited. Terence Tao. Mahler Lecture Series. University of California, Los Angeles Or: the equation Ax = b, revisited University of California, Los Angeles Mahler Lecture Series Acquiring signals Many types of real-world signals (e.g. sound, images, video) can be viewed as an n-dimensional

More information

User s Guide for LMaFit: Low-rank Matrix Fitting

User s Guide for LMaFit: Low-rank Matrix Fitting User s Guide for LMaFit: Low-rank Matrix Fitting Yin Zhang Department of CAAM Rice University, Houston, Texas, 77005 (CAAM Technical Report TR09-28) (Versions beta-1: August 23, 2009) Abstract This User

More information

Lecture 5 : Projections

Lecture 5 : Projections Lecture 5 : Projections EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Up until now, we have seen convergence rates of unconstrained gradient descent. Now, we consider a constrained minimization

More information

Lecture 2: Linear Algebra Review

Lecture 2: Linear Algebra Review EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 2 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory April 5, 2012 Andre Tkacenko

More information

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization /

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization / Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear

More information

Matrix Completion from a Few Entries

Matrix Completion from a Few Entries Matrix Completion from a Few Entries Raghunandan H. Keshavan and Sewoong Oh EE Department Stanford University, Stanford, CA 9434 Andrea Montanari EE and Statistics Departments Stanford University, Stanford,

More information

Gauge optimization and duality

Gauge optimization and duality 1 / 54 Gauge optimization and duality Junfeng Yang Department of Mathematics Nanjing University Joint with Shiqian Ma, CUHK September, 2015 2 / 54 Outline Introduction Duality Lagrange duality Fenchel

More information

Abstract. 1. Introduction

Abstract. 1. Introduction Journal of Computational Mathematics Vol.28, No.2, 2010, 273 288. http://www.global-sci.org/jcm doi:10.4208/jcm.2009.10-m2870 UNIFORM SUPERCONVERGENCE OF GALERKIN METHODS FOR SINGULARLY PERTURBED PROBLEMS

More information

Penalty Decomposition Methods for Rank and l 0 -Norm Minimization

Penalty Decomposition Methods for Rank and l 0 -Norm Minimization IPAM, October 12, 2010 p. 1/54 Penalty Decomposition Methods for Rank and l 0 -Norm Minimization Zhaosong Lu Simon Fraser University Joint work with Yong Zhang (Simon Fraser) IPAM, October 12, 2010 p.

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2013 PROBLEM SET 2

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2013 PROBLEM SET 2 STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2013 PROBLEM SET 2 1. You are not allowed to use the svd for this problem, i.e. no arguments should depend on the svd of A or A. Let W be a subspace of C n. The

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

Elaine T. Hale, Wotao Yin, Yin Zhang

Elaine T. Hale, Wotao Yin, Yin Zhang , Wotao Yin, Yin Zhang Department of Computational and Applied Mathematics Rice University McMaster University, ICCOPT II-MOPTA 2007 August 13, 2007 1 with Noise 2 3 4 1 with Noise 2 3 4 1 with Noise 2

More information

A New Estimate of Restricted Isometry Constants for Sparse Solutions

A New Estimate of Restricted Isometry Constants for Sparse Solutions A New Estimate of Restricted Isometry Constants for Sparse Solutions Ming-Jun Lai and Louis Y. Liu January 12, 211 Abstract We show that as long as the restricted isometry constant δ 2k < 1/2, there exist

More information

Lecture 25: November 27

Lecture 25: November 27 10-725: Optimization Fall 2012 Lecture 25: November 27 Lecturer: Ryan Tibshirani Scribes: Matt Wytock, Supreeth Achar Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have

More information

Key Words: low-rank matrix recovery, iteratively reweighted least squares, matrix completion.

Key Words: low-rank matrix recovery, iteratively reweighted least squares, matrix completion. LOW-RANK MATRIX RECOVERY VIA ITERATIVELY REWEIGHTED LEAST SQUARES MINIMIZATION MASSIMO FORNASIER, HOLGER RAUHUT, AND RACHEL WARD Abstract. We present and analyze an efficient implementation of an iteratively

More information

Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy

Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy Caroline Chaux Joint work with X. Vu, N. Thirion-Moreau and S. Maire (LSIS, Toulon) Aix-Marseille

More information

A MODIFIED TSVD METHOD FOR DISCRETE ILL-POSED PROBLEMS

A MODIFIED TSVD METHOD FOR DISCRETE ILL-POSED PROBLEMS A MODIFIED TSVD METHOD FOR DISCRETE ILL-POSED PROBLEMS SILVIA NOSCHESE AND LOTHAR REICHEL Abstract. Truncated singular value decomposition (TSVD) is a popular method for solving linear discrete ill-posed

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR

THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR 1. Definition Existence Theorem 1. Assume that A R m n. Then there exist orthogonal matrices U R m m V R n n, values σ 1 σ 2... σ p 0 with p = min{m, n},

More information

Homework 4. Convex Optimization /36-725

Homework 4. Convex Optimization /36-725 Homework 4 Convex Optimization 10-725/36-725 Due Friday November 4 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)

More information

Low rank matrix completion by alternating steepest descent methods

Low rank matrix completion by alternating steepest descent methods SIAM J. IMAGING SCIENCES Vol. xx, pp. x c xxxx Society for Industrial and Applied Mathematics x x Low rank matrix completion by alternating steepest descent methods Jared Tanner and Ke Wei Abstract. Matrix

More information

Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization

Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization Forty-Fifth Annual Allerton Conference Allerton House, UIUC, Illinois, USA September 26-28, 27 WeA3.2 Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization Benjamin

More information

Linear Algebra for Machine Learning. Sargur N. Srihari

Linear Algebra for Machine Learning. Sargur N. Srihari Linear Algebra for Machine Learning Sargur N. srihari@cedar.buffalo.edu 1 Overview Linear Algebra is based on continuous math rather than discrete math Computer scientists have little experience with it

More information