TRANSFORMED SCHATTEN-1 ITERATIVE THRESHOLDING ALGORITHMS FOR LOW RANK MATRIX COMPLETION
SHUAI ZHANG, PENGHANG YIN, AND JACK XIN

Abstract. We study a non-convex low-rank promoting penalty function, the transformed Schatten-1 (TS1), and its applications in matrix completion. The TS1 penalty, a matrix quasi-norm defined on the singular values, interpolates the rank and the nuclear norm through a nonnegative parameter a ∈ (0,+∞). We consider the unconstrained TS1 regularized low-rank matrix recovery problem and develop a fixed point representation for its global minimizer. The TS1 thresholding functions are in closed analytical form for all parameter values. The TS1 threshold values differ in the subcritical (supercritical) parameter regime, where the TS1 threshold functions are continuous (discontinuous). We propose TS1 iterative thresholding algorithms and compare them with some state-of-the-art algorithms on matrix completion test problems. For problems with known rank, a fully adaptive TS1 iterative thresholding algorithm consistently performs the best under different conditions, where the ground truth matrices are generated by multivariate Gaussian, (0,1) uniform and Chi-square distributions. For problems with unknown rank, TS1 algorithms with an additional rank estimation procedure approach the level of IRucL-q, an iterative reweighted algorithm, non-convex in nature and best in performance.

Key words. Transformed Schatten-1 penalty, fixed point representation, closed form thresholding function, iterative thresholding algorithms, matrix completion.

AMS subject classifications. 90C26, 90C46.

1. Introduction
Low rank matrix completion problems arise in many applications such as collaborative filtering in recommender systems [4, 7], minimum order system and low-dimensional Euclidean embedding in control theory [4, 5], network localization [8], and others [26]. The mathematical problem is:

min_{X ∈ R^{m×n}} rank(X) s.t. X ∈ L, (1.1)

where L is a convex set.
In this paper, we are interested in methods for solving the affine rank minimization problem (ARMP)

min_{X ∈ R^{m×n}} rank(X) s.t. A(X) = b ∈ R^p, (1.2)

where the linear transformation A : R^{m×n} → R^p and the vector b ∈ R^p are given. The matrix completion problem

min_{X ∈ R^{m×n}} rank(X) s.t. X_{i,j} = M_{i,j}, (i,j) ∈ Ω, (1.3)

is a special case of (1.2), where X and M are both m×n matrices and Ω is a subset of index pairs {(i,j)}. The optimization problems above are known to be NP-hard. Many alternative penalties have been utilized as proxies for finding low rank solutions in both the constrained and unconstrained settings:

min_{X ∈ R^{m×n}} F(X) s.t. A(X) = b, (1.4)

The work was partially supported by NSF grants DMS- and DMS-. Department of Mathematics, University of California, Irvine, CA, 92697, USA. szhang3@uci.edu, penghany@uci.edu, jxin@math.uci.edu.
and

min_{X ∈ R^{m×n}} (1/2)‖A(X) − b‖₂² + λF(X). (1.5)

The penalty function F(·) is defined on the singular values of the matrix X, typically F(X) = Σ_i f(σ_i), where σ_i is the i-th largest singular value of X arranged in descending order. The Schatten-p quasi-norm (nuclear norm at p = 1) results when f(x) = |x|^p, p ∈ (0,1]. At p = 0 (p = 2), F is the rank (Frobenius norm). Recovering rank under suitable conditions for p ∈ (0,1] has been extensively studied in theories and algorithms [2, 3, 4, 9, 2, 2, 23, 24, 28]. Non-convex penalty based methods have shown better performance on hard problems [2, 24]. There is also a novel method to solve the constrained problem (1.4) from the perspective of the gauge dual [32, 33].

Recently, a class of ℓ1-based non-convex penalties, the transformed ℓ1 (TL1), has been found effective and robust for compressed sensing problems [3, 3]. TL1 interpolates ℓ0 and ℓ1, similar to the ℓp quasi-norm (p ∈ (0,1)). In the entire range of the interpolation parameter, TL1 enjoys a closed form iterative thresholding function, which is available for ℓp only at some specific values, namely p = 0, 1, 1/2, 2/3; see [, 5, 7, 29]. This feature allows TL1 to perform fast and robust sparse minimization in a much wider range than the ℓp quasi-norm. Moreover, the TL1 penalty possesses unbiasedness and Lipschitz continuity besides sparsity [2, 22]. It is the goal of this paper to extend the TL1 penalty to TS1 (transformed Schatten-1) for low rank matrix completion and compare it with state-of-the-art methods in the literature.

The rest of the paper is organized as follows. In section 2, we present the transformed Schatten-1 function (TS1), the TS1 regularized minimization problems, and a derivation of the thresholding representation of the global minimum. In section 3, we propose two thresholding algorithms (TS1-s1 and TS1-s2) based on a fixed point equation of the global minimum.
In section 4, we compare the TS1 algorithms with some state-of-the-art algorithms through numerical experiments in low rank matrix recovery and image inpainting. Concluding remarks are in section 5.

1.1. Notation
Here we set the notation for this paper. Two kinds of inner products are used in the following sections, one a bilinear operation for vectors and the other between matrices:

(x,y) = Σ_i x_i y_i for vectors x, y;
⟨X,Y⟩ = tr(Y^T X) = Σ_{i,j} X_{i,j} Y_{i,j} for matrices X, Y.

Assume the matrix X ∈ R^{m×n} has r positive singular values σ_1 ≥ σ_2 ≥ … ≥ σ_r > 0. Let us introduce some common matrix norms or quasi-norms:

Nuclear norm: ‖X‖_* = Σ_{i=1}^r σ_i;
Schatten-p quasi-norm: ‖X‖_p = (Σ_{i=1}^r σ_i^p)^{1/p}, for p ∈ (0,1);
Frobenius norm: ‖X‖_F = (Σ_{i=1}^r σ_i²)^{1/2}; also ‖X‖_F² = ⟨X,X⟩ = Σ_{i,j} X_{i,j}²;
Ky Fan k-norm: ‖X‖_{F_k} = Σ_{i=1}^k σ_i, for k ≤ r;
Induced L2 norm: ‖X‖_{L2} = max_{‖v‖₂=1} ‖Xv‖₂ = σ_1.

Define the function vec(·) to unfold a matrix columnwise into a vector. It is clear that ‖vec(X)‖₂ = ‖X‖_F, where the left hand side norm is the vector ℓ2 norm. Define the shrinkage identity matrix I^s_k ∈ R^{m×n} as follows:

I^s_k(i,i) = 1 for the first k diagonal elements; I^s_k(i,j) = 0 otherwise. (1.6)

The operator tr_k(·) is defined as the first-k partial trace of a matrix,

tr_k(X) = Σ_{i=1}^k X_{i,i}. (1.7)

The following matrix functions will be used in the proofs of the next section, and we record them here for reference:

C_λ(X) = (1/2)‖A(X) − b‖₂² + λT(X);

C_{λ,µ}(X,Z) = µ{ C_λ(X) − (1/2)‖A(X) − A(Z)‖₂² } + (1/2)‖X − Z‖_F²
  = µλT(X) + (µ/2)‖b‖₂² − (µ/2)‖A(Z)‖₂² − µ(A(X), b − A(Z)) + (1/2)‖X − Z‖_F²;

B_µ(Z) = Z + µA*(b − A(Z)). (1.8)

2. TS1 minimization and thresholding representation
First, let us introduce the transformed Schatten-1 penalty function (TS1) based on the singular values of a matrix:

T(X) = Σ_{i=1}^{rank(X)} ρ_a(σ_i), (2.1)

where ρ_a(·) is a linear-to-linear rational function with parameter a ∈ (0,∞) [3, 3],

ρ_a(|x|) = (a+1)|x| / (a + |x|). (2.2)

With the change of the parameter a, TL1 interpolates the ℓ0 and ℓ1 norms:

lim_{a→0+} ρ_a(x) = 1_{{x ≠ 0}}, lim_{a→+∞} ρ_a(x) = |x|.

In Fig. 2.1, level lines of TL1 on the plane are shown at small and large values of the parameter a, resembling those of ℓ1 (at large a), ℓ_{1/2} (at a = 1), and ℓ0 (at small a). We shall focus on the TS1 regularized problem

min_{X ∈ R^{m×n}} (1/2)‖A(X) − b‖₂² + λT(X), (2.3)

where the linear transform A : R^{m×n} → R^p can be determined by p given matrices A_1, …, A_p ∈ R^{m×n}, that is, A(X) = (⟨A_1,X⟩, …, ⟨A_p,X⟩)^T.
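As a concrete illustration of (2.1)-(2.2), the TS1 penalty can be evaluated numerically from the singular values of X. The following Python sketch is ours (not the paper's Matlab code); as a → 0 its value approaches the rank, and as a → ∞ it approaches the nuclear norm:

```python
import numpy as np

def rho(x, a):
    """Transformed l1 function rho_a(|x|) = (a+1)|x| / (a + |x|), cf. (2.2)."""
    x = np.abs(x)
    return (a + 1.0) * x / (a + x)

def ts1_penalty(X, a):
    """TS1 penalty T(X): rho_a summed over the singular values of X, cf. (2.1)."""
    sigma = np.linalg.svd(X, compute_uv=False)
    return rho(sigma, a).sum()
```

For example, for X = Diag(2, 0) and a = 1, T(X) = ρ_1(2) = 4/3, between rank(X) = 1 and ‖X‖_* = 2.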
(a) ℓ1 (b) TL1 with large a (c) TL1 with a = 1 (d) TL1 with small a

Fig. 2.1: Level lines of TL1 with different parameters a (figures b, c, d). For a large parameter a, the graph looks almost the same as ℓ1 (figure a), while for a small value of a, the level lines tend to the axes.

2.1. Overview of TL1 minimization
To set the stage for the discussion of the TS1 regularized problem (2.3), we review the following results on one-dimensional TL1 optimization [3]. Consider the unconstrained TL1 regularized problem:

min_{x ∈ R^n} (1/2)‖Ax − y‖₂² + λP_a(x), (2.4)

where the matrix A ∈ R^{m×n} and vector y ∈ R^m are given, P_a(x) = Σ_i ρ_a(|x_i|), and the function ρ_a(·) is as in (2.2). In this subsection on TL1 minimization, we overload the operator B_µ(·) to act on a vector x, instead of on matrices as in (1.8):

B_µ(x) = x + µA^T(y − Ax). (2.5)

In Theorem 2.1 below, a closed form expression is given for the proximal operator prox_{λρ_a} of the univariate TL1 regularization problem, where

prox_{λρ_a}(x) = argmin_{y ∈ R} (1/2)(y − x)² + λρ_a(|y|).
The proximal operator of a convex function solves a small convex regularization problem, which often admits a closed-form formula or an efficient specialized numerical method. However, for non-convex functions, like ℓp with p ∈ (0,1), the associated proximal operators do not have closed form solutions in general. There are many iterative algorithms to approximate the optimal solution, but they need more computing time and sometimes converge only to local optima or stationary points. In this subsection, we show that for the TL1 function there indeed exists a closed-form formula for the optimal solution. Different from other thresholding operators, TL1 has two threshold value formulas, depending on the regularization parameter λ and the TL1 parameter a. We present them here with the same notation as [3]:

t2 = λ(a+1)/a (sub-critical parameter);
t3 = sqrt(2λ(a+1)) − a/2 (super-critical parameter). (2.6)

The inequality t3 ≤ t2 holds, and equality is realized if and only if λ = a²/(2(a+1)); see [3]. Let sgn(·) be the standard signum function with sgn(0) = 0, and

h_λ(x) = sgn(x) { (2/3)(a+|x|) cos(φ(x)/3) − 2a/3 + |x|/3 } (2.7)

with φ(x) = arccos( 1 − 27λa(a+1) / (2(a+|x|)³) ). In general, |h_λ(x)| ≤ |x|; see [3].

Theorem 2.1. ([3]) The optimal solution y* = argmin_{y ∈ R} { (1/2)(y−x)² + λρ_a(|y|) } is a thresholding function of the form:

y* = 0 if |x| ≤ t; y* = h_λ(x) if |x| > t, (2.8)

where h_λ(·) is defined in (2.7), and the threshold parameter t depends on λ as follows:
1. if λ ≤ a²/(2(a+1)) (sub-critical and critical), t = t2 = λ(a+1)/a;
2. if λ > a²/(2(a+1)) (super-critical), t = t3 = sqrt(2λ(a+1)) − a/2.

According to the above theorem, we introduce the thresholding operator g_{λ,a}(·) on R,

g_{λ,a}(w) = 0 if |w| ≤ t; h_λ(w) if |w| > t, (2.9)

where t is the threshold value in Theorem 2.1 and h_λ(·) is as in (2.7). In [3], the authors proved that when λ < a²/(2(a+1)), the TL1 thresholding function is continuous, like the soft-thresholding function [8, 9].
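For reference, the closed forms (2.6)-(2.9) can be transcribed directly. The Python sketch below is our rendering of these formulas (not the authors' code); the arccos argument is clipped for numerical safety at the threshold boundary, where it equals −1 exactly:

```python
import numpy as np

def tl1_threshold(lam, a):
    """Threshold t of (2.6): t2 in the sub-critical regime, t3 otherwise."""
    if lam <= a * a / (2.0 * (a + 1.0)):
        return lam * (a + 1.0) / a                        # t2: continuous regime
    return np.sqrt(2.0 * lam * (a + 1.0)) - a / 2.0       # t3: jump discontinuity

def h(x, lam, a):
    """Closed-form minimizer branch h_lambda(x) of (2.7), used when |x| > t."""
    c = 1.0 - 27.0 * lam * a * (a + 1.0) / (2.0 * (a + abs(x)) ** 3)
    phi = np.arccos(np.clip(c, -1.0, 1.0))
    return np.sign(x) * (2.0 / 3.0 * (a + abs(x)) * np.cos(phi / 3.0)
                         - 2.0 * a / 3.0 + abs(x) / 3.0)

def g(x, lam, a):
    """TL1 thresholding operator g_{lambda,a} of (2.9)."""
    return h(x, lam, a) if abs(x) > tl1_threshold(lam, a) else 0.0
```

A quick sanity check is to compare g(x, λ, a) against a brute-force grid minimization of (1/2)(y−x)² + λρ_a(|y|) in both the sub- and super-critical regimes.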
While if λ > a²/(2(a+1)), the TL1 thresholding function has a jump discontinuity at the threshold value, similar to the half-thresholding function [29]. Among thresholding schemes, it is believed that a
continuous formula is more stable, while a discontinuous formula separates nonzero and trivial coefficients more efficiently and sometimes converges faster.

We have the following representation theorem for the TL1 regularized problem (2.4).

Theorem 2.2. ([3]) If x* = (x*_1, x*_2, …, x*_n)^T is a TL1 regularized solution of (2.4) with a and λ positive constants, and 0 < µ < ‖A‖₂^{-2}, then letting t = t2·1_{{λµ ≤ a²/(2(a+1))}} + t3·1_{{λµ > a²/(2(a+1))}} (with λ replaced by λµ in (2.6)), the optimal solution satisfies the fixed point equation:

x*_i = g_{λµ,a}([B_µ(x*)]_i), i = 1, …, n. (2.10)

In the following, we extend this result to TS1 low rank matrix completion and propose two thresholding algorithms based on it.

2.2. TS1 thresholding representation theory
Here we assume m ≤ n. For a matrix X ∈ R^{m×n} with rank equal to r, its singular value vector σ = (σ_1, …, σ_m) is arranged as σ_1 ≥ σ_2 ≥ … ≥ σ_r > 0 = σ_{r+1} = … = σ_m. The singular value decomposition (SVD) is X = UDV^T, where U = (U_{i,j})_{m×m} and V = (V_{i,j})_{n×n} are unitary matrices, and D = Diag(σ) ∈ R^{m×n} is diagonal. In [3], Ky Fan proved the dominance theorem and derived the following Ky Fan k-norm inequality.

Lemma 2.1. (Ky Fan k-norm inequality) For a matrix X ∈ R^{m×n} with SVD X = U D V^T, where the diagonal elements of D are arranged in decreasing order, we have ⟨X, I^s_k⟩ ≤ ⟨D, I^s_k⟩, that is,

tr_k(X) ≤ tr_k(D) = ‖X‖_{F_k}, k = 1, 2, …, m.

The inequalities become equalities if and only if X = D. Here the matrix I^s_k and the operator tr_k(·) are defined in section 1.1. Another proof of this inequality, not using the dominance theorem, is given in the appendix for the reader's convenience, making the paper self-contained.

Theorem 2.3. Consider a matrix Y ∈ R^{m×n} that admits a singular value decomposition of the form Y = U Diag(σ) V^T, where σ = (σ_1, …, σ_m). Then the unique global minimizer of min_{X ∈ R^{m×n}} (1/2)‖X − Y‖_F² + λT(X) is:

X^s = G_{λ,a}(Y) = U Diag(g_{λ,a}(σ)) V^T, (2.11)

where g_{λ,a}(·) is defined in (2.9) and applied entrywise to σ.

Proof.
First, due to the unitary invariance of the Frobenius norm and Y = U Diag(σ) V^T, we have

(1/2)‖X − Y‖_F² + λT(X) = (1/2)‖U^T X V − Diag(σ)‖_F² + λT(U^T X V).

So

X^s = argmin_{X ∈ R^{m×n}} (1/2)‖X − Y‖_F² + λT(X)
  = U { argmin_{X ∈ R^{m×n}} (1/2)‖X − Diag(σ)‖_F² + λT(X) } V^T. (2.12)
Next we want to show:

argmin_{X ∈ R^{m×n}} (1/2)‖X − Diag(σ)‖_F² + λT(X)
  = argmin_{D ∈ R^{m×n} diagonal} (1/2)‖D − Diag(σ)‖_F² + λT(D). (2.13)

For any X ∈ R^{m×n}, suppose it admits the SVD X = U_x Diag(σ_x) V_x^T. Denote D_x = Diag(σ_x) and D_y = Diag(σ). We can rewrite the diagonal matrix D_y as D_y = Σ_{i=1}^m Δσ_i I^s_i, where Δσ_i = σ_i − σ_{i+1} ≥ 0 for i = 1, 2, …, m−1, and Δσ_m = σ_m; equivalently, σ_k = Σ_{i=k}^m Δσ_i. Note that the shrinkage identity matrix I^s_i is defined in section 1.1. So:

⟨X, D_y⟩ = ⟨X, Σ_{i=1}^m Δσ_i I^s_i⟩ = Σ_{i=1}^m Δσ_i ⟨X, I^s_i⟩ ≤ Σ_{i=1}^m Δσ_i ⟨D_x, I^s_i⟩ = ⟨D_x, D_y⟩,

where we used Lemma 2.1 for the inequality. The equality holds if and only if X = D_x. Thus we have

‖X − D_y‖_F² = ‖X‖_F² + ‖D_y‖_F² − 2⟨X, D_y⟩ ≥ ‖D_x‖_F² + ‖D_y‖_F² − 2⟨D_x, D_y⟩ = ‖D_x − D_y‖_F².

Also, since T(X) = T(D_x),

(1/2)‖X − D_y‖_F² + λT(X) ≥ (1/2)‖D_x − D_y‖_F² + λT(D_x).

Only when X = D_x is a diagonal matrix does the above become an equality. So we proved equation (2.13). Denote a diagonal matrix D ∈ R^{m×n} by D = Diag(d). Then:

(1/2)‖D − Diag(σ)‖_F² + λT(D) = Σ_{i=1}^m { (1/2)|d_i − σ_i|² + λρ_a(|d_i|) }.

By Theorem 2.1, we have g_{λ,a}(σ_i) = argmin_d { (1/2)|d − σ_i|² + λρ_a(|d|) }. It follows that

argmin_{D ∈ R^{m×n} diagonal} (1/2)‖D − Diag(σ)‖_F² + λT(D)
  = argmin_{X ∈ R^{m×n}} (1/2)‖X − Diag(σ)‖_F² + λT(X)
  = Diag(g_{λ,a}(σ)). (2.14)

In view of (2.12), the matrix X^s = U Diag(g_{λ,a}(σ)) V^T is a global minimizer, which will be denoted G_{λ,a}(Y). Let X' be another optimal solution of the problem min_{X ∈ R^{m×n}} (1/2)‖X − Y‖_F² + λT(X), and denote X̄ = U^T X' V. Then X̄ is an optimal solution of argmin_{X ∈ R^{m×n}} (1/2)‖X − Diag(σ)‖_F² + λT(X). Based on the above proof and Theorem 2.1, we have X̄ =
Diag(g_{λ,a}(σ)). Since U and V are unitary, X' = U X̄ V^T = X^s. We proved that the matrix G_{λ,a}(Y) is the unique optimal solution of the optimization problem. The proof is complete.

Lemma 2.2. For any fixed λ > 0, µ > 0 and matrix Z ∈ R^{m×n}, let X^s = G_{λµ,a}(B_µ(Z)). Then X^s is the unique global minimizer of the problem min_{X ∈ R^{m×n}} C_{λ,µ}(X,Z), where the matrix function C_{λ,µ}(X,Z) is defined in (1.8) of section 1.1.

Proof. First, we rewrite the formula for C_{λ,µ}(X,Z). Note that A(X) and A(Z) are vectors in R^p. Thus the formula for C_{λ,µ}(X,Z) contains norms and inner products of both matrices and vectors. By definition,

C_{λ,µ}(X,Z) = (1/2)‖X‖_F² − ⟨X,Z⟩ + (1/2)‖Z‖_F² + λµT(X) + (µ/2)‖b‖₂² − µ(A(X), b − A(Z)) − (µ/2)‖A(Z)‖₂²
  = (1/2)‖X‖_F² + (1/2)‖Z‖_F² + (µ/2)‖b‖₂² − (µ/2)‖A(Z)‖₂² + λµT(X) − ⟨X, Z + µA*(b − A(Z))⟩
  = (1/2)‖X − B_µ(Z)‖_F² + λµT(X) − (1/2)‖B_µ(Z)‖_F² + (1/2)‖Z‖_F² + (µ/2)‖b‖₂² − (µ/2)‖A(Z)‖₂². (2.15)

Thus, if we fix the matrix Z,

argmin_{X ∈ R^{m×n}} C_{λ,µ}(X,Z) = argmin_{X ∈ R^{m×n}} (1/2)‖X − B_µ(Z)‖_F² + λµT(X). (2.16)

Then by Theorem 2.3, X^s is the unique global minimizer.

Theorem 2.4. Fix the parameters λ > 0 and 0 < µ < ‖A‖₂^{-2}. If X* is a global minimizer of C_λ(X), then X* is the unique global minimizer of min_{X ∈ R^{m×n}} C_{λ,µ}(X, X*).

Proof.

C_{λ,µ}(X, X*) = µ{ (1/2)‖A(X) − b‖₂² + λT(X) } + (1/2){ ‖X − X*‖_F² − µ‖A(X) − A(X*)‖₂² }
  ≥ µ{ (1/2)‖A(X) − b‖₂² + λT(X) }
  = µ C_λ(X) ≥ µ C_λ(X*) = C_{λ,µ}(X*, X*). (2.17)

The first inequality is due to the fact that

‖A(X) − A(X*)‖₂² = ‖A vec(X) − A vec(X*)‖₂² ≤ ‖A‖₂² ‖vec(X − X*)‖₂² = ‖A‖₂² ‖X − X*‖_F², (2.18)

where A also denotes the matrix representation of the linear map acting on vec(X). From the above inequalities, we know that X* is an optimal solution of min_{X ∈ R^{m×n}} C_{λ,µ}(X, X*). The uniqueness of X* follows from Lemma 2.2.

By the above theorems and lemmas, if X* is a global minimizer of C_λ(X), it is also the unique global minimizer of C_{λ,µ}(X,Z) with Z = X*, which has a closed form solution formula. Thus we arrive at the following fixed point equation for this global minimizer X*:

X* = G_{λµ,a}(B_µ(X*)).
(2.19)

Suppose the SVD of the matrix B_µ(X*) is U Diag(σ^b) V^T. Then X* = U Diag(g_{λµ,a}(σ^b)) V^T, which means that the singular values of X* satisfy σ*_i = g_{λµ,a}(σ^b_i), for i = 1, …, m.
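Theorem 2.3 says the matrix proximal step reduces to applying the scalar thresholding g_{λ,a} to the singular values. A self-contained Python sketch of the operator G_{λ,a} follows (ours, using a full SVD rather than the approximate Monte Carlo SVD the paper uses in its experiments); the global-minimizer property of Theorem 2.3 can be spot-checked by comparing objective values against perturbations:

```python
import numpy as np

def tl1_h(x, lam, a):
    """Scalar TL1 minimizer branch (2.7)."""
    c = 1.0 - 27.0 * lam * a * (a + 1.0) / (2.0 * (a + abs(x)) ** 3)
    phi = np.arccos(np.clip(c, -1.0, 1.0))
    return np.sign(x) * (2.0 / 3.0 * (a + abs(x)) * np.cos(phi / 3.0)
                         - 2.0 * a / 3.0 + abs(x) / 3.0)

def tl1_t(lam, a):
    """Threshold t of Theorem 2.1."""
    if lam <= a * a / (2.0 * (a + 1.0)):
        return lam * (a + 1.0) / a
    return np.sqrt(2.0 * lam * (a + 1.0)) - a / 2.0

def G(Y, lam, a):
    """Matrix thresholding G_{lambda,a}(Y) of (2.11): g applied to singular values."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    t = tl1_t(lam, a)
    s_new = np.array([tl1_h(si, lam, a) if si > t else 0.0 for si in s])
    return U @ np.diag(s_new) @ Vt
```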
3. TS1 thresholding algorithms
Next we utilize the fixed point equation (2.19) to derive two thresholding algorithms for the TS1 regularized problem (2.3). As in [3, 3], starting from the equation X* = G_{λµ,a}(B_µ(X*)) = U Diag(g_{λµ,a}(σ)) V^T, we replace the optimal matrix X* with X^k on the left and X^{k−1} on the right at the k-th iteration step:

X^k = G_{λµ,a}(B_µ(X^{k−1})) = U^k Diag( g_{λµ,a}(σ^k) ) V^{k,T}, (3.1)

where the unitary matrices U^k, V^k and singular values {σ^k} come from the SVD of the matrix B_µ(X^{k−1}). The operator g_{λµ,a}(·) is defined in (2.9), and

g_{λµ,a}(w) = 0 if w < t; h_{λµ}(w) if w ≥ t. (3.2)

Recall that the thresholding parameter t is:

t = t2 = λµ(a+1)/a, if λ ≤ a²/(2(a+1)µ);
t = t3 = sqrt(2λµ(a+1)) − a/2, if λ > a²/(2(a+1)µ). (3.3)

With an initial matrix X⁰, we obtain an iterative algorithm, called the TS1 iterative thresholding (IT) algorithm. It is the basic TS1 iterative scheme. Below, two adaptive and more efficient IT algorithms (TS1-s1 and TS1-s2) are introduced.

3.1. Semi-Adaptive Thresholding Algorithm TS1-s1
We begin by formulating an optimality condition for the regularization parameter λ, which serves as the basis for the parameter selection and updating in this semi-adaptive algorithm. Suppose the optimal solution matrix X* has rank r, by prior knowledge or estimation. Here we still assume m ≤ n. For any µ, denote B_µ(X) = X + µA*(b − A(X)), and let {σ_i}_{i=1}^m be the m non-negative singular values of B_µ(X). Suppose that X* is the optimal solution matrix of (2.3), and the singular values of the matrix B_µ(X*) are denoted σ*_1 ≥ σ*_2 ≥ … ≥ σ*_m. Then by the fixed point equation (2.19), the following inequalities hold:

σ*_i > t for i ∈ {1, 2, …, r}; σ*_j ≤ t for j ∈ {r+1, r+2, …, m}, (3.4)

where t is our threshold value. Recall that t3 ≤ t ≤ t2. So

σ*_r ≥ t ≥ t3 = sqrt(2λµ(a+1)) − a/2; σ*_{r+1} ≤ t ≤ t2 = λµ(a+1)/a. (3.5)

It follows that

λ1 := aσ*_{r+1}/(µ(a+1)) ≤ λ ≤ (a + 2σ*_r)²/(8(a+1)µ) =: λ2, i.e. λ ∈ [λ1, λ2].

The above estimate helps to set an optimal regularization parameter.
A choice of λ is:
(I) when λ1 ≤ a²/(2(a+1)µ), set λ = λ1; then we will have λ ≤ a²/(2(a+1)µ) and thus the thresholding value t = t2;
(II) when λ1 > a²/(2(a+1)µ), set λ = λ2; then λ > a²/(2(a+1)µ), and thus we choose the thresholding value t = t3.

In practice, we approximate B_µ(X*) by B_µ(X^n) in the above formulas; that is, the i-th singular value σ*_i is approximated by σ^n_i at the n-th iteration step. Thus we have λ_{1,n} = aσ^n_{r+1}/(µ(a+1)) and λ_{2,n} = (a + 2σ^n_r)²/(8(a+1)µ). We choose the optimal parameter λ at the n-th step as

λ_n = λ_{1,n}, if λ_{1,n} ≤ a²/(2(a+1)µ); λ_{2,n}, if λ_{1,n} > a²/(2(a+1)µ). (3.6)

This way, we obtain an adaptive iterative algorithm without pre-setting the regularization parameter λ. The TL1 parameter a is still free and needs to be selected beforehand. Thus the algorithm is overall semi-adaptive, called TS1-s1 for short and summarized in Algorithm 1.

Algorithm 1: TS1-s1 thresholding algorithm
Initialize: X⁰ and parameters µ and a.
while NOT converged do
1. Y^n = B_µ(X^n) = X^n − µA*(A(X^n) − b), and compute the SVD of Y^n as Y^n = U Diag(σ) V^T;
2. Determine the value of λ_n by (3.6), then obtain the thresholding value t_n by (3.3);
3. X^{n+1} = G_{λ_n µ, a}(Y^n) = U Diag(g_{λ_n µ, a}(σ)) V^T;
then n ← n+1.
end while

3.2. Adaptive Thresholding Algorithm TS1-s2
Different from TS1-s1, where the parameter a needs to be determined manually, here at each iterative step we choose a = a_n such that the equality λ_n = a_n²/(2(a_n+1)µ) holds. The threshold value t is then given by a single formula with t = t3 = t2. Putting λ at the critical value a²/(2(a+1)µ), the parameter a is expressed as:

a = λµ + sqrt((λµ)² + 2λµ). (3.7)

The threshold value is:

t = λµ(a+1)/a = λµ/2 + sqrt((λµ)² + 2λµ)/2. (3.8)

Let X* be the TS1 optimal solution and σ* the singular values of the matrix B_µ(X*). Then we have the following inequalities:

σ*_i > t for i ∈ {1, 2, …, r}; σ*_j ≤ t for j ∈ {r+1, r+2, …, m}. (3.9)

So, for the parameter λ, we have:

2(σ*_{r+1})²/(µ(1 + 2σ*_{r+1})) ≤ λ ≤ 2(σ*_r)²/(µ(1 + 2σ*_r)).
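To make the update rule concrete, here is a minimal Python sketch of the TS1-s2 iteration specialized to matrix completion, where A is the entry-sampling operator P_Ω (so ‖A‖₂ = 1 and any µ ∈ (0,1) is admissible). This is our simplification under stated assumptions, not the paper's Matlab code: a full SVD replaces the Monte Carlo SVD, the true rank r is assumed known, and the fixed iteration count stands in for the paper's stopping criterion:

```python
import numpy as np

def tl1_h(x, lam, a):
    """Scalar TL1 minimizer branch (2.7); here lam plays the role of lambda*mu."""
    c = 1.0 - 27.0 * lam * a * (a + 1.0) / (2.0 * (a + abs(x)) ** 3)
    phi = np.arccos(np.clip(c, -1.0, 1.0))
    return np.sign(x) * (2.0 / 3.0 * (a + abs(x)) * np.cos(phi / 3.0)
                         - 2.0 * a / 3.0 + abs(x) / 3.0)

def ts1_s2(M, mask, r, mu=0.99, n_iter=1000):
    """TS1-s2 (Algorithm 2) for completion of M observed on mask (0/1 matrix)."""
    X = M * mask                            # X^0: observed entries, zeros elsewhere
    for _ in range(n_iter):
        Y = X + mu * mask * (M - X)         # B_mu(X^n) for the sampling operator
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        sr1 = s[r] if r < len(s) else 0.0   # sigma^n_{r+1}
        lam_mu = 2.0 * sr1 ** 2 / (1.0 + 2.0 * sr1)          # lambda_n * mu, cf. (3.10)
        a_n = lam_mu + np.sqrt(lam_mu ** 2 + 2.0 * lam_mu)   # (3.7) at the critical value
        t = a_n / 2.0                       # (3.8) simplifies to a_n / 2 here
        s_new = np.array([tl1_h(si, lam_mu, a_n) if si > t else 0.0 for si in s])
        X = U @ np.diag(s_new) @ Vt
    return X
```

With the adaptive choice above, the threshold t equals σ^n_{r+1}, so each step keeps (and mildly shrinks) the top r singular values of B_µ(X^n).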
Once the value of λ is determined, the parameter a is given by (3.7). In the iterative method, we approximate the optimal solution X* by X^n and further use the singular values {σ^n_i}_i of B_µ(X^n) to replace those of B_µ(X*). The resulting parameter selection is:

λ_n = 2(σ^n_{r+1})² / (µ(1 + 2σ^n_{r+1}));
a_n = λ_n µ + sqrt((λ_n µ)² + 2λ_n µ). (3.10)

In this algorithm (TS1-s2 for short), only the parameter µ is fixed, satisfying the inequality µ ∈ (0, ‖A‖₂^{-2}). The algorithm is summarized in Algorithm 2.

Algorithm 2: TS1-s2 thresholding algorithm
Initialize: X⁰ and parameter µ.
while NOT converged do
1. Y^n = X^n − µA*(A(X^n) − b), and compute the SVD of Y^n as Y^n = U Diag(σ) V^T;
2. Determine the values of λ_n and a_n by (3.10), then update the threshold value t_n = λ_n µ (a_n+1)/a_n;
3. X^{n+1} = G_{λ_n µ, a_n}(Y^n) = U Diag(g_{λ_n µ, a_n}(σ)) V^T;
then n ← n+1.
end while

4. Numerical experiments
In this section, we present numerical experiments to illustrate the effectiveness of our algorithms, the semi-adaptive TS1-s1 and the adaptive TS1-s2, compared with several state-of-the-art solvers on matrix completion problems. The comparison solvers include: LMaFit [28], FPCA [23], sIRLS-q [24], IRucLq-M [2], and LRGeomCG [34].

LMaFit solves a low-rank factorization model instead of computing SVDs, which usually take a big chunk of the computation time. Also, part of its code is written in C, as with LRGeomCG. So once this method converges, it is the fastest method among all comparisons. All other codes are implemented in the Matlab environment and involve SVDs approximated by fast Monte Carlo algorithms [, ]. FPCA is a nuclear norm minimization code, while sIRLS-q and IRucLq-M are iterative reweighted least squares algorithms for Schatten-q quasi-norm optimization. The LRGeomCG algorithm explores matrix completion based on Riemannian optimization: it minimizes the least-squares distance on the sampling set over the Riemannian manifold of fixed-rank matrices.
When the rank information is known or well approximated, this method is efficient and accurate, as shown in the experiments below, especially for standard Gaussian matrices. But a drawback of LRGeomCG is that the rank of the manifold is fixed; basically, it is hard for it to handle unknown-rank cases. In our TS1 algorithms, the MC SVD algorithm [] is implemented at each iteration step, as in FPCA. We also tried other fast approximate SVD algorithms, but MC SVD is the most suitable one, satisfying both the speed and accuracy requirements of an iterative algorithm. All our tests were performed on a Lenovo desktop: 16 GB

TS1 Matlab codes can be downloaded from
of RAM and a quad-core Intel i7-4770 processor at 3.4GHz under a 64-bit Ubuntu system. We tested and compared these solvers on low rank matrix completion problems under various conditions, including multivariate Gaussian, uniform and χ² distributions. We also tested the algorithms on grayscale image recovery from partial observations (image inpainting).

4.1. Implementation details
In the following series of tests, we generated random matrices M = M_L M_R^T ∈ R^{m×n}, where the matrices M_L and M_R are in R^{m×r} and R^{n×r} respectively. By setting the parameter r to be small, we obtain a low rank matrix M with rank at most r. After this step, we uniformly random-sampled a subset Ω of p entries from M. The following quantities help to quantify the difficulty of a recovery problem:

SR (sampling ratio): SR = p/(mn).
FR (freedom ratio): FR = r(m+n−r)/p, the number of degrees of freedom of a rank-r matrix divided by the number of measurements. According to [23], if FR > 1, there is an infinite number of matrices with rank r and the given entries.
r_m (maximum rank with which the matrix can be recovered): r_m = ⌊(m + n − sqrt((m+n)² − 4p))/2⌋, defined as the largest rank such that FR ≤ 1.

The TS1 thresholding algorithms do not guarantee a global minimum in general, similar to non-convex schemes in 1-dimensional compressed sensing problems. Indeed, we observe that TS1 thresholding with random starts may get stuck at local minima, especially when the freedom ratio FR is high and the matrix completion problem is difficult. A good initial matrix X⁰ is important for thresholding algorithms. In our numerical experiments, instead of choosing X⁰ = 0 or a random matrix, we set X⁰ equal to the matrix whose elements agree with M on Ω and are zero elsewhere. The stopping criterion is

‖X^{n+1} − X^n‖_F / max{‖X^n‖_F, 1} ≤ tol,

where X^{n+1} and X^n are numerical results from two consecutive iterative steps, and tol is a moderately small number.
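The three difficulty measures above can be computed directly; a small Python helper (ours, not part of the paper's code):

```python
import math

def sampling_ratio(p, m, n):
    """SR = p / (m n)."""
    return p / (m * n)

def freedom_ratio(r, m, n, p):
    """FR = r (m + n - r) / p: degrees of freedom over number of measurements."""
    return r * (m + n - r) / p

def max_recoverable_rank(m, n, p):
    """r_m = floor((m + n - sqrt((m+n)^2 - 4p)) / 2), the largest rank with FR <= 1."""
    return math.floor((m + n - math.sqrt((m + n) ** 2 - 4 * p)) / 2)
```

For instance, with m = n = 100 and SR = 0.4 (p = 4000), a rank-5 matrix has FR = 5·195/4000 ≈ 0.244, and r_m = 22.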
In all the following experiments, we fix tol = 10⁻⁶, with a cap on the maximum number of iteration steps. We also use the relative error

rel.err = ‖X_opt − M‖_F / ‖M‖_F (4.1)

to estimate the closeness of X_opt to M, where X_opt is the optimal solution produced by each numerical algorithm.

4.1.1. Rank estimation
For thresholding algorithms, the rank r is the most important parameter, especially for our TS1 methods, where the thresholding value t is determined based on r. If the true rank r is unknown, we adopt the rank decreasing estimation method (also called the maximum
eigengap method) as in [2, 28], thereby extending both the TS1-s1 and TS1-s2 schemes to work with an overestimated initial rank parameter K. In the following tests, unless otherwise specified, we set K = ⌊1.5r⌋.

The idea behind this estimation method is as follows. Suppose that at step n, the current matrix is X. The eigenvalues of X^T X are arranged in descending order, and λ_{r_min} ≥ λ_{r_min+1} ≥ … ≥ λ_{K+1} > 0 are the r_min-th through (K+1)-th eigenvalues of X^T X, where r_min is a manually specified minimum rank estimate. Then we compute the quotient sequence λ̄_i = λ_i/λ_{i+1}, i = r_min, …, K. Let K̂ = argmax_{r_min ≤ i ≤ K} λ̄_i, the index corresponding to the maximal element of {λ̄_i}. If the eigenvalue gap indicator

τ = λ̄_{K̂}(K − r_min + 1) / Σ_{i ≠ K̂} λ̄_i

is large enough (τ > 10), we adjust our rank estimate from K to K̂. During numerical simulations, we did this adjustment only once for each problem. In most cases, this estimation adjustment is quite satisfactory, and the adjusted estimate is very close to the true rank r.

4.1.2. Choice of a: optimal parameter testing for TS1-s1
A major difference between TS1-s1 and TS1-s2 is the choice of the parameter a, which influences the behaviour of the penalty function ρ_a(·) in TS1. When a tends to zero, the function T(X) approaches the rank. We tested TS1-s1 on small low rank matrix completion problems with different values of a, varying among {0.1, 0.5, 1, 10, 100}, both for the known-rank scheme and for the scheme with rank estimation. In these tests, M = M_L M_R^T is a random matrix, where M_L and M_R are generated under the i.i.d. standard normal distribution. The rank r of M varies up to 22, with FR changing from 0.23 to 0.78. For each value of a, we conducted 50 independent tests with different M and sample index sets Ω. We declared M to be recovered successfully if the relative error (4.1) was less than 5×10⁻³. The test results for the known-rank scheme and the rank estimation scheme are both shown in Figure 4.1. The success rate curves of the rank estimation scheme are not as clustered as those of the known-rank scheme.
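The eigengap adjustment above can be sketched as follows. This Python fragment is our reading of the procedure: the eigenvalues of X^T X are taken as the squared singular values, indices are 1-based as in the text, and the threshold τ > 10 is the value we use:

```python
import numpy as np

def estimate_rank(X, K, r_min=1, tau_thresh=10.0):
    """Maximum-eigengap rank estimate from quotients of eigenvalues of X^T X."""
    s = np.linalg.svd(X, compute_uv=False)
    lam = s ** 2                          # eigenvalues of X^T X, descending
    i = np.arange(r_min, K + 1)           # 1-based indices r_min .. K
    quot = lam[i - 1] / lam[i]            # quotients lam_i / lam_{i+1}
    k_hat = int(i[np.argmax(quot)])
    tau = quot.max() * len(quot) / (quot.sum() - quot.max())
    return k_hat if tau > tau_thresh else K   # adjust only if the gap is decisive
```

A matrix with singular values (10, 9, 8, 0.1, 0.05) and overestimate K = 4 would be adjusted to rank 3, since the quotient at i = 3 dominates.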
In order to identify the optimal parameter a clearly, we omit the curve of a = 0.1 in the right figure, as it is always below all others. The vertical red dotted line there indicates the position where FR = 0.6. It is interesting to see that for the known-rank scheme, the parameter a = 1 is the optimal strategy, which coincides with the optimal parameter setting in [3]. It is observed that when we use a thresholding algorithm under the transformed ℓ1 (TL1) or transformed Schatten-1 (TS1) quasi-norm, it is usually optimal to set a = 1 given information on the sparsity or rank. However, for the scheme with rank estimation, the situation is more complicated. Based on our tests, if FR < 0.6, it is better to set a = 100 to reach good performance. On the other hand, if FR ≥ 0.6, a = 1 is nearly the optimal choice. So for all the following tests, when we apply TS1-s1 with rank estimation, the parameter a is set to be

a = 100, if FR < 0.6; a = 1, if FR ≥ 0.6.

In applications where FR is not available, we suggest using a = 1, since its performance is also acceptable if FR < 0.6.
Fig. 4.1: Optimal parameter test for the semi-adaptive method TS1-s1. Left panel: frequency of success vs. rank when the rank is known, for a ∈ {0.1, 0.5, 1, 10, 100}. Right panel: frequency of success with rank estimation, for a ∈ {0.5, 1, 10, 100}, with FR changing from 0.23 to 0.78.

4.2. Completion of Random Matrices
The ground truth matrix M is generated as the matrix product of two low rank matrices M_L and M_R, of dimensions m×r and n×r respectively, with r ≪ min(m,n). In the following experiments, except where clearly stated, M_L and M_R are generated from the multivariate normal distribution N(µ, Σ), with µ = 1 and Σ = {(1 − cov)·1_{(i=j)} + cov}_{r×r} determined by the parameter cov. Thus the matrix M = M_L M_R^T has rank at most r.

It is known that successful recovery is related to FR: the higher FR is, the harder it is to recover the original low rank matrix. In the first batch of tests, we varied the rank r and fixed all other parameters, i.e. the matrix size (m,n) and sampling ratio (SR); thus FR changed along with the rank. It is observed that the performance of TS1-s1 and TS1-s2 is very different, due to adopting single or double thresholds. TS1-s2 uses only one (smooth) thresholding scheme with a changing parameter a. It converges faster than TS1-s1 when the rank is known; see subsection 4.2.1. On the other hand, TS1-s1 utilizes two (smooth and discontinuous) thresholding schemes and is more robust in the case of overestimated rank. TS1-s1 outperforms TS1-s2 when rank estimation is used in lieu of the true rank value; see subsection 4.2.2. The IRucL-q method is found to be very robust under varied covariance and rank estimation, yet it underperforms the TS1 methods at high FR, even with more computing time. Though the TS1 methods rely on the same rank estimation method as IRucL-q, IRucL-q achieves the best results in the absence of the true rank value. A possible reason is that in IRucL-q iterations, the singular values of the matrix X are computed more accurately.
In TS1, singular values are computed by the fast Monte Carlo method at every iteration. Due to the random sampling of the Monte Carlo method, there are larger errors, especially at the beginning stage of the iteration. The resulting matrices X^n may cause less accurate rank estimation.

4.2.1. Matrix completion with known rank
In this subsection, we implemented all six algorithms under the condition that the true rank value is given. They are TS1-s1, TS1-s2, sIRLS-q, IRucL-q, LMaFit and
LRGeomCG. We skipped FPCA since the rank is always adaptively estimated there.

Gaussian matrices with different ranks. In these tests, the matrix M = M_L M_R^T was generated under the uncorrelated normal distribution with mean µ = 1. We conducted tests both on low dimensional matrices with m = n = 100 (Table 4.1) and high dimensional matrices with m = n = 1000 (Table 4.2). Tests on non-square matrices with m ≠ n show similar results.

In Table 4.1, the rank r varies from 5 to 18, while FR increases from 0.2437 up to 0.819. For lower rank (less than 15), LMaFit is the best algorithm, with low relative errors and fast convergence speed. Part of the reason is that this method does not involve SVD (singular value decomposition) operations during iteration. LRGeomCG approaches the performance of LMaFit when r ≤ 10. However, as FR values rise above 0.7, it became hard for LMaFit to find the true low rank matrix M. Its performance is not as good as stated in the paper [34]; a possible reason is that we generate M with mean µ equal to 1, instead of 0 as in [34]. We also tested LRGeomCG with µ = 0, where it has very small relative errors and a fast convergence rate. It is also noticed that in Table 4.1, the two TS1 algorithms performed very well and remained stable for different FR values. At a similar order of accuracy, the TS1 algorithms are faster than IRucL-q.

For large size matrices (m = n = 1000), the rank r is varied from 5 to 100; see Table 4.2. The sIRLS-q and LMaFit codes only worked for lower FR. IRucL-q can still produce satisfactory results with relative errors around 10⁻³, but its iterations took a longer time. In [2], it was run on a high-performance CPU with many cores. Here we used an ordinary processor with only 4 cores and 8 threads. It is believed that with a better machine, IRucL-q will be much faster, since parallel computing is embedded in its code. As seen in the table, LRGeomCG is always convergent and achieves almost the same accuracy as TS1-s1 and TS1-s2.
However, its computation time grows fast with increasing rank. A small difference between the two TS1 algorithms begins to emerge when the matrix size is large: although both performed better than the other schemes when the rank is given, the fully adaptive TS1-s2 is a little faster than the semi-adaptive TS1-s1. We believe TS1-s1 can be improved by choosing an optimal parameter a, which is related to the matrix M (how it is generated, its inner structure, and its dimension). In TS1-s2, the value of the parameter a does not need to be manually determined.

Gaussian matrices with different covariance. In this subsection, the rank r, the sampling rate, and the freedom ratio FR are fixed. We varied the parameter cov to generate covariance matrices of the multivariate normal distribution. In Table 4.3, we chose two rank values, r = 5 and r = 8. It is harder to recover the original matrix M when it is more coherent, and IRucL-q does better in this regime: its mean computing time and relative errors are less influenced by the changing cov. Results on large matrices are shown in Table 4.4, where the TS1-s2 scheme is much better than TS1-s1, both in relative error and computing time. In the small matrix experiments, TS1-s2 is the best among the methods compared. In Table 4.4, we fixed the rank, with cov ranging over {0.1, ..., 0.7}. TS1-s2 is still satisfactory both in accuracy and speed for low covariance (i.e., cov ≤ 0.6). However, for cov = 0.7, relative errors increased from 10^-6 to around 10^-4. We also observed that the IRucL-q algorithm is very stable and robust under covariance change.

Matrices from other distributions. We also compare the algorithms on other distributions, including the (0,1) uniform distribution and the Chi-square distribution with k
Table 4.1: Comparison of TS1-s1, TS1-s2, sIRLS-q, IRucL-q, LMaFit and LRGeomCG on recovery of uncorrelated multivariate Gaussian matrices at known rank, m = n = 100, SR = 0.4, with stopping criterion tol = 10^-6. For each rank and FR value, the relative error (rel.err) and computing time of each algorithm are reported. Notes: 1. The sIRLS-q iterations did not converge when rank > 14 and FR ≥ 0.65; comparison is skipped over this range. Results for ranks 1, 2, 3 are skipped as they differ little from rank 4. Similar rank samplings occur in Tables 4.2-4.3 and 4.5-4.6. 2. The matrix M is generated from a multivariate normal distribution with mean µ = 1, instead of 0.

= 1 (degree of freedom). All other parameters are the same as in Table 4.1. The results are displayed in Table 4.5 (uniform distribution) and Table 4.6 (Chi-square distribution). Only partial numerical results are shown here, with rank r = 7, 8, 9, 10, 14, 15. From these two tables, the two TS1 algorithms have satisfactory relative errors and stable performance, on par with IRucL-q. For these two non-Gaussian distributions, it becomes harder for LMaFit and LRGeomCG to successfully recover the low rank matrix, especially when the rank r > 10.

Matrix completion with rank estimation. We conducted numerical experiments on the rank estimation schemes. The initial rank estimate is given as 1.5r, a commonly used overestimate. FPCA [23] is included for comparison, while LRGeomCG and sIRLS-q are excluded. FPCA is a fast and robust iterative algorithm based on nuclear norm regularization. We considered two classes of matrices: uncorrelated Gaussian matrices with varying rank, and correlated Gaussian matrices with fixed rank. The results are
Table 4.2: Numerical experiments on recovery of uncorrelated multivariate Gaussian matrices at known rank, m = n = 1000, SR = 0.3. For each rank and FR value, the relative error and computing time of TS1-s1, TS1-s2, sIRLS-q, IRucL-q, LMaFit and LRGeomCG are reported.

Table 4.3: Numerical experiments on multivariate Gaussian matrices with varying covariance at known rank, m = n = 100, SR = 0.4. For each rank and cov value, the relative error and computing time of each algorithm are reported.

shown in Table 4.7 and Table 4.8. It is interesting that under rank estimation the semi-adaptive TS1-s1 fared much better than TS1-s2. In the low rank and low covariance cases, TS1-s1 is the best in terms of accuracy and computing time among the methods compared. However, in the regime of high covariance and rank, it became harder for the TS1 methods to perform an efficient recovery; IRucL-q did the best, being both stable and robust. In the most difficult case, at rank r = 15 and FR approximately equal to 0.7, IRucL-q can still obtain an accurate result with relative error around 10^-2.

Image inpainting.
Table 4.4: Numerical experiments on multivariate Gaussian matrices with varying covariance at known rank, m = n = 1000, SR = 0.4. For each rank and cov value, the relative error and computing time of each algorithm are reported.

Table 4.5: Comparison on random matrices generated from the (0,1) uniform distribution. The rank r is given, m = n = 100, SR = 0.4, with stopping criterion tol = 10^-6.

As in [2, 28], we conducted grayscale image inpainting experiments to recover low rank images from partial observations, and compared with the IRucL-q and LMaFit algorithms. The boat image (see Figure 4.2) is used to produce the ground truth as in [2], with rank equal to 4. Different levels of noisy disturbances
Table 4.6: Comparison on random matrices generated from the Chi-square distribution with k = 1 (degree of freedom). The rank r is given, m = n = 100, SR = 0.4, with stopping criterion tol = 10^-6.

Table 4.7: Numerical experiments for low rank matrix completion algorithms under rank estimation. The true matrices are uncorrelated multivariate Gaussian, m = n = 100, SR = 0.4. For each rank and FR value, the relative error and computing time of TS1-s1, TS1-s2, FPCA, IRucL-q and LMaFit are reported.

Table 4.8: Numerical experiments on low rank matrix completion algorithms under rank estimation. The true matrices are multivariate Gaussian with different covariance, m = n = 100, and SR = 0.4. For each rank and cov value, the relative error and computing time of each algorithm are reported.
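The rank estimation experiments start from the overestimate 1.5r; a common way to refine such an overestimate during iteration is a largest-gap heuristic on the singular values of the current iterate. The sketch below illustrates that heuristic under our own assumptions (function name, threshold handling); it is not the exact rule used by TS1-s1/TS1-s2.

```python
import numpy as np

def estimate_rank(X, r_max):
    """Estimate the rank of X by the largest relative gap among its
    r_max leading singular values (a common heuristic, not the
    paper's exact procedure)."""
    s = np.linalg.svd(X, compute_uv=False)[:r_max]
    # ratios of consecutive singular values; a large drop marks the rank
    ratios = s[:-1] / np.maximum(s[1:], 1e-12)
    return int(np.argmax(ratios)) + 1

# sanity check on a slightly noisy rank-5 matrix, probing with the
# kind of overestimate (here r_max = 8 > 1.5 * 5) used in the experiments
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 100))
A += 1e-6 * rng.standard_normal((100, 100))
print(estimate_rank(A, r_max=8))
```

When the trailing singular values are dominated by sampling or Monte Carlo noise rather than a clean gap, this kind of heuristic can misestimate the rank, which is consistent with the harder high-covariance cases reported above.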
Fig. 4.2: Image inpainting experiments with SR = 0.3: original image, sample image with noise, and recoveries by TS1-s2, IRucL-q, LMaFit-inc and LMaFit-fix.

are added to the original image M_o by the formula

M = M_o + σ (‖M_o‖_F / ‖ε‖_F) ε,

where the matrix ε has i.i.d. standard Gaussian entries. Here we only applied the scheme TS1-s2. For IRucL-q, we followed the setting in [2] by choosing α = 0.9 and λ proportional to σ. Both the fixed rank (LMaFit-fix) and increasing rank (LMaFit-inc) schemes are implemented for LMaFit. We took the fixed rank r = 4 for TS1-s2, LMaFit-fix and IRucL-q. Computational results are in Table 4.9, with sampling ratios varying among {0.3, 0.4, 0.5} and noise strength σ in {0.01, 0.05, 0.1, 0.15, 0.2, 0.25}. The performance of each algorithm is measured in CPU time, PSNR (peak signal-to-noise ratio), and MSE (mean squared error). Here we focus mainly on the PSNR values and placed the top two in bold for each experiment. We observed that IRucL-q and TS1-s2 fared about the same, and either one is better than LMaFit in most cases.

5. Conclusion. We presented the transformed Schatten-1 penalty (TS1) and derived the closed form thresholding representation formula for global minimizers of the TS1 regularized rank minimization problem. We studied two adaptive iterative TS1 schemes (TS1-s1 and TS1-s2) computationally for matrix completion, in comparison with several state-of-the-art methods, in particular IRucL-q. In low rank matrix recovery under known rank, TS1-s2 performs the best in accuracy and computational speed. In low rank matrix recovery under rank estimation, TS1-s1 is almost on par with IRucL-q except when both the matrix covariance and rank rise to a certain level. In
future work, we shall study rank estimation techniques to further improve TS1-s1 and explore other applications of the TS1 penalty.

Table 4.9: Numerical experiments on boat image inpainting with the algorithms TS1-s2, IRucL-q and LMaFit under different sampling ratios and noise levels. For each (SR, σ) pair, the CPU time, PSNR and MSE of TS1-s2, IRucL-q, LMaFit-inc and LMaFit-fix are reported.

Acknowledgments. The authors would like to thank Prof. Wotao Yin for his helpful suggestions on low rank matrix completion methods and numerical experiments. We also thank the anonymous referees for their constructive comments.

Appendix A. Proof of the Ky Fan k-norm inequality.
Proof. Since X = U \mathrm{Diag}(\sigma) V^T, the (j,k)-th entry of the matrix X is
X_{j,k} = \sum_{i=1}^m \sigma_i U_{j,i} V_{k,i}.
Thus we have
\mathrm{tr}_k(X) = \sum_{j=1}^k X_{j,j} = \sum_{j=1}^k \sum_{i=1}^m \sigma_i U_{j,i} V_{j,i} = \sum_{i=1}^m \sigma_i w_i^{(k)},   (1.1)
where the weight w_i^{(k)} for the singular value \sigma_i is defined as
w_i^{(k)} = \sum_{j=1}^k U_{j,i} V_{j,i},  i = 1, 2, ..., m.   (1.2)
Notice that
|w_i^{(k)}| \le \sum_{j=1}^k |U_{j,i}| |V_{j,i}| \le \|U(:,i)\|_2 \|V(:,i)\|_2 = 1,   (1.3)
where U(:,i) and V(:,i) are the i-th column vectors of U and V. Also, for the weights \{w_i^{(k)}\},
\sum_{i=1}^m |w_i^{(k)}| \le \sum_{i=1}^m \sum_{j=1}^k |U_{j,i}| |V_{j,i}| = \sum_{j=1}^k \sum_{i=1}^m |U_{j,i}| |V_{j,i}| \le \sum_{j=1}^k \|U(j,:)\|_2 \|V(j,:)\|_2 = k,   (1.4)
where U(j,:) and V(j,:) are the j-th row vectors of U and V, respectively. All m weights are bounded by 1 in absolute value, with absolute sum at most k. Since the \sigma_i are in decreasing order, the sum \sum_i \sigma_i |w_i^{(k)}| is maximized by placing weight 1 on the k largest singular values. By equation (1.1), we therefore have, for all k = 1, 2, ..., m,
\mathrm{tr}_k(X) = \sum_{i=1}^m \sigma_i w_i^{(k)} \le \sum_{i=1}^k \sigma_i = \mathrm{tr}_k(D) = \|X\|_{F_k}.
Next, we prove the second part of the lemma, the equality condition, by mathematical induction. Suppose that for a given matrix X, \mathrm{tr}_k(X) = \mathrm{tr}_k(D) for k = 1, ..., m. It is convenient to define X_i = \sigma_i U_i V_i^T, where V_i (U_i) is the i-th column vector of V (U). Then X decomposes as the sum of r rank-1 matrices, X = \sum_{i=1}^r X_i. When k = 1, from \mathrm{tr}_1(X) = \mathrm{tr}_1(D) and the proof above, we know that w_1^{(1)} = 1 and w_i^{(1)} = 0 for i = 2, ..., m. By the definition of the weights in (1.2), w_1^{(1)} = U_{1,1} V_{1,1} = 1. Since U and V are both unitary matrices, we have U_{1,1} = V_{1,1} = \pm 1 and U_{1,j} = U_{j,1} = V_{1,j} = V_{j,1} = 0 for j \ne 1. Thus U_1 (V_1) is, up to sign, the first standard basis vector of R^m (R^n), and the matrix X_1 = \sigma_1 U_1 V_1^T is diagonal: its only nonzero entry is \sigma_1 in the (1,1) position.
For the induction step, suppose that for every index i with 1 \le i \le k-1,
U_{i,i} = V_{i,i} = \pm 1;  U_{i,j} = U_{j,i} = V_{i,j} = V_{j,i} = 0 for any index j \ne i.   (1.5)
Then the matrix X_i = \sigma_i U_i V_i^T, for i \le k-1, is diagonal, with \sigma_i in the (i,i) position as its only nonzero entry.
Under these conditions, consider the index i = k. Clearly \mathrm{tr}_k(X) = \mathrm{tr}_k(D). As before, by formula (1.1) and inequalities (1.3) and (1.4), it holds that w_i^{(k)} = 1 for i = 1, ..., k, and w_i^{(k)} = 0 for i > k. Furthermore, by definition (1.2),
w_k^{(k)} = \sum_{j=1}^k U_{j,k} V_{j,k} = U_{k,k} V_{k,k} = 1,
since U_{j,k} = V_{j,k} = 0 for every index j < k by the assumption (1.5). Thus the vectors U_k and V_k are also standard basis vectors, up to the sign \pm 1 of their k-th entry, and X_k = \sigma_k U_k V_k^T is the m \times n diagonal matrix whose only nonzero entry is \sigma_k in the (k,k) position.
By induction, all the matrices \{X_i\}, i = 1, ..., r, are diagonal, so the original matrix X = \sum_{i=1}^r X_i equals the diagonal matrix D. The other direction is obvious. This finishes the proof.

REFERENCES
[1] T. Blumensath, Accelerated iterative hard thresholding, Signal Processing, 92(3), 2012.
[2] J. Cai, E. Candès, and Z. Shen, A singular value thresholding algorithm for matrix completion, SIAM Journal on Optimization, 20(4), pp. 1956-1982, 2010.
[3] E. Candès and T. Tao, The power of convex relaxation: Near-optimal matrix completion, IEEE Transactions on Information Theory, 56(5), pp. 2053-2080, 2010.
[4] E. Candès and B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math., 9, pp. 717-772, 2009.
[5] W. Cao, J. Sun, and Z. Xu, Fast image deconvolution using closed-form thresholding formulas of L_q (q = 1/2, 2/3) regularization, Journal of Visual Communication and Image Representation, 24(1), pp. 31-41, 2013.
[6] Y. Chen, A. Jalali, S. Sanghavi, and C. Caramanis, Low-rank matrix recovery from errors and erasures, IEEE Transactions on Information Theory, 59(7), 2013.
[7] I. Daubechies, R. DeVore, M. Fornasier, and C. Gunturk, Iteratively reweighted least squares minimization for sparse recovery, Comm. Pure Applied Math., 63(1), pp. 1-38, 2010.
[8] I. Daubechies, M. Defrise, and C. De Mol,
An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Communications on Pure and Applied Mathematics, 57(11), pp. 1413-1457, 2004.
A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations Chuangchuang Sun and Ran Dai Abstract This paper proposes a customized Alternating Direction Method of Multipliers
More informationLinear and Multilinear Algebra. Linear maps preserving rank of tensor products of matrices
Linear maps preserving rank of tensor products of matrices Journal: Manuscript ID: GLMA-0-0 Manuscript Type: Original Article Date Submitted by the Author: -Aug-0 Complete List of Authors: Zheng, Baodong;
More informationSupplementary Materials for Riemannian Pursuit for Big Matrix Recovery
Supplementary Materials for Riemannian Pursuit for Big Matrix Recovery Mingkui Tan, School of omputer Science, The University of Adelaide, Australia Ivor W. Tsang, IS, University of Technology Sydney,
More informationFixed point and Bregman iterative methods for matrix rank minimization
Math. Program., Ser. A manuscript No. (will be inserted by the editor) Shiqian Ma Donald Goldfarb Lifeng Chen Fixed point and Bregman iterative methods for matrix rank minimization October 27, 2008 Abstract
More informationSparse and low-rank decomposition for big data systems via smoothed Riemannian optimization
Sparse and low-rank decomposition for big data systems via smoothed Riemannian optimization Yuanming Shi ShanghaiTech University, Shanghai, China shiym@shanghaitech.edu.cn Bamdev Mishra Amazon Development
More informationCompressed sensing. Or: the equation Ax = b, revisited. Terence Tao. Mahler Lecture Series. University of California, Los Angeles
Or: the equation Ax = b, revisited University of California, Los Angeles Mahler Lecture Series Acquiring signals Many types of real-world signals (e.g. sound, images, video) can be viewed as an n-dimensional
More informationUser s Guide for LMaFit: Low-rank Matrix Fitting
User s Guide for LMaFit: Low-rank Matrix Fitting Yin Zhang Department of CAAM Rice University, Houston, Texas, 77005 (CAAM Technical Report TR09-28) (Versions beta-1: August 23, 2009) Abstract This User
More informationLecture 5 : Projections
Lecture 5 : Projections EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Up until now, we have seen convergence rates of unconstrained gradient descent. Now, we consider a constrained minimization
More informationLecture 2: Linear Algebra Review
EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1
More informationEE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2
EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 2 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory April 5, 2012 Andre Tkacenko
More informationUses of duality. Geoff Gordon & Ryan Tibshirani Optimization /
Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear
More informationMatrix Completion from a Few Entries
Matrix Completion from a Few Entries Raghunandan H. Keshavan and Sewoong Oh EE Department Stanford University, Stanford, CA 9434 Andrea Montanari EE and Statistics Departments Stanford University, Stanford,
More informationGauge optimization and duality
1 / 54 Gauge optimization and duality Junfeng Yang Department of Mathematics Nanjing University Joint with Shiqian Ma, CUHK September, 2015 2 / 54 Outline Introduction Duality Lagrange duality Fenchel
More informationAbstract. 1. Introduction
Journal of Computational Mathematics Vol.28, No.2, 2010, 273 288. http://www.global-sci.org/jcm doi:10.4208/jcm.2009.10-m2870 UNIFORM SUPERCONVERGENCE OF GALERKIN METHODS FOR SINGULARLY PERTURBED PROBLEMS
More informationPenalty Decomposition Methods for Rank and l 0 -Norm Minimization
IPAM, October 12, 2010 p. 1/54 Penalty Decomposition Methods for Rank and l 0 -Norm Minimization Zhaosong Lu Simon Fraser University Joint work with Yong Zhang (Simon Fraser) IPAM, October 12, 2010 p.
More informationSTAT 309: MATHEMATICAL COMPUTATIONS I FALL 2013 PROBLEM SET 2
STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2013 PROBLEM SET 2 1. You are not allowed to use the svd for this problem, i.e. no arguments should depend on the svd of A or A. Let W be a subspace of C n. The
More informationLinear Regression and Its Applications
Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start
More informationElaine T. Hale, Wotao Yin, Yin Zhang
, Wotao Yin, Yin Zhang Department of Computational and Applied Mathematics Rice University McMaster University, ICCOPT II-MOPTA 2007 August 13, 2007 1 with Noise 2 3 4 1 with Noise 2 3 4 1 with Noise 2
More informationA New Estimate of Restricted Isometry Constants for Sparse Solutions
A New Estimate of Restricted Isometry Constants for Sparse Solutions Ming-Jun Lai and Louis Y. Liu January 12, 211 Abstract We show that as long as the restricted isometry constant δ 2k < 1/2, there exist
More informationLecture 25: November 27
10-725: Optimization Fall 2012 Lecture 25: November 27 Lecturer: Ryan Tibshirani Scribes: Matt Wytock, Supreeth Achar Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have
More informationKey Words: low-rank matrix recovery, iteratively reweighted least squares, matrix completion.
LOW-RANK MATRIX RECOVERY VIA ITERATIVELY REWEIGHTED LEAST SQUARES MINIMIZATION MASSIMO FORNASIER, HOLGER RAUHUT, AND RACHEL WARD Abstract. We present and analyze an efficient implementation of an iteratively
More informationNonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy
Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy Caroline Chaux Joint work with X. Vu, N. Thirion-Moreau and S. Maire (LSIS, Toulon) Aix-Marseille
More informationA MODIFIED TSVD METHOD FOR DISCRETE ILL-POSED PROBLEMS
A MODIFIED TSVD METHOD FOR DISCRETE ILL-POSED PROBLEMS SILVIA NOSCHESE AND LOTHAR REICHEL Abstract. Truncated singular value decomposition (TSVD) is a popular method for solving linear discrete ill-posed
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationTHE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR
THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR 1. Definition Existence Theorem 1. Assume that A R m n. Then there exist orthogonal matrices U R m m V R n n, values σ 1 σ 2... σ p 0 with p = min{m, n},
More informationHomework 4. Convex Optimization /36-725
Homework 4 Convex Optimization 10-725/36-725 Due Friday November 4 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)
More informationLow rank matrix completion by alternating steepest descent methods
SIAM J. IMAGING SCIENCES Vol. xx, pp. x c xxxx Society for Industrial and Applied Mathematics x x Low rank matrix completion by alternating steepest descent methods Jared Tanner and Ke Wei Abstract. Matrix
More informationGuaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization
Forty-Fifth Annual Allerton Conference Allerton House, UIUC, Illinois, USA September 26-28, 27 WeA3.2 Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization Benjamin
More informationLinear Algebra for Machine Learning. Sargur N. Srihari
Linear Algebra for Machine Learning Sargur N. srihari@cedar.buffalo.edu 1 Overview Linear Algebra is based on continuous math rather than discrete math Computer scientists have little experience with it
More information