Uniqueness Conditions for a Class of $\ell_0$-Minimization Problems


Chunlei Xu and Yun-Bin Zhao

October, 03, Revised January 04

Chunlei Xu: School of Mathematics, University of Birmingham, Edgbaston, B5 TT, United Kingdom (email: cxx08@bham.ac.uk).
Yun-Bin Zhao: School of Mathematics, University of Birmingham, Edgbaston, B5 TT, United Kingdom (email: y.zhao.@bham.ac.uk). This author was supported by the Engineering and Physical Sciences Research Council (EPSRC) under the grant #EP/K00946X/, and partially supported by the National Natural Science Foundation of China (NSFC) under the grant #3006.

Abstract. We consider a class of $\ell_0$-minimization problems, which is to search for the partial sparsest solution to an underdetermined linear system with additional constraints. We introduce several concepts, including the $\ell_p$-induced quasi-norm ($0 < p < 1$), maximal scaled spark and scaled mutual coherence, to develop several new uniqueness conditions for the partial sparsest solution to this class of $\ell_0$-minimization problems. Some of these uniqueness criteria are further improved through concepts such as the maximal scaled (sub)coherence rank.

Key words. $\ell_0$-minimization, uniqueness condition, $\ell_p$-induced quasi-norm, scaled spark, scaled mutual (sub)coherence, maximal scaled coherence rank.

1 Introduction

Sparse representation, using only a few elementary atoms from a dictionary to represent data (signals, images, etc.), has been widely used in engineering and applied sciences recently (see, e.g., [5, 7, 4,, 4, 9, 8, 0,, 0, ] and the references therein). For a vector $x$, let $\|x\|_0$ denote the $\ell_0$-norm of $x$, namely, the number of nonzero components of $x$. In this paper, we consider the following model for the sparse representation of the vector $b \in \mathbb{R}^m$:

$$\min\left\{ \|x\|_0 : M\begin{pmatrix} x \\ y \end{pmatrix} = b,\ y \in C \right\}, \tag{1}$$

where $M = [A_1, A_2] \in \mathbb{R}^{m \times (n_1+n_2)}$, $m < n_1$, is a concatenation of $A_1 \in \mathbb{R}^{m \times n_1}$ and $A_2 \in \mathbb{R}^{m \times n_2}$, and $C$ is a convex set in $\mathbb{R}^{n_2}$ which can be interpreted as certain constraints on the variable $y \in \mathbb{R}^{n_2}$. Throughout the paper, we assume that the null space of $A_2^T$ is nonzero, namely, $N(A_2^T) \neq \{0\}$. The solution to the system

$$M\begin{pmatrix} x \\ y \end{pmatrix} = b,\ y \in C, \tag{2}$$

includes two parts: $x \in \mathbb{R}^{n_1}$ and $y \in \mathbb{R}^{n_2}$. The $\ell_0$-minimization problem (1) seeks a solution $z = (x, y)$ to the system (2) such that the x-part is the sparsest one, with no requirement on the sparsity of the y-part of the solution. Such a sparsest $x$ can be called the sparsest x-part solution to the system (2). The $\ell_0$-minimization problem (1) is NP-hard (see Natarajan [16]), and can be called a partial $\ell_0$-minimization problem, or partial sparsity-seeking problem. Problem (1) is closely related to partial sparsity recovery theory (see, e.g., Bandeira et al. [1], and Jacques [13]), partial imaging reconstruction (Vaswani and Lu [18]), and sparse Hessian recovery (Bandeira et al. [2]).

Problem (1) is general enough to include some important sparsity-seeking problems as special cases. For instance, the normal $\ell_0$-minimization

$$\min\{ \|x\|_0 : Ax = b \} \tag{3}$$

is an important special case of (1). In fact, Problem (1) reduces to (3) when $A_2 = 0$. The uniqueness of the solution to the standard $\ell_0$-minimization (3) has been widely investigated, and has been established by using the so-called spark of a matrix $A$ (see Donoho and Elad [6]), denoted by $\mathrm{Spark}(A)$, which is the smallest number of columns of the matrix that are linearly dependent. It was shown in [6, 5, 4] that for a given linear system $Ax = b$, if there exists a solution $x$ satisfying $\|x\|_0 < \mathrm{Spark}(A)/2$, then $x$ is necessarily the unique sparsest solution to (3). Since the computation of the spark is generally intractable, some other verifiable conditions have been developed in the literature, for instance, by the mutual coherence [7], the largest absolute value of inner products between different normalized columns of $A$, i.e.,

$$\mu(A) = \max_{1 \le i < j \le n} \frac{|\langle a_i, a_j \rangle|}{\|a_i\|_2 \, \|a_j\|_2},$$

where $a_i$ is the $i$-th column of $A$, $i = 1, \ldots, n$.
The mutual coherence gives a computable lower bound for the spark [6], i.e.,

$$\mathrm{Spark}(A) \ge 1 + \frac{1}{\mu(A)},$$

which yields the following uniqueness condition (see, e.g., [6, 5, 4]): For a given linear system $Ax = b$, if there exists a solution $x$ satisfying $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$, then $x$ is necessarily the unique sparsest solution to (3). However, the mutual coherence condition might be very restrictive in some situations, and may fail to provide a good lower bound for the spark. In order to improve the lower bound for the spark, Zhao [19] has introduced the concepts of coherence rank, sub-mutual coherence and scaled mutual coherence, and has developed several new and improved sufficient uniqueness conditions for the solution to the $\ell_0$-minimization (3).

So far, the uniqueness of the sparsest x-part solution to the general sparsity model (1) has not been well developed. The main purpose of this paper is to study such uniqueness and to establish some criteria under which the problem (1) has a unique sparsest x-part solution. These results will be established through some new concepts such as the $\ell_p$-induced quasi-norm, the (maximal) scaled spark, coherence, and coherence rank associated with a pair of matrices ($A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$). These concepts can be seen as a generalization of those in [19].

This paper is organized as follows. In Section 2, we develop sufficient conditions for the uniqueness of x-part solutions to the $\ell_0$-minimization problem (1) in terms of the $\ell_p$-induced quasi-norm, and such concepts as maximal scaled spark, and minimal or maximal scaled mutual coherence. A further improvement of these conditions is provided in Section 3.
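The classical quantities recalled in this introduction can be computed directly for a tiny matrix. The following is a minimal sketch, not part of the original paper: the helper names `rank`, `spark` and `mutual_coherence` are hypothetical, the spark is found by brute force over column subsets (exponential cost, so only suitable for toy examples), and the convention of returning n + 1 when all columns are independent is an assumption made here.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def spark(A):
    """Smallest number of linearly dependent columns of A.

    Assumed convention: returns n + 1 if all n columns are independent.
    """
    n = len(A[0])
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            sub = [[row[j] for j in cols] for row in A]
            if rank(sub) < k:
                return k
    return n + 1


def mutual_coherence(A):
    """Largest |<a_i, a_j>| / (||a_i||_2 ||a_j||_2) over pairs i < j.

    Assumes A has no zero column.
    """
    n = len(A[0])
    cols = [[row[j] for row in A] for j in range(n)]
    norms = [sqrt(sum(v * v for v in c)) for c in cols]
    return max(
        abs(sum(a * b for a, b in zip(cols[i], cols[j]))) / (norms[i] * norms[j])
        for i, j in combinations(range(n), 2)
    )


# Toy example: columns e1, e2 and (1, 1) in R^2.
A = [[1, 0, 1],
     [0, 1, 1]]
spark_A = spark(A)
mu_A = mutual_coherence(A)
print("Spark(A) =", spark_A)
print("mu(A) =", mu_A, " coherence lower bound 1 + 1/mu =", 1 + 1 / mu_A)
print("sparsity thresholds:", spark_A / 2, (1 + 1 / mu_A) / 2)
```

In this example the coherence-based threshold is smaller than the spark-based one, which illustrates why the coherence condition is easier to verify but more restrictive.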

2 Uniqueness criteria for the $\ell_0$-minimization problem (1)

The uniqueness of the sparsest x-part solution to the system (2) can be developed through different concepts and properties of matrices. One such important concept is the spark together with its variants, which provides a connection between the null space of a matrix and the sparsest solution to linear equations. In this section, we show that the method used for developing uniqueness claims for the $\ell_0$-minimization (3) can be used to develop similar claims for the system (2), although the extra variable $y$ in the system $M\begin{pmatrix} x \\ y \end{pmatrix} = b$ increases the complexity of the problem (1). Our first sufficient uniqueness condition for the sparsest x-part solution to (2) can be developed by using the so-called $\ell_p$-induced quasi-norm, as shown in the following subsection.

2.1 An $\ell_p$-induced quasi-norm-based uniqueness condition

For any $0 < p < \infty$ and a vector $x \in \mathbb{R}^n$, let $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$. When $p \in (0, 1)$, $\|x\|_p$ is called the $\ell_p$ quasi-norm of $x$. We now introduce the following concept.

Definition 2.1 For any given matrix $A \in \mathbb{R}^{m \times n}$, when $0 < p < 1$, the $\ell_p$-induced quasi-norm of $A$, denoted by $\psi_p(A)$, is defined by

$$\psi_p(A) = \sup_{0 \neq z \in \mathbb{R}^n} \frac{\|Az\|_p^p}{\|z\|_p^p} = \sup_{\|z\|_p^p \le 1} \|Az\|_p^p. \tag{4}$$

Clearly, for a fixed $p \in (0, 1)$, $\psi_p(A)$ satisfies the following properties: $\psi_p(A) \ge 0$, $\psi_p(A) > 0$ for any $A \neq 0$, and $\psi_p(A + B) \le \psi_p(A) + \psi_p(B)$ for any matrices $A$, $B$ with the same dimensions. It is worth mentioning that the triangle inequality above follows from the property $\|x + y\|_p^p \le \|x\|_p^p + \|y\|_p^p$ (see, e.g., []). We see that for $\alpha > 0$, $\psi_p(\alpha A) \neq \alpha \psi_p(A)$ in general. So $\psi_p(A)$ is a quasi-norm of $A$.

Note that, for every entry $z_i$, as $p$ tends to zero, $|z_i|^p$ approaches 1 for $z_i \neq 0$ and 0 for $z_i = 0$. Thus for any given $z \in \mathbb{R}^n$, we have

$$\lim_{p \to 0} \|z\|_p^p = \lim_{p \to 0} \sum_{i=1}^n |z_i|^p = \|z\|_0, \tag{5}$$

which indicates that the $\ell_0$-norm $\|z\|_0$ can be approximated by $\|z\|_p^p$ with sufficiently small $p \in (0, 1)$. Note that for a given matrix $A$, $\psi_p(A)$ is continuous with respect to $p \in (0, 1)$. Thus there might exist a positive number $\eta$ such that $\eta = \lim_{p \to 0^+} \psi_p(A)$. We assume that the following property holds for the matrix $M = [A_1, A_2]$ when $p$ tends to 0.

Assumption 2.2 Assume that the matrices $A_1$, $A_2$ satisfy the following properties:
(i) $A_2^T A_2$ is a nonsingular matrix, and
(ii) there exists a positive constant, denoted by $\psi_0(A_2^\dagger A_1)$, such that $\psi_0(A_2^\dagger A_1) = \lim_{p \to 0^+} \psi_p(A_2^\dagger A_1)$, where $A_2^\dagger = (A_2^T A_2)^{-1} A_2^T$ is the pseudo-inverse of $A_2$.
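The limit (5) and the inequality $\|x + y\|_p^p \le \|x\|_p^p + \|y\|_p^p$ used above are easy to observe numerically. The snippet below is a small illustrative sketch only; the function names `lp_p` and `l0` and the test vectors are assumptions introduced here, not notation from the paper.

```python
def lp_p(z, p):
    """Return ||z||_p^p = sum_i |z_i|^p for p > 0."""
    return sum(abs(v) ** p for v in z)


def l0(z):
    """Number of nonzero entries of z."""
    return sum(1 for v in z if v != 0)


# ||z||_p^p approaches ||z||_0 as p decreases to 0, as in (5).
z = [3.0, 0.0, -0.5, 0.0, 2.0]
for p in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(f"p = {p:<6} ||z||_p^p = {lp_p(z, p):.6f}   ||z||_0 = {l0(z)}")

# The p-th power of the quasi-norm is subadditive for p in (0, 1).
x = [1.0, -2.0, 0.0]
y = [0.5, 2.0, 4.0]
p = 0.5
s = [a + b for a, b in zip(x, y)]
print(lp_p(s, p), "<=", lp_p(x, p) + lp_p(y, p))
```

This only illustrates the vector-level facts; the induced quantity $\psi_p(A)$ itself is a supremum and is not computed here.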

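Under Assumption 2.2 and by (4) and (5), we immediately have the following inequality:

$$\|(A_2^\dagger A_1) z\|_0 = \lim_{p \to 0^+} \|(A_2^\dagger A_1) z\|_p^p \le \lim_{p \to 0^+} \psi_p(A_2^\dagger A_1) \|z\|_p^p = \psi_0(A_2^\dagger A_1) \|z\|_0 \tag{6}$$

for any $z \in \mathbb{R}^{n_1}$. We now state a uniqueness condition for Problem (1) under Assumption 2.2.

Theorem 2.3 Consider the system (2) with $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$, and $m < n_1$. Let Assumption 2.2 be satisfied. If there exists a solution $(x, y)$ to the system (2) satisfying

$$\|x\|_0 < \frac{\mathrm{Spark}(M)}{2\left(1 + \psi_0(A_2^\dagger A_1)\right)}, \tag{7}$$

then $x$ must be the unique sparsest x-part solution to the system (2).

Proof. Assume the contrary, that there is another solution $(x^{(1)}, y^{(1)})$ to the system (2) such that $x^{(1)}$ is the sparsest x-part, $x^{(1)} \neq x$, and $\|x^{(1)}\|_0 \le \|x\|_0 < \frac{\mathrm{Spark}(M)}{2(1 + \psi_0(A_2^\dagger A_1))}$. Since both $(x, y)$ and $(x^{(1)}, y^{(1)})$ are solutions to the linear system $M\begin{pmatrix} x \\ y \end{pmatrix} = b$, we have

$$A_1 (x - x^{(1)}) + A_2 (y - y^{(1)}) = 0. \tag{8}$$

Since $A_2^T A_2$ is nonsingular, $y - y^{(1)}$ is uniquely determined by $x - x^{(1)}$, i.e.,

$$y^{(1)} - y = A_2^\dagger A_1 (x - x^{(1)}), \tag{9}$$

where $A_2^\dagger = (A_2^T A_2)^{-1} A_2^T$ is the pseudo-inverse of $A_2$. From (8), we know that $\begin{pmatrix} x - x^{(1)} \\ y - y^{(1)} \end{pmatrix}$ is a nonzero vector in the null space of the matrix $M = [A_1, A_2]$. This implies that $\mathrm{Spark}(M)$ is a lower bound for $\left\| \begin{pmatrix} x - x^{(1)} \\ y - y^{(1)} \end{pmatrix} \right\|_0$, i.e.,

$$\|x - x^{(1)}\|_0 + \|y - y^{(1)}\|_0 = \left\| \begin{pmatrix} x - x^{(1)} \\ y - y^{(1)} \end{pmatrix} \right\|_0 \ge \mathrm{Spark}(M). \tag{10}$$

Substituting (9) into (10) leads to

$$\|x - x^{(1)}\|_0 + \|A_2^\dagger A_1 (x - x^{(1)})\|_0 \ge \mathrm{Spark}(M). \tag{11}$$

Under Assumption 2.2, by (6) one has

$$\|A_2^\dagger A_1 (x^{(1)} - x)\|_0 \le \psi_0(A_2^\dagger A_1) \|x - x^{(1)}\|_0.$$

Merging (11) and the inequality above leads to

$$\left(1 + \psi_0(A_2^\dagger A_1)\right) \|x - x^{(1)}\|_0 \ge \mathrm{Spark}(M).$$

Therefore,

$$2\|x\|_0 \ge \|x^{(1)}\|_0 + \|x\|_0 \ge \|x - x^{(1)}\|_0 \ge \frac{\mathrm{Spark}(M)}{1 + \psi_0(A_2^\dagger A_1)}.$$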

Thus $\|x\|_0 \ge \frac{\mathrm{Spark}(M)}{2(1 + \psi_0(A_2^\dagger A_1))}$, contradicting (7). Therefore $x$ must be the unique sparsest x-part solution to the system (2).

The above result provides a new uniqueness criterion for the problem (1) by using the $\ell_p$-induced quasi-norm. However, the above analysis relies on the nonsingularity of $A_2^T A_2$, which might not be satisfied in more general situations. Thus we need to develop some other uniqueness criteria for the problem from other perspectives.

2.2 Uniqueness based on scaled spark and scaled mutual coherence

In this section, we develop uniqueness conditions for problem (1) by using the so-called scaled spark and scaled mutual coherence.

Lemma 2.4 ([4]) For any matrix $M$ and any scaling matrix $W$, one has $\mathrm{Spark}(WM) \ge 1 + \frac{1}{\mu(WM)}$.

We use $N(\cdot)$ to denote the null space of a matrix. Our first uniqueness criterion based on the scaled spark is given as follows.

Theorem 2.5 Consider the system (2) where $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$ and $m < n_1$. If there exists a solution $(x, y)$ to the system (2) satisfying

$$\|x\|_0 < \frac{1}{2} \mathrm{Spark}(B^T A_1), \tag{12}$$

where $B$ is a basis of $N(A_2^T)$, then $x$ is the unique sparsest x-part solution to the system (2).

Proof. Assume that $(x^{(1)}, y^{(1)}) \neq (x, y)$ is a solution to the system (2) satisfying $x^{(1)} \neq x$ and $\|x^{(1)}\|_0 \le \|x\|_0 < \frac{1}{2}\mathrm{Spark}(B^T A_1)$, where $B$ is a basis of $N(A_2^T)$. Note that $\begin{pmatrix} x^{(1)} - x \\ y^{(1)} - y \end{pmatrix}$ is in the null space of $M = [A_1, A_2]$, so

$$A_1 (x^{(1)} - x) = -A_2 (y^{(1)} - y). \tag{13}$$

Note that the range space of $A_2$ is orthogonal to the null space of $A_2^T$, namely, $R(A_2) = N(A_2^T)^\perp$. Let $B$ be an arbitrary basis of $N(A_2^T)$. Since the right-hand side of (13) is in $R(A_2)$, by multiplying both sides of the equation (13) by $B^T$, we get $B^T A_1 (x^{(1)} - x) = 0$, which implies that

$$\|x^{(1)} - x\|_0 \ge \mathrm{Spark}(B^T A_1). \tag{14}$$

Therefore,

$$2\|x\|_0 \ge \|x^{(1)}\|_0 + \|x\|_0 \ge \|x^{(1)} - x\|_0 \ge \mathrm{Spark}(B^T A_1),$$

i.e., $\|x\|_0 \ge \frac{1}{2}\mathrm{Spark}(B^T A_1)$, leading to a contradiction. Therefore, the system (2) has a unique sparsest x-part solution.

Let $F$ be the set of all bases of $N(A_2^T)$, namely,

$$F = \{ B \in \mathbb{R}^{m \times q} : B \text{ is a basis of } N(A_2^T) \},$$

where $q$ is the dimension of $N(A_2^T)$. From the definition of the spark, we know that $\mathrm{Spark}(B^T A_1)$ is bounded. Hence, the supremum of $\mathrm{Spark}(B^T A_1)$ over $F$ exists, and is defined as follows.

Definition 2.6 For any matrix $A_1 \in \mathbb{R}^{m \times n_1}$ with $m < n_1$, let

$$\mathrm{Spark}^*_{A_2}(A_1) = \sup_{B \in F} \mathrm{Spark}(B^T A_1). \tag{15}$$

$\mathrm{Spark}^*_{A_2}(A_1)$ is called the maximal scaled spark of $A_1$ over $F$ (the set of bases of $N(A_2^T)$).

The inequality (14) in the proof of Theorem 2.5 holds for all bases $B$ of $N(A_2^T)$. Therefore, the spark condition (12) can be further enhanced by using $\mathrm{Spark}^*_{A_2}(A_1)$.

Theorem 2.7 Consider the system (2) where $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$ and $m < n_1$. If there exists a solution $(x, y)$ to the system (2) satisfying

$$\|x\|_0 < \frac{1}{2} \mathrm{Spark}^*_{A_2}(A_1), \tag{16}$$

where $\mathrm{Spark}^*_{A_2}(A_1)$ is given by (15), then $x$ is the unique sparsest x-part solution to the system (2).

By Lemma 2.4, the scaled mutual coherence provides a lower bound for the scaled spark. An immediate consequence of Theorem 2.5 is the corollary below.

Corollary 2.8 Consider the system (2) where $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$ and $m < n_1$. If there exists a solution $(x, y)$ to the system (2) satisfying

$$\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(B^T A_1)}\right), \tag{17}$$

where $B$ is a basis of $N(A_2^T)$, then $x$ is the unique sparsest x-part solution to the system (2).

Note that Corollary 2.8 holds for any basis $B$ of $N(A_2^T)$. So it makes sense to further enhance the bound (17) by introducing the following definition.

Definition 2.9 For any matrices $A_1 \in \mathbb{R}^{m \times n_1}$ ($m < n_1$) and $A_2 \in \mathbb{R}^{m \times n_2}$, let

$$\mu^{*}_{A_2}(A_1) = \inf_{B \in F} \mu(B^T A_1), \qquad \mu^{**}_{A_2}(A_1) = \sup_{B \in F} \mu(B^T A_1). \tag{18}$$

$\mu^{*}_{A_2}(A_1)$ is called the minimal scaled coherence of $A_1$ over $F$, and $\mu^{**}_{A_2}(A_1)$ is called the maximal scaled coherence of $A_1$ over $F$.

Based on Lemma 2.4 and the above definition, we have the following result.

Lemma 2.10 For any basis $B$ of $N(A_2^T)$, we have

$$1 + \frac{1}{\mu(B^T A_1)} \le 1 + \frac{1}{\mu^{*}_{A_2}(A_1)} \le \mathrm{Spark}^*_{A_2}(A_1). \tag{19}$$

Proof. The first inequality holds by the definition of $\mu^{*}_{A_2}(A_1)$. From Lemma 2.4, we have $1 + \frac{1}{\mu(B^T A_1)} \le \mathrm{Spark}(B^T A_1)$ for every basis $B$ of $N(A_2^T)$. By (15), we see that $\mathrm{Spark}(B^T A_1) \le \mathrm{Spark}^*_{A_2}(A_1)$, thus

$$1 + \frac{1}{\mu(B^T A_1)} \le \mathrm{Spark}^*_{A_2}(A_1) \quad \text{for all } B \in F.$$

Since the right-hand side of the above is fixed, and is an upper bound for the left-hand side for any $B \in F$, we conclude that

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \sup_{B \in F}\left\{ 1 + \frac{1}{\mu(B^T A_1)} \right\} = 1 + \frac{1}{\inf_{B \in F} \{\mu(B^T A_1)\}} = 1 + \frac{1}{\mu^{*}_{A_2}(A_1)}.$$

By Theorem 2.7 and Lemma 2.10, we have the next enhanced uniqueness claim.

Theorem 2.11 Consider the system (2) with $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$ and $m < n_1$. If there exists a solution $(x, y)^T$ satisfying

$$\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu^{*}_{A_2}(A_1)}\right), \tag{20}$$

where $\mu^{*}_{A_2}(A_1)$ is the minimal scaled coherence of $A_1$ over $F$, then $x$ is the unique sparsest x-part solution to the system (2).

Remark 2.12 The uniqueness criteria established in this section can be seen as a generalization of those for sparsest solutions to systems of linear equations. For instance, when $A_2 = 0$, the null space of $A_2^T$ is the whole space $\mathbb{R}^m$. Hence, by letting $B = I$, the corresponding scaled mutual coherence and scaled spark become

$$\mu(B^T A_1) = \mu(A_1), \qquad \mathrm{Spark}(B^T A_1) = \mathrm{Spark}(A_1),$$

and the results in this section reduce to the existing ones [6, 5, 4]. It is worth noting that the spark-type uniqueness conditions are derived from the property of null spaces, and that the null-space-based analysis is not the only way to derive uniqueness criteria for sparsest solutions. Some other approaches, such as the so-called range space property (see, e.g., [9, 0, ]) and orthogonal projection from $\mathbb{R}^{n_1+n_2}$ to $N(A_2^T)$ [], can also be used to develop uniqueness criteria.
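The scaled quantities of this section can be evaluated for a small pair of matrices by first computing a basis $B$ of $N(A_2^T)$ and then forming $B^T A_1$. The sketch below is illustrative only and not part of the original paper: the helper names and the matrices `A1`, `A2` are hypothetical, exact rational arithmetic is used to find the null space, the spark is computed by brute force, and only one particular basis $B$ is examined (no supremum or infimum over $F$ is attempted).

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt


def rref(rows):
    """Reduced row echelon form over the rationals: (matrix, pivot columns)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [v / m[r][c] for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def null_space(rows):
    """Matrix whose columns form a basis of {v : rows * v = 0}."""
    n = len(rows[0])
    m, pivots = rref(rows)
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, c in enumerate(pivots):
            v[c] = -m[r][f]
        basis.append(v)
    return [list(col) for col in zip(*basis)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def spark(A):
    """Brute-force spark; returns n + 1 if all columns are independent."""
    n = len(A[0])
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            if len(rref([[row[j] for j in cols] for row in A])[1]) < k:
                return k
    return n + 1


def mutual_coherence(A):
    """Mutual coherence; assumes A has no zero column."""
    cols = transpose(A)
    norms = [sqrt(sum(v * v for v in c)) for c in cols]
    return max(
        abs(sum(a * b for a, b in zip(cols[i], cols[j]))) / (norms[i] * norms[j])
        for i, j in combinations(range(len(cols)), 2)
    )


# Hypothetical data with m = 3, n1 = 4, n2 = 1.
A1 = [[1, 2, 0, 1],
      [2, 2, 1, 2],
      [1, 3, 1, 0]]
A2 = [[1], [1], [1]]

B = null_space(transpose(A2))          # columns span N(A2^T)
BtA1 = matmul(transpose(B), A1)        # scaled matrix B^T A1
s = spark(BtA1)
mu = mutual_coherence(BtA1)
print("B^T A1 =", [[int(v) for v in row] for row in BtA1])
print("Theorem 2.5 threshold  Spark(B^T A1)/2 =", s / 2)
print("Corollary 2.8 threshold (1 + 1/mu)/2  =", (1 + 1 / mu) / 2)
```

For this data both thresholds exceed 1, so any solution of the system whose x-part has a single nonzero entry would be the unique sparsest x-part solution according to Theorem 2.5 and Corollary 2.8.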

3 Further improvement of some uniqueness conditions

Since spark conditions are difficult to verify, the mutual coherence conditions play an important role in the uniqueness theory for the $\ell_0$-minimization problem (1). As shown in Lemma 2.10, $1 + \frac{1}{\mu^{*}_{A_2}(A_1)}$ is a lower bound for $\mathrm{Spark}^*_{A_2}(A_1)$, which is an improved version of the bound (17). In this section, we aim to further enhance the uniqueness claim (20) by improving the lower bound of $\mathrm{Spark}^*_{A_2}(A_1)$ in some situations. Following the discussions in [19], we introduce the so-called scaled coherence rank, scaled sub-coherence and scaled sub-coherence rank to improve the uniqueness conditions developed in Section 2.

3.1 Maximal (sub) coherence and rank

Let us first recall several concepts which were introduced by Zhao [19]. For a given matrix $A \in \mathbb{R}^{m \times n}$ with columns $a_i$, $i = 1, \ldots, n$, consider the index set

$$S_i(A) := \left\{ j : j \neq i,\ \frac{|a_i^T a_j|}{\|a_i\|_2 \, \|a_j\|_2} = \mu(A) \right\}, \quad i = 1, \ldots, n.$$

Let $\alpha_i(A)$ be the cardinality of $S_i(A)$, and $\alpha(A)$ be the largest one among the $\alpha_i(A)$'s, i.e.,

$$\alpha(A) = \max_{1 \le i \le n} \alpha_i(A) = \max_{1 \le i \le n} |S_i(A)|.$$

$\alpha(A)$ is called the coherence rank of $A$. Let $i_0$ be an index such that $\alpha(A) = \alpha_{i_0}(A) = |S_{i_0}(A)|$. Define

$$\beta(A) = \max_{1 \le i \le n,\ i \neq i_0} \alpha_i(A) = \max_{1 \le i \le n,\ i \neq i_0} |S_i(A)|,$$

which is called the sub-coherence rank of $A$. We also define

$$\mu^{(2)}(A) = \max_{i \neq j} \left\{ \frac{|a_i^T a_j|}{\|a_i\|_2 \, \|a_j\|_2} : \frac{|a_i^T a_j|}{\|a_i\|_2 \, \|a_j\|_2} < \mu(A) \right\},$$

the second largest absolute value of the inner product between two normalized columns of $A$. $\mu^{(2)}(A)$ is called the sub-mutual coherence of $A$.

Consider the sub-mutual coherence $\mu^{(2)}(B^T A_1)$ with a scaling matrix $B \in F$. We introduce the following new concept.

Definition 3.1 Let $A_1 \in \mathbb{R}^{m \times n_1}$ ($m < n_1$) and $A_2 \in \mathbb{R}^{m \times n_2}$ be two matrices, and let $F$ be the set of bases of $N(A_2^T)$.
(i) The maximal scaled sub-mutual coherence of $A_1$ on $F$, denoted by $\mu^{(2)*}_{A_2}(A_1)$, is defined as

$$\mu^{(2)*}_{A_2}(A_1) = \sup_{B \in F} \mu^{(2)}(B^T A_1). \tag{21}$$

(ii) The maximal scaled coherence rank of $A_1$ on $F$, denoted by $\alpha^*_{A_2}(A_1)$, is defined as

$$\alpha^*_{A_2}(A_1) = \sup_{B \in F} \{\alpha(B^T A_1)\}. \tag{22}$$

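(iii) The maximal scaled sub-coherence rank on $F$, denoted by $\beta^*_{A_2}(A_1)$, is defined as

$$\beta^*_{A_2}(A_1) = \sup_{B \in F} \{\beta(B^T A_1)\}. \tag{23}$$

It is easy to see the following relationship between $\alpha(B^T A_1)$, $\beta(B^T A_1)$, $\alpha^*_{A_2}(A_1)$ and $\beta^*_{A_2}(A_1)$: for every basis $B$ of $N(A_2^T)$, we have

$$\beta(B^T A_1) \le \alpha(B^T A_1) \le \alpha^*_{A_2}(A_1), \qquad \beta(B^T A_1) \le \beta^*_{A_2}(A_1) \le \alpha^*_{A_2}(A_1). \tag{24}$$

3.2 Improved lower bounds of $\mathrm{Spark}^*_{A_2}(A_1)$

Following the method used to improve the lower bound of $\mathrm{Spark}(A)$ in [19], we can find an enhanced lower bound of $\mathrm{Spark}^*_{A_2}(A_1)$ via the concepts introduced in Section 3.1. We will make use of the following two lemmas.

Lemma 3.2 (Brauer [3]) For any matrix $A \in \mathbb{R}^{n \times n}$ with $n \ge 2$, if $\lambda$ is an eigenvalue of $A$, there is a pair $(i, j)$ of positive integers with $i \neq j$ ($1 \le i, j \le n$) such that

$$|\lambda - a_{ii}| \, |\lambda - a_{jj}| \le \Delta_i \Delta_j,$$

where $\Delta_i := \sum_{j=1, j \neq i}^{n} |a_{ij}|$ for $1 \le i \le n$.

Merging Theorem.5 and Proposition.6 in [19] yields the following result.

Lemma 3.3 (Zhao [19]) Let $A \in \mathbb{R}^{m \times n}$, and let $\alpha(A)$ and $\beta(A)$ be the coherence rank and sub-coherence rank of $A$, respectively. Suppose that one of the following conditions holds:
(i) $\alpha(A) < \frac{1}{\mu(A)}$;
(ii) $\alpha(A) \le \frac{1}{\mu(A)}$ and $\beta(A) < \alpha(A)$.
Then $\mu^{(2)}(A) > 0$ and

$$\mathrm{Spark}(A) \ge 1 + \frac{2\left[1 - \alpha(A)\beta(A)\tilde{\mu}(A)^2\right]}{\mu^{(2)}(A)\left\{ \tilde{\mu}(A)(\alpha(A) + \beta(A)) + \sqrt{\tilde{\mu}(A)^2(\alpha(A) - \beta(A))^2 + 4} \right\}} > 1 + \frac{1}{\mu(A)},$$

where $\tilde{\mu}(A) = \mu(A) - \mu^{(2)}(A)$ and $\mu^{(2)}(A)$ is the sub-mutual coherence of $A$.

Based on Lemma 3.3, we can construct an enhanced lower bound of $\mathrm{Spark}^*_{A_2}(A_1)$ under some conditions, in terms of the scaled coherence rank and scaled sub-coherence rank.

Theorem 3.4 Consider the system (2) where $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$ and $m < n_1$. Suppose that one of the following conditions holds:
(i) $\alpha(B^T A_1) < \frac{1}{\mu(B^T A_1)}$ for all $B \in F$;
(ii) $\alpha(B^T A_1) \le \frac{1}{\mu(B^T A_1)}$ and $\beta(B^T A_1) < \alpha(B^T A_1)$ for all $B \in F$.
Then for any $B \in F$, we have that $\mu^{(2)}(B^T A_1) > 0$ and

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \sup_{B \in F}\left\{ 1 + \frac{2\left[1 - \alpha(B^T A_1)\beta(B^T A_1)\tilde{\mu}(B^T A_1)^2\right]}{\mu^{(2)}(B^T A_1)\left\{ \tilde{\mu}(B^T A_1)(\alpha(B^T A_1) + \beta(B^T A_1)) + \sqrt{\tilde{\rho}} \right\}} \right\} \ge 1 + \frac{1}{\mu^{*}_{A_2}(A_1)},$$

where $\tilde{\mu}(B^T A_1) = \mu(B^T A_1) - \mu^{(2)}(B^T A_1)$ and $\tilde{\rho} = [\tilde{\mu}(B^T A_1)]^2 (\alpha(B^T A_1) - \beta(B^T A_1))^2 + 4$.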
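The quantities $\mu$, $\mu^{(2)}$, $\alpha$ and $\beta$ of Section 3.1 and the bound of Lemma 3.3 can be evaluated for a single matrix as follows. This is a hedged sketch rather than the authors' procedure: the function names `coherence_stats` and `lemma_3_3_bound`, the tolerance `tol` used to decide when two floating-point inner products are equal, and the example matrix are all assumptions introduced here. In the setting of Theorem 3.4 the input would be a scaled matrix $B^T A_1$.

```python
from math import sqrt


def coherence_stats(A, tol=1e-12):
    """Return (mu, mu2, alpha, beta) for a matrix A with nonzero columns.

    mu    : mutual coherence
    mu2   : sub-mutual coherence (None if every off-diagonal value equals mu)
    alpha : coherence rank, beta : sub-coherence rank
    """
    n = len(A[0])
    cols = [[row[j] for row in A] for j in range(n)]
    norms = [sqrt(sum(v * v for v in c)) for c in cols]
    g = [[abs(sum(a * b for a, b in zip(cols[i], cols[j]))) / (norms[i] * norms[j])
          for j in range(n)] for i in range(n)]
    off = [g[i][j] for i in range(n) for j in range(n) if i != j]
    mu = max(off)
    # |S_i(A)|: number of columns j != i attaining the coherence with column i
    counts = [sum(1 for j in range(n) if j != i and abs(g[i][j] - mu) <= tol)
              for i in range(n)]
    alpha = max(counts)
    i0 = counts.index(alpha)
    beta = max(c for i, c in enumerate(counts) if i != i0)
    below = [v for v in off if v < mu - tol]
    mu2 = max(below) if below else None
    return mu, mu2, alpha, beta


def lemma_3_3_bound(mu, mu2, alpha, beta):
    """Lower bound on Spark(A) from Lemma 3.3 (requires its condition (i) or (ii))."""
    mt = mu - mu2
    root = sqrt((mt * (alpha - beta)) ** 2 + 4)
    return 1 + 2 * (1 - alpha * beta * mt ** 2) / (mu2 * (mt * (alpha + beta) + root))


# Hypothetical example: columns e1, e2, e3 and (2, 1, 1) in R^3.
A = [[1, 0, 0, 2],
     [0, 1, 0, 1],
     [0, 0, 1, 1]]
mu, mu2, alpha, beta = coherence_stats(A)
assert alpha < 1 / mu                  # condition (i) of Lemma 3.3 holds here
bound = lemma_3_3_bound(mu, mu2, alpha, beta)
print("mu =", mu, "mu2 =", mu2, "alpha =", alpha, "beta =", beta)
print("Lemma 3.3 bound:", bound, " classical bound 1 + 1/mu:", 1 + 1 / mu)
```

For this matrix the Lemma 3.3 bound is strictly larger than the classical coherence bound, in line with the strict inequality stated in the lemma.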

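Proof. Under condition (i) or (ii), by Lemma 3.3, for any $B \in F$ we have that $\mu^{(2)}(B^T A_1) > 0$ and

$$\mathrm{Spark}(B^T A_1) \ge \phi(B^T A_1) := 1 + \frac{2\left[1 - \alpha(B^T A_1)\beta(B^T A_1)\tilde{\mu}(B^T A_1)^2\right]}{\mu^{(2)}(B^T A_1)\left\{ \tilde{\mu}(B^T A_1)(\alpha(B^T A_1) + \beta(B^T A_1)) + \sqrt{\tilde{\rho}} \right\}}, \tag{25}$$

where $\tilde{\mu}(B^T A_1) = \mu(B^T A_1) - \mu^{(2)}(B^T A_1)$ and $\tilde{\rho} = [\tilde{\mu}(B^T A_1)]^2 (\alpha(B^T A_1) - \beta(B^T A_1))^2 + 4$. The above inequality holds for any basis $B \in F$. By the definition of $\mathrm{Spark}^*_{A_2}(A_1)$, we have

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \mathrm{Spark}(B^T A_1) \quad \text{for any } B \in F.$$

Thus it follows from (25) that

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \phi(B^T A_1) \quad \text{for all } B \in F. \tag{26}$$

Inequality (26) implies that the value of $\phi(B^T A_1)$ is bounded above by the constant $\mathrm{Spark}^*_{A_2}(A_1)$. Hence, the supremum of $\phi(B^T A_1)$ over $F$ is also bounded by $\mathrm{Spark}^*_{A_2}(A_1)$, namely,

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \sup_{B \in F} \phi(B^T A_1).$$

By Lemma 3.3 again, under condition (i) or (ii), we see that $\phi(B^T A_1) > 1 + \frac{1}{\mu(B^T A_1)}$. Therefore, the supremum of $\phi(B^T A_1)$ is greater than the value of $1 + \frac{1}{\mu(B^T A_1)}$ for any basis $B \in F$, i.e.,

$$\sup_{B \in F} \phi(B^T A_1) > 1 + \frac{1}{\mu(B^T A_1)} \quad \text{for any } B \in F.$$

This in turn implies that

$$\sup_{B \in F} \{\phi(B^T A_1)\} \ge \sup_{B \in F}\left\{ 1 + \frac{1}{\mu(B^T A_1)} \right\} = 1 + \frac{1}{\mu^{*}_{A_2}(A_1)},$$

where the last equality follows from the definition of $\mu^{*}_{A_2}(A_1)$. Therefore, under condition (i) or (ii), we conclude that

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \sup_{B \in F} \{\phi(B^T A_1)\} \ge 1 + \frac{1}{\mu^{*}_{A_2}(A_1)},$$

as claimed.

Conditions (i) and (ii) in Theorem 3.4 depend on $B \in F$. A similar condition that does not depend on $B$ can also be established, as shown by the next result.

Theorem 3.5 Consider the system (2) with $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$ and $m < n_1$. Let $\mu^{**}_{A_2}(A_1)$, $\mu^{(2)*}_{A_2}(A_1)$, $\alpha^*_{A_2}(A_1)$ and $\beta^*_{A_2}(A_1)$ be the four constants defined by (18) and (21)-(23), respectively. Suppose that one of the following conditions holds:
(i) $\alpha^*_{A_2}(A_1) < \frac{1}{\mu^{**}_{A_2}(A_1)}$;
(ii) $\alpha^*_{A_2}(A_1) \le \frac{1}{\mu^{**}_{A_2}(A_1)}$ and $\beta^*_{A_2}(A_1) < \alpha^*_{A_2}(A_1)$.
Then $\mu^{(2)*}_{A_2}(A_1) > 0$ and

$$\mathrm{Spark}^*_{A_2}(A_1) \ge \phi^* := 1 + \frac{\sqrt{\rho^*} - \left(\alpha^*_{A_2}(A_1) + \beta^*_{A_2}(A_1)\right)\tilde{\mu}^*}{2\mu^{(2)*}_{A_2}(A_1)},$$

where $\tilde{\mu}^* = \mu^{**}_{A_2}(A_1) - \mu^{(2)*}_{A_2}(A_1)$ and $\rho^* = \left(\alpha^*_{A_2}(A_1) - \beta^*_{A_2}(A_1)\right)^2 (\tilde{\mu}^*)^2 + 4$.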

Proof. For brevity, write $\alpha^* = \alpha^*_{A_2}(A_1)$, $\beta^* = \beta^*_{A_2}(A_1)$, $\mu^{**} = \mu^{**}_{A_2}(A_1)$ and $\mu^{(2)*} = \mu^{(2)*}_{A_2}(A_1)$. Note that $\alpha(B^T A_1) \in \{1, \ldots, n_1\}$ for any $B \in F$. Since $\alpha^*$ is the maximum value of $\alpha(B^T A_1)$ over $F$ and these values are integers, this maximum is attainable, that is, there exists a $\hat{B} \in F$ such that $\alpha^* = \alpha(\hat{B}^T A_1)$. For such a basis $\hat{B} \in F$, without loss of generality, we assume that all columns of $\hat{B}^T A_1$ are normalized in the sense that the $\ell_2$-norm of every column of $\hat{B}^T A_1$ is 1. Note also that the spark, mutual coherence, sub-coherence, coherence rank, and sub-coherence rank are invariant under normalization. Let $p = \mathrm{Spark}(\hat{B}^T A_1)$ and let $\{c_1, \ldots, c_p\}$ be a set of $p$ columns from $\hat{B}^T A_1$ that are linearly dependent. Denote by $C_p$ the submatrix consisting of these $p$ columns. Then the Gram matrix of $C_p$, $G_{pp} = C_p^T C_p \in \mathbb{R}^{p \times p}$, is singular. All diagonal entries of $G_{pp}$ are 1's, and the absolute values of its off-diagonal entries are less than or equal to $\mu(\hat{B}^T A_1)$. Under either condition (i) or (ii) of the theorem, we have

$$\alpha^* \le \frac{1}{\mu^{**}} \le \frac{1}{\mu(B^T A_1)} \quad \text{for any } B \in F.$$

In particular, we have

$$\alpha^* \le \frac{1}{\mu(\hat{B}^T A_1)} < \mathrm{Spark}(\hat{B}^T A_1) = p. \tag{27}$$

Since $G_{pp}$ is a $p \times p$ matrix, in each row of $G_{pp}$ there are at most $\alpha^* = \alpha(\hat{B}^T A_1)$ entries whose absolute values are equal to $\mu(\hat{B}^T A_1)$, and the absolute values of the remaining $(p - 1 - \alpha^*)$ off-diagonal entries are less than or equal to $\mu^{(2)}(\hat{B}^T A_1)$. By the singularity of $G_{pp}$, we know that $\lambda = 0$ is an eigenvalue of $G_{pp}$. By Lemma 3.2, there exist two rows of $G_{pp}$, say, the $i$th row and the $j$th row ($i \neq j$), satisfying

$$|0 - G_{ii}| \, |0 - G_{jj}| \le \Delta_i \Delta_j = \left[ \sum_{t=1, t \neq i}^{p} |c_i^T c_t| \right] \left[ \sum_{t=1, t \neq j}^{p} |c_j^T c_t| \right]. \tag{28}$$

By the definitions of coherence rank and sub-coherence rank, if there are $\alpha^*$ ($= \alpha(\hat{B}^T A_1)$) entries whose absolute values are $\mu(\hat{B}^T A_1)$ in the $i$th row, then in the $j$th row there are at most $\beta(\hat{B}^T A_1)$ entries whose absolute values are $\mu(\hat{B}^T A_1)$. The absolute values of the remaining entries in either row are less than or equal to $\mu^{(2)}(\hat{B}^T A_1)$. Therefore, since $G_{ii} = G_{jj} = 1$, from (28) we have

$$1 \le \left[ \alpha^* \mu(\hat{B}^T A_1) + (p - 1 - \alpha^*) \mu^{(2)}(\hat{B}^T A_1) \right] \left[ \beta(\hat{B}^T A_1) \mu(\hat{B}^T A_1) + (p - 1 - \beta(\hat{B}^T A_1)) \mu^{(2)}(\hat{B}^T A_1) \right]. \tag{29}$$

Let $p^* = \mathrm{Spark}^*_{A_2}(A_1)$. Since $\mathrm{Spark}^*_{A_2}(A_1)$ is the supremum of $\mathrm{Spark}(B^T A_1)$ over $F$, we have $p \le p^*$. Thus it follows from (29) that

$$1 \le \left[ \alpha^* \mu(\hat{B}^T A_1) + (p^* - 1 - \alpha^*) \mu^{(2)}(\hat{B}^T A_1) \right] \left[ \beta(\hat{B}^T A_1) \mu(\hat{B}^T A_1) + (p^* - 1 - \beta(\hat{B}^T A_1)) \mu^{(2)}(\hat{B}^T A_1) \right]. \tag{30}$$

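By the definition of $\beta^*_{A_2}(A_1)$, we have $\beta(\hat{B}^T A_1) \le \beta^*_{A_2}(A_1)$. This, together with $\mu(\hat{B}^T A_1) \ge \mu^{(2)}(\hat{B}^T A_1)$, implies that

$$\beta(\hat{B}^T A_1) \mu(\hat{B}^T A_1) + (p^* - 1 - \beta(\hat{B}^T A_1)) \mu^{(2)}(\hat{B}^T A_1) \le \beta^*_{A_2}(A_1) \mu(\hat{B}^T A_1) + (p^* - 1 - \beta^*_{A_2}(A_1)) \mu^{(2)}(\hat{B}^T A_1).$$

Combining (30) with the inequality above, and writing $\alpha^* = \alpha^*_{A_2}(A_1)$, $\beta^* = \beta^*_{A_2}(A_1)$, $\mu^{**} = \mu^{**}_{A_2}(A_1)$ and $\mu^{(2)*} = \mu^{(2)*}_{A_2}(A_1)$ for brevity, yields

$$1 \le \left[ \alpha^* \mu(\hat{B}^T A_1) + (p^* - 1 - \alpha^*) \mu^{(2)}(\hat{B}^T A_1) \right] \left[ \beta^* \mu(\hat{B}^T A_1) + (p^* - 1 - \beta^*) \mu^{(2)}(\hat{B}^T A_1) \right]. \tag{31}$$

Note that

$$\beta^* \le \alpha^* < p \le p^*, \qquad \mu(\hat{B}^T A_1) \le \mu^{**}, \qquad \mu^{(2)}(\hat{B}^T A_1) \le \mu^{(2)*}.$$

So from (31), we obtain

$$1 \le \left[ \alpha^* \mu^{**} + (p^* - 1 - \alpha^*) \mu^{(2)*} \right] \left[ \beta^* \mu^{**} + (p^* - 1 - \beta^*) \mu^{(2)*} \right].$$

Denote $\tilde{\mu}^* := \mu^{**} - \mu^{(2)*}$. The above inequality can be written as

$$\left[ (p^* - 1) \mu^{(2)*} \right]^2 + (p^* - 1)(\alpha^* + \beta^*) \tilde{\mu}^* \mu^{(2)*} + \alpha^* \beta^* (\tilde{\mu}^*)^2 \ge 1. \tag{32}$$

By the definition of $\mu^{(2)*}$, we know that $\mu^{(2)*} \ge 0$. We now prove that $\mu^{(2)*} > 0$. In fact, if $\mu^{(2)*} = 0$, then $\tilde{\mu}^* = \mu^{**}$ and the quadratic inequality (32) becomes

$$\alpha^* \beta^* (\mu^{**})^2 \ge 1,$$

which contradicts both condition (i) and condition (ii) of the theorem. Thus $\mu^{(2)*}$ must be positive. Consider the following quadratic equation in the variable $t$:

$$h(t) := t^2 + t(\alpha^* + \beta^*) \tilde{\mu}^* + \alpha^* \beta^* (\tilde{\mu}^*)^2 - 1 = 0,$$

which has only one positive root under condition (i) or (ii). This positive root is given by

$$t^* = \frac{-(\alpha^* + \beta^*) \tilde{\mu}^* + \sqrt{\rho^*}}{2},$$

where $\rho^* = (\alpha^* - \beta^*)^2 (\tilde{\mu}^*)^2 + 4$. Let $\gamma = (p^* - 1) \mu^{(2)*}$. The inequality (32) shows that $h(\gamma) \ge 0$. Thus $\gamma \ge t^*$, that is,

$$(p^* - 1) \mu^{(2)*} \ge \frac{-(\alpha^* + \beta^*) \tilde{\mu}^* + \sqrt{\rho^*}}{2}.$$

Therefore,

$$\mathrm{Spark}^*_{A_2}(A_1) = p^* \ge 1 + \frac{\sqrt{\rho^*} - (\alpha^* + \beta^*) \tilde{\mu}^*}{2\mu^{(2)*}},$$

as desired.

By Theorem 2.7 and Theorem 3.5, we immediately have the next uniqueness condition.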
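Before the corollary is stated, the final step of the proof above can be checked numerically: the positive root of the quadratic $h(t)$ gives the bound of Theorem 3.5, and rationalizing it gives an expression of the same form as the bound in Lemma 3.3. The sketch below is illustrative only; the function names and the numerical values chosen for $\alpha^*$, $\beta^*$, $\mu^{**}$ and $\mu^{(2)*}$ are hypothetical inputs, not values derived from any particular pair $(A_1, A_2)$.

```python
from math import sqrt


def positive_root(alpha, beta, mu_tilde):
    """Positive root t* of h(t) = t^2 + t(alpha+beta)mu~ + alpha*beta*mu~^2 - 1."""
    rho = (alpha - beta) ** 2 * mu_tilde ** 2 + 4
    return (sqrt(rho) - (alpha + beta) * mu_tilde) / 2


def h(t, alpha, beta, mu_tilde):
    return t ** 2 + t * (alpha + beta) * mu_tilde + alpha * beta * mu_tilde ** 2 - 1


def phi_star(alpha, beta, mu_max, mu2):
    """Bound of Theorem 3.5: 1 + t* / mu2, with mu~ = mu_max - mu2."""
    return 1 + positive_root(alpha, beta, mu_max - mu2) / mu2


def phi_rationalized(alpha, beta, mu_max, mu2):
    """Same bound written in the form used in Lemma 3.3."""
    mt = mu_max - mu2
    root = sqrt((mt * (alpha - beta)) ** 2 + 4)
    return 1 + 2 * (1 - alpha * beta * mt ** 2) / (mu2 * (mt * (alpha + beta) + root))


# Hypothetical constants satisfying condition (i): alpha < 1 / mu_max.
alpha, beta, mu_max, mu2 = 2, 1, 0.4, 0.2
t_star = positive_root(alpha, beta, mu_max - mu2)
print("t* =", t_star, " h(t*) =", h(t_star, alpha, beta, mu_max - mu2))
print("phi* =", phi_star(alpha, beta, mu_max, mu2))
print("rationalized form =", phi_rationalized(alpha, beta, mu_max, mu2))
```

The two printed bounds agree up to rounding, which reflects the identity $(\sqrt{\rho} - s)(\sqrt{\rho} + s) = 4(1 - \alpha\beta\tilde{\mu}^2)$ with $s = (\alpha + \beta)\tilde{\mu}$.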

Corollary 3.6 Consider the system (2) where $A_1 \in \mathbb{R}^{m \times n_1}$, $A_2 \in \mathbb{R}^{m \times n_2}$, and $m < n_1$. Under the same conditions as in Theorem 3.5, if there exists a solution $(x, y)$ to the system (2) satisfying

$$\|x\|_0 < \frac{1}{2}\phi^* := \frac{1}{2}\left( 1 + \frac{\sqrt{\rho^*} - \left(\alpha^*_{A_2}(A_1) + \beta^*_{A_2}(A_1)\right)\tilde{\mu}^*}{2\mu^{(2)*}_{A_2}(A_1)} \right),$$

then $x$ is the unique sparsest x-part solution to the system (2).

The above corollary may also provide a tighter lower bound of $\mathrm{Spark}^*_{A_2}(A_1)$ than Theorem 2.11 under some conditions, as indicated by the following proposition.

Proposition 3.7 Let $\phi^*$ be the lower bound of $\mathrm{Spark}^*_{A_2}(A_1)$ given in Theorem 3.5. Assume that $\alpha^*_{A_2}(A_1) = 1$ and $\alpha^*_{A_2}(A_1) < \frac{1}{\mu^{**}_{A_2}(A_1)}$. If $\mu^{(2)*}_{A_2}(A_1) < \mu^{*}_{A_2}(A_1)(1 - \tilde{\mu}^*)$, where $\tilde{\mu}^* = \mu^{**}_{A_2}(A_1) - \mu^{(2)*}_{A_2}(A_1)$, we have $\phi^* > 1 + \frac{1}{\mu^{*}_{A_2}(A_1)}$.

Proof. Under the condition $\alpha^*_{A_2}(A_1) < \frac{1}{\mu^{**}_{A_2}(A_1)}$, by Theorem 3.5 we get the following lower bound of $\mathrm{Spark}^*_{A_2}(A_1)$:

$$\phi^* = 1 + \frac{\sqrt{\rho^*} - \left(\alpha^*_{A_2}(A_1) + \beta^*_{A_2}(A_1)\right)\tilde{\mu}^*}{2\mu^{(2)*}_{A_2}(A_1)}. \tag{33}$$

By (24), we see that $\alpha^*_{A_2}(A_1) = 1$ implies that $\beta^*_{A_2}(A_1) = 1$, so that $\rho^* = 4$. Thus (33) reduces to

$$\phi^* = 1 + \frac{1 - \tilde{\mu}^*}{\mu^{(2)*}_{A_2}(A_1)}.$$

Note that

$$\frac{1 - \tilde{\mu}^*}{\mu^{(2)*}_{A_2}(A_1)} = \frac{1}{\mu^{*}_{A_2}(A_1)} + \frac{1 - \tilde{\mu}^*}{\mu^{(2)*}_{A_2}(A_1)} - \frac{1}{\mu^{*}_{A_2}(A_1)} = \frac{1}{\mu^{*}_{A_2}(A_1)} + \frac{\mu^{*}_{A_2}(A_1)(1 - \tilde{\mu}^*) - \mu^{(2)*}_{A_2}(A_1)}{\mu^{(2)*}_{A_2}(A_1)\,\mu^{*}_{A_2}(A_1)}.$$

Thus if $\mu^{(2)*}_{A_2}(A_1) < \mu^{*}_{A_2}(A_1)(1 - \tilde{\mu}^*)$, we must have $\phi^* > 1 + \frac{1}{\mu^{*}_{A_2}(A_1)}$.

The discussion in this section demonstrates that concepts such as the maximal scaled coherence rank and sub-coherence rank, and the minimal/maximal scaled mutual coherence, are useful in the development of uniqueness criteria for the $\ell_0$-minimization problem (1).

4 Conclusion

In this paper, we have established several uniqueness conditions for the solution to a class of $\ell_0$-minimization problems which seek sparsity only for part of the variables of the problem. This problem includes several important sparsity-seeking models as special cases. To obtain uniqueness conditions, several concepts such as maximal/minimal scaled coherence, maximal scaled coherence rank, and maximal scaled spark have been introduced in this paper. Also, the $\ell_p$-induced quasi-norm has been defined and used to establish a sufficient condition for the uniqueness of the solution to the underlying $\ell_0$-minimization problems.

References

[1] A. S. Bandeira, K. Scheinberg and L. N. Vicente, On partial sparse recovery, Submitted to IEEE Signal Processing Letters, 03.
[2] A. S. Bandeira, K. Scheinberg and L. N. Vicente, Computation of sparse low degree interpolating polynomials and their application to derivative-free optimization, Mathematical Programming, 34 (0), 3-57.
[3] A. Brauer, Limits for the characteristic roots of a matrix, Duke Math. J., 3 (946), 387-395.
[4] A. Bruckstein, M. Elad and D. Donoho, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 5 (009), 34-8.
[5] M. Davenport, M. Duarte, Y. Eldar and G. Kutyniok, Introduction to Compressive Sensing, in Compressive Sensing: Theory and Applications (Eldar and Kutyniok eds.), University of Cambridge, 0.
[6] D. Donoho and M. Elad, Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell^1$ minimization, Proc. Nat. Acad. Sci., 00 (003), 97-0.
[7] D. Donoho and X. Huo, Uncertainty principles and ideal atomic decompositions, IEEE Transactions on Information Theory, 47 (00), 845-86.
[8] A. Doostan, H. Owhadi, A. Lashgari and G. Iaccarino, Non-adapted sparse approximation of PDEs with stochastic inputs, Journal of Computational Physics, 30 (0), 305-3034.
[9] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, New York, 00.
[10] Y. Eldar and G. Kutyniok, Compressive Sensing: Theory and Applications, Cambridge University Press, 0.
[11] M. Figueiredo, R. Nowak and S. Wright, Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems, IEEE Journal of Selected Topics in Signal Processing, (007), 586-597.
[12] S. Foucart, A. Pajor, H. Rauhut and T. Ullrich, The Gelfand widths of $\ell_p$-balls for $0 < p \le 1$, Journal of Complexity, 6 (00), 69-640.
[13] L. Jacques, A short note on compressed sensing with partially known signal support, Signal Processing, 90 (00), 3308-33.
[14] M. Lustig, D. Donoho and J. Pauly, Sparse MRI: the application of compressed sensing for rapid MR imaging, Magnetic Resonance in Medicine, 58 (007), 895.
[15] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, 998.
[16] B. Natarajan, Sparse approximate solutions to linear systems, SIAM Journal on Computing, 4 (995), 7-34.
[17] D. Taubman and M. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Kluwer Academic, 00.
[18] N. Vaswani and W. Lu, Modified-CS: Modifying compressive sensing for problems with partially known support, IEEE Transactions on Signal Processing, 58 (00), 4595-4607.
[19] Y.B. Zhao, New and improved conditions for uniqueness of sparsest solutions of underdetermined linear systems, Applied Mathematics and Computation, 4 (03), 58-73.
[20] Y.B. Zhao, RSP-based analysis for sparsest and least $\ell_1$-norm solutions to underdetermined linear systems, IEEE Transactions on Signal Processing, 6 (03), no., 5777-5788.
[21] Y.B. Zhao, Equivalence and strong equivalence between sparsest and least $\ell_1$-norm nonnegative solutions of linear systems and their application, Technical Report, Univ of Birmingham, June 0.
[22] Y.B. Zhao and D. Li, Reweighted $\ell_1$-minimization for sparse solutions to underdetermined linear system, SIAM Journal on Optimization, (0), no. 3, 065-088.