A New Estimate of Restricted Isometry Constants for Sparse Solutions

Ming-Jun Lai and Louis Y. Liu

January 12, 2011

Abstract

We show that as long as the restricted isometry constant $\delta_{2k} < 1/2$, there exists a value $q_0 \in (0, 1]$ such that for any $q < q_0$, each minimizer of the nonconvex $\ell_q$ minimization for the sparse solution of any underdetermined linear system is the sparse solution.

mjlai@math.uga.edu, Department of Mathematics, The University of Georgia, Athens, GA 30602. This author is partly supported by the National Science Foundation under grant DMS-0713807.
yliu@marlboro.edu, Department of Mathematics, Marlboro College, Marlboro, Vermont 05344.

1 Introduction

Let us start with one of the basic problems in compressed sensing: seek the minimizer $x \in \mathbb{R}^n$ solving
$$\min\{\|x\|_0 : \Phi x = b\}, \qquad (1)$$
where $\|x\|_0$ stands for the number of nonzero entries of the vector $x$ and $\Phi$ is a matrix of size $m \times n$ with $m \ll n$. That is, the purpose of the research is to find the sparse solution of the under-determined linear system $\Phi x = b$ with $\|x\|_0$ as small as possible. A key concept used to describe the solution of (1) is the restricted isometry constant of the matrix $\Phi$, introduced in [5].

Definition 1 For each integer $k = 1, 2, \ldots$, let $\delta_k$ be the smallest number such that
$$(1 - \delta_k)\,\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta_k)\,\|x\|_2^2 \qquad (2)$$
holds for all $k$-sparse vectors $x \in \mathbb{R}^n$, i.e., all $x$ with $\|x\|_0 \le k$, where $\|x\|_2$ is the standard $\ell_2$ norm of the vector $x$. The number $\delta_k$ is called the restricted isometry constant.

One of the standard approaches to find the minimizer $x$ is to seek the minimizer $x_1 \in \mathbb{R}^n$ solving
$$\min\{\|x\|_1 : \Phi x = b, \; x \in \mathbb{R}^n\}, \qquad (3)$$
where $\|x\|_1$ is the standard $\ell_1$ norm of the vector $x$. Suppose that $\|x\|_0 = k$. Let $T \subset \{1, 2, \ldots, n\}$ be the subset of indices of the $k$ largest entries of $x$. For any vector $x$, let $x_T$ denote the vector whose entries agree with those of $x$ at the indices in $T$ and are zero elsewhere. Many researchers have established the following result in various forms in the literature:

Theorem 1 (Noiseless Recovery) For appropriately small $\delta_{2k} > 0$, the solution $x_1$ of the minimization problem (3) satisfies
$$\|x - x_1\|_2 \le C\, k^{-1/2}\, \|x - x_T\|_1 \qquad (4)$$
for any $x$ with $\Phi x = b$, where $C$ is a positive constant dependent on $\delta_{2k}$. In particular, if $x$ is $k$-sparse, the recovery is exact.

It is known from Candès, 2008 [4] that the above result holds when $\delta_{2k} < \sqrt{2} - 1$. This condition was improved in Foucart and Lai, 2009 [11] to $\delta_{2k} < 2/(3 + \sqrt{2}) \approx 0.4531$. Subsequently, this condition was further improved in Cai, Wang, and Xu, 2010 [2] for special $k$ (multiples of 4) to $\delta_{2k} < 2/(2 + \sqrt{5}) \approx 0.4721$, as well as in Foucart, 2010 [10] to $\delta_{2k} < 3/(4 + \sqrt{6}) \approx 0.4652$ and, for large $k$, $\delta_{2k} < 4/(6 + \sqrt{6}) \approx 0.4734$. Recently, Li and Mo proposed another approach in [15] and showed that the inequality (4) holds as long as $\delta_{2k} < 0.4931$.

The problem (3) was extended in [12] by seeking a minimizer $x_q \in \mathbb{R}^n$, for a number $q \in (0, 1)$, which solves
$$\min\{\|x\|_q^q : \Phi x = b, \; x \in \mathbb{R}^n\}, \qquad (5)$$
where $\|x\|_q$ is the standard $\ell_q$ quasi-norm of the vector $x$. See also [7], [11], [6], [14] for studies of the nonconvex $\ell_q$ minimization problem (5). In Foucart and Lai, 2009 [11], the following result was established.
Theorem 2 Suppose that $\delta_{2k} < 2(3 - \sqrt{2})/7 \approx 0.4531$. Then for any $q \in (0, 1]$,
$$\|x - x_q\|_q^q \le C\, \|x - x_T\|_q^q \qquad (6)$$
for any $x$ with $\Phi x = b$, where $C$ is a positive constant dependent on $\delta_{2k}$. In particular, if $x = x_0$ is $k$-sparse, the recovery is exact.

To improve the result in Theorem 2, our main result in this paper is

Theorem 3 Suppose that $\delta_{2k} < 1/2$. There exists a number $q_0 \in (0, 1]$ such that for any $q < q_0$, each minimizer $x_q$ of the $\ell_q$ minimization (5) is the sparse solution of (1). Furthermore, there exists a positive constant $C$ such that for any $x \in \mathbb{R}^n$ with $\Phi x = b$,
$$\|x - x_q\|_q^q \le C\, \|x - x_T\|_q^q,$$
where $C$ is dependent on $q$ and $\delta_{2k}$ and $T$ is the index set of the $k$ nonzero entries of the sparse solution $x_0$.

Under the assumption that the $\ell_q$ minimization (5) can be computed, the sensing matrix $\Phi$ thus requires a more relaxed condition on the restricted isometry constant in order to find the sparse solution than the conditions listed above for Theorem 1. For simplicity, we only discuss the sparse solution for noiseless recovery in this paper. We leave the discussion of noisy recovery to the interested reader. After we establish an elementary inequality in the preliminary section 2, we prove our main result in section 3. Finally we give a few remarks in section 4.

2 Preliminary Results

Let $x = (x_1, \ldots, x_n)^T$ be a vector in $\mathbb{R}^n$. We use $\|x\|_p$ to denote the standard norm of $x$ for any $p \ge 1$ and the standard quasi-norm when $0 < p < 1$. Recall that we have the following standard inequality,
$$\|x\|_1 \le \sqrt{n}\, \|x\|_2 \quad \text{or} \quad \|x\|_2 \ge \frac{\|x\|_1}{\sqrt{n}}, \qquad (7)$$
by the well-known Cauchy-Schwarz inequality. A converse of the above inequality is
$$\|x\|_2 \le \|x\|_1,$$
which can be seen directly after dividing both sides by $\|x\|_\infty = \max\{|x_i|,\; i = 1, \ldots, n\}$. Recently, Cai, Wang and Xu proved the following interesting inequality in [3].
Lemma 1 (Cai, Wang and Xu 2010) For any $x \in \mathbb{R}^n$,
$$\|x\|_2 \le \frac{\|x\|_1}{\sqrt{n}} + \frac{\sqrt{n}}{4}\Big(\max_{1 \le i \le n} |x_i| - \min_{1 \le i \le n} |x_i|\Big). \qquad (8)$$

We now extend the inequality to the setting of the quasi-norm $\|x\|_q$ with $q \in (0, 1)$. It is easy to see that for $0 < q < 1$,
$$\|x\|_2 \ge \frac{\|x\|_q}{n^{1/q - 1/2}} \qquad (9)$$
by using Hölder's inequality. The following converse inequality,
$$\|x\|_2 \le \|x\|_q, \quad x \in \mathbb{R}^n,$$
is often used in the literature. Motivated by the new inequality in (8), we would like to find a converse of the inequality (9).

Lemma 2 Fix $q \in (0, 1)$. For any $x \in \mathbb{R}^n$,
$$\|x\|_2 \le \frac{\|x\|_q}{n^{1/q - 1/2}} + \sqrt{n}\,\Big(\max_{1 \le i \le n} |x_i| - \min_{1 \le i \le n} |x_i|\Big). \qquad (10)$$

Proof. Without loss of generality, we may assume that $x_1 \ge x_2 \ge \cdots \ge x_n \ge 0$ and that not all $x_i$ are equal. Let
$$f(x) = \|x\|_2 - \frac{\|x\|_q}{n^{1/q - 1/2}}.$$
Let us fix $x_1$ and find an upper bound for $f(x)$. Note that
$$\frac{\partial f}{\partial x_i} = \frac{x_i}{\|x\|_2} - \frac{1}{n^{1/q - 1/2}}\Big(\frac{x_i}{\|x\|_q}\Big)^{q-1}$$
is an increasing function of $x_i$. Indeed, it is easy to see that both functions
$$\frac{\|x\|_2}{x_i} = \Big(\sum_{j=1}^n \frac{x_j^2}{x_i^2}\Big)^{1/2} \quad \text{and} \quad \Big(\frac{\|x\|_q}{x_i}\Big)^{1-q} = \Big(\sum_{j=1}^n \frac{x_j^q}{x_i^q}\Big)^{(1-q)/q}$$
of $x_i$ are decreasing. Thus, $\partial f/\partial x_i$ is an increasing function of $x_i$, and it follows that $f(x)$ is convex as a function of $x_i$ for each $i = 2, \ldots, n-1$. The maximum is achieved at either $x_i = x_{i-1}$ or $x_i = x_{i+1}$. It follows that when $f$ achieves its maximum, $x$ must be of the form $x_1 = x_2 = \cdots = x_k$ and $x_{k+1} = \cdots = x_n$ for some $1 \le k < n$. Thus,
$$f(x) = \sqrt{k(x_1^2 - x_n^2) + n x_n^2} - \frac{\big(k(x_1^q - x_n^q) + n x_n^q\big)^{1/q}}{n^{1/q - 1/2}}.$$
It is easy to see that
$$f(x) \le \sqrt{n(x_1^2 - x_n^2) + n x_n^2} - \frac{(n x_n^q)^{1/q}}{n^{1/q - 1/2}} = \sqrt{n}\,(x_1 - x_n).$$

To find a better upper bound for some $q < 1$, see Remark 4.4. One can see from Remark 4.4 that it is not an easy task to find out which $k$ maximizes the function
$$g(k) = \sqrt{k(x_1^2 - x_n^2) + n x_n^2} - \frac{\big(k(x_1^q - x_n^q) + n x_n^q\big)^{1/q}}{n^{1/q - 1/2}}$$
(cf. Remark 4.4). Anyway, the result in Lemma 2 is good enough for our application in the next section.

3 Main Results and Proofs

To describe our results, we need more notation. We use $\mathrm{Null}(\Phi)$ to denote the null space of $\Phi$ and $S(x)$ to denote the support of $x \in \mathbb{R}^n$, i.e., $S(x) = \{i : x_i \ne 0\}$ for $x = (x_1, \ldots, x_n)^T$. Recall that $x_0$ is a sparse solution, i.e., $\Phi x_0 = b$ with $S(x_0) \subset T_0$ and the cardinality of $T_0$ less than or equal to $k$. Let $x_q$ be the solution of the minimization problem (5). Recall from [12] that $x_q$ is the unique sparse solution $x_0$ if and only if
$$\|h_{T_0}\|_q^q < \|h_{T_0^c}\|_q^q \qquad (11)$$
for all nonzero vectors $h$ in the null space of $\Phi$. This is called the null space property. Indeed, we have
$$\|x_0\|_q^q = \|x_{0,T_0} + h_{T_0} - h_{T_0}\|_q^q \le \|x_{0,T_0} + h_{T_0}\|_q^q + \|h_{T_0}\|_q^q < \|x_{0,T_0} + h_{T_0}\|_q^q + \|h_{T_0^c}\|_q^q = \|x_0 + h\|_q^q$$
by (11) for any nonzero vector $h$ in the null space of $\Phi$. Thus, $x_0$ is the solution of (5). Another way to show the sufficiency is to let $x_q$ be the solution of (5) and let $h = x_q - x_0$, which is in the null space of $\Phi$. If $h \ne 0$, we have
$$\|x_{0,T_0}\|_q^q = \|x_0\|_q^q \ge \|x_q\|_q^q = \|x_0 + h\|_q^q \ge \|x_{0,T_0}\|_q^q - \|h_{T_0}\|_q^q + \|h_{T_0^c}\|_q^q.$$
It follows that $\|h_{T_0^c}\|_q^q \le \|h_{T_0}\|_q^q < \|h_{T_0^c}\|_q^q$, which is a contradiction, where we have used (11). Thus, $x_q$ is the sparse solution.

The necessity of the null space property (11) can be seen as follows: suppose that there is a nonzero vector $h \in \mathrm{Null}(\Phi)$ such that $\|h_{T_0^c}\|_q^q \le \|h_{T_0}\|_q^q$. Let $x_0 = h_{T_0}$ and $b = \Phi x_0$. If $\|h_{T_0^c}\|_q^q < \|h_{T_0}\|_q^q$, then $-h_{T_0^c}$ satisfies $\Phi(-h_{T_0^c}) = b$ and the minimization (5) would find a solution $x_q$ which is not $h_{T_0}$, the sparse solution for this vector $b$, contradicting the assumption that $x_q$ is the unique sparse solution $x_0$. Similarly, if $\|h_{T_0^c}\|_q^q = \|h_{T_0}\|_q^q$, the minimization (5) may find two solutions $h_{T_0}$ and $-h_{T_0^c}$, which is again a contradiction.

In fact, one can find the smallest constant $\rho < 1$ such that
$$\|h_{T_0}\|_q^q \le \rho\, \|h_{T_0^c}\|_q^q, \quad h \in \mathrm{Null}(\Phi).$$
Indeed, it is easy to see that the following equality holds,
$$\sup_{h \in \mathrm{Null}(\Phi),\, h \ne 0} \frac{\sum_{i \in T_0} |h_i|^q}{\sum_{i \notin T_0} |h_i|^q} = \max_{h \in \mathrm{Null}(\Phi),\, \|h\|_2 = 1} \frac{\sum_{i \in T_0} |h_i|^q}{\sum_{i \notin T_0} |h_i|^q},$$
which is denoted by $\rho$. In general, for $h = x_q - x_0$, let
$$\|h_{T_0}\|_q^q = \tau(h, q)\, \|h_{T_0^c}\|_q^q. \qquad (12)$$
The purpose of the study is to show how to make $\tau(h, q) < 1$ for all nonzero vectors $h$ in the null space of $\Phi$.

For any nonzero vector $h$ in the null space of $\Phi$, we rewrite $h$ as a sum of vectors $h_{T_0}, h_{T_1}, h_{T_2}, \ldots$, each of sparsity at most $k$. Here, $T_0$ corresponds to the locations of the $k$ largest entries of $x_0$; $T_1$ to the locations of the $k$ largest entries of $h_{T_0^c}$; $T_2$ to the locations of the next $k$ largest entries of $h_{T_0^c}$, and so on, where $T_0^c$ stands for the complement of the index set $T_0$ in $\{1, 2, \ldots, n\}$. Without loss of generality, we may assume that $h = (h_{T_0}, h_{T_1}, h_{T_2}, \ldots)^T$ with the cardinality of each $T_i$ equal to $k$ for all $i = 0, 1, 2, \ldots$. Let us introduce another ratio $t := t(h, q) \in [0, 1]$, a number such that
$$\|h_{T_1}\|_q^q = t \sum_{i \ge 1} \|h_{T_i}\|_q^q.$$
First of all, we have
Lemma 3 For $q \in (0, 1)$, we have
$$\sum_{i \ge 2} \|h_{T_i}\|_2^2 \le \frac{(1-t)\, t^{(2-q)/q}}{k^{(2-q)/q}} \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q}. \qquad (13)$$

Proof. It is easy to see that
$$\sum_{i \ge 2} \|h_{T_i}\|_2^2 \le |h_{2k+1}|^{2-q} \sum_{i \ge 2} \|h_{T_i}\|_q^q \le \Big(\frac{\|h_{T_1}\|_q^q}{k}\Big)^{(2-q)/q} (1-t) \sum_{i \ge 1} \|h_{T_i}\|_q^q = \frac{(1-t)\, t^{(2-q)/q}}{k^{(2-q)/q}} \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q},$$
where $|h_{2k+1}|$ denotes the largest entry in absolute value of $h$ outside $T_0 \cup T_1$, so that $|h_{2k+1}|^q \le \|h_{T_1}\|_q^q / k$. The result in (13) follows.

Next we have

Lemma 4 For $q \in (0, 1)$, we have
$$\sum_{i \ge 2} \|h_{T_i}\|_2 \le \frac{1}{k^{1/q - 1/2}} \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{1/q}. \qquad (14)$$

Proof. By Lemma 2 (applied with $n = k$), we have
$$k^{1/q - 1/2}\, \|h_{T_i}\|_2 \le \|h_{T_i}\|_q + k^{1/q}\,\big(|h_{ik+1}| - |h_{ik+k}|\big)$$
for $i \ge 2$, where $|h_{ik+1}|$ and $|h_{ik+k}|$ denote the largest and the smallest entries in absolute value of $h_{T_i}$. It follows that
$$k^{1/q - 1/2} \sum_{i \ge 2} \|h_{T_i}\|_2 \le \sum_{i \ge 2} \|h_{T_i}\|_q + k^{1/q}\, |h_{2k+1}| \le \sum_{i \ge 2} \|h_{T_i}\|_q + \|h_{T_1}\|_q \le \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{1/q},$$
since the sum $\sum_{i \ge 2}(|h_{ik+1}| - |h_{ik+k}|)$ telescopes, $|h_{2k+1}| \le (\|h_{T_1}\|_q^q / k)^{1/q}$, and $q \le 1$.

Furthermore, we have

Lemma 5 For $q \in (0, 1)$, we have
$$\|\Phi(h_{T_0} + h_{T_1})\|_2^2 \ge \frac{1 - \delta_{2k}}{k^{2/q - 1}} \big(\tau(h, q)^{2/q} + t^{2/q}\big) \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q}. \qquad (15)$$
Proof. By the definition of $\delta_{2k}$ and using (9), we have
$$\|\Phi(h_{T_0} + h_{T_1})\|_2^2 \ge (1 - \delta_{2k})\, \|h_{T_0} + h_{T_1}\|_2^2 = (1 - \delta_{2k})\big(\|h_{T_0}\|_2^2 + \|h_{T_1}\|_2^2\big) \ge (1 - \delta_{2k})\, \frac{\|h_{T_0}\|_q^2 + \|h_{T_1}\|_q^2}{k^{2/q - 1}} = \frac{1 - \delta_{2k}}{k^{2/q - 1}} \big(\tau(h, q)^{2/q} + t^{2/q}\big) \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q}.$$

It is easy to see that $\Phi(h_{T_0} + h_{T_1}) = \Phi h - \Phi\big(\sum_{j \ge 2} h_{T_j}\big) = -\Phi\big(\sum_{j \ge 2} h_{T_j}\big)$, since $\Phi h = 0$. We have the following estimate:

Lemma 6 For $q \in (0, 1)$, we have
$$\|\Phi(h_{T_0} + h_{T_1})\|_2^2 = \Big\|\Phi\Big(\sum_{j \ge 2} h_{T_j}\Big)\Big\|_2^2 \le \Big(\frac{(1-t)\, t^{(2-q)/q}}{k^{(2-q)/q}} + \frac{\delta_{2k}}{k^{2/q - 1}}\Big) \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q}. \qquad (16)$$

Proof. A straightforward calculation shows
$$\Big\|\Phi\Big(\sum_{j \ge 2} h_{T_j}\Big)\Big\|_2^2 = \sum_{i, j \ge 2} \langle \Phi h_{T_i}, \Phi h_{T_j} \rangle = \sum_{j \ge 2} \|\Phi h_{T_j}\|_2^2 + 2 \sum_{2 \le i < j} \langle \Phi h_{T_i}, \Phi h_{T_j} \rangle$$
$$\le (1 + \delta_k) \sum_{i \ge 2} \|h_{T_i}\|_2^2 + 2\,\delta_{2k} \sum_{2 \le i < j} \|h_{T_i}\|_2\, \|h_{T_j}\|_2 \le \sum_{i \ge 2} \|h_{T_i}\|_2^2 + \delta_{2k} \Big(\sum_{i \ge 2} \|h_{T_i}\|_2\Big)^2$$
$$\le \Big(\frac{(1-t)\, t^{(2-q)/q}}{k^{(2-q)/q}} + \frac{\delta_{2k}}{k^{2/q - 1}}\Big) \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q},$$
where we used $\delta_k \le \delta_{2k}$, the standard bound $\langle \Phi h_{T_i}, \Phi h_{T_j} \rangle \le \delta_{2k} \|h_{T_i}\|_2 \|h_{T_j}\|_2$ for disjointly supported $h_{T_i}, h_{T_j}$, and Lemmas 3 and 4 in the last step.

By using (15) and (16), and noting that $k^{(2-q)/q} = k^{2/q - 1}$, we have
$$(1 - \delta_{2k})\big(\tau(h, q)^{2/q} + t^{2/q}\big) \le (1-t)\, t^{(2-q)/q} + \delta_{2k},$$
or
$$\tau(h, q)^{2/q} \le \frac{\delta_{2k} + t^{(2-q)/q} - (2 - \delta_{2k})\, t^{2/q}}{1 - \delta_{2k}}. \qquad (17)$$
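The behavior of the bound (17) can be sketched numerically. The snippet below (an illustration only, not part of the proof) evaluates the right-hand side of (17) on a grid of $t \in [0, 1]$ and reports its maximum; the value $\delta_{2k} = 0.49$ and the grid size are sample choices, not quantities fixed by the paper. Whenever the reported maximum is below 1, we have $\tau(h, q) < 1$.

```python
# Illustration only: evaluate the right-hand side of inequality (17),
#   (delta + t^((2-q)/q) - (2 - delta) * t^(2/q)) / (1 - delta),
# over a grid of t in [0, 1] and report its maximum over t.

def rhs_17(delta, q, t):
    """Right-hand side of (17) with delta = delta_{2k}."""
    return (delta + t ** ((2 - q) / q) - (2 - delta) * t ** (2 / q)) / (1 - delta)

def max_rhs_17(delta, q, steps=2000):
    """Maximum of the bound over a uniform grid of t in [0, 1]."""
    return max(rhs_17(delta, q, i / steps) for i in range(steps + 1))

delta = 0.49  # a sample value below 1/2; not prescribed by the paper
for q in (0.5, 0.2, 0.05):
    print(q, max_rhs_17(delta, q))
```

For this sample $\delta_{2k}$, the maximum exceeds 1 at $q = 0.5$ but falls below 1 at the smaller exponents, in line with the claim that the recovery condition holds only for $q$ below some threshold $q_0$.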
Let us study the maximum of the right-hand side of (17) as a function of $t \in [0, 1]$. Letting $s = (2 - q)/(2 - \delta_{2k})$, with $s < 2$, it is easy to see that the maximum happens at $t = s/2$ and
$$\tau(h, q)^{2/q} \le \frac{\delta_{2k} + (s/2)^{(2-q)/q} - (2 - \delta_{2k})(s/2)^{2/q}}{1 - \delta_{2k}} = \frac{\delta_{2k}\, s + q\,(s/2)^{2/q}}{s\,(1 - \delta_{2k})}.$$
If the term on the right-hand side of the inequality is less than 1, then we will have $\tau(h, q) < 1$ and hence, $x_q$ is the sparse solution of (1). To see the range of values of $\delta_{2k}$, we continue with the following simple analysis:
$$\delta_{2k}\, s + q\,(s/2)^{2/q} < s - \delta_{2k}\, s \quad \text{or} \quad 2\delta_{2k} + \frac{q}{s}\Big(\frac{s}{2}\Big)^{2/q} < 1.$$
Further simplification yields
$$\delta_{2k} + \frac{q\,(2 - \delta_{2k})}{2\,(2 - q)}\Big(\frac{2 - q}{2\,(2 - \delta_{2k})}\Big)^{2/q} < \frac{1}{2}. \qquad (18)$$
Since the second term on the left-hand side goes to zero as $q \to 0^+$ — indeed, as $\delta_{2k} < 1$,
$$\Big(\frac{2 - q}{2\,(2 - \delta_{2k})}\Big)^{2/q} \le \Big(\frac{2 - q}{2}\Big)^{2/q} \to \frac{1}{e}$$
while the factor $q\,(2 - \delta_{2k})/(2\,(2 - q))$ tends to zero — we can establish the results in Theorem 3.

Proof of Theorem 3. Based on the proofs of Lemmas 5 and 6, we have
$$\|h_{T_0}\|_q^2 \le \rho_q^2 \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{2/q}, \quad \text{where} \quad \rho_q^2 := \frac{\delta_{2k}\, s + q\,(s/2)^{2/q}}{s\,(1 - \delta_{2k})}.$$
That is,
$$\|h_{T_0}\|_q \le \rho_q \Big(\sum_{i \ge 1} \|h_{T_i}\|_q^q\Big)^{1/q}. \qquad (19)$$
Since $\delta_{2k} < 1/2$, there exists a $q_0$ such that (18) holds, and hence we will have $\rho_q < 1$ for any $q < q_0$. As $x_q$ is a minimizer of (5), for any $x$ which is a solution of the under-determined linear equations $\Phi x = b$, we let $h = x_q - x$ and compute
$$\|x_{T_0}\|_q^q + \|x_{T_0^c}\|_q^q = \|x\|_q^q \ge \|x_q\|_q^q = \|x + h\|_q^q = \sum_{i \in T_0} |x_i + h_i|^q + \sum_{i \in T_0^c} |x_i + h_i|^q \ge \|x_{T_0}\|_q^q - \|h_{T_0}\|_q^q + \|h_{T_0^c}\|_q^q - \|x_{T_0^c}\|_q^q.$$
Thus, we have
$$\|h_{T_0^c}\|_q^q \le \|h_{T_0}\|_q^q + 2\,\|x_{T_0^c}\|_q^q. \qquad (20)$$
Together with (19), we conclude
$$\sum_{i \ge 1} \|h_{T_i}\|_q^q \le \rho_q^q \sum_{i \ge 1} \|h_{T_i}\|_q^q + 2\,\|x_{T_0^c}\|_q^q.$$
That is,
$$\sum_{i \ge 1} \|h_{T_i}\|_q^q \le \frac{2}{1 - \rho_q^q}\, \|x_{T_0^c}\|_q^q.$$
By (19), we have
$$\|h\|_q^q = \|h_{T_0}\|_q^q + \sum_{i \ge 1} \|h_{T_i}\|_q^q \le (\rho_q^q + 1) \sum_{i \ge 1} \|h_{T_i}\|_q^q \le \frac{2\,(1 + \rho_q^q)}{1 - \rho_q^q}\, \|x_{T_0^c}\|_q^q.$$
This completes the proof.

4 Remarks

We have a few remarks in order.

Remark 4.1 Clearly, the results in Theorem 3 can be extended to the noisy recovery setting as in [4] and [11]. We leave the discussion to the interested reader.

Remark 4.2 The results in Theorem 3 can also be extended to deal with sparse solutions for multiple measurement vectors as discussed in [13]. We omit the details.
Remark 4.3 Recently the block sparse solution of compressed sensing problems was introduced and studied in [8], [1], and has many practical applications, such as DNA microarrays [17], multiband signals [16], and magnetoencephalography (MEG) [9]. In recovering the sparse solution $x$ from $\Phi x = b$, the entries of $x$ are grouped into blocks. That is, $x = (x_{t_1}, x_{t_2}, \ldots, x_{t_l})$ with $x_{t_i}$ being a block of entries for each $i$. One looks for the fewest number of nonzero blocks $x_{t_i}$ such that $\Phi x = b$. Letting
$$\|x\|_{2,q} = \Big(\sum_{i=1}^l \|x_{t_i}\|_2^q\Big)^{1/q}$$
be a mixed norm, with $\|x_{t_i}\|_2$ the standard $\ell_2$ norm of the vector $x_{t_i}$, one finds the block sparse solution via
$$\min\{\|x\|_{2,q} : \Phi x = b\}.$$
(Cf. [8] for $q = 1$.) The concept of restricted isometry constant was extended to this mixed norm minimization when $q = 1$ in [8]. Our study in section 3 can be generalized to this setting. We leave the details to the interested reader.

Remark 4.4 In order to find a better upper bound in Lemma 2, we need to find out which $k$ maximizes $f(x)$. Let us treat the right-hand side of the equation at the end of the proof of Lemma 2 as a function
$$g(k) = \sqrt{k(x_1^2 - x_n^2) + n x_n^2} - \frac{\big(k(x_1^q - x_n^q) + n x_n^q\big)^{1/q}}{n^{1/q - 1/2}}.$$
Note that $g(n) = 0$ and $g(0) = 0$. The maximum of $g$ must occur at a $k$ between 1 and $n - 1$. The derivative of $g$ is
$$g'(k) = \frac{x_1^2 - x_n^2}{2\sqrt{k(x_1^2 - x_n^2) + n x_n^2}} - \frac{x_1^q - x_n^q}{q\, n^{1/q - 1/2}}\,\big(k(x_1^q - x_n^q) + n x_n^q\big)^{1/q - 1}.$$
The critical point satisfies
$$\frac{q\, n^{1/q - 1/2}\,(x_1^2 - x_n^2)}{2\,(x_1^q - x_n^q)} = \sqrt{k(x_1^2 - x_n^2) + n x_n^2}\;\big(k(x_1^q - x_n^q) + n x_n^q\big)^{1/q - 1}.$$
That is,
$$\sqrt{k(x_1^2 - x_n^2) + n x_n^2} = \frac{q\, n^{1/q - 1/2}\,(x_1^2 - x_n^2)}{2\,(x_1^q - x_n^q)}\,\big(k(x_1^q - x_n^q) + n x_n^q\big)^{1 - 1/q}. \qquad (21)$$
The critical point $k$ is not easy to find except for $q = 1$. Let us try the particular value $q = \frac{2}{3}$. In this case, we have
Lemma 7 For any $x \in \mathbb{R}^n$, one has
$$\|x\|_2 \le \frac{\|x\|_q}{n^{1/q - 1/2}} + \frac{2}{3\sqrt{3}}\,\sqrt{n}\,\Big(\max_{1 \le i \le n} |x_i|^q - \min_{1 \le i \le n} |x_i|^q\Big)^{1/q} \qquad (22)$$
for $q = \frac{2}{3}$. In particular, one has
$$\|x\|_2 \le \frac{\|x\|_q}{n^{1/q - 1/2}} + \frac{2}{3\sqrt{3}}\,\sqrt{n}\,\Big(\max_{1 \le i \le n} |x_i| - \min_{1 \le i \le n} |x_i|\Big). \qquad (23)$$

Proof. Setting $s := x_1^{2/3}$ and $t := x_n^{2/3}$, a standard calculation shows that $g'(k) = 0$ is achieved at
$$k^* = n\, \frac{\sqrt{p(s, t)} - 3\,(s^2 t + s t^2 + 2 t^3)}{6\,(x_1^2 - x_n^2)}, \quad \text{where} \quad p(s, t) := 4s^6 + 12s^5 t + 33s^4 t^2 + 46s^3 t^3 + 33s^2 t^4 + 12s t^5 + 4t^6.$$
A further computation gives
$$g(k^*) = \frac{\sqrt{n}}{6\sqrt{6}}\, F(s, t), \quad \text{where} \quad F(s, t) := 6\sqrt{\sqrt{p(s, t)} - 3st(s + t)} - \Big(\frac{\sqrt{p(s, t)} + 3st(s + t)}{s^2 + st + t^2}\Big)^{3/2}.$$
To find an upper bound of $F(s, t)$, we may consider $F(1, y)$ with $y = t/s$ for a fixed $s$, since $F$ is homogeneous of degree $3/2$. It is easy to plot $F(1, y)$ and $4\sqrt{2}\,(1 - y)^{3/2}$, as in Fig. 1, together with their difference, which shows $F(1, y) \le 4\sqrt{2}\,(1 - y)^{3/2}$. Hence $g(k^*) \le \frac{2}{3\sqrt{3}}\sqrt{n}\,(s - t)^{3/2}$ and the inequality (22) follows. Furthermore, by the quasi-triangle inequality for $q = 2/3$,
$$\Big(\max_{1 \le i \le n} |x_i|^q - \min_{1 \le i \le n} |x_i|^q\Big)^{1/q} \le \max_{1 \le i \le n} |x_i| - \min_{1 \le i \le n} |x_i|,$$
one obtains the inequality in (23).

The analysis above just shows that a better estimate than Lemma 2 is hard to find.
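A quick numerical sanity check of inequality (23) can be run on random inputs. The sketch below is an illustration only (the test vectors and sample sizes are arbitrary choices); it also exercises a two-level configuration $x_1 = \cdots = x_k = 1$, $x_{k+1} = \cdots = x_n = 0$ with $k = n/3$, where the bound is attained with equality.

```python
# Illustration only: check inequality (23) for q = 2/3,
#   ||x||_2 <= ||x||_q / n^(1/q - 1/2) + (2/(3*sqrt(3))) * sqrt(n) * (max|x_i| - min|x_i|),
# on random vectors and on a near-extremal two-level configuration.
import math
import random

def holds_23(x, q=2.0 / 3.0):
    n = len(x)
    a = sorted(abs(v) for v in x)
    norm2 = math.sqrt(sum(v * v for v in a))
    normq = sum(v ** q for v in a) ** (1.0 / q)
    rhs = normq / n ** (1.0 / q - 0.5) \
        + (2.0 / (3.0 * math.sqrt(3.0))) * math.sqrt(n) * (a[-1] - a[0])
    return norm2 <= rhs + 1e-9  # small tolerance for floating point

random.seed(0)
trials = [[random.gauss(0.0, 1.0) for _ in range(50)] for _ in range(500)]
# Two-level configuration with k = n/3 (here n = 51, k = 17),
# where (23) holds with equality: both sides equal sqrt(17).
extremal = [1.0] * 17 + [0.0] * 34
print(all(holds_23(x) for x in trials) and holds_23(extremal))
```

The equality case confirms that the constant $2/(3\sqrt{3})$ in (23) cannot be improved for this family of vectors.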
Figure 1: The graphs of $F(1, t)$ and $4\sqrt{2}\,(1 - t)^{3/2}$ (left) and the graph of their difference (right).

References

[1] R. G. Baraniuk, V. Cevher, M. F. Duarte, C. Hegde, Model-based compressive sensing, IEEE Trans. Inform. Theory 56 (2010) 1982-2001.

[2] T. Cai, L. Wang, G. Xu, Shifting inequality and recovery of sparse signals, IEEE Trans. Signal Process. 58 (2010) 1300-1308.

[3] T. Cai, L. Wang, G. Xu, New bounds for restricted isometry constants, IEEE Trans. Inform. Theory 56 (2010) 4388-4394.

[4] E. Candès, The restricted isometry property and its implications for compressed sensing, C. R. Acad. Sci. Paris, Ser. I 346 (2008) 589-592.

[5] E. Candès and T. Tao, Decoding by linear programming, IEEE Trans. Inform. Theory 51 (2005) 4203-4215.

[6] M. Davies and R. Gribonval, Restricted isometry constants where $\ell_p$ sparse recovery can fail for $0 < p \le 1$, IEEE Trans. Inform. Theory, in press.

[7] R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Process. Lett. 14 (2007) 707-710.

[8] Y. C. Eldar, M. Mishali, Robust recovery of signals from a structured union of subspaces, IEEE Trans. Inform. Theory 55 (2009) 5302-5316.
[9] Y. C. Eldar, P. Kuppinger, H. Bolcskei, Block-sparse signals: uncertainty relations and efficient recovery, IEEE Trans. Signal Process. 58 (2010) 3042-3054.

[10] S. Foucart, A note on guaranteed sparse recovery via $\ell_1$-minimization, Appl. Comput. Harmon. Anal. 29 (2010) 97-103.

[11] S. Foucart, M.-J. Lai, Sparsest solutions of underdetermined linear systems via $\ell_q$-minimization for $0 < q \le 1$, Appl. Comput. Harmon. Anal. 26 (2009) 395-407.

[12] R. Gribonval and M. Nielsen, Sparse decompositions in unions of bases, IEEE Trans. Inform. Theory 49 (2003) 3320-3325.

[13] M.-J. Lai and Louis Y. Liu, The null space property for sparse recovery from multiple measurement vectors, to appear in Appl. Comput. Harmon. Anal., 2010.

[14] M.-J. Lai and J. Wang, An unconstrained $\ell_q$ minimization for sparse solutions of underdetermined linear systems, to appear in SIAM J. Optim., 2010.

[15] S. Li and Q. Mo, New bounds on the restricted isometry constant $\delta_{2k}$, submitted, 2010.

[16] M. Mishali, Y. C. Eldar, Blind multiband signal reconstruction: compressed sensing for analog signals, IEEE Trans. Signal Process. 57 (2009) 993-1009.

[17] F. Parvaresh, H. Vikalo, S. Misra, B. Hassibi, Recovering sparse signals using sparse measurement matrices in compressed DNA microarrays, IEEE J. Sel. Top. Signal Process. 2 (2008) 275-285.