Sparse Signal Reconstruction via Iterative Support Detection


Sparse Signal Reconstruction via Iterative Support Detection

Yilun Wang and Wotao Yin

Abstract. We present a novel sparse signal reconstruction method, iterative support detection (ISD), aiming to achieve fast reconstruction and a reduced requirement on the number of measurements compared to the classical ℓ1 minimization approach. ISD addresses failed reconstructions of ℓ1 minimization due to insufficient measurements. It estimates a support set I from a current reconstruction and obtains a new reconstruction by solving the minimization problem min{ Σ_{i∉I} |x_i| : Ax = b }, and it iterates these two steps for a small number of times. ISD differs from the orthogonal matching pursuit method, as well as its variants, because (i) the index set I in ISD is not necessarily nested or increasing, and (ii) the minimization problem above updates all the components of x at the same time. We generalize the null space property to the truncated null space property and present our analysis of ISD based on the latter. We introduce an efficient implementation of ISD, called threshold-ISD, for recovering signals with fast decaying distributions of nonzeros from compressive sensing measurements. Numerical experiments show that threshold-ISD has significant advantages over the classical ℓ1 minimization approach, as well as two state-of-the-art algorithms: the iterative reweighted ℓ1 minimization algorithm (IRL1) and the iterative reweighted least-squares algorithm (IRLS). MATLAB code is available for download from .../optimization/L1/ISD/.

Key words. compressed sensing, ℓ1 minimization, iterative support detection, basis pursuit

AMS subject classifications. 68U10, 65K05, 90C25, 90C05

DOI. 10.1137/…

1. Introduction and contributions. Brought to the research forefront by Donoho [3] and Candès, Romberg, and Tao [4], compressive sensing (CS) reconstructs a sparse unknown signal from a small set of linear projections. Let x̄ ∈ R^n denote a k-sparse unknown signal,¹ and let b := A x̄ ∈ R^m represent a set of linear projections of x̄. The optimization problem

(1)    (P0)    min_x ‖x‖_0    subject to (s.t.)    Ax = b,

where ‖x‖_0 is defined as the number of nonzero components of x, can exactly reconstruct x̄ from O(k) random projections. (Throughout this paper, x̄ is used to denote the true signal to be reconstructed.) However, because ‖x‖_0 is nonconvex and combinatorial, (P0) is impractical

Received by the editors September 29, 2009; accepted for publication (in revised form) July 6, 2010; published electronically August 26, 2010. This research was supported in part by an NSF CAREER award, an ONR grant, grants from the U.S. Army Research Laboratory and the U.S. Army Research Office, and an Alfred P. Sloan Research Fellowship.

School of Civil and Environmental Engineering, Cornell University, Ithaca, NY 14853 (yilun.wang@gmail.com). This author's part of the work was done when he was a doctoral student at Rice University. Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005 (wotao.yin@rice.edu).

¹ A k-sparse vector has no more than k nonzero components.

for real applications. A practical alternative is the basis pursuit (BP) problem

(2)    (BP)    min_x ‖x‖_1    s.t.    Ax = b,

or

(3)    min_x ‖x‖_1 + (1/(2ρ)) ‖Ax − b‖_2^2,

where (2) is used when b contains little or no noise and, otherwise, (3) is used with a proper parameter ρ > 0. The BP problems have been known to yield sparse solutions under certain conditions (see [4, , 7] for explanations). They can be solved by recent algorithms found in [3, 23, 2, 6, 5, 27, 3, 29]. It is shown in [6, 25] that, when A is a Gaussian random or partial Fourier ensemble, BP with high probability returns a solution equal to x̄ from m = O(k log(n/k)) and O(k log^4(n)) linear measurements, respectively, which are much smaller than n. Compared to (P0), BP is much easier to solve but requires significantly more measurements.

We propose an iterative support detection (ISD) method that runs as fast as the best BP algorithms but requires significantly fewer measurements. ISD alternately calls its two components: support detection and signal reconstruction. From an incorrect reconstruction, support detection identifies an index set I containing some elements of supp(x̄) = {i : x̄_i ≠ 0}, and signal reconstruction solves

(4)    (Truncated BP)    min_x ‖x_T‖_1    s.t.    Ax = b,

where T = I^C and ‖x_T‖_1 = Σ_{i∉I} |x_i| (or solves a least-squares penalty version corresponding to (3)). Assuming a sparse original signal x̄, if I = supp(x̄), then the solution of (4) is, of course, equal to x̄. But this also happens if I contains enough, not necessarily all, entries of supp(x̄). When I does not have enough of supp(x̄) for an exact reconstruction, those entries of supp(x̄) in I will help (4) return a better solution, which has a reduced error compared to the solution of (2). From this better solution, support detection will be able to identify more entries in supp(x̄) and thus yield a better I. In this way, the two components of ISD work together to gradually recover supp(x̄) and improve the reconstruction. Given sufficient measurements, ISD can finally recover x̄. Furthermore, exact reconstruction can happen even if I includes a small number of spurious indices outside of supp(x̄). A simple demo in section 2 illustrates the above for a sparse Gaussian signal.

ISD requires reliable support detection from inexact reconstructions, which must take advantage of the features and prior information about the true signal x̄. In this paper, we focus on sparse or compressible signals with components having a fast decaying distribution of nonzeros. For these signals, we perform support detection by thresholding the solution of (4), and we call the corresponding algorithm threshold-ISD. We present different thresholding rules, including a simple one given along with the demo in section 2 and a more efficient one discussed in subsection 4.1. The latter rule was used throughout our numerical experiments in section 5.

To provide theoretical explanations for ISD, we analyze the model (4) based on a so-called truncated null space property of A, an extension of the null space property originally studied in [9] and later in [3, 32, 3, ], which gives the widely used restricted isometry property [5] in certain cases.
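Since (4) is an ℓ1 problem with some coefficients removed from the objective, it can be prototyped with any LP solver. The following is a minimal MATLAB sketch, ours and not the authors' released code, that casts the truncated BP model (4) as a linear program; the function name truncated_bp, the use of linprog from the Optimization Toolbox, and all variable names are our assumptions for illustration.

    function x = truncated_bp(A, b, I)
    % A sketch: solve the truncated BP model (4),
    % min sum_{i in T} |x_i| s.t. Ax = b with T = I^C,
    % as a linear program in the variables [x; u] with u_i >= |x_i|.
        [m, n] = size(A);
        w = ones(n, 1); w(I) = 0;            % zero weight on the detected set I
        f = [zeros(n, 1); w];                % objective: sum of u_i over T
        Aineq = [ eye(n), -eye(n);           %  x - u <= 0
                 -eye(n), -eye(n)];          % -x - u <= 0 (together: u >= |x|)
        bineq = zeros(2*n, 1);
        Aeq = [A, zeros(m, n)];              % equality constraints Ax = b
        lb = [-inf(n, 1); zeros(n, 1)];      % x free, u >= 0
        z = linprog(f, Aineq, bineq, Aeq, b, lb, []);
        x = z(1:n);
    end

With I = [], this reduces to the BP problem (2). The paper's experiments use the much faster YALL1 solver instead (see subsection 4.2); this LP form is only meant to make the model concrete.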

We establish sufficient conditions for (4) to return x̄ exactly. When x̄ is not exactly sparse, its exact reconstruction is generally impossible. We show an error bound between the solution of (4) and x̄. Built upon these results for a single instance of (4), the following result for ISD is obtained: the chance for (4) to return a sparse signal x̄ improves if, in the new detections at each iteration, the true nonzeros are more than the false ones by a certain factor. These results are independent of the specific support detection methods used for generating I. However, we have yet to obtain a global convergence result for ISD.

While a recovery guarantee has not been obtained, numerical comparisons to state-of-the-art algorithms show that threshold-ISD runs very fast and requires very few measurements. Threshold-ISD calls YALL1 [3] with a warm-start and a dynamic stopping rule to efficiently solve (4). As a result, the threshold-ISD time is comparable to the YALL1 time for solving the BP model. Threshold-ISD was compared to BP, the iteratively reweighted least squares algorithm (IRLS) [8], and the iteratively reweighted ℓ1 minimization algorithm (IRL1) [7] on various types of synthetic and real data. IRLS and IRL1 are known for their state-of-the-art reconstruction rates, on both noiseless and noisy measurements. Given the same number of measurements, threshold-ISD and IRL1 returned better signals than IRLS, which was even better than BP. Comparing threshold-ISD and IRL1, the former ran an order of magnitude faster.

The rest of this paper is organized as follows. In section 2, the algorithmic framework of ISD is given along with a simple demo. Section 3 presents preliminary theoretical results. Sections 4 and 5 study the details of threshold-ISD and present our numerical results, respectively. Section 6 is devoted to conclusions and discussions on future research.

2. Algorithmic framework. We first present the algorithmic framework of ISD.

Input: A and b.
1. Set the iteration number s ← 0 and initialize the set of detected entries I^(0) ← ∅;
2. While the stopping condition is not met, do
   (a) T^(s) ← (I^(s))^C := {1, 2, ..., n} \ I^(s);
   (b) x^(s) ← solve truncated BP (4) for T = T^(s);
   (c) I^(s+1) ← support detection using x^(s) as the reference;
   (d) s ← s + 1.

Since T^(0) = {1, ..., n}, (4) in step 2(b) reduces to (2) in iteration 0. Like greedy algorithms such as orthogonal matching pursuit (OMP) [26], StOMP [2], and CoSaMP [24], ISD iteratively maintains a set I of selected indices and updates x. However, ISD differs from greedy algorithms in how I is grown and x^(s) is updated. The index set I in ISD is not necessarily nested or increasing over the iterations. In terms of index set selection, ISD is more like CoSaMP, and different from OMP and StOMP. At each iteration, after the index set I is computed, ISD updates all the components of x, including both the detected and undetected ones, at the same time. Both of these differences are important since they allow ISD to reconstruct certain sparse signals that cannot be recovered by the existing greedy algorithms.

A demo. We generated a sparse signal x̄ of length n = 200 with k = 25 nonzero numbers independently sampled from the standard Gaussian distribution and assigned to randomly chosen components of x̄. We let m = 60, created a Gaussian random m × n matrix A, and set b := A x̄.
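To make the framework concrete, here is a minimal MATLAB sketch of the ISD loop, assuming the hypothetical truncated_bp solver sketched in section 1 and the simple threshold rule (5)-(6) used in this demo; the stopping test and all names are our illustrative choices, not the released implementation.

    function x = isd_demo(A, b, beta, maxiter)
    % Sketch of ISD: alternate truncated BP (step 2(b)) with
    % threshold-based support detection (step 2(c)) per rules (5)-(6).
        I = [];                                   % I^(0): nothing detected yet
        for s = 0:maxiter-1
            x = truncated_bp(A, b, I);            % step 2(b): solve (4), T = I^C
            eps_s = norm(x, inf) / beta^(s + 1);  % rule (6)
            Inew = find(abs(x) > eps_s);          % rule (5)
            if isequal(Inew, I), break; end       % stop when the support stagnates
            I = Inew;                             % note: I need not be nested
        end
    end

For the demo setting below, isd_demo(A, b, 5, 10) mirrors the choice β = 5.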

[Figure 1. An ISD demo that recovers x̄ with 25 nonzeros from 60 random Gaussian measurements. Four subplots show the true signal and the true/false nonzeros of x^(s) for iterations s = 0, 1, 2, 3, with (total, det, c-det, w-det) = (25, 12, 10, 2), (25, 27, 19, 8), (25, 31, 24, 7), and (25, 25, 25, 0), respectively; the relative error drops to nearly machine precision at iteration 3. A table repeats the same quantities.]

We implemented step 2(c) by a threshold rule:

(5)    I^(s+1) ← {i : |x_i^(s)| > ε^(s)}.

To keep it simple for now, we let

(6)    ε^(s) := ‖x^(s)‖_∞ / β^(s+1)

with β = 5. Note that in subsection 4.1, we will present a more reliable rule for determining ε^(s). With 200 dimensions, it is normally considered difficult to recover a signal with 25 nonzeros from merely 60 measurements (a 2.4 measurement-to-nonzero ratio), but ISD returns an exact reconstruction in merely four iterations. The solutions of the four iterations are depicted in the four subplots of Figure 1, where the components of x̄ are marked by circles and the nonzero components of x^(s) are marked separately by two other symbols, standing for true and false nonzeros, respectively.

The thresholds ε^(s) are shown as green lines. To measure the solution qualities, we give the quadruplet (total, det, c-det, w-det) and Err in the title of each subplot and in the table. They are defined as follows:

(total, det, c-det, w-det):
  total: the number of total nonzero components of the true signal x̄.
  det: the number of detected nonzero components, equal to |I^(s+1)| = (c-det) + (w-det).
  c-det: the number of correctly detected nonzero components, i.e., |I^(s+1) ∩ {i : x̄_i ≠ 0}|.
  w-det: the number of falsely detected nonzero components, i.e., |I^(s+1) ∩ {i : x̄_i = 0}|.
Err: the relative error ‖x^(s) − x̄‖_2 / ‖x̄‖_2.

From the upper left subplot, it is clear that x^(0), which was the BP solution, contained a large number of false nonzeros and had a large relative error. However, most of its correct nonzero components were relatively large in magnitude (as a consequence of x̄ having a relatively fast decaying distribution of nonzeros), and the thresholding method (5) with the threshold ε^(0) = ‖x^(0)‖_∞/5 detected 12 nonzeros, among which 10 were true nonzeros and 2 were not. In spite of the 2 false detections, the detection yielded T^(1), which was good enough to let (4) return a much better solution x^(1), depicted in the upper right subplot. This solution further allowed (5), now having the tighter threshold ε^(1) = ‖x^(1)‖_∞/5^2, to yield 19 detected true nonzeros with 8 false detections. Notably, most of the true nonzeros with large magnitude had been correctly detected. The next solution x^(2), depicted in the bottom left subplot, became even better; it well matched the true signal x̄ except for tiny false nonzero components. Method (5) detected 24 true nonzeros of x̄ from x^(2) and 7 false nonzeros, and x^(3) had exactly the same nonzero components as x̄, as well as an error almost as low as the double precision.

ISD is insensitive to a small number of false detections and has an attractive self-correction capacity. It is important to generate every I from x^(s) regardless of what it was previously; otherwise, false detections would be trapped in I. In addition, the performance of ISD is invariant to small variations in the thresholding tolerance in (5). On the same set of data, we also tried to set β = 3 and β = 1.5 in (6) and obtained an exact reconstruction of x̄ in 4 and 6 iterations, respectively.

3. Preliminary theoretical analysis. The preliminary theoretical results in this section explain under what conditions the truncated BP model (4) can successfully reconstruct x̄, especially from measurements that are not enough for BP. Most of the results are based on a property of the sensing matrix A defined in subsection 3.1. Focusing on the minimization problem (4), subsections 3.2 and 3.3 study exact reconstruction conditions for sparse signals and reconstruction errors for compressible signals, respectively. Finally, subsection 3.4 gives a sufficient condition for ISD to improve the chance of perfect reconstruction over its iterations. Note that this condition is not a recovery guarantee, which is yet to be found.

3.1. The truncated null space property. We start by introducing the truncated null space property (t-NSP), a generalization of the null space property (NSP). The NSP is used in slightly different forms and with different names in [3, 32, 3, 9, ]. We adopt the definition

in []: a matrix A ∈ R^{m×n} satisfies the NSP of order L for γ > 0 if

(7)    ‖η_S‖_1 ≤ γ ‖η_{S^C}‖_1

holds for all index sets S with |S| ≤ L and all η ∈ N(A), which is the null space of A. In (7) and the rest of this paper, η_S ∈ R^n denotes the subvector of η consisting of η_i for i ∈ S ⊆ {1, 2, ..., n}, and S^C denotes the complement of S with respect to {1, ..., n}. With γ < 1, the NSP says that any nonzero vector η in the null space of A cannot have an ℓ1-mass concentrated on any set with L or fewer elements. A sufficient exact reconstruction condition for BP is given in [] based on the NSP: the true k-sparse signal x̄ is the unique solution of BP if A has the NSP of order L ≥ k and 0 < γ < 1.

In order to analyze the minimization problem (4) with a truncated ℓ1-norm objective, we now generalize the NSP to the t-NSP.

Definition 1. A matrix A satisfies the t-NSP of order L for γ > 0 and 0 < t ≤ n if

(8)    ‖η_S‖_1 ≤ γ ‖η_{T∩S^C}‖_1

holds for all sets T ⊆ {1, ..., n} with |T| = t, all subsets S ⊆ T with |S| ≤ L, and all η ∈ N(A), the null space of A.

For simplicity, we use t-NSP(t, L, γ) to denote the t-NSP of order L for γ and t, and we use γ* to replace γ and write t-NSP(t, L, γ*) if γ* is the infimum of all the feasible γ satisfying (8). Notice that when t = n, the t-NSP reduces to the NSP. Compared to the NSP, the inequality (8) in the t-NSP has an extra set limiter T of size t. It is introduced to deal with ‖x_T‖_1. Clearly, for a given A and t, γ* is monotonically increasing in L. On the other hand, fixing L, γ* is monotonically decreasing in t. If γ is fixed, then the largest legitimate L is monotonically increasing in t.

3.2. Sufficient recovery conditions of truncated ℓ1 minimization. We first analyze the model (4) and explain why it may require significantly fewer measurements than BP. Below we present a sufficient exact reconstruction condition, in which the requirement ‖x̄‖_0 ≤ L for BP is replaced by ‖x̄_T‖_0 ≤ L.

Theorem 3.1. Let x̄ be a given vector, and let T be a given index set satisfying T ∩ supp(x̄) ≠ ∅. Assume that a matrix A satisfies t-NSP(t, L, γ*) for t = |T|. If ‖x̄_T‖_0 ≤ L and γ* < 1, then x̄ is the unique minimizer of (4) for b := A x̄.

Proof. The true signal x̄ uniquely solves (4) if and only if

(9)    ‖x̄_T + v_T‖_1 > ‖x̄_T‖_1    for all v ∈ N(A), v ≠ 0.

Let S := T ∩ supp(x̄). Since ‖x̄_S‖_1 = ‖x̄_T‖_1, we have

‖x̄_T + v_T‖_1 = ‖x̄_S + v_S‖_1 + ‖v_{T∩S^C}‖_1 = (‖x̄_S + v_S‖_1 − ‖x̄_S‖_1 + ‖v_S‖_1) + ‖x̄_T‖_1 + (‖v_{T∩S^C}‖_1 − ‖v_S‖_1),

where the first parenthesized term is nonnegative by the triangle inequality.

Therefore, having ‖v_S‖_1 < ‖v_{T∩S^C}‖_1 is sufficient for (9). If ‖x̄_T‖_0 ≤ L, then |S| ≤ L. According to the definition of t-NSP(|T|, L, γ*), it holds that ‖v_S‖_1 ≤ γ* ‖v_{T∩S^C}‖_1 < ‖v_{T∩S^C}‖_1.

The assumption T ∩ supp(x̄) ≠ ∅ in Theorem 3.1 is not essential because otherwise x̄ is a trivial solution of (4). In addition, if A_{T^C} has independent columns, then x̄ is the unique solution. We note that t-NSP(|T|, L, γ*) is stricter than what is needed when T is given because (8) is required to hold for all T with |T| = t.

The following lemma states that the t-NSP is satisfied by Gaussian matrices of appropriate sizes. Our proof is inspired by the work [33].

Lemma 3.1. Let m < n. Assume that either A ∈ R^{m×n} is a standard Gaussian matrix (i.e., one with independent and identically distributed (i.i.d.) standard normal entries) or there exists a standard Gaussian matrix B ∈ R^{n×(n−m)} such that AB = 0. Given an index set T, with probability greater than 1 − e^{−c₀(n−m)}, the matrix A satisfies t-NSP(t, L, γ) for t = |T| and γ = √L / (2√(k(d)) − √L), where

(10)    k(d) := c₁ (m − d) / (1 + log((n − d)/(m − d))),

d = n − |T|, and c₀, c₁ > 0 are absolute constants independent of the dimensions m, n, and d.

In ISD, d equals the number of detected entries (including both correct and false detections). d determines k(d) and in turn the t-NSP parameter γ. Since the sufficient condition of Theorem 3.1 requires γ* < 1 (i.e., there exists a γ < 1), we see that support detection affects the chance of recovery by (4). Because k(d) plays a pivotal role here, we analyze its formula (10) at the end of this subsection. Also, we note that the γ given in Lemma 3.1 is not necessarily tight.

Proof. Let the columns of B span N(A); i.e., B ∈ R^{n×(n−m)} and AB = 0, and let P_T refer to the projection onto the coordinates T. Then, Λ = {v_T : v ∈ N(A)} = {(P_T B)w : w ∈ R^{n−m}} is a randomly drawn subspace in R^{|T|} with dimension up to (n − m). The Kashin-Garnaev-Gluskin result [22, 8] states that for any p < q, with probability 1 − e^{−c₀ p}, a randomly drawn p-dimensional subspace V_p ⊆ R^q satisfies

‖z‖_1 ≥ c₂ √(q − p) / √(1 + log(q/(q − p))) · ‖z‖_2    for all z ∈ V_p, z ≠ 0,

where c₀ and c₂ are independent of the dimensions. Applying this result with q := |T| = n − d and p := n − m, we obtain

‖v_T‖_1 ≥ c₂ √((n−d) − (n−m)) / √(1 + log((n−d)/((n−d) − (n−m)))) · ‖v_T‖_2    for all v ∈ N(A), v ≠ 0,

or

‖v_T‖_1 ≥ c₂ √(m − d) / √(1 + log((n−d)/(m−d))) · ‖v_T‖_2    for all v ∈ N(A), v ≠ 0.

Let k(d) be defined in (10), where c₁ = c₂²/4; the last display then reads 2√(k(d)) ‖v_T‖_2 ≤ ‖v_T‖_1. For all S ⊆ T with |S| ≤ L we thus have

‖v_S‖_1 ≤ √|S| ‖v_S‖_2 ≤ √L ‖v_S‖_2 ≤ √L ‖v_T‖_2 ≤ (√L / (2√(k(d)))) ‖v_T‖_1,

and since ‖v_T‖_1 = ‖v_S‖_1 + ‖v_{T∩S^C}‖_1, this is equivalent to ‖v_S‖_1 ≤ γ ‖v_{T∩S^C}‖_1, where γ = √L / (2√(k(d)) − √L). The lemma follows from the definition of the t-NSP.

Theorem 3.1 and Lemma 3.1 lead to the following theorem.

Theorem 3.2. Let x̄ ∈ R^n and T be given such that T ∩ supp(x̄) ≠ ∅. Let m < n and A ∈ R^{m×n} be given as in Lemma 3.1. Then, with probability greater than 1 − e^{−c₀(n−m)}, the true signal x̄ is the unique solution of (4) for b := A x̄ if

(11)    ‖x̄_T‖_0 < k(d),

where k(d) is defined in (10), d = n − t = n − |T|, and c₀, c₁ > 0 are absolute constants independent of the dimensions m, n, and d. Furthermore, let d_c = |I ∩ supp(x̄)| denote the number of correct detections. Then, (11) is equivalent to

(12)    ‖x̄‖_0 < k(d) + d_c.

Proof. In Lemma 3.1, let L = ‖x̄_T‖_0. If L < k(d), then γ < 1, and the first result follows from Theorem 3.1. Inequalities (11) and (12) are equivalent since ‖x̄‖_0 = ‖x̄_T‖_0 + d_c.

Note that when d = 0, condition (11) reduces to the existing result for BP: for the same vector x̄ and A given in Theorem 3.2 above, with probability greater than 1 − e^{−c₀(n−m)}, x̄ is the unique solution of (4) for b := A x̄ if ‖x̄‖_0 < c₁ m / (1 + log(n/m)); namely, the inequality (11) holds for d = 0.

In light of (12), to compare BP with truncated BP, we shall compare k(0) with k(d) + d_c. Below we argue that if there are enough correct detections (i.e., d_c/d is sufficiently large), then we get k(0) < k(d) + d_c, and it is easier for truncated BP to recover x̄ than it is for BP. To see this, we start with

(13)    k(0) = k(d) − d k′(ξ)    for some ξ ∈ (0, d),

by the mean value theorem, and study k′(d). Because (4) is equivalent to a linear program, which has a solution with no more than m nonzeros, we naturally assume d < m. Then, we obtain

(14)    k′(d) = −c₁ [ 1 + log((n−d)/(m−d)) + (n−m)/(n−d) ] / (1 + log((n−d)/(m−d)))^2 < 0.

On the other hand, we have −1 < k′(d) for the following reasons. First, it is well known that universal stable reconstruction (by any method) requires ‖x̄‖_0 < m/2, which we now

assume. Second, when BP fails (which is the case we are interested in), Theorem 3.2 gives us k(0) ≤ ‖x̄‖_0. Therefore, we have k(0) < m/2, or

(15)    c₁ m / (1 + log(n/m)) < m/2.

Plugging (15) into (14), one can deduce −1 < k′(d) < 0, which together with (13) gives

(16)    k(0) = k(d) − d k′(ξ) < k(d) + d.

Equation (16) means that if d = d_c (i.e., there is no false detection), truncated BP does better than BP. For practical reasons, we do not always have d = d_c, but, on the other hand, one does not push ‖x̄‖_0 to its limit at m/2. Consequently, (15) is very conservative. In fact, it is safe to assume c₁ m / (1 + log(n/m)) < m/4, which gives −1/2 < k′(d) < 0 and thus

(17)    k(0) = k(d) − d k′(ξ) < k(d) + d/2.

Comparing the right-hand sides of (12) and (17), we can see that as long as d_c ≥ (1/2) d (i.e., at least half of the detections are correct), truncated BP is more likely than BP to recover x̄. It is easy to extend this analysis to the iterations of ISD as follows. Since k(d) reduces at a rate slower than 1/2, k(d) + d_c will increase as long as d_c grows fast enough in the sense that Δd_c/Δd > 1/2 between iterations. Finally, we can also observe that the higher the sample ratio m/n is, the smaller |k′(d)| is. Hence, the sufficient reconstruction condition becomes even easier to satisfy.

3.3. Stability of truncated ℓ1 minimization. Because many practical signals are not exactly sparse, we study the reconstruction error of (4) applied to general signals, which is expressed in the best L-term approximation error of x̄:

σ_L(x̄) := inf{ ‖x̄ − x‖_1 : ‖x‖_0 ≤ L, x ∈ R^{dim(x̄)} }.

For a signal x̄ with a fast decaying tail (in terms of the distribution of its entries), this error is much smaller than ‖x̄‖_1. Theorem 3.3 below states that under certain conditions on A, (4) returns a solution with an ℓ1-error bounded by σ_L(x̄) up to a constant factor depending only on |T|. The theorem needs the following lemma, which is an extension of Lemma 4.2 in [].

Lemma 3.2. Consider problem (4) with a given T, and let z, z′ ∈ F(b), the set of points satisfying Ax = b. Assume that A satisfies t-NSP(t, L, γ), where t = |T| and γ < 1. Let S ⊆ T be the set of indices corresponding to the largest L entries in z_T. We have

(18)    ‖(z′ − z)_{T∩S^C}‖_1 ≤ (1/(1−γ)) ( ‖z′_T‖_1 − ‖z_T‖_1 + 2σ_L(z_T) ),

where σ_L(z_T) is the best L-term approximation error of z_T.

Proof. We have ‖z_{T∩S^C}‖_1 = σ_L(z_T) and

‖(z′ − z)_{T∩S^C}‖_1 ≤ ‖z′_{T∩S^C}‖_1 + ‖z_{T∩S^C}‖_1
                    = ‖z′_T‖_1 − ‖z′_S‖_1 + σ_L(z_T)
                    = ‖z′_T‖_1 + ‖z_T‖_1 − ‖z_T‖_1 − ‖z′_S‖_1 + σ_L(z_T)
                    = ‖z_S‖_1 − ‖z′_S‖_1 + ‖z′_T‖_1 − ‖z_T‖_1 + 2σ_L(z_T)
                    ≤ ‖(z′ − z)_S‖_1 + ‖z′_T‖_1 − ‖z_T‖_1 + 2σ_L(z_T).

Equation (18) follows from the above inequality and the definition of t-NSP(t, L, γ), which says

(19)    ‖(z′ − z)_S‖_1 ≤ γ ‖(z′ − z)_{T∩S^C}‖_1.

Lemma 3.2 leads to the following theorem.

Theorem 3.3. Consider problem (4) for a given T. Assume that A satisfies t-NSP(t, L, γ), where t = |T| and γ < 1. Let x* be the solution of (4), and let x̄ be the true signal. Then, ‖x*_T‖_1 ≤ ‖x̄_T‖_1, and

(20)    ‖x* − x̄‖_1 ≤ 2 C_T σ_L(x̄_T),    where C_T = (1 + (1 + max{1, |T^C|/L}) γ) / (1 − γ).

Proof. For notation cleanliness, we introduce S₁ := T^C = I, S₂ := S ⊆ T, and S₃ := T ∩ S^C, which form a partition of {1, ..., n}. Let z, z′ ∈ F(b).

Case 1: |S₁| ≤ L. We can find S′ ⊆ S₂ such that |S₁ ∪ S′| = L. From t-NSP(t, L, γ) for A, we get

(21)    ‖(z′ − z)_{S₁}‖_1 ≤ ‖(z′ − z)_{S₁∪S′}‖_1 ≤ γ ‖(z′ − z)_{S₃}‖_1.

Case 2: |S₁| > L. Let S′ ⊆ S₁ denote the set of indices corresponding to the largest L entries of (z′ − z)_{S₁}. From t-NSP(t, L, γ) for A, we have

(22)    ‖(z′ − z)_{S₁}‖_1 ≤ (|S₁|/L) ‖(z′ − z)_{S′}‖_1 ≤ γ (|S₁|/L) ‖(z′ − z)_{S₃}‖_1.

Combining (21) and (22) gives

‖(z′ − z)_{S₁}‖_1 ≤ max{1, |S₁|/L} γ ‖(z′ − z)_{S₃}‖_1.

This, together with (18) and (19), gives

(23)    ‖z′ − z‖_1 = ‖(z′ − z)_{S₁}‖_1 + ‖(z′ − z)_{S₂}‖_1 + ‖(z′ − z)_{S₃}‖_1
(24)              ≤ (1 + (1 + max{1, |S₁|/L}) γ) ‖(z′ − z)_{S₃}‖_1
(25)              ≤ C_T ( ‖z′_T‖_1 − ‖z_T‖_1 + 2σ_L(z_T) ).

Finally, let z and z′ denote the true signal x̄ and the solution x* of (4), respectively. The optimality of x* gives ‖x*_T‖_1 ≤ ‖x̄_T‖_1, from which (20) follows.

Theorem 3.3 states that the reconstruction error of (4) is bounded by the best L-term approximation error of x̄_T up to a multiple depending on γ and |T^C|. When t = n, it reduces to the existing result for BP established in [9]:

(26)    ‖x* − x̄‖_1 ≤ 2 (1+γ)/(1−γ) σ_L(x̄),

when A satisfies the NSP of order L for γ ∈ (0, 1). To compare truncated BP with BP on their error bounds given in (20) and (26), respectively, we need to study the tail of x̄. We claim that making correct detections alone is not sufficient to make (20) better than (26). To see this, assume the same γ in both bounds. Support detection makes t = |T| < n and |T^C| > 0. As discussed at the end of subsection 3.1 above, the largest legitimate L for the t-NSP is then no more than that for the NSP. Then it is unclear whether σ_L(x̄_T) is less than σ_L(x̄) or vice versa. In addition, C_T is bigger than (1+γ)/(1−γ). Hence, the error bound in (20) can be bigger than (26). Only if the tail decays fast enough, in the sense that σ_L(x̄_T) ≪ σ_L(x̄), can (20) reliably give a smaller bound than (26). This comparison also applies to two instances of truncated BP, one having a bigger T than the other. For support detection to lead to a reduced reconstruction error, there must be enough correct detections and x̄ must have a fast decaying distribution of nonzeros. This conclusion matches our numerical results given in section 5 below.

3.4. Iterative behavior of ISD. The results in the previous two subsections concern signal reconstruction by (4), not the ISD iterations, and exact reconstruction requires the t-NSP with a parameter γ* < 1. This subsection presents a sufficient condition for the ISD iterations to yield a decreasing sequence of γ*.

Theorem 3.4. Suppose that A has the t-NSP(t, L, γ*) as well as t-NSP(t′, L′, γ*′) with t′ < t and L′ < L. If (L − L′) > γ*(t − t′ − (L − L′)), then γ*′ < γ*.

Proof. Let 0 < J′ < J and 0 < γ < 1. For given T, η, and S that satisfy T ⊆ {1, ..., n}, |T| = t, η ∈ N(A), η ≠ 0, S ⊆ T, |S| = J, and γ = ‖η_S‖_1/‖η_{T\S}‖_1, we have

‖η_{T′\S′}‖_1 = ‖η_{T\S}‖_1 − ‖η_{(T\S)\(T′\S′)}‖_1
            ≥ (1/γ) ‖η_S‖_1 − ‖η_{(T\S)\(T′\S′)}‖_1
            = (1/γ) ‖η_{S′}‖_1 + (1/γ) ‖η_{S\S′}‖_1 − ‖η_{(T\S)\(T′\S′)}‖_1

for any T′ with cardinality t′ and any S′ satisfying S′ ⊆ S, |S′| = J′, S′ ⊆ T′, and S\S′ ⊆ T\T′. In particular, we choose S′ such that S\S′ consists of the largest J − J′ entries of η_{T\T′} in magnitude. According to (J − J′) > γ(t − t′ − (J − J′)), we have

(27)    |S\S′| > γ |(T\S)\(T′\S′)|.

If ‖η_{S\S′}‖_1 ≠ 0, then this condition means (1/γ)‖η_{S\S′}‖_1 > ‖η_{(T\S)\(T′\S′)}‖_1 and, thus, ‖η_{T′\S′}‖_1 > (1/γ)‖η_{S′}‖_1. Otherwise, i.e., ‖η_{S\S′}‖_1 = 0, we have (1/γ)‖η_{S\S′}‖_1 = ‖η_{(T\S)\(T′\S′)}‖_1 = 0. However, we can still show ‖η_{T′\S′}‖_1 > (1/γ)‖η_{S′}‖_1 by showing ‖η_{T\S}‖_1 > (1/γ)‖η_S‖_1; i.e., the first inequality in the equation array above holds strictly. To see this, we first get ‖η_{T\S}‖_1 ≠ 0 from (1/γ)‖η_{S\S′}‖_1 = ‖η_{(T\S)\(T′\S′)}‖_1 = 0, ‖η_{T\S}‖_1 ≥ (1/γ)‖η_S‖_1, γ > 0, and η ≠ 0. Next, we generate S″ by first letting it be S, second dropping any one entry in S\S′ (which has a zero value), and then picking up a nonzero entry in T\S. Such S″ satisfies ‖η_{T\S}‖_1 > ‖η_{T\S″}‖_1 ≥ (1/γ)‖η_{S″}‖_1 > (1/γ)‖η_S‖_1.

Therefore, we have ‖η_{T′\S′}‖_1 > (1/γ)‖η_{S′}‖_1. Therefore γ*′ < γ*.

To understand the result, we assume that BP (iteration 0) fails to reconstruct x̄ so that we can apply t-NSP(t, L, γ*) and t-NSP(t′, L′, γ*′) to iterations 0 and 1, respectively. Recall that the numbers of correct and wrong detections are denoted by d_c and d_w, respectively. We can see (L − L′) > γ*(t − t′ − (L − L′)) being equivalent to d_c > γ* d_w, meaning that ISD from x^(0) must make at least γ* times as many correct detections as wrong detections in order to guarantee γ*′ < γ*, which indicates a forward step toward exact reconstruction. The result can also be applied to two consecutive iterations s and s+1 and gives the condition Δd_c > γ* Δd_w, where γ* applies to iteration s and Δd_c and Δd_w are the changes in the numbers of correct and wrong detections from iteration s to s+1, respectively.

In practice, ISD does not know γ*, d_c, or d_w, but needs to know when to stop its iterations. We suggest two alternatives. The first uses the (m/2 + 1)th largest component of x^(s) in magnitude. According to [], this value is proportional to an upper bound of the error. Therefore, once the value is 0 or small enough, we obtain an (almost) exact reconstruction and can stop the iteration. In addition, we can also stop the iteration if this value stagnates or shows a steady increase over the past few iterations. Another stopping rule is based on comparing I^(s−1) and I^(s). When ISD fails to further improve the solution (including the case when the solution is already exact), I^(s) and I^(s−1) will be (almost) identical.

4. Threshold-ISD for fast decaying signals.

4.1. A support detection scheme based on thresholding. ISD requires reliable support detection. In this section, we present effective detection strategies for signals with a fast decaying distribution of nonzero values (hereafter, we call them the fast decaying signals), which include sparse Gaussian signals and certain power-law decaying signals. Our strategies are based on thresholding

(28)    I^(s+1) := {i : |x_i^(s)| > ε^(s)},    s = 0, 1, 2, ....

We term the resulting algorithm threshold-ISD. Before discussing the choice of ε^(s), we note that the support sets I^(s) are not necessarily increasing and nested; i.e., I^(s) ⊆ I^(s+1) may not hold for all s. This is important because it is very difficult to completely avoid wrong detections by setting ε^(s) based on the available intermediate solutions x^(i) for i ≤ s. The true solution is not known, and a component of x^(i), no matter how big, could nevertheless still be zero in the true solution. Not requiring I^(s) to be monotonic leaves the chance for support detection to remove previous wrong detections, making I^(s) less sensitive to ε^(s) and thus making ε^(s) easier to choose.

We study different rules for ε^(s). We have seen in section 2 the simple rule ε^(s) = ‖x^(s)‖_∞/β^(s+1) with β > 1. This rule is quite effective with an appropriate β. However, the proper range of β is case-dependent. An excessively large β results in too many false detections and consequently low solution quality, while an excessively small β tends to cause a large number of iterations. Because the third rule below is more effective, this first rule is not recommended.

[Figure 2. Illustration of support detection from a failed reconstruction by looking for the first significant jump. Panel (a): the nonzeros of the true signal vs. the true/false nonzeros of the reconstructed signal (n = 200, m = 60, k = 25). Panels (b) and (c): the increasingly sorted components of the failed reconstruction with the adopted and reference threshold values, (c) being a zoom-in of (b). The failed reconstruction was obtained by BP from Gaussian linear measurements of a sparse Gaussian signal. Err is the relative error in the ℓ2-norm.]

The second rule is a toll-based rule, namely, setting ε^(s) so that I^(s) has a given cardinality, which increases in s. We tried cardinality sequences such as 1, 2, ... and 2, 4, ... and obtained high quality reconstructions from a small number of measurements. However, because I^(s) grows slowly, threshold-ISD takes a large number of iterations. In comparison, the third rule below offers a much better balance between quality and speed.

The rule of our choice is based on locating the first significant jump in the increasingly sorted sequence |x_[i]^(s)| (x_[i] denotes the ith largest component of x by magnitude), as illustrated by Figure 2. The rule looks for the smallest i such that

(29)    |x_[i+1]^(s)| − |x_[i]^(s)| > τ^(s),

where τ^(s) is selected below. This amounts to sweeping the increasingly sorted sequence |x_[i]^(s)| and looking for the first jump larger than τ^(s). In the example in Figure 2, it is located at [i] = 88. Then, we set ε^(s) = |x_[i]^(s)|. Typically, this rule allows (28) to detect a large number of true nonzeros with few false alarms. It is less arbitrary than the first rule while also leading to faster convergence than the second rule.
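A minimal MATLAB sketch of this rule, written by us from the description above (the handling of ties and of the no-jump case is our own choice, and all names are illustrative):

    function [I, eps_s] = first_jump_detect(x, tau)
    % Support detection by the first significant jump rule (28)-(29):
    % sort |x| increasingly, find the first gap between neighboring sorted
    % values that exceeds tau, and detect every component above that gap.
        [v, p] = sort(abs(x));                 % increasingly sorted magnitudes
        i = find(diff(v) > tau, 1, 'first');   % smallest i with v(i+1) - v(i) > tau
        if isempty(i)
            I = []; eps_s = Inf;               % no significant jump: detect nothing
        else
            eps_s = v(i);                      % threshold: the value just below the jump
            I = p(i+1:end);                    % indices of components above the jump
        end
    end

In the ISD loop, I^(s+1) = first_jump_detect(x^(s), tau) would replace the simple rule (5)-(6), with tau set by the heuristics discussed below.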

The first significant jump exists because in x^(s), the true nonzeros (the red circles in Figure 2) are large in size and small in number, while the false ones (the blue diamonds in Figure 2) are large in number and small in size. Therefore, the magnitudes of the true nonzeros are spread out, while those of the false ones are clustered. The false ones are the smearing due to the nonzeros in x̄ that are zeroed out in x^(s). To explain this mathematically, let us decompose x^(s), which we assume to have m nonzero entries.² Slightly abusing the notation, we let B = {i : x_i^(s) ≠ 0} denote both the set of basic indices and the corresponding square submatrix of A, so we have B x_B^(s) = b. Let V = B^C. From basic algebra, we get x_B^(s) = x̃_B + B^{-1} A_V x̄_V, where x̃_B = [x̄_U; 0] for U = B ∩ supp(x̄). Because x̄_V tends to have smaller components than x̄_U,³ and B^{-1} has a diluting effect, the components of B^{-1} A_V x̄_V tend to be much smaller than those in x̄_U. Therefore, x_U^(s) are typically dominated by x̄_U and thus have relatively large components.

² Problem (4) can be reduced to a linear program. It is possible that x^(s) has either less or more than m nonzero entries, but for most matrices A used in CS and most signals, x^(s) has exactly m nonzero entries with a nonsingular basis B. This assumption offers technical convenience and is not essential to the argument that follows.

³ Roughly, a fast decaying x̄ can be regarded as consisting of a signal part of large components and a noise part of smaller and zero components. Stability results say that the signal part is mostly included in x̄_U.

Here we adopt a very simple method to define τ^(s) for different kinds of sparse or compressible signals. For the sparse Gaussian signals, we set τ^(s) simply as ‖x^(s)‖_∞/m, which is used in the example shown in Figure 2 and gives no false detection. For the first 5 iterations in the experiments reported in section 5, we made the detection slightly more conservative by increasing τ^(s) to (6/(s+1)) τ^(s), s = 0, 1, 2, 3, 4. Besides sparse Gaussian signals, we also tried the first significant jump rule on synthetic sparse and compressible power-law decaying signals, as well as on the wavelet coefficients of the Shepp-Logan phantom and the cameraman image in section 5. The sparse and compressible power-law decaying signals were constructed by first generating a sequence of numbers obeying the power-decay rule, such as {i^{−1/λ}}_{i=1}^k, followed by multiplying each entry by a random sign and applying a random permutation to the signed sequence. We set τ^(s) according to λ. For example, when λ = 1/3, one can set τ^(s) = ‖x^(s)‖_∞/(2m); when λ = 0.8 or 1, one can set τ^(s) = ‖x^(s)‖_∞/(5m); when λ = 2 or 4, one can set τ^(s) = ‖x^(s)‖_∞/(2m). For the wavelet coefficients of the phantom image and the cameraman image, one can set τ^(s) = ‖x^(s)‖_∞/m. As we did for sparse Gaussian signals, for the first iterations in these experiments, we made the detection slightly more conservative by increasing τ^(s) to (8/(s+1)) τ^(s), s = 0, 1, 2, 3, 4, 5, 6, 7.

The above heuristic, which worked very well in our experiments, is certainly not necessarily optimal; on the other hand, it has been observed that threshold-ISD is not very sensitive to τ^(s). Finally, we give three comments on support detection.

First, jump rules based on the change of magnitudes of two neighboring components are far from optimal because the choice of τ^(s) may vary a lot for different kinds of sparse signals, as shown above. One can apply other available effective jump detection methods [28, 2]. Second, any threshold-based support detection rule requires true signals to have a fast decaying distribution of nonzeros in order to work reliably. It does not work on signals that decay slowly or have no decay at all (e.g., sparse Bernoulli signals). Third, one should be able to find better support detection methods for real world signals such as those with grouped nonzero components (cf. model-based CS [2]) and natural/medical images (cf. edge-enhanced compressive imaging [9]).

4.2. YALL1 and warm-start. The truncated BP model (4) can be solved by most existing ℓ1 algorithms/solvers with straightforward modifications. We chose YALL1 [3] since it is among the fastest ones whether the solution is sparse or not. In the numerical tests reported in section 5, we applied YALL1 to all BP, truncated BP, and weighted ℓ1 problems that arise in threshold-ISD and the compared algorithms. For completeness, we give a short overview of YALL1. Based upon applying the alternating direction method to the Lagrange dual of

min { Σ_{i=1}^n w_i |x_i| : Ax = b },

the iteration in YALL1 has the basic form

(30)    y^{l+1} = α A z^l − β (A x^l − b),
(31)    z^{l+1} = P_w (A^⊤ y^{l+1} + x^l/μ),
(32)    x^{l+1} = x^l + γμ (A^⊤ y^{l+1} − z^{l+1}),

where μ > 0, γ ∈ (0, (1+√5)/2), α = 1, and β = 1/μ. P_w is the orthogonal projection onto the box B_w := {z ∈ C^n : |z_i| ≤ w_i, i = 1, ..., n}. This is a first-order primal-dual algorithm in which the primal variable x and dual variables y and z (a dual slack) are updated at every iteration. In our numerical experiments, we stopped the YALL1 iterations once the relative change ‖x^{l+1} − x^l‖_2 / ‖x^l‖_2 fell below a certain prescribed stopping tolerance.

YALL1 was called multiple times in threshold-ISD to return x^(s), s = 0, 1, .... Because the first significant jump rule does not need a highly accurate solution, a loose stopping tolerance was set in YALL1 for all but the last threshold-ISD iteration. The tolerances used are given in subsection 5.5 below. To determine whether a threshold-ISD iteration s was the final iteration, the loose stopping tolerance was put in place to stop YALL1, and upon stopping of YALL1, I^(s+1) was generated and compared to I^(s) and I^(s−1). If all three I's were identical or almost so, any further iteration would likely give the same or a very similar solution. At this time, we let YALL1 resume from where it stopped and return an accurate final solution until a tighter tolerance was reached. In other words, threshold-ISD uses coarse solutions for support detection and returns a fine solution to the user.

To further accelerate threshold-ISD, we warm-started YALL1 whenever possible. Specifically, for each instance of the BP, truncated BP, and weighted ℓ1 problems, YALL1 was started from the solution (x, y, z) of the previous instance if available.
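For reference, the iteration (30)-(32) can be transcribed into MATLAB as below. This is our paraphrase of the update formulas for real-valued data with fixed parameters, not the actual YALL1 package, and the parameter and function names are placeholders.

    function x = yall1_style(A, b, w, mu, gamma, tol, maxit)
    % Primal-dual iteration (30)-(32) for min sum_i w_i |x_i| s.t. Ax = b,
    % with alpha = 1 and beta = 1/mu (the noiseless setting).
        [m, n] = size(A);
        x = zeros(n, 1); y = zeros(m, 1); z = zeros(n, 1);
        alpha = 1; beta = 1/mu;
        for l = 1:maxit
            xold = x;
            y = alpha*(A*z) - beta*(A*x - b);        % (30)
            g = A'*y + x/mu;
            z = sign(g).*min(abs(g), w);             % (31): projection P_w onto B_w
            x = x + gamma*mu*(A'*y - z);             % (32)
            if norm(x - xold) <= tol*norm(xold), break; end
        end
    end

For the noise-aware problem (40), the same loop applies with α := μ/(μ+ρ) and β := 1/(μ+ρ), as described in subsection 5.2.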

As a result of applying varying stopping tolerances and warm-starts, the total threshold-ISD time, including the time of solving multiple truncated BP problems, was almost the same on average as the time of solving a single BP problem.

5. Numerical implementation and experiments. Threshold-ISD was compared to the BP model (2), the iterative reweighted least-squares algorithm (IRLS) [8], and the iterative reweighted ℓ1 minimization algorithm (IRL1) [7]. IRLS and IRL1 appear to be state-of-the-art in terms of the number of measurements required. These comparisons⁴ show that threshold-ISD requires as few measurements as IRL1 and also runs as fast as BP, which is much faster than IRL1.

5.1. Reviews of IRL1 and IRLS. Let us first briefly review the algorithms IRL1 [7] and IRLS [8]. They are both iterative procedures attempting to solve the following ℓp minimization problem:

(33)    min ‖x‖_p^p    s.t.    Ax = b,

where p ∈ [0, 1]. At the sth iteration, the IRL1 algorithm computes

(34)    x^(s) ← min_x { Σ_{i=1}^n w_i^(s) |x_i| : Ax = b },

where the weights are set as

(35)    w_i^(s) := ( |x_i^(s−1)| + η )^{p−1},

and η is a regularization parameter. Initially, x^(0) is the solution of the BP problem. IRLS iteratively minimizes a weighted ℓ2 function to generate x^(s):

(36)    x^(s) ← min_x { Σ_i w_i^(s) |x_i|^2 : Ax = b }.

The solution of (36) can be given explicitly as

(37)    x^(s) = Q_s A^⊤ (A Q_s A^⊤)^{-1} b,

where Q_s is the diagonal matrix with entries 1/w_i^(s), the weights are set as

(38)    w_i^(s) := ( |x_i^(s−1)|^2 + ζ )^{p/2 − 1},

and ζ is a regularization parameter. Initially, x^(0) is the least-squares solution of Ax = b. We set p := 0 uniformly in the experiments since it is reported that this value leads to better reconstructions than p > 0.

Note that if x^(s−1) in both (35) and (38) is set equal to x̄ and η = ζ = 0, then, following the convention 0/0 = 0, we have Σ_i w_i^(s) |x̄_i| = Σ_i w_i^(s) |x̄_i|^2 = ‖x̄‖_0. This result, though not holding for η, ζ > 0, indicates that the two objective functions are smooth approximations to ‖x‖_0. When η and ζ are large, Σ_{i=1}^n w_i^(s) |x_i| and Σ_i w_i^(s) |x_i|^2 are close to ‖x‖_1 and ‖x‖_2^2, respectively, so they tend to have fewer local minima. Therefore, η and ζ are both initially large and gradually decrease as s increases. The setting of these parameters is given in subsection 5.5.

⁴ More comparisons to other algorithms, including certain greedy algorithms, are given on the second author's website.
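As an illustration of the IRLS update reviewed above, the closed form (37) with p = 0 can be implemented in a few lines of MATLAB. This sketch is our own, uses a fixed ζ rather than the decreasing schedule of subsection 5.5, and is only meant to make the formulas concrete.

    function x = irls_p0(A, b, zeta, maxit)
    % IRLS for (33) with p = 0: iterate (37) with weights (38),
    % i.e., 1/w_i = |x_i|^2 + zeta.
        n = size(A, 2);
        x = A' * ((A*A') \ b);               % least-squares initial solution
        for s = 1:maxit
            q = x.^2 + zeta;                 % diagonal of Q_s (entries 1/w_i)
            Q = spdiags(q, 0, n, n);
            x = Q * (A' * ((A*Q*A') \ b));   % closed form (37)
        end
    end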

5.2. Denoising minimization problems. The measured data is sometimes inaccurate due to various kinds of imprecisions or contaminations. Assume that b = A x̄ + z, where z is i.i.d. Gaussian with zero mean and standard deviation (noise level) σ. When σ is not big, the unconstrained problem (3) is known to yield a faithful reconstruction for an appropriate ρ depending on σ. We found that (3) could be solved faster and yield a slightly more accurate solution than the constrained problem (2). Therefore, (3) was used in our tests with noisy measurements. In our figures, we use L1/L2 for (3).

For IRL1, reweighting was applied to the above noise-aware problem (3), and each of its iterations was changed from (34) to

(39)    x^(s) ← min_x Σ_{i=1}^n w_i^(s) |x_i| + (1/(2ρ)) ‖b − Ax‖_2^2,

where the weights w_i^(s) were generated as before. Given an index set T, threshold-ISD solved the following truncated ℓ1 version of (3):

(40)    x^(s) ← min_x ‖x_{T^(s)}‖_1 + (1/(2ρ)) ‖b − Ax‖_2^2.

The same ρ was set for the three problems (3), (39), and (40), which were all solved by the YALL1 iterations (30), (31), and (32) with new α := μ/(μ+ρ) and β := 1/(μ+ρ). For IRLS, however, we did not relax Ax = b in (36) for noisy measurements because the resulting unconstrained problem is no easier to solve, nor does it return solutions with less error, at least when the error level σ is not excessively large. Therefore, (36) was solved by IRLS for noisy measurements.

5.3. Transform sparsity. In many situations, it is not the true signal x̄ itself but its representation under a certain basis, frame, or dictionary that is sparse or compressible. In such a case, ȳ = W x̄ is sparse or compressible for a certain linear transform W. Instead of minimizing ‖x‖_1 and ‖x_T‖_1, one should minimize ‖Wx‖_1 and ‖(Wx)_T‖_1. Then, the weight variables in IRL1 and IRLS should be updated according to the components of Wx instead of those of x. In the case of transform sparsity, the above simple changes were applied to all algorithms and solvers.

5.4. Experimental settings and test platforms. The test sets are summarized in Table 1. Our experiment included five different test sets, the first four of which used various synthetic signals and standard i.i.d. Gaussian sensing matrices A generated by A=randn(m,n) in MATLAB. The first two sets used noise-free measurements, and different amounts of white noise were added to the measurements in the third set. With noise, exact reconstruction was impossible, so in the third set we did not change m but measured solution errors for different levels of noise. The fourth set used signals with entries following power-laws, among which some of the signals had their tails truncated and were thus sparse. As zero-decay sparse signals are just sparse ±1 signals, this set also included sparse Bernoulli signals. The last (fifth) set used two-dimensional images of different sizes and tested sensing matrices A that were partial discrete cosine matrices formed by choosing the first row and a random subset of the remaining rows from the full discrete cosine matrices. Since all the operations in threshold-ISD and IRL1

involving A (which are Ax and A^⊤x) were computed by the discrete cosine transform, A was never explicitly formed or stored in memory in these two algorithms. On the other hand, IRLS needs explicit matrices A, so it was not tested in the fifth set.⁵ The fifth set also included tests with noise added to the measurements. The images were assumed to be sparse under the two-dimensional Haar wavelets.

⁵ IRLS iterations could be modified to use Ax and A^⊤x rather than A in the explicit form, but for the purpose of this paper, no modification was done.

Table 1. Summary of test sets.

#   Signal or image name         Noise σ   Dimension n   Sparsity k   Measurements m     Repetitions
1   Gaussian                     0         600           8            60:4:100           -
1   Gaussian                     0         600           40           80:10:220          -
1   Gaussian                     0         600           150          250:10:400         -
2   Gaussian                     0         3000          -            200:50:800         -
3   Gaussian                     -         2000          -            325                20
3   Gaussian                     -         2000          -            325                20
3   Gaussian                     -         2000          -            325                20
4   Power-law at varying rates   0         600           40 or 60     varied             -
5   Shepp-Logan phantom          0         16384         685          2359:337:6066      5
5   Shepp-Logan phantom          -         16384         685          2359:337:7414      -
5   Cameraman                    -         65536         65536        6553:2621:32763    -

Specifically, the sparse Gaussian signals were generated in MATLAB by

    xbar = zeros(n,1); p = randperm(n); xbar(p(1:k)) = randn(k,1);

and the sparse Bernoulli signals were generated by the same commands except

    xbar(p(1:k)) = 2*(rand(k,1) > 0.5) - 1;

The power-law decaying signals were generated by

    xbar = zeros(n,1); p = randperm(n);
    xbar(p(1:n)) = sign(randn(n,1)).*((1:n).^(-1/lambda))';
    xbar = xbar/max(abs(xbar));

and, replacing the second line by

    xbar(p(1:k)) = sign(randn(k,1)).*((1:k).^(-1/lambda))';

we obtained the sparse ones. The variable lambda was set to different values (described in subsection 4.1), which controls the rate of decay. The larger lambda was, the lower was the rate of decay.

All test code was written and tested in MATLAB v7.7.0 running in GNU/Linux on a Dell Optiplex GX620 with dual 3.2 GHz Intel Pentium D CPUs (only one CPU was used by MATLAB) and 3 GB of memory.

5.5. Stopping tolerance and smoothing parameters. Performances of all tested code depended on parameters. For fairness, threshold-ISD, BP, IRL1, and IRLS were stopped upon

(41)    ‖x^{l+1} − x^l‖_2 / ‖x^l‖_2 ≤ ε,

with different intermediate but the same final stopping tolerances ε.

Test sets 1, 2, and 4. In all tests, threshold-ISD was set to run no more than 9 iterations (with the reason given below in test sets 1 and 2), in which the first iteration had ε := 10^{-1} and the rest except for the last one had ε := 10^{-2}. Smaller ε values did not make solutions or running times better. ε := 10^{-6} was set for the final iterations of threshold-ISD. The stopping rule is based on comparing consecutive support sets I^(s), as described in subsection 4.2. ε := 10^{-6} was also set for BP, as well as all iterations of IRL1. Larger intermediate ε values would make IRL1 return worse solutions. IRL1 had 9 iterations as recommended in [7], but its smoothing parameter η was initialized to 1 and reduced by half each time, different from but slightly better than the recommendation. As recommended in [8] for IRLS and for test sets 1 and 2, IRLS's smoothing parameter ζ was initialized to 1 and reduced to ζ/10 whenever (41) was satisfied for ε := √ζ/10, until ζ reached 10^{-8}, when the final ε = 10^{-6} became effective. We optimized ε and the stopping value of ζ for the more challenging test 4. For power-law decaying signals (either sparse or not), ε := 10^{-3/2}√ζ and the stopping ζ was set to 10^{-9}; for sparse Bernoulli signals, ε := √ζ and the stopping ζ was set to 10^{-10}. Again, ε finally reached 10^{-6}. The above optimization to IRLS made its frequency of successful reconstruction slightly higher than that of threshold-ISD in a couple of tests.

Test set 3. Because of measurement noise and thus reconstruction errors, it was not necessary to impose a tight tolerance for this set of tests. All four algorithms uniformly had the reduced final ε := σ/10. Threshold-ISD ran no more than 9 iterations with the same intermediate ε values as in test sets 1, 2, and 4. IRL1 ran 9 iterations with the intermediate ε := σ/10 constantly. For IRLS, all but the last iteration had ε := max{σ/10, √ζ/10}.

Test set 5. Threshold-ISD was compared only to BP and IRL1 because the matrices A were too large to form in IRLS. The only parameter change was the final ε := max{10^{-4}, σ/10} for all three tested algorithms.

5.6. Experimental results.

Test set 1. Sparse signals containing k = 40, 8, and 150 nonzeros were used in this set, and the corresponding results are plotted in Figures 3, 4, and 5, respectively. Figure 3 depicts the performance of the four tested algorithms. Figure 3(a) shows that threshold-ISD and IRL1 achieved almost the same recoverability, which was significantly higher than that of IRLS in terms of the number of required measurements. Not surprisingly, the recoverability of the BP method was the worst. Figure 3(b) shows that threshold-ISD was much faster than both IRL1 and IRLS, and was even comparable to BP. To sum up, threshold-ISD was not only the fastest but also required the least number of measurements.

With the small k = 8 (Figure 4), threshold-ISD had no speed advantage over BP, but it was still much faster than IRL1. Qualitywise, threshold-ISD was on par with IRL1 and IRLS and better than BP. With the larger k = 150, threshold-ISD was much faster than both IRL1 and IRLS, and all three achieved comparable recoverability with 100% starting around m = 300.

Test set 2. This test set used larger signals (n = 3000). Figure 6 shows that threshold-ISD, IRL1, and IRLS achieved recoverability similar to what they did in test set 1. Because of the relatively large signal size, IRL1 and IRLS were, however, much slower than threshold-ISD. The fact that threshold-ISD ran as fast as BP suggests that effective support detection accelerates the subproblem solution (by YALL1) in threshold-ISD. The results also show that threshold-ISD was scalable to both signal and measurement sizes.

[Figure 3. Test set 1 with k = 40: comparisons in (a) recoverability (acceptable reconstruction frequencies) and (b) CPU time (s); n = 600, m = 80:10:220.]

[Figure 4. Test set 1 with k = 8: comparisons in (a) recoverability and (b) CPU time; n = 600, m = 60:4:100.]

[Figure 5. Test set 1 with k = 150: comparisons in (a) recoverability and (b) CPU time; n = 600, m = 250:10:400.]

[Figure 6. Test set 2: comparisons in (a) recoverability and (b) CPU time; n = 3000, m = 200:50:800.]

[Figure 7. Test set 3, first noise level: comparisons in (a) CPU time and reconstruction errors in the (b) ℓ2-norm and (c) ℓ1-norm over the 20 trials; n = 2000, m = 325.]

[Figure 8. Test set 3, second noise level: comparisons in (a) CPU time and reconstruction errors in the (b) ℓ2-norm and (c) ℓ1-norm; n = 2000, m = 325.]

Test set 3. This test set compared the solution times and errors of the tested algorithms given noisy measurements with three noise levels σ. The corresponding results are depicted in Figures 7, 8, and 9, respectively, each including three subplots for CPU times and ℓ2- and ℓ1-errors. It is clear from these figures that threshold-ISD was significantly faster than IRL1 and IRLS and slightly faster than BP. Threshold-ISD and IRL1 were on par in solution quality except that at one of the noise levels, threshold-ISD had two fails among the total 20 trials. IRLS had many more fails, and BP was the worst. Taking both reconstruction errors and CPU times into consideration, threshold-ISD appeared to be the best.

Test set 4. This test set included two subsets. The first subset (Figure 10) used sparse signals with nonzeros decaying in power-laws at rates λ = 1, 2, 4, as well as sparse Bernoulli signals. The second subset (Figure 11) used two compressible (nonsparse) signals with nonzeros decaying in power-laws at rates λ = 1/3, 0.8, as well as their sparse tail-removed versions obtained by removing all but the largest k = 80, 40 entries, respectively.

In the first subset of tests, threshold-ISD, IRL1, and IRLS had better recoverability than BP, but the advantage diminished as λ increased (i.e., the rate of decay decreased).

[Figure 9. Test set 3, third noise level: comparisons in (a) CPU time and reconstruction errors in the (b) ℓ2-norm and (c) ℓ1-norm; n = 2000, m = 325.]

In cases of zero-decay, where the signals were sparse Bernoulli, all the tested algorithms had similar recoverability. These results match the conclusion in our theoretical analysis that thresholding-based support detection is effective only if the nonzeros of the signals have a fast decaying distribution.

The second subset of tests shows how the tail of a compressible signal affects its reconstruction. By comparing Figures 11(c) with 11(d), and 11(e) with 11(f) (i.e., tail-free vs. tailed signals), we can observe that it was much easier for any of the tested algorithms to reconstruct a tail-free sparse signal than a compressible signal. In addition, because the signal corresponding to λ = 0.8 has a larger tail, the improvement of threshold-ISD, IRL1, and IRLS over BP was quite small.

Test set 5. Figure 12 depicts the clean Shepp-Logan phantom and its wavelet coefficients sorted by magnitude, which have a long tail and a sharp drop near 600. Threshold-ISD, BP, and IRL1 were tested to reconstruct the phantom from partial discrete cosine measurements both without and with white noise. The results are given in Figures 13 and 14, respectively. For Figure 13, a reconstruction x was accepted if its 2-norm relative error ‖x(:) − x̄(:)‖_2 / ‖x̄(:)‖_2 was within 10^{-3}. Threshold-ISD was both faster than IRL1 and more accurate than IRL1 and BP in both of the tests.

[Figure 10. Test set 4 with sparse power-law decaying and Bernoulli signals: comparisons in recoverability (acceptable reconstruction frequencies) for (a) λ = 1, (b) λ = 2, (c) λ = 4, and (d) Bernoulli; n = 600, k = 40, m = 80:10:220.]

Figure 15 presents the reconstructed phantoms from noisy measurements corresponding to m = 3370, in which subplots (b), (d), and (f) highlight the differences between the reconstructions and the clean phantom. Threshold-ISD gave a much higher signal-to-noise ratio (SNR).

Figure 16 depicts the clean image Cameraman and its sorted wavelet coefficients, which form a long and slowly decaying tail. As expected, threshold-ISD did not return a reconstruction significantly better than either BP or IRL1, as shown in Figure 17. Even though thresholding in the wavelet domain is not effective for natural images, we note that the authors of [9] have obtained medical images with much better quality by combining ISD (applied to total variation) with edge detection techniques that replace thresholding. For this and other reasons, we believe that ISD with effective support detection is potentially very powerful.

In summary, we compared threshold-ISD with BP, IRL1, and IRLS. Threshold-ISD can be solved as quickly as BP yet achieves reconstruction quality as good as or better than that of the much slower IRL1. ISD relies on effective support detection. For threshold-ISD, its good performance requires fast decaying signals.

[Figure 11. Test set 4 with sparse and compressible power-law decay signals: (a) the sparse and (b) the compressible signals for λ = 1/3 and λ = 0.8, entries sorted by magnitude; (c)-(f) comparisons in ℓ2-norm reconstruction errors for the λ = 1/3 and λ = 0.8 signals, tail-free (sparse) versions vs. tailed (compressible) versions. The sparse signals were obtained by removing the tails of the compressible signals. Fast decaying tails are required for good performance of threshold-ISD, IRL1, and IRLS.]

[Figure 12. The Shepp-Logan phantom and its wavelet coefficients sorted by magnitude.]

[Figure 13. Noiseless measurements: acceptable reconstruction frequencies and running times; n = 16384, k = 685, m = 2359:337:6066.]

[Figure 14. Noisy measurements: reconstruction errors and running times; n = 16384, k = 685, m = 2359:337:7414.]

[Figure 15. Noisy measurements: reconstructed phantoms and highlighted errors corresponding to m = 3370 (a 20% sample ratio). Panels (a)-(b): L1/L2, SNR = 8.72 dB, CPU time = 27.78 s; (c)-(d): IRL1, CPU time = 258.5 s; (e)-(f): threshold-ISD, SNR = 45.4 dB, Err = 4.86e-3, CPU time = 44.5 s. Each pair shows the reconstruction and its difference from the clean phantom; threshold-ISD attains the highest SNR.]

[Figure 16. The image Cameraman and its sorted wavelet coefficients.]

[Figure 17. Reconstruction errors where the true signal is the wavelet coefficients of Cameraman and the sensing matrices are partial discrete cosine matrices; n = 65536, m = 6553:2621:32763.]

6. Concluding remarks. This paper introduces the iterative support detection (ISD) method for CS signal reconstruction. Both theoretical properties and practical performances are discussed. For signals with a fast decaying distribution of nonzeros, the implementation threshold-ISD, equipped with the first significant jump thresholding rule, is both fast and accurate compared to the classical BP approach and the state-of-the-art algorithms IRL1 and IRLS. Due to the limit of thresholding, threshold-ISD does not perform significantly better than its peers on other types of signals such as images. However, support detection is not limited to thresholding. Effective support detection guarantees the good performance of ISD. Therefore, future research includes studying specific signal classes and developing more effective support detection means, for example, by exploring signal structures (model-based CS [2]). Since minimizing ℓ1 is not the only approach for CS signal reconstruction, another line of future research is to apply iterative support detection to other reconstruction approaches such as greedy algorithms, Bayesian algorithms, dictionary-based algorithms, and many others. We also feel that the usefulness of the above first significant jump rule is not limited to threshold-ISD.
