Hard Thresholding Pursuit Algorithms: Number of Iterations

Jean-Luc Bouchot, Simon Foucart, Pawel Hitczenko

Abstract
The Hard Thresholding Pursuit algorithm for sparse recovery is revisited using a new theoretical analysis. The main result states that all sparse vectors can be exactly recovered from compressive linear measurements in a number of iterations at most proportional to the sparsity level as soon as the measurement matrix obeys a certain restricted isometry condition. The recovery is also robust to measurement error. The same conclusions are derived for a variation of Hard Thresholding Pursuit, called Graded Hard Thresholding Pursuit, which is a natural companion to Orthogonal Matching Pursuit and runs without a prior estimation of the sparsity level. In addition, for two extreme cases of the vector shape, it is shown that, with high probability on the draw of random measurements, a fixed sparse vector is robustly recovered in a number of iterations precisely equal to the sparsity level. These theoretical findings are experimentally validated, too.

Key words and phrases: compressive sensing, uniform sparse recovery, nonuniform sparse recovery, random measurements, iterative algorithms, hard thresholding.
AMS classification: 65F10, 65J20, 15A29, 94A12.

1 Introduction
This paper deals with the standard compressive sensing problem, i.e., the reconstruction of vectors x ∈ C^N from an incomplete set of m ≪ N linear measurements organized in the form y = Ax ∈ C^m for some matrix A ∈ C^{m×N}. It is now well known that if x is s-sparse (i.e., has only s nonzero entries) and if A is a random matrix whose number m of rows scales like s times some logarithmic factors, then the reconstruction of x is achievable via a variety of methods. ℓ1-minimization is probably the most popular one, but simple iterative algorithms do provide alternative methods. We consider here the hard thresholding pursuit (HTP) algorithm [6] as well as a novel variation, and we focus on the number of iterations needed for the reconstruction. This reconstruction is addressed in two settings:

(S. F. and J.-L. B. partially supported by NSF (DMS-2622), P. H. by Simons Foundation (grant 28766). simon.foucart@centraliens.net)

- An idealized situation, where the vectors x ∈ C^N are exactly sparse and where the measurements y ∈ C^m are exactly equal to Ax. In this case, the exact reconstruction of x ∈ C^N is targeted.
- A realistic situation, where the vectors x ∈ C^N are not exactly sparse and where the measurements y ∈ C^m contain errors e ∈ C^m, i.e., y = Ax + e. In this case, only an approximate reconstruction of x ∈ C^N is targeted. Precisely, the reconstruction error should be controlled by the sparsity defect and by the measurement error. The sparsity defect can be incorporated in the measurement error if y = Ax + e is rewritten as y = Ax_S + e', where S is an index set of s largest absolute entries of x and e' := Ax_{S̄} + e.

We shall mainly state and prove our results in the realistic situation. They specialize to the idealized situation simply by setting e = 0. In fact, setting e = 0 inside the proofs would simplify them considerably.

Let us now recall that (HTP) consists in constructing a sequence (x^n) of s-sparse vectors, starting with an initial s-sparse x^0 ∈ C^N (we take x^0 = 0) and iterating the scheme

(HTP_1)  S^n := index set of s largest absolute entries of x^{n-1} + A*(y - Ax^{n-1}),
(HTP_2)  x^n := argmin{ ‖y - Az‖_2 : supp(z) ⊆ S^n },

until a stopping criterion is met. It had been shown in [ ] that, in the idealized situation, exact reconstruction of every s-sparse x ∈ C^N is achieved in s iterations of (HTP) with y = Ax provided the coherence of the matrix A ∈ C^{m×N} satisfies µ < 1/(3s) (note that this condition can be fulfilled when m ≍ s²). Exact and approximate reconstructions were treated in [6], where it was in particular shown that every s-sparse x ∈ C^N is the limit of the sequence (x^n) produced by (HTP) with y = Ax provided the 3sth restricted isometry constant of the measurement matrix A ∈ C^{m×N} obeys δ_{3s} < 1/√3 (note that this condition can be fulfilled when m ≍ s ln(N/s)). As a reminder, the kth restricted isometry constant δ_k of A is defined as the smallest constant δ ≥ 0 such that

(1 - δ) ‖z‖_2² ≤ ‖Az‖_2² ≤ (1 + δ) ‖z‖_2²   for all k-sparse z ∈ C^N.

In fact, it was shown in [6] that the convergence is achieved in a finite number n* of iterations. In the idealized situation, it can be estimated as

(1)  n* ≤ ⌈ ln( √(2/3) ‖x‖_2 / ξ ) / ln(1/ρ_{3s}) ⌉,   ρ_{3s} := √( 2δ_{3s}² / (1 - δ_{3s}²) ) < 1,   ξ := min_{j ∈ supp(x)} |x_j|.

This paper establishes that the number of iterations can be estimated independently of the shape of x: under a restricted isometry condition, it is at most proportional to the sparsity s¹; see Theorem 5, and a robust version in Theorem 6.

¹ exact arithmetic is assumed and, among several candidates for the index set of largest absolute entries, the smallest one in lexicographic order is always chosen
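To make the iteration scheme concrete, here is a minimal Python/NumPy sketch of (HTP_1)-(HTP_2) for a real measurement matrix, using the natural stopping criterion S^n = S^{n-1} discussed in Section 2.3. The function name `htp` and all variable names are ours (the authors' reference implementation is in MATLAB), ties are broken by NumPy's default sorting rather than lexicographically, and this is only a sketch, not the authors' code.

```python
import numpy as np

def htp(A, y, s, max_iter=500):
    """Minimal sketch of Hard Thresholding Pursuit for a real matrix A.

    A: (m, N) measurement matrix, y: (m,) measurement vector, s: sparsity level.
    Returns the final iterate x^n and its support set S^n.
    """
    m, N = A.shape
    x = np.zeros(N)                      # x^0 = 0
    S = np.zeros(0, dtype=int)
    for _ in range(max_iter):
        # (HTP_1): indices of the s largest absolute entries of x^{n-1} + A^T (y - A x^{n-1})
        u = x + A.T @ (y - A @ x)
        S_new = np.sort(np.argsort(np.abs(u))[-s:])
        # (HTP_2): least-squares fit of y on the columns of A indexed by S^n
        coef = np.linalg.lstsq(A[:, S_new], y, rcond=None)[0]
        x = np.zeros(N)
        x[S_new] = coef
        # natural stopping criterion: the support did not change
        if np.array_equal(S_new, S):
            break
        S = S_new
    return x, S
```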

This is reminiscent of the work of T. Zhang [5] on orthogonal matching pursuit (OMP); see also [7, Theorem 6.25], where it is proved that n* ≤ 12s provided that δ_{13s} < 1/6. However, (HTP) presents a significant drawback in that a prior estimation of the sparsity s is required to run the algorithm, while (OMP) does not (although stopping (OMP) at iteration 12s does require this estimation). We therefore consider a variation of (HTP) avoiding the prior estimation of s. We call it the graded hard thresholding pursuit (GHTP) algorithm, because the index set has a size that increases with the iteration. Precisely, starting with x^0 = 0, a sequence (x^n) of n-sparse vectors is constructed according to

(GHTP_1)  S^n := index set of n largest absolute entries of x^{n-1} + A*(y - Ax^{n-1}),
(GHTP_2)  x^n := argmin{ ‖y - Az‖_2 : supp(z) ⊆ S^n },

until a stopping criterion is met. For (GHTP), too, it is established that a restricted isometry condition implies that the number of iterations needed for robust reconstruction is at most proportional to the sparsity s; see Theorem 8. One expects the number of (GHTP) iterations needed for s-sparse recovery to be exactly equal to s, though. Such a statement is indeed proved, but in the nonuniform setting rather than in the uniform setting (i.e., when successful recovery is sought with high probability on the draw of random measurements for a fixed x ∈ C^N rather than for all x ∈ C^N simultaneously) and in two extreme cases on the shape of the s-sparse vector to be recovered. The first case deals with vectors that are flat on their support: robust nonuniform recovery is guaranteed with m ≳ s ln(N) Gaussian measurements, see Proposition 9. The second case deals with vectors that are decaying exponentially fast on their support: robust nonuniform recovery is guaranteed with m ≳ max{s, ln(N)} Gaussian measurements or with m ≳ s ln(N) Fourier measurements, see Proposition 10. This is an improvement (attenuated by the extra shape condition) on uniform recovery results based on the restricted isometry property, which require m ≳ s ln(N/s) Gaussian measurements or (currently) m ≳ s ln^4(N) Fourier measurements.

To conclude, the (HTP) and (GHTP) algorithms are tested numerically and benchmarked against the (OMP) algorithm. Their performance is roughly similar in terms of success rate, with different behaviors mostly explained by the shape of the vectors to be recovered. In terms of number of iterations, however, (HTP) does perform best, owing to the fact that an a priori knowledge of the sparsity s is used to run the algorithm (which also provides a convenient stopping criterion). It also seems that (GHTP) always requires fewer iterations than (OMP). The empirical observation that flat vectors are least favorable, coupled with the theoretical findings of Sections 3 and 5, supports the belief that uniform recovery via (HTP) is achievable in many fewer iterations than the upper bound c s of Theorem 5 and that nonuniform recovery via (GHTP) is achievable in exactly s iterations whatever the shape of the vector. Finally, the numerical part ends with an experimental study of the influence of the measurement error.
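Analogously, here is a minimal Python sketch of (GHTP_1)-(GHTP_2), in which the active set grows by one index per iteration. No prior estimate of s is needed to run it; the stopping rule used here (residual below a tolerance, capped at a maximal number of iterations) merely stands in for the criteria discussed in Section 2.3, and all names are ours rather than the authors'.

```python
import numpy as np

def ghtp(A, y, max_iter=None, tol=1e-10):
    """Minimal sketch of Graded Hard Thresholding Pursuit for a real matrix A.

    The active set S^n has size n at iteration n, so no sparsity estimate
    is required.  Stops when the residual norm drops below `tol`.
    """
    m, N = A.shape
    x = np.zeros(N)                      # x^0 = 0
    max_iter = m if max_iter is None else max_iter
    S = np.zeros(0, dtype=int)
    for n in range(1, max_iter + 1):
        # (GHTP_1): indices of the n largest absolute entries of x^{n-1} + A^T (y - A x^{n-1})
        u = x + A.T @ (y - A @ x)
        S = np.sort(np.argsort(np.abs(u))[-n:])
        # (GHTP_2): least-squares fit of y on the columns of A indexed by S^n
        coef = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
        x = np.zeros(N)
        x[S] = coef
        if np.linalg.norm(y - A @ x) <= tol:
            break
    return x, S
```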

Throughout the paper, we make use of the following notation:
- x* ∈ R^N_+ is the nonincreasing rearrangement of a vector x ∈ C^N, i.e., x*_1 ≥ x*_2 ≥ ... ≥ x*_N ≥ 0 and there exists a permutation π of {1,...,N} such that x*_j = |x_{π(j)}| for all j ∈ {1,...,N};
- S is the support or an index set of s largest absolute entries of a vector x ∈ C^N;
- x_T is the restriction of a vector x ∈ C^N to an index set T; depending on the context, it is interpreted either as a vector in C^T or as a vector in C^N;
- T̄ is the complementary set of an index set T, i.e., T̄ = {1,...,N} \ T;
- T Δ T' is the symmetric difference of the sets T and T', i.e., T Δ T' = (T \ T') ∪ (T' \ T);
- ρ_s, τ_s are quantities associated to the sth restricted isometry constant δ_s < 1 via

(2)  ρ_s = √( 2δ_s² / (1 - δ_s²) ),   τ_s = √( 2 / (1 - δ_s) ) + √(1 + δ_s) / (1 - δ_s).

2 Preliminary Results
This section collects the key facts needed in the subsequent analyses of (HTP) and (GHTP). It also includes a discussion on appropriate stopping criteria.

2.1 Extensions of previous arguments
The arguments used in [6] to prove the convergence of the sequence (x^n) produced by (HTP) — in the idealized situation, say — relied on the exponential decay of ‖x - x^n‖_2, precisely

‖x - x^n‖_2 ≤ ρ_{3s} ‖x - x^{n-1}‖_2,   n ≥ 1,

where ρ_{3s} < 1 provided δ_{3s} < 1/√3. This followed from two inequalities derived as results of (HTP_1) and (HTP_2), namely

(3)  ‖x_{S \ S^n}‖_2 ≤ √(2δ_{3s}²) ‖x - x^{n-1}‖_2,
(4)  ‖x - x^n‖_2 ≤ (1 / √(1 - δ_{2s}²)) ‖x_{S \ S^n}‖_2.

For the analyses of (HTP) and (GHTP), similar arguments are needed in the more general form below.
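As a brief aside before those arguments, the nonincreasing rearrangement x* and an index set of s largest absolute entries can be computed as follows. This is a small Python helper under our own naming; the stable sort breaks ties towards the smallest indices, which matches the lexicographic tie-breaking rule assumed in the Introduction.

```python
import numpy as np

def nonincreasing_rearrangement(x):
    """Return x*: the absolute values of x sorted in nonincreasing order."""
    return np.sort(np.abs(x))[::-1]

def largest_entries_index_set(x, s):
    """Index set of s largest absolute entries of x.

    A stable argsort of -|x| keeps the smallest indices first among equal
    absolute values, so the returned set is the lexicographically smallest one.
    """
    order = np.argsort(-np.abs(x), kind="stable")
    return np.sort(order[:s])

# small usage example
x = np.array([0.5, -2.0, 0.0, 2.0, 1.0])
print(nonincreasing_rearrangement(x))      # [2.  2.  1.  0.5 0. ]
print(largest_entries_index_set(x, 2))     # [1 3]
```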

5 . Consequence of HTP )-GHTP ). If x C N is s-sparse, if y = Ax + e for some e C m, if x C N is s -sparse, and if T is an index set of t s largest absolute entries of x + A y Ax ), then 5) x T 2 2δs+s 2 +t x x A e ) S T 2. Indeed, with S := suppx), we first notice that x + A Ax x ) + A e ) T 2 2 x + A Ax x ) + A e ) S 2 2. Eliminating the contribution on S T gives x + A Ax x ) + A e ) T \S 2 x + A Ax x ) + A e ) S\T 2. The left-hand side satisfies x + A Ax x ) + A e ) T \S 2 = A A I)x x ) + A e ) T \S 2, while the right-hand side satisfies x + A Ax x ) + A e ) S\T 2 x S\T 2 A A I)x x ) + A e ) S\T 2. Therefore, we obtain x S\T 2 A A I)x x ) + A e ) T \S 2 + A A I)x x ) + A e ) S\T 2 2 A A I)x x ) + A e ) T S 2 6) 2 A A I)x x ) ) T S A e ) T S 2 2 δ s+s +t x x A e ) T S 2, where the last step used 3.8) of [6]. The resulting inequality is the one announced in 5). 2. Consequence of HTP 2 )-GHTP 2 ). If x C N is s-sparse, if y = Ax + e for some e C m, if T is an index set of size t, and if x is a minimizer of y Az 2 subject to suppz) T, then 7) x x 2 δs+t 2 x T 2 + A e ) δ T 2. s+t Indeed, the vector x is characterized by the orthogonality condition A y Ax ) ) T = in other words, by the normal equations), therefore x x ) T 2 2 = x x ) T, x x ) T = x x A y Ax ) ) T, x x ) T I A A)x x ) ) T, x x ) T + A e ) T, x x ) T 8) I A A)x x ) ) T 2 x x ) T 2 + A e ) T 2 x x ) T 2 δ s+t x x 2 x x ) T 2 + A e ) T 2 x x ) T 2, 5

6 where the last step used 3.8) of [6]. After dividing through by x x ) T 2, we obtain It then follows that x x ) T 2 δ s+t x x 2 + A e ) T 2. x x 2 2 = x x ) T x x ) T 2 2 δ s+t x x 2 + A e ) T 2) 2 + xt 2 2, in other words P x x 2 ), where the quadratic polynomial P is defined by P z) = δ 2 s+t)z 2 2δ s+t A e ) T 2 z A e ) T 2 2 x T 2 2. This implies that x x 2 is smaller than the largest root of P, i.e., δ s+t A e ) x x T 2 + A e ) T δ2 s+t ) x T δs+t 2 + δ s+t ) A e ) T 2 + δs+t 2 ) x T 2 δs+t 2. This is the inequality announced in 7). Remark. The full restricted isometry property is not needed to derive 5) and 7). The restricted isometry constants only appear in 6) and 8), where I A A)x x ) ) U 2 is bounded for U = T S and U = T. This is easier to fulfill than the restricted isometry property if suppx), suppx ), and T are all included in a fixed index set, as will be the case in Section 5. From inequalities 5) and 7), we deduce the decay of the sequences x x n 2 for HTP) )n and x x n 2 for GHTP). )n s Corollary 2. Let x C N be s-sparse and let y = Ax + e for some e C m. If x n ) is the sequence produced by HTP), then 9) x x n 2 ρ 3s x x n 2 + τ 2s e 2 for any n. If x n ) is the sequence produced by GHTP), then ) x x n 2 ρ s+2n x x n 2 + τ s+n e 2 for any n s. Proof. We consider HTP) first. Applying 5) to x = x n and T = S n gives x S n 2 2δ3s 2 x xn A e) S S n 2 and applying 7) to T = S n and x = x n gives x x n 2 δ2s 2 x S n 2 + A e) S n 2. δ 2s 6

7 Combining these two inequalities, while using δ 2s δ 3s, yields ) x x n 2δ3s 2 2 δ3s 2 x x n δ2s 2 + A e) S S n 2. δ 2s In view of A e) S S n 2 + δ 2s e 2 see e.g. 3.9) in [6]), we recognize 9) with the values of ρ 3s and τ 2s given in 2). Let us now consider GHTP). Applying 5) to x = x n and T = S n for n s gives x S n 2 2δs+2n 2 x xn A e) S S n 2 and applying 7) to T = S n and x = x n gives x x n 2 δs+n 2 x S n 2 + A e) S n 2. δ s+n Combining these two inequalities, while using δ s+n δ s+2n, yields ) x x n 2δs+2n 2 2 δs+2n 2 x x n δs+n 2 + A e) S S n 2. δ s+n In view of A e) S S n 2 + δ s+n e 2, we recognize ) with the values of ρ s+2n and τ s+n given in 2). 2.2 Additional arguments While the above argument probed only the closeness of x n to x, the following lemmas examine the indices of nonzero entries of x that are correctly captured in the support sets produced by HTP) and GHTP). These lemmas are the main addition to the earlier proof techniques. They show how many iterations are necessary to increase the number of correct indices by a specified amount. The first lemma concerns HTP). Lemma 3. Let x C N be s-sparse and let S n ) be the sequence of index sets produced by HTP) with y = Ax + e for some e C m. For integers n, p, suppose that S n contains the indices of p largest absolute entries of x. Then, for integers k, q, S n+k contains the indices of p + q largest absolute entries of x, provided ) x p+q > ρ k 3s x {p+,...,s} 2 + κ 3s e 2, where the constant κ 3s depends only on δ 3s. For instance, δ 3s < /3 yields κ 3s

8 Proof. Let π be the permutation of {, 2,..., N} for which x πj) = x j for all j {, 2,..., N}. Our hypothesis is π{,..., p}) S n and our aim is to prove that π{,..., p + q}) S n+k. In other words, we aim at proving that the x n+k +A y Ax n+k )) πj) for j {,..., p+q} are among the s largest values of x n+k + A y Ax n+k )) i for i {,..., N}. With suppx) S, cards) = s, it is the enough to show that 2) min x n+k + A y Ax n+k )) πj) > max x n+k + A y Ax n+k )) l. j {,...,p+q} l S We notice that, for j {,..., p + q} and l S, x n+k + A y Ax n+k )) πj) x πj) x + x n+k + A y Ax n+k )) πj) x p+q A A I)x x n+k ) + A e ), πj) x n+k + A y Ax n+k )) l = x + x n+k + A y Ax n+k )) l = A A I)x x n+k ) + A e ). l Therefore, 2) will be shown as soon as it is proved that, for all j {,..., p + q} and l S, 3) x p+q > A A I)x x n+k ) + A e ) + A A I)x x n+k ) + A e ). πj) l The right-hand side can be bounded by 2 I A A)x x n+k ) + A e ) {πj),l} 2 2 I A A)x x n+k ) ) {πj),l} A e ) {πj),l} 2 δ 2s+2 x x n+k δ 2 e 2 2 δ 3s ρ k 3s x xn 2 + τ ) 2s e δ 2s ) e 2, ρ 3s where 9) was used k times in the last step. Using 7) and the assumption that π{,..., p}) S n, we derive 2 I A A)x x n+k ) + A e ) {πj),l} 2 δ 3s ρ k 3s ρ k 3s x π{,...,p}) 2 + δ 2 2s x S n 2 + ρ k 3s x {p+,...,s} 2 + κ 3s e 2, 2 2 δ3s ρ k 3s + δs + δ 2s where the constant κ 3s is determined by κ 3s = ) 2 A δ3s τ 2s e) S n ) 2 + δ 2s ) e 2 δ 2s ρ 3s 2 δ3s τ 2s + ) 2 + δ 2s ) e 2 ρ 3s 2 + δ3s ) δ 3s + 2 δ3s τ 3s ρ 3s. Therefore, 3) is proved as soon as condition ) holds. The proof is completed by noticing that δ 3s /3 yields ρ 3s /2, τ 3s 2 3, and finally κ 3s 4/

For (GHTP), an analog of Lemma 3 is needed in the form below.

Lemma 4. Let x ∈ C^N be s-sparse and let (S^n) be the sequence of index sets produced by (GHTP) with y = Ax + e for some e ∈ C^m. For integers n ≥ s and p ≥ 0, suppose that S^n contains the indices of p largest absolute entries of x. Then, for integers k ≥ 1, q ≥ 1, S^{n+k} contains the indices of p + q largest absolute entries of x, provided

(14)  x*_{p+q} > ρ^k_{s+2n+2k} ‖x*_{{p+1,...,s}}‖_2 + κ_{s+2n+2k} ‖e‖_2,

where the constant κ_{s+2n+2k} depends only on δ_{s+2n+2k}.

Proof. Given the permutation π of {1, 2, ..., N} for which |x_{π(j)}| = x*_j for all j ∈ {1,...,N}, our hypothesis is π({1,...,p}) ⊆ S^n and our aim is to prove that π({1,...,p+q}) ⊆ S^{n+k}. For this purpose, we aim at proving that the values |(x^{n+k-1} + A*(y - Ax^{n+k-1}))_{π(j)}| for j ∈ {1,...,p+q} are among the n + k largest values of |(x^{n+k-1} + A*(y - Ax^{n+k-1}))_i|, i ∈ {1,...,N}. Since n + k ≥ s, it is enough to show that they are among the s largest values of |(x^{n+k-1} + A*(y - Ax^{n+k-1}))_i|, i ∈ {1,...,N}. The rest of the proof duplicates the proof of Lemma 3, modulo the change imposed by (10) for the indices of δ and κ.

2.3 Stopping the (HTP) and (GHTP) algorithms
The stopping criterion, which is an integral part of the algorithm, has not been addressed yet. We take a few moments to discuss this issue here. For (HTP), the natural stopping criterion² is S^n = S^{n-1}, since the algorithm always produces the same vector afterwards. Moreover, under the condition δ_{3s} < 1/√3,

1. if S^n = S^{n-1} occurs, then robustness is guaranteed, since x^n = x^{n-1} substituted in (9) yields ‖x - x^n‖_2 ≤ ρ_{3s} ‖x - x^n‖_2 + τ_{2s} ‖e‖_2, i.e., ‖x - x^n‖_2 ≤ τ_{2s}/(1 - ρ_{3s}) ‖e‖_2;

2. and S^n = S^{n-1} does occur provided ‖e‖_2 < x*_s / (4[τ_{2s}/(1 - ρ_{3s})]). Indeed, for ‖e‖_2 > 0 ³ and for n large enough to have ρ^n_{3s} ‖x‖_2 ≤ [τ_{2s}/(1 - ρ_{3s})] ‖e‖_2, the inequality (9) applied n times implies that ‖x - x^n‖_2 ≤ ρ^n_{3s} ‖x‖_2 + [τ_{2s}/(1 - ρ_{3s})] ‖e‖_2 ≤ 2[τ_{2s}/(1 - ρ_{3s})] ‖e‖_2. In turn, for j ∈ supp(x) and l ∉ supp(x), the inequalities

|x^n_j| ≥ |x_j| - |(x - x^n)_j| ≥ x*_s - 2[τ_{2s}/(1 - ρ_{3s})] ‖e‖_2 > 2[τ_{2s}/(1 - ρ_{3s})] ‖e‖_2,
|x^n_l| = |(x - x^n)_l| ≤ 2[τ_{2s}/(1 - ρ_{3s})] ‖e‖_2

show that S^n = supp(x). The same reasoning gives S^{n+1} = supp(x), hence S^{n+1} = S^n.

² again, exact arithmetic is assumed and the lexicographic rule to break ties applies
³ in the case e = 0, we have x^n = x for some n (see (1)), hence x^{n+1} = x, and in particular S^{n+1} = S^n

For (GHTP), a stopping criterion not requiring any estimation of s or ‖e‖_2 seems to be lost. It is worth recalling that iterative algorithms such as CoSaMP [2], IHT [2], or HTP [6] typically involve an estimation of s, and that optimization methods such as quadratically constrained ℓ1-minimization typically involve an estimation of ‖e‖_2, unless the measurement matrix satisfies the ℓ1-quotient property, see [4]. Here, we settle for a stopping criterion requiring an estimation of either one of s or ‖e‖_2: precisely, (GHTP) is stopped either when n = 4s or when ‖y - Ax^n‖_2 ≤ d' ‖e‖_2 for some constant d' (d' = 2.83, say). Indeed, under the condition δ_{9s} ≤ 1/3, robustness is guaranteed from Theorem 8 by the fact that ‖x - x^{n̄}‖_2 ≤ d̄ ‖e‖_2 for some n̄ ≤ 4s and some constant d̄. Thus, in the alternative n = 4s,

‖x - x^n‖_2 ≤ ρ^{4s-n̄}_{9s} ‖x - x^{n̄}‖_2 + τ_{9s}/(1 - ρ_{9s}) ‖e‖_2 ≤ ( d̄ + τ_{9s}/(1 - ρ_{9s}) ) ‖e‖_2 ≤ d ‖e‖_2,

and in the alternative ‖y - Ax^n‖_2 ≤ d' ‖e‖_2 (which is guaranteed to occur for some n ≤ n̄ by virtue of ‖y - Ax^{n̄}‖_2 ≤ √(1 + δ_{9s}) ‖x - x^{n̄}‖_2 + ‖e‖_2 ≤ (√(1 + δ_{9s}) d̄ + 1) ‖e‖_2),

‖x - x^n‖_2 ≤ (1/√(1 - δ_{9s})) ‖A(x - x^n)‖_2 ≤ (1/√(1 - δ_{9s})) ( ‖y - Ax^n‖_2 + ‖e‖_2 ) ≤ ((d' + 1)/√(1 - δ_{9s})) ‖e‖_2 ≤ d ‖e‖_2,

where d := max{ d̄ + τ_{9s}/(1 - ρ_{9s}), (d' + 1)/√(1 - δ_{9s}) }. We point out that, according to the result of [5], a similar stopping criterion involving either one of s or ‖e‖_2 can be used for (OMP). This makes the resemblance between (GHTP) and (OMP) even more compelling. Their empirical performances will be compared in Section 6.

3 Uniform Recovery via Hard Thresholding Pursuit
This section is dedicated to the main results about (HTP), namely that sparse recovery is achieved in a number of iterations at most proportional to the sparsity level, independently of the shape of the vector to be recovered. The results are uniform in the sense that the same measurement matrix allows for the recovery of all s-sparse vectors simultaneously. For clarity of exposition, we first state and prove a result in the idealized situation only; the realistic situation is considered thereafter.

Theorem 5. If the restricted isometry constant of the matrix A ∈ C^{m×N} obeys δ_{3s} ≤ 1/√5, then every s-sparse vector x ∈ C^N is recovered from the measurement vector y = Ax ∈ C^m via a number n* of iterations of (HTP) satisfying n* ≤ c s. The constant c ≤ 3 depends only on δ_{3s}. For instance, δ_{3s} ≤ 1/3 yields c ≤ 2.

11 Proof. Let π be the permutation of {, 2,..., N} for which x πj) = x j for all j {, 2,..., N}. Our goal is to prove that suppx) S n for n 3s, hence HTP 2 ) ensures that Ax n = Ax and x n = x follows from δ 3s <. For this purpose, we shall consider a partition Q Q 2... Q r, r s, of suppx) = π{,..., s}). Lemma 3 enables to argue repeatedly that, if Q... Q i has been correctly captured, then a suitable number k i of additional HTP) iterations captures Q... Q i Q i. The remaining step is just the estimation of k k r. Let us set q = and let us define the sets Q,..., Q r inductively by 5) Q i = π{q i +,..., q i }), q i := maximum index q i + such that x q i > x q 2 i +. We notice that x q i + x q i + / 2 for all i {,..., r }. With the artificial introduction of Q = and k =, we now prove by induction on i {,..., r} that 6) Q Q... Q i S k +k +...+k i, where k,..., k r are defined, with ρ := ρ 3s <, as ln 2cardQi ) + cardq i+ )/2 + + cardq r )/2 r i ) ) 7) k i := ln/ρ 2 ) For i =, the result holds trivially. Next, if 6) holds for i, i {,..., r}, Lemma 3 with e = ) guarantees that 6) holds for i, provided 8) x q i ) 2 > ρ 2k i x Qi x Qi x Qr 2 2). In view of x q i ) 2 > x q i + )2 /2 and of x Qi x Qi x Qr 2 2 x q i +) 2 cardq i ) + x q i +) 2 cardq i+ ) + + x q r +) 2 cardq r ) x q i +) 2 cardq i ) + 2 cardq i+) + + ) 2 r i cardq r), we verify that condition 8) is fulfilled thanks to the definition 7) of k i. This concludes the inductive proof. We derive that the support π{,..., s}) of x is recovered in a number of iterations at most r r n = k i + ln 2cardQ i ) + cardq i+ )/2 + + cardq r )/2 r i ) ) ) ln/ρ 2 ) i= i= = r + r ln/ρ 2 ) r i= r ln 2cardQ i ) + cardq i+ )/2 + + cardq r )/2 r i ) ). Using the concavity of the logarithm, we obtain r ln/ρ 2 ) n r 2 ln cardqi ) + cardq i+ )/2 + + cardq r )/2 r i)) r r i= 2 = ln cardq ) + + /2)cardQ 2 ) /2 + + /2 r )cardq r ) )) ln r 4 r cardq ) + cardq 2 ) + + cardq r ) )) ) 4s = ln. r.

12 Since ln4x)/x ln4) for x, we deduce that and we subsequently obtain n r + ln/ρ 2 ) n r s ln 4s/r) s/r ln4), ln4) ln/ρ 2 ) s + ln4) ) ln/ρ 2 s = ln4/ρ2 ) ) ln/ρ 2 ) s. This is the desired result with c = ln2 δ3s 2 )/δ2 3s )/ ln δ2 3s )/2δ2 3s )). We observe that δ 3s / 5 yields c ln8)/ ln2) = 3 and that δ 3s /3 yields c ln6)/ ln4) = 2. The proof is complete. The approach above does not allow for the reduction of the number of iterations below c s. Indeed, if the s-sparse vector x C N has nonzero entries x j = µ j for µ, /2), then each set Q i consists of one element, so there are s of these sets, and n = s i= k i s. However, such a vector whose nonincreasing rearrangement decays exponentially fast is highly compressible applying Theorem 6 turns out to be more appropriate see Remark 7). For other types of sparse vectors, the number of iterations can be significantly lowered. We give two examples below. In each of them, the partition involves a single set Q. This corresponds to invoking ). Sparse vectors whose nonincreasing rearrangement decays slowly. For s-sparse vectors x C N satisfying x j = /jα for all j {,..., s} and for some α flat vectors are included with the choice α = ), the number of iterations is at most proportional to lns). Indeed, we have and it follows from ) that x 2 2 x s) 2 = + + /s2α /s 2α s /s 2α = s2α+, n ln 2/3 s α+/2 ) c δ3s,α lns). ln/ρ 3s ) Gaussian vectors. In numerical experiments, the nonzero entries of an s-sparse vector g C N are often taken as independent standard Gaussian random variables g j, j S. In this case, with probability at least ε, the vector g is recovered from y = Ag with a number of HTP) iterations at most proportional to lns/ε). Indeed, we establish below that 9) 2) π P gs < 8 g 2 2 > 3π 2 P ) ε ε s 2, ) s ε ε 2. Thus, with failure probability at most ε, we have gs π/8 ε/s and g 2 3π/2 s/ε. It follows from ) that the number of iterations necessary to recover g satisfies ln2s/ε) 3/2 ) 3 ln2s/ε) n = c δ3s lns/ε). ln/ρ 3s ) 2 ln/ρ 3s ) 2

13 To obtain 9), with g denoting a standard Gaussian random variable and with t := π/8 ε/s, we write t Pgs exp u 2 /2) 2 < t) = P g j < t for some j S) s P g < t) = s du s 2π π t = ε/2. To obtain 2), with u, /4) to be chosen later and with t := 3π/2)s/ε, we write P g 2 2 > t) = P E exp u )) g 2 )) j E exp ugj 2 gj 2 j S j S 2u) /2 ) s > t = =. exput) exput) exput) j S Using 2u) /2 = + 2u/ 2u)) /2 + 4u) /2 exp2u) for u, /4), we derive )) 3π P g 2 2 > t) exp2us ut) = exp us 2ε 2 = ε 2, where we have chosen us = ln2/ε)/3π/2ε) 2), so that u us < /4 when ε < /2. We now state the extension of Theorem 5 to the realistic situation. We need to assume that the measurement error is not too large compared with the smallest nonzero absolute entry of the sparse vector such a condition quite common in the literature, see e.g. [, Theorem 3.2] or [9, Theorem 4.]) is further discussed in Subsection 6.3. Theorem 6. If the restricted isometry constant of the matrix A C m N obeys δ 3s 3, then the sequence x n ) produced by HTP) with y = Ax + e C m for some s-sparse x C N and some e C m with e 2 γ x s satisfies t x x n 2 d e 2, n c s. The constants c 3, d 2.45, and γ.79 depend only on δ 3s. Proof. Let π be the permutation of {, 2,..., N} for which x πj) = x j for all j {, 2,..., N}. As before, we partition suppx) = π{,..., s}) as Q Q 2... Q r, r s, where the Q i are defined as in 5). With Q = and k =, we prove by induction on i {,..., r} that Q Q... Q i S k +k +...+k i, where k,..., k r are defined, with ρ := ρ 3s <, by ln 6cardQi ) + cardq i+ )/2 + + cardq r )/2 r i ) ) 2) k i := ln/ρ 2 ) The induction hypothesis holds trivially for i =. Next, if it holds for i, i {,..., r}, then Lemma 3 guarantees that it holds for i, provided 22) x q i > ρ k i x Qi x Q i x Q r κ 3s e 2. 3.

14 With the choice γ := 2 2 )/4κ 3s ), we have κ 3s e 2 κ 3s γx s / 2 /4)x q i +. Moreover, in view of x q i > x q i + / 2 and of x Qi x Q i x Q r 2 2 x q i + cardq i ) + 2 cardq i+) r i cardq r), we verify that 22) is fulfilled thanks to the definition 2) of k i. This concludes the induction. In particular, the support π{,..., s}) of x is included in S n, where n can be estimated using similar arguments as before) by n = r i= k i c s, c := ln6/ρ2 ) ln/ρ 2 ). It follows from HTP 2 ) that y Ax n 2 y Ax 2 = e 2. In turn, we derive that x x n 2 δ n Ax x n ) 2 ) 2 y Ax n 2 + e 2 e 2 =: d e 2. δcs δcs We note that δ 3s /3 yields c 3, then d 6, and finally γ 4 2) 3/56. Remark 7. When the vector x is not exactly s-sparse, we denote by S an index set of its s largest absolute entries. Writing y = Ax + e as y = Ax S + e with e := Ax S + e, Theorem 6 guarantees that, under the restricted isometry condition δ 3s /3, x x n 2 x S 2 + x S x n 2 x S 2 + d e 2, n c s, provided e 2 γ x s. Let us now assume that e = for simplicity. With S, S 2,... denoting index sets of s largest absolute entries of x S, of next s largest absolute entries of x S, etc., we notice that e 2 = Ax S 2 Ax Sk 2 + δ s x Sk 2. k In particular, vectors whose nonincreasing rearrangement decays as x j = µj for µ, /2) satisfy Ax S 2 c δs µ s+, so that e 2 γ x s holds for small enough µ. Taking x S 2 c µ s+ into account, too, the reconstruction bound becomes x x n 2 c δ s µ s+. As a matter of fact, the same bound is valid if s is replaced by a smaller value t even t = ), so the reconstruction error becomes quite satisfactory after a relatively small number c t of iterations. k 4 Uniform Recovery via Graded Hard Thresholding Pursuit This section contains the main result about GHTP) in the uniform setting, namely the fact that the reconstruction of all s-sparse vectors is achieved under a restricted isometry condition in a number of iterations at most proportional to s. This is comparable to the main result about HTP), but we recall the added benefits that GHTP) does not need an estimation of 4

15 the sparsity to run and that the stopping criterion involves an estimation of either one of the sparsity or the measurement error the only other such algorithm we are aware of is OMP). In passing, we note that in both uniform and nonuniform settings) the number of GHTP) iterations for s-sparse recovery cannot be less than s, since t < s iterations only produce t- sparse vectors. We prove the result directly in the realistic situation and we invite the reader to weaken the restricted isometry condition in the idealized situation. Theorem 8. If the restricted isometry constant of the matrix A C m N obeys δ 9s 3, then the sequence x n ) produced by GHTP) with y = Ax + e C m for some s-sparse x C N and some e C m with e 2 γ x s satisfies x x n 2 d e 2, n c s. The constants c 4, d 2.45, and γ.79 depend only on δ 9s. Proof. Let π be the permutation of {, 2,..., N} for which x πj) = x j for all j {, 2,..., N}. Our goal is to show that suppx) = π{,..., s}) S n for some n 4s. The arguments are very similar to the ones used in the proof of Theorem 5, except for the notable difference that the first s iterations are ignored. We still partition π{,..., s}) as Q Q 2... Q r, r s, where the Q i are defined as in 5), and we prove by induction on i {,..., r} that 23) Q Q... Q i S s+k +k +...+k i, where Q =, k =, and k,..., k r are defined, with ρ := ρ 9s /2, by ln 6cardQi ) + cardq i+ )/2 + + cardq r )/2 r i ) ) 24) k i := ln/ρ 2 ) Based on arguments already used, we observe that r k i ln4/ρ2 ) ln/ρ 2 ) s 3s. i= We now note that 23) holds trivially for i =. Next, if 23) holds for i, i {,..., r}, then Lemma 4 guarantees that 23) holds for i, provided 25) x q i > ρ k i x Qi x Q i x Q r κ 9s e 2, since ρ s+2s+k +...+k i )+2k i ρ 9s. With γ := 2 2 )/4κ 9s ) 4 2) 3/56, we verify that 25) is fulfilled thanks to the definition 24) of k i. This concludes the induction. In particular, the support π{,..., s}) of x is included in S n with n = s + k k r 4s. We derive that y Ax n 2 y Ax 2 = e 2 from HTP 2 ), and then x x n 2 δ n Ax x n ) 2 which is the desired result with d = 2/ δ 9s 6. ) 2 y Ax n 2 + e 2 e 2, δ4s δ9s. 5

16 5 Nonuniform Recovery via Graded Hard Thresholding Pursuit In this section, we demonstrate that s-sparse recovery via exactly s iterations of GHTP) can be expected in the nonuniform setting, i.e., given an s-sparse vector x C N, a random measurement matrix allows for the recovery of this specific x with high probability. Precisely, we obtain results in two extreme cases: when the nonincreasing rearrangement of the sparse vector does not decay much and when it decays exponentially fast. Proposition 9 covers the first case, suspected to be the least favorable one see Subsection 6.2. The result and its proof have similarities with the work of [3], which was extended to incorporate measurement error in [9]. The result is stated directly in this realistic situation, and it is worth pointing out its validity for all measurement errors e C m simultaneously. Proposition 9. Let λ and let x C N be an s-sparse vector such that x λ x s. If A R m N is a normalized) Gaussian random matrix with m Cs lnn), then it occurs with probability at least 2N c that, for all e C m with e 2 γ x s, the sequences S n ) and x n ) produced by GHTP) with y = Ax + e satisfy, at iteration s, S s = suppx) and x x s 2 d e 2. The constants γ and d depend only on λ, while the constant C depends on γ and c. Proof. The result is in fact valid if normalized) Gaussian random matrices are replaced by matrices with independent subgaussian entries with mean zero and variance /m. One part of the argument relies on the fact that, for a fixed index set S of size s, 26) P A SA S I 2 2 > δ) 2 exp c δ 2 m) provided m C s/δ 2, with c, C depending only on the subgaussian distribution see e.g. [7, Theorem 9.9], which leads to the improvement [7, Theorem 9.] of [, Theorem 5.2] in terms of dependence of constants on δ). Another part of the argument relies on the fact that, for a vector v C N and an index l {,..., N}, with a l denoting the lth column of the matrix A, 27) P a l, v > t v 2 ) 4 exp c t 2 m ) for a constant c depending only on the subgaussian distribution see e.g. [7, Theorem 7.27] in the case v R N ). With S := suppx), let us now define, for n {,..., s}, two random variables χ n and ζ n as χ n := [ x n + A y Ax n )) S ] n, ζ n := [ x n + A y Ax n )) S ], 6

17 i.e., χ n is the nth largest value of x n + A Ax x n )) j when j runs through S and ζ n is the largest value of x n + A Ax x n )) l when l runs through S. We are going to prove that, with high probability, S n S for all n {,..., s}, which follows from χ n > ζ n for all n {,..., s}. The failure probability of this event is P := P n {,..., s} : ζ n χ n and χ n > ζ n,..., χ > ζ )) ) 28) P A S {l} A S {l} I 2 2 > δ for some l S s ) 29) + P ζ n χ n, χ n > ζ n,..., χ > ζ ), A S {l} A S {l} I 2 2 δ for all l S), n= where the constant δ = δ λ is to be chosen later. According to 26), the term in 28) is bounded by 2N s) exp c δ 2 m) since a proper choice of C depending on λ through δ) guarantees that m C s + )/δ 2. Let us turn to the term in 29) and let us use the shorthand P E) to denote the probability of an event E intersected with the events χ n > ζ n,..., χ > ζ ) and A S {l} A S {l} I 2 2 δ for all l S), which are assumed to hold below. On the one hand, with T s n+ S representing an index set corresponding to s n + smallest values of x n + A y Ax n )) j when j runs through S, we notice that χ n s n + x n + A y Ax n )) T s n+ 2 s n + xt s n+ 2 A A I)x x n )) T s n+ 2 A e) T s n+ 2 ). Then, since the condition χ n > ζ n yields S n S, we obtain A A I)x x n )) T s n+ 2 A SA S I)x x n ) 2 δ x x n 2. We also use the inequality A e) T s n+ 2 + δ e 2 to derive χ n x T s n+ 2 δ x x n 2 ) + δ e 2. s n + On the other hand, since S n S, we have It follows that P ζ n χ n ) ζ n = max l S A y Ax n )) l max l S max a l, Ax x n ) + + δ e 2. l S P max a l, Ax x n ) l S P max a l, Ax x n ) δ ) x x n 2, l S s A Ax x n )) l + max A e) l l S ) xt s n+ 2 δ x x n ) δ e 2 s n + 7

18 where the last step will be justified by the inequality s n + x T s n δ e 2 2δ s n + x x n 2. In order to fulfill this inequality, we remark that x T s n+ 2 s n + x s, that e 2 γ x s, and, thanks to the extension of 7) mentioned in Remark, that x x n 2 x δ 2 S n 2 + s n + + δ δ A e) S n 2 x δ 2 + δ e 2 ) λ + δ γ s n + δ 2 x s + x s, δ so that it is enough to choose δ then γ depending on λ) small enough to have 2 ) λ + δ γ + δ γ 2δ +. δ 2 δ Taking Ax x n ) 2 + δ x x n 2 into account, we obtain 3) P ζ n χ n ) P max a l, Ax x n ) l S ) δ Ax x n ) 2. + δ s Let us now remark that the condition χ = [A A S x + A e) S ] > ζ = [A A S x + A e) S ] implies that the random set S corresponding to the largest value of A S A Sx + A S e) depends only on the random submatrix A S, so that the random vector x = A y depends only on A S S, too; then, since x is supported on S S, the condition χ 2 = [x + A A S x x ) + A e) S ] 2 > ζ 2 = [x + A A S x x ) + A e) S ] implies that the random set S2 corresponding to the two largest values of x S + A S A Sx x ) + A S e) depends only on the random submatrix A S, so that the random vector x 2 = A y depends only on A S 2 S, too; and so on until we infer that the random vector x n, supported on S n S, depends only on A S, and in turn so does Ax x n ). Exploiting the independence of Ax x n ) and of the columns a l for l S, the estimate 27) can be applied in 3) to yield P ζ n χ n ) l S P a l, Ax x n ) 4N s) exp c δ 2 ) m. + δ)s ) δ Ax x n ) 2 + δ s Altogether, the failure probability P satisfies P 2N s) exp c δ 2 m) + 4sN s) exp c δ 2 ) m + δ)s 2N ) exp c δ 2 m) + N 2 exp c δ 2 ) m. + δ)s 8

19 In particular, we have χ s > ζ s, and in turn S s = S, with failure probability at most P. Then, since A S A S I 2 2 δ with failure probability given in 26), we also have x x s 2 δ Ax x s ) 2 δ y Ax s 2 + y Ax 2 ) 2 δ e 2, which is the desired result with d = 2/ δ. The resulting failure probability is bounded by 2N exp c δ 2 m) + N 2 exp c δ 2 ) m 2N 2 exp + δ)s c δ m s ) 2N c, where the last inequality holds with a proper choice of the constant C in m Cs lnn). The previous result is limited by its nonapplicability to random partial Fourier matrices. For such matrices, the uniform results of the two previous sections or any recovery result based on the restricted isometry property do apply, but the restricted isometry property is only known to be fulfilled with a number of measurements satisfying m cs ln 4 N). This threshold can be lowered in the nonuniform setting by reducing the power of the logarithm. This is typically achieved via l -minimization see [4, Theorem.3], [3, Theorem.], or [8, V 3]), for which the successful recovery of a vector does not depend on its shape but only on its sign pattern. We establish below a comparable result for GHTP): fewer measurements guarantee the high probability of nonuniform s-sparse recovery via exactly s iterations of GHTP). There is an additional proviso on the vector shape, namely its nonincreasing rearrangement should decay exponentially fast. This is admittedly a strong assumption, but possible improvements in the proof strategy perhaps incorporating ideas from Proposition 9) suggest that the result could be extended beyond this class of sparse vectors for Fourier measurements. Incidentally, for this class of vectors, it is striking that the number of Gaussian measurements can be reduced even below the classical reduction see [5, Theorems.2 and.3]) of the constant c in m cs lnn/s) when passing from the uniform to the nonuniform setting. Proposition. Let µ, ) and let x C N be an s-sparse vector such that x j+ µ x j for all j {,..., s }. If A R m N is a normalized) Gaussian random matrix with m max{cs, C lnn)}, or if A C m N is a normalized) random partial Fourier matrix with m C s lnn), then it occurs with probability at least 2N c that, for all e C m with e 2 γ x s, the sequences S n ) and x n ) produced by GHTP) with y = Ax + e satisfy, at iteration s, S s = suppx) and x x s 2 d e 2. The constants C, γ, and d depend only on µ, while the constants C and C depend on µ and c. 9

20 Proof. The result would also be valid if normalized) Gaussian random matrices were replaced by normalized) subgaussian random matrices and if normalized) random partial Fourier matrices were replaced by normalized) random sampling matrices associated to bounded orthonormal systems. We recall see [7, chapter 2] for details) that a bounded orthonormal system is a system φ,..., φ N ) of functions, defined on a set D endowed with a probability measure ν, that satisfy, for some constant K, 3) sup φ j t) K and φ k t)φ l t)dνt) = δ k,l t D for all j, k, l {,..., N}. The normalized random sampling matrix A C m N associated to φ,..., φ N ) has entries A k,l = m φ l t k ), k {,..., m}, l {,..., N}, D where the sampling points t,..., t m D are selected independently at random according to the probability measure ν. For instance, the random partial Fourier matrix corresponds to the choices D = {,..., N}, νt ) = cardt )/N for all T {,..., N}, and φ k t) = expi2πkt), in which case K =. The argument here relies on the fact see [7, Theorem 2.2]) that, for a fixed index set S of size s, ) 32) P A SA S I 2 2 > δ) 2s exp 3δ2 m 8K 2. s Let π be the permutation of {, 2,..., N} for which x πj) = x j for all j {, 2,..., N} and let Π n denote the set π{, 2,..., n}) for each n {,..., N}. Let δ = δ µ > be chosen small enough to have 2δ/ δ 2 µ) µ 2 /3 for reason that will become apparent later. Invoking 26) and 32) for the sets Π s {k}, k Π s, ensures that 33) A Π s {k} A Π s {k} I 2 2 δ. In the Gausian case, provided m C µ s, this holds with failure probability at most 2N s) exp c µ m) 2N exp c µ C µ,c lnn)) 2N c where C µ,c is chosen large enough. In the Fourier case, this holds with failure probability at most 2N s)s exp c ) µm 2N 2 exp c µ C µ,c lnn)) 2N c s where C µ,c is chosen large enough. We are going to prove by induction that S n = Π n for all n {,,..., s}. The result holds trivially for n =. To increment the induction from n to n, n {,..., s}, we need to show that min j=,...,n xn + A y Ax n )) πj) > max l=n+,...,n xn + A y Ax n )) πl). 2

21 We observe that for j {,..., n} and l {n +,..., N}, x n + A y Ax n )) πj) x πj) A A I)x x n ) + A e) πj) x n A A I)x x n )) πj) A e) πj), x n + A y Ax n )) πl) x πl) + A A I)x x n ) + A e) πl) x n+ + A A I)x x n )) πl) + A e) πl). Hence, it is enough to prove that, for all j {,..., n} and l {n +,..., N}, 34) x n > x n++ A A I)x x n )) πj) + A e) πj) + A A I)x x n )) πl) + A e) πl). For any k {,..., N}, using the the induction hypothesis and 33), we obtain 35) A A I)x x n )) k A Π s {k} A Π s {k} I)x x n ) 2 δ x x n 2. Moreover, in view of the extension of 7) mentioned in Remark, we have 36) x x n 2 Combining 35) and 36) gives, for any k {,..., N}, A A I)x x n )) k + A e) k x δ 2 Π n 2 + δ A e) Πn 2. δ x δ 2 Π n 2 + δ δ A e) Πn 2 + A e) k δ δ 2 µ 2 x n + δ A e) Πs {k} 2, where we have used x Πn 2 2 x n) 2 + µ µ 2 ) s n ) x n) 2 / µ 2 ) in the last step. Using also the estimate imposed by 33) for the maximal singular value of A Πs {k} in A e) Πs {k} 2 + δ e 2 + δ γ x s + δ γ x n, we arrive at ) A A I)x x n )) k + A δ + δ γ e) k δ 2 µ + x 2 δ n. Taking the fact that x n+ µ x n into account, we observe that 34) is fulfilled as soon as 2δ > µ + δ 2 µ δ γ. 2 δ The latter holds with the choice of δ made at the beginning and then with an appropriate choice of γ depending only on µ). At this point, we have justified inductively that S n = Π n for all n {,..., s}, hence in particular that S s = Π s = suppx). To conclude the proof, we remark that GHTP 2 ) yields y Ax s 2 y Ax 2 = e 2 and we also use the estimate imposed by 33) for the minimal singular value of A Πs to derive x x s 2 δ Ax x s ) 2 The constant d := 2/ δ depends only on µ. δ y Ax s 2 + y Ax 2 ) 2 δ e 2. 2

6 Numerical experiments
This section provides some empirical validation of our theoretical results. We focus on the performances of (HTP), (GHTP), and its natural companion (OMP). All the tests were carried out for Gaussian random matrices A ∈ R^{m×N} with m = 200 and N = 1000; several such random matrices were generated. To assess the influence of the decay in the nonincreasing rearrangement of the s-sparse vectors x ∈ R^N to be recovered, we tested flat vectors with x_j = 1 for j ∈ {1,...,s}, linear vectors with x_j = (s + 1 - j)/s for j ∈ {1,...,s}, and Gaussian vectors whose s nonzero entries are independent standard normal random variables; several such vectors were generated per matrix. All these experiments can be reproduced by downloading the associated MATLAB files from the authors' webpages. These files contain supplementary numerical tests, e.g., one involving Fourier submatrices to illustrate Proposition 10.

6.1 Frequency of successful recoveries
We present here a comparison between (HTP), (GHTP), and (OMP) in terms of successful recovery; this complements the experiments of [6], where (HTP) was compared to basis pursuit, compressive sampling matching pursuit, iterative hard thresholding, and normalized iterative hard thresholding. We have also included a comparison with a version of (HTP), denoted (HTP ), where the number of largest absolute entries kept at each iteration is 2s instead of s. The natural stopping criterion S^n = S^{n-1} was used for (HTP) and (HTP ), and we used the criterion⁴ [supp(x) ⊆ S^n or ‖x - x^n‖_2/‖x‖_2 < 10^{-4}] for (GHTP) and (OMP).

[Figure 1: Frequency of successful recoveries (percentage of success vs. sparsity) for (HTP), (GHTP), (OMP), and (HTP ) using Gaussian measurements; panels: (a) Gaussian vectors, (b) linear vectors, (c) flat vectors.]

Figure 1 suggests that no algorithm consistently outperforms the other ones, although (GHTP) is always marginally better than (OMP). Moreover, (HTP ) underperforms in two cases, possibly because theoretical requirements such as restricted isometry properties are harder to fulfill when 2s replaces s. The shape of the vector to be recovered has a strong influence on the success rate, in contrast with basis pursuit, whose success depends only on the sign pattern.

⁴ the true sparse solution is available in this experiment, unlike in a practical scenario
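For reproducibility, here is a condensed Python sketch of this experiment. The authors' own files are MATLAB; the function names, the trial count, the random seed, and the reuse of the `htp` sketch given in the Introduction are our assumptions, not part of the original study.

```python
import numpy as np

def make_sparse_vector(N, s, shape, rng):
    """s-sparse test vector on a random support: 'flat', 'linear', or 'gaussian'."""
    x = np.zeros(N)
    support = rng.choice(N, size=s, replace=False)
    if shape == "flat":
        values = np.ones(s)
    elif shape == "linear":
        values = (s + 1 - np.arange(1, s + 1)) / s    # x_j = (s + 1 - j)/s
    else:  # 'gaussian'
        values = rng.standard_normal(s)
    x[support] = values
    return x, np.sort(support)

def success_rate(m=200, N=1000, s=25, shape="flat", trials=50, seed=0):
    """Fraction of exact recoveries of s-sparse vectors via the htp sketch."""
    rng = np.random.default_rng(seed)
    successes = 0
    for _ in range(trials):
        A = rng.standard_normal((m, N)) / np.sqrt(m)  # normalized Gaussian matrix
        x, _ = make_sparse_vector(N, s, shape, rng)
        x_hat, _ = htp(A, A @ x, s)
        successes += np.linalg.norm(x - x_hat) < 1e-4 * np.linalg.norm(x)
    return successes / trials
```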

6.2 Number of iterations and indices correctly captured
We now focus on the information provided by our experiments in terms of the number of iterations needed for the recovery of a sparse vector and in terms of the number of indices from the true support correctly captured in the active sets as a function of the iteration count.

First, Figure 2 displays the maximum over all our trials of the number n(s) of iterations needed for s-sparse recovery, as a function of s, in the sparsity regions where recovery is certain (see Figure 1). With this indicator, (HTP) is the best algorithm, thanks to the fact that a prior knowledge of s is exploited. For this algorithm, no notable difference seems to result from the vector shape (but remember that it did influence the success rate), so the upper bound n(s) ≤ c s from Theorem 5 is likely to be an overestimation, to be replaced by the bound n(s) ≤ c ln(s), which is known to be valid for flat vectors. As for (GHTP) and (OMP), we observe that the vector shape influences the regions (predicted by Propositions 9 and 10) of s-values for which n(s) = s, and that (GHTP) generally requires fewer iterations than (OMP), probably thanks to the fact that incorrectly selected indices can be rejected at the next (GHTP) iteration (however, this complicates speed-ups based on QR-updates).

[Figure 2: Maximum number of iterations vs. sparsity for (HTP), (GHTP), and (OMP) using Gaussian measurements; panels: (a) Gaussian vectors, (b) linear vectors, (c) flat vectors.]

Next, Figures 3, 4, and 5 display the number i(n) of indices from the true support that are correctly captured in the active set S^n as a function of the iteration count n, as tracked by the sketch below. We point out that these figures do not reflect the worst case anymore; instead, they present an average over all successful recoveries. Once again, (HTP) is the superior algorithm, because several correct indices can be captured at once. For (GHTP) and (OMP), the ideal situation corresponds to the selection of correct indices only, i.e., i(n) = n for all n ≤ s. This seems to occur for small sparsities, with larger regions of validity for vectors whose nonincreasing rearrangements decay quickly. This is an encouraging fact, since i(n) = n has been justified for all n ≤ s in the proof of Proposition 9 for the least favorable case of flat vectors. For these vectors, Figure 5 also reveals that, even in the region where s-sparse recovery via (GHTP) and (OMP) is certain, the recovery is sometimes achieved in more than s iterations.
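The quantity i(n) can be recorded with a small modification of the (GHTP) iteration. The snippet below is our own instrumentation sketch (names and structure are ours), not the authors' plotting code.

```python
import numpy as np

def ghtp_support_history(A, y, true_support, max_iter):
    """Run the GHTP iteration and record i(n) = card(supp(x) ∩ S^n) at each n."""
    m, N = A.shape
    x = np.zeros(N)
    true_set = set(int(j) for j in true_support)
    history = []
    for n in range(1, max_iter + 1):
        u = x + A.T @ (y - A @ x)
        S = np.sort(np.argsort(np.abs(u))[-n:])
        coef = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
        x = np.zeros(N)
        x[S] = coef
        history.append(len(true_set.intersection(int(j) for j in S)))
    return history
```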

[Figure 3: Number of support indices of Gaussian vectors correctly captured by the active set as a function of the iteration count, for (HTP), (GHTP), and (OMP) using Gaussian measurements (curves for several sparsity levels s).]

[Figure 4: Number of support indices of linear vectors correctly captured by the active set as a function of the iteration count, for (HTP), (GHTP), and (OMP) using Gaussian measurements (curves for several sparsity levels s).]

[Figure 5: Number of support indices of flat vectors correctly captured by the active set as a function of the iteration count, for (HTP), (GHTP), and (OMP) using Gaussian measurements (curves for several sparsity levels s).]

Finally, Figure 6 shows the effect of overestimating the sparsity level as 2s instead of s in each hard thresholding pursuit iteration. We notice a slight improvement for small sparsities, as more correct indices can be captured at each iteration, but a degradation for larger sparsities, again possibly due to harder theoretical requirements.

[Figure 6: Number of support indices correctly captured by the active set as a function of the iteration count for (HTP ) using Gaussian measurements; panels: (a) recovery of Gaussian vectors, (b) recovery of linear vectors, (c) recovery of flat vectors.]

6.3 Robustness to measurement error
Our final experiment highlights the effect of the measurement error e ∈ C^m on the recovery via (HTP), (GHTP), and (OMP) of sparse vectors x ∈ C^N acquired through y = Ax + e ∈ C^m. For a sparsity level set to 25, with the other parameters still fixed at m = 200 and N = 1000, the test was carried out on linear and flat s-sparse vectors. For an error level η varying from 0 to 5, we generated Gaussian matrices A, then instances of a random support S and a Gaussian noise vector e normalized by ‖e‖_2 = η x*_s, and we ran the three algorithms on the resulting measurement vectors for each error level. After the algorithms exited, we averaged the number of indices from the true support correctly caught, as well as the reconstruction error measured in ℓ2-norm. Under restricted isometry conditions, the proofs of Theorems 6 and 8 guarantee that all indices are correctly caught provided η is below some threshold η*. This phenomenon is apparent in Figure 7 (abscissa: η = ‖e‖_2/x*_s; ordinate: the number of indices of S contained in the support of the output of the algorithm). The figure also shows that the threshold is approximately the same for the three algorithms and that it is higher for flat vectors. For linear vectors, the observed threshold does not differ too much from the value of γ obtained in Theorems 6 and 8. Once the threshold is passed, we also notice that (HTP) catches fewer correct indices than (GHTP) and (OMP), probably due to the fact that it produces index sets of smaller size. In terms of reconstruction error, Figure 8 (abscissa: η = ‖e‖_2/x*_s; ordinate: ‖x - x̂‖_2/x*_s, where x̂ denotes the output of the algorithm) displays transitions around the same values of η as Figure 7. Although the ratio ‖x - x̂‖_2/‖e‖_2 of reconstruction error to measurement error increases after the threshold, it does not rule out a robustness estimate of the type ‖x - x̂‖_2 ≤ d ‖e‖_2 valid for all e ∈ C^m with an absolute constant d (this is established in [6] for (HTP) under a restricted isometry condition).
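A Python sketch of one trial of this robustness experiment is given below. It reuses the hypothetical `make_sparse_vector` and `htp` helpers defined in the earlier sketches, and the noise normalization ‖e‖_2 = η x*_s follows the description above; the parameter defaults and names are ours, not the authors'.

```python
import numpy as np

def robustness_trial(m=200, N=1000, s=25, eta=0.5, shape="linear", seed=0):
    """One trial of the measurement-error experiment.

    Returns the number of true-support indices caught by the htp sketch
    when the measurements are corrupted by noise with ||e||_2 = eta * x*_s.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, N)) / np.sqrt(m)      # normalized Gaussian matrix
    x, support = make_sparse_vector(N, s, shape, rng)
    x_star_s = np.sort(np.abs(x))[::-1][s - 1]        # smallest nonzero entry x*_s
    e = rng.standard_normal(m)
    e *= eta * x_star_s / np.linalg.norm(e)           # enforce ||e||_2 = eta * x*_s
    _, S_hat = htp(A, A @ x + e, s)
    return len(set(int(j) for j in support) & set(int(j) for j in S_hat))
```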


More information

Machine Learning for Signal Processing Sparse and Overcomplete Representations. Bhiksha Raj (slides from Sourish Chaudhuri) Oct 22, 2013

Machine Learning for Signal Processing Sparse and Overcomplete Representations. Bhiksha Raj (slides from Sourish Chaudhuri) Oct 22, 2013 Machine Learning for Signal Processing Sparse and Overcomplete Representations Bhiksha Raj (slides from Sourish Chaudhuri) Oct 22, 2013 1 Key Topics in this Lecture Basics Component-based representations

More information

GREEDY SIGNAL RECOVERY REVIEW

GREEDY SIGNAL RECOVERY REVIEW GREEDY SIGNAL RECOVERY REVIEW DEANNA NEEDELL, JOEL A. TROPP, ROMAN VERSHYNIN Abstract. The two major approaches to sparse recovery are L 1-minimization and greedy methods. Recently, Needell and Vershynin

More information

arxiv: v1 [math.na] 28 Oct 2018

arxiv: v1 [math.na] 28 Oct 2018 Iterative Hard Thresholding for Low-Rank Recovery from Rank-One Projections Simon Foucart and Srinivas Subramanian Texas A&M University Abstract arxiv:80.749v [math.na] 28 Oct 208 A novel algorithm for

More information

Lecture 13 October 6, Covering Numbers and Maurey s Empirical Method

Lecture 13 October 6, Covering Numbers and Maurey s Empirical Method CS 395T: Sublinear Algorithms Fall 2016 Prof. Eric Price Lecture 13 October 6, 2016 Scribe: Kiyeon Jeon and Loc Hoang 1 Overview In the last lecture we covered the lower bound for p th moment (p > 2) and

More information

A strongly polynomial algorithm for linear systems having a binary solution

A strongly polynomial algorithm for linear systems having a binary solution A strongly polynomial algorithm for linear systems having a binary solution Sergei Chubanov Institute of Information Systems at the University of Siegen, Germany e-mail: sergei.chubanov@uni-siegen.de 7th

More information

Elaine T. Hale, Wotao Yin, Yin Zhang

Elaine T. Hale, Wotao Yin, Yin Zhang , Wotao Yin, Yin Zhang Department of Computational and Applied Mathematics Rice University McMaster University, ICCOPT II-MOPTA 2007 August 13, 2007 1 with Noise 2 3 4 1 with Noise 2 3 4 1 with Noise 2

More information

Sensing systems limited by constraints: physical size, time, cost, energy

Sensing systems limited by constraints: physical size, time, cost, energy Rebecca Willett Sensing systems limited by constraints: physical size, time, cost, energy Reduce the number of measurements needed for reconstruction Higher accuracy data subject to constraints Original

More information

Compressed sensing. Or: the equation Ax = b, revisited. Terence Tao. Mahler Lecture Series. University of California, Los Angeles

Compressed sensing. Or: the equation Ax = b, revisited. Terence Tao. Mahler Lecture Series. University of California, Los Angeles Or: the equation Ax = b, revisited University of California, Los Angeles Mahler Lecture Series Acquiring signals Many types of real-world signals (e.g. sound, images, video) can be viewed as an n-dimensional

More information

Noisy Signal Recovery via Iterative Reweighted L1-Minimization

Noisy Signal Recovery via Iterative Reweighted L1-Minimization Noisy Signal Recovery via Iterative Reweighted L1-Minimization Deanna Needell UC Davis / Stanford University Asilomar SSC, November 2009 Problem Background Setup 1 Suppose x is an unknown signal in R d.

More information

Conditions for Robust Principal Component Analysis

Conditions for Robust Principal Component Analysis Rose-Hulman Undergraduate Mathematics Journal Volume 12 Issue 2 Article 9 Conditions for Robust Principal Component Analysis Michael Hornstein Stanford University, mdhornstein@gmail.com Follow this and

More information

THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX

THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX MARK RUDELSON AND ROMAN VERSHYNIN Abstract. We prove an optimal estimate on the smallest singular value of a random subgaussian matrix, valid

More information

Sparsity Models. Tong Zhang. Rutgers University. T. Zhang (Rutgers) Sparsity Models 1 / 28

Sparsity Models. Tong Zhang. Rutgers University. T. Zhang (Rutgers) Sparsity Models 1 / 28 Sparsity Models Tong Zhang Rutgers University T. Zhang (Rutgers) Sparsity Models 1 / 28 Topics Standard sparse regression model algorithms: convex relaxation and greedy algorithm sparse recovery analysis:

More information

Geometry of log-concave Ensembles of random matrices

Geometry of log-concave Ensembles of random matrices Geometry of log-concave Ensembles of random matrices Nicole Tomczak-Jaegermann Joint work with Radosław Adamczak, Rafał Latała, Alexander Litvak, Alain Pajor Cortona, June 2011 Nicole Tomczak-Jaegermann

More information

Sparse and Low Rank Recovery via Null Space Properties

Sparse and Low Rank Recovery via Null Space Properties Sparse and Low Rank Recovery via Null Space Properties Holger Rauhut Lehrstuhl C für Mathematik (Analysis), RWTH Aachen Convexity, probability and discrete structures, a geometric viewpoint Marne-la-Vallée,

More information

An Introduction to Sparse Approximation

An Introduction to Sparse Approximation An Introduction to Sparse Approximation Anna C. Gilbert Department of Mathematics University of Michigan Basic image/signal/data compression: transform coding Approximate signals sparsely Compress images,

More information

Sparse Legendre expansions via l 1 minimization

Sparse Legendre expansions via l 1 minimization Sparse Legendre expansions via l 1 minimization Rachel Ward, Courant Institute, NYU Joint work with Holger Rauhut, Hausdorff Center for Mathematics, Bonn, Germany. June 8, 2010 Outline Sparse recovery

More information

Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit

Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit Claremont Colleges Scholarship @ Claremont CMC Faculty Publications and Research CMC Faculty Scholarship 6-5-2008 Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit

More information

Jointly Low-Rank and Bisparse Recovery: Questions and Partial Answers

Jointly Low-Rank and Bisparse Recovery: Questions and Partial Answers Jointly Low-Rank and Bisparse Recovery: Questions and Partial Answers Simon Foucart, Rémi Gribonval, Laurent Jacques, and Holger Rauhut Abstract This preprint is not a finished product. It is presently

More information

Distributed Inexact Newton-type Pursuit for Non-convex Sparse Learning

Distributed Inexact Newton-type Pursuit for Non-convex Sparse Learning Distributed Inexact Newton-type Pursuit for Non-convex Sparse Learning Bo Liu Department of Computer Science, Rutgers Univeristy Xiao-Tong Yuan BDAT Lab, Nanjing University of Information Science and Technology

More information

of Orthogonal Matching Pursuit

of Orthogonal Matching Pursuit A Sharp Restricted Isometry Constant Bound of Orthogonal Matching Pursuit Qun Mo arxiv:50.0708v [cs.it] 8 Jan 205 Abstract We shall show that if the restricted isometry constant (RIC) δ s+ (A) of the measurement

More information

Recoverabilty Conditions for Rankings Under Partial Information

Recoverabilty Conditions for Rankings Under Partial Information Recoverabilty Conditions for Rankings Under Partial Information Srikanth Jagabathula Devavrat Shah Abstract We consider the problem of exact recovery of a function, defined on the space of permutations

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Supremum and Infimum

Supremum and Infimum Supremum and Infimum UBC M0 Lecture Notes by Philip D. Loewen The Real Number System. Work hard to construct from the axioms a set R with special elements O and I, and a subset P R, and mappings A: R R

More information

Lecture 22: More On Compressed Sensing

Lecture 22: More On Compressed Sensing Lecture 22: More On Compressed Sensing Scribed by Eric Lee, Chengrun Yang, and Sebastian Ament Nov. 2, 207 Recap and Introduction Basis pursuit was the method of recovering the sparsest solution to an

More information

Sequential Monte Carlo methods for filtering of unobservable components of multidimensional diffusion Markov processes

Sequential Monte Carlo methods for filtering of unobservable components of multidimensional diffusion Markov processes Sequential Monte Carlo methods for filtering of unobservable components of multidimensional diffusion Markov processes Ellida M. Khazen * 13395 Coppermine Rd. Apartment 410 Herndon VA 20171 USA Abstract

More information

Optimization for Compressed Sensing

Optimization for Compressed Sensing Optimization for Compressed Sensing Robert J. Vanderbei 2014 March 21 Dept. of Industrial & Systems Engineering University of Florida http://www.princeton.edu/ rvdb Lasso Regression The problem is to solve

More information

Strengthened Sobolev inequalities for a random subspace of functions

Strengthened Sobolev inequalities for a random subspace of functions Strengthened Sobolev inequalities for a random subspace of functions Rachel Ward University of Texas at Austin April 2013 2 Discrete Sobolev inequalities Proposition (Sobolev inequality for discrete images)

More information

ACCORDING to Shannon s sampling theorem, an analog

ACCORDING to Shannon s sampling theorem, an analog 554 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL 59, NO 2, FEBRUARY 2011 Segmented Compressed Sampling for Analog-to-Information Conversion: Method and Performance Analysis Omid Taheri, Student Member,

More information

Pre-weighted Matching Pursuit Algorithms for Sparse Recovery

Pre-weighted Matching Pursuit Algorithms for Sparse Recovery Journal of Information & Computational Science 11:9 (214) 2933 2939 June 1, 214 Available at http://www.joics.com Pre-weighted Matching Pursuit Algorithms for Sparse Recovery Jingfei He, Guiling Sun, Jie

More information

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications Class 19: Data Representation by Design What is data representation? Let X be a data-space X M (M) F (M) X A data representation

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

Recent Developments in Compressed Sensing

Recent Developments in Compressed Sensing Recent Developments in Compressed Sensing M. Vidyasagar Distinguished Professor, IIT Hyderabad m.vidyasagar@iith.ac.in, www.iith.ac.in/ m vidyasagar/ ISL Seminar, Stanford University, 19 April 2018 Outline

More information

Introduction How it works Theory behind Compressed Sensing. Compressed Sensing. Huichao Xue. CS3750 Fall 2011

Introduction How it works Theory behind Compressed Sensing. Compressed Sensing. Huichao Xue. CS3750 Fall 2011 Compressed Sensing Huichao Xue CS3750 Fall 2011 Table of Contents Introduction From News Reports Abstract Definition How it works A review of L 1 norm The Algorithm Backgrounds for underdetermined linear

More information

Least Sparsity of p-norm based Optimization Problems with p > 1

Least Sparsity of p-norm based Optimization Problems with p > 1 Least Sparsity of p-norm based Optimization Problems with p > Jinglai Shen and Seyedahmad Mousavi Original version: July, 07; Revision: February, 08 Abstract Motivated by l p -optimization arising from

More information

An algebraic perspective on integer sparse recovery

An algebraic perspective on integer sparse recovery An algebraic perspective on integer sparse recovery Lenny Fukshansky Claremont McKenna College (joint work with Deanna Needell and Benny Sudakov) Combinatorics Seminar USC October 31, 2018 From Wikipedia:

More information

Sparse Recovery with Pre-Gaussian Random Matrices

Sparse Recovery with Pre-Gaussian Random Matrices Sparse Recovery with Pre-Gaussian Random Matrices Simon Foucart Laboratoire Jacques-Louis Lions Université Pierre et Marie Curie Paris, 75013, France Ming-Jun Lai Department of Mathematics University of

More information

Linear Convergence of Adaptively Iterative Thresholding Algorithms for Compressed Sensing

Linear Convergence of Adaptively Iterative Thresholding Algorithms for Compressed Sensing Linear Convergence of Adaptively Iterative Thresholding Algorithms for Compressed Sensing Yu Wang, Jinshan Zeng, Zhimin Peng, Xiangyu Chang, and Zongben Xu arxiv:48.689v3 [math.oc] 6 Dec 5 Abstract This

More information

Interpolation via weighted l 1 minimization

Interpolation via weighted l 1 minimization Interpolation via weighted l 1 minimization Rachel Ward University of Texas at Austin December 12, 2014 Joint work with Holger Rauhut (Aachen University) Function interpolation Given a function f : D C

More information

Randomness-in-Structured Ensembles for Compressed Sensing of Images

Randomness-in-Structured Ensembles for Compressed Sensing of Images Randomness-in-Structured Ensembles for Compressed Sensing of Images Abdolreza Abdolhosseini Moghadam Dep. of Electrical and Computer Engineering Michigan State University Email: abdolhos@msu.edu Hayder

More information

Recovering overcomplete sparse representations from structured sensing

Recovering overcomplete sparse representations from structured sensing Recovering overcomplete sparse representations from structured sensing Deanna Needell Claremont McKenna College Feb. 2015 Support: Alfred P. Sloan Foundation and NSF CAREER #1348721. Joint work with Felix

More information

Lecture: Introduction to Compressed Sensing Sparse Recovery Guarantees

Lecture: Introduction to Compressed Sensing Sparse Recovery Guarantees Lecture: Introduction to Compressed Sensing Sparse Recovery Guarantees http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html Acknowledgement: this slides is based on Prof. Emmanuel Candes and Prof. Wotao Yin

More information

Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery

Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery Jorge F. Silva and Eduardo Pavez Department of Electrical Engineering Information and Decision Systems Group Universidad

More information

Toeplitz Compressed Sensing Matrices with. Applications to Sparse Channel Estimation

Toeplitz Compressed Sensing Matrices with. Applications to Sparse Channel Estimation Toeplitz Compressed Sensing Matrices with 1 Applications to Sparse Channel Estimation Jarvis Haupt, Waheed U. Bajwa, Gil Raz, and Robert Nowak Abstract Compressed sensing CS has recently emerged as a powerful

More information

The Analysis Cosparse Model for Signals and Images

The Analysis Cosparse Model for Signals and Images The Analysis Cosparse Model for Signals and Images Raja Giryes Computer Science Department, Technion. The research leading to these results has received funding from the European Research Council under

More information

Robustly Stable Signal Recovery in Compressed Sensing with Structured Matrix Perturbation

Robustly Stable Signal Recovery in Compressed Sensing with Structured Matrix Perturbation Robustly Stable Signal Recovery in Compressed Sensing with Structured Matri Perturbation Zai Yang, Cishen Zhang, and Lihua Xie, Fellow, IEEE arxiv:.7v [cs.it] 4 Mar Abstract The sparse signal recovery

More information

Constrained optimization

Constrained optimization Constrained optimization DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Compressed sensing Convex constrained

More information

The Dirichlet Problem for Infinite Networks

The Dirichlet Problem for Infinite Networks The Dirichlet Problem for Infinite Networks Nitin Saksena Summer 2002 Abstract This paper concerns the existence and uniqueness of solutions to the Dirichlet problem for infinite networks. We formulate

More information

Low-Rank Factorization Models for Matrix Completion and Matrix Separation

Low-Rank Factorization Models for Matrix Completion and Matrix Separation for Matrix Completion and Matrix Separation Joint work with Wotao Yin, Yin Zhang and Shen Yuan IPAM, UCLA Oct. 5, 2010 Low rank minimization problems Matrix completion: find a low-rank matrix W R m n so

More information

Signal Recovery From Incomplete and Inaccurate Measurements via Regularized Orthogonal Matching Pursuit

Signal Recovery From Incomplete and Inaccurate Measurements via Regularized Orthogonal Matching Pursuit Signal Recovery From Incomplete and Inaccurate Measurements via Regularized Orthogonal Matching Pursuit Deanna Needell and Roman Vershynin Abstract We demonstrate a simple greedy algorithm that can reliably

More information

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION A Thesis by MELTEM APAYDIN Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the

More information

An asymptotic ratio characterization of input-to-state stability

An asymptotic ratio characterization of input-to-state stability 1 An asymptotic ratio characterization of input-to-state stability Daniel Liberzon and Hyungbo Shim Abstract For continuous-time nonlinear systems with inputs, we introduce the notion of an asymptotic

More information

Lecture Notes 9: Constrained Optimization

Lecture Notes 9: Constrained Optimization Optimization-based data analysis Fall 017 Lecture Notes 9: Constrained Optimization 1 Compressed sensing 1.1 Underdetermined linear inverse problems Linear inverse problems model measurements of the form

More information

Generalized greedy algorithms.

Generalized greedy algorithms. Generalized greedy algorithms. François-Xavier Dupé & Sandrine Anthoine LIF & I2M Aix-Marseille Université - CNRS - Ecole Centrale Marseille, Marseille ANR Greta Séminaire Parisien des Mathématiques Appliquées

More information

Recovery Based on Kolmogorov Complexity in Underdetermined Systems of Linear Equations

Recovery Based on Kolmogorov Complexity in Underdetermined Systems of Linear Equations Recovery Based on Kolmogorov Complexity in Underdetermined Systems of Linear Equations David Donoho Department of Statistics Stanford University Email: donoho@stanfordedu Hossein Kakavand, James Mammen

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

CoSaMP. Iterative signal recovery from incomplete and inaccurate samples. Joel A. Tropp

CoSaMP. Iterative signal recovery from incomplete and inaccurate samples. Joel A. Tropp CoSaMP Iterative signal recovery from incomplete and inaccurate samples Joel A. Tropp Applied & Computational Mathematics California Institute of Technology jtropp@acm.caltech.edu Joint with D. Needell

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J Olver 8 Numerical Computation of Eigenvalues In this part, we discuss some practical methods for computing eigenvalues and eigenvectors of matrices Needless to

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle   holds various files of this Leiden University dissertation Cover Page The handle http://hdl.handle.net/1887/57796 holds various files of this Leiden University dissertation Author: Mirandola, Diego Title: On products of linear error correcting codes Date: 2017-12-06

More information

X. Numerical Methods

X. Numerical Methods X. Numerical Methods. Taylor Approximation Suppose that f is a function defined in a neighborhood of a point c, and suppose that f has derivatives of all orders near c. In section 5 of chapter 9 we introduced

More information

arxiv: v2 [cs.it] 21 Mar 2012

arxiv: v2 [cs.it] 21 Mar 2012 Compressive Sensing of Analog Signals Using Discrete Prolate Spheroidal Sequences Mark A. Davenport s and Michael B. Wakin c arxiv:1109.3649v [cs.it] 1 Mar 01 s Department of Statistics, Stanford University,

More information

A LOWER BOUND ON BLOWUP RATES FOR THE 3D INCOMPRESSIBLE EULER EQUATION AND A SINGLE EXPONENTIAL BEALE-KATO-MAJDA ESTIMATE. 1.

A LOWER BOUND ON BLOWUP RATES FOR THE 3D INCOMPRESSIBLE EULER EQUATION AND A SINGLE EXPONENTIAL BEALE-KATO-MAJDA ESTIMATE. 1. A LOWER BOUND ON BLOWUP RATES FOR THE 3D INCOMPRESSIBLE EULER EQUATION AND A SINGLE EXPONENTIAL BEALE-KATO-MAJDA ESTIMATE THOMAS CHEN AND NATAŠA PAVLOVIĆ Abstract. We prove a Beale-Kato-Majda criterion

More information

Rui ZHANG Song LI. Department of Mathematics, Zhejiang University, Hangzhou , P. R. China

Rui ZHANG Song LI. Department of Mathematics, Zhejiang University, Hangzhou , P. R. China Acta Mathematica Sinica, English Series May, 015, Vol. 31, No. 5, pp. 755 766 Published online: April 15, 015 DOI: 10.1007/s10114-015-434-4 Http://www.ActaMath.com Acta Mathematica Sinica, English Series

More information

Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing

Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas M.Vidyasagar@utdallas.edu www.utdallas.edu/ m.vidyasagar

More information

Sparsity Regularization

Sparsity Regularization Sparsity Regularization Bangti Jin Course Inverse Problems & Imaging 1 / 41 Outline 1 Motivation: sparsity? 2 Mathematical preliminaries 3 l 1 solvers 2 / 41 problem setup finite-dimensional formulation

More information

Machine Learning for Signal Processing Sparse and Overcomplete Representations

Machine Learning for Signal Processing Sparse and Overcomplete Representations Machine Learning for Signal Processing Sparse and Overcomplete Representations Abelino Jimenez (slides from Bhiksha Raj and Sourish Chaudhuri) Oct 1, 217 1 So far Weights Data Basis Data Independent ICA

More information

Interpolation via weighted l 1 minimization

Interpolation via weighted l 1 minimization Interpolation via weighted l minimization Holger Rauhut, Rachel Ward August 3, 23 Abstract Functions of interest are often smooth and sparse in some sense, and both priors should be taken into account

More information

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization Tim Roughgarden February 28, 2017 1 Preamble This lecture fulfills a promise made back in Lecture #1,

More information

A Randomized Algorithm for the Approximation of Matrices

A Randomized Algorithm for the Approximation of Matrices A Randomized Algorithm for the Approximation of Matrices Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert Technical Report YALEU/DCS/TR-36 June 29, 2006 Abstract Given an m n matrix A and a positive

More information

The Sparsest Solution of Underdetermined Linear System by l q minimization for 0 < q 1

The Sparsest Solution of Underdetermined Linear System by l q minimization for 0 < q 1 The Sparsest Solution of Underdetermined Linear System by l q minimization for 0 < q 1 Simon Foucart Department of Mathematics Vanderbilt University Nashville, TN 3784. Ming-Jun Lai Department of Mathematics,

More information

THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX

THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX MARK RUDELSON AND ROMAN VERSHYNIN Abstract. We prove an optimal estimate of the smallest singular value of a random subgaussian matrix, valid

More information

Small Ball Probability, Arithmetic Structure and Random Matrices

Small Ball Probability, Arithmetic Structure and Random Matrices Small Ball Probability, Arithmetic Structure and Random Matrices Roman Vershynin University of California, Davis April 23, 2008 Distance Problems How far is a random vector X from a given subspace H in

More information

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011)

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011) Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011) Before we consider Gelfond s, and then Schneider s, complete solutions to Hilbert s seventh problem let s look back

More information

Message passing and approximate message passing

Message passing and approximate message passing Message passing and approximate message passing Arian Maleki Columbia University 1 / 47 What is the problem? Given pdf µ(x 1, x 2,..., x n ) we are interested in arg maxx1,x 2,...,x n µ(x 1, x 2,..., x

More information