A Simple Homotopy Algorithm for Compressive Sensing

Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Department of Computer Science, the University of Iowa, Iowa City, USA
Department of Computer Science and Engineering, Michigan State University, East Lansing, USA
Institute of Data Science and Technologies at Alibaba Group, Seattle, USA
{zhanglj,

Appearing in Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS) 2015, San Diego, CA, USA. JMLR: W&CP volume 38. Copyright 2015 by the authors.

Abstract

In this paper, we consider the problem of recovering the s largest elements of an arbitrary vector from noisy measurements. Inspired by previous work, we develop a homotopy algorithm which solves the l_1-regularized least squares problem for a sequence of decreasing values of the regularization parameter. Compared to the previous method, our algorithm is more efficient in the sense that it only updates the solution once for each intermediate problem, and more practical in the sense that it has a simple stopping criterion that checks the sparsity of the intermediate solution. Theoretical analysis reveals that our method enjoys a linear convergence rate in reducing the recovery error. Furthermore, our guarantee for recovering the top s elements of the target vector is tighter than previous results, and that for recovering the target vector itself matches the state of the art in compressive sensing.

1 Introduction

Compressive Sensing (CS) is a new paradigm of data acquisition that enables reconstruction of sparse or compressible signals from a relatively small number of linear measurements (Candès and Tao, 2006; Donoho, 2006). The standard assumption is that one has access to linear measurements of the form

y = U^T x + e

where x ∈ R^d is the unknown target vector, U ∈ R^{d×m} is the sensing matrix, and e ∈ R^m is a vector of noise. Recently, substantial progress has been made in designing the encoder U and the associated decoder which recovers x from U and y (Davenport et al., 2012). One of the most famous decoders for CS is the l_1-regularized least squares (l_1-LS) formulation, known as Lasso in statistics (Tibshirani, 1996), given by

min_x (1/2)||y - U^T x||_2^2 + λ||x||_1.

The recovery performance of Lasso has been extensively studied and, generally speaking, it can recover x up to the noise level under appropriate assumptions (Daubechies et al., 2004; Tropp, 2006; Zhang, 2009). Under the assumption that λ is given beforehand, Xiao and Zhang (2012) propose a proximal-gradient homotopy method to improve the efficiency of l_1-LS. The key idea is to solve the l_1-LS problem for a sequence of decreasing values of the regularization parameter, and to use an approximate solution at the end of each stage to warm start the next stage. In this study, we take three steps further. We show that (i) it is sufficient to run composite gradient mapping only once for each regularization parameter, (ii) the target regularization parameter can be detected adaptively based on the sparsity of the current solution, and (iii) this simple algorithm delivers a tighter bound for recovering the s largest elements of x.

For a vector x, we denote by x_s the vector that contains the s largest elements of x. Let x̂ be the solution returned by our algorithm. Under the assumption that U is a sub-Gaussian random matrix, our algorithm is able to reduce the recovery error exponentially over iterations, and the final recovery error ||x̂ - x_s||_2 is smaller than

O( √(s log d / m) (||e||_2 + ||x - x_s||_2) + ||(x - x_s)^s||_2 )

where e is the vector of noise, and ||(x - x_s)^s||_2 is the l_2-norm of the s largest elements of x - x_s.

In contrast, previous analysis in CS (Davenport et al., 2012) can only upper bound the recovery error for x_s by O(||e||_2 + ||x - x_s||_2). Thus, our recovery guarantee could be significantly better than previous results if the l_2-norm of x - x_s is not concentrated on its s largest elements (i.e., ||(x - x_s)^s||_2 ≪ ||x - x_s||_2). Following the triangle inequality, we obtain the following bound for recovering x:

||x̂ - x||_2 ≤ O( ||x - x_s||_2 + √(s log d / m) ||e||_2 )

which matches the state of the art (DeVore et al., 2009).

2 Related Work

Existing algorithms in CS can be roughly categorized into convex optimization based approaches and greedy approaches (Davenport et al., 2012; Blumensath et al., 2012). Roughly speaking, convex approaches have better theoretical guarantees, while greedy approaches are more efficient.

In the noise-free setting, Candès and Tao (2005) pose the following l_1-minimization problem, denoted by ∆_1, for decoding:

min ||x||_1  s.t.  U^T x = y.

To analyze the recovery performance, they introduce the Restricted Isometry Property (RIP) for matrices. Define the isometry constant of U as the smallest number δ_s such that the following holds for all s-sparse vectors x ∈ R^d:

(1 - δ_s) ||x||_2^2 ≤ ||U^T x||_2^2 ≤ (1 + δ_s) ||x||_2^2.

It has been shown that if δ_s + δ_{2s} + δ_{3s} < 1 or δ_{2s} < √2 - 1, the decoder ∆_1 yields perfect recovery for all s-sparse vectors x ∈ R^d (Candès and Tao, 2005; Candès, 2008). If U is constructed as a random matrix with independent sub-Gaussian columns, a sufficient condition is to take m = Ω(s log d) measurements (Mendelson et al., 2008).

In the general case, Candès (2008) proposes the following convex formulation, denoted by ∆_1^ε, for decoding:

min ||x||_1  s.t.  ||y - U^T x||_2 ≤ ε

where ε is an upper bound on ||e||_2. Let ∆_1^ε(U^T x + e) be the solution returned by the above decoder. Suppose U satisfies the RIP of order 2s with δ_{2s} < √2 - 1; then

||∆_1^ε(U^T x + e) - x||_2 ≤ O( ||x - x_s||_1 / √s + ε )

for all x ∈ R^d. An obvious drawback of this approach is that we must have a good a priori estimate of ||e||_2. This limitation was soon addressed by Wojtaszczyk (2010), who shows that the decoder ∆_1 performs very well even in the noisy setting. In particular, if U is a Gaussian random matrix and m = Ω(s log d), then with an overwhelming probability,

||∆_1(U^T x + e) - x||_2 ≤ O( ||x - x_s||_1 / √s + ||e||_2 )

for all x ∈ R^d. Another possibility is to estimate the noise level under the Bayesian framework (Ji et al., 2008).

Notice that in the above inequalities, the l_2-norm of the recovery error is upper bounded by the l_1-norm of the corresponding error of the best s-term approximation. To make the result more consistent, it is natural to ask whether we could upper bound ||∆(U^T x + e) - x||_2 by ||x - x_s||_2 for some decoder ∆. Unfortunately, even in the noise-free setting, if we want the inequality

||∆(U^T x) - x||_2 ≤ O( ||x - x_s||_2 )

to hold for all x ∈ R^d, the number of measurements m needs to be on the order of d (Cohen et al., 2009). This difficulty motivates the study of instance optimality in probability, which means we are looking for a performance guarantee that holds with high probability for an arbitrary but fixed vector x. When U is a Gaussian random matrix and m = Ω(s log d), Wojtaszczyk (2010) shows that with an overwhelming probability,

||∆_1(U^T x + e) - x||_2 ≤ O( ||x - x_s||_2 + ||e||_2 )

for any fixed x ∈ R^d. This result is extended to more general families of matrices by DeVore et al. (2009).
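As a concrete illustration (our own sketch, not part of the original text), the ε-constrained decoder ∆_1^ε above can be prototyped with a generic convex solver such as cvxpy; the function name and arguments below are placeholder assumptions.

    import cvxpy as cp
    import numpy as np

    def l1_decoder(U, y, eps):
        """Solve min ||x||_1 s.t. ||y - U^T x||_2 <= eps (the epsilon-constrained l1 decoder)."""
        d = U.shape[0]
        x = cp.Variable(d)
        objective = cp.Minimize(cp.norm(x, 1))
        constraints = [cp.norm(y - U.T @ x, 2) <= eps]
        cp.Problem(objective, constraints).solve()
        return x.value

Setting eps = 0, or replacing the constraint with U.T @ x == y, recovers the noise-free decoder ∆_1.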
The typical greedy approaches include Matching Pursuit (MP) (Mallat and Zhang, 1993), Iterative Hard Thresholding (IHT) (Blumensath and Davies, 2008; Garg and Khandekar, 2009), and Compressive Sampling Matching Pursuit (CoSaMP) (Needell and Tropp, 2009). Due to space limitations, we only give a brief description of IHT. In each iteration, IHT first performs gradient descent with respect to ||U^T x - y||_2^2 and then applies hard-thresholding to keep the s largest elements, in contrast to the soft-thresholding in our method. Let x_t denote the solution of IHT in the t-th iteration. The updating rule is given by

x_{t+1} = [ x_t - η U(U^T x_t - y) ]_s

where η is the step size. Under certain RIP conditions, it has been shown that the recovery error of IHT can be upper bounded by

O( ||x - x_s||_2 + ||x - x_s||_1 / √s + ||e||_2 )

for all x ∈ R^d (Blumensath et al., 2012).
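For reference (again our own sketch, not from the paper), the IHT update above takes only a few lines of NumPy; the step size η and the number of iterations are placeholder choices.

    import numpy as np

    def hard_threshold(z, s):
        """Keep the s largest-magnitude entries of z and zero out the rest."""
        out = np.zeros_like(z)
        idx = np.argsort(np.abs(z))[-s:]
        out[idx] = z[idx]
        return out

    def iht(U, y, s, eta=1.0, n_iters=100):
        """Iterative Hard Thresholding: x_{t+1} = [x_t - eta * U (U^T x_t - y)]_s."""
        x = np.zeros(U.shape[0])
        for _ in range(n_iters):
            x = hard_threshold(x - eta * U @ (U.T @ x - y), s)
        return x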

3 A Simple Homotopy Algorithm for Compressive Sensing

We first introduce the proposed homotopy algorithm, and then present its theoretical guarantee.

3.1 The Algorithm

Algorithm 1 A Simple Homotopy Algorithm
Input: Sensing Matrix U ∈ R^{d×m}, Measurements y ∈ R^m, Shrinking Parameter γ, Sparsity s, Maximum Number of Iterations T
1: Initialize x_1 = 0, λ_1 = ||U y||_∞
2: for t = 1, 2, ..., T do
3:   x_{t+1} = P_{λ_t}( x_t - U(U^T x_t - y) )
4:   if ||x_{t+1}||_0 > 2s then
5:     Return x_t
6:   end if
7:   λ_{t+1} = γ λ_t
8: end for
9: Return x_{T+1}

In our algorithm, we solve a sequence of l_1-regularized least squares (l_1-LS) problems with decreasing regularization parameters. In particular, we set λ_1 = ||U y||_∞ and λ_{t+1} = γ λ_t for some γ < 1. For each intermediate l_1-LS problem, we use the solution from the previous iteration as the initial point, and perform composite gradient mapping (Nesterov, 2013) to update it once, i.e.,

x_{t+1} = argmin_{x ∈ R^d} ⟨x, U(U^T x_t - y)⟩ + (1/2)||x - x_t||_2^2 + λ_t ||x||_1
        = P_{λ_t}( x_t - U(U^T x_t - y) )

where P_{λ_t}(·) is the soft-thresholding operator (Donoho, 1995), applied element-wise and defined as

P_{λ_t}(α) = 0 if |α| ≤ λ_t;  sign(α)(|α| - λ_t) otherwise.

The algorithm stops when the sparsity of the intermediate solution exceeds the budget 2s. The above procedure is summarized in Algorithm 1.

Although the basic idea of our algorithm is motivated by the proximal-gradient homotopy (PGH) method (Xiao and Zhang, 2012), we make the following substantial extensions. Multiple iterations are needed by PGH to solve each intermediate optimization problem to some predefined precision; in contrast, our algorithm only updates once for each problem. While PGH assumes that a target regularization parameter is given beforehand, our algorithm is able to detect it adaptively based on the sparsity of the intermediate solution. Finally, Xiao and Zhang (2012) only analyze the recovery error of PGH for exactly sparse vectors; in comparison, we provide a recovery guarantee for an arbitrary vector. This difference is more fundamental in the context of compressive sensing.
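To make the procedure concrete, here is a minimal NumPy sketch of Algorithm 1 (our own illustration; the function and variable names are ours, and it assumes λ_1 = ||U y||_∞ as above).

    import numpy as np

    def soft_threshold(z, lam):
        """Element-wise soft-thresholding operator P_lambda."""
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    def simple_homotopy(U, y, s, gamma, T):
        """Algorithm 1: one composite gradient update per regularization parameter."""
        x = np.zeros(U.shape[0])
        lam = np.max(np.abs(U @ y))              # lambda_1 = ||U y||_inf
        for _ in range(T):
            x_next = soft_threshold(x - U @ (U.T @ x - y), lam)
            if np.count_nonzero(x_next) > 2 * s: # stopping criterion on sparsity
                return x                         # return the previous iterate x_t
            x = x_next
            lam *= gamma                         # lambda_{t+1} = gamma * lambda_t
        return x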

3.2 Main Results

We first describe the assumptions about the sensing matrix

U = (1/√m) [u_1, ..., u_m] = (1/√m) [v_1, ..., v_d]^T ∈ R^{d×m}

where u_i ∈ R^d, i = 1, ..., m, and v_j ∈ R^m, j = 1, ..., d. We assume the following.

- Both the columns and rows of U are sub-Gaussian vectors and, for the sake of simplicity, their sub-Gaussian norm is smaller than 1. That is, for any x ∈ R^d and z ∈ R^m, ||⟨x, u_i⟩||_{ψ_2} ≤ ||x||_2, i = 1, ..., m, and ||⟨z, v_j⟩||_{ψ_2} ≤ ||z||_2, j = 1, ..., d, where ||·||_{ψ_2} is the sub-Gaussian norm of random variables (Vershynin, 2012).
- The u_i's are sampled independently from an isotropic distribution, that is, E[u_i u_i^T] = I. Examples of such random vectors are Gaussian vectors and Rademacher vectors.

Given a fixed vector x, we receive measurements y = U^T x + e, where e encodes potential noise. In this paper, e is unknown and we make no assumption about the noise model. Our goal is to recover the s largest elements of x from U and y. We have the following theorem to bound the recovery error. (As a reminder, x_s denotes the vector that contains the s largest elements of x.)

Theorem 1. Let x̂ be the solution output by our algorithm and let T be the maximum number of iterations allowed. Choosing γ such that (1 + √2)/3 ≤ γ < 1, with a probability at least 1 - 6Te^{-τ},

||x̂ - x_s||_2 ≤ max( 6Λ, 3λ_1 √s γ^T )

where

Λ = √( s(τ + log d)/m ) ||e||_2 + C ( ||(x - x_s)^s||_2 + √( (τ + s log d)/m ) ||x - x_s||_2 )    (1)

for some universal constant C, provided

m ≥ C' (τ + s log d)    (2)

for another universal constant C'.

Remark: The constants in the above theorem should not be treated literally, because we made no effort to optimize them. Generally speaking, a smaller γ leads to a faster convergence rate, but a larger constant (which is 6 now) in the recovery guarantee.

The above theorem implies that the recovery error reduces exponentially until it reaches O(Λ). Thus, with a sufficiently large T, the recovery error of x_s can be upper bounded by

O( √(s log d / m) (||e||_2 + ||x - x_s||_2) + ||(x - x_s)^s||_2 ).

In the noise-free setting, our analysis also implies exact recovery of s-sparse vectors, since Λ = 0 if ||e||_2 = 0 and ||x||_0 ≤ s. From the literature of CS (Davenport et al., 2012, Theorem 1.14), we find that previous analysis is able to upper bound the recovery error of x_s by c(||x - x_s||_2 + ||e||_2) for some constant c > 1. Thus, when m is large enough, our upper bound could be significantly smaller than the existing results. Finally, it is worth mentioning that if our goal is to recover x, then, following the triangle inequality, our analysis yields the same upper bound as previous studies (DeVore et al., 2009).

3.3 A Post-processing Step

Since x̂ is 2s-sparse and x_s is s-sparse, one may ask whether it is possible to find a good s-sparse vector to approximate x_s. The following theorem shows that we can simply select the s largest elements of x̂ to approximate x_s, and the recovery error is of the same order.

Theorem 2. Let y ∈ R^d be an s-sparse vector. Then,

||x_s - y||_2 ≤ 3 ||x - y||_2,  for all x ∈ R^d.

4 Analysis

We here present the proofs of the main theorems. The omitted proofs are provided in the supplementary material.

4.1 Proof of Theorem 1

Notice that starting our algorithm with λ_1 = ||U y||_∞ has the same effect as starting with ||U y||_∞ γ^k, k ∈ Z, which means we can set λ_1 as large as we need. Thus, without loss of generality, we can assume

λ_1 √s ≥ max( ||x_s||_2, 6Λ ).    (3)

We first state two theorems that are central to our analysis. Theorem 3 reveals that the recovery error of our algorithm will reduce by a constant factor until it reaches the optimal level. Then, Theorem 4 shows that the recovery error will remain small, as long as the sparsity of the solution does not exceed 2s. We denote by S and S_t the support sets of x_s and x_t, respectively.

Theorem 3. Assume |S_t \ S| ≤ s, ||x_t - x_s||_2 ≤ 3λ_t √s, and Λ ≤ (1/2) λ_t √s, where Λ is given in (1). Then, with a probability at least 1 - 6e^{-τ}, |S_{t+1} \ S| ≤ s and ||x_{t+1} - x_s||_2 ≤ 3λ_{t+1} √s, provided the condition in (2) is true.

Theorem 4. Assume |S_t| ≤ 2s, ||x_t - x_s||_2 ≤ 6Λ, and Λ > (1/2) λ_t √s, where Λ is given in (1). If |S_{t+1}| ≤ 2s, then with a probability at least 1 - 6e^{-τ}, ||x_{t+1} - x_s||_2 ≤ 2(1 + √3)Λ, provided the condition in (2) is true.

We continue the proof of Theorem 1 in the following. Let

k = min{ t : Λ > (1/2) λ_t √s };

by (3), k > 1. In the following, we show that the recovery error ||x_t - x_s||_2 first decreases exponentially as t approaches k, and then stays below 6Λ.

T < k: From (3), ||x_1 - x_s||_2 = ||x_s||_2 ≤ 3λ_1 √s. Since the condition Λ ≤ λ_t √s / 2 holds for t = 1, ..., T, we can apply Theorem 3 to bound the recovery error in each iteration. Thus, with a probability at least 1 - 6Te^{-τ},

||x̂ - x_s||_2 = ||x_{T+1} - x_s||_2 ≤ 3λ_{T+1} √s = 3λ_1 √s γ^T.

T ≥ k: From the above analysis, with a probability at least 1 - 6(k-1)e^{-τ}, ||x_k - x_s||_2 ≤ 3√s λ_k and |S_k \ S| ≤ s, which also means our algorithm arrives at the k-th iteration. In the k-th iteration, there are two cases: |S_{k+1}| > 2s and |S_{k+1}| ≤ 2s. In the first case, our algorithm terminates and returns x_k as the final solution, implying

||x̂ - x_s||_2 = ||x_k - x_s||_2 ≤ 3λ_k √s ≤ 6Λ.

In the second case, our algorithm keeps running, and we can bound the recovery error of x_{k+1} by Theorem 4. If T = k or |S_{k+2}| > 2s, our algorithm terminates and returns x_{k+1} as the final solution, which implies

||x̂ - x_s||_2 = ||x_{k+1} - x_s||_2 ≤ 2(1 + √3)Λ.

Otherwise, our algorithm keeps running. Since 2(1 + √3) ≤ 6, the conditions in Theorem 4 are satisfied, and thus it can be applied repeatedly to bound the recovery error for all the remaining iterations.

4.2 Proof of Theorem 3

We need the following theorem to analyze the behavior of the composite gradient descent.

Theorem 5. Suppose x_t - x_s is a 3s-sparse vector. With a probability at least 1 - 6e^{-τ},

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ Λ + C √( (τ + s log d)/m ) ||x_t - x_s||_2

where Λ is given in (1).

Given a set S ⊆ [d], x_S denotes the vector which coincides with x on S and has zero coordinates outside S. We denote the sub-gradient of ||·||_1 by ∂||·||_1. Using the fact that

x_{t+1} = argmin_{x ∈ R^d} (1/2)||x - x_t + U(U^T x_t - y)||_2^2 + λ_t ||x||_1,

we have

0 ∈ x_{t+1} - x_t + U(U^T x_t - y) + λ_t ∂||x_{t+1}||_1,

and therefore

0 ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t ||x_s||_1 - λ_t ||x_{t+1}||_1
  ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t ||x_s||_1 - λ_t ||[x_{t+1}]_S||_1
  ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t ||[x_{t+1} - x_s]_S||_1
  ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t √s ||x_{t+1} - x_s||_2,

and thus

λ_t √s ||x_{t+1} - x_s||_2 ≥ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_{t+1} - x_s⟩
  = ||x_{t+1} - x_s||_2^2 + ⟨x_{t+1} - x_s, U(U^T x_t - y) - (x_t - x_s)⟩.    (4)

According to Theorem 5, with a probability at least 1 - 6e^{-τ},

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ Λ + (1/6)||x_t - x_s||_2 ≤ λ_t √s,

where the first inequality follows from (2). The above inequality implies that the magnitude of the smallest d - s elements of x_t - U(U^T x_t - y) - x_s is smaller than λ_t. Combining this with the fact that x_{t+1} = P_{λ_t}(x_t - U(U^T x_t - y)), it is easy to verify that |S_{t+1} \ S| ≤ s. Furthermore,

⟨x_{t+1} - x_s, U(U^T x_t - y) - (x_t - x_s)⟩ ≤ ( ||[x_{t+1} - x_s]_S||_2 + ||[x_{t+1} - x_s]_{S_{t+1}\S}||_2 ) λ_t √s ≤ λ_t √(2s) ||x_{t+1} - x_s||_2.

Substituting the above inequality into (4), with a probability at least 1 - 6e^{-τ},

||x_{t+1} - x_s||_2^2 ≤ ( λ_t √s + λ_t √(2s) ) ||x_{t+1} - x_s||_2

and thus

||x_{t+1} - x_s||_2 ≤ (1 + √2) λ_t √s ≤ 3λ_{t+1} √s,

where the last step follows from the choice of γ.

4.3 Proof of Theorem 4

We need to reuse (4), which appears in the analysis of Theorem 3. According to Theorem 5, with a probability at least 1 - 6e^{-τ},

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ Λ + (1/6)||x_t - x_s||_2 ≤ 2Λ,

where the first inequality follows from (2). Notice that x_{t+1} - x_s is 3s-sparse in this case, and it is easy to verify that

⟨x_{t+1} - x_s, U(U^T x_t - y) - (x_t - x_s)⟩ ≤ 2√3 Λ ||x_{t+1} - x_s||_2.

Substituting the above inequality into (4),

||x_{t+1} - x_s||_2^2 ≤ ( λ_t √s + 2√3 Λ ) ||x_{t+1} - x_s||_2

and thus, since λ_t √s < 2Λ,

||x_{t+1} - x_s||_2 ≤ 2(1 + √3)Λ.

4.4 Proof of Theorem 5

In the analysis, we need to bound ||(UU^T z)^s||_2 for a fixed vector z ∈ R^d, and ||[(UU^T - I)z]^s||_2 for all 3s-sparse vectors z ∈ R^d. Thus, we build the following two theorems.

Theorem 6. For a fixed z ∈ R^d, with a probability at least 1 - 2e^{-τ},

||(UU^T z)^s||_2 ≤ C √( (τ + s log d)/m ) ||z||_2 + ||z^s||_2

for some constant C > 0.

Theorem 7. With a probability at least 1 - 2e^{-τ}, for all z ∈ R^d with ||z||_0 ≤ 3s,

||[ (UU^T - I) z ]^s||_2 ≤ C √( (τ + s log d)/m ) ||z||_2

for some constant C > 0.

We rewrite U(U^T x_t - y) - (x_t - x_s) as

U(U^T x_t - y) - (x_t - x_s) = U(U^T x_t - U^T x - e) - (x_t - x_s)
  = UU^T(x_s - x) + (UU^T - I)(x_t - x_s) - Ue

where we define w_a := UU^T(x_s - x), w_b := (UU^T - I)(x_t - x_s), and w_c := Ue. Then,

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ ||w_a^s||_2 + ||w_b^s||_2 + ||w_c^s||_2.

Bounding ||w_a^s||_2: According to Theorem 6, with a probability at least 1 - 2e^{-τ},

||w_a^s||_2 = ||[ UU^T(x_s - x) ]^s||_2 ≤ C √( (τ + s log d)/m ) ||x - x_s||_2 + ||(x - x_s)^s||_2

for some constant C > 0.

Bounding ||w_b^s||_2: Notice that x_t - x_s is a 3s-sparse vector. According to Theorem 7, with a probability at least 1 - 2e^{-τ},

||w_b^s||_2 = ||[ (UU^T - I)(x_t - x_s) ]^s||_2 ≤ C √( (τ + s log d)/m ) ||x_t - x_s||_2

for some constant C > 0.

Bounding ||w_c^s||_2: Since U = (1/√m)[v_1, ..., v_d]^T and we assume each v_j is a sub-Gaussian vector, we have ||v_j^T e||_{ψ_2} ≤ ||e||_2, j = 1, ..., d. Using the property of the Orlicz norm (Koltchinskii, 2009, 2011), with a probability at least 1 - 2e^{-τ},

|v_j^T e| ≤ ||v_j^T e||_{ψ_2} √τ ≤ ||e||_2 √τ.

By taking the union bound, with a probability at least 1 - 2e^{-τ},

||Ue||_∞ = (1/√m) max_j |v_j^T e| ≤ ||e||_2 √( (τ + log d)/m ),

implying

||w_c^s||_2 = ||(Ue)^s||_2 ≤ √( s(τ + log d)/m ) ||e||_2.

We complete the proof by combining the bounds for ||w_a^s||_2, ||w_b^s||_2, and ||w_c^s||_2.

4.5 Proof of Theorem 6

We define the set of s-sparse vectors with length smaller than 1 as

K_{d,s} = { x ∈ R^d : ||x||_2 ≤ 1, ||x||_0 ≤ s }.

Then, it is easy to check that

E_s(z) := max_{w ∈ K_{d,s}} w^T UU^T z = ||(UU^T z)^s||_2.

Let K_{d,s}(ε) be a proper ε-net for K_{d,s} with the smallest cardinality, and let N(K_{d,s}, ε) = |K_{d,s}(ε)| be the covering number of K_{d,s}. We have the following lemma for bounding N(K_{d,s}, ε) (Plan and Vershynin, 2013, Lemma 3.3).

Lemma 1. For ε ∈ (0, 1) and s ≤ d,

log N(K_{d,s}, ε) ≤ s log( 9d/(εs) ).

Using the ε-net K_{d,s}(ε), we define a discretized version of E_s(z) as

E_s(z, ε) = max_{w ∈ K_{d,s}(ε)} w^T UU^T z.

The following lemma relates E_s(z) to E_s(z, ε).

Lemma 2. For ε ∈ (0, 1/√2),

E_s(z) ≤ E_s(z, ε) / (1 - √2 ε).

Based on the conclusion of Lemma 2, it is sufficient to bound E_s(z, ε). To this end, we need the following lemma to bound the difference between w^T UU^T z and w^T z for a fixed w.

Lemma 3. For fixed w and z with ||w||_2 ≤ 1, with a probability at least 1 - 2e^{-τ},

| w^T UU^T z - w^T z | ≤ C √(τ/m) ||z||_2

for some constant C > 0.

By taking the union bound, with a probability at least 1 - 2e^{-τ},

max_{w ∈ K_{d,s}(ε)} | w^T UU^T z - w^T z | ≤ C √( (τ + s log(9d/[εs]))/m ) ||z||_2.

Since

max_{w ∈ K_{d,s}(ε)} w^T z ≤ max_{w ∈ K_{d,s}} w^T z = ||z^s||_2,

we have

E_s(z, ε) = max_{w ∈ K_{d,s}(ε)} w^T UU^T z
  ≤ max_{w ∈ K_{d,s}(ε)} | w^T UU^T z - w^T z | + max_{w ∈ K_{d,s}(ε)} w^T z
  ≤ C √( (τ + s log(9d/[εs]))/m ) ||z||_2 + ||z^s||_2.

We complete the proof by using Lemma 2 with ε = 1/2.

4.6 Proof of Theorem 7

Recall the definitions of K_{d,s} and K_{d,s}(ε) in the proof of Theorem 6. Evidently, for any z with ||z||_0 ≤ 3s, we have

||[ (UU^T - I) z ]^s||_2 ≤ G_{3s} ||z||_2,  where  G_{3s} = max_{r ∈ K_{d,3s}} ||[ (UU^T - I) r ]^s||_2.

Thus, what we need is an upper bound for G_{3s}. Define

F_s(r) = ||[ (UU^T - I) r ]^s||_2 = max_{w ∈ K_{d,s}} w^T (UU^T - I) r.

We first provide an upper bound for F_s(r), and then for G_{3s}. From the analysis of Theorem 6, with a probability at least 1 - 2e^{-τ},

F_s(r, ε) = max_{w ∈ K_{d,s}(ε)} w^T (UU^T - I) r ≤ C √( (τ + s log(9d/[εs]))/m ) ||r||_2

for some constant C > 0. Following the proof of Lemma 2, it is straightforward to show that

F_s(r) ≤ F_s(r, ε) / (1 - √2 ε),  for all ε ∈ (0, 1/√2).

Similar to the proof of Theorem 6, we can set ε = 1/2. Thus, for a fixed r, with a probability at least 1 - 2e^{-τ},

F_s(r) ≤ C √( (τ + s log(d/s))/m ) ||r||_2

for some constant C > 0. Next, we repeat the above argument to bound G_{3s}. We define a discretized version of G_{3s} as

G_{3s}(ε) = max_{r ∈ K_{d,3s}(ε)} F_s(r).

By taking the union bound, with a probability at least 1 - 2e^{-τ},

G_{3s}(ε) ≤ C √( (τ + 3s log(9d/[3εs]))/m )

for some constant C > 0. We complete the proof by using

G_{3s} ≤ G_{3s}(ε) / (1 - √2 ε),  for all ε ∈ (0, 1/√2).

5 Empirical Study

In this section, we perform several experiments to examine the recovery performance of our method. We first examine the ability of exact recovery in the noise-free setting. The target x is a 20-sparse vector that is generated randomly and normalized to unit length, and we construct the sensing matrix U as a Gaussian random matrix. Fig. 1 shows how the recovery error of our algorithm decreases with different choices of γ. It is clear that our algorithm achieves a linear convergence rate, and a smaller γ leads to a faster convergence rate.
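For readers who want to reproduce this kind of experiment, the following sketch (ours, with placeholder dimensions d = 1000 and m = 400) builds a unit-norm 20-sparse target and a Gaussian sensing matrix scaled by 1/√m, consistent with the assumptions in Section 3.2, and runs the homotopy sketch given earlier (simple_homotopy).

    import numpy as np

    rng = np.random.default_rng(0)
    d, m, s = 1000, 400, 20          # placeholder dimensions; s = 20 as in the text

    # 20-sparse target with random support and values, normalized to unit length
    x_true = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    x_true[support] = rng.standard_normal(s)
    x_true /= np.linalg.norm(x_true)

    # Gaussian sensing matrix with isotropic columns, scaled by 1/sqrt(m)
    U = rng.standard_normal((d, m)) / np.sqrt(m)

    # noise-free measurements y = U^T x
    y = U.T @ x_true

    x_hat = simple_homotopy(U, y, s=s, gamma=0.92, T=500)
    print("recovery error:", np.linalg.norm(x_hat - x_true))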

Figure 1: x is sparse. Figure 2: x is dense. (Both figures plot the recovery error against the number of iterations for γ = 0.8, 0.92, 0.97, 0.99.)

Figure 3: x is dense and y contains noise. (Recovery error against the number of iterations for γ = 0.8, 0.92, 0.97, 0.99.) Figure 4: SHA versus PGH. (Recovery error against the number of updates for SHA with γ = 0.87, 0.89, 0.9, and for PGH.)

We then generate a unit-length vector x that has a power-law decay. Specifically, the i-th largest element of x is proportional to i^{-1}. Fig. 2 shows the recovery error for x_{20} without noise, and Fig. 3 shows the recovery error when the measurements are contaminated by Gaussian noise with ||e||_2 = 0.5. We observe that the recovery error still decreases rapidly until it reaches a certain precision, at which point the sparsity of the current solution exceeds 40 and our method terminates.

Finally, we compare our simple homotopy algorithm (SHA) with the proximal-gradient homotopy method (PGH). Following the setting in Xiao and Zhang (2012), we construct U whose entries are sampled independently from U(-√3, √3), i.e., the uniform distribution over [-√3, √3]; x ∈ R^5000 where ||x||_0 = 100 and the non-zero entries are sampled from U(-1, 1); and e ∈ R^1000 where the entries are sampled from U(-0.01, 0.01). Fig. 4 shows that our algorithm converges faster.

6 Conclusion and Future Work

In this paper, we provide a simple homotopy algorithm for CS that is more efficient and practical than its counterpart. Theoretical analysis shows that our algorithm has a linear convergence rate in reducing the recovery error, and that the recovery guarantee for x_s can be much tighter than previous results under appropriate conditions.

Notice that existing studies in CS provide matching lower and upper bounds for recovering the whole vector x (Donoho, 2006; Cohen et al., 2009). It is unclear to us what the lower bound would be if our goal is to recover x_s. We will investigate this issue in the future. We will also study how to apply our algorithm to one-bit compressive sensing (Boufounos and Baraniuk, 2008; Plan and Vershynin, 2013; Zhang et al., 2014), which is a new setting of CS where each measurement is quantized to a single bit.

Acknowledgements

This research was supported by the National Science Foundation of China, the Collaborative Innovation Center of Novel Software Technology and Industrialization, NSF IIS, and ONR Award N

References

T. Blumensath and M. E. Davies. Iterative thresholding for sparse approximations. Journal of Fourier Analysis and Applications, 14(5-6), 2008.
T. Blumensath, M. E. Davies, and G. Rilling. Greedy algorithms for compressed sensing. In Compressed Sensing: Theory and Applications, chapter 8. Cambridge University Press, 2012.
P. T. Boufounos and R. G. Baraniuk. 1-bit compressive sensing. In Proceedings of the 42nd Annual Conference on Information Sciences and Systems, pages 16-21, 2008.
E. J. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9-10), 2008.
E. J. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 2005.
E. J. Candès and T. Tao. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Transactions on Information Theory, 52(12), 2006.
A. Cohen, W. Dahmen, and R. DeVore. Compressed sensing and best k-term approximation. Journal of the American Mathematical Society, 22(1):211-231, 2009.
I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11), 2004.
M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok. Introduction to compressed sensing. In Compressed Sensing: Theory and Applications, chapter 1. Cambridge University Press, 2012.
R. DeVore, G. Petrova, and P. Wojtaszczyk. Instance-optimality in probability with an l_1-minimization decoder. Applied and Computational Harmonic Analysis, 27(3), 2009.
D. L. Donoho. De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3), 1995.
D. L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4), 2006.
R. Garg and R. Khandekar. Gradient descent with sparsification: An iterative algorithm for sparse recovery with restricted isometry property. In Proceedings of the 26th Annual International Conference on Machine Learning, 2009.
S. Ji, Y. Xue, and L. Carin. Bayesian compressive sensing. IEEE Transactions on Signal Processing, 56(6), 2008.
V. Koltchinskii. The Dantzig selector and sparsity oracle inequalities. Bernoulli, 15(3), 2009.
V. Koltchinskii. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Springer, 2011.
S. G. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12), 1993.
S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Uniform uncertainty principle for Bernoulli and subgaussian ensembles. Constructive Approximation, 28(3), 2008.
D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3), 2009.
Y. Nesterov. Gradient methods for minimizing composite functions. Mathematical Programming, 140(1), 2013.
Y. Plan and R. Vershynin. One-bit compressed sensing by linear programming. Communications on Pure and Applied Mathematics, 66(8), 2013.
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1), 1996.
J. A. Tropp. Just relax: convex programming methods for identifying sparse signals in noise. IEEE Transactions on Information Theory, 52(3), 2006.
R. Vershynin. Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing: Theory and Applications, chapter 5. Cambridge University Press, 2012.
P. Wojtaszczyk. Stability and instance optimality for Gaussian measurements in compressed sensing. Foundations of Computational Mathematics, 10(1), 2010.
L. Xiao and T. Zhang. A proximal-gradient homotopy method for the l_1-regularized least-squares problem. In Proceedings of the 29th International Conference on Machine Learning, 2012.
L. Zhang, J. Yi, and R. Jin. Efficient algorithms for robust one-bit compressive sensing. In Proceedings of the 31st International Conference on Machine Learning, 2014.
T. Zhang. Some sharp performance bounds for least squares regression with l_1 regularization. The Annals of Statistics, 37(5A), 2009.
