Recovery of Sparse Signals Using Multiple Orthogonal Least Squares


Recovery of Sparse Signals Using Multiple Orthogonal Least Squares

Jian Wang and Ping Li
Department of Statistics and Biostatistics, Department of Computer Science
Rutgers University, Piscataway, New Jersey 08854, USA
Email: {jwang,pingli}@stat.rutgers.edu

arXiv:1410.2505v1 [stat.ME] 9 Oct 2014

Abstract: We study the problem of sparse recovery from compressed measurements. This problem has generated a great deal of interest in recent years. To recover the sparse signal, we propose a new method called multiple orthogonal least squares (MOLS), which extends the well-known orthogonal least squares (OLS) algorithm by choosing multiple indices per iteration. Due to the inclusion of multiple support indices in each selection, the MOLS algorithm converges in much fewer iterations and hence improves the computational efficiency over the OLS algorithm. Theoretical analysis shows that MOLS performs exact recovery of any K-sparse signal within K iterations if the measurement matrix satisfies the restricted isometry property (RIP) with isometry constant δ_{LK} < √L/(√K + 2.5√L). Empirical experiments demonstrate that MOLS is very competitive in recovering sparse signals compared to the state-of-the-art recovery algorithms.

Index Terms: Compressed sensing (CS), orthogonal matching pursuit (OMP), orthogonal least squares (OLS), restricted isometry property (RIP).

I. INTRODUCTION

In recent years, sparse recovery has attracted much attention in signal processing, image processing, and computer science [1]-[4]. The main task of sparse recovery is to recover a high-dimensional K-sparse vector x ∈ R^n (‖x‖_0 ≤ K) from a small number of linear measurements

y = Φx,   (1)

where Φ ∈ R^{m×n} (m < n) is often called the measurement matrix. Although the system is underdetermined, owing to the signal sparsity, x can be accurately recovered from the measurements y by solving an ℓ_0-minimization problem:

min_x ‖x‖_0 subject to y = Φx.   (2)

This method, however, is known to be intractable due to the combinatorial search involved and is therefore impractical for realistic applications. Much recent effort has been devoted to developing efficient algorithms for recovering the sparse signal. In general, the algorithms can be classified into two major categories: those using convex optimization techniques and those relying on greedy search principles. The optimization-based approaches replace the nonconvex ℓ_0-norm with its convex relaxation, the ℓ_1-norm, translating the combinatorially hard search into a computationally tractable problem:

min_x ‖x‖_1 subject to y = Φx.   (3)

This approach is known as basis pursuit (BP) [5]. It has been revealed that, under appropriate constraints on the measurement matrix, BP yields exact recovery of the sparse signal.

The second category of approaches for sparse recovery are greedy algorithms, in which the signal support is iteratively constructed according to various greedy principles. Greedy algorithms have gained considerable popularity due to their computational simplicity and competitive performance. Representative algorithms include matching pursuit (MP) [6], orthogonal matching pursuit (OMP) [7]-[12], and orthogonal least squares (OLS) [13]-[16]. Both OMP and OLS update the

support of the underlying sparse signal by adding one index at a time, and estimate the sparse coefficients over the enlarged support. The main difference between OMP and OLS lies in the greedy rule for updating the support. While OMP finds the column that is most strongly correlated with the signal residual, OLS seeks to maximally reduce the residual energy with the enlarged support set. It has been shown that OLS has better convergence properties but is computationally more expensive than OMP [16].

In this paper, with the aim of improving the recovery accuracy and reducing the computational cost of OLS, we propose a new method called multiple orthogonal least squares (MOLS), which can be viewed as an extension of OLS in the sense that multiple indices are allowed to be chosen at a time. Our method is inspired by the fact that the sub-optimal candidates in each OLS identification are likely to be reliable, and hence can be utilized to better reduce the energy of the signal residual at each iteration. The main steps of the MOLS algorithm are specified in Table I. Owing to the selection of multiple good candidates at each time, MOLS converges in much fewer iterations and thus reduces the computational cost over the conventional OLS algorithm.

Greedy methods with a similar flavor to MOLS in adding multiple indices per iteration include stagewise OMP (StOMP) [17], regularized OMP (ROMP) [18], and generalized OMP (gOMP) [19] (also known as the orthogonal super greedy algorithm (OSGA) [20]), etc. These algorithms identify candidates at each iteration according to the correlations between the columns of the measurement matrix and the residual vector. Specifically, StOMP picks indices whose correlation magnitudes exceed a deliberately designed threshold. ROMP first chooses a set of K indices with the strongest correlations and then narrows down the candidates to a subset based on a predefined regularization rule. The gOMP algorithm finds a fixed number of indices with the strongest correlations in each selection. Other greedy methods, adopting a different strategy of adding as well as pruning indices from the list, include compressive sampling matching pursuit (CoSaMP) [21], subspace pursuit (SP) [22], and hard thresholding pursuit (HTP) [23].

The main contributions of this paper are summarized as follows. We propose a new algorithm, referred to as MOLS, for recovering sparse signals from compressed measurements. We analyze the MOLS algorithm using the restricted isometry property (RIP). A measurement matrix Φ is said to satisfy the RIP of order K if there

TABLE I
THE MOLS ALGORITHM
Input: measurement matrix Φ ∈ R^{m×n}, measurement vector y ∈ R^m, sparsity level K, and selection parameter L (L ≤ K).
Initialize: iteration count k = 0, estimated support T^0 = ∅, and residual vector r^0 = y.
While ‖r^k‖_2 > ǫ and k < min{K, m/L}, do
  k = k + 1.
  Identify S^k = argmin_{S:|S|=L} Σ_{i∈S} ‖P^⊥_{T^{k-1}∪{i}} y‖_2².
  Enlarge T^k = T^{k-1} ∪ S^k.
  Estimate x^k = argmin_{u:supp(u)=T^k} ‖y − Φu‖_2.
  Update r^k = y − Φx^k.
End
Output: the estimated support T̂ = argmin_{S:|S|=K} ‖x^k − x^k_S‖_2 and the K-sparse signal x̂ satisfying x̂_{H\T̂} = 0 and x̂_{T̂} = x^k_{T̂}.

exists a constant δ ∈ [0, 1) such that [24]

(1 − δ)‖x‖_2² ≤ ‖Φx‖_2² ≤ (1 + δ)‖x‖_2²   (4)

for all K-sparse vectors x. In particular, the minimum of all constants δ satisfying (4) is the isometry constant δ_K. Our result shows that when L > 1, the MOLS algorithm exactly recovers any K-sparse signal in K iterations if the measurement matrix Φ obeys the RIP with isometry constant

δ_{LK} < √L/(√K + 2.5√L).   (5)

For the special case when L = 1, MOLS reduces to the conventional OLS algorithm, and the recovery condition is given by

δ_{K+1} < 1/(√K + 1).   (6)

We show that the condition in (6) is nearly optimal for OLS in the sense that, even with a slight relaxation of the condition (e.g., to δ_{K+1} = 1/√K), the exact recovery with OLS may fail.

We conduct experiments to test the effectiveness of the MOLS algorithm. Our empirical results demonstrate that MOLS (with L > 1) converges in much fewer iterations and has lower computational cost than the OLS algorithm, while exhibiting better recovery accuracy. The empirical results also show that MOLS is very competitive in recovering sparse signals when compared to the state-of-the-art sparse recovery algorithms.

The rest of this paper is organized as follows: In Section II, we introduce notations and lemmas that are used in the paper. In Section III, we give useful observations regarding MOLS. In Section IV, we analyze the recovery condition of MOLS. In Section V, we empirically study the recovery performance of the MOLS algorithm. Concluding remarks are given in Section VI.

II. PRELIMINARIES

A. Notations

We first briefly summarize the notations used in this paper. Let H = {1, 2, ..., n} and let T = supp(x) = {i | i ∈ H, x_i ≠ 0} denote the support of the vector x. For S ⊆ H, |S| is the cardinality of S. T\S is the set of all elements contained in T but not in S. x_S ∈ R^{|S|} is the restriction of the vector x to the elements with indices in S. For mathematical convenience, we assume that Φ has unit ℓ_2-norm columns throughout this paper. Φ_S ∈ R^{m×|S|} is the submatrix of Φ that only contains the columns indexed by S. If Φ_S has full column rank, then Φ_S^† = (Φ'_SΦ_S)^{-1}Φ'_S is the pseudoinverse of Φ_S. span(Φ_S) represents the span of the columns of Φ_S. P_S = Φ_SΦ_S^† stands for the projection onto span(Φ_S). P^⊥_S = I − P_S is the projection onto the orthogonal complement of span(Φ_S), where I denotes the identity matrix.

B. Lemmas

The following lemmas are useful for our analysis.

Lemma 1 (Lemma 3 in [24]): If a measurement matrix satisfies the RIP of both orders K_1 and K_2 with K_1 ≤ K_2, then δ_{K_1} ≤ δ_{K_2}. This property is often referred to as the monotonicity of the isometry constant.

Lemma 2 (Consequences of the RIP [21], [25]): Let S ⊆ H. If δ_{|S|} < 1, then for any u ∈ R^{|S|},

(1 − δ_{|S|})‖u‖_2 ≤ ‖Φ'_SΦ_S u‖_2 ≤ (1 + δ_{|S|})‖u‖_2,
(1/(1 + δ_{|S|}))‖u‖_2 ≤ ‖(Φ'_SΦ_S)^{-1} u‖_2 ≤ (1/(1 − δ_{|S|}))‖u‖_2.

Lemma 3 (Lemma 2.1 in [26]): Let S_1, S_2 ⊆ H with S_1 ∩ S_2 = ∅. If δ_{|S_1|+|S_2|} < 1, then

‖Φ'_{S_1}Φv‖_2 ≤ δ_{|S_1|+|S_2|}‖v‖_2

holds for any vector v ∈ R^n supported on S_2.

Lemma 4: Let S ⊆ H. If δ_{|S|} < 1, then for any u ∈ R^{|S|},

(1/√(1 + δ_{|S|}))‖u‖_2 ≤ ‖(Φ_S^†)' u‖_2 ≤ (1/√(1 − δ_{|S|}))‖u‖_2.

Proof: See Appendix A.

III. SIMPLIFICATION OF THE MOLS IDENTIFICATION

Let us begin with an interesting (and important) observation regarding the identification step of MOLS shown in Table I. At the (k+1)-th iteration (k ≥ 0), MOLS adds to T^k a set of L indices,

S^{k+1} = argmin_{S:|S|=L} Σ_{i∈S} ‖P^⊥_{T^k∪{i}} y‖_2².   (7)

Intuitively, a straightforward implementation of (7) requires one to sort all elements in {‖P^⊥_{T^k∪{i}} y‖_2²}_{i∈H\T^k} and to find the L smallest ones (and their corresponding indices). This implementation, however, is computationally expensive, as it requires the construction of n − Lk different orthogonal projections (i.e., P^⊥_{T^k∪{i}}, i ∈ H\T^k). Therefore, it is highly desirable to find a cost-effective alternative to (7) for the MOLS identification. Interestingly, the following proposition illustrates that (7) can be substantially simplified.

Proposition 1: At the (k+1)-th iteration, the MOLS algorithm identifies a set of L indices:

S^{k+1} = argmax_{S:|S|=L} Σ_{i∈S} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2²   (8)
        = argmax_{S:|S|=L} Σ_{i∈S} |⟨P^⊥_{T^k}φ_i / ‖P^⊥_{T^k}φ_i‖_2, r^k⟩|².   (9)
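Before turning to the proof, the equivalence of (7) and (8) is easy to confirm numerically. The snippet below is an illustrative check on a random instance (the matrix, the current support T^k, and all variable names are our own choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, L = 32, 64, 3
Phi = rng.standard_normal((m, n))
Phi /= np.linalg.norm(Phi, axis=0)          # unit l2-norm columns
y = rng.standard_normal(m)

T = [3, 11, 25]                              # a hypothetical current support T^k
P_T = Phi[:, T] @ np.linalg.pinv(Phi[:, T])
r = y - P_T @ y                              # residual r^k = P_perp_{T^k} y
cand = [i for i in range(n) if i not in T]

# Rule (7): residual energy after projecting y onto span(Phi_{T^k U {i}})
cost7 = {i: np.linalg.norm(y - Phi[:, T + [i]] @ (np.linalg.pinv(Phi[:, T + [i]]) @ y)) ** 2
         for i in cand}
S7 = sorted(sorted(cand, key=lambda i: cost7[i])[:L])

# Rule (8): normalized correlation with the residual (only one projection needed)
P_perp = np.eye(m) - P_T
gain8 = {i: abs(Phi[:, i] @ r) / np.linalg.norm(P_perp @ Phi[:, i]) for i in cand}
S8 = sorted(sorted(cand, key=lambda i: -gain8[i])[:L])

print(S7 == S8)                              # expected: True
```

Both rules return the same L indices because, as the proof below shows, the per-candidate residual energy in (7) differs from ‖P^⊥_{T^k} y‖_2² exactly by the squared normalized correlation appearing in (8).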

Proof: Since P^⊥_{T^k∪{i}} y and P_{T^k∪{i}} y are orthogonal, ‖P^⊥_{T^k∪{i}} y‖_2² = ‖y‖_2² − ‖P_{T^k∪{i}} y‖_2², and hence (7) is equivalent to

S^{k+1} = argmax_{S:|S|=L} Σ_{i∈S} ‖P_{T^k∪{i}} y‖_2².   (10)

By noting that P_{T^k∪{i}} can be decomposed as (see Appendix B)

P_{T^k∪{i}} = P_{T^k} + P^⊥_{T^k}φ_i (φ'_i P^⊥_{T^k}φ_i)^{-1} φ'_i P^⊥_{T^k},   (11)

we have

‖P_{T^k∪{i}} y‖_2² = ‖P_{T^k} y + P^⊥_{T^k}φ_i (φ'_i P^⊥_{T^k}φ_i)^{-1} φ'_i P^⊥_{T^k} y‖_2²
 (a)= ‖P_{T^k} y‖_2² + ‖P^⊥_{T^k}φ_i (φ'_i P^⊥_{T^k}φ_i)^{-1} φ'_i P^⊥_{T^k} y‖_2²
 (b)= ‖P_{T^k} y‖_2² + (|φ'_i P^⊥_{T^k} y| / (φ'_i P^⊥_{T^k}φ_i))² ‖P^⊥_{T^k}φ_i‖_2²
 (c)= ‖P_{T^k} y‖_2² + (|φ'_i P^⊥_{T^k} y| / ‖P^⊥_{T^k}φ_i‖_2)²
 (d)= ‖P_{T^k} y‖_2² + (|⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2)²,   (12)

where (a) is because P_{T^k} y and P^⊥_{T^k}φ_i (φ'_i P^⊥_{T^k}φ_i)^{-1} φ'_i P^⊥_{T^k} y are orthogonal, (b) follows from the fact that φ'_i P^⊥_{T^k} y and φ'_i P^⊥_{T^k}φ_i are scalars, (c) is from P^⊥_{T^k} = (P^⊥_{T^k})' and P^⊥_{T^k} = (P^⊥_{T^k})², so that φ'_i P^⊥_{T^k}φ_i = φ'_i (P^⊥_{T^k})' P^⊥_{T^k}φ_i = ‖P^⊥_{T^k}φ_i‖_2², and (d) is due to r^k = P^⊥_{T^k} y. By relating (10) and (12), we have

S^{k+1} = argmax_{S:|S|=L} Σ_{i∈S} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2².

Furthermore, if we write

⟨φ_i, r^k⟩ = φ'_i P^⊥_{T^k} y = φ'_i (P^⊥_{T^k})' P^⊥_{T^k} y = ⟨P^⊥_{T^k}φ_i, r^k⟩,

then (8) becomes

S^{k+1} = argmax_{S:|S|=L} Σ_{i∈S} |⟨P^⊥_{T^k}φ_i / ‖P^⊥_{T^k}φ_i‖_2, r^k⟩|².

This completes the proof.

Interestingly, we can interpret from (8) that, to identify S^{k+1}, it suffices to find the L largest values in {|⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2²}_{i∈H\T^k}, which is much simpler than (7) as it involves only one projection operator (i.e., P^⊥_{T^k}). Indeed, numerical experiments have confirmed that this simplification offers a large reduction in computational cost. The result (8) will also play an important role in our analysis. Another interesting point we would like to mention is that the result (9) provides a geometric interpretation of the selection rule in MOLS: the columns of the measurement matrix are projected onto the subspace that is orthogonal to the span of the active columns, and the L normalized projected columns that are best correlated with the residual vector are selected.

IV. SIGNAL RECOVERY USING MOLS

A. Exact recovery condition

In this subsection, we study the recovery condition of MOLS that guarantees the selection of all support indices within K iterations. For the convenience of stating the results, we say that MOLS makes a success in an iteration if it selects at least one correct index at that iteration. Clearly, if MOLS makes a success in each iteration, then it is guaranteed to select all support indices within K iterations.

Success at the first iteration: Observe from (8) that when k = 0 (i.e., at the first iteration), we have ‖P^⊥_{T^0}φ_i‖_2 = ‖φ_i‖_2 = 1, and MOLS selects the set

T^1 = S^1 = argmax_{S:|S|=L} Σ_{i∈S} |⟨φ_i, r^0⟩|² / ‖P^⊥_{T^0}φ_i‖_2² = argmax_{S:|S|=L} Σ_{i∈S} |⟨φ_i, y⟩|² = argmax_{S:|S|=L} ‖Φ'_S y‖_2².   (13)

That is,

‖Φ'_{T^1} y‖_2 = max_{S:|S|=L} ‖Φ'_S y‖_2.   (14)

By noting that L ≤ K (so that the average of the L largest values of |⟨φ_i, y⟩|² is at least the average of these values over the K indices in T), we know

(1/√L)‖Φ'_{T^1} y‖_2 ≥ (1/√K)‖Φ'_T y‖_2 = (1/√K)‖Φ'_TΦ_T x_T‖_2 ≥ (1/√K)(1 − δ_K)‖x‖_2.   (15)

On the other hand, if no correct index is chosen at the first iteration (i.e., T ∩ T^1 = ∅), then

(1/√L)‖Φ'_{T^1} y‖_2 = (1/√L)‖Φ'_{T^1}Φ_T x_T‖_2 ≤(a) (δ_{K+L}/√L)‖x‖_2,   (16)

where (a) follows from Lemma 3. This, however, contradicts (15) if

δ_{K+L} < (√L/√K)(1 − δ_K),   (17)

which in turn holds (since δ_K ≤ δ_{K+L} by Lemma 1) whenever

δ_{K+L} < √L/(√K + √L).   (18)

Therefore, under (18), at least one correct index is chosen at the first iteration of MOLS.

Success at the (k+1)-th iteration (k > 0): Assume that MOLS selects at least one correct index in each of the previous k iterations, and denote by l the number of correct indices in T^k. Then,

l = |T ∩ T^k| ≥ k.   (19)

Also, assume that T^k does not contain all correct indices (l < K). Under these assumptions, we will build a condition to ensure that MOLS selects at least one correct index at the (k+1)-th iteration.

For convenience, we introduce two additional notations. Let u_1 denote the largest value of |⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2, i ∈ T\T^k. Also, let v_L denote the L-th largest value of |⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2, i ∈ H\(T ∪ T^k). Then it follows from (8) that, as long as

u_1 > v_L,   (20)

u_1 is contained in the top L among all values in {|⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2}_{i∈H\T^k}, and hence at least one correct index (i.e., the one corresponding to u_1) is selected at the (k+1)-th iteration of MOLS. The following proposition gives a lower bound on u_1 and an upper bound on v_L.

Proposition 2:

u_1 ≥ (1/√(K−l)) (1 − δ_{K−l} − δ_{Lk+K−l}²/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2,   (21)
v_L ≤ (1/√L) ((1 − δ_{Lk})/(1 − δ_{Lk} − δ_{Lk+1}²))^{1/2} (δ_{L+K−l} + δ_{L+Lk}δ_{Lk+K−l}/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2.   (22)

Proof: See Appendix C.

By noting that k ≤ l < K and L ≤ K, and also using the monotonicity of the isometry constant in Lemma 1, we have

K − l < LK ⇒ δ_{K−l} < δ_{LK},
Lk + K − l ≤ LK ⇒ δ_{Lk+K−l} ≤ δ_{LK},
Lk < LK ⇒ δ_{Lk} < δ_{LK},   (23)
Lk + 1 ≤ LK ⇒ δ_{Lk+1} ≤ δ_{LK},
L + Lk ≤ LK ⇒ δ_{L+Lk} ≤ δ_{LK},
L + K − l ≤ LK ⇒ δ_{L+K−l} ≤ δ_{LK}.

From (22) and (23), we have

v_L < (1/√L) ((1 − δ_{LK})/(1 − δ_{LK} − δ_{LK}²))^{1/2} (δ_{LK} + δ_{LK}²/(1 − δ_{LK})) ‖x_{T\T^k}‖_2
    = (δ_{LK} / (√L ((1 − δ_{LK})(1 − δ_{LK} − δ_{LK}²))^{1/2})) ‖x_{T\T^k}‖_2.   (24)

Also, from (21) and (23), we have

u_1 ≥ (1/√(K−l)) (1 − δ_{LK} − δ_{LK}²/(1 − δ_{LK})) ‖x_{T\T^k}‖_2 >(a) (1/√K) (1 − δ_{LK} − δ_{LK}²/(1 − δ_{LK})) ‖x_{T\T^k}‖_2,   (25)

where (a) follows from K − l < K. Using (24) and (25), we obtain a sufficient condition for u_1 > v_L as

(1/√K) (1 − δ_{LK} − δ_{LK}²/(1 − δ_{LK})) ‖x_{T\T^k}‖_2 ≥ (δ_{LK} / (√L ((1 − δ_{LK})(1 − δ_{LK} − δ_{LK}²))^{1/2})) ‖x_{T\T^k}‖_2,   (26)

which is true under (see Appendix D)

δ_{LK} ≤ √L/(√K + 2.5√L).   (27)

Therefore, under (27), MOLS selects at least one correct index at the (k+1)-th iteration.

So far, we have obtained condition (18) for guaranteeing the success of MOLS at the first iteration and condition (27) for the success of the succeeding iterations. We now combine them to obtain a sufficient condition for MOLS to ensure the exact recovery of all K-sparse signals within K iterations.

Theorem 5 (Recovery guarantee for MOLS): The MOLS algorithm exactly recovers any K-sparse vector x from the measurements y = Φx within K iterations if the measurement matrix Φ satisfies the RIP with isometry constant

δ_{LK} < √L/(√K + 2.5√L) for L > 1,
δ_{K+1} < 1/(√K + 1) for L = 1.   (28)

Proof: Our goal is to prove an exact recovery condition for the MOLS algorithm. To this end, we first derive a condition ensuring the selection of all support indices within K iterations of MOLS, and then show that when all support indices are chosen, the recovery of the sparse signal is exact. Clearly, the condition that guarantees MOLS to select all support indices is determined by the stricter of the two conditions (18) and (27). We consider the following two cases.

1) L = 1: In this case, the condition (18) for the success of the first iteration becomes

δ_{K+1} < 1/(√K + 1).   (29)

In fact, (29) also guarantees the success of the succeeding iterations, and hence becomes the overall condition for this case. We justify this by mathematical induction on k. Under the hypothesis that the former k (< K) iterations are successful (i.e., that the algorithm selects one correct index in each of the k iterations), we have T^k ⊂ T, and hence the residual r^k = y − Φx^k = Φ(x − x^k) can be viewed as the measurements of the K-sparse vector x − x^k (which is supported on T ∪ T^k = T) using the measurement matrix Φ. Therefore, (29) guarantees a success at the (k+1)-th iteration of the algorithm. In summary, (29) ensures the selection of all support indices within K iterations.

2) L > 1: Since δ_{K+L} ≤ δ_{LK} and

√L/(√K + √L) > √L/(√K + 2.5√L),   (30)

condition (27) is stricter than (18) and becomes the overall condition for this case.

When all support indices are selected by MOLS (i.e., T ⊆ T^k̂, where k̂ (≤ K) denotes the number of actually performed iterations; note that, since k ≤ min{K, m/L} by Table I, the total number of selected indices within K iterations of MOLS does not exceed m, which ensures that the sparse signal can be recovered through an LS projection), we have

x^k̂_{T^k̂} = argmin_u ‖y − Φ_{T^k̂} u‖_2 = Φ_{T^k̂}^† y = Φ_{T^k̂}^† Φx = Φ_{T^k̂}^† Φ_{T^k̂} x_{T^k̂} = x_{T^k̂}.   (31)

As a result, the residual vector becomes zero (r^k̂ = y − Φx^k̂ = 0), the algorithm terminates, and it returns the exact recovery of the sparse signal (x̂ = x).

B. Optimality of the recovery condition for OLS

When L = 1, MOLS reduces to the conventional OLS algorithm. Theorem 5 suggests that, under δ_{K+1} < 1/(√K + 1), OLS recovers any K-sparse signal in K iterations. In fact, this condition is not only sufficient but also nearly necessary for OLS in the sense that, even with a slight relaxation on the isometry constant (e.g., relaxing to δ_{K+1} = 1/√K), the recovery may fail. We justify this argument with an example for K = 2: take a measurement matrix Φ with three unit ℓ_2-norm columns and a 2-sparse signal x supported on the first two columns, with the entries of Φ chosen such that the Gram matrix Φ'Φ has isometry constant δ_3 = 1/√2.

One can show that the eigenvalues of Φ'Φ are λ_1(Φ'Φ) = λ_2(Φ'Φ) = 1 + √2/4 and λ_3(Φ'Φ) = 1 − √2/2. Hence, Φ satisfies the RIP with

δ_3 = max{λ_max(Φ'Φ) − 1, 1 − λ_min(Φ'Φ)} = 1/√2 = 1/√K.   (33)

However, in this case OLS identifies an incorrect index at the first iteration,

t^1 = argmin_{i∈{1,2,3}} ‖P^⊥_{{i}} y‖_2 = 3,   (34)

and hence the recovery fails. (Note that, in order to recover a K-sparse signal exactly within K steps of OLS, any selection of an incorrect index is not allowed.)

C. The relationship between MOLS and gOMP

We can gain good insights by studying the relationship between MOLS and the generalized OMP (gOMP) [19]. gOMP differs from MOLS in that, at each iteration, it finds the L indices with the strongest correlations to the residual vector. Interestingly, it can be shown that MOLS belongs to the family of weak gOMP algorithms (also known as WOSGA [20]), which includes gOMP as a special case. By definition, the weak gOMP algorithm with parameter μ^{k+1} ∈ (0, 1] identifies a set S^{k+1} of L indices at the (k+1)-th iteration (k ≥ 0) such that

‖Φ'_{S^{k+1}} r^k‖_2 ≥ μ^{k+1} max_{S:|S|=L} ‖Φ'_S r^k‖_2.   (35)

Clearly it embraces the gOMP algorithm as a special case when μ^{k+1} = 1.

Theorem 6: MOLS belongs to the family of weak gOMP algorithms with parameter μ^{k+1} = (1 − δ_{Lk+1}²/(1 − δ_{Lk}))^{1/2}.

Proof: Denote by S^{k+1} the set of L indices selected by MOLS at the (k+1)-th iteration, and let

S̄^{k+1} = argmax_{S:|S|=L} ‖Φ'_S r^k‖_2.   (36)

From (8),

Σ_{i∈S^{k+1}} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2² ≥ Σ_{i∈S̄^{k+1}} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2² ≥ Σ_{i∈S̄^{k+1}} |⟨φ_i, r^k⟩|² = max_{S:|S|=L} ‖Φ'_S r^k‖_2²,   (37)

where the last inequality holds because φ_i has unit ℓ_2-norm and hence ‖P^⊥_{T^k}φ_i‖_2 ≤ 1. On the other hand,

Σ_{i∈S^{k+1}} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2² ≤ (1 / min_{i∈S^{k+1}} ‖P^⊥_{T^k}φ_i‖_2²) Σ_{i∈S^{k+1}} |⟨φ_i, r^k⟩|²
 = (1 / (1 − max_{i∈S^{k+1}} ‖(Φ_{T^k}^†)'Φ'_{T^k}φ_i‖_2²)) ‖Φ'_{S^{k+1}} r^k‖_2²
 ≤ (1 / (1 − δ_{Lk+1}²/(1 − δ_{Lk}))) ‖Φ'_{S^{k+1}} r^k‖_2²,   (38)

where the equality uses ‖P^⊥_{T^k}φ_i‖_2² = 1 − ‖P_{T^k}φ_i‖_2² together with P_{T^k}φ_i = (Φ_{T^k}^†)'Φ'_{T^k}φ_i, and (38) is due to the fact that, for any i ∈ S^{k+1},

‖(Φ_{T^k}^†)'Φ'_{T^k}φ_i‖_2 ≤(a) (1/√(1 − δ_{Lk})) ‖Φ'_{T^k}φ_i‖_2 ≤(b) (δ_{Lk+1}/√(1 − δ_{Lk})) ‖φ_i‖_2 = δ_{Lk+1}/√(1 − δ_{Lk}),   (39)

where (a) and (b) follow from Lemma 4 and Lemma 3, respectively.

By combining (37) and (38), we obtain

‖Φ'_{S^{k+1}} r^k‖_2 ≥ (1 − δ_{Lk+1}²/(1 − δ_{Lk}))^{1/2} max_{S:|S|=L} ‖Φ'_S r^k‖_2.   (40)

This completes the proof.
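As a quick numerical illustration of this relationship (our own toy example, not from the paper; the matrix, the partial support, and the residual are arbitrary), one can compare the set chosen by the MOLS rule (8) with the gOMP choice and compute the ratio appearing in (35):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, L = 64, 128, 3
Phi = rng.standard_normal((m, n))
Phi /= np.linalg.norm(Phi, axis=0)        # unit l2-norm columns, as assumed in the paper

T = [5, 17, 42]                           # a hypothetical partial support T^k
P_perp = np.eye(m) - Phi[:, T] @ np.linalg.pinv(Phi[:, T])
r = P_perp @ rng.standard_normal(m)       # a residual lying in the orthogonal complement

corr = np.abs(Phi.T @ r)                  # |<phi_i, r^k>|
S_gomp = np.argsort(corr)[-L:]            # gOMP: the L largest correlations

proj_norm = np.linalg.norm(P_perp @ Phi, axis=0)   # ||P_perp_{T^k} phi_i||_2
proj_norm[T] = np.inf                     # exclude indices already in T^k
S_mols = np.argsort(corr / proj_norm)[-L:]         # MOLS rule (8)

# Observed ratio from (35); it is at most 1 because S_gomp maximizes the norm
mu = np.linalg.norm(Phi[:, S_mols].T @ r) / np.linalg.norm(Phi[:, S_gomp].T @ r)
print(sorted(S_gomp.tolist()), sorted(S_mols.tolist()), round(float(mu), 4))
```

The printed ratio cannot exceed 1 by construction; Theorem 6 says that, when r^k is the MOLS residual, it can never fall below (1 − δ_{Lk+1}²/(1 − δ_{Lk}))^{1/2}.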

V. EMPIRICAL RESULTS

In this section, we empirically study the performance of the MOLS algorithm for recovering sparse signals. We adopt the testing strategy in [9], [12], [17], which measures the performance of recovery algorithms by testing their empirical frequency of exact reconstruction of sparse signals. In each trial, we construct an m × n measurement matrix (where m = 128 and n = 256) with entries drawn independently from a Gaussian distribution with zero mean and variance 1/m. For each value of K in {5, ..., 64}, we generate a K-sparse signal of length n whose support is chosen uniformly at random and whose nonzero elements are 1) drawn independently from a standard Gaussian distribution, or 2) chosen randomly from the set {±1, ±3}. We refer to these two types of signals as the sparse Gaussian signal and the sparse 4-ary pulse amplitude modulation (4-PAM) signal, respectively. We should mention that reconstructing sparse 4-PAM signals is a particularly challenging case for OMP and OLS.

For comparative purposes, we consider the following recovery approaches in our simulation:
1) OLS and MOLS,
2) OMP and gOMP (http://isl.korea.ac.kr/paper/gomp.zip),
3) StOMP (http://sparselab.stanford.edu/),
4) ROMP (http://www.cmc.edu/pages/faculty/dneedell),
5) CoSaMP (http://www.cmc.edu/pages/faculty/dneedell),
6) BP by linear programming (LP) (http://cvxr.com/cvx/).
(Note that for both gOMP and MOLS, the selection parameter L should obey L ≤ K; hence, we choose L = 3, 5 in our simulation. Moreover, StOMP has two thresholding strategies: false alarm control (FAC) and false discovery control (FDC). We exclusively use FAC, since FAC outperforms FDC.)

For each recovery approach, we perform 10,000 independent trials and plot the empirical frequency of exact reconstruction as a function of the sparsity level K. By comparing the maximal sparsity level, i.e., the so-called critical sparsity [22], at which exact reconstruction is always ensured, the recovery accuracy of different algorithms can be compared empirically.

As shown in Fig. 1, for both sparse Gaussian and sparse 4-PAM signals, the MOLS algorithm outperforms the other greedy approaches with respect to the critical sparsity. Even when compared to the BP method, the MOLS algorithm still exhibits very competitive reconstruction performance. For the Gaussian case, MOLS has much higher critical sparsity than BP, while for the 4-PAM case, MOLS and BP have almost identical performance.

Fig. 1. Frequency of exact reconstruction of K-sparse signals: (a) sparse Gaussian signal; (b) sparse 4-PAM signal. (Curves shown: OLS, MOLS (L=3), MOLS (L=5), OMP, gOMP (L=3), gOMP (L=5), StOMP, ROMP, CoSaMP, and BP; x-axis: sparsity K; y-axis: frequency of exact reconstruction.)
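The Monte Carlo procedure described above is straightforward to sketch in numpy. The following reduced-scale illustration is our own code, not the authors' implementation: it realizes the Table I steps using the simplified identification rule (8) from Section III, and it uses far fewer trials than the 10,000 reported in the paper.

```python
import numpy as np

def mols(Phi, y, K, L, eps=1e-6):
    """Illustrative MOLS (Table I), with identification via the simplified rule (8)."""
    m, n = Phi.shape
    T, x_hat, r, k = [], np.zeros(n), y.copy(), 0
    while np.linalg.norm(r) > eps and k < min(K, m // L):
        k += 1
        # Identification: L largest |<phi_i, r>| / ||P_perp phi_i|| outside T
        P_perp = np.eye(m) if not T else np.eye(m) - Phi[:, T] @ np.linalg.pinv(Phi[:, T])
        score = np.abs(Phi.T @ r) / np.maximum(np.linalg.norm(P_perp @ Phi, axis=0), 1e-12)
        score[T] = -np.inf
        S = np.argsort(score)[-L:]
        # Enlarge the support, re-estimate by least squares, update the residual
        T = sorted(set(T) | set(int(i) for i in S))
        x_hat = np.zeros(n)
        x_hat[T] = np.linalg.pinv(Phi[:, T]) @ y
        r = y - Phi @ x_hat
    x_out = np.zeros(n)                       # output: K largest-magnitude coefficients
    keep = np.argsort(np.abs(x_hat))[-K:]
    x_out[keep] = x_hat[keep]
    return x_out, k

def one_trial(K, m=128, n=256, L=3, signal="gaussian", rng=None):
    """One exact-reconstruction trial in the spirit of Section V."""
    rng = rng or np.random.default_rng()
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)           # N(0, 1/m) entries
    x = np.zeros(n)
    support = rng.choice(n, size=K, replace=False)
    if signal == "gaussian":
        x[support] = rng.standard_normal(K)                   # sparse Gaussian signal
    else:
        x[support] = rng.choice([-3.0, -1.0, 1.0, 3.0], K)    # sparse 4-PAM signal
    x_hat, _ = mols(Phi, Phi @ x, K, L)
    return np.allclose(x, x_hat, atol=1e-4)

rng = np.random.default_rng(1)
freq = np.mean([one_trial(K=20, rng=rng) for _ in range(100)])
print("empirical frequency of exact reconstruction:", freq)
```

Sweeping K over {5, ..., 64} and repeating for the 4-PAM case would yield curves of the same kind as those in Fig. 1, up to Monte Carlo noise.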

In Fig. 2, we plot the running time and the number of iterations of MOLS and OLS for the exact reconstruction of K-sparse Gaussian and 4-PAM signals as a function of K. The running time is measured using MATLAB under the 6-core 64-bit processor, 256 GB RAM, and Windows 8 environments. We observe that, for both sparse Gaussian and 4-PAM cases, the number of iterations of MOLS for exact reconstruction is much smaller than that of the OLS algorithm, and accordingly, the associated running time of MOLS is much less than that of OLS.

Fig. 2. Comparison between OLS and MOLS in running time and number of iterations for exact reconstruction of K-sparse Gaussian and 4-PAM signals: (a) number of iterations for Gaussian signals; (b) running time for Gaussian signals; (c) number of iterations for 4-PAM signals; (d) running time for 4-PAM signals.
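The iteration-count effect behind Fig. 2 can be seen even on a single instance with the sketch given above (again purely illustrative; OLS is obtained by setting L = 1):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, K = 128, 256, 30
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(n, size=K, replace=False)] = rng.standard_normal(K)
y = Phi @ x

# L = 1 corresponds to conventional OLS; with L = 3, MOLS typically terminates
# in far fewer iterations on the same problem instance
for L in (1, 3):
    x_hat, iters = mols(Phi, y, K, L)
    print(f"L = {L}: iterations = {iters}, exact = {np.allclose(x, x_hat, atol=1e-4)}")
```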

VI. CONCLUSION

In this paper, we have studied a sparse recovery algorithm called MOLS, which extends the conventional OLS algorithm by allowing multiple candidates to enter the list in each selection. Our method is inspired by the fact that the sub-optimal candidates in the OLS identification are likely to be reliable and can be utilized to better reduce the residual energy at each iteration, thereby accelerating the convergence of the algorithm. We have demonstrated by RIP analysis that MOLS performs exact recovery of any K-sparse signal within K iterations if δ_{LK} < √L/(√K + 2.5√L). In particular, as a special case of MOLS when L = 1, the OLS algorithm exactly recovers any K-sparse signal in K iterations under δ_{K+1} < 1/(√K + 1), which coincides with the best-so-far recovery condition of OMP [12] and is proved to be nearly optimal for the OLS recovery. In addition, we have shown by empirical experiments that the MOLS algorithm has faster convergence and lower computational cost than the conventional OLS algorithm, while exhibiting improved recovery accuracy. Our empirical results have also shown that MOLS achieves very promising performance in recovering sparse signals when compared to the state-of-the-art recovery algorithms.

ACKNOWLEDGEMENT

This work is supported in part by ONR-N00014-13-1-0764, NSF-III-1360971, AFOSR-FA9550-13-1-0137, and NSF-Bigdata-1419210.

APPENDIX A
PROOF OF LEMMA 4

Proof: Since δ_{|S|} < 1, Φ_S has full column rank. Suppose that Φ_S has the singular value decomposition Φ_S = UΣV'. Then, from the definition of the RIP, the maximum and minimum diagonal entries of Σ (denoted by σ_max and σ_min, respectively) satisfy σ_max ≤ √(1 + δ_{|S|}) and σ_min ≥ √(1 − δ_{|S|}). Note that

(Φ_S^†)' = ((Φ'_SΦ_S)^{-1}Φ'_S)' = (((UΣV')'(UΣV'))^{-1}(UΣV')')' = UΣ̃V',   (41)

where Σ̃ is the matrix formed by replacing every (non-zero) diagonal entry of Σ by its reciprocal. Therefore, all singular values of (Φ_S^†)' lie between 1/σ_max and 1/σ_min. This, together with (41), implies that all singular values of (Φ_S^†)' lie between 1/√(1 + δ_{|S|}) and 1/√(1 − δ_{|S|}), which completes the proof.

APPENDIX B
PROOF OF (11)

Proof: Since P_{T^k} + P^⊥_{T^k} = I,

P_{T^k∪{i}} = P_{T^k}P_{T^k∪{i}} + P^⊥_{T^k}P_{T^k∪{i}}
 = P_{T^k} + P^⊥_{T^k}P_{T^k∪{i}}
 = P_{T^k} + P^⊥_{T^k}[Φ_{T^k}, φ_i][Φ_{T^k}, φ_i]^†
 = P_{T^k} + [0, P^⊥_{T^k}φ_i] ([Φ_{T^k}, φ_i]'[Φ_{T^k}, φ_i])^{-1} [Φ_{T^k}, φ_i]'
 (a)= P_{T^k} + [0, P^⊥_{T^k}φ_i] [M_1 M_2; M_3 M_4] [Φ_{T^k}, φ_i]',   (42)

where (a) is from the partitioned inverse formula and

M_1 = (Φ'_{T^k}P^⊥_{{i}}Φ_{T^k})^{-1},
M_2 = −(Φ'_{T^k}P^⊥_{{i}}Φ_{T^k})^{-1}Φ'_{T^k}φ_i(φ'_iφ_i)^{-1},
M_3 = −(φ'_iP^⊥_{T^k}φ_i)^{-1}φ'_iΦ_{T^k}(Φ'_{T^k}Φ_{T^k})^{-1},
M_4 = (φ'_iP^⊥_{T^k}φ_i)^{-1}.   (43)

After some manipulations, we have

P_{T^k∪{i}} = P_{T^k} + P^⊥_{T^k}φ_i (φ'_iP^⊥_{T^k}φ_i)^{-1} φ'_iP^⊥_{T^k}.   (44)
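The decomposition (11), restated in (44), is also easy to verify numerically; the short check below uses a random matrix and an arbitrary index set (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 20, 30
Phi = rng.standard_normal((m, n))

def proj(S):
    """Orthogonal projection onto span(Phi_S)."""
    A = Phi[:, S]
    return A @ np.linalg.pinv(A)

T, i = [2, 7, 11], 19
P_T = proj(T)
P_T_perp = np.eye(m) - P_T
phi = Phi[:, [i]]                                # column phi_i, kept as an m x 1 matrix

# Rank-one update of P_T by the deflated column P_T_perp @ phi_i, as in (11)/(44)
update = (P_T_perp @ phi) @ np.linalg.inv(phi.T @ P_T_perp @ phi) @ (phi.T @ P_T_perp)
print(np.allclose(proj(T + [i]), P_T + update))  # expected: True
```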

APPENDIX C
PROOF OF PROPOSITION 2

Proof:
1) Proof of (21): Since u_1 is the largest value of |⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2 over i ∈ T\T^k, it is clear that

u_1 ≥ (1/√(K−l)) (Σ_{i∈T\T^k} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2²)^{1/2}
   ≥ (1/√(K−l)) (Σ_{i∈T\T^k} |⟨φ_i, r^k⟩|²)^{1/2}   (since ‖P^⊥_{T^k}φ_i‖_2 ≤ ‖φ_i‖_2 = 1)
   = (1/√(K−l)) ‖Φ'_{T\T^k} r^k‖_2
   = (1/√(K−l)) ‖Φ'_{T\T^k} P^⊥_{T^k} Φ_{T\T^k} x_{T\T^k}‖_2
   ≥(a) (1/√(K−l)) (‖Φ'_{T\T^k}Φ_{T\T^k} x_{T\T^k}‖_2 − ‖Φ'_{T\T^k}P_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2),   (45)

where (a) uses the triangle inequality. Since |T\T^k| = K − l,

‖Φ'_{T\T^k}Φ_{T\T^k} x_{T\T^k}‖_2 ≥ (1 − δ_{K−l}) ‖x_{T\T^k}‖_2,   (46)

and

‖Φ'_{T\T^k}P_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2 = ‖Φ'_{T\T^k}Φ_{T^k}(Φ'_{T^k}Φ_{T^k})^{-1}Φ'_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2
   ≤(a) δ_{Lk+K−l} ‖(Φ'_{T^k}Φ_{T^k})^{-1}Φ'_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2
   ≤(b) (δ_{Lk+K−l}/(1 − δ_{Lk})) ‖Φ'_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2
   ≤ (δ_{Lk+K−l}²/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2,   (47)

where (a) and (b) follow from Lemma 3 and Lemma 2, respectively (T^k and T\T^k are disjoint sets and |T^k ∪ (T\T^k)| = Lk + K − l). Finally, by combining (45), (46), and (47), we obtain

u_1 ≥ (1/√(K−l)) (1 − δ_{K−l} − δ_{Lk+K−l}²/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2.   (48)

2) Proof of (22): Let F be the index set corresponding to the L largest elements in {|⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2}_{i∈H\(T∪T^k)}. Then,

(Σ_{i∈F} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2²)^{1/2}
   ≤ (1 / min_{i∈F} ‖P^⊥_{T^k}φ_i‖_2) (Σ_{i∈F} |⟨φ_i, r^k⟩|²)^{1/2}
   = (1 / min_{i∈F} ‖P^⊥_{T^k}φ_i‖_2) ‖Φ'_F r^k‖_2
   ≤(a) ((1 − δ_{Lk})/(1 − δ_{Lk} − δ_{Lk+1}²))^{1/2} ‖Φ'_F r^k‖_2
   =(b) ((1 − δ_{Lk})/(1 − δ_{Lk} − δ_{Lk+1}²))^{1/2} ‖Φ'_F P^⊥_{T^k} Φ_{T\T^k} x_{T\T^k}‖_2
   ≤ ((1 − δ_{Lk})/(1 − δ_{Lk} − δ_{Lk+1}²))^{1/2} (‖Φ'_F Φ_{T\T^k} x_{T\T^k}‖_2 + ‖Φ'_F P_{T^k} Φ_{T\T^k} x_{T\T^k}‖_2),   (49)

where (a) is because, for any i ∈ F,

‖P_{T^k}φ_i‖_2 = ‖(Φ_{T^k}^†)'Φ'_{T^k}φ_i‖_2 ≤ (1/√(1 − δ_{Lk})) ‖Φ'_{T^k}φ_i‖_2 ≤ (δ_{Lk+1}/√(1 − δ_{Lk})) ‖φ_i‖_2 = δ_{Lk+1}/√(1 − δ_{Lk}),   (50)

so that ‖P^⊥_{T^k}φ_i‖_2² = 1 − ‖P_{T^k}φ_i‖_2² ≥ 1 − δ_{Lk+1}²/(1 − δ_{Lk}), and (b) is from

r^k = y − Φx^k = y − Φ_{T^k}Φ_{T^k}^† y = y − P_{T^k} y = P^⊥_{T^k} y = P^⊥_{T^k} Φ_{T\T^k} x_{T\T^k}.

Since F and T\T^k are disjoint (i.e., F ∩ (T\T^k) = ∅) and |F| + |T\T^k| = L + K − l (note that |T ∩ T^k| = l by hypothesis), using Lemma 3 we have

‖Φ'_F Φ_{T\T^k} x_{T\T^k}‖_2 ≤ δ_{L+K−l} ‖x_{T\T^k}‖_2.   (51)

Moreover, since F ∩ T^k = ∅ and |F| + |T^k| = L + Lk,

‖Φ'_F P_{T^k} Φ_{T\T^k} x_{T\T^k}‖_2 ≤ δ_{L+Lk} ‖Φ_{T^k}^† Φ_{T\T^k} x_{T\T^k}‖_2
   = δ_{L+Lk} ‖(Φ'_{T^k}Φ_{T^k})^{-1}Φ'_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2
   ≤(a) (δ_{L+Lk}/(1 − δ_{Lk})) ‖Φ'_{T^k}Φ_{T\T^k} x_{T\T^k}‖_2
   ≤(b) (δ_{L+Lk} δ_{Lk+K−l}/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2,   (52)

where (a) is from Lemma 2 and (b) is due to Lemma 3 (since T^k and T\T^k are disjoint, |T\T^k| = K − l, and |T^k ∪ (T\T^k)| = Lk + K − l).

Invoking (51) and (52) into (49), we have

(Σ_{i∈F} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2²)^{1/2} ≤ ((1 − δ_{Lk})/(1 − δ_{Lk} − δ_{Lk+1}²))^{1/2} (δ_{L+K−l} + δ_{L+Lk}δ_{Lk+K−l}/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2.   (53)

On the other hand, by noting that v_L is the L-th largest value of |⟨φ_i, r^k⟩| / ‖P^⊥_{T^k}φ_i‖_2 over i ∈ F, we obtain

v_L ≤ (1/√L) (Σ_{i∈F} |⟨φ_i, r^k⟩|² / ‖P^⊥_{T^k}φ_i‖_2²)^{1/2}.   (54)

From (53) and (54), we have

v_L ≤ (1/√L) ((1 − δ_{Lk})/(1 − δ_{Lk} − δ_{Lk+1}²))^{1/2} (δ_{L+K−l} + δ_{L+Lk}δ_{Lk+K−l}/(1 − δ_{Lk})) ‖x_{T\T^k}‖_2.   (55)

APPENDIX D
PROOF OF (27)

Proof: Observe that (26) is equivalent to (using 1 − δ_{LK} − δ_{LK}²/(1 − δ_{LK}) = (1 − 2δ_{LK})/(1 − δ_{LK}))

√(K/L)·δ_{LK} ≤ (1 − 2δ_{LK}) ((1 − δ_{LK} − δ_{LK}²)/(1 − δ_{LK}))^{1/2}.   (56)

Let

f(δ_{LK}) = (1 − 2δ_{LK}) ((1 − δ_{LK} − δ_{LK}²)/(1 − δ_{LK}))^{1/2}  and  g(δ_{LK}) = 1 − 2.5δ_{LK}.

Then one can check that, for all δ_{LK} ∈ (0, 1/2), f(δ_{LK}) > g(δ_{LK}). Hence, (56) is guaranteed under

√(K/L)·δ_{LK} ≤ 1 − 2.5δ_{LK},   (57)

or, equivalently,

δ_{LK} ≤ √L/(√K + 2.5√L).   (58)

REFERENCES

[1] D. L. Donoho and P. B. Stark, Uncertainty principles and signal recovery, SIAM Journal on Applied Mathematics, vol. 49, no. 3, pp. 906-931, 1989.
[2] D. L. Donoho, Compressed sensing, IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289-1306, Apr. 2006.
[3] E. J. Candès and T. Tao, Near-optimal signal recovery from random projections: Universal encoding strategies?, IEEE Trans. Inform. Theory, vol. 52, no. 12, pp. 5406-5425, Dec. 2006.
[4] E. J. Candès, J. Romberg, and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory, vol. 52, no. 2, pp. 489-509, Feb. 2006.

[5] S. S. Chen, D. L. Donoho, and M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Review, vol. 43, no. 1, pp. 129-159, 2001.
[6] S. G. Mallat and Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3397-3415, Dec. 1993.
[7] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition, in Proc. 27th Annu. Asilomar Conf. Signals, Systems, and Computers, Pacific Grove, CA, Nov. 1993, vol. 1, pp. 40-44.
[8] J. A. Tropp, Greed is good: Algorithmic results for sparse approximation, IEEE Trans. Inform. Theory, vol. 50, no. 10, pp. 2231-2242, Oct. 2004.
[9] J. A. Tropp and A. C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit, IEEE Trans. Inform. Theory, vol. 53, no. 12, pp. 4655-4666, Dec. 2007.
[10] M. A. Davenport and M. B. Wakin, Analysis of orthogonal matching pursuit using the restricted isometry property, IEEE Trans. Inform. Theory, vol. 56, no. 9, pp. 4395-4401, Sep. 2010.
[11] T. Zhang, Sparse recovery with orthogonal matching pursuit under RIP, IEEE Trans. Inform. Theory, vol. 57, no. 9, pp. 6215-6221, Sep. 2011.
[12] J. Wang and B. Shim, On the recovery limit of sparse signals using orthogonal matching pursuit, IEEE Trans. Signal Process., vol. 60, no. 9, pp. 4973-4976, Sep. 2012.
[13] S. Chen, S. A. Billings, and W. Luo, Orthogonal least squares methods and their application to non-linear system identification, International Journal of Control, vol. 50, no. 5, pp. 1873-1896, 1989.
[14] L. Rebollo-Neira and D. Lowe, Optimized orthogonal matching pursuit approach, IEEE Signal Processing Letters, vol. 9, no. 4, pp. 137-140, Apr. 2002.
[15] S. Foucart, Stability and robustness of weak orthogonal matching pursuits, in Recent Advances in Harmonic Analysis and Applications, pp. 395-405. Springer, 2013.
[16] C. Soussen, R. Gribonval, J. Idier, and C. Herzet, Joint k-step analysis of orthogonal matching pursuit and orthogonal least squares, IEEE Trans. Inform. Theory, vol. 59, no. 5, pp. 3158-3174, May 2013.
[17] D. L. Donoho, I. Drori, Y. Tsaig, and J. L. Starck, Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit, IEEE Trans. Inform. Theory, vol. 58, no. 2, pp. 1094-1121, Feb. 2012.
[18] D. Needell and R. Vershynin, Signal recovery from incomplete and inaccurate measurements via regularized orthogonal matching pursuit, IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 310-316, Apr. 2010.
[19] J. Wang, S. Kwon, and B. Shim, Generalized orthogonal matching pursuit, IEEE Trans. Signal Process., vol. 60, no. 12, pp. 6202-6216, Dec. 2012.
[20] E. Liu and V. N. Temlyakov, The orthogonal super greedy algorithm and applications in compressed sensing, IEEE Trans. Inform. Theory, vol. 58, no. 4, pp. 2040-2047, Apr. 2012.
[21] D. Needell and J. A. Tropp, CoSaMP: Iterative signal recovery from incomplete and inaccurate samples, Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 301-321, Mar. 2009.
[22] W. Dai and O. Milenkovic, Subspace pursuit for compressive sensing signal reconstruction, IEEE Trans. Inform. Theory, vol. 55, no. 5, pp. 2230-2249, May 2009.
[23] S. Foucart, Hard thresholding pursuit: An algorithm for compressive sensing, SIAM Journal on Numerical Analysis, vol. 49, no. 6, pp. 2543-2563, 2011.

[24] E. J. Candès and T. Tao, Decoding by linear programming, IEEE Trans. Inform. Theory, vol. 51, no. 12, pp. 4203-4215, Dec. 2005.
[25] S. Kwon, J. Wang, and B. Shim, Multipath matching pursuit, IEEE Trans. Inform. Theory, vol. 60, no. 5, pp. 2986-3001, May 2014.
[26] E. J. Candès, The restricted isometry property and its implications for compressed sensing, Comptes Rendus Mathematique, vol. 346, no. 9-10, pp. 589-592, 2008.
[27] E. Candès, M. Rudelson, T. Tao, and R. Vershynin, Error correction via linear programming, in IEEE Symposium on Foundations of Computer Science (FOCS), 2005, pp. 668-681.