Stability and Robustness of Weak Orthogonal Matching Pursuits

Simon Foucart, Drexel University

(This work was triggered by the participation of the author in the NSF-supported Workshop in Linear Analysis and Probability at Texas A&M University. The author is supported by the NSF grant DMS-1120622.)

Abstract. A recent result establishing, under restricted isometry conditions, the success of sparse recovery via Orthogonal Matching Pursuit using a number of iterations proportional to the sparsity level is extended to Weak Orthogonal Matching Pursuits. The new result also applies to a Pure Orthogonal Matching Pursuit, where the index appended to the support at each iteration is chosen to maximize the subsequent decrease of the squared norm of the residual vector.

1 Introduction and Main Results

This note considers the basic Compressive Sensing problem, which aims at reconstructing $s$-sparse vectors $x \in \mathbb{C}^N$ (i.e., vectors with at most $s$ nonzero entries) from the knowledge of measurement vectors $y = Ax \in \mathbb{C}^m$ with $m \ll N$. By now, it is well established (see [1, 2] and references therein) that this task can be carried out efficiently using the realization of a random matrix as the measurement matrix $A \in \mathbb{C}^{m \times N}$ and $\ell_1$-minimization as the reconstruction map from $\mathbb{C}^m$ to $\mathbb{C}^N$. There exist alternatives to $\ell_1$-minimization, though, including as a precursor the Orthogonal Matching Pursuit introduced by Mallat and Zhang in [7] in the context of sparse approximation. The basic idea of the algorithm is to construct a target support by adding one index at a time, and to find the vector with this target support that best fits the measurements. It has become usual, when trying to reconstruct $s$-sparse vectors, to analyze the convergence of this algorithm after $s$ iterations, since the $s$th iterate is itself an $s$-sparse vector. For instance, the exact reconstruction of all $s$-sparse vectors $x \in \mathbb{C}^N$ from $y = Ax \in \mathbb{C}^m$ via $s$ iterations of Orthogonal Matching Pursuit can be established under a condition on the coherence of the matrix $A$, or even on its cumulative coherence; see [10] for details. Lately, there has also been some work [3, 5, 6, 8] establishing sparse recovery under some (stronger than desired) conditions on the restricted isometry constants of the measurement matrix.
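The measurement model just described can be made concrete with a few lines of code. The sketch below is only illustrative: the dimensions, the Gaussian draw of $A$, the noise level, and all variable names are our own choices, not prescribed by the note; it uses NumPy with real-valued data for simplicity.

    import numpy as np

    rng = np.random.default_rng(0)
    N, m, s = 400, 100, 10                 # ambient dimension, number of measurements, sparsity

    # an s-sparse vector x in R^N
    x = np.zeros(N)
    support = rng.choice(N, size=s, replace=False)
    x[support] = rng.standard_normal(s)

    # a measurement matrix with l2-normalized columns, as assumed in the theorems below
    A = rng.standard_normal((m, N))
    A /= np.linalg.norm(A, axis=0)

    # (noisy) measurements y = A x + e
    e = 0.01 * rng.standard_normal(m)
    y = A @ x + e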

We recall briefly that the $s$th restricted isometry constant $\delta_s$ of $A$ is defined as the smallest constant $\delta \ge 0$ such that

$$(1-\delta)\|z\|_2^2 \le \|Az\|_2^2 \le (1+\delta)\|z\|_2^2 \qquad \text{for all $s$-sparse } z \in \mathbb{C}^N.$$

However, it was observed by Rauhut [9] and later by Mo and Shen [8] that $s$ iterations of Orthogonal Matching Pursuit are not enough to guarantee $s$-sparse recovery under (more natural) restricted isometry conditions. Nonetheless, it was recently proved by Zhang [11] that $30s$ iterations are enough to guarantee $s$-sparse recovery provided $\delta_{31s} < 1/3$. Zhang's result also covers the important case of vectors $x \in \mathbb{C}^N$ which are not exactly sparse and which are measured with some error via $y = Ax + e \in \mathbb{C}^m$; in fact, it covers the case of measurement maps $A : \mathbb{C}^N \to \mathbb{C}^m$ that are not necessarily linear, too. This note imitates the original arguments of Zhang and extends the results to a wider class of algorithms. Namely, we consider algorithms of the following type:

Generic Orthogonal Matching Pursuit
Input: measurement matrix $A$, measurement vector $y$, index set $S^0$.
Initialization: $x^0 = \operatorname{argmin}\{\|y - Az\|_2,\ \operatorname{supp}(z) \subseteq S^0\}$.
Iteration: repeat the following steps until a stopping criterion is met at $n = \bar n$:
(GOMP$_1$)  $S^{n+1} = S^n \cup \{j^{n+1}\}$,
(GOMP$_2$)  $x^{n+1} = \operatorname{argmin}\{\|y - Az\|_2,\ \operatorname{supp}(z) \subseteq S^{n+1}\}$.
Output: the $\bar n$-sparse vector $x^\sharp = x^{\bar n}$.

The recipe for choosing the index $j^{n+1}$ distinguishes between different algorithms, which are customarily initiated with $S^0 = \emptyset$ and $x^0 = 0$. The classical Orthogonal Matching Pursuit algorithm corresponds to a choice where

$$(1)\qquad |(A^*(y - Ax^n))_{j^{n+1}}| = \max_{1 \le j \le N} |(A^*(y - Ax^n))_j|.$$

The Weak Orthogonal Matching Pursuit algorithm with parameter $0 < \rho \le 1$ corresponds to a choice where

$$(2)\qquad |(A^*(y - Ax^n))_{j^{n+1}}| \ge \rho \max_{1 \le j \le N} |(A^*(y - Ax^n))_j|.$$

We also single out a choice of $j^{n+1} \notin S^n$ where

$$(3)\qquad \frac{|(A^*(y - Ax^n))_{j^{n+1}}|}{\operatorname{dist}(a_{j^{n+1}}, \operatorname{span}\{a_i, i \in S^n\})} = \max_{j \notin S^n} \frac{|(A^*(y - Ax^n))_{j}|}{\operatorname{dist}(a_{j}, \operatorname{span}\{a_i, i \in S^n\})},$$

where $a_1, \ldots, a_N \in \mathbb{C}^m$ denote the columns of $A$. We call this (rather unpractical) algorithm Pure Orthogonal Matching Pursuit, because it abides by a pure greedy strategy of reducing as much as possible the squared $\ell_2$-norm of the residual at each iteration. Indeed, the following observation can be made when $a_1, \ldots, a_N$ all have norm one.
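The selection rules (1), (2), and (3) all fit into the generic loop above. The following Python sketch is one possible rendering under stated assumptions: the function name, the fixed iteration count used as stopping criterion, the tie-breaking for the weak rule, and the QR-based computation of the distances are our own choices, and real-valued data is assumed.

    import numpy as np

    def generic_omp(A, y, n_iter, rule="classical", rho=1.0):
        """A sketch of the Generic Orthogonal Matching Pursuit loop.

        rule = "classical": index maximizing |(A*(y - A x^n))_j|, rule (1)
        rule = "weak":      any index j with |(A*(y - A x^n))_j| >= rho * max_k, rule (2)
        rule = "pure":      index maximizing |(A*(y - A x^n))_j| / dist(a_j, span{a_i, i in S^n}), rule (3)
        """
        m, N = A.shape
        S = []                                   # S^0 = empty set
        x = np.zeros(N)                          # x^0 = 0
        for _ in range(n_iter):
            r = y - A @ x                        # residual y - A x^n
            c = np.abs(A.T @ r)                  # |(A*(y - A x^n))_j| for j = 1, ..., N
            if rule == "pure" and S:
                Q, _ = np.linalg.qr(A[:, S])     # orthonormal basis of span{a_i, i in S^n}
                dist = np.linalg.norm(A - Q @ (Q.T @ A), axis=0)
                dist[S] = np.inf                 # indices already in S^n are never re-selected
                j = int(np.argmax(c / dist))
            elif rule == "weak":
                admissible = np.flatnonzero(c >= rho * c.max())
                j = int(admissible[0])           # any admissible index is a valid weak choice
            else:
                j = int(np.argmax(c))            # rule (1); coincides with rule (3) when S^n is empty
            S.append(j)                          # (GOMP_1)
            x = np.zeros(N)                      # (GOMP_2): least-squares fit on S^{n+1}
            x[S] = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
        return x, S

With rho = 1 the weak rule reduces to the classical one (up to ties), while the pure rule implements the greedy strategy of maximizing the residual decrease made explicit in Theorem 1 below. On the toy data sketched earlier, for instance, generic_omp(A, y, n_iter=s, rule="weak", rho=0.9) returns an iterate supported on at most s indices.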

Theorem 1. For any Generic Orthogonal Matching Pursuit algorithm applied with a matrix $A \in \mathbb{C}^{m \times N}$ whose columns $a_1, \ldots, a_N$ are $\ell_2$-normalized, the squared $\ell_2$-norm of the residual decreases at each iteration according to

$$\|y - Ax^{n+1}\|_2^2 = \|y - Ax^n\|_2^2 - \Delta_n,$$

where the quantity $\Delta_n$ satisfies

$$\Delta_n = \|A(x^{n+1} - x^n)\|_2^2 = x^{n+1}_{j^{n+1}}\,\overline{(A^*(y - Ax^n))_{j^{n+1}}} = \frac{|(A^*(y - Ax^n))_{j^{n+1}}|^2}{\operatorname{dist}(a_{j^{n+1}}, \operatorname{span}\{a_i, i \in S^n\})^2} \ge |(A^*(y - Ax^n))_{j^{n+1}}|^2.$$

This result reveals that the quantity $\Delta_n$ is computable at iteration $n$. This fact is apparent from its third expression, but it is hidden in its first two expressions. Maximizing $\Delta_n$ leads to the Pure Orthogonal Matching Pursuit, as hinted above. We informally remark at this point that $\Delta_n$ almost equals its lower bound when $A$ has small restricted isometry constants, i.e., when its columns are almost orthonormal. Hence, under the restricted isometry property, the Pure Orthogonal Matching Pursuit algorithm almost coincides with the classical Orthogonal Matching Pursuit algorithm. This is made precise below.

Theorem 2. Given an integer $\bar n \ge 1$ for which $\delta_{\bar n} < (\sqrt{5} - 1)/2$, the Pure Orthogonal Matching Pursuit algorithm iterated at most $\bar n$ times is a Weak Orthogonal Matching Pursuit algorithm with parameter $\rho := \sqrt{1 - \delta_{\bar n}^2/(1 - \delta_{\bar n})}$.

The proofs of Theorems 1 and 2 are postponed until Section 3. We now state the main result of this note, which reduces to Zhang's result for the classical Orthogonal Matching Pursuit algorithm when $\rho = 1$ (with slightly improved constants). Below, the notation $x_{\overline S}$ stands for the vector equal to $x$ on the complement $\overline S$ of $S$ and to zero on the set $S$, and the notation $\sigma_s(x)_1$ stands for the $\ell_1$-error of best $s$-term approximation to $x$.

Theorem 3. Let $A \in \mathbb{C}^{m \times N}$ be a matrix with $\ell_2$-normalized columns. For all $x \in \mathbb{C}^N$ and all $e \in \mathbb{C}^m$, let $(x^n)$ be the sequence produced from $y = Ax + e$ and $S^0 = \emptyset$ by the Weak Orthogonal Matching Pursuit algorithm with parameter $0 < \rho \le 1$. If $\delta_{(1+3r)s} \le 1/6$, $r := \lceil 3/\rho^2 \rceil$, then there is a constant $C > 0$ depending only on $\delta_{(1+3r)s}$ and on $\rho$ such that, for any $S \subseteq \{1, \ldots, N\}$ with $\operatorname{card}(S) = s$,

$$(4)\qquad \|y - Ax^{2rs}\|_2 \le C\,\|Ax_{\overline S} + e\|_2.$$

Furthermore, if $\delta_{(2+6r)s} \le 1/6$, then, for all $1 \le p \le 2$,

$$(5)\qquad \|x - x^{4rs}\|_p \le \frac{C}{s^{1-1/p}}\,\sigma_s(x)_1 + D\,s^{1/p - 1/2}\,\|e\|_2$$

for some constants $C, D > 0$ depending only on $\delta_{(2+6r)s}$ and on $\rho$.
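As an aside before turning to the proofs, the identities of Theorem 1 are straightforward to check numerically. The self-contained sketch below (the dimensions and the particular support $S^n$ are arbitrary choices of ours, and real-valued data is assumed) performs one generic iteration by hand and evaluates the residual decrease together with its three expressions and its lower bound.

    import numpy as np

    rng = np.random.default_rng(1)
    m, N = 30, 12
    A = rng.standard_normal((m, N))
    A /= np.linalg.norm(A, axis=0)               # l2-normalized columns
    y = rng.standard_normal(m)

    S = [2, 7, 9]                                # current support S^n
    x_n = np.zeros(N)
    x_n[S] = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
    r_n = y - A @ x_n                            # residual y - A x^n

    j = int(np.argmax(np.abs(A.T @ r_n)))        # classical choice of j^{n+1} (falls outside S^n)
    S1 = S + [j]
    x_n1 = np.zeros(N)
    x_n1[S1] = np.linalg.lstsq(A[:, S1], y, rcond=None)[0]

    corr = (A.T @ r_n)[j]                        # (A*(y - A x^n))_{j^{n+1}}, real in this sketch
    Q, _ = np.linalg.qr(A[:, S])
    dist2 = np.linalg.norm(A[:, j] - Q @ (Q.T @ A[:, j]))**2

    print(np.linalg.norm(r_n)**2 - np.linalg.norm(y - A @ x_n1)**2)   # residual decrease Delta_n
    print(np.linalg.norm(A @ (x_n1 - x_n))**2)                        # first expression
    print(x_n1[j] * corr)                                             # second expression
    print(corr**2 / dist2)                                            # third expression, computable at iteration n
    print(corr**2)                                                    # lower bound, attained when the distance equals 1

The first four printed values coincide up to rounding errors, and the last one is smaller, in agreement with Theorem 1.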

2 Sparse Recovery via Weak Orthogonal Matching Pursuits

This section is dedicated to the proof of Theorem 3. We stress once again the similitude of the arguments presented here with the ones given by Zhang in [11]. Throughout this section and the next one, we will make use of the simple observation that

$$(6)\qquad (A^*(y - Ax^k))_{S^k} = 0 \qquad \text{for all } k \ge 1.$$

Indeed, with $n := k - 1 \ge 0$, the definition of $x^{n+1}$ via (GOMP$_2$) implies that the residual $y - Ax^{n+1}$ is orthogonal to the space $\{Az,\ \operatorname{supp}(z) \subseteq S^{n+1}\}$. This means that, for all $z \in \mathbb{C}^N$ supported on $S^{n+1}$, we have $0 = \langle y - Ax^{n+1}, Az\rangle = \langle A^*(y - Ax^{n+1}), z\rangle$, hence the claim. It will also be convenient to isolate the following statement from the flow of the argument.

Lemma 4. Let $A \in \mathbb{C}^{m \times N}$ be a matrix with $\ell_2$-normalized columns. For $x \in \mathbb{C}^N$ and $e \in \mathbb{C}^m$, let $(x^n)$ be the sequence produced from $y = Ax + e$ by the Weak Orthogonal Matching Pursuit algorithm with parameter $0 < \rho \le 1$. For any $n \ge 0$, any index set $U$ not included in $S^n$, and any vector $u \in \mathbb{C}^N$ supported on $U$,

$$(7)\qquad \|y - Ax^{n+1}\|_2^2 \le \|y - Ax^n\|_2^2 - \rho^2\,\frac{\|A(u - x^n)\|_2^2}{\|u_{\overline{S^n}}\|_1^2}\,\max\{0,\ \|y - Ax^n\|_2^2 - \|y - Au\|_2^2\} \le \|y - Ax^n\|_2^2 - \frac{\rho^2(1 - \delta)}{\operatorname{card}(U \setminus S^n)}\,\max\{0,\ \|y - Ax^n\|_2^2 - \|y - Au\|_2^2\},$$

where $\delta := \delta_{\operatorname{card}(U \cup S^n)}$.

Proof. The second inequality follows from the first one by noticing that

$$\|A(u - x^n)\|_2^2 \ge (1 - \delta)\|u - x^n\|_2^2 \ge (1 - \delta)\|(u - x^n)_{\overline{S^n}}\|_2^2, \qquad \|u_{\overline{S^n}}\|_1^2 \le \operatorname{card}(U \setminus S^n)\,\|u_{\overline{S^n}}\|_2^2 = \operatorname{card}(U \setminus S^n)\,\|(u - x^n)_{\overline{S^n}}\|_2^2.$$

Now, according to Theorem 1, it is enough to prove that

$$(8)\qquad |(A^*(y - Ax^n))_{j^{n+1}}|^2 \ge \rho^2\,\frac{\|A(u - x^n)\|_2^2}{\|u_{\overline{S^n}}\|_1^2}\,\{\|y - Ax^n\|_2^2 - \|y - Au\|_2^2\}$$

when $\|y - Ax^n\|_2^2 \ge \|y - Au\|_2^2$. Making use of (6), we observe on the one hand that

$$(9)\qquad \Re\langle A(u - x^n), y - Ax^n\rangle = \Re\langle u - x^n, A^*(y - Ax^n)\rangle = \Re\langle (u - x^n)_{\overline{S^n}}, (A^*(y - Ax^n))_{\overline{S^n}}\rangle \le \|(u - x^n)_{\overline{S^n}}\|_1\,\|(A^*(y - Ax^n))_{\overline{S^n}}\|_\infty \le \|u_{\overline{S^n}}\|_1\,|(A^*(y - Ax^n))_{j^{n+1}}|/\rho.$$

We observe on the other hand that

$$(10)\qquad 2\,\Re\langle A(u - x^n), y - Ax^n\rangle = \|A(u - x^n)\|_2^2 + \|y - Ax^n\|_2^2 - \|A(u - x^n) - (y - Ax^n)\|_2^2 = \|A(u - x^n)\|_2^2 + \{\|y - Ax^n\|_2^2 - \|y - Au\|_2^2\} \ge 2\,\|A(u - x^n)\|_2\,\sqrt{\|y - Ax^n\|_2^2 - \|y - Au\|_2^2}.$$

Combining the squared versions of (9) and (10), we arrive at

$$\|A(u - x^n)\|_2^2\,\{\|y - Ax^n\|_2^2 - \|y - Au\|_2^2\} \le \|u_{\overline{S^n}}\|_1^2\,|(A^*(y - Ax^n))_{j^{n+1}}|^2/\rho^2.$$

Rearranging the terms leads to the desired inequality (8).

Theorem 3 would actually hold for any sequence obeying the conclusion of Lemma 4, by virtue of the following proposition.

Proposition 5. Let $A \in \mathbb{C}^{m \times N}$ be a matrix with $\ell_2$-normalized columns. For an $s$-sparse vector $x \in \mathbb{C}^N$ and for $e \in \mathbb{C}^m$, let $y = Ax + e$ and let $(x^n)$ be a sequence satisfying (7). If $\delta_{(1+3\lceil 3/\rho^2\rceil)s} \le 1/6$, then

$$\|y - Ax^{\bar n}\|_2 \le C\,\|e\|_2, \qquad \bar n := 2\lceil 3/\rho^2\rceil\,\operatorname{card}(S \setminus S^0),$$

where $C > 0$ is a constant depending only on $\delta_{(1+3\lceil 3/\rho^2\rceil)s}$ and on $\rho$.

Proof. The proof proceeds by induction on $\operatorname{card}(S \setminus S^0)$, where $S := \operatorname{supp}(x)$. If it is zero, i.e., if $S \subseteq S^0$, then the definition of $x^0$ implies $\|y - Ax^0\|_2 \le \|y - Ax\|_2 = \|e\|_2$, and the result holds with $C = 1$. Let us now assume that the result holds up to an integer $s - 1$, $s \ge 1$, and let us show that it holds when $\operatorname{card}(S \setminus S^0) = s$. We consider subsets of $S \setminus S^0$ defined by $U^0 = \emptyset$ and $U^l = \{$indices of the $2^{l-1}$ largest entries of $x_{\overline{S^0}}$ in modulus$\}$ for $l \ge 1$, to which we associate the vectors $\tilde x^l := x_{\overline{S^0 \cup U^l}}$, $l \ge 0$. Note that the last $U^l$, namely $U^{\lceil \log_2(s)\rceil + 1}$, is taken to be the whole set $S \setminus S^0$ (it may have fewer than $2^{l-1}$ elements), in which case $0 = \|\tilde x^l\|_2^2 \le \|\tilde x^{l-1}\|_2^2/\mu$ for a constant $\mu$ to be chosen later. We may then consider the smallest integer $1 \le L \le \lceil \log_2(s)\rceil + 1$ such that $\|\tilde x^{L-1}\|_2^2 \ge \mu\,\|\tilde x^{L}\|_2^2$. Its definition implies the (possibly empty) list of inequalities

$$\|\tilde x^{0}\|_2^2 < \mu\,\|\tilde x^{1}\|_2^2, \quad \ldots, \quad \|\tilde x^{L-2}\|_2^2 < \mu\,\|\tilde x^{L-1}\|_2^2.$$

For each $l \in [L]$, we apply inequality (7) to the vector $u = x - \tilde x^l$ supported on $S^0 \cup U^l$, while noticing that $(S^0 \cup U^l) \cup S^n \subseteq S \cup S^n$ and $(S^0 \cup U^l) \setminus S^n \subseteq (S^0 \cup U^l) \setminus S^0 = U^l$, and we subtract $\|y - Au\|_2^2 = \|A\tilde x^l + e\|_2^2$ from both sides.

This yields

$$\max\{0,\ \|y - Ax^{n+1}\|_2^2 - \|A\tilde x^l + e\|_2^2\} \le \Big(1 - \frac{\rho^2(1 - \delta_{s+n})}{\operatorname{card}(U^l)}\Big)\max\{0,\ \|y - Ax^{n}\|_2^2 - \|A\tilde x^l + e\|_2^2\} \le \exp\Big(-\frac{\rho^2(1 - \delta_{s+n})}{\operatorname{card}(U^l)}\Big)\max\{0,\ \|y - Ax^{n}\|_2^2 - \|A\tilde x^l + e\|_2^2\}.$$

For any $K \ge 0$ and any $n, k \ge 0$ satisfying $n + k \le K$, we derive by immediate induction that

$$\max\{0,\ \|y - Ax^{n+k}\|_2^2 - \|A\tilde x^l + e\|_2^2\} \le \exp\Big(-\frac{k\rho^2(1 - \delta_{s+K})}{\operatorname{card}(U^l)}\Big)\max\{0,\ \|y - Ax^{n}\|_2^2 - \|A\tilde x^l + e\|_2^2\}.$$

By separating cases in the rightmost maximum, we easily deduce that

$$\|y - Ax^{n+k}\|_2^2 \le \exp\Big(-\frac{k\rho^2(1 - \delta_{s+K})}{\operatorname{card}(U^l)}\Big)\,\|y - Ax^{n}\|_2^2 + \|A\tilde x^l + e\|_2^2.$$

For some integer $\kappa$ to be chosen later, applying this successively with $k_1 := \kappa\operatorname{card}(U^1), \ldots, k_L := \kappa\operatorname{card}(U^L)$, and $K := k_1 + \cdots + k_L$, yields, with $\nu := \exp(\kappa\rho^2(1 - \delta_{s+K}))$,

$$\|y - Ax^{k_1}\|_2^2 \le \frac{1}{\nu}\,\|y - Ax^{0}\|_2^2 + \|A\tilde x^1 + e\|_2^2,$$
$$\|y - Ax^{k_1+k_2}\|_2^2 \le \frac{1}{\nu}\,\|y - Ax^{k_1}\|_2^2 + \|A\tilde x^2 + e\|_2^2,$$
$$\vdots$$
$$\|y - Ax^{k_1+\cdots+k_{L-1}+k_L}\|_2^2 \le \frac{1}{\nu}\,\|y - Ax^{k_1+\cdots+k_{L-1}}\|_2^2 + \|A\tilde x^L + e\|_2^2.$$

Dividing the first row by $\nu^{L-1}$, the second row by $\nu^{L-2}$, and so on, then summing everything, we obtain

$$\|y - Ax^{K}\|_2^2 \le \frac{\|y - Ax^{0}\|_2^2}{\nu^L} + \frac{\|A\tilde x^1 + e\|_2^2}{\nu^{L-1}} + \cdots + \frac{\|A\tilde x^{L-1} + e\|_2^2}{\nu} + \|A\tilde x^L + e\|_2^2.$$

Taking into account that $x - \tilde x^0$ is supported on $S^0 \cup U^0 = S^0$, the definition of $x^0$ implies that $\|y - Ax^0\|_2^2 \le \|y - A(x - \tilde x^0)\|_2^2 = \|A\tilde x^0 + e\|_2^2$, hence

$$\|y - Ax^{K}\|_2^2 \le \sum_{l=0}^{L}\frac{\|A\tilde x^l + e\|_2^2}{\nu^{L-l}} \le \sum_{l=0}^{L}\frac{2(\|A\tilde x^l\|_2^2 + \|e\|_2^2)}{\nu^{L-l}}.$$

Let us remark that, for $l \le L-1$ and also for $l = L$,

$$\|A\tilde x^l\|_2^2 \le (1 + \delta_s)\,\|\tilde x^l\|_2^2 \le (1 + \delta_s)\,\mu^{L-1-l}\,\|\tilde x^{L-1}\|_2^2.$$

As a result, we have

$$\|y - Ax^K\|_2^2 \le \frac{2(1 + \delta_s)\,\|\tilde x^{L-1}\|_2^2}{\mu}\sum_{l=0}^{L}\Big(\frac{\mu}{\nu}\Big)^{L-l} + 2\,\|e\|_2^2\sum_{l=0}^{L}\frac{1}{\nu^{L-l}} \le \frac{2(1 + \delta_s)\,\|\tilde x^{L-1}\|_2^2}{\mu(1 - \mu/\nu)} + \frac{2\,\|e\|_2^2}{1 - 1/\nu}.$$

We choose $\mu = \nu/2$, so that $\mu(1 - \mu/\nu)$ takes its maximal value $\nu/4$. It follows that, with $\alpha := \sqrt{8(1 + \delta_s)/\nu}$ and $\beta := \sqrt{2/(1 - 1/\nu)}$,

$$(11)\qquad \|y - Ax^K\|_2 \le \alpha\,\|\tilde x^{L-1}\|_2 + \beta\,\|e\|_2.$$

On the other hand, with $\gamma := \sqrt{1 - \delta_{s+K}}$, we have

$$\|y - Ax^K\|_2 = \|A(x - x^K) + e\|_2 \ge \|A(x - x^K)\|_2 - \|e\|_2 \ge \gamma\,\|x - x^K\|_2 - \|e\|_2 \ge \gamma\,\|x_{\overline{S^K}}\|_2 - \|e\|_2.$$

We deduce that

$$(12)\qquad \|x_{\overline{S^K}}\|_2 \le \frac{\alpha}{\gamma}\,\|\tilde x^{L-1}\|_2 + \frac{\beta + 1}{\gamma}\,\|e\|_2.$$

Let us now choose $\kappa = \lceil 3/\rho^2\rceil$, which guarantees that

$$(13)\qquad \frac{\alpha}{\gamma} = \sqrt{\frac{8(1 + \delta_s)}{(1 - \delta_{s+K})\exp(\kappa\rho^2(1 - \delta_{s+K}))}} < 1,$$

since $\delta_{s+K} \le \delta_{(1+3\lceil 3/\rho^2\rceil)s} \le 1/6$, where we have used the fact that $K = \kappa(1 + \cdots + 2^{L-2} + \operatorname{card}(U^L)) < \kappa(2^{L-1} + s) \le 3\kappa s = 3\lceil 3/\rho^2\rceil s$. Thus, in the case $\frac{\beta+1}{\gamma}\|e\|_2 < \big(1 - \frac{\alpha}{\gamma}\big)\|\tilde x^{L-1}\|_2$, we derive from (12) that $\|x_{\overline{S^K}}\|_2 < \|\tilde x^{L-1}\|_2$, i.e.,

$$\|(x_{\overline{S^0}})_{S \setminus S^K}\|_2 < \|(x_{\overline{S^0}})_{(S \setminus S^0) \setminus U^{L-1}}\|_2.$$

But according to the definition of $U^{L-1}$, this yields

$$\operatorname{card}(S \setminus S^K) < \operatorname{card}((S \setminus S^0) \setminus U^{L-1}) = s - 2^{L-1}.$$

Continuing the algorithm from iteration $K$ now amounts to starting it from iteration $0$ with $x^0$ replaced by $x^K$, therefore the induction hypothesis implies that

$$\|y - Ax^{K + \bar n}\|_2 \le C\,\|e\|_2, \qquad \bar n := 2\lceil 3/\rho^2\rceil\,(s - 2^{L-1}).$$

Thus, since we also have the bound $K \le \kappa(1 + \cdots + 2^{L-2} + 2^{L-1}) < \lceil 3/\rho^2\rceil\,2^L$, the number of required iterations satisfies $K + \bar n \le 2\lceil 3/\rho^2\rceil s$, as expected. In the alternative case where $\frac{\beta+1}{\gamma}\|e\|_2 \ge \big(1 - \frac{\alpha}{\gamma}\big)\|\tilde x^{L-1}\|_2$, the situation is easier, since (11) yields

$$\|y - Ax^K\|_2 \le \frac{\alpha(\beta + 1)}{\gamma - \alpha}\,\|e\|_2 + \beta\,\|e\|_2 =: C\,\|e\|_2,$$

where the constant $C \ge 1$ depends only on $\delta_{(1+3\lceil 3/\rho^2\rceil)s}$ and on $\rho$. This shows that the result holds when $\operatorname{card}(S \setminus S^0) = s$. The proof is now complete.

With Proposition 5 at hand, proving the main theorem is almost immediate.

Proof of Theorem 3. Given $S \subseteq [N]$ for which $\operatorname{card}(S) = s$, we can write $y = Ax_S + e'$ where $e' := Ax_{\overline S} + e$. Applying Proposition 5 to $x_S$ and $e'$ with $S^0 = \emptyset$ then gives the desired inequality

$$\|y - Ax^{2rs}\|_2 \le C\,\|e'\|_2 = C\,\|Ax_{\overline S} + e\|_2$$

for some constant $C > 0$ depending only on $\delta_{(1+3r)s}$ and on $\rho$. Deducing the second part of the theorem from the first part is rather classical (see [4], for instance), and we therefore omit the justification of this part.

Remark. For the Pure Orthogonal Matching Pursuit algorithm, the parameter $\rho$ depends on the restricted isometry constants as $\rho^2 = 1 - \delta_{\bar n}^2/(1 - \delta_{\bar n})$. In this case, to guarantee that $\alpha/\gamma < 1$ in (13), we may choose $\kappa = 3$ when $\delta_{s+K} \le \delta_{\bar n} \le 1/6$. This would yield the estimate (4) in $6s$ iterations provided $\delta_{10s} \le 1/6$, and the estimate (5) in $12s$ iterations provided $\delta_{20s} \le 1/6$.

3 Pure OMP as a Weak OMP under RIP

This section is dedicated to the proofs of Theorems 1 and 2. The latter justifies that the Pure Orthogonal Matching Pursuit algorithm can be interpreted as a Weak Orthogonal Matching Pursuit algorithm with parameter depending on the restricted isometry constants.

Proof of Theorem 1. We start by noticing that the residual $y - Ax^{n+1}$ is orthogonal to the space $\{Az,\ \operatorname{supp}(z) \subseteq S^{n+1}\}$, and in particular to $A(x^{n+1} - x^n)$, so that

$$\|y - Ax^n\|_2^2 = \|y - Ax^{n+1} + A(x^{n+1} - x^n)\|_2^2 = \|y - Ax^{n+1}\|_2^2 + \|A(x^{n+1} - x^n)\|_2^2.$$

This establishes the first expression for $\Delta_n$. As for the other statements about $\Delta_n$, we first separate the case $j^{n+1} \in S^n$. Here, the ratio $|(A^*(y - Ax^n))_{j^{n+1}}|/\operatorname{dist}(a_{j^{n+1}}, \operatorname{span}\{a_i, i \in S^n\})$ is not given any meaning, but we have $\Delta_n = 0$ in view of $S^{n+1} = S^n$ and $x^{n+1} = x^n$, and the awaited equality $(A^*(y - Ax^n))_{j^{n+1}} = 0$ follows from (6). We now place ourselves in the case $j^{n+1} \notin S^n$. For the second expression of $\Delta_n$, we keep in mind that $x^{n+1} - x^n$ is supported on $S^{n+1}$, that $(A^*y)_{S^{n+1}} = (A^*Ax^{n+1})_{S^{n+1}}$, and that $(A^*(y - Ax^n))_{S^n} = 0$ to write

$$\|A(x^{n+1} - x^n)\|_2^2 = \langle x^{n+1} - x^n, A^*A(x^{n+1} - x^n)\rangle = \langle x^{n+1} - x^n, (A^*A(x^{n+1} - x^n))_{S^{n+1}}\rangle = \langle x^{n+1} - x^n, (A^*(y - Ax^n))_{S^{n+1}}\rangle = \langle (x^{n+1} - x^n)_{S^{n+1}\setminus S^n}, (A^*(y - Ax^n))_{S^{n+1}\setminus S^n}\rangle = x^{n+1}_{j^{n+1}}\,\overline{(A^*(y - Ax^n))_{j^{n+1}}}.$$

For the third expression, we first notice that the step (GOMP$_2$) is equivalent to

$$x^{n+1}_{S^{n+1}} = A_{S^{n+1}}^\dagger y, \qquad A_{S^{n+1}}^\dagger := (A_{S^{n+1}}^* A_{S^{n+1}})^{-1} A_{S^{n+1}}^*.$$

We decompose $A_{S^{n+1}}$ as $A_{S^{n+1}} = \big[\,A_{S^n}\ \ a_{j^{n+1}}\,\big]$. We then derive that

$$A_{S^{n+1}}^* y = \begin{bmatrix} A_{S^n}^* y \\ a_{j^{n+1}}^* y \end{bmatrix}, \qquad A_{S^{n+1}}^* A_{S^{n+1}} = \begin{bmatrix} A_{S^n}^* A_{S^n} & A_{S^n}^* a_{j^{n+1}} \\ a_{j^{n+1}}^* A_{S^n} & 1 \end{bmatrix}.$$

Among the several ways to express the inverse of a block matrix, we select

$$(A_{S^{n+1}}^* A_{S^{n+1}})^{-1} = \begin{bmatrix} M_1 & M_2 \\ M_3 & M_4 \end{bmatrix},$$

where $M_3$ and $M_4$ (note that $M_4$ is a scalar in this case) are given by

$$M_4 = \big(1 - a_{j^{n+1}}^* A_{S^n}(A_{S^n}^* A_{S^n})^{-1} A_{S^n}^* a_{j^{n+1}}\big)^{-1} = \big(1 - a_{j^{n+1}}^* A_{S^n} A_{S^n}^\dagger a_{j^{n+1}}\big)^{-1}, \qquad M_3 = -M_4\, a_{j^{n+1}}^* A_{S^n}(A_{S^n}^* A_{S^n})^{-1}.$$

It follows from the block decomposition of $x^{n+1}_{S^{n+1}} = (A_{S^{n+1}}^* A_{S^{n+1}})^{-1} A_{S^{n+1}}^* y$ that

$$x^{n+1}_{j^{n+1}} = M_3 A_{S^n}^* y + M_4\, a_{j^{n+1}}^* y = M_4\, a_{j^{n+1}}^*\big(-A_{S^n} A_{S^n}^\dagger y + y\big) = M_4\, a_{j^{n+1}}^*(y - Ax^n).$$

We note that $a_{j^{n+1}}^*(y - Ax^n)$ is simply $(A^*(y - Ax^n))_{j^{n+1}}$. As for $M_4$, we note that $A_{S^n} A_{S^n}^\dagger a_{j^{n+1}}$ is the orthogonal projection $P_n a_{j^{n+1}}$ of $a_{j^{n+1}}$ onto the space $\{Az,\ \operatorname{supp}(z) \subseteq S^n\} = \operatorname{span}\{a_i, i \in S^n\}$, so that

$$M_4 = \big(1 - \langle P_n a_{j^{n+1}}, a_{j^{n+1}}\rangle\big)^{-1} = \big(1 - \|P_n a_{j^{n+1}}\|_2^2\big)^{-1} = \operatorname{dist}\big(a_{j^{n+1}}, \operatorname{span}\{a_i, i \in S^n\}\big)^{-2}.$$

Note that $\|P_n a_{j^{n+1}}\|_2 < \|a_{j^{n+1}}\|_2 = 1$ justifies the existence of $M_4$, and that $\|P_n a_{j^{n+1}}\|_2 \ge 0$ shows that $M_4 \ge 1$, which establishes the lower bound on $\Delta_n$.

Proof of Theorem 2. According to the previous arguments and to the definition of the Pure Orthogonal Matching Pursuit algorithm, for any $j \notin S^n$, we have

$$\frac{|(A^*(y - Ax^n))_{j^{n+1}}|^2}{1 - \|P_n a_{j^{n+1}}\|_2^2} = \frac{|(A^*(y - Ax^n))_{j^{n+1}}|^2}{\operatorname{dist}(a_{j^{n+1}}, \operatorname{span}\{a_i, i \in S^n\})^2} \ge \frac{|(A^*(y - Ax^n))_{j}|^2}{\operatorname{dist}(a_{j}, \operatorname{span}\{a_i, i \in S^n\})^2} \ge |(A^*(y - Ax^n))_{j}|^2.$$

But, in view of $|\langle Au, Av\rangle| \le \delta_k\,\|u\|_2\,\|v\|_2$ for all disjointly supported $u, v \in \mathbb{C}^N$ satisfying $\operatorname{card}(\operatorname{supp}(u) \cup \operatorname{supp}(v)) \le k$, we have

$$\|P_n a_{j^{n+1}}\|_2^2 = \langle P_n a_{j^{n+1}}, a_{j^{n+1}}\rangle = \langle A_{S^n} A_{S^n}^\dagger a_{j^{n+1}}, a_{j^{n+1}}\rangle = \langle A A_{S^n}^\dagger a_{j^{n+1}}, A e_{j^{n+1}}\rangle \le \delta_{n+1}\,\|A_{S^n}^\dagger a_{j^{n+1}}\|_2 \le \frac{\delta_{n+1}}{\sqrt{1 - \delta_n}}\,\|A A_{S^n}^\dagger a_{j^{n+1}}\|_2 = \frac{\delta_{n+1}}{\sqrt{1 - \delta_n}}\,\|P_n a_{j^{n+1}}\|_2.$$

After simplification, we obtain $\|P_n a_{j^{n+1}}\|_2 \le \delta_{n+1}/\sqrt{1 - \delta_n}$. The condition $\delta_{\bar n} < (\sqrt{5} - 1)/2$ ensures that this bound does not exceed one. It then follows that, for $j \notin S^n$,

$$|(A^*(y - Ax^n))_{j^{n+1}}|^2 \ge \Big(1 - \frac{\delta_{n+1}^2}{1 - \delta_n}\Big)\,|(A^*(y - Ax^n))_{j}|^2.$$

The latter also holds for $j \in S^n$, in view of (6), hence (2) is established when $n + 1 \le \bar n$ with $\rho := \sqrt{1 - \delta_{\bar n}^2/(1 - \delta_{\bar n})}$.
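To complement the proof of Theorem 2, the key bound $\|P_n a_{j^{n+1}}\|_2 \le \delta_{n+1}/\sqrt{1 - \delta_n}$ can be probed numerically on matrices small enough for the restricted isometry constants to be computed by exhaustive search. The sketch below is only illustrative: the sizes, the Gaussian draw of $A$, and the brute-force computation of $\delta_k$ are our own choices; it checks the inequality over all supports $S^n$ of a fixed size $n$.

    import itertools
    import numpy as np

    def rip_constant(A, k):
        """Exhaustive delta_k for a small matrix: max over |S| = k of ||A_S^T A_S - Id||_2."""
        N = A.shape[1]
        return max(
            np.linalg.norm(A[:, list(S)].T @ A[:, list(S)] - np.eye(k), 2)
            for S in itertools.combinations(range(N), k)
        )

    rng = np.random.default_rng(2)
    m, N, n = 30, 8, 3                           # N kept small so that all supports can be enumerated
    A = rng.standard_normal((m, N))
    A /= np.linalg.norm(A, axis=0)               # l2-normalized columns

    # bound delta_{n+1} / sqrt(1 - delta_n); meaningful here since delta_n < 1 for this draw
    bound = rip_constant(A, n + 1) / np.sqrt(1.0 - rip_constant(A, n))

    worst = 0.0                                  # largest ||P_n a_j||_2 over all S^n and j not in S^n
    for S in itertools.combinations(range(N), n):
        Q, _ = np.linalg.qr(A[:, list(S)])       # orthonormal basis of span{a_i, i in S^n}
        for j in set(range(N)) - set(S):
            worst = max(worst, np.linalg.norm(Q.T @ A[:, j]))

    print(worst, "<=", bound)                    # the inequality from the proof holds

The comparison is of course only a sanity check on one random draw, not a substitute for the proof above.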

References

[1] E. J. Candès, Compressive sampling, Proceedings of the International Congress of Mathematicians, Madrid, Spain, 2006.
[2] D. L. Donoho, Compressed sensing, IEEE Transactions on Information Theory, 52(4), 1289-1306, 2006.
[3] M. Davenport and M. Wakin, Analysis of Orthogonal Matching Pursuit using the Restricted Isometry Property, IEEE Transactions on Information Theory, 56(9), 4395-4401, 2010.
[4] S. Foucart, Hard Thresholding Pursuit: an algorithm for compressive sensing, SIAM Journal on Numerical Analysis, 49(6), 2543-2563, 2011.
[5] E. Liu and V. Temlyakov, Orthogonal Super Greedy Algorithm and applications in compressed sensing, preprint.
[6] R. Maleh, Improved RIP analysis of Orthogonal Matching Pursuit, preprint.
[7] S. G. Mallat and Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, 41(12), 3397-3415, 1993.
[8] Q. Mo and Y. Shen, Remarks on the Restricted Isometry Property in Orthogonal Matching Pursuit algorithm, preprint.
[9] H. Rauhut, On the impossibility of uniform sparse reconstruction using greedy methods, Sampling Theory in Signal and Image Processing, 7(2), 197-215, 2008.
[10] J. Tropp, Greed is good: algorithmic results for sparse approximation, IEEE Transactions on Information Theory, 50(10), 2231-2242, 2004.
[11] T. Zhang, Sparse recovery with Orthogonal Matching Pursuit under RIP, IEEE Transactions on Information Theory, 57(9), 6215-6221, 2011.