Signal Sparsity Models: Theory and Applications


Signal Sparsity Models: Theory and Applications. Raja Giryes, Computer Science Department, Technion. Michael Elad, Technion, Haifa, Israel; Sangnam Nam, CMI, Marseille, France; Rémi Gribonval, INRIA, Rennes, France; Mike E. Davies, University of Edinburgh, Scotland; Deanna Needell, Claremont McKenna College, California, USA; Alfred Bruckstein, Technion, Haifa, Israel. Dept. of Electrical Engineering-Systems, School of Electrical Engineering, July 8th, 2014.

The sparsity model for signals and images

3 Why Sparsity? Emerging and fast growing field. New algorithms and theory. State-of-the-art results in many fields: audio processing, video processing, image processing, radar, medical imaging, etc. Searching ISI Web of Science (October 9th, 2013): Topic=((spars* and (represent* or approx* or solution) and (dictionary or pursuit)) or (compres* and sens* and spars*))

4 Our Problem Setup: Given the linear measurements y = Mx + e, with y, e ∈ ℝ^m and x ∈ ℝ^d. Target: recover x from y. M ∈ ℝ^{m×d} is the measurement matrix (m ≤ d).

5 Denoising: Given the linear measurements y = x + e, with y, e ∈ ℝ^m and x ∈ ℝ^d. Target: recover x from y with noise reduction. M = I is the identity matrix.

6 Deblurring/Deconvolution: Given the linear measurements y = Mx + e. Target: recover x from y. M ∈ ℝ^{d×d} is a block-circulant matrix.

7 Super-Resolution: Given the linear measurements y = Mx + e. Target: recover x from y. M ∈ ℝ^{m×d} is the downsampling matrix (m < d).

8 Compressed Sensing: Given the linear measurements y = Mx + e. Target: recover x from y. M ∈ ℝ^{m×d} is a (random) sensing matrix (m ≪ d).

9 Sparsity Prior (Synthesis): x = Dα with ‖α‖₀ ≤ k, where x ∈ ℝ^d, α ∈ ℝ^n, and D ∈ ℝ^{d×n} is a dictionary (k ≪ n; few non-zero entries in α).
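
To make the synthesis prior concrete, here is a minimal NumPy sketch (not from the talk; the dimensions d, n, k and the random dictionary are illustrative) that draws a k-sparse representation α and synthesizes x = Dα:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 64, 128, 5           # hypothetical sizes: signal dim, number of atoms, sparsity

# Random overcomplete dictionary with unit-norm columns (atoms)
D = rng.standard_normal((d, n))
D /= np.linalg.norm(D, axis=0)

# k-sparse representation: k random positions, random non-zero values
alpha = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
alpha[support] = rng.standard_normal(k)

x = D @ alpha                  # synthesis model: x = D alpha, ||alpha||_0 <= k
print("non-zeros:", np.count_nonzero(alpha), "signal dim:", x.shape[0])
```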

10 Classical Problem Setup: We now look at the measurements as y = Aα + e with A = MD ∈ ℝ^{m×n}, where x = Dα, D ∈ ℝ^{d×n}, α ∈ ℝ^n. The representation α is our target; the signal is recovered by x̂ = Dα̂.

11 Sparsity Minimization Problem: Assume bounded adversarial noise, ‖e‖₂ ≤ ε. The problem we aim at solving is simply α̂_l0 = argmin_{w∈ℝ^n} ‖w‖₀ s.t. ‖y − Aw‖₂ ≤ ε, and x̂_l0 = Dα̂_l0. What can we say about the recovery?

12 Restricted Isometry Property (RIP): We say that a matrix A satisfies the RIP [Candès and Tao, 2006] with a constant δ_k if for every k-sparse vector w it holds that (1 − δ_k)‖w‖₂² ≤ ‖Aw‖₂² ≤ (1 + δ_k)‖w‖₂².
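
The RIP constant itself is NP-hard to compute, but a small Monte Carlo sketch (illustrative sizes; NumPy only) can probe how tightly a random Gaussian matrix preserves the norm of k-sparse vectors, which gives a lower bound on δ_k:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 40, 128, 5

# Subgaussian (here Gaussian) sensing matrix, normalized so that E||Aw||^2 = ||w||^2
A = rng.standard_normal((m, n)) / np.sqrt(m)

ratios = []
for _ in range(2000):                       # Monte Carlo over random k-sparse vectors
    w = np.zeros(n)
    S = rng.choice(n, size=k, replace=False)
    w[S] = rng.standard_normal(k)
    ratios.append(np.sum((A @ w) ** 2) / np.sum(w ** 2))

# 1 - min(ratios) and max(ratios) - 1 only lower-bound the true delta_k
print("observed range of ||Aw||^2/||w||^2: [%.3f, %.3f]" % (min(ratios), max(ratios)))
```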

13 Uniqueness and Stability: If δ_2k < 1 then we have stability in the noisy case [Candès and Tao, 2006]: ‖α − α̂_l0‖₂ ≤ 2ε/√(1 − δ_2k) (→ 0 as ε → 0), and uniqueness in the noiseless case. A better condition for uniqueness can be achieved via the spark [Donoho and Elad, 2003]. The l0-minimization is NP-hard; approximation is needed.

14 l1-relaxation: Replacing l0 with l1: α̂_l1 = argmin_{w∈ℝ^n} ‖w‖₁ s.t. ‖y − MDw‖₂ ≤ ε. If δ_2k < 0.4931 (*) [Candès and Tao 2005, Mo and Li 2011] then ‖α − α̂_l1‖₂ ≤ C_l1 · ε, with perfect recovery in the noiseless case. (*) The RIP condition holds for subgaussian matrices with m = O(k log n).
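
As a hedged illustration of the noiseless l1 relaxation (basis pursuit), the sketch below recasts min ‖w‖₁ s.t. Aw = y as a linear program and solves it with SciPy's linprog; the ε-constrained noisy variant would need a cone solver instead. All sizes here are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n, k = 30, 80, 4

A = rng.standard_normal((m, n)) / np.sqrt(m)
w_true = np.zeros(n)
S = rng.choice(n, size=k, replace=False)
w_true[S] = rng.standard_normal(k)
y = A @ w_true                               # noiseless measurements

# Basis pursuit: min ||w||_1 s.t. Aw = y, written as an LP with w = u - v, u, v >= 0
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
w_hat = res.x[:n] - res.x[n:]

print("recovery error:", np.linalg.norm(w_hat - w_true))
```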

15 Other Approximation Techniques. Relaxation based: the Dantzig Selector (DS) [Candès and Tao 2007]. Greedy: Thresholding, Orthogonal Matching Pursuit (OMP), Regularized OMP (ROMP) [Needell, Vershynin 2009]. Greedy-like: CoSaMP [Needell and Tropp 2009], Subspace Pursuit (SP) [Dai and Milenkovic 2009], Iterative Hard Thresholding (IHT), Hard Thresholding Pursuit (HTP) [Foucart 2010].

16 Iterative Hard Thresholding (IHT): Approximating argmin_w ‖y − Aw‖₂² s.t. ‖w‖₀ ≤ k. Initialize α̂⁰ = 0, t = 0. Gradient step with step size μ: α̃ᵗ = α̂ᵗ⁻¹ + μA*(y − Aα̂ᵗ⁻¹). Support selection of the closest k-sparse vector: Tᵗ = supp(α̃ᵗ, k), where supp(·, k) selects the support of the k largest-magnitude elements. Thresholding: α̂ᵗ = (α̃ᵗ)_{Tᵗ}. Output: x̂ = Dα̂ᵗ.
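
A minimal NumPy implementation of the IHT iteration described above (a sketch, not the speaker's code; the step-size default μ = 1/‖A‖₂² is a common safe choice, not something stated on the slide):

```python
import numpy as np

def iht(y, A, k, mu=None, n_iter=200):
    """IHT sketch for min ||y - A w||_2^2 s.t. ||w||_0 <= k."""
    n = A.shape[1]
    if mu is None:
        mu = 1.0 / np.linalg.norm(A, 2) ** 2       # assumed safe step size (mu <= 1/||A||_2^2)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        tmp = alpha + mu * A.T @ (y - A @ alpha)   # gradient step
        T = np.argsort(np.abs(tmp))[-k:]           # supp(., k): k largest-magnitude entries
        alpha = np.zeros(n)
        alpha[T] = tmp[T]                          # hard thresholding onto that support
    return alpha

# In the synthesis setting of the talk, A = M @ D and the signal estimate is x_hat = D @ iht(y, A, k).
```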

17 Recovery Guarantees: For DS, OMP, CoSaMP, SP, IHT and HTP [Candès and Tao 2007, Zhang 2011, Needell and Tropp 2009, Dai and Milenkovic 2009, Blumensath and Davies 2009, Foucart 2010] we also have that if δ_ak ≤ δ_alg (for some a > 1), then ‖α − α̂_alg‖₂ ≤ C_alg · ε. C_alg is a constant greater than 2. No guarantees for noise reduction.

18 Oracle Estimator for Random Noise: Let e be white Gaussian noise with variance σ², i.e., e_i ~ N(0, σ²). Let α̂_Oracle = A_T†y be an oracle estimator, where T is the original support of α. Then kσ²/(1 + δ_k) ≤ E‖α̂_Oracle − α‖₂² ≤ kσ²/(1 − δ_k).
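
A short sketch of the oracle estimator, assuming the true support T is known (illustrative NumPy code):

```python
import numpy as np

def oracle_estimator(y, A, T):
    """Least-squares estimate restricted to the known true support T: alpha_T = pinv(A_T) @ y."""
    alpha = np.zeros(A.shape[1])
    alpha[T] = np.linalg.pinv(A[:, T]) @ y
    return alpha

# Under white Gaussian noise its expected squared error is sigma^2 * trace((A_T^T A_T)^{-1}),
# which the RIP sandwiches between k*sigma^2/(1 + delta_k) and k*sigma^2/(1 - delta_k).
```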

19 Near Oracle Guarantees: If e is random Gaussian and δ_ak ≤ δ_alg, then the solution of the DS and of l1-minimization obeys ‖α̂_alg − α‖₂² ≤ C_alg · log n · kσ² with very high probability [Candès and Tao 2007, Bickel, Ritov and Tsybakov 2009]. Coherence-based guarantees exist for l1, OMP, thresholding [Ben-Haim, Eldar and Elad 2010] and the DS [Cai, Wang and Xu 2010].

20 Agenda Sparsity for overparametrized variational problems CoSaMP, SP and IHT near oracle performance Poisson Data Modeling Denoising relaxation techniques Compressed sensing Coherence Near-oracle performance Sparse Approximation Greedy algorithms Synthesis vs. analysis RIP Cosparse Analysis Greedy-like Methods The Transform Domain Technique The Signal Space Paradigm

21 Analysis Model: Another way to model the sparsity of the signal is the analysis model [Elad, Milanfar and Rubinstein, 2007; Nam, Davies, Elad, Gribonval 2013]. It looks at the coefficients of Ωx, where Ω ∈ ℝ^{p×d} is a given transform, the analysis dictionary.

22 Analysis Model - Example: Assume Ω ∈ ℝ^{4×4} and x ∈ ℝ^4. Ωx has ‖Ωx‖₀ = 3 non-zeros and ℓ = p − ‖Ωx‖₀ = 1 zero. What is the dimension of the subspace in which x resides?

23 Cosparsity and Corank: We focus on the zeros in Ωx. Cosupport Λ: the locations of the zeros, |Λ| = ℓ. Cosparsity ℓ: Ωx has ℓ zeros. Corank r: the rank of Ω_Λ, r = rank(Ω_Λ). ℓ = r if Ω is in general position. In general, x resides in a subspace of dimension d − r.
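
These definitions translate directly into a few lines of NumPy (a sketch; the tolerance tol for deciding what counts as a zero is an assumption):

```python
import numpy as np

def cosupport(Omega, x, tol=1e-10):
    """Cosupport Lambda: indices where Omega @ x is (numerically) zero."""
    return np.flatnonzero(np.abs(Omega @ x) <= tol)

def cosparsity_and_corank(Omega, x, tol=1e-10):
    Lam = cosupport(Omega, x, tol)
    ell = Lam.size                               # cosparsity: number of zeros in Omega x
    r = np.linalg.matrix_rank(Omega[Lam, :])     # corank: rank of the cosupport rows of Omega
    return ell, r                                # x lives in a subspace of dimension d - r
```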

24 Problem Setup: Given the linear measurements y = Mx + e, with y, e ∈ ℝ^m and M ∈ ℝ^{m×d}. Analysis focus: Ωx ∈ ℝ^p, with Ω ∈ ℝ^{p×d} and x ∈ ℝ^d. Classical synthesis focus: x = Dα, with D ∈ ℝ^{d×n} and α ∈ ℝ^n.

25 Analysis Minimization Problem: We work directly with x and not with its representation. This generates a different family of signals: the signals are characterized by their behavior and not by their building blocks. The problem we aim at solving in analysis is x̂ = argmin_{v∈ℝ^d} ‖Ωv‖₀ s.t. ‖y − Mv‖₂ ≤ ε.

26 Approximation Methods: The analysis problem is hard just like the synthesis one, and thus approximation techniques are required.

27 Analysis Existing Approximations: Analysis l1-minimization: x̂ = argmin_{v∈ℝ^d} ‖Ωv‖₁ s.t. ‖y − Mv‖₂ ≤ ε. Greedy analysis pursuit (GAP) [Nam, Davies, Elad and Gribonval, 2013]: selects the cosupport iteratively; in each iteration it removes elements from the cosupport in a greedy way. Simple thresholding. Orthogonal backward greedy (OBG), an algorithm for the case M = I [Rubinstein, Peleg, Elad, 2013].

28 Analysis Dictionary Examples: 1D and 2D finite difference operators (known as 1D-TV and 2D-TV when used with l1), Laplacian, Gabor frames, wavelets, curvelets, discrete Fourier transform (DFT), learned dictionaries, and more.

29 A General Recipe: We present a general scheme for translating synthesis operations into analysis ones. This provides us with a recipe for converting the algorithms. The recipe is general and easy to use. We apply this recipe to CoSaMP, SP, IHT and HTP. Theoretical study of analysis techniques is more complicated.

30 Synthesis versus Analysis. Synthesis: sparsity k (non-zeros); dictionary D ∈ ℝ^{d×n}; sparse representation α with (at most) k non-zeros, x = Dα; D_i is the i-th column of D; support T ⊆ {1..n}. Analysis: cosparsity ℓ (zeros); operator Ω ∈ ℝ^{p×d}; cosparse representation Ωx ∈ ℝ^p with (at least) ℓ zeros; Ω_i is the i-th row of Ω; cosupport Λ ⊆ {1..p}.

31 Orthogonal Projection. Synthesis: given a representation w and a support T, solve min_v ‖w − v‖₂ s.t. v_{T^C} = 0; solution: keep the elements of w supported on T and zero the rest. Analysis: given a vector z and a cosupport Λ, solve min_v ‖z − v‖₂ s.t. Ω_Λ v = 0; solution: compute the projection onto the subspace spanned by Ω_Λ and remove it from the vector: Q_Λ = I − Ω_Λ†Ω_Λ.
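
Both projections are one-liners in NumPy; the sketch below (illustrative, using the pseudo-inverse form Q_Λ = I − Ω_Λ†Ω_Λ) contrasts them:

```python
import numpy as np

def synthesis_project(w, T):
    """Keep the entries of w on the support T and zero the rest."""
    out = np.zeros_like(w)
    out[T] = w[T]
    return out

def analysis_project(z, Omega, Lam):
    """Q_Lam z = (I - pinv(Omega_Lam) @ Omega_Lam) z: remove the component of z lying in
       the row space of the cosupport rows, so that (Omega z)_Lam becomes zero."""
    O = Omega[Lam, :]
    return z - np.linalg.pinv(O) @ (O @ z)
```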

32 (Co)Support Selection. Synthesis: given a non-sparse representation w, solve T = argmin_{T̃: |T̃|≤k} ‖w − w_{T̃}‖₂; solution: select the indices of the k largest elements of w, T = supp(w, k); this is the optimal solution. Analysis: given a non-cosparse vector z, solve Λ = argmin_{Λ̃: |Λ̃|≥ℓ} ‖z − Q_{Λ̃}z‖₂; solution: select the indices of the ℓ smallest elements of Ωz, Λ = cosupp(Ωz, ℓ); this solution is suboptimal.

33 Cosupport Selection: Optimal selection is NP-hard [Gribonval et al. 2014]. In some cases efficient algorithms exist: the unitary case; 1D-DIF (Ω_1D-DIF) [Han et al., 2004]; Fused-Lasso (Ω_FUS = [Ω_1D-DIF; I]) [Giryes et al., 2014]. A cosupport selection scheme Ŝ_ℓ is near optimal with a constant C_ℓ if for any vector v ∈ ℝ^d: ‖v − Q_{Ŝ_ℓ(v)}v‖₂² ≤ C_ℓ · inf_{|Λ|≥ℓ} ‖v − Q_Λ v‖₂² [Giryes et al., 2014].

34 Iterative Hard Thresholding (IHT): Initialize α̂⁰ = 0, t = 0. Gradient step with step size μ: α̃ᵗ = α̂ᵗ⁻¹ + μA*(y − Aα̂ᵗ⁻¹). Support selection of the closest k-sparse vector: Tᵗ = supp(α̃ᵗ, k), where supp(·, k) selects the support of the k largest-magnitude elements. Thresholding: α̂ᵗ = (α̃ᵗ)_{Tᵗ}. Output: x̂ = Dα̂ᵗ. [Blumensath and Davies, 2009]

35 Analysis IHT: Initialize x̂⁰ = 0, t = 0. Gradient step with step size μ: x̃ᵗ = x̂ᵗ⁻¹ + μM*(y − Mx̂ᵗ⁻¹). Approximate cosupport selection of the closest ℓ-cosparse vector: Λᵗ = Ŝ_ℓ(x̃ᵗ). Projection: x̂ᵗ = Q_{Λᵗ}x̃ᵗ. Output: x̂ᵗ. [Giryes, Nam, Gribonval and Davies, 2011]
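
A compact sketch of the Analysis IHT iteration, using the suboptimal cosupport selection (the ℓ smallest entries of |Ωx|) and the projection from the previous slides; dimensions and the step-size choice are assumptions, not part of the talk:

```python
import numpy as np

def aiht(y, M, Omega, ell, mu=None, n_iter=100):
    """Analysis IHT sketch: gradient step, cosupport = ell smallest entries of |Omega x|,
       then orthogonal projection onto the corresponding analysis subspace."""
    d = M.shape[1]
    if mu is None:
        mu = 1.0 / np.linalg.norm(M, 2) ** 2        # assumed safe step size
    x = np.zeros(d)
    for _ in range(n_iter):
        g = x + mu * M.T @ (y - M @ x)              # gradient step
        Lam = np.argsort(np.abs(Omega @ g))[:ell]   # (sub)optimal cosupport selection
        O = Omega[Lam, :]
        x = g - np.linalg.pinv(O) @ (O @ g)         # x = Q_Lam g
    return x
```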

36 Analysis Greedy-like Methods: We translate other synthesis operations in a similar way and get Analysis CoSaMP (ACoSaMP), Analysis SP (ASP), and Analysis HTP (AHTP). Relaxed versions for high dimensional problems: Relaxed ACoSaMP (RACoSaMP), Relaxed ASP (RASP), Relaxed AHTP (RAHTP).

37 Recovery from Fourier Samples (figure): x = Shepp-Logan phantom; Mx = Fourier samples along 12 (5%) radial lines.

38 Naïve Recovery (figure): l2-minimization result with no prior.

39 Shepp-Logan Derivatives (figure): Ω_2D-DIF x₀, the result of applying the 2D finite difference operator.

40 Stable Recovery from Noisy Samples (figure): 25 (10%) radial lines, SNR = 20. Naïve recovery, PSNR = 18dB; RASP recovery, PSNR = 36dB.

41 Ω-RIP: We say that a matrix M satisfies the Ω-RIP with a constant δ_ℓ if for every ℓ-cosparse vector v it holds that (1 − δ_ℓ)‖v‖₂² ≤ ‖Mv‖₂² ≤ (1 + δ_ℓ)‖v‖₂² [Giryes, Nam, Elad, Gribonval and Davies 2013]. Similar to the D-RIP [Candès, Eldar, Needell, Randall, 2011].

42 Reconstruction Guarantees. Theorem: If δ_{2ℓ−p} ≤ δ₁(C_ℓ, σ_M²) and C_ℓ is close enough to 1, then AIHT satisfies ‖x − x̂‖₂ ≤ C_IHT ‖e‖₂, where σ_M is the largest singular value of M and C_IHT is a constant greater than 2. Similar guarantees hold for ACoSaMP, ASP and AHTP. (*) Given an optimal projection, the condition is δ_{2ℓ−p} < 1/3. [Giryes, Nam, Elad, Gribonval and Davies, 2014]

43 Linear Dependencies in Ω: For the Ω-RIP the number of measurements we need is m = O((p − ℓ) log p) [Blumensath, Davies 2009; Giryes et al. 2014]. Linear dependencies are permitted in Ω and even encouraged. Can we get a guarantee in terms of d − r (r is the corank) instead of p − ℓ? No [Giryes, Plan and Vershynin, 2014].

44 Related Work: D-RIP based recovery guarantees for l1-minimization [Candès, Eldar, Needell, Randall, 2011], [Liu, Mi, Li, 2012]. RIP based TV recovery guarantees [Needell, Ward, 2013]. ERC-like recovery conditions for GAP [Nam, Davies, Elad and Gribonval, 2013] and l1-minimization [Vaiter, Peyré, Dossal, Fadili, 2013]. Thresholding denoising (M = I) performance for the case of Gaussian noise [Peleg, Elad, 2013].

45 Agenda Sparsity for overparametrized variational problems CoSaMP, SP and IHT near oracle performance Poisson Data Modeling Denoising relaxation techniques Compressed sensing Coherence Near-oracle performance Sparse Approximation Greedy algorithms Synthesis vs. analysis RIP Cosparse Analysis Greedy-like Methods The Transform Domain Technique The Signal Space Paradigm

46 Implications for the Synthesis Model Classically, in the synthesis model linear dependencies between small groups of columns are not allowed. However, in the analysis model, as our target is the signal, linear dependencies are even encouraged. Does the same hold for synthesis?

47 Problem Setup: Given the linear measurements y = Mx + e, with y, e ∈ ℝ^m and M ∈ ℝ^{m×d}. Signal space focus: x ∈ ℝ^d. Analysis focus: Ωx ∈ ℝ^p, Ω ∈ ℝ^{p×d}. Classical synthesis focus: x = Dα, D ∈ ℝ^{d×n}, α ∈ ℝ^n.

48 Implication for the Synthesis Model: Signal space CoSaMP: signal recovery under the synthesis model also when the dictionary is highly coherent [Davenport, Needell, Wakin, 2013]. An easy extension of our analysis theoretical guarantees also to synthesis [Giryes, Elad, 2013]. No non-trivial dictionary is known to have a near-optimal projection in synthesis. The uniqueness and stability conditions for the signal recovery and the representation recovery are not the same [Giryes, Elad, 2013].

49 Theoretical Results for Signal Space: Modified SSCoSaMP with theoretical guarantees for synthesis dictionaries [Giryes and Needell, 2014]. Theoretical guarantees for (modified) OMP in the case of highly coherent dictionaries [Giryes and Elad, 2013].

50 Agenda Sparsity for overparametrized variational problems CoSaMP, SP and IHT near oracle performance Poisson Data Modeling Denoising relaxation techniques Compressed sensing Coherence Near-oracle performance Sparse Approximation Greedy algorithms Synthesis vs. analysis RIP Cosparse Analysis Greedy-like Methods The Transform Domain Technique The Signal Space Paradigm

51 Problem Setup: Given the linear measurements y = Mx + e, with y, e ∈ ℝ^m and M ∈ ℝ^{m×d}. Transform domain focus: Ωx ∈ ℝ^p, Ω ∈ ℝ^{p×d}. Signal space and analysis focus: x ∈ ℝ^d. Classical synthesis focus: x = Dα, D ∈ ℝ^{d×n}, α ∈ ℝ^n.

52 The Transform Domain Strategy: Transform Domain IHT [Giryes, 2014]. Efficient algorithm. Guarantees for frames. Near oracle performance in the analysis framework. 2D-DIF (TV) guarantees [Giryes, 2014].

53 Agenda Sparsity for overparametrized variational problems CoSaMP, SP and IHT near oracle performance Poisson Data Modeling Denoising relaxation techniques Compressed sensing Coherence Near-oracle performance Sparse Approximation Greedy algorithms Synthesis vs. analysis RIP Cosparse Analysis Greedy-like Methods The Transform Domain Technique The Signal Space Paradigm

54 The Line Fitting Problem. The problem: g = f + e. The model: an overparametrized representation for a linear function, f = a + Xb = [I X][a; b], with X = diag(1, ..., d) and a, b ∈ ℝ^d.
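
A small sketch of this overparametrized model in NumPy (illustrative d; Omega_DIF is the 1D finite-difference operator that will serve as the analysis operator below):

```python
import numpy as np

d = 200
X = np.diag(np.arange(1.0, d + 1))            # X = diag(1, ..., d)
A = np.hstack([np.eye(d), X])                 # f = [I X] @ np.concatenate([a, b])

# 1D finite-difference analysis operator: (Omega_DIF @ v)[i] = v[i+1] - v[i]
Omega_DIF = np.eye(d - 1, d, k=1) - np.eye(d - 1, d)

# A piecewise-linear f corresponds to piecewise-constant a and b whose jumps occur at the
# same locations, i.e. Omega_DIF @ a and Omega_DIF @ b share a common sparse support.
```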

55 Sparsity Based Solution: Notice that Ω_DIF a and Ω_DIF b should be sparse at the same locations. If k, the number of jumps, is given, then solve min_{a,b} ‖g − [I X][a; b]‖₂ s.t. Ω_DIF a and Ω_DIF b are jointly k-sparse. If we know ε = ‖e‖₂ and not k, minimize the joint sparsity of Ω_DIF a and Ω_DIF b s.t. ‖g − [I X][a; b]‖₂ ≤ ε. We return to the analysis cosparse framework.

56 Sparsity Based Solution: If k is given: near oracle denoising guarantees using SSCoSaMP [Giryes, Needell 2014]. If ε (the noise level) is given: use GAPN [Nam, Davies, Elad and Gribonval, 2013] and modify it to support structured (co)sparsity. Add a continuity constraint on the jump points.

57 Line Fitting (figure): e ~ N(0, 4) and e ~ N(0, 25).

58 Line Fitting with Continuity Constraint (figure): e ~ N(0, 4) and e ~ N(0, 25).

59 From 1D to 2D: For 2D we set f = a + Xb₁ + Yb₂ = [I X Y][a; b₁; b₂]. We consider two options for extending Ω_DIF to 2D. Standard: vertical and horizontal derivatives, with stencils [1; −1] and [1, −1]. Standard + diagonal derivatives: adding the two 2×2 diagonal difference stencils [[1, 0], [0, −1]] and [[0, 1], [−1, 0]].
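
A sketch of the extended 2D difference operator applied to an image (the diagonal stencils follow the description above; the function name and flag are illustrative):

```python
import numpy as np

def dif_2d(img, diagonals=False):
    """Stack the 2D finite differences of an image: vertical and horizontal,
       optionally also the two diagonal directions (the extended operator)."""
    dv = img[1:, :] - img[:-1, :]                   # [1; -1] stencil (vertical)
    dh = img[:, 1:] - img[:, :-1]                   # [1, -1] stencil (horizontal)
    diffs = [dv.ravel(), dh.ravel()]
    if diagonals:
        diffs.append((img[1:, 1:] - img[:-1, :-1]).ravel())   # main-diagonal difference
        diffs.append((img[1:, :-1] - img[:-1, 1:]).ravel())   # anti-diagonal difference
    return np.concatenate(diffs)
```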

60 Denoising Example (figure): Original; Noisy, σ = 20; DIF Standard, PSNR = 39.09; DIF With Diagonal, PSNR = 39.57; ROF, PSNR = 33.1 [Rudin, Osher and Fatemi, 1992].

61 Denoising Example (figure): Original; Noisy, σ = 20; DIF Standard, PSNR = 39.09; DIF With Diagonal, PSNR = 39.57.

62 Denoising Example (figure: Original; Noisy, σ = 20; DIF Standard, PSNR = 29.6; ROF, PSNR = 30.8): ROF gets better PSNR since the new method assumes a piecewise linear image and therefore smooths regions in the image. Is this useful?

63 Segmentation (figure): Original; gradient map of the original image; DIF Standard recovery; gradient map of the piecewise constant coefficients a of the recovered image.

64 Segmentation

65 Geometrical Inpainting

66 Geometrical Inpainting With standard derivatives With diagonal derivatives

67 Agenda Sparsity for overparametrized variational problems CoSaMP, SP and IHT near oracle performance Poisson Data Modeling Denoising relaxation techniques Compressed sensing Coherence Near-oracle performance Sparse Approximation Greedy algorithms Synthesis vs. analysis RIP Cosparse Analysis Greedy-like Methods The Transform Domain Technique The Signal Space Paradigm

68 Gaussian Denoising Problem: y = x + e, where e is zero-mean white Gaussian noise with variance σ², i.e., e[i] ~ N(0, σ²). Another way to look at this problem: y[i] ~ N(x[i], σ²).

69 Poisson Denoising Problem: Each element y[i] of y ∈ ℝ^d is independently Poisson distributed with mean x[i] and variance x[i]: large x[i] → large variance, small x[i] → small variance, and x[i] = 0 → y[i] = 0. The noise power is measured by the peak value: max_i x[i]. Poisson noise is not additive, unlike the Gaussian case.
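
A minimal sketch of how such data can be simulated (assumes a nonnegative image with a positive maximum; the function name is illustrative):

```python
import numpy as np

def add_poisson_noise(x, peak, rng=None):
    """Scale a nonnegative image so that its maximum equals `peak`,
       then draw each pixel from a Poisson distribution with that mean."""
    rng = np.random.default_rng() if rng is None else rng
    x_scaled = x * (peak / x.max())                 # the noise power is set by the peak value
    y = rng.poisson(x_scaled).astype(float)         # y[i] ~ Poisson(x_scaled[i]); x = 0 gives y = 0
    return y, x_scaled
```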

70 Poisson Denoising Problem: Noisy image distribution: P(y[i] = m) = e^{−x[i]} x[i]^m / m!, where m ≥ 0 is an integer (and y[i] = 0 with probability 1 when x[i] = 0).

71 Poisson Denoising Problem (figure): clean image x₀ with peak = 100 and its noisy version y.

72 Poisson Denoising Problem (figure): clean image x₀ with peak = 0.1 and its noisy version y.

73 Exponential Sparsity Prior: x = exp(Dα), ‖α‖₀ ≤ k, with x ∈ ℝ^d, D ∈ ℝ^{d×n}, α ∈ ℝ^n. [Salmon et al. 2012, Giryes and Elad 2012]

74 Poisson Denoising Methods: BM3D with improved Anscombe transform [Makitalo and Foi, 2011]. Non-local sparse PCA [Salmon et al. 2013]. GMM (Gaussian Mixture Model) based approach. Binning: down-sample the noisy image to improve SNR, denoise the down-sampled image, then interpolate the recovered image back to the initial size.

75 Our Method: Divide the image into patches. One global dictionary for all patches. OMP-like algorithm for sparse coding. Bootstrapped stopping criterion. Online dictionary learning steps. [Giryes and Elad 2013]

76 Noisy Image Max y value = 7 Peak = 1

77 Poisson Greedy Algorithm: dictionary learned atoms (figure). PSNR comparison, Peak = 1: Our 22.45dB, NLSPCA 20.37dB, BM3Dbin 19.41dB. [Giryes and Elad 2013]

78 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 1: Our 22.45dB, NLSPCA 20.37dB, BM3Dbin 19.41dB. [Salmon et al. 2013]

79 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 1: Our 22.45dB, NLSPCA 20.37dB, BM3Dbin 19.41dB. [Makitalo and Foi 2011]

80 Original Image

81 Noisy Image: Max y value = 3, Peak = 0.2.

82 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 0.2: Our 23.9dB, NLSPCA 22.98dB, BM3Dbin 23.16dB. [Giryes and Elad 2013]

83 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 0.2: Our 23.9dB, NLSPCA 22.98dB, BM3Dbin 23.16dB. [Salmon et al. 2013]

84 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 0.2: Our 23.9dB, NLSPCA 22.98dB, BM3Dbin 23.16dB. [Makitalo and Foi 2011]

85 Original Image

86 Noisy Image: Max y value = 8, Peak = 2.

87 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 2: Our 24.87dB, NLSPCA 23.3dB, BM3Dbin 24.3dB. [Giryes and Elad 2013]

88 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 2: Our 24.87dB, NLSPCA 23.3dB, BM3Dbin 24.3dB. [Salmon et al. 2013]

89 Poisson Greedy Algorithm (figure). PSNR comparison, Peak = 2: Our 24.87dB, NLSPCA 23.3dB, BM3Dbin 24.3dB. [Makitalo and Foi 2011]

90 Original Image

91 Conclusions The cosparse analysis model Analysis greedy-like algorithms. Signal space recovery New uniqueness and stability conditions. New algorithms with guarantees. Operating in the analysis transform domain. Relation between variational based methods and sparse representations.

92 www.cs.technion.ac.il/~raja

93

94 RIP Condition: Can we satisfy a condition like δ_ak ≤ δ_alg? The answer is positive: subgaussian matrices with m = O((k/δ_alg²) log(n/k)) [Candès and Tao 2006]; sampled Fourier matrices with m = O((k/δ_alg²) log^c n) [Rudelson and Vershynin 2006]; other random matrices?

95 DS guarantee For the white Gaussian noise case, if DS 1 a log N and K 3K 1, then the DS solution obeys: DS x x ˆ C 1 a log N K DS x ˆDS 3K a N N 1 where CDS 4 / 1 and with probability a exceeding 1 1 log [Candes and Tao 007].

96 Reconstruction Guarantees Theorem: Apply either ACoSaMP or ASP with a=(l-p)/l, obtaining after t iterations. If C 1 M / C 1(*) and 4l3 p C, M, where C max C l, Cl p and is a constant greater than zero C, M whenever (*) is satisfied, then after a finite number of iterations t * ˆt * t ˆ, x x c e 0 where c <1 is a function of x 4l3 p, Cl, Clp and M. [Giryes, Nam, Elad, Gribonval and Davies, 013]

97 Hard Thresholding Pursuit (HTP): HTP [Foucart, 2010] is similar to IHT but differs in the projection step. Instead of projecting onto the closest k-sparse vector, it takes the support T^{i+1} of the k largest entries of z̃^{i+1} and minimizes the fidelity term: ẑ^{i+1} = argmin_z ‖y − MD_{T^{i+1}} z‖₂², i.e. ẑ^{i+1}_{T^{i+1}} = (D_{T^{i+1}}* M* M D_{T^{i+1}})^{−1} D_{T^{i+1}}* M* y.
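
A sketch of HTP along these lines, written for a generic matrix A = MD (illustrative; the step-size default is an assumption, as in the IHT sketch earlier):

```python
import numpy as np

def htp(y, A, k, mu=None, n_iter=100):
    """HTP sketch: IHT-style support selection followed by a least-squares solve on that support."""
    n = A.shape[1]
    if mu is None:
        mu = 1.0 / np.linalg.norm(A, 2) ** 2
    alpha = np.zeros(n)
    for _ in range(n_iter):
        tmp = alpha + mu * A.T @ (y - A @ alpha)    # gradient step, as in IHT
        T = np.argsort(np.abs(tmp))[-k:]            # support of the k largest-magnitude entries
        alpha = np.zeros(n)
        alpha[T] = np.linalg.pinv(A[:, T]) @ y      # minimize the fidelity term on that support
    return alpha
```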

98 Changing Step Size: Instead of using a constant step size, one can choose in each iteration the step size that minimizes the fidelity term ‖y − MDẑ^{i+1}‖₂² [V. Cevher, 2011].

99 A-IHT and A-HTP: Gradient step: x̃^{i+1} = x̂^i + μM^T(y − Mx̂^i). Cosupport selection: Λ̂^{i+1} = cosupp(Ωx̃^{i+1}, ℓ). Projection, for A-IHT: x̂^{i+1} = Q_{Λ̂^{i+1}} x̃^{i+1}. For A-HTP: x̂^{i+1} = argmin_x ‖y − Mx‖₂ s.t. Ω_{Λ̂^{i+1}} x = 0, i.e. x̂^{i+1} = (Q_{Λ̂^{i+1}} M^T M Q_{Λ̂^{i+1}})† M^T y. [Giryes, Nam, Gribonval and Davies 2011]

100 Algorithm Variations: Optimal changing step size instead of a fixed one: μ^{i+1} = argmin_μ ‖y − Mx̂^{i+1}(μ)‖₂². Approximated changing step size: μ^{i+1} = argmin_μ ‖y − Mx̃^{i+1}(μ)‖₂². Underestimate the cosparsity and use ℓ̃ ≤ ℓ.

101 Reconstruction Guarantees Theorem: Apply AIHT or AHTP with a certain constant step size or optimal step size obtaining x t after t iterations. If C 1 / C 1(*) l M l and l p 1 Cl, M, where 1 C, M is a constant greater than zero whenever (*) is satisfied, then after a finite number of iterations t * * t ˆ, x x c e 0 1 where c 1 <1 is a function of l p, Cl, and M. [Giryes, Nam, Elad, Gribonval and Davies, 013]

10 Reconstruction Guarantees Theorem: Apply either ACoSaMP or ASP with a=(l-p)/l, obtaining after t iterations. For an appropriate value of C max C l, Cl p there exists a reference constant C, M greater than zero such that if 4l3 p C, M then after a finite number of iterations t * ˆt * t ˆ, x x c e 0 where c <1 is a function of x 4l3 p, Cl, Clp and M. [Giryes, Nam, Elad, Gribonval and Davies, 013]

103 Reconstruction Guarantees Theorem: Apply AIHT or AHTP with a certain constant step size or optimal step size obtaining x t after t iterations. If C 1 / C 1(*) l M l and l p 1 Cl, M, where 1 C, M is a constant greater than zero whenever (*) is satisfied, then after a finite number of iterations t * * t ˆ, x x c e 0 1 where c 1 <1 is a function of l p, Cl, and M. [Giryes, Nam, Elad, Gribonval and Davies, 013]

104 Special Cases: Optimal projection: given an optimal projection scheme, the condition of the theorem is simply δ_{4ℓ−3p} ≤ 0.0156. When Ω is unitary, the RIP and the Ω-RIP coincide and the condition becomes δ_{4k}(MΩ*) ≤ 0.0156.

105 Reconstruction Guarantees Theorem: Apply either ACoSaMP or ASP with a=(l-p)/l, obtaining after t iterations. For an appropriate value of C max C l, Cl p there exists a reference constant C, M greater than zero such that if 4l3 p C, M then after a finite number of iterations t * ˆt * t ˆ, x x c e 0 where c <1 is a function of x 4l3 p, Cl, Clp and M. [Giryes, Nam, Elad, Gribonval and Davies, 013]

106 Special Cases: Optimal projection: given an optimal projection scheme, the condition of the theorem is simply δ_{4ℓ−3p} ≤ 0.0156. When Ω is unitary, the RIP and the Ω-RIP coincide and the condition becomes δ_{4k}(MΩ*) ≤ 0.0156.

107 Reconstruction Guarantees Theorem: Apply either AIHT or AHTP with a certain constant step size or optimal step size obtaining x t after t iterations. If C 1 / C 1(*) l M l and l p 1 Cl, M, where 1 C, M is a constant greater than zero whenever (*) is satisfied, then after a finite number of iterations t * * t ˆ, x x c e 0 1 where c 1 <1 is a function of l p, Cl, and M. [Giryes, Nam, Elad, Gribonval and Davies, 013]

108 Special Case - Unitary Transform: When Ω is unitary, the condition of the first theorems is simply δ_{2k}(MΩ*) < 1/3.

109 -RIP Matrices p Theorem: for a fixed d md, if M satisfies CM m for any z P Mz z z e then, for any constant l 0 we have l l with probability exceeding 1 e t if 3 9 p m p llog t CM l p l l

110 (Co)Sparse Vectors Addition. Synthesis: given two vectors z₁ and z₂ with supports T₁ and T₂ of sizes k₁ and k₂, find the support of z₁ + z₂; solution: supp(z₁ + z₂) ⊆ T₁ ∪ T₂, so the maximal size of the support is k₁ + k₂. Analysis: given two vectors x₁ and x₂ with cosupports Λ₁ and Λ₂ of sizes ℓ₁ and ℓ₂, find the cosupport of x₁ + x₂; solution: cosupp(x₁ + x₂) ⊇ Λ₁ ∩ Λ₂, so the minimal size of the cosupport is ℓ₁ + ℓ₂ − p.

111 Objective Aware Projection. Synthesis: given a vector z and a support T, perform an objective-aware projection onto the subspace spanned by D_T; solution: solve argmin_z ‖y − MDz‖₂² s.t. z_{T^C} = 0, which has a simple closed-form solution. Analysis: given a vector x and a cosupport Λ, perform an objective-aware projection onto the subspace orthogonal to the one spanned by Ω_Λ; solution: solve argmin_x ‖y − Mx‖₂² s.t. Ω_Λ x = 0, which also has a closed-form solution.

112 Analysis SP (ASP): Initialize r⁰ = y, t = 0, Λ⁰ = [1..p]. Find new cosupport elements: Λ_Δ = Ŝ_{aℓ}(M*rᵗ⁻¹). Cosupport update: Λ̃ᵗ = Λᵗ⁻¹ ∩ Λ_Δ. Calculate a temporary solution: x_p = argmin_x ‖y − Mx‖₂² s.t. Ω_{Λ̃ᵗ} x = 0. Estimate an ℓ-cosparse cosupport: Λᵗ = Ŝ_ℓ(x_p). Compute the new solution: x̂ᵗ = argmin_x ‖y − Mx‖₂² s.t. Ω_{Λᵗ} x = 0. Update the residual: rᵗ = y − Mx̂ᵗ. Output: x̂ = x̂ᵗ. [Giryes and Elad 2012]

113 Compressive Sampling Matching Pursuit (CoSaMP): Initialize r⁰ = y, T⁰ = ∅, t = 0. Find new support elements: T_Δ = supp(D*M*rᵗ⁻¹, 2k). Support update: T̃ᵗ = Tᵗ⁻¹ ∪ T_Δ. Calculate a temporary representation: z_p = argmin_z ‖y − MDz‖₂² s.t. z_{(T̃ᵗ)^C} = 0. Estimate a k-sparse support: Tᵗ = supp(z_p, k). Compute the new solution: ẑᵗ = (z_p)_{Tᵗ} and x̂ᵗ = Dẑᵗ. Update the residual: rᵗ = y − Mx̂ᵗ. Output: x̂ = x̂ᵗ. [Needell and Tropp 2009]
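
A compact NumPy sketch of the plain (synthesis) CoSaMP iteration for A = MD, following the steps listed above (not the authors' implementation; the iteration count is arbitrary):

```python
import numpy as np

def cosamp(y, A, k, n_iter=30):
    """Plain CoSaMP sketch for y ≈ A alpha with ||alpha||_0 <= k."""
    n = A.shape[1]
    alpha = np.zeros(n)
    r = y.copy()
    for _ in range(n_iter):
        proxy = A.T @ r
        Omega = np.argsort(np.abs(proxy))[-2 * k:]          # 2k new support candidates
        T = np.union1d(Omega, np.flatnonzero(alpha))        # merge with the current support
        b = np.zeros(n)
        b[T] = np.linalg.pinv(A[:, T]) @ y                  # least squares on the merged support
        keep = np.argsort(np.abs(b))[-k:]                   # prune to the k largest entries
        alpha = np.zeros(n)
        alpha[keep] = b[keep]
        r = y - A @ alpha                                   # update the residual
    return alpha
```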

114 Analysis CoSaMP (ACoSaMP): Initialize r⁰ = y, t = 0, Λ⁰ = [1..p]. Find new cosupport elements: Λ_Δ = Ŝ_{aℓ}(M*rᵗ⁻¹). Cosupport update: Λ̃ᵗ = Λᵗ⁻¹ ∩ Λ_Δ. Calculate a temporary solution: x_p = argmin_x ‖y − Mx‖₂² s.t. Ω_{Λ̃ᵗ} x = 0. Estimate an ℓ-cosparse cosupport: Λᵗ = Ŝ_ℓ(x_p). Compute the new solution: x̂ᵗ = Q_{Λᵗ} x_p. Update the residual: rᵗ = y − Mx̂ᵗ. Output: x̂ = x̂ᵗ. [Giryes and Elad 2012]