Sparsity and Compressed Sensing — Jalal Fadili, Normandie Université–ENSICAEN, GREYC. Mathematical coffees, 2017.
Recap: linear inverse problems

Observe $y = A x$ with $y \in \mathbb{R}^m$, $x \in \mathbb{R}^n$, and a sensing/dictionary matrix $A \in \mathbb{R}^{m \times n}$ with $m \ll n$: $A$ is fat (underdetermined system). The solution is not unique (fundamental theorem of linear algebra). Are we stuck? No, if the dimension of $x$ is intrinsically small.

[Figure: matrix picture of $y = A x$, with the wide sensing matrix applied to a long vector $x$.]
Geometry of inverse problems

[Figure: $x_0 \in \mathbb{R}^n$ lives on a low-dimensional set ("objects on this set have a low cost") and is mapped by $A$ to $A x_0 \in \mathbb{R}^m$.]
Strong notion of sparsity

[Figure: sparse vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$: 0-sparse (the origin), 1-sparse (the axes), 2-sparse (the coordinate planes; dense in $\mathbb{R}^2$), 3-sparse (dense in $\mathbb{R}^3$).]

$\mathrm{supp}(x) = \{i = 1, \ldots, n : x_i \neq 0\}$, and $\|x\|_0 = \#\,\mathrm{supp}(x)$ (not a norm: not positively homogeneous).

Definition (informal): $x \in \mathbb{R}^n$ is sparse iff $\|x\|_0 \ll n$.

Model of $s$-sparse vectors: a union of subspaces $\Sigma_s = \bigcup_i \{V_i = \mathrm{span}((e_j)_{j \in I_i}) : \dim(V_i) = s\}$, where the $I_i$ range over the size-$s$ subsets of $\{1, \ldots, n\}$.
Weak notion of sparsity

In nature, signals, images, information are not (strongly) sparse.

[Figure: wavelet transform of an image; large coefficients shown in white, zero in black; the sorted wavelet coefficients decay rapidly with their index.]

Definition (informal): $x \in \mathbb{R}^n$ is weakly sparse iff the sorted magnitudes $|x|_{(i)}$ decrease fast enough with $i$.
From now on, sparsity is intended in the strong sense.
What is sparsity good for?

Solve $y = A x$ where $x$ is sparse. If $\|x\|_0 \le m$ and $A_I$ is full-rank, where $I := \mathrm{supp}(x)$, we are done: there are at least as many equations as unknowns in $y = A_I x_I$. In practice, however, the support $I$ is not known; we have to infer it from the sole knowledge of $y$ and $A$.
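To make the known-support remark concrete, here is a minimal NumPy sketch (dimensions and the support $I$ are illustrative choices, not from the slides): when $I$ is known and $A_I$ has full column rank, $x$ is recovered exactly by solving the overdetermined but consistent system $y = A_I x_I$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 10, 30, 3
A = rng.standard_normal((m, n))

# An s-sparse ground truth with known support I (hypothetical values)
I = np.array([2, 11, 25])
x = np.zeros(n)
x[I] = np.array([1.5, -2.0, 0.7])
y = A @ x

# With I known and A_I full column rank (s <= m), y = A_I x_I is
# an overdetermined consistent system: least squares solves it exactly.
x_I, *_ = np.linalg.lstsq(A[:, I], y, rcond=None)

x_hat = np.zeros(n)
x_hat[I] = x_I
print(np.allclose(x_hat, x))  # exact recovery once the support is known
```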
Regularization

Solve $y = A x$ where $x$ is sparse:

$\min_{x \in \mathbb{R}^n} \|x\|_0 \quad \text{s.t.} \quad y = A x$

[Figure: $m = 2$, $n = 3$; the affine set $\{x : y = A x\}$ meets the 1-sparse and 2-sparse models, and the sparsest solution is 1-sparse.]

Not convex, not even continuous. In fact, this is a combinatorial NP-hard problem. Can we find a viable alternative?
Relaxation

Solve $y = A x$ where $x$ is sparse, replacing $\|x\|_0$ by $\|x\|_p = \big(\sum_{i=1}^n |x_i|^p\big)^{1/p}$ in $\min_{x \in \mathbb{R}^n} \|x\|_p$ s.t. $y = A x$ ($m = 1$, $n = 2$ in the figures):

- $p = 0$: not convex, not continuous, NP-hard. Sparsest solution.
- $p = 0.5$: not convex, continuous. Sparsest solution.
- $p = 1$ (Basis Pursuit): continuous, convex. Sparsest solution.
- $p = 1.5$: continuous, convex. Dense (LS-type) solution.
- $p = 2$: continuous, convex. Dense (LS) solution.
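As a quick numerical illustration of why the $\ell_2$ relaxation fails to promote sparsity, the minimal $\ell_2$-norm solution $A^+ y$ of an underdetermined system is generically dense (a hypothetical NumPy example with illustrative dimensions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 8, 20
A = rng.standard_normal((m, n))
x0 = np.zeros(n)
x0[[1, 7, 13]] = [2.0, -1.0, 0.5]  # a 3-sparse ground truth
y = A @ x0

# Minimal l2-norm solution among all x with Ax = y: x = A^+ y
x_l2 = np.linalg.pinv(A) @ y

# x_l2 is feasible but generically has all entries nonzero
print(np.count_nonzero(x0), np.count_nonzero(np.abs(x_l2) > 1e-10))
```

The minimal-$\ell_2$ solution is the projection of $x_0$ onto the row space of $A$, which spreads energy over all coordinates; this is the geometric picture behind the "dense (LS) solution" label above.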
Tightest convex relaxation

$\min_{x \in \mathbb{R}^n} \|x\|_1 \quad \text{s.t.} \quad y = A x \qquad \text{(BP)}$

$B_{\ell_1} = \mathrm{ConvHull}(B_{\ell_0} \cap B_{\ell_2})$: $\ell_1$ is the tightest convex relaxation of $\ell_0$.
Error correction problem

Find $w$ from $v = B w + e$: the plaintext $w \in \mathbb{R}^m$ is encoded into a codeword $B w \in \mathbb{R}^n$ ($n > m$), and the received ciphertext $v$ is corrupted by an error $e$ affecting only a small fraction of its entries.
Choose $A$ such that $\mathrm{span}(B) \subseteq \ker(A)$, i.e. $A B = 0$, and multiply $v$ by $A$: $y = A v = A e$. Only a small fraction of corruptions means that $e$ is sparse. The original problem can thus be cast as

$x^\star \in \mathrm{Argmin}_{x \in \mathbb{R}^n} \|x\|_1 \ \text{s.t.} \ y = A x, \qquad w^\star = B^+ (v - x^\star).$

Question: when is $x^\star = e$, so that $w^\star = w$? (See the following talks.)
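The reduction above is easy to check numerically. A minimal sketch (dimensions and corruption pattern are illustrative assumptions): build $A$ from a basis of the left null space of $B$, so that $AB = 0$ and the measurements $y = Av$ depend on the sparse error $e$ alone.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
n, m = 12, 4
B = rng.standard_normal((n, m))   # encoding matrix
w = rng.standard_normal(m)        # plaintext

e = np.zeros(n)                   # sparse corruption of the ciphertext
e[[3, 9]] = [1.0, -2.0]
v = B @ w + e

# Rows of A span ker(B^T), hence A @ B == 0
A = null_space(B.T).T             # shape (n - m, n)
y = A @ v

# The plaintext is annihilated: y depends only on the sparse error e
print(np.allclose(A @ B, 0), np.allclose(y, A @ e))
```

From here, recovering $e$ is exactly the (BP) problem of the previous slides, after which $w$ follows by inverting $B$ on $v - e$.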
Optimization algorithms

(BP) as a linear program: decompose $x$ into its positive and negative parts and lift to $\mathbb{R}^{2n}$:

$\min_{z \in \mathbb{R}^{2n}} \sum_{i=1}^{2n} z_i \quad \text{s.t.} \quad y = [A \ \ {-A}]\, z, \ z \ge 0,$

then recover $x^\star = (z^\star_i)_{i=1}^n - (z^\star_i)_{i=n+1}^{2n}$. Use your favourite LP solver package: CPLEX, SeDuMi (IP), MOSEK (IP), etc. High accuracy, but poor scaling with the dimension $n$.

Proximal splitting algorithms (DR, ADMM, primal-dual; MC of April 18th): scale well with dimension — the cost per iteration is an $O(mn)$ matrix-vector multiplication plus an $O(n)$ soft-thresholding — but, being iterative methods, they are less accurate.
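The LP lifting above can be sketched with `scipy.optimize.linprog` (a toy instance with illustrative dimensions; not the slides' own experiment). Since the true $x_0$ is feasible, the minimizer's $\ell_1$ norm can never exceed $\|x_0\|_1$, and for Gaussian $A$ with enough measurements one typically observes exact recovery.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, s = 25, 60, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = A @ x0

# Lift x = z_plus - z_minus with z >= 0; then sum(z) = ||x||_1 at the optimum
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=[(0, None)] * (2 * n))
x_star = res.x[:n] - res.x[n:]
print(res.status, np.linalg.norm(x_star - x0))
```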
Recovery guarantees: noiseless case $y = A x_0$

$\min_{x \in \mathbb{R}^n} \|x\|_1 \ \text{s.t.} \ y = A x \qquad \text{(BP)}$

When does (BP) have a unique solution which is the sparsest vector $x_0$?

- Uniform guarantees: which conditions ensure recovery of all $s$-sparse signals?
- Non-uniform guarantees: which conditions ensure recovery of the given $s$-sparse vector $x_0$?
- Sample complexity bounds (random settings): can we construct sensing matrices such that the above conditions hold? What are the optimal scalings of the problem dimensions $(n, m, s)$? Necessary conditions?
- What if $x_0$ is only weakly sparse?
Sensitivity/stability guarantees

$\min_{x \in \mathbb{R}^n} \tfrac{1}{2} \|y - A x\|_2^2 + \lambda \|x\|_1, \ \lambda > 0 \qquad \text{(BPDN/LASSO)}$

Noisy case $y = A x_0 + \varepsilon$: study the stability of the (BPDN) solution(s) to the noise $\varepsilon$.

$\ell_2$ stability — Theorem (typical statement): under conditions XX, and the choice $\lambda = c \|\varepsilon\|_2$, there exists $C$ such that any solution $x^\star$ of (BPDN) obeys $\|x^\star - x_0\|_2 \le C \|\varepsilon\|_2$.

Support and sign stability (more stringent) — Theorem (typical statement): under conditions XXXX, and the choice $\lambda = f(\|\varepsilon\|_2, \min_{i \in \mathrm{supp}(x_0)} |x_{0,i}|)$, the unique solution $x^\star$ of (BPDN) obeys $\mathrm{supp}(x^\star) = \mathrm{supp}(x_0)$ and $\mathrm{sign}(x^\star) = \mathrm{sign}(x_0)$.

Again uniform vs non-uniform guarantees. Sample complexity bounds (random settings): can we construct sensing matrices such that the above conditions hold? What are the optimal scalings of the problem dimensions $(n, m, s)$? Necessary conditions? What if $x_0$ is only weakly sparse?
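(BPDN/LASSO) is exactly the problem the proximal splitting methods mentioned earlier solve; a minimal sketch is iterative soft-thresholding (ISTA), whose per-iteration cost matches the $O(mn)$ matrix-vector plus $O(n)$ shrinkage accounting above. All dimensions, the regularization value, and the noise level below are illustrative assumptions, not the slides' own settings.

```python
import numpy as np

def soft_threshold(v, t):
    # prox of t * ||.||_1: componentwise shrinkage toward zero
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=2000):
    # forward-backward splitting for min_x 0.5 ||y - Ax||^2 + lam ||x||_1
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L with L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + step * (A.T @ (y - A @ x)), lam * step)
    return x

rng = np.random.default_rng(4)
m, n = 30, 60
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[[5, 20, 41]] = [3.0, -2.0, 1.5]
y = A @ x0 + 0.01 * rng.standard_normal(m)

x_hat = ista(A, y, lam=0.05)
print(np.linalg.norm(x_hat - x0))
```

Each iteration is a gradient step on the quadratic term followed by the $\ell_1$ proximal map; the objective decreases monotonically from the zero initialization.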
Sensitivity/stability guarantees

[Figure: original vs recovered signals in two regimes: stable support recovery; no stable support but $\ell_2$ stability.]

In some applications, what matters is stability of the support.
Guarantees from a geometrical perspective MC 17-17
Notions of convex analysis

[Figure: examples of convex and non-convex sets; a convex set $C$ and its affine hull $\mathrm{aff}(C)$.]

Definition (relative interior): the relative interior $\mathrm{ri}(C)$ of a convex set $C$ is its interior relative to $\mathrm{aff}(C)$. Examples:

- $C = \{x\}$: $\mathrm{aff}(C) = \{x\}$, $\mathrm{ri}(C) = \{x\}$.
- $C = [x, x']$: $\mathrm{aff}(C)$ is the line generated by $(x, x')$, $\mathrm{ri}(C) = \,]x, x'[$.
- $C$ the simplex in $\mathbb{R}^n$: $\mathrm{aff}(C) = \{\sum_{i=1}^n x_i = 1\}$, $\mathrm{ri}(C) = \{\sum_{i=1}^n x_i = 1,\ x_i \in\, ]0,1[\}$.
Subdifferential

Definition (subdifferential): the subdifferential of a convex function $f$ at $x \in \mathbb{R}^n$ is the set of slopes of affine functions minorizing $f$ at $x$, i.e.

$\partial f(x) = \{u \in \mathbb{R}^n : \forall z \in \mathbb{R}^n, \ f(z) \ge f(x) + \langle u, z - x \rangle\}.$

[Figure: for $f = |\cdot|$ on $\mathbb{R}$: $\partial f(x) = \{-1\}$ for $x < 0$, $\partial f(x) = \{1\}$ for $x > 0$, and $\partial f(0) = [-1, 1]$.]

For the $\ell_1$ norm, with $I := \mathrm{supp}(x)$:

$\partial \|x\|_1 = \times_{i=1}^n \partial |x_i| = \{u \in \mathbb{R}^n : u_I = \mathrm{sign}(x_I), \ \|u\|_\infty \le 1\}.$
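The $\ell_1$ subdifferential formula can be sanity-checked numerically: build a $u$ matching the sign pattern on the support, with arbitrary values in $[-1, 1]$ off it, and verify the minorization inequality on random points (a small illustrative check, with a hypothetical vector $x$).

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.array([1.0, 0.0, -2.0, 0.0, 0.5, 0.0])

# Build u in the subdifferential of ||.||_1 at x:
# u_i = sign(x_i) on the support, any value in [-1, 1] off it
u = np.sign(x)
off = (x == 0)
u[off] = rng.uniform(-1, 1, off.sum())

# Check the subgradient inequality ||z||_1 >= ||x||_1 + <u, z - x>
for z in rng.standard_normal((1000, x.size)):
    assert np.abs(z).sum() >= np.abs(x).sum() + u @ (z - x) - 1e-12
print("subgradient inequality verified on 1000 random points")
```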
Normal cone

Definition (normal cone): the normal cone to a set $C$ at $x \in C$ is

$N_C(x) = \{u \in \mathbb{R}^n : \langle u, z - x \rangle \le 0, \ \forall z \in C\}.$

[Figure: normal cones to a convex set; when $C$ is a subspace $V$, $N_C(x) = V^\perp$.]
Optimality conditions for (BP)

$\min_{x \in \mathbb{R}^n} \|x\|_1 \ \text{s.t.} \ y = A x \qquad \text{(BP)}$

With $I := \mathrm{supp}(x^\star)$:

$x^\star \in \mathrm{Argmin}_{x} \|x\|_1 \ \text{s.t.} \ y = A x$
$\iff 0 \in \partial \|x^\star\|_1 + \mathrm{span}(A^\top)$
$\iff \mathrm{span}(A^\top) \cap \partial \|x^\star\|_1 \neq \emptyset$
$\iff \exists\, \eta \in \mathbb{R}^m \ \text{s.t.} \ A_I^\top \eta = \mathrm{sign}(x^\star_I) \ \text{and} \ \|A^\top \eta\|_\infty \le 1.$

[Figure: the affine set $\{x : y = A x\}$, a translate of $\ker(A)$, touching the $\ell_1$ ball at $x^\star$, with $A^\top \eta$ in the subdifferential.]
Dual certificate

Definition: a vector $\eta \in \mathbb{R}^m$ verifying the source condition $A^\top \eta \in \partial \|x_0\|_1$ is called a dual certificate associated to $x_0$.

[Figure: the affine set $\{x : y = A x\}$, $x_0$, and $A^\top \eta$ in the subdifferential at $x_0$.]
Non-degenerate dual certificate

Definition: with $I := \mathrm{supp}(x_0)$, a vector $\eta \in \mathbb{R}^m$ verifying the source condition

$A^\top \eta \in \mathrm{ri}(\partial \|x_0\|_1) \iff A_I^\top \eta = \mathrm{sign}((x_0)_I) \ \text{and} \ \|A_{I^c}^\top \eta\|_\infty < 1$

is called a non-degenerate dual certificate.

$A^\top \eta$ hits the relative boundary $\iff$ $\{x : y = A x\}$ is tangent to a higher-dimensional face of the $\ell_1$ ball containing $x_0$ $\iff$ non-unique solution.

[Figure: the non-degenerate vs degenerate configurations of $A^\top \eta$ on the $\ell_1$ ball.]
Restricted injectivity

Assumption: $A_I$ has full column rank, where $I := \mathrm{supp}(x_0)$.

A natural assumption: in the noiseless case $y = A x_0$, assume $I$ is known; then $y = A x_0 = A_I (x_0)_I$. There is no hope of recovering $x_0$ uniquely, even knowing its support, if $A_I$ has a nontrivial kernel. All recovery conditions in the literature assume a form of restricted injectivity.
Exact recovery

$\min_{x \in \mathbb{R}^n} \|x\|_1 \ \text{s.t.} \ y = A x \qquad \text{(BP)}$

Theorem: let $I = \mathrm{supp}(x_0)$. Assume that there exists a non-degenerate dual certificate at $x_0$ and that $A_I$ is full-rank. Then $x_0$ is the unique solution to (BP).

The condition is even necessary when $x_0$ is non-trivial.
Stability without support recovery

Noisy case $y = A x_0 + \varepsilon$:

$\min_{x \in \mathbb{R}^n} \tfrac{1}{2} \|y - A x\|_2^2 + \lambda \|x\|_1, \ \lambda > 0 \qquad \text{(BPDN/LASSO)}$

Theorem: let $I = \mathrm{supp}(x_0)$. Assume that there exists a non-degenerate dual certificate $\eta$ at $x_0$ and that $A_I$ is full-rank. Then, choosing $\lambda = c \|\varepsilon\|_2$, $c > 0$, any minimizer $x^\star$ of (BPDN/LASSO) obeys

$\|x^\star - x_0\|_2 \le C(c, A, I, \eta) \, \|\varepsilon\|_2.$

The condition is even necessary when $x_0$ is non-trivial.
Stable support and sign recovery

Noisy case $y = A x_0 + \varepsilon$:

$\min_{x \in \mathbb{R}^n} \tfrac{1}{2} \|y - A x\|_2^2 + \lambda \|x\|_1, \ \lambda > 0 \qquad \text{(BPDN/LASSO)}$

Theorem: let $I = \mathrm{supp}(x_0)$. Assume that $A_I$ is full-rank and that

$\eta_F = A_I (A_I^\top A_I)^{-1} \mathrm{sign}((x_0)_I)$

is a non-degenerate dual certificate. Then, choosing $c_1 \|\varepsilon\|_2 < \lambda < c_2 \min_{i \in I} |(x_0)_i|$, (BPDN/LASSO) has a unique solution $x^\star$, which moreover satisfies $\mathrm{supp}(x^\star) = I$ and $\mathrm{sign}(x^\star) = \mathrm{sign}(x_0)$.

This is almost necessary when $x_0$ is non-trivial.
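The pre-certificate $\eta_F$ above is explicit, so its non-degeneracy can be checked directly: $A_I^\top \eta_F = \mathrm{sign}((x_0)_I)$ holds by construction, and what must be verified is $\|A_{I^c}^\top \eta_F\|_\infty < 1$. A hypothetical Gaussian instance (dimensions, support, and signs are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 60, 100
A = rng.standard_normal((m, n)) / np.sqrt(m)
I = np.array([4, 17, 33])              # assumed support of x0
sign_I = np.array([1.0, -1.0, 1.0])    # assumed sign pattern on I

AI = A[:, I]
# Minimal-norm pre-certificate: eta_F = A_I (A_I^T A_I)^{-1} sign((x0)_I)
eta_F = AI @ np.linalg.solve(AI.T @ AI, sign_I)

# Check the two defining properties of a non-degenerate certificate
on_support = AI.T @ eta_F                            # equals sign_I by construction
Ic = np.setdiff1d(np.arange(n), I)
off_support = np.abs(A[:, Ic].T @ eta_F).max()       # must be < 1
print(np.allclose(on_support, sign_I), off_support)
```

For Gaussian $A$ with enough rows relative to the sparsity, the off-support correlations concentrate well below 1, which is the mechanism exploited by the sample complexity bounds of the next lecture.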
Take-away messages

- Convex relaxation is good for sparse recovery.
- Many (tight) guarantees with nice geometrical insight: exact noiseless recovery; stability without support recovery; stable support recovery.
- Can we translate these conditions into sample complexity bounds? Yes: random measurements (next lecture).
Thanks! Any questions? — https://fadili.users.greyc.fr/