Strengthened Sobolev inequalities for a random subspace of functions
Rachel Ward, University of Texas at Austin
April 2013
Discrete Sobolev inequalities

Proposition (Sobolev inequality for discrete images). Let $x \in \mathbb{R}^{N \times N}$ be zero-mean. Then
$$\Big( \sum_{j=1}^{N} \sum_{k=1}^{N} |x_{j,k}|^2 \Big)^{1/2} \lesssim \sum_{j=1}^{N} \sum_{k=1}^{N} \big( |x_{j+1,k} - x_{j,k}| + |x_{j,k+1} - x_{j,k}| \big),$$
or $\|x\|_2 \lesssim \|x\|_{TV}$.

Theorem (New! Strengthened Sobolev inequality). With probability $1 - e^{-cm}$, the following holds for all images $x \in \mathbb{R}^{N \times N}$ in the null space of an $m \times N^2$ random Gaussian matrix:
$$\|x\|_2 \lesssim \frac{[\log(N)]^{3/2}}{\sqrt{m}} \, \|x\|_{TV}.$$
Joint work with Deanna Needell. Research supported in part by the ONR and the Alfred P. Sloan Foundation.
Motivation: Image processing
Images are compressible in the discrete gradient
Images are compressible in the discrete gradient

We define the discrete directional derivatives of an image $x \in \mathbb{R}^{N \times N}$ as
$$x_u : \mathbb{R}^{N \times N} \to \mathbb{R}^{(N-1) \times N}, \quad (x_u)_{j,k} = x_{j,k} - x_{j-1,k},$$
$$x_v : \mathbb{R}^{N \times N} \to \mathbb{R}^{N \times (N-1)}, \quad (x_v)_{j,k} = x_{j,k} - x_{j,k-1},$$
and the discrete gradient operator
$$\nabla x = (x_u, x_v) \in \mathbb{R}^{N \times N \times 2}.$$
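The directional derivatives above are just first differences along each axis. A minimal numpy sketch (the within-image, no-wraparound boundary convention below matches the shapes $(N-1) \times N$ and $N \times (N-1)$ on this slide):

```python
import numpy as np

def discrete_gradient(x):
    """Discrete directional derivatives of an N x N image:
    (x_u)_{j,k} = x_{j,k} - x_{j-1,k}  -> shape (N-1) x N,
    (x_v)_{j,k} = x_{j,k} - x_{j,k-1}  -> shape N x (N-1)."""
    x_u = x[1:, :] - x[:-1, :]   # differences down the columns
    x_v = x[:, 1:] - x[:, :-1]   # differences across the rows
    return x_u, x_v

# Example: a linear ramp image has constant gradient.
x = np.arange(16, dtype=float).reshape(4, 4)
x_u, x_v = discrete_gradient(x)
```

For this ramp, every vertical difference is 4 and every horizontal difference is 1, so the gradient pair is sparse in the sense of having few distinct values; natural images are approximately piecewise constant, which is why their gradients are compressible.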
Images are also compressible in wavelet bases

[Figure: reconstruction of Boats from 10% of its Haar wavelet coefficients; first few Haar basis functions.]

$H : \mathbb{R}^{N \times N} \to \mathbb{R}^{N \times N}$ is the bivariate Haar wavelet transform.
Notation

The image $\ell_p$-norm is $\|u\|_p := \big( \sum_{j=1}^{N} \sum_{k=1}^{N} |u_{j,k}|^p \big)^{1/p}$.
$u$ is $s$-sparse if $\|u\|_0 := \#\{(j,k) : u_{j,k} \neq 0\} \le s$.
$u_s = \arg\min_{z : \|z\|_0 = s} \|u - z\|$ is the best $s$-sparse approximation to $u$.
$\sigma_s(u)_p = \|u - u_s\|_p$ is the $s$-sparse approximation error.
Natural images are compressible:
$\sigma_s(\nabla x) = \|\nabla x - (\nabla x)_s\|_2$ decays quickly in $s$,
$\sigma_s(Hx) = \|Hx - (Hx)_s\|_2$ decays quickly in $s$.
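The best $s$-sparse approximation simply keeps the $s$ largest-magnitude entries, so $\sigma_s(u)_p$ is the $\ell_p$-norm of the remaining tail. A small sketch (function name is ours, not from the slides):

```python
import numpy as np

def sigma_s(u, s, p=1):
    """s-sparse approximation error sigma_s(u)_p = ||u - u_s||_p,
    where u_s keeps the s largest-magnitude entries of u."""
    mags = np.abs(u).ravel()
    order = np.argsort(mags)[::-1]      # indices by decreasing magnitude
    tail = mags[order[s:]]              # entries outside the best s-term support
    return (tail ** p).sum() ** (1.0 / p)

u = np.array([5.0, -3.0, 1.0, 0.5, 0.25])
err = sigma_s(u, 2, p=1)   # tail beyond the two largest entries: 1 + 0.5 + 0.25
```

A signal is "compressible" exactly when this tail norm decays quickly as $s$ grows.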
Imaging via compressed sensing

If images are so compressible (nearly low-dimensional), do we really need to know all of the pixel values for accurate reconstruction, or can we get away with taking a smaller number of linear measurements?
MRI: speed-up by exploiting sparsity?

In Magnetic Resonance Imaging (MRI), rotating magnetic field measurements are 2D Fourier transform measurements:
$$y_{k_1,k_2} = \big( \mathcal{F}(x) \big)_{k_1,k_2} = \frac{1}{N} \sum_{j_1,j_2} x_{j_1,j_2} \, e^{-2\pi i (k_1 j_1 + k_2 j_2)/N}, \qquad -N/2 + 1 \le k_1, k_2 \le N/2.$$
Each measurement takes a certain amount of time, so a reduced number of samples $m \ll N^2$ means a reduced MRI scan time.
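Each measurement is one sample of the normalized 2D DFT. A sketch evaluating the formula directly and checking it against `np.fft.fft2` (we index pixels $j_1, j_2 = 0, \ldots, N-1$ as a convention; negative frequencies match FFT bins by $N$-periodicity):

```python
import numpy as np

def fourier_measurement(x, k1, k2):
    """One MRI-style measurement: the (k1, k2) 2D Fourier coefficient of x,
    normalized by 1/N as on the slide."""
    N = x.shape[0]
    j1, j2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return (x * np.exp(-2j * np.pi * (k1 * j1 + k2 * j2) / N)).sum() / N

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
# A negative frequency such as k1 = -3 equals FFT bin (-3 mod 8) = 5.
y = fourier_measurement(x, -3, 2)
y_fft = np.fft.fft2(x)[-3 % 8, 2] / 8
```

Sampling $m$ of these coefficients, rather than all $N^2$, is the measurement model behind the Fourier corollary later in the talk.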
Compressed sensing set-up
1. Signal (or image) of interest $f \in \mathbb{R}^d$ which is sparse: $\|f\|_0 \le s$.
2. Measurement operator $A : \mathbb{R}^d \to \mathbb{R}^m$, $m \ll d$: $y = Af$.
3. Problem: using the sparsity of $f$, can we reconstruct $f$ efficiently from $y$?
A sufficient condition: the Restricted Isometry Property¹
$A : \mathbb{R}^d \to \mathbb{R}^m$ satisfies the Restricted Isometry Property (RIP) of order $s$ if
$$\tfrac{3}{4} \|f\|_2 \le \|Af\|_2 \le \tfrac{5}{4} \|f\|_2 \quad \text{for all } s\text{-sparse } f.$$
Gaussian or Bernoulli measurement matrices satisfy the RIP with high probability when $m \gtrsim s \log d$. Random subsets of rows from Fourier and other matrices with a fast multiply have a similar property: $m \gtrsim s \log^4 d$.
¹ Candès-Romberg-Tao 2006
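A numeric illustration (not a proof, and with deliberately loose bounds rather than the $3/4$ and $5/4$ of the definition): a variance-normalized Gaussian matrix approximately preserves the norm of sparse vectors.

```python
import numpy as np

rng = np.random.default_rng(42)
d, m, s = 400, 200, 5
# Gaussian matrix scaled so that E ||A f||_2^2 = ||f||_2^2.
A = rng.standard_normal((m, d)) / np.sqrt(m)

ratios = []
for _ in range(50):
    f = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    f[support] = rng.standard_normal(s)
    ratios.append(np.linalg.norm(A @ f) / np.linalg.norm(f))
ratios = np.array(ratios)
# Each squared ratio is a chi-square_m / m variable, tightly concentrated near 1.
```

This samples only a handful of sparse vectors; the RIP statement is the much stronger claim that the bound holds uniformly over *all* $s$-sparse vectors once $m \gtrsim s \log d$.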
$\ell_1$-minimization for sparse recovery²
Let $y = Af$, let $A$ satisfy the Restricted Isometry Property of order $s$, and set
$$\hat{f} = \arg\min_g \|g\|_1 \text{ such that } Ag = y.$$
Then:
1. Exact recovery: if $f$ is $s$-sparse, then $\hat{f} = f$.
2. Stable recovery: if $f$ is approximately $s$-sparse, then $\hat{f} \approx f$. Precisely,
$$\|f - \hat{f}\|_1 \lesssim \sigma_s(f)_1, \qquad \|f - \hat{f}\|_2 \lesssim \frac{1}{\sqrt{s}} \sigma_s(f)_1.$$
² Candès-Romberg-Tao, Donoho
$\ell_1$-minimization for sparse recovery
Similarly, if $B$ is an orthonormal basis in which $f$ is assumed sparse, and $A_B = AB^{-1}$ satisfies the Restricted Isometry Property of order $s$, set
$$\hat{f} = \arg\min_g \|Bg\|_1 \text{ such that } Ag = y,$$
or equivalently,
$$B\hat{f} = \arg\min_h \|h\|_1 \text{ such that } AB^{-1} h = y.$$
Then
$$\|B(f - \hat{f})\|_1 \lesssim \sigma_s(Bf)_1, \qquad \|f - \hat{f}\|_2 \lesssim \frac{1}{\sqrt{s}} \sigma_s(Bf)_1.$$
Total variation minimization for sparse recovery
For $x \in \mathbb{R}^{N \times N}$, the total variation seminorm is $\|x\|_{TV} = \|\nabla x\|_1$. If $x$ has $s$-sparse gradient, from observations $y = Ax$ one solves

Total variation minimization: $\hat{x} = \arg\min_z \|z\|_{TV}$ subject to $Az = Ax$.

Immediate guarantees:
1. $\|\nabla \hat{x} - \nabla x\|_2 \lesssim \frac{1}{\sqrt{s}} \sigma_s(\nabla x)_1$,
2. $\|\nabla \hat{x} - \nabla x\|_1 \lesssim \sigma_s(\nabla x)_1$.
Applying the discrete Sobolev inequality: $\|x - \hat{x}\|_2 \lesssim \sigma_s(\nabla x)_1$.
Can we do better?
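The TV seminorm itself is easy to compute; a sketch with a hand-checkable zero-mean example (we use the anisotropic, within-image-differences convention, one common choice):

```python
import numpy as np

def tv_seminorm(x):
    """Anisotropic total variation ||x||_TV = ||grad x||_1: sum of absolute
    vertical and horizontal first differences, no boundary wrap."""
    return np.abs(np.diff(x, axis=0)).sum() + np.abs(np.diff(x, axis=1)).sum()

# Zero-mean 2x2 image: vertical differences vanish, each horizontal
# difference has magnitude 2.
x = np.array([[1.0, -1.0],
              [1.0, -1.0]])
l2 = np.linalg.norm(x)    # = 2
tv = tv_seminorm(x)       # = 0 + (2 + 2) = 4
```

Here $\|x\|_2 = 2 \le 4 = \|x\|_{TV}$, consistent with the discrete Sobolev inequality for zero-mean images.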
Discrete Sobolev inequalities
Proposition (Sobolev inequality). Let $x \in \mathbb{R}^{N \times N}$ be zero-mean. Then $\|x\|_2 \lesssim \|x\|_{TV}$.
Theorem (Strengthened Sobolev inequality). For all images $x \in \mathbb{R}^{N^2}$ in the null space of a matrix $A : \mathbb{R}^{N^2} \to \mathbb{R}^m$ which, when composed with the bivariate Haar wavelet transform, $AH^{-1} : \mathbb{R}^{N^2} \to \mathbb{R}^m$, has the Restricted Isometry Property of order $2s$, the following holds:
$$\|x\|_2 \lesssim \frac{\log(N^2/s)}{\sqrt{s}} \, \|x\|_{TV}.$$
Consequence for total variation minimization
Suppose that the matrix $A : \mathbb{R}^{N^2} \to \mathbb{R}^m$ is such that, composed with the bivariate Haar wavelet transform, $AH^{-1} : \mathbb{R}^{N^2} \to \mathbb{R}^m$ has the Restricted Isometry Property of order $2s$. From observations $y = Ax$, consider

Total variation minimization: $\hat{x} = \arg\min_z \|z\|_{TV}$ subject to $Az = y$.

Then
1. $\|\nabla \hat{x} - \nabla x\|_2 \lesssim \frac{1}{\sqrt{s}} \sigma_s(\nabla x)_1$,
2. $\|\hat{x} - x\|_{TV} \lesssim \sigma_s(\nabla x)_1$,
3. $\|x - \hat{x}\|_2 \lesssim \frac{\log(N^2/s)}{\sqrt{s}} \sigma_s(\nabla x)_1$.
Corollary (Sobolev inequality for random matrices). With probability $1 - e^{-cm}$, the following holds for all images $x \in \mathbb{R}^{N \times N}$ in the null space of an $m \times N^2$ random Gaussian matrix:
$$\|x\|_2 \lesssim \frac{[\log(N)]^{3/2}}{\sqrt{m}} \, \|x\|_{TV}.$$
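An illustrative sketch of the corollary's setting: draw a Gaussian matrix, extract a null-space "image" via the SVD, and evaluate both sides of the inequality. The hidden constant in $\lesssim$ is unspecified, so we only verify the null-space membership and compare against the right-hand side with the constant dropped:

```python
import numpy as np

rng = np.random.default_rng(1)
N, m = 16, 64
A = rng.standard_normal((m, N * N))        # m x N^2 random Gaussian matrix

# The last rows of V^T form an orthonormal basis for ker(A).
_, _, Vt = np.linalg.svd(A)
x = Vt[m].reshape(N, N)                    # one unit-norm null-space image

l2 = np.linalg.norm(x)                     # = 1 (orthonormal basis vector)
tv = np.abs(np.diff(x, axis=0)).sum() + np.abs(np.diff(x, axis=1)).sum()
rhs = np.log(N) ** 1.5 / np.sqrt(m) * tv   # right-hand side, constant omitted
```

Because a generic null-space vector oscillates, its TV seminorm is large relative to its $\ell_2$ norm, which is exactly what the corollary quantifies.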
Corollary (Sobolev inequality for Fourier constraints). Select $m$ frequency pairs $(k_1, k_2)$ with replacement from the distribution
$$p_{k_1,k_2} \propto \frac{1}{(|k_1| + 1)^2 + (|k_2| + 1)^2}.$$
With probability $1 - e^{-cm}$, the following holds for all images $x \in \mathbb{R}^{N \times N}$ with vanishing Fourier transform at the selected frequencies, $(\mathcal{F}(x))_{k_1,k_2} = 0$:
$$\|x\|_2 \lesssim \frac{[\log(N)]^{5}}{\sqrt{m}} \, \|x\|_{TV}.$$
Sobolev inequalities in action!

[Figure: original MRI image; reconstructions from low frequencies only, from equispaced radial lines, and from random frequencies drawn with $p_{k_1,k_2} \propto (k_1^2 + k_2^2)^{-1/2}$ and $p_{k_1,k_2} \propto (k_1^2 + k_2^2)^{-1}$.]
Sketch of the proof
Theorem (Strengthened Sobolev inequality). For all images $x \in \mathbb{R}^{N^2}$ in the null space of a matrix $A : \mathbb{R}^{N^2} \to \mathbb{R}^m$ which, when composed with the bivariate Haar wavelet transform, $AH^{-1} : \mathbb{R}^{N^2} \to \mathbb{R}^m$, has the Restricted Isometry Property of order $2s$, the following holds:
$$\|x\|_2 \lesssim \frac{\log(N^2/s)}{\sqrt{s}} \, \|x\|_{TV}.$$
Lemmas we use:
1. Denote the ordered bivariate Haar wavelet coefficients of a mean-zero $x \in \mathbb{R}^{N^2}$ by $|c_{(1)}| \ge |c_{(2)}| \ge \cdots \ge |c_{(N^2)}|$. Then³
$$|c_{(k)}| \lesssim \frac{\|x\|_{TV}}{k}.$$
That is, the sequence of Haar coefficients is in weak-$\ell_1$.
2. If $c = c_{S_0} + c_{S_1} + c_{S_2} + \cdots$, where $c_{S_0} = (c_{(1)}, c_{(2)}, \ldots, c_{(s)}, 0, \ldots, 0)$, $c_{S_1} = (0, \ldots, 0, c_{(s+1)}, \ldots, c_{(2s)}, 0, \ldots, 0)$, and so on are $s$-sparse, then
$$\|c_{S_{j+1}}\|_2 \le \frac{1}{\sqrt{s}} \|c_{S_j}\|_1.$$
3. Recall that $AH^{-1} : \mathbb{R}^{N^2} \to \mathbb{R}^m$ having the RIP of order $s$ means that
$$\tfrac{3}{4} \|x\|_2 \le \|AH^{-1} x\|_2 \le \tfrac{5}{4} \|x\|_2 \quad \text{for all } s\text{-sparse } x.$$
³ Cohen, DeVore, Petrushev, Xu, 1999
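The arithmetic behind Lemmas 1 and 2 can be checked directly. For Lemma 1, a weak-$\ell_1$ tail $\sum_{k=s+1}^{n} 1/k$ is at most $\log(n/s)$ by comparison with $\int_s^n dt/t$; for Lemma 2, every entry of $c_{S_{j+1}}$ is bounded by the average magnitude on $c_{S_j}$. A sketch (function names are ours):

```python
import math

def harmonic_tail(s, n):
    """sum_{k=s+1}^{n} 1/k, which is <= log(n/s) by integral comparison.
    This is the tail l1-sum of a weak-l1 sequence |c_(k)| <= 1/k."""
    return sum(1.0 / k for k in range(s + 1, n + 1))

def block_bound_holds(c, s):
    """Check ||c_{S_{j+1}}||_2 <= ||c_{S_j}||_1 / sqrt(s) for the ordered
    s-sparse blocks of c: each entry of the next block is at most the
    average magnitude of the previous block."""
    mags = sorted((abs(t) for t in c), reverse=True)
    blocks = [mags[i:i + s] for i in range(0, len(mags), s)]
    for prev, nxt in zip(blocks, blocks[1:]):
        l2_next = math.sqrt(sum(t * t for t in nxt))
        if l2_next > sum(prev) / math.sqrt(s) + 1e-12:
            return False
    return True
```

These two estimates are exactly what convert the RIP chain on the next slide into the $\|x\|_{TV} \log(N^2/s) / \sqrt{s}$ bound.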
Suppose that $\Psi = AH^{-1} : \mathbb{R}^{N^2} \to \mathbb{R}^m$ has the RIP of order $2s$, and suppose $Ax = 0$. Decompose $c = Hx$ into ordered $s$-sparse blocks, $c = c_{S_0} + c_{S_1} + c_{S_2} + \cdots$. Then $\Psi c = AH^{-1} H x = Ax = 0$, and
$$0 = \|\Psi c\|_2 \ge \|\Psi(c_{S_0} + c_{S_1})\|_2 - \sum_{j \ge 2} \|\Psi c_{S_j}\|_2$$
$$\ge \frac{3}{4} \|c_{S_0} + c_{S_1}\|_2 - \frac{5}{4} \sum_{j \ge 2} \|c_{S_j}\|_2 \qquad (\text{RIP of } \Psi)$$
$$\ge \frac{3}{4} \|c_{S_0}\|_2 - \frac{5/4}{\sqrt{s}} \sum_{j \ge 1} \|c_{S_j}\|_1 \qquad (\text{block trick})$$
$$\gtrsim \frac{3}{4} \|c_{S_0}\|_2 - \frac{5/4}{\sqrt{s}} \|x\|_{TV} \log(N^2/s) \qquad (c \text{ in weak-}\ell_1)$$
So
1. $\|c_{S_0}\|_2 \lesssim \frac{1}{\sqrt{s}} \|x\|_{TV} \log(N^2/s)$,
2. $\|c - c_{S_0}\|_2 \lesssim \frac{1}{\sqrt{s}} \|x\|_{TV}$ ($c$ is in weak-$\ell_1$).
Then
$$\|x\|_2 = \|c\|_2 \le \|c_{S_0}\|_2 + \|c - c_{S_0}\|_2 \lesssim \frac{\log(N^2/s)}{\sqrt{s}} \, \|x\|_{TV}.$$
Extensions 1. Strengthened Sobolev inequalities and total variation minimization guarantees can be extended to multidimensional signals. 2. Strengthened Sobolev inequalities and total variation minimization guarantees are robust to noisy measurements: y = Ax + ξ, with ξ 2 ε. 3. Sobolev inequalities for random subspaces in dimension d = 1? Error guarantees for total variation minimization in dimension d = 1? 4. Can we remove the extra factor of log(n) in the derived Sobolev inequalities? 5. Deterministic constructions of subspaces satisfying strengthened Sobolev inequalities (warning: hard!!!) 6....
References
Needell, Ward. Stable image reconstruction using total variation minimization. SIAM Journal on Imaging Sciences 6.2 (2013): 1035-1058.
Needell, Ward. Near-optimal compressed sensing guarantees for total variation minimization. IEEE Transactions on Image Processing 22.10 (2013): 3941.
Krahmer, Ward. Beyond incoherence: stable and robust sampling strategies for compressive imaging. arXiv preprint arXiv:1210.2380 (2012).