Sparse analysis Lecture VII: Combining geometry and combinatorics, sparse matrices for sparse signal recovery

Anna C. Gilbert, Department of Mathematics, University of Michigan

Sparse signal recovery: a measurement vector of length m ≈ k log(n) is taken from a k-sparse signal of length n.

Problem statement
Assume x has low complexity: x is k-sparse (with noise). Construct a matrix A: R^n → R^m, with m as small as possible, such that given Ax for any signal x ∈ R^n we can quickly recover an approximation x* with
||x − x*||_p ≤ C · min_{k-sparse y} ||x − y||_q.

Parameters
- Number of measurements m
- Recovery time
- Approximation guarantee (norms, mixed)
- One matrix vs. distribution over matrices
- Explicit construction
- Universal matrix (for any basis, after measuring)
- Tolerance to measurement noise

Applications
- Data stream algorithms: x_i = number of items with index i; we can maintain Ax under increments to x and recover an approximation to x (see the sketch below).
- Efficient data sensing: digital cameras, analog-to-digital converters.
- Error-correcting codes: the code is {y ∈ R^n : Ay = 0}; x = error vector, Ax = syndrome.
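
To make the data-stream application concrete, here is a minimal sketch (with illustrative sizes not taken from the lecture) of how the linear sketch Ax is maintained under increments to x: by linearity, an update (i, delta) only requires adding delta times column i of A.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 1000                      # illustrative sizes
A = rng.integers(0, 2, size=(m, n))  # any fixed measurement matrix, e.g. binary

sketch = np.zeros(m)                 # maintained sketch A @ x
x = np.zeros(n)                      # kept here only to check the invariant

for i, delta in [(3, 2.0), (977, -1.0), (3, 1.5)]:   # stream of updates to x
    sketch += delta * A[:, i]        # A(x + delta*e_i) = Ax + delta * A e_i
    x[i] += delta

assert np.allclose(sketch, A @ x)    # the sketch tracks Ax exactly
```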

Two approaches
Geometric [Donoho 04], [Candes-Tao 04, 06], [Candes-Romberg-Tao 05], [Rudelson-Vershynin 06], [Cohen-Dahmen-DeVore 06], and many others:
- Dense recovery matrices (e.g., Gaussian, Fourier)
- Geometric recovery methods (l1 minimization, LP): x* = argmin_z ||z||_1 subject to Φz = Φx
- Uniform guarantee: one matrix A that works for all x
Combinatorial [Gilbert-Guha-Indyk-Kotidis-Muthukrishnan-Strauss 02], [Charikar-Chen-Farach-Colton 02], [Cormode-Muthukrishnan 04], [Gilbert-Strauss-Tropp-Vershynin 06, 07]:
- Sparse random matrices (typically)
- Combinatorial recovery methods or weak, greedy algorithms
- Per-instance guarantees, later uniform guarantees

Prior work: summary

Paper | A/E | Sketch length | Encode time | Column sparsity / update time | Decode time | Approx. error | Noise
[CCFC02, CM06] | E | k log^c n | n log^c n | log^c n | k log^c n | l2 ≤ C l2 |
 | E | k log n | n log n | log n | n log n | l2 ≤ C l2 |
[CM04] | E | k log^c n | n log^c n | log^c n | k log^c n | l1 ≤ C l1 |
 | E | k log n | n log n | log n | n log n | l1 ≤ C l1 |
[CRT06] | A | k log(n/k) | nk log(n/k) | k log(n/k) | LP | l2 ≤ (C/k^{1/2}) l1 | Y
 | A | k log^c n | n log n | k log^c n | LP | l2 ≤ (C/k^{1/2}) l1 | Y
[GSTV06] | A | k log^c n | n log^c n | log^c n | k log^c n | l1 ≤ C log n · l1 | Y
[GSTV07] | A | k log^c n | n log^c n | log^c n | k^2 log^c n | l2 ≤ (C/k^{1/2}) l1 |
[NV07] | A | k log(n/k) | nk log(n/k) | k log(n/k) | nk^2 log^c n | l2 ≤ (C (log n)^{1/2}/k^{1/2}) l1 | Y
 | A | k log^c n | n log n | k log^c n | nk^2 log^c n | l2 ≤ (C (log n)^{1/2}/k^{1/2}) l1 | Y
[GLR08] (k large) | A | k (log n)^{c log log log n} | kn^{1-a} | n^{1-a} | LP | l2 ≤ (C/k^{1/2}) l1 |
This paper | A | k log(n/k) | n log(n/k) | log(n/k) | LP | l1 ≤ C l1 | Y

Recent related work:

Paper | A/E | Sketch length | Encode time | Update time | Decode time | Approx. error | Noise
[DM08] | A | k log(n/k) | nk log(n/k) | k log(n/k) | nk log(n/k) log D | l2 ≤ (C/k^{1/2}) l1 | Y
[NT08] | A | k log(n/k) | nk log(n/k) | k log(n/k) | nk log(n/k) log D | l2 ≤ (C/k^{1/2}) l1 | Y
 | A | k log^c n | n log n | k log^c n | n log n log D | l2 ≤ (C/k^{1/2}) l1 | Y
[IR08] | A | k log(n/k) | n log(n/k) | log(n/k) | n log(n/k) | l1 ≤ C l1 | Y

Important contributions, flurry of activity.

Unify these techniques
Achieve the best of both worlds:
- LP decoding using sparse matrices
- combinatorial decoding (with augmented matrices)
- deterministic (explicit) constructions
What do the combinatorial and geometric approaches share? What makes them work?

Sparse matrices: expander graphs
Take the adjacency matrix A of a d-regular (k, ε) expander graph: a bipartite graph G = (X, Y, E) with |X| = n and |Y| = m such that for any S ⊆ X with |S| ≤ k, the neighbor set satisfies |N(S)| ≥ (1 − ε) d |S|.
Probabilistic construction: d = O(log(n/k)/ε), m = O(k log(n/k)/ε^2).
Deterministic construction: d = 2^{O(log^3(log(n)/ε))}, m = (k/ε) · 2^{O(log^3(log(n)/ε))}.
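
A minimal sketch of the probabilistic construction: pick a random d-left-regular bipartite graph and use its adjacency matrix. The parameters below are illustrative (not tuned to a particular ε); such a graph is a (k, ε)-expander with high probability.

```python
import numpy as np

def sparse_binary_matrix(m, n, d, rng):
    """Adjacency matrix of a random d-left-regular bipartite graph:
    each of the n left vertices (columns) is joined to d of the m
    right vertices (rows) chosen uniformly at random."""
    A = np.zeros((m, n), dtype=int)
    for col in range(n):
        A[rng.choice(m, size=d, replace=False), col] = 1
    return A

# Illustrative parameters: d ~ log(n/k), m ~ k log(n/k) up to constants.
rng = np.random.default_rng(0)
n, k = 1024, 10
d = int(np.ceil(np.log2(n / k)))
m = 4 * k * d
Psi = sparse_binary_matrix(m, n, d, rng)
```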

Bipartite graph and its adjacency matrix
[Figure: a small bipartite graph with its 0/1 adjacency matrix used as the measurement matrix, alongside a larger example of a sparse binary measurement matrix.]

RIP(p)
A measurement matrix A satisfies the RIP(p, k, δ) property if for any k-sparse vector x,
(1 − δ) ||x||_p ≤ ||Ax||_p ≤ (1 + δ) ||x||_p.

RIP(p) and expanders
Theorem. (k, ε) expansion implies (1 − 2ε) d ||x||_1 ≤ ||Ax||_1 ≤ d ||x||_1 for any k-sparse x. More generally, we get RIP(p) for 1 ≤ p ≤ 1 + 1/log n.
Theorem. RIP(1) for a binary sparse matrix implies a (k, ε) expander with ε = (1 − 1/(1 + δ))/2.
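
A quick empirical check of the first theorem (a sketch with arbitrary parameters, reusing the sparse_binary_matrix helper from the expander sketch above): for random k-sparse x, the ratio ||Ax||_1 / (d ||x||_1) should lie roughly in [1 − 2ε, 1].

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 1024, 10
d = int(np.ceil(np.log2(n / k)))
m = 4 * k * d
A = sparse_binary_matrix(m, n, d, rng)   # helper from the sketch above

ratios = []
for _ in range(500):
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    ratios.append(np.linalg.norm(A @ x, 1) / (d * np.linalg.norm(x, 1)))

# Empirically the ratio is at most 1 and bounded below by roughly 1 - 2*eps.
print(min(ratios), max(ratios))
```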

Expansion implies LP decoding
Theorem. Let Φ be the adjacency matrix of a (2k, ε) expander. Consider two vectors x, x* such that Φx = Φx* and ||x*||_1 ≤ ||x||_1. Then
||x − x*||_1 ≤ 2/(1 − 2α(ε)) · ||x − x_k||_1,
where x_k is the optimal k-term representation for x and α(ε) = 2ε/(1 − 2ε).
This guarantees that the linear program recovers a good sparse approximation; it is robust to noisy measurements too.
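
A minimal sketch of LP decoding with a sparse binary matrix, via scipy.optimize.linprog. The variable splitting (z, t) with −t ≤ z ≤ t is a standard reformulation of l1 minimization; the example reuses Psi from the expander sketch above, and all sizes are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(Phi, b):
    """Solve x* = argmin ||z||_1 subject to Phi z = b as a linear program
    in the variables (z, t): minimize sum(t) with -t <= z <= t, Phi z = b."""
    m, n = Phi.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])
    A_ub = np.block([[ np.eye(n), -np.eye(n)],    #  z - t <= 0
                     [-np.eye(n), -np.eye(n)]])   # -z - t <= 0
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([Phi, np.zeros((m, n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                  bounds=(None, None), method="highs")
    return res.x[:n]

# Example: recover a 10-sparse x from Psi @ x (Psi from the expander sketch).
rng = np.random.default_rng(3)
x = np.zeros(Psi.shape[1])
x[rng.choice(Psi.shape[1], 10, replace=False)] = rng.standard_normal(10)
x_star = l1_decode(Psi, Psi @ x)
print(np.linalg.norm(x - x_star, 1))
```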

RIP(1) vs. RIP(2)
Any binary sparse matrix which satisfies RIP(2) must have Ω(k^2) rows [Chandar 07].
A (scaled) Gaussian random matrix G with m = O(k log(n/k)) rows satisfies RIP(2) but not RIP(1): take
x^T = (0, ..., 0, 1, 0, ..., 0),  y^T = (1/k, ..., 1/k, 0, ..., 0).
Then ||x||_1 = ||y||_1, but ||Gx||_1 ≈ √k · ||Gy||_1.
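
A numerical check of this example (a sketch; the sizes are arbitrary): for a scaled Gaussian matrix, the two vectors x and y have the same l1 norm but are measured very differently in the l1 norm, with ratio close to sqrt(k).

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 1000, 100, 300
G = rng.standard_normal((m, n)) / np.sqrt(m)   # scaled Gaussian (RIP(2)-type)

x = np.zeros(n); x[0] = 1.0        # 1-sparse, ||x||_1 = 1
y = np.zeros(n); y[:k] = 1.0 / k   # k-sparse, ||y||_1 = 1

ratio = np.linalg.norm(G @ x, 1) / np.linalg.norm(G @ y, 1)
print(ratio, np.sqrt(k))           # ratio is close to sqrt(k), so RIP(1) fails
```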

Expansion implies RIP(1)
Theorem. (k, ε) expansion implies (1 − 2ε) d ||x||_1 ≤ ||Ax||_1 ≤ d ||x||_1 for any k-sparse x.
Proof sketch. Take any k-sparse x and let S be its support.
- Upper bound: ||Ax||_1 ≤ d ||x||_1 holds for any x, since each column has d ones.
- Lower bound: by expansion, most right neighbors are unique; if all neighbors were unique we would have ||Ax||_1 = d ||x||_1, and the argument can be made robust.
The generalization to RIP(p) is similar, but the upper bound is not trivial.

RIP(1) implies LP decoding
l1 uncertainty principle. Lemma. Let y satisfy Ay = 0, and let S be the set of the k largest coordinates of y. Then ||y_S||_1 ≤ α(ε) ||y||_1.
LP guarantee. Theorem. Consider any two vectors u, v such that for y = u − v we have Ay = 0 and ||v||_1 ≤ ||u||_1. Let S be the set of the k largest entries of u. Then ||y||_1 ≤ 2/(1 − 2α(ε)) · ||u_{S^c}||_1.

l1 uncertainty principle
Proof (sketch). Let S_0 = S, S_1, ... be coordinate sets of size k, in decreasing order of the magnitudes of y, and let A' be A restricted to the rows in N(S).
On the one hand, expansion applied to the k-sparse vector y_S gives ||A' y_S||_1 ≥ (1 − 2ε) d ||y_S||_1.
On the other hand, since Ay = 0,
0 = ||A' y||_1 ≥ ||A' y_S||_1 − Σ_{l ≥ 1} Σ_{(i,j) ∈ E[S_l : N(S)]} |y_i|
  ≥ (1 − 2ε) d ||y_S||_1 − Σ_{l ≥ 1} |E[S_l : N(S)]| · (1/k) ||y_{S_{l−1}}||_1
  ≥ (1 − 2ε) d ||y_S||_1 − 2εdk · Σ_{l ≥ 1} (1/k) ||y_{S_{l−1}}||_1
  ≥ (1 − 2ε) d ||y_S||_1 − 2εd ||y||_1.
Rearranging gives ||y_S||_1 ≤ (2ε/(1 − 2ε)) ||y||_1 = α(ε) ||y||_1.

Sparse matrices and combinatorial decoding
Combinatorial decoding: bit testing. The columns of the bit-test matrix B are the binary expansions of the column indices (rows ordered MSB to LSB), so for a 1-sparse indicator 1_s the product B · 1_s spells out the location of the spike in binary. For example, with n = 8:
B = [ 0 0 0 0 1 1 1 1
      0 0 1 1 0 0 1 1
      0 1 0 1 0 1 0 1 ],   1_s = (0 0 1 0 0 0 0 0)^T,   B · 1_s = (0 1 0)^T,
i.e. the spike is at location 2 (binary 010).
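
A small sketch of a bit-test matrix and 1-sparse decoding. Two conventions here are mine, not the lecture's: the bit rows are ordered LSB-first (opposite of the MSB-first display above), and an all-ones row is prepended so the spike's value can be read off as well.

```python
import numpy as np

def bit_test_matrix(n):
    """Row 0 is all ones; row 1+b holds bit b (LSB-first) of each column
    index, so a spike of value v at index i is measured as
    (v, v*bit_0(i), v*bit_1(i), ...)."""
    nbits = int(np.ceil(np.log2(n)))
    B = np.zeros((nbits + 1, n), dtype=int)
    B[0, :] = 1
    for i in range(n):
        for b in range(nbits):
            B[1 + b, i] = (i >> b) & 1
    return B

n = 8
B = bit_test_matrix(n)
x = np.zeros(n); x[6] = 5.0          # spike of value 5 at index 6 (binary 110)
meas = B @ x
value = meas[0]
index = sum(int(round(b)) << t for t, b in enumerate(meas[1:] / value))
print(index, value)                  # 6, 5.0
```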

Augmented expander: isolation matrix with bit testing
Ψ is the adjacency matrix of a (k, 1/8)-expander with degree d and m rows (the isolation matrix). Augmenting each row of Ψ with bit tests gives the measurement matrix A = Ψ ⊗_r B_1, which has m log n rows.
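
A sketch of the augmentation as a row tensor product, assuming (as in the bit-test sketch above) that B_1 is the bit-test matrix with an all-ones row prepended: every row ψ of Ψ is replaced by the block of rows {ψ ∘ b : b a row of B_1}, giving m (log n + 1) rows in total.

```python
import numpy as np

def row_tensor(Psi, B1):
    """Row tensor product Psi (x)_r B1: for every row of Psi, stack its
    entrywise products with each row of B1."""
    return np.vstack([Psi[j] * B1[t]
                      for j in range(Psi.shape[0])
                      for t in range(B1.shape[0])])

# A = row_tensor(Psi, bit_test_matrix(Psi.shape[1]))   # the augmented matrix
```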

Combinatorial decoding

Recover(A, y) {
  if y = 0 then return 0
  z = Reduce(A, y)
  w = Recover(A, y - Az)
  return z + w
}

Reduce(A, y) {
  for j = 1 ... m
    compute (i, val) such that if supp(φ_j) ∩ supp(x) = {i} then x_i = val
    if val ≠ 0 then votes[i] = votes[i] ∪ {val}
  w = 0
  for i = 1 ... n such that votes[i] ≠ ∅
    if votes[i] contains > d/2 copies of val then w_i = val
  return w
}
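
Below is a hedged Python rendering of Reduce/Recover for exactly k-sparse signals and exact measurements. It reuses the sparse_binary_matrix, bit_test_matrix, and row_tensor helpers from the earlier sketches, so the bit ordering and the all-ones row are the conventions assumed there, not necessarily those of the lecture.

```python
import numpy as np

def reduce_step(Psi, B1, y, d):
    """One voting pass: every expander row whose bit tests are consistent
    with a single spike casts a vote (index, value); keep an entry only if
    its index collects more than d/2 identical votes."""
    m, n = Psi.shape
    r = B1.shape[0]                          # 1 + number of bits per row block
    votes = {}
    for j in range(m):
        block = y[j * r:(j + 1) * r]
        total = block[0]
        if np.isclose(total, 0.0):
            continue
        bits = block[1:] / total
        if not np.all(np.isclose(bits, 0.0) | np.isclose(bits, 1.0)):
            continue                         # row j touched several spikes
        i = sum(int(round(b)) << t for t, b in enumerate(bits))
        if i >= n:
            continue                         # inconsistent vote, discard
        votes.setdefault(i, []).append(total)
    w = np.zeros(n)
    for i, vals in votes.items():
        val = max(set(vals), key=vals.count)
        if vals.count(val) > d / 2:
            w[i] = val
    return w

def recover(Psi, B1, y, d, max_iter=30):
    """Iterative version of the recursive Recover: peel off the voted
    estimate and repeat on the residual until the measurements are zero."""
    A = row_tensor(Psi, B1)
    x_hat = np.zeros(Psi.shape[1])
    for _ in range(max_iter):
        if np.allclose(y, 0.0):
            break
        w = reduce_step(Psi, B1, y, d)
        if not w.any():
            break
        x_hat += w
        y = y - A @ w
    return x_hat

# Usage, with helpers from the earlier sketches:
# Psi = sparse_binary_matrix(m, n, d, rng); B1 = bit_test_matrix(n)
# x_hat = recover(Psi, B1, row_tensor(Psi, B1) @ x, d)
```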

Combinatorial decoding
Votes from the bit tests can be good (cast by a row that isolates a true spike) or bad (cast by a row where several spikes collide). Retain {index, val} only if the index receives > d/2 votes. Having d/2 + d/2 + d/2 = 3d/2 bad votes concentrated this way would violate the expander property, so each set of d/2 incorrect votes gives at most 2 incorrect indices; the number of incorrect indices decreases by a factor of 2 in each iteration.

Augmented expander implies combinatorial decoding
Theorem. Let Ψ be a (k, 1/8)-expander and A = Ψ ⊗_r B_1 with m log n rows. Then, for any k-sparse x, given Ax, we can recover x in time O(m log^2 n).
With an additional hash matrix and polylog(n) more rows in structured matrices, we can approximately recover all x in time O(k^2 log^{O(1)} n) with the same error guarantees as LP decoding.
The expander is the central element in [Indyk 08], [Gilbert-Strauss-Tropp-Vershynin 06, 07].

Empirical results
[Figure: probability of exact recovery as a function of (δ, ρ), for signed signals and for positive signals.]
Performance is comparable to dense LP decoding. Image reconstruction (TV/LP wavelets), running times, and error bounds are available in [Berinde, Indyk 08].

Summary: structural results
Geometric approach: RIP(2) matrices, linear programming decoding.
Combinatorial approach: RIP(1) matrices, weak greedy decoding.

More specifically: a sparse binary RIP(1) matrix is an expander, with explicit constructions achieving m = k · 2^{(log log n)^{O(1)}}. Augmented with a second hash matrix (needed only for noise) and a bit tester, it supports:
- LP decoding (fast update time, sparse matrix)
- Combinatorial decoding (fast update time, fast recovery time, sparse matrix)