CS on CS: Computer Science Insights into Compressive Sensing (and vice versa). Piotr Indyk, MIT

Sparse Approximations
Goal: approximate a high-dimensional vector x by a vector x* that is sparse, i.e., has few nonzero elements.
Formally: for an n-dimensional vector x, find x* such that x* is k-sparse, i.e., ||x*||_0 ≤ k, and
  ||x - x*||_p ≤ C · Err_k^p(x),  where  Err_k^p(x) = min_{k-sparse x'} ||x - x'||_p.
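
For concreteness, the best k-sparse approximation error is easy to compute directly: keep the k largest-magnitude entries of x and measure the norm of what is left. A minimal numpy sketch (function names are illustrative, not from the talk):

```python
import numpy as np

def best_k_sparse(x, k):
    """Return the best k-sparse approximation of x:
    keep the k entries of largest magnitude, zero out the rest."""
    xk = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]        # indices of the k largest-magnitude entries
    xk[idx] = x[idx]
    return xk

def err_k(x, k, p=1):
    """Err_k^p(x) = min over k-sparse x' of ||x - x'||_p,
    attained by keeping the k largest-magnitude entries."""
    return np.linalg.norm(x - best_k_sparse(x, k), ord=p)

x = np.array([10.0, -7.0, 0.3, 0.1, -0.2, 0.05])
print(best_k_sparse(x, 2))   # [10., -7., 0., 0., 0., 0.]
print(err_k(x, 2, p=1))      # 0.65
```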

Applications of sparse approximations: compression (image, audio, video, ...); heavy hitters, trends, etc.

Compressed Sensing [Candes-Tao; Candes-Romberg-Tao; Donoho]
A.k.a. compressive sensing or compressive sampling. A signal acquisition/processing framework:
- Want to acquire a signal x = [x_1 ... x_n].
- Acquisition proceeds by computing Ax + noise, of dimension m << n. We ignore measurement noise in this talk, but note that x does not have to be k-sparse.
- From Ax we want to recover an approximation x* of x.
- Method: solve the program "minimize ||x*||_1 subject to Ax* = Ax" (other methods are available, more later).
- Guarantee: for some C > 1,
  (L1/L1)  ||x - x*||_1 ≤ C · min_{k-sparse x'} ||x - x'||_1
  (L2/L1)  ||x - x*||_2 ≤ C · min_{k-sparse x'} ||x - x'||_1 / k^{1/2}
  using m = O(k log(n/k)) measurements.
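
The L1-minimization program above is a linear program, so a toy version can be solved with an off-the-shelf LP solver. A minimal sketch using numpy and scipy (a Gaussian measurement matrix stands in for A; this is an illustrative reformulation, not the recovery code used in the talk):

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(A, y):
    """Solve  min ||x||_1  s.t.  A x = y  as an LP over z = [x; t]:
    minimize sum(t)  s.t.  -t <= x <= t,  A x = y."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])          # objective: sum of t
    # inequality constraints:  x - t <= 0  and  -x - t <= 0
    A_ub = np.block([[ np.eye(n), -np.eye(n)],
                     [-np.eye(n), -np.eye(n)]])
    b_ub = np.zeros(2 * n)
    # equality constraints:  A x = y
    A_eq = np.hstack([A, np.zeros((m, n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

rng = np.random.default_rng(0)
n, k = 50, 3
m = 4 * k * int(np.log(n / k))                              # m = O(k log(n/k))
x = np.zeros(n); x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_star = l1_recover(A, A @ x)
print(np.linalg.norm(x - x_star, 1))                        # near 0 for exactly k-sparse x
```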

Application: single-pixel camera [Wakin, Laska, Duarte, Baron, Sarvotham, Takhar, Kelly, Baraniuk 06, ...]
Measurement: the image x is reflected by a mirror array a (pixels randomly off and on); the reflected rays are aggregated using a lens, so the sensor receives the inner product a·x. The measurement process is repeated m times, so the sensor receives Ax + noise. We then want to recover the image from the measurements. Many other architectures exist, e.g., [Fergus-Torralba-Freeman], [Uttam-Goodman-Neifeld-Kim-John-Kim-Brady], etc.

Connections: approximation theory; screening experiments/group testing; statistical model selection (L1 minimization); estimating Fourier coefficients; linear sketching (Andrew's talk); finite rate of innovation.

CS on CS insights
Computer science on compressive sensing:
- Efficient (sub-linear) algorithms: sparse matrices/hashing, Fourier sampling
- Explicit constructions of matrices
Compressive sensing on computer science:
- Fast Johnson-Lindenstrauss Transform via the RIP property [Ailon-Liberty 11, Krahmer-Ward 11]
- Other: breaking privacy [Dwork-McSherry-Talwar 07]

Compressive sensing techniques (in one slide)
A matrix A satisfies the RIP of order k with constant δ if for any k-sparse vector ξ we have
  (1-δ) ||ξ||_2 ≤ ||Aξ||_2 ≤ (1+δ) ||ξ||_2.
Theorem: the (O(k), δ)-RIP implies that if x* is a solution to the L1 minimization problem, then
  L2/L1:  ||x - x*||_2 ≤ C · min_{k-sparse x'} ||x - x'||_1 / k^{1/2}.
Theorem: if each entry of A is i.i.d. N(0,1) and m = Θ(k log(n/k)), then A satisfies the (k, δ)-RIP for some δ w.h.p. Proof: via the Johnson-Lindenstrauss lemma.
Theorem: if each row of A is chosen i.i.d. from the rows of the Fourier (or Hadamard) matrix, and m = Θ(k log^c n), then A satisfies the (k, δ)-RIP w.h.p.
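
To make the Gaussian construction concrete, here is a small numpy experiment (illustrative only: it samples random k-sparse directions, whereas the RIP quantifies over all k-sparse vectors, which is much harder to verify): draw A with i.i.d. N(0, 1/m) entries and m ≈ k log(n/k) rows, and measure how well it preserves the norm of random k-sparse unit vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 1000, 10
m = int(4 * k * np.log(n / k))                # m = Theta(k log(n/k))
A = rng.standard_normal((m, n)) / np.sqrt(m)  # i.i.d. N(0, 1/m) entries

# empirical distortion on random k-sparse unit vectors
ratios = []
for _ in range(1000):
    xi = np.zeros(n)
    idx = rng.choice(n, k, replace=False)
    xi[idx] = rng.standard_normal(k)
    xi /= np.linalg.norm(xi)
    ratios.append(np.linalg.norm(A @ xi))     # should lie in [1-delta, 1+delta]
print(min(ratios), max(ratios))
```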

Performance

Paper             | R/D | Sketch length | Encode time  | Column sparsity | Recovery time | Approx
[CRT 04], [RV 05] | D   | k log(n/k)    | nk log(n/k)  | k log(n/k)      | n^c           | l2/l1
                  | D   | k log^c n     | n log n      | k log^c n       | n^c           | l2/l1

D = deterministic guarantee (one matrix A works for all x); R = randomized.
Scale: Excellent / Very Good / Good / Fair.

Linear sketching via sparse binary matrices
Matrix view: a B x n matrix A with one 1 (or ±1) per column; the i-th column has its 1 at position h(i), where h(i) is chosen uniformly at random from {1, ..., B}.
Hashing view: h hashes coordinates into buckets (Ax)_1, ..., (Ax)_B, and Ax adds up the coordinates in each bucket. Large coefficients do not collide with constant probability if B = O(k). Repeat log n times.
[Figure: a 0-1 matrix hashing coordinate x_i into one of the buckets (Ax)_1, ..., (Ax)_B.]
References: Charikar-Chen-Farach-Colton 02, Estan-Varghese 03, Cormode-Muthukrishnan 04, Gilbert-Strauss-Vershynin-Tropp 06, Berinde-Gilbert-Indyk-Karloff-Strauss 08, Sarvotham-Baron-Baraniuk 06, 08, Lu-Montanari-Prabhakar 08, Akcakaya-Tarokh 11.
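
A minimal numpy sketch of the hashing view, in the style of Count-Sketch [Charikar-Chen-Farach-Colton 02]: hash each coordinate into one of B = O(k) buckets with a random ±1 sign, repeat O(log n) times, and estimate each coordinate by a median over repetitions (variable and helper names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 10000, 5
B, reps = 8 * k, 9                       # B = O(k) buckets, O(log n) repetitions

x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k) * 10

# one hash function h and one sign function s per repetition
h = rng.integers(0, B, size=(reps, n))   # h[r, i]: bucket of coordinate i
s = rng.choice([-1, 1], size=(reps, n))  # s[r, i]: sign of coordinate i

# sketch: (Ax)_b = sum of s[r, i] * x_i over coordinates i hashed to bucket b
sketch = np.zeros((reps, B))
for r in range(reps):
    np.add.at(sketch[r], h[r], s[r] * x)

# estimate a coordinate as the median over repetitions of its (signed) bucket value
def estimate(i):
    return np.median(s[:, i] * sketch[np.arange(reps), h[:, i]])

big = np.argsort(np.abs(x))[-k:]
print([(int(i), round(x[i], 2), round(estimate(i), 2)) for i in big])
```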

Performance

Paper                                       | R/D | Sketch length         | Encode time | Column sparsity | Recovery time     | Approx
[CCF 02], [CM 06]                           | R   | k log n               | n log n     | log n           | n log n           | l2/l2
                                            | R   | k log^c n             | n log^c n   | log^c n         | k log^c n         | l2/l2
[CM 04]                                     | R   | k log n               | n log n     | log n           | n log n           | l1/l1
                                            | R   | k log^c n             | n log^c n   | log^c n         | k log^c n         | l1/l1
[CRT 04], [RV 05]                           | D   | k log(n/k)            | nk log(n/k) | k log(n/k)      | n^c               | l2/l1
                                            | D   | k log^c n             | n log n     | k log^c n       | n^c               | l2/l1
[GSTV 06]                                   | D   | k log^c n             | n log^c n   | log^c n         | k log^c n         | l1/l1
[GSTV 07]                                   | D   | k log^c n             | n log^c n   | k log^c n       | k^2 log^c n       | l2/l1
[BGIKS 08]                                  | D   | k log(n/k)            | n log(n/k)  | log(n/k)        | n^c               | l1/l1
[GLR 08]                                    | D   | k log n · logloglog n | kn^{1-a}    | n^{1-a}         | n^c               | l2/l1
[NV 07], [DM 08], [NT 08], [BD 08], [GK 09] | D   | k log(n/k)            | nk log(n/k) | k log(n/k)      | nk log(n/k) · log | l2/l1
                                            | D   | k log^c n             | n log n     | k log^c n       | n log n · log     | l2/l1
[IR 08], [BIR 08], [BI 09]                  | D   | k log(n/k)            | n log(n/k)  | log(n/k)        | n log(n/k) · log  | l1/l1
[GLPS 10]                                   | R   | k log(n/k)            | n log^c n   | log^c n         | k log^c n         | l2/l2

Open problem 1: a deterministic (D) scheme with k log(n/k) sketch length, n polylog n recovery time, and l2/l1 approximation.

Dense vs. sparse
Restricted Isometry Property (RIP): for any k-sparse ξ, ||ξ||_2 ≤ ||Aξ||_2 ≤ C ||ξ||_2.
Consider m x n 0-1 matrices with d ones per column. Do they satisfy RIP? No, unless m = Ω(k^2) [Chandar 07].
However, they can satisfy the following RIP-1 property [Berinde-Gilbert-Indyk-Karloff-Strauss 08]: for any k-sparse ξ,
  d(1-ε) ||ξ||_1 ≤ ||Aξ||_1 ≤ d ||ξ||_1.
Sufficient (and necessary) condition: the underlying graph is a (k, d(1-ε/2))-expander.
Implies the L1/L1 guarantee for L1 minimization (and other algorithms).

Expanders
A bipartite graph with n left vertices, m right vertices, and left degree d is a (k, d(1-ε))-expander if for any left set S with |S| ≤ k we have |N(S)| ≥ (1-ε) d |S|.
[Figure: bipartite graph with a left set S of degree d and its neighborhood N(S).]
Expanders are well-studied objects in theoretical computer science and coding theory. The probabilistic construction yields m = O(k log(n/k)).
RIP-1 yields the [BGIKS 08] row of the table: D, k log(n/k) sketch length, n log(n/k) encode time, log(n/k) column sparsity, n^c recovery time, l1/l1.
Coding theory (LDPC) handles the exactly k-sparse case [Xu-Hassibi 07, Indyk 08, Jafarpour-Xu-Hassibi-Calderbank 08].
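
The probabilistic construction is simply: give each left vertex d random right neighbors. A small numpy sketch of the corresponding sparse 0-1 measurement matrix, with an empirical check of the RIP-1 bounds on random k-sparse vectors (illustrative only; the actual guarantee quantifies over all k-sparse vectors):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, d = 2000, 10, 8
m = int(4 * k * np.log(n / k))               # m = O(k log(n/k)) right vertices

# adjacency matrix of a random left-d-regular bipartite graph:
# column i has ones at the d random neighbors of left vertex i
A = np.zeros((m, n))
for i in range(n):
    A[rng.choice(m, d, replace=False), i] = 1

# RIP-1 check: d(1-eps) ||xi||_1 <= ||A xi||_1 <= d ||xi||_1 for k-sparse xi
ratios = []
for _ in range(1000):
    xi = np.zeros(n)
    xi[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    ratios.append(np.linalg.norm(A @ xi, 1) / (d * np.linalg.norm(xi, 1)))
print(min(ratios), max(ratios))              # within [1-eps, 1] for a good expander
```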

Explicit constructions
Explicit constructions of expanders achieve m = k · quasipolylog n [..., Guruswami-Umans-Vadhan 07], which yields a matrix with the RIP-1 property and k · quasipolylog n rows.
Alternate approach: explicit Lp sections of L1 [Guruswami-Lee-Razborov 08; Karnin 11]: a stronger guarantee (Lp/L1), but a higher number of measurements.
Recent: explicit construction of an RIP-2 matrix with m = k^{2-ε} [Bourgain-Dilworth-Ford-Konyagin-Kutzarova 11].
Open problem 2: better bounds?

CS on CS insights
- Efficient (sub-linear) algorithms: sparse matrices/hashing, Fourier sampling
- Explicit constructions of matrices
- Fast Johnson-Lindenstrauss Transform via the RIP property [Ailon-Liberty 11, Krahmer-Ward 11]

Fourier sampling
(From the performance table: D | k log^c n sketch length | n log n encode time | k log^c n column sparsity | n^c recovery time | l2/l1.)
Θ(k log^c n) Fourier measurements of x suffice to recover a k-sparse approximation to x. Equivalently, one can recover a k-sparse approximation of x in the Fourier basis from Θ(k log^c n) samples of x.
Fourier sampling algorithms: Boolean cube (Hadamard transform): [KM] (cf. [GL]); complex FT: [Mansour 92, GGIMS 02, AGS 03, GMS 05, Iwen 10, Akavia 10, HIKP 12]. Best time/sample bound [HIKP 12]: O(k log(n) log(n/k)).
Open problem 3: an O(k log n) time and/or sample bound.
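
The starting point for many Fourier sampling algorithms is the exactly 1-sparse case: if x has a single frequency, two time samples already reveal it through their phase ratio. A toy numpy illustration of this idea (not the [HIKP 12] algorithm, which combines such estimates with hashing of frequencies into buckets):

```python
import numpy as np

n = 1024
f_true, a = 137, 2.0 + 1.0j                    # one nonzero Fourier coefficient
t = np.arange(n)
x = a * np.exp(2j * np.pi * f_true * t / n)    # x_t = a * e^{2*pi*i*f*t/n}

# two samples suffice: x_1 / x_0 = e^{2*pi*i*f/n} encodes the frequency f
phase = np.angle(x[1] / x[0])
f_est = int(round(phase * n / (2 * np.pi))) % n
a_est = x[0]                                   # the coefficient itself (t = 0 sample)
print(f_est, a_est)                            # 137, (2+1j)
```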

CS on CS insights
- Efficient (sub-linear) algorithms: sparse matrices/hashing, Fourier sampling
- Explicit constructions of matrices
- Fast Johnson-Lindenstrauss Transform via the RIP property [Ailon-Liberty 11, Krahmer-Ward 11]

JL lemma
Example statement: let A be a k x n matrix with i.i.d. entries chosen uniformly at random from {-1, 1}. Then, for any x ∈ R^n, we have
  Pr[ | ||Ax||_2^2 - k ||x||_2^2 | > ε k ||x||_2^2 ] ≤ exp(-c ε^2 k).
A powerful hammer. The time to compute Ax: O(nk).
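
A quick numpy check of this concentration (illustrative only): embed random unit vectors from R^n into R^k with a random ±1 matrix and look at the distribution of ||Ax||_2^2 / k.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 10000, 200
A = rng.choice([-1, 1], size=(k, n))       # i.i.d. +/-1 entries

vals = []
for _ in range(200):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)                 # unit vector, so k * ||x||_2^2 = k
    vals.append(np.dot(A @ x, A @ x) / k)  # ||Ax||_2^2 / k, concentrated near 1
print(np.mean(vals), np.std(vals))         # mean ~ 1, small standard deviation
```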

Fast Johnson-Lindenstrauss Transform [Ailon-Chazelle 06]
A structured matrix A with guarantees as before; the time to compute Ax is O(n log n) + k^c ([AC] showed c = 3; several improvements since).
Idea: compute Ax = P H D x, where:
- D: diagonal matrix with D_ii ∈ {-1, 1} i.i.d.
- H: normalized Hadamard matrix, entries in {-1/n^{1/2}, 1/n^{1/2}}, columns orthogonal
- P: a sparse projection matrix (several options possible)
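
A minimal numpy sketch of the H D x part, which is where the O(n log n) speed comes from: multiply by random signs, apply the Hadamard transform via the O(n log n) butterfly recursion, then project. For simplicity the projection P is taken here to be random row sampling, which is one possible choice (and is exactly the subsampled-Hadamard RIP matrix used on the next slide).

```python
import numpy as np

def fwht(v):
    """Fast Walsh-Hadamard transform, O(n log n); len(v) must be a power of 2."""
    v = v.copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v

rng = np.random.default_rng(5)
n, k = 1024, 64
x = rng.standard_normal(n)

D = rng.choice([-1, 1], size=n)             # random diagonal signs
HDx = fwht(D * x) / np.sqrt(n)              # normalized Hadamard transform of Dx
rows = rng.choice(n, k, replace=False)      # P: random row sampling (one simple option)
Ax = np.sqrt(n / k) * HDx[rows]             # rescale so ||Ax||_2 ~ ||x||_2

print(np.linalg.norm(x), np.linalg.norm(Ax))   # comparable norms
```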

RIP implies JL [Ailon-Liberty 11, Krahmer-Ward 11]
Theorem: Consider A = B D, where B has the RIP property of order k (e.g., one can choose B = random rows from H) and D is a random ±1 diagonal matrix. Then A satisfies the JL lemma with error probability at most exp(-k).
Running time: O(n log n). But the dimension is O(k log^{O(1)} n), as opposed to O(k).
Open problem 4: a JL transform with O(n log^{O(1)} n) time and dimension O(k) (this would actually solve open problem 1 as well!).

Conclusions
Many insights in both directions; many interesting open problems.
Some resources:
- Compressive sensing: J. A. Tropp, S. J. Wright, "Computational Methods for Sparse Solution of Linear Inverse Problems", Proceedings of the IEEE, June 2010.
- Sparse (+ explicit) matrices: A. Gilbert, P. Indyk, "Sparse Recovery Using Sparse Matrices", Proceedings of the IEEE, June 2010.
- Fast JL transform: N. Ailon, B. Chazelle, "Faster Dimension Reduction", CACM, 2010.