Lecture 22: More On Compressed Sensing

Scribed by Eric Lee, Chengrun Yang, and Sebastian Ament
Nov. 2, 2017

1 Recap and Introduction

Basis pursuit was the method of recovering the sparsest solution to an underdetermined linear system, i.e.

$$\min_x \|x\|_1 \quad \text{subject to} \quad \|Ax - b\| < \epsilon.$$

Compressed sensing is related to basis pursuit, but stems from a different context, in which there is some underlying signal or function $f \in \mathbb{R}^n$ that we cannot observe. What we can observe is some set of linear observations

$$\{y_i\}_{i=1}^m, \qquad y_i = \psi_i^T f, \tag{1}$$

with $\psi_i \in \mathbb{R}^n$. Compressed sensing asks whether there is some minimal number of observations that must be made, i.e. it gives bounds on the size of $m$.

Let $\Psi = [\psi_1 \ \psi_2 \ \cdots \ \psi_m] \in \mathbb{R}^{n \times m}$. If $m = n$, we can set up the system of equations

$$\Psi^T f = y \tag{2}$$

with

$$\Psi^T = \begin{bmatrix} \psi_1^T \\ \psi_2^T \\ \vdots \\ \psi_m^T \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}.$$

If $\Psi$ is invertible, the solution is trivial (given by inverting $\Psi$). However, we are interested in some $m < n$ such that exact recovery is still possible. For general problems this is not possible; we are interested in problems with a bit more structure to them.
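As a small numerical illustration of the measurement model in (1) and (2), the following sketch (assuming only numpy; the random orthogonal measurement basis and the sizes $n$ and $m$ are arbitrary choices for illustration) forms the observations $y = \Psi^T f$ and verifies that recovery is trivial when $m = n$.

    import numpy as np

    n, m = 64, 16
    rng = np.random.default_rng(0)

    f = rng.standard_normal(n)        # the underlying signal f that we cannot observe directly

    # Measurement vectors psi_i: here, m columns of a random orthogonal matrix (illustrative choice).
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    Psi = Q[:, :m]                    # Psi in R^{n x m}

    y = Psi.T @ f                     # y_i = psi_i^T f, as in (1)

    # With m = n (i.e. Psi = Q, orthogonal), the system (2) is trivially invertible: f = Q y.
    assert np.allclose(Q @ (Q.T @ f), f)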

2 Mutual Coherence

To be more precise, we assume that $f$ is data-sparse in some basis $\Phi$: $f = \Phi x$, where $\Phi \in \mathbb{R}^{n \times n}$ is assumed to be orthogonal and $x \in \mathbb{R}^n$ is sparse. This basis $\Phi$ may be derived from some physical problem or model. Then, expanding $f$ in (2), we observe that this is equivalent to solving

$$\Psi^T \Phi x = y. \tag{3}$$

However, when $m < n$, $\Psi$ is not invertible, so we turn to the basis pursuit-like formulation: with $A = \Psi^T \Phi$, we would like to solve the optimization problem

$$\min_z \|z\|_1 \quad \text{subject to} \quad Az = y.$$

Note, however, that we are interested in exact solutions and not approximate solutions. To simplify things, we make the additional assumption that the columns of $\Psi$ are sampled uniformly at random from a larger orthogonal matrix $\bar\Psi \in \mathbb{R}^{n \times n}$. That is, $A = (\bar\Psi R)^T \Phi$, where $R \in \mathbb{R}^{n \times m}$ is a sampling matrix that picks out columns of $\bar\Psi$.

Before we define mutual coherence, we first give a toy example to illustrate why mutual coherence is important.

2.1 Toy Example

Assume that $\bar\Psi = \Phi = F$, where $F$ is the $n \times n$ discrete Fourier transform matrix. Let $x = e_1$, the unit vector associated with the first coordinate. Then, regardless of which $R$ we pick, our samples

$$y = \Psi^T \Phi e_1 \tag{4}$$

will be the zero vector unless $\Psi$ contains the first column of $F$. This is, in some sense, the worst-case scenario: we need $m = n$ in order to guarantee that we can extract $e_1$. Conversely, if $m \ll n$, we have a very high probability of getting no information from our linear measurements $y$.
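The toy example is easy to reproduce numerically. The sketch below (assuming numpy; the sizes are arbitrary) builds the unitary DFT matrix, samples $m$ of its columns at random, and shows that the measurements of $f = \Phi e_1$ are all zero unless the first column happens to be sampled. For the complex DFT, the conjugate transpose plays the role of $\Psi^T$.

    import numpy as np

    n, m = 32, 8
    rng = np.random.default_rng(1)

    F = np.fft.fft(np.eye(n)) / np.sqrt(n)      # unitary n x n DFT matrix
    Phi = F

    x = np.zeros(n)
    x[0] = 1.0                                  # x = e_1, so f = Phi @ x is the first column of F
    f = Phi @ x

    cols = rng.choice(n, size=m, replace=False) # R picks m columns of Psi_bar uniformly at random
    Psi = F[:, cols]

    y = Psi.conj().T @ f                        # the samples, as in (4)
    print(np.abs(y).round(8))                   # all zeros unless column 0 was sampled
    print("first column of F sampled:", 0 in cols)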

2.2 Mutual Coherence

Keeping the toy example in mind, mutual coherence is defined as

$$\mu(\bar\Psi, \Phi) = \sqrt{n} \, \max_{i,j} |\bar\psi_i^T \phi_j|, \tag{5}$$

where $\phi_j$ is the $j$-th column of the matrix $\Phi$. The way $\mu$ is defined here is slightly different from the way it might be defined in basis pursuit. Here $\mu$ ranges between $1$ and $\sqrt{n}$ (with $1$ representing the lowest possible score and $\sqrt{n}$ the highest). But other than the fact that $\mu$ is scaled up by $\sqrt{n}$, there is no difference in its meaning: $\mu$ represents how different two bases are. Following this logic, low mutual coherence between two bases guarantees that any vector sparse in one basis cannot be sparse in the other; this is a critical property to have since, going back to the toy example, we see that having a vector be sparse in both bases leads to sensing problems.

Theorem 2.1. Given $f \in \mathbb{R}^n$, $x \in \mathbb{R}^n$, $\Phi \in \mathbb{R}^{n \times n}$, and $\bar\Psi \in \mathbb{R}^{n \times n}$ such that $f = \Phi x$, with $\bar\Psi$ and $\Phi$ orthogonal, and $x$ $k$-sparse (having only $k$ nonzero entries), select $m$ columns of $\bar\Psi$ uniformly at random and put them in $\Psi$. Then if

$$m \geq c \, \mu(\bar\Psi, \Phi)^2 \, k \log(n)$$

for some constant $c$, the solution $\hat{x}$ to the optimization problem

$$\min_z \|z\|_1 \quad \text{subject to} \quad \Psi^T \Phi z = y$$

is exact with high probability.
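A direct computation of (5) illustrates the two extremes. The sketch below (assuming numpy) checks that the spike basis (the identity) and the Fourier basis are maximally incoherent, $\mu = 1$, while a basis paired with itself is maximally coherent, $\mu = \sqrt{n}$.

    import numpy as np

    def mutual_coherence(Psi_bar, Phi):
        """mu(Psi_bar, Phi) = sqrt(n) * max_{i,j} |psi_i^T phi_j|, as in (5)."""
        n = Psi_bar.shape[0]
        return np.sqrt(n) * np.max(np.abs(Psi_bar.conj().T @ Phi))

    n = 64
    I = np.eye(n)
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary DFT matrix

    print(mutual_coherence(I, F))            # = 1: spikes and sinusoids, maximally incoherent
    print(mutual_coherence(F, F))            # = sqrt(n) = 8: a basis is maximally coherent with itself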

Now, in both the exact and approximate sparse recovery cases, we want to find structure in $A$ that gives provable guarantees and bounds for sparse recovery. Thus we define the restricted isometry property (RIP) as follows:

Definition 1 (Restricted Isometry Property). For $k = 1, 2, \ldots$, define $\delta_k$ (a constant) as the smallest number such that

$$(1 - \delta_k) \|x\|_2^2 \leq \|Ax\|_2^2 \leq (1 + \delta_k) \|x\|_2^2$$

for all $k$-sparse $x$. Then we say the matrix $A$ satisfies the $k$-restricted isometry property with restricted isometry constant $\delta_k$.

For a matrix $A$ that satisfies the RIP, any subset of $k$ columns of $A$ is well-behaved, i.e. $A$ is guaranteed to project any vector $x$ with a corresponding nonzero pattern to another vector whose 2-norm is close to that of $x$. $A$ has the RIP if $\delta_1, \delta_2, \ldots, \delta_{2k}$ are small, which leads to the alternative form of the RIP:

$$\delta_k = \max_{\mathrm{card}(S) \leq k} \|A_S^T A_S - I\|_2$$

should be small, in which the 2-norm is the spectral norm. Let's say we want to recover a $k$-sparse signal and $\delta_{2k}$ is sufficiently less than $1$. We have an equivalent form of the RIP:

$$(1 - \delta_{2k}) \|x_1 - x_2\|_2^2 \leq \|Ax_1 - Ax_2\|_2^2 \leq (1 + \delta_{2k}) \|x_1 - x_2\|_2^2.$$

Here $2k$ instead of $k$ appears because we can only guarantee $x_1 - x_2$ to be $2k$-sparse.

Theorem 2.2. Assume $\delta_{2k} < \sqrt{2} - 1$. Then $\hat{x}$, the solution to

$$\min_z \|z\|_1 \quad \text{subject to} \quad Az = y,$$

obeys $\|\hat{x} - x\|_2 \leq c \, \|x - x_k\|_1 / \sqrt{k}$ and $\|\hat{x} - x\|_1 \leq c \, \|x - x_k\|_1$, where $y = Ax$ and $x_k$ is the best $k$-sparse approximation of $x$. Note: $x$ is not necessarily sparse here.

Now, suppose $y = Ax + g$, in which $g$ is noise with $\|g\|_2 \leq \epsilon$, and let $\hat{x}$ solve

$$\min_z \|z\|_1 \quad \text{subject to} \quad \|Az - y\|_2 \leq \epsilon,$$

which is noise-tolerant, instead of the exact case in the above theorem.

Theorem 2.3. Assume $\delta_{2k} < \sqrt{2} - 1$. Then $\hat{x}$ satisfies

$$\|\hat{x} - x\|_2 \leq c_0 \, \|x - x_k\|_1 / \sqrt{k} + c_1 \epsilon$$

for constants $c_0$ and $c_1$. For example, if $\delta_{2k} \leq 1/4$, then $c_0 \leq 5.5$ and $c_1 \leq 6$.

Some more conclusions on the structure of $A$: what general classes of matrices $A$ satisfy the RIP? As it turns out, random matrices do the job. To be more precise, we want an $A$ with $m$ close to $k$ that satisfies the RIP. In the cases where either $A \in \mathbb{R}^{m \times n}$ has i.i.d. normal entries with mean $0$ and variance $1/m$, or $A \in \mathbb{R}^{m \times n}$ has each column sampled uniformly at random from the unit sphere in $\mathbb{R}^m$, $A$ obeys the RIP for $\delta_{2k}$ ($< \sqrt{2} - 1$) with high probability if $m \geq c k \log(n/k)$.

In the case $A = R \bar\Psi^T \Phi$, in which $R$ is a random subsampling matrix, it is sufficient to have $m \gtrsim k (\log n)^4$ to get the RIP with high probability for $\delta_{2k}$. If $A = G \Phi$, with $G$ an $m \times n$ random matrix as above and $\Phi$ a fixed orthogonal basis, $A$ obeys the RIP with high probability if $m \geq c k \log(n/k)$.
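Computing $\delta_k$ exactly is combinatorial, but the near-isometry can be probed empirically. The sketch below (assuming numpy; the constant $c = 4$ and the problem sizes are arbitrary illustrative choices) draws a Gaussian matrix with $\mathcal{N}(0, 1/m)$ entries and checks that $\|Ax\|_2^2 / \|x\|_2^2$ stays close to $1$ over many random $k$-sparse vectors.

    import numpy as np

    n, k = 512, 10
    m = 4 * k * int(np.ceil(np.log(n / k)))          # m ~ c k log(n/k), c = 4 chosen for illustration
    rng = np.random.default_rng(2)
    A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))

    ratios = []
    for _ in range(1000):
        x = np.zeros(n)
        support = rng.choice(n, size=k, replace=False)
        x[support] = rng.standard_normal(k)
        ratios.append(np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2)

    # A Monte Carlo probe of the RIP, not the exact delta_k: both extremes should be near 1.
    print(min(ratios), max(ratios))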

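The noiseless $\ell_1$ recovery of Theorem 2.2 can also be tried directly by rewriting $\min_z \|z\|_1$ subject to $Az = y$ as a linear program, splitting $z = u - v$ with $u, v \geq 0$. This is a minimal sketch assuming numpy and scipy are available; the Gaussian measurement matrix and the problem sizes are illustrative choices.

    import numpy as np
    from scipy.optimize import linprog

    n, m, k = 128, 40, 5
    rng = np.random.default_rng(3)

    A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x[support] = rng.standard_normal(k)
    y = A @ x

    # min 1^T u + 1^T v  subject to  A u - A v = y,  u >= 0, v >= 0;  then z = u - v.
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    z = res.x[:n] - res.x[n:]

    print("recovery error:", np.linalg.norm(z - x))   # essentially zero when m is large enough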
3 Supplementary Information: Application of Compressed Sensing to MRI

Compressed sensing has found innumerable applications in imaging, in particular medical imaging and seismic imaging, where the cost of measurement is high but the data can usually be represented in a sparse format. It has also found applications in biological sensing, radar systems, communication networks, and many more [1].

Figure 1: Shepp-Logan phantom (left), and the magnitude of its 2D discrete-cosine-transform coefficients (right).

[2] used compressed sensing to accelerate magnetic resonance imaging (MRI) by massively under-sampling the Fourier domain. The paper shows that MRI images are sparse in many domains, including the DCT, wavelet, and spatial finite-difference domains. Figure 1 demonstrates this: the left side of Figure 1 shows the Shepp-Logan phantom, a schematic of the human brain used to compare medical imaging algorithms. The right image shows the magnitudes of the 2D discrete cosine transform coefficients. Evidently, the image is extremely sparse in this domain, whereas there are large regions with significant intensities in the image domain.
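The sparsity claim behind Figure 1 can be checked numerically. The sketch below (assuming scipy and scikit-image are available; the 5% threshold is an arbitrary illustrative choice) takes the 2D DCT of the Shepp-Logan phantom and reports how much of the energy sits in the largest few coefficients.

    import numpy as np
    from scipy.fft import dctn
    from skimage.data import shepp_logan_phantom

    img = shepp_logan_phantom()              # 400 x 400 grayscale phantom
    coeffs = dctn(img, norm="ortho")         # 2D discrete cosine transform

    total_energy = np.sum(np.abs(coeffs) ** 2)
    mags = np.sort(np.abs(coeffs).ravel())[::-1]
    top = int(0.05 * mags.size)              # keep only the largest 5% of coefficients
    kept_energy = np.sum(mags[:top] ** 2) / total_energy

    print(f"fraction of energy in the top 5% of DCT coefficients: {kept_energy:.4f}")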

Further, several decisions have to be made to apply compressed sensing in practice. For example, the paper discusses how to incorporate prior knowledge of the sparsity pattern into a compressed sensing algorithm. In practice, many signals have large Fourier coefficients at low frequencies, along with a few critically important high-frequency components. Variable-density sampling incorporates this knowledge by under-sampling the frequency domain less at low frequencies and more at high frequencies. The paper shows that this method achieves superior performance in practice when the frequencies are sampled according to a power law. Further, there are multiple ways of taking a three-dimensional MRI image: by sequentially scanning 2D slices, or by a pure 3D scan. The authors point out that randomly sub-sampling 3D space, instead of successive 2D slices, reveals more of the image redundancy, capturing most of the advantage of compressed sensing. Lastly, it is intriguing that the authors propose fixing a sampling pattern after finding one with good image reconstruction properties, as measured on a few sample images. This makes the algorithm deterministic and sacrifices the theoretical, probabilistic guarantees of the under-sampling scheme. The practical justification is that one expects the images of a particular body region to be sufficiently similar for the good reconstruction behavior to carry over to MRI scans of different people.

References

[1] S. Qaisar, R. M. Bilal, W. Iqbal, M. Naureen, and S. Lee. Compressive sensing: From theory to applications, a survey. Journal of Communications and Networks, 15(5):443-456, October 2013.

[2] Michael Lustig, David Donoho, and John M. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6):1182-1195, December 2007.