Mathematical introduction to Compressed Sensing

Mathematical introduction to Compressed Sensing
Lesson 1: measurements and sparsity
Guillaume Lecué (ENSAE)
Tuesday, January 31, 2016

Aim of the course: analyze high-dimensional data.
1. Understand low-dimensional structures in high-dimensional spaces.
2. Reveal this structure via appropriate measurements.
3. Construct efficient algorithms to learn this structure.
Tools:
1. Approximation theory
2. Probability theory
3. Convex optimization algorithms
Remark: register at http://datascience-x-master-paris-saclay.fr

This first lesson covers two central ideas:
1. Sparsity
2. Measurements
illustrated through three examples:
1. Single-pixel camera
2. Face recognition
3. Financial data

What is Compressed Sensing?
The classical data-acquisition pipeline has two steps:
signal -> acquisition -> RAW file -> compression -> JPEG file.
Compressed sensing does it in one step:
signal -> compressed sensing -> data.
In French: Compressed Sensing = "acquisition comprimée".

Q: How is this possible?
A: Construct clever measurements!
Let $x$ be a signal (a finite-dimensional vector, say $x \in \mathbb{R}^N$). Take $m$ linear measurements of $x$:
$$ y_i = \langle x, X_i \rangle, \qquad i = 1, \ldots, m $$
where:
1. $X_i$ is the $i$-th measurement vector (in $\mathbb{R}^N$),
2. $y_i$ is the $i$-th measurement (= data = observation).
Problem: reconstruct $x$ from the measurements $(y_i)_{i=1}^m$ and the measurement vectors $(X_i)_{i=1}^m$ with $m$ as small as possible.

Matrix version of Compressed Sensing
Stack the measurements and measurement vectors:
$$ y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} \in \mathbb{R}^m, \qquad A = \begin{pmatrix} X_1^\top \\ \vdots \\ X_m^\top \end{pmatrix} \in \mathbb{R}^{m \times N}. $$
Here $y$ is the measurement vector and $A$ the measurement matrix.
Problem: find $x$ such that $y = Ax$ when $m \ll N$.
CS = solving a highly underdetermined linear system.
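
To fix ideas, here is a minimal numerical sketch (not from the lecture) of this measurement model: a random Gaussian measurement matrix, a sparse signal, and far fewer measurements than unknowns. All dimensions are illustrative.

```python
# Sketch of the CS measurement model y = A x with m << N and an s-sparse x.
import numpy as np

rng = np.random.default_rng(0)
N, m, s = 200, 40, 5                       # ambient dimension, measurements, sparsity

# s-sparse signal: s random positions with Gaussian amplitudes
x = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
x[support] = rng.standard_normal(s)

# Measurement matrix: its rows play the role of the measurement vectors X_1, ..., X_m
A = rng.standard_normal((m, N)) / np.sqrt(m)
y = A @ x                                  # the m measurements y_i = <x, X_i>

print(A.shape, y.shape)                    # (40, 200) (40,) -> highly underdetermined
```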

Sparsity = low-dimensional structure
Since $m < N$, the system $y = Ax$ has no unique solution, so there is no hope of reconstructing $x$ from the $m$ measurements $y_i = \langle x, X_i \rangle$ alone.
Idea: the signals to recover have some low-dimensional structure. We assume that $x$ is sparse.
Definition. The support of $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$ is
$$ \mathrm{supp}(x) = \{ j \in \{1, \ldots, N\} : x_j \neq 0 \}. $$
The size of the support of $x$ is $\|x\|_0 = |\mathrm{supp}(x)|$, and $x$ is $s$-sparse when $\|x\|_0 \le s$.
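
A small sketch of these definitions in code; the helper names supp, norm0 and is_s_sparse are ours, not the lecture's.

```python
import numpy as np

def supp(x, tol=0.0):
    """Support of x: indices of its nonzero entries (up to a tolerance)."""
    return np.flatnonzero(np.abs(x) > tol)

def norm0(x, tol=0.0):
    """||x||_0 = size of the support of x."""
    return supp(x, tol).size

def is_s_sparse(x, s, tol=0.0):
    return norm0(x, tol) <= s

x = np.array([0.0, 3.0, 0.0, -1.5, 0.0])
print(supp(x), norm0(x), is_s_sparse(x, 2))   # [1 3] 2 True
```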

Sparsity and the underdetermined system y = Ax
Idea: maybe the kernel of $A$ is positioned so that the sparsest solution of $y = Ax$ is $x$ itself?
Procedure: look for the sparsest solution of the system $y = Ax$:
$$ \hat{x}_0 \in \underset{t \,:\, At = y}{\operatorname{argmin}} \|t\|_0 \qquad (1) $$
i.e. look for the vector $t$ with the smallest support in the affine set of solutions $\{t \in \mathbb{R}^N : At = y\} = x + \ker(A)$.
Idea: denote $\Sigma_s = \{t \in \mathbb{R}^N : \|t\|_0 \le s\}$. If $\Sigma_s \cap (x + \ker(A)) = \{x\}$ for $s = \|x\|_0$, then the sparsest element of $x + \ker(A)$ is $x$, and so $\hat{x}_0 = x$.
Definition. $\hat{x}_0$ is called the $\ell_0$-minimization procedure (cf. second lesson).
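
The $\ell_0$-minimization procedure (1) is combinatorial. As a hedged illustration only, here is a brute-force version that enumerates supports of increasing size and keeps the first one that solves $At = y$ exactly; it is exponential in $N$, meant for toy dimensions, and tractable relaxations are the subject of the next lessons.

```python
# Brute-force l0-minimization: search supports of size 1, 2, ... until A t = y is solvable.
import numpy as np
from itertools import combinations

def l0_min(A, y, tol=1e-10):
    m, N = A.shape
    if np.linalg.norm(y) <= tol:                  # the zero vector already fits
        return np.zeros(N)
    for k in range(1, N + 1):                     # try sparsity levels 1, 2, ...
        for S in combinations(range(N), k):
            cols = list(S)
            t_S, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
            if np.linalg.norm(A[:, cols] @ t_S - y) <= tol:   # A t = y satisfied
                t = np.zeros(N)
                t[cols] = t_S
                return t
    return None

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 12))
x = np.zeros(12); x[[2, 7]] = [1.0, -2.0]         # 2-sparse signal
print(np.allclose(l0_min(A, A @ x), x))           # True (almost surely for Gaussian A)
```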

Compressed sensing: problem statement
Problem 1: construct a minimal number of measurement vectors $X_1, \ldots, X_m$ such that any $s$-sparse signal $x$ can be reconstructed from the $m$ measurements $(\langle x, X_i \rangle)_{i=1}^m$.
Problem 2: construct efficient algorithms that reconstruct exactly any sparse signal $x$ from the measurements $(\langle x, X_i \rangle)_{i=1}^m$.

Is the signal x really sparse?
Sparsity of the signal $x$ is the main assumption in Compressed Sensing (and more generally in high-dimensional statistics).
Q: is it true that "real signals" are sparse? Three examples:
1. Images
2. Face recognition
3. Financial data

Compressed Sensing in images

Sparse representation of images
An image can be viewed as:
1. a vector $f \in \mathbb{R}^{n \times n}$, or
2. a function $f : \{0, \ldots, n-1\}^2 \to \mathbb{R}$.
Images can be expanded in a basis:
$$ f = \sum_{j=1}^{n^2} \langle f, \psi_j \rangle \, \psi_j. $$
Problem in approximation theory: find a basis $(\psi_j)$ such that $(\langle f, \psi_j \rangle)_{j=1}^{n^2}$ is (approximately) a sparse vector for real-life images $f$.
Solution: wavelet bases (cf. Gabriel Peyré's course). Notebook: wavelet decomposition.
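
As a rough stand-in for the wavelet-decomposition notebook mentioned above, here is a sketch using the PyWavelets package; the wavelet 'db4', the decomposition level and the synthetic test image are our own illustrative choices, not the lecture's.

```python
# Expand a toy image in a wavelet basis and sort the coefficients by magnitude.
import numpy as np
import pywt

n = 256
# A piecewise-smooth toy image (real photographs behave similarly).
xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
img = np.sin(6 * xx) + (yy > 0.5).astype(float)

coeffs = pywt.wavedec2(img, wavelet='db4', level=4)
arr, _ = pywt.coeffs_to_array(coeffs)
mags = np.sort(np.abs(arr).ravel())[::-1]        # coefficients in decaying order

# Most of the energy is carried by a small fraction of the coefficients.
k = 1000
print(np.linalg.norm(mags[:k]) ** 2 / np.linalg.norm(mags) ** 2)
```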

Sparse representation of images
Graphic: the values $(\log |\langle f, \psi_j \rangle|)_{j=1}^{n^2}$ sorted in decaying order, for $n = 256$ ($256^2 = 65{,}536$ coefficients).
Conclusion: when expanded in an appropriate basis, images have an almost sparse representation.

Sparse representation of images
Idea: compress images by thresholding the small wavelet coefficients (JPEG 2000).
Remark: these are the only three slides about approximation theory in this course!
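
A toy version of this thresholding idea, under the same assumptions as the previous sketch (PyWavelets, 'db4', synthetic image): keep the k largest wavelet coefficients, discard the rest, and reconstruct.

```python
# Compression by hard-thresholding wavelet coefficients, then reconstruction.
import numpy as np
import pywt

n = 256
xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
img = np.sin(6 * xx) + (yy > 0.5).astype(float)   # same toy image as above

coeffs = pywt.wavedec2(img, wavelet='db4', level=4)
arr, slices = pywt.coeffs_to_array(coeffs)

# Keep only the k largest coefficients (hard thresholding), zero the rest.
k = 2000
thr = np.sort(np.abs(arr).ravel())[-k]
arr_compressed = np.where(np.abs(arr) >= thr, arr, 0.0)

# Reconstruct the image from the few retained coefficients.
coeffs_compressed = pywt.array_to_coeffs(arr_compressed, slices, output_format='wavedec2')
img_hat = pywt.waverec2(coeffs_compressed, wavelet='db4')[:n, :n]  # crop to original size

rel_err = np.linalg.norm(img_hat - img) / np.linalg.norm(img)
print(f"kept {k} of {arr.size} coefficients, relative error {rel_err:.3e}")
```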

Compressed sensing and images
Two differences with the CS framework introduced above:
1. images are only almost sparse;
2. images are (almost) sparse not in the canonical basis but in some other (wavelet) basis.
Two consequences:
1. our procedures will be asked to "adapt" to this almost sparse situation: the stability property;
2. we need to introduce structured sparsity: being sparse in some general basis.

Structured sparsity
Definition. Let $F = \{f_1, \ldots, f_p\}$ be a dictionary in $\mathbb{R}^N$. A vector $x \in \mathbb{R}^N$ is said to be $s$-sparse in $F$ when there exists $J \subset \{1, \ldots, p\}$ such that $|J| \le s$ and $x = \sum_{j \in J} \theta_j f_j$.
In this case, $x = F\theta$, where $F = [f_1 \cdots f_p] \in \mathbb{R}^{N \times p}$ and $\theta \in \mathbb{R}^p$ is $s$-sparse in the canonical basis.
For CS measurements one has $y = Ax = AF\theta$ with $\theta \in \Sigma_s$, so it suffices to replace the measurement matrix $A$ by $AF$.
Conclusion: the whole course deals only with vectors that are sparse in the canonical basis.
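
A short sketch of this reduction: a signal that is sparse in a random dictionary F produces exactly the same measurements whether one applies A to x or AF to θ. The dictionary and dimensions are illustrative.

```python
# Structured sparsity: y = A x = (A F) theta, so sparsity in F reduces to
# canonical sparsity with measurement matrix A F.
import numpy as np

rng = np.random.default_rng(2)
N, p, m, s = 100, 150, 30, 4

F = rng.standard_normal((N, p))            # dictionary with p atoms f_1, ..., f_p
theta = np.zeros(p)
J = rng.choice(p, size=s, replace=False)   # support J with |J| <= s
theta[J] = rng.standard_normal(s)

x = F @ theta                              # x is s-sparse in F, not in the canonical basis
A = rng.standard_normal((m, N)) / np.sqrt(m)

y_direct = A @ x
y_reduced = (A @ F) @ theta                # same measurements, matrix A replaced by A F
print(np.allclose(y_direct, y_reduced))    # True
```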

What does a camera using CS look like?
It should take measurements like the ones illustrated in the original slides. We take $m$ such measurements. In particular, the measurements $y_1, \ldots, y_m$ are real numbers that can be stored using only one pixel in the camera.

Single-pixel camera from Rice University
DMD: digital micromirror device with randomly oriented mirrors.

Single-pixel camera from Rice University
Example of the reconstruction of an image using the single-pixel camera [figure in the original slides].
Two problems:
1. How do we choose the measurement vectors?
2. Is there an efficient algorithm to reconstruct the signal from those few measurements?

CS in face recognition

Face recognition and Compressed Sensing
Database: $D := \{(\phi_j, l_j) : 1 \le j \le N\}$ where:
1. $\phi_j \in \mathbb{R}^m$ is a vector representation of the $j$-th image (for instance, the concatenation of the image's pixel values),
2. $l_j \in \{1, \ldots, C\}$ is a label referring to a person.
The same person may appear in $D$ several times, under various angles, luminosity, etc.
Problem: given a new image $y \in \mathbb{R}^m$, we want to assign it a label in $\{1, \ldots, C\}$.
"Classical" solution: use a multi-class classification algorithm.
Here: face recognition as a CS problem.

The sparsity assumption in face recognition
Empirical observation: if, for each of the $C$ individuals, one has
1. a large enough number of images, and
2. enough diversity in terms of angles and brightness,
then for any new image $y \in \mathbb{R}^m$ of individual $i \in \{1, \ldots, C\}$, one expects that
$$ y \approx \sum_{j : l_j = i} x_j \phi_j. $$
Consequence: we assume that a new image $y \in \mathbb{R}^m$ can be written as $y = \Phi x + \zeta$, where:
1. $\Phi = [\Phi_1 \, \Phi_2 \cdots \Phi_C]$ and $\Phi_i = [\phi_j : l_j = i]$ for every $i \in \{1, \ldots, C\}$,
2. $x = [0 \cdots 0 \; x_i \; 0 \cdots 0]$, where $x_i$ is the restriction of $x$ to the columns of $\Phi_i$ in $\Phi$,
3. $\zeta \in \mathbb{R}^m$ is the error due to the linear approximation of $y$ by columns of $\Phi$.
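
A sketch of this block-sparse model with random stand-ins for the face images; the dimensions, the class index and the noise level are our own illustrative choices.

```python
# Block-sparse model y = Phi x + zeta for face recognition.
import numpy as np

rng = np.random.default_rng(3)
m, C, n_per_class = 64, 10, 8               # pixels, individuals, images per individual

# Phi = [Phi_1 ... Phi_C], where the columns of Phi_i are the images of person i.
blocks = [rng.standard_normal((m, n_per_class)) for _ in range(C)]
Phi = np.hstack(blocks)                     # shape (m, C * n_per_class)

# A new image of person i is (approximately) a combination of that person's columns.
i = 3
x = np.zeros(C * n_per_class)
x[i * n_per_class:(i + 1) * n_per_class] = rng.random(n_per_class)   # block-sparse x
zeta = 0.01 * rng.standard_normal(m)        # approximation error / noise
y = Phi @ x + zeta

print(Phi.shape, np.count_nonzero(x), "nonzeros, all confined to block", i)
```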

Face recognition as a noisy CS problem
Compared with the benchmark CS setup, there are three differences:
1. there is an additional noise term $\zeta$;
2. the sparsity assumption on $x$ is stronger here: $x$ is block-sparse;
3. depending on the control one has over the database, we may or may not be able to choose (in a restricted way) the measurement matrix.
Three consequences:
1. our procedures will be asked to deal with noisy data: the robustness property;
2. we will design procedures that take advantage of more "advanced" notions of sparsity, such as block-sparsity;
3. when there is no control over the choice of measurement vectors, one can try several procedures and see how they behave.

Construction of a measurement matrix in face recognition
Various angles and various brightness levels [example face images in the original slides].

CS in Finance

Finance and CS
Problem: we observe the value of a portfolio every minute: $y_1, \ldots, y_m$. We would like to know how it is structured (which shares, and in what quantities).
Data: in addition to $y_1, \ldots, y_m$, we know the values of all shares at every time.

Finance and CS
Let $x_{i,j}$ be the value of share $j$ at time $i$. We have the following data:
$t = 1$: $y_1$ (portfolio value), $(x_{1,j})_{j=1}^N$ (share values)
$t = 2$: $y_2$ (portfolio value), $(x_{2,j})_{j=1}^N$ (share values)
...
$t = m$: $y_m$ (portfolio value), $(x_{m,j})_{j=1}^N$ (share values)
Sparsity assumption: the portfolio contains only a limited number of shares and its composition did not change during the observation period.
Problem formulation: find $x \in \mathbb{R}^N$ such that $y = Ax$, where $y = (y_i)_{i=1}^m$, $A = (x_{i,j} : 1 \le i \le m,\ 1 \le j \le N)$, and $x$ is assumed to be sparse.
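
A sketch of this setup with toy random-walk prices standing in for real share values; the dimensions and the integer quantities held are illustrative assumptions.

```python
# Portfolio recovery as a CS system: the measurement matrix is the matrix of
# observed share values, and the unknown sparse vector x holds the quantities held.
import numpy as np

rng = np.random.default_rng(4)
m, N, s = 60, 500, 7                        # minutes observed, shares, shares actually held

# A[i, j] = value of share j at minute i (toy random-walk prices around 100).
A = 100.0 + np.cumsum(rng.standard_normal((m, N)), axis=0)

x = np.zeros(N)                             # quantities held in the portfolio
held = rng.choice(N, size=s, replace=False)
x[held] = rng.integers(1, 20, size=s)

y = A @ x                                   # observed portfolio value at each minute
print(y.shape, "observations to recover a", N, "dimensional sparse vector")
```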

Noisy data: robust procedures
In the case of noisy data, $y = Ax + \zeta$, we want procedures that are robust to this noise perturbation.
Definition (Robustness). We say that a procedure $\Delta : \mathbb{R}^m \to \mathbb{R}^N$ is robust of order $s$ when, for any $s$-sparse vector $x \in \mathbb{R}^N$ and any noise $\zeta \in \mathbb{R}^m$, one has
$$ \| x - \Delta(Ax + \zeta) \|_2 \le c_0 \|\zeta\|_2, $$
where $c_0$ is some absolute constant.

Approximately sparse signals: stable procedures
In cases where the signal $x$ is only "approximately sparse", that is, $\min_{z \in \Sigma_s} \|x - z\|_1$ is not exactly zero but "small", we still want procedures to perform well.
Definition (Stability). We say that a procedure $\Delta : \mathbb{R}^m \to \mathbb{R}^N$ is stable of order $s$ when, for any vector $x \in \mathbb{R}^N$, one has
$$ \| x - \Delta(Ax) \|_2 \le \frac{c_0}{\sqrt{s}} \min_{z \in \Sigma_s} \|x - z\|_1, $$
where $c_0$ is some absolute constant.
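
The approximation error $\min_{z \in \Sigma_s} \|x - z\|_1$ is attained by keeping the $s$ largest entries of $x$ in absolute value; a small helper, ours rather than the lecture's, makes this concrete.

```python
# l1 distance from x to the set Sigma_s of s-sparse vectors.
import numpy as np

def best_s_term_error_l1(x, s):
    """Sum of the magnitudes of all but the s largest entries of x."""
    mags = np.sort(np.abs(x))[::-1]
    return mags[s:].sum()                   # the entries that thresholding discards

x = np.array([5.0, -0.1, 3.0, 0.05, -0.02, 2.0])
print(best_s_term_error_l1(x, 3))           # 0.17: x is "approximately 3-sparse"
```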

Stable and robust procedures
Definition (Stability and robustness). We say that a procedure $\Delta : \mathbb{R}^m \to \mathbb{R}^N$ is robust and stable of order $s$ when, for any $x \in \mathbb{R}^N$ and $\zeta \in \mathbb{R}^m$, one has
$$ \| x - \Delta(Ax + \zeta) \|_2 \le \frac{c_0}{\sqrt{s}} \min_{z \in \Sigma_s} \|x - z\|_1 + c_0 \|\zeta\|_2, $$
where $c_0$ is some absolute constant.

CS and high-dimensional statistics
Definition. We say that a statistical problem is a high-dimensional statistical problem when one has to estimate an $N$-dimensional parameter from $m$ observations with $m < N$.
1. CS is therefore a high-dimensional statistical problem.
2. Noisy CS is exactly the linear regression model when the noise is assumed to be random.