Three Recent Examples of The Effectiveness of Convex Programming in the Information Sciences


Three Recent Examples of The Effectiveness of Convex Programming in the Information Sciences. Emmanuel Candès. International Symposium on Information Theory, Istanbul, July 2013

Three stories about a theme. We often have missing information: (1) missing phase (phase retrieval); (2) missing and/or corrupted entries in a data matrix (robust PCA); (3) missing high-frequency spectrum (super-resolution). This makes signal/data recovery difficult. This lecture: convex programming usually (but not always) returns the right answer!

Story # 1: Phase Retrieval Collaborators: Y. Eldar, X. Li, T. Strohmer, V. Voroninski

X-ray crystallography: method for determining atomic structure within a crystal (figures: principle, typical setup). 10 Nobel Prizes in X-ray crystallography, and counting...

About its importance: Franklin's photograph (figure).

Missing phase problem. Detectors only record intensities of diffracted rays: magnitude measurements only! Fraunhofer diffraction: the intensity of the electrical field is
$$|\hat{x}(f_1, f_2)|^2 = \left| \int x(t_1, t_2)\, e^{-i2\pi(f_1 t_1 + f_2 t_2)} \, dt_1 \, dt_2 \right|^2$$
Phase retrieval problem (inversion): how can we recover the phase (or, equivalently, the signal $x(t_1, t_2)$) from $|\hat{x}(f_1, f_2)|$?

About the importance of phase... (figure: take the DFT of two images, keep the magnitudes, swap the phases, and invert; the result resembles the phase donor, not the magnitude donor)
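The swap experiment can be sketched in a few lines. This is a toy illustration (not from the talk), with two short 1-D signals in place of images and a naive DFT: a hybrid built from one signal's magnitudes and another signal's phases is far from the magnitude donor.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform; fine for a toy example."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

spike = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # flat spectral magnitude
step  = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
S, T = dft(spike), dft(step)

# keep the magnitudes of the spike, swap in the phases of the step
hybrid = idft([abs(S[k]) * cmath.exp(1j * cmath.phase(T[k])) for k in range(8)])

# magnitudes alone do not determine the signal: the hybrid is far from the spike
err = max(abs(h - s) for h, s in zip(hybrid, spike))
```

With correct magnitudes *and* phases the inverse DFT returns the original signal exactly; discarding the phase loses essential structure, which is the whole difficulty of phase retrieval.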

X-ray imaging: now and then Röntgen (1895) Dierolf (2010)

Ultrashort X-ray pulses Imaging single large protein complexes

Discrete mathematical model. Phaseless measurements about $x_0 \in \mathbb{C}^n$:
$$b_k = |\langle a_k, x_0 \rangle|^2, \quad k \in \{1, \ldots, m\} = [m]$$
Phase retrieval is a feasibility problem: find $x$ subject to $|\langle a_k, x \rangle|^2 = b_k$, $k \in [m]$. Solving quadratic equations is NP-hard in general.
Nobel Prize for Hauptman and Karle ('85): make use of very specific prior knowledge.
Standard approach: Gerchberg-Saxton (or Fienup) iterative algorithm. Sometimes works well; sometimes does not.

PhaseLift. $|\langle a_k, x \rangle|^2 = b_k$, $k \in [m]$. Lifting: set $X = xx^*$; then
$$|\langle a_k, x \rangle|^2 = \mathrm{Tr}(x^* a_k a_k^* x) = \mathrm{Tr}(a_k a_k^* x x^*) =: \mathrm{Tr}(A_k X), \quad A_k = a_k a_k^*$$
This turns quadratic measurements into linear measurements about $xx^*$. Phase retrieval: equivalent formulation
find $X$ s.t. $\mathrm{Tr}(A_k X) = b_k$, $k \in [m]$; $X \succeq 0$; $\mathrm{rank}(X) = 1$
which is combinatorially hard, i.e. min $\mathrm{rank}(X)$ s.t. $\mathrm{Tr}(A_k X) = b_k$, $k \in [m]$; $X \succeq 0$.

PhaseLift: tractable semidefinite relaxation
minimize $\mathrm{Tr}(X)$ subject to $\mathrm{Tr}(A_k X) = b_k$, $k \in [m]$; $X \succeq 0$
This is a semidefinite program (SDP); the trace is a convex proxy for the rank.
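As a quick numerical sanity check on the lifting identity (a toy sketch, not from the talk), one can verify $|\langle a, x\rangle|^2 = \mathrm{Tr}(AX)$ with $X = xx^*$, $A = aa^*$ on random complex vectors:

```python
import random

random.seed(0)
n = 4
randc = lambda: complex(random.gauss(0, 1), random.gauss(0, 1))
x = [randc() for _ in range(n)]
a = [randc() for _ in range(n)]

# quadratic measurement b = |<a, x>|^2, with <a, x> = sum_i conj(a_i) x_i
b = abs(sum(ai.conjugate() * xi for ai, xi in zip(a, x))) ** 2

# lifted objects: X = x x*, A = a a*
X = [[x[i] * x[j].conjugate() for j in range(n)] for i in range(n)]
A = [[a[i] * a[j].conjugate() for j in range(n)] for i in range(n)]

# the same measurement is now *linear* in X: Tr(A X)
tr_AX = sum(A[i][j] * X[j][i] for i in range(n) for j in range(n)).real
```

The equality holds term by term because $\mathrm{Tr}(aa^* xx^*) = (a^*x)(x^*a) = |\langle a, x\rangle|^2$; the relaxation then only drops the rank-one constraint on $X$.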

Semidefinite programming (SDP). A special class of convex optimization problems; a relatively natural extension of linear programming (LP); efficient numerical solvers (interior-point methods).
LP (standard form), over $x \in \mathbb{R}^n$: minimize $\langle c, x \rangle$ subject to $a_k^T x = b_k$, $k = 1, \ldots$; $x \geq 0$.
SDP (standard form), over $X \in \mathbb{R}^{n \times n}$: minimize $\langle C, X \rangle$ subject to $\langle A_k, X \rangle = b_k$, $k = 1, \ldots$; $X \succeq 0$.
Standard inner product: $\langle C, X \rangle = \mathrm{Tr}(C^* X)$.

From overdetermined to highly underdetermined. Quadratic equations $b_k = |\langle a_k, x \rangle|^2$, $k \in [m]$, lift to $b = \mathcal{A}(xx^*)$: minimize $\mathrm{Tr}(X)$ subject to $\mathcal{A}(X) = b$, $X \succeq 0$. Have we made things worse? Overdetermined ($m > n$) becomes highly underdetermined ($m \ll n^2$).

This is not really new... Relaxation of quadratically constrained QPs: Shor ('87) [lower bounds on nonconvex quadratic optimization problems]; Goemans and Williamson ('95) [MAX-CUT]; Ben-Tal and Nemirovskii ('01) [monograph]... Similar approach for array imaging: Chai, Moscoso, Papanicolaou ('11).

Quadratic equations: geometric view I

Quadratic equations: geometric view II

Exact phase retrieval via SDP. Quadratic equations $b_k = |\langle a_k, x \rangle|^2$, $k \in [m]$; $b = \mathcal{A}(xx^*)$. Simplest model: $a_k$ independently and uniformly sampled on the unit sphere of $\mathbb{C}^n$ if $x \in \mathbb{C}^n$ (complex-valued problem), of $\mathbb{R}^n$ if $x \in \mathbb{R}^n$ (real-valued problem).
Theorem (C. and Li ('12); C., Strohmer and Voroninski ('11)). Assume $m \gtrsim n$. With probability $1 - O(e^{-\gamma m})$, for all $x \in \mathbb{C}^n$, the only point in the feasible set $\{X : \mathcal{A}(X) = b \text{ and } X \succeq 0\}$ is $xx^*$.
Injectivity if $m \geq 4n - 2$ (Balan, Bodmann, Casazza, Edidin '09).

Numerical results: noiseless recovery. (a) Smooth signal (real part); (b) random signal (real part). Figure: recovery (with reweighting) of an $n$-dimensional complex signal ($2n$ unknowns) from $4n$ quadratic measurements (random binary masks).

How is this possible? How can the feasible set $\{X \succeq 0\} \cap \{\mathcal{A}(X) = b\}$ have a unique point? Intersection of $\begin{bmatrix} x & y \\ y & z \end{bmatrix} \succeq 0$ with an affine space.

Correct representation Rank-1 matrices are on the boundary (extreme rays) of PSD cone

My mental representation

With noise: $b_k \approx |\langle x, a_k \rangle|^2$, $k \in [m]$. Noise-aware recovery (SDP):
minimize $\|\mathcal{A}(X) - b\|_1 = \sum_k |\mathrm{Tr}(a_k a_k^* X) - b_k|$ subject to $X \succeq 0$
Signal $\hat{x}$ obtained by extracting the first eigenvector (PC) of the solution matrix. In the same setup as before and for realistic noise models, no method whatsoever can possibly yield a fundamentally smaller recovery error [C. and Li (2012)].
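The final extraction step can be sketched with power iteration. This toy example (not from the talk) uses an exactly rank-one solution matrix $X = xx^*$; a real SDP solution would only be approximately rank one, but the same iteration returns its principal component.

```python
import math, random

random.seed(1)
n = 5
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
# pretend the SDP returned the ideal rank-one solution matrix X = x x*
X = [[x[i] * x[j].conjugate() for j in range(n)] for i in range(n)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

# power iteration for the leading eigenvector of the Hermitian PSD matrix X
v = [complex(1.0, 0.0)] * n
for _ in range(50):
    w = matvec(X, v)
    nw = norm(w)
    v = [c / nw for c in w]

# rescale by sqrt of the top eigenvalue so that xhat matches x up to global phase
lam = norm(matvec(X, v))
xhat = [math.sqrt(lam) * c for c in v]
```

The global phase is of course unrecoverable: the measurements $|\langle a_k, x\rangle|^2$ are identical for $x$ and $e^{i\theta}x$.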

Numerical results: noisy recovery. Figure: SNR versus relative MSE on a dB scale for different numbers of illuminations with binary masks.

Numerical results: noiseless 2D images. Figure: original image; 3 Gaussian masks; 8 binary masks; error with 8 binary masks. Courtesy S. Marchesini (LBL).

Story #2: Robust Principal Component Analysis Collaborators: X. Li, Y. Ma, J. Wright

The separation problem (Chandrasekaran et al.): $M = L + S$. $M$: data matrix (observed); $L$: low-rank (unobserved); $S$: sparse (unobserved). Problem: can we recover $L$ and $S$ accurately? Again, missing information.

Motivation: robust principal component analysis (RPCA). PCA is sensitive to outliers: it breaks down with one (badly) corrupted entry in the data matrix $(x_{ij}) \in \mathbb{R}^{d \times n}$.

Robust PCA. Data are increasingly high-dimensional, and gross errors frequently occur in many applications (image processing, web data analysis, bioinformatics, ...): occlusions, malicious tampering, sensor failures... Important to make PCA robust.

Gross errors. Observe corrupted entries (users-by-movies matrix):
$Y_{ij} = L_{ij} + S_{ij}$, $(i, j) \in \Omega_{\mathrm{obs}}$
$L$: low-rank matrix; $S$: entries that have been tampered with (impulsive noise). Problem: recover $L$ from missing and corrupted samples.

The L+S model. (Partial) information $y = \mathcal{A}(M)$ about the object $M = L$ (low rank) $+\; S$ (sparse).
RPCA; dynamic MR: data = low-dimensional structure + corruption; video sequence = static background + sparse innovation.
Graphical modeling with hidden variables: Chandrasekaran, Sanghavi, Parrilo, Willsky ('09, '11); marginal inverse covariance of observed variables = low-rank + sparse.

When does separation make sense? Low-rank component cannot be sparse: a matrix with a single nonzero entry is both rank one and sparse. Sparse component cannot be low rank: a matrix with a single nonzero row is both sparse and rank one. In either case the decomposition is ambiguous.

Low-rank component cannot be sparse. Incoherence condition [C. and Recht ('08)]: column and row spaces not aligned with the coordinate axes (singular vectors are not sparse).

Sparse component cannot be low-rank: e.g. a rank-one matrix $\mathbf{1}x^*$ with rows equal to $x^*$ could be absorbed into either component. The sparsity pattern of $S$ will be assumed (uniformly) random.

Demixing by convex programming. $M = L + S$: $L$ unknown (rank unknown); $S$ unknown (number of nonzero entries, locations, magnitudes all unknown). Recovery via SDP:
minimize $\|\hat{L}\|_* + \lambda \|\hat{S}\|_1$ subject to $\hat{L} + \hat{S} = M$
See also Chandrasekaran, Sanghavi, Parrilo, Willsky ('09). Nuclear norm: $\|L\|_* = \sum_i \sigma_i(L)$ (sum of singular values); $\ell_1$ norm: $\|S\|_1 = \sum_{ij} |S_{ij}|$ (sum of absolute values).
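The two surrogate norms are easy to evaluate on tiny examples. A toy sketch (not from the talk), using the closed form for the singular values of a $2 \times 2$ matrix, with an illustrative rank-one $L$ and one-sparse $S$:

```python
import math

def nuclear_norm_2x2(M):
    # for a 2x2 matrix: (s1 + s2)^2 = s1^2 + s2^2 + 2 s1 s2 = ||M||_F^2 + 2|det M|
    (a, b), (c, d) = M
    fro2 = a * a + b * b + c * c + d * d
    return math.sqrt(fro2 + 2 * abs(a * d - b * c))

def l1_norm(M):
    return sum(abs(e) for row in M for e in row)

L = [[1.0, 2.0], [2.0, 4.0]]    # rank one: second row is twice the first
S = [[0.0, 3.0], [0.0, 0.0]]    # sparse: a single corrupted entry
M = [[L[i][j] + S[i][j] for j in range(2)] for i in range(2)]

lam = 1 / math.sqrt(2)          # lambda = 1/sqrt(n) with n = 2, as in the theorem
objective = nuclear_norm_2x2(L) + lam * l1_norm(S)
```

Here $\|L\|_* = 5$ (Frobenius norm of a rank-one matrix) and $\|S\|_1 = 3$; the SDP searches over all splittings $\hat{L} + \hat{S} = M$ for the one minimizing this weighted sum.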

Exact recovery via SDP: min $\|\hat{L}\|_* + \lambda \|\hat{S}\|_1$ s.t. $\hat{L} + \hat{S} = M$.
Theorem. $L$ is $n \times n$ with $\mathrm{rank}(L) \leq \rho_r\, n/(\log n)^2$ and incoherent; $S$ is $n \times n$ with a random sparsity pattern of cardinality at most $\rho_s n^2$. Then with probability $1 - O(n^{-10})$, the SDP with $\lambda = 1/\sqrt{n}$ is exact: $\hat{L} = L$, $\hat{S} = S$. Same conclusion for rectangular matrices with $\lambda = 1/\sqrt{\max \dim}$.
No tuning parameter! Whatever the magnitudes of $L$ and $S$!

Phase transitions in probability of success. (a) PCP, random signs; (b) PCP, coherent signs; (c) matrix completion. $L = XY^T$ is a product of independent $n \times r$ i.i.d. $\mathcal{N}(0, 1/n)$ matrices.

Missing and corrupted. RPCA: min $\|\hat{L}\|_* + \lambda \|\hat{S}\|_1$ s.t. $\hat{L}_{ij} + \hat{S}_{ij} = L_{ij} + S_{ij}$, $(i, j) \in \Omega_{\mathrm{obs}}$.
Theorem. $L$ as before; $\Omega_{\mathrm{obs}}$ random set of size $0.1 n^2$ (the missing fraction is arbitrary); each observed entry corrupted with probability $\tau \leq \tau_0$. Then with probability $1 - O(n^{-10})$, PCP with $\lambda = 1/\sqrt{0.1\, n}$ is exact: $\hat{L} = L$. Same conclusion for rectangular matrices with $\lambda = 1/\sqrt{0.1 \max \dim}$.

Background subtraction

With noise [with Li, Ma, Wright and Zhou ('10)]. $Z$: stochastic or deterministic perturbation; $Y_{ij} = L_{ij} + S_{ij} + Z_{ij}$, $(i, j) \in \Omega$.
minimize $\|\hat{L}\|_* + \lambda \|\hat{S}\|_1$ subject to $\sum_{(i,j) \in \Omega} (M_{ij} - \hat{L}_{ij} - \hat{S}_{ij})^2 \leq \delta^2$
When perfect (noiseless) separation occurs, the noisy variant is stable.

Story #3: Super-resolution Collaborator: C. Fernandez-Granda

Limits of resolution. In any optical imaging system, diffraction imposes a fundamental limit on resolution: "The physical phenomenon called diffraction is of the utmost importance in the theory of optical imaging systems" (Joseph Goodman).

Bandlimited imaging systems (Fourier optics). $f_{\mathrm{obs}}(t) = (h * f)(t)$, where $h$ is the point spread function (PSF); $\hat{f}_{\mathrm{obs}}(\omega) = \hat{h}(\omega)\, \hat{f}(\omega)$, where $\hat{h}$ is the transfer function (TF). Bandlimited system: $|\omega| > \Omega \Rightarrow \hat{h}(\omega) = 0$, so $\hat{f}_{\mathrm{obs}}(\omega) = \hat{h}(\omega)\hat{f}(\omega)$ suppresses all high-frequency components. Example: coherent imaging, $\hat{h}(\omega) = 1_P(\omega)$, the indicator of the pupil element (figure: TF = pupil; PSF cross-section = Airy disk).

Rayleigh resolution limit Lord Rayleigh

The super-resolution problem (figure: objective vs. data). Retrieve fine-scale information from low-pass data. Equivalent description: extrapolate the spectrum (ill-posed).

Random vs. low-frequency sampling Random sampling (CS) Low-frequency sampling (SR) Compressive sensing: spectrum interpolation Super-resolution: spectrum extrapolation

Super-resolving point sources This lecture Signal of interest is superposition of point sources Celestial bodies in astronomy Line spectra in speech analysis Fluorescent molecules in single-molecule microscopy Many applications Radar Spectroscopy Medical imaging Astronomy Geophysics...

Single molecule imaging. The microscope receives light from fluorescent molecules. Problem: resolution is much coarser than the size of individual molecules (low-pass data). Can we beat the diffraction limit and super-resolve those molecules? Higher molecule density means faster imaging.

Mathematical model. Signal: $x = \sum_j a_j \delta_{\tau_j}$, $a_j \in \mathbb{C}$, $\tau_j \in T \subset [0, 1]$. Data $y = F_n x$: $n = 2 f_{lo} + 1$ low-frequency coefficients (Nyquist sampling):
$$y(k) = \int_0^1 e^{-i 2\pi k t}\, x(dt) = \sum_j a_j e^{-i 2\pi k \tau_j}, \quad k \in \mathbb{Z}, \; |k| \leq f_{lo}$$
Resolution limit: $1/f_{lo} = \lambda_{lo}$ ($\lambda_{lo}/2$ is the Rayleigh distance). Question: can we resolve the signal beyond this limit? Swapping time and frequency, this is spectral estimation.
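The measurement model is easy to simulate (a toy sketch, not from the talk; the spike locations and amplitudes below are arbitrary illustrative choices):

```python
import cmath

# spike train x = sum_j a_j * delta_{tau_j} on [0, 1)
taus = [0.10, 0.34, 0.81]    # locations, separated by more than 2*lambda_lo = 0.2
amps = [1.0, -0.5, 2.0]      # amplitudes (may be signed, or complex)
f_lo = 10                    # cutoff; n = 2*f_lo + 1 = 21 observed coefficients

# low-frequency data y(k) = sum_j a_j exp(-i 2 pi k tau_j), |k| <= f_lo
y = {k: sum(a * cmath.exp(-2j * cmath.pi * k * t) for a, t in zip(amps, taus))
     for k in range(-f_lo, f_lo + 1)}
```

These 21 complex numbers are all that the recovery procedure gets to see; everything above frequency $f_{lo}$ must be extrapolated.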

Can you find the spikes? Low-frequency data about spike train

Recovery by minimum total variation. Recovery by convex programming:
min $\|x\|_{TV}$ subject to $F_n x = y$
$\|x\|_{TV} = \int |x|(dt)$ is the continuous analog of the $\ell_1$ norm: $x = \sum_j a_j \delta_{\tau_j} \Rightarrow \|x\|_{TV} = \sum_j |a_j|$. With noise:
min $\frac{1}{2}\|y - F_n x\|_{\ell_2}^2 + \lambda \|x\|_{TV}$

Recovery by convex programming. $y(k) = \int_0^1 e^{-i2\pi kt}\, x(dt)$, $|k| \leq f_{lo}$.
Theorem (C. and Fernandez-Granda (2012)). If the spikes are separated by at least $2/f_{lo} := 2\lambda_{lo}$, then the min-TV solution is exact! For real-valued $x$, a minimum distance of $1.87\, \lambda_{lo}$ suffices.
Infinite precision! Whatever the amplitudes! Can recover $(2\lambda_{lo})^{-1} = f_{lo}/2 = n/4$ spikes from $n$ low-frequency samples. Cannot go below $\lambda_{lo}$. Essentially the same result in higher dimensions.

About separation: sparsity is not enough! CS: sparse signals are away from the null space of the sampling operator. Super-resolution: this is not the case (figures: signal and spectrum, log scale from $10^0$ down to $10^{-10}$).

Analysis via prolate spheroidal functions (David Slepian). If the distance between spikes is less than $\lambda_{lo}/2$ (Rayleigh), the problem is hopelessly ill-posed.

The super-resolution factor (SRF). Have data at resolution $\lambda_{lo}$; wish resolution $\lambda_{hi}$: $\mathrm{SRF} = \lambda_{lo}/\lambda_{hi}$.

The super-resolution factor (SRF): frequency viewpoint. Observe the spectrum up to $f_{lo}$; wish to extrapolate up to $f_{hi}$: $\mathrm{SRF} = f_{hi}/f_{lo}$.

With noise: $y = F_n x + \text{noise}$, $(F_n x)(k) = \int_0^1 e^{-i2\pi kt}\, x(dt)$, $|k| \leq f_{lo}$. At the native resolution: $\|(\hat{x} - x) * \varphi_{\lambda_{lo}}\|_{TV} \lesssim \text{noise level}$.

At the finer resolution $\lambda_{hi} = \lambda_{lo}/\mathrm{SRF}$, convex programming achieves $\|(\hat{x} - x) * \varphi_{\lambda_{hi}}\|_{TV} \lesssim \mathrm{SRF}^2 \times \text{noise level}$. Modulus-of-continuity studies for super-resolution: Donoho ('92).

Formulation as a finite-dimensional problem.
Primal: min $\|x\|_{TV}$ s.t. $F_n x = y$ (infinite-dimensional variable $x$, finitely many constraints).
Dual: max $\mathrm{Re}\langle y, c \rangle$ s.t. $\|F_n^* c\|_\infty \leq 1$ (finite-dimensional variable $c$, infinitely many constraints), where $(F_n^* c)(t) = \sum_{|k| \leq f_{lo}} c_k e^{i 2\pi k t}$.
Semidefinite representability: $|(F_n^* c)(t)| \leq 1$ for all $t \in [0, 1]$ is equivalent to (1) there is a Hermitian $Q$ s.t. $\begin{bmatrix} Q & c \\ c^* & 1 \end{bmatrix} \succeq 0$; (2) $\mathrm{Tr}(Q) = 1$; (3) sums along superdiagonals vanish: $\sum_{i=1}^{n-j} Q_{i, i+j} = 0$ for $1 \leq j \leq n - 1$.

SDP formulation. Dual as an SDP:
maximize $\mathrm{Re}\langle y, c \rangle$ subject to $\begin{bmatrix} Q & c \\ c^* & 1 \end{bmatrix} \succeq 0$, $\sum_{i=1}^{n-j} Q_{i, i+j} = \delta_j$, $0 \leq j \leq n - 1$
Dual solution $c$: coefficients of a low-pass trigonometric polynomial $\sum_k c_k e^{i2\pi kt}$ interpolating the sign of the primal solution. To recover spike locations: (1) solve the dual; (2) check where the polynomial takes on magnitude 1.
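Step (2) can be sketched without solving any SDP. As a toy stand-in (not from the talk): for a single spike at $\tau = 0$ with positive amplitude, the coefficients $c_k = 1/n$ give the normalized Dirichlet kernel, a valid low-pass polynomial with $|q(t)| \leq 1$ and equality exactly at the spike, so scanning for magnitude near 1 recovers the location.

```python
import cmath

f_lo = 10
n = 2 * f_lo + 1
# toy dual coefficients: c_k = 1/n certifies a single spike at tau = 0,
# since then q(0) = 1 while |q(t)| < 1 everywhere else on [0, 1)
c = {k: 1.0 / n for k in range(-f_lo, f_lo + 1)}

def q(t):
    """Low-pass dual polynomial q(t) = sum_k c_k exp(i 2 pi k t)."""
    return sum(ck * cmath.exp(2j * cmath.pi * k * t) for k, ck in c.items())

# scan a fine grid and keep the points where |q| reaches magnitude 1
grid = [i / 1000.0 for i in range(1000)]
located = [t for t in grid if abs(q(t)) > 1 - 1e-6]
```

For several spikes the certificate is of course not this simple kernel; constructing one under the separation condition is the heart of the proof, but the localization step is exactly this magnitude scan.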

Noisy example SNR: 14 db Measurements Low res signal

Noisy example SNR: 14 db Measurements High res signal

Noisy example. Average localization error: $6.54 \times 10^{-4}$. Figure: high-res signal vs. estimate.

Summary. Three important problems with missing data: phase retrieval; matrix completion/RPCA; super-resolution. Three simple and model-free recovery procedures via convex programming. Three near-perfect solutions.

Apologies: things I have not talked about Algorithms Applications Avalanche of related works

A small sample of papers I have greatly enjoyed.
Phase retrieval: Netrapalli, Jain, Sanghavi, Phase retrieval using alternating minimization ('13); Waldspurger, d'Aspremont, Mallat, Phase recovery, MaxCut and complex semidefinite programming ('12).
Robust PCA: Gross, Recovering low-rank matrices from few coefficients in any basis ('09); Chandrasekaran, Parrilo and Willsky, Latent variable graphical model selection via convex optimization ('11); Hsu, Kakade and Zhang, Robust matrix decomposition with outliers ('11).
Super-resolution: Kahane, Analyse et synthèse harmoniques ('11); Slepian, Prolate spheroidal wave functions, Fourier analysis, and uncertainty. V - The discrete case ('78).