Combining Sparsity with Physically-Meaningful Constraints in Sparse Parameter Estimation


UIUC CSL, Mar. 24

Combining Sparsity with Physically-Meaningful Constraints in Sparse Parameter Estimation

Yuejie Chi, Department of ECE and BMI, Ohio State University
Joint work with Yuxin Chen (Stanford).

Sparse Fourier Analysis

In many sensing applications, one is interested in identifying a parametric signal

    x(t) = \sum_{i=1}^{r} d_i e^{j 2\pi \langle t, f_i \rangle},   t \in \{0, \ldots, n_1 - 1\} \times \cdots \times \{0, \ldots, n_K - 1\},

where f_i \in [0, 1)^K are the frequencies, d_i the amplitudes, and r the model order. Occam's razor: the number of modes r is small. Sensing with a minimal cost: how can we identify the parametric signal model from a small subset of samples of x(t)? This problem has many (classical) applications in communications, remote sensing, and array signal processing.
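As a concrete illustration (not from the talk), here is a minimal NumPy sketch that synthesizes such a spectrally sparse 2-D signal; the helper name sparse_2d_signal and the specific frequencies/amplitudes are made up for the example.

```python
import numpy as np

def sparse_2d_signal(n1, n2, freqs, amps):
    """Synthesize x(t) = sum_i d_i * exp(j*2*pi*<t, f_i>) on an n1 x n2 grid."""
    t1, t2 = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    X = np.zeros((n1, n2), dtype=complex)
    for (f1, f2), d in zip(freqs, amps):
        X += d * np.exp(2j * np.pi * (t1 * f1 + t2 * f2))
    return X

# r = 3 modes with continuous-valued (off-grid) frequencies
freqs = [(0.123, 0.457), (0.361, 0.802), (0.744, 0.095)]
amps = [1.0, 0.8 + 0.3j, 0.5]
X = sparse_2d_signal(11, 11, freqs, amps)
```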

Applications in Communications and Sensing

Multipath channels: a (relatively) small number of strong paths. Radar target identification: a (relatively) small number of strong scatterers.

Applications in Imaging

Swapping time and frequency, the signal becomes a stream of spikes

    z(t) = \sum_{i=1}^{r} d_i \, \delta(t - t_i),

with applications in microscopy imaging and astrophysics. Resolution is limited by the point spread function of the imaging system.

Something Old: Parametric Estimation

Exploiting physically-meaningful constraints, namely the shift invariance of the harmonic structure:

    x(t - \tau) = \sum_{i=1}^{r} d_i e^{j 2\pi \langle t - \tau, f_i \rangle} = \sum_{i=1}^{r} \left( d_i e^{-j 2\pi \langle \tau, f_i \rangle} \right) e^{j 2\pi \langle t, f_i \rangle}.

- Prony's method: root finding.
- SVD-based approaches: ESPRIT [Roy and Kailath 1989], MUSIC [Schmidt 1986], matrix pencil [Hua and Sarkar 1990, Hua 1992].
- Spectrum-blind sampling [Bresler 1996], finite rate of innovation [Vetterli 2002], Xampling [Eldar 2011].

Pros: perfect recovery from O(r) (equi-spaced) samples. Cons: sensitive to noise and outliers, and usually requires prior knowledge of the model order.
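For intuition, the sketch below (my own, not the talk's) implements the classical 1-D matrix-pencil idea under the stated assumptions of noiseless, equispaced samples and a known model order r; the function name matrix_pencil_1d is invented for the example.

```python
import numpy as np

def matrix_pencil_1d(x, r, L=None):
    """Estimate r modes z_i from x[t] = sum_i d_i * z_i**t via the matrix pencil.

    Build a Hankel matrix from the samples, form the shifted/unshifted pencil,
    and read the modes off the eigenvalues of a rank-r reduced pencil.
    """
    n = len(x)
    L = n // 2 if L is None else L                        # pencil parameter
    H = np.array([x[i:i + n - L] for i in range(L + 1)])  # (L+1) x (n-L) Hankel matrix
    H0, H1 = H[:-1], H[1:]                                # unshifted / shifted pencils
    U, s, Vh = np.linalg.svd(H0, full_matrices=False)     # rank-r truncation of H0
    Ur, sr, Vr = U[:, :r], s[:r], Vh[:r]
    A = Ur.conj().T @ H1 @ Vr.conj().T @ np.diag(1 / sr)  # r x r reduced pencil
    return np.linalg.eigvals(A)                           # modes z_i (|z_i| = 1 if undamped)

# toy check with two off-grid frequencies
t = np.arange(32)
x = np.exp(2j * np.pi * np.outer(t, [0.1234, 0.3971])).sum(axis=1)
f_hat = np.sort(np.mod(np.angle(matrix_pencil_1d(x, r=2)) / (2 * np.pi), 1))
print(f_hat)   # approximately [0.1234, 0.3971]
```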

Something New: Compressed Sensing

Exploiting sparsity: compressed sensing [Candès and Tao 2006, Donoho 2006] captures the attributes (sparsity) of signals from a small number of samples. Discretize the frequencies and assume a sparse representation over the discretized basis:

    f_i \in F = \left\{0, \tfrac{1}{n_1}, \ldots, \tfrac{n_1 - 1}{n_1}\right\} \times \left\{0, \tfrac{1}{n_2}, \ldots, \tfrac{n_2 - 1}{n_2}\right\} \times \cdots

Pros: perfect recovery from O(r log n) random samples, robust against noise and outliers. Cons: sensitive to gridding error.

Sensitivity to Basis Mismatch

A toy example: x(t) = e^{j 2\pi f_0 t}. The success of CS relies on sparsity in the DFT basis. Basis mismatch: physics places f_0 off the grid by a frequency offset, and this mismatch translates a sparse signal into an incompressible signal.

[Figure: DFT-domain recoveries for increasing mismatch (fractions of π/n); the normalized recovery error grows with the mismatch.]

A finer grid does not help, and one never estimates the true continuous-valued frequencies! [Chi, Scharf, Pezeshki, Calderbank 2011]
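A quick numerical illustration of the leakage behind this (my own sketch, not from the talk): with a half-bin frequency offset, the largest DFT coefficient captures only a fraction of the signal energy.

```python
import numpy as np

n = 64
t = np.arange(n)
on_grid = np.exp(2j * np.pi * (8 / n) * t)            # frequency exactly on the DFT grid
off_grid = np.exp(2j * np.pi * ((8 + 0.5) / n) * t)   # half-a-bin basis mismatch

for name, x in [("on-grid", on_grid), ("off-grid", off_grid)]:
    c = np.abs(np.fft.fft(x))
    frac = (c.max() ** 2) / (c ** 2).sum()            # energy in the largest DFT coefficient
    print(name, "largest-coefficient energy fraction:", round(float(frac), 3))
```

On the grid the fraction is 1; half a bin off the grid it drops to roughly 0.4, i.e. the signal is no longer sparse in the DFT basis.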

Our Approach

Conventional approaches enforce physically-meaningful constraints, but not sparsity; compressed sensing enforces sparsity, but not physically-meaningful constraints. Our approach combines sparsity with physically-meaningful constraints, so that we can stably estimate the continuous-valued frequencies from a minimal number of time-domain samples:

- revisit the matrix pencil proposed for array signal processing;
- revitalize the matrix pencil by combining it with convex optimization.

Two-Dimensional Frequency Model

Stack the signal x(t) = \sum_{i=1}^{r} d_i e^{j 2\pi \langle t, f_i \rangle} into a matrix X \in C^{n_1 \times n_2}. The matrix X has the following Vandermonde decomposition:

    X = Y D Z^T,

where D = \mathrm{diag}\{d_1, \ldots, d_r\} is diagonal and

    Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ y_1 & y_2 & \cdots & y_r \\ \vdots & \vdots & & \vdots \\ y_1^{n_1 - 1} & y_2^{n_1 - 1} & \cdots & y_r^{n_1 - 1} \end{bmatrix}, \qquad
    Z = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ z_1 & z_2 & \cdots & z_r \\ \vdots & \vdots & & \vdots \\ z_1^{n_2 - 1} & z_2^{n_2 - 1} & \cdots & z_r^{n_2 - 1} \end{bmatrix}

are Vandermonde matrices, with y_i = \exp(j 2\pi f_{1i}), z_i = \exp(j 2\pi f_{2i}), and f_i = (f_{1i}, f_{2i}).

Goal: we observe a random subset of the entries of X and wish to recover the missing entries.
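The decomposition is easy to check numerically; the sketch below (an illustration of mine, with made-up frequencies) builds Y, D, Z and verifies that Y D Z^T matches direct synthesis of x(t).

```python
import numpy as np

def vandermonde(vals, m):
    """Column i is [1, v_i, v_i**2, ..., v_i**(m-1)]."""
    return np.vander(vals, N=m, increasing=True).T      # shape m x len(vals)

n1, n2 = 11, 13
f1 = np.array([0.12, 0.37, 0.81])                       # first frequency coordinates
f2 = np.array([0.45, 0.09, 0.66])                       # second frequency coordinates
d = np.array([1.0, 0.8, 0.5 + 0.2j])                    # amplitudes

Y = vandermonde(np.exp(2j * np.pi * f1), n1)            # n1 x r
Z = vandermonde(np.exp(2j * np.pi * f2), n2)            # n2 x r
X = Y @ np.diag(d) @ Z.T                                # n1 x n2 data matrix

# sanity check against direct synthesis of x(t)
t1, t2 = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
X_direct = sum(di * np.exp(2j * np.pi * (a * t1 + b * t2)) for di, a, b in zip(d, f1, f2))
assert np.allclose(X, X_direct)
```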

Matrix Completion

Recall that X = Y D Z^T, with Y, Z Vandermonde and D = \mathrm{diag}\{d_1, \ldots, d_r\} diagonal. Quick observation: X can be a low-rank matrix, with rank(X) ≤ r.

Question: can we apply matrix completion algorithms directly on X?

- Yes, but it yields sub-optimal performance: it requires at least r · max{n_1, n_2} samples.
- No: X is no longer low-rank if r > min(n_1, n_2), and note that r can be as large as n_1 n_2.

Revisiting Matrix Pencil: Matrix Enhancement

Given a data matrix X, Hua proposed the following matrix enhancement for two-dimensional frequency models [Hua 1992]. Choose two pencil parameters k_1 and k_2; the enhanced form X_e is a k_1 \times (n_1 - k_1 + 1) block Hankel matrix

    X_e = \begin{bmatrix} X_0 & X_1 & \cdots & X_{n_1 - k_1} \\ X_1 & X_2 & \cdots & X_{n_1 - k_1 + 1} \\ \vdots & \vdots & & \vdots \\ X_{k_1 - 1} & X_{k_1} & \cdots & X_{n_1 - 1} \end{bmatrix},

where each block is a k_2 \times (n_2 - k_2 + 1) Hankel matrix

    X_l = \begin{bmatrix} x_{l,0} & x_{l,1} & \cdots & x_{l, n_2 - k_2} \\ x_{l,1} & x_{l,2} & \cdots & x_{l, n_2 - k_2 + 1} \\ \vdots & \vdots & & \vdots \\ x_{l, k_2 - 1} & x_{l, k_2} & \cdots & x_{l, n_2 - 1} \end{bmatrix}.
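A short sketch of the enhancement (my own code, reusing the matrix X from the previous snippet): build the block Hankel matrix and confirm its rank is at most r.

```python
import numpy as np

def enhance(X, k1, k2):
    """Hua's matrix enhancement: a k1 x (n1-k1+1) block Hankel matrix whose blocks
    are k2 x (n2-k2+1) Hankel matrices built from the rows of X."""
    n1, n2 = X.shape
    def hankel_block(row):
        return np.array([row[j:j + n2 - k2 + 1] for j in range(k2)])
    blocks = [hankel_block(X[l]) for l in range(n1)]
    return np.block([[blocks[i + j] for j in range(n1 - k1 + 1)] for i in range(k1)])

Xe = enhance(X, k1=6, k2=7)                 # X: the rank-3 example built above
print(Xe.shape, np.linalg.matrix_rank(Xe))  # rank(X_e) <= r = 3
```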

Low Rankness of the Enhanced Matrix

Choosing pencil parameters k_1 = Θ(n_1) and k_2 = Θ(n_2), the dimensions of X_e are proportional to n_1 n_2 \times n_1 n_2. The enhanced matrix can be decomposed as follows [Hua 1992]:

    X_e = \begin{bmatrix} Z_L \\ Z_L Y_d \\ \vdots \\ Z_L Y_d^{\,k_1 - 1} \end{bmatrix} D \, \big[ Z_R, \; Y_d Z_R, \; \ldots, \; Y_d^{\,n_1 - k_1} Z_R \big],

where Z_L and Z_R are Vandermonde matrices specified by z_1, \ldots, z_r, and Y_d = \mathrm{diag}[y_1, y_2, \ldots, y_r]. Hence the enhanced form X_e is low-rank:

    rank(X_e) ≤ r,

i.e. spectral sparsity translates into low-rankness, and the low-rank property holds even with damping modes.

Enhanced Matrix Completion (EMaC)

The natural algorithm is to find the enhanced matrix with the minimal rank satisfying the measurements:

    minimize_{M \in C^{n_1 \times n_2}}  rank(M_e)   subject to  M_{i,j} = X_{i,j}, (i, j) \in \Omega,

where \Omega denotes the sampling set. Motivated by matrix completion, we solve its convex relaxation:

    (EMaC)  minimize_{M \in C^{n_1 \times n_2}}  \|M_e\|_*   subject to  M_{i,j} = X_{i,j}, (i, j) \in \Omega,

where \|\cdot\|_* denotes the nuclear norm. The algorithm is referred to as Enhanced Matrix Completion (EMaC).
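For concreteness, here is a minimal 1-D sketch of the EMaC program (my own, assuming CVXPY with complex-variable support is available; the talk does not prescribe a solver). In 1-D the enhancement reduces to an ordinary Hankel lift of the signal.

```python
import numpy as np
import cvxpy as cp

def emac_1d(x_obs, omega, n, k):
    """Nuclear-norm minimization over the k x (n-k+1) Hankel lift of a length-n signal."""
    m = cp.Variable(n, complex=True)
    # Hankel enhancement of the variable: row i is m[i : i + n - k + 1]
    Me = cp.vstack([cp.reshape(m[i:i + n - k + 1], (1, n - k + 1)) for i in range(k)])
    constraints = [m[j] == x_obs[j] for j in omega]
    prob = cp.Problem(cp.Minimize(cp.normNuc(Me)), constraints)
    prob.solve()
    return m.value

# toy example: r = 2 off-grid frequencies, observe ~60% of the entries
n, k = 32, 16
t = np.arange(n)
x = np.exp(2j * np.pi * 0.1234 * t) + 0.7 * np.exp(2j * np.pi * 0.4321 * t)
omega = np.random.default_rng(0).choice(n, size=int(0.6 * n), replace=False)
x_hat = emac_1d(x, omega, n, k)
```

The continuous-valued frequencies can then be read off the completed signal with any classical method, e.g. the matrix-pencil sketch shown earlier.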

Enhanced Matrix Completion (EMaC), continued

    (EMaC)  minimize_{M \in C^{n_1 \times n_2}}  \|M_e\|_*   subject to  M_{i,j} = X_{i,j}, (i, j) \in \Omega.

Existing matrix completion results do not apply to this structured setting, and would in any case require at least O(nr) samples. Question: how many samples do we need?

Introduce Coherence Measure

Define the 2-D Dirichlet kernel

    K(k_1, k_2, f_1, f_2) = \frac{1}{k_1 k_2} \left( \frac{1 - e^{j 2\pi k_1 f_1}}{1 - e^{j 2\pi f_1}} \right) \left( \frac{1 - e^{j 2\pi k_2 f_2}}{1 - e^{j 2\pi f_2}} \right),

and define G_L and G_R as r \times r Gram matrices such that

    (G_L)_{i,l} = K(k_1, k_2, f_{1i} - f_{1l}, f_{2i} - f_{2l}),
    (G_R)_{i,l} = K(n_1 - k_1 + 1, n_2 - k_2 + 1, f_{1i} - f_{1l}, f_{2i} - f_{2l}).

[Figure: magnitude of the Dirichlet kernel as a function of the separation along the two axes.]
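The Gram matrices are straightforward to evaluate numerically; below is a small sketch of mine (not from the talk) that computes G_L, G_R and the resulting incoherence parameter µ for a made-up set of frequencies, taking each 1-D Dirichlet factor equal to its limit at zero offset.

```python
import numpy as np

def dirichlet_2d(k1, k2, f1, f2):
    """Normalized 2-D Dirichlet kernel K(k1, k2, f1, f2)."""
    def factor(k, f):
        num = 1 - np.exp(2j * np.pi * k * f)
        den = 1 - np.exp(2j * np.pi * f)
        small = np.abs(den) < 1e-12                        # zero offset: factor -> k
        out = np.where(small, k, num / np.where(small, 1, den))
        return out / k
    return factor(k1, f1) * factor(k2, f2)

def gram(freqs, k1, k2):
    """(G)_{i,l} = K(k1, k2, f1_i - f1_l, f2_i - f2_l)."""
    f1, f2 = freqs[:, 0], freqs[:, 1]
    return dirichlet_2d(k1, k2, f1[:, None] - f1[None, :], f2[:, None] - f2[None, :])

freqs = np.array([(0.12, 0.45), (0.37, 0.09), (0.81, 0.66)])
k1, k2, n1, n2 = 6, 7, 11, 13
GL = gram(freqs, k1, k2)
GR = gram(freqs, n1 - k1 + 1, n2 - k2 + 1)
sigma_min = min(np.linalg.svd(GL, compute_uv=False).min(),
                np.linalg.svd(GR, compute_uv=False).min())
print("incoherence parameter mu =", round(1 / sigma_min, 2))
```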

Introduce Incoherence Measure

The incoherence condition holds w.r.t. µ if

    \sigma_{\min}(G_L) \ge \frac{1}{\mu},  \qquad  \sigma_{\min}(G_R) \ge \frac{1}{\mu}.

Examples where µ = Θ(1):

- randomly generated frequencies;
- (mild) perturbations of grid points;
- in 1-D, with k_1 \approx n/2: well-separated frequencies (Liao and Fannjiang, 2014), i.e. \Delta = \min_{i \ne j} |f_i - f_j| \ge 2/n, which is about 2 times the Rayleigh limit.

Theoretical Guarantees for the Noiseless Case

Theorem 1 [Chen and Chi, 2013] (Noiseless samples). Let n = n_1 n_2. If all measurements are noiseless, then EMaC recovers X perfectly with high probability provided

    m > C \mu r \log^4 n,

where C is some universal constant.

Implications:

- deterministic signal model, random observations;
- the coherence µ depends only on the frequencies, not the amplitudes;
- near-optimal up to logarithmic factors: Θ(r · polylog n);
- general theoretical guarantees for Hankel (Toeplitz) matrix completion, with applications in control, MRI, natural language processing, etc.

Phase Transition

[Figure 1: Phase transition diagram (sparsity level r vs. number of samples m) where spike locations are randomly generated; shown for n_1 = n_2 = 15.]

Robustness to Bounded Noise

Assume the observed samples are noisy, X^o = X + N, where N is bounded noise, and solve

    (EMaC-Noisy)  minimize_{M \in C^{n_1 \times n_2}}  \|M_e\|_*   subject to  \|P_\Omega(M - X^o)\|_F \le \delta.

Theorem 2 [Chen and Chi, 2013] (Noisy samples). Suppose X^o is a noisy copy of X that satisfies \|P_\Omega(X - X^o)\|_F \le \delta. Under the conditions of Theorem 1, the solution to EMaC-Noisy satisfies

    \|\hat{X}_e - X_e\|_F \le \left\{ 2\sqrt{n} + 8n + \frac{8\sqrt{2}\, n^2}{m} \right\} \delta

with probability exceeding 1 - n^{-2}.

Implications: the average per-entry inaccuracy is bounded above by O\!\left(\frac{n}{m}\delta\right). In practice, EMaC-Noisy usually yields better estimates.
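If one is using the 1-D CVXPY sketch from the EMaC slide, the noisy variant only changes the data constraint; a hedged illustration (mine, not the talk's):

```python
import numpy as np
import cvxpy as cp

def emac_noisy_1d(x_obs, omega, n, k, delta):
    """EMaC-Noisy in 1-D: exact data constraints replaced by a norm-ball constraint."""
    m = cp.Variable(n, complex=True)
    Me = cp.vstack([cp.reshape(m[i:i + n - k + 1], (1, n - k + 1)) for i in range(k)])
    residual = cp.hstack([m[j] - x_obs[j] for j in omega])
    prob = cp.Problem(cp.Minimize(cp.normNuc(Me)), [cp.norm(residual, 2) <= delta])
    prob.solve()
    return m.value
```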

Singular Value Thresholding (Noisy Case)

Several optimized solvers for Hankel matrix completion exist, for example [Fazel et al. 2013, Liu and Vandenberghe 2009].

Algorithm 1: Singular Value Thresholding for EMaC
1: initialize: set M_0 = X_e and t = 0.
2: repeat
3:   1) Q_t \leftarrow D_{\tau_t}(M_t)   (singular-value thresholding)
4:   2) M_{t+1} \leftarrow \mathcal{H}_X(Q_t)   (projection onto Hankel matrices consistent with the observations)
5:   3) t \leftarrow t + 1
6: until convergence

[Figure 2: true and reconstructed signals (amplitude vs. vectorized time index), recovered from m/(n_1 n_2) = 5.8% of the entries.]
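A rough 1-D instantiation of this iteration (my own sketch, with a fixed threshold τ rather than a tuned schedule): D_τ soft-thresholds the singular values of the Hankel lift, and the Hankel projection averages the anti-diagonals and then re-imposes the observed entries.

```python
import numpy as np

def svt_hankel_1d(x_obs, omega, n, k, tau=1.0, n_iter=200):
    """Singular-value thresholding for 1-D Hankel matrix completion."""
    mask = np.zeros(n, dtype=bool)
    mask[omega] = True
    m = np.where(mask, x_obs, 0).astype(complex)           # current signal estimate

    def hankel(v):                                         # k x (n-k+1) Hankel lift
        return np.array([v[i:i + n - k + 1] for i in range(k)])

    def from_hankel(H):                                    # average the anti-diagonals back
        v = np.zeros(n, dtype=complex)
        counts = np.zeros(n)
        for i in range(k):
            for j in range(n - k + 1):
                v[i + j] += H[i, j]
                counts[i + j] += 1
        return v / counts

    for _ in range(n_iter):
        U, s, Vh = np.linalg.svd(hankel(m), full_matrices=False)
        Q = (U * np.maximum(s - tau, 0)) @ Vh              # D_tau: soft-threshold singular values
        m = from_hankel(Q)
        m[mask] = x_obs[mask]                              # enforce consistency with observations
    return m
```

Unlike the convex-programming sketches above, this only requires SVDs and is much cheaper per iteration; the threshold and iteration count here are placeholders, not recommendations from the talk.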

Robustness to Sparse Outliers

What if a constant portion of the measurements are arbitrarily corrupted?

[Figure: clean signal vs. corrupted noisy samples (real part vs. data index).]

Following the robust PCA approach [Candès et al. 2011, Chandrasekaran et al. 2011], solve

    (RobustEMaC)  minimize_{M, S \in C^{n_1 \times n_2}}  \|M_e\|_* + \lambda \|S_e\|_1   subject to  (M + S)_{i,l} = X^{\mathrm{corrupted}}_{i,l}, (i, l) \in \Omega.
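A 1-D sketch of RobustEMaC in the same hedged CVXPY style as before (mine, not the talk's); the corruption variable is constrained to live on the observed entries.

```python
import numpy as np
import cvxpy as cp

def robust_emac_1d(x_obs, omega, n, k, lam):
    """1-D RobustEMaC sketch: nuclear norm of the Hankel lift of the signal
    plus an elementwise l1 penalty on the Hankel lift of the sparse corruption."""
    m = cp.Variable(n, complex=True)
    s = cp.Variable(n, complex=True)
    lift = lambda v: cp.vstack(
        [cp.reshape(v[i:i + n - k + 1], (1, n - k + 1)) for i in range(k)])
    omega = set(int(j) for j in omega)
    constraints = [m[j] + s[j] == x_obs[j] for j in omega]
    constraints += [s[j] == 0 for j in range(n) if j not in omega]   # corruption only where observed
    objective = cp.normNuc(lift(m)) + lam * cp.sum(cp.abs(lift(s)))
    prob = cp.Problem(cp.Minimize(objective), constraints)
    prob.solve()
    return m.value, s.value
```

A natural choice for lam is on the order of 1/sqrt(m log n), matching the theorem on the next slide.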

Theoretical Guarantees for Robust Recovery

Assume that a fraction s of the observed entries is arbitrarily corrupted.

Theorem 3 [Chen and Chi, 2013] (Sparse outliers). Set n = n_1 n_2 and \lambda = \frac{1}{\sqrt{m \log n}}. If s \le 2\%, then RobustEMaC recovers X with high probability provided

    m > C \mu r^2 \log^3 n,

where C is some universal constant.

Implications:

- sample complexity m \sim \Theta(r^2 \log^3 n): a slight loss compared with the noiseless case;
- robust to a constant portion of outliers: the number of corruptions can be \Theta(m).

Robustness to Sparse Corruptions

[Figure 3: Robustness to sparse corruptions. (a) Observation: the clean signal and its corrupted, subsampled samples; (b) Recovery: the recovered signal and the recovered sparse corruptions (real part vs. data index).]

Phase Transition for Line Spectrum Estimation

Fix the amount of corruption at a fixed percentage of the total number of samples.

[Figure 4: Phase transition diagram (sparsity level r vs. number of samples m) where spike locations are randomly generated; shown for n = 125.]

Related Work

Cadzow's denoising: a non-convex heuristic to denoise line spectrum data based on the Hankel form.

Atomic norm minimization: recently proposed by [Tang et al., 2013] for compressive line spectrum estimation off the grid,

    \min_s \|s\|_A   subject to   P_\Omega(s) = P_\Omega(x),

where the atomic norm is defined as

    \|x\|_A = \inf\Big\{ \sum_i |d_i| \;:\; x(t) = \sum_i d_i e^{j 2\pi f_i t} \Big\}.

Random signal model: if the frequencies are separated by 4 times the Rayleigh limit and the phases are random, then perfect recovery from O(r \log r \log n) samples;

- no stability results with random observations or sparse corruptions;
- extendable to multi-dimensional frequencies [Chi and Chen, 2013], but the SDP characterization is more complicated [Xu et al. 2013].

Comparison with Atomic Norm Minimization

Phase transitions for line spectrum estimation: numerically, the EMaC approach appears less sensitive to the separation condition.

[Figure: phase transition diagrams (sparsity level r vs. number of samples m) for EMaC and atomic norm minimization, with and without a separation condition on the frequencies.]

Super Resolution (2-D)

EMaC can also be applied to super resolution, where one observes the low-end spectrum and extrapolates to higher frequencies [Candès and Fernández-Granda, 2012]. A theoretical guarantee in this setting remains an open problem.

[Figure: (a) ground truth; (b) low-resolution image; (c) super-resolution via EMaC.]

Final Remarks

- Connect spectral compressed sensing with matrix completion via convex relaxation of the enhanced (structured) matrix.
- Connect a traditional approach (parametric harmonic retrieval via the matrix pencil) with a recent advance (matrix completion).
- Future work: performance guarantees for (2-D) super resolution.

Q&A

Preprint available at arXiv: Robust Spectral Compressed Sensing via Structured Matrix Completion, http://arxiv.org/abs/1304.8126

Thank You! Questions?