Computable Performance Analysis of Sparsity Recovery with Applications

Similar documents
Performance Analysis for Sparse Support Recovery

The Stability of Low-Rank Matrix Reconstruction: a Constrained Singular Value Perspective

Sparse Sensing in Colocated MIMO Radar: A Matrix Completion Approach

Tractable Upper Bounds on the Restricted Isometry Constant

New Coherence and RIP Analysis for Weak. Orthogonal Matching Pursuit

Strengthened Sobolev inequalities for a random subspace of functions

Reconstruction from Anisotropic Random Measurements

GREEDY SIGNAL RECOVERY REVIEW

Introduction to Compressed Sensing

Sparse Solutions of an Undetermined Linear System

Signal Recovery from Permuted Observations

Compressed sensing. Or: the equation Ax = b, revisited. Terence Tao. Mahler Lecture Series. University of California, Los Angeles

Thresholds for the Recovery of Sparse Solutions via L1 Minimization

Multipath Matching Pursuit

Compressive Sensing and Beyond

Recent Developments in Compressed Sensing

Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing

Self-Calibration and Biconvex Compressive Sensing

Gauge optimization and duality

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Generalized Orthogonal Matching Pursuit- A Review and Some

COMPRESSED SENSING IN PYTHON

Sensing systems limited by constraints: physical size, time, cost, energy

Compressed Sensing and Neural Networks

Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization

Recovering overcomplete sparse representations from structured sensing

Electromagnetic Imaging Using Compressive Sensing

IEEE SIGNAL PROCESSING LETTERS, VOL. 22, NO. 9, SEPTEMBER

SPARSE signal representations have gained popularity in recent

Robust multichannel sparse recovery

Large-Scale L1-Related Minimization in Compressive Sensing and Beyond

Compressed Sensing. 1 Introduction. 2 Design of Measurement Matrices

Structured matrix factorizations. Example: Eigenfaces

A NEW FRAMEWORK FOR DESIGNING INCOHERENT SPARSIFYING DICTIONARIES

Lecture Notes 9: Constrained Optimization

Lecture: Introduction to Compressed Sensing Sparse Recovery Guarantees

Combining Sparsity with Physically-Meaningful Constraints in Sparse Parameter Estimation

Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design

Robust Principal Component Analysis

On the Projection Matrices Influence in the Classification of Compressed Sensed ECG Signals

CoSaMP. Iterative signal recovery from incomplete and inaccurate samples. Joel A. Tropp

Analog-to-Information Conversion

Joint Direction-of-Arrival and Order Estimation in Compressed Sensing using Angles between Subspaces

The Analysis Cosparse Model for Signals and Images

Pre-weighted Matching Pursuit Algorithms for Sparse Recovery

Performance Analysis for Strong Interference Remove of Fast Moving Target in Linear Array Antenna

Orthogonal Matching Pursuit for Sparse Signal Recovery With Noise

of Orthogonal Matching Pursuit

Solving Underdetermined Linear Equations and Overdetermined Quadratic Equations (using Convex Programming)

Co-Prime Arrays and Difference Set Analysis

Lecture 22: More On Compressed Sensing

Model-Based Compressive Sensing for Signal Ensembles. Marco F. Duarte Volkan Cevher Richard G. Baraniuk

An Introduction to Sparse Approximation

Design of Projection Matrix for Compressive Sensing by Nonsmooth Optimization

Noisy Signal Recovery via Iterative Reweighted L1-Minimization

Sparsifying Transform Learning for Compressed Sensing MRI

Compressed Sensing and Robust Recovery of Low Rank Matrices

A new method on deterministic construction of the measurement matrix in compressed sensing

Does Compressed Sensing have applications in Robust Statistics?

Super-resolution via Convex Programming

Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition

EUSIPCO

Massive MIMO: Signal Structure, Efficient Processing, and Open Problems II

Uniqueness Conditions for A Class of l 0 -Minimization Problems

ECE 8201: Low-dimensional Signal Models for High-dimensional Data Analysis

Simultaneous Sparsity

CS 229r: Algorithms for Big Data Fall Lecture 19 Nov 5

SIGNALS with sparse representations can be recovered

Design of Spectrally Shaped Binary Sequences via Randomized Convex Relaxations

Multiplicative and Additive Perturbation Effects on the Recovery of Sparse Signals on the Sphere using Compressed Sensing

Fast Hard Thresholding with Nesterov s Gradient Method

Enhanced Compressive Sensing and More

Sparse Solutions of Systems of Equations and Sparse Modelling of Signals and Images

Methods for sparse analysis of high-dimensional data, II

The Pros and Cons of Compressive Sensing

5742 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE

Signal Recovery From Incomplete and Inaccurate Measurements via Regularized Orthogonal Matching Pursuit

TRACKING SOLUTIONS OF TIME VARYING LINEAR INVERSE PROBLEMS

Sparse and Low-Rank Matrix Decompositions

Primal Dual Pursuit A Homotopy based Algorithm for the Dantzig Selector

Going off the grid. Benjamin Recht Department of Computer Sciences University of Wisconsin-Madison

Compressed Sensing and Related Learning Problems

Sparse Parameter Estimation: Compressed Sensing meets Matrix Pencil

12.4 Known Channel (Water-Filling Solution)

Fast Angular Synchronization for Phase Retrieval via Incomplete Information

Randomness-in-Structured Ensembles for Compressed Sensing of Images

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN

Efficient Adaptive Compressive Sensing Using Sparse Hierarchical Learned Dictionaries

Tractable performance bounds for compressed sensing.

CSC 576: Variants of Sparse Learning

Uniform Uncertainty Principle and signal recovery via Regularized Orthogonal Matching Pursuit

Compressive Sensing (CS)

Compressed Sensing: Lecture I. Ronald DeVore

Approximate Message Passing with Built-in Parameter Estimation for Sparse Signal Recovery

Sparse linear models

Elaine T. Hale, Wotao Yin, Yin Zhang

Lecture 3. Random Fourier measurements

Minimizing the Difference of L 1 and L 2 Norms with Applications

Sparse Optimization Lecture: Sparse Recovery Guarantees

Transcription:

Computable Performance Analysis of Sparsity Recovery with Applications Arye Nehorai Preston M. Green Department of Electrical & Systems Engineering Washington University in St. Louis, USA European Signal Processing Conference (EUSIPCO) August 28, 2012 Computable Bounds 1

Acknowledgements Based on collaborations with Gongguo Tang (Ph.D. 2011) and Satyabrata Sen (Ph.D 2010). Supported by the US National Science Foundation, Air Force Office of Scientific Research, and the Office of Naval Research. Computable Bounds 2

Figure Acknowledgements Figures on slide 8 and slide 16 are adapted from R. Baraniuk, J. Romberg and M. Wakin s slides Tutorial on Compressive Sensing. Figures on slide 9 and slide 10 are modified from E. J. Candés, J. Romberg and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489-509, Feb. 2006. Figures on slide 11 and slide 12 are reproduced from J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, Feb. 2009. Figure on slide 14 is from D. M. Maliouto, A sparse signal reconstruction perspective for source localization with sensor arrays, Master thesis, MIT, 2003. Computable Bounds 3

Outline Introduction Sparsity recovery Application examples Future work Computable Bounds 4

Outline Introduction Sparsity recovery Application examples Future work Computable Bounds 5

Introduction Low-dimensional structures are ubiquitous in signals: Sparse vectors Compressive sensing MRI Image processing and computer vision Block-sparse vectors Radar Sensor array processing Low-rank matrices Collaborative filtering Robust principal component analysis Low-dimensional manifolds Subspace learning Manifold learning Exploiting low-dimensional structures enables more accurate signal recovery. Computable Bounds 6

Sparsity Example: Compressive Sensing Interest in exploiting sparsity grew recently due to developments in compressive sensing. Traditional signal sampling acquires a signal using expensive hi-fidelity sensors, then compresses the data with a loss of fidelity. Compressive sensing (CS) combines the acquisition with the compression by sampling the signal in a novel way with less data. CS replaces samples with general linear projections, and linear reconstruction with non-linear reconstruction, thus shifting the burden from the hi-fidelity sensing to reconstruction. Key assumption: Many natural signals x have sparse representations in some transform domains Φ, i.e., x = Φs for some sparse vector s. Computable Bounds 7

Sparsity Example: Compressive Sensing (cont.) Figure 1: Paradigm of compressive sensing. Surprising fact 1,2 : Suffices to use m = O(k log n) n linear, non-adaptive, random measurements y to reconstruct a sparse signal, where k = s 0 is the sparsity level of s. The reconstruction performance depends on the sensing matrix A. 1 E. J. Candés, J. Romberg and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE IT, vol. 52, no. 2, pp. 489-509, Feb. 2006. 2 D. L. Donoho, Compressed sensing, IEEE IT, vol. 52, no. 4, pp. 1289 1306, Apr. 2006. Computable Bounds 8

Sparsity Example: MRI MRI images can often be well approximated by piece-wise constant functions. Instead of observing the image directly, we observe its Fourier transform coefficients sampled along, e.g., a radial trajectory in the 2D spatial frequency domain. (a) Logan-Shepp phantom (b) Fourier transform (c) Sampling trajectory Computable Bounds 9

Sparsity Example: MRI (cont.) Reconstruction using minimum energy (or l 2 norm) results in many artifacts. However, the total variation (TV) minimization, which enforces the piece-wise constant property of the image, recovers the original image exactly. (d) Min-energy recovery (e) Magnitude of gradient (f) Min-TV recovery Figure 2: Exploiting sparsity improves the MRI recovery. Computable Bounds 10

Sparsity Example: Image Processing and Computer Vision Sparse recovery is also useful in single-image super-resolution, and face recognition 3. In face recognition, a given facial image is sparsely represented using a dictionary database. The significant coefficients in the representation reveals the person s identity. Figure 3: Robust face recognition with occlusion. 3 J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, Robust Face Recognition via Sparse Representation, IEEE PAMI, vol.31, no.2, pp.210-227, Feb. 2009 Computable Bounds 11

Sparsity Example: Image Processing and Computer Vision (cont.) This approach yields accurate recognition. It is also robust to occlusions and noise corruptions. Figure 4: Robust face recognition with corruption. Computable Bounds 12

Sparsity Example: Sensor Arrays We can also use sparsity to estimate continuous parameters in nonlinear models 4. Consider, for example, the estimation of directions-of-arrival (DOAs) using a sensor array. The narrowband observation model is given by y = Ã( θ) x + w, where Ã( θ) is the array manifold matrix, and x is the signal vector. The unknown DOA parameter θ is continuous. 4 D. Malioutov, M. Cetin, and A. S. Willsky, A sparse signal reconstruction perspective for source localization with sensor arrays, IEEE TSP, vol. 53, no. 8, pp. 3010 3022, 2005. Computable Bounds 13

Sparsity Example: Sensor Arrays (cont.) We discretize the parameter space into grid points given as θ = [θ 1,..., θ N ] T to create a sparse recovery problem. (a) DOAs of two sources. (b) DOA discretization. Figure 5: Sparse modeling for DOA estimation. The observation model then becomes linear: y = [Ã(θ 1) Ã(θ N)]x + w = Ax + w, where the entries of x are nonzero if and only if there is a source direction at the corresponding grid point, implying x is sparse. Computable Bounds 14

Sparsity Example: Sensor Arrays (cont.) Figure 6: Super resolution results. Discretization leads to super resolution when there is no basis mismatch. Sensing matrix depends on the sensor configuration and discretization strategy. Computable Bounds 15

Common Theme The sparse signal x is observed by a linear model corrupted by noise: y = Ax + w. The sensing matrix A R m n projects the set of sparse vectors in R n onto a low dimensional space R m. There exist convex programs that exploit sparsity to recover the signal. The recovery performance highly depends on the sensing matrix A. A good matrix A should preserve the structure of the set of sparse vectors. Our goal: Find computable bounds on recovery errors for a given A. Computable Bounds 16

Motivations for Computable Performance Analysis A computable performance analysis would enable us to: Quantify the confidence in the reconstructed signal, especially when there are no other ways to justify the correctness of the reconstructed signal. Optimize the system design. Figure 6: Two MRI sampling trajectories. Left: radial, Right: spiral. Computable Bounds 17

Our Contributions Introduce a family of functions that quantify the goodness of sensing matrices in sparsity recovery. Derive bounds on reconstruction error in terms of these goodness measures for recovery algorithms. Design efficient algorithms to compute these goodness measures and bounds. Computable Bounds 18

Outline Introduction Sparsity recovery Application examples Future work Computable Bounds 19

Model Consider the measurement model where y = Ax + w, the signal x R n is sparse with l 0 -sparsity level x 0 = k n, the matrix A R m n has m rows and n columns with m n, the noise vector w R m is either bounded w ε, where = 1, 2, or Gaussian w N (0, σ 2 I). Computable Bounds 20

Sparse Signal Recovery In the absence of noise, the signal x can be recovered by solving P 0 : arg min x x 0 subject to y = Ax. P 0 is a non-convex optimization problem and it is NP hard to solve 5. Convex relaxation methods replace the l 0 norm with the l 1 norm. 5 B. K. Natarajan, Sparse approximate solutions to linear systems, SIAM J. on Computing, vol. 24, no. 2, pp. 227 234, 1995. Computable Bounds 21

Existing Recovery Algorithms Basis Pursuit: min z R n z 1 subject to y Az ε Dantzig Selector: min z R n z 1 subject to A T (y Az) λ LASSO Estimator: min z R n 1 2 y Az 2 2 + λ z 1 Figure 6: Geometry of l 1 minimization in the noise-free case. Computable Bounds 22

Previous Approaches for Performance Analysis Restricted Isometry Constant (RIC) 6 : δ k (A) = max Az 2 2/ z 2 2 1 subject to z 0 k, z:z 0 error bounds: If w 2 ε, then the recovery error of the Basis Pursuit is bounded as ˆx x 2 4 1 + δ 2k (A) 1 (1 + 2)δ 2k (A) ε, computational difficulty: No practical way to compute δ k (A) exactly. 6 E. J. Candés, T. Tao, Near-optimal signal recovery from random projections and universal encoding strategies, IEEE Trans. Inform. Theory, vol. 52, no. 12, pp. 5406-5425, Dec. 2006. Computable Bounds 23

Previous Approaches for Performance Analysis (cont.) Mutual Coherence (MC) 7 : µ(a) = max i j sufficient condition: if x 0 1 2 (noise-free case) via Basis Pursuit, however, this condition is weak, A T i A j, A i 2 A j 2 ( 1 + 1 µ(a) ), then we get exact recovery If w 2 ε, then the recovery error of the Basis Pursuit is bounded as 8 ˆx x 2 2 1 µ(a)(4k 1) ε. 7 D. L. Donoho and M. Elad, Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization, in Proc. Nat. Aca. Sci., Vol. 100, pp. 2197-2202, Mar. 2003. 8 D.L. Donoho, M. Elad, and V. Temlyakov, Stable recovery of sparse overcomplete representations in the presence of noise, IEEE Trans. On Information Theory, Vol. 52, pp. 6-18, Jan. 2006. Computable Bounds 24

Related Work on Computable Performance Analysis The following papers included sufficient and necessary conditions for exact recovery of sparse vector in the noise-free case: 9 A. Juditsky, A. Nemirovski, On verifiable sufficient conditions for sparse signal recovery via l 1 minimization, Mathematical Programming Ser. B, vol. 127, pp. 57-88, 2011. 10 A. Juditsky, F. Kilinc Karzan, A. Nemirovski, Verifiable conditions of l 1 -recovery of sparse signals with sign restrictions, Mathematical Programming Ser. B, vol. 127, pp. 89-122, 2011. 11 A. d Aspremont, L. El Ghaoui, Testing the nullspace property using semidefinite programming, Mathematical Programming Ser. B, vol. 127, pp. 123-144, 2011. Our Innovations Verification: Computation: More efficient algorithms to verify exact recovery without noise. Computable performance bounds on recovery errors in noise. Computable Bounds 25

Quality Measure ω (Q, s) For s [1, n] and A R m n, we define ω (Q, s) = min z:z 0 where Q = A or A T A, and = 1, 2, or. Qz z subject to z 1 z s, s is a measure of the sparsity of z: Smaller s implies more sparse z; also larger ω (Q, s) and better reconstruction performance. Without the sparsity constraint, with l norm replaced by l 2 and replaced with l 2, ω (Q, s) is the minimal singular value of Q. ω (Q, s) is a measure of the incoherence (quality) of A, and it will determine the performance bounds. Figure 7: Constraint set in R 3 for s = 1.4. Computable Bounds 26

Reconstruction Error Bounds Theorem 1. Suppose the noise w satisfies w ε, A T w λ, and A T w κλ, κ (0, 1), for the Basis Pursuit, the Dantzig Selector, and the LASSO estimator, respectively, then we have 2ε ˆx x for the Basis Pursuit, ω (A, 2k) ˆx x ˆx x 2λ ω (A T A, 2k) (1 + κ)λ ω (A T A, 2k/(1 κ)) for the Dantzig Selector, and for the LASSO estimator. These error bounds are inversely proportional to ω (Q, s). ω (Q, s) > 0 implies exact recovery in the noise-free case (where ε = 0, λ = 0). When the sparsity level k of the signal decreases, ω (Q, s) becomes larger, implying smaller reconstruction error. Computable Bounds 27

Reconstruction Error Bounds (cont.) The error bounds on the l 1 and l 2 norms can be expressed via ˆx x 1 ck ˆx x and ˆx x 2 ck ˆx x. Computable Bounds 28

Topics of Next Slides Computable Bounds 29

Verification of ω (Q, s) > 0: General Case We provide a computable way to verify sufficient conditions for exact sparse recovery in the noise-free case (see also Shtok et. al. 12 ). Theorem 2. Define s = max{s : ω (Q, s) > 0}. Then, k s /2 = exact sparse recovery. In addition, s is the inverse of the maximum of the n optimal values of the following linear programs: max z z i subject to Qz = 0, z 1 1, i = 1,..., n. Thus, s /2 is the maximal sparsity level below which exact recovery is guaranteed in the noise-free case. 12 J. Shtok and M. Elad, Analysis of the Basis Pursuit Via the Capacity Sets, Jour. of Fourier Analysis and Applications, vol. 14, no. 5-6, pp. 688-711, Dec. 2008. Computable Bounds 30

Verification of ω (Q, s) > 0: Fourier Case For the special yet important class of Fourier sensing matrices, the computational cost can be greatly reduced. Theorem 3. If H is the Fourier transform matrix on a finite abelian group, and the rows of A are sampled from the rows of H, then the optimal values of max z are equal for i = 1, 2,..., n. z i subject to Qz = 0, z 1 1 For these sensing matrices, we compute s by solving a single linear program. Examples include the Fourier matrix and the Hadamard matrix, which are widely used in compressive sensing Computable Bounds 31

Numerical Examples: Maximal Sparsity Levels Table 1: Comparison of our algorithm for sufficient conditions for exact recovery with Juditsky and Nemirovski s in bounding the maximal sparsity levels s /2 for Gaussian sensing matrices (n = 256). Max Sparsity Level m s /2 CPU time Our Algorithm JN s Algorithm Our Algorithm JN s Algorithm 25 1 1 6 91 51 2 2 8 191 76 3 3 10 856 102 4 4 13 5630 128 4 5 16 5711 153 6 6 20 1381 179 7 7 24 3356 204 10 10 25 10039 230 13 14 28 8332 Observation: The two algorithms give similar maximal sparsity levels, but ours is much faster. Computable Bounds 32

Numerical Examples: Maximal Sparsity Levels (cont.) Table 2: Comparison of sufficient conditions for exact recovery based on ω and the Mutual Coherence for Hadamard matrices. m n Max Sparsity Level n = 2048 n = 4096 n = 8192 ) s 2 1 2 (1 + 1 µ ) s 2 1 2 (1 + 1 µ ) s 2 1 2 (1 + 1 µ.2 6 3 9 4 12 6.3 9 4 13 5 18 7.4 12 5 17 7 24 10.5 15 6 21 9 30 13.6 19 8 27 9 39 16.7 25 9 35 14 50 17.8 34 13 48 17 68 23.9 53 20 74 22 105 34 Observation: Our sufficient condition for exact recovery is stronger than given by the Mutual Coherence for Hadamard matrices. Computable Bounds 33

Computation of ω (Q, s): General Case We provide a way to compute ω (Q, s) for any given Q and s, which readily translates to upper bounds on recovery errors, in the noisy case. Theorem 4. The quantity ω (Q, s) is the minimum of the optimal values of the following n linear programs or quadratic programs: min q u R n 1 i Q(:, i)u s.t. u 1 s 1, i = 1,..., n. Here q i is the ith column of Q and Q(:, i) are columns except the ith one. The ith optimization finds the best approximation of the ith column using a sparse (measured by l 1 norm) linear combination of the rest columns. Thus, if the columns of Q can well approximate each other using sparse linear combinations, ω is small and the reconstruction error is large. Note, the Mutual Coherence considers the approximability between two columns. Our ω is more accurate because it considers all columns. Computable Bounds 34

Computation of ω (Q, s): Fourier Case The computation cost can be greatly reduced for the special yet important class of Fourier sensing matrices, similar to the verification case. Theorem 5. If H is the Fourier transform matrix on a finite abelian group, and the rows of A are sampled from the rows of H, then the optimal values of min q u R n 1 i Q(:, i)u subject to u 1 s 1 are equal for all i = 1,..., n. For these sensing matrices, we compute a single ω (Q, s) by solving a single linear program or quadratic program. Examples include the Fourier matrix and the Hadamard matrix. Computable Bounds 35

Numerical Examples: Performance Bounds Comparison I Table 3: ω 2 (A, s) based bounds ˆx x 2 2 2k ω 2 (A,2k) ε vs. RIC based bounds ˆx x 2 4 1+δ 2k 1 (1+ ε for the Basis Pursuit with Bernoulli sensing matrices 2)δ 2k and n = 256 with ε = 1. Bound on Estimation Error k m 51 77 102 128 154 179 205 1 ω bound 4.2 3.8 3.5 3.4 3.3 3.2 3.2 RIC bound 23.7 16.1 13.2 10.6 11.9 2 ω bound 31.4 12.2 9.0 7.4 6.5 6.0 5.6 RIC bound 72.1 192.2 3 ω bound 252.0 30.9 16.8 12.0 10.1 8.9 4 ω bound 52.3 23.4 16.5 13.6 5 ω bound 57.0 28.6 20.1 6 ω bound 1256.6 53.6 30.8 7 ω bound 161.6 50.6 8 ω bound 93.1 9 ω bound 258.7 Computable Bounds 36

Numerical Examples: Performance Bounds Comparison II 2 2k Figure 8: ˆx x 2 ω 2 (A,2k) ε vs. MC based bound ˆx x 2 2 for the Basis Pursuit for Hadamard sensing matrices with n = 2048. 1 µ(a)(4k 1) ε Computable Bounds 37

Numerical Examples: Observations The bounds using ω are tighter than the bounds based on Mutual Coherence or RIC. Bounds based on ω still apply even when the bounds based on Mutual Coherence or RIC do not apply, e.g., for small m and large k. Computable Bounds 38

Quality Measure: l 1 -CMSV For s [1, n] and A R m n, we define the l 1 -constrained minimal singular value (l 1 -CMSV) of A by ρ s (A) = min z:z 0 Ax 2 x 2, subject to z 2 1 z 2 2 s. ρ s (A) is approximated by solving a constrained optimization problem using an interior point method. Similar to ω (Q, s), ρ s (A) is a measure of incoherence of A, and s is a measure of the sparsity of z. Bounds using ρ s (A) are computationally more amenable than using RIC. Bounds based on ρ s (A) are also tighter than bounds using ω (Q, s). Computable Bounds 39

Reconstruction Error Bounds Using l 1 -CMSV Theorem 6. Suppose the noise w satisfies w 2 ε, A T w λ, and A T w κλ, κ (0, 1), for the Basis Pursuit, the Dantzig Selector, and the LASSO estimator, respectively, then we have ˆx x 2 2ε ρ 4k ˆx x 2 4 k ρ 2 λ 4k (1 + κ) ˆx x 2 1 κ. 2 k ρ 2 λ 4k/(1 κ) 2 for the Basis Pursuit, for the Dantzig Selector, and for the LASSO estimator. Computable Bounds 40

Outline Introduction Sparsity recovery Application example Multi-objective optimization of OFDM radar waveform for target detection Future work Computable Bounds 41

Problem Description Goal: Detect a far-field target in the presence of multipath reflections. Challenges: Complex physical phenomena: multiple reflections, fading effects, etc. Lack of line-of-sight (LOS) propagation path to the target. Unknown frequency response of the target. Our Approach Employ OFDM signal to increase the frequency diversity and overcome fading. Exploit multipath reflections to improve the spatial diversity. Computable Bounds 42

Problem Description Radar Buildings Targets Non-LOS region Figure 9: Urban multipath scenario. Computable Bounds 43

Our Approach (cont.) Reformulate the target detection problem as estimating the spectrum of a sparse signal by exploiting the sparsity of multiple signal paths and target velocity. Employ a sparse-recovery algorithm based on the Dantzig selector (DS) approach and analyze its performance in terms of the l 1 -constrained minimal singular value of the measurement matrix. Image of target A Constant range curve Target A Target B v A v B Target C v C Image of target C Propose a constrained multi objective optimization (MOO) technique to design the spectral parameters of the OFDM waveform. Reflecting surface Radar Reflecting surface Computable Bounds 44

Measurement Model Assumptions: Far-field, point target moving with a constant velocity. The target remains within a range cell over the coherent processing interval (CPI). Fixed, mono-static, and coherent radar. The radar knows the geometry of the environment and position of the range cell. The information of the known range cell (τ) is incorporated into the model by choosing t = τ + nt PRI, n = 0, 1,..., N 1. Coherent processing interval n = 0 n = 1 n = N-1 T TPRI Computable Bounds 45

Measurement Model (cont.) Signal model: The complex envelope of the transmitted OFDM signal is s(t) = L 1 l=0 a l e j2πl ft, where L : number of subcarriers, a = [a 0, a 1,..., a L 1 ] T : complex transmitted weights ensuring a H a = 1 for constant energy transmission, f = B/(L + 1) = 1/T : subcarrier spacing, B : signal bandwidth, T : pulse duration. Adaptive design: We will select the coefficients a l s to maximize the target-detection performance. Computable Bounds 46

Measurement Model (cont.) Consider a target corresponding to the known range cell τ, and the radar receives information about the target (moving with v) through the path p. Then, the complex envelope of the received signal at the l-th subchannel y l (n) = a l x lp φ l (n, p, v) + e l (n), for l = 0,..., L 1, n = 0,..., N 1, (1) where φ l (n, p, v) e j2πf lτ e j2πf lβ p nt PRI, and x lp : target scattering coefficient at the l-th subchannel and p-th path, β p = 2 v, u p /c : effective Doppler coefficient along the p-th path, u p : direction-of-arrival (DOA) unit-vector of the p-th path, c : speed of propagation, f l = f c + l f and f c is the carrier frequency, e l (n) : clutter, measurement noise, and co-channel interference (CCI). Computable Bounds 47

Sparse Model We discretize the possible signal paths and target velocities into P and V grid points, respectively. Considering all possible combinations of (p i, v j ), i = 1, 2,..., P, j = 1, 2,..., V, we can rewrite (1) as where y l (n) = a l φ l (n) T x l + e l (n), φ l (n) = [ φ l (n, p 1, v 1 ),..., φ l (n, p 1, v V ), φ l (n, p 2, v 1 ),..., φ l (n, p P, v V ) ] T, x l is a P V 1 sparse vector, having only k l non-zero entries, where k l = I l : sparsity level of x l, I l = {ĩ [1, P ] : pĩ-th path carries target information}. Computable Bounds 48

Sparse Model (cont.) Concatenating the measurements of all L subchannels and N time samples where y = Φ x + e, y = [ y(0) T,..., y(n 1) ] T T and y(n) = [y0 (n),..., y L 1 (n)] T, [ ] Φ = (A Φ(0)) T (A Φ(N 1)) T T with A = diag(a) and Φ(n) = blkdiag ( φ 0 (n) T,..., φ L 1 (n) ) T, x = [ ] x T 0,..., x T T L 1 L 1 is a sparse-vector having k = l=0 k l non-zero entries, e = [ e(0) T,..., e(n 1) T ] T and e(n) = [e0 (n),..., e L 1 (n)] T. Thus, Φ contains the response bases corresponding to all possible paths and velocities. Computable Bounds 49

Statistical Model The vector e(n) represents the clutter, measurement noise, and co-channel interference at the output of L subchannels. We assume that e(n) is temporally white, zero-mean complex Gaussian vector, co-channel interference among the subchannels is characterized by a covariance matrix Σ. Hence, the measurement vector is distributed as y CN LN (Φ x, I N Σ). The identity matrix I N is due to the temporal white noise distribution. Computable Bounds 50

Sparse Recovery and Performance To obtain an estimate of x (a k-sparse vector) from the noisy measurements y, obtained through a linear model we apply the Dantzig selector (DS). Recall that the DS is given by y = Φ x + e, x DS = min z C LP V z 1 subject to Φ H (y Φz) λ σ, (2) where λ = 2 log(lp V ) is a control parameter and σ = tr(σ)/l. To assess the reconstruction performance of this recovery algorithm, we use the l 1 -constrained minimal singular value (l 1 -CMSV) of Φ. Computable Bounds 51

Sparse Recovery - Further Simplification We observe an additional structure in the sparse measurement model, i.e., y = [ ] Φ 0 Φ L 1 x 0. + e, x L 1 where each pair of block-matrices is orthogonal, i.e., Φ H l 1 Φ l2 = 0 for l 1 l 2, each x l, l = 0, 1,..., L 1, is sparse with sparsity level k l. To obtain an estimate of x, x = [ x T 0,..., x T L 1 ] T, by exploiting these properties, we employ simpler L decomposed Dantzig selectors. x DDS,l = min z l C P V z l 1 subject to Φ H l (y Φ l z l ) λ l σ, (3) where λ l = 2 log(p V ). Computable Bounds 52

Performance Analysis Theorem 7. Consider the estimate x obtained by employing our decomposed DS. An upper bound exists on the l 2 -norm of the sparse-estimation error x DDS x 2 4 L 1 whereas using the original DS we get l=0 λ 2 l k l σ 2 ρ 4 4k l (Φ l ), x DS x 2 4 λ k σ ρ 2 4k (Φ). Theorem 8. The L small Dantzig selectors in (3) perform better than the original Dantzig selector in (2) in terms of a smaller upper bound on the l 2 - norm of the sparse-estimation error: 4 L 1 l=0 λ 2 l k l σ 2 ρ 4 4k (Φ l ) 4 λ k σ ρ 2 l 4k (Φ). Computable Bounds 53

Adaptive Waveform Design We can adaptively design the OFDM spectral parameters, a l, to minimize the upper bound on the sparse-estimation error as a (1) = arg min a C L L 1 l=0 λ 2 l k l σ 2 a 4 l ρ4 4k l ( Φ l ) subject to a H a = 1, where Φ l = a l Φl. However, the computation of ρ 4kl ( Φ l ) is difficult with the complex variables. Therefore, we use a computable lower bound on ρ 4kl ( Φ l ), defined as ρ 8kl ( Ψ l ) ρ 4kl ( Φ l ), where Ψ T l Ψ l = [ Ψ T 1 Ψ 1 + Ψ T 2 Ψ 2 0 0 Ψ T 1 Ψ 1 + Ψ T 2 Ψ 2 ], Ψ 1 = Re Φ l, and Ψ 2 = Im Φ l. Computable Bounds 54

Adaptive Waveform Design (cont.) Hence, to minimize the upper bound on the sparse-estimation error, we formulate a single-objective optimization problem as a (1) = arg min a C L L 1 l=0 λ 2 l k l σ 2 a 4 l ρ4 8k l ( Ψ l ) subject to a H a = 1. Using the Lagrange-multiplier approach, we easily obtain the solution as a (1) l = where α l = λ2 l k l σ 2 ρ 4 8k l ( Ψ l ). (2α l) 1/3 L 1 l=0 (2α 1/3, for l = 0, 1,..., L 1, l) Note that the designed waveform depends solely on Φ, the measurement matrix. Computable Bounds 55

Adaptive Waveform Design (cont.) However, to achieve better detection performance it is also essential that the signal parameters are adaptive to the target and noise parameters (x and Σ) To detect the presence of a target in the range cell under test, a standard procedure is to construct the following decision problem { H0 : y = e H 1 : y = Φ x + e, and to find out whether the measurement y is distributed as CN LN (0, I N Σ) or CN LN (Φx, I N Σ). To optimize the detection performance, we maximize the squared Mahalanobisdistance between these two distributions d 2 = x H Φ H (I N Σ) 1 Φ x. Computable Bounds 56

Adaptive Waveform Design (cont.) Hence, in addition to minimizing the upper bound on the estimation error, we propose maximizing another single-objective function based on the squared Mahalanobis-distance (d 2 ) as [ ] a (2) = arg max x H Φ H (I N Σ) 1 Φ x a C L }{{} d 2, subject to a a = 1. After some algebraic manipulation, we have [ N 1 d 2 = a H n=0 ( Φ(n) x x H Φ(n) H) T Σ 1 and therefore the solution of the optimization problem, a (2), will [ be the eigenvector corresponding to the largest eigenvalue of N 1 ( Φ(n) x x H Φ(n) H) T Σ 1]. n=0 ] a, Computable Bounds 57

Multi-Objective Optimization We adaptively design the OFDM spectral parameters, a l, using a constrained multi-objective optimization (MOO) that simultaneously optimizes two objective functions: minimize the upper bound on the sparse-estimation error, maximize the squared Mahalanobis-distance. Mathematically, this is represented as L 1 arg min a C L l=0 a opt = [ N 1 arg max a C L a H n=0 subject to a H a = 1. λ 2 l k l σ 2, a 4 l ρ4 ( Φ 4k l ) l ( Φ(n) x x H Φ(n) H) T Σ 1] a We employ the well-known nondominated sorting genetic algorithm II (NSGA- II) to solve our MOO problem, imposing a restriction on the solutions to satisfy the constraint a H a = 1., Computable Bounds 58

Numerical Examples Problem: Detect a moving target in the presence of multipath in 2D. Target and multipath parameters: The range cell that is at a distance of 3 km from the radar. The target is 13.5 m east from the center line, moving with velocity v = (35/ 2) (î + ĵ) m/s and remains within the range cell over a CPI. There are two actual paths between the target and radar: one direct and one reflected, subtending angles of 0.26 and 0.51, respectively, with respect to the radar. Constant range curve Reflecting surface 3 km 40 m Target 13.5 m Radar v Reflecting surface The scattering coefficients are varied to simulate three target responses: Target 1: x (1) d = [1, 1, 1] T, x (1) r = [0.5, 0.5, 0.5] T. Target 2: x (2) d = [4, 1, 2] T, x (2) r = [2, 0.5, 1] T. Target 3: x (3) d = [1, 10, 1] T, x (3) r = [0.5, 5, 0.5] T. Image of target Computable Bounds 59

Numerical Examples (cont.) Radar parameters: Carrier frequency f c = 1 GHz Available bandwidth B = 100 MHz Number of OFDM subcarriers L = 3 Subcarrier spacing of f = B/(L + 1) = 25 MHz Pulse width T = 1/ f = 40 ns Pulse repetition interval T P = 4 ms Number of coherent pulses N = 20 All the transmit OFDM weights were equal; i.e., a l = 1/ L l Computable Bounds 60

Simulation: Numerical Examples (cont.) We partition the signal paths and target velocities into P = 5 and V = 3 uniform grid points. Hence, the associated signal grid paths subtend angles of { 0.5, 0.25, 0, 0.25, 0.5 } with respect to the radar, and target grid velocities are {25, 35, 45} m/s. Note, these grid points are different from the true parameters. Generate the noise samples from a CN (0, 1) distribution, and then scale to satisfy the required target to clutter-plus-noise ratio (TCNR) TCNR = xh x N L σ0 2. Comment: We kept the clutter-plus-noise power to be the same in each subcarrier by considering Σ = σ0 2 I L. Hence, TCNR will depend only on the target s RCS at each frequency. Computable Bounds 61

Numerical Examples (cont.) Parameters of NSGA-II: Population size = 500 Number of generations = 50 Crossover probability = 0.9 Mutation probability = 0.1 The constraint a H a = 1 is relaxed by ensuring that the solutions satisfy 0.999 a H a 1.001 Computable Bounds 62

Numerical Examples (cont.) Performance of the standard and Decomposed Dantzig Selectors: Target 1 Normalized root mean squred error 2 1.8 1.6 1.4 1.2 1 0.8 0.6 0.4 0.2 Original DS Decomposed DS 0 0 5 10 15 20 Target to clutter plus noise ratio (TCNR) (in db) Probability of detction (P D ) 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 Original DS Decomposed DS 10 1 Probability of false alarm (P FA ) Computation time (in sec) 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Original DS Decomposed DS 0 0 5 10 15 20 Target to clutter plus noise ratio (TCNR) (in db) Target 2 Normalized root mean squared error 2 1.8 1.6 1.4 1.2 1 0.8 0.6 0.4 0.2 Original DS Decomposed DS 0 0 5 10 15 20 Target to clutter plus noise ratio (TCNR) (in db) Probability of detection (P D ) 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Original DS Decomposed DS 0 10 2 10 1 Probability of false alarm (P FA ) Computation time (in sec) 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Original DS Decomposed DS 0 0 5 10 15 20 Target to clutter plus noise ratio (TCNR) (in db) Normalized RMSE. Empirical ROC. Computation time. Computable Bounds 63

Numerical Examples (cont.) Solutions of the single-objective optimization problems: Minimizing the upper bound on the sparse-estimation error we obtained a (1) = [0.54, 0.16, 0.83] T, irrespective of the target parameters. Maximizing the squared Mahalanobis-distance we found a (2) = [1, 0, 0] T or [0, 1, 0] T or [0, 0, 1] T for Target 1, a (2) = [1, 0, 0] T for Target 2, a (2) = [0, 1, 0] T for Target 3. Observation: The maximization of the squared Mahalanobis-distance provided an adaptive waveform with all the signal energy concentrated over a single subcarrier that had the strongest target response, thus losing the frequency diversity of the system. Computable Bounds 64

Numerical Examples (cont.) Solutions of the NSGA-II (MOO problem) at the 50th generation: Solutions 1 1 1 0.8 0.8 0.8 a 3 0.6 0.4 a 3 0.6 0.4 a 3 0.6 0.4 0.2 0.2 0.2 0 0 0 0 0 0 0.5 a 1 1 1 0.5 a 2 0 0.5 a 1 1 1 0.5 a 2 0 0.5 a 1 1 1 0.5 a 2 0 Pareto-front Squared Mahalanobis distance 350 300 250 200 150 100 50 0 2 4 6 8 10 Squared upper bound on sparse estimation x error 10 9 Squared Mahalanobis distance 260 240 220 200 180 160 140 0 2 4 6 8 10 Squared upper bound on sparse estimation x error 10 8 Squared Mahalanobis distance 1100 1000 900 800 700 600 500 400 300 200 0 1 2 3 4 5 Squared upper bound on sparse estimation x error 10 9 Target 1 Target 2 Target 3 Computable Bounds 65

Numerical Examples (cont.) Effect of target scattering coefficent on the NSGA-II solutions: We averaged the whole population of 500 solutions and found a opt,avg = [0.61, 0.39, 0.68] T for Target 1, a opt,avg = [0.88, 0.20, 0.36] T for Target 2, a opt,avg = [0.13, 0.96, 0.15] T for Target 3. Observation: The solution of the MOO distributes the energy of the optimal waveform across different subcarriers in proportion to the distribution of the target energy; i.e., it puts more signal energy into that particular subcarrier in which the target response is stronger. Computable Bounds 66

Numerical Examples (cont.) Performance improvement due to adaptive waveform design: Normalized root mean squared error 2 1.5 1 0.5 Fixed waveform l 1 CMSV minimized adaptive waveform NSGA II optimized adaptive waveform Probability of detection (P D ) 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Fixed waveform l 1 CMSV minimized adaptive waveform NSGA II optimized adaptive waveform 0 0 2 4 6 8 10 12 14 16 18 20 Target to clutter plus noise ratio (TCNR) (in db) RMSE 0 10 2 10 1 Probability of false alarm (P FA ) ROC Computable Bounds 67

Outline Introduction Sparsity recovery Application example Future work Computable Bounds 68

Open Problems Develop computable tight bounds. Bounds for other signal structures, such as block-sparse vectors and low-rank matrices. Procedures for optimization of system design. Computationally efficient algorithms for sensing matrices other than Fourier or Hadamard. Minimization of modeling errors due to discrete grid mismatch. Computable Bounds 69

Possible Approaches Develop computable tighter bounds that control the average or typical system performance. Encode more signal structures into the model rather than using bounds, for example using the continuous-parameter domain. Computable Bounds 70

References G. Tang and A. Nehorai, Verifiable and computable performance analysis of sparsity recovery, submitted for publication. G. Tang and A. Nehorai, Performance analysis of sparse recovery based on constrained minimal singular values, IEEE Trans. Signal Processing, vol. 59, no. 12, pp. 5734-5745, Dec. 2011. G. Tang and A. Nehorai, Fixed point theory and semidefinite programming for computable performance analysis of block-sparsity recovery, submitted for publication. S. Sen, G. Tang, and A. Nehorai, Multiobjective optimization of OFDM radar waveform for target detection, IEEE Trans. on Signal Processing, Vol. 59, pp. 639-652, Feb. 2011. Computable Bounds 71

Questions? Computable Bounds 72

Thank You! Computable Bounds 73