Computable Performance Analysis of Sparsity Recovery with Applications

Computable Performance Analysis of Sparsity Recovery with Applications Arye Nehorai Preston M. Green Department of Electrical & Systems Engineering Washington University in St. Louis, USA European Signal Processing Conference (EUSIPCO) August 28, 2012 Computable Bounds 1

Acknowledgements (slide 2)
Based on collaborations with Gongguo Tang (Ph.D. 2011) and Satyabrata Sen (Ph.D. 2010).
Supported by the US National Science Foundation, the Air Force Office of Scientific Research, and the Office of Naval Research.

Figure Acknowledgements (slide 3)
Figures on slide 8 and slide 16 are adapted from R. Baraniuk, J. Romberg and M. Wakin's slides, Tutorial on Compressive Sensing.
Figures on slide 9 and slide 10 are modified from E. J. Candès, J. Romberg and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489-509, Feb. 2006.
Figures on slide 11 and slide 12 are reproduced from J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, Feb. 2009.
Figure on slide 14 is from D. M. Malioutov, A sparse signal reconstruction perspective for source localization with sensor arrays, Master's thesis, MIT, 2003.

Outline (slide 4)
Introduction
Sparsity recovery
Application examples
Future work


Introduction (slide 6)
Low-dimensional structures are ubiquitous in signals:
Sparse vectors: compressive sensing, MRI, image processing and computer vision
Block-sparse vectors: radar, sensor array processing
Low-rank matrices: collaborative filtering, robust principal component analysis
Low-dimensional manifolds: subspace learning, manifold learning
Exploiting low-dimensional structures enables more accurate signal recovery.

Sparsity Example: Compressive Sensing (slide 7)
Interest in exploiting sparsity grew recently due to developments in compressive sensing.
Traditional signal sampling acquires a signal using expensive high-fidelity sensors, then compresses the data with a loss of fidelity.
Compressive sensing (CS) combines acquisition with compression by sampling the signal in a novel way with less data.
CS replaces samples with general linear projections, and linear reconstruction with non-linear reconstruction, thus shifting the burden from high-fidelity sensing to reconstruction.
Key assumption: many natural signals $x$ have sparse representations in some transform domain $\Phi$, i.e., $x = \Phi s$ for some sparse vector $s$.

Sparsity Example: Compressive Sensing (cont.) (slide 8)
Figure 1: Paradigm of compressive sensing.
Surprising fact [1, 2]: it suffices to use $m = O(k \log n) \ll n$ linear, non-adaptive, random measurements $y$ to reconstruct a sparse signal, where $k = \|s\|_0$ is the sparsity level of $s$.
The reconstruction performance depends on the sensing matrix $A$.
[1] E. J. Candès, J. Romberg and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489-509, Feb. 2006.
[2] D. L. Donoho, Compressed sensing, IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289-1306, Apr. 2006.

Sparsity Example: MRI (slide 9)
MRI images can often be well approximated by piecewise-constant functions.
Instead of observing the image directly, we observe its Fourier transform coefficients sampled along, e.g., a radial trajectory in the 2D spatial frequency domain.
Figure panels: (a) Logan-Shepp phantom, (b) Fourier transform, (c) Sampling trajectory.

Sparsity Example: MRI (cont.) (slide 10)
Reconstruction using minimum energy (or $\ell_2$ norm) results in many artifacts.
However, total variation (TV) minimization, which enforces the piecewise-constant property of the image, recovers the original image exactly.
Figure 2: Exploiting sparsity improves the MRI recovery. Panels: (d) Min-energy recovery, (e) Magnitude of gradient, (f) Min-TV recovery.

Sparsity Example: Image Processing and Computer Vision (slide 11)
Sparse recovery is also useful in single-image super-resolution and face recognition [3].
In face recognition, a given facial image is sparsely represented using a dictionary database. The significant coefficients in the representation reveal the person's identity.
Figure 3: Robust face recognition with occlusion.
[3] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, Robust Face Recognition via Sparse Representation, IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210-227, Feb. 2009.

Sparsity Example: Image Processing and Computer Vision (cont.) (slide 12)
This approach yields accurate recognition. It is also robust to occlusions and noise corruptions.
Figure 4: Robust face recognition with corruption.

Sparsity Example: Sensor Arrays (slide 13)
We can also use sparsity to estimate continuous parameters in nonlinear models [4].
Consider, for example, the estimation of directions-of-arrival (DOAs) using a sensor array. The narrowband observation model is given by
$y = \tilde{A}(\theta)\,x + w$,
where $\tilde{A}(\theta)$ is the array manifold matrix, and $x$ is the signal vector. The unknown DOA parameter $\theta$ is continuous.
[4] D. Malioutov, M. Cetin, and A. S. Willsky, A sparse signal reconstruction perspective for source localization with sensor arrays, IEEE Trans. Signal Processing, vol. 53, no. 8, pp. 3010-3022, 2005.

Sparsity Example: Sensor Arrays (cont.) (slide 14)
We discretize the parameter space into grid points $\theta = [\theta_1, \ldots, \theta_N]^T$ to create a sparse recovery problem.
Figure 5: Sparse modeling for DOA estimation. Panels: (a) DOAs of two sources, (b) DOA discretization.
The observation model then becomes linear:
$y = [\tilde{A}(\theta_1) \cdots \tilde{A}(\theta_N)]\,x + w = Ax + w$,
where the entries of $x$ are nonzero if and only if there is a source direction at the corresponding grid point, implying $x$ is sparse.
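The grid construction can be illustrated with a minimal Python sketch. It assumes a uniform linear array with half-wavelength spacing and a 1-degree grid; the array geometry, the grid, the source angles, and the function name `steering_matrix` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def steering_matrix(grid_deg, num_sensors, spacing_wavelengths=0.5):
    """Columns are steering vectors of an assumed uniform linear array,
    one column per candidate angle on the grid."""
    theta = np.deg2rad(np.asarray(grid_deg, dtype=float))
    sensor_idx = np.arange(num_sensors)[:, None]            # shape (m, 1)
    phase = 2j * np.pi * spacing_wavelengths * sensor_idx * np.sin(theta)[None, :]
    return np.exp(phase)                                     # shape (m, N)

grid = np.linspace(-90.0, 90.0, 181)        # hypothetical 1-degree grid
A = steering_matrix(grid, num_sensors=8)    # sensing matrix of the linear model

# Two sources that lie exactly on grid points give a 2-sparse vector x.
x = np.zeros(grid.size, dtype=complex)
x[np.searchsorted(grid, -20.0)] = 1.0
x[np.searchsorted(grid, 35.0)] = 0.5

rng = np.random.default_rng(0)
w = 0.01 * (rng.standard_normal(8) + 1j * rng.standard_normal(8))
y = A @ x + w                               # y = Ax + w
```

With 8 sensors and 181 grid points the system is heavily underdetermined, which is why the sparsity of x is needed to recover the source directions.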

Sparsity Example: Sensor Arrays (cont.) (slide 15)
Figure 6: Super resolution results.
Discretization leads to super resolution when there is no basis mismatch.
The sensing matrix depends on the sensor configuration and discretization strategy.

Common Theme (slide 16)
The sparse signal $x$ is observed by a linear model corrupted by noise: $y = Ax + w$.
The sensing matrix $A \in \mathbb{R}^{m \times n}$ projects the set of sparse vectors in $\mathbb{R}^n$ onto a low-dimensional space $\mathbb{R}^m$.
There exist convex programs that exploit sparsity to recover the signal.
The recovery performance depends strongly on the sensing matrix $A$.
A good matrix $A$ should preserve the structure of the set of sparse vectors.
Our goal: find computable bounds on recovery errors for a given $A$.

Motivations for Computable Performance Analysis (slide 17)
A computable performance analysis would enable us to:
Quantify the confidence in the reconstructed signal, especially when there are no other ways to justify the correctness of the reconstructed signal.
Optimize the system design.
Figure 6: Two MRI sampling trajectories. Left: radial. Right: spiral.

Our Contributions (slide 18)
Introduce a family of functions that quantify the goodness of sensing matrices in sparsity recovery.
Derive bounds on reconstruction error in terms of these goodness measures for recovery algorithms.
Design efficient algorithms to compute these goodness measures and bounds.

Outline (slide 19)
Introduction
Sparsity recovery
Application examples
Future work

Model (slide 20)
Consider the measurement model $y = Ax + w$, where
the signal $x \in \mathbb{R}^n$ is sparse with $\ell_0$-sparsity level $\|x\|_0 = k \ll n$,
the matrix $A \in \mathbb{R}^{m \times n}$ has $m$ rows and $n$ columns with $m < n$,
the noise vector $w \in \mathbb{R}^m$ is either bounded, $\|w\|_\diamond \le \varepsilon$ where $\diamond = 1, 2, \infty$, or Gaussian, $w \sim \mathcal{N}(0, \sigma^2 I)$.

Sparse Signal Recovery (slide 21)
In the absence of noise, the signal $x$ can be recovered by solving
$P_0: \ \arg\min_x \|x\|_0$ subject to $y = Ax$.
$P_0$ is a non-convex optimization problem and it is NP-hard to solve [5].
Convex relaxation methods replace the $\ell_0$ norm with the $\ell_1$ norm.
[5] B. K. Natarajan, Sparse approximate solutions to linear systems, SIAM J. on Computing, vol. 24, no. 2, pp. 227-234, 1995.

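Existing Recovery Algorithms (slide 22)
Basis Pursuit: $\min_{z \in \mathbb{R}^n} \|z\|_1$ subject to $\|y - Az\|_\diamond \le \varepsilon$
Dantzig Selector: $\min_{z \in \mathbb{R}^n} \|z\|_1$ subject to $\|A^T(y - Az)\|_\infty \le \lambda$
LASSO Estimator: $\min_{z \in \mathbb{R}^n} \frac{1}{2}\|y - Az\|_2^2 + \lambda\|z\|_1$
Figure 6: Geometry of $\ell_1$ minimization in the noise-free case.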
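As an illustration of the noise-free Basis Pursuit ($\varepsilon = 0$), the sketch below recasts it as a linear program and solves it with `scipy.optimize.linprog`. The split $z = p - q$ with $p, q \ge 0$, the problem sizes, and the random test signal are choices made here for illustration; the slides do not prescribe a solver.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||z||_1 subject to Az = y, using z = p - q with p, q >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                  # sum(p) + sum(q) equals ||z||_1 at the optimum
    A_eq = np.hstack([A, -A])           # A (p - q) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

# Hypothetical test case: Gaussian sensing matrix, 3-sparse signal.
rng = np.random.default_rng(0)
m, n, k = 40, 80, 3
A = rng.standard_normal((m, n))
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

z = basis_pursuit(A, A @ x)
print("max recovery error:", np.max(np.abs(z - x)))
```

The Dantzig Selector admits a similar linear-programming form because both its objective and its constraint are polyhedral.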

Previous Approaches for Performance Analysis (slide 23)
Restricted Isometry Constant (RIC) [6]:
$\delta_k(A) = \max_{z \ne 0} \left| \frac{\|Az\|_2^2}{\|z\|_2^2} - 1 \right|$ subject to $\|z\|_0 \le k$.
Error bounds: if $\|w\|_2 \le \varepsilon$, then the recovery error of the Basis Pursuit is bounded as
$\|\hat{x} - x\|_2 \le \frac{4\sqrt{1 + \delta_{2k}(A)}}{1 - (1 + \sqrt{2})\,\delta_{2k}(A)}\,\varepsilon$.
Computational difficulty: there is no practical way to compute $\delta_k(A)$ exactly.
[6] E. J. Candès, T. Tao, Near-optimal signal recovery from random projections and universal encoding strategies, IEEE Trans. Inform. Theory, vol. 52, no. 12, pp. 5406-5425, Dec. 2006.

Previous Approaches for Performance Analysis (cont.) (slide 24)
Mutual Coherence (MC) [7]:
$\mu(A) = \max_{i \ne j} \frac{|A_i^T A_j|}{\|A_i\|_2 \|A_j\|_2}$.
Sufficient condition: if $\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(A)}\right)$, then we get exact recovery (noise-free case) via Basis Pursuit; however, this condition is weak.
If $\|w\|_2 \le \varepsilon$, then the recovery error of the Basis Pursuit is bounded as [8]
$\|\hat{x} - x\|_2 \le \frac{2}{\sqrt{1 - \mu(A)(4k - 1)}}\,\varepsilon$.
[7] D. L. Donoho and M. Elad, Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization, Proc. Nat. Acad. Sci., vol. 100, pp. 2197-2202, Mar. 2003.
[8] D. L. Donoho, M. Elad, and V. Temlyakov, Stable recovery of sparse overcomplete representations in the presence of noise, IEEE Trans. Inform. Theory, vol. 52, pp. 6-18, Jan. 2006.
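Unlike the RIC, the mutual coherence is cheap to compute. A minimal sketch for real-valued matrices follows; the function name and the small example dictionary (identity concatenated with a normalized 2x2 Hadamard matrix) are illustrative assumptions.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute normalized inner product between distinct columns of A."""
    G = A / np.linalg.norm(A, axis=0, keepdims=True)   # unit-norm columns
    gram = np.abs(G.T @ G)
    np.fill_diagonal(gram, 0.0)                        # ignore i == j
    return gram.max()

# Hypothetical example: identity concatenated with a normalized Hadamard matrix.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
A = np.hstack([np.eye(2), H])

mu = mutual_coherence(A)
sparsity_threshold = 0.5 * (1.0 + 1.0 / mu)   # exact recovery if ||x||_0 is below this
print(mu, sparsity_threshold)
```

Because the coherence only compares pairs of columns, the resulting sparsity threshold is typically small, which is the weakness the slide points out.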

Related Work on Computable Performance Analysis (slide 25)
The following papers give sufficient and necessary conditions for exact recovery of sparse vectors in the noise-free case:
[9] A. Juditsky, A. Nemirovski, On verifiable sufficient conditions for sparse signal recovery via $\ell_1$ minimization, Mathematical Programming Ser. B, vol. 127, pp. 57-88, 2011.
[10] A. Juditsky, F. Kilinc Karzan, A. Nemirovski, Verifiable conditions of $\ell_1$-recovery of sparse signals with sign restrictions, Mathematical Programming Ser. B, vol. 127, pp. 89-122, 2011.
[11] A. d'Aspremont, L. El Ghaoui, Testing the nullspace property using semidefinite programming, Mathematical Programming Ser. B, vol. 127, pp. 123-144, 2011.
Our Innovations
Verification: more efficient algorithms to verify exact recovery without noise.
Computation: computable performance bounds on recovery errors in noise.

Quality Measure $\omega_\diamond(Q, s)$ (slide 26)
For $s \in [1, n]$ and $A \in \mathbb{R}^{m \times n}$, we define
$\omega_\diamond(Q, s) = \min_{z \ne 0} \frac{\|Qz\|_\diamond}{\|z\|_\infty}$ subject to $\frac{\|z\|_1}{\|z\|_\infty} \le s$,
where $Q = A$ or $A^T A$, and $\diamond = 1, 2$, or $\infty$.
$s$ is a measure of the sparsity of $z$: smaller $s$ implies sparser $z$, and also larger $\omega_\diamond(Q, s)$ and better reconstruction performance.
Without the sparsity constraint, with the $\ell_\infty$ norm replaced by $\ell_2$ and $\diamond$ replaced with $\ell_2$, $\omega_\diamond(Q, s)$ is the minimal singular value of $Q$.
$\omega_\diamond(Q, s)$ is a measure of the incoherence (quality) of $A$, and it will determine the performance bounds.
Figure 7: Constraint set in $\mathbb{R}^3$ for $s = 1.4$.

Reconstruction Error Bounds (slide 27)
Theorem 1. Suppose the noise $w$ satisfies $\|w\|_\diamond \le \varepsilon$, $\|A^T w\|_\infty \le \lambda$, and $\|A^T w\|_\infty \le \kappa\lambda$, $\kappa \in (0, 1)$, for the Basis Pursuit, the Dantzig Selector, and the LASSO estimator, respectively. Then we have
$\|\hat{x} - x\|_\infty \le \frac{2\varepsilon}{\omega_\diamond(A, 2k)}$ for the Basis Pursuit,
$\|\hat{x} - x\|_\infty \le \frac{2\lambda}{\omega_\infty(A^T A, 2k)}$ for the Dantzig Selector, and
$\|\hat{x} - x\|_\infty \le \frac{(1 + \kappa)\lambda}{\omega_\infty(A^T A, 2k/(1 - \kappa))}$ for the LASSO estimator.
These error bounds are inversely proportional to $\omega_\diamond(Q, s)$.
$\omega_\diamond(Q, s) > 0$ implies exact recovery in the noise-free case (where $\varepsilon = 0$, $\lambda = 0$).
When the sparsity level $k$ of the signal decreases, $\omega_\diamond(Q, s)$ becomes larger, implying smaller reconstruction error.

Reconstruction Error Bounds (cont.) (slide 28)
The error bounds on the $\ell_1$ and $\ell_2$ norms can be expressed via
$\|\hat{x} - x\|_1 \le ck\,\|\hat{x} - x\|_\infty$ and $\|\hat{x} - x\|_2 \le \sqrt{ck}\,\|\hat{x} - x\|_\infty$.


Verification of $\omega_\diamond(Q, s) > 0$: General Case (slide 30)
We provide a computable way to verify sufficient conditions for exact sparse recovery in the noise-free case (see also Shtok et al. [12]).
Theorem 2. Define $s^* = \max\{s : \omega_\diamond(Q, s) > 0\}$. Then, $k < s^*/2 \implies$ exact sparse recovery.
In addition, $s^*$ is the inverse of the maximum of the $n$ optimal values of the following linear programs:
$\max_z z_i$ subject to $Qz = 0$, $\|z\|_1 \le 1$, for $i = 1, \ldots, n$.
Thus, $s^*/2$ is the maximal sparsity level below which exact recovery is guaranteed in the noise-free case.
[12] J. Shtok and M. Elad, Analysis of the Basis Pursuit Via the Capacity Sets, Jour. of Fourier Analysis and Applications, vol. 14, no. 5-6, pp. 688-711, Dec. 2008.
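The $n$ linear programs of Theorem 2 can be sketched with `scipy.optimize.linprog`. The split $z = p - q$ with $p, q \ge 0$, the tolerance, and the small test matrix are assumptions made here for illustration; this is not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def max_sparsity_level(Q, tol=1e-9):
    """Return s* = 1 / max_i max{ z_i : Qz = 0, ||z||_1 <= 1 }."""
    r, n = Q.shape
    A_eq = np.hstack([Q, -Q])              # Q (p - q) = 0
    b_eq = np.zeros(r)
    A_ub = np.ones((1, 2 * n))             # sum(p) + sum(q) <= 1 bounds ||z||_1
    b_ub = np.array([1.0])
    best = 0.0
    for i in range(n):
        c = np.zeros(2 * n)
        c[i], c[n + i] = -1.0, 1.0         # linprog minimizes, so use -(p_i - q_i)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=(0, None), method="highs")
        if not res.success:
            raise RuntimeError(res.message)
        best = max(best, -res.fun)
    return np.inf if best <= tol else 1.0 / best

# Hypothetical 2 x 3 example: the null space is spanned by (1, -1, 1),
# so the largest attainable z_i under ||z||_1 <= 1 is 1/3 and s* = 3.
Q = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
s_star = max_sparsity_level(Q)
print(s_star, "-> exact recovery guaranteed for k <", s_star / 2)
```

For a matrix with a trivial null space every program has optimal value zero, and the sketch returns infinity.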

Verification of $\omega_\diamond(Q, s) > 0$: Fourier Case (slide 31)
For the special yet important class of Fourier sensing matrices, the computational cost can be greatly reduced.
Theorem 3. If $H$ is the Fourier transform matrix on a finite abelian group, and the rows of $A$ are sampled from the rows of $H$, then the optimal values of
$\max_z z_i$ subject to $Qz = 0$, $\|z\|_1 \le 1$
are equal for $i = 1, 2, \ldots, n$.
For these sensing matrices, we compute $s^*$ by solving a single linear program.
Examples include the Fourier matrix and the Hadamard matrix, which are widely used in compressive sensing.

Numerical Examples: Maximal Sparsity Levels (slide 32)
Table 1: Comparison of our algorithm for sufficient conditions for exact recovery with Juditsky and Nemirovski's (JN) in bounding the maximal sparsity levels $s^*/2$ for Gaussian sensing matrices ($n = 256$).

  m     s*/2 (ours)   s*/2 (JN)   CPU time (ours)   CPU time (JN)
  25    1             1           6                 91
  51    2             2           8                 191
  76    3             3           10                856
  102   4             4           13                5630
  128   4             5           16                5711
  153   6             6           20                1381
  179   7             7           24                3356
  204   10            10          25                10039
  230   13            14          28                8332

Observation: The two algorithms give similar maximal sparsity levels, but ours is much faster.

Numerical Examples: Maximal Sparsity Levels (cont.) (slide 33)
Table 2: Comparison of sufficient conditions for exact recovery based on $\omega$ and the Mutual Coherence for Hadamard matrices. For each $n$, the first column is $s^*/2$ and the second is $\frac{1}{2}(1 + 1/\mu)$.

          n = 2048        n = 4096        n = 8192
  m/n     s*/2    MC      s*/2    MC      s*/2    MC
  0.2     6       3       9       4       12      6
  0.3     9       4       13      5       18      7
  0.4     12      5       17      7       24      10
  0.5     15      6       21      9       30      13
  0.6     19      8       27      9       39      16
  0.7     25      9       35      14      50      17
  0.8     34      13      48      17      68      23
  0.9     53      20      74      22      105     34

Observation: For Hadamard matrices, our sufficient condition certifies exact recovery at larger sparsity levels than the condition based on the Mutual Coherence.

Computation of $\omega_\diamond(Q, s)$: General Case (slide 34)
We provide a way to compute $\omega_\diamond(Q, s)$ for any given $Q$ and $s$, which readily translates to upper bounds on recovery errors in the noisy case.
Theorem 4. The quantity $\omega_\diamond(Q, s)$ is the minimum of the optimal values of the following $n$ linear programs or quadratic programs:
$\min_{u \in \mathbb{R}^{n-1}} \|q_i - Q_{(:,-i)}\,u\|_\diamond$ subject to $\|u\|_1 \le s - 1$, for $i = 1, \ldots, n$.
Here $q_i$ is the $i$th column of $Q$ and $Q_{(:,-i)}$ consists of all columns except the $i$th one.
The $i$th optimization finds the best approximation of the $i$th column using a sparse (measured by the $\ell_1$ norm) linear combination of the remaining columns.
Thus, if the columns of $Q$ can approximate each other well using sparse linear combinations, $\omega$ is small and the reconstruction error is large.
Note that the Mutual Coherence considers the approximability between two columns. Our $\omega$ is more accurate because it considers all columns.
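For $\diamond = \infty$ each program in Theorem 4 is a linear program. The sketch below solves them with `scipy.optimize.linprog`, using an epigraph variable $t$ for the $\ell_\infty$ norm and the split $u = p - q$; these modeling choices, the function name, and the test matrix are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

def omega_inf(Q, s):
    """min over i of min_u ||q_i - Q_{-i} u||_inf subject to ||u||_1 <= s - 1."""
    r, n = Q.shape
    best = np.inf
    for i in range(n):
        q_i = Q[:, i]
        B = np.delete(Q, i, axis=1)                    # all columns except the ith
        ones = np.ones((r, 1))
        # Variables: [p (n-1), q (n-1), t], all nonnegative, with u = p - q.
        A_ub = np.vstack([
            np.hstack([-B, B, -ones]),                 #  q_i - B u <= t
            np.hstack([B, -B, -ones]),                 # -(q_i - B u) <= t
            np.hstack([np.ones((1, 2 * (n - 1))), np.zeros((1, 1))]),  # ||u||_1 <= s - 1
        ])
        b_ub = np.concatenate([-q_i, q_i, [s - 1.0]])
        c = np.zeros(2 * (n - 1) + 1)
        c[-1] = 1.0                                    # minimize t
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
        if not res.success:
            raise RuntimeError(res.message)
        best = min(best, res.fun)
    return best

# Hypothetical example with two nearly parallel columns.
Q = np.array([[1.0, 0.9],
              [0.0, 0.1]])
print(omega_inf(Q, 1.0), omega_inf(Q, 2.0))
```

In this example the value drops when $s$ grows from 1 to 2, because the first column can then be approximated by the second, which illustrates why strongly correlated columns lead to a small $\omega$.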

Computation of $\omega_\diamond(Q, s)$: Fourier Case (slide 35)
The computation cost can be greatly reduced for the special yet important class of Fourier sensing matrices, similar to the verification case.
Theorem 5. If $H$ is the Fourier transform matrix on a finite abelian group, and the rows of $A$ are sampled from the rows of $H$, then the optimal values of
$\min_{u \in \mathbb{R}^{n-1}} \|q_i - Q_{(:,-i)}\,u\|_\diamond$ subject to $\|u\|_1 \le s - 1$
are equal for all $i = 1, \ldots, n$.
For these sensing matrices, we compute $\omega_\diamond(Q, s)$ by solving a single linear program or quadratic program.
Examples include the Fourier matrix and the Hadamard matrix.

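Numerical Examples: Performance Bounds Comparison I (slide 36)
Table 3: $\omega_2(A, s)$ based bounds $\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_2(A, 2k)}\,\varepsilon$ vs. RIC based bounds $\|\hat{x} - x\|_2 \le \frac{4\sqrt{1 + \delta_{2k}}}{1 - (1 + \sqrt{2})\,\delta_{2k}}\,\varepsilon$ for the Basis Pursuit with Bernoulli sensing matrices and $n = 256$ with $\varepsilon = 1$.
Bound on estimation error (values in order of increasing $m$; cells left blank in the original table are omitted):
m: 51, 77, 102, 128, 154, 179, 205
k = 1, ω bound: 4.2, 3.8, 3.5, 3.4, 3.3, 3.2, 3.2
k = 1, RIC bound: 23.7, 16.1, 13.2, 10.6, 11.9
k = 2, ω bound: 31.4, 12.2, 9.0, 7.4, 6.5, 6.0, 5.6
k = 2, RIC bound: 72.1, 192.2
k = 3, ω bound: 252.0, 30.9, 16.8, 12.0, 10.1, 8.9
k = 4, ω bound: 52.3, 23.4, 16.5, 13.6
k = 5, ω bound: 57.0, 28.6, 20.1
k = 6, ω bound: 1256.6, 53.6, 30.8
k = 7, ω bound: 161.6, 50.6
k = 8, ω bound: 93.1
k = 9, ω bound: 258.7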

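Numerical Examples: Performance Bounds Comparison II (slide 37)
Figure 8: $\omega_2(A, 2k)$ based bound $\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_2(A, 2k)}\,\varepsilon$ vs. MC based bound $\|\hat{x} - x\|_2 \le \frac{2}{\sqrt{1 - \mu(A)(4k - 1)}}\,\varepsilon$ for the Basis Pursuit for Hadamard sensing matrices with $n = 2048$.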

Numerical Examples: Observations (slide 38)
The bounds using $\omega$ are tighter than the bounds based on Mutual Coherence or RIC.
Bounds based on $\omega$ still apply even when the bounds based on Mutual Coherence or RIC do not apply, e.g., for small $m$ and large $k$.

Quality Measure: $\ell_1$-CMSV (slide 39)
For $s \in [1, n]$ and $A \in \mathbb{R}^{m \times n}$, we define the $\ell_1$-constrained minimal singular value ($\ell_1$-CMSV) of $A$ by
$\rho_s(A) = \min_{z \ne 0} \frac{\|Az\|_2}{\|z\|_2}$ subject to $\frac{\|z\|_1^2}{\|z\|_2^2} \le s$.
$\rho_s(A)$ is approximated by solving a constrained optimization problem using an interior point method.
Similar to $\omega_\diamond(Q, s)$, $\rho_s(A)$ is a measure of incoherence of $A$, and $s$ is a measure of the sparsity of $z$.
Bounds using $\rho_s(A)$ are computationally more amenable than those using the RIC.
Bounds based on $\rho_s(A)$ are also tighter than bounds using $\omega_\diamond(Q, s)$.

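Reconstruction Error Bounds Using $\ell_1$-CMSV (slide 40)
Theorem 6. Suppose the noise $w$ satisfies $\|w\|_2 \le \varepsilon$, $\|A^T w\|_\infty \le \lambda$, and $\|A^T w\|_\infty \le \kappa\lambda$, $\kappa \in (0, 1)$, for the Basis Pursuit, the Dantzig Selector, and the LASSO estimator, respectively. Then we have
$\|\hat{x} - x\|_2 \le \frac{2\varepsilon}{\rho_{4k}}$ for the Basis Pursuit,
$\|\hat{x} - x\|_2 \le \frac{4\sqrt{k}}{\rho_{4k}^2}\,\lambda$ for the Dantzig Selector, and
$\|\hat{x} - x\|_2 \le \frac{1 + \kappa}{1 - \kappa} \cdot \frac{2\sqrt{k}}{\rho_{4k/(1-\kappa)^2}^2}\,\lambda$ for the LASSO estimator.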

Outline (slide 41)
Introduction
Sparsity recovery
Application example: Multi-objective optimization of OFDM radar waveform for target detection
Future work

Problem Description (slide 42)
Goal: Detect a far-field target in the presence of multipath reflections.
Challenges:
Complex physical phenomena: multiple reflections, fading effects, etc.
Lack of a line-of-sight (LOS) propagation path to the target.
Unknown frequency response of the target.
Our Approach
Employ an OFDM signal to increase the frequency diversity and overcome fading.
Exploit multipath reflections to improve the spatial diversity.

Problem Description (slide 43)
Figure 9: Urban multipath scenario (labels: radar, buildings, targets, non-LOS region).

Our Approach (cont.) (slide 44)
Reformulate the target detection problem as estimating the spectrum of a sparse signal by exploiting the sparsity of multiple signal paths and target velocity.
Employ a sparse-recovery algorithm based on the Dantzig selector (DS) approach and analyze its performance in terms of the $\ell_1$-constrained minimal singular value of the measurement matrix.
Propose a constrained multi-objective optimization (MOO) technique to design the spectral parameters of the OFDM waveform.
Figure: multipath geometry (labels: radar, two reflecting surfaces, constant range curve, Targets A, B, C with velocities $v_A$, $v_B$, $v_C$, images of targets A and C).

Measurement Model (slide 45)
Assumptions:
Far-field, point target moving with a constant velocity.
The target remains within a range cell over the coherent processing interval (CPI).
Fixed, mono-static, and coherent radar.
The radar knows the geometry of the environment and the position of the range cell.
The information of the known range cell ($\tau$) is incorporated into the model by choosing $t = \tau + nT_{PRI}$, $n = 0, 1, \ldots, N - 1$.
Figure: pulse timing over the coherent processing interval (pulses $n = 0, 1, \ldots, N - 1$; pulse duration $T$; pulse repetition interval $T_{PRI}$).

Measurement Model (cont.) (slide 46)
Signal model: The complex envelope of the transmitted OFDM signal is
$s(t) = \sum_{l=0}^{L-1} a_l\,e^{j2\pi l \Delta f t}$,
where
$L$: number of subcarriers,
$a = [a_0, a_1, \ldots, a_{L-1}]^T$: complex transmitted weights ensuring $a^H a = 1$ for constant energy transmission,
$\Delta f = B/(L + 1) = 1/T$: subcarrier spacing,
$B$: signal bandwidth,
$T$: pulse duration.
Adaptive design: We will select the coefficients $a_l$ to maximize the target-detection performance.
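A short sketch of the transmitted envelope, evaluated for the equal-weight waveform and the bandwidth and subcarrier count used later in the numerical examples ($B$ = 100 MHz, $L$ = 3). The function name and the time sampling are illustrative assumptions.

```python
import numpy as np

def ofdm_envelope(a, bandwidth_hz, t):
    """Evaluate s(t) = sum_l a_l exp(j 2 pi l delta_f t) with delta_f = B / (L + 1)."""
    a = np.asarray(a, dtype=complex)
    num_subcarriers = a.size
    delta_f = bandwidth_hz / (num_subcarriers + 1)
    l = np.arange(num_subcarriers)[:, None]                    # subcarrier index
    tones = np.exp(2j * np.pi * l * delta_f * t[None, :])      # shape (L, len(t))
    return (a[:, None] * tones).sum(axis=0), delta_f

L = 3
B = 100e6
a = np.ones(L) / np.sqrt(L)                 # equal weights, a^H a = 1
T = (L + 1) / B                             # pulse duration T = 1 / delta_f
t = np.linspace(0.0, T, 200, endpoint=False)

s, delta_f = ofdm_envelope(a, B, t)
print(delta_f, T)                           # subcarrier spacing and pulse duration
```

The adaptive designs discussed later change only the weight vector `a`, so the same function can be reused to inspect the resulting waveform.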

Measurement Model (cont.) (slide 47)
Consider a target corresponding to the known range cell $\tau$, and suppose the radar receives information about the target (moving with velocity $v$) through the path $p$. Then, the complex envelope of the received signal at the $l$-th subchannel is
$y_l(n) = a_l\,x_{lp}\,\phi_l(n, p, v) + e_l(n)$, for $l = 0, \ldots, L - 1$, $n = 0, \ldots, N - 1$, (1)
where $\phi_l(n, p, v) \triangleq e^{-j2\pi f_l \tau}\,e^{j2\pi f_l \beta_p n T_{PRI}}$, and
$x_{lp}$: target scattering coefficient at the $l$-th subchannel and $p$-th path,
$\beta_p = 2\langle v, u_p \rangle / c$: effective Doppler coefficient along the $p$-th path,
$u_p$: direction-of-arrival (DOA) unit vector of the $p$-th path,
$c$: speed of propagation,
$f_l = f_c + l\Delta f$, where $f_c$ is the carrier frequency,
$e_l(n)$: clutter, measurement noise, and co-channel interference (CCI).

Sparse Model (slide 48)
We discretize the possible signal paths and target velocities into $P$ and $V$ grid points, respectively.
Considering all possible combinations of $(p_i, v_j)$, $i = 1, 2, \ldots, P$, $j = 1, 2, \ldots, V$, we can rewrite (1) as
$y_l(n) = a_l\,\phi_l(n)^T x_l + e_l(n)$,
where
$\phi_l(n) = [\phi_l(n, p_1, v_1), \ldots, \phi_l(n, p_1, v_V), \phi_l(n, p_2, v_1), \ldots, \phi_l(n, p_P, v_V)]^T$,
$x_l$ is a $PV \times 1$ sparse vector, having only $k_l$ non-zero entries, where
$k_l = |I_l|$: sparsity level of $x_l$,
$I_l = \{\tilde{i} \in [1, P] : p_{\tilde{i}}\text{-th path carries target information}\}$.

Sparse Model (cont.) (slide 49)
Concatenating the measurements of all $L$ subchannels and $N$ time samples gives
$y = \Phi x + e$,
where
$y = [y(0)^T, \ldots, y(N-1)^T]^T$ and $y(n) = [y_0(n), \ldots, y_{L-1}(n)]^T$,
$\Phi = [(A\Phi(0))^T \cdots (A\Phi(N-1))^T]^T$ with $A = \mathrm{diag}(a)$ and $\Phi(n) = \mathrm{blkdiag}(\phi_0(n)^T, \ldots, \phi_{L-1}(n)^T)$,
$x = [x_0^T, \ldots, x_{L-1}^T]^T$ is a sparse vector having $k = \sum_{l=0}^{L-1} k_l$ non-zero entries,
$e = [e(0)^T, \ldots, e(N-1)^T]^T$ and $e(n) = [e_0(n), \ldots, e_{L-1}(n)]^T$.
Thus, $\Phi$ contains the response bases corresponding to all possible paths and velocities.

Statistical Model (slide 50)
The vector $e(n)$ represents the clutter, measurement noise, and co-channel interference at the output of the $L$ subchannels.
We assume that
$e(n)$ is a temporally white, zero-mean complex Gaussian vector,
co-channel interference among the subchannels is characterized by a covariance matrix $\Sigma$.
Hence, the measurement vector is distributed as
$y \sim \mathcal{CN}_{LN}(\Phi x, I_N \otimes \Sigma)$.
The identity matrix $I_N$ is due to the temporally white noise distribution.

Sparse Recovery and Performance (slide 51)
To obtain an estimate of $x$ (a $k$-sparse vector) from the noisy measurements $y$, obtained through the linear model
$y = \Phi x + e$,
we apply the Dantzig selector (DS). Recall that the DS is given by
$\hat{x}_{DS} = \arg\min_{z \in \mathbb{C}^{LPV}} \|z\|_1$ subject to $\|\Phi^H(y - \Phi z)\|_\infty \le \lambda\sigma$, (2)
where $\lambda = \sqrt{2\log(LPV)}$ is a control parameter and $\sigma = \sqrt{\mathrm{tr}(\Sigma)/L}$.
To assess the reconstruction performance of this recovery algorithm, we use the $\ell_1$-constrained minimal singular value ($\ell_1$-CMSV) of $\Phi$.

Sparse Recovery - Further Simplification (slide 52)
We observe an additional structure in the sparse measurement model, i.e.,
$y = [\Phi_0 \cdots \Phi_{L-1}]\,[x_0^T, \ldots, x_{L-1}^T]^T + e$,
where
each pair of block matrices is orthogonal, i.e., $\Phi_{l_1}^H \Phi_{l_2} = 0$ for $l_1 \ne l_2$,
each $x_l$, $l = 0, 1, \ldots, L - 1$, is sparse with sparsity level $k_l$.
To obtain an estimate of $x = [x_0^T, \ldots, x_{L-1}^T]^T$ by exploiting these properties, we employ $L$ simpler decomposed Dantzig selectors:
$\hat{x}_{DDS,l} = \arg\min_{z_l \in \mathbb{C}^{PV}} \|z_l\|_1$ subject to $\|\Phi_l^H(y - \Phi_l z_l)\|_\infty \le \lambda_l\sigma$, (3)
where $\lambda_l = \sqrt{2\log(PV)}$.

Performance Analysis (slide 53)
Theorem 7. Consider the estimate of $x$ obtained by employing our decomposed DS. An upper bound exists on the $\ell_2$-norm of the sparse-estimation error:
$\|\hat{x}_{DDS} - x\|_2 \le 4\sqrt{\sum_{l=0}^{L-1} \frac{\lambda_l^2 k_l \sigma^2}{\rho_{4k_l}^4(\Phi_l)}}$,
whereas using the original DS we get
$\|\hat{x}_{DS} - x\|_2 \le \frac{4\lambda\sqrt{k}\,\sigma}{\rho_{4k}^2(\Phi)}$.
Theorem 8. The $L$ small Dantzig selectors in (3) perform better than the original Dantzig selector in (2) in terms of a smaller upper bound on the $\ell_2$-norm of the sparse-estimation error:
$4\sqrt{\sum_{l=0}^{L-1} \frac{\lambda_l^2 k_l \sigma^2}{\rho_{4k_l}^4(\Phi_l)}} \le \frac{4\lambda\sqrt{k}\,\sigma}{\rho_{4k}^2(\Phi)}$.

Adaptive Waveform Design (slide 54)
We can adaptively design the OFDM spectral parameters, $a_l$, to minimize the upper bound on the sparse-estimation error as
$a^{(1)} = \arg\min_{a \in \mathbb{C}^L} \sum_{l=0}^{L-1} \frac{\lambda_l^2 k_l \sigma^2}{|a_l|^4\,\rho_{4k_l}^4(\tilde{\Phi}_l)}$ subject to $a^H a = 1$,
where $\Phi_l = a_l \tilde{\Phi}_l$.
However, the computation of $\rho_{4k_l}(\tilde{\Phi}_l)$ is difficult with complex variables. Therefore, we use a computable lower bound on $\rho_{4k_l}(\tilde{\Phi}_l)$, defined as
$\rho_{8k_l}(\tilde{\Psi}_l) \le \rho_{4k_l}(\tilde{\Phi}_l)$,
where
$\tilde{\Psi}_l^T \tilde{\Psi}_l = \begin{bmatrix} \Psi_1^T\Psi_1 + \Psi_2^T\Psi_2 & 0 \\ 0 & \Psi_1^T\Psi_1 + \Psi_2^T\Psi_2 \end{bmatrix}$, $\Psi_1 = \mathrm{Re}\,\tilde{\Phi}_l$, and $\Psi_2 = \mathrm{Im}\,\tilde{\Phi}_l$.

Adaptive Waveform Design (cont.) (slide 55)
Hence, to minimize the upper bound on the sparse-estimation error, we formulate a single-objective optimization problem as
$a^{(1)} = \arg\min_{a \in \mathbb{C}^L} \sum_{l=0}^{L-1} \frac{\lambda_l^2 k_l \sigma^2}{|a_l|^4\,\rho_{8k_l}^4(\tilde{\Psi}_l)}$ subject to $a^H a = 1$.
Using the Lagrange-multiplier approach, we easily obtain the solution as
$a_l^{(1)} = \sqrt{\frac{(2\alpha_l)^{1/3}}{\sum_{l'=0}^{L-1} (2\alpha_{l'})^{1/3}}}$, for $l = 0, 1, \ldots, L - 1$,
where $\alpha_l = \frac{\lambda_l^2 k_l \sigma^2}{\rho_{8k_l}^4(\tilde{\Psi}_l)}$.
Note that the designed waveform depends solely on $\Phi$, the measurement matrix.
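The closed-form weights can be checked numerically. The sketch below assumes real, nonnegative weights, so the constraint $a^H a = 1$ reads $\sum_l a_l^2 = 1$ and the stationarity condition of the Lagrangian gives $a_l^2 \propto (2\alpha_l)^{1/3}$. The $\alpha_l$ values are hypothetical placeholders, since computing them requires the $\ell_1$-CMSV of each block.

```python
import numpy as np

def min_bound_weights(alpha):
    """Weights minimizing sum_l alpha_l / a_l^4 subject to sum_l a_l^2 = 1."""
    alpha = np.asarray(alpha, dtype=float)
    w = (2.0 * alpha) ** (1.0 / 3.0)
    return np.sqrt(w / w.sum())

def bound_objective(a, alpha):
    """Objective of the single-objective problem (up to the constant factor 16)."""
    return np.sum(np.asarray(alpha) / np.abs(a) ** 4)

alpha = np.array([1.0, 0.01, 8.0])          # hypothetical alpha_l values
a1 = min_bound_weights(alpha)

# Compare against random feasible weight vectors on the unit sphere.
rng = np.random.default_rng(0)
trials = np.abs(rng.standard_normal((1000, alpha.size)))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
best_random = min(bound_objective(a, alpha) for a in trials)
print(a1, bound_objective(a1, alpha), best_random)
```

Subcarriers with a larger $\alpha_l$, that is, a smaller $\ell_1$-CMSV, receive more energy, which compensates for their weaker contribution to the bound.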

Adaptive Waveform Design (cont.) (slide 56)
However, to achieve better detection performance it is also essential that the signal parameters are adaptive to the target and noise parameters ($x$ and $\Sigma$).
To detect the presence of a target in the range cell under test, a standard procedure is to construct the following decision problem:
$H_0: y = e$,
$H_1: y = \Phi x + e$,
and to find out whether the measurement $y$ is distributed as $\mathcal{CN}_{LN}(0, I_N \otimes \Sigma)$ or $\mathcal{CN}_{LN}(\Phi x, I_N \otimes \Sigma)$.
To optimize the detection performance, we maximize the squared Mahalanobis distance between these two distributions:
$d^2 = x^H \Phi^H (I_N \otimes \Sigma)^{-1} \Phi x$.

Adaptive Waveform Design (cont.) (slide 57)
Hence, in addition to minimizing the upper bound on the estimation error, we propose maximizing another single-objective function based on the squared Mahalanobis distance ($d^2$) as
$a^{(2)} = \arg\max_{a \in \mathbb{C}^L} \underbrace{x^H \Phi^H (I_N \otimes \Sigma)^{-1} \Phi x}_{d^2}$ subject to $a^H a = 1$.
After some algebraic manipulation, we have
$d^2 = a^H \left[ \sum_{n=0}^{N-1} \left( \Phi(n)\,x\,x^H \Phi(n)^H \right)^T \odot \Sigma^{-1} \right] a$,
and therefore the solution of the optimization problem, $a^{(2)}$, will be the eigenvector corresponding to the largest eigenvalue of $\sum_{n=0}^{N-1} \left( \Phi(n)\,x\,x^H \Phi(n)^H \right)^T \odot \Sigma^{-1}$.
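The eigenvector solution can be sketched in a few lines. Here the rows of `V` play the role of the $L$-vectors $\Phi(n)x$, and the matrix inside the quadratic form is built as an elementwise (Hadamard) product with $\Sigma^{-1}$, which is what the quadratic form $(\mathrm{diag}(a)\Phi(n)x)^H \Sigma^{-1} (\mathrm{diag}(a)\Phi(n)x)$ expands to. The function names and all numerical values are hypothetical.

```python
import numpy as np

def mahalanobis_matrix(V, Sigma):
    """M = sum_n (v_n v_n^H)^T * inv(Sigma) elementwise, so that d^2 = a^H M a."""
    Sinv = np.linalg.inv(Sigma)
    M = np.zeros(Sinv.shape, dtype=complex)
    for v in V:
        M += np.outer(v, v.conj()).T * Sinv
    return (M + M.conj().T) / 2.0           # enforce Hermitian symmetry numerically

def max_distance_weights(V, Sigma):
    """Unit-norm eigenvector of M associated with its largest eigenvalue."""
    evals, evecs = np.linalg.eigh(mahalanobis_matrix(V, Sigma))
    return evecs[:, -1]

# Hypothetical case: L = 3 subcarriers, N = 20 pulses, response strongest on subcarrier 2.
N, L = 20, 3
V = np.tile(np.array([1.0, 10.0, 1.0], dtype=complex), (N, 1))
Sigma = np.eye(L)
a2 = max_distance_weights(V, Sigma)
print(np.abs(a2))
```

With a diagonal covariance the maximizer concentrates all energy on the subcarrier with the strongest response, so frequency diversity is lost under this criterion alone.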

Multi-Objective Optimization (slide 58)
We adaptively design the OFDM spectral parameters, $a_l$, using a constrained multi-objective optimization (MOO) that simultaneously optimizes two objective functions:
minimize the upper bound on the sparse-estimation error,
maximize the squared Mahalanobis distance.
Mathematically, this is represented as
$a_{opt} = \begin{cases} \arg\min_{a \in \mathbb{C}^L} \sum_{l=0}^{L-1} \frac{\lambda_l^2 k_l \sigma^2}{|a_l|^4\,\rho_{4k_l}^4(\tilde{\Phi}_l)}, \\ \arg\max_{a \in \mathbb{C}^L} a^H \left[ \sum_{n=0}^{N-1} \left( \Phi(n)\,x\,x^H \Phi(n)^H \right)^T \odot \Sigma^{-1} \right] a, \end{cases}$ subject to $a^H a = 1$.
We employ the well-known nondominated sorting genetic algorithm II (NSGA-II) to solve our MOO problem, imposing a restriction on the solutions to satisfy the constraint $a^H a = 1$.

Numerical Examples (slide 59)
Problem: Detect a moving target in the presence of multipath in 2D.
Target and multipath parameters:
The range cell is at a distance of 3 km from the radar.
The target is 13.5 m east of the center line, moving with velocity $v = (35/\sqrt{2})(\hat{i} + \hat{j})$ m/s, and remains within the range cell over a CPI.
There are two actual paths between the target and radar: one direct and one reflected, subtending angles of 0.26° and 0.51°, respectively, with respect to the radar.
The scattering coefficients are varied to simulate three target responses:
Target 1: $x_d^{(1)} = [1, 1, 1]^T$, $x_r^{(1)} = [0.5, 0.5, 0.5]^T$.
Target 2: $x_d^{(2)} = [4, 1, 2]^T$, $x_r^{(2)} = [2, 0.5, 1]^T$.
Target 3: $x_d^{(3)} = [1, 10, 1]^T$, $x_r^{(3)} = [0.5, 5, 0.5]^T$.
Figure: simulation geometry (labels: radar, target with velocity $v$, image of target, two reflecting surfaces, constant range curve; dimensions 3 km, 40 m, 13.5 m).

Numerical Examples (cont.) (slide 60)
Radar parameters:
Carrier frequency $f_c = 1$ GHz
Available bandwidth $B = 100$ MHz
Number of OFDM subcarriers $L = 3$
Subcarrier spacing $\Delta f = B/(L + 1) = 25$ MHz
Pulse width $T = 1/\Delta f = 40$ ns
Pulse repetition interval $T_{PRI} = 4$ ms
Number of coherent pulses $N = 20$
All the transmit OFDM weights were equal; i.e., $a_l = 1/\sqrt{L}$ for all $l$

Simulation: Numerical Examples (cont.) (slide 61)
We partition the signal paths and target velocities into $P = 5$ and $V = 3$ uniform grid points. Hence, the associated signal grid paths subtend angles of $\{-0.5°, -0.25°, 0°, 0.25°, 0.5°\}$ with respect to the radar, and the target grid velocities are $\{25, 35, 45\}$ m/s. Note that these grid points are different from the true parameters.
Generate the noise samples from a $\mathcal{CN}(0, 1)$ distribution, and then scale to satisfy the required target to clutter-plus-noise ratio (TCNR):
$\mathrm{TCNR} = \frac{x^H x}{N L \sigma_0^2}$.
Comment: We kept the clutter-plus-noise power the same in each subcarrier by considering $\Sigma = \sigma_0^2 I_L$. Hence, TCNR will depend only on the target's RCS at each frequency.

Numerical Examples (cont.) (slide 62)
Parameters of NSGA-II:
Population size = 500
Number of generations = 50
Crossover probability = 0.9
Mutation probability = 0.1
The constraint $a^H a = 1$ is relaxed by ensuring that the solutions satisfy $0.999 \le a^H a \le 1.001$

Numerical Examples (cont.) (slide 63)
Performance of the standard and decomposed Dantzig selectors, for Target 1 and Target 2.
Figure (three panels per target, each comparing the original DS with the decomposed DS): normalized root mean squared error vs. TCNR (in dB); empirical ROC, probability of detection ($P_D$) vs. probability of false alarm ($P_{FA}$); computation time (in sec) vs. TCNR (in dB).

Numerical Examples (cont.) (slide 64)
Solutions of the single-objective optimization problems:
Minimizing the upper bound on the sparse-estimation error, we obtained $a^{(1)} = [0.54, 0.16, 0.83]^T$, irrespective of the target parameters.
Maximizing the squared Mahalanobis distance, we found
$a^{(2)} = [1, 0, 0]^T$ or $[0, 1, 0]^T$ or $[0, 0, 1]^T$ for Target 1,
$a^{(2)} = [1, 0, 0]^T$ for Target 2,
$a^{(2)} = [0, 1, 0]^T$ for Target 3.
Observation: The maximization of the squared Mahalanobis distance provided an adaptive waveform with all the signal energy concentrated on the single subcarrier that had the strongest target response, thus losing the frequency diversity of the system.

Numerical Examples (cont.) (slide 65)
Solutions of the NSGA-II (MOO problem) at the 50th generation, for Target 1, Target 2, and Target 3.
Figure (top row, one panel per target): solutions plotted in the $(a_1, a_2, a_3)$ space.
Figure (bottom row, one panel per target): Pareto front, squared Mahalanobis distance vs. squared upper bound on sparse-estimation error.

Numerical Examples (cont.) (slide 66)
Effect of the target scattering coefficient on the NSGA-II solutions: We averaged the whole population of 500 solutions and found
$a_{opt,avg} = [0.61, 0.39, 0.68]^T$ for Target 1,
$a_{opt,avg} = [0.88, 0.20, 0.36]^T$ for Target 2,
$a_{opt,avg} = [0.13, 0.96, 0.15]^T$ for Target 3.
Observation: The solution of the MOO distributes the energy of the optimal waveform across different subcarriers in proportion to the distribution of the target energy; i.e., it puts more signal energy into the subcarrier in which the target response is stronger.

Numerical Examples (cont.) (slide 67)
Performance improvement due to adaptive waveform design:
Figure (two panels, each comparing the fixed waveform, the $\ell_1$-CMSV-minimized adaptive waveform, and the NSGA-II-optimized adaptive waveform): RMSE, normalized root mean squared error vs. TCNR (in dB); ROC, probability of detection ($P_D$) vs. probability of false alarm ($P_{FA}$).

Outline (slide 68)
Introduction
Sparsity recovery
Application example
Future work

Open Problems (slide 69)
Develop computable tight bounds.
Bounds for other signal structures, such as block-sparse vectors and low-rank matrices.
Procedures for optimization of system design.
Computationally efficient algorithms for sensing matrices other than Fourier or Hadamard.
Minimization of modeling errors due to discrete grid mismatch.

Possible Approaches (slide 70)
Develop computable tighter bounds that control the average or typical system performance.
Encode more signal structures into the model rather than using bounds, for example using the continuous-parameter domain.

References (slide 71)
G. Tang and A. Nehorai, Verifiable and computable performance analysis of sparsity recovery, submitted for publication.
G. Tang and A. Nehorai, Performance analysis of sparse recovery based on constrained minimal singular values, IEEE Trans. Signal Processing, vol. 59, no. 12, pp. 5734-5745, Dec. 2011.
G. Tang and A. Nehorai, Fixed point theory and semidefinite programming for computable performance analysis of block-sparsity recovery, submitted for publication.
S. Sen, G. Tang, and A. Nehorai, Multiobjective optimization of OFDM radar waveform for target detection, IEEE Trans. Signal Processing, vol. 59, pp. 639-652, Feb. 2011.

Questions? (slide 72)

Thank You! (slide 73)