Exponential decay of reconstruction error from binary measurements of sparse signals

Deanna Needell
Joint work with R. Baraniuk, S. Foucart, Y. Plan, and M. Wootters
Outline
Introduction
Mathematical Formulation & Methods
Practical CS
Other notions of sparsity
Heavy quantization
Adaptive sampling
The mathematical problem
1. Signal of interest $f \in \mathbb{C}^d$ ($= \mathbb{C}^{N \times N}$)
2. Measurement operator $A : \mathbb{C}^d \to \mathbb{C}^m$ ($m \ll d$)
3. Measurements $y = Af + \xi$
4. Problem: reconstruct the signal $f$ from the measurements $y$
Sparsity
Measurements $y = Af + \xi$. Assume $f$ is sparse:
In the coordinate basis: $\|f\|_0 := |\mathrm{supp}(f)| \le s \ll d$
In an orthonormal basis: $f = Bx$ where $\|x\|_0 \le s \ll d$
In practice, we encounter compressible signals; $f_s$ denotes the best $s$-sparse approximation to $f$.
Many applications...
Radar, error correction
Computational biology, geophysical data analysis
Data mining, classification
Neuroscience
Imaging
Sparse channel estimation, sparse initial state estimation
Topology identification of interconnected systems...
Sparsity...
Sparsity in the coordinate basis: $f = x$
Reconstructing the signal $f$ from measurements $y$
$\ell_1$-minimization [Candès-Romberg-Tao]: let $A$ satisfy the Restricted Isometry Property and set
$$\hat{f} = \operatorname{argmin}_g \|g\|_1 \quad \text{such that} \quad \|Ag - y\|_2 \le \varepsilon,$$
where $\|\xi\|_2 \le \varepsilon$. Then we can stably recover the signal $f$:
$$\|\hat{f} - f\|_2 \lesssim \varepsilon + \frac{\|x - x_s\|_1}{\sqrt{s}}.$$
This error bound is optimal.
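As a concrete illustration, here is a minimal sketch of this recovery program in Python, assuming cvxpy is available; the problem sizes, seed, and noise level are hypothetical, not taken from the talk.

```python
import cvxpy as cp
import numpy as np

# Toy problem: recover an s-sparse f from noisy compressed measurements.
rng = np.random.default_rng(0)
d, m, s, eps = 120, 60, 4, 0.01
A = rng.standard_normal((m, d)) / np.sqrt(m)   # RIP-friendly Gaussian matrix
f = np.zeros(d)
f[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
xi = rng.standard_normal(m)
y = A @ f + eps * xi / np.linalg.norm(xi)      # noise with ||xi||_2 = eps

# l1-minimization: f_hat = argmin ||g||_1  s.t.  ||A g - y||_2 <= eps
g = cp.Variable(d)
problem = cp.Problem(cp.Minimize(cp.norm1(g)), [cp.norm2(A @ g - y) <= eps])
problem.solve()
print("recovery error:", np.linalg.norm(g.value - f))
```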
Restricted Isometry Property
$A$ satisfies the Restricted Isometry Property (RIP) when there is $\delta < c$ such that
$$(1 - \delta)\|f\|_2^2 \le \|Af\|_2^2 \le (1 + \delta)\|f\|_2^2 \quad \text{whenever } \|f\|_0 \le s.$$
Gaussian or Bernoulli $m \times d$ measurement matrices satisfy the RIP with high probability when $m \gtrsim s \log d$. Random Fourier matrices and others with a fast multiply have a similar property: $m \gtrsim s \log^4 d$.
Other recovery methods: Greedy Algorithms
If $A$ satisfies the RIP, then $A^*A$ is close to the identity on sparse vectors.
Use the proxy $p = A^*y = A^*Ax \approx x$.
Threshold to maintain sparsity: $\hat{x} = H_s(p)$.
Repeat (Iterative Hard Thresholding).
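A minimal NumPy sketch of Iterative Hard Thresholding under these assumptions (a Gaussian matrix scaled so that $A^*A \approx I$ on sparse vectors); sizes and iteration count are illustrative, not from the talk.

```python
import numpy as np

def hard_threshold(v, s):
    """H_s: keep the s largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def iht(A, y, s, iters=50):
    """Iterate x <- H_s(x + A^T (y - A x)): form the proxy, then threshold."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = hard_threshold(x + A.T @ (y - A @ x), s)
    return x

# Toy usage with a Gaussian matrix normalized so A^T A ~ I on sparse vectors.
rng = np.random.default_rng(0)
n, m, s = 200, 80, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
print("error:", np.linalg.norm(iht(A, A @ x_true, s) - x_true))
```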
The One-Bit Compressive Sensing Problem

Standard CS: vectors $x \in \mathbb{R}^n$ with $\|x\|_0 \le s$ acquired via nonadaptive linear measurements $\langle a_i, x \rangle$, $i = 1, \dots, m$.
In practice, measurements need to be quantized.
One-Bit CS: extreme quantization as $y = \mathrm{sign}(Ax)$, i.e., $y_i = \mathrm{sign}\langle a_i, x \rangle$, $i = 1, \dots, m$.
Goal: find reconstruction maps $\Delta : \{\pm 1\}^m \to \mathbb{R}^n$ such that, assuming the $\ell_2$-normalization of $x$ (why? because $\mathrm{sign}(A(cx)) = \mathrm{sign}(Ax)$ for every $c > 0$, so the magnitude of $x$ is invisible to one-bit measurements),
$$\|x - \Delta(y)\|_2 \le \gamma$$
provided the oversampling factor satisfies
$$\lambda := \frac{m}{s \ln(n/s)} \ge f(\gamma)$$
for $f$ slowly increasing as $\gamma$ decreases to zero; equivalently, $\|x - \Delta(y)\|_2 \le g(\lambda)$ for $g$ rapidly decreasing to zero as $\lambda$ increases.
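A two-line check of the scale invariance that forces the normalization (illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
A, x = rng.standard_normal((8, 5)), rng.standard_normal(5)
# Positive rescaling of x leaves every one-bit measurement unchanged.
print(np.array_equal(np.sign(A @ (3.7 * x)), np.sign(A @ x)))  # True
```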
A visual
Existing Theoretical Results

Convex optimization algorithms [Plan-Vershynin 13a, 13b].
Uniform, nonadaptive, no quantization error: if $A \in \mathbb{R}^{m \times n}$ is a Gaussian matrix, then w/hp
$$\|x - \Delta_{LP}(y)\|_2 \lesssim \lambda^{-1/5} \quad \text{whenever } \|x\|_0 \le s,\ \|x\|_2 = 1.$$
Nonuniform, nonadaptive, random quantization error: fix $x \in \mathbb{R}^n$ with $\|x\|_0 \le s$, $\|x\|_2 = 1$. If $A \in \mathbb{R}^{m \times n}$ is a Gaussian matrix, then w/hp $\|x - \Delta_{SOCP}(y)\|_2 \lesssim \lambda^{-1/4}$.
Uniform, nonadaptive, adversarial quantization error: if $A \in \mathbb{R}^{m \times n}$ is a Gaussian matrix, then w/hp $\|x - \Delta_{SOCP}(y)\|_2 \lesssim \lambda^{-1/12}$ whenever $\|x\|_0 \le s$, $\|x\|_2 = 1$.
Limitations of the Framework

Power decay is optimal, since $\|x - \Delta_{opt}(y)\|_2 \gtrsim \lambda^{-1}$ even if $\mathrm{supp}(x)$ is known in advance [Goyal-Vetterli-Thao 98].
Geometric intuition for $x \in S^{n-1}$: http://dsp.rice.edu/1bitcs/choppyanimated.gif
Remedy: adaptive choice of dithers $\tau_1, \dots, \tau_m$ in
$$y_i = \mathrm{sign}(\langle a_i, x \rangle - \tau_i), \quad i = 1, \dots, m.$$
Exponential Decay: General Strategy

Rely on an order-one quantization/recovery scheme: for any $x \in \mathbb{R}^n$ with $\|x\|_0 \le s$, $\|x\|_2 \le R$, take $q \asymp s \ln(n/s)$ one-bit measurements and estimate both the direction and the magnitude of $x$ by producing $x'$ such that $\|x - x'\|_2 \le R/4$.
Let $x \in \mathbb{R}^n$ with $\|x\|_0 \le s$, $\|x\|_2 \le R$. Start with $x^0 = 0$.
For $t = 0, 1, \dots$: estimate $x - x^t$ by $(x - x^t)'$, then set $x^{t+1} = H_s(x^t + (x - x^t)')$, so that $\|x - x^{t+1}\|_2 \le R/2^{t+1}$ (hard thresholding keeps the iterate sparse, so the order-one scheme applies to the residual, at the price of at most doubling the error: $R/2^{t+1}$ rather than $R/4^{t+1}$).
After $T$ iterations, the number of measurements is $m = qT$, and
$$\|x - x^T\|_2 \le \frac{R}{2^T} = R \cdot 2^{-m/q} = R \exp(-c\lambda).$$
A software step is needed to compute the thresholds $\tau_i = \langle a_i, x^t \rangle$.
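A sketch of this strategy in Python, with the order-one scheme mocked by an oracle that returns the residual to within a quarter of the current radius (a real scheme would spend $q \asymp s\ln(n/s)$ fresh one-bit measurements per round); all names and sizes are illustrative.

```python
import numpy as np

def hard_threshold(v, s):
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def mock_order_one(target, radius, rng):
    """Oracle standing in for an order-one scheme: returns an estimate of
    `target` with error at most radius/4, as the real schemes guarantee."""
    noise = rng.standard_normal(target.shape)
    noise *= rng.random() * (radius / 4) / np.linalg.norm(noise)
    return target + noise

def exponential_scheme(x, s, R, T, rng):
    xt, radius = np.zeros_like(x), R
    for _ in range(T):
        # Estimate the residual, update, re-sparsify; thresholding at most
        # doubles the error, turning radius/4 into radius/2.
        xt = hard_threshold(xt + mock_order_one(x - xt, radius, rng), s)
        radius /= 2.0          # invariant: ||x - xt||_2 <= radius
    return xt

rng = np.random.default_rng(0)
n, s, R, T = 100, 5, 1.0, 10
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
x *= 0.9 * R / np.linalg.norm(x)          # ||x||_2 <= R
print("final error:", np.linalg.norm(x - exponential_scheme(x, s, R, T, rng)))
```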
Order-One Scheme Based on Convex Optimization

Measurement vectors $a_1, \dots, a_q$: independent $N(0, I_n)$.
Dithers $\tau_1, \dots, \tau_q$: independent $N(0, R^2)$.
$$x' = \operatorname{argmin} \|z\|_1 \quad \text{subject to} \quad \|z\|_2 \le R, \quad y_i(\langle a_i, z \rangle - \tau_i) \ge 0.$$
If $q \ge c\,\delta^{-4} s \ln(n/s)$, then w/hp $\|x - x'\|_2 \le \delta R$ whenever $\|x\|_0 \le s$, $\|x\|_2 \le R$.
Pros: dithers are nonadaptive.
Cons: slow; post-quantization error not handled.
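A sketch of this program with cvxpy (assumed available); the sizes and seed are illustrative, and the sign convention $y_i = \mathrm{sign}(\langle a_i, x\rangle - \tau_i)$ follows the slides.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
n, s, q, R = 60, 3, 500, 1.0
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
x *= 0.8 * R / np.linalg.norm(x)         # ||x||_2 <= R

A = rng.standard_normal((q, n))          # rows a_i ~ N(0, I_n)
tau = R * rng.standard_normal(q)         # dithers tau_i ~ N(0, R^2)
y = np.sign(A @ x - tau)                 # one-bit dithered measurements

# x' = argmin ||z||_1  s.t.  ||z||_2 <= R  and  y_i(<a_i, z> - tau_i) >= 0
z = cp.Variable(n)
consistency = cp.multiply(y, A @ z - tau) >= 0
cp.Problem(cp.Minimize(cp.norm1(z)), [cp.norm2(z) <= R, consistency]).solve()
print("error:", np.linalg.norm(z.value - x))
```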
Order-One Scheme Based on Hard Thresholding

Measurement vectors $a_1, \dots, a_q$: independent $N(0, I_n)$.
Use half of them to estimate the direction of $x$ as $u = H_s(A^* \mathrm{sign}(Ax))$.
Construct sparse vectors $v, w$ (with $\mathrm{supp}(v) \subseteq \mathrm{supp}(u)$) from $2Rv$, $2Ru$, $w$, and $x$ as in the plane-geometry figure.
Use the other half to estimate the direction of $x - w$, applying hard thresholding again.
Plane geometry then yields estimates of both the direction and the magnitude of $x$.
Cons: dithers $\tau_i = \langle a_i, w \rangle$ are adaptive.
Pros: deterministic, fast, handles pre- and post-quantization errors.
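A sketch of the first step (direction estimation) in NumPy: by the sign product embedding property discussed later, $H_s(A^*\mathrm{sign}(Ax))$, once normalized, aligns with $x/\|x\|_2$ when $q \gg s\ln(n/s)$. The sizes are illustrative.

```python
import numpy as np

def direction_estimate(A, bits, s):
    """u = H_s(A^T sign(Ax)), normalized, as an estimate of x / ||x||_2."""
    p = A.T @ bits
    u = np.zeros_like(p)
    keep = np.argsort(np.abs(p))[-s:]
    u[keep] = p[keep]
    return u / np.linalg.norm(u)

rng = np.random.default_rng(2)
n, s, q = 100, 5, 4000
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
x /= np.linalg.norm(x)
A = rng.standard_normal((q, n))          # rows a_i ~ N(0, I_n)
u = direction_estimate(A, np.sign(A @ x), s)
print("direction error:", np.linalg.norm(u - x))  # small for q >> s ln(n/s)
```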
Measurement Errors

Pre-quantization error $e \in \mathbb{R}^m$ in $y_i = \mathrm{sign}(\langle a_i, x \rangle - \tau_i + e_i)$.
If $\|e\|_\infty \le \varepsilon R\, 2^{-T}$ (or $\|e^t\|_2 \le \varepsilon \sqrt{q}\, \|x - x^t\|_2$ throughout), then
$$\|x - x^T\|_2 \lesssim R\, 2^{-T} = R \exp(-c\lambda)$$
for the convex-optimization and hard-thresholding schemes.
Post-quantization error $f \in \{\pm 1\}^m$ in $y_i = f_i\, \mathrm{sign}(\langle a_i, x \rangle - \tau_i)$.
If $\mathrm{card}(\{i : f_i^t = -1\}) \le \eta q$ throughout, then
$$\|x - x^T\|_2 \lesssim R\, 2^{-T} = R \exp(-c\lambda)$$
for the hard-thresholding scheme.
Numerical Illustration

[Figure: relative error $\|x - x^*\|/\|x\|$ vs. iteration $t$. Left: 1000 hard-thresholding-based tests, n = 100, s = 15, m = 100000, T = 10 (4 minutes), with perfect one-bit measurements, pre-quantization noise (st. dev. 0.1), and bit flips (5%). Right: 1000 second-order cone programming based tests, n = 100, s = 10, m = 20000, T = 5 (11 hours), with perfect one-bit measurements and pre-quantization noise (st. dev. 0.005).]
Numerical Illustration, ctd

[Figure: relative error $\|x - x^*\|/\|x\|$ (log scale) vs. oversampling factor $m/(s \log(n/s))$, for $s = 10, 15, 20, 25$. Left: 500 hard-thresholding-based tests, n = 100, m/s = 2000:2000:20000, T = 6 (9 hours). Right: 100 second-order cone programming based tests, n = 100, m/s = 20:20:200, T = 5 (11 hours).]
Ingredients for the Proofs

Let $A \in \mathbb{R}^{q \times n}$ have independent $N(0, 1)$ entries.
Sign Product Embedding Property: if $q \ge C\delta^{-6} s \ln(n/s)$, then w/hp
$$\left| \frac{\sqrt{\pi/2}}{q} \langle Aw, \mathrm{sign}(Ax) \rangle - \langle w, x \rangle \right| \le \delta$$
for all $w, x \in \mathbb{R}^n$ with $\|w\|_0, \|x\|_0 \le s$ and $\|w\|_2 = \|x\|_2 = 1$.
Simultaneous $(\ell_2, \ell_1)$-Quotient Property: w/hp, every $e \in \mathbb{R}^q$ can be written as $e = Au$ with
$$\|u\|_2 \le d\, \|e\|_2 / \sqrt{q} \quad \text{and} \quad \|u\|_1 \le d\, \sqrt{s_*}\, \|e\|_2 / \sqrt{q},$$
where $s_* = q / \ln(n/q)$.
Restricted Isometry Property: if $q \ge C\delta^{-2} s \ln(n/s)$, then w/hp
$$\left| \frac{1}{q} \|Ax\|_2^2 - \|x\|_2^2 \right| \le \delta \|x\|_2^2 \quad \text{for all } x \in \mathbb{R}^n \text{ with } \|x\|_0 \le s.$$
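A quick Monte Carlo sanity check of the sign product embedding heuristic (illustrative sizes); the constant $\sqrt{\pi/2}$ comes from $\mathbb{E}|g| = \sqrt{2/\pi}$ for a standard Gaussian $g$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, s, q = 50, 4, 100000

def unit_sparse():
    v = np.zeros(n)
    v[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
    return v / np.linalg.norm(v)

w, x = unit_sparse(), unit_sparse()
A = rng.standard_normal((q, n))
# sqrt(pi/2)/q * <Aw, sign(Ax)> should concentrate around <w, x>.
estimate = np.sqrt(np.pi / 2) / q * (A @ w) @ np.sign(A @ x)
print(f"estimate {estimate:.4f} vs true inner product {w @ x:.4f}")
```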
Ingredients for the Proofs, ctd

Random hyperplane tessellations of $\sqrt{s} B_1^n \cap S^{n-1}$:
$a_1, \dots, a_q \in \mathbb{R}^n$ independent $N(0, I_n)$.
If $q \ge C\delta^{-4} s \ln(n/s)$, then w/hp all $x, x' \in \sqrt{s} B_1^n \cap S^{n-1}$ with $\mathrm{sign}\langle a_i, x \rangle = \mathrm{sign}\langle a_i, x' \rangle$, $i = 1, \dots, q$, satisfy $\|x - x'\|_2 \le \delta$.
Random hyperplane tessellations of $\sqrt{s} B_1^n \cap B_2^n$:
$a_1, \dots, a_q \in \mathbb{R}^n$ independent $N(0, I_n)$, $\tau_1, \dots, \tau_q \in \mathbb{R}$ independent $N(0, 1)$;
apply the previous result to $[a_i, \tau_i]$, $[x, -1]$, $[x', -1]$.
If $q \ge C\delta^{-4} s \ln(n/s)$, then w/hp all $x, x' \in \sqrt{s} B_1^n \cap B_2^n$ with $\mathrm{sign}(\langle a_i, x \rangle - \tau_i) = \mathrm{sign}(\langle a_i, x' \rangle - \tau_i)$, $i = 1, \dots, q$, satisfy $\|x - x'\|_2 \le \delta$.
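The lifting step is elementary: a dithered sign in $\mathbb{R}^n$ is a plain sign in $\mathbb{R}^{n+1}$. A short check (illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(4)
a, x, tau = rng.standard_normal(6), rng.standard_normal(6), rng.standard_normal()
# sign(<a, x> - tau) = sign(<[a, tau], [x, -1]>): dithers vanish after lifting.
lifted = np.concatenate([a, [tau]]) @ np.concatenate([x, [-1.0]])
print(np.sign(a @ x - tau) == np.sign(lifted))  # True
```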
Thank you!
E-mail: dneedell@cmc.edu
Web: www.cmc.edu/pages/faculty/dneedell

References:
R. Baraniuk, S. Foucart, D. Needell, Y. Plan, M. Wootters. Exponential decay of reconstruction error from binary measurements of sparse signals. Submitted.
E. J. Candès, J. Romberg, T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8):1207–1223, 2006.
E. J. Candès, Y. C. Eldar, D. Needell, P. Randall. Compressed sensing with coherent and redundant dictionaries. Applied and Computational Harmonic Analysis, 31(1):59–73, 2010.
M. A. Davenport, D. Needell, M. B. Wakin. Signal space CoSaMP for sparse recovery with redundant dictionaries. Submitted.
D. Needell, R. Ward. Stable image reconstruction using total variation minimization. Journal of Fourier Analysis and Applications, to appear.
Y. Plan, R. Vershynin. One-bit compressed sensing by linear programming. Communications on Pure and Applied Mathematics, to appear.