Detecting Sparse Structures in Data in Sub-Linear Time: A Group Testing Approach
Boaz Nadler, The Weizmann Institute of Science, Israel
Joint works with:
- Inbal Horev, Ronen Basri, Meirav Galun and Ery Arias-Castro
- Yi-Qing Wang, Alain Trouve, Yali Amit
- Roi Weiss, Chen Attias, Robert Krauthgamer
Dec 2017
Boaz Nadler Sublinear Time Group Testing 1
Statistical Challenges related to big data
In various applications (vision in particular), we collect so much data that either
1) data does not fit / cannot be processed on a single machine, or
2) standard algorithms that pass over all data are too slow / take too much computing power
Common approach to handle setting (1) is distributed learning: not the focus of this talk, take a look at [Rosenblatt & N. 16], "On the optimality of averaging in distributed statistical learning"
Statistical challenges related to big data
Focus of this talk: 2) Standard algorithms to solve a task are too slow
Two key challenges:
- [computational & practical] develop extremely fast algorithms (linear / sub-linear complexity)
- [theoretical] understand lower bounds on statistical accuracy under computational constraints
In this talk: study these two challenges for
(i) edge detection in large noisy images
(ii) finding sparse representations in high dimensional dictionaries
Edge Detection
Observe an $n_1 \times n_2$ image $I$ = array of pixel values
Goal: Detect edges in the image, typically boundaries between objects.
Search for curves $\Gamma$ such that, in the direction $n$ normal to the curve $\Gamma$, the gradient $\partial I / \partial n$ is large
- A fundamental task in low level image processing
- Well studied problem, many algorithms, well understood theory
Edge Detection at low SNR
Our interest: Edge detection in noisy and large 2D images and 3D video
Motivation for large: high resolution images in many applications
Motivation(s) for noisy images:
1. Images at non-ideal conditions: poor lighting, fog, rain, night
2. Surveillance applications
3. Real time object tracking in 3D video
Image prior:
- Interested in long straight (or weakly curved) edges
- Sparsity: image contains few edges
Example: Powerlines
[Figure: image of powerlines, axes in pixels]
Traditional Edge Detection Algorithms
Typical approach: Detect edges from local image gradients
Example: Canny Edge Detector
- complexity $O(n^2)$, linear in total number of image pixels
- fast, possibly suitable for real-time
Limitation: Does not work well at low SNR
Example: Canny, run-time 2.5 sec
Cannot detect faint powerlines of second tower
Modern Sophisticated Methods
- Statistical theory for limits of detectability
  [Arias-Castro, Donoho, Huo, 05] [Brandt, Galun, Basri, 07] [Alpert, Galun, Nadler, Basri, 10] [Ofir, Galun, Nadler, Basri, 15]
- (Theoretically) efficient multiscale algorithms, robust to noise and yet slow
Run time: O(min) for large images, O(hours) for video
Example: Straight Segment Detector, run-time 5 min
Challenge: Sublinear Time Edge Detection
Goal: Devise an edge detection algorithm that is (i) robust to noise and (ii) extremely fast
Given a noisy $n \times n$ image $I$, detect long straight edges in sublinear time:
- complexity $O(n^\alpha)$ with $\alpha < 2$
- touching only a fraction of the image/video pixels!
Questions:
a) Statistical: which edge strengths can one detect vs. $\alpha$?
b) Computational: optimal sampling scheme?
c) Practical: sub-linear time algorithm?
Problem Setup
Observe an $n \times n$ noisy image $I = I_0 + \xi$
- $I_0$: noise-free image
- $\xi$: i.i.d. additive noise, zero mean, variance $\sigma^2$
Goal: Detect edges in $I_0$ from noisy $I$
Assumptions:
- Clean image $I_0$ contains few step edges (sparsity)
- Edges of interest are straight and sufficiently long
Definition: Edge Signal to Noise Ratio = edge contrast / $\sigma$.
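The setup above can be simulated in a few lines. This is a minimal sketch under stated assumptions: the function names, the single horizontal step edge at mid-height, and the choice of Gaussian noise are illustrative only (the slides require just i.i.d. zero-mean noise with variance $\sigma^2$).

```python
import numpy as np

def make_noisy_step_image(n, contrast, sigma, seed=0):
    """Return (I, I0): noisy and clean n x n images with one horizontal step edge."""
    rng = np.random.default_rng(seed)
    I0 = np.zeros((n, n))
    I0[n // 2:, :] = contrast                  # step edge between rows n//2 - 1 and n//2
    xi = rng.normal(0.0, sigma, size=(n, n))   # i.i.d. noise, zero mean, variance sigma^2 (Gaussian is an assumption)
    return I0 + xi, I0

def edge_snr(contrast, sigma):
    """Edge SNR as defined on the slide: edge contrast divided by noise level."""
    return contrast / sigma

I, I0 = make_noisy_step_image(n=256, contrast=1.0, sigma=1.0)
snr = edge_snr(contrast=1.0, sigma=1.0)
```

At `snr = 1.0` the step is invisible at the level of a single pixel; it only becomes detectable after averaging many pixels along the edge.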
Theoretical Questions
Given a sub-linear budget:
1) what are optimal sampling schemes?
2) what are fundamental limitations on sub-linear edge detection?
3) what is the tradeoff between statistical accuracy and computational complexity?
Optimal Sublinear Edge Detection
For theoretical analysis, consider the following class of images:
$\mathcal{I} = \{I \text{ contains only noise, or one long fiber plus noise}\}$
Fundamental Limitations / Design Principles
Focus on detection under worst-case scenario
Lemma: If the number of observed pixels is $n^\alpha$ with $\alpha < 1$, then there exists $I \in \mathcal{I}$ whose edges cannot be detected
Theorem: Assume the number of observed pixels is $s$ and $s/n$ is an integer. Then
i) any optimal sampling scheme must observe exactly $s/n$ pixels per row
ii) sampling $s/n$ whole columns is an optimal scheme
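To make the theorem concrete, the sketch below builds a sampling mask that observes $s/n$ whole columns and checks that every row then contains exactly $s/n$ observed pixels. The function name and the equal spacing of the columns are assumptions made for illustration; the theorem itself only says that whole columns are one optimal choice.

```python
import numpy as np

def column_sampling_mask(n, s):
    """Boolean n x n mask that observes s/n whole columns (s pixels in total)."""
    if s % n != 0:
        raise ValueError("s must be a multiple of n")
    k = s // n                               # number of whole columns
    cols = (np.arange(k) * n) // k           # equally spaced columns (an assumption)
    mask = np.zeros((n, n), dtype=bool)
    mask[:, cols] = True
    return mask

mask = column_sampling_mask(n=512, s=512 * 8)
pixels_per_row = mask.sum(axis=1)            # should equal s/n = 8 in every row
```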
Statistical Accuracy vs. Computational Complexity
Definition: Edge SNR = edge contrast / noise level
Theorem: At complexity $O(n^\alpha)$, with $\alpha \ge 1$, SNR $\gtrsim \sqrt{\ln n / n^{\alpha - 1}}$
[Figure: SNR vs. $\alpha$ for $0 \le \alpha \le 2$, showing the curve $\sqrt{\ln n / n^{\alpha - 1}}$, which equals $\sqrt{\ln n}$ at $\alpha = 1$, and a region marked "not possible"]
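The tradeoff can be tabulated numerically. The sketch below reads the theorem as SNR $\gtrsim \sqrt{\ln n / n^{\alpha - 1}}$ and drops all constants, so the numbers show scaling only; the function name `snr_threshold` is hypothetical.

```python
import math

def snr_threshold(n, alpha):
    """Order-of-magnitude detectable edge SNR at complexity O(n^alpha), constants omitted."""
    if not 1.0 <= alpha <= 2.0:
        raise ValueError("the bound is stated for 1 <= alpha <= 2")
    return math.sqrt(math.log(n) / n ** (alpha - 1.0))

n = 1024
# alpha = 1: a single column's worth of pixels; alpha = 2: the whole image.
table = {alpha: snr_threshold(n, alpha) for alpha in (1.0, 1.25, 1.5, 1.75, 2.0)}
```

Under this reading, spending more computation (larger $\alpha$) lowers the weakest detectable edge, from about $\sqrt{\ln n}$ at $\alpha = 1$ to about $\sqrt{\ln n / n}$ when all pixels are used.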
Sublinear Edge Detection Algorithm
Key Idea: Sample few image strips
- first: detect edges in strips
- next: non-maximal suppression, edge localization
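The strip idea can be sketched as follows. This is not the algorithm from the talk: it is a simplified illustration that assumes a single, nearly horizontal step edge, vertical strips, a box-filter edge response, and a crude "keep the strongest response per strip" in place of proper non-maximal suppression. All names and parameter values are hypothetical.

```python
import numpy as np

def detect_in_strip(strip, half_height, threshold):
    """Return the row of the strongest horizontal step in a strip, or None."""
    row_means = strip.mean(axis=1)                        # average along the edge direction
    csum = np.concatenate(([0.0], np.cumsum(row_means)))
    h = half_height
    rows = np.arange(h, len(row_means) - h + 1)
    below = (csum[rows + h] - csum[rows]) / h             # mean of rows [r, r + h)
    above = (csum[rows] - csum[rows - h]) / h             # mean of rows [r - h, r)
    response = np.abs(below - above)
    best = int(np.argmax(response))                       # keep only the strongest response
    return int(rows[best]) if response[best] > threshold else None

def sublinear_edge_sketch(image, n_strips, strip_width, half_height, threshold):
    """Read only n_strips vertical strips and report (strip start column, edge row) pairs."""
    n = image.shape[1]
    starts = np.linspace(0, n - strip_width, n_strips).astype(int)
    hits = []
    for c0 in starts:
        row = detect_in_strip(image[:, c0:c0 + strip_width], half_height, threshold)
        if row is not None:
            hits.append((int(c0), row))
    return hits

rng = np.random.default_rng(0)
n, edge_row = 1024, 300
image = rng.normal(0.0, 1.0, size=(n, n))                 # noise with sigma = 1
image[edge_row:, :] += 1.0                                # step edge with contrast 1, so SNR = 1
hits = sublinear_edge_sketch(image, n_strips=4, strip_width=32, half_height=16, threshold=0.5)
fraction_read = 4 * 32 / n                                # fraction of image columns touched
```

Only 12.5% of the columns are read in this example. Linking the per-strip detections into full edges and refining their location is omitted.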
Example: noisy image, SNR = 1
[Figure panels: NOISY IMAGE, CANNY, SUB LINEAR]
Sublinear Edge Detection, run-time few seconds
Some Results on Real Images
Detection with roughly 10% of pixels sampled.
Real Images
Part II: Sparse Representations in High Dimensions
Problem Setup: [with Chen Attias and Roi Weiss]
- A dictionary $\Phi = [\phi_1, \phi_2, \ldots, \phi_N] \in \mathbb{R}^{p \times N}$
- Each atom $\phi_i$ normalized: $\|\phi_i\| = 1$
- High-dimensional: $p \gg 1$
- Possibly redundant: $N = Jp$ with $J \ge 1$
Definition: A signal $s \in \mathbb{R}^p$ is $m$-sparse over dictionary $\Phi$ if $s = \Phi\alpha$ where $|\mathrm{supp}(\alpha)| = m \ll p$
Goal: Given (noisy version of) $s$, find $\alpha$.
Applications: Image and signal analysis.
Computing Sparse Representation
[Davis et al, 97]: With no assumptions on $\Phi$ and on $m$, the problem is NP-hard
Key challenge: find the support. Once $\mathrm{supp}(\alpha)$ is known, recovering $\alpha$ requires $O(pm^2)$ operations.
Definition: The coherence $\mu$ of a dictionary $\Phi$ is $\mu = \max_{i \ne j} |\langle \phi_i, \phi_j \rangle|$
An $m$-sparse signal satisfies the MUTUAL-INCOHERENCE-PROPERTY (MIP) if $(2m - 1)\mu < 1$
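The coherence and the MIP condition are easy to check numerically. As an illustration, the sketch below uses a dictionary chosen here for convenience (it does not appear in the slides): the identity concatenated with a normalized Hadamard matrix, whose coherence is $1/\sqrt{p}$. The helper names are hypothetical.

```python
import numpy as np

def hadamard(p):
    """Sylvester construction of a p x p Hadamard matrix (p must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < p:
        H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
    return H

def coherence(Phi):
    """mu = max over i != j of |<phi_i, phi_j>| for column-normalized atoms."""
    Phi = Phi / np.linalg.norm(Phi, axis=0)
    gram = np.abs(Phi.T @ Phi)
    np.fill_diagonal(gram, 0.0)              # ignore the i == j terms
    return gram.max()

def satisfies_mip(m, mu):
    """Mutual-incoherence property: (2m - 1) * mu < 1."""
    return (2 * m - 1) * mu < 1

p = 64
Phi = np.hstack([np.eye(p), hadamard(p) / np.sqrt(p)])   # N = 2p atoms
mu = coherence(Phi)
max_sparsity_ok = satisfies_mip(4, mu)       # (2*4 - 1) / 8 < 1
too_dense = satisfies_mip(5, mu)             # (2*5 - 1) / 8 > 1
```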
Orthogonal Matching Pursuit
[Donoho & Elad 03, Tropp 04, others...]
Theorem: Suppose an $m$-sparse signal $s = \Phi\alpha$ satisfies the MIP condition. Then
- the solution of the Basis-Pursuit (BP) problem $\arg\min_{x \in \mathbb{R}^N} \|x\|_1$ s.t. $s = \Phi x$ recovers the representation $\alpha$ exactly
- Orthogonal Matching Pursuit (OMP) also recovers $\alpha$ exactly
Time complexity of OMP is $O(mp^2)$ and of BP even higher.
Can one compute a sparse representation faster?
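For reference, a textbook-style OMP can be written in a few lines. This is a minimal sketch, not code from the talk: it re-solves a least-squares problem at every iteration rather than updating a factorization, and the test dictionary (identity plus normalized Hadamard, with coherence $1/\sqrt{p}$) is an assumption chosen so that the MIP condition holds for $m = 3$.

```python
import numpy as np

def hadamard(p):
    """Sylvester construction of a p x p Hadamard matrix (p must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < p:
        H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
    return H

def omp(Phi, s, m, tol=1e-10):
    """Greedy sparse recovery: at most m iterations, one new atom per iteration."""
    N = Phi.shape[1]
    support = []
    coef = np.zeros(0)
    r = s.copy()                                  # residual
    for _ in range(m):
        if np.linalg.norm(r) <= tol:
            break
        c = Phi.T @ r                             # all N inner products: the costly step
        j = int(np.argmax(np.abs(c)))
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], s, rcond=None)
        r = s - Phi[:, support] @ coef            # residual orthogonal to the chosen atoms
    alpha = np.zeros(N)
    alpha[np.array(support, dtype=int)] = coef
    return alpha

p = 64
Phi = np.hstack([np.eye(p), hadamard(p) / np.sqrt(p)])   # mu = 1/8, so MIP holds for m <= 4
alpha_true = np.zeros(2 * p)
alpha_true[[3, 70, 100]] = [1.5, -2.0, 0.7]              # a 3-sparse representation
s = Phi @ alpha_true
alpha_hat = omp(Phi, s, m=3)
```

The line `c = Phi.T @ r` costs $O(Np)$ per iteration for an unstructured dictionary, which is the step the rest of the talk aims to speed up.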
Runtime of Orthogonal Matching Pursuit
Key quantity for identifying significant atoms $I_k$: $c_k = \Phi^\top r_{k-1} \in \mathbb{R}^N$, with all inner products between $r_{k-1}$ and all $N$ atoms in $\Phi$
Computing all $N$ inner products:
- For structured $\Phi$: runtime $N \log N$ (Fourier, wavelet, sparse-graph codes, ...)
- For non-structured $\Phi$: runtime $Np$
- If $N = Jp$: runtime quadratic in signal dimension, $O(p^2)$
Question: Identify largest entries of $c_k$ in nearly-linear time?
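The gap between structured and unstructured dictionaries can be seen with a Fourier dictionary, where all inner products come from one FFT. The sketch below assumes the unitary DFT dictionary with $N = p$ (a convenient special case, not one used in the talk) and compares the dense $O(p^2)$ product with the $O(p \log p)$ transform.

```python
import numpy as np

p = 256
rng = np.random.default_rng(0)
r = rng.normal(size=p)                               # stand-in for the residual r_{k-1}

# Unitary DFT dictionary: Phi[row, col] = exp(2*pi*i*row*col/p) / sqrt(p)
row, col = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
Phi = np.exp(2j * np.pi * row * col / p) / np.sqrt(p)

c_dense = Phi.conj().T @ r                           # O(p^2): explicit inner products
c_fast = np.fft.fft(r, norm="ortho")                 # O(p log p): same numbers via the FFT
```

For a dictionary without such structure no fast transform is available, which motivates the question on the slide.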
Our contribution: A group testing approach
Use (random) Tree-Based Statistical-Group-Testing data structure
- Finds at least one atom from $r_{k-1}$'s representation w.h.p.
- If coherence $\mu = O(1/\sqrt{p})$: $O(pm^2 \log(pm))$ time
- Memory: $O(Np \log(pm))$
OMP + Statistical-Group-Testing (GT-OMP)
- At most $m$ iterations for exact recovery
- Total runtime $O(pm^3 \log(pm))$
- For sparsity $m = o(p^{1/3})$: runtime sub-quadratic in $p$ (recall MIP requires $m = O(p^{1/2})$)
- For sparsity $m = O(\log p)$: runtime near-linear in $p$
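The slides do not spell out the data structure, so the following is only a generic illustration of tree-based group testing, not the authors' construction. It assumes $N$ is a power of two, stores at each tree node the sum of randomly signed atoms in its subtree, and descends toward the child whose pooled vector correlates more strongly with the residual. A single tree with a single descent is used here; a practical scheme would need repetitions to succeed with high probability.

```python
import numpy as np

def build_tree(Phi, rng):
    """levels[0] holds the signed atoms (leaves); each next level sums pairs of nodes."""
    N = Phi.shape[1]
    signs = rng.choice([-1.0, 1.0], size=N)       # random signs reduce cancellation in the sums
    levels = [Phi * signs]
    while levels[-1].shape[1] > 1:
        prev = levels[-1]
        levels.append(prev[:, 0::2] + prev[:, 1::2])
    return levels

def find_atom(levels, r):
    """Descend from the root, testing two pooled vectors per level (2 * log2(N) inner products)."""
    idx = 0
    for level in reversed(levels[:-1]):           # from the two children of the root down to the leaves
        left, right = 2 * idx, 2 * idx + 1
        scores = np.abs(level[:, [left, right]].T @ r)
        idx = left if scores[0] >= scores[1] else right
    return idx

rng = np.random.default_rng(0)
Phi = np.eye(16)                                  # toy orthonormal dictionary, N = p = 16
levels = build_tree(Phi, rng)
r = 3.0 * Phi[:, 11]                              # residual made of a single atom
found = find_atom(levels, r)
n_tests = 2 * (len(levels) - 1)                   # pooled inner products used
```

Here 8 pooled inner products replace the 16 individual ones; the saving grows with $N$, at the price of storing every level of the tree.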
Simulation results
- Random Gaussian dictionary: $N = 2p$, $\Phi_{ij} \sim \mathcal{N}(0, 1/p)$
- Sparsity $m = 4 \log_2 p$
Compared 4 algorithms:
- OMP [Pati et al 93, Mallat & Zhang 92]
- OMP-threshold [Yang & de Hoog 15]
- Stagewise OMP [Donoho et al 12]
- GT-OMP (ours)
Averaged over 25 random signals
All 4 algorithms: 100% success in recovering all 25 signals
Runtime vs. dimension
[Figure: number of inner products (log scale, $10^4$ to $10^6$) vs. dimension $p$ (1024, 2048, 4096, 8192) for OMP, OMP-T, StOMP and GT-OMP]
Summary
Increasing need for fast (possibly sub-linear) algorithms in various big-data applications
Considered edge detection and sparse representation
Common theme to both problems:
- Formulate estimation as a search in a very large space of possible hypotheses
- Construct coarse tests that rule out many hypotheses at once
- Perform more expensive/accurate tests on remaining hypotheses
Similar ideas: Gilbert et al, Willett et al 14, Meinshausen et al 09, Haupt et al, ...
Summary
Open Questions:
- Can a similar approach solve other inference/learning problems?
- Precisely quantify statistical vs. computational tradeoffs for other problems?
The End
Still a long way to go
Thank you!
www.weizmann.ac.il/math/nadler/