Sketching for Large-Scale Learning of Mixture Models
1 Sketching for Large-Scale Learning of Mixture Models
Nicolas Keriven, Université Rennes 1, Inria Rennes - Bretagne Atlantique. Advisor: Rémi Gribonval
2 Outline Introduction Practical Approach Results Theoretical analysis Conclusion and outlooks
7 Paths to Compressive Learning
Goal: compute parameters (PCA, classification, regression, density estimation...) from a large database. Idea: compress the database beforehand. Two regimes: a large (tall) database, see e.g. [Calderbank 2009], and a large (fat) «big data» database, see e.g. [Cormode 2011].
11 Inspiration
Sketch: contains particular information about the database, maintained online [Cormode 2011].
Learning: knowledge about the underlying probability distribution.
Compressive Sensing: recover a «low-dimensional» object from few linear measurements (e.g. a sparse vector, a low-rank matrix...).
Sketch = measurements of the underlying probability distribution.
16 Generalized Moments
From generalized moments, through (generalized) Compressive Sensing, to practical compressive learning. As in Compressive Sensing, the measurements are (random) projections; the sketch can be computed online and distributed. Question: robustness of the learning algorithm?
19 Mixture Model Estimation
Usual Compressive Sensing: sparsity in a finite dictionary. Generalized Compressive Sensing: a mixture model is sparse in an infinite, continuous dictionary!
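The slide's formulas were lost in transcription; under the notation used in the associated papers (which is an assumption here), the «sparse in a continuous dictionary» view can be written as:

```latex
% A K-mixture is a K-sparse combination of atoms from the
% continuous dictionary {p_theta : theta in Theta}:
p \;=\; \sum_{k=1}^{K} \alpha_k\, p_{\theta_k},
\qquad \alpha_k \ge 0,\quad \sum_{k=1}^{K} \alpha_k = 1,
\qquad \theta_k \in \Theta .
```

Unlike classical Compressive Sensing, the atoms p_theta range over a continuum of parameters, so finite-dictionary decoders do not apply directly.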
21 Outline
Practical Approach (Sections 2 & 3): a greedy algorithm inspired by Compressive Sensing; application to K-means and to GMMs with diagonal covariance.
Theoretical Analysis (Section 4): an information-preservation guarantee via infinite-dimensional Compressive Sensing.
31 Cost function
«Ideal» decoding scheme: NP-complete; two approaches, convex relaxation and greedy algorithms. See [Foucart 2013].
Proposed cost function (the ideal decoding scheme is analyzed in Section 4): highly non-convex; again two approaches, convex relaxation [Bunea 2010] and a greedy algorithm (proposed here).
34 Proposed algorithm
Orthogonal Matching Pursuit (OMP) [Mallat 1993, Pati 1993]:
1. Add the atom most correlated with the residual
2. Perform Least-Squares
3. Repeat until the desired sparsity
OMP with Replacement (OMPR) [Jain 2011], similar to CoSaMP [Needell 2008] or Subspace Pursuit [Dai 2009]:
1. Add the atom most correlated with the residual
2. Perform Hard-Thresholding (if necessary)
3. Perform Least-Squares
4. Repeat twice the desired sparsity
Compressive Learning OMPR (CLOMPR), proposed; we cannot just add a component:
1. Add the atom most correlated with the residual, with gradient descent
2. Perform Hard-Thresholding
3. Perform Non-Negative Least-Squares
4. Perform gradient descent on all parameters, initialized with the current ones
5. Repeat twice the desired sparsity
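To make the OMP template concrete, here is a minimal, self-contained sketch of classical finite-dictionary OMP in NumPy; the dictionary `A`, the sparsity `k` and the test signal are illustrative assumptions, not taken from the talk:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily recover a k-sparse x from y = A @ x."""
    support, residual, coef = [], y.copy(), np.zeros(0)
    for _ in range(k):
        # 1. add the atom most correlated with the residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        # 2. least-squares fit on the current support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    # 3. stop at the desired sparsity
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100))
A /= np.linalg.norm(A, axis=0)          # unit-norm atoms
x_true = np.zeros(100)
x_true[[5, 17, 60]] = [1.0, -1.5, 2.0]  # 3-sparse ground truth
x_hat = omp(A, A @ x_true, k=3)
```

CLOMPR replaces the argmax over a finite dictionary by a gradient descent over continuous parameters, and the least-squares step by a non-negative one.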
40 CLOMPR: illustration (schematic)
Goal: a 3-GMM, starting from an intermediary support.
1: add an atom with gradient descent; 2: Hard-Thresholding; 3: Non-Negative Least-Squares; 4: gradient descent on all parameters, yielding the final support.
46 Choice of sketch
To implement CLOMPR, the sketch and its gradient must have a closed-form expression w.r.t. the parameters. The distribution is spatially localized, so incoherent sampling is needed -> Fourier sampling, which has a closed form for many models (including alpha-stable ones)! See random Fourier sampling [Candes 2006] and random Fourier features [Rahimi 2007].
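The Fourier sketch is the empirical characteristic function sampled at m random frequencies. A minimal NumPy illustration (the 1-D Gaussian test data and the frequency scale are illustrative assumptions), checked against the Gaussian closed form E[e^{i w x}] = e^{i w mu - w^2 sigma^2 / 2}:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.5, 0.7
X = rng.normal(mu, sigma, size=100_000)   # 1-D Gaussian data
omega = rng.normal(0.0, 1.0, size=32)     # m = 32 random frequencies

# empirical sketch: the characteristic function sampled at the omegas
z_emp = np.exp(1j * np.outer(X, omega)).mean(axis=0)

# closed form for a Gaussian: E[e^{i w x}] = e^{i w mu - w^2 sigma^2 / 2}
z_model = np.exp(1j * omega * mu - 0.5 * omega**2 * sigma**2)
```

The two agree up to O(1/sqrt(N)) sampling noise; it is this closed form (and its gradient in mu and sigma) that CLOMPR exploits.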
50 Designing the frequency distribution
The «scale» of the frequency distribution must be adjusted. Three options:
Adjust by hand: not that difficult, the method is quite robust.
Cross-validation: can be very long, but used in practice [Sutherland 2015].
Automatic (proposed): partial pre-processing of the data, with a heuristic based on GMM-like distributions.
51 Summary
Given a database:
1. Design the sketching operator: partial pre-processing to choose the scale, then draw the frequencies.
2. Compute the sketch (online, distributed, GPU-friendly).
3. Derive the mixture model with CLOMPR.
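The sketch computation is just an average of per-item features, which is why it can be done online or distributed: chunk sketches merge exactly by a weighted average. A small NumPy illustration (the data, frequencies and chunk sizes are arbitrary assumptions):

```python
import numpy as np

def sketch(X, omega):
    """Empirical Fourier sketch of a data chunk X (N x d) at frequencies omega (m x d)."""
    return np.exp(1j * X @ omega.T).mean(axis=0)

rng = np.random.default_rng(1)
X = rng.standard_normal((10_000, 3))   # full database
omega = rng.standard_normal((20, 3))   # m = 20 frequencies, shared by all workers

z_full = sketch(X, omega)              # sketch of the whole database

# distributed variant: sketch each chunk separately, then merge by a weighted average
chunks = np.split(X, [3_000, 7_000])
z_merged = sum(len(c) * sketch(c, omega) for c in chunks) / len(X)
```

Since the merge is exact, streaming updates (one item at a time) follow the same rule with chunk size 1.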
56 Application: K-means, GMM
K-means. Classic approach: the Lloyd-Max algorithm [Lloyd 1982] (Matlab's kmeans). Compressive approach: model the clustered distribution as a noisy mixture of Diracs.
GMM with diagonal covariance. Classic approach: EM [Dempster 1977] (VLFeat's gmm). Compressive approach: sketch and fit the GMM directly.
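Under the mixture-of-Diracs model, a centroid c has sketch e^{i omega c}, so centroids can be recovered by matching the empirical sketch. A toy 1-D NumPy illustration with a single cluster (all numbers are illustrative assumptions, and a grid search stands in for CLOMPR's gradient step):

```python
import numpy as np

rng = np.random.default_rng(0)
X = 2.0 + 0.05 * rng.standard_normal(10_000)      # one tight cluster around 2.0
omega = 3.0 * rng.standard_normal(64)             # 64 random frequencies
z = np.exp(1j * np.outer(X, omega)).mean(axis=0)  # empirical sketch

# the sketch of a single Dirac at position c is exp(1j * omega * c);
# recover the centroid by matching it to the empirical sketch (grid search)
candidates = np.linspace(-5.0, 5.0, 1001)
errors = [np.linalg.norm(z - np.exp(1j * omega * c)) for c in candidates]
c_hat = candidates[int(np.argmin(errors))]
```

The cluster's spread only damps the empirical sketch slightly, so the best-matching Dirac sits at the cluster center; with K clusters, CLOMPR adds one Dirac at a time instead of searching a grid.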
58 Large-scale results
K-means (n=5, K=10), compared with Matlab's kmeans; GMMs with diagonal covariance (n=5, K=5), compared with VLFeat's gmm. The sketched approach is faster and more memory-efficient on large databases, and the number of measurements does not depend on N.
61 Application: spectral clustering
K-means (n=10, K=10, m=1000); mean and variance over 50 experiments. Spectral clustering for classification [Ng 2001], on the augmented MNIST database [Loosli 2007]. CLOMPR performs better and is more stable with a large database.
62 Application: speaker recognition
GMM (n=12, K=64), with a variant of CLOMPR that is faster at large K. The classical GMM method for speaker recognition [Reynolds 2000] is used as a proof of concept, on the NIST 2005 database with MFCC features. Again, the sketched approach performs better on a large database.
66 Information-preservation guarantees
A guarantee for CLOMPR itself is difficult (non-convex, randomized...), so consider the ideal decoder that solves the cost function. Questions: robustness to using the empirical sketch instead of the true one? Robustness to the distribution not being exactly a mixture model? Guarantees in terms of the usual learning cost functions?
K-means: sum of distances to the closest centroid.
GMMs: negative log-likelihood.
69 K-means: result
Goal: minimize the expected risk (the sum of distances to the closest centroid). Hypotheses: separation of the centroids, bounded domain; reweighted Fourier features (needed for the theory, no effect in practice). If the sketch is large enough, the excess risk is controlled with high probability.
72 GMMs with known covariance: result
Goal: minimize the expected risk (the negative log-likelihood). Hypotheses: large enough separation of the means, bounded domain; Fourier features. If the sketch is large enough, then with high probability the excess risk is controlled, up to the L1 distance from p* to the set of (separated) GMMs.
73 GMM trade-off
Trade-off between the required separation of the means and the size of the sketch.
75 GMM with unknown diagonal covariance
Open questions: can the guarantee be related to a learning cost function? Does it establish the efficiency of the «compressive» approach? Nevertheless, a guarantee holds when the distribution is exactly a GMM.
79 Number of measurements
(Plots: relative SSE for K-means; relative log-likelihood for GMMs with known and with diagonal covariance, as a function of the number of measurements.) In theory at least a certain sketch size is required; what is needed empirically?
82 Sketch of proof: principle
Goal: existence of an instance-optimal decoder, via the Lower Restricted Isometry Property (LRIP) [Bourrier 2014]. Example of model error: the quantization error.
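The slide's formulas were lost; in the notation of [Bourrier 2014] (model set S, sketching operator A, noise e), which is an assumption here, the LRIP and the instance optimality it yields read, schematically:

```latex
% Lower RIP on the model set:
\|\pi_1 - \pi_2\| \;\le\; C_{\mathcal{A}}\, \|\mathcal{A}\pi_1 - \mathcal{A}\pi_2\|_2
\qquad \forall\, \pi_1, \pi_2 \in \mathfrak{S},

% which is equivalent to the existence of an instance-optimal decoder \Delta:
\|\pi - \Delta(\mathcal{A}\pi + e)\| \;\lesssim\; d(\pi, \mathfrak{S}) + \|e\|_2 .
```

The right-hand side of the second inequality is exactly the robustness sought above: a model-error term (distance of the true distribution to the mixture model set) plus a noise term (empirical versus true sketch).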
87 Proving the LRIP
1: Prove a non-uniform LRIP, via kernel mean embeddings [Smola 2007] and random (Fourier) features [Rahimi 2007], using concentration arguments (Hoeffding, Bernstein, chaining).
2: Use epsilon-coverings to extend it to a uniform LRIP. Covering the basic set is easy (finite-dimensional; e.g. the quantization error [Boufounos 2016]); covering the normalized secant set is difficult (infinite-dimensional).
90 Results: sufficient conditions
If the model set merely has finite covering numbers: bad guarantees (e.g. GMMs with unknown covariance).
Mixtures of sufficiently separated distributions with smooth random features: guarantees w.r.t. a smoothed risk (e.g. mixtures of Diracs / K-means with «smooth» random features).
With «smoother» random features: guarantees w.r.t. the risk itself (e.g. mixtures of Diracs / K-means, GMMs with known covariance).
96 Conclusions
Contributions: a greedy algorithm for large-scale mixture learning from random moments; an efficient heuristic to design the sketching operator as Fourier sampling; application to mixtures of Diracs and GMMs; evaluation on synthetic and real data; information-preservation guarantees using infinite-dimensional Compressive Sensing.
97 The SketchMLbox
SketchMLbox (sketchml.gforge.inria.fr): mixtures of Diracs («K-means»), GMMs with known covariance, GMMs with unknown diagonal covariance. Soon: alpha-stable mixtures and Gaussian Locally Linear Mapping [Deleforge 2014]. Optimized for user-defined
103 Outlooks: algorithmic guarantees
Algorithmic guarantees? Non-convex cost function, randomized algorithm. Locally convex? Basin of attraction? [Jacques 2016, Candes 2016] Reached by CLOMPR under reasonable hypotheses? Stopping condition? Recent result: locally block convex.
(Figure: the cost function for a 3-GMM in dimension 1, with components at positions {x,y,z} = {-3,0,3}.)
114 Outlooks: extensions of the method
1. Bridge the observed gap between theory and practice? It does not come from the epsilon-coverings; improve the concentration inequalities?
2. Extend the framework to other tasks? A recent paper submitted to AISTATS treats PCA. Other existing uses of Fourier sketches, e.g. classification [Sutherland 2015]; other kernel methods (algorithmic? theoretical?).
3. Extension to multi-layer sketches (neural networks...)? These may be adapted to e.g. GMMs with unknown covariance. The equivalence between the LRIP and instance optimality is still valid for non-linear operators, but CLOMPR and the current sufficient conditions are no longer valid.
115 Thank you!
K., Bourrier, Gribonval, Perez. Sketching for Large-Scale Learning of Mixture Models. ICASSP 2016.
K., Bourrier, Gribonval, Perez. Sketching for Large-Scale Learning of Mixture Models (extended version). Submitted to Information and Inference; arXiv preprint.
K., Tremblay, Gribonval, Traonmilin. Compressive K-means. ICASSP 2017.
Gribonval, Blanchard, K., Traonmilin. Random Moments for Sketched Statistical Learning. Submitted to AISTATS 2017; extended version soon.
116 Appendix : CLOMPR
Rémi Gribonval, Inria Rennes - Bretagne Atlantique, remi.gribonval@inria.fr.
Contributors & collaborators: Anthony Bourrier, Nicolas Keriven, Yann Traonmilin, Gilles Puy, Gilles Blanchard, Mike Davies, Tomer Peleg.
More informationApproximating the Covariance Matrix with Low-rank Perturbations
Approximating the Covariance Matrix with Low-rank Perturbations Malik Magdon-Ismail and Jonathan T. Purnell Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 12180 {magdon,purnej}@cs.rpi.edu
More informationLinear Methods for Regression. Lijun Zhang
Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationMATCHING PURSUIT WITH STOCHASTIC SELECTION
2th European Signal Processing Conference (EUSIPCO 22) Bucharest, Romania, August 27-3, 22 MATCHING PURSUIT WITH STOCHASTIC SELECTION Thomas Peel, Valentin Emiya, Liva Ralaivola Aix-Marseille Université
More informationSparse Solutions of Systems of Equations and Sparse Modelling of Signals and Images
Sparse Solutions of Systems of Equations and Sparse Modelling of Signals and Images Alfredo Nava-Tudela ant@umd.edu John J. Benedetto Department of Mathematics jjb@umd.edu Abstract In this project we are
More informationTensor Methods for Feature Learning
Tensor Methods for Feature Learning Anima Anandkumar U.C. Irvine Feature Learning For Efficient Classification Find good transformations of input for improved classification Figures used attributed to
More informationSparse analysis Lecture V: From Sparse Approximation to Sparse Signal Recovery
Sparse analysis Lecture V: From Sparse Approximation to Sparse Signal Recovery Anna C. Gilbert Department of Mathematics University of Michigan Connection between... Sparse Approximation and Compressed
More informationSensing systems limited by constraints: physical size, time, cost, energy
Rebecca Willett Sensing systems limited by constraints: physical size, time, cost, energy Reduce the number of measurements needed for reconstruction Higher accuracy data subject to constraints Original
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationAdvances in Manifold Learning Presented by: Naku Nak l Verm r a June 10, 2008
Advances in Manifold Learning Presented by: Nakul Verma June 10, 008 Outline Motivation Manifolds Manifold Learning Random projection of manifolds for dimension reduction Introduction to random projections
More informationCoSaMP: Greedy Signal Recovery and Uniform Uncertainty Principles
CoSaMP: Greedy Signal Recovery and Uniform Uncertainty Principles SIAM Student Research Conference Deanna Needell Joint work with Roman Vershynin and Joel Tropp UC Davis, May 2008 CoSaMP: Greedy Signal
More informationComparison of Modern Stochastic Optimization Algorithms
Comparison of Modern Stochastic Optimization Algorithms George Papamakarios December 214 Abstract Gradient-based optimization methods are popular in machine learning applications. In large-scale problems,
More informationScalable Subspace Clustering
Scalable Subspace Clustering René Vidal Center for Imaging Science, Laboratory for Computational Sensing and Robotics, Institute for Computational Medicine, Department of Biomedical Engineering, Johns
More informationProbabilistic Machine Learning. Industrial AI Lab.
Probabilistic Machine Learning Industrial AI Lab. Probabilistic Linear Regression Outline Probabilistic Classification Probabilistic Clustering Probabilistic Dimension Reduction 2 Probabilistic Linear
More informationLarge-Scale L1-Related Minimization in Compressive Sensing and Beyond
Large-Scale L1-Related Minimization in Compressive Sensing and Beyond Yin Zhang Department of Computational and Applied Mathematics Rice University, Houston, Texas, U.S.A. Arizona State University March
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1
More informationSPARSE signal representations have gained popularity in recent
6958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 10, OCTOBER 2011 Blind Compressed Sensing Sivan Gleichman and Yonina C. Eldar, Senior Member, IEEE Abstract The fundamental principle underlying
More informationPerformance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project
Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore
More informationPre-weighted Matching Pursuit Algorithms for Sparse Recovery
Journal of Information & Computational Science 11:9 (214) 2933 2939 June 1, 214 Available at http://www.joics.com Pre-weighted Matching Pursuit Algorithms for Sparse Recovery Jingfei He, Guiling Sun, Jie
More informationA Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices
A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices Ryota Tomioka 1, Taiji Suzuki 1, Masashi Sugiyama 2, Hisashi Kashima 1 1 The University of Tokyo 2 Tokyo Institute of Technology 2010-06-22
More informationSparse Approximation and Variable Selection
Sparse Approximation and Variable Selection Lorenzo Rosasco 9.520 Class 07 February 26, 2007 About this class Goal To introduce the problem of variable selection, discuss its connection to sparse approximation
More informationSome Recent Advances. in Non-convex Optimization. Purushottam Kar IIT KANPUR
Some Recent Advances in Non-convex Optimization Purushottam Kar IIT KANPUR Outline of the Talk Recap of Convex Optimization Why Non-convex Optimization? Non-convex Optimization: A Brief Introduction Robust
More informationCompressed Sensing: Extending CLEAN and NNLS
Compressed Sensing: Extending CLEAN and NNLS Ludwig Schwardt SKA South Africa (KAT Project) Calibration & Imaging Workshop Socorro, NM, USA 31 March 2009 Outline 1 Compressed Sensing (CS) Introduction
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationCS534 Machine Learning - Spring Final Exam
CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the
More informationClustering K-means. Machine Learning CSE546. Sham Kakade University of Washington. November 15, Review: PCA Start: unsupervised learning
Clustering K-means Machine Learning CSE546 Sham Kakade University of Washington November 15, 2016 1 Announcements: Project Milestones due date passed. HW3 due on Monday It ll be collaborative HW2 grades
More informationWhen Dictionary Learning Meets Classification
When Dictionary Learning Meets Classification Bufford, Teresa 1 Chen, Yuxin 2 Horning, Mitchell 3 Shee, Liberty 1 Mentor: Professor Yohann Tendero 1 UCLA 2 Dalhousie University 3 Harvey Mudd College August
More informationCoSaMP. Iterative signal recovery from incomplete and inaccurate samples. Joel A. Tropp
CoSaMP Iterative signal recovery from incomplete and inaccurate samples Joel A. Tropp Applied & Computational Mathematics California Institute of Technology jtropp@acm.caltech.edu Joint with D. Needell
More informationInverse problems and sparse models (6/6) Rémi Gribonval INRIA Rennes - Bretagne Atlantique, France.
Inverse problems and sparse models (6/6) Rémi Gribonval INRIA Rennes - Bretagne Atlantique, France remi.gribonval@inria.fr Overview of the course Introduction sparsity & data compression inverse problems
More informationDetecting Sparse Structures in Data in Sub-Linear Time: A group testing approach
Detecting Sparse Structures in Data in Sub-Linear Time: A group testing approach Boaz Nadler The Weizmann Institute of Science Israel Joint works with Inbal Horev, Ronen Basri, Meirav Galun and Ery Arias-Castro
More informationRestricted Strong Convexity Implies Weak Submodularity
Restricted Strong Convexity Implies Weak Submodularity Ethan R. Elenberg Rajiv Khanna Alexandros G. Dimakis Department of Electrical and Computer Engineering The University of Texas at Austin {elenberg,rajivak}@utexas.edu
More informationRandomized Algorithms
Randomized Algorithms Saniv Kumar, Google Research, NY EECS-6898, Columbia University - Fall, 010 Saniv Kumar 9/13/010 EECS6898 Large Scale Machine Learning 1 Curse of Dimensionality Gaussian Mixture Models
More informationThree Generalizations of Compressed Sensing
Thomas Blumensath School of Mathematics The University of Southampton June, 2010 home prev next page Compressed Sensing and beyond y = Φx + e x R N or x C N x K is K-sparse and x x K 2 is small y R M or
More informationMixtures of Gaussians with Sparse Regression Matrices. Constantinos Boulis, Jeffrey Bilmes
Mixtures of Gaussians with Sparse Regression Matrices Constantinos Boulis, Jeffrey Bilmes {boulis,bilmes}@ee.washington.edu Dept of EE, University of Washington Seattle WA, 98195-2500 UW Electrical Engineering
More informationRandom Sampling of Bandlimited Signals on Graphs
Random Sampling of Bandlimited Signals on Graphs Pierre Vandergheynst École Polytechnique Fédérale de Lausanne (EPFL) School of Engineering & School of Computer and Communication Sciences Joint work with
More informationGraph Partitioning Using Random Walks
Graph Partitioning Using Random Walks A Convex Optimization Perspective Lorenzo Orecchia Computer Science Why Spectral Algorithms for Graph Problems in practice? Simple to implement Can exploit very efficient
More informationClustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014.
Clustering K-means Machine Learning CSE546 Carlos Guestrin University of Washington November 4, 2014 1 Clustering images Set of Images [Goldberger et al.] 2 1 K-means Randomly initialize k centers µ (0)
More informationMultipath Matching Pursuit
Multipath Matching Pursuit Submitted to IEEE trans. on Information theory Authors: S. Kwon, J. Wang, and B. Shim Presenter: Hwanchol Jang Multipath is investigated rather than a single path for a greedy
More informationA new method on deterministic construction of the measurement matrix in compressed sensing
A new method on deterministic construction of the measurement matrix in compressed sensing Qun Mo 1 arxiv:1503.01250v1 [cs.it] 4 Mar 2015 Abstract Construction on the measurement matrix A is a central
More informationClass 2 & 3 Overfitting & Regularization
Class 2 & 3 Overfitting & Regularization Carlo Ciliberto Department of Computer Science, UCL October 18, 2017 Last Class The goal of Statistical Learning Theory is to find a good estimator f n : X Y, approximating
More informationMIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design
MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications Class 19: Data Representation by Design What is data representation? Let X be a data-space X M (M) F (M) X A data representation
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationFast Sparse Representation Based on Smoothed
Fast Sparse Representation Based on Smoothed l 0 Norm G. Hosein Mohimani 1, Massoud Babaie-Zadeh 1,, and Christian Jutten 2 1 Electrical Engineering Department, Advanced Communications Research Institute
More informationApproximate Kernel Methods
Lecture 3 Approximate Kernel Methods Bharath K. Sriperumbudur Department of Statistics, Pennsylvania State University Machine Learning Summer School Tübingen, 207 Outline Motivating example Ridge regression
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationGeneralized greedy algorithms.
Generalized greedy algorithms. François-Xavier Dupé & Sandrine Anthoine LIF & I2M Aix-Marseille Université - CNRS - Ecole Centrale Marseille, Marseille ANR Greta Séminaire Parisien des Mathématiques Appliquées
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable models Background PCA
More informationTUM 2016 Class 3 Large scale learning by regularization
TUM 2016 Class 3 Large scale learning by regularization Lorenzo Rosasco UNIGE-MIT-IIT July 25, 2016 Learning problem Solve min w E(w), E(w) = dρ(x, y)l(w x, y) given (x 1, y 1 ),..., (x n, y n ) Beyond
More informationRecovery of Sparse Signals Using Multiple Orthogonal Least Squares
Recovery of Sparse Signals Using Multiple Orthogonal east Squares Jian Wang, Ping i Department of Statistics and Biostatistics arxiv:40.505v [stat.me] 9 Oct 04 Department of Computer Science Rutgers University
More informationExponential decay of reconstruction error from binary measurements of sparse signals
Exponential decay of reconstruction error from binary measurements of sparse signals Deanna Needell Joint work with R. Baraniuk, S. Foucart, Y. Plan, and M. Wootters Outline Introduction Mathematical Formulation
More informationLinear Regression. Aarti Singh. Machine Learning / Sept 27, 2010
Linear Regression Aarti Singh Machine Learning 10-701/15-781 Sept 27, 2010 Discrete to Continuous Labels Classification Sports Science News Anemic cell Healthy cell Regression X = Document Y = Topic X
More informationQuantized Iterative Hard Thresholding:
Quantized Iterative Hard Thresholding: ridging 1-bit and High Resolution Quantized Compressed Sensing Laurent Jacques, Kévin Degraux, Christophe De Vleeschouwer Louvain University (UCL), Louvain-la-Neuve,
More informationEE 381V: Large Scale Optimization Fall Lecture 24 April 11
EE 381V: Large Scale Optimization Fall 2012 Lecture 24 April 11 Lecturer: Caramanis & Sanghavi Scribe: Tao Huang 24.1 Review In past classes, we studied the problem of sparsity. Sparsity problem is that
More informationLearning sets and subspaces: a spectral approach
Learning sets and subspaces: a spectral approach Alessandro Rudi DIBRIS, Università di Genova Optimization and dynamical processes in Statistical learning and inverse problems Sept 8-12, 2014 A world of
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationRecovery of Low-Rank Plus Compressed Sparse Matrices with Application to Unveiling Traffic Anomalies
July 12, 212 Recovery of Low-Rank Plus Compressed Sparse Matrices with Application to Unveiling Traffic Anomalies Morteza Mardani Dept. of ECE, University of Minnesota, Minneapolis, MN 55455 Acknowledgments:
More informationMachine Learning. Chao Lan
Machine Learning Chao Lan Clustering and Dimensionality Reduction Clustering Kmeans DBSCAN Gaussian Mixture Model Dimensionality Reduction principal component analysis manifold learning Other Feature Processing
More informationAlgorithms for sparse analysis Lecture I: Background on sparse approximation
Algorithms for sparse analysis Lecture I: Background on sparse approximation Anna C. Gilbert Department of Mathematics University of Michigan Tutorial on sparse approximations and algorithms Compress data
More informationConvergence Rates of Kernel Quadrature Rules
Convergence Rates of Kernel Quadrature Rules Francis Bach INRIA - Ecole Normale Supérieure, Paris, France ÉCOLE NORMALE SUPÉRIEURE NIPS workshop on probabilistic integration - Dec. 2015 Outline Introduction
More informationOptimization for Compressed Sensing
Optimization for Compressed Sensing Robert J. Vanderbei 2014 March 21 Dept. of Industrial & Systems Engineering University of Florida http://www.princeton.edu/ rvdb Lasso Regression The problem is to solve
More information