Sparse Proteomics Analysis (SPA)


Sparse Proteomics Analysis (SPA)
Toward a Mathematical Theory for Feature Selection from Forward Models
Martin Genzel, Technische Universität Berlin
Winter School on Compressed Sensing, December 5, 2015

Outline
1. Biological Background
2. Sparse Proteomics Analysis (SPA)
3. Theoretical Foundation by High-dimensional Estimation Theory

Part 1: Biological Background

What is Proteomics?
The pathological mechanisms of many diseases, such as cancer, manifest at the level of protein activity. To improve clinical treatment options and early diagnostics, we need to understand protein structures and their interactions!
Proteins are long chains of amino acids that control many biological and chemical processes in the human body. The entire set of proteins present at a certain point in time is called a proteome. Proteomics is the large-scale study of the human proteome.

What is Mass Spectrometry?
How can we capture a proteome? Mass spectrometry (MS) is a popular technique to detect the abundance of proteins in samples (blood, urine, etc.).
[Schematic work-flow: sample → laser → detector → mass spectrum, i.e., intensity (cts) over mass (m/z)]

Real-World MS-Data
[Plot: measured mass spectrum, intensity (cts) over mass (m/z)]
MS-vector: $x = (x_1, \dots, x_d) \in \mathbb{R}^d$; index ≙ mass/feature, entry ≙ intensity/amplitude.

Feature Selection from MS-Data
Goal: Detect a small set of features (a disease fingerprint) that allows for an appropriate distinction between the diseased and the healthy group.
[Schematic work-flow: blood samples from healthy and diseased individuals → mass spectra (MS) → comparing spectra / feature selection → disease fingerprint]

Mathematical Problem Formulation
Supervised learning: We are given $n$ samples $(x_1, y_1), \dots, (x_n, y_n)$.
$x_k \in \mathbb{R}^d$: mass spectrum of the $k$-th patient
$y_k \in \{-1, +1\}$: health status of the $k$-th patient (healthy $= +1$, diseased $= -1$)
Goal: Learn a feature vector $\omega \in \mathbb{R}^d$ which
is sparse, i.e., has few non-zero entries (→ stability, avoids overfitting), and
whose entries correspond to peaks that are highly correlated with the disease (→ interpretability, biological relevance).

How to learn a fingerprint $\omega$?

Part 2: Sparse Proteomics Analysis (SPA)

Sparse Proteomics Analysis (SPA)
Sparse Proteomics Analysis is a generic framework to meet this challenge.
Input: sample pairs $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^d \times \{-1, +1\}$
Compute:
1. Preprocessing (smoothing, standardization)
2. Feature selection (LASSO, $\ell_1$-SVM, robust 1-bit CS) ← rest of this talk
3. Postprocessing (sparsification)
Output: sparse feature vector $\omega \in \mathbb{R}^d$ → biomarker identification, dimension reduction
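For concreteness, here is a minimal sketch of this pipeline in Python. It is not the talk's implementation: the helper name spa_fit, the parameter values, and the use of scikit-learn's penalized Lasso (in place of the $\ell_1$-constrained formulation discussed later) are my own assumptions.

```python
# Hypothetical sketch of the SPA pipeline (preprocess -> feature selection -> sparsify).
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

def spa_fit(X, y, s=10, smooth_width=5, alpha=0.01):
    """X: (n, d) array of mass spectra, y: labels in {-1, +1}; returns an s-sparse feature vector."""
    # 1. Preprocessing: smooth each spectrum, then standardize every feature (centering + normalizing).
    kernel = np.ones(smooth_width) / smooth_width
    X_smooth = np.apply_along_axis(lambda row: np.convolve(row, kernel, mode="same"), 1, X)
    X_std = StandardScaler().fit_transform(X_smooth)

    # 2. Feature selection: penalized LASSO as a stand-in for the l1-constrained version on the slides.
    omega = Lasso(alpha=alpha).fit(X_std, y).coef_.copy()

    # 3. Postprocessing: sparsify by keeping only the s largest entries in magnitude.
    omega[np.argsort(np.abs(omega))[:-s]] = 0.0
    return omega
```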

Feature Selection (Geometric Intuition)
Linear separation model: Find a feature vector $\omega \in \mathbb{R}^d$ such that $y_k = \operatorname{sign}(\langle x_k, \omega \rangle)$ for many $k \in \{1, \dots, n\}$. Moreover, $\omega$ should be sparse and interpretable.

Feature Selection via the LASSO
The LASSO (Tibshirani '96):
$$\min_{\omega \in \mathbb{R}^d} \ \sum_{k=1}^{n} \big(y_k - \langle x_k, \omega \rangle\big)^2 \quad \text{subject to} \quad \|\omega\|_1 \le R$$
A multivariate approach, originally designed for linear regression models $y_k \approx \langle x_k, \omega \rangle$, $k = 1, \dots, n$, but also applicable to non-linear models (→ next part).
Later: $R \approx \sqrt{s}$ to allow for $s$-sparse solutions (with unit norm).
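As an illustration only (the talk does not prescribe a solver), the constrained LASSO above could be solved with an off-the-shelf convex-programming package such as cvxpy:

```python
# Illustrative constrained-LASSO solver (cvxpy chosen for convenience; not the talk's code).
import cvxpy as cp

def constrained_lasso(X, y, R):
    """min_w sum_k (y_k - <x_k, w>)^2  subject to  ||w||_1 <= R."""
    n, d = X.shape
    w = cp.Variable(d)
    problem = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w)), [cp.norm1(w) <= R])
    problem.solve()
    return w.value
```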

Some Numerical Results
5-fold cross-validation for real-world pancreas data (156 samples):
1. Learn a feature vector $\omega$ by SPA, using 80% of the samples.
2. Classify the remaining 20% of the samples by an ordinary SVM, after projecting onto $\operatorname{supp}(\omega)$.
3. Iterate this procedure 12 times for random partitions.
[Plot: classification accuracy for different sparsity levels $s = \#\operatorname{supp}(\omega)$]
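The evaluation loop described above might look roughly as follows; this is one plausible reading of the slide, with spa_fit being the hypothetical pipeline sketched earlier and the 80/20 split repeated 12 times as stated.

```python
# Sketch of the repeated 80/20 evaluation; spa_fit is the hypothetical routine sketched earlier.
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.svm import SVC

def evaluate_spa(X, y, s=10, n_repeats=12):
    """Fit SPA on 80% of the samples, train an ordinary SVM on supp(omega), report mean test accuracy."""
    splitter = StratifiedShuffleSplit(n_splits=n_repeats, test_size=0.2, random_state=0)
    accuracies = []
    for train, test in splitter.split(X, y):
        omega = spa_fit(X[train], y[train], s=s)      # learn the fingerprint on the training part
        support = np.flatnonzero(omega)               # project all spectra onto supp(omega)
        clf = SVC(kernel="linear").fit(X[train][:, support], y[train])
        accuracies.append(clf.score(X[test][:, support], y[test]))
    return float(np.mean(accuracies))
```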

But what about theoretical guarantees?

Part 3: Theoretical Foundation by High-dimensional Estimation Theory

Toward a Theoretical Foundation of SPA
Linear separation model (explains the observations/labels):
$$y_k = \operatorname{sign}(\langle x_k, \omega_0 \rangle), \quad k = 1, \dots, n$$
Forward model (explains the random distribution of the data):
$$x_k = \sum_{m=1}^{M} s_{m,k} a_m + n_k, \quad k = 1, \dots, n$$
$a_m$: deterministic feature atom, a sampled Gaussian peak ($\in \mathbb{R}^d$)
$s_{m,k}$: random latent factor specifying the peak amplitude ($\in \mathbb{R}$)
$n_k$: random baseline noise ($\in \mathbb{R}^d$)
Supposing that sufficiently many samples are given, can we learn the sparse fingerprint $\omega_0$?
Problem: The vector $\omega_0$ is not unique because some features are perfectly correlated → no hope for support recovery or approximation.
Idea: Separate the fingerprint from its data representation!
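To make the forward model concrete, the following sketch generates synthetic MS-like data from it. All dimensions, peak widths, and noise levels are assumed values chosen for illustration, not taken from the talk.

```python
# Synthetic data from the forward model x_k = sum_m s_{m,k} a_m + n_k (assumed parameter values).
import numpy as np

rng = np.random.default_rng(0)
d, M, n, sigma = 500, 20, 200, 0.05

# Deterministic atoms a_m: sampled Gaussian peaks at random positions on the mass axis.
grid = np.linspace(0.0, 1.0, d)
centers, width = rng.uniform(0.1, 0.9, size=M), 0.01
D = np.exp(-(grid[None, :] - centers[:, None]) ** 2 / (2 * width ** 2))   # rows are the atoms a_m

# Latent peak amplitudes s_k ~ N(0, I_M) and baseline noise n_k ~ N(0, sigma^2 I_d).
S = rng.standard_normal((n, M))
X = S @ D + sigma * rng.standard_normal((n, d))

# A sparse fingerprint omega_0 and labels from the linear separation model y_k = sign(<x_k, omega_0>).
omega0 = np.zeros(d)
omega0[rng.choice(d, size=5, replace=False)] = rng.standard_normal(5)
y = np.sign(X @ omega0)
```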

Combining the Models
Assumptions:
$$x_k = \sum_{m=1}^{M} s_{m,k} a_m + n_k, \quad k = 1, \dots, n$$
$s_k := (s_{1,k}, \dots, s_{M,k}) \sim \mathcal{N}(0, I_M)$ (peak amplitudes)
$n_k \sim \mathcal{N}(0, \sigma^2 I_d)$ (noise vector)
$a_1, \dots, a_M \in \mathbb{R}^d$ arbitrary (peak) atoms, stacked row-wise into the dictionary $D := [a_1, \dots, a_M]^\top \in \mathbb{R}^{M \times d}$
Put this into the classification model:
$$y_k = \operatorname{sign}(\langle x_k, \omega_0 \rangle) = \operatorname{sign}\Big(\Big\langle \sum_{m=1}^{M} s_{m,k} a_m + n_k,\ \omega_0 \Big\rangle\Big) = \operatorname{sign}\big(\langle D^\top s_k + n_k, \omega_0 \rangle\big) = \operatorname{sign}\big(\langle s_k, \underbrace{D\omega_0}_{=:z_0} \rangle + \langle n_k, \omega_0 \rangle\big)$$

Signal Space vs. Coefficient Space
$$x_k = \sum_{m=1}^{M} s_{m,k} a_m + n_k = D^\top s_k + n_k$$
Let us first assume that $n_k = 0$ (no baseline noise). Then
$$y_k = \operatorname{sign}(\langle x_k, \omega_0 \rangle) = \operatorname{sign}(\langle s_k, z_0 \rangle), \quad \text{where } z_0 = D\omega_0.$$
$z_0$ has a (non-unique) representation in the dictionary $D$ with sparse coefficients $\omega_0$.
$z_0$ lives in the signal space $\mathbb{R}^M$ (independent of the specific data type).
$\omega_0$ lives in the coefficient space $\mathbb{R}^d$ (data dependent).
→ Try to show a recovery result for $z_0$!
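A tiny self-contained check of the key identity $\langle x_k, \omega_0 \rangle = \langle s_k, D\omega_0 \rangle$ that links the two spaces (noiseless case; random dimensions chosen purely for illustration):

```python
# Self-contained check of <x_k, omega_0> = <s_k, D omega_0> in the noiseless case (illustrative sizes).
import numpy as np

rng = np.random.default_rng(2)
M, d = 30, 200
D = rng.standard_normal((M, d))          # rows play the role of the atoms a_m
omega0 = rng.standard_normal(d)
s = rng.standard_normal(M)               # one latent vector s_k

x = D.T @ s                              # noiseless spectrum x_k = D^T s_k
z0 = D @ omega0                          # signal-space representation z_0 = D omega_0
assert np.allclose(x @ omega0, s @ z0)   # coefficient-space and signal-space views agree
```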

What Does This Mean for the LASSO?
$y_k = \operatorname{sign}(\langle x_k, \omega_0 \rangle) = \operatorname{sign}(\langle s_k, z_0 \rangle)$ with $z_0 = D\omega_0$
SPA via the LASSO:
$$\min_{\omega \in R B_1^d} \sum_{k=1}^{n} \big(y_k - \langle x_k, \omega \rangle\big)^2 \ \overset{z := D\omega}{=} \ \min_{z \in R \cdot DB_1^d} \sum_{k=1}^{n} \big(y_k - \langle s_k, z \rangle\big)^2$$
The left-hand problem is solvable in practice; the right-hand problem is solvable in theory.
Warning: The minimizers live in different spaces!
Warning: We know neither $D$ nor $s_k$, but only their product.
Idea: Apply results for the K-LASSO with $K = R \cdot DB_1^d$!

A Simplified Version of Roman Vershynin's Result
Theorem (Plan, Vershynin '15). Suppose that $s_k \sim \mathcal{N}(0, I_M)$, $z_0 \in S^{M-1}$, and the observations follow
$$y_k = \operatorname{sign}(\langle s_k, z_0 \rangle), \quad k = 1, \dots, n.$$
Put $\mu = \sqrt{2/\pi}$ and assume that $\mu z_0 \in K$, where $K$ is convex, and $n \gtrsim w(K)^2$. Then, with high probability, the solution $\hat{z}$ of the K-LASSO satisfies
$$\|\hat{z} - \mu z_0\|_2 \lesssim \sqrt{\frac{w(K)}{\sqrt{n}}}.$$
The (global) mean width of a bounded set $K \subseteq \mathbb{R}^M$ is given by $w(K) = \mathbb{E}\, \sup_{u \in K} \langle g, u \rangle$, where $g \sim \mathcal{N}(0, I_M)$.
Application to SPA: Assume that $K = \mu R \cdot DB_1^d$, so that $\mu z_0 \in K$ whenever $z_0 = D\omega_0$ for some $\omega_0 \in R B_1^d$, and assume that the columns of $D$ are normalized. Then $w(K) \lesssim R \sqrt{\log(d)}$.
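The $\sqrt{\log(d)}$ factor can be made tangible with a quick Monte Carlo estimate of $w(DB_1^d)$ (a sketch of mine, not from the talk): since $DB_1^d$ is the convex hull of the signed columns of $D$, the supremum of $\langle g, u \rangle$ is attained at one of them.

```python
# Monte Carlo estimate of the mean width w(D B_1^d) (illustration; sizes are arbitrary).
import numpy as np

rng = np.random.default_rng(0)
M, d, trials = 50, 2000, 500

D = rng.standard_normal((M, d))
D /= np.linalg.norm(D, axis=0)             # normalized columns, as assumed in the theorem

# D B_1^d is the convex hull of the signed columns, so sup_u <g, u> = max_j |<g, d_j>|.
G = rng.standard_normal((trials, M))
print("estimated w(D B_1^d):", np.abs(G @ D).max(axis=1).mean())
print("sqrt(2*log(2d)) bound:", np.sqrt(2 * np.log(2 * d)))
```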

A Recovery Guarantee for SPA
Theorem (G. '15). Suppose that $s_k \sim \mathcal{N}(0, I_M)$. Let $z_0 \in S^{M-1}$ and assume that there exists $R > 0$ such that $z_0 = D\omega_0$ for some $\omega_0 \in R B_1^d$. The observations follow
$$y_k = \operatorname{sign}(\langle s_k, z_0 \rangle) = \operatorname{sign}(\langle x_k, \omega_0 \rangle), \quad k = 1, \dots, n,$$
and the number of samples satisfies $n \gtrsim R^2 \log(d)$. Then, with high probability, the solution of the LASSO
$$\hat{z} = D\hat{\omega} = D \cdot \operatorname*{argmin}_{\omega \in R B_1^d} \sum_{k=1}^{n} \big(y_k - \langle x_k, \omega \rangle\big)^2$$
satisfies
$$\Big\| D\hat{\omega} - \sqrt{\tfrac{2}{\pi}}\, D\omega_0 \Big\|_2 = \Big\| \hat{z} - \sqrt{\tfrac{2}{\pi}}\, z_0 \Big\|_2 \lesssim \left( \frac{R^2 \log(d)}{n} \right)^{1/4}.$$
Equivalently, $\hat{z}$ is the solution of $\operatorname*{argmin}_{z \in R \cdot DB_1^d} \sum_{k=1}^{n} (y_k - \langle s_k, z \rangle)^2$, which is the formulation the proof works with.
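As a sanity check of the guarantee (my own sketch, not part of the talk), one can simulate the whole setup: normalized dictionary columns, Gaussian latent factors, one-bit labels, the $\ell_1$-constrained LASSO in $\omega$, and then compare $D\hat{\omega}$ with $\sqrt{2/\pi}\, z_0$. cvxpy is used only as a convenient solver; all sizes are assumptions.

```python
# End-to-end sanity check of the recovery guarantee (assumed sizes, generic solver).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
M, d, n, s = 40, 300, 2000, 5

D = rng.standard_normal((M, d))
D /= np.linalg.norm(D, axis=0)                     # normalized dictionary columns

omega0 = np.zeros(d)
omega0[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
omega0 /= np.linalg.norm(D @ omega0)               # scale so that z_0 = D omega_0 lies on the sphere
z0 = D @ omega0
R = np.abs(omega0).sum()                           # l1 radius making omega_0 feasible

S = rng.standard_normal((n, M))                    # latent factors s_k
X = S @ D                                          # noiseless spectra x_k = D^T s_k
y = np.sign(S @ z0)                                # one-bit observations

w = cp.Variable(d)
cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w)), [cp.norm1(w) <= R]).solve()

mu = np.sqrt(2 / np.pi)
print("||D omega_hat - mu z_0||_2 :", np.linalg.norm(D @ w.value - mu * z0))
print("(R^2 log(d) / n)^(1/4)     :", (R**2 * np.log(d) / n) ** 0.25)
```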

Practical Relevance for MS-Data?
Extensions:
- Baseline noise $n_k \sim \mathcal{N}(0, \sigma^2 I_d)$
- Non-trivial covariance matrix, i.e., $s_k \sim \mathcal{N}(0, \Sigma)$
- Adversarial bit-flips in the model $y_k = \operatorname{sign}(\langle x_k, \omega_0 \rangle)$
How to achieve normalized columns in $D$, and how to guarantee that $R \approx \sqrt{s}$, i.e., that $s$-sparse vectors are allowed? → Standardize the data (centering + normalizing).
Given $\hat{\omega}$, how to switch over to the signal space? ($D$ is unknown) → Identify $\operatorname{supp}(\hat{\omega})$ with peaks (manual approach).
Message of this talk: An $s$-sparse disease fingerprint can be accurately recovered from only $O(s \log(d))$ samples!

THANK YOU FOR YOUR ATTENTION!
Further Reading
M. Genzel. Sparse Proteomics Analysis: Toward a Mathematical Foundation of Feature Selection and Disease Classification. Master's Thesis.
Y. Plan, R. Vershynin. The generalized Lasso with non-linear observations. arXiv preprint, 2015.

What to Do Next?
- Development of an abstract framework: What kind of properties should the dictionary $D$ have?
- Extension/generalization of the results: more complicated models and algorithms
- Numerical verification of the theory
- Other examples from real-world applications: bio-informatics, neuro-imaging, astronomy, chemistry, ...
- Dictionary learning / factor analysis: What can we learn about $D$?
