Analysis of Greedy Algorithms


1 Analysis of Greedy Algorithms
Jiahui Shen, Florida State University
Oct. 26th

2 Outline
- Introduction
- Regularity condition
- Analysis of orthogonal matching pursuit
- Analysis of the forward-backward greedy algorithm
- Analysis of hard-thresholding pursuit

3 Introduction
- Greedy algorithms: optimize at each step; no global optimality guarantee in general
- Examples: Boosting (AdaBoost, gradient boosting), matching pursuit (OMP, CoSaMP), forward-backward algorithms (FoBa)

4 Some notation
- Abbreviate l(Xβ; y) as l(β)
- J(β): support of β, i.e. J(β) = {j : β_j ≠ 0}
- X_S: sub-matrix of X formed by the columns in set S
- β_S: sub-vector of β restricted to set S
- β̄: the true coefficient vector; β^t: the estimate of β at iteration t
- J̄\J: the elements in J̄ but not in J, i.e. J̄ ∩ J^C
- |J|: cardinality of the set J
- e_j: the vector whose jth entry is 1 and all other entries are 0

5 Example
- Consider OMP (greedy least squares) with true model y = Xβ̄ + ε
- Note: p > n, so X^T X is not invertible
- OMP procedure:
  - Select and update the support: j^t = argmax_j |∇l(β^{t-1})_j| = argmax_j |⟨X_j, y − Xβ^{t-1}⟩|; J^t = J^{t-1} ∪ {j^t}
  - Update the estimator: β^t = argmin_β l(β) subject to J(β) ⊆ J^t (full correction)
- Orthogonal: the residual y − Xβ^t is orthogonal to the selected columns (due to full correction)
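
A minimal numpy sketch of the OMP loop just described (my own illustration, not code from the talk): it assumes a fixed number of iterations q and uses np.linalg.lstsq for the full-correction refit.

```python
import numpy as np

def omp(X, y, q):
    """Orthogonal matching pursuit with full correction (least-squares refit)."""
    n, p = X.shape
    support = []                       # J^t
    beta = np.zeros(p)
    for _ in range(q):
        residual = y - X @ beta
        grad = X.T @ residual          # correlations <X_j, y - X beta>
        grad[support] = 0.0            # already-selected columns stay selected
        j = int(np.argmax(np.abs(grad)))
        support.append(j)
        # Full correction: least squares restricted to the current support
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta = np.zeros(p)
        beta[support] = coef
    return beta, support
```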

6 Problem setup
- Key ingredients in greedy algorithms:
  - Choice of loss function: quadratic loss (regression); exponential loss; non-convex loss
  - Selection criterion: select one or multiple features; choose the one with the largest gradient or the largest decrease in function value; with or without a backward step
  - Iterative rule: keep the previous weights or modify them

7 Problem setup
- Objective: min_β l(Xβ; y) subject to ‖β‖_0 ≤ q
- Consider learning problems with a large number of features (p > n)
- Sparse target: a linear combination of a small number of features (q < n)
- This directly solves the sparse learning problem (L_0 regularization)
- Given weak classifiers, boosting can be formulated in this framework

8 Example
- Assumptions: no noise; ‖X_j‖_2 = 1 for each j (unit columns)
- Intuition: relate l(β^t) to l(β^{t-1})
- In regression, l(β) = ‖y − Xβ‖_2^2 and ∇l(β) = 2X^T(Xβ − y)
- A simple analysis (here ‖y‖_{L_1} is not exactly the L_1 norm; definition omitted):
  ‖y − Xβ^t‖_2^2 ≤ (optimality) ‖y − Xβ^{t-1} − αX_{j^t}‖_2^2 = ‖y − Xβ^{t-1}‖_2^2 − 2α⟨y − Xβ^{t-1}, X_{j^t}⟩ + α^2
  (FC) ‖y − Xβ^{t-1}‖_2^2 = ⟨y − Xβ^{t-1}, y⟩
  (optimality of the selection) choose α = ⟨y − Xβ^{t-1}, X_{j^t}⟩, and |⟨y − Xβ^{t-1}, X_{j^t}⟩| ≥ ⟨y − Xβ^{t-1}, y⟩ / ‖y‖_{L_1}

9 Example
- Combining the two displays: ‖y − Xβ^t‖_2^2 ≤ ‖y − Xβ^{t-1}‖_2^2 (1 − ‖y − Xβ^{t-1}‖_2^2 / ‖y‖_{L_1}^2)
- Result by induction (no noise, so y = Xβ̄): ‖Xβ̄ − Xβ^t‖_2^2 = ‖y − Xβ^t‖_2^2 ≤ ‖y‖_{L_1}^2 / (t + 1)
- Drawbacks: What about noise? What about the estimation error? What exactly is ‖y‖_{L_1}^2?
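
The 1/(t+1) rate follows from a standard induction; here is my own filling-in of that step, writing a_t = ‖y − Xβ^t‖_2^2 and M = ‖y‖_{L_1}^2:

```latex
% Claim: a_t \le a_{t-1}\,(1 - a_{t-1}/M) and a_0 \le M imply a_t \le M/(t+1).
% The map u \mapsto u(1 - u/M) is increasing on [0, M/2], so for t \ge 2,
% assuming a_{t-1} \le M/t,
\[
  a_t \;\le\; a_{t-1}\Bigl(1 - \frac{a_{t-1}}{M}\Bigr)
      \;\le\; \frac{M}{t}\Bigl(1 - \frac{1}{t}\Bigr)
      \;=\; \frac{M(t-1)}{t^2}
      \;\le\; \frac{M}{t+1},
\]
% since (t-1)(t+1) = t^2 - 1 \le t^2. The base case t = 1 follows from
% a_1 \le \max_{0 \le u \le M} u(1 - u/M) = M/4 \le M/2.
```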

10 Target of analysis
- Commonly used:
  - Prediction error: ‖Xβ̄ − Xβ^t‖_2^2
  - Statistical error: ‖β̄ − β^t‖_2^2
  - Selection consistency (support recovery): J(β̄) = J(β^t)
- Some others: minimax error bounds; iteration count
- Note: many papers consider the globally optimal solution instead of the true β̄. Most of the time the two can be interchanged (belief: β̄ should approximately minimize l(β)).
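
A tiny numpy helper for these evaluation targets (my own illustration; the names beta_true and beta_hat are hypothetical):

```python
import numpy as np

def evaluation_targets(X, beta_true, beta_hat):
    """Prediction error, statistical error, and exact support recovery indicator."""
    pred_err = float(np.sum((X @ beta_true - X @ beta_hat) ** 2))   # ||X beta_bar - X beta_t||_2^2
    stat_err = float(np.sum((beta_true - beta_hat) ** 2))           # ||beta_bar - beta_t||_2^2
    exact_support = set(np.flatnonzero(beta_true)) == set(np.flatnonzero(beta_hat))
    return pred_err, stat_err, exact_support
```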

11 Regularity condition
- Commonly used and well known:
  - Restricted isometry property (RIP): ρ_-(s)‖β‖_2^2 ≤ ‖Xβ‖_2^2 ≤ ρ_+(s)‖β‖_2^2 for all β ∈ R^p with ‖β‖_0 ≤ s
  - Restricted strong convexity/smoothness (RSC/RSS): ρ_-(s)‖β' − β‖_2^2 ≤ l(β') − l(β) − ⟨∇l(β), β' − β⟩ ≤ ρ_+(s)‖β' − β‖_2^2 for all β', β ∈ R^p with ‖β' − β‖_0 ≤ s

12 Regularity condition
- Figure: values of ρ_+(s) and ρ_-(s) when n = 200 and s increases from 1 to n; X has i.i.d. N(0, 1/n) entries
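
A small numpy sketch in the spirit of that figure (my own illustration): it approximates ρ_+(s) and ρ_-(s) by sampling random supports of size s, which is optimistic since the true constants extremize over all supports.

```python
import numpy as np

def restricted_eigs(X, s, n_trials=200, rng=None):
    """Approximate RIP-type constants rho_+(s), rho_-(s) by sampling supports."""
    rng = np.random.default_rng(rng)
    p = X.shape[1]
    hi, lo = -np.inf, np.inf
    for _ in range(n_trials):
        S = rng.choice(p, size=s, replace=False)
        eigs = np.linalg.eigvalsh(X[:, S].T @ X[:, S])   # spectrum of X_S^T X_S
        hi, lo = max(hi, eigs[-1]), min(lo, eigs[0])
    return lo, hi

n, p = 200, 500
X = np.random.default_rng(0).normal(0.0, 1.0 / np.sqrt(n), size=(n, p))
for s in (5, 20, 50):
    print(s, restricted_eigs(X, s))
```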

13 Regularity condition
- Other conditions:
  - Restricted gradient optimal constant: ⟨∇l(β̄), β⟩ ≤ ε_s(β̄)‖β‖_2 for all ‖β‖_0 ≤ s
    - ε_s(β̄): a measure of noise, on the order of σ√(s log p) for regression
  - Sparse eigenvalue condition (another name for RIP, using only one side): ρ_-(s) = inf { ‖Xβ‖_2^2 / ‖β‖_2^2 : ‖β‖_0 ≤ s }
- We will use ρ_- and ρ_+ with the RSC/RSS definition in this talk

14 Full correction effect
- Full correction step: β̂ = argmin_β l(β) subject to J(β) ⊆ J
- Effect: ∇l(β̂)_J = 0
- Result: l(β̄) − l(β̂) ≥ ρ_-(s)‖β̄ − β̂‖_2^2 + ⟨∇l(β̂)_{J̄\J}, (β̄ − β̂)_{J̄\J}⟩, where s ≥ |J̄ ∪ J|
- Benefit: whenever ⟨∇l(β̂), β̄ − β̂⟩ appears, only ⟨∇l(β̂)_{J̄\J}, (β̄ − β̂)_{J̄\J}⟩ matters; a bound involving |J̄\J| is better than one involving |J̄ ∪ J|
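
A quick numpy check of the full-correction effect stated above (my own illustration with an arbitrary support J and the quadratic loss):

```python
import numpy as np

# After refitting on a support J, the gradient of l(beta) = ||y - X beta||^2
# restricted to J is (numerically) zero.
rng = np.random.default_rng(1)
n, p = 50, 120
X = rng.normal(size=(n, p)) / np.sqrt(n)
y = rng.normal(size=n)

J = [3, 17, 42]                                     # an arbitrary support
coef, *_ = np.linalg.lstsq(X[:, J], y, rcond=None)  # full correction on J
beta = np.zeros(p)
beta[J] = coef

grad = 2 * X.T @ (X @ beta - y)                     # gradient of the quadratic loss
print(np.max(np.abs(grad[J])))                      # ~0: grad restricted to J vanishes
```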

15 Forward effect
- Two common choices (when adding one feature per step):
  - Select j^t = argmin_{η,j} l(β + ηe_j) (line search)
  - Select j^t = argmax_j |∇l(β)_j| (computationally efficient)
- The same result holds for both selections under full correction (due to a crude bound):
  |J̄\J| · {l(β) − min_η l(β + ηe_{j^t})} ≥ (ρ_-(s) / ρ_+(1)) · {l(β) − l(β̄)}
- Comments:
  - Interpretation: transfer l(β) − min_η l(β + ηe_{j^t}) into l(β) − l(β̄), for any comparator β̄
  - Full correction turns |J̄ ∪ J| into |J̄\J|

16 Forward effect
- More details:
  - Select j^t = argmin_{η,j} l(β + ηe_j):
    l(β) − min_η l(β + ηe_{j^t}) ≥ (optimality) l(β) − min_{η, j∈J̄\J} l(β + ηe_j) = l(β) − min_{η, j∈J̄\J} l(β + η(β̄_j − β_j)e_j)
  - Select j^t = argmax_j |∇l(β)_j|:
    l(β) − min_η l(β + ηe_{j^t}) = l(β) − min_η l(β + η·sgn(β̄_{j^t})e_{j^t}) ≥ (optimality) l(β) − min_{η, j∈J̄\J} l(β + η·sgn(β̄_j)e_j)
- Comment: a union bound over J̄\J is used to derive the final result

17 OMP
- A slightly refined analysis using the forward effect:
  l(β) − min_η l(β + ηe_{j^t}) ≥ (ρ_-(s) / (ρ_+(1)|J̄\J|)) · {l(β) − l(β̄)}
- Taking β = β^t, β + ηe_{j^t} in place of β^{t+1}, and β̄ as the comparator, we have
  l(β^t) − l(β^{t+1}) ≥ c_t {l(β^t) − l(β̄)}, where c_t = ρ_-(s) / (ρ_+(1)|J̄\J^t|)
- This can be transformed into
  l(β^{t+1}) − l(β̄) ≤ (1 − c_t){l(β^t) − l(β̄)} ≤ e^{−c_t}{l(β^t) − l(β̄)},
  which gives l(β^t) − l(β̄) ≤ e^{−Σ_s c_s}{l(β^0) − l(β̄)}

18 OMP
- Recall the restricted gradient optimal constant: for ‖β‖_0 ≤ s, ⟨∇l(β̄), β⟩ ≤ ε_s(β̄)‖β‖_2
- Usage: a statistical error bound can be obtained from l(β) − l(β̄):
  ρ_-(s)‖β − β̄‖_2^2 ≤ 2l(β) − 2l(β̄) + ε_s(β̄)^2 / ρ_-(s), where s ≥ |J^t ∪ J̄|
- Key step in the proof:
  l(β) − l(β̄) = [l(β) − l(β̄) − ⟨∇l(β̄), β − β̄⟩] + ⟨∇l(β̄), β − β̄⟩
- Once we control l(β^t) − l(β̄), a bound on ‖β^t − β̄‖_2^2 follows
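
Filling in how the key step yields the stated bound (my own completion, with Δ = β − β̄, ‖Δ‖_0 ≤ s, and the elementary inequality ab ≤ ρ_-(s)a^2/2 + b^2/(2ρ_-(s))):

```latex
\[
  l(\beta) - l(\bar\beta)
    \;\ge\; \rho_-(s)\,\|\Delta\|_2^2 + \langle \nabla l(\bar\beta), \Delta\rangle   % RSC
    \;\ge\; \rho_-(s)\,\|\Delta\|_2^2 - \epsilon_s(\bar\beta)\,\|\Delta\|_2          % gradient constant
    \;\ge\; \frac{\rho_-(s)}{2}\,\|\Delta\|_2^2 - \frac{\epsilon_s(\bar\beta)^2}{2\rho_-(s)},
\]
% and rearranging gives
% \rho_-(s)\|\Delta\|_2^2 \le 2\,l(\beta) - 2\,l(\bar\beta) + \epsilon_s(\bar\beta)^2/\rho_-(s).
```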

19 OMP
- The analysis can be refined further with several techniques:
  - Use a different comparator β̄ (hence a different l(β̄)) in each step, so the bound becomes more precise with another term q_k; q_k comes from bounding l(β) − l(β̄) via 1.5ρ_+(s)‖β̄_{J̄\J}‖_2^2 together with a threshold at the ε_s(β̄)/ρ_+(s) level
  - Give a criterion on t so that c_t can be made a constant, which combines with q_k in an induction
- Final result (with s = |J(β̄)| + t, since we consider β̄ − β^t):
  l(β^t) ≤ l(β̄) + 2.5ε_s(β̄)^2/ρ_-(s) and ‖β^t − β̄‖_2 ≤ 6ε_s(β̄)/ρ_-(s) = O(σ√(|J(β̄)| log p)),
  when t = 4|J̄| (ρ_+(1)/ρ_-(s)) ln(20ρ_+(|J̄|)/ρ_-(s))

20 Termination time
- Intuition: if the decrease at each step is significant, there cannot be too many iterations
- Stop before any over-fitting happens: l(β^t) ≥ l(β̄)
- A routine way to get a bound: the iteration count t controls some parameter in another bound; a restriction on that parameter then gives a bound on the iteration count

21 Forward-backward greedy algorithm (FoBa-obj / FoBa-gdt)
- Process:
  - Forward: select the feature with the largest decrease in function value (FoBa-obj) or the largest gradient entry (FoBa-gdt), then do full correction; stop if δ^t = l(β^t) − l(β^{t+1}) ≤ δ
  - Backward: delete a selected feature if min_j l(β^t − β^t_j e_j) − l(β^t) ≤ δ^t/2, then do full correction
- Intuition of FoBa:
  - The forward step ensures a significant decrease in function value
  - The backward step removes incorrectly selected features at an early stage
  - If the decrease is significant, the gradient must be large; otherwise, we get a bound on the infinity norm of the gradient
  - δ controls both the forward and the backward effect
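
A schematic numpy sketch of the FoBa-obj loop above for the quadratic loss (my own illustration; the helper names refit, loss, foba_obj and the iteration cap are ad hoc, and the stopping details follow the slide only loosely):

```python
import numpy as np

def refit(X, y, support):
    """Full correction: least squares restricted to `support`."""
    beta = np.zeros(X.shape[1])
    if support:
        idx = sorted(support)
        coef, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        beta[idx] = coef
    return beta

def loss(X, y, beta):
    return float(np.sum((y - X @ beta) ** 2))

def foba_obj(X, y, delta, max_iter=50):
    """Forward-backward greedy selection (FoBa-obj flavor) for quadratic loss."""
    support, beta = set(), np.zeros(X.shape[1])
    for _ in range(max_iter):
        cands = [j for j in range(X.shape[1]) if j not in support]
        if not cands:
            break
        # Forward: add the feature whose refit gives the largest decrease in loss.
        gains = [loss(X, y, beta) - loss(X, y, refit(X, y, support | {j})) for j in cands]
        delta_t = max(gains)
        if delta_t <= delta:                  # forward stopping rule
            break
        support.add(cands[int(np.argmax(gains))])
        beta = refit(X, y, support)
        # Backward: drop any feature whose removal raises the loss by at most delta_t / 2.
        while len(support) > 1:
            incs = {j: loss(X, y, refit(X, y, support - {j})) - loss(X, y, beta)
                    for j in support}
            j_worst = min(incs, key=incs.get)
            if incs[j_worst] > delta_t / 2:
                break
            support.remove(j_worst)
            beta = refit(X, y, support)
    return beta, support
```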

22 Backward effect
- Assume β̄ is also the globally optimal solution
- Delete j^t = argmin_j {l(β − β_j e_j) − l(β)} and do full correction
- Good control of β on J\J̄:
  ‖β_{J\J̄}‖_2^2 ≥ (|J\J̄| / ρ_+(1)) · {min_j l(β − β_j e_j) − l(β)}
- Crude usage: ‖β − β̄‖_2 ≥ ‖(β − β̄)_{J\J̄}‖_2 = ‖β_{J\J̄}‖_2
- Full correction turns |J ∪ J̄| into |J\J̄|

23 FoBa
- How to analyze? δ is a tool for bounding different quantities; δ^t is a bridge connecting those bounds
- A simple proof of a bound on the gradient, ‖∇l(β)‖_∞ ≤ 2√(ρ_+(1)δ):
  δ ≥ l(β) − min_{η,j} l(β + ηe_j) ≥ max_{η,j} {−η∇l(β)_j − ρ_+(1)η^2} = max_j ∇l(β)_j^2 / (4ρ_+(1))
- Start by assuming δ is chosen appropriately so that l(β̄) ≤ l(β^t): δ > (4ρ_+(1)/ρ_-^2(s)) ‖∇l(β̄)‖_∞^2

24 General framework
- Strategy I: use an auxiliary variable β̃, the optimal solution on the support J(β̃) = J(β̄) ∪ J(β^t), to help the analysis
- The termination rule comes from l(β^t) ≥ l(β̃). Split l(β^t) − l(β̃) into {l(β^t) − l(β̄)} − {l(β̃) − l(β̄)} and use the full correction result on each part
- Each part yields a term in ‖β^t − β̄‖ and ‖β̃ − β̄‖ respectively
- The forward step gives a bound involving ‖β^t − β̃‖; the backward step gives a bound involving ‖β^t − β̄‖; both through δ^t
- The triangle inequality ‖β^t − β̄‖ ≤ ‖β^t − β̃‖ + ‖β̃ − β̄‖ then relates ‖β̃ − β̄‖ and ‖β^t − β̄‖

25 Termination time for FoBa
- Full correction and RSC/RSS:
  0 ≤ l(β^t) − l(β̃) = {l(β^t) − l(β̄)} − {l(β̃) − l(β̄)} ≤ {ρ_+(s) − ρ_-(s)(k − 1)^2} ‖β^t − β̄‖_2^2,
  where ‖β^t − β̃‖_2 ≤ k ‖β^t − β̄‖_2
- Bound from the forward step: δ^t ≥ (ρ_-^2(s) / (ρ_+(1)|J̄\J^{t-1}|)) ‖β^t − β̃‖_2^2
- Bound from the backward step: ‖β^t − β̄‖_2^2 ≥ (|J^t\J̄| / ρ_+(1)) δ^t
- Combining the two through δ^t gives k^2 = ρ_+^2(1)|J̄\J^{t-1}| / (ρ_-^2(s)|J^t\J̄|)
- Recall |J̄ ∪ J^t| ≤ |J̄| + t, which gives an upper bound on t:
  t ≤ (|J̄| + 1) {(√(ρ_+(s)/ρ_-(s)) + 1) · 2ρ_+(1)/ρ_-(s)}^2

26 General framework
- Strategy II (an easy approach): use simple inequalities with the regularity condition to derive bounds
- Use RSC/RSS to turn l(β^t) − l(β̄) into terms involving the gradient and ‖β̄ − β^t‖_2^2
- Use Hölder's inequality directly on the gradient term: ⟨∇l(β^t), β^t − β̄⟩ ≤ ‖∇l(β^t)‖_∞ ‖β^t − β̄‖_1
- ‖β^t − β̄‖_1 is then converted into a 2-norm bound; ‖∇l(β^t)‖_∞ is bounded by the design of the algorithm (involving δ)

27 FoBa
- Details: since l(β̄) ≤ l(β^t),
  0 ≥ l(β̄) − l(β^t) ≥ ⟨∇l(β^t), β̄ − β^t⟩ + ρ_-(s)‖β̄ − β^t‖_2^2
- Final result:
  ρ_-(s)‖β̄ − β^t‖_2^2 ≤ ⟨∇l(β^t)_{J̄\J^t}, (β^t − β̄)_{J̄\J^t}⟩ ≤ √(ρ_+(1)δ|J̄\J^t|) ‖β̄ − β^t‖_2,
  so ‖β̄ − β^t‖_2^2 ≤ ρ_+(1)δ|J̄\J^t| / ρ_-^2(s),
  where γ = √(2ρ_+(1)δ) / ρ_-(s) and Ĵ = {j ∈ J̄\J^t : |β̄_j| < γ} (used on the next slide)
- Other bounds can be achieved as well: l(β^t) − l(β̄) ≤ ρ_+(1)δ|J̄\J^t| / ρ_-(s); (ρ_-(s)/ρ_+(1))^2 |J^t\J̄| ≤ |J̄\J^t|

28 FoBa
- To make the bound look better (a trick): ρ_+(1)δ|J̄\J^t| / ρ_-^2(s) ≥ ‖β̄_{J̄\J^t}‖_2^2, where γ = √(2ρ_+(1)δ) / ρ_-(s)
- Then
  γ^2 |{j ∈ J̄\J^t : |β̄_j| ≥ γ}| = (2ρ_+(1)δ/ρ_-^2(s)) |{j ∈ J̄\J^t : |β̄_j| ≥ γ}| ≤ ‖β̄_{J̄\J^t}‖_2^2 ≤ (ρ_+(1)δ/ρ_-^2(s)) |J̄\J^t|
- Hence |J̄\J^t| ≥ 2|{j ∈ J̄\J^t : |β̄_j| ≥ γ}| = 2(|J̄\J^t| − |{j ∈ J̄\J^t : |β̄_j| < γ}|), which leads to |J̄\J^t| ≤ 2|{j ∈ J̄\J^t : |β̄_j| < γ}|

29 FoBa
- Strategy III: use random matrix theory and simple inequalities to derive bounds
- ‖Xβ^t − y‖_2^2 = ‖Xβ^t − Xβ̄‖_2^2 − 2⟨ε, Xβ^t − Xβ̄⟩ + ‖ε‖_2^2
- Define ε through l(β̄) = ‖ε‖_2^2; then a generalized version is
  l(β^t) = ‖Xβ^t − Xβ̄‖_2^2 − 2⟨ε, Xβ^t − Xβ̄⟩ + l(β̄)
- ⟨ε, Xβ^t − Xβ̄⟩ can be bounded using random matrix theory
- l(β^t) − l(β̄) can be upper bounded through the forward and backward effects applied to l(β^t) − l(β̃) and l(β̃) − l(β̄), but a more precise analysis with some tricks is involved
- The termination time bound also changes accordingly
- Benefit: no assumption on RSS (ρ_+) is needed

30 FoBa
- Assume ε is sub-Gaussian with parameter σ
- Comparison of the results from Strategy II and Strategy III:
  - Strategy II: ‖β̄ − β^t‖_2^2 ≤ δ ρ_+^2(1) ρ_-^{-1}(s), with δ ≥ ρ_+^2(1) ρ_-^{-1}(s) ‖∇l(β̄)‖_∞^2
  - Strategy III: ‖β̄ − β^t‖_2^2 ≤ ρ_-^{-1}(s) σ^2 |J̄| + δ ρ_-^{-2}(s), with δ ≥ ρ_-^{-1}(s) σ^2 log p
- Comparison with the LASSO:
  - A bit better than the LASSO error bound O(σ^2 |J̄| log p)
  - The LASSO also needs a stronger condition (the irrepresentable condition) for selection consistency

31 Selection consistency
- Target: J̄ = J^t
- Several ways to establish it:
  - In FoBa, max{|J̄\J^t|, |J^t\J̄|} = O(·); we need this to be < 1 with high probability
  - Suppose β̄ is known; build a necessary/sufficient condition and analyze it (e.g. via the KKT conditions)
  - Derive an upper bound for ‖β̄ − β^t‖ and add a β̄-min condition

32 Hard-thresholding pursuit
- HTP procedure: select the q features with the largest absolute values after a gradient descent step, β^t = Θ(β^{t-1} − η∇l(β^{t-1}); q), then do full correction
- The analysis in the paper uses the globally optimal solution (here β̄ denotes the global minimum of l(β) subject to ‖β‖_0 ≤ q)
- The globally optimal solution is easier to analyze; random matrix theory can then be used to derive bounds between the true coefficient and the global optimum
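
A minimal numpy sketch of the HTP iteration above for the quadratic loss (my own illustration; the step size eta and the support-stabilization stopping rule are my choices, not from the slides):

```python
import numpy as np

def htp(X, y, q, eta=1.0, max_iter=100):
    """Hard-thresholding pursuit: gradient step, keep top-q entries, refit."""
    n, p = X.shape
    beta = np.zeros(p)
    support = frozenset()
    for _ in range(max_iter):
        grad = 2 * X.T @ (X @ beta - y)                        # gradient of ||y - X beta||^2
        z = beta - eta * grad                                  # gradient descent step
        new_support = frozenset(np.argsort(np.abs(z))[-q:])    # Theta(.; q): top-q in magnitude
        if new_support == support:                             # support stabilized: stop
            break
        support = new_support
        idx = sorted(support)
        coef, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)   # full correction
        beta = np.zeros(p)
        beta[idx] = coef
    return beta, support
```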

33 Hard-thresholding pursuit
- A naive analysis: assume l(β̄) ≤ l(β^t)
- RSC and Hölder's inequality give:
  l(β̄) − l(β^t) ≥ −‖∇l(β^t)_{J̄\J^t}‖_2 ‖β^t − β̄‖_2 + ρ_-(s)‖β^t − β̄‖_2^2
- If J^t ≠ J̄, then min_{j∈J̄} |β̄_j| ≤ ‖β̄ − β^t‖_2 ≤ √(2q)‖∇l(β^t)‖_∞ / ρ_-(2q)
- Hence min_{j∈J̄} |β̄_j| > √(2q)‖∇l(β^t)‖_∞ / ρ_-(2q) guarantees support recovery

34 Hard-thresholding pursuit
- The complete analysis is more precise, with several lemmas and tricks (details omitted)
- Main ideas:
  - Under certain conditions (involving unspecified constant terms), HTP terminates once β^t reaches β̄
  - HTP does not terminate before β^t reaches β̄
  - The number of iterations is finite

35 Forward effect for HTP
- Key idea: handle the gradient through the regularity condition
  ‖(β' − β)_J‖_2^2 = ⟨β' − β, (β' − β)_J⟩ = ⟨β' − β − η(∇l(β') − ∇l(β)), (β' − β)_J⟩ − η⟨∇l(β), (β' − β)_J⟩  (using ∇l(β')_J = 0)
  ≤ ρ̃‖β' − β‖_2 ‖(β' − β)_J‖_2 + η‖∇l(β)_J‖_2 ‖(β' − β)_J‖_2,
  where ρ̃ = √(1 − 2ηρ_-(s) + η^2 ρ_+^2(s)) is obtained from
  ⟨β' − β, ∇l(β') − ∇l(β)⟩ ≥ ρ_-(s)‖β' − β‖_2^2 and ‖∇l(β') − ∇l(β)‖_2 ≤ ρ_+(s)‖β' − β‖_2
- Result: ‖β' − β‖_2 ≤ ‖β_{J̄\J}‖_2 / (1 − ρ̃) + η‖∇l(β)_{J̄\J}‖_2 / (1 − ρ̃)

36 Some comments
- In general, full correction makes the analysis easier, but not necessarily the practical performance better
- Almost every analysis needs RSC/RSS (or an equivalent condition)
- Induction is still a good analysis tool, but the resulting bounds can become very complicated
- The so-called constant part of a bound can play a significant role in practice, so a method may still fail

37 Literature
- Forward-backward greedy algorithms:
  - Barron, A. R., Cohen, A., Dahmen, W., & DeVore, R. A. (2008). Approximation and learning by greedy algorithms. The Annals of Statistics, 36(1).
  - Liu, J., Ye, J., & Fujimaki, R. (2014). Forward-backward greedy algorithms for general convex smooth functions over a cardinality constraint. In International Conference on Machine Learning.
  - Zhang, T. (2011). Adaptive forward-backward greedy algorithm for learning sparse representations. IEEE Transactions on Information Theory, 57(7).
- Matching pursuit:
  - Needell, D., & Tropp, J. A. (2009). CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3).
  - Zhang, T. (2009). On the consistency of feature selection using greedy least squares regression. Journal of Machine Learning Research, 10.
  - Zhang, T. (2011). Sparse recovery with orthogonal matching pursuit under RIP. IEEE Transactions on Information Theory, 57(9).
- Hard-thresholding pursuit:
  - Bahmani, S., Raj, B., & Boufounos, P. T. (2013). Greedy sparsity-constrained optimization. Journal of Machine Learning Research, 14.
  - Yuan, X., Li, P., & Zhang, T. (2014). Gradient hard thresholding pursuit for sparsity-constrained optimization. In International Conference on Machine Learning.
  - Yuan, X., Li, P., & Zhang, T. (2016). Exact recovery of hard thresholding pursuit. In Advances in Neural Information Processing Systems.
