Algorithms for sparse optimization

1 Algorithms for sparse optimization. Michael P. Friedlander, University of California, Davis. Google, May 19, 2015.

2 Sparse representations. A signal y can be written as y = Hx (Haar basis), y = Dx (DCT basis), or y = [H D]x (Haar/DCT dictionary).

3 Sparsity via a transform. Represent a signal y as a superposition of elementary signal atoms: y = Σⱼ φⱼ xⱼ, i.e., y = Φx. Orthogonal bases yield a unique representation: x = Φᵀy. Overcomplete dictionaries, e.g., A = [Φ₁ Φ₂], have more flexibility, but the representation is not unique. One approach: minimize nnz(x) subject to Ax ≈ y.

4 Efficient transforms. [Figure: percentage of signal energy captured versus percentage of coefficients retained, for the Identity, Fourier, and Wavelet transforms.]

10 Prototypical workflow. Measure a structured signal y via linear measurements: bᵢ = ⟨fᵢ, y⟩, i = 1, ..., m, i.e., b = My. Decode the m measurements: find sparse x with MΦx ≈ b. Reconstruct the signal: ŷ := Φx. Guarantees bound the number of samples m needed to reconstruct the signal with high probability: compressed sensing (e.g., M Gaussian, Φ orthogonal): m = O(k log(n/k)); matrix completion: number of matrix samples O(n^{5/4} r log n).

11 Sparse optimization. Problem: find a sparse solution x: minimize nnz(x) subject to Ax ≈ b. Convex relaxations, e.g.: basis pursuit [Chen et al. (1998); Candès et al. (2006)]: minimize_{x ∈ Rⁿ} ‖x‖₁ subject to Ax ≈ b (m ≪ n); nuclear norm [Fazel (2002); Candès and Recht (2009)]: minimize_{X ∈ R^{n×p}} Σᵢ σᵢ(X) subject to 𝒜X ≈ b (m ≪ np).

12 APPLICATIONS and FORMULATIONS

13 Compressed sensing. Measurements ⟨φ₁, y⟩, ..., ⟨φₘ, y⟩, i.e., Φy with Φ Gaussian; composing Φ with the Haar/DCT dictionary gives b ≈ Ax with x sparse. Decode via: minimize ‖x‖₁ subject to Ax ≈ b.

14 Image deblurring. Observe b = My, where y is the image and M is the blurring operator. Recover the significant coefficients via: minimize ½‖MWx − b‖² + λ‖x‖₁.

λ          ‖Ax − b‖    nnz(x)
2.3e+0     2.6e+2      72
2.3e-2     3.0e+0      1,741
2.3e-4     3.1e-2      28,484

Sparco problem blurrycam [Figueiredo et al. (2007)].

15 Joint sparsity: source localization [Malioutov et al. (2005)]. Sensor measurements B = [b₁ b₂ ... b_p] are related to source arrival angles θ through a phase-shift gain matrix A; the columns x₁, ..., x_p share a common sparsity pattern: minimize ‖X‖₁,₂ subject to ‖B − AX‖_F ≤ σ.

16 Mass spectrometry: separating mixtures. The mixture spectrum (relative intensity versus m/z) is a nonnegative combination of the spectral fingerprints of the component molecules, which form the columns of A; the recovered weights give the percentage of each component (actual vs. recovered, with and without noise): minimize Σᵢ xᵢ subject to Ax ≈ b, x ≥ 0. [Du-Angeletti '06; van den Berg-Friedlander '11]

17 Phase retrieval. Observe a signal x ∈ Cⁿ through magnitudes: b_k = |⟨a_k, x⟩|², k = 1, ..., m. Lift: b_k = |⟨a_k, x⟩|² = ⟨a_k a_k*, xx*⟩ = ⟨a_k a_k*, X⟩ with X = xx*. Decode: find X such that ⟨a_k a_k*, X⟩ ≈ b_k, X ⪰ 0, rank(X) = 1. Convex relaxation: minimize_{X ∈ Hⁿ} tr(X) subject to ⟨a_k a_k*, X⟩ ≈ b_k, X ⪰ 0. [Chai et al. (2011); Candès et al. (2012); Waldspurger et al. (2015)]

18 Matrix completion. Observe random entries of an m×p matrix Y: b_k = Y_{i_k, j_k} = e_{i_k}ᵀ Y e_{j_k} = ⟨e_{i_k} e_{j_k}ᵀ, Y⟩, k = 1, ..., m. Decode: find an m×p matrix X such that ⟨e_{i_k} e_{j_k}ᵀ, X⟩ ≈ b_k with rank(X) minimal. Convex relaxation: minimize_{X ∈ R^{m×p}} Σᵢ σᵢ(X) subject to ⟨e_{i_k} e_{j_k}ᵀ, X⟩ ≈ b_k. [Fazel (2002); Candès and Recht (2009)]

19 CONVEX OPTIMIZATION

20 Convex optimization. minimize_{x ∈ C} f(x), where f : Rⁿ → R is a convex function and C ⊆ Rⁿ is a convex set. Essential objective: f̄ : Rⁿ → R̄ := R ∪ {+∞}, with f̄(x) = f(x) if x ∈ C and +∞ otherwise. There is no loss of generality in restricting ourselves to unconstrained problems of the form minimize_x f(x) + g(Ax), with f : Rⁿ → R̄, g : Rᵐ → R̄, and A an m×n matrix.

21 Convex functions. Definition: f : Rⁿ → R̄ is convex if its epigraph is convex, where epi f = { (x, τ) ∈ Rⁿ × R | f(x) ≤ τ }.

22 Examples. Indicator on a convex set: δ_C(x) = 0 if x ∈ C, +∞ otherwise; epi δ_C = C × R₊. Supremum of convex functions: f := sup_{j ∈ J} f_j with {f_j}_{j ∈ J} convex; epi f = ∩_{j ∈ J} epi f_j. Infimal convolution (epi-addition): (f □ g)(x) = inf_z { f(z) + g(x − z) }; epi(f □ g) = epi f + epi g.

23 Examples. Generic form: minimize_x f(x) + g(Ax). Basis pursuit: min_x ‖x‖₁ s.t. Ax = b, with f = ‖·‖₁, g = δ_{{b}}. Basis pursuit denoising: min_x ‖x‖₁ s.t. ‖Ax − b‖₂ ≤ σ, with f = ‖·‖₁, g = δ_{{z : ‖z − b‖₂ ≤ σ}}. Lasso: min_x ½‖Ax − b‖² s.t. ‖x‖₁ ≤ τ, with f = δ_{τB₁}, g = ½‖· − b‖².

24 DUALITY

25 Duality. minimize_x f(x) + g(Ax). Decouple the objective terms: minimize_{x,z} f(x) + g(z) subject to Ax = z. Lagrangian function: L(x, z, y) := f(x) + g(z) + ⟨y, Ax − z⟩ encapsulates all the information about the problem: sup_y L(x, z, y) = f(x) + g(z) if Ax = z, +∞ otherwise; p* = inf_{x,z} sup_y L(x, z, y).

26 Primal-dual pair: p* = inf_{x,z} sup_y L(x, z, y) (primal problem); d* = sup_y inf_{x,z} L(x, z, y) (dual problem). The following always holds: p* ≥ d* (weak duality). Under some constraint qualification, p* = d* (strong duality).

27 Fenchel dual. Dual problem: sup_y inf_{x,z} L(x, z, y) = sup_y inf_{x,z} { f(x) + g(z) + ⟨y, Ax − z⟩ } = sup_y { −sup_x [ ⟨−Aᵀy, x⟩ − f(x) ] − sup_z [ ⟨y, z⟩ − g(z) ] } = sup_y { −f*(−Aᵀy) − g*(y) }. Conjugate function: h*(u) = sup_x { ⟨u, x⟩ − h(x) }. Fenchel-dual pair (after the change of variable y ↦ −y): minimize_x f(x) + g(Ax) and minimize_y f*(Aᵀy) + g*(−y).
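
As a worked instance of this Fenchel pair (not on the slide), take basis pursuit: f = ‖·‖₁ and g = δ_{{b}}. The conjugates are f*(u) = δ_{B∞}(u), the indicator of the unit ∞-norm ball, and g*(y) = ⟨b, y⟩, so minimize_y f*(Aᵀy) + g*(−y) becomes the familiar dual

    maximize_y ⟨b, y⟩ subject to ‖Aᵀy‖∞ ≤ 1.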

28 PROXIMAL METHODS

29 Proximal algorithm. minimize_x f(x) (the case g ≡ 0). Proximal operator: prox_{αf}(x) := argmin_z { f(z) + (1/2α)‖z − x‖² } (the minimum value defines the envelope f □ (1/2α)‖·‖²). Optimality: every optimal solution x* satisfies x* = prox_{αf}(x*) for α > 0. Proximal iteration: x⁺ = prox_{αf}(x). [Martinet (1970)]

30 Proximal iteration: x_{k+1} = prox_{α_k f}(x_k) := argmin_z { f(z) + (1/2α_k)‖z − x_k‖² }. Descent: because x_{k+1} is the minimizing argument, f(x_{k+1}) + (1/2α_k)‖x_{k+1} − x_k‖² ≤ f(x) + (1/2α_k)‖x − x_k‖² for all x; setting x = x_k gives f(x_{k+1}) ≤ f(x_k) − (1/2α_k)‖x_{k+1} − x_k‖². Convergence: finite convergence for f polyhedral [Polyak and Tretyakov (1973)]; subproblems may be solved approximately [Rockafellar (1976)]; f(x_k) − p* = O(1/k) [Güler (1991)]; f(x_k) − p* = O(1/k²) via an accelerated variant [Güler (1992)].

31 Moreau-Yosida regularization. M-Y envelope: f_α(x) = min_z { f(z) + (1/2α)‖z − x‖² }; proximal map: prox_{αf}(x) = argmin_z { f(z) + (1/2α)‖z − x‖² }. Useful properties: differentiable: ∇f_α(x) = (1/α)[x − prox_{αf}(x)]; preserves minima: argmin f_α = argmin f; decomposition: x = prox_f(x) + prox_{f*}(x). Examples: vector 1-norm: prox_{α‖·‖₁}(x) = sgn(x) ⊙ max{|x| − α, 0}; Schatten 1-norm: prox_{α‖·‖₁}(X) = U Σ̄ Vᵀ, with Σ̄ = Diag[ prox_{α‖·‖₁}(σ(X)) ].
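
A minimal numpy sketch (not from the slides; the function names are mine) of the vector 1-norm example: the prox is componentwise soft-thresholding, and the envelope gradient follows the formula ∇f_α(x) = (1/α)[x − prox_{αf}(x)].

    import numpy as np

    def prox_l1(x, alpha):
        # prox of alpha*||.||_1: componentwise soft-thresholding at level alpha
        return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

    def grad_envelope_l1(x, alpha):
        # gradient of the Moreau-Yosida envelope of ||.||_1 with parameter alpha
        return (x - prox_l1(x, alpha)) / alpha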

32 Augmented Lagrangian (AL) algorithm. Rockafellar ('73) connects AL to the proximal method on the dual. minimize f(x) subject to Ax = b, i.e., f(x) + δ_{{b}}(Ax). Lagrangian: L(x, y) = f(x) + yᵀ(Ax − b). Augmented Lagrangian: L_ρ(x, y) = f(x) + yᵀ(Ax − b) + (ρ/2)‖Ax − b‖². AL algorithm: x_{k+1} = argmin_x L_ρ(x, y_k) [may be inexact]; y_{k+1} = y_k + ρ(Ax_{k+1} − b) [multiplier update]. For linear constraints, the AL algorithm ≡ Bregman iteration; used by L1-Bregman for f(x) = ‖x‖₁ [Yin et al. (2008)]; proposed by Hestenes/Powell ('69) for smooth nonlinear optimization.
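
A schematic numpy sketch of the multiplier-update loop (not from the slides; it assumes a user-supplied routine argmin_x_of_AL that approximately minimizes L_ρ(·, y_k) over x, since that inner solve is problem-specific):

    import numpy as np

    def augmented_lagrangian(argmin_x_of_AL, A, b, rho=1.0, iters=50):
        # argmin_x_of_AL(y) should (approximately) minimize over x:
        #   L_rho(x, y) = f(x) + y @ (A @ x - b) + (rho/2) * ||A @ x - b||^2
        y = np.zeros(A.shape[0])
        x = None
        for _ in range(iters):
            x = argmin_x_of_AL(y)          # primal step (may be inexact)
            y = y + rho * (A @ x - b)      # multiplier update
        return x, y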

33 SPLITTING METHODS projected gradient, proximal-gradient, iterative soft-thresholding, alternating projections

34 Proximal gradient. minimize_x f(x) + g(x), with f smooth. Algorithm: x⁺ = prox_{αg}(x − α∇f(x)), with α ∈ (µ, 2/L), µ > 0, where ∇f is Lipschitz: ‖∇f(x) − ∇f(y)‖ ≤ L‖x − y‖. Convergence: f(x_k) − p* = O(1/k) with constant step size (α ≡ 1/L); f(x_k) − p* = O(1/k²) for an accelerated variant [Beck and Teboulle (2009)]; f(x_k) − p* = O(γᵏ), γ < 1, under stronger assumptions.
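
A minimal numpy sketch of the iteration (not from the slides; grad_f and prox_g are assumed callables supplied by the user):

    import numpy as np

    def proximal_gradient(grad_f, prox_g, x0, alpha, iters=500):
        # x+ = prox_{alpha*g}( x - alpha * grad f(x) ), with alpha roughly 1/L
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(iters):
            x = prox_g(x - alpha * grad_f(x), alpha)
        return x

With grad_f(x) = Aᵀ(Ax − b) and prox_g the soft-thresholding operator, this reduces to the iterative soft-thresholding method a few slides ahead.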

35 Interpretations. Majorization-minimization: the Lipschitz assumption on ∇f implies f(z) ≤ f(x) + ⟨∇f(x), z − x⟩ + (L/2)‖z − x‖² for all x, z. The proximal-gradient iteration minimizes the majorant: x⁺ = prox_{αg}(x − α∇f(x)) = argmin_z { g(z) + f(x) + ⟨∇f(x), z − x⟩ + (1/2α)‖z − x‖² }, α ∈ (0, 1/L]. Forward-backward splitting: x⁺ = prox_{αg}(x − α∇f(x)) = (I + α∂g)⁻¹ (I − α∇f)(x), where the gradient step (I − α∇f) is the forward step and the resolvent (I + α∂g)⁻¹ is the backward step.

36 Projected gradient. minimize f(x) subject to x ∈ C: x_{k+1} = proj_C(x_k − α_k ∇f(x_k)). LASSO: minimize_{‖x‖₁ ≤ τ} ½‖Ax − b‖², via f(x) = ½‖Ax − b‖², g(x) = δ_{τB₁}(x); proj_{τB₁}(·) can be computed in O(n) (randomized) [Duchi et al. (2008)]; used by SPGL1 for Lasso subproblems [van den Berg and Friedlander (2008)]. BPDN (Lagrangian form): minimize_{x ∈ Rⁿ} ½‖Ax − b‖² + λ‖x‖₁, via minimize_{x̄ ∈ R²ⁿ₊} ½‖Āx̄ − b‖² + λeᵀx̄, where proj_{≥0}(·) can be computed in O(n); GPSR uses Barzilai-Borwein steps α_k [Figueiredo-Nowak-Wright '07].
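
A sort-based O(n log n) sketch of the projection onto the ℓ₁ ball {x : ‖x‖₁ ≤ τ} (the slide cites an O(n) randomized method of Duchi et al.; this simpler variant is not from the slides):

    import numpy as np

    def proj_l1_ball(x, tau):
        # Euclidean projection onto { z : ||z||_1 <= tau }, tau > 0
        if np.sum(np.abs(x)) <= tau:
            return x.copy()
        u = np.sort(np.abs(x))[::-1]                      # magnitudes, descending
        css = np.cumsum(u)
        j = np.arange(1, len(u) + 1)
        rho = np.max(np.nonzero(u * j > css - tau)[0])    # last index where the threshold is active
        theta = (css[rho] - tau) / (rho + 1.0)
        return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

Projected gradient for the Lasso then iterates x ← proj_l1_ball(x − α Aᵀ(Ax − b), τ).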

37 Iterative soft-thresholding (IST). minimize_x ½‖Ax − b‖² + λ‖x‖₁: x⁺ = prox_{αλ‖·‖₁}(x − αAᵀr) with r := Ax − b, i.e., x⁺ = S_{αλ}(x − αAᵀr), where α ∈ (0, 2/‖A‖²) and S_τ(x) = sgn(x) max{|x| − τ, 0}. [a.k.a. iterative shrinkage, iterative Landweber] Derived from various viewpoints: expectation-maximization [Figueiredo & Nowak '03]; surrogate functionals [Daubechies et al. '04]; forward-backward splitting [Combettes & Wajs '05]; FISTA (accelerated variant) [Beck-Teboulle '09].
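
A compact numpy sketch of the IST loop as stated (not from the slides; the fixed step 1/‖A‖² lies in the admissible range):

    import numpy as np

    def ist(A, b, lam, iters=1000):
        # iterative soft-thresholding for  0.5*||Ax - b||^2 + lam*||x||_1
        alpha = 1.0 / np.linalg.norm(A, 2) ** 2        # step in (0, 2/||A||^2)
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            z = x - alpha * (A.T @ (A @ x - b))        # gradient step on the smooth term
            x = np.sign(z) * np.maximum(np.abs(z) - alpha * lam, 0.0)   # S_{alpha*lam}
        return x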

38 Low-rank approximation via the nuclear norm: minimize ½‖𝒜X − b‖² + λ‖X‖_n, with ‖X‖_n = Σⱼ σⱼ(X). Singular-value soft-thresholding algorithm: X⁺ = S_{αλ}(X − α𝒜*(r)) with r := 𝒜X − b, where S_λ(Z) = U diag(σ̄)Vᵀ and σ̄ = S_λ(σ(Z)). Computing S_λ: no need to compute the full SVD of Z, only the leading singular values/vectors; Monte-Carlo approach [Ma-Goldfarb-Chen '08]; Lanczos bidiagonalization (via PROPACK) [Cai-Candès-Shen '08; Jain et al. '10].
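
A dense-SVD sketch of the singular-value soft-thresholding operator S_λ (not from the slides; at scale one would compute only the leading singular triples, as noted above):

    import numpy as np

    def svt(Z, lam):
        # S_lam(Z) = U diag(max(sigma - lam, 0)) V^T
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        return (U * np.maximum(s - lam, 0.0)) @ Vt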

39 Typical behavior. minimize_{X ∈ R^{n₁×n₂}} ‖X‖₁ subject to X_Ω ≈ b; recover a matrix X* with rank(X*) = 4. [Figure: number of singular triples computed at each iteration.]

40 Backward-backward splitting. Approximate one of the functions via its (smooth) Moreau envelope: minimize_x f_α(x) + g(x) (both f and g nonsmooth), with f_α = f □ (1/2α)‖·‖² and ∇f_α(x) = (1/α)[x − prox_{αf}(x)]. Proximal gradient: x⁺ = prox_{αg}(x − α∇f_α(x)) = prox_{αg}(prox_{αf}(x)). Projection onto convex sets (POCS): with f = δ_C and g = δ_D, x⁺ = proj_D(proj_C(x)), which solves minimize_{x,z} ½‖z − x‖² subject to x ∈ D, z ∈ C.
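
A generic sketch of the backward-backward/POCS iteration (not from the slides; proj_C and proj_D are assumed callables):

    import numpy as np

    def pocs(proj_C, proj_D, x0, iters=200):
        # alternating projections: x+ = proj_D(proj_C(x))
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(iters):
            x = proj_D(proj_C(x))
        return x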

41 Alternating direction method of multipliers (ADMM). minimize_x f(x) + g(x). ADMM algorithm: x⁺ = prox_{αf}(z − u); z⁺ = prox_{αg}(x⁺ + u); u⁺ = u + x⁺ − z⁺. Used by YALL1 [Zhang et al. (2011)]. The ADMM algorithm ≡ Douglas-Rachford splitting, and ≡ split Bregman for linear constraints; see Esser's UCLA PhD thesis ('10) and the survey by Boyd et al. ('11).
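
A generic numpy sketch of the three-step iteration exactly as written (not from the slides; prox_f and prox_g are assumed callables taking a point and the scaling α):

    import numpy as np

    def admm(prox_f, prox_g, x0, alpha, iters=500):
        # ADMM / Douglas-Rachford iteration for  minimize f(x) + g(x)
        x = np.asarray(x0, dtype=float).copy()
        z = x.copy()
        u = np.zeros_like(x)
        for _ in range(iters):
            x = prox_f(z - u, alpha)       # x+ = prox_{alpha f}(z - u)
            z = prox_g(x + u, alpha)       # z+ = prox_{alpha g}(x+ + u)
            u = u + x - z                  # scaled dual update
        return z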

42 PROXIMAL SMOOTHING

43 Primal smoothing. minimize_x f(x) + g(Ax). Partial smoothing via the Moreau envelope: minimize_x f_µ(x) + g(Ax), with f_µ = f □ (1/2µ)‖·‖². Example (basis pursuit): f = ‖·‖₁, g = δ_{{b}}, i.e., minimize_x ‖x‖₁ subject to Ax = b. The smoothed function is the Huber function with parameter µ, applied componentwise: f_µ(x) = x²/2µ if |x| ≤ µ, and |x| − µ/2 otherwise; as µ → 0, x_µ → x*.
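
A one-line numpy sketch of the componentwise Huber smoothing of the 1-norm (not from the slides):

    import numpy as np

    def huber_l1(x, mu):
        # Moreau envelope of |.| with parameter mu, summed over components:
        # quadratic t^2/(2*mu) for |t| <= mu, linear |t| - mu/2 otherwise
        a = np.abs(x)
        return np.sum(np.where(a <= mu, a ** 2 / (2.0 * mu), a - mu / 2.0))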

44 Dual smoothing. Regularize the primal problem: minimize_x f(x) + (µ/2)‖x‖² + g(Ax). Dualize: regularizing f smooths the conjugate: (f + (µ/2)‖·‖²)* = f* □ (1/2µ)‖·‖² = (f*)_µ, giving minimize_y (f*)_µ(Aᵀy) + g*(−y). Example (basis pursuit): f = ‖·‖₁, g = δ_{{b}}, i.e., min_x ‖x‖₁ + (µ/2)‖x‖² subject to Ax = b; (f*)_µ = (1/2µ) dist²_{B∞}, so the dual is max_y ⟨b, y⟩ − (1/2µ) dist²_{B∞}(Aᵀy). Exact regularization: x_µ = x* for all µ ∈ [0, µ̄), some µ̄ > 0 [Mangasarian and Meyer (1979); Friedlander and Tseng (2007)]. [Nesterov (2005); Beck and Teboulle (2012)]; TFOCS [Becker et al. (2011)].
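
A small numpy sketch (not from the slides) of the smoothed basis-pursuit dual objective, using the fact that the distance to the unit ∞-norm ball is computed by clipping:

    import numpy as np

    def smoothed_bp_dual(y, A, b, mu):
        # <b, y> - (1/(2*mu)) * dist^2_{B_inf}(A^T y), to be maximized over y
        w = A.T @ y
        dist2 = np.sum((w - np.clip(w, -1.0, 1.0)) ** 2)
        return b @ y - dist2 / (2.0 * mu)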

45 CONCLUSION

46 Conclusion. Much left unsaid, e.g.: optimal first-order methods [Nesterov (1988, 2005)]; block coordinate descent [Sardy et al. (2004)]; active-set methods (e.g., LARS & Homotopy) [Osborne et al. (2000)]. My own work: avoiding proximal operators [Friedlander et al. (2014) (SIOPT)]; greedy coordinate descent [Nutini et al. (2015) (ICML)]. web: mpf

47 REFERENCES

48 References I
A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 2009.
A. Beck and M. Teboulle. Smoothing and first order methods: a unified framework. SIAM J. Optim., 22(2), 2012.
S. Becker, E. Candès, and M. Grant. Templates for convex cone problems with applications to sparse signal recovery. Math. Program. Comp., 3, 2011.
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1-122, 2011.
J.-F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM J. Optim., 20(4).
E. J. Candès and B. Recht. Exact matrix completion via convex optimization. Found. Comput. Math., 9, December 2009.

49 References II
E. J. Candès, J. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math., 59(8), 2006.
E. J. Candès, T. Strohmer, and V. Voroninski. PhaseLift: exact and stable signal recovery from magnitude measurements via convex programming. Commun. Pur. Appl. Ana., 2012.
A. Chai, M. Moscoso, and G. Papanicolaou. Array imaging using intensity-only measurements. Inverse Problems, 27(1):015005, 2011.
S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM J. Sci. Comput., 20(1):33-61, 1998.
P. L. Combettes and V. R. Wajs. Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul., 4(4), 2005.
I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math., 57, 2004.
J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the l1-ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, 2008.

50 References III
J. E. Esser. Primal Dual Algorithms for Convex Models and Applications to Image Restoration, Registration and Nonlocal Inpainting. PhD thesis, University of California, Los Angeles, 2010.
M. Fazel. Matrix rank minimization with applications. PhD thesis, Elec. Eng. Dept., Stanford University, 2002.
M. Figueiredo, R. Nowak, and S. J. Wright. Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE J. Sel. Top. Signal Process., 1(4), 2007.
M. A. T. Figueiredo and R. D. Nowak. An EM algorithm for wavelet-based image restoration. IEEE Trans. Image Process., 12(8), August 2003.
M. P. Friedlander and P. Tseng. Exact regularization of convex programs. SIAM J. Optim., 18(4), 2007.
M. P. Friedlander, I. Macêdo, and T. K. Pong. Gauge optimization and duality. SIAM J. Optim., 24(4), 2014.
O. Güler. On the convergence of the proximal point algorithm for convex minimization. SIAM Journal on Control and Optimization, 29(2), 1991.

51 References IV
O. Güler. New proximal point algorithms for convex minimization. SIAM Journal on Optimization, 2(4), 1992.
M. R. Hestenes. Multiplier and gradient methods. J. Optim. Theory Appl., 4(5), 1969.
S. Ma, D. Goldfarb, and L. Chen. Fixed point and Bregman iterative methods for matrix rank minimization. Math. Program., 128.
D. Malioutov, M. Çetin, and A. S. Willsky. A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Sig. Proc., 53(8), August 2005.
O. L. Mangasarian and R. R. Meyer. Nonlinear perturbation of linear programs. SIAM J. Control Optim., 17(6), November 1979.
B. Martinet. Régularisation d'inéquations variationnelles par approximations successives. Rev. Française Informat. Recherche Opérationnelle, 4 (Sér. R-3):154-158, 1970.
Y. Nesterov. On an approach to the construction of optimal methods of minimization of smooth convex functions. Ekonom. i Mat. Metody, 24, 1988.

52 References V
Y. Nesterov. Smooth minimization of non-smooth functions. Math. Program., 103, 2005.
P. Netrapalli, P. Jain, and S. Sanghavi. Phase retrieval using alternating minimization. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors, Advances in Neural Information Processing Systems 26. Curran Associates, Inc., 2013.
J. Nutini, I. Laradji, M. Schmidt, and M. Friedlander. Coordinate descent converges faster with the Gauss-Southwell rule than random selection. In Inter. Conf. Mach. Learning, 2015.
M. R. Osborne, B. Presnell, and B. A. Turlach. A new approach to variable selection in least squares problems. IMA J. Numer. Anal., 20(3), 2000.
B. Polyak and N. Tretyakov. The method of penalty bounds for constrained extremum problems. Zh. Vych. Mat. i Mat. Fiz., 13:34-46, 1973.
M. J. D. Powell. A method for nonlinear constraints in minimization problems. In R. Fletcher, editor, Optimization, chapter 19. Academic Press, London and New York, 1969.

53 References VI
R. T. Rockafellar. Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Mathematics of Operations Research, 1(2):97-116, 1976.
S. Sardy, A. Antoniadis, and P. Tseng. Automatic smoothing with wavelets for a wide class of distributions. J. Comput. Graph. Stat., 13, 2004.
E. van den Berg and M. P. Friedlander. Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput., 31(2), 2008.
I. Waldspurger, A. d'Aspremont, and S. Mallat. Phase recovery, MaxCut and complex semidefinite programming. Math. Program., 149(1-2):47-81, 2015.
W. Yin, S. Osher, D. Goldfarb, and J. Darbon. Bregman iterative algorithms for l1 minimization with applications to compressed sensing. SIAM J. Imag. Sci., 1, 2008.
Y. Zhang, W. Deng, J. Yang, and W. Yin. YALL1, 2011.
