Mathematical methods for Image Processing
1 Mathematical methods for Image Processing
François Malgouyres, Institut de Mathématiques de Toulouse, France
Invitation by Jidesh P., NITK Surathkal; funding: Global Initiative on Academic Network. Oct.
2 Plan
1. Non-smooth optimization: the proximal gradient algorithm
3 The non-smooth problem
We consider the minimization problem
$w^\star \in \operatorname{Argmin}_{w \in W} E(w)$, with $E(w) = \tilde E(w) + R(w)$ for all $w \in W$,
where
W is a Euclidean space,
$\tilde E$ is convex, coercive, differentiable with a Lipschitz gradient,
R is lower semi-continuous, proper, convex and simple.

Definition (proximal operator and simple). We say R is simple if there is a simple way to compute
$\operatorname{prox}^t_R(w') = \operatorname{Argmin}_{w \in W} \frac{t}{2}\|w - w'\|_2^2 + R(w).$
(E.g. it is given by a closed-form expression or computed by a fast algorithm.)
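To make "simple" concrete, here is a small numerical sketch (ours, not from the slides): it compares a generic prox, computed by a black-box solver, with the closed-form prox of the $\ell_1$ norm derived in Example 2 below. The solver choice and all names are our own, and the derivative-free solve is only approximate.

```python
import numpy as np
from scipy.optimize import minimize

t = 2.0
w_prime = np.array([1.5, -0.2, 0.7])

# Generic (slow) prox: solve Argmin_w (t/2)*||w - w'||_2^2 + R(w) numerically.
R = lambda w: np.abs(w).sum()                      # here R = ||.||_1
obj = lambda w: 0.5 * t * np.sum((w - w_prime) ** 2) + R(w)
num = minimize(obj, x0=np.zeros(3), method="Nelder-Mead",
               options={"xatol": 1e-9, "fatol": 1e-12}).x

# Closed-form prox: soft thresholding at 1/t (see Example 2) -- R is "simple".
closed = np.sign(w_prime) * np.maximum(np.abs(w_prime) - 1.0 / t, 0.0)

print(num.round(4), closed)                        # both ~ [1.0, 0.0, 0.2]
```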
4 Example 1: R is a characteristic function
Let $C \subset W$ be a non-empty closed convex set and
$R(w) = \chi_C(w) = \begin{cases} 0 & \text{if } w \in C, \\ +\infty & \text{otherwise.} \end{cases}$
Then
$\operatorname{prox}^t_R(w') = \operatorname{Argmin}_{w \in W} \frac{t}{2}\|w - w'\|_2^2 + R(w) = \operatorname{Argmin}_{w \in C} \|w - w'\|_2^2$:
$\operatorname{prox}^t_R(w')$ is the projection onto C. It is usually easy to compute when (for instance) C is an $\ell_1$, $\ell_2$ or $\ell_\infty$ ball, or an affine space.
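The projections listed above all admit short closed forms; a minimal sketch (ours), assuming numpy, with our own function names:

```python
import numpy as np

def proj_l2_ball(w, radius=1.0):
    """Projection onto the l2 ball of given radius, centered at 0."""
    n = np.linalg.norm(w)
    return w if n <= radius else (radius / n) * w

def proj_linf_ball(w, radius=1.0):
    """Projection onto the l-infinity ball: clip each coordinate."""
    return np.clip(w, -radius, radius)

def proj_affine(w, A, b):
    """Projection onto the affine space {w : A w = b}, A with full row rank:
    w - A^T (A A^T)^{-1} (A w - b)."""
    return w - A.T @ np.linalg.solve(A @ A.T, A @ w - b)

w = np.array([3.0, -4.0])
print(proj_l2_ball(w))      # [ 0.6 -0.8]
print(proj_linf_ball(w))    # [ 1. -1.]
```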
5 Example 2: R is $\|\cdot\|_1$
If $R(w) = \|w\|_1 = \sum_i |w_i|$, we have
$\operatorname{prox}_R(w') = \operatorname{Argmin}_{w \in W} \frac{1}{2}\|w - w'\|_2^2 + \|w\|_1 = \operatorname{Argmin}_{w \in W} \sum_i \frac{1}{2}(w_i - w'_i)^2 + |w_i|.$
The i-th entry of $\operatorname{prox}_R(w')$ is
$\operatorname{prox}_R(w')_i = \operatorname{Argmin}_{t \in \mathbb{R}} \frac{1}{2}(t - w'_i)^2 + |t|.$
6 Example 2: R is $\|\cdot\|_1$
The i-th entry of $\operatorname{prox}_R(w')$ is
$\operatorname{prox}_R(w')_i = \operatorname{Argmin}_{t \in \mathbb{R}} \frac{1}{2}(t - w'_i)^2 + |t|.$
Proof: Let $v_i = \operatorname{Argmin}_{t \in \mathbb{R}} \frac{1}{2}(t - w'_i)^2 + |t|$ and $v = (v_i)_i$. Since for every i and every $w \in W$
$\frac{1}{2}(v_i - w'_i)^2 + |v_i| \le \frac{1}{2}(w_i - w'_i)^2 + |w_i|,$
summing over i we get
$\frac{1}{2}\|v - w'\|_2^2 + \|v\|_1 \le \frac{1}{2}\|w - w'\|_2^2 + \|w\|_1.$
Therefore $\operatorname{prox}_R(w') = v$.
7-9 Example 2: R is $\|\cdot\|_1$
$\operatorname{prox}_R(w')_i$ is obtained by a soft thresholding:
$\operatorname{prox}_R(w')_i = \begin{cases} w'_i - 1 & \text{if } w'_i > 1, \\ 0 & \text{if } -1 \le w'_i \le 1, \\ w'_i + 1 & \text{if } w'_i < -1. \end{cases}$
Proof: Recall that $\operatorname{prox}_R(w')_i = \operatorname{Argmin}_{t \in \mathbb{R}} \frac{1}{2}(t - w'_i)^2 + |t|$ and distinguish three cases.
Case 1: $\operatorname{prox}_R(w')_i > 0$. Optimality reads $(\operatorname{prox}_R(w')_i - w'_i) + 1 = 0$, i.e. $\operatorname{prox}_R(w')_i = w'_i - 1$, which is positive iff $w'_i > 1$.
Case 2: $\operatorname{prox}_R(w')_i = 0$. Optimality reads $(\operatorname{prox}_R(w')_i - w'_i) \in [-1, 1]$, i.e. $-1 \le w'_i \le 1$.
Case 3: $\operatorname{prox}_R(w')_i < 0$. Optimality reads $(\operatorname{prox}_R(w')_i - w'_i) - 1 = 0$, i.e. $\operatorname{prox}_R(w')_i = w'_i + 1$, which is negative iff $w'_i < -1$.
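The soft-thresholding formula just derived is one line of vectorized code; the sketch below (ours) also brute-forces the one-dimensional argmin on a grid to confirm each of the three cases.

```python
import numpy as np

def soft_threshold(w):
    """Entrywise soft thresholding at 1: the closed form derived above."""
    return np.sign(w) * np.maximum(np.abs(w) - 1.0, 0.0)

# Brute-force check of the three cases on a grid of candidate minimizers t.
ts = np.linspace(-5, 5, 100001)
for wi in (2.5, 0.3, -1.4):                        # one sample per case
    brute = ts[np.argmin(0.5 * (ts - wi) ** 2 + np.abs(ts))]
    print(wi, brute, soft_threshold(np.array([wi]))[0])
# -> minimizers 1.5, 0.0 and -0.4, matching the closed form.
```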
10 Example 3: the smooth case
If R is continuously differentiable,
$\operatorname{prox}^t_R(w') = \operatorname{Argmin}_{w \in W} \frac{t}{2}\|w - w'\|_2^2 + R(w)$
satisfies the optimality condition
$t\left(\operatorname{prox}^t_R(w') - w'\right) + \nabla R(\operatorname{prox}^t_R(w')) = 0,$
therefore
$\operatorname{prox}^t_R(w') = w' - \frac{1}{t}\nabla R(\operatorname{prox}^t_R(w')).$
$\operatorname{prox}^t_R(w')$ is an implicit gradient step with step-size $\frac{1}{t}$.
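The implicit-gradient identity can be checked numerically whenever R has a closed-form prox. A small sketch (ours), assuming the smooth choice $R(w) = \frac{1}{2}\|w\|_2^2$, for which $\operatorname{prox}^t_R(w') = \frac{t}{1+t}w'$:

```python
import numpy as np

t = 3.0
w_prime = np.array([2.0, -1.0])

p = t / (1.0 + t) * w_prime              # prox^t_R(w') for R = (1/2)||.||^2
grad_R = lambda w: w                     # gradient of this R
rhs = w_prime - (1.0 / t) * grad_R(p)    # implicit gradient step of size 1/t
print(np.allclose(p, rhs))               # True: prox is the implicit step
```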
11 The proximal gradient algorithm
Also known as the forward-backward algorithm, implicit-explicit scheme, ISTA, PALM...

Algorithm 2 Proximal gradient algorithm
Input: what is needed to compute $\tilde E$, $\nabla\tilde E$ and $\operatorname{prox}^t_R(\cdot)$
Output: approximation of a minimizer of E: w
Initialize w
While not converged do
  Compute $d = \nabla\tilde E(w)$
  Compute a step-size $t > 0$
  Update: $w \leftarrow \operatorname{prox}^{1/t}_R(w - t\,d)$
End while
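As a concrete instance, here is a minimal runnable sketch (ours, not from the slides) of the loop above applied to the lasso energy $E(w) = \frac{1}{2}\|Aw - b\|_2^2 + \lambda\|w\|_1$, with the fixed step-size $t = 1/L$ used in the convergence theorem that follows. All names (A, b, lam, grad_E, prox_R) are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.5

grad_E = lambda w: A.T @ (A @ w - b)       # gradient of the smooth part
L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of grad_E
# prox^{1/t} of lam*||.||_1: soft thresholding at t*lam.
prox_R = lambda w, t: np.sign(w) * np.maximum(np.abs(w) - t * lam, 0.0)

w = np.zeros(5)
t = 1.0 / L                                # step-size t <= 1/L
for _ in range(500):
    d = grad_E(w)                          # forward (explicit) step...
    w = prox_R(w - t * d, t)               # ...then backward (prox) step
print(w)
```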
12 Convergence of the Proximal Gradient Algorithm
Theorem (Convergence of the Proximal Gradient algorithm). We consider $E = \tilde E + R$,
where $\tilde E : W \to \mathbb{R}$ is convex, coercive, differentiable, with an L-Lipschitz gradient (a), $L > 0$;
where R is lower semi-continuous, proper, convex and coercive.
The sequence $(w_k)_{k \in \mathbb{N}}$ generated by the proximal gradient algorithm with a step-size $t \le \frac{1}{L}$ is such that
$(E(w_k))_{k \in \mathbb{N}}$ is non-increasing;
for any minimizer $w^\star$ of E,
$E(w_k) - E(w^\star) \le \frac{L}{2k}\|w_0 - w^\star\|_2^2.$

(a) i.e. $\forall w, w' \in W$, $\|\nabla\tilde E(w') - \nabla\tilde E(w)\|_2 \le L\|w' - w\|_2$.
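Both conclusions of the theorem can be checked empirically. A quick sketch (ours), reusing the lasso instance from the previous block: it verifies that the energies decrease monotonically and stay below the $\frac{L}{2k}$ bound, using a long run as a proxy for a true minimizer $w^\star$.

```python
import numpy as np

rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.5
E = lambda w: 0.5 * np.sum((A @ w - b) ** 2) + lam * np.abs(w).sum()
grad = lambda w: A.T @ (A @ w - b)
L = np.linalg.norm(A, 2) ** 2
prox = lambda w, t: np.sign(w) * np.maximum(np.abs(w) - t * lam, 0.0)

def run(w0, iters):
    w, vals = w0, []
    for _ in range(iters):
        w = prox(w - grad(w) / L, 1.0 / L)
        vals.append(E(w))
    return w, np.array(vals)

w_star, _ = run(np.zeros(5), 20000)           # proxy for a minimizer w*
w0 = np.zeros(5)
_, vals = run(w0, 200)
ks = np.arange(1, 201)
print(np.all(np.diff(vals) <= 1e-12))         # E(w_k) non-increasing
print(np.all(vals - E(w_star) <= L / (2 * ks) * np.sum((w0 - w_star) ** 2) + 1e-9))
```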
13-14 Proof (Majorize-Minorize)
Lemma (a quadratic majorant). We have, for any $w, w' \in W$,
$\tilde E(w') \le \tilde E(w) + \langle\nabla\tilde E(w), w' - w\rangle + \frac{L}{2}\|w' - w\|_2^2.$
Proof: Using the second fundamental theorem of calculus, we have
$\tilde E(w') = \tilde E(w) + \int_0^1 \langle\nabla\tilde E(tw + (1-t)w'), w' - w\rangle\,dt.$
Therefore
$\tilde E(w') - \tilde E(w) - \langle\nabla\tilde E(w), w' - w\rangle = \int_0^1 \langle\nabla\tilde E(tw + (1-t)w') - \nabla\tilde E(w), w' - w\rangle\,dt$
$\le \int_0^1 \|\nabla\tilde E(tw + (1-t)w') - \nabla\tilde E(w)\|_2\,\|w' - w\|_2\,dt$ (Cauchy-Schwarz)
$\le \int_0^1 L\,\|tw + (1-t)w' - w\|_2\,\|w' - w\|_2\,dt$ (Lipschitz gradient)
$= L\|w' - w\|_2^2 \int_0^1 (1-t)\,dt = \frac{L}{2}\|w' - w\|_2^2.$
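The quadratic-majorant lemma is easy to sanity-check numerically on a toy smooth function; a small sketch (ours), assuming $\tilde E(w) = \frac{1}{2}w^\top Q w$ with Q symmetric positive semi-definite, whose gradient $Qw$ has Lipschitz constant $L = \|Q\|_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
Q = rng.standard_normal((4, 4)); Q = Q @ Q.T       # symmetric PSD -> convex E
E = lambda w: 0.5 * w @ Q @ w
grad = lambda w: Q @ w
L = np.linalg.norm(Q, 2)                           # spectral norm = Lipschitz constant

w, wp = rng.standard_normal(4), rng.standard_normal(4)
majorant = E(w) + grad(w) @ (wp - w) + 0.5 * L * np.sum((wp - w) ** 2)
print(E(wp) <= majorant + 1e-12)                   # True: the quadratic majorizes E
```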
15 Proof (Majorize-Minorize)
We denote, for $k \ge 1$ and $w \in W$,
$F_k(w) = \tilde E(w_{k-1}) + \langle\nabla\tilde E(w_{k-1}), w - w_{k-1}\rangle + \frac{L}{2}\|w - w_{k-1}\|_2^2.$
We have (using the previous lemma)
$\tilde E(w) \le F_k(w). \quad (1)$
Lemma (Minorize). We have
$w_k = \operatorname{Argmin}_{w \in W} F_k(w) + R(w). \quad (2)$
We also have, for all $w \in W$,
$F_k(w) + R(w) \ge F_k(w_k) + R(w_k) + \frac{L}{2}\|w - w_k\|_2^2. \quad (3)$
16 Proof (Majorize-Minorize)
Proof of (2): $w_k = \operatorname{Argmin}_{w \in W} F_k(w) + R(w)$.
$w_k = \operatorname{prox}^L_R\left(w_{k-1} - \frac{1}{L}\nabla\tilde E(w_{k-1})\right)$
$= \operatorname{Argmin}_{w \in W} \frac{L}{2}\left\|w - w_{k-1} + \frac{1}{L}\nabla\tilde E(w_{k-1})\right\|_2^2 + R(w)$
$= \operatorname{Argmin}_{w \in W} \frac{1}{2L}\|\nabla\tilde E(w_{k-1})\|_2^2 + \langle w - w_{k-1}, \nabla\tilde E(w_{k-1})\rangle + \frac{L}{2}\|w - w_{k-1}\|_2^2 + R(w)$
$= \operatorname{Argmin}_{w \in W} F_k(w) + R(w),$
since replacing the constant $\frac{1}{2L}\|\nabla\tilde E(w_{k-1})\|_2^2$ by the constant $\tilde E(w_{k-1})$ does not change the argmin.
17 Proof (Majorize-Minorize)
Proof of (3): $F_k(w) + R(w) \ge F_k(w_k) + R(w_k) + \frac{L}{2}\|w - w_k\|_2^2$.
First notice that, for all $w \in W$,
$F_k(w) - F_k(w_k) = \tilde E(w_{k-1}) + \langle\nabla\tilde E(w_{k-1}), w - w_{k-1}\rangle + \frac{L}{2}\|w - w_{k-1}\|_2^2 - \left(\tilde E(w_{k-1}) + \langle\nabla\tilde E(w_{k-1}), w_k - w_{k-1}\rangle + \frac{L}{2}\|w_k - w_{k-1}\|_2^2\right)$
$= \langle\nabla\tilde E(w_{k-1}), w - w_k\rangle + \frac{L}{2}\left(\|w - w_k\|_2^2 + 2\langle w - w_k, w_k - w_{k-1}\rangle\right)$
$= \frac{L}{2}\|w - w_k\|_2^2 + \langle\nabla\tilde E(w_{k-1}) + L(w_k - w_{k-1}), w - w_k\rangle$
$= \frac{L}{2}\|w - w_k\|_2^2 + \langle\nabla F_k(w_k), w - w_k\rangle.$
18 Proof (Majorize-Minorize)
End of the proof of (3): $F_k(w) + R(w) \ge F_k(w_k) + R(w_k) + \frac{L}{2}\|w - w_k\|_2^2$.
$F_k(w) + R(w) - F_k(w_k) - R(w_k) = \frac{L}{2}\|w - w_k\|_2^2 + \langle\nabla F_k(w_k), w - w_k\rangle + R(w) - R(w_k).$
Moreover, since $w_k = \operatorname{Argmin}_{w \in W} F_k(w) + R(w)$,
$0 \in \partial(F_k + R)(w_k) = \nabla F_k(w_k) + \partial R(w_k),$
we have
$w_k \in \operatorname{Argmin}_{w \in W} \langle\nabla F_k(w_k), w - w_k\rangle + R(w).$
Therefore $\langle\nabla F_k(w_k), w - w_k\rangle + R(w) \ge R(w_k)$ and
$F_k(w) + R(w) - F_k(w_k) - R(w_k) \ge \frac{L}{2}\|w - w_k\|_2^2.$
19 Proof (Majorize-Minorize)
Let us return to the proof of the main theorem. Let $w^\star \in \operatorname{Argmin}_{w \in W} E(w)$. Using (1) ($\tilde E(w) \le F_k(w)$), the Minorize lemma, and the convexity of $\tilde E$, we have
$E(w_k) \le F_k(w_k) + R(w_k)$
$\le F_k(w^\star) + R(w^\star) - \frac{L}{2}\|w^\star - w_k\|_2^2$
$= \tilde E(w_{k-1}) + \langle\nabla\tilde E(w_{k-1}), w^\star - w_{k-1}\rangle + R(w^\star) + \frac{L}{2}\|w^\star - w_{k-1}\|_2^2 - \frac{L}{2}\|w^\star - w_k\|_2^2$
$\le E(w^\star) + \frac{L}{2}\left(\|w^\star - w_{k-1}\|_2^2 - \|w^\star - w_k\|_2^2\right).$
20 Proof (Majorize-Minorize)
Using $\tilde E(w_k) \le F_k(w_k)$ and $w_k = \operatorname{Argmin}_{w \in W} F_k(w) + R(w)$, we have
$E(w_k) \le F_k(w_k) + R(w_k) \le F_k(w_{k-1}) + R(w_{k-1}) = E(w_{k-1}).$
In words, $(E(w_k))_{k \in \mathbb{N}}$ is non-increasing. We therefore have, for all $k' \le k$, $E(w_k) \le E(w_{k'})$, and therefore
$E(w_k) - E(w^\star) \le \frac{1}{k}\sum_{k'=1}^{k}\left(E(w_{k'}) - E(w^\star)\right)$
$\le \frac{L}{2k}\sum_{k'=1}^{k}\left(\|w^\star - w_{k'-1}\|_2^2 - \|w^\star - w_{k'}\|_2^2\right)$
$= \frac{L}{2k}\left(\|w^\star - w_0\|_2^2 - \|w^\star - w_k\|_2^2\right) \le \frac{L}{2k}\|w^\star - w_0\|_2^2,$
where the sum telescopes.
21 To go further
Accelerated versions exist (convergence in $O(1/k^2)$): FISTA (Beck-Teboulle); a sketch follows below.
For other algorithms using the prox, see: the Chambolle-Pock algorithm, the Douglas-Rachford algorithm, the Proximal Point Algorithm.
Convergence proofs including non-convex settings: PALM (Bolte-Sabach-Teboulle).
Including a stochastic setting: Chouzenoux-Pesquet-Repetti.
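For reference, a compact sketch (ours) of the FISTA acceleration mentioned above, on the same lasso example as before: compared with the plain proximal gradient loop, only the Nesterov extrapolation step is new.

```python
import numpy as np

rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.5
grad = lambda w: A.T @ (A @ w - b)
L = np.linalg.norm(A, 2) ** 2
prox = lambda w, t: np.sign(w) * np.maximum(np.abs(w) - t * lam, 0.0)

w = y = np.zeros(5)
s = 1.0
for _ in range(200):
    w_new = prox(y - grad(y) / L, 1.0 / L)        # usual prox-gradient step at y
    s_new = (1.0 + np.sqrt(1.0 + 4.0 * s ** 2)) / 2.0
    y = w_new + (s - 1.0) / s_new * (w_new - w)   # Nesterov extrapolation
    w, s = w_new, s_new
print(w)
```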