Sequential convex programming: value function and convergence
Sequential convex programming: value function and convergence
Edouard Pauwels, joint work with Jérôme Bolte
Journées MODE, Toulouse, March 2016
Introduction

Local search methods for finite-dimensional nonconvex optimization.

Sequential convex programming (SCP): $x_{k+1} = \arg\min_y h(y, x_k)$.

Convergence of the iterates to a critical point?

[Figure: three possible behaviors of the iterates $x_k$: non-convergence, jittering, convergence. A generic loop sketch follows.]
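The SCP iteration is a fixed-point loop around a convex-programming oracle. A minimal sketch (mine, not from the slides; `convex_model_argmin` is a hypothetical oracle standing for any routine returning $\arg\min_y h(y, x_k)$ for the chosen convex model $h$):

```python
# A sketch of the abstract SCP loop; `convex_model_argmin` is hypothetical.
import numpy as np

def scp(x0, convex_model_argmin, n_iter=100, tol=1e-9):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x_next = convex_model_argmin(x)       # x_{k+1} = argmin_y h(y, x_k)
        if np.linalg.norm(x_next - x) < tol:  # a fixed point is the candidate
            break                             # for a critical point
        x = x_next
    return x
```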
The prox-friendly setting: limitations

Non-smooth part (constraints or more general).

Favorable geometries:
- Proximal splitting: Bruck, Passty, Combettes, Pesquet, Nesterov, Beck, Teboulle, Eckstein, Tseng...
- Convergence with semi-algebraic data: Bolte, Attouch, Noll, Rondepierre, Chouzenoux, Pesquet, Repetti, Lewis, Sabach, Teboulle...

Not all problems have tractable proximal operators. With complex geometries, the nonsmoothness / constraints must be approximated:
- LP, QP, SDP: convex programming oracles.
- A large field (SQP, SQCQP, Gauss-Newton, ...).
- The prox-friendly analysis does not apply directly: more approximations mean more sources of oscillation, and convergence is barely understood.

Convergence to critical points in non-prox-friendly settings?
Outline
1. Existing results: gradient methods with semi-algebraic data
2. Complex geometries: sequential convex programming
3. Implicit gradient steps: the value function
Favorable geometries: gradient methods

Gradient descent: $\min_x f(x)$ ($f$ smooth):
$s(x_k - x_{k+1}) = \nabla f(x_k)$.

Proximal point: $\min_x g(x)$ ($g$ non-smooth):
$x_{k+1} = \mathrm{prox}_{g/s}(x_k)$, so $s(x_k - x_{k+1}) \in \partial g(x_{k+1})$
(here $\mathrm{prox}_{g/s}(x) = \arg\min_y g(y) + \tfrac{s}{2}\|y - x\|_2^2$).

Forward-backward: $\min_x f(x) + g(x)$ ($f$ smooth, $g$ non-smooth):
$x_{k+1} = \mathrm{prox}_{g/s}\bigl(x_k - \tfrac{1}{s}\nabla f(x_k)\bigr)$, so $s(x_k - x_{k+1}) \in \partial g(x_{k+1}) + \nabla f(x_k)$.

In each case there is a relation between $x_k - x_{k+1}$ and a (sub)gradient of the objective; a numeric check of the forward-backward inclusion follows.
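A minimal numpy sketch (mine, not from the talk; the data $A$, $b$ and the weight `lam` are illustrative) verifying the forward-backward inclusion $s(x_k - x_{k+1}) \in \partial g(x_{k+1}) + \nabla f(x_k)$ for $f(x) = \tfrac12\|Ax - b\|^2$ and $g = \lambda\|\cdot\|_1$:

```python
import numpy as np

rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.1
s = np.linalg.norm(A.T @ A, 2)           # s >= Lipschitz constant of grad f

def grad_f(x):
    return A.T @ (A @ x - b)

def prox_g(x, t):                        # prox of t*lam*||.||_1: soft-threshold
    return np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)

x = np.zeros(5)
for k in range(200):
    x_new = prox_g(x - grad_f(x) / s, 1.0 / s)
    # the residual s*(x_k - x_{k+1}) - grad f(x_k) must lie in the l1 subdifferential
    r = s * (x - x_new) - grad_f(x)
    assert np.all(np.abs(r) <= lam + 1e-8)                          # |r_i| <= lam
    assert np.allclose(r[x_new != 0], lam * np.sign(x_new[x_new != 0]))
    x = x_new
```

The two asserts hold at every iteration by prox optimality alone, independently of the step size; they are exactly the slide's relation between $x_k - x_{k+1}$ and the subgradient.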
KL property (Łojasiewicz '63, Kurdyka '98)

Desingularizing functions on $(0, r_0)$: $\varphi \in C([0, r_0), \mathbb{R}_+)$, $\varphi \in C^1(0, r_0)$, $\varphi' > 0$, $\varphi$ concave and $\varphi(0) = 0$.

[Figure: graph of a desingularizing function $\varphi$.]

Definition. $F_0$ has the KL property at $\bar x$ (with $F_0(\bar x) = 0$) if there exist $\epsilon > 0$ and a desingularizing function $\varphi$ such that
$$\mathrm{dist}\bigl(\partial(\varphi \circ F_0)(x), 0\bigr) \ge 1 \quad \text{for all } x \text{ with } \|x - \bar x\| \le \epsilon,\ F_0(\bar x) < F_0(x) < F_0(\bar x) + \epsilon.$$
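A worked instance may help (my illustration, not from the slides): the Łojasiewicz inequality for $F_0(x) = x^2$ at $\bar x = 0$, with desingularizer $\varphi(r) = \sqrt{r}$.

```latex
\[
(\varphi \circ F_0)(x) = \sqrt{x^2} = |x|,
\qquad
\bigl|\nabla(\varphi \circ F_0)(x)\bigr|
  = \bigl|\varphi'(x^2)\, F_0'(x)\bigr|
  = \frac{|2x|}{2\sqrt{x^2}} = 1 \;\ge\; 1
\quad \text{for all } x \neq 0.
\]
```

Composing with $\varphi$ turns the flat valley of $x^2$ into the sharp function $|x|$, whose gradient is bounded away from zero off the minimizer: this is the sharpening the next slide illustrates.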
Illustration

[Figure: $F_0$ versus $\varphi \circ F_0$: reparameterizing with $\varphi$ sharpens the function.]

Theorem (Bolte-Daniilidis-Lewis, 2006). The KL inequality holds for all lower semicontinuous semi-algebraic functions (and many more).
Finite length property (general recipe)
(Attouch, Bolte, Svaiter, Sabach, Teboulle, ...)

For some $A, B > 0$:
- Sufficient decrease: $f(x_{k+1}) + A\|x_{k+1} - x_k\|^2 \le f(x_k)$.
- Step length: $\mathrm{dist}\bigl(0, \partial f(x_k)\bigr) \le B\|x_{k+1} - x_k\|$.
- Tameness: $f$ is semi-algebraic (KL inequality).
- Coercivity: $\{x \,;\, f(x) \le f(x_0)\}$ is compact.

Then finite length: $\sum_k \|x_{k+1} - x_k\|$ is bounded, and $\{x_k\}$ converges to a critical point (a sketch of the argument follows below).

Remark: there exist counterexamples for functions which are not semi-algebraic.
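A condensed version of the standard KL telescoping argument (my summary, not verbatim from the slides): set $f^* = \lim_k f(x_k)$ and $\Delta_k = \varphi(f(x_k) - f^*)$. Concavity of $\varphi$, then sufficient decrease, then the KL inequality combined with the step-length bound give

```latex
\begin{align*}
\Delta_k - \Delta_{k+1}
  &\ge \varphi'\bigl(f(x_k) - f^*\bigr)\,\bigl(f(x_k) - f(x_{k+1})\bigr)
   \ge \varphi'\bigl(f(x_k) - f^*\bigr)\, A\,\|x_{k+1} - x_k\|^2 \\
  &\ge \frac{A\,\|x_{k+1} - x_k\|^2}{B\,\|x_{k+1} - x_k\|}
   = \frac{A}{B}\,\|x_{k+1} - x_k\|.
\end{align*}
```

Summing over $k$ telescopes: $\sum_k \|x_{k+1} - x_k\| \le (B/A)\,\Delta_0 < \infty$, so $(x_k)$ is Cauchy and converges, and the step-length bound forces the limit to be critical.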
Outline
1. Existing results: gradient methods with semi-algebraic data
2. Complex geometries: sequential convex programming
3. Implicit gradient steps: the value function
Examples in nonlinear programming

Approximate local models: approximation and localization.

Exact penalization: $\min_x f(x) + \beta \max_{i=0\ldots m} f_i(x)$ ($f, f_i$ smooth; presumably with the convention $f_0 \equiv 0$, so the penalty is $\max(0, \max_i f_i)$):
$$x_{k+1} = \arg\min_y\; f(x_k) + \langle \nabla f(x_k), y - x_k\rangle + \beta \max_{i=1\ldots m}\bigl\{f_i(x_k) + \langle \nabla f_i(x_k), y - x_k\rangle\bigr\} + s\|y - x_k\|_2^2.$$

Moving ball: $\min_x f(x)$ s.t. $\max_{i=1\ldots m} f_i(x) \le 0$ ($f, f_i$ smooth):
$$x_{k+1} = \arg\min_y\; f(x_k) + \langle \nabla f(x_k), y - x_k\rangle + s\|y - x_k\|_2^2 \quad \text{s.t.} \quad \max_{i=1\ldots m} f_i(x_k) + \langle \nabla f_i(x_k), y - x_k\rangle + s\|y - x_k\|_2^2 \le 0.$$

Gauss-Newton: $\min_x g(F(x))$ ($F$ smooth, $g$ convex):
$$x_{k+1} = \arg\min_y\; g\bigl(F(x_k) + \nabla F(x_k)(y - x_k)\bigr) + s\|y - x_k\|_2^2.$$

A cvxpy sketch of the moving-ball step follows.
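To make the moving-ball step concrete, here is a minimal cvxpy sketch of the iteration (my toy instance, not from the talk: the objective $f$, the single constraint $f_1$ defining the unit ball, and the step parameter $s = 2$ are illustrative choices):

```python
import cvxpy as cp
import numpy as np

def f(x):       return 0.5 * np.sum(x**2) - x[0]
def grad_f(x):  return x - np.array([1.0, 0.0])
# one smooth constraint f_1(x) = ||x||^2 - 1 <= 0 (unit ball)
def f1(x):      return np.sum(x**2) - 1.0
def grad_f1(x): return 2.0 * x

def moving_ball_step(x, s):
    y = cp.Variable(x.shape[0])
    obj = f(x) + grad_f(x) @ (y - x) + s * cp.sum_squares(y - x)
    # the quadratic "ball" constraint; s should dominate half the curvature
    # of f_1 so the ball stays inside the true feasible set
    con = [f1(x) + grad_f1(x) @ (y - x) + s * cp.sum_squares(y - x) <= 0]
    cp.Problem(cp.Minimize(obj), con).solve()
    return y.value

x = np.array([0.0, 0.5])            # feasible start: f1(x) = -0.75 < 0
for k in range(30):
    x = moving_ball_step(x, s=2.0)
print(x, f1(x))                      # iterates stay feasible for the ball
```

With $s$ at least half the Lipschitz constants of the $\nabla f_i$, each subproblem is feasible ($y = x_k$ satisfies the constraint) and every iterate remains feasible for the original problem.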
A gradient method?

[Figure: the objective $H$ and its convex local model $h_s(\cdot, x)$.]

Approximation: $H(x) = \max_{i=1\ldots m} f_i(x)$; $h_s(y, x) = \max_{i=1\ldots m}\bigl\{f_i(x) + \langle \nabla f_i(x), y - x\rangle\bigr\} + s\|y - x\|_2^2$.

$x_{k+1} = \arg\min_y h_s(y, x_k)$: a gradient method?

Main difficulty: tracking the activity $I(x) := \arg\max_i f_i(x)$, which determines the subgradients of $H$. $I(x_{k+1})$ and $I(x_k)$ are very hard to connect, so there is no relation between $x_k - x_{k+1}$ and elements of $\partial H(x_k)$ or $\partial H(x_{k+1})$. The same issue arises for all the previous methods.
Outline
1. Existing results: gradient methods with semi-algebraic data
2. Complex geometries: sequential convex programming
3. Implicit gradient steps: the value function
Toward a link between SCP and gradient methods

Gradient descent is an SCP method: $s(x_k - x_{k+1}) = \nabla f(x_k)$, with
$$x_{k+1} = \arg\min_y\; f(x_k) + \langle \nabla f(x_k), y - x_k\rangle + \tfrac{s}{2}\|y - x_k\|_2^2.$$

An identity of Moreau: for $g$ convex, lower semicontinuous, let $G : x \mapsto \min_y g(y) + \tfrac{1}{2}\|y - x\|_2^2$ (the value function of $\mathrm{prox}_g$). Then
$$x_{k+1} = \mathrm{prox}_g(x_k) = \arg\min_y\; g(y) + \tfrac{1}{2}\|y - x_k\|_2^2 \quad \Longleftrightarrow \quad x_k - x_{k+1} = \nabla G(x_k).$$

A prox step is an implicit gradient step on its value function (a numeric check follows). Can we extend this to more general SCP?
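A quick numeric sanity check of the Moreau identity (my sketch, not from the talk): for $g = \|\cdot\|_1$, the value function $G(x) = \min_y g(y) + \tfrac12\|y - x\|^2$ is the Huber envelope, and $x - \mathrm{prox}_g(x)$ should match the finite-difference gradient of $G$ at $x$:

```python
import numpy as np

def prox_g(x):                        # prox of ||.||_1: soft-threshold at 1
    return np.sign(x) * np.maximum(np.abs(x) - 1.0, 0.0)

def G(x):                             # Moreau envelope, evaluated at its minimizer
    y = prox_g(x)
    return np.sum(np.abs(y)) + 0.5 * np.sum((y - x)**2)

rng = np.random.default_rng(1)
x = rng.standard_normal(4) * 3
eps = 1e-6
fd_grad = np.array([(G(x + eps*e) - G(x - eps*e)) / (2*eps)
                    for e in np.eye(4)])
print(np.allclose(x - prox_g(x), fd_grad, atol=1e-5))   # True
```

Even though $g$ is nonsmooth, $G$ is $C^1$, which is precisely why the prox step can be read as a gradient step on $G$.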
SCP: strongly convex tangent approximation, example

Objective (each $f_i$ is $C^2$, semi-algebraic): $H(x) = \max_{i=1\ldots m} f_i(x)$.

[Figure: $H$ and its strongly convex tangent model $h_s(\cdot, x)$.]

Approximation: $h_s(y, x) = \max_{i=1\ldots m}\bigl\{f_i(x) + \langle \nabla f_i(x), y - x\rangle\bigr\} + s\|y - x\|_2^2$;
$$x_{k+1} = \arg\min_y h_s(y, x_k). \qquad (P_s(x_k))$$

The value function: $V_s(x) =$ value of $(P_s(x))$.
- Critical points of $V_s$ are exactly critical points of $H$.
- $\mathrm{dist}\bigl(0, \partial V_s(x_k)\bigr) \le C\|x_{k+1} - x_k\|$ locally.
- $V_s(x_k) + D\|x_k - x_{k+1}\|^2 \le V_s(x_{k-1})$ locally (for suitable $s$).
- $V_s$ is semi-algebraic: nonsmooth KL property.

An implicit (sub)gradient step on the value function: back to charted territory. A sketch of the iteration, tracking $V_s$, follows.
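A sketch of this SCP iteration on $H(x) = \max_i f_i(x)$ that also records the value function $V_s$ (my toy instance; the three quadratic pieces $f_i$ are illustrative):

```python
import cvxpy as cp
import numpy as np

# three smooth pieces f_i(x) = 0.5*||x - c_i||^2 + b_i  (n = 2, m = 3)
C = np.array([[1.0, 0.0], [-1.0, 0.5], [0.0, -1.0]])
b = np.array([0.0, 0.3, -0.2])
fs    = lambda x: 0.5 * np.sum((x - C)**2, axis=1) + b
grads = lambda x: x - C                          # row i = grad f_i(x)

def scp_step(x, s=2.0):
    y = cp.Variable(2)
    tangents = [fs(x)[i] + grads(x)[i] @ (y - x) for i in range(3)]
    model = cp.maximum(*tangents) + s * cp.sum_squares(y - x)
    prob = cp.Problem(cp.Minimize(model))
    prob.solve()
    return y.value, prob.value                   # x_{k+1} and V_s(x_k)

x = np.array([2.0, 2.0])
vals = []
for k in range(25):
    x, v = scp_step(x)
    vals.append(v)
print(np.all(np.diff(vals) <= 1e-6))   # V_s(x_k) decreases along the iterates
```

For $s$ above half the curvature of the $f_i$ (here the Hessians are the identity, so $s = 2$ is ample), the tangent model majorizes $H$ up to a quadratic, which yields the sufficient-decrease property of $V_s$ that the slide states, and the printed check observes it numerically.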
Nonlinear programming with semi-algebraic data

SQP and SQCQP from (Bolte-P. 2014), a general convergence result:
- Exact penalization SQP: S$\ell_1$-QP (Fletcher 1985), ESQM (Auslender 2013).
- Inner approximating methods: Moving Ball (Auslender et al. 2010).

Ongoing work (P. 2016): composite Gauss-Newton (Burke 1985): $\min_x g(F(x))$, with $g : \mathbb{R}^m \to \mathbb{R}$ convex finite valued and $F : \mathbb{R}^n \to \mathbb{R}^m$ of class $C^2$.
$$x_{k+1} = \arg\min_y\; g\bigl(F(x_k) + \nabla F(x_k)(y - x_k)\bigr) + s\|y - x_k\|_2^2.$$
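A sketch of one composite Gauss-Newton step in the same style (my illustration, not the paper's code: $g = \|\cdot\|_1$ and the residual map $F$ below are toy choices):

```python
import cvxpy as cp
import numpy as np

def F(x):  return np.array([x[0]**2 - 1.0, x[0]*x[1], x[1] - 0.5])
def JF(x): return np.array([[2*x[0], 0.0], [x[1], x[0]], [0.0, 1.0]])

def gn_step(x, s=2.0):
    y = cp.Variable(2)
    lin = F(x) + JF(x) @ (y - x)     # F(x_k) + grad F(x_k)(y - x_k)
    prob = cp.Problem(cp.Minimize(cp.norm1(lin) + s * cp.sum_squares(y - x)))
    prob.solve()
    return y.value

x = np.array([2.0, 2.0])
for k in range(40):
    x = gn_step(x)
print(x, F(x))   # residuals are driven down; here F(x) = 0 is unattainable,
                 # so the iterates settle at a critical point of ||F||_1
```

The inner problem is a small second-order cone program, which is exactly the "convex programming oracle" role the talk assigns to these methods.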
Conclusion

[Figure: non-convergence, jittering, convergence of the iterates.]

First general convergence result for SCP methods (complex geometry). Abstract SCP: strongly convex tangent approximations of a tame objective yield an implicit subgradient method on the value function.

More details: J. Bolte and E. Pauwels. Majorization-minimization procedures and convergence of SQP methods for semi-algebraic and tame programs. Mathematics of Operations Research, 2016.