Taylor-like models in nonsmooth optimization

Taylor-like models in nonsmooth optimization
Dmitriy Drusvyatskiy
Mathematics, University of Washington
Joint work with Ioffe (Technion), Lewis (Cornell), and Paquette (UW)
SIAM Optimization 2017
AFOSR, NSF

Fix a closed function f : R^n → R ∪ {+∞}.

Slope: the fastest instantaneous rate of decrease,

    |∇f|(x̄) := limsup_{x → x̄} (f(x̄) − f(x))⁺ / ‖x̄ − x‖.

If f is convex, then |∇f|(x) = dist(0; ∂f(x)).

Critical points: x is critical for f  ⟺  |∇f|(x) = 0.

Deficiency: |∇f| is discontinuous  ⟹  it cannot be used to terminate an algorithm.
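
The slope can be made concrete numerically. The following sketch (my illustration, not part of the talk) estimates |∇f|(x̄) by sampling nearby points and compares it against dist(0; ∂f(x̄)) for the convex function f(x) = ‖x‖₁; all names in the code are made up.

```python
import numpy as np

def slope_estimate(f, x_bar, radius=1e-4, num_samples=20000, seed=0):
    """Monte-Carlo estimate of the slope |grad f|(x_bar):
    largest observed (f(x_bar) - f(x))^+ / ||x_bar - x|| over nearby x."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(num_samples):
        direction = rng.standard_normal(x_bar.shape)
        direction /= np.linalg.norm(direction)
        x = x_bar + radius * rng.uniform(0.1, 1.0) * direction
        decrease = max(f(x_bar) - f(x), 0.0)
        best = max(best, decrease / np.linalg.norm(x_bar - x))
    return best

# Convex example: f(x) = ||x||_1, where the slope equals dist(0; subdifferential).
f = lambda x: np.linalg.norm(x, 1)

x_bar = np.array([1.0, -2.0, 0.5])       # no zero coordinates: dist(0, {sign(x)}) = sqrt(3)
print(slope_estimate(f, x_bar))           # approaches sqrt(3) ~ 1.73 as sampling increases
print(slope_estimate(f, np.zeros(3)))     # 0.0: the origin is the minimizer, so the slope vanishes
```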

Basic question: Is there a computable continuous surrogate G for |∇f|?

Desirable properties:
1. G is continuous,
2. G(x) = 0  ⟺  |∇f|(x) = 0,
3. epi G and epi |∇f| are close.

Various contexts: cutting planes (Kelley), bundle methods (Lemaréchal, Noll, Sagastizábal, Wolfe), gradient sampling (Goldstein, Burke-Lewis-Overton).

Outline
1. Taylor-like models: step-size, stationarity, error bounds.
2. Convex composite g + h ∘ c: prox-linear method, local linear/quadratic rates.

Taylor-like models

Task: determine the quality of a point x ∈ R^n for min_y f(y).

Structural assumption: a Taylor-like model f_x is available:

    |f_x(y) − f(y)| ≤ (η/2)‖x − y‖²   for all y.

Slope surrogate:

    x⁺ ∈ argmin_y f_x(y)   and   G(x) := ‖x − x⁺‖.

Thm (D-Ioffe-Lewis '16): There exists a point x̂ satisfying
    (point proximity)     (1/2)‖x − x̂‖ ≤ G(x),
    (value proximity)     √((1/η)|f(x̂) − f(x)|) ≤ G(x),
    (near stationarity)   (1/η)|∇f|(x̂) ≤ G(x).
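
A concrete instance of a Taylor-like model (an illustration consistent with the proximal gradient example mentioned later in the talk, not code from the talk): for an additive composite f = g + c with c smooth and g simple, minimizing the linearized-plus-quadratic model is a single soft-thresholding step, which yields x⁺ and the surrogate G(x) = ‖x − x⁺‖. The lasso objective and all helper names below are illustrative; the model-error check uses the constant η rather than η/2, which is the constant this particular model actually satisfies.

```python
import numpy as np

rng = np.random.default_rng(1)
A, b, lam = rng.standard_normal((30, 10)), rng.standard_normal(30), 0.1

g = lambda x: lam * np.linalg.norm(x, 1)          # simple nonsmooth term
c = lambda x: 0.5 * np.sum((A @ x - b) ** 2)      # smooth term
grad_c = lambda x: A.T @ (A @ x - b)
eta = np.linalg.norm(A.T @ A, 2)                  # Lipschitz constant of grad c

def model(x, y):
    """Taylor-like model f_x(y): linearize c around x and add a quadratic."""
    return g(y) + c(x) + grad_c(x) @ (y - x) + 0.5 * eta * np.sum((y - x) ** 2)

def model_minimizer(x):
    """x+ = argmin_y f_x(y): a single soft-thresholding (prox-gradient) step."""
    z = x - grad_c(x) / eta
    return np.sign(z) * np.maximum(np.abs(z) - lam / eta, 0.0)

x = rng.standard_normal(10)
x_plus = model_minimizer(x)
print("G(x) =", np.linalg.norm(x - x_plus))       # slope surrogate G(x) = ||x - x+||

# Check the Taylor-like property |f_x(y) - f(y)| = O(||y - x||^2) on random points.
f = lambda y: g(y) + c(y)
for _ in range(5):
    y = x + 0.1 * rng.standard_normal(10)
    assert abs(model(x, y) - f(y)) <= eta * np.sum((y - x) ** 2) + 1e-9
```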

Error bounds and linear rates

Thm (D-Ioffe-Lewis '16): Let S ⊂ R^n be arbitrary and fix x̄ ∈ S. Suppose the slope error bound holds:

    (Slope EB)      dist(x; S) ≤ κ·|∇f|(x)   for all x near x̄.

Then the step-size error bound holds:

    (Step-size EB)  dist(x; S) ≤ (3κη + 2)·G(x)   for all x near x̄.

The slope EB is the phenomenon underlying linear rates, and the step-size EB aids linear rate analysis (Luo-Tseng '93).

Rem: Similar results hold for the surrogate G(x) := f(x) − f_x(x⁺) (the model decrease).
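
To see informally how a slope error bound transfers to a step-size error bound, here is a back-of-the-envelope sketch (not the argument of D-Ioffe-Lewis '16, and with looser constants than the (3κη + 2) stated above), using the point x̂ produced by the previous theorem:

```latex
% The theorem above provides \hat x with
%   \|x - \hat x\| \lesssim G(x)  and  |\nabla f|(\hat x) \lesssim \eta\, G(x).
% If \hat x stays in the region where the slope error bound applies, then
\[
  \operatorname{dist}(x;S)
    \;\le\; \|x - \hat x\| + \operatorname{dist}(\hat x;S)
    \;\le\; \|x - \hat x\| + \kappa\,|\nabla f|(\hat x)
    \;\lesssim\; (\kappa\eta + 1)\, G(x),
\]
% a bound of the same form as the step-size error bound.
```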

Convex composite minimization: g + h ∘ c

Nonsmooth & nonconvex minimization

Convex composite:

    min_x f(x) = g(x) + h(c(x)),

where
    g : R^d → R ∪ {+∞} is closed and convex,
    h : R^m → R is convex and L-Lipschitz,
    c : R^d → R^m is C¹-smooth and ∇c is β-Lipschitz.

For convenience, set η = Lβ.

(Burke '85, '91, Cartis-Gould-Toint '11, Fletcher '82, Lewis-Wright '15, Powell '84, Wright '90, Yuan '83)
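
The choice η = Lβ is exactly what makes the linearized composite a Taylor-like model. A one-line sketch of the standard estimate (my notation):

```latex
% Lipschitz continuity of h plus the quadratic accuracy of the linearization of c:
\[
  \bigl| h\bigl(c(x) + \nabla c(x)(y - x)\bigr) - h\bigl(c(y)\bigr) \bigr|
    \;\le\; L\,\bigl\| c(x) + \nabla c(x)(y - x) - c(y) \bigr\|
    \;\le\; \frac{L\beta}{2}\,\|y - x\|^2
    \;=\; \frac{\eta}{2}\,\|y - x\|^2 .
\]
% Adding g(y) to both h-terms shows that the linearized objective is a
% Taylor-like model of f = g + h(c(.)) with parameter eta = L*beta.
```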

Composite examples

Convex composite: min_x f(x) = g(x) + h(c(x)).

Examples:
    Additive composite minimization:  min_x g(x) + c(x)
    Nonlinear least squares:  min { ‖c(x)‖ : l_i ≤ x_i ≤ u_i for i = 1, ..., m }
    Nonnegative matrix factorization:  min_{X,Y} ‖XY^T − D‖  s.t.  X, Y ≥ 0
    Robust phase retrieval (Duchi-Ruan '17):  min_x Σ_i |⟨a_i, x⟩² − b_i|
    Exact penalty subproblem:  min_x g(x) + dist_K(c(x))
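
As one illustration of the composite template (my own sketch, not from the talk), robust phase retrieval fits the pattern g + h∘c with g ≡ 0, h = ‖·‖₁, and c(x) = (Ax)² − b; the array names below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 40
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = (A @ x_true) ** 2                         # exact measurements b_i = <a_i, x>^2

# Composite pieces of f(x) = g(x) + h(c(x)).
g = lambda x: 0.0                              # no extra regularizer here
h = lambda z: np.linalg.norm(z, 1)             # convex and Lipschitz outer function
c = lambda x: (A @ x) ** 2 - b                 # smooth inner map (quadratic in x)
jac_c = lambda x: 2.0 * (A @ x)[:, None] * A   # Jacobian: 2 diag(Ax) A

f = lambda x: g(x) + h(c(x))
print(f(x_true))                               # 0 at the signal (and at -x_true)
print(f(rng.standard_normal(n)))               # positive at a generic point
```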

Prox-linear algorithm

Prox-linear model:

    f_x(y) = g(y) + h(c(x) + ∇c(x)(y − x)) + (η/2)‖y − x‖²,
    x⁺ = argmin_y f_x(y),    G(x) = η‖x − x⁺‖.

Justification:

    f_x(y) − η‖y − x‖² ≤ f(y) ≤ f_x(y)   for all x, y.

Prox-linear method (Burke, Fletcher, Osborne, Powell, ... '80s):  x_{k+1} = x_k⁺.

E.g.: proximal gradient and Levenberg-Marquardt methods.

Convergence rate: G(x_k) < ε after O(η/ε²) iterations.
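
A minimal prox-linear loop for a robust phase retrieval instance (an illustrative sketch, assuming the cvxpy package is available to solve the convex subproblem; the value of η and all names are made up, not taken from the talk):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 10, 40
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = (A @ x_true) ** 2

c = lambda x: (A @ x) ** 2 - b
jac = lambda x: 2.0 * (A @ x)[:, None] * A
f = lambda x: np.linalg.norm(c(x), 1)              # g = 0, h = l1-norm

def prox_linear_step(x, eta):
    """x+ = argmin_y  h(c(x) + Jc(x)(y - x)) + (eta/2)||y - x||^2."""
    y = cp.Variable(n)
    linearized = c(x) + jac(x) @ (y - x)
    objective = cp.norm(linearized, 1) + 0.5 * eta * cp.sum_squares(y - x)
    cp.Problem(cp.Minimize(objective)).solve()
    return np.asarray(y.value).ravel()

eta = 10.0                                          # illustrative; the theory asks for eta >= L*beta
x = x_true + 0.1 * rng.standard_normal(n)           # start near the signal
for k in range(10):
    x_plus = prox_linear_step(x, eta)
    G = eta * np.linalg.norm(x - x_plus)            # stationarity surrogate G(x)
    print(f"iter {k}: f = {f(x):.3e}, G = {G:.3e}")
    if G < 1e-6:                                    # stopping criterion from the talk
        break
    x = x_plus
```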

Stopping criterion

What does G(x) < ε actually mean?

Stationarity for the target problem:

    0 ∈ ∂g(x) + ∇c(x)^T ∂h(c(x)).

Stationarity for the prox-subproblem:

    G(x) ≥ dist(0; ∂g(x⁺) + ∇c(x)^T ∂h(c(x) + ∇c(x)(x⁺ − x))).

Thm (D-Lewis '16): x⁺ is nearly stationary, since there exists x̂ with

    ‖x̂ − x‖ ≤ (1/η)·G(x)   and   |∇f|(x̂) ≤ G(x).

Thm (D-Paquette '16):  G(x) ≳ ‖2η(x − prox_{f/2η}(x))‖.
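
The prox appearing in the D-Paquette bound is naturally read through the Moreau envelope f_λ(x) = min_y { f(y) + (1/(2λ))‖y − x‖² }. The standard identity below (recalled here for context; it applies because the composite f is η-weakly convex and 1/(2η) < 1/η) says that a small prox-linear step certifies near-stationarity of a smooth approximation of f:

```latex
% Gradient of the Moreau envelope f_lambda at parameter lambda = 1/(2*eta):
\[
  \nabla f_{1/(2\eta)}(x) \;=\; 2\eta\,\bigl(x - \operatorname{prox}_{f/2\eta}(x)\bigr),
\]
% so G(x) \gtrsim \| 2\eta (x - \operatorname{prox}_{f/2\eta}(x)) \| means that a small
% prox-linear step forces the smooth approximation f_{1/(2\eta)} to be nearly stationary at x.
```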

Local quadratic convergence

Let S = {stationary points} and fix x̄ ∈ S.

Thm (Burke-Ferris '95): A weak sharp minimum,

    0 < α ≤ |∇f|(x)   for all x ∉ S near x̄,

implies local quadratic convergence:

    dist(x_{k+1}; S) ≤ O(dist²(x_k; S)).

Growth interpretation: weak sharp minimum  ⟺  f(x) ≥ f(proj(x; S)) + α·dist(x, S) for x near x̄.
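
A back-of-the-envelope sketch of why sharpness yields a quadratic improvement (not the Burke-Ferris argument; it uses only the two-sided model bounds from the prox-linear slide and assumes stationary points near x̄ share the value f*):

```latex
% Write P(x) = proj(x;S) and f* for the common stationary value near \bar x.
% Since x^+ minimizes the model and the model is a two-sided approximation of f,
\[
  f(x^+) \;\le\; f_x(x^+) \;\le\; f_x\bigl(P(x)\bigr)
         \;\le\; f\bigl(P(x)\bigr) + \eta\,\operatorname{dist}^2(x;S)
         \;=\; f^* + \eta\,\operatorname{dist}^2(x;S).
\]
% Combining with sharp growth, f(x^+) \ge f^* + \alpha\,\operatorname{dist}(x^+;S), gives
\[
  \operatorname{dist}(x^+;S) \;\le\; \tfrac{\eta}{\alpha}\,\operatorname{dist}^2(x;S),
\]
% which is the quadratic improvement, up to the localization details.
```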

Local linear convergence

Thm (D-Lewis '16): The error bound property,

    dist(x; S) ≤ (1/α)·|∇f|(x)   for x near x̄,

implies local linear convergence:

    f(x_{k+1}) − f* ≤ (1 − α²/η²)·(f(x_k) − f*).

Growth interpretation (D-Mordukhovich-Nghia '14): the EB property  ⟺  f(x) ≥ f(proj(x, S)) + (α/2)·dist²(x, S) for x near x̄.

The rate becomes α/η under tilt-stability (Poliquin-Rockafellar '98).
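
A one-dimensional illustration of the growth interpretation (my example): for f(x) = (α/2)x² with S = {0}, the error bound and quadratic growth hold with the same constant α:

```latex
\[
  |\nabla f|(x) = \alpha\,|x| = \alpha\,\operatorname{dist}(x;S)
  \quad\Longrightarrow\quad
  \operatorname{dist}(x;S) = \tfrac{1}{\alpha}\,|\nabla f|(x),
\qquad
  f(x) = f\bigl(\operatorname{proj}(x;S)\bigr) + \tfrac{\alpha}{2}\,\operatorname{dist}^2(x;S).
\]
```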

Robust phase retrieval (Duchi-Ruan '17)

Problem: find x ∈ R^n satisfying ⟨a_i, x⟩² ≈ b_i for a_1, ..., a_m ∈ R^n and b_1, ..., b_m ∈ R_+.

Defn (Eldar-Mendelson '12): A ∈ R^{m×n} is stable if

    ‖(Ax)² − (Ay)²‖_1 ≥ λ·‖x − y‖·‖x + y‖.

Thm (Duchi-Ruan '17): If a_i ~ N(0, I_n) and m/n ≳ 1, then A is stable with high probability.

Two ingredients:
1) The problem min_x ‖(Ax)² − b‖_1 = ‖(Ax)² − (Ax̄)²‖_1 (with b = (Ax̄)² for the true signal x̄) has a weak sharp minimum  ⟹  local quadratic convergence!
2) One can find x_0 in the attraction region w.h.p. using the spectrum.
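
Ingredient 2 in code (a simplified spectral initialization, my sketch rather than the exact procedure of Duchi-Ruan '17): the leading eigenvector of (1/m) Σ_i b_i a_i a_iᵀ correlates with ±x̄, and the mean of b gives a crude scale.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 400
A = rng.standard_normal((m, n))
x_bar = rng.standard_normal(n)
x_bar /= np.linalg.norm(x_bar)
b = (A @ x_bar) ** 2

# Weighted covariance whose top eigenvector aligns with +/- x_bar in expectation.
M = (A * b[:, None]).T @ A / m
eigvals, eigvecs = np.linalg.eigh(M)
v = eigvecs[:, -1]                        # leading eigenvector (eigh sorts ascending)
x0 = v * np.sqrt(np.mean(b))              # crude scale estimate, since E[b] = ||x_bar||^2

print("|<v, x_bar>| =", abs(v @ x_bar))   # reasonably close to 1 at this oversampling
```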

RNA reconstruction (Duchi-Ruan '17)

n = 222, m = 3n. [Figure: (a) x_0, (b) 10 inaccurate solves, (c) one accurate solve, (d) original image.]

Summary

1. Taylor-like models: step-size, stationarity, error bounds.
2. Convex composite g + h ∘ c: prox-linear method, local linear/quadratic rates.

Other recent works:
1. First-order complexity & acceleration (Paquette, 2:00-2:25)
2. Stochastic prox-linear algorithms (Duchi-Ruan '17)
3. Robust phase retrieval (Duchi-Ruan '17)

Thank you!

References

Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria. D-Ioffe-Lewis, 2016, arXiv.

Error bounds, quadratic growth, and linear convergence of proximal methods. D-Lewis, 2016, arXiv.

Efficiency of minimizing compositions of convex functions and smooth maps. D-Paquette, 2016, arXiv.
