Accelerated primal-dual methods for linearly constrained convex problems
Slide 1: Accelerated primal-dual methods for linearly constrained convex problems

Yangyang Xu, SIAM Conference on Optimization, May 24
Slide 2: Accelerated proximal gradient

For the convex composite problem
$$\min_x\; F(x) := f(x) + g(x),$$
- f: convex and Lipschitz differentiable
- g: closed convex (possibly nondifferentiable) and simple

Proximal gradient:
$$x^{k+1} = \arg\min_x\; \langle \nabla f(x^k), x\rangle + \frac{L_f}{2}\|x - x^k\|^2 + g(x)$$
convergence rate: $F(x^k) - F(x^*) = O(1/k)$.

Accelerated proximal gradient [Beck-Teboulle 09, Nesterov 14]: with $\hat x^k$ an extrapolated point,
$$x^{k+1} = \arg\min_x\; \langle \nabla f(\hat x^k), x\rangle + \frac{L_f}{2}\|x - \hat x^k\|^2 + g(x)$$
convergence rate (with smart extrapolation): $F(x^k) - F(x^*) = O(1/k^2)$.

This talk: ways to accelerate primal-dual methods.
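To make the two updates concrete, here is a minimal NumPy sketch of proximal gradient and its FISTA-style accelerated variant. The oracle names (`gradf`, `prox_g`) and the extrapolation schedule are illustrative assumptions, not notation from the talk.

```python
import numpy as np

def apg(x0, gradf, prox_g, Lf, iters=500, accelerate=True):
    """min_x f(x) + g(x) by (accelerated) proximal gradient.

    gradf(x)      -- gradient of the smooth part f
    prox_g(v, t)  -- argmin_x g(x) + (1/(2t))*||x - v||^2
    Lf            -- Lipschitz constant of gradf
    """
    x, x_prev = x0.copy(), x0.copy()
    t_prev = 1.0
    for _ in range(iters):
        t = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t_prev**2))
        # extrapolated point \hat{x}^k; with accelerate=False this reduces to
        # plain proximal gradient, x^{k+1} = prox_g(x^k - gradf(x^k)/Lf)
        w = (t_prev - 1.0) / t if accelerate else 0.0
        xhat = x + w * (x - x_prev)
        x_prev = x
        x = prox_g(xhat - gradf(xhat) / Lf, 1.0 / Lf)
        t_prev = t
    return x
```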
Slide 3: Part I: accelerated linearized augmented Lagrangian
Slide 4: Affinely constrained composite convex problems

$$\min_x\; F(x) = f(x) + g(x), \quad \text{subject to } Ax = b \tag{LCP}$$
- f: convex and Lipschitz differentiable
- g: closed convex and simple

Examples:
- nonnegative quadratic programming: $f = \frac{1}{2}x^\top Q x + c^\top x$, $g = \iota_{\mathbb{R}^n_+}$
- TV image denoising: $\min\{\frac{1}{2}\|X - B\|_F^2 + \lambda\|Y\|_1,\ \text{s.t. } \mathcal{D}(X) = Y\}$
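For the nonnegative QP example, the two oracles that the methods in this part need are one-liners each; this is a hypothetical instantiation for illustration (Q and c are your problem data).

```python
import numpy as np

# Nonnegative QP: f(x) = 0.5*x'Qx + c'x, g = indicator of the
# nonnegative orthant, constraint Ax = b.
def make_nqp_oracles(Q, c):
    gradf = lambda x: Q @ x + c                # gradient of the smooth part
    prox_g = lambda v, t: np.maximum(v, 0.0)   # prox of the indicator = projection
    return gradf, prox_g
```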
Slide 5: Augmented Lagrangian method (ALM)

At iteration k,
$$x^{k+1} \gets \arg\min_x\; f(x) + g(x) - \langle \lambda^k, Ax\rangle + \frac{\beta}{2}\|Ax - b\|^2,$$
$$\lambda^{k+1} \gets \lambda^k - \gamma(Ax^{k+1} - b) \quad \text{(augmented dual gradient ascent with stepsize } \gamma\text{)}.$$
- $\beta$: penalty parameter; the dual gradient has Lipschitz constant $1/\beta$
- $0 < \gamma < 2\beta$: convergence guaranteed
- also popular for (nonlinear, nonconvex) constrained problems
- but the x-subproblem is as difficult as the original problem
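A minimal sketch of the ALM loop, assuming a user-supplied oracle `solve_x(lam)` that returns the minimizer of the augmented Lagrangian in x; that oracle, not the loop, is the expensive part the slide warns about.

```python
import numpy as np

def alm(A, b, solve_x, gamma, iters=100):
    """min f(x)+g(x) s.t. Ax = b, via the augmented Lagrangian method.

    solve_x(lam) -- argmin_x f(x) + g(x) - <lam, Ax> + (beta/2)*||Ax - b||^2
                    (the penalty beta is baked into the oracle)
    gamma        -- dual stepsize; the theory asks for 0 < gamma < 2*beta
    """
    lam = np.zeros_like(b)
    for _ in range(iters):
        x = solve_x(lam)
        lam = lam - gamma * (A @ x - b)   # dual gradient ascent on the residual
    return x, lam
```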
Slide 6: Linearized augmented Lagrangian method

Linearize the smooth term f:
$$x^{k+1} \gets \arg\min_x\; \langle \nabla f(x^k), x\rangle + \frac{\eta}{2}\|x - x^k\|^2 + g(x) - \langle \lambda^k, Ax\rangle + \frac{\beta}{2}\|Ax - b\|^2.$$

Linearize both f and $\|Ax - b\|^2$:
$$x^{k+1} \gets \arg\min_x\; \langle \nabla f(x^k), x\rangle + g(x) - \langle \lambda^k, Ax\rangle + \beta\langle A^\top r^k, x\rangle + \frac{\eta}{2}\|x - x^k\|^2,$$
where $r^k = Ax^k - b$ is the residual.

Easier updates and a good convergence rate: $O(1/k)$.
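With both f and the augmented term linearized, the x-update collapses to a single prox step, so each iteration costs no more than a proximal gradient step. A sketch under the same oracle assumptions as above:

```python
def linearized_alm_step(x, lam, A, b, gradf, prox_g, beta, eta, gamma):
    """One iteration of fully linearized ALM (f and ||Ax-b||^2 both linearized)."""
    r = A @ x - b                                  # residual r^k
    v = gradf(x) - A.T @ lam + beta * (A.T @ r)    # coefficient of the linear terms
    x_new = prox_g(x - v / eta, 1.0 / eta)         # single prox step
    lam_new = lam - gamma * (A @ x_new - b)        # dual update as in ALM
    return x_new, lam_new
```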
Slide 7: Accelerated linearized augmented Lagrangian method

At iteration k,
$$\hat x^k \gets (1-\alpha_k)\,\bar x^k + \alpha_k x^k,$$
$$x^{k+1} \gets \arg\min_x\; \langle \nabla f(\hat x^k) - A^\top \lambda^k, x\rangle + g(x) + \frac{\beta_k}{2}\|Ax - b\|^2 + \frac{\eta_k}{2}\|x - x^k\|^2,$$
$$\bar x^{k+1} \gets (1-\alpha_k)\,\bar x^k + \alpha_k x^{k+1},$$
$$\lambda^{k+1} \gets \lambda^k - \gamma_k(Ax^{k+1} - b).$$
- inspired by [Lan 12] on accelerated stochastic approximation
- reduces to linearized ALM if $\alpha_k = 1$, $\beta_k = \beta$, $\eta_k = \eta$, $\gamma_k = \gamma$ for all k
- convergence rate: $O(1/k)$ if $\eta \ge L_f$ and $0 < \gamma < 2\beta$
- adaptive parameters give $O(1/k^2)$ (next slides); a code sketch of the loop follows
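A sketch of the accelerated loop, with the parameter sequences passed as callables so that the $O(1/k^2)$ schedule from the next theorem can be plugged in. The x-subproblem keeps the augmented term, so it is left as an assumed oracle `solve_x`.

```python
import numpy as np

def acc_lin_alm(x0, A, b, solve_x, alpha, beta, eta, gamma, iters=100):
    """Accelerated linearized ALM (slide 7).

    solve_x(xhat, xk, lam, bk, ek) -- argmin_x <gradf(xhat) - A'lam, x> + g(x)
                                      + (bk/2)*||Ax - b||^2 + (ek/2)*||x - xk||^2
    alpha, beta, eta, gamma        -- callables k -> alpha_k, beta_k, eta_k, gamma_k
    """
    x = x0.copy()
    xbar = x0.copy()
    lam = np.zeros_like(b)
    for k in range(1, iters + 1):
        ak = alpha(k)
        xhat = (1 - ak) * xbar + ak * x      # extrapolated point \hat{x}^k
        x = solve_x(xhat, x, lam, beta(k), eta(k))
        xbar = (1 - ak) * xbar + ak * x      # averaged iterate \bar{x}^{k+1}
        lam = lam - gamma(k) * (A @ x - b)
    return xbar, lam
```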
Slide 8: Better numerical performance

[Figure: objective minus optimal value (left) and violation of feasibility (right) vs. iteration number, nonaccelerated ALM vs. accelerated ALM]
- tested on quadratic programming (subproblems solved exactly)
- parameters set according to the theorem (see next slide)
- accelerated ALM is significantly better
Slide 9: Guaranteed fast convergence

Assumptions:
- there is a pair of primal-dual solutions $(x^*, \lambda^*)$
- $\nabla f$ is Lipschitz continuous: $\|\nabla f(x) - \nabla f(y)\| \le L_f\|x - y\|$

Convergence rate of order $O(1/k^2)$: set the parameters to
$$\alpha_k = \frac{2}{k+1}, \quad \gamma_k = k\gamma, \quad \beta_k \ge \frac{\gamma_k}{2}, \quad \eta_k = \frac{\eta}{k}, \quad \forall k,$$
where $\gamma > 0$ and $\eta \ge 2L_f$. Then
$$F(\bar x^{k+1}) - F(x^*) \le \frac{1}{k(k+1)}\left(\eta\|x^1 - x^*\|^2 + \frac{\|\bar\lambda\|^2}{\gamma}\right),$$
$$\|A\bar x^{k+1} - b\| \le \frac{1}{k(k+1)\max(1, \|\lambda^*\|)}\left(\eta\|x^1 - x^*\|^2 + \frac{\|\bar\lambda\|^2}{\gamma}\right),$$
where $\|\bar\lambda\| = \max(1 + \|\lambda^*\|,\, 2\|\lambda^*\|)$.
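This schedule drops straight into the `acc_lin_alm` sketch above as callables; `Lf` is a placeholder for your problem's Lipschitz constant, and $\beta_k$ is set to its smallest admissible value.

```python
# O(1/k^2) parameter schedule from the theorem on this slide
Lf = 1.0                                # placeholder: Lipschitz constant of gradf
gamma0, eta0 = 1.0, 2.0 * Lf            # any gamma0 > 0 and eta0 >= 2*Lf
alpha = lambda k: 2.0 / (k + 1)
gamma = lambda k: k * gamma0
beta  = lambda k: k * gamma0 / 2.0      # smallest choice with beta_k >= gamma_k/2
eta   = lambda k: eta0 / k
```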
Slide 10: Sketch of proof

Let $\Phi(\bar x, x, \lambda) = F(\bar x) - F(x) - \langle \lambda, A\bar x - b\rangle$.

1. Fundamental inequality (for any $\lambda$):
$$\begin{aligned}
\Phi(\bar x^{k+1}, x, \lambda) \le\ & (1-\alpha_k)\,\Phi(\bar x^k, x, \lambda) \\
& - \frac{\alpha_k\eta_k}{2}\Big[\|x^{k+1} - x\|^2 - \|x^k - x\|^2 + \|x^{k+1} - x^k\|^2\Big] + \frac{\alpha_k^2 L_f}{2}\|x^{k+1} - x^k\|^2 \\
& + \frac{\alpha_k}{2\gamma_k}\Big[\|\lambda^k - \lambda\|^2 - \|\lambda^{k+1} - \lambda\|^2 + \|\lambda^{k+1} - \lambda^k\|^2\Big] - \frac{\alpha_k\beta_k}{\gamma_k^2}\|\lambda^{k+1} - \lambda^k\|^2, \quad \forall k.
\end{aligned}$$

2. Take $\alpha_k = \frac{2}{k+1}$, $\gamma_k = k\gamma$, $\beta_k \ge \frac{\gamma_k}{2}$, $\eta_k = \frac{\eta}{k}$ and multiply the inequality by $k(k+1)$:
$$k(k+1)\,\Phi(\bar x^{k+1}, x, \lambda) - k(k-1)\,\Phi(\bar x^k, x, \lambda) \le \eta\Big[\|x^k - x\|^2 - \|x^{k+1} - x\|^2\Big] + \frac{1}{\gamma}\Big[\|\lambda^k - \lambda\|^2 - \|\lambda^{k+1} - \lambda\|^2\Big].$$

3. Set $\lambda^1 = 0$ and sum the inequality over k:
$$\Phi(\bar x^{k+1}, x^*, \lambda) \le \frac{1}{k(k+1)}\left(\eta\|x^1 - x^*\|^2 + \frac{\|\lambda\|^2}{\gamma}\right).$$

4. Take $\bar\lambda = \max(1 + \|\lambda^*\|,\, 2\|\lambda^*\|)\,\frac{A\bar x^{k+1} - b}{\|A\bar x^{k+1} - b\|}$ and use the optimality condition $\Phi(\bar x, x^*, \lambda^*) \ge 0$, which gives $F(\bar x^{k+1}) - F(x^*) \ge -\|\lambda^*\|\,\|A\bar x^{k+1} - b\|$.
Slide 11: Literature
- [He-Yuan 10]: accelerated ALM to $O(1/k^2)$ for smooth problems
- [Kang et al. 13]: accelerated ALM to $O(1/k^2)$ for nonsmooth problems
- [Huang-Ma-Goldfarb 13]: accelerated linearized ALM (with linearization of the augmented term) to $O(1/k^2)$ for strongly convex problems
- [Li-Lin 16]: with mere (weak) convexity, $O(1/k)$ is optimal if the augmented term is linearized
Slide 12: Part II: accelerated linearized ADMM
Slide 13: Two-block structured problems

The variable is partitioned into two blocks, the smooth part involves one block, and the nonsmooth part is separable:
$$\min_{y,z}\; h(y) + f(z) + g(z), \quad \text{subject to } By + Cz = b \tag{LCP-2}$$
- f: convex and Lipschitz differentiable
- g and h: closed convex and simple

Example: total-variation regularized regression:
$$\min_{y,z}\; \lambda\|y\|_1 + f(z), \quad \text{s.t. } Dz = y.$$
Slide 14: Alternating direction method of multipliers (ADMM)

At iteration k,
$$y^{k+1} \gets \arg\min_y\; h(y) - \langle \lambda^k, By\rangle + \frac{\beta}{2}\|By + Cz^k - b\|^2,$$
$$z^{k+1} \gets \arg\min_z\; f(z) + g(z) - \langle \lambda^k, Cz\rangle + \frac{\beta}{2}\|By^{k+1} + Cz - b\|^2,$$
$$\lambda^{k+1} \gets \lambda^k - \gamma(By^{k+1} + Cz^{k+1} - b).$$
- $0 < \gamma < \beta$: convergence guaranteed [Glowinski-Marrocco 75]
- updating y and z alternately is easier than updating them jointly
- but the z-subproblem can still be difficult
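A sketch of the ADMM loop with the two block subproblems as assumed oracles; as with ALM, the loop itself is trivial and the cost sits in the oracles.

```python
import numpy as np

def admm(B, C, b, solve_y, solve_z, gamma, z0, iters=100):
    """min h(y)+f(z)+g(z) s.t. By + Cz = b, in slide 14's update order.

    solve_y(lam, z) -- argmin_y h(y) - <lam, By> + (beta/2)*||By + Cz - b||^2
    solve_z(lam, y) -- argmin_z f(z)+g(z) - <lam, Cz> + (beta/2)*||By + Cz - b||^2
    """
    z = z0.copy()
    lam = np.zeros_like(b)
    for _ in range(iters):
        y = solve_y(lam, z)                        # y-block with z fixed
        z = solve_z(lam, y)                        # z-block with new y fixed
        lam = lam - gamma * (B @ y + C @ z - b)    # dual update
    return y, z, lam
```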
Slide 15: Accelerated linearized ADMM

At iteration k,
$$y^{k+1} \gets \arg\min_y\; h(y) - \langle \lambda^k, By\rangle + \frac{\beta_k}{2}\|By + Cz^k - b\|^2,$$
$$z^{k+1} \gets \arg\min_z\; \langle \nabla f(z^k) - C^\top\lambda^k + \beta_k C^\top r^{k+\frac12}, z\rangle + g(z) + \frac{\eta_k}{2}\|z - z^k\|^2,$$
$$\lambda^{k+1} \gets \lambda^k - \gamma_k(By^{k+1} + Cz^{k+1} - b),$$
where $r^{k+\frac12} = By^{k+1} + Cz^k - b$.
- reduces to linearized ADMM if $\beta_k = \beta$, $\eta_k = \eta$, $\gamma_k = \gamma$ for all k
- convergence rate: $O(1/k)$ if $0 < \gamma \le \beta$ and $\eta \ge L_f + \beta\|C\|^2$
- $O(1/k^2)$ with adaptive parameters and strong convexity on the z-block (next two slides)
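The same loop with the z-subproblem linearized so that the z-update is a single prox step; the parameter sequences are callables as before, and `solve_y`, `gradf`, `prox_g` are assumed oracles.

```python
import numpy as np

def acc_lin_admm(B, C, b, solve_y, gradf, prox_g, z0,
                 beta, eta, gamma, iters=100):
    """Accelerated linearized ADMM (slide 15).

    solve_y(lam, z, bk) -- y-subproblem oracle with penalty bk
    gradf, prox_g       -- oracles for the smooth f and the simple g in z
    beta, eta, gamma    -- callables k -> beta_k, eta_k, gamma_k
    """
    z = z0.copy()
    lam = np.zeros_like(b)
    for k in range(1, iters + 1):
        bk, ek = beta(k), eta(k)
        y = solve_y(lam, z, bk)
        r_half = B @ y + C @ z - b                      # r^{k+1/2}
        v = gradf(z) - C.T @ lam + bk * (C.T @ r_half)  # linearized terms
        z = prox_g(z - v / ek, 1.0 / ek)                # single prox z-update
        lam = lam - gamma(k) * (B @ y + C @ z - b)
    return y, z, lam
```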
Slide 16: Accelerated convergence speed

Assumptions:
- existence of a primal-dual solution $(y^*, z^*, \lambda^*)$
- $\nabla f$ Lipschitz continuous: $\|\nabla f(\hat z) - \nabla f(\tilde z)\| \le L_f\|\hat z - \tilde z\|$
- f strongly convex with modulus $\mu_f$ (not required for the y-block)

Convergence rate of order $O(1/k^2)$: set the parameters to
$$\beta_k = \gamma_k = (k+1)\gamma, \quad \eta_k = (k+1)\eta + L_f, \quad \forall k,$$
with $\gamma > 0$ and $\gamma < \eta \le \mu_f/2$. Then
$$\max\left(\|\bar z^k - z^*\|^2,\; F(\bar y^k, \bar z^k) - F^*,\; \|B\bar y^k + C\bar z^k - b\|\right) \le O(1/k^2),$$
where $F(y, z) = h(y) + f(z) + g(z)$ and $F^* = F(y^*, z^*)$.
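The matching schedule for the `acc_lin_admm` sketch, again as callables; the constants are placeholders to be chosen so that the theorem's conditions hold.

```python
# O(1/k^2) schedule for acc_lin_admm under strong convexity of f (slide 16)
Lf = 1.0                                # placeholder: Lipschitz constant of gradf
gamma0, eta0 = 0.1, 0.5                 # placeholders: gamma0 > 0, gamma0 < eta0 <= mu_f/2
beta  = lambda k: (k + 1) * gamma0      # beta_k = gamma_k = (k+1)*gamma
gamma = lambda k: (k + 1) * gamma0
eta   = lambda k: (k + 1) * eta0 + Lf
```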
Slide 17: Sketch of proof

1. Fundamental inequality from the optimality conditions of each iterate:
$$\begin{aligned}
& F(y^{k+1}, z^{k+1}) - F(y, z) - \langle \lambda,\, By^{k+1} + Cz^{k+1} - b\rangle \\
&\quad \le \Big\langle \tfrac{1}{\gamma_k}(\lambda^k - \lambda^{k+1}),\, \lambda - \lambda^{k+1}\Big\rangle + \Big\langle \tfrac{\beta_k}{\gamma_k}(\lambda^k - \lambda^{k+1}) - \beta_k C(z^{k+1} - z^k),\, C(z^{k+1} - z)\Big\rangle \\
&\qquad + \frac{L_f}{2}\|z^{k+1} - z^k\|^2 - \frac{\mu_f}{2}\|z^k - z\|^2 - \eta_k\langle z^{k+1} - z,\, z^{k+1} - z^k\rangle.
\end{aligned}$$

2. Plug in the parameters and bound the cross terms:
$$\begin{aligned}
& F(y^{k+1}, z^{k+1}) - F(y^*, z^*) - \langle \lambda,\, By^{k+1} + Cz^{k+1} - b\rangle \\
&\quad \le \frac{1}{2}\big(\eta(k+1) + L_f - \mu_f\big)\|z^k - z^*\|^2 + \frac{1}{2\gamma(k+1)}\|\lambda - \lambda^k\|^2 \\
&\qquad - \frac{1}{2}\big(\eta(k+1) + L_f\big)\|z^{k+1} - z^*\|^2 - \frac{1}{2\gamma(k+1)}\|\lambda - \lambda^{k+1}\|^2.
\end{aligned}$$

3. Multiply by $k + k_0$ (here $k_0 \ge \frac{2L_f}{\mu_f}$) and sum the inequality over k:
$$F(\bar y^{k+1}, \bar z^{k+1}) - F(y^*, z^*) - \langle \lambda,\, B\bar y^{k+1} + C\bar z^{k+1} - b\rangle \le \frac{\phi(y^*, z^*, \lambda)}{k^2}.$$

4. Take a special $\lambda$ and use the KKT conditions.
Slide 18: Literature
- [Ouyang et al. 15]: $O(L_f/k^2 + C_0/k)$ with only weak convexity
- [Goldstein et al. 14]: $O(1/k^2)$ with strong convexity on both y and z
- [Li-Lin 16]: $O(1/k)$ is optimal with only weak convexity; impossible to improve $O(1/k)$ without additional assumptions
- [Chambolle-Pock 11, Chambolle-Pock 16, Dang-Lan 14, Bredies-Sun 16]: accelerated first-order methods for bilinear saddle-point problems

Open question: the weakest conditions under which $O(1/k^2)$ is attainable.
Slide 19: Numerical experiments (more results in the paper)
Slide 20: Accelerated (linearized) ADMM

Tested problem: total-variation regularized image denoising
$$\min_{X,Y}\; \frac{1}{2}\|X - B\|_F^2 + \mu\|Y\|_1, \quad \text{subject to } \mathcal{D}(X) = Y, \tag{TVDN}$$
where B is the observed noisy Cameraman image and $\mathcal{D}$ is the finite difference operator.

Compared methods:
- original ADMM
- accelerated ADMM
- linearized ADMM
- accelerated linearized ADMM
- accelerated Chambolle-Pock
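As a runnable illustration of the structure of (TVDN), here is ADMM on its 1-D analogue (a noisy signal instead of the Cameraman image, with D the forward-difference matrix). The dense solve is for clarity only; on images one would use a structured or iterative solver.

```python
import numpy as np

def tv_denoise_admm(b, mu=1.0, beta=1.0, gamma=1.0, iters=300):
    """ADMM sketch for the 1-D analogue of (TVDN):
        min_{x,y} 0.5*||x - b||^2 + mu*||y||_1  s.t.  D x = y,
    with D the forward-difference operator.
    """
    n = b.size
    D = np.eye(n, k=1)[:-1] - np.eye(n)[:-1]   # (n-1) x n difference matrix
    M = np.eye(n) + beta * (D.T @ D)           # x-update system matrix
    x, y = b.copy(), D @ b
    lam = np.zeros(n - 1)
    for _ in range(iters):
        # x-update: (I + beta*D'D) x = b + D'(lam + beta*y)
        x = np.linalg.solve(M, b + D.T @ (lam + beta * y))
        # y-update: soft-thresholding of Dx - lam/beta at level mu/beta
        v = D @ x - lam / beta
        y = np.sign(v) * np.maximum(np.abs(v) - mu / beta, 0.0)
        # dual update with stepsize gamma (slide's sign convention)
        lam = lam - gamma * (D @ x - y)
    return x

# usage: denoise a noisy piecewise-constant signal
# b = np.repeat([0., 1., 0.], 50) + 0.1 * np.random.randn(150)
# x = tv_denoise_admm(b, mu=0.5)
```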
Slide 21: Performance of compared methods

[Figure: objective minus optimal value vs. iteration number (left) and vs. running time in seconds (right), for accelerated ADMM, accelerated linearized ADMM, nonaccelerated ADMM, nonaccelerated linearized ADMM, and Chambolle-Pock]
- accelerated (linearized) ADMM is significantly better than its nonaccelerated counterpart
- (accelerated) ADMM beats (accelerated) linearized ADMM in iteration count, but the latter takes less running time
Slide 22: Conclusions
- accelerated linearized ALM from $O(1/k)$ to $O(1/k^2)$ under mere convexity
- accelerated (linearized) ADMM from $O(1/k)$ to $O(1/k^2)$ under strong convexity of one block variable
- numerical experiments confirm the acceleration
Slide 23: References
1. Y. Xu. Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming. SIAM J. Optimization, 2017.
2. T. Goldstein, B. O'Donoghue, S. Setzer, and R. Baraniuk. Fast alternating direction optimization methods. SIAM J. Imaging Sciences, 2014.
3. B. He and X. Yuan. On the acceleration of augmented Lagrangian method for linearly constrained optimization. Optimization Online, 2010.
4. B. Huang, S. Ma, and D. Goldfarb. Accelerated linearized Bregman method. Journal of Scientific Computing, 2013.
5. M. Kang, S. Yun, H. Woo, and M. Kang. Accelerated Bregman method for linearly constrained $\ell_1$-$\ell_2$ minimization. Journal of Scientific Computing, 2013.