Introduction to Alternating Direction Method of Multipliers
1 Introduction to Alternating Direction Method of Multipliers
Yale Chang
Machine Learning Group Meeting
September 29, 2016
2 Formulation

min_{x,z} f(x) + g(z)   s.t.   Ax + Bz = c   (1)

where x ∈ R^n, z ∈ R^m, A ∈ R^{p×n}, B ∈ R^{p×m}, c ∈ R^p.
f : R^n → R ∪ {+∞} and g : R^m → R ∪ {+∞} are closed, proper, and convex; equivalently, the epigraphs of f and g are closed nonempty convex sets (epigraph f = {(x, t) ∈ R^n × R | f(x) ≤ t}).
f and g can be non-differentiable and can take the value +∞ (e.g., the indicator function I(x) = 0 if x ∈ C and I(x) = +∞ otherwise).
We also assume the strong duality condition holds (Slater's condition).
3 ADMM Iterations: Unscaled Form

The augmented Lagrangian of problem (1) is

L_ρ(x, z, y) = f(x) + g(z) + y^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||_2^2   (2)

The ADMM iterations are

x^{k+1} = argmin_x L_ρ(x, z^k, y^k)   (3)
z^{k+1} = argmin_z L_ρ(x^{k+1}, z, y^k)   (4)
y^{k+1} = y^k + ρ (Ax^{k+1} + Bz^{k+1} − c)   (5)

The algorithm state is (z^k, y^k).
4 ADMM Iterations: Scaled Form

Let r = Ax + Bz − c and u = y/ρ. The augmented Lagrangian can be written as

L_ρ(x, z, y) = f(x) + g(z) + y^T r + (ρ/2) ||r||_2^2
             = f(x) + g(z) + (ρ/2) ||r + y/ρ||_2^2 − (1/(2ρ)) ||y||_2^2
             = f(x) + g(z) + (ρ/2) ||r + u||_2^2 − (ρ/2) ||u||_2^2

The ADMM iterations become

x^{k+1} = argmin_x f(x) + (ρ/2) ||Ax + Bz^k − c + u^k||_2^2   (6)
z^{k+1} = argmin_z g(z) + (ρ/2) ||Ax^{k+1} + Bz − c + u^k||_2^2   (7)
u^{k+1} = u^k + Ax^{k+1} + Bz^{k+1} − c   (8)

In practice, we prefer the scaled form because it is shorter and easier to work with.
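As a reference, iterations (6)-(8) can be organized as one small generic loop. The Python/NumPy sketch below is only an illustration: prox_f and prox_g are hypothetical callables (not from the slides) that return the minimizers of (6) and (7) for a given constant vector, with the penalty ρ absorbed into them.

import numpy as np

def admm_scaled(prox_f, prox_g, A, B, c, iters=100):
    # prox_f(v) returns argmin_x f(x) + (rho/2)*||A x - v||^2
    # prox_g(v) returns argmin_z g(z) + (rho/2)*||B z - v||^2
    x = np.zeros(A.shape[1])
    z = np.zeros(B.shape[1])
    u = np.zeros(c.shape[0])             # scaled dual variable u = y / rho
    for _ in range(iters):
        x = prox_f(c - B @ z - u)        # x-update, Eq. (6)
        z = prox_g(c - A @ x - u)        # z-update, Eq. (7)
        u = u + A @ x + B @ z - c        # dual update, Eq. (8)
    return x, z, u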
5 Stopping Criteria

The following inequality always holds:

f(x^k) + g(z^k) − p* ≤ −(y^k)^T r^k + (x^k − x*)^T s^k   (9)

where p* = inf{f(x) + g(z) | Ax + Bz = c} and x* is the optimal value of x when f(x) + g(z) achieves the optimal objective p*.
r^k = Ax^k + Bz^k − c is called the primal residual, and s^k = ρ A^T B (z^k − z^{k−1}) is called the dual residual.
If ||r^k||_2 ≤ ε_primal and ||s^k||_2 ≤ ε_dual both hold, the iterations can stop.
There are some suggestions on how to set ε_primal and ε_dual in practice; you can refer to Boyd's tutorial for details.
We will skip the proof of convergence for now.
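A sketch of how this stopping test could be implemented follows. The absolute/relative rule for ε_primal and ε_dual is patterned after the suggestion in Boyd's tutorial; the tolerance values eps_abs and eps_rel are illustrative choices, not prescribed by the slides.

import numpy as np

def should_stop(A, B, c, x, z, z_prev, u, rho, eps_abs=1e-4, eps_rel=1e-3):
    r = A @ x + B @ z - c                          # primal residual r^k
    s = rho * A.T @ (B @ (z - z_prev))             # dual residual s^k
    p, n = c.shape[0], x.shape[0]
    eps_pri = np.sqrt(p) * eps_abs + eps_rel * max(
        np.linalg.norm(A @ x), np.linalg.norm(B @ z), np.linalg.norm(c))
    eps_dual = np.sqrt(n) * eps_abs + eps_rel * np.linalg.norm(rho * A.T @ u)
    return np.linalg.norm(r) <= eps_pri and np.linalg.norm(s) <= eps_dual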
6 General Patterns: Proximal Operator

How do we solve each individual step in the iterations? The x-update step can be expressed as

x^+ = argmin_x f(x) + (ρ/2) ||Ax − v||_2^2   (10)

where v = −Bz + c − u is a known constant in this step.
Case 1: Proximal Operator. Assume A = I, which appears frequently in examples; then we have

x^+ = argmin_x f(x) + (ρ/2) ||x − v||_2^2 = prox_{f,ρ}(v)

where prox_{f,ρ} is called the proximal operator of f with penalty ρ. When the function f is simple enough, the proximal operator can be evaluated analytically.
7 Proximal Operator: Examples

Indicator function: f(x) = 0 if x ∈ C and f(x) = +∞ otherwise, where C is a closed nonempty convex set. Then the x-update is

x^+ = prox_{f,ρ}(v) = Π_C(v)

where Π_C denotes the projection onto the set C (in the Euclidean norm).

1) Hyper-rectangle C = {x | l ≤ x ≤ u} (recall the unit ball of l_∞):
(Π_C(v))_k = l_k if v_k ≤ l_k;  v_k if l_k ≤ v_k ≤ u_k;  u_k if v_k ≥ u_k.

2) Affine set C = {x ∈ R^n | Ax = b} (A ∈ R^{m×n}):
x = Π_C(v) = v − A^†(Av − b), where A^† is the pseudoinverse of A.

3) Positive semidefinite cone C = S^n_+:
Π_C(V) = Σ_{i=1}^n (λ_i)_+ u_i u_i^T, where V = Σ_{i=1}^n λ_i u_i u_i^T.

More examples can be found in Proximal Algorithms by Parikh and Boyd.
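The three projections above are only a few lines each; a minimal sketch in Python/NumPy, assuming dense arrays and a symmetric input matrix for the PSD case:

import numpy as np

def project_box(v, l, u):
    # projection onto the hyper-rectangle {x : l <= x <= u}
    return np.clip(v, l, u)

def project_affine(v, A, b):
    # projection onto {x : Ax = b} via the pseudoinverse of A
    return v - np.linalg.pinv(A) @ (A @ v - b)

def project_psd(V):
    # projection of a symmetric matrix V onto the PSD cone S^n_+
    lam, U = np.linalg.eigh(V)                 # V = U diag(lam) U^T
    return (U * np.maximum(lam, 0.0)) @ U.T    # keep only nonnegative eigenvalues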
8 General Patterns: Quadratic Objective

Suppose f(x) = (1/2) x^T P x + q^T x + r with P ∈ S^n_+. The x-update becomes

x^+ = argmin_x (1/2) x^T P x + q^T x + r + (ρ/2) ||Ax − v||_2^2
    = argmin_x (1/2) x^T (P + ρ A^T A) x − (ρ A^T v − q)^T x
    = (P + ρ A^T A)^{-1} (ρ A^T v − q)

Here we assume F = P + ρ A^T A is invertible. Computing x^+ requires solving the linear system Fx = g, where g = ρ A^T v − q.

Exploiting sparsity: when F is sparse, the computation cost can be effectively reduced.
Caching: when multiple linear systems with the same coefficient matrix F need to be solved, the factorization of F can be computed once and saved for later use.
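The caching idea could look like the sketch below, which assumes F = P + ρ A^T A is positive definite so a Cholesky factorization applies; the factorization is computed once and reused for every x-update.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def make_quadratic_x_update(P, A, q, rho):
    # factor F = P + rho * A^T A once, reuse it across ADMM iterations
    F_chol = cho_factor(P + rho * A.T @ A)
    def x_update(v):
        # solve F x = rho * A^T v - q using the cached factorization
        return cho_solve(F_chol, rho * A.T @ v - q)
    return x_update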
9 General Patterns: Smooth Objective Terms

When f is smooth, general iterative methods can be used to carry out the x-minimization step; examples include gradient descent, the conjugate gradient method, and L-BFGS. The presence of the quadratic penalty term tends to improve the convergence of these iterative algorithms.

There are a few techniques to speed up the convergence of the ADMM iterations:
Early termination: early termination in the x- or z-updates can result in more ADMM iterations but lower cost per iteration, giving an overall improvement in efficiency.
Warm start: initialize the iterative method using the solution obtained from the previous iteration.
10 General Patterns: Decomposition

Block separability: suppose x ∈ R^n can be partitioned into N subvectors x = (x_1, ..., x_N) and that f is separable w.r.t. this partition, i.e., f(x) = f_1(x_1) + ... + f_N(x_N), where x_i ∈ R^{n_i} and Σ_{i=1}^N n_i = n. If the quadratic term ||Ax||_2^2 is also separable w.r.t. the partition, then the augmented Lagrangian L_ρ is separable, and the x-update can be carried out in parallel.

One example is f(x) = λ ||x||_1 (λ > 0) and A = I. In this case the x_i-update is

x_i^+ = argmin_{x_i} λ |x_i| + (ρ/2) (x_i − v_i)^2.

Using subdifferential calculus, the solution is x_i^+ = S_{λ/ρ}(v_i), where the soft-thresholding operator S is defined as

S_κ(a) = a − κ if a > κ;  0 if |a| ≤ κ;  a + κ if a < −κ.
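A vectorized form of S_κ, which reappears in the l_1 and covariance-selection examples later, can be written in one line; a minimal sketch:

import numpy as np

def soft_threshold(a, kappa):
    # elementwise soft-thresholding operator S_kappa (the prox of kappa * ||.||_1)
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)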
11 Apply ADMM to Constrained Convex Optimization

The generic constrained convex optimization problem is

min_x f(x)   s.t.   x ∈ C   (11)

where x ∈ R^n, and f and C are convex. Its ADMM form can be written as

min_{x,z} f(x) + g(z)   s.t.   x − z = 0   (12)

where g is the indicator function of C. The augmented Lagrangian is L_ρ(x, z, u) = f(x) + g(z) + (ρ/2) ||x − z + u||_2^2, so the scaled form of ADMM is

x^{k+1} = argmin_x f(x) + (ρ/2) ||x − z^k + u^k||_2^2 = prox_{f,ρ}(z^k − u^k)
z^{k+1} = Π_C(x^{k+1} + u^k)
u^{k+1} = u^k + x^{k+1} − z^{k+1}

Note that here f need not be smooth.
12 Example: Convex Feasibility

Find a point in the intersection of two closed nonempty convex sets C and D. In ADMM, the problem can be written as

min_{x,z} f(x) + g(z)   s.t.   x − z = 0

where f is the indicator function of C and g is the indicator function of D. The scaled form of ADMM is

x^{k+1} = Π_C(z^k − u^k)
z^{k+1} = Π_D(x^{k+1} + u^k)
u^{k+1} = u^k + x^{k+1} − z^{k+1}
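These iterations only require the two projections. The sketch below illustrates them on an assumed toy instance (a box intersected with a hyperplane); the specific sets are not from the slides.

import numpy as np

def admm_feasibility(proj_C, proj_D, n, iters=200):
    # find a point in C ∩ D given Euclidean projections onto C and D
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(iters):
        x = proj_C(z - u)
        z = proj_D(x + u)
        u = u + x - z
    return x

# toy example: the box [0, 1]^3 intersected with the plane {x : sum(x) = 1.5}
A, b = np.ones((1, 3)), np.array([1.5])
point = admm_feasibility(lambda v: np.clip(v, 0.0, 1.0),
                         lambda v: v - np.linalg.pinv(A) @ (A @ v - b), n=3)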
13 Example: Linear and Quadratic Programming

min_x (1/2) x^T P x + q^T x   s.t.   Ax = b, x ≥ 0

where x ∈ R^n and P ∈ S^n_+; when P = 0, this reduces to linear programming. The ADMM form is

min_{x,z} f(x) + g(z)   s.t.   x − z = 0

where f(x) = (1/2) x^T P x + q^T x with dom f = {x | Ax = b}, and g is the indicator function of the nonnegative orthant R^n_+. The scaled form of ADMM has iterations

x^{k+1} = argmin_{x: Ax = b} f(x) + (ρ/2) ||x − z^k + u^k||_2^2
z^{k+1} = Π_{R^n_+}(x^{k+1} + u^k) = (x^{k+1} + u^k)_+
u^{k+1} = u^k + x^{k+1} − z^{k+1}
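One way the equality-constrained x-update could be carried out is by solving the KKT system of the quadratic subproblem, with v = z^k − u^k. This is a sketch under the assumption that the KKT matrix is nonsingular; in practice its factorization could also be cached as described earlier.

import numpy as np

def qp_x_update(P, q, A, b, rho, v):
    # solve  min 0.5 x^T P x + q^T x + (rho/2)||x - v||^2  s.t.  Ax = b
    # via the KKT system  [[P + rho*I, A^T], [A, 0]] [x; nu] = [rho*v - q; b]
    n, m = P.shape[0], A.shape[0]
    K = np.block([[P + rho * np.eye(n), A.T],
                  [A, np.zeros((m, m))]])
    rhs = np.concatenate([rho * v - q, b])
    return np.linalg.solve(K, rhs)[:n]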
14 Example: l_1-Norm Problems

A lot of machine learning algorithms involve minimizing a loss function with a regularization term or side constraints. Both the loss function and the regularization term can be nonsmooth. We use l_1-norm problems to show how to apply ADMM to solve such problems:

min_x l(x) + λ ||x||_1   (13)

where l is any convex loss function. The ADMM form is

min_{x,z} l(x) + g(z)   s.t.   x − z = 0

where g(z) = λ ||z||_1. The ADMM iterations are

x^{k+1} = argmin_x l(x) + (ρ/2) ||x − z^k + u^k||_2^2 = prox_{l,ρ}(z^k − u^k)
z^{k+1} = S_{λ/ρ}(x^{k+1} + u^k)
u^{k+1} = u^k + x^{k+1} − z^{k+1}
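For instance, taking l(x) = (1/2)||Mx − b||_2^2 gives the lasso (M and b here are assumed data, not defined in the slides); the x-update then reduces to a linear solve with a fixed coefficient matrix. A minimal sketch:

import numpy as np

def lasso_admm(M, b, lam, rho=1.0, iters=500):
    # min 0.5*||Mx - b||^2 + lam*||x||_1 as an instance of problem (13)
    n = M.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    F = M.T @ M + rho * np.eye(n)          # fixed coefficient matrix (could be factored once)
    Mtb = M.T @ b
    for _ in range(iters):
        x = np.linalg.solve(F, Mtb + rho * (z - u))              # prox of the quadratic loss
        w = x + u
        z = np.sign(w) * np.maximum(np.abs(w) - lam / rho, 0.0)  # S_{lam/rho}
        u = u + x - z
    return z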
15 Example: Sparse Inverse Covariance Selection

Given a dataset consisting of samples from a zero-mean Gaussian distribution in R^n: a_i ~ N(0, Σ), i = 1, ..., N. (Σ^{-1})_{ij} is zero if and only if the i-th and j-th components of the random variable are conditionally independent given the other variables.

Let X = Σ^{-1} and let the objective be the negative log-likelihood of the data with l_1 regularization on X. With S the empirical covariance of the samples, the objective can be written as

min_X Tr(SX) − log det X + λ ||X||_1

where X ∈ S^n_+, ||·||_1 is defined elementwise, and the domain of log det X is S^n_{++}, the set of symmetric positive definite n×n matrices. The ADMM steps are

X^{k+1} = argmin_{X ∈ S^n_{++}} Tr(SX) − log det X + (ρ/2) ||X − Z^k + U^k||_F^2
Z^{k+1} = argmin_Z λ ||Z||_1 + (ρ/2) ||X^{k+1} − Z + U^k||_F^2
U^{k+1} = U^k + X^{k+1} − Z^{k+1}
16 Example: Sparse Inverse Covariance Selection (continued)

The Z-minimization step can be solved by elementwise soft thresholding:

Z^{k+1}_{ij} = S_{λ/ρ}(X^{k+1}_{ij} + U^k_{ij})

The X-minimization step also has an analytical solution. Using the first-order optimality condition S − X^{-1} + ρ(X − Z^k + U^k) = 0, we will construct a matrix X satisfying ρX − X^{-1} = ρ(Z^k − U^k) − S and X ≻ 0. Let the right-hand side be factorized as QΛQ^T (Λ = diag(λ_1, ..., λ_n), Q^T Q = QQ^T = I). Then we have ρX̃ − X̃^{-1} = Λ, where X̃ = Q^T X Q. We can construct a diagonal X̃ by finding numbers X̃_ii that satisfy ρX̃_ii − 1/X̃_ii = λ_i, which leads to

X̃_ii = (λ_i + sqrt(λ_i^2 + 4ρ)) / (2ρ)
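Both updates amount to a few lines of linear algebra; a sketch follows, where rho is the ADMM penalty and lam_reg is the l_1 weight (parameter names are illustrative).

import numpy as np

def covsel_x_update(S, Z, U, rho):
    # eigendecompose rho*(Z - U) - S = Q diag(lam) Q^T, then rescale the
    # eigenvalues so that rho*X - X^{-1} equals that right-hand side
    lam, Q = np.linalg.eigh(rho * (Z - U) - S)
    x_tilde = (lam + np.sqrt(lam**2 + 4.0 * rho)) / (2.0 * rho)
    return (Q * x_tilde) @ Q.T             # X = Q diag(x_tilde) Q^T, positive definite

def covsel_z_update(X, U, lam_reg, rho):
    # elementwise soft thresholding S_{lam_reg/rho}
    W = X + U
    return np.sign(W) * np.maximum(np.abs(W) - lam_reg / rho, 0.0)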
17 What Have We Learned?

If f, g are smooth, the updates can always be solved using general iterative methods.
If f, g are simple enough that a closed-form proximal-operator evaluation is feasible, the updates can be carried out using analytical formulas. More examples of proximal operator evaluations are available in the paper Proximal Algorithms by Parikh and Boyd.
Techniques like early termination and warm starts can be utilized to speed up convergence.
We have been focusing on non-distributed versions of various problems; their distributed versions need to be studied for parallelization.