Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables
1 Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
2014 Workshop on Optimization for Modern Computation, BICMR, Beijing, China, September 2, 2014
2 Outline
- ADMM for N = 2
- Existing work on ADMM for N ≥ 3
- Convergence rates of ADMM for N ≥ 3
- BSUM-M
3 Alternating Direction Method of Multipliers (ADMM)
Convex optimization:
  min  f_1(x_1) + f_2(x_2) + ... + f_N(x_N)
  s.t. A_1 x_1 + A_2 x_2 + ... + A_N x_N = b,  x_j ∈ X_j, j = 1, 2, ..., N,
where each f_j is a closed convex function and each X_j is a closed convex set.
Augmented Lagrangian function:
  L_γ(x_1, ..., x_N; λ) := Σ_{j=1}^N f_j(x_j) − ⟨λ, Σ_{j=1}^N A_j x_j − b⟩ + (γ/2) ‖Σ_{j=1}^N A_j x_j − b‖²
4 ADMM updates the primal variables in a Gauss-Seidel manner:
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L_γ(x_1, x_2^k, ..., x_N^k; λ^k)
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L_γ(x_1^{k+1}, x_2, x_3^k, ..., x_N^k; λ^k)
  ...
  x_N^{k+1} := argmin_{x_N ∈ X_N} L_γ(x_1^{k+1}, x_2^{k+1}, ..., x_{N−1}^{k+1}, x_N; λ^k)
  λ^{k+1} := λ^k − γ (Σ_{j=1}^N A_j x_j^{k+1} − b).
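To make the sweep concrete, here is a minimal sketch (our own illustration, not code from the talk) for the separable quadratic choice f_j(x_j) = ½‖x_j − c_j‖², where every block subproblem of the augmented Lagrangian has a closed-form solution:

```python
import numpy as np

def admm_gauss_seidel(A_list, b, c_list, gamma=1.0, iters=200):
    """Gauss-Seidel (multi-block) ADMM for
       min sum_j 0.5*||x_j - c_j||^2  s.t.  sum_j A_j x_j = b."""
    x = [np.zeros_like(c) for c in c_list]
    lam = np.zeros_like(b)
    for _ in range(iters):
        for j, (A, c) in enumerate(zip(A_list, c_list)):
            # residual contribution of all other blocks (latest values: Gauss-Seidel)
            r = sum(A_list[l] @ x[l] for l in range(len(x)) if l != j)
            # closed-form minimizer of the block's augmented-Lagrangian subproblem
            H = np.eye(A.shape[1]) + gamma * A.T @ A
            x[j] = np.linalg.solve(H, c + A.T @ lam + gamma * A.T @ (b - r))
        lam = lam - gamma * (sum(A @ xj for A, xj in zip(A_list, x)) - b)
    return x, lam

# two-block consensus demo: min 0.5(x1-1)^2 + 0.5(x2-3)^2  s.t.  x1 = x2
A1, A2 = np.array([[1.0]]), np.array([[-1.0]])
x, lam = admm_gauss_seidel([A1, A2], np.zeros(1),
                           [np.array([1.0]), np.array([3.0])])
```

For N = 2 this reduces to classical two-block ADMM; in the demo the iterates converge to the consensus point x_1 = x_2 = 2.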
5 ADMM for N = 2
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L_γ(x_1, x_2^k; λ^k)
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L_γ(x_1^{k+1}, x_2; λ^k)
  λ^{k+1} := λ^k − γ (A_1 x_1^{k+1} + A_2 x_2^{k+1} − b).
- Its history goes back to variational methods for PDEs in the 1950s.
- Related to the Douglas-Rachford and Peaceman-Rachford operator splitting methods for finding a zero of monotone operators: find x such that 0 ∈ A(x) + B(x).
- Revisited recently for sparse optimization [Wang-Yang-Yin-Zhang-2008] [Goldstein-Osher-2009] [Boyd-etal-2011].
6 Global Convergence of ADMM for N = 2
ADMM for N = 2 (as above) converges globally for any γ > 0 (Fortin-Glowinski-1983; Gabay-1983; Glowinski-Le Tallec-1989; Eckstein-Bertsekas-1992).
ADMM for N = 2 with a fixed dual step size:
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L_γ(x_1, x_2^k; λ^k)
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L_γ(x_1^{k+1}, x_2; λ^k)
  λ^{k+1} := λ^k − αγ (A_1 x_1^{k+1} + A_2 x_2^{k+1} − b),
where α > 0 is a fixed dual step size. Global convergence holds for any γ > 0 and any α ∈ (0, (1+√5)/2).
7 Sublinear Convergence of ADMM for N = 2
- Ergodic O(1/k) convergence (He-Yuan-2012)
- Non-ergodic O(1/k) convergence (He-Yuan-2012)
- Ergodic O(1/k) convergence (Monteiro-Svaiter-2013)
8 Linear Convergence Rate of ADMM for N = 2
- The Douglas-Rachford splitting method converges linearly if B is coercive and Lipschitz (Lions-Mercier-1979).
- Linear convergence for solving linear programs (Eckstein-Bertsekas-1990).
- Linear convergence for quadratic programs (Han-Yuan-2013; Boley-2013).
9 Generalized ADMM
Generalized ADMM for N = 2 (Deng-Yin-2012):
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L_γ(x_1, x_2^k; λ^k) + (1/2) ‖x_1 − x_1^k‖²_P
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L_γ(x_1^{k+1}, x_2; λ^k) + (1/2) ‖x_2 − x_2^k‖²_Q
  λ^{k+1} := λ^k − αγ (A_1 x_1^{k+1} + A_2 x_2^{k+1} − b).
One sufficient condition guaranteeing global linear convergence: P = Q = 0, α = 1, f_2 strongly convex, ∇f_2 Lipschitz continuous, A_2 full row rank.
10 ADMM for N ≥ 3: a counterexample
A negative result (Chen-He-Ye-Yuan-2013): the direct extension of multi-block ADMM is not necessarily convergent.
A counterexample: minimize 0 subject to A_1 x_1 + A_2 x_2 + A_3 x_3 = 0, where
  A = (A_1, A_2, A_3) = [1 1 1; 1 1 2; 1 2 2].
With γ = 1, each sweep of multi-block ADMM maps (x_2^k, x_3^k, λ^k) to (x_2^{k+1}, x_3^{k+1}, λ^{k+1}) by a fixed linear transformation.
11 ADMM for N ≥ 3: a counterexample (cont.)
Equivalently, (x_2^{k+1}, x_3^{k+1}, λ^{k+1})^T = M (x_2^k, x_3^k, λ^k)^T for a fixed matrix M. Note that ρ(M) > 1, so the iteration diverges from suitable starting points.
Theorem (Chen-He-Ye-Yuan-2013). There exists an example for which the direct extension of ADMM with three blocks, started from a suitable real initial point, fails to converge for any choice of γ > 0.
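The divergence can be checked numerically. The sketch below (our illustration; γ = 1, f_i ≡ 0, b = 0, scalar blocks) assembles the linear map M of one ADMM sweep for the Chen-He-Ye-Yuan matrix (restated in the code) and inspects its spectral radius:

```python
import numpy as np

# counterexample data (Chen-He-Ye-Yuan): objective 0, constraint A x = 0
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 2.0]])

def admm_sweep(x2, x3, lam, gamma=1.0):
    """One pass of directly extended 3-block ADMM on min 0 s.t. A x = 0."""
    A1, A2, A3 = A[:, 0], A[:, 1], A[:, 2]
    # each block subproblem is a scalar unconstrained least squares
    x1 = A1 @ (lam / gamma - A2 * x2 - A3 * x3) / (A1 @ A1)
    x2 = A2 @ (lam / gamma - A1 * x1 - A3 * x3) / (A2 @ A2)
    x3 = A3 @ (lam / gamma - A1 * x1 - A2 * x2) / (A3 @ A3)
    lam = lam - gamma * (A1 * x1 + A2 * x2 + A3 * x3)
    return x2, x3, lam

def iteration_matrix(gamma=1.0):
    """Assemble M with (x2^{k+1}, x3^{k+1}, lam^{k+1}) = M (x2^k, x3^k, lam^k)."""
    M = np.zeros((5, 5))
    for i in range(5):
        e = np.zeros(5)
        e[i] = 1.0
        x2, x3, lam = admm_sweep(e[0], e[1], e[2:], gamma)
        M[:, i] = np.concatenate(([x2], [x3], lam))
    return M

rho = max(abs(np.linalg.eigvals(iteration_matrix())))
print(rho)  # spectral radius of the sweep; > 1 means the iteration can diverge
```

The map is linear because b = 0 and the objective is zero, so one sweep applied to unit vectors recovers the columns of M directly.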
12 ADMM for N ≥ 3: does strong convexity help?
  min 0.05 x_1² + 0.05 x_2² + 0.05 x_3²  s.t.  A_1 x_1 + A_2 x_2 + A_3 x_3 = 0, with the same coefficient matrix A as above.
For γ = 1 the iteration matrix again satisfies ρ(M) > 1, so one can find a proper initial point from which ADMM diverges. Even for strongly convex programming, the directly extended ADMM is not necessarily convergent for certain γ > 0.
13 ADMM for N ≥ 3: strong convexity works!
Global convergence. Theorem (Han-Yuan-2012). If the f_i, i = 1, ..., N, are strongly convex with parameters σ_i, and
  0 < γ < min_{i=1,...,N} { 2σ_i / (3(N−1) λ_max(A_i^T A_i)) },
then multi-block ADMM converges globally.
Convergence rate?
14 ADMM for N ≥ 3: weaker condition and convergence rate
Notation: u := (x_1, ..., x_N);  x̄_i^t := (1/(t+1)) Σ_{k=0}^t x_i^{k+1}, 1 ≤ i ≤ N;  λ̄^t := (1/(t+1)) Σ_{k=0}^t λ^{k+1}.
Theorem (Lin-Ma-Zhang-2014a). If f_2, ..., f_N are strongly convex, f_1 is convex, and
  γ ≤ min_{2 ≤ i ≤ N−1} { 2σ_i / ((2N−i)(i−1) λ_max(A_i^T A_i)),  2σ_N / ((N−2)(N+1) λ_max(A_N^T A_N)) },
then f(ū^t) − f(u*) = O(1/t) and ‖Σ_{i=1}^N A_i x̄_i^t − b‖ = O(1/t).
- Weaker condition
- Ergodic O(1/t) convergence rate in terms of objective value and primal feasibility
15 ADMM for N ≥ 3: non-ergodic convergence rate
Optimality measure: if A_2 x_2^{k+1} − A_2 x_2^k = 0, A_3 x_3^{k+1} − A_3 x_3^k = 0, and A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} − b = 0, then (x_1^{k+1}, x_2^{k+1}, x_3^{k+1}, λ^{k+1}) is optimal. Define
  R^{k+1} := ‖A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} − b‖² + 2 ‖A_2 x_2^{k+1} − A_2 x_2^k‖² + 3 ‖A_3 x_3^{k+1} − A_3 x_3^k‖².
We can prove: R^k = o(1/k).
16 ADMM for N ≥ 3: non-ergodic convergence rate (cont.)
Theorem (Lin-Ma-Zhang-2014a). If f_2 and f_3 are strongly convex, and
  γ ≤ min { σ_2 / (2 λ_max(A_2^T A_2)),  σ_3 / (2 λ_max(A_3^T A_3)) },
then Σ_{k=1}^∞ R^k < +∞ and R^k = o(1/k).
17 ADMM for N ≥ 3: non-ergodic convergence rate (general N)
Theorem (Lin-Ma-Zhang-2014a). If f_2, ..., f_N are strongly convex, and
  γ ≤ min_{2 ≤ i ≤ N−1} { 2σ_i / ((2N−i)(i−1) λ_max(A_i^T A_i)),  2σ_N / ((N−2)(N+1) λ_max(A_N^T A_N)) },
then Σ_{k=1}^∞ R^k < +∞ and R^k = o(1/k), where
  R^{k+1} := ‖Σ_{i=1}^N A_i x_i^{k+1} − b‖² + Σ_{i=2}^N ((2N−i)(i−1)/2) ‖A_i x_i^k − A_i x_i^{k+1}‖².
18 ADMM for N ≥ 3: global linear convergence
Global linear convergence of ADMM for N ≥ 3 (Lin-Ma-Zhang-2014b):

  Scenario | strongly convex | Lipschitz gradient  | rank condition
  1        | f_2, ..., f_N   | ∇f_N                | A_N full row rank
  2        | f_1, ..., f_N   | ∇f_1, ..., ∇f_N     |
  3        | f_2, ..., f_N   | ∇f_1, ..., ∇f_N     | A_1 full column rank

Table: three scenarios leading to global linear convergence. They reduce to the conditions in (Deng-Yin-2012) when N = 2.
19 Variants: Modified Proximal Jacobian ADMM (Deng-Lai-Peng-Yin-2014)
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L_γ(x_1, x_2^k, ..., x_N^k; λ^k) + (1/2) ‖x_1 − x_1^k‖²_{P_1}
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L_γ(x_1^k, x_2, x_3^k, ..., x_N^k; λ^k) + (1/2) ‖x_2 − x_2^k‖²_{P_2}
  ...
  x_N^{k+1} := argmin_{x_N ∈ X_N} L_γ(x_1^k, x_2^k, ..., x_{N−1}^k, x_N; λ^k) + (1/2) ‖x_N − x_N^k‖²_{P_N}
  λ^{k+1} := λ^k − αγ (Σ_{j=1}^N A_j x_j^{k+1} − b).
Conditions for convergence: P_i ⪰ γ(1/ε_i − 1) A_i^T A_i, i = 1, 2, ..., N, and Σ_{i=1}^N ε_i < 2 − α.
o(1/k) convergence rate in the non-ergodic sense.
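A minimal sketch of the Jacobian variant (our own illustration, again for the quadratic choice f_j(x_j) = ½‖x_j − c_j‖², with P_j = τ I): all blocks are updated in parallel from the iterate-k values, and the proximal term damps each update. In the two-block demo below, τ = 2 satisfies P_i ⪰ γ(1/ε_i − 1) A_i^T A_i with ε_i = 0.4 and α = 1:

```python
import numpy as np

def prox_jacobi_admm(A_list, b, c_list, gamma=1.0, tau=2.0, alpha=1.0, iters=300):
    """Proximal Jacobian ADMM (sketch) for
       min sum_j 0.5*||x_j - c_j||^2  s.t.  sum_j A_j x_j = b,
    with proximal terms P_j = tau * I."""
    x = [np.zeros_like(c) for c in c_list]
    lam = np.zeros_like(b)
    for _ in range(iters):
        x_new = []
        for j, (A, c) in enumerate(zip(A_list, c_list)):
            # all other blocks use the OLD iterates (Jacobi / parallel update)
            r = sum(A_list[l] @ x[l] for l in range(len(x)) if l != j)
            H = (1.0 + tau) * np.eye(A.shape[1]) + gamma * A.T @ A
            x_new.append(np.linalg.solve(H, c + A.T @ lam
                                         + gamma * A.T @ (b - r) + tau * x[j]))
        x = x_new
        lam = lam - alpha * gamma * (sum(A @ xj for A, xj in zip(A_list, x)) - b)
    return x, lam

# consensus demo: min 0.5(x1-1)^2 + 0.5(x2-3)^2  s.t.  x1 = x2
A1, A2 = np.array([[1.0]]), np.array([[-1.0]])
x, lam = prox_jacobi_admm([A1, A2], np.zeros(1),
                          [np.array([1.0]), np.array([3.0])])
```

The parallel update makes each pass embarrassingly parallel across blocks; the proximal damping is what restores convergence without Gauss-Seidel ordering.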
20 Variants: Modified Proximal Gauss-Seidel ADMM
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L_γ(x_1, x_2^k, x_3^k; λ^k) + (1/2) ‖x_1 − x_1^k‖²_{P_1}
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L_γ(x_1^{k+1}, x_2, x_3^k; λ^k) + (1/2) ‖x_2 − x_2^k‖²_{P_2}
  x_3^{k+1} := argmin_{x_3 ∈ X_3} L_γ(x_1^{k+1}, x_2^{k+1}, x_3; λ^k) + (1/2) ‖x_3 − x_3^k‖²_{P_3}
  λ^{k+1} := λ^k − αγ (Σ_{j=1}^3 A_j x_j^{k+1} − b).
21 Proximal Gauss-Seidel ADMM
Theorem (Lin-Ma-Zhang-2014c). Ergodic O(1/k) convergence rate in terms of both objective value and primal feasibility, under the conditions: f_3 is strongly convex, and
  P_1 ⪰ γ(3 + 5/ε_1) A_1^T A_1,
  P_2 ⪰ γ(1 + 3/ε_2) A_2^T A_2,
  P_3 ⪰ γ(1/ε_3 − 1) A_3^T A_3,
  3(ε_1 + ε_2 + ε_3) < 1/α,
  γ < σ_3 ε_3 / (2(ε_3 + 1) λ_max(A_3^T A_3)).
Ongoing work; more coming soon.
22 The General Problem Formulation
We consider the following convex optimization problem:
  min  f(x) := g(x_1, ..., x_K) + Σ_{k=1}^K h_k(x_k)
  s.t. E_1 x_1 + E_2 x_2 + ... + E_K x_K = q,  x_k ∈ X_k, k = 1, 2, ..., K,   (P)
where g(·) is a smooth convex function and each h_k is a nonsmooth convex function.
Notation: x := (x_1^T, ..., x_K^T)^T ∈ R^n are the block variables; X := Π_{k=1}^K X_k is the feasible set; E := (E_1, ..., E_K) and h(x) := Σ_{k=1}^K h_k(x_k).
Augmented Lagrangian function:
  L(x; y) = g(x) + h(x) + ⟨y, q − Ex⟩ + (ρ/2) ‖q − Ex‖².
23 Small dual step size and error bound condition
ADMM for N ≥ 3 without a strong convexity assumption (Hong-Luo-2012):
  x_1^{k+1} := argmin_{x_1 ∈ X_1} L(x_1, x_2^k, ..., x_N^k; y^k)
  x_2^{k+1} := argmin_{x_2 ∈ X_2} L(x_1^{k+1}, x_2, x_3^k, ..., x_N^k; y^k)
  ...
  x_N^{k+1} := argmin_{x_N ∈ X_N} L(x_1^{k+1}, x_2^{k+1}, ..., x_{N−1}^{k+1}, x_N; y^k)
  y^{k+1} := y^k − α (Σ_{j=1}^N A_j x_j^{k+1} − b).
Strong convexity is not assumed, but other conditions are needed.
24 Small dual step size and error bound condition
Error bound condition: there exist positive scalars τ and δ such that
  dist(x, X(y)) ≤ τ ‖∇̃_x L(x; y)‖  for all (x, y) such that ‖∇̃_x L(x; y)‖ ≤ δ.
[Hong-Luo-2012]: provided α is small enough (upper bounded by a constant depending on τ and δ), this small-step-size variant of ADMM converges linearly. However, the required α is too small and not computable, hence not practical.
25 The BSUM-M Algorithm: Main Ideas
BSUM-M: a Block Successive Upper-bound Minimization Method of Multipliers.
Introduce the augmented Lagrangian function for problem (P):
  L(x; y) = g(x) + h(x) + ⟨y, q − Ex⟩ + (ρ/2) ‖q − Ex‖²,
where y is the dual variable and ρ ≥ 0 is the primal stepsize.
Main idea, primal update: (1) update the primal variables successively (Gauss-Seidel); (2) optimize an approximate version of L(x, y).
Main idea, dual update: inexact dual ascent with proper step size control.
26 The BSUM-M Algorithm: Details
At iteration r+1, a block variable x_k is updated by solving
  min_{x_k ∈ X_k} u_k(x_k; x_1^{r+1}, ..., x_{k−1}^{r+1}, x_k^r, ..., x_K^r) − ⟨y^{r+1}, E_k x_k⟩ + h_k(x_k),
where u_k(·; x_1^{r+1}, ..., x_{k−1}^{r+1}, x_k^r, ..., x_K^r) is an upper bound of g(x) + (ρ/2) ‖q − Ex‖² at the current iterate (x_1^{r+1}, ..., x_{k−1}^{r+1}, x_k^r, ..., x_K^r).
27 The BSUM-M Algorithm: Details (cont.)
The BSUM-M Algorithm. At each iteration r ≥ 1:
  y^{r+1} = y^r + α^r (q − E x^r) = y^r + α^r (q − Σ_{k=1}^K E_k x_k^r),
  x_k^{r+1} = argmin_{x_k ∈ X_k} u_k(x_k; w_k^{r+1}) − ⟨y^{r+1}, E_k x_k⟩ + h_k(x_k),  ∀k,
where α^r > 0 is the dual stepsize and, to simplify notation,
  w_k^{r+1} := (x_1^{r+1}, ..., x_{k−1}^{r+1}, x_k^r, x_{k+1}^r, ..., x_K^r).
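The dual-first ordering and the upper-bound block step can be sketched as follows (our own instantiation, not from the talk): take scalar blocks, h_k = 0, X_k = R, and the proximal-linearized upper bound u_k(v; x) = φ(x) + ⟨∇_k φ(x), v − x_k⟩ + (L/2)(v − x_k)², where φ(x) := g(x) + (ρ/2)‖Ex − q‖² and L bounds each block's curvature. The toy instance at the end is an assumption for illustration:

```python
import numpy as np

def bsum_m(grad_phi, E, q, x0, rho=1.0, L=2.0, iters=5000):
    """BSUM-M sketch: dual ascent with diminishing stepsize, then a
    Gauss-Seidel pass minimizing the proximal-linearized upper bound
    u_k(v; x) - <y, E_k v> over each scalar block (h_k = 0, X_k = R)."""
    x, y = x0.copy(), np.zeros(len(q))
    for r in range(1, iters + 1):
        y = y + (1.0 / r) * (q - E @ x)       # sum alpha^r = inf, alpha^r -> 0
        for k in range(len(x)):               # successive block updates
            gk = grad_phi(x, k, rho) - E[:, k] @ y
            x[k] -= gk / L                    # minimizer of u_k - <y, E_k v>

    return x, y

# assumed toy instance: min 0.5*||x||^2  s.t.  x1 + x2 = 1 (solution x = (0.5, 0.5))
E, q = np.array([[1.0, 1.0]]), np.array([1.0])
phi_grad = lambda x, k, rho: x[k] + rho * E[:, k] @ (E @ x - q)
x, y = bsum_m(phi_grad, E, q, np.zeros(2))
```

For this quadratic instance, L = 2 equals the exact block curvature of φ, so the upper-bound step coincides with exact per-block minimization of the augmented Lagrangian.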
28 The BSUM-M Algorithm: Randomized Version
Select probabilities {p_k > 0}_{k=0}^K such that Σ_{k=0}^K p_k = 1. Each iteration t updates only a single randomly selected primal or dual variable. At iteration t ≥ 1, pick k ∈ {0, ..., K} with probability p_k:
- If k = 0:  y^{t+1} = y^t + α^t (q − E x^t),  and x_k^{t+1} = x_k^t for k = 1, ..., K.
- Else, if k ∈ {1, ..., K}:  x_k^{t+1} = argmin_{x_k ∈ X_k} u_k(x_k; x^t) − ⟨y^t, E_k x_k⟩ + h_k(x_k),  with x_j^{t+1} = x_j^t for j ≠ k, and y^{t+1} = y^t.
29 Convergence Analysis: Assumptions
Assumption A (on the problem):
(a) Problem (P) is a convex problem.
(b) g(x) = ℓ(Ax) + ⟨x, b⟩, where ℓ(·) is smooth and strictly convex; A is not necessarily full column rank.
(c) The nonsmooth functions have the form h_k(x_k) = λ_k ‖x_k‖_1 + Σ_J w_J ‖x_{k,J}‖_2, where x_k = (..., x_{k,J}, ...) is a partition of x_k, and λ_k ≥ 0 and w_J ≥ 0 are constants.
(d) The feasible sets {X_k} are compact polyhedral sets, given by X_k := {x_k : C_k x_k ≤ c_k}.
30 Convergence Analysis: Assumptions (cont.)
Assumption B (on u_k):
(a) u_k(v_k; x) ≥ g(v_k, x_{−k}) + (ρ/2) ‖E_k v_k − q + E_{−k} x_{−k}‖², for all v_k ∈ X_k and all x (upper bound).
(b) u_k(x_k; x) = g(x) + (ρ/2) ‖Ex − q‖², for all x (locally tight).
(c) ∇u_k(x_k; x) = ∇_k (g(x) + (ρ/2) ‖Ex − q‖²), for all x (gradient consistency).
(d) For any given x, u_k(v_k; x) is strongly convex in v_k.
(e) For any given x, u_k(v_k; x) has a Lipschitz continuous gradient.
Figure: illustration of the upper bound.
31 The Convergence Result
Theorem (Hong, Chang, Wang, Razaviyayn, Ma and Luo 2014). Suppose Assumptions A-B hold, and the dual stepsizes α^r satisfy
  Σ_{r=1}^∞ α^r = ∞,  lim_{r→∞} α^r = 0.
Then we have the following:
1. For BSUM-M, lim_{r→∞} ‖E x^r − q‖ = 0, and every limit point of {x^r, y^r} is a primal and dual optimal solution.
2. For RBSUM-M, lim_{t→∞} ‖E x^t − q‖ = 0 w.p.1. Further, every limit point of {x^t, y^t} is a primal and dual optimal solution w.p.1.
32 Counterexample for multi-block ADMM
Recently, [Chen-He-Ye-Yuan 13] showed through an example that applying ADMM to a multi-block problem can diverge. We show that applying (R)BSUM-M to the same problem converges (with a diminishing dual step size). Main message: dual stepsize control is crucial.
Consider the following linear system of equations (unique solution x_1 = x_2 = x_3 = 0):
  E_1 x_1 + E_2 x_2 + E_3 x_3 = 0,  with  [E_1 E_2 E_3] = [1 1 1; 1 1 2; 1 2 2].
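The claimed behavior can be sketched numerically (our illustration: ρ = 1, scalar blocks, exact block minimization of the augmented Lagrangian as a valid choice of u_k, and α^r = 1/r, which satisfies the stepsize conditions). In contrast with the divergent direct ADMM iteration, the feasibility residual shrinks toward zero:

```python
import numpy as np

E = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 2.0]])   # same data as the divergent ADMM example
q = np.zeros(3)

def bsum_m(rho=1.0, iters=5000):
    """BSUM-M sketch on min 0 s.t. E x = q (scalar blocks, X_k = R):
    dual update first with diminishing stepsize, then exact Gauss-Seidel
    block minimization of the augmented Lagrangian."""
    x = np.ones(3)
    y = np.zeros(3)
    for r in range(1, iters + 1):
        alpha = 1.0 / r                    # sum alpha^r = inf, alpha^r -> 0
        y = y + alpha * (q - E @ x)        # inexact dual ascent
        for k in range(3):                 # successive exact block updates
            Ek = E[:, k]
            rest = E @ x - Ek * x[k]
            x[k] = Ek @ (y / rho + q - rest) / (Ek @ Ek)
    return x, y

x, y = bsum_m()
print(np.linalg.norm(E @ x - q))  # feasibility residual, shrinking toward 0
```

With a constant dual stepsize this same sweep is exactly the divergent 3-block ADMM; the diminishing α^r is the only change, which is the point of the slide.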
33 Counterexample for multi-block ADMM (cont.)
Figure: iterates x_1, x_2, x_3 and the residual x_1 + x_2 + x_3 generated by BSUM-M; each curve is averaged over 1000 runs with random starting points.
Figure: the same quantities generated by the RBSUM-M algorithm, again averaged over 1000 runs with random starting points.
34 Thank you for your attention!
Noname manuscript No. (will be inserted by the editor) A Simple Convergence Analysis of Bregman Proximal Gradient Algorithm Yi Zhou Yingbin Liang Lixin Shen arxiv:1503.05601v3 [math.oc] 17 Dec 2017 Received:
More informationMath 273a: Optimization Subgradient Methods
Math 273a: Optimization Subgradient Methods Instructor: Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com Nonsmooth convex function Recall: For ˉx R n, f(ˉx) := {g R
More informationAlgorithms for constrained local optimization
Algorithms for constrained local optimization Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Algorithms for constrained local optimization p. Feasible direction methods Algorithms for constrained
More informationWHY DUALITY? Gradient descent Newton s method Quasi-newton Conjugate gradients. No constraints. Non-differentiable ???? Constrained problems? ????
DUALITY WHY DUALITY? No constraints f(x) Non-differentiable f(x) Gradient descent Newton s method Quasi-newton Conjugate gradients etc???? Constrained problems? f(x) subject to g(x) apple 0???? h(x) =0
More informationFAST ALTERNATING DIRECTION OPTIMIZATION METHODS
FAST ALTERNATING DIRECTION OPTIMIZATION METHODS TOM GOLDSTEIN, BRENDAN O DONOGHUE, SIMON SETZER, AND RICHARD BARANIUK Abstract. Alternating direction methods are a common tool for general mathematical
More informationPreconditioned Alternating Direction Method of Multipliers for the Minimization of Quadratic plus Non-Smooth Convex Functionals
SpezialForschungsBereich F 32 Karl Franzens Universita t Graz Technische Universita t Graz Medizinische Universita t Graz Preconditioned Alternating Direction Method of Multipliers for the Minimization
More informationMath 273a: Optimization Subgradients of convex functions
Math 273a: Optimization Subgradients of convex functions Made by: Damek Davis Edited by Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com 1 / 42 Subgradients Assumptions
More informationCoordinate Update Algorithm Short Course Subgradients and Subgradient Methods
Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 30 Notation f : H R { } is a closed proper convex function domf := {x R n
More informationIterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming
Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Zhaosong Lu October 5, 2012 (Revised: June 3, 2013; September 17, 2013) Abstract In this paper we study
More informationConductivity imaging from one interior measurement
Conductivity imaging from one interior measurement Amir Moradifam (University of Toronto) Fields Institute, July 24, 2012 Amir Moradifam (University of Toronto) A convergent algorithm for CDII 1 / 16 A
More information1. Gradient method. gradient method, first-order methods. quadratic bounds on convex functions. analysis of gradient method
L. Vandenberghe EE236C (Spring 2016) 1. Gradient method gradient method, first-order methods quadratic bounds on convex functions analysis of gradient method 1-1 Approximate course outline First-order
More informationBig Data Analytics: Optimization and Randomization
Big Data Analytics: Optimization and Randomization Tianbao Yang Tutorial@ACML 2015 Hong Kong Department of Computer Science, The University of Iowa, IA, USA Nov. 20, 2015 Yang Tutorial for ACML 15 Nov.
More informationFirst-order methods for structured nonsmooth optimization
First-order methods for structured nonsmooth optimization Sangwoon Yun Department of Mathematics Education Sungkyunkwan University Oct 19, 2016 Center for Mathematical Analysis & Computation, Yonsei University
More informationRelaxed linearized algorithms for faster X-ray CT image reconstruction
Relaxed linearized algorithms for faster X-ray CT image reconstruction Hung Nien and Jeffrey A. Fessler University of Michigan, Ann Arbor The 13th Fully 3D Meeting June 2, 2015 1/20 Statistical image reconstruction
More informationSolving DC Programs that Promote Group 1-Sparsity
Solving DC Programs that Promote Group 1-Sparsity Ernie Esser Contains joint work with Xiaoqun Zhang, Yifei Lou and Jack Xin SIAM Conference on Imaging Science Hong Kong Baptist University May 14 2014
More informationA New Look at First Order Methods Lifting the Lipschitz Gradient Continuity Restriction
A New Look at First Order Methods Lifting the Lipschitz Gradient Continuity Restriction Marc Teboulle School of Mathematical Sciences Tel Aviv University Joint work with H. Bauschke and J. Bolte Optimization
More informationAlgorithms for Nonsmooth Optimization
Algorithms for Nonsmooth Optimization Frank E. Curtis, Lehigh University presented at Center for Optimization and Statistical Learning, Northwestern University 2 March 2018 Algorithms for Nonsmooth Optimization
More informationNESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
ESTT: A onconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization Davood Hajinezhad, Mingyi Hong Tuo Zhao Zhaoran Wang Abstract We study a stochastic and distributed algorithm for
More informationSTA141C: Big Data & High Performance Statistical Computing
STA141C: Big Data & High Performance Statistical Computing Lecture 8: Optimization Cho-Jui Hsieh UC Davis May 9, 2017 Optimization Numerical Optimization Numerical Optimization: min X f (X ) Can be applied
More informationM. Marques Alves Marina Geremia. November 30, 2017
Iteration complexity of an inexact Douglas-Rachford method and of a Douglas-Rachford-Tseng s F-B four-operator splitting method for solving monotone inclusions M. Marques Alves Marina Geremia November
More informationLecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem
Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R
More informationOptimization for Machine Learning
Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html
More informationAN AUGMENTED LAGRANGIAN METHOD FOR l 1 -REGULARIZED OPTIMIZATION PROBLEMS WITH ORTHOGONALITY CONSTRAINTS
AN AUGMENTED LAGRANGIAN METHOD FOR l 1 -REGULARIZED OPTIMIZATION PROBLEMS WITH ORTHOGONALITY CONSTRAINTS WEIQIANG CHEN, HUI JI, AND YANFEI YOU Abstract. A class of l 1 -regularized optimization problems
More informationarxiv: v4 [math.oc] 29 Jan 2018
Noname manuscript No. (will be inserted by the editor A new primal-dual algorithm for minimizing the sum of three functions with a linear operator Ming Yan arxiv:1611.09805v4 [math.oc] 29 Jan 2018 Received:
More informationAlternating Direction Augmented Lagrangian Algorithms for Convex Optimization
Alternating Direction Augmented Lagrangian Algorithms for Convex Optimization Joint work with Bo Huang, Shiqian Ma, Tony Qin, Katya Scheinberg, Zaiwen Wen and Wotao Yin Department of IEOR, Columbia University
More information