The Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent


The Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent

Yinyu Ye
K. T. Li Professor of Engineering, Department of Management Science and Engineering, Stanford University, and The International Center of Management Science and Engineering, Nanjing University, Nanjing, China
http://www.stanford.edu/~yyye

Joint work with Caihua Chen, Bingsheng He, and Xiaoming Yuan
April 25, 2014

Outline
1. Background and Motivation
2. Divergent Examples for the Extended ADMM
3. The Small-Stepsize Variant of ADMM
4. Conclusions

1. Background and Motivation

Alternating Direction Method of Multipliers I

    min { θ_1(x_1) + θ_2(x_2) : A_1 x_1 + A_2 x_2 = b, x_1 ∈ X_1, x_2 ∈ X_2 }

θ_1(x_1) and θ_2(x_2) are closed proper convex functions; X_1 and X_2 are convex sets.

Alternating direction method of multipliers (Glowinski & Marrocco 75, Gabay & Mercier 76):

    x_1^{k+1} = argmin { L_A(x_1, x_2^k, λ^k) : x_1 ∈ X_1 },
    x_2^{k+1} = argmin { L_A(x_1^{k+1}, x_2, λ^k) : x_2 ∈ X_2 },
    λ^{k+1}   = λ^k - β (A_1 x_1^{k+1} + A_2 x_2^{k+1} - b),

where the augmented Lagrangian function L_A is defined as

    L_A(x_1, x_2, λ) = Σ_{i=1}^{2} θ_i(x_i) - λ^T ( Σ_{i=1}^{2} A_i x_i - b ) + (β/2) ‖ Σ_{i=1}^{2} A_i x_i - b ‖^2.
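For concreteness, here is a minimal NumPy sketch of these two updates, assuming quadratic objectives θ_i(x_i) = (c_i/2)‖x_i‖^2 so that every subproblem has a closed-form solution; the function name, the coefficients c_1, c_2, and the random test instance are illustrative assumptions, not part of the talk.

```python
import numpy as np

def admm_two_block(A1, A2, b, c1=1.0, c2=1.0, beta=1.0, iters=200):
    """Two-block ADMM sketch for min (c1/2)||x1||^2 + (c2/2)||x2||^2
    s.t. A1 x1 + A2 x2 = b (quadratic objectives give closed-form subproblems)."""
    n1, n2 = A1.shape[1], A2.shape[1]
    x1, x2 = np.zeros(n1), np.zeros(n2)
    lam = np.zeros(b.shape[0])
    for _ in range(iters):
        # x1-subproblem: (c1 I + beta A1^T A1) x1 = A1^T (lam + beta (b - A2 x2))
        x1 = np.linalg.solve(c1*np.eye(n1) + beta*A1.T@A1, A1.T@(lam + beta*(b - A2@x2)))
        # x2-subproblem uses the freshly updated x1
        x2 = np.linalg.solve(c2*np.eye(n2) + beta*A2.T@A2, A2.T@(lam + beta*(b - A1@x1)))
        # multiplier update with the full stepsize beta
        lam = lam - beta*(A1@x1 + A2@x2 - b)
    return x1, x2, lam

# small random instance; the constraint residual should shrink toward zero
rng = np.random.default_rng(0)
A1, A2, b = rng.standard_normal((4, 3)), rng.standard_normal((4, 2)), rng.standard_normal(4)
x1, x2, lam = admm_two_block(A1, A2, b)
print(np.linalg.norm(A1@x1 + A2@x2 - b))
```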

Alternating direction method of multipliers II

Theoretical results of ADMM:
ADMM is the Douglas-Rachford splitting method applied to the dual (Gabay 76).
It is a special implementation of the proximal point algorithm (Eckstein & Bertsekas 92). Thus the convergence of ADMM can be easily established using classical operator theory.
O(1/k) convergence speed (He & Yuan 12, Monteiro & Svaiter 13).
Linear convergence under certain conditions (Lions & Mercier 79, Eckstein 89, ...).

Applications of ADMM:
Partial differential equations, mechanics, image processing, compressed sensing, statistical learning, computer vision, semidefinite programming, ...

ADMM for Multi-block Convex Minimization Problems

Convex minimization problems with three blocks:

    min  θ_1(x_1) + θ_2(x_2) + θ_3(x_3)
    s.t. A_1 x_1 + A_2 x_2 + A_3 x_3 = b
         x_1 ∈ X_1, x_2 ∈ X_2, x_3 ∈ X_3

θ_1(x_1), θ_2(x_2) and θ_3(x_3) are closed proper convex functions; X_1, X_2 and X_3 are convex sets.

The direct extension of ADMM:

    x_1^{k+1} = argmin { L_A(x_1, x_2^k, x_3^k, λ^k) : x_1 ∈ X_1 }
    x_2^{k+1} = argmin { L_A(x_1^{k+1}, x_2, x_3^k, λ^k) : x_2 ∈ X_2 }
    x_3^{k+1} = argmin { L_A(x_1^{k+1}, x_2^{k+1}, x_3, λ^k) : x_3 ∈ X_3 }
    λ^{k+1}   = λ^k - β (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b)

where

    L_A(x_1, x_2, x_3, λ) = Σ_{i=1}^{3} θ_i(x_i) - λ^T ( Σ_{i=1}^{3} A_i x_i - b ) + (β/2) ‖ Σ_{i=1}^{3} A_i x_i - b ‖^2.
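The counterexamples that follow all live in this setting with scalar blocks x_i and either a zero or a simple quadratic objective, so the direct extension can be written out explicitly. The sketch below is a hypothetical NumPy rendering of that scheme; the function name, argument names, and the assumption θ_i(x_i) = c_i x_i^2 are mine, not the talk's.

```python
import numpy as np

def direct_admm_3block(A, b, c=(0.0, 0.0, 0.0), beta=1.0, x0=None, lam0=None, iters=100):
    """Direct three-block extension of ADMM for
        min  sum_i c_i * x_i^2   s.t.  A[:,0] x_1 + A[:,1] x_2 + A[:,2] x_3 = b,
    with scalar blocks x_i (the setting of the counterexamples below)."""
    x = np.zeros(3) if x0 is None else np.array(x0, dtype=float)
    lam = np.zeros(len(b)) if lam0 is None else np.array(lam0, dtype=float)
    history = []
    for _ in range(iters):
        for i in range(3):                      # Gauss-Seidel sweep over the three blocks
            r = A @ x - A[:, i] * x[i] - b      # contribution of the other (fixed) blocks
            # optimality condition: 2 c_i x_i + beta A_i^T (A_i x_i + r) - A_i^T lam = 0
            x[i] = A[:, i] @ (lam - beta * r) / (2*c[i] + beta * A[:, i] @ A[:, i])
        lam = lam - beta * (A @ x - b)          # full-stepsize multiplier update
        history.append(np.concatenate([x, lam]))
    return np.array(history)
```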

Applications of the Extended ADMM

The extended ADMM finds many applications: robust PCA with noisy and incomplete data, the image alignment problem, the latent variable Gaussian graphical model, the quadratic discriminant analysis model, etc.

It has been popular in practice, and it has outperformed other variants of ADMM most of the time.

Therefore, one would expect that the extended ADMM always converges. However, ...

Theoretical Results of the Extended ADMM

It is not easy to analyze the convergence:
The operator theory for the ADMM cannot be directly extended to the ADMM with three blocks.
There is a big difference between the ADMM with two blocks and with three blocks.

Existing results for global convergence:
Strong convexity, plus β in a specific range (Han & Yuan 12).
Certain conditions on the problem, plus a sufficiently small stepsize γ in the update of the multipliers (Hong & Luo 12), i.e.,

    λ^{k+1} = λ^k - γβ (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b).

A correction step (He & Tao & Yuan 12, He & Tao & Yuan-IMA).

But these results did not answer the open question: does the direct extension of ADMM converge under the simple convexity assumption?

2. Divergent Examples for the Extended ADMM

Strategy to Construct the Counter-example

A sufficient condition to guarantee the convergence of the extended ADMM:

    A_1^T A_2 = 0,  or  A_2^T A_3 = 0,  or  A_3^T A_1 = 0.

Consider the case A_1^T A_2 = 0, in which the extended ADMM reduces to the ADMM with two blocks (by regarding (x_1, x_2) as one variable).

Consequently, our strategy to construct a non-convergent example:
A_1, A_2 and A_3 are similar but not identical.
No objective function, so that the iteration operator is a linear mapping and convergence is independent of the choice of β. Then we set β = 1 for simplicity.

Thus, we simply consider a system of homogeneous linear equations with three variables:

    A_1 x_1 + A_2 x_2 + A_3 x_3 = 0.

Divergent Example of the Extended ADMM I

Concretely, we take

    A = (A_1, A_2, A_3) = [ 1 1 1
                            1 1 2
                            1 2 2 ].

Thus the extended ADMM with β = 1 can be written as

    [ 3  0  0  0  0  0 ] [ x_1^{k+1} ]   [ 0 -4 -5  1  1  1 ] [ x_1^k ]
    [ 4  6  0  0  0  0 ] [ x_2^{k+1} ]   [ 0  0 -7  1  1  2 ] [ x_2^k ]
    [ 5  7  9  0  0  0 ] [ x_3^{k+1} ] = [ 0  0  0  1  2  2 ] [ x_3^k ]
    [ 1  1  1  1  0  0 ] [           ]   [ 0  0  0  1  0  0 ] [       ]
    [ 1  1  2  0  1  0 ] [ λ^{k+1}   ]   [ 0  0  0  0  1  0 ] [ λ^k   ]
    [ 1  2  2  0  0  1 ] [           ]   [ 0  0  0  0  0  1 ] [       ]
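This linear system can be checked numerically. The short script below is a sketch (the matrix names L, R, M are my own): it rebuilds both matrices from A, eliminates x_1, and confirms that the spectral radius of the resulting 5x5 iteration matrix exceeds 1.

```python
import numpy as np

A = np.array([[1., 1., 1.],
              [1., 1., 2.],
              [1., 2., 2.]])
beta = 1.0
G = A.T @ A

# Left-hand matrix: lower-triangular part of A^T A for the Gauss-Seidel x-updates,
# then the rows (A | I) coming from the multiplier update.
L = np.zeros((6, 6))
L[:3, :3] = beta * np.tril(G)
L[3:, :3] = beta * A
L[3:, 3:] = np.eye(3)

# Right-hand matrix: minus the strict upper part of A^T A, A^T for the multiplier terms,
# and the identity carrying lambda^k forward.
R = np.zeros((6, 6))
R[:3, :3] = -beta * np.triu(G, 1)
R[:3, 3:] = A.T
R[3:, 3:] = np.eye(3)

T = np.linalg.solve(L, R)                 # full 6x6 iteration map
M = T[1:, 1:]                             # x_1^k never feeds back, so (x_2, x_3, lambda) iterate on their own
print(np.round(162 * M))                  # the integer matrix shown on the next slide
print(np.abs(np.linalg.eigvals(M)).max()) # spectral radius, about 1.028 > 1
```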

Divergent Example of the Extended ADMM II

Or equivalently,

    ( x_2^{k+1}, x_3^{k+1}, λ^{k+1} )^T = M ( x_2^k, x_3^k, λ^k )^T,

where

                    [ 144   -9   -9   -9   18 ]
                    [   8  157   -5   13   -8 ]
    M = (1/162) *   [  64  122  122  -58  -64 ]
                    [  56  -35  -35   91  -56 ]
                    [ -88  -26  -26  -62   88 ].

Divergent Example of the Extended ADMM III

The matrix M admits the eigen-decomposition M = V Diag(d) V^{-1}, where

    d = ( 0.9836 + 0.2984i, 0.9836 - 0.2984i, 0.8744 + 0.2310i, 0.8744 - 0.2310i, 0 )^T

and

        [ 0.1314+0.2661i   0.1314-0.2661i   0.1314-0.2661i   0.1314+0.2661i   0      ]
        [ 0.0664-0.2718i   0.0664+0.2718i   0.0664+0.2718i   0.0664-0.2718i   0      ]
    V = [ 0.2847-0.4437i   0.2847+0.4437i   0.2847-0.4437i   0.2847+0.4437i   0.5774 ]
        [ 0.5694           0.5694           0.5694           0.5694           0.5774 ]
        [ 0.4270+0.2218i   0.4270-0.2218i   0.4270+0.2218i   0.4270-0.2218i   0.5774 ].

Note that ρ(M) = |d_1| = |d_2| ≈ 1.0278 > 1.

Divergent Example of the Extended ADMM IV

Take the initial point (x_2^0, x_3^0, λ^0) = V(:,1) + V(:,2) ∈ R^5. Then

    ( x_2^{k+1}, x_3^{k+1}, λ^{k+1} )^T = M^{k+1} ( x_2^0, x_3^0, λ^0 )^T
                                        = V Diag(d^{k+1}) V^{-1} ( x_2^0, x_3^0, λ^0 )^T
                                        = V Diag(d^{k+1}) ( 1, 1, 0_{3x1} )^T
                                        = V ( (0.9836 + 0.2984i)^{k+1}, (0.9836 - 0.2984i)^{k+1}, 0_{3x1} )^T,

which is divergent, since |0.9836 ± 0.2984i| > 1.
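A hypothetical numerical check of this divergence follows; using the numerically computed dominant eigenvector in place of the slide's exact V(:,1) + V(:,2) is my own choice, and the variable names are illustrative.

```python
import numpy as np

A = np.array([[1., 1., 1.], [1., 1., 2.], [1., 2., 2.]])
G = A.T @ A
L = np.zeros((6, 6)); R = np.zeros((6, 6))
L[:3, :3], L[3:, :3], L[3:, 3:] = np.tril(G), A, np.eye(3)
R[:3, :3], R[:3, 3:], R[3:, 3:] = -np.triu(G, 1), A.T, np.eye(3)
M = np.linalg.solve(L, R)[1:, 1:]           # iteration matrix for (x2, x3, lambda)

vals, vecs = np.linalg.eig(M)
i = np.argmax(np.abs(vals))                 # eigenvalue of largest modulus (about 1.028)
z = 2 * np.real(vecs[:, i])                 # real starting point, essentially V(:,1) + V(:,2)
for k in range(100):
    z = M @ z
    if k % 20 == 0:
        print(k, np.linalg.norm(z))         # the norm grows geometrically: divergence
```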

Strong Convexity Helps?

Consider the following example:

    min  0.05 x_1^2 + 0.05 x_2^2 + 0.05 x_3^2
    s.t. [ 1 1 1 ] [ x_1 ]
         [ 1 1 2 ] [ x_2 ] = 0.          (1)
         [ 1 2 2 ] [ x_3 ]

The matrix M in the extended ADMM (β = 1) has ρ(M) = 1.0087 > 1.
We are again able to find a proper initial point such that the extended ADMM diverges.
So even for strongly convex programming, the extended ADMM is not necessarily convergent for a given β > 0.
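Under the same linear-map construction as before, the quadratic terms 0.05 x_i^2 only shift the diagonal of the x-subproblem systems by 0.1. A sketch of the corresponding check (again with my own matrix names), which should reproduce the reported radius of about 1.0087:

```python
import numpy as np

A = np.array([[1., 1., 1.], [1., 1., 2.], [1., 2., 2.]])
c = 0.05                                    # coefficient of each x_i^2 term in (1)
G = A.T @ A
L = np.zeros((6, 6)); R = np.zeros((6, 6))
L[:3, :3] = np.tril(G) + 2*c*np.eye(3)      # 0.05 x_i^2 adds 0.1 x_i to the optimality condition
L[3:, :3], L[3:, 3:] = A, np.eye(3)
R[:3, :3], R[:3, 3:], R[3:, 3:] = -np.triu(G, 1), A.T, np.eye(3)
M = np.linalg.solve(L, R)[1:, 1:]
print(np.abs(np.linalg.eigvals(M)).max())   # about 1.0087 > 1: divergence is still possible
```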

3. The Small-Stepsize Variant of ADMM

The Stepsize of ADMM

In the direct extension of ADMM, the Lagrangian multiplier is updated by

    λ^{k+1} := λ^k - γβ (A_1 x_1^{k+1} + A_2 x_2^{k+1} + ... + A_j x_j^{k+1} - b).

Convergence is proved for:
j = 1 (augmented Lagrangian method): γ ∈ (0, 2) (Hestenes 69, Powell 69).
j = 2 (alternating direction method of multipliers): γ ∈ (0, (1 + √5)/2) (Glowinski 84).
j ≥ 3: γ sufficiently small, provided additional conditions on the problem (Hong & Luo 12).

Question: Is there a problem-data-independent γ such that the method converges?

A Numerical Study (Ongoing)

Consider the linear system

    [ 1    1      1   ] [ x_1 ]
    [ 1    1     1+γ  ] [ x_2 ] = 0.
    [ 1   1+γ    1+γ  ] [ x_3 ]

Table: the spectral radius ρ(M) of the resulting iteration map for each γ

    γ      1        0.1      1e-2     1e-3   1e-4   1e-5   1e-6   1e-7
    ρ(M)   1.0278   1.0026   1.0001   > 1    > 1    > 1    > 1    > 1

Thus, there seems to be no practical problem-data-independent γ such that the small-stepsize variant works.
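One plausible way to reproduce this scan is sketched below; it is my own construction, in which the same γ is used both as the (1 + γ) perturbation in the constraint matrix and as the stepsize in the multiplier update, while the x-subproblems keep the full penalty β.

```python
import numpy as np

def rho(gamma, beta=1.0):
    """Spectral radius of the ADMM iteration map for the gamma-dependent
    constraint matrix, with multiplier stepsize gamma*beta."""
    A = np.array([[1., 1., 1.],
                  [1., 1., 1 + gamma],
                  [1., 1 + gamma, 1 + gamma]])
    G = A.T @ A
    L = np.zeros((6, 6)); R = np.zeros((6, 6))
    L[:3, :3] = beta * np.tril(G)            # x-subproblems still use the full penalty beta
    L[3:, :3] = gamma * beta * A             # damped multiplier update
    L[3:, 3:] = np.eye(3)
    R[:3, :3] = -beta * np.triu(G, 1)
    R[:3, 3:] = A.T
    R[3:, 3:] = np.eye(3)
    M = np.linalg.solve(L, R)[1:, 1:]
    return np.abs(np.linalg.eigvals(M)).max()

for g in [1, 0.1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7]:
    print(g, rho(g))                          # compare against the table above
```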

4. Conclusions

Conclusions

We construct examples showing that the direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent for any given algorithm parameter β.

Even in the case where the objective function is strongly convex, the direct extension of ADMM can be divergent for a given β.

There does not exist a problem-data-independent stepsize γ such that the small-stepsize variant of ADMM would work.

Is there a cyclic non-converging example?

Our results support the need for a correction step in ADMM-type methods (He & Tao & Yuan 12, He & Tao & Yuan-IMA).

Question: Is there a simple correction of the ADMM for multi-block convex minimization problems? Or how can we treat the multiple blocks equally?

How to Treat All Blocks Equally?

Answer: use an independent uniform random permutation in each iteration!

Select the block-update order uniformly at random in each iteration; this in effect reduces the ADMM algorithm to one block.

Or fix the first block, and then select the order of the remaining blocks uniformly at random; this in effect reduces the ADMM algorithm to two blocks.

It works for the example above, and it works in general (my conjecture); a sketch of the first option follows below.
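Here is a hedged NumPy sketch of that randomly permuted variant on the counterexample instance; the function name, the random starting point, and the reuse of the scalar-block closed-form updates are my own choices, and the final comment reflects observed behavior rather than a proof.

```python
import numpy as np

def rp_admm_3block(A, b, beta=1.0, iters=5000, seed=0):
    """Sketch of randomly permuted ADMM: each iteration updates the three scalar
    blocks in an independently drawn, uniformly random order (no objective terms)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(3)                  # random nonzero starting point
    lam = rng.standard_normal(len(b))
    for _ in range(iters):
        for i in rng.permutation(3):            # random sweep order, redrawn every iteration
            r = A @ x - A[:, i] * x[i] - b      # contribution of the other (fixed) blocks
            x[i] = A[:, i] @ (lam - beta * r) / (beta * A[:, i] @ A[:, i])
        lam = lam - beta * (A @ x - b)
    return x, lam

# the divergent instance from Example I: with the cyclic order the iterates blow up,
# while with random permutations the constraint residual is observed to shrink toward zero
A = np.array([[1., 1., 1.], [1., 1., 2.], [1., 2., 2.]])
x, lam = rp_admm_3block(A, b=np.zeros(3))
print(np.linalg.norm(A @ x))
```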

Thank You!