On the Global Linear Convergence of the ADMM with Multi-Block Variables


Tianyi Lin†   Shiqian Ma†   Shuzhong Zhang‡

May 31, 2014

Abstract

The alternating direction method of multipliers (ADMM) has been widely used for solving structured convex optimization problems. In particular, the ADMM can solve convex programs that minimize the sum of $N$ convex functions with $N$-block variables linked by some linear constraints. While the convergence of the ADMM for $N = 2$ was well established in the literature, it remained an open problem for a long time whether or not the ADMM for $N \geq 3$ is still convergent. Recently, it was shown in [3] that without further conditions the ADMM for $N \geq 3$ may actually fail to converge. In this paper, we show that under some easily verifiable and reasonable conditions the global linear convergence of the ADMM when $N \geq 3$ can still be assured, which is important since the ADMM is a popular method for solving large-scale multi-block optimization models and is known to perform very well in practice even when $N \geq 3$. Our study aims to offer an explanation for this phenomenon.

Keywords: Alternating Direction Method of Multipliers, Global Linear Convergence, Convex Optimization

1 Introduction

In this paper, we consider the global linear convergence of the standard alternating direction method of multipliers (ADMM) for solving convex minimization problems with $N$-block variables when $N \geq 3$. The problem under consideration can be formulated as

$$\min\ f_1(x_1) + f_2(x_2) + \cdots + f_N(x_N) \quad \text{s.t.}\ A_1 x_1 + A_2 x_2 + \cdots + A_N x_N = b, \quad x_i \in \mathcal{X}_i,\ i = 1, \ldots, N, \tag{1.1}$$

where $A_i \in \mathbb{R}^{p \times n_i}$, $b \in \mathbb{R}^p$, $\mathcal{X}_i \subseteq \mathbb{R}^{n_i}$ are closed convex sets, and $f_i : \mathbb{R}^{n_i} \to \mathbb{R}$ are closed convex functions. Note that the convex constraint $x_i \in \mathcal{X}_i$ can be incorporated into the objective using an

† Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China.
‡ Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, MN 55455, USA.

indicator function, i.e., (1.1) can be rewritten as

$$\min\ \bar{f}_1(x_1) + \bar{f}_2(x_2) + \cdots + \bar{f}_N(x_N) \quad \text{s.t.}\ A_1 x_1 + A_2 x_2 + \cdots + A_N x_N = b, \tag{1.2}$$

where $\bar{f}_i(x_i) := f_i(x_i) + \mathbf{1}_{\mathcal{X}_i}(x_i)$ and

$$\mathbf{1}_{\mathcal{X}_i}(x_i) := \begin{cases} 0 & \text{if } x_i \in \mathcal{X}_i, \\ +\infty & \text{otherwise.} \end{cases}$$

We thus consider the equivalent reformulation (1.2) throughout this paper for the ease of presentation. For given $(x_2^k, \ldots, x_N^k; \lambda^k)$, a typical iteration of the ADMM for solving (1.2) can be summarized as

$$\begin{cases} x_1^{k+1} := \arg\min_{x_1} L_\gamma(x_1, x_2^k, \ldots, x_N^k; \lambda^k) \\ x_2^{k+1} := \arg\min_{x_2} L_\gamma(x_1^{k+1}, x_2, x_3^k, \ldots, x_N^k; \lambda^k) \\ \qquad\vdots \\ x_N^{k+1} := \arg\min_{x_N} L_\gamma(x_1^{k+1}, x_2^{k+1}, \ldots, x_{N-1}^{k+1}, x_N; \lambda^k) \\ \lambda^{k+1} := \lambda^k - \gamma\left(\sum_{i=1}^{N} A_i x_i^{k+1} - b\right), \end{cases} \tag{1.3}$$

where

$$L_\gamma(x_1, \ldots, x_N; \lambda) := \sum_{i=1}^{N} f_i(x_i) - \left\langle \lambda,\ \sum_{i=1}^{N} A_i x_i - b \right\rangle + \frac{\gamma}{2}\left\|\sum_{i=1}^{N} A_i x_i - b\right\|^2$$

denotes the augmented Lagrangian function of (1.2), with $\lambda$ being the Lagrange multiplier and $\gamma > 0$ being a penalty parameter. It is noted that in each iteration, the ADMM updates the primal variables $x_1, \ldots, x_N$ in a Gauss-Seidel manner.

When $N = 2$, the ADMM (1.3) was shown to be equivalent to the Douglas-Rachford operator splitting method that dates back to the 1950s for solving variational problems arising from PDEs [5, 9]. The convergence of the ADMM (1.3) when $N = 2$ was thus established in the context of operator splitting methods [16, 8]. Recently, the ADMM has been revisited due to its success in solving structured convex optimization problems arising from sparse and low-rank optimization and related problems; we refer the readers to some recent survey papers for more details, see, e.g., [2, 6]. In [16], Lions and Mercier showed that the Douglas-Rachford operator splitting method converges linearly under the assumption that some involved monotone operator is both coercive and Lipschitz. Eckstein and Bertsekas [7] showed the linear convergence of the ADMM (1.3) with $N = 2$ for solving linear programs, which depends on a bound on the largest iterate in the course of the algorithm. In a recent work by Deng and Yin [4], a generalized ADMM was proposed in which some proximal terms were added to the two subproblems in (1.3), and it was shown that this generalized ADMM converges linearly under certain assumptions on the strong convexity of the functions $f_1$ and $f_2$ and the rank of $A_1$ and $A_2$. For instance, one sufficient condition suggested in [4] that guarantees the linear convergence of the generalized ADMM is that $f_1$ and $f_2$ are both strongly convex, $\nabla f_2$ is Lipschitz continuous and $A_2$ is of full row rank. Han and Yuan [11] and Boley [1] both studied the local linear convergence of the ADMM (1.3) when $N = 2$ for solving quadratic programs. The result in [11] was based on some error bound condition [17], and the one given in [1] was obtained by first writing the ADMM as a matrix recurrence and then performing a spectral analysis on the recurrence. Moreover, it was shown that the ADMM (1.3) when $N = 2$ converges sublinearly under the simple convexity assumption, both in the ergodic and the non-ergodic sense [13, 18, 12]. It should be noted that all the convergence results on the ADMM (1.3) discussed above are for the case $N = 2$.

While the convergence properties of the ADMM when $N = 2$ have been well studied, its convergence when $N \geq 3$ has remained unclear for a very long time. The following are some recent progresses in this direction. In a recent work by Chen et al. [3], a counter-example was given which shows that without further conditions the ADMM for $N \geq 3$ may actually fail to converge. Existing works that study sufficient conditions ensuring the convergence of the ADMM when $N \geq 3$ are briefly summarized as follows. Han and Yuan [10] proved the global convergence of the ADMM (1.3) under the condition that $f_1, \ldots, f_N$ are all strongly convex and $\gamma$ is restricted to a certain region. Hong and Luo [14] proposed to adopt a small step size when updating the Lagrange multiplier $\lambda^k$ in (1.3), i.e., they suggested that the update for $\lambda^k$,

$$\lambda^{k+1} := \lambda^k - \gamma\left(\sum_{i=1}^{N} A_i x_i^{k+1} - b\right), \tag{1.4}$$

be changed to

$$\lambda^{k+1} := \lambda^k - \alpha\gamma\left(\sum_{i=1}^{N} A_i x_i^{k+1} - b\right), \tag{1.5}$$

where $\alpha > 0$ is a small step size. It was shown in [14] that this variant of the ADMM converges linearly under the assumption that a certain error bound condition holds and $\alpha$ is bounded by some constant that is related to the error bound condition. In a very recent work by Lin, Ma and Zhang [15], it was shown that the ADMM (1.3) possesses a sublinear convergence rate in both the ergodic and non-ergodic sense under the conditions that $f_2, \ldots, f_N$ are strongly convex and $\gamma$ is restricted to a certain region.

Our contribution. In this paper, we show the global linear convergence of the ADMM (1.3) when $N \geq 3$. It should be noted that the linear convergence results in [16, 4, 11, 1] are for the case $N = 2$, while ours consider the case when $N \geq 3$. Moreover, compared with the local linear convergence results in [11] and [1] for $N = 2$, we prove the global linear convergence for $N \geq 3$. Furthermore, our result is for the original standard multi-block ADMM (1.3), while the one presented in [14] is a variant of (1.3) which replaces (1.4) with (1.5). To the best of our knowledge, our results in this paper are the first global linear convergence results for the original standard multi-block ADMM (1.3) when $N \geq 3$.

The rest of this paper is organized as follows. In Section 2, we provide some preliminaries and prove three technical lemmas for the subsequent analysis. In Section 3, we prove the global linear convergence of the ADMM (1.3) under three different scenarios. Finally, we conclude the paper in Section 4.

2 Preliminaries and Technical Lemmas

We use $\Omega \subseteq \mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_N \times \mathbb{R}^p$ to denote the set of primal-dual optimal solutions of (1.2). Note that according to the first-order optimality conditions for (1.2), solving (1.2) is equivalent to finding $(x_1^*, \ldots, x_N^*; \lambda^*) \in \Omega$

such that the following hold:

$$A_i^\top \lambda^* \in \partial f_i(x_i^*), \quad i = 1, 2, \ldots, N, \tag{2.1}$$

$$\sum_{i=1}^{N} A_i x_i^* - b = 0. \tag{2.2}$$

We thus make the following assumption throughout this paper.

Assumption 2.1 The optimal set $\Omega$ for problem (1.2) is non-empty.

In our analysis, the following well-known identity is used frequently:

$$2\langle w_1 - w_2,\ w_3 - w_4 \rangle = \left(\|w_1 - w_4\|^2 - \|w_1 - w_3\|^2\right) + \left(\|w_2 - w_3\|^2 - \|w_2 - w_4\|^2\right). \tag{2.3}$$

Notations. We use $g_i$ to denote a subgradient of $f_i$; $\lambda_{\max}(B)$ and $\lambda_{\min}(B)$ denote respectively the largest and smallest eigenvalues of a real symmetric matrix $B$; $\|x\|$ denotes the Euclidean norm of $x$. We use $\sigma_i \geq 0$ to denote the convexity parameter of $f_i$, i.e., the following inequalities hold for $i = 1, \ldots, N$:

$$\langle x - y,\ g_i(x) - g_i(y) \rangle \geq \sigma_i \|x - y\|^2, \quad \forall\, x, y \in \mathcal{X}_i, \tag{2.4}$$

where $g_i(x) \in \partial f_i(x)$ is a subgradient and $\partial f_i(x)$ is the subdifferential of $f_i$. Note that $f_i$ is strongly convex if and only if $\sigma_i > 0$, and if $f_i$ is convex but not strongly convex, then $\sigma_i = 0$.

In this paper, we consider three scenarios that lead to global linear convergence of the ADMM (1.3). The conditions of the three scenarios are listed in Table 1.

scenario   strongly convex       Lipschitz continuous               full row rank   full column rank
1          $f_2, \ldots, f_N$    $\nabla f_N$                       $A_N$
2          $f_1, \ldots, f_N$    $\nabla f_1, \ldots, \nabla f_N$
3          $f_2, \ldots, f_N$    $\nabla f_1, \ldots, \nabla f_N$                   $A_1$

Table 1: Three scenarios leading to global linear convergence

We remark here that when $N = 2$, the three scenarios listed in Table 1 actually reduce to the same conditions considered by Deng and Yin (as scenarios 1, 2 and 3, respectively) in [4]. We also remark here that since we incorporated the indicator functions into the objective function in (1.2), scenario 1 actually requires that there is no constraint $x_N \in \mathcal{X}_N$; scenarios 2 and 3 require that there is no constraint $x_i \in \mathcal{X}_i$, $i = 1, 2, \ldots, N$.

The first-order optimality conditions for the $N$ subproblems in (1.3) are given by

$$A_i^\top\left[\lambda^k - \gamma\left(\sum_{j=1}^{i} A_j x_j^{k+1} + \sum_{j=i+1}^{N} A_j x_j^k - b\right)\right] \in \partial f_i(x_i^{k+1}), \quad i = 1, 2, \ldots, N, \tag{2.5}$$

where we have adopted the convention $\sum_{j=N+1}^{N} a_j = 0$. By combining with the updating formula (1.4) for $\lambda^k$, (2.5) can be rewritten as

$$A_i^\top\left[\lambda^{k+1} - \gamma \sum_{j=i+1}^{N} A_j(x_j^k - x_j^{k+1})\right] \in \partial f_i(x_i^{k+1}), \quad i = 1, 2, \ldots, N. \tag{2.6}$$

Before we present the linear convergence of the ADMM (1.3), we prove the following three technical lemmas that will be used in the subsequent analysis.

Lemma 2.2 Let $(x_1^*, \ldots, x_N^*; \lambda^*) \in \Omega$. The sequence $\{(x_1^k, x_2^k, \ldots, x_N^k; \lambda^k)\}$ generated via the ADMM (1.3) satisfies

$$\begin{aligned} &\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2 \\ &\geq\ \sigma_1\|x_1^{k+1} - x_1^*\|^2 + \sum_{i=2}^{N-1}\left[\sigma_i - \frac{\gamma(N-1)}{2}\lambda_{\max}(A_i^\top A_i)\right]\|x_i^{k+1} - x_i^*\|^2 \\ &\quad + \left[\sigma_N - \frac{\gamma(N^2+N-2)}{4}\lambda_{\max}(A_N^\top A_N)\right]\|x_N^{k+1} - x_N^*\|^2 + \frac{\gamma}{2}\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 \\ &\quad + \frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2. \end{aligned} \tag{2.7}$$

Proof. Combining (2.6), (2.1) and (2.4) yields

$$\left\langle x_i^{k+1} - x_i^*,\ A_i^\top\left[\lambda^{k+1} - \lambda^* - \gamma\sum_{j=i+1}^{N} A_j(x_j^k - x_j^{k+1})\right]\right\rangle \geq \sigma_i\|x_i^{k+1} - x_i^*\|^2, \quad i = 1, \ldots, N. \tag{2.8}$$

From (2.2) and (1.4), it is easy to obtain

$$\sum_{i=1}^{N} A_i(x_i^{k+1} - x_i^*) = \frac{1}{\gamma}(\lambda^k - \lambda^{k+1}). \tag{2.9}$$

Summing (2.8) over $i = 1, \ldots, N$ and using (2.9), we can get

$$\frac{1}{\gamma}\langle \lambda^k - \lambda^{k+1},\ \lambda^{k+1} - \lambda^*\rangle + \gamma\sum_{i=1}^{N}\left\langle x_i^* - x_i^{k+1},\ A_i^\top\sum_{j=i+1}^{N} A_j(x_j^k - x_j^{k+1})\right\rangle \geq \sum_{i=1}^{N}\sigma_i\|x_i^{k+1} - x_i^*\|^2. \tag{2.10}$$

By adopting the convention $\sum_{j=2}^{1} a_j = 0$, expanding the inner products via the identity (2.3), and using (1.4) and (2.2), we have

$$\begin{aligned} &\sum_{i=1}^{N}\left\langle x_i^* - x_i^{k+1},\ A_i^\top\sum_{j=i+1}^{N} A_j(x_j^k - x_j^{k+1})\right\rangle \\ &=\ \frac{1}{2}\sum_{j=2}^{N}\left(\|A_j(x_j^* - x_j^k)\|^2 - \|A_j(x_j^* - x_j^{k+1})\|^2\right) + \frac{1}{2\gamma^2}\|\lambda^{k+1} - \lambda^k\|^2 \\ &\quad - \frac{1}{2}\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 - \sum_{2 \leq i < j \leq N}\left\langle A_i(x_i^k - x_i^{k+1}),\ A_j(x_j^* - x_j^k)\right\rangle. \end{aligned} \tag{2.11}$$

By combining (2.10) and (2.11), we have

$$\begin{aligned} &\frac{\gamma}{2}\sum_{i=2}^{N}\left(\|A_i(x_i^* - x_i^k)\|^2 - \|A_i(x_i^* - x_i^{k+1})\|^2\right) + \frac{1}{\gamma}\langle \lambda^k - \lambda^{k+1},\ \lambda^{k+1} - \lambda^*\rangle + \frac{1}{2\gamma}\|\lambda^{k+1} - \lambda^k\|^2 \\ &\geq\ \sum_{i=1}^{N}\sigma_i\|x_i^{k+1} - x_i^*\|^2 + \frac{\gamma}{2}\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 + \gamma\sum_{2 \leq i < j \leq N}\left\langle A_i(x_i^k - x_i^{k+1}),\ A_j(x_j^* - x_j^k)\right\rangle. \end{aligned} \tag{2.12}$$

Using (2.2) again, we obtain

$$\frac{1}{2}\left\|\sum_{j=2}^{N} A_j(x_j^* - x_j^{k+1})\right\|^2 \leq \frac{N-1}{2}\sum_{j=2}^{N}\|A_j(x_j^{k+1} - x_j^*)\|^2 \leq \frac{N-1}{2}\sum_{j=2}^{N}\lambda_{\max}(A_j^\top A_j)\|x_j^{k+1} - x_j^*\|^2, \tag{2.13}$$

where the first inequality follows from the convexity of $\|\cdot\|^2$. Therefore, by combining (2.12) with (2.13), bounding the remaining cross terms in (2.12) in the same manner (via the Cauchy-Schwarz inequality and the convexity of $\|\cdot\|^2$), and using the identity

$$\frac{1}{\gamma}\langle \lambda^k - \lambda^{k+1},\ \lambda^{k+1} - \lambda^*\rangle + \frac{1}{2\gamma}\|\lambda^{k+1} - \lambda^k\|^2 = \frac{1}{2\gamma}\left(\|\lambda^* - \lambda^k\|^2 - \|\lambda^* - \lambda^{k+1}\|^2\right),$$

we arrive at the desired inequality, which further implies (2.7) by using (2.4).
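The identity (2.3) used in the proof above is straightforward to sanity-check numerically; the following minimal Python snippet (an illustration with arbitrarily chosen vectors) verifies it:

```python
# Numerical check of identity (2.3):
#   2<w1 - w2, w3 - w4> = (||w1 - w4||^2 - ||w1 - w3||^2)
#                       + (||w2 - w3||^2 - ||w2 - w4||^2)
def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def sqdist(a, b):
    d = [s - t for s, t in zip(a, b)]
    return dot(d, d)

w1, w2, w3, w4 = [1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 4.0]
lhs = 2 * dot([s - t for s, t in zip(w1, w2)], [s - t for s, t in zip(w3, w4)])
rhs = (sqdist(w1, w4) - sqdist(w1, w3)) + (sqdist(w2, w3) - sqdist(w2, w4))
# lhs == rhs == -19.0 for these particular vectors
```

Expanding all four squared norms shows that every quadratic term cancels, which is why the identity holds for any vectors $w_1, \ldots, w_4$ of equal dimension.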

Remark 2.3 We note here that (2.7) can be equivalently rearranged, by adding $\frac{\gamma}{2}\|A_1(x_1^* - x_1^k)\|^2$ to both sides, as

$$\begin{aligned} &\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2 \\ &\geq\ \sigma_1\|x_1^{k+1} - x_1^*\|^2 + \sum_{i=2}^{N-1}\left[\sigma_i - \frac{\gamma(N-1)}{2}\lambda_{\max}(A_i^\top A_i)\right]\|x_i^{k+1} - x_i^*\|^2 \\ &\quad + \left[\sigma_N - \frac{\gamma(N^2+N-2)}{4}\lambda_{\max}(A_N^\top A_N)\right]\|x_N^{k+1} - x_N^*\|^2 + \frac{\gamma}{2}\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 \\ &\quad + \frac{\gamma}{2}\|A_1(x_1^* - x_1^k)\|^2 - \frac{\gamma}{2}\|A_1(x_1^* - x_1^{k+1})\|^2 + \frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2. \end{aligned} \tag{2.14}$$

Both (2.7) and (2.14) will be used in the subsequent analysis. In scenario 1, we will use (2.7) to show that

$$\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2$$

converges to zero linearly; in scenarios 2 and 3, we will use (2.14) to show that

$$\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2$$

converges to zero linearly.

The next lemma concerns the convergence of $\{(x_1^k, \ldots, x_N^k; \lambda^k)\}$ under the conditions listed in scenarios 2 and 3 in Table 1.

Lemma 2.4 Assume that the conditions listed in scenario 2 or scenario 3 in Table 1 hold. Moreover, we assume that $\gamma$ satisfies the following condition:

$$\gamma < \min\left\{\frac{2\sigma_i}{(N-1)\lambda_{\max}(A_i^\top A_i)},\ i = 2, \ldots, N-1;\ \frac{4\sigma_N}{(N^2+N-2)\lambda_{\max}(A_N^\top A_N)}\right\}. \tag{2.15}$$

Then $(x_1^k, \ldots, x_N^k; \lambda^k)$ generated by the ADMM (1.3) converges to some $(x_1^*, \ldots, x_N^*; \lambda^*) \in \Omega$.

Proof. Note that the conditions listed in scenarios 2 and 3 in Table 1 both require that $f_2, \ldots, f_N$ are strongly convex. Denote by $\xi^k$ the right-hand side of inequality (2.7) minus its last two terms $\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2$. It follows from (2.15) and (2.7) that $\xi^k \geq 0$ and $\sum_{k=0}^{\infty} \xi^k < \infty$, which further implies that $\xi^k \to 0$. Hence, for any $(x_1^*, \ldots, x_N^*; \lambda^*) \in \Omega$, we have $\|x_i^k - x_i^*\| \to 0$ for $i = 2, \ldots, N$, and $\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\| \to 0$, which also implies that $\|A_1 x_1^k - A_1 x_1^*\| \to 0$. In scenario 2, it is assumed that $f_1$ is strongly convex; thus $\sigma_1 > 0$ and (2.7) implies that $\|x_1^k - x_1^*\| \to 0$. In scenario 3, it is assumed that $A_1$ is of full column rank; it thus follows from $\|A_1 x_1^k - A_1 x_1^*\| \to 0$ that $\|x_1^k - x_1^*\| \to 0$. Moreover, when (2.15) holds, it follows from (2.7) that $\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2$ is non-increasing and upper bounded. It thus follows that $\|\lambda^* - \lambda^k\|$ converges and $\{\lambda^k\}$ is bounded. Therefore, $\{\lambda^k\}$ has a converging subsequence $\{\lambda^{k_j}\}$. Let $\hat{\lambda} = \lim_{j\to\infty} \lambda^{k_j}$. By passing to the limit in (2.6), it holds that $A_i^\top \hat{\lambda} \in \partial f_i(x_i^*)$ for $i = 1, 2, \ldots, N$. Thus, $(x_1^*, \ldots, x_N^*; \hat{\lambda}) \in \Omega$ and we can just let $\lambda^* = \hat{\lambda}$. Since $\|\hat{\lambda} - \lambda^k\|$ converges and $\lambda^{k_j} \to \hat{\lambda}$, we conclude that $\lambda^k \to \hat{\lambda}$.

Before proceeding to the next lemma, we define a constant $\kappa$ that will be used subsequently.

Definition 1 We define a constant $\kappa$ as follows. If the matrix $[A_1, \ldots, A_N]$ is of full row rank, then

$$\kappa := \lambda_{\min}^{-1}\left([A_1, \ldots, A_N][A_1, \ldots, A_N]^\top\right) > 0.$$

Otherwise, assume $\operatorname{rank}([A_1, \ldots, A_N]) = r < p$. Without loss of generality, assuming that the first $r$ rows of $[A_1, \ldots, A_N]$, denoted by $[A_1^r, \ldots, A_N^r]$, are linearly independent, we have

$$[A_1, \ldots, A_N] = \begin{bmatrix} I \\ B \end{bmatrix}[A_1^r, \ldots, A_N^r], \tag{2.16}$$

where $I \in \mathbb{R}^{r \times r}$ is the identity matrix and $B \in \mathbb{R}^{(p-r) \times r}$. Let $E := (I + B^\top B)[A_1^r, \ldots, A_N^r]$. It is easy to see that $E$ has full row rank. Then $\kappa$ is defined as

$$\kappa := \lambda_{\min}^{-1}(EE^\top)\,\lambda_{\max}(I + B^\top B) > 0.$$

The next lemma concerns bounding $\|\lambda^{k+1} - \lambda^*\|^2$ using terms related to $\|x_i^k - x_i^*\|$, $i = 1, \ldots, N$.

Lemma 2.5 Let $(x_1^*, \ldots, x_N^*; \lambda^*) \in \Omega$. Assume that the conditions listed in scenario 2 or scenario 3 in Table 1 hold, and $\gamma$ satisfies (2.15). Suppose $\nabla f_i$ is Lipschitz continuous with constant $L_i$ for $i = 1, \ldots, N$, and the initial Lagrange multiplier $\lambda^0$ is in the range space of $[A_1, \ldots, A_N]$ (note that letting $\lambda^0 = 0$ suffices). It holds that

$$\|\lambda^{k+1} - \lambda^*\|^2 \leq 2\kappa\sum_{i=1}^{N} L_i^2\|x_i^{k+1} - x_i^*\|^2 + 4\kappa\gamma^2(N-1)\sum_{i=1}^{N-1}\lambda_{\max}(A_i^\top A_i)\sum_{j=2}^{N}\left(\|A_j(x_j^k - x_j^*)\|^2 + \|A_j(x_j^{k+1} - x_j^*)\|^2\right), \tag{2.17}$$

where $\kappa > 0$ is defined in Definition 1.

Proof. We first show the following inequality:

$$\|\lambda^{k+1} - \lambda^*\|^2 \leq \kappa\left\|\begin{bmatrix} A_1^\top \\ \vdots \\ A_N^\top \end{bmatrix}(\lambda^{k+1} - \lambda^*)\right\|^2. \tag{2.18}$$

In case, [A 1,, A N ] has full row rank, so 18 holds trvally Now we consder case By the updatng formula of λ k1 1 and, we know that f the ntal Lagrange multpler λ 0 s n the range space of [A 1,, A N ], then λ k, k = 1,,, always stay n the range space of [A 1,, A N ], so does λ Therefore, from 16, we can get λ k1 = [ I B ] [ λ k1 r, λ I = B ] λ r, A 1 A N λ k1 λ = A r 1 A r N I B Bλ k1 r λ r, where λ k1 r and λ r denote the frst r rows of λ k1 and λ, respectvely Snce E := IB B[A r 1,, Ar N ] has full row rank, t now follows that A 1 A N λ k1 λ whch mples 18 = E λ k1 r λ r λ mn EE λ k1 r λ r λ mnee λ max I B B λk1 λ, Usng the optmalty condtons 6, and the Lpschtz contnuty of f, = 1,, N, we have A 1 γa A 1 0 λk1 λ 0 A x k x k1 A = γa A Nx k N x k1 N N 0 0 = f x k1 f x L x k1 x, whch together wth 18 mples that λ k1 λ A 1 A κ λk1 λ A N γa 1 0 0 κ A x k x k1 = γa A Nx k N x k1 N L x k1 x 0 0 κγ λ max A A A x k x k1 κ L x k1 x =1 κγ λ max A A A x k x A x k1 x κ L x k1 x =1 =1 10

3 Global Linear Convergence of the ADMM

In this section, we prove the global linear convergence of the ADMM (1.3) under the three scenarios listed in Table 1. We note the following inequality,

$$\frac{1}{2}\left\|\sum_{i=2}^{N} A_i(x_i^* - x_i^{k+1})\right\|^2 \leq \frac{N-1}{2}\sum_{i=2}^{N}\lambda_{\max}(A_i^\top A_i)\|x_i^* - x_i^{k+1}\|^2, \tag{3.1}$$

which follows from the convexity of $\|\cdot\|^2$. We shall use this inequality in our subsequent analysis.

3.1 Q-linear convergence under scenario 1

Theorem 3.1 Suppose that the conditions listed in scenario 1 in Table 1 hold. If $\gamma$ satisfies (2.15), then it holds that

$$\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2 \geq (1 + \delta_1)\left(\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2\right), \tag{3.2}$$

where

$$\delta_1 := \min\left\{\min_{i=2,\ldots,N-1}\frac{2\sigma_i - \gamma(N-1)\lambda_{\max}(A_i^\top A_i)}{\gamma(N-1)\lambda_{\max}(A_i^\top A_i)},\ \frac{4\gamma\sigma_N - \gamma^2(N^2+N-2)\lambda_{\max}(A_N^\top A_N)}{2\lambda_{\min}^{-1}(A_N A_N^\top)L_N^2 + \gamma^2 N(N-1)\lambda_{\max}(A_N^\top A_N)}\right\}. \tag{3.3}$$

Note that it follows from (2.15) that $\delta_1 > 0$. As a result of (3.2), we conclude that $(A_2 x_2^k, A_3 x_3^k, \ldots, A_N x_N^k, \lambda^k)$ converges Q-linearly.

Proof. Because $\nabla f_N$ is Lipschitz continuous with constant $L_N$, by setting $i = N$ in (2.6) and using (2.1), we get

$$\|A_N^\top(\lambda^{k+1} - \lambda^*)\| = \|\nabla f_N(x_N^{k+1}) - \nabla f_N(x_N^*)\| \leq L_N\|x_N^{k+1} - x_N^*\|,$$

which implies

$$\|\lambda^{k+1} - \lambda^*\|^2 \leq \lambda_{\min}^{-1}(A_N A_N^\top)\,L_N^2\,\|x_N^{k+1} - x_N^*\|^2, \tag{3.4}$$

due to the fact that $A_N$ is of full row rank.

By combining (2.7), (3.3), (3.1) and (3.4), it follows that (note that we do not assume that $f_1$ is strongly convex, and thus $\sigma_1 = 0$)

$$\begin{aligned} &\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2 - \frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 - \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2 \\ &\geq\ \sum_{i=2}^{N-1}\left[\sigma_i - \frac{\gamma(N-1)}{2}\lambda_{\max}(A_i^\top A_i)\right]\|x_i^{k+1} - x_i^*\|^2 + \left[\sigma_N - \frac{\gamma(N^2+N-2)}{4}\lambda_{\max}(A_N^\top A_N)\right]\|x_N^{k+1} - x_N^*\|^2 \\ &\geq\ \delta_1\left(\frac{\gamma}{2}\sum_{i=2}^{N}\lambda_{\max}(A_i^\top A_i)\|x_i^* - x_i^{k+1}\|^2 + \frac{1}{2\gamma}\lambda_{\min}^{-1}(A_N A_N^\top)L_N^2\|x_N^{k+1} - x_N^*\|^2\right) \\ &\geq\ \delta_1\left(\frac{\gamma}{2}\sum_{i=2}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2\right), \end{aligned}$$

which further implies (3.2).

3.2 Q-linear convergence under scenario 2

Theorem 3.2 Suppose that the conditions listed in scenario 2 in Table 1 hold. If $\gamma$ satisfies

$$\gamma < \min\left\{\frac{2\sigma_i}{3(N-1)\lambda_{\max}(A_i^\top A_i)},\ i = 2, \ldots, N-1;\ \frac{4\sigma_N}{(3N^2+3N-2)\lambda_{\max}(A_N^\top A_N)}\right\}, \tag{3.5}$$

then it holds that

$$\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2 \geq (1 + \delta_2)\left(\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2\right), \tag{3.6}$$

where

$$\delta_2 := \min\left\{\frac{\sigma_1\gamma}{2\kappa L_1^2},\ \delta_3,\ \delta_4,\ \delta_5\right\}, \tag{3.7}$$

and

$$\delta_3 := \min_{i=2,\ldots,N-1}\frac{2\sigma_i\gamma - 3\gamma^2(N-1)\lambda_{\max}(A_i^\top A_i)}{\gamma^2(N-1)\lambda_{\max}(A_i^\top A_i) + 2\kappa L_i^2}, \quad \delta_4 := \min_{i=2,\ldots,N}\frac{1}{2\kappa\lambda_{\max}(A_i^\top A_i)}, \quad \delta_5 := \frac{4\sigma_N\gamma - (3N^2+3N-2)\gamma^2\lambda_{\max}(A_N^\top A_N)}{2\gamma^2 N(N-1)\lambda_{\max}(A_N^\top A_N) + 4\kappa L_N^2}, \tag{3.8}$$

where $\kappa$ is defined in Definition 1. Note that it follows from (3.5) that $\delta_2 > 0$. As a result of (3.6), we conclude that $(A_1 x_1^k, A_2 x_2^k, \ldots, A_N x_N^k, \lambda^k)$ converges Q-linearly.

Proof. By combining (2.17) and (3.1), we have

$$\begin{aligned} &\delta_2\left(\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2\right) \\ &\leq\ \sigma_1\|x_1^{k+1} - x_1^*\|^2 + \sum_{i=2}^{N-1}\left[\sigma_i - \frac{\gamma(N-1)}{2}\lambda_{\max}(A_i^\top A_i)\right]\|x_i^{k+1} - x_i^*\|^2 + \left[\sigma_N - \frac{\gamma(N^2+N-2)}{4}\lambda_{\max}(A_N^\top A_N)\right]\|x_N^{k+1} - x_N^*\|^2 \\ &\quad + \frac{\gamma}{2}\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 + \frac{\gamma}{2}\|A_1(x_1^* - x_1^k)\|^2 - \frac{\gamma}{2}\|A_1(x_1^* - x_1^{k+1})\|^2, \end{aligned} \tag{3.9}$$

where the last inequality follows from the definition of $\delta_2$ in (3.7). Finally, we note that combining (3.9) with (2.14) yields (3.6).

3.3 Q-linear convergence under scenario 3

Theorem 3.3 Suppose that the conditions listed in scenario 3 in Table 1 hold. If $\gamma$ satisfies (3.5), then it holds that

$$\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^k)\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^k\|^2 \geq (1 + \delta_6)\left(\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2\right), \tag{3.10}$$

where

$$\delta_6 := \min\left\{\frac{\gamma}{2\kappa\gamma^2(N-1)\lambda_{\max}(A_1^\top A_1) + 2\kappa L_1^2\,\lambda_{\min}^{-1}(A_1^\top A_1)},\ \delta_3,\ \delta_4,\ \delta_5\right\}, \tag{3.11}$$

with $\delta_3$, $\delta_4$ and $\delta_5$ defined in (3.8). Note that it follows from (3.5) that $\delta_6 > 0$. As a result of (3.10), we conclude that $(A_1 x_1^k, A_2 x_2^k, \ldots, A_N x_N^k, \lambda^k)$ converges Q-linearly.

Proof. Since $A_1$ is of full column rank, it is easy to verify that

$$\lambda_{\min}(A_1^\top A_1)\|x_1^{k+1} - x_1^*\|^2 \leq \|A_1(x_1^{k+1} - x_1^*)\|^2 = \left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b - \sum_{j=2}^{N} A_j(x_j^k - x_j^*)\right\|^2 \leq 2\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 + 2\left\|\sum_{j=2}^{N} A_j(x_j^k - x_j^*)\right\|^2. \tag{3.12}$$

Combining (3.12) and (2.17) yields

$$\begin{aligned} \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2 &\leq \frac{\kappa}{\gamma}\sum_{i=2}^{N} L_i^2\|x_i^{k+1} - x_i^*\|^2 + \frac{2\kappa L_1^2}{\gamma\,\lambda_{\min}(A_1^\top A_1)}\left(\left\|A_1 x_1^{k+1} + \sum_{j=2}^{N} A_j x_j^k - b\right\|^2 + \left\|\sum_{j=2}^{N} A_j(x_j^k - x_j^*)\right\|^2\right) \\ &\quad + 2\kappa\gamma(N-1)\sum_{i=1}^{N-1}\lambda_{\max}(A_i^\top A_i)\sum_{j=2}^{N}\left(\|A_j(x_j^k - x_j^*)\|^2 + \|A_j(x_j^{k+1} - x_j^*)\|^2\right). \end{aligned} \tag{3.13}$$

Combining (3.13), (3.1) and the definition of $\delta_6$ in (3.11), a direct computation shows that

$$\delta_6\left(\frac{\gamma}{2}\sum_{i=1}^{N}\|A_i(x_i^* - x_i^{k+1})\|^2 + \frac{1}{2\gamma}\|\lambda^* - \lambda^{k+1}\|^2\right)$$

is dominated by the sum of the remaining terms on the right-hand side of (2.14) (recall that $\sigma_1 = 0$ is allowed in scenario 3), which together with (2.14) implies (3.10).

3.4 R-linear convergence

From the results in Theorems 3.1, 3.2 and 3.3, we have the following immediate corollary on the R-linear convergence of the ADMM (1.3).

Corollary 3.4 Under the same conditions as in Theorem 3.1, or Theorem 3.2, or Theorem 3.3, $x_N^k$, $\lambda^k$ and $A_i x_i^k$, $i = 1, \ldots, N-1$, converge R-linearly. Moreover, if $A_i$, $i = 1, 2, \ldots, N-1$, are further assumed to be of full column rank, then $x_i^k$, $i = 1, 2, \ldots, N-1$, converge R-linearly.

Proof. Note that under all three scenarios, we have shown that the sequence $(A_2 x_2^k, A_3 x_3^k, \ldots, A_N x_N^k, \lambda^k)$ converges Q-linearly. It follows that $\lambda^k$ and $A_i x_i^k$, $i = 2, \ldots, N$, converge R-linearly, since any part of a Q-linearly convergent quantity converges R-linearly. It now follows from (2.9) that $A_1 x_1^k$ converges R-linearly. By setting $i = N$ in (2.8), one obtains

$$\left\langle x_N^{k+1} - x_N^*,\ A_N^\top(\lambda^{k+1} - \lambda^*)\right\rangle \geq \sigma_N\|x_N^{k+1} - x_N^*\|^2,$$

which implies that

$$\|x_N^{k+1} - x_N^*\| \cdot \|A_N^\top(\lambda^{k+1} - \lambda^*)\| \geq \sigma_N\|x_N^{k+1} - x_N^*\|^2,$$

i.e.,

$$\|x_N^{k+1} - x_N^*\| \leq \frac{\|A_N^\top(\lambda^{k+1} - \lambda^*)\|}{\sigma_N}.$$

The R-linear convergence of $x_N^k$ then follows from the fact that $\lambda^k$ converges R-linearly.

Now we make some remarks on the convergence results presented in this section.

Remark 3.5 If we incorporate an indicator function into the objective function in (1.2), then its subgradient cannot be Lipschitz continuous on the boundary of the constraint set. Therefore, scenarios 2 and 3 can only occur if the constraint sets $\mathcal{X}_i$ are actually the whole space. However, scenario 1 does allow most of the constraint sets to exist; essentially, it only requires that $x_N$ is unconstrained, and all other blocks of variables can be constrained. It remains an interesting question to figure out if the linear convergence rate still holds if all blocks of variables are constrained.

Remark 3.6 Finally, we remark that scenario 1 in Table 1 also gives rise to a linear convergence rate of the ADMM for convex optimization with inequality constraints:

$$\min\ f_1(x_1) + f_2(x_2) + \cdots + f_N(x_N) \quad \text{s.t.}\ A_1 x_1 + A_2 x_2 + \cdots + A_N x_N \leq b, \quad x_i \in \mathcal{X}_i,\ i = 1, 2, \ldots, N.$$

In that case, by introducing a slack variable $x_0$ with the constraint $x_0 \in \mathbb{R}^p_+$, the corresponding ADMM becomes

$$\begin{cases} x_0^{k+1} := \arg\min_{x_0 \in \mathbb{R}^p_+} L_\gamma(x_0, x_1^k, \ldots, x_N^k; \lambda^k) = \left[b - \sum_{i=1}^{N} A_i x_i^k + \frac{1}{\gamma}\lambda^k\right]_+ \\ x_i^{k+1} := \arg\min_{x_i \in \mathcal{X}_i} L_\gamma(x_0^{k+1}, x_1^{k+1}, \ldots, x_{i-1}^{k+1}, x_i, x_{i+1}^k, \ldots, x_N^k; \lambda^k), \quad i = 1, 2, \ldots, N, \\ \lambda^{k+1} := \lambda^k - \gamma\left(x_0^{k+1} + \sum_{i=1}^{N} A_i x_i^{k+1} - b\right), \end{cases}$$

where

$$L_\gamma(x_0, x_1, \ldots, x_N; \lambda) := \sum_{i=1}^{N} f_i(x_i) - \left\langle \lambda,\ x_0 + \sum_{i=1}^{N} A_i x_i - b \right\rangle + \frac{\gamma}{2}\left\|x_0 + \sum_{i=1}^{N} A_i x_i - b\right\|^2.$$

Suppose that the functions $f_i$, $i = 1, \ldots, N$, are all strongly convex, $\nabla f_N$ is Lipschitz continuous, the constraint $x_N \in \mathcal{X}_N$ is not present, and $A_N$ has full row rank; then Theorem 3.1 assures that the above ADMM algorithm converges globally linearly.

4 Conclusions

In this paper we proved that the original ADMM for convex optimization with multi-block variables is linearly convergent under some conditions. In particular, we presented three scenarios under which a linear convergence rate holds for the ADMM; these conditions can be considered as extensions of the ones discussed in [4] for the 2-block ADMM. Convergence and complexity analysis for the multi-block ADMM is important because the ADMM is widely used and acknowledged to be an efficient and effective practical solution method for large-scale convex optimization models arising from image processing, statistics, machine learning, and so on.

Acknowledgements

Research of Shiqian Ma was supported in part by the Hong Kong Research Grants Council (RGC) Early Career Scheme (ECS) (Project ID: CUHK 439513). Research of Shuzhong Zhang was supported in part by the National Science Foundation under Grant Number CMMI-1161.

References

[1] D. Boley. Local linear convergence of the alternating direction method of multipliers on quadratic or linear programs. SIAM Journal on Optimization, 23(4):2183–2207, 2013.

[2] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.

[3] C. Chen, B. He, Y. Ye, and X. Yuan. The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Preprint, 2013.

[4] W. Deng and W. Yin. On the global and linear convergence of the generalized alternating direction method of multipliers. Technical report, Rice University CAAM, 2012.

[5] J. Douglas and H. H. Rachford. On the numerical solution of the heat conduction problem in 2 and 3 space variables. Transactions of the American Mathematical Society, 82:421–439, 1956.

[6] J. Eckstein. Augmented Lagrangian and alternating direction methods for convex optimization: A tutorial and some illustrative computational results. Preprint, 2012.

[7] J. Eckstein and D. P. Bertsekas. An alternating direction method for linear programming. Technical report, MIT Laboratory for Information and Decision Systems, 1990.

[8] J. Eckstein and D. P. Bertsekas. On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming, 55:293–318, 1992.

[9] D. Gabay. Applications of the method of multipliers to variational inequalities. In M. Fortin and R. Glowinski, editors, Augmented Lagrangian Methods: Applications to the Solution of Boundary Value Problems. North-Holland, Amsterdam, 1983.

[10] D. Han and X. Yuan. A note on the alternating direction method of multipliers. Journal of Optimization Theory and Applications, 155(1):227–238, 2012.

[11] D. Han and X. Yuan. Local linear convergence of the alternating direction method of multipliers for quadratic programs. SIAM Journal on Numerical Analysis, 51(6):3446–3457, 2013.

[12] B. He and X. Yuan. On nonergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Preprint, 2012.

[13] B. He and X. Yuan. On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM Journal on Numerical Analysis, 50(2):700–709, 2012.

[14] M. Hong and Z.-Q. Luo. On the linear convergence of the alternating direction method of multipliers. Preprint, 2012.

[15] T. Lin, S. Ma, and S. Zhang. On the convergence rate of multi-block ADMM. Submitted, March 2014.

[16] P. L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis, 16:964–979, 1979.

[17] Z.-Q. Luo and P. Tseng. Error bounds and the convergence analysis of matrix splitting algorithms for the affine variational inequality problem. SIAM Journal on Optimization, 2:43–54, 1992.

[18] R. D. C. Monteiro and B. F. Svaiter. Iteration-complexity of block-decomposition algorithms and the alternating direction method of multipliers. SIAM Journal on Optimization, 23(1):475–507, 2013.