The Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent


Caihua Chen¹   Bingsheng He²   Yinyu Ye³   Xiaoming Yuan⁴

¹ International Center of Management Science and Engineering, School of Management and Engineering, Nanjing University, China. This author was supported in part by the Natural Science Foundation of Jiangsu Province and by an NSFC grant. chchen@nju.edu.cn.
² International Centre of Management Science and Engineering, School of Management and Engineering, and Department of Mathematics, Nanjing University, China. This author was supported by an NSFC grant and an MOEC fund. hebma@nju.edu.cn.
³ Department of Management Science and Engineering, School of Engineering, Stanford University, USA; and International Center of Management Science and Engineering, School of Management and Engineering, Nanjing University. This author was supported by an AFOSR grant. yyye@stanford.edu.
⁴ Department of Mathematics, Hong Kong Baptist University, Hong Kong, China. This author was supported partially by the General Research Fund from the Hong Kong Research Grants Council. xmyuan@hkbu.edu.hk.

December, 2013

Abstract. The alternating direction method of multipliers (ADMM) is now widely used in many fields, and its convergence was proved when two blocks of variables are alternately updated. It is strongly desirable and practically valuable to extend ADMM directly to the case of a multi-block convex minimization problem whose objective function is the sum of more than two separable convex functions. However, the convergence of this extension has been unresolved for a long time: neither an affirmative proof of convergence nor a counterexample showing its failure of convergence is known in the literature. In this paper we answer this long-standing open question: the direct extension of ADMM is not necessarily convergent. We present an example showing its failure of convergence, and a sufficient condition ensuring its convergence.

Keywords. Alternating direction method of multipliers, Convergence analysis, Convex programming, Sparse optimization, Low-rank optimization, Image processing, Statistical learning, Computer vision

1 Introduction

We consider the convex minimization model with linear constraints and an objective function which is the sum of three functions without coupled variables:

    min  θ_1(x_1) + θ_2(x_2) + θ_3(x_3)
    s.t. A_1 x_1 + A_2 x_2 + A_3 x_3 = b,                                                  (1.1)
         x_1 ∈ X_1,  x_2 ∈ X_2,  x_3 ∈ X_3,

where A_i ∈ R^{p×n_i} (i = 1, 2, 3), b ∈ R^p, X_i ⊆ R^{n_i} (i = 1, 2, 3) are closed convex sets, and θ_i : R^{n_i} → R (i = 1, 2, 3) are closed convex but not necessarily smooth functions. The solution set of (1.1) is assumed to be nonempty. The abstract model (1.1) captures many applications in diversified areas; see, e.g., the image alignment problem in [18], the robust principal component analysis model with noisy and incomplete data in [20], the latent variable Gaussian graphical model selection in [4, 17], and the quadratic discriminant analysis model in [16].
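To make the three-block structure of (1.1) concrete, consider a noisy low-rank-plus-sparse decomposition in the spirit of the model studied in [20] (the formulation below is a simplified sketch added here for illustration; the exact constraint sets in [20] may differ):

    min  ||X_1||_* + τ ||X_2||_1
    s.t. X_1 + X_2 + X_3 = C,    X_3 ∈ { Z : ||Z||_F ≤ δ },

where C is the observed data matrix. After vectorization this is an instance of (1.1) with θ_1 = ||·||_*, θ_2 = τ||·||_1, θ_3 ≡ 0, A_1 = A_2 = A_3 = I, b = vec(C), and X_3 the Frobenius-norm ball of radius δ. Each θ_i has a simple proximal mapping, and exploiting this per-block structure is precisely the motivation discussed next.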

Our discussion is inspired by the scenario where each function θ_i may have some specific properties and it is worth exploiting them in algorithmic design. This is often encountered in sparse and low-rank optimization models, such as the just-mentioned applications of (1.1). We thus do not consider the generic treatment in which the sum of the three functions is regarded as one general function and the advantageous properties of each individual θ_i are ignored or not fully used.

The alternating direction method of multipliers (ADMM) was originally proposed in [9] (see also [3, 7]), and it is now a benchmark for the following convex minimization model analogous to (1.1) but with only two blocks of functions and variables:

    min  θ_1(x_1) + θ_2(x_2)
    s.t. A_1 x_1 + A_2 x_2 = b,   x_1 ∈ X_1,  x_2 ∈ X_2.                                   (1.2)

Let

    L_A(x_1, x_2, λ) = θ_1(x_1) + θ_2(x_2) - λ^T (A_1 x_1 + A_2 x_2 - b) + (β/2) ||A_1 x_1 + A_2 x_2 - b||²      (1.3)

be the augmented Lagrangian function of (1.2), with λ ∈ R^p the Lagrange multiplier and β > 0 a penalty parameter. Then the iterative scheme of ADMM for (1.2) is

    (ADMM)  x_1^{k+1} = Argmin{ L_A(x_1, x_2^k, λ^k)  |  x_1 ∈ X_1 },                      (1.4a)
            x_2^{k+1} = Argmin{ L_A(x_1^{k+1}, x_2, λ^k)  |  x_2 ∈ X_2 },                  (1.4b)
            λ^{k+1} = λ^k - β (A_1 x_1^{k+1} + A_2 x_2^{k+1} - b).                         (1.4c)

The iterative scheme of ADMM embeds a Gauss-Seidel decomposition into each iteration of the augmented Lagrangian method in [14, 19]; thus the functions θ_1 and θ_2 are treated individually, and easier subproblems can be generated. This feature is very advantageous for a broad spectrum of applications such as partial differential equations, mechanics, image processing, statistical learning, computer vision, and so on. In fact, ADMM has recently witnessed a renaissance in many application domains after a long period without much attention. We refer to [2, 5, 8] for some review papers on ADMM.

With the same philosophy as ADMM of taking advantage of each θ_i's properties individually, it is natural to extend the original ADMM (1.4) for (1.2) directly to (1.1) and obtain the scheme

    (Extended ADMM)  x_1^{k+1} = Argmin{ L_A(x_1, x_2^k, x_3^k, λ^k)  |  x_1 ∈ X_1 },                  (1.5a)
                     x_2^{k+1} = Argmin{ L_A(x_1^{k+1}, x_2, x_3^k, λ^k)  |  x_2 ∈ X_2 },              (1.5b)
                     x_3^{k+1} = Argmin{ L_A(x_1^{k+1}, x_2^{k+1}, x_3, λ^k)  |  x_3 ∈ X_3 },          (1.5c)
                     λ^{k+1} = λ^k - β (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b),            (1.5d)

where

    L_A(x_1, x_2, x_3, λ) = Σ_{i=1}^{3} θ_i(x_i) - λ^T (A_1 x_1 + A_2 x_2 + A_3 x_3 - b) + (β/2) ||A_1 x_1 + A_2 x_2 + A_3 x_3 - b||²      (1.6)

is the augmented Lagrangian function of (1.1). This direct extension of ADMM is strongly desired and practically used by many users, see e.g. [18, 20]. The convergence of (1.5), however, has been ambiguous for a long time: there is neither an affirmative convergence proof nor a counterexample showing its failure of convergence in the literature.
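For concreteness, the following Python sketch spells out one pass of (1.5a)-(1.5d). It is not taken from the paper: it assumes quadratic terms θ_i(x_i) = (1/2)||x_i - c_i||² and X_i = R^{n_i}, so that every subproblem is an unconstrained linear solve; all names are illustrative.

    import numpy as np

    def extended_admm(A, c, b, beta=1.0, iters=100):
        """Direct extension of ADMM (1.5) for three quadratic blocks (a sketch)."""
        A1, A2, A3 = A
        c1, c2, c3 = c
        x1, x2, x3 = np.zeros_like(c1), np.zeros_like(c2), np.zeros_like(c3)
        lam = np.zeros_like(b)

        def block_min(Ai, ci, r):
            # argmin_x 0.5*||x - ci||^2 - lam^T(Ai x + r - b) + (beta/2)*||Ai x + r - b||^2
            H = np.eye(len(ci)) + beta * Ai.T @ Ai
            g = ci + Ai.T @ (lam - beta * (r - b))
            return np.linalg.solve(H, g)

        for _ in range(iters):
            x1 = block_min(A1, c1, A2 @ x2 + A3 @ x3)             # (1.5a)
            x2 = block_min(A2, c2, A1 @ x1 + A3 @ x3)             # (1.5b)
            x3 = block_min(A3, c3, A1 @ x1 + A2 @ x2)             # (1.5c)
            lam = lam - beta * (A1 @ x1 + A2 @ x2 + A3 @ x3 - b)  # (1.5d)
        return x1, x2, x3, lam

Each block update uses the most recent values of the other blocks, exactly as in (1.5); only the form of the subproblem solver depends on the assumed quadratic θ_i.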

This convergence ambiguity has inspired an active research topic of developing algorithms that are slightly twisted versions of (1.5) but with provable convergence, competitive numerical efficiency and iteration simplicity; see, e.g., [11, 12, 15]. Since the extended ADMM scheme (1.5) does work well for some applications (e.g. [18, 20]), users are inclined to believe that this scheme is convergent even though they are perplexed by the lack of a rigorous proof. In the literature there is very little hint at the difficulty of proving convergence for (1.5); see [5] for an insightful explanation. The main purpose of this paper is to answer this long-standing open question negatively: the extended ADMM scheme (1.5) is not necessarily convergent. We will give an example (and a strategy for constructing such an example) to demonstrate its failure of convergence, and show that the convergence of (1.5) can be guaranteed when any two coefficient matrices in (1.1) are orthogonal.

2 A Sufficient Condition Ensuring the Convergence of (1.5)

We first study a condition that can ensure the convergence of the direct extension of ADMM (1.5). Our methodology for constructing a counterexample showing the failure of convergence of (1.5) will also become clear through this study. Our claim is that the convergence of (1.5) is guaranteed when any two coefficient matrices in (1.1) are orthogonal. We thus discuss the cases A_1^T A_2 = 0, A_2^T A_3 = 0 and A_1^T A_3 = 0. This new condition does not impose any strong convexity on the objective function in (1.1), and it only requires checking the orthogonality of the coefficient matrices. It is therefore more easily verifiable than conditions in the literature such as the one in [10], which requires strong convexity of all functions in the objective and restricts the penalty parameter β to a specific range, and the one in [15], which requires attaching a sufficiently small shrinkage factor to the Lagrange-multiplier updating step (1.5d) such that a certain error-bound condition is satisfied.

2.1 Case 1: A_1^T A_2 = 0 or A_2^T A_3 = 0

We remark that if two coefficient matrices of (1.1) in consecutive order are orthogonal, i.e., A_1^T A_2 = 0 or A_2^T A_3 = 0, then the direct extension of ADMM (1.5) reduces to a special case of the original ADMM (1.4). Thus the convergence of (1.5) under this condition is implied by well-known results in the ADMM literature. To see this, let us first assume A_1^T A_2 = 0. According to the first-order optimality conditions of the minimization problems in (1.5), we have x_i^{k+1} ∈ X_i (i = 1, 2, 3) and

    θ_1(x_1) - θ_1(x_1^{k+1}) + (x_1 - x_1^{k+1})^T { -A_1^T [ λ^k - β(A_1 x_1^{k+1} + A_2 x_2^k + A_3 x_3^k - b) ] } ≥ 0,  ∀ x_1 ∈ X_1,            (2.1a)
    θ_2(x_2) - θ_2(x_2^{k+1}) + (x_2 - x_2^{k+1})^T { -A_2^T [ λ^k - β(A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^k - b) ] } ≥ 0,  ∀ x_2 ∈ X_2,         (2.1b)
    θ_3(x_3) - θ_3(x_3^{k+1}) + (x_3 - x_3^{k+1})^T { -A_3^T [ λ^k - β(A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b) ] } ≥ 0,  ∀ x_3 ∈ X_3.      (2.1c)

Then, because A_1^T A_2 = 0, it follows from (2.1) that

    θ_1(x_1) - θ_1(x_1^{k+1}) + (x_1 - x_1^{k+1})^T { -A_1^T [ λ^k - β(A_1 x_1^{k+1} + A_3 x_3^k - b) ] } ≥ 0,  ∀ x_1 ∈ X_1,                         (2.2a)
    θ_2(x_2) - θ_2(x_2^{k+1}) + (x_2 - x_2^{k+1})^T { -A_2^T [ λ^k - β(A_2 x_2^{k+1} + A_3 x_3^k - b) ] } ≥ 0,  ∀ x_2 ∈ X_2,                         (2.2b)
    θ_3(x_3) - θ_3(x_3^{k+1}) + (x_3 - x_3^{k+1})^T { -A_3^T [ λ^k - β(A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b) ] } ≥ 0,  ∀ x_3 ∈ X_3,      (2.2c)

which is also the first-order optimality condition of the scheme

    (x_1^{k+1}, x_2^{k+1}) = Argmin{ θ_1(x_1) + θ_2(x_2) - (λ^k)^T (A_1 x_1 + A_2 x_2) + (β/2) ||A_1 x_1 + A_2 x_2 + A_3 x_3^k - b||²  |  x_1 ∈ X_1, x_2 ∈ X_2 },      (2.3a)
    x_3^{k+1} = Argmin{ θ_3(x_3) - (λ^k)^T A_3 x_3 + (β/2) ||A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3 - b||²  |  x_3 ∈ X_3 },                                           (2.3b)
    λ^{k+1} = λ^k - β (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b).                                                                                             (2.3c)

Clearly, (2.3) is a specific application of the original ADMM (1.4) to (1.1), obtained by regarding (x_1, x_2) as one variable, [A_1, A_2] as one matrix and θ_1(x_1) + θ_2(x_2) as one function. Note that in (2.3) neither x_1^k nor x_2^k is required to generate the (k+1)-th iterate under the orthogonality condition A_1^T A_2 = 0. Existing convergence results for the original ADMM, such as those in [6, 13], thus hold for the special case of (1.5) with the orthogonality condition A_1^T A_2 = 0. A similar discussion can be carried out under the orthogonality condition A_2^T A_3 = 0.
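The mechanism behind Case 1 can also be seen numerically. The following small Python illustration (a sketch added here, not part of the paper) takes θ_2 ≡ 0 and X_2 = R^{n_2}, builds data with A_1^T A_2 = 0, and checks that the x_2-subproblem of (1.5b) is unaffected by the freshly computed x_1, which is exactly why (1.5) collapses into the grouped two-block scheme (2.3):

    import numpy as np

    rng = np.random.default_rng(1)
    A1 = rng.standard_normal((6, 2))
    A2 = rng.standard_normal((6, 2))
    A2 -= A1 @ np.linalg.lstsq(A1, A2, rcond=None)[0]   # project so that A1^T A2 = 0
    A3 = rng.standard_normal((6, 2))
    b, lam = rng.standard_normal(6), rng.standard_normal(6)
    x3, beta = rng.standard_normal(2), 1.0

    def x2_update(x1):
        # argmin_{x2} -lam^T A2 x2 + (beta/2) * ||A1 x1 + A2 x2 + A3 x3 - b||^2
        rhs = A2.T @ (lam - beta * (A1 @ x1 + A3 @ x3 - b))
        return np.linalg.solve(beta * A2.T @ A2, rhs)

    # Two unrelated x1 inputs give the same x2: A2^T annihilates the A1 x1 term.
    print(np.allclose(x2_update(rng.standard_normal(2)),
                      x2_update(rng.standard_normal(2))))   # prints True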

2.2 Case 2: A_1^T A_3 = 0

In the last subsection we discussed the cases where two consecutive coefficient matrices are orthogonal. Now we turn to the case where A_1^T A_3 = 0 and show that it can also ensure the convergence of (1.5). To prepare for the proof, we need to make a few things clear.

First, note that the update order of (1.5) at each iteration is x_1 → x_2 → x_3 → λ, and then it repeats cyclically. We can thus rewrite (1.5) in the form

    x_2^{k+1} = Argmin{ L_A(x_1^{k+1}, x_2, x_3^k, λ^k)  |  x_2 ∈ X_2 },                               (2.4a)
    x_3^{k+1} = Argmin{ L_A(x_1^{k+1}, x_2^{k+1}, x_3, λ^k)  |  x_3 ∈ X_3 },                           (2.4b)
    λ^{k+1} = λ^k - β (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b),                             (2.4c)
    x_1^{k+2} = Argmin{ L_A(x_1, x_2^{k+1}, x_3^{k+1}, λ^{k+1})  |  x_1 ∈ X_1 }.                       (2.4d)

According to (2.4), there is an update of the variable λ between the updates of x_3 and x_1. Thus the case A_1^T A_3 = 0 requires a discussion different from that in the last subsection. We will focus on this representation of (1.5) within this subsection.

Second, it is worth mentioning that the variable x_2 is not involved as an input of the iteration (2.4): the scheme (2.4) generates a new iterate based only on (x_1^{k+1}, x_3^k, λ^k). We thus follow the terminology in [11] and call x_2 an intermediate variable; correspondingly, we call (x_1, x_3, λ) essential variables because they are really necessary for executing the iteration (2.4). Accordingly, we use the notations

    w^k = (x_1^{k+1}, x_2^k, x_3^k, λ^k),    u^k = w^k \ λ^k = (x_1^{k+1}, x_2^k, x_3^k),    v^k = w^k \ x_2^k = (x_1^{k+1}, x_3^k, λ^k),
    v = w \ x_2 = (x_1, x_3, λ),    V = X_1 × X_3 × R^p,    V* := { v* = (x_1*, x_3*, λ*)  |  w* = (x_1*, x_2*, x_3*, λ*) ∈ Ω* }.

Note that the first element in w^k, u^k or v^k is x_1^{k+1} rather than x_1^k.

Third, it is useful to characterize the model (1.1) by a variational inequality. More specifically, finding a saddle point of the Lagrange function of (1.1) is equivalent to solving the following variational inequality problem: finding w* ∈ Ω such that

    VI(Ω, F, θ):   θ(u) - θ(u*) + (w - w*)^T F(w*) ≥ 0,   ∀ w ∈ Ω,                                     (2.5a)

where

    u = (x_1; x_2; x_3),    w = (x_1; x_2; x_3; λ),    θ(u) = θ_1(x_1) + θ_2(x_2) + θ_3(x_3),          (2.5b)

    F(w) = ( -A_1^T λ;  -A_2^T λ;  -A_3^T λ;  A_1 x_1 + A_2 x_2 + A_3 x_3 - b ),                       (2.5c)

Ω = X_1 × X_2 × X_3 × R^p, and Ω* denotes the solution set of VI(Ω, F, θ). Obviously, the mapping F(·) defined in (2.5c) is monotone because it is affine with a skew-symmetric matrix.
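To spell out that last observation (this short verification is added here for completeness and is not part of the original argument; the symbols S and q are introduced only for this remark): writing F(w) = S w + q with

    S = [  0     0     0   -A_1^T ]         [  0 ]
        [  0     0     0   -A_2^T ]    q =  [  0 ]
        [  0     0     0   -A_3^T ]         [  0 ]
        [ A_1   A_2   A_3     0   ],        [ -b ],

we have S^T = -S, and hence for any w and w̄,

    (w - w̄)^T ( F(w) - F(w̄) ) = (w - w̄)^T S (w - w̄) = 0,

so the monotonicity inequality (w - w̄)^T (F(w) - F(w̄)) ≥ 0 holds trivially.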

Last, let us take a deeper look at the output of (2.4) and investigate some of its properties. Deriving the first-order optimality conditions of the minimization problems in (2.4) and rewriting (2.4c) appropriately, we obtain

    θ_2(x_2) - θ_2(x_2^{k+1}) + (x_2 - x_2^{k+1})^T { -A_2^T [ λ^k - β(A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^k - b) ] } ≥ 0,  ∀ x_2 ∈ X_2,          (2.6a)
    θ_3(x_3) - θ_3(x_3^{k+1}) + (x_3 - x_3^{k+1})^T { -A_3^T [ λ^k - β(A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b) ] } ≥ 0,  ∀ x_3 ∈ X_3,       (2.6b)
    (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b) + (1/β)(λ^{k+1} - λ^k) = 0,                                                                   (2.6c)
    θ_1(x_1) - θ_1(x_1^{k+2}) + (x_1 - x_1^{k+2})^T { -A_1^T [ λ^{k+1} - β(A_1 x_1^{k+2} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b) ] } ≥ 0,  ∀ x_1 ∈ X_1.   (2.6d)

Then, substituting (2.6c) into (2.6a), (2.6b) and (2.6d), and using A_1^T A_3 = 0, we get

    θ_2(x_2) - θ_2(x_2^{k+1}) + (x_2 - x_2^{k+1})^T { -A_2^T λ^{k+1} + β A_2^T A_3 (x_3^k - x_3^{k+1}) } ≥ 0,  ∀ x_2 ∈ X_2,                           (2.7a)
    θ_3(x_3) - θ_3(x_3^{k+1}) + (x_3 - x_3^{k+1})^T { -A_3^T λ^{k+1} } ≥ 0,  ∀ x_3 ∈ X_3,                                                             (2.7b)
    (A_1 x_1^{k+2} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b) + A_1 (x_1^{k+1} - x_1^{k+2}) - (1/β)(λ^k - λ^{k+1}) = 0,                                      (2.7c)
    θ_1(x_1) - θ_1(x_1^{k+2}) + (x_1 - x_1^{k+2})^T { -A_1^T λ^{k+1} - β A_1^T A_1 (x_1^{k+1} - x_1^{k+2}) + A_1^T (λ^k - λ^{k+1}) } ≥ 0,  ∀ x_1 ∈ X_1.  (2.7d)

With the definitions of θ, F, Ω, u^k and v^k, we can rewrite (2.7) in a compact form. We summarize it in the next lemma and omit its proof, as it is just a compact reformulation of (2.7).

Lemma 2.1. Let x_2^{k+1}, x_3^{k+1}, λ^{k+1} and x_1^{k+2} be generated by (2.4) from a given v^k = (x_1^{k+1}, x_3^k, λ^k), and set w^{k+1} = (x_1^{k+2}, x_2^{k+1}, x_3^{k+1}, λ^{k+1}). Then we have w^{k+1} ∈ Ω and

    θ(u) - θ(u^{k+1}) + (w - w^{k+1})^T { F(w^{k+1}) + Q (v^k - v^{k+1}) } ≥ 0,   ∀ w ∈ Ω,             (2.8)

where

        [ -β A_1^T A_1      0           A_1^T   ]
    Q = [       0       β A_2^T A_3       0     ]                                                      (2.9)
        [       0            0            0     ]
        [      A_1           0       -(1/β) I   ].

Note that the assertion (2.8) is useful for quantifying the accuracy of w^{k+1} as an approximate solution of VI(Ω, F, θ), because of the variational inequality reformulation (2.5) of (1.1). Now we are ready to prove the convergence of the direct extension of ADMM under the condition A_1^T A_3 = 0. We first refine the assertion (2.8) under this additional condition.

Lemma 2.2. Let x_2^{k+1}, x_3^{k+1}, λ^{k+1} and x_1^{k+2} be generated by (2.4) from a given v^k = (x_1^{k+1}, x_3^k, λ^k), and set w^{k+1} = (x_1^{k+2}, x_2^{k+1}, x_3^{k+1}, λ^{k+1}). If A_1^T A_3 = 0, then we have w^{k+1} ∈ Ω and

    θ(u) - θ(u^{k+1}) + (w - w^{k+1})^T { F(w^{k+1}) + β P A_3 (x_3^k - x_3^{k+1}) } ≥ (v - v^{k+1})^T H (v^k - v^{k+1}),   ∀ w ∈ Ω,        (2.10)

where

        [ A_1^T ]          [ x_1 ]          [ β A_1^T A_1        0         -A_1^T   ]
    P = [ A_2^T ],    v =  [ x_3 ],    H =  [      0        β A_3^T A_3       0     ]
        [ A_3^T ]          [  λ  ]          [    -A_1             0       (1/β) I   ].
        [   0   ]

Proof. Since A_1^T A_3 = 0, the following is an identity:

    [ x_1 - x_1^{k+2} ]^T [ β A_1^T A_1   β A_1^T A_3   -A_1^T  ] [ x_1^{k+1} - x_1^{k+2} ]
    [ x_3 - x_3^{k+1} ]   [ β A_3^T A_1   β A_3^T A_3      0    ] [ x_3^k - x_3^{k+1}     ]  =  (v - v^{k+1})^T H (v^k - v^{k+1}).          (2.11)
    [ λ - λ^{k+1}     ]   [    -A_1            0       (1/β) I  ] [ λ^k - λ^{k+1}         ]

Adding this identity to both sides of (2.8) and using the notations of v and H, we obtain

    w^{k+1} ∈ Ω,   θ(u) - θ(u^{k+1}) + (w - w^{k+1})^T { F(w^{k+1}) + Q̄ (v^k - v^{k+1}) } ≥ (v - v^{k+1})^T H (v^k - v^{k+1}),   ∀ w ∈ Ω,   (2.12)

where (see Q in (2.9))

              [ β A_1^T A_1   β A_1^T A_3   -A_1^T  ]     [ 0   β A_1^T A_3   0 ]
    Q̄ = Q  +  [      0             0           0    ]  =  [ 0   β A_2^T A_3   0 ]
              [ β A_3^T A_1   β A_3^T A_3      0    ]     [ 0   β A_3^T A_3   0 ]
              [    -A_1             0       (1/β) I ]     [ 0        0        0 ],

in which A_1^T A_3 = A_3^T A_1 = 0 has been used to cancel the remaining terms. By using the structures of the matrices Q̄ and P (see (2.10)) and the vector v, we have

    (w - w^{k+1})^T Q̄ (v^k - v^{k+1}) = (w - w^{k+1})^T β P A_3 (x_3^k - x_3^{k+1}).

The assertion (2.10) is proved.

Let us define two auxiliary sequences, which will only serve to simplify the notation in the convergence analysis:

    w̃^k = ( x̃_1^k; x̃_2^k; x̃_3^k; λ̃^k ) = ( x_1^{k+2};  x_2^{k+1};  x_3^{k+1};  λ^{k+1} - β A_3 (x_3^k - x_3^{k+1}) )    and    ũ^k = ( x̃_1^k; x̃_2^k; x̃_3^k ),      (2.13)

where {x_1^{k+2}, x_2^{k+1}, x_3^{k+1}, λ^{k+1}} is generated by (2.4).
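Since ||·||_H appears throughout the remaining analysis, it is worth noting that H is indeed positive semidefinite, with v^T H v = β||A_1 x_1 - λ/β||² + β||A_3 x_3||². The following Python sketch (added here for illustration; it is not part of the paper, and the data are arbitrary) verifies both facts numerically:

    import numpy as np

    rng = np.random.default_rng(0)
    p, n1, n3, beta = 5, 2, 3, 0.7
    A1, A3 = rng.standard_normal((p, n1)), rng.standard_normal((p, n3))
    H = np.block([
        [beta * A1.T @ A1,    np.zeros((n1, n3)),  -A1.T             ],
        [np.zeros((n3, n1)),  beta * A3.T @ A3,    np.zeros((n3, p)) ],
        [-A1,                 np.zeros((p, n3)),   np.eye(p) / beta  ],
    ])
    print(np.all(np.linalg.eigvalsh(H) >= -1e-10))   # True: H is positive semidefinite

    # Quadratic form identity used later (e.g. in the proof of Theorem 2.4(ii)).
    a, c, l = rng.standard_normal(n1), rng.standard_normal(n3), rng.standard_normal(p)
    v = np.r_[a, c, l]
    print(np.isclose(v @ H @ v,
                     beta * np.sum((A1 @ a - l / beta)**2) + beta * np.sum((A3 @ c)**2)))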

In the next lemma we establish an important inequality based on the assertion of Lemma 2.2; it will play a vital role in the convergence analysis.

Lemma 2.3. Let x_2^{k+1}, x_3^{k+1}, λ^{k+1} and x_1^{k+2} be generated by (2.4) from a given v^k = (x_1^{k+1}, x_3^k, λ^k). If A_1^T A_3 = 0, then we have w̃^k ∈ Ω and

    θ(u) - θ(ũ^k) + (w - w̃^k)^T F(w̃^k) ≥ (1/2)( ||v - v^{k+1}||_H² - ||v - v^k||_H² ) + (1/2) ||v^k - v^{k+1}||_H²,   ∀ w ∈ Ω,      (2.14)

where w̃^k and ũ^k are defined in (2.13).

Proof. According to the definition of w̃^k, the assertion (2.10) can be rewritten as

    w̃^k ∈ Ω,   θ(u) - θ(ũ^k) + (w - w̃^k)^T F(w̃^k) ≥ β (A_1 x_1^{k+2} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b)^T A_3 (x_3^k - x_3^{k+1}) + (v - v^{k+1})^T H (v^k - v^{k+1}),   ∀ w ∈ Ω.      (2.15)

Setting x_3 = x_3^k in (2.7b), we obtain

    θ_3(x_3^k) - θ_3(x_3^{k+1}) + (x_3^k - x_3^{k+1})^T { -A_3^T λ^{k+1} } ≥ 0.                        (2.16)

Note that (2.7b) also holds at the (k-1)-th iteration, i.e., θ_3(x_3) - θ_3(x_3^k) + (x_3 - x_3^k)^T { -A_3^T λ^k } ≥ 0 for all x_3 ∈ X_3. Setting x_3 = x_3^{k+1} in this inequality, we obtain

    θ_3(x_3^{k+1}) - θ_3(x_3^k) + (x_3^{k+1} - x_3^k)^T { -A_3^T λ^k } ≥ 0,                            (2.17)

which together with (2.16) yields

    (λ^k - λ^{k+1})^T A_3 (x_3^k - x_3^{k+1}) ≥ 0,   ∀ k ≥ 1.                                          (2.18)

By using the fact λ^k - λ^{k+1} = β (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b), we get immediately that

    β (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b)^T A_3 (x_3^k - x_3^{k+1}) ≥ 0.               (2.19)

Since A_1^T A_3 = 0, the terms (A_1 x_1^{k+1})^T A_3 (x_3^k - x_3^{k+1}) and (A_1 x_1^{k+2})^T A_3 (x_3^k - x_3^{k+1}) both vanish, so (2.19) implies that the first term on the right-hand side of (2.15) is nonnegative, and hence

    w̃^k ∈ Ω,   θ(u) - θ(ũ^k) + (w - w̃^k)^T F(w̃^k) ≥ (v - v^{k+1})^T H (v^k - v^{k+1}),   ∀ w ∈ Ω.      (2.20)

By substituting the identity

    (v - v^{k+1})^T H (v^k - v^{k+1}) = (1/2)( ||v - v^{k+1}||_H² - ||v - v^k||_H² ) + (1/2) ||v^k - v^{k+1}||_H²

into the right-hand side of (2.20), we obtain (2.14).

Now we are able to establish the contraction property, with respect to the solution set of VI(Ω, F, θ), of the sequence {v^k} generated by (2.4); from this property the convergence of (2.4) can be easily established.

Theorem 2.4. Assume A_1^T A_3 = 0 for the model (1.1), and let {(x_1^{k+1}, x_2^k, x_3^k, λ^k)} be the sequence generated by the direct extension of ADMM in the form (2.4). Then we have:

(i) The sequence {v^k := (x_1^{k+1}, x_3^k, λ^k)} is contractive with respect to the solution set of VI(Ω, F, θ), i.e.,

    ||v^{k+1} - v*||_H² ≤ ||v^k - v*||_H² - ||v^k - v^{k+1}||_H².                                      (2.21)

(ii) If the matrices [A_1, A_2] and A_3 are further assumed to be of full column rank, then the sequence {w^k} converges to a KKT point of the model (1.1).

Proof. (i) The first assertion follows directly from (2.14). Setting w = w* in (2.14), we get

    (1/2)( ||v^k - v*||_H² - ||v^{k+1} - v*||_H² ) - (1/2) ||v^k - v^{k+1}||_H² ≥ θ(ũ^k) - θ(u*) + (w̃^k - w*)^T F(w̃^k).

From the monotonicity of F and (2.5), it follows that

    θ(ũ^k) - θ(u*) + (w̃^k - w*)^T F(w̃^k) ≥ θ(ũ^k) - θ(u*) + (w̃^k - w*)^T F(w*) ≥ 0,

and thus (2.21) is proved. Clearly, (2.21) indicates that the sequence {v^k} is contractive with respect to the solution set of VI(Ω, F, θ); see, e.g., [1].

(ii) By the inequality (2.21), the sequences {A_1 x_1^{k+1} - (1/β) λ^k} and {A_3 x_3^k} are both bounded. Since A_3 has full column rank, we deduce that {x_3^k} is bounded. Note that

    A_1 x_1^{k+1} + A_2 x_2^k = ( A_1 x_1^{k+1} - (1/β) λ^k ) - ( A_1 x_1^k - (1/β) λ^{k-1} ) - A_3 x_3^k + b.      (2.22)

Hence {A_1 x_1^{k+1} + A_2 x_2^k} is bounded. Together with the assumption that [A_1, A_2] has full column rank, we conclude that the sequences {x_1^{k+1}}, {x_2^k} and {λ^k} are all bounded. Therefore, there exists a subsequence {(x_1^{n_k+1}, x_2^{n_k}, x_3^{n_k}, λ^{n_k})} that converges to a limit point, say (x_1^∞, x_2^∞, x_3^∞, λ^∞). Moreover, from (2.21) we see immediately that

    Σ_{k=0}^{∞} ||v^k - v^{k+1}||_H² < +∞,                                                             (2.23)

which shows

    lim_{k→∞} H (v^k - v^{k+1}) = 0,                                                                   (2.24)

and thus

    lim_{k→∞} Q (v^k - v^{k+1}) = 0.                                                                   (2.25)

Then, taking limits on both sides of (2.8) along the subsequence and using (2.25), one can immediately write

    w^∞ ∈ Ω,   θ(u) - θ(u^∞) + (w - w^∞)^T F(w^∞) ≥ 0,   ∀ w ∈ Ω,                                       (2.26)

which means that w^∞ = (x_1^∞, x_2^∞, x_3^∞, λ^∞) is a KKT point of (1.1). Hence the inequality (2.21) remains valid if (x_1*, x_2*, x_3*, λ*) is replaced by (x_1^∞, x_2^∞, x_3^∞, λ^∞). Then it holds that

    ||v^{k+1} - v^∞||_H ≤ ||v^k - v^∞||_H,                                                             (2.27)

which, since a subsequence of {||v^k - v^∞||_H} tends to zero, implies that

    lim_{k→∞} [ A_1 (x_1^{k+1} - x_1^∞) - (1/β)(λ^k - λ^∞) ] = 0    and    lim_{k→∞} A_3 (x_3^k - x_3^∞) = 0.      (2.28)

By taking limits in (2.22), using (2.28) and the full-column-rank assumptions, we know that

    lim_{k→∞} x_1^{k+1} = x_1^∞,    lim_{k→∞} x_2^k = x_2^∞,    lim_{k→∞} x_3^k = x_3^∞,    lim_{k→∞} λ^k = λ^∞,      (2.29)

which completes the proof of this theorem.

Inspired by [13], we can also establish a worst-case convergence rate measured by the iteration complexity in the ergodic sense for the direct extension of ADMM (2.4). This is summarized in the following theorem.

Theorem 2.5. Assume A_1^T A_3 = 0 for the model (1.1). Let {(x_1^k, x_2^k, x_3^k, λ^k)} be the sequence generated by the direct extension of ADMM in the form (2.4), and let w̃^k be defined in (2.13). After t iterations of (2.4), take

    w̃_t = (1/(t+1)) Σ_{k=0}^{t} w̃^k.                                                                  (2.30)

Then w̃_t ∈ Ω and it satisfies

    θ(ũ_t) - θ(u) + (w̃_t - w)^T F(w) ≤ (1/(2(t+1))) ||v - v^0||_H²,   ∀ w ∈ Ω.                          (2.31)

Proof. By the monotonicity of F and (2.14), it follows that

    w̃^k ∈ Ω,   θ(u) - θ(ũ^k) + (w - w̃^k)^T F(w) + (1/2) ||v - v^k||_H² ≥ (1/2) ||v - v^{k+1}||_H²,   ∀ w ∈ Ω.      (2.32)

Together with the convexity of X_1, X_2 and X_3, (2.32) implies that w̃_t ∈ Ω. Summing the inequality (2.32) over k = 0, 1, ..., t, we obtain

    (t+1) θ(u) - Σ_{k=0}^{t} θ(ũ^k) + ( (t+1) w - Σ_{k=0}^{t} w̃^k )^T F(w) + (1/2) ||v - v^0||_H² ≥ 0,   ∀ w ∈ Ω.

Using the notation of w̃_t, this can be written as

    (1/(t+1)) Σ_{k=0}^{t} θ(ũ^k) - θ(u) + (w̃_t - w)^T F(w) ≤ (1/(2(t+1))) ||v - v^0||_H²,   ∀ w ∈ Ω.      (2.33)

Since θ(u) is convex and ũ_t = (1/(t+1)) Σ_{k=0}^{t} ũ^k, we have θ(ũ_t) ≤ (1/(t+1)) Σ_{k=0}^{t} θ(ũ^k). Substituting this into (2.33), the assertion of the theorem follows directly.

Remark 2.6. For an arbitrarily given compact set D ⊂ Ω, let d = sup{ ||v - v^0||_H²  |  v = w \ x_2,  w ∈ D }, where v^0 = (x_1^1, x_3^0, λ^0). Then, after t iterations of the extended ADMM (2.4), the point w̃_t defined in (2.30) satisfies

    sup_{w ∈ D} { θ(ũ_t) - θ(u) + (w̃_t - w)^T F(w) } ≤ d / (2(t+1)),                                   (2.34)

which, according to the definition (2.5), means that w̃_t is an approximate solution of VI(Ω, F, θ) with an accuracy of O(1/t). Thus a worst-case O(1/t) convergence rate in the ergodic sense is established for the direct extension of ADMM (2.4).

3 An Example Showing the Failure of Convergence of (1.5)

In the last section we have shown that if, additionally, any two coefficient matrices in (1.1) are assumed to be orthogonal, then the direct extension of ADMM is convergent. In this section we give an example showing the failure of convergence when (1.5) is applied to solve (1.1) without such an orthogonality assumption. More specifically, we consider the following linear homogeneous equation with three variables:

    x_1 A_1 + x_2 A_2 + x_3 A_3 = 0,                                                                   (3.1)

where A_i ∈ R^3 (i = 1, 2, 3) are all column vectors and [A_1, A_2, A_3] is assumed to be nonsingular, and x_i ∈ R (i = 1, 2, 3). The unique solution of (3.1) is thus x_1 = x_2 = x_3 = 0. Clearly, (3.1) is a special case of (1.1) in which the objective function is zero, b is the zero vector in R^3 and X_i = R for i = 1, 2, 3. The direct extension of ADMM (1.5) is thus applicable, and the corresponding optimal Lagrange multiplier is λ* = 0.

3.1 The Iterative Scheme of (1.5) for (3.1)

Now we elucidate the iterative scheme resulting from applying the direct extension of ADMM (1.5) to the linear equation (3.1). In fact, as we will show, it can be represented as a matrix recursion. Specifying the scheme (1.5), with β = 1, to the particular setting of (3.1), we obtain

    -A_1^T λ^k + A_1^T (A_1 x_1^{k+1} + A_2 x_2^k + A_3 x_3^k) = 0,                                    (3.2a)
    -A_2^T λ^k + A_2^T (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^k) = 0,                                (3.2b)
    -A_3^T λ^k + A_3^T (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1}) = 0,                            (3.2c)
    (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1}) + λ^{k+1} - λ^k = 0.                               (3.2d)

It follows from the first equation in (3.2) that

    x_1^{k+1} = (A_1^T A_1)^{-1} ( -A_1^T A_2 x_2^k - A_1^T A_3 x_3^k + A_1^T λ^k ).                   (3.3)

Substituting (3.3) into (3.2b), (3.2c) and (3.2d), we obtain a reformulation of (3.2):

    [ A_2^T A_2      0        0 ] [ x_2^{k+1} ]     [ 0   -A_2^T A_3   A_2^T ] [ x_2^k ]     [ A_2^T A_1 ]
    [ A_3^T A_2   A_3^T A_3   0 ] [ x_3^{k+1} ]  =  [ 0        0       A_3^T ] [ x_3^k ]  -  [ A_3^T A_1 ] (A_1^T A_1)^{-1} ( -A_1^T A_2 x_2^k - A_1^T A_3 x_3^k + A_1^T λ^k ).      (3.4)
    [   A_2         A_3       I ] [ λ^{k+1}  ]     [ 0        0         I   ] [ λ^k  ]     [    A_1    ]

Let

    L = [ A_2^T A_2      0        0 ]
        [ A_3^T A_2   A_3^T A_3   0 ]                                                                  (3.5)
        [   A_2         A_3       I ]

and

    R = [ 0   -A_2^T A_3   A_2^T ]     [ A_2^T A_1 ]
        [ 0        0       A_3^T ]  -  [ A_3^T A_1 ] (A_1^T A_1)^{-1} ( -A_1^T A_2,  -A_1^T A_3,  A_1^T ).                                  (3.6)
        [ 0        0         I   ]     [    A_1    ]

Then the iterative formula (3.4) can be rewritten in the following matrix recursion form:

    ( x_2^{k+1};  x_3^{k+1};  λ^{k+1} ) = M ( x_2^k;  x_3^k;  λ^k ) = ⋯ = M^{k+1} ( x_2^0;  x_3^0;  λ^0 ),                                   (3.7)

with

    M = L^{-1} R.                                                                                                                            (3.8)

Note that for R defined in (3.6) we have R (0, 0, A_1^T)^T = 0. Thus, as will be seen in the next subsection, 0 is an eigenvalue of the matrix M.

3.2 A Concrete Example Showing the Divergence of (1.5)

Now we are ready to construct a concrete example showing the divergence of the direct extension of ADMM (1.5) when it is applied to solve the model (3.1). Our previous analysis in Section 2 has shown that the scheme (1.5) is convergent whenever any two of the coefficient matrices are orthogonal. Thus, to show the failure of convergence of (1.5) for (3.1), the columns A_1, A_2 and A_3 in (3.1) should be chosen such that no two of them are orthogonal. Moreover, notice that the sequence generated by (3.7) is divergent if ρ(M) > 1. We thus construct the matrix A as follows:

    A = ( A_1, A_2, A_3 ) = [ 1  1  1 ]
                            [ 1  1  2 ]                                                                                                      (3.9)
                            [ 1  2  2 ].

Given this matrix A, the system of linear equations (3.4) can be specified as

    [ 6  0  0  0  0 ] [ x_2^{k+1} ]     (  [ 0  -7  1  1  2 ]     [ 4 ]                            )  [ x_2^k ]
    [ 7  9  0  0  0 ] [ x_3^{k+1} ]     (  [ 0   0  1  2  2 ]     [ 5 ]                            )  [ x_3^k ]
    [ 1  1  1  0  0 ] [ λ_1^{k+1} ]  =  (  [ 0   0  1  0  0 ]  -  [ 1 ] (1/3) ( -4, -5, 1, 1, 1 )  )  [ λ_1^k ].
    [ 1  2  0  1  0 ] [ λ_2^{k+1} ]     (  [ 0   0  0  1  0 ]     [ 1 ]                            )  [ λ_2^k ]
    [ 2  2  0  0  1 ] [ λ_3^{k+1} ]     (  [ 0   0  0  0  1 ]     [ 1 ]                            )  [ λ_3^k ]

Note that with the specification in (3.9), the matrices L in (3.5) and R in (3.6) reduce to

    L = [ 6  0  0  0  0 ]
        [ 7  9  0  0  0 ]
        [ 1  1  1  0  0 ]
        [ 1  2  0  1  0 ]
        [ 2  2  0  0  1 ]

and

    R = (1/3) [ 16  -1  -1  -1   2 ]
              [ 20  25  -2   1   1 ]
              [  4   5   2  -1  -1 ]
              [  4   5  -1   2  -1 ]
              [  4   5  -1  -1   2 ].

Thus we have

    M = L^{-1} R = (1/162) [ 144   -9   -9   -9   18 ]
                           [   8  157   -5   13   -8 ]
                           [  64  122  122  -58  -64 ]
                           [  56  -35  -35   91  -56 ]
                           [ -88  -26  -26  -62   88 ].

By direct computation, we know that M admits the eigenvalue decomposition

    M = V Diag(d) V^{-1},                                                                              (3.10)

where d ∈ C^5 collects the eigenvalues of M and the columns of V ∈ C^{5×5} are the corresponding eigenvectors, both obtained numerically (in particular, 0 is an eigenvalue of M, with eigenvector proportional to (0, 0, A_1^T)^T, as noted above). The important fact regarding d is that its first two entries form a complex-conjugate pair with

    |d_1| = |d_2| > 1,

which offers us the opportunity to construct a divergent sequence {x_2^k, x_3^k, λ_1^k, λ_2^k, λ_3^k}.
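The claim |d_1| = |d_2| > 1 is easy to confirm numerically. The following Python sketch (added here for illustration; it is not part of the paper) implements one sweep of (1.5) for (3.1) with the data (3.9) and β = 1, assembles the induced linear map, i.e., the matrix M of (3.8), column by column, and inspects its spectrum:

    import numpy as np

    # Counterexample data (3.9): zero objective, b = 0, beta = 1.
    A1 = np.array([1., 1., 1.])
    A2 = np.array([1., 1., 2.])
    A3 = np.array([1., 2., 2.])
    beta = 1.0

    def sweep(x2, x3, lam):
        """One cycle of the direct extension (1.5) specialized to (3.1), cf. (3.2)."""
        x1 = (A1 @ lam / beta - A1 @ (A2 * x2 + A3 * x3)) / (A1 @ A1)   # (3.2a)
        x2 = (A2 @ lam / beta - A2 @ (A1 * x1 + A3 * x3)) / (A2 @ A2)   # (3.2b)
        x3 = (A3 @ lam / beta - A3 @ (A1 * x1 + A2 * x2)) / (A3 @ A3)   # (3.2c)
        lam = lam - beta * (A1 * x1 + A2 * x2 + A3 * x3)                # (3.2d)
        return x2, x3, lam

    # The sweep is linear in (x2, x3, lam) because b = 0 and the objective is zero,
    # so applying it to the unit vectors recovers the matrix M of (3.7)-(3.8).
    M = np.zeros((5, 5))
    for j in range(5):
        e = np.zeros(5)
        e[j] = 1.0
        y2, y3, ylam = sweep(e[0], e[1], e[2:])
        M[:, j] = np.r_[y2, y3, ylam]

    rho = max(abs(np.linalg.eigvals(M)))
    print("spectral radius of M:", rho)        # exceeds 1 for this data

    # Since rho > 1, the powers M^{k+1} in (3.7) cannot converge: iterates started
    # along (the real part of) a dominant eigenvector grow without bound.
    vals, vecs = np.linalg.eig(M)
    v0 = np.real(vecs[:, np.argmax(abs(vals))])
    for k in (10, 100, 1000):
        print(k, np.linalg.norm(np.linalg.matrix_power(M, k) @ v0))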

In fact, let us choose the initial point (x_2^0, x_3^0, λ_1^0, λ_2^0, λ_3^0)^T as V(:,1) + V(:,2). It is clear that the vector V(:,1) is the complex conjugate of the vector V(:,2); thus the starting point is real. Then, since V^{-1} (x_2^0; x_3^0; λ^0) = (1, 1, 0, 0, 0)^T, we know from (3.7) and (3.10) that

    ( x_2^{k+1};  x_3^{k+1};  λ^{k+1} ) = V Diag(d^{k+1}) V^{-1} ( x_2^0;  x_3^0;  λ^0 ) = V Diag(d^{k+1}) (1, 1, 0, 0, 0)^T = V ( d_1^{k+1},  d_2^{k+1},  0,  0,  0 )^T,

which is definitely divergent because |d_1| = |d_2| > 1, and there is no way for it to converge to the solution point of (3.1).

4 Conclusions

We showed by an example that the direct extension of the alternating direction method of multipliers (ADMM) is not necessarily convergent for solving a multi-block convex minimization model with linear constraints and an objective function that is the sum of more than two convex functions without coupled variables; a long-standing open problem is thus answered. Based on our strategy for constructing this example, it is easy to find more such examples. The negative answer to this open problem also verifies the rationale of the algorithmic design in our recent work such as [11, 12], where it was suggested to combine some correction steps with the output of the direct extension of ADMM in order to produce splitting algorithms with rigorous convergence under mild assumptions for multi-block convex minimization models.

We also studied a condition that can guarantee the convergence of the direct extension of ADMM. This condition is significantly different from those in the literature, which often require strong convexity of the objective functions and/or restrictive choices of the penalty parameter. Instead, the new condition simply depends on the orthogonality of the given coefficient matrices in the model and poses no restriction on how to choose the penalty parameter in the algorithmic implementation.

Our analysis focused on the model (1.1) with three convex functions in its objective, because this case already represents a wide range of applications and it is easier to demonstrate our main idea in this special case. The analysis can be easily extended to the more general multi-block convex minimization model with m > 3 convex functions in its objective. Moreover, based on our strategy of constructing the example showing the divergence of (1.5), it is easy to show that the direct extension of ADMM with some given value of the penalty parameter is still divergent even if the functions θ_i in (1.1) are all further assumed to be strongly convex. We omit the details of these extensions for succinctness.

References

[1] E. Blum and W. Oettli, Mathematische Optimierung: Grundlagen und Verfahren, Ökonometrie und Unternehmensforschung, Springer-Verlag, Berlin-Heidelberg-New York, 1975.
[2] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn., 3 (2011), pp. 1-122.
[3] T. F. Chan and R. Glowinski, Finite element approximation and iterative solution of a class of mildly non-linear elliptic equations, technical report, Stanford University, 1978.
[4] V. Chandrasekaran, P. A. Parrilo and A. S. Willsky, Latent variable graphical model selection via convex optimization, Ann. Stat., 40 (2012), pp. 1935-1967.
[5] J. Eckstein, Augmented Lagrangian and alternating direction methods for convex optimization: A tutorial and some illustrative computational results, manuscript, 2012.
[6] M. Fortin and R. Glowinski, On decomposition-coordination methods using an augmented Lagrangian, in M. Fortin and R. Glowinski, eds., Augmented Lagrangian Methods: Applications to the Solution of Boundary-Value Problems, North-Holland, Amsterdam, 1983.
[7] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximations, Comput. Math. Appl., 2 (1976), pp. 17-40.
[8] R. Glowinski, On alternating direction methods of multipliers: a historical perspective, Springer Proceedings of a Conference Dedicated to J. Periaux, to appear.
[9] R. Glowinski and A. Marrocco, Approximation par éléments finis d'ordre un et résolution, par pénalisation-dualité, d'une classe de problèmes non linéaires, R.A.I.R.O., R-2 (1975), pp. 41-76.
[10] D. R. Han and X. M. Yuan, A note on the alternating direction method of multipliers, J. Optim. Theory Appl., 155 (2012), pp. 227-238.
[11] B. S. He, M. Tao and X. M. Yuan, Alternating direction method with Gaussian back substitution for separable convex programming, SIAM J. Optim., 22 (2012), pp. 313-340.
[12] B. S. He, M. Tao and X. M. Yuan, Convergence rate and iteration complexity on the alternating direction method of multipliers with a substitution procedure for separable convex programming, Math. Oper. Res., under revision.
[13] B. S. He and X. M. Yuan, On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method, SIAM J. Numer. Anal., 50 (2012), pp. 700-709.
[14] M. R. Hestenes, Multiplier and gradient methods, J. Optim. Theory Appl., 4 (1969), pp. 303-320.
[15] M. Hong and Z. Q. Luo, On the linear convergence of the alternating direction method of multipliers, manuscript, August 2012.
[16] G. J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, volume 544, Wiley-Interscience, 2004.

[17] K. Mohan, P. London, M. Fazel, D. Witten and S. Lee, Node-based learning of multiple Gaussian graphical models, arXiv:1303.5145, 2013.
[18] Y. G. Peng, A. Ganesh, J. Wright, W. L. Xu and Y. Ma, Robust alignment by sparse and low-rank decomposition for linearly correlated images, IEEE Trans. Pattern Anal. Mach. Intell., 34 (2012), pp. 2233-2246.
[19] M. J. D. Powell, A method for nonlinear constraints in minimization problems, in Optimization, R. Fletcher, ed., Academic Press, New York, 1969, pp. 283-298.
[20] M. Tao and X. M. Yuan, Recovering low-rank and sparse components of matrices from incomplete and noisy observations, SIAM J. Optim., 21 (2011), pp. 57-81.


More information

Does Alternating Direction Method of Multipliers Converge for Nonconvex Problems?

Does Alternating Direction Method of Multipliers Converge for Nonconvex Problems? Does Alternating Direction Method of Multipliers Converge for Nonconvex Problems? Mingyi Hong IMSE and ECpE Department Iowa State University ICCOPT, Tokyo, August 2016 Mingyi Hong (Iowa State University)

More information

LINEARIZED AUGMENTED LAGRANGIAN AND ALTERNATING DIRECTION METHODS FOR NUCLEAR NORM MINIMIZATION

LINEARIZED AUGMENTED LAGRANGIAN AND ALTERNATING DIRECTION METHODS FOR NUCLEAR NORM MINIMIZATION LINEARIZED AUGMENTED LAGRANGIAN AND ALTERNATING DIRECTION METHODS FOR NUCLEAR NORM MINIMIZATION JUNFENG YANG AND XIAOMING YUAN Abstract. The nuclear norm is widely used to induce low-rank solutions for

More information

The Multi-Path Utility Maximization Problem

The Multi-Path Utility Maximization Problem The Multi-Path Utility Maximization Problem Xiaojun Lin and Ness B. Shroff School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 47906 {linx,shroff}@ecn.purdue.edu Abstract

More information

Optimization for Machine Learning

Optimization for Machine Learning Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html

More information

AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION

AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION JOEL A. TROPP Abstract. Matrix approximation problems with non-negativity constraints arise during the analysis of high-dimensional

More information

GLOBAL CONVERGENCE OF CONJUGATE GRADIENT METHODS WITHOUT LINE SEARCH

GLOBAL CONVERGENCE OF CONJUGATE GRADIENT METHODS WITHOUT LINE SEARCH GLOBAL CONVERGENCE OF CONJUGATE GRADIENT METHODS WITHOUT LINE SEARCH Jie Sun 1 Department of Decision Sciences National University of Singapore, Republic of Singapore Jiapu Zhang 2 Department of Mathematics

More information

The antitriangular factorisation of saddle point matrices

The antitriangular factorisation of saddle point matrices The antitriangular factorisation of saddle point matrices J. Pestana and A. J. Wathen August 29, 2013 Abstract Mastronardi and Van Dooren [this journal, 34 (2013) pp. 173 196] recently introduced the block

More information

A Power Penalty Method for a Bounded Nonlinear Complementarity Problem

A Power Penalty Method for a Bounded Nonlinear Complementarity Problem A Power Penalty Method for a Bounded Nonlinear Complementarity Problem Song Wang 1 and Xiaoqi Yang 2 1 Department of Mathematics & Statistics Curtin University, GPO Box U1987, Perth WA 6845, Australia

More information

A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES. Fenghui Wang

A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES. Fenghui Wang A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES Fenghui Wang Department of Mathematics, Luoyang Normal University, Luoyang 470, P.R. China E-mail: wfenghui@63.com ABSTRACT.

More information

ADMM and Accelerated ADMM as Continuous Dynamical Systems

ADMM and Accelerated ADMM as Continuous Dynamical Systems Guilherme França 1 Daniel P. Robinson 1 René Vidal 1 Abstract Recently, there has been an increasing interest in using tools from dynamical systems to analyze the behavior of simple optimization algorithms

More information

20 J.-S. CHEN, C.-H. KO AND X.-R. WU. : R 2 R is given by. Recently, the generalized Fischer-Burmeister function ϕ p : R2 R, which includes

20 J.-S. CHEN, C.-H. KO AND X.-R. WU. : R 2 R is given by. Recently, the generalized Fischer-Burmeister function ϕ p : R2 R, which includes 016 0 J.-S. CHEN, C.-H. KO AND X.-R. WU whereas the natural residual function ϕ : R R is given by ϕ (a, b) = a (a b) + = min{a, b}. Recently, the generalized Fischer-Burmeister function ϕ p : R R, which

More information

An Extended Alternating Direction Method for Three-Block Separable Convex Programming

An Extended Alternating Direction Method for Three-Block Separable Convex Programming An Etended Alternating Direction Method for Three-Bloc Separable Conve Programming Jianchao Bai Jicheng Li Hongchao Zhang School of Mathematics and Statistics, Xi an Jiaotong University, Xi an 70049, China

More information

On the Convergence and O(1/N) Complexity of a Class of Nonlinear Proximal Point Algorithms for Monotonic Variational Inequalities

On the Convergence and O(1/N) Complexity of a Class of Nonlinear Proximal Point Algorithms for Monotonic Variational Inequalities STATISTICS,OPTIMIZATION AND INFORMATION COMPUTING Stat., Optim. Inf. Comput., Vol. 2, June 204, pp 05 3. Published online in International Academic Press (www.iapress.org) On the Convergence and O(/N)

More information

Stochastic dynamical modeling:

Stochastic dynamical modeling: Stochastic dynamical modeling: Structured matrix completion of partially available statistics Armin Zare www-bcf.usc.edu/ arminzar Joint work with: Yongxin Chen Mihailo R. Jovanovic Tryphon T. Georgiou

More information

Approximation algorithms for nonnegative polynomial optimization problems over unit spheres

Approximation algorithms for nonnegative polynomial optimization problems over unit spheres Front. Math. China 2017, 12(6): 1409 1426 https://doi.org/10.1007/s11464-017-0644-1 Approximation algorithms for nonnegative polynomial optimization problems over unit spheres Xinzhen ZHANG 1, Guanglu

More information

Research Note. A New Infeasible Interior-Point Algorithm with Full Nesterov-Todd Step for Semi-Definite Optimization

Research Note. A New Infeasible Interior-Point Algorithm with Full Nesterov-Todd Step for Semi-Definite Optimization Iranian Journal of Operations Research Vol. 4, No. 1, 2013, pp. 88-107 Research Note A New Infeasible Interior-Point Algorithm with Full Nesterov-Todd Step for Semi-Definite Optimization B. Kheirfam We

More information

Relaxed linearized algorithms for faster X-ray CT image reconstruction

Relaxed linearized algorithms for faster X-ray CT image reconstruction Relaxed linearized algorithms for faster X-ray CT image reconstruction Hung Nien and Jeffrey A. Fessler University of Michigan, Ann Arbor The 13th Fully 3D Meeting June 2, 2015 1/20 Statistical image reconstruction

More information

Key words. alternating direction method of multipliers, convex composite optimization, indefinite proximal terms, majorization, iteration-complexity

Key words. alternating direction method of multipliers, convex composite optimization, indefinite proximal terms, majorization, iteration-complexity A MAJORIZED ADMM WITH INDEFINITE PROXIMAL TERMS FOR LINEARLY CONSTRAINED CONVEX COMPOSITE OPTIMIZATION MIN LI, DEFENG SUN, AND KIM-CHUAN TOH Abstract. This paper presents a majorized alternating direction

More information