Contraction Methods for Convex Optimization and monotone variational inequalities No.12


XII - 1

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.12

Linearized alternating direction methods of multipliers for separable convex programming

Bingsheng He
Department of Mathematics, Nanjing University
hebma@nju.edu.cn

XII - 2

1  Structured constrained convex optimization

We consider the following structured constrained convex optimization problem

$$ \min \{\, \theta_1(x) + \theta_2(y) \mid Ax + By = b,\ x \in \mathcal{X},\ y \in \mathcal{Y} \,\} \tag{1.1} $$

where $\theta_1 : \mathbb{R}^{n_1} \to \mathbb{R}$ and $\theta_2 : \mathbb{R}^{n_2} \to \mathbb{R}$ are convex functions (but not necessarily smooth), $A \in \mathbb{R}^{m \times n_1}$, $B \in \mathbb{R}^{m \times n_2}$ and $b \in \mathbb{R}^m$, and $\mathcal{X} \subset \mathbb{R}^{n_1}$, $\mathcal{Y} \subset \mathbb{R}^{n_2}$ are given closed convex sets. We let $n = n_1 + n_2$.

Solving problem (1.1) amounts to finding $(x^*, y^*, \lambda^*) \in \Omega$ such that

$$ \begin{cases} \theta_1(x) - \theta_1(x^*) + (x - x^*)^T(-A^T\lambda^*) \ge 0, \\ \theta_2(y) - \theta_2(y^*) + (y - y^*)^T(-B^T\lambda^*) \ge 0, \\ (\lambda - \lambda^*)^T(Ax^* + By^* - b) \ge 0, \end{cases} \qquad \forall\, (x, y, \lambda) \in \Omega, \tag{1.2} $$

where $\Omega = \mathcal{X} \times \mathcal{Y} \times \mathbb{R}^m$.

XII - 3

By denoting

$$ u = \begin{pmatrix} x \\ y \end{pmatrix}, \quad w = \begin{pmatrix} x \\ y \\ \lambda \end{pmatrix}, \quad F(w) = \begin{pmatrix} -A^T\lambda \\ -B^T\lambda \\ Ax + By - b \end{pmatrix} \quad \text{and} \quad \theta(u) = \theta_1(x) + \theta_2(y), $$

the first-order optimality condition (1.2) can be written in the compact form

$$ w^* \in \Omega, \qquad \theta(u) - \theta(u^*) + (w - w^*)^T F(w^*) \ge 0, \quad \forall\, w \in \Omega. \tag{1.3} $$

Note that the mapping $F$ is monotone. We use $\Omega^*$ to denote the solution set of the variational inequality (1.3). For convenience we use the notations $v = (y, \lambda)$ and $\mathcal{V}^* = \{ (y^*, \lambda^*) \mid (x^*, y^*, \lambda^*) \in \Omega^* \}$.

XII - 4

The Alternating Direction Method (ADM) is a simple but powerful algorithm that is well suited to distributed convex optimization [1]. This approach also has the benefit that one algorithm can be flexible enough to solve many problems. Applying the ADM to the structured convex optimization problem (1.1), the iteration goes from $(y^k, \lambda^k)$ to $(y^{k+1}, \lambda^{k+1})$:

First, for given $(y^k, \lambda^k)$, $x^{k+1}$ is the solution of the following problem:

$$ x^{k+1} = \arg\min \Big\{\, \theta_1(x) + \tfrac{\beta}{2}\big\| Ax + By^k - b - \tfrac{1}{\beta}\lambda^k \big\|^2 \;\Big|\; x \in \mathcal{X} \,\Big\}. \tag{1.4a} $$

Then, using $\lambda^k$ and the obtained $x^{k+1}$, $y^{k+1}$ is the solution of the following problem:

$$ y^{k+1} = \arg\min \Big\{\, \theta_2(y) + \tfrac{\beta}{2}\big\| Ax^{k+1} + By - b - \tfrac{1}{\beta}\lambda^k \big\|^2 \;\Big|\; y \in \mathcal{Y} \,\Big\}. \tag{1.4b} $$

Finally,

$$ \lambda^{k+1} = \lambda^k - \beta\,(Ax^{k+1} + By^{k+1} - b). \tag{1.4c} $$
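
To make the iteration (1.4) concrete, here is a minimal numerical sketch (not from the slides), assuming the special case $\theta_1(x) = \tfrac12\|x-p\|^2$, $\theta_2(y) = \tfrac12\|y-q\|^2$ and $\mathcal{X} = \mathbb{R}^{n_1}$, $\mathcal{Y} = \mathbb{R}^{n_2}$, so that each subproblem reduces to a linear solve; $A$, $B$, $b$, $p$, $q$ and $\beta$ are illustrative placeholders.

```python
import numpy as np

def admm_quadratic(A, B, b, p, q, beta=1.0, iters=200):
    """ADM iteration (1.4) for the special case
    theta_1(x) = 0.5*||x - p||^2, theta_2(y) = 0.5*||y - q||^2,
    with X = R^{n_1}, Y = R^{n_2}, so each subproblem is a linear solve."""
    m, n1 = A.shape
    n2 = B.shape[1]
    x, y, lam = np.zeros(n1), np.zeros(n2), np.zeros(m)
    for _ in range(iters):
        # (1.4a): (I + beta*A^T A) x = p - beta*A^T(B y^k - b - lam^k/beta)
        x = np.linalg.solve(np.eye(n1) + beta * A.T @ A,
                            p - beta * A.T @ (B @ y - b - lam / beta))
        # (1.4b): (I + beta*B^T B) y = q - beta*B^T(A x^{k+1} - b - lam^k/beta)
        y = np.linalg.solve(np.eye(n2) + beta * B.T @ B,
                            q - beta * B.T @ (A @ x - b - lam / beta))
        # (1.4c): multiplier update
        lam = lam - beta * (A @ x + B @ y - b)
    return x, y, lam
```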

XII - 5

In some structured convex optimization problems (1.1), $B$ is a scalar matrix. However, the subproblem (1.4a) does not have a closed-form solution because of the general structure of the matrix $A$. In this case, we linearize the quadratic term of (1.4a),

$$ \tfrac{\beta}{2}\big\| Ax + By^k - b - \tfrac{1}{\beta}\lambda^k \big\|^2, $$

at $x^k$ and add a proximal term $\tfrac{r}{2}\|x - x^k\|^2$ to the objective function. In other words, instead of (1.4a), we solve the following $x$-subproblem:

$$ \min \Big\{\, \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \tfrac{r}{2}\|x - x^k\|^2 \;\Big|\; x \in \mathcal{X} \,\Big\}. $$

Based on linearizing the quadratic term of (1.4a), in this lecture we construct the linearized alternating direction method. We still assume that the problem

$$ \min \Big\{\, \theta_1(x) + \tfrac{r}{2}\|x - a\|^2 \;\Big|\; x \in \mathcal{X} \,\Big\} \tag{1.5} $$

has a closed-form solution.

XII - 6

2  Linearized Alternating Direction Method

In the Linearized ADM, $x$ is not an intermediate variable. The $k$-th iteration of the Linearized ADM is from $(x^k, y^k, \lambda^k)$ to $(x^{k+1}, y^{k+1}, \lambda^{k+1})$.

2.1  Linearized ADM

1. First, for given $(x^k, y^k, \lambda^k)$, $x^{k+1}$ is the solution of the following problem:

$$ x^{k+1} = \arg\min \Big\{\, \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \tfrac{r}{2}\|x - x^k\|^2 \;\Big|\; x \in \mathcal{X} \,\Big\}. \tag{2.1a} $$

2. Then, using $\lambda^k$ and the obtained $x^{k+1}$, $y^{k+1}$ is the solution of the following problem:

$$ y^{k+1} = \arg\min \Big\{\, \theta_2(y) + \tfrac{\beta}{2}\big\| Ax^{k+1} + By - b - \tfrac{1}{\beta}\lambda^k \big\|^2 \;\Big|\; y \in \mathcal{Y} \,\Big\}. \tag{2.1b} $$

3. Finally,

$$ \lambda^{k+1} = \lambda^k - \beta\,(Ax^{k+1} + By^{k+1} - b). \tag{2.1c} $$
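
Continuing the same illustrative quadratic example (an assumption, not part of the slides), one iteration of the linearized ADM (2.1) can be sketched as follows; the point is that the $x$-update no longer requires solving a linear system in $A$.

```python
import numpy as np

def linearized_adm_step(x, y, lam, A, B, b, p, q, beta, r):
    """One iteration of the linearized ADM (2.1) for the illustrative case
    theta_1(x) = 0.5*||x - p||^2 with X = R^{n_1}: the x-update (2.1a) becomes
    a closed-form proximal step, no linear system in A is solved."""
    # (2.1a): minimize 0.5*||x - p||^2 + beta*x^T A^T s + (r/2)*||x - x^k||^2
    s = A @ x + B @ y - b - lam / beta
    x_new = (p + r * x - beta * A.T @ s) / (1.0 + r)
    # (2.1b): y-subproblem, as in the standard ADM but with x_new
    n2 = B.shape[1]
    y_new = np.linalg.solve(np.eye(n2) + beta * B.T @ B,
                            q - beta * B.T @ (A @ x_new - b - lam / beta))
    # (2.1c): multiplier update
    lam_new = lam - beta * (A @ x_new + B @ y_new - b)
    return x_new, y_new, lam_new
```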

XII - 7

Requirements on the parameters $\beta$, $r$: for given $\beta > 0$, choose $r$ such that

$$ r I_{n_1} - \beta A^T A \succeq 0. \tag{2.2} $$

Analysis of the optimality conditions of the subproblems in (2.1). Note that $x^{k+1}$, the solution of (2.1a), satisfies

$$ x^{k+1} \in \mathcal{X}, \quad \theta_1(x) - \theta_1(x^{k+1}) + (x - x^{k+1})^T \big\{ -A^T\lambda^k + \beta A^T\big(Ax^k + By^k - b\big) + r(x^{k+1} - x^k) \big\} \ge 0, \quad \forall\, x \in \mathcal{X}. \tag{2.3a} $$

Similarly, the solution $y^{k+1}$ of (2.1b) satisfies

$$ y^{k+1} \in \mathcal{Y}, \quad \theta_2(y) - \theta_2(y^{k+1}) + (y - y^{k+1})^T \big\{ -B^T\lambda^k + \beta B^T\big(Ax^{k+1} + By^{k+1} - b\big) \big\} \ge 0, \quad \forall\, y \in \mathcal{Y}. \tag{2.3b} $$
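
The threshold in (2.2) can be computed directly; a minimal sketch ($A$ and $\beta$ are placeholders, and for very large $A$ a power iteration on $A^TA$ would be preferable):

```python
import numpy as np

def smallest_feasible_r(A, beta):
    """Condition (2.2), r*I - beta*A^T A >= 0, holds iff
    r >= beta * lambda_max(A^T A); return that threshold."""
    return beta * np.linalg.eigvalsh(A.T @ A).max()
```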

XII - 8

Substituting $\lambda^{k+1}$ (see (2.1c)) into (2.3) (eliminating $\lambda^k$), we get

$$ x^{k+1} \in \mathcal{X}, \quad \theta_1(x) - \theta_1(x^{k+1}) + (x - x^{k+1})^T \big\{ -A^T\lambda^{k+1} + \beta A^T B(y^k - y^{k+1}) + (rI_{n_1} - \beta A^T A)(x^{k+1} - x^k) \big\} \ge 0, \quad \forall\, x \in \mathcal{X}, \tag{2.4a} $$

and

$$ y^{k+1} \in \mathcal{Y}, \quad \theta_2(y) - \theta_2(y^{k+1}) + (y - y^{k+1})^T \big\{ -B^T\lambda^{k+1} \big\} \ge 0, \quad \forall\, y \in \mathcal{Y}. \tag{2.4b} $$

For convenience of analysis, we rewrite (2.4) in the following equivalent form:

$$ \theta(u) - \theta(u^{k+1}) + \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \end{pmatrix}^T \left\{ \begin{pmatrix} -A^T\lambda^{k+1} \\ -B^T\lambda^{k+1} \end{pmatrix} + \beta \begin{pmatrix} A^T \\ B^T \end{pmatrix} B(y^k - y^{k+1}) + \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 \\ 0 & \beta B^T B \end{pmatrix} \begin{pmatrix} x^{k+1} - x^k \\ y^{k+1} - y^k \end{pmatrix} \right\} \ge 0, \quad \forall\, (x, y) \in \mathcal{X} \times \mathcal{Y}. $$

XII - 9

Combining the last inequality with (2.1c), we have

$$ \theta(u) - \theta(u^{k+1}) + \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^T \left\{ \begin{pmatrix} -A^T\lambda^{k+1} \\ -B^T\lambda^{k+1} \\ Ax^{k+1} + By^{k+1} - b \end{pmatrix} + \beta \begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - y^{k+1}) + \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \tfrac{1}{\beta} I_m \end{pmatrix} \begin{pmatrix} x^{k+1} - x^k \\ y^{k+1} - y^k \\ \lambda^{k+1} - \lambda^k \end{pmatrix} \right\} \ge 0 $$

for all $(x, y, \lambda) \in \mathcal{X} \times \mathcal{Y} \times \mathbb{R}^m$. The above inequality can be rewritten as

$$ \theta(u) - \theta(u^{k+1}) + (w - w^{k+1})^T F(w^{k+1}) + \beta \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^T \begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - y^{k+1}) \;\ge\; \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^T \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \tfrac{1}{\beta} I_m \end{pmatrix} \begin{pmatrix} x^k - x^{k+1} \\ y^k - y^{k+1} \\ \lambda^k - \lambda^{k+1} \end{pmatrix}, \quad \forall\, w \in \Omega. \tag{2.5} $$

XII - 10

2.2  Convergence of the Linearized ADM

Based on the analysis in the last section, we have the following lemma.

Lemma 2.1. Let $w^{k+1} = (x^{k+1}, y^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by (2.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have

$$ (w^{k+1} - w^*)^T G (w^k - w^{k+1}) \ge (w^{k+1} - w^*)^T \eta(y^k, y^{k+1}), \quad \forall\, w^* \in \Omega^*, \tag{2.6} $$

where

$$ \eta(y^k, y^{k+1}) = \beta \begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - y^{k+1}) \tag{2.7} $$

and

$$ G = \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \tfrac{1}{\beta} I_m \end{pmatrix}. \tag{2.8} $$

XII - 11

Proof. Setting $w = w^*$ in (2.5), and using $G$ and $\eta(y^k, y^{k+1})$, we get

$$ (w^{k+1} - w^*)^T G (w^k - w^{k+1}) \ge (w^{k+1} - w^*)^T \eta(y^k, y^{k+1}) + \theta(u^{k+1}) - \theta(u^*) + (w^{k+1} - w^*)^T F(w^{k+1}). $$

Since $F$ is monotone, it follows that

$$ \theta(u^{k+1}) - \theta(u^*) + (w^{k+1} - w^*)^T F(w^{k+1}) \ge \theta(u^{k+1}) - \theta(u^*) + (w^{k+1} - w^*)^T F(w^*) \ge 0. $$

The last inequality is due to $w^{k+1} \in \Omega$ and the fact that $w^* \in \Omega^*$ is a solution of (1.3). The lemma is proved. $\Box$

By using $\eta(y^k, y^{k+1})$ (see (2.7)), $Ax^* + By^* = b$ and (2.1c), we have

$$ (w^{k+1} - w^*)^T \eta(y^k, y^{k+1}) = \big(B(y^k - y^{k+1})\big)^T \beta \big\{ (Ax^{k+1} + By^{k+1}) - (Ax^* + By^*) \big\} = \big(B(y^k - y^{k+1})\big)^T \beta \,(Ax^{k+1} + By^{k+1} - b) = (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}). \tag{2.9} $$

XII - 12

Substituting it into (2.6), we obtain

$$ (w^{k+1} - w^*)^T G (w^k - w^{k+1}) \ge (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}), \quad \forall\, w^* \in \Omega^*. \tag{2.10} $$

Lemma 2.2. Let $w^{k+1} = (x^{k+1}, y^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by (2.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have

$$ (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}) \ge 0. \tag{2.11} $$

Proof. Since (2.4b) holds for both the $k$-th iteration and the previous iteration, we have

$$ \theta_2(y) - \theta_2(y^{k+1}) + (y - y^{k+1})^T \{-B^T\lambda^{k+1}\} \ge 0, \quad \forall\, y \in \mathcal{Y}, \tag{2.12} $$

and

$$ \theta_2(y) - \theta_2(y^k) + (y - y^k)^T \{-B^T\lambda^k\} \ge 0, \quad \forall\, y \in \mathcal{Y}. \tag{2.13} $$

XII - 13

Setting $y = y^k$ in (2.12) and $y = y^{k+1}$ in (2.13), respectively, and then adding the two resulting inequalities, we get

$$ (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}) \ge 0. $$

The assertion of this lemma is proved. $\Box$

Under the assumption (2.2), the matrix $G$ is positive semi-definite; if, in addition, $B$ has full column rank, then $G$ is positive definite. Even in the positive semi-definite case, we still use $\|w' - w\|_G$ to denote

$$ \|w' - w\|_G = \sqrt{(w' - w)^T G (w' - w)}. $$

Lemma 2.3. Let $w^{k+1} = (x^{k+1}, y^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by (2.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have

$$ (w^{k+1} - w^*)^T G (w^k - w^{k+1}) \ge 0, \quad \forall\, w^* \in \Omega^*, \tag{2.14} $$

XII - 14

and consequently

$$ \|w^{k+1} - w^*\|_G^2 \le \|w^k - w^*\|_G^2 - \|w^k - w^{k+1}\|_G^2, \quad \forall\, w^* \in \Omega^*. \tag{2.15} $$

Proof. Assertion (2.14) follows from (2.10) and (2.11) directly. By using (2.14), we have

$$ \|w^k - w^*\|_G^2 = \|(w^{k+1} - w^*) + (w^k - w^{k+1})\|_G^2 = \|w^{k+1} - w^*\|_G^2 + 2 (w^{k+1} - w^*)^T G (w^k - w^{k+1}) + \|w^k - w^{k+1}\|_G^2 \ge \|w^{k+1} - w^*\|_G^2 + \|w^k - w^{k+1}\|_G^2, $$

and thus (2.15) is proved. $\Box$

The inequality (2.15) is essential for the convergence of the linearized alternating direction method. Note that $G$ is positive semi-definite and

$$ \|w^k - w^{k+1}\|_G^2 = 0 \iff G(w^k - w^{k+1}) = 0. $$
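
Since (2.15) measures the progress of the iteration in the $G$-(semi-)norm, a practical stopping rule is to iterate until $\|w^k - w^{k+1}\|_G$ is small. A small helper (a sketch, assuming the quantities of (2.8)) computes this quantity without forming $G$ explicitly:

```python
import numpy as np

def G_norm_sq(dx, dy, dlam, A, B, beta, r):
    """Squared (semi-)norm ||(dx, dy, dlam)||_G^2 with G from (2.8),
    computed without forming G:
        r*||dx||^2 - beta*||A dx||^2 + beta*||B dy||^2 + (1/beta)*||dlam||^2."""
    Adx, Bdy = A @ dx, B @ dy
    return r * dx @ dx - beta * Adx @ Adx + beta * Bdy @ Bdy + dlam @ dlam / beta
```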

XII - 15

The inequality (2.15) can be written as

$$ \|x^{k+1} - x^*\|_{(rI - \beta A^T A)}^2 + \beta \|B(y^{k+1} - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^{k+1} - \lambda^*\|^2 \;\le\; \|x^k - x^*\|_{(rI - \beta A^T A)}^2 + \beta \|B(y^k - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \lambda^*\|^2 - \Big( \|x^k - x^{k+1}\|_{(rI - \beta A^T A)}^2 + \beta \|B(y^k - y^{k+1})\|^2 + \tfrac{1}{\beta}\|\lambda^k - \lambda^{k+1}\|^2 \Big). $$

It leads to

$$ \lim_{k\to\infty} x^k = x^*, \qquad \lim_{k\to\infty} By^k = By^* \qquad \text{and} \qquad \lim_{k\to\infty} \lambda^k = \lambda^*. $$

The linearized ADM is also known as the split inexact Uzawa method in the image processing literature [9, 10].

XII - 16

3  Self-Adaptive ADM-based Contraction Method

In the last section, we obtained $x^{k+1}$ by solving the following $x$-subproblem:

$$ \min \Big\{\, \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \tfrac{r}{2}\|x - x^k\|^2 \;\Big|\; x \in \mathcal{X} \,\Big\}, $$

and the parameter $r$ was required to satisfy

$$ rI_{n_1} - \beta A^T A \succeq 0 \iff r \ge \beta\,\lambda_{\max}(A^T A). $$

In some practical problems, a conservative estimate of $\lambda_{\max}(A^T A)$ leads to slow convergence. In this section, based on the linearized ADM, we consider self-adaptive contraction methods. Each iteration of the self-adaptive contraction methods consists of two steps: a prediction step and a correction step. From the given $w^k$, the prediction step produces a test vector $\tilde w^k$, and the correction step offers the new iterate $w^{k+1}$.

XII - 17

3.1  Prediction

1. First, for given $(x^k, y^k, \lambda^k)$, $\tilde x^k$ is the solution of the following problem:

$$ \tilde x^k = \arg\min \Big\{\, \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \tfrac{r}{2}\|x - x^k\|^2 \;\Big|\; x \in \mathcal{X} \,\Big\}. \tag{3.1a} $$

2. Then, using $\lambda^k$ and the obtained $\tilde x^k$, $\tilde y^k$ is the solution of the problem

$$ \tilde y^k = \arg\min \Big\{\, \theta_2(y) - (\lambda^k)^T(A\tilde x^k + By - b) + \tfrac{\beta}{2}\big\| A\tilde x^k + By - b \big\|^2 \;\Big|\; y \in \mathcal{Y} \,\Big\}. \tag{3.1b} $$

3. Finally,

$$ \tilde\lambda^k = \lambda^k - \beta\,(A\tilde x^k + B\tilde y^k - b). \tag{3.1c} $$

The subproblems in (3.1) are similar to those in (2.1). Instead of $(x^{k+1}, y^{k+1}, \lambda^{k+1})$ in (2.1), we denote the output of (3.1) by $(\tilde x^k, \tilde y^k, \tilde\lambda^k)$.

XII - 18

Requirements on the parameters $\beta$, $r$: for given $\beta > 0$, choose $r$ such that

$$ \beta \|A^T A (x^k - \tilde x^k)\| \le \nu r \|x^k - \tilde x^k\|, \qquad \nu \in (0, 1). \tag{3.2} $$

If $rI - \beta A^T A \succeq 0$, then (3.2) is satisfied; thus, (2.2) is sufficient for (3.2).

Analysis of the optimality conditions of the subproblems in (3.1). Because $\tilde w^k$ in (3.1) is obtained by substituting $w^{k+1}$ in (2.1) with $\tilde w^k$, similarly to (2.5) we get

$$ \theta(u) - \theta(\tilde u^k) + \begin{pmatrix} x - \tilde x^k \\ y - \tilde y^k \\ \lambda - \tilde\lambda^k \end{pmatrix}^T \left\{ \begin{pmatrix} -A^T\tilde\lambda^k \\ -B^T\tilde\lambda^k \\ A\tilde x^k + B\tilde y^k - b \end{pmatrix} + \beta \begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - \tilde y^k) + \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \tfrac{1}{\beta} I_m \end{pmatrix} \begin{pmatrix} \tilde x^k - x^k \\ \tilde y^k - y^k \\ \tilde\lambda^k - \lambda^k \end{pmatrix} \right\} \ge 0, \quad \forall\, w \in \Omega. $$
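
Condition (3.2) can be monitored cheaply during the iterations. The check below follows (3.2) directly; the commented adaptive rule for enlarging $r$ when the check fails is only an assumed illustration (the slides do not prescribe a specific update rule), and `prediction_step` is a hypothetical routine implementing (3.1):

```python
import numpy as np

def condition_32_ratio(x, x_tilde, A, beta, r):
    """Ratio beta*||A^T A (x - x_tilde)|| / (r*||x - x_tilde||);
    condition (3.2) asks for this ratio to be <= nu < 1."""
    d = x - x_tilde
    nd = np.linalg.norm(d)
    if nd == 0.0:
        return 0.0
    return beta * np.linalg.norm(A.T @ (A @ d)) / (r * nd)

# An assumed (not prescribed by the slides) self-adaptive rule: if the ratio
# exceeds nu, enlarge r and redo the prediction step (3.1), e.g.
#   while condition_32_ratio(x, x_tilde, A, beta, r) > nu:
#       r *= 1.5
#       x_tilde, y_tilde, lam_tilde = prediction_step(x, y, lam, r)  # hypothetical
```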

XII - 19

The last variational inequality can be rewritten as

$$ \tilde w^k \in \Omega, \qquad \theta(u) - \theta(\tilde u^k) + (w - \tilde w^k)^T \big\{ F(\tilde w^k) + \eta(y^k, \tilde y^k) + HM(\tilde w^k - w^k) \big\} \ge 0, \quad \forall\, w \in \Omega, \tag{3.3} $$

where

$$ \eta(y^k, \tilde y^k) = \beta \begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - \tilde y^k), \tag{3.4} $$

$$ H = \begin{pmatrix} rI_{n_1} & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \tfrac{1}{\beta} I_m \end{pmatrix}, \tag{3.5} $$

XII - 20

and

$$ M = \begin{pmatrix} I_{n_1} - \tfrac{\beta}{r} A^T A & 0 & 0 \\ 0 & I_{n_2} & 0 \\ 0 & 0 & I_m \end{pmatrix}. \tag{3.6} $$

Based on the above analysis, we have the following lemma.

Lemma 3.1. Let $\tilde w^k = (\tilde x^k, \tilde y^k, \tilde\lambda^k) \in \Omega$ be generated by (3.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have

$$ (\tilde w^k - w^*)^T H M (w^k - \tilde w^k) \ge (\tilde w^k - w^*)^T \eta(y^k, \tilde y^k), \quad \forall\, w^* \in \Omega^*. \tag{3.7} $$

Proof. Setting $w = w^*$ in (3.3), we obtain

$$ (\tilde w^k - w^*)^T H M (w^k - \tilde w^k) \ge (\tilde w^k - w^*)^T \eta(y^k, \tilde y^k) + \theta(\tilde u^k) - \theta(u^*) + (\tilde w^k - w^*)^T F(\tilde w^k). \tag{3.8} $$

XII - 21

Since $F$ is monotone and $\tilde w^k \in \Omega$, it follows that

$$ \theta(\tilde u^k) - \theta(u^*) + (\tilde w^k - w^*)^T F(\tilde w^k) \ge \theta(\tilde u^k) - \theta(u^*) + (\tilde w^k - w^*)^T F(w^*) \ge 0. $$

The last inequality is due to $\tilde w^k \in \Omega$ and the fact that $w^* \in \Omega^*$ is a solution of (1.3). Substituting it into the right-hand side of (3.8), the lemma is proved. $\Box$

In addition, because $Ax^* + By^* = b$ and $\beta(A\tilde x^k + B\tilde y^k - b) = \lambda^k - \tilde\lambda^k$, we have

$$ (\tilde w^k - w^*)^T \eta(y^k, \tilde y^k) = \big(B(y^k - \tilde y^k)\big)^T \beta \big\{ (A\tilde x^k + B\tilde y^k) - (Ax^* + By^*) \big\} = (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k). \tag{3.9} $$

Lemma 3.2. Let $\tilde w^k = (\tilde x^k, \tilde y^k, \tilde\lambda^k) \in \Omega$ be generated by (3.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have

$$ (w^k - w^*)^T H M (w^k - \tilde w^k) \ge \varphi(w^k, \tilde w^k), \quad \forall\, w^* \in \Omega^*, \tag{3.10} $$

XII - 22

where

$$ \varphi(w^k, \tilde w^k) = (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) + (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k). \tag{3.11} $$

Proof. From (3.7) and (3.9) we have

$$ (\tilde w^k - w^*)^T H M (w^k - \tilde w^k) \ge (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k). $$

Assertion (3.10) follows from the last inequality and the definition of $\varphi(w^k, \tilde w^k)$ directly. $\Box$

3.2  The Primary Contraction Method

The primary contraction method uses $M(w^k - \tilde w^k)$ as the search direction with unit step length. In other words, the new iterate is given by

$$ w^{k+1} = w^k - M(w^k - \tilde w^k). \tag{3.12} $$

XII - 23

According to (3.6), it can be written as

$$ \begin{pmatrix} x^{k+1} \\ y^{k+1} \\ \lambda^{k+1} \end{pmatrix} = \begin{pmatrix} \tilde x^k + \tfrac{\beta}{r} A^T A (x^k - \tilde x^k) \\ \tilde y^k \\ \tilde\lambda^k \end{pmatrix}. \tag{3.13} $$

In the primary contraction method, only the $x$-part of the corrector differs from the predictor. In the method of Section 2, we need $rI \succeq \beta A^T A$; with the method in this section, we only need $r$ to satisfy the condition (3.2). In practical computation, we may try using the average of the eigenvalues of $\beta A^T A$.

Using (3.10), we have

$$ \|w^k - w^*\|_H^2 - \|w^{k+1} - w^*\|_H^2 = \|w^k - w^*\|_H^2 - \|(w^k - w^*) - M(w^k - \tilde w^k)\|_H^2 = 2 (w^k - w^*)^T H M (w^k - \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2 \ge 2 \varphi(w^k, \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2. \tag{3.14} $$
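
The correction step (3.13) is a one-line update of the $x$-part; a minimal sketch:

```python
import numpy as np

def primary_correction(x, x_tilde, y_tilde, lam_tilde, A, beta, r):
    """Correction step (3.13): only the x-part of the predictor is modified,
    y^{k+1} and lambda^{k+1} are taken directly from the predictor."""
    x_new = x_tilde + (beta / r) * (A.T @ (A @ (x - x_tilde)))
    return x_new, y_tilde, lam_tilde
```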

XII - 24

Because $(\tilde y^k, \tilde\lambda^k) = (y^{k+1}, \lambda^{k+1})$, the inequality (2.11) still holds, and thus

$$ (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k) \ge 0. $$

Therefore, it follows from (3.14), (3.11) and the last inequality that

$$ \|w^k - w^*\|_H^2 - \|w^{k+1} - w^*\|_H^2 \ge 2 (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2. \tag{3.15} $$

Lemma 3.3. Under the condition (3.2), we have

$$ 2 (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2 \ge (1 - \nu^2)\, r \|x^k - \tilde x^k\|^2 + \beta \|B(y^k - \tilde y^k)\|^2 + \tfrac{1}{\beta} \|\lambda^k - \tilde\lambda^k\|^2. \tag{3.16} $$

Proof. First, we have

$$ 2 (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2 = (w^k - \tilde w^k)^T \big( M^T H + H M - M^T H M \big) (w^k - \tilde w^k). $$

XII - 25

By using the structure of the matrices $H$ and $M$ (see (3.5) and (3.6)), we obtain

$$ M^T H + H M - M^T H M = H - (I - M)^T H (I - M) = \begin{pmatrix} rI_{n_1} - r\big(\tfrac{\beta}{r} A^T A\big)^2 & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \tfrac{1}{\beta} I_m \end{pmatrix}. $$

Therefore,

$$ 2 (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2 = \|w^k - \tilde w^k\|_H^2 - r\,\tfrac{\beta^2}{r^2} \|A^T A (x^k - \tilde x^k)\|^2. \tag{3.17} $$

Under the condition (3.2), we have

$$ \tfrac{\beta^2}{r^2} \|A^T A (x^k - \tilde x^k)\|^2 \le \nu^2 \|x^k - \tilde x^k\|^2. $$

Substituting this into (3.17), the assertion of the lemma is proved. $\Box$

Theorem 3.1. Let $\tilde w^k = (\tilde x^k, \tilde y^k, \tilde\lambda^k) \in \Omega$ be generated by (3.1) from the given

XII - 26

$w^k = (x^k, y^k, \lambda^k)$, and let the new iterate $w^{k+1}$ be given by (3.12). Then the sequence $\{w^k = (x^k, y^k, \lambda^k)\}$ generated by the primary contraction method satisfies

$$ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - (1 - \nu^2)\, \|w^k - \tilde w^k\|_H^2. \tag{3.18} $$

Proof. From (3.15) and (3.16) we obtain

$$ \|w^k - w^*\|_H^2 - \|w^{k+1} - w^*\|_H^2 \ge (1 - \nu^2)\, r \|x^k - \tilde x^k\|^2 + \beta \|B(y^k - \tilde y^k)\|^2 + \tfrac{1}{\beta} \|\lambda^k - \tilde\lambda^k\|^2 \ge (1 - \nu^2)\, \|w^k - \tilde w^k\|_H^2. $$

The assertion of this theorem is proved. $\Box$

Theorem 3.1 is essential for the convergence of the primary contraction method.

XII - 27

3.3  The General Contraction Method

For given $w^k$, we use

$$ w(\alpha) = w^k - \alpha M(w^k - \tilde w^k) \tag{3.19} $$

to define the $\alpha$-dependent new iterate. For any $w^* \in \Omega^*$, we define

$$ \vartheta(\alpha) := \|w^k - w^*\|_H^2 - \|w(\alpha) - w^*\|_H^2 \tag{3.20} $$

and

$$ q(\alpha) = 2\alpha\, \varphi(w^k, \tilde w^k) - \alpha^2 \|M(w^k - \tilde w^k)\|_H^2, \tag{3.21} $$

where $\varphi(w^k, \tilde w^k)$ is defined in (3.11).

Theorem 3.2. Let $w(\alpha)$ be defined by (3.19). For any $w^* \in \Omega^*$ and $\alpha \ge 0$, we have

$$ \vartheta(\alpha) \ge q(\alpha), \tag{3.22} $$

where $\vartheta(\alpha)$ and $q(\alpha)$ are defined in (3.20) and (3.21), respectively.

XII - 28

Proof. It follows from (3.19) and (3.20) that

$$ \vartheta(\alpha) = \|w^k - w^*\|_H^2 - \|(w^k - w^*) - \alpha M(w^k - \tilde w^k)\|_H^2 = 2\alpha\, (w^k - w^*)^T H M (w^k - \tilde w^k) - \alpha^2 \|M(w^k - \tilde w^k)\|_H^2. $$

By using (3.10) and the definition of $q(\alpha)$, the theorem is proved. $\Box$

Note that $q(\alpha)$ in (3.21) is a quadratic function of $\alpha$ and it reaches its maximum at

$$ \alpha_k^* = \frac{\varphi(w^k, \tilde w^k)}{\|M(w^k - \tilde w^k)\|_H^2}. \tag{3.23} $$

In practical computation, we use

$$ w^{k+1} = w^k - \gamma \alpha_k^* M(w^k - \tilde w^k) \tag{3.24} $$

to update the new iterate, where $\gamma \in [1, 2)$ is a relaxation factor. By using (3.20) and (3.22), we have

$$ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - q(\gamma \alpha_k^*). \tag{3.25} $$
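
A sketch of the correction step (3.24): it computes $\varphi(w^k, \tilde w^k)$ from (3.11), the $H$-norm of $M(w^k - \tilde w^k)$ with $M$ from (3.6) and $H$ from (3.5), and then the step length (3.23). All inputs are assumed to come from the prediction step (3.1).

```python
import numpy as np

def general_correction(x, y, lam, x_t, y_t, lam_t, A, B, beta, r, gamma=1.8):
    """Correction step (3.24): compute the optimal step length (3.23) and set
    w^{k+1} = w^k - gamma*alpha_k*M(w^k - wtilde^k), with gamma in [1, 2)."""
    dx, dy, dlam = x - x_t, y - y_t, lam - lam_t
    Bdy = B @ dy
    AtAdx = A.T @ (A @ dx)
    # phi(w^k, wtilde^k) from (3.11), with HM = diag(rI - beta*A^T A, beta*B^T B, I/beta)
    phi = dx @ (r * dx - beta * AtAdx) + beta * Bdy @ Bdy + dlam @ dlam / beta + dlam @ Bdy
    # ||M(w^k - wtilde^k)||_H^2 with M from (3.6) and H from (3.5)
    Mdx = dx - (beta / r) * AtAdx
    Md_H2 = r * Mdx @ Mdx + beta * Bdy @ Bdy + dlam @ dlam / beta
    if Md_H2 == 0.0:          # predictor equals the current point: already a solution
        return x, y, lam
    alpha = phi / Md_H2       # optimal step length (3.23)
    return x - gamma * alpha * Mdx, y - gamma * alpha * dy, lam - gamma * alpha * dlam
```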

XII - 29

Note that

$$ q(\gamma \alpha_k^*) = 2\gamma \alpha_k^*\, \varphi(w^k, \tilde w^k) - (\gamma \alpha_k^*)^2 \|M(w^k - \tilde w^k)\|_H^2. \tag{3.26} $$

Using (3.23) and (3.24), we obtain

$$ q(\gamma \alpha_k^*) = \gamma(2 - \gamma)(\alpha_k^*)^2 \|M(w^k - \tilde w^k)\|_H^2 = \frac{2 - \gamma}{\gamma}\, \|w^k - w^{k+1}\|_H^2, $$

and consequently it follows from (3.25) that

$$ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - \frac{2 - \gamma}{\gamma}\, \|w^k - w^{k+1}\|_H^2, \quad \forall\, w^* \in \Omega^*. \tag{3.27} $$

On the other hand, it follows from (3.23) and (3.26) that

$$ q(\gamma \alpha_k^*) = \gamma(2 - \gamma)\, \alpha_k^*\, \varphi(w^k, \tilde w^k). \tag{3.28} $$

XII - 30

By using (3.11) and (3.16), we obtain

$$ \begin{aligned} 2\varphi(w^k, \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2 &= 2 (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) - \|M(w^k - \tilde w^k)\|_H^2 + 2 (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k) \\ &\ge (1 - \nu^2)\, r \|x^k - \tilde x^k\|^2 + \beta \|B(y^k - \tilde y^k)\|^2 + \tfrac{1}{\beta} \|\lambda^k - \tilde\lambda^k\|^2 + 2 (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k) \\ &= (1 - \nu^2)\, r \|x^k - \tilde x^k\|^2 + \beta \big\| B(y^k - \tilde y^k) + \tfrac{1}{\beta}(\lambda^k - \tilde\lambda^k) \big\|^2. \end{aligned} $$

Thus, we have $2\varphi(w^k, \tilde w^k) \ge \|M(w^k - \tilde w^k)\|_H^2$ and consequently $\alpha_k^* \ge \tfrac{1}{2}$.

XII - 31

In addition, we have

$$ \begin{aligned} \varphi(w^k, \tilde w^k) &= (w^k - \tilde w^k)^T H M (w^k - \tilde w^k) + (\lambda^k - \tilde\lambda^k)^T B(y^k - \tilde y^k) \\ &= \begin{pmatrix} x^k - \tilde x^k \\ y^k - \tilde y^k \\ \lambda^k - \tilde\lambda^k \end{pmatrix}^T \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & \tfrac{1}{2} B^T \\ 0 & \tfrac{1}{2} B & \tfrac{1}{\beta} I_m \end{pmatrix} \begin{pmatrix} x^k - \tilde x^k \\ y^k - \tilde y^k \\ \lambda^k - \tilde\lambda^k \end{pmatrix} \\ &\ge (x^k - \tilde x^k)^T \big( r (x^k - \tilde x^k) - \beta A^T A (x^k - \tilde x^k) \big) + \frac{1}{2} \begin{pmatrix} y^k - \tilde y^k \\ \lambda^k - \tilde\lambda^k \end{pmatrix}^T \begin{pmatrix} \beta B^T B & 0 \\ 0 & \tfrac{1}{\beta} I_m \end{pmatrix} \begin{pmatrix} y^k - \tilde y^k \\ \lambda^k - \tilde\lambda^k \end{pmatrix}. \end{aligned} $$

Under the condition (3.2), it follows from the last inequality that

$$ \varphi(w^k, \tilde w^k) \ge \min\big\{ (1 - \nu),\ \tfrac{1}{2} \big\}\, \|w^k - \tilde w^k\|_H^2 \ge \frac{1 - \nu}{2}\, \|w^k - \tilde w^k\|_H^2. \tag{3.29} $$

XII - 32

By using (3.25), (3.28), (3.29) and $\alpha_k^* \ge \tfrac{1}{2}$, we obtain the following theorem for the general contraction method.

Theorem 3.3. The sequence $\{w^k = (x^k, y^k, \lambda^k)\}$ generated by the general contraction method satisfies

$$ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - \frac{\gamma(2 - \gamma)(1 - \nu)}{4}\, \|w^k - \tilde w^k\|_H^2, \quad \forall\, w^* \in \Omega^*. \tag{3.30} $$

The inequality (3.30) in Theorem 3.3 is essential for the convergence of the general contraction method. Both inequalities (3.18) and (3.30) can be written as

$$ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - c_0\, \|w^k - \tilde w^k\|_H^2, \quad \forall\, w^* \in \Omega^*, $$

XII - 33

where $c_0 > 0$ is a constant. Therefore, we have

$$ r\|x^{k+1} - x^*\|^2 + \beta \|B(y^{k+1} - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^{k+1} - \lambda^*\|^2 \;\le\; r\|x^k - x^*\|^2 + \beta \|B(y^k - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \lambda^*\|^2 - c_0 \Big( r\|x^k - \tilde x^k\|^2 + \beta \|B(y^k - \tilde y^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde\lambda^k\|^2 \Big). $$

It leads to

$$ \lim_{k\to\infty} \Big( r\|x^k - \tilde x^k\|^2 + \beta \|B(y^k - \tilde y^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde\lambda^k\|^2 \Big) = 0 $$

and

$$ \lim_{k\to\infty} x^k = x^*, \qquad \lim_{k\to\infty} By^k = By^* \qquad \text{and} \qquad \lim_{k\to\infty} \lambda^k = \lambda^*. $$

XII - 34

4  Applications to $\ell_1$-norm problems

An important $\ell_1$-norm problem in the area of machine learning is $\ell_1$-regularized linear regression, also called the lasso [8]. It involves solving

$$ \min\; \tau \|x\|_1 + \tfrac{1}{2}\|Ax - b\|_2^2, \tag{4.1} $$

where $\tau > 0$ is a scalar regularization parameter that is usually chosen by cross-validation. In typical applications, there are many more features than training examples, and the goal is to find a parsimonious model for the data. The problem (4.1) can be reformulated as

$$ \min \big\{\, \tau \|x\|_1 + \tfrac{1}{2}\|y\|_2^2 \;\big|\; Ax - y = b \,\big\}, \tag{4.2} $$

which is of the form (1.1). Applying the alternating direction method (1.4) to the problem (4.2), the $x$-subproblem is

$$ x^{k+1} = \arg\min \Big\{\, \tau \|x\|_1 + \tfrac{\beta}{2}\big\| (Ax - y^k - b) - \tfrac{1}{\beta}\lambda^k \big\|_2^2 \,\Big\}, $$

XII - 35

and its solution does not have a closed form.

Applying the linearized alternating direction method (2.1) to the problem (4.2), the $x$-subproblem (2.1a) is

$$ x^{k+1} = \arg\min \Big\{\, \tau \|x\|_1 + \tfrac{r}{2}\big\| x - \big[\, x^k + \tfrac{1}{r} A^T\lambda^k - \tfrac{\beta}{r} A^T(Ax^k - y^k - b) \,\big] \big\|^2 \,\Big\}. \tag{4.3} $$

This problem is of the form (1.5) and its solution has the following closed form:

$$ x^{k+1} = a - P_{B_{\tau/r}}[a], \qquad \text{where } a = x^k + \tfrac{1}{r} A^T\lambda^k - \tfrac{\beta}{r} A^T(Ax^k - y^k - b) $$

and $B_{\tau/r} = \{ \xi \in \mathbb{R}^n \mid -(\tau/r)e \le \xi \le (\tau/r)e \}$.

By using the linearized alternating direction method in Section 2, for given $\beta > 0$, we need $r \ge \beta\,\lambda_{\max}(A^T A)$. By using the self-adaptive ADM-based contraction method in Section 3, we only need $r$ to satisfy

$$ \beta \|A^T A (x^k - \tilde x^k)\| \le \nu r \|x^k - \tilde x^k\|, \qquad \nu \in (0, 1). $$

Because $A$ is a generic matrix, this condition can be satisfied even if $r$ is much less than $\beta\,\lambda_{\max}(A^T A)$.
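
A minimal sketch of the resulting linearized-ADM iteration for the lasso reformulation (4.2), written for the constraint $Ax - y = b$ (so $B = -I$ and $\theta_2(y) = \tfrac12\|y\|^2$); the $x$-update is the soft-thresholding (shrinkage) form of (4.3):

```python
import numpy as np

def shrink(a, kappa):
    """Soft-thresholding: argmin_x kappa*||x||_1 + 0.5*||x - a||^2,
    i.e. a - P_{B_kappa}[a] with B_kappa the l_inf ball of radius kappa."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def linearized_adm_lasso_step(x, y, lam, A, b, tau, beta, r):
    """One linearized-ADM iteration (2.1) for the lasso reformulation (4.2),
    with constraint A x - y = b (so B = -I, theta_2(y) = 0.5*||y||^2)."""
    # x-subproblem (4.3): prox of (tau/r)*||.||_1 at the point a
    a = x + (A.T @ lam) / r - (beta / r) * (A.T @ (A @ x - y - b))
    x_new = shrink(a, tau / r)
    # y-subproblem: min 0.5*||y||^2 + (beta/2)*||A x_new - y - b - lam/beta||^2
    y_new = beta * (A @ x_new - b - lam / beta) / (1.0 + beta)
    # multiplier update with B = -I
    lam_new = lam - beta * (A @ x_new - y_new - b)
    return x_new, y_new, lam_new
```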

XII - 36

References

[1] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3(1):1-122, 2011.
[2] D. Gabay, Applications of the method of multipliers to variational inequalities, in Augmented Lagrangian Methods: Applications to the Solution of Boundary-Value Problems, M. Fortin and R. Glowinski, eds., North-Holland, Amsterdam, The Netherlands, 1983.
[3] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximations, Computers and Mathematics with Applications, 2:17-40, 1976.
[4] R. Glowinski, Numerical Methods for Nonlinear Variational Problems, Springer-Verlag, New York, Berlin, Heidelberg, Tokyo, 1984.
[5] B.S. He, A class of projection and contraction methods for monotone variational inequalities, Applied Mathematics and Optimization, 35:69-76, 1997.
[6] B.S. He, Inexact implicit methods for monotone general variational inequalities, Mathematical Programming, Series A, 86, 1999.
[7] B.S. He, L.-Z. Liao, D.R. Han and H. Yang, A new inexact alternating directions method for monotone variational inequalities, Mathematical Programming, 92, 2002.
[8] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B, 58, 1996.
[9] X.Q. Zhang, M. Burger, X. Bresson and S. Osher, Bregmanized nonlocal regularization for deconvolution and sparse reconstruction, SIAM Journal on Imaging Sciences, 3, 2010.
[10] X.Q. Zhang, M. Burger and S. Osher, A unified primal-dual algorithm framework based on Bregman iteration, Journal of Scientific Computing, 46:20-46, 2010.
