Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.12
Linearized alternating direction methods of multipliers for separable convex programming

Bingsheng He
Department of Mathematics, Nanjing University
hebma@nju.edu.cn
1 Structured constrained convex optimization

We consider the following structured constrained convex optimization problem:
\[ \min \big\{ \theta_1(x) + \theta_2(y) \ \big|\ Ax + By = b,\ x \in \mathcal{X},\ y \in \mathcal{Y} \big\}, \tag{1.1} \]
where $\theta_1(x): R^{n_1} \to R$ and $\theta_2(y): R^{n_2} \to R$ are convex functions (but not necessarily smooth), $A \in R^{m \times n_1}$, $B \in R^{m \times n_2}$ and $b \in R^m$, and $\mathcal{X} \subset R^{n_1}$, $\mathcal{Y} \subset R^{n_2}$ are given closed convex sets. We let $n = n_1 + n_2$.

The task of solving problem (1.1) is to find an $(x^*, y^*, \lambda^*) \in \Omega$ such that
\[
\begin{cases}
\theta_1(x) - \theta_1(x^*) + (x - x^*)^T(-A^T\lambda^*) \ge 0, \\
\theta_2(y) - \theta_2(y^*) + (y - y^*)^T(-B^T\lambda^*) \ge 0, \\
(\lambda - \lambda^*)^T(Ax^* + By^* - b) \ge 0,
\end{cases}
\qquad \forall (x, y, \lambda) \in \Omega, \tag{1.2}
\]
where $\Omega = \mathcal{X} \times \mathcal{Y} \times R^m$.
By denoting
\[ u = \begin{pmatrix} x \\ y \end{pmatrix}, \quad w = \begin{pmatrix} x \\ y \\ \lambda \end{pmatrix}, \quad F(w) = \begin{pmatrix} -A^T\lambda \\ -B^T\lambda \\ Ax + By - b \end{pmatrix} \quad \text{and} \quad \theta(u) = \theta_1(x) + \theta_2(y), \]
the first-order optimality condition (1.2) can be written in the compact form
\[ w^* \in \Omega, \qquad \theta(u) - \theta(u^*) + (w - w^*)^T F(w^*) \ge 0, \quad \forall w \in \Omega. \tag{1.3} \]
Note that the mapping $F$ is monotone. We use $\Omega^*$ to denote the solution set of the variational inequality (1.3). For convenience we use the notations
\[ v = \begin{pmatrix} y \\ \lambda \end{pmatrix} \quad \text{and} \quad \mathcal{V}^* = \big\{ (y^*, \lambda^*) \ \big|\ (x^*, y^*, \lambda^*) \in \Omega^* \big\}. \]
The Alternating Direction Method (ADM) is a simple but powerful algorithm that is well suited to distributed convex optimization [1]. It also has the benefit that one algorithm is flexible enough to solve many problems.

Applying ADM to the structured COP: $(y^k, \lambda^k) \Rightarrow (y^{k+1}, \lambda^{k+1})$.

First, for given $(y^k, \lambda^k)$, $x^{k+1}$ is the solution of the following problem:
\[ x^{k+1} = \arg\min \Big\{ \theta_1(x) + \frac{\beta}{2}\big\| Ax + By^k - b - \tfrac{1}{\beta}\lambda^k \big\|^2 \ \Big|\ x \in \mathcal{X} \Big\}. \tag{1.4a} \]
Then, using $\lambda^k$ and the obtained $x^{k+1}$, $y^{k+1}$ is the solution of the following problem:
\[ y^{k+1} = \arg\min \Big\{ \theta_2(y) + \frac{\beta}{2}\big\| Ax^{k+1} + By - b - \tfrac{1}{\beta}\lambda^k \big\|^2 \ \Big|\ y \in \mathcal{Y} \Big\}. \tag{1.4b} \]
Finally,
\[ \lambda^{k+1} = \lambda^k - \beta(Ax^{k+1} + By^{k+1} - b). \tag{1.4c} \]
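To make the scheme (1.4) concrete, here is a minimal numerical sketch on a hypothetical toy instance of (1.1) with $\theta_1(x)=x^2$, $\theta_2(y)=y^2$, $A=B=1$, $b=1$ (the data are ours, not from the lecture). Both subproblems are solved in closed form by setting the gradient to zero, and the solution is $x^* = y^* = 0.5$:

```python
# Toy ADM (1.4): minimize x^2 + y^2 subject to x + y = 1.
beta = 1.0
x = y = lam = 0.0
for _ in range(200):
    # (1.4a): argmin_x x^2 + (beta/2)*(x + y^k - 1 - lam^k/beta)^2
    x = (beta * (1.0 - y) + lam) / (2.0 + beta)
    # (1.4b): argmin_y y^2 + (beta/2)*(x^{k+1} + y - 1 - lam^k/beta)^2
    y = (beta * (1.0 - x) + lam) / (2.0 + beta)
    # (1.4c): multiplier update
    lam = lam - beta * (x + y - 1.0)
print(round(x, 4), round(y, 4), round(lam, 4))
```

The iterates converge to $(x^*, y^*, \lambda^*) = (0.5,\, 0.5,\, 1)$, where $\lambda^* = 1$ follows from the stationarity condition $2x^* - \lambda^* = 0$.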
In some structured convex optimization problems of the form (1.1), $B$ is a scalar matrix, so the $y$-subproblem (1.4b) is easy. However, the $x$-subproblem (1.4a) does not have a closed-form solution because of the general structure of the matrix $A$. In this case, we linearize the quadratic term of (1.4a),
\[ \frac{\beta}{2}\big\| Ax + By^k - b - \tfrac{1}{\beta}\lambda^k \big\|^2, \]
at $x^k$ and add a proximal term $\frac{r}{2}\|x - x^k\|^2$ to the objective function. In other words, instead of (1.4a), we solve the following $x$-subproblem:
\[ \min \Big\{ \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \frac{r}{2}\|x - x^k\|^2 \ \Big|\ x \in \mathcal{X} \Big\}. \]
Based on linearizing the quadratic term of (1.4a), in this lecture we construct the linearized alternating direction method. We still assume that the problem
\[ \min \Big\{ \theta_1(x) + \frac{r}{2}\|x - a\|^2 \ \Big|\ x \in \mathcal{X} \Big\} \tag{1.5} \]
has a closed-form solution.
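When $\theta_1(x) = \tau\|x\|_1$ and $\mathcal{X} = R^n$, problem (1.5) is solved componentwise by soft-thresholding ("shrinkage"), which is exactly the case exploited for the lasso in Section 4. A minimal sketch (the numerical data below are ours, for illustration):

```python
import numpy as np

def shrink(a, kappa):
    """argmin_x kappa*||x||_1 + (1/2)*||x - a||^2, solved componentwise."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

tau, r = 0.5, 1.0
a = np.array([2.0, -0.3, 0.7])
x = shrink(a, tau / r)   # solves (1.5) with theta_1 = tau*||.||_1
print(x)                 # each component pulled toward 0 by tau/r, then clipped
```

Components with magnitude below the threshold $\tau/r$ are set exactly to zero, which is what makes this subproblem so cheap.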
2 Linearized Alternating Direction Method

In the linearized ADM, $x$ is not an intermediate variable. The $k$-th iteration of the linearized ADM is from $(x^k, y^k, \lambda^k)$ to $(x^{k+1}, y^{k+1}, \lambda^{k+1})$.

2.1 Linearized ADM

1. First, for given $(x^k, y^k, \lambda^k)$, $x^{k+1}$ is the solution of the following problem:
\[ x^{k+1} = \arg\min \Big\{ \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \frac{r}{2}\|x - x^k\|^2 \ \Big|\ x \in \mathcal{X} \Big\}. \tag{2.1a} \]
2. Then, using $\lambda^k$ and the obtained $x^{k+1}$, $y^{k+1}$ is the solution of the following problem:
\[ y^{k+1} = \arg\min \Big\{ \theta_2(y) + \frac{\beta}{2}\big\| Ax^{k+1} + By - b - \tfrac{1}{\beta}\lambda^k \big\|^2 \ \Big|\ y \in \mathcal{Y} \Big\}. \tag{2.1b} \]
3. Finally,
\[ \lambda^{k+1} = \lambda^k - \beta(Ax^{k+1} + By^{k+1} - b). \tag{2.1c} \]
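On the same hypothetical toy instance as before ($\theta_1(x)=x^2$, $\theta_2(y)=y^2$, $A=B=1$, $b=1$; our own data), the linearized ADM (2.1) can be sketched as follows. Since $A^TA = 1$, the forthcoming condition (2.2) asks for $r > \beta$, so we take $r = 1.5$ with $\beta = 1$:

```python
# Linearized ADM (2.1) on: minimize x^2 + y^2 subject to x + y = 1.
beta, r = 1.0, 1.5
x = y = lam = 0.0
for _ in range(2000):
    # (2.1a): argmin_x x^2 + beta*x*(x^k + y^k - 1 - lam^k/beta) + (r/2)*(x - x^k)^2
    x = (r * x - beta * (x + y - 1.0) + lam) / (2.0 + r)
    # (2.1b): exact y-subproblem, as in (1.4b)
    y = (beta * (1.0 - x) + lam) / (2.0 + beta)
    # (2.1c): multiplier update
    lam = lam - beta * (x + y - 1.0)
print(round(x, 4), round(y, 4))
```

Only the $x$-update differs from (1.4): the quadratic coupling term is replaced by its linearization at $x^k$ plus the proximal term, so each $x$-step costs just one matrix-vector product with $A$ and one with $A^T$.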
Requirements on the parameters $\beta, r$: for given $\beta > 0$, choose $r$ such that
\[ rI_{n_1} - \beta A^T A \succ 0. \tag{2.2} \]

Analysis of the optimality conditions of the subproblems in (2.1). Note that $x^{k+1}$, the solution of (2.1a), satisfies
\[ x^{k+1} \in \mathcal{X}, \quad \theta_1(x) - \theta_1(x^{k+1}) + (x - x^{k+1})^T\big\{ -A^T\lambda^k + \beta A^T(Ax^k + By^k - b) + r(x^{k+1} - x^k) \big\} \ge 0, \quad \forall x \in \mathcal{X}. \tag{2.3a} \]
Similarly, the solution $y^{k+1}$ of (2.1b) satisfies
\[ y^{k+1} \in \mathcal{Y}, \quad \theta_2(y) - \theta_2(y^{k+1}) + (y - y^{k+1})^T\big\{ -B^T\lambda^k + \beta B^T(Ax^{k+1} + By^{k+1} - b) \big\} \ge 0, \quad \forall y \in \mathcal{Y}. \tag{2.3b} \]
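Condition (2.2) holds exactly when $r > \beta\lambda_{\max}(A^TA)$, which can be checked numerically; a quick sketch on random data of our own:

```python
import numpy as np

# Requirement (2.2): r*I - beta*A^T A > 0  iff  r > beta * lambda_max(A^T A).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
beta = 0.5
r = 1.01 * beta * np.linalg.eigvalsh(A.T @ A).max()   # just above the threshold
M = r * np.eye(A.shape[1]) - beta * (A.T @ A)
print(np.linalg.eigvalsh(M).min() > 0)  # True
```

In practice one would estimate $\lambda_{\max}(A^TA)$ by a few power iterations instead of a full eigendecomposition.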
Substituting $\lambda^{k+1}$ (see (2.1c)) into (2.3) (eliminating $\lambda^k$), we get
\[ x^{k+1} \in \mathcal{X}, \quad \theta_1(x) - \theta_1(x^{k+1}) + (x - x^{k+1})^T\big\{ -A^T\lambda^{k+1} + \beta A^T B(y^k - y^{k+1}) + (rI_{n_1} - \beta A^T A)(x^{k+1} - x^k) \big\} \ge 0, \quad \forall x \in \mathcal{X}, \tag{2.4a} \]
and
\[ y^{k+1} \in \mathcal{Y}, \quad \theta_2(y) - \theta_2(y^{k+1}) + (y - y^{k+1})^T\{ -B^T\lambda^{k+1} \} \ge 0, \quad \forall y \in \mathcal{Y}. \tag{2.4b} \]
For convenience of analysis, we rewrite (2.4) in the following equivalent form:
\[
\theta(u) - \theta(u^{k+1}) + \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \end{pmatrix}^T \left\{ \begin{pmatrix} -A^T\lambda^{k+1} \\ -B^T\lambda^{k+1} \end{pmatrix} + \beta\begin{pmatrix} A^T \\ B^T \end{pmatrix} B(y^k - y^{k+1}) + \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 \\ 0 & \beta B^T B \end{pmatrix}\begin{pmatrix} x^{k+1} - x^k \\ y^{k+1} - y^k \end{pmatrix} \right\} \ge 0, \quad \forall (x, y) \in \mathcal{X} \times \mathcal{Y}.
\]
Combining the last inequality with (2.1c), we have
\[
\theta(u) - \theta(u^{k+1}) + \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^T \left\{ \begin{pmatrix} -A^T\lambda^{k+1} \\ -B^T\lambda^{k+1} \\ Ax^{k+1} + By^{k+1} - b \end{pmatrix} + \beta\begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - y^{k+1}) + \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \frac{1}{\beta}I_m \end{pmatrix}\begin{pmatrix} x^{k+1} - x^k \\ y^{k+1} - y^k \\ \lambda^{k+1} - \lambda^k \end{pmatrix} \right\} \ge 0
\]
for all $(x, y, \lambda) \in \mathcal{X} \times \mathcal{Y} \times R^m$. The above inequality can be rewritten as
\[
\theta(u) - \theta(u^{k+1}) + (w - w^{k+1})^T F(w^{k+1}) + \beta\begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^T \begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - y^{k+1}) \ge \begin{pmatrix} x - x^{k+1} \\ y - y^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^T \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \frac{1}{\beta}I_m \end{pmatrix}\begin{pmatrix} x^k - x^{k+1} \\ y^k - y^{k+1} \\ \lambda^k - \lambda^{k+1} \end{pmatrix}, \quad \forall w \in \Omega. \tag{2.5}
\]
2.2 Convergence of the Linearized ADM

Based on the analysis in the last section, we have the following lemma.

Lemma 2.1. Let $w^{k+1} = (x^{k+1}, y^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by (2.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have
\[ (w^{k+1} - w^*)^T G(w^k - w^{k+1}) \ge (w^{k+1} - w^*)^T \eta(y^k, y^{k+1}), \quad \forall w^* \in \Omega^*, \tag{2.6} \]
where
\[ \eta(y^k, y^{k+1}) = \beta\begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - y^{k+1}) \tag{2.7} \]
and
\[ G = \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \frac{1}{\beta}I_m \end{pmatrix}. \tag{2.8} \]
Proof. Setting $w = w^*$ in (2.5), and using the notations $G$ and $\eta(y^k, y^{k+1})$, we get
\[ (w^{k+1} - w^*)^T G(w^k - w^{k+1}) \ge (w^{k+1} - w^*)^T \eta(y^k, y^{k+1}) + \theta(u^{k+1}) - \theta(u^*) + (w^{k+1} - w^*)^T F(w^{k+1}). \]
Since $F$ is monotone, it follows that
\[ \theta(u^{k+1}) - \theta(u^*) + (w^{k+1} - w^*)^T F(w^{k+1}) \ge \theta(u^{k+1}) - \theta(u^*) + (w^{k+1} - w^*)^T F(w^*) \ge 0. \]
The last inequality is due to $w^{k+1} \in \Omega$ and the fact that $w^*$ is a solution of (1.3). The lemma is proved.

By using $\eta(y^k, y^{k+1})$ (see (2.7)), $Ax^* + By^* = b$ and (2.1c), we have
\[
(w^{k+1} - w^*)^T \eta(y^k, y^{k+1}) = \big(B(y^k - y^{k+1})\big)^T \beta\big\{ (Ax^{k+1} + By^{k+1}) - (Ax^* + By^*) \big\} = \big(B(y^k - y^{k+1})\big)^T \beta(Ax^{k+1} + By^{k+1} - b) = (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}). \tag{2.9}
\]
Substituting (2.9) into (2.6), we obtain
\[ (w^{k+1} - w^*)^T G(w^k - w^{k+1}) \ge (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}), \quad \forall w^* \in \Omega^*. \tag{2.10} \]

Lemma 2.2. Let $w^{k+1} = (x^{k+1}, y^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by (2.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have
\[ (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}) \ge 0. \tag{2.11} \]

Proof. Since (2.4b) holds for the $k$-th iteration and for the previous iteration, we have
\[ \theta_2(y) - \theta_2(y^{k+1}) + (y - y^{k+1})^T\{ -B^T\lambda^{k+1} \} \ge 0, \quad \forall y \in \mathcal{Y}, \tag{2.12} \]
and
\[ \theta_2(y) - \theta_2(y^k) + (y - y^k)^T\{ -B^T\lambda^k \} \ge 0, \quad \forall y \in \mathcal{Y}. \tag{2.13} \]
Setting $y = y^k$ in (2.12) and $y = y^{k+1}$ in (2.13), respectively, and then adding the two resulting inequalities, we get
\[ (\lambda^k - \lambda^{k+1})^T B(y^k - y^{k+1}) \ge 0. \]
The assertion of this lemma is proved.

Under the assumption (2.2), the matrix $G$ is positive semi-definite; if, in addition, $B$ has full column rank, then $G$ is positive definite. Even in the positive semi-definite case, we still use the notation $\|w - w^*\|_G$, where
\[ \|w - w^*\|_G^2 = (w - w^*)^T G(w - w^*). \]

Lemma 2.3. Let $w^{k+1} = (x^{k+1}, y^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by (2.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have
\[ (w^{k+1} - w^*)^T G(w^k - w^{k+1}) \ge 0, \quad \forall w^* \in \Omega^*, \tag{2.14} \]
and consequently
\[ \|w^{k+1} - w^*\|_G^2 \le \|w^k - w^*\|_G^2 - \|w^k - w^{k+1}\|_G^2, \quad \forall w^* \in \Omega^*. \tag{2.15} \]

Proof. Assertion (2.14) follows from (2.10) and (2.11) directly. By using (2.14), we have
\[
\|w^k - w^*\|_G^2 = \big\|(w^{k+1} - w^*) + (w^k - w^{k+1})\big\|_G^2 = \|w^{k+1} - w^*\|_G^2 + 2(w^{k+1} - w^*)^T G(w^k - w^{k+1}) + \|w^k - w^{k+1}\|_G^2 \ge \|w^{k+1} - w^*\|_G^2 + \|w^k - w^{k+1}\|_G^2,
\]
and thus (2.15) is proved.

The inequality (2.15) is essential for the convergence of the linearized alternating direction method. Note that $G$ is positive semi-definite, so
\[ \|w^k - w^{k+1}\|_G^2 = 0 \iff G(w^k - w^{k+1}) = 0. \]
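The contraction property (2.15) can be observed numerically. On the scalar toy problem $\min\{x^2 + y^2 \mid x + y = 1\}$ (our own illustrative data, with $A = B = 1$, $w^* = (0.5, 0.5, 1)$ and $G = \mathrm{diag}(r - \beta,\ \beta,\ 1/\beta)$ from (2.8)), the $G$-distance to $w^*$ never increases:

```python
import numpy as np

beta, r = 1.0, 2.0                       # r > beta, so (2.2) holds
G = np.diag([r - beta, beta, 1.0 / beta])
w_star = np.array([0.5, 0.5, 1.0])
w = np.zeros(3)
dists = []
for _ in range(100):
    x, y, lam = w
    x = (r * x - beta * (x + y - 1.0) + lam) / (2.0 + r)   # (2.1a)
    y = (beta * (1.0 - x) + lam) / (2.0 + beta)            # (2.1b)
    lam = lam - beta * (x + y - 1.0)                       # (2.1c)
    w = np.array([x, y, lam])
    d = w - w_star
    dists.append(d @ G @ d)
# ||w^k - w*||_G^2 is monotonically nonincreasing, as (2.15) predicts
print(all(b <= a + 1e-12 for a, b in zip(dists, dists[1:])))  # True
```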
The inequality (2.15) can be written as
\[
\|x^{k+1} - x^*\|_{(rI - \beta A^T A)}^2 + \beta\|B(y^{k+1} - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^{k+1} - \lambda^*\|^2 \le \|x^k - x^*\|_{(rI - \beta A^T A)}^2 + \beta\|B(y^k - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \lambda^*\|^2 - \Big( \|x^k - x^{k+1}\|_{(rI - \beta A^T A)}^2 + \beta\|B(y^k - y^{k+1})\|^2 + \tfrac{1}{\beta}\|\lambda^k - \lambda^{k+1}\|^2 \Big).
\]
It follows that
\[ \lim_{k\to\infty} x^k = x^*, \qquad \lim_{k\to\infty} By^k = By^* \qquad \text{and} \qquad \lim_{k\to\infty} \lambda^k = \lambda^*. \]
The linearized ADM is also known as the split inexact Uzawa method in the image processing literature [9, 10].
3 Self-Adaptive ADM-based Contraction Method

In the last section, we obtained $x^{k+1}$ by solving the following $x$-subproblem:
\[ \min \Big\{ \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \frac{r}{2}\|x - x^k\|^2 \ \Big|\ x \in \mathcal{X} \Big\}, \]
and the parameter $r$ was required to satisfy
\[ rI_{n_1} - \beta A^T A \succ 0 \quad \Longleftrightarrow \quad r > \beta\lambda_{\max}(A^T A). \]
In some practical problems, a conservative estimate of $\lambda_{\max}(A^T A)$ will lead to slow convergence. In this section, based on the linearized ADM, we consider self-adaptive contraction methods. Each iteration of the self-adaptive contraction methods consists of two steps: a prediction step and a correction step. From the given $w^k$, the prediction step produces a test vector $\tilde{w}^k$, and the correction step offers the new iterate $w^{k+1}$.
3.1 Prediction

1. First, for given $(x^k, y^k, \lambda^k)$, $\tilde{x}^k$ is the solution of the following problem:
\[ \tilde{x}^k = \arg\min \Big\{ \theta_1(x) + \beta x^T A^T\big(Ax^k + By^k - b - \tfrac{1}{\beta}\lambda^k\big) + \frac{r}{2}\|x - x^k\|^2 \ \Big|\ x \in \mathcal{X} \Big\}. \tag{3.1a} \]
2. Then, using $\lambda^k$ and the obtained $\tilde{x}^k$, $\tilde{y}^k$ is the solution of the problem
\[ \tilde{y}^k = \arg\min \Big\{ \theta_2(y) - (\lambda^k)^T(A\tilde{x}^k + By - b) + \frac{\beta}{2}\|A\tilde{x}^k + By - b\|^2 \ \Big|\ y \in \mathcal{Y} \Big\}. \tag{3.1b} \]
3. Finally,
\[ \tilde{\lambda}^k = \lambda^k - \beta(A\tilde{x}^k + B\tilde{y}^k - b). \tag{3.1c} \]
The subproblems in (3.1) are the same as those in (2.1); instead of $(x^{k+1}, y^{k+1}, \lambda^{k+1})$ in (2.1), we denote the output of (3.1) by $(\tilde{x}^k, \tilde{y}^k, \tilde{\lambda}^k)$.
Requirements on the parameters $\beta, r$: for given $\beta > 0$, choose $r$ such that
\[ \beta\|A^T A(x^k - \tilde{x}^k)\| \le \nu r\|x^k - \tilde{x}^k\|, \qquad \nu \in (0, 1). \tag{3.2} \]
If $rI - \beta A^T A \succ 0$, then (3.2) is satisfied. Thus, (2.2) is sufficient for (3.2).

Analysis of the optimality conditions of the subproblems in (3.1). Since $\tilde{w}^k$ is obtained from (3.1), which is (2.1) with $w^{k+1}$ replaced by $\tilde{w}^k$, similarly to (2.5) we get
\[
\theta(u) - \theta(\tilde{u}^k) + \begin{pmatrix} x - \tilde{x}^k \\ y - \tilde{y}^k \\ \lambda - \tilde{\lambda}^k \end{pmatrix}^T \left\{ \begin{pmatrix} -A^T\tilde{\lambda}^k \\ -B^T\tilde{\lambda}^k \\ A\tilde{x}^k + B\tilde{y}^k - b \end{pmatrix} + \beta\begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - \tilde{y}^k) + \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \frac{1}{\beta}I_m \end{pmatrix}\begin{pmatrix} \tilde{x}^k - x^k \\ \tilde{y}^k - y^k \\ \tilde{\lambda}^k - \lambda^k \end{pmatrix} \right\} \ge 0, \quad \forall w \in \Omega.
\]
The last variational inequality can be rewritten as $\tilde{w}^k \in \Omega$,
\[ \theta(u) - \theta(\tilde{u}^k) + (w - \tilde{w}^k)^T\big\{ F(\tilde{w}^k) + \eta(y^k, \tilde{y}^k) + HM(\tilde{w}^k - w^k) \big\} \ge 0, \quad \forall w \in \Omega, \tag{3.3} \]
where
\[ \eta(y^k, \tilde{y}^k) = \beta\begin{pmatrix} A^T \\ B^T \\ 0 \end{pmatrix} B(y^k - \tilde{y}^k), \tag{3.4} \]
\[ H = \begin{pmatrix} rI_{n_1} & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \frac{1}{\beta}I_m \end{pmatrix}, \tag{3.5} \]
and
\[ M = \begin{pmatrix} I_{n_1} - \frac{\beta}{r}A^T A & 0 & 0 \\ 0 & I_{n_2} & 0 \\ 0 & 0 & I_m \end{pmatrix}. \tag{3.6} \]
Based on the above analysis, we have the following lemma.

Lemma 3.1. Let $\tilde{w}^k = (\tilde{x}^k, \tilde{y}^k, \tilde{\lambda}^k) \in \Omega$ be generated by (3.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have
\[ (\tilde{w}^k - w^*)^T HM(w^k - \tilde{w}^k) \ge (\tilde{w}^k - w^*)^T \eta(y^k, \tilde{y}^k), \quad \forall w^* \in \Omega^*. \tag{3.7} \]

Proof. Setting $w = w^*$ in (3.3), we obtain
\[ (\tilde{w}^k - w^*)^T HM(w^k - \tilde{w}^k) \ge (\tilde{w}^k - w^*)^T \eta(y^k, \tilde{y}^k) + \theta(\tilde{u}^k) - \theta(u^*) + (\tilde{w}^k - w^*)^T F(\tilde{w}^k). \tag{3.8} \]
Since $F$ is monotone and $\tilde{w}^k \in \Omega$, it follows that
\[ \theta(\tilde{u}^k) - \theta(u^*) + (\tilde{w}^k - w^*)^T F(\tilde{w}^k) \ge \theta(\tilde{u}^k) - \theta(u^*) + (\tilde{w}^k - w^*)^T F(w^*) \ge 0. \]
The last inequality is due to $\tilde{w}^k \in \Omega$ and the fact that $w^*$ is a solution of (1.3). Substituting this into the right-hand side of (3.8), the lemma is proved.

In addition, because $Ax^* + By^* = b$ and $\beta(A\tilde{x}^k + B\tilde{y}^k - b) = \lambda^k - \tilde{\lambda}^k$, we have
\[ (\tilde{w}^k - w^*)^T \eta(y^k, \tilde{y}^k) = \big(B(y^k - \tilde{y}^k)\big)^T \beta\big\{ (A\tilde{x}^k + B\tilde{y}^k) - (Ax^* + By^*) \big\} = (\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k). \tag{3.9} \]

Lemma 3.2. Let $\tilde{w}^k = (\tilde{x}^k, \tilde{y}^k, \tilde{\lambda}^k) \in \Omega$ be generated by (3.1) from the given $w^k = (x^k, y^k, \lambda^k)$. Then we have
\[ (w^k - w^*)^T HM(w^k - \tilde{w}^k) \ge \varphi(w^k, \tilde{w}^k), \quad \forall w^* \in \Omega^*, \tag{3.10} \]
where
\[ \varphi(w^k, \tilde{w}^k) = (w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) + (\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k). \tag{3.11} \]

Proof. From (3.7) and (3.9) we have
\[ (\tilde{w}^k - w^*)^T HM(w^k - \tilde{w}^k) \ge (\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k). \]
Assertion (3.10) follows from the last inequality and the definition of $\varphi(w^k, \tilde{w}^k)$ directly.

3.2 The Primary Contraction Method

The primary contraction method uses $M(w^k - \tilde{w}^k)$ as the search direction with unit step length. In other words, the new iterate is given by
\[ w^{k+1} = w^k - M(w^k - \tilde{w}^k). \tag{3.12} \]
According to (3.6), this can be written as
\[ \begin{pmatrix} x^{k+1} \\ y^{k+1} \\ \lambda^{k+1} \end{pmatrix} = \begin{pmatrix} \tilde{x}^k + \frac{\beta}{r}A^T A(x^k - \tilde{x}^k) \\ \tilde{y}^k \\ \tilde{\lambda}^k \end{pmatrix}. \tag{3.13} \]
In the primary contraction method, only the $x$-part of the corrector differs from the predictor. In the method of Section 2, we need $r > \beta\lambda_{\max}(A^T A)$; with the method in this section, we only need $r$ to satisfy condition (3.2). In practical computation, we try to use the average of the eigenvalues of $\beta A^T A$.

Using (3.10), we have
\[
\|w^k - w^*\|_H^2 - \|w^{k+1} - w^*\|_H^2 = \|w^k - w^*\|_H^2 - \big\|(w^k - w^*) - M(w^k - \tilde{w}^k)\big\|_H^2 = 2(w^k - w^*)^T HM(w^k - \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2 \ge 2\varphi(w^k, \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2. \tag{3.14}
\]
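The correction step (3.13) can be sketched as a small function; only the $x$-part is recomputed, the $y$- and $\lambda$-parts of the corrector are the predictor itself (the numerical data in the usage example are hypothetical, for illustration):

```python
import numpy as np

def primary_correction(x_k, x_t, y_t, lam_t, A, beta, r):
    """Primary correction (3.12)-(3.13): w^{k+1} = w^k - M(w^k - w~^k)."""
    # x^{k+1} = x~^k + (beta/r) * A^T A (x^k - x~^k)
    x_new = x_t + (beta / r) * (A.T @ (A @ (x_k - x_t)))
    return x_new, y_t, lam_t

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
x_new, _, _ = primary_correction(np.array([1.0, 1.0]), np.array([0.5, 0.0]),
                                 np.zeros(2), np.zeros(2), A, beta=1.0, r=4.0)
print(x_new)  # -> [1.125 1.5]
```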
Because $(\tilde{y}^k, \tilde{\lambda}^k) = (y^{k+1}, \lambda^{k+1})$, the inequality (2.11) still holds, and thus
\[ (\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k) \ge 0. \]
Therefore, it follows from (3.14), (3.11) and the last inequality that
\[ \|w^k - w^*\|_H^2 - \|w^{k+1} - w^*\|_H^2 \ge 2(w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2. \tag{3.15} \]

Lemma 3.3. Under the condition (3.2), we have
\[ 2(w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2 \ge (1 - \nu^2)r\|x^k - \tilde{x}^k\|^2 + \beta\|B(y^k - \tilde{y}^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde{\lambda}^k\|^2. \tag{3.16} \]

Proof. First, we have
\[ 2(w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2 = (w^k - \tilde{w}^k)^T\big( M^T H + HM - M^T HM \big)(w^k - \tilde{w}^k). \]
By using the structure of the matrices $H$ and $M$ (see (3.5) and (3.6)), we obtain
\[ M^T H + HM - M^T HM = H - (I - M)^T H(I - M) = \begin{pmatrix} rI_{n_1} - r\big(\frac{\beta}{r}A^T A\big)^2 & 0 & 0 \\ 0 & \beta B^T B & 0 \\ 0 & 0 & \frac{1}{\beta}I_m \end{pmatrix}. \]
Therefore,
\[ 2(w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2 = \|w^k - \tilde{w}^k\|_H^2 - \frac{\beta^2}{r}\|A^T A(x^k - \tilde{x}^k)\|^2. \tag{3.17} \]
Under the condition (3.2), we have
\[ \frac{\beta^2}{r}\|A^T A(x^k - \tilde{x}^k)\|^2 \le \nu^2 r\|x^k - \tilde{x}^k\|^2. \]
Substituting this into (3.17), the assertion of the lemma is proved.

Theorem 3.1. Let $\tilde{w}^k = (\tilde{x}^k, \tilde{y}^k, \tilde{\lambda}^k) \in \Omega$ be generated by (3.1) from the given
$w^k = (x^k, y^k, \lambda^k)$, and let the new iterate $w^{k+1}$ be given by (3.12). Then the sequence $\{w^k = (x^k, y^k, \lambda^k)\}$ generated by the primary contraction method satisfies
\[ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - (1 - \nu^2)\|w^k - \tilde{w}^k\|_H^2. \tag{3.18} \]

Proof. From (3.15) and (3.16) we obtain
\[ \|w^k - w^*\|_H^2 - \|w^{k+1} - w^*\|_H^2 \ge (1 - \nu^2)r\|x^k - \tilde{x}^k\|^2 + \beta\|B(y^k - \tilde{y}^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde{\lambda}^k\|^2 \ge (1 - \nu^2)\|w^k - \tilde{w}^k\|_H^2. \]
The assertion of this theorem is proved. Theorem 3.1 is essential for the convergence of the primary contraction method.
3.3 The General Contraction Method

For given $w^k$, we use
\[ w(\alpha) = w^k - \alpha M(w^k - \tilde{w}^k) \tag{3.19} \]
to generate the $\alpha$-dependent new iterate. For any $w^* \in \Omega^*$, we define
\[ \vartheta(\alpha) := \|w^k - w^*\|_H^2 - \|w(\alpha) - w^*\|_H^2 \tag{3.20} \]
and
\[ q(\alpha) = 2\alpha\varphi(w^k, \tilde{w}^k) - \alpha^2\|M(w^k - \tilde{w}^k)\|_H^2, \tag{3.21} \]
where $\varphi(w^k, \tilde{w}^k)$ is defined in (3.11).

Theorem 3.2. Let $w(\alpha)$ be defined by (3.19). For any $w^* \in \Omega^*$ and $\alpha \ge 0$, we have
\[ \vartheta(\alpha) \ge q(\alpha), \tag{3.22} \]
where $\vartheta(\alpha)$ and $q(\alpha)$ are defined in (3.20) and (3.21), respectively.
Proof. It follows from (3.19) and (3.20) that
\[ \vartheta(\alpha) = \|w^k - w^*\|_H^2 - \big\|(w^k - w^*) - \alpha M(w^k - \tilde{w}^k)\big\|_H^2 = 2\alpha(w^k - w^*)^T HM(w^k - \tilde{w}^k) - \alpha^2\|M(w^k - \tilde{w}^k)\|_H^2. \]
By using (3.10) and the definition of $q(\alpha)$, the theorem is proved.

Note that $q(\alpha)$ in (3.21) is a quadratic function of $\alpha$, and it reaches its maximum at
\[ \alpha_k^* = \frac{\varphi(w^k, \tilde{w}^k)}{\|M(w^k - \tilde{w}^k)\|_H^2}. \tag{3.23} \]
In practical computation, we use
\[ w^{k+1} = w^k - \gamma\alpha_k^* M(w^k - \tilde{w}^k) \tag{3.24} \]
to update the new iterate, where $\gamma \in [1, 2)$ is a relaxation factor. By using (3.20) and (3.22), we have
\[ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - q(\gamma\alpha_k^*). \tag{3.25} \]
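The step size (3.23) and the relaxed update (3.24) can be sketched with dense block matrices $H$ from (3.5) and $M$ from (3.6). This is illustrative only: in practice one never forms $H$ and $M$ explicitly, and all variable names here are our own. The usage example is a degenerate scalar check with $A = 0$ and $B = I$, so that $M = I$ and $H = I$:

```python
import numpy as np

def blkdiag(*blocks):
    """Assemble a block-diagonal matrix from square blocks."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

def general_correction(w_k, w_t, A, B, beta, r, gamma=1.8):
    n1, n2, m = A.shape[1], B.shape[1], A.shape[0]
    H = blkdiag(r * np.eye(n1), beta * (B.T @ B), np.eye(m) / beta)          # (3.5)
    M = blkdiag(np.eye(n1) - (beta / r) * (A.T @ A), np.eye(n2), np.eye(m))  # (3.6)
    d = w_k - w_t                                    # w^k - w~^k
    y_d, lam_d = d[n1:n1 + n2], d[n1 + n2:]
    phi = d @ H @ M @ d + lam_d @ (B @ y_d)          # (3.11)
    Md = M @ d
    alpha = phi / (Md @ H @ Md)                      # (3.23)
    return w_k - gamma * alpha * Md                  # (3.24)

w_new = general_correction(np.zeros(3), -np.ones(3),
                           np.zeros((1, 1)), np.eye(1),
                           beta=1.0, r=1.0, gamma=1.0)
print(w_new)  # here phi = 4, ||Md||_H^2 = 3, so alpha = 4/3 and w_new = -4/3 * ones
```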
Note that
\[ q(\gamma\alpha_k^*) = 2\gamma\alpha_k^*\varphi(w^k, \tilde{w}^k) - (\gamma\alpha_k^*)^2\|M(w^k - \tilde{w}^k)\|_H^2. \tag{3.26} \]
Using (3.23) and (3.24), we obtain
\[ q(\gamma\alpha_k^*) = \gamma(2 - \gamma)(\alpha_k^*)^2\|M(w^k - \tilde{w}^k)\|_H^2 = \frac{2 - \gamma}{\gamma}\|w^k - w^{k+1}\|_H^2, \]
and consequently it follows from (3.25) that
\[ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - \frac{2 - \gamma}{\gamma}\|w^k - w^{k+1}\|_H^2, \quad \forall w^* \in \Omega^*. \tag{3.27} \]
On the other hand, it follows from (3.23) and (3.26) that
\[ q(\gamma\alpha_k^*) = \gamma(2 - \gamma)\alpha_k^*\varphi(w^k, \tilde{w}^k). \tag{3.28} \]
By using (3.11) and (3.16), we obtain
\[
2\varphi(w^k, \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2 = 2(w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) - \|M(w^k - \tilde{w}^k)\|_H^2 + 2(\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k)
\]
\[
\ge (1 - \nu^2)r\|x^k - \tilde{x}^k\|^2 + \beta\|B(y^k - \tilde{y}^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde{\lambda}^k\|^2 + 2(\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k) = (1 - \nu^2)r\|x^k - \tilde{x}^k\|^2 + \beta\big\|B(y^k - \tilde{y}^k) + \tfrac{1}{\beta}(\lambda^k - \tilde{\lambda}^k)\big\|^2.
\]
Thus we have $2\varphi(w^k, \tilde{w}^k) > \|M(w^k - \tilde{w}^k)\|_H^2$ and consequently $\alpha_k^* > \frac{1}{2}$.
In addition, because
\[
\varphi(w^k, \tilde{w}^k) = (w^k - \tilde{w}^k)^T HM(w^k - \tilde{w}^k) + (\lambda^k - \tilde{\lambda}^k)^T B(y^k - \tilde{y}^k) = \begin{pmatrix} x^k - \tilde{x}^k \\ y^k - \tilde{y}^k \\ \lambda^k - \tilde{\lambda}^k \end{pmatrix}^T \begin{pmatrix} rI_{n_1} - \beta A^T A & 0 & 0 \\ 0 & \beta B^T B & \frac{1}{2}B^T \\ 0 & \frac{1}{2}B & \frac{1}{\beta}I_m \end{pmatrix}\begin{pmatrix} x^k - \tilde{x}^k \\ y^k - \tilde{y}^k \\ \lambda^k - \tilde{\lambda}^k \end{pmatrix}
\]
\[
\ge \|x^k - \tilde{x}^k\|\big( r\|x^k - \tilde{x}^k\| - \beta\|A^T A(x^k - \tilde{x}^k)\| \big) + \frac{1}{2}\begin{pmatrix} y^k - \tilde{y}^k \\ \lambda^k - \tilde{\lambda}^k \end{pmatrix}^T \begin{pmatrix} \beta B^T B & 0 \\ 0 & \frac{1}{\beta}I_m \end{pmatrix}\begin{pmatrix} y^k - \tilde{y}^k \\ \lambda^k - \tilde{\lambda}^k \end{pmatrix},
\]
under the condition (3.2) it follows from the last inequality that
\[ \varphi(w^k, \tilde{w}^k) \ge \min\Big\{ (1 - \nu), \frac{1}{2} \Big\}\|w^k - \tilde{w}^k\|_H^2 \ge \frac{1 - \nu}{2}\|w^k - \tilde{w}^k\|_H^2. \tag{3.29} \]
By using (3.25), (3.28), (3.29) and $\alpha_k^* > \frac{1}{2}$, we obtain the following theorem for the general contraction method.

Theorem 3.3. The sequence $\{w^k = (x^k, y^k, \lambda^k)\}$ generated by the general contraction method satisfies
\[ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - \frac{\gamma(2 - \gamma)(1 - \nu)}{4}\|w^k - \tilde{w}^k\|_H^2, \quad \forall w^* \in \Omega^*. \tag{3.30} \]
The inequality (3.30) in Theorem 3.3 is essential for the convergence of the general contraction method. Both inequalities (3.18) and (3.30) can be written as
\[ \|w^{k+1} - w^*\|_H^2 \le \|w^k - w^*\|_H^2 - c_0\|w^k - \tilde{w}^k\|_H^2, \quad \forall w^* \in \Omega^*, \]
where $c_0 > 0$ is a constant. Therefore, we have
\[
r\|x^{k+1} - x^*\|^2 + \beta\|B(y^{k+1} - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^{k+1} - \lambda^*\|^2 \le r\|x^k - x^*\|^2 + \beta\|B(y^k - y^*)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \lambda^*\|^2 - c_0\Big( r\|x^k - \tilde{x}^k\|^2 + \beta\|B(y^k - \tilde{y}^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde{\lambda}^k\|^2 \Big).
\]
It follows that
\[ \lim_{k\to\infty}\Big( r\|x^k - \tilde{x}^k\|^2 + \beta\|B(y^k - \tilde{y}^k)\|^2 + \tfrac{1}{\beta}\|\lambda^k - \tilde{\lambda}^k\|^2 \Big) = 0 \]
and
\[ \lim_{k\to\infty} x^k = x^*, \qquad \lim_{k\to\infty} By^k = By^* \qquad \text{and} \qquad \lim_{k\to\infty} \lambda^k = \lambda^*. \]
4 Applications in $\ell_1$-norm problems

An important $\ell_1$-norm problem in the area of machine learning is $\ell_1$-regularized linear regression, also called the lasso [8]. It involves solving
\[ \min\ \tau\|x\|_1 + \frac{1}{2}\|Ax - b\|_2^2, \tag{4.1} \]
where $\tau > 0$ is a scalar regularization parameter that is usually chosen by cross-validation. In typical applications, there are many more features than training examples, and the goal is to find a parsimonious model for the data. Problem (4.1) can be reformulated as
\[ \min\ \Big\{ \tau\|x\|_1 + \frac{1}{2}\|y\|_2^2 \ \Big|\ Ax - y = b \Big\}, \tag{4.2} \]
which is a form of (1.1). Applying the alternating direction method (1.4) to problem (4.2), the $x$-subproblem is
\[ x^{k+1} = \arg\min\Big\{ \tau\|x\|_1 + \frac{\beta}{2}\big\| Ax - y^k - b - \tfrac{1}{\beta}\lambda^k \big\|_2^2 \Big\}, \]
and its solution does not have a closed form. Applying the linearized alternating direction method (2.1) to problem (4.2), the $x$-subproblem (2.1a) is
\[ x^{k+1} = \arg\min\Big\{ \tau\|x\|_1 + \frac{r}{2}\Big\| x - \Big[ x^k + \tfrac{1}{r}A^T\lambda^k - \tfrac{\beta}{r}A^T(Ax^k - y^k - b) \Big] \Big\|^2 \Big\}. \tag{4.3} \]
This problem is of the form (1.5), and its solution has the following closed form:
\[ x^{k+1} = a - P_{B_{\tau/r}}[a], \qquad \text{where} \quad a = x^k + \tfrac{1}{r}A^T\lambda^k - \tfrac{\beta}{r}A^T(Ax^k - y^k - b) \]
and $B_{\tau/r} = \{\xi \in R^n \mid -(\tau/r)e \le \xi \le (\tau/r)e\}$.

By using the linearized alternating direction method of Section 2, for given $\beta > 0$, we need $r > \beta\lambda_{\max}(A^T A)$. By using the self-adaptive ADM-based contraction method of Section 3, we only need $r$ to satisfy
\[ \beta\|A^T A(x^k - \tilde{x}^k)\| \le \nu r\|x^k - \tilde{x}^k\|, \qquad \nu \in (0, 1). \]
Because $A$ is a generic matrix, this condition can be satisfied even if $r$ is much less than $\beta\lambda_{\max}(A^T A)$.
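Putting the pieces together, the linearized ADM (2.1) for the lasso reformulation (4.2) can be sketched as follows; the shrinkage step implements (4.3), and the small synthetic instance below is our own, for illustration:

```python
import numpy as np

# Linearized ADM for: min tau*||x||_1 + (1/2)||y||^2  s.t.  Ax - y = b.
rng = np.random.default_rng(1)
m, n = 10, 20
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
tau, beta = 0.5, 1.0
r = 1.01 * beta * np.linalg.eigvalsh(A.T @ A).max()   # satisfies (2.2)

def shrink(a, kappa):
    """Soft-thresholding: closed-form solution of (1.5) with theta1 = tau*||.||_1."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

x, y, lam = np.zeros(n), np.zeros(m), np.zeros(m)
for _ in range(3000):
    a_vec = x - (beta / r) * (A.T @ (A @ x - y - b - lam / beta))
    x = shrink(a_vec, tau / r)                        # (4.3)
    y = (beta * (A @ x - b) - lam) / (1.0 + beta)     # exact y-subproblem
    lam = lam - beta * (A @ x - y - b)                # multiplier update

print(np.linalg.norm(A @ x - y - b))  # feasibility residual, small at convergence
```

At a solution $y = Ax - b$, so the computed $x$ (approximately) solves the original lasso problem (4.1).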
References

[1] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3, 1-122, 2011.
[2] D. Gabay, Applications of the method of multipliers to variational inequalities, in Augmented Lagrangian Methods: Applications to the Solution of Boundary-Value Problems, M. Fortin and R. Glowinski, eds., North-Holland, Amsterdam, The Netherlands, 1983.
[3] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximations, Computers and Mathematics with Applications, 2, 17-40, 1976.
[4] R. Glowinski, Numerical Methods for Nonlinear Variational Problems, Springer-Verlag, New York, Berlin, Heidelberg, Tokyo, 1984.
[5] B.S. He, A class of projection and contraction methods for monotone variational inequalities, Applied Mathematics and Optimization, 35, 69-76, 1997.
[6] B.S. He, Inexact implicit methods for monotone general variational inequalities, Mathematical Programming, Series A, 86, 199-217, 1999.
[7] B.S. He, L.-Z. Liao, D.R. Han and H. Yang, A new inexact alternating directions method for monotone variational inequalities, Mathematical Programming, 92, 103-118, 2002.
[8] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B, 58, 267-288, 1996.
[9] X.Q. Zhang, M. Burger, X. Bresson and S. Osher, Bregmanized nonlocal regularization for deconvolution and sparse reconstruction, SIAM Journal on Imaging Sciences, 3, 253-276, 2010.
[10] X.Q. Zhang, M. Burger and S. Osher, A unified primal-dual algorithm framework based on Bregman iteration, Journal of Scientific Computing, 46, 20-46, 2010.
More informationALTERNATING DIRECTION METHOD OF MULTIPLIERS WITH VARIABLE STEP SIZES
ALTERNATING DIRECTION METHOD OF MULTIPLIERS WITH VARIABLE STEP SIZES Abstract. The alternating direction method of multipliers (ADMM) is a flexible method to solve a large class of convex minimization
More informationUses of duality. Geoff Gordon & Ryan Tibshirani Optimization /
Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear
More informationOptimal linearized symmetric ADMM for multi-block separable convex programming
Optimal linearized symmetric ADMM for multi-block separable convex programming Xiaokai Chang Jianchao Bai 3 and Sanyang Liu School of Mathematics and Statistics Xidian University Xi an 7007 PR China College
More informationFrank-Wolfe Method. Ryan Tibshirani Convex Optimization
Frank-Wolfe Method Ryan Tibshirani Convex Optimization 10-725 Last time: ADMM For the problem min x,z f(x) + g(z) subject to Ax + Bz = c we form augmented Lagrangian (scaled form): L ρ (x, z, w) = f(x)
More informationThe semi-convergence of GSI method for singular saddle point problems
Bull. Math. Soc. Sci. Math. Roumanie Tome 57(05 No., 04, 93 00 The semi-convergence of GSI method for singular saddle point problems by Shu-Xin Miao Abstract Recently, Miao Wang considered the GSI method
More informationMS&E 318 (CME 338) Large-Scale Numerical Optimization
Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods
More informationBeyond Heuristics: Applying Alternating Direction Method of Multipliers in Nonconvex Territory
Beyond Heuristics: Applying Alternating Direction Method of Multipliers in Nonconvex Territory Xin Liu(4Ð) State Key Laboratory of Scientific and Engineering Computing Institute of Computational Mathematics
More informationSPARSE AND LOW-RANK MATRIX DECOMPOSITION VIA ALTERNATING DIRECTION METHODS
SPARSE AND LOW-RANK MATRIX DECOMPOSITION VIA ALTERNATING DIRECTION METHODS XIAOMING YUAN AND JUNFENG YANG Abstract. The problem of recovering the sparse and low-rank components of a matrix captures a broad
More informationA Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations
A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations Chuangchuang Sun and Ran Dai Abstract This paper proposes a customized Alternating Direction Method of Multipliers
More informationMath 273a: Optimization Overview of First-Order Optimization Algorithms
Math 273a: Optimization Overview of First-Order Optimization Algorithms Wotao Yin Department of Mathematics, UCLA online discussions on piazza.com 1 / 9 Typical flow of numerical optimization Optimization
More information2 Regularized Image Reconstruction for Compressive Imaging and Beyond
EE 367 / CS 448I Computational Imaging and Display Notes: Compressive Imaging and Regularized Image Reconstruction (lecture ) Gordon Wetzstein gordon.wetzstein@stanford.edu This document serves as a supplement
More informationA General Framework for a Class of Primal-Dual Algorithms for TV Minimization
A General Framework for a Class of Primal-Dual Algorithms for TV Minimization Ernie Esser UCLA 1 Outline A Model Convex Minimization Problem Main Idea Behind the Primal Dual Hybrid Gradient (PDHG) Method
More informationConvex Optimization. (EE227A: UC Berkeley) Lecture 15. Suvrit Sra. (Gradient methods III) 12 March, 2013
Convex Optimization (EE227A: UC Berkeley) Lecture 15 (Gradient methods III) 12 March, 2013 Suvrit Sra Optimal gradient methods 2 / 27 Optimal gradient methods We saw following efficiency estimates for
More informationGENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim
Korean J. Math. 25 (2017), No. 4, pp. 469 481 https://doi.org/10.11568/kjm.2017.25.4.469 GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS Jong Kyu Kim, Salahuddin, and Won Hee Lim Abstract. In this
More informationProximal Methods for Optimization with Spasity-inducing Norms
Proximal Methods for Optimization with Spasity-inducing Norms Group Learning Presentation Xiaowei Zhou Department of Electronic and Computer Engineering The Hong Kong University of Science and Technology
More informationA Solution Method for Semidefinite Variational Inequality with Coupled Constraints
Communications in Mathematics and Applications Volume 4 (2013), Number 1, pp. 39 48 RGN Publications http://www.rgnpublications.com A Solution Method for Semidefinite Variational Inequality with Coupled
More informationOptimization for Machine Learning
Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html
More information1 Non-negative Matrix Factorization (NMF)
2018-06-21 1 Non-negative Matrix Factorization NMF) In the last lecture, we considered low rank approximations to data matrices. We started with the optimal rank k approximation to A R m n via the SVD,
More informationGradient Descent. Ryan Tibshirani Convex Optimization /36-725
Gradient Descent Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: canonical convex programs Linear program (LP): takes the form min x subject to c T x Gx h Ax = b Quadratic program (QP): like
More informationSparse and Low-Rank Matrix Decomposition Via Alternating Direction Method
Sparse and Low-Rank Matrix Decomposition Via Alternating Direction Method Xiaoming Yuan a, 1 and Junfeng Yang b, 2 a Department of Mathematics, Hong Kong Baptist University, Hong Kong, China b Department
More informationCoordinate Descent and Ascent Methods
Coordinate Descent and Ascent Methods Julie Nutini Machine Learning Reading Group November 3 rd, 2015 1 / 22 Projected-Gradient Methods Motivation Rewrite non-smooth problem as smooth constrained problem:
More informationThe Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent
The Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent Yinyu Ye K. T. Li Professor of Engineering Department of Management Science and Engineering Stanford
More informationLINEARIZED ALTERNATING DIRECTION METHOD FOR CONSTRAINED LINEAR LEAST-SQUARES PROBLEM
LINEARIZED ALTERNATING DIRECTION METHOD FOR CONSTRAINED LINEAR LEAST-SQUARES PROBLEM RAYMOND H. CHAN, MIN TAO, AND XIAOMING YUAN February 8, 20 Abstract. In this paper, we apply the alternating direction
More informationHomework 4. Convex Optimization /36-725
Homework 4 Convex Optimization 10-725/36-725 Due Friday November 4 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)
More informationGauge optimization and duality
1 / 54 Gauge optimization and duality Junfeng Yang Department of Mathematics Nanjing University Joint with Shiqian Ma, CUHK September, 2015 2 / 54 Outline Introduction Duality Lagrange duality Fenchel
More informationConvergence rate estimates for the gradient differential inclusion
Convergence rate estimates for the gradient differential inclusion Osman Güler November 23 Abstract Let f : H R { } be a proper, lower semi continuous, convex function in a Hilbert space H. The gradient
More informationRelaxed linearized algorithms for faster X-ray CT image reconstruction
Relaxed linearized algorithms for faster X-ray CT image reconstruction Hung Nien and Jeffrey A. Fessler University of Michigan, Ann Arbor The 13th Fully 3D Meeting June 2, 2015 1/20 Statistical image reconstruction
More informationOn the iterate convergence of descent methods for convex optimization
On the iterate convergence of descent methods for convex optimization Clovis C. Gonzaga March 1, 2014 Abstract We study the iterate convergence of strong descent algorithms applied to convex functions.
More informationCoordinate Update Algorithm Short Course Operator Splitting
Coordinate Update Algorithm Short Course Operator Splitting Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 25 Operator splitting pipeline 1. Formulate a problem as 0 A(x) + B(x) with monotone operators
More informationAn efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss
An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss arxiv:1811.04545v1 [stat.co] 12 Nov 2018 Cheng Wang School of Mathematical Sciences, Shanghai Jiao
More informationOptimization for Learning and Big Data
Optimization for Learning and Big Data Donald Goldfarb Department of IEOR Columbia University Department of Mathematics Distinguished Lecture Series May 17-19, 2016. Lecture 1. First-Order Methods for
More informationPrimal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization
Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department
More informationSupport Vector Machines
Support Vector Machines Support vector machines (SVMs) are one of the central concepts in all of machine learning. They are simply a combination of two ideas: linear classification via maximum (or optimal
More informationPrimal/Dual Decomposition Methods
Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2018-19, HKUST, Hong Kong Outline of Lecture Subgradients
More informationConditional Gradient (Frank-Wolfe) Method
Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties
More informationAdaptive Primal Dual Optimization for Image Processing and Learning
Adaptive Primal Dual Optimization for Image Processing and Learning Tom Goldstein Rice University tag7@rice.edu Ernie Esser University of British Columbia eesser@eos.ubc.ca Richard Baraniuk Rice University
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationLOCAL LINEAR CONVERGENCE OF ADMM Daniel Boley
LOCAL LINEAR CONVERGENCE OF ADMM Daniel Boley Model QP/LP: min 1 / 2 x T Qx+c T x s.t. Ax = b, x 0, (1) Lagrangian: L(x,y) = 1 / 2 x T Qx+c T x y T x s.t. Ax = b, (2) where y 0 is the vector of Lagrange
More informationIntroduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur
Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Module - 5 Lecture - 22 SVM: The Dual Formulation Good morning.
More informationFAST ALTERNATING DIRECTION OPTIMIZATION METHODS
FAST ALTERNATING DIRECTION OPTIMIZATION METHODS TOM GOLDSTEIN, BRENDAN O DONOGHUE, SIMON SETZER, AND RICHARD BARANIUK Abstract. Alternating direction methods are a common tool for general mathematical
More informationA Tutorial on Primal-Dual Algorithm
A Tutorial on Primal-Dual Algorithm Shenlong Wang University of Toronto March 31, 2016 1 / 34 Energy minimization MAP Inference for MRFs Typical energies consist of a regularization term and a data term.
More informationEfficient smoothers for all-at-once multigrid methods for Poisson and Stokes control problems
Efficient smoothers for all-at-once multigrid methods for Poisson and Stoes control problems Stefan Taacs stefan.taacs@numa.uni-linz.ac.at, WWW home page: http://www.numa.uni-linz.ac.at/~stefant/j3362/
More informationLECTURE NOTE #8 PROF. ALAN YUILLE. Can we find a linear classifier that separates the position and negative examples?
LECTURE NOTE #8 PROF. ALAN YUILLE 1. Linear Classifiers and Perceptrons A dataset contains N samples: { (x µ, y µ ) : µ = 1 to N }, y µ {±1} Can we find a linear classifier that separates the position
More informationA Primal-dual Three-operator Splitting Scheme
Noname manuscript No. (will be inserted by the editor) A Primal-dual Three-operator Splitting Scheme Ming Yan Received: date / Accepted: date Abstract In this paper, we propose a new primal-dual algorithm
More informationGradient descent. Barnabas Poczos & Ryan Tibshirani Convex Optimization /36-725
Gradient descent Barnabas Poczos & Ryan Tibshirani Convex Optimization 10-725/36-725 1 Gradient descent First consider unconstrained minimization of f : R n R, convex and differentiable. We want to solve
More informationDual Proximal Gradient Method
Dual Proximal Gradient Method http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/19 1 proximal gradient method
More informationConvex Optimization Algorithms for Machine Learning in 10 Slides
Convex Optimization Algorithms for Machine Learning in 10 Slides Presenter: Jul. 15. 2015 Outline 1 Quadratic Problem Linear System 2 Smooth Problem Newton-CG 3 Composite Problem Proximal-Newton-CD 4 Non-smooth,
More informationAbout Split Proximal Algorithms for the Q-Lasso
Thai Journal of Mathematics Volume 5 (207) Number : 7 http://thaijmath.in.cmu.ac.th ISSN 686-0209 About Split Proximal Algorithms for the Q-Lasso Abdellatif Moudafi Aix Marseille Université, CNRS-L.S.I.S
More informationOptimization methods
Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,
More informationA SIMPLE PARALLEL ALGORITHM WITH AN O(1/T ) CONVERGENCE RATE FOR GENERAL CONVEX PROGRAMS
A SIMPLE PARALLEL ALGORITHM WITH AN O(/T ) CONVERGENCE RATE FOR GENERAL CONVEX PROGRAMS HAO YU AND MICHAEL J. NEELY Abstract. This paper considers convex programs with a general (possibly non-differentiable)
More informationThe speed of Shor s R-algorithm
IMA Journal of Numerical Analysis 2008) 28, 711 720 doi:10.1093/imanum/drn008 Advance Access publication on September 12, 2008 The speed of Shor s R-algorithm J. V. BURKE Department of Mathematics, University
More informationDLM: Decentralized Linearized Alternating Direction Method of Multipliers
1 DLM: Decentralized Linearized Alternating Direction Method of Multipliers Qing Ling, Wei Shi, Gang Wu, and Alejandro Ribeiro Abstract This paper develops the Decentralized Linearized Alternating Direction
More informationALGORITHM CONSTRUCTION BY DECOMPOSITION 1. INTRODUCTION. The following theorem is so simple it s almost embarassing. Nevertheless
ALGORITHM CONSTRUCTION BY DECOMPOSITION JAN DE LEEUW ABSTRACT. We discuss and illustrate a general technique, related to augmentation, in which a complicated optimization problem is replaced by a simpler
More information