Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.18


XVIII - 1

Contraction Methods for Convex Optimization and Monotone Variational Inequalities, No. 18

Linearized alternating direction method with Gaussian back substitution for separable convex optimization

Bingsheng He
Department of Mathematics, Nanjing University
hebma@nju.edu.cn

The content of this lecture is based on the publication [3].

XVIII - 2

1  Introduction

In this lecture, we consider the general case of linearly constrained separable convex programming with m ≥ 3:

    min { Σ_{i=1}^m θ_i(x_i) | Σ_{i=1}^m A_i x_i = b; x_i ∈ X_i, i = 1, ..., m },   (1.1)

where θ_i : R^{n_i} → R (i = 1, ..., m) are closed proper convex functions (not necessarily smooth), X_i ⊂ R^{n_i} (i = 1, ..., m) are closed convex sets, A_i ∈ R^{l×n_i} (i = 1, ..., m) are given matrices, and b ∈ R^l is a given vector. Throughout, we assume that the solution set of (1.1) is nonempty.

In fact, even for the special case of (1.1) with m = 3, the convergence of the extended ADM is still open. In the last lecture, we provided a novel approach towards the extension of ADM for the problem (1.1). More specifically, we showed that if a new iterate is generated by correcting the output of the ADM with a Gaussian back substitution procedure, then the

XVIII - 3

sequence of iterates is convergent to a solution of (1.1). The resulting method is called the ADM with Gaussian back substitution (ADM-GbS). Alternatively, the ADM-GbS can be regarded as a prediction-correction type method whose predictor is generated by the ADM procedure and whose correction is completed by a Gaussian back substitution procedure. The main task of each iteration in ADM-GbS is to solve the subproblems

    min { θ_i(x_i) + (β/2) ‖A_i x_i − b_i‖^2 | x_i ∈ X_i },   i = 1, ..., m.   (1.2)

Thus, ADM-GbS is implementable only when the subproblems (1.2) have closed-form solutions. Again, each iteration of the method proposed in this lecture consists of two steps: prediction and correction. In order to implement the prediction step, we only assume that the x_i-subproblem

    min { θ_i(x_i) + (r_i/2) ‖x_i − a_i‖^2 | x_i ∈ X_i },   i = 1, ..., m,   (1.3)

has a closed-form solution. We derive the first-order optimality condition of (1.1) and thus characterize (1.1) by a variational inequality (VI). As we will show, the VI characterization is convenient for the convergence analysis to be conducted.
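
To make the standing assumption concrete, here is a minimal sketch (not part of the slides) of one case in which the x_i-subproblem (1.3) has a closed-form solution: the illustrative choices θ_i = τ‖·‖_1 and X_i = R^{n_i}, for which the minimizer is componentwise soft-thresholding.

```python
import numpy as np

def prox_l1(a, tau, r):
    """Closed-form minimizer of  tau*||x||_1 + (r/2)*||x - a||^2  over x in R^n:
    componentwise soft-thresholding with threshold tau/r."""
    return np.sign(a) * np.maximum(np.abs(a) - tau / r, 0.0)

# example call (illustrative data only)
a = np.array([1.5, -0.2, 0.05, -3.0])
print(prox_l1(a, tau=0.5, r=1.0))
```

Any other θ_i with an inexpensive proximal operator (e.g. the nuclear norm via a singular value decomposition) would serve the same purpose; the method itself does not depend on this particular choice.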

XVIII - 4

By attaching a Lagrange multiplier vector λ ∈ R^l to the linear constraint, the Lagrange function of (1.1) is

    L(x_1, x_2, ..., x_m, λ) = Σ_{i=1}^m θ_i(x_i) − λ^T ( Σ_{i=1}^m A_i x_i − b ),   (1.4)

which is defined on W := X_1 × X_2 × ... × X_m × R^l. Let (x_1*, x_2*, ..., x_m*, λ*) be a saddle point of the Lagrange function (1.4). Then we have

    L_{λ ∈ R^l}(x_1*, x_2*, ..., x_m*, λ) ≤ L(x_1*, x_2*, ..., x_m*, λ*) ≤ L_{x_i ∈ X_i (i=1,...,m)}(x_1, x_2, ..., x_m, λ*).

For i ∈ {1, 2, ..., m}, we denote by ∂θ_i(x_i) the subdifferential of the convex function θ_i(x_i) and by f_i(x_i) ∈ ∂θ_i(x_i) a given subgradient of θ_i(x_i). It is evident that finding a saddle point of L(x_1, x_2, ..., x_m, λ) is equivalent to finding

XVIII - 5

w* = (x_1*, x_2*, ..., x_m*, λ*) ∈ W such that

    (x_1 − x_1*)^T { f_1(x_1*) − A_1^T λ* } ≥ 0,
        ⋮
    (x_m − x_m*)^T { f_m(x_m*) − A_m^T λ* } ≥ 0,
    (λ − λ*)^T ( Σ_{i=1}^m A_i x_i* − b ) ≥ 0,
                                                         (1.5)

for all w = (x_1, x_2, ..., x_m, λ) ∈ W. More compactly, (1.5) can be written as

    (w − w*)^T F(w*) ≥ 0,   ∀ w ∈ W,   (1.6a)

where

    w = (x_1; ...; x_m; λ)   and   F(w) = ( f_1(x_1) − A_1^T λ; ...; f_m(x_m) − A_m^T λ; Σ_{i=1}^m A_i x_i − b ).   (1.6b)

Note that the operator F(w) defined in (1.6b) is monotone, due to the fact that the θ_i's are all convex functions. In addition, the solution set of (1.6), denoted by W*, is also nonempty.
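
The monotonicity of F can also be checked numerically on a smooth instance. The sketch below uses illustrative choices only (m = 3, θ_i(x_i) = ½‖x_i − c_i‖^2 so that f_i(x_i) = x_i − c_i, random A_i and b): it assembles F(w) as in (1.6b) and verifies (F(w) − F(w'))^T (w − w') ≥ 0 at random points.

```python
import numpy as np

rng = np.random.default_rng(0)
m, l, n = 3, 4, 5                       # hypothetical sizes: 3 blocks, A_i in R^{4x5}
A = [rng.standard_normal((l, n)) for _ in range(m)]
c = [rng.standard_normal(n) for _ in range(m)]
b = rng.standard_normal(l)

def F(w):
    """VI mapping (1.6b) for the smooth choice theta_i(x_i) = 0.5*||x_i - c_i||^2,
    so that f_i(x_i) = x_i - c_i."""
    xs, lam = w[:m], w[m]
    rows = [xs[i] - c[i] - A[i].T @ lam for i in range(m)]
    rows.append(sum(A[i] @ xs[i] for i in range(m)) - b)
    return np.concatenate(rows)

def pack(w):
    return np.concatenate(w[:m] + [w[m]])

# monotonicity check: (F(w) - F(w'))^T (w - w') >= 0 for random w, w'
for _ in range(5):
    w1 = [rng.standard_normal(n) for _ in range(m)] + [rng.standard_normal(l)]
    w2 = [rng.standard_normal(n) for _ in range(m)] + [rng.standard_normal(l)]
    gap = (F(w1) - F(w2)) @ (pack(w1) - pack(w2))
    print(gap >= -1e-12)   # True: F is monotone
```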

XVIII - 6

2  Linearized ADM with Gaussian back substitution

2.1  Linearized ADM prediction step

Step 1. ADM step (prediction step). Obtain w̃^k = (x̃_1^k, x̃_2^k, ..., x̃_m^k, λ̃^k) in the forward (alternating) order by the following ADM procedure:

    x̃_1^k = arg min { θ_1(x_1) + q_1^T A_1 x_1 + (r_1/2) ‖x_1 − x_1^k‖^2 | x_1 ∈ X_1 },
        ⋮
    x̃_i^k = arg min { θ_i(x_i) + q_i^T A_i x_i + (r_i/2) ‖x_i − x_i^k‖^2 | x_i ∈ X_i },
        ⋮
    x̃_m^k = arg min { θ_m(x_m) + q_m^T A_m x_m + (r_m/2) ‖x_m − x_m^k‖^2 | x_m ∈ X_m },
    λ̃^k = λ^k − β ( Σ_{j=1}^m A_j x̃_j^k − b ),
                                                                                   (2.1)

where

    q_i = β ( Σ_{j=1}^{i−1} A_j x̃_j^k + Σ_{j=i}^m A_j x_j^k − b ) − λ^k.
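
A minimal sketch of one prediction sweep (2.1), under the illustrative assumptions X_i = R^{n_i} and θ_i = τ_i‖·‖_1 (so that each subproblem is solved by soft-thresholding); the list-of-arrays storage for the block variables is a hypothetical layout, not prescribed by the slides.

```python
import numpy as np

def soft(a, t):
    # prox of t*||.||_1: componentwise soft-thresholding
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def prediction_step(x, lam, A, b, beta, r, tau):
    """One forward sweep of the linearized ADM prediction (2.1), assuming
    X_i = R^{n_i} and theta_i = tau_i*||.||_1 (illustrative choices only)."""
    m = len(A)
    x_t = [xi.copy() for xi in x]          # x_t[i] will hold the predictor \tilde x_i^k
    for i in range(m):
        # residual uses already-updated blocks j < i and old blocks j >= i
        res = sum(A[j] @ x_t[j] for j in range(i)) \
            + sum(A[j] @ x[j] for j in range(i, m)) - b
        q_i = beta * res - lam
        # argmin theta_i(x_i) + q_i^T A_i x_i + (r_i/2)||x_i - x_i^k||^2
        x_t[i] = soft(x[i] - A[i].T @ q_i / r[i], tau[i] / r[i])
    lam_t = lam - beta * (sum(A[j] @ x_t[j] for j in range(m)) - b)
    return x_t, lam_t
```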

XVIII - 7

The prediction step is implementable due to the assumption (1.3) of this lecture and the identity

    arg min { θ_i(x_i) + q_i^T A_i x_i + (r_i/2) ‖x_i − x_i^k‖^2 | x_i ∈ X_i }
      = arg min { θ_i(x_i) + (r_i/2) ‖x_i − ( x_i^k − (1/r_i) A_i^T q_i )‖^2 | x_i ∈ X_i }.

Assumption: r_i (i = 1, ..., m) is chosen such that the condition

    r_i ‖x_i^k − x̃_i^k‖^2 ≥ β ‖A_i (x_i^k − x̃_i^k)‖^2   (2.2)

is satisfied in each iteration.

In the case that A_i = I_{n_i}, we take r_i = β, and the condition (2.2) is satisfied. Note that in this case we have

    arg min_{x_i ∈ X_i} { θ_i(x_i) + [ β ( Σ_{j=1}^{i−1} A_j x̃_j^k + Σ_{j=i}^m A_j x_j^k − b ) − λ^k ]^T A_i x_i + (β/2) ‖x_i − x_i^k‖^2 }
      = arg min_{x_i ∈ X_i} { θ_i(x_i) + (β/2) ‖ Σ_{j=1}^{i−1} A_j x̃_j^k + A_i x_i + Σ_{j=i+1}^m A_j x_j^k − b − (1/β) λ^k ‖^2 },

that is, for A_i = I_{n_i} and r_i = β the linearized subproblem coincides with the original ADM subproblem.
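
Condition (2.2) is iterate-dependent. A simple iterate-independent sufficient choice, not stated on the slide but immediate from ‖A_i d‖ ≤ ‖A_i‖_2 ‖d‖, is r_i ≥ β ‖A_i‖_2^2 (spectral norm); then r_i ‖d‖^2 ≥ β ‖A_i d‖^2 for every d, so (2.2) holds at every iteration. A sketch:

```python
import numpy as np

def choose_r(A_blocks, beta, safety=1.0):
    """Iterate-independent choice r_i = safety * beta * ||A_i||_2^2, which
    guarantees condition (2.2) for every direction d."""
    return [safety * beta * np.linalg.norm(Ai, 2) ** 2 for Ai in A_blocks]

# quick numerical check of (2.2) on random directions (illustrative sizes)
rng = np.random.default_rng(1)
A = [rng.standard_normal((4, 6)) for _ in range(3)]
beta = 0.7
r = choose_r(A, beta)
for Ai, ri in zip(A, r):
    d = rng.standard_normal(6)
    assert ri * (d @ d) >= beta * np.linalg.norm(Ai @ d) ** 2 - 1e-10
```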

XVIII - 8

2.2  Correction by the Gaussian back substitution

To present the Gaussian back substitution procedure, we define the matrices

    M = [ r_1 I_{n_1}      0            ...        0              0
          βA_2^T A_1     r_2 I_{n_2}    ...        0              0
             ⋮               ⋮           ⋱         ⋮              ⋮
          βA_m^T A_1      ...     βA_m^T A_{m−1}   r_m I_{n_m}    0
          0                0            ...        0         (1/β) I_l ]   (2.3)

and

    H = diag( r_1 I_{n_1}, r_2 I_{n_2}, ..., r_m I_{n_m}, (1/β) I_l ).   (2.4)

Note that for β > 0 and r_i > 0, the matrix M defined in (2.3) is a non-singular

XVIII - 9

lower-triangular block matrix. In addition, according to (2.3) and (2.4), we have

    H^{−1} M^T =
      [ I_{n_1}   (β/r_1) A_1^T A_2   ...    (β/r_1) A_1^T A_m             0
        0          I_{n_2}            ...    (β/r_2) A_2^T A_m             0
        ⋮             ⋮                ⋱           ⋮                       ⋮
        0          0       ...   I_{n_{m−1}}  (β/r_{m−1}) A_{m−1}^T A_m    0
        0          0       ...        0        I_{n_m}                     0
        0          0       ...        0        0                          I_l ],   (2.5)

which is an upper-triangular block matrix whose diagonal blocks are identity matrices. The Gaussian back substitution procedure to be proposed is based on the matrix H^{−1} M^T defined in (2.5).
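
The block structure of (2.3)–(2.5) can be checked directly. A small numpy sketch (hypothetical sizes) assembling M and H and verifying that H^{−1}M^T is unit upper triangular:

```python
import numpy as np
from scipy.linalg import block_diag

def build_M_H(A, r, beta):
    """Assemble the block matrices M (2.3) and H (2.4) for given A_1,...,A_m."""
    m = len(A)
    l = A[0].shape[0]
    n = [Ai.shape[1] for Ai in A]
    blocks = [[np.zeros((n[i], n[j])) for j in range(m)] + [np.zeros((n[i], l))]
              for i in range(m)]
    for i in range(m):
        blocks[i][i] = r[i] * np.eye(n[i])
        for j in range(i):
            blocks[i][j] = beta * A[i].T @ A[j]
    blocks.append([np.zeros((l, n[j])) for j in range(m)] + [np.eye(l) / beta])
    M = np.block(blocks)
    H = block_diag(*([ri * np.eye(ni) for ri, ni in zip(r, n)] + [np.eye(l) / beta]))
    return M, H

rng = np.random.default_rng(2)
A = [rng.standard_normal((3, 2)) for _ in range(3)]
M, H = build_M_H(A, r=[2.0, 2.0, 2.0], beta=1.0)
U = np.linalg.solve(H, M.T)                    # H^{-1} M^T, cf. (2.5)
print(np.allclose(np.tril(U, -1), 0))          # strictly lower part vanishes
print(np.allclose(np.diag(U), 1.0))            # unit diagonal
```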

XVIII - 10

Step 2. Gaussian back substitution step (correction step). Correct the ADM output w̃^k in the backward order by the following Gaussian back substitution procedure and generate the new iterate w^{k+1}:

    H^{−1} M^T ( w^{k+1} − w^k ) = α ( w̃^k − w^k ).   (2.6)

Recall that the matrix H^{−1} M^T defined in (2.5) is an upper-triangular block matrix. The Gaussian back substitution step (2.6) is thus very easy to execute. In fact, as we mentioned, after the predictor is generated by the linearized ADM scheme (2.1) in the forward (alternating) order, the proposed Gaussian back substitution step corrects the predictor in the backward order. Since the Gaussian back substitution step is easy to perform, the computation of each iteration of the ADM with Gaussian back substitution is dominated by the ADM procedure (2.1).

To show the main idea with clearer notation, we restrict our theoretical discussion to the case with fixed β > 0. The Gaussian back substitution step (2.6) can be rewritten as

    w^{k+1} = w^k − α M^{−T} H ( w^k − w̃^k ).   (2.7)

As we will show, −M^{−T} H (w^k − w̃^k) is a descent direction of the distance function
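
Because H^{−1}M^T is unit upper triangular by blocks, (2.6) can be solved by a backward sweep over the blocks (λ and x_m have identity rows, then x_{m−1} down to x_1 use the already-computed blocks). A minimal sketch, with the same hypothetical list-of-arrays layout as in the prediction sketch:

```python
import numpy as np

def back_substitution(x, lam, x_t, lam_t, A, r, beta, alpha):
    """Gaussian back substitution (2.6): solve H^{-1}M^T (w^{k+1}-w^k) = alpha*(wt^k - w^k)
    block by block in the backward order."""
    m = len(A)
    d_lam = alpha * (lam_t - lam)                 # lambda row of (2.5) is the identity
    d_x = [None] * m
    d_x[m - 1] = alpha * (x_t[m - 1] - x[m - 1])  # row m has no coupling to later blocks
    for i in range(m - 2, -1, -1):
        coupling = sum(A[j] @ d_x[j] for j in range(i + 1, m))
        d_x[i] = alpha * (x_t[i] - x[i]) - (beta / r[i]) * (A[i].T @ coupling)
    x_new = [x[i] + d_x[i] for i in range(m)]
    lam_new = lam + d_lam
    return x_new, lam_new
```

Note that only matrix-vector products with A_j and A_i^T are needed, which is why the cost per iteration is dominated by the prediction step (2.1).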

XVIII - 11

(1/2) ‖w − w*‖_G^2 with G = M H^{−1} M^T at the point w = w^k, for any w* ∈ W*. In this sense, the proposed linearized ADM with Gaussian back substitution can also be regarded as an ADM-based contraction method in which the output of the linearized ADM scheme (2.1) contributes a descent direction of the distance function. Thus, the constant α in (2.6) plays the role of a step size along this descent direction. In fact, we can choose the step size dynamically based on some techniques in the literature (e.g. [4]), and the Gaussian back substitution procedure with the constant α can be modified accordingly into the following variant with a dynamical step size:

    H^{−1} M^T ( w^{k+1} − w^k ) = γ α_k* ( w̃^k − w^k ),   (2.8)

where

    α_k* = ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ) / ( 2 ‖w^k − w̃^k‖_H^2 ),   (2.9)

XVIII - 12

    Q = [ βA_1^T A_1   βA_1^T A_2   ...   βA_1^T A_m   A_1^T
          βA_2^T A_1   βA_2^T A_2   ...   βA_2^T A_m   A_2^T
             ⋮             ⋮         ⋱        ⋮          ⋮
          βA_m^T A_1   βA_m^T A_2   ...   βA_m^T A_m   A_m^T
          A_1          A_2          ...   A_m        (1/β) I_l ],   (2.10)

and γ ∈ (0, 2). Indeed, for any β > 0, the symmetric matrix Q is positive semi-definite. Then, for given w^k and the w̃^k obtained by the ADM procedure (2.1), we have

    ‖w^k − w̃^k‖_H^2 = Σ_{i=1}^m r_i ‖x_i^k − x̃_i^k‖^2 + (1/β) ‖λ^k − λ̃^k‖^2

and

    ‖w^k − w̃^k‖_Q^2 = β ‖ Σ_{i=1}^m A_i (x_i^k − x̃_i^k) + (1/β) (λ^k − λ̃^k) ‖^2,

where the norm ‖w‖_H (‖w‖_Q, respectively) is defined by ‖w‖_H^2 = w^T H w (‖w‖_Q^2 = w^T Q w, respectively). Note that the step size α_k* defined in (2.9) satisfies α_k* ≥ 1/2.
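
A sketch of the step-size computation (2.9), using the two explicit norm formulas displayed above (same hypothetical list-of-arrays layout as in the earlier sketches):

```python
import numpy as np

def step_size(x, lam, x_t, lam_t, A, r, beta):
    """Dynamic step size alpha_k^* of (2.9), via the explicit formulas for
    ||w^k - wt^k||_H^2 and ||w^k - wt^k||_Q^2 given on this slide."""
    m = len(A)
    dx = [x[i] - x_t[i] for i in range(m)]
    dlam = lam - lam_t
    norm_H = sum(r[i] * (dx[i] @ dx[i]) for i in range(m)) + (dlam @ dlam) / beta
    s = sum(A[i] @ dx[i] for i in range(m)) + dlam / beta
    norm_Q = beta * (s @ s)
    return (norm_H + norm_Q) / (2.0 * norm_H)   # always >= 1/2
```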

XVIII - 13

3  Convergence of the Linearized ADM-GbS

In this section, we prove the convergence of the proposed linearized ADM with Gaussian back substitution for solving (1.1). Our proof follows the analytic framework of contractive type methods. Accordingly, we divide this section into three subsections.

3.1  Verification of the descent directions

In this subsection, we mainly show that −M^{−T} H (w^k − w̃^k), generated from (w^k − w̃^k), is a descent direction of the function (1/2) ‖w − w*‖_G^2 at the point w = w^k whenever w^k ≠ w̃^k, where w̃^k is generated by the ADM scheme (2.1), w* ∈ W* and G is a positive definite matrix.

Lemma 3.1. Let w̃^k = (x̃_1^k, ..., x̃_m^k, λ̃^k) be generated by the linearized ADM step (2.1) from the given vector w^k = (x_1^k, ..., x_m^k, λ^k). Then we have w̃^k ∈ W and

    (w − w̃^k)^T { d_2(w^k, w̃^k) − d_1(w^k, w̃^k) } ≥ 0,   ∀ w ∈ W,   (3.1)

XVIII - 14

where

    d_1(w^k, w̃^k) =
      [ r_1 I_{n_1}      0            ...        0              0
        βA_2^T A_1     r_2 I_{n_2}    ...        0              0
           ⋮               ⋮           ⋱         ⋮              ⋮
        βA_m^T A_1      ...     βA_m^T A_{m−1}   r_m I_{n_m}    0
        0                0            ...        0         (1/β) I_l ]
      ( x_1^k − x̃_1^k; x_2^k − x̃_2^k; ...; x_m^k − x̃_m^k; λ^k − λ̃^k ),   (3.2)

    d_2(w^k, w̃^k) = F(w̃^k) + β ( A_1^T; A_2^T; ...; A_m^T; 0 ) ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ).   (3.3)

XVIII - 15

Proof. Since x̃_i^k is the solution of the i-th subproblem in (2.1), for i = 1, 2, ..., m, according to the optimality condition, we have x̃_i^k ∈ X_i and

    (x_i − x̃_i^k)^T { f_i(x̃_i^k) − A_i^T [ λ^k − β ( Σ_{j=1}^{i−1} A_j x̃_j^k + Σ_{j=i}^m A_j x_j^k − b ) ] + r_i ( x̃_i^k − x_i^k ) } ≥ 0,   ∀ x_i ∈ X_i.   (3.4)

By using the fact

    λ̃^k = λ^k − β ( Σ_{j=1}^m A_j x̃_j^k − b ),

the inequality (3.4) can be written as x̃_i^k ∈ X_i and

    (x_i − x̃_i^k)^T { f_i(x̃_i^k) − A_i^T λ̃^k + β A_i^T ( Σ_{j=i}^m A_j (x_j^k − x̃_j^k) ) + r_i ( x̃_i^k − x_i^k ) } ≥ 0,   ∀ x_i ∈ X_i.   (3.5)

XVIII - 16

Summing the inequality (3.5) over i = 1, ..., m, we obtain x̃^k ∈ X and

    Σ_{i=1}^m (x_i − x̃_i^k)^T { f_i(x̃_i^k) − A_i^T λ̃^k + β A_i^T ( Σ_{j=i}^m A_j (x_j^k − x̃_j^k) ) }
      ≥ Σ_{i=1}^m (x_i − x̃_i^k)^T r_i ( x_i^k − x̃_i^k )   (3.6)

XVIII - 17

for all x ∈ X. Adding the term

    β Σ_{i=2}^m (x_i − x̃_i^k)^T A_i^T ( Σ_{j=1}^{i−1} A_j (x_j^k − x̃_j^k) )

to both sides of (3.6), we get x̃^k ∈ X and, for all x ∈ X,

    Σ_{i=1}^m (x_i − x̃_i^k)^T { f_i(x̃_i^k) − A_i^T λ̃^k + β A_i^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ) }
      ≥ Σ_{i=1}^m (x_i − x̃_i^k)^T { r_i (x_i^k − x̃_i^k) + β A_i^T ( Σ_{j=1}^{i−1} A_j (x_j^k − x̃_j^k) ) }.   (3.7)

XVIII - 18

Because Σ_{j=1}^m A_j x̃_j^k − b = (1/β)(λ^k − λ̃^k), we have

    (λ − λ̃^k)^T ( Σ_{j=1}^m A_j x̃_j^k − b ) = (λ − λ̃^k)^T (1/β) (λ^k − λ̃^k).

Adding (3.7) and the last equality together, we get w̃^k ∈ W and, for all w ∈ W,

    Σ_{i=1}^m (x_i − x̃_i^k)^T { f_i(x̃_i^k) − A_i^T λ̃^k + β A_i^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ) } + (λ − λ̃^k)^T ( Σ_{j=1}^m A_j x̃_j^k − b )
      ≥ Σ_{i=1}^m (x_i − x̃_i^k)^T { r_i (x_i^k − x̃_i^k) + β A_i^T ( Σ_{j=1}^{i−1} A_j (x_j^k − x̃_j^k) ) } + (λ − λ̃^k)^T (1/β) (λ^k − λ̃^k).   (3.8)

Using the notation of d_1(w^k, w̃^k) and d_2(w^k, w̃^k), this is exactly (3.1), and the assertion is proved. □

XVIII - 19

Lemma 3.2. Let w̃^k = (x̃_1^k, x̃_2^k, ..., x̃_m^k, λ̃^k) be generated by the ADM step (2.1) from the given vector w^k = (x_1^k, ..., x_m^k, λ^k). Then we have

    (w̃^k − w*)^T d_1(w^k, w̃^k) ≥ (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ),   ∀ w* ∈ W*,   (3.9)

where d_1(w^k, w̃^k) is defined in (3.2).

Proof. Since w* ∈ W, it follows from (3.1) that

    (w̃^k − w*)^T d_1(w^k, w̃^k) ≥ (w̃^k − w*)^T d_2(w^k, w̃^k).   (3.10)

We consider the right-hand side of (3.10). By using (3.3), we get

    (w̃^k − w*)^T d_2(w^k, w̃^k) = β ( Σ_{j=1}^m A_j (x̃_j^k − x_j*) )^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ) + (w̃^k − w*)^T F(w̃^k).   (3.11)

Then we look at the right-hand side of (3.11). Since w̃^k ∈ W and w* ∈ W*, by using the monotonicity of F, we have

    (w̃^k − w*)^T F(w̃^k) ≥ (w̃^k − w*)^T F(w*) ≥ 0.

XVIII - 20

Because

    Σ_{j=1}^m A_j x_j* = b   and   β ( Σ_{j=1}^m A_j x̃_j^k − b ) = λ^k − λ̃^k,

it follows from (3.11) that

    (w̃^k − w*)^T d_2(w^k, w̃^k) ≥ (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ).   (3.12)

Substituting (3.12) into (3.10), the assertion (3.9) follows immediately. □

Since (see (2.3) and (3.2))

    d_1(w^k, w̃^k) = M (w^k − w̃^k),   (3.13)

it follows from (3.9) that

    (w̃^k − w*)^T M (w^k − w̃^k) ≥ (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ),   ∀ w* ∈ W*.   (3.14)

Now, based on the last two lemmas, we are ready to prove the main theorem.

XVIII - 21

Theorem 3.1 (Main Theorem). Let w̃^k = (x̃_1^k, ..., x̃_m^k, λ̃^k) be generated by the ADM step (2.1) from the given vector w^k = (x_1^k, ..., x_m^k, λ^k). Then we have

    (w^k − w*)^T M (w^k − w̃^k) ≥ (1/2) ‖w^k − w̃^k‖_H^2 + (1/2) ‖w^k − w̃^k‖_Q^2,   ∀ w* ∈ W*,   (3.15)

where M, H and Q are defined in (2.3), (2.4) and (2.10), respectively.

Proof. First, it follows from (3.14) that

    (w^k − w*)^T M (w^k − w̃^k) ≥ (w^k − w̃^k)^T M (w^k − w̃^k) + (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ),   (3.16)

for all w* ∈ W*.

XVIII - 22

Now we treat the terms on the right-hand side of (3.16). Using the matrix M (see (2.3)), we have

    (w^k − w̃^k)^T M (w^k − w̃^k)
      = Σ_{i=1}^m r_i ‖x_i^k − x̃_i^k‖^2 + β Σ_{i=2}^m (x_i^k − x̃_i^k)^T A_i^T ( Σ_{j=1}^{i−1} A_j (x_j^k − x̃_j^k) ) + (1/β) ‖λ^k − λ̃^k‖^2.   (3.17)

XVIII - 23

For the second term on the right-hand side of (3.16), by a simple manipulation, we obtain

    (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ) = (w^k − w̃^k)^T N (w^k − w̃^k),   (3.18)

where N denotes the block matrix whose only nonzero block row is the last one, namely ( A_1, A_2, ..., A_m, 0 ).

XVIII - 24

Adding (3.17) and (3.18) together, it follows that

    (w^k − w̃^k)^T M (w^k − w̃^k) + (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) )
      = (w^k − w̃^k)^T ( M + N ) (w^k − w̃^k)
      = (1/2) (w^k − w̃^k)^T [ (M + N) + (M + N)^T ] (w^k − w̃^k),

where

    (M + N) + (M + N)^T =
      [ 2r_1 I_{n_1}    βA_1^T A_2     ...     βA_1^T A_m      A_1^T
        βA_2^T A_1     2r_2 I_{n_2}    ...     βA_2^T A_m      A_2^T
           ⋮                ⋮           ⋱          ⋮             ⋮
        βA_m^T A_1     βA_m^T A_2      ...    2r_m I_{n_m}     A_m^T
        A_1            A_2             ...     A_m          (2/β) I_l ].

XVIII - 25

Applying the notation of the matrices H and Q and the condition (2.2) to the right-hand side of the last equality, we obtain

    (w^k − w̃^k)^T M (w^k − w̃^k) + (λ^k − λ̃^k)^T ( Σ_{j=1}^m A_j (x_j^k − x̃_j^k) ) ≥ (1/2) ‖w^k − w̃^k‖_H^2 + (1/2) ‖w^k − w̃^k‖_Q^2.

Substituting the last inequality into (3.16), the theorem is proved. □

It follows from (3.15) that

    ⟨ M H^{−1} M^T (w^k − w*), M^{−T} H (w^k − w̃^k) ⟩ ≥ (1/2) ‖w^k − w̃^k‖_{H+Q}^2.

In other words, by setting

    G = M H^{−1} M^T,   (3.19)

M H^{−1} M^T (w^k − w*) is the gradient of the distance function (1/2) ‖w − w*‖_G^2 at w = w^k, and −M^{−T} H (w^k − w̃^k) is a descent direction of (1/2) ‖w − w*‖_G^2 at the current point w^k whenever w^k ≠ w̃^k.
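
Both facts used on this slide can be verified numerically on random data. The sketch below (hypothetical sizes; it uses the uniform choice r_i = β‖A_i‖_2^2 in place of the iterate-dependent condition (2.2)) checks (i) that the symmetric matrix of slide XVIII - 24 dominates H + Q, and (ii) that with G = MH^{−1}M^T the inner product ⟨G(w^k − w*), M^{−T}H(w^k − w̃^k)⟩ reduces to (w^k − w*)^T M (w^k − w̃^k).

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(3)
m, l = 3, 4
n = [2, 3, 2]
A = [rng.standard_normal((l, ni)) for ni in n]
beta = 0.9
r = [beta * np.linalg.norm(Ai, 2) ** 2 for Ai in A]   # uniform version of (2.2)

def zeros(p, q):
    return np.zeros((p, q))

# assemble M (2.3), Q (2.10) and N, the matrix appearing in (3.18)
M_rows, Q_rows, N_rows = [], [], []
for i in range(m):
    M_rows.append([beta * A[i].T @ A[j] if j < i else
                   (r[i] * np.eye(n[i]) if j == i else zeros(n[i], n[j]))
                   for j in range(m)] + [zeros(n[i], l)])
    Q_rows.append([beta * A[i].T @ A[j] for j in range(m)] + [A[i].T])
    N_rows.append([zeros(n[i], n[j]) for j in range(m)] + [zeros(n[i], l)])
M_rows.append([zeros(l, nj) for nj in n] + [np.eye(l) / beta])
Q_rows.append([A[j] for j in range(m)] + [np.eye(l) / beta])
N_rows.append([A[j] for j in range(m)] + [zeros(l, l)])
M, Q, N = np.block(M_rows), np.block(Q_rows), np.block(N_rows)
H = block_diag(*([ri * np.eye(ni) for ri, ni in zip(r, n)] + [np.eye(l) / beta]))

# (i) the symmetric matrix of slide XVIII - 24 dominates H + Q (this is where (2.2) enters)
S = (M + N) + (M + N).T
print(np.all(np.linalg.eigvalsh(S - (H + Q)) > -1e-8))

# (ii) with G = M H^{-1} M^T, the inner product reduces to (w^k - w*)^T M (w^k - wt^k)
G = M @ np.linalg.solve(H, M.T)
u = rng.standard_normal(M.shape[0])   # stand-in for w^k - w*
v = rng.standard_normal(M.shape[0])   # stand-in for w^k - wt^k
print(np.isclose((G @ u) @ np.linalg.solve(M.T, H @ v), u @ (M @ v)))
```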

XVIII - 26

3.2  The contractive property

In this subsection, we mainly prove that the sequence generated by the proposed ADM with Gaussian back substitution is contractive with respect to the set W*. Note that we follow the usual definition of contractive type methods. With this contractive property, the convergence of the proposed linearized ADM with Gaussian back substitution can be easily derived with routine analysis.

Theorem 3.2. Let w̃^k = (x̃_1^k, ..., x̃_m^k, λ̃^k) be generated by the ADM step (2.1) from the given vector w^k = (x_1^k, ..., x_m^k, λ^k), and let the matrix G be given by (3.19). For the new iterate w^{k+1} produced by the Gaussian back substitution (2.7), there exists a constant c_0 > 0 such that

    ‖w^{k+1} − w*‖_G^2 ≤ ‖w^k − w*‖_G^2 − c_0 ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ),   ∀ w* ∈ W*,   (3.20)

where H and Q are defined in (2.4) and (2.10), respectively.

XVIII - 27

Proof. For G = M H^{−1} M^T and any α > 0, we obtain

    ‖w^k − w*‖_G^2 − ‖w^{k+1} − w*‖_G^2
      = ‖w^k − w*‖_G^2 − ‖(w^k − w*) − α M^{−T} H (w^k − w̃^k)‖_G^2
      = 2α (w^k − w*)^T M (w^k − w̃^k) − α^2 ‖w^k − w̃^k‖_H^2.   (3.21)

Substituting the result of Theorem 3.1 into the right-hand side of the last equation, we get

    ‖w^k − w*‖_G^2 − ‖w^{k+1} − w*‖_G^2 ≥ α ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ) − α^2 ‖w^k − w̃^k‖_H^2
      = α(1 − α) ‖w^k − w̃^k‖_H^2 + α ‖w^k − w̃^k‖_Q^2,

and thus

    ‖w^{k+1} − w*‖_G^2 ≤ ‖w^k − w*‖_G^2 − α ( (1 − α) ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ),   ∀ w* ∈ W*.   (3.22)

Set c_0 = α(1 − α). Recall that α ∈ [0.5, 1). The assertion is proved. □

Corollary 3.1. The assertion of Theorem 3.2 also holds if the Gaussian back substitution step is (2.8).

XVIII - 28

Proof. Analogous to the proof of Theorem 3.2, we have

    ‖w^k − w*‖_G^2 − ‖w^{k+1} − w*‖_G^2 = 2 γ α_k* (w^k − w*)^T M (w^k − w̃^k) − (γ α_k*)^2 ‖w^k − w̃^k‖_H^2,   (3.23)

where α_k* is given by (2.9). According to (2.9), we have

    α_k* ‖w^k − w̃^k‖_H^2 = (1/2) ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ).

Then it follows from the above equality and (3.15) that

    ‖w^k − w*‖_G^2 − ‖w^{k+1} − w*‖_G^2
      ≥ γ α_k* ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ) − (1/2) γ^2 α_k* ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 )
      = (1/2) γ (2 − γ) α_k* ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ).

XVIII - 29

Because α_k* ≥ 1/2, it follows from the last inequality that

    ‖w^{k+1} − w*‖_G^2 ≤ ‖w^k − w*‖_G^2 − (1/4) γ (2 − γ) ( ‖w^k − w̃^k‖_H^2 + ‖w^k − w̃^k‖_Q^2 ),   ∀ w* ∈ W*.   (3.24)

Since γ ∈ (0, 2), the assertion of this corollary follows from (3.24) directly. □

3.3  Convergence

The proposed lemmas and theorems are adequate to establish the global convergence of the proposed ADM with Gaussian back substitution, and the analytic framework is quite typical in the context of contractive type methods.

Theorem 3.3. Let {w^k} and {w̃^k} be the sequences generated by the proposed ADM with Gaussian back substitution. Then we have:

1. The sequence {w^k} is bounded.
2. lim_{k→∞} ‖w^k − w̃^k‖ = 0.

XVIII - 30

3. Any cluster point of {w̃^k} is a solution point of (1.6).
4. The sequence {w̃^k} converges to some w^∞ ∈ W*.

Proof. The first assertion follows from (3.20) directly. In addition, from (3.20) we get

    Σ_{k=0}^∞ c_0 ‖w^k − w̃^k‖_H^2 ≤ ‖w^0 − w*‖_G^2,

and thus lim_{k→∞} ‖w^k − w̃^k‖_H^2 = 0, and consequently

    lim_{k→∞} ‖x_i^k − x̃_i^k‖ = 0,   i = 1, ..., m,   (3.25)

and

    lim_{k→∞} ‖λ^k − λ̃^k‖ = 0.   (3.26)

The second assertion is proved. Substituting (3.25) into (3.5), for i = 1, 2, ..., m, we have

    x̃_i^k ∈ X_i,   lim_{k→∞} (x_i − x̃_i^k)^T { f_i(x̃_i^k) − A_i^T λ̃^k } ≥ 0,   ∀ x_i ∈ X_i.   (3.27)

XVIII - 31

It follows from (2.1) and (3.26) that

    lim_{k→∞} ( Σ_{j=1}^m A_j x̃_j^k − b ) = 0.   (3.28)

Combining (3.27) and (3.28), we get

    w̃^k ∈ W,   lim_{k→∞} (w − w̃^k)^T F(w̃^k) ≥ 0,   ∀ w ∈ W,   (3.29)

and thus any cluster point of {w̃^k} is a solution point of (1.6). The third assertion is proved.

It follows from the first assertion and lim_{k→∞} ‖w^k − w̃^k‖_H = 0 that {w̃^k} is also bounded. Let w^∞ be a cluster point of {w̃^k} and let the subsequence {w̃^{k_j}} converge to w^∞. It follows from (3.29) that

    w̃^{k_j} ∈ W,   lim_{j→∞} (w − w̃^{k_j})^T F(w̃^{k_j}) ≥ 0,   ∀ w ∈ W,   (3.30)

XVIII - 32

and consequently

    (x_i − x_i^∞)^T { f_i(x_i^∞) − A_i^T λ^∞ } ≥ 0,   ∀ x_i ∈ X_i,   i = 1, ..., m,
    Σ_{j=1}^m A_j x_j^∞ − b = 0.

This means that w^∞ ∈ W* is a solution point of (1.6). Since {w^k} is Fejér monotone and lim_{k→∞} ‖w^k − w̃^k‖ = 0, the sequence {w̃^k} cannot have another cluster point, and {w̃^k} converges to w^∞ ∈ W*. □

References

[1] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Nov. 2010.

[2] B. S. He, M. Tao and X. M. Yuan, Alternating Direction Method with Gaussian Back Substitution for Separable Convex Programming, SIAM J. Optim., 22, 2012.

[3] B. S. He and X. M. Yuan, Linearized Alternating Direction Method with Gaussian Back Substitution for Separable Convex Programming, HTML/2011/10/3192.html, Numerical Algebra, Control and Optimization (Special Issue in Honor of Prof. Xuchu He's 90th Birthday), 3(2), 2013.

[4] C. H. Ye and X. M. Yuan, A Descent Method for Structured Monotone Variational Inequalities, Optimization Methods and Software, 22, 2007.
