Convergence Rates for Projective Splitting

Convergence Rates for Projective Splitting

Patrick R. Johnstone, Jonathan Eckstein
Department of Management Science and Information Systems, Rutgers Business School Newark and New Brunswick, Rutgers University

August 8, 2018

Abstract

Projective splitting is a family of methods for solving inclusions involving sums of maximal monotone operators. First introduced by Eckstein and Svaiter in 2008, these methods have enjoyed significant innovation in recent years, becoming one of the most flexible operator splitting frameworks available. While weak convergence of the iterates to a solution has been established, there have been few attempts to study convergence rates of projective splitting. The purpose of this paper is to do so under various assumptions. To this end, there are three main contributions. First, in the context of convex optimization, we establish an $O(1/k)$ ergodic function convergence rate. Second, for strongly monotone inclusions, strong convergence is established, as well as an ergodic $O(1/k)$ convergence rate for the distance of the iterates to the solution. Finally, for inclusions featuring strong monotonicity and cocoercivity, linear convergence is established.

1 Introduction

For a real Hilbert space $H$, consider the problem of finding $z \in H$ such that

$$0 \in \sum_{i=1}^{n} T_i(z) \tag{1}$$

where the $T_i : H \rightrightarrows H$ are maximal monotone operators and, additionally, there exists a subset $I_F \subseteq \{1, \ldots, n\}$ such that for all $i \in I_F$ the operator $T_i$ is Lipschitz continuous. An important instance of this problem is

$$\min_{z \in H} F(z), \quad \text{where } F(z) = \sum_{i=1}^{n} f_i(z) \tag{2}$$

and every $f_i : H \to (-\infty, +\infty]$ is closed, proper, and convex, with some subset of the functions also being Fréchet differentiable with Lipschitz-continuous gradients. Under appropriate constraint qualifications, (1) and (2) are equivalent. Problem (2) arises in a host

of applications such as machine learning, signal and image processing, inverse problems, and computer vision; see [3, 5, 6] for some examples.

A relatively recently proposed class of operator splitting algorithms which can solve (1), among other problems, is projective splitting. It originated with [10] and was then generalized to more than two operators in [11]. The related algorithm in [1] introduced a technique for handling compositions of linear and monotone operators, and [4] proposed an extension to block-iterative and asynchronous operation; block-iterative operation means that only a subset of the operators making up the problem need to be considered at each iteration (this approach may be called "incremental" in the optimization literature). A restricted and simplified version of this framework appears in [8]. The recent work [15] incorporated forward steps into the projective splitting framework for any Lipschitz continuous operators and introduced backtracking and adaptive stepsize rules.

In general, projective splitting offers unprecedented levels of flexibility compared with previous operator splitting algorithms (e.g., [20, 17, 24, 7]). The framework can be applied to arbitrary sums of maximal monotone operators, the stepsizes can vary by operator and by iteration, compositions with linear operators can be handled, and block-iterative asynchronous implementations have been demonstrated.

In previous works on projective splitting, the main theoretical goal was to establish the weak convergence of the iterates to a solution of the monotone inclusion under study (either a special case or a generalization of (1)). This goal was achieved using Fejér-monotonicity arguments in coordination with the unique properties of projective splitting. The question of convergence rates has not been addressed, with the sole exception of [19], which considers a different type of convergence rate than those investigated here; we discuss the differences between our analysis and that of [19] in more detail below.

Contributions. To this end, there are four main novel contributions in this paper.

1. For (2), we establish an ergodic $O(1/k)$ function value convergence rate for iterates generated by projective splitting.

2. When one of the operators in (1) is strongly monotone, we establish strong convergence, rather than weak, in the general Hilbert space setting, without using the Haugazeau [13] modification employed to obtain general strong convergence in [4]. Furthermore, we derive an ergodic $O(1/k)$ convergence rate for the distance of the iterates to the unique solution of (1).

3. If additionally $T_1, \ldots, T_{n-1}$ are cocoercive, we establish linear convergence to 0 of the distance of the iterates to the unique solution.

4. We discuss the special cases of projective splitting when $n = 1$. Interestingly, projective splitting reduces to one of two well-known algorithms in this case, depending on whether forward or backward steps are used. This observation has implications for the convergence rate analysis.

The primary engine of the analysis is a new summability lemma (Lemma 8 below) in which important quantities of the algorithm are shown to be summable. This summability

is directly exploited in the ergodic function value rate analysis in Section 6. In Section 8, the same lemma is used to show linear convergence when strong monotonicity and cocoercivity are present. With only strong monotonicity present, we also obtain strong convergence and rates using a novel analysis in Section 7.

Our convergence rates apply directly to the variants of projective splitting discussed in [4, 10, 15]. The papers [8, 11] use a slightly different separating hyperplane formulation than ours, but the difference is easy to resolve. However, our analysis does not allow for the asynchrony or block-iterative effects developed in [4, 8, 15]. In particular, at each iteration we assume that every operator is processed and that the computations use the most up-to-date information. Developing a convergence rate analysis which extends to asynchronous and block-iterative operation in a satisfactory manner is a matter for future work.

In [4, 8, 15], projective splitting was extended to handle the composition of each $T_i$ with a linear operator. While it is possible to extend all of our convergence rate results to allow for this generalization under appropriate conditions, for the sake of readability we will not do so here.

In Section 9 we consider the case $n = 1$. In this case, we show that projective splitting reduces to the proximal point method [22] if one uses backward steps, or to a special case of the extragradient method with no constraint [16, 21] when one uses forward steps. Since projective splitting includes the proximal point method as a special case, the $O(1/k)$ function value convergence rate derived in Section 6 cannot be improved, since this is the best rate available for the proximal point method, as established in [12].

The specific outline of the paper is as follows. Immediately below, we discuss in more detail the convergence rate analysis for projective splitting conducted in [19] and how it differs from our analysis. Section 2 presents notation, basic mathematical results, and assumptions. Section 3 introduces the projective splitting framework under study, along with some associated assumptions. Section 4 recalls some important lemmas from [15]. Section 5 proves some new lemmas necessary for the convergence rate results, including the key summability lemma, Lemma 8. Section 6 derives the ergodic $O(1/k)$ function value convergence rate for (2). Section 7 derives strong convergence and convergence rates under strong monotonicity. Section 8 establishes linear convergence under strong monotonicity and cocoercivity. Finally, Section 9 discusses special cases of projective splitting when $n = 1$.

Comparison with [19]. To the best of our knowledge, the only works attempting to quantify convergence rates of projective splitting are [18] and [19], two works by the same author. The analysis in [18] concerns a dual application of projective splitting, and its convergence rate results are similar to those in [19]. In these works, convergence rates are not defined in the more customary way they are in this paper, either in terms of the distance to the solution or the gap between current function values and the optimal value of (2). Instead, in [19] they are defined in terms of an approximate solution criterion for the monotone inclusion under study, specifically (1) with $n = 2$. Without any enlargement being applied to the operators, the approximate solution condition is as follows: a point $(x, y) \in H^2$ is said to be an $\epsilon$-approximate solution of (1) with $n = 2$ if there exist $a, b \in H$ such that

$$a \in T_1(x), \quad b \in T_2(y), \quad \text{and} \quad \max\{\|a + b\|, \|x - y\|\} \leq \epsilon.$$

If this condition holds with $\epsilon = 0$, then $x = y$ is a solution to (1). The iteration complexity of a method is then defined in terms of the

number of iterations required to produce a point $(x, y)$ which is an $\epsilon$-approximate solution in this sense. We stress that this notion of iteration complexity/convergence rate is different from what we use here. Instead, we directly analyze the distance of the points generated by the algorithm to a solution of (1); that is, we study the behavior of $\|z^k - z^*\|$, where $z^*$ solves (1). Or, for the special case of (2), we consider the convergence rate of $F(z^k) - F^*$ to zero, where $F^*$ is the optimal value.

2 Mathematical Preliminaries

2.1 Notation, Terminology, and Basic Lemmas

Throughout this paper we will use the standard convention that a sum over an empty set of indices, such as $\sum_{i=1}^{n-1} a_i$ with $n = 1$, is taken to be 0. A single-valued operator $A : H \to H$ is called cocoercive if, for some $\Gamma > 0$,

$$(\forall\, u, v \in H) \qquad \langle u - v, Au - Av \rangle \geq \Gamma^{-1} \|Au - Av\|^2.$$

Every cocoercive operator is $\Gamma$-Lipschitz continuous, but not vice versa. For any maximal monotone operator $A : H \rightrightarrows H$ and scalar $\rho > 0$ we will use the notation $\operatorname{prox}_{\rho A} = (I + \rho A)^{-1}$ to denote the proximal operator, also known as the backward or implicit step with respect to $A$. In particular,

$$x = \operatorname{prox}_{\rho A}(a) \iff \exists\, y \in Ax : \; x + \rho y = a, \tag{3}$$

and the $x$ and $y$ satisfying this relation are unique. Furthermore, $\operatorname{prox}_{\rho A}$ is defined everywhere and $\operatorname{range}(\operatorname{prox}_{\rho A}) = \operatorname{dom}(A)$ [2, Proposition 23.2]. In the special case where $A = \partial f$ for some convex, closed, and proper function $f$, the proximal operator may be written as

$$\operatorname{prox}_{\rho \partial f}(a) = \arg\min_{x} \left\{ \tfrac{1}{2}\|x - a\|^2 + \rho f(x) \right\}. \tag{4}$$

In the optimization context, we will use the notational convention that $\operatorname{prox}_{\rho \partial f} = \operatorname{prox}_{\rho f}$. Finally, we will use the following two standard results:

Lemma 1. For any vectors $v_1, \ldots, v_n \in H$, $\;\big\| \sum_{i=1}^n v_i \big\|^2 \leq n \sum_{i=1}^n \|v_i\|^2$.

Lemma 2. For any $x, y, z \in H$,

$$2\langle x - y, x - z \rangle = \|x - y\|^2 + \|x - z\|^2 - \|y - z\|^2. \tag{5}$$

We will use a boldface $\mathbf{w} = (w_1, \ldots, w_{n-1})$ for elements of $H^{n-1}$.

2.2 Main Assumptions Regarding Problem (1)

Define the extended solution set (or Kuhn-Tucker set) of (1) to be

$$\mathcal{S} = \Big\{ (z, w_1, \ldots, w_{n-1}) \in H^n \;:\; w_i \in T_i(z), \; i = 1, \ldots, n-1, \; -\textstyle\sum_{i=1}^{n-1} w_i \in T_n(z) \Big\}. \tag{6}$$

Clearly $z \in H$ solves (1) if and only if there exists $\mathbf{w} \in H^{n-1}$ such that $(z, \mathbf{w}) \in \mathcal{S}$. Our main assumptions regarding (1) are as follows:

Assumption 1. $H$ is a real Hilbert space and problem (1) conforms to the following:

1. For $i = 1, \ldots, n$, the operators $T_i : H \rightrightarrows H$ are monotone.

2. For all $i$ in some subset $I_F \subseteq \{1, \ldots, n\}$, the operator $T_i$ is $L_i$-Lipschitz continuous (and thus single-valued) and $\operatorname{dom}(T_i) = H$.

3. For $i \in I_B \triangleq \{1, \ldots, n\} \setminus I_F$, the operator $T_i$ is maximal and the map $\operatorname{prox}_{\rho T_i} : H \to H$ can be computed to within the error tolerance specified below in Assumption 2 (however, these operators are not precluded from also being Lipschitz continuous).

4. The solution set $\mathcal{S}$ defined in (6) is nonempty.

Proposition 1. [15, Lemma 3] Under Assumption 1, $\mathcal{S}$ from (6) is closed and convex.

3 The Algorithm

Projective splitting is a special case of a general separator-projector method for finding a point in a closed and convex set. At each iteration the method constructs an affine function $\varphi_k : H^n \to \mathbb{R}$ which separates the current point from the target set $\mathcal{S}$ defined in (6). In other words, if $p^k$ is the current point in $H^n$ generated by the algorithm, then $\varphi_k(p^k) > 0$ and $\varphi_k(p) \leq 0$ for all $p \in \mathcal{S}$. The next point is then the projection of $p^k$ onto the hyperplane $\{p : \varphi_k(p) = 0\}$, subject to a relaxation factor $\beta_k$. What makes projective splitting an operator splitting method is that the hyperplane is constructed through individual calculations on each operator $T_i$, either prox calculations or forward steps.

3.1 The Hyperplane

Let $p = (z, \mathbf{w}) = (z, w_1, \ldots, w_{n-1})$ be a generic point in $H^n$. For $H^n$, we adopt the following norm and inner product, for some $\gamma > 0$:

$$\|(z, \mathbf{w})\|_\gamma^2 = \gamma \|z\|^2 + \sum_{i=1}^{n-1} \|w_i\|^2, \qquad \big\langle (z^1, \mathbf{w}^1), (z^2, \mathbf{w}^2) \big\rangle_\gamma = \gamma \langle z^1, z^2 \rangle + \sum_{i=1}^{n-1} \langle w_i^1, w_i^2 \rangle. \tag{7}$$

Define the following function for all $k \geq 1$:

$$\varphi_k(p) = \sum_{i=1}^{n-1} \langle z - x_i^k, \, y_i^k - w_i \rangle + \Big\langle z - x_n^k, \; y_n^k + \sum_{i=1}^{n-1} w_i \Big\rangle, \tag{8}$$

where the $(x_i^k, y_i^k)$ are chosen so that $y_i^k \in T_i(x_i^k)$ for $i = 1, \ldots, n$. This function is a special case of the separator function used in [4]. The following lemma proves some basic properties of $\varphi_k$; similar results are in [1, 4, 8] in the case $\gamma = 1$.

Lemma 3. [15, Lemma 4] Let $\varphi_k$ be defined as in (8). Then:

1. $\varphi_k$ is affine on $H^n$.

2. With respect to the inner product (7) defined on $H^n$, the gradient of $\varphi_k$ is

$$\nabla \varphi_k = \Big( \tfrac{1}{\gamma} \sum_{i=1}^{n} y_i^k, \; x_1^k - x_n^k, \; \ldots, \; x_{n-1}^k - x_n^k \Big). \tag{9}$$

3. Suppose Assumption 1 holds and that $y_i^k \in T_i(x_i^k)$ for $i = 1, \ldots, n$. Then $\varphi_k(p) \leq 0$ for all $p \in \mathcal{S}$ defined in (6).

4. If Assumption 1 holds, $y_i^k \in T_i(x_i^k)$, and $\nabla \varphi_k = 0$, then $(x_n^k, y_1^k, \ldots, y_{n-1}^k) \in \mathcal{S}$.

3.2 Projective Splitting

Algorithm 1 is the projective splitting framework for which we will derive convergence rates. It is a special case of the framework of [15] without asynchrony or block-iterative features. In particular, we assume at each iteration that the method processes every operator $T_i$, using the most up-to-date information possible. The frameworks of [4, 8, 15] also allow for the monotone operators to be composed with linear operators. As mentioned in the introduction, our analysis may be extended to this situation, but for the sake of readability we will not do so here.

Algorithm 1 is a special case of the separator-projector algorithm applied to finding a point in $\mathcal{S}$, using the affine function $\varphi_k$ defined in (8) [15, Lemma 6]. For the variables of Algorithm 1 defined on lines 18-19, define $p^k = (z^k, \mathbf{w}^k) = (z^k, w_1^k, \ldots, w_{n-1}^k)$ for all $k \geq 1$. The points $(x_i^k, y_i^k) \in \operatorname{gra} T_i$ are chosen so that $\varphi_k(p^k)$ is sufficiently large to guarantee the weak convergence of $p^k$ to a solution [15, Theorem 1]. For $i \in I_B$, a single prox calculation is required, whereas for $i \in I_F$, two forward steps are required per iteration. The algorithm has the following parameters:

- For each $k \geq 1$ and $i = 1, \ldots, n$, a positive scalar stepsize $\rho_i^k$.
- For each $k \geq 1$, an overrelaxation parameter $\beta_k \in [\underline{\beta}, \overline{\beta}]$ where $0 < \underline{\beta} \leq \overline{\beta} < 2$.
- The fixed scalar $\gamma > 0$ from (7), which controls the relative emphasis on the primal and dual variables in the projection update on lines 18-20.
- Sequences of errors $\{e_i^k\}_{k \geq 1}$ for $i \in I_B$, modeling inexact computation of the proximal steps.

To ease the mathematical presentation, we use the following notation in Algorithm 1 and throughout the rest of the paper:

$$(\forall\, k \in \mathbb{N}) \qquad w_n^k \triangleq -\sum_{i=1}^{n-1} w_i^k. \tag{10}$$

Note that when $n = 1$, we have $w_n^k = 0$ by the convention at the start of Section 2.1.

Algorithm 1: Algorithm for solving (1).

Input: $(z^1, \mathbf{w}^1) \in H^n$, $\gamma > 0$.
1:  for $k = 1, 2, \ldots$ do
2:    for $i = 1, 2, \ldots, n$ do
3:      if $i \in I_B$ then
4:        $a = z^k + \rho_i^k w_i^k + e_i^k$
5:        $x_i^k = \operatorname{prox}_{\rho_i^k T_i}(a)$
6:        $y_i^k = (\rho_i^k)^{-1}(a - x_i^k)$
7:      else
8:        $x_i^k = z^k - \rho_i^k (T_i z^k - w_i^k)$
9:        $y_i^k = T_i x_i^k$
10:   $u_i^k = x_i^k - x_n^k$, $\;\; i = 1, \ldots, n-1$
11:   $v^k = \sum_{i=1}^{n} y_i^k$
12:   $\pi_k = \|u^k\|^2 + \gamma^{-1}\|v^k\|^2$
13:   if $\pi_k > 0$ then
14:     $\varphi_k(p^k) = \langle z^k, v^k \rangle + \sum_{i=1}^{n-1} \langle w_i^k, u_i^k \rangle - \sum_{i=1}^{n} \langle x_i^k, y_i^k \rangle$
15:     $\alpha_k = \beta_k \varphi_k(p^k)/\pi_k$
16:   else
17:     return $z^{k+1} = x_n^k$, $\;(w_1^{k+1}, \ldots, w_{n-1}^{k+1}) = (y_1^k, \ldots, y_{n-1}^k)$
18:   $z^{k+1} = z^k - \gamma^{-1} \alpha_k v^k$
19:   $w_i^{k+1} = w_i^k - \alpha_k u_i^k$, $\;\; i = 1, \ldots, n-1$
20:   $w_n^{k+1} = -\sum_{i=1}^{n-1} w_i^{k+1}$

3.3 Conditions on the Errors and the Stepsizes

We now state our assumptions regarding the computational errors and stepsizes in Algorithm 1. Assumptions (11) and (12) are taken exactly from [8]. Assumption (13) is new and necessary to derive our convergence rate results. Throughout the rest of the manuscript, let $K$ be the iteration at which Algorithm 1 terminates via line 17, with $K = \infty$ if the method runs indefinitely.

Assumption 2. For some $\sigma \in [0, 1[$ and $\delta \geq 0$, the following hold for all $1 \leq k < K$ and $i \in I_B$:

$$\langle z^k - x_i^k, \, e_i^k \rangle \geq -\sigma \|z^k - x_i^k\|^2 \tag{11}$$
$$\langle e_i^k, \, y_i^k - w_i^k \rangle \leq \rho_i^k \sigma \|y_i^k - w_i^k\|^2 \tag{12}$$
$$\|e_i^k\| \leq \delta \|z^k - x_i^k\|. \tag{13}$$

Note that if (13) holds with $\delta < 1$, then (11) holds for any $\sigma \in [\delta, 1[$. For each $i \in I_B$, we will show in (33) below that the sequence $\{\|z^k - x_i^k\|\}$ is square-summable, so an eventual consequence of (13) will be that the corresponding error sequence must be square-summable, that is, $\sum_{k=1}^{K} \|e_i^k\|^2 < \infty$.
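To make the notation of Algorithm 1 concrete, the following is a minimal sketch (not part of the paper) of the method for $n = 2$ on a small lasso-type instance of (2): $f_1(z) = \tfrac12\|Az - b\|^2$ is handled by forward steps ($1 \in I_F$) and $f_2(z) = \lambda\|z\|_1$ by exact proximal steps ($2 \in I_B$, $e_2^k = 0$). The problem data, stepsizes, and the choices $\beta_k = 1$, $\gamma = 1$ are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 10)), rng.standard_normal(20), 0.1

T1 = lambda z: A.T @ (A @ z - b)                       # gradient of f1 (Lipschitz)
prox2 = lambda a, rho: np.sign(a) * np.maximum(np.abs(a) - rho * lam, 0.0)

L1 = np.linalg.norm(A, 2) ** 2                         # Lipschitz constant of T1
rho1, rho2, beta, gamma = 0.9 / (2 * L1), 1.0, 1.0, 1.0  # rho1 well below the bound of Assumption 3

z, w1 = np.zeros(10), np.zeros(10)                     # p^k = (z^k, w_1^k); w_2^k = -w_1^k by (10)
for k in range(1000):
    # forward steps for T1 (lines 8-9), exact prox step for T2 (lines 4-6 with e = 0)
    x1 = z - rho1 * (T1(z) - w1)
    y1 = T1(x1)
    a = z + rho2 * (-w1)
    x2 = prox2(a, rho2)
    y2 = (a - x2) / rho2
    # build the separator and project (lines 10-20)
    u1, v = x1 - x2, y1 + y2
    pi = u1 @ u1 + (1.0 / gamma) * (v @ v)
    if pi <= 0.0:                                      # line 17: (x2, y1) already solves (1)
        z, w1 = x2, y1
        break
    phi = z @ v + w1 @ u1 - (x1 @ y1 + x2 @ y2)
    alpha = beta * phi / pi
    z = z - (alpha / gamma) * v
    w1 = w1 - alpha * u1

print("final objective:", 0.5 * np.linalg.norm(A @ z - b) ** 2 + lam * np.abs(z).sum())
```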

Assumption 3. The stepsizes satisfy

$$\rho_{\min} \triangleq \min_{i=1,\ldots,n} \; \inf_{1 \leq k < K} \rho_i^k > 0 \tag{14}$$
$$i \in I_B \;\Longrightarrow\; \sup_{1 \leq k < K} \rho_i^k < \infty \tag{15}$$
$$i \in I_F \;\Longrightarrow\; \sup_{1 \leq k < K} \rho_i^k < \frac{1}{L_i}. \tag{16}$$

From here on, we let $\rho = \max_{i \in I_B} \sup_{1 \leq k < K} \rho_i^k$ and $L = \max_{i \in I_F} L_i$, with the convention that $\rho = 0$ if $I_B = \emptyset$ and $L = 0$ if $I_F = \emptyset$.

The recent work [15] includes several extensions to the basic forward-step stepsize constraint (16) under which one still obtains weak convergence of the iterates to a solution. Section 4.1 of [15] presents a backtracking linesearch that may be used when the constant $L_i$ is unknown. Section 4.2 of [15] presents a backtrack-free adaptive stepsize for the special case in which $T_i$ is affine but $L_i$ is unknown. Our convergence rate analysis holds for these extensions, but for the sake of readability we omit the details.

4 Results Similar to [15]

Lemma 4. Suppose Assumption 1 holds. In the context of Algorithm 1, recall the notation $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k)$. For all $1 \leq k < K$:

1. The iterates can be written as
$$p^{k+1} = p^k - \alpha_k \nabla \varphi_k \tag{17}$$
where $\alpha_k = \beta_k \varphi_k(p^k)/\|\nabla \varphi_k\|^2$ and $\varphi_k$ is defined in (8).

2. For all $p^* = (z^*, w_1^*, \ldots, w_{n-1}^*) \in \mathcal{S}$:
$$\|p^{k+1} - p^*\|^2 \leq \|p^k - p^*\|^2 - \frac{2 - \overline{\beta}}{\overline{\beta}} \|p^{k+1} - p^k\|^2, \tag{18}$$
which implies that
$$\sum_{t=1}^{k} \|p^t - p^{t+1}\|^2 \leq \tau \|p^1 - p^*\|^2, \quad \text{where } \tau = \overline{\beta}\,(2 - \overline{\beta})^{-1}, \tag{19}$$
$$\|z^k - z^*\|^2 \leq \gamma^{-1} \|p^k - p^*\|^2 \leq \gamma^{-1} \|p^1 - p^*\|^2, \tag{20}$$
$$\|w_i^k - w_i^*\|^2 \leq \|p^k - p^*\|^2 \leq \|p^1 - p^*\|^2, \quad i = 1, \ldots, n-1. \tag{21}$$

Proof. The update (17) follows from algebraic manipulation of lines 10-20 and consideration of (8) and (9). Inequalities (18) and (19) result from Algorithm 1 being a separator-projector algorithm [15, Lemma 6]. A specific reference proving these results is [10, Proposition 1].
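Lemma 4 says that each iteration of Algorithm 1 is a relaxed projection of $p^k$ onto the halfspace $\{p : \varphi_k(p) \leq 0\}$. The small helper below (not from the paper; it assumes the plain Euclidean inner product for simplicity, whereas Algorithm 1 works in the $\gamma$-scaled inner product (7)) makes that generic projection step explicit.

```python
import numpy as np

def project_halfspace(p, grad, phi_p, beta=1.0):
    # Relaxed projection of p onto {q : phi(q) <= 0} for an affine function phi,
    # given grad = gradient of phi and phi_p = phi(p) > 0.  With beta in (0, 2)
    # this is exactly update (17): p - alpha*grad, alpha = beta*phi(p)/||grad||^2.
    return p - beta * (phi_p / (grad @ grad)) * grad

# Tiny check: phi(q) = <g, q> - 1 and a point p with phi(p) > 0.  With beta = 1
# the projected point lies exactly on the hyperplane {phi = 0}.
g, p = np.array([3.0, 4.0]), np.array([2.0, 2.0])
q = project_halfspace(p, g, g @ p - 1.0)
print(g @ q - 1.0)   # ~0 up to rounding
```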

9 The following lemma places an upper bound on the gradient of the affine function ϕ at each iteration. A similar result was proved in [15, Lemma 11], but since that result is slightly different, we include the full proof here. Lemma 5. Suppose assumptions 1,, and 3 hold and recall the affine function ϕ defined in 8. For all 1 K, ϕ ξ 1 z x i where Proof. Using Lemma 3, ξ 1 = n [ 1 + γ 1 L I F + ρ 1 + δ ] <. ϕ = γ 1 y i + x i x n. 3 Using Lemma 1, we begin by writing the second term on the right of 3 as x i x n x i z + z x n n z x i. 4 We next consider the first term in 3. Rearranging the update equations for Algorithm 1 as given in lines 5 and 8, we may write y i = ρ i 1 z x i + ρ i w i + e i, i IB 5 T i z = ρ i 1 z x i + ρ i w i, i IF. 6 Note that 5 rewrites line 5 of Algorithm 1 using 3. The first term of 3 may then be written as yi = yi + Ti z + yi T i z i I B i I F a yi + T i z + y i T i z i I B i I F i I F b = ρ 1 i z x i + ρ i wi + ρ 1 i e i i I B + T x i T i z i I F c 4 ρ 1 i z x i + ρ i wi + 4 ρ 1 i e i i I B 9

10 + I F i I F Ti x i T i z d 4nρ z x i e ξ 1 + e i + I F L i x i z i IB i I F z x i 7 where ξ 1 = 4n L I F + ρ 1 + δ where recall L = maxi IF L i. In the above, a uses Lemma 1, while b is obtained by substituting 5-6 into the first squared norm and using yi = T i x i for i I F in the second. Next, c uses Lemma 1 once more on both terms. Inequality d uses Lemma 1, the Lipschitz continuity of T i, n w i = 0, and Assumption 3. Finally, e follows by collecting terms and using Assumption. Combining 3, 4, and 7 establishes the Lemma with ξ 1 as defined in. Lemma 6. Suppose that assumptions 1,, and 3 hold. Then for all 1 K ϕ p ξ z x i. where { { } } ξ = min 1 σρ 1, min ρ 1 j L j > 0. 8 j I F Furthermore, for all such, ϕ p + i I F L i z x i 1 σρ i I B y i w i + ρ i I F T i z w i. 9 Proof. The claimed results are special cases of those given in lemmas 1-13 of [15]. 5 New Lemmas Needed to Derive Convergence Rates Lemma 7. Suppose assumptions 1,, and 3 hold, and recall α computed on line 15 of Algorithm 1. For all 1 < K, it holds that α α βξ /ξ 1 > 0, where ξ 1 and ξ are as defined in and 8. Proof. By Lemma 4, α defined on line 15 of Algorithm 1 may be expressed as α = β ϕ z, w ϕ

11 By Lemma 5, ϕ ξ 1 n z x i, where ξ 1 is defined in. Furthermore, Lemma 6 implies that ϕ z, w ξ n z x i, where ξ is defined in 8. Combining these two inequalities with 30 and β β yields α βξ 1 /ξ = α. Using and 8, ξ 1 and ξ are positive and finite by assumptions and 3. Since β > β > 0, we conclude that α > 0. The next lemma is the ey to proving O1/ function value convergence rate and linear convergence rate under strong mononotonicity and cocoercivity. Essentially, it shows that several ey quanitities of Algorithm 1 are square-summable, within nown bounds. Lemma 8. Suppose assumptions 1,, and 3 hold. If K =, then ϕ p 0 and ϕ 0. Furthermore, for all 1 < K, where z t+1 z t γ 1 τ p 1 p, 31 w t+1 i wi t τ p 1 p, 3 z x i ξ 1 β p +1 p, ξ wi yi E 1 p +1 p, z t x t i τξ 1 β p 1 p, 33 ξ wi t yi t τe 1 p 1 p, 34 i I F w t i T i z t τe 1 p 1 p, 35 E 1 = 1 σ 1 ρ ξ 1 L1 + ρ L ξ 1 β, 36 ξ τ is as defined in 19, and ξ 1 and ξ are as defined in and 8. Proof. Fix any 1 < K. First, from 19 in Lemma 4, we have pt+1 p t τ p 1 p. Since p +1 p = γ z +1 z + n 1 w+1 i wi, inequalities 31 and 3 follow immediately. Next, Lemma 4 also implies that p +1 p = β ϕ p / ϕ ϕ, and therefore that As argued in Lemma 7, lemmas 5 and 6 imply that p +1 p = β ϕ p ϕ. 37 ϕ p ϕ ξ ξ 1 = ϕ ξ 1 ξ ϕ p

12 Since ϕ p 0 by Lemma 3, substituting 38 into 37 and using the lower bound on β yields which in turn leads to Therefore, 1 ϕ p = β 1 ϕ p +1 p β 1 ξ1 ϕ p ξ p +1 p, ϕ p β 1 ξ1 ξ 1 p +1 p = ϕ p ξ 1 ξ β p+1 p. 39 ϕ t p t ξ 1 ξ β p t+1 p t τξ 1 β ξ p 1 p. 40 If K =, 40 implies that ϕ p 0, which in conjunction with 38 implies that ϕ 0. Combining 39 with Lemma 6 yields the first part of 33. Applying 19 from Lemma 4 then yields the second part of 33. Finally, we establish 34 and 35. Inequality 9 from Lemma 6 implies that 1 σρ i I B w i y i + ρ i I F w i T i z ϕ p + L i I F x i z 1 + ξ 1 L ξ 1 β p +1 p 41 ξ where in the second inequality we have used 39 and 33. Together with 19 from Lemma 4, 41 implies 35. Furthermore, for any i I F, we may write w i y i = w i T i z +T i z y i, from which Lemma 1 can be used to obtain w i T i z 1 w i y i T i z y i. 4 Substituting 4 into the left hand side of 41 yields 1 σρ i I B w i y i + 1 ρ i I F w i y i 1 + ξ 1 L ξ 1 β p +1 p + ρ T i z T i x i ξ i I F 1 + ξ 1 L ξ 1 β p +1 p + ρ L z x i ξ i I F 1 + ξ 1 L1 + ρ L ξ 1 β p +1 p ξ where first inequality follows by substituting y = T ix i for i I F, the second inequality uses the Lipschitz continuity of T i for i I F, and the final inequality uses 33. Because 1

13 0 σ < 1, we may replace the coefficients in front of the two left-hand sums by the 1 σ/, which can only be smaller, and obtain wi yi 1 σ 1 ρ ξ 1 L1 + ρ L ξ 1 p +1 p 43 ξ which yields the first part of 34. The second part follows by applying 19 to 43. The final technical lemma shows that the sequences {x i } and {y i } are bounded for i = 1,..., n, and computes specific bounds on their norms. Lemma 9. Suppose assumptions 1,, and 3 hold. The sequences {x i } and {yi } are bounded for i = 1,..., n. In particular, for all i = 1,..., n, and 1 K, { } x i ξ1 min + γ 1 p 1 p + γ 1 p B p S ξ x 44 { yi min n + E1 p 1 p + n p } By. 45 p S Proof. Assumption 1 asserts S is nonempty, so let p S and fix any 1 < K. From Lemma 1, p = p p + p p p + p p 1 p + p, 46 where the final inequality uses 18. It immediately follows that z γ 1 p γ 1 p 1 p + γ 1 p 47 wi p p 1 p + p. 48 Furthermore, using the definition of wn and Lemma 1, we also have n 1 wn = wi n 1 w i n p 1 p + n p. 49 Therefore, for all i = 1,..., n, we have x i z x i + z ξ 1 β ξ where second inequality uses 47 and 33. Finally, for i = 1,..., n + γ 1 p 1 p + γ 1 p, 50 y i y i w i + w i n + E 1 p 1 p + n p, 51 where the second inequality uses 34 and Note that the factor of n is only necessary when i = n, but for simplicity we will just a single bound for all i. Finally, the set S is closed and convex from Proposition 1, and the expressions 50 and 51 are continuous functions of p. Thus, we may minimize these bounds over S, yielding

6 Function Value Convergence Rate

6.1 Assumptions

We now consider the optimization problem given in (2), which we repeat here:

$$F^* = \min_{z \in H} F(z) = \min_{z \in H} \sum_{i=1}^{n} f_i(z). \tag{52}$$

Assumption 4. Problem (52) conforms to the following:

1. $H$ is a real Hilbert space.

2. Each $f_i : H \to (-\infty, +\infty]$ is convex, closed, and proper.

3. There exists some subset $I_F \subseteq \{1, \ldots, n\}$ such that for all $i \in I_F$, $f_i$ is Fréchet differentiable everywhere and $\nabla f_i$ is $L_i$-Lipschitz continuous.

4. For $i \in I_B \triangleq \{1, \ldots, n\} \setminus I_F$, $\operatorname{prox}_{\rho f_i}$ can be computed to within the error tolerance specified in Assumption 2 (however, these functions are not precluded from also having Lipschitz continuous gradients).

5. Letting $T_i = \partial f_i$ for $i = 1, \ldots, n$, the solution set $\mathcal{S}$ in (6) is nonempty.

When $|I_B| \geq 1$, it is possible for (52) to have a finite solution, yet for $\mathcal{S}$ to be empty. For constraint-qualification-like conditions precluding such pathological situations, see for example [2, Corollary 16.50].

Lemma 10. If Assumption 4 holds, then Assumption 1 holds for (1) with $T_i = \partial f_i$ for $i = 1, \ldots, n$. If $(z, \mathbf{w}) \in \mathcal{S}$, then $z$ is a solution to (52). The solution value $F^*$ is finite.

Proof. Closed, convex, and proper functions have maximal monotone subdifferentials [2, Theorem 21.2]. That $z$ is a solution to (52) for any $(z, \mathbf{w}) \in \mathcal{S}$ is a consequence of Fermat's rule [2, Proposition 27.1] and the elementary fact that $\sum_{i=1}^n \partial f_i(x) \subseteq \partial\big(\sum_{i=1}^n f_i\big)(x)$. Since $\mathcal{S}$ is nonempty, a solution to (52) exists. For any $(z, \mathbf{w}) \in \mathcal{S}$, the function $f_i$ must be subdifferentiable and hence finite at $z$ for all $i = 1, \ldots, n$, so $F^* = \sum_{i=1}^n f_i(z)$ must be finite.

In the following analysis, we consider two types of convergence rates. First we will establish a rate of the form

$$\sum_{i=1}^{n} f_i(\overline{x}_i^k) - F^* = O(1/k) \quad \text{and} \quad \|\overline{x}_i^k - \overline{x}_j^k\| = O(1/k) \quad \forall\, i, j = 1, \ldots, n,$$

where $\overline{x}_i^k$ is an appropriate averaging of the sequence $\{x_i^k\}$. With an additional Lipschitz continuity assumption, we can derive a more direct rate of the form

$$\sum_{i=1}^{n} f_i(\overline{x}_j^k) - F^* = O(1/k) \tag{53}$$

for an appropriate index $j$. This additional assumption is as follows:

15 Assumption 5. If I B > 1, there exists I L I B such that I L I B 1, and for all i I L the function f i is M i -Lipschitz continuous on the ball: {x : x B x } where B x is defined in 44. By virtue of Assumption 5, note that for all i I L, {x : x B x } domf i. We point out that it is not surprising that Assumption 5 is required to derive convergence rates of the form 53: suppose the first two functions f 1 and f are the respective indicator functions of closed nonempty convex sets C 1 and C, that is, f i x = 0 if x C i and otherwise f i x = +. Since neither function is differentiable, {1, } I B. Since neither function is Lipschitz continuous, {1, } / I L, unless B x C 1 C. This last situation is of little interest, since the iterates remain inside C 1 C for all iterations and thus the constraints encoded by f 1 and f may be disregarded. Instead, suppose B x C 1 C and thus {1, } / I L. Then I L I B, which violates Assumption 5. Suppose anyway that we could establish a convergence rate of O1/ or similar for some sequence of points x generated by the algorithm. But this would imply that x C 1 C for all, meaning that we would have to be able to solve an arbitrary two-set convex feasibility problem in a single iteration. Of course, it is easy to construct a counterexample in which Algorithm 1 does not find a point in the intersection of two closed convex sets within a finite number of iterations. So, to derive a convergence rate of the form 53 we allow for only one function to be non-lipschitz and thus able to encode a hard constraint. Any other nonsmooth functions present in the problem must be Lipschitz continuous on the bounded set B x which contains all points potentially encountered by the algorithm. 6. Main Result Theorem 1. Suppose assumptions, 3, and 4 hold. For 1 K, let i = 1,..., n x i = α tx t i α, t where α t is calculated on line 15 of Algorithm 1. Fixing any p S, one has 1. If K <, j = 1,..., n f i x K j = f i z K+1 = F, 54. For all 1 < K f i x i F E p 1 p + E 3 p 1 p where E = ξ 1 βξ E 1 τ + ρ n τ 15 + γe 1 + γδξ 1 β ξ 55 56

16 E 3 = n p ξ 1 βξ. 57 Furthermore, i, l = 1,..., n x i x l 4 p1 p α = 4ξ 1 p 1 p. 58 βξ 3. Additionally suppose Assumption 5 holds. If I B = 1, let j be the unique element in I B. If I B > 1, let j be the unique element of I B \I L. If I B = 0 then choose j to be any index in I F. Then, for all 1 < K, f i x j F E p 1 p + E 4 p 1 p 59 where E 4 = E 3 + 4ξ 1 βξ i I L M i + nb y 1 + LB x. 60 Proof. In the case that K <, it was established in [15, Lemma 5] that x K 1 = = x K n = z K+1 is a solution to 5, which establishes 54. We now address points and 3. Part 1: An upper bound for n f i x i F We begin by establishing the upper bound on n f i x i F in 55. Since f i is convex, f i x i F = f α tx t i i α F α t n f ix t i F t α. 61 t Since Lemma 7 implies that α α, we will aim to show that α t n f ix t i F is bounded for all 1 < K recall that K may be +. The projection updates on lines of Algorithm 1 mean that, for all 1 < K, z +1 = z γ 1 α yi 6 w +1 i = w i α x i x n, i = 1,..., n Tae any p = x, w S. By Lemma 10, F = n f ix. Then f i x i F = a f i x i f i x yi, x i x 16

17 = yi, x n x + yi, x i x n b = γ z z +1, x n x + yi, x i x α n. 64 } {{} }{{} A 1 A In the above, a uses that yi f i x i and b uses 6. We now show that α A 1 and α A both have a finite sum over. α A 1 is Summable If n I B, A 1 can be simplified as follows: A 1 a = γ z z +1, z x + ρ α nwn yn + e n = γ α z z +1, z x + γρ n b c + γρ n α z z +1, e n α γ α z z +1, z x + γρ n α z z +1, wn yn z z +1, wn yn + γρ n z z +1 + e α n γ z z +1, z x + γρ n z z +1, wn yn α α 65 + γρ n α z z +1 + δ z x n 66 where a uses the proximal update on line 5 of the algorithm for the case i = n, b used Young s inequality, and c uses assumptions and 3. On the other hand, if n I F, A 1 may be written as A 1 = γ α z z +1, z x + γρ n α z z +1, w n T n z, 67 where we have instead used the forward step on line 8. Let χ = w n y n when n I B and χ = w n T n z when n I F. Furthermore, let θ = γρ n α z z +1 + δ z x n 68 when n I B and θ = 0 if n I F. Combining 66 and 67, we may write A 1 = γ α z z +1, z x + γρ n α z z +1, χ + θ

18 Rewriting the first term in 69 using 5, we obtain γ z z +1, z x = γ z z +1 + γ z x z +1 x 70 α α α The second term in 69 may be upper bounded using Young s inequality, as follows: Thus, we obtain γρ n α z z +1, χ γρ n α z z +1 + γρ n α χ α A 1 γ 1 + ρ n z z +1 + γ z x γ z+1 x + γρ n χ + α θ. 71 Summing 71 over yields, for 1 < K, α t A t 1 γ z1 x + γ 1 + ρ n z t z t+1 + γρ n χ t + α t θ t. 7 We now consider the first three terms on the right-hand side of this relation, employing Lemma 8: γ 1 + ρ n γ z1 x 1 p1 p by 7 z t z t ρ nτ p 1 p by 31 γρ n χ t γρ n τe 1 p 1 p by 34 or 35. In the last inequality, we use 34 when n I B and 35 when n I F, and E 1 is defined in 36. We now consider the last term in 7. When n I B, we have α t θ t = γρ n z z +1 + δ z x n by 68 γρ n γ 1 τ p 1 p + δ τξ 1 β p 1 p by 31 and 33 ξ = τρ n 1 + γδξ 1 β ξ p 1 p. The resulting inequality also holds trivially when n I F, since its left-hand side must be zero. Combining all these inequalities, we obtain α t A t ρ n τ + γρ n τe 1 + τρ n 1 + γδξ 1 β p 1 p ξ 18

19 = τ + ρ n τ + γe 1 + γδξ 1 β ξ p 1 p. 73 α A is Summable We next perform a similar summability analysis on the second term in 64, A. We begin by fixing any 1 < K and writing α A = α yi, x i x n = α wi, x i x n + α yi wi, x i x n. 74 Now fix any i = 1,..., n 1. Using 63, x i x n = 1 α w +1 i w i. 75 Substituting this into the first summand in 74 and then using 5, we have α w i, x i x n = w i, w i w +1 i = wi wi, wi w +1 i + wi, wi w +1 i = 1 w i wi + wi w +1 i w +1 i wi + wi, wi w +1 i. 76 Using 75 in the second summand in 74 and applying Young s inequality yields α y i w i, x i x n = y i w i, w i w +1 i 1 y i w i + 1 w i w +1 i. 77 Substituting 76 and 77 into 74 yields α A 1 w i wi + wi w +1 i w +1 i wi + yi wi n 1 + wi, wi w +1 i. Summing this relation, we obtain that for any 1 < K, α t A t 1 wi 1 wi + wi t w t+1 i + n 1 + yi t wi t wi, wi 1 w +1 i 78 1 wi 1 wi + 19 w t i w t+1 i + yi t wi t

20 + wi wi 1 wi + w +1 i wi. 79 By 3 and 34 from Lemma 8, we then obtain wi t w t+1 i + yi t wi t τ1 + E 1 p 1 p. Finally, we may use 1 to derive wi wi 1 wi + w +1 i wi p 1 p wi n p 1 p w n p 1 p p, where in the second inequality we have used Lemma 1. Therefore, α t A t τ1 + E 1 p 1 p + n p p 1 p. 80 Recalling 61 and using 64 and the fact that α t α by Lemma 7, we obtain, for any 1 < K, f i x i F α t n f ix t i F α t α ta t 1 + A t α t α ta t 1 + A t α = ξ 1 α ta t 1 + A t. 81 βξ Using 73 and 80 in 81 establishes 55 with E and E 3 defined in 56 and 57. Part : Convergence behavior of x i x l Recall from 63 that for all i = 1,..., n 1, w t+1 i w t i = α t x t i x t n. Summing this equation over t = 1,..., and dividing by α t yields i = 1,..., n 1 x n x i = w1 i w +1 i α. 8 t Therefore, we obtain for all i, l {1,..., n 1} that x l x i = w1 i wl 1 w +1 i w +1 α t 0 l

21 = w1 i wi wl 1 + wl w +1 i wi w +1 l + wl α. t Using 1 and the triangle inequality, we therefore have for all i, l {1,..., n 1} that x l x i w1 l wl + w1 i wi + w +1 l α t w l + w+1 i w i 4 p1 p, 83 α where the second inequality uses lemmas 4 and 7. Finally, we can also use 8 to obtain, for any i = 1,..., n 1, that x n x w 1 i = i wi wi wi α t w1 i w i + w i w i α t p1 p, 84 α where the second inequality again uses lemmas 4 and 7. Together, and the definition of α imply 58. Part 3: Convergence rates at a single splitting point We now prove 59 under Assumption 5. We start by establishing that x j is within the domain of each function f i for all 1 < K: fix any 1 < K and first suppose that I B > 0, which that implies either j I B \I L or j is the unique element of I B. Since rangeprox ρfj = domf j, we have that x j domf j for all 1 < K. Since the domain of a convex function is a convex set [, Proposition 8.], this implies that x j domf j. The other possibility is that I B = 0, which means that j I F and therefore dom f j = H, implying that domf j = H, and thus trivially that x j domf j. Finally recall that for i j, i I L I F, therefore either domf i = H or {x : x B x } domf i. By Lemma 9, x j {x : x B x } for all j = 1,..., n and 1 < K. Since the ball B x convex, we must also have x j domf i in this case. Continuing, we write f i x j F = f i x i F + f i x j f i x i. 85 i j The first summation on the right-hand side of this equation is O1/ by 55, which has already been established. So we now focus on the terms in second summation. If i I B and i j, we must have by Assumption 5 that i I L. Therefore, f i x j f i x i M i x i x j 4M i p 1 p, 86 α where the the second inequality arises from 58. Otherwise, we have i I F and i j, and since dom f i = H, we may use the subgradient inequality to write f i x j f i x i f i x j, x j x i = f i x i, x j x i + f i x j f i x i, x j x i 1

22 y i, x j x i + L i x j x i x j x i yi x j x i + L i x j + x i x j x i = yi + L i x j + x i x j x i, 87 where the second inequality uses that yi = f i x i for i I F, the Cauchy-Schwarz inequality, and the Lipschitz continuity of f i. Lemma 9 assures us that x i, x j B x and yi B y, which in conjunction with 58 and L i L leads to f i x j f i x i 4B y + LB x p 1 p. 88 α Substituting 55, 86, and 88 into 85, we obtain f i x j F = f i x i F + fi x j f i x i i j E p 1 p + E 3 p 1 p + i I L 4M i p 1 p α E p 1 p + E 3 p 1 p This establishes 59 with E 4 defined in p1 p α 7 Consequences of Strong Monotonicity + 4 I F B y 1 + LB x p 1 p α M i + nb y + LB x. i I L We now investigate the consequences of strong monotonicity in 1 for the convergence rate of projective splitting. Assumption 6. For 1, there is some l {1,..., n} for which T l is strongly monotone. Note that under Assumption 6 it is immediate that the solution of 1 is unique. Theorem. Suppose assumptions 1,, 3, and 6 hold. Let z be the unique solution to 1. Fix p = z, w S. If K = + then x i z for i = 1,..., n and z z. If K <, x K i = z K+1 = z for i = 1,..., n. Furthermore for all 1 < K x avg z ξ 1 p 1 p βξ µ where x avg = 1 t xi l and ξ 1 and ξ are defined in and 8 respectively. Proof. Regarding the case K <, optimality of x K i and z K+1 was shown in [15, Lemma 5]. Let wn = n 1 w i. Using this notation and recalling the affine function defined in 8, we may write ϕ z, w = z x i, wi yi 89

23 = z x i, wi yi + z x l, wl yl i l µ z x l, 90 where the inequality uses that x i, y i and z, w i are in gra T i, along with the monotonicity of T i for i l, and also the strong monotonicity of T l. Now, Lemma 6 implies that ϕ z, w 0. Therefore, for any 1 < K, µ z x l ϕ z, w ϕ z, w a = ϕ, p p b = 1 α p p +1, p p c = d 1 α 1 p p +1 + p p p +1 p α p p +1 + p p p +1 p. Above, a uses that ϕ is affine, b employs 17 from Lemma 4, c uses 5, and d uses Lemma 7 and 18 from Lemma 4. Summing the resulting inequality yields, for all 1 < K, µ z x t l 1 p t p t+1 + p 1 p ξ 1 p 1 p, 91 α βξ where the second inequality uses Lemma 4 and 19 from Lemma 7. So, when K =, we have x l z. Lemma 8 asserts that ϕ 0 in the K = case, so from 9 we conclude that x i z for i = 1,..., n. Lemma 8 also establishes that ϕ p 0, so by Lemma 6, we have z x i 0 for all i = 1,..., n and hence z z. Finally, the convexity of the function in conjunction with 91 yields Linear Convergence Under Strong Monotonicity and Cocoercivity If we introduce a cocoercivity assumption along with strong monotonicity, we can derive linear convergence. We require that all but one of the operators be cocoercive. Without loss of generality, we designate T n to be be the operator that need not be cocoercive. Assumption 7. In 1, suppose that for i = 1,..., n 1, the operator T i is cocoercive with parameter Γ i > 0. Let Γ = max,...,n 1 Γ i. If this assumption holds, then T 1,..., T n 1 are all single-valued. If Assumption 6 also holds, this single-valuedness implies that not only is the solution z to 1 unique, but that the extended solution set S must be singleton of the form { z, w1,..., wn 1 } { = z, T 1 z,..., T n 1 z }. 3

24 Theorem 3. Suppose assumptions 1,, 3, 6, and 7 hold. Let z be the unique solution to 1 and tae p = z, w = z, T 1 z,..., T n 1 z S. If K <, p K+1 = p. For all 1 < K, p +1 p 1 E 5 p p 9 where E 5 = 1 8ξ11 + γ max{µ 1, Γ} 1 + γξ 1 β + E ξ 1 ]0, 1/4]. 93 Proof. For the finite-termination case, optimality of p K+1 was established in [15, Lemma 5]. Henceforth, we thus assume that K =. The ey idea of the proof is to show that for E 5 defined in 93, which, when used with 18 of Lemma 4, implies E 5 p p p +1 p, 94 p +1 p p p p +1 p 1 E 5 p p. We now establish 94. For 1 < K, we have ϕ p 0 and hence ϕ p ϕ p = 1 z x i, wi yi z x i, wi yi + 1 µ z x l + z x i, wi yi 1 Γ i y i w i. The last inequality here follows because n z x i, w i y i µ z x l by the same logic as in 90, and by the cocoerciveness of T 1,..., T n 1. We therefore obtain µ z x l 1 + yi wi ϕ p ϕ p = ϕ, p p Γ i = 1 α p p +1, p p, 95 where we have again used that ϕ is affine, and also 17. Continuing, we use Young s inequality applied to 95 to write, for any ν > 0, µ z x l 1 + yi wi Γ i 4 1 να p+1 p + ν p p,

25 which implies z x l + yi wi ξ 5 να p+1 p + ξ 5 ν p p, 96 where ξ 5 = max{µ 1, Γ}. From Lemma 1, we then deduce that p p = γ z z + wi wi γ z x l + γ x l z + yi wi + yi wi. n γ z x l + yi wi + γ x l z + yi wi. Substituting the upper bound 96 for the term in parentheses in 97, and then using 33 and 34 from Lemma 8 on the other two terms, we conclude that p p ξ5 1 + γ να p+1 p + ξ 5 ν p p + γξ 1 β p +1 p ξ + E 1 p +1 p. Rearranging this inequality yields We now set 1 ν1 + γξ 5 p p ν = γξ 5 + γξ 1 να β + E ξ 1 p +1 p γξ 5. Using this value of ν in 98 implies 94 with E 5 = γ ξ 5 + γξ 1 α β ξ 1 + E 1, 99 which reduces to the expression in 93. Considering 93, we note that since ξ > 0, β > 0, µ > 0, and E 1 <, we must have E 5 > 0. Finally, we show that E 5 1/4 as claimed in 93 in particular, this precludes the nonsensical situation that E 5 > 1. We write E 5 = 1 8ξ11 + γ max{µ 1, Γ} + γξ 1 β ξ 1 + E

26 a ξ γξ 1 = min { 1 σρ 1, min j IF { ρ 1 j L j }} nγ [ 1 + γ 1 L I F + ρ 1 + δ ] min { { }} 1 σρ 1, min j IF ρ 1 j L j 4nρ 1 + δ b c 1 σρ 1 + j I F ρ 1 j L j 4nρ 1 + δ I F + 1 ρ + j I F ρ j 4nρ 1 + δ I F + 1 d 1 4n1 + δ 1 4, where we employ the following reasoning: first, a uses that all terms within the parentheses in 100 are positive and that β <. In b, we use that the minimum of a set of numbers cannot exceed its average. Inequality c follows by observing that 1 σ 1 and L j 0, and then using Lemma 1. Finally, d uses that ρ ρ j and ρ ρ. 9 Simplifications when n = 1 Suppose n = 1, in which case we have either 1 I B or 1 I F. In either case, using the convention discussed at the beginning of Section.1, we have w 1 = 0 for all and the affine function defined in 8 becomes ϕ p = ϕ z = z x, y, where we have written x 1 = x and y 1 = y. If 1 I B, using w 1 = 0, in the update on lines 4-6 of the algorithm yields x + ρ y = z and y T x, where we have written T 1 = T and ρ 1 = ρ. This implies that ϕ z = ρ y. Furthermore, z ϕ = γ 1 y and so z ϕ = γ γ y = γ 1 y. Therefore, using 17, we have z +1 = z β ϕ p ϕ zϕ = z β ρ y = 1 β z + β x = 1 β z + β prox ρ z. Thus, when n = 1 and 1 I B, projective splitting reduces to the relaxed proximal point method of [9]; see also [, Theorem 3.41]. In fact, when one allows for approximate evaluation of resolvents, projective splitting with n = 1 I B reduces to the hybrid projection proximal-point method of Solodov and Svaiter [3, Algorithm 1.1]. However, the error criterion of [3, Eq. 1.1] is more restrictive than the conditions 11 1 that we propose in Assumption. 6

On the other hand, if $1 \in I_F$, considering lines 8-9 of the algorithm with $w_1^k = 0$ yields

$$x^k = z^k - \rho^k T z^k \quad \text{and} \quad y^k = T x^k.$$

Furthermore, we have

$$\varphi_k(z^k) = \rho^k \langle T z^k, T x^k \rangle \quad \text{and} \quad \nabla_z \varphi_k = \gamma^{-1} T x^k.$$

Therefore,

$$z^{k+1} = z^k - \beta_k \frac{\varphi_k(p^k)}{\|\nabla \varphi_k\|^2} \nabla_z \varphi_k = z^k - \beta_k \rho^k \, \frac{\langle T z^k, T x^k \rangle}{\|T x^k\|^2} \, T x^k.$$

Thus, the method reduces to

$$x^k = z^k - \rho^k T z^k, \qquad z^{k+1} = z^k - \hat{\rho}^k T x^k, \qquad \text{where } \hat{\rho}^k = \beta_k \rho^k \frac{\langle T z^k, T x^k \rangle}{\|T x^k\|^2}. \tag{101}$$

This is the unconstrained version of the extragradient method [16]. When $\beta_k = 1$, the stepsize (101) corresponds to the extragradient stepsize proposed by Iusem and Svaiter in [14]. Furthermore, the backtracking linesearch for the extragradient method also proposed in [14] is almost equivalent, in the unconstrained case, to the linesearch we proposed for processing individual operators within projective splitting in [15], except that Iusem and Svaiter use a more restrictive termination condition, perhaps necessary because they also account for the constrained case of the extragradient method.

While these observations may be of interest in their own right, they also have implications for the convergence rate analysis. In particular, that projective splitting reduces to the proximal point method suggests that the $O(1/k)$ ergodic convergence rate for (52) derived in Section 6 cannot be improved beyond a constant factor. This is because the same rate is unimprovable for the proximal point method under the assumption that the stepsize is bounded from above and below [12].

References

[1] Alotaibi, A., Combettes, P.L., Shahzad, N.: Solving coupled composite monotone inclusions by successive Fejér approximations of their Kuhn-Tucker set. SIAM Journal on Optimization 24(4) (2014)

[2] Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer (2017)

[3] Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3(1) (2011)

[4] Combettes, P.L., Eckstein, J.: Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions. Mathematical Programming 168(1-2) (2018)

[5] Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer (2011)

[6] Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Modeling & Simulation 4(4) (2005)

[7] Davis, D., Yin, W.: A three-operator splitting scheme and its optimization applications. Set-Valued and Variational Analysis 25(4) (2017)

[8] Eckstein, J.: A simplified form of block-iterative operator splitting and an asynchronous algorithm resembling the multi-block alternating direction method of multipliers. Journal of Optimization Theory and Applications 173(1) (2017)

[9] Eckstein, J., Bertsekas, D.P.: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming 55(3) (1992)

[10] Eckstein, J., Svaiter, B.F.: A family of projective splitting methods for the sum of two maximal monotone operators. Mathematical Programming 111(1) (2008)

[11] Eckstein, J., Svaiter, B.F.: General projective splitting methods for sums of maximal monotone operators. SIAM Journal on Control and Optimization 48(2) (2009)

[12] Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM Journal on Control and Optimization 29(2) (1991)

[13] Haugazeau, Y.: Sur les inéquations variationnelles et la minimisation de fonctionnelles convexes. Ph.D. thesis, Université de Paris, Paris (1968)

[14] Iusem, A., Svaiter, B.: A variant of Korpelevich's method for variational inequalities with a new search strategy. Optimization 42(4) (1997)

[15] Johnstone, P.R., Eckstein, J.: Projective splitting with forward steps: asynchronous and block-iterative operator splitting. arXiv preprint (2018)

[16] Korpelevich, G.: Extragradient method for finding saddle points and other problems. Matekon 13(4) (1977)

[17] Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis 16(6) (1979)

[18] Machado, M.P.: Projective method of multipliers for linearly constrained convex minimization. arXiv preprint

[19] Machado, M.P.: On the complexity of the projective splitting and Spingarn's methods for the sum of two maximal monotone operators. arXiv preprint

[20] Mercier, B., Vijayasundaram, G.: Lectures on topics in finite element solution of elliptic problems. Tata Institute of Fundamental Research, Bombay (1979)

[21] Nguyen, T.P., Pauwels, E., Richard, E., Suter, B.W.: Extragradient method in optimization: convergence and complexity. Journal of Optimization Theory and Applications 176(1) (2018)

[22] Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization 14(5) (1976)

[23] Solodov, M.V., Svaiter, B.F.: A hybrid projection-proximal point algorithm. Journal of Convex Analysis 6(1) (1999)

[24] Tseng, P.: A modified forward-backward splitting method for maximal monotone mappings. SIAM Journal on Control and Optimization 38(2) (2000)


More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

PARALLEL SUBGRADIENT METHOD FOR NONSMOOTH CONVEX OPTIMIZATION WITH A SIMPLE CONSTRAINT

PARALLEL SUBGRADIENT METHOD FOR NONSMOOTH CONVEX OPTIMIZATION WITH A SIMPLE CONSTRAINT Linear and Nonlinear Analysis Volume 1, Number 1, 2015, 1 PARALLEL SUBGRADIENT METHOD FOR NONSMOOTH CONVEX OPTIMIZATION WITH A SIMPLE CONSTRAINT KAZUHIRO HISHINUMA AND HIDEAKI IIDUKA Abstract. In this

More information

A double projection method for solving variational inequalities without monotonicity

A double projection method for solving variational inequalities without monotonicity A double projection method for solving variational inequalities without monotonicity Minglu Ye Yiran He Accepted by Computational Optimization and Applications, DOI: 10.1007/s10589-014-9659-7,Apr 05, 2014

More information

Strong Convergence Theorem by a Hybrid Extragradient-like Approximation Method for Variational Inequalities and Fixed Point Problems

Strong Convergence Theorem by a Hybrid Extragradient-like Approximation Method for Variational Inequalities and Fixed Point Problems Strong Convergence Theorem by a Hybrid Extragradient-like Approximation Method for Variational Inequalities and Fixed Point Problems Lu-Chuan Ceng 1, Nicolas Hadjisavvas 2 and Ngai-Ching Wong 3 Abstract.

More information

Extensions of Korpelevich s Extragradient Method for the Variational Inequality Problem in Euclidean Space

Extensions of Korpelevich s Extragradient Method for the Variational Inequality Problem in Euclidean Space Extensions of Korpelevich s Extragradient Method for the Variational Inequality Problem in Euclidean Space Yair Censor 1,AvivGibali 2 andsimeonreich 2 1 Department of Mathematics, University of Haifa,

More information

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Charles Byrne (Charles Byrne@uml.edu) http://faculty.uml.edu/cbyrne/cbyrne.html Department of Mathematical Sciences

More information

Optimality Conditions for Nonsmooth Convex Optimization

Optimality Conditions for Nonsmooth Convex Optimization Optimality Conditions for Nonsmooth Convex Optimization Sangkyun Lee Oct 22, 2014 Let us consider a convex function f : R n R, where R is the extended real field, R := R {, + }, which is proper (f never

More information

A Primal-dual Three-operator Splitting Scheme

A Primal-dual Three-operator Splitting Scheme Noname manuscript No. (will be inserted by the editor) A Primal-dual Three-operator Splitting Scheme Ming Yan Received: date / Accepted: date Abstract In this paper, we propose a new primal-dual algorithm

More information

Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods

Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 30 Notation f : H R { } is a closed proper convex function domf := {x R n

More information

Coordinate Update Algorithm Short Course Proximal Operators and Algorithms

Coordinate Update Algorithm Short Course Proximal Operators and Algorithms Coordinate Update Algorithm Short Course Proximal Operators and Algorithms Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 36 Why proximal? Newton s method: for C 2 -smooth, unconstrained problems allow

More information

Second order forward-backward dynamical systems for monotone inclusion problems

Second order forward-backward dynamical systems for monotone inclusion problems Second order forward-backward dynamical systems for monotone inclusion problems Radu Ioan Boţ Ernö Robert Csetnek March 6, 25 Abstract. We begin by considering second order dynamical systems of the from

More information

A WEAK-TO-STRONGCONVERGENCE PRINCIPLE FOR FEJÉR-MONOTONE METHODS IN HILBERT SPACES

A WEAK-TO-STRONGCONVERGENCE PRINCIPLE FOR FEJÉR-MONOTONE METHODS IN HILBERT SPACES MATHEMATICS OF OPERATIONS RESEARCH Vol. 26, No. 2, May 2001, pp. 248 264 Printed in U.S.A. A WEAK-TO-STRONGCONVERGENCE PRINCIPLE FOR FEJÉR-MONOTONE METHODS IN HILBERT SPACES HEINZ H. BAUSCHKE and PATRICK

More information

On proximal-like methods for equilibrium programming

On proximal-like methods for equilibrium programming On proximal-lie methods for equilibrium programming Nils Langenberg Department of Mathematics, University of Trier 54286 Trier, Germany, langenberg@uni-trier.de Abstract In [?] Flam and Antipin discussed

More information

An Infeasible Interior Proximal Method for Convex Programming Problems with Linear Constraints 1

An Infeasible Interior Proximal Method for Convex Programming Problems with Linear Constraints 1 An Infeasible Interior Proximal Method for Convex Programming Problems with Linear Constraints 1 Nobuo Yamashita 2, Christian Kanzow 3, Tomoyui Morimoto 2, and Masao Fuushima 2 2 Department of Applied

More information

ON A HYBRID PROXIMAL POINT ALGORITHM IN BANACH SPACES

ON A HYBRID PROXIMAL POINT ALGORITHM IN BANACH SPACES U.P.B. Sci. Bull., Series A, Vol. 80, Iss. 3, 2018 ISSN 1223-7027 ON A HYBRID PROXIMAL POINT ALGORITHM IN BANACH SPACES Vahid Dadashi 1 In this paper, we introduce a hybrid projection algorithm for a countable

More information

Lagrange duality. The Lagrangian. We consider an optimization program of the form

Lagrange duality. The Lagrangian. We consider an optimization program of the form Lagrange duality Another way to arrive at the KKT conditions, and one which gives us some insight on solving constrained optimization problems, is through the Lagrange dual. The dual is a maximization

More information

A projection-type method for generalized variational inequalities with dual solutions

A projection-type method for generalized variational inequalities with dual solutions Available online at www.isr-publications.com/jnsa J. Nonlinear Sci. Appl., 10 (2017), 4812 4821 Research Article Journal Homepage: www.tjnsa.com - www.isr-publications.com/jnsa A projection-type method

More information

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.18

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.18 XVIII - 1 Contraction Methods for Convex Optimization and Monotone Variational Inequalities No18 Linearized alternating direction method with Gaussian back substitution for separable convex optimization

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

EE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1

EE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1 EE 546, Univ of Washington, Spring 2012 6. Proximal mapping introduction review of conjugate functions proximal mapping Proximal mapping 6 1 Proximal mapping the proximal mapping (prox-operator) of a convex

More information

Pacific Journal of Optimization (Vol. 2, No. 3, September 2006) ABSTRACT

Pacific Journal of Optimization (Vol. 2, No. 3, September 2006) ABSTRACT Pacific Journal of Optimization Vol., No. 3, September 006) PRIMAL ERROR BOUNDS BASED ON THE AUGMENTED LAGRANGIAN AND LAGRANGIAN RELAXATION ALGORITHMS A. F. Izmailov and M. V. Solodov ABSTRACT For a given

More information

Monotone operators and bigger conjugate functions

Monotone operators and bigger conjugate functions Monotone operators and bigger conjugate functions Heinz H. Bauschke, Jonathan M. Borwein, Xianfu Wang, and Liangjin Yao August 12, 2011 Abstract We study a question posed by Stephen Simons in his 2008

More information

The proximal mapping

The proximal mapping The proximal mapping http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/37 1 closed function 2 Conjugate function

More information

Projective Splitting Methods for Pairs of Monotone Operators

Projective Splitting Methods for Pairs of Monotone Operators Projective Splitting Methods for Pairs of Monotone Operators Jonathan Eckstein B. F. Svaiter August 13, 2003 Abstract By embedding the notion of splitting within a general separator projection algorithmic

More information

c 2013 Society for Industrial and Applied Mathematics

c 2013 Society for Industrial and Applied Mathematics SIAM J. OPTIM. Vol. 3, No., pp. 109 115 c 013 Society for Industrial and Applied Mathematics AN ACCELERATED HYBRID PROXIMAL EXTRAGRADIENT METHOD FOR CONVEX OPTIMIZATION AND ITS IMPLICATIONS TO SECOND-ORDER

More information

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods Renato D.C. Monteiro B. F. Svaiter May 10, 011 Abstract This paper presents an accelerated

More information

Convergence Theorems of Approximate Proximal Point Algorithm for Zeroes of Maximal Monotone Operators in Hilbert Spaces 1

Convergence Theorems of Approximate Proximal Point Algorithm for Zeroes of Maximal Monotone Operators in Hilbert Spaces 1 Int. Journal of Math. Analysis, Vol. 1, 2007, no. 4, 175-186 Convergence Theorems of Approximate Proximal Point Algorithm for Zeroes of Maximal Monotone Operators in Hilbert Spaces 1 Haiyun Zhou Institute

More information

Iteration-Complexity of a Newton Proximal Extragradient Method for Monotone Variational Inequalities and Inclusion Problems

Iteration-Complexity of a Newton Proximal Extragradient Method for Monotone Variational Inequalities and Inclusion Problems Iteration-Complexity of a Newton Proximal Extragradient Method for Monotone Variational Inequalities and Inclusion Problems Renato D.C. Monteiro B. F. Svaiter April 14, 2011 (Revised: December 15, 2011)

More information

On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities

On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities Caihua Chen Xiaoling Fu Bingsheng He Xiaoming Yuan January 13, 2015 Abstract. Projection type methods

More information

A hybrid proximal extragradient self-concordant primal barrier method for monotone variational inequalities

A hybrid proximal extragradient self-concordant primal barrier method for monotone variational inequalities A hybrid proximal extragradient self-concordant primal barrier method for monotone variational inequalities Renato D.C. Monteiro Mauricio R. Sicre B. F. Svaiter June 3, 13 Revised: August 8, 14) Abstract

More information

GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization

GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization : Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization Li Shen 1 Wei Liu 1 Ganzhao Yuan Shiqian Ma 3 Abstract In this paper, we propose a fast Gauss-Seidel Operator

More information

On the O(1/t) convergence rate of the projection and contraction methods for variational inequalities with Lipschitz continuous monotone operators

On the O(1/t) convergence rate of the projection and contraction methods for variational inequalities with Lipschitz continuous monotone operators On the O1/t) convergence rate of the projection and contraction methods for variational inequalities with Lipschitz continuous monotone operators Bingsheng He 1 Department of Mathematics, Nanjing University,

More information

Dual and primal-dual methods

Dual and primal-dual methods ELE 538B: Large-Scale Optimization for Data Science Dual and primal-dual methods Yuxin Chen Princeton University, Spring 2018 Outline Dual proximal gradient method Primal-dual proximal gradient method

More information

Convex Optimization. (EE227A: UC Berkeley) Lecture 15. Suvrit Sra. (Gradient methods III) 12 March, 2013

Convex Optimization. (EE227A: UC Berkeley) Lecture 15. Suvrit Sra. (Gradient methods III) 12 March, 2013 Convex Optimization (EE227A: UC Berkeley) Lecture 15 (Gradient methods III) 12 March, 2013 Suvrit Sra Optimal gradient methods 2 / 27 Optimal gradient methods We saw following efficiency estimates for

More information

On the Weak Convergence of the Extragradient Method for Solving Pseudo-Monotone Variational Inequalities

On the Weak Convergence of the Extragradient Method for Solving Pseudo-Monotone Variational Inequalities J Optim Theory Appl 208) 76:399 409 https://doi.org/0.007/s0957-07-24-0 On the Weak Convergence of the Extragradient Method for Solving Pseudo-Monotone Variational Inequalities Phan Tu Vuong Received:

More information

You should be able to...

You should be able to... Lecture Outline Gradient Projection Algorithm Constant Step Length, Varying Step Length, Diminishing Step Length Complexity Issues Gradient Projection With Exploration Projection Solving QPs: active set

More information

Math 273a: Optimization Overview of First-Order Optimization Algorithms

Math 273a: Optimization Overview of First-Order Optimization Algorithms Math 273a: Optimization Overview of First-Order Optimization Algorithms Wotao Yin Department of Mathematics, UCLA online discussions on piazza.com 1 / 9 Typical flow of numerical optimization Optimization

More information

A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES. Fenghui Wang

A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES. Fenghui Wang A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES Fenghui Wang Department of Mathematics, Luoyang Normal University, Luoyang 470, P.R. China E-mail: wfenghui@63.com ABSTRACT.

More information

Solving Dual Problems

Solving Dual Problems Lecture 20 Solving Dual Problems We consider a constrained problem where, in addition to the constraint set X, there are also inequality and linear equality constraints. Specifically the minimization problem

More information

arxiv: v2 [cs.cv] 12 Sep 2014

arxiv: v2 [cs.cv] 12 Sep 2014 Noname manuscript No. (will be inserted by the editor) An inertial forward-bacward algorithm for monotone inclusions D. Lorenz T. Poc arxiv:403.35v [cs.cv Sep 04 the date of receipt and acceptance should

More information

Approaching monotone inclusion problems via second order dynamical systems with linear and anisotropic damping

Approaching monotone inclusion problems via second order dynamical systems with linear and anisotropic damping March 0, 206 3:4 WSPC Proceedings - 9in x 6in secondorderanisotropicdamping206030 page Approaching monotone inclusion problems via second order dynamical systems with linear and anisotropic damping Radu

More information

Distributed Optimization via Alternating Direction Method of Multipliers

Distributed Optimization via Alternating Direction Method of Multipliers Distributed Optimization via Alternating Direction Method of Multipliers Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato Stanford University ITMANET, Stanford, January 2011 Outline precursors dual decomposition

More information

496 B.S. HE, S.L. WANG AND H. YANG where w = x y 0 A ; Q(w) f(x) AT g(y) B T Ax + By b A ; W = X Y R r : (5) Problem (4)-(5) is denoted as MVI

496 B.S. HE, S.L. WANG AND H. YANG where w = x y 0 A ; Q(w) f(x) AT g(y) B T Ax + By b A ; W = X Y R r : (5) Problem (4)-(5) is denoted as MVI Journal of Computational Mathematics, Vol., No.4, 003, 495504. A MODIFIED VARIABLE-PENALTY ALTERNATING DIRECTIONS METHOD FOR MONOTONE VARIATIONAL INEQUALITIES Λ) Bing-sheng He Sheng-li Wang (Department

More information

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE Fixed Point Theory, Volume 6, No. 1, 2005, 59-69 http://www.math.ubbcluj.ro/ nodeacj/sfptcj.htm WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE YASUNORI KIMURA Department

More information

On the acceleration of augmented Lagrangian method for linearly constrained optimization

On the acceleration of augmented Lagrangian method for linearly constrained optimization On the acceleration of augmented Lagrangian method for linearly constrained optimization Bingsheng He and Xiaoming Yuan October, 2 Abstract. The classical augmented Lagrangian method (ALM plays a fundamental

More information

Regularization Inertial Proximal Point Algorithm for Convex Feasibility Problems in Banach Spaces

Regularization Inertial Proximal Point Algorithm for Convex Feasibility Problems in Banach Spaces Int. Journal of Math. Analysis, Vol. 3, 2009, no. 12, 549-561 Regularization Inertial Proximal Point Algorithm for Convex Feasibility Problems in Banach Spaces Nguyen Buong Vietnamse Academy of Science

More information

Iteration-complexity of a Rockafellar s proximal method of multipliers for convex programming based on second-order approximations

Iteration-complexity of a Rockafellar s proximal method of multipliers for convex programming based on second-order approximations Iteration-complexity of a Rockafellar s proximal method of multipliers for convex programming based on second-order approximations M. Marques Alves R.D.C. Monteiro Benar F. Svaiter February 1, 016 Abstract

More information

Proximal methods. S. Villa. October 7, 2014

Proximal methods. S. Villa. October 7, 2014 Proximal methods S. Villa October 7, 2014 1 Review of the basics Often machine learning problems require the solution of minimization problems. For instance, the ERM algorithm requires to solve a problem

More information

Convex Optimization and Modeling

Convex Optimization and Modeling Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

More information

A New Modified Gradient-Projection Algorithm for Solution of Constrained Convex Minimization Problem in Hilbert Spaces

A New Modified Gradient-Projection Algorithm for Solution of Constrained Convex Minimization Problem in Hilbert Spaces A New Modified Gradient-Projection Algorithm for Solution of Constrained Convex Minimization Problem in Hilbert Spaces Cyril Dennis Enyi and Mukiawa Edwin Soh Abstract In this paper, we present a new iterative

More information

arxiv: v1 [math.oc] 1 Jul 2016

arxiv: v1 [math.oc] 1 Jul 2016 Convergence Rate of Frank-Wolfe for Non-Convex Objectives Simon Lacoste-Julien INRIA - SIERRA team ENS, Paris June 8, 016 Abstract arxiv:1607.00345v1 [math.oc] 1 Jul 016 We give a simple proof that the

More information

GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim

GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim Korean J. Math. 25 (2017), No. 4, pp. 469 481 https://doi.org/10.11568/kjm.2017.25.4.469 GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS Jong Kyu Kim, Salahuddin, and Won Hee Lim Abstract. In this

More information

THE CYCLIC DOUGLAS RACHFORD METHOD FOR INCONSISTENT FEASIBILITY PROBLEMS

THE CYCLIC DOUGLAS RACHFORD METHOD FOR INCONSISTENT FEASIBILITY PROBLEMS THE CYCLIC DOUGLAS RACHFORD METHOD FOR INCONSISTENT FEASIBILITY PROBLEMS JONATHAN M. BORWEIN AND MATTHEW K. TAM Abstract. We analyse the behaviour of the newly introduced cyclic Douglas Rachford algorithm

More information

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS Igor V. Konnov Department of Applied Mathematics, Kazan University Kazan 420008, Russia Preprint, March 2002 ISBN 951-42-6687-0 AMS classification:

More information

A generalized forward-backward method for solving split equality quasi inclusion problems in Banach spaces

A generalized forward-backward method for solving split equality quasi inclusion problems in Banach spaces Available online at www.isr-publications.com/jnsa J. Nonlinear Sci. Appl., 10 (2017), 4890 4900 Research Article Journal Homepage: www.tjnsa.com - www.isr-publications.com/jnsa A generalized forward-backward

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

Math 273a: Optimization Subgradients of convex functions

Math 273a: Optimization Subgradients of convex functions Math 273a: Optimization Subgradients of convex functions Made by: Damek Davis Edited by Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com 1 / 42 Subgradients Assumptions

More information

arxiv: v4 [math.oc] 29 Jan 2018

arxiv: v4 [math.oc] 29 Jan 2018 Noname manuscript No. (will be inserted by the editor A new primal-dual algorithm for minimizing the sum of three functions with a linear operator Ming Yan arxiv:1611.09805v4 [math.oc] 29 Jan 2018 Received:

More information

Merit functions and error bounds for generalized variational inequalities

Merit functions and error bounds for generalized variational inequalities J. Math. Anal. Appl. 287 2003) 405 414 www.elsevier.com/locate/jmaa Merit functions and error bounds for generalized variational inequalities M.V. Solodov 1 Instituto de Matemática Pura e Aplicada, Estrada

More information

On the convergence properties of the projected gradient method for convex optimization

On the convergence properties of the projected gradient method for convex optimization Computational and Applied Mathematics Vol. 22, N. 1, pp. 37 52, 2003 Copyright 2003 SBMAC On the convergence properties of the projected gradient method for convex optimization A. N. IUSEM* Instituto de

More information