arxiv: v2 [cs.cv] 12 Sep 2014


An inertial forward-backward algorithm for monotone inclusions

D. Lorenz, T. Pock

D. Lorenz: Institute for Analysis and Algebra, TU Braunschweig, 38092 Braunschweig, Germany, d.lorenz@tu-braunschweig.de
T. Pock: Institute for Computer Graphics and Vision, Graz University of Technology, Inffeldgasse 16, 8010 Graz, Austria, and the Safety & Security Department, AIT Austrian Institute of Technology GmbH, Donau-City-Straße 1, 1220 Vienna, Austria, pock@icg.tugraz.at

Abstract In this paper, we propose an inertial forward-backward splitting algorithm to compute a zero of the sum of two monotone operators, with one of the two operators being co-coercive. The algorithm is inspired by the accelerated gradient method of Nesterov, but can be applied to a much larger class of problems including convex-concave saddle point problems and general monotone inclusions. We prove convergence of the algorithm in a Hilbert space setting and show that several recently proposed first-order methods can be obtained as special cases of the general algorithm. Numerical results show that the proposed algorithm converges faster than existing methods, while keeping the computational cost of each iteration basically unchanged.

Keywords convex optimization, forward-backward splitting, monotone inclusions, primal-dual algorithms, saddle-point problems, image restoration

1 Introduction

A fundamental problem is to find a zero of a maximal monotone operator $T$ in a real Hilbert space $X$:

$$\text{find } x \in X : \; 0 \in T(x). \qquad (1)$$

This problem includes, as special cases, variational inequality problems, non-smooth convex optimization problems and convex-concave saddle-point problems. Therefore this problem finds many important applications in scientific fields such as image processing, computer vision, machine learning and signal processing.
In case $T = \nabla f$ is the gradient of a differentiable convex function $f$, the simplest approach to solve (1) is to apply, for each $k \geq 0$, the recursion

$$x_{k+1} = (\mathrm{Id} - \lambda_k T)(x_k),$$

where the operator $(\mathrm{Id} - \lambda_k T)$ is the so-called forward operator. Note that this scheme is nothing else than the classical method of steepest descent, and $\lambda_k > 0$ is the step size parameter, which has to be chosen according to a rule that guarantees convergence of the algorithm.

In case $T$ is a general monotone operator, the classical algorithm to solve (1) is the proximal point algorithm, which can be traced back to the early works of Minty [3] and Martinet [30]. See also the thesis of Eckstein [0] for a detailed treatment of the subject. The proximal point algorithm generates a sequence $x_k$ according to the recursion

$$x_{k+1} = (\mathrm{Id} + \lambda_k T)^{-1}(x_k), \qquad (2)$$

where $\lambda_k > 0$ is a regularization parameter. The operator $(\mathrm{Id} + \lambda_k T)^{-1}$ is the so-called resolvent operator, which has been introduced by Moreau in [3]. In the context of algorithms, the resolvent operator is often referred to as the backward operator. In the seminal paper [46], Rockafellar has shown that the sequence $x_k$ generated by the proximal point algorithm converges weakly to a point $x^*$ satisfying $0 \in T(x^*)$. Unfortunately, in many interesting cases, the evaluation of the resolvent operator is as difficult as solving the original problem, which limits the practicability of the proximal point algorithm in its plain form. To partly overcome this problem, it is shown in [46] that the algorithm still converges when using inexact evaluations of the resolvent operator. In fact, the evaluation errors have to satisfy a certain

summability condition, which essentially means that the resolvent operators have to be computed with increasing accuracy. This is still somewhat limiting, since in practice the errors of the resolvent operator are often hard to control.

1.1 Splitting methods

In many problems, however, the operator $T$ can be written as the sum of two maximal monotone operators, i.e. $T = A + B$, such that the resolvent operators $(\mathrm{Id} + \lambda A)^{-1}$ and $(\mathrm{Id} + \lambda B)^{-1}$ are much easier to compute than the full resolvent $(\mathrm{Id} + \lambda T)^{-1}$. Then, by combining the resolvents with respect to $A$ and $B$ in a certain way, one might be able to mimic the effect of the full proximal step based on $T$. The two most successful instances that are based on combining forward and backward steps with respect to $A$ and $B$ are the Peaceman-Rachford splitting algorithm [40],

$$x_{k+1} = (\mathrm{Id} + \lambda B)^{-1}(\mathrm{Id} - \lambda A)(\mathrm{Id} + \lambda A)^{-1}(\mathrm{Id} - \lambda B)(x_k),$$

and the Douglas-Rachford splitting algorithm [8],

$$x_{k+1} = (\mathrm{Id} + \lambda B)^{-1}\left[(\mathrm{Id} + \lambda A)^{-1}(\mathrm{Id} - \lambda B) + \lambda B\right](x_k).$$

These splitting techniques were originally proposed in the context of linear operators and therefore cannot be applied directly to general monotone operators. In [9], Lions and Mercier have analyzed and further developed these splitting algorithms. Their idea was to perform a change of variables $x_k = (\mathrm{Id} + \lambda B)^{-1}(v_k)$, such that the Peaceman-Rachford and Douglas-Rachford splitting algorithms have a meaning even for $A$ and $B$ being multivalued operators. Regarding convergence of the algorithms, the Peaceman-Rachford algorithm still needs to assume that $B$ is single-valued, but the Douglas-Rachford algorithm converges even in the general setting, where $A + B$ is just maximal monotone. In [ ], Eckstein has pointed out that the Douglas-Rachford splitting algorithm can be re-written in the form of (2). Hence, it is basically a certain instance of the proximal point algorithm.
Moreover, Eckstein has shown that the application of the Douglas-Rachford algorithm to the dual of a certain structured convex optimization problem coincides with the so-called alternating direction method of multipliers. It is remarkable that the Douglas-Rachford splitting algorithm and its variants have seen a considerable renaissance in modern convex optimization [5,8]. The main reason for the renewed interest lies in the fact that it is well suited for distributed convex programming. This is an important aspect for solving large scale convex optimization problems arising in recent image processing and machine learning applications.

Another important line of splitting methods is given by the so-called forward-backward splitting technique [4, 8, 9, 9]. In contrast to the more complicated splitting techniques discussed above, the forward-backward scheme is based (as the name suggests) on the recursive application of an explicit forward step with respect to $B$, followed by an implicit backward step with respect to $A$. The forward-backward algorithm is written as:

$$x_{k+1} = (\mathrm{Id} + \lambda_k A)^{-1}(\mathrm{Id} - \lambda_k B)(x_k). \qquad (3)$$

In the most general setting, where both $A$ and $B$ are general monotone operators, the convergence result is rather weak [39]; basically, $\lambda_k$ has to fulfill the same step-size restrictions as unconstrained subgradient descent schemes. However, if in addition $B$ is single-valued and Lipschitz, e.g. $B$ is the gradient of a smooth convex function, the situation becomes much more beneficial. In fact, if $B$ is $L$-Lipschitz and $\lambda_k$ is chosen such that $\lambda_k < 2/L$, the forward-backward algorithm (3) converges to a zero of $T = A + B$ [3, 47]. Similar to the Douglas-Rachford splitting algorithm, the forward-backward algorithm has seen a renewed interest. It has been proposed and further improved in the context of sparse signal recovery [7, 5], image processing [45], and machine learning [9] applications.
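As a concrete illustration of iteration (3), the following minimal sketch applies it to an $\ell_1$-regularized least-squares problem, which is an illustrative choice and not taken from the paper. Here $B$ is the gradient of $\frac{1}{2}\|\Phi x - b\|^2$ and the resolvent of $A = \partial(\mu\|\cdot\|_1)$ is soft-thresholding. The names `Phi`, `b`, `mu`, `forward_backward_lasso` and `lasso_objective` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Resolvent of t * subdifferential of the l1 norm (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_lasso(Phi, b, mu, num_iters=300):
    """Iteration (3) for min_x 0.5*||Phi x - b||^2 + mu*||x||_1 (assumed test problem)."""
    lip = np.linalg.norm(Phi, 2) ** 2   # Lipschitz constant L of the gradient
    lam = 1.0 / lip                     # one admissible choice with lam < 2/L
    x = np.zeros(Phi.shape[1])
    for _ in range(num_iters):
        grad = Phi.T @ (Phi @ x - b)                   # explicit forward step on B
        x = soft_threshold(x - lam * grad, lam * mu)   # implicit backward step on A
    return x

def lasso_objective(Phi, b, mu, x):
    return 0.5 * np.sum((Phi @ x - b) ** 2) + mu * np.sum(np.abs(x))

# Small synthetic instance (assumed data, for illustration only).
rng = np.random.default_rng(0)
Phi = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0
b = Phi @ x_true
mu = 0.1
x_hat = forward_backward_lasso(Phi, b, mu)
```

The step size $1/L$ is only one choice inside the admissible range $\lambda_k < 2/L$ stated above.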
1.2 Inertial methods

In [44], Polyak introduced the so-called heavy ball method, a two-step iterative method for minimizing a smooth convex function $f$. The algorithm takes the following form:

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}) \\ x_{k+1} = y_k - \lambda_k \nabla f(x_k), \end{cases}$$

where $\alpha_k \in [0, 1)$ is an extrapolation factor and $\lambda_k$ is again a step-size parameter that has to be chosen sufficiently small. The difference compared to a standard gradient method is that in each iteration, the extrapolated point $y_k$ is used instead of $x_k$. It is remarkable that this minor change greatly improves the performance of the scheme. In fact, its efficiency estimate [44] on strongly convex functions is equivalent to the known lower complexity bounds of first-order methods [35], and hence the heavy ball method resembles an optimal method. The acceleration is explained by the fact that the new iterate is given by taking a step which is a combination of the direction $x_k - x_{k-1}$ and the current anti-gradient direction $-\nabla f(x_k)$. The heavy ball method can also be interpreted as an explicit finite difference discretization of the dynamical system

$$\ddot{x}(t) + \alpha_1 \dot{x}(t) + \alpha_2 \nabla f(x(t)) = 0,$$

where $\alpha_1, \alpha_2 > 0$ are free model parameters of the equation. This equation is used to describe the motion of a heavy body

in a potential field $f$, and hence the system is coined the heavy ball with friction dynamical system. In [ ], Alvarez and Attouch translated the idea of the heavy ball method to the setting of a general maximal monotone operator using the framework of the proximal point algorithm (2). The resulting algorithm is called the inertial proximal point algorithm and it is written as

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}) \\ x_{k+1} = (\mathrm{Id} + \lambda_k T)^{-1}(y_k). \end{cases} \qquad (4)$$

It is shown that under certain conditions on $\alpha_k$ and $\lambda_k$, the algorithm converges weakly to a zero of $T$. In fact, the algorithm converges if $\lambda_k$ is non-decreasing and $\alpha_k \in [0, 1)$ is chosen such that

$$\sum_{k=1}^{\infty} \alpha_k \|x_k - x_{k-1}\|^2 < \infty, \qquad (5)$$

which can be achieved by choosing $\alpha_k$ with respect to a simple on-line rule which ensures summability; in particular, it also holds for $\alpha_k < 1/3$. In subsequent work [33], Moudafi and Oliny introduced an additional single-valued and Lipschitz continuous operator $B$ into the inertial proximal point algorithm:

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}) \\ x_{k+1} = (\mathrm{Id} + \lambda_k A)^{-1}(y_k - \lambda_k B(x_k)). \end{cases} \qquad (6)$$

It turns out that this algorithm converges as long as $\lambda_k < 2/L$, where $L$ is the Lipschitz constant of $B$, and under the same condition (5) that is used to ensure convergence of the inertial proximal point algorithm. Note that for $\alpha_k > 0$, the algorithm does not take the form of a forward-backward splitting algorithm, since $B$ is still evaluated at the point $x_k$. In recent work, Pesquet and Pustelnik proposed a Douglas-Rachford type parallel splitting method for finding the zero of the sum of an arbitrary number of maximal monotone operators. The method also includes inertial forces [4], which numerically speed up the convergence of the algorithm. Related algorithms also including inertial forces have been proposed and investigated in [7,6].

1.3 Optimal methods

In a seminal paper [34], Nesterov proposed a modification of the heavy ball method in order to improve the convergence rate on smooth convex functions.
While the heavy ball method evaluates the gradient in each iteration at the point $x_k$, the idea of Nesterov was to use the extrapolated point $y_k$ also for evaluating the gradient. Additionally, the extrapolation parameter $\alpha_k$ is computed according to a special law that allows one to prove optimal convergence rates of the scheme. The scheme is given by:

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}) \\ x_{k+1} = y_k - \lambda_k \nabla f(y_k), \end{cases} \qquad (7)$$

where $\lambda_k = 1/L$. There are several choices to define an optimal sequence $\{\alpha_k\}$ [34,35,4,48]. In [35], it has been shown that the efficiency estimate of the above scheme is, up to a constant factor, equivalent to the lower complexity bounds of first-order methods for the class of $\mu$-strongly convex functions, $\mu \geq 0$, with $L$-Lipschitz gradient.

In [6], Güler has translated Nesterov's idea to the general setting of the proximal point algorithm, with the restriction that the operator $T$ is the subdifferential of a convex function. Inexact versions of this algorithm have been proposed and studied in [50]. In [4], Beck and Teboulle have proposed the so-called fast iterative shrinkage thresholding algorithm (FISTA), which combines the ideas of Nesterov and Güler within the forward-backward splitting framework. The algorithm features the same optimal convergence rate as Nesterov's method, but it can also be applied in the presence of an additional simple (i.e. with easy to compute proximal map) non-smooth convex function. The FISTA algorithm can be applied to a variety of practical problems arising in sparse signal recovery, image processing and machine learning and hence has become a standard algorithm. Related algorithms with similar properties have been independently proposed by Nesterov in [36,37].

1.4 Content

In this paper we propose a modification of the forward-backward splitting algorithm (3) to solve monotone inclusions. Our method is inspired by the inertial forward-backward splitting method (6), but differs from this method in two regards.
First, the operator $B$ is evaluated at the inertial extrapolate $y_k$, which is inspired by Nesterov's optimal gradient method (7). In addition, we consider a symmetric positive definite map $M$, which can be interpreted as a preconditioner or variable metric and is inspired by recent work on primal-dual algorithms [0,,4,7] and forward-backward splitting [4,,]. These changes allow us to define a new meta-algorithm that includes, as special cases, several convex optimization algorithms that have recently attracted a lot of attention in the imaging, signal processing and machine learning communities.

In Section 2 we will define the proposed algorithm and prove its general convergence in a Hilbert space setting. In Section 3 we will apply the proposed algorithm to a class of convex-concave saddle-point problems and will show how several known algorithms can be recovered from the proposed meta-algorithm. In Section 4, we will apply the proposed algorithm to image processing problems including,

image restoration and image deconvolution. In the last section, we will give some concluding remarks.

2 Proposed algorithm

We consider the problem of finding a point $x^*$ in a Hilbert space $X$ such that

$$0 \in (A + B)(x^*), \qquad (8)$$

where $A, B$ are maximal monotone operators. We additionally assume that the operator $B$ is single-valued and co-coercive with respect to the solution set $S := (A + B)^{-1}(0)$ and a linear, selfadjoint and positive definite map $L$, i.e. for all $x \in X$, $y \in S$

$$\langle B(x) - B(y), x - y \rangle \geq \|B(x) - B(y)\|^2_{L^{-1}}, \qquad (9)$$

where, as usual, we denote $\|x\|^2_{L^{-1}} = \langle L^{-1} x, x \rangle$. Note that in the simplest case, where $L = l\,\mathrm{Id}$, $l > 0$, the operator $B$ is $1/l$ co-coercive and hence $l$-Lipschitz. However, we will see later that in some cases it makes sense to consider more general $L$.

The algorithm we propose in this paper is basically a modification of the forward-backward splitting algorithm (3). The scheme is as follows:

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}) \\ x_{k+1} = (\mathrm{Id} + \lambda_k M^{-1} A)^{-1}(\mathrm{Id} - \lambda_k M^{-1} B)(y_k), \end{cases} \qquad (10)$$

where $\alpha_k \in [0, 1)$ is an extrapolation factor, $\lambda_k$ is a step-size parameter and $M$ is a linear, selfadjoint and positive definite map that can be used as a preconditioner for the algorithm (cf. Section 3.2). Note that the iteration can be equivalently expressed as

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}) \\ x_{k+1} = (M + \lambda_k A)^{-1}(M - \lambda_k B)(y_k). \end{cases} \qquad (11)$$

Observe that (10) (resp. (11)) differs from the inertial forward-backward algorithm of Moudafi and Oliny insofar as we also evaluate the operator $B$ at the inertial extrapolate $y_k$. This allows us to rewrite the algorithm in the form of the standard forward-backward algorithm (3). In the following theorem, we analyze the basic convergence properties of the proposed algorithm.

Theorem 1 Let $X$ be a real Hilbert space and $A, B : X \rightrightarrows X$ be maximally monotone operators. Further assume that $M, L : X \to X$ are linear, bounded, selfadjoint and positive definite maps and that $B$ is single-valued and co-coercive w.r.t. $L^{-1}$ (cf. (9)). Moreover, let $\lambda_k > 0$, $\alpha < 1$, $\alpha_k \in [0, \alpha]$, $x_0 = x_{-1} \in X$ and let the sequences $x_k$ and $y_k$ be defined by (10) (or (11)).
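If

(i) $S_k = M - \frac{\lambda_k}{2} L$ is positive definite for all $k$ and
(ii) $\sum_{k=1}^{\infty} \alpha_k \|x_k - x_{k-1}\|_M^2 < \infty$,

then $x_k$ converges weakly to a solution of the inclusion $0 \in (A + B)(x)$.

Proof Denote by $x^*$ a zero of $A + B$. From (8), it holds that $-B(x^*) \in A(x^*)$. Furthermore, the second line in (11) can be equivalently expressed as

$$M(y_k - x_{k+1}) - \lambda_k B(y_k) \in \lambda_k A(x_{k+1}).$$

For convenience, we define for any symmetric positive definite $M$,

$$\varphi_k^M = \tfrac{1}{2}\|x_k - x^*\|_M^2 = \tfrac{1}{2}\langle M(x_k - x^*), x_k - x^* \rangle,$$
$$\Delta_k^M = \tfrac{1}{2}\|x_k - x_{k-1}\|_M^2 = \tfrac{1}{2}\langle M(x_k - x_{k-1}), x_k - x_{k-1} \rangle,$$
$$\Gamma_k^M = \tfrac{1}{2}\|x_{k+1} - y_k\|_M^2 = \tfrac{1}{2}\langle M(x_{k+1} - y_k), x_{k+1} - y_k \rangle.$$

From the well-known identity

$$\langle a - b, a - c \rangle_M = \tfrac{1}{2}\|a - b\|_M^2 + \tfrac{1}{2}\|a - c\|_M^2 - \tfrac{1}{2}\|b - c\|_M^2 \qquad (12)$$

we have, by using the definition of the inertial extrapolate $y_k$, that

$$\varphi_k^M - \varphi_{k+1}^M = \Delta_{k+1}^M + \langle y_k - x_{k+1}, x_{k+1} - x^* \rangle_M - \alpha_k \langle x_k - x_{k-1}, x_{k+1} - x^* \rangle_M. \qquad (13)$$

Then, by using the monotonicity of $A$, we deduce that

$$\langle \lambda_k A(x_{k+1}) - \lambda_k A(x^*), x_{k+1} - x^* \rangle \geq 0,$$
$$\langle M(y_k - x_{k+1}) - \lambda_k B(y_k) + \lambda_k B(x^*), x_{k+1} - x^* \rangle \geq 0,$$

and

$$\langle y_k - x_{k+1}, x_{k+1} - x^* \rangle_M + \lambda_k \langle B(x^*) - B(y_k), x_{k+1} - x^* \rangle \geq 0.$$

Combining with (13), we obtain

$$\varphi_k^M - \varphi_{k+1}^M \geq \Delta_{k+1}^M + \lambda_k \langle B(y_k) - B(x^*), x_{k+1} - x^* \rangle - \alpha_k \langle x_k - x_{k-1}, x_{k+1} - x^* \rangle_M. \qquad (14)$$

From the co-coercivity property of $B$ we have that

$$\begin{aligned} \langle B(y_k) - B(x^*), x_{k+1} - x^* \rangle &= \langle B(y_k) - B(x^*), x_{k+1} - y_k + y_k - x^* \rangle \\ &\geq \|B(y_k) - B(x^*)\|^2_{L^{-1}} + \langle B(y_k) - B(x^*), x_{k+1} - y_k \rangle \\ &\geq \|B(y_k) - B(x^*)\|^2_{L^{-1}} - \|B(y_k) - B(x^*)\|^2_{L^{-1}} - \tfrac{1}{2}\Gamma_k^L \\ &= -\tfrac{1}{2}\Gamma_k^L. \end{aligned}$$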

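Substituting back into (14) we arrive at

$$\varphi_k^M - \varphi_{k+1}^M \geq \Delta_{k+1}^M - \frac{\lambda_k}{2}\Gamma_k^L - \alpha_k \langle x_k - x_{k-1}, x_{k+1} - x^* \rangle_M.$$

Invoking again (12), it follows that

$$\begin{aligned} \varphi_{k+1}^M - \varphi_k^M - \alpha_k(\varphi_k^M - \varphi_{k-1}^M) &\leq -\Delta_{k+1}^M + \frac{\lambda_k}{2}\Gamma_k^L + \alpha_k\left(\Delta_k^M + \langle x_k - x_{k-1}, x_{k+1} - x_k \rangle_M\right) \\ &= -\Gamma_k^M + \frac{\lambda_k}{2}\Gamma_k^L + (\alpha_k + \alpha_k^2)\Delta_k^M. \end{aligned} \qquad (15)$$

The rest of the proof closely follows the proof of the corresponding convergence theorem in [ ]. By the definition of $S_k$ and using $(\alpha_k + \alpha_k^2)/2 \leq \alpha_k$, we have

$$\varphi_{k+1}^M - \varphi_k^M - \alpha_k(\varphi_k^M - \varphi_{k-1}^M) \leq -\Gamma_k^{S_k} + 2\alpha_k \Delta_k^M. \qquad (16)$$

By assumption (i), the first term is non-positive and since $\alpha_k \geq 0$, the second term is non-negative. Now, defining $\theta_k = \max(0, \varphi_k^M - \varphi_{k-1}^M)$ and setting $\delta_k = 2\alpha_k \Delta_k^M = \alpha_k \|x_k - x_{k-1}\|_M^2$, we obtain

$$\theta_{k+1} \leq \alpha_k \theta_k + \delta_k \leq \alpha \theta_k + \delta_k.$$

Applying this inequality recursively, one obtains a geometric series of the form

$$\theta_{k+1} \leq \alpha^k \theta_1 + \sum_{i=0}^{k-1} \alpha^i \delta_{k-i}.$$

Summing this inequality over $k = 0, \ldots, \infty$, one has

$$\sum_{k=0}^{\infty} \theta_{k+1} \leq \frac{1}{1 - \alpha}\left(\theta_1 + \sum_{k=1}^{\infty} \delta_k\right).$$

Note that the series on the right hand side converges by assumption (ii). Now we set $t_k = \varphi_k^M - \sum_{i=1}^{k} \theta_i$, and since $\varphi_k^M \geq 0$ and $\sum_{i=1}^{k} \theta_i$ is bounded independently of $k$, we see that $t_k$ is bounded from below. On the other hand,

$$t_{k+1} = \varphi_{k+1}^M - \theta_{k+1} - \sum_{i=1}^{k} \theta_i \leq \varphi_{k+1}^M - \varphi_{k+1}^M + \varphi_k^M - \sum_{i=1}^{k} \theta_i = t_k,$$

and hence $t_k$ is also non-increasing, thus convergent. This implies that $\varphi_k^M$ is convergent and especially that $\theta_k \to 0$. From (16) we get

$$\tfrac{1}{2}\|x_{k+1} - y_k\|_{S_k}^2 = \tfrac{1}{2}\|x_{k+1} - x_k - \alpha_k(x_k - x_{k-1})\|_{S_k}^2 \leq \varphi_k^M - \varphi_{k+1}^M + \alpha\theta_k + \delta_k.$$

Since $\delta_k$ is summable, it follows that $\|x_k - x_{k-1}\|_{S_k} \to 0$ and hence

$$\lim_{k \to \infty} \|x_{k+1} - x_k - \alpha_k(x_k - x_{k-1})\|_{S_k} = 0.$$

We already know that $x_k$ is bounded; hence, there is a weakly convergent subsequence $x_{k_\nu} \rightharpoonup \bar{x}$. Then we also get that $y_{k_\nu} = (1 + \alpha_{k_\nu}) x_{k_\nu} - \alpha_{k_\nu} x_{k_\nu - 1} \rightharpoonup \bar{x}$. Now we get from (10) that

$$x_{k_\nu + 1} = (\mathrm{Id} + \lambda_{k_\nu} M^{-1} A)^{-1}(y_{k_\nu} - \lambda_{k_\nu} M^{-1} B(y_{k_\nu}))$$

and pass to the limit (extracting another subsequence such that $\lambda_{k_\nu} \to \bar{\lambda}$ if necessary) to obtain

$$\bar{x} = (\mathrm{Id} + \bar{\lambda} M^{-1} A)^{-1}(\bar{x} - \bar{\lambda} M^{-1} B(\bar{x})),$$

which is equivalent to $-B(\bar{x}) \in A(\bar{x})$, which in turn shows that $\bar{x}$ is a solution. Opial's Theorem [38] concludes the proof.

Next, we address the question whether the sequence $\{\alpha_k\}$ can be chosen a priori such that the algorithm is guaranteed to converge. Indeed, in case of the inertial proximal point algorithm (4), it has already been shown in [ ] that convergence is ensured if $\{\alpha_k\}$ is a nondecreasing sequence in $[0, \alpha]$ with $\alpha < 1/3$.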
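The next theorem presents a related result for the proposed algorithm.

Theorem 2 In addition to the conditions of Theorem 1, assume that $\{\lambda_k\}$ and $\{\alpha_k\}$ are nondecreasing sequences and that there exists an $\varepsilon > 0$ such that for all $k$

$$R_k = (1 - 3\alpha_k) M - (1 - \alpha_k)^2 \frac{\lambda_k}{2} L \geq \varepsilon M. \qquad (17)$$

Then $x_k$ converges weakly to a solution of the inclusion $0 \in (A + B)(x)$.

Proof The proof of this result is an adaptation of the proof of the corresponding proposition in [ ]. From the last estimate in (15) and using the definition of $y_k$ in (10), it follows that

$$\begin{aligned} \varphi_{k+1}^M - \varphi_k^M - \alpha_k(\varphi_k^M - \varphi_{k-1}^M) &\leq -\Gamma_k^{S_k} + \alpha_k(1 + \alpha_k)\Delta_k^M \\ &= -\Delta_{k+1}^{S_k} - \alpha_k^2 \Delta_k^{S_k} + \alpha_k \langle x_{k+1} - x_k, x_k - x_{k-1} \rangle_{S_k} + (\alpha_k^2 + \alpha_k)\Delta_k^M \\ &\leq (\alpha_k - 1)\Delta_{k+1}^{S_k} + (\alpha_k - \alpha_k^2)\Delta_k^{S_k} + (\alpha_k^2 + \alpha_k)\Delta_k^M \\ &= (\alpha_k - 1)\Delta_{k+1}^{S_k} + \alpha_k \Delta_k^{T_k}, \end{aligned}$$

where $T_k = 2M - (1 - \alpha_k)\frac{\lambda_k}{2} L$.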

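We define $\mu_k = \varphi_k^M - \alpha_k \varphi_{k-1}^M + \alpha_k \Delta_k^{T_k}$, and since $\alpha_{k+1} \geq \alpha_k$ and using the above inequality,

$$\begin{aligned} \mu_{k+1} - \mu_k &= \varphi_{k+1}^M - \alpha_{k+1}\varphi_k^M + \alpha_{k+1}\Delta_{k+1}^{T_{k+1}} - \varphi_k^M + \alpha_k \varphi_{k-1}^M - \alpha_k \Delta_k^{T_k} \\ &\leq \varphi_{k+1}^M - \varphi_k^M - \alpha_k(\varphi_k^M - \varphi_{k-1}^M) + \alpha_{k+1}\Delta_{k+1}^{T_{k+1}} - \alpha_k \Delta_k^{T_k} \\ &\leq (\alpha_k - 1)\Delta_{k+1}^{S_k} + \alpha_{k+1}\Delta_{k+1}^{T_{k+1}}. \end{aligned}$$

Then, since $\alpha_{k+1} \geq \alpha_k$, we obtain

$$\begin{aligned} \mu_{k+1} - \mu_k &\leq \tfrac{1}{2}\left\langle \left((\alpha_k - 1) S_k + \alpha_{k+1} T_{k+1}\right)(x_{k+1} - x_k), x_{k+1} - x_k \right\rangle \\ &\leq \tfrac{1}{2}\left\langle \left((\alpha_{k+1} - 1) S_k + \alpha_{k+1} T_{k+1}\right)(x_{k+1} - x_k), x_{k+1} - x_k \right\rangle. \end{aligned}$$

Now using $\alpha_{k+1} \geq \alpha_k$ and $\lambda_{k+1} \geq \lambda_k$ we obtain

$$(\alpha_{k+1} - 1) S_k + \alpha_{k+1} T_{k+1} \leq (3\alpha_{k+1} - 1) M + (1 - \alpha_{k+1})^2 \frac{\lambda_{k+1}}{2} L = -R_{k+1},$$

which finally gives

$$\mu_{k+1} - \mu_k \leq -\Delta_{k+1}^{R_{k+1}}. \qquad (18)$$

Observe that by assumption (17), the sequence $\{\mu_k\}$ is nonincreasing and hence $\varphi_k^M - \alpha \varphi_{k-1}^M \leq \mu_k \leq \mu_1$. It follows that

$$\varphi_k^M \leq \alpha^k \varphi_0^M + \mu_1 \sum_{i=0}^{k-1} \alpha^i \leq \alpha^k \varphi_0^M + \frac{\mu_1}{1 - \alpha}.$$

On the other hand, summing (18) from $i = 1$ to $k$, we have

$$\mu_{k+1} - \mu_1 \leq -\sum_{i=1}^{k} \Delta_{i+1}^{R_{i+1}}.$$

Combining these two estimates, it follows that

$$\sum_{i=1}^{k} \Delta_{i+1}^{R_{i+1}} \leq \mu_1 - \mu_{k+1} \leq \mu_1 + \alpha \varphi_k^M \leq \alpha^{k+1}\varphi_0^M + \frac{\mu_1}{1 - \alpha}.$$

Since $R_k \geq \varepsilon M$, it follows that

$$\sum_{k=1}^{\infty} \Delta_k^M < \infty,$$

which especially shows (ii) in Theorem 1. The weak convergence of the $x_k$ now follows from Theorem 1.

Remark 3 In case $M = m\,\mathrm{Id}$, $L = l\,\mathrm{Id}$, $\lambda_k \equiv \lambda$, and defining the normalized step size $\gamma = l\lambda/m \in (0, 2)$, assertion (17) reduces to

$$1 - 3\alpha_k - \varepsilon - \gamma\frac{(1 - \alpha_k)^2}{2} \geq 0.$$

It easily follows that for any $\varepsilon \in (0, (9 - 4\gamma)/(2\gamma))$, the algorithm converges if the sequence $\{\alpha_k\}$ is non-decreasing with $0 \leq \alpha_k \leq \alpha(\gamma)$, where

$$\alpha(\gamma) = 1 + \frac{\sqrt{9 - 4\gamma - 2\varepsilon\gamma} - 3}{\gamma}.$$

See Figure 1 for a plot of $\alpha(\gamma)$ using $\varepsilon = 10^{-6}$.

Fig. 1 Upper bound on the extrapolation factor $\alpha$ in dependence on $\gamma$.

Remark 4 Let us consider a fully-implicit variant of the scheme (10), which is given by

$$\begin{cases} y_k = x_k + \alpha_k (x_k - x_{k-1}), \\ x_{k+1} = (\mathrm{Id} + \lambda_k \hat{M}^{-1}(A + B))^{-1}(y_k), \end{cases}$$

where $\hat{M}$ is again a linear, selfadjoint and positive definite map. In fact, this algorithm is an inertial proximal point algorithm in the metric $\hat{M}$, whose convergence properties have been studied in [ ]. This algorithm has less stringent convergence conditions compared to the algorithm proposed in this paper, but its application to practical problems is limited, since the resolvent with respect to $A + B$ can be complicated. Interestingly, if the operator $B$ is a linear, selfadjoint and positive semi-definite map, the above fully-implicit scheme can be significantly simplified.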
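In fact, using $\lambda_k \equiv \lambda$ and setting $\hat{M} = M - \lambda B$, where $\lambda$ is chosen such that $\hat{M} > 0$, it turns out that the fully implicit scheme in the metric $\hat{M}$ is equivalent to our proposed inertial forward-backward splitting algorithm (10) in the metric $M$, which only requires computing the resolvent with respect to $A$. Indeed, for linear $B$ the inclusion $(M + \lambda A)(x_{k+1}) \ni (M - \lambda B)(y_k)$ can be rewritten as $(\hat{M} + \lambda(A + B))(x_{k+1}) \ni \hat{M} y_k$. According to the results on the inertial proximal point algorithm in [ ], condition (i) of Theorem 1 can be replaced by the simpler condition $M - \lambda B > 0$, and convergence of the algorithm is guaranteed for $\{\alpha_k\}$ non-decreasing in $[0, \alpha]$ with $\alpha < 1/3$.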
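To make the scheme (10) and the bound of Remark 3 concrete, here is a minimal sketch for the scalar-metric case $M = \mathrm{Id}$, $L = l\,\mathrm{Id}$ with constant $\lambda$ and constant $\alpha$. The test problem (an $\ell_1$-regularized least-squares problem) and the names `alpha_bound`, `inertial_forward_backward`, `Phi`, `b`, `mu` are assumptions made for illustration; they do not come from the paper.

```python
import numpy as np

def alpha_bound(gamma, eps=1e-6):
    """Upper bound alpha(gamma) of Remark 3 for a normalized step size gamma in (0, 2)."""
    return 1.0 + (np.sqrt(9.0 - 4.0 * gamma - 2.0 * eps * gamma) - 3.0) / gamma

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def inertial_forward_backward(grad_B, resolvent_A, x0, lam, alpha, num_iters=500):
    """Iteration (10) with M = Id, constant step size lam and constant alpha."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(num_iters):
        y = x + alpha * (x - x_prev)          # inertial extrapolation
        # B is evaluated at the extrapolate y, then the resolvent of A is applied
        x_prev, x = x, resolvent_A(y - lam * grad_B(y), lam)
    return x

def objective(Phi, b, mu, x):
    return 0.5 * np.sum((Phi @ x - b) ** 2) + mu * np.sum(np.abs(x))

# Assumed synthetic data, for illustration only.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((40, 100))
b = Phi[:, :5].sum(axis=1)
mu = 0.1

lip = np.linalg.norm(Phi, 2) ** 2      # l in Remark 3 (with m = 1)
gamma = 1.0                            # normalized step size gamma = l * lam / m
lam = gamma / lip
alpha = alpha_bound(gamma)             # largest constant alpha allowed by Remark 3
x_hat = inertial_forward_backward(
    grad_B=lambda z: Phi.T @ (Phi @ z - b),
    resolvent_A=lambda z, t: soft_threshold(z, t * mu),
    x0=np.zeros(100), lam=lam, alpha=alpha)
```

For $\gamma = 1$ the bound is close to $\sqrt{5} - 2 \approx 0.236$, and it shrinks to $0$ as $\gamma$ approaches $2$, in line with the trade-off between step size and inertia described in Remark 3.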

3 Application to convex-concave saddle-point problems

Recently, so-called primal-dual splitting techniques have been proposed, which are motivated by the need to solve large-scale non-smooth convex optimization problems in image processing [0,, 4, 7, 6, 3, 49]. These algorithms can be applied if the structure of the problem allows one to rewrite it as a certain convex-concave saddle-point problem. Now let $X$ and $Y$ be two Hilbert spaces and consider the saddle point problem

$$\min_{x \in X} \max_{y \in Y} \; G(x) + Q(x) + \langle Kx, y \rangle - F(y) - P(y) \qquad (19)$$

with convex $G, Q : X \to \mathbb{R}$, $F, P : Y \to \mathbb{R}$, $K : X \to Y$ linear and bounded, and $Q, P$ differentiable with Lipschitz gradient (with respective Lipschitz constants $L_Q$, $L_P$). We define the monotone operators $A$, $B$ on $X \times Y$ as

$$A = \begin{bmatrix} \partial G & K^* \\ -K & \partial F \end{bmatrix}, \qquad B = \begin{bmatrix} \nabla Q & 0 \\ 0 & \nabla P \end{bmatrix}$$

and observe that the optimality system of the saddle point problem can be written as

$$0 \in (A + B)\begin{bmatrix} x \\ y \end{bmatrix}.$$

This setup fits into our general framework of Section 2. The standard splitting iterations (3) and (6) would not be applicable in general, since the evaluation of the proximal mapping $(\mathrm{Id} + \lambda A)^{-1}$ may be prohibitively expensive in this case. However, if we consider the preconditioned iteration (10) with an appropriate mapping $M$, the iteration becomes feasible. The idea is to choose $M$ such that one of the off-diagonal blocks of $A$ cancels out in the sum $M + A$. However, $M$ still has to be symmetric and positive definite, and this leads to the choice

$$M = \begin{bmatrix} \tau^{-1}\mathrm{Id} & -K^* \\ -K & \sigma^{-1}\mathrm{Id} \end{bmatrix}. \qquad (20)$$

From (10) we get for $\lambda_k = 1$ the following inertial primal-dual forward-backward algorithm:

$$\begin{cases} \xi_k = x_k + \alpha_k(x_k - x_{k-1}) \\ \zeta_k = y_k + \alpha_k(y_k - y_{k-1}) \\ x_{k+1} = (\mathrm{Id} + \tau \partial G)^{-1}(\xi_k - \tau(\nabla Q(\xi_k) + K^* \zeta_k)) \\ \bar{\xi}_{k+1} = 2x_{k+1} - \xi_k \\ y_{k+1} = (\mathrm{Id} + \sigma \partial F)^{-1}(\zeta_k - \sigma(\nabla P(\zeta_k) - K \bar{\xi}_{k+1})). \end{cases} \qquad (21)$$

In the case that $Q = P = 0$ and $\alpha_k = 0$ we obtain the primal-dual method from [0]. The next two results characterize the conditions under which the proposed inertial primal-dual forward-backward algorithm converges.
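Theorem 5 The iterates given by method (21) converge weakly to a solution of the saddle point problem (19) if

$$0 < \tau < 2/L_Q, \qquad 0 < \sigma < 2/L_P, \qquad \|K\|^2 < \left(\frac{1}{\tau} - \frac{L_Q}{2}\right)\left(\frac{1}{\sigma} - \frac{L_P}{2}\right), \qquad (22)$$

and if $\alpha_k \in [0, \alpha]$ with $\alpha < 1$ and the iterates $(x_k, y_k)$ fulfill

$$\sum_{k=1}^{\infty} \alpha_k \|(x_k, y_k) - (x_{k-1}, y_{k-1})\|_M^2 < \infty. \qquad (23)$$

Furthermore, condition (23) is fulfilled if $\{\alpha_k\}$ is nondecreasing and there exists $\varepsilon > 0$ such that for all $\alpha_k$

$$\begin{aligned} &\frac{1 - 3\alpha_k - \varepsilon}{\tau} \geq \frac{(1 - \alpha_k)^2}{2} L_Q, \qquad \frac{1 - 3\alpha_k - \varepsilon}{\sigma} \geq \frac{(1 - \alpha_k)^2}{2} L_P, \\ &\left(\frac{1 - 3\alpha_k - \varepsilon}{\tau} - \frac{(1 - \alpha_k)^2}{2} L_Q\right)\left(\frac{1 - 3\alpha_k - \varepsilon}{\sigma} - \frac{(1 - \alpha_k)^2}{2} L_P\right) \geq (1 - 3\alpha_k - \varepsilon)^2 \|K\|^2. \end{aligned} \qquad (24)$$

Proof Since $Q$ and $P$ are convex with Lipschitz-continuous gradients with Lipschitz constants $L_Q$ and $L_P$, respectively, it follows from the Baillon-Haddad Theorem ([3, Corollary 8.6]) that $\nabla Q$ and $\nabla P$ are co-coercive w.r.t. $L_Q^{-1}$ and $L_P^{-1}$, respectively. Hence, for $x, \xi \in X$ and $y, \zeta \in Y$ it holds that

$$\begin{aligned} \langle B(x, y) - B(\xi, \zeta), (x, y) - (\xi, \zeta) \rangle_{X \times Y} &= \langle \nabla Q(x) - \nabla Q(\xi), x - \xi \rangle_X + \langle \nabla P(y) - \nabla P(\zeta), y - \zeta \rangle_Y \\ &\geq L_Q^{-1}\|\nabla Q(x) - \nabla Q(\xi)\|_X^2 + L_P^{-1}\|\nabla P(y) - \nabla P(\zeta)\|_Y^2. \end{aligned}$$

Thus $B$ is co-coercive w.r.t. $L^{-1}$, where

$$L = \begin{bmatrix} L_Q \mathrm{Id} & 0 \\ 0 & L_P \mathrm{Id} \end{bmatrix}.$$

It is easy to check that

$$S = M - \tfrac{1}{2} L = \begin{bmatrix} \left(\frac{1}{\tau} - \frac{L_Q}{2}\right)\mathrm{Id} & -K^* \\ -K & \left(\frac{1}{\sigma} - \frac{L_P}{2}\right)\mathrm{Id} \end{bmatrix}$$

is positive definite if (22) is fulfilled, so that, under (23), weak convergence follows from Theorem 1. Applying condition (17) to (21) we have:

$$(1 - 3\alpha_k)\begin{bmatrix} \tau^{-1}\mathrm{Id} & -K^* \\ -K & \sigma^{-1}\mathrm{Id} \end{bmatrix} - \frac{(1 - \alpha_k)^2}{2}\begin{bmatrix} L_Q \mathrm{Id} & 0 \\ 0 & L_P \mathrm{Id} \end{bmatrix} \geq \varepsilon \begin{bmatrix} \tau^{-1}\mathrm{Id} & -K^* \\ -K & \sigma^{-1}\mathrm{Id} \end{bmatrix},$$

which can be checked to be true under the stated condition (24).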
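The iteration (21) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a small quadratic saddle-point problem with $G = F = 0$, $Q(x) = \frac{1}{2}\|x - a\|^2$ and $P(y) = \frac{1}{2}\|y - c\|^2$ (so $L_Q = L_P = 1$), for which the saddle point is available in closed form. The step sizes are one choice satisfying (22), and the constant $\alpha = 0.2$ is assumed to be small enough for (24). All function and variable names are hypothetical.

```python
import numpy as np

def inertial_primal_dual(K, grad_Q, grad_P, prox_G, prox_F, tau, sigma, alpha,
                         x0, y0, num_iters=5000):
    """Iteration (21) with a constant extrapolation factor alpha."""
    x_prev, x = x0.copy(), x0.copy()
    y_prev, y = y0.copy(), y0.copy()
    for _ in range(num_iters):
        xi = x + alpha * (x - x_prev)                  # primal extrapolation
        zeta = y + alpha * (y - y_prev)                # dual extrapolation
        x_new = prox_G(xi - tau * (grad_Q(xi) + K.T @ zeta), tau)
        xi_bar = 2.0 * x_new - xi                      # over-relaxed primal point
        y_new = prox_F(zeta - sigma * (grad_P(zeta) - K @ xi_bar), sigma)
        x_prev, x, y_prev, y = x, x_new, y, y_new
    return x, y

# Assumed toy problem: min_x max_y 0.5||x-a||^2 + <Kx,y> - 0.5||y-c||^2.
rng = np.random.default_rng(1)
K = rng.standard_normal((5, 8))
a = rng.standard_normal(8)
c = rng.standard_normal(5)

norm_K = np.linalg.norm(K, 2)
L_Q = L_P = 1.0
tau = sigma = 1.0 / (norm_K + 1.0)
# Step-size condition (22)
assert norm_K ** 2 < (1.0 / tau - L_Q / 2.0) * (1.0 / sigma - L_P / 2.0)

identity = lambda v, t: v   # resolvents of G = 0 and F = 0
x_hat, y_hat = inertial_primal_dual(
    K, lambda x: x - a, lambda y: y - c, identity, identity,
    tau, sigma, alpha=0.2, x0=np.zeros(8), y0=np.zeros(5))

# Closed-form saddle point from the optimality system of the toy problem.
x_ref = np.linalg.solve(np.eye(8) + K.T @ K, a - K.T @ c)
y_ref = K @ x_ref + c
```

For non-trivial $G$ and $F$, `prox_G` and `prox_F` would be replaced by the corresponding resolvents $(\mathrm{Id} + \tau\partial G)^{-1}$ and $(\mathrm{Id} + \sigma\partial F)^{-1}$.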

Let us present some practical rules to choose feasible parameters for the algorithm. For this we introduce the parameters $\gamma, \delta \in (0, 2)$, which can be interpreted as normalized step sizes in the primal and dual variables, and the parameter $r > 0$, which controls the relative scaling between the primal and dual step sizes.

Lemma 6 Choose $\gamma, \delta \in (0, 2)$ and $r > 0$ and set

$$\tau = \frac{1}{\|K\| r + L_Q/\gamma} \quad \text{and} \quad \sigma = \frac{1}{\|K\|/r + L_P/\delta}.$$

Furthermore, let $\{\alpha_k\}$ be a non-decreasing sequence satisfying $0 \leq \alpha_k \leq \alpha(\gamma, \delta)$, where

$$\alpha(\gamma, \delta) = 1 + \frac{\sqrt{9 - 4\max(\gamma, \delta) - 2\varepsilon\max(\gamma, \delta)} - 3}{\max(\gamma, \delta)}, \qquad (25)$$

and $\varepsilon \in (0, (9 - 4\max(\gamma, \delta))/(2\max(\gamma, \delta)))$. Then, the conditions (22) and (24) of Theorem 5 hold, i.e. algorithm (21) converges weakly to a solution of the saddle-point problem (19).

Proof It can be easily checked that the conditions (22) hold. Indeed, one has $\tau < 2/L_Q$, $\sigma < 2/L_P$ and also

$$\left(\frac{1}{\tau} - \frac{L_Q}{2}\right)\left(\frac{1}{\sigma} - \frac{L_P}{2}\right) = \|K\|^2 + \|K\|\left(\frac{r L_P (2 - \delta)}{2\delta} + \frac{L_Q (2 - \gamma)}{2\gamma r}\right) + \frac{L_Q L_P (2 - \gamma)(2 - \delta)}{4\gamma\delta} \geq \|K\|^2. \qquad (26)$$

Next, we can compute the maximum value of $\alpha_k$ that ensures convergence of the algorithm. Observe that for any $r > 0$ assertion (24) holds in particular if

$$(1 - 3\alpha_k - \varepsilon) - \frac{(1 - \alpha_k)^2}{2}\,\frac{\tau L_Q}{1 - \tau\|K\| r} \geq 0 \quad \text{and} \quad (1 - 3\alpha_k - \varepsilon) - \frac{(1 - \alpha_k)^2}{2}\,\frac{\sigma L_P}{1 - \sigma\|K\|/r} \geq 0,$$

where $\gamma = \frac{\tau L_Q}{1 - \tau\|K\| r}$ and $\delta = \frac{\sigma L_P}{1 - \sigma\|K\|/r}$. Clearly, the two inequalities are fulfilled if

$$(1 - 3\alpha_k - \varepsilon) - \frac{(1 - \alpha_k)^2}{2}\max(\gamma, \delta) \geq 0,$$

from which the upper bound $\alpha(\gamma, \delta)$ follows.

Remark 7 From equation (26), we can see that in case $L_P$ or $L_Q$ is zero (and for fixed $\gamma$, respectively $\delta$), it might be favorable to choose larger, respectively smaller, values of $r$, since this leads to a smaller product of the terms on the left hand side of (26) and hence to a larger product of primal and dual step sizes.

3.1 Recovering known algorithms

The proposed algorithm includes several popular algorithms for convex optimization as special cases:

Forward-backward splitting: Set $K = F = P = 0$ in (19), and set $M = \mathrm{Id}$, $\lambda_k < 2/L_Q$ and $\alpha_k = 0$ in (11). We obtain exactly the popular forward-backward splitting algorithm (3) for minimizing the sum of a smooth and a non-smooth convex function. See [5,7].
Nesterov's accelerated gradient method: In addition to the previous setting, let $\lambda_k = 1/L_Q$ and let the sequence $\{\alpha_k\}$ be computed according to one of the laws proposed in [34, 35, 4]. We can exactly recover Nesterov's accelerated gradient method [34, 35], the accelerated proximal point algorithm [6] and FISTA [4]. These algorithms offer an optimal convergence rate of $O(1/k^2)$ for the function gap $(G + Q)(x_k) - (G + Q)(x^*)$. However, it is still unclear whether the sequence of iterates $\{x_k\}$ converges. We cannot give a full answer here, but we can at least modify the FISTA algorithm such that the sequence $\alpha_k\|x_k - x_{k-1}\|^2$ is summable. Following [ ], condition (ii) can be easily enforced on-line because it involves only past iterates. One possibility to ensure summability in (ii) is to require that $\alpha_k\|x_k - x_{k-1}\|^2 = O(1/k^2)$, e.g.

$$\alpha_k = \min\left(\frac{k - 1}{k + 2}, \; \frac{c}{k^2\|x_k - x_{k-1}\|^2}\right), \qquad (27)$$

for some $c > 0$. However, since in the FISTA algorithm $\alpha_k = (k - 1)/(k + 2) \to 1$, Theorem 1 still does not imply convergence of the iterates. This is left for future investigation.

Primal-dual algorithms: Setting $P = Q = 0$ in (19) and $\alpha_k = 0$ in (21), we clearly obtain the first-order primal-dual algorithm investigated in [43,, 0, 7]. Furthermore, if we let $Q$ be a convex function with Lipschitz continuous gradient $\nabla Q$, we obtain the first-order primal-dual algorithm of Condat [6]. Moreover, in the presence of smooth terms $Q$ and $P$ in the primal and dual problem, we recover Vũ's algorithm from [49]. We point out that the methods in [6,49] involve an additional relaxation step of the form:

$$x_{k+1} = \left((1 - \rho_k)\,\mathrm{Id} + \rho_k(\mathrm{Id} + \lambda_k T)^{-1}\right)(x_k), \qquad (28)$$

where $\rho_k$ is the relaxation parameter. In case there are no smooth terms $Q$ and $P$, the relaxation parameter satisfies $\rho_k \in (0, 2)$; in the presence of $Q$ and $P$, the relaxation parameter is further restricted. See Section 4 for numerical comparisons. Observe that the overrelaxation technique is quite different from the inertial technique we used in this paper, which is of the form:

$$x_{k+1} = (\mathrm{Id} + \lambda_k T)^{-1}(x_k + \alpha_k(x_k - x_{k-1})). \qquad (29)$$

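Indeed, it was shown in [ ] that one can even use overrelaxation and inertial forces simultaneously. However, introducing an additional overrelaxation step in the proposed framework is left for future investigation.

3.2 Preconditioning

Besides the property of the map $M$ to make the primal-dual iterations feasible, the map $M$ can also be interpreted as applying the algorithm (10) using $M = \mathrm{Id}$ to the modified inclusion

$$-M^{-1}B(x^*) \in M^{-1}A(x^*),$$

and hence, $M^{-1}$ can be interpreted as a left preconditioner to the inclusion (8). In the context of saddle point problems, Pock and Chambolle [4] proposed a preconditioning of the form

$$M = \begin{bmatrix} T^{-1} & -K^* \\ -K & \Sigma^{-1} \end{bmatrix},$$

where $T$ and $\Sigma$ are selfadjoint, positive definite maps. A condition for the positive definiteness of $M$ follows from the following lemma.

Lemma 8 Let $A_1$, $A_2$ be symmetric positive definite maps and $B$ a bounded operator. If $\|A_2^{-1/2} B A_1^{-1/2}\| < 1$, then

$$A = \begin{bmatrix} A_1 & B^* \\ B & A_2 \end{bmatrix}$$

is positive definite.

Proof We calculate

$$\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} A_1 & B^* \\ B & A_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \right\rangle = \langle x, A_1 x \rangle + 2\langle Bx, y \rangle + \langle y, A_2 y \rangle$$

and estimate the middle term from below by the Cauchy-Schwarz and Young inequalities to get, for every $c > 0$, that

$$2\langle Bx, y \rangle = 2\langle A_2^{-1/2} B A_1^{-1/2} A_1^{1/2} x, A_2^{1/2} y \rangle \geq -c\|A_2^{-1/2} B A_1^{-1/2}\|^2 \|A_1^{1/2} x\|^2 - \frac{1}{c}\|A_2^{1/2} y\|^2.$$

Combining this with the assumption that $\|A_2^{-1/2} B A_1^{-1/2}\| < 1$, we see that we can choose $c$ such that

$$\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, A\begin{bmatrix} x \\ y \end{bmatrix} \right\rangle \geq \left(1 - c\|A_2^{-1/2} B A_1^{-1/2}\|^2\right)\|A_1^{1/2} x\|^2 + \left(1 - \frac{1}{c}\right)\|A_2^{1/2} y\|^2 > 0,$$

which proves the auxiliary statement.

We conclude that algorithm (10) converges as long as one has $\|\Sigma^{1/2} K T^{1/2}\| < 1$. In order to keep the proximal maps with respect to $G$ and $F$ feasible, the maps $T$ and $\Sigma$ were restricted to diagonal matrices. However, in recent work [5], it was shown that some proximal maps are still efficiently computable if $T$ and $\Sigma$ are the sum of a diagonal matrix and a rank-one matrix. Applying the preconditioning technique to the proposed inertial primal-dual forward-backward algorithm (21), we obtain the method

$$\begin{cases} \xi_k = x_k + \alpha_k(x_k - x_{k-1}) \\ \zeta_k = y_k + \alpha_k(y_k - y_{k-1}) \\ x_{k+1} = (\mathrm{Id} + T \partial G)^{-1}(\xi_k - T(\nabla Q(\xi_k) + K^* \zeta_k)) \\ \bar{\xi}_{k+1} = 2x_{k+1} - \xi_k \\ y_{k+1} = (\mathrm{Id} + \Sigma \partial F)^{-1}(\zeta_k - \Sigma(\nabla P(\zeta_k) - K \bar{\xi}_{k+1})). \end{cases} \qquad (30)$$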
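It turns out that the resulting method converges under appropriate conditions.

Theorem 9 In the setting of Theorem 5, let furthermore $\nabla Q$ and $\nabla P$ be co-coercive w.r.t. two bounded, linear, symmetric and positive maps $D^{-1}$ and $E^{-1}$, respectively. If it holds that

$$\Sigma^{-1} - \tfrac{1}{2} E > 0, \qquad (31)$$
$$T^{-1} - \tfrac{1}{2} D > 0, \qquad (32)$$
$$\left\|\left(\Sigma^{-1} - \tfrac{1}{2} E\right)^{-1/2} K \left(T^{-1} - \tfrac{1}{2} D\right)^{-1/2}\right\| < 1, \qquad (33)$$

and that $\alpha_k$ is a nondecreasing sequence in $[0, \alpha]$ with $\alpha < 1$, and the iterates $(x_k, y_k)$ of (30) fulfill

$$\sum_{k=1}^{\infty} \alpha_k \|(x_k, y_k) - (x_{k-1}, y_{k-1})\|_M^2 < \infty,$$

then $(x_k, y_k)$ converges weakly to a saddle point of (19). Furthermore, convergence is assured if there exists an $\varepsilon > 0$ such that for all $\alpha_k$ it holds that

$$\begin{aligned} &(1 - 3\alpha_k - \varepsilon)\Sigma^{-1} - \frac{(1 - \alpha_k)^2}{2} E > 0, \qquad (1 - 3\alpha_k - \varepsilon)T^{-1} - \frac{(1 - \alpha_k)^2}{2} D > 0, \\ &\left\|\left((1 - 3\alpha_k - \varepsilon)\Sigma^{-1} - \frac{(1 - \alpha_k)^2}{2} E\right)^{-1/2} K \left((1 - 3\alpha_k - \varepsilon)T^{-1} - \frac{(1 - \alpha_k)^2}{2} D\right)^{-1/2}\right\| \leq \frac{1}{1 - 3\alpha_k - \varepsilon}. \end{aligned} \qquad (34)$$

Proof We start by setting

$$C = \begin{bmatrix} D & 0 \\ 0 & E \end{bmatrix}$$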

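and by Theorem 1 we only need to check if $S = M - \frac{1}{2} C$ is positive. Obviously, the diagonal blocks of $S$ are positive by (31) and (32). Now we use Lemma 8 to see that (31), (32) and (33) imply that $S$ is positive definite. For the second claim, we employ Theorem 2 and only need to show that

$$R_k = (1 - 3\alpha_k) M - \frac{(1 - \alpha_k)^2}{2} C \geq \varepsilon M,$$

which is equivalent to showing that

$$(1 - 3\alpha_k - \varepsilon)\begin{bmatrix} T^{-1} & -K^* \\ -K & \Sigma^{-1} \end{bmatrix} - \frac{(1 - \alpha_k)^2}{2}\begin{bmatrix} D & 0 \\ 0 & E \end{bmatrix} \geq 0.$$

Again using Lemma 8 we obtain that (34) ensures this.

3.3 Diagonal Preconditioning

In this section, we show how we can choose pointwise step sizes for both the primal and the dual variables that will ensure the convergence of the algorithm. The next result is an adaptation of the preconditioner proposed in [4].

Lemma 10 Assume that $\nabla Q$ and $\nabla P$ are co-coercive with respect to the inverses of diagonal matrices $D$ and $E$, where $D = \mathrm{diag}(d_1, \ldots, d_n)$ and $E = \mathrm{diag}(e_1, \ldots, e_m)$. Fix $\gamma, \delta \in (0, 2)$, $r > 0$, $s \in [0, 2]$ and let $T = \mathrm{diag}(\tau_1, \ldots, \tau_n)$ and $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_m)$ with

$$\tau_j = \frac{1}{\frac{d_j}{\gamma} + r\sum_{i=1}^{m} |K_{i,j}|^{2-s}}, \qquad \sigma_i = \frac{1}{\frac{e_i}{\delta} + \frac{1}{r}\sum_{j=1}^{n} |K_{i,j}|^{s}}. \qquad (35)$$

Then it holds that

$$\Sigma^{-1} - \tfrac{1}{2} E > 0, \qquad T^{-1} - \tfrac{1}{2} D > 0, \qquad (36)$$
$$\left\|\left(\Sigma^{-1} - \tfrac{1}{2} E\right)^{-1/2} K \left(T^{-1} - \tfrac{1}{2} D\right)^{-1/2}\right\| \leq 1. \qquad (37)$$

Furthermore, equation (34) is fulfilled if (25) is fulfilled.

Proof The first two conditions follow from the fact that for diagonal matrices, (36) can be written pointwise. By the definition of $\tau_j$ and $\sigma_i$ it follows that for any $s \in [0, 2]$, and using the convention that $0^0 = 0$,

$$\frac{1}{\tau_j} - \frac{d_j}{2} > \frac{1}{\tau_j} - \frac{d_j}{\gamma} = r\sum_{i=1}^{m} |K_{i,j}|^{2-s} \geq 0 \quad \text{and} \quad \frac{1}{\sigma_i} - \frac{e_i}{2} > \frac{1}{\sigma_i} - \frac{e_i}{\delta} = \frac{1}{r}\sum_{j=1}^{n} |K_{i,j}|^{s} \geq 0.$$

For the third condition, we have that for any $s \in [0, 2]$

$$\begin{aligned} \left\|\left(\Sigma^{-1} - \tfrac{1}{2} E\right)^{-1/2} K \left(T^{-1} - \tfrac{1}{2} D\right)^{-1/2} x\right\|^2 &= \sum_{i=1}^{m} \frac{1}{\frac{1}{\sigma_i} - \frac{e_i}{2}} \left(\sum_{j=1}^{n} \frac{K_{i,j}\, x_j}{\sqrt{\frac{1}{\tau_j} - \frac{d_j}{2}}}\right)^2 \\ &< \sum_{i=1}^{m} \frac{1}{\frac{1}{\sigma_i} - \frac{e_i}{\delta}} \left(\sum_{j=1}^{n} \frac{|K_{i,j}|^{s/2}\,|K_{i,j}|^{1 - s/2}\,|x_j|}{\sqrt{\frac{1}{\tau_j} - \frac{d_j}{\gamma}}}\right)^2 \\ &\leq \sum_{i=1}^{m} \frac{1}{\frac{1}{\sigma_i} - \frac{e_i}{\delta}} \left(\sum_{j=1}^{n} |K_{i,j}|^{s}\right)\left(\sum_{j=1}^{n} \frac{|K_{i,j}|^{2-s}\, x_j^2}{\frac{1}{\tau_j} - \frac{d_j}{\gamma}}\right), \end{aligned} \qquad (38)$$

where the strict inequality follows from $K_{i,j} \leq |K_{i,j}|$ and $\gamma, \delta < 2$, and the last line follows from the Cauchy-Schwarz inequality. By the definition of $\sigma_i$ and $\tau_j$, the above estimate can be simplified to

$$\sum_{i=1}^{m} \frac{\sum_{j=1}^{n} |K_{i,j}|^{s}}{\frac{1}{r}\sum_{j=1}^{n} |K_{i,j}|^{s}} \sum_{j=1}^{n} \frac{|K_{i,j}|^{2-s}\, x_j^2}{\frac{1}{\tau_j} - \frac{d_j}{\gamma}} = \sum_{j=1}^{n} \frac{r\sum_{i=1}^{m} |K_{i,j}|^{2-s}}{\frac{1}{\tau_j} - \frac{d_j}{\gamma}}\, x_j^2 = \|x\|^2. \qquad (39)$$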
(39) Using the above estimate in the definition of the operator norm, we obtain the desired result (Σ E) K(T D) (Σ = sup E) K(T D) x x 0 x. (40) If we now assume that (5) is fulfilled, we especially obtain that 3α ɛ δ ( α ) 3α ɛ σ i ( α ) e i and j 3α ɛ, ( α ) and consequently, by using the definition of τ j and σ i from (35), that /r K ij s 3α ɛ and r 3α ɛ τ i ( α ) d i K ij s i 3α ɛ. Now one can use the same arguments as in inequalities (38) and (39) to derive that (34) is fulfilled.
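The pointwise step sizes of Lemma 10 can be checked numerically on a small example. The sketch below is not from the paper: it assumes that (35) reads $\tau_j = 1/(d_j/\gamma + r\sum_i |K_{i,j}|^{2-s})$ and $\sigma_i = 1/(e_i/\delta + \frac{1}{r}\sum_j |K_{i,j}|^{s})$ with $\gamma, \delta \in (0, 2)$, and that the norm in (37) is taken with the factors $\tfrac{1}{2}D$ and $\tfrac{1}{2}E$. The function names `diagonal_preconditioners` and `preconditioned_norm` and the random test data are hypothetical.

```python
import numpy as np

def diagonal_preconditioners(K, d, e, gamma=1.0, delta=1.0, r=1.0, s=1.0):
    """Pointwise step sizes in the spirit of (35); d, e are the diagonals of D, E."""
    absK = np.abs(K)
    # Convention 0^0 = 0: zero entries of K never contribute to the sums.
    pow_tau = np.where(absK > 0, absK ** (2.0 - s), 0.0)
    pow_sigma = np.where(absK > 0, absK ** s, 0.0)
    tau = 1.0 / (d / gamma + r * pow_tau.sum(axis=0))        # one value per column j
    sigma = 1.0 / (e / delta + pow_sigma.sum(axis=1) / r)    # one value per row i
    return tau, sigma

def preconditioned_norm(K, tau, sigma, d, e):
    """Spectral norm of (Sigma^{-1} - E/2)^{-1/2} K (T^{-1} - D/2)^{-1/2}."""
    left = 1.0 / np.sqrt(1.0 / sigma - 0.5 * e)
    right = 1.0 / np.sqrt(1.0 / tau - 0.5 * d)
    return np.linalg.norm(left[:, None] * K * right[None, :], 2)

# Illustrative data (assumption): a small sparse-ish K and positive d, e.
rng = np.random.default_rng(0)
m, n = 6, 4
K = rng.standard_normal((m, n))
K[rng.random((m, n)) < 0.3] = 0.0
d = rng.uniform(0.1, 1.0, n)
e = rng.uniform(0.1, 1.0, m)

norms = {}
for s in (0.0, 1.0, 2.0):
    tau, sigma = diagonal_preconditioners(K, d, e, s=s)
    norms[s] = preconditioned_norm(K, tau, sigma, d, e)
```

The design mirrors the proof: the step sizes only need row and column sums of powers of $|K_{i,j}|$, so no estimate of $\|K\|$ is required, and the norm bound is independent of the choice of $r$ and $s$.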

4 Numerical experiments

In this section, we provide several numerical results based on simple convex image processing problems to investigate the numerical properties of the proposed algorithm.

4.1 TV-$\ell_2$ denoising

Let us investigate the well-known total variation denoising model:
$$\min_u \|\nabla u\|_{2,1} + \frac{\lambda}{2} \|u - f\|_2^2, \tag{41}$$
where $f \in \mathbb{R}^{MN}$ is a given noisy image of size $M \times N$ pixels, $u \in \mathbb{R}^{MN}$ is the restored image, and $\nabla \in \mathbb{R}^{2MN \times MN}$ is a sparse matrix implementing the discretized image gradient based on simple forward differences. The operator norm of $\nabla$ is computed as $\sqrt{8}$. The parameter $\lambda > 0$ is used to control the trade-off between smoothness and data fidelity. For more information we refer to [10]. Figure 2 shows an exemplary denoising result, where we used the noisy image on the left hand side as input image and set λ = 0.

The dual problem associated to (41) is given by the following optimization problem
$$\min_p \frac{1}{2} \|\lambda f - \nabla^T p\|_2^2 + I_P(p), \tag{42}$$
where $p \in \mathbb{R}^{2MN}$ is the dual variable and $I_P$ is the indicator function of the set $P = \{p \in \mathbb{R}^{2MN} : \|p\|_{2,\infty} \le 1\}$. This problem can easily be cast into the problem class (9), by setting $Q(p) = \frac{1}{2}\|\lambda f - \nabla^T p\|_2^2$, $G = I_P(p)$, and $K = F^* = P^* = 0$.

In our first experiment of this section, we study the behavior of the error $e_k = \alpha_k \|x^k - x^{k-1}\|^2$, which plays a central role in showing convergence of the algorithm. Figure 3 shows the convergence of the sequence $\{e_k\}$ generated by the FISTA algorithm by additionally using (7) for different choices of the constant $c$. The left figure depicts a case where $c$ is not chosen large enough and hence the safeguard shrinks the extrapolation factor $\alpha_k$ such that the error $e_k$ still converges with rate $1/k^2$. The right figure shows a case where $c$ is chosen sufficiently large and hence the safeguard does not apply. In this case, the algorithm produces the same sequence of iterates as the original FISTA algorithm.
From our numerical results, it seems that the asymptotic convergence of $e_k$ is actually faster than $1/k^2$, which suggests that the iterates of FISTA are indeed convergent.

In the second experiment we consider a saddle-point formulation of (41)
$$\min_u \max_p \langle \nabla u, p \rangle + \frac{\lambda}{2} \|u - f\|_2^2 - I_P(p). \tag{43}$$
Casting this problem in the general form (9), the most simple choice is $K = \nabla$, $F^*(p) = I_P(p)$, $G(u) = \frac{\lambda}{2}\|u - f\|_2^2$, $Q = P^* = 0$. Hence, the algorithm reduces to an inertial variant of the primal-dual algorithm of [10]. According to () the step sizes $\tau$ and $\sigma$ need to satisfy $\tau\sigma < 1/\|K\|^2$, but the ratio $\tau/\sigma$ can be chosen arbitrarily. Figure 4 shows the convergence of the primal-dual gap for different choices of $\alpha$ and the ratio $\tau/\sigma$. In general, one can see that the convergence becomes faster for larger values of $\alpha$. According to (4), we can guarantee convergence for $\alpha < 1/3$ but we cannot guarantee convergence for larger values of $\alpha$. In fact, it turns out that the feasible range of $\alpha$ depends on the ratio $\tau/\sigma$. For τ/σ = 0., fastest convergence is obtained by choosing $\alpha$ dynamically as $\alpha_k = (k - 1)/(k + 2)$. In this case, the primal-dual algorithm shows a very similar performance to the FISTA algorithm. For τ/σ = 0.0, the algorithm converges for up to $\alpha = 1/2$, but diverges for the dynamic choice. This behavior can be explained by the fact that the ratio $\tau/\sigma$ directly influences the $M$-metric (0), which in turn leads to a divergence of the error term $\sum_{k=1}^{\infty} \alpha_k \|x^k - x^{k-1}\|_M^2$.

Next, we provide an experiment where we compare the effect of the inertial force with the effect of overrelaxation that has already been considered in [16,49]. Figure 5 compares the primal-dual gap of the plain primal-dual algorithm [10] (i.e. $\alpha = 0$) with the performance of its variants using either inertial forces with $\alpha = 1/2$ or overrelaxation (see (8)) with $\rho = 1.9$. For all methods we used τ/σ = 0.0. Both variants improve the convergence of the plain primal-dual algorithm, but we observed that overrelaxation leads to some numerical oscillations, in particular for values of $\rho$ close to 2.

Fig. 5 Comparison between inertial forces and overrelaxation (primal-dual gap for the plain primal-dual algorithm, its inertial variant and its overrelaxed variant). Both techniques show similar performance improvement but overrelaxation appears numerically less stable.
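To make the second experiment concrete, the following NumPy sketch runs an inertial primal-dual iteration on the saddle-point form of the TV-$\ell_2$ model. It is an illustration under stated assumptions, not the authors' code: the update order (extrapolate both variables, primal proximal step, then dual projection with the usual $2u^{k+1} - \xi^k$ correction) is one common way to write a forward-backward step in the primal-dual metric, and the helper names (`grad`, `grad_T`, `project_P`, `inertial_primal_dual`), the test image, and the values of `lam`, `alpha` and `ratio` are all hypothetical.

```python
import numpy as np

def grad(u):
    """Forward differences with Neumann boundary; output shape (2, H, W)."""
    g = np.zeros((2,) + u.shape)
    g[0, :-1, :] = u[1:, :] - u[:-1, :]
    g[1, :, :-1] = u[:, 1:] - u[:, :-1]
    return g

def grad_T(p):
    """Adjoint of grad (the negative discrete divergence)."""
    u = np.zeros(p.shape[1:])
    u[:-1, :] -= p[0, :-1, :]
    u[1:, :] += p[0, :-1, :]
    u[:, :-1] -= p[1, :, :-1]
    u[:, 1:] += p[1, :, :-1]
    return u

def project_P(p):
    """Pointwise projection onto {p : |p_ij|_2 <= 1}."""
    return p / np.maximum(1.0, np.sqrt((p ** 2).sum(axis=0)))

def primal_energy(u, f, lam):
    return np.sqrt((grad(u) ** 2).sum(axis=0)).sum() + 0.5 * lam * ((u - f) ** 2).sum()

def dual_energy(p, f, lam):
    # Value of min_u <grad u, p> + lam/2 |u - f|^2 for feasible p.
    w = grad_T(p)
    return (w * f).sum() - (w ** 2).sum() / (2.0 * lam)

def inertial_primal_dual(f, lam, alpha=0.3, ratio=0.1, iters=300):
    L2 = 8.0                               # upper bound on ||grad||^2
    tau = np.sqrt(0.99 * ratio / L2)       # tau * sigma = 0.99 / L2 < 1 / ||K||^2
    sigma = tau / ratio                    # tau / sigma = ratio
    u, u_old = f.copy(), f.copy()
    p = np.zeros((2,) + f.shape)
    p_old = p.copy()
    for _ in range(iters):
        xi = u + alpha * (u - u_old)       # inertial extrapolation (primal)
        zeta = p + alpha * (p - p_old)     # inertial extrapolation (dual)
        u_old, p_old = u, p
        # prox of tau * lam/2 |u - f|^2 evaluated at xi - tau * K^T zeta
        u = (xi - tau * grad_T(zeta) + tau * lam * f) / (1.0 + tau * lam)
        p = project_P(zeta + sigma * grad(2.0 * u - xi))
    return u, p

# Illustrative test image (assumption): a square plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[4:12, 4:12] = 1.0
f = clean + 0.1 * rng.standard_normal(clean.shape)
lam = 8.0
u, p = inertial_primal_dual(f, lam)
gap = primal_energy(u, f, lam) - dual_energy(p, f, lam)
```

Setting `alpha=0.0` gives the plain primal-dual iteration, so the same function can be used to compare the gap with and without the inertial term for a fixed ratio of the step sizes.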

Fig. 2 Application to total variation based image denoising with $\ell_2$ fitting term. (a) Noisy image, (b) restored image. (a) shows the noisy input image containing Gaussian noise with a standard deviation of σ = 0., (b) shows the restored image.

Fig. 3 Convergence of the error sequence $\alpha_k \|x^k - x^{k-1}\|^2$ in the FISTA algorithm, shown together with the bound $c/k^2$ for two choices of the constant $c$ (panels (a) and (b)).

4.2 TV-$\ell_2$ deconvolution

Our next example incorporates an additional linear operator in the data fidelity. The problem is given by
$$\min_u \|\nabla u\|_{2,1} + \frac{\lambda}{2} \|Hu - f\|_2^2, \tag{44}$$
where $H \in \mathbb{R}^{MN \times MN}$ is a linear operator, for example $H$ can be such that $Hu$ is equivalent to the 2D convolution $h * u$, where $h$ is a convolution kernel. We again consider a saddle-point formulation
$$\min_u \max_p \langle \nabla u, p \rangle + \frac{\lambda}{2} \|Hu - f\|_2^2 - I_P(p). \tag{45}$$
Casting this problem into the general class of problems (9), one has different possibilities. If we chose $K = \nabla$, $F^*(p) = I_P(p)$, $G(u) = \frac{\lambda}{2}\|Hu - f\|_2^2$, $Q = P^* = 0$, we would have to compute the proximal map with respect to $G$ in each iteration of the algorithm, which can be computationally very expensive. Instead, if we choose $G = 0$, $Q(u) = \frac{\lambda}{2}\|Hu - f\|_2^2$, we only need to compute $\nabla Q(u) = \lambda H^T (Hu - f)$, which is obviously much cheaper. We call this variant the explicit variant. Alternatively, we can additionally dualize the data term, using $\frac{\lambda}{2}\|Hu - f\|_2^2 = \max_q \langle Hu - f, q \rangle - \frac{1}{2\lambda}\|q\|_2^2$, which leads to the extended saddle-point problem
$$\min_u \max_{p,q} \langle \nabla u, p \rangle + \langle Hu, q \rangle - \langle f, q \rangle - \frac{1}{2\lambda} \|q\|_2^2 - I_P(p),$$
where $q \in \mathbb{R}^{MN}$ is the new dual variable vector. Casting now this problem into (9), we identify
$$K = \begin{pmatrix} \nabla \\ H \end{pmatrix}, \quad G = 0, \quad Q = 0, \quad F^*(p, q) = I_P(p) + \frac{1}{2\lambda} \|q\|_2^2 + \langle f, q \rangle,$$
which eventually leads to proximal maps that are easy to compute. We call this variant the split-dual variant.

Fig. 4 Convergence of the inertial primal-dual forward-backward algorithm () for different choices of $\tau/\sigma$ and $\alpha$. Panels (a) and (b) show the primal-dual gap for two values of $\tau/\sigma$; the curves are FISTA, $\alpha = 0$, $\alpha = 1/3$, $\alpha = 1/2$ and $\alpha_k = (k-1)/(k+2)$, together with an $O(1/k^2)$ reference line.

Fig. 6 Application to total variation based image deconvolution with $\ell_2$ fitting term. (a) Noisy and blurry image, (b) restored image. (a) shows the noisy (σ = 0.0) and blurry input image together with the known point spread function, and (b) shows the restored image using λ = 000.

Figure 7 shows a comparison of the convergence between the explicit and the split-dual variants with and without inertial forces. For the explicit variant, the maximal value of the inertial force was computed using formula (5), where we set $L_K = \sqrt{8}$, $L_Q = \lambda$, $\gamma = 1$ and r = 00. This results in a theoretically maximal value of $\alpha = 0.236$ but the algorithm also converges for $\alpha = 1/3$ (see Remark 4). For the split-dual variant, the formulation does not involve any explicit terms and hence the maximal feasible value for $\alpha$ is $1/3$. The primal and dual step sizes were computed according to the preconditioning rules (35) (skipping the explicit terms), where we again used r = 00. The figure shows the primal energy gap, where the optimal primal energy value has been computed by running the

explicit variant for 0000 iterations. The algorithms were stopped after the primal energy gap was below a threshold of 0. From the figure, one can see that for both variants, the inertial force leads to a faster convergence. One can also see that in the early stage of the iterations, the explicit variant seems to converge faster than the split-dual variant. Finally, we point out that the asymptotic convergence of both variants is considerably faster than $O(1/k^2)$.

Fig. 7 Convergence of the primal-dual algorithms with and without inertial forces (energy gap of the explicit and split-dual variants for $\alpha = 0$ and $\alpha = 1/3$, together with an $O(1/k^2)$ reference line).

5 Conclusion

In this paper we considered an inertial forward-backward algorithm for solving monotone inclusions given by the sum of a monotone operator with an easy-to-compute resolvent operator and another monotone operator which is co-coercive. We have proven convergence of the algorithm in a general Hilbert space setting. It turns out that the proposed algorithm generalizes several recently proposed algorithms, for example the FISTA algorithm of Beck and Teboulle [4] and the primal-dual algorithm of Chambolle and Pock [10]. This gives rise to new inertial primal-dual algorithms for convex-concave programming. In several numerical experiments we demonstrated that the inertial term leads to faster convergence while keeping the complexity of each iteration basically unchanged. Future work will mainly concentrate on trying to find worst-case convergence rates for particular problem classes.

Acknowledgements Thomas Pock acknowledges support from the Austrian science fund (FWF) under the project Efficient algorithms for nonsmooth optimization in imaging, No. I1148 and the FWF-START project Bilevel optimization for Computer Vision, No. Y729. The authors wish to thank Antonin Chambolle for very helpful discussions.

References

1. F. Alvarez. Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert space. SIAM J. on Optimization, 14(3):773-782, 2004.
2. F. Alvarez and H. Attouch. An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Analysis, 9(1-2):3-11, 2001.
3. H.H. Bauschke and P.L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2011.
4. A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci., 2(1):183-202, 2009.
5. S. Becker and J. Fadili. A quasi-Newton proximal splitting method. In Advances in Neural Information Processing Systems 25, 2012.
6. R.I. Bot and E.R. Csetnek. An inertial alternating direction method of multipliers. Minimax Theory and its Applications, 2014. to appear.
7. R.I. Bot, E.R. Csetnek, and C. Hendrich. Inertial Douglas-Rachford splitting for monotone inclusion problems. Technical report, arXiv, 2014.
8. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn., 3(1):1-122, 2011.
9. R. Bruck. On the weak convergence of an ergodic iteration for the solution of variational inequalities for monotone operators in Hilbert space. Journal of Mathematical Analysis and Applications, 61:159-164, 1977.
10. A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120-145, 2011.
11. G. Chen and R. Rockafellar. Convergence rates in forward-backward splitting. SIAM Journal on Optimization, 7(2):421-444, 1997.
12. E. Chouzenoux, J.-C. Pesquet, and A. Repetti. Variable metric forward-backward algorithm for minimizing the sum of a differentiable function and a convex function. Journal of Optimization Theory and Applications, pages 1-26, 2013.
13. P.L. Combettes and J.-C. Pesquet. Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued and Variational Analysis, 20(2):307-330, 2012.
14. P.L. Combettes and B.C. Vũ. Variable metric forward-backward splitting with applications to monotone inclusions in duality. Optimization, ahead-of-print:1-30, 2012.
15. P.L. Combettes and V. Wajs. Signal recovery by proximal forward-backward splitting. SIAM Multiscale Modelling and Simulation, 4(4):1168-1200, 2005.
16. L. Condat. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. Journal of Optimization Theory and Applications, 158(2):460-479, 2013.
17. I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57:1413-1457, 2004.
18. J. Douglas and H. H. Rachford. On the numerical solution of heat conduction problems in two and three space variables. Transactions of The American Mathematical Society, 82:421-439, 1956.
19. J. Duchi and Y. Singer. Efficient online and batch learning using forward backward splitting. Journal of Machine Learning Research, 10:2899-2934, 2009.
20. J. Eckstein. Splitting methods for monotone operators with applications to parallel optimization. PhD thesis, Massachusetts Institute of Technology, 1989.
21. J. Eckstein and D.P. Bertsekas. On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming, 55:293-318, 1992.
22. E. Esser, X. Zhang, and T.F. Chan. A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sciences, 3(4):1015-1046, 2010.
23. D. Gabay. Applications of the method of multipliers to variational inequalities. In M. Fortin and R. Glowinski, editors, Augmented Lagrangian Methods: Applications to the Solution of Boundary Value Problems, chapter IX, pages 299-331. North-Holland, Amsterdam, 1983.
24. A.A. Goldstein. Convex programming in Hilbert spaces. Bull. Amer. Math. Soc., 70:709-710, 1964.
25. T. Goldstein and S. Osher. The split Bregman method for L1-regularized problems. SIAM Journal on Imaging Sciences, 2(2):323-343, 2009.
26. O. Güler. On the convergence of the proximal point algorithm for convex minimization. SIAM Journal on Control and Optimization, 29:403-419, 1991.
27. B. He and X. Yuan. Convergence analysis of primal-dual algorithms for a saddle-point problem: From contraction perspective. SIAM Journal on Imaging Sciences, 5(1):119-149, 2012.
28. E.S. Levitin and B.T. Polyak. Constrained minimization methods. U.S.S.R. Comput. Math. Math. Phys., 6(5):1-50, 1966.
29. P. L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis, 16(6):964-979, 1979.
30. B. Martinet. Brève communication. Régularisation d'inéquations variationnelles par approximations successives. ESAIM: Mathematical Modelling and Numerical Analysis - Modélisation Mathématique et Analyse Numérique, 4(R3):154-158, 1970.
31. G. J. Minty. Monotone (nonlinear) operators in Hilbert space. Duke Mathematical Journal, 29:341-346, 1962.
32. J. J. Moreau. Proximité et dualité dans un espace Hilbertien. Bull. Soc. Math. France, 93:273-299, 1965.
33. A. Moudafi and M. Oliny. Convergence of a splitting inertial proximal method for monotone operators. Journal of Computational and Applied Mathematics, 155:447-454, 2003.
34. Yu. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2). Dokl. Akad. Nauk SSSR, 269(3):543-547, 1983.
35. Yu. Nesterov. Introductory lectures on convex optimization, volume 87 of Applied Optimization. Kluwer Academic Publishers, Boston, MA, 2004. A basic course.
36. Yu. Nesterov. Smooth minimization of non-smooth functions. Math. Program., 103(1):127-152, 2005.
37. Yu. Nesterov. Gradient methods for minimizing composite functions. Mathematical Programming, 140(1):125-161, 2013.
38. Z. Opial. Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bulletin of the American Mathematical Society, 73:591-597, 1967.
39. G. B. Passty. Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. Journal of Mathematical Analysis and Applications, 72:383-390, 1979.
40. D. W. Peaceman and H. H. Rachford. The numerical solution of parabolic and elliptic differential equations. J. Soc. Indust. Appl. Math., 3(1):28-41, 1955.
41. J.-C. Pesquet and N. Pustelnik. A parallel inertial proximal optimization method. Pacific Journal of Optimization, 8(2):273-305, Apr 2012.
42. T. Pock and A. Chambolle. Diagonal preconditioning for first order primal-dual algorithms. In International Conference of Computer Vision (ICCV 2011), pages 1762-1769, 2011.
43. T. Pock, D. Cremers, H. Bischof, and A. Chambolle. An algorithm for minimizing the Mumford-Shah functional. In ICCV Proceedings, LNCS. Springer, 2009.
44. B. T. Polyak. Some methods of speeding up the convergence of iteration methods. U.S.S.R. Comput. Math. Math. Phys., 4(5):1-17, 1964.
45. H. Raguet, J. Fadili, and G. Peyré. A generalized forward-backward splitting. SIAM Journal on Imaging Sciences, 6(3):1199-1226, 2013.
46. R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5):877-898, 1976.
47. P. Tseng. Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM Journal on Control and Optimization, 29(1):119-138, 1991.
48. P. Tseng. On accelerated proximal gradient methods for convex-concave optimization, 2008. Technical report.
49. B. Vũ. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Advances in Computational Mathematics, 38(3):667-681, 2013.
50. S. Villa, S. Salzo, L. Baldassarre, and A. Verri. Accelerated and inexact forward-backward algorithms. SIAM Journal on Optimization, 23(3):1607-1640, 2013.


More information

arxiv: v1 [math.oc] 12 Mar 2013

arxiv: v1 [math.oc] 12 Mar 2013 On the convergence rate improvement of a primal-dual splitting algorithm for solving monotone inclusion problems arxiv:303.875v [math.oc] Mar 03 Radu Ioan Boţ Ernö Robert Csetnek André Heinrich February

More information
