Convergence analysis for a primal-dual monotone + skew splitting algorithm with applications to total variation minimization

Convergence analysis for a primal-dual monotone + skew splitting algorithm with applications to total variation minimization

Radu Ioan Boţ, Christopher Hendrich

November 7, 2012

Abstract. In this paper we investigate the convergence behavior of a primal-dual splitting method for solving monotone inclusions involving mixtures of composite, Lipschitzian and parallel sum type operators proposed by Combettes and Pesquet in [7]. Firstly, in the particular case of convex minimization problems, we derive convergence rates for the sequence of objective function values by making use of conjugate duality techniques. Secondly, we propose for the general monotone inclusion problem two new schemes which accelerate the sequences of primal and/or dual iterates, provided strong monotonicity assumptions for some of the involved operators are fulfilled. Finally, we apply the theoretical achievements in the context of different types of image restoration problems solved via total variation regularization.

Keywords. splitting method, Fenchel duality, convergence statements, image processing

AMS subject classification. 90C25, 90C46, 47A52

1 Introduction and preliminaries

The last few years have shown a rising interest in solving structured nondifferentiable convex optimization problems within the framework of the theory of conjugate functions. Applications in fields like signal and image processing, location theory and supervised machine learning motivate these efforts. In this article we investigate and improve the convergence behavior of the primal-dual monotone + skew splitting method for solving monotone inclusions which was proposed by Combettes and Pesquet in [7], itself an extension of the algorithmic scheme from [4] obtained by also allowing Lipschitzian monotone operators and parallel sums in the problem formulation.
In the mentioned works, by means of a product space approach, the problem is reduced to the one of finding the zeros of the sum of a Lipschitzian monotone operator with a maximally monotone operator. The latter is solved

Faculty of Mathematics, Chemnitz University of Technology, D-09107 Chemnitz, Germany, e-mail: radu.bot@mathematik.tu-chemnitz.de. Research partially supported by DFG (German Research Foundation), project BO 2516/4-1.

Faculty of Mathematics, Chemnitz University of Technology, D-09107 Chemnitz, Germany, e-mail: christopher.hendrich@mathematik.tu-chemnitz.de. Research supported by a Graduate Fellowship of the Free State Saxony, Germany.

by using an error-tolerant version of Tseng's algorithm, which has forward-backward-forward characteristics and allows one to access the monotone Lipschitzian operators via explicit forward steps, while set-valued maximally monotone operators are processed via their resolvents. A notable advantage of this method is given both by its highly parallelizable character, as most of its steps can be executed independently, and by the fact that it allows one to process maximally monotone operators and bounded linear operators separately, whenever they occur in the form of precompositions in the problem formulation. Before coming to the description of the problem formulation and of the algorithm from [7], we introduce some preliminary notions and results which are needed throughout the paper.

We consider the real Hilbert spaces H and G_i, i = 1,...,m, endowed with the inner product ⟨·,·⟩ and associated norm ‖·‖ = √⟨·,·⟩, for which we use the same notation, respectively, as there is no risk of confusion. The symbols ⇀ and → denote weak and strong convergence, respectively. By R_{++} we denote the set of strictly positive real numbers, while the indicator function of a set C ⊆ H is δ_C : H → R̄ := R ∪ {±∞}, defined by δ_C(x) = 0 for x ∈ C and δ_C(x) = +∞, otherwise. For a function f : H → R̄ we denote by dom f := {x ∈ H : f(x) < +∞} its effective domain and call f proper if dom f ≠ ∅ and f(x) > −∞ for all x ∈ H. Let Γ(H) := {f : H → R̄ : f is proper, convex and lower semicontinuous}. The conjugate function of f is f* : H → R̄, f*(p) = sup{⟨p, x⟩ − f(x) : x ∈ H} for all p ∈ H and, if f ∈ Γ(H), then f* ∈ Γ(H), as well. The (convex) subdifferential of f : H → R̄ at x ∈ H is the set ∂f(x) = {p ∈ H : f(y) − f(x) ≥ ⟨p, y − x⟩ ∀y ∈ H}, if f(x) ∈ R, and is taken to be the empty set, otherwise. For a linear continuous operator L_i : H → G_i, the operator L_i* : G_i → H, defined via ⟨L_i x, y⟩ = ⟨x, L_i* y⟩ for all x ∈ H and all y ∈ G_i, denotes its adjoint operator, for i ∈ {1,...,m}.
Having two functions f, g : H → R̄, their infimal convolution is defined by f □ g : H → R̄, (f □ g)(x) = inf_{y∈H} {f(y) + g(x − y)} for all x ∈ H, being a convex function when f and g are convex. Let M : H → 2^H be a set-valued operator. We denote by gra M = {(x, u) ∈ H × H : u ∈ Mx} its graph and by ran M = {u ∈ H : ∃x ∈ H, u ∈ Mx} its range. The inverse operator of M is defined as M^{−1} : H → 2^H, M^{−1}(u) = {x ∈ H : u ∈ Mx}. The operator M is called monotone if ⟨x − y, u − v⟩ ≥ 0 for all (x, u), (y, v) ∈ gra M and it is called maximally monotone if there exists no monotone operator M' : H → 2^H such that gra M' properly contains gra M. The operator M is called ρ-strongly monotone, for ρ ∈ R_{++}, if M − ρ Id is monotone, i.e. ⟨x − y, u − v⟩ ≥ ρ‖x − y‖² for all (x, u), (y, v) ∈ gra M, where Id denotes the identity on H. The operator M : H → H is called ν-Lipschitzian for ν ∈ R_{++} if it is single-valued and fulfills ‖Mx − My‖ ≤ ν‖x − y‖ for all x, y ∈ H. The resolvent of a set-valued operator M : H → 2^H is J_M : H → 2^H, J_M = (Id + M)^{−1}. When M is maximally monotone, the resolvent is a single-valued, 1-Lipschitzian and maximally monotone operator. Moreover, when f ∈ Γ(H) and γ ∈ R_{++}, ∂(γf) is maximally monotone (cf. [9, Theorem 3.2.8]) and it holds J_{γ∂f} = (Id + γ∂f)^{−1} = Prox_{γf}. Here, Prox_{γf}(x) denotes the proximal point of parameter γ of

f at x ∈ H and it represents the unique optimal solution of the optimization problem

inf_{y∈H} { f(y) + (1/(2γ)) ‖y − x‖² }.  (1.1)

For a nonempty, convex and closed set C ⊆ H and γ ∈ R_{++} we have Prox_{γδ_C} = P_C, where P_C : H → C, P_C(x) = argmin_{z∈C} ‖x − z‖, denotes the projection operator onto C. Finally, the parallel sum of two set-valued operators M_1, M_2 : H → 2^H is defined as M_1 □ M_2 : H → 2^H, M_1 □ M_2 = (M_1^{−1} + M_2^{−1})^{−1}.

We can now formulate the monotone inclusion problem which we investigate in this paper (see [7]).

Problem 1.1. Consider the real Hilbert spaces H and G_i, i = 1,...,m, A : H → 2^H a maximally monotone operator and C : H → H a monotone and µ-Lipschitzian operator for some µ ∈ R_{++}. Furthermore, let z ∈ H and, for every i ∈ {1,...,m}, let r_i ∈ G_i, let B_i : G_i → 2^{G_i} be maximally monotone operators, let D_i : G_i → 2^{G_i} be monotone operators such that D_i^{−1} is ν_i-Lipschitzian for some ν_i ∈ R_{++}, and let L_i : H → G_i be a nonzero linear continuous operator. The problem is to solve the primal inclusion

find x ∈ H such that z ∈ Ax + Σ_{i=1}^m L_i*((B_i □ D_i)(L_i x − r_i)) + Cx,  (1.2)

together with the dual inclusion

find v_1 ∈ G_1, ..., v_m ∈ G_m such that (∃x ∈ H) z − Σ_{i=1}^m L_i* v_i ∈ Ax + Cx and v_i ∈ (B_i □ D_i)(L_i x − r_i), i = 1,...,m.  (1.3)

Throughout this paper we denote by G := G_1 × ... × G_m the Hilbert space equipped with the inner product

⟨(p_1,...,p_m), (q_1,...,q_m)⟩ = Σ_{i=1}^m ⟨p_i, q_i⟩ ∀(p_1,...,p_m), (q_1,...,q_m) ∈ G

and the associated norm ‖(p_1,...,p_m)‖ = √(Σ_{i=1}^m ‖p_i‖²) for all (p_1,...,p_m) ∈ G. We also introduce the nonzero linear continuous operator L : H → G, Lx = (L_1 x,...,L_m x), its adjoint being L* : G → H, L*v = Σ_{i=1}^m L_i* v_i. We say that (x, v_1,...,v_m) ∈ H × G is a primal-dual solution to Problem 1.1 if

z − Σ_{i=1}^m L_i* v_i ∈ Ax + Cx and v_i ∈ (B_i □ D_i)(L_i x − r_i), i = 1,...,m.  (1.4)

If (x, v_1,...,v_m) ∈ H × G is a primal-dual solution to Problem 1.1, then x is a solution to (1.2) and (v_1,...,v_m) is a solution to (1.3). Notice also that

x solves (1.2) ⟺ z ∈ Ax + Σ_{i=1}^m L_i*((B_i □ D_i)(L_i x − r_i)) + Cx
⟺ ∃v_1 ∈ G_1,...,v_m ∈ G_m such that z − Σ_{i=1}^m L_i* v_i ∈ Ax + Cx and v_i ∈ (B_i □ D_i)(L_i x − r_i), i = 1,...,m.

Thus, if x is a solution to (1.2), then there exists (v_1,...,v_m) ∈ G such that (x, v_1,...,v_m) is a primal-dual solution to Problem 1.1 and, if (v_1,...,v_m) is a solution to (1.3), then there exists x ∈ H such that (x, v_1,...,v_m) is a primal-dual solution to Problem 1.1. The next result provides the error-free variant of the primal-dual algorithm in [7] and the corresponding convergence statements, as given in [7, Theorem 3.1].

Theorem 1.1. For Problem 1.1 suppose that

z ∈ ran(A + Σ_{i=1}^m L_i*(B_i □ D_i)(L_i · − r_i) + C).

Let x_0 ∈ H and (v_{1,0},...,v_{m,0}) ∈ G, set

β = max{µ, ν_1,...,ν_m} + √(Σ_{i=1}^m ‖L_i‖²),

choose ε ∈ (0, 1/(β+1)) and (γ_n)_{n≥0} a sequence in [ε, (1−ε)/β] and set

(∀n ≥ 0)
  p_{1,n} = J_{γ_n A}(x_n − γ_n(Cx_n + Σ_{i=1}^m L_i* v_{i,n} − z))
  For i = 1,...,m
    p_{2,i,n} = J_{γ_n B_i^{−1}}(v_{i,n} + γ_n(L_i x_n − D_i^{−1} v_{i,n} − r_i))
    v_{i,n+1} = γ_n L_i(p_{1,n} − x_n) + γ_n(D_i^{−1} v_{i,n} − D_i^{−1} p_{2,i,n}) + p_{2,i,n}
  x_{n+1} = γ_n Σ_{i=1}^m L_i*(v_{i,n} − p_{2,i,n}) + γ_n(Cx_n − Cp_{1,n}) + p_{1,n}.  (1.5)

Then the following statements are true:

(i) Σ_{n∈N} ‖x_n − p_{1,n}‖² < +∞ and, for every i ∈ {1,...,m}, Σ_{n∈N} ‖v_{i,n} − p_{2,i,n}‖² < +∞.

(ii) There exists a primal-dual solution (x, v_1,...,v_m) ∈ H × G to Problem 1.1 such that the following hold:
  (a) x_n ⇀ x and p_{1,n} ⇀ x.
  (b) (∀i ∈ {1,...,m}) v_{i,n} ⇀ v_i and p_{2,i,n} ⇀ v_i.

In this paper we first consider Problem 1.1 in its particular formulation as a primal-dual pair of convex minimization problems, an approach which relies on the fact that the subdifferential of a proper, convex and lower semicontinuous function is maximally monotone, and show that the convergence rate of the sequence of objective function values at the iterates generated by (1.5) is of order O(1/n), where n ∈ N is the number of passed iterations. Further, in Section 3, we provide for the general monotone inclusion problem, as given in Problem 1.1, two new acceleration schemes which, under strong monotonicity assumptions, generate sequences of primal and/or dual iterates with improved convergence properties.
The feasibility of the proposed methods is explicitly shown in Section 4 by means of numerical experiments in the context of solving image denoising, image deblurring and image inpainting problems via total variation regularization.
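Since the building blocks of the method are resolvent (proximal) steps and explicit forward steps, both can be made concrete on a small example. The following Python sketch is purely illustrative and not part of the paper: it evaluates Prox_{γf} for f = |·| in closed form (soft thresholding) and runs a Tseng-type forward-backward-forward iteration, the mechanism underlying (1.5), on the one-dimensional toy inclusion 0 ∈ ∂|x| + (x − b), whose zero is the soft-thresholded point.

```python
def soft(t, g):
    # Prox_{g*|.|}(t) = J_{g*A}(t) for A = subdifferential of |.|:
    # the unique minimizer of |y| + (1/(2g)) * (y - t)**2 (soft thresholding).
    return max(t - g, 0.0) + min(t + g, 0.0)

def tseng_fbf(b, gamma=0.5, iters=200):
    """Forward-backward-forward iteration for 0 in A(x) + B(x), where
    A = d|.| is processed via its resolvent and B(x) = x - b (monotone and
    1-Lipschitz) via forward steps. The zero is soft(b, 1)."""
    x = 0.0
    for _ in range(iters):
        Bx = x - b                       # first forward step
        p = soft(x - gamma * Bx, gamma)  # backward (resolvent) step
        x = p + gamma * (Bx - (p - b))   # correcting forward step
    return x

print(soft(1.5, 0.5))   # 1.0
print(tseng_fbf(2.0))   # converges to soft(2, 1) = 1.0
```

The step size gamma = 0.5 respects the usual restriction gamma < 1/L for the 1-Lipschitz forward operator; the two forward evaluations per iteration are exactly what distinguishes this scheme from plain forward-backward splitting.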

One of the iterative schemes to which we compare our algorithms is likewise a primal-dual splitting method for solving highly structured monotone inclusions; it was provided by Vũ in [8]. There, instead of monotone Lipschitzian operators, cocoercive operators are used and, consequently, instead of Tseng's splitting, the forward-backward splitting method is employed. The primal-dual method due to Chambolle and Pock described in [6, Algorithm 1] is a particular instance of Vũ's algorithm.

2 Convex minimization problems

The aim of this section is to provide a rate of convergence for the sequence of the values of the objective function at the iterates generated by the algorithm (1.5) when solving a convex minimization problem and its conjugate dual. The primal-dual pair under investigation is described in the following.

Problem 2.1. Consider the real Hilbert spaces H and G_i, i = 1,...,m, f ∈ Γ(H) and h : H → R a convex and differentiable function with µ-Lipschitzian gradient for some µ ∈ R_{++}. Furthermore, let z ∈ H and, for every i ∈ {1,...,m}, let r_i ∈ G_i, g_i, l_i ∈ Γ(G_i) such that l_i is ν_i^{−1}-strongly convex for some ν_i ∈ R_{++}, and let L_i : H → G_i be a nonzero linear continuous operator. We consider the convex minimization problem

(P) inf_{x∈H} { f(x) + Σ_{i=1}^m (g_i □ l_i)(L_i x − r_i) + h(x) − ⟨x, z⟩ }  (2.1)

and its dual problem

(D) sup_{(v_1,...,v_m)∈G_1×...×G_m} { −(f* □ h*)(z − Σ_{i=1}^m L_i* v_i) − Σ_{i=1}^m (g_i*(v_i) + l_i*(v_i) + ⟨v_i, r_i⟩) }.  (2.2)

In order to investigate the primal-dual pair (2.1)-(2.2) in the context of Problem 1.1, one has to take A = ∂f, C = ∇h and, for i = 1,...,m, B_i = ∂g_i and D_i = ∂l_i. Then A and B_i, i = 1,...,m, are maximally monotone, C is monotone, by [1, Proposition 17.10], and D_i^{−1} = ∇l_i* is monotone and ν_i-Lipschitz continuous for i = 1,...,m, according to [1, Proposition 17.10, Theorem 18.15 and Corollary 16.24].
One can easily see (cf., for instance, [7, Theorem 4.2]) that, whenever (x, v_1,...,v_m) ∈ H × G is a primal-dual solution to Problem 1.1 with the above choice of the involved operators, x is an optimal solution to (P), (v_1,...,v_m) is an optimal solution to (D) and for (P)-(D) strong duality holds, thus the optimal objective values of the two problems coincide. The primal-dual pair in Problem 2.1 captures various different types of optimization problems. One such particular instance is formulated as follows; for more examples we refer to [7].

Example 2.1. In Problem 2.1 take z = 0, let l_i : G_i → R̄, l_i = δ_{{0}} and r_i = 0 for i = 1,...,m, and set h : H → R, h(x) = 0 for all x ∈ H. Then (2.1) reduces to

(P) inf_{x∈H} { f(x) + Σ_{i=1}^m g_i(L_i x) },

while the dual problem (2.2) becomes

(D) sup_{(v_1,...,v_m)∈G_1×...×G_m} { −f*(−Σ_{i=1}^m L_i* v_i) − Σ_{i=1}^m g_i*(v_i) }.

In order to simplify the upcoming formulations and calculations we introduce the following more compact notations. With respect to Problem 2.1, let F : H → R̄, F(x) = f(x) + h(x) − ⟨x, z⟩. Then dom F = dom f and its conjugate F* : H → R̄ is given by F*(p) = (f + h)*(z + p) = (f* □ h*)(z + p), since dom h = H. Further, we set v = (v_1,...,v_m), v_n = (v_{1,n},...,v_{m,n}), p_{2,n} = (p_{2,1,n},...,p_{2,m,n}) and r = (r_1,...,r_m). We define the function G : G → R̄, G(y) = Σ_{i=1}^m (g_i □ l_i)(y_i) and observe that its conjugate G* : G → R̄ is given by G*(v) = Σ_{i=1}^m (g_i □ l_i)*(v_i) = Σ_{i=1}^m (g_i* + l_i*)(v_i). Notice that, as l_i*, i = 1,...,m, has full domain (cf. [1, Theorem 18.15]), we get

dom G* = (dom g_1* ∩ dom l_1*) × ... × (dom g_m* ∩ dom l_m*) = dom g_1* × ... × dom g_m*.  (2.3)

The primal and the dual optimization problems given in Problem 2.1 can be equivalently represented as

(P) inf_{x∈H} { F(x) + G(Lx − r) }

and, respectively,

(D) sup_{v∈G} { −F*(−L*v) − G*(v) − ⟨v, r⟩ }.

Then x ∈ H solves (P), v ∈ G solves (D) and for (P)-(D) strong duality holds if and only if (cf. [2, 3])

−L*v ∈ ∂F(x) and Lx − r ∈ ∂G*(v).  (2.4)

Let us also mention that for x̄ ∈ H and v̄ ∈ G fulfilling (2.4) it holds

[⟨Lx − r, v̄⟩ + F(x) − G*(v̄)] − [⟨Lx̄ − r, v⟩ + F(x̄) − G*(v)] ≥ 0 ∀x ∈ H ∀v ∈ G.

For given sets B_1 ⊆ H and B_2 ⊆ G we introduce the so-called primal-dual gap function

G_{B_1×B_2}(x, v) = sup_{ṽ∈B_2} {⟨Lx − r, ṽ⟩ + F(x) − G*(ṽ)} − inf_{x̃∈B_1} {⟨Lx̃ − r, v⟩ + F(x̃) − G*(v)}.  (2.5)

We consider the following algorithm for solving (P)-(D), which differs from the one given in Theorem 1.1 by the fact that we ask the sequence (γ_n)_{n≥0} ⊆ R_{++} to be nondecreasing.

Algorithm 2.1. Let x_0 ∈ H and (v_{1,0},...,v_{m,0}) ∈ G, set

β = max{µ, ν_1,...,ν_m} + √(Σ_{i=1}^m ‖L_i‖²),

choose ε ∈ (0, 1/(β+1)) and (γ_n)_{n≥0} a nondecreasing sequence in [ε, (1−ε)/β] and set

(∀n ≥ 0)
  p_{1,n} = Prox_{γ_n f}(x_n − γ_n(∇h(x_n) + Σ_{i=1}^m L_i* v_{i,n} − z))
  For i = 1,...,m
    p_{2,i,n} = Prox_{γ_n g_i*}(v_{i,n} + γ_n(L_i x_n − ∇l_i*(v_{i,n}) − r_i))
    v_{i,n+1} = γ_n L_i(p_{1,n} − x_n) + γ_n(∇l_i*(v_{i,n}) − ∇l_i*(p_{2,i,n})) + p_{2,i,n}
  x_{n+1} = γ_n Σ_{i=1}^m L_i*(v_{i,n} − p_{2,i,n}) + γ_n(∇h(x_n) − ∇h(p_{1,n})) + p_{1,n}.  (2.6)

Theorem 2.1. For Problem 2.1 suppose that

z ∈ ran(∂f + Σ_{i=1}^m L_i*(∂g_i □ ∂l_i)(L_i · − r_i) + ∇h).

Then there exist an optimal solution x ∈ H to (P) and an optimal solution (v_1,...,v_m) ∈ G to (D) such that the following hold for the sequences generated by Algorithm 2.1:

(a) z − Σ_{i=1}^m L_i* v_i ∈ ∂f(x) + ∇h(x) and L_i x − r_i ∈ ∂g_i*(v_i) + ∇l_i*(v_i) ∀i ∈ {1,...,m}.

(b) x_n ⇀ x, p_{1,n} ⇀ x and v_{i,n} ⇀ v_i, p_{2,i,n} ⇀ v_i ∀i ∈ {1,...,m}.

(c) For n ≥ 0 it holds

‖x_n − x‖²/(2γ_n) + Σ_{i=1}^m ‖v_{i,n} − v_i‖²/(2γ_n) ≤ ‖x_0 − x‖²/(2γ_0) + Σ_{i=1}^m ‖v_{i,0} − v_i‖²/(2γ_0).

(d) If B_1 ⊆ H and B_2 ⊆ G are bounded, then for x^N := (1/N) Σ_{n=0}^{N−1} p_{1,n} and v_i^N := (1/N) Σ_{n=0}^{N−1} p_{2,i,n}, i = 1,...,m, the primal-dual gap has the upper bound

G_{B_1×B_2}(x^N, v_1^N,...,v_m^N) ≤ C(B_1, B_2)/N,  (2.7)

where

C(B_1, B_2) = sup_{(x,v_1,...,v_m)∈B_1×B_2} { ‖x_0 − x‖²/(2γ_0) + Σ_{i=1}^m ‖v_{i,0} − v_i‖²/(2γ_0) }.

(e) The sequence (x^N, v_1^N,...,v_m^N) converges weakly to (x, v_1,...,v_m).

Proof. Theorem 4.2 in [7] guarantees the existence of an optimal solution x ∈ H to (2.1) and of an optimal solution (v_1,...,v_m) ∈ G to (2.2) such that strong duality holds, x_n ⇀ x, p_{1,n} ⇀ x, as well as v_{i,n} ⇀ v_i and p_{2,i,n} ⇀ v_i for i = 1,...,m, as n → +∞. Hence (a) and (b) are true. Thus, the solutions x and v = (v_1,...,v_m) fulfill (2.4).

Regarding the sequences (p_{1,n})_{n≥0} and (p_{2,i,n})_{n≥0}, i = 1,...,m, generated in Algorithm 2.1, we have for every n ≥ 0

p_{1,n} = (Id + γ_n ∂f)^{−1}(x_n − γ_n(∇h(x_n) + L*v_n − z))
⟺ (x_n − p_{1,n})/γ_n − ∇h(x_n) − L*v_n + z ∈ ∂f(p_{1,n})

and, for i = 1,...,m,

p_{2,i,n} = (Id + γ_n ∂g_i*)^{−1}(v_{i,n} + γ_n(L_i x_n − ∇l_i*(v_{i,n}) − r_i))
⟺ (v_{i,n} − p_{2,i,n})/γ_n + L_i x_n − ∇l_i*(v_{i,n}) − r_i ∈ ∂g_i*(p_{2,i,n}).

In other words, it holds for every n ≥ 0

f(x) ≥ f(p_{1,n}) + ⟨(x_n − p_{1,n})/γ_n − ∇h(x_n) − L*v_n + z, x − p_{1,n}⟩ ∀x ∈ H  (2.8)

and, for i = 1,...,m,

g_i*(v_i) ≥ g_i*(p_{2,i,n}) + ⟨(v_{i,n} − p_{2,i,n})/γ_n + L_i x_n − ∇l_i*(v_{i,n}) − r_i, v_i − p_{2,i,n}⟩ ∀v_i ∈ G_i.  (2.9)

In addition to that, using that h and l_i*, i = 1,...,m, are convex and differentiable, it holds for every n ≥ 0

h(x) ≥ h(p_{1,n}) + ⟨∇h(p_{1,n}), x − p_{1,n}⟩ ∀x ∈ H  (2.10)

and, for i = 1,...,m,

l_i*(v_i) ≥ l_i*(p_{2,i,n}) + ⟨∇l_i*(p_{2,i,n}), v_i − p_{2,i,n}⟩ ∀v_i ∈ G_i.  (2.11)

Consider arbitrary x ∈ H and v = (v_1,...,v_m) ∈ G. Since

⟨(x_n − p_{1,n})/γ_n, x − p_{1,n}⟩ = ‖x_n − p_{1,n}‖²/(2γ_n) + ‖x − p_{1,n}‖²/(2γ_n) − ‖x_n − x‖²/(2γ_n),
⟨(v_{i,n} − p_{2,i,n})/γ_n, v_i − p_{2,i,n}⟩ = ‖v_{i,n} − p_{2,i,n}‖²/(2γ_n) + ‖v_i − p_{2,i,n}‖²/(2γ_n) − ‖v_{i,n} − v_i‖²/(2γ_n), i = 1,...,m,

we obtain for every n ≥ 0, by using the more compact notation for the elements in G and by summing up the inequalities (2.8)–(2.11),

‖x_n − x‖²/(2γ_n) + ‖v_n − v‖²/(2γ_n) ≥ ‖x_n − p_{1,n}‖²/(2γ_n) + ‖x − p_{1,n}‖²/(2γ_n) + ‖v_n − p_{2,n}‖²/(2γ_n) + ‖v − p_{2,n}‖²/(2γ_n)
+ Σ_{i=1}^m ⟨L_i x_n + ∇l_i*(p_{2,i,n}) − ∇l_i*(v_{i,n}) − r_i, v_i − p_{2,i,n}⟩ − Σ_{i=1}^m (g_i* + l_i*)(v_i) + (f + h)(p_{1,n})
+ ⟨∇h(p_{1,n}) − ∇h(x_n) − L*v_n + z, x − p_{1,n}⟩ + Σ_{i=1}^m (g_i* + l_i*)(p_{2,i,n}) − (f + h)(x).

Further, using again the update rules in Algorithm 2.1 and the equations

⟨(p_{1,n} − x_{n+1})/γ_n, x − p_{1,n}⟩ = ‖x_{n+1} − x‖²/(2γ_n) − ‖x_{n+1} − p_{1,n}‖²/(2γ_n) − ‖x − p_{1,n}‖²/(2γ_n)

and, for i = 1,...,m,

⟨(p_{2,i,n} − v_{i,n+1})/γ_n, v_i − p_{2,i,n}⟩ = ‖v_{i,n+1} − v_i‖²/(2γ_n) − ‖v_{i,n+1} − p_{2,i,n}‖²/(2γ_n) − ‖v_i − p_{2,i,n}‖²/(2γ_n),

we obtain for every n ≥ 0

‖x_n − x‖²/(2γ_n) + ‖v_n − v‖²/(2γ_n) ≥ ‖x_{n+1} − x‖²/(2γ_n) + ‖v_{n+1} − v‖²/(2γ_n) + ‖x_n − p_{1,n}‖²/(2γ_n) + ‖v_n − p_{2,n}‖²/(2γ_n)
− ‖x_{n+1} − p_{1,n}‖²/(2γ_n) − ‖v_{n+1} − p_{2,n}‖²/(2γ_n) + [⟨Lp_{1,n} − r, v⟩ − G*(v) + F(p_{1,n})] − [⟨Lx − r, p_{2,n}⟩ − G*(p_{2,n}) + F(x)].  (2.12)

Further, we equip the Hilbert space H × G with the inner product

⟨(y, p), (z, q)⟩ = ⟨y, z⟩ + ⟨p, q⟩ ∀(y, p), (z, q) ∈ H × G  (2.13)

and the associated norm ‖(y, p)‖ = √(‖y‖² + ‖p‖²) for every (y, p) ∈ H × G. For every n ≥ 0 it holds

‖x_{n+1} − p_{1,n}‖²/(2γ_n) + ‖v_{n+1} − p_{2,n}‖²/(2γ_n) = ‖(x_{n+1}, v_{n+1}) − (p_{1,n}, p_{2,n})‖²/(2γ_n)

and, consequently, by making use of the Lipschitz continuity of ∇h and ∇l_i*, i = 1,...,m, it shows that

‖(x_{n+1}, v_{n+1}) − (p_{1,n}, p_{2,n})‖
= γ_n ‖(Σ_{i=1}^m L_i*(v_{i,n} − p_{2,i,n}) + ∇h(x_n) − ∇h(p_{1,n}), L_1(p_{1,n} − x_n) + ∇l_1*(v_{1,n}) − ∇l_1*(p_{2,1,n}), ..., L_m(p_{1,n} − x_n) + ∇l_m*(v_{m,n}) − ∇l_m*(p_{2,m,n}))‖
≤ γ_n √( ‖Σ_{i=1}^m L_i*(v_{i,n} − p_{2,i,n})‖² + Σ_{i=1}^m ‖L_i(p_{1,n} − x_n)‖² ) + γ_n √( ‖∇h(x_n) − ∇h(p_{1,n})‖² + Σ_{i=1}^m ‖∇l_i*(v_{i,n}) − ∇l_i*(p_{2,i,n})‖² )
≤ γ_n √( (Σ_{i=1}^m ‖L_i‖²)(‖v_n − p_{2,n}‖² + ‖p_{1,n} − x_n‖²) ) + γ_n √( µ²‖x_n − p_{1,n}‖² + Σ_{i=1}^m ν_i²‖v_{i,n} − p_{2,i,n}‖² )

≤ γ_n (√(Σ_{i=1}^m ‖L_i‖²) + max{µ, ν_1,...,ν_m}) ‖(x_n, v_n) − (p_{1,n}, p_{2,n})‖.  (2.14)

Hence, by taking into consideration the way in which (γ_n)_{n≥0} is chosen, we have for every n ≥ 0

(1/(2γ_n)) [ ‖x_n − p_{1,n}‖² + ‖v_n − p_{2,n}‖² − ‖x_{n+1} − p_{1,n}‖² − ‖v_{n+1} − p_{2,n}‖² ]
≥ (1/(2γ_n)) (1 − γ_n²(√(Σ_{i=1}^m ‖L_i‖²) + max{µ, ν_1,...,ν_m})²) ‖(x_n, v_n) − (p_{1,n}, p_{2,n})‖² ≥ 0

and, consequently, (2.12) reduces to

‖x_n − x‖²/(2γ_n) + ‖v_n − v‖²/(2γ_n) ≥ ‖x_{n+1} − x‖²/(2γ_{n+1}) + ‖v_{n+1} − v‖²/(2γ_{n+1}) + [⟨Lp_{1,n} − r, v⟩ − G*(v) + F(p_{1,n})] − [⟨Lx − r, p_{2,n}⟩ − G*(p_{2,n}) + F(x)].  (2.15)

Let N ≥ 1 be an arbitrary natural number. Summing the above inequality up from n = 0 to N − 1 and using the fact that (γ_n)_{n≥0} is nondecreasing, it follows that

‖x_0 − x‖²/(2γ_0) + ‖v_0 − v‖²/(2γ_0) ≥ ‖x_N − x‖²/(2γ_N) + ‖v_N − v‖²/(2γ_N) + Σ_{n=0}^{N−1} [⟨Lp_{1,n} − r, v⟩ − G*(v) + F(p_{1,n})] − Σ_{n=0}^{N−1} [⟨Lx − r, p_{2,n}⟩ − G*(p_{2,n}) + F(x)].

Taking in the above estimate x and v to be the solutions from statements (a) and (b), which fulfill (2.4), we obtain

Σ_{n=0}^{N−1} [⟨Lp_{1,n} − r, v⟩ − G*(v) + F(p_{1,n})] − Σ_{n=0}^{N−1} [⟨Lx − r, p_{2,n}⟩ − G*(p_{2,n}) + F(x)] ≥ 0.

Consequently,

‖x_0 − x‖²/(2γ_0) + ‖v_0 − v‖²/(2γ_0) ≥ ‖x_N − x‖²/(2γ_N) + ‖v_N − v‖²/(2γ_N)

and statement (c) follows. On the other hand, dividing the summed estimate by N, using the convexity of F and G*, and denoting x^N := (1/N) Σ_{n=0}^{N−1} p_{1,n} and v_i^N := (1/N) Σ_{n=0}^{N−1} p_{2,i,n}, i = 1,...,m, we obtain

(1/N)(‖x_0 − x‖²/(2γ_0) + ‖v_0 − v‖²/(2γ_0)) ≥ [⟨Lx^N − r, v⟩ − G*(v) + F(x^N)] − [⟨Lx − r, v^N⟩ − G*(v^N) + F(x)],

which shows (2.7) when passing to the supremum over x ∈ B_1 and v ∈ B_2. In this way statement (d) is verified. The weak convergence of (x^N, v^N) to (x, v) as N → +∞ is an easy consequence of the Stolz–Cesàro Theorem, which shows (e).

Remark 2.1. In the situation when the functions g_i are Lipschitz continuous on G_i, i = 1,...,m, inequality (2.7) provides for the sequence of the values of the objective of (P) taken at (x^N)_{N≥1} a convergence rate of O(1/N), namely, it holds

F(x^N) + G(Lx^N − r) − F(x) − G(Lx − r) ≤ C(B_1, B_2)/N ∀N ≥ 1.  (2.16)

Indeed, due to statement (b) of the previous theorem, the sequence (p_{1,n})_{n≥0} ⊆ H is bounded and one can take as B_1 ⊆ H a bounded, convex and closed set containing this sequence. Obviously, x^N ∈ B_1. On the other hand, we take B_2 = dom g_1* × ... × dom g_m*, which is in this situation a bounded set. Then it holds, using the Fenchel–Moreau Theorem and the Young–Fenchel inequality, that

G_{B_1×B_2}(x^N, v^N) = F(x^N) + G(Lx^N − r) − inf_{x̃∈B_1} {⟨Lx̃ − r, v^N⟩ + F(x̃) − G*(v^N)}
≥ F(x^N) + G(Lx^N − r) + G*(v^N) − ⟨Lx − r, v^N⟩ − F(x)
≥ F(x^N) + G(Lx^N − r) − F(x) − G(Lx − r).

Hence, (2.16) follows from statement (d) in Theorem 2.1. In a similar way, one can show that, whenever f is Lipschitz continuous, (2.7) provides for the sequence of the values of the objective of (D) taken at (v^N)_{N≥1} a convergence rate of O(1/N).

Remark 2.2. If G_i, i = 1,...,m, are finite-dimensional real Hilbert spaces, then (2.16) remains true even under the weaker assumption that the convex functions g_i, i = 1,...,m, have full domain, without necessarily being Lipschitz continuous. The set B_1 ⊆ H can be chosen as in Remark 2.1, but this time we take B_2 = Π_{i=1}^m ∪_{n≥0} ∂g_i(L_i p_{1,n}) ⊆ G, by noticing also that the functions g_i, i = 1,...,m, are everywhere subdifferentiable. The set B_2 is bounded, as for every i = 1,...,m the set ∪_{n≥0} ∂g_i(L_i p_{1,n}) is bounded. Indeed, let i ∈ {1,...,m} be fixed. As p_{1,n} → x, it follows that L_i p_{1,n} → L_i x. Using the fact that the subdifferential of g_i is a locally bounded operator at L_i x, the boundedness of ∪_{n≥0} ∂g_i(L_i p_{1,n}) follows automatically. For this choice of the sets B_1 and B_2, by using the same arguments as in the previous remark, it follows that (2.16) is true.
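To make the rate statement concrete, the following self-contained Python sketch (illustrative only, not part of the paper; the instance and all numerical choices are ours) runs iteration (2.6) for the simplest case m = 1, H = G_1 = R with f(x) = (x − b)²/2, g_1 = |·|, L_1 = Id, h ≡ 0, l_1 = δ_{0}, r_1 = 0, z = 0, i.e. for min_x (x − b)²/2 + |x|, whose solution is the soft-thresholded point soft(b, 1); it also tracks the ergodic primal average x^N from Theorem 2.1 (d).

```python
def algorithm_2_1(b, gamma=0.5, iters=2000):
    """Iteration (2.6) for min_x 0.5*(x - b)**2 + |x| (m = 1, H = G = R).
    Prox of gamma*f is affine in this case; Prox of gamma*g* is the
    projection onto [-1, 1], since g* is the indicator of that interval."""
    x, v = 0.0, 0.0
    p1_avg = 0.0
    for n in range(iters):
        p1 = (x - gamma * v + gamma * b) / (1.0 + gamma)  # Prox_{gamma f}
        p2 = min(max(v + gamma * x, -1.0), 1.0)           # Prox_{gamma g*}
        x_next = gamma * (v - p2) + p1                    # primal update
        v = gamma * (p1 - x) + p2                         # dual update
        x = x_next
        p1_avg += (p1 - p1_avg) / (n + 1)  # running ergodic average x^N
    return x, p1_avg

x_last, x_avg = algorithm_2_1(2.0)
obj = lambda t: 0.5 * (t - 2.0) ** 2 + abs(t)
print(x_last, x_avg)          # both close to the solution 1.0
print(obj(x_avg) - obj(1.0))  # objective gap at the ergodic average
```

The constant step size gamma = 0.5 is admissible here since, with h = 0 and l = δ_{0}, the bound β in Algorithm 2.1 is essentially ‖L‖ = 1; the objective gap at x^N decays like O(1/N), in line with Remark 2.1.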
3 Zeros of sums of monotone operators

In this section we turn our attention to the primal-dual monotone inclusion problems formulated in Problem 1.1 with the aim of providing accelerations of the iterative method described in Theorem 1.1 under additional strong monotonicity assumptions.

3.1 The case when A + C is strongly monotone

We focus first on the case when A + C is ρ-strongly monotone for some ρ ∈ R_{++} and investigate the impact of this assumption on the convergence rate of the sequence of primal iterates. The condition that A + C is ρ-strongly monotone is fulfilled when either A : H → 2^H or C : H → H is ρ-strongly monotone. In case A is ρ_1-strongly monotone and C is ρ_2-strongly monotone, the sum A + C is ρ-strongly monotone with ρ = ρ_1 + ρ_2.

Remark 3.1. The situation when B_i^{−1} + D_i^{−1} is τ_i-strongly monotone with τ_i ∈ R_{++} for i = 1,...,m, which improves the convergence rate of the sequence of dual iterates, can be handled with appropriate modifications. Due to technical reasons we assume in the following that the operators D_i^{−1} in Problem 1.1 are zero for i = 1,...,m, thus D_i(0) = G_i and D_i(x) = ∅ for x ≠ 0, for i = 1,...,m. In Remark 3.2 we show how the results given in this particular context can be employed when treating the primal-dual pair of monotone inclusions (1.2)-(1.3). Consequently, the problem we deal with in this subsection is as follows.

Problem 3.1. Consider the real Hilbert spaces H and G_i, i = 1,...,m, A : H → 2^H a maximally monotone operator and C : H → H a monotone and µ-Lipschitzian operator for some µ ∈ R_{++}. Furthermore, let z ∈ H and, for every i ∈ {1,...,m}, let r_i ∈ G_i, let B_i : G_i → 2^{G_i} be maximally monotone operators and let L_i : H → G_i be a nonzero linear continuous operator. The problem is to solve the primal inclusion

find x ∈ H such that z ∈ Ax + Σ_{i=1}^m L_i* B_i(L_i x − r_i) + Cx,  (3.1)

together with the dual inclusion

find v_1 ∈ G_1, ..., v_m ∈ G_m such that (∃x ∈ H) z − Σ_{i=1}^m L_i* v_i ∈ Ax + Cx and v_i ∈ B_i(L_i x − r_i), i = 1,...,m.  (3.2)

The subsequent algorithm represents an accelerated version of the one given in Theorem 1.1 and relies on the fruitful idea of using a second sequence of variable step length parameters (σ_n)_{n≥0} ⊆ R_{++}, which, together with the sequence of parameters (γ_n)_{n≥0} ⊆ R_{++}, plays an important role in the convergence analysis.

Algorithm 3.1. Let x_0 ∈ H, (v_{1,0},...,v_{m,0}) ∈ G,

γ_0 ∈ (0, min{1, √(1+4ρ)/(2(1+2ρ)µ)}) and set σ_0 = 1/(2γ_0(1+2ρ) Σ_{i=1}^m ‖L_i‖²).

Consider the following updates:

(∀n ≥ 0)
  p_{1,n} = J_{γ_n A}(x_n − γ_n(Cx_n + Σ_{i=1}^m L_i* v_{i,n} − z))
  For i = 1,...,m
    p_{2,i,n} = J_{σ_n B_i^{−1}}(v_{i,n} + σ_n(L_i x_n − r_i))
    v_{i,n+1} = σ_n L_i(p_{1,n} − x_n) + p_{2,i,n}
  x_{n+1} = γ_n Σ_{i=1}^m L_i*(v_{i,n} − p_{2,i,n}) + γ_n(Cx_n − Cp_{1,n}) + p_{1,n}
  θ_n = 1/√(1 + 2ργ_n(1 − γ_n)), γ_{n+1} = θ_n γ_n, σ_{n+1} = σ_n/θ_n.  (3.3)

Theorem 3.1. In Problem 3.1
suppose that A + C is ρ-strongly monotone with ρ ∈ R_{++} and let (x, v_1,...,v_m) ∈ H × G be a primal-dual solution to Problem 3.1. Then for every n ≥ 0 it holds

‖x_n − x‖² + (γ_n/σ_n) Σ_{i=1}^m ‖v_{i,n} − v_i‖² ≤ γ_n² ( ‖x_0 − x‖²/γ_0² + Σ_{i=1}^m ‖v_{i,0} − v_i‖²/(γ_0 σ_0) ),  (3.4)

where γ_n, σ_n ∈ R_{++}, x_n ∈ H and (v_{1,n},...,v_{m,n}) ∈ G are the iterates generated by Algorithm 3.1.

Proof. Taking into account the definitions of the resolvents occurring in Algorithm 3.1 we obtain

(x_n − p_{1,n})/γ_n − Cx_n − Σ_{i=1}^m L_i* v_{i,n} + z ∈ Ap_{1,n}

and

(v_{i,n} − p_{2,i,n})/σ_n + L_i x_n − r_i ∈ B_i^{−1} p_{2,i,n}, i = 1,...,m,

which, in the light of the updating rules in (3.3), furnishes for every n ≥ 0

(x_n − x_{n+1})/γ_n − Σ_{i=1}^m L_i* p_{2,i,n} + z ∈ (A + C)p_{1,n}

and

(v_{i,n} − v_{i,n+1})/σ_n + L_i p_{1,n} − r_i ∈ B_i^{−1} p_{2,i,n}, i = 1,...,m.  (3.5)

The primal-dual solution (x, v_1,...,v_m) ∈ H × G to Problem 3.1 fulfills (see (1.4), where the D_i are taken to be zero for i = 1,...,m)

z − Σ_{i=1}^m L_i* v_i ∈ Ax + Cx and v_i ∈ B_i(L_i x − r_i), i = 1,...,m.

Since the sum A + C is ρ-strongly monotone, we have for every n ≥ 0

⟨p_{1,n} − x, (x_n − x_{n+1})/γ_n − Σ_{i=1}^m L_i* p_{2,i,n} + z − (z − Σ_{i=1}^m L_i* v_i)⟩ ≥ ρ‖p_{1,n} − x‖²,  (3.6)

while, due to the monotonicity of B_i^{−1} : G_i → 2^{G_i}, we obtain for every n ≥ 0

⟨p_{2,i,n} − v_i, (v_{i,n} − v_{i,n+1})/σ_n + L_i p_{1,n} − r_i − (L_i x − r_i)⟩ ≥ 0, i = 1,...,m.  (3.7)

Further, we set v = (v_1,...,v_m), v_n = (v_{1,n},...,v_{m,n}) and p_{2,n} = (p_{2,1,n},...,p_{2,m,n}). Summing up the inequalities (3.6) and (3.7), it follows that

⟨p_{1,n} − x, (x_n − x_{n+1})/γ_n⟩ + ⟨p_{2,n} − v, (v_n − v_{n+1})/σ_n⟩ + ⟨p_{1,n} − x, L*(v − p_{2,n})⟩ + ⟨p_{2,n} − v, L(p_{1,n} − x)⟩ ≥ ρ‖p_{1,n} − x‖²  (3.8)

and, from here,

⟨p_{1,n} − x, (x_n − x_{n+1})/γ_n⟩ + ⟨p_{2,n} − v, (v_n − v_{n+1})/σ_n⟩ ≥ ρ‖p_{1,n} − x‖² ∀n ≥ 0.  (3.9)

In the light of the equations

⟨p_{1,n} − x, (x_n − x_{n+1})/γ_n⟩ = ⟨p_{1,n} − x_{n+1}, (x_n − x_{n+1})/γ_n⟩ + ⟨x_{n+1} − x, (x_n − x_{n+1})/γ_n⟩
= ‖x_{n+1} − p_{1,n}‖²/(2γ_n) − ‖x_n − p_{1,n}‖²/(2γ_n) + ‖x_n − x‖²/(2γ_n) − ‖x_{n+1} − x‖²/(2γ_n)

and

$\left\langle p_{2,n} - \bar{v}, \frac{v_n - v_{n+1}}{\sigma_n} \right\rangle = \left\langle p_{2,n} - v_{n+1}, \frac{v_n - v_{n+1}}{\sigma_n} \right\rangle + \left\langle v_{n+1} - \bar{v}, \frac{v_n - v_{n+1}}{\sigma_n} \right\rangle = \frac{\|v_{n+1} - p_{2,n}\|^2}{2\sigma_n} - \frac{\|v_n - p_{2,n}\|^2}{2\sigma_n} + \frac{\|v_n - \bar{v}\|^2}{2\sigma_n} - \frac{\|v_{n+1} - \bar{v}\|^2}{2\sigma_n},$

inequality (3.9) reads for every $n \geq 0$

$\frac{\|x_n - \bar{x}\|^2}{2\gamma_n} + \frac{\|v_n - \bar{v}\|^2}{2\sigma_n} \geq \rho\|p_{1,n} - \bar{x}\|^2 + \frac{\|x_{n+1} - \bar{x}\|^2}{2\gamma_n} + \frac{\|v_{n+1} - \bar{v}\|^2}{2\sigma_n} + \frac{\|x_n - p_{1,n}\|^2}{2\gamma_n} + \frac{\|v_n - p_{2,n}\|^2}{2\sigma_n} - \frac{\|x_{n+1} - p_{1,n}\|^2}{2\gamma_n} - \frac{\|v_{n+1} - p_{2,n}\|^2}{2\sigma_n}.$   (3.10)

Using that $2ab \leq \alpha a^2 + \frac{b^2}{\alpha}$ for all $a, b \in \mathbb{R}$ and $\alpha \in \mathbb{R}_{++}$, we obtain for $\alpha := \gamma_n$

$\rho\|p_{1,n} - \bar{x}\|^2 \geq \rho\left(\|x_{n+1} - \bar{x}\|^2 - 2\|x_{n+1} - \bar{x}\|\,\|x_{n+1} - p_{1,n}\| + \|x_{n+1} - p_{1,n}\|^2\right) \geq 2\rho\gamma_n(1-\gamma_n)\frac{\|x_{n+1} - \bar{x}\|^2}{2\gamma_n} - 2\rho(1-\gamma_n)\frac{\|x_{n+1} - p_{1,n}\|^2}{2\gamma_n},$

which, in combination with (3.10), yields for every $n \geq 0$

$\frac{\|x_n - \bar{x}\|^2}{2\gamma_n} + \frac{\|v_n - \bar{v}\|^2}{2\sigma_n} \geq (1 + 2\rho\gamma_n(1-\gamma_n))\frac{\|x_{n+1} - \bar{x}\|^2}{2\gamma_n} + \frac{\|v_{n+1} - \bar{v}\|^2}{2\sigma_n} + \frac{\|x_n - p_{1,n}\|^2}{2\gamma_n} + \frac{\|v_n - p_{2,n}\|^2}{2\sigma_n} - (1 + 2\rho(1-\gamma_n))\frac{\|x_{n+1} - p_{1,n}\|^2}{2\gamma_n} - \frac{\|v_{n+1} - p_{2,n}\|^2}{2\sigma_n}.$   (3.11)

Investigating the last two terms in the right-hand side of the above estimate, it shows for every $n \geq 0$ that

$(1 + 2\rho(1-\gamma_n))\frac{\|x_{n+1} - p_{1,n}\|^2}{2\gamma_n} \leq \frac{(1+2\rho)\gamma_n}{2}\left\|\sum_{i=1}^m L_i^*(v_{i,n} - p_{2,i,n}) + (Cx_n - Cp_{1,n})\right\|^2 \leq (1+2\rho)\gamma_n\left(\Big(\sum_{i=1}^m \|L_i\|^2\Big)\|v_n - p_{2,n}\|^2 + \mu^2\|x_n - p_{1,n}\|^2\right)$

and

$\frac{\|v_{n+1} - p_{2,n}\|^2}{2\sigma_n} = \frac{\sigma_n}{2}\sum_{i=1}^m \|L_i(p_{1,n} - x_n)\|^2 \leq \frac{\sigma_n}{2}\Big(\sum_{i=1}^m \|L_i\|^2\Big)\|p_{1,n} - x_n\|^2.$

Hence, for every $n \geq 0$ it holds

$\frac{\|x_n - p_{1,n}\|^2}{2\gamma_n} + \frac{\|v_n - p_{2,n}\|^2}{2\sigma_n} - (1 + 2\rho(1-\gamma_n))\frac{\|x_{n+1} - p_{1,n}\|^2}{2\gamma_n} - \frac{\|v_{n+1} - p_{2,n}\|^2}{2\sigma_n}$
$\geq \Big(1 - \gamma_n\sigma_n\sum_{i=1}^m \|L_i\|^2 - 2(1+2\rho)\gamma_n^2\mu^2\Big)\frac{\|p_{1,n} - x_n\|^2}{2\gamma_n} + \Big(1 - 2\gamma_n\sigma_n(1+2\rho)\sum_{i=1}^m \|L_i\|^2\Big)\frac{\|v_n - p_{2,n}\|^2}{2\sigma_n} \geq 0.$

The nonnegativity of the expressions in the above relation follows because the sequence $(\gamma_n)_{n\geq0}$ is nonincreasing, $\gamma_n\sigma_n = \gamma_0\sigma_0$ for every $n \geq 0$, and

$\gamma_0 \in \left(0, \min\left\{\frac{1}{\sqrt{2}}, \frac{-1+\sqrt{1+4\rho}}{2(1+2\rho)\mu}\right\}\right)$ and $\sigma_0 = \frac{\gamma_0}{(1+2\rho)\sum_{i=1}^m \|L_i\|^2}.$

Consequently, inequality (3.11) becomes

$\frac{\|x_n - \bar{x}\|^2}{2\gamma_n} + \frac{\|v_n - \bar{v}\|^2}{2\sigma_n} \geq (1 + 2\rho\gamma_n(1-\gamma_n))\frac{\|x_{n+1} - \bar{x}\|^2}{2\gamma_n} + \frac{\|v_{n+1} - \bar{v}\|^2}{2\sigma_n} \quad \forall n \geq 0.$   (3.12)

Notice that we have $\gamma_{n+1} < \gamma_n$, $\sigma_{n+1} > \sigma_n$ and $\gamma_{n+1}\sigma_{n+1} = \gamma_n\sigma_n$ for every $n \geq 0$. Dividing (3.12) by $\gamma_n$ and making use of

$\theta_n = \frac{1}{\sqrt{1 + 2\rho\gamma_n(1-\gamma_n)}}, \quad \gamma_{n+1} = \gamma_n\theta_n, \quad \sigma_{n+1} = \sigma_n/\theta_n,$

we obtain

$\frac{\|x_n - \bar{x}\|^2}{\gamma_n^2} + \frac{\|v_n - \bar{v}\|^2}{\gamma_n\sigma_n} \geq \frac{\|x_{n+1} - \bar{x}\|^2}{\gamma_{n+1}^2} + \frac{\|v_{n+1} - \bar{v}\|^2}{\gamma_{n+1}\sigma_{n+1}} \quad \forall n \geq 0.$

Let $N \in \mathbb{N}$. Summing these inequalities from $n = 0$ to $N-1$, we finally get

$\frac{\|x_0 - \bar{x}\|^2}{\gamma_0^2} + \frac{\|v_0 - \bar{v}\|^2}{\gamma_0\sigma_0} \geq \frac{\|x_N - \bar{x}\|^2}{\gamma_N^2} + \frac{\|v_N - \bar{v}\|^2}{\gamma_N\sigma_N}.$   (3.13)

In conclusion,

$\|x_n - \bar{x}\|^2 \leq \gamma_n^2\left(\frac{\|x_0 - \bar{x}\|^2}{\gamma_0^2} + \frac{\|v_0 - \bar{v}\|^2}{\gamma_0\sigma_0}\right) \quad \forall n \geq 0,$   (3.14)

which completes the proof.

Next we show that $\gamma_n$ converges like $\frac{1}{\rho n}$ as $n \to +\infty$.

Proposition 3.2. Let $\gamma_0 \in (0,1)$ and consider the sequence $(\gamma_n)_{n\geq0} \subseteq \mathbb{R}_{++}$, where

$\gamma_{n+1} = \frac{\gamma_n}{\sqrt{1 + 2\rho\gamma_n(1-\gamma_n)}} \quad \forall n \geq 0.$   (3.15)

Then $\lim_{n\to+\infty} n\rho\gamma_n = 1$.
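Before turning to the proof, the claimed asymptotics is easy to check numerically; a small sketch (Python; the values $\rho = 0.5$ and $\gamma_0 = 0.9$ are our own illustrative choices):

```python
import math

def gamma_sequence(rho=0.5, gamma0=0.9, n=200000):
    """Iterate gamma_{n+1} = gamma_n / sqrt(1 + 2*rho*gamma_n*(1 - gamma_n))
    as in (3.15) and return n*rho*gamma_n, which by Proposition 3.2
    should tend to 1 as n grows."""
    g = gamma0
    for _ in range(n):
        g /= math.sqrt(1.0 + 2.0 * rho * g * (1.0 - g))
    return n * rho * g

print(gamma_sequence())  # approaches 1 from below
```

Heuristically, for small $\gamma_n$ the recursion gives $1/\gamma_{n+1} \approx 1/\gamma_n + \rho$, so $1/\gamma_n$ grows linearly with slope $\rho$, which is exactly the statement of the proposition.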

Proof. Since the sequence $(\gamma_n)_{n\geq0} \subseteq (0,1)$ is bounded and decreasing, it converges towards some $l \in [0,1)$ as $n \to +\infty$. We let $n \to +\infty$ in (3.15) and obtain $l^2(1 + 2\rho l(1-l)) = l^2$, i.e., $2\rho l^3(1-l) = 0$, which shows that $l = 0$, i.e., $\gamma_n \to 0$ as $n \to +\infty$. On the other hand, (3.15) implies that $\gamma_{n+1}/\gamma_n \to 1$ as $n \to +\infty$. As $(1/\gamma_n)_{n\geq0}$ is a strictly increasing and unbounded sequence, by applying the Stolz-Cesàro theorem it shows that

$\lim_{n\to+\infty} n\gamma_n = \lim_{n\to+\infty}\frac{n}{1/\gamma_n} = \lim_{n\to+\infty}\frac{(n+1) - n}{\frac{1}{\gamma_{n+1}} - \frac{1}{\gamma_n}} = \lim_{n\to+\infty}\frac{\gamma_n\gamma_{n+1}}{\gamma_n - \gamma_{n+1}} = \lim_{n\to+\infty}\frac{\gamma_n\gamma_{n+1}(\gamma_n + \gamma_{n+1})}{\gamma_n^2 - \gamma_{n+1}^2}$

and, since (3.15) yields $\gamma_n^2 - \gamma_{n+1}^2 = 2\rho\gamma_{n+1}^2\gamma_n(1-\gamma_n)$,

$\lim_{n\to+\infty} n\gamma_n = \lim_{n\to+\infty}\frac{\gamma_n + \gamma_{n+1}}{2\rho\gamma_{n+1}(1-\gamma_n)} = \frac{2}{2\rho} = \frac{1}{\rho},$

which completes the proof.

Hence, we have shown the following result.

Theorem 3.3. In Problem 3.1 suppose that $A + C$ is $\rho$-strongly monotone with $\rho \in \mathbb{R}_{++}$ and let $(\bar{x}, \bar{v}_1,\dots,\bar{v}_m) \in \mathcal{H}\oplus\mathcal{G}$ be a primal-dual solution to Problem 3.1. Then, for any $\varepsilon > 0$, there exists some $n_0 \in \mathbb{N}$ (depending on $\varepsilon$ and $\rho\gamma_0$) such that for any $n \geq n_0$

$\|x_n - \bar{x}\|^2 \leq \frac{1+\varepsilon}{n^2\rho^2}\left(\frac{\|x_0 - \bar{x}\|^2}{\gamma_0^2} + \sum_{i=1}^m \frac{\|v_{i,0} - \bar{v}_i\|^2}{\gamma_0\sigma_0}\right),$   (3.16)

where $\gamma_n, \sigma_n \in \mathbb{R}_{++}$, $x_n \in \mathcal{H}$ and $(v_{1,n},\dots,v_{m,n}) \in \mathcal{G}$ are the iterates generated by Algorithm 3.1.

Remark 3.2. In Algorithm 3.1 and Theorem 3.3 we assumed that $D_i^{-1} = 0$ for $i = 1,\dots,m$; however, similar statements can also be provided for Problem 1.1 under the additional assumption that the operators $D_i : \mathcal{G}_i \to 2^{\mathcal{G}_i}$ are such that $D_i^{-1}$ is $\nu_i^{-1}$-cocoercive with $\nu_i \in \mathbb{R}_{++}$ for $i = 1,\dots,m$. This assumption is in general stronger than assuming that $D_i$ is monotone and $D_i^{-1}$ is $\nu_i$-Lipschitzian for $i = 1,\dots,m$. However, it guarantees that $D_i$ is $\nu_i^{-1}$-strongly monotone and maximally monotone for $i = 1,\dots,m$ (see [1, Example 20.28, Proposition 20.22 and Example 22.6]). We introduce the Hilbert space $\mathbf{H} = \mathcal{H}\oplus\mathcal{G}$, the element $\mathbf{z} = (z, 0,\dots,0) \in \mathbf{H}$, the maximally monotone operator $\mathbf{A} : \mathbf{H} \to 2^{\mathbf{H}}$, $\mathbf{A}(x, y_1,\dots,y_m) = (Ax, D_1y_1,\dots,D_my_m)$, and the monotone and Lipschitzian operator $\mathbf{C} : \mathbf{H} \to \mathbf{H}$, $\mathbf{C}(x, y_1,\dots,y_m) = (Cx, 0,\dots,0)$. Notice also that $\mathbf{A} + \mathbf{C}$ is strongly monotone.
Furthermore, we introduce the element $\mathbf{r} = (r_1,\dots,r_m) \in \mathcal{G}$, the maximally monotone operator $\mathbf{B} : \mathcal{G} \to 2^{\mathcal{G}}$, $\mathbf{B}(y_1,\dots,y_m) = (B_1y_1,\dots,B_my_m)$, and the linear continuous operator $\mathbf{L} : \mathbf{H} \to \mathcal{G}$, $\mathbf{L}(x, y_1,\dots,y_m) = (L_1x - y_1,\dots,L_mx - y_m)$, having as adjoint $\mathbf{L}^* : \mathcal{G} \to \mathbf{H}$, $\mathbf{L}^*(q_1,\dots,q_m) = \big(\sum_{i=1}^m L_i^*q_i, -q_1,\dots,-q_m\big)$. We consider the primal problem

find $\bar{\mathbf{x}} = (\bar{x}, \bar{p}_1,\dots,\bar{p}_m) \in \mathbf{H}$ such that $\mathbf{z} \in \mathbf{A}\bar{\mathbf{x}} + \mathbf{L}^*\mathbf{B}(\mathbf{L}\bar{\mathbf{x}} - \mathbf{r}) + \mathbf{C}\bar{\mathbf{x}}$,   (3.17)

together with the dual inclusion problem

find $\bar{\mathbf{v}} \in \mathcal{G}$ such that $(\exists\, \mathbf{x} \in \mathbf{H})\ \begin{cases}\mathbf{z} - \mathbf{L}^*\bar{\mathbf{v}} \in \mathbf{A}\mathbf{x} + \mathbf{C}\mathbf{x}\\ \bar{\mathbf{v}} \in \mathbf{B}(\mathbf{L}\mathbf{x} - \mathbf{r}).\end{cases}$   (3.18)

We notice that Algorithm 3.1 can be employed for solving this primal-dual pair of monotone inclusion problems by separately involving the resolvents of $A$, $B_i$ and $D_i$, $i = 1,\dots,m$, as for $\gamma \in \mathbb{R}_{++}$

$J_{\gamma\mathbf{A}}(x, y_1,\dots,y_m) = (J_{\gamma A}x, J_{\gamma D_1}y_1,\dots,J_{\gamma D_m}y_m) \quad \forall (x, y_1,\dots,y_m) \in \mathbf{H},$
$J_{\gamma\mathbf{B}}(q_1,\dots,q_m) = (J_{\gamma B_1}q_1,\dots,J_{\gamma B_m}q_m) \quad \forall (q_1,\dots,q_m) \in \mathcal{G}.$

Having $(\bar{\mathbf{x}}, \bar{\mathbf{v}}) \in \mathbf{H}\oplus\mathcal{G}$ a primal-dual solution to the primal-dual pair of monotone inclusion problems (3.17)-(3.18), Algorithm 3.1 generates a sequence of primal iterates fulfilling (3.16) in $\mathbf{H}$. Moreover, $(\bar{\mathbf{x}}, \bar{\mathbf{v}})$ is a primal-dual solution to (3.17)-(3.18) if and only if

$\mathbf{z} - \mathbf{L}^*\bar{\mathbf{v}} \in \mathbf{A}\bar{\mathbf{x}} + \mathbf{C}\bar{\mathbf{x}}$ and $\bar{\mathbf{v}} \in \mathbf{B}(\mathbf{L}\bar{\mathbf{x}} - \mathbf{r})$
$\Leftrightarrow\ z - \sum_{i=1}^m L_i^*\bar{v}_i \in A\bar{x} + C\bar{x}$ and $\bar{v}_i \in D_i\bar{p}_i,\ \bar{v}_i \in B_i(L_i\bar{x} - \bar{p}_i - r_i),\ i = 1,\dots,m$
$\Leftrightarrow\ z - \sum_{i=1}^m L_i^*\bar{v}_i \in A\bar{x} + C\bar{x}$ and $\bar{v}_i \in D_i\bar{p}_i,\ L_i\bar{x} - r_i \in B_i^{-1}\bar{v}_i + \bar{p}_i,\ i = 1,\dots,m.$

Thus, if $(\bar{\mathbf{x}}, \bar{\mathbf{v}})$ is a primal-dual solution to (3.17)-(3.18), then $(\bar{x}, \bar{\mathbf{v}})$ is a primal-dual solution to Problem 1.1. Vice versa, if $(\bar{x}, \bar{\mathbf{v}})$ is a primal-dual solution to Problem 1.1, then, choosing $\bar{p}_i \in D_i^{-1}\bar{v}_i$, $i = 1,\dots,m$, and $\bar{\mathbf{x}} = (\bar{x}, \bar{p}_1,\dots,\bar{p}_m)$, it yields that $(\bar{\mathbf{x}}, \bar{\mathbf{v}})$ is a primal-dual solution to (3.17)-(3.18). In conclusion, the first component of every primal iterate in $\mathbf{H}$ generated by Algorithm 3.1 for finding the primal-dual solution $(\bar{\mathbf{x}}, \bar{\mathbf{v}})$ to (3.17)-(3.18) furnishes a sequence of iterates verifying (3.16) in $\mathcal{H}$ for the primal-dual solution $(\bar{x}, \bar{\mathbf{v}})$ to Problem 1.1.

3.2 The case when $A + C$ and $B_i^{-1} + D_i^{-1}$, $i = 1,\dots,m$, are strongly monotone

Within this subsection we consider the case when $A + C$ is $\rho$-strongly monotone with $\rho \in \mathbb{R}_{++}$ and $B_i^{-1} + D_i^{-1}$ is $\tau_i$-strongly monotone with $\tau_i \in \mathbb{R}_{++}$ for $i = 1,\dots,m$, and provide an accelerated version of the algorithm in Theorem 1.1 which generates sequences of primal and dual iterates that converge to the primal-dual solution to Problem 1.1 with an improved rate of convergence.

Algorithm 3.2. Let $x_0 \in \mathcal{H}$, $(v_{1,0},\dots,v_{m,0}) \in \mathcal{G}$, and $\gamma \in (0,1)$ such that

$\gamma \leq \frac{1}{\sqrt{1 + 2\min\{\rho, \tau_1,\dots,\tau_m\}}\left(\sqrt{\sum_{i=1}^m \|L_i\|^2} + \max\{\mu, \nu_1,\dots,\nu_m\}\right)}.$

Consider the following updates:

$(\forall n \geq 0)$
  $p_{1,n} = J_{\gamma A}\big(x_n - \gamma(Cx_n + \sum_{i=1}^m L_i^*v_{i,n} - z)\big)$
  For $i = 1,\dots,m$
    $p_{2,i,n} = J_{\gamma B_i^{-1}}\big(v_{i,n} + \gamma(L_ix_n - D_i^{-1}v_{i,n} - r_i)\big)$
    $v_{i,n+1} = \gamma L_i(p_{1,n} - x_n) + \gamma(D_i^{-1}v_{i,n} - D_i^{-1}p_{2,i,n}) + p_{2,i,n}$
  $x_{n+1} = \gamma\sum_{i=1}^m L_i^*(v_{i,n} - p_{2,i,n}) + \gamma(Cx_n - Cp_{1,n}) + p_{1,n}.$   (3.19)

Theorem 3.4. In Problem 1.1 suppose that $A + C$ is $\rho$-strongly monotone with $\rho \in \mathbb{R}_{++}$, $B_i^{-1} + D_i^{-1}$ is $\tau_i$-strongly monotone with $\tau_i \in \mathbb{R}_{++}$ for $i = 1,\dots,m$, and let $(\bar{x}, \bar{v}_1,\dots,\bar{v}_m) \in \mathcal{H}\oplus\mathcal{G}$ be a primal-dual solution to Problem 1.1. Then for every $n \geq 0$ it holds

$\|x_n - \bar{x}\|^2 + \sum_{i=1}^m \|v_{i,n} - \bar{v}_i\|^2 \leq \left(\frac{1}{1 + 2\rho_{\min}\gamma(1-\gamma)}\right)^n\left(\|x_0 - \bar{x}\|^2 + \sum_{i=1}^m \|v_{i,0} - \bar{v}_i\|^2\right),$

where $\rho_{\min} = \min\{\rho, \tau_1,\dots,\tau_m\}$ and $x_n \in \mathcal{H}$ and $(v_{1,n},\dots,v_{m,n}) \in \mathcal{G}$ are the iterates generated by Algorithm 3.2.

Proof. Taking into account the definitions of the resolvents occurring in Algorithm 3.2 we obtain for every $n \geq 0$

$\frac{x_n - x_{n+1}}{\gamma} - \sum_{i=1}^m L_i^*p_{2,i,n} + z \in (A + C)p_{1,n}$ and $\frac{v_{i,n} - v_{i,n+1}}{\gamma} + L_ip_{1,n} - r_i \in (B_i^{-1} + D_i^{-1})p_{2,i,n}, \quad i = 1,\dots,m.$

The primal-dual solution $(\bar{x}, \bar{v}_1,\dots,\bar{v}_m) \in \mathcal{H}\oplus\mathcal{G}$ to Problem 1.1 fulfills (see (1.4))

$z - \sum_{i=1}^m L_i^*\bar{v}_i \in A\bar{x} + C\bar{x}$ and $\bar{v}_i \in (B_i \,\Box\, D_i)(L_i\bar{x} - r_i)$, i.e., $L_i\bar{x} - r_i \in (B_i^{-1} + D_i^{-1})\bar{v}_i, \quad i = 1,\dots,m.$

By the strong monotonicity of $A + C$ and $B_i^{-1} + D_i^{-1}$, $i = 1,\dots,m$, we obtain for every $n \geq 0$

$\left\langle p_{1,n} - \bar{x},\ \frac{x_n - x_{n+1}}{\gamma} - \sum_{i=1}^m L_i^*p_{2,i,n} + z - \Big(z - \sum_{i=1}^m L_i^*\bar{v}_i\Big)\right\rangle \geq \rho\|p_{1,n} - \bar{x}\|^2$   (3.20)

and, respectively,

$\left\langle p_{2,i,n} - \bar{v}_i,\ \frac{v_{i,n} - v_{i,n+1}}{\gamma} + L_ip_{1,n} - r_i - (L_i\bar{x} - r_i)\right\rangle \geq \tau_i\|p_{2,i,n} - \bar{v}_i\|^2, \quad i = 1,\dots,m.$   (3.21)

Consider the Hilbert space $\mathbf{H} = \mathcal{H}\oplus\mathcal{G}$, equipped with the inner product defined in (2.3) and associated norm, and set $\bar{\mathbf{x}} = (\bar{x}, \bar{v}_1,\dots,\bar{v}_m)$, $\mathbf{x}_n = (x_n, v_{1,n},\dots,v_{m,n})$, $\mathbf{p}_n = (p_{1,n}, p_{2,1,n},\dots,p_{2,m,n})$.

Summing up the inequalities (3.20) and (3.21) and using

$\left\langle \mathbf{p}_n - \bar{\mathbf{x}}, \frac{\mathbf{x}_n - \mathbf{x}_{n+1}}{\gamma}\right\rangle = \frac{\|\mathbf{x}_{n+1} - \mathbf{p}_n\|^2}{2\gamma} - \frac{\|\mathbf{x}_n - \mathbf{p}_n\|^2}{2\gamma} + \frac{\|\mathbf{x}_n - \bar{\mathbf{x}}\|^2}{2\gamma} - \frac{\|\mathbf{x}_{n+1} - \bar{\mathbf{x}}\|^2}{2\gamma},$

we obtain for every $n \geq 0$

$\|\mathbf{x}_n - \bar{\mathbf{x}}\|^2 \geq 2\gamma\rho_{\min}\|\mathbf{p}_n - \bar{\mathbf{x}}\|^2 + \|\mathbf{x}_{n+1} - \bar{\mathbf{x}}\|^2 + \|\mathbf{x}_n - \mathbf{p}_n\|^2 - \|\mathbf{x}_{n+1} - \mathbf{p}_n\|^2.$   (3.22)

Further, using the estimate $2ab \leq \gamma a^2 + \frac{b^2}{\gamma}$ for all $a, b \in \mathbb{R}$, we obtain

$2\gamma\rho_{\min}\|\mathbf{p}_n - \bar{\mathbf{x}}\|^2 \geq 2\rho_{\min}\gamma(1-\gamma)\|\mathbf{x}_{n+1} - \bar{\mathbf{x}}\|^2 - 2\rho_{\min}(1-\gamma)\|\mathbf{x}_{n+1} - \mathbf{p}_n\|^2 \geq 2\rho_{\min}\gamma(1-\gamma)\|\mathbf{x}_{n+1} - \bar{\mathbf{x}}\|^2 - 2\rho_{\min}\|\mathbf{x}_{n+1} - \mathbf{p}_n\|^2 \quad \forall n \geq 0.$

Hence, (3.22) reduces to

$\|\mathbf{x}_n - \bar{\mathbf{x}}\|^2 \geq (1 + 2\rho_{\min}\gamma(1-\gamma))\|\mathbf{x}_{n+1} - \bar{\mathbf{x}}\|^2 + \|\mathbf{x}_n - \mathbf{p}_n\|^2 - (1 + 2\rho_{\min})\|\mathbf{x}_{n+1} - \mathbf{p}_n\|^2 \quad \forall n \geq 0.$

Using the same arguments as in (2.4), where now $\|\mathbf{x}_{n+1} - \mathbf{p}_n\| \leq \gamma\big(\sqrt{\sum_{i=1}^m \|L_i\|^2} + \max\{\mu, \nu_1,\dots,\nu_m\}\big)\|\mathbf{x}_n - \mathbf{p}_n\|$, it is easy to check that for every $n \geq 0$

$\|\mathbf{x}_n - \mathbf{p}_n\|^2 - (1 + 2\rho_{\min})\|\mathbf{x}_{n+1} - \mathbf{p}_n\|^2 \geq \left(1 - (1 + 2\rho_{\min})\gamma^2\Big(\sqrt{\textstyle\sum_{i=1}^m \|L_i\|^2} + \max\{\mu, \nu_1,\dots,\nu_m\}\Big)^2\right)\|\mathbf{x}_n - \mathbf{p}_n\|^2 \geq 0,$

whereby the nonnegativity of this term is ensured by the assumption that

$\gamma \leq \frac{1}{\sqrt{1 + 2\rho_{\min}}\left(\sqrt{\sum_{i=1}^m \|L_i\|^2} + \max\{\mu, \nu_1,\dots,\nu_m\}\right)}.$

Therefore, we obtain

$\|\mathbf{x}_n - \bar{\mathbf{x}}\|^2 \geq (1 + 2\rho_{\min}\gamma(1-\gamma))\|\mathbf{x}_{n+1} - \bar{\mathbf{x}}\|^2 \quad \forall n \geq 0,$

which leads to

$\|\mathbf{x}_n - \bar{\mathbf{x}}\|^2 \leq \left(\frac{1}{1 + 2\rho_{\min}\gamma(1-\gamma)}\right)^n\|\mathbf{x}_0 - \bar{\mathbf{x}}\|^2 \quad \forall n \geq 0,$

which is the desired estimate.
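The linear rate of Theorem 3.4 can again be illustrated on a scalar toy instance (our own illustrative choice, not taken from the paper: $\mathcal{H} = \mathcal{G}_1 = \mathbb{R}$, $A = 0$, $Cx = x - b$, $B_1 = \partial(\lambda|\cdot|)$, $D_1^{-1} = \mathrm{Id}$, $L_1 = \mathrm{Id}$, $z = r_1 = 0$). Then $B_1 \Box D_1$ is the gradient of the Moreau envelope of $\lambda|\cdot|$, i.e., $x \mapsto \mathrm{clip}(x, -\lambda, \lambda)$, and the primal solution solves $\mathrm{clip}(x, -\lambda, \lambda) + x - b = 0$:

```python
import numpy as np

def algorithm_3_2_scalar(b=2.0, lam=0.5, iters=200):
    """Algorithm 3.2 on a scalar instance with A = 0, Cx = x - b,
    B = d(lam*|.|), D^{-1} = Id, L = Id (all rho = tau = mu = nu = 1).

    J_{gamma B^{-1}} is the projection onto [-lam, lam]; the solution
    for b = 2, lam = 0.5 is x = 1.5 (with dual value v = 0.5).
    """
    rho_min = 1.0  # min{rho, tau}
    # step size: gamma <= 1 / (sqrt(1 + 2*rho_min) * (sqrt(||L||^2) + max{mu, nu}))
    gamma = 1.0 / (np.sqrt(3.0) * 2.0)
    x, v = 0.0, 0.0
    for _ in range(iters):
        p1 = x - gamma * ((x - b) + v)
        p2 = np.clip(v + gamma * (x - v), -lam, lam)        # D^{-1} v = v
        v_new = gamma * (p1 - x) + gamma * (v - p2) + p2
        x = gamma * (v - p2) + gamma * (x - p1) + p1
        v = v_new
    return x

print(abs(algorithm_3_2_scalar() - 1.5))  # decays linearly in the iteration count
```

With the contraction factor $1/(1 + 2\rho_{\min}\gamma(1-\gamma))$ per iteration, a couple of hundred iterations already drive the error to machine-precision level, in contrast with the $O(1/n)$ behavior of Algorithm 3.1.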

4 Numerical experiments in imaging

In this section we test the feasibility of Algorithm 2.1 and of its accelerated version Algorithm 3.1 in the context of different problem formulations occurring in imaging and compare their performance to that of two other popular primal-dual algorithms introduced in [6]. For all applications discussed in this section the images have been normalized, in order to make their pixels range in the closed interval from 0 (pure black) to 1 (pure white).

4.1 TV-based image denoising

Our first numerical experiment aims at solving an image denoising problem via total variation regularization. More precisely, we deal with the convex optimization problem

$\inf_{x\in\mathbb{R}^n}\left\{\lambda\, TV(x) + \frac{1}{2}\|x - b\|^2\right\},$   (4.1)

where $\lambda \in \mathbb{R}_{++}$ is the regularization parameter, $TV : \mathbb{R}^n \to \mathbb{R}$ is a discrete total variation functional and $b \in \mathbb{R}^n$ is the observed noisy image. In this context, $x \in \mathbb{R}^n$ represents the vectorized image $X \in \mathbb{R}^{M\times N}$, where $n = MN$ and $x_{i,j}$ denotes the normalized value of the pixel located in the $i$-th row and the $j$-th column, for $i = 1,\dots,M$ and $j = 1,\dots,N$. Two popular choices for the discrete total variation functional are the isotropic total variation $TV_{iso} : \mathbb{R}^n \to \mathbb{R}$,

$TV_{iso}(x) = \sum_{i=1}^{M-1}\sum_{j=1}^{N-1}\sqrt{(x_{i+1,j} - x_{i,j})^2 + (x_{i,j+1} - x_{i,j})^2} + \sum_{i=1}^{M-1}|x_{i+1,N} - x_{i,N}| + \sum_{j=1}^{N-1}|x_{M,j+1} - x_{M,j}|,$

and the anisotropic total variation $TV_{aniso} : \mathbb{R}^n \to \mathbb{R}$,

$TV_{aniso}(x) = \sum_{i=1}^{M-1}\sum_{j=1}^{N-1}\big(|x_{i+1,j} - x_{i,j}| + |x_{i,j+1} - x_{i,j}|\big) + \sum_{i=1}^{M-1}|x_{i+1,N} - x_{i,N}| + \sum_{j=1}^{N-1}|x_{M,j+1} - x_{M,j}|,$

where in both cases reflexive (Neumann) boundary conditions are assumed. We denote $\mathcal{Y} = \mathbb{R}^n \times \mathbb{R}^n$ and define the linear operator $L : \mathbb{R}^n \to \mathcal{Y}$, $x_{i,j} \mapsto (L_1x_{i,j}, L_2x_{i,j})$, where

$L_1x_{i,j} = \begin{cases} x_{i+1,j} - x_{i,j}, & \text{if } i < M\\ 0, & \text{if } i = M\end{cases}$ and $L_2x_{i,j} = \begin{cases} x_{i,j+1} - x_{i,j}, & \text{if } j < N\\ 0, & \text{if } j = N.\end{cases}$

The operator $L$ represents a discretization of the gradient using reflexive (Neumann) boundary conditions and standard finite differences. One can easily check that $\|L\|^2 \leq 8$ and that its adjoint $L^* : \mathcal{Y} \to \mathbb{R}^n$ is as easy to implement as the operator itself (cf. [5]).
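For completeness, a minimal NumPy sketch of this discrete gradient and of its adjoint (a discrete negative divergence), together with numerical checks of the adjoint identity $\langle Lx, (y_1,y_2)\rangle = \langle x, L^*(y_1,y_2)\rangle$ and of the bound $\|L\|^2 \leq 8$; function names are ours:

```python
import numpy as np

def grad(x):
    """Discrete gradient L: forward differences with Neumann boundary."""
    g1 = np.zeros_like(x)
    g2 = np.zeros_like(x)
    g1[:-1, :] = x[1:, :] - x[:-1, :]   # L1: vertical differences, last row 0
    g2[:, :-1] = x[:, 1:] - x[:, :-1]   # L2: horizontal differences, last column 0
    return g1, g2

def grad_adj(g1, g2):
    """Adjoint L*: each difference (x_{i+1} - x_i)*y_i contributes -y_i
    to position i and +y_i to position i+1."""
    d = np.zeros_like(g1)
    d[:-1, :] -= g1[:-1, :]
    d[1:, :] += g1[:-1, :]
    d[:, :-1] -= g2[:, :-1]
    d[:, 1:] += g2[:, :-1]
    return d

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
y1 = rng.standard_normal((32, 32))
y2 = rng.standard_normal((32, 32))
g1, g2 = grad(x)
adjointness_gap = abs(np.sum(g1 * y1) + np.sum(g2 * y2) - np.sum(x * grad_adj(y1, y2)))

# power iteration on L*L estimates ||L||^2, which is known to be bounded by 8
z = rng.standard_normal((32, 32))
for _ in range(200):
    z = grad_adj(*grad(z))
    z /= np.linalg.norm(z)
norm_sq = float(np.sum(z * grad_adj(*grad(z))))
print(adjointness_gap, norm_sq)
```

On a 32x32 grid the estimated squared norm comes out just below 8, in agreement with the spectrum of the discrete Neumann Laplacian.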

        σ = 0.2, λ = 0.07                σ = 0.06, λ = 0.035
        ε = 10^{-4}      ε = 10^{-6}     ε = 10^{-4}     ε = 10^{-6}
ALG     350 (7.03 s)     2989 (59.82 s)  84 (3.69 s)     454 (29.07 s)
ALG2    0 (2.28 s)       442 (9.9 s)     72 (1.62 s)     298 (6.68 s)
PD      342 (3.59 s)     333 (32.68 s)   80 (1.9 s)      427 (4.87 s)
PD2     96 (1.02 s)      442 (4.67 s)    69 (0.76 s)     39 (3.39 s)

Table 4.1: Performance evaluation for the images in Figure 4.1. The entries represent the number of iterations and the CPU times in seconds, respectively, needed in order to attain a root mean squared error for the iterates below the tolerance ε.

Within this example we will focus on the anisotropic total variation function, which is nothing else than the composition of the $\ell_1$-norm in $\mathcal{Y}$ with the linear operator $L$. Due to the full splitting characteristics of the iterative methods presented in this paper, we only need to compute the proximal point of the conjugate of the $\ell_1$-norm, the latter being the indicator function of the dual unit ball. Thus, the calculation of the proximal point reduces to the computation of a projection, which has an easy implementation. The more challenging isotropic total variation functional is employed in the forthcoming subsection in the context of an image deblurring problem.

Thus, problem (4.1) reads equivalently

$\inf_{x\in\mathbb{R}^n}\{h(x) + g(Lx)\},$

where $h : \mathbb{R}^n \to \mathbb{R}$, $h(x) = \frac{1}{2}\|x - b\|^2$, is 1-strongly convex and differentiable with 1-Lipschitzian gradient and $g : \mathcal{Y} \to \mathbb{R}$ is defined as $g(y_1, y_2) = \lambda\|(y_1, y_2)\|_1$. Then its conjugate $g^* : \mathcal{Y} \to \overline{\mathbb{R}}$ is nothing else than

$g^*(p_1, p_2) = (\lambda\|\cdot\|_1)^*(p_1, p_2) = \lambda\,\delta_{\{\|\cdot\|_\infty \leq 1\}}\Big(\frac{p_1}{\lambda}, \frac{p_2}{\lambda}\Big) = \delta_S(p_1, p_2),$

where $S = [-\lambda, \lambda]^n \times [-\lambda, \lambda]^n$. Taking $x_0 \in \mathbb{R}^n$, $v_0 \in \mathcal{Y}$,

$\gamma_0 \in \left(0, \min\left\{\frac{1}{\sqrt{2}}, \frac{-1+\sqrt{1+4\rho}}{2(1+2\rho)\mu}\right\}\right)$ and $\sigma_0 = \frac{\gamma_0}{(1+2\rho)\|L\|^2},$

Algorithm 3.1 looks for this particular problem like

$(\forall n \geq 0)$
  $p_{1,n} = x_n - \gamma_n(x_n - b + L^*v_n)$
  $p_{2,n} = P_S(v_n + \sigma_n Lx_n)$
  $v_{n+1} = \sigma_n L(p_{1,n} - x_n) + p_{2,n}$
  $x_{n+1} = \gamma_n L^*(v_n - p_{2,n}) + \gamma_n(x_n - p_{1,n}) + p_{1,n}$
  $\theta_n = 1/\sqrt{1 + 2\rho\gamma_n(1-\gamma_n)}, \quad \gamma_{n+1} = \gamma_n\theta_n, \quad \sigma_{n+1} = \sigma_n/\theta_n.$
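The specialized updates above can be sketched in NumPy as follows (a sketch under our own illustrative choices $\rho = \mu = 1$ and a small synthetic test image; function and variable names are ours):

```python
import numpy as np

def tv_l2_denoise(b, lam, rho=1.0, mu=1.0, iters=2000):
    """Accelerated primal-dual iteration for anisotropic TV-l2 denoising
    (Algorithm 3.1 specialized as above, with P_S a componentwise clip)."""
    def L(x):   # forward differences, Neumann boundary
        g1 = np.zeros_like(x); g2 = np.zeros_like(x)
        g1[:-1, :] = x[1:, :] - x[:-1, :]
        g2[:, :-1] = x[:, 1:] - x[:, :-1]
        return np.stack((g1, g2))
    def Lt(g):  # adjoint of L
        g1, g2 = g
        d = np.zeros_like(g1)
        d[:-1, :] -= g1[:-1, :]; d[1:, :] += g1[:-1, :]
        d[:, :-1] -= g2[:, :-1]; d[:, 1:] += g2[:, :-1]
        return d
    L2 = 8.0  # ||L||^2 <= 8
    gamma = min(1.0 / np.sqrt(2.0),
                (-1.0 + np.sqrt(1.0 + 4.0 * rho)) / (2.0 * (1.0 + 2.0 * rho) * mu))
    sigma = gamma / ((1.0 + 2.0 * rho) * L2)
    x = b.copy()
    v = np.zeros((2,) + b.shape)
    for _ in range(iters):
        p1 = x - gamma * (x - b + Lt(v))
        p2 = np.clip(v + sigma * L(x), -lam, lam)   # projection onto S
        v_new = sigma * L(p1 - x) + p2
        x = gamma * Lt(v - p2) + gamma * (x - p1) + p1
        v = v_new
        theta = 1.0 / np.sqrt(1.0 + 2.0 * rho * gamma * (1.0 - gamma))
        gamma, sigma = gamma * theta, sigma / theta
    return x

# usage on a noisy ramp image (illustrative data, not the paper's test image)
rng = np.random.default_rng(1)
b = np.clip(np.linspace(0.0, 1.0, 16)[None, :] + 0.1 * rng.standard_normal((16, 16)), 0.0, 1.0)
x = tv_l2_denoise(b, lam=0.1)

def objective(u):  # the TV-l2 objective of (4.1), anisotropic TV
    tv = np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()
    return 0.1 * tv + 0.5 * np.sum((u - b) ** 2)

print(objective(b), objective(x))  # the objective value drops after denoising
```

The forward steps here are exactly the matrix-vector products with $L$ and $L^*$; the backward step is the cheap clipping projection.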
However, we solved the regularized image denoising problem with Algorithm 2.1, with the primal-dual iterative scheme from [6] (see also [8]) and with the accelerated version of the latter presented in [6, Theorem 2] as well, and refer the reader to Table 4.1 for a comparison of the obtained results:

(a) Noisy image, σ = 0.06 (b) Noisy image, σ = 0.2 (c) Denoised image, λ = 0.035 (d) Denoised image, λ = 0.07

Figure 4.1: TV-$\ell_2$ image denoising. The noisy image in (a) was obtained by adding white Gaussian noise with standard deviation σ = 0.06 to the original 256×256 lichtenstein test image, while the output of Algorithm 3.1, for λ = 0.035, after 100 iterations is shown in (c). Likewise, the noisy image obtained when choosing σ = 0.2 and the output of the same algorithm, for λ = 0.07, after 100 iterations are shown in (b) and (d), respectively.

ALG: Algorithm 2.1 with $\gamma = (1-\varepsilon)/(1+\sqrt{8})$ for small $\varepsilon > 0$ and taking the last iterate instead of the averaged sequence.

ALG2: Algorithm 3.1 with $\rho = 0.3$, $\mu = 1$ and $\gamma_0 = \frac{-1+\sqrt{1+4\rho}}{2(1+2\rho)\mu}$.

PD: Algorithm 1 in [6] with $\tau = 1/\sqrt{8}$, $\tau\sigma\cdot8 = 1$ and taking the last iterate instead of the averaged sequence.

PD2: Algorithm 2 in [6] with $\rho = 0.3$, $\tau_0 = 1/\sqrt{8}$, $\tau_0\sigma_0\cdot8 = 1$.

From the point of view of the number of iterations, one can notice similarities both between the primal-dual algorithms ALG and PD and between the accelerated versions ALG2 and PD2; in this respect they behave almost identically. When comparing the CPU times, however, it shows that the methods in this paper need almost twice the amount of time. This is because ALG and ALG2 lead back to a forward-backward-forward splitting, whereas PD and PD2 rely on a forward-backward splitting scheme, meaning that ALG and ALG2 process twice as many forward steps as PD and PD2. In this example the evaluation of the forward steps (which consist in matrix-vector multiplications involving the linear operators and their adjoints) is, compared with the calculation of the projections occurring when computing the resolvents, the most costly step.

4.2 TV-based image deblurring

The second numerical experiment that we consider concerns the solving of an extremely ill-conditioned linear inverse problem which arises in image deblurring and denoising.

For a given matrix $A \in \mathbb{R}^{n\times n}$ describing a blur operator and a given vector $b \in \mathbb{R}^n$ representing the blurred and noisy image, the task is to estimate the unknown original image $\bar{x} \in \mathbb{R}^n$ fulfilling $A\bar{x} = b$. To this end we basically solve the following regularized convex nondifferentiable problem

$\inf_{x\in\mathbb{R}^n}\big\{\|Ax - b\|_1 + \lambda_1\, TV_{iso}(x) + \lambda_2\|x\|_1 + \delta_{[0,1]^n}(x)\big\},$   (4.2)

where $\lambda_1, \lambda_2 \in \mathbb{R}_{++}$ are regularization parameters and $TV_{iso} : \mathbb{R}^n \to \mathbb{R}$ is the discrete isotropic total variation function. Notice that none of the functions occurring in (4.2) is differentiable, while the regularization is done by a combination of two regularization functionals with different properties. The blurring operator is constructed by making use of the Matlab routines imfilter and fspecial as follows:

H = fspecial('gaussian', 9, 4);          % Gaussian blur of size 9 times 9
                                         % and standard deviation 4
B = imfilter(X, H, 'conv', 'symmetric'); % B = observed blurred image
                                         % X = original image

The function fspecial returns a rotationally symmetric Gaussian lowpass filter of size 9×9 with standard deviation 4, the entries of H being nonnegative and their sum adding up to 1. The function imfilter convolves the filter H with the image X and furnishes the blurred image B. The boundary option 'symmetric' corresponds to reflexive boundary conditions. Thanks to the rotationally symmetric filter H, the linear operator A defined via the routine imfilter is symmetric, too. By making use of the real spectral decomposition of A, it shows that $\|A\|^2 = 1$.

For $(y,z), (p,q) \in \mathcal{Y}$, we introduce the inner product

$\langle (y,z), (p,q)\rangle = \sum_{i=1}^M\sum_{j=1}^N\big(y_{i,j}p_{i,j} + z_{i,j}q_{i,j}\big)$

and define $\|(y,z)\|_\times = \sum_{i=1}^M\sum_{j=1}^N\sqrt{y_{i,j}^2 + z_{i,j}^2}$. One can check that $\|\cdot\|_\times$ is a norm on $\mathcal{Y}$ and that for every $x \in \mathbb{R}^n$ it holds $TV_{iso}(x) = \|Lx\|_\times$, where $L$ is the linear operator defined in the previous section.
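The Matlab construction above can be mirrored in NumPy; the sketch below builds the same kind of 9×9 Gaussian kernel with standard deviation 4, applies it with symmetric (reflexive) padding, and checks numerically that the resulting operator is self-adjoint and fixes constant images, consistent with $\|A\| = 1$ (helper names are ours):

```python
import numpy as np

def gaussian_kernel(size=9, std=4.0):
    """Rotationally symmetric Gaussian lowpass filter with entries summing
    to 1 -- the analogue of fspecial('gaussian', 9, 4)."""
    r = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(r[:, None] ** 2 + r[None, :] ** 2) / (2.0 * std ** 2))
    return g / g.sum()

def blur(x, h):
    """Convolution with symmetric (reflexive) boundary handling, the
    analogue of imfilter(X, H, 'conv', 'symmetric'). For a symmetric
    kernel, convolution and correlation coincide."""
    k = h.shape[0] // 2
    xp = np.pad(x, k, mode='symmetric')
    out = np.zeros_like(x)
    for i in range(h.shape[0]):
        for j in range(h.shape[1]):
            out += h[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

h = gaussian_kernel()
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
y = rng.standard_normal((32, 32))
symmetry_gap = abs(np.sum(blur(x, h) * y) - np.sum(x * blur(y, h)))
constant_preserved = np.allclose(blur(np.ones((32, 32)), h), 1.0)
print(symmetry_gap, constant_preserved)
```

Constant images being fixed shows that the operator norm 1 is actually attained, which is why the problem (4.2) is genuinely ill-conditioned rather than strongly monotone.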
The conjugate function $(\|\cdot\|_\times)^* : \mathcal{Y} \to \overline{\mathbb{R}}$ of $\|\cdot\|_\times$ is for every $(p,q) \in \mathcal{Y}$ given by (see, for instance, [3])

$(\|\cdot\|_\times)^*(p,q) = \begin{cases}0, & \text{if } \|(p,q)\|_{\times*} \leq 1,\\ +\infty, & \text{otherwise},\end{cases}$

where

$\|(p,q)\|_{\times*} = \sup_{\|(y,z)\|_\times \leq 1}\langle (p,q), (y,z)\rangle = \max_{\substack{1\leq i\leq M\\ 1\leq j\leq N}}\sqrt{p_{i,j}^2 + q_{i,j}^2}.$

Therefore, the optimization problem (4.2) can be written in the form of

$\inf_{x\in\mathbb{R}^n}\{f(x) + g_1(Ax) + g_2(Lx)\},$

(a) Original image (b) Blurred and noisy image (c) Reconstructed image

Figure 4.2: TV-$\ell_1$-$\ell_1$ image deblurring. Figure (a) shows the clean 256×256 cameraman test image, (b) shows the image obtained after multiplying it with a blur operator and adding white Gaussian noise and (c) shows the averaged sequence generated by Algorithm 2.1 after 400 iterations.

where $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, $f(x) = \lambda_2\|x\|_1 + \delta_{[0,1]^n}(x)$, $g_1 : \mathbb{R}^n \to \mathbb{R}$, $g_1(y) = \|y - b\|_1$, and $g_2 : \mathcal{Y} \to \mathbb{R}$, $g_2(y,z) = \lambda_1\|(y,z)\|_\times$. For every $p \in \mathbb{R}^n$ it holds $g_1^*(p) = \delta_{[-1,1]^n}(p) + p^Tb$ (see, for instance, [2]), while for any $(p,q) \in \mathcal{Y}$ we have $g_2^*(p,q) = \delta_S(p,q)$, with $S = \{(p,q) \in \mathcal{Y} : \|(p,q)\|_{\times*} \leq \lambda_1\}$. We solved this problem by Algorithm 2.1 and to this end we made use of the following formulae for the proximal points involved in the formulation of this iterative scheme:

$\mathrm{Prox}_{\gamma f}(x) = \arg\min_{z\in[0,1]^n}\left\{\gamma\lambda_2\|z\|_1 + \frac{1}{2}\|z - x\|^2\right\} = P_{[0,1]^n}(x - \gamma\lambda_2\mathbf{1}_n) \quad \forall x \in \mathbb{R}^n,$

$\mathrm{Prox}_{\gamma g_1^*}(p) = \arg\min_{z\in[-1,1]^n}\left\{\gamma\langle z, b\rangle + \frac{1}{2}\|z - p\|^2\right\} = P_{[-1,1]^n}(p - \gamma b) \quad \forall p \in \mathbb{R}^n,$

and

$\mathrm{Prox}_{\gamma g_2^*}(p,q) = \arg\min_{(y,z)\in S}\frac{1}{2}\|(y,z) - (p,q)\|^2 = P_S(p,q) \quad \forall (p,q) \in \mathcal{Y},$

where $\gamma \in \mathbb{R}_{++}$, $\mathbf{1}_n$ is the vector in $\mathbb{R}^n$ with all entries equal to 1 and the projection operator $P_S : \mathcal{Y} \to S$ is defined componentwise as

$(p_{i,j}, q_{i,j}) \mapsto \left(\frac{p_{i,j}}{\max\Big\{1, \frac{\sqrt{p_{i,j}^2 + q_{i,j}^2}}{\lambda_1}\Big\}},\ \frac{q_{i,j}}{\max\Big\{1, \frac{\sqrt{p_{i,j}^2 + q_{i,j}^2}}{\lambda_1}\Big\}}\right).$

Taking $x_0 \in \mathbb{R}^n$, $(v_{1,0}, v_{2,0}) \in \mathbb{R}^n\times\mathcal{Y}$, $\beta = \sqrt{1+8} = 3$, $\varepsilon \in \left(0, \frac{1}{\beta+1}\right)$ and $(\gamma_n)_{n\geq0}$ a

nondecreasing sequence in $\left[\varepsilon, \frac{1-\varepsilon}{\beta}\right]$, Algorithm 2.1 looks for this particular problem like

$(\forall n \geq 0)$
  $p_{1,n} = P_{[0,1]^n}\big(x_n - \gamma_n(A^*v_{1,n} + L^*v_{2,n} + \lambda_2\mathbf{1}_n)\big)$
  $p_{2,1,n} = P_{[-1,1]^n}\big(v_{1,n} + \gamma_n(Ax_n - b)\big)$
  $p_{2,2,n} = P_S\big(v_{2,n} + \gamma_n Lx_n\big)$
  $v_{1,n+1} = \gamma_n A(p_{1,n} - x_n) + p_{2,1,n}$
  $v_{2,n+1} = \gamma_n L(p_{1,n} - x_n) + p_{2,2,n}$
  $x_{n+1} = \gamma_n\big(A^*(v_{1,n} - p_{2,1,n}) + L^*(v_{2,n} - p_{2,2,n})\big) + p_{1,n}.$

Figure 4.2 shows the original cameraman test image, which is part of the image processing toolbox in Matlab, the image obtained after multiplying it with the blur operator and adding after that normally distributed white Gaussian noise with standard deviation $10^{-3}$, and the image reconstructed by Algorithm 2.1 when taking as regularization parameters $\lambda_1 = 3\mathrm{e}{-3}$ and $\lambda_2 = 2\mathrm{e}{-5}$.

4.3 TV-based image inpainting

In the last section of the paper we show how image inpainting problems, which aim for recovering lost information, can be efficiently solved via the primal-dual methods investigated in this work. To this end, we consider the following TV-$\ell_1$ model

$\inf_{x\in\mathbb{R}^n}\big\{\lambda\, TV_{iso}(x) + \|Kx - b\|_1 + \delta_{[0,1]^n}(x)\big\},$   (4.3)

where $\lambda \in \mathbb{R}_{++}$ is the regularization parameter, $TV_{iso} : \mathbb{R}^n \to \mathbb{R}$ is the isotropic total variation functional and $K \in \mathbb{R}^{n\times n}$ is the diagonal matrix where, for $i = 1,\dots,n$, $K_{i,i} = 0$ if the pixel $i$ in the noisy image $b \in \mathbb{R}^n$ is lost (in our case pure black) and $K_{i,i} = 1$ otherwise. The induced linear operator $K : \mathbb{R}^n \to \mathbb{R}^n$ fulfills $\|K\| = 1$, while, in the light of the considerations made in the previous two subsections, we have that $TV_{iso}(x) = \|Lx\|_\times$ for all $x \in \mathbb{R}^n$. Thus, problem (4.3) can be formulated as

$\inf_{x\in\mathbb{R}^n}\{f(x) + g_1(Lx) + g_2(Kx)\},$

where $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, $f(x) = \delta_{[0,1]^n}(x)$, $g_1 : \mathcal{Y} \to \mathbb{R}$, $g_1(y_1,y_2) = \lambda\|(y_1,y_2)\|_\times$ and $g_2 : \mathbb{R}^n \to \mathbb{R}$, $g_2(y) = \|y - b\|_1$. We solve it by Algorithm 2.1, the formulae for the proximal points involved in this iterative scheme having already been given in Subsection 4.2. Figure 4.3 shows the original fruits image, the image obtained from it after setting to pure black 80% randomly chosen pixels and the image reconstructed by Algorithm 2.1
when taking as regularization parameter $\lambda = 0.05$.

References

[1] H.H. Bauschke and P.L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics, Springer, 2011.

[2] R.I. Boţ. Conjugate Duality in Convex Optimization. Lecture Notes in Economics and Mathematical Systems, Vol. 637, Springer-Verlag Berlin Heidelberg, 2010.

(a) Original image (b) 80% missing pixels (c) Reconstructed image

Figure 4.3: TV-$\ell_1$ image inpainting. Figure (a) shows the 240×256 clean fruits image, (b) shows the same image for which 80% randomly chosen pixels were set to pure black and (c) shows the nonaveraged iterate generated by Algorithm 2.1 after 200 iterations.

[3] R.I. Boţ, S.M. Grad and G. Wanka. Duality in Vector Optimization. Springer-Verlag Berlin Heidelberg, 2009.

[4] L.M. Briceño-Arias and P.L. Combettes. A monotone + skew splitting model for composite monotone inclusions in duality. SIAM Journal on Optimization 21(4):1230-1250, 2011.

[5] A. Chambolle. An algorithm for total variation minimization and applications. Journal of Mathematical Imaging and Vision 20(1-2):89-97, 2004.

[6] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision 40(1):120-145, 2011.

[7] P.L. Combettes and J.-C. Pesquet. Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued and Variational Analysis 20(2):307-330, 2012.

[8] B.C. Vũ. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Advances in Computational Mathematics, 2011. http://dx.doi.org/10.1007/s10444-011-9254-8

[9] C. Zălinescu. Convex Analysis in General Vector Spaces. World Scientific, 2002.