Second order forward-backward dynamical systems for monotone inclusion problems

Radu Ioan Boţ (University of Vienna, Faculty of Mathematics, Oskar-Morgenstern-Platz 1, A-1090 Vienna, Austria, email: radu.bot@univie.ac.at)

Ernö Robert Csetnek (University of Vienna, Faculty of Mathematics, Oskar-Morgenstern-Platz 1, A-1090 Vienna, Austria, email: ernoe.robert.csetnek@univie.ac.at; research supported by FWF (Austrian Science Fund), Lise Meitner Programme, project M 1682-N25)

March 6, 2015

Abstract. We begin by considering second order dynamical systems of the form ẍ(t) + Γẋ(t) + λ(t)B(x(t)) = 0, where Γ : H → H is an elliptic bounded self-adjoint linear operator defined on a real Hilbert space H, B : H → H is a cocoercive operator and λ : [0,+∞) → (0,+∞) is a relaxation function depending on time. We show the existence and uniqueness of strong global solutions in the framework of the Cauchy-Lipschitz-Picard Theorem and prove weak convergence of the generated trajectories to a zero of the operator B, by using Lyapunov analysis combined with the celebrated Opial Lemma in its continuous version. The framework allows us to address from similar perspectives second order dynamical systems associated with the problem of finding zeros of the sum of a maximally monotone operator and a cocoercive one. This captures as a particular case the minimization of the sum of a nonsmooth convex function and a smooth convex one, and allows us to recover and improve several results from the literature concerning the minimization of a smooth convex function over a convex closed set by means of second order dynamical systems. When considering the unconstrained minimization of a smooth convex function, we prove a rate of O(1/t) for the convergence of the function value along the ergodic trajectory to its minimum value. A similar analysis is carried out for second order dynamical systems having as first order term γ(t)ẋ(t), where γ : [0,+∞) → (0,+∞) is a damping function depending on time.

Key Words. dynamical systems, Lyapunov analysis, monotone inclusions, convex optimization problems, continuous forward-backward method

AMS subject classification. 34G25, 47J25, 47H05, 90C25

1 Introduction and preliminaries

This paper is motivated by the heavy ball with friction dynamical system

    ẍ + γẋ + ∇f(x) = 0,    (1)

which is a nonlinear oscillator with damping γ > 0 and potential f : H → R, supposed to be a convex and differentiable function defined on the real Hilbert space H.

When H = R², the system (1) is a simplified version of the differential system describing the motion of a heavy ball that keeps rolling over the graph of the function f under its own inertia until friction stops it at a critical point of f (see []). The second order dynamical system (1) has been considered by several authors in the context of minimizing the function f, these investigations being concerned either with the convergence of the generated trajectories to a critical point of f or with the convergence of the function values along the trajectories to the global minimum value (see [3, 7, 8]). It is also worth mentioning that the time discretization of the heavy ball with friction dynamical system leads to the so-called inertial-type algorithms, which are numerical schemes sharing the feature that the current iterate of the generated sequence is defined by making use of the previous two iterates (see, for instance, [3-5, 8, 2, 23]).

In order to approach the minimization of f over a nonempty, convex and closed set C ⊆ H, the gradient-projection second order dynamical system

    ẍ + γẋ + x − P_C(x − η∇f(x)) = 0    (2)

has been considered, where P_C : H → C denotes the projection onto the set C and η > 0. Convergence statements for the trajectories to a global minimizer of f over C have been provided in [7, 8]. Furthermore, in [8], these investigations have been extended to more general second order dynamical systems of the form

    ẍ + γẋ + x − Tx = 0,    (3)

where T : H → H is a nonexpansive operator. It has been shown that, when γ² > 2, the trajectory of (3) converges weakly to an element of the set of fixed points of T, provided this set is nonempty.

In the first part of the present manuscript we treat the second order dynamical system

    ẍ(t) + Γẋ(t) + λ(t)B(x(t)) = 0,    (4)

where Γ : H → H is an elliptic bounded self-adjoint linear operator, B : H → H is a cocoercive operator and λ : [0,+∞) → (0,+∞) is a relaxation function depending on time. We notice that the presence of the elliptic operator induces an anisotropic damping and refer to [3], where a similar construction has been used in the context of minimizing a convex and smooth function. The existence and uniqueness of strong global solutions of (4) is obtained by applying the classical Cauchy-Lipschitz-Picard Theorem (see [2, 27]). We show that, under mild assumptions on the relaxation function, the trajectory x(t) converges weakly as t → +∞ to a zero of the operator B, provided B has a nonempty set of zeros. To this end we use Lyapunov analysis combined with the continuous version of the Opial Lemma (see also [3, 7, 8], where similar techniques have been used).

Further, we approach the problem of finding a zero of the sum of a maximally monotone operator and a cocoercive one via a second order dynamical system formulated by making use of the resolvent of the set-valued operator, see (31). Dynamical systems of implicit type have already been considered in the literature in [, 2, 9, 2, 4, 6, 7]. We specialize these investigations to the minimization of the sum of a nonsmooth convex function and a smooth convex one, a fact which allows us to recover and improve results given in [7, 8] in the context of studying the dynamical system (2). Whenever B is the gradient of a smooth convex function, we show that the latter converges along the ergodic trajectories generated by (4) to its minimum value with a rate of convergence of O(1/t).
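The passage from (1) to inertial algorithms mentioned above can be made concrete by a simple finite-difference scheme. The following sketch is ours and purely illustrative (the quadratic test function, the step size h and the damping γ are arbitrary choices, not taken from the paper): discretizing ẍ by (x_{k+1} − 2x_k + x_{k−1})/h² and ẋ by (x_k − x_{k−1})/h turns (1) into an iteration built from the previous two iterates.

```python
import numpy as np

# Explicit discretization of the heavy ball system x'' + gamma x' + grad f(x) = 0:
#   x_{k+1} = x_k + (1 - gamma*h) * (x_k - x_{k-1}) - h^2 * grad f(x_k),
# i.e. an inertial-type scheme that uses the previous two iterates.
def heavy_ball(grad_f, x0, gamma=2.0, h=0.1, iters=500):
    alpha, beta = 1.0 - gamma * h, h ** 2      # inertial and gradient coefficients
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x_prev, x = x, x + alpha * (x - x_prev) - beta * grad_f(x)
    return x

# Illustrative convex quadratic f(x) = 1/2 ||A x - b||^2 with minimizer solving A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: A.T @ (A @ x - b)

print(np.allclose(heavy_ball(grad_f, np.zeros(2)), np.linalg.solve(A, b), atol=1e-6))
```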

We close the paper by showing that a similar analysis can be carried out when taking as starting point dynamical systems of the form

    ẍ(t) + γ(t)ẋ(t) + λ(t)B(x(t)) = 0,    (5)

where the damping coefficient γ : [0,+∞) → (0,+∞) is a function depending on time.

Throughout this paper N = {0, 1, 2, ...} denotes the set of nonnegative integers and H a real Hilbert space with inner product ⟨·,·⟩ and corresponding norm ‖·‖ = √⟨·,·⟩.

2 A dynamical system: existence and uniqueness of strong global solutions

This section is devoted to the study of the existence and uniqueness of strong global solutions of a second order dynamical system governed by Lipschitz continuous operators. Let Γ : H → H be an L_Γ-Lipschitz continuous operator (that is, L_Γ ≥ 0 and ‖Γx − Γy‖ ≤ L_Γ‖x − y‖ for all x, y ∈ H), B : H → H an L_B-Lipschitz continuous operator, λ : [0,+∞) → (0,+∞) a Lebesgue measurable function, u, v ∈ H, and consider the dynamical system

    ẍ(t) + Γẋ(t) + λ(t)B(x(t)) = 0, x(0) = u, ẋ(0) = v.    (6)

As in [2, 2], we consider the following definition of an absolutely continuous function.

Definition 1 (see, for instance, [2, 2]) A function x : [0, b] → H (where b > 0) is said to be absolutely continuous if one of the following equivalent properties holds:
(i) there exists an integrable function y : [0, b] → H such that x(t) = x(0) + ∫₀ᵗ y(s) ds for all t ∈ [0, b];
(ii) x is continuous and its distributional derivative is Lebesgue integrable on [0, b];
(iii) for every ε > 0 there exists η > 0 such that for any finite family of intervals I_k = (a_k, b_k) we have the implication: I_k ∩ I_j = ∅ and ∑_k (b_k − a_k) < η imply ∑_k ‖x(b_k) − x(a_k)‖ < ε.

Remark 1 (a) It follows from the definition that an absolutely continuous function is differentiable almost everywhere, its derivative coincides with its distributional derivative almost everywhere, and one can recover the function from its derivative ẋ = y by the integration formula (i).
(b) If x : [0, b] → H (where b > 0) is absolutely continuous and B : H → H is L-Lipschitz continuous (where L ≥ 0), then the function z = B ∘ x is absolutely continuous, too. This can easily be seen by using the characterization of absolute continuity in Definition 1(iii). Moreover, z is almost everywhere differentiable and the inequality ‖ż(·)‖ ≤ L‖ẋ(·)‖ holds almost everywhere.

Definition 2 We say that x : [0,+∞) → H is a strong global solution of (6) if the following properties are satisfied:
(i) x, ẋ : [0,+∞) → H are locally absolutely continuous, in other words, absolutely continuous on each interval [0, b] for 0 < b < +∞;
(ii) ẍ(t) + Γẋ(t) + λ(t)B(x(t)) = 0 for almost every t ∈ [0,+∞);
(iii) x(0) = u and ẋ(0) = v.

For proving the existence and uniqueness of strong global solutions of (6) we use the Cauchy-Lipschitz-Picard Theorem for absolutely continuous trajectories (see, for example, [2, Proposition 6.2.1], [27, Theorem 54]). The key observation here is that one can rewrite (6) as a certain first order dynamical system in a product space (see also [6]).

Theorem 2 Let Γ : H → H be an L_Γ-Lipschitz continuous operator, B : H → H an L_B-Lipschitz continuous operator and λ : [0,+∞) → (0,+∞) a Lebesgue measurable function such that λ ∈ L¹_loc([0,+∞)) (that is, λ ∈ L¹([0, b]) for every 0 < b < +∞). Then for each u, v ∈ H there exists a unique strong global solution of the dynamical system (6).

Proof. The system (6) can be equivalently written as a first order dynamical system in the phase space H × H:

    Ẏ(t) = F(t, Y(t)), Y(0) = (u, v),    (7)

with
    Y : [0,+∞) → H × H, Y(t) = (x(t), ẋ(t))
and
    F : [0,+∞) × H × H → H × H, F(t, u, v) = (v, −Γv − λ(t)Bu).
We endow H × H with the scalar product ⟨(u, v), (ū, v̄)⟩_{H×H} = ⟨u, ū⟩ + ⟨v, v̄⟩ and corresponding norm ‖(u, v)‖_{H×H} = √(‖u‖² + ‖v‖²).

(a) For arbitrary u, ū, v, v̄ ∈ H, by using the Lipschitz continuity of the involved operators, we obtain

    ‖F(t, u, v) − F(t, ū, v̄)‖_{H×H} = √(‖v − v̄‖² + ‖Γv − Γv̄ + λ(t)(Bu − Bū)‖²)
    ≤ √(‖v − v̄‖² + 2L_Γ²‖v − v̄‖² + 2L_B²λ²(t)‖u − ū‖²)
    ≤ √(1 + 2L_Γ² + 2L_B²λ²(t)) ‖(u, v) − (ū, v̄)‖_{H×H}
    ≤ (1 + √2 L_Γ + √2 L_B λ(t)) ‖(u, v) − (ū, v̄)‖_{H×H} for all t ≥ 0.

As λ ∈ L¹_loc([0,+∞)), the Lipschitz constant of F(t, ·, ·) is locally integrable.

(b) Next we show that

    for all u, v ∈ H and all b > 0, F(·, u, v) ∈ L¹([0, b], H × H).    (8)

For arbitrary u, v ∈ H and b > 0 it holds

    ∫₀ᵇ ‖F(t, u, v)‖_{H×H} dt = ∫₀ᵇ √(‖v‖² + ‖Γv + λ(t)Bu‖²) dt ≤ ∫₀ᵇ √(‖v‖² + 2‖Γv‖² + 2λ²(t)‖Bu‖²) dt ≤ ∫₀ᵇ (‖v‖ + √2‖Γv‖ + √2 λ(t)‖Bu‖) dt

and from here (8) follows, by using the assumptions made on λ.

In the light of the statements (a) and (b), the existence and uniqueness of a strong global solution of (7) are consequences of the Cauchy-Lipschitz-Picard Theorem for first order dynamical systems (see, for example, [2, Proposition 6.2.1], [27, Theorem 54]). From here, due to the equivalence of (6) and (7), the conclusion follows.

3 Convergence of the trajectories

In this section we address the convergence properties of the trajectories generated by the dynamical system (6) by assuming that B : H → H is a β-cocoercive operator for β > 0, that is,

    β‖Bx − By‖² ≤ ⟨x − y, Bx − By⟩ for all x, y ∈ H.

To this end we will make use of the following well-known results, which can be interpreted as continuous versions of the quasi-Fejér monotonicity for sequences. For their proofs we refer the reader to [2, Lemma 5.1] and [2, Lemma 5.2], respectively.

Lemma 3 Suppose that F : [0,+∞) → R is locally absolutely continuous and bounded below and that there exists G ∈ L¹([0,+∞)) such that for almost every t ∈ [0,+∞)
    d/dt F(t) ≤ G(t).
Then there exists lim_{t→+∞} F(t) ∈ R.

Lemma 4 If 1 ≤ p < ∞, 1 ≤ r ≤ ∞, F : [0,+∞) → [0,+∞) is locally absolutely continuous, F ∈ L^p([0,+∞)), G : [0,+∞) → R, G ∈ L^r([0,+∞)) and for almost every t ∈ [0,+∞)
    d/dt F(t) ≤ G(t),
then lim_{t→+∞} F(t) = 0.

The next result which we recall here is the continuous version of the Opial Lemma (see, for example, [2, Lemma 5.3], [, Lemma .]).

Lemma 5 Let S ⊆ H be a nonempty set and x : [0,+∞) → H a given map. Assume that
(i) for every x* ∈ S, lim_{t→+∞} ‖x(t) − x*‖ exists;
(ii) every weak sequential cluster point of the map x belongs to S.
Then there exists x̄ ∈ S such that x(t) converges weakly to x̄ as t → +∞.

In order to prove the convergence of the trajectories of (6), we make the following assumptions on the operator Γ and the relaxation function λ, respectively:

(A1) Γ : H → H is a bounded self-adjoint linear operator, assumed to be elliptic, that is, there exists γ > 0 such that ⟨Γu, u⟩ ≥ γ‖u‖² for all u ∈ H;

(A2) λ : [0,+∞) → (0,+∞) is locally absolutely continuous and there exists θ > 0 such that for almost every t ∈ [0,+∞) we have

    λ̇(t) ≥ 0 and λ(t) ≤ βγ²/(1 + θ).    (9)
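As a numerical aside (not part of the paper's analysis), the first order reformulation (7) used in the proof of Theorem 2 also gives a direct way to integrate (6) with a standard ODE solver. In the sketch below everything concrete is our own choice: H = R², Γ = γ Id with γ = 3, a constant relaxation λ ≡ 1 and B a symmetric positive definite matrix, which is cocoercive with β = 1/λ_max; these values are picked so that (A1) and (A2) can be satisfied.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Phase-space form of (6): Y = (x, v), x' = v, v' = -Gamma v - lambda(t) B x.
gamma_ = 3.0                                  # Gamma = gamma * Id, elliptic with constant gamma
lam = lambda t: 1.0                           # constant relaxation function
M = np.array([[2.0, 0.5], [0.5, 1.0]])        # symmetric positive definite
B = lambda x: M @ x                           # cocoercive with beta = 1/lambda_max(M); zer B = {0}

def F(t, Y):
    x, v = Y[:2], Y[2:]
    return np.concatenate([v, -gamma_ * v - lam(t) * B(x)])

Y0 = np.array([2.0, -1.0, 0.0, 0.0])          # x(0) = u, x'(0) = v
sol = solve_ivp(F, (0.0, 50.0), Y0, rtol=1e-8, atol=1e-10)

print(np.linalg.norm(B(sol.y[:2, -1])))       # ~0: the trajectory approaches a zero of B
```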

Due to Definition 1 and Remark 1(a), λ̇(t) exists for almost every t ≥ 0 and λ̇ is Lebesgue integrable on each interval [0, b] for 0 < b < +∞. Since λ̇(t) ≥ 0 for almost every t, λ is monotonically increasing, thus, as λ is assumed to take only positive values, (A2) yields the existence of the lower bound λ(0) such that for almost every t ∈ [0,+∞) one has

    0 < λ(0) ≤ λ(t) ≤ βγ²/(1 + θ).    (10)

Theorem 6 Let B : H → H be a β-cocoercive operator for β > 0 such that zer B := {u ∈ H : Bu = 0} ≠ ∅, Γ : H → H be an operator fulfilling (A1), λ : [0,+∞) → (0,+∞) be a function fulfilling (A2) and u, v ∈ H. Let x : [0,+∞) → H be the unique strong global solution of (6). Then the following statements are true:
(i) the trajectory x is bounded and ẋ, ẍ, B(x(·)) ∈ L²([0,+∞); H);
(ii) lim_{t→+∞} ẋ(t) = lim_{t→+∞} ẍ(t) = lim_{t→+∞} B(x(t)) = 0;
(iii) x(t) converges weakly to an element of zer B as t → +∞.

Proof. Notice that the existence and uniqueness of the trajectory x follows from Theorem 2, since B is 1/β-Lipschitz continuous, Γ is ‖Γ‖-Lipschitz continuous and (A2) ensures λ ∈ L¹_loc([0,+∞)).

(i) Take an arbitrary x* ∈ zer B and consider for every t ∈ [0,+∞) the function h(t) = ½‖x(t) − x*‖². We have ḣ(t) = ⟨x(t) − x*, ẋ(t)⟩ and ḧ(t) = ‖ẋ(t)‖² + ⟨x(t) − x*, ẍ(t)⟩ for almost every t ∈ [0,+∞). Taking into account (6), we get for almost every t ∈ [0,+∞)

    ḧ(t) + γḣ(t) + λ(t)⟨x(t) − x*, B(x(t))⟩ + ⟨x(t) − x*, Γẋ(t) − γẋ(t)⟩ = ‖ẋ(t)‖².    (11)

Now we introduce the function p : [0,+∞) → R,

    p(t) = ½⟨(Γ − γ Id)(x(t) − x*), x(t) − x*⟩,    (12)

where Id denotes the identity operator on H. Due to (A1), as ⟨(Γ − γ Id)u, u⟩ ≥ 0 for all u ∈ H, it holds

    p(t) ≥ 0 for all t ≥ 0.    (13)

Moreover, ṗ(t) = ⟨(Γ − γ Id)ẋ(t), x(t) − x*⟩, which combined with (11), the cocoercivity of B and the fact that Bx* = 0 yields for almost every t ∈ [0,+∞)

    ḧ(t) + γḣ(t) + λ(t)β‖B(x(t))‖² + ṗ(t) ≤ ‖ẋ(t)‖².

Taking into account (6) one obtains for almost every t ∈ [0,+∞)

    ḧ(t) + γḣ(t) + (β/λ(t))‖ẍ(t) + Γẋ(t)‖² + ṗ(t) ≤ ‖ẋ(t)‖²,

hence

    ḧ(t) + γḣ(t) + (β/λ(t))‖ẍ(t)‖² + (2β/λ(t))⟨ẍ(t), Γẋ(t)⟩ + (β/λ(t))‖Γẋ(t)‖² + ṗ(t) ≤ ‖ẋ(t)‖².    (14)

According to (A1) we have

    γ‖u‖ ≤ ‖Γu‖ for all u ∈ H,    (15)

which combined with (14) yields for almost every t ∈ [0,+∞)

    ḧ(t) + γḣ(t) + ṗ(t) + (β/λ(t)) d/dt⟨ẋ(t), Γẋ(t)⟩ + (βγ²/λ(t))‖ẋ(t)‖² + (β/λ(t))‖ẍ(t)‖² ≤ ‖ẋ(t)‖².

By taking into account that for almost every t ∈ [0,+∞)

    (β/λ(t)) d/dt⟨ẋ(t), Γẋ(t)⟩ = d/dt[(β/λ(t))⟨ẋ(t), Γẋ(t)⟩] + (βλ̇(t)/λ²(t))⟨ẋ(t), Γẋ(t)⟩ ≥ d/dt[(β/λ(t))⟨ẋ(t), Γẋ(t)⟩] + (βγλ̇(t)/λ²(t))‖ẋ(t)‖²,    (16)

we obtain for almost every t ∈ [0,+∞)

    ḧ(t) + γḣ(t) + ṗ(t) + d/dt[(β/λ(t))⟨ẋ(t), Γẋ(t)⟩] + (βγ²/λ(t) + βγλ̇(t)/λ²(t) − 1)‖ẋ(t)‖² + (β/λ(t))‖ẍ(t)‖² ≤ 0.    (17)

By using now assumption (A2) we obtain that the following inequality holds for almost every t ∈ [0,+∞):

    ḧ(t) + γḣ(t) + ṗ(t) + d/dt[(β/λ(t))⟨ẋ(t), Γẋ(t)⟩] + θ‖ẋ(t)‖² + ((1 + θ)/γ²)‖ẍ(t)‖² ≤ 0.    (18)

This implies that the function t ↦ ḣ(t) + γh(t) + p(t) + (β/λ(t))⟨ẋ(t), Γẋ(t)⟩, which is locally absolutely continuous, is monotonically decreasing. Hence there exists a real number M such that for almost every t ∈ [0,+∞)

    ḣ(t) + γh(t) + p(t) + (β/λ(t))⟨ẋ(t), Γẋ(t)⟩ ≤ M,    (19)

which yields, together with (13) and (A2), that for almost every t ∈ [0,+∞)

    ḣ(t) + γh(t) ≤ M.

By multiplying this inequality with e^{γt} and then integrating from 0 to T, where T > 0, one easily obtains

    h(T) ≤ h(0)e^{−γT} + (M/γ)(1 − e^{−γT}),

thus

    h is bounded    (20)

and, consequently,

    the trajectory x is bounded.    (21)

On the other hand, from (19), by taking into account (13), (A1) and (A2), it follows that for almost every t ∈ [0,+∞)

    ḣ(t) + ((1 + θ)/γ)‖ẋ(t)‖² ≤ M,

hence

    ⟨x(t) − x*, ẋ(t)⟩ + ((1 + θ)/γ)‖ẋ(t)‖² ≤ M.

This inequality, in combination with (21), yields that

    ẋ is bounded,    (22)

which further implies that

    ḣ is bounded.    (23)

Integrating the inequality (18) we obtain that there exists a real number N ∈ R such that for almost every t ∈ [0,+∞)

    ḣ(t) + γh(t) + p(t) + (β/λ(t))⟨ẋ(t), Γẋ(t)⟩ + θ ∫₀ᵗ ‖ẋ(s)‖² ds + ((1 + θ)/γ²) ∫₀ᵗ ‖ẍ(s)‖² ds ≤ N.

From here, via (23), (13) and (A1), we conclude that ẋ, ẍ ∈ L²([0,+∞); H). Finally, from (6), (A1) and (A2) we deduce B(x(·)) ∈ L²([0,+∞); H) and the proof of (i) is complete.

(ii) For almost every t ∈ [0,+∞) it holds

    d/dt(½‖ẋ(t)‖²) = ⟨ẋ(t), ẍ(t)⟩ ≤ ½‖ẋ(t)‖² + ½‖ẍ(t)‖²

and Lemma 4 together with (i) lead to lim_{t→+∞} ẋ(t) = 0. Further, by taking into consideration Remark 1(b), for almost every t ∈ [0,+∞) we have

    d/dt(½‖B(x(t))‖²) = ⟨B(x(t)), d/dt B(x(t))⟩ ≤ ½‖B(x(t))‖² + (1/(2β²))‖ẋ(t)‖².

By using again Lemma 4 and (i) we get lim_{t→+∞} B(x(t)) = 0, while the fact that lim_{t→+∞} ẍ(t) = 0 follows from (6), (A1) and (A2).

(iii) We are going to prove that both assumptions of the Opial Lemma are fulfilled. The first one concerns the existence of lim_{t→+∞} ‖x(t) − x*‖. As seen in the proof of part (i), the function t ↦ ḣ(t) + γh(t) + p(t) + (β/λ(t))⟨ẋ(t), Γẋ(t)⟩ is monotonically decreasing, thus from (i), (ii), (13), (A1) and (A2) we deduce that lim_{t→+∞}(γh(t) + p(t)) exists and is a real number. It remains to prove that lim_{t→+∞} p(t) exists and is a real number, and this will prove the first part of the Opial Lemma. Indeed, from (18) we get that for almost every t ∈ [0,+∞)

    ṗ(t) ≤ −ḧ(t) − γḣ(t) − d/dt[(β/λ(t))⟨ẋ(t), Γẋ(t)⟩].    (24)

On the other hand, by (A1), for every T ≥ 0 we have

    ∫₀ᵀ [−ḧ(t) − γḣ(t) − d/dt((β/λ(t))⟨ẋ(t), Γẋ(t)⟩)] dt = −ḣ(T) − γh(T) − (β/λ(T))⟨ẋ(T), Γẋ(T)⟩ + ḣ(0) + γh(0) + (β/λ(0))⟨ẋ(0), Γẋ(0)⟩ ≤ −ḣ(T) + ḣ(0) + γh(0) + (β/λ(0))⟨ẋ(0), Γẋ(0)⟩.

Since lim_{T→+∞} ḣ(T) = 0 (see (i) and (ii)), we deduce that the function t ↦ −ḧ(t) − γḣ(t) − d/dt[(β/λ(t))⟨ẋ(t), Γẋ(t)⟩] belongs to L¹([0,+∞)). From Lemma 3 it follows that lim_{t→+∞} p(t) exists and is a real number.

We come now to the second assumption of the Opial Lemma. Let x̄ be a weak sequential cluster point of x, that is, there exists a sequence t_n → +∞ (as n → +∞) such that (x(t_n))_{n∈N} converges weakly to x̄. Since B is a maximally monotone operator (see, for instance, [3, Example 20.28]), its graph is sequentially closed with respect to the weak-strong topology of the product space H × H. By using also that lim_{n→+∞} B(x(t_n)) = 0, we conclude that Bx̄ = 0, hence x̄ ∈ zer B and the proof is complete.

A standard instance of a cocoercive operator defined on a real Hilbert space is one that can be represented as B = Id − T, where T : H → H is a nonexpansive operator, that is, a 1-Lipschitz continuous operator. As easily follows from the nonexpansiveness of T, B is in this case 1/2-cocoercive. For this particular choice of the operator B, the dynamical system (6) becomes

    ẍ(t) + Γẋ(t) + λ(t)[x(t) − T(x(t))] = 0, x(0) = u, ẋ(0) = v,    (25)

while assumption (A2) reads:

(A3) λ : [0,+∞) → (0,+∞) is locally absolutely continuous and there exists θ > 0 such that for almost every t ∈ [0,+∞) we have

    λ̇(t) ≥ 0 and λ(t) ≤ γ²/(2(1 + θ)).    (26)

Theorem 6 gives rise to the following result.

Since lim T + ḣt = see i and ii, we deduce that the function t ḧt γḣt d dt λt ẋt, Γẋt is in L [, +. From Lemma 3 it follows that there exists lim t + pt R. We come now to the second assumption of the Opial Lemma. Let x be a weak sequential cluster point of x, that is, there exists a sequence t n + as n + such that xt n n N converges weakly to x. Since B is a maximally monotone operator see for instance [3, Example 2.28], its graph is sequentially closed with respect to the weakstrong topology of the product space H H. By using also that lim n + Bxt n =, we conclude that Bx =, hence x zer B and the proof is complete. A standard instance of a cocoercive operator defined on a real Hilbert spaces is the one that can be represented as B = Id T, where T : H H is a nonexpansive operator, that is, a -Lipschitz continuous operator. As it easily follows from the nonexpansiveness of T, B is in this case /2-cocoercive. For this particular choice of the operator B, the dynamical system 6 becomes { ẍt + Γẋt + λt xt T xt = 25 x = u, ẋ = v, while assumption A2 reads A3 λ : [, +, + is locally absolutely continuous and there exists θ > such that for almost every t [, + we have λt and λt Theorem 6 gives rise to the following result. γ 2 2 + θ. 26 Corollary 7 Let T : H H be a nonexpansive operator such that Fix T = {u H : T u = u} =, Γ : H H be an operator fulfilling A, λ : [, +, + be a function fulfilling A3 and u, v H. Let x : [, + H be the unique strong global solution of 25. Then the following statements are true: i the trajectory x is bounded and ẋ, ẍ, Id T x L 2 [, + ; H; ii lim t + ẋt = lim t + ẍt = lim t + Id T xt = ; iii xt converges weakly to a point in Fix T as t +. Remark 8 In the particular case when Γ = γ Id for γ > and λt = for all t [, + the dynamical system 25 becomes { ẍt + γẋt + xt T xt = 27 x = u, ẋ = v. The convergence of the trajectories generated by 27 has been studied in [8, Theorem 3.2] under the condition γ 2 > 2. In this case A3 is obviously fulfilled for an arbitrary < θ γ 2 2/2. However, different to [8], we allow in Corollary 7 an anisotropic damping through the use of the elliptic operator Γ and also a variable relaxation function λ depending on time in [3] the anisotropic damping has been considered as well in the context of minimizing of a smooth convex function via second order dynamical systems. 9

We close the section by addressing an immediate consequence of the above corollary applied to second order dynamical systems governed by averaged operators. The operator R : H → H is said to be α-averaged for α ∈ (0, 1), if there exists a nonexpansive operator T : H → H such that R = (1 − α)Id + αT. For α = 1/2 we obtain as an important representative of this class the firmly nonexpansive operators. For properties and insights concerning these families of operators we refer to the monograph [3].

We consider the dynamical system

    ẍ(t) + Γẋ(t) + λ(t)[x(t) − R(x(t))] = 0, x(0) = u, ẋ(0) = v    (28)

and formulate the assumption

(A4) λ : [0,+∞) → (0,+∞) is locally absolutely continuous and there exists θ > 0 such that for almost every t ∈ [0,+∞) we have

    λ̇(t) ≥ 0 and λ(t) ≤ γ²/(2α(1 + θ)).    (29)

Corollary 9 Let R : H → H be an α-averaged operator for α ∈ (0, 1) such that Fix R ≠ ∅, Γ : H → H be an operator fulfilling (A1), λ : [0,+∞) → (0,+∞) be a function fulfilling (A4) and u, v ∈ H. Let x : [0,+∞) → H be the unique strong global solution of (28). Then the following statements are true:
(i) the trajectory x is bounded and ẋ, ẍ, (Id − R)(x(·)) ∈ L²([0,+∞); H);
(ii) lim_{t→+∞} ẋ(t) = lim_{t→+∞} ẍ(t) = lim_{t→+∞} (Id − R)(x(t)) = 0;
(iii) x(t) converges weakly to a point in Fix R as t → +∞.

Proof. Since R is α-averaged, there exists a nonexpansive operator T : H → H such that R = (1 − α)Id + αT. The conclusion is a direct consequence of Corollary 7, by taking into account that (28) is equivalent to

    ẍ(t) + Γẋ(t) + αλ(t)[x(t) − T(x(t))] = 0, x(0) = u, ẋ(0) = v,

and Fix R = Fix T.

4 Forward-backward second order dynamical systems

In this section we address the monotone inclusion problem

    find x ∈ H such that 0 ∈ Ax + Bx,

where A : H ⇉ H is a maximally monotone operator and B : H → H is a β-cocoercive operator for β > 0, via a second order forward-backward dynamical system with anisotropic damping and variable relaxation parameter.

For the reader's convenience we recall at the beginning some standard notions and results in monotone operator theory which will be used in the following (see also [3, 5, 26]). For an arbitrary set-valued operator A : H ⇉ H we denote by Gr A = {(x, u) ∈ H × H : u ∈ Ax} its graph. We use also the notation zer A = {x ∈ H : 0 ∈ Ax} for the set of zeros

of A. We say that A is monotone if ⟨x − y, u − v⟩ ≥ 0 for all (x, u), (y, v) ∈ Gr A. A monotone operator A is said to be maximally monotone, if there exists no proper monotone extension of the graph of A on H × H. The resolvent of A, J_A : H ⇉ H, is defined by J_A = (Id + A)^{-1}. If A is maximally monotone, then J_A : H → H is single-valued and maximally monotone (see [3, Proposition 23.7 and Corollary 23.10]). For an arbitrary γ > 0 we have (see [3, Proposition 23.2])

    p = J_{γA}x if and only if (p, γ^{-1}(x − p)) ∈ Gr A.    (30)

The operator A is said to be uniformly monotone if there exists an increasing function φ_A : [0,+∞) → [0,+∞] that vanishes only at 0 and fulfills ⟨x − y, u − v⟩ ≥ φ_A(‖x − y‖) for every (x, u) ∈ Gr A and (y, v) ∈ Gr A. A popular class of operators having this property is that of the strongly monotone operators. We say that A is γ-strongly monotone for γ > 0, if ⟨x − y, u − v⟩ ≥ γ‖x − y‖² for all (x, u), (y, v) ∈ Gr A.

For η > 0 we consider the dynamical system

    ẍ(t) + Γẋ(t) + λ(t)[x(t) − J_{ηA}(x(t) − ηB(x(t)))] = 0, x(0) = u, ẋ(0) = v.    (31)

We formulate the following assumption, where δ := min{1, β/η} + 1/2:

(A5) λ : [0,+∞) → (0,+∞) is locally absolutely continuous and there exists θ > 0 such that for almost every t ∈ [0,+∞) we have

    λ̇(t) ≥ 0 and λ(t) ≤ δγ²/(2(1 + θ)).    (32)

Theorem 10 Let A : H ⇉ H be a maximally monotone operator and B : H → H be a β-cocoercive operator for β > 0 such that zer(A + B) ≠ ∅. Let η ∈ (0, 2β) and set δ := min{1, β/η} + 1/2. Let Γ : H → H be an operator fulfilling (A1), λ : [0,+∞) → (0,+∞) be a function fulfilling (A5), u, v ∈ H and x : [0,+∞) → H be the unique strong global solution of (31). Then the following statements are true:
(i) the trajectory x is bounded and ẋ, ẍ, (Id − J_{ηA} ∘ (Id − ηB))(x(·)) ∈ L²([0,+∞); H);
(ii) lim_{t→+∞} ẋ(t) = lim_{t→+∞} ẍ(t) = lim_{t→+∞} (Id − J_{ηA} ∘ (Id − ηB))(x(t)) = 0;
(iii) x(t) converges weakly to a point in zer(A + B) as t → +∞;
(iv) if x* ∈ zer(A + B), then B(x(·)) − Bx* ∈ L²([0,+∞); H), lim_{t→+∞} B(x(t)) = Bx* and B is constant on zer(A + B);
(v) if A or B is uniformly monotone, then x(t) converges strongly to the unique point in zer(A + B) as t → +∞.

Proof. (i)-(iii) It is immediate that the dynamical system (31) can be written in the form

    ẍ(t) + Γẋ(t) + λ(t)[x(t) − R(x(t))] = 0, x(0) = u, ẋ(0) = v,    (33)

where R = J_{ηA} ∘ (Id − ηB). According to [3, Corollary 23.8 and Remark 4.24(iii)], J_{ηA} is firmly nonexpansive and thus 1/2-averaged. Moreover, by [3, Proposition 4.33], Id − ηB is η/(2β)-averaged. Combining this with [3, Proposition 4.32], we derive that R is 1/δ-averaged. The statements (i)-(iii) follow now from Corollary 9 by noticing that Fix R = zer(A + B) (see [3, Proposition 25.1(iv)]).

iv The fact that B is constant on zera + B follows from the cocoercivity of B and the monotonicity of A. A proof of this statement when A is the subdifferential of a proper, convex and lower semicontinuous function is given for instance in [, Lemma.7]. Take an arbitrary x zera + B. From the definition of the resolvent we have for almost every t [, + Bxt ηλtẍt ηλt Γẋt A λtẍt + Γẋt + xt, 34 λt which combined with Bx Ax and the monotonicity of A leads to λtẍt + λt Γẋt + xt x, Bxt + Bx ηλtẍt ηλt Γẋt. 35 After using the cocoercivity of B we obtain for almost every t [, + Bxt Bx 2 λtẍt+ Γẋt, Bxt + Bx λt ηλ 2 ẍt + Γẋt 2 t + xt x, ηλtẍt ηλt Γẋt 2 λtẍt + λt Γẋt 2 + 2 Bxt Bx 2 + xt x, ηλtẍt ηλt Γẋt. For evaluating the last term of the above inequality we use the functions h : [, + R, ht = 2 xt x 2 and p : [, + R, pt = 2 Γ γ Id xt x, xt x, already used in the proof of Theorem 6. For almost every t [, + we have and xt x, ẍt = ḧt ẋt 2 ṗt = xt x, Γẋt γ xt x, ẋt = xt x, Γẋt γḣt, hence xt x, ηλtẍt ηλt Γẋt = ḧt + γ ḣt + ṗt ẋt 2. ηλt 36 Consequently, for almost every t [, + it holds 2 Bxt Bx 2 2 λtẍt + λt Γẋt 2 ḧt + γ ḣt + ṗt ẋt 2. 37 ηλt By taking into account A5 we obtain a lower bound λ such that for almost every t [, + one has < λ λt δγ2 2 + θ. 2

By multiplying 37 with λt we obtain for almost every t [, + that λ 2 Bxt Bx 2 + ḧt + γ ḣt + ṗt η 2λ ẍt + Γẋt 2 + η ẋt 2. After integration we obtain that for every T [, + λ 2 T T Bxt Bx 2 dt + η 2λ ẍt + Γẋt 2 + η ẋt 2 dt. ḣt ḣ + γht γh + pt p As ẋ, ẍ L 2 [, + ; H, ht, pt for every T [, + and lim T + ḣt =, it follows that Bx Bx L 2 [, + ; H. Further, by taking into consideration Remark b, we have d dt 2 Bxt Bx 2 = Bxt Bx, ddt Bxt 2 Bxt Bx 2 + 2 2 ẋt 2 and from here, in light of Lemma 4, it follows that lim t + Bxt = Bx. v Let x be the unique element of zera + B. For the beginning we suppose that A is uniformly monotone with corresponding function φ A : [, + [, + ], which is increasing and vanishes only at. By similar arguments as in the proof of statement iv, for almost every t [, + we have φ A λtẍt + Γẋt + xt x λt λtẍt + λt Γẋt + xt x, Bxt + Bx ηλtẍt ηλt Γẋt, which combined with the inequality xt x, Bxt Bx yields φ A λtẍt + Γẋt + xt x λt λtẍt + Γẋt, Bxt + Bx λt ηλ 2 t ẍt + Γẋt 2 + xt x, ηλtẍt ηλt Γẋt λtẍt + Γẋt, Bxt + Bx + xt x, λt ηλtẍt ηλt Γẋt. 3

As λ is bounded by positive constants, by using (i)-(iv) it follows that the right-hand side of the last inequality converges to 0 as t → +∞. Hence

    lim_{t→+∞} φ_A(‖(1/λ(t))ẍ(t) + (1/λ(t))Γẋ(t) + x(t) − x*‖) = 0

and the properties of the function φ_A allow us to conclude that (1/λ(t))ẍ(t) + (1/λ(t))Γẋ(t) + x(t) − x* converges strongly to 0 as t → +∞. By using again the boundedness of λ and (ii) we obtain that x(t) converges strongly to x* as t → +∞.

Finally, suppose that B is uniformly monotone with corresponding function φ_B : [0,+∞) → [0,+∞], which is increasing and vanishes only at 0. The conclusion follows by letting t in the inequality

    ⟨x(t) − x*, B(x(t)) − Bx*⟩ ≥ φ_B(‖x(t) − x*‖) for all t ∈ [0,+∞)

converge to +∞ and by using that x is bounded and lim_{t→+∞} ‖B(x(t)) − Bx*‖ = 0.

In the remainder of this section we turn our attention to optimization problems of the form

    min_{x∈H} f(x) + g(x),

where f : H → R ∪ {+∞} is a proper, convex and lower semicontinuous function and g : H → R is a convex and Fréchet differentiable function with 1/β-Lipschitz continuous gradient for β > 0.

We recall some standard notations and facts in convex analysis. For a proper, convex and lower semicontinuous function f : H → R ∪ {+∞}, its convex subdifferential at x ∈ H is defined as ∂f(x) = {u ∈ H : f(y) ≥ f(x) + ⟨u, y − x⟩ for all y ∈ H}. When seen as a set-valued mapping, it is a maximally monotone operator (see [24]) and its resolvent is given by J_{η∂f} = prox_{ηf} (see [3]), where prox_{ηf} : H → H,

    prox_{ηf}(x) = argmin_{y∈H} { f(y) + (1/(2η))‖y − x‖² },    (38)

denotes the proximal point operator of f and η > 0. According to [3, Definition 10.5], f is said to be uniformly convex with modulus function φ : [0,+∞) → [0,+∞], if φ is increasing, vanishes only at 0 and fulfills

    f(αx + (1 − α)y) + α(1 − α)φ(‖x − y‖) ≤ αf(x) + (1 − α)f(y) for all α ∈ (0, 1) and x, y ∈ dom f := {x ∈ H : f(x) < +∞}.

Notice that if this inequality holds for φ = (ν/2)(·)² for ν > 0, then f is said to be ν-strongly convex.
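Two standard instances of the proximal operator (38), written out for concreteness (the specific functions below are our illustrative choices): for f = ‖·‖₁ the map prox_{ηf} acts componentwise as soft-thresholding with threshold η, and for f the indicator of a convex set it reduces to the projection, independently of η.

```python
import numpy as np

# prox_{eta f}(x) = argmin_y { f(y) + ||y - x||^2 / (2 eta) }, cf. (38).

def prox_l1(x, eta):
    # f = || . ||_1: componentwise soft-thresholding with threshold eta
    return np.sign(x) * np.maximum(np.abs(x) - eta, 0.0)

def prox_box(x, lo, hi):
    # f = indicator of the box [lo, hi]^n: the prox is the projection, for every eta > 0
    return np.clip(x, lo, hi)

x = np.array([1.5, -0.2, 0.7])
print(prox_l1(x, 0.5))        # components shrunk towards 0 by 0.5 and thresholded
print(prox_box(x, 0.0, 1.0))  # componentwise projection onto [0, 1]
```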

In the following statement we approach the minimizers of f + g via the second order dynamical system

    ẍ(t) + Γẋ(t) + λ(t)[x(t) − prox_{ηf}(x(t) − η∇g(x(t)))] = 0, x(0) = u, ẋ(0) = v.    (39)

Corollary 12 Let f : H → R ∪ {+∞} be a proper, convex and lower semicontinuous function and g : H → R be a convex and Fréchet differentiable function with 1/β-Lipschitz continuous gradient for β > 0 such that argmin_{x∈H}{f(x) + g(x)} ≠ ∅. Let η ∈ (0, 2β] and set δ := min{1, β/η} + 1/2. Let Γ : H → H be an operator fulfilling (A1), λ : [0,+∞) → (0,+∞) be a function fulfilling (A5), u, v ∈ H and x : [0,+∞) → H be the unique strong global solution of (39). Then the following statements are true:
(i) the trajectory x is bounded and ẋ, ẍ, (Id − prox_{ηf} ∘ (Id − η∇g))(x(·)) ∈ L²([0,+∞); H);
(ii) lim_{t→+∞} ẋ(t) = lim_{t→+∞} ẍ(t) = lim_{t→+∞} (Id − prox_{ηf} ∘ (Id − η∇g))(x(t)) = 0;
(iii) x(t) converges weakly to a minimizer of f + g as t → +∞;
(iv) if x* is a minimizer of f + g, then ∇g(x(·)) − ∇g(x*) ∈ L²([0,+∞); H), lim_{t→+∞} ∇g(x(t)) = ∇g(x*) and ∇g is constant on argmin_{x∈H}{f(x) + g(x)};
(v) if f or g is uniformly convex, then x(t) converges strongly to the unique minimizer of f + g as t → +∞.

Proof. The statements are direct consequences of the corresponding ones in Theorem 10 (see also Remark 11), by choosing A := ∂f and B := ∇g, by taking into account that

    zer(∂f + ∇g) = argmin_{x∈H}{f(x) + g(x)}

and by making use of the Baillon-Haddad Theorem, which says that ∇g is 1/β-Lipschitz continuous if and only if ∇g is β-cocoercive (see [3, Corollary 18.16]). For statement (v) we also use the fact that, if f is uniformly convex with modulus φ, then ∂f is uniformly monotone with modulus 2φ (see [3, Example 22.3(iii)]).

Remark 13 Consider again the setting of Remark 8, namely Γ = γ Id for γ > 0 and λ(t) = 1 for every t ∈ [0,+∞). Furthermore, for C a nonempty, convex, closed subset of H, let f = δ_C be the indicator function of C, which is defined as being equal to 0 for x ∈ C and to +∞ otherwise. The dynamical system (39) attached in this setting to the minimization of g over C becomes

    ẍ(t) + γẋ(t) + x(t) − P_C(x(t) − η∇g(x(t))) = 0, x(0) = u, ẋ(0) = v,    (40)

where P_C denotes the projection onto the set C. The convergence of the trajectories of (40) has been studied in [8, Theorem 3.1] under the conditions γ² > 2 and 0 < η ≤ 2β. In this case assumption (A5) trivially holds by choosing θ such that 0 < θ ≤ (γ² − 2)/2 ≤ (δγ² − 2)/2. Thus, in order to verify (A5) in case λ(t) = 1 for every t ∈ [0,+∞), one needs to equivalently assume that γ² > 2/δ. Since δ ≥ 1, this provides a slight improvement over [8, Theorem 3.1] in what concerns the choice of γ. We refer the reader also to [7] for an analysis of the convergence rates of the trajectories of the dynamical system (40) when g is endowed with supplementary properties.
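For illustration only, a naive explicit discretization of (39) for a lasso-type problem f(x) = ‖x‖₁, g(x) = ½‖Ax − b‖² (the synthetic data, step sizes and iteration counts are all our own choices and are not claimed to reproduce the continuous-time analysis); the limit is compared with a point computed by iterating the static forward-backward map.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

grad_g = lambda x: A.T @ (A @ x - b)                 # grad g is ||A||^2-Lipschitz
beta = 1.0 / np.linalg.norm(A, 2) ** 2               # cocoercivity constant of grad g
eta = beta                                           # eta in (0, 2*beta]
prox = lambda x: np.sign(x) * np.maximum(np.abs(x) - eta, 0.0)   # prox_{eta f}, f = ||.||_1
fb = lambda x: prox(x - eta * grad_g(x))             # x |-> prox_{eta f}(x - eta grad g(x))

# Explicit discretization of (39): x'' + gamma x' + lambda [x - fb(x)] = 0.
gamma_, lam, h = 3.0, 1.0, 0.05
x_prev = x = np.zeros(5)
for _ in range(30000):
    x_prev, x = x, x + (1 - gamma_ * h) * (x - x_prev) - h ** 2 * lam * (x - fb(x))

# Reference minimizer of f + g via plain forward-backward (proximal gradient) iterations.
z = np.zeros(5)
for _ in range(10000):
    z = fb(z)

print(np.linalg.norm(x - z))     # small: the trajectory settles near a minimizer of f + g
```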

For the two main convergence statements provided in this section it was essential to choose the step size η in the interval, 2] see Theorem, Remark and Corollary 2. This, because of the fact that in this way we were able to guarantee for the generated trajectories the existence of the limit lim t + xt x 2, where x denotes a solution of the problem under investigation. It is interesting to observe that, when dealing with convex optimization problems, one can go also beyond this classical restriction concerning the choice of the step size a similar phenomenon has been reported also in [, Section 4.2]. This is pointed out in the following result, which is valid under the assumption A6 λ : [, +, + is locally absolutely continuous and there exist a, θ, θ > such that for almost every t [, + we have λt and θ + a 2 Γ γ Id γ 2 λt ηθ + η + η 4 2a Γ γ Id +, and for the proof of which we use instead of x x 2 a modified energy functional. Corollary 4 Let f : H R {+ } by a proper, convex and lower semicontinuous function and g : H R be a convex and Fréchet differentiable function with /- Lipschitz continuous gradient for > such that argmin x H {fx + gx} =. Let be η >, Γ : H H be an operator fulfilling A, λ : [, +, + be a function fulfilling A6, u, v H and x : [, + H be the unique strong global solution of 39. Then the following statements are true: i the trajectory x is bounded and ẋ, ẍ, Id prox ηf Id η g x L 2 [, + ; H; ii lim t + ẋt = lim t + ẍt = lim t + Id proxηf Id η g xt = ; iii xt converges weakly to a minimizer of f + g as t + ; iv if x is a minimizer of f + g, then gx gx L 2 [, + ; H, lim t + gxt = gx and g is constant on argmin x H {fx + gx}; v if f or g is uniformly convex, then xt converges strongly to the unique minimizer of f + g as t +. Proof. Consider an arbitrary element x argmin x H {fx + gx} = zer f + g. Similarly to the proof of Theorem iv, we derive for almot every t [, + see the first inequality after 35 gxt gx 2 ẍt, gxt + gx + Γẋt, gxt + gx λt ηλ 2 t ẍt + Γẋt 2 + xt x, ηλtẍt ηλt Γẋt. 42 In what follows we evaluate the right-hand side of the above inequality and introduce to this end the function q : [, + R, qt = gxt gx gx, xt x. Due to the convexity of g one has qt t. 6

Further, for almost every t [, + thus Γẋt, gxt + gx = qt = ẋt, gxt gx, γ qt + Γ γ Id ẋt, gxt + gx γ qt + 2a Γ γ Id ẋt 2 + a 2 Γ γ Id gxt gx 2. 43 On the other hand, for almost every t [, + qt = ẍt, gxt gx + ẋt, ddt gxt, hence ẍt, gxt + gx qt + ẋt 2. 44 Further, we have for almost every t [, + see also 6 and 5 λt ẍt + Γẋt 2 = λt ẍt 2 + λt ẍt 2 + d dt d ẋt, Γẋt + λt dt λt Γẋt 2 ẋt, Γẋt λt + γ λt λ 2 t ẋt 2 + γ2 λt ẋt 2. 45 Finally, by multiplying 42 with λt and by using 43, 44, 45 and 36 we obtain after rearranging the terms for almost every t [, + that λt a 2 Γ γ Id gxt gx 2 + d dt 2 η h + q + γ d dt η h + q + η ṗt + d ẋt, Γẋt + η dt λt γ 2 ηλt + γ λt ηλ 2 t η Γ γ Id ẋt 2 + 2a ηλt ẍt 2. and, further, via A6 θ gxt gx 2 + d dt 2 + d ẋt, Γẋt η dt λt η h + q + γ d dt η h + q + η ṗt + θ ẋt 2 + ηλt ẍt 2. 46 This implies that the function t d dt η h + q t + γ η h + q t + η pt + ẋt, Γẋt η λt 47 7

is monotonically decreasing. Arguing as in the proof of Theorem 6, by taking into account that λ has positive upper and lower bounds, it follows that η h + q, h, q, x, ẋ, ḣ, q are bounded, ẋ, ẍ and Id prox ηf Id η g x L 2 [, + ; H and lim t + ẋt =. Since dt d Id proxηf Id η g x L 2 [, + ; H see Remark b, we derive from Lemma 4 that lim t + Id proxηf Id η g xt =. As ẍt = Γẋt λt Id prox ηf Id η g xt for every t [, +, we obtain that lim t + ẍt =. From 46 it also follows that gx gx L 2 [, + ; H and, by applying again Lemma 4, it yield lim t + gxt = gx. In this way the statements i, ii and iv are shown. iii Since the function in 47 is monotonically decreasing, from i, ii and iv it follows that the limit lim t + γ η h + u t + η pt exists and it is a real number. By using similar arguments as at the beginning of the proof of statement iii of Theorem 6, by exploiting again 46 one gets that lim t + pt R, hence lim t + η h + u t R. Since x has been chosen as an arbitrary minimizer of f + g, we conclude that for all x argmin x H {fx + gx} the limit exists, where lim Et, t + x R, Et, x = 2η xt x 2 + gxt gx gx, xt x. In what follows we use a similar technique as in [4] see, also, [, Section 4.2]. Since x is bounded, it has at least one weak sequential cluster point. We prove first that each weak sequential cluster point of x is a minimizer of f + g. Let x argmin x H {fx + gx} and t n + as n + be such that xt n n N converges weakly to x. Since xt n, gxt n Gr g, lim n + gxt n = gx and Gr g is sequentially closed in the weak-strong topology, we obtain gx = gx. From 34 written for t = t n, A = f and B = g, by letting n converge to + and by using that Gr f is sequentially closed in the weak-strong topology, we obtain gx fx. This, combined with gx = gx delivers gx fx, hence x zer f + g = argmin x H {fx + gx}. Next we show that x has at most one weak sequential cluster point, which will actually guarantee that it has exactly one weak sequential cluster point. This will imply the weak convergence of the trajectory to a minimizer of f + g. Let x, x 2 be two weak sequential cluster points of x. This means that there exist t n + as n + and t n + as n + such that xt n n N converges weakly to x as n + and xt n n N converges weakly to x 2 as n +. Since x, x 2 argmin x H{fx + gx}, we have lim t + Et, x R and lim t + Et, x 2 R, hence lim t + Et, x Et, x 2 R. We obtain lim t + η xt, x 2 x + gx 2 gx, xt R, which, when expressed by means of the sequences t n n N and t n n N, leads to η x, x 2 x + gx 2 gx, x = η x 2, x 2 x + gx 2 gx, x 2. 8

This is the same with η x x 2 2 + gx 2 gx, x 2 x = and by the monotonicity of g we conclude that x = x 2. v The proof of this statement follows in analogy to the one of the corresponding statement of Theorem v written for A = f and B = g. Remark 5 When Γ = γ Id for γ >, in order to verify the left-hand side of the second statement in assumption A6 one can take θ := inf t λt. Thus, 4 amounts in this case to the existence of θ > such that λt γ 2 ηθ + η +. When one takes λt = for every t [, +, this is verified if and only if γ 2 > η +. In other words, A6 allows in this particular setting a more relaxed choice for the parameters γ, η and, beyond the standard assumptions < η 2 and γ 2 > 2 considered in [8]. In the following we provide a rate for the convergence of a convex and Fréchet differentiable function with Lipschitz continuous gradient g : H R along the ergodic trajectory generated by { ẍt + Γẋt + λt gxt = x = u, ẋ = v 48 to the minimum value of g. To this end we make the following assumption A7 λ : [, +, + is locally absolutely continuous and there exists ζ > such that for almost every t [, + we have < ζ γλt λt. 49 Let us mention that the following result is in the spirit of a convergence rate given for the objective function values on a sequence iteratively generated by an inertial-type algorithm recently obtained in [9, Theorem ]. Theorem 6 Let g : H R be a convex and Fréchet differentiable function with /- Lipschitz continuous gradient for > such that argmin x H gx. Let Γ : H H be an operator fulfilling A, λ : [, +, + a function fulfilling A7, u, v H and x : [, + H be the unique strong global solution of 48. Then for every minimizer x of g and every T > it holds g T 2ζT T xtdt gx [ v + γu x 2 + γ Γ γ Id + λ u x 2 ]. 9
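Before the proof, a small numerical check of the ergodic estimate above (the quadratic g, the choices Γ = γ Id, λ ≡ 1, the step size and the crude Riemann approximation of the ergodic average are ours and serve only as an illustration): the quantity T·(g(x̄_T) − min g) computed along a discretized trajectory of (48) should remain bounded.

```python
import numpy as np

# Discretized trajectory of (48): x'' + gamma x' + lambda grad g(x) = 0.
Q = np.array([[4.0, 1.0], [1.0, 2.0]])
g = lambda x: 0.5 * x @ Q @ x                  # minimizer x* = 0, min g = 0
grad_g = lambda x: Q @ x

gamma_, lam, h = 2.0, 1.0, 0.01
x_prev = x = np.array([5.0, -3.0])
traj = [x]
for _ in range(100000):
    x_prev, x = x, x + (1 - gamma_ * h) * (x - x_prev) - h ** 2 * lam * grad_g(x)
    traj.append(x)
traj = np.array(traj)

for k in (5000, 20000, 100000):
    T = k * h
    x_bar = traj[:k + 1].mean(axis=0)          # ~ (1/T) * integral_0^T x(t) dt
    print(T, T * g(x_bar))                     # stays bounded, consistent with the O(1/T) bound
```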

Proof. The existence and uniqueness of the trajectory of 48 follow from Theorem 2. Let be x argmin x H gx, T > and consider again the function p : [, + R, pt = 2 Γ γ Id xt x, xt x which we defined in 2. By using 48, the formula for the derivative of p, the positive semidefinitness of Γ γ Id, the convexity of g and A7 we get for almost every t [, + d dt 2 ẋt + γxt x 2 + γpt + λtgxt = ẍt + γẋt, ẋt + γxt x + γ Γ γ Idẋt, xt x + λtgxt + λt ẋt, gxt = Γ γ Idẋt λt gxt, ẋt + γxt x + Γ γ Idẋt, γxt x + λtgxt + λt ẋt, gxt γλt gxt, xt x + λtgxt λt γλtgxt gx + λtgx ζgxt gx + λtgx. We obtain after integration 2 ẋt + γxt x 2 + γpt + λt gxt 2 ẋ + γx x 2 + γp + λgx +ζ T gxt gx dt λt λgx. Be neglecting the nonnegative terms on the left-hand side of this inequality and by using that gxt gx, it yields ζ T gxt gx dt 2 v + γu x 2 + γp + λgu gx. The conclusion follows by using p = 2 Γ γ Idu x, u x 2 Γ γ Id u x 2, gu gx 2 u x 2, which is a consequence of the descent lemma see [22, Lemma.2.3] and notice that gx =, and the inequality which holds since g is convex. T g xtdt gx T T T gxt gx dt, Remark 7 Under assumption A7 on the relaxation function λ, we obtain in the above theorem only the convergence of the function g along the ergodic trajectory to a global 2

minimum value. If one is interested also in the weak convergence of the trajectory to a minimizer of g, this follows via Theorem 6 when λ is assumed to fulfill A2 notice that if x converges weakly to a minimizer of g, then from the Cesaro-Stolz Theorem one also obtains the weak convergence of the ergodic trajectory T T minimizer. Take a, b > /γ 2 and ρ γ. Then λt = ae ρt + b T xtdt to the same is an example of a relaxation function which verifies assumption A2 with < θ bγ 2 and assumption A7 with < ζ γb/a + b 2. 5 Variable damping parameters In this section we carry out a similar analysis as in the previous section, however, for second order dynamical systems having as damping coefficient a function depending on time. As starting point for our investigations we consider the dynamical system { ẍt + γtẋt + λtbxt = 5 x = u, ẋ = v, where B : H H is a cocoercive operator, λ, γ : [, + [, + are Lebesgue measurable functions and u, v H. The existence and uniqueness of strong global solutions of 5 can be shown by using the same techniques as in the proof of Theorem 2, provided that λ, γ L loc [, +. For the convergence of the trajectories we need the following assumption A2 λ, γ : [, +, + are locally absolutely continuous and there exists θ > such that for almost every t [, + we have γt λt and γ2 t λt + θ. 5 According to Definition and Remark a, λt, γt exist for almost almost every t [, + and λ, γ are Lebesgue integrable on each interval [, b] for < b < +. This combined with γt λt, yields the existence of a positive lower bound for λ and for a positive upper bound for γ. Using further the second assumption in 5 provides also a positive upper bound for λ and a positive lower bound for γ. The couple of functions λt = ae ρt + b and γt = a e ρ t + b, where a, a, ρ, ρ and b, b > fulfill the inequality b 2 b > /, verify the conditions in assumption A2. We state now the convergence result. 2

Theorem 8 Let B : H H be a -cocoercive operator for > such that zer B := {u H : Bu = } =, λ, γ : [, +, + be functions fulfilling A2 and u, v H. Let x : [, + H be the unique strong global solution of 5. Then the following statements are true: i the trajectory x is bounded and ẋ, ẍ, Bx L 2 [, + ; H; ii lim t + ẋt = lim t + ẍt = lim t + Bxt = ; iii xt converges weakly to an element in zer B as t +. Proof. With the notations in the proof of Theorem 6 and by appealing to similar arguments one obtains for almost every t [, + or, equivalently, ḧt + ḧt + γtḣt + γtḣt + γt λt d dt Combining this inequality with γt d ẋt 2 λt dt = d dt λt ẍt + γtẋt 2 ẋt 2 ẋt 2 + γ 2 t λt γt λt ẋt 2 ẋt 2 + λt ẍt 2. γtλt γt λt λ 2 ẋt 2 t and γtḣt = d dt γht γtht d γht, 52 dt it yields for almost every t [, + ḧt + d dt γht+ d γt dt λt ẋt 2 + γ 2 t γtλt + γt λt + λt λ 2 ẋt 2 + t Now, assumption A2 delivers for almost every t [, + the inequality ḧt + d dt γht + d γt dt λt ẋt 2 + θ ẋt 2 + λt ẍt 2. λt ẍt 2. γt This implies that the function t ḣt+γtht+ λt ẋt 2 is monotonically decreasing and from here one obtains the conclusion following the lines of the proof of Theorem 6, by taking also into account that lim t + γt,. When T : H H is a nonexpansive operator one obtains for the dynamical system { ẍt + γtẋt + λt xt T xt = x = u, ẋ = v 53 and by making the assumption 22

(A3') λ, γ : [0,+∞) → (0,+∞) are locally absolutely continuous and there exists θ > 0 such that for almost every t ∈ [0,+∞) we have

    γ̇(t) ≤ 0 ≤ λ̇(t) and γ²(t)/λ(t) ≥ 2(1 + θ),    (54)

the following result, which can be seen as a counterpart of Corollary 7.

Corollary 19 Let T : H → H be a nonexpansive operator such that Fix T = {u ∈ H : Tu = u} ≠ ∅, λ, γ : [0,+∞) → (0,+∞) be functions fulfilling (A3') and u, v ∈ H. Let x : [0,+∞) → H be the unique strong global solution of (53). Then the following statements are true:
(i) the trajectory x is bounded and ẋ, ẍ, (Id − T)(x(·)) ∈ L²([0,+∞); H);
(ii) lim_{t→+∞} ẋ(t) = lim_{t→+∞} ẍ(t) = lim_{t→+∞} (Id − T)(x(t)) = 0;
(iii) x(t) converges weakly to a point in Fix T as t → +∞.

When R : H → H is an α-averaged operator for α ∈ (0, 1), one obtains for the dynamical system

    ẍ(t) + γ(t)ẋ(t) + λ(t)[x(t) − R(x(t))] = 0, x(0) = u, ẋ(0) = v,    (55)

and by making the assumption

(A4') λ, γ : [0,+∞) → (0,+∞) are locally absolutely continuous and there exists θ > 0 such that for almost every t ∈ [0,+∞) we have

    γ̇(t) ≤ 0 ≤ λ̇(t) and γ²(t)/λ(t) ≥ 2α(1 + θ),    (56)

the following result, which can be seen as a counterpart of Corollary 9.

Corollary 20 Let R : H → H be an α-averaged operator for α ∈ (0, 1) such that Fix R ≠ ∅, λ, γ : [0,+∞) → (0,+∞) be functions fulfilling (A4') and u, v ∈ H. Let x : [0,+∞) → H be the unique strong global solution of (55). Then the following statements are true:
(i) the trajectory x is bounded and ẋ, ẍ, (Id − R)(x(·)) ∈ L²([0,+∞); H);
(ii) lim_{t→+∞} ẋ(t) = lim_{t→+∞} ẍ(t) = lim_{t→+∞} (Id − R)(x(t)) = 0;
(iii) x(t) converges weakly to a point in Fix R as t → +∞.

We come now to the monotone inclusion problem

    find x ∈ H such that 0 ∈ Ax + Bx,

where A : H ⇉ H is a maximally monotone operator and B : H → H is a β-cocoercive operator for β > 0, assign to it the second order dynamical system

    ẍ(t) + γ(t)ẋ(t) + λ(t)[x(t) − J_{ηA}(x(t) − ηB(x(t)))] = 0, x(0) = u, ẋ(0) = v,    (57)

and make the assumption

A5 λ, γ : [, +, + are locally absolutely continuous and there exists θ > such that for almost every t [, + we have γt λt and γ2 t λt 2 + θ. 58 δ Theorem 2 Let A : H H be a maximally monotone operator and B : H H be -cocoercive operator for > such that zera + B. Let η, 2 and set δ := min{, /η} + /2. Let λ, γ : [, +, + be functions fulfilling A5, u, v H and x : [, + H be the unique strong global solution of 57. Then the following statements are true: i the trajectory x is bounded and ẋ, ẍ, Id J ηa Id ηb x L 2 [, + ; H; ii lim t + ẋt = lim t + ẍt = lim t + Id JηA Id ηb xt = ; iii xt converges weakly to a point in zera + B as t + ; iv if x zera+b, then Bx Bx L 2 [, + ; H, lim t + Bxt = Bx and B is constant on zera + B; v if A or B is uniformly monotone, then xt converges strongly to the unique point in zera + B as t +. Proof. The statements i-iii follow by using the same arguments as in the proof of Theorem. iv We use again the notations in the proof of Theorem 6. Let be an arbitrary x zera+b. From the definition of the resolvent we have for almost every t [, + Bxt γt ηλtẍt ηλtẋt A γt + + xt, 59 λtẍt λtẋt which combined with Bx Ax and the monotonicity of A leads to γt + λtẍt λtẋt + xt x, Bxt + Bx γt ηλtẍt ηλtẋt. 6 The cocoercivity of B yields for almost every t [, + Bxt Bx 2 γt + Bxt + Bx λtẍt λtẋt, ηλ 2 ẍt + γtẋt 2 t + xt x, γt ηλtẍt ηλtẋt γt 2 2 + λtẍt λtẋt + 2 Bxt Bx 2 + xt x, γt ηλtẍt ηλtẋt. From xt z, γt ηλtẍt ηλtẋt we obtain for almost every t [, + λt Bxt Bz 2 + ḧt + γt ḣt 2 η = ḧt + γt ḣt ẋt 2 6 ηλt 24 2λt ẍt + γtẋt 2 + η ẋt 2.

The conclusion follows in analogy to the proof of iv in Theorem by using also 52. v Let x be the unique element of zera + B. When A is uniformly monotone with corresponding function φ A : [, + [, + ], which is increasing and vanishes only at, similarly to the proof of statement v in Theorem the following inequality can be derived for almost every t [, + γt φ A + + xt z λtẍt λtẋt γt + Bxt + Bz + xt z, γt λtẍt λtẋt, ηλtẍt ηλtẋt. γt This yields lim t + φ A λtẍt + λtẋt + xt z = and from here the conclusion is immediate. The case when B is uniformly monotone is to be addressed in analogy to corresponding part of the proof of Theorem v. Remark 22 In the light of the arguments provided in Remark, one can see that the statements in Theorem 2 remain valid also for η = 2. When particularizing this setting to the solving of the optimization problem min fx + gx, x H where f : H R {+ } is a proper, convex and lower semicontinuous function and g : H R is a convex and Fréchet differentiable function with /-Lipschitz continuous gradient for >, via the second order dynamical system { ] ẍt + γtẋt + λt [xt prox ηf xt η gxt = 62 x = u, ẋ = v, Corollary 2 gives rise to the following result. Corollary 23 Let f : H R {+ } by a proper, convex and lower semicontinuous function and g : H R be a convex and Fréchet differentiable function with /- Lipschitz continuous gradient for > such that argmin x H {fx + gx}. Let η, 2] and set δ := min{, /η} + /2. Let λ, γ : [, +, + be functions fulfilling A5, u, v H and x : [, + H be the unique strong global solution of 62. Then the following statements are true: i the trajectory x is bounded and ẋ, ẍ, Id prox ηf Id η g x L 2 [, + ; H; ii lim t + ẋt = lim t + ẍt = lim t + Id proxηf Id η g xt = ; iii xt converges weakly to a minimizer of f + g as t + ; iv if x is a minimizer of f + g, then gx gx L 2 [, + ; H, lim t + gxt = gx and g is constant on argmin x H {fx + gx}; v if f or g is uniformly convex, then xt converges strongly to the unique minimizer of f + g as t +. As it was also the case in the previous section, we can weaken the choice of the step size in Corollary 23 through the following assumption 25

A6 λ, γ : [, +, + are locally absolutely continuous and there exists θ > such that for almost every t [, + we have γt λt and γ2 t λt ηθ + η +. 63 Corollary 24 Let f : H R {+ } by a proper, convex and lower semicontinuous function and g : H R be a convex and Fréchet differentiable function with /- Lipschitz continuous gradient for > such that argmin x H {fx + gx} =. Let be η >, λ, γ : [, +, + be functions fulfilling A6, u, v H and x : [, + H be the unique strong global solution of 62. Then the following statements are true: i the trajectory x is bounded and ẋ, ẍ, Id prox ηf Id η g x L 2 [, + ; H; ii lim t + ẋt = lim t + ẍt = lim t + Id proxηf Id η g xt = ; iii xt converges weakly to a minimizer of f + g as t + ; iv if x is a minimizer of f + g, then gx gx L 2 [, + ; H, lim t + gxt = gx and g is constant on argmin x H {fx + gx}; v if f or g is uniformly convex, then xt converges strongly to the unique minimizer of f + g as t +. Proof. The proof follows in the lines of the one given for Corollary 4 and relies on the following key inequality, which holds for almost every t [, +, λt gxt gz 2 + d dt 2 η h + q + d γt dt η h + q + d γt η dt λt ẋt 2 γ 2 t γtλt + γt λt + + ηλt ηλ 2 t ẋt 2 + η ηλt ẍt 2, where x denotes a minimizer of f + g. This relation gives rise via A6 to λt gxt gz 2 + d dt 2 η h + q + d γt dt η h + q + d γt η dt λt ẋt 2 + θ ẋt 2 + ηλt ẍt 2, which can be seen as the counterpart to relation 46. Finally, we address the convergence rate of a convex and Fréchet differentiable function with Lipschitz continuous gradient g : H R along the ergodic trajectory generated by { ẍt + γtẋt + λt gxt = x = u, ẋ = v 64 to its global minimum value, when making the following assumption A7 λ : [, +, + is locally absolutely continuous, γ : [, +, + is twice differentiable and there exists ζ > such that for almost every t [, + we have < ζ γtλt λt, γt and 2 γtγt γt. 65 26

Theorem 25 Let g : H R be a convex and Fréchet differentiable function with /-Lipschitz continuous gradient for > such that argmin x H gx. Let λ, γ : [, +, + be functions fulfilling A7 u, v H and x : [, + H be the unique strong global solution of 64. Then for every minimizer x of g and every T > it holds T g xtdt gx T 2ζT [ v + γu x 2 + λ ] γ u x 2. Proof. Let x argmin x H gx and T >. By using 64, the convexity of g and A7 we get for almost every t [, + d dt 2 ẋt + γtxt x 2 + λtgxt γt 2 xt x 2 = ẍt + γtxt x + γtẋt, ẋt + γtxt x γt 2 xt x 2 γt ẋt, xt x + λtgxt + λt ẋt, gxt = γtλt gxt, xt x + λtgxt + γtγt γt xt x 2 2 γtλt gxt, xt x + λtgxt λt γtλtgxt gx + λtgx ζgxt gx + λtgx. We obtain after integration 2 ẋt + γxt x 2 + λt gxt γt 2 xt x 2 2 ẋ + γx x 2 + λgx γ 2 x x 2 +ζ T gxt gx dt λt λgx. The conclusion follows from here as in the proof of Theorem 6. Remark 26 A similar comment as in Remark 7 can be made also in this context. For a, a, ρ, ρ and b, b > fulfilling the inequalities b 2 b > / and ρ b one can prove that the functions λt = ae ρt + b and γt = a e ρ t + b, verify assumption A2 in Theorem 8 with < θ b 2 b and assumption A7 in Theorem 25 with < ζ bb /a + b 2. Hence, for this choice of the relaxation and damping function, one has convergence of the objective function g along the ergodic trajectory to its global minimum value as well as weak convergence of the trajectory to a minimizer of g. 27
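A closing numerical sketch for the variable damping setting of this section (the concrete nonincreasing γ(t), nondecreasing λ(t) and the quadratic g are our own choices, picked only to be in the spirit of the assumptions above; this is not an implementation of the paper's analysis):

```python
import numpy as np
from scipy.integrate import solve_ivp

# System (64): x''(t) + gamma(t) x'(t) + lambda(t) grad g(x(t)) = 0,
# with a nonincreasing damping gamma(t) and a nondecreasing relaxation lambda(t).
gamma = lambda t: np.exp(-0.1 * t) + 3.0
lam = lambda t: 1.0 - 0.5 * np.exp(-0.1 * t)

Q = np.array([[3.0, 0.0], [0.0, 1.0]])
grad_g = lambda x: Q @ x                       # g(x) = 1/2 <Qx, x>, unique minimizer 0

def F(t, Y):
    x, v = Y[:2], Y[2:]
    return np.concatenate([v, -gamma(t) * v - lam(t) * grad_g(x)])

sol = solve_ivp(F, (0.0, 60.0), np.array([4.0, -2.0, 0.0, 0.0]), rtol=1e-8)
print(np.linalg.norm(sol.y[:2, -1]))           # ~0: the trajectory approaches the minimizer of g
```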