A proximal minimization algorithm for structured nonconvex and nonsmooth problems

Size: px
Start display at page:

Download "A proximal minimization algorithm for structured nonconvex and nonsmooth problems"

Transcription

1 A proximal minimization algorithm for structured nonconvex and nonsmooth problems Radu Ioan Boţ Ernö Robert Csetnek Dang-Khoa Nguyen May 8, 08 Abstract. We propose a proximal algorithm for minimizing objective functions consisting of three summands: the composition of a nonsmooth function with a linear operator, another nonsmooth function, each of the nonsmooth summands depending on an independent block variable, and a smooth function which couples the two block variables. The algorithm is a full splitting method, which means that the nonsmooth functions are processed via their proximal operators, the smooth function via gradient steps, and the linear operator via matrix times vector multiplication. We provide sufficient conditions for the boundedness of the generated sequence and prove that any cluster point of the latter is a KKT point of the minimization problem. In the setting of the Kurdyka- Lojasiewicz property we show global convergence, and derive convergence rates for the iterates in terms of the Lojasiewicz exponent. Key Words. structured nonconvex and nonsmooth optimization, proximal algorithm, full splitting scheme, Kurdyka- Lojasiewicz property, limiting subdifferential AMS subject classification. 65K0, 90C6, 90C30 Introduction. Problem formulation and motivation In this paper we propose a full splitting algorithm for solving nonconvex and nonsmooth problems of the form min tf paxq ` G pyq ` H px, yqu, (.) px,yqprmˆrq where F : R p Ñ R Y t`8u and G: R q Ñ R Y t`8u are proper and lower semicontinuous functions, H : R mˆr q Ñ R is a Fréchet differentiable function with Lipschitz continuous gradient, and A: R m Ñ R p is a linear operator. It is noticeable that neither for the nonsmooth nor for the smooth functions convexity is assumed. In case m p and A is the identity operator, Bolte, Sabach and Teboulle formulated in [9], also in the nonconvex setting, a proximal alternating linearization method (PALM) for solving (.). PALM is a proximally regularized variant of the Gauss-Seidel alternating minimization scheme and basically consists of two proximal-gradient steps. It had a significant impact in the optimization community, as it can be used to solve a large variety of nonconvex and nonsmooth problems arising in applications such as: matrix factorization, image deblurring and denoising, the feasibility problem, compressed sensing, etc. An inertial version of PALM has been proposed by Pock and Sabach in [3]. A naive approach of PALM for solving (.) would require the calculation of the proximal operator of the function F A, for which, in general, even in the convex case, a closed formula is not available. In the last decade, an impressive progress can be noticed in the field of primal-dual/proximal ADMM algorithms, designed to solve convex optimization problems involving compositions with linear operators in the spirit of the full splitting paradigm. One of the pillars of this development is the conjugate duality theory which is available for convex optimization problems. Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz, 090 Vienna, Austria, radu.bot@ univie.ac.at. Research partially supported by FWF (Austrian Science Fund), project I 49-N3. Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz, 090 Vienna, Austria, ernoe. robert.csetnek@univie.ac.at. Research supported by FWF (Austrian Science Fund), project P 9809-N3. Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz, 090 Vienna, Austria, dang-khoa. nguyen@univie.ac.at. 
Research supported by the Doctoral Programme Vienna Graduate School on Computational Optimization (VGSCO) which is funded by FWF (Austrian Science Fund), project W60-N35.

2 The algorithm which we propose in this paper for solving the nonconvex and nonsmooth problem(.) is a full splitting scheme, too; the nonsmooth functions are processed via their proximal operators, the smooth function via gradient steps, and the linear operator via matrix times vector multiplication. In case Gpyq 0 and Hpx, yq Hpxq for any px, yq P R mˆr q, where H : R m Ñ R is a Fréchet differentiable function with Lipschitz continuous gradient, it furnishes a full splitting iterative scheme for solving the nonsmooth and nonconvex optimization problem min tf paxq ` H pxqu. (.) xprm Splitting algorithms for solving problems of the form (.) have been considered in [9], under the assumption that H is twice continuously differentiable with bounded Hessian, in [5], under the assumption that one of the summands is convex and continuous on its effective domain, and in [3], as a particular case of a general nonconvex proximal ADMM algorithm. We would like to mention in this context also [0] for the case when A is nonlinear. The convergence analysis we will carry out in this paper relies on a descent inequality, which we prove for a regularization of the augmented Lagrangian L β : R m ˆ R q ˆ R p ˆ R p Ñ R Y t`8u L β px, y, z, uq F pzq ` G pyq ` H px, yq ` xu, Ax zy ` β Ax z, β ą 0, associated with problem (.). This is obtained by an appropriate tuning of the parameters involved in the description of the algorithm. In addition, we provide sufficient conditions in terms of the input functions F, G and H for the boundedness of the generated sequence of iterates. We also show that any cluster point of this sequence is a KKT point of the optimization problem (.). By assuming that the above-mentioned regularization of the augmented Lagrangian satisfies the Kurdyka- Lojasiewicz property, we prove global convergence. If this function satisfies the Lojasiewicz property, then we can even derive convergence rates for the sequence of iterates formulated in terms of the Lojasiewicz exponent. For similar approaches relying on the use of the Kurdyka- Lojasiewicz property in the proof of the global convergence of nonconvex optimization algorithms we refer to the papers of Attouch and Bolte [], Attouch, Bolte and Svaiter [3], and Bolte, Sabach and Teboulle [9].. Notations and preliminaries Every space R d, where d is a positive integer, is assumed to be equipped with the Euclidean inner product x, y and associated norm a x, y. The Cartesian product R d ˆ R d ˆ... ˆ R d k of the Euclidean spaces R di, i,..., k, will be endowed with inner product and associated norm defined for x : px,..., x k q, y : py,..., y k q P R d ˆ R d ˆ... ˆ R d k by g kÿ fÿ x, y xx i, y i y and x e k x i, i respectively. For every x : px,..., x k q P R d ˆ R d ˆ... ˆ R d k we have g k ÿ fÿ? x i ď x e k kÿ x i ď x i. (.3) k i Let ψ : R d Ñ R Y t`8u be a proper and lower semicontinuous function and x an element of its effective domain domψ : y P R d : ψ pyq ă `8 (. The Fréchet (viscosity) subdifferential of ψ at x is p Bψ pxq : "d P R d : lim inf yñx and the limiting (Mordukhovich) subdifferential of ψ at x is i i i ψ pyq ψ pxq xd, y xy y x Bψ pxq : td P R d : exist sequences x n Ñ x and d n Ñ d as n Ñ `8 For x R domψ, we set p Bψ pxq Bψ pxq : H. * ě 0 such that ψ px n q Ñ ψ pxq as n Ñ `8 and d n P p Bψ px n q for any n ě 0u.

3 The inclusion p Bψ pxq Ď ψ pxq holds for each x P R d. If ψ is convex, then the two subdifferentials coincide with the convex subdifferential of ψ, thus p Bψ pxq Bψ pxq d P R d : ψ pyq ě ψ pxq ` xd, y P R d( for any x P R d. If x P R d is a local minimum of ψ, then 0 P Bψ pxq. We denote by crit pψq : x P R d : 0 P Bψ pxq ( the set of critical points of ψ. The limiting subdifferential fulfils the following closedness criterion: if tx n u ně0 and td n u ně0 are sequence in R d such that d n P Bψ px n q for any n ě 0 and px n, d n q Ñ px, dq and ψ px n q Ñ ψ pxq as n Ñ `8, then d P Bψ pxq. We also have the following subdifferential sum formula (see [, Proposition.07], [4, Exercise 8.8]): if Φ: R d Ñ R is a continuously differentiable function, then B pψ ` φq pxq Bψ pxq ` φ pxq for any x P R d ; and a formula for the subdifferential of the composition of ψ with a linear operator A: R k Ñ R d (see [, Proposition.], [4, Exercise 0.7]): if A is injective, then B pψ Aq pxq A T Bψ paxq for any x P R k. The following proposition collects some important properties of a (not necessarily convex) Fréchet differentiable function with Lipschitz continuous gradient. For the proof of this result we refer to [3, Proposition ]. Proposition. Let ψ : R d Ñ R be Fréchet differentiable such that its gradient is Lipschitz continuous with constant l ą 0. Then the following statements are true: piq For every x, y P R d and every z P rx, ys tp tqx ` ty : t P r0, su it holds piiq For any γ P Rz t0u it holds inf xpr d ψ pyq ď ψ pxq ` x ψ pzq, y xy ` l y x ; (.4) " ˆ ψ pxq γ l * γ ψ pxq ě inf ψ pxq. (.5) xpr d The Descent Lemma, which says that for a Fréchet differentiable function ψ : R d Ñ R having a Lipschitz continuous gradient with constant l ą 0 it holds ψ pyq ď ψ pxq ` x ψ pxq, y xy ` l y y P R d, follows from (.4) for z : x. In addition, by taking in (.4) z : y we obtain ψ pxq ě ψ pyq ` x ψ pyq, x yy l x y P R d. This is equivalent to the fact that ψ` l is a convex function, which is the same with ψ is l-semiconvex ([8]). In other words, a consequence of Proposition () is, that a Fréchet differentiable function with l-lipschitz continuous gradient is l-semiconvex. We close ths introductory section by presenting two convergence results for real sequences that will be used in the sequel in the convergence analysis. The following lemma is useful when proving convergence of numerical algorithms relying on Fejér monotonicity techniques (see, for instance, [, Lemma.], [, Lemma ]). Lemma. Let tξ n u ně0 be a sequence of real numbers and tω n u ně0 a sequence of real nonnegative numbers. Assume that tξ n u ně0 is bounded from below and that for any n ě 0 Then the following statements hold: ξ n` ` ω n ď ξ n. piq the sequence tω n u ně0 is summable, namely ÿ ně0 ω n ă `8; piiq the sequence tξ n u ně0 is monotonically decreasing and convergent. The following lemma can be found in [, Lemma.3] (see, also [, Lemma 3]). Lemma 3. Let ta n u ně0 and tb n u ně be sequences of real nonnegative numbers such that for any n ě where χ 0 P R and χ ě 0 fulfill χ 0 ` χ ă, and ÿ ně a n` ď χ 0 a n ` χ a n ` b n, (.6) b n ă `8. Then ÿ ně0 a n ă `8. 3

4 The algorithm The numerical algorithm we propose for solving (.) has the following formulation. Algorithm. Let µ, β, τ ą 0 and 0 ă σ ď. For a given starting point px 0, y 0, z 0, u 0 q P R m ˆ R q ˆ R p ˆ R p generate the sequence tpx n, y n, z n, u n qu ně0 for any n ě 0 as follows y n` P arg min G pyq ` x y H px n, y n q, yy ` µ y y n ) (.a) " z n` P arg min F pzq ` xu n, Ax n zy ` β * zpr p Ax n z (.b) x n` : x n τ ` x H px n, y n` q ` A T u n ` βa T pax n z n` q (.c) ypr q! u n` : u n ` σβ pax n` z n` q. (.d) The proximal point operator with parameter γ ą 0 (see []) of a proper and lower semicontinuous function ψ : R d Ñ R Y t`8u is the set-valued operator defined as " prox γψ : R d Ñ Rd, prox γψ pxq arg min ψ pyq ` * x y. ypr d γ Exact formulas for the proximal operator are available not only for large classes of convex functions ([4, 5, 4]), but also for various nonconvex functions ([, 5, 8]). In view of the above definition, the iterative scheme (.a) - (.d) reads for every n ě 0 y n` P prox µ G `yn µ y H px n, y n q z n` P prox β F `Axn ` β u n x n` : x n τ ` x H px n, y n` q ` A T u n ` βa T pax n z n` q u n` : u n ` σβ pax n` z n` q. One can notice the full splitting character of Algorithm and also that the first two steps can be performed in parallel. Remark. piq In case Gpyq 0 and Hpx, yq Hpxq for any px, yq P R m ˆ R q, where H : R m Ñ R is a Fréchet differentiable function with Lipschitz continuous gradient, Algorithm gives rise to an iterative scheme which has been proposed in [3] for solving the optimization problem (.). This reads for any n ě 0 z n` P prox β F `Axn ` β u n x n` : x n τ ` H px n q ` A T u n ` βa T pax n z n` q u n` : u n ` σβ pax n` z n` q. piiq In case m p and A Id is the identity operator on R m, Algorithm gives rise to an iterative scheme for solving min tf pxq ` G pyq ` H px, yqu, (.) px,yqprmˆrq which reads for any n ě 0 y n` P prox µ G `yn µ y H px n, y n q z n` P prox β F `xn ` β u n x n` : x n τ p x H px n, y n` q ` u n ` β px n z n` qq u n` : u n ` σβ px n` z n` q. This algorithm provides an alternative to PALM ([9]) for solving optimization problems of the form (.). piiiq In case m p, A Id, F pxq 0 and Hpx, yq Hpyq for any px, yq P R m ˆ R q, where H : R q Ñ R is a Fréchet differentiable function with Lipschitz continuous gradient, Algorithm gives rise to an iterative scheme for solving min tgpyq ` H pyqu, (.3) yprq 4

5 which reads for any n ě 0 y n` P prox µ G `yn µ Hpy n q, and is nothing else than the proximal-gradient method. An inertial version of the proximal-gradient method for solving (.3) in the fully nonconvex setting has been considered in [].. A descent inequality We will start with the convergence analysis of Algorithm () by proving a descent inequality, which will play a fundamental role in our investigations. We will analyse Algorithm () under the following assumptions, which we will be later even weakened. Assumption. piq the functions F, G and H are bounded from below; piiq the linear operator A is surjective; piiiq for any fixed y P R q there exists l pyq ě 0 such that x H px, yq x H `x, y ď l pyq x x P R m, (.4a) and for any fixed x P R m there exist l pxq, l 3 pxq ě 0 such that y H px, yq y H `x, y ď l pxq y y P R q, (.4b) x H px, yq x H `x, y ď l3 pxq y y P R q ; (.4c) pivq there exist l i,` ą 0, i,, 3, such that sup l py n q ď l,`, ně0 sup l px n q ď l,`, ně0 sup l 3 px n q ď l 3,`. (.5) ně0 Remark. Some comments on Assumption are in order. piq Assumption piq ensures that the sequence generated by Algorithm is well-defined. It has also as consequence that Ψ : inf tf pzq ` G pyq ` H px, yqu ą 8. (.6) px,y,zqˆrmˆrqˆrp piiq Comparing the assumptions in (iii) and (iv) to the ones in [9], one can notice the presence of the additional condition (.4c), which is essential in particular when proving the boundedness of the sequence of generated iterates. Notice that in iterative schemes of gradient type, proximal-gradient type or forward-backward-forward type (see [9,, ]) the boundedness of the iterates follow by combining a descent inequality expressed in terms of the objective function with coercivity assumptions on the later. In our setting this undertaken is less simple, since the descent inequality which we obtain below is in terms of the augmented Lagrangian associated with problem (.). piiiq The linear operator A is surjective if and only if its associated matrix has full row rank, which is the same with the fact that the matrix associated to AA T is positively definite. Since λ min `AA T z ď xaa T z, zy A T P R p, this is further equivalent to λ min `AA T ą 0, where λ min pmq denotes the minimal eigenvalue of a square matrix M. We also denote by κpmq the condition number, namely the ratio between the maximal eigenvalue λ max pmq and the minimal eigenvalue of the square matrix M, κ pmq : λ max pmq λ min pmq M λ min pmq ě. The convergence analysis will make use of the following regularized augmented Lagrangian function Ψ: R m ˆ R q ˆ R p ˆ R p ˆ R m ˆ R p Ñ R Y t`8u, 5

6 defined as `x, y, z, u, x, u ÞÑ F pzq ` G pyq ` H px, yq ` xu, Ax zy ` β Ax z ` C 0 A T `u u ` σb `x x ` C x x, where Notice that B : τid βa T A, C 0 : 4 p σq σ βλ min paa T q ě 0 and C : 8 pστ ` l,`q σβλ min paa T q ą 0. B ď τ, whenever τ ě β A. Indeed, this is a consequence of the relation }Bx} τ }x} τβ}ax} ` β }A T Ax} ď τ }x} ` βpβ}a} P R m. For simplification, we introduce the following notations R : R m ˆ R q ˆ R p ˆ R p ˆ R m ˆ R p X : `x, y, z, u, x, u X n : px n, y n, z n, u n, x n, u n ě Ψ n : Ψ px n ě. The next result provides the announced descent inequality. Lemma 4. Let Assumption be satisfied, τ ě β A and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. Then for any n ě it holds where Ψ n` ` C x n` x n ` C 3 y n` y n ` C 4 u n` u n ď Ψ n, (.7) C : τ l,` ` β A 4στ βλ min paa T q 8 pστ ` l,`q σβλ min paa T q, C 3 : µ l,` C 4 : σβ. 8l 3,` σβλ min paa T q, (.8a) (.8b) (.8c) Proof. Let n ě be fixed. We will show first that F pz n` q ` G py n` q ` H px n`, y n` q ` xu n`, Ax n` z n` y ` β Ax n` z n` ` τ l,` ` β A x n` x n ` µ l,` y n` y n ` σβ u n` u n ď F pz n q ` G py n q ` H px n, y n q ` xu n, Ax n z n y ` β Ax n z n ` σβ u n` u n (.9) and provide afterwards an upper estimate for the term u n` u n on the right-hand side of (.9). From (.a) and (.b) we obtain and G py n` q ` x y H px n, y n q, y n` y n y ` µ y n` y n ď G py n q (.0) F pz n` q ` xu n, Ax n z n` y ` β Ax n z n` ď F pz n q ` xu n, Ax n z n y ` β Ax n z n (.) 6

7 respectively. On the other hand, according to the Descent Lemma we have H px n, y n` q ď H px n, y n q ` x y H px n, y n q, y n` y n y ` l px n q and, further, by taking into consideration (.c), y n` y n ď H px n, y n q ` x y H px n, y n q, y n` y n y ` l,` y n` y n H px n`, y n` q ď H px n, y n` q ` x x H px n, y n` q, x n` x n y ` l py n` q x n` x n H px n, y n` q xu n, Ax n` Ax n y β xax n z n`, Ax n` Ax n y ˆ τ l py n` q x n` x n ď H px n, y n` q xu n, Ax n` Ax n y ` β Ax n z n` β Ax n` z n` τ l,` ` β A x n` x n. Summing this inequality with (.0) and (.) gives (.9). Next we will focus on estimating u n` u n. Combining (.c) and (.d), we obtain and A T u n` ` σb px n` x n q p σq A T u n σ x H px n, y n` q A T u n ` σb px n x n q p σq A T u n σ x H px n, y n q. Subtracting these relations and making use of the notations it yields w n : A T pu n u n q ` σb px n x n q v n : σb px n x n q ` x H px n, y n q x H px n, y n` q, w n` p σq w n ` σv n. The convexity of guarantees that (notice that 0 ă σ ď ) w n` ď p σq w n ` σ v n. (.) In addition, from the definitions of w n and v n, we obtain A T pu n` u n q ď w n` ` σ B x n` x n ď w n` ` στ x n` x n (.3) and v n ď σ B x n x n ` x H px n, y n q x H px n, y n` q ď στ x n x n ` x H px n, y n q x H px n, y n q ` x H px n, y n q x H px n, y n` q ď pστ ` l,`q x n x n ` l 3,` y n` y n (.4) respectively. Using the Cauchy-Schwarz inequality, (.3) yields T λ min `AA u n` u n ď A T pu n` u n q ď w n` ` σ τ x n` x n and (.4) yields v n ď pστ ` l,`q x n x n ` l 3,` y n` y n. After combining these two inequalities with (.), we get T σλ min `AA u n` u n ` p σq w n` ď p σq w n ` σ 3 τ x n` x n ` σ pστ ` l,`q x n x n ` σl 3,` y n` y n. The desired statement follows after we multiply the above relation by the resulting inequality with (.9). 4 σ βλ min paa T ą 0 and combine q 7

8 The following result provides one possibility to choose the parameters in Algorithm, such that all three constants C, C 3 and C 4 that appear in (.7) are positive. Lemma 5. Let 4ν β ą 4σκ paa T q # β A max, βλ min `AA T 4σ 0 ă σ ă 4κ paa T q ˆ 4 ` 3σ ` ˆ 6ν β a τ b 4 ` 4σ ` 9σ 9σκ paa T q + ą 0 ă τ ă βλ T min `AA ˆ 6ν 4σ β ` a τ (.5a) (.5b) (.5c) where ν : Then we have µ ą l,` ` 6l 3,` σβλ min paa T q ą 0, (.5d) l,` λ min paa T q ą 0 and τ : 3ν β 8ν β 4νσ 4σκ `AA T ą 0. (.5e) β Furthermore, there exist γ, γ P Rz t0u such that Proof. We will prove first that l,` γ γ βλ min paa T q C ą 0 ô 4στ βλ min paa T q ˆ min tc, C 3, C 4 u ą 0. and 6l,` βλ min paa T τ ` q l,` γ γ βλ min paa T q. (.6) 6l,` σβλ min paa T q ` l,` ` β A ă 0. (.7) The reduced discriminant of the quadratic function in τ in the above relation reads ˆ 6l,` τ : ˆ 6ν β 3ν β βλ min paa T q 384ν 8ν β β 4νσ β 384l,` β λ min paat q 4νσ β 4σκ `AA T 4l,`σ T βλ min paa T 4σκ `AA q 4σκ `AA T ą 0, (.8) if σ and β are being chosen as in (.5a) and (.5b), respectively. Therefore, for T βλ min `AA ˆ 6ν 4σ β a τ ă τ ă βλ T min `AA 4σ ˆ 6ν β ` a τ (.7) is satisfied. It remains to verify the feasibility of τ in (.5c), in other words, to prove that β A ă βλ min `AA T ˆ 6ν 4σ β ` a τ. This is easy to see, as, according to (.8), we have β A ă βλ min `AA T ˆ 6ν ô 6ν 4σ β β σκ `AA T ą 0. The positivity of C 3 follows from the choice of µ in (.5d), while, obviously, C 4 ą 0. Finally, as 4ν β ą 4σκ paa T q ą 4ν, it follows that each of the two quadratic equations in (.6) (in γ and, respectively, γ ) has a nonzero real solution., 8

9 Remark 3. Hong and Luo proved recently in [6] linear convergence for the iterates generated by a Lagrangian-based algorithm in the convex setting, without any strong convexity assumption. To this end a certain error bound condition must hold true and the step size of the dual update, which is also assumed to depend on the error bound constants, must be taken small. The authors also mention that this choice of the dual step size may be too conservative and cumbersome to compute unless the objective function is strongly convex. As shown in previous lemma, the step size of the dual update in our algorithm can be computed without assuming strong convexity and indeed it depends only on the linear operator A. Theorem 6. Let Assumption be satisfied and the parameters in Algorithm be such that τ ě β A and the constants defined in Lemma 4 fulfil mintc, C 3, C 4 u ą 0. If tpx n, y n, z n, u n qu ně0 is a sequence generated by Algorithm, then the following statements are true: piq the sequence tψ n u ně is bounded from below and convergent; piiq x n` x n Ñ 0, y n` y n Ñ 0, z n` z n Ñ 0 and u n` u n Ñ 0 as n Ñ `8. (.9) Proof. First, we show that Ψ defined in (.6) is a lower bound of tψ n u ně. Suppose the contrary, namely that there exists n 0 ě such that Ψ n0 Ψ ă 0. According to Lemma 4, tψ n u ně is a nonincreasing sequence and thus for any N ě n 0 which implies that Nÿ pψ n Ψq ď n nÿ 0 n On the other hand, for any n ě it holds lim NÑ`8 n pψ n Ψq ` pn n 0 ` q pψ n0 Ψq, Nÿ pψ n Ψq 8. Ψ n Ψ ě F pz n q ` G py n q ` H px n, y n q ` xu n, Ax n z n y Ψ ě xu n, Ax n z n y σβ xu n, u n u n y σβ u n ` σβ u n u n σβ u n. Therefore, for any N ě, we have Nÿ n pψ n Ψq ě σβ Nÿ n u n u n ` σβ u N σβ u 0 ě σβ u 0, which leads to a contradiction. As tψ n u ně is bounded from below, we obtain from Lemma statement piq and also that Since for any n ě it holds x n` x n Ñ 0, y n` y n Ñ 0 and u n` u n Ñ 0 as n Ñ `8. z n` z n ď A x n` x n ` Ax n` z n` ` Ax n z n A x n` x n ` σβ u n` u n ` σβ u n u n, (.0) it follows that z n` z n Ñ 0 as n Ñ `8. Usually, for nonconvex algorithms, the fact that the sequences of differences of consecutive iterates converge to zero is shown by assuming that the generated sequences are bounded (see [3, 9, 5]). In our analysis the only ingredients for obtaining statement (ii) in Theorem 6 are the descent property and Lemma. 9

10 . General conditions for the boundedness of tpx n, y n, z n, u n qu ně0 In the following we will formulate general conditions in terms of the input data of the optimization problem (.) which guarantee the boundedness of the sequence tpx n, y n, z n, u n qu ně0. Working in the setting of Theorem 6, thanks to (.9), we have that the sequences tx n` x n u ně0, ty n` y n u ně0, tz n` z n u ně0 and tu n` u n u ně0 are bounded. Denote s : sup t x n` x n, y n` y n, z n` z n, u n` u n u ă `8. ně0 Even though this observation does not imply immediately that tpx n, y n, z n, u n qu ně0 is bounded, this will follow under standard coercivity assumptions. Recall that a function ψ : R d Ñ R Y t`8u is called coercive, if lim x Ñ`8 ψ pxq `8. Theorem 7. Let Assumption be satisfied and the parameters in Algorithm be such that τ ě β A, the constants defined in Lemma 4 fulfil mintc, C 3, C 4 u ą 0 and there exist γ, γ P Rzt0u such that (.6) holds. Suppose that one of the following conditions hold: piq the function H is coercive; piiq the operator A is invertible, and F and G are coercive. Then every sequence tpx n, y n, z n, u n qu ně0 generated by Algorithm is bounded. Proof. Let n ě be fixed. According to Lemma 4 we have that Ψ ě... ě Ψ n ě Ψ n` ě F pz n` q ` G py n` q ` H px n`, y n` q β u n` ` β Ax n` z n` ` β u n`. (.) Combine (.c) and (.d) we get ˆ A T u n` A T pu n` u n q ` B px n x n` q σ ` x H px n`, y n` q x H px n, y n` q x H px n`, y n` q, (.) which implies ˆ A T u n` ď σ A u n` u n ` pτ ` l,`q x n` x n ` x H px n`, y n` q ˆˆ ď σ A ` τ ` l,` s ` x H px n`, y n` q. By using the Cauchy-Schwarz inequality we further obtain T λ min `AA u n` ď ˆˆ A T u n` ď σ A ` τ ` l,` s ` x H px n`, y n` q. Multiplying the above relation by βλ min paa T q and combining it with (.), we get Ψ ě F pz n` q ` G py n` q ` H px n`, y n` q βλ min paa T q xh px n`, y n` q ˆˆ βλ min paa T q σ A ` τ ` l,` s ` β Ax n` z n` ` β u n`. (.3) We will prove the boundedness of tpx n, y n, z n, u n qu ně0 in each of the two scenarios. 0

11 piq According to (.3) and Proposition, we have that for any n ě H px n`, y n` q ` β Ax n` z n` ` β u n` ˆˆ ď Ψ ` βλ min paa T q σ A ` τ ` l,` s inf F pzq inf G pyq zprp ypr m " ˆ * inf H px n`, y n` q l,` ně γ γ x H px n`, y n` q ˆˆ ď Ψ ` βλ min paa T q σ A ` τ ` l,` s inf F pzq inf G pyq zprp ypr q ă ` 8. px,yqpr " Since H is coercive and bounded from below, it follows that tpx n, y n qu ně0 and Ax n z n ` inf H px, yq mˆrq are bounded. As, according to (.d), tax n z n u ně0 is bounded, it follows that tu n u ně0 and tz n u ně0 are also bounded. piiq According to (.3) and Proposition, we have this time that for any n ě F pz n` q ` G py n` q ` β Ax n` z n` ` β u n` ˆˆ ď Ψ ` βλ min paa T q σ A ` τ ` l,` s " ˆ * inf H px n`, y n` q l,` ně γ γ x H px n`, y n` q ˆˆ ď Ψ ` βλ min paa T q σ A ` τ ` l,` s inf H px, yq ă `8. px,yqprmˆrq Since F and G are coercive and bounded from below, it follows that the sequences tpy n, z n qu " ně0 and Ax n z n ` * β u n are bounded. As, according to (.d), tax n z n u ně0 is bounded, it ně0 follows that tu n u ně0 and tax n u ně0 are bounded. The fact that A is invertible implies that tx n u ně0 is bounded..3 The cluster points of tpx n, y n, z n, u n qu ně0 are KKT points We will close this section dedicated to the convergence analysis of the sequence generated by Algorithm in a general framework by proving that any cluster point of tpx n, y n, z n, u n qu ně0 is a KKT point of the optimization problem (.). We provided above general conditions which guarantee both the descent inequality (.7), with positive constants C, C 3 and C 4, and the boundedness of the generated iterates. Lemma 5 and Theorem 7 provide one possible setting that ensures these two fundamental properties of the convergence analysis. We do not want to restrict ourselves to this particular setting and, therefore, we will work, from now on, under the following assumptions. Assumption. piq the functions F, G and H are bounded from below; piiq the linear operator A is surjective; piiiq every sequence tpx n, y n, z n, u n qu ně0 generated by the Algorithm is bounded: pivq H is Lipschitz continuous with constant L ą 0 on a convex bounded subset B ˆ B Ď R m ˆ R q containing tpx n, y n qu ně0. In other words, for any px, yq, px, y q P B ˆ B it holds ` x H px, yq x H `x, y, y H px, yq y H `x, y ď L px, yq `x, y ; (.4) β u n * ně0

12 pvq the parameters µ, β, τ ą 0 and 0 ă σ ď are such that τ ě β}a} and mintc, C 3, C 4 u ą 0, where C : τ L? ` β A 4στ βλ min paa T q 8? `στ ` L σβλ min paa T q, C 3 : µ L? C 4 : σβ. 6L σβλ min paa T q, (.5a) (.5b) (.5c) Remark 4. Being facilitated by the boundedness of the generated sequence, Assumption pivq not only guarantee the fulfilment of Assumption piiiq and pivq on a convex bounded set, but it also arises in a more natural way (see also [9]). Assumption pivq holds, for instance, if H is twice continuously differentiable. In addition, as (.4) implies for any px, yq, px, y q P B ˆ B that x H px, yq x H `x, y ` y H px, yq y H `x, y ď L? ` x x ` y y, we can take l,` l,` l 3,` : L?. (.6) As (.4a) - (.4c) are valid also on a convex bounded set, the descent inequality Ψ n` ` C x n` x n ` C 3 y n` y n ` C 4 u n` u n ď Ψ ě (.7) remains true, where the constants on the left-hand sided are given in (.5) and follow from (.8) under the consideration of (.6). A possible choice of the parameters of the algorithm such that min tc, C 3, C 4 u ą 0 can be obtained also from Lemma 5. The next result provide upper estimates for the limiting subgradients of the regularized function Ψ at px n, y n, z n, u n q for every n ě. Lemma 8. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. Then for any n ě it holds where D n : `d n x, d n y, d n z, d n u, d n x, dn u P BΨ pxn q, (.8) d n x : x H px n, y n q ` A T u n ` βa T pax n z n q ` C px n x n q ` σc 0 B T `A T pu n u n q ` σb px n x n q, d n y : y H px n, y n q y H px n, y n q ` µ py n y n q, d n z : u n u n ` βa px n x n q, d n u : Ax n z n ` C 0 A `A T pu n u n q ` σb px n x n q, (.9a) (.9b) (.9c) (.9d) d n x : σc 0B T `A T pu n u n q ` σb px n x n q C px n x n q, (.9e) d n u : C 0A `A T pu n u n q ` σb px n x n q. (.9f) In addition, for any n ě it holds where D n ď C 5 x n x n ` C 6 y n y n ` C 7 u n u n, (.30) C 5 :? L ` τ ` β A ` 4 pστ ` A q στc 0 ` 4C, C 6 : L? ` µ, C 7 : ` σβ ` ˆ σ A ` 4 pστ ` A q C 0 A. (.3a) (.3b) (.3c)

13 Proof. Let n ě be fixed. Applying the calculus rules of the limiting subdifferential we get x Ψ px n q x H px n, y n q ` A T u n ` βa T pax n z n q ` C px n x n q ` σc 0 B `A T T pu n u n q ` σb px n x n q, B y Ψ px n q BG py n q ` y H px n, y n q, B z Ψ px n q BF pz n q u n β pax n z n q, u Ψ px n q Ax n z n ` C 0 A `A T pu n u n q ` σb px n x n q, x Ψ px n q σc 0 B `A T T pu n u n q ` σb px n x n q C px n x n q, u Ψ px n q C 0 A `A T pu n u n q ` σb px n x n q. (.3a) (.3b) (.3c) (.3d) (.3e) (.3f) Then (.9a) and (.9d) - (.9f) follow directly from (.3a) and (.3d) - (.3f), respectively. By combining (.3b) with the optimality criterion for (.a) 0 P G py n q ` y H px n, y n q ` µ py n y n q, we obtain (.9b). Similarly, by combining (.3c) with the optimality criterion for (.b) 0 P F pz n q u n β pax n z n q, we get (.9c). In the following we will derive the upper estimates for the components of the limiting subgradient. From (.) it follows d n x ď x H px n, y n q ` A T u n ` β A Ax n z n ` `C ` σ τ C 0 xn x n In addition, we have ` στc 0 A u n u n ď L? ˆ ` τ ` C ` σ τ C 0 x n x n ` σ ` στc 0 A u n u n. d n y ď L? xn x n ` L? ` µ y n y n, d n z ď β A x n x n ` u n u n, ˆ d n u ď στc 0 A x n x n ` σβ ` C 0 A u n u n, d n x ď `σ τ C 0 ` C xn x n ` στc 0 A u n u n, d n u ď στc 0 A x n x n ` C 0 A u n u n. The inequality (.30) follows by combining the above relations with (.3). We denote by Ω : Ω `tx n u ně the set of cluster points of the sequence txn u ně Ď R, which is nonempty thanks to the boundedness of tx n u ně. The distance function of the set Ω is defined for any X P R by dist px, Ωq : inf t X Y : Y P Ωu. The main result of this section follows. Theorem 9. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. The following statements are true: piq if tpx nk, y nk, z nk, u nk qu kě0 is a subsequence of tpx n, y n, z n, u n qu ně0 which converges to px, y, z, u q as k Ñ `8, then lim kñ`8 Ψ n k Ψ px, y, z, u, x, u q ; piiq it holds Ω Ď crit pψq Ď tx P R : A T u x H px, y q, 0 P BG py q ` y H px, y q, u P BF pz q, z Ax u, (.33) where X : px, y, z, u, x, u q; 3

14 piiiq it holds lim dist px n, Ωq 0; nñ`8 pivq the set Ω is nonempty, connected and compact; pvq the function Ψ takes on Ω the value Ψ lim Ψ n lim tf pz nq ` G py n q ` H px n, y n qu. nñ`8 nñ`8 Proof. Let px, y, z, u q P R m ˆ R q ˆ R p ˆ R p be such that the subsequence tx nk of tx n u ně converges to X : px, y, z, u, x, u q. (i) From (.a) and (.b) we have for any k ě and : px nk, y nk, z nk, u nk, x nk, u nk qu kě G py nk q ` x y H px nk, y nk q, y nk y nk y ` µ y n k y nk ď G py q ` x y H px nk, y nk q, y y nk y ` µ y y nk F pz nk q ` xu nk, Ax nk z nk y ` β Ax n k z nk ď F pz q ` xu nk, Ax nk z y ` β Axnk z, respectively. From (.d) and Theorem 6 follows Ax z. Taking the limit superior as k Ñ `8 on both sides of the above inequalities, we get lim sup kñ`8 F pz nk q ď F pz q and lim sup G py nk q ď G py q kñ`8 which, combined with the lower semicontinuity of F and G, lead to lim F pz n k q F pz q and lim G py n k q G py q. kñ`8 kñ`8 The desired statement follows thanks to the continuity of H. (ii) For the sequence td n u ně0 defined in (.8) - (.9), we have that D nk P BΨ px nk q for any k ě and D nk Ñ 0 as k Ñ `8, while X nk Ñ X and Ψ nk Ñ ΨpX q as k Ñ `8. The closedness criterion of the limiting subdifferential guarantees that 0 P BΨpX q or, in other words, X P crit pψq. Choosing now an element X P crit pψq, it holds which is further equivalent to (.33). $ 0 x H px, y q ` A T u ` βa T pax z q, & 0 P BG py q ` y H px, y q, 0 P BF pz q u β pax z q, % 0 Ax z, (iii)-(iv) The proof follows in the lines of the proof of Theorem 5 (ii)-(iii) in [9], also by taking into consideration [9, Remark 5], according to which the properties in (iii) and (iv) are generic for sequences satisfying X n X n Ñ 0 as n Ñ `8, which is indeed the case due to (.9). (v) Due to (.9) and the fact that tu n u ně0 is bounded, the sequences tf pz n q ` G py n q ` H px n, y n qu ně0 and tψ n u ně0 have the same limit Ψ lim Ψ n lim tf pz nq ` G py n q ` H px n, y n qu. nñ`8 nñ`8 The conclusion follows by taking into consideration the first two statements of this theorem. Remark 5. An element px, y, z, u q fulfilling (.33) is a so-called KKT point of the optimization problem (.). Such a KKT point obviously fulfils 0 P A T BF pax q ` x H px, y q, 0 P BG py q ` y H px, y q. (.34) 4

15 If A is injective, then this system of inclusions is further equivalent to 0 P B pf Aq px q ` x H px, y q B x pf A ` Hq, 0 P BG py q ` y H px, y q B y pg ` Hq, (.35) in other words, px, y q is a critical point of the optimization problem (.). On the other hand, if the functions F, G and H are convex, then, even without asking A to be injective, (.34) and (.35) are equivalent, which means that px, y q is a global minimum of the optimization problem (.). 3 Global convergence and rates In this section we will prove global convergence for the sequence tpx n, y n, z n, u n qu ně0 generated by Algorithm in the context of the Kurdyka- Lojasiewicz property and provide convergence rates for it in the context of the Lojasiewicz property. 3. Global convergence under Kurdyka- Lojasiewicz assumptions The origins of this notion go back to the pioneering work of Kurdyka who introduced in [7] a general form of the Lojasiewicz inequality [0]. An extension to the nonsmooth setting has been proposed and studied in [6, 7, 8]. Definition. Let η P p0, `8s. We denote by Φ η the set of all concave and continuous functions ϕ: r0, ηq Ñ r0, `8q which satisfy the following conditions: piq ϕ p0q 0; piiq ϕ is C on p0, ηq and continuous at 0; piiiq for any s P p0, ηq : ϕ psq ą 0. Definition. Let Ψ: R d Ñ R Y t`8u be proper and lower semicontinuous. piq The function Ψ is said to have the Kurdyka- Lojasiewicz (K L) property at a point pv P dombψ : v P R d : BΨ pvq H (, if there exists η P p0, `8s, a neighborhood V of pv and a function ϕ P Φ η such that for any the following inequality holds v P V X rψ ppvq ă Ψ pvq ă Ψ ppvq ` ηs ϕ pψ pvq Ψ ppvqq dist p0, BΨ pvqq ě. piiq If Ψ satisfies the K L property at each point of dombψ, then Ψ is called K L function. The functions ϕ belonging to the set Φ η for η P p0, `8s are called desingularization functions. The K L property reveals the possibility to reparametrize the values of Ψ in order to avoid flatness around the critical points. To the class of K L functions belong semialgebraic, real subanalytic, uniformly convex functions and convex functions satisfying a growth condition. We refer to [,, 3, 6, 7, 8, 9] for more properties of K L functions and illustrating examples. The following result, the proof of which can be found in [9, Lemma 6], will play an essential role in our convergence analysis. Lemma 0. (Uniformized K L property) Let Ω be a compact set and Ψ: R d Ñ RYt`8u be a proper and lower semicontinuous function. Assume that Ψ is constant on Ω and satisfies the K L property at each point of Ω. Then there exist ε ą 0, η ą 0 and ϕ P Φ η such that for any pv P Ω and every element u in the intersection v P R d : dist pv, Ωq ă ε ( X rψ ppvq ă Ψ pvq ă Ψ ppvq ` ηs it holds ϕ pψ pvq Ψ ppvqq dist p0, BΨ pvqq ě. 5

16 From now on we will use the following notations C 8 : min tc, C 3, C 4 u, C 9 : max tc 5, C 6, C 7 u and E n : Ψ n ě, where Ψ lim Ψ n. nñ`8 The next result shows that if Ψ is a K L function, then the sequence tpx n, y n, z n, u n qu ně0 converges to a KKT point of the optimization problem (.). This hypothesis is fulfilled if, for instance, F, G and H are semi-algebraic functions. Theorem. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. If Ψ is a K L function, then the following statements are true: piq the sequence tpx n, y n, z n, u n qu ně0 has finite length, namely, ÿ ně0 x n` x n ă `8, ÿ ně0 y n` y n ă `8, ÿ ně0 z n` z n ă `8, ÿ ně0 u n` u n ă `8; piiq the sequencetpx n, y n, z n, u n qu ně0 converges to a KKT point of the optimization problem (.). Proof. Let be X P Ω, thus Ψ px q Ψ. Recall that te n u ně is monotonically decreasing and converges to 0 as n Ñ `8. We consider two cases. Case. Assume that there exists an integer n ě such that E n 0 or, equivalently, Ψ n Ψ. Due to the monotonicity of te n u ně, it follows that E n 0 or, equivalently, Ψ n Ψ for any n ě n. The inequality (.7) yields for any n ě n ` x n` x n 0, y n` y n 0 and u n` u n 0. The inequality (.0) gives us further z n` z n 0 for any n ě n `. This proves (3.). Case. Consider now the case when E n ą 0 or, equivalently, Ψ n ą Ψ for any n ě. According to Lemma 0, there exist ε ą 0, η ą 0 and a desingularization function ϕ such that for any element X in the intersection tz P R: dist pz, Ωq ă εu X tz P R: Ψ ă Ψ pzq ă Ψ ` ηu (3.) it holds Let be n ě such that for any n ě n Since ϕ pψ pxq Ψ q dist p0, BΨ pxqq ě. Ψ ă Ψ n ă Ψ ` η. lim nñ`8 dist px n, Ωq 0 (see Lemma 9 piiiq), there exists n ě such that for any n ě n dist px n, Ωq ă ε. Consequently, X n px n, y n, z n, u n, x n, u n q belongs to the intersection in (3.) for any n ě n 0 : max tn, n u, which further implies (3.) ϕ pψ n Ψ q dist p0, BΨ px n qq ϕ pe n q dist p0, BΨ px n qq ě. (3.3) Define for two arbitrary nonnegative integers i and j i,j : ϕ pψ i Ψ q ϕ pψ j Ψ q ϕ pe i q ϕ pe j q. The monotonicity of the sequence tψ n u ně0 and of the function ϕ implies that i,j ě 0 for any ď i ď j. In addition, for any N ě n 0 ě it holds Nÿ from which we get ÿ ně n,n` ă `8. n n 0 n,n` n0,n` ϕ pe n0 q ϕ pe N` q ď ϕ pe n0 q, 6

17 By combining Lemma 4 with the concavity of ϕ we obtain for any n ě n,n` ϕ pe n q ϕ pe n` q ě ϕ pe n q pe n E n` q ϕ pe n q pψ n Ψ n` q ě min tc, C 3, C 4 u ϕ pe n q x n` x n ` y n` y n ` u n` u n. Thus, (3.3) implies for any n ě n 0 x n` x n ` y n` y n ` u n` u n ď dist p0, BΨ px n qq ϕ pe n q x n` x n ` y n` y n ` u n` u n ď C 8 dist p0, BΨ px n qq n,n`. By the Cauchy-Schwarz inequality, the arithmetic mean-geometric mean inequality and Lemma 8, we have that for any n ě n 0 and every α ą 0 If we denote for any n ě 0 x n` x n ` y n` y n ` u n` u n ď? b 3 x n` x n ` y n` y n ` u n` u n ď a b 3C 8 dist p0, BΨ px n qq n,n` ď α dist p0, BΨ px n qq ` 3C 8 4α n,n` ď αc 9 p x n x n ` y n y n ` u n u n q ` 3C 8 4α n,n`. (3.4) a n : x n x n ` y n y n ` u n u n and b n : 3C 8 4α n,n`, (3.5) then the above inequality is nothing else than (.6) with χ 0 : αc 9 and χ : 0. Since ÿ ně b n ă `8, by choosing α ă {C 9, we can apply Lemma 3 to conclude that ÿ ně0 x n` x n ` y n` y n ` u n` u n ă `8. The proof of (3.) is completed by taking into account once again (.0). From (i) it follows that the sequence tpx n, y n, z n, u n qu ně0 is Cauchy, thus it converges to an element px, y, z, u q which is, according to Lemmas 9, a KKT point of the optimization problem (.). 3. Convergence rates In this section we derive convergence rates for the sequence tpx n, y n, z n, u n qu ně0 generated by Algorithm as well as for tψ n u ně0, if the regularized augmented Lagrangian Ψ satisfies the Lojasiewicz property. The following definition is from [] (see also [0]). Definition 3. Let Ψ: R d Ñ R Y t`8u be proper and lower semicontinuous. Then Ψ satisfies the Lojasiewicz property, if for any critical point pv of Ψ there exists C L ą 0, θ P r0, q and ε ą 0 such that Ψ pvq Ψ ppvq θ ď C L dist p0, P Ball ppv, εq, where Ball ppv, εq denotes the open ball with center pv and radius ε. If Assumption is fulfilled and tpx n, y n, z n, u n qu ně0 is the sequence generated by Algorithm, then, according to Theorem 9, the set of cluster points Ω is nonempty, compact and connected and Ψ takes on Ω the value Ψ ; in addition, Ω Ď crit pψq. According to [, Lemma ], if Ψ has the Lojasiewicz property, then there exist C L ą 0, θ P r0, q and ε ą 0 such that for any X P tz P R: dist pz, Ωq ă εu, 7

18 it holds Ψ pxq Ψ θ ď C L dist p0, BΨ pxqq. Obviously, Ψ is a K L function with desingularization function ϕ : r0, `8q Ñ r0, `8q, ϕ psq : θ C Ls θ, which, according to Theorem, means that Ω contains a single element X, which is the limit of tx n u ně as n Ñ `8. In other words, if Ψ has the Lojasiewicz property, then there exist C L ą 0, θ P r0, q and ε ą 0 such that for any X P Ball px, εq Ψ pxq Ψ θ ď C L dist p0, BΨ pxqq. (3.6) In this case, Ψ is said to satisfy the Lojasiewicz property with Lojasiewicz constant C L ą 0 and Lojasiewicz exponent θ P r0, q. The following lemma will provide convergence rates for a particular class of monotonically decreasing real sequences converging to 0. Its proof can be found in [3, Lemma 5]. Lemma. Let te n u ně0 be a monotonically decreasing sequence of nonnegative numbers converging 0. Assume further that there exists natural numbers n 0 ě such that for any n ě n 0 e n e n ě C e e θ n, where C e ą 0 is some constant and θ P r0, q. The following statements are true: piq if θ 0, then te n u ně0 converges in finite time; piiq if θ P p0, {s, then there exist C e,0 ą 0 and Q P r0, q such that for any n ě n 0 0 ď e n ď C e,0 Q n ; piiiq if θ P p{, q, then there exists C e, ą 0 such that for any n ě n 0 ` 0 ď e n ď C e, n θ. We prove a recurrence inequality for the sequence te n u ně0. Lemma 3. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. If Ψ satisfies the Lojasiewicz property with Lojasiewicz constant C L ą 0 and Lojasiewicz exponent θ P r0, q, then there exists n 0 ě such that the following estimate holds for any n ě n 0 E n E n ě C 0 En θ C 8, where C 0 : 3 pc L C 9 q. (3.7) Proof. For every n ě we obtain from Lemma 4 E n E n Ψ n Ψ n ě C 8 x n x n ` y n y n ` u n u n ě 3 C 8 p x n x n ` y n y n ` u n u n q ě C 0 C L D n, where D n P BΨpX n q. Let ε ą 0 be such that (3.6) is fulfilled and choose n 0 ě with the property that for any n ě n 0, X n belongs to BallpX, εq. Relation (3.6) implies (3.7) for any n ě n 0. The following result follows by combining Lemma with Lemma 3. Theorem 4. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. If Ψ satisfies the Lojasiewicz property with Lojasiewicz constant C L ą 0 and Lojasiewicz exponent θ P r0, q, then the following statements are true: piq if θ 0, then tψ n u ně converges in finite time; 8

19 piiq if θ P p0, {s, then there exist n 0 ě, p C 0 ą 0 and Q P r0, q such that for any n ě n 0 0 ď Ψ n Ψ ď p C 0 Q n ; piiiq if θ P p{, q, then there exist n 0 ě and p C ą 0 such that for any n ě n 0 ` 0 ď Ψ n Ψ ď p C n θ. The next lemma will play an important role when transferring the convergence rates for tψ n u ně0 to the sequence of iterates tpx n, y n, z n, u n qu ně0. Lemma 5. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. Let px, y, z, u q be the KKT point of the optimization problem (.) to which tpx n, y n, z n, u n qu ně0 converges as n Ñ `8. Then there exists n 0 ě such that the following estimates hold for any n ě n 0!a )!a ) x n x ď C max En, ϕ pe n q, y n y ď C max En, ϕ pe n q,!a )!a ) z n z ď C max En, ϕ pe n q, u n u ď C max En, ϕ pe n q, (3.8) where C : a 3C 8 ` 3C 8 C 9 and C : ˆ A ` C. σβ Proof. We assume that E n ą 0 for any n ě 0. Otherwise, the sequence tpx n, y n, z n, u n qu ně0 becomes identical to px, y, z, u q beginning with a given index and the conclusion follows automatically (see the proof of Theorem ). Let ε ą 0 be such that (3.6) is fulfilled and n 0 ě be such that X n belongs to BallpX, εq for any n ě n 0. We fix n ě n 0 now. One can easily notice that x n x ď x n` x n ` x n` x ď ď ÿ x k` x k. kěn Similarly, we derive y n y ď ÿ kěn y k` y k, z n z ď ÿ z k` z k, kěn u n u ď ÿ u k` u k. kěn On the other hand, in view of (3.5) and by taking α : C 9 the inequality (3.4) can be written as a n` ď a n ` b ě n 0. Let us fix now an integer N ě n. Summing up the above inequality for k n,..., N, we have Nÿ a k` ď k n ď Nÿ Nÿ a k ` b k k n Nÿ k n k n Nÿ Nÿ a k` ` a n a N` ` k n a k` ` a n ` 3C 8C 9 ϕ pe n q. By passing N Ñ `8, we obtain ÿ a k` ÿ p x k` x k ` y k` y k ` u k` u k q kěn kěn which gives the desired statement. ď p x n` x n ` y n` y n ` u n` u n q ` 3C 8 C 9 ϕ pe n q ď? b 3 x n` x n ` y n` y n ` u n` u n ` 3C 8 C 9 ϕ pe n q ď a 3C 8 ae n E n` ` 3C 8 C 9 ϕ pe n q, k n b k 9

20 We can now formulate convergence rates for the sequence of generated iterates. Theorem 6. Let Assumption be satisfied and tpx n, y n, z n, u n qu ně0 be a sequence generated by Algorithm. Suppose further that Ψ satisfies the Lojasiewicz property with Lojasiewicz constant C L ą 0 and Lojasiewicz exponent θ P r0, q. Let px, y, z, u q be the KKT point of the optimization problem (.) to which tpx n, y n, z n, u n qu ně0 converges as n Ñ `8. Then the following statements are true: piq if θ 0, then the algorithm converges in finite time; piiq if θ P p0, {s, then there exist n 0 ě, p C0,, p C 0,, p C 0,3, p C 0,4 ą 0 and p Q P r0, q such that for any n ě n 0 x n x ď p C 0, p Q k, y n y ď p C 0, p Q k, z n z ď p C 0,3 p Q k, u n u ď p C 0,4 p Q k ; piiiq if θ P p{, q, then there exist n 0 ě and p C,, p C,, p C,3, p C,4 ą 0 such that for any n ě n 0 ` Proof. Let be the desingularization function. x n x ď p C, n θ θ, yn y ď p C, n θ θ, z n z ď p C,3 n θ θ, un u ď p C,4 n θ θ. ϕ : r0, `8q Ñ r0, `8q, s ÞÑ θ C Ls θ, (i) If θ 0, then tψ n u ně converges in finite time. As seen in the proof of Theorem, the sequence tpx n, y n, z n, u n qu ně0 becomes identical to px, y, z, u q starting from a given index. In other words, the sequence tpx n, y n, z n, u n qu ně0 converges also in finite time and the conclusion follows. Let be θ and n 0 ě such that for any n ě n 0 the inequalities (3.8) in Lemma 5 and hold. E n ď ˆ θ θ C L (ii) If θ P p0, {q, then θ ă 0 and thus for any n ě n 0 θ C LE θ n ď a E n, which implies that If θ {, then thus In both cases we have!a ) max En, ϕ pe n q a E n. ϕ pe n q C L a En,!a ) max En, ϕ pe n q max t, C L u ae ě.!a ) max En, ϕ pe n q ď max t, C L u ae ě n 0. By Theorem 4, there exist n 0 ě, C0 p ą 0 and Q P r0, q such that for Q p :? Q and every n ě n 0 it holds a b b En ď pc 0 Q n{ pc 0Q p n. The conclusion follows from Lemma 5 for n 0 : max tn 0, n 0u. 0

21 (iii) If θ P p{, q, then θ ą 0 and thus for any n ě n 0 a En ď θ C LE θ n, which implies that!a ) max En, ϕ pe n q ϕ pe n q θ C LEn θ. By Theorem 4, there exist n 0 ě and p C ą 0 such that for any n ě n 0 θ C LE θ n ď θ C L p C θ pn q θ θ. The conclusion follows again for n 0 : max tn 0, n 0u from Lemma 5. References [] H. Attouch, J. Bolte. On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Mathematical Programming 6(), 5 6 (009) [] H. Attouch, J. Bolte, P. Redont, A. Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka Lojasiewicz inequality. Mathematics of Operations Research 35(), (00) [3] H. Attouch, J. Bolte, B. F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Mathematical Programming 37( ), 9 9 (03) [4] H.H. Bauschke, P.L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, New York (0) [5] A. Beck. First-Order Methods in Optimization. MOS-SIAM Series on Optimization. SIAM, Philadelphia (07) [6] J. Bolte, A. Daniilidis, A. Lewis. The Lojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM Journal on Optimization 7(4), 05 3 (006) [7] J. Bolte, A. Daniilidis, A. Lewis, M. Shiota. Clarke subgradients of stratifiable functions. SIAM Journal on Optimization 8(), (007) [8] J. Bolte, A. Daniilidis, O. Ley, L. Mazet. Characterizations of Lojasiewicz inequalities: subgradient flows, talweg, convexity. Transactions of the American Mathematical Society 36(6), (00) [9] J. Bolte, S. Sabach, M. Teboulle. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 46(): (04) [0] J. Bolte, S. Sabach, M. Teboulle. Nonconvex Lagrangian-based optimization: monitoring schemes and global convergence. Mathematics of Operations Research, to appear. org/0.87/moor [] R. I. Boţ, E. R. Csetnek. An inertial Tseng s type proximal algorithm for nonsmooth and nonconvex optimization problems. Journal of Optimization Theory and Applications 7(), (06) [] R. I. Boţ, E. R. Csetnek, S. C. László. An inertial forward-backward algorithm for the minimization of the sum of two nonconvex functions. EURO Journal on Computational Optimization 4(), 3 5 (06) [3] R. I. Boţ, D.-K. Nguyen. The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates. arxiv:

22 [4] P. L. Combettes, V. R. Wajs. Signal recovery by proximal forward-backward splitting. Multiscale Modeling and Simulation 4(4), (005) [5] W. Hare, C. Sagastizábal. Computing proximal points of nonconvex functions. Mathematical Programming 6(-), 58 (009) [6] M. Hong, Z.-Q. Luo. On the linear convergence of the alternating direction method of multipliers. Mathematica Programming 6, (07) [7] K. Kurdyka. On gradients of functions definable in o-minimal structures. Annales de l Institut Fourier 48, (998) [8] A. Lewis, J. Malick. Alternating projection on manifolds. Mathematics of Operations Research 33(), 6 34 (008) [9] G. Li, T. K. Pong. Global convergence of splitting methods for nonconvex composite optimization. SIAM Journal on Optimization 5(4), (05) [0] S. Lojasiewicz. Une propriété topologique des sous-ensembles analytiques réels, Les Équations aux Dérivées Partielles. Éditions du Centre National de la Recherche Scientifique, Paris, 8 89 (963) [] B. Mordukhovich. Variational Analysis and Generalized Differentiation, I: Basic Theory, II: Applications. Springer, Berlin (006) [] J. Moreau. Fonctions convexes duales et points proximaux dans un espace hilbertien. Comptes Rendus de l Académie des Sciences (Paris), Série A 55, (96) [3] T. Pock, S. Sabach. Inertial proximal alternating linearized minimization (ipalm) for nonconvex and nonsmooth problems, SIAM Journal Imaging Sciences 9(4), (06) [4] R. T. Rockafellar, R. J.-B. Wets. Variational Analysis. Fundamental Principles of Mathematical Sciences 37. Springer, Berlin (998) [5] L. Yang, T. K. Pong and X. Chen. Alternating Direction Method of Multipliers for a class of nonconvex and nonsmooth problems with applications to background/foreground extraction. SIAM Journal on Imaging Sciences, 0(), 74 0 (07)

The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates

The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates Radu Ioan Boţ Dang-Khoa Nguyen January 6, 08 Abstract. We propose two numerical algorithms

More information

An inertial forward-backward algorithm for the minimization of the sum of two nonconvex functions

An inertial forward-backward algorithm for the minimization of the sum of two nonconvex functions An inertial forward-backward algorithm for the minimization of the sum of two nonconvex functions Radu Ioan Boţ Ernö Robert Csetnek Szilárd Csaba László October, 1 Abstract. We propose a forward-backward

More information

A gradient type algorithm with backward inertial steps for a nonconvex minimization

A gradient type algorithm with backward inertial steps for a nonconvex minimization A gradient type algorithm with backward inertial steps for a nonconvex minimization Cristian Daniel Alecsa Szilárd Csaba László Adrian Viorel November 22, 208 Abstract. We investigate an algorithm of gradient

More information

Convergence rates for an inertial algorithm of gradient type associated to a smooth nonconvex minimization

Convergence rates for an inertial algorithm of gradient type associated to a smooth nonconvex minimization Convergence rates for an inertial algorithm of gradient type associated to a smooth nonconvex minimization Szilárd Csaba László November, 08 Abstract. We investigate an inertial algorithm of gradient type

More information

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION
