Numerical methods for approximating zeros of operators and for solving variational inequalities, with applications


Faculty of Mathematics and Computer Science, Babeş-Bolyai University

Erika Nagy

Numerical methods for approximating zeros of operators and for solving variational inequalities, with applications

Doctoral Thesis Summary

Supervisor: Prof. Dr. Gábor Kassay

Cluj-Napoca, 2013


Contents

1 Introduction
2 Background of the theory of monotone operators
  2.1 Operators
  2.2 Functions
3 The primal-dual splitting technique
  3.1 Monotone inclusion problem formulation
  3.2 The primal-dual splitting algorithm
  3.3 Applications to convex minimization problems
4 Numerical experiments
  4.1 Image processing
    4.1.1 Image deblurring
    4.1.2 Experimental setting
  4.2 Multifacility location problems
    4.2.1 Problem manipulation
    4.2.2 Computational experience
  4.3 Average consensus for colored networks
    4.3.1 Proposed algorithm
    4.3.2 Performance evaluation
  4.4 Support-vector machines classification
    4.4.1 The soft-margin hyperplane
    4.4.2 Experiments with digit recognition
5 Interdisciplinary application of our algorithm
  5.1 Optimization in biology
  5.2 Pattern recognition techniques
References

Chapter 1
Introduction

The existence of optimization methods can be traced back to the days of Newton, Lagrange, and Cauchy. The development of differential calculus methods for optimization was made possible by the contributions of Newton and Leibniz to calculus. The foundations of the calculus of variations, which deals with the minimization of functionals, were laid by Bernoulli, Euler, Lagrange and Weierstrass. The method of optimization for constrained problems, which involves the addition of unknown multipliers, became known by the name of its inventor, Lagrange. Cauchy made the first application of the steepest descent method to solve unconstrained optimization problems. By the middle of the twentieth century, high-speed digital computers made the implementation of complex optimization procedures possible and stimulated further research on newer methods. Spectacular advances followed, producing a massive literature on optimization techniques. This advancement also resulted in the emergence of several well-defined new areas in optimization theory [67].

In recent years several splitting algorithms [43] have emerged for solving monotone inclusion problems involving parallel sums and compositions with linear continuous operators, which are eventually reduced to finding the zeros of the sum of a maximal monotone operator and a cocoercive, or a monotone and Lipschitz continuous, operator. The latter problems were solved by employing, in an appropriate product space, forward-backward or forward-backward-forward algorithms, respectively, and gave rise to so-called primal-dual splitting methods (see [9, 16, 20, 23, 61] and the references therein).

Recently, one can remark the interest of researchers in solving systems of monotone inclusion problems [1, 5, 6, 22]. This is motivated by the fact that convex optimization problems arising, for instance, in areas like image processing [13], multifacility location problems [18, 31], average consensus in network coloring [46, 47] and support vector machines classification [27, 28] are to be solved with respect to multiple variables, very often linked in different manners, for instance by linear equations.

The present research is motivated by the investigations made in [1]. The authors propose there an algorithm for solving coupled monotone inclusion problems, where the variables are linked by some operators which jointly satisfy a cocoercivity property. Our aim is to overcome the necessity of having differentiability for some of the functions occurring in the objective of the convex optimization problems in [1]. To this end we consider first a more general system of monotone inclusions, for which the coupling operator satisfies a Lipschitz continuity property, along with its dual system of monotone inclusions in an extended sense of the Attouch-Théra duality (see [2]). The simultaneous solving of the primal and dual system of monotone inclusions is reduced to the problem of finding the zeros of the sum of a maximal monotone operator and a monotone and Lipschitz continuous operator in an appropriate product space. The latter problem is solved by a forward-backward-forward algorithm, a fact that allows us to provide both weak and strong convergence assertions for the resulting iterative scheme, which proves to have a highly parallelizable formulation.

The thesis is organized in five chapters and a bibliography. In the first chapter we state the initial problem that stimulated this research. H. Attouch, L. M. Briceño-Arias and P. L. Combettes proposed in [1] a parallel splitting method for solving systems of coupled monotone inclusions in Hilbert spaces, establishing its convergence under the assumption that solutions exist. The method can handle an arbitrary number of variables and nonlinear coupling schemes.

In the second chapter we give some necessary notations and preliminary results in order to facilitate the reading of the manuscript and to help the reader understand the following parts more easily. Notions like nonexpansive operators, parallel sum, cocoercivity, conjugation, Fenchel duality, subdifferentiability and maximal monotone operators are introduced together with some essential splitting algorithms like

the Douglas-Rachford algorithm, the forward-backward algorithm and Tseng's algorithm. These algorithms have been studied more extensively in [4].

Chapter three is devoted to the primal-dual splitting algorithm for solving the problem considered in this thesis and investigates its convergence behaviour. The operators arising in each of the inclusions of the system are processed separately in every iteration: the single-valued ones are evaluated explicitly, which corresponds to a forward step, while the set-valued ones are evaluated via their resolvents, which corresponds to a backward step. In addition, most of the steps in the iterative scheme can be executed simultaneously, making the method applicable to a variety of convex minimization problems. The solving of convex optimization problems with multiple variables is also presented in this chapter.

In Chapter four, the numerical performance and effectiveness of the proposed splitting algorithm are emphasized through several numerical experiments: image processing, where a noisy and blurry image is restored; multifacility location problems, where the objective is to minimize the energy path between existing facilities and multiple new facilities, which have to be located among the existing ones; average consensus on colored networks, which consists in calculating the average of the value of each node in a recursive and distributed way; and image classification via support vector machines, where the aim is to construct a decision function with the help of training data, which should assign every new data point correctly, with a low misclassification rate.

Finally, the last part of the thesis presents a collaboration with a research team in biology studying the presence of picoalgae in Transylvanian salt lakes. The aim was to adapt the proposed algorithm for image processing and pattern recognition for microscopic images and electrophoretograms. The development process included working with epifluorescence microscopic images of picoalgae, and the goal was to count picoalgae and distinguish them from the background noise.

Keywords: convex optimization, coupled systems, forward-backward-forward algorithm, Lipschitz continuity, monotone inclusion, operator splitting, pattern recognition

Chapter 2
Background of the theory of monotone operators

The mathematical notion of a Hilbert space was named after the German mathematician David Hilbert. The earliest Hilbert spaces were studied as infinite-dimensional function spaces in the first decade of the 20th century by David Hilbert, Frigyes Riesz and Erhard Schmidt.

Throughout this work R+ denotes the set of nonnegative real numbers, R++ the set of strictly positive real numbers and R̄ = R ∪ {±∞} the extended real line. We consider Hilbert spaces endowed with the scalar (or inner) product ⟨·, ·⟩ and the associated norm ‖·‖ = √⟨·, ·⟩. In order to avoid confusion, when needed, appropriate indices will be used for the inner product and the norm. The symbols ⇀ and → denote weak and strong convergence, respectively.

2.1 Operators

Let H be a real Hilbert space and let 2^H be the power set of H, i.e., the family of all subsets of H. The notation M : H → 2^H means that M is a set-valued operator, i.e., M maps every point x ∈ H to a set Mx ⊆ H. We denote by zer M = {x ∈ H : 0 ∈ Mx} the set of zeros of M and by gra M = {(x, u) ∈ H × H : u ∈ Mx} the graph of M. The domain and the range of M are dom M = {x ∈ H : Mx ≠ ∅} and ran M = M(H),

respectively. In addition, the closure of dom M is denoted by cl(dom M) and the closure of ran M is denoted by cl(ran M). The reversal of M is M̌ : x ↦ −M(−x). The inverse of M, denoted by M^{-1} : H → 2^H, is defined through its graph gra M^{-1} = {(u, x) ∈ H × H : (x, u) ∈ gra M}. Thus, for every (x, u) ∈ H × H, u ∈ Mx ⟺ x ∈ M^{-1}u. Moreover, dom M^{-1} = ran M and ran M^{-1} = dom M.

The sum and the parallel sum of two set-valued operators M, N : H → 2^H are defined as

M + N : H → 2^H, (M + N)(x) = M(x) + N(x), x ∈ H,   (2.1.1)

and

M □ N : H → 2^H, M □ N = (M^{-1} + N^{-1})^{-1}.   (2.1.2)

Let G be another real Hilbert space and consider the single-valued operator L : H → G (also called a mapping), which maps every point x in H to a point Lx in G. The norm of the linear continuous operator L is defined as ‖L‖ = sup{‖Lx‖ : x ∈ H, ‖x‖ ≤ 1}, while L* : G → H, defined by ⟨Lx, y⟩ = ⟨x, L*y⟩ for all (x, y) ∈ H × G, denotes the adjoint operator of L.

2.2 Functions

Let H be a real Hilbert space.

Definition. Consider the function f : H → R̄. The effective domain of f is dom f = {x ∈ H : f(x) < +∞}, the graph of f is gra f = {(x, ξ) ∈ H × R : f(x) = ξ},

the epigraph of f is epi f = {(x, ξ) ∈ H × R : f(x) ≤ ξ}, the lower level set of f at height ξ ∈ R is lev_{≤ξ} f = {x ∈ H : f(x) ≤ ξ}, and the strict lower level set of f at height ξ ∈ R is lev_{<ξ} f = {x ∈ H : f(x) < ξ}. The function f is called proper if dom f ≠ ∅ and f(x) > −∞ for all x ∈ H. Furthermore, the closures of dom f and epi f are denoted by cl(dom f) and cl(epi f), respectively. We denote by Γ(H) the set of proper, convex and lower semicontinuous functions f : H → R̄.

The indicator function δ_C : H → R̄ of a set C ⊆ H is defined by

δ_C(x) = 0, for x ∈ C; +∞, otherwise.   (2.2.1)

Note that the indicator function δ_C is lower semicontinuous if and only if C is closed.

Definition. Let f and g be two proper functions from H to R̄. The infimal convolution (or epi-sum) of f and g is defined by

f □ g : H → R̄, (f □ g)(x) = inf_{y ∈ H} {f(y) + g(x − y)}.

Definition. Let f : H → R̄. The conjugate function (or Fenchel conjugate) of f is f* : H → R̄, f*(u) = sup{⟨u, x⟩ − f(x) : x ∈ H} for all u ∈ H, and the biconjugate of f is f** = (f*)*. Note that, if f ∈ Γ(H), then f* ∈ Γ(H) as well.

The following version of Fenchel's duality theorem was obtained by R. T. Rockafellar [53].

Theorem. Let H be a Hilbert space and let f, g : H → R ∪ {+∞} be proper convex functions.

(i) If dom f ∩ dom g contains a point at which f or g is continuous, then (f + g)* = f* □ g* with exact infimal convolution.

(ii) Suppose that f and g belong to Γ(H). If dom f* ∩ dom g* contains a point at which f* or g* is continuous, then f □ g = (f* + g*)* with exact infimal convolution.

Definition. The primal problem associated with the sum of two proper functions f : H → R̄ and g : H → R̄ is

min_{x ∈ H} {f(x) + g(x)},

its dual problem is

min_{u ∈ H} {f*(u) + g*(−u)},

the primal optimal value is µ = inf(f + g)(H), the dual optimal value is µ* = inf(f* + ǧ*)(H), and the duality gap is

∆(f, g) = 0, if µ = −µ* ∈ R; µ + µ*, otherwise.   (2.2.2)
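Conjugation and the infimal convolution also translate directly into computable quantities. The following MATLAB sketch, which is our own illustration and not part of the thesis, verifies numerically the Moreau decomposition x = prox_{γf}(x) + γ prox_{γ^{-1}f*}(x/γ), linking a function and its Fenchel conjugate; the function f = λ‖·‖_1 and the parameter values are chosen only for demonstration purposes.

    % Numerical check of the Moreau decomposition for f = lambda*||.||_1:
    % prox_{gamma f} is soft thresholding, and f* is the indicator of the
    % ball ||u||_inf <= lambda, whose prox is the projection onto that ball.
    lambda = 0.5; gamma = 2.0;
    x = randn(5, 1);

    soft  = @(z, t) sign(z) .* max(abs(z) - t, 0);   % prox of t*||.||_1
    projB = @(u, t) max(min(u, t), -t);              % projection onto [-t, t]^n

    p   = soft(x, gamma*lambda);                     % prox_{gamma f}(x)
    d   = projB(x/gamma, lambda);                    % prox_{(1/gamma) f*}(x/gamma)
    err = norm(x - (p + gamma*d));                   % should be close to 0
    fprintf('Moreau decomposition residual: %.2e\n', err);

The same identity is what makes proximal steps for conjugate functions cheap whenever the primal proximal mapping is known in closed form.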

Chapter 3
The primal-dual splitting technique

The following results presented in the thesis are based on the article [10], which resulted from my cooperation with Professor Radu Ioan Boţ and Ernő Robert Csetnek.

3.1 Monotone inclusion problem formulation

The problem under consideration is a more general system of monotone inclusions than the one presented in [1], for which the coupling operator satisfies a Lipschitz continuity property, along with its dual system of monotone inclusions in an extended sense of the Attouch-Théra duality [2].

Problem 3.1.1. Let m ≥ 1 be a positive integer, let (H_i)_{1≤i≤m} be real Hilbert spaces and, for i = 1, ..., m, let B_i : H_1 × ... × H_m → H_i be a µ_i-Lipschitz continuous operator with µ_i ∈ R++, such that the operators B_i jointly satisfy the monotonicity property

∀(x_1, ..., x_m) ∈ H_1 × ... × H_m, ∀(y_1, ..., y_m) ∈ H_1 × ... × H_m:
Σ_{i=1}^m ⟨x_i − y_i, B_i(x_1, ..., x_m) − B_i(y_1, ..., y_m)⟩_{H_i} ≥ 0.   (3.1.1)

For every i = 1, ..., m, let G_i be a real Hilbert space, A_i : G_i → 2^{G_i} a maximal monotone operator, C_i : G_i → 2^{G_i} a monotone operator such that C_i^{-1} is ν_i-Lipschitz continuous with ν_i ∈ R+, and L_i : H_i → G_i a linear continuous operator. The problem is to solve

the system of coupled inclusions

find x_1 ∈ H_1, ..., x_m ∈ H_m such that
0 ∈ L_1*(A_1 □ C_1)(L_1 x_1) + B_1(x_1, ..., x_m)
⋮
0 ∈ L_m*(A_m □ C_m)(L_m x_m) + B_m(x_1, ..., x_m)   (3.1.2)

together with its dual system

find v_1 ∈ G_1, ..., v_m ∈ G_m such that there exist x_1 ∈ H_1, ..., x_m ∈ H_m with
0 = L_1* v_1 + B_1(x_1, ..., x_m), ..., 0 = L_m* v_m + B_m(x_1, ..., x_m)
and v_1 ∈ (A_1 □ C_1)(L_1 x_1), ..., v_m ∈ (A_m □ C_m)(L_m x_m).   (3.1.3)

We say that (x_1, ..., x_m, v_1, ..., v_m) ∈ H_1 × ... × H_m × G_1 × ... × G_m is a primal-dual solution to Problem 3.1.1 if

0 = L_i* v_i + B_i(x_1, ..., x_m) and v_i ∈ (A_i □ C_i)(L_i x_i), i = 1, ..., m.   (3.1.4)

If (x_1, ..., x_m, v_1, ..., v_m) ∈ H_1 × ... × H_m × G_1 × ... × G_m is a primal-dual solution to Problem 3.1.1, then (x_1, ..., x_m) is a solution to (3.1.2) and (v_1, ..., v_m) is a solution to (3.1.3). Notice also that

(x_1, ..., x_m) solves (3.1.2)
⟺ 0 ∈ L_i*(A_i □ C_i)(L_i x_i) + B_i(x_1, ..., x_m), i = 1, ..., m
⟺ there exist v_1 ∈ G_1, ..., v_m ∈ G_m such that 0 = L_i* v_i + B_i(x_1, ..., x_m) and v_i ∈ (A_i □ C_i)(L_i x_i), i = 1, ..., m.

Thus, if (x_1, ..., x_m) is a solution to (3.1.2), then there exists (v_1, ..., v_m) ∈ G_1 × ... × G_m such that (x_1, ..., x_m, v_1, ..., v_m) is a primal-dual solution to Problem 3.1.1 and, if (v_1, ..., v_m) ∈ G_1 × ... × G_m is a solution to (3.1.3), then there exists (x_1, ..., x_m) ∈ H_1 × ... × H_m such that (x_1, ..., x_m, v_1, ..., v_m) is a primal-dual solution to Problem 3.1.1.

3.2 The primal-dual splitting algorithm

The aim of this section is to provide an algorithm for solving Problem 3.1.1 and to furnish weak and strong convergence results for the sequences generated by it. The proposed iterative scheme has the property that each single-valued operator is processed explicitly, while each set-valued operator is evaluated via its resolvent. Absolutely summable sequences make the algorithm error-tolerant.

Algorithm 3.2.1. For every i = 1, ..., m let (a_{1,i,n})_{n≥0}, (b_{1,i,n})_{n≥0}, (c_{1,i,n})_{n≥0} be absolutely summable sequences in H_i and (a_{2,i,n})_{n≥0}, (b_{2,i,n})_{n≥0}, (c_{2,i,n})_{n≥0} absolutely summable sequences in G_i. Furthermore, set

β = max{ √(Σ_{i=1}^m µ_i^2), ν_1, ..., ν_m } + max_{i=1,...,m} ‖L_i‖,   (3.2.1)

let ε ∈ ]0, 1/(β + 1)[ and let (γ_n)_{n≥0} be a sequence in [ε, (1 − ε)/β]. For every i = 1, ..., m let the initial points x_{i,0} ∈ H_i and v_{i,0} ∈ G_i be chosen arbitrarily and set, for every n ≥ 0 and every i = 1, ..., m,

y_{i,n} = x_{i,n} − γ_n (L_i* v_{i,n} + B_i(x_{1,n}, ..., x_{m,n}) + a_{1,i,n})
w_{i,n} = v_{i,n} − γ_n (C_i^{-1} v_{i,n} − L_i x_{i,n} + a_{2,i,n})
p_{i,n} = y_{i,n} + b_{1,i,n}
r_{i,n} = J_{γ_n A_i^{-1}} w_{i,n} + b_{2,i,n}
q_{i,n} = p_{i,n} − γ_n (L_i* r_{i,n} + B_i(p_{1,n}, ..., p_{m,n}) + c_{1,i,n})
s_{i,n} = r_{i,n} − γ_n (C_i^{-1} r_{i,n} − L_i p_{i,n} + c_{2,i,n})
x_{i,n+1} = x_{i,n} − y_{i,n} + q_{i,n}
v_{i,n+1} = v_{i,n} − w_{i,n} + s_{i,n}.

The convergence of Algorithm 3.2.1 is established by showing that its iterative scheme can be reduced to the error-tolerant version of the forward-backward-forward algorithm of Tseng (see [59] for the error-free case) recently provided in [16].

Our algorithmic framework will hinge on the following splitting algorithm, which is of interest in its own right.

Theorem [16]. Let H be a real Hilbert space, let M : H → 2^H be maximal monotone, and let L : H → H be monotone. Suppose that zer(M + L) ≠ ∅ and that L is β-Lipschitz continuous for some β ∈ ]0, +∞[. Let (a_n)_{n∈N}, (b_n)_{n∈N} and (c_n)_{n∈N} be sequences in H such that

Σ_{n∈N} ‖a_n‖ < +∞, Σ_{n∈N} ‖b_n‖ < +∞ and Σ_{n∈N} ‖c_n‖ < +∞,

let x_0 ∈ H, let ε ∈ ]0, 1/(β + 1)[, let (γ_n)_{n∈N} be a sequence in [ε, (1 − ε)/β], and set, for every n ∈ N,

y_n = x_n − γ_n (Lx_n + a_n)
p_n = J_{γ_n M} y_n + b_n
q_n = p_n − γ_n (Lp_n + c_n)
x_{n+1} = x_n − y_n + q_n.

Then the following hold for some x ∈ zer(M + L):

(i) Σ_{n∈N} ‖x_n − p_n‖^2 < +∞ and Σ_{n∈N} ‖y_n − q_n‖^2 < +∞.

(ii) x_n ⇀ x and p_n ⇀ x.

(iii) Suppose that one of the following is satisfied: (a) M + L is demiregular at x; (b) M or L is uniformly monotone at x; (c) int zer(M + L) ≠ ∅. Then x_n → x and p_n → x.

For a more detailed insight into the problem of simultaneously solving a large class of composite monotone inclusions and their duals, by reducing it to the problem of finding a zero of the sum of a maximal monotone operator and a linear skew-adjoint operator, we recommend the work of Luis M. Briceño-Arias and Patrick L. Combettes [16].
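As a concrete illustration of the iteration in the theorem above, the following MATLAB sketch applies the error-free scheme (a_n = b_n = c_n = 0) to a small lasso-type problem, with M = ∂(λ‖·‖_1), whose resolvent is soft thresholding, and L the gradient of the smooth data-fidelity term (1/2)‖Ax − b‖^2. The problem, the data and the step size are our own choices for demonstration purposes and are not taken from the thesis.

    % Tseng's forward-backward-forward iteration for 0 in M(x) + L(x),
    % with M = subdifferential of lambda*||.||_1 and L(x) = A'*(A*x - b).
    rng(0);
    A = randn(40, 100);  b = randn(40, 1);  lambda = 0.1;

    L    = @(x) A' * (A*x - b);                           % monotone, beta-Lipschitz
    beta = norm(A)^2;                                     % Lipschitz constant of L
    J    = @(y, g) sign(y) .* max(abs(y) - g*lambda, 0);  % resolvent of gamma*M

    gamma = 0.9 / beta;                                   % constant step in ]0, 1/beta[
    x = zeros(100, 1);
    for n = 1:2000
        y = x - gamma * L(x);        % forward step
        p = J(y, gamma);             % backward step (resolvent)
        q = p - gamma * L(p);        % second forward step
        x = x - y + q;               % correction
    end
    fprintf('fixed-point residual: %.2e\n', norm(x - J(x - gamma*L(x), gamma)));

Algorithm 3.2.1 follows the same forward-backward-forward pattern, but blockwise in a product space, which is why most of its steps can be carried out in parallel.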

The following theorem establishes the convergence of Algorithm 3.2.1.

Theorem [10]. Suppose that Problem 3.1.1 has a primal-dual solution. For the sequences generated by Algorithm 3.2.1 the following statements are true:

(i) For every i ∈ {1, ..., m}, Σ_{n≥0} ‖x_{i,n} − p_{i,n}‖^2_{H_i} < +∞ and Σ_{n≥0} ‖v_{i,n} − r_{i,n}‖^2_{G_i} < +∞.

(ii) There exists a primal-dual solution (x_1, ..., x_m, v_1, ..., v_m) to Problem 3.1.1 such that:

(a) for every i ∈ {1, ..., m}, x_{i,n} ⇀ x_i, p_{i,n} ⇀ x_i, v_{i,n} ⇀ v_i and r_{i,n} ⇀ v_i as n → +∞;

(b) if C_i^{-1}, i = 1, ..., m, is uniformly monotone and there exists an increasing function φ_B : R+ → R+ ∪ {+∞} vanishing only at 0 and fulfilling

∀(x_1, ..., x_m), (y_1, ..., y_m) ∈ H_1 × ... × H_m:
Σ_{i=1}^m ⟨x_i − y_i, B_i(x_1, ..., x_m) − B_i(y_1, ..., y_m)⟩_{H_i} ≥ φ_B(‖(x_1, ..., x_m) − (y_1, ..., y_m)‖),   (3.2.2)

then, for every i ∈ {1, ..., m}, x_{i,n} → x_i, p_{i,n} → x_i, v_{i,n} → v_i and r_{i,n} → v_i as n → +∞.

3.3 Applications to convex minimization problems

In this section we turn our attention to the solving of convex minimization problems with multiple variables via the primal-dual algorithm presented and investigated in this thesis.

Problem. Let m ≥ 1 and p ≥ 1 be positive integers, let (H_i)_{1≤i≤m}, (H_i')_{1≤i≤m} and (G_j)_{1≤j≤p} be real Hilbert spaces, let f_i, h_i ∈ Γ(H_i') be such that h_i is ν_i^{-1}-strongly convex with ν_i ∈ R++, i = 1, ..., m, and let g_j ∈ Γ(G_j), j = 1, ..., p. Further, let K_i : H_i → H_i' and L_ji : H_i → G_j, i = 1, ..., m, j = 1, ..., p, be linear continuous operators. Consider the convex optimization problem

inf_{(x_1,...,x_m) ∈ H_1 × ... × H_m} { Σ_{i=1}^m (f_i □ h_i)(K_i x_i) + Σ_{j=1}^p g_j( Σ_{i=1}^m L_ji x_i ) }.   (3.3.1)

In what follows we show that, under an appropriate qualification condition, solving the convex optimization problem (3.3.1) can be reduced to solving a system of monotone inclusions of type (3.1.2). Let us define the proper, convex and lower semicontinuous function

f : H_1' × ... × H_m' → R̄, (y_1, ..., y_m) ↦ Σ_{i=1}^m (f_i □ h_i)(y_i),

and the linear continuous operator

K : H_1 × ... × H_m → H_1' × ... × H_m', (x_1, ..., x_m) ↦ (K_1 x_1, ..., K_m x_m),

having as adjoint

K* : H_1' × ... × H_m' → H_1 × ... × H_m, (y_1, ..., y_m) ↦ (K_1* y_1, ..., K_m* y_m).

Further, consider the linear continuous operators

L_j : H_1 × ... × H_m → G_j, (x_1, ..., x_m) ↦ Σ_{i=1}^m L_ji x_i, j = 1, ..., p,

having as adjoints

L_j* : G_j → H_1 × ... × H_m, y ↦ (L_j1* y, ..., L_jm* y), j = 1, ..., p,

respectively. We have

(x_1, ..., x_m) is an optimal solution to (3.3.1) ⟺ (0, ..., 0) ∈ ∂( f ∘ K + Σ_{j=1}^p g_j ∘ L_j )(x_1, ..., x_m).   (3.3.2)

In order to split the above subdifferential into a sum of subdifferentials, a so-called qualification condition must be fulfilled. In this context, we consider the following

interiority-type qualification conditions:

(QC_1)  there exists x_i' ∈ H_i such that K_i x_i' ∈ (dom f_i + dom h_i) and f_i □ h_i is continuous at K_i x_i', i = 1, ..., m, and Σ_{i=1}^m L_ji x_i' ∈ dom g_j and g_j is continuous at Σ_{i=1}^m L_ji x_i', j = 1, ..., p,

and

(QC_2)  (0, ..., 0) ∈ sqri( Π_{i=1}^m (dom f_i + dom h_i) × Π_{j=1}^p dom g_j − {(K_1 x_1, ..., K_m x_m, Σ_{i=1}^m L_1i x_i, ..., Σ_{i=1}^m L_pi x_i) : (x_1, ..., x_m) ∈ H_1 × ... × H_m} ).

We notice that (QC_1) ⇒ (QC_2), the implication being in general strict, and refer the reader to [4, 7, 8, 29, 58, 68, 69] and the references therein for other qualification conditions in convex optimization.

Remark. As already pointed out, for i = 1, ..., m, f_i □ h_i ∈ Γ(H_i'), hence it is continuous on int(dom f_i + dom h_i), provided this set is nonempty (see [29, 69]). For other results regarding the continuity of the infimal convolution of convex functions we invite the reader to consult [57].

Remark. In finite-dimensional spaces the qualification condition (QC_2) is equivalent to

(QC_2')  there exists x_i' ∈ H_i such that K_i x_i' ∈ ri dom f_i + ri dom h_i, i = 1, ..., m, and Σ_{i=1}^m L_ji x_i' ∈ ri dom g_j, j = 1, ..., p.

Assuming that one of the qualification conditions above is fulfilled, we have that

(x_1, ..., x_m) is an optimal solution to (3.3.1)
⟺ (0, ..., 0) ∈ K* ∂f( K(x_1, ..., x_m) ) + Σ_{j=1}^p L_j* ∂g_j( L_j(x_1, ..., x_m) )
⟺ (0, ..., 0) ∈ ( K_1* ∂(f_1 □ h_1)(K_1 x_1), ..., K_m* ∂(f_m □ h_m)(K_m x_m) ) + Σ_{j=1}^p L_j* ∂g_j( L_j(x_1, ..., x_m) ).   (3.3.3)

19 3.3 Applications to convex minimization problems 19 The strong convexity of the functions h i imply that dom h i = H i and so (f i h i ) = f i h i, i = 1,..., m. Thus, (3.3.3) is further equivalent to where ( ) (0,..., 0) K1( f 1 h 1 )(K 1 x 1 ),..., Km( f m h m )(K m x m ) + p L jv j, j=1 ( v j g j Lj (x 1,... x m ) ) ( m ) v j g j L ji x i i=1 m i=1 L ji x i g j (v j ), j = 1,..., p. Then (x 1,..., x m ) is an optimal solution to (3.3.1) if and only if (x 1,..., x m, v 1,..., v p ) is a solution to by taking 0 K 1( f 1 h 1 )(K 1 x 1 ) + p j=1 L j1v j. 0 K m( f m h m )(K m x m ) + p j=1 L jmv j 0 g 1(v 1 ) m i=1 L 1ix i. 0 g p(v p ) m i=1 L pix i. (3.3.4) One can see now that (3.3.4) is a system of coupled inclusions of type (3.1.2), A i = f i, C i = h i, L i = K i, i = 1,..., m, G A m+j = gj j, x = 0, C m+j (x) =,, otherwise L m+j = Id Gj, j = 1,..., p, and, for (x 1,..., x m, v 1,..., v p ) H 1... H m G 1... G p, as coupling operators B i (x 1,..., x m, v 1,..., v p ) = p L jiv j, i = 1,..., m, j=1

20 20 Chapter 3. The primal-dual splitting technique and B m+j (x 1,..., x m, v 1,..., v p ) = Define m L ji x i, j = 1,..., p. i=1 B(x 1,..., x m, v 1,..., v p ) = (B 1,..., B m+p )(x 1,..., x m, v 1,..., v p ) (3.3.5) ( p ) p m m = L j1v j,..., L jmv j, L 1i x i,..., L pi x i. j=1 j=1 i=1 i=1 (3.3.6) It follows that C 1 i = ( h i ) 1 = h i = { h i } is ν i -Lipschitz continuous for i = 1,..., m. On the other hand, C 1 m+j Lipschitz continuous. is the zero operator for j = 1,..., p, thus 0- Furthermore, the operators B i, i = 1,..., m + p are linear and Lipschitz continuous, having as Lipschitz constants p µ i = L ji 2, i = 1,..., m, and µ m+j = m L ji 2, j = 1,..., p, j=1 i=1 respectively. For every (x 1,..., x m, v 1,..., v p ), (y 1,..., y m, w 1,..., w p ) H 1... H m G 1... G p it holds m x i y i B i (x 1,..., x m, v 1,..., v p ) B i (y 1,..., y m, w 1,..., w p ) Hi + = i=1 p v j w j B m+j (x 1,..., x m, v 1,..., v p ) B m+j (y 1,..., y m, w 1,..., w p ) Gj j=1 m x i y i p p L jiv j L jiw j Hi p m m v j w j L ji x i L ji y i = 0, Gj i=1 j=1 j=1 j=1 i=1 i=1 thus (3.1.1) is fulfilled. This proves also that the linear continuous operator B is skew (i.e. B = B). Remark Due to the fact that the operator B is skew, it is not cocoercive, hence,

21 3.3 Applications to convex minimization problems 21 the approach presented in [1] cannot be applied in this context. On the other hand, in the light of the characterization given in (3.3.2), in order to determine an optimal solution of the optimization problem (3.3.1) (and an optimal solution of its Fenchel-type dual as well) one can use the primal-dual proximal splitting algorithms which have been recently introduced in [23, 61]. These approaches have the particularity to deal in an efficient way with sums of compositions of proper, convex and lower semicontinuous function with linear continuous operators, by evaluating separately each function via a backward step and each linear continuous operator (and its adjoint) via a forward step. However, the iterative scheme we propose in this section for solving (3.3.1) has the advantage of exploiting the separable structure of the problem. Let us also mention that the dual inclusion problem of (3.3.4) reads (see (3.1.3)) find w 1 H 1,..., w m H m, w m+1 G 1,..., w m+p G p such that x 1 H 1,..., x m H m, v 1 G 1,..., v p G p 0 = K 1w 1 + p j=1 L j1v j. 0 = K mw m + p j=1 L jmv j 0 = w m+1 m i=1 L 1ix i. 0 = w m+p m i=1 L pix i w 1 ( f 1 h 1 )(K 1 x 1 ). w m ( f m h m )(K m x m ) w m+1 g 1(v 1 ). w m+p g p(v p ). (3.3.7) Then (x 1,...x m, v 1,..., v p, w 1,..., w m, w m+1,..., w m+p ) is a primal-dual solution to (3.3.4) - (3.3.7), if w i ( f i h i )(K i x i ), w m+j gj (v j ), p m 0 = Ki w i + L jiv j and 0 = w m+j L ji x i, i = 1,..., m, j = 1,..., p. j=1 i=1

22 22 Chapter 3. The primal-dual splitting technique Provided that (x 1,...x m, v 1,..., v p, w 1,..., w m, w m+1,..., w m+p ) is a primal-dual solution to (3.3.4) - (3.3.7), it follows that (x 1,...x m ) is an optimal solution to (3.3.1) and (w 1,..., w m, v 1,..., v p ) is an optimal solution to its Fenchel-type dual problem sup (w 1,...,w m,w m+1,...,w m+p ) H 1... H m G 1... G p K i w i+ p j=1 L ji w m+j=0,i=1,...,m { m (fi (w i ) + h i (w i )) i=1 } p gj (w m+j ). j=1 (3.3.8) Algorithm gives rise to the following iterative scheme for solving (3.3.4) - (3.3.7). Algorithm For every i = 1,..., m and every j = 1,..., p let (a 1,i,n ) n 0, (b 1,i,n ) n 0, (c 1,i,n ) n 0, be absolutely summable sequences in H i, (a 2,i,n ) n 0, (b 2,i,n ) n 0, (c 2,i,n ) n 0 be absolutely summable sequences in H i and (a 1,m+j,n ) n 0, (a 2,m+j,n ) n 0, (b 1,m+j,n ) n 0, (b 2,m+j,n ) n 0, (c 1,m+j,n ) n 0 and (c 2,m+j,n ) n 0 be absolutely summable sequences in G j. Furthermore, set where β = max m+p µ 2 i, ν 1,..., ν m + max { K 1,..., K m, 1}, (3.3.9) i=1 p µ i = L ji 2, i = 1,..., m, and µ m+j = m L ji 2, j = 1,..., p, (3.3.10) j=1 let ε ]0, 1/(β + 1)[ and (γ n ) n 0 i=1 be a sequence in [ε, (1 ε)/β]. Let the initial points (x 1,1,0,..., x 1,m,0 ) H 1... H m, (x 2,1,0,..., x 2,m,0 ) H 1... H m and

23 3.3 Applications to convex minimization problems 23 (v 1,1,0,..., v 1,p,0 ), (v 2,1,0,..., v 2,p,0 ) G 1... G p be arbitrary chosen and set n 0 For i = 1,..., m ( y 1,i,n = x 1,i,n γ n Ki x 2,i,n + ) p j=1 L jiv 1,j,n + a 1,i,n y 2,i,n = x 2,i,n γ n ( h i x 2,i,n K i x 1,i,n + a 2,i,n ) p 1,i,n = y 1,i,n + b 1,i,n p 2,i,n = Prox γnf i y 2,i,n + b 2,i,n For j = 1,..., p w 1,j,n = v 1,j,n γ n (v 2,j,n m i=1 L jix 1,i,n + a 1,m+j,n ) w 2,j,n = v 2,j,n γ n ( v 1,j,n + a 2,m+j,n ) r 1,j,n = w 1,j,n + b 1,m+j,n r 2,j,n = Prox γngj w 2,j,n + b 2,m+j,n For i = 1,..., m ( q 1,i,n = p 1,i,n γ n Ki p 2,i,n + ) p j=1 L jir 1,j,n + c 1,i,n q 2,i,n = p 2,i,n γ n ( h i p 2,i,n K i p 1,i,n + c 2,i,n ) x 1,i,n+1 = x 1,i,n y 1,i,n + q 1,i,n x 2,i,n+1 = x 2,i,n y 2,i,n + q 2,i,n For j = 1,..., p s 1,j,n = r 1,j,n γ n (r 2,j,n m i=1 L jip 1,i,n + c 1,m+j,n ) s 2,j,n = r 2,j,n γ n ( r 1,j,n + c 2,m+j,n ) v 1,j,n+1 = v 1,j,n w 1,j,n + s 1,j,n v 2,j,n+1 = v 2,j,n w 2,j,n + s 2,j,n. The following convergence result for Algorithm is a consequence of Theorem Theorem [10] Suppose that the optimization problem (3.3.1) has an optimal solution and that one of the qualification conditions (QC i ), i = 1, 2, is fulfilled. For the sequences generated by Algorithm the following statements are true: (i) i {1,..., m} x 1,i,n p 1,i,n 2 H i < +, x 2,i,n p 2,i,n 2 H i < + and n 0 n 0 j {1,..., p} v 1,j,n r 1,j,n 2 G j < +, v 2,j,n r 2,j,n 2 G j < +. n 0 n 0 (ii) There exists an optimal solution (x 1,..., x m ) to (3.3.1) and an optimal solution

(w_1, ..., w_m, w_{m+1}, ..., w_{m+p}) to (3.3.8) such that, for every i ∈ {1, ..., m}, x_{1,i,n} ⇀ x_i, p_{1,i,n} ⇀ x_i, x_{2,i,n} ⇀ w_i and p_{2,i,n} ⇀ w_i, and, for every j ∈ {1, ..., p}, v_{1,j,n} ⇀ w_{m+j} and r_{1,j,n} ⇀ w_{m+j} as n → +∞.

Remark. Recently, in [22], another iterative scheme was proposed for solving systems of monotone inclusions which is also able to handle optimization problems of type (3.3.1) in the case when the functions g_j, j = 1, ..., p, are not necessarily differentiable. In contrast to our approach, which assumes that the variables are coupled by the single-valued operator B, in [22] the coupling is made by compositions of parallel sums of maximal monotone operators with linear continuous ones.

Chapter 4
Numerical experiments

The obtained theoretical results are highly applicable in various fields of mathematics. In this section we present four numerical experiments which emphasize the performance of the primal-dual algorithm and some of its variants for systems of coupled monotone inclusions, namely image processing, multifacility location problems, average consensus for colored networks and image classification via support-vector machines.

4.1 Image processing

The first numerical experiment solves a problem of image processing via the primal-dual splitting algorithm developed in this thesis. The aim is to estimate the unknown original image from the blurred and noisy image with the help of the algorithm.

4.1.1 Image deblurring

Let H and (H_i)_{1≤i≤m} be real Hilbert spaces, where m ≥ 1. For every i ∈ {1, ..., m}, let f_i ∈ Γ(H_i) and let K_i : H → H_i be continuous linear operators. Consider the initial convex optimization problem

inf_{x ∈ H} { Σ_{i=1}^m f_i(K_i x) }.   (4.1.1)

In the case of multiple variables, (4.1.1) can be reformulated as

inf_{x_1,...,x_{m+1}} { Σ_{i=1}^m f_i(K_i x_i) + Σ_{i=1}^m δ_{0}(x_i − x_{m+1}) }.   (4.1.2)

This is a special instance of the convex optimization problem (3.3.1) with (m, p) := (m + 1, m), f_{m+1} = 0, K_{m+1} = 0, h_i = δ_{0} for i ∈ {1, ..., m + 1}, G_j = H and g_j = δ_{0} for j ∈ {1, ..., m}, and

L_ji : H_i → G_j, L_ji = Id, if i = j; −Id, if i = m + 1; 0, otherwise.   (4.1.3)

Let (x_{1,1,0}, ..., x_{1,m+1,0}), (x_{2,1,0}, ..., x_{2,m+1,0}) ∈ H_1 × ... × H_{m+1} and (v_{1,1,0}, ..., v_{1,m,0}), (v_{2,1,0}, ..., v_{2,m,0}) ∈ H_1 × ... × H_m. In this case (3.3.9) yields

β = 2√m + max{ ‖K_1‖, ..., ‖K_{m+1}‖, 1 }.   (4.1.4)

Algorithm. Hence, the iterative scheme of the general primal-dual algorithm from Section 3.3 reads, for n ≥ 0:

for i = 1, ..., m:
  y_{1,i,n} = x_{1,i,n} − γ_n (K_i* x_{2,i,n} + v_{1,i,n})
  w_{1,i,n} = v_{1,i,n} − γ_n (v_{2,i,n} − x_{1,i,n} + x_{1,m+1,n})
  y_{2,i,n} = x_{2,i,n} + γ_n K_i x_{1,i,n}
  w_{2,i,n} = v_{2,i,n} + γ_n v_{1,i,n}
y_{1,m+1,n} = x_{1,m+1,n} + γ_n Σ_{i=1}^m v_{1,i,n}
for i = 1, ..., m:
  x_{1,i,n+1} = x_{1,i,n} − γ_n (K_i* prox_{γ_n f_i*}(y_{2,i,n}) + w_{1,i,n})
  v_{1,i,n+1} = v_{1,i,n} + γ_n (y_{1,i,n} − y_{1,m+1,n})
  x_{2,i,n+1} = x_{2,i,n} − y_{2,i,n} + prox_{γ_n f_i*}(y_{2,i,n}) + γ_n K_i y_{1,i,n}
  v_{2,i,n+1} = v_{2,i,n} − w_{2,i,n} + γ_n w_{1,i,n}
x_{1,m+1,n+1} = x_{1,m+1,n} + γ_n Σ_{i=1}^m w_{1,i,n}

Then (x_{1,1,n}, ..., x_{1,m+1,n}) converges to (x, ..., x).

Consider a matrix A ∈ R^{n×n} representing the blur operator and a given vector b ∈ R^n describing the blurred and noisy image. The goal is to estimate x ∈ R^n, the unknown original image, which fulfills Ax = b. The problem to be solved can be equivalently written as

inf_{x ∈ S} { f_1(x) + f_2(Ax) },   (4.1.5)

for f_1 : R^n → R̄, f_1(x) = λ‖x‖_1 + δ_S(x), and f_2 : R^n → R, f_2(y) = ‖y − b‖^2. Thus f_1 is proper, convex and lower semicontinuous with bounded domain, and f_2 is a 2-strongly convex function with full domain, differentiable everywhere and with Lipschitz continuous gradient having Lipschitz constant 2. Here λ > 0 represents the regularization parameter and S ⊆ R^n is an n-dimensional cube representing the range of the pixels. Since each pixel furnishes a greyscale value between 0 and 255, a natural choice for the convex set S would be the n-dimensional cube [0, 255]^n ⊆ R^n. In order to reduce the Lipschitz constant which appears in the developed approach, we scale the pictures that we use to exemplify the application, such that each of their pixels ranges in the interval [0, 1/10].
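Both functions admit closed-form proximal mappings, which is what makes the scheme above fully explicit. The following MATLAB sketch records one possible reading of these mappings, derived from the definitions of f_1 and f_2 given above; the variable names and the derivation are ours and are not taken from the thesis.

    % prox of gamma*f1, f1(x) = lambda*||x||_1 + delta_S(x), S = [0, smax]^n:
    % since x >= 0 on S, this is componentwise a shift by gamma*lambda
    % followed by projection onto [0, smax].
    smax    = 1/10;                                   % pixel range after scaling
    prox_f1 = @(z, gamma, lambda) min(max(z - gamma*lambda, 0), smax);

    % prox of gamma*f2, f2(y) = ||y - b||^2: the minimizer of
    % ||y - b||^2 + (1/(2*gamma))*||y - z||^2 is (z + 2*gamma*b)/(1 + 2*gamma).
    prox_f2 = @(z, gamma, b) (z + 2*gamma*b) ./ (1 + 2*gamma);

The proximal mappings of the conjugates f_1* and f_2*, which the iterative scheme actually calls, follow from these via the Moreau decomposition z = prox_{γf}(z) + γ prox_{γ^{-1}f*}(z/γ).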

4.1.2 Experimental setting

We concretely look at the Cameraman, Lena and Barbara test images. The Cameraman test image is part of the Image Processing Toolbox in Matlab and its vectorized and scaled image dimension equals n = 256 · 256 = 65536. By making use of the Matlab functions imfilter and fspecial, this image is blurred as follows:

1  H = fspecial('gaussian', 9, 4);           % Gaussian blur of size 9 times 9
2                                            % and standard deviation 4
3  B = imfilter(X, H, 'conv', 'symmetric');  % B = observed blurred image
4                                            % X = original image

In row 1 the function fspecial returns a rotationally symmetric Gaussian lowpass filter of size 9 × 9 with standard deviation 4. The entries of H are nonnegative and their sum adds up to 1. In row 3 the function imfilter convolves the filter H with the image X ∈ R^{256×256} and outputs the blurred image B ∈ R^{256×256}. The boundary option symmetric corresponds to reflexive boundary conditions. For more interesting applications concerning Matlab test images deblurred with different techniques, we refer the reader to [11-15, 19].

Thanks to the rotationally symmetric filter H, the linear operator A ∈ R^{n×n} given by the Matlab function imfilter is symmetric, too. After adding zero-mean white Gaussian noise with standard deviation 10^{-4}, we obtain the blurred and noisy image b ∈ R^n. In this particular case m := 2, K_1 = Id, K_2 = A, and (4.1.4) yields

β = 2√2 + max{ ‖A‖, 1 }.   (4.1.6)

By making use of the real spectral decomposition of A, one obtains ‖A‖ = 1. Hence β = 2√2 + 1.
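The operator norm can also be checked numerically. The short MATLAB sketch below estimates ‖A‖ by the power method, applying A only through imfilter with the filter H defined above; it is our own sanity check and not part of the thesis.

    % Estimate ||A|| by the power method, applying A through imfilter.
    % A is symmetric here, so the largest singular value coincides with
    % the spectral radius; the estimate should be close to 1.
    v = randn(256, 256);
    for it = 1:50
        v = imfilter(v, H, 'conv', 'symmetric');
        v = v / norm(v(:));
    end
    Av    = imfilter(v, H, 'conv', 'symmetric');
    normA = norm(Av(:))          % expected to be approximately 1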

29 4.1 Image processing 29 Algorithm Algorithm can be written as n 0 ( ) y 1,1,n = x 1,1,n γ n x2,1,n + v 1,1,n ( ) w 1,1,n = v 1,1,n γ n v2,1,n x 1,1,n + x 1,3,n y 2,1,n = x 2,1,n + γ n x 1,1,n w 2,1,n = v 2,1,n + γ n v 1,1,n ( ) y 1,2,n = x 1,2,n γ n A x 2,2,n + v 1,2,n ( ) w 1,2,n = v 1,2,n γ n v2,2,n x 1,2,n + x 1,3,n y 2,2,n = x 2,2,n + γ n Ax 1,2,n w 2,2,n = v 2,2,n + γ n v 1,2,n y 1,3,n = x 1,3,n + γ n (v 1,1,n + v 1,2,n ) ( x 1,1,n+1 = x 1,1,n γ n proxγnf y ) 1 2,1,n + w 1,1,n ( ) v 1,1,n+1 = v 1,1,n + γ n y1,1,n y 1,3,n x 2,1,n+1 = x 2,1,n y 2,1,n + prox γnf y 1 2,1,n + γ n y 1,1,n v 2,1,n+1 = v 2,1,n w 2,1,n + γ n w 1,1,n ( x 1,2,n+1 = x 1,2,n γ n A prox γnf y ) 2 2,2,n + w 1,2,n ( ) v 1,2,n+1 = v 1,2,n + γ n y1,2,n y 1,3,n x 2,2,n+1 = x 2,2,n y 2,2,n + prox γnf y 2 2,2,n + γ n Ay 1,2,n v 2,2,n+1 = v 2,2,n w 2,2,n + γ n w 1,2,n x 1,3,n+1 = x 1,3,n + γ n (w 1,1,n + w 1,2,n ). Then (x 1,1,n, x 1,2,n, x 1,3,n ) (x, x, x). Figure 4.1 shows the reconstructed images with 500 and 1000 iterations. Another test image which is very popular and is often used in image processing is the Lena test image. The picture undergoes the same blur as described in the previous case. Figure 4.2 shows the reconstructed images with 500 and 1500 iterations. A third example in image processing is the Barbara test image. We ran our algorithm on this test image too in order to demonstrate the applicability of the method on different images. Figure 4.3 shows the reconstructed images in the case

of 400 and 800 iterations.

Figure 4.1: (a) Blurred and noisy image; (b) 500 iterations; (c) 1000 iterations; (d) Original image. Figure (a) is the image obtained after multiplying with the blur operator and adding white Gaussian noise, (b) shows the averaged sequence generated by the algorithm after 500 iterations, (c) is the reconstructed image after 1000 iterations, and (d) shows the clean Cameraman test image.

A valuable tool for measuring the quality of these images is the so-called improvement in signal-to-noise ratio (ISNR), which is defined as

ISNR(k) = 10 log_10( ‖x − b‖^2 / ‖x − x_k‖^2 ),

where x, b and x_k denote the original, observed and estimated image at iteration k, respectively. Figure 4.4 shows the evolution of the ISNR values for the three images presented.

Figure 4.2: (a) Blurred and noisy image; (b) 500 iterations; (c) 1500 iterations; (d) Original image. Figure (a) is the image obtained after multiplying with the blur operator and adding white Gaussian noise, (b) shows the averaged sequence generated by the algorithm after 500 iterations, (c) is the reconstructed image after 1500 iterations, and (d) shows the clean Lena test image.
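Returning to the ISNR defined above, in MATLAB it is a one-liner; the sketch below is our own illustration, with hypothetical variable names, and assumes the images have been vectorized.

    % Improvement in signal-to-noise ratio at iteration k:
    % x - original image, b - blurred and noisy observation, xk - current estimate.
    isnr = @(x, b, xk) 10 * log10( norm(x(:) - b(:))^2 / norm(x(:) - xk(:))^2 );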

Figure 4.3: (a) Blurred and noisy image; (b) 400 iterations; (c) 800 iterations; (d) Original image. Figure (a) is the image obtained after multiplying with the blur operator and adding white Gaussian noise, (b) shows the averaged sequence generated by the algorithm after 400 iterations, (c) is the reconstructed image after 800 iterations, and (d) shows the clean Barbara test image.

Figure 4.4: Evolution of the improvement in signal-to-noise ratio (ISNR) for the Cameraman, Lena and Barbara test images.

4.2 Multifacility location problems

The second numerical experiment solves a multifacility location problem. It was as early as the 17th century that mathematicians, notably Fermat, were concerned with what are known as single facility location problems. However, it was not until the 20th century that normative approaches to solving symbolic models of these and related problems were addressed in the literature. Each of these solution techniques

concerned themselves with determining the location of a new facility, or new facilities, with respect to the location of existing facilities so as to minimize a cost function based on a weighted interfacility distance measure [17, 18, 49, 51, 52, 64-66]. The Euclidean distance problem for the case of single new facilities was addressed by Weiszfeld [63], Miehle [45], Kuhn and Kuenne [42], and Cooper [24-26], to name a few. However, it was not until the work of Kuhn [41] that the problem was considered completely solved. A computational procedure for minimizing the Euclidean multifacility problem was presented by Vergin and Rogers [60] in 1967; however, their techniques sometimes give suboptimal solutions. Two years later, Love [44] gave a scheme for solving this problem which makes use of convex programming and penalty function techniques. One advantage of this approach is that it considers the existence of various types of spatial constraints. In 1973, Eyster, White and Wierwille [31] presented the hyperboloid approximation procedure (HAP) for both rectilinear and Euclidean distance measures, which extended the technique employed in solving the single facility problem to the multifacility case [18]. If one studies a list of references of the work done in the past decade involving facility location problems, it becomes readily apparent that there exists a strong

interdisciplinary interest in this area within the fields of applied mathematics, operations research, civil engineering, architecture, management science, systems engineering, logistics, economics, regional science, transportation systems, urban design and industrial engineering, among others. As a result, the term facility has taken on a very broad connotation in order to suit applications in each of these areas. For example, a facility can be a hospital, an ambulance, an airport, a police cruiser, a manned space vehicle, a machine tool, a school, a student to be bused to a school in an urban environment, a remote computer terminal, a home appliance, a planned community, a warehouse, an office building, a pump in a pipeline, a sewage treatment plant, and so on [31, 33, 37]. Francis and Goldstein [34] provide a fairly recent bibliography of the facility location literature. One of the most complete classifications of these problems is provided in a book by Francis and White [35].

4.2.1 Problem manipulation

We proceed with the presentation of an efficient solution procedure for solving a multifacility location problem, which can be formulated mathematically as follows:

inf_{x_1, x_2} { Σ_{i=1}^k λ_i ‖x_1 − c_i‖ + Σ_{i=1}^k γ_i ‖x_2 − c_i‖ + α ‖x_1 − x_2‖ },   (4.2.1)

where we use the following notations:

x_j = vector location of a new facility j, 1 ≤ j ≤ 2;
c_i = vector location of an existing facility i, 1 ≤ i ≤ k;
λ_i = nonnegative weight between the new facility x_1 and an existing facility i, 1 ≤ i ≤ k;
γ_i = nonnegative weight between the new facility x_2 and an existing facility i, 1 ≤ i ≤ k;
α = nonnegative weight between the two new facilities x_1 and x_2.
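Each term in (4.2.1) is a nonnegative multiple of a Euclidean norm, possibly centered at a data point c, and such terms have a well-known closed-form proximal mapping (block soft thresholding). The following MATLAB sketch records it; the function name and arguments are our own, introduced only for illustration.

    % prox of g(x) = w*||x - c||_2 with parameter gamma > 0 and weight w >= 0:
    % the residual z - c is shrunk toward zero by gamma*w.
    function p = prox_weighted_norm(z, gamma, w, c)
        r  = z - c;
        nr = norm(r);
        if nr <= gamma*w
            p = c;                          % residual small enough: shrink to c
        else
            p = c + (1 - gamma*w/nr) * r;
        end
    end

If a scheme requires the proximal mapping of the conjugate of such a term instead, the Moreau decomposition z = prox_{γg}(z) + γ prox_{γ^{-1}g*}(z/γ) converts one into the other.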

35 4.2 Multifacility location problems 35 The discussed multifacility location problem is in fact a special instance of Problem with (m, p) := (2, 2k + 1), f i = 0, h i = δ {0}, K i = 0, where i = 1, 2, and λ j x c j 2, if j = 1,..., k; g j (x) = γ j x c j 2, if j = k + 1,..., 2k; α x, if j = 2k + 1, L j1 = Id, if j = 1,..., k; 0, if j = k + 1,..., 2k; Id, if j = 2k + 1, and L j2 = 0, if j = 1,..., k; Id, if j = k + 1,..., 2k; Id, if j = 2k + 1. Algorithm Hence, the iterative scheme in Algorithm reads For i = 1, 2 y ( 1,i,n = x 1,i,n γ n x2,i,n + 2k+1 ) j=1 L jiv 1,j,n y 2,i,n = x 2,i,n + γ n x 1,i,n For j = 1,... 2k + 1 ( ) w 1,j,n = v 1,j,n γ n v2,j,n L j1 x 1,1,n L j2 x 1,2,n w 2,j,n = v 2,j,n + γ n v 1,j,n n 0 r 2,j,n = prox γngj w 2,j,n For i = 1, 2 x ( 1,i,n+1 = x 1,i,n γ 2k+1 ) n j=1 L jiw 1,j,n x 2,i,n+1 = x 2,i,n y 2,i,n + γ n y 1,i,n For j = 1,..., 2k + 1 v ( ) 1,j,n+1 = v 1,i,n γ n r2,j,n L j1 y 1,1,n L j2 y 1,2,n ( ) v 2,j,n+1 = γ n w1,j,n v 1,j,n + r2,j,n. The algorithm implemented in the MatLab programming language works with the parameter β = 2 k We choose ε to be as small as possible from the interval ]0, 1/(β + 1)[ and the value of the parameter γ equals (1 ε)/β, for an optimal

solution.

4.2.2 Computational experience

As an illustration of Algorithm 4.2.1, consider a location problem involving two new facilities and five existing facilities. This example is also used for illustrating the use of the hyperboloid approximation procedure presented in [31]. Since distances between facilities are Euclidean, the facilities might be machines connected by straight-line conveyors, military bases connected by air travel, or industrial plants and warehouses where straight-line travel can be reasonably approximated. For the example, assume the existing facilities are located at the coordinate points P_1 = (0, 0), P_2 = (2, 4), P_3 = (6, 2), P_4 = (6, 10) and P_5 = (8, 8). Also, let (λ_1, ..., λ_5) = (4, 2, 3, 0, 0), (γ_1, ..., γ_5) = (0, 2, 1, 3, 2) and α = 2. With starting locations x_1^0 = (0, 0) and x_2^0 = (0, 0), the result of the algorithm is illustrated in Figure 4.5. Most current techniques used for solving location problems, including Algorithm 4.2.1, define the stopping criterion based on successive changes in the value of the objective function, which is more efficient than one based on successive changes in the locations of the new facilities (convergent solution). The projection technique used in the proposed method takes full advantage of the structure of the location problem and can easily be extended to accommodate problems involving other item movements as well.

37 4.2 Multifacility location problems 37 (a) Starting location (b) 2 iterations (c) 5 iterations (d) 10 iterations (e) 25 iterations (f) 50 iterations Figure 4.5: A multifacility location problem involving two new facilities and five existing facilities. The existing facilities are located at the coordinate points P 1 = (0, 0), P 2 = (2, 4), P 3 = (6, 2), P 4 = (6, 10) and P 5 = (8, 8). The non negative weight α between the two new facilities equals 2 and the weights between a new facility and an existing facility equals λ i = (4, 2, 3, 0, 0) and γ i = (0, 2, 1, 3, 2), i = 1,..., 5.

4.3 Average consensus for colored networks

The third numerical experiment that we consider concerns the problem of average consensus on colored networks. Given a network where each node possesses a measurement in the form of a real number, the average consensus problem consists in calculating the average of these measurements in a recursive and distributed way, allowing the nodes to communicate information only along the available edges in the network.

4.3.1 Proposed algorithm

Consider a connected network G = (V, E), where V represents the set of nodes and E represents the set of edges. Each edge is uniquely represented as a pair of nodes (i, j), where i < j. The nodes i and j can exchange their values if they can communicate directly, in other words, if (i, j) ∈ E. The set of neighbors of node k is denoted by N_k and its degree by D_k = |N_k|. We assume that each node is assigned a number, also called its color, and that no neighboring nodes have the same color. Let C denote the number of colors the network is colored with and C_i the set of the nodes that have the color i, i = 1, ..., C. Without affecting the generality of the problem we also assume that the first |C_1| nodes are in the set C_1, the next |C_2| nodes are in the set C_2, etc. Furthermore, we assume that a node coloring scheme is available. For more details concerning the mathematical modeling of the average consensus problem on colored networks we refer the reader to [46, 47].

Let P and E denote the number of nodes and edges in the network, respectively; hence Σ_{i=1}^C |C_i| = P. Denoting by θ_k the measurement assigned to node k, k = 1, ..., P, the problem we want to solve is

min_{x ∈ R} { Σ_{k=1}^P (1/2)(x − θ_k)^2 }.   (4.3.1)

The unique optimal solution to problem (4.3.1) is θ̄ = (1/P) Σ_{k=1}^P θ_k, namely the average of the measurements over the whole set of nodes in the network. The goal is to make this value available in each node in a distributed and recursive way. To this end,

we replicate copies of x throughout the entire network; more precisely, for k = 1, ..., P, node k will hold the k-th copy, denoted by x_k, which will be updated iteratively during the algorithm. At the end we have to guarantee that all the copies are equal, and we express this constraint by requiring that x_i = x_j for each (i, j) ∈ E. This gives rise to the following optimization problem:

min_{x = (x_1,...,x_P) ∈ R^P, x_i = x_j for all (i,j) ∈ E} { Σ_{k=1}^P (1/2)(x_k − θ_k)^2 }.   (4.3.2)

Let A ∈ R^{P×E} be the node-arc incidence matrix of the network, which is the matrix having each column associated to an edge in the following manner: the column associated to the edge (i, j) ∈ E has 1 at the i-th entry and −1 at the j-th entry, the remaining entries being equal to zero. Consequently, the constraints in (4.3.2) can be written with the help of the transpose of the node-arc incidence matrix as A^T x = 0. Taking into consideration the ordering of the nodes and the coloring scheme, we can write A^T x = A_1^T x^1 + ... + A_C^T x^C, where x^i ∈ R^{|C_i|}, i = 1, ..., C, collects the copies of the nodes in C_i, i.e., x = (x_1, ..., x_{|C_1|}, ..., x_{P−|C_C|+1}, ..., x_P), the first |C_1| components forming x^1 and the last |C_C| components forming x^C. Hence, the optimization problem (4.3.2) becomes

min_{x = (x^1,...,x^C), A_1^T x^1 + ... + A_C^T x^C = 0} { Σ_{i=1}^C f_i(x^i) },   (4.3.3)

where, for i = 1, ..., C, the function f_i : R^{|C_i|} → R is defined as f_i(x^i) = Σ_{l ∈ C_i} (1/2)(x_l − θ_l)^2. One can easily observe that problem (4.3.3) is a particular instance of the optimization problem (3.3.1) when taking m = C, p = 1, h_i = δ_{0}, K_i = Id, L_1i = A_i^T ∈ R^{E×|C_i|}, i = 1, ..., C, and g_1 = δ_{0}.

Algorithm. Using that ∇h_i* = 0, i = 1, ..., C, and Prox_{γ g_1}(x) = 0 for all γ > 0 and x ∈ R^E, the iterative scheme of the general algorithm from Section 3.3 reads, after some algebraic manipulations, in the

40 40 Chapter 4. Numerical experiments error-free case: n 0 For i = 1,..., C ( ) y 1,i,n = x 1,i,n γ n x2,i,n + A i v 1,1,n y 2,i,n = x 2,i,n + γ n x 1,i,n p 2,i,n = Prox γnf i y 2,i,n ( w 1,1,n = v 1,1,n γ n v2,1,n C ) i=1 AT i x 1,i,n For i = 1,..., C ( ) q 1,i,n = y 1,i,n γ n p2,i,n + A i w 1,1,n q 2,i,n = p 2,i,n + γ n y 1,i,n x 1,i,n+1 = x 1,i,n y 1,i,n + q 1,i,n x 2,i,n+1 = x 2,i,n y 2,i,n + q 2,i,n v 1,1,n+1 = v 1,1,n + γ n C i=1 AT i y 1,i,n v 2,1,n+1 = γ 2 n( C i=1 AT i x 1,i,n v 2,1,n ). Let us notice that for i = 1,..., C and γ > 0 it holds Prox γf i (x i ) = (1+γ) 1 (x i γθ i ), where θ i is the vector in R C i whose components are θ l with l C i Performance evaluation In order to compare the performances of our method with other existing algorithms in literature, we used the networks generated in [47] with the number of nodes ranging between 10 and The measurement θ k associated to each node was generated randomly and independently from a normal distribution with mean 10 and standard deviation 100. We worked with networks with 10, 50, 100, 200, 500, 700 and 1000 nodes and measured the performance of our algorithm from the point of view of the number of communication steps, which actually coincides with the number of iterations. As stopping criterion we considered x n 1 P θ 10 4, P θ where 1 P denotes the vector in R P having all entries equal to 1.
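The node-arc incidence matrix A used throughout this subsection is straightforward to assemble from an edge list. The following MATLAB sketch is our own illustration (the function and variable names are hypothetical); it follows the convention stated above, with +1 at the i-th entry and -1 at the j-th entry of the column associated with the edge (i, j), i < j.

    % Build the node-arc incidence matrix A of a network with P nodes from an
    % E-by-2 edge list 'edges', each row being an edge (i, j) with i < j.
    function A = incidence_matrix(P, edges)
        E = size(edges, 1);
        A = sparse(P, E);
        for e = 1:E
            A(edges(e, 1), e) =  1;
            A(edges(e, 2), e) = -1;
        end
    end

The consensus constraint x_i = x_j for every edge then reads A^T x = 0, which is exactly the coupling used in (4.3.2) and (4.3.3).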

Figure 4.6: Watts-Strogatz network, (n, p) = (2, 0.8). The chart shows the communication steps needed by the four algorithms for a Watts-Strogatz network with different numbers of nodes. ALG stands for the primal-dual algorithm proposed in this work.

Figure 4.7: Geometric network, d = 0.2. The chart shows the communication steps needed by the four algorithms for a geometric network with different numbers of nodes. ALG stands for the primal-dual algorithm proposed in the thesis.

Figure 4.8: Comparison of the four algorithms over six networks with 10 nodes. Here, ALG stands for the proposed primal-dual algorithm.

We present in Figure 4.6 and Figure 4.7 the communication steps needed when dealing with the Watts-Strogatz network with parameters (2, 0.8) and with the geometric network with distance parameter 0.2. The Watts-Strogatz network is created from a lattice where every node is connected to 2 nodes, and then the links are rewired with a probability of 0.8, while the geometric network works with nodes in a [0, 1]^2 square and connects the nodes whose Euclidean distance is less than the given parameter 0.2. As shown in Figure 4.6 and in Figure 4.7, our algorithm performed comparably to D-ADMM, presented in [47], but it performed better than the algorithms presented in [48] and [70].

In order to observe the behavior of our algorithm on different networks, we tested it on 6 models, shown in Table 4.1, with different numbers of nodes. The models used and the role of their parameters are explained in Table 4.2. Observing the needed communication steps, we can conclude that our algorithm is communication-efficient and it performs better than or similarly to the algorithms in [47], [48] and [70] (as exemplified in Figure 4.8).
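As an illustration, the following MATLAB sketch generates a geometric random network exactly as just described; the code is our own, and the function name and the edge-list output format (one row (i, j) with i < j per edge) are assumptions introduced for this sketch.

    % Geometric random network: drop P points uniformly in the unit square
    % and connect the nodes whose Euclidean distance is less than d.
    function edges = geometric_network(P, d)
        pts   = rand(P, 2);
        edges = zeros(0, 2);
        for i = 1:P-1
            for j = i+1:P
                if norm(pts(i, :) - pts(j, :)) < d
                    edges(end+1, :) = [i j];   %#ok<AGROW> acceptable for a small sketch
                end
            end
        end
    end

Such an edge list can be fed directly to an incidence-matrix construction like the one sketched in the previous subsection.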

Number  Model            Parameters
1       Erdős-Rényi
2       Watts-Strogatz   (2, 0.8)
3       Watts-Strogatz   (4, 0.6)
4       Barabasi-Albert
5       Geometric
6       Lattice

Table 4.1: Network parameters.

Erdős-Rényi [30] (parameter p): every pair of nodes {i, j} is connected or not with probability p.
Watts-Strogatz [62] (parameters (n, p)): first, it creates a lattice where every node is connected to n nodes; then, it rewires every link with probability p. If link {i, j} is to be rewired, it removes the link and connects node i or node j (chosen with equal probability) to another node in the network, chosen uniformly.
Barabasi-Albert [3]: it starts with one node. At each step, one node is added to the network by connecting it to two existing nodes; the probability of connecting it to node k is proportional to D_k.
Geometric [50] (parameter d): it drops P points, corresponding to the nodes of the network, randomly in a [0, 1]^2 square; then, it connects the nodes whose Euclidean distance is less than d.
Lattice: creates a lattice of dimensions m × n; m and n are chosen to make the lattice as square as possible.

Table 4.2: Network models.

4.4 Support-vector machines classification

The fourth numerical experiment we present in this section addresses the problem of classifying images via support vector machines. Given a set of training data a_i ∈ R^n, i = 1, ..., k, belonging to one of two given classes, denoted by -1 and +1, the aim is to construct from it a decision function, given in the form of a separating hyperplane, which should assign every new data point to one of the two classes with a low misclassification rate.

4.4.1 The soft-margin hyperplane

We construct the matrix A ∈ R^{k×n} such that each row corresponds to a data point a_i, i = 1, ..., k, and a vector d ∈ R^k such that, for i = 1, ..., k, its i-th entry is equal to −1 if a_i belongs to the class −1 and is equal to +1 otherwise. Consider the case where the training data cannot be separated without error. In this case one may want to separate the training set with a minimal number of errors. In order to cover the situation when the separation cannot be done exactly, we consider nonnegative slack variables ξ_i ≥ 0, i = 1, ..., k; thus the goal will be to find (s, r, ξ) ∈ R^n × R × R^k_+ as an optimal solution of the following optimization problem (also called the soft-margin support vector machines problem):

min_{(s,r,ξ) ∈ R^n × R × R^k_+, D(As + 1_k r) + ξ ≥ 1_k} { ‖s‖^2 + C‖ξ‖^2 },   (4.4.1)

where 1_k denotes the vector in R^k having all entries equal to 1, the inequality z ≥ 1_k for z ∈ R^k means z_i ≥ 1, i = 1, ..., k, D = Diag(d) is the diagonal matrix having the vector d as main diagonal and C is a trade-off parameter. Each new data point a ∈ R^n will be assigned to one of the two classes by means of the resulting decision function z(a) = s^T a + r, namely, a will be assigned to the class −1 if z(a) < 0, and to the class +1 otherwise. For more theoretical insights into support vector machines we refer the reader to [28, 32].

The soft-margin support vector machines problem (4.4.1) can be written as a special instance of the optimization problem (3.3.1) by taking m = 3, p = 1, f_1(·) = ‖·‖^2, f_2 = 0, f_3(·) = C‖·‖^2 + δ_{R^k_+}(·), h_i = δ_{0}, K_i = Id, i = 1, 2, 3, g_1 = δ_{{z ∈ R^k : z ≥ 1_k}}, L_11 = DA, L_12 = D1_k and L_13 = Id.
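Once the hyperplane parameters have been computed, applying the decision function is immediate. The MATLAB sketch below is our own illustration of the classification rule stated above; s and r denote the computed hyperplane parameters, and the test-set name in the commented example is hypothetical.

    % Decision rule of the soft-margin hyperplane: assign a to class -1 if
    % z(a) = s'*a + r < 0 and to class +1 otherwise.
    classify = @(a, s, r) 2 * ((s(:)' * a(:) + r) >= 0) - 1;

    % Example usage on a test matrix Atest with one data point per row:
    % labels = arrayfun(@(i) classify(Atest(i, :)', s, r), (1:size(Atest, 1))');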

45 4.4 Support-vector machines classification 45 Algorithm Thus, Algorithm gives rise in the error-free case to the following iterative scheme: n 0 For i = 1, 2, 3 ( ) y 1,i,n = x 1,i,n γ n x2,i,n + L T 1iv 1,1,n y 2,i,n = x 2,i,n + γ n x 1,i,n p 2,i,n = Prox γnf y i 2,i,n ( w 1,1,n = v 1,1,n γ n v2,1,n 3 i=1 L ) 1ix 1,i,n w 2,1,n = v 2,1,n + γ n v 1,1,n r 2,1,n = Prox γng1 w 2,1,n For i = 1, 2, 3 ( ) q 1,i,n = y 1,i,n γ n p2,i,n + L T 1iw 1,1,n q 2,i,n = p 2,i,n + γ n y 1,i,n x 1,i,n+1 = x 1,i,n y 1,i,n + q 1,i,n x 2,i,n+1 = x 2,i,n y 2,i,n + q 2,i,n s 1,1,n = w 1,1,n γ n (r 2,1,n 3 i=1 L 1ip 1,i,n ) s 2,1,n = r 2,1,n + γ n w 1,1,n v 1,1,n+1 = v 1,1,n w 1,1,n + s 1,1,n v 2,1,n+1 = v 2,1,n w 2,1,n + s 2,1,n We would also like to notice that for the proximal points needed in the algorithm one has for γ > 0 and (s, r, ξ, z) R n R R k R k the following exact formulae: Prox γf 1 (s) = (2 + γ) 1 2s, Prox γf 2 (r) = 0, Prox γf 3 (ξ) = ξ γp R k + ( (2C + γ) 1 ξ ) and Prox γg1 (z) = P {x R k :x 1 k }(z) Experiments with digit recognition We made use of a data set of training images and 2041 test images of size from the website The problem

consisted in determining a decision function based on a pool of handwritten digits showing either the number two or the number nine, labeled by −1 and +1, respectively (see Figure 4.9). We evaluated the quality of the decision function on a test data set by computing the percentage of misclassified images. Notice that we use only half of the available images from the training data set, in order to reduce the computational effort.

Figure 4.9: A sample of images belonging to the classes −1 and +1, respectively: (a) a sample of data for the number 2; (b) a sample of data for the number 9.

With respect to the considered data set, we denote by D = {(X_i, Y_i), i = 1, ..., 6000} ⊆ R^784 × {−1, +1} the set of available training data, consisting of 3000 images in the class −1 and 3000 images in the class +1. A sample from each class of images is shown in Figure 4.9. The images have been vectorized and normalized by dividing each of them by the quantity (Σ_{i=1}^{6000} ‖X_i‖^2)^{1/2}.

Table 4.3: Misclassification rate in percentage for different numbers of iterations, for both the training data and the test data (columns: number of iterations, training error, test error).

We stopped the primal-dual algorithm after different numbers of iterations and evaluated the performances of the resulting decision functions. In Table 4.3 we present the misclassification rate in percentage for the training and for the test data (the error for the training data is less than the one for the test data) and observe that the quality of the classification increases with the number of iterations. However, even for a low number of iterations the misclassification rate outperforms the ones reported in the

Let us also mention that the numerical results are given for the case C = 1. We also tested other choices of C; however, we did not observe a significant impact on the results.
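As an illustration of the preprocessing and evaluation steps described above, the following NumPy sketch normalizes the vectorized images by the quantity $\big(\sum_i \|X_i\|^2\big)^{1/2}$ and computes the misclassification rate of a decision function. The array names and the way the data are passed in are assumptions made for the example.

```python
import numpy as np

def normalize_images(X):
    """Scale every vectorized image by (sum_i ||X_i||^2)^{1/2}, as in the experiment above.

    X is assumed to be an array of shape (N, 784), one vectorized image per row.
    """
    scale = np.sqrt(np.sum(np.linalg.norm(X, axis=1) ** 2))
    return X / scale

def misclassification_rate(X, labels, s, r):
    """Percentage of images assigned to the wrong class by the decision function z(a) = s^T a + r."""
    z = X @ s + r
    predicted = np.where(z < 0, -1, 1)
    return 100.0 * np.mean(predicted != labels)
```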

Chapter 5

Interdisciplinary application of our algorithm

During my studies, I collaborated with a team of researchers in biology studying the biodiversity of picoalgae in Romanian salt lakes. We investigated the phytoplankton communities in several hypersaline lakes of the Transylvanian basin with the latest microscopy and molecular biological techniques [38-40]. The aim was to adapt our algorithm for image processing and pattern recognition to microscopic images and electrophoretograms. The development process included working with epifluorescence microscopy images of picoalgae in order to count the cells and to distinguish them from the background noise.

5.1 Optimization in biology

Most of the molecular techniques rely on electrophoresis, in which DNA or proteins are separated in a gel using an electric field. In our case, DNA fragments from different organisms collected from the saline lakes were migrated using agarose gel electrophoresis in order to separate them. Agarose is a polysaccharide polymer material, generally extracted from seaweed, and is frequently used in molecular biology, biochemistry and clinical chemistry for the separation of large molecules, especially a mixed population of DNA, in a matrix of agarose. After the patterns of the migrated DNA fragments are obtained (see Figure 5.1), they are analyzed and the different samples are compared to establish differences between the salt lakes. The patterns can also be used to establish differences within a lake between different water depths [38].

Figure 5.1: Denaturing gradient gel electrophoresis pattern of DNA samples from the salt lakes of the Transylvanian Basin. Columns represent different water samples and the horizontal bands within the columns represent different species.

Molecular processing such as cloning and sequencing can be expensive and time consuming. Our aim is to optimize our algorithm to find specific patterns, in order to make the process cheaper and faster. Patterns can be distinguished with a method called pattern recognition. This is in fact a classification, which attempts to assign each input value to one of a given set of classes. The procedure is a non-probabilistic binary linear classifier, very similar to the classification via support vector machines (see Section 4.4). In our case, epifluorescence microscopic images were taken in order to count the cyanobacterial cells in the water samples, as shown in Figure 5.2.

5.2 Pattern recognition techniques

Enumerating the algae content of a water sample typically involves manual counts through a microscope. Recent advances in pattern recognition software enable

these counts to be automated. The new techniques have the potential to improve the sensitivity of monitoring systems by providing much more information on the state of the water supply in near-real time, with more statistical significance than conventional microscopy. The availability of this information can yield significant cost savings by allowing system operators to proactively treat water supplies.

In general, different species of picoalgae can appear quite clearly different to the human eye, but they are not as easily distinguished mathematically. Automated pattern recognition in images has been a complex area of research within computer science for many years, because many distinctions we make effortlessly with our eyes and brain remain very difficult to accomplish computationally. Our work in this area has made good progress, with promising results. In the future, we propose to develop our optimization technique for pattern recognition further, in order to eliminate the unclassified particles and to obtain better classification results for microscopic images.

Figure 5.2: Epifluorescence microscopic images (magnification = 1000) of picocyanobacteria. The first image shows the picocyanobacteria under blue-violet excitation and the second image shows them under green excitation.
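To illustrate the kind of automated counting mentioned above, here is a minimal sketch that thresholds a grayscale epifluorescence image and counts connected bright regions. It assumes the image is already available as a NumPy array with intensities in [0, 1]; the threshold and minimum cell size are illustrative parameters, and this is not the classification algorithm developed in the thesis.

```python
import numpy as np
from scipy import ndimage

def count_bright_cells(image, threshold=0.5, min_pixels=5):
    """Rough cell count: threshold the fluorescence image and count connected components.

    image is assumed to be a 2-D array with intensities scaled to [0, 1].
    """
    binary = image > threshold               # keep only bright (fluorescing) pixels
    labels, _ = ndimage.label(binary)        # connected components = candidate cells
    sizes = np.bincount(labels.ravel())[1:]  # component sizes, background label 0 excluded
    return int(np.sum(sizes >= min_pixels))  # discard tiny components (background noise)
```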
