A New Sequential Optimality Condition for Constrained Nonsmooth Optimization


Elias Salomão Helou, Sandra A. Santos, Lucas E. A. Simões

November 23, 2018

Abstract. We introduce a sequential optimality condition for locally Lipschitz constrained nonsmooth optimization, verifiable using only derivative information, and which holds even in the absence of any constraint qualification. The proposed sequential optimality condition is not only novel for nonsmooth problems, but brings new insights for the smooth case as well. We present a practical algorithm that generates iterates fulfilling the new necessary optimality condition. A main feature of the devised algorithm is that it allows a stronger control over the infeasibility of the iterates than is usually obtained by exact penalty strategies, ensuring theoretical and practical advantages. Illustrative numerical experiments highlight the potentialities of the algorithm.

Keywords: nonsmooth nonconvex optimization; constrained optimization; sequential optimality condition; constraint qualification

AMS Subject Classifications: 90C26, 90C30, 90C46

1 Introduction

We consider constrained nonsmooth optimization problems of the form

  min_{x ∈ R^n} f(x)  s.t.  c(x) ≤ 0,   (P)

where both f : R^n → R and c : R^n → R^p are locally Lipschitz continuous functions. Equality constraints of the type h(x) = 0 can also easily be incorporated into our framework (see Remark 1 for more details).

A common way of solving constrained optimization problems is to turn (P) into an unconstrained minimization by penalizing points x ∈ R^n that violate the constraints, i.e., one seeks a solution to

  min_{x ∈ R^n} f(x) + ρ z(x),   (Unc-P)

where ρ is a positive real number and z : R^n → R is a continuous function satisfying z(x) > 0 if c(x) ≰ 0, and z(x) = 0 if c(x) ≤ 0.

Institute of Mathematical Sciences and Computation, University of São Paulo, São Carlos - SP, Brazil (elias@icmc.usp.br). Department of Applied Mathematics, University of Campinas, Campinas - SP, Brazil (sandra@ime.unicamp.br, simoes.lea@gmail.com).
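As a concrete choice of z (an illustration, not taken from the paper), the ℓ1 violation z(x) = Σ_i max{c_i(x), 0} satisfies both requirements above and is the classical nondifferentiable exact penalty term. The sketch below, in Python with numpy, simply evaluates the penalized objective of (Unc-P) for that choice on a hypothetical toy instance; the names `f`, `c` and the data are assumptions of this example.

```python
import numpy as np

def l1_violation(c_val):
    """z(x) = sum_i max{c_i(x), 0}: positive iff some constraint is violated, zero otherwise."""
    return float(np.sum(np.maximum(c_val, 0.0)))

def penalized_objective(x, f, c, rho):
    """Objective of (Unc-P) with the l1 exact penalty term."""
    return f(x) + rho * l1_violation(c(x))

# Toy instance: min x subject to 1 - x <= 0 (i.e. x >= 1).  For rho > 1 the penalized
# minimizer coincides with the constrained one, which is the idea behind exact penalization.
f = lambda x: float(x[0])
c = lambda x: np.array([1.0 - x[0]])
print(penalized_objective(np.array([0.5]), f, c, rho=2.0))   # 0.5 + 2*0.5 = 1.5
print(penalized_objective(np.array([1.0]), f, c, rho=2.0))   # 1.0 at the feasible minimizer
```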

Under certain conditions, and considering ρ to be large enough, the unconstrained minimization (Unc-P) and the original problem (P) are equivalent [12]. This approach is known as exact penalization [28, 50] and it can be used to derive most of the theory behind constrained optimization in finite dimensions [13]. For the majority of optimization problems, the equivalence between (P) and (Unc-P) can only be achieved for finite ρ when z is a nondifferentiable function. However, when both objective and constraint functions are smooth in the entire domain, one usually does not want to trade the constrained smooth problem for a nonsmooth unconstrained one. Hence, many alternatives have been proposed over the years to overcome the nondifferentiable nature of (Unc-P). For instance, the Augmented Lagrangian [1, 20] and Sequential Quadratic Programming (SQP) [9, 46] are widely known methods. However, when (P) already presents nonsmoothness, the nondifferentiability of z usually does not introduce any additional difficulty. For this reason, many methods developed for solving constrained nonsmooth optimization problems are based on exact penalty functions [22, 23, 38, 39, 48].

There is a strong connection between constraint qualifications (CQ) [5, 12, 46] for problem (P) and the existence of a finite penalty parameter that makes (Unc-P) equivalent to (P). Moreover, convergence results for exact penalization methods (and, more generally, for the majority of optimization algorithms) are based on different kinds of CQs. However, checking the validity of constraint qualifications is not an easy task, which may justify the fact that most algorithms are not designed to test any kind of CQ even when the theoretical convergence of the method relies on such conditions. As a result, practical necessary optimality conditions for (P) that do not depend on constraint qualifications are of great importance for establishing theoretically sound stopping criteria for any optimization method.

The approximate Karush-Kuhn-Tucker (AKKT) and the complementary approximate Karush-Kuhn-Tucker (CAKKT) conditions [2, 4] present themselves as reliable necessary optimality conditions for smooth optimization problems. Looking at auxiliary functions that approximate the exact penalty approach, the authors of both studies show that any solution x* of the smooth optimization problem must satisfy the following sequential optimality condition

  lim_{k→∞} ‖ ∇f(x^k) + Σ_{i=1}^p μ_i^k ∇c_i(x^k) ‖ = 0,   (1)

where ‖x‖ is the Euclidean norm of x ∈ R^n, {x^k} ⊂ R^n is a sequence converging to x*, and {μ^k} ⊂ R^p_+ is a sequence of vectors whose components μ_i^k must satisfy

  lim_{k→∞} min{ −c_i(x^k), μ_i^k } = 0,

where c_i : R^n → R are the components of c : R^n → R^p for each i ∈ {1, ..., p}.

A natural attempt to generalize (1) to the nonsmooth case is to use the subdifferential concept, i.e., to consider a sequence of elements

  v^k ∈ ( ∂f(x^k) + Σ_{i=1}^p μ_i^k ∂c_i(x^k) ),  with  lim_{k→∞} ‖v^k‖ = 0,   (2)

where ∂g(x) stands for the Clarke subdifferential set of the function g at the point x [18]. Unfortunately, appropriate vectors from the subdifferential satisfying these conditions are, in general, not expected to be computed by numerical algorithms, which invalidates the use of (2) as a practical stopping criterion.
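For a smooth instance, the two quantities that must vanish along an AKKT sequence are cheap to monitor. The snippet below is only an illustrative sketch: `grad_f`, `jac_c` and `c` are hypothetical user-supplied callbacks, and the toy data at the end are an assumption of this example.

```python
import numpy as np

def akkt_residuals(x, mu, grad_f, jac_c, c):
    """Residuals that vanish along an AKKT sequence (smooth case).

    x      : current iterate, shape (n,)
    mu     : approximate multipliers, shape (p,), nonnegative
    grad_f : callable returning the gradient of f at x, shape (n,)
    jac_c  : callable returning the Jacobian of c at x, shape (p, n)
    c      : callable returning the constraint values at x, shape (p,)
    """
    g = grad_f(x) + jac_c(x).T @ mu          # nabla f + sum_i mu_i nabla c_i, as in (1)
    stat = np.linalg.norm(g)                 # stationarity residual
    compl = np.max(np.minimum(-c(x), mu))    # max_i min{-c_i(x), mu_i}
    return stat, compl

# Toy usage: min x1 + x2 s.t. x1^2 + x2^2 - 2 <= 0; minimizer (-1, -1) with multiplier 1/2.
f_grad = lambda x: np.array([1.0, 1.0])
c_fun  = lambda x: np.array([x[0]**2 + x[1]**2 - 2.0])
c_jac  = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]]])
print(akkt_residuals(np.array([-1.0, -1.0]), np.array([0.5]), f_grad, c_jac, c_fun))
```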
Using the fact that many real nonsmooth optimization problems are described by functions that are continuously differentiable in a full-measure subset of R^n, the authors of [27] present a necessary optimality condition that relies upon the fact that, in many cases, the sets ∂f(x*) and ∂c_i(x*) can be traded for ∂f(x̂) and ∂c_i(x̂), where x̂ is a point sufficiently close to x*. However, if no constraint qualification is assumed, one must have x̂ = x*, which brings back the same issues discussed in the previous paragraph.

In this paper, we propose a new sequential optimality condition for constrained nonsmooth optimization problems that allows the user to work with derivatives even in the absence of any CQ.

To show that our necessary optimality condition is practical, we present an algorithm that is proven to generate a sequence of points satisfying such conditions. A main feature of the proposed algorithm is that it exerts a stronger control over the infeasibility of the iterates than usual exact penalty methods, which has important practical consequences.

One of the main challenges of exact penalty methods is the selection of the initial value of the penalty parameter. In case this value is chosen too large, the method may favor feasibility to the detriment of optimality, which, in turn, can cause very short steps toward the optimal solution and/or may cause the penalized problem to be ill-conditioned. On the other hand, if the penalty parameter value is much smaller than the magnitude of the objective function at infeasible points, the method may rapidly be attracted by unconstrained minimizers that possess very low function values, preventing the user from obtaining a successful solution for reasonable values of ρ. Such a phenomenon is called greediness [7, 17], and may occur even if the user starts the method at a feasible point of the optimization problem. Unlike some existing penalty methods, for which there is no simple way to compute a suitable initial value for the penalty parameter so as to easily confine the iterates to an almost feasible region, our algorithm allows the user to set a tolerance target value ξ > 0 for infeasibility. Outside this tolerance region, the method is indifferent to the objective function, which ensures control of the infeasibility even at the initial iterations. As a consequence, it prevents our method from suffering from the greediness phenomenon.

The outline of the paper is as follows. Section 2 presents theoretical results supporting that every local minimizer of (P) must fulfill our proposed sequential optimality conditions. Section 3 shows the relation between our sequential optimality conditions and the AKKT and CAKKT conditions in the case where the objective and constraint functions of (P) are all smooth. In Section 4, we state a general algorithm that produces a sequence of iterates fulfilling the proposed sequential optimality conditions. Section 5 brings numerical results that illustrate some of the important properties of our method. Finally, we leave Section 6 for the conclusions of this study.

Throughout the manuscript, the following notation is frequently used: ‖·‖ is the Euclidean norm; B(x, ε) is the Euclidean closed ball with center x and radius ε; P(v | X) is the Euclidean projection of v onto X; R_+ is the set of all nonnegative real numbers; R_{++} is the set of all strictly positive real numbers; v_+ := max{v, 0}, and if v is a vector, then v_+ is also a vector where every entry is taken as the maximum between the respective entry of v and zero; conv X denotes the convex hull of X; cl X denotes the closure of X; A + B represents the set {x + y : x ∈ A, y ∈ B}; A·B represents the set {xy : x ∈ A, y ∈ B}.

2 Establishing new necessary optimality conditions

We start this section by introducing the concept of the (Goldstein) ε-subdifferential set of a locally Lipschitz continuous function f : R^n → R [31].

Definition 2.1 (ε-subdifferential set, ε-subgradient, ε-stationary point). The ε-subdifferential set of f at x is given by

  ∂_ε f(x) := conv ∂f(B(x, ε)),

where conv S is the convex hull of the set S and B(x, ε) := {y : ‖y − x‖ ≤ ε}. Any v ∈ ∂_ε f(x) is known as an ε-subgradient of f at x. Moreover, if 0 ∈ ∂_ε f(x), then we say that x is an ε-stationary point for f.

One of the greatest advantages of looking at ε-subdifferential sets instead of subdifferential sets is the possibility of seeing ∂_ε f(x) as a convex hull of the derivatives of f nearby x. Indeed, for any locally Lipschitz continuous function f and any full-measure subset D_f of R^n such that f is differentiable at every point of D_f (this subset always exists due to Rademacher's theorem [30, Theorem 3.1.6]), the following relations hold (see [14, 40])

  G_ε^f(x) ⊂ ∂_ε f(x)  and  ∂_{ε₁} f(x) ⊂ G_{ε₂}^f(x)  (0 ≤ ε₁ < ε₂),   (3)

where G_ε^f(x) := cl conv ∇f(B(x, ε) ∩ D_f), with cl S being the closure of the set S. Recent advances on practical tools [14, 15, 16] to approximate G_ε^f(x) allow us to consider a new sequential optimality condition of practical use. The subgradients involved in this new necessary optimality condition are those associated with the objective function and with the constraints related to the following index set

  I_ε(x) := {i : ∃ y ∈ B(x, ε) with c_i(y) ≥ 0}.

Definition 2.2 (weak ε-approximate Nonsmooth Optimality Condition). A feasible point x* ∈ R^n of (P) is said to satisfy the weak ε-approximate Nonsmooth Optimality Condition (weak ε-ANOC) if there exist sequences {x^k} ⊂ R^n, {ε_k} ⊂ R_{++}, where R_{++} is the set of positive real numbers, {v^k} ⊂ R^n and {μ^k} ⊂ R^p_+ such that x^k → x*, ε_k → 0 and v^k → 0, where

  v^k ∈ ( G_{ε_k}^f(x^k) + Σ_{i=1}^p μ_i^k G_{ε_k}^{c_i}(x^k) )  and  i ∉ I_{ε_k}(x^k) ⇒ μ_i^k = 0.   (4)

Notice that the above definition does not impose any control over the speed of the convergence ε_k → 0. Not establishing a relation between the sequences {ε_k} and {x^k} allows ε_k ≫ ‖x^k − x*‖, which, in turn, means that the weak ε-ANOC may depart too much from the necessary condition that at least one generalized derivative of f near x* must be close to a linear combination of the subgradients of the active constraints at x*. Fig. 1 exemplifies this issue, showing that subgradients of far away inactive constraints may be considered in the linear combination.

Figure 1: Illustration of how the value of ε in the ε-subdifferential can influence the activeness of the constraints.

Using the exact penalization approach for problem (P) and a reasoning similar to the one employed by the authors in [4, Theorem 3.3], one can infer the weak ε-ANOC as a necessary optimality condition for nonsmooth problems. However, this strategy does not clarify the question of how fast the sequence {ε_k} must go to zero. To overcome this issue, we have applied a new penalization strategy that has its roots in [8, Section 4.1].
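Since G_ε^f(x) is the closed convex hull of gradients taken in B(x, ε) ∩ D_f, it can be approximated in practice by sampling: draw points in the ball where f is differentiable, collect their gradients, and measure ε-stationarity through the norm of the minimum-norm element of their convex hull, i.e., an estimate of ‖P(0 | G_ε^f(x))‖. The sketch below is only an illustration of this idea (it is not the method of [14, 15, 16]); the sample size, iteration counts and the toy function are arbitrary assumptions, and the min-norm step uses a plain Frank-Wolfe iteration on the unit simplex.

```python
import numpy as np

def min_norm_in_hull(G, iters=500):
    """Approximate the min-norm point of conv{rows of G} by Frank-Wolfe on the simplex."""
    m = G.shape[0]
    lam = np.full(m, 1.0 / m)                       # start at the barycenter
    for _ in range(iters):
        v = lam @ G                                 # current point in the hull
        grad = 2.0 * (G @ v)                        # gradient of ||sum_j lam_j G_j||^2 w.r.t. lam
        j = int(np.argmin(grad))                    # best vertex of the simplex
        d = G[j] - v                                # Frank-Wolfe direction in point space
        denom = d @ d
        if denom <= 1e-16:
            break
        t = np.clip(-(v @ d) / denom, 0.0, 1.0)     # exact line search for the quadratic
        lam = (1.0 - t) * lam
        lam[j] += t
    return lam @ G

def eps_stationarity(grad_f, x, eps, n_samples=30, rng=None):
    """Sample gradients in B(x, eps) and return an estimate of ||P(0 | G_eps^f(x))||."""
    rng = np.random.default_rng(rng)
    n = x.size
    pts = [x]
    for _ in range(n_samples):
        u = rng.normal(size=n)
        u *= rng.uniform() ** (1.0 / n) / np.linalg.norm(u)   # uniform point in the unit ball
        pts.append(x + eps * u)
    G = np.array([grad_f(p) for p in pts])
    return np.linalg.norm(min_norm_in_hull(G))

# f(x) = |x1| + |x2| is nonsmooth at the origin: the measure is small for points near it.
grad_abs = lambda p: np.sign(p)    # a valid gradient wherever no entry of p is zero
print(eps_stationarity(grad_abs, np.array([0.05, -0.02]), eps=0.1))
```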

The inspiring idea behind this new penalization is to trade the original nonsmooth problem for the minimization of a discontinuous function Θ_ρ : R^n → R,

  Θ_ρ(x) = 0, if c(x) ≰ 0;  Θ_ρ(x) = f(x) − ρ, otherwise,

where ρ ∈ R plays a different role than the one it has in the exact penalization function. Here, the parameter ρ is any constant ensuring that f(x*) − ρ < 0, with x* being a solution of (P). Like the strategy of solving constrained smooth optimization problems by the use of an exact penalty function, where a degree of difficulty is added by introducing a nondifferentiable term in order to avoid a constrained minimization, this approach also adds one extra difficulty to the nonsmoothness of the original problem by using a discontinuous function. Therefore, we avoid using this idea directly.

Theorem 2.1 ahead shows that the weak ε-ANOC is, indeed, a necessary optimality condition for problem (P) and also elucidates the matter of the speed of the convergence ε_k → 0. Its proof relies on a sequence {Ψ_{ξ_k,ρ}} of nonsmooth continuous functions Ψ_{ξ,ρ} : R^n → R approximating the discontinuous function Θ_ρ. Given a scalar ξ ∈ R_{++}, we define

  Ψ_{ξ,ρ}(x) := max{ 1 − ‖c(x)_+‖₁ / ξ, 0 } [f(x) − ρ] + ‖c(x)_+‖₁.   (5)

It is worth noticing that, since f and c are locally Lipschitz continuous functions, the map Ψ_{ξ,ρ} is also locally Lipschitz continuous. Preceding the result that establishes the weak ε-ANOC as a necessary optimality condition, we present two lemmas that will facilitate the proof of Theorem 2.1.

Lemma 2.1. Let x ∈ R^n, (ε, ξ) ∈ R_+ × R_{++}, ρ ∈ R with ρ > f(x), and v ∈ ∂_ε Ψ_{ξ,ρ}(x). Then, if ε > 0 is such that y ∈ B(x, ε) ⇒ ρ ≥ f(y), there exist {x^j}_{j=1}^{n+1} ⊂ B(x, ε) and λ ∈ R^{n+1}_+ with e^T λ = 1, where e := (1, 1, ..., 1)^T, such that

  v ∈ Σ_{j=1}^{n+1} λ_j ( max{ 1 − ‖c(x^j)_+‖₁ / ξ, 0 } ∂f(x^j) + σ_j ∂‖c(x^j)_+‖₁ ),   (6)

where σ_j ≥ 1, j ∈ {1, ..., n+1}.

Proof. Recalling that ∂_ε Ψ_{ξ,ρ}(x) is a convex set, it follows from Carathéodory's Theorem [49, Theorem 2.29] that there exist {s^j}_{j=1}^{n+1} ⊂ ∂Ψ_{ξ,ρ}(B(x, ε)) and λ ∈ R^{n+1}_+, with e^T λ = 1, such that v = Σ_{j=1}^{n+1} λ_j s^j. By the calculus rules of [18] (in particular, [18, Theorem 2.3.9] and [18, Proposition 2.3.3]), we know that ∂Ψ_{ξ,ρ}(x) is a subset of

  max{ 1 − ‖c(x)_+‖₁ / ξ, 0 } ∂f(x) + ( 1 + [ρ − f(x)] conv{1/ξ, 0} ) ∂‖c(x)_+‖₁.

So, since, by hypothesis, ρ − f(x^j) ≥ 0 for all j ∈ {1, ..., n+1}, it follows that there exist {x^j}_{j=1}^{n+1} ⊂ B(x, ε) such that (6) holds with σ_j ∈ 1 + [ρ − f(x^j)] conv{1/ξ, 0}, and hence σ_j ≥ 1.

The next result presents sufficient conditions for the feasible point x* to satisfy the weak ε-ANOC.

Lemma 2.2. Suppose x* is a feasible point of problem (P), {x^k} is a sequence converging to x*, {ε_k} and {ξ_k} are both real-valued sequences with ε_k → 0, ξ_k → 0 and ε_k/ξ_k → 0, and ρ is a real value satisfying f(x*) − ρ < 0. If lim_{k→∞} ‖c(x^k)_+‖₁/ξ_k exists and is strictly less than one, and there exists a sequence {r^k} ⊂ R^n such that r^k → 0 and r^k ∈ ∂_{ε_k} Ψ_{ξ_k,ρ}(x^k) for all k ∈ N, then x* satisfies the weak ε-ANOC.

6 Proof. Let us choose a sequence {ζ } satisfying ζ < ɛ for all N. Recalling Lemma 2.1, it follows that, for all large enough, there exists {x,j } n+1 B(x, ζ ) such that n+1 r λ j ( { } ) max 1 c(x,j ) + 1, 0 f(x,j ) + σ j c(x,j ) + 1, ξ with σj 1, j {1,..., n + 1}. Therefore, for each j, there must exist s,j f c i (x,j ) for each i, such that s,j c i where Then, defining n+1 r = λ j ( max { } 1 c(x,j ) + 1, 0 s,j f + σj ξ {1}, if c i (x) > 0 i (x) conv{1, 0}, if c i (x) = 0 {0}, if c i (x) < 0 n+1 ι f = { } λ j max 1 c(x,j ) + 1, 0 ξ and. n+1 ι ci = f(x,j ) and i (x,j )s,j c i, (7) ) λ j σj i (x,j ), (8) one can see, for large values of, that ι f is strictly positive due to the fact that lim c(x ) + 1 /ξ is strictly less than one, x x,j ɛ, and ɛ /ξ 0. Hence, for large enough, we have, for all i and j, that 0 τ f,j := λ j max { } 1 c(x,j ) + 1 ξ, 0 Additionally, the following holds: n+1 τ f,j = 1 and n+1 τ ci ι f and 0 τ ci,j := { λ j σ j i(x,j ) ι c i { 1, if ι c,j = i > 0 0, if ι ci = 0 for all i {1,..., p}., if ι ci > 0 0, if ι ci = 0. ( This together with (7) imply that r ι f ζ f(x ) + ) p ιci ζ c(x ). So, choosing v := 1 r, ι ( f we obtain v ζ f(x ) + ) p ι c i ι f ζ c(x ). Now, notice that, because of the way ι ci and i were defined, it must follow i / I ζ (x ) µ i := ιc i = 0. Moreover, since r 0, it yields ι f v 0 as well. So, ( ) v ζ f(x ) + µ i ζ c i (x ) and i / I ζ (x ) µ i = 0. (9) Remembering the inclusions presented in (3) and that ɛ1 g(x) ɛ2 g(x) for any nonsmooth function g and 0 ɛ 1 ɛ 2, it yields v ( Gɛ f (x ) + p µ i Gci ɛ (x ) ) and i / I ɛ (x ) µ i = 0, which proves the statement. We are finally able to introduce the theorem that guarantees that the wea ɛ-anoc is a necessary optimality condition. Theorem 2.1. Let x be a local minimizer of (P). Then, x satisfies the wea ɛ-anoc. 6

7 Proof. Since x is a local minimizer of (P), there must exist δ 1 > 0 such that f(x) f(x ) for all x B(x, δ 1 ) F, where F is the feasible set of (P). Moreover, setting ρ as any real number satisfying ρ > f(x ), it yields that there exists δ 2 > 0 with ρ > f(x) for all x B(x, δ 2 ). Choosing δ = min{δ 1, δ 2 }/2, we define Φ ξ, ρ : R n R as { Φ ξ,ρ (x) := max 1 max{ x } { x δ, 0}, 0 max 1 c(x) } + 1, 0 [f(x) ρ] ξ ξ + c(x) x x 2. Since Φ ξ,ρ is a coercive function for any ξ > 0, a global minimizer of this function always exists. So, taing {ξ } R + as any sequence satisfying ξ 0, we consider an infinite sequence {x } R n such that x argmin x R n Φ ξ,ρ(x). Consequently, Φ ξ,ρ(x ) Φ ξ,ρ(x ) = f(x ) ρ < 0, which yields c(x ) + 1 < ξ and x x < ξ + δ. (10) This implies that the sequence {x } must be bounded. So, let ˆx R n be a cluster point of {x }, which means that there exists an infinite index set K N such that x ˆx. We then define K { 0 τ := max 1 max{ x x } { } δ, 0}, 0 max 1 c(x ) + 1, 0 1. ξ ξ Therefore, it is possible to find an infinite index set K K such that τ ˆτ, for some ˆτ [0, 1]. Hence, since c(ˆx) + 1 = 0 because of (10), we have Φ ξ,ρ(x ) Φ ξ,ρ(x ) lim Φ K ξ,ρ(x ) lim Φ K ξ,ρ(x ) ˆτ[f(ˆx) ρ] ˆx x 2 f(x ) ρ. By (10), we now that ˆx x δ. Then, because of the way we have defined δ, we see that f(ˆx) ρ and f(x ) ρ are strictly negative numbers. Consequently, recalling that x is a local minimizer of (P), we must have ˆτ = 1 and ˆx = x. Therefore, since ˆx and ˆτ are arbitrary cluster points of {x } and {τ }, respectively, this means that any cluster point of {x } must be x and that any cluster point of {τ } must be 1. Since both sequences are bounded, it follows that x x c(x and τ 1. In particular, this last limit also ensures that lim ) + 1 ξ exists and it is strictly less than one. The equality ˆx = x implies, for any sufficiently large N, that x x δ/2 Φ ξ,ρ(x ) = Ψ ξ,ρ(x ) + x x. However, by the way we have defined x, we now that 0 Φ ξ,ρ(x ), which, for any ζ 0, gives us x x Ψ ξ,ρ(x ) r ζ Ψ ξ,ρ(x ) with r 0. Due to Lemma 2.2, the statement is proven. The demonstration of Lemma 2.2 gives a clue about the matter of convergence of ɛ. Notice that, along the proof, we had to assume ɛ /ξ > ζ /ξ 0, where each ξ is a parameter for the nonsmooth function Ψ ξ,ρ. This was necessary to ensure that the multiplier associated with ζ f(x ) is bounded away from 0 in (8), which consequently means that the objective function is always considered in the sequential optimality condition (9). However, notice that even the imposition ɛ /ξ 0 only subjects ɛ to the function Ψ ξ,ρ, meaning that this limit is useful only 7

when one has a method to solve nonsmooth problems that is based on such a function, which reduces the generality of the necessary condition. This issue can be overcome by noticing that μ^k and the limit ε_k/ξ_k → 0 are linked inside the proof of Lemma 2.2. So, by defining a new sequential optimality condition that requires ε_k ‖μ^k‖ → 0, we introduce the ε-approximate Nonsmooth Optimality Condition.

Definition 2.3 (ε-approximate Nonsmooth Optimality Condition). A feasible point x* ∈ R^n of (P) is said to satisfy the ε-approximate Nonsmooth Optimality Condition (ε-ANOC) if there exist sequences {x^k} ⊂ R^n, {ε_k} ⊂ R_{++}, {v^k} ⊂ R^n and {μ^k} ⊂ R^p_+ fulfilling the weak ε-approximate Nonsmooth Optimality Condition at x* and, moreover, ε_k ‖μ^k‖ → 0 holds.

From the proof of Theorem 2.1, one can see that the ε-ANOC is also a necessary optimality condition for any local solution x* of (P).

Theorem 2.2. Let x* be a local minimizer of (P). Then, x* satisfies the ε-ANOC.

Proof. Looking at the proof of Theorem 2.1, and since x* is a local minimizer of (P), one can find a sequence {x^k} converging to x*, and associated sequences {ε_k}, {ξ_k} satisfying ε_k → 0, ξ_k → 0 and ε_k/ξ_k → 0, such that the assumptions of Lemma 2.2 hold and (9) is also valid (where ζ_k < ε_k). Additionally, since ε_k/ξ_k → 0, we also have conv{0, 1/ξ_k} ε_k → 0. Hence, recalling that, in (8), ι_f^k is bounded away from zero for large values of k, and that {ι_{c_i}^k ε_k} approaches zero for all i ∈ {1, ..., p} (due to the fact that conv{0, 1/ξ_k} ε_k → 0), it must follow that ε_k μ_i^k = ε_k (ι_{c_i}^k / ι_f^k) → 0 for all i ∈ {1, ..., p}. This information together with Theorem 2.1 ensure the result.

Remark 1. Although the sequential optimality conditions that we have proposed so far consider only inequality constraints, one can easily generalize them to nonsmooth problems that may include equality constraints. Indeed, if in problem (P) the feasible set is given by F = {x ∈ R^n : c(x) ≤ 0, h(x) = 0}, where h : R^n → R^l, we can apply the concepts of the weak ε-ANOC and ε-ANOC to the problem

  min_{x ∈ R^n} f(x)  s.t.  c(x) ≤ 0,  g(x) ≤ 0,

where g(x) = max{h(x), −h(x)}. Applying (4) to the above equality-free optimization problem and using the definition of g, one obtains a sequence {λ^k} ⊂ R^l, free of sign, that satisfies

  v^k ∈ G_{ε_k}^f(x^k) + Σ_{i=1}^p μ_i^k G_{ε_k}^{c_i}(x^k) + Σ_{j=1}^l λ_j^k G_{ε_k}^{h_j}(x^k)  and  i ∉ I_{ε_k}(x^k) ⇒ μ_i^k = 0.

For the ε-ANOC, one only needs to replace ε_k ‖μ^k‖ → 0 by ε_k ‖((μ^k)^T, (λ^k)^T)‖ → 0.

Although the ε-ANOC is more restrictive over the sequence {ε_k} than the weak ε-ANOC, it is not clear whether the former has practical advantages over the latter. The next example has the goal of showing that the ε-ANOC is, in fact, stronger than the weak ε-ANOC.

Example 1. Let us consider the following nonsmooth optimization problem

  min_x f(x) = max{x, x² − 2}  s.t.  c₁(x) = x ≤ 0,  c₂(x) = −x² ≤ 0.

We choose {x^k} ⊂ R as a sequence of positive numbers converging to x* = 0, and also let {ε_k} be such that 0 < ε_k < x^k. So, 2 ∉ I_{ε_k}(x^k). Then, for all sufficiently large k such that x^k < 1/2, and in case μ_2^k = 0, we obtain

  | G_{ε_k}^f(x^k) + μ_1^k G_{ε_k}^{c₁}(x^k) + μ_2^k G_{ε_k}^{c₂}(x^k) | = 1 + μ_1^k ≥ 1.

This shows that if {x^k} fulfills the required conditions of the weak ε-ANOC, the sequence {ε_k} must satisfy ε_k ≥ x^k for every k large enough. This guarantees that 2 ∈ I_{ε_k}(x^k). So, for sufficiently large values of ε_k, one can always set μ_1^k = 0 and find μ_2^k large enough in order to have ‖P( 0 | G_{ε_k}^f(x^k) + μ_1^k G_{ε_k}^{c₁}(x^k) + μ_2^k G_{ε_k}^{c₂}(x^k) )‖ → 0. The previous reasoning ensures that, although x* is a local maximum point, it satisfies the weak ε-ANOC.

9 However, we state that x does not satisfy the stronger condition ɛ-anoc. By contradiction, suppose x satisfies the ɛ-anoc. Then, in particular, it must fulfill the wea ɛ-anoc, which gives us ɛ x for every large. Also, notice that any element in Gɛ c2 (x ) is always greater than 2( x + ɛ ). Combining this with ɛ x gives gc 2 Gɛ c2 (x ) gc 2 4ɛ. So, v ( Gɛ f (x ) + µ 1Gɛ c1 (x ) + µ 2Gɛ c2 (x ) ) implies v = 1 + µ 1 + µ 2g c 2 1 4µ 2ɛ. (11) Since we supposed that x satisfies the ɛ-anoc, it follows ɛ µ 0. This last fact together with (11) show that v cannot converge to zero, which evinces a contradiction with the statement that x satisfies the ɛ-anoc. We now prove that our proposed optimality conditions must be, at least, as strong as the necessary optimality condition presented in [29] (considering D(x ) = R n in [29, Proposition 3.1]), which has its roots in the study presented in [36]. Theorem 2.3. If x R n is a feasible point for (P) satisfying the (wea) ɛ-anoc, then the generalized Fritz-John optimality conditions hold at x, i.e., there must exist positive real values λ 0, λ 1,..., λ p, not all simultaneously zero, such that 0 λ 0 f(x ) + λ i c i (x ) and λ i c i (x ) = 0, for all i {1,..., p}. (12) Proof. For any feasible point x R n satisfying the (wea) ɛ-anoc, there exist sequences {x } R n, {ɛ } R +, {v } R n and {µ } R p + such that x x, ɛ 0 and v 0, with ( ) v = v f + µ i v c i and i / I ɛ (x ) µ i = 0, (13) where v f Gf ɛ (x ) ɛ f(x ) and v c i Gɛ ci (x ) ɛ c i (x ). Then two situations can happen: (a) the sequence {µ } is bounded, or (b) the sequence {µ } is unbounded. Let us consider that (a) is the case. Since x x and all functions considered here are locally Lipschitz continuous functions, it follows that {v f } and {v c i }, i {1,..., p}, are all bounded sequences. Therefore, there must exist an infinite index set K N such that µ µ, v f K K v f and v c i v c K i, i {1,..., p}. So considering the result of [40, Lemma 3.2 (iii)] for the auxiliary functions f(x) = f(x) v T f x and c i (x) = c i (x) v c T i x, i {1,..., p}, it follows that v f f(x ) and v c i c i (x ), i {1,..., p}. Therefore, recalling that v 0, we get 0 ( f(x ) + p µ i c i(x )). Notice also that if c i (x ) < 0 for some i, then i / I ɛ (x )( µ i = 0) for all N sufficiently large, which yields µ i = 0. This assures the result for case (a). If (b) holds, then one only has to divide the equation that appears in (13) by max i {µ i }. This ensures that the new multipliers of v c i, i {1,..., p}, will be bounded and the multiplier of v f will be 1/ max i {µ i }. Hence, the same reasoning used in case (a) can be employed, obtaining the Fritz-John conditions with zero being the multiplier of f(x ). Considering the Mangasarian-Fromovitz constraint qualification for nonsmooth problems [27, Section 3], which guarantees that (12) cannot be true if λ 0 = 0, and the generalized KKT condition, as defined below, the corollary presented in the sequence follows immediately from the theorem above. Definition 2.4 (Generalized KKT conditions [18, Section 6.3]). A feasible point x R n of (P) is said to satisfy the generalized KKT conditions if there exists µ R p + such that ( ) 0 f(x ) + µ i c i (x ) and µ i c i (x ) = 0 for all i {1,..., p}. 9

Corollary 2.1. If a feasible point x* ∈ R^n of (P) satisfies the (weak) ε-ANOC, and the nonsmooth Mangasarian-Fromovitz constraint qualification holds at x*, then the generalized KKT conditions also hold at x*.

We believe that the results obtained in this section have great relevance in the nonsmooth optimization field, especially when one is concerned with the convergence analysis of nonsmooth optimization methods. In the nonsmooth context, it is common to prove the convergence of an algorithm by showing that every accumulation point generated by the iterates of such a method is a stationary point of (Unc-P) (see, for example, [23, 38, 39]). This can be well justified when a property called calmness holds at a local minimizer x* of the original problem (P).

Definition 2.5 (Calmness [12, 32]). Consider x* a local minimizer of (P). We say that problem (P) is calm at x* when one can find ρ > 0 sufficiently large such that x* is also a locally optimal solution of (Unc-P).

Thus, under the calmness hypothesis, the statement that x* is a locally optimal solution of (Unc-P) becomes a necessary optimality condition. However, this is not a legitimate necessary optimality condition, in the sense that one needs to assume more than just the fact that x* is a local minimizer of (P). That is not the case for the (weak) ε-ANOC: by the results of Theorems 2.1 and 2.2, we have shown that the (weak) ε-ANOC at x* is a legitimate necessary optimality condition. Therefore, the results presented here give alternative ways of dealing with the convergence analysis of nonsmooth optimization methods, even in the absence of calmness.

We end our theoretical analysis with a brief section presenting the relations between the ε-ANOC and the sequential optimality conditions AKKT and CAKKT [2, 4] in the smooth context.

3 ε-ANOC applied to smooth optimization

The goal of this section is to show that, under mild assumptions, the scheme of Fig. 2 holds true when the objective and constraint functions of (P) are all differentiable. For completeness, we define below the concepts of AKKT and CAKKT for the following smooth optimization problem

  min_{x ∈ R^n} f(x)  s.t.  c(x) ≤ 0,  h(x) = 0,   (P-smooth)

where f : R^n → R, c : R^n → R^p and h : R^n → R^l are all functions of class C¹.

Figure 2: Scheme of relations between the ε-ANOC and the well-known sequential optimality conditions in the smooth setting.

Definition 3.1 (AKKT, CAKKT [2, 4]). A feasible point x* of (P-smooth) is an AKKT point if there exist sequences {x^k}, {μ^k} ⊂ R^p_+ and {λ^k} ⊂ R^l such that x^k → x*, lim_{k→∞} min{−c_i(x^k), μ_i^k} = 0 for all i ∈ {1, ..., p}, and

  lim_{k→∞} ‖ ∇f(x^k) + Σ_{i=1}^p μ_i^k ∇c_i(x^k) + Σ_{j=1}^l λ_j^k ∇h_j(x^k) ‖ = 0.

If, in addition, lim_{k→∞} μ_i^k c_i(x^k) = 0 and lim_{k→∞} λ_j^k h_j(x^k) = 0 for all i ∈ {1, ..., p} and j ∈ {1, ..., l}, then we call x* a CAKKT point.

11 First, we prove that every feasible point x of any smooth optimization problem satisfying the ɛ-anoc must also be an AKKT point. Theorem 3.1. Suppose the functions c and h in (P-smooth) have locally Lipschitz continuous derivatives. If x is a feasible point of (P-smooth) and satisfies the ɛ-anoc, then x is also an AKKT point. Proof. Since x satisfies the ɛ-anoc, there must exist sequences {x } R n, {ɛ } R +, {v } R n, {µ } R p + and {λ } R l such that x x, ɛ 0 and v 0, where i / I ɛ (x ) µ i = 0 and v G ɛ f (x ) + ɛ ((µ ) T, (λ ) T ) 0, µ i G ci ɛ (x ) + l λ j Gɛ hj (x ). (14) Since the derivatives of the involved functions are locally Lipschitz continuous, it follows that v = f(x ) + µ i c i (x ) + l λ j h j (x ) + r f + µ i r c i + l λ j r h j, where r f, r c i and r h j are all error vectors of order ɛ. Since v 0, ɛ 0 and ɛ ((µ ) T, (λ ) T ) 0, we get l f(x ) + µ i c i (x ) + λ j h j (x ) 0. (15) Additionally, because i / I ɛ (x ) µ i = 0, one can see that min{ c i (x ), µ i } 0 for all i {1,..., p}. This together with (15) fulfill the conditions of an AKKT point, which completes the proof. Remar 2. It is worth noticing that the reciprocal is not true, i.e., AKKT does not imply the ɛ-anoc in the smooth case. Indeed, considering the optimization problem given in Example 1, x = 0 is an AKKT point (around x the optimization problem is smooth, that justifies applying the AKKT concept at this point). However, we have showed that x does not satisfy the ɛ-anoc, which guarantees that AKKT ɛ-anoc. Hence, our new sequential optimality condition is also stronger than AKKT in the smooth context. We now proceed with two examples showing that CAKKT and the ɛ-anoc are not connected to each other. Example 2. Let us consider the following constrained optimization problem min x f(x) = x 1 x 2 s.t. x 1 0, x 2 0, x 1 x 2 0. Although the point x = 0 is not a local minimizer for the problem, if one chooses x = (1/, 1/) T, µ 1 = µ 2 = 0 and µ 3 =, one can see that x is a CAKKT point [3]. However, we state that x does not fulfill the ɛ-anoc. Indeed, suppose, by contradiction, that x satisfies the ɛ-anoc. Recalling the arguments used to go from (14) to (15), and since the objective and constraint functions are all smooth and have locally Lipschitz continuous derivatives, there must exist {x } = {(x 1, x 2) T } and {µ } R 3 + with ɛ µ 0 such that x x and [ 1 1 ] + µ 1 [ 1 0 ] [ + µ ] + µ 3 [ x 2 x 1 ] 0. 11

By the second line of the above relation, we have that, for large values of k, the following must hold: x_1^k > 0 and μ_3^k > 1/(2 x_1^k). Since ε_k μ_3^k → 0, it yields ε_k = o(x_1^k). Hence, 1 ∉ I_{ε_k}(x^k) and, consequently, μ_1^k = 0 for large values of k. So, by the first line, we obtain μ_3^k x_2^k → −1, which assures x_2^k < 0 and μ_3^k > 1/(2 |x_2^k|) for k sufficiently large. Since μ_3^k can only be different from zero if 3 ∈ I_{ε_k}(x^k), the value ε_k must be greater than min{x_1^k, |x_2^k|} for large values of k. This contradicts the fact that ε_k ‖μ^k‖ → 0, which allows us to conclude that x* does not satisfy the ε-ANOC.

Example 3. We now consider the smooth optimization problem

  min_x f(x) = (x_2 − 2)²/2  s.t.  x_1 = 0,  x_1 x_2 = 0.

As shown in [4], the point x* = (0, 1)^T is not a CAKKT point. However, if we choose x^k = (1/k, 1)^T, λ^k = (−k, k)^T and ε_k = 1/k², we obtain that the ε-ANOC is fulfilled in the sense of Remark 1. Therefore, x* satisfies the ε-ANOC.

The above examples show that, in fact, the ε-ANOC is a new legitimate necessary optimality condition (i.e., one that does not require any kind of constraint qualification) even in the case where all functions are smooth.

4 A practical algorithm

The goal of this section is to present an algorithm that is able to generate a sequence of iterates satisfying the sequential optimality conditions introduced in the last section. Looking at relation (4) and the proof of Theorem 2.1, one can easily come up with the following algorithm: given a small ξ > 0 and ρ ∈ R sufficiently large, whenever possible, set

  x* ∈ argmin_{x ∈ R^n} Ψ_{ξ,ρ}(x).   (16)

Although conceptually simple, the above procedure cannot be used in a practical method. For optimization problems involving nonconvex functions, an implementable algorithm usually guarantees only that the method will find stationary points of the problem, not minimizers. Moreover, the set of minimizers of the function Ψ_{ξ,ρ} may be empty, so this procedure may not even be well defined. We thus present Algorithm 1 (PACNO - Penalized Algorithm for Constrained Nonsmooth Optimization), where we avoid these issues by using a solution of a relaxed version of (16) (see Remark 3 for a more detailed discussion about this algorithm): we trade the global minimization for the easier problem of Step 1 of Algorithm 1, which is concerned only with approximate stationary points.

This is sufficient to ensure that Algorithm 1 generates a sequence of iterates that verifies the ε-ANOC. However, we first need to show that the set S_Ψ^k defined in (17) will eventually be nonempty and, moreover, that the sequence {ρ_k} generated by the algorithm will stabilize at a sufficiently large value. To guarantee that S_Ψ^k ≠ ∅, we need to consider one extra assumption.

Assumption 1. There exists α > 0 such that F_α := {x ∈ R^n : ‖c(x)_+‖₁ ≤ α} is compact.

Since we are dealing with possibly nonconvex functions, it is not sufficient to ask for the compactness of the feasible set; instead, one needs to ask for the compactness of the perturbed feasible set F_α. This is required to exclude some pathological functions like the one illustrated in Fig. 3, where the feasible set is compact but there is no α > 0 ensuring the compactness of F_α. Nevertheless, the assumption can easily be ensured if the feasible points are additionally restricted by box constraints. Noticing that unbounded solutions do not occur in the majority of real problems, the user may include sufficiently large artificial box constraints to enforce the condition.

We start by showing a technical lemma that will be helpful in the subsequent results.

Algorithm 1: Penalized Algorithm for Constrained Nonsmooth Optimization

Step 0. Choose x⁰ ∈ R^n, ρ₁ ∈ [0, +∞), M ∈ (0, +∞), θ_ξ, θ_ε, θ_ν ∈ (0, 1), ω ∈ (0, 1], ξ₁, ε₁, ν₁ ∈ (0, +∞), ξ_opt ∈ [0, ξ₁], ε_opt ∈ [0, ε₁] and ν_opt ∈ [0, ν₁]. Set k = 1.

Step 1. Define

  S_Ψ^k := { x : Ψ_{ξ_k,ρ_k}(x) ≤ Ψ_{ξ_k,ρ_k}(x^{k−1}) and ‖P(0 | ∂_{ε_k} Ψ_{ξ_k,ρ_k}(x))‖ ≤ ν_k }.   (17)

If S_Ψ^k ≠ ∅, then set x^k as any point in S_Ψ^k. Otherwise, set x^k as any point x satisfying ‖c(x)_+‖₁ ≤ ‖c(x^{k−1})_+‖₁ and ‖P(0 | ∂_{ε_k} ‖c(x)_+‖₁)‖ ≤ ν_k, and go to Step 3.

Step 2. If ‖c(x^k)_+‖₁ < ξ_opt, then terminate if ε_k ≤ ε_opt and ν_k ≤ ν_opt; otherwise, set ξ_{k+1} = ξ_k and go to Step 5.

Step 3. If ‖c(x^k)_+‖₁ > ω ξ_k with ε_k ≤ ε_opt and ν_k ≤ ν_opt, then stop: the iterate x^k is close to a stationary point of the infeasibility measure.

Step 4. Set ξ_{k+1} = ξ_k − (1 − θ_ξ)[ξ_k − ‖c(x^k)_+‖₁]_+.

Step 5. If f(x^k) ≤ ρ_k − M, then set ρ_{k+1} = ρ_k. Otherwise, set ρ_{k+1} = ρ_k + 2(M + [f(x^k) − ρ_k]_+).

Step 6. Set ε_{k+1} = θ_ε ε_k, ν_{k+1} = θ_ν ν_k and k ← k + 1. Go back to Step 1.

Figure 3: An illustration of a function that does not have a compact F_α. The set F stands for the feasible region; one cannot find α > 0 that ensures the compactness of F_α.

Lemma 4.1. Let {x^k} be a sequence of iterates generated by Algorithm 1 with ξ_opt = ε_opt = ν_opt = 0. Then, there must exist a sequence {δ_k} ⊂ R_+ converging to zero such that ξ_k − ‖c(x^k)_+‖₁ ≤ δ_k. Moreover, if lim inf_{k→∞} ‖c(x^k)_+‖₁ = 0, then ξ_k → 0.

Proof. Since ξ_opt = ε_opt = ν_opt = 0, the algorithm performs infinitely many iterations. Moreover, notice that {ξ_k} is a positive, bounded, monotone decreasing sequence, which ensures that it must converge to some nonnegative real value. Hence, since ξ_k is always updated by Step 4 (when ξ_opt = 0), we have

  ξ_{k+1} = ξ_k − (1 − θ_ξ)[ξ_k − ‖c(x^k)_+‖₁]_+  ⇒  (1 − θ_ξ)[ξ_k − ‖c(x^k)_+‖₁]_+ = ξ_k − ξ_{k+1}  ⇒  lim_{k→∞} [ξ_k − ‖c(x^k)_+‖₁]_+ = 0.

In other words, there exists {δ_k} ⊂ R_+, with δ_k → 0, such that ξ_k − ‖c(x^k)_+‖₁ ≤ δ_k. So, if lim inf_{k→∞} ‖c(x^k)_+‖₁ = 0, it is straightforward to see that ξ_k → 0.

The next lemma is useful to prove that, eventually, the set S_Ψ^k will have at least one element. It gives a sufficient condition for the existence of a global minimizer of Ψ_{ξ,ρ}.

14 Lemma 4.2. Let ρ be a real number such that min f(x) ρ < 0 x F α/2 and ξ (0, α], with α being a strictly positive real number satisfying Assumption 1. function Ψ ξ,ρ defined in (5) has a global minimizer. Then, the Proof. Consider any fixed ξ (0, α]. Let us first show that Ψ ξ,ρ is bounded from below. By contradiction, let us suppose the opposite, i.e., we can find a sequence {x } such that Ψ ξ,ρ (x ). Since is a negative number, we must have, for all N, that c(x ) + 1 ξ, since, otherwise, we would have Ψ ξ,ρ (x ) = c(x ) Because ξ (0, α], this means that x F α, for all N. However, Ψ ξ,ρ is a continuous function, which yields that it must assume a minimum value on F α. This is a contradiction with the fact that Ψ ξ,ρ (x ), for all N. Therefore, the function Ψ ξ,ρ must be bounded. Now, let us then consider a sequence {z } such that Ψ ξ,ρ (z ) inf x R n Ψ ξ,ρ (x). By hypothesis, we now that there exists ˆx F α/2 such that f(ˆx) ρ < 0. This implies that inf Ψ ξ,ρ(x) Ψ ξ,ρ (ˆx) c(ˆx) + 1. x R n Therefore, for any large N, we must have Ψ ξ,ρ (z ) 2 c(ˆx) + 1. Since, ˆx F α/2, it implies that Ψ ξ,ρ (z ) α. Recalling that ξ α, we must have z F α, for any large N. So, because F α is compact, one can ensure that there exists a subsequence of {z } converging to z F α such that Ψ ξ,ρ (z) = inf x R n Ψ ξ,ρ (x), which ends the proof. The following result is a technical lemma and it proves that c(x ) + 1 will be related with the value ξ and the infeasibility measure of the past iteration. It also ensures that our method will not suffer from the greediness phenomenon. Lemma 4.3. Let N be any iteration of Algorithm 1. Then, c(x ) + 1 max { ξ, c(x 1 ) + 1 }. Proof. By contradiction, suppose that there exists an iteration such that c(x ) + 1 > max { ξ, c(x 1 ) + 1 }. (18) Then, because of Step 1, we have x S Ψ. Therefore, Ψ ξ,ρ (x ) Ψ ξ,ρ (x 1 ). Moreover, because of Step 5, one can see that f(x 1 ) ρ < 0, which implies Ψ ξ,ρ (x ) Ψ ξ,ρ (x 1 ) c(x 1 ) + 1. Thus, by (18), we have Ψ ξ,ρ (x ) < c(x ) + 1. Consequently, the above inequality can only be true in the case that c(x ) + 1 < ξ, which contradicts (18). The next result tells us that, when the sequence of iterates approaches the feasible set, then, eventually, the set S Ψ will be nonempty and the sequence {ρ } will stabilize. Lemma 4.4. Let {x } be a sequence of iterates generated by Algorithm 1 with ξ opt = ɛ opt = ν opt = 0. If lim inf c(x ) + 1 = 0, then there exist a sufficiently large ˆ N and ρ R such that, for all ˆ, we have S Ψ and ρ = ρ. Proof. First, let us prove that S Ψ. Since, lim inf c(x ) + 1 = 0, we must have an iteration ˆ such that c(xˆ) + 1 α/2, where α is the positive real number satisfying Assumption 1. Consequently, by Step 5 of the algorithm, one can see that min x F α/2 f(x) ρ f(xˆ) ρ < 0, for any > ˆ. 14

15 Additionally, Lemma 4.1 tells us that ξ 0. So, by Lemma 4.2, we see, for any large N, that S Ψ. Let us now prove that there exists ρ R such that ρ = ρ, for any large N. By contradiction, let us assume that the statement is false. This means, by the way the algorithm was designed, that there exists an infinite index set K N such that f(x ) ρ > M, for all K, (19) implying that ρ. Because lim inf c(x ) + 1 = 0, it means that there exists an iteration ˆ satisfying c(xˆ) + 1 α and ξˆ α (because of Lemma 4.1). Consequently, due to Lemma 4.3 and recalling that {ξ } is a monotone decreasing sequence, we have c(x ) + 1 α, i.e., x F α, for all ˆ. However, by hypothesis, F α is compact, which, together with the assumption ρ, implies ρ max x F α f(x) + M f(x ) + M, for any large N. (20) This is a contradiction with (19). Therefore, there must exist ρ such that ρ = ρ, for any large N. We are now ready to present the convergence theorem of Algorithm 1. Because the nonsmooth optimization problem in hand might involve nonconvex functions, the result ensures that a cluster point x of the iterate sequence {x } can either be a stationary point for the infeasibility measure c( ) + 1, or satisfies the ɛ-anoc. Theorem 4.1. If {x } is the sequence of iterates generated by Algorithm 1 with ξ opt = ɛ opt = ν opt = 0 and θ ɛ (0, θ ξ ), then, given any infinite index set K N such that x x, for some K x R n, one of the following statements must be true: a) x satisfies 0 c(x ) + 1 ; b) x is a feasible point for (P) and it satisfies the ɛ-anoc. Moreover, there exists ρ R such that ρ = ρ, for any large N. Proof. We start by noticing that, since ξ opt = ɛ opt = ν opt = 0, Algorithm 1 in fact generates an infinite sequence of points {x } R n. Moreover, ξ is always updated by Step 4 (since ξ opt = 0), hence ξ +1 = ξ (1 θ ξ )[ξ c(x ) + 1 ] + θ ξ ξ. Therefore, because we suppose θ ɛ (0, θ ξ ), it yields ɛ +1 η ɛ η 2 ɛ 1... η ɛ 1, where η = θ ɛ < 1, ξ +1 ξ ξ 1 ξ 1 θ ξ which ensures ɛ /ξ 0. We then divide the proof in the following cases: i) lim inf K c(x ) + 1 /ξ 1 and there exists an infinite index set ˆK K such that S Ψ =, for all ˆK; ii) lim inf K c(x ) + 1 /ξ 1 and SΨ, for all K sufficiently large; iii) lim inf K c(x ) + 1 /ξ < 1. Suppose i) holds. By Step 1, it follows that P ( ) 0 ɛ c(x ) Consequently, since ˆK ɛ 0, [40, Lemma 3.2 (iii)] guarantees that 0 c(x ) + 1. Assume now that case ii) holds. This yields { } max 1 c(x ) + 1, 0 0. (21) ξ K 15

16 Additionally, by the way Algorithm 1 updates the value ρ, we must have ρ > f(x ) + M, for any large iteration. Recalling that ν 0, ɛ 0 (because ν opt = 0 and ɛ opt = 0), SΨ for any large iteration K, and the result of Lemma 2.1, it follows, for any large K, that there exists {x,j } n+1 B(x, ɛ ), and λ R n+1 with e T λ = 1, such that there exists r Ψ ξ,ρ (x ) with r 0 and n+1 r λ j ( { } ) max 1 c(x,j ) + 1, 0 f(x,j ) + σ j c(x,j ) + 1, (22) ξ where σj 1, j {1,..., n + 1}. Due to (21), x x,j ɛ, and ɛ /ξ 0, we get, for any j {1,..., n + 1}, { } max 1 c(x,j ) + 1, 0 ξ 0. K The above limit together with (22) tell us that P(0 ɛ c(x ) + 1 ) K 0. So, since ɛ 0, we obtain 0 c(x ) + 1 due to [40, Lemma 3.2 (iii)]. Finally, we assume the validity of case iii), and we claim c(x ) Indeed, because {ξ } is a monotone decreasing and bounded sequence, it must follow that ξ ξ, for some ξ 0. According to Lemma 4.1, there exists {δ } R + satisfying δ 0 such that ξ c(x ) δ 1 lim inf K c(x ) + 1 ξ + lim inf K δ δ lim inf > 0. ξ K ξ This yields ξ = 0. However, by the way we have designed our algorithm, this only happens if there exists an infinite index set ˆK N with c(x ) Therefore, given any τ > 0, there exists ˆ such that c(xˆ) + 1 < τ and ξˆ < τ. Because of Lemma 4.3, this ensures ˆ c(x ) + τ. Since τ > 0 is arbitrary, we obtain c(x ) + 1 0, which guarantees that x is a feasible point. Then, Lemma 4.4 assures that, for all large enough, we have S Ψ and ρ = ρ for some ρ R. Notice that Step 5 ensures f(x ) ρ M. This information, together with Step 1, Lemma 2.2, the limit ɛ /ξ 0 and the reasoning used in the proof of Theorem 2.2, prove that the cluster point x satisfies the ɛ-anoc. ˆK Remar 3. Some comments regarding Algorithm 1 are in order: (a) The technique based on sampling points to approximate G ɛ (x) that originated the method nown as Gradient Sampling (GS) [16, 40], and its recent variants [24, 25, 33, 34, 35, 44] have shown to be effective tools for minimizing nonsmooth and nonconvex functions. The good results of those methods assure that Step 1 is not just an idealized step, but it can be performed in practice 1. In addition, BFGS techniques, which are not guaranteed to converge but wor well in practice, may be used as well [22, 42]. Both possibilities were considered in Section 5. (b) Looing carefully at the proof of Theorem 4.1, we can understand the importance of Step 3 inside Algorithm 1. When c(x ) + 1 /ξ is greater than or approaching one, this implies that we are losing information about the objective function along the iterations, meaning that the method is tending to a stationary point of the infeasibility measure. Therefore, this suggests that one should choose ω 1. (c) For the case that problem (P) satisfies calmness [12] and ρ is large enough, every local minimizer of (P) is also a local minimizer of Ψ ξ,ρ for every ξ small enough (this can be easily proven by following the same reasoning used in the proof of [18, Proposition 6.4.3]). This ensures that, in many cases, PACNO will find a good approximation to the solution of the problem without needing to bring ξ down to zero. 1 Assuming that f and c i, i {1,..., p}, are continuously differentiable in full-measure open subsets of R n, all hypotheses required by the convergence theory of the GS method are satisfied. 16
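The outer loop of Algorithm 1 is easy to transcribe once an inner solver for Step 1 is available. The sketch below is a simplified reading of the listing above, not the authors' implementation: `inner_solver` stands for any routine that, given Ψ_{ξ_k,ρ_k}, the reference point x^{k−1}, the radius ε_k and the tolerance ν_k, tries to return a point of S_Ψ^k and a success flag, while `infeas_solver` plays the analogous role for the infeasibility measure. Both callbacks, and the simple success flag they return, are assumptions of this sketch.

```python
import numpy as np

def infeasibility(c_val):
    """||c(x)_+||_1."""
    return float(np.sum(np.maximum(c_val, 0.0)))

def psi(x, f, c, xi, rho):
    """The penalty function (5)."""
    viol = infeasibility(c(x))
    return max(1.0 - viol / xi, 0.0) * (f(x) - rho) + viol

def pacno(f, c, x0, inner_solver, infeas_solver,
          rho=0.0, M=10.0, theta_xi=0.5, theta_eps=0.25, theta_nu=0.5, omega=0.99,
          xi=1.0, eps=1.0, nu=1.0, xi_opt=1e-6, eps_opt=1e-6, nu_opt=1e-6, max_it=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_it):
        # Step 1: try to find a point of S_Psi; on failure, only reduce the infeasibility.
        x_new, ok = inner_solver(lambda z: psi(z, f, c, xi, rho), x, eps, nu)
        if not ok:
            x_new, _ = infeas_solver(lambda z: infeasibility(c(z)), x, eps, nu)
        x = x_new
        viol = infeasibility(c(x))
        if ok and viol < xi_opt:
            # Step 2: almost feasible iterate; terminate only when the tolerances are tight.
            if eps <= eps_opt and nu <= nu_opt:
                return x, 'approximate solution'
            # otherwise keep xi and fall through to Step 5
        else:
            # Step 3: near-stationarity of the infeasibility measure.
            if viol > omega * xi and eps <= eps_opt and nu <= nu_opt:
                return x, 'near-stationary for the infeasibility measure'
            # Step 4: shrink the infeasibility target toward the current violation.
            xi = xi - (1.0 - theta_xi) * max(xi - viol, 0.0)
        # Step 5: keep rho well above the current objective value.
        if f(x) > rho - M:
            rho = rho + 2.0 * (M + max(f(x) - rho, 0.0))
        # Step 6: tighten the sampling radius and the stationarity tolerance.
        eps, nu = theta_eps * eps, theta_nu * nu
    return x, 'iteration limit'
```

In the spirit of Remark 3(a), the `inner_solver` placeholder is where a Gradient Sampling or BFGS routine would plug in; any routine that returns a point satisfying the two membership tests of (17), together with a flag saying whether it succeeded, fits this interface.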

5 Numerical results

This section illustrates different properties of PACNO. Subsection 5.1 is devoted to showing that, since our method is based on a legitimate necessary optimality condition (i.e., our convergence result does not depend on any kind of constraint qualification), PACNO may achieve good convergence to the solution of the nonsmooth optimization problem even in the absence of calmness. Additionally, we exhibit a practical example in which PACNO prevents the greediness phenomenon. Subsection 5.2 reveals that our method may converge even when the optimization problem does not satisfy all of our convergence hypotheses. Finally, Subsection 5.3 aims to illustrate that Step 1 of PACNO may accept different solvers to find a stationary point of Ψ_{ξ,ρ}. The tests were performed on a Dell Latitude 7490 notebook, with an Intel Core i7-8650U processor, CPU 2.11GHz, and 16GB RAM (64-bit), using Matlab R2018a.

5.1 Nonsmooth optimization in the absence of calmness

In the previous sections, we have discussed the theoretical benefits of applying the penalization strategy used in Ψ_{ξ,ρ}, rather than the exact penalization approach, to produce a practical necessary optimality condition. However, one can wonder whether our penalization idea has any practical advantage when compared to the standard exact penalization procedure. Aiming to elucidate this matter, we present a simple nonsmooth optimization problem:

  min_{x₁,x₂} f(x₁, x₂) = max{x₁³ − x₂, x₂}  s.t.  c(x₁, x₂) = (x₁ − 10)² ≤ 0.   (23)

The optimal solution of this problem is x* = (10, 500)^T, with optimal value f* = 500. Due to our choice of the constraint function defining the feasible set F = {(x₁, x₂) ∈ R² : x₁ = 10}, the calmness property does not hold at the optimal point. This has a great impact on the behavior of methods based on the exact penalization approach, since it implies that one cannot solve the original problem with a finite penalty parameter [12]. In case this parameter value is allowed to go to infinity, this usually produces a sequence of points with poor precision in the optimal function value.

To illustrate how a method based on exact penalization behaves in the absence of calmness, we have solved the nonsmooth problem (23) using SLQP-GS (version 1.3) [23] in its default mode, with the exception that we have allowed the SLQP-GS algorithm to run 10⁴ iterations, in order to observe the optimal value precision that such a method is able to reach. In a similar way, the GS algorithm [16, 40] was chosen as our internal solver for Step 1 (a version that we call PACNO_GS), and we have used the following parameter values for PACNO_GS: ρ₁ = 0, M = 10, θ_ξ = 0.5, θ_ε = 0.25, θ_ν = 0.5, ω = 0.99, ξ₁ = ε₁ = ν₁ = 1 and ξ_opt = ε_opt = ν_opt = . In addition, for each call of GS, the sampling radius was set first as min{10ε_k, 1}, with optimality tolerance given by min{10ν_k, 1}, and then min{ε_k, 1} as the final sampling radius, with optimality tolerance min{ν_k, 1}.

The results are shown in Figs. 4 and 5, so that one can follow the reached distributions of the infeasibility measure in (a), and of the relative errors ((b) in the domain, and (c) in the image space). Because the SLQP-GS method relies on exact penalization and, additionally, its convergence theory is established only when the nonsmooth problem satisfies the calmness property at the solution, it was expected that the precision achieved by the method regarding the optimal function value would not be satisfactory.
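For readers who want to reproduce the setting outside Matlab, problem (23) is easy to state in code. The snippet below is an illustration only (the experiments reported here used the Matlab GS/SLQP-GS implementations): it gives f, c and one Clarke subgradient of f, which is all a sampling-based inner solver needs, and it also shows how Ψ_{ξ,ρ} ignores the objective outside the infeasibility tolerance ξ, which is the mechanism behind the greediness control discussed next.

```python
import numpy as np

def f23(x):
    """Objective of (23): f(x1, x2) = max{x1^3 - x2, x2}."""
    return max(x[0] ** 3 - x[1], x[1])

def f23_subgrad(x):
    """One Clarke subgradient of f at x: the gradient of an active piece."""
    if x[0] ** 3 - x[1] >= x[1]:
        return np.array([3.0 * x[0] ** 2, -1.0])
    return np.array([0.0, 1.0])

def c23(x):
    """Constraint of (23): c(x1, x2) = (x1 - 10)^2 <= 0, i.e. x1 = 10 (calmness fails here)."""
    return np.array([(x[0] - 10.0) ** 2])

def psi(x, f, c, xi, rho):
    """The penalty function (5) used by PACNO."""
    viol = float(np.sum(np.maximum(c(x), 0.0)))
    return max(1.0 - viol / xi, 0.0) * (f(x) - rho) + viol

x_star = np.array([10.0, 500.0])
print(f23(x_star), c23(x_star)[0])    # 500.0 and 0.0: the reported optimum of (23)

# At an infeasible point with a very low objective value, Psi only sees the violation,
# so such points cannot attract the iterates (no greediness):
x_bad = np.array([-100.0, -5.0e5])    # f23(x_bad) = -5e5, ||c(x_bad)_+||_1 = 12100
print(psi(x_bad, f23, c23, xi=1.0, rho=510.0))   # 12100.0, independent of f23(x_bad)
```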
On the other hand, our algorithm possesses a convergence theory even in the absence of calmness and, as anticipated, the precision obtained for the optimality measure is considerably better. In addition, the PACNO_GS algorithm is able to keep the iterates closer to the feasible set. Nevertheless, one can easily represent the feasible set F in a manner such that the calmness property is satisfied. Indeed, we can consider the constraint |x₁ − 10| ≤ 0 instead of (x₁ − 10)² ≤ 0 (see

18 (a) c(x SLQP) + 1 (b) xslqp x x (c) f(xslqp) f f Figure 4: Boxplots of the results of twenty runs (depicted as ) of the SLQP-GS method with different initial points chosen in a box [ 5, 5] 2 centered at the optimal solution of problem (23). The last iterate obtained by the SLQP-GS method is represented by x SLQP. (a) c(x PACNO) + 1 (b) xpacno x x (c) f(xpacno) f f Figure 5: Boxplots of the result of twenty runs (depicted as ) of the PACNO GS method with different initial points chosen in a box [ 5, 5] 2 centered at the optimal solution of problem (23). The last iterate obtained by the PACNO GS method is represented by x PACNO. Figs. 6 and 7). For this new optimization problem, the penalty parameter in the exact penalization approach no longer must go to infinity. However, methods based on exact penalization may experience another undesirable behavior: by initializing the method with an inappropriate value for the penalty parameter, such methods may present the greediness phenomenon. This anomaly can be seen when one tries to solve this new nonsmooth problem with the SLQP-GS method. In many runs, due to the initial value of the penalty parameter in the standard configuration of the SLQP-GS algorithm, the method gives excessive importance to the objective function in the first iterations, which carries the iterates away from the feasible set, preventing the method to converge even when one allows a large number of iterations to be performed (10 4 iterations). However, it is worth mentioning that, if the user sets a better scaled penalty parameter value, the SLQP-GS algorithm will easily converge to the solution point. In contrast, the PACNO GS algorithm is able to reach the solution without needing to bring the parameter ξ down to zero (in the twenty runs that were performed, the mean value of ξ was ept above 10 3 ), and, due to Lemma 4.3, the good behavior of PACNO GS is not subjected to a tuned value of ξ. 5.2 Bilevel optimization via a nonsmooth approach One of the most usual ways to solve a bilevel optimization problem [19] is to consider a mathematical problem with equilibrium constraints (MPEC) instead of the original multilevel instance. However, 18

19 (a) c(x SLQP) + 1 (b) xslqp x x (c) f(xslqp) f f Figure 6: Boxplots of the result of twenty runs (depicted as ) of the SLQP-GS method with different initial points chosen in a box [ 5, 5] 2 centered at the optimal solution of problem (23). The last iterate obtained by the SLQP-GS method is represented by x SLQP. (a) c(x PACNO) + 1 (b) xpacno x x (c) f(xpacno) f f Figure 7: Boxplots of the result of twenty runs (depicted as ) of the PACNO GS method with different initial points chosen in a box [ 5, 5] 2 centered at the optimal solution of problem (23). The last iterate obtained by the PACNO GS method is represented by x PACNO. there are situations in which this approach may generate an MPEC that is not equivalent to the original problem. For example, the authors of [26] show that the following minimization min x,y x s.t. x 0, y Λ(x), with Λ(x) := argmin y {y 1 : y 2 1 y 2 x, y y 2 0}, may generate an MPEC that does not possess an optimal solution when one considers the KKT conditions of the minimization problem related to the definition of Λ(x). Indeed, consider the following nonsmooth single level optimization problem in which the first three constraints are associated with the KKT conditions of the implicit optimization problem presented in Λ(x), min x,y,λ x s.t. 2λ 1 y 1 + 2λ 2 y λ 1 + λ 2 0 min{ y1 2 + y 2 + x, λ 1 } 2 + min{ y1 2 y 2, λ 2 } 2 0 x 0. (24) One can see that, for every x > 0, it is always possible to find y(x) and λ(x) such that (x, y(x), λ(x)) is a feasible point. Although the objective function value of this problem converges to zero when x 0, the feasible set is empty for x = 0. This indicates that the MPEC instance (24) does not possess 19

More information

Nonlinear Programming, Elastic Mode, SQP, MPEC, MPCC, complementarity

Nonlinear Programming, Elastic Mode, SQP, MPEC, MPCC, complementarity Preprint ANL/MCS-P864-1200 ON USING THE ELASTIC MODE IN NONLINEAR PROGRAMMING APPROACHES TO MATHEMATICAL PROGRAMS WITH COMPLEMENTARITY CONSTRAINTS MIHAI ANITESCU Abstract. We investigate the possibility

More information

LIMITED MEMORY BUNDLE METHOD FOR LARGE BOUND CONSTRAINED NONSMOOTH OPTIMIZATION: CONVERGENCE ANALYSIS

LIMITED MEMORY BUNDLE METHOD FOR LARGE BOUND CONSTRAINED NONSMOOTH OPTIMIZATION: CONVERGENCE ANALYSIS LIMITED MEMORY BUNDLE METHOD FOR LARGE BOUND CONSTRAINED NONSMOOTH OPTIMIZATION: CONVERGENCE ANALYSIS Napsu Karmitsa 1 Marko M. Mäkelä 2 Department of Mathematics, University of Turku, FI-20014 Turku,

More information

INTERIOR-POINT METHODS FOR NONCONVEX NONLINEAR PROGRAMMING: CONVERGENCE ANALYSIS AND COMPUTATIONAL PERFORMANCE

INTERIOR-POINT METHODS FOR NONCONVEX NONLINEAR PROGRAMMING: CONVERGENCE ANALYSIS AND COMPUTATIONAL PERFORMANCE INTERIOR-POINT METHODS FOR NONCONVEX NONLINEAR PROGRAMMING: CONVERGENCE ANALYSIS AND COMPUTATIONAL PERFORMANCE HANDE Y. BENSON, ARUN SEN, AND DAVID F. SHANNO Abstract. In this paper, we present global

More information

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30,

More information

Lecture 13: Constrained optimization

Lecture 13: Constrained optimization 2010-12-03 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems

More information

A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30, 2014 Abstract Regularized

More information

Sequential Unconstrained Minimization: A Survey

Sequential Unconstrained Minimization: A Survey Sequential Unconstrained Minimization: A Survey Charles L. Byrne February 21, 2013 Abstract The problem is to minimize a function f : X (, ], over a non-empty subset C of X, where X is an arbitrary set.

More information

An inexact subgradient algorithm for Equilibrium Problems

An inexact subgradient algorithm for Equilibrium Problems Volume 30, N. 1, pp. 91 107, 2011 Copyright 2011 SBMAC ISSN 0101-8205 www.scielo.br/cam An inexact subgradient algorithm for Equilibrium Problems PAULO SANTOS 1 and SUSANA SCHEIMBERG 2 1 DM, UFPI, Teresina,

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

Implications of the Constant Rank Constraint Qualification

Implications of the Constant Rank Constraint Qualification Mathematical Programming manuscript No. (will be inserted by the editor) Implications of the Constant Rank Constraint Qualification Shu Lu Received: date / Accepted: date Abstract This paper investigates

More information

Sequential equality-constrained optimization for nonlinear programming

Sequential equality-constrained optimization for nonlinear programming Sequential equality-constrained optimization for nonlinear programming E. G. Birgin L. F. Bueno J. M. Martínez September 9, 2015 Abstract A novel idea is proposed for solving optimization problems with

More information

1. Introduction. We consider the following constrained optimization problem:

1. Introduction. We consider the following constrained optimization problem: SIAM J. OPTIM. Vol. 26, No. 3, pp. 1465 1492 c 2016 Society for Industrial and Applied Mathematics PENALTY METHODS FOR A CLASS OF NON-LIPSCHITZ OPTIMIZATION PROBLEMS XIAOJUN CHEN, ZHAOSONG LU, AND TING

More information

AN INTERIOR-POINT METHOD FOR NONLINEAR OPTIMIZATION PROBLEMS WITH LOCATABLE AND SEPARABLE NONSMOOTHNESS

AN INTERIOR-POINT METHOD FOR NONLINEAR OPTIMIZATION PROBLEMS WITH LOCATABLE AND SEPARABLE NONSMOOTHNESS AN INTERIOR-POINT METHOD FOR NONLINEAR OPTIMIZATION PROBLEMS WITH LOCATABLE AND SEPARABLE NONSMOOTHNESS MARTIN SCHMIDT Abstract. Many real-world optimization models comse nonconvex and nonlinear as well

More information

Algorithms for Constrained Optimization

Algorithms for Constrained Optimization 1 / 42 Algorithms for Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University April 19, 2015 2 / 42 Outline 1. Convergence 2. Sequential quadratic

More information

Date: July 5, Contents

Date: July 5, Contents 2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........

More information

On the minimization of possibly discontinuous functions by means of pointwise approximations

On the minimization of possibly discontinuous functions by means of pointwise approximations On the minimization of possibly discontinuous functions by means of pointwise approximations E. G. Birgin N. Krejić J. M. Martínez April 5, 2016 Abstract A general approach for the solution of possibly

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

ON AUGMENTED LAGRANGIAN METHODS WITH GENERAL LOWER-LEVEL CONSTRAINTS. 1. Introduction. Many practical optimization problems have the form (1.

ON AUGMENTED LAGRANGIAN METHODS WITH GENERAL LOWER-LEVEL CONSTRAINTS. 1. Introduction. Many practical optimization problems have the form (1. ON AUGMENTED LAGRANGIAN METHODS WITH GENERAL LOWER-LEVEL CONSTRAINTS R. ANDREANI, E. G. BIRGIN, J. M. MARTíNEZ, AND M. L. SCHUVERDT Abstract. Augmented Lagrangian methods with general lower-level constraints

More information

The Relation Between Pseudonormality and Quasiregularity in Constrained Optimization 1

The Relation Between Pseudonormality and Quasiregularity in Constrained Optimization 1 October 2003 The Relation Between Pseudonormality and Quasiregularity in Constrained Optimization 1 by Asuman E. Ozdaglar and Dimitri P. Bertsekas 2 Abstract We consider optimization problems with equality,

More information

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information

A SHIFTED PRIMAL-DUAL INTERIOR METHOD FOR NONLINEAR OPTIMIZATION

A SHIFTED PRIMAL-DUAL INTERIOR METHOD FOR NONLINEAR OPTIMIZATION A SHIFTED RIMAL-DUAL INTERIOR METHOD FOR NONLINEAR OTIMIZATION hilip E. Gill Vyacheslav Kungurtsev Daniel. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-18-1 February 1, 2018

More information

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Zhaosong Lu October 5, 2012 (Revised: June 3, 2013; September 17, 2013) Abstract In this paper we study

More information

Technische Universität Dresden Herausgeber: Der Rektor

Technische Universität Dresden Herausgeber: Der Rektor Als Manuskript gedruckt Technische Universität Dresden Herausgeber: Der Rektor The Gradient of the Squared Residual as Error Bound an Application to Karush-Kuhn-Tucker Systems Andreas Fischer MATH-NM-13-2002

More information

Math 273a: Optimization Subgradients of convex functions

Math 273a: Optimization Subgradients of convex functions Math 273a: Optimization Subgradients of convex functions Made by: Damek Davis Edited by Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com 1 / 42 Subgradients Assumptions

More information

Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods

Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods Coordinate Update Algorithm Short Course Subgradients and Subgradient Methods Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 30 Notation f : H R { } is a closed proper convex function domf := {x R n

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3. Optimization Problems and Iterative Algorithms Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

More information

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints First-order optimality conditions for mathematical programs with second-order cone complementarity constraints Jane J. Ye Jinchuan Zhou Abstract In this paper we consider a mathematical program with second-order

More information

Chapter 2 Convex Analysis

Chapter 2 Convex Analysis Chapter 2 Convex Analysis The theory of nonsmooth analysis is based on convex analysis. Thus, we start this chapter by giving basic concepts and results of convexity (for further readings see also [202,

More information

Evaluation complexity for nonlinear constrained optimization using unscaled KKT conditions and high-order models by E. G. Birgin, J. L. Gardenghi, J. M. Martínez, S. A. Santos and Ph. L. Toint Report NAXYS-08-2015

More information

On proximal-like methods for equilibrium programming

On proximal-like methods for equilibrium programming On proximal-lie methods for equilibrium programming Nils Langenberg Department of Mathematics, University of Trier 54286 Trier, Germany, langenberg@uni-trier.de Abstract In [?] Flam and Antipin discussed

More information

NOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES

NOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES NOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES JONATHAN LUK These notes discuss theorems on the existence, uniqueness and extension of solutions for ODEs. None of these results are original. The proofs

More information

A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees

A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees Noname manuscript No. (will be inserted by the editor) A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees Frank E. Curtis Xiaocun Que May 26, 2014 Abstract

More information

Zangwill s Global Convergence Theorem

Zangwill s Global Convergence Theorem Zangwill s Global Convergence Theorem A theory of global convergence has been given by Zangwill 1. This theory involves the notion of a set-valued mapping, or point-to-set mapping. Definition 1.1 Given

More information

U e = E (U\E) e E e + U\E e. (1.6)

U e = E (U\E) e E e + U\E e. (1.6) 12 1 Lebesgue Measure 1.2 Lebesgue Measure In Section 1.1 we defined the exterior Lebesgue measure of every subset of R d. Unfortunately, a major disadvantage of exterior measure is that it does not satisfy

More information

A smoothing augmented Lagrangian method for solving simple bilevel programs

A smoothing augmented Lagrangian method for solving simple bilevel programs A smoothing augmented Lagrangian method for solving simple bilevel programs Mengwei Xu and Jane J. Ye Dedicated to Masao Fukushima in honor of his 65th birthday Abstract. In this paper, we design a numerical

More information

The exact absolute value penalty function method for identifying strict global minima of order m in nonconvex nonsmooth programming

The exact absolute value penalty function method for identifying strict global minima of order m in nonconvex nonsmooth programming Optim Lett (2016 10:1561 1576 DOI 10.1007/s11590-015-0967-3 ORIGINAL PAPER The exact absolute value penalty function method for identifying strict global minima of order m in nonconvex nonsmooth programming

More information

Complementarity Formulations of l 0 -norm Optimization Problems

Complementarity Formulations of l 0 -norm Optimization Problems Complementarity Formulations of l 0 -norm Optimization Problems Mingbin Feng, John E. Mitchell,Jong-Shi Pang, Xin Shen, Andreas Wächter Original submission: September 23, 2013. Revised January 8, 2015

More information

An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints

An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints Klaus Schittkowski Department of Computer Science, University of Bayreuth 95440 Bayreuth, Germany e-mail:

More information

Mingbin Feng, John E. Mitchell, Jong-Shi Pang, Xin Shen, Andreas Wächter

Mingbin Feng, John E. Mitchell, Jong-Shi Pang, Xin Shen, Andreas Wächter Complementarity Formulations of l 0 -norm Optimization Problems 1 Mingbin Feng, John E. Mitchell, Jong-Shi Pang, Xin Shen, Andreas Wächter Abstract: In a number of application areas, it is desirable to

More information

Complementarity Formulations of l 0 -norm Optimization Problems

Complementarity Formulations of l 0 -norm Optimization Problems Complementarity Formulations of l 0 -norm Optimization Problems Mingbin Feng, John E. Mitchell, Jong-Shi Pang, Xin Shen, Andreas Wächter May 17, 2016 Abstract In a number of application areas, it is desirable

More information

On the complexity of an Inexact Restoration method for constrained optimization

On the complexity of an Inexact Restoration method for constrained optimization On the complexity of an Inexact Restoration method for constrained optimization L. F. Bueno J. M. Martínez September 18, 2018 Abstract Recent papers indicate that some algorithms for constrained optimization

More information

Optimality Conditions for Nonsmooth Convex Optimization

Optimality Conditions for Nonsmooth Convex Optimization Optimality Conditions for Nonsmooth Convex Optimization Sangkyun Lee Oct 22, 2014 Let us consider a convex function f : R n R, where R is the extended real field, R := R {, + }, which is proper (f never

More information

Algorithms for Nonsmooth Optimization

Algorithms for Nonsmooth Optimization Algorithms for Nonsmooth Optimization Frank E. Curtis, Lehigh University presented at Center for Optimization and Statistical Learning, Northwestern University 2 March 2018 Algorithms for Nonsmooth Optimization

More information

Penalty and Barrier Methods General classical constrained minimization problem minimize f(x) subject to g(x) 0 h(x) =0 Penalty methods are motivated by the desire to use unconstrained optimization techniques

More information

PDE-Constrained and Nonsmooth Optimization

PDE-Constrained and Nonsmooth Optimization Frank E. Curtis October 1, 2009 Outline PDE-Constrained Optimization Introduction Newton s method Inexactness Results Summary and future work Nonsmooth Optimization Sequential quadratic programming (SQP)

More information

Mathematics 530. Practice Problems. n + 1 }

Mathematics 530. Practice Problems. n + 1 } Department of Mathematical Sciences University of Delaware Prof. T. Angell October 19, 2015 Mathematics 530 Practice Problems 1. Recall that an indifference relation on a partially ordered set is defined

More information

Some new facts about sequential quadratic programming methods employing second derivatives

Some new facts about sequential quadratic programming methods employing second derivatives To appear in Optimization Methods and Software Vol. 00, No. 00, Month 20XX, 1 24 Some new facts about sequential quadratic programming methods employing second derivatives A.F. Izmailov a and M.V. Solodov

More information

Some Remarks on Constrained Optimization. José Mario Martínez

Some Remarks on Constrained Optimization. José Mario Martínez José Mario Martínez wwwimeunicampbr/ martinez Department of Applied Mathematics, University of Campinas, Brazil 2011 Collaborators: Roberto Andreani (Unicamp), Ernesto Birgin (USP), Laura Schuverdt (La

More information

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS A Dissertation Submitted For The Award of the Degree of Master of Philosophy in Mathematics Neelam Patel School of Mathematics

More information

Algorithms for constrained local optimization

Algorithms for constrained local optimization Algorithms for constrained local optimization Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Algorithms for constrained local optimization p. Feasible direction methods Algorithms for constrained

More information

Necessary optimality conditions for optimal control problems with nonsmooth mixed state and control constraints

Necessary optimality conditions for optimal control problems with nonsmooth mixed state and control constraints Necessary optimality conditions for optimal control problems with nonsmooth mixed state and control constraints An Li and Jane J. Ye Abstract. In this paper we study an optimal control problem with nonsmooth

More information

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.

More information

Some Inexact Hybrid Proximal Augmented Lagrangian Algorithms

Some Inexact Hybrid Proximal Augmented Lagrangian Algorithms Some Inexact Hybrid Proximal Augmented Lagrangian Algorithms Carlos Humes Jr. a, Benar F. Svaiter b, Paulo J. S. Silva a, a Dept. of Computer Science, University of São Paulo, Brazil Email: {humes,rsilva}@ime.usp.br

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

1 Lyapunov theory of stability

1 Lyapunov theory of stability M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability

More information

AM 205: lecture 19. Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods

AM 205: lecture 19. Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods AM 205: lecture 19 Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods Quasi-Newton Methods General form of quasi-newton methods: x k+1 = x k α

More information

How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization

How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization Frank E. Curtis Department of Industrial and Systems Engineering, Lehigh University Daniel P. Robinson Department

More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information

Sequential equality-constrained optimization for nonlinear programming

Sequential equality-constrained optimization for nonlinear programming Sequential equality-constrained optimization for nonlinear programming E. G. Birgin L. F. Bueno J. M. Martínez September 9, 2015 Abstract A new method is proposed for solving optimization problems with

More information

On Penalty and Gap Function Methods for Bilevel Equilibrium Problems

On Penalty and Gap Function Methods for Bilevel Equilibrium Problems On Penalty and Gap Function Methods for Bilevel Equilibrium Problems Bui Van Dinh 1 and Le Dung Muu 2 1 Faculty of Information Technology, Le Quy Don Technical University, Hanoi, Vietnam 2 Institute of

More information

STATIONARITY RESULTS FOR GENERATING SET SEARCH FOR LINEARLY CONSTRAINED OPTIMIZATION

STATIONARITY RESULTS FOR GENERATING SET SEARCH FOR LINEARLY CONSTRAINED OPTIMIZATION STATIONARITY RESULTS FOR GENERATING SET SEARCH FOR LINEARLY CONSTRAINED OPTIMIZATION TAMARA G. KOLDA, ROBERT MICHAEL LEWIS, AND VIRGINIA TORCZON Abstract. We present a new generating set search (GSS) approach

More information

ON LICQ AND THE UNIQUENESS OF LAGRANGE MULTIPLIERS

ON LICQ AND THE UNIQUENESS OF LAGRANGE MULTIPLIERS ON LICQ AND THE UNIQUENESS OF LAGRANGE MULTIPLIERS GERD WACHSMUTH Abstract. Kyparisis proved in 1985 that a strict version of the Mangasarian- Fromovitz constraint qualification (MFCQ) is equivalent to

More information

Set, functions and Euclidean space. Seungjin Han

Set, functions and Euclidean space. Seungjin Han Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,

More information

A STABILIZED SQP METHOD: GLOBAL CONVERGENCE

A STABILIZED SQP METHOD: GLOBAL CONVERGENCE A STABILIZED SQP METHOD: GLOBAL CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-13-4 Revised July 18, 2014, June 23,

More information

Inequality Constraints

Inequality Constraints Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the

More information

Solving generalized semi-infinite programs by reduction to simpler problems.

Solving generalized semi-infinite programs by reduction to simpler problems. Solving generalized semi-infinite programs by reduction to simpler problems. G. Still, University of Twente January 20, 2004 Abstract. The paper intends to give a unifying treatment of different approaches

More information

Problem: A class of dynamical systems characterized by a fast divergence of the orbits. A paradigmatic example: the Arnold cat.

Problem: A class of dynamical systems characterized by a fast divergence of the orbits. A paradigmatic example: the Arnold cat. À È Ê ÇÄÁ Ë ËÌ ÅË Problem: A class of dynamical systems characterized by a fast divergence of the orbits A paradigmatic example: the Arnold cat. The closure of a homoclinic orbit. The shadowing lemma.

More information

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods AM 205: lecture 19 Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods Optimality Conditions: Equality Constrained Case As another example of equality

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory Preprint ANL/MCS-P1015-1202, Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory A GLOBALLY CONVERGENT LINEARLY CONSTRAINED LAGRANGIAN METHOD FOR

More information

A FRITZ JOHN APPROACH TO FIRST ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH EQUILIBRIUM CONSTRAINTS

A FRITZ JOHN APPROACH TO FIRST ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH EQUILIBRIUM CONSTRAINTS A FRITZ JOHN APPROACH TO FIRST ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH EQUILIBRIUM CONSTRAINTS Michael L. Flegel and Christian Kanzow University of Würzburg Institute of Applied Mathematics

More information

Affine scaling interior Levenberg-Marquardt method for KKT systems. C S:Levenberg-Marquardt{)KKTXÚ

Affine scaling interior Levenberg-Marquardt method for KKT systems. C S:Levenberg-Marquardt{)KKTXÚ 2013c6 $ Ê Æ Æ 117ò 12Ï June, 2013 Operations Research Transactions Vol.17 No.2 Affine scaling interior Levenberg-Marquardt method for KKT systems WANG Yunjuan 1, ZHU Detong 2 Abstract We develop and analyze

More information

CONVERGENCE AND STABILITY OF A REGULARIZATION METHOD FOR MAXIMAL MONOTONE INCLUSIONS AND ITS APPLICATIONS TO CONVEX OPTIMIZATION

CONVERGENCE AND STABILITY OF A REGULARIZATION METHOD FOR MAXIMAL MONOTONE INCLUSIONS AND ITS APPLICATIONS TO CONVEX OPTIMIZATION Variational Analysis and Appls. F. Giannessi and A. Maugeri, Eds. Kluwer Acad. Publ., Dordrecht, 2004 CONVERGENCE AND STABILITY OF A REGULARIZATION METHOD FOR MAXIMAL MONOTONE INCLUSIONS AND ITS APPLICATIONS

More information

Generalized Uniformly Optimal Methods for Nonlinear Programming

Generalized Uniformly Optimal Methods for Nonlinear Programming Generalized Uniformly Optimal Methods for Nonlinear Programming Saeed Ghadimi Guanghui Lan Hongchao Zhang Janumary 14, 2017 Abstract In this paper, we present a generic framewor to extend existing uniformly

More information

Priority Programme 1962

Priority Programme 1962 Priority Programme 1962 An Example Comparing the Standard and Modified Augmented Lagrangian Methods Christian Kanzow, Daniel Steck Non-smooth and Complementarity-based Distributed Parameter Systems: Simulation

More information

GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim

GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim Korean J. Math. 25 (2017), No. 4, pp. 469 481 https://doi.org/10.11568/kjm.2017.25.4.469 GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS Jong Kyu Kim, Salahuddin, and Won Hee Lim Abstract. In this

More information

Subdifferential representation of convex functions: refinements and applications

Subdifferential representation of convex functions: refinements and applications Subdifferential representation of convex functions: refinements and applications Joël Benoist & Aris Daniilidis Abstract Every lower semicontinuous convex function can be represented through its subdifferential

More information

A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm

A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm Journal name manuscript No. (will be inserted by the editor) A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm Rene Kuhlmann Christof Büsens Received: date / Accepted:

More information

Spectral gradient projection method for solving nonlinear monotone equations

Spectral gradient projection method for solving nonlinear monotone equations Journal of Computational and Applied Mathematics 196 (2006) 478 484 www.elsevier.com/locate/cam Spectral gradient projection method for solving nonlinear monotone equations Li Zhang, Weijun Zhou Department

More information

PARALLEL SUBGRADIENT METHOD FOR NONSMOOTH CONVEX OPTIMIZATION WITH A SIMPLE CONSTRAINT

PARALLEL SUBGRADIENT METHOD FOR NONSMOOTH CONVEX OPTIMIZATION WITH A SIMPLE CONSTRAINT Linear and Nonlinear Analysis Volume 1, Number 1, 2015, 1 PARALLEL SUBGRADIENT METHOD FOR NONSMOOTH CONVEX OPTIMIZATION WITH A SIMPLE CONSTRAINT KAZUHIRO HISHINUMA AND HIDEAKI IIDUKA Abstract. In this

More information