Gerd Wachsmuth. January 22, 2016


Strong stationarity for optimization problems with complementarity constraints in absence of polyhedricity

With applications to optimization with semidefinite and second-order-cone complementarity constraints

Gerd Wachsmuth

January 22, 2016

We consider mathematical programs with complementarity constraints in Banach spaces. In particular, we focus on the situation that the complementarity constraint is defined by a non-polyhedric cone K. We demonstrate how strong stationarity conditions can be obtained in an abstract setting. These conditions and their verification can be made more precise in the case that Z is a Hilbert space and the projection onto K is directionally differentiable with a derivative as given in [Haraux, 1977, Theorem 1]. Finally, we apply the theory to optimization problems with semidefinite and second-order-cone complementarity constraints. We obtain that local minimizers are strongly stationary under a variant of the linear-independence constraint qualification; these are novel results.

Keywords: mathematical programs with complementarity constraints, strong stationarity, conic programming, semidefinite cone, second-order cone

MSC: 49K10, 49J52, 90C22, 90C33

1 Introduction

We consider the optimization problem

    Minimize f(x),
    subject to g(x) ∈ C,
    G(x) ∈ K, H(x) ∈ K°, ⟨G(x), H(x)⟩ = 0.    (MPCC)

Technische Universität Chemnitz, Faculty of Mathematics, Professorship Numerical Mathematics (Partial Differential Equations), 09107 Chemnitz, Germany, gerd.wachsmuth@mathematik.tu-chemnitz.de, http://www.tu-chemnitz.de/mathematik/part_dgl/people/wachsmuth.

Here, K ⊂ Z is a convex, closed cone in the reflexive Banach space Z. We refer to Section 4 for the precise assumptions on the remaining data of (MPCC). Due to the complementarity constraint

    G(x) ∈ K, H(x) ∈ K°, ⟨G(x), H(x)⟩ = 0,

(MPCC) is a mathematical program with complementarity constraints (MPCC) in Banach space. Already in finite dimensions, these complementarity constraints induce several difficulties, see, e.g., Luo et al. [1996], Scheel and Scholtes [2000], Hoheisel et al. [2013] and the references therein. We are going to derive optimality conditions of strongly stationary type for local minimizers x̄ of (MPCC).

We give a motivation for the study of (MPCC) with an emphasis on a situation in which the problem (MPCC) has infinite-dimensional components. A very important source of programs with complementarity constraints is given by bilevel optimization problems, see Dempe [2002]. Indeed, if we replace a lower-level problem which contains constraints including a cone K by its Karush-Kuhn-Tucker (KKT) conditions, we arrive at (MPCC). We illustrate this by an example. Let us consider an optimal control problem in which the final state enters an optimization problem as a parameter; for an example, we refer to the natural gas cash-out problem, see Kalashnikov et al. [2015], Benita et al. [2015], Benita and Mehlitz [2016]. A simple prototype for such bilevel optimal control problems is

    Minimize f(x, z),
    such that ẋ(t) = f(t, x(t)) for t ∈ (0, T),
    x(0) = x₀,
    z solves (1.2) with parameter p = x(T),    (1.1)

in which the lower-level problem is given by the following finite-dimensional optimization problem

    Minimize j(z, p, q) with respect to z ∈ Rⁿ
    such that g(z, p, q) ≤ 0.    (1.2)

This lower-level problem depends on the parameter p = x(T) and another parameter q which is fixed for the moment. It is clear that (1.1) becomes a problem of form (MPCC) if we replace (1.2) by its KKT conditions.
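To make this reduction concrete: replacing the lower-level problem (1.2) by its KKT system (assuming a suitable constraint qualification holds for (1.2)) introduces a multiplier λ ∈ Rᵐ. A sketch in the notation of (1.2); the gradient form of stationarity below is the standard one and is not spelled out in the text:

```latex
% KKT system of the lower-level problem (1.2), multiplier \lambda \in \mathbb{R}^m:
\nabla_z j(z, p, q) + g_z(z, p, q)^\top \lambda = 0, \\
g(z, p, q) \in K := (-\infty, 0]^m, \qquad
\lambda \in K^\circ = [0, \infty)^m, \qquad
\langle g(z, p, q), \lambda \rangle = 0.
```

Identifying G(x) with g(z, p, q) and H(x) with λ gives exactly the complementarity constraint G(x) ∈ K, H(x) ∈ K°, ⟨G(x), H(x)⟩ = 0 of (MPCC).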
In this particular situation, the cone K is given by (−∞, 0]ᵐ, where Rᵐ is the range space of the nonlinear mapping g. Now, let us consider the situation in which the additional parameter q is unknown in the lower-level problem (1.2). One possibility to handle this uncertainty is the robust optimization approach from Ben-Tal and Nemirovski [1998, 2002]. That is, the objective in (1.2) is replaced by its supremum over q ∈ Q and the constraint g(z, p, q) ≤ 0 has to be satisfied for all q ∈ Q, where Q is the uncertainty set. Depending on the type of the uncertainty set Q, this robustification of (1.2) becomes a problem with second-order cone or semidefinite constraints, see, e.g., [Ben-Tal and Nemirovski, 2002, Theorem 1]. Consequently, (1.1) (together with the Karush-Kuhn-Tucker conditions of the robust counterpart of (1.2)) becomes a problem of type (MPCC) with K being the second-order cone or the cone of semidefinite matrices. Finally, we mention that a problem of type (MPCC) with an infinite-dimensional cone K is obtained if a lower-level problem is attached to (1.1) not only for the final time T, but for all t ∈ (0, T). This situation leads to the optimal control of a differential variational inequality, see Pang and Stewart [2008] for applications and further references.

Strong stationarity for special instances of (MPCC) is obtained in Mignot [1976], Herzog et al. [2013], Wachsmuth [2014] (infinite-dimensional Z) and in, e.g., Luo et al. [1996], Ding et al. [2014], Ye and Zhou [2015] (finite-dimensional Z). To our knowledge, Wachsmuth [2015] is the only contribution which considers MPCCs of the general type (MPCC). Therein, strong stationarity conditions for (MPCC) are defined and they possess a reasonable strength in the case that K is polyhedric. Moreover, it is shown that local minimizers are strongly stationary if a constraint qualification (CQ) is satisfied.

We are going to extend the results of Wachsmuth [2015] to the non-polyhedric situation. This is established by a linearization of (MPCC) and by utilizing the fact that every cone is polyhedric at the origin. We obtain explicit strong stationarity conditions in the case that Z is a Hilbert space and the directional derivative of the projection Proj_K onto K can be characterized as in [Haraux, 1977, Theorem 1], see also Definition 4.6. Moreover, these conditions hold at local minimizers if a reasonable constraint qualification is satisfied.

We describe the main results of this work. In Section 3 we provide two auxiliary results, which are used in Section 4 to derive stationarity conditions for (MPCC). We believe that both results are of independent interest. In Section 3.1, we consider the set F := {x ∈ C : G(x) ∈ K}. Here, C is a closed convex set and the closed set K is not assumed to be convex. We show that the tangent cone T_F(x̄) of F at x̄ ∈ F is given by the linearization {x ∈ T_C(x̄) : G′(x̄) x ∈ T_K(G(x̄))} if a certain CQ is satisfied, see Theorem 3.6. The second auxiliary result concerns the characterization of the tangent cone of the graph of the normal cone mapping. That is, given a closed convex cone K in a real Hilbert space Z, we are interested in the characterization of the tangent cone to the set

    gph T_K(·)° = {(z, z*) ∈ Z × Z : z ∈ K, z* ∈ K°, ⟨z, z*⟩ = 0}.

If the metric projection onto the cone K is directionally differentiable as in [Haraux, 1977, Theorem 1], we obtain a precise characterization of the tangent cone of gph T_K(·)°, see Section 3.2. These auxiliary results are utilized in order to prove strong stationarity of minimizers to (MPCC) in Section 4. The main result of Section 4 is Theorem 4.5, in which we prove that local minimizers satisfy the system (4.15) of strong stationarity under certain CQs; this could also be considered the main result of this work.
In order to illustrate the strength of the abstract theory, we study the cases that K is the cone of semidefinite matrices and the second-order cone, see Sections 5 and 6. In both situations, we obtain novel results. In fact, for the SDPMPCC we obtain that SDPMPCC-LICQ implies strong stationarity of local minimizers, see Theorem 5.9; this was stated as an open problem in [Ding et al., 2014, Remark 6.1]. For the SOCMPCC we obtain the parallel result that SOCMPCC-LICQ implies strong stationarity, see Theorem 6.11. This result was already provided in [Ye and Zhou, 2015, Theorem 5.1], but the definition of SOCMPCC-LICQ in Ye and Zhou [2015] is much stronger than our definition. Finally, we consider briefly the case that K is an infinite product of second-order cones in Section 7.

2 Notation

Let X be a (real) Banach space. The (norm) closure of a subset A ⊂ X is denoted by cl(A). The linear subspace spanned by A is denoted by span(A). The duality pairing between X and its topological dual X* is denoted by ⟨·, ·⟩ : X* × X → R. For subsets A ⊂ X, B ⊂ X*, we define their polar cones and annihilators via

    A° := {x* ∈ X* : ⟨x*, x⟩ ≤ 0 for all x ∈ A},
    A⊥ := {x* ∈ X* : ⟨x*, x⟩ = 0 for all x ∈ A},
    B° := {x ∈ X : ⟨x*, x⟩ ≤ 0 for all x* ∈ B},
    B⊥ := {x ∈ X : ⟨x*, x⟩ = 0 for all x* ∈ B}.

For a cone C ⊂ X, C ∩ (−C) is the lineality space of C, that is, the largest subspace contained in C. For an arbitrary set C ⊂ X and x ∈ C, we define the cone of feasible directions (also called the

radial cone), the Bouligand (or contingent) tangent cone, and the adjacent (or inner) tangent cone by

    R_C(x) := {h ∈ X : ∃t > 0 ∀s ∈ [0, t] : x + s h ∈ C},
    T_C(x) := {h ∈ X : liminf_{t↓0} dist_C(x + t h)/t = 0},
    T♭_C(x) := {h ∈ X : limsup_{t↓0} dist_C(x + t h)/t = 0},

respectively, see [Bonnans and Shapiro, 2000, Definition 2.54], [Aubin and Frankowska, 2009, Definitions 4.1.1 and 4.1.5]. Here, dist_C(x) is the distance of x ∈ X to the set C. Recall that T_C(x) = T♭_C(x) = cl(R_C(x)) in case C is convex, see [Bonnans and Shapiro, 2000, Proposition 2.55]. Moreover, if C is a convex cone, we find

    T_C(x)° = C° ∩ x⊥,    (2.1)

see [Bonnans and Shapiro, 2000, Example 2.62]. Here, x⊥ is short for {x}⊥.

For a closed, convex set C ⊂ X, we define the critical cone w.r.t. x ∈ C and v ∈ T_C(x)° by

    K_C(x, v) := T_C(x) ∩ v⊥.    (2.2)

The set C is said to be polyhedric w.r.t. (x, v) ∈ C × T_C(x)°, if

    K_C(x, v) = cl(R_C(x) ∩ v⊥)    (2.3)

holds, cf. Haraux [1977].

A function g : X → Y is called strictly Fréchet differentiable at x̄ ∈ X, if it is Fréchet differentiable at x̄ and if for every δ > 0, there exists ε > 0 such that

    ‖g(x₁) − g(x₂) − g′(x̄)(x₁ − x₂)‖_Y ≤ δ ‖x₁ − x₂‖_X for all x₁, x₂ ∈ B^X_ε(x̄).    (2.4)

Here, B^X_ε(x̄) is the closed ball around x̄ with radius ε. Note that strict Fréchet differentiability is implied by continuous Fréchet differentiability, see [Cartan, 1967, Theorem 3.8.1].

3 Auxiliary results

In this section, we provide two auxiliary results, which will be utilized in the analysis of (MPCC). However, we believe that the results are of independent interest.

3.1 Constraint Qualification in Absence of Convexity

We consider the set

    F := {x ∈ C : G(x) ∈ K},    (3.1)

which is defined by a (possibly) non-convex set K. We are going to characterize its tangent cone T_F(x̄) by means of its linearization cone

    T_lin(x̄) := {h ∈ T_C(x̄) : G′(x̄) h ∈ T_K(G(x̄))}.
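For the nonnegative orthant C = Rⁿ₊ the cones above are explicit: T_C(x) = {h : hᵢ ≥ 0 whenever xᵢ = 0}, and for v in the polar cone T_C(x)° the critical cone additionally forces ⟨v, h⟩ = 0. The following membership tests are a finite-dimensional illustration of these formulas (my own sketch, not part of the paper); since Rⁿ₊ is polyhedral, it is in particular polyhedric, so (2.3) holds here:

```python
def in_tangent_cone(x, h, tol=1e-12):
    """T_C(x) for C = R^n_+: h_i >= 0 on the active set {i : x_i = 0}."""
    return all(hi >= -tol for xi, hi in zip(x, h) if xi <= tol)

def in_critical_cone(x, v, h, tol=1e-12):
    """K_C(x, v) = T_C(x) ∩ v^⊥ for v in the polar cone T_C(x)°."""
    inner = sum(vi * hi for vi, hi in zip(v, h))
    return in_tangent_cone(x, h, tol) and abs(inner) <= tol

x = [0.0, 0.0, 2.0]   # active set {0, 1}
v = [-1.0, 0.0, 0.0]  # v in T_C(x)°: nonpositive on the active set, zero elsewhere
print(in_critical_cone(x, v, [0.0, 1.0, -3.0]))  # True: tangent and ⟨v, h⟩ = 0
print(in_critical_cone(x, v, [1.0, 0.0, 0.0]))   # False: ⟨v, h⟩ = -1 ≠ 0
print(in_tangent_cone(x, [-1.0, 0.0, 0.0]))      # False: h_0 < 0 on the active set
```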
Since K may not be convex, we cannot apply the constraint qualifications of Robinson [1976], Zowe and Kurcyusz [1979] to obtain a characterization of the tangent cone T_F(x̄) of F. The main result of this section is Theorem 3.6, which provides a constraint qualification that applies to the situation of (3.1).

We fix the setting of (3.1). Let X, Y be (real) Banach spaces. The set C ⊂ X is assumed to be closed and convex and the set K ⊂ Y is closed. We emphasize that the set K is not assumed to be convex. Moreover, the function G : X → Y is strictly Fréchet differentiable at the point x̄ ∈ F. As in the situation that K is convex, no condition is required to show that the linearization cone is a superset of the tangent cone and we recall the following lemma.

Lemma 3.1. We have T_F(x̄) ⊂ T_lin(x̄).

In order to define a CQ, we need a certain tangent approximation of the non-convex set K.

Definition 3.2. A cone W ⊂ Y is called a tangent approximation set of K at ȳ ∈ K, if for all ρ ∈ (0, 1), there exists δ > 0, such that

    ∀y ∈ K ∩ B^Y_δ(ȳ), w ∈ W ∩ B^Y_δ(0) ∃w̃ ∈ Y : ‖w̃‖_Y ≤ ρ ‖w‖_Y, y + w + w̃ ∈ K.    (3.2)

Note that the set W is not assumed to be convex or closed. Roughly speaking, all (small) directions in W are close to tangent directions for all points in a neighborhood of ȳ. To our knowledge, this definition was not used in the literature so far.

We shall show that a tangent approximation set W is a subset of the Clarke tangent cone defined by

    C_K(ȳ) := {d ∈ Y : ∀{y_k}_{k∈N} ⊂ K, y_k → ȳ, ∀{t_k}_{k∈N} ⊂ (0, ∞), t_k ↓ 0 : dist_K(y_k + t_k d)/t_k → 0},

see, e.g., [Aubin and Frankowska, 2009, Definition 4.1.5]. Here, dist_K is the distance function of K. Moreover, in finite dimensions, the Clarke tangent cone is the largest tangent approximation set.

Lemma 3.3. We assume that the cone W ⊂ Y is a tangent approximation set of K at ȳ ∈ K. Then, W ⊂ C_K(ȳ). Moreover, C_K(ȳ) is a tangent approximation set of K at ȳ if dim(Y) < ∞.

Proof. Let w ∈ W and sequences {y_k} and {t_k} as in the definition of C_K(ȳ) be given. For a given ρ ∈ (0, 1), there exists δ > 0, such that (3.2) holds. Then, y_k ∈ K ∩ B^Y_δ(ȳ) and t_k w ∈ W ∩ B^Y_δ(0) for large k. By (3.2), there exists w̃_k with ‖w̃_k‖_Y ≤ ρ t_k ‖w‖_Y and y_k + t_k w + w̃_k ∈ K. Hence, dist_K(y_k + t_k w) ≤ ‖w̃_k‖_Y ≤ ρ t_k ‖w‖_Y. Thus, dist_K(y_k + t_k w)/t_k ≤ ρ ‖w‖_Y for large k. Since ρ ∈ (0, 1) was arbitrary, this shows the claim.

To prove that C_K(ȳ) is a tangent approximation set in case dim(Y) < ∞, we proceed by contradiction. Hence, there exist ρ > 0, a sequence {y_k}_{k∈N} ⊂ K with y_k → ȳ and {w_k}_{k∈N} ⊂ C_K(ȳ) with w_k → 0 such that

    dist_K(y_k + w_k) > ρ ‖w_k‖_Y for all k ∈ N.

We set t_k = ‖w_k‖_Y. Since Y is finite dimensional, the bounded sequence w_k/t_k possesses a limit w with ‖w‖_Y = 1 (up to a subsequence, without relabeling).
By the closedness of C_K(ȳ), we have w ∈ C_K(ȳ), see [Aubin and Frankowska, 2009, p. 127]. Hence,

    dist_K(y_k + t_k w)/t_k ≥ dist_K(y_k + t_k (w_k/t_k))/t_k − ‖w − w_k/t_k‖_Y ≥ ρ − ‖w − w_k/t_k‖_Y.

Together with w_k/t_k → w and ρ > 0, this is a contradiction to w ∈ C_K(ȳ).

The next lemma suggests that the situation is much more delicate in the infinite-dimensional case. In particular, tangent approximation sets cannot be chosen convex and there might be no largest tangent approximation set.

Lemma 3.4. We consider Y = L²(0, 1) and the closed, convex set K = {y ∈ L²(0, 1) : y(t) ≥ 0 for a.a. t ∈ (0, 1)}. Then for any M > 0, the cone

    W_M = {w ∈ L²(0, 1) : ‖min(w, 0)‖_{L∞(0,1)} ≤ M ‖w‖_{L²(0,1)}}

is a tangent approximation set of K at the constant function ȳ ≡ 1. However, R_K(ȳ), T_K(ȳ), the union of all W_M, M > 0, and the convex hull of W_M for M ≥ √2 are not tangent approximation sets.

Proof. Let us show that W_M is a tangent approximation set of K at ȳ. Let ρ ∈ (0, 1) be given. We set δ = ρ/(2 M) and show that (3.2) is satisfied. To this end, let y ∈ K and w ∈ W_M with ‖y − ȳ‖_{L²(0,1)}, ‖w‖_{L²(0,1)} ≤ δ be given. By Chebyshev's inequality, we have

    μ({t ∈ (0, 1) : y(t) ≤ 1/2}) ≤ μ({t ∈ (0, 1) : |y(t) − ȳ(t)| ≥ 1/2}) ≤ 2² ‖y − ȳ‖²_{L²(0,1)} ≤ 4 δ²,

where μ denotes the Lebesgue measure on (0, 1). Since ‖min(w, 0)‖_{L∞(0,1)} ≤ M ‖w‖_{L²(0,1)} ≤ M δ ≤ 1/2, we have

    μ({t ∈ (0, 1) : y(t) + w(t) ≤ 0}) ≤ μ({t ∈ (0, 1) : y(t) ≤ 1/2}) ≤ 4 δ².

We set w̃ := −min(0, y + w), which yields y + w + w̃ = max(0, y + w) ∈ K. It remains to bound w̃, and this is established by

    ‖w̃‖²_{L²(0,1)} = ‖min(0, y + w)‖²_{L²(0,1)} = ∫_{{t : y(t)+w(t) ≤ 0}} (y(t) + w(t))² dt ≤ ∫_{{t : y(t)+w(t) ≤ 0}} w(t)² dt
    ≤ μ({t ∈ (0, 1) : y(t) + w(t) ≤ 0}) ‖min(w, 0)‖²_{L∞(0,1)} ≤ 4 δ² M² ‖w‖²_{L²(0,1)} = ρ² ‖w‖²_{L²(0,1)}.

This shows that W_M is a tangent approximation set.

In order to demonstrate the last claim of the lemma, we first show that L∞(0, 1) is not a tangent approximation set. This is easily established, since for arbitrary δ > 0, we may set

    y(t) = {0 if t ≤ δ², 1 else},  w(t) = {−1 if t ≤ δ², 0 else}.

It is clear that y ∈ K and w ∈ L∞(0, 1) and ‖y − ȳ‖_{L²(0,1)} = ‖w‖_{L²(0,1)} = δ. However, the distance of y + w to K is δ. Hence, (3.2) cannot be satisfied with W = L∞(0, 1). Since L∞(0, 1) ⊂ R_K(ȳ) ⊂ T_K(ȳ) and L∞(0, 1) ⊂ ∪_{M>0} W_M, the sets R_K(ȳ), T_K(ȳ) and ∪_{M>0} W_M cannot be tangent approximation sets.

Finally, we show that L∞(0, 1) is contained in the convex hull of W_M for M ≥ √2. Indeed, for arbitrary f ∈ L∞(0, 1), we define f₁ = 2 f χ_{[0,1/2]}, f₂ = 2 f χ_{[1/2,1]}; since f = (f₁ + f₂)/2, it is sufficient to show that f₁ and f₂ belong to the convex hull conv(W_M) of W_M. Therefore, we set g± = f₁ ± ‖f₁‖_{L∞(0,1)} χ_{[1/2,1]}. This gives ‖g±‖_{L∞(0,1)} = ‖f₁‖_{L∞(0,1)} and ‖g±‖_{L²(0,1)} ≥ ‖f₁‖_{L∞(0,1)}/√2. Hence, g± ∈ W_M. This shows f₁ = (g₊ + g₋)/2 ∈ conv(W_M) and similarly we get f₂ ∈ conv(W_M). Hence, L∞(0, 1) ⊂ conv(W_M) and, thus, conv(W_M) is not a tangent approximation set.
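The L∞ counterexample in the proof of Lemma 3.4 can be checked numerically. Below, (0, 1) is discretized and the L² quantities are approximated by midpoint sums (the grid size and the value of δ are my choices); the remainder dist_K(y + w) equals ‖w‖_{L²}, so (3.2) fails for every ρ < 1:

```python
import math

N, delta = 20_000, 0.1
dt = 1.0 / N
ts = [(i + 0.5) * dt for i in range(N)]

# Counterexample from the proof of Lemma 3.4 (K = {u >= 0}, ybar = 1):
y = [0.0 if t <= delta**2 else 1.0 for t in ts]   # y in K, ||y - ybar||_{L2} = delta
w = [-1.0 if t <= delta**2 else 0.0 for t in ts]  # ||w||_{L2} = delta

def l2(f):
    """Midpoint-rule approximation of the L^2(0,1) norm."""
    return math.sqrt(sum(v * v for v in f) * dt)

# distance of y + w to K is the norm of the negative part of y + w
dist = l2([min(yi + wi, 0.0) for yi, wi in zip(y, w)])
print(abs(l2(w) - delta) < 1e-6, abs(dist - l2(w)) < 1e-12)  # prints: True True
```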
We mention that it is also possible to derive non-trivial tangent approximation sets for non-convex sets K. As an example, we mention

    K := {y ∈ L²(0, 1)² : y₁ ≥ 0, y₂ ≥ 0, y₁ y₂ = 0 a.e. in (0, 1)}.

Given ȳ ∈ K and M, ε > 0, one can use similar arguments as in the proof of Lemma 3.4 to show that

    W = {w ∈ L²(0, 1)² : ‖w‖_{L∞(0,1)²} ≤ M ‖w‖_{L²(0,1)²}, w₁(t) = 0 if ȳ₁(t) ≤ ε, w₂(t) = 0 if ȳ₂(t) ≤ ε for a.a. t ∈ (0, 1)}

is a tangent approximation set of K at ȳ.

In what follows, we define

    (A)₁ := A ∩ B^Z_1(0),    (3.3)

where A is a subset of a Banach space Z and B^Z_1(0) is the (closed) unit ball in Z.

Using the notion of a tangent approximation set, we define a constraint qualification.

Definition 3.5. The set F is called qualified at x̄ ∈ F if there exist a tangent approximation set W ⊂ Y of K at G(x̄) and M > 0 such that

    B^Y_M(0) ⊂ G′(x̄) (C − x̄)₁ − (W)₁.    (3.4)

If the set W is actually a closed and convex cone, the condition (3.4) is equivalent to

    Y = G′(x̄) R_C(x̄) − W,    (3.5)

see [Zowe and Kurcyusz, 1979, Theorem 2.1].

In the case that Y is finite-dimensional and K is convex, Definition 3.5 reduces to the constraint qualification of Robinson-Zowe-Kurcyusz. Indeed, in the light of Lemma 3.3, Definition 3.5 is equivalent to

    B^Y_M(0) ⊂ G′(x̄) (C − x̄)₁ − (T_K(G(x̄)))₁,

which is, in turn, equivalent to the constraint qualification of Robinson-Zowe-Kurcyusz, see [Bonnans and Shapiro, 2000, Proposition 2.97].

Theorem 3.6. Let us assume that F is qualified at x̄ ∈ F in the sense of Definition 3.5. Then, T_lin(x̄) = T_F(x̄).

In the case that K is convex, this assertion is well known if we replace W by R_K(G(x̄)) in (3.5). Note that this assumption might be weaker than the qualification of F in the sense of Definition 3.5, compare Lemma 3.4. Moreover, if Y is finite-dimensional, the assertion follows from Lemma 3.1, Lemma 3.3 and [Aubin and Frankowska, 2009, Theorem 4.3.3].

In order to prove Theorem 3.6, we provide an auxiliary lemma. Its proof is inspired by the proof of [Werner, 1984, Theorem 5.2.5].

Lemma 3.7. Let the assumptions of Theorem 3.6 be satisfied. Then, there exists γ > 0, such that for all u ∈ X, v ∈ Y satisfying

    ‖u‖_X ≤ γ, ‖v‖_Y ≤ γ, u ∈ ½ (C − x̄)₁, G(x̄) + G′(x̄) u + v ∈ K,

we find û ∈ X satisfying

    û ∈ (C − x̄)₁, G(x̄ + û) ∈ K,

and

    ‖u − û‖_X ≤ (2/M) ‖G(x̄ + u) − G(x̄) − G′(x̄) u − v‖_Y.

Proof. We set ε := ρ := M/4 (without loss of generality, M < 4, since (3.4) remains valid when M is decreased, so that ρ ∈ (0, 1)). Since G is strictly differentiable at x̄, there exists δ ∈ (0, 1], such that

    ‖G(x₁) − G(x₂) − G′(x̄)(x₁ − x₂)‖_Y ≤ ε ‖x₁ − x₂‖_X for all x₁, x₂ ∈ B_δ(x̄)    (3.6)

and such that (3.2) is satisfied.

We define the constants

    L := max{‖G′(x̄)‖, ε}, γ := δ/(4 (L + 1)(M⁻¹ + 1)) ≤ δ/4.

Now, let u ∈ X and v ∈ Y with

    ‖u‖_X ≤ γ, ‖v‖_Y ≤ γ, u ∈ ½ (C − x̄)₁, and G(x̄) + G′(x̄) u + v ∈ K

be given. We define

    q := (2/M) ‖G(x̄ + u) − G(x̄) − G′(x̄) u − v‖_Y.

For later reference, we provide

    q ≤ (2/M)(ε ‖u‖_X + ‖v‖_Y) ≤ (2/M)(M/4 + 1) γ = (1/2 + 2/M) γ ≤ 2 (M⁻¹ + 1) γ = δ/(2 (L + 1)) ≤ δ/2 ≤ 1/2.    (3.7)

We set

    x₀ := 0, u₀ := u, v₀ := v

and construct sequences {x_i}_{i∈N}, {u_i}_{i∈N}, {v_i}_{i∈N}, such that the following assertions hold:

    u_i ∈ (q/2ⁱ)(C − x̄)₁ for i ≥ 1,    (3.8a)
    x_i ∈ (1 − 2⁻ⁱ)(C − x̄)₁ for i ≥ 0,    (3.8b)
    ‖x_i‖_X ≤ ‖u‖_X + (1 − 2⁻⁽ⁱ⁻¹⁾) q for i ≥ 1,    (3.8c)
    ‖v_i‖_Y ≤ ρ q/2ⁱ for i ≥ 1,    (3.8d)
    G(x̄ + x_i) + G′(x̄) u_i + v_i ∈ K for i ≥ 0,    (3.8e)
    ‖G(x̄ + x_i + u_i) − G(x̄ + x_i) − G′(x̄) u_i − v_i‖_Y ≤ M q/2^{i+1} for i ≥ 0.    (3.8f)

Note that (3.8b), (3.8e) and (3.8f) are satisfied for i = 0. For i = 1, 2, ... we perform the following. We abbreviate r_{i−1} := G(x̄ + x_{i−1} + u_{i−1}) − G(x̄ + x_{i−1}) − G′(x̄) u_{i−1} − v_{i−1}. Since (3.4) is satisfied, we find

    u_i ∈ M⁻¹ ‖r_{i−1}‖_Y (C − x̄)₁ and w_i ∈ M⁻¹ ‖r_{i−1}‖_Y (W)₁,    (3.9)

such that

    G′(x̄) u_i − w_i = −r_{i−1}.    (3.10)

Now, (3.9) together with (3.8f) for i − 1 shows (3.8a) for i. Similarly, we obtain

    ‖w_i‖_Y ≤ q/2ⁱ ≤ q ≤ δ.    (3.11)

We define

    x_i := x_{i−1} + u_{i−1}.

We have

    u₀ = u ∈ ½ (C − x̄)₁ in case i = 1,
    u_{i−1} ∈ (q/2^{i−1})(C − x̄)₁ ⊂ (1/2ⁱ)(C − x̄)₁ in case i > 1, by (3.7), (3.8a).

This implies

    x_i = x_{i−1} + u_{i−1} ∈ (1 − 2⁻⁽ⁱ⁻¹⁾)(C − x̄)₁ + (1/2ⁱ)(C − x̄)₁ ⊂ (1 − 2⁻ⁱ)(C − x̄)₁,

which yields (3.8b). Similarly, (3.8c) follows. Together with (3.8a), this implies

    ‖x_i‖_X ≤ ‖u‖_X + q ≤ δ and ‖x_i + u_i‖_X ≤ ‖u‖_X + q ≤ δ.    (3.12)

By (3.10) and (3.8e) for i − 1, we have

    G(x̄ + x_i) + G′(x̄) u_i − w_i = G(x̄ + x_{i−1}) + G′(x̄) u_{i−1} + v_{i−1} ∈ K.

Since

    ‖G(x̄ + x_i) + G′(x̄) u_i − w_i − G(x̄)‖_Y
    ≤ ‖G(x̄ + x_i) − G(x̄) − G′(x̄) x_i‖_Y + ‖G′(x̄)(x_i + u_i)‖_Y + ‖w_i‖_Y
    ≤ ε ‖x_i‖_X + L ‖x_i + u_i‖_X + ‖w_i‖_Y    (by (3.6))
    ≤ 2 (L + 1)(‖u‖_X + q)    (by (3.11), (3.12))
    ≤ 2 (L + 1)(1 + 1/2 + 2/M) γ ≤ 4 (L + 1)(1 + 1/M) γ = δ    (by (3.7))

and ‖w_i‖_Y ≤ δ (by (3.11)), we can apply (3.2) and find v_i with ‖v_i‖_Y ≤ ρ ‖w_i‖_Y and

    G(x̄ + x_i) + G′(x̄) u_i − w_i + w_i + v_i = G(x̄ + x_i) + G′(x̄) u_i + v_i ∈ K,

which shows (3.8e) and (3.8d) for i.

It remains to show (3.8f). By (3.12), we can apply (3.6) and find

    ‖G(x̄ + x_i + u_i) − G(x̄ + x_i) − G′(x̄) u_i − v_i‖_Y ≤ ε ‖u_i‖_X + ‖v_i‖_Y ≤ (ε + ρ) q/2ⁱ = M q/2^{i+1}.

Altogether, we have shown (3.8). Since x_i = Σ_{j=0}^{i−1} u_j, we get from (3.8a) that {x_i} is a Cauchy sequence. Hence, there is û with x_i → û in X. Moreover, we have u_i → 0 in X and v_i → 0 in Y by (3.8a) and (3.8d). Hence, passing to the limit i → ∞ in (3.8e) and using the closedness of K, we get G(x̄ + û) ∈ K, and from (3.8b) we find û ∈ (C − x̄)₁. Finally,

    ‖u − û‖_X = ‖Σ_{j=1}^∞ u_j‖_X ≤ Σ_{j=1}^∞ q/2ʲ = q.

{rk K} k N Y, such that t k 0 and G( x) t k G ( x) h rk K K for all k N rk K Y = o(t k ) as k By the convexity of C and h T C ( x), there exists {rk C} k N X with x 2 t k h 2 rk C C for all k N rk C X = o(t k ) as k For large k, this implies t k h rk C 1 2 (C x) 1 and G( x) G ( x) (t k h rk) C rk K G ( x) rk C K. For large k, the norms of u k := t k h rk C and v k := rk K G ( x) rk C Lemma 3.7. This yields û k such that K are small and we can apply and t k h r C k û k X = u k û k X 2 M Hence, we have x û k F, and This proves the claim h T F ( x). û k (C x) 1 and G( x û k ) K ( x û k ) x t k G( x tk h rk) C G( x) G ( x) (t k h) rk K Y = o(t k ). = ûk t k h as k. An analogous result to Theorem 3.6 can be proved for the adjacent/inner tangent cone TF ( x). That is, under the conditions of Theorem 3.6 we also obtain T lin( x) := { h T C ( x) : G ( x) h T K(G( x)) } = T F( x). The proof follows the same lines with the obvious modifications. In the remainder of this section, we show a possibility to obtain a tangent approximation set of the graph of the normal cone mapping in the case that Z is a Hilbert space. That is, we consider the case Y = Z Z and K := gph T K ( ) := { (z 1, z 2 ) Z 2 : z 2 T K (z 1 ) } = { (z 1, z 2 ) K K : z 1, z 2 = 0 }, where K Z is a closed, convex cone. Note that we have identified Z with Z. Due to this identification, we have the well-known and important characterization (z 1, z 2 ) gph T K ( ) z 1 = Proj K (z 1 z 2 ) z 2 = Proj K (z 1 z 2 ). (3.13) In particular, the operators P : Z gph T K ( ), P (z) = (Proj K (z), Proj K (z)) are inverses of each other. Q : gph T K ( ) Z, Q(z 1, z 2 ) = z 1 z 2 10

Lemma 3.8. Assume that Z is a Hilbert space and K ⊂ Z a closed, convex cone. Let (z̄₁, z̄₂) ∈ gph T_K(·)°, T ∈ L(Z, Z) and a cone W ⊂ Z be given. Suppose that for all ε > 0 there is δ > 0 with

    ‖Proj_K(z + w) − Proj_K(z) − T w‖_Z ≤ ε ‖w‖_Z for all w ∈ B^Z_δ(0) ∩ W, z ∈ B^Z_δ(z̄₁ + z̄₂).    (3.14)

Then,

    W := {(w₁, w₂) ∈ Z² : w₁ + w₂ ∈ W, w₁ = T (w₁ + w₂)}

is a tangent approximation set of gph T_K(·)° at (z̄₁, z̄₂).

Roughly speaking, (3.14) asserts that T w is the directional derivative of Proj_K for directions w ∈ W and that the remainder term is uniform in a neighborhood of z̄₁ + z̄₂.

Proof. Let ρ ∈ (0, 1) be given. Set ε = ρ/2 and choose δ > 0 such that (3.14) is satisfied. Now, let (z₁, z₂) ∈ gph T_K(·)° with ‖(z₁, z₂) − (z̄₁, z̄₂)‖_{Z²} ≤ δ and (w₁, w₂) ∈ W with ‖(w₁, w₂)‖_{Z²} ≤ δ be given. Then,

    (Proj_K(z₁ + w₁ + z₂ + w₂), Proj_{K°}(z₁ + w₁ + z₂ + w₂)) ∈ gph T_K(·)°.

To satisfy Definition 3.2, it remains to show

    ‖(Proj_K(z₁ + w₁ + z₂ + w₂) − z₁ − w₁, Proj_{K°}(z₁ + w₁ + z₂ + w₂) − z₂ − w₂)‖_{Z²} ≤ ρ ‖(w₁, w₂)‖_{Z²}.

By (3.14) we have

    ‖Proj_K(z₁ + w₁ + z₂ + w₂) − z₁ − w₁‖_Z = ‖Proj_K(z₁ + z₂ + w₁ + w₂) − Proj_K(z₁ + z₂) − T(w₁ + w₂)‖_Z
    ≤ ε ‖w₁ + w₂‖_Z ≤ ε (‖w₁‖_Z + ‖w₂‖_Z) ≤ √2 ε (‖w₁‖²_Z + ‖w₂‖²_Z)^{1/2} = √2 ε ‖(w₁, w₂)‖_{Z²}.

Since z₁ + w₁ + z₂ + w₂ = Proj_K(z₁ + w₁ + z₂ + w₂) + Proj_{K°}(z₁ + w₁ + z₂ + w₂), we have

    ‖Proj_{K°}(z₁ + w₁ + z₂ + w₂) − z₂ − w₂‖_Z = ‖Proj_K(z₁ + w₁ + z₂ + w₂) − z₁ − w₁‖_Z.

Hence,

    ‖(Proj_K(z₁ + w₁ + z₂ + w₂) − z₁ − w₁, Proj_{K°}(z₁ + w₁ + z₂ + w₂) − z₂ − w₂)‖_{Z²}
    = √2 ‖Proj_K(z₁ + w₁ + z₂ + w₂) − z₁ − w₁‖_Z ≤ 2 ε ‖(w₁, w₂)‖_{Z²} = ρ ‖(w₁, w₂)‖_{Z²}.

This shows the claim.

We are going to derive a formula for its tangent cone $T_{\operatorname{gph} N_K(\cdot)}$. This tangent cone is also called the graphical derivative of the (set-valued) normal-cone mapping. If the projection onto the cone $K$ is directionally differentiable, we have a well-known characterization of $T_{\operatorname{gph} N_K(\cdot)}$.

Lemma 3.9. Let us assume that $\operatorname{Proj}_K$ is directionally differentiable. Then, for $(\bar z_1, \bar z_2) \in \operatorname{gph} N_K(\cdot)$ we have
$$T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2) = \{(z_1, z_2) \in Z^2 : \operatorname{Proj}_K'(\bar z_1 + \bar z_2;\ z_1 + z_2) = z_1\}.$$

This lemma can be proved similarly to [Mordukhovich et al., 2015a, (3.11)] (let $g$ be the identity function therein), or by transferring the proof of [Wu et al., 2014, Theorem 3.1] to the situation at hand.

In order to give an explicit expression for $T_{\operatorname{gph} N_K(\cdot)}$, we use a result by Haraux [1977].

Lemma 3.10. Let us assume that $Z$ is a Hilbert space and let $(\bar z_1, \bar z_2) \in \operatorname{gph} N_K(\cdot)$ be given. We define the critical cone
$$K_K(\bar z_1, \bar z_2) = T_K(\bar z_1) \cap (\bar z_2)^\perp$$
and assume that there is a bounded, linear, self-adjoint operator $L : Z \to Z$, such that
$$L \circ \operatorname{Proj}_{K_K(\bar z_1, \bar z_2)} = \operatorname{Proj}_{K_K(\bar z_1, \bar z_2)} \circ L, \qquad \operatorname{Proj}_K(\bar z_1 + \bar z_2 + t\,w) = \operatorname{Proj}_K(\bar z_1 + \bar z_2) + t\,L^2 w + o(t) \quad\text{for all } w \in K_K(\bar z_1, \bar z_2). \tag{3.15}$$
Then, for $(z_1, z_2) \in Z^2$ we have
$$(z_1, z_2) \in T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$$
if and only if there exists $\Pi \in Z$ such that
$$\Pi \in K_K(\bar z_1, \bar z_2), \tag{3.16a}$$
$$z_1 + z_2 - \Pi \in K_K(\bar z_1, \bar z_2)^\circ, \tag{3.16b}$$
$$(\Pi,\ z_1 + z_2 - \Pi) = 0, \tag{3.16c}$$
$$z_1 - L^2\,\Pi = 0. \tag{3.16d}$$

Proof. By [Haraux, 1977, Theorem 1], $\operatorname{Proj}_K$ is directionally differentiable at $\bar z_1 + \bar z_2$ and
$$\operatorname{Proj}_K'(\bar z_1 + \bar z_2;\ v) = L^2\,\operatorname{Proj}_{K_K(\bar z_1, \bar z_2)}(v) \quad\text{for all } v \in Z.$$
By Lemma 3.9 we obtain
$$T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2) = \bigl\{(z_1, z_2) \in Z^2 : \operatorname{Proj}_K'(\bar z_1 + \bar z_2;\ z_1 + z_2) = z_1\bigr\} = \bigl\{(z_1, z_2) \in Z^2 : L^2\,\operatorname{Proj}_{K_K(\bar z_1, \bar z_2)}(z_1 + z_2) = z_1\bigr\}.$$
Now, we have $\Pi = \operatorname{Proj}_{K_K(\bar z_1, \bar z_2)}(z_1 + z_2)$ if and only if
$$\Pi \in K_K(\bar z_1, \bar z_2), \qquad z_1 + z_2 - \Pi \in K_K(\bar z_1, \bar z_2)^\circ, \qquad (\Pi,\ z_1 + z_2 - \Pi) = 0.$$
This implies the assertion. $\square$
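For the simplest cone $K = \mathbb R_+ \subseteq Z = \mathbb R$, the characterization of Lemma 3.9 can be checked by brute force. At $(\bar z_1, \bar z_2) = (0, 0)$ one has $\operatorname{Proj}_K'(0;\,v) = \max(v, 0)$, and the lemma predicts that the tangent cone to $\operatorname{gph} N_{\mathbb R_+}$ at the origin is the graph itself; the following sketch (the function name `dproj` is ours) verifies this on a grid of exactly representable points:

```python
def dproj(v):
    # directional derivative of Proj_{R_+} at the origin in direction v: max(v, 0)
    return max(v, 0.0)

grid = [k * 0.5 for k in range(-4, 5)]   # multiples of 0.5 are exact binary floats, 0 included
for z1 in grid:
    for z2 in grid:
        # membership test of Lemma 3.9: Proj_K'(0; z1 + z2) = z1 ...
        via_derivative = (dproj(z1 + z2) == z1)
        # ... versus direct membership in gph N_{R_+}, which is its own tangent cone at (0, 0)
        in_graph = (z1 >= 0) and (z2 <= 0) and (z1 * z2 == 0)
        assert via_derivative == in_graph
```

The exhaustive comparison succeeds, illustrating that the derivative condition singles out exactly the complementary pairs.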

If $K$ is polyhedric w.r.t. $(\bar z_1, \bar z_2)$, we can choose $L = I$, since the projection $\operatorname{Proj}_K$ satisfies
$$\operatorname{Proj}_K'(\bar z_1 + \bar z_2;\ w) = \operatorname{Proj}_{K_K(\bar z_1, \bar z_2)}(w) \quad\text{for all } w \in Z \tag{3.17}$$
in this case, see [Haraux, 1977, Theorem 2] and [Mignot, 1976, Proposition 2.5]. The situation in which (3.17) is fulfilled was called the projection derivation condition in [Mordukhovich et al., 2015b, Definition 4.1]. This projection derivation condition also holds if $K$ satisfies the extended polyhedricity condition (and another technical condition), see [Mordukhovich et al., 2015b, Proposition 4.2], and in other non-polyhedric situations, see [Mordukhovich et al., 2015b, Example 4.3]. Hence, the choice $L = I$ in Lemma 3.10 is also possible in these situations.

For later reference, we also provide formulas for the normal cone to $\operatorname{gph} N_K(\cdot)$ and for the largest linear subspace which is contained in the tangent cone of $\operatorname{gph} N_K(\cdot)$.

Lemma 3.11. Under the assumptions of Lemma 3.10, we have
$$\operatorname{conv}\bigl(T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)\bigr) = \begin{pmatrix} L^2 & 0 \\ I - L^2 & I \end{pmatrix} \begin{pmatrix} K_K(\bar z_1, \bar z_2) \\ K_K(\bar z_1, \bar z_2)^\circ \end{pmatrix},$$
where $\operatorname{conv}(\cdot)$ denotes the convex hull. Consequently, the normal cone to $\operatorname{gph} N_K(\cdot)$ is given by
$$T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)^\circ = \begin{pmatrix} L^2 & I - L^2 \\ 0 & I \end{pmatrix}^{-1} \begin{pmatrix} K_K(\bar z_1, \bar z_2)^\circ \\ K_K(\bar z_1, \bar z_2) \end{pmatrix}. \tag{3.18}$$
Finally, the largest linear subspace contained in $T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$ is given by
$$\begin{pmatrix} L^2 & 0 \\ I - L^2 & I \end{pmatrix} \begin{pmatrix} K_K(\bar z_1, \bar z_2) \cap \bigl(-K_K(\bar z_1, \bar z_2)\bigr) \\ K_K(\bar z_1, \bar z_2)^\circ \cap \bigl(-K_K(\bar z_1, \bar z_2)^\circ\bigr) \end{pmatrix} \tag{3.19}$$
under the assumption that
$$L^2(\Pi^+ + \Pi^-) = 0 \;\Longrightarrow\; \Pi^+ + \Pi^- = 0 \qquad\text{for all } \Pi^\pm \in K_K(\bar z_1, \bar z_2). \tag{3.20}$$

Note that the last result is quite surprising, since the tangent cone $T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$ is (in general) not convex. Hence, we avoid calling (3.19) the lineality space of $T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$, since this notion is only defined for convex cones.

Proof. We prove the formula for the convex hull; (3.18) is then a straightforward consequence, see, e.g., [Aubin and Frankowska, 2009, Theorem 2.4.3]. The inclusion "$\subseteq$" is clear.
To show "$\supseteq$", we consider $z_1, z_2 \in Z$ such that
$$z_1 = L^2\,\Pi_1, \qquad z_2 = (I - L^2)\,\Pi_1 + \Pi_2$$
for some $\Pi_1 \in K_K(\bar z_1, \bar z_2)$, $\Pi_2 \in K_K(\bar z_1, \bar z_2)^\circ$. Then, it is easy to see that
$$\bigl(L^2\,\Pi_1,\ (I - L^2)\,\Pi_1\bigr),\ (0, \Pi_2) \in T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2).$$
Since $(z_1, z_2)$ is the sum of these two elements and since $T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$ is a cone, the assertion follows.

Finally, we show the last assertion under the assumption (3.20). It is clear that (3.19) is a linear subspace and that (3.19) is contained in $T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$. It remains to show the implication
$$\pm(z_1, z_2) \in T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2) \;\Longrightarrow\; (z_1, z_2) \text{ belongs to the subspace } (3.19)$$

for all $(z_1, z_2) \in Z^2$. Let $(z_1, z_2) \in Z^2$ with $\pm(z_1, z_2) \in T_{\operatorname{gph} N_K(\cdot)}(\bar z_1, \bar z_2)$ be given. By Lemma 3.10 we find two elements $\Pi^\pm \in Z$ such that
$$\Pi^\pm \in K_K(\bar z_1, \bar z_2), \qquad \pm(z_1 + z_2) - \Pi^\pm \in K_K(\bar z_1, \bar z_2)^\circ, \qquad \bigl(\Pi^\pm,\ \pm(z_1 + z_2) - \Pi^\pm\bigr) = 0, \qquad \pm z_1 - L^2\,\Pi^\pm = 0,$$
cf. (3.16). The last equation implies $L^2(\Pi^+ + \Pi^-) = 0$ and, due to (3.20), $\Pi^+ = -\Pi^-$ follows. Hence, we have
$$\Pi^+ \in K_K(\bar z_1, \bar z_2) \cap \bigl(-K_K(\bar z_1, \bar z_2)\bigr), \qquad z_1 + z_2 - \Pi^+ \in K_K(\bar z_1, \bar z_2)^\circ \cap \bigl(-K_K(\bar z_1, \bar z_2)^\circ\bigr),$$
and this shows the claim. $\square$

4 MPCCs in Banach spaces

In this section we apply the results of Section 3 to the general MPCC
$$\text{Minimize } f(x), \quad\text{subject to } g(x) \in C, \quad G(x) \in K, \quad H(x) \in K^\circ, \quad \langle G(x), H(x)\rangle = 0. \tag{MPCC}$$
Here, $f : X \to \mathbb R$ is Fréchet differentiable, $g : X \to Y$, $G : X \to Z$ and $H : X \to Z^*$ are strictly Fréchet differentiable, $X, Y, Z$ are (real) Banach spaces, and $Z$ is assumed to be reflexive. Moreover, $C \subseteq Y$ is a closed, convex set and $K \subseteq Z$ is a closed, convex cone. Due to the reflexivity of $Z$, the problem (MPCC) is symmetric w.r.t. $G$ and $H$.

The problem (MPCC) was discussed in Wachsmuth [2015], and optimality conditions of strongly stationary type were obtained under a CQ. However, these conditions seem to be too weak in the case that $K$ is not polyhedric, see [Wachsmuth, 2015, Section 6.2] for an example; there, stronger optimality conditions were obtained by an additional linearization argument. In this section, we apply the same idea to (MPCC). That is, we linearize (MPCC) by means of Theorem 3.6, see Section 4.1. By means of the results of Section 3.2 we recast the linearized problem as a linear MPCC, see Corollary 4.3. Finally, we derive optimality conditions via this linearized problem in Section 4.2. The main result of this section is Theorem 4.5, which asserts that local minimizers of (MPCC) are strongly stationary (in the sense of Definition 4.6) if certain CQs are satisfied. Moreover, we demonstrate that this definition of strong stationarity possesses a reasonable strength.
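For orientation, when $Z = \mathbb R^m$ and $K = \mathbb R^m_+$ (hence $K^\circ = \mathbb R^m_-$), the complementarity constraint of (MPCC) takes the familiar componentwise form

```latex
G(x) \ge 0, \qquad H(x) \le 0, \qquad G(x)^\top H(x) = 0,
```

so that (MPCC) covers the classical finite-dimensional MPCCs as a special case.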
4.1 Linearization of the MPCC

In order to apply Theorem 3.6, we reformulate the complementarity constraint
$$G(x) \in K, \qquad H(x) \in K^\circ, \qquad \langle G(x), H(x)\rangle = 0.$$
Indeed, since $K$ is a cone, it is equivalent to $H(x) \in N_K(G(x))$, see (2.1), and this can be written as
$$(G(x), H(x)) \in \operatorname{gph} N_K(\cdot).$$

Theorem 4.1. Let $\bar x \in X$ be a feasible point of (MPCC). Assume that $W \subseteq Z \times Z^*$ is a tangent approximation set of the graph of the normal cone mapping of $K$, i.e., of $\operatorname{gph} N_K(\cdot)$, at $(G(\bar x), H(\bar x))$. If the CQ (3.4) with the setting
$$\mathcal X := X \times Y, \quad \mathcal C := X \times C, \quad \mathcal G(x, y) := \bigl(g(x) - y,\ G(x),\ H(x)\bigr), \quad \mathcal Y := Y \times (Z \times Z^*), \quad \mathcal K := \{0\} \times \operatorname{gph} N_K(\cdot), \quad \mathcal W := \{0\} \times W \tag{4.1}$$
is satisfied, then
$$T_F(\bar x) = T_{\mathrm{lin}}(\bar x) := \bigl\{h \in X : g'(\bar x)\,h \in T_C(g(\bar x)),\ \bigl(G'(\bar x)\,h, H'(\bar x)\,h\bigr) \in T_{\operatorname{gph} N_K(\cdot)}(G(\bar x), H(\bar x))\bigr\} \tag{4.2}$$
holds, where $F \subseteq X$ is the feasible set of (MPCC). Moreover, if $\bar x$ is a local minimizer of (MPCC), then $h = 0$ is a local minimizer of
$$\text{Minimize } f'(\bar x)\,h, \quad\text{subject to } g'(\bar x)\,h \in T_C(g(\bar x)),\ \bigl(G'(\bar x)\,h, H'(\bar x)\,h\bigr) \in T_{\operatorname{gph} N_K(\cdot)}(G(\bar x), H(\bar x)). \tag{4.3}$$

An alternative choice to the setting (4.1) is
$$\mathcal X := X, \quad \mathcal C := X, \quad \mathcal G(x) := \bigl(g(x), G(x), H(x)\bigr), \quad \mathcal Y := Y \times (Z \times Z^*), \quad \mathcal K := C \times \operatorname{gph} N_K(\cdot) \tag{4.4}$$
together with a tangent approximation set $\mathcal W$ of $\mathcal K$. In the case that $Y$ is finite dimensional, we can choose $\mathcal W = T_C(g(\bar x)) \times W$, see Lemma 3.3, but this may not be possible in the infinite-dimensional case, compare Lemma 3.4. Hence, the setting (4.4) might require a stronger CQ in this case.

Proof. We set
$$\mathcal F := \{(x, y) \in \mathcal C : \mathcal G(x, y) \in \mathcal K\}.$$
An application of Theorem 3.6 yields
$$T_{\mathcal F}(\bar x, g(\bar x)) = \bigl\{(h_x, h_y) \in T_{\mathcal C}(\bar x, g(\bar x)) : \mathcal G'(\bar x, g(\bar x))\,(h_x, h_y) \in T_{\mathcal K}(\mathcal G(\bar x, g(\bar x)))\bigr\}. \tag{4.5}$$
Now, we need to decode this statement to obtain (4.2). To this end, we remark that
$$(x, y) \in \mathcal F \quad\Longleftrightarrow\quad y = g(x) \text{ and } x \in F \tag{4.6}$$
holds for all $(x, y) \in X \times Y$.

Now, we relate the left-hand sides of (4.2) and (4.5). Let us show that for arbitrary $h \in X$, the following statements are equivalent.

(i) $h \in T_F(\bar x)$.
(ii) There exist $\{t_n\} \subseteq \mathbb R^+$, $\{h_n\} \subseteq X$ with $t_n \searrow 0$, $h_n \to h$, $\bar x + t_n\,h_n \in F$.
(iii) There exist $\{t_n\} \subseteq \mathbb R^+$, $\{h_n\} \subseteq X$ with $t_n \searrow 0$, $h_n \to h$, $\bigl(\bar x + t_n\,h_n,\ g(\bar x + t_n\,h_n)\bigr) \in \mathcal F$.
(iv) $\bigl(h, g'(\bar x)\,h\bigr) \in T_{\mathcal F}(\bar x, g(\bar x))$.

Indeed, (i) ⟺ (ii) is just the definition of $T_F(\bar x)$ and (ii) ⟺ (iii) follows from (4.6). The implication (iii) ⟹ (iv) is obtained from $\bigl(g(\bar x + t_n\,h_n) - g(\bar x)\bigr)/t_n \to g'(\bar x)\,h$. Finally, (iv) ⟹ (i) follows from (4.6).
An easy computation shows
$$T_{\mathcal C}(\bar x, g(\bar x)) = X \times T_C(g(\bar x)), \qquad T_{\mathcal K}(\mathcal G(\bar x, g(\bar x))) = \{0\} \times T_{\operatorname{gph} N_K(\cdot)}(G(\bar x), H(\bar x)).$$

Now, it is straightforward to show that $h$ belongs to the right-hand side of (4.2) if and only if $(h, g'(\bar x)\,h)$ belongs to the right-hand side of (4.5). This shows the equivalence of (4.2) and (4.5).

To obtain the last assertion, we mention that the local optimality of $\bar x$ implies $f'(\bar x)\,h \ge 0$ for all $h \in T_F(\bar x)$. Together with (4.2), this yields that $h = 0$ is a local minimizer of (4.3). $\square$

For convenience, we mention that the CQ (3.5) (which is equivalent to (3.4) in the case that $W$ is closed and convex) with the setting (4.1) is given by
$$Y \times Z \times Z^* = \bigl(g'(\bar x), G'(\bar x), H'(\bar x)\bigr)\,X - \bigl(\mathcal R_C(g(\bar x)) \times W\bigr). \tag{4.7}$$
Note that $W \subseteq Z \times Z^*$.

In the case that the linearized constraints of (MPCC) are surjective, we obtain the satisfaction of the CQ (4.7) and, consequently, of (3.4).

Corollary 4.2. Let $\bar x$ be a feasible point of (MPCC), such that the operator $\bigl(g'(\bar x), G'(\bar x), H'(\bar x)\bigr) \in \mathcal L(X,\ Y \times Z \times Z^*)$ is surjective. Then, (3.4) is satisfied with the setting $W = \{0\} \subseteq Z \times Z^*$ and (4.1).

The constraints in the linearized problem (4.3) contain the tangent cone of the graph of the normal cone mapping. Typically, this tangent cone will again contain complementarity constraints and the linearized problem (4.3) will again be an MPCC. Since $h = 0$ is a local minimizer and since every cone is polyhedric at the origin, the optimality conditions of [Wachsmuth, 2015, Section 5.1] will possess a reasonable strength. Hence, we propose to obtain optimality conditions for the local minimizer $\bar x$ of (MPCC) by using the optimality conditions of [Wachsmuth, 2015, Section 5.1] for the minimizer $h = 0$ of (4.3).

However, it is not possible to characterize the tangent cone $T_{\operatorname{gph} N_K(\cdot)}$ in the general case. A typical Hilbert space situation was already discussed in Section 3.2, see, in particular, Lemma 3.10. In this situation, we have the following result.

Corollary 4.3. Let us assume that $Z$ is a Hilbert space and let $\bar x$ be a local minimizer of (MPCC). Further, assume that the assertions of Theorem 4.1 and of Lemma 3.10 (with $(\bar z_1, \bar z_2) := (G(\bar x), H(\bar x))$) are satisfied.
Then, $(h, \Pi) = 0$ is a local solution of
$$\begin{aligned}
&\text{Minimize } && f'(\bar x)\,h, \quad\text{with respect to } (h, \Pi) \in X \times Z,\\
&\text{subject to } && g'(\bar x)\,h \in T_C(g(\bar x)), \qquad G'(\bar x)\,h - L^2\,\Pi = 0,\\
& && \Pi \in K_K(G(\bar x), H(\bar x)), \qquad G'(\bar x)\,h + H'(\bar x)\,h - \Pi \in K_K(G(\bar x), H(\bar x))^\circ,\\
& && \bigl(\Pi,\ G'(\bar x)\,h + H'(\bar x)\,h - \Pi\bigr) = 0.
\end{aligned} \tag{4.8}$$

4.2 Optimality conditions for the linearized MPCC

In this section we derive optimality conditions for (MPCC) via optimality conditions for the linearized problem (4.3). In the Hilbert space setting of Corollary 4.3, (4.3) can be written as (4.8), which is again an MPCC in which all constraint functions are linear. Moreover, since $(h, \Pi) = 0$ is a solution,

and since every cone is polyhedric at the origin, the optimality conditions of Wachsmuth [2015] possess a reasonable strength. Hence, we recall the main results from Wachsmuth [2015] concerning MPCCs in Banach spaces. We only deal with the case that the problem is linear and that the local minimizer under consideration is $h = 0$, since we are going to apply these results to (4.8).

We consider the MPCC
$$\text{Minimize } l(h), \quad\text{subject to } A\,h \in C, \quad G\,h \in K, \quad H\,h \in K^\circ, \quad \langle G\,h, H\,h\rangle = 0. \tag{4.9}$$
Here, $A : X \to Y$, $G : X \to Z$, $H : X \to Z$ are bounded linear maps, $X$ and $Y$ are Banach spaces, $Z$ is a Hilbert space, and $l \in X^*$. The sets $C \subseteq Y$ and $K \subseteq Z$ are assumed to be closed, convex cones. We assume that $h = 0$ is a local minimizer of (4.9).

In [Wachsmuth, 2015, Section 5] the tightened nonlinear program (TNLP) of (4.9) (actually, the TNLP is a linear conic program, since (4.9) is already linear) at $h = 0$ is defined as
$$\text{Minimize } l(h), \quad\text{subject to } A\,h \in C, \quad G\,h \in K, \quad H\,h \in K^\circ. \tag{4.10}$$
Suppose that the CQs
$$Y \times Z \times Z = (A, G, H)\,X - C \times K \times K^\circ, \tag{4.11a}$$
$$Y \times Z \times Z = \operatorname{cl}\bigl[(A, G, H)\,X - \bigl(C \cap (-C)\bigr) \times \bigl(K \cap (-K)\bigr) \times \bigl(K^\circ \cap (-K^\circ)\bigr)\bigr] \tag{4.11b}$$
are satisfied. Note that (4.11a) is the CQ of Zowe and Kurcyusz [1979] for (4.10) at $h = 0$, whereas (4.11b) is a CQ ensuring unique multipliers for (4.10), see [Wachsmuth, 2015, Section 4]. This latter CQ (4.11b) reduces to the nondegeneracy condition if $Y$ and $Z$ are finite dimensional, see [Wachsmuth, 2015, Remark 4.1]. Moreover, the nondegeneracy condition (4.11b) implies (4.11a) in the finite-dimensional case.

Then, there exist multipliers $\lambda \in Y^*$, $\mu, \nu \in Z$ associated with the local minimizer $h = 0$, such that the system of strong stationarity
$$l + A^*\lambda + G^*\mu + H^*\nu = 0, \tag{4.12a}$$
$$\lambda \in C^\circ, \tag{4.12b}$$
$$\mu \in K^\circ, \tag{4.12c}$$
$$\nu \in K \tag{4.12d}$$
is satisfied, see [Wachsmuth, 2015, Theorem 5.2 and Proposition 5.1]. Since the cone $K$ is polyhedric w.r.t. $(0, 0)$, these optimality conditions possess a reasonable strength, see [Wachsmuth, 2015, Section 5.2]. In Theorem 4.5, these results will be applied to (4.8).
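To see the system (4.12) at work on a concrete instance, consider a toy version of (4.9) with $X = \mathbb R^2$, $Z = \mathbb R$, $K = \mathbb R_+$ and no $(A, C)$-constraint. All concrete data below, and the sign convention $l + G^*\mu + H^*\nu = 0$ with $\mu \in K^\circ$, $\nu \in K$, are assumptions of this sketch:

```python
import numpy as np

# Toy instance of (4.9): X = R^2, Z = R, K = R_+, K° = R_-,
# G h = h_1, H h = h_2 and l(h) = h_1 - h_2.
l = np.array([1.0, -1.0])
G = np.array([1.0, 0.0])   # G h = h_1 ∈ K  = R_+
H = np.array([0.0, 1.0])   # H h = h_2 ∈ K° = R_-

# h = 0 is a global minimizer: the feasible set is the union of the two
# branches {h_1 >= 0, h_2 = 0} and {h_1 = 0, h_2 <= 0}, and l(h) >= 0 on both
for h1 in np.linspace(0, 2, 21):
    assert l @ np.array([h1, 0.0]) >= 0
for h2 in np.linspace(-2, 0, 21):
    assert l @ np.array([0.0, h2]) >= 0

# strong stationarity (4.12): l + G*mu + H*nu = 0 with mu ∈ K°, nu ∈ K
mu, nu = -1.0, 1.0
assert np.allclose(l + mu * G + nu * H, 0)
assert mu <= 0 and nu >= 0
```

Here the multipliers are unique, in line with the nondegeneracy-type CQ (4.11b), which holds trivially for this instance.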
We use the setting
$$\begin{aligned}
&\mathbf X = X \times Z, \qquad && \mathbf A(h, \Pi) = \bigl(g'(\bar x)\,h,\ G'(\bar x)\,h - L^2\,\Pi\bigr), \qquad && \mathbf C = T_C(g(\bar x)) \times \{0\},\\
&\mathbf Y = Y \times Z, \qquad && \mathbf G(h, \Pi) = \Pi, \qquad && \mathbf K = K_K(G(\bar x), H(\bar x)),\\
&\mathbf Z = Z, \qquad && \mathbf H(h, \Pi) = G'(\bar x)\,h + H'(\bar x)\,h - \Pi \qquad &&
\end{aligned} \tag{4.13}$$
to recast (4.8) in the form (4.9). The next lemma gives an equivalent characterization of the CQs (4.11) applied to (4.8) via this setting (4.13).

Lemma 4.4. By applying the setting (4.13), the CQs (4.11) are equivalent to
$$\begin{pmatrix} Y \\ Z \\ Z \end{pmatrix} = \begin{pmatrix} g'(\bar x) & -I & 0 & 0 \\ G'(\bar x) & 0 & -L^2 & 0 \\ H'(\bar x) & 0 & -(I - L^2) & -I \end{pmatrix} \begin{pmatrix} X \\ T_C(g(\bar x)) \\ K_K(G(\bar x), H(\bar x)) \\ K_K(G(\bar x), H(\bar x))^\circ \end{pmatrix}, \tag{4.14a}$$
$$\begin{pmatrix} Y \\ Z \\ Z \end{pmatrix} = \operatorname{cl}\left\{ \begin{pmatrix} g'(\bar x) & -I & 0 & 0 \\ G'(\bar x) & 0 & -L^2 & 0 \\ H'(\bar x) & 0 & -(I - L^2) & -I \end{pmatrix} \begin{pmatrix} X \\ T_C(g(\bar x)) \cap \bigl(-T_C(g(\bar x))\bigr) \\ K_K(G(\bar x), H(\bar x)) \cap \bigl(-K_K(G(\bar x), H(\bar x))\bigr) \\ K_K(G(\bar x), H(\bar x))^\circ \cap \bigl(-K_K(G(\bar x), H(\bar x))^\circ\bigr) \end{pmatrix} \right\}. \tag{4.14b}$$
Here, the entries $0$ refer to the zero map between the corresponding spaces.

Proof. "(4.14a) ⟹ (4.11a)": We have to show that for all $(y, z_1, z_2, z_3) \in Y \times Z \times Z \times Z$ there are $(\tilde x, \tilde z_1) \in X \times Z$, $\tilde y \in T_C(g(\bar x))$, $\tilde z_2 \in K_K(G(\bar x), H(\bar x))$ and $\tilde z_3 \in K_K(G(\bar x), H(\bar x))^\circ$ such that
$$y = g'(\bar x)\,\tilde x - \tilde y, \qquad z_1 = G'(\bar x)\,\tilde x - L^2\,\tilde z_1, \qquad z_2 = \tilde z_1 - \tilde z_2, \qquad z_3 = \bigl(G'(\bar x) + H'(\bar x)\bigr)\,\tilde x - \tilde z_1 - \tilde z_3 \tag{$*$}$$
is satisfied. Since (4.14a) is satisfied, we find $\tilde x \in X$, $\tilde y \in T_C(g(\bar x))$, $\hat z_2 \in K_K(G(\bar x), H(\bar x))$ and $\hat z_3 \in K_K(G(\bar x), H(\bar x))^\circ$ such that
$$y = g'(\bar x)\,\tilde x - \tilde y, \qquad z_1 + L^2\,z_2 = G'(\bar x)\,\tilde x - L^2\,\hat z_2, \qquad z_3 - z_1 - (L^2 - I)\,z_2 = H'(\bar x)\,\tilde x - (I - L^2)\,\hat z_2 - \hat z_3.$$
Now we set $\tilde z_1 := z_2 + \hat z_2$ and it is easy to verify that the above system ($*$) is satisfied (with $\tilde z_2 := \hat z_2$ and $\tilde z_3 := \hat z_3$). The verification of "(4.11a) ⟹ (4.14a)" is straightforward, and "(4.14b) ⟺ (4.11b)" follows from similar arguments. $\square$

Note that the lower right block on the right-hand side of the CQs (4.14) is just the largest subspace contained in $T_{\operatorname{gph} N_K(\cdot)}(G(\bar x), H(\bar x))$, see (3.19). The strength of the CQs (4.14) is discussed for the applications to SDPMPCCs and SOCMPCCs in the next two sections, see, in particular, the remarks after Definitions 5.7 and 6.9.

Now, we are in the position to state the main result of this section.

Theorem 4.5. Let the assertions of Theorem 4.1 and of Lemma 3.10 be satisfied and assume that the MPCC (4.8) satisfies the CQs (4.14). Then, there exist multipliers $\lambda \in Y^*$, $\mu, \nu \in Z$, such that the system
$$f'(\bar x) + g'(\bar x)^*\lambda + G'(\bar x)^*\mu + H'(\bar x)^*\nu = 0, \tag{4.15a}$$
$$\lambda \in T_C(g(\bar x))^\circ, \tag{4.15b}$$
$$L^2(\mu - \nu) + \nu \in K_K(G(\bar x), H(\bar x))^\circ, \tag{4.15c}$$
$$\nu \in K_K(G(\bar x), H(\bar x)) \tag{4.15d}$$
is satisfied.

Proof. Since the assertions of Theorem 4.1 and of Lemma 3.10 are satisfied, we can invoke Corollary 4.3 and obtain that $(h, \Pi) = 0$ is a local solution of (4.8).
This linearized MPCC can be transformed to (4.10) via the setting (4.13), and the corresponding CQs (4.11) are satisfied due to Lemma 4.4. Hence, the solution $(h, \Pi) = 0$ of (4.8) is strongly stationary in the sense of (4.12). That is, there exist Lagrange

multipliers $\lambda \in Y^*$, $\rho, \kappa, \nu \in Z$, such that the system
$$f'(\bar x) + g'(\bar x)^*\lambda + G'(\bar x)^*\rho + \bigl[G'(\bar x) + H'(\bar x)\bigr]^*\nu = 0, \tag{4.16a}$$
$$-L^2\,\rho + \kappa - \nu = 0, \tag{4.16b}$$
$$\lambda \in T_C(g(\bar x))^\circ, \tag{4.16c}$$
$$\kappa \in K_K(G(\bar x), H(\bar x))^\circ, \tag{4.16d}$$
$$\nu \in K_K(G(\bar x), H(\bar x)) \tag{4.16e}$$
is satisfied. By eliminating $\kappa = \nu + L^2\,\rho$ and setting $\mu = \rho + \nu$, we obtain (4.15). $\square$

We remark that (4.16) is obtained from (4.15) by defining $\rho := \mu - \nu$ and $\kappa := \nu + L^2\,\rho$. That is, the systems (4.15) and (4.16) are equivalent. Note that the conditions (4.15c) and (4.15d) are equivalent to $(\mu, \nu) \in T_{\operatorname{gph} N_K(\cdot)}(G(\bar x), H(\bar x))^\circ$, see Lemma 3.11.

Theorem 4.5 motivates the following definition of strong stationarity.

Definition 4.6. Let $\bar x$ be a feasible point of (MPCC) and let $L : Z \to Z$ be a bounded, linear and self-adjoint operator which satisfies (3.15). The point $\bar x$ is called strongly stationary if there exist multipliers $\lambda \in Y^*$, $\mu, \nu \in Z$, such that the system (4.15) is satisfied.

If $K$ is polyhedric w.r.t. $(G(\bar x), H(\bar x))$, we can choose $L = I$, since
$$\operatorname{Proj}_K'\bigl(G(\bar x) + H(\bar x);\ w\bigr) = \operatorname{Proj}_{K_K(G(\bar x), H(\bar x))}(w) \quad\text{for all } w \in Z$$
in this case, see [Haraux, 1977, Theorem 2]. Hence, Definition 4.6 is equivalent to [Wachsmuth, 2015, Definition 5.1] in the polyhedric case. We remark that it remains an open problem to define strong stationarity in the case that no bounded, linear, self-adjoint $L : Z \to Z$ satisfying (3.15) exists.

Based on Corollary 4.2, it is easy to see that the constraint qualifications of Theorem 4.5 are satisfied in the surjective case.

Lemma 4.7. Let $\bar x$ be a local minimizer of (MPCC) and let $L : Z \to Z$ be a bounded, linear and self-adjoint operator which satisfies (3.15). Further, assume that the linear operator $\bigl(g'(\bar x), G'(\bar x), H'(\bar x)\bigr) \in \mathcal L(X,\ Y \times Z \times Z)$ is surjective. Then, $\bar x$ is strongly stationary in the sense of Definition 4.6.

Now, we have seen that the optimality conditions of strongly stationary type (4.15) can be obtained under the assumptions of Theorem 4.5, and these hold, in particular, if the linearized constraints are surjective, see Lemma 4.7.
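The equivalence of (4.15) and (4.16) claimed above is elementary linear algebra. A quick numerical sanity check, with random matrices standing in for the operators $G'(\bar x)$, $H'(\bar x)$ and $L$ (all names and dimensions here are illustrative assumptions), confirms the elimination rules:

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 5, 3
Gp = rng.standard_normal((n, m))    # stands in for G'(x̄) : X → Z
Hp = rng.standard_normal((n, m))    # stands in for H'(x̄) : X → Z
Ls = rng.standard_normal((n, n))
Ls = (Ls + Ls.T) / 2                # a self-adjoint L
L2 = Ls @ Ls

mu, nu = rng.standard_normal(n), rng.standard_normal(n)
rho = mu - nu                       # multiplier transformation rho := mu - nu ...
kappa = nu + L2 @ rho               # ... and kappa := nu + L^2 rho

# (4.16b): -L^2 rho + kappa - nu = 0
assert np.allclose(-L2 @ rho + kappa - nu, 0)
# stationarity terms agree: G'* rho + (G' + H')* nu == G'* mu + H'* nu
assert np.allclose(Gp.T @ rho + (Gp + Hp).T @ nu, Gp.T @ mu + Hp.T @ nu)
# the object in (4.15c) is exactly kappa: L^2(mu - nu) + nu == kappa
assert np.allclose(L2 @ (mu - nu) + nu, kappa)
```

Since the transformation is an affine bijection between the multiplier tuples, the two systems carry the same information.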
In the remainder of this section, we are going to demonstrate that these strong stationarity conditions possess a reasonable strength, and we compare our results to [Mordukhovich et al., 2015a, Section 6.1]. As in the polyhedric case, see [Wachsmuth, 2015, Theorem 5.1], strong stationarity implies linearized B-stationarity.

Lemma 4.8. Let $\bar x$ be a feasible point of (MPCC) and let $L : Z \to Z$ be a bounded, linear and self-adjoint operator which satisfies (3.15). If $\bar x$ is strongly stationary, then
$$f'(\bar x)\,h \ge 0 \quad\text{for all } h \in T^2_{\mathrm{lin}}(\bar x),$$

where the linearization cone $T^2_{\mathrm{lin}}(\bar x)$ is given by
$$T^2_{\mathrm{lin}}(\bar x) := \left\{ h \in X : \begin{array}{l} g'(\bar x)\,h \in T_C(g(\bar x)),\ \exists\,\Pi \in K_K(\bar G, \bar H) :\\ G'(\bar x)\,h = L^2\,\Pi, \quad G'(\bar x)\,h + H'(\bar x)\,h - \Pi \in K_K(\bar G, \bar H)^\circ \end{array} \right\}$$
with $(\bar G, \bar H) = (G(\bar x), H(\bar x))$. That is, $\bar x$ is linearized B-stationary.

Note that the definition of $T^2_{\mathrm{lin}}(\bar x)$ does not contain the complementarity $\bigl(\Pi,\ G'(\bar x)\,h + H'(\bar x)\,h - \Pi\bigr) = 0$. Hence, it may be larger than $T_{\mathrm{lin}}(\bar x)$, see also (3.16).

Proof. Since $\bar x$ is assumed to be strongly stationary, there exist multipliers $\lambda \in Y^*$, $\mu, \nu \in Z$, such that (4.15) is satisfied. Let $h \in T^2_{\mathrm{lin}}(\bar x)$ with corresponding $\Pi \in Z$ be given. Then,
$$\begin{aligned}
f'(\bar x)\,h &= -\langle g'(\bar x)\,h, \lambda\rangle - \langle G'(\bar x)\,h, \mu\rangle - \langle H'(\bar x)\,h, \nu\rangle\\
&= -\langle g'(\bar x)\,h, \lambda\rangle - \langle G'(\bar x)\,h, \mu - \nu\rangle - \langle G'(\bar x)\,h + H'(\bar x)\,h, \nu\rangle\\
&= -\langle g'(\bar x)\,h, \lambda\rangle - \langle \Pi, L^2(\mu - \nu)\rangle - \langle G'(\bar x)\,h + H'(\bar x)\,h, \nu\rangle\\
&= -\langle g'(\bar x)\,h, \lambda\rangle - \langle \Pi, L^2(\mu - \nu) + \nu\rangle - \langle G'(\bar x)\,h + H'(\bar x)\,h - \Pi, \nu\rangle \ge 0. \qquad\square
\end{aligned}$$

Note that we have
$$T_F(\bar x) \subseteq T_{\mathrm{lin}}(\bar x) \subseteq T^2_{\mathrm{lin}}(\bar x),$$
see Lemma 3.1 and Lemma 3.10. Hence, under the assumptions of Lemma 4.8, we have $f'(\bar x)\,h \ge 0$ for all $h \in T_F(\bar x)$ and for all $h \in T_{\mathrm{lin}}(\bar x)$ as well. This shows that the equivalence
$$f'(\bar x)\,h \ge 0 \ \text{for all } h \in T_F(\bar x) \quad\Longleftrightarrow\quad \bar x \text{ is strongly stationary}$$
holds under the assumptions of Theorem 4.5 (or of Lemma 4.7). Hence, our definition of strong stationarity seems to be of reasonable strength.

Let us compare our results with [Mordukhovich et al., 2015a, Theorem 6.3]. To this end, we consider the problem
$$\text{Minimize } f(x, y), \quad\text{such that } y \in K,\ H(x, y) \in K^\circ,\ \text{and } \langle y, H(x, y)\rangle = 0. \tag{4.17}$$
Here, $f : \mathbb R^n \times \mathbb R^m \to \mathbb R$ is Fréchet differentiable and $H : \mathbb R^n \times \mathbb R^m \to \mathbb R^m$ is continuously Fréchet differentiable. The closed convex cone $K$ is assumed to satisfy [Mordukhovich et al., 2015a, Assumption (A1)] and the assumption of Lemma 3.10, such that we can apply the results from Mordukhovich et al. [2015a] as well as our results. We mention that (4.17) is a special case of (MPCC) (in particular, there is no cone constraint $g(x, y) \in C$ and all spaces are finite dimensional) as well as a special case of [Mordukhovich et al., 2015a, (6.5)] (one has to set $g$ to the identity function therein).
In order to obtain optimality conditions for a local minimizer $(\bar x, \bar y)$, the surjectivity of $\partial_x H(\bar x, \bar y)$ is assumed in [Mordukhovich et al., 2015a, Theorem 6.3]. By setting $G(x, y) := y$, we find that $\bigl(G'(\bar x, \bar y), H'(\bar x, \bar y)\bigr)$ is surjective as well, and therefore all our CQs are satisfied, see Lemma 4.7. By using Lemma 3.11, it is easy to check that the resulting optimality conditions of [Mordukhovich et al., 2015a, Theorem 6.3] and Theorem 4.5 are equivalent. However, we might relax the surjectivity assumption on $\partial_x H(\bar x, \bar y)$ if we use Theorem 4.5 instead of Lemma 4.7. In conclusion, both papers deal with slightly different problems, and if we apply both theories to a common special case, our CQs might be weaker.

5 Optimization with semidefinite complementarity constraints

In this section we apply the theory from Section 4 to a problem with semidefinite complementarity constraints. That is, we consider
$$\text{Minimize } f(x), \quad\text{subject to } g(x) \in C, \quad G(x) \in \mathcal S^n_+, \quad H(x) \in \mathcal S^n_-, \quad (G(x), H(x))_F = 0. \tag{SDPMPCC}$$
Here, $\mathcal S^n_+$ ($\mathcal S^n_-$) are the positive (negative) semidefinite symmetric matrices and $(\cdot\,,\cdot)_F$ is the Frobenius inner product on the space $\mathcal S^n$ of real symmetric matrices of size $n \times n$, $n \ge 1$. The associated norm on $\mathcal S^n$ is denoted by $\|\cdot\|_F$. Moreover, $f : X \to \mathbb R$ is Fréchet differentiable, $g : X \to Y$, $G, H : X \to \mathcal S^n$ are strictly Fréchet differentiable, $X, Y$ are (real) Banach spaces, and $C \subseteq Y$ is a closed, convex set. Note that $\mathcal S^n_-$ is the polar cone of $\mathcal S^n_+$ w.r.t. the Frobenius inner product, see [Ding et al., 2014, Section 2.3].

The problem (SDPMPCC) was already considered in Ding et al. [2014], Wu et al. [2014]. Both contributions address constraint qualifications and optimality conditions. However, it remained an open problem whether an analogue of MPCC-LICQ implies strong stationarity, see [Ding et al., 2014, Remark 6.1]. In this section, we are going to close this gap.

5.1 Notation and Preliminaries

We recall some known results concerning the cone of semidefinite matrices, see, e.g., Vandenberghe and Boyd [1996], Todd [2001], Sun and Sun [2002], Hiriart-Urruty and Malick [2012], Ding et al. [2014] and the references therein.

For a matrix $A \in \mathcal S^n$ we call $A = P\,\Lambda\,P^\top$ an ordered eigenvalue decomposition if $P \in \mathbb R^{n \times n}$ is an orthogonal matrix and $\Lambda$ is a diagonal matrix whose diagonal entries are ordered decreasingly. Note that this matrix $\Lambda$ is unique and we denote it by $\Lambda(A)$ whenever appropriate.

For a matrix $A \in \mathcal S^n$ and index sets $I, J \subseteq \{1, \ldots, n\}$, we denote by $A_{IJ}$ or $A_{I,J}$ the submatrix of $A$ containing the rows of $A$ with indices from $I$ and the columns with indices from $J$. If, additionally, $P$ denotes an orthogonal matrix, we define $A^P_{IJ} := A^P_{I,J} := (P^\top A\,P)_{IJ}$.
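The complementarity $(G(x), H(x))_F = 0$ appearing in (SDPMPCC) is conveniently produced and tested via the simultaneous ordered eigenvalue decomposition recalled in this section (cf. [Ding et al., 2014, Theorem 2.3]). The following sketch, assuming only NumPy, builds a complementary pair $(X, Y) \in \mathcal S^n_+ \times \mathcal S^n_-$ from an arbitrary symmetric matrix $A$:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                      # an arbitrary symmetric matrix

w, V = np.linalg.eigh(A)               # eigh returns ascending eigenvalues
idx = np.argsort(-w)                   # reorder: ordered (decreasing) decomposition
P, Lam = V[:, idx], np.diag(w[idx])
assert np.allclose(P @ Lam @ P.T, A)

# complementary pair: X = projection of A onto S^n_+, Y = projection onto S^n_-
X = P @ np.maximum(Lam, 0) @ P.T
Y = P @ np.minimum(Lam, 0) @ P.T
assert np.allclose(X + Y, A)
assert np.all(np.linalg.eigvalsh(X) >= -1e-12)   # X is positive semidefinite
assert np.all(np.linalg.eigvalsh(Y) <= 1e-12)    # Y is negative semidefinite
assert abs(np.sum(X * Y)) < 1e-10                # Frobenius complementarity (X, Y)_F = 0
assert np.allclose(np.maximum(Lam, 0) * np.minimum(Lam, 0), 0)   # Λ(X) ∘ Λ(Y) = 0
```

Conversely, every complementary pair arises in this way with $A = X + Y$, which is the fact exploited throughout this section.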
By $\circ$ we denote the Hadamard product (also called Schur product or entrywise product) of matrices, that is, $(A \circ B)_{ij} = A_{ij}\,B_{ij}$.

For two matrices $X \in \mathcal S^n_+$, $Y \in \mathcal S^n_-$, we have the complementarity $(X, Y)_F = 0$ if and only if there exists a simultaneous ordered eigenvalue decomposition of $X$ and $Y$ with $\Lambda(X) \circ \Lambda(Y) = 0$. That is, there exists an orthogonal matrix $P$ with
$$X = P\,\Lambda(X)\,P^\top, \qquad Y = P\,\Lambda(Y)\,P^\top, \qquad \Lambda(X) \circ \Lambda(Y) = 0.$$
Let $X \in \mathcal S^n_+$, $Y \in \mathcal S^n_-$ with $(X, Y)_F = 0$ be given and let $P\,\Lambda\,P^\top = X + Y$ be an ordered eigenvalue decomposition. Note that $P \max(\Lambda, 0)\,P^\top = X$ and $P \min(\Lambda, 0)\,P^\top = Y$ are ordered eigenvalue decompositions of $X$ and $Y$, see [Ding et al., 2014, Theorem 2.3]. Here, max and min are understood entrywise. We denote by $(\alpha, \beta, \gamma)$ the index sets corresponding to the positive, zero and negative eigenvalues of $X + Y$. Then, we have
$$T_{\mathcal S^n_+}(X) = \{H \in \mathcal S^n : H^P_{\beta\cup\gamma,\,\beta\cup\gamma} \succeq 0\},$$

see, e.g., [Shapiro, 1997, (26)] or [Hiriart-Urruty and Malick, 2012, (9)]. Consequently, we have
$$K_{\mathcal S^n_+}(X, Y) = \{H \in \mathcal S^n : H^P_{\beta\beta} \succeq 0,\ H^P_{\beta\gamma} = 0,\ H^P_{\gamma\gamma} = 0\}, \tag{5.1a}$$
$$K_{\mathcal S^n_+}(X, Y)^\circ = \{H \in \mathcal S^n : H^P_{\alpha\alpha} = 0,\ H^P_{\alpha\beta} = 0,\ H^P_{\alpha\gamma} = 0,\ H^P_{\beta\beta} \preceq 0\}. \tag{5.1b}$$

Moreover, it is well known that $\operatorname{Proj}_{\mathcal S^n_+}$ is directionally differentiable. For $A \in \mathcal S^n$ with ordered eigenvalue decomposition $A = P\,\Lambda\,P^\top$, the directional derivative is given by
$$\operatorname{Proj}_{\mathcal S^n_+}'(A;\ H) = P \begin{pmatrix} H^P_{\alpha\alpha} & H^P_{\alpha\beta} & \Sigma_{\alpha\gamma} \circ H^P_{\alpha\gamma} \\ H^P_{\beta\alpha} & \operatorname{Proj}_{\mathcal S^{|\beta|}_+}(H^P_{\beta\beta}) & 0 \\ \Sigma_{\gamma\alpha} \circ H^P_{\gamma\alpha} & 0 & 0 \end{pmatrix} P^\top,$$
see [Sun and Sun, 2002, Theorem 4.7] and [Ding et al., 2014, (14)]. Here, $\Sigma \in \mathcal S^n$ is defined by
$$\Sigma_{ij} = 1 \quad\text{for } (i, j) \in (\alpha \times \alpha) \cup (\alpha \times \beta) \cup (\beta \times \alpha) \cup (\beta \times \beta), \tag{5.2a}$$
$$\Sigma_{ij} = 0 \quad\text{for } (i, j) \in (\gamma \times \gamma) \cup (\gamma \times \beta) \cup (\beta \times \gamma), \tag{5.2b}$$
$$\Sigma_{ij} = \frac{\max(\Lambda_{ii}, 0) - \max(\Lambda_{jj}, 0)}{\Lambda_{ii} - \Lambda_{jj}} \quad\text{for } (i, j) \in (\alpha \times \gamma) \cup (\gamma \times \alpha). \tag{5.2c}$$
Note that (3.15) is satisfied by the bounded, linear, self-adjoint operator $L : \mathcal S^n \to \mathcal S^n$ defined by
$$L\,H = P\,\bigl(\sqrt\Sigma \circ (P^\top H\,P)\bigr)\,P^\top.$$
Here, the matrix $\sqrt\Sigma$ is defined entrywise by $(\sqrt\Sigma)_{ij} = \sqrt{\Sigma_{ij}}$. We remark that (3.20) is satisfied by this choice of $L$. For later reference, we remark
$$L^2\,H = P\,\bigl(\Sigma \circ (P^\top H\,P)\bigr)\,P^\top. \tag{5.3}$$

5.2 Constraint Qualifications and Strong Stationarity

As a first result, we provide a tangent approximation set for $\operatorname{gph} N_{\mathcal S^n_+}(\cdot)$ by means of Lemma 3.8.

Lemma 5.1. Let $A \in \mathcal S^n$ be given and let $A = P\,\Lambda\,P^\top$ be an ordered eigenvalue decomposition. Let $(\alpha, \beta, \gamma)$ be the index sets corresponding to the positive, zero and negative eigenvalues of $A$. We define $W = \{H \in \mathcal S^n : H^P_{\beta\beta} = 0\}$ and $\Sigma \in \mathcal S^n$ by (5.2). Then, for all $\varepsilon > 0$ there is $\delta > 0$ such that
$$\bigl\|\operatorname{Proj}_{\mathcal S^n_+}(\tilde A + H) - \operatorname{Proj}_{\mathcal S^n_+}(\tilde A) - P\,\bigl(\Sigma \circ (P^\top H\,P)\bigr)\,P^\top\bigr\|_F \le \varepsilon\,\|H\|_F \tag{5.4}$$
for all $\tilde A \in \mathcal S^n$ and $H \in W$ with $\|A - \tilde A\|_F \le \delta$ and $\|H\|_F \le \delta$.

Proof. We first consider the case that $A$ is already diagonal with decreasingly ordered diagonal, that is, $A = \Lambda$ and $P = I$. In this proof, $M$ denotes a generic constant which may change from line to line. Let $\hat\gamma > 0$ be chosen such that $\hat\gamma < \Lambda_{ii}/2$ for all $i \in \alpha$.