arxiv: v1 [math.oc] 24 Nov 2016


An SQP method for mathematical programs with vanishing constraints with strong convergence properties

Matúš Benko, Helmut Gfrerer

Institute of Computational Mathematics, Johannes Kepler University Linz, A-4040 Linz, Austria, benko@numa.uni-linz.ac.at, helmut.gfrerer@jku.at

Abstract

We propose an SQP algorithm for mathematical programs with vanishing constraints which solves at each iteration a quadratic program with linear vanishing constraints. The algorithm is based on the newly developed concept of Q-stationarity [5]. We demonstrate how $Q_M$-stationary solutions of the quadratic program can be obtained. We show that all limit points of the sequence of iterates generated by the basic SQP method are at least M-stationary, and by some extension of the method we also guarantee the stronger property of $Q_M$-stationarity of the limit points.

Key words: SQP method, mathematical programs with vanishing constraints, Q-stationarity, $Q_M$-stationarity

AMS subject classifications: 49M37, 90C26, 90C55

1 Introduction

Consider the following mathematical program with vanishing constraints (MPVC)
\[
\min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad
\begin{array}{ll}
h_i(x) = 0, & i \in E,\\
g_i(x) \le 0, & i \in I,\\
H_i(x) \ge 0,\ G_i(x) H_i(x) \le 0, & i \in V,
\end{array}
\tag{1}
\]
with continuously differentiable functions $f$, $h_i$, $i \in E$, $g_i$, $i \in I$, $G_i$, $H_i$, $i \in V$, and finite index sets $E$, $I$ and $V$.

Theoretically, MPVCs can be viewed as standard nonlinear optimization problems, but due to the vanishing constraints, many of the standard constraint qualifications of nonlinear programming are violated at any feasible point $\bar x$ with $H_i(\bar x) = G_i(\bar x) = 0$ for some $i \in V$. On the other hand, by introducing slack variables, MPVCs may be reformulated as so-called mathematical programs with complementarity constraints (MPCCs), see [7]. However, this approach is also not satisfactory, as it has turned out that MPCCs are in fact even more difficult to handle than MPVCs. This makes it necessary, both from a theoretical and a numerical point of view, to consider specially tailored algorithms for solving MPVCs.

Recent numerical methods follow different directions. A smoothing-continuation method and a regularization approach for MPCCs are considered in [6, 0], and a combination of these techniques, a smoothing-regularization approach for MPVCs, is investigated in []. In [8, 3] the relaxation method has been suggested in order to deal with the inherent difficulties of MPVCs.

In this paper, we carry over a well known SQP method from nonlinear programming to MPVCs. We proceed in a similar manner as in [4], where an SQP method for MPCCs was introduced by Benko and Gfrerer. The main task of our method is to solve in each iteration step a quadratic program with linear vanishing constraints, a so-called auxiliary problem. Then we compute the next iterate by reducing a certain merit function along some polygonal line which is given by
the solution procedure for the auxiliary problem.

To solve the auxiliary problem we exploit the new concept of $Q_M$-stationarity introduced in the recent paper by Benko and Gfrerer [5]. $Q_M$-stationarity is in general stronger than M-stationarity, and it turns out to be very suitable for a numerical approach, as it allows us to handle the program with vanishing constraints without relying on enumeration techniques. Surprisingly, we compute at least a $Q_M$-stationary solution of the auxiliary problem just by means of quadratic programming, by solving appropriate convex subproblems.

Next we study the convergence of the SQP method. We show that every limit point of the generated sequence is at least M-stationary. Moreover, we consider an extended version of our SQP method, where at each iterate a correction of the iterate is made to prevent the method from converging to undesired points. Consequently we show that under some additional assumptions all limit points are at least $Q_M$-stationary. Numerical tests indicate that our method behaves very reliably.

A short outline of this paper is as follows. In section 2 we recall the basic stationarity concepts for MPVCs as well as the recently developed concepts of Q- and $Q_M$-stationarity. In section 3 we describe an algorithm based on quadratic programming for solving the auxiliary problem occurring in every iteration of our SQP method. We prove the finiteness and summarize some other properties of this algorithm. In section 4 we propose the basic SQP method. We describe how the next iterate is computed by means of the solution of the auxiliary problem and we consider the convergence of the overall algorithm. In section 5 we consider the extended version of the overall algorithm and we discuss its convergence. Section 6 is a summary of numerical results we obtained by implementing our basic algorithm in MATLAB and by testing it on a subset of test problems considered in the thesis of Hoheisel [7].

In what follows we use the following notation. Given a set $M$ we denote by $\mathcal{P}(M) := \{(M_1, M_2) \mid M_1 \cup M_2 = M,\ M_1 \cap M_2 = \emptyset\}$ the collection of all partitions of $M$. Further, for a real number $a$ we use the notation $(a)^+ := \max(0, a)$, $(a)^- := \min(0, a)$. For a vector $u = (u_1, u_2, \ldots, u_m)^T \in \mathbb{R}^m$ we define $|u|$, $(u)^+$, $(u)^-$ componentwise, i.e. $|u| := (|u_1|, |u_2|, \ldots, |u_m|)^T$, etc. Moreover, for $u \in \mathbb{R}^m$ and $1 \le p \le \infty$ we denote the $\ell_p$ norm of $u$ by $\|u\|_p$ and we use the notation $\|u\| := \|u\|_2$ for the standard $\ell_2$ norm. Finally, given a sequence $y_k \in \mathbb{R}^m$, a point $y \in \mathbb{R}^m$ and an infinite set $K \subset \mathbb{N}$ we write $y_k \xrightarrow{K} y$ instead of $\lim_{k \to \infty,\, k \in K} y_k = y$.

2 Stationary points for MPVCs

Given a point $\bar x$ feasible for (1) we define the following index sets
\[
\begin{array}{ll}
I^g(\bar x) := \{i \in I \mid g_i(\bar x) = 0\}, & \\
I^{0+}(\bar x) := \{i \in V \mid H_i(\bar x) = 0 < G_i(\bar x)\}, & I^{0-}(\bar x) := \{i \in V \mid H_i(\bar x) = 0 > G_i(\bar x)\},\\
I^{+0}(\bar x) := \{i \in V \mid H_i(\bar x) > 0 = G_i(\bar x)\}, & I^{00}(\bar x) := \{i \in V \mid H_i(\bar x) = 0 = G_i(\bar x)\},\\
I^{+-}(\bar x) := \{i \in V \mid H_i(\bar x) > 0 > G_i(\bar x)\}. &
\end{array}
\tag{2}
\]
In contrast to nonlinear programming there exist a lot of stationarity concepts for MPVCs.

Definition 2.1. Let $\bar x$ be feasible for (1). Then $\bar x$ is called

1. weakly stationary, if there are multipliers $\lambda_i^g$, $i \in I$, $\lambda_i^h$, $i \in E$, $\lambda_i^G, \lambda_i^H$, $i \in V$, such that
\[
\nabla f(\bar x)^T + \sum_{i \in E} \lambda_i^h \nabla h_i(\bar x)^T + \sum_{i \in I} \lambda_i^g \nabla g_i(\bar x)^T + \sum_{i \in V} \left( -\lambda_i^H \nabla H_i(\bar x)^T + \lambda_i^G \nabla G_i(\bar x)^T \right) = 0
\tag{3}
\]
and
\[
\begin{array}{l}
\lambda_i^g g_i(\bar x) = 0,\ i \in I, \quad \lambda_i^H H_i(\bar x) = 0,\ i \in V, \quad \lambda_i^G G_i(\bar x) = 0,\ i \in V,\\
\lambda_i^g \ge 0,\ i \in I, \quad \lambda_i^H \ge 0,\ i \in I^{0-}(\bar x), \quad \lambda_i^G \ge 0,\ i \in I^{00}(\bar x) \cup I^{+0}(\bar x).
\end{array}
\tag{4}
\]

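2. M-stationary, if it is weakly stationary and
\[
\lambda_i^H \lambda_i^G = 0, \quad i \in I^{00}(\bar x).
\tag{5}
\]

3. Q-stationary with respect to $(\beta^1, \beta^2)$, where $(\beta^1, \beta^2)$ is a given partition of $I^{00}(\bar x)$, if there exist two multipliers $\overline{\lambda} = (\overline{\lambda}^h, \overline{\lambda}^g, \overline{\lambda}^H, \overline{\lambda}^G)$ and $\underline{\lambda} = (\underline{\lambda}^h, \underline{\lambda}^g, \underline{\lambda}^H, \underline{\lambda}^G)$, both fulfilling (3) and (4), such that
\[
\overline{\lambda}_i^G = 0,\ \underline{\lambda}_i^H, \underline{\lambda}_i^G \ge 0,\ i \in \beta^1; \qquad
\overline{\lambda}_i^H, \overline{\lambda}_i^G \ge 0,\ \underline{\lambda}_i^G = 0,\ i \in \beta^2.
\tag{6}
\]

4. Q-stationary, if there is some partition $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$ such that $\bar x$ is Q-stationary with respect to $(\beta^1, \beta^2)$.

5. $Q_M$-stationary, if it is Q-stationary and at least one of the multipliers $\overline{\lambda}$ and $\underline{\lambda}$ fulfills the M-stationarity condition (5).

6. S-stationary, if it is weakly stationary and
\[
\lambda_i^H \ge 0, \quad \lambda_i^G = 0, \quad i \in I^{00}(\bar x).
\]

The concepts of Q-stationarity and $Q_M$-stationarity were introduced in the recent paper by Benko and Gfrerer [5], whereas the other stationarity concepts are very common in the literature, see e.g. [, 7, 8]. The following implications hold:

S-stationarity $\Rightarrow$ Q-stationarity with respect to every $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$ $\Rightarrow$ Q-stationarity w.r.t. $(\emptyset, I^{00}(\bar x))$ $\Rightarrow$ $Q_M$-stationarity $\Rightarrow$ M-stationarity $\Rightarrow$ weak stationarity.

The first implication follows from the fact that the multiplier corresponding to S-stationarity fulfills the requirements for both $\overline{\lambda}$ and $\underline{\lambda}$. The third implication holds because for $(\beta^1, \beta^2) = (\emptyset, I^{00}(\bar x))$ the multiplier $\underline{\lambda}$ fulfills (5), since $\underline{\lambda}_i^G = 0$ for $i \in I^{00}(\bar x)$.

Note that the S-stationarity conditions are nothing else than the Karush-Kuhn-Tucker conditions for the problem (1). As we will demonstrate in the next theorems, a local minimizer is S-stationary only under some comparatively stronger constraint qualification, while it is $Q_M$-stationary under very weak constraint qualifications. Before stating the theorems we recall some common definitions.

Denoting
\[
F_i(x) := (-H_i(x), G_i(x))^T,\ i \in V, \qquad P := \{(a, b) \in \mathbb{R}_- \times \mathbb{R} \mid ab \ge 0\},
\tag{7}
\]
\[
\mathcal{F}(x) := (h(x)^T, g(x)^T, F(x)^T)^T, \qquad D := \{0\}^{|E|} \times \mathbb{R}_-^{|I|} \times P^{|V|},
\tag{8}
\]
we see that problem (1) can be rewritten as
\[
\min f(x) \quad \text{subject to} \quad x \in \Omega_V := \{x \in \mathbb{R}^n \mid \mathcal{F}(x) \in D\}.
\]

Recall that the contingent (also tangent) cone to a closed set $\Omega \subset \mathbb{R}^m$ at $u \in \Omega$ is defined by
\[
T_\Omega(u) := \{d \in \mathbb{R}^m \mid \exists (d_k) \to d,\ \exists (\tau_k) \downarrow 0 : u + \tau_k d_k \in \Omega\ \forall k\}.
\]
The linearized cone to $\Omega_V$ at $\bar x \in \Omega_V$ is then defined as $T^{\mathrm{lin}}_{\Omega_V}(\bar x) := \{d \in \mathbb{R}^n \mid \nabla \mathcal{F}(\bar x) d \in T_D(\mathcal{F}(\bar x))\}$. Further recall that $\bar x \in \Omega_V$ is called B-stationary if
\[
\nabla f(\bar x) d \ge 0 \quad \forall d \in T_{\Omega_V}(\bar x).
\]
Every local minimizer is known to be B-stationary.

Definition 2.2. Let $\bar x$ be feasible for (1), i.e. $\bar x \in \Omega_V$. We say that the generalized Guignard constraint qualification (GGCQ) holds at $\bar x$, if the polar cone of $T_{\Omega_V}(\bar x)$ equals the polar cone of $T^{\mathrm{lin}}_{\Omega_V}(\bar x)$.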
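The index sets (2) drive every stationarity concept in Definition 2.1, so it can help to see them computed. The following is only an illustrative sketch and not part of the paper's implementation: the function name `classify_vanishing_indices` and the tolerance `tol` are assumptions introduced here, and the inputs are the values $H_i(\bar x)$, $G_i(\bar x)$ at a feasible point.

```python
def classify_vanishing_indices(H, G, tol=1e-9):
    """Sort the indices i in V into the sets I^{0+}, I^{0-}, I^{+0}, I^{00}, I^{+-} of (2).

    H, G: sequences with the values H_i(x), G_i(x) at a point x.
    tol:  hypothetical tolerance deciding when a value counts as zero.
    Raises ValueError if x violates H_i >= 0 or G_i * H_i <= 0.
    """
    sets = {"0+": [], "0-": [], "+0": [], "00": [], "+-": []}
    for i, (Hi, Gi) in enumerate(zip(H, G)):
        # Feasibility for (1): H_i >= 0 and G_i * H_i <= 0.
        if Hi < -tol or (Hi > tol and Gi > tol):
            raise ValueError(f"index {i} is infeasible: H={Hi}, G={Gi}")
        h_zero = abs(Hi) <= tol
        g_zero = abs(Gi) <= tol
        if h_zero and g_zero:
            key = "00"          # both vanish: the critical (biactive) case
        elif h_zero:
            key = "0+" if Gi > 0 else "0-"
        elif g_zero:
            key = "+0"
        else:
            key = "+-"          # H_i > 0 > G_i
        sets[key].append(i)
    return sets


if __name__ == "__main__":
    print(classify_vanishing_indices([0, 0, 2, 0, 3], [1, -1, 0, 0, -4]))
```

Only the indices in the set `"00"` give rise to the partitions $(\beta^1, \beta^2)$ used in Q-stationarity; if that set is empty, all the stationarity concepts above coincide with the KKT conditions.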

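Theorem 2.1 (c.f. [5, Theorem 8]). Assume that GGCQ is fulfilled at the point $\bar x \in \Omega_V$. If $\bar x$ is B-stationary, then $\bar x$ is Q-stationary for (1) with respect to every partition $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$ and it is also $Q_M$-stationary.

Theorem 2.2 (c.f. [5, Theorem 8]). If $\bar x$ is Q-stationary with respect to a partition $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$, such that for every $j \in \beta^1$ there exists some $z^j$ fulfilling
\[
\begin{array}{l}
\nabla h(\bar x) z^j = 0, \quad \nabla g_i(\bar x) z^j = 0,\ i \in I^g(\bar x), \quad \nabla G_i(\bar x) z^j = 0,\ i \in I^{+0}(\bar x),\\
\nabla G_i(\bar x) z^j \ \begin{cases} \ge 0, & i \in \beta^1,\\ \le 0, & i \in \beta^2, \end{cases}\\
\nabla H_i(\bar x) z^j = 0,\ i \in I^{0-}(\bar x) \cup I^{00}(\bar x) \cup I^{0+}(\bar x) \setminus \{j\}, \quad \nabla H_j(\bar x) z^j = -1,
\end{array}
\tag{9}
\]
and there is some $\bar z$ such that
\[
\begin{array}{l}
\nabla h(\bar x) \bar z = 0, \quad \nabla g_i(\bar x) \bar z = 0,\ i \in I^g(\bar x), \quad \nabla G_i(\bar x) \bar z = 0,\ i \in I^{+0}(\bar x),\\
\nabla G_i(\bar x) \bar z \ \begin{cases} \ge 0, & i \in \beta^1,\\ \le -1, & i \in \beta^2, \end{cases}\\
\nabla H_i(\bar x) \bar z = 0,\ i \in I^{0-}(\bar x) \cup I^{00}(\bar x) \cup I^{0+}(\bar x),
\end{array}
\tag{10}
\]
then $\bar x$ is S-stationary and consequently also B-stationary.

Note that these two theorems together also imply that a local minimizer $\bar x \in \Omega_V$ is S-stationary provided GGCQ is fulfilled at $\bar x$ and there exists a partition $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$, such that for every $j \in \beta^1$ there exists $z^j$ fulfilling (9) and $\bar z$ fulfilling (10). Moreover, note that (9) and (10) are fulfilled for every partition $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$ e.g. if the gradients of active constraints are linearly independent. On the other hand, in the special case of the partition $(\emptyset, I^{00}(\bar x)) \in \mathcal{P}(I^{00}(\bar x))$, these conditions read as the requirement that the system
\[
\begin{array}{l}
\nabla h(\bar x) \bar z = 0, \quad \nabla g_i(\bar x) \bar z = 0,\ i \in I^g(\bar x), \quad \nabla G_i(\bar x) \bar z = 0,\ i \in I^{+0}(\bar x),\\
\nabla G_i(\bar x) \bar z \le -1,\ i \in I^{00}(\bar x), \quad \nabla H_i(\bar x) \bar z = 0,\ i \in I^{0-}(\bar x) \cup I^{00}(\bar x) \cup I^{0+}(\bar x)
\end{array}
\]
has a solution, which resembles the well-known Mangasarian-Fromovitz constraint qualification (MFCQ) of nonlinear programming, and it seems to be a rather weak and possibly often fulfilled assumption.

Finally, we recall the definitions of normal cones. The regular normal cone to a closed set $\Omega \subset \mathbb{R}^m$ at $u \in \Omega$ can be defined as the polar cone to the tangent cone by
\[
\widehat{N}_\Omega(u) := (T_\Omega(u))^\circ = \{z \in \mathbb{R}^m \mid (z, d) \le 0\ \forall d \in T_\Omega(u)\}.
\]
The limiting normal cone to a closed set $\Omega \subset \mathbb{R}^m$ at $u \in \Omega$ is given by
\[
N_\Omega(u) := \{z \in \mathbb{R}^m \mid \exists u_k \to u,\ z_k \to z \text{ with } u_k \in \Omega,\ z_k \in \widehat{N}_\Omega(u_k)\ \forall k\}.
\tag{11}
\]
In case when $\Omega$ is a convex set, the regular and the limiting normal cone coincide with the classical normal cone of convex analysis, i.e.
\[
\widehat{N}_\Omega(u) = N_\Omega(u) = \{z \in \mathbb{R}^m \mid (z, u - v) \ge 0\ \forall v \in \Omega\}.
\tag{12}
\]
Well-known is also the following description of the limiting normal cone
\[
N_\Omega(u) = \{z \in \mathbb{R}^m \mid \exists u_k \to u,\ z_k \to z \text{ with } u_k \in \Omega,\ z_k \in N_\Omega(u_k)\ \forall k\}.
\tag{13}
\]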

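We conclude this section by the following characterization of M- and Q-stationarity via the limiting normal cone. Straightforward calculations yield that
\[
N_P(F_i(\bar x)) =
\begin{cases}
\mathbb{R}_+ \times \{0\} & \text{if } i \in I^{0-}(\bar x),\\
(\mathbb{R} \times \{0\}) \cup (\{0\} \times \mathbb{R}_+) & \text{if } i \in I^{00}(\bar x),\\
\mathbb{R} \times \{0\} & \text{if } i \in I^{0+}(\bar x),\\
\{0\} \times \mathbb{R}_+ & \text{if } i \in I^{+0}(\bar x),\\
\{0\} \times \{0\} & \text{if } i \in I^{+-}(\bar x),
\end{cases}
\]
\[
N_{P^1}(F_i(\bar x)) = \mathbb{R} \times \{0\} \quad \text{if } i \in I^{0+}(\bar x) \cup I^{00}(\bar x) \cup I^{0-}(\bar x),
\]
\[
N_{P^2}(F_i(\bar x)) =
\begin{cases}
\mathbb{R}_+ \times \mathbb{R}_+ & \text{if } i \in I^{00}(\bar x),\\
N_P(F_i(\bar x)) & \text{if } i \in I^{0-}(\bar x) \cup I^{+0}(\bar x) \cup I^{+-}(\bar x),
\end{cases}
\]
where $P^1 := \{0\} \times \mathbb{R}$ and $P^2 := \mathbb{R}_- \times \mathbb{R}_-$, and hence the M-stationarity conditions (4) and (5) can be replaced by
\[
(\lambda^h, \lambda^g, \lambda^H, \lambda^G) \in N_D(\mathcal{F}(\bar x)) = \mathbb{R}^{|E|} \times \{u \in \mathbb{R}_+^{|I|} \mid (u, g(\bar x)) = 0\} \times N_{P^{|V|}}(F(\bar x))
\tag{14}
\]
and the Q-stationarity conditions (4) and (6) can be replaced by
\[
(\overline{\lambda}^h, \overline{\lambda}^g, \overline{\lambda}^H, \overline{\lambda}^G) \in \mathbb{R}^{|E|} \times \{u \in \mathbb{R}_+^{|I|} \mid (u, g(\bar x)) = 0\} \times \prod_{i \in V} \nu_i^{\beta^1, \beta^2}(\bar x),
\tag{15}
\]
\[
(\underline{\lambda}^h, \underline{\lambda}^g, \underline{\lambda}^H, \underline{\lambda}^G) \in \mathbb{R}^{|E|} \times \{u \in \mathbb{R}_+^{|I|} \mid (u, g(\bar x)) = 0\} \times \prod_{i \in V} \nu_i^{\beta^2, \beta^1}(\bar x),
\tag{16}
\]
where for $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(\bar x))$ we define
\[
\nu_i^{\beta^1, \beta^2}(\bar x) :=
\begin{cases}
N_{P^1}(F_i(\bar x)) & \text{if } i \in I^{0+}(\bar x) \cup \beta^1,\\
N_{P^2}(F_i(\bar x)) & \text{if } i \in I^{0-}(\bar x) \cup I^{+0}(\bar x) \cup I^{+-}(\bar x) \cup \beta^2.
\end{cases}
\]
Note also that for every $i \in V$ we have
\[
\nu_i^{I^{00}(\bar x), \emptyset}(\bar x) \subset N_P(F_i(\bar x)).
\tag{17}
\]

3 Solving the auxiliary problem

In this section, we describe an algorithm for solving quadratic problems with vanishing constraints of the type
\[
QPVC(\rho) \qquad
\begin{array}{ll}
\min\limits_{(s, \delta) \in \mathbb{R}^{n+1}} & \tfrac{1}{2} s^T B s + \nabla f s + \rho\left(\tfrac{1}{2}\delta^2 + \delta\right)\\[4pt]
\text{subject to} & (1 - \delta) h_i + \nabla h_i s = 0, \quad i \in E,\\
& (1 - \theta_i^g \delta) g_i + \nabla g_i s \le 0, \quad i \in I,\\
& (1 - \theta_i^H \delta) H_i + \nabla H_i s \ge 0,\\
& \left((1 - \theta_i^G \delta) G_i + \nabla G_i s\right)\left((1 - \theta_i^H \delta) H_i + \nabla H_i s\right) \le 0, \quad i \in V,\\
& -\delta \le 0.
\end{array}
\tag{18}
\]
Here the vector $\theta = (\theta^g, \theta^G, \theta^H) \in \{0, 1\}^{|I| + 2|V|} =: \mathcal{B}$ is chosen at the beginning of the algorithm such that some feasible point is known in advance, e.g. $(s, \delta) = (0, 1)$. The parameter $\rho$ has to be chosen sufficiently large and acts like a penalty parameter forcing $\delta$ to be near zero at the solution. $B$ is a symmetric positive definite $n \times n$ matrix, $\nabla f$, $\nabla h_i$, $\nabla g_i$, $\nabla G_i$, $\nabla H_i$ denote row vectors in $\mathbb{R}^n$ and $h_i$, $g_i$, $G_i$, $H_i$ are real numbers. Note that this problem is a special case of problem (1) and consequently the definitions of Q- and $Q_M$-stationarity as well as the definition of the index sets (2) remain valid.

It turns out to be much more convenient to operate with a more general notation. Let us denote by $F_i := (-H_i, G_i)^T$ a vector in $\mathbb{R}^2$, by $\nabla F_i := (-\nabla H_i^T, \nabla G_i^T)^T$ a $2 \times n$ matrix and by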
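$P^1 := \{0\} \times \mathbb{R}$ and $P^2 := \mathbb{R}_- \times \mathbb{R}_-$ two subsets of $\mathbb{R}^2$. Note that for $P$ given by (7) it holds that $P = P^1 \cup P^2$. The problem (18) can now be equivalently rewritten in the form
\[
QPVC(\rho) \qquad
\begin{array}{ll}
\min\limits_{(s, \delta) \in \mathbb{R}^{n+1}} & \tfrac{1}{2} s^T B s + \nabla f s + \rho\left(\tfrac{1}{2}\delta^2 + \delta\right)\\[4pt]
\text{subject to} & (1 - \delta) h_i + \nabla h_i s = 0, \quad i \in E,\\
& (1 - \theta_i^g \delta) g_i + \nabla g_i s \le 0, \quad i \in I,\\
& \delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P, \quad i \in V,\\
& -\delta \le 0.
\end{array}
\tag{19}
\]
For a given feasible point $(s, \delta)$ for the problem $QPVC(\rho)$ we define the following index sets
\[
\begin{array}{l}
I^1(s, \delta) := \{i \in V \mid \delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^1 \setminus P^2\} = I^{0+}(s, \delta),\\
I^2(s, \delta) := \{i \in V \mid \delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^2 \setminus P^1\} = I^{+0}(s, \delta) \cup I^{+-}(s, \delta),\\
I^0(s, \delta) := \{i \in V \mid \delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^1 \cap P^2\} = I^{0-}(s, \delta) \cup I^{00}(s, \delta),
\end{array}
\]
where the index sets $I^{0+}(s, \delta)$, $I^{+0}(s, \delta)$, $I^{+-}(s, \delta)$, $I^{0-}(s, \delta)$, $I^{00}(s, \delta)$ are given by (2). Further, consider the distance function $d$ defined by
\[
d(x, A) := \inf_{y \in A} \|x - y\|_1
\]
for $x \in \mathbb{R}^2$ and $A \subset \mathbb{R}^2$. The following proposition summarizes some well-known properties of $d$.

Proposition 3.1. Let $x \in \mathbb{R}^2$ and $A \subset \mathbb{R}^2$.

1. Let $B \subset \mathbb{R}^2$, then
\[
d(x, A \cup B) = \min\{d(x, A), d(x, B)\}.
\tag{20}
\]
In particular,
\[
d(x, P^1) = (x_1)^+ + (-x_1)^+, \quad d(x, P^2) = (x_1)^+ + (x_2)^+, \quad d(x, P) = (x_1)^+ + (\min\{-x_1, x_2\})^+.
\tag{21}
\]

2. $d(\cdot, A) : \mathbb{R}^2 \to \mathbb{R}_+$ is Lipschitz continuous with Lipschitz modulus $L = 1$ and consequently
\[
d(x, A) \le d(x + y, A) + \|y\|_1.
\tag{22}
\]

3. $d(\cdot, A) : \mathbb{R}^2 \to \mathbb{R}_+$ is convex, provided $A$ is convex.

Due to the disjunctive structure of the auxiliary problem we can subdivide it into several QP-pieces. For every partition $(V_1, V_2) \in \mathcal{P}(V)$ we define the convex quadratic problem
\[
QP(\rho, V_1) \qquad
\begin{array}{ll}
\min\limits_{(s, \delta) \in \mathbb{R}^{n+1}} & \tfrac{1}{2} s^T B s + \nabla f s + \rho\left(\tfrac{1}{2}\delta^2 + \delta\right)\\[4pt]
\text{subject to} & (1 - \delta) h_i + \nabla h_i s = 0, \quad i \in E,\\
& (1 - \theta_i^g \delta) g_i + \nabla g_i s \le 0, \quad i \in I,\\
& \delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^1, \quad i \in V_1,\\
& \delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^2, \quad i \in V_2,\\
& -\delta \le 0.
\end{array}
\tag{23}
\]
Since $(V_1, V_2)$ form a partition of $V$, it is sufficient to specify $V_1$, because $V_2$ is given by $V_2 = V \setminus V_1$.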
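The closed-form distances (21) and the union rule (20) are easy to check numerically. The sketch below assumes the $\ell_1$ distance and the sets $P^1 = \{0\} \times \mathbb{R}$, $P^2 = \mathbb{R}_- \times \mathbb{R}_-$ from the text; the helper names `pos`, `dist_P1`, `dist_P2` and `dist_P` are introduced here for illustration only.

```python
import itertools


def pos(a):
    """Positive part (a)^+ = max(0, a)."""
    return max(0.0, a)


def dist_P1(x):
    """l1 distance of x = (x1, x2) to P^1 = {0} x R, i.e. |x1|."""
    return pos(x[0]) + pos(-x[0])


def dist_P2(x):
    """l1 distance of x to P^2 = R_- x R_-."""
    return pos(x[0]) + pos(x[1])


def dist_P(x):
    """l1 distance of x to P = P^1 u P^2, using the closed form in (21)."""
    return pos(x[0]) + pos(min(-x[0], x[1]))


if __name__ == "__main__":
    grid = [-2.0, -1.0, 0.0, 1.0, 3.0]
    for x in itertools.product(grid, repeat=2):
        # Property (20): the distance to a union is the smaller of the two distances.
        assert dist_P(x) == min(dist_P1(x), dist_P2(x))
    print("formula (21) agrees with (20) on the sample grid")
```

The convex pieces $QP(\rho, V_1)$ only ever use `dist_P1` and `dist_P2`, both of which are convex; the nonconvexity of the full problem is confined to the `min` in `dist_P`.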
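At the solution $(s, \delta)$ of $QP(\rho, V_1)$ there is a corresponding multiplier $\lambda(\rho, V_1) = (\lambda^h, \lambda^g, \lambda^H, \lambda^G)$ and a number $\lambda^\delta \ge 0$ with $\lambda^\delta \delta = 0$ fulfilling the KKT conditions:
\[
B s + \nabla f^T + \sum_{i \in E} \lambda_i^h \nabla h_i^T + \sum_{i \in I} \lambda_i^g \nabla g_i^T + \sum_{i \in V} \nabla F_i^T \lambda_i^F = 0,
\tag{24}
\]
\[
\rho(\delta + 1) - \lambda^\delta - \sum_{i \in E} \lambda_i^h h_i - \sum_{i \in I} \lambda_i^g \theta_i^g g_i + \sum_{i \in V} (\theta_i^H H_i, -\theta_i^G G_i) \lambda_i^F = 0,
\tag{25}
\]
\[
\lambda_i^g \left((1 - \theta_i^g \delta) g_i + \nabla g_i s\right) = 0, \quad \lambda_i^g \ge 0, \quad i \in I,
\tag{26}
\]
\[
\lambda_i^F \in N_{P^1}\left(\delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s\right), \quad i \in V_1,
\tag{27}
\]
\[
\lambda_i^F \in N_{P^2}\left(\delta(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s\right), \quad i \in V_2,
\tag{28}
\]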
where $\lambda_i^F := (\lambda_i^H, \lambda_i^G)^T$ for $i \in V$. Since $P^1$ and $P^2$ are convex sets, the above normal cones are given by (12).

The definition of the problem $QP(\rho, V_1)$ allows the following interpretation of Q-stationarity, which is a direct consequence of (15) and (16).

Lemma 3.1. A point $(s, \delta)$ is Q-stationary with respect to $(\beta^1, \beta^2) \in \mathcal{P}(I^{00}(s, \delta))$ for (19) if and only if it is the solution of the convex problems $QP(\rho, I^1(s, \delta) \cup \beta^1)$ and $QP(\rho, I^1(s, \delta) \cup \beta^2)$.

Moreover, since for $V_1 = I^1(s, \delta) \cup I^{00}(s, \delta)$ the conditions (27), (28) read as $\lambda_i^F \in \nu_i^{I^{00}(s, \delta), \emptyset}(s, \delta)$, it follows from (17) that if a point $(s, \delta)$ is the solution of $QP(\rho, I^1(s, \delta) \cup I^{00}(s, \delta))$ then it is M-stationary for (19).

Finally, let us denote by $\bar\delta(V_1)$ the objective value at a solution of the problem
\[
\min_{(s, \delta) \in \mathbb{R}^{n+1}} \delta \quad \text{subject to the constraints of (23)}.
\tag{29}
\]

An outline of the algorithm for solving $QPVC(\rho)$ is as follows.

Algorithm 3.1 (Solving the QPVC). Let $\zeta \in (0, 1)$, $\bar\rho > 1$ and $\rho > 0$ be given.

1: Initialize: Set the starting point $(s^0, \delta^0) := (0, 1)$, define the vector $\theta$ by
\[
\theta_i^g := \begin{cases} 1 & \text{if } g_i > 0,\\ 0 & \text{if } g_i \le 0, \end{cases}
\qquad
(\theta_i^H, \theta_i^G) := \begin{cases}
(0, 0) & \text{if } d(F_i, P) = 0,\\
(1, 0) & \text{if } 0 < d(F_i, P^1) \le d(F_i, P^2),\\
(0, 1) & \text{if } 0 < d(F_i, P^2) < d(F_i, P^1),
\end{cases}
\tag{30}
\]
and set the partition $V_1^1 := I^1(s^0, \delta^0)$ and the counter of pieces $t := 0$. Compute $(s^1, \delta^1)$ as the solution and $\lambda^1$ as the corresponding multiplier of the convex problem $QP(\rho, V_1^1)$ and set $t := 1$. If $\delta^1 > \delta^0$, perform a restart: set $\rho := \rho\bar\rho$ and go to step 1.

2: Improvement step: while $(s^t, \delta^t)$ is not a solution of the following four convex problems:
\[
QP(\rho, I^1(s^t, \delta^t) \cup (I^{00}(s^t, \delta^t) \cap V_1^t)), \quad QP(\rho, I^1(s^t, \delta^t) \cup (I^{00}(s^t, \delta^t) \setminus V_1^t)),
\tag{31}
\]
\[
QP(\rho, I^1(s^t, \delta^t)), \quad QP(\rho, I^1(s^t, \delta^t) \cup I^{00}(s^t, \delta^t)).
\tag{32}
\]
Compute $(s^{t+1}, \delta^{t+1})$ as the solution and $\lambda^{t+1}$ as the corresponding multiplier of the first problem with $(s^{t+1}, \delta^{t+1}) \ne (s^t, \delta^t)$, set $V_1^{t+1}$ to the corresponding index set and increase the counter $t$ of pieces by 1. If $\delta^t > \delta^{t-1}$, perform a restart: set $\rho := \rho\bar\rho$ and go to step 1.

3: Check for successful termination: If $\delta^t < \zeta$ set $N := t$, stop the algorithm and return.

4: Check the degeneracy: If the non-degeneracy condition
\[
\min\{\bar\delta(I^1(s^t, \delta^t)),\ \bar\delta(I^1(s^t, \delta^t) \cup I^{00}(s^t, \delta^t))\} < \zeta
\tag{33}
\]
is fulfilled, perform a restart: set $\rho := \rho\bar\rho$ and go to step 1. Else stop the algorithm because of degeneracy.

The selection of the index sets in step 2 is motivated by Lemma 3.1, since if $(s, \delta)$ is the solution of the convex problems (31), then it is Q-stationary, and if $(s, \delta)$ is also the solution of the convex problems (32), then it is even $Q_M$-stationary for problem (19).

We first summarize some consequences of the Initialization step.

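Proposition 3.2.

1. The vector $\theta$ is chosen in such a way that for all $i \in V$ it holds that
\[
\|(\theta_i^H H_i, -\theta_i^G G_i)^T\|_1 = d(F_i, P).
\tag{34}
\]

2. The partition $(V_1^1, V_2^1)$ is chosen in such a way that for $j = 1, 2$ it holds that
\[
i \in V_j^1 \quad \text{implies} \quad d(F_i, P) = d(F_i, P^j).
\tag{35}
\]

Proof. 1. If $d(F_i, P) = 0$ we have $(\theta_i^H, \theta_i^G) = (0, 0)$ and (34) obviously holds. If $0 < d(F_i, P^1) \le d(F_i, P^2)$ we have $(\theta_i^H, \theta_i^G) = (1, 0)$ and we obtain
\[
\|(\theta_i^H H_i, -\theta_i^G G_i)^T\|_1 = |H_i| = d(F_i, P^1) = d(F_i, P)
\]
by (21) and (20). Finally, if $0 < d(F_i, P^2) < d(F_i, P^1)$ we have $-H_i < 0 < G_i$, $(\theta_i^H, \theta_i^G) = (0, 1)$ and thus
\[
\|(\theta_i^H H_i, -\theta_i^G G_i)^T\|_1 = |G_i| = (-H_i)^+ + (G_i)^+ = d(F_i, P^2) = d(F_i, P)
\]
follows again by (21) and (20).

2. If $(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i \in P^j$ for some $i \in V$ and $j = 1, 2$, by (22) and (34) we obtain
\[
d(F_i, P^j) \le \|(\theta_i^H H_i, -\theta_i^G G_i)^T\|_1 = d(F_i, P)
\]
and consequently $d(F_i, P^j) = d(F_i, P)$, because of (20). Hence we conclude that $i \in (I^j(s^0, \delta^0) \cup I^0(s^0, \delta^0))$ implies $d(F_i, P^j) = d(F_i, P)$ for $j = 1, 2$, and the statement now follows from the fact that $V_1^1 = I^1(s^0, \delta^0)$ and $V_2^1 = I^2(s^0, \delta^0) \cup I^0(s^0, \delta^0)$.

The following lemma plays a crucial part in proving the finiteness of Algorithm 3.1.

Lemma 3.2. For each partition $(V_1, V_2) \in \mathcal{P}(V)$ there exists a positive constant $C_\rho(V_1)$ such that for every $\rho \ge C_\rho(V_1)$ the solution $(s, \delta)$ of $QP(\rho, V_1)$ fulfills $\delta = \bar\delta(V_1)$.

Proof. Let $(s(V_1), \delta(V_1))$ denote a solution of (29). Since $\delta(V_1) = \bar\delta(V_1)$, it follows that the problem
\[
\begin{array}{ll}
\min\limits_{s \in \mathbb{R}^n} & \tfrac{1}{2} s^T B s + \nabla f s\\[4pt]
\text{subject to} & (1 - \bar\delta(V_1)) h_i + \nabla h_i s = 0, \quad i \in E,\\
& (1 - \theta_i^g \bar\delta(V_1)) g_i + \nabla g_i s \le 0, \quad i \in I,\\
& \bar\delta(V_1)(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^1, \quad i \in V_1,\\
& \bar\delta(V_1)(\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s \in P^2, \quad i \in V_2
\end{array}
\tag{36}
\]
is feasible, and by $\bar s(V_1)$ we denote the solution of this problem and by $\bar\lambda(V_1)$ the corresponding multiplier. Further, $(\bar s(V_1), \bar\delta(V_1))$ is a solution of (29), and by $\tilde\lambda(V_1)$ we denote the corresponding multiplier. Then the triple $(\bar s(V_1), \bar\delta(V_1))$ and $\bar\lambda(V_1)$ fulfills (24) and (26)-(28). Moreover, the triple $(\bar s(V_1), \bar\delta(V_1))$ and $\tilde\lambda(V_1)$ fulfills (26)-(28) and
\[
\sum_{i \in E} \tilde\lambda(V_1)_i^h \nabla h_i^T + \sum_{i \in I} \tilde\lambda(V_1)_i^g \nabla g_i^T + \sum_{i \in V} \nabla F_i^T \tilde\lambda(V_1)_i^F = 0,
\tag{37}
\]
\[
1 - \tilde\lambda^\delta - \sum_{i \in E} \tilde\lambda(V_1)_i^h h_i - \sum_{i \in I} \tilde\lambda(V_1)_i^g \theta_i^g g_i + \sum_{i \in V} (\theta_i^H H_i, -\theta_i^G G_i) \tilde\lambda(V_1)_i^F = 0
\tag{38}
\]
for some $\tilde\lambda^\delta \ge 0$ with $\tilde\lambda^\delta \bar\delta(V_1) = 0$.

Let $C_\rho(V_1)$ be a positive constant such that for all $\rho \ge C_\rho(V_1)$ we have
\[
\alpha := \rho(\bar\delta(V_1) + 1) - \sum_{i \in E} \bar\lambda(V_1)_i^h h_i - \sum_{i \in I} \bar\lambda(V_1)_i^g \theta_i^g g_i + \sum_{i \in V} (\theta_i^H H_i, -\theta_i^G G_i) \bar\lambda(V_1)_i^F \ge 0
\]
and set $\lambda^\delta := \alpha \tilde\lambda^\delta \ge 0$ and $\lambda := \bar\lambda(V_1) + \alpha \tilde\lambda(V_1)$. We will now show that for such $\rho$ it holds that $(\bar s(V_1), \bar\delta(V_1))$ is the solution of $QP(\rho, V_1)$.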

Clearly, $\lambda^\delta \bar\delta(V_1) = \alpha \tilde\lambda^\delta \bar\delta(V_1) = 0$, and the triple $(\bar s(V_1), \bar\delta(V_1))$ and $\lambda$ also fulfills (24) due to (37), and it fulfills (26)-(28) due to the convexity of the normal cones. Moreover, taking into account the definitions of $\alpha$, $\lambda^\delta$ and $\lambda$ together with (38), we obtain
\[
\rho(\bar\delta(V_1) + 1) - \lambda^\delta - \sum_{i \in E} \lambda_i^h h_i - \sum_{i \in I} \lambda_i^g \theta_i^g g_i + \sum_{i \in V} (\theta_i^H H_i, -\theta_i^G G_i) \lambda_i^F = \alpha - \alpha \tilde\lambda^\delta - \alpha(1 - \tilde\lambda^\delta) = 0,
\]
showing also (25). Hence $(\bar s(V_1), \bar\delta(V_1))$ is the solution of $QP(\rho, V_1)$ and the proof is complete.

We now formulate the main theorem of this section.

Theorem 3.1.

1. Algorithm 3.1 is finite.

2. If Algorithm 3.1 is not terminated because of degeneracy, then $(s^N, \delta^N)$ is $Q_M$-stationary for the problem (19) and $\delta^N < \zeta$.

Proof. 1. The algorithm is obviously finite unless we perform a restart and hence increase $\rho$. Thus we can assume that $\rho$ is sufficiently large, say
\[
\rho \ge C_\rho := \max_{(V_1, V_2) \in \mathcal{P}(V)} C_\rho(V_1),
\]
with $C_\rho(V_1)$ given by the previous lemma. However, this means, taking into account also Proposition 3.3 (1.), that $(s^{t-1}, \delta^{t-1})$ is feasible for the problem $QP(\rho, V_1^t)$ for all $t$, hence $\delta^{t-1} \ge \bar\delta(V_1^t)$, and $(s^t, \delta^t)$ is the solution of $QP(\rho, V_1^t)$, implying $\delta^t = \bar\delta(V_1^t)$ and consequently $\delta^t \le \delta^{t-1}$. Therefore we do not perform a restart in step 1 or step 2. On the other hand, since we enter steps 3 and 4 with $\delta^t = \bar\delta(I^1(s^t, \delta^t)) = \bar\delta(I^1(s^t, \delta^t) \cup I^{00}(s^t, \delta^t))$, we either terminate the algorithm in step 3 with $\delta^t < \zeta$ if the non-degeneracy condition (33) is fulfilled, or we terminate the algorithm because of degeneracy in step 4. This finishes the proof.

2. The statement regarding stationarity follows easily from the fact that we enter step 3 of the algorithm only when $(s, \delta)$ is a solution of the problems (32), and this means that it is also Q-stationary with respect to $(\emptyset, I^{00}(s^N, \delta^N))$ by Lemma 3.1. Thus, $(s, \delta)$ is also $Q_M$-stationary for problem (19). The claim about $\delta$ follows from the assumption that Algorithm 3.1 is not terminated because of degeneracy.
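The initialization rule (30) and the identity (34) can be illustrated on a few sample values. The following is a minimal sketch under the assumptions of the text ($F_i = (-H_i, G_i)^T$, $\ell_1$ distances as in (21)); the function names `choose_theta` and `shifted_point` are hypothetical and not taken from the authors' MATLAB code.

```python
def pos(a):
    """Positive part (a)^+ = max(0, a)."""
    return max(0.0, a)


def dist_P1(x):
    """l1 distance to P^1 = {0} x R."""
    return abs(x[0])


def dist_P2(x):
    """l1 distance to P^2 = R_- x R_-."""
    return pos(x[0]) + pos(x[1])


def choose_theta(H, G):
    """Return (theta^H, theta^G) for one vanishing constraint, following rule (30)."""
    F = (-H, G)
    d1, d2 = dist_P1(F), dist_P2(F)
    if min(d1, d2) == 0.0:      # d(F, P) = 0: the linearized constraint already holds
        return (0, 0)
    if d1 <= d2:                # P^1 is at least as close: relax the H-part
        return (1, 0)
    return (0, 1)               # P^2 is strictly closer: relax the G-part


def shifted_point(H, G):
    """Point delta*(theta^H H, -theta^G G) + F at the start (s, delta) = (0, 1)."""
    tH, tG = choose_theta(H, G)
    return (tH * H - H, -tG * G + G)


if __name__ == "__main__":
    for H, G in [(-2.0, 1.0), (3.0, 1.0), (1.0, -1.0), (1.0, 1.0)]:
        tH, tG = choose_theta(H, G)
        F = (-H, G)
        lhs = abs(tH * H) + abs(tG * G)          # l1 norm on the left of (34)
        rhs = min(dist_P1(F), dist_P2(F))        # d(F, P) by (20)
        print((H, G), (tH, tG), lhs == rhs, shifted_point(H, G))
```

In every sample the shifted point lies in $P^1$ or $P^2$, which is what makes $(s, \delta) = (0, 1)$ feasible for the first convex piece.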
We conclude this section with the following proposition that brings together the basic properties of Algorithm 3.1.

Proposition 3.3. If Algorithm 3.1 is not terminated because of degeneracy, then the following properties hold:

1. For all $t = 1, \ldots, N$ the points $(s^{t-1}, \delta^{t-1})$ and $(s^t, \delta^t)$ are feasible for the problem $QP(\rho, V_1^t)$ and the point $(s^t, \delta^t)$ is also the solution of the convex problem $QP(\rho, V_1^t)$.

2. For all $t = 1, \ldots, N$ it holds that
\[
0 \le \delta^t \le \delta^{t-1} \le 1.
\tag{39}
\]

3. There exists a constant $C_t$, dependent only on the number of constraints, such that
\[
N \le C_t.
\tag{40}
\]

Proof. 1. By the definitions of the problems $QPVC(\rho)$ and $QP(\rho, V_1)$ it follows that a point $(s, \delta)$, feasible for $QPVC(\rho)$, is feasible for $QP(\rho, V_1)$ if and only if
\[
I^1(s, \delta) \subset V_1 \subset I^1(s, \delta) \cup I^0(s, \delta).
\tag{41}
\]
The point $(s^0, \delta^0)$ is clearly feasible for $QP(\rho, V_1^1)$ and similarly the point $(s^t, \delta^t)$ is feasible for $QP(\rho, V_1^{t+1})$ for all $t = 1, \ldots, N - 1$, since the partition $V_1^{t+1}$ is defined by one of the index sets of
(31)-(32) and thus fulfills (41). However, feasibility of $(s^{t+1}, \delta^{t+1})$ for $QP(\rho, V_1^{t+1})$, together with $(s^{t+1}, \delta^{t+1})$ being the solution of $QP(\rho, V_1^{t+1})$, then follows from its definition.

2. The statement follows from $\delta^0 = 1$, from the fact that we perform a restart whenever $\delta^t > \delta^{t-1}$ occurs, and from the constraint $-\delta \le 0$.

3. Since whenever the parameter $\rho$ is increased the algorithm goes to step 1 and thus the counter $t$ of the pieces is reset to 0, it follows that after the last time the algorithm enters step 1 we keep $\rho$ constant. It is obvious that all the index sets $V_1^t$ are pairwise different, implying that the maximum number of switches to a new piece is $2^{|V|}$.

4 The basic SQP algorithm for MPVC

An outline of the basic algorithm is as follows.

Algorithm 4.1 (Solving the MPVC).

1: Initialization: Select a starting point $x_0 \in \mathbb{R}^n$ together with a positive definite $n \times n$ matrix $B_0$, a parameter $\rho_0 > 0$ and constants $\zeta \in (0, 1)$ and $\bar\rho > 1$. Select positive penalty parameters $\sigma_{-1} = (\sigma_{-1}^h, \sigma_{-1}^g, \sigma_{-1}^F)$. Set the iteration counter $k := 0$.

2: Solve the auxiliary problem: Run Algorithm 3.1 with data $\zeta$, $\bar\rho$, $\rho := \rho_k$, $B := B_k$, $\nabla f := \nabla f(x_k)$, $h_i := h_i(x_k)$, $\nabla h_i := \nabla h_i(x_k)$, $i \in E$, etc. If Algorithm 3.1 stops because of degeneracy, stop Algorithm 4.1 with an error message. If the final iterate $s_k^{N_k}$ is zero, stop Algorithm 4.1 and return $x_k$ as a solution.

3: Next iterate: Compute new penalty parameters $\sigma_k$. Set $x_{k+1} := x_k + s_k$, where $s_k$ is a point on the polygonal line connecting the points $s_k^0, s_k^1, \ldots, s_k^{N_k}$ such that an appropriate merit function depending on $\sigma_k$ is decreased. Set $\rho_{k+1} := \rho$, the final value of $\rho$ in Algorithm 3.1. Update $B_k$ to get a positive definite matrix $B_{k+1}$. Set $k := k + 1$ and go to step 2.

Remark 4.1. We terminate Algorithm 4.1 only in the following two cases. In the first case no sufficient reduction of the violation of the constraints can be achieved. The second case will be satisfied only by chance, when the current iterate is a $Q_M$-stationary solution. Normally, this algorithm produces an infinite sequence of iterates and we must include a stopping criterion for convergence. Such a criterion could be that the violation of the constraints at some iterate is sufficiently small,
\[
\max\left\{\max_{i \in E} |h_i(x_k)|,\ \max_{i \in I} (g_i(x_k))^+,\ \max_{i \in V} d(F_i(x_k), P)\right\} \le \epsilon_C,
\]
where $F_i$ is given by (7), and the expected decrease in our merit function is sufficiently small,
\[
(s_k^{N_k})^T B_k s_k^{N_k} \le \epsilon,
\]
see Proposition 4.1 below.

4.1 The next iterate

Denote the outcome of Algorithm 3.1 at the $k$-th iterate by
\[
(s_k^t, \delta_k^t),\ \lambda_k^t,\ (V_{1,k}^t, V_{2,k}^t) \text{ for } t = 0, \ldots, N_k \quad \text{and} \quad \theta_k,\ \underline{\lambda}_k^{N_k},\ \overline{\lambda}_k^{N_k}.
\]

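The new penalty parameters are computed by
\[
\sigma_{i,k}^h = \begin{cases} \xi_2 \tilde\lambda_{i,k}^h & \text{if } \sigma_{i,k-1}^h < \xi_1 \tilde\lambda_{i,k}^h,\\ \sigma_{i,k-1}^h & \text{else}, \end{cases}
\quad
\sigma_{i,k}^g = \begin{cases} \xi_2 \tilde\lambda_{i,k}^g & \text{if } \sigma_{i,k-1}^g < \xi_1 \tilde\lambda_{i,k}^g,\\ \sigma_{i,k-1}^g & \text{else}, \end{cases}
\quad
\sigma_{i,k}^F = \begin{cases} \xi_2 \tilde\lambda_{i,k}^F & \text{if } \sigma_{i,k-1}^F < \xi_1 \tilde\lambda_{i,k}^F,\\ \sigma_{i,k-1}^F & \text{else}, \end{cases}
\tag{42}
\]
where
\[
\tilde\lambda_{i,k}^h = \max |\lambda_{i,k}^{h,t}|, \quad \tilde\lambda_{i,k}^g = \max |\lambda_{i,k}^{g,t}|, \quad \tilde\lambda_{i,k}^F = \max \|\lambda_{i,k}^{F,t}\|_\infty,
\tag{43}
\]
with the maximum being taken over $t \in \{1, \ldots, N_k\}$, and $1 < \xi_1 < \xi_2$. Note that this choice of $\sigma_k$ ensures
\[
\sigma_k^h \ge \tilde\lambda_k^h, \quad \sigma_k^g \ge \tilde\lambda_k^g, \quad \sigma_k^F \ge \tilde\lambda_k^F.
\tag{44}
\]

4.1.1 The merit function

We are looking for the next iterate on the polygonal line connecting the points $s_k^0, s_k^1, \ldots, s_k^{N_k}$. For each line segment $[s_k^{t-1}, s_k^t] := \{(1 - \alpha) s_k^{t-1} + \alpha s_k^t \mid \alpha \in [0, 1]\}$, $t = 1, \ldots, N_k$, we consider the functions
\[
\begin{array}{rl}
\phi_k^t(\alpha) := & f(x_k + s) + \sum\limits_{i \in E} \sigma_{i,k}^h |h_i(x_k + s)| + \sum\limits_{i \in I} \sigma_{i,k}^g (g_i(x_k + s))^+\\[4pt]
& + \sum\limits_{i \in V_{1,k}^t} \sigma_{i,k}^F d(F_i(x_k + s), P^1) + \sum\limits_{i \in V_{2,k}^t} \sigma_{i,k}^F d(F_i(x_k + s), P^2),\\[8pt]
\hat\phi_k^t(\alpha) := & f + \nabla f s + \tfrac{1}{2} s^T B_k s + \sum\limits_{i \in E} \sigma_{i,k}^h |h_i + \nabla h_i s| + \sum\limits_{i \in I} \sigma_{i,k}^g (g_i + \nabla g_i s)^+\\[4pt]
& + \sum\limits_{i \in V_{1,k}^t} \sigma_{i,k}^F d(F_i + \nabla F_i s, P^1) + \sum\limits_{i \in V_{2,k}^t} \sigma_{i,k}^F d(F_i + \nabla F_i s, P^2),
\end{array}
\]
where $s = (1 - \alpha) s_k^{t-1} + \alpha s_k^t$ and $f = f(x_k)$, $\nabla f = \nabla f(x_k)$, $h_i = h_i(x_k)$, $\nabla h_i = \nabla h_i(x_k)$, $i \in E$, etc., and we further denote
\[
r_{k,0}^t := \hat\phi_k^t(0) - \hat\phi_k^1(0), \qquad r_{k,1}^t := \hat\phi_k^t(1) - \hat\phi_k^1(0).
\tag{45}
\]

Lemma 4.1.

1. For every $t \in \{1, \ldots, N_k\}$ the function $\hat\phi_k^t$ is convex.

2. For every $t \in \{1, \ldots, N_k\}$ the function $\hat\phi_k^t$ is a first order approximation of $\phi_k^t$, that is
\[
|\phi_k^t(\alpha) - \hat\phi_k^t(\alpha)| = o(\|s\|),
\]
where $s = (1 - \alpha) s_k^{t-1} + \alpha s_k^t$.

Proof. 1. By convexity of $P^1$ and $P^2$, $\hat\phi_k^t$ is convex because it is a sum of convex functions.

2. By Lipschitz continuity of the distance function with Lipschitz modulus $L = 1$ we conclude
\[
\begin{array}{rl}
|\phi_k^t(\alpha) - \hat\phi_k^t(\alpha)| \le & |f(x_k + s) - f - \nabla f s - \tfrac{1}{2} s^T B_k s| + \sum\limits_{i \in E} \sigma_{i,k}^h |h_i(x_k + s) - h_i - \nabla h_i s|\\[4pt]
& + \sum\limits_{i \in I} \sigma_{i,k}^g |g_i(x_k + s) - g_i - \nabla g_i s| + \sum\limits_{i \in V} \sigma_{i,k}^F \|F_i(x_k + s) - F_i - \nabla F_i s\|_1
\end{array}
\]
and hence the assertion follows.

We state now the main result of this subsection. For the sake of simplicity we omit the iteration index $k$ in this part.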

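Proposition 4.1. For every $t \in \{1, \ldots, N\}$
\[
\hat\phi^t(0) - \hat\phi^1(0) \le -\sum_{\tau=1}^{t-1} \tfrac{1}{2} (s^\tau - s^{\tau-1})^T B (s^\tau - s^{\tau-1}) \le 0,
\tag{46}
\]
\[
\hat\phi^t(1) - \hat\phi^1(0) \le -\sum_{\tau=1}^{t} \tfrac{1}{2} (s^\tau - s^{\tau-1})^T B (s^\tau - s^{\tau-1}) \le 0.
\tag{47}
\]

Proof. Fix $t \in \{1, \ldots, N\}$ and note that
\[
\tfrac{1}{2}(s^t)^T B s^t + \nabla f s^t = \tfrac{1}{2}(s^t)^T B s^t + \nabla f s^t - \tfrac{1}{2}(s^0)^T B s^0 - \nabla f s^0
= \sum_{\tau=1}^{t} \left( \tfrac{1}{2}(s^\tau)^T B s^\tau - \tfrac{1}{2}(s^{\tau-1})^T B s^{\tau-1} + \nabla f (s^\tau - s^{\tau-1}) \right),
\]
because of $s^0 = 0$. For $j = 0, 1$ consider $r_j^{t+1-j}$ defined by (45). We obtain
\[
\begin{array}{rl}
r_j^{t+1-j} = & \sum\limits_{\tau=1}^{t} \left( \tfrac{1}{2}(s^\tau)^T B s^\tau - \tfrac{1}{2}(s^{\tau-1})^T B s^{\tau-1} + \nabla f (s^\tau - s^{\tau-1}) \right)\\[4pt]
& + \sum\limits_{i \in E} \sigma_i^h \left( |h_i + \nabla h_i s^t| - |h_i| \right) + \sum\limits_{i \in I} \sigma_i^g \left( (g_i + \nabla g_i s^t)^+ - (g_i)^+ \right)\\[4pt]
& + \sum\limits_{i \in V_1^{t+1-j}} \sigma_i^F d(F_i + \nabla F_i s^t, P^1) + \sum\limits_{i \in V_2^{t+1-j}} \sigma_i^F d(F_i + \nabla F_i s^t, P^2)\\[4pt]
& - \sum\limits_{i \in V_1^1} \sigma_i^F d(F_i, P^1) - \sum\limits_{i \in V_2^1} \sigma_i^F d(F_i, P^2).
\end{array}
\tag{48}
\]
Using that $(s^\tau, \delta^\tau)$ is the solution of $QP(\rho, V_1^\tau)$ and multiplying the first order optimality condition (24) by $(s^\tau - s^{\tau-1})^T$ yields
\[
(s^\tau - s^{\tau-1})^T \left( B s^\tau + \nabla f^T + \sum_{i \in E} \lambda_i^{h,\tau} \nabla h_i^T + \sum_{i \in I} \lambda_i^{g,\tau} \nabla g_i^T + \sum_{i \in V} \nabla F_i^T \lambda_i^{F,\tau} \right) = 0.
\tag{49}
\]
Summing up the expression on the left hand side from $\tau = 1$ to $t$, subtracting it from the right hand side of (48) and taking into account the identity
\[
\tfrac{1}{2}(s^\tau)^T B s^\tau - \tfrac{1}{2}(s^{\tau-1})^T B s^{\tau-1} - (s^\tau - s^{\tau-1})^T B s^\tau = -\tfrac{1}{2}(s^\tau - s^{\tau-1})^T B (s^\tau - s^{\tau-1}),
\]
we obtain for $j = 0, 1$
\[
\begin{array}{rl}
r_j^{t+1-j} = & -\sum\limits_{\tau=1}^{t} \tfrac{1}{2}(s^\tau - s^{\tau-1})^T B (s^\tau - s^{\tau-1})\\[4pt]
& + \sum\limits_{i \in E} \left( \sigma_i^h (|h_i + \nabla h_i s^t| - |h_i|) - \sum\limits_{\tau=1}^{t} \lambda_i^{h,\tau} \nabla h_i (s^\tau - s^{\tau-1}) \right)\\[4pt]
& + \sum\limits_{i \in I} \left( \sigma_i^g ((g_i + \nabla g_i s^t)^+ - (g_i)^+) - \sum\limits_{\tau=1}^{t} \lambda_i^{g,\tau} \nabla g_i (s^\tau - s^{\tau-1}) \right)\\[4pt]
& + \sum\limits_{i \in V_1^{t+1-j}} \sigma_i^F d(F_i + \nabla F_i s^t, P^1) + \sum\limits_{i \in V_2^{t+1-j}} \sigma_i^F d(F_i + \nabla F_i s^t, P^2)\\[4pt]
& - \sum\limits_{i \in V_1^1} \sigma_i^F d(F_i, P^1) - \sum\limits_{i \in V_2^1} \sigma_i^F d(F_i, P^2) - \sum\limits_{i \in V} \sum\limits_{\tau=1}^{t} (\lambda_i^{F,\tau})^T \nabla F_i (s^\tau - s^{\tau-1}).
\end{array}
\tag{50}
\]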

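First, we claim that
\[
-\sum_{i \in V} \sum_{\tau=1}^{t} (\lambda_i^{F,\tau})^T \nabla F_i (s^\tau - s^{\tau-1}) \le \sum_{i \in V} \tilde\lambda_i^F (1 - \delta^t) d(F_i, P).
\tag{51}
\]
Consider $i \in V$ and $\tau \in \{1, \ldots, t\}$ with $i \in V_1^\tau$. By the feasibility of $(s^\tau, \delta^\tau)$ and $(s^{\tau-1}, \delta^{\tau-1})$ for $QP(\rho, V_1^\tau)$ it follows that
\[
\delta^\tau (\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s^\tau \in P^1, \qquad \delta^{\tau-1} (\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s^{\tau-1} \in P^1,
\]
and hence from (27) and (12) we conclude
\[
(\lambda_i^{F,\tau})^T \left( \nabla F_i (s^{\tau-1} - s^\tau) + (\delta^{\tau-1} - \delta^\tau)(\theta_i^H H_i, -\theta_i^G G_i)^T \right) \le 0
\]
and consequently
\[
-(\lambda_i^{F,\tau})^T \nabla F_i (s^\tau - s^{\tau-1}) \le -(\lambda_i^{F,\tau})^T (\delta^{\tau-1} - \delta^\tau)(\theta_i^H H_i, -\theta_i^G G_i)^T \le \tilde\lambda_i^F (\delta^{\tau-1} - \delta^\tau) d(F_i, P)
\tag{52}
\]
follows by the Hölder inequality and (34). Analogous argumentation yields (52) also for $i, \tau$ with $i \in V_2^\tau$, and since $V_1^\tau, V_2^\tau$ form a partition of $V$, the claimed inequality (51) follows by summing (52) over $\tau = 1, \ldots, t$ and using $\delta^0 = 1$.

Further, we claim that for $j = 0, 1$ it holds that
\[
\sum_{i \in V_1^{t+1-j}} \sigma_i^F d(F_i + \nabla F_i s^t, P^1) + \sum_{i \in V_2^{t+1-j}} \sigma_i^F d(F_i + \nabla F_i s^t, P^2) \le \sum_{i \in V} \sigma_i^F \delta^t d(F_i, P).
\tag{53}
\]
From feasibility of $(s^t, \delta^t)$ for either $QP(\rho, V_1^t)$ or $QP(\rho, V_1^{t+1})$, for $i \in V_1^t$ or $i \in V_1^{t+1}$, respectively, it follows that
\[
\delta^t (\theta_i^H H_i, -\theta_i^G G_i)^T + F_i + \nabla F_i s^t \in P^1
\]
and hence, using (34) and (22),
\[
\sigma_i^F d(F_i + \nabla F_i s^t, P^1) \le \sigma_i^F \delta^t \|(\theta_i^H H_i, -\theta_i^G G_i)^T\|_1 = \sigma_i^F \delta^t d(F_i, P).
\tag{54}
\]
Again, for $i \in V_2^t$ or $i \in V_2^{t+1}$ it holds that $\sigma_i^F d(F_i + \nabla F_i s^t, P^2) \le \sigma_i^F \delta^t d(F_i, P)$ by analogous argumentation, and since $V_1^t, V_2^t$ and $V_1^{t+1}, V_2^{t+1}$ form a partition of $V$, the claimed inequality (53) follows.

Finally, we have
\[
-\sum_{i \in V_1^1} \sigma_i^F d(F_i, P^1) - \sum_{i \in V_2^1} \sigma_i^F d(F_i, P^2) = -\sum_{i \in V} \sigma_i^F d(F_i, P),
\tag{55}
\]
due to the fact that $V_1^1, V_2^1$ form a partition of $V$ and (35). Similar arguments as above show
\[
\begin{array}{l}
\sigma_i^h (|h_i + \nabla h_i s^t| - |h_i|) - \sum\limits_{\tau=1}^{t} \lambda_i^{h,\tau} \nabla h_i (s^\tau - s^{\tau-1}) \le (\sigma_i^h - \tilde\lambda_i^h)(\delta^t - 1)|h_i|, \quad i \in E,\\[6pt]
\sigma_i^g ((g_i + \nabla g_i s^t)^+ - (g_i)^+) - \sum\limits_{\tau=1}^{t} \lambda_i^{g,\tau} \nabla g_i (s^\tau - s^{\tau-1}) \le (\sigma_i^g - \tilde\lambda_i^g)(\delta^t - 1)(g_i)^+, \quad i \in I.
\end{array}
\]
Taking this into account and putting together (50), (51), (53) and (55) we obtain for $j = 0, 1$
\[
r_j^{t+1-j} \le -\sum_{\tau=1}^{t} \tfrac{1}{2}(s^\tau - s^{\tau-1})^T B (s^\tau - s^{\tau-1}) - \sum_{i \in V} (\sigma_i^F - \tilde\lambda_i^F)(1 - \delta^t) d(F_i, P) - \sum_{i \in E} (\sigma_i^h - \tilde\lambda_i^h)(1 - \delta^t)|h_i| - \sum_{i \in I} (\sigma_i^g - \tilde\lambda_i^g)(1 - \delta^t)(g_i)^+
\]
and hence (46) and (47) follow by monotonicity of $\delta$ and (44). This completes the proof.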
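The penalty rule (42) is what guarantees (44), which in turn makes the last three sums in the final estimate non-positive. A minimal sketch of one update step follows; the function name `update_penalty` and the default values of `xi1` and `xi2` are assumptions chosen for illustration (the text only requires $1 < \xi_1 < \xi_2$).

```python
def update_penalty(sigma_prev, lam_max, xi1=1.5, xi2=2.0):
    """One application of rule (42) to a list of penalty parameters.

    sigma_prev: penalty parameters sigma_{i,k-1} from the previous iteration.
    lam_max:    the multiplier bounds from (43), i.e. the largest multiplier
                magnitude seen over all pieces t = 1, ..., N_k (non-negative).
    xi1, xi2:   constants with 1 < xi1 < xi2 (the defaults are hypothetical).
    """
    if not 1.0 < xi1 < xi2:
        raise ValueError("need 1 < xi1 < xi2")
    sigma_new = []
    for s, lam in zip(sigma_prev, lam_max):
        # Increase the parameter only if it no longer dominates xi1 * multiplier.
        sigma_new.append(xi2 * lam if s < xi1 * lam else s)
    return sigma_new


if __name__ == "__main__":
    sigma = update_penalty([1.0, 10.0], [2.0, 3.0])
    print(sigma)
    # Property (44): every updated parameter dominates its multiplier bound.
    assert all(s >= lam for s, lam in zip(sigma, [2.0, 3.0]))
```

Because a parameter is either kept or set to $\xi_2$ times a multiplier bound, it can change only finitely often when the multipliers stay bounded.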

14 4.. Searching for the next iterate We choose the next iterate as a point from the polygonal line connecting the points s 0,..., sn. Each line segment [s t, s t ] corresponds to the convex subproblem solved by Algorithm 3. and hence each line search function ˆφ t corresponds to the usual l merit function from nonlinear programming. This maes it technically more difficult to prove the convergence behavior stated in Proposition 4. which is also the motivation for the following procedure. First we parametrize the polygonal line connecting the points s 0,..., sn by its length as a curve ŝ : [0, ] R n in the following way. We define t () := N, for every γ [0, ) we denote by t (γ) the smallest number t such that S t > γsn and we set α () :=, α (γ) := γsn S t (γ) S t (γ) S t (γ), γ [0, ), where S 0 := 0, St := t τ= sτ sτ for t =,..., N. Then we define ŝ (γ) = s t (γ) + α (γ)(s t (γ) s t (γ) ). Note that ŝ (γ) γs N. In order to simplify the proof of Proposition 4., for γ [0, ] we further consider the following line search functions Y (γ) := φ t (γ) (α (γ)), Ŷ (γ) := ˆφ t (γ) (α (γ)), Z (γ) := ( α (γ)) ˆφ t (γ) (0) + α (γ) ˆφ t (γ) (). (56) Now consider some sequence of positive numbers γ =, γ, γ3,... with > γ γj+ /γ j γ > 0 for all j N. Consider the smallest j, denoted by j() such that for some given constant ξ (0, ) one has Y (γj ) Y (0) ξ ( Z (γj ) Z (0) ). (57) Then the new iterate is given by x + := x + ŝ (γ j() ). As can be seen from the proof of Lemma 4.5, this choice ensures a decrease in merit function Φ defined in the next subsection. The following relations are direct consequences of the properties of φ t and ˆφ t Y (γ) Ŷ(γ) = o(γs N ), Ŷ (γ) Z (γ), Z (γ) Z (0) 0. (58) The last property holds due to Proposition 4. and which follows from α (0) = 0, S t (0) are defined by (45). r t, Z (γ) Z (0) = ( α (γ))r t (γ),0 + α (γ)r t (γ),, (59) Lemma 4.. The new iterate x + is well defined. = 0 and hence ˆφ t (0) (0) = ˆφ (0). We recall that rt,0 and Proof. 
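In order to show that the new iterate is well defined, we have to prove the existence of some $j$ such that (57) is fulfilled. Note that $S_k^{t_k(0)-1} = 0$ and $S_k^{t_k(0)} > 0$. There is some $\delta > 0$ such that
$$Y_k(\gamma) - \hat Y_k(\gamma) \le -(1 - \xi)\, r_{k,1}^{t_k(0)}\, \frac{\gamma S_k^{N_k}}{S_k^{t_k(0)}},$$
whenever $\gamma S_k^{N_k} \le \delta$. Since $\lim_{j \to \infty} \gamma_j^k = 0$, we can choose $j$ sufficiently large to fulfill $\gamma_j^k S_k^{N_k} < \min\{\delta, S_k^{t_k(0)}\}$ and then $t_k(\gamma_j^k) = t_k(0)$ and $\alpha_k(\gamma_j^k) = \gamma_j^k S_k^{N_k} / S_k^{t_k(0)}$, since $S_k^{t_k(0)-1} = 0$. This yields
$$Y_k(\gamma_j^k) - \hat Y_k(\gamma_j^k) \le -(1 - \xi)\, \alpha_k(\gamma_j^k)\, r_{k,1}^{t_k(\gamma_j^k)}. \tag{60}$$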

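Then by the second property of (58), (59), taking into account $r_{k,0}^{t_k(\gamma_j^k)} \le 0$ by Proposition 4.1 and $Y_k(0) = Z_k(0)$, we obtain
$$\begin{aligned}
Y_k(\gamma_j^k) - Y_k(0) &\le \hat Y_k(\gamma_j^k) - Y_k(0) - (1 - \xi)\,\alpha_k(\gamma_j^k)\, r_{k,1}^{t_k(\gamma_j^k)} \\
&\le \xi\bigl(Z_k(\gamma_j^k) - Z_k(0)\bigr) + (1 - \xi)\bigl(Z_k(\gamma_j^k) - Z_k(0) - \alpha_k(\gamma_j^k)\, r_{k,1}^{t_k(\gamma_j^k)}\bigr) \\
&= \xi\bigl(Z_k(\gamma_j^k) - Z_k(0)\bigr) + (1 - \xi)(1 - \alpha_k(\gamma_j^k))\, r_{k,0}^{t_k(\gamma_j^k)} \\
&\le \xi\bigl(Z_k(\gamma_j^k) - Z_k(0)\bigr).
\end{aligned}$$
Thus (57) is fulfilled for this $j$ and the lemma is proved.

4.2 Convergence of the basic algorithm

We consider the behavior of Algorithm 4.1 when it does not prematurely stop and it generates an infinite sequence of iterates
$$x_k,\ B_k,\ (s_k^t, \delta_k^t),\ \lambda_k^t,\ (V_{1,k}^t, V_{2,k}^t),\ t = 0, \ldots, N_k, \quad \text{and} \quad \theta_k,\ \underline\lambda_k^{N_k},\ \overline\lambda_k^{N_k}.$$
Note that $\delta_k^{N_k} < \zeta$. We discuss the convergence behavior under the following assumption.

Assumption 1.
1. There exist constants $C_x, C_s, C_\lambda$ such that $\|x_k\| \le C_x$, $S_k^{N_k} \le C_s$, $\hat\lambda_k^h, \hat\lambda_k^g, \hat\lambda_k^F \le C_\lambda$ for all $k$, where $\hat\lambda_k^h := \max_{i \in E}\{|\lambda_{i,k}^h|\}$, $\hat\lambda_k^g := \max_{i \in I}\{|\lambda_{i,k}^g|\}$, $\hat\lambda_k^F := \max_{i \in V}\{\|\lambda_{i,k}^F\|\}$.
2. There exist constants $\underline C_B, \overline C_B$ such that $\underline C_B \le \lambda(B_k)$, $\|B_k\| \le \overline C_B$ for all $k$, where $\lambda(B_k)$ denotes the smallest eigenvalue of $B_k$.

For our convergence analysis we need one more merit function
$$\Phi_k(x) := f(x) + \sum_{i \in E} \sigma_{i,k}^h |h_i(x)| + \sum_{i \in I} \sigma_{i,k}^g (g_i(x))^+ + \sum_{i \in V} \sigma_{i,k}^F\, d(F_i(x), P).$$

Lemma 4.3. For each $k$ and for any $\gamma \in [0,1]$ it holds that
$$\Phi_k(x_k + \hat s_k(\gamma)) \le Y_k(\gamma) \quad \text{and} \quad \Phi_k(x_k) = Y_k(0). \tag{61}$$

Proof. The first claim follows from the definitions of $\Phi_k$ and $Y_k$ and the estimate
$$d(F_i(x_k + s), P^1),\ d(F_i(x_k + s), P^2) \ge \min\{d(F_i(x_k + s), P^1),\ d(F_i(x_k + s), P^2)\} = d(F_i(x_k + s), P),$$
which holds by (0). The second claim follows from (35).

A simple consequence of the way that we define the penalty parameters in (4) is the following lemma.

Lemma 4.4. Under Assumption 1 there exists some $\bar k$ such that for all $k \ge \bar k$ the penalty parameters remain constant, $\bar\sigma := \sigma_k$, and consequently $\Phi_k(x) = \Phi_{\bar k}(x)$.

Remark 4.. Note that we do not use $\Phi_k$ for calculating the new iterate because its first order approximation is in general not convex on the line segments connecting $s_k^{t-1}$ and $s_k^t$ due to the involved min operation.

Lemma 4.5. Assume that Assumption 1 is fulfilled. Then
$$\lim_{k \to \infty} Y_k(\gamma_{j(k)}^k) - Y_k(0) = 0. \tag{62}$$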
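The decrease appearing in (62) is the one accepted by the backtracking rule (57) along the arc-length parametrized polygonal line. As an illustration only, the following Python sketch computes $t_k(\gamma)$, $\alpha_k(\gamma)$ and $\hat s_k(\gamma)$ for a given list of points and then runs the test (57). The function names `polyline_point` and `backtrack`, the callables `Y` and `Z`, and the constant step ratio are assumptions made for this sketch; in the method itself $Y_k$ and $Z_k$ are built from $\phi_k^t$ and $\hat\phi_k^t$ as in (56).

```python
from bisect import bisect_right
from itertools import accumulate
from math import dist


def polyline_point(points, gamma):
    """Return (s_hat, t, alpha) for the arc-length fraction gamma in [0, 1].

    points = [s^0, ..., s^N]; S[t] is the length of the polygonal line up to s^t.
    """
    N = len(points) - 1
    S = [0.0] + list(accumulate(dist(points[t - 1], points[t])
                                for t in range(1, N + 1)))
    if S[N] == 0.0:
        # Assumption of this sketch: the polygonal line has positive length.
        raise ValueError("polygonal line has zero length")
    if gamma >= 1.0:
        t, alpha = N, 1.0
    else:
        target = gamma * S[N]
        t = bisect_right(S, target)  # smallest t with S[t] > gamma * S[N]
        alpha = (target - S[t - 1]) / (S[t] - S[t - 1])
    a, b = points[t - 1], points[t]
    s_hat = [ai + alpha * (bi - ai) for ai, bi in zip(a, b)]
    return s_hat, t, alpha


def backtrack(Y, Z, xi=0.1, ratio=0.5, max_iter=60):
    """Return the first gamma in 1, ratio, ratio^2, ... satisfying test (57).

    Y and Z are hypothetical callables standing in for Y_k and Z_k.
    """
    Y0, Z0 = Y(0.0), Z(0.0)
    gamma = 1.0
    for _ in range(max_iter):
        if Y(gamma) - Y0 <= xi * (Z(gamma) - Z0):
            return gamma
        gamma *= ratio
    raise RuntimeError("no acceptable step found within max_iter reductions")
```

For the points $(0,0)$, $(1,0)$, $(1,1)$ and $\gamma = 0.75$, the sketch selects the second segment with $\alpha = 0.5$ and returns its midpoint $(1, 0.5)$. A constant ratio is only one admissible choice of the sequence $\gamma_j^k$; any sequence with ratios bounded between $\underline\gamma$ and $\bar\gamma$ fits the description in the text.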

Proof. Take $\bar k$ as given by Lemma 4.4. Then for $k \ge \bar k$ we have
$$\Phi_{k+1}(x_{k+1}) = \Phi_{\bar k}(x_{k+1}) = \Phi_{\bar k}(x_k + \hat s_k(\gamma_{j(k)}^k)) = \Phi_k(x_k + \hat s_k(\gamma_{j(k)}^k)) \le Y_k(\gamma_{j(k)}^k) < Y_k(0) = \Phi_k(x_k)$$
and therefore
$$\Phi_{k+1}(x_{k+1}) - \Phi_k(x_k) \le Y_k(\gamma_{j(k)}^k) - Y_k(0) < 0.$$
Hence the sequence $\Phi_k(x_k)$ is monotonically decreasing and therefore convergent, because it is bounded below by Assumption 1. Hence
$$-\infty < \lim_{k \to \infty} \Phi_k(x_k) - \Phi_{\bar k}(x_{\bar k}) = \sum_{k = \bar k}^{\infty} \bigl(\Phi_{k+1}(x_{k+1}) - \Phi_k(x_k)\bigr) \le \sum_{k = \bar k}^{\infty} \bigl(Y_k(\gamma_{j(k)}^k) - Y_k(0)\bigr)$$
and the assertion follows.

Proposition 4.2. Assume that Assumption 1 is fulfilled. Then
$$\lim_{k \to \infty} \hat Y_k(1) - \hat Y_k(0) = 0 \tag{63}$$
and consequently
$$\lim_{k \to \infty} \|s_k^{N_k}\| = 0. \tag{64}$$

Proof. We prove (63) by contraposition. Assuming on the contrary that (63) does not hold, by taking into account $\hat Y_k(1) - \hat Y_k(0) \le 0$ by Proposition 4.1, there exists a subsequence $K = \{k_1, k_2, \ldots\}$ such that $\hat Y_k(1) - \hat Y_k(0) \le \bar r < 0$. By passing to a subsequence we can assume that for all $k \in K$ we have $k \ge \bar k$ with $\bar k$ given by Lemma 4.4 and $N_k = \bar N$, where we have taken into account (40). By passing to a subsequence once more we can also assume that lim S t = S t, lim r, t = r, t lim r,0 t = r 0, t t {,..., N}, K K K where r t, and rt,0 are defined by (45). Note that r N r < 0. Let us first consider the case S N = 0. There exists δ > 0 such that Y (γ) Ŷ(γ) (ξ ) r N γs N K, whenever γs N δ. Since S N = 0 we can assume that S N min{δ, /} K. Then Y () Y (0) r N, + (ξ ) r N S N r N, + (ξ )r N, = ξr N, = ξ(Z () Z (0)) ξ r N < 0 and this implies that for the next iterate we have j() = and hence γj() =, contradicting (6). Now consider the case S N 0 and let us define the number τ := max{t S t = 0} +. Note that Proposition 4. yields r t,, r t+,0 λ(B ) t τ= ( t s τ s τ C B s τ s τ t τ= ) = C B t (St ) (65) and therefore r := max t> τ r t < 0, where r t := max{ r t 0, r t }. By passing to a subsequence we can assume that for every t > τ and every K we have r t,0, rt, rt Now assume that for infinitely many K we have γj() S N S τ, i.e. t (γj() ) > τ. Then we conclude ( ) Y (γj() ) Y (0) ξ(z (γj() ) Z (0)) = ξ ( α (γj() ))rt (γ j() ),0 + α (γj() )rt (γ j() ), ξ r < 0 contradicting (6).
Hence for all but finitely many $k \in K$, without loss of generality for all $k \in K$, we have $\gamma_{j(k)}^k S_k^{N_k} < S_k^{\bar\tau}$.

17 There exists δ > 0 such that Y (γ) Ŷ(γ) r τ ( ξ)γγs N 8S τ K, (66) whenever γs N δ. By eventually choosing δ smaller we can assume δ S τ / and by passing to a subsequence if necessary we can also assume that for all K we have S τ /γ δ < S τ S τ. (67) Now let for each the index j() denote the smallest j with γ j S N γ j() S N > δ and by (67) we obtain δ. It obviously holds that implying t (γ ) = τ and j() S τ γδ γγ j() S N γ j() S N δ < S τ τ γδ S α (γ ) j() S τ S τ γδ 4S τ by (67). Taing this into account together with (66) and γ j() S N δ we conclude Y (γ j() ) Ŷ(γ j() ) r τ ( ξ)γγ j() S N 8S τ ( ξ) γδ r τ 4S τ, ( ξ)α (γ j() (γ )rt j() ),. Now we can proceed as in the proof of Lemma 4. to show that j() fulfills (57). However, this yields j() j() by definition of j() and hence γj() S N γ j() S N S τ showing t (γj() ) = t (γ j() ) = τ. But then we also have α (γj() ) α γδ (γ ) j() 4 S and from τ (57) we obtain Y (γj() ) Y (0) ξ(z (γj() ) Z (0)) ξα (γj() )rt (γ j() ), ξγδ r < 0 8 S τ contradicting (6) and so (63) is proved. Condition (64) now follows from (63) because we conclude from (65) that Ŷ() Ŷ(0) C B N (S N ) C B N s N. Now we are ready to state the main result of this section. Theorem 4.. Let Assumption be fulfilled. Then every limit point of the sequence of iterates x is at least M-stationary for problem (). Proof. Let x denote a limit point of the sequence x and let K denote a subsequence such that lim K x = x. Further let λ be a limit point of the bounded sequence λ N and assume without loss of generality that lim K λ N = λ. First we show feasibility of x for the problem () together with λ g i 0 = λg i g i( x), i I and (λ H, λ G ) N P V (F ( x)). (68) Consider i I. For all it holds that ( ) 0 ( θ g i, δn )g i(x ) + g i (x )s N λ g,n i, 0. Since 0 δ N ζ, θ g i, {0, } we have ( θg i, δn ) ζ and together with sn Proposition 4. we conclude ( ) 0 lim sup g i (x ) + g i(x )s N ( θ g i, δn ) = g i ( x), K 7 0 by

18 λ g i 0 and 0 = lim λ g,n i, K ( ) g i (x ) + g i(x )s N ( θ g i, δn ) = λ g i g i( x). Hence λ g i 0 = λg i g i( x). Similar arguments show that for every i E we have ( ) 0 = lim h i (x ) + h i(x )s N K ( δ N ) = h i ( x). Finally consider i V. Taing into account (), (34) and δ N Hence, F i (x )s N ζ we obtain d(f i (x ), P ) δ N (θh i,h i (x ), θi,g G i (x )) T + F i (x )s N ζd(f i (x ), P ) + F i (x )s N. 0 by Proposition 4. implies ( ζ)d(f i ( x), P ) = lim ( ζ)d(f i (x ), P ) lim F i (x )s N = 0, K K showing the feasibility of x. Moreover, the previous arguments also imply F i (x, s N, δn ) := δn (θh i,h i (x ), θi,g G i (x )) T + F i (x ) + F i (x )s N K F i ( x). (69) Taing into account (4), the fact that λ N (9) yields (λ H,N, λ G,N fulfills M-stationarity conditions at (s N ) N P V ( F (x, s N, δn ))., δn ) for However, this together with (λ H,N, λ G,N ) (λ K H, λ G ), (69), and (3) yield (λ H, λ G ) N P V (F ( x)) and consequently (68) follows. Moreover, by first order optimality condition we have B s N + f(x ) T + i E λ h,n i, h i (x ) T + i I λ g,n i, g i(x ) T + F i (x ) T λ F,N i, = 0 i V for each and by passing to a limit and by taing into account that B s N 4. we obtain 0 by Proposition f( x) T + i E λ h i h i ( x) T + i I λ g i g i( x) T + i V F i ( x) T λ F i = 0. Hence, invoing (4) again, this together with the feasibility of x and (68) implies M-stationarity of x and the proof is complete. 5 The extended SQP algorithm for MPVC In this section we investigate what can be done in order to secure Q M -stationarity of the limit points. First, note that to prove M-stationarity of the limit points in Theorem 4. we only used that (λ H,N, λ G,N ) N P V ( F (x, s N, δn )), i.e. it is sufficient to exploit only the M-stationarity of the solutions of auxiliary problems. Further, recalling the comments after Lemma 3., the solution (s, δ) of QP (ρ, I (s, δ) I 00 (s, δ)) is M-stationary for the auxiliary problem. Thus, in Algorithm 3. 
for solving the auxiliary problem, it is sufficient to consider only the last problem of the four problems (3),(3). Moreover, the definition of the limiting normal cone () reveals that, in general, the limiting process abolishes any stationarity stronger than M-stationarity, even S-stationarity. Nevertheless, in practical situations it is likely that some assumption, securing that a stronger stationarity will be preserved in the limiting process, may be fulfilled. E.g., let $\bar x$ be a limit point

19 of x. If we assume that for all sufficiently large it holds that I 00 ( x) = I 00 (s N, δn ), then x is at least Q M -stationary for (). This follows easily, since now for all i I 00 ( x) it holds that λ G,N i, = 0, λ H,N i,, λ G,N i, 0 and consequently λ G i = lim λg,n i, = 0, λ H i = lim λh,n i, 0, λ G i = lim λg,n i, 0. This observation suggests that to obtain a stronger stationarity of a limit point, the ey is to correctly identify the bi-active index set at the limit point and it serves as a motivation for the extended version of our SQP method. Before we can discuss the extended version, we summarize some preliminary results. 5. Preliminary results Let a : R n R p and b : R n R q be continuously differentiable. Given a vector x R n we define the linear problem LP (x) f(x)d min d R n subject to a(x)d = 0, (b(x)) + b(x)d 0, d. Note that d = 0 is always feasible for this problem. Next we define a set A by (70) A := {x R n a(x) = 0, b(x) 0}. (7) Let x A and recall that the Mangasarian-Fromovitz constraint qualification (MFCQ) holds at x if the matrix a( x) has full row ran and there exists a vector d R n such that a( x)d = 0, b i ( x)d < 0, i I( x) := {i {,..., q} b i ( x) = 0}. Moreover, for a matrix M we denote by M p the norm given by and we also omit the index p in case p =. M p := sup{ Mu p u } (7) Lemma 5.. Let x A, let assume that MFCQ holds at x and let d denote the solution of LP ( x). Then for every ɛ > 0 there exists δ > 0 such that if x x δ then where d denotes the solution of LP (x). f(x)d f( x) d + ɛ, (73) Proof. The classical Robinson s result (c.f. [9, Corollary, Theorem 3]), together with MFCQ at x, yield the existence of κ > 0 and δ > 0 such that for every x with x x δ there exists ˆd with a(x) ˆd = 0, (b(x)) + b(x) ˆd 0 and d ˆd κ max{ a(x) d, ((b(x)) + b(x) d) + } =: ν. Since ˆd ˆd d + d + ν, by setting d := ˆd/( + ν) we obtain that d is feasible for LP (x) and d d + ν d ˆd + ν d ( + n)ν ( + n)ν. 
+ ν Thus, taing into account a( x) d = 0, (b( x)) + b( x) d 0 and d, we obtain d d ( + n)κ max{ a(x) a( x), b(x) b( x) + b(x) b( x) }. 9

20 Hence, given ɛ > 0, by continuity of objective and constraint functions as well as their derivatives at x we can define δ δ such that for all x with x x δ it holds that Consequently, we obtain f(x) f( x), f(x) d d ɛ/. f(x) d f(x) d d + f(x) f( x) d + f( x) d f( x) d + ɛ and since f(x)d f(x) d by feasibility of d for LP (x), the claim is proved. Lemma 5.. Let ν (0, ) be a given constant and for a vector of positive parameters ω = (ω E, ω I ) let us define the following function ϕ(x) := f(x) + ωi E a i (x) + ωi I (b i (x)) +. (74) i {,...,p} i {,...,q} Further assume that there exist ɛ > 0 and a compact set C such that for all x C it holds that f(x)d ɛ, where d denotes the solution of LP (x). Then there exists α > 0 such that holds for all x C and every α [0, α]. ϕ(x + αd) ϕ(x) να f(x)d (75) Proof. Definition of ϕ, together with u + v + (u v + ) + for u, v R, yield ϕ(x+αd) ϕ(x) f(x+αd) f(x)+ ω ( a(x+αd) a(x) + (b(x+αd) (b(x)) + ) + ). (76) By uniform continuity of the derivatives of constraint functions and objective function on compact sets, it follows that there exists α > 0 such that for all x C and every h with h α we have f(x + h) f(x), ω ( a(x + h) a(x) + b(x + h) b(x) ) ν ɛ. (77) Hence, for all x C and every α [0, α] we obtain f(x + αd) f(x) = να f(x)d + ( ν)α f(x)d + να f(x)d ( ν)αɛ + ν On the other hand, taing into account a(x)d = 0, d, (77) and 0 ( f(x + tαd) f(x))αddt αɛ = να f(x)d ν αɛ. (b(x)) + α b(x)d = ( α)(b(x)) + α((b(x)) + b(x)d) 0 we similarly obtain for all x C and every α [0, α] ω ( a(x + αd) a(x) + (b(x + αd) (b(x)) + ) + ) ω ( 0 ( a(x + tαd) a(x))αddt + ) ( b(x + tαd) b(x))αddt 0 Consequently, (75) follows from (76) and the proof is complete. ν αɛ. 0
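Lemma 5.2 guarantees that a simple backtracking loop on the test (75) terminates once the step is small enough. The following Python sketch is meant only to make that concrete: it builds the merit function $\varphi$ from (74) and halves $\alpha$ until (75) holds. It assumes that the direction `d` and the slope $\nabla f(x)d$ have already been obtained by solving $LP(x)$ with some LP solver; the names `make_merit` and `armijo_step`, the halving factor, and the toy problem in the usage note are assumptions of this sketch and do not come from the paper.

```python
def make_merit(f, a, b, omega_E, omega_I):
    """Build phi(x) = f(x) + sum_i omega_E[i]*|a_i(x)| + sum_i omega_I[i]*max(b_i(x), 0)."""
    def phi(x):
        eq_part = sum(w * abs(ai) for w, ai in zip(omega_E, a(x)))
        ineq_part = sum(w * max(bi, 0.0) for w, bi in zip(omega_I, b(x)))
        return f(x) + eq_part + ineq_part
    return phi


def armijo_step(phi, x, d, slope, nu=0.25, shrink=0.5, max_iter=50):
    """Return the first alpha in 1, shrink, shrink^2, ... satisfying test (75).

    slope is the directional derivative grad f(x) d, assumed negative.
    """
    phi_x = phi(x)
    alpha = 1.0
    for _ in range(max_iter):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if phi(trial) - phi_x <= nu * alpha * slope:
            return alpha
        alpha *= shrink
    raise RuntimeError("no acceptable step found within max_iter reductions")


# Toy usage (hypothetical data): f(x) = x_2 + 2*x_2^2, one inequality b(x) = -x_2 - 1 <= 0.
# At x = (0, 0) the direction d = (0, -1) is feasible for LP(x) and gives slope -1.
phi = make_merit(lambda x: x[1] + 2.0 * x[1] ** 2,
                 lambda x: [],
                 lambda x: [-x[1] - 1.0],
                 omega_E=[], omega_I=[1.0])
step = armijo_step(phi, [0.0, 0.0], [0.0, -1.0], slope=-1.0)
```

In the toy problem the test (75) reads $-\alpha + 2\alpha^2 \le -\nu\alpha$, so with $\nu = 0.25$ the steps $1$ and $0.5$ are rejected and $0.25$ is accepted.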

21 5. The extended version of Algorithm 4. For every vector x R n and every partition (W, W ) P(V ) we define the linear problem LP (x, W ) min d R n f(x)d subject to h i (x)d = 0 i E, (g i (x)) + g i (x)d 0 i I, F i (x)d P i W, (F i (x)) + F i (x)d P i W, d. (78) Note that d = 0 is always feasible for this problem and that the problem LP (x, W ) coincides with the problem LP (x) with a, b given by a := (h i (x), i E, H i (x), i W ) T, b := (g i (x), i I, H i (x), i W, G i (x), i W ) T. (79) The following proposition provides the motivation for introducing the problem LP (x, W ). Proposition 5.. Let x be feasible for (). Then x is Q-stationary with respect to (β, β ) P(I 00 ( x)) if and only if the solutions d and d of the problems LP ( x, I 0+ ( x) β ) and LP ( x, I 0+ ( x) β ) fulfill min{ f( x) d, f( x) d } = 0. (80) Proof. Feasibility of d = 0 for LP ( x, I 0+ ( x) β ) and LP ( x, I 0+ ( x) β ) implies min{ f( x) d, f( x) d } 0. Denote by d and d the solutions of LP ( x, I 0+ ( x) β ) and LP ( x, I 0+ ( x) β ) without the constraint d, and denote these problems by LP and LP. Clearly, we have min{ f( x) d, f( x) d } min{ f( x) d, f( x) d }. j The dual problem of LP for j =, is given by max λ R m i I λg i (g i( x)) ( i W j λ H i ( H i ( x)) + λ G i (G i( x)) ) subject to (3) and λ g i 0, i I, λh i, λg i 0, i W j, λg i = 0, i W j, (8) where λ = (λ h, λ g, λ H, λ G ), m = E + I + V, W j := I0+ ( x) β j, W j := V \ W j. Assume first that x is Q-stationary with respect to (β, β ) P(I 00 ( x)). Then the multipliers λ, λ from definition of Q-stationarity are feasible for dual problems of LP and LP, respectively, both with the objective value equal to zero. Hence, duality theory of linear programming yields that min{ f( x) d, f( x) d } 0 and consequently (80) follows. On the other hand, if (80) is fulfilled, is follows that min{ f( x) d, f( x) d } = 0 as well. 
Thus, d = 0 is an optimal solution for LP and LP and duality theory of linear programming yields that the solutions λ and λ of the dual problems exist and their objective values are both zero. However, this implies that for j =, we have λ g,j i g i ( x) = 0, i I, λ H,j i H i ( x) = 0, λ G,j i G i ( x) = 0, i V and consequently λ fulfills the conditions of λ and λ fulfills the conditions of λ, showing that x is indeed Q-stationary with respect to (β, β ). Now for each consider two partitions (W,, W, ), (W,, W, ) P(V ) and let d and d denote the solutions of LP (x, W, ) and LP (x, W, ). Choose d {d, d } such that f(x )d = min f(x )d (8) d {d,d }

22 and let (W,, W, ) {(W,, W, ), (W,, W, )} denote the corresponding partition. Next, we define the function ϕ in the following way ϕ (x) := f(x)+ σi, h h i (x) + σ g i, (g i(x)) + + σi,d(f F i (x), P )+ σi,d(f F i (x), P ). i E i I i W, i W, (83) Note that the function ϕ coincides with ϕ for a, b given by (79) with (W, W ) := (W,, W, ) and ω = (ω E, ω I ) given by ω E := (σ h i,, i E, σ F i,, i W, ), ω I := (σ g i,, i I, σf i,, i W,, σ F i,, i W, ). Proposition 5.. For all x R n it holds that 0 ϕ (x) Φ (x) σ F V max{ max i W, d(f i (x), P ), max i W, d(f i (x), P )}. (84) Proof. Non-negativity of the distance function, together with (0) yield for every i V, j =, Hence (84) now follows from j=, 0 d(f i (x), P j ) d(f i (x), P ) d(f i (x), P j ). i W j, σ F i,d(f i (x), P j ) σ F V max j=, max d(f i (x), P j ). i W j, An outline of the extended algorithm is as follows. Algorithm 5. (Solving the MPVC*). : Initialization: Select a starting point x 0 R n together with a positive definite n n matrix B 0, a parameter ρ 0 > 0 and constants ζ (0, ), ρ > and µ (0, ). Select positive penalty parameters σ = (σ, h σ g, σf ). Set the iteration counter := 0. : Correction of the iterate: Set the corrected iterate by x := x. Tae some (W,, W, ), (W,, W, ) P(V ), compute d and d as solutions of LP (x, W, ) and LP (x, W, ) and let d be given by (8). Consider a sequence of numbers α () =, α (), α(3),... with > ᾱ α(j+) /α (j) α > 0. If f(x )d < 0, denote by j() the smallest j fulfilling either Φ (x + α (j) d ) Φ (x ) µα (j) f(x )d, (85) or α (j) Φ (x ) ϕ (x ) µ f(x )d. (86) If j() fulfills (85), set x := x + α j() d. 3: Solve the Auxiliary problem: Run Algorithm 3. with data ζ, ρ, ρ := ρ, B := B, f := f( x ), h i := h i ( x ), h i := h i ( x ), i E, etc. If the Algorithm 3. stops because of degeneracy, stop the Algorithm 5. with an error message. If the final iterate s N is zero, stop the Algorithm 5. and return x as a solution. 4: Next iterate: Compute new penalty parameters σ. 
Set x + := x + s where s is a point on the polygonal line connecting the points
