Nonmonotone Trust Region Methods for Nonlinear Equality Constrained Optimization without a Penalty Function


Michael Ulbrich and Stefan Ulbrich
Zentrum Mathematik, Technische Universität München, München, Germany

Technical Report, December 2000


NON-MONOTONE TRUST REGION METHODS FOR NONLINEAR EQUALITY CONSTRAINED OPTIMIZATION WITHOUT A PENALTY FUNCTION

MICHAEL ULBRICH AND STEFAN ULBRICH

Abstract. We propose and analyze a class of penalty-function-free nonmonotone trust-region methods for nonlinear equality constrained optimization problems. The algorithmic framework yields global convergence without using a merit function and allows nonmonotonicity independently for both the constraint violation and the value of the Lagrangian function. Similar to the Byrd-Omojokun class of algorithms, each step is composed of a quasi-normal and a tangential step. Both steps are required to satisfy a decrease condition for their respective trust-region subproblems. The proposed mechanism for accepting steps combines nonmonotone decrease conditions on the constraint violation and/or the Lagrangian function, which leads to a flexibility and acceptance behavior comparable to filter-based methods. We establish the global convergence of the method. Furthermore, transition to quadratic local convergence is proved. Numerical tests are presented that confirm the robustness and efficiency of the approach.

Key words. nonmonotone trust-region methods, sequential quadratic programming, penalty function, global convergence, equality constraints, local convergence, large-scale optimization

AMS subject classifications. 65K05, 90C30

1. Introduction. We consider the nonlinear equality constrained optimization problem

$\min f(x)$ subject to $c(x) = 0$   (1.1)

with continuously differentiable functions $f : \mathbb{R}^n \to \mathbb{R}$ and $c : \mathbb{R}^n \to \mathbb{R}^m$. For the solution of (1.1) we propose a method that is inspired by the class of trust-region algorithms introduced by Byrd [2], Omojokun [23], and Dennis, El-Alem, and Maciel [9], but with the important difference that our algorithm does not use a penalty or augmented Lagrange function to test the acceptability of steps. Hereby, we are motivated by the impressive efficiency of sequential quadratic programming (SQP) filter methods, which were recently introduced by Fletcher and Leyffer [17]. The algorithm that we investigate here does not use the concept of a filter. Rather, it applies nonmonotone trust-region techniques independently to the quasi-normal subproblem and the tangential subproblem. This strategy admits a flexibility in accepting steps that is comparable with filter methods. Besides global convergence, our approach has two favorable properties that appear to be new for algorithms without a penalty function: (a) The method does not require a restoration procedure. (b) We prove that the algorithm converges locally q-quadratically, even without an additional second order correction that is needed by many algorithms to avoid the Maratos effect. For SQP filter methods global convergence has been established in Fletcher, Gould, Leyffer, and Toint [16], whereas a local convergence theory is not yet available. Recently, a globally convergent primal-dual interior-point filter method was introduced by Ulbrich, Ulbrich, and Vicente [31]. Except for the method presented in this paper, filter methods and their predecessor, the tolerance-tube approach by Zoppke-Donaldson [34], are the only algorithms for NLP we are aware of that do not require a penalty function.

We will use [2], [23], and [9] as our main references on trust-region methods for equality constrained nonlinear programming. However, there are several related approaches and recent extensions that should be mentioned. Regarding related work, we refer to Byrd, Schnabel, and Shultz [5], Celis, Dennis, and Tapia [6], El-Alem [13, 14], Powell and Yuan [26], and Vardi [32].
Lehrstuhl für Angewandte Mathematik und Mathematische Statistik, Zentrum Mathematik, Technische Universität München, München, Germany (mulbrich@ma.tum.de).
Lehrstuhl für Angewandte Mathematik und Mathematische Statistik, Zentrum Mathematik, Technische Universität München, München, Germany (sulbrich@ma.tum.de).

Recent contributions to the analysis of trust-region methods for equality constrained problems include Dennis and Vicente [12], El-Alem [15], and Lalee, Nocedal, and Plantenga [20]. Several extensions to problems involving inequality constraints have been proposed. Here we mention only those methods that extend the ideas of Byrd [2], Omojokun [23], and Dennis, El-Alem, and Maciel [9]. Some of these algorithms are based on trust-region methods for box-constrained problems, see, e.g., Coleman and Li [7], Conn, Gould, and Toint [8], Dennis and Vicente [11], Lin and Moré [22], Ulbrich, Ulbrich, and Heinkenschloss [30], and combine them with the above approaches to handle additional equality constraints. Algorithms of this type were investigated by Dennis, Heinkenschloss, and Vicente [10], Plantenga [24], and Vicente [33]. Byrd, Gilbert, and Nocedal [3] and Byrd, Hribar, and Nocedal [4] take a different approach by solving a sequence of equality constrained barrier problems. The above references underline the important role of methods for nonlinear equality constrained optimization problems, both as stand-alone methods and as solvers for subproblems.

This paper is organized as follows. In section 2 the algorithm is developed. We introduce the quasi-normal and the tangential trust-region subproblem and describe the model decrease conditions for the respective trial steps. The nonmonotone decrease conditions for constraint violation and Lagrangian function, respectively, which are the key ingredients of the new algorithm, are developed in sections 2.1 and 2.2. The full algorithm is formulated in section 2.3. In section 3 the global convergence of the algorithm is established. We first state the main result in section 3.1. The global convergence analysis starts in section 3.2 with the proof of well definedness. Section 3.3 is devoted to the development of nonmonotone decrease results. Convergence to feasible points is proved in section 3.4, convergence to stationary points in 3.5. In section 4 we show that with a Newton-type step computation the algorithm converges locally quadratically. Numerical results for problems from the CUTE collection [1] are presented in section 5.

Notations. Throughout the paper, $\|\cdot\|$ denotes the Euclidean norm $\|\cdot\|_2$. The gradient of $f$ is denoted by $\nabla f$ and $\nabla c$ denotes the transposed Jacobian of $c$. We use the abbreviations $g = \nabla f$ and $A = \nabla c$.

2. Development of the algorithm. We denote the gradient of the objective function $f$ by $g$ and write $A$ for the transposed Jacobian of $c$:

$g(x) = \nabla f(x) \in \mathbb{R}^n$, $A(x) = \nabla c(x) \in \mathbb{R}^{n \times m}$.

Following Byrd [2], Omojokun [23], and Dennis, El-Alem, and Maciel [9], we obtain the trial step $s_k = s^t_k + s^n_k$ at the current iterate $x_k$ by computing a quasi-normal step $s^n_k$ and a tangential step $s^t_k$. The purpose of the quasi-normal step $s^n_k$ is to improve feasibility. It is obtained as an approximate solution of the trust-region subproblem

$\min \|c(x_k) + A(x_k)^T s^n\|^2$ subject to $\|s^n\| \le \Delta_k$,   (2.1)

where $\Delta_k > 0$ denotes the trust-region radius. Our requirements on the steps $s^n_k$ are that there exist constants $K_1, K_2 > 0$, independent of $k$, such that $s^n_k$ admits the upper bound

$\|s^n_k\| \le \min\{K_1 \|c_k\|, \Delta_k\}$,   (2.2)

and satisfies the decrease condition

$\|c_k\|^2 - \|c_k + A_k^T s^n_k\|^2 \ge K_2 \|c_k\| \min\{\|c_k\|, \Delta_k\}$.   (2.3)

As, e.g., in [9], we will assume that the matrices $A(x_k)^T A(x_k)$ are nonsingular with uniformly bounded inverses for all $k$. Then it is well known that the Cauchy point, which is the solution of (2.1) along the direction of steepest descent at $s^n = 0$, satisfies the conditions (2.2) and (2.3) for appropriate constants $K_1$ and $K_2$.
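For illustration only, the following is a minimal sketch (hypothetical NumPy code, not taken from the paper) of such a Cauchy step for the quasi-normal subproblem (2.1): it minimizes $\|c_k + A_k^T s\|^2$ along the steepest-descent direction $-A_k c_k$ and truncates the step at the trust-region boundary.

```python
import numpy as np

def quasi_normal_cauchy_step(A, c, delta):
    """Cauchy step for  min ||c + A^T s||^2  s.t. ||s|| <= delta  (subproblem (2.1)).

    A : (n, m) transposed constraint Jacobian at x_k
    c : (m,)   constraint values c(x_k)
    delta : trust-region radius Delta_k
    """
    d = -A @ c                        # steepest-descent direction of ||c + A^T s||^2 at s = 0
    if np.linalg.norm(d) == 0.0:      # already stationary for the feasibility model
        return np.zeros(A.shape[0])
    Atd = A.T @ d                     # change of the linearized residual along d
    # unconstrained minimizer of t -> ||c + t * A^T d||^2
    t = -(Atd @ c) / (Atd @ Atd)
    # truncate at the trust-region boundary ||t * d|| <= delta
    t = min(t, delta / np.linalg.norm(d))
    return t * d
```

Under the boundedness assumptions stated below, a step of this form provides the required fraction of Cauchy decrease for the feasibility model.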

The assumptions stated below ensure the existence of constants $K_1$ and $K_2$ that are suitable for all iterations $k$. Therefore, the conditions (2.2) and (2.3) can be implemented by a fraction of Cauchy decrease condition.

To improve optimality we seek $s^t_k$ in the tangent space of the linearized constraints in such a way that it provides sufficient decrease for a quadratic model of the Lagrange function

$l(x, y) = f(x) + y^T c(x)$, $y \in \mathbb{R}^m$,

under a trust-region constraint. To this end, we define a quadratic model

$q_k(s) = (g(x_k) + A(x_k) y_k)^T s + \tfrac{1}{2} s^T H_k s$

about the current point $(x_k, y_k)$ that approximates $l(x_k + s, y_k) - l(x_k, y_k)$. Here, $H_k$ is a symmetric approximation of $\nabla^2_x l(x_k, y_k)$. Based on this model, the tangential step $s^t_k$ is computed as an approximate solution of the trust-region subproblem

$\min q_k(s^n_k + s^t)$ subject to $A(x_k)^T s^t = 0$, $\|s^t\| \le \Delta_k$,   (2.4)

satisfying the decrease condition

$q_k(s^n_k) - q_k(s^n_k + s^t_k) \ge K_3 \|W_k^T \nabla q_k(s^n_k)\| \min\{\|W_k^T \nabla q_k(s^n_k)\|, \Delta_k\}$   (2.5)

with a constant $K_3 > 0$ independent of $k$, and the feasibility condition

$A(x_k)^T s^t_k = 0$, $\|s^t_k\| \le \Delta_k$.   (2.6)

Hereby, $W_k = W(x_k)$, where $W(x)$ denotes a matrix whose columns form a basis of the null space of $A(x)^T$. Note that $W_k^T \nabla q_k(s^t)$ is the reduced gradient of $q_k$ in terms of the representation $s^t = W_k d$ of the tangential step:

$\nabla_d \big( q_k(W_k d) \big) = W_k^T \nabla q_k(W_k d) = W_k^T \nabla q_k(s^t)$.

Therefore, (2.5) can be realized by a fraction of Cauchy decrease condition for the reduced function $d \mapsto q_k(s^n_k + W_k d)$ subject to the constraint $\|W_k d\| \le \Delta_k$.

To simplify notation we will use the abbreviations $f_k = f(x_k)$, $c_k = c(x_k)$, $l_k = l(x_k, y_k)$, etc. Moreover, it will be convenient to introduce the reduced gradient $\hat g(x) = W(x)^T g(x)$. Then the first order necessary optimality conditions (Karush-Kuhn-Tucker or KKT conditions) at a local solution $\bar x \in \mathbb{R}^n$ of (1.1) can be written as

$c(\bar x) = 0$, $\hat g(\bar x) = 0$.

The algorithm is based on a combination of nonmonotone decrease criteria for the quasi-normal and tangential steps. Non-monotone trust-region methods were investigated by Toint [28] and Ulbrich [29]. We follow [29] and compare the predicted decrease promised by the trust-region model with a relaxation of the actual decrease to decide whether a step is acceptable or not. Before we give a precise description of the algorithm, we introduce our assumptions for the global convergence analysis.
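A fraction of Cauchy decrease for the reduced function can be enforced, for instance, by the following sketch (again hypothetical illustration code, assuming a dense null-space basis $W_k$ is available): it takes the steepest-descent step in the reduced variable $d$ and truncates it at the trust-region boundary $\|W_k d\| \le \Delta_k$.

```python
import numpy as np

def tangential_cauchy_step(W, H, g, A, y, s_n, delta):
    """Reduced Cauchy step for the tangential subproblem (2.4).

    Minimizes the model d -> q_k(s_n + W d) along the negative reduced gradient,
    truncated at ||W d|| <= delta, so that (2.5) holds for some K_3 > 0.
    W : (n, n-m) null-space basis of A^T,  H : (n, n) Hessian model,
    g, y, s_n : gradient, multiplier estimate, quasi-normal step,  A : (n, m).
    """
    grad_q = g + A @ y + H @ s_n         # gradient of q_k at s_n
    r = W.T @ grad_q                     # reduced gradient W_k^T grad q_k(s_n)
    if np.linalg.norm(r) == 0.0:
        return np.zeros(W.shape[0])
    d = -r                               # steepest descent in the reduced variable
    Wd = W @ d
    curv = d @ (W.T @ (H @ Wd))          # curvature of the reduced model along d
    t_bd = delta / np.linalg.norm(Wd)    # step length to the trust-region boundary
    if curv > 0:
        t = min((r @ r) / curv, t_bd)    # unconstrained minimizer, truncated
    else:
        t = t_bd                         # nonpositive curvature: go to the boundary
    return t * Wd                        # tangential step s^t = W d
```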

Assumptions: There exist an open convex set $\Omega \subset \mathbb{R}^n$ and a closed convex set $\hat\Omega \subset \Omega$ with $\mathrm{dist}(\hat\Omega, \mathbb{R}^n \setminus \Omega) > 0$ such that:

(A1) The functions $f : \Omega \to \mathbb{R}$ and $c : \Omega \to \mathbb{R}^m$ are continuously differentiable.

(A2) The matrix $A(x) = \nabla c(x)$ has full rank for all $x \in \Omega$.

(A3) The functions $f$, $g = \nabla f$, $c$, $A = \nabla c$, $(A^T A)^{-1}$, $W$, and $(W^T W)^{-1}$ are uniformly bounded on $\Omega$. Hereby, $W(x)$ denotes a matrix whose columns form a basis for the null space of $A(x)^T$.

(A4) For all $k$, $x_k$ is in $\hat\Omega$, and $x_k + s^n_k$ as well as $x_k + s_k$ are in $\Omega$.

(A5) The matrices $H_k$ and the multiplier estimates $y_k$ are uniformly bounded for all $k$.

(A6) The derivatives $g = \nabla f$ and $A = \nabla c$ are Lipschitz continuous on $\Omega$.

In section 4 we will moreover require the following assumption in order to show transition to locally quadratic convergence in a neighborhood of a stationary point $\bar x$ satisfying second order sufficient conditions.

(A7) The functions $f : \Omega \to \mathbb{R}$ and $c : \Omega \to \mathbb{R}^m$ are twice continuously differentiable. Furthermore, there exists a neighborhood $N$ of $\bar x$ on which $\nabla^2 f$ and $\nabla^2 c$ are Lipschitz continuous and $H_k = \nabla^2_x l(x_k, y_k)$ for all $x_k \in N$.

2.1. A nonmonotone decrease condition for the constraint violation. The decrease condition (2.3) for the quasi-normal step guarantees that the predicted reduction for the feasibility violation $\|c\|^2$,

$\mathrm{pred}^c_k := \|c_k\|^2 - \|c_k + A_k^T s_k\|^2$,

admits the estimate

$\mathrm{pred}^c_k \ge K_2 \|c_k\| \min\{\|c_k\|, \Delta_k\}$.   (2.7)

Note hereby that $A_k^T s_k = A_k^T s^n_k$. Clearly, the requirement that the actual reduction

$\mathrm{ared}^c_k := \|c_k\|^2 - \|c(x_k + s_k)\|^2$

should be a fraction of the predicted reduction is too restrictive, since it could impose severe restrictions on the tangential step if the feasibility is too good in comparison to the norm of the reduced gradient. In order to relax the feasibility requirement in this case and to allow nonmonotonicity, we accept the step for the constraints if

$\mathrm{rared}^c_k \ge \rho_1 \mathrm{pred}^c_k$, $\rho_1 \in (0, 1)$ fixed,

with the relaxed actual reduction

$\mathrm{rared}^c_k := \max\Big\{ R_k,\ \sum_{r=0}^{\nu^c_k - 1} \lambda^c_{kr} \|c_{k-r}\|^2 \Big\} - \|c(x_k + s_k)\|^2$.

Hereby, we require that with fixed parameters $\nu^c \in \mathbb{N}$ and $\lambda \in (0, 1/\nu^c)$ holds

$\nu^c_k = \min\{k + 1, \nu^c\}$, $\lambda^c_{kr} \ge \lambda > 0$, $\sum_{r=0}^{\nu^c_k - 1} \lambda^c_{kr} = 1$, $R_k \ge \|c_k\|^2$, usually $R_k = \|c_k\|^2$.
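As a small illustration (a hypothetical helper, not from the paper), the relaxed actual reduction $\mathrm{rared}^c_k$ can be computed from the recent history of squared constraint violations as follows; the weights and the reference value $R_k$ are supplied by the caller.

```python
import numpy as np

def relaxed_actual_reduction_c(c_hist_sq, c_new_sq, R_k, lam_weights):
    """Relaxed actual reduction rared^c_k for the constraint violation.

    c_hist_sq   : ||c_{k-r}||^2 for r = 0, ..., nu^c_k - 1 (most recent first)
    c_new_sq    : ||c(x_k + s_k)||^2 at the trial point
    R_k         : reference value from Algorithm R (R_k >= ||c_k||^2)
    lam_weights : convex weights lambda^c_{kr}, each >= lambda, summing to 1
    """
    weighted_past = float(np.dot(lam_weights, c_hist_sq))
    return max(R_k, weighted_past) - c_new_sq
```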

Before we discuss the choice of $R_k$, we notice that a maximum of nonmonotonicity is achieved by selecting an index $\bar r$, $0 \le \bar r \le \nu^c_k - 1$, such that

$\|c_{k - \bar r}\| = \max_{0 \le r < \nu^c_k} \|c_{k-r}\|$,

and setting $\lambda^c_{k \bar r} = 1 - (\nu^c_k - 1)\lambda$ and $\lambda^c_{kr} = \lambda$ for $r \ne \bar r$.

The choice of $R_k$ is an important issue in the design of the method. It is done in such a way that the feasibility requirement is relaxed if the feasibility is much better than the stationarity, i.e., if $\|c_k\| \ll \|\hat g_k\|$. If this situation is detected then instead of $R_k = \|c_k\|^2$ a larger value is chosen. In order to keep a minimum of control over the constraint violation we choose $R_k$ not larger than some upper bound $a_{j_k}^2$. Hereby, $(a_j)$ is a slowly decreasing sequence tending to zero and $j$ is only increased, i.e., $j_{k+1} = j_k + 1$, if $R_k$ yields the maximum in the first term of $\mathrm{rared}^c_k$. Thus, let $(a_j)$ be a sequence with

$a_j > 0$, $0 < \alpha_0 \le a_{j+1}/a_j < 1$, $\lim_{j \to \infty} a_j = 0$, and $\sum_{j=0}^{\infty} a_j^{\eta} = \infty$,   (2.8)

where $\eta > 4/3$ is a fixed constant. The following algorithm describes how $R_k$ is updated:

Algorithm R (Update of $R_k$): Let $0 < \alpha, \beta < 1/2$.
If $\|c_k\| < \min\{\alpha a_{j_k}, \beta \|\hat g_k\|\}$ then
  Set $R_k := \min\{a_{j_k}^2, \|\hat g_k\|^2\}$.
  If $R_k \ge \sum_{r=0}^{\nu^c_k - 1} \lambda^c_{kr} \|c_{k-r}\|^2$ then set $j_{k+1} := j_k + 1$, else set $j_{k+1} := j_k$.
Otherwise, set $R_k := \|c_k\|^2$ and $j_{k+1} := j_k$.

2.2. A nonmonotone decrease condition for the Lagrangian function. To evaluate the descent properties of the step for the objective function we use the predicted tangential reduction of the Lagrangian $l$

$\mathrm{pred}^t_k := q_k(s^n_k) - q_k(s_k)$,

the predicted reduction of $l$ for the whole step

$\mathrm{pred}^l_k := -q_k(s_k)$,

and the relaxed actual reduction of $l$

$\mathrm{rared}^l_k := \max\Big\{ l_k,\ \sum_{r=0}^{\nu^l_k - 1} \lambda^l_{kr} l_{k-r} \Big\} - l(x_k + s_k, y_k)$,

where as above with fixed $\nu^l \in \mathbb{N}$ and $\lambda \in (0, 1/\nu^l)$

$\nu^l_k = \min\{k + 1, \nu^l\}$, $\lambda^l_{kr} \ge \lambda > 0$, $\sum_{r=0}^{\nu^l_k - 1} \lambda^l_{kr} = 1$.
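For illustration, the following hypothetical Python sketch implements Algorithm R together with the "maximum nonmonotonicity" weight choice described above; the names (c_norm, ghat_norm, a_seq, ...) are ours, not the paper's, and the relation used to increase $j$ is taken as "$\ge$" as reconstructed above.

```python
import numpy as np

def max_nonmonotone_weights(c_hist_norms, lam):
    """Weights lambda^c_{kr}: lam everywhere, remaining mass on the worst recent ||c_{k-r}||."""
    nu = len(c_hist_norms)
    w = np.full(nu, lam)
    w[int(np.argmax(c_hist_norms))] = 1.0 - (nu - 1) * lam   # weights sum to 1 by construction
    return w

def algorithm_R(c_norm, ghat_norm, c_hist_sq, weights, a_seq, j_k, alpha, beta):
    """Update of the reference value R_k (Algorithm R).  Returns (R_k, j_{k+1})."""
    a_jk = a_seq(j_k)                                    # slowly decreasing sequence a_j -> 0
    if c_norm < min(alpha * a_jk, beta * ghat_norm):     # feasibility much better than stationarity
        R_k = min(a_jk**2, ghat_norm**2)
        if R_k >= float(np.dot(weights, c_hist_sq)):     # R_k will dominate the first term of rared^c_k
            return R_k, j_k + 1
        return R_k, j_k
    return c_norm**2, j_k                                # default choice R_k = ||c_k||^2
```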

REMARK 2.1. Another natural choice would be to use $l(x_k + s_k, y_{k+1})$ in the definition of $\mathrm{rared}^l_k$ with a new multiplier estimate $y_{k+1}$ and to add the term $(y_k - y_{k+1})^T (c_k + A_k^T s_k)$ in the definition of $\mathrm{pred}^t_k$ and $\mathrm{pred}^l_k$. Our convergence analysis can easily be adapted to handle this case as well and even simplifies. On the other hand, we prefer to work only with $y_k$ until an acceptable step is found, since the computation of new multipliers $y_{k+1}$ requires the usually costly evaluation of $A(x_k + s_k)$.

The computation of $s_k$ ensures that the tangential step $s^t_k$ provides decrease for the quadratic model $q_k(s^n_k + s^t)$, since $\mathrm{pred}^t_k$ satisfies (2.5). However, this descent can be destroyed by the normal step $s^n_k$ if $s^n_k$ is too large compared with $\|\hat g_k\|$. This motivates the following admissibility criterion: If $\mathrm{pred}^l_k$ promises sufficient decrease for the whole step, more precisely, if

$\mathrm{pred}^t_k \ge \max\{\mathrm{pred}^c_k, (\mathrm{pred}^c_k)^{\mu}\}$ and $\mathrm{pred}^l_k \ge \gamma\, \mathrm{pred}^t_k$, $\mu \in (2/3, 1)$, $\gamma \in (0, 1)$,

then we require $\mathrm{rared}^l_k \ge \rho_1 \mathrm{pred}^l_k$. This leads to the following evaluation of trial steps.

Evaluation of steps: Let $\mu \in (2/3, 1)$, $\gamma \in (0, 1)$, $\rho_1 \in (0, 1)$. Accept the trial step $s_k = s^n_k + s^t_k$ if $s_k$ is acceptable for the constraints, i.e.,

$\mathrm{rared}^c_k \ge \rho_1 \mathrm{pred}^c_k$,

and if $s_k$ is acceptable for the objective function: If $\mathrm{pred}^t_k \ge \max\{\mathrm{pred}^c_k, (\mathrm{pred}^c_k)^{\mu}\}$ and $\mathrm{pred}^l_k \ge \gamma\, \mathrm{pred}^t_k$, then $\mathrm{rared}^l_k \ge \rho_1 \mathrm{pred}^l_k$ holds. If the step is not acceptable then the trust-region radius $\Delta_k$ is reduced and the step $s_k$ is recomputed.
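A hypothetical sketch of this acceptance test follows; the parameter values $\rho_1 = 0.1$, $\gamma = 0.5$, $\mu = 0.75$ are illustrative defaults chosen by us, not values prescribed by the paper.

```python
def step_is_acceptable(pred_c, rared_c, pred_t, pred_l, rared_l,
                       rho1=0.1, gamma=0.5, mu=0.75):
    """Evaluation of a trial step s_k.

    The step must be acceptable for the constraints and, whenever the model
    promises sufficient decrease of the Lagrangian for the whole step, also
    for the objective function.
    """
    # acceptability for the constraints
    if rared_c < rho1 * pred_c:
        return False
    # acceptability for the objective function (tested only in the admissible case)
    if pred_t >= max(pred_c, pred_c**mu) and pred_l >= gamma * pred_t:
        return rared_l >= rho1 * pred_l
    return True
```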

2.3. The Algorithm. We now give a complete statement of the algorithm.

Algorithm A: Let $0 < \rho_1 < \rho_2 < 1$, $0 < \gamma_1 < 1 < \gamma_2$, $0 < \alpha, \beta < 1/2$, $2/3 < \mu < 1$, and $0 < \gamma < 1$. Fix $\alpha_0 \in (0, 1)$ and choose a sequence $(a_j)$ satisfying the conditions in (2.8). Choose an initial point $x_0$ and an initial trust-region radius $\Delta_0 \ge \Delta_{\min} > 0$. Set $\nu := 1$, $k := 0$, and $j_0 := 0$.

1. (Evaluate functions at $x_k$) Compute $c_k$, $A_k$, $W_k$, $f_k$, $g_k$, $\hat g_k := W_k^T g_k$, and a Lagrange multiplier estimate $y_k$.
2. (Check for termination) If $\|c_k\| + \|\hat g_k\| = 0$: STOP.
3. (Update $R_k$) Choose the weights $\lambda^{c/l}_{kr}$ for $\mathrm{rared}^{c/l}_k$. Update $R_k$ by calling Algorithm R.
4. (Compute trial steps) Compute a quasi-normal step $s^n_k$ satisfying (2.2), (2.3), and a tangential step $s^t_k$ satisfying (2.5), (2.6). Set $s_k := s^n_k + s^t_k$.
5. (Test if $s_k$ is acceptable) If $\mathrm{pred}^l_k \ge \gamma\, \mathrm{pred}^t_k$ and $\mathrm{pred}^t_k \ge \max\{\mathrm{pred}^c_k, (\mathrm{pred}^c_k)^{\mu}\}$ then go to Step 5.1, else go to Step 5.2.
5.1. If $\mathrm{rared}^c_k < \rho_1 \mathrm{pred}^c_k$ or $\mathrm{rared}^l_k < \rho_1 \mathrm{pred}^l_k$ then set $\Delta_k := \gamma_1 \Delta_k$ and go to Step 4. Else choose $\Delta_{k+1} \in [\max\{\Delta_{\min}, \Delta_k\}, \max\{\Delta_{\min}, \gamma_2 \Delta_k\}]$, set $x_{k+1} := x_k + s_k$, $k := k + 1$, and go to Step 1.
5.2. If $\mathrm{rared}^c_k < \rho_1 \mathrm{pred}^c_k$ then set $\Delta_k := \gamma_1 \Delta_k$ and go to Step 4. Else choose $\Delta_{k+1} \in [\max\{\Delta_{\min}, \Delta_k\}, \max\{\Delta_{\min}, \gamma_2 \Delta_k\}]$, set $x_{k+1} := x_k + s_k$, $k := k + 1$, and go to Step 1.

In our formulation of the algorithm we have avoided a further index that distinguishes between different instances of trial steps at iteration level $k$. To prevent possible ambiguities, we use the following

Notation: We say that the step $s_k$ is accepted (or successful) if it is used in Step 5.1 or 5.2 to compute the new iterate, i.e., $x_{k+1} = x_k + s_k$. If it is necessary to reference the accepted, i.e., final, values of $s_k$, $s^n_k$, $s^t_k$, $\Delta_k$, $\mathrm{pred}^c_k$, $\mathrm{pred}^l_k$, $\mathrm{pred}^t_k$, $\mathrm{rared}^c_k$, and $\mathrm{rared}^l_k$ at iteration level $k$, we denote them by $s_{k,a}$, $s^n_{k,a}$, $s^t_{k,a}$, $\Delta_{k,a}$, $\mathrm{pred}^c_{k,a}$, $\mathrm{pred}^l_{k,a}$, $\mathrm{pred}^t_{k,a}$, $\mathrm{rared}^c_{k,a}$, and $\mathrm{rared}^l_{k,a}$, respectively.

3. Global convergence analysis.

3.1. Statement of the global convergence result. The following theorem states the global convergence properties of Algorithm A.

THEOREM 3.1.
(i) Under assumptions (A1)-(A3), Algorithm A is well defined as long as $x_k$ stays in $\hat\Omega$. Moreover, if Algorithm A does not terminate finitely, then the following holds:
(ii) If assumptions (A1)-(A5) are satisfied, then $\lim_{k \to \infty} \|c_k\| = 0$.
(iii) If assumptions (A1)-(A6) are satisfied, then in addition $\liminf_{k \to \infty} \|\hat g_k\| = 0$.

The proof requires several steps and is carried out in the remainder of this section. In particular, part (i) is proved in section 3.2, Lemma 3.2, part (ii) in section 3.4, Lemma 3.7, and part (iii) in section 3.5, Lemma 3.9. A local convergence analysis showing the transition to fast local convergence under suitable conditions on the step computation will be given in section 4. Throughout the remainder of this section, we will not consider the case where Algorithm A terminates successfully in Step 2, since in this situation the global convergence is trivial.

For the convergence analysis it will be convenient to introduce also the actual reduction of the Lagrangian

$\mathrm{ared}^l_k := l_k - l(x_k + s_k, y_k)$.

The following estimates, obtained by the mean value theorem, will be used several times in this section. We recall that $s_k = s^n_k + s^t_k$, $A_k^T s_k = A_k^T s^n_k$, and $\max\{\|s^n_k\|, \|s^t_k\|\} \le \Delta_k$. Assume that $x_k$ and $x_k + s_k$ are contained in $\Omega$. Denoting by $\tau \in [0, 1]$ an appropriate generic constant that is adjusted from case to case, and writing $x^{\tau}_k = x_k + \tau s_k$, we find that under assumption (A1) holds

$|\mathrm{ared}^c_k - \mathrm{pred}^c_k| \le \|A_k A_k^T\| \Delta_k^2 + 4 \|A(x^{\tau}_k) c(x^{\tau}_k) - A_k c_k\| \Delta_k$,   (3.1)

$|\mathrm{ared}^l_k - \mathrm{pred}^l_k| \le 2 \big( \|g(x^{\tau}_k) - g_k\| + \|(A(x^{\tau}_k) - A_k) y_k\| \big) \Delta_k + 2 \|H_k\| \Delta_k^2$.   (3.2)
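To connect the pieces before the analysis begins, here is a hypothetical high-level sketch of the outer loop of Algorithm A; the routines evaluate_at, update_R, quasi_normal_step, tangential_step, and step_is_acceptable on the problem object are placeholders for the components described above (e.g., the Cauchy-type steps and the acceptance test sketched earlier), not an implementation from the paper, and the radius update simply takes the upper end of the admissible interval.

```python
def algorithm_A_outer_loop(x0, delta0, delta_min, problem, max_iter=100,
                           gamma1=0.5, gamma2=2.0, tol=1e-8):
    """Hedged sketch of Algorithm A's outer loop (illustrative placeholders only)."""
    x, delta = x0, max(delta0, delta_min)
    for k in range(max_iter):
        data = problem.evaluate_at(x)                    # Step 1: c_k, A_k, W_k, g_k, ghat_k, y_k
        if data.c_norm + data.ghat_norm <= tol:          # Step 2: termination (tol replaces "= 0")
            return x
        R_k = problem.update_R(data)                     # Step 3: Algorithm R
        while True:                                      # inner loop over trial radii
            s_n = problem.quasi_normal_step(data, delta)         # Step 4
            s_t = problem.tangential_step(data, s_n, delta)
            s = s_n + s_t
            if problem.step_is_acceptable(data, s, R_k, delta):  # Steps 5, 5.1, 5.2
                break
            delta *= gamma1                              # shrink the trust region and retry
        x = x + s                                        # accept the step
        delta = max(delta_min, gamma2 * delta)           # enlarge the radius for the next iteration
    return x
```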

10 8 M ULBRICH AND S ULBRICH 32 Well definedness We start by showing that the algorithm is well defined, and thus establish part (i) of Theorem 31 We first note that under assumptions (A1) (A3) normal steps sk n satisfying (22), (23), and tangential steps st k satisfying (25), (26) can be obtained by enforcing a fraction of Cauchy decrease condition Hereby, (A3) ensures that the constants K 1, K 2, K 3 in (22), (23), and (25) can be chosen independently of k as long as x k We refer to [9] and, for details on the practical computation of steps providing a fraction of Cauchy decrease, to [20] LEMMA 32 Let the assumptions (A1) (A3) hold Then for x k with c k + ĝ k > 2ε > 0 there exists δ > 0 such that the step s k is accepted in Step 5 of Algorithm A whenever k δ Hence, the algorithm is well defined as long as x k Further, if also (A4) and (A5) hold and if g and A are uniformly continuous on then δ can be chosen depending only on max{min{ε, c k }, min{ε, a jk }},, ˆ, and the bounds in assumption (A3) and (A5) Proof We start with a δ (0,ε] such that the closed δ-ball about x k lies in If x k ˆ, we always achieve this by choosing δ = min { ε, 1 2 dist( ˆ, R n \ ) } Further adjustment of δ will be performed as the proof proceeds Case 1: c k ε Since δ ε, (27) implies that for any k δ we have pred c k K 2ε k Now reduce δ such that for all s R n, s 2δ, holds A k A T k δ + 4 A(x k + s)c(x k + s) A k c k (1 ρ 1 )K 2 ε, (33) which is possible by assumption (A1) This together with (31) implies rared c k predc k aredc k predc k ρ 1pred c k + (1 ρ 1)(pred c k K 2ε k ) ρ 1 pred c k If pred l k γ predt k and predt k max{predc k,(predc k )µ } then we have to satisfy the additional condition rared l k ρ 1pred l k (see Step 51) To achieve this, we note that in this case holds pred l k γ predc k γ K 2ε k, where we have used (27) We now reduce δ > 0 such that for all s R n, s 2δ, holds H k δ + g(x k + s) g k + (A(x k + s) A k )y k γ 2 (1 ρ 1)K 2 ε, (34) which can be done by assumption (A1) Then (32) ensures that the test in Step 51 is passed for all k δ Hence, the step is accepted if k δ If (A4) and (A5) hold and if g, A are uniformly continuous on then our mechanism of reducing δ can be done depending only on ε = min{ε, c k },, ˆ and on the bounds in the assumptions Case 2: ĝ k > ε By assumption (A3) there exists K 7 > 0 with W T k q k(s n k ) ĝ k K 7 k Hence, if we reduce δ such that δ 2K 1 7 ε then for all k δ holds k 2K 1 7 ĝ k and therefore by (25) pred t k K 3 4 ε min {ε, k} = K 3 4 ε k (35) Case 21: pred t k < predc k Then pred c k K 3 4 ε k for k δ We reduce δ until for all s R n with s 2δ holds A k A T k δ + 4 A(x k + s)c(x k + s) A k c k (1 ρ 1 ) K 3 4 ε

11 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 9 This is possible by (A1) Invoking (31), we obtain rared c k aredc k ρ 1pred c k whenever k δ and thus the trial step is accepted in Step 52 If (A4) and (A5) hold and g, A are uniformly continuous, δ can be chosen depending only on ε and on the bounds in the assumptions Case 22: pred t k predc k For the acceptance of the step we have to make sure that rared c k ρ 1pred c k The additional condition rared l k ρ 1pred l k is only required if also predl k γ predt k In this case, the latter requirement is met by noting that (A1) allows to reduce δ such that for all s R n, s 2δ, holds H k δ + g(x k + s) g k + (A(x k + s) A k )y k γ 2 (1 ρ 1) K 3 4 ε Then, by (32) and (35), rared l k aredl k ρ 1pred l k if k δ If (A4) and (A5) hold and if g and A are uniformly continuous then δ can be chosen depending only on ε The first requirement rared c k ρ 1pred c k can be achieved by reducing δ further according to the following cases: Case 221: c k > min { αa jk,βε } def = ε jk Reduce δ such that δ c k and that for all s R n, s 2δ, holds A k A T k δ + 4 A(x k + s)c(x k + s) A k c k (1 ρ 1 )K 2 c k This is again possible by (A1) By (27) we have pred c k K 2 c k k if k δ Hence, rared c k aredc k ρ 1pred c k by (31) whenever k δ and therefore the step is accepted in 51 If (A4) and (A5) hold and if g and A are uniformly continuous then δ can be chosen depending only on min{ε, c k } > min { αa jk,βε } and the bounds in the assumptions Case 222: c k ε jk Then R k min{a 2 j k,ε 2 } ε 2 j k / max{α 2,β 2 } 4ε 2 j k and with suitable τ [0, 1] and x τ k = x k + τ s k holds rared c k R k c k 2 (A(x τ k )c(xτ k ))T s k Since (A k c k ) T s k = (A k c k ) T s n k 0, this gives rared c k 3ε2 j k A(x τ k )c(xτ k ) A kc k k Moreover, pred c k c k 2 ε 2 j k Now reduce δ such that for all s R n, s 2δ, holds 3ε 2 j k A(x k + s)c(x k + s) A k c k δ ρ 1 ε 2 j k This is possible by (A1) Then rared c k ρ 1pred c k and, hence, the trial step is accepted in Step 52 for all k δ If (A4) and (A5) hold and if g and A are uniformly continuous then δ can be chosen depending only on min{αa jk,βε} = ε jk min{ε, c k } and the bounds in the assumptions 33 A nonmonotone decrease Lemma The following crucial decrease Lemma is a slight modification of [29, Lem 43] LEMMA 33 Suppose that there exists K 0 such that for all iteration levels k K holds rared c k,a = max { c k 2, ν c k 1 r=0 λ c kr c k r 2 } c k+1 2

12 10 M ULBRICH AND S ULBRICH Then for all k K c k+1 2 max R K νk c <l K l ρ 1 k λ min{k r,νc} predr,a c (36) Proof Set M K = max K ν c K <l K R l and ared c k,a = c k 2 c k+1 2 The proof is by induction Since R l c l 2, 0 l K, we have for k = K r=k M K c k+1 2 rared c k,a ρ 1pred c k,a Now let k K If rared c k+1,a = aredc k+1,a we get by (36) k c k+2 2 = c k+1 2 rared c k+1,a M K ρ 1 k λ min{k r,νc} predr,a c ρ 1pred c k+1,a, which implies (36) k+1, since 0 < λ < 1 Now consider the case where rared c k+1,a aredc k+1,a Then we obtain with q = νc k+1 1 by using (36) and the fact that c k+1 p 2 M K for K ν c K < k + 1 p K, c k+2 2 = q λ c k+1,p c k+1 p 2 rared c k+1,a p=0 q p=0 λ c k+1,p k q M K ρ 1 r=k k p M K ρ 1 λ min{k p r,νc} predr,a c ρ 1 pred c k+1,a r=k ρ 1 pred c k+1,a r=k λ min{k r,νc} pred c r,a ρ 1λ k r=max{k,k q+1} λ min{k r,νc} pred c r,a Now r k q + 1 yields k r q 1 ν c 2 and therefore 1 + min {k r,ν c } = min {k + 1 r,ν c } Thus, we see from the last chain of inequalities that c k+2 2 M K ρ 1 k λ min{k+1 r,νc} predr,a c ρ 1pred c k+1,a r=k k+1 = M K ρ 1 r=k λ min{k+1 r,νc }pred c r,a which concludes the proof To show the convergence towards stationary points we will moreover use the following decrease Lemma, which is very similar to the previous Lemma 33 LEMMA 34 Suppose that there exists K 0 such that for all iteration levels k K holds Then for all k K l k+1 rared l k,a + l(x k+1, y k ) l k+1 ρ 1 2 predl k,a max l l ρ 1 K νk l <l K 2 k λ min{ k r,ν l} predr,a l (37) r=k

13 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 11 Proof We only note that by the definition of rared l k,a holds { ν rared l k,a + l(x k l 1 k+1, y k ) l k+1 = max l k, r=0 λ l kr l k r } l k+1 Now (37) follows exactly by the same arguments as in the proof of Lemma (33) 34 Convergence to feasible points The following auxiliary result will be useful: LEMMA 35 Let the assumptions (A1) (A4) hold If j is increased by Algorithm R in iteration k, ie, j k+1 = j k + 1, then c k 1 λ a jk for all k k In particular, for all iterations k with j k 1 holds c k a j k λα0, (38) where α 0 is the constant in (28) Proof If j k+1 = j k + 1 then we must have c k < min{αa jk,β ĝ k } αa jk and a 2 j k R k ν c k 1 r=0 λ c kr c k r 2 Thus, using λ c kr λ, we obtain c k 2 1 λ a2 j k, k = k + 1 ν c k,,k We now show by induction c k 1 λ a jk for all k k + 1 ν c k (39) For k = k + 1 νk c,, k this is already shown Now let the assertion hold for the iterations k + 1 νk c,,k k Since by Lemma 32 the k -th iteration will eventually be successful, we obtain in particular that 0 ρ 1 pred c k,a raredc k,a = max {R k, ν c k 1 r=0 λ c k r c k r 2 } c k +1 2 { } } Since R k max c k 2, a 2 j max { c k k 2, a 2, we have by the induction hypothesis jk { c k +1 2 max a 2 j k, c k 2,, c k +1 ν c k 2} 1 λ a2 j k This proves the first assertion

14 12 M ULBRICH AND S ULBRICH Now c k a jk / λ follows immediately if j is increased by Algorithm R in iteration k, ie, j k+1 = j k + 1 For all subsequent iterations k satisfying j k = j k+1 we have by our previous result and by (28) c k a j λ k = a j k 1 a j k λ α 0 λ Therefore, (38) holds for all k with j k 1 The next Lemma shows that c k must converge to zero if the assumption of the decrease Lemma 33 does not hold LEMMA 36 Let the assumptions (A1) (A4) hold If for infinitely many iterations holds rared c k,a max { c k 2, ν c k 1 r=0 λ c kr c k r 2 } c k+1 2 then j k and c k 0 Proof Under the assumptions of the Lemma, there exists an infinite subsequence of iterations k for which holds ν { c k 1 R k = min a 2 j, ĝ k k 2} > c k 2 and R k > λ c k r c k r 2 Hence, j k +1 = j k + 1 in each iteration k and thus we must have j k But now Lemma 35 yields lim c a jk k lim λα0 = 0 k k r=0 Combining Lemmas 33 and 36, we can establish convergence to feasible points, which proves part (ii) of Theorem 31 LEMMA 37 If Algorithm A does not terminate finitely then lim c k = 0 k Proof Assume that c k does not tend to zero Then Lemma 36 yields possibly after increasing K rared c k,a = max { c k 2, ν c k 1 r=0 λ c kr c k r 2 } c k+1 2 for all k K (310) Thus Lemma 33 is applicable and we obtain that for all k K holds c k+1 2 M K ρ 1 λ νc We first show that k r=k pred c k,a, where M K = max K ν c K <l K R l (311) lim inf k c k = 0 (312)

15 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 13 If this is wrong then possibly after increasing K there exists ε > 0 with c k ε for all k K Now by (23) which, together with (311), shows that pred c k,a K 2ε min { ε, k,a }, (313) k,a < (314) k=k Thus, (x k ) ˆ is a Cauchy sequence and converges to some x ˆ The continuity of c, A, and g and the boundedness of (H k ) and (y k ), see (A1) and (A5), implies the existence of 0 < δ ε and δ > 0 such that for all x k with x k x δ and all s with s 2δ the inequalities (33) and (34) are satisfied In the proof of Lemma 32, Case 1, it was shown (note c k ε) that the step is accepted if (33), (34) hold for all s with s 2δ and if in addition k δ Since for all sufficiently large k K we have x k x δ, the mechanism of updating k would thus ensure that the step is accepted with k,a min { min,γ 1 δ} This contradicts (314) and (312) is proven Now assume that (312) holds, but c k does not converge to zero Then there is ε > 0 with cˆk 2ε for a subsequence (ˆk) By (312), we can associate with each ˆk some k ˆk with c k+1 < ε, c k ε, k = ˆk,, k As a consequence we have by (23) pred c k,a K 2ε min { } ε, k,a, k = ˆk,, k (315) Since moreover (311) holds, we must have k k=ˆk and thus by (315) and s k,a 2 k,a pred c k,a 0 for ˆk x k+1 xˆk 2 k k=ˆk k,a 0 for ˆk Since c is Lipschitz continuous on by (A3), we conclude that ε = 2ε ε c k+1 cˆk c k+1 cˆk 0 for ˆk which is a contradiction Hence, our assumption was wrong and the proof is complete 35 Convergence to stationary points As a next step we consider the convergence behavior of the reduced gradient ĝ k = W T k g k The following Lemma gives an important lower bound for acceptable trust-region radii Hereby, we establish two variants of the result One holds in the general setting of assumptions (A1) (A6) The second result is stronger and holds under assumption (A7) It is used in section 4 to achieve locally quadratic convergence LEMMA 38 Let the assumptions (A1) (A6) be satisfied Then the following holds

16 14 M ULBRICH AND S ULBRICH (i) There exists a constant κ 1 > 0 independent of k such that rared c k ρ 1pred c k is satisfied whenever { max{ s k n, st k } δ k = def κ 1 min 1, max{min{a jk, ĝ k }, c k } 2/3} (316) (ii) There exists a constant κ 2 > 0 independent of k such that the step s k is accepted whenever max{ s n k, st k } min { δ k,κ 2 max{ ĝ k, c k } } (317) with δ k as in (i) (iii) If in addition (A7) holds then there exists θ > 0 and κ 1 > 0 in (i) can be chosen such that the step s k is accepted whenever x k x < θ and max{ s n k, st k } min { δ k,κ 1 max{ ĝ k, c k } µ} (318) Proof Set σ k = max{ s n k, st k } and note that σ k k (i): Taylor expansion yields with x τ k = x k + τ s k and appropriate τ [0, 1] ared c k predc k 4 A(xτ k )A(xτ k )T A k A T k σ 2 k c(x τ k ) c(xτ k ) σ 2 k Using (A3), (A6) we conclude that there exists K 5 > 0 with We now consider two cases ared c k predc k K 5σ 2 k (σ k + c k ) (319) Case 1: c k min{αa jk,β ĝ k } Since rared c k aredc k, the decrease condition (23) and (319) ensure that raredc k ρ 1pred c k holds if K 5 σ 2 k (σ k + c k ) (1 ρ 1 )K 2 c k min{ c k,σ k } (320) If c k σ k, (320) holds if 2K 5 σ 3 k (1 ρ 1)K 2 c k 2, ie, if In the case c k > σ k, (320) is satisfied if σ k C 1/3 1 c k 2/3, C 1 def = (1 ρ 1)K 2 2K 5 2K 5 σ 2 k c k (1 ρ 1 )K 2 c k σ k, ie, if σ k C 1 Since c k min{αa jk,β ĝ k } the assertion (i) thus holds with κ 1 = C 2 def = min{c 1, C 1/3 1 min{α,β} 2/3 } Case 2: c k < min{αa jk,β ĝ k } Then R k = min{a 2 j k, ĝ k 2 } according to Algorithm R We obtain rared c k R k c k+1 2 = R k c k 2 + pred c k + (aredc k predc k )

17 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 15 Using (319) we get rared c k predc k if R k c k 2 K 5 σ 2 k (σ k + c k ) (321) Case 21: R k = ĝ k 2 Then R k c k 2 (1 β 2 ) ĝ k 2 since c k β ĝ k, and (321) is satisfied if If ĝ k > σ k, (322) holds if (1 β 2 ) ĝ k 2 K 5 σ 2 k (σ k + β ĝ k ) (322) (1 β 2 ) ĝ k 2 K 5 (1 + β) ĝ k 2 σ k, ie, if σ k C 3 def = 1 β K 5 If ĝ k σ k, (322) holds if (1 β 2 ) ĝ k 2 K 5 (1 + β)σ 3 k, ie, if σ k C 1/3 3 ĝ k 2/3 Therefore, since c k β ĝ k, (322) holds if { σ k min C 3, C 1/3 3 max { ĝ k, c k } } 2/3 Case 22: R k = a 2 j k Then R k c k 2 (1 α 2 )a 2 j k and c k αa jk As in Case 21, (321) holds if which yields with c k αa jk { } σ k min C 4, C 1/3 4 a 2/3 def j k, C 4 = 1 α, K 5 σ k min { C 4, C 1/3 4 max { a jk, c k } 2/3 } { } Thus, the assertion (i) is proven with κ 1 = min C 2, C 3, C 1/3 3, C 4, C 1/3 4 (ii): If pred t k < max { pred c k,(predc k )µ} or pred l k < γ predt k then no further acceptance criteria are required and we are done Otherwise, we have pred t k max{ pred c k,(predc k )µ} and pred l k γ predt k, (323) and get a further restriction on σ k by the requirement rared l k ρ 1pred l k Now (32) and (A6) yield a constant K 6 > 0 with ared l k predl k K 6σ 2 k (324) We consider first the case that c k ĝ k We know that in the present case holds pred l k γ predt k γ predc k γ K 2 c k min { c k,σ k } Since rared l k aredl k, we conclude from (324) that raredl k ρ 1pred l k is ensured if K 6 σ 2 k γ(1 ρ 1)K 2 c k min { c k,σ k }

18 16 M ULBRICH AND S ULBRICH This is satisfied if σ k C 5 c k, { ( ) } γ(1 ρ 1 )K 2 γ(1 ρ1 )K 1/2 2 C 5 = min, K 6 K 6 Hence, in the case c k ĝ k the step is accepted if (317) holds with κ 2 = C 5 We now consider the case ĝ k c k By (A3) and (A5) there exists a constant K 7 > 0 with W T k q k(s n k ) ĝ k K 7 s n k ĝ k K 7 σ k, (325) W T k q k(s n k ) ĝ k + K 7 s n k ĝ k + K 1 K 7 c k, (326) where we have used (22) in the last inequality Thus, we obtain for σ k 1 2K 7 ĝ k by (25) and (325) pred l k γ predt k γ K 3 4 ĝ k min { ĝ k,σ k } (327) Hence, by (324) we have rared l k ρ 1pred l k whenever σ k 1 2K 7 ĝ k and K 6 σ 2 k γ(1 ρ 1) K 3 4 ĝ k min { ĝ k,σ k } All this is satisfied whenever { 1 σ k C 5 ĝ k, C 5 = min, γ(1 ρ ( ) } 1)K 3 γ(1 ρ1 )K 1/2 3, 2K 7 4K 6 4K 6 Hence, the step is accepted if (317) holds with κ 2 = min{c 5, C 5 }, which completes the proof of (ii) (iii): Now let in addition (A7) hold We choose θ > 0 such that the 2θ-neighborhood of x is contained in N In the rest of the proof we show that after a possible further reduction of κ 1 the step is accepted whenever x k x < θ and (318) is satisfied Therefore, let x k satisfy x k x < θ As already in (ii), we have only to consider the case (323) in which we have to ensure that rared l k ρ 1pred l k To this end, let 0 < κ 1 θ/2 be such that (i) holds for κ 1 = κ 1 (and thus for all 0 < κ 1 κ 1 ) and consider steps that satisfy (316) for κ 1 = κ 1 In particular, we then have σ k θ/2 and thus [x k, x k + s k ] N Using (A7), Taylor expansion yields ared l k predl k = 1 2 st k ( 2 x l(xτ k, y k) 2 x l(x k, y k ))s k for appropriate x τ k = x k + τ s k, τ [0, 1] Thus, (A7) yields a constant K 8 > 0 with ared l k predl k K 8σ 3 k (328) Now we conclude from (22) and the first inequality in (325) that We consider first the case W T k q k(s n k ) ĝ k K 1 K 7 c k (329) c k 1 max{ ĝ k, Wk T 2K 1 K q k(sk n ) } (330) 7

19 Then by (329) NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 17 W T k q k(s n k ) 1 2 ĝ k and thus (327) holds by (25) Hence, (327), (328) guarantee rared l k ρ 1pred l k whenever This is satisfied for σ k C 6 min{ ĝ k 2/3, ĝ k 1/2 }, Moreover, we have in the case (330) K 8 σ 3 k γ(1 ρ 1) K 3 4 ĝ k min { ĝ k,σ k } { (γ(1 ) ρ1 )K 1/3 ( ) } 3 γ(1 ρ1 )K 1/2 3 C 6 = min, 4K 8 4K 8 c k 1 K 1 K 7 ĝ k (331) In fact, either (331) follows directly from (330) or we have W T k q k(s n k ) 2K 1K 7 c k and thus (326) yields (331) We now choose κ 1 = min{κ 1, C 6 min{1,(k 1 K 7 ) µ }} Then (318) implies σ k δ k κ 1 and (note that µ > 2/3) σ k κ 1 min{1, max{ ĝ k, c k } µ } κ 1 min{1, ĝ k µ max{1,(k 1 K 7 ) µ }} C 6 min{1, ĝ k µ } C 6 min{ ĝ k 2/3, ĝ k 1/2 } Hence, the proof of (iii) is complete in the case (330) It remains to consider the case c k > 1 max{ ĝ k, Wk T 2K 1 K q k(sk n ) } (332) 7 Now, since s t k = W kd k with d k = (W T k W k) 1 W T k st k, (A3) and (A5) yield a constant K 9 > 0 such that pred t k W T k q k(s n k ) (W T k W k) 1 W T k st k H k σ 2 k K 9 σ k ( W T k q k(s n k ) + σ k) Since pred t k max{ pred c k,(predc k )µ}, we deduce from (23) which yields Introducing the constant C 7 = K 9 σ k ( W T k q k(s n k ) + σ k) K µ 2 c k µ min{ c k µ,σ µ k } K 9 σ k (2K 1 K 7 c k + σ k ) K µ 2 c k µ min{ c k µ,σ µ k } K µ 2 (1+2K 1 K 7 )K 9, this implies σ 2 k C 7 c k 2µ, if c k σ k, (333) σ k c k C 7 c k µ σ µ k, if c k > σ k (334)

20 18 M ULBRICH AND S ULBRICH In the case (333) we obtain σ k C 1/2 7 c k µ Since c k is bounded by (A3) and 2/3 < µ < 1, we see that in the case (334) there exists a constant C 8 > 0 with σ k C 8 We conclude that in the situation (323), (332) always holds σ k min{c 1/2 7 c k µ, C 8 } Since c k 1 2K 1 K 7 ĝ k, we find that with C 9 = min{c 1/2 7 min{1,(2k 1 K 7 ) µ }, C 8 } holds σ k C 9 min{1, max{ c k, ĝ k } µ } Choosing κ 1 = min{κ 1, C 9}, this concludes the proof of (iii) also for the case (332) The following Lemma establishes part (iii) of Theorem 31 LEMMA 39 Let (A1) (A6) hold If the algorithm does not terminate finitely then lim inf k ĝ k = 0 Proof Assume that the algorithm runs infinitely and that there are K 0 and ε (0, 1] with ĝ k > 2ε for all k K We first show that after a possible increase of K holds pred l k,a γ predt k,a, predt k,a max{ pred c k,a,(predc k,a )µ}, for all k K As in the proof of Lemma 38, see (329), there exists a constant K 7 > 0 with W T k q k(s n k,a ) ĝ k K 7 s n k,a 2ε K 1K 7 c k for all k K, where we have used (22) Since c k 0 by Lemma 37, we can increase K such that Wk T q k(sk,a n ) ε for all k K, and thus by (25) pred t k,a K 3ε min { ε, k,a } for all k K (335) On the other hand holds pred c k,a 2 A kc k s n k,a + A k A T k sn k,a 2, and, hence, by (22) and (A3) find a constant K 10 > 0 with pred c k,a K 10 c k min { c k, k,a } (336) Since c k 0 by Lemma 37 and ĝ k ε, we obtain from Lemma 38, (i) (ii), and the mechanism of updating k that after a possible increase of K holds k,a γ 1 κ 1 c k 2/3 c k for all k K, (337) and thus we have by (335) (337) that, for sufficiently large K, pred c k,a K 10 c k 2, pred t k,a K 3εγ 1 κ 1 c k 2/3 for all k K (338) Hence, using µ > 2/3 and c k 0, we see from (338) that, possibly after increasing K, pred t k,a > max{ pred c k,a,(predc k,a )µ} for all k K (339)

21 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 19 Moreover, we note that by (A3), (A5), and (22) there is K 11 > 0 with Hence, K can by (338) be enlarged such that pred l k,a predt k,a = q k(s n k,a ) K 11 c k pred l k,a γ predt k,a for all k K (340) Therefore, for all k K the accepted step s k,a satisfies (339) and (340) Thus, for all k K, the acceptance of the step takes place in Step 51 In particular, the accepted steps satisfy rared l k,a ρ 1pred l k,a γρ 1K 3 ε min { ε, k,a } for all k K, (341) where we have used (335) By our assumption holds ĝ k 2ε for all k K Thus Lemma 38, (i) (ii) and the mechanism of updating k yields a constant C 1 > 0 with k,a C 1 min { 1, max{min{a jk, 2ε}, c k } 2/3} C 1 min { } ε, a 2/3 j k (342) Hence, we obtain from (341) some C 2 > 0 with { } pred l k,a C 2ε min ε, a 2/3 j k for all k K (343) By (341) holds rared l k,a ρ 1pred l k,a for all k K We want to apply the decrease Lemma 34 and since rared l k,a uses l(x k+1, y k ), we show that l k+1 l(x k+1, y k ) becomes small compared to pred l k,a In fact, l(x k+1, y k+1 ) l(x k+1, y k ) y k+1 y k c k+1 (344) Using pred c k,a 0, we have with (319) c k+1 2 = c k 2 ared c k,a c k 2 + ared c k,a predc k,a c k 2 + K 5 2 k,a ( k,a + c k ) This together with (337) yields a constant K 12 > 0 such that c k+1 2 K 12 3 k,a for all k K (345) Using (A5), (341), (344), and (345) we obtain possibly after increasing K l(x k+1, y k+1 ) l(x k+1, y k ) ρ 1 2 predl k,a for all k K In fact, by (341), (344), (345) this is clear for k,a C 3, C 3 > 0 small enough After increasing K this holds also for all k,a C 3 > 0, since by c k+1 0 according to Lemma 37 the left term tends by (344) to zero whereas the right hand side has by (341) a positive lower bound Hence, we get with (341) rared l k,a + l(x k+1, y k ) l k+1 ρ 1 2 predl k,a for all k K This yields by Lemma 34 with M l K = max K ν l K <l K l l l k+1 M l K ρ 1 2 λνl k r=k pred l k,a

22 20 M ULBRICH AND S ULBRICH for all k K Since l k is bounded from below by (A3) and (A5) this gives with (343) k=k C 2 min{1, a 2/3 j k } pred l k,a < But the left hand side is not summable because a η j is not summable and η > 2/3 Hence, we have derived a contradiction and the proof is complete 4 Transition to fast local convergence Throughout this section we assume that assumptions (A1) (A7) hold We now show that the proposed Algorithm A converges with local quadratic rate towards a point satisfying the second order sufficient condition Hereby, we work with an SQP-Newton-type step computation These steps are shown to be accepted by our algorithm in a neighborhood of a stationary pair ( x, ȳ) ˆ R m satisfying the following standard sufficient second order condition: (O2) k=k ( ) c( x) = 0, W( x) T x 2 l( x, ȳ)w( x) positive definite, A( x) has full column rank ĝ( x) The last condition ensures that the Lagrange multiplier ȳ is unique 41 Requirement on the step computation To achieve fast local convergence we have to ensure that close to the solution SQP-Newton steps are taken This requires an appropriate splitting of these steps in their quasi-normal and tangential part and a careful choice of the Lagrange multiplier update rule For the derivation of these concepts, we begin by collecting some facts about the local convergence behavior of SQP methods Hereby, we choose an informal style of presentation since these results are by now well-known The SQP-Lagrange-Newton system is given by ( 2 x l(x k, y k ) A k Ak T 0 ) ( ) sn,k = z N,k ( ) x l(x k, y k ) c k Under the assumptions (A7) and (O2) it is well known that for all ζ (0, 1) there exist neighborhoods U N of x and V N of ȳ such that for x k U N, y k V N the steps s N,k and z N,k are well defined with x k + s N,k U N, y k + z N,k V N, x k + s N,k x + y k + z N,k ȳ ζ ( x k x + y k ȳ ), (41) and that for (x k, y k ) ( x, ȳ) holds x k + s N,k x + y k + z N,k ȳ = O( x k x 2 + y k ȳ 2 ), (42) l(x k + s N,k, y k + z N,k ) + c(x k + s N,k ) = O( l k 2 + c k 2 ) (43) Furthermore, if s n N,k and st N,k satisfy A T k sn N,k = c k, s t N,k = W kdn,k t, where (W k T 2 x l kw k )dn,k t = (ĝ k + Wk T 2 x l ks n N,k ), (44) then we have s N,k = s n N,k + st N,k Hereby, we have used the identity ĝ k = Wk T xl k Note that s n N,k solves the unconstrained quasi-normal problem min c k + A T k sn 2,

23 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 21 and that s t N,k is the corresponding solution of the unconstrained tangential problem min q k (s n N,k + st ) subject to A T k st = 0 with H k = 2 x l(x k, y k ) We would like to achieve quadratic convergence of (x k ) rather than (x k, y k ) To this end, similar as in [19], let Y : U N R m be a consistent update rule for the Lagrange multiplier, which is Lipschitz continuous at x, ie, Y( x) = ȳ, Y(x) Y( x) L y x x for all x U N (45) By a possible reduction of U N and V N we achieve that (41) holds for ζ 1/(2+2L y ), and a further reduction of U N yields Y(U N ) V N Thus, for all x k U N holds y k = Y(x k ) V N and x k + s N,k x ζ ( x k x + y k ȳ ) 1 2 x k x (46) x k + s N,k x = O( x k x 2 + y k ȳ 2 ) = O( x k x 2 ), (47) where we have used (41), (42), and (45) Therefore, the iteration x k x k+1 = x k + s N,k with y k = Y(x k ) converges q-quadratically to x In the sequel we restrict ourselves to the following class of update rules: Let B : U N R n m be continuously differentiable such that A T B is uniformly bounded invertible on U N We introduce the multiplier update Y(x) = (B(x) T A(x)) 1 B(x) T g(x), (48) which is obviously consistent and continuously differentiable Therefore, after reducing U N if necessary, B is bounded on U N and Y satisfies the Lipschitz condition In particular, if we choose B = A, we obtain the well-known least-squares multiplier update Furthermore, the adjoint update, which is widely used in optimal control, also fits in this framework: Let x be partitioned in the form x = (z T, u T ) T R m R n m such that z c(x) is invertible on U with uniformly bounded inverse In an optimal control context the standard choice for z is the state, and for u the control The adjoint update for this splitting now corresponds to B T = (Bz T, BT u ) = (I, 0) Among the many possible solutions s n N,k of the quasi-normal problem we select the one contained in span(b k ), ie, By construction holds for y k = Y(x k ) s n N,k = B k(a T k B k) 1 c k (49) l(x k, y k ) T s n N,k = 0 (410) Further, there exist constants K 1, K 13 > 0 such that s n N,k K 1 c k, s t N,k K 13( ĝ k + c k ), (411) where the first inequality follows from (49) and for the derivation of the second inequality we use (44) to obtain s t N,k = W k(w T k 2 x l kw k ) 1 (ĝ k + W T k 2 x l ks n N,k )
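For illustration, the following hypothetical sketch (dense NumPy linear algebra, names chosen by us) computes the Newton-type quasi-normal step, tangential step, and multiplier update in the spirit of (4.4), (4.8), and (4.9). The minus signs are our reconstruction: with the convention $l(x, y) = f(x) + y^T c(x)$ they give $A_k^T s^n_{N,k} = -c_k$ and $B_k^T \nabla_x l(x_k, y_k) = 0$, which is what the orthogonality relation (4.10) requires; in practice the reduced system would typically be solved by (projected) CG rather than a direct factorization.

```python
import numpy as np

def sqp_newton_steps(A, W, B, g, c, hess_lag):
    """Newton-type quasi-normal/tangential steps and multiplier update (sketch).

    A : (n, m) transposed constraint Jacobian, W : (n, n-m) null-space basis of A^T,
    B : (n, m) matrix defining the multiplier update (B = A gives least squares),
    g : (n,) objective gradient, c : (m,) constraints, hess_lag : (n, n) Hessian of l.
    """
    # multiplier update  y = -(B^T A)^{-1} B^T g   (least-squares multipliers for B = A)
    y = -np.linalg.solve(B.T @ A, B.T @ g)
    # quasi-normal step  s^n = -B (A^T B)^{-1} c   (satisfies A^T s^n = -c, s^n in span(B))
    s_n = -B @ np.linalg.solve(A.T @ B, c)
    # reduced Newton system for the tangential step:
    #   (W^T H W) d = -(ghat + W^T H s^n),   s^t = W d
    ghat = W.T @ g
    reduced_H = W.T @ hess_lag @ W
    d = np.linalg.solve(reduced_H, -(ghat + W.T @ (hess_lag @ s_n)))
    return s_n, W @ d, y
```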

24 22 M ULBRICH AND S ULBRICH Furthermore, the uniformly bounded invertibility of B T A and the fact that A T W = 0 yield the uniformly bounded invertibility of (B W) and thus ( ( ) ) 0 x l k = O( (B k W k ) T x l k ) = O = O( ĝ k ) Hence, using that ĝ(x) = W(x) T x l(x, y) for all y R m, we obtain ĝ(x k + s N,k ) + c(x k + s N,k ) = O( x l(x k + s N,k, y k + z N,k ) ) + c(x k + s N,k ) = O( x l k 2 + c k 2 ) = O( ĝ k 2 + c k 2 ) (412) Collecting the results obtained so far, we have PROPOSITION 41 Let (A7) hold and assume that x ˆ satisfies the second order sufficient condition (O2) Let B(x) R n m be continuously differentiable in a neighborhood U N of x such that A T B is uniformly bounded invertible on U N Then for x k U N sufficiently close to x, y k = Y(x k ), s n N,k, and st N,k as given in (48), (49), and (44) are well defined and satisfy (410), (411) Furthermore, (46), (47), and (412) hold The following assumption states our requirements on the step computation that we need to prove fast local convergence Assumption: (A8) x ˆ satisfies the second order sufficient condition (O2), and (x k ) converges to x Moreover, there exists a neighborhood U N of x such that for all x k U N holds: (i) The Lagrange multiplier estimates are computed by y k = Y(x k ) with Y given by (48), where B(x) R n m is continuously differentiable on U N and A T B is uniformly bounded invertible on U N (ii) The step sk n = sn N,k with sn N,k as in (49) is chosen whenever sn N,k k (iii) If the reduced Hessian Wk T 2 x l kw k is positive definite, then s t N,k is computed according to (44) and sk t = st N,k is chosen whenever st N,k k REMARK 42 A possible implementation of (iii) is obtained by applying Steihaug s conjugate gradient method to (24) in the reduced variables d, where s t = W k d (or in its projected form [18]) If the reduced Hessian is positive definite then the CG-path either leaves the trust-region (in this case holds s t N,k > k), or it stays in the trust-region and converges to dn,k t If the reduced Hessian is not positive definite, the Steihaug method either detects negative curvature or stops since the path leaves the trust-region As is well known, one can allow inexactness without destroying the rate of convergence Due to space limitations this issue is not discussed here 42 Quadratic local convergence The next result shows that with the rule (A8) for the step computation Algorithm A eventually takes Newton steps THEOREM 43 Let (A1) (A8) hold Then the trial steps according to (44), (49) are eventually taken by Algorithm A and thus (x k ) converges q-quadratically to x The proof of this result requires some work We start with the following auxiliary result LEMMA 44 Let (A1) (A8) hold and let τ satisfy { 2 3 < τ < min µ, 2 3µ, η } (413) 2 Then there is K > 0 such that the following is true: If for some iteration k K holds ĝ k a τ j k > ĝ k or c k τ > ĝ k, (414)

25 NONMONOTONE TRUST-REGION METHODS WITHOUT PENALTY FUNCTION 23 then for all k k Algorithm A takes Newton steps, ie sk,a n = sn N,k and st k,a = st N,k Proof We first note that by the assumptions on µ and η the condition (413) can be satisfied Since x k x by (A8) and c( x) = 0, ĝ( x) = 0, we find K > 0 with c k 1, ĝ k 1, x k x < min{δ N,θ} and x k U N for all k K, where θ is as in Lemma 38, (iii) In particular, (411) holds for all k K Hence, we can increase K such that s n N,k, st N,k min for all k K Since the steps s n N,k and st N,k satisfy the decrease conditions (23) and (25), respectively, part (iii) of Lemma 38 yields by the mechanism of updating k that for k K the Newton step s k = s N,k is accepted whenever δ k s n N,k, st N,k γ 1δ k, where (415) { def = κ 1 min max{min{a jk, ĝ k }, c k } 2/3, max{ ĝ k, c k } µ} (416) In fact, iteration level k is entered with k min and in each subiteration k is reduced by at most the factor γ 1 From c k 0 and (411) we obtain s n N,k K 1 c k γ 1 κ 1 c k µ γ 1 δ k for all k K after a possible increase of K Thus, by (A8) the quasi-normal step satisfies sk,a n = sn N,k for all k K Now we consider the step s t N,k for k K If aτ j k > ĝ k then, using ĝ k, c k 1, max{min{a jk, ĝ k }, c k } max{ ĝ k 1/τ, c k } max{ ĝ k, c k } 1/τ Similarly, if c k τ > ĝ k, we obtain max{min{a jk, ĝ k }, c k } c k max{ ĝ k 1/τ, c k } max{ ĝ k, c k } 1/τ In both situations we conclude that, since µ < 2 3τ, for k K, K sufficiently large, holds γ 1 δ k γ 1κ 1 max{ ĝ k, c k } 2 3τ > K13 ( ĝ k + c k ) s t N,k, where we have used 3τ 2 < 1 and (411) Therefore, we have proved: If K is sufficiently large and for k K holds (414), then s k,a = s N,k (417) In the case j k, k, the sequence a jk is bounded away from zero and we see that (414) holds for all k K if K is chosen sufficiently large Therefore, (417) completes the proof in this case Now consider the case j k as k Then for K so large that j K 1, Lemma 35 yields c k 1 λα0 a jk def = K 14 a jk for all k K (418) In the case a τ j k > ĝ k we get with (418), using a j 0, ĝ k + c k < a τ j + K k 14 a jk 2a τ j 2 k α0 τ a τ j k +1 for k K, K large enough In the case c k τ > ĝ k we have ĝ k + c k < c k τ + c k 2 c k τ 2K14 τ aτ j 2K 14 τ k α0 τ a τ j k +1


An Inexact Newton Method for Optimization New York University Brown Applied Mathematics Seminar, February 10, 2009 Brief biography New York State College of William and Mary (B.S.) Northwestern University (M.S. & Ph.D.) Courant Institute (Postdoc)

More information

Inexact Newton Methods and Nonlinear Constrained Optimization

Inexact Newton Methods and Nonlinear Constrained Optimization Inexact Newton Methods and Nonlinear Constrained Optimization Frank E. Curtis EPSRC Symposium Capstone Conference Warwick Mathematics Institute July 2, 2009 Outline PDE-Constrained Optimization Newton

More information

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method Optimization Methods and Software Vol. 00, No. 00, Month 200x, 1 11 On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method ROMAN A. POLYAK Department of SEOR and Mathematical

More information

Global convergence of trust-region algorithms for constrained minimization without derivatives

Global convergence of trust-region algorithms for constrained minimization without derivatives Global convergence of trust-region algorithms for constrained minimization without derivatives P.D. Conejo E.W. Karas A.A. Ribeiro L.G. Pedroso M. Sachine September 27, 2012 Abstract In this work we propose

More information

Sequential Quadratic Programming Methods

Sequential Quadratic Programming Methods Sequential Quadratic Programming Methods Klaus Schittkowski Ya-xiang Yuan June 30, 2010 Abstract We present a brief review on one of the most powerful methods for solving smooth constrained nonlinear optimization

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

Some new facts about sequential quadratic programming methods employing second derivatives

Some new facts about sequential quadratic programming methods employing second derivatives To appear in Optimization Methods and Software Vol. 00, No. 00, Month 20XX, 1 24 Some new facts about sequential quadratic programming methods employing second derivatives A.F. Izmailov a and M.V. Solodov

More information

A SHIFTED PRIMAL-DUAL INTERIOR METHOD FOR NONLINEAR OPTIMIZATION

A SHIFTED PRIMAL-DUAL INTERIOR METHOD FOR NONLINEAR OPTIMIZATION A SHIFTED RIMAL-DUAL INTERIOR METHOD FOR NONLINEAR OTIMIZATION hilip E. Gill Vyacheslav Kungurtsev Daniel. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-18-1 February 1, 2018

More information

A globally and quadratically convergent primal dual augmented Lagrangian algorithm for equality constrained optimization

A globally and quadratically convergent primal dual augmented Lagrangian algorithm for equality constrained optimization Optimization Methods and Software ISSN: 1055-6788 (Print) 1029-4937 (Online) Journal homepage: http://www.tandfonline.com/loi/goms20 A globally and quadratically convergent primal dual augmented Lagrangian

More information

A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm

A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm Journal name manuscript No. (will be inserted by the editor) A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm Rene Kuhlmann Christof Büsens Received: date / Accepted:

More information

SF2822 Applied Nonlinear Optimization. Preparatory question. Lecture 9: Sequential quadratic programming. Anders Forsgren

SF2822 Applied Nonlinear Optimization. Preparatory question. Lecture 9: Sequential quadratic programming. Anders Forsgren SF2822 Applied Nonlinear Optimization Lecture 9: Sequential quadratic programming Anders Forsgren SF2822 Applied Nonlinear Optimization, KTH / 24 Lecture 9, 207/208 Preparatory question. Try to solve theory

More information

Suppose that the approximate solutions of Eq. (1) satisfy the condition (3). Then (1) if η = 0 in the algorithm Trust Region, then lim inf.

Suppose that the approximate solutions of Eq. (1) satisfy the condition (3). Then (1) if η = 0 in the algorithm Trust Region, then lim inf. Maria Cameron 1. Trust Region Methods At every iteration the trust region methods generate a model m k (p), choose a trust region, and solve the constraint optimization problem of finding the minimum of

More information

CONSTRAINED NONLINEAR PROGRAMMING

CONSTRAINED NONLINEAR PROGRAMMING 149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach

More information

Part 5: Penalty and augmented Lagrangian methods for equality constrained optimization. Nick Gould (RAL)

Part 5: Penalty and augmented Lagrangian methods for equality constrained optimization. Nick Gould (RAL) Part 5: Penalty and augmented Lagrangian methods for equality constrained optimization Nick Gould (RAL) x IR n f(x) subject to c(x) = Part C course on continuoue optimization CONSTRAINED MINIMIZATION x

More information

ON AUGMENTED LAGRANGIAN METHODS WITH GENERAL LOWER-LEVEL CONSTRAINTS. 1. Introduction. Many practical optimization problems have the form (1.

ON AUGMENTED LAGRANGIAN METHODS WITH GENERAL LOWER-LEVEL CONSTRAINTS. 1. Introduction. Many practical optimization problems have the form (1. ON AUGMENTED LAGRANGIAN METHODS WITH GENERAL LOWER-LEVEL CONSTRAINTS R. ANDREANI, E. G. BIRGIN, J. M. MARTíNEZ, AND M. L. SCHUVERDT Abstract. Augmented Lagrangian methods with general lower-level constraints

More information

Constrained Optimization

Constrained Optimization 1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

More information

1. Introduction. In this paper we discuss an algorithm for equality constrained optimization problems of the form. f(x) s.t.

1. Introduction. In this paper we discuss an algorithm for equality constrained optimization problems of the form. f(x) s.t. AN INEXACT SQP METHOD FOR EQUALITY CONSTRAINED OPTIMIZATION RICHARD H. BYRD, FRANK E. CURTIS, AND JORGE NOCEDAL Abstract. We present an algorithm for large-scale equality constrained optimization. The

More information

Higher-Order Methods

Higher-Order Methods Higher-Order Methods Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. PCMI, July 2016 Stephen Wright (UW-Madison) Higher-Order Methods PCMI, July 2016 1 / 25 Smooth

More information

2.3 Linear Programming

2.3 Linear Programming 2.3 Linear Programming Linear Programming (LP) is the term used to define a wide range of optimization problems in which the objective function is linear in the unknown variables and the constraints are

More information

An Inexact Newton Method for Nonconvex Equality Constrained Optimization

An Inexact Newton Method for Nonconvex Equality Constrained Optimization Noname manuscript No. (will be inserted by the editor) An Inexact Newton Method for Nonconvex Equality Constrained Optimization Richard H. Byrd Frank E. Curtis Jorge Nocedal Received: / Accepted: Abstract

More information

Nonlinear Programming

Nonlinear Programming Nonlinear Programming Kees Roos e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos LNMB Course De Uithof, Utrecht February 6 - May 8, A.D. 2006 Optimization Group 1 Outline for week

More information

PDE-Constrained and Nonsmooth Optimization

PDE-Constrained and Nonsmooth Optimization Frank E. Curtis October 1, 2009 Outline PDE-Constrained Optimization Introduction Newton s method Inexactness Results Summary and future work Nonsmooth Optimization Sequential quadratic programming (SQP)

More information

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30,

More information

A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30, 2014 Abstract Regularized

More information

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

More information

Feasible Interior Methods Using Slacks for Nonlinear Optimization

Feasible Interior Methods Using Slacks for Nonlinear Optimization Feasible Interior Methods Using Slacks for Nonlinear Optimization Richard H. Byrd Jorge Nocedal Richard A. Waltz February 28, 2005 Abstract A slack-based feasible interior point method is described which

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

POWER SYSTEMS in general are currently operating

POWER SYSTEMS in general are currently operating TO APPEAR IN IEEE TRANSACTIONS ON POWER SYSTEMS 1 Robust Optimal Power Flow Solution Using Trust Region and Interior-Point Methods Andréa A. Sousa, Geraldo L. Torres, Member IEEE, Claudio A. Cañizares,

More information

A PRIMAL-DUAL TRUST REGION ALGORITHM FOR NONLINEAR OPTIMIZATION

A PRIMAL-DUAL TRUST REGION ALGORITHM FOR NONLINEAR OPTIMIZATION Optimization Technical Report 02-09, October 2002, UW-Madison Computer Sciences Department. E. Michael Gertz 1 Philip E. Gill 2 A PRIMAL-DUAL TRUST REGION ALGORITHM FOR NONLINEAR OPTIMIZATION 7 October

More information

Numerical Methods for PDE-Constrained Optimization

Numerical Methods for PDE-Constrained Optimization Numerical Methods for PDE-Constrained Optimization Richard H. Byrd 1 Frank E. Curtis 2 Jorge Nocedal 2 1 University of Colorado at Boulder 2 Northwestern University Courant Institute of Mathematical Sciences,

More information

1. Introduction. We analyze a trust region version of Newton s method for the optimization problem

1. Introduction. We analyze a trust region version of Newton s method for the optimization problem SIAM J. OPTIM. Vol. 9, No. 4, pp. 1100 1127 c 1999 Society for Industrial and Applied Mathematics NEWTON S METHOD FOR LARGE BOUND-CONSTRAINED OPTIMIZATION PROBLEMS CHIH-JEN LIN AND JORGE J. MORÉ To John

More information

E5295/5B5749 Convex optimization with engineering applications. Lecture 8. Smooth convex unconstrained and equality-constrained minimization

E5295/5B5749 Convex optimization with engineering applications. Lecture 8. Smooth convex unconstrained and equality-constrained minimization E5295/5B5749 Convex optimization with engineering applications Lecture 8 Smooth convex unconstrained and equality-constrained minimization A. Forsgren, KTH 1 Lecture 8 Convex optimization 2006/2007 Unconstrained

More information

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints Instructor: Prof. Kevin Ross Scribe: Nitish John October 18, 2011 1 The Basic Goal The main idea is to transform a given constrained

More information

Large-Scale Nonlinear Optimization with Inexact Step Computations

Large-Scale Nonlinear Optimization with Inexact Step Computations Large-Scale Nonlinear Optimization with Inexact Step Computations Andreas Wächter IBM T.J. Watson Research Center Yorktown Heights, New York andreasw@us.ibm.com IPAM Workshop on Numerical Methods for Continuous

More information

1 Computing with constraints

1 Computing with constraints Notes for 2017-04-26 1 Computing with constraints Recall that our basic problem is minimize φ(x) s.t. x Ω where the feasible set Ω is defined by equality and inequality conditions Ω = {x R n : c i (x)

More information

This manuscript is for review purposes only.

This manuscript is for review purposes only. 1 2 3 4 5 6 7 8 9 10 11 12 THE USE OF QUADRATIC REGULARIZATION WITH A CUBIC DESCENT CONDITION FOR UNCONSTRAINED OPTIMIZATION E. G. BIRGIN AND J. M. MARTíNEZ Abstract. Cubic-regularization and trust-region

More information

A GLOBALLY CONVERGENT STABILIZED SQP METHOD

A GLOBALLY CONVERGENT STABILIZED SQP METHOD A GLOBALLY CONVERGENT STABILIZED SQP METHOD Philip E. Gill Daniel P. Robinson July 6, 2013 Abstract Sequential quadratic programming SQP methods are a popular class of methods for nonlinearly constrained

More information

Optimization and Root Finding. Kurt Hornik

Optimization and Root Finding. Kurt Hornik Optimization and Root Finding Kurt Hornik Basics Root finding and unconstrained smooth optimization are closely related: Solving ƒ () = 0 can be accomplished via minimizing ƒ () 2 Slide 2 Basics Root finding

More information

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory Preprint ANL/MCS-P1015-1202, Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory A GLOBALLY CONVERGENT LINEARLY CONSTRAINED LAGRANGIAN METHOD FOR

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

On the use of piecewise linear models in nonlinear programming

On the use of piecewise linear models in nonlinear programming Math. Program., Ser. A (2013) 137:289 324 DOI 10.1007/s10107-011-0492-9 FULL LENGTH PAPER On the use of piecewise linear models in nonlinear programming Richard H. Byrd Jorge Nocedal Richard A. Waltz Yuchen

More information

Interior-Point Methods for Linear Optimization

Interior-Point Methods for Linear Optimization Interior-Point Methods for Linear Optimization Robert M. Freund and Jorge Vera March, 204 c 204 Robert M. Freund and Jorge Vera. All rights reserved. Linear Optimization with a Logarithmic Barrier Function

More information

IBM Research Report. Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence

IBM Research Report. Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence RC23036 (W0304-181) April 21, 2003 Computer Science IBM Research Report Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence Andreas Wächter, Lorenz T. Biegler IBM Research

More information

A Trust-region-based Sequential Quadratic Programming Algorithm

A Trust-region-based Sequential Quadratic Programming Algorithm Downloaded from orbit.dtu.dk on: Oct 19, 2018 A Trust-region-based Sequential Quadratic Programming Algorithm Henriksen, Lars Christian; Poulsen, Niels Kjølstad Publication date: 2010 Document Version

More information

Implementation of an Interior Point Multidimensional Filter Line Search Method for Constrained Optimization

Implementation of an Interior Point Multidimensional Filter Line Search Method for Constrained Optimization Proceedings of the 5th WSEAS Int. Conf. on System Science and Simulation in Engineering, Tenerife, Canary Islands, Spain, December 16-18, 2006 391 Implementation of an Interior Point Multidimensional Filter

More information

Numerical Optimization of Partial Differential Equations

Numerical Optimization of Partial Differential Equations Numerical Optimization of Partial Differential Equations Part I: basic optimization concepts in R n Bartosz Protas Department of Mathematics & Statistics McMaster University, Hamilton, Ontario, Canada

More information

A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints

A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints Journal of Computational and Applied Mathematics 161 (003) 1 5 www.elsevier.com/locate/cam A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality

More information

Lecture 13: Constrained optimization

Lecture 13: Constrained optimization 2010-12-03 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems

More information

On nonlinear optimization since M.J.D. Powell

On nonlinear optimization since M.J.D. Powell On nonlinear optimization since 1959 1 M.J.D. Powell Abstract: This view of the development of algorithms for nonlinear optimization is based on the research that has been of particular interest to the

More information

Survey of NLP Algorithms. L. T. Biegler Chemical Engineering Department Carnegie Mellon University Pittsburgh, PA

Survey of NLP Algorithms. L. T. Biegler Chemical Engineering Department Carnegie Mellon University Pittsburgh, PA Survey of NLP Algorithms L. T. Biegler Chemical Engineering Department Carnegie Mellon University Pittsburgh, PA NLP Algorithms - Outline Problem and Goals KKT Conditions and Variable Classification Handling

More information

Cubic regularization of Newton s method for convex problems with constraints

Cubic regularization of Newton s method for convex problems with constraints CORE DISCUSSION PAPER 006/39 Cubic regularization of Newton s method for convex problems with constraints Yu. Nesterov March 31, 006 Abstract In this paper we derive efficiency estimates of the regularized

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 6 Optimization Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 6 Optimization Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted

More information

Numerical optimization

Numerical optimization Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal

More information

Generalization to inequality constrained problem. Maximize

Generalization to inequality constrained problem. Maximize Lecture 11. 26 September 2006 Review of Lecture #10: Second order optimality conditions necessary condition, sufficient condition. If the necessary condition is violated the point cannot be a local minimum

More information

Evaluation complexity for nonlinear constrained optimization using unscaled KKT conditions and high-order models by E. G. Birgin, J. L. Gardenghi, J. M. Martínez, S. A. Santos and Ph. L. Toint Report NAXYS-08-2015

More information

Steering Exact Penalty Methods for Nonlinear Programming

Steering Exact Penalty Methods for Nonlinear Programming Steering Exact Penalty Methods for Nonlinear Programming Richard H. Byrd Jorge Nocedal Richard A. Waltz April 10, 2007 Technical Report Optimization Technology Center Northwestern University Evanston,

More information

On Lagrange multipliers of trust-region subproblems

On Lagrange multipliers of trust-region subproblems On Lagrange multipliers of trust-region subproblems Ladislav Lukšan, Ctirad Matonoha, Jan Vlček Institute of Computer Science AS CR, Prague Programy a algoritmy numerické matematiky 14 1.- 6. června 2008

More information

A globally convergent Levenberg Marquardt method for equality-constrained optimization

A globally convergent Levenberg Marquardt method for equality-constrained optimization Computational Optimization and Applications manuscript No. (will be inserted by the editor) A globally convergent Levenberg Marquardt method for equality-constrained optimization A. F. Izmailov M. V. Solodov

More information

Key words. minimization, nonlinear optimization, large-scale optimization, constrained optimization, trust region methods, quasi-newton methods

Key words. minimization, nonlinear optimization, large-scale optimization, constrained optimization, trust region methods, quasi-newton methods SIAM J. OPTIM. c 1998 Society for Industrial and Applied Mathematics Vol. 8, No. 3, pp. 682 706, August 1998 004 ON THE IMPLEMENTATION OF AN ALGORITHM FOR LARGE-SCALE EQUALITY CONSTRAINED OPTIMIZATION

More information

Chapter 2. Optimization. Gradients, convexity, and ALS

Chapter 2. Optimization. Gradients, convexity, and ALS Chapter 2 Optimization Gradients, convexity, and ALS Contents Background Gradient descent Stochastic gradient descent Newton s method Alternating least squares KKT conditions 2 Motivation We can solve

More information

Computational Optimization. Augmented Lagrangian NW 17.3

Computational Optimization. Augmented Lagrangian NW 17.3 Computational Optimization Augmented Lagrangian NW 17.3 Upcoming Schedule No class April 18 Friday, April 25, in class presentations. Projects due unless you present April 25 (free extension until Monday

More information

A SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM THAT COMBINES MERIT FUNCTION AND FILTER IDEAS

A SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM THAT COMBINES MERIT FUNCTION AND FILTER IDEAS A SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM THAT COMBINES MERIT FUNCTION AND FILTER IDEAS FRANCISCO A. M. GOMES Abstract. A sequential quadratic programming algorithm for solving nonlinear programming

More information

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems 1 Numerical optimization Alexander & Michael Bronstein, 2006-2009 Michael Bronstein, 2010 tosca.cs.technion.ac.il/book Numerical optimization 048921 Advanced topics in vision Processing and Analysis of

More information

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL)

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL) Part 4: Active-set methods for linearly constrained optimization Nick Gould RAL fx subject to Ax b Part C course on continuoue optimization LINEARLY CONSTRAINED MINIMIZATION fx subject to Ax { } b where

More information

A Trust-Funnel Algorithm for Nonlinear Programming

A Trust-Funnel Algorithm for Nonlinear Programming for Nonlinear Programming Daniel P. Robinson Johns Hopins University Department of Applied Mathematics and Statistics Collaborators: Fran E. Curtis (Lehigh University) Nic I. M. Gould (Rutherford Appleton

More information

Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms: A Comparative Study

Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms: A Comparative Study International Journal of Mathematics And Its Applications Vol.2 No.4 (2014), pp.47-56. ISSN: 2347-1557(online) Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms:

More information

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information

4TE3/6TE3. Algorithms for. Continuous Optimization

4TE3/6TE3. Algorithms for. Continuous Optimization 4TE3/6TE3 Algorithms for Continuous Optimization (Algorithms for Constrained Nonlinear Optimization Problems) Tamás TERLAKY Computing and Software McMaster University Hamilton, November 2005 terlaky@mcmaster.ca

More information

5.6 Penalty method and augmented Lagrangian method

5.6 Penalty method and augmented Lagrangian method 5.6 Penalty method and augmented Lagrangian method Consider a generic NLP problem min f (x) s.t. c i (x) 0 i I c i (x) = 0 i E (1) x R n where f and the c i s are of class C 1 or C 2, and I and E are the

More information

MODIFYING SQP FOR DEGENERATE PROBLEMS

MODIFYING SQP FOR DEGENERATE PROBLEMS PREPRINT ANL/MCS-P699-1097, OCTOBER, 1997, (REVISED JUNE, 2000; MARCH, 2002), MATHEMATICS AND COMPUTER SCIENCE DIVISION, ARGONNE NATIONAL LABORATORY MODIFYING SQP FOR DEGENERATE PROBLEMS STEPHEN J. WRIGHT

More information

A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications

A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications Weijun Zhou 28 October 20 Abstract A hybrid HS and PRP type conjugate gradient method for smooth

More information

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen Numerisches Rechnen (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2011/12 IGPM, RWTH Aachen Numerisches Rechnen

More information

LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION

LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION ROMAN A. POLYAK Abstract. We introduce the Lagrangian Transformation(LT) and develop a general LT method for convex optimization problems. A class Ψ of

More information

Recent Adaptive Methods for Nonlinear Optimization

Recent Adaptive Methods for Nonlinear Optimization Recent Adaptive Methods for Nonlinear Optimization Frank E. Curtis, Lehigh University involving joint work with James V. Burke (U. of Washington), Richard H. Byrd (U. of Colorado), Nicholas I. M. Gould

More information

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Robert M. Freund March, 2004 2004 Massachusetts Institute of Technology. The Problem The logarithmic barrier approach

More information