A perturbed damped Newton method for large scale constrained optimization


Quaderni del Dipartimento di Matematica, Università di Modena e Reggio Emilia, n. 46, July 2002.

A perturbed damped Newton method for large scale constrained optimization

Emanuele Galligani
Dipartimento di Matematica Pura ed Applicata "G. Vitali"
Università degli Studi di Modena e Reggio Emilia
Via Campi 213/B, 41100 Modena, Italy
e-mail: galligani@unimo.it

Abstract. In this work we analyze the Newton interior point method presented in [El-Bakry, Tapia et al., J. Optim. Theory Appl., 89, 1996] for solving constrained systems of nonlinear equations arising from the Karush-Kuhn-Tucker conditions for nonlinear programming problems (KKT systems). More specifically, we consider a variant of the Newton interior point method for KKT systems in which the possibility to adaptively modify the perturbation parameter and the accuracy of the solution of the perturbed Newton equation, when one is still far away from the solution (at an early stage), moderates the difficulty of solving the KKT system. Using the results in [Durazzi, J. Optim. Theory Appl., 104, 2000] and [Bellavia, J. Optim. Theory Appl., 96, 1998], it is possible to establish a global convergence theory for this modified method. The method proposed in this work can be used effectively for large scale optimization problems with structured sparsity, which occur, for instance, in the discretization of optimal control problems with inequality constraints.

Key Words: KKT systems, interior point method, inexact Newton method.

1 Introduction

Consider the following constrained system of nonlinear equations

    H(w) ≡ ( G(x, λ_E, λ_I, s) ; S Λ_I e ) = 0,    s ≥ 0;  λ_I ≥ 0

where x ∈ R^n, λ_E ∈ R^{n_E}, s, λ_I ∈ R^{n_I}, G : R^{n+n_E+2n_I} → R^{n+n_E+n_I}, S = diag{s_1, ..., s_{n_I}}, Λ_I = diag{λ_{I,1}, ..., λ_{I,n_I}}, s = (s_1, ..., s_{n_I})^T, λ_I = (λ_{I,1}, ..., λ_{I,n_I})^T and e ∈ R^{n_I} is the vector consisting of ones. Systems of this type are related, for example, to the Karush-Kuhn-Tucker conditions for nonlinear programming problems; in this case, they are often known as KKT systems. In order to solve these systems with Newton-type methods it is necessary to introduce a perturbation parameter in the complementarity equation S Λ_I e = 0.

The authors in [7] have developed an interior point method for solving this perturbed KKT system. In this paper we analyze such a method in the framework of inexact Newton methods. This gives the possibility to revise the method by introducing an adaptive technique for changing the perturbation parameter and an inner linear solver for determining an approximate solution of the perturbed Newton equation. It makes the method more robust and highly effective for large scale optimization problems, such as those that occur in data fitting applications and in the discretization of optimal control problems governed by partial differential equations.

The constraints on the vectors s and λ_I require us to modify the basic inexact Newton method in such a way that the iterates lie on an appropriate trajectory which keeps them from coming too close to the boundary of the nonnegative orthant (feasible region) s ≥ 0, λ_I ≥ 0. For this modification we use the technique proposed in [7]. It is possible to prove the convergence of this damped inexact Newton method under standard assumptions on KKT systems. This convergence analysis has recourse to the fundamental convergence theorem of the basic inexact Newton method.

2 Statement of the problem

We consider the following nonlinear constrained optimization problem

    min_x f(x)
    s.t.  g_E(x) = 0                                   (1)
          g_I(x) ≥ 0

where f : R^n → R, g_E : R^n → R^{n_E} and g_I : R^n → R^{n_I} are smooth functions; n_E < n. We are interested in the case when (1) is not a convex problem and when the number of variables n is large. We assume that first and second derivatives of the objective function and of the constraints are available.

Introducing the slack variables s = (s_1, ..., s_{n_I})^T, problem (1) is equivalent to

    min_x f(x)
    s.t.  g_E(x) = 0                                   (2)
          g_I(x) − s = 0
          s ≥ 0

The Karush-Kuhn-Tucker (KKT) conditions for problem (2) are

    ∇f(x) − Σ_{i=1}^{n_E} ∇g_{E,i}(x) λ_{E,i} − Σ_{i=1}^{n_I} ∇g_{I,i}(x) λ_{I,i} = 0
    g_E(x) = 0
    g_I(x) − s = 0
    λ_I^T s = 0;  λ_I ≥ 0;  s ≥ 0

or

    ∇f(x) − ∇g_E(x) λ_E − ∇g_I(x) λ_I = 0
    g_E(x) = 0
    g_I(x) − s = 0                                     (3)
    S Λ_I e = 0;  λ_I ≥ 0;  s ≥ 0

Here ∇f(x) indicates the gradient of f(x) and the n × n_E and n × n_I matrices ∇g_E(x) and ∇g_I(x) indicate the transposes of the Jacobian matrices of g_E(x) and g_I(x) respectively.

If we set

    w = ( x ; λ_E ; λ_I ; s ),    H(w) = ( G(w) ; S Λ_I e ) = ( ∇f(x) − ∇g_E(x)λ_E − ∇g_I(x)λ_I ; g_E(x) ; g_I(x) − s ; S Λ_I e )

the KKT conditions (3) become

    H(w) = 0,    s ≥ 0;  λ_I ≥ 0                       (4)

The equation S Λ_I e = 0 is the complementarity equation of the KKT system (4). The Lagrangian function associated with problem (2) is

    L(x, λ_E, λ_I) = f(x) − λ_E^T g_E(x) − λ_I^T g_I(x)

The Hessian matrix associated with problem (2) is

    ∇²L ≡ ∇²L(x, λ_E, λ_I) = F(x) − Σ_{i=1}^{n_E} G_{E,i}(x) λ_{E,i} − Σ_{i=1}^{n_I} G_{I,i}(x) λ_{I,i}

where F(x), G_{E,i}(x) (i = 1, ..., n_E) and G_{I,i}(x) (i = 1, ..., n_I) denote the Hessian matrices of f(x), g_{E,i}(x) (i = 1, ..., n_E) and g_{I,i}(x) (i = 1, ..., n_I), respectively.

If x is feasible for problem (2), then we denote by B(x) the set of indices of the active (i.e. binding) inequality constraints at x:

    B(x) = { i : g_{I,i}(x) = 0, i = 1, ..., n_I }

The Jacobian matrix of H(w) has the form

    H'(w) = [ ∇²L          −∇g_E(x)   −∇g_I(x)    0
              ∇g_E(x)^T     0          0          0
              ∇g_I(x)^T     0          0         −I
              0             0          S          Λ_I ]        (5)

The square matrix H'(w) has order n + n_E + 2n_I.

The standard assumptions for the problem (2) are the following [10, 10.8]:

1. Existence. There exists (x*^T, λ_E*^T, λ_I*^T)^T, solution to problem (2) with associated multipliers, satisfying the KKT conditions (3).

2. Smoothness. The Hessian matrices F(x), G_{E,i}(x) (i = 1, ..., n_E) and G_{I,i}(x) (i = 1, ..., n_I) exist and are locally Lipschitz continuous at x*.

3. Regularity. The set { ∇g_{E,1}(x*), ..., ∇g_{E,n_E}(x*) } ∪ { ∇g_{I,i}(x*) ; i ∈ B(x*) } is constituted by linearly independent vectors.

4. Second Order Sufficiency. For all y ≠ 0 satisfying ∇g_{E,i}(x*)^T y = 0, i = 1, ..., n_E, and ∇g_{I,i}(x*)^T y = 0, i ∈ B(x*), we have y^T ∇²L(x*, λ_E*, λ_I*) y > 0.

5. Strict Complementarity. For all i = 1, ..., n_I, we have λ_{I,i}* + g_{I,i}(x*) > 0.

We have the following basic result [7].

Theorem 1. Let the above assumptions hold. Let s* = g_I(x*). Then, the Jacobian matrix H'(x*, λ_E*, λ_I*, s*) of H(w) in (4) is nonsingular.

Proof. For simplicity of notation, we will assume that B(x*) = {1, ..., q}, q < n_I. Now we write

    ∇g̃_E(x*) = [ ∇g_{E,1}(x*), ..., ∇g_{E,n_E}(x*), ∇g_{I,1}(x*), ..., ∇g_{I,q}(x*) ]
    ∇g̃_I(x*) = [ ∇g_{I,q+1}(x*), ..., ∇g_{I,n_I}(x*) ]

Thus, the matrix H'(w*) can be written in the form

    H'(w*) = [ ∇²L(x*, λ̃_E*, λ̃_I*)   −∇g̃_E(x*)   −∇g̃_I(x*)    0
               ∇g̃_E(x*)^T              0            0           0
               ∇g̃_I(x*)^T              0            0          −I
               0                        0            S̃*          Λ̃_I* ]

where

    λ̃_E* = (λ_{E,1}*, ..., λ_{E,n_E}*, λ_{I,1}*, ..., λ_{I,q}*)^T,    λ̃_I* = (λ_{I,q+1}*, ..., λ_{I,n_I}*)^T
    Λ̃_I* = diag{λ_{I,q+1}*, ..., λ_{I,n_I}*},    S̃* = diag{s_{q+1}*, ..., s_{n_I}*}

The square matrix H'(w*) has order n + (n_E + q) + (n_I − q) + (n_I − q), because in the equations (3) the variables s_1*, ..., s_q* are zero; these variables have therefore been dropped from the system (4). The square matrix of order n + n_E + q

    H̃'(w*) = [ ∇²L(x*, λ̃_E*, λ̃_I*)   −∇g̃_E(x*)
               ∇g̃_E(x*)^T              0        ]

is the matrix associated with the equality constrained optimization problem

    min_x f(x)
    s.t.  g_E(x) = 0
          g_{I,i}(x) = 0,   i ∈ B(x*)

We observe that the regularity condition 3 is also the regularity condition for this problem and that the second order sufficiency condition 4 is also the second order sufficiency condition for this problem. Hence, from the theory of equality constrained optimization, we see that assumptions 3 and 4 are equivalent to the nonsingularity of H̃'(w*). Rearranging the order of rows and columns of H'(w*), we have

the partition

    Q = [ Q_1   Q_3
          Q_4   Q_2 ]

where, omitting the argument w* and the asterisks for brevity, and recalling that Λ̃_I* = 0 (the multipliers of the inactive constraints vanish at the solution),

    Q_1 = [ 0    S̃            Q_3 = [ 0   0             Q_4 = [ 0    0             Q_2 = [ 0       ∇g̃_E^T
            −I   0 ] ,                0   ∇g̃_I^T ] ,            0   −∇g̃_I ] ,              −∇g̃_E   ∇²L ]

We have

    Q_1^{-1} = [ 0       −I
                 S̃^{-1}   0 ]    and    Q_4 Q_1^{-1} Q_3 = 0,    so that    Q_0 = (Q_2 − Q_4 Q_1^{-1} Q_3)^{-1} = Q_2^{-1}

By the assumption on strict complementarity (assumption 5), the diagonal elements of S̃* are positive. Thus, the matrix Q_1 is nonsingular. Rearranging the order of rows and columns of H̃'(w*) we obtain Q_2; therefore, Q_2 is nonsingular. Then, by the lemma on the inversion of a matrix by partitioning, the matrix Q, and hence H'(w*), is nonsingular. This completes the proof.

We recall here the lemma on the inversion of a matrix by partitioning [8]. Let Q_1 be an s × s nonsingular matrix and let Q_2, Q_3, Q_4 be r × r, s × r, r × s matrices, respectively. If Q_2 − Q_4 Q_1^{-1} Q_3 is a nonsingular matrix, then the matrix

    Q = [ Q_1   Q_3
          Q_4   Q_2 ]

is also a nonsingular matrix and its inverse is given by

    Q^{-1} = [ Q_1^{-1} + Q_1^{-1} Q_3 Q_0 Q_4 Q_1^{-1}    −Q_1^{-1} Q_3 Q_0
               −Q_0 Q_4 Q_1^{-1}                            Q_0            ]

where Q_0 = (Q_2 − Q_4 Q_1^{-1} Q_3)^{-1}. The proof is obtained by multiplying Q by Q^{-1} and verifying that the product yields the identity matrix.

This result motivates the use of the Newton method for solving the nonlinear system (4) with an initial guess w^0, satisfying λ_I^0 > 0, s^0 > 0, in a neighborhood of w*. At each stage k = 0, 1, ..., of Newton's method, the Newton equation

    H'(w^k) Δw^k = −H(w^k)                              (6)

has to be solved. Here, Δw^k = (Δx^{kT}, Δλ_E^{kT}, Δλ_I^{kT}, Δs^{kT})^T.
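To make the objects in (4)-(6) concrete, the following minimal Python sketch (not part of the original report) assembles H(w) and H'(w) and computes one undamped Newton step for a tiny made-up test problem, min x1²+x2² subject to x1+x2−1 = 0 and x1 ≥ 0; the data and the starting point are hypothetical and chosen only for illustration.

```python
import numpy as np

def kkt_residual_and_jacobian(x, lam_E, lam_I, s):
    """H(w) of (4) and H'(w) of (5) for the toy problem
       min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0 (g_E),  x1 >= 0 (g_I)."""
    n, nE, nI = 2, 1, 1
    grad_f = 2.0 * x                                    # gradient of f
    gE, JE = np.array([x[0] + x[1] - 1.0]), np.array([[1.0, 1.0]])
    gI, JI = np.array([x[0]]), np.array([[1.0, 0.0]])
    hessL = 2.0 * np.eye(n)                             # Hessian of the Lagrangian (linear constraints)
    H = np.concatenate([grad_f - JE.T @ lam_E - JI.T @ lam_I,
                        gE, gI - s, s * lam_I])
    J = np.zeros((n + nE + 2 * nI,) * 2)
    J[:n, :n] = hessL
    J[:n, n:n + nE] = -JE.T
    J[:n, n + nE:n + nE + nI] = -JI.T
    J[n:n + nE, :n] = JE
    J[n + nE:n + nE + nI, :n] = JI
    J[n + nE:n + nE + nI, n + nE + nI:] = -np.eye(nI)
    J[n + nE + nI:, n + nE:n + nE + nI] = np.diag(s)    # S block
    J[n + nE + nI:, n + nE + nI:] = np.diag(lam_I)      # Lambda_I block
    return H, J

# one undamped Newton step, equation (6), from an interior starting point
x, lam_E, lam_I, s = np.array([0.6, 0.6]), np.array([1.0]), np.array([0.5]), np.array([0.6])
H, J = kkt_residual_and_jacobian(x, lam_E, lam_I, s)
dw = np.linalg.solve(J, -H)
print("||H(w)|| =", np.linalg.norm(H), "  Newton step:", dw)
```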

The new iterate is w^{k+1} = w^k + Δw^k, where Δw^k is the solution of the linear system (6). We notice that the fourth block equation of (6) is

    S_k Δλ_I^k + Λ_I^k Δs^k = −S_k Λ_I^k e

It implies that, if s_i^k̂ = 0 or λ_{I,i}^k̂ = 0 at an iteration k̂, then s_i^k = 0 or λ_{I,i}^k = 0 for k > k̂. This means that when the iterates reach the boundary of the nonnegative orthant s ≥ 0, λ_I ≥ 0, they are forced to stick to the boundary. Hence, the Newton step Δw^k does not allow us to make much progress toward the solution w*. An obvious correction is to modify the Newton formulation so that zero variables can become nonzero in a subsequent iteration. This can be accomplished by replacing the complementarity equation S Λ_I e = 0 in (4) with the perturbed complementarity equation

    S Λ_I e = ρ e,    ρ > 0                             (7)

This is exactly the introduction of the notion of adherence to the central path in interior point methods for linear and convex programming [11]. If we set ẽ = (0^T, 0^T, 0^T, e^T)^T, the perturbed KKT system has the form

    H(w) = ρ ẽ                                          (8)
    s ≥ 0;  λ_I ≥ 0                                     (9)

and can be considered as a perturbation of (4), where the perturbation appears only in the complementarity equation S Λ_I e = 0. At each stage k = 0, 1, ..., of the Newton method, we now have to solve the perturbed Newton equation

    H'(w^k) Δw^k = −H(w^k) + ρ_k ẽ,    ρ_k ↘ 0          (10)

The perturbation of the complementarity equation forces the iterates to stay sufficiently far from the boundary of the nonnegative orthant s ≥ 0, λ_I ≥ 0. The bound constraints (9) play a crucial role. We are interested in an iterative method that requires all iterates w^k to satisfy the constraints (9) strictly (i.e. s^k > 0, λ_I^k > 0 for all k = 0, 1, ...) and the KKT conditions (4) only in the limit. This iterative method consists in solving the perturbed Newton equation (10) and in decreasing the perturbation parameter at each iteration.

When s^k > 0 and λ_I^k > 0, the system (10) can be rewritten in equivalent forms. Let the right hand side of system (10) be

    −H(w^k) + ρ_k ẽ = ( −ᾱ^k ; −ε^k ; −β^k ; −θ^k + ρ_k e )

where

    ᾱ^k = ∇f(x^k) − ∇g_E(x^k) λ_E^k − ∇g_I(x^k) λ_I^k
    ε^k = g_E(x^k)
    β^k = g_I(x^k) − s^k
    θ^k = S_k Λ_I^k e

From

    Δs^k = −(Λ_I^k)^{-1} ( θ^k − ρ_k e + S_k Δλ_I^k )

the system (10) can be rewritten in the reduced form

    [ ∇²L(x^k, λ_E^k, λ_I^k)   −∇g_E(x^k)   −∇g_I(x^k)       ] [ Δx^k    ]   [ −ᾱ^k  ]
    [ ∇g_E(x^k)^T               0            0                ] [ Δλ_E^k ] = [ −ε^k  ]        (11)
    [ ∇g_I(x^k)^T               0            (Λ_I^k)^{-1} S_k ] [ Δλ_I^k ]   [ −β̃^k ]

Here, β̃^k = g_I(x^k) − ρ_k (Λ_I^k)^{-1} e. By changing the sign of the third block equation, the coefficient matrix of the system (11) becomes symmetric. By a further substitution from the third block equation,

    Δλ_I^k = ρ_k S_k^{-1} e − S_k^{-1} Λ_I^k ( ∇g_I(x^k)^T Δx^k + g_I(x^k) )

the system (11) becomes the symmetric system written in the condensed form

    [ H_k              −∇g_E(x^k) ] [ Δx^k    ]   [ −α̂^k ]
    [ −∇g_E(x^k)^T      0          ] [ Δλ_E^k ] = [  ε^k  ]        (12)

where

    H_k = ∇²L(x^k, λ_E^k, λ_I^k) + ∇g_I(x^k) S_k^{-1} Λ_I^k ∇g_I(x^k)^T
    α̂^k = ᾱ^k − ρ_k ∇g_I(x^k) S_k^{-1} e + ∇g_I(x^k) S_k^{-1} Λ_I^k g_I(x^k)

The system (10) is not necessarily an ill conditioned system of equations. In actual computations, more care must be taken during the reduction processes of the system (10) to avoid growth in the elements of the coefficient matrix due to small elements of Λ_I^k in the case of (11) or of S_k in the case of (12).
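A small sketch of the block elimination just described may help; it builds the condensed system (12) from made-up, well-conditioned data (random matrices standing in for the problem quantities, not taken from the report), recovers Δλ_I and Δs by back substitution, and checks the result against the full perturbed system (10).

```python
import numpy as np

rng = np.random.default_rng(0)
n, nE, nI = 4, 1, 2
B = rng.standard_normal((n, n))
hess_L = B @ B.T + n * np.eye(n)              # stand-in for the Hessian of the Lagrangian
A_E = rng.standard_normal((n, nE))            # columns = gradients of g_E
A_I = rng.standard_normal((n, nI))            # columns = gradients of g_I
s, lam_I = rng.uniform(0.5, 1.5, nI), rng.uniform(0.5, 1.5, nI)
abar = rng.standard_normal(n)                 # stand-in for alpha-bar^k
eps  = rng.standard_normal(nE)                # stand-in for epsilon^k = g_E(x^k)
gI   = rng.standard_normal(nI)                # stand-in for g_I(x^k); beta^k = g_I - s
rho  = 0.1                                    # perturbation parameter rho_k

# condensed coefficient matrix and right hand side, equation (12)
Hc = hess_L + A_I @ np.diag(lam_I / s) @ A_I.T
ahat = abar - rho * A_I @ (1.0 / s) + A_I @ ((lam_I / s) * gI)
M = np.block([[Hc, -A_E], [-A_E.T, np.zeros((nE, nE))]])
sol = np.linalg.solve(M, np.concatenate([-ahat, eps]))
dx, dlamE = sol[:n], sol[n:]

# back substitution for the eliminated unknowns
dlamI = rho / s - (lam_I / s) * (A_I.T @ dx + gI)
ds = -(1.0 / lam_I) * (s * lam_I - rho + s * dlamI)      # from the complementarity row of (10)

# sanity check against the full system (10), assembled as in (5)
full = np.zeros((n + nE + 2 * nI,) * 2)
full[:n, :n] = hess_L; full[:n, n:n+nE] = -A_E; full[:n, n+nE:n+nE+nI] = -A_I
full[n:n+nE, :n] = A_E.T
full[n+nE:n+nE+nI, :n] = A_I.T; full[n+nE:n+nE+nI, n+nE+nI:] = -np.eye(nI)
full[n+nE+nI:, n+nE:n+nE+nI] = np.diag(s); full[n+nE+nI:, n+nE+nI:] = np.diag(lam_I)
rhs_full = np.concatenate([-abar, -eps, -(gI - s), -(s * lam_I) + rho * np.ones(nI)])
dw = np.concatenate([dx, dlamE, dlamI, ds])
print(np.linalg.norm(full @ dw - rhs_full))   # should be at round-off level
```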

3 Choice of the perturbation parameter

The question is now how to choose the perturbation parameter ρ_k in the perturbed Newton equation (10). We define ρ_k = σ_k μ_k, where 0 < σ_min ≤ σ_k ≤ σ_max < 1 and μ_k satisfies the condition

    μ_k ‖ẽ‖ ≤ ‖H(w^k)‖                                  (13)

Here ‖·‖ denotes the euclidean norm. Thus, equation (10) may be interpreted as the iteration of the inexact Newton method [2] for solving the KKT system (4) with the parameter σ_k ∈ (0,1) as forcing term. Therefore, the vector σ_k μ_k ẽ has the meaning of a residual. In the inexact Newton method with forcing term σ_k, we have the condition on the residual

    ‖H'(w^k) Δw^k + H(w^k)‖ ≤ σ_k ‖H(w^k)‖              (14)

where Δw^k is the solution of the linear system (10). By squaring this inequality we obtain

    2 H(w^k)^T ( H'(w^k) Δw^k + H(w^k) ) + ‖H'(w^k) Δw^k‖² ≤ σ_k² ‖H(w^k)‖² − ‖H(w^k)‖² + ‖H(w^k)‖²

Hence,

    H(w^k)^T H'(w^k) Δw^k ≤ (σ_k² − 1)/2 ‖H(w^k)‖² ≤ 0

Let us define the merit function for system (4)

    Φ(w) = ‖H(w)‖²                                      (15)

Since ∇Φ(w) = 2 ∇H(w) H(w) = 2 H'(w)^T H(w), the above inequality gives

    ∇Φ(w^k)^T Δw^k ≤ 0                                  (16)

Thus, a solution Δw^k of the system (10) satisfying condition (14) is a descent direction at w^k for the merit function Φ(w). Since H(w^k)^T ẽ = e^T Λ_I^k S_k e = s^{kT} λ_I^k, we have

    ∇Φ(w^k)^T Δw^k = 2 H(w^k)^T H'(w^k) Δw^k = 2 H(w^k)^T ( −H(w^k) + σ_k μ_k ẽ ) = −2 ‖H(w^k)‖² + 2 σ_k μ_k s^{kT} λ_I^k

If

    σ_k μ_k s^{kT} λ_I^k ≤ ‖H(w^k)‖²                    (17)

then condition (16) is satisfied.
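A quick numerical illustration of (14)-(17): for an exact solution of the perturbed equation (10) the residual with respect to the unperturbed Newton equation is exactly σ_k μ_k ẽ, and ∇Φ(w^k)^T Δw^k is nonpositive whenever σ_k μ_k s^{kT}λ_I^k ≤ ‖H(w^k)‖². The data below are random stand-ins, not quantities from the report.

```python
import numpy as np

rng = np.random.default_rng(1)
m, nI = 7, 3                                          # made-up sizes
Jac = rng.standard_normal((m, m)) + m * np.eye(m)     # stand-in for H'(w^k)
s, lam_I = rng.uniform(0.5, 1.5, nI), rng.uniform(0.5, 1.5, nI)
H = rng.standard_normal(m)
H[-nI:] = s * lam_I                                   # last block of H(w) is S Lambda_I e
e_tilde = np.concatenate([np.zeros(m - nI), np.ones(nI)])

sigma = 0.4
mu = s @ lam_I / nI                                   # the choice (18)
dw = np.linalg.solve(Jac, -H + sigma * mu * e_tilde)  # exact solution of (10)

res = Jac @ dw + H                                    # residual of the unperturbed equation
grad_phi_dot_dw = 2.0 * H @ (Jac @ dw)                # gradient of Phi = ||H||^2 dotted with dw
print(np.linalg.norm(res), "<=", sigma * np.linalg.norm(H))   # inequality (14)
print(grad_phi_dot_dw, "<= 0")                                 # inequality (16)
```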

Two different choices of μ_k that satisfy (17) are [7], [4]

    μ_k^{(1)} = s^{kT} λ_I^k / n_I                      (18)

and, by (13),

    μ_k^{(2)} = ‖H(w^k)‖ / √n_I                         (19)

We also have that

    μ_k^{(1)} ≤ μ_k^{(2)}                               (20)

In order to prove that the values (18) and (19) of μ_k satisfy (17) and that the inequality (20) holds, we use the following relations (e.g. see [9])

    s^{kT} λ_I^k = ‖S_k Λ_I^k e‖_1 ≤ √n_I ‖S_k Λ_I^k e‖ ≤ √n_I ‖H(w^k)‖

here ‖·‖_1 denotes the 1-norm. In fact,

    μ_k^{(1)} = s^{kT}λ_I^k / n_I = ‖S_k Λ_I^k e‖_1 / n_I ≤ ‖S_k Λ_I^k e‖ / √n_I ≤ ‖H(w^k)‖ / √n_I = μ_k^{(2)}

which proves (20). Moreover,

    σ_k μ_k^{(1)} s^{kT}λ_I^k ≤ (s^{kT}λ_I^k)² / n_I ≤ ‖S_k Λ_I^k e‖² ≤ ‖H(w^k)‖²
    σ_k μ_k^{(2)} s^{kT}λ_I^k ≤ ‖H(w^k)‖ s^{kT}λ_I^k / √n_I ≤ ‖H(w^k)‖ ‖S_k Λ_I^k e‖ ≤ ‖H(w^k)‖²

so that both (18) and (19) satisfy (17).

Sometimes it happens that the value of μ_k^{(1)} is too small when we are far away from a solution. It implies that the perturbed KKT system (8)-(9) is too close to the KKT system (4) and this can produce stagnation of the current iterate on the boundary of the nonnegative orthant s ≥ 0, λ_I ≥ 0. In this case we propose to use other values for μ_k, for instance the value μ_k^{(2)} or any value between μ_k^{(1)} and μ_k^{(2)}, in order to solve system (4) more inexactly and to obtain a different descent direction; we call these values safeguards; they are intended to prevent the perturbation parameter ρ_k from becoming too small too quickly.
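The two admissible values (18)-(19) and a safeguarded choice between them translate into a few lines of code; the interpolation weight theta_mu below is a hypothetical knob for the sketch, not a value prescribed by the report.

```python
import numpy as np

def safeguarded_mu(H, s, lam_I, theta_mu=0.5):
    """Return a mu_k in [mu_k^(1), mu_k^(2)], cf. (18)-(20)."""
    nI = s.size
    mu1 = s @ lam_I / nI                       # (18): may be too small far from the solution
    mu2 = np.linalg.norm(H) / np.sqrt(nI)      # (19): upper value allowed by (13)
    return (1.0 - theta_mu) * mu1 + theta_mu * mu2

s, lam_I = np.array([1e-3, 2.0]), np.array([1e-3, 1.5])
H = np.concatenate([np.array([0.8, -0.5, 0.3]), s * lam_I])   # made-up residual, last block = S Lambda_I e
mu = safeguarded_mu(H, s, lam_I)
sigma = 0.4
rho = sigma * mu                               # perturbation parameter rho_k = sigma_k * mu_k
print(mu, rho)
```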

Moreover, we observe that, if μ_k belongs to the interval [μ_k^{(1)}, μ_k^{(2)}], inequality (13) is satisfied.

Computation of the exact solution Δw^k of equation (10) can be too expensive if the dimension n + n_E + 2n_I of the Jacobian matrix H'(w^k) is large and, for any dimension, may not be justified when w^k is relatively far from the solution w*. Therefore one might prefer to compute some approximate solution of (10). In practice, the solution of (10) will most frequently be obtained by performing some iterations of an inner linear iterative solver for the equation (10) with an adaptive stopping criterion of the form

    ‖r^k‖ ≤ δ_k √n_I μ_k                                (21)

where 0 ≤ δ_k ≤ δ_max < 1 and

    r^k = H'(w^k) Δw^k + H(w^k) − σ_k μ_k ẽ             (22)

(e.g. see [1]). If μ_k satisfies inequality (13), formula (21) becomes

    ‖r^k‖ ≤ δ_k ‖H(w^k)‖                                (23)

Let us consider the vector r^k partitioned commensurately with Δw^k and H(w^k)

    r^k = ( r_1^k ; r_2^k ; r_3^k ; r_4^k )             (24)

and let us define as r̂^k the vector of the first three block components of r^k, i.e.

    r̂^k = ( r_1^k ; r_2^k ; r_3^k )                     (25)

We indicate the approximate solution of (10) again by Δw^k. We have the following result.

Theorem 2. Let μ_k satisfy inequality (13); the approximate solution Δw^k of equation (10), satisfying condition (21) with σ_max + δ_max < 1, is a descent direction at w^k for the merit function Φ(w).

Proof. Using (22) we have

    ∇Φ(w^k)^T Δw^k = 2 H(w^k)^T H'(w^k) Δw^k = 2 H(w^k)^T ( −H(w^k) + σ_k μ_k ẽ + r^k )
                   = −2 ‖H(w^k)‖² + 2 σ_k μ_k H(w^k)^T ẽ + 2 H(w^k)^T r^k

Since H(w^k)^T ẽ = e^T S_k Λ_I^k e = s^{kT} λ_I^k, we obtain

    μ_k H(w^k)^T ẽ = μ_k s^{kT} λ_I^k ≤ μ_k √n_I ‖S_k Λ_I^k e‖ ≤ μ_k √n_I ‖H(w^k)‖ ≤ ‖H(w^k)‖²    (by (19))

The Cauchy-Schwarz inequality |H(w^k)^T r^k| ≤ ‖H(w^k)‖ ‖r^k‖ gives

    H(w^k)^T r^k ≤ ‖H(w^k)‖ δ_k √n_I μ_k ≤ δ_k ‖H(w^k)‖²    (by (19))

Thus

    ∇Φ(w^k)^T Δw^k ≤ −2 ( 1 − σ_max − δ_max ) ‖H(w^k)‖² ≤ 0

The choice δ_k = 0 implies that the equation (10) must be solved exactly.

When μ_k satisfies (13), the approximate solution Δw^k of equation (10) satisfying condition (21) also satisfies the condition on the residual of the inexact Newton method, where the forcing term is σ_k + δ_k. In fact,

    ‖H'(w^k) Δw^k + H(w^k)‖ = ‖ρ_k ẽ + r^k‖ ≤ ρ_k ‖ẽ‖ + ‖r^k‖ ≤ σ_k ‖H(w^k)‖ + δ_k ‖H(w^k)‖ = (σ_k + δ_k) ‖H(w^k)‖

The value μ_k = μ_k^{(2)} is the upper bound for the parameter μ_k under which, for any μ_k ≤ μ_k^{(2)}, the solution Δw^k of system (10), computed exactly or approximately with formula (21), satisfies the condition on the residual of the inexact Newton method with forcing term σ_k or σ_k + δ_k.

Let M̃_k and M̂_k be the coefficient matrices of the reduced system (11) and of the condensed system (12) respectively. If the perturbed Newton equation (10) is solved by rewriting the system in the form (11) or (12), an approximate solution Δw^k satisfies

    H'(w^k) Δw^k = −H(w^k) + ρ_k ẽ + r^k

where r^k is bounded as in (21), and it is defined for the reduced form (11) as

    r^k = ( r_1^k ; r_2^k ; r_3^k ; 0 )    with    ( r_1^k ; r_2^k ; r_3^k ) = M̃_k ( Δx^k ; Δλ_E^k ; Δλ_I^k ) + ( ᾱ^k ; ε^k ; β̃^k )

and for the condensed form (12) as

    r^k = ( r_1^k ; r_2^k ; 0 ; 0 )    with    ( r_1^k ; r_2^k ) = M̂_k ( Δx^k ; Δλ_E^k ) + ( α̂^k ; −ε^k )

We note that, when system (10) is solved in the form (11) or (12), no further perturbation of the complementarity block equation is added (r_4^k = 0).
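The adaptive stopping test (21) (equivalently (23) when (13) holds) can be isolated as a small predicate to be used with whatever inner solver one prefers. In the sketch below the "inner solver" is mimicked by truncating an exact solve; this is only a stand-in for, say, a Krylov iteration, and all data are made up.

```python
import numpy as np

def inner_residual_ok(Jac, H, e_tilde, dw, sigma, mu, delta, nI):
    """Test (21): ||H'(w)dw + H(w) - sigma*mu*e_tilde|| <= delta*sqrt(nI)*mu."""
    r = Jac @ dw + H - sigma * mu * e_tilde
    return np.linalg.norm(r) <= delta * np.sqrt(nI) * mu, r

rng = np.random.default_rng(2)
m, nI = 6, 2
Jac = rng.standard_normal((m, m)) + m * np.eye(m)       # stand-in for H'(w^k)
s, lam_I = rng.uniform(0.5, 1.5, nI), rng.uniform(0.5, 1.5, nI)
H = np.concatenate([rng.standard_normal(m - nI), s * lam_I])
e_tilde = np.concatenate([np.zeros(m - nI), np.ones(nI)])
sigma, delta, mu = 0.4, 0.3, s @ lam_I / nI

dw_exact = np.linalg.solve(Jac, -H + sigma * mu * e_tilde)
dw_approx = np.round(dw_exact, 3)                       # crude stand-in for an approximate inner solution
ok, r = inner_residual_ok(Jac, H, e_tilde, dw_approx, sigma, mu, delta, nI)
print(ok, np.linalg.norm(r), delta * np.sqrt(nI) * mu)
```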

4 Step-length selection: a path following strategy

The solution Δw^k of equation (10) is not usually feasible, since it is not guaranteed that the vectors s^{k+1} and λ_I^{k+1} of the new iterate w^{k+1} = w^k + Δw^k satisfy the bounds (9), that is s^{k+1} ≥ 0, λ_I^{k+1} ≥ 0, when the corresponding vectors s^k and λ_I^k are nonnegative. To avoid this difficulty, we introduce a step relaxation, that is a damping parameter α_k > 0, which shortens the step Δw^k between the two iterates w^k and w^{k+1}. Thus, if w^k is the current iterate of the inexact Newton method, we choose for the next iterate

    w^{k+1} = w^k + α_k Δw^k                            (26)

where 0 < α_k ≤ 1 is the step length. This modification of the inexact Newton method involves the introduction of a merit function Φ(w) ≥ 0 related to the system H(w) = 0, for example the merit function (15),

    Φ(w) = H(w)^T H(w) = ‖H(w)‖²

and a strategy for selecting the step length α_k to guarantee that the function Φ(w) is reduced at each iteration and that the components of the vectors s^k and λ_I^k remain positive for all k = 0, 1, ...

In order to generate a sequence of iterates w^k with s^k > 0 and λ_I^k > 0 that converges to a solution w* of system (4) under standard assumptions on KKT systems, it is necessary to define a path to be followed by the iterates w^k which keeps them from coming too close to the boundary of the nonnegative orthant defined by (9). For the generation of this trajectory, we use the following two functions of α defined in [7] (cf. the definition of the norm neighborhood and of the one-sided norm neighborhood [11]), k = 0, 1, ...

    ϕ_1^{(k)}(α) = min_{i=1,...,n_I} ( S^k(α) Λ_I^k(α) e )_i − γ_k τ_1 s^k(α)^T λ_I^k(α) / n_I        (27)

    ϕ_2^{(k)}(α) = s^k(α)^T λ_I^k(α) − γ_k τ_2 ‖G(w^k(α))‖                                            (28)

where

    τ_1 = min_{i=1,...,n_I} (S^0 Λ_I^0 e)_i / ( s^{0T} λ_I^0 / n_I ),    τ_2 = s^{0T} λ_I^0 / ‖G(w^0)‖,    γ_k ∈ [1/2, 1)        (29)

and

    w^k(α) = ( x^k(α); λ_E^k(α); λ_I^k(α); s^k(α) ) = ( x^k; λ_E^k; λ_I^k; s^k ) + α ( Δx^k; Δλ_E^k; Δλ_I^k; Δs^k ) = w^k + α Δw^k

with s^0 > 0 and λ_I^0 > 0. Thus τ_1 > 0, τ_2 > 0. Clearly w^k = w^k(0) and w^{k+1} = w^k(α_k). The vector Δw^k is the approximate solution of (10) with residual control (21). The function ϕ_1^{(k)}(α) is a piecewise quadratic function of α and the function ϕ_2^{(k)}(α) is a nonlinear function of α.

Theorems 3 and 4 will assure that there exist two positive numbers α_k^{(1)} and α_k^{(2)} in (0,1] such that

    ϕ_1^{(k)}(α) ≥ 0   for all α ∈ (0, α_k^{(1)}];    ϕ_2^{(k)}(α) ≥ 0   for all α ∈ (0, α_k^{(2)}]

when ϕ_1^{(k)}(0) ≥ 0 and ϕ_2^{(k)}(0) ≥ 0. These conditions imply

    min_{i=1,...,n_I} ( S^k(α) Λ_I^k(α) e )_i ≥ γ_k τ_1 s^k(α)^T λ_I^k(α) / n_I            (30)

and

    s^k(α)^T λ_I^k(α) ≥ γ_k τ_2 ‖G(w^k(α))‖                                                (31)

This means that condition (30) keeps the iterates w^k sufficiently far from the boundary of the nonnegative orthant s ≥ 0, λ_I ≥ 0, while condition (31) obliges the sequence {s^k(α)^T λ_I^k(α)} to converge to zero more slowly than {‖G(w^k(α))‖}.

Theorem 3. Let us assume that σ_k ∈ [σ_min, σ_max] ⊂ (0,1) and that γ_k and τ_1 are given by (29); let us also assume that

    σ_k > [ ( √n_I + τ_1 γ_k ) / ( 1 − τ_1 γ_k ) ] δ_k                                     (32)

with δ_k ∈ [0, δ_max] ⊂ [0,1). Then, if ϕ_1^{(k)}(0) ≥ 0, there exists a positive number α_k^{(1)} > 0 such that ϕ_1^{(k)}(α) ≥ 0 for all α ∈ [0, α_k^{(1)}].

Proof. Let

    ϕ_{1,i}^{(k)}(α) = s_i^k(α) λ_{I,i}^k(α) − ( γ_k τ_1 / n_I ) s^k(α)^T λ_I^k(α)         (33)

and

    L_{1,i}^k = | Δs_i^k Δλ_{I,i}^k | + ( γ_k τ_1 / n_I ) | Δs^{kT} Δλ_I^k |

for i = 1, ..., n_I.

Let the vector r^k be partitioned as in (24); from (22) we have

    S_k Δλ_I^k + Λ_I^k Δs^k = −S_k Λ_I^k e + ρ_k e + r_4^k

which becomes, componentwise (i = 1, ..., n_I),

    s_i^k Δλ_{I,i}^k + λ_{I,i}^k Δs_i^k = −s_i^k λ_{I,i}^k + ρ_k + r_{4,i}^k                (34)

By multiplying by e^T, it becomes

    λ_I^{kT} Δs^k + s^{kT} Δλ_I^k = −s^{kT} λ_I^k + ρ_k n_I + e^T r_4^k                     (35)

Thus, for α ∈ [0,1], we can write

    ϕ_{1,i}^{(k)}(α) = [ s_i^k + α Δs_i^k ][ λ_{I,i}^k + α Δλ_{I,i}^k ] − ( τ_1γ_k / n_I ) [ s^k + α Δs^k ]^T [ λ_I^k + α Δλ_I^k ]
      = s_i^k λ_{I,i}^k + α ( s_i^k Δλ_{I,i}^k + λ_{I,i}^k Δs_i^k ) + α² Δs_i^k Δλ_{I,i}^k
        − ( τ_1γ_k / n_I ) [ s^{kT} λ_I^k + α ( λ_I^{kT} Δs^k + s^{kT} Δλ_I^k ) + α² Δs^{kT} Δλ_I^k ]

By using (34) and (35), we have

    ϕ_{1,i}^{(k)}(α) = (1 − α) ϕ_{1,i}^{(k)}(0) + α ρ_k ( 1 − τ_1γ_k ) + α [ r_{4,i}^k − ( τ_1γ_k / n_I ) e^T r_4^k ]
                       + α² [ Δs_i^k Δλ_{I,i}^k − ( τ_1γ_k / n_I ) Δs^{kT} Δλ_I^k ]

By hypothesis ϕ_{1,i}^{(k)}(0) ≥ 0, then

    ϕ_{1,i}^{(k)}(α) ≥ α [ ρ_k ( 1 − τ_1γ_k ) − | r_{4,i}^k | − ( τ_1γ_k / n_I ) Σ_{i=1}^{n_I} | r_{4,i}^k | ] − α² L_{1,i}^k

Thus²

    ϕ_{1,i}^{(k)}(α) ≥ α [ ρ_k ( 1 − τ_1γ_k ) − ( 1 + γ_kτ_1 / √n_I ) ‖r_4^k‖ ] − α² L_{1,i}^k

and then, by using formula (21), we have

    ϕ_{1,i}^{(k)}(α) ≥ α [ σ_kμ_k ( 1 − τ_1γ_k ) − ( 1 + γ_kτ_1 / √n_I ) δ_k √n_I μ_k ] − α² L_{1,i}^k
                     = α μ_k [ σ_k ( 1 − τ_1γ_k ) − ( √n_I + γ_kτ_1 ) δ_k ] − α² L_{1,i}^k        (36)

that is, the quantity in square brackets has to be positive; or, when r_4^k = 0,

    ϕ_{1,i}^{(k)}(α) ≥ α σ_kμ_k ( 1 − τ_1γ_k ) − α² L_{1,i}^k                                    (37)

We observe that in formulae (36) and (37) we have 1 − τ_1γ_k ≥ 0. In the case of (36), inequality (32) implies that there exists

    ᾱ_k^{(1)} = [ ( 1 − τ_1γ_k ) σ_k − ( √n_I + γ_kτ_1 ) δ_k ] μ_k / L_1^k > 0                   (38)

such that ϕ_1^{(k)}(α) ≥ 0 for α ∈ [0, ᾱ_k^{(1)}]. Here L_1^k = max_{i=1,...,n_I} L_{1,i}^k. If we define

    α_k^{(1)} = max { α ∈ [0,1] : ϕ_1^{(k)}(t) ≥ 0  for all  t ≤ α }

we have α_k^{(1)} ≥ ᾱ_k^{(1)} > 0. We note that, in the case of (37), we do not need the inequality (32); the value ᾱ_k^{(1)} becomes

    ᾱ_k^{(1)} = ( 1 − τ_1γ_k ) σ_k μ_k / L_1^k > 0                                               (39)

This completes the proof.

² Here we use the relations |a − b| ≤ |a| + |b| for all a, b, and ‖z‖_1 ≤ √n ‖z‖ for any n-vector z. We also have that 1 − τ_1γ_k ≥ 0; in fact,

    γ_k τ_1 = γ_k min_i ( s_i^0 λ_{I,i}^0 ) / ( s^{0T} λ_I^0 / n_I ) ≤ min_i ( s_i^0 λ_{I,i}^0 ) / ( n_I min_i ( s_i^0 λ_{I,i}^0 ) / n_I ) = 1
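The centrality functions (27)-(28) and the constants (29) translate directly into code. The sketch below evaluates them for made-up data and estimates the largest admissible α by sampling, which is only a numerical stand-in for the analytic lower bounds (38)-(39) (and, later, (43)); the linear model of G along the step is also an assumption of the sketch.

```python
import numpy as np

def phi1(alpha, s, lam, ds, dlam, gamma, tau1):
    sa, la = s + alpha * ds, lam + alpha * dlam
    return np.min(sa * la) - gamma * tau1 * (sa @ la) / s.size          # (27)

def phi2(alpha, s, lam, ds, dlam, G_of_alpha, gamma, tau2):
    sa, la = s + alpha * ds, lam + alpha * dlam
    return sa @ la - gamma * tau2 * np.linalg.norm(G_of_alpha(alpha))   # (28)

# made-up data standing in for the quantities at iteration k
s0, lam0 = np.array([0.8, 1.2]), np.array([1.0, 0.6])
G0 = np.array([0.5, -0.4, 0.3])
tau1 = np.min(s0 * lam0) / ((s0 @ lam0) / s0.size)                      # (29)
tau2 = (s0 @ lam0) / np.linalg.norm(G0)                                 # (29)
gamma = 0.5

ds, dlam = np.array([-0.6, 0.2]), np.array([-0.9, 0.1])                 # a hypothetical step
G_of_alpha = lambda a: (1.0 - a) * G0                                   # simplistic model of G(w(alpha))
alphas = np.linspace(0.0, 1.0, 1001)
ok = [a for a in alphas
      if phi1(a, s0, lam0, ds, dlam, gamma, tau1) >= 0
      and phi2(a, s0, lam0, ds, dlam, G_of_alpha, gamma, tau2) >= 0]
alpha_tilde = max(ok) if ok else 0.0                                    # numerical stand-in for (44)
print(alpha_tilde)
```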

Theorem 4. Let us assume that σ_k ∈ [σ_min, σ_max] ⊂ (0,1), that γ_k and τ_2 are given by (29), and that δ_k ∈ [0, δ_max] ⊂ [0,1); let us also assume that

    σ_k > [ ( √n_I + τ_2 γ_k ) / √n_I ] δ_k                             (40)

and that G'(w) is Lipschitz continuous with constant Γ, i.e.

    ‖G'(w) − G'(w̃)‖ ≤ Γ ‖w − w̃‖                                        (41)

Then, if ϕ_2^{(k)}(0) ≥ 0, there exists a positive number α_k^{(2)} > 0 such that ϕ_2^{(k)}(α) ≥ 0 for all α ∈ [0, α_k^{(2)}].

Proof. Let

    L_2^k = | Δs^{kT} Δλ_I^k | + γ_k τ_2 ( Γ / 2 ) ‖Δw^k‖²

and let the vector r^k defined in formula (22) be partitioned as in (24) and (25). By the mean value theorem for vector valued functions (e.g. see [3, p. 74]), we can write for α ∈ [0,1]

    G(w^k + α Δw^k) = G(w^k) + α G'(w^k) Δw^k + ∫_0^1 [ G'(w^k + ξα Δw^k) − G'(w^k) ] dξ  α Δw^k
                    = (1 − α) G(w^k) + α r̂^k + ∫_0^1 [ G'(w^k + ξα Δw^k) − G'(w^k) ] dξ  α Δw^k

Here G'(w) is the (n + n_E + n_I) × (n + n_E + 2n_I) matrix consisting of the first three block rows of H'(w) in formula (5). From the Lipschitz continuity of the derivative of G(w), we obtain

    ‖G(w^k + α Δw^k)‖ ≤ (1 − α) ‖G(w^k)‖ + α ‖r̂^k‖ + ∫_0^1 Γ ξα ‖Δw^k‖ dξ  α ‖Δw^k‖

or, by (21),

    ‖G(w^k + α Δw^k)‖ ≤ (1 − α) ‖G(w^k)‖ + α δ_k √n_I μ_k + ( Γ / 2 ) α² ‖Δw^k‖²              (42)

From the definition of ϕ_2^{(k)}(α) in (28), and by using (35), we have

    ϕ_2^{(k)}(α) = s^k(α)^T λ_I^k(α) − γ_kτ_2 ‖G(w^k + α Δw^k)‖
                 = (1 − α) s^{kT} λ_I^k + α ( ρ_k n_I + e^T r_4^k ) + α² Δs^{kT} Δλ_I^k − γ_kτ_2 ‖G(w^k + α Δw^k)‖

If we multiply (42) by γ_kτ_2 and change the sign, we obtain a lower bound for −γ_kτ_2 ‖G(w^k + α Δw^k)‖ that gives

    ϕ_2^{(k)}(α) ≥ (1 − α) ϕ_2^{(k)}(0) + α ( ρ_k n_I + e^T r_4^k − γ_kτ_2 δ_k √n_I μ_k ) − α² [ | Δs^{kT} Δλ_I^k | + γ_kτ_2 ( Γ / 2 ) ‖Δw^k‖² ]
                 = (1 − α) ϕ_2^{(k)}(0) + α ( ρ_k n_I + e^T r_4^k − γ_kτ_2 δ_k √n_I μ_k ) − α² L_2^k

Since

    e^T r_4^k ≥ − Σ_{i=1}^{n_I} | r_{4,i}^k | = −‖r_4^k‖_1 ≥ −√n_I ‖r_4^k‖ ≥ −n_I δ_k μ_k

and, by hypothesis, ϕ_2^{(k)}(0) ≥ 0, we obtain

    ϕ_2^{(k)}(α) ≥ α ( σ_kμ_k n_I − n_I δ_k μ_k − γ_kτ_2 δ_k √n_I μ_k ) − α² L_2^k
                 = α [ n_I σ_k − ( n_I + γ_kτ_2 √n_I ) δ_k ] μ_k − α² L_2^k

The inequality (40) implies that there exists

    ᾱ_k^{(2)} = [ n_I σ_k − ( n_I + γ_kτ_2 √n_I ) δ_k ] μ_k / L_2^k > 0                        (43)

such that ϕ_2^{(k)}(α) ≥ 0 for α ∈ [0, ᾱ_k^{(2)}]. If we define

    α_k^{(2)} = max { α ∈ [0,1] : ϕ_2^{(k)}(t) ≥ 0  for all  t ≤ α }

we have α_k^{(2)} ≥ ᾱ_k^{(2)} > 0. This completes the proof.

Let us define

    α̃_k = min { α_k^{(1)}, α_k^{(2)}, 1 } ∈ (0,1]                                             (44)

and

    η_k = 1 − α̃_k ( 1 − ( σ_k + δ_k ) )                                                        (45)

with σ_k + δ_k ≤ σ_max + δ_max < 1; then we have η_k < 1, and the step α̃_k Δw^k satisfies the condition on the residual of the inexact Newton method with forcing term η_k,

    ‖H'(w^k) α̃_k Δw^k + H(w^k)‖ ≤ η_k ‖H(w^k)‖                                                (46)

Indeed, since α̃_k > 0 and σ_k + δ_k ≤ σ_max + δ_max < 1, we have α̃_k ( 1 − ( σ_k + δ_k ) ) ≥ α̃_k ( 1 − ( σ_max + δ_max ) ) ≥ 0;

besides, using (22), (23) and (13), we have

    ‖H'(w^k) α̃_k Δw^k + H(w^k)‖ = ‖ α̃_k ( −H(w^k) + σ_kμ_k ẽ + r^k ) + H(w^k) ‖
      ≤ (1 − α̃_k) ‖H(w^k)‖ + α̃_k ( σ_kμ_k ‖ẽ‖ + ‖r^k‖ )
      ≤ [ 1 − α̃_k ( 1 − ( σ_k + δ_k ) ) ] ‖H(w^k)‖ = η_k ‖H(w^k)‖

To select the step length α_k we perform a reduction of α̃_k by using the algorithm NIB described in [6], until an acceptable

    α_k = θ^{t_k} α̃_k                                                                          (47)

is found, where t_k is the smallest nonnegative integer such that α_k satisfies

    ‖H(w^k + α_k Δw^k)‖ ≤ [ 1 − β α_k ( 1 − ( σ_k + δ_k ) ) ] ‖H(w^k)‖                          (48)

with θ, β ∈ (0,1). Since 1 − β α_k ( 1 − ( σ_k + δ_k ) ) < 1, inequality (48) asserts that

    ‖H(w^{k+1})‖ ≤ ‖H(w^k)‖                                                                    (49)

In a later section we will prove that t_k is a finite number, independent of k. Now, let us assume that ϕ_1^{(k)}(0) ≥ 0 and ϕ_2^{(k)}(0) ≥ 0. Since α_k ≤ α̃_k, theorems 3 and 4 assert that

    ϕ_1^{(k)}(α_k) ≥ 0    and    ϕ_2^{(k)}(α_k) ≥ 0                                            (50)

From (28) and the choice γ_k ≥ 1/2 we can write

    s^k(α_k)^T λ_I^k(α_k) ≥ ( τ_2 / 2 ) ‖G(w^k(α_k))‖

Since we have

    ‖S^k(α_k) Λ_I^k(α_k) e‖ ≤ ‖S^k(α_k) Λ_I^k(α_k) e‖_1 = s^k(α_k)^T λ_I^k(α_k)

it follows that

    s^k(α_k)^T λ_I^k(α_k) ≥ (1/2) [ ‖S^k(α_k) Λ_I^k(α_k) e‖ + τ_2 ‖G(w^k(α_k))‖ ]

then

    s^k(α_k)^T λ_I^k(α_k) ≥ (1/2) min{1, τ_2} [ ‖S^{k+1} Λ_I^{k+1} e‖ + ‖G(w^k(α_k))‖ ]

that is

    s^{(k+1)T} λ_I^{k+1} ≥ (1/2) min{1, τ_2} ‖H(w^{k+1})‖                                      (51)

Therefore, if H(w^{k+1}) ≠ 0 (i.e. ‖H(w^{k+1})‖ ≠ 0, or Φ(w^{k+1}) > 0), we have

    s^{(k+1)T} λ_I^{k+1} > 0                                                                    (52)

and from (27), when 1/2 ≤ γ_{k+1} ≤ γ_k, we have

    s_i^{k+1} λ_{I,i}^{k+1} ≥ γ_{k+1} τ_1 s^{(k+1)T} λ_I^{k+1} / n_I > 0                        (53)

which means that all the numbers s_i^{k+1} λ_{I,i}^{k+1}, i = 1, ..., n_I, are bounded away from zero. Since

    s^k(α_k)^T λ_I^k(α_k) = ‖S^k(α_k) Λ_I^k(α_k) e‖_1 ≤ √n_I ‖S^k(α_k) Λ_I^k(α_k) e‖

we can write

    s^{(k+1)T} λ_I^{k+1} ≤ √n_I ‖S^{k+1} Λ_I^{k+1} e‖ ≤ √n_I ‖H(w^{k+1})‖                       (54)

Moreover,

    max_{i=1,...,n_I} s_i^{k+1} λ_{I,i}^{k+1} ≤ ‖S^{k+1} Λ_I^{k+1} e‖ ≤ ‖S^{k+1} Λ_I^{k+1} e‖_1 = s^{(k+1)T} λ_I^{k+1} ≤ √n_I ‖H(w^{k+1})‖

which means that all the numbers s_i^{k+1} λ_{I,i}^{k+1}, i = 1, ..., n_I, are bounded above. Combining (51) and (54), we have the following inequality

    l Φ(w^{k+1}) ≤ ( s^{(k+1)T} λ_I^{k+1} )² ≤ n_I Φ(w^{k+1})                                   (55)

where l = (1/2)² min{1, τ_2}².

With these results we can describe an iterative process for generating a sequence of points w^k with s^k > 0 and λ_I^k > 0, called interior points, that lie on a trajectory not too close to the boundary of the orthant defined by s ≥ 0 and λ_I ≥ 0.

Let w^0 be given with s^0 > 0 and λ_I^0 > 0; let a sequence of parameters {γ_k} be given with

    1 > γ_0 ≥ γ_1 ≥ ... ≥ γ_k ≥ ... ≥ 1/2

Set 0 < β ≤ 1/2 and 0 < θ < 1. For k = 0, formula (29) gives

    ϕ_1^{(0)}(0) = ( 1 − γ_0 ) min_i ( S^0 Λ_I^0 e )_i > 0
    ϕ_2^{(0)}(0) = ( 1 − γ_0 ) s^{0T} λ_I^0 > 0

Theorems 3 and 4 assure that there exist α_0^{(1)} > 0 and α_0^{(2)} > 0 such that

    ϕ_1^{(0)}(α) ≥ 0  for all α ∈ (0, α_0^{(1)}];    ϕ_2^{(0)}(α) ≥ 0  for all α ∈ (0, α_0^{(2)}]

Thus, we have ϕ_1^{(0)}(α_0) ≥ 0 and ϕ_2^{(0)}(α_0) ≥ 0, where α_0 is the step length obtained with the NIB algorithm starting from α̃_0 = min{α_0^{(1)}, α_0^{(2)}, 1}. If H'(w^0) is invertible, we determine w^1 = w^0 + α_0 Δw^0, where Δw^0 is the solution of (10) satisfying (21) for k = 0. The following properties hold:

    ‖H'(w^0) α̃_0 Δw^0 + H(w^0)‖ ≤ η_0 ‖H(w^0)‖    (η_0 < 1)
    ‖H(w^1)‖ ≤ ‖H(w^0)‖
    l Φ(w^1) ≤ ( s^{1T} λ_I^1 )² ≤ n_I Φ(w^1)

Hence, if H(w^1) ≠ 0, then s^{1T} λ_I^1 > 0 and s_i^1 λ_{I,i}^1 > 0 for i = 1, ..., n_I.

For k = 1, we consider the functions (27) and (28):

    ϕ_1^{(1)}(α) = min_{i=1,...,n_I} ( S^1(α) Λ_I^1(α) e )_i − γ_1 τ_1 s^1(α)^T λ_I^1(α) / n_I
    ϕ_2^{(1)}(α) = s^1(α)^T λ_I^1(α) − γ_1 τ_2 ‖G(w^1(α))‖

where s^1(α) = s^1 + α Δs^1, λ_I^1(α) = λ_I^1 + α Δλ_I^1 and w^1(α) = w^1 + α Δw^1.

We have

    ϕ_1^{(1)}(0) = min_{i=1,...,n_I} ( S^1 Λ_I^1 e )_i − γ_1 τ_1 s^{1T} λ_I^1 / n_I
    ϕ_2^{(1)}(0) = s^{1T} λ_I^1 − γ_1 τ_2 ‖G(w^1)‖

Since

    ϕ_1^{(0)}(α_0) = min_{i=1,...,n_I} ( S^1 Λ_I^1 e )_i − γ_0 τ_1 s^{1T} λ_I^1 / n_I ≥ 0
    ϕ_2^{(0)}(α_0) = s^{1T} λ_I^1 − γ_0 τ_2 ‖G(w^1)‖ ≥ 0

and γ_1 ≤ γ_0, we have ϕ_1^{(1)}(0) ≥ ϕ_1^{(0)}(α_0) ≥ 0 and ϕ_2^{(1)}(0) ≥ ϕ_2^{(0)}(α_0) ≥ 0. Thus, theorems 3 and 4 assure that there exist α_1^{(1)} > 0 and α_1^{(2)} > 0 such that

    ϕ_1^{(1)}(α) ≥ 0  for all α ∈ (0, α_1^{(1)}];    ϕ_2^{(1)}(α) ≥ 0  for all α ∈ (0, α_1^{(2)}]

Hence, we have ϕ_1^{(1)}(α_1) ≥ 0 and ϕ_2^{(1)}(α_1) ≥ 0, where α_1 is the step length obtained with the NIB algorithm starting from α̃_1 = min{α_1^{(1)}, α_1^{(2)}, 1}. If H'(w^1) is invertible, we determine w^2 = w^1 + α_1 Δw^1, where Δw^1 is the solution of (10) satisfying (21) for k = 1. The following properties hold:

    ‖H'(w^1) α̃_1 Δw^1 + H(w^1)‖ ≤ η_1 ‖H(w^1)‖    (η_1 < 1)
    ‖H(w^2)‖ ≤ ‖H(w^1)‖
    l Φ(w^2) ≤ ( s^{2T} λ_I^2 )² ≤ n_I Φ(w^2)

Hence, if H(w^2) ≠ 0, then s^{2T} λ_I^2 > 0 and s_i^2 λ_{I,i}^2 > 0 for i = 1, ..., n_I.

In the succeeding steps k = 2, 3, ... of the process we have

    ϕ_1^{(k)}(0) ≥ ϕ_1^{(k−1)}(α_{k−1}) ≥ 0    and    ϕ_2^{(k)}(0) ≥ ϕ_2^{(k−1)}(α_{k−1}) ≥ 0

Thus, using the functions ϕ_1^{(k)}(α) and ϕ_2^{(k)}(α) and the NIB algorithm, it is possible to determine the step length α_k > 0 and the point w^{k+1} = w^k + α_k Δw^k, if H'(w^k) is invertible. Then, the properties (46), (49) and (55) hold. Specifically, if H(w^k) ≠ 0 for all k = 0, 1, 2, ..., the inner products s^{kT} λ_I^k are bounded above and bounded away from zero and all components of S^k Λ_I^k e are bounded above and bounded away from zero.

Besides, the sequence {Φ(w^k)} is monotone nonincreasing. Therefore Φ(w^k) ≤ Φ(w^0) for all k = 1, 2, ... The last two properties will be used in theorem 5 to prove that the vectors s^k and λ_I^k are componentwise bounded away from zero for all k = 0, 1, 2, ...

5 Boundedness of the sequences

In this section we will prove that the sequences generated by the iterative method are uniformly bounded on the level set Ω(ε). Given ε ≥ 0, let us define the level set

    Ω(ε) = { w :  ε ≤ Φ(w) ≤ Φ(w^0);  min_{i=1,...,n_I} ( S Λ_I e )_i ≥ ( τ_1 / 2 ) s^T λ_I / n_I;  s^T λ_I ≥ ( τ_2 / 2 ) ‖G(w)‖ }        (56)

where, for a given starting point w^0 = ( x^{0T}, λ_E^{0T}, λ_I^{0T}, s^{0T} )^T with λ_I^0 > 0 and s^0 > 0, we have the expression (29) for τ_1 and τ_2. In other words, the level set Ω(ε) is constituted by all the points at which the merit function has a value less than or equal to the one computed at the initial point and at which the functions ϕ_1 and ϕ_2 are nonnegative (i.e. points that satisfy the centrality conditions). By the description in section 4, the iterates w^k are contained in Ω(0). Besides, in Ω(ε), where ε > 0, the inner products s^T λ_I are bounded above and bounded away from zero and all components of S Λ_I e are bounded above and bounded away from zero. Finally, in Ω(ε), the sequence {Φ(w^k)} is monotone nonincreasing; therefore, Φ(w^k) ≤ Φ(w^0) for all k = 1, 2, ...

Ω(ε) is a closed set. Indeed, let v̄ be an accumulation point of a sequence {v^k} with v^k ∈ Ω(ε). The continuity of Φ(w) implies that

    lim_k Φ(v^k) = Φ( lim_k v^k ) = Φ(v̄)

and since ε ≤ Φ(v^k) ≤ Φ(w^0) for all k, we have ε ≤ Φ(v̄) ≤ Φ(w^0). Analogously, we have

    lim_k min_{i=1,...,n_I} ( S^k Λ_I^k e )_i / ( s^{kT} λ_I^k / n_I ) = min_{i=1,...,n_I} ( S̄ Λ̄_I e )_i / ( s̄^T λ̄_I / n_I ) ≥ τ_1 / 2

and

    lim_k s^{kT} λ_I^k / ‖G(v^k)‖ = s̄^T λ̄_I / ‖G(v̄)‖ ≥ τ_2 / 2

thus, v̄ is a point of Ω(ε).
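Membership in the level set (56) is easy to check numerically. The helper below assumes the first three blocks G of H(w) and the vectors s, λ_I are available separately, and uses made-up starting data to define τ_1, τ_2 and Φ(w^0) as in (29); the 1/2 factors reflect the choice γ_k ≥ 1/2.

```python
import numpy as np

def in_level_set(G, s, lam_I, tau1, tau2, phi0, eps):
    """Membership test for Omega(eps) of (56); G is the first three blocks of H(w)."""
    H = np.concatenate([G, s * lam_I])
    phi = H @ H                                            # Phi(w) = ||H(w)||^2
    c1 = eps <= phi <= phi0
    c2 = np.min(s * lam_I) >= 0.5 * tau1 * (s @ lam_I) / s.size
    c3 = s @ lam_I >= 0.5 * tau2 * np.linalg.norm(G)
    return c1 and c2 and c3

# made-up starting data defining tau1, tau2 and Phi(w^0)
s0, lam0, G0 = np.array([1.0, 0.9]), np.array([0.8, 1.1]), np.array([0.4, -0.3])
tau1 = np.min(s0 * lam0) / ((s0 @ lam0) / s0.size)
tau2 = (s0 @ lam0) / np.linalg.norm(G0)
H0 = np.concatenate([G0, s0 * lam0])
print(in_level_set(G0, s0, lam0, tau1, tau2, phi0=H0 @ H0, eps=1e-8))   # the starting point: True
```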

We will make the following standard assumptions on the Karush-Kuhn-Tucker systems:

A1. The functions f(x), g_E(x), g_I(x) are twice continuously differentiable in Ω(0); the vectors { ∇g_{E,1}(x), ..., ∇g_{E,n_E}(x) } are linearly independent; the matrix G'(w) is Lipschitz continuous with constant Γ, i.e.

    ‖G'(w) − G'(v)‖ ≤ Γ ‖w − v‖

A2. The iteration sequences {x^k}, {λ_E^k} and {λ_I^k} are bounded in Ω(ε), where ε > 0.

A3. In any compact subset of Ω(0), the matrix H'(w) is nonsingular; or, if we consider the condensed form (12) of system (10), it is sufficient that in any compact subset of Ω(0) where s is bounded away from zero, the matrix ∇²L(x, λ_E, λ_I) + ∇g_I(x) S^{-1} Λ_I ∇g_I(x)^T is positive definite on the null space of ∇g_E(x)^T (i.e. N(∇g_E(x)^T) = { z : ∇g_E(x)^T z = 0 }).

The boundedness of the sequence {x^k} can be assured by enforcing box constraints −l_i ≤ x_i ≤ l_i for sufficiently large l_i > 0, i = 1, ..., n. The boundedness of the sequences of the multipliers {λ_E^k} and {λ_I^k} is of fundamental importance for choosing a good starting point for the generation of a sequence {w^k} converging to w* (see [5]). The condition A3 for the system (10) in the condensed form is also the one required for the local sequential quadratic programming method (see [11, p. 531]).

Theorem 5. Suppose that assumptions A1, A2 and A3 hold. If w^k ∈ Ω(ε), ε > 0, then

(a) the iteration sequence {w^k} is bounded above;
(b) the sequence {(s^k, λ_I^k)} is componentwise bounded away from zero;
(c) the sequence of matrices {H'(w^k)^{-1}} is bounded;
(d) the sequence {Δw^k} is bounded.

Proof. Since the {x^k} are points of a compact set (Ω(ε) is a closed set and −l_i ≤ x_i ≤ l_i), and g_I(x) is a continuous function on this set, the sequence {‖g_I(x^k)‖} is bounded above, say, by M_0. Therefore, it follows from the definition of Ω(ε), ε > 0, and from the fact that {‖H(w^k)‖} is monotonically nonincreasing, that

    ‖s^k‖ = ‖s^k − g_I(x^k) + g_I(x^k)‖ ≤ ‖s^k − g_I(x^k)‖ + ‖g_I(x^k)‖ ≤ √Φ(w^0) + M_0

This proves that {s^k} is bounded above. Therefore, the boundedness of {x^k}, {λ_E^k} and {λ_I^k} (assumption A2) and the boundedness of {s^k} imply that the sequence {w^k} is bounded above. This proves proposition (a) of the theorem.

Now, in Ω(ε), ε > 0, the sequences {s_i^k λ_{I,i}^k}, i = 1, ..., n_I, are all bounded away from zero. Hence, all components λ_{I,i}^k of the vectors {λ_I^k} are bounded away from zero, because the vectors {s^k} are bounded above (each component s_i^k is bounded above).

Besides, {s^k} is bounded away from zero, because {λ_I^k} is bounded above by assumption A2. This proves proposition (b) of the theorem.

Rearranging the order of rows and columns of H'(w^k), we have the partition

    Q = [ Q_1   Q_3
          Q_4   Q_2 ]

where (for simplicity we omit the arguments and the iteration index in the matrix H'(w^k))

    Q_1 = [ Λ_I   S           Q_3 = [ 0   0             Q_4 = [ 0    0             Q_2 = [ 0       ∇g_E^T
            −I    0 ] ,               0   ∇g_I^T ] ,            0   −∇g_I ] ,               −∇g_E   ∇²L ]

From assumptions A1 and A2, the elements of the matrices ∇g_E, ∇g_I, F(x), G_{E,i}(x) (i = 1, ..., n_E) and G_{I,i}(x) (i = 1, ..., n_I) are continuous functions of x, where x is a point of a compact set; besides, the multiplier vectors λ_E and λ_I are bounded above in Ω(ε), ε > 0. Thus, the matrices ∇g_E, ∇g_I and ∇²L are uniformly bounded in Ω(ε), ε > 0. We have

    Q_1^{-1} = [ 0        −I
                 S^{-1}    S^{-1} Λ_I ]

Proposition (b) assures that this matrix exists in Ω(ε), ε > 0, and is uniformly bounded. We consider the matrix

    Q_2 − Q_4 Q_1^{-1} Q_3 = [ 0       ∇g_E^T                                ] = [ 0      P_1^T
                              −∇g_E    ∇²L + ∇g_I S^{-1} Λ_I ∇g_I^T ]            −P_1    P_2   ]

with P_1 = ∇g_E(x) and P_2 = ∇²L + ∇g_I S^{-1} Λ_I ∇g_I^T. By assumption A3, the symmetric matrix P_2 is positive definite on the null space of P_1^T; by assumption A1, the matrix P_1^T is a full row rank matrix. Thus, the matrix Q_2 − Q_4 Q_1^{-1} Q_3 is invertible [10, p. 44] and its inverse is denoted by Q_0. In case P_2 is nonsingular (for instance, positive definite over the whole space), the inverse matrix Q_0 = ( Q_2 − Q_4 Q_1^{-1} Q_3 )^{-1} has the following expression

    Q_0 = [ ( P_1^T P_2^{-1} P_1 )^{-1}                  −( P_1^T P_2^{-1} P_1 )^{-1} P_1^T P_2^{-1}
            P_2^{-1} P_1 ( P_1^T P_2^{-1} P_1 )^{-1}      P_2^{-1} − P_2^{-1} P_1 ( P_1^T P_2^{-1} P_1 )^{-1} P_1^T P_2^{-1} ]

By assumptions A1, A2 and propositions (a), (b), the norm of this matrix is uniformly bounded in Ω(ε), ε > 0. The inverse of the matrix Q, by the lemma on the inversion of a matrix by partitioning, has the form

    Q^{-1} = [ Q_1^{-1} + Q_1^{-1} Q_3 Q_0 Q_4 Q_1^{-1}    −Q_1^{-1} Q_3 Q_0
               −Q_0 Q_4 Q_1^{-1}                            Q_0            ]

This matrix is bounded, since every submatrix involved is bounded; this implies that {H'(w^k)^{-1}} is uniformly bounded in Ω(ε), ε > 0, and it proves proposition (c) of the theorem.

Since {H'(w^k)^{-1}} is bounded in Ω(ε), ε > 0, i.e. ‖H'(w^k)^{-1}‖ ≤ M for w^k ∈ Ω(ε), ε > 0, and for all k ≥ 0, with M a positive scalar, it is easy to prove that the sequence {Δw^k} is bounded. Indeed, by (22), Δw^k has the form

    Δw^k = H'(w^k)^{-1} ( −H(w^k) + σ_k μ_k ẽ + r^k )

From (13), (23) and σ_k + δ_k ≤ σ_max + δ_max < 1, in Ω(ε), ε > 0, we have

    ‖Δw^k‖ ≤ M ( 1 + σ_k + δ_k ) ‖H(w^k)‖ ≤ M ( 1 + σ_max + δ_max ) ‖H(w^0)‖ ≤ 2M ‖H(w^0)‖

Thus, if w^k ∈ Ω(ε), ε > 0, the sequence {Δw^k} is bounded. This proves the theorem.

Now we need to prove that the numbers α_k^{(1)} and α_k^{(2)} determined by theorems 3 and 4 are bounded away from zero for any k = 0, 1, ..., i.e. that they are uniformly bounded away from zero.

Theorem 6. Suppose that assumptions A1, A2 and A3 hold. If {w^k} ⊂ Ω(ε) with ε > 0, δ_k ∈ [0, δ_max] ⊂ [0,1), σ_k ∈ [σ_min, σ_max] ⊂ (0,1), with σ_max + δ_max < 1 and

    σ_k > max { [ ( √n_I + τ_1 γ_k ) / ( 1 − τ_1 γ_k ) ] δ_k ,  [ ( √n_I + τ_2 γ_k ) / √n_I ] δ_k }        (57)

then the sequence α̃_k = min{α_k^{(1)}, α_k^{(2)}, 1} is bounded away from zero, i.e. lim inf α̃_k > 0.

Proof. Since α̃_k = min{α_k^{(1)}, α_k^{(2)}, 1}, it is sufficient that {α_k^{(1)}} and {α_k^{(2)}} are bounded away from zero. From its definition (see formula (27)), α_k^{(1)} is the largest number in [0,1] such that (30) holds; since {w^k} ⊂ Ω(ε), we have ϕ_1^{(k)}(0) ≥ ϕ_1^{(k−1)}(α_{k−1}) ≥ 0. Thus, theorem 3 assures that there exists a positive number ᾱ_k^{(1)} ≤ α_k^{(1)} such that ϕ_1^{(k)}(α) ≥ 0 for α ≤ ᾱ_k^{(1)}. This number ᾱ_k^{(1)} has the expression (38) or, in the case r_4^k = 0 (when the system (10) is solved after rewriting it in the reduced form (11) or in the condensed form (12)), ᾱ_k^{(1)} has the expression (39). From the definition of Ω(ε), ε > 0, γ_k < 1 and the boundedness of Δw^k (see theorem 5), there exists a positive constant M_1 such that

    L_{1,i}^k = | Δs_i^k Δλ_{I,i}^k | + ( γ_k τ_1 / n_I ) | Δs^{kT} Δλ_I^k | ≤ M_1                            (58)

for all i = 1, ..., n_I and all k = 0, 1, ... Since μ_k ∈ [μ_k^{(1)}, μ_k^{(2)}], where μ_k^{(1)} and μ_k^{(2)} are given by (18) and (19) respectively, and s^{kT} λ_I^k = n_I μ_k^{(1)} is bounded away from zero for all k = 0, 1, ... when w^k ∈ Ω(ε), ε > 0 (see formulas (52) and (55)), we can determine a positive number ᾱ^{(1)} such that

    ᾱ_k^{(1)} ≥ ᾱ^{(1)}                                                                                       (59)

where

    ᾱ^{(1)} = [ ( 1 − τ_1 γ_k ) σ_k − ( √n_I + γ_k τ_1 ) δ_k ] μ_k / M_1

when ᾱ_k^{(1)} has the expression (38), or

    ᾱ^{(1)} = ( 1 − τ_1 γ_k ) σ_k μ_k / M_1

when ᾱ_k^{(1)} has the expression (39). That is, ᾱ_k^{(1)} is uniformly bounded away from zero in Ω(ε), ε > 0, when the condition (32) holds at each iteration. This proves that the sequence {α_k^{(1)}} is bounded away from zero in Ω(ε), ε > 0.

From its definition (see formula (28)), α_k^{(2)} is the largest number in [0,1] such that (31) holds. We assume that G'(w) is Lipschitz continuous with Lipschitz constant Γ (assumption A1). Since {w^k} ⊂ Ω(ε), we have ϕ_2^{(k)}(0) ≥ ϕ_2^{(k−1)}(α_{k−1}) ≥ 0. Thus, theorem 4 assures that there exists a positive number ᾱ_k^{(2)} ≤ α_k^{(2)} such that ϕ_2^{(k)}(α) ≥ 0 for α ≤ ᾱ_k^{(2)}. This number has the expression (43). From the definition of Ω(ε), ε > 0, γ_k < 1 and the boundedness of Δw^k (see theorem 5), there exists a constant M_2 such that

    L_2^k = | Δs^{kT} Δλ_I^k | + γ_k τ_2 ( Γ / 2 ) ‖Δw^k‖² ≤ M_2                                              (60)

for all k = 0, 1, ... Since μ_k ∈ [μ_k^{(1)}, μ_k^{(2)}], and s^{kT} λ_I^k = n_I μ_k^{(1)} is bounded away from zero for all k = 0, 1, ... when w^k ∈ Ω(ε), ε > 0, we can determine a positive number ᾱ^{(2)} such that

    ᾱ_k^{(2)} ≥ ᾱ^{(2)}                                                                                       (61)

where

    ᾱ^{(2)} = [ n_I σ_k − ( n_I + γ_k τ_2 √n_I ) δ_k ] μ_k / M_2

which is uniformly bounded away from zero in Ω(ε), ε > 0, when condition (40) holds at each iteration. This proves that the sequence {α_k^{(2)}} is bounded away from zero in Ω(ε), ε > 0. When condition (57) holds, the sequences {α_k^{(1)}} and {α_k^{(2)}} are simultaneously bounded away from zero in Ω(ε), ε > 0. This completes the proof.

In order to show the convergence of the method we have to prove the following result.

Theorem 7. Suppose that assumptions A1, A2 and A3 hold. If {w^k} ⊂ Ω(ε) with ε > 0, δ_k ∈ [0, δ_max] ⊂ [0,1), σ_k ∈ [σ_min, σ_max] ⊂ (0,1), with σ_max + δ_max < 1, and σ_k satisfies condition (57), then there exists a parameter η_k = 1 − α̃_k ( 1 − ( σ_k + δ_k ) ), uniformly less than one, which satisfies inequality (46).

Proof. Using (22), (23) and (13), we have obtained (46) with η_k = 1 − α̃_k ( 1 − ( σ_k + δ_k ) ). From theorem 6 we have that α̃_k is bounded away from zero in Ω(ε), ε > 0, i.e. α̃_k ≥ α* > 0, where α* = min{ᾱ^{(1)}, ᾱ^{(2)}, 1}. Thus α̃_k ( 1 − ( σ_k + δ_k ) ) ≥ α* ( 1 − ( σ_max + δ_max ) ) > 0. Therefore, η_k ≤ 1 − α* ( 1 − ( σ_max + δ_max ) ) ≡ η* < 1. This proves the theorem.

6 Inexact Newton methods for KKT systems

The general framework for the proposed inexact Newton method can be stated as follows.

    Set w^0 = ( x^{0T}, λ_E^{0T}, λ_I^{0T}, s^{0T} )^T such that s^0 > 0, λ_I^0 > 0; θ ∈ (0,1); β ∈ (0, 1/2]; γ_{−1} ∈ [1/2, 1); Φ(w^0) = ‖H(w^0)‖²
    For k = 0, 1, ... until Φ(w^k) ≤ ε_exit
        for some μ_k ∈ [μ_k^{(1)}, μ_k^{(2)}] and σ_k ∈ [σ_min, σ_max] ⊂ (0,1), determine an approximate solution Δw^k of
            H'(w^k) Δw^k = −H(w^k) + σ_k μ_k ẽ
        such that
            ‖r^k‖ ≤ δ_k √n_I μ_k
        where  r^k = H'(w^k) Δw^k + H(w^k) − σ_k μ_k ẽ;  δ_k ∈ [0, δ_max] ⊂ [0,1);  σ_max + δ_max < 1;  and
            σ_k > max { [ ( √n_I + τ_1 γ_k ) / ( 1 − τ_1 γ_k ) ] δ_k ,  [ ( √n_I + τ_2 γ_k ) / √n_I ] δ_k }
        choose γ_k ≤ γ_{k−1};
        α̃_k = min{α_k^{(1)}, α_k^{(2)}, 1}  with  α_k^{(1)} s.t. ϕ_1^{(k)}(α) ≥ 0 for all α ∈ (0, α_k^{(1)}]  and  α_k^{(2)} s.t. ϕ_2^{(k)}(α) ≥ 0 for all α ∈ (0, α_k^{(2)}]
        Set α_k = θ^{t_k} α̃_k, with t_k the smallest nonnegative integer such that
            ‖H(w^k + α_k Δw^k)‖ ≤ [ 1 − β α_k ( 1 − ( σ_k + δ_k ) ) ] ‖H(w^k)‖
        w^{k+1} = w^k + α_k Δw^k

The choice δ_k = 0 implies that the system of linear equations H'(w^k) Δw^k = −H(w^k) + σ_k μ_k ẽ is solved exactly, for instance by using a direct method. For an estimate of α_k^{(1)} see formula (38), or (39) if the system of linear equations above is transformed into the reduced form (11) or into the condensed form (12); for an estimate of α_k^{(2)} see formula (43). In practice, to compute α̃_k, we set

the initial value of α equal to 1; then we reduce it in order to guarantee the feasibility of the new iterate (i.e. s^k(α) > 0 and λ_I^k(α) > 0) using the rule

    α = min { min_{Δs_i^k < 0} ( −s_i^k / Δs_i^k ),  min_{Δλ_{I,i}^k < 0} ( −λ_{I,i}^k / Δλ_{I,i}^k ),  1 }

and we eventually reduce it again by a constant factor until the centrality conditions are satisfied (i.e. ϕ_1^{(k)}(α) ≥ 0 and ϕ_2^{(k)}(α) ≥ 0); the value so obtained is α̃_k.
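The practical recipe just described (start from the largest step allowed by the nonnegativity ratios, then shrink until the centrality conditions hold) might be coded as follows; the shrink factor 0.5, the iteration cap and the trivial centrality stand-ins in the usage example are arbitrary choices of this sketch, not values fixed by the report.

```python
import numpy as np

def initial_step_length(s, lam, ds, dlam, phi1, phi2, shrink=0.5, max_shrinks=30):
    """Step from the fraction-to-the-boundary ratios, reduced until phi1, phi2 >= 0."""
    alpha = 1.0
    for v, dv in ((s, ds), (lam, dlam)):
        neg = dv < 0
        if np.any(neg):
            alpha = min(alpha, np.min(-v[neg] / dv[neg]))   # boundary ratio for this block
    for _ in range(max_shrinks):                            # enforce the centrality conditions
        if phi1(alpha) >= 0 and phi2(alpha) >= 0:
            return alpha
        alpha *= shrink
    return alpha

# toy usage with trivially nonnegative centrality functions (hypothetical)
s, lam = np.array([1.0, 0.5]), np.array([0.8, 0.4])
ds, dlam = np.array([-2.0, 0.1]), np.array([0.3, -0.8])
print(initial_step_length(s, lam, ds, dlam, lambda a: 1.0, lambda a: 1.0))
# here the result is 0.5, the smallest of the ratios -s_i/ds_i and -lam_i/dlam_i
```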

7 Convergence Analysis

The convergence of the inexact Newton method for KKT systems is a direct consequence of the theory of the basic inexact Newton method. We begin by reviewing the fundamental convergence theorem of the basic inexact Newton method. The inexact Newton method in its basic form, for solving the nonlinear system H(v) = 0, may be defined formally as follows [2]:

    Let v^0 be given
    For k = 0, 1, ... until convergence do
        Find some η_k ∈ [0, 1) and Δv^k that satisfy
            ‖H(v^k) + H'(v^k) Δv^k‖ ≤ η_k ‖H(v^k)‖
        Set v^{k+1} = v^k + Δv^k

Here H'(v) denotes the Jacobian matrix of H(v) and η_k is the forcing term. We have the following fundamental convergence theorem [6], [12]:

Given a C¹ map H : R^N → R^N, let {v^k} be any sequence such that for all k ≥ 0

    ‖H'(v^k) Δv^k + H(v^k)‖ ≤ η_k ‖H(v^k)‖,    0 < η_k ≤ η̄ < 1
    ‖H(v^{k+1})‖ ≤ λ_k ‖H(v^k)‖,                0 < λ_k ≤ λ̄ < 1
    v^{k+1} = v^k + Δv^k

If the sequence has a limit point v* where H'(v*) is invertible, then lim v^k = v* and H(v*) = 0.

We can guarantee the validity of the two conditions at each step by introducing a backtracking technique. For example, with the minimum reduction procedure [6] we can reduce the step Δv^k so as to satisfy both conditions if and only if the first one already holds for the initial step. With these results we can study the convergence of the inexact Newton method for KKT systems described in the previous sections. Theorem 7 assures that the step α̃_k Δw^k, satisfying inequality (46), is the initial step for the minimum reduction procedure. This step guarantees the validity of the first condition in the fundamental convergence theorem. The general framework of the minimum reduction procedure can be stated as follows.

Algorithm NIB (Inexact Newton Backtracking method)

    Set η_k = 1 − α̃_k ( 1 − ( σ_k + δ_k ) );  t = 0;  α = α̃_k;  η = η_k
    while ‖H(w^k + α Δw^k)‖ > [ 1 − β ( 1 − η ) ] ‖H(w^k)‖ and t ≤ t_max
        update  α = θ α;  η = 1 − θ ( 1 − η );  t = t + 1
    end while
    If t > t_max then return: fail
    Denote by α_k the last value of α
    Update: w^{k+1} = w^k + α_k Δw^k

Here θ, β ∈ (0,1) and t_max is the maximum number of trial steps in the while loop; furthermore, we denote by η̂_k the last value of η, and we have a formula for η̂_k analogous to formula (47) for α_k:

    η̂_k = 1 − θ^{t_k} ( 1 − η_k ) = 1 − α_k ( 1 − ( σ_k + δ_k ) )

where η_k is the initial value, which satisfies (46). We note that η̂_k ≤ η̆ < 1 since α_k ≥ ᾰ > 0 (see below). The while loop is executed only when the condition (46) is satisfied. At the end of the while loop we have inequality (48), if the NIB algorithm does not fail.

The while loop implementing the minimum reduction procedure terminates in a finite number of steps. Indeed, if the NIB algorithm does not fail (t ≤ t_max), we can find a nonnegative integer t such that α_k = θ^t α̃_k and

    ‖H(w^k + θ^t α̃_k Δw^k)‖ ≤ [ 1 − β θ^t α̃_k ( 1 − ( σ_k + δ_k ) ) ] ‖H(w^k)‖                 (62)

From (34), and then from (41) and (22), we have, for α ∈ [0,1] and for i = 1, ..., n_I,

    ( s_i^k + α Δs_i^k ) ( λ_{I,i}^k + α Δλ_{I,i}^k ) = s_i^k λ_{I,i}^k + α ( s_i^k Δλ_{I,i}^k + λ_{I,i}^k Δs_i^k ) + α² Δs_i^k Δλ_{I,i}^k
        = ( 1 − α ) s_i^k λ_{I,i}^k + α ( σ_k μ_k + r_{4,i}^k ) + α² Δs_i^k Δλ_{I,i}^k

and

    G(w^k + α Δw^k) = ( 1 − α ) G(w^k) + α r̂^k + ∫_0^1 [ G'(w^k + ξα Δw^k) − G'(w^k) ] dξ  α Δw^k

We can therefore write

    H(w^k + α Δw^k) = ( G(w^k + α Δw^k) ; S^k(α) Λ_I^k(α) e )
      = ( 1 − α ) H(w^k) + α ( r̂^k ; r_4^k + σ_k μ_k e ) + ( ∫_0^1 [ G'(w^k + ξα Δw^k) − G'(w^k) ] dξ  α Δw^k ; α² ΔS^k ΔΛ_I^k e )

Thus

    ‖H(w^k + α Δw^k)‖ ≤ ( 1 − α ) ‖H(w^k)‖ + α ( ‖r^k‖ + σ_k μ_k ‖e‖ ) + α ‖ ∫_0^1 [ G'(w^k + ξα Δw^k) − G'(w^k) ] dξ ‖ ‖Δw^k‖ + α² ‖ΔS^k ΔΛ_I^k e‖

From the Lipschitz continuity of the derivative of G(w) (assumption A1), we obtain

    ‖ ∫_0^1 [ G'(w^k + ξα Δw^k) − G'(w^k) ] dξ ‖ ≤ ( 1/2 ) Γ α ‖Δw^k‖

Using (23) and (13), we have

    ‖H(w^k + α Δw^k)‖ ≤ ( 1 − α ) ‖H(w^k)‖ + α ( σ_k + δ_k ) ‖H(w^k)‖ + α² ( M_3 + ( Γ / 2 ) ‖Δw^k‖² )

where M_3 is a positive number such that ‖ΔS^k ΔΛ_I^k e‖ ≤ M_3 in Ω(ε), ε > 0. Therefore, we can affirm that

    [ 1 − β α ( 1 − ( σ_k + δ_k ) ) ] ‖H(w^k)‖ − ‖H(w^k + α Δw^k)‖ ≥ ( 1 − β ) α ( 1 − ( σ_k + δ_k ) ) ‖H(w^k)‖ − α² ( M_3 + ( Γ / 2 ) ‖Δw^k‖² )        (63)

is nonnegative for α ∈ (0, α̂_k] with

    α̂_k = ( 1 − β ) ( 1 − ( σ_k + δ_k ) ) ‖H(w^k)‖ / ( M_3 + ( Γ / 2 ) ‖Δw^k‖² ) > 0

Since α̂_k is bounded away from zero in Ω(ε), ε > 0, it is possible to find a nonnegative integer t̂ such that 0 < θ^{t̂} α̃_k ≤ min{α̂_k, 1}: α_k = θ^{t̂} α̃_k. Thus, for t_k = t̂ (i.e. α_k = θ^{t_k} α̃_k in (63)), the inequality (62) is satisfied. This proves that the while loop terminates in a finite number t_k of steps and that α_k is bounded below by a strictly positive number, say ᾰ. Set ᾱ* = min{α*, ᾰ}, with α* defined in the proof of theorem 7.
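A literal transcription of the NIB loop into Python is straightforward; the function norm_H below is a placeholder for the evaluation of ‖H(·)‖ for the user's problem, and the scalar test equation in the usage example is a made-up illustration.

```python
import numpy as np

def nib(w, dw, norm_H, alpha0, eta0, beta=1e-4, theta=0.5, t_max=30):
    """Minimum reduction procedure: shrink alpha (and the forcing term)
       until ||H(w + alpha*dw)|| <= (1 - beta*(1 - eta))*||H(w)||, cf. (48)."""
    alpha, eta, h0 = alpha0, eta0, norm_H(w)
    for t in range(t_max + 1):
        if norm_H(w + alpha * dw) <= (1.0 - beta * (1.0 - eta)) * h0:
            return alpha, eta, w + alpha * dw
        alpha, eta = theta * alpha, 1.0 - theta * (1.0 - eta)
    raise RuntimeError("NIB: backtracking failed within t_max trial steps")

# toy demonstration on the scalar equation H(w) = w^2 - 1 (hypothetical), Newton step from w = 3
norm_H = lambda w: abs(w[0] ** 2 - 1.0)
w, dw = np.array([3.0]), np.array([-(3.0 ** 2 - 1.0) / (2 * 3.0)])
alpha_tilde, sigma_plus_delta = 1.0, 0.0                  # exact, unperturbed solve in this toy case
eta0 = 1.0 - alpha_tilde * (1.0 - sigma_plus_delta)
print(nib(w, dw, norm_H, alpha_tilde, eta0))              # accepts the full step here
```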

Theorem 8. Suppose that assumptions A1, A2 and A3 hold. Suppose that σ_k ∈ [σ_min, σ_max] ⊂ (0,1), δ_k ∈ [0, δ_max] ⊂ [0,1), σ_max + δ_max < 1 and condition (57) holds. Let {w^k} ⊂ Ω(ε), ε > 0, be a sequence generated by the method with ε_exit = 0. Then,

(a) if w* is a limit point of {w^k} such that H'(w*) is invertible, the sequence {‖H(w^k)‖} converges to zero and lim w^k = w*;
(b) the sequence {‖H(w^k)‖} converges to zero and each limit point of {w^k} satisfies the KKT conditions (3) for problem (1).

Proof. Since w^k ∈ Ω(ε) with ε > 0, we have H(w^k) ≠ 0 and, by theorem 5, the matrix H'(w^k) is invertible. Therefore, the inexact Newton method for KKT systems is well defined and determines a new point at each iteration. From theorem 7, there exists a parameter η_k such that (46) holds with α̃_k ≥ α* > 0 and η_k = 1 − α̃_k ( 1 − ( σ_k + δ_k ) ) ≤ η* < 1, with η* = 1 − α* ( 1 − ( σ_max + δ_max ) ). So, the inexact Newton method for KKT systems can be viewed as a basic inexact Newton method with a minimum reduction procedure (NIB), whose initial step α̃_k Δw^k satisfies inequality (46); that is, the first condition of the fundamental convergence theorem is satisfied with forcing term η_k. If the NIB procedure does not fail, at the end of the while loop inequality (48) holds, and we have obtained the damped Newton step α_k Δw^k; then the second condition is satisfied with λ_k = 1 − β α_k ( 1 − ( σ_k + δ_k ) ) ≤ 1 − β ᾱ* ( 1 − ( σ_max + δ_max ) ) ≡ λ̄ < 1. The step α_k Δw^k also satisfies the first condition with forcing term η̂_k = 1 − α_k ( 1 − ( σ_k + δ_k ) ) ≤ 1 − ᾱ* ( 1 − ( σ_max + δ_max ) ) ≡ η̄ < 1. Indeed, proceeding as for (46), using (22), (23) and (13), we have

    ‖H'(w^k) α_k Δw^k + H(w^k)‖ = ‖ α_k ( −H(w^k) + σ_k μ_k ẽ + r^k ) + H(w^k) ‖
      ≤ ( 1 − α_k ) ‖H(w^k)‖ + α_k ( σ_k μ_k ‖ẽ‖ + ‖r^k‖ )
      ≤ [ 1 − α_k ( 1 − ( σ_k + δ_k ) ) ] ‖H(w^k)‖ = η̂_k ‖H(w^k)‖

and since α_k ≥ ᾱ* > 0, then η̂_k ≤ η̄ < 1. Thus, the two conditions of the fundamental theorem of the inexact Newton method in its basic form are satisfied for the step α_k Δw^k. Hence, part (a) of the theorem follows.

We have, from (49), that ‖H(w^k)‖ ≤ ‖H(w^0)‖ for all k, and the relation (49) implies that the sequence {‖H(w^k)‖} is monotone nonincreasing. Hence, this sequence has a limit H* ∈ R. If H* = 0, we have proved part (b) of the theorem. Suppose, by contradiction, that H* > 0; then the sequence {w^k} and its limit points belong to Ω(ε) with ε = H*² > 0. If w̄ is one of these limit points, H'(w̄) is a nonsingular matrix by theorem 5; then, from part (a) of this theorem, we deduce that H(w̄) = 0. This contradicts our assumption that H* > 0. Hence, the sequence {‖H(w^k)‖} must converge to zero. Moreover, any limit point w̄ satisfies H(w̄) = 0 and s̄ ≥ 0, λ̄_I ≥ 0, i.e. w̄ satisfies the KKT conditions (3) for problem (1). This proves part (b) of the theorem.

References

[1] Bellavia S.: Inexact interior point method, J. Optim. Theory Appl. 96 (1998).
[2] Dembo R.S., Eisenstat S.C., Steihaug T.: Inexact Newton methods, SIAM J. Numer. Anal. 19 (1982).
[3] Dennis J.E. Jr., Schnabel R.B.: Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice Hall, Englewood Cliffs NJ, 1983.
[4] Durazzi C.: On the Newton interior point method for nonlinear programming problems, J. Optim. Theory Appl. 104 (2000).
[5] Durazzi C., Ruggiero V.: Global convergence of the Newton interior point method for nonlinear programming, J. Optim. Theory Appl.
[6] Eisenstat S.C., Walker H.F.: Globally convergent inexact Newton methods, SIAM J. Optim. 4 (1994).
[7] El-Bakry A.S., Tapia R.A., Tsuchiya T., Zhang Y.: On the formulation and theory of the Newton interior point method for nonlinear programming, J. Optim. Theory Appl. 89 (1996).
[8] Faddeev D.K., Faddeeva V.N.: Computational Methods of Linear Algebra, W.H. Freeman & Co., San Francisco, 1963.
[9] Horn R.A., Johnson C.R.: Matrix Analysis, Cambridge University Press, Cambridge, 1985.
[10] Luenberger D.G.: Linear and Nonlinear Programming, 2nd edition, Addison-Wesley, Reading MA, 1984.
[11] Nocedal J., Wright S.J.: Numerical Optimization, Springer, New York, 1999.
[12] Rheinboldt W.C.: Methods for Solving Systems of Nonlinear Equations, 2nd edition, SIAM, Philadelphia, 1998.


More information

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method Optimization Methods and Software Vol. 00, No. 00, Month 200x, 1 11 On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method ROMAN A. POLYAK Department of SEOR and Mathematical

More information

A SUFFICIENTLY EXACT INEXACT NEWTON STEP BASED ON REUSING MATRIX INFORMATION

A SUFFICIENTLY EXACT INEXACT NEWTON STEP BASED ON REUSING MATRIX INFORMATION A SUFFICIENTLY EXACT INEXACT NEWTON STEP BASED ON REUSING MATRIX INFORMATION Anders FORSGREN Technical Report TRITA-MAT-2009-OS7 Department of Mathematics Royal Institute of Technology November 2009 Abstract

More information

A Simple Primal-Dual Feasible Interior-Point Method for Nonlinear Programming with Monotone Descent

A Simple Primal-Dual Feasible Interior-Point Method for Nonlinear Programming with Monotone Descent A Simple Primal-Dual Feasible Interior-Point Method for Nonlinear Programming with Monotone Descent Sasan Bahtiari André L. Tits Department of Electrical and Computer Engineering and Institute for Systems

More information

Step lengths in BFGS method for monotone gradients

Step lengths in BFGS method for monotone gradients Noname manuscript No. (will be inserted by the editor) Step lengths in BFGS method for monotone gradients Yunda Dong Received: date / Accepted: date Abstract In this paper, we consider how to directly

More information

On Generalized Primal-Dual Interior-Point Methods with Non-uniform Complementarity Perturbations for Quadratic Programming

On Generalized Primal-Dual Interior-Point Methods with Non-uniform Complementarity Perturbations for Quadratic Programming On Generalized Primal-Dual Interior-Point Methods with Non-uniform Complementarity Perturbations for Quadratic Programming Altuğ Bitlislioğlu and Colin N. Jones Abstract This technical note discusses convergence

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 6 Optimization Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 6 Optimization Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted

More information

A note on the iterative solution of weakly nonlinear elliptic control problems with control and state constraints

A note on the iterative solution of weakly nonlinear elliptic control problems with control and state constraints Quaderni del Dipartimento di Matematica, Università di Modena e Reggio Emilia, n. 79, October 7. A note on the iterative solution of weakly nonlinear elliptic control problems with control and state constraints

More information

Lecture: Algorithms for LP, SOCP and SDP

Lecture: Algorithms for LP, SOCP and SDP 1/53 Lecture: Algorithms for LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html wenzw@pku.edu.cn Acknowledgement:

More information

Linear algebra issues in Interior Point methods for bound-constrained least-squares problems

Linear algebra issues in Interior Point methods for bound-constrained least-squares problems Linear algebra issues in Interior Point methods for bound-constrained least-squares problems Stefania Bellavia Dipartimento di Energetica S. Stecco Università degli Studi di Firenze Joint work with Jacek

More information

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods AM 205: lecture 19 Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods Optimality Conditions: Equality Constrained Case As another example of equality

More information

A derivative-free nonmonotone line search and its application to the spectral residual method

A derivative-free nonmonotone line search and its application to the spectral residual method IMA Journal of Numerical Analysis (2009) 29, 814 825 doi:10.1093/imanum/drn019 Advance Access publication on November 14, 2008 A derivative-free nonmonotone line search and its application to the spectral

More information

Search Directions for Unconstrained Optimization

Search Directions for Unconstrained Optimization 8 CHAPTER 8 Search Directions for Unconstrained Optimization In this chapter we study the choice of search directions used in our basic updating scheme x +1 = x + t d. for solving P min f(x). x R n All

More information

The Steepest Descent Algorithm for Unconstrained Optimization

The Steepest Descent Algorithm for Unconstrained Optimization The Steepest Descent Algorithm for Unconstrained Optimization Robert M. Freund February, 2014 c 2014 Massachusetts Institute of Technology. All rights reserved. 1 1 Steepest Descent Algorithm The problem

More information

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30,

More information

Convergence Analysis of Inexact Infeasible Interior Point Method. for Linear Optimization

Convergence Analysis of Inexact Infeasible Interior Point Method. for Linear Optimization Convergence Analysis of Inexact Infeasible Interior Point Method for Linear Optimization Ghussoun Al-Jeiroudi Jacek Gondzio School of Mathematics The University of Edinburgh Mayfield Road, Edinburgh EH9

More information

A QP-FREE CONSTRAINED NEWTON-TYPE METHOD FOR VARIATIONAL INEQUALITY PROBLEMS. Christian Kanzow 1 and Hou-Duo Qi 2

A QP-FREE CONSTRAINED NEWTON-TYPE METHOD FOR VARIATIONAL INEQUALITY PROBLEMS. Christian Kanzow 1 and Hou-Duo Qi 2 A QP-FREE CONSTRAINED NEWTON-TYPE METHOD FOR VARIATIONAL INEQUALITY PROBLEMS Christian Kanzow 1 and Hou-Duo Qi 2 1 University of Hamburg Institute of Applied Mathematics Bundesstrasse 55, D-20146 Hamburg,

More information

Constrained Optimization Theory

Constrained Optimization Theory Constrained Optimization Theory Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. IMA, August 2016 Stephen Wright (UW-Madison) Constrained Optimization Theory IMA, August

More information

Numerical Optimization

Numerical Optimization Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

An improved convergence theorem for the Newton method under relaxed continuity assumptions

An improved convergence theorem for the Newton method under relaxed continuity assumptions An improved convergence theorem for the Newton method under relaxed continuity assumptions Andrei Dubin ITEP, 117218, BCheremushinsaya 25, Moscow, Russia Abstract In the framewor of the majorization technique,

More information

An interior-point gradient method for large-scale totally nonnegative least squares problems

An interior-point gradient method for large-scale totally nonnegative least squares problems An interior-point gradient method for large-scale totally nonnegative least squares problems Michael Merritt and Yin Zhang Technical Report TR04-08 Department of Computational and Applied Mathematics Rice

More information

Cubic regularization of Newton s method for convex problems with constraints

Cubic regularization of Newton s method for convex problems with constraints CORE DISCUSSION PAPER 006/39 Cubic regularization of Newton s method for convex problems with constraints Yu. Nesterov March 31, 006 Abstract In this paper we derive efficiency estimates of the regularized

More information

17 Solution of Nonlinear Systems

17 Solution of Nonlinear Systems 17 Solution of Nonlinear Systems We now discuss the solution of systems of nonlinear equations. An important ingredient will be the multivariate Taylor theorem. Theorem 17.1 Let D = {x 1, x 2,..., x m

More information

Newton-type Methods for Solving the Nonsmooth Equations with Finitely Many Maximum Functions

Newton-type Methods for Solving the Nonsmooth Equations with Finitely Many Maximum Functions 260 Journal of Advances in Applied Mathematics, Vol. 1, No. 4, October 2016 https://dx.doi.org/10.22606/jaam.2016.14006 Newton-type Methods for Solving the Nonsmooth Equations with Finitely Many Maximum

More information

MODIFYING SQP FOR DEGENERATE PROBLEMS

MODIFYING SQP FOR DEGENERATE PROBLEMS PREPRINT ANL/MCS-P699-1097, OCTOBER, 1997, (REVISED JUNE, 2000; MARCH, 2002), MATHEMATICS AND COMPUTER SCIENCE DIVISION, ARGONNE NATIONAL LABORATORY MODIFYING SQP FOR DEGENERATE PROBLEMS STEPHEN J. WRIGHT

More information

Nonlinear Programming

Nonlinear Programming Nonlinear Programming Kees Roos e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos LNMB Course De Uithof, Utrecht February 6 - May 8, A.D. 2006 Optimization Group 1 Outline for week

More information

Part 3: Trust-region methods for unconstrained optimization. Nick Gould (RAL)

Part 3: Trust-region methods for unconstrained optimization. Nick Gould (RAL) Part 3: Trust-region methods for unconstrained optimization Nick Gould (RAL) minimize x IR n f(x) MSc course on nonlinear optimization UNCONSTRAINED MINIMIZATION minimize x IR n f(x) where the objective

More information

j=1 r 1 x 1 x n. r m r j (x) r j r j (x) r j (x). r j x k

j=1 r 1 x 1 x n. r m r j (x) r j r j (x) r j (x). r j x k Maria Cameron Nonlinear Least Squares Problem The nonlinear least squares problem arises when one needs to find optimal set of parameters for a nonlinear model given a large set of data The variables x,,

More information

Computational Optimization. Constrained Optimization Part 2

Computational Optimization. Constrained Optimization Part 2 Computational Optimization Constrained Optimization Part Optimality Conditions Unconstrained Case X* is global min Conve f X* is local min SOSC f ( *) = SONC Easiest Problem Linear equality constraints

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods

More information

Outline. Scientific Computing: An Introductory Survey. Optimization. Optimization Problems. Examples: Optimization Problems

Outline. Scientific Computing: An Introductory Survey. Optimization. Optimization Problems. Examples: Optimization Problems Outline Scientific Computing: An Introductory Survey Chapter 6 Optimization 1 Prof. Michael. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS Igor V. Konnov Department of Applied Mathematics, Kazan University Kazan 420008, Russia Preprint, March 2002 ISBN 951-42-6687-0 AMS classification:

More information

Least Sparsity of p-norm based Optimization Problems with p > 1

Least Sparsity of p-norm based Optimization Problems with p > 1 Least Sparsity of p-norm based Optimization Problems with p > Jinglai Shen and Seyedahmad Mousavi Original version: July, 07; Revision: February, 08 Abstract Motivated by l p -optimization arising from

More information

LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION

LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION ROMAN A. POLYAK Abstract. We introduce the Lagrangian Transformation(LT) and develop a general LT method for convex optimization problems. A class Ψ of

More information

Gradient Descent. Dr. Xiaowei Huang

Gradient Descent. Dr. Xiaowei Huang Gradient Descent Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Three machine learning algorithms: decision tree learning k-nn linear regression only optimization objectives are discussed,

More information

Infeasibility Detection and an Inexact Active-Set Method for Large-Scale Nonlinear Optimization

Infeasibility Detection and an Inexact Active-Set Method for Large-Scale Nonlinear Optimization Infeasibility Detection and an Inexact Active-Set Method for Large-Scale Nonlinear Optimization Frank E. Curtis, Lehigh University involving joint work with James V. Burke, University of Washington Daniel

More information

Algorithms for nonlinear programming problems II

Algorithms for nonlinear programming problems II Algorithms for nonlinear programming problems II Martin Branda Charles University Faculty of Mathematics and Physics Department of Probability and Mathematical Statistics Computational Aspects of Optimization

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

Pacific Journal of Optimization (Vol. 2, No. 3, September 2006) ABSTRACT

Pacific Journal of Optimization (Vol. 2, No. 3, September 2006) ABSTRACT Pacific Journal of Optimization Vol., No. 3, September 006) PRIMAL ERROR BOUNDS BASED ON THE AUGMENTED LAGRANGIAN AND LAGRANGIAN RELAXATION ALGORITHMS A. F. Izmailov and M. V. Solodov ABSTRACT For a given

More information

Quasi-Newton Methods

Quasi-Newton Methods Quasi-Newton Methods Werner C. Rheinboldt These are excerpts of material relating to the boos [OR00 and [Rhe98 and of write-ups prepared for courses held at the University of Pittsburgh. Some further references

More information

Examination paper for TMA4180 Optimization I

Examination paper for TMA4180 Optimization I Department of Mathematical Sciences Examination paper for TMA4180 Optimization I Academic contact during examination: Phone: Examination date: 26th May 2016 Examination time (from to): 09:00 13:00 Permitted

More information

AM 205: lecture 19. Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods

AM 205: lecture 19. Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods AM 205: lecture 19 Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods Quasi-Newton Methods General form of quasi-newton methods: x k+1 = x k α

More information

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM H. E. Krogstad, IMF, Spring 2012 Karush-Kuhn-Tucker (KKT) Theorem is the most central theorem in constrained optimization, and since the proof is scattered

More information

An Inexact Newton Method for Optimization

An Inexact Newton Method for Optimization New York University Brown Applied Mathematics Seminar, February 10, 2009 Brief biography New York State College of William and Mary (B.S.) Northwestern University (M.S. & Ph.D.) Courant Institute (Postdoc)

More information

Optimization. A first course on mathematics for economists

Optimization. A first course on mathematics for economists Optimization. A first course on mathematics for economists Xavier Martinez-Giralt Universitat Autònoma de Barcelona xavier.martinez.giralt@uab.eu II.3 Static optimization - Non-Linear programming OPT p.1/45

More information

Research Article A Two-Step Matrix-Free Secant Method for Solving Large-Scale Systems of Nonlinear Equations

Research Article A Two-Step Matrix-Free Secant Method for Solving Large-Scale Systems of Nonlinear Equations Applied Mathematics Volume 2012, Article ID 348654, 9 pages doi:10.1155/2012/348654 Research Article A Two-Step Matrix-Free Secant Method for Solving Large-Scale Systems of Nonlinear Equations M. Y. Waziri,

More information

A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30, 2014 Abstract Regularized

More information

Nonmonotonic back-tracking trust region interior point algorithm for linear constrained optimization

Nonmonotonic back-tracking trust region interior point algorithm for linear constrained optimization Journal of Computational and Applied Mathematics 155 (2003) 285 305 www.elsevier.com/locate/cam Nonmonotonic bac-tracing trust region interior point algorithm for linear constrained optimization Detong

More information

An Inexact Sequential Quadratic Optimization Method for Nonlinear Optimization

An Inexact Sequential Quadratic Optimization Method for Nonlinear Optimization An Inexact Sequential Quadratic Optimization Method for Nonlinear Optimization Frank E. Curtis, Lehigh University involving joint work with Travis Johnson, Northwestern University Daniel P. Robinson, Johns

More information

1. Nonlinear Equations. This lecture note excerpted parts from Michael Heath and Max Gunzburger. f(x) = 0

1. Nonlinear Equations. This lecture note excerpted parts from Michael Heath and Max Gunzburger. f(x) = 0 Numerical Analysis 1 1. Nonlinear Equations This lecture note excerpted parts from Michael Heath and Max Gunzburger. Given function f, we seek value x for which where f : D R n R n is nonlinear. f(x) =

More information

PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION

PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION ROBERT MICHAEL LEWIS AND VIRGINIA TORCZON Abstract. We extend pattern search methods to linearly constrained minimization. We develop a general

More information

Key words. linear complementarity problem, non-interior-point algorithm, Tikhonov regularization, P 0 matrix, regularized central path

Key words. linear complementarity problem, non-interior-point algorithm, Tikhonov regularization, P 0 matrix, regularized central path A GLOBALLY AND LOCALLY SUPERLINEARLY CONVERGENT NON-INTERIOR-POINT ALGORITHM FOR P 0 LCPS YUN-BIN ZHAO AND DUAN LI Abstract Based on the concept of the regularized central path, a new non-interior-point

More information

A STABILIZED SQP METHOD: GLOBAL CONVERGENCE

A STABILIZED SQP METHOD: GLOBAL CONVERGENCE A STABILIZED SQP METHOD: GLOBAL CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-13-4 Revised July 18, 2014, June 23,

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

A SIMPLY CONSTRAINED OPTIMIZATION REFORMULATION OF KKT SYSTEMS ARISING FROM VARIATIONAL INEQUALITIES

A SIMPLY CONSTRAINED OPTIMIZATION REFORMULATION OF KKT SYSTEMS ARISING FROM VARIATIONAL INEQUALITIES A SIMPLY CONSTRAINED OPTIMIZATION REFORMULATION OF KKT SYSTEMS ARISING FROM VARIATIONAL INEQUALITIES Francisco Facchinei 1, Andreas Fischer 2, Christian Kanzow 3, and Ji-Ming Peng 4 1 Università di Roma

More information

Methods for Unconstrained Optimization Numerical Optimization Lectures 1-2

Methods for Unconstrained Optimization Numerical Optimization Lectures 1-2 Methods for Unconstrained Optimization Numerical Optimization Lectures 1-2 Coralia Cartis, University of Oxford INFOMM CDT: Modelling, Analysis and Computation of Continuous Real-World Problems Methods

More information

Optimal Newton-type methods for nonconvex smooth optimization problems

Optimal Newton-type methods for nonconvex smooth optimization problems Optimal Newton-type methods for nonconvex smooth optimization problems Coralia Cartis, Nicholas I. M. Gould and Philippe L. Toint June 9, 20 Abstract We consider a general class of second-order iterations

More information

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information