AN AUGMENTED LAGRANGIAN AFFINE SCALING METHOD FOR NONLINEAR PROGRAMMING


XIAO WANG AND HONGCHAO ZHANG

Abstract. In this paper, we propose an Augmented Lagrangian Affine Scaling (ALAS) algorithm for general nonlinear programming, in which a quadratic approximation to the augmented Lagrangian is minimized at each iteration. Different from classical sequential quadratic programming (SQP), the linearization of the nonlinear constraints is put into the penalty term of this quadratic approximation, which yields a smooth objective for the subproblem and avoids possible inconsistency between the linearized constraints and the trust region constraint. By applying affine scaling techniques to handle the strict bound constraints, an active-set type affine scaling trust region subproblem is proposed. Through special strategies for updating Lagrange multipliers and adjusting penalty parameters, this subproblem is able to reduce the objective function value and the feasibility error in an adaptive, well-balanced way. Global convergence of the ALAS algorithm is established under mild assumptions. Furthermore, boundedness of the penalty parameter is analyzed under certain conditions. Preliminary numerical experiments with ALAS are performed on a set of general constrained nonlinear optimization problems from the CUTEr problem library.

Key words. augmented Lagrangian, affine scaling, equality and inequality constraints, bound constraints, trust region, global convergence, boundedness of penalty parameter

AMS subject classifications. 90C30, 65K05

1. Introduction. In this paper, we consider the following nonlinear programming problem:

  min_{x ∈ R^n} f(x)  s.t.  c(x) = 0,  x ≥ 0,   (1.1)

where f : R^n → R and c = (c_1, ..., c_m)^T with c_i : R^n → R, i = 1, ..., m, are Lipschitz continuously differentiable. Here, to simplify the exposition, we only assume nonnegative lower bounds on the variables. However, we emphasize that the analysis in this paper can be easily extended to the general case with bounds l ≤ x ≤ u.
The affine scaling method was first proposed by Dikin [18] for linear programming. Since then, affine scaling techniques have been widely and quite effectively used to solve bound constrained optimization [4, 10, 11, 25, 27, 28, 40]. Affine scaling methods for solving polyhedral constrained optimization have also been intensively studied; interested readers are referred to [6, 12, 22, 26, 30, 31, 37, 38]. However, for optimization problems with nonlinear constraints, there are relatively few works that apply affine scaling techniques. In [17], a special class of nonlinear programming problems arising from optimal control is studied. These problems have a special structure in which the variables can be separated into two independent parts, with one part satisfying bound constraints. An affine scaling method is then proposed to keep the bound constraints strictly feasible. Some other affine scaling methods are derived from equivalent KKT equations of nonlinear programming [8, 9, 21]. In these methods, affine scaling techniques are often combined with primal and dual interior point approaches, where the iterates are scaled to avoid converging prematurely to the boundary. To solve (1.1), standard sequential quadratic programming (SQP) trust-region methods [33, 23] need to solve a constrained quadratic subproblem at each iteration. In this subproblem, the Hessian is often constructed by a quasi-Newton technique and the constraints are local linearizations of the original nonlinear constraints in a region around the current iterate, called the trust region. One fundamental issue with this standard approach is that, because of the newly introduced trust region constraint, the subproblems are often infeasible. There are two commonly used approaches for dealing with this issue.
One is to relax

This material is based upon work supported by the Postdoc Grant of China S175, UCAS President grant Y35101AY00, an NSFC grant of China and a National Science Foundation grant of USA. wangxiao@ucas.ac.cn, School of Mathematical Sciences, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, China. hozhang@math.lsu.edu, 303 Lockett Hall, Department of Mathematics, Louisiana State University, Baton Rouge, LA.

the linear constraints. The other commonly used approach is to split the trust region into two separate parts. One part is used to reduce the feasibility error in the range space of the linear constraints, while the other is to reduce the objective in its null space. To avoid the possible infeasibility, some modern SQP methods are developed based on penalty functions. Two representative approaches are sequential ℓ∞ quadratic programming (Sℓ∞QP) [20] and sequential ℓ1 quadratic programming (Sℓ1QP) [42]. In these two methods, the linearizations of the nonlinear constraints in standard SQP subproblems are moved into the objective by adding an ℓ∞ and an ℓ1 penalty term, respectively. Then, the ℓ∞ penalty function and the ℓ1 penalty function are used as merit functions in these two methods accordingly. Recently, Wang and Yuan [41] proposed an Augmented Lagrangian Trust Region (ALTR) method for the equality constrained nonlinear optimization problem:

  min_{x ∈ R^n} f(x)  s.t.  c(x) = 0.

To handle the aforementioned infeasibility issues, at the k-th iteration ALTR solves the following trust region subproblem:

  min_{d ∈ R^n} (g_k − A_k^T λ_k)^T d + (1/2) d^T B_k d + (σ_k/2) ‖c_k + A_k d‖₂²  s.t.  ‖d‖₂ ≤ Δ_k,   (1.2)

where λ_k and σ_k are the Lagrange multipliers and the penalty parameter, respectively, and B_k is the Hessian of the Lagrange function f(x) − λ_k^T c(x) at x_k or its approximation. Here, A_k := (∇c_1(x_k), ..., ∇c_m(x_k))^T is the Jacobian matrix of the equality constraints at x_k. Notice that the objective function in (1.2) is a local approximation of the augmented Lagrangian function

  L(x; λ_k, σ_k) := f(x) − λ_k^T c(x) + (σ_k/2) ‖c(x)‖₂²,

which is obtained by applying a second-order Taylor expansion at x_k to approximate the Lagrange function, and using the linearization of c(x) at x_k in the augmented quadratic penalty term. In ALTR, it is natural to adopt the augmented Lagrangian function as the merit function.
One major advantage of (1.2) compared with the Sℓ∞QP and Sℓ1QP subproblems is that its objective is a smooth quadratic function. Therefore, the subproblem (1.2) is a standard trust region subproblem, for which many efficient algorithms exist, such as the gqtpar subroutine [32] and the SMM algorithm [24]. Promising numerical results of ALTR have been reported in [41]. More recently, we noticed that Curtis et al. [16] also developed a similar trust region approach based on approximations to the augmented Lagrangian function for solving constrained nonlinear optimization. In this paper, motivated by the nice properties of the affine scaling technique and the augmented Lagrangian function, we extend the method developed in [41] to solve the general constrained nonlinear optimization problem (1.1), in which inequality constraints are also allowed. Our major contributions in this paper are the following. Firstly, to overcome possible infeasibility of SQP subproblems, we move the linearized equality constraints into a penalty term of a quadratic approximation to the augmented Lagrangian. This quadratic approximation can be viewed as a local approximation of the augmented Lagrangian function with fixed Lagrange multipliers and a fixed penalty parameter. In addition, we incorporate the affine scaling technique to deal with the simple bound constraints. Secondly, we propose a new strategy to update the penalty parameters with the purpose of adaptively reducing the objective function value and the constraint violation in a well-balanced way. Thirdly, we establish the global convergence of our new method under the Constant Positive Linear Dependence (CPLD) constraint qualification. We also give a condition under which the penalty parameters in our method will be bounded. The remainder of this paper is organized as follows. In Section 2, we describe the ALAS algorithm in detail. Section 3 studies the global convergence properties of the proposed algorithm.
In Section 4, we analyze the boundedness of the penalty parameters. Preliminary numerical results are given in Section 5. Finally, we draw some conclusions in Section 6.

Notation. In this paper, R^n denotes the n-dimensional real vector space and R^n_+ is the nonnegative orthant of R^n. Let N be the set of all nonnegative integers. For any vector z in R^n, z_i is the i-th component of z and supp(z) = {i : z_i ≠ 0}. We denote by e_i the i-th coordinate vector. If not specified, ‖·‖ refers to the

Euclidean norm and B_ρ(z) is the Euclidean ball centered at z with radius ρ > 0. For a closed and convex set Ω ⊆ R^n, the operator P_Ω(·) denotes the Euclidean projection onto Ω, and when Ω = {x ∈ R^n : x ≥ 0}, we simply write (z)_+ = P_Ω(z). Given any matrix H ∈ R^{m×n} and an index set F, [H]_F denotes the submatrix of H with rows indexed by F, while [H]_{F,I} stands for the submatrix of H with rows indexed by F and columns indexed by I. In addition, H† denotes the generalized inverse of H. We denote the gradient of f at x as g(x) := ∇f(x), a column vector, and the Jacobian matrix of c at x as A(x), where A(x) = (∇c_1(x), ..., ∇c_m(x))^T. The subscript k refers to the iteration number. For convenience, we abbreviate g(x_k) as g_k, and similarly f_k, c_k and A_k are also used. The i-th component of x_k is denoted as x_k^i.

2. An augmented Lagrangian affine scaling method. In this section, we propose a new algorithm for the nonlinear programming problem (1.1). The goal of this algorithm is to generate a sequence of iterates x_0, x_1, x_2, ..., converging to a KKT point of (1.1), which is defined as follows (for instance, see [33]).

Definition 2.1. A point x* is called a KKT point of (1.1) if there exists a vector λ* = (λ*_1, ..., λ*_m)^T such that the following KKT conditions are satisfied at (x*, λ*):

  (g* − (A*)^T λ*)_i ≥ 0, if x*_i = 0,
  (g* − (A*)^T λ*)_i = 0, if x*_i > 0,   (2.1)
  c* = 0,  x* ≥ 0,

where c*, g* and A* denote c(x*), g(x*) and A(x*), respectively.

Before presenting the ALAS algorithm, we first discuss several important issues, such as the construction of the subproblem and the updates of the Lagrange multipliers and penalty parameters.

2.1. Affine scaling trust region subproblem. The augmented Lagrangian function (see, e.g., [14]) associated with the objective function f(x) and the equality constraints c(x) = 0 is defined as

  L(x; λ, σ) = f(x) − λ^T c(x) + (σ/2) ‖c(x)‖²,   (2.2)

where λ ∈ R^m denotes the vector of Lagrange multipliers and σ ∈ R_+ is the penalty parameter.
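As a concrete illustration, the augmented Lagrangian (2.2) and its gradient ∇_x L = g(x) − A(x)^T λ + σ A(x)^T c(x) can be evaluated as in the following minimal NumPy sketch. The function names and the toy two-variable problem are ours, not part of the paper:

```python
import numpy as np

def aug_lagrangian(x, lam, sigma, f, c, g, A):
    """Value and gradient of L(x; lam, sigma) = f(x) - lam^T c(x) + (sigma/2)||c(x)||^2."""
    cx = c(x)
    val = f(x) - lam @ cx + 0.5 * sigma * (cx @ cx)
    # The gradient g - A^T lam + sigma A^T c is exactly g-bar in (2.5).
    grad = g(x) - A(x).T @ lam + sigma * (A(x).T @ cx)
    return val, grad

# Toy instance: f(x) = x1 + x2, one constraint c(x) = x1^2 + x2^2 - 1.
f = lambda x: x[0] + x[1]
g = lambda x: np.array([1.0, 1.0])
c = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0])
A = lambda x: np.array([[2 * x[0], 2 * x[1]]])   # Jacobian, one row per constraint

val, grad = aug_lagrangian(np.array([1.0, 1.0]), np.array([0.5]), 10.0, f, c, g, A)
# val = 2 - 0.5 + 5 = 6.5, grad = [20, 20]
```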
For (1.1), at the k-th iteration with fixed λ_k and σ_k, classical augmented Lagrangian methods solve the following subproblem:

  min_{x ∈ R^n} L(x; λ_k, σ_k)  s.t.  x ≥ 0.   (2.3)

Normally (2.3) is solved to find x_{k+1} such that

  ‖(x_{k+1} − ∇_x L(x_{k+1}; λ_k, σ_k))_+ − x_{k+1}‖ ≤ w_k,

where w_k, k = 1, 2, ..., is a sequence of preset tolerances gradually converging to zero. For more details, one may refer to [33]. However, different from classical augmented Lagrangian methods, a new subproblem is proposed in this paper. Instead of minimizing L(x; λ_k, σ_k) at the k-th iteration as in (2.3), we work with its second-order approximation. In order to make this approximation adequate, it is restricted to a trust region. So, at the current iterate x_k, x_k ≥ 0, a trust region subproblem can be defined with explicit bound constraints as:

  min_{d ∈ R^n} q_k(d) := (g_k − A_k^T λ_k)^T d + (1/2) d^T B_k d + (σ_k/2) ‖c_k + A_k d‖²
              = ḡ_k^T d + (1/2) d^T (B_k + σ_k A_k^T A_k) d + (σ_k/2) ‖c_k‖²
  s.t.  x_k + d ≥ 0,  ‖d‖ ≤ Δ_k,   (2.4)

where Δ_k > 0 is the trust region radius, B_k is the Hessian of the Lagrange function f(x) − λ_k^T c(x) or its approximation, and ḡ_k = ∇_x L(x_k; λ_k, σ_k), i.e.,

  ḡ_k = g_k − A_k^T λ_k + σ_k A_k^T c_k.   (2.5)

Instead of solving (2.4) directly, we would like to apply affine scaling techniques to deal with the explicit bound constraints effectively. Before giving our new subproblem, let us first recall the most commonly used affine scaling technique, proposed by Coleman and Li [11] for the simple bound constrained optimization problem:

  min_{x ∈ R^n} f(x)  s.t.  l ≤ x ≤ u.

At the current strictly interior iterate x_k, i.e., l < x_k < u, the affine scaling trust region subproblem is defined as

  min_{d ∈ R^n} ∇f(x_k)^T d + (1/2) d^T W_k d  s.t.  l ≤ x_k + d ≤ u,  ‖(D_k^c)^{-1} d‖ ≤ Δ_k,

where W_k is equal to ∇²f(x_k) or its approximation, D_k^c is an affine scaling diagonal matrix defined by [D_k^c]_ii = |v_k^i|^{1/2}, i = 1, ..., n, and v_k is given by

  v_k^i = x_k^i − u_i, if g_k^i < 0 and u_i < ∞,
        = x_k^i − l_i, if g_k^i ≥ 0 and l_i > −∞,
        = −1, if g_k^i < 0 and u_i = ∞,
        = 1, if g_k^i ≥ 0 and l_i = −∞.

Therefore, by applying a similar affine scaling technique to (2.4), we obtain the following affine scaling trust region subproblem:

  min_{d ∈ R^n} q_k(d)  s.t.  ‖D_k^{-1} d‖ ≤ Δ_k,  x_k + d ≥ 0,   (2.6)

where D_k is the scaling matrix given by

  [D_k]_ii = √(x_k^i), if ḡ_k^i > 0,
           = 1, otherwise,   (2.7)

and ḡ_k is defined in (2.5). By Definition 2.1, we can see that x_k is a KKT point of (1.1) if and only if

  x_k ≥ 0,  c_k = 0  and  D_k ḡ_k = 0.   (2.8)

In subproblem (2.6), the scaled trust region ‖D_k^{-1} d‖ ≤ Δ_k requires [D_k]_ii > 0 for each i, which can be guaranteed if x_k is maintained in the strict interior of the feasible region, namely, x_k > 0. However, in our approach such a requirement is actually not needed. In other words, we allow some components of x_k to be zero. For those i with [D_k]_ii = 0, we simply solve the subproblem in the reduced subspace by fixing d_i = 0. So we translate the subproblem (2.6) into the following active-set type affine scaling trust region subproblem:

  min_{d ∈ R^n} q̃_k(d) := (D_k ḡ_k)^T d + (1/2) d^T [D_k (B_k + σ_k A_k^T A_k) D_k] d + (σ_k/2) ‖c_k‖²
  s.t.  ‖d‖ ≤ Δ_k,  x_k + D_k d ≥ 0,  d_i = 0, if [D_k]_ii = 0.
(2.9)
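For illustration, the scaling matrix (2.7) and the KKT residual of (2.8) can be computed as in the following sketch (a minimal NumPy example; the function names and sample data are ours):

```python
import numpy as np

def scaling_matrix(x, gbar):
    """Diagonal of D_k from (2.7): sqrt(x_i) where gbar_i > 0, else 1."""
    return np.where(gbar > 0, np.sqrt(x), 1.0)

def kkt_measure(x, c, gbar):
    """x_k is a KKT point iff x_k >= 0, c_k = 0 and D_k gbar_k = 0, cf. (2.8)."""
    D = scaling_matrix(x, gbar)
    return np.linalg.norm(c) + np.linalg.norm(D * gbar)

x = np.array([0.0, 4.0])
gbar = np.array([3.0, -2.0])   # first component positive, and x_1 sits at its bound
D = scaling_matrix(x, gbar)
# D = [0, 1]: since [D]_11 = sqrt(0) = 0, the step component d_1 is fixed to 0
# in the reduced subproblem (2.9).
```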

Obviously, q̃_k(d) = q_k(D_k d). Hence, letting s_k = D_k s̃_k, where s̃_k denotes the solution of (2.9), we define the predicted reduction of the augmented Lagrangian function (2.2) at x_k with the step s_k as

  Pred_k := q_k(0) − q_k(s_k) = q̃_k(0) − q̃_k(s̃_k).   (2.10)

Note that (2.9) ensures that s_k is a feasible trial step for the bound constraints, i.e., x_k + s_k ≥ 0. Of course, it is impractical and inefficient to solve (2.9) exactly. To guarantee global convergence, theoretically we only need the trial step s̃_k to satisfy the following sufficient model reduction condition:

  Pred_k ≥ β [q̃_k(0) − min{q̃_k(d) : ‖d‖ ≤ Δ_k, x_k + D_k d ≥ 0}],   (2.11)

where β is some constant in (0, 1).

2.2. Update of Lagrange multipliers. We now discuss how to update the Lagrange multiplier λ_k. In our method, we do not necessarily update λ_k at each iteration; its update depends on the performance of the algorithm. If the iterates are still far away from the feasible region, the algorithm keeps λ_k unchanged and focuses on reducing the constraint violation ‖c_k‖ by increasing the penalty parameter σ_k. However, if the constraint violation ‖c_k‖ is reduced, we think that the algorithm is performing well, with the iterates approaching the feasible region. In this situation, the algorithm updates λ_k according to the latest information. In particular, we set a switch condition: ‖c_k‖ ≤ R_{k−1}, where R_k, k = 1, 2, ..., is a sequence of positive controlling factors gradually converging to zero as the iterates approach the feasible region. In classical augmented Lagrangian methods, solving (2.3) yields a new iterate x_{k+1}; it then follows from the KKT conditions that

  g_{k+1} − A_{k+1}^T λ_k + σ_k A_{k+1}^T c_{k+1} = μ_k,   (2.12)

where μ_k ≥ 0 is the Lagrange multiplier associated with x ≥ 0. Therefore, by Definition 2.1, a good estimate of λ_{k+1} would be λ_k − σ_k c_{k+1}. However, this estimate is not so well suited for our method, because our new iterate is only obtained by approximately solving the subproblem (2.9).
To ensure global convergence, the Lagrange multipliers are normally required to be bounded, or to grow not too fast compared with the penalty parameters (see, e.g., the corresponding lemma in [13]). More precisely, the ratio ‖λ_k‖/σ_k needs to approach zero when σ_k increases to infinity. The easiest way to realize this is to restrict λ_k to a bounded region, e.g., [λ_min, λ_max]. Therefore, in our method the following scheme is proposed to update λ_k. We calculate

  λ̂_k = arg min_{λ ∈ R^m} ψ_k(λ) := ‖(x_k − g_k + A_k^T λ)_+ − x_k‖²   (2.13)

and let λ_k = P_{[λ_min, λ_max]} λ̂_k. Here, λ_min and λ_max are preset lower and upper safeguard bounds for the Lagrange multipliers, and in practice they are often set to be very small and very large, respectively. Note that ψ_k(λ) defined in (2.13) is a continuous, piecewise quadratic function of λ. Effective techniques can be applied to identify its different quadratic components and hence solve (2.13) quite efficiently. In addition, it can be shown (see Section 4) that under certain nondegeneracy assumptions, when x_k is close to a KKT point x* of (1.1), (2.13) is equivalent to the following smooth least squares problem:

  min_{λ ∈ R^m} ‖[g_k − A_k^T λ]_{I*}‖²,

where I* consists of all indices of inactive bound constraints at x*. This property provides a practical way of obtaining a good initial guess for computing λ̂_k. Suppose that at the k-th iteration I_k is a good estimate of I*. Then a good initial guess for λ̂_k could be the solution of the following smooth least squares problem:

  min_{λ ∈ R^m} ‖[g_k − A_k^T λ]_{I_k}‖².
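The smooth least squares problem above can be solved directly with a standard linear least squares routine. The following sketch (our notation; it assumes a given boolean mask `inactive` estimating I_k) illustrates this initial-guess computation:

```python
import numpy as np

def multiplier_estimate(g, A, inactive):
    """Solve min_lambda || [g - A^T lambda]_{I_k} ||^2 by linear least squares.

    g: gradient of f, shape (n,); A: constraint Jacobian, shape (m, n);
    inactive: boolean mask (n,) of the estimated inactive bound constraints.
    """
    M = A.T[inactive, :]       # rows of A^T restricted to I_k, shape (|I_k|, m)
    rhs = g[inactive]          # [g]_{I_k}
    lam, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return lam

# Toy data: n = 3 variables, m = 1 constraint, third bound estimated active.
g = np.array([2.0, 4.0, 1.0])
A = np.array([[1.0, 2.0, 0.0]])
inactive = np.array([True, True, False])
lam = multiplier_estimate(g, A, inactive)   # best fit of g ~ A^T lambda on I_k
```

In the method itself this estimate would then be refined by minimizing ψ_k and projected onto [λ_min, λ_max].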

Remark 2.2. Although in our method (2.13) is adopted to compute λ̂_k, in practice the simplest alternative is to set

  λ_{k+1} = λ_k − σ_k c_{k+1}.   (2.14)

However, the theoretical advantage of using (2.13) is that a stronger global convergence property can be achieved, as shown in Theorem 3.5 below.

2.3. Update of penalty parameters. We now discuss the strategy for updating the penalty parameter σ_k. Note that in subproblem (2.9) it may happen that D_k ḡ_k = 0, which by the definition of ḡ_k in (2.5) indicates that d = 0 is a KKT point of the following problem:

  min_{d ∈ R^n} L(x_k + D_k d; λ_k, σ_k)  s.t.  x_k + D_k d ≥ 0.

In this case, if x_k is feasible, by (2.8) we know that it is a KKT point of (1.1); hence, we can terminate the algorithm. Otherwise, x_k is infeasible (note that x_k is always kept feasible with respect to the bound constraints), that is,

  D_k ḡ_k = 0 and c_k ≠ 0,   (2.15)

and we break the first equality by repeatedly increasing σ_k. If for all sufficiently large σ, D_k ∇_x L(x_k; λ_k, σ) = 0 and c_k ≠ 0, it can be shown by Lemma 2.3 that x_k is a stationary point of minimizing ‖c(x)‖² subject to the bound constraints. So, our algorithm detects whether (2.15) holds at x_k or not. If (2.15) holds, we increase the penalty parameter σ_k to break the first equality of (2.15). Notice that increasing σ_k also helps to reduce the constraint violation. Now assume that the subproblem (2.9) returns a nonzero trial step s̃_k. Then our updating strategy for σ_k mainly depends on the improvement of the constraint violation. In many approaches, the penalty parameter σ_k is increased if sufficient improvement in feasibility is not obtained; that is, σ_k will be increased if ‖c_{k+1}‖ ≥ τ ‖c_k‖ for some constant 0 < τ < 1. An adaptive way of updating penalty parameters for augmented Lagrangian methods can also be found in [16]. Inspired by [41], we propose a different strategy to update the penalty parameter.
In [41], for equality constrained optimization, the authors propose to test the condition

  Pred_k < (δ_k/σ_k) min{‖c_k‖ Δ_k, ‖c_k‖²}

to decide whether to update σ_k or not, where δ_k, k = 1, 2, ..., is a prespecified sequence of parameters converging to zero. In this paper, however, with respect to the predicted reduction Pred_k defined in (2.10), we propose the following condition:

  Pred_k < (δ/σ_k) min{‖c_k‖ Δ_k, ‖c_k‖²},   (2.16)

where δ > 0 is a constant. If (2.16) holds, we believe that the constraint violation ‖c_k‖ is still relatively large. So we increase σ_k, which to some extent helps to reduce the constraint violation in future iterations. We believe that the switching condition (2.16) provides a more adaptive way to reduce the objective function value and the constraint violation simultaneously. Besides, this new condition (2.16) increases σ_k less frequently than the strategy in [41]; therefore the subproblem tends to be better conditioned for numerical solution.

2.4. Algorithm description. We now summarize the above discussions into the following algorithm.
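The two switching tests, ‖c_{k+1}‖ ≤ R_k for the multiplier update and one plausible reading of condition (2.16) for the penalty update, can be sketched as follows (our variable names; the default values of δ, β and θ₂ are hypothetical):

```python
def update_parameters(pred, c_next_norm, c_norm, radius, sigma, R, lam, lam_new,
                      delta=1e-2, beta=0.5, theta2=10.0):
    """One Step-5 style update: multipliers switch on ||c_{k+1}|| <= R_k; the
    penalty grows when Pred_k < (delta/sigma_k) * min(||c_k|| * Delta_k, ||c_k||^2)."""
    if c_next_norm <= R:               # iterates approaching the feasible region
        lam, R = lam_new, beta * R     # accept new multipliers, tighten R_k
    if pred < (delta / sigma) * min(c_norm * radius, c_norm**2):   # cf. (2.16)
        sigma = theta2 * sigma         # violation still relatively large
    return lam, sigma, R

lam, sigma, R = update_parameters(pred=1e-6, c_next_norm=0.3, c_norm=0.5,
                                  radius=1.0, sigma=1.0, R=0.4,
                                  lam=0.0, lam_new=1.5)
# both tests fire: lam = 1.5, sigma = 10.0, R = 0.2
```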

Algorithm 1: Augmented Lagrangian Affine Scaling (ALAS) Algorithm

Step 0: Initialization. Given an initial guess x_0 ≥ 0, compute f_0, g_0, c_0 and A_0. Given initial B_0, σ_0, Δ_0 and parameters β ∈ (0, 1), δ > 0, Δ_max > 0, λ_min < 0 < λ_max, θ_1, θ_2 > 1, 0 < η < η_1 < 1/2, R > 0, set R_0 = max{‖c_0‖, R}. Set k := 0. Compute λ̂_0 by (2.13) and set λ_0 = P_{[λ_min, λ_max]} λ̂_0.

Step 1: Computing Scaling Matrices. Compute ḡ_k by (2.5) and D_k by (2.7).

Step 2: Termination Test. If c_k = 0 and D_k ḡ_k = 0, stop and return x_k as the solution.

Step 3: Computing Trial Steps. Set σ_k^(0) = σ_k, D_k^(0) = D_k and j := 0.
While D_k^(j) ∇_x L(x_k; λ_k, σ_k^(j)) = 0 and c_k ≠ 0,
  σ_k^(j+1) = θ_1 σ_k^(j);   (2.17)
  [D_k^(j+1)]_ii = √(x_k^i), if [∇_x L(x_k; λ_k, σ_k^(j+1))]_i > 0; = 1, otherwise;   (2.18)
  j := j + 1;
Endwhile
Set σ_k = σ_k^(j) and D_k = D_k^(j). Solve the subproblem (2.9) to yield a trial step s̃_k satisfying (2.11) and set s_k = D_k s̃_k.

Step 4: Updating Iterates. Let

  Ared_k = L(x_k; λ_k, σ_k) − L(x_k + s_k; λ_k, σ_k)  and  ρ_k = Ared_k / Pred_k,

where Pred_k is defined by (2.10). If ρ_k < η, set Δ_{k+1} = ‖s̃_k‖/4, x_{k+1} = x_k, k := k + 1, and go to Step 3; otherwise, set x_{k+1} = x_k + s_k. Calculate f_{k+1}, g_{k+1}, c_{k+1} and A_{k+1}.

Step 5: Updating Multipliers and Penalty Parameters. If ‖c_{k+1}‖ ≤ R_k, then compute λ̂_{k+1} as the minimizer of ψ_{k+1}(λ) defined by (2.13) or through (2.14), and set

  λ_{k+1} = P_{[λ_min, λ_max]} λ̂_{k+1}  and  R_{k+1} = β R_k;   (2.19)

otherwise, set λ_{k+1} = λ_k and R_{k+1} = R_k. If (2.16) is satisfied, set

  σ_{k+1} = θ_2 σ_k;   (2.20)

otherwise, σ_{k+1} = σ_k. Compute B_{k+1}, which is (or is some approximation to) the Hessian of f(x) − λ_{k+1}^T c(x) at x_{k+1}.

Step 6: Updating Trust Region Radii. Set

  Δ_{k+1} = min{max{Δ_k, 1.5 ‖s̃_k‖}, Δ_max}, if ρ_k ∈ [1 − η_1, ∞);
          = Δ_k, if ρ_k ∈ [η_1, 1 − η_1);
          = max{0.5 Δ_k, 0.75 ‖s̃_k‖}, if ρ_k ∈ [η, η_1).   (2.21)

Let k := k + 1 and go to Step 1.

In ALAS, B_k is updated as the Hessian of the Lagrange function f(x) − λ_k^T c(x) or its approximation. But from the global convergence point of view, we only need {B_k} to be uniformly bounded.
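The ratio test of Step 4 and the radius rule (2.21) of Step 6 can be sketched together as follows (a minimal illustration in our own notation; the default values η = 0.1, η₁ = 0.25 and Δ_max = 100 are hypothetical):

```python
def update_radius(rho, delta_k, step_norm, eta=0.1, eta1=0.25, delta_max=100.0):
    """Trust region update following (2.21); rho = Ared_k / Pred_k.

    The first branch mirrors the rejection case of Step 4 (shrink to ||s~_k||/4);
    the remaining branches are the three cases of (2.21).
    """
    if rho < eta:                        # step rejected in Step 4
        return step_norm / 4.0
    if rho >= 1.0 - eta1:                # very successful: allow the radius to grow
        return min(max(delta_k, 1.5 * step_norm), delta_max)
    if rho >= eta1:                      # acceptable: keep the radius
        return delta_k
    return max(0.5 * delta_k, 0.75 * step_norm)   # marginal: shrink moderately

new_radius = update_radius(rho=0.9, delta_k=1.0, step_norm=0.8)
```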
We update the penalty parameters in the loop (2.17)-(2.18) to move the iterates away from infeasible local minimizers of the augmented Lagrangian function. The following lemma shows that if this is an infinite loop, then x_k would

be a stationary point of minimizing ‖c(x)‖² subject to the bound constraints.

Lemma 2.3. Suppose that x_k is an iterate satisfying D_k^(j) ∇_x L(x_k; λ_k, σ_k^(j)) = 0, with σ_k^(j) and D_k^(j) updated by (2.17)-(2.18) in an infinite loop. Then x_k is a KKT point of

  min_{x ∈ R^n} ‖c(x)‖²  s.t.  x ≥ 0.   (2.22)

Proof. If σ_k^(j) is updated by (2.17) in an infinite loop, then lim_{j→∞} σ_k^(j) = ∞ and for all j,

  D_k^(j) ∇_x L(x_k; λ_k, σ_k^(j)) = D_k^(j) (g_k − A_k^T λ_k + σ_k^(j) A_k^T c_k) = 0.

Since each diagonal entry of D_k^(j) is equal to 1 or √(x_k^i) ≥ 0, the sequence {D_k^(j)} has a subsequence, still denoted {D_k^(j)}, converging to a diagonal matrix, say D̄_k, with [D̄_k]_ii ≥ 0. Then, for those i with [D̄_k]_ii = 0, we have (A_k^T c_k)_i ≥ 0 and x_k^i = 0, due to the definition of D_k^(j). And for those i with [D̄_k]_ii > 0, we have [A_k^T c_k]_i = 0. Hence,

  (A_k^T c_k)_i ≥ 0, if x_k^i = 0,  and  (A_k^T c_k)_i = 0, if x_k^i > 0,

which indicates that x_k is a KKT point of (2.22).

Theoretically, the loop (2.17)-(2.18) can be infinite. However, in practical computation, once σ_k^(j) reaches a preset large number, we simply terminate the algorithm and return x_k as an approximate infeasible stationary point of (1.1). Thus, in what follows we assume that the loop (2.17)-(2.18) finishes finitely at each iteration. Once the loop terminates, we have D_k ḡ_k ≠ 0 and the subproblem (2.9) is solved. Then it is not difficult to see that (2.9) always generates a trial step s̃_k such that ρ_k ≥ η when the trust region radius is sufficiently small (see, e.g., [15]). Therefore, the loop between Step 3 and Step 4 in ALAS stops in a finite number of inner iterations.

3. Global convergence. In this section, we assume that an infinite sequence of iterates {x_k} is generated by ALAS. The following assumptions are required throughout the analysis of this paper.

AS.1 f : R^n → R and c : R^n → R^m are Lipschitz continuously differentiable.
AS.2 {x_k} and {B_k} are bounded.

Without assumption AS.2, ALAS may allow unbounded minimizers. There are many problem-dependent sufficient conditions that support AS.2.
When the optimization problem has both finite lower and upper bound constraints, all the iterates {x_k} are certainly bounded. Another sufficient condition is the existence of M, ɛ > 0 such that the set {x ∈ R^n : f(x) < M, ‖c(x)‖ < ɛ, x ≥ 0} is bounded. We start with the following important property of ALAS.

Lemma 3.1. Under assumptions AS.1-AS.2, given any two integers p and q with 0 ≤ p ≤ q, we have

  ‖c(x_{q+1})‖² ≤ ‖c(x_p)‖² + M_0 / σ_p,   (3.1)

where M_0 = 2 f_max + 2 π_λ (c_max + R_0/(1 − β)), f_max is an upper bound on {|f(x_k)|}, c_max is an upper bound on {‖c_k‖} and π_λ is an upper bound on {‖λ_k‖}.

Proof. As the proof is almost identical to that of Lemma 3.1 in [41], we give it in the appendix.

The following lemma provides a lower bound on the predicted objective function value reduction obtained by the subproblem (2.9).

Lemma 3.2. The predicted objective function value reduction defined by (2.10) satisfies

  Pred_k ≥ (β/2) ‖D_k ḡ_k‖ min{ ‖D_k ḡ_k‖/‖D_k B̄_k D_k‖, Δ_k, ‖D_k ḡ_k‖/‖ḡ_k‖ },   (3.2)

where B̄_k = B_k + σ_k A_k^T A_k.

Proof. Define d_k(τ) = −τ D_k ḡ_k / ‖D_k ḡ_k‖ with τ ≥ 0. Then, due to the definition of D_k in (2.7), d_k(τ) is feasible for (2.9) if

  τ ∈ [0, min{Δ_k, min_{ḡ_k^i > 0} ‖D_k ḡ_k‖ / ḡ_k^i}],   (3.3)

which implies that d_k(τ) is feasible if τ ∈ [0, min{Δ_k, ‖D_k ḡ_k‖/‖ḡ_k‖}]. So, considering the largest reduction of q̃_k along d_k(τ), we have

  q̃_k(0) − min{ q̃_k(d_k(τ)) : 0 ≤ τ ≤ min{Δ_k, ‖D_k ḡ_k‖/‖ḡ_k‖} }
    ≥ (1/2) ‖D_k ḡ_k‖ min{ ‖D_k ḡ_k‖/‖D_k B̄_k D_k‖, Δ_k, ‖D_k ḡ_k‖/‖ḡ_k‖ }.

Then, (2.11) indicates that (3.2) holds.

3.1. Global convergence with bounded penalty parameters. From the construction of ALAS, the penalty parameters {σ_k} form a monotone nondecreasing sequence. Hence, as k goes to infinity, σ_k has a limit, either finite or infinite. Our following analysis is therefore separated into two parts. We first study the global convergence of ALAS when {σ_k} is bounded; the case when {σ_k} is unbounded is addressed in Section 3.2. The following lemma shows that any accumulation point of {x_k} is feasible if {σ_k} is bounded.

Lemma 3.3. Under assumptions AS.1-AS.2, assuming that lim_{k→∞} σ_k = σ̄ < ∞, we have

  lim_{k→∞} ‖c_k‖ = 0,   (3.4)

which implies that all the accumulation points of {x_k} are feasible.

Proof. If lim σ_k = σ̄ < ∞, by (2.20) in ALAS, the predicted reduction Pred_k satisfies

  Pred_k ≥ (δ/σ_k) min{‖c_k‖ Δ_k, ‖c_k‖²}   (3.5)

for all sufficiently large k. Without loss of generality, we assume that σ_k = σ̄ for all k. The update scheme of λ_k in Step 5 together with AS.2 implies that the sum of all −λ_k^T c_k + λ_k^T c_{k+1}, k = 0, 1, ..., is bounded, as shown in (A.4) in the Appendix:

  Σ_{k=0}^∞ (−λ_k^T c_k + λ_k^T c_{k+1}) ≤ 2 π_λ (c_max + R_0/(1 − β)) < ∞,

where c_max is an upper bound on {‖c_k‖} and π_λ is an upper bound on {‖λ_k‖}. Thus, the sum of Ared_k is bounded from above:

  Σ_{k=0}^∞ Ared_k = Σ_{k=0}^∞ (f_k − f_{k+1}) + Σ_{k=0}^∞ (−λ_k^T c_k + λ_k^T c_{k+1}) + Σ_{k=0}^∞ (σ̄/2) (‖c_k‖² − ‖c_{k+1}‖²)
                 ≤ 2 f_max + 2 π_λ (c_max + R_0/(1 − β)) + (σ̄/2) c_max² < ∞,   (3.6)

where f_max is an upper bound on {|f(x_k)|}. To prove (3.4), we first show

  lim inf_{k→∞} ‖c_k‖ = 0   (3.7)

by contradiction. Suppose there exists a constant τ > 0 such that ‖c_k‖ ≥ τ for all k. Then (3.5) gives Pred_k ≥ (δ/σ̄) min{τ Δ_k, τ²} for all large k.

Let S be the set of iteration numbers corresponding to successful iterations, i.e.,

  S = {k ∈ N : ρ_k ≥ η}.   (3.8)

Since {x_k} is an infinite sequence, S is an infinite set. Then (3.6) indicates that

  Σ_{k∈S} (δ/σ̄) min{τ Δ_k, τ²} ≤ Σ_{k∈S} Pred_k ≤ (1/η) Σ_{k∈S} Ared_k ≤ (1/η) Σ_{k=0}^∞ Ared_k < ∞,   (3.9)

which yields Δ_k → 0 as k ∈ S, k → ∞. Then, by the update rules (2.21) for the trust region radii, we have

  lim_{k→∞} Δ_k = 0.   (3.10)

By AS.1-AS.2 and the boundedness of {λ_k}, we have

  |Ared_k − Pred_k| ≤ |f(x_k) − λ_k^T c_k + (g_k − A_k^T λ_k)^T s_k + (1/2) s_k^T B_k s_k − (f(x_k + s_k) − λ_k^T c(x_k + s_k))|
                      + (σ̄/2) |‖c_k + A_k s_k‖² − ‖c(x_k + s_k)‖²|
                    ≤ M ‖s_k‖² ≤ M ‖D_k‖² Δ_k²,

where M is a positive constant. By the definition of D_k and AS.2, {D_k} is bounded, i.e., ‖D_k‖ ≤ D_max for some D_max > 0. So, (3.10) implies that

  |ρ_k − 1| = |Ared_k − Pred_k| / Pred_k ≤ M D_max² Δ_k² / ((δ/σ̄) min{τ Δ_k, τ²}) → 0, as k → ∞.

Hence, again by the rules for updating the trust region radii, we have Δ_{k+1} ≥ Δ_k for all large k, so Δ_k is bounded away from 0. However, this contradicts (3.10). Therefore (3.7) holds. We next prove the stronger result lim_{k→∞} ‖c_k‖ = 0 by contradiction. Since {x_k} is an infinite sequence, S is infinite. So we assume that there exist an infinite set {m_i} ⊆ S and ν > 0 such that

  ‖c_{m_i}‖ ≥ 2ν.   (3.11)

Because of (3.7), there exists a sequence {n_i} such that

  ‖c_k‖ ≥ ν for m_i ≤ k < n_i,  and  ‖c_{n_i}‖ < ν.   (3.12)

Now, let us consider the set

  K = ∪_i {k ∈ S : m_i ≤ k < n_i}.

By (3.6), we have Ared_k → 0, which implies that for all sufficiently large k ∈ K,

  Ared_k ≥ η (δ/σ̄) min{ν Δ_k, ν²} ≥ ξ Δ_k,  where ξ = η δ ν / σ̄.

Hence, when i is sufficiently large, we have

  ‖x_{m_i} − x_{n_i}‖ ≤ Σ_{k=m_i}^{n_i−1} ‖x_k − x_{k+1}‖ ≤ Σ_{k=m_i, k∈S}^{n_i−1} ‖D_k‖ Δ_k ≤ (D_max/ξ) Σ_{k=m_i, k∈S}^{n_i−1} Ared_k.   (3.13)

The boundedness of Σ_{k=0}^∞ Ared_k shown in (3.6) also implies

  lim_{i→∞} Σ_{k=m_i, k∈S}^{n_i−1} Ared_k = 0.

Hence, lim_{i→∞} ‖x_{m_i} − x_{n_i}‖ = 0. Therefore, from (3.12) we have ‖c_{m_i}‖ < 2ν for large i, which contradicts (3.11). Consequently, (3.4) holds. Since x_k ≥ 0 for all k, all the accumulation points of {x_k} are feasible.

By applying Lemma 3.2, we obtain the following lower bound on Pred_k when {σ_k} is bounded.

Lemma 3.4. Under assumptions AS.1-AS.2, assume that lim_{k→∞} σ_k = σ̄ < ∞. Then the predicted objective function value reduction defined by (2.10) satisfies

  Pred_k ≥ (β/M) ‖D_k ḡ_k‖ min{‖D_k ḡ_k‖, Δ_k},   (3.14)

where M is a positive constant.

Proof. By Lemma 3.2, we have

  Pred_k ≥ (β/2) ‖D_k ḡ_k‖ min{ ‖D_k ḡ_k‖/‖D_k B̄_k D_k‖, Δ_k, ‖D_k ḡ_k‖/‖ḡ_k‖ },   (3.15)

where B̄_k = B_k + σ_k A_k^T A_k. Since {σ_k} is bounded, {B̄_k} and {ḡ_k} are bounded. Then the definition of {D_k} in (2.7) and AS.2 indicate that {D_k} is bounded as well. So there exists a positive number M such that 2 max{‖ḡ_k‖, ‖D_k B̄_k D_k‖, 1} ≤ M. Then (3.15) yields

  Pred_k ≥ (β/2) ‖D_k ḡ_k‖ min{ ‖D_k ḡ_k‖ / max{‖ḡ_k‖, ‖D_k B̄_k D_k‖}, Δ_k },

which further yields (3.14).

We now give the main convergence result for the case where {σ_k} is bounded.

Theorem 3.5. Under assumptions AS.1-AS.2, assuming that lim_{k→∞} σ_k = σ̄ < ∞, we have

  lim inf_{k→∞} ‖D_k ḡ_k‖ = 0,   (3.16)

which implies that at least one accumulation point of {x_k} is a KKT point of (1.1). Furthermore, if λ_k = arg min ψ_k(λ) for all large k, where ψ_k is defined by (2.13), we have

  lim_{k→∞} ‖D_k ḡ_k‖ = 0,   (3.17)

which implies that all accumulation points of {x_k} are KKT points of (1.1).

Proof. By contradiction we first prove that

  lim inf_{k→∞} ‖D_k ḡ_k‖ = 0.   (3.18)

Suppose that there exists ɛ > 0 such that ‖D_k ḡ_k‖ ≥ ɛ for all k. Then (3.14) indicates that

  Pred_k ≥ (β/M) ɛ min{ɛ, Δ_k}.   (3.19)

Then, by mimicking the analysis after (3.7), we can derive a contradiction, so we obtain (3.18). Hence, due to lim ‖c_k‖ = 0 by Lemma 3.3, we have lim inf_{k→∞} ‖D_k (g_k − A_k^T λ_k)‖ = 0. Then, as {λ_k} and {x_k} are bounded, there exist subsequences {x_{k_i}}, {λ_{k_i}} and points x*, λ* such that x_{k_i} → x*, λ_{k_i} → λ* and

  lim_{i→∞} ‖D_{k_i} (g_{k_i} − A_{k_i}^T λ_{k_i})‖ = 0.

Therefore, from the definition of D_k in Step 3 of ALAS and 0 = c(x*) = lim_{i→∞} c(x_{k_i}), we can derive

  [g* − (A*)^T λ*]_i = 0, if x*_i > 0,
  [g* − (A*)^T λ*]_i ≥ 0, if x*_i = 0.   (3.20)

This together with x* ≥ 0 gives that x* is a KKT point of (1.1). Hence, (3.16) holds.

Now, if λ_k = arg min ψ_k for all large k, we prove (3.17) in three steps. Without loss of generality, we assume λ_k = arg min ψ_k for all k. Firstly, we prove that for any two iterates x_k and x_l there exists a constant C > 0, independent of k and l, such that

  ‖(x_k − g_k + A_k^T λ_k)_+ − x_k‖ − ‖(x_l − g_l + A_l^T λ_l)_+ − x_l‖ ≤ C ‖x_k − x_l‖.   (3.21)

Since λ_k = arg min ψ_k with ψ_k defined in (2.13), we have

  ‖(x_k − g_k + A_k^T λ_k)_+ − x_k‖ ≤ ‖(x_k − g_k + A_k^T λ_l)_+ − x_k‖.

Hence,

  ‖(x_k − g_k + A_k^T λ_k)_+ − x_k‖ − ‖(x_l − g_l + A_l^T λ_l)_+ − x_l‖
    ≤ ‖(x_k − g_k + A_k^T λ_l)_+ − x_k‖ − ‖(x_l − g_l + A_l^T λ_l)_+ − x_l‖
    ≤ ‖((x_k − g_k + A_k^T λ_l)_+ − x_k) − ((x_l − g_l + A_l^T λ_l)_+ − x_l)‖
    ≤ ‖(x_k − g_k + A_k^T λ_l)_+ − (x_l − g_l + A_l^T λ_l)_+‖ + ‖x_k − x_l‖
    ≤ ‖(x_k − g_k + A_k^T λ_l) − (x_l − g_l + A_l^T λ_l)‖ + ‖x_k − x_l‖
    ≤ C ‖x_k − x_l‖,

where C > 0 is a constant. Here, the last inequality follows from assumptions AS.1-AS.2 and the boundedness of {λ_k}, and the third inequality uses the nonexpansiveness of the projection (·)_+. Then, by the arbitrariness of x_k and x_l, (3.21) holds. Secondly, we show that

  lim_{k∈K, k→∞} ‖D_k (g_k − A_k^T λ_k)‖ = 0   (3.22)

is equivalent to

  lim_{k∈K, k→∞} ‖(x_k − g_k + A_k^T λ_k)_+ − x_k‖ = 0,   (3.23)

where K is any infinite subset of N. If (3.22) is satisfied, it follows from the same argument as used to show (3.16) that any accumulation point x* of {x_k : k ∈ K} is a KKT point of (1.1). Hence, from the KKT conditions (2.1), there exists λ* such that

  ‖(x* − g(x*) + (A(x*))^T λ*)_+ − x*‖ = 0.

Then, by (3.21), we obtain

  ‖(x_k − g_k + A_k^T λ_k)_+ − x_k‖ ≤ C ‖x_k − x*‖,  k ∈ K.

Since x* can be any accumulation point of {x_k : k ∈ K}, the above inequality implies that (3.23) holds. We now assume that (3.23) is satisfied. Let (x*, λ*) be any accumulation point of {(x_k, λ_k) : k ∈ K}, so that there exists K' ⊆ K such that x_k → x* and λ_k → λ* as k ∈ K', k → ∞. Hence, we have

  lim_{k∈K', k→∞} (g_k − A_k^T λ_k) = g(x*) − (A(x*))^T λ*.   (3.24)

From (3.23), (3.24) and x_k ≥ 0 for all k, we have g(x*) − (A(x*))^T λ* ≥ 0.
For those i with [g(x*) − (A(x*))^T λ*]_i = 0, we have by (3.24) and the boundedness of {D_k} that [D_k (g_k − A_k^T λ_k)]_i → 0 as k ∈ K', k → ∞. For those i with [g(x*) − (A(x*))^T λ*]_i > 0, on one hand, we have from (3.23) and (3.24) that x*_i = 0; on the other hand, we have from (3.24) and lim_{k→∞} ||c_k|| = 0 (Lemma 3.3) that

    lim_{k∈K', k→∞} [ḡ_k]_i = [g(x*) − (A(x*))^T λ*]_i > 0,

which implies [D_k]_ii = x_k^i for all large k ∈ K'. Then it follows from x_k^i → x*_i = 0 that [D_k (g_k − A_k^T λ_k)]_i → 0 as k ∈ K', k → ∞. Hence, ||D_k (g_k − A_k^T λ_k)|| → 0 as k ∈ K', k → ∞. Then, since (x*, λ*) can be any accumulation point of {(x_k, λ_k) : k ∈ K}, (3.22) follows. Therefore, we obtain the equivalence between (3.22) and (3.23).

Thirdly, we prove lim_{k→∞} ||D_k (g_k − A_k^T λ_k)|| = 0 by contradiction. If this does not hold, then there exist an infinite index set {m_i} ⊆ S and a constant ν > 0 such that ||D_{m_i}(g_{m_i} − A_{m_i}^T λ_{m_i})|| ≥ ν, where S is defined in (3.8). So, from the equivalence between (3.22) and (3.23), there exists a constant ε > 0 such that || (x_{m_i} − g_{m_i} + A_{m_i}^T λ_{m_i})_+ − x_{m_i} || ≥ ε for all m_i. Let δ = ε/(3C). Then, it follows from (3.21) that

    || (x_k − g_k + A_k^T λ_k)_+ − x_k || ≥ 2ε/3,  if x_k ∈ B_δ(x_{m_i}).    (3.25)

Now, let S(ε) = { k : || (x_k − g_k + A_k^T λ_k)_+ − x_k || ≥ ε }. Then it again follows from the equivalence between (3.22) and (3.23) that there exists ν̂ ∈ (0, ν] such that

    ||D_k (g_k − A_k^T λ_k)|| ≥ ν̂, for all k ∈ S(2ε/3).    (3.26)

By (3.16) and ν̂ ≤ ν, we can find a subsequence {n_i} such that ||D_{n_i}(g_{n_i} − A_{n_i}^T λ_{n_i})|| < ν̂/2 and ||D_k (g_k − A_k^T λ_k)|| ≥ ν̂/2 for any k ∈ [m_i, n_i). Following the same arguments as showing (3.13) in Lemma 3.3 and using the boundedness of Σ_{k=0}^∞ Pred_k, we can obtain

    lim_{i→∞} ||x_{m_i} − x_{n_i}|| = 0.

Hence, for all large n_i, we have x_{n_i} ∈ B_δ(x_{m_i}). Thus by (3.25) we obtain || (x_{n_i} − g_{n_i} + A_{n_i}^T λ_{n_i})_+ − x_{n_i} || ≥ 2ε/3. Then, it follows from (3.26) that ||D_{n_i}(g_{n_i} − A_{n_i}^T λ_{n_i})|| ≥ ν̂. However, this contradicts the choice of {n_i}, which requires ||D_{n_i}(g_{n_i} − A_{n_i}^T λ_{n_i})|| < ν̂/2. Hence, (3.17) holds, which implies that every accumulation point of {x_k} is a KKT point of (1.1).

Remark 3.6. In Remark 2.2, we mentioned that the computation of λ_k in (2.13) can be replaced with (2.14). If (2.14) is adopted, the first part of Theorem 3.5 still holds, while the second part is no longer guaranteed.

3.2. Global convergence with unbounded penalty parameters. In this subsection, we study the behavior of ALAS when the penalty parameters {σ_k} are unbounded.
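The two stationarity measures compared in Theorem 3.5 are directly computable. A minimal numpy sketch follows; the diagonal scaling rule is an assumption in the Coleman-Li style suggested by the surrounding text, since the paper's exact definition (2.7) is not reproduced in this excerpt:

```python
import numpy as np

def scaling_vector(x, r):
    # Diagonal of an affine scaling matrix D (a sketch, not the paper's
    # exact (2.7)): [D]_ii = x_i where the residual component is positive,
    # and 1 otherwise.
    return np.where(r > 0, x, 1.0)

def stationarity_measures(x, g, A, lam):
    # Two KKT residuals for min f s.t. c(x)=0, x>=0, as in Theorem 3.5:
    # the scaled gradient ||D (g - A^T lam)|| and the projected residual
    # ||(x - g + A^T lam)_+ - x||, where (y)_+ = max(y, 0) componentwise.
    r = g - A.T @ lam
    scaled = np.linalg.norm(scaling_vector(x, r) * r)
    projected = np.linalg.norm(np.maximum(x - r, 0.0) - x)
    return scaled, projected
```

Both quantities vanish exactly at a KKT point and are positive elsewhere, which is the equivalence exploited in the second part of the proof above.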
The following lemma shows that in this case the limit of {||c_k||} always exists.

Lemma 3.7. Under assumptions AS.1-AS.2, assume that lim_{k→∞} σ_k = ∞. Then lim_{k→∞} ||c_k|| exists.

Proof. Let c* = lim inf_{k→∞} ||c(x_k)|| and let {x_{k_i}} be a subsequence of {x_k} such that c* = lim_{i→∞} ||c(x_{k_i})||. By Lemma 3.1, substituting x_p on the right-hand side of (3.1) by x_{k_i}, for all q > k_i we have

    ||c(x_{q+1})||^2 ≤ ||c(x_{k_i})||^2 + 2 M_0 σ_{k_i}^{-1}.

Since lim_{k→∞} σ_k = ∞, it follows from the above inequality that c* = lim_{k→∞} ||c(x_k)||.

Lemma 3.7 shows that the unboundedness of {σ_k} ensures convergence of the equality constraint violations {||c_k||}. Two cases may then occur: lim_{k→∞} ||c_k|| = 0 or lim_{k→∞} ||c_k|| > 0. It is desirable for the iterates generated by ALAS to converge to a feasible point. However, the constraints c(x) = 0 in (1.1) may not be feasible for any x ≥ 0. Hence, in the following we divide our analysis into two parts. We first study the case lim_{k→∞} ||c_k|| > 0. The following theorem shows that in this case any infeasible accumulation point is a stationary point of minimizing ||c(x)||^2 subject to the bound constraints.

Theorem 3.8. Under assumptions AS.1-AS.2, assume that lim_{k→∞} σ_k = ∞ and lim_{k→∞} ||c_k|| > 0. Then all accumulation points of {x_k} are KKT points of (2.22).

Proof. We divide the proof into three major steps.

Step 1: We prove by contradiction that

    lim inf_{k→∞} ||D_k ḡ_k|| / σ_k = 0.    (3.27)

Assume that (3.27) does not hold. Then there exists a constant ξ > 0 such that

    ||D_k ḡ_k|| ≥ ξ σ_k    (3.28)

for all large k. Hence, when k is sufficiently large, (2.17) will never happen and σ_k will be updated only through (2.20). Then, by Lemma 3.2, we have

    Pred_k ≥ (β/2) ||D_k ḡ_k|| min{ ||D_k ḡ_k|| / ||D_k (B_k + σ_k A_k^T A_k) D_k||, Δ_k, ||D_k ḡ_k|| / ||ḡ_k|| }
           = (β/2) ||D_k ḡ_k|| min{ (||D_k ḡ_k||/σ_k) / ||D_k (B_k/σ_k + A_k^T A_k) D_k||, Δ_k, (||D_k ḡ_k||/σ_k) / (||ḡ_k||/σ_k) }.    (3.29)

As {B_k/σ_k + A_k^T A_k} and {ḡ_k/σ_k} are bounded, (3.28) and (3.29) imply that there exists a constant ζ > 0 such that

    Pred_k ≥ ζ σ_k min{ ζ, Δ_k }

for all large k. However, this contradicts σ_k → ∞, because the above inequality implies that σ_k will not be increased for all large k. Therefore, (3.27) holds. Then, by σ_k → ∞ and (3.27), we have

    lim inf_{k→∞} ||D_k A_k^T c_k|| = lim inf_{k→∞} ||D_k ( (g_k − A_k^T λ_k)/σ_k + A_k^T c_k )|| = lim inf_{k→∞} ||D_k ḡ_k|| / σ_k = 0.    (3.30)

Step 2: We now prove by contradiction that

    lim_{k∈S, k→∞} ||D_k A_k^T c_k|| = 0,    (3.31)

where S is defined in (3.8). Assume that (3.31) does not hold. Then there exist x̄ ≥ 0, ε > 0 and a subset K ⊆ S such that

    lim_{k∈K, k→∞} x_k = x̄ and ||D_k A_k^T c_k|| ≥ ε for all k ∈ K.    (3.32)

Let F = F_+ ∪ F_−, where

    F_+ := { i : [A(x̄)^T c(x̄)]_i > 0 } and F_− := { i : [A(x̄)^T c(x̄)]_i < 0 }.

Then, it follows from the boundedness of {D_k} and (3.32) that F is not empty. Since σ_k → ∞, by the definition of ḡ_k, there exists a constant δ > 0 such that

    [ḡ_k]_i > 0, if x_k ∈ B_δ(x̄) and i ∈ F_+;    [ḡ_k]_i < 0, if x_k ∈ B_δ(x̄) and i ∈ F_−.

Accordingly, we have

    [D_k]_ii = x_k^i, if x_k ∈ B_δ(x̄) and i ∈ F_+;    [D_k]_ii = 1, if x_k ∈ B_δ(x̄) and i ∈ F_−.

This together with (3.32) and continuity indicates that when δ is sufficiently small,

    ||D_k A_k^T c_k|| ≥ ε/2, if x_k ∈ B_δ(x̄).    (3.33)

Let φ_k(d) := ||c_k + A_k D_k d||^2 / 2. We consider the following problem:

    min_{d∈R^n} φ_k(d) = (D_k A_k^T c_k)^T d + (1/2) d^T (D_k A_k^T A_k D_k) d + (1/2) ||c_k||^2
    s.t. ||d|| ≤ Δ_k,  x_k + D_k d ≥ 0,

with its solution denoted by d̄_k. Then, similarly to the proof of Lemma 3.2, we can show that for any k

    (1/2)( ||c_k||^2 − ||c_k + A_k D_k d̄_k||^2 ) ≥ (1/2) ||D_k A_k^T c_k|| min{ ||D_k A_k^T c_k|| / ||D_k A_k^T A_k D_k||, Δ_k, ||D_k A_k^T c_k|| / ||A_k^T c_k|| }.    (3.34)

Let K̄ := { k ∈ S : ||D_k A_k^T c_k|| ≥ ε/2 }. Since K ⊆ K̄, K̄ is an infinite set. And for any k ∈ K̄, it follows from (3.34) that there exists η > 0 such that

    ||c_k||^2 − ||c_k + A_k D_k d̄_k||^2 ≥ η min{ 1, Δ_k }.

Since D_k d̄_k is a feasible point of subproblem (2.6), the predicted reduction obtained by solving (2.9) satisfies the following relations:

    Pred_k = q_k(0) − q_k(s_k) ≥ β [ q_k(0) − q_k(D_k d̄_k) ]
           = β [ −(D_k (g_k − A_k^T λ_k))^T d̄_k − (1/2) d̄_k^T D_k B_k D_k d̄_k + (σ_k/2)( ||c_k||^2 − ||c_k + A_k D_k d̄_k||^2 ) ].

As ||d̄_k|| ≤ Δ_max, we have

    | (D_k (g_k − A_k^T λ_k))^T d̄_k + (1/2) d̄_k^T D_k B_k D_k d̄_k |
        ≤ ( ||D_k (g_k − A_k^T λ_k)|| + (1/2) ||D_k B_k D_k|| Δ_max ) Δ_k
        ≤ ( ||D_k (g_k − A_k^T λ_k)|| + (1/2) ||D_k B_k D_k|| Δ_max ) max{1, Δ_max} min{1, Δ_k}.

Therefore, for any sufficiently large k ∈ K̄, by (2.11) we have

    Pred_k ≥ β [ η σ_k / 2 − ( ||D_k (g_k − A_k^T λ_k)|| + (1/2) ||D_k B_k D_k|| Δ_max ) max{1, Δ_max} ] min{1, Δ_k}
           ≥ (β η σ_k / 4) min{1, Δ_k},    (3.35)

where the second inequality follows from σ_k → ∞. From (3.30) and (3.32), there exist infinite sets {m_i} ⊆ K̄ and {n_i} such that

    x_{m_i} ∈ B_{δ/2}(x̄),  ||D_k A_k^T c_k|| ≥ ε/2 for m_i ≤ k < n_i,  and  ||D_{n_i} A_{n_i}^T c_{n_i}|| < ε/2.    (3.36)

Hence, we have

    { k ∈ S : m_i ≤ k < n_i } = { k ∈ K̄ : m_i ≤ k < n_i }.

Therefore, by (3.35), when m_i is sufficiently large,

    (β η / 4) Σ_{k=m_i, k∈S}^{n_i−1} min{1, Δ_k} ≤ Σ_{k=m_i, k∈S}^{n_i−1} (1/σ_k) Pred_k ≤ Σ_{k=m_i, k∈S}^{n_i−1} (1/σ_k) Ared_k ≤ (1/2)( ||c_{m_i}||^2 − ||c_{n_i}||^2 ) + M_0 / σ_{m_i},    (3.37)

where the last inequality follows from (A.5) with M_0 = 2 f_max + 2√m λ_max (c_max + R_0/(1 − β)). By Lemma 3.7, lim_{k→∞} ||c_k|| exists. Together with lim_{k→∞} σ_k = ∞ and (3.37), this yields

    Σ_{k=m_i, k∈S}^{n_i−1} min{1, Δ_k} → 0, as m_i → ∞,

which obviously implies

    Σ_{k=m_i, k∈S}^{n_i−1} Δ_k → 0, as m_i → ∞.

Hence, we have

    ||x_{m_i} − x_{n_i}|| ≤ Σ_{k=m_i}^{n_i−1} ||x_k − x_{k+1}|| ≤ Σ_{k=m_i, k∈S}^{n_i−1} ||D_k|| Δ_k ≤ D_max Σ_{k=m_i, k∈S}^{n_i−1} Δ_k → 0, as m_i → ∞,

where D_max is an upper bound of {||D_k||}. So, lim_{i→∞} ||x_{m_i} − x_{n_i}|| = 0. Then it follows from x_{m_i} ∈ B_{δ/2}(x̄) in (3.36) that x_{n_i} ∈ B_δ(x̄) when m_i is large, and therefore by (3.33) we have ||D_{n_i} A_{n_i}^T c_{n_i}|| ≥ ε/2. However, this contradicts ||D_{n_i} A_{n_i}^T c_{n_i}|| < ε/2 shown in (3.36). Hence, (3.31) holds.

Step 3: We now prove that any accumulation point x* of {x_k} is a KKT point of (2.22). Since x* is also an accumulation point of {x_k : k ∈ S}, by (3.31) there exists a subset S' ⊆ S such that

    lim_{k∈S', k→∞} x_k = x* and lim_{k∈S', k→∞} ||D_k A_k^T c_k|| = 0.    (3.38)

For the components of x*, two cases can occur.

Case I. If x*_i > 0, then by the definition of D_k we know that [D_k]_ii is positive and bounded away from zero for all large k ∈ S'. Hence, it follows from (3.38) that [(A*)^T c*]_i = 0.

Case II. If x*_i = 0, then [(A*)^T c*]_i ≥ 0 must hold. We prove this by contradiction. Assume that there exists a positive constant ε such that

    [(A*)^T c*]_i ≤ −ε < 0.    (3.39)

Then, since lim_{k→∞} σ_k = ∞, for sufficiently large k ∈ S' we have

    [ḡ_k]_i = [g_k − A_k^T λ_k + σ_k A_k^T c_k]_i < 0,

which implies [D_k]_ii = 1. Then (3.38) indicates that [(A*)^T c*]_i = 0, which contradicts (3.39).

Since x_k ≥ 0 for all k, we conclude from Cases I and II that x* is a KKT point of (2.22).

In the remainder of this section, we prove the global convergence of ALAS to KKT points in the case that lim_{k→∞} ||c_k|| = 0. We start with the following lemma.

Lemma 3.9. Under assumptions AS.1-AS.2, assume that lim_{k→∞} σ_k = ∞ and lim_{k→∞} ||c_k|| = 0. Then the iterates {x_k} generated by ALAS satisfy

    lim inf_{k→∞} ||D_k ḡ_k|| = 0.    (3.40)

Proof. If (2.17) happens at an infinite number of iterates, then (3.40) obviously holds. So we assume that (2.17) does not happen for all large k.
By Lemma 3.2, we have

    Pred_k ≥ (β/2) ||D_k ḡ_k|| min{ ||D_k ḡ_k|| / ||D_k (B_k + σ_k A_k^T A_k) D_k||, Δ_k, ||D_k ḡ_k|| / ||ḡ_k|| }
           ≥ ||D_k ḡ_k|| M_1^{-1} σ_k^{-1} min{ ||D_k ḡ_k||, Δ_k }

for some constant M_1 > 0, since D_k, B_k, λ_k and A_k are all bounded. This lower bound on Pred_k indicates that (3.40) holds. Otherwise, there would exist a constant ζ > 0 such that

    Pred_k ≥ ζ M_1^{-1} σ_k^{-1} min{ ζ, Δ_k },

which indicates that (2.16) does not hold for all large k as ||c_k|| → 0. Hence, σ_k would remain constant for all large k, which contradicts lim_{k→∞} σ_k = ∞.

The constraint qualification known as the Linear Independence Constraint Qualification (LICQ) [33] is widely used in analyzing the optimality of feasible accumulation points. With respect to problem (1.1), LICQ is defined as follows.

Definition 3.10. We say that the LICQ condition holds at a feasible point x of problem (1.1) if the gradients of the active constraints (all equality constraints and active bound constraints) are linearly independent at x.

Recently, a constraint qualification called the Constant Positive Linear Dependence (CPLD) condition was proposed in [36]. The CPLD condition has been shown to be weaker than LICQ and has been widely studied [1, 2, 3]. In this paper, with respect to feasible accumulation points, we analyze their optimality properties under the CPLD condition. We first give the following definition.

Definition 3.11. We say that a set of constraints of problem (1.1) with indices Ā = I ∪ J is positive linearly dependent at x, where I ⊆ {1, ..., m} and J ⊆ {1, ..., n}, if there exist λ ∈ R^m and µ ∈ R^n_+ with Σ_{i∈I} |λ_i| + Σ_{j∈J} µ_j ≠ 0 such that

    Σ_{i∈I} λ_i ∇c_i(x) + Σ_{j∈J} µ_j e_j = 0.

Now, we give the definition of CPLD.

Definition 3.12. Given a feasible point x of (1.1), we say that the CPLD condition holds at x if positive linear dependence of any subset of the active constraints at x implies linear dependence of their gradients in some neighborhood of x.

By Lemma 3.9, there exist an accumulation point x* and a subsequence {x_{k_i}} such that

    lim_{i→∞} x_{k_i} = x* and lim_{i→∞} ||D_{k_i} ḡ_{k_i}|| = 0.    (3.41)

The following theorem shows that, under certain conditions, this x* is a KKT point of (1.1).
Theorem 3.13. Under assumptions AS.1-AS.2, assume that lim_{k→∞} σ_k = ∞, lim_{k→∞} ||c_k|| = 0 and x* is an accumulation point at which the CPLD condition holds. Then x* is a KKT point of (1.1).

Proof. By Lemma 3.9, (3.41) holds. Therefore, there exist an accumulation point x* and a subsequence {x_{k_i}}, still denoted {x_k} for notational simplicity, such that

    lim_{k→∞} x_k = x*,  lim_{k→∞} ||D_k ḡ_k|| = lim_{k→∞} ||D_k (g_k − A_k^T λ_k + σ_k A_k^T c_k)|| = 0.    (3.42)

So, by the definition of D_k, there exists µ_k ∈ R^n_+ such that

    lim_{k→∞} || g_k − A_k^T (λ_k − σ_k c_k) − Σ_{j∈A*} [µ_k]_j e_j || = 0,    (3.43)

where A* = { i : x*_i = 0 }. Then, by (3.43) and Carathéodory's Theorem for cones, for any k there exist λ̂_k ∈ R^m and µ̂_k ∈ R^n_+ such that

    lim_{k→∞} || g_k − A_k^T λ̂_k − Σ_{j∈A*} [µ̂_k]_j e_j || = 0,    (3.44)

and

    { ∇c_i(x_k) : i ∈ supp(λ̂_k) } ∪ { e_j : j ∈ supp(µ̂_k) ⊆ A* } are linearly independent.    (3.45)

Let I_k = supp(λ̂_k) and J_k = supp(µ̂_k). Since R^n is a finite dimensional space, without loss of generality, by taking a subsequence if necessary we can assume that I_k = I and J_k = J ⊆ A* for all large k. Let y_k = (λ̂_k, µ̂_k). We now show that {y_k} = {(λ̂_k, µ̂_k)} is bounded. Suppose ||y_k|| → ∞ as k → ∞. From (3.44), we have

    lim_{k→∞} || Σ_{i∈I} ∇c_i(x_k) [λ̂_k]_i / ||y_k|| + Σ_{j∈J} e_j [µ̂_k]_j / ||y_k|| || = lim_{k→∞} ||g_k|| / ||y_k|| = 0.

Since z_k := (λ̂_k, µ̂_k)/||y_k|| has unit norm, it has a subsequence converging to a limit z* := (w*, v*). Without loss of generality, we assume that λ̂_k/||y_k|| → w* and µ̂_k/||y_k|| → v*. Hence we have

    Σ_{i∈I} w*_i ∇c_i(x*) + Σ_{j∈J} v*_j e_j = 0.    (3.46)

Since µ̂_k ≥ 0, we have v* ≥ 0. So, (3.46) and ||z*|| = 1 imply that the constraints with indices Ā = I ∪ J are positive linearly dependent at x*. Then, since CPLD holds at x* and x_k → x*, we have that { ∇c_i(x_k) : i ∈ I } ∪ { e_j : j ∈ J } are linearly dependent for all large k. However, this contradicts (3.45). Hence {y_k} = {(λ̂_k, µ̂_k)} is bounded. So, by (3.44), there exist λ* and µ* ≥ 0 such that

    g(x*) − (A*)^T λ* = Σ_{j∈J} µ*_j e_j,

which by Definition 2.1 indicates that x* is a KKT point of (1.1).

4. Boundedness of penalty parameters. We have previously shown that the boundedness of the penalty parameters plays a very important role in forcing x* to be a KKT point of (1.1). In addition, large penalty parameters may lead to an ill-conditioned Hessian of the augmented Lagrangian function (2.2), which normally brings numerical difficulties for solving the trust region subproblems. Hence, in this section we study the behavior of the penalty parameters and investigate conditions under which they remain bounded. Throughout this section, we assume that x* is a KKT point of (1.1). For the analysis in this section, we need some additional assumptions.

AS.3 The LICQ condition, defined in Definition 3.10, holds at the KKT point x*.
Assumption AS.3 implies that [(A*)^T]_I has full column rank, i.e.,

    Rank([(A*)^T]_I) = m,    (4.1)

where I is the index set of inactive bound constraints at x*, i.e., I = { i : x*_i > 0 }.

AS.4 The strict complementarity conditions hold at the KKT point x*, that is,

    [g(x*) − (A(x*))^T λ*]_i > 0, if i ∈ A* = { i : x*_i = 0 },

where λ* is the Lagrange multiplier associated with the equality constraints c(x) = 0 at x*.

We now give a property of λ̄_k, defined in (2.13), in the following lemma.

Lemma 4.1. Assume that AS.1-AS.4 hold at a KKT point x* of (1.1). Then there exists a constant ρ > 0 such that if x_k ∈ B_ρ(x*), λ̄_k given by (2.13) is unique and

    λ̄_k = argmin_{λ∈R^m} ||[g_k − A_k^T λ]_I||^2.    (4.2)

Proof. The strict complementarity conditions imply that

    [g* − (A*)^T λ*]_I = 0 and [g* − (A*)^T λ*]_{A*} > 0.    (4.3)

Then (4.1) indicates that λ* is uniquely given by λ* = ([(A(x*))^T]_I)^† [g(x*)]_I. We now denote by µ_k the least-squares multiplier

    µ_k = argmin_{λ∈R^m} ||[g_k − A_k^T λ]_I||^2.    (4.4)

AS.1 and AS.3 indicate that there exists a ρ > 0 such that if x_k ∈ B_ρ(x*), then Rank([(A_k)^T]_I) = m, and thus µ_k is the strict unique minimizer of (4.4), given by µ_k = ([A_k^T]_I)^† [g_k]_I. Moreover, by (4.3), reducing ρ if necessary, if x_k ∈ B_ρ(x*) then there exists a constant ξ > 0 independent of k such that

    ||[g_k − A_k^T µ_k]_I|| < x_k^i / 2 < x_k^i, for any i ∈ I,    (4.5)
    [g_k − A_k^T µ_k]_i ≥ x_k^i + ξ, for any i ∈ A*.    (4.6)

Then (4.5) and (4.6) imply that if x_k ∈ B_ρ(x*) the following equality holds:

    ψ_k(µ_k) = || (x_k − g_k + A_k^T µ_k)_+ − x_k ||^2 = ||[g_k − A_k^T µ_k]_I||^2 + ||[x_k]_{A*}||^2.    (4.7)

In the following, supposing x_k ∈ B_ρ(x*), we show λ̄_k = µ_k by way of contradiction. Assume that λ̄_k ≠ µ_k. Then, since µ_k is the unique minimizer of (4.4), we have

    ||[g_k − A_k^T λ̄_k]_I||^2 > ||[g_k − A_k^T µ_k]_I||^2.

This together with (4.5)-(4.6) implies that

    ||[(x_k − g_k + A_k^T λ̄_k)_+ − x_k]_I||^2 ≥ Σ_{i∈I} min{ [g_k − A_k^T λ̄_k]_i^2, (x_k^i)^2 }
        ≥ min{ ||[g_k − A_k^T λ̄_k]_I||^2, (γ_k)^2/4 }    (4.8)
        > ||[g_k − A_k^T µ_k]_I||^2 = ||[(x_k − g_k + A_k^T µ_k)_+ − x_k]_I||^2,

where γ_k = min_{i∈I} x_k^i. Therefore, it follows from ψ_k(λ̄_k) ≤ ψ_k(µ_k), by the definition of λ̄_k, and (4.7) that

    ||[(x_k − g_k + A_k^T λ̄_k)_+ − x_k]_{A*}||^2 < ||[(x_k − g_k + A_k^T µ_k)_+ − x_k]_{A*}||^2 = ||[x_k]_{A*}||^2.

Consequently, there exists an index j ∈ A*, depending on k, such that

    x_k^j > [g_k − A_k^T λ̄_k]_j = [g_k − A_k^T µ_k]_j + [A_k^T (µ_k − λ̄_k)]_j ≥ [g_k − A_k^T µ_k]_j − ||A_k^T|| ||µ_k − λ̄_k||,

which yields

    ||µ_k − λ̄_k|| > ( [g_k − A_k^T µ_k]_j − x_k^j ) / ||A_k^T|| ≥ ξ/M,    (4.9)

where M = max_k { ||A_k|| } < ∞ and the second inequality follows from (4.6). Recall that Rank([(A_k)^T]_I) = m if x_k ∈ B_ρ(x*). Hence, there exists ξ̂ > 0 such that

    ||[A_k^T]_I (µ_k − λ̄_k)|| ≥ ξ̂ ξ/M.

Reducing ρ if necessary, since x_k ∈ B_ρ(x*), it again follows from (4.3) and µ_k = ([A_k^T]_I)^† [g_k]_I being the unique minimizer of (4.4) that

    ||[g_k − A_k^T µ_k]_I|| ≤ ξ ξ̂ / (2M),

which further gives

    ||[g_k − A_k^T λ̄_k]_I|| = ||[g_k − A_k^T µ_k]_I + [A_k^T]_I (µ_k − λ̄_k)||
        ≥ ||[A_k^T]_I (µ_k − λ̄_k)|| − ||[g_k − A_k^T µ_k]_I||
        ≥ ξ ξ̂/M − ||[g_k − A_k^T µ_k]_I|| ≥ ξ ξ̂/(2M).    (4.10)

Hence, by (4.8) and (4.10),

    ψ_k(λ̄_k) = ||(x_k − g_k + A_k^T λ̄_k)_+ − x_k||^2 ≥ ||[(x_k − g_k + A_k^T λ̄_k)_+ − x_k]_I||^2 ≥ min{ (ξ ξ̂/(2M))^2, (γ_k)^2/4 } =: θ > 0.    (4.11)

However, by (4.3) and (4.7), we can choose ρ sufficiently small such that if x_k ∈ B_ρ(x*), then ψ_k(µ_k) ≤ θ/2, and therefore by (4.11), ψ_k(µ_k) < ψ_k(λ̄_k). This contradicts the definition of λ̄_k, which requires ψ_k(λ̄_k) ≤ ψ_k(µ_k). Hence, if x_k ∈ B_ρ(x*) with ρ sufficiently small, then λ̄_k = µ_k, which is uniquely given by (4.2).

Lemma 4.1 shows that if the iterates {x_k} generated by ALAS converge to a KKT point satisfying AS.1-AS.4, then {λ̄_k} converges to the unique Lagrange multiplier λ* associated with the equality constraints at x*. Hence, if we choose the components of λ_min and λ_max sufficiently small and large, respectively, such that λ_min < λ* < λ_max, then λ_min < λ̄_k < λ_max for all large k. Then, according to the computation of λ_k in (2.19), λ_k = λ̄_k for all large k.

To show the boundedness of the penalty parameters {σ_k}, we also need the following assumption.

AS.5 Assume that for all large k, λ_k is updated by (2.19) with λ_min ≤ λ_k ≤ λ_max.

This assumption means that λ_k is updated at all later iterations and λ_k = λ̄_k. Recall that, to decide whether to update λ_k or not, we set a test condition ||c_{k+1}|| ≤ R_k in ALAS. AS.5 essentially assumes that this condition is satisfied for all large k. Actually, this assumption is more practical than theoretical, because when x_k is close to the feasible region and σ_k is relatively large, the constraint violation will normally be decreasing.
Then, if the parameter β in ALAS is very close to 1, the condition ||c_{k+1}|| ≤ R_k will usually be satisfied in later iterations. We now conclude this section with the following theorem.

Theorem 4.2. Suppose the iterates {x_k} generated by ALAS converge to a KKT point x* satisfying assumptions AS.1-AS.5. Then

    Pred_k ≥ δ σ_k min{ ||c_k||, ||c_k||^2 } for all large k.    (4.12)

Furthermore, if ALAS does not terminate in a finite number of iterations, the penalty parameters {σ_k} are bounded.

Proof. We first prove (4.12). We consider the case c_k ≠ 0; otherwise (4.12) obviously holds. Adding the constraints d_{A*} = 0 to the subproblem (2.9), we obtain the following problem:

    min q_k(d)  s.t. ||d|| ≤ Δ_k, x_k + D_k d ≥ 0, d_{A*} = 0.    (4.13)
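For reference, the least-squares multiplier characterized in Lemma 4.1, (4.2), can be computed with a standard solver. A minimal numpy sketch follows; the helper name and the threshold `tol` for detecting inactive bounds are illustrative assumptions, not from the paper:

```python
import numpy as np

def least_squares_multiplier(x, g, A, tol=1e-8):
    # Least-squares multiplier of Lemma 4.1 / (4.2) (a sketch):
    #     lam = argmin_lam || [g - A^T lam]_I ||^2,  I = {i : x_i > 0},
    # where A is the m x n Jacobian of c and g the gradient of f.
    # Under LICQ, [A^T]_I has full column rank and the minimizer is unique.
    inactive = x > tol             # index set I of inactive bounds
    AI = A.T[inactive, :]          # rows of A^T restricted to I: |I| x m
    gI = g[inactive]
    lam, *_ = np.linalg.lstsq(AI, gI, rcond=None)
    return lam
```

Near a KKT point satisfying AS.3-AS.4, this quantity coincides with the λ̄_k minimizing ψ_k and converges to the unique multiplier λ*, which is what allows the safeguards λ_min, λ_max to become inactive for large k.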


More information

Numerical Optimization

Numerical Optimization Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,

More information

An Inexact Newton Method for Optimization

An Inexact Newton Method for Optimization New York University Brown Applied Mathematics Seminar, February 10, 2009 Brief biography New York State College of William and Mary (B.S.) Northwestern University (M.S. & Ph.D.) Courant Institute (Postdoc)

More information

Iteration-complexity of first-order penalty methods for convex programming

Iteration-complexity of first-order penalty methods for convex programming Iteration-complexity of first-order penalty methods for convex programming Guanghui Lan Renato D.C. Monteiro July 24, 2008 Abstract This paper considers a special but broad class of convex programing CP)

More information

2.3 Linear Programming

2.3 Linear Programming 2.3 Linear Programming Linear Programming (LP) is the term used to define a wide range of optimization problems in which the objective function is linear in the unknown variables and the constraints are

More information

A class of Smoothing Method for Linear Second-Order Cone Programming

A class of Smoothing Method for Linear Second-Order Cone Programming Columbia International Publishing Journal of Advanced Computing (13) 1: 9-4 doi:1776/jac1313 Research Article A class of Smoothing Method for Linear Second-Order Cone Programming Zhuqing Gui *, Zhibin

More information

A Trust Funnel Algorithm for Nonconvex Equality Constrained Optimization with O(ɛ 3/2 ) Complexity

A Trust Funnel Algorithm for Nonconvex Equality Constrained Optimization with O(ɛ 3/2 ) Complexity A Trust Funnel Algorithm for Nonconvex Equality Constrained Optimization with O(ɛ 3/2 ) Complexity Mohammadreza Samadi, Lehigh University joint work with Frank E. Curtis (stand-in presenter), Lehigh University

More information

CONSTRAINED NONLINEAR PROGRAMMING

CONSTRAINED NONLINEAR PROGRAMMING 149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach

More information

Operations Research Lecture 4: Linear Programming Interior Point Method

Operations Research Lecture 4: Linear Programming Interior Point Method Operations Research Lecture 4: Linear Programg Interior Point Method Notes taen by Kaiquan Xu@Business School, Nanjing University April 14th 2016 1 The affine scaling algorithm one of the most efficient

More information

Nonmonotonic back-tracking trust region interior point algorithm for linear constrained optimization

Nonmonotonic back-tracking trust region interior point algorithm for linear constrained optimization Journal of Computational and Applied Mathematics 155 (2003) 285 305 www.elsevier.com/locate/cam Nonmonotonic bac-tracing trust region interior point algorithm for linear constrained optimization Detong

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

Inexact Newton Methods and Nonlinear Constrained Optimization

Inexact Newton Methods and Nonlinear Constrained Optimization Inexact Newton Methods and Nonlinear Constrained Optimization Frank E. Curtis EPSRC Symposium Capstone Conference Warwick Mathematics Institute July 2, 2009 Outline PDE-Constrained Optimization Newton

More information

Optimization Problems with Constraints - introduction to theory, numerical Methods and applications

Optimization Problems with Constraints - introduction to theory, numerical Methods and applications Optimization Problems with Constraints - introduction to theory, numerical Methods and applications Dr. Abebe Geletu Ilmenau University of Technology Department of Simulation and Optimal Processes (SOP)

More information

Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3. Optimization Problems and Iterative Algorithms Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

More information

MODIFYING SQP FOR DEGENERATE PROBLEMS

MODIFYING SQP FOR DEGENERATE PROBLEMS PREPRINT ANL/MCS-P699-1097, OCTOBER, 1997, (REVISED JUNE, 2000; MARCH, 2002), MATHEMATICS AND COMPUTER SCIENCE DIVISION, ARGONNE NATIONAL LABORATORY MODIFYING SQP FOR DEGENERATE PROBLEMS STEPHEN J. WRIGHT

More information

Generalized Uniformly Optimal Methods for Nonlinear Programming

Generalized Uniformly Optimal Methods for Nonlinear Programming Generalized Uniformly Optimal Methods for Nonlinear Programming Saeed Ghadimi Guanghui Lan Hongchao Zhang Janumary 14, 2017 Abstract In this paper, we present a generic framewor to extend existing uniformly

More information

In view of (31), the second of these is equal to the identity I on E m, while this, in view of (30), implies that the first can be written

In view of (31), the second of these is equal to the identity I on E m, while this, in view of (30), implies that the first can be written 11.8 Inequality Constraints 341 Because by assumption x is a regular point and L x is positive definite on M, it follows that this matrix is nonsingular (see Exercise 11). Thus, by the Implicit Function

More information

A Simple Primal-Dual Feasible Interior-Point Method for Nonlinear Programming with Monotone Descent

A Simple Primal-Dual Feasible Interior-Point Method for Nonlinear Programming with Monotone Descent A Simple Primal-Dual Feasible Interior-Point Method for Nonlinear Programming with Monotone Descent Sasan Bahtiari André L. Tits Department of Electrical and Computer Engineering and Institute for Systems

More information

Beyond Heuristics: Applying Alternating Direction Method of Multipliers in Nonconvex Territory

Beyond Heuristics: Applying Alternating Direction Method of Multipliers in Nonconvex Territory Beyond Heuristics: Applying Alternating Direction Method of Multipliers in Nonconvex Territory Xin Liu(4Ð) State Key Laboratory of Scientific and Engineering Computing Institute of Computational Mathematics

More information

Lectures 9 and 10: Constrained optimization problems and their optimality conditions

Lectures 9 and 10: Constrained optimization problems and their optimality conditions Lectures 9 and 10: Constrained optimization problems and their optimality conditions Coralia Cartis, Mathematical Institute, University of Oxford C6.2/B2: Continuous Optimization Lectures 9 and 10: Constrained

More information

Convex Optimization and SVM

Convex Optimization and SVM Convex Optimization and SVM Problem 0. Cf lecture notes pages 12 to 18. Problem 1. (i) A slab is an intersection of two half spaces, hence convex. (ii) A wedge is an intersection of two half spaces, hence

More information

Priority Programme 1962

Priority Programme 1962 Priority Programme 1962 An Example Comparing the Standard and Modified Augmented Lagrangian Methods Christian Kanzow, Daniel Steck Non-smooth and Complementarity-based Distributed Parameter Systems: Simulation

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods

More information

A New Sequential Optimality Condition for Constrained Nonsmooth Optimization

A New Sequential Optimality Condition for Constrained Nonsmooth Optimization A New Sequential Optimality Condition for Constrained Nonsmooth Optimization Elias Salomão Helou Sandra A. Santos Lucas E. A. Simões November 23, 2018 Abstract We introduce a sequential optimality condition

More information

Lecture 13: Constrained optimization

Lecture 13: Constrained optimization 2010-12-03 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems

More information

Nonlinear Programming, Elastic Mode, SQP, MPEC, MPCC, complementarity

Nonlinear Programming, Elastic Mode, SQP, MPEC, MPCC, complementarity Preprint ANL/MCS-P864-1200 ON USING THE ELASTIC MODE IN NONLINEAR PROGRAMMING APPROACHES TO MATHEMATICAL PROGRAMS WITH COMPLEMENTARITY CONSTRAINTS MIHAI ANITESCU Abstract. We investigate the possibility

More information

Solving Dual Problems

Solving Dual Problems Lecture 20 Solving Dual Problems We consider a constrained problem where, in addition to the constraint set X, there are also inequality and linear equality constraints. Specifically the minimization problem

More information

A SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM THAT COMBINES MERIT FUNCTION AND FILTER IDEAS

A SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM THAT COMBINES MERIT FUNCTION AND FILTER IDEAS A SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM THAT COMBINES MERIT FUNCTION AND FILTER IDEAS FRANCISCO A. M. GOMES Abstract. A sequential quadratic programming algorithm for solving nonlinear programming

More information

Self-Concordant Barrier Functions for Convex Optimization

Self-Concordant Barrier Functions for Convex Optimization Appendix F Self-Concordant Barrier Functions for Convex Optimization F.1 Introduction In this Appendix we present a framework for developing polynomial-time algorithms for the solution of convex optimization

More information

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory Preprint ANL/MCS-P1015-1202, Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory A GLOBALLY CONVERGENT LINEARLY CONSTRAINED LAGRANGIAN METHOD FOR

More information

A globally and quadratically convergent primal dual augmented Lagrangian algorithm for equality constrained optimization

A globally and quadratically convergent primal dual augmented Lagrangian algorithm for equality constrained optimization Optimization Methods and Software ISSN: 1055-6788 (Print) 1029-4937 (Online) Journal homepage: http://www.tandfonline.com/loi/goms20 A globally and quadratically convergent primal dual augmented Lagrangian

More information

You should be able to...

You should be able to... Lecture Outline Gradient Projection Algorithm Constant Step Length, Varying Step Length, Diminishing Step Length Complexity Issues Gradient Projection With Exploration Projection Solving QPs: active set

More information

On sequential optimality conditions for constrained optimization. José Mario Martínez martinez

On sequential optimality conditions for constrained optimization. José Mario Martínez  martinez On sequential optimality conditions for constrained optimization José Mario Martínez www.ime.unicamp.br/ martinez UNICAMP, Brazil 2011 Collaborators This talk is based in joint papers with Roberto Andreani

More information

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R

More information

On the use of piecewise linear models in nonlinear programming

On the use of piecewise linear models in nonlinear programming Math. Program., Ser. A (2013) 137:289 324 DOI 10.1007/s10107-011-0492-9 FULL LENGTH PAPER On the use of piecewise linear models in nonlinear programming Richard H. Byrd Jorge Nocedal Richard A. Waltz Yuchen

More information

An interior-point gradient method for large-scale totally nonnegative least squares problems

An interior-point gradient method for large-scale totally nonnegative least squares problems An interior-point gradient method for large-scale totally nonnegative least squares problems Michael Merritt and Yin Zhang Technical Report TR04-08 Department of Computational and Applied Mathematics Rice

More information

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL)

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL) Part 4: Active-set methods for linearly constrained optimization Nick Gould RAL fx subject to Ax b Part C course on continuoue optimization LINEARLY CONSTRAINED MINIMIZATION fx subject to Ax { } b where

More information

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information

A PRIMAL-DUAL TRUST REGION ALGORITHM FOR NONLINEAR OPTIMIZATION

A PRIMAL-DUAL TRUST REGION ALGORITHM FOR NONLINEAR OPTIMIZATION Optimization Technical Report 02-09, October 2002, UW-Madison Computer Sciences Department. E. Michael Gertz 1 Philip E. Gill 2 A PRIMAL-DUAL TRUST REGION ALGORITHM FOR NONLINEAR OPTIMIZATION 7 October

More information

Implications of the Constant Rank Constraint Qualification

Implications of the Constant Rank Constraint Qualification Mathematical Programming manuscript No. (will be inserted by the editor) Implications of the Constant Rank Constraint Qualification Shu Lu Received: date / Accepted: date Abstract This paper investigates

More information

2.098/6.255/ Optimization Methods Practice True/False Questions

2.098/6.255/ Optimization Methods Practice True/False Questions 2.098/6.255/15.093 Optimization Methods Practice True/False Questions December 11, 2009 Part I For each one of the statements below, state whether it is true or false. Include a 1-3 line supporting sentence

More information

A null-space primal-dual interior-point algorithm for nonlinear optimization with nice convergence properties

A null-space primal-dual interior-point algorithm for nonlinear optimization with nice convergence properties A null-space primal-dual interior-point algorithm for nonlinear optimization with nice convergence properties Xinwei Liu and Yaxiang Yuan Abstract. We present a null-space primal-dual interior-point algorithm

More information

A derivative-free nonmonotone line search and its application to the spectral residual method

A derivative-free nonmonotone line search and its application to the spectral residual method IMA Journal of Numerical Analysis (2009) 29, 814 825 doi:10.1093/imanum/drn019 Advance Access publication on November 14, 2008 A derivative-free nonmonotone line search and its application to the spectral

More information

Pacific Journal of Optimization (Vol. 2, No. 3, September 2006) ABSTRACT

Pacific Journal of Optimization (Vol. 2, No. 3, September 2006) ABSTRACT Pacific Journal of Optimization Vol., No. 3, September 006) PRIMAL ERROR BOUNDS BASED ON THE AUGMENTED LAGRANGIAN AND LAGRANGIAN RELAXATION ALGORITHMS A. F. Izmailov and M. V. Solodov ABSTRACT For a given

More information

Examination paper for TMA4180 Optimization I

Examination paper for TMA4180 Optimization I Department of Mathematical Sciences Examination paper for TMA4180 Optimization I Academic contact during examination: Phone: Examination date: 26th May 2016 Examination time (from to): 09:00 13:00 Permitted

More information

Sequential Quadratic Programming Method for Nonlinear Second-Order Cone Programming Problems. Hirokazu KATO

Sequential Quadratic Programming Method for Nonlinear Second-Order Cone Programming Problems. Hirokazu KATO Sequential Quadratic Programming Method for Nonlinear Second-Order Cone Programming Problems Guidance Professor Masao FUKUSHIMA Hirokazu KATO 2004 Graduate Course in Department of Applied Mathematics and

More information

IBM Research Report. Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence

IBM Research Report. Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence RC23036 (W0304-181) April 21, 2003 Computer Science IBM Research Report Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence Andreas Wächter, Lorenz T. Biegler IBM Research

More information

Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms: A Comparative Study

Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms: A Comparative Study International Journal of Mathematics And Its Applications Vol.2 No.4 (2014), pp.47-56. ISSN: 2347-1557(online) Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms:

More information

Primal/Dual Decomposition Methods

Primal/Dual Decomposition Methods Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2018-19, HKUST, Hong Kong Outline of Lecture Subgradients

More information

How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization

How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization Frank E. Curtis Department of Industrial and Systems Engineering, Lehigh University Daniel P. Robinson Department

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 6 Optimization Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 6 Optimization Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted

More information

E5295/5B5749 Convex optimization with engineering applications. Lecture 8. Smooth convex unconstrained and equality-constrained minimization

E5295/5B5749 Convex optimization with engineering applications. Lecture 8. Smooth convex unconstrained and equality-constrained minimization E5295/5B5749 Convex optimization with engineering applications Lecture 8 Smooth convex unconstrained and equality-constrained minimization A. Forsgren, KTH 1 Lecture 8 Convex optimization 2006/2007 Unconstrained

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.

More information

ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS

ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS MATHEMATICS OF OPERATIONS RESEARCH Vol. 28, No. 4, November 2003, pp. 677 692 Printed in U.S.A. ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS ALEXANDER SHAPIRO We discuss in this paper a class of nonsmooth

More information