Differentiable exact penalty functions for nonlinear optimization with easy constraints. Takuma NISHIMURA


Master's Thesis

Differentiable exact penalty functions for nonlinear optimization with easy constraints

Guidance: Assistant Professor Ellen Hidemi FUKUDA

Takuma NISHIMURA

Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University

February 2015

Abstract

One approach for solving nonlinear constrained optimization problems is to use methods based on exact penalty functions. Basically, with an appropriate choice of the penalty parameter, the optimal solutions of the original constrained problem are obtained by solving an unconstrained one, which is easier to solve. Recently, Andreani, Fukuda and Silva proposed an implementable exact penalty method for nonlinear optimization problems with general equality and inequality constraints. In this paper, we extend their work, considering problems that distinguish the easy constraints from the general difficult ones. In this case, the definition of the exact penalty function is changed in such a way that the original problem is replaced by a problem containing only easy constraints, which is also easier to solve. In order to construct such an exact penalty function, we consider the case where the easy constraints are defined by linear equalities, and then create an estimate of the Lagrange multipliers associated with a point. We incorporate this multipliers estimate in an augmented Lagrangian function, and then prove all the exactness results. Finally, we propose to use the spectral projected gradient method with a dynamic update of the penalty parameter.

Contents

1 Introduction
2 Preliminaries
3 Constructing the penalty function
  3.1 The multipliers estimate
  3.2 The penalty function
4 Exactness results
  4.1 Analysis of KKT points
  4.2 Optimality results
5 Algorithm
  5.1 Updating the penalty parameter
  5.2 Spectral projected gradient method
6 Conclusion
References

1 Introduction

We consider the following nonlinear constrained optimization problem:

\[
\begin{array}{ll}
\min & f(x) \\
\text{s.t.} & g(x) \le 0, \\
& h(x) = 0, \\
& x \in X,
\end{array}
\tag{1.1}
\]

where $f\colon \mathbb{R}^n \to \mathbb{R}$, $g\colon \mathbb{R}^n \to \mathbb{R}^m$, and $h\colon \mathbb{R}^n \to \mathbb{R}^p$ are twice continuously differentiable functions, and $X \subseteq \mathbb{R}^n$ is a nonempty closed convex set. Here, $X$ is the set of easy constraints, in particular

\[
X = \{x \in \mathbb{R}^n \mid Ax = b\}, \tag{1.2}
\]

with $A \in \mathbb{R}^{l \times n}$, $b \in \mathbb{R}^l$, $l \le n$ and $\operatorname{rank}(A) = l$.

Many methods for nonlinear constrained minimization problems have been considered in the literature, but here we focus on the penalty function approach. Among those methods, we cite the quadratic penalty functions, the augmented Lagrangian functions, and the exact penalty functions. In particular, the latter approach consists of replacing the constrained minimization problem with a single unconstrained one, which is easier to solve. Moreover, the penalty parameter of the unconstrained problem is finite. One of the most famous exact penalty functions is the one proposed by Zangwill in [16]. For problem (1.1) with $X = \mathbb{R}^n$, it is defined by

\[
f(x) + c \max\{0, g_1(x), \dots, g_m(x), |h_1(x)|, \dots, |h_p(x)|\},
\]

where $c > 0$ is a penalty parameter. Under reasonable assumptions and for $c$ sufficiently large, we can get a solution of (1.1) by minimizing the above penalty function. However, such a function is nondifferentiable because it contains a maximum term in its formula. Also, it is not easy to find the appropriate penalty parameter.

After Zangwill, many researchers proposed differentiable exact penalty functions, starting with Fletcher in 1970 [8], Mukai and Polak in 1975 [13], and Glad and Polak in 1979 [10]. In the 1980s, Di Pillo and Grippo proposed another type of exact penalty function, based on the Lagrange multipliers estimate given by Glad and Polak [10]. They first considered optimization problems with inequality constraints in [6] and further extended the idea to problems with both inequality and equality constraints [7].
The idea of Di Pillo and Grippo was to incorporate a Lagrange multipliers estimate in an augmented Lagrangian function, so that the resulting function has exactness properties. More recently, in 2010, André and Silva extended Di Pillo and Grippo's idea to solve variational inequality problems with the feasible set defined by functional inequality constraints [3]. We recall that variational inequality problems have many applications, and also extend the first-order necessary optimality conditions of nonlinear programming problems. In 2013, based on André and Silva's approach, Andreani et al. proposed a Gauss-Newton-type method based on exact penalty functions to solve optimization problems with general inequality and equality constraints [2]. If second-order methods, like the Newton-type method, are applied to solve the problem, it would be necessary to deal with third-order derivatives of the problem data, which is difficult from the numerical point of view. To overcome this difficulty, they proposed to incorporate a multipliers estimate in the augmented Lagrangian function for variational inequalities.

Another approach that does not deal with these third-order derivatives was given in 2012 by Fukuda et al. [9]. They proposed an exact penalty function for nonlinear second-order cone programs, which are an extension of nonlinear programming problems. However, in this case, they extended the multipliers estimate given by Lucidi in [12], and incorporated it into the classical augmented Lagrangian for nonlinear programming. Also, they approximated the B-subdifferential of the gradient of the penalty function and proved that the Newton-type method has global and superlinear convergence.

In this paper, we further extend the exact penalty approach given by Andreani et al., but use a multipliers estimate and an augmented Lagrangian function that are similar to the ones given by Fukuda et al. Moreover, the nonlinear optimization problems considered here distinguish the easy constraints from the general (equality and inequality) ones. We are particularly interested in the case where the easy constraints are given by general linear equality constraints. For such a case, the definition of exact penalty functions changes. Here, the original problem can be solved by minimizing the exact penalty function subject to the easy constraints, instead of performing an unconstrained minimization. In fact, one can observe that optimization problems containing only easy constraints are not much more difficult to solve than unconstrained ones. In augmented Lagrangian methods, for example, it is also common to distinguish the easy constraints.

The paper is organized as follows. In Section 2, we give some notions and results that are necessary for the construction of the penalty function. In Section 3, we first propose an extension of Lucidi's multipliers estimate using the fact that the projection mapping onto the set $X$ is easy to compute.
Next, we show the properties associated with this estimate and incorporate it in the augmented Lagrangian for nonlinear programming problems. In Section 4, we prove that the constructed function is in fact an exact penalty function. In order to do that, we analyze stationary points, global optimizers and local optimizers. In Section 5, we propose a way to dynamically update the penalty parameter, and suggest using the spectral projected gradient method to solve the problem. We conclude in Section 6 with some remarks and future work.

Throughout the paper, we use the following notation. We denote the Euclidean norm by $\|\cdot\|$, the supremum norm by $\|\cdot\|_\infty$ and the inner product by $\langle \cdot, \cdot \rangle$. The set of positive real numbers is denoted by $\mathbb{R}_{++}$. We also denote the identity matrix of dimension $n$ by $I_n$, and the transpose of a matrix $Z$ by $Z^T$. For functions $\theta\colon \mathbb{R}^n \to \mathbb{R}$ and $\nu\colon \mathbb{R}^n \to \mathbb{R}^m$, the gradient of $\theta$, the Hessian of $\theta$, and the Jacobian matrix of $\nu$ at $x \in \mathbb{R}^n$ are given by $\nabla\theta(x)$, $\nabla^2\theta(x)$ and $J\nu(x)$, respectively. For a function $\eta\colon \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, the gradient and the Hessian of $\eta$ with respect to the first variable are given by $\nabla_x \eta(x, y)$ and $\nabla^2_{xx}\eta(x, y)$, respectively. Moreover, given a vector $z := (z_1, \dots, z_n)^T \in \mathbb{R}^n$, the diagonal matrix with diagonal entries $z_i$, $i = 1, \dots, n$, is denoted by $\operatorname{diag}(z)$.

2 Preliminaries

In this section, we introduce some basic notions and results in order to construct a penalty function. First, we describe the Karush-Kuhn-Tucker (KKT) conditions of problem (1.1), which can be written as follows:

\[
\begin{aligned}
-\nabla_x L(x, \lambda, \mu) &\in N_X(x), && (2.1)\\
g(x) &\le 0, && (2.2)\\
h(x) &= 0, && (2.3)\\
\lambda &\ge 0, && (2.4)\\
\langle \lambda, g(x) \rangle &= 0, && (2.5)
\end{aligned}
\]

where $L(x, \lambda, \mu) := f(x) + \langle \lambda, g(x) \rangle + \langle \mu, h(x) \rangle$ is the Lagrangian function associated with (1.1), and $\lambda \in \mathbb{R}^m$ and $\mu \in \mathbb{R}^p$ are the Lagrange multipliers associated with the inequality constraints $g(x) \le 0$ and the equality constraints $h(x) = 0$, respectively. Moreover, $N_X(x)$ is the normal cone to the set of easy constraints $X$ at $x$, that is,

\[
N_X(x) := \{z \in \mathbb{R}^n \mid \langle z, y - x \rangle \le 0, \ \forall y \in X\}.
\]

Concerning condition (2.1), we can prove the following lemma.

Lemma 2.1. For all $(x, \lambda, \mu) \in \mathbb{R}^{n+m+p}$, we have

\[
-\nabla_x L(x, \lambda, \mu) \in N_X(x) \iff P_X(-\nabla_x L(x, \lambda, \mu) + x) - x = 0, \tag{2.6}
\]

where $P_X$ denotes the projection onto $X$.

Proof. The condition $-\nabla_x L(x, \lambda, \mu) \in N_X(x)$ is equivalent to

\[
\langle -\nabla_x L(x, \lambda, \mu), y - x \rangle \le 0, \quad \forall y \in X,
\]

which can be written as

\[
\langle (-\nabla_x L(x, \lambda, \mu) + x) - x, y - x \rangle \le 0, \quad \forall y \in X.
\]

Note that a point $z \in X$ is the projection of $u$ onto $X$ if and only if $\langle u - z, y - z \rangle \le 0$ for all $y \in X$. Therefore, we get $P_X(-\nabla_x L(x, \lambda, \mu) + x) = x$.

3 Constructing the penalty function

In this section, we construct an exact penalty function based on the ideas given by Di Pillo and Grippo in [7] and Andreani et al. in [2]. Here, the definition of exact penalty functions is changed. In fact, instead of replacing the original problem with an unconstrained one, we replace it with a problem containing only easy constraints. More precisely, we transform problem (1.1) into the following problem:

\[
\begin{array}{ll}
\min & w_c(x) \\
\text{s.t.} & x \in X.
\end{array}
\tag{3.1}
\]

The basic idea for constructing $w_c$ is to build a Lagrange multipliers estimate associated with a point and incorporate it in the classical augmented Lagrangian for nonlinear programming problems.

3.1 The multipliers estimate

In order to construct the penalty function, we first consider the following unconstrained minimization problem. The idea was given by Glad and Polak in [10] and further extended by Lucidi in [12]. In this work, it consists of finding an estimate of the Lagrange multipliers associated with a point $x \in \mathbb{R}^n$ by solving the problem

\[
\min_{\lambda, \mu} \ \|P_X(-\nabla_x L(x, \lambda, \mu) + x) - x\|^2 + \zeta_1^2 \|G(x)\lambda\|^2 + \zeta_2^2 \alpha(x)\left(\|\lambda\|^2 + \|\mu\|^2\right), \tag{3.2}
\]

where $\nabla_x L(x, \lambda, \mu)$ is the gradient of the Lagrangian function with respect to $x$, $\zeta_1, \zeta_2 > 0$, and $G(x) := \operatorname{diag}(g_1(x), \dots, g_m(x))$ is the diagonal matrix with diagonal entries $g_i(x)$, $i = 1, \dots, m$. Moreover,

\[
\alpha(x) := \frac{1}{2}\left(\|\max\{g(x), 0\}\|^2 + \|h(x)\|^2\right) = \frac{1}{2}\left(\sum_{i=1}^m \max\{g_i(x), 0\}^2 + \sum_{i=1}^p h_i(x)^2\right) \tag{3.3}
\]

is a function that measures how infeasible a point $x$ is with respect to the equality and inequality constraints. Note that if $G(x)\lambda = 0$ holds, then the complementarity condition (2.5) is satisfied. Now, we show a property related to $P_X$.

Lemma 3.1. Let $z \in \mathbb{R}^n$ and let $X$ be given by (1.2). Then, the projection of $z$ onto $X$ can be written as

\[
P_X(z) = \left(I_n - A^T(AA^T)^{-1}A\right) z + A^T(AA^T)^{-1} b. \tag{3.4}
\]

Proof. First, given $z \in \mathbb{R}^n$, we consider the following minimization problem:

\[
\begin{array}{ll}
\min_w & \frac{1}{2}\|w - z\|^2 \\
\text{s.t.} & Aw = b.
\end{array}
\]

Its Lagrangian function is

\[
L(w, \lambda) = \frac{1}{2}\|w - z\|^2 - \langle \lambda, Aw - b \rangle.
\]

Considering $\nabla_w L(w, \lambda)$, we get

\[
\nabla_w L(w, \lambda) = w - z - A^T\lambda.
\]

If $\nabla_w L(w, \lambda) = 0$, then multiplying by $A$ we obtain

\[
Aw - Az - AA^T\lambda = 0.
\]

Since $\operatorname{rank}(A) = l$ and $Aw = b$, we get

\[
\lambda = (AA^T)^{-1}(b - Az).
\]

Therefore, from $w = z + A^T\lambda$ we obtain

\[
w = \left(I_n - A^T(AA^T)^{-1}A\right) z + A^T(AA^T)^{-1} b.
\]

Observe that from Lemma 3.1, we have

\[
P_X(-\nabla_x L(x, \lambda, \mu) + x) - x = q(x) - P\nabla_x L(x, \lambda, \mu),
\]

where

\[
P := I_n - A^T(AA^T)^{-1}A, \qquad q(x) := Px + A^T(AA^T)^{-1}b - x. \tag{3.5}
\]

Here, we also note that $P = P^T = P^2$ and $Pq(x) = 0$. This result shows that (3.2) is equivalent to the following problem:

\[
\min_{\lambda, \mu} \ \left\|
\begin{bmatrix}
P Jg(x)^T & P Jh(x)^T \\
\zeta_1 G(x) & 0 \\
\zeta_2 \alpha(x)^{1/2} I_m & 0 \\
0 & \zeta_2 \alpha(x)^{1/2} I_p
\end{bmatrix}
\begin{bmatrix} \lambda \\ \mu \end{bmatrix}
-
\begin{bmatrix} q(x) - P\nabla f(x) \\ 0 \\ 0 \\ 0 \end{bmatrix}
\right\|^2, \tag{3.6}
\]

which is a linear least squares problem. From now on, we assume that the following assumption holds.

Assumption 3.1. A point $x \in \mathbb{R}^n$ satisfies the linear independence constraint qualification (LICQ) on the set of feasible points. More precisely, the gradients $\nabla g_i(x)$, $\nabla h_j(x)$ and the vectors $a_k$ (where $a_k$ is the $k$th column of $A^T$) are linearly independent for all $i \in \{i \in \{1, \dots, m\} \mid g_i(x) = 0\}$, all $j = 1, \dots, p$ and all $k = 1, \dots, l$.

If Assumption 3.1 is satisfied, then the following result holds.

Proposition 3.1. Suppose that $x \in \mathbb{R}^n$ satisfies Assumption 3.1. Then, $P\nabla g_i(x)$ and $P\nabla h_j(x)$ are linearly independent for all $i \in \{i \in \{1, \dots, m\} \mid g_i(x) = 0\}$ and all $j = 1, \dots, p$.

Proof. Suppose that Assumption 3.1 is satisfied. It means that

\[
\sum_{i \in I} \alpha_i \nabla g_i(x) + \sum_{j=1}^p \beta_j \nabla h_j(x) + A^T\gamma = 0 \implies \alpha_i, \beta_j, \gamma = 0, \tag{3.7}
\]

where $I := \{i \in \{1, \dots, m\} \mid g_i(x) = 0\}$. Now, assume that there exist $\alpha_i$, $i \in I$, and $\beta_j$, $j = 1, \dots, p$, such that

\[
\sum_{i \in I} \alpha_i P\nabla g_i(x) + \sum_{j=1}^p \beta_j P\nabla h_j(x) = 0.
\]

From (3.5), we get

\[
\left(I_n - A^T(AA^T)^{-1}A\right)\left(\sum_{i \in I} \alpha_i \nabla g_i(x) + \sum_{j=1}^p \beta_j \nabla h_j(x)\right) = 0. \tag{3.8}
\]

Defining $v = \sum_{i \in I} \alpha_i \nabla g_i(x) + \sum_{j=1}^p \beta_j \nabla h_j(x)$, (3.8) is equivalent to

\[
\sum_{i \in I} \alpha_i \nabla g_i(x) + \sum_{j=1}^p \beta_j \nabla h_j(x) + A^T\left(-(AA^T)^{-1}Av\right) = 0. \tag{3.9}
\]

Comparing with (3.7), this means that $\alpha_i, \beta_j = 0$ for all $i \in I$, $j = 1, \dots, p$, and $(AA^T)^{-1}Av = 0$. Thus, we conclude that $P\nabla g_i(x)$, $P\nabla h_j(x)$, $i \in I$, $j = 1, \dots, p$, are linearly independent.

The following proposition gives some properties associated with the multipliers estimate.

Proposition 3.2. Suppose that $x \in \mathbb{R}^n$ satisfies Assumption 3.1. Define the matrix $N(x)$ as follows:

\[
N(x) := \begin{bmatrix}
Jg(x)PJg(x)^T + \zeta_1^2 G(x)^2 + \zeta_2^2 \alpha(x) I_m & Jg(x)PJh(x)^T \\
Jh(x)PJg(x)^T & Jh(x)PJh(x)^T + \zeta_2^2 \alpha(x) I_p
\end{bmatrix}.
\]

Then,

(a) The matrix $N(x)$ is positive definite.

(b) The solution of (3.6) (equivalently, (3.2)) is unique and it is given by

\[
\begin{bmatrix} \lambda(x) \\ \mu(x) \end{bmatrix}
= N^{-1}(x) \begin{bmatrix} Jg(x)P \\ Jh(x)P \end{bmatrix} \left(q(x) - P\nabla f(x)\right).
\]

(c) If $(x, \lambda, \mu) \in \mathbb{R}^{n+m+p}$ satisfies the KKT conditions (2.1)-(2.5), then $\lambda = \lambda(x)$ and $\mu = \mu(x)$.

(d) The Jacobian matrices of $\lambda(\cdot)$ and $\mu(\cdot)$ are given by

\[
\begin{bmatrix} J\lambda(x) \\ J\mu(x) \end{bmatrix}
= -N^{-1}(x) \begin{bmatrix} R_1(x) \\ R_2(x) \end{bmatrix},
\]

with

\[
\begin{aligned}
R_1(x) &:= Jg(x)P\nabla^2_{xx} L(x, \lambda(x), \mu(x)) + 2\zeta_1^2 \Lambda(x)G(x)Jg(x) + \zeta_2^2 \lambda(x)\nabla\alpha(x)^T \\
&\qquad + \sum_{i=1}^m e_i^m \nabla_x L(x, \lambda(x), \mu(x))^T P \nabla^2 g_i(x), \\
R_2(x) &:= Jh(x)P\nabla^2_{xx} L(x, \lambda(x), \mu(x)) + \zeta_2^2 \mu(x)\nabla\alpha(x)^T \\
&\qquad + \sum_{i=1}^p e_i^p \nabla_x L(x, \lambda(x), \mu(x))^T P \nabla^2 h_i(x),
\end{aligned}
\]

where $\Lambda(x) := \operatorname{diag}(\lambda_1(x), \dots, \lambda_m(x))$ is the diagonal matrix with diagonal entries $\lambda_i(x)$, and $e_i^m$ and $e_i^p$ denote the $i$th canonical vectors of $\mathbb{R}^m$ and $\mathbb{R}^p$, respectively.

Proof. (a) We consider the matrix $A(x) \in \mathbb{R}^{(n+2m+p) \times (m+p)}$ associated with the linear least squares problem (3.6), that is,

\[
A(x) := \begin{bmatrix}
P Jg(x)^T & P Jh(x)^T \\
\zeta_1 G(x) & 0 \\
\zeta_2 \alpha(x)^{1/2} I_m & 0 \\
0 & \zeta_2 \alpha(x)^{1/2} I_p
\end{bmatrix}. \tag{3.10}
\]

If $x$ is infeasible, then $A(x)$ has full column rank, since in this case $\alpha(x) \neq 0$. Now, assume that $x$ is feasible, so that $\alpha(x) = 0$. Without loss of generality, we can write $Jg(x)^T = [Jg(x)_{=}^T \ \ Jg(x)_{<}^T]$, where $Jg(x)_{=}$ and $Jg(x)_{<}$ correspond to the parts of $Jg(x)$ where $g_i(x) = 0$ and $g_i(x) < 0$, respectively. In the same way, we can define the matrices $G(x)_{=}$ and $G(x)_{<}$. Moreover, we define $m_1$ and $m_2$ as the number of rows of $Jg(x)_{=}$ and $Jg(x)_{<}$, respectively. Then, since $G(x)_{=} = 0$ and $\alpha(x) = 0$, we have

\[
A(x) = \begin{bmatrix}
P Jg(x)_{=}^T & P Jg(x)_{<}^T & P Jh(x)^T \\
0 & 0 & 0 \\
0 & \zeta_1 G(x)_{<} & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}.
\]

We can see that $A(x)$ has linearly independent columns, by Proposition 3.1 and because the diagonal matrix $G(x)_{<}$ has nonzero diagonal entries. Furthermore, we can see that $N(x) = A(x)^T A(x)$, so we conclude that $N(x)$ is nonsingular and positive definite.

(b) If we differentiate the objective function of problem (3.6) and set the result to zero, it yields

\[
A(x)^T A(x) \begin{bmatrix} \lambda(x) \\ \mu(x) \end{bmatrix}
= A(x)^T \begin{bmatrix} q(x) - P\nabla f(x) \\ 0 \\ 0 \\ 0 \end{bmatrix},
\]

where $A(x)$ is defined in (3.10). The result follows since $N(x) = A(x)^T A(x)$ is nonsingular from item (a).

(c) From the KKT conditions and equivalence (2.6), we have $P_X(-\nabla_x L(x, \lambda, \mu) + x) - x = 0$, $G(x)\lambda = 0$, and $\alpha(x) = 0$, so the value of the objective function of (3.2) at $(\lambda, \mu)$ is zero. The result follows since the solution of (3.2) is unique from (b), and because the objective function's value is always nonnegative.

(d) From item (b), we have

\[
\begin{aligned}
Jg(x)Pq(x) - Jg(x)P\nabla f(x) &= \left(Jg(x)PJg(x)^T + \zeta_1^2 G(x)^2 + \zeta_2^2 \alpha(x) I_m\right)\lambda(x) + Jg(x)PJh(x)^T\mu(x), \\
Jh(x)Pq(x) - Jh(x)P\nabla f(x) &= Jh(x)PJg(x)^T\lambda(x) + \left(Jh(x)PJh(x)^T + \zeta_2^2 \alpha(x) I_p\right)\mu(x),
\end{aligned}
\]

which, since $Pq(x) = 0$, is equivalent to

\[
Jg(x)P\nabla_x L(x, \lambda(x), \mu(x)) + \left(\zeta_1^2 G(x)^2 + \zeta_2^2 \alpha(x) I_m\right)\lambda(x) = 0, \tag{3.11}
\]

\[
Jh(x)P\nabla_x L(x, \lambda(x), \mu(x)) + \zeta_2^2 \alpha(x) I_p\, \mu(x) = 0. \tag{3.12}
\]

From equation (3.11), we obtain

\[
\sum_{i=1}^m e_i^m \nabla g_i(x)^T P \nabla_x L(x, \lambda(x), \mu(x)) + \left(\zeta_1^2 G(x)^2 + \zeta_2^2 \alpha(x) I_m\right)\lambda(x) = 0.
\]

Thus, differentiating it with respect to $x$ yields

\[
\begin{aligned}
0 &= \sum_{i=1}^m e_i^m \nabla_x L(x, \lambda(x), \mu(x))^T P \nabla^2 g_i(x) + 2\zeta_1^2 \Lambda(x)G(x)Jg(x) \\
&\quad + \zeta_1^2 G(x)^2 J\lambda(x) + \zeta_2^2 \lambda(x)\nabla\alpha(x)^T + \zeta_2^2 \alpha(x) J\lambda(x) \\
&\quad + Jg(x)P\left(\nabla^2_{xx} L(x, \lambda(x), \mu(x)) + Jg(x)^T J\lambda(x) + Jh(x)^T J\mu(x)\right) \\
&= R_1(x) + Jg(x)PJg(x)^T J\lambda(x) + \zeta_1^2 G(x)^2 J\lambda(x) + \zeta_2^2 \alpha(x) J\lambda(x) + Jg(x)PJh(x)^T J\mu(x).
\end{aligned}
\]

Similarly, from equation (3.12), we obtain

\[
0 = R_2(x) + Jh(x)PJg(x)^T J\lambda(x) + Jh(x)PJh(x)^T J\mu(x) + \zeta_2^2 \alpha(x) J\mu(x).
\]

These two equations give the desired result.

Note that the solution of the minimization problem (3.6) is unique under the LICQ assumption. In Subsection 3.2, we will see that the solution of (3.6), that is, $\lambda(x)$, $\mu(x)$, will be used to construct a penalty function. Furthermore, from (b) and (d) of the above proposition, we observe that the same matrix $N(x)$ is used to define the estimates $\lambda(x)$, $\mu(x)$ and their Jacobians $J\lambda(x)$, $J\mu(x)$. This means that, once $\lambda(x)$, $\mu(x)$ have been computed, the computation of $J\lambda(x)$, $J\mu(x)$ does not require much extra effort, because the factorization of $N(x)$ can be reused.
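As an illustration of Proposition 3.2(b) and (c), the following minimal NumPy sketch computes the multipliers estimate both by solving the linear system with $N(x)$ and by solving the least squares problem (3.6) directly. The function names, the choice $\zeta_1 = \zeta_2 = 1$, and the toy problem ($f(x) = \frac{1}{2}\|x\|^2$, $g(x) = 2 - x_1$, $h(x) = x_2 - x_3$, $X = \{x \mid x_1 + x_2 + x_3 = 3\}$) are assumptions made only for this example and do not come from the thesis.

```python
import numpy as np


def projector_and_offset(A, b, x):
    """Return P and q(x) from (3.5) for X = {x | Ax = b} with rank(A) = l."""
    AAt_inv = np.linalg.inv(A @ A.T)
    P = np.eye(x.size) - A.T @ AAt_inv @ A
    q = P @ x + A.T @ AAt_inv @ b - x
    return P, q


def alpha(gx, hx):
    """Infeasibility measure (3.3)."""
    return 0.5 * (np.sum(np.maximum(gx, 0.0) ** 2) + np.sum(hx ** 2))


def estimate_via_N(grad_f, gx, Jg, hx, Jh, A, b, x, zeta1=1.0, zeta2=1.0):
    """Estimate of Proposition 3.2(b): solve N(x) [lam; mu] = rhs."""
    P, q = projector_and_offset(A, b, x)
    m, p = gx.size, hx.size
    a = alpha(gx, hx)
    N = np.block([
        [Jg @ P @ Jg.T + zeta1**2 * np.diag(gx**2) + zeta2**2 * a * np.eye(m),
         Jg @ P @ Jh.T],
        [Jh @ P @ Jg.T,
         Jh @ P @ Jh.T + zeta2**2 * a * np.eye(p)],
    ])
    rhs = np.vstack([Jg @ P, Jh @ P]) @ (q - P @ grad_f)
    sol = np.linalg.solve(N, rhs)
    return sol[:m], sol[m:]


def estimate_via_least_squares(grad_f, gx, Jg, hx, Jh, A, b, x,
                               zeta1=1.0, zeta2=1.0):
    """Same estimate obtained directly from the least squares problem (3.6)."""
    P, q = projector_and_offset(A, b, x)
    m, p = gx.size, hx.size
    s = zeta2 * np.sqrt(alpha(gx, hx))
    M = np.vstack([
        np.hstack([P @ Jg.T, P @ Jh.T]),
        np.hstack([zeta1 * np.diag(gx), np.zeros((m, p))]),
        np.hstack([s * np.eye(m), np.zeros((m, p))]),
        np.hstack([np.zeros((p, m)), s * np.eye(p)]),
    ])
    target = np.concatenate([q - P @ grad_f, np.zeros(2 * m + p)])
    sol = np.linalg.lstsq(M, target, rcond=None)[0]
    return sol[:m], sol[m:]


# Hypothetical toy problem (not from the thesis):
# f(x) = 0.5*||x||^2, g(x) = 2 - x1, h(x) = x2 - x3, X = {x | x1 + x2 + x3 = 3}.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])
Jg = np.array([[-1.0, 0.0, 0.0]])
Jh = np.array([[0.0, 1.0, -1.0]])


def data(x):
    """Collect the problem data evaluated at x (grad f(x) = x here)."""
    return dict(grad_f=x, gx=np.array([2.0 - x[0]]), Jg=Jg,
                hx=np.array([x[1] - x[2]]), Jh=Jh, A=A, b=b, x=x)


x_kkt = np.array([2.0, 0.5, 0.5])   # KKT point with lambda = 1.5, mu = 0
lam_kkt, mu_kkt = estimate_via_N(**data(x_kkt))

x_inf = np.array([0.0, 1.0, 0.0])   # infeasible point, so alpha(x) > 0
lam_N, mu_N = estimate_via_N(**data(x_inf))
lam_ls, mu_ls = estimate_via_least_squares(**data(x_inf))
```

In this toy problem the point $(2, 0.5, 0.5)$ is a KKT point with multipliers $\lambda = 1.5$ and $\mu = 0$, so by item (c) the estimate should reproduce these values there. At the infeasible point the two routines should agree, since $N(x) = A(x)^T A(x)$.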

3.2 The penalty function

The construction of the penalty function derives from the idea given by Di Pillo and Grippo in [6, 7]. It consists of including the multipliers estimate $\lambda(x)$, $\mu(x)$, the solution of (3.2) (equivalently (3.6)), in the classical augmented Lagrangian function [11, 14, 15], given by

\[
L_c(x, \lambda, \mu) := f(x) + \langle \lambda, g(x) \rangle + \frac{c}{2}\|g(x)\|^2 - \frac{1}{2c}\sum_{i=1}^m \max\{0, -\lambda_i - c g_i(x)\}^2 + \langle \mu, h(x) \rangle + \frac{c}{2}\|h(x)\|^2,
\]

where $c > 0$ is the penalty parameter. Thus, the following is our candidate penalty function:

\[
w_c(x) := L_c(x, \lambda(x), \mu(x)). \tag{3.13}
\]

The gradient of $w_c$ at $x$ is as follows:

\[
\begin{aligned}
\nabla w_c(x) &= \nabla f(x) + Jg(x)^T\lambda(x) + \left(cJg(x)^T + J\lambda(x)^T\right)\left(g(x) + y_c(x)\right) \\
&\quad + Jh(x)^T\mu(x) + \left(cJh(x)^T + J\mu(x)^T\right)h(x) \\
&= \nabla_x L(x, \lambda(x), \mu(x)) + \left(cJg(x)^T + J\lambda(x)^T\right)\left(g(x) + y_c(x)\right) + \left(cJh(x)^T + J\mu(x)^T\right)h(x),
\end{aligned}
\tag{3.14}
\]

where

\[
y_c(x) := \max\left\{0, -\frac{\lambda(x)}{c} - g(x)\right\}.
\]

Although $\nabla w_c$ is not differentiable because it includes the maximum function, we can see that it is semismooth. Moreover, $\nabla w_c(x)$ has $J\lambda(x)$, $J\mu(x)$ in its formula, which contain the second-order terms $\nabla^2 f(x)$, $\nabla^2 g_i(x)$, $\nabla^2 h_i(x)$. If a second-order method, like the Newton method, is used, then we have to deal with third-order terms of the problem data, which should be avoided for numerical reasons.

4 Exactness results

In this section, we show some exactness results for $w_c$ defined in (3.13). We follow closely the results presented by Di Pillo and Grippo [6, 7], André and Silva [3], and Andreani, Fukuda and Silva [2].

4.1 Analysis of KKT points

First, we show that if a point satisfies the KKT conditions, it satisfies $-\nabla w_c(x) \in N_X(x)$ for an arbitrary parameter $c > 0$.

Proposition 4.1. Let $(x, \lambda, \mu)$ be a KKT triple associated with problem (1.1). Then, $-\nabla w_c(x) \in N_X(x)$ for all $c > 0$.

Proof. From (2.2), (2.4), (2.5), Proposition 3.2(c) and the fact that

\[
g(x) + y_c(x) = \max\left\{g(x), -\frac{\lambda(x)}{c}\right\},
\]

we obtain $g(x) + y_c(x) = 0$. From (2.3), we also have $h(x) = 0$. Therefore, we obtain

\[
\nabla w_c(x) = \nabla f(x) + Jg(x)^T\lambda + Jh(x)^T\mu = \nabla_x L(x, \lambda, \mu).
\]

But since (2.1) holds, we get $-\nabla w_c(x) \in N_X(x)$.

In order to prove the converse of the above result, we consider the function $\alpha(x)$ defined in (3.3). We recall that the function $\alpha$ is an infeasibility measure. Moreover, $\alpha(x) = 0$ is equivalent to $g(x) \le 0$ and $h(x) = 0$. Considering the problem

\[
\begin{array}{ll}
\min & \alpha(x) \\
\text{s.t.} & x \in X,
\end{array}
\tag{4.1}
\]

we see that it is a minimization of the infeasibility measure subject to the easy constraints. Furthermore, if

\[
-\nabla\alpha(x) = -Jg(x)^T\max\{0, g(x)\} - Jh(x)^T h(x) \in N_X(x)
\]

holds, then we call the point $x$ a stationary point of the infeasibility minimization problem (4.1). Now, we will show that the converse implication holds for large enough $c$, for instance under a boundedness assumption. As we will see, instead of a KKT point, we may find a stationary point of the minimization of the infeasibility that is infeasible for (1.1).

Proposition 4.2. Let $\{x^k\} \subset \mathbb{R}^n$ and $\{c_k\} \subset \mathbb{R}_{++}$ be sequences such that $c_k \to \infty$, $x^k \to \bar{x}$ and $-\nabla w_{c_k}(x^k) \in N_X(x^k)$ for all $k$. Then $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1).

Proof. By the definition of $\nabla w_{c_k}(x^k)$ in (3.14), we get

\[
-\nabla_x L\left(x^k, \lambda(x^k), \mu(x^k)\right) - \left(c_k Jg(x^k)^T + J\lambda(x^k)^T\right)\max\left\{g(x^k), -\lambda(x^k)/c_k\right\} - \left(c_k Jh(x^k)^T + J\mu(x^k)^T\right)h(x^k) \in N_X(x^k).
\]

Since $\lambda(\cdot)$, $\mu(\cdot)$ are continuous by LICQ, and $f$, $g$, $h$ are twice continuously differentiable, we can divide the above expression by $c_k$ and take the limit, so

\[
-\left(Jg(\bar{x})^T\max\{g(\bar{x}), 0\} + Jh(\bar{x})^T h(\bar{x})\right) \in N_X(\bar{x}),
\]

which means that $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1).
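The stationarity condition for (4.1) can be checked numerically when $X$ is the affine set (1.2). The sketch below relies on the standard fact that, for $x \in X$, the normal cone $N_X(x)$ is the range of $A^T$, so that $-\nabla\alpha(x) \in N_X(x)$ reduces to $P\nabla\alpha(x) = 0$ with $P$ from (3.5). The function names, the tolerance, and the example data are hypothetical and chosen only to exhibit a stationary point of (4.1) that is infeasible for (1.1).

```python
import numpy as np


def alpha(g, h, x):
    """Infeasibility measure (3.3)."""
    return 0.5 * (np.sum(np.maximum(g(x), 0.0) ** 2) + np.sum(h(x) ** 2))


def grad_alpha(g, Jg, h, Jh, x):
    """Gradient of alpha: Jg(x)^T max{0, g(x)} + Jh(x)^T h(x)."""
    return Jg(x).T @ np.maximum(g(x), 0.0) + Jh(x).T @ h(x)


def is_stationary_for_infeasibility(g, Jg, h, Jh, A, x, tol=1e-10):
    """Check -grad alpha(x) in N_X(x) for x in X = {x | Ax = b}.

    For an affine set, N_X(x) is the range of A^T, so the test reduces to
    P grad alpha(x) = 0 with P = I - A^T (A A^T)^{-1} A.
    """
    P = np.eye(x.size) - A.T @ np.linalg.inv(A @ A.T) @ A
    return bool(np.linalg.norm(P @ grad_alpha(g, Jg, h, Jh, x)) <= tol)


# Hypothetical data: X = {x | x2 = 0}, g(x) = x1 - 3, h(x) = x1^2 + 1 - x2.
# On X we have h(x) >= 1, so no point of X is feasible for g(x) <= 0, h(x) = 0.
A = np.array([[0.0, 1.0]])
g = lambda x: np.array([x[0] - 3.0])
Jg = lambda x: np.array([[1.0, 0.0]])
h = lambda x: np.array([x[0] ** 2 + 1.0 - x[1]])
Jh = lambda x: np.array([[2.0 * x[0], -1.0]])

x_bar = np.array([0.0, 0.0])     # minimizes alpha over X
x_other = np.array([1.0, 0.0])   # another point of X
stationary_bar = is_stationary_for_infeasibility(g, Jg, h, Jh, A, x_bar)
stationary_other = is_stationary_for_infeasibility(g, Jg, h, Jh, A, x_other)
alpha_bar = alpha(g, h, x_bar)
```

Here `x_bar` is stationary for (4.1) while `alpha_bar` is positive, which is exactly the kind of limit point that Proposition 4.2 cannot exclude.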

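Proposition 4.3. Let $\bar{x} \in \mathbb{R}^n$ be a feasible point of problem (1.1). Then there exist $\bar{c}, \bar{\delta} > 0$ such that if $\|x - \bar{x}\| \le \bar{\delta}$, $c \ge \bar{c}$ and $-\nabla w_c(x) \in N_X(x)$, then $(x, \lambda(x), \mu(x))$ satisfies the KKT conditions (2.1)-(2.5).

Proof. First, it is easy to prove that

\[
Y_c(x)\lambda(x) = -cY_c(x)\left(g(x) + y_c(x)\right), \tag{4.2}
\]

where $Y_c(x) := \operatorname{diag}((y_c)_1(x), \dots, (y_c)_m(x))$. From equations (3.11) and (3.12), we obtain

\[
Jg(x)P\nabla_x L(x, \lambda(x), \mu(x)) = -\left(\zeta_1^2 G(x)^2 + \zeta_2^2 \alpha(x) I_m\right)\lambda(x), \tag{4.3}
\]

\[
Jh(x)P\nabla_x L(x, \lambda(x), \mu(x)) = -\zeta_2^2 \alpha(x)\mu(x). \tag{4.4}
\]

Observe that (4.3) can be written as

\[
\begin{aligned}
Jg(x)P\nabla_x L(x, \lambda(x), \mu(x)) &= -\left(\zeta_1^2 G(x)^2 + \zeta_2^2 \alpha(x) I_m\right)\lambda(x) \\
&= -\zeta_1^2 G(x)\left(G(x) + Y_c(x)\right)\lambda(x) + \left(\zeta_1^2 G(x)Y_c(x) - \zeta_2^2 \alpha(x) I_m\right)\lambda(x) \\
&= -\zeta_1^2 G(x)\Lambda(x)\left(g(x) + y_c(x)\right) + \left(\zeta_1^2 G(x)Y_c(x) - \zeta_2^2 \alpha(x) I_m\right)\lambda(x).
\end{aligned}
\]

From (4.2), we get

\[
\frac{1}{c}Jg(x)P\nabla_x L(x, \lambda(x), \mu(x)) = -\zeta_1^2 G(x)\left(\frac{1}{c}\Lambda(x) + Y_c(x)\right)\left(g(x) + y_c(x)\right) - \frac{1}{c}\zeta_2^2 \alpha(x)\lambda(x).
\]

From the definition of $\nabla w_c(x)$ given in (3.14), we have

\[
\begin{aligned}
\frac{1}{c}Jg(x)P\nabla w_c(x) &= \frac{1}{c}Jg(x)P\nabla_x L(x, \lambda(x), \mu(x)) + Jg(x)P\left(Jg(x)^T + \frac{1}{c}J\lambda(x)^T\right)\left(g(x) + y_c(x)\right) \\
&\quad + Jg(x)P\left(Jh(x)^T + \frac{1}{c}J\mu(x)^T\right)h(x) \\
&= \left(Jg(x)P\left(Jg(x)^T + \frac{1}{c}J\lambda(x)^T\right) - \zeta_1^2 G(x)\left(\frac{1}{c}\Lambda(x) + Y_c(x)\right)\right)\left(g(x) + y_c(x)\right) \\
&\quad - \frac{1}{c}\zeta_2^2 \alpha(x)\lambda(x) + Jg(x)P\left(Jh(x)^T + \frac{1}{c}J\mu(x)^T\right)h(x).
\end{aligned}
\tag{4.5}
\]

Moreover, using equation (4.4), that is, $Jh(x)P\nabla_x L(x, \lambda(x), \mu(x)) = -\zeta_2^2 \alpha(x)\mu(x)$, we also obtain

\[
\frac{1}{c}Jh(x)P\nabla w_c(x) = -\frac{1}{c}\zeta_2^2 \alpha(x)\mu(x) + Jh(x)P\left(Jg(x)^T + \frac{1}{c}J\lambda(x)^T\right)\left(g(x) + y_c(x)\right) + Jh(x)P\left(Jh(x)^T + \frac{1}{c}J\mu(x)^T\right)h(x). \tag{4.6}
\]

From equations (4.5) and (4.6) we get

\[
\frac{1}{c}\begin{bmatrix} Jg(x) \\ Jh(x) \end{bmatrix} P\nabla w_c(x)
= K_c(x)\begin{bmatrix} g(x) + y_c(x) \\ h(x) \end{bmatrix}
- \frac{\zeta_2^2 \alpha(x)}{c}\begin{bmatrix} \lambda(x) \\ \mu(x) \end{bmatrix}, \tag{4.7}
\]

where

\[
K_c(x) := \begin{bmatrix} (K_c(x))_{11} & (K_c(x))_{12} \\ (K_c(x))_{21} & (K_c(x))_{22} \end{bmatrix}
\]

and

\[
\begin{aligned}
(K_c(x))_{11} &:= Jg(x)P\left(Jg(x)^T + \frac{1}{c}J\lambda(x)^T\right) - \zeta_1^2 G(x)\left(\frac{1}{c}\Lambda(x) + Y_c(x)\right), \\
(K_c(x))_{12} &:= Jg(x)P\left(Jh(x)^T + \frac{1}{c}J\mu(x)^T\right), \\
(K_c(x))_{21} &:= Jh(x)P\left(Jg(x)^T + \frac{1}{c}J\lambda(x)^T\right), \\
(K_c(x))_{22} &:= Jh(x)P\left(Jh(x)^T + \frac{1}{c}J\mu(x)^T\right).
\end{aligned}
\]

Now, denoting by $\sigma_{m+p}(K_c(x))$ the smallest singular value of $K_c(x)$, we have

\[
\left\|K_c(x)\begin{bmatrix} g(x) + y_c(x) \\ h(x) \end{bmatrix}\right\|^2
\ge \sigma_{m+p}(K_c(x))^2 \left\|\begin{bmatrix} g(x) + y_c(x) \\ h(x) \end{bmatrix}\right\|^2
= \sigma_{m+p}(K_c(x))^2\left(\|g(x) + y_c(x)\|^2 + \|h(x)\|^2\right).
\]

Furthermore, from the definitions of $\alpha(x)$ and $y_c(x)$, we obtain

\[
\alpha(x) = \frac{1}{2}\left(\|\max\{g(x), 0\}\|^2 + \|h(x)\|^2\right) \le \frac{1}{2}\left(\|g(x) + y_c(x)\|^2 + \|h(x)\|^2\right). \tag{4.8}
\]

Thus, considering the square of the norm in (4.7) and using the basic inequality

\[
\|u - v\|^2 \ge \frac{1}{2}\|u\|^2 - \|v\|^2 \quad \text{for all } u, v,
\]

we have

\[
\begin{aligned}
\frac{1}{c^2}\left\|\begin{bmatrix} Jg(x) \\ Jh(x) \end{bmatrix} P\nabla w_c(x)\right\|^2
&\ge \frac{1}{2}\left\|K_c(x)\begin{bmatrix} g(x) + y_c(x) \\ h(x) \end{bmatrix}\right\|^2 - \frac{\zeta_2^4}{c^2}\alpha(x)^2\left(\|\lambda(x)\|^2 + \|\mu(x)\|^2\right) \\
&\ge \left(\frac{1}{2}\sigma_{m+p}(K_c(x))^2 - \frac{\zeta_2^4}{2c^2}\alpha(x)\left(\|\lambda(x)\|^2 + \|\mu(x)\|^2\right)\right)\left(\|g(x) + y_c(x)\|^2 + \|h(x)\|^2\right).
\end{aligned}
\]

Because $\bar{x}$ is a feasible point, we have $y_c(\bar{x}) \to -g(\bar{x})$ as $c \to \infty$ and we get $K_c(\bar{x}) \to N(\bar{x})$. Recalling that $N(\bar{x})$ is nonsingular, by continuity there exist $\bar{c}$ and $\bar{\delta}$ such that if $\|x - \bar{x}\| \le \bar{\delta}$ and $c \ge \bar{c}$, then $K_c(x)$ is also nonsingular. The nonsingularity of $K_c(x)$ implies the existence of $\bar{\delta}, \bar{c}, \rho > 0$ such that, for any $x \in \mathbb{R}^n$ with $\|x - \bar{x}\| \le \bar{\delta}$ and $c \ge \bar{c}$,

\[
\frac{1}{2}\sigma_{m+p}(K_c(x))^2 - \frac{\zeta_2^4}{2c^2}\alpha(x)\left(\|\lambda(x)\|^2 + \|\mu(x)\|^2\right) \ge \rho > 0.
\]

Thus, we obtain

\[
\frac{1}{c^2}\left\|\begin{bmatrix} Jg(x) \\ Jh(x) \end{bmatrix} P\nabla w_c(x)\right\|^2 \ge \rho\left(\|g(x) + y_c(x)\|^2 + \|h(x)\|^2\right).
\]

Now, we take any $x$ and $c$ such that $-\nabla w_c(x) \in N_X(x)$, $\|x - \bar{x}\| \le \bar{\delta}$, and $c \ge \bar{c}$. Recalling that $-\nabla w_c(x) \in N_X(x)$ is equivalent to $P\nabla w_c(x) = q(x)$ and observing that $P^2 = P$ implies $P\nabla w_c(x) = Pq(x) = 0$, the left-hand side of the above expression is zero. Hence, we get $g(x) + y_c(x) = 0$ and $h(x) = 0$. Next, from the definition of $\nabla w_c(x)$, we obtain $-\nabla_x L(x, \lambda(x), \mu(x)) \in N_X(x)$. Moreover, $g(x) + y_c(x) = 0$ implies $g(x) \le 0$, $\lambda(x) \ge 0$ and $\langle g(x), \lambda(x) \rangle = 0$, and we conclude that $(x, \lambda(x), \mu(x))$ is a KKT triple.

From these two results, we can prove the following theorem.

Theorem 4.1. Let $\{x^k\} \subset \mathbb{R}^n$ and $\{c_k\} \subset \mathbb{R}_{++}$ be sequences such that $c_k \to \infty$ and $-\nabla w_{c_k}(x^k) \in N_X(x^k)$ for all $k$. Also, consider a subsequence $\{x^{k_j}\}$ of $\{x^k\}$ such that $x^{k_j} \to \bar{x}$ for some $\bar{x} \in \mathbb{R}^n$. Then, either there exists $K$ such that $(x^{k_j}, \lambda(x^{k_j}), \mu(x^{k_j}))$ is a KKT triple associated with (1.1) for all $k_j > K$, or $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1) that is infeasible for (1.1).

Proof. From Proposition 4.2, $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1). If $\bar{x}$ is a feasible point, from Proposition 4.3, there exists $K$ such that $(x^{k_j}, \lambda(x^{k_j}), \mu(x^{k_j}))$ is a KKT triple for all $k_j > K$.

Note that in the above theorem, we assume that a converging subsequence of $\{x^k\}$ exists. This happens, for example, under a boundedness assumption. Furthermore, if we assume that all stationary points of the minimization of the infeasibility (4.1) are feasible, we can show the following corollary, which is similar to Theorem 4.1. Such a feasibility property holds, for example, when $g_i$, $i = 1, \dots, m$, are convex and $h_i$, $i = 1, \dots, p$, are affine.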

Corollary 4.1. Assume that there exists $\bar{c} > 0$ such that the set

\[
Z := \{x \in \mathbb{R}^n \mid -\nabla w_c(x) \in N_X(x), \ c > \bar{c}\}
\]

is bounded. Assume also that all stationary points of the minimization of the infeasibility (4.1) are feasible for problem (1.1). Then, there exists $\tilde{c} > 0$ such that if $-\nabla w_c(x) \in N_X(x)$ and $c > \tilde{c}$, then $(x, \lambda(x), \mu(x))$ is a KKT triple associated with problem (1.1).

Proof. Suppose that there is no such $\tilde{c}$. Then, there exist two sequences $\{x^k\} \subset \mathbb{R}^n$ and $\{c_k\} \subset \mathbb{R}_{++}$ with $-\nabla w_{c_k}(x^k) \in N_X(x^k)$ and $c_k \to \infty$ such that $(x^k, \lambda(x^k), \mu(x^k))$ is not a KKT triple. But for $c_k > \bar{c}$, we have $x^k \in Z$, which is bounded. It means that there exists a convergent subsequence $\{x^{k_j}\}$ of $\{x^k\}$. This is not possible by Theorem 4.1, because there is no stationary point of the minimization of the infeasibility (4.1) that is infeasible for (1.1).

4.2 Optimality results

Here, we will prove that $w_c$ is in fact an exact penalty function. First, we define $G_f$ and $L_f$ as the sets of global and local optimizers, respectively, of problem (1.1). Considering problem (3.1), we also define $G_w(c)$ and $L_w(c)$ as the sets of global and local minimizers, respectively, of (3.1). The definition of a (weakly) exact penalty function is as follows.

Definition 4.1. The function $w_c$ is a weakly exact penalty function if there exists $\bar{c} > 0$ such that for all $c \ge \bar{c}$, $G_w(c) = G_f$. Also, the function $w_c$ is an exact penalty function if there exists $\bar{c} > 0$ such that for all $c \ge \bar{c}$, $G_w(c) = G_f$ and $L_w(c) \subseteq L_f$.

First, we will show that $w_c$ is a weakly exact penalty function, by showing the equivalence of the sets of global minimizers. The following two lemmas will be useful for such a proof.

Lemma 4.1. The function $w_c$ defined in (3.13) can be written at $x \in \mathbb{R}^n$ as

\[
w_c(x) = f(x) + \langle \lambda(x), g(x) + y_c(x) \rangle + \frac{c}{2}\|g(x) + y_c(x)\|^2 + \langle \mu(x), h(x) \rangle + \frac{c}{2}\|h(x)\|^2.
\]

Proof. It follows from [2, Lemma 4.1].

Lemma 4.2. Let $(x, \lambda, \mu)$ be a KKT triple associated with problem (1.1). Then, $w_c(x) = f(x)$ for all $c > 0$.

Proof. It follows from [2, Lemma 4.2].

Proposition 4.4. Let $\{x^k\} \subset \mathbb{R}^n$ and $\{c_k\} \subset \mathbb{R}_{++}$ be sequences such that $\{x^k\}$ is bounded, $c_k \to \infty$ and $x^k \in G_w(c_k)$ for all $k$. If $G_f \neq \emptyset$ holds, then there exists $K$ such that $x^k \in G_f$ for all $k > K$.

Proof. Suppose that for all $K$, there exists $k > K$ such that $x^k \notin G_f$. First, let $\hat{x} \in G_f$, which exists because $G_f \neq \emptyset$. Since $\hat{x}$ satisfies LICQ and is therefore a KKT point, from Lemma 4.2 we have

\[
w_{c_k}(x^k) \le w_{c_k}(\hat{x}) = f(\hat{x}) \tag{4.9}
\]

for all $k$. Since $\{x^k\} \subset X$ is bounded, there exists a subsequence of $\{x^k\}$ converging to some $\bar{x} \in X$. Without loss of generality, we can write $\lim_{k \to \infty} x^k = \bar{x}$. So, taking the limit superior on both sides of (4.9), we obtain

\[
\limsup_{k \to \infty} w_{c_k}(x^k) \le f(\hat{x}). \tag{4.10}
\]

Now, from Lemma 4.1, $w_{c_k}$ can be written as

\[
w_{c_k}(x^k) = f(x^k) + \langle \lambda(x^k), g(x^k) + y_{c_k}(x^k) \rangle + \frac{c_k}{2}\|g(x^k) + y_{c_k}(x^k)\|^2 + \langle \mu(x^k), h(x^k) \rangle + \frac{c_k}{2}\|h(x^k)\|^2.
\]

Thus, by the continuity of the functions involved, inequality (4.10) implies that $h(\bar{x}) = 0$ and $g(\bar{x}) + \max\{0, -g(\bar{x})\} = 0$, which implies $g(\bar{x}) \le 0$. Moreover, it is easy to show that $f(\bar{x}) \le \limsup_{k \to \infty} w_{c_k}(x^k)$. Therefore, $f(\bar{x}) \le f(\hat{x})$, that is, $\bar{x} \in G_f$.

Since $\bar{x}$ is feasible and satisfies LICQ, there exist $\bar{c}$ and $\bar{\delta}$ as in Proposition 4.3. Let $K$ be sufficiently large such that $\|x^k - \bar{x}\| \le \bar{\delta}$, $c_k \ge \bar{c}$ and $x^k \in G_w(c_k)$ for all $k > K$. Since $x^k \in G_w(c_k)$ implies $-\nabla w_{c_k}(x^k) \in N_X(x^k)$, the same proposition ensures that $x^k$ is a KKT point. It means that $x^k$ is feasible for all $k > K$. Furthermore, Lemma 4.2 and inequality (4.9) yield

\[
f(x^k) = w_{c_k}(x^k) \le f(\hat{x}) \tag{4.11}
\]

for all $k > K$. We conclude that for such $K$, $x^k \in G_f$ for all $k > K$, which is a contradiction.

Proposition 4.5. Assume that $G_f \neq \emptyset$ holds. Then, $G_w(c) \subseteq G_f$ implies that $G_w(c) = G_f$ for all $c > 0$.

Proof. It follows from [2, Proposition 4.5].

Theorem 4.2. If there exists $\bar{c} > 0$ such that $\bigcup_{c \ge \bar{c}} G_w(c)$ is bounded and $G_f \neq \emptyset$ holds, then $w_c$ is a weakly exact penalty function for problem (1.1).

Proof. It follows from [2, Theorem 4.2].

Now, we will show that $w_c$ is in fact an exact penalty function, by relating the sets of local minimizers. We recall that such a proof is important because optimization solvers, in general, search for local solutions instead of global ones. Before presenting the results for local minimizers, we state an additional lemma, which shows that the value of $w_c$ at a feasible point is not greater than the value of the objective function.

Lemma 4.3. Let $x \in \mathbb{R}^n$ be a feasible point for (1.1). Then, $w_c(x) \le f(x)$ for all $c > 0$.

Proof. It follows from [2, Lemma 4.3].
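Lemmas 4.1 and 4.3 rest on algebraic facts that hold for any multiplier vectors, so they can be sanity-checked on random data without computing $\lambda(x)$ and $\mu(x)$. The following sketch compares the penalty terms $L_c - f$ in the classical form of Section 3.2 with the form of Lemma 4.1, and checks that these terms are nonpositive when $g \le 0$ and $h = 0$. The function names, sample sizes, and ranges are assumptions made for illustration only. A numerical check of this kind is not a substitute for the proofs in [2].

```python
import numpy as np


def penalty_terms_classical(lam, mu, g, h, c):
    """L_c(x, lam, mu) - f(x) in the classical augmented Lagrangian form."""
    return (lam @ g + 0.5 * c * (g @ g)
            - np.sum(np.maximum(0.0, -lam - c * g) ** 2) / (2.0 * c)
            + mu @ h + 0.5 * c * (h @ h))


def penalty_terms_lemma41(lam, mu, g, h, c):
    """The same quantity written as in Lemma 4.1, with y_c = max{0, -lam/c - g}."""
    a = g + np.maximum(0.0, -lam / c - g)   # a = g + y_c = max{g, -lam/c}
    return lam @ a + 0.5 * c * (a @ a) + mu @ h + 0.5 * c * (h @ h)


rng = np.random.default_rng(0)
max_gap = 0.0                  # largest difference between the two forms
max_feasible_value = -np.inf   # largest value of w_c - f over feasible samples
for _ in range(1000):
    lam, g = rng.normal(size=4), rng.normal(size=4)
    mu, h = rng.normal(size=3), rng.normal(size=3)
    c = rng.uniform(0.1, 10.0)
    gap = abs(penalty_terms_classical(lam, mu, g, h, c)
              - penalty_terms_lemma41(lam, mu, g, h, c))
    max_gap = max(max_gap, gap)
    # Feasible sample: g <= 0 and h = 0, with arbitrary multipliers.
    value = penalty_terms_lemma41(lam, mu, -np.abs(g), np.zeros(3), c)
    max_feasible_value = max(max_feasible_value, value)
```

Up to rounding, `max_gap` should vanish (Lemma 4.1) and `max_feasible_value` should not exceed zero (Lemma 4.3).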

19 Theorem 4.3. Let {x k } R n and {c k } R ++ be sequences such that c k and x k L w (c k ) for all k. Let {x k j } be a subsequence of {x k } such that x k j x. If G f holds, then either there exists K such that x k j L f for all k j > K, or x is a stationary point of the minimization of the infeasibility (4.1) that is infeasible for (1.1). Proof. Since x k j L w (c k j ) implies w ck j (x k j ) N X (x k j ) for all k j, from Theorem 4.1 there is K such that x k j is KKT for all k j > K or x is a stationary point of the minimization of the infeasibility (4.1) that is infeasible. Considering the first case and fixing k j > K, from Lemma 4.2 there exists a neighborhood V (x k j ) of x k j such that f (x k j ) = w ck j (x k j ) w ck j (x) for all x V (x k j ) X. Note that the above statement is also true for all x V (x k j ) X {x g(x) 0, h(x) = 0}. Finally, from Lemma 4.3 we conclude that f (x k j ) w ck j (x) f (x) for all x V (x k j ) that is feasible for (1.1). This means that x k j L f for all k j > K, which completes the proof. Corollary 4.2. Suppose that there exists c > 0 such that c c L w (c) is bounded. Consider also that G f holds and, all stationary points of the minimization of the infeasibility (4.1) are feasible for problem (1.1). Then there exists c > 0 such that if x L w (c) and c > c then, x L f. Proof. It follows from [2, Corollary 4.4]. 5 Algorithm In the previous section, we proved that the function w c is in fact a penalty function in the meaning of Definition 4.1. It means that we are able to solve the original problem (1.1), by solving the problem (3.1) that contains only easy constraints. 5.1 Updating the penalty parameter As we noted previously, it is important to choose a good penalty parameter. In this work, we extend the dynamical update of parameter proposed by Glad and Polak [10]. The basic idea is to use a test function that measures the risk of computing a point x satisfying w c (x) N X (x) that is not a KKT point. 
First, we define the following function:
$$a_c(x) := g(x) + y_c(x) = \max\left\{ g(x), -\frac{\lambda(x)}{c} \right\},$$
where the maximum is taken componentwise. Note that $a_c(x) = 0$ for all $c > 0$ is equivalent to $\langle g(x), \lambda(x) \rangle = 0$, $g(x) \le 0$, and $\lambda(x) \ge 0$. Also observe that if $(x, \lambda(x), \mu(x))$ is a KKT triple, then $-\nabla w_c(x) \in N_X(x)$. Finally, we define a test function given by
$$t_c(x) := -\left\| P_X(-\nabla w_c(x) + x) - x \right\|^2 + \frac{1}{c^\gamma} \left( \|a_c(x)\|^2 + \|h(x)\|^2 \right),$$
where $\gamma > 0$. The function $t_c$ is continuous because all the functions involved in its definition are continuous. In the next proposition, we show that $t_c$ is indeed a test function.

Proposition 5.1. The following statements are equivalent:
(a) $(x, \lambda(x), \mu(x))$ is a KKT triple for (1.1);
(b) $-\nabla w_c(x) \in N_X(x)$, $a_c(x) = 0$, and $h(x) = 0$;
(c) $-\nabla w_c(x) \in N_X(x)$ and $t_c(x) \le 0$.

Proof. (a) $\Rightarrow$ (b): From (2.2), (2.4), and (2.5), we have $a_c(x) = 0$. Also from (2.1) and (2.3), we conclude that $-\nabla w_c(x) \in N_X(x)$.
(b) $\Rightarrow$ (a): It holds trivially.
(b) $\Rightarrow$ (c): We just have to show that $t_c(x) \le 0$. Since $a_c(x) = 0$ and $h(x) = 0$, we have $t_c(x) = -\| P_X(-\nabla w_c(x) + x) - x \|^2 \le 0$.
(c) $\Rightarrow$ (b): First, recall that $-\nabla w_c(x) \in N_X(x)$ is equivalent to $P_X(-\nabla w_c(x) + x) = x$. Thus, $t_c(x) \le 0$ implies
$$t_c(x) = \frac{1}{c^\gamma} \left( \|a_c(x)\|^2 + \|h(x)\|^2 \right) \le 0,$$
which means that $a_c(x) = 0$ and $h(x) = 0$.

In the next result, we show that either $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1) that is infeasible for (1.1), or there exists $\bar{c}$ large enough such that $t_c(x) \le 0$ for all $c \ge \bar{c}$ and all $x$ in a neighborhood of $\bar{x}$. From Proposition 5.1, we observe that the latter case gives us a way to update the penalty parameter $c$. More precisely, each time we compute $x$ satisfying $-\nabla w_c(x) \in N_X(x)$, we increase the value of $c$ if $t_c(x)$ is greater than zero.

Lemma 5.1. Let $S \subset \mathbb{R}^n$ be a compact set that contains no KKT points. Then, either there exist $\bar{c}, \bar{\epsilon} > 0$ such that $\| P_X(-\nabla w_c(x) + x) - x \| \ge \bar{\epsilon}$ for all $x \in S$ and all $c \ge \bar{c}$; or there exist $\{x^k\} \subset S$, $\{c_k\} \subset \mathbb{R}_{++}$ such that $c_k \to \infty$, $\| P_X(-\nabla w_{c_k}(x^k) + x^k) - x^k \| \to 0$ and $\{x^k\}$ converges to a stationary point of the minimization of the infeasibility (4.1) that is infeasible for (1.1).

Proof. If the first condition does not hold, there exist two sequences $\{x^k\} \subset S$, $\{c_k\} \subset \mathbb{R}_{++}$ such that $x^k \to \bar{x} \in S$, $c_k \to \infty$ and $\| P_X(-\nabla w_{c_k}(x^k) + x^k) - x^k \| \to 0$.
Recalling the definition of $\nabla w_{c_k}(x^k)$ and $P_X$, we have, from the continuity of the involved functions,
$$\begin{aligned} &-P\big( \nabla f(x^k) + Jg(x^k)^T \lambda(x^k) + Jh(x^k)^T \mu(x^k) + J\lambda(x^k)^T \max\{g(x^k), -\lambda(x^k)/c_k\} + J\mu(x^k)^T h(x^k) \big) \\ &\quad - P\big( c_k Jg(x^k)^T \max\{g(x^k), -\lambda(x^k)/c_k\} + c_k Jh(x^k)^T h(x^k) \big) + q(x^k) \to 0. \end{aligned} \tag{5.1}$$
Now, if $c_k \to \infty$, we get
$$-P\big( Jg(\bar{x})^T \max\{g(\bar{x}), 0\} + Jh(\bar{x})^T h(\bar{x}) \big) + q(\bar{x}) = 0,$$

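which is equivalent to
$$-\big( Jg(\bar{x})^T \max\{g(\bar{x}), 0\} + Jh(\bar{x})^T h(\bar{x}) \big) \in N_X(\bar{x}).$$
Hence, $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1). Next, we assume that $\bar{x}$ is feasible. First, we define $\lambda^k$ and $\mu^k$ as follows:
$$\lambda^k := \lambda(x^k) + c_k \max\{g(x^k), -\lambda(x^k)/c_k\} = \max\{\lambda(x^k) + c_k g(x^k), 0\}, \qquad \mu^k := \mu(x^k) + c_k h(x^k).$$
From (5.1), we obtain
$$P\big( \nabla f(x^k) + Jg(x^k)^T \lambda^k + Jh(x^k)^T \mu^k + J\lambda(x^k)^T \max\{g(x^k), -\lambda(x^k)/c_k\} + J\mu(x^k)^T h(x^k) \big) - q(x^k) \to 0.$$
By continuity, we get $\lambda^k \to \bar{\lambda} \ge 0$, $\mu^k \to \bar{\mu}$ and
$$P\big( \nabla f(\bar{x}) + Jg(\bar{x})^T \bar{\lambda} + Jh(\bar{x})^T \bar{\mu} + J\lambda(\bar{x})^T \max\{g(\bar{x}), 0\} + J\mu(\bar{x})^T h(\bar{x}) \big) = q(\bar{x}).$$
From the definition of $\lambda^k$, if $g_i(\bar{x}) < 0$, then $\bar{\lambda}_i = 0$. Moreover, $h(\bar{x}) = \max\{g(\bar{x}), 0\} = 0$ because $\bar{x}$ is feasible. This shows that $P_X\big( -\nabla_x L(\bar{x}, \bar{\lambda}, \bar{\mu}) + \bar{x} \big) = \bar{x}$. Therefore, from (2.6), $(\bar{x}, \bar{\lambda}, \bar{\mu})$ is a KKT triple, which is a contradiction because $\bar{x} \in S$ and $S$ has no KKT points.

Proposition 5.2. For all $\bar{x} \in \mathbb{R}^n$, either $\bar{x}$ is a stationary point of the minimization of the infeasibility (4.1), or there exist $\bar{c}, \bar{\delta} > 0$ such that if $c \ge \bar{c}$ and $\|x - \bar{x}\| \le \bar{\delta}$, then $t_c(x) \le 0$.

Proof. Consider that the second condition does not hold, that is, there are sequences $\{x^k\} \subset \mathbb{R}^n$ and $\{c_k\} \subset \mathbb{R}_{++}$ such that $x^k \to \bar{x}$, $c_k \to \infty$ and $t_{c_k}(x^k) > 0$. Note that in this case, $x^k$ is not a KKT point for all $k$. We consider two cases.

1. Assume that $\bar{x}$ is not a KKT point. From Lemma 5.1 with $S := \{x^k\} \cup \{\bar{x}\}$, we obtain that either $\bar{x}$ is an infeasible stationary point of the minimization of the infeasibility (4.1), or
$$t_{c_k}(x^k) \le -\bar{\epsilon}^2 + \frac{1}{c_k^\gamma} \left( \|a_{c_k}(x^k)\|^2 + \|h(x^k)\|^2 \right)$$
for all $k$ large enough. Since $c_k^\gamma \to \infty$, we have a contradiction because $t_{c_k}(x^k) > 0$.

2. Assume now that $\bar{x}$ is a KKT point. From equation (4.7) and the fact that $P \nabla w_{c_k}(x^k) = P\big( P \nabla w_{c_k}(x^k) - q(x^k) \big)$, we obtain
$$K_{c_k}(x^k) \begin{bmatrix} a_{c_k}(x^k) \\ h(x^k) \end{bmatrix} = \frac{1}{c_k} \begin{bmatrix} Jg(x^k) \\ Jh(x^k) \end{bmatrix} P \big[ P \nabla w_{c_k}(x^k) - q(x^k) \big] + \frac{\zeta_2^2 \alpha(x^k)}{c_k} \begin{bmatrix} \lambda(x^k) \\ \mu(x^k) \end{bmatrix}$$
for all $k$. Note that $\|P\| = 1$, $K_{c_k}(x^k)$ converges to a nonsingular matrix $N(\bar{x})$, and $Jg(x^k)$, $Jh(x^k)$, $\lambda(x^k)$, $\mu(x^k)$ converge to $Jg(\bar{x})$, $Jh(\bar{x})$, $\lambda(\bar{x})$, $\mu(\bar{x})$, respectively. Therefore, for sufficiently large $k$, we have
$$\left\| \begin{bmatrix} a_{c_k}(x^k) \\ h(x^k) \end{bmatrix} \right\| \le \frac{1}{c_k} \left\| N^{-1}(\bar{x}) \right\| \left( \left\| \begin{bmatrix} Jg(\bar{x}) \\ Jh(\bar{x}) \end{bmatrix} \right\| \left\| P \nabla w_{c_k}(x^k) - q(x^k) \right\| + \zeta_2^2 \alpha(\bar{x}) \left\| \begin{bmatrix} \lambda(\bar{x}) \\ \mu(\bar{x}) \end{bmatrix} \right\| \right).$$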

From (4.8) and observing that $\max\{g(\bar{x}), 0\} = 0$ and $h(\bar{x}) = 0$, we get
$$\left\| \begin{bmatrix} a_{c_k}(x^k) \\ h(x^k) \end{bmatrix} \right\| \le \frac{1}{c_k} \left\| N^{-1}(\bar{x}) \right\| \left\| \begin{bmatrix} Jg(\bar{x}) \\ Jh(\bar{x}) \end{bmatrix} \right\| \left\| P \nabla w_{c_k}(x^k) - q(x^k) \right\|.$$
Squaring both sides of the above inequality gives
$$\|a_{c_k}(x^k)\|^2 + \|h(x^k)\|^2 \le \frac{1}{c_k^2} \left\| N^{-1}(\bar{x}) \right\|^2 \left( \|Jg(\bar{x})\|^2 + \|Jh(\bar{x})\|^2 \right) \left\| P \nabla w_{c_k}(x^k) - q(x^k) \right\|^2.$$
Therefore,
$$\begin{aligned} t_{c_k}(x^k) &= -\left\| P \nabla w_{c_k}(x^k) - q(x^k) \right\|^2 + \frac{1}{c_k^\gamma} \left( \|a_{c_k}(x^k)\|^2 + \|h(x^k)\|^2 \right) \\ &\le \left( \frac{1}{c_k^{\gamma+2}} \left\| N^{-1}(\bar{x}) \right\|^2 \left( \|Jg(\bar{x})\|^2 + \|Jh(\bar{x})\|^2 \right) - 1 \right) \left\| P \nabla w_{c_k}(x^k) - q(x^k) \right\|^2, \end{aligned}$$
which is not positive as $c_k^{\gamma+2} \to \infty$, giving again a contradiction.

Based on the above results, we construct a framework of an algorithm to update the penalty parameter. We also show a theorem associated with it.

Algorithm 5.1. Dynamical update of the penalty parameter.
Step 1. Let $A(x, c)$ be an algorithm that computes a point $x$ satisfying $-\nabla w_c(x) \in N_X(x)$. Initialize $x^0 \in \mathbb{R}^n$, $c_0 > 0$, $\xi > 1$ and $\gamma > 0$. Set $k = 0$.
Step 2. If $x^k$ is a KKT point of the problem, stop.
Step 3. While $t_{c_k}(x^k) > 0$, do $c_k = \xi c_k$.
Step 4. Compute $x^{k+1} = A(x^k, c_k)$, set $k = k + 1$ and go to Step 2.

Theorem 5.1. Let $\{x^k\} \subset \mathbb{R}^n$ be a sequence computed by Algorithm 5.1. If $\{x^k\}$ is bounded and infinite, then each of its accumulation points is either a KKT point or a stationary point of the minimization of the infeasibility (4.1) that is infeasible for (1.1).

Proof. Suppose that $\bar{x}$ is an accumulation point of $\{x^k\}$. Then, by Proposition 5.2, if $\bar{x}$ is not a stationary point of the minimization of the infeasibility (4.1) that is infeasible, then $t_{c_k}(x^k) \le 0$ for all $k$ large enough. Let $\hat{c}$ be the largest value of $c_k$ that was computed. Since $\bar{x}$ is a feasible accumulation point of the algorithm $A(x, c)$, we have $-\nabla w_{\hat{c}}(\bar{x}) \in N_X(\bar{x})$ and, in particular, $\| P_X(-\nabla w_{\hat{c}}(\bar{x}) + \bar{x}) - \bar{x} \| = 0$. From the continuity of $t_c$, we also obtain $t_{\hat{c}}(\bar{x}) \le 0$ and so we conclude that $\bar{x}$ is a KKT point.
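The penalty update in Step 3 of Algorithm 5.1 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation used in the thesis: the test function is evaluated as $t_c = -\|r\|^2 + (\|a_c\|^2 + \|h\|^2)/c^\gamma$, the sign convention implied by the proof of Proposition 5.1, where $r$ denotes the projected-gradient residual $P_X(-\nabla w_c(x) + x) - x$. The names `t_value`, `update_penalty`, `residual_at` and the cap `c_cap` are hypothetical, and the toy data keep the residual fixed, whereas in the thesis it depends on $c$ through $\nabla w_c$.

```python
def sq_norm(v):
    """Squared Euclidean norm of a list of floats."""
    return sum(vi * vi for vi in v)


def t_value(residual, g, lam, h, c, gamma=2.0):
    """Evaluate t_c = -||r||^2 + (||a_c||^2 + ||h||^2) / c**gamma.

    residual stands for r = P_X(-grad w_c(x) + x) - x, and
    a_c = max{g, -lam / c} is taken componentwise.
    """
    a = [max(gi, -li / c) for gi, li in zip(g, lam)]
    return -sq_norm(residual) + (sq_norm(a) + sq_norm(h)) / c ** gamma


def update_penalty(c, residual_at, g, lam, h, xi=10.0, gamma=2.0, c_cap=1e12):
    """Step 3 of Algorithm 5.1: multiply c by xi while t_c(x) > 0.

    residual_at(c) returns the projected-gradient residual for penalty c.
    c_cap is a safeguard added for this sketch only (an assumption).
    """
    while c < c_cap and t_value(residual_at(c), g, lam, h, c, gamma) > 0:
        c *= xi
    return c


# Toy data (hypothetical): one violated inequality and one violated equality,
# with the residual held fixed at 0.1 for simplicity.
c_new = update_penalty(1.0, lambda c: [0.1], g=[0.5], lam=[0.0], h=[0.2])
print(c_new)  # prints 10.0
```

With these toy values, $t_1 = -0.01 + 0.29 > 0$, so the parameter is increased once, and $t_{10} = -0.01 + 0.0029 \le 0$ stops the loop.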

5.2 Spectral projected gradient method

In this section, we present an algorithm to solve problem (3.1). Since it is not an unconstrained problem, the algorithm proposed in [2] cannot be used here. Thus, instead of a Newton-type method, we propose to use the spectral projected gradient (SPG) method [4] to solve the easy constrained problem (3.1). The SPG method uses the orthogonal projection onto the set $X$. Computing an orthogonal projection is difficult in general. Here, however, such a computation is simple, because $X$ is the set of easy constraints. In particular, if $X$ is defined as in (1.2), the projection is given by (3.4). The main algorithm is as follows.

Algorithm 5.2. The spectral projected gradient method with the exact penalty function.
Step 1. Choose $x^0 \in \mathbb{R}^n$, $c_0 > 0$, $\beta_0 \in [\beta_{\min}, \beta_{\max}]$, $\xi > 1$, $\epsilon \ge 0$ and $\sigma \in (0, 1/2)$. Set $k = 0$.
Step 2. If $\| P_X( x^k - \nabla w_{c_k}(x^k) ) - x^k \| \le \epsilon$, stop.
Step 3. While $t_{c_k}(x^k) > 0$, do $c_k = \xi c_k$.
Step 4. Compute $d^k = P_X( x^k - \beta_k \nabla w_{c_k}(x^k) ) - x^k$.
Step 5. Find $t_k > 0$ such that $w_{c_k}(x^k + t_k d^k) \le w_{c_k}(x^k) + \sigma t_k \langle \nabla w_{c_k}(x^k), d^k \rangle$ with a backtracking strategy.
Step 6. Set $x^{k+1} = x^k + t_k d^k$.
Step 7. Let $s^k = x^{k+1} - x^k$ and $y^k = \nabla w_{c_k}(x^{k+1}) - \nabla w_{c_k}(x^k)$. If $\langle s^k, y^k \rangle \le 0$, then set $\beta_{k+1} = \beta_{\max}$. Otherwise, set $\beta_{k+1} = \max\left\{ \beta_{\min}, \min\left\{ \|s^k\|^2 / \langle s^k, y^k \rangle, \beta_{\max} \right\} \right\}$.
Step 8. Set $k = k + 1$ and go to Step 2.

In Step 1, the parameter $\beta_0 \in [\beta_{\min}, \beta_{\max}]$ is arbitrary, but one possible choice, given in [4], is
$$\beta_0 := \max\left\{ \beta_{\min}, \min\left\{ \frac{1}{\| P_X( x^0 - \nabla w_{c_0}(x^0) ) - x^0 \|}, \beta_{\max} \right\} \right\},$$
assuming that $P_X( x^0 - \nabla w_{c_0}(x^0) ) \neq x^0$. Another possibility, given in [5], is
$$\beta_0 := \begin{cases} \max\left\{ \beta_{\min}, \min\left\{ \|\bar{s}\|^2 / \langle \bar{s}, \bar{y} \rangle, \beta_{\max} \right\} \right\}, & \text{if } \langle \bar{s}, \bar{y} \rangle > 0, \\ \beta_{\max}, & \text{otherwise,} \end{cases}$$
where $\bar{s} := \bar{x} - x^0$, $\bar{y} := \nabla w_{c_0}(\bar{x}) - \nabla w_{c_0}(x^0)$, with $\bar{x} := x^0 - \max\{ \epsilon_{\mathrm{rel}} \|x^0\|, \epsilon_{\mathrm{abs}} \} \nabla w_{c_0}(x^0)$, and $\epsilon_{\mathrm{rel}}$, $\epsilon_{\mathrm{abs}}$ are small relative and absolute tolerances, respectively, related to the machine precision.
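The safeguarded spectral steplength of Step 7 and the initial choice attributed to [5] can be sketched in Python as follows. This is a minimal sketch, assuming plain lists of floats and a user-supplied gradient callable `grad` that stands in for $\nabla w_{c_0}$; the function names and the default values of `eps_rel` and `eps_abs` are hypothetical placeholders, not values taken from the thesis.

```python
import math


def dot(u, v):
    """Euclidean inner product of two lists of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))


def spectral_step(s, y, beta_min=1e-30, beta_max=1e30):
    """Safeguarded spectral steplength: ||s||^2 / <s, y> clipped to
    [beta_min, beta_max], or beta_max when <s, y> <= 0."""
    sy = dot(s, y)
    if sy <= 0.0:
        return beta_max
    return max(beta_min, min(dot(s, s) / sy, beta_max))


def initial_step(x0, grad, eps_rel=1e-7, eps_abs=1e-10,
                 beta_min=1e-30, beta_max=1e30):
    """Initial steplength from a tiny trial move along -grad(x0).

    eps_rel and eps_abs are placeholder tolerances for this sketch.
    """
    g0 = grad(x0)
    t_small = max(eps_rel * math.sqrt(dot(x0, x0)), eps_abs)
    x_bar = [xi - t_small * gi for xi, gi in zip(x0, g0)]
    s_bar = [xb - xi for xb, xi in zip(x_bar, x0)]
    y_bar = [gb - gi for gb, gi in zip(grad(x_bar), g0)]
    return spectral_step(s_bar, y_bar, beta_min, beta_max)


# Stand-in gradient of the quadratic ||x||^2 (hypothetical test function).
quad_grad = lambda x: [2.0 * xi for xi in x]
beta0 = initial_step([1.0, 1.0], quad_grad)
```

For the quadratic stand-in, whose Hessian is $2I$, the steplength recovers the inverse curvature $1/2$, which is the behaviour the spectral choice is designed to capture.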
Furthermore, a typical choice for the other parameters is to set $\beta_{\min} = 10^{-30}$, $\beta_{\max} = 10^{30}$, $\sigma = 10^{-4}$, $\xi = 10$, $\gamma = 2$, $\zeta_1 = \zeta_2 = 2$, and $\epsilon =$

As for the initial value of the penalty parameter, we can take the idea given in [1], which considers the scaling of the objective function and the constraints, so
$$c_0 := \max\left\{ c_{\min}, \min\left\{ \frac{10 \max\{ |f(x^0)|, 1 \}}{\max\left\{ 1, \tfrac{1}{2} \left( \|h(x^0)\|^2 + \|\max\{g(x^0), 0\}\|^2 + \|P_X(x^0) - x^0\|^2 \right) \right\}}, c_{\max} \right\} \right\},$$
where $c_{\min}$ and $c_{\max}$ are the minimum and maximum values allowed for the penalty parameter, for example, $c_{\min} = 1$ and $c_{\max} =$

We also observe that in Step 5, an Armijo-type line search is performed in order to find a step size $t_k$. For each iteration of the line search, we need to compute $w_{c_k}(x^k + t_k d^k)$, which requires the computation of the multipliers estimates $\lambda(x^k + t_k d^k)$ and $\mu(x^k + t_k d^k)$. This, in turn, requires a matrix factorization to solve a linear least squares problem of order $(n + 2m + p) \times (m + p)$. Since the trial point changes at each iteration of the line search, this strategy may be computationally expensive.

Another fact that we should point out is that the (spectral) projected gradient method is a first-order method, in the sense that it requires only the computation of the gradient $\nabla w_{c_k}(x^k)$ at each iteration. In [2], the authors proposed to use a Gauss-Newton-type method to solve the problem (1.1) with $X = \mathbb{R}^n$. Recalling that an element of the B-subdifferential $\partial_B \nabla w_{c_k}(x^k)$ has third-order terms of the problem data, the idea given in [2] is to ignore those terms by considering the augmented Lagrangian function for variational inequalities. Another approach, given in [9] for nonlinear second-order cone programming problems, is to use an approximation of an element of $\partial_B \nabla w_{c_k}(x^k)$, so that the Newton method applied to the problem has global and superlinear convergence. The use of a second-order method for the case considered here (that is, a minimization with easy constraints) is still an ongoing topic of research.
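To make the structure of Algorithm 5.2 concrete, the following Python sketch runs the projection, backtracking and spectral steps for a fixed penalty parameter. It is a simplified illustration under several assumptions: the set $X$ is a single hyperplane $\{z : \langle a, z \rangle = b\}$ (a special case of linear equality constraints), a generic smooth function `f` with gradient `grad` stands in for $w_{c_k}$, the penalty update of Step 3 is omitted, and the iterate is updated as $x^{k+1} = x^k + t_k d^k$ with the step size found in Step 5. All function names, the iteration cap `max_iter` and the lower bound on the step size are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two lists of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))


def project_hyperplane(x, a, b):
    """Orthogonal projection of x onto {z : <a, z> = b}."""
    shift = (dot(a, x) - b) / dot(a, a)
    return [xi - shift * ai for xi, ai in zip(x, a)]


def spg(f, grad, project, x0, beta=1.0, eps=1e-8, sigma=1e-4,
        beta_min=1e-30, beta_max=1e30, max_iter=500):
    """Monotone spectral projected gradient sketch with fixed penalty."""
    x = project(x0)
    for _ in range(max_iter):
        g = grad(x)
        # Step 2: stop when the projected-gradient residual is small.
        p = project([xi - gi for xi, gi in zip(x, g)])
        if math.sqrt(sum((pi - xi) ** 2 for pi, xi in zip(p, x))) <= eps:
            break
        # Step 4: projected direction with the spectral steplength beta.
        p = project([xi - beta * gi for xi, gi in zip(x, g)])
        d = [pi - xi for pi, xi in zip(p, x)]
        # Step 5: Armijo backtracking on the step size t.
        slope, fx, t = dot(g, d), f(x), 1.0
        trial = [xi + t * di for xi, di in zip(x, d)]
        while f(trial) > fx + sigma * t * slope and t > 1e-16:
            t *= 0.5
            trial = [xi + t * di for xi, di in zip(x, d)]
        # Step 7: safeguarded spectral steplength for the next iteration.
        s = [ti - xi for ti, xi in zip(trial, x)]
        y = [gn - gi for gn, gi in zip(grad(trial), g)]
        sy = dot(s, y)
        if sy <= 0.0:
            beta = beta_max
        else:
            beta = max(beta_min, min(dot(s, s) / sy, beta_max))
        # Step 6: accept the trial point x + t * d.
        x = trial
    return x


# Toy problem (hypothetical): minimize (x1 - 1)^2 + (x2 - 2)^2 on x1 + x2 = 1.
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 2.0 * (x[1] - 2.0)]
project = lambda x: project_hyperplane(x, [1.0, 1.0], 1.0)
x_star = spg(f, grad, project, [0.0, 0.0])
print(x_star)
```

On this toy problem the minimizer is the projection of $(1, 2)$ onto the hyperplane, that is, $(0, 1)$. Each backtracking trial calls `f` once, which mirrors the cost remark above: with the exact penalty function, every such call would require new multipliers estimates.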
6 Conclusion

In this paper, we extended Andreani, Fukuda and Silva's exact penalty function approach to nonlinear optimization with easy constraints, in particular linear equality constraints. To do so, we proposed a modification of the multipliers estimate and showed some exactness results. We also gave a way to dynamically update the penalty parameter, and proposed to use the spectral projected gradient method to solve the problem. As future work, it is desirable to use a second-order method, like the Newton method, and show the corresponding convergence results. Numerical experiments should also be done, including comparisons with other methods.

Acknowledgments

I would like to express my appreciation to Assistant Professor Ellen Hidemi Fukuda. She always encouraged and supervised me kindly, although I often troubled her. Moreover, she gave me precious advice during my research. Without her help, I could not have obtained the results in this paper. I would also like to express my gratitude to Professor Nobuo Yamashita, who always supported me during my master's course. He gave me invaluable comments and warm encouragement. Finally, I would like to thank all members of Yamashita Laboratory, my friends and my family for their encouraging words.

References

[1] R. Andreani, E. G. Birgin, J. M. Martínez, M. Schuverdt. On augmented Lagrangian methods with general lower-level constraints, SIAM J. Optim. 18(4) (2007).
[2] R. Andreani, E. H. Fukuda, P. J. S. Silva. A Gauss-Newton approach for solving constrained optimization problems using differentiable exact penalties, J. Optim. Theory Appl. 156(2) (2013).
[3] T. A. André, P. J. S. Silva. Exact penalties for variational inequalities with applications to nonlinear complementarity problems, Comput. Optim. Appl. 47(3) (2010).
[4] E. G. Birgin, J. M. Martínez, M. Raydan. Nonmonotone spectral gradient methods on convex sets, SIAM J. Optim. 10(4) (2000).
[5] E. G. Birgin, J. M. Martínez, M. Raydan. Inexact spectral projected gradient methods on convex sets, IMA J. Numer. Anal. 23 (2003).
[6] G. Di Pillo, L. Grippo. A continuously differentiable exact penalty function for nonlinear programming problems with inequality constraints, SIAM J. Control Optim. 23 (1985).
[7] G. Di Pillo, L. Grippo. Exact penalty functions in constrained optimization, SIAM J. Control Optim. 27(6) (1989).
[8] R. Fletcher. A class of methods for nonlinear programming with termination and convergence properties, In: J. Abadie (ed.), Integer and Nonlinear Programming, North-Holland, Amsterdam (1970).
[9] E. H. Fukuda, P. J. S. Silva, M. Fukushima. Differentiable exact penalty functions for nonlinear second-order cone programs, SIAM J. Optim. 22(4) (2012).
[10] T. Glad, E. Polak. A multiplier method with automatic limitation of penalty growth, Math. Program. 17(2) (1979).
[11] M. R. Hestenes. Multiplier and gradient methods, J. Optim. Theory Appl. 4 (1969).
[12] S. Lucidi. New results on a continuously differentiable exact penalty function, SIAM J. Optim. 2(4) (1992).
[13] H. Mukai, E. Polak. A quadratically convergent primal-dual algorithm with global convergence properties for solving optimization problems with equality constraints, Math. Program. 9(3) (1975).
[14] M. J. D. Powell. A method for nonlinear constraints in minimization problems, In: R. Fletcher (ed.), Optimization, Academic Press, New York (1969).
[15] R. T. Rockafellar. Augmented Lagrange multiplier functions and duality in nonconvex programming, SIAM J. Control Optim. 12(2) (1974).
[16] W. I. Zangwill. Nonlinear programming via penalty functions, Manag. Sci. 13 (1967).


More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

Optimality conditions for problems over symmetric cones and a simple augmented Lagrangian method

Optimality conditions for problems over symmetric cones and a simple augmented Lagrangian method Optimality conditions for problems over symmetric cones and a simple augmented Lagrangian method Bruno F. Lourenço Ellen H. Fukuda Masao Fukushima September 9, 017 Abstract In this work we are interested

More information

Karush-Kuhn-Tucker Conditions. Lecturer: Ryan Tibshirani Convex Optimization /36-725

Karush-Kuhn-Tucker Conditions. Lecturer: Ryan Tibshirani Convex Optimization /36-725 Karush-Kuhn-Tucker Conditions Lecturer: Ryan Tibshirani Convex Optimization 10-725/36-725 1 Given a minimization problem Last time: duality min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j =

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

CONSTRAINED NONLINEAR PROGRAMMING

CONSTRAINED NONLINEAR PROGRAMMING 149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach

More information

Algorithms for nonlinear programming problems II

Algorithms for nonlinear programming problems II Algorithms for nonlinear programming problems II Martin Branda Charles University in Prague Faculty of Mathematics and Physics Department of Probability and Mathematical Statistics Computational Aspects

More information

Spectral gradient projection method for solving nonlinear monotone equations

Spectral gradient projection method for solving nonlinear monotone equations Journal of Computational and Applied Mathematics 196 (2006) 478 484 www.elsevier.com/locate/cam Spectral gradient projection method for solving nonlinear monotone equations Li Zhang, Weijun Zhou Department

More information

Some new facts about sequential quadratic programming methods employing second derivatives

Some new facts about sequential quadratic programming methods employing second derivatives To appear in Optimization Methods and Software Vol. 00, No. 00, Month 20XX, 1 24 Some new facts about sequential quadratic programming methods employing second derivatives A.F. Izmailov a and M.V. Solodov

More information

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems 1 Numerical optimization Alexander & Michael Bronstein, 2006-2009 Michael Bronstein, 2010 tosca.cs.technion.ac.il/book Numerical optimization 048921 Advanced topics in vision Processing and Analysis of

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

A Primal-Dual Interior-Point Method for Nonlinear Programming with Strong Global and Local Convergence Properties

A Primal-Dual Interior-Point Method for Nonlinear Programming with Strong Global and Local Convergence Properties A Primal-Dual Interior-Point Method for Nonlinear Programming with Strong Global and Local Convergence Properties André L. Tits Andreas Wächter Sasan Bahtiari Thomas J. Urban Craig T. Lawrence ISR Technical

More information

A SHIFTED PRIMAL-DUAL PENALTY-BARRIER METHOD FOR NONLINEAR OPTIMIZATION

A SHIFTED PRIMAL-DUAL PENALTY-BARRIER METHOD FOR NONLINEAR OPTIMIZATION A SHIFTED PRIMAL-DUAL PENALTY-BARRIER METHOD FOR NONLINEAR OPTIMIZATION Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-19-3 March

More information

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory

Preprint ANL/MCS-P , Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory Preprint ANL/MCS-P1015-1202, Dec 2002 (Revised Nov 2003, Mar 2004) Mathematics and Computer Science Division Argonne National Laboratory A GLOBALLY CONVERGENT LINEARLY CONSTRAINED LAGRANGIAN METHOD FOR

More information

Written Examination

Written Examination Division of Scientific Computing Department of Information Technology Uppsala University Optimization Written Examination 202-2-20 Time: 4:00-9:00 Allowed Tools: Pocket Calculator, one A4 paper with notes

More information

SF2822 Applied Nonlinear Optimization. Preparatory question. Lecture 9: Sequential quadratic programming. Anders Forsgren

SF2822 Applied Nonlinear Optimization. Preparatory question. Lecture 9: Sequential quadratic programming. Anders Forsgren SF2822 Applied Nonlinear Optimization Lecture 9: Sequential quadratic programming Anders Forsgren SF2822 Applied Nonlinear Optimization, KTH / 24 Lecture 9, 207/208 Preparatory question. Try to solve theory

More information

Numerical optimization

Numerical optimization Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal

More information

Nonlinear Optimization

Nonlinear Optimization Nonlinear Optimization Etienne de Klerk (UvT)/Kees Roos e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos Course WI3031 (Week 4) February-March, A.D. 2005 Optimization Group 1 Outline

More information

On Augmented Lagrangian Methods with General Lower-Level Constraints

On Augmented Lagrangian Methods with General Lower-Level Constraints On Augmented Lagrangian Methods with General Lower-Level Constraints R Andreani, Ernesto Birgin, J. Martinez, M. L. Schuverdt To cite this version: R Andreani, Ernesto Birgin, J. Martinez, M. L. Schuverdt.

More information

10 Numerical methods for constrained problems

10 Numerical methods for constrained problems 10 Numerical methods for constrained problems min s.t. f(x) h(x) = 0 (l), g(x) 0 (m), x X The algorithms can be roughly divided the following way: ˆ primal methods: find descent direction keeping inside

More information

Lecture 13 Newton-type Methods A Newton Method for VIs. October 20, 2008

Lecture 13 Newton-type Methods A Newton Method for VIs. October 20, 2008 Lecture 13 Newton-type Methods A Newton Method for VIs October 20, 2008 Outline Quick recap of Newton methods for composite functions Josephy-Newton methods for VIs A special case: mixed complementarity

More information

On the Coerciveness of Merit Functions for the Second-Order Cone Complementarity Problem

On the Coerciveness of Merit Functions for the Second-Order Cone Complementarity Problem On the Coerciveness of Merit Functions for the Second-Order Cone Complementarity Problem Guidance Professor Assistant Professor Masao Fukushima Nobuo Yamashita Shunsuke Hayashi 000 Graduate Course in Department

More information

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 Masao Fukushima 2 July 17 2010; revised February 4 2011 Abstract We present an SOR-type algorithm and a

More information

Optimisation in Higher Dimensions

Optimisation in Higher Dimensions CHAPTER 6 Optimisation in Higher Dimensions Beyond optimisation in 1D, we will study two directions. First, the equivalent in nth dimension, x R n such that f(x ) f(x) for all x R n. Second, constrained

More information

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information

GENERALIZED second-order cone complementarity

GENERALIZED second-order cone complementarity Stochastic Generalized Complementarity Problems in Second-Order Cone: Box-Constrained Minimization Reformulation and Solving Methods Mei-Ju Luo and Yan Zhang Abstract In this paper, we reformulate the

More information

4TE3/6TE3. Algorithms for. Continuous Optimization

4TE3/6TE3. Algorithms for. Continuous Optimization 4TE3/6TE3 Algorithms for Continuous Optimization (Duality in Nonlinear Optimization ) Tamás TERLAKY Computing and Software McMaster University Hamilton, January 2004 terlaky@mcmaster.ca Tel: 27780 Optimality

More information

Symmetric and Asymmetric Duality

Symmetric and Asymmetric Duality journal of mathematical analysis and applications 220, 125 131 (1998) article no. AY975824 Symmetric and Asymmetric Duality Massimo Pappalardo Department of Mathematics, Via Buonarroti 2, 56127, Pisa,

More information

A STABILIZED SQP METHOD: GLOBAL CONVERGENCE

A STABILIZED SQP METHOD: GLOBAL CONVERGENCE A STABILIZED SQP METHOD: GLOBAL CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-13-4 Revised July 18, 2014, June 23,

More information

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Optimization Escuela de Ingeniería Informática de Oviedo (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Unconstrained optimization Outline 1 Unconstrained optimization 2 Constrained

More information

Constrained Optimization Theory

Constrained Optimization Theory Constrained Optimization Theory Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. IMA, August 2016 Stephen Wright (UW-Madison) Constrained Optimization Theory IMA, August

More information

A smoothing augmented Lagrangian method for solving simple bilevel programs

A smoothing augmented Lagrangian method for solving simple bilevel programs A smoothing augmented Lagrangian method for solving simple bilevel programs Mengwei Xu and Jane J. Ye Dedicated to Masao Fukushima in honor of his 65th birthday Abstract. In this paper, we design a numerical

More information

Lagrange duality. The Lagrangian. We consider an optimization program of the form

Lagrange duality. The Lagrangian. We consider an optimization program of the form Lagrange duality Another way to arrive at the KKT conditions, and one which gives us some insight on solving constrained optimization problems, is through the Lagrange dual. The dual is a maximization

More information

Subgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus

Subgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus 1/41 Subgradient Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes definition subgradient calculus duality and optimality conditions directional derivative Basic inequality

More information

ICS-E4030 Kernel Methods in Machine Learning

ICS-E4030 Kernel Methods in Machine Learning ICS-E4030 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 28. September, 2016 Juho Rousu 28. September, 2016 1 / 38 Convex optimization Convex optimisation This

More information

An approach to constrained global optimization based on exact penalty functions

An approach to constrained global optimization based on exact penalty functions DOI 10.1007/s10898-010-9582-0 An approach to constrained global optimization based on exact penalty functions G. Di Pillo S. Lucidi F. Rinaldi Received: 22 June 2010 / Accepted: 29 June 2010 Springer Science+Business

More information

Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version

Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version Convex Optimization Theory Chapter 5 Exercises and Solutions: Extended Version Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

More information

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints First-order optimality conditions for mathematical programs with second-order cone complementarity constraints Jane J. Ye Jinchuan Zhou Abstract In this paper we consider a mathematical program with second-order

More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

N. L. P. NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP. Optimization. Models of following form:

N. L. P. NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP. Optimization. Models of following form: 0.1 N. L. P. Katta G. Murty, IOE 611 Lecture slides Introductory Lecture NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP does not include everything

More information

Penalty and Barrier Methods. So we again build on our unconstrained algorithms, but in a different way.

Penalty and Barrier Methods. So we again build on our unconstrained algorithms, but in a different way. AMSC 607 / CMSC 878o Advanced Numerical Optimization Fall 2008 UNIT 3: Constrained Optimization PART 3: Penalty and Barrier Methods Dianne P. O Leary c 2008 Reference: N&S Chapter 16 Penalty and Barrier

More information

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen Numerisches Rechnen (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2011/12 IGPM, RWTH Aachen Numerisches Rechnen

More information

Least Sparsity of p-norm based Optimization Problems with p > 1

Least Sparsity of p-norm based Optimization Problems with p > 1 Least Sparsity of p-norm based Optimization Problems with p > Jinglai Shen and Seyedahmad Mousavi Original version: July, 07; Revision: February, 08 Abstract Motivated by l p -optimization arising from

More information

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM H. E. Krogstad, IMF, Spring 2012 Karush-Kuhn-Tucker (KKT) Theorem is the most central theorem in constrained optimization, and since the proof is scattered

More information

Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Lecture 15 Newton Method and Self-Concordance. October 23, 2008 Newton Method and Self-Concordance October 23, 2008 Outline Lecture 15 Self-concordance Notion Self-concordant Functions Operations Preserving Self-concordance Properties of Self-concordant Functions Implications

More information

A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm

A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm Journal name manuscript No. (will be inserted by the editor) A Primal-Dual Augmented Lagrangian Penalty-Interior-Point Filter Line Search Algorithm Rene Kuhlmann Christof Büsens Received: date / Accepted:

More information