CONSTRAINED OPTIMALITY CRITERIA


5 CONSTRAINED OPTIMALITY CRITERIA

In Chapters 2 and 3, we discussed the necessary and sufficient optimality criteria for unconstrained optimization problems. But most engineering problems involve optimization subject to several constraints on the design variables. The presence of constraints essentially reduces the region in which we search for the optimum. At the outset, it may appear that the reduction in the size of the feasible region should simplify the search for the optimum. On the contrary, the optimization process becomes more complicated, since many of the optimality criteria developed earlier need not hold in the presence of constraints. Even the basic condition that an optimum must be at a stationary point, where the gradient is zero, may be violated. For example, the unconstrained minimum of f(x) = (x − 2)² occurs at the stationary point x = 2. But if the problem has the constraint x ≥ 4, then the constrained minimum occurs at the point x = 4. This is not a stationary point of f, since f′(4) = 4. In this chapter we develop necessary and sufficient conditions of optimality for constrained problems. We begin with optimization problems involving only equality constraints.

5.1 EQUALITY-CONSTRAINED PROBLEMS

Consider an optimization problem involving several equality constraints:

Minimize   f(x1, x2, ..., xN)
Subject to hk(x1, ..., xN) = 0    k = 1, ..., K

Engineering Optimization: Methods and Applications, Second Edition. A. Ravindran, K. M. Ragsdell, and G. V. Reklaitis. © 2006 John Wiley & Sons, Inc. ISBN: 978-0-471-55814-9

In principle, this problem can be solved as an unconstrained optimization problem by explicitly eliminating K independent variables using the equality constraints. In effect, the presence of equality constraints reduces the dimensionality of the original problem from N to N − K. Once the problem is reduced to an unconstrained optimization problem, the methods of Chapters 2 and 3 can be used to identify the optimum. To illustrate this, consider the following example.

Example 5.1

Minimize   f(x) = x1 x2 x3
Subject to h1(x) = x1 + x2 + x3 − 1 = 0

Eliminating the variable x3 with the help of h1(x) = 0, we get an unconstrained optimization problem involving two variables:

min f(x1, x2) = x1 x2 (1 − x1 − x2)

The methods of Chapter 3 can now be applied to identify the optimum.

The variable elimination method is applicable as long as the equality constraints can be solved explicitly for a given set of independent variables. In the presence of several equality constraints, the elimination process may become unwieldy. Moreover, in certain situations it may not be possible to solve the constraints explicitly to eliminate a variable. For instance, in Example 5.1, if the constraint h1(x) = 0 were given as

h1(x) = x1x3 + x2x3 + x2x1 − 1 = 0

then an explicit solution for one variable in terms of the others becomes far less straightforward. Hence, in problems involving several complex equality constraints, it is better to handle the constraints by the method of Lagrange multipliers, which is described in the next section.

5.2 LAGRANGE MULTIPLIERS

The method of Lagrange multipliers essentially gives a set of necessary conditions to identify candidate optimal points of equality-constrained optimization problems. This is done by converting the constrained problem to an equivalent unconstrained problem with the help of certain unspecified parameters known as Lagrange multipliers.
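The variable-elimination idea of Example 5.1 can be sketched numerically. The following fragment is an illustrative sketch only: the candidate stationary point (1/3, 1/3) of the reduced problem is supplied by hand rather than found by a search, and the code merely checks that the reduced gradient vanishes there and that the recovered x3 satisfies the original constraint.

```python
# Numeric sketch of variable elimination for Example 5.1 (illustrative).

def f(x1, x2, x3):
    """Original objective f(x) = x1*x2*x3."""
    return x1 * x2 * x3

def f_reduced(x1, x2):
    """Objective after eliminating x3 = 1 - x1 - x2 via h1(x) = 0."""
    return f(x1, x2, 1.0 - x1 - x2)

def grad_reduced(x1, x2, h=1e-6):
    """Central-difference gradient of the reduced two-variable objective."""
    g1 = (f_reduced(x1 + h, x2) - f_reduced(x1 - h, x2)) / (2 * h)
    g2 = (f_reduced(x1, x2 + h) - f_reduced(x1, x2 - h)) / (2 * h)
    return g1, g2

# Candidate stationary point of the reduced problem, checked numerically:
x1s, x2s = 1.0 / 3.0, 1.0 / 3.0
g1, g2 = grad_reduced(x1s, x2s)    # both components should be ~0
x3s = 1.0 - x1s - x2s              # recovered eliminated variable
```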

Consider the minimization of a function of N variables subject to one equality constraint:

Minimize   f(x1, x2, ..., xN)    (5.1)
Subject to h1(x1, x2, ..., xN) = 0    (5.2)

The method of Lagrange multipliers converts this problem to the following unconstrained optimization problem:

Minimize   L(x; v) = f(x) − v h1(x)    (5.3)

The unconstrained function L(x; v) is called the Lagrangian function, and v is an unspecified constant called the Lagrange multiplier. There are no sign restrictions on the value of v.

Suppose for a given fixed value v = v° the unconstrained minimum of L(x; v) with respect to x occurs at x = x° and x° satisfies h1(x°) = 0. Then it is clear that x° minimizes (5.1) subject to (5.2), because for all values of x that satisfy (5.2), h1(x) = 0 and min L(x; v) = min f(x). Of course, the challenge is to determine the appropriate value v = v° so that the unconstrained minimum point x° satisfies (5.2). But this can be done by treating v as a variable, finding the unconstrained minimum of (5.3) as a function of v, and adjusting v such that (5.2) is satisfied. We illustrate this with the following example.

Example 5.2

Minimize   f(x) = x1² + x2²
Subject to h1(x) = 2x1 + x2 − 2 = 0

The unconstrained minimization problem becomes

Minimize   L(x; v) = x1² + x2² − v(2x1 + x2 − 2)

Solution. Setting the gradient of L with respect to x equal to zero,

∂L/∂x1 = 2x1 − 2v = 0  ⇒  x1° = v
∂L/∂x2 = 2x2 − v = 0   ⇒  x2° = v/2

To test whether the stationary point x° corresponds to a minimum, we compute the Hessian matrix of L(x; v) with respect to x as

H_L(x; v) = [ 2  0 ]
            [ 0  2 ]

which is positive definite. This implies that L(x; v) is a convex function of x for all x. Hence x1° = v, x2° = v/2 corresponds to the global minimum. To determine the optimal v, we substitute the values of x1° and x2° into the constraint 2x1 + x2 = 2 to get 2v + v/2 = 2, or v° = 4/5. Hence the constrained minimum is attained at x1° = 4/5, x2° = 2/5, and min f(x) = 4/5.

In the solution of Example 5.2, we treated L(x; v) as a function of the two variables x1 and x2 and considered v as a parameter whose value is adjusted to satisfy the constraint. In problems where it is difficult to get an explicit solution of

∂L/∂xj = 0    for j = 1, 2, ..., N

as a function of v, the values of x and v can be determined simultaneously by solving the following system of N + 1 equations in N + 1 unknowns:

∂L/∂xj = 0    for j = 1, 2, ..., N
h1(x) = 0

Any appropriate numerical search technique discussed in Chapter 3 (e.g., Newton's method) could be used to determine all possible solutions. For each of the solutions (x°; v°), the Hessian matrix of L with respect to x has to be evaluated to determine whether it is positive definite for a local minimum (or negative definite for a local maximum).

Example 5.3

Maximize   f(x) = x1 + x2
Subject to x1² + x2² = 1

Solution

L(x; v) = x1 + x2 − v(x1² + x2² − 1)

∂L/∂x1 = 1 − 2vx1 = 0
∂L/∂x2 = 1 − 2vx2 = 0
h1(x) = x1² + x2² − 1 = 0

There are two solutions to this system of three equations in three variables, given by

(x(1); v(1)) = ((−1/√2, −1/√2); −1/√2)
(x(2); v(2)) = ((1/√2, 1/√2); 1/√2)

The Hessian matrix of L(x; v) with respect to x is given by

H_L(x; v) = [ −2v   0 ]
            [  0  −2v ]

Evaluating the matrix H_L at the two solutions, we find that

H_L(x(1); v(1)) = [ √2  0 ]   is positive definite
                  [ 0  √2 ]

and

H_L(x(2); v(2)) = [ −√2   0 ]   is negative definite
                  [  0  −√2 ]

Hence (x(2); v(2)) corresponds to the maximum of L with respect to x, and the optimal solution is x1° = x2° = 1/√2. [Note that (x(1); v(1)) corresponds to the minimum of L.]

It is to be emphasized here that if we consider L as a function of three variables, namely x1, x2, and v, then the points (x(1); v(1)) and (x(2); v(2)) do not correspond to a minimum or maximum of L with respect to x and v. As a matter of fact, they become saddlepoints of the function L(x, v). We shall discuss saddlepoints and their significance later in this chapter.
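The Newton iteration suggested above can be applied directly to the augmented system of Example 5.3. The sketch below is illustrative: the starting point (1, 1, 1) is an assumption chosen near the positive solution, and the iteration converges to the maximizing stationary point (1/√2, 1/√2) with v = 1/√2.

```python
import numpy as np

# Newton's method on the augmented system of Example 5.3:
#   dL/dx1 = 1 - 2*v*x1 = 0
#   dL/dx2 = 1 - 2*v*x2 = 0
#   h1(x)  = x1^2 + x2^2 - 1 = 0

def F(z):
    """Residual vector of the augmented system at z = (x1, x2, v)."""
    x1, x2, v = z
    return np.array([1 - 2*v*x1, 1 - 2*v*x2, x1**2 + x2**2 - 1])

def J(z):
    """Jacobian of F with respect to (x1, x2, v)."""
    x1, x2, v = z
    return np.array([[-2*v,  0.0, -2*x1],
                     [ 0.0, -2*v, -2*x2],
                     [2*x1, 2*x2,  0.0]])

z = np.array([1.0, 1.0, 1.0])   # assumed start near the positive solution
for _ in range(20):
    z = z - np.linalg.solve(J(z), F(z))

x1, x2, v = z                   # -> (1/sqrt(2), 1/sqrt(2), 1/sqrt(2))
```

Starting instead from a point with negative components would steer the iteration toward the other solution (x(1); v(1)).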

The Lagrange multiplier method can be extended to several equality constraints. Consider the general problem

Minimize   f(x)
Subject to hk(x) = 0    k = 1, 2, ..., K

The Lagrange function becomes

L(x; v) = f(x) − Σ_{k=1}^{K} vk hk(x)

Here v1, v2, ..., vK are the Lagrange multipliers, which are unspecified parameters whose values will be determined later. By setting the partial derivatives of L with respect to x equal to zero, we get the following system of N equations in N unknowns:

∂L(x; v)/∂x1 = 0
∂L(x; v)/∂x2 = 0
  ...
∂L(x; v)/∂xN = 0

Since it is difficult to solve the above system as a function of the vector v explicitly, we augment the system with the constraint equations

h1(x) = 0
h2(x) = 0
  ...
hK(x) = 0

A solution to the augmented system of N + K equations in N + K variables gives a stationary point of L. This can then be tested for minimum or maximum by computing the Hessian matrix of L with respect to x, as discussed in the single-constraint case. There may exist some problems for which the augmented system of N + K equations in N + K unknowns has no solution. In such cases the Lagrange multiplier method would fail. However, such cases are rare in practice.
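For Example 5.2 the augmented system happens to be linear in (x1, x2, v), so it can be solved in a single step rather than by iteration; a minimal sketch:

```python
import numpy as np

# Augmented system of Example 5.2 (stationarity plus the constraint),
# linear in the unknowns (x1, x2, v):
#   2*x1        - 2*v = 0
#          2*x2 -   v = 0
#   2*x1 +  x2        = 2
A = np.array([[2.0, 0.0, -2.0],
              [0.0, 2.0, -1.0],
              [2.0, 1.0,  0.0]])
b = np.array([0.0, 0.0, 2.0])

x1, x2, v = np.linalg.solve(A, b)   # x1 = 4/5, x2 = 2/5, v = 4/5
fmin = x1**2 + x2**2                # constrained minimum value 4/5
```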

5.3 ECONOMIC INTERPRETATION OF LAGRANGE MULTIPLIERS

So far in our discussion we have treated Lagrange multipliers as adjustable parameters whose values are adjusted to satisfy the constraints. In fact, the Lagrange multipliers do have an important economic interpretation as shadow prices of the constraints, and their optimal values are very useful in sensitivity analysis. To exhibit this interpretation, let us consider the following optimization problem involving two variables and one equality constraint:

Minimize   f(x1, x2)
Subject to h1(x1, x2) = b1

where b1 corresponds to the availability of a certain scarce resource. The Lagrangian is

L(x; v1) = f(x) − v1[h1(x) − b1]

Let us assume that the stationary point of L corresponds to the global minimum:

∂L/∂x1 = ∂f/∂x1 − v1 ∂h1/∂x1 = 0    (5.4)
∂L/∂x2 = ∂f/∂x2 − v1 ∂h1/∂x2 = 0    (5.5)

Let v1° be the optimal Lagrange multiplier, and let the minimum of L(x; v1) for v1 = v1° occur at x = x° such that h1(x°) = b1 and f(x°) = L(x°; v1°) = f°. It is clear that the optimal values (x°; v1°) are a function of b1, the limited availability of the scarce resource. The change in f°, the optimal value of f, due to a change in b1 is given by the partial derivative ∂f°/∂b1. By the chain rule,

∂f°/∂b1 = (∂f°/∂x1)(∂x1°/∂b1) + (∂f°/∂x2)(∂x2°/∂b1)    (5.6)

The partial derivative of the constraint equation h1(x°) − b1 = 0 with respect to b1 is given by

(∂h1/∂x1)(∂x1°/∂b1) + (∂h1/∂x2)(∂x2°/∂b1) − 1 = 0    (5.7)

Multiply both sides of (5.7) by v1° and subtract from (5.6) to get

∂f°/∂b1 − v1° = (∂f/∂x1 − v1° ∂h1/∂x1)(∂x1°/∂b1) + (∂f/∂x2 − v1° ∂h1/∂x2)(∂x2°/∂b1)    (5.8)

Since x° and v1° satisfy (5.4) and (5.5), Eq. (5.8) reduces to

∂f°/∂b1 = v1°    (5.9)

Thus, from (5.9) we note that the rate of change of the optimal value of f with respect to b1 is given by the optimal value of the Lagrange multiplier v1°. In other words, the change in the optimal value of the objective function per unit increase in the right-hand-side constant of a constraint, which we defined as the shadow price in Chapter 4, is given by the Lagrange multiplier. Depending on the sign of v1°, f° may increase or decrease with a change in b1.

For an optimization problem with K constraints and N variables given by

Minimize   f(x)
Subject to hk(x) = bk    k = 1, 2, ..., K

we can show, using a similar argument, that

∂f°/∂bk = vk°    for k = 1, 2, ..., K

5.4 KUHN–TUCKER CONDITIONS

In the previous section we found that Lagrange multipliers could be used in developing optimality criteria for equality-constrained optimization problems. Kuhn and Tucker have extended this theory to include the general nonlinear programming (NLP) problem with both equality and inequality constraints. Consider the following general NLP problem:

Minimize   f(x)    (5.10)
Subject to gj(x) ≥ 0    for j = 1, 2, ..., J    (5.11)
           hk(x) = 0    for k = 1, 2, ..., K
           x = (x1, x2, ..., xN)    (5.12)
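The shadow-price relation (5.9) can be verified numerically on Example 5.2 by treating the right-hand side b of the constraint 2x1 + x2 = b as the scarce resource; the finite-difference sketch below compares df°/db with the optimal multiplier.

```python
import numpy as np

# Finite-difference check of the shadow-price relation (5.9) on Example 5.2,
# with the constraint written as 2*x1 + x2 = b.

def solve(b):
    """Return (x1, x2, v) minimizing x1^2 + x2^2 subject to 2*x1 + x2 = b."""
    A = np.array([[2.0, 0.0, -2.0],
                  [0.0, 2.0, -1.0],
                  [2.0, 1.0,  0.0]])
    return np.linalg.solve(A, np.array([0.0, 0.0, b]))

def fopt(b):
    """Optimal objective value f°(b)."""
    x1, x2, _ = solve(b)
    return x1**2 + x2**2

b, eps = 2.0, 1e-6
v_opt = solve(b)[2]                                  # v° = 4/5 at b = 2
dfdb = (fopt(b + eps) - fopt(b - eps)) / (2 * eps)   # numeric df°/db
```

Here f°(b) = b²/5, so df°/db = 2b/5 = v°, in agreement with (5.9).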

Definition

The inequality constraint gj(x) ≥ 0 is said to be an active or binding constraint at the point x̄ if gj(x̄) = 0; it is said to be inactive or nonbinding if gj(x̄) > 0.

If we could identify the inactive constraints at the optimum before solving the problem, then we could delete those constraints from the model and reduce the problem size. The main difficulty lies in identifying the inactive constraints before the problem is solved.

Kuhn and Tucker have developed the necessary and sufficient optimality conditions for the NLP problem assuming that the functions f, gj, and hk are differentiable. These optimality conditions, commonly known as the Kuhn–Tucker conditions (KTCs), may be stated in the form of finding a solution to a system of nonlinear equations. Hence, they are also referred to as the Kuhn–Tucker problem (KTP).

5.4.1 Kuhn–Tucker Conditions or Kuhn–Tucker Problem

Find vectors x (N × 1), u (1 × J), and v (1 × K) that satisfy

∇f(x) − Σ_{j=1}^{J} uj ∇gj(x) − Σ_{k=1}^{K} vk ∇hk(x) = 0    (5.13)
gj(x) ≥ 0       for j = 1, 2, ..., J    (5.14)
hk(x) = 0       for k = 1, 2, ..., K    (5.15)
uj gj(x) = 0    for j = 1, 2, ..., J    (5.16)
uj ≥ 0          for j = 1, 2, ..., J    (5.17)

Let us first illustrate the KTCs with an example.

Example 5.4

Minimize   f(x) = x1² − x2
Subject to x1 + x2 = 6
           x1 − 1 ≥ 0
           26 − x1² − x2² ≥ 0

Solution. Expressing the above problem in the NLP problem format given by Eqs. (5.10)–(5.12), we get

f(x) = x1² − x2            ∇f(x) = (2x1, −1)
g1(x) = x1 − 1             ∇g1(x) = (1, 0)
g2(x) = 26 − x1² − x2²     ∇g2(x) = (−2x1, −2x2)
h1(x) = x1 + x2 − 6        ∇h1(x) = (1, 1)

Equation (5.13) of the KTCs reduces to

∂f/∂xi − u1 ∂g1/∂xi − u2 ∂g2/∂xi − v1 ∂h1/∂xi = 0    for i = 1, 2

This corresponds to

2x1 − u1 + 2u2x1 − v1 = 0
−1 + 2u2x2 − v1 = 0

Equations (5.14) and (5.15) of the KTP correspond to the given constraints of the NLP problem and are given by

x1 − 1 ≥ 0
26 − x1² − x2² ≥ 0
x1 + x2 − 6 = 0

Equation (5.16) is known as the complementary slackness condition in the KTP and is given by

u1(x1 − 1) = 0
u2(26 − x1² − x2²) = 0

Note that the variables u1 and u2 are restricted to be zero or positive, while v1 is unrestricted in sign. Thus, the KTCs for this example are given by

2x1 − u1 + 2u2x1 − v1 = 0
−1 + 2u2x2 − v1 = 0
x1 − 1 ≥ 0
26 − x1² − x2² ≥ 0
x1 + x2 − 6 = 0
u1(x1 − 1) = 0
u2(26 − x1² − x2²) = 0
u1 ≥ 0    u2 ≥ 0    v1 unrestricted

5.4.2 Interpretation of Kuhn–Tucker Conditions

To interpret the KTCs, consider first the equality-constrained NLP problem

Minimize   f(x)
Subject to hk(x) = 0    k = 1, ..., K

The KTCs are given by

∇f(x) − Σ_{k=1}^{K} vk ∇hk(x) = 0    (5.18)
hk(x) = 0    (5.19)

Now consider the Lagrangian function corresponding to the equality-constrained NLP problem:

L(x; v) = f(x) − Σ_{k=1}^{K} vk hk(x)

The first-order optimality conditions are given by

∇x L = ∇f(x) − Σ_{k=1}^{K} vk ∇hk(x) = 0
∂L/∂vk = −hk(x) = 0    for k = 1, ..., K

We find that the KTCs (5.18) and (5.19) are simply the first-order optimality conditions of the Lagrangian problem. Let us now consider the inequality-constrained NLP problem:

Minimize   f(x)
Subject to gj(x) ≥ 0    j = 1, ..., J

The KTCs are given by

∇f(x) − Σ_{j=1}^{J} uj ∇gj(x) = 0
gj(x) ≥ 0
uj gj(x) = 0
uj ≥ 0

The Lagrangian function can be expressed as

L(x; u) = f(x) − Σ_{j=1}^{J} uj gj(x)

and the first-order optimality conditions are given by

∇f(x) − Σ_{j=1}^{J} uj ∇gj(x) = 0    (5.20)
gj(x) ≥ 0    for j = 1, ..., J

Note that uj is the Lagrange multiplier corresponding to constraint j. In Section 5.3 we showed that uj represents the shadow price of constraint j; in other words, uj gives the change in the minimum value of the objective function f(x) per unit increase in the right-hand-side constant of that constraint.

If the jth constraint is inactive [i.e., gj(x) > 0], then uj = 0 and uj gj(x) = 0. On the other hand, if the jth constraint is active [i.e., gj(x) = 0], then its shadow price uj need not necessarily be zero, but the value uj gj(x) = 0 since gj(x) = 0. Hence,

uj gj(x) = 0    for all j = 1, ..., J

To determine the sign of uj, the shadow price of the constraint gj(x) ≥ 0, let us increase its right-hand-side value from 0 to 1. It is clear that this will constrain the problem further, because any solution that satisfies gj(x) ≥ 1 will automatically satisfy gj(x) ≥ 0. Hence, the feasible region will become smaller, and the minimum value of f(x) cannot improve (i.e., it will generally increase). In other words, the shadow price of the jth constraint, uj, is nonnegative, as given by the KTCs.

5.5 KUHN–TUCKER THEOREMS

In the previous section we developed the KTCs for constrained optimization problems. Using the theory of Lagrange multipliers, we saw intuitively that

the KTCs give the necessary conditions of optimality. In this section, we present the precise conditions under which the KTP implies the necessary and sufficient conditions of optimality.

Theorem 5.1 Kuhn–Tucker Necessity Theorem

Consider the NLP problem given by Eqs. (5.10)–(5.12). Let f, g, and h be differentiable functions and x* be a feasible solution to the NLP problem. Let I = {j | gj(x*) = 0} denote the index set of active inequality constraints. Furthermore, suppose ∇gj(x*) for j ∈ I and ∇hk(x*) for k = 1, ..., K are linearly independent. If x* is an optimal solution to the NLP problem, then there exists a (u*, v*) such that (x*, u*, v*) solves the KTP given by Eqs. (5.13)–(5.17).

The proof of the theorem is beyond the scope of this text. Interested students may refer to Bazaraa et al. [1, Chap. 4].

The condition that ∇gj(x*) for j ∈ I and ∇hk(x*) for k = 1, ..., K be linearly independent at the optimum is known as a constraint qualification. A constraint qualification essentially imposes certain regularity conditions on the feasible region that are frequently satisfied in practical problems. However, in general, it is difficult to verify a constraint qualification, since it requires that the optimum solution be known beforehand. For certain special NLP problems, however, the constraint qualification is always satisfied:

1. When all the inequality and equality constraints are linear.
2. When all the inequality constraints are concave functions, the equality constraints are linear, and there exists at least one feasible x that is strictly inside the feasible region of the inequality constraints. In other words, there exists an x such that gj(x) > 0 for j = 1, ..., J and hk(x) = 0 for k = 1, ..., K.

When the constraint qualification is not met at the optimum, there may not exist a solution to the KTP.

Example 5.5

Minimize   f(x) = (x1 − 3)² + x2²
Subject to g1(x) = (1 − x1)³ − x2 ≥ 0
           g2(x) = x1 ≥ 0
           g3(x) = x2 ≥ 0

Solution. Figure 5.1 illustrates the feasible region for this nonlinear program. It is clear that the optimal solution to this problem is x1* = 1, x2* = 0, and f(x*) = 4. We shall now show that the constraint qualification is not satisfied at the optimum.

Since g1(x*) = 0, g2(x*) = 1 > 0, and g3(x*) = 0, I = {1, 3}. Now,

∇g1(x*) = [−3(1 − x1)², −1] evaluated at x = x* = (0, −1)
∇g3(x*) = (0, 1)

It is clear that ∇g1(x*) and ∇g3(x*) are not linearly independent. Hence, the constraint qualification is not satisfied at the point x* = (1, 0). Let us now write the KTCs to see whether they will be satisfied at (1, 0). Equations (5.13), (5.16), and (5.17) of the KTCs become

2(x1 − 3) + 3u1(1 − x1)² − u2 = 0    (5.21)
2x2 + u1 − u3 = 0    (5.22)
u1[(1 − x1)³ − x2] = 0    (5.23)
u2 x1 = 0    (5.24)
u3 x2 = 0    (5.25)
u1, u2, u3 ≥ 0    (5.26)

At x* = (1, 0), Eq. (5.21) implies u2 = −4, while to satisfy Eq. (5.24), u2 = 0. Hence, there exists no Kuhn–Tucker point at the optimum.

Figure 5.1. Feasible region of Example 5.5.
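The linear-dependence argument above can be confirmed numerically by stacking the gradients of the active constraints at x* = (1, 0) and computing the rank of the resulting matrix; a short sketch:

```python
import numpy as np

# Constraint qualification check for Example 5.5 at x* = (1, 0):
# the active set is I = {1, 3}; the CQ requires the stacked gradients
# to have full row rank (here, rank 2).
x1, x2 = 1.0, 0.0

grad_g1 = np.array([-3.0 * (1.0 - x1)**2, -1.0])   # gradient of (1 - x1)^3 - x2
grad_g3 = np.array([0.0, 1.0])                     # gradient of x2

active = np.vstack([grad_g1, grad_g3])
rank = np.linalg.matrix_rank(active)   # rank 1 < 2 active constraints: CQ fails
```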

Note that a violated constraint qualification does not necessarily imply that a Kuhn–Tucker point does not exist. To illustrate this, suppose the objective function of Example 5.5 is changed to f(x) = (x1 − 1)² + x2². The optimum still occurs at x* = (1, 0), which still does not satisfy the constraint qualification. The KTCs given by Eqs. (5.22)–(5.26) remain the same, while Eq. (5.21) becomes

2(x1 − 1) + 3u1(1 − x1)² − u2 = 0

The reader can easily verify that there now exists a Kuhn–Tucker point given by x* = (1, 0) and u* = (0, 0, 0) that satisfies the KTCs.

The Kuhn–Tucker necessity theorem helps to identify points that are not optimal. In other words, given a feasible point that satisfies the constraint qualification, we can use Theorem 5.1 to prove that it is not optimal if it does not satisfy the KTCs. On the other hand, if it does satisfy the KTCs, there is no assurance that it is optimal to the nonlinear program! For example, consider the following NLP problem.

Example 5.6

Minimize   f(x) = 1 − x²
Subject to −1 ≤ x ≤ 3

Solution. Here,

g1(x) = x + 1 ≥ 0
g2(x) = 3 − x ≥ 0

The KTCs are given by

−2x − u1 + u2 = 0    (5.27)
−1 ≤ x ≤ 3    (5.28)
u1(x + 1) = 0    (5.29)
u2(3 − x) = 0    (5.30)
u1, u2 ≥ 0    (5.31)

Since the constraints are linear, the constraint qualification is satisfied at all feasible points. It is clear that x = 3 is optimal. But consider the feasible solution x = 2. To prove that it is not optimal, let us try to construct a Kuhn–Tucker point at x = 2 that satisfies Eqs. (5.27)–(5.31). To satisfy Eqs. (5.29)

and (5.30), we need u1 = u2 = 0; but x = 2 with u1 = u2 = 0 violates Eq. (5.27). Hence, by Theorem 5.1, x = 2 cannot be optimal. On the other hand, the solution x = 0, u1 = u2 = 0 satisfies Eqs. (5.27)–(5.31) and hence is a Kuhn–Tucker point, but it is not optimal! By Theorem 5.1, we also know that the KTCs must be satisfied at the optimal solution x = 3. It is easy to verify that the solution x = 3, u1 = 0, u2 = 6 satisfies the KTCs.

The following theorem gives conditions under which a Kuhn–Tucker point automatically becomes an optimal solution to the NLP problem.

Theorem 5.2 Kuhn–Tucker Sufficiency Theorem

Consider the NLP problem given by Eqs. (5.10)–(5.12). Let the objective function f(x) be convex, the inequality constraints gj(x) all be concave functions for j = 1, ..., J, and the equality constraints hk(x) for k = 1, ..., K be linear. If there exists a solution (x*, u*, v*) that satisfies the KTCs given by Eqs. (5.13)–(5.17), then x* is an optimal solution to the NLP problem.

A rigorous proof of the Kuhn–Tucker sufficiency theorem can be found in Mangasarian [2]. When the sufficiency conditions of Theorem 5.2 hold, finding a Kuhn–Tucker point gives an optimal solution to an NLP problem.

Theorem 5.2 can also be used to prove that a given solution to an NLP problem is optimal. To illustrate this, recall Example 5.4:

Minimize   f(x) = x1² − x2
Subject to g1(x) = x1 − 1 ≥ 0
           g2(x) = 26 − x1² − x2² ≥ 0
           h1(x) = x1 + x2 − 6 = 0

We shall prove that x1* = 1, x2* = 5 is optimal by using Theorem 5.2. Now,

∇f(x) = (2x1, −1)    and    H_f(x) = [ 2  0 ]
                                     [ 0  0 ]

Since H_f(x) is positive semidefinite for all x, f(x) is a convex function. The inequality constraint g1(x) is linear and hence both convex and concave. To show that g2(x) is concave, compute

∇g2(x) = (−2x1, −2x2)    and    H_g2(x) = [ −2   0 ]
                                          [  0  −2 ]

Since H_g2(x) is negative definite, g2(x) is concave. The equality constraint h1(x) is linear. Hence all the sufficiency conditions of Theorem 5.2 are satisfied, and if we are able to construct a Kuhn–Tucker point using x* = (1, 5), the solution x* is indeed optimal. The KTCs of Example 5.4 are given below:

2x1 − u1 + 2u2x1 − v1 = 0    (5.32)
−1 + 2u2x2 − v1 = 0    (5.33)
x1 − 1 ≥ 0    (5.34)
26 − x1² − x2² ≥ 0    (5.35)
x1 + x2 − 6 = 0    (5.36)
u1(x1 − 1) = 0    (5.37)
u2(26 − x1² − x2²) = 0    (5.38)
u1, u2 ≥ 0    (5.39)

Here x* = (1, 5) satisfies Eqs. (5.34)–(5.36), and hence it is feasible. Equations (5.32) and (5.33) reduce to

2 − u1 + 2u2 − v1 = 0
−1 + 10u2 − v1 = 0

By setting v1 = 0, we get the solution u2 = 0.1 and u1 = 2.2. Thus, the solution x* = (1, 5), u* = (2.2, 0.1), v1* = 0 satisfies the KTCs. Since the sufficiency conditions of Theorem 5.2 are satisfied, x* = (1, 5) is an optimal solution to Example 5.4. Note that there also exist other values of u1, u2, v1 that satisfy Eqs. (5.32)–(5.39).

Remarks

1. For practical problems, the constraint qualification will generally hold. If the functions are differentiable, a Kuhn–Tucker point is a possible candidate for the optimum. Hence, many NLP methods attempt to converge to a Kuhn–Tucker point. (Recall the analogy to the unconstrained optimization case, wherein the algorithms attempt to determine a stationary point.)

2. When the sufficiency conditions of Theorem 5.2 hold, a Kuhn–Tucker point automatically becomes the global minimum. Unfortunately, the sufficiency conditions are difficult to verify, and often practical problems may not possess these nice properties. Note that the presence of one nonlinear equality constraint is enough to violate the assumptions of Theorem 5.2.

3. The sufficiency conditions of Theorem 5.2 have been generalized further to nonconvex inequality constraints, nonconvex objectives, and nonlinear equality constraints. These use generalizations of convex functions such as quasiconvex and pseudoconvex functions. (See Section 5.9.)

5.6 SADDLEPOINT CONDITIONS

The discussion of Kuhn–Tucker optimality conditions in Sections 5.4 and 5.5 assumed that the objective function and the constraints are differentiable. We now discuss constrained optimality criteria for nondifferentiable functions.

Definition

A function f(x, y) is said to have a saddlepoint at (x*, y*) if

f(x*, y) ≤ f(x*, y*) ≤ f(x, y*)    for all x and y

The definition of a saddlepoint implies that x* minimizes the function f(x, y*) over all x and y* maximizes the function f(x*, y) over all y. For example, consider the function f(x, y) = x² + xy + 2y defined over all real values of x and nonnegative values of y. It is easy to verify that the function possesses a saddlepoint at the point x* = −2, y* = 4. In other words,

f(−2, y) ≤ f(−2, 4) ≤ f(x, 4)    for all y ≥ 0 and all real x

Recall the Lagrange multiplier method discussed in Section 5.2. It solves a constrained optimization problem of the form

Minimize   f(x)
Subject to hk(x) = 0    for k = 1, ..., K

The Lagrangian function is defined to be

L(x; v) = f(x) − Σ_{k=1}^{K} vk hk(x)

Suppose at v = v* the minimum of L(x; v*) occurs at x = x* such that hk(x*) = 0. We know then, by the Lagrange multiplier method, that x* is an optimal solution to the nonlinear program. It can be shown that (x*, v*) is a saddlepoint of the Lagrangian function satisfying

L(x*, v) ≤ L(x*, v*) ≤ L(x, v*)    for all x and v

Consider the general NLP problem:

Minimize   f(x)
Subject to gj(x) ≥ 0    for j = 1, ..., J
           x ∈ S

The set S may be used to impose additional restrictions on the design variables. For example, the design variables may all be integers or may be restricted to a certain discrete set. The Kuhn–Tucker saddlepoint problem (KTSP) is as follows: Find (x*, u*) such that

L(x*, u) ≤ L(x*, u*) ≤ L(x, u*)    for all u ≥ 0 and all x ∈ S

where

L(x, u) = f(x) − Σ_{j=1}^{J} uj gj(x)

Theorem 5.3 Sufficient Optimality Theorem

If (x*, u*) is a saddlepoint solution of a KTSP, then x* is an optimal solution to the NLP problem.

A proof of this theorem is available in Mangasarian [2, Chap. 3].

Remarks

1. No convexity assumptions about the functions have been made in Theorem 5.3.
2. No constraint qualification is invoked.
3. Nonlinear equality constraints of the form hk(x) = 0 for k = 1, ..., K can be handled easily by redefining the Lagrangian function as L(x, u, v) = f(x) − Σ_{j=1}^{J} uj gj(x) − Σ_{k=1}^{K} vk hk(x). Here the variables vk for k = 1, ..., K are unrestricted in sign.
4. Theorem 5.3 provides only a sufficient condition. There may exist some NLP problems for which a saddlepoint does not exist even though the NLP problem has an optimal solution.
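The KTSP inequalities can be checked on a small assumed example (not from the text): minimize f(x) = x² subject to g(x) = x − 1 ≥ 0 with S the real line, whose saddlepoint is x* = 1 with u* = 2. The sketch below verifies both inequalities on sample grids.

```python
import numpy as np

# Grid check of the KTSP saddlepoint inequalities for an assumed toy problem:
#   minimize f(x) = x^2  subject to g(x) = x - 1 >= 0,  S = R.
# Claimed saddlepoint: x* = 1, u* = 2.

def L(x, u):
    """Lagrangian L(x, u) = f(x) - u*g(x)."""
    return x**2 - u * (x - 1.0)

xs, us = 1.0, 2.0
xgrid = np.linspace(-3.0, 3.0, 601)
ugrid = np.linspace(0.0, 10.0, 501)

# L(x*, u) <= L(x*, u*): u* maximizes L(x*, .) over u >= 0
left_ok = bool(np.all(L(xs, ugrid) <= L(xs, us) + 1e-12))
# L(x*, u*) <= L(x, u*): x* minimizes L(., u*) over x in S
right_ok = bool(np.all(L(xs, us) <= L(xgrid, us) + 1e-12))
```

Here L(1, u) = 1 for every u (since g(x*) = 0), and L(x, 2) = (x − 1)² + 1 ≥ 1, so both inequalities hold with equality at the saddlepoint.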

Existence of Saddlepoints. There exist necessary optimality theorems that guarantee the existence of a saddlepoint solution without the assumption of differentiability. However, they assume that a constraint qualification is met and that the functions are convex.

Theorem 5.4  Necessary Optimality Theorem

Let x* minimize f(x) subject to g_j(x) ≥ 0, j = 1, ..., J, and x ∈ S. Assume S is a convex set, f(x) is a convex function, and the g_j(x) are concave functions on S. Assume also that there exists a point x̄ ∈ S such that g_j(x̄) > 0 for all j = 1, 2, ..., J. Then there exists a vector of multipliers u* ≥ 0 such that (x*, u*) is a saddlepoint of the Lagrangian function

    L(x, u) = f(x) − Σ_j u_j g_j(x)

satisfying

    L(x*, u) ≤ L(x*, u*) ≤ L(x, u*)

for all x ∈ S and u ≥ 0. For a proof of this theorem, refer to the text by Lasdon [3, Chap. 1].

Even though Theorem 5.3 and the KTSP provide sufficient conditions for optimality without invoking differentiability and convexity, determining a saddlepoint of a KTSP is generally difficult. However, the following theorem makes it computationally more attractive.

Theorem 5.5

A solution (x*, u*) with u* ≥ 0 and x* ∈ S is a saddlepoint of a KTSP if and only if the following conditions are satisfied:

(i) x* minimizes L(x, u*) over all x ∈ S
(ii) g_j(x*) ≥ 0 for j = 1, ..., J
(iii) u_j* g_j(x*) = 0 for j = 1, ..., J

For a proof, see Lasdon [3, Chap. 1].

Condition (i) of Theorem 5.5 amounts to finding an unconstrained minimum of a function, and any of the direct-search methods discussed in Chapter 3 could be used. Of course, this assumes prior knowledge of the value of u*. However, a trial-and-error method can be used to determine u* and x* simultaneously while also satisfying conditions (ii) and (iii). One such method, due to Everett [4], is called the generalized Lagrange multiplier method (Section 5.8). Theorems 5.3 and 5.5 also form the basis of many of the Lagrangian

relaxation methods that have been developed for solving large-scale NLP problems [3].

It is important to note that saddlepoints may not exist for all NLP problems. The existence of saddlepoints is guaranteed only for NLP problems that satisfy the conditions of Theorem 5.4.

5.7 SECOND-ORDER OPTIMALITY CONDITIONS

In Sections 5.4-5.6 we discussed the first-order necessary and sufficient conditions, called the Kuhn-Tucker conditions, for constrained optimization problems using the gradients of the objective function and constraints. Second-order necessary and sufficient optimality conditions that apply to twice-differentiable functions have been developed by McCormick [5], whose main results are summarized in this section. Consider the following NLP problem.

Problem P1

    Minimize f(x)
    Subject to g_j(x) ≥ 0    j = 1, 2, ..., J
               h_k(x) = 0    k = 1, 2, ..., K
               x ∈ R^N

The first-order KTCs are given by

    ∇f(x) − Σ_j u_j ∇g_j(x) − Σ_k v_k ∇h_k(x) = 0     (5.40)
    g_j(x) ≥ 0       j = 1, ..., J                    (5.41)
    h_k(x) = 0       k = 1, ..., K                    (5.42)
    u_j g_j(x) = 0   j = 1, ..., J                    (5.43)
    u_j ≥ 0          j = 1, ..., J                    (5.44)

Definitions

x is a feasible solution to an NLP problem when g_j(x) ≥ 0 for all j and h_k(x) = 0 for all k.

x* is a local minimum of an NLP problem when x* is feasible and f(x*) ≤ f(x) for all feasible x in some small neighborhood N(x*) of x*.

x* is a strict (unique or isolated) local minimum when x* is feasible and f(x*) < f(x) for all feasible x ≠ x* in some small neighborhood N(x*) of x*.
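Conditions (5.40)-(5.44) can be checked mechanically at a candidate point. The sketch below is our own illustration, not from the text: the helper name and the small test problem (minimize x₁² + x₂² subject to x₁ + x₂ − 1 ≥ 0, whose solution is x* = (0.5, 0.5) with u₁* = 1) are hypothetical.

```python
import numpy as np

def is_kuhn_tucker_point(grad_f, gs, grad_gs, hs, grad_hs, x, u, v, tol=1e-8):
    # Stationarity (5.40): grad f - sum u_j grad g_j - sum v_k grad h_k = 0
    r = grad_f(x) - sum(uj * dg(x) for uj, dg in zip(u, grad_gs)) \
                  - sum(vk * dh(x) for vk, dh in zip(v, grad_hs))
    # Feasibility (5.41), (5.42)
    feasible = all(g(x) >= -tol for g in gs) and all(abs(h(x)) <= tol for h in hs)
    # Complementary slackness (5.43) and nonnegativity (5.44)
    comp = all(abs(uj * g(x)) <= tol for uj, g in zip(u, gs))
    return bool(np.allclose(r, 0.0, atol=tol)) and feasible and comp \
           and all(uj >= -tol for uj in u)

# Hypothetical test problem: min x1^2 + x2^2  s.t.  x1 + x2 - 1 >= 0
ok = is_kuhn_tucker_point(
    grad_f=lambda x: np.array([2 * x[0], 2 * x[1]]),
    gs=[lambda x: x[0] + x[1] - 1],
    grad_gs=[lambda x: np.array([1.0, 1.0])],
    hs=[], grad_hs=[],
    x=np.array([0.5, 0.5]), u=[1.0], v=[])
assert ok   # (x*, u*) = (0.5, 0.5; 1) is a Kuhn-Tucker point
```

Passing this check only certifies the first-order conditions; as the rest of this section shows, a Kuhn-Tucker point need not be a local minimum.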

A Kuhn-Tucker point of an NLP problem is a vector (x*, u*, v*) satisfying Eqs. (5.40)-(5.44).

Let us first consider the basic motivation for the second-order optimality conditions. For simplicity, consider an equality-constrained NLP problem:

    Minimize f(x)
    Subject to h_k(x) = 0    k = 1, 2, ..., K

The first-order KTCs are given by

    h_k(x) = 0,  k = 1, ..., K      ∇f(x) − Σ_k v_k ∇h_k(x) = 0     (5.45)

Consider a point x̄ that satisfies these first-order conditions. To check further whether it is a local minimum, we can write down the Taylor series expansion at x̄, retaining higher order terms, for each of the functions f and h_k:

    f(x̄ + Δx) = f(x̄) + ∇f(x̄) Δx + ½ Δxᵀ H_f Δx + O(Δx³)     (5.46)

where H_f is the Hessian matrix of f evaluated at x̄ and O(Δx³) denotes very small higher order terms in Δx;

    h_k(x̄ + Δx) = h_k(x̄) + ∇h_k(x̄) Δx + ½ Δxᵀ H_k Δx + O(Δx³)     (5.47)

where H_k is the Hessian matrix of h_k(x) evaluated at x̄.

Multiply Eq. (5.47) by the Kuhn-Tucker multiplier v_k and sum over all k = 1, ..., K. Subtracting this sum from Eq. (5.46), we obtain

    f(x̄ + Δx) − Σ_k v_k h_k(x̄ + Δx) = f(x̄) − Σ_k v_k h_k(x̄) + [∇f(x̄) − Σ_k v_k ∇h_k(x̄)] Δx
                                        + ½ Δxᵀ [H_f − Σ_k v_k H_k] Δx + O(Δx³)     (5.48)

For x̄ + Δx to be feasible,

    h_k(x̄ + Δx) = 0     (5.49)

Assuming that the constraint qualification is satisfied at x̄, the Kuhn-Tucker necessary theorem implies that

    ∇f(x̄) − Σ_k v̄_k ∇h_k(x̄) = 0     (5.50)

Using Eqs. (5.49) and (5.50), Eq. (5.48) reduces to

    Δf = f(x̄ + Δx) − f(x̄) = ½ Δxᵀ [H_f − Σ_k v̄_k H_k] Δx + O(Δx³)     (5.51)

For x̄ to be a local minimum, it is necessary that Δf ≥ 0 for all feasible movements Δx about x̄. Using Eqs. (5.49) and (5.51), this condition implies that

    Δxᵀ [H_f − Σ_k v̄_k H_k] Δx ≥ 0     (5.52)

for all Δx satisfying

    h_k(x̄ + Δx) = 0   for k = 1, ..., K     (5.53)

Using Eq. (5.47) and ignoring the second- and higher order terms in Δx, Eq. (5.53) reduces to

    h_k(x̄ + Δx) = h_k(x̄) + ∇h_k(x̄) Δx = 0   i.e.,   ∇h_k(x̄) Δx = 0

Thus, assuming that the constraint qualification is satisfied at x̄, the necessary conditions for x̄ to be a local minimum are as follows:

1. There exist v̄_k, k = 1, ..., K, such that (x̄, v̄) is a Kuhn-Tucker point.
2. Δxᵀ [H_f − Σ_k v̄_k H_k] Δx ≥ 0 for all Δx satisfying ∇h_k(x̄) Δx = 0, k = 1, ..., K.

Similarly, the sufficient condition for x̄ to be a strict local minimum is

    Δf > 0   for all feasible Δx about x̄

This implies that

    Δxᵀ [H_f − Σ_k v̄_k H_k] Δx > 0

for all Δx satisfying

    ∇h_k(x̄) Δx = 0   for all k = 1, ..., K     (5.54)

We shall now present the formal statements of the second-order necessary and sufficient conditions for an NLP problem involving both equality and inequality constraints.

Theorem 5.6  Second-Order Necessity Theorem

Consider the NLP problem given by Problem P1. Let f, g, and h be twice-differentiable functions, and let x* be feasible for the nonlinear program. Let the active constraint set at x* be I = {j | g_j(x*) = 0}. Furthermore, assume that the gradients ∇g_j(x*) for j ∈ I and ∇h_k(x*) for k = 1, 2, ..., K are linearly independent. Then the necessary conditions for x* to be a local minimum of the NLP problem are as follows:

1. There exists (u*, v*) such that (x*, u*, v*) is a Kuhn-Tucker point.
2. For every (1 × N) vector y satisfying

       ∇g_j(x*) y = 0   for j ∈ I     (5.55)
       ∇h_k(x*) y = 0   for k = 1, 2, ..., K     (5.56)

   it follows that

       yᵀ H_L(x*, u*, v*) y ≥ 0     (5.57)

   where

       L(x, u, v) = f(x) − Σ_{j=1}^{J} u_j g_j(x) − Σ_{k=1}^{K} v_k h_k(x)

   and H_L(x*, u*, v*) is the Hessian matrix of second partial derivatives of L with respect to x evaluated at (x*, u*, v*).

We shall illustrate Theorem 5.6 with an example in which the first-order necessary conditions are satisfied while the second-order conditions show that the point is not optimal.

Example 5.7 [5]

    Minimize f(x) = (x₁ − 1)² + x₂²
    Subject to g₁(x) = −x₁ + x₂² ≥ 0

Suppose we want to verify whether x* = (0, 0) is optimal.

Solution

    ∇f(x) = [2(x₁ − 1), 2x₂]      ∇g₁(x) = (−1, 2x₂)      I = {1}

Since ∇g₁(x*) = (−1, 0) is linearly independent, the constraint qualification is satisfied at x*. The first-order KTCs are given by

    2(x₁ − 1) + u₁ = 0
    2x₂ − 2x₂u₁ = 0
    u₁(−x₁ + x₂²) = 0
    u₁ ≥ 0

Here x* = (0, 0) and u₁* = 2 satisfy the above conditions. Hence (x*, u*) = (0, 0; 2) is a Kuhn-Tucker point, and x* satisfies the first-order necessary conditions of optimality by Theorem 5.1. In other words, we still do not know whether or not (0, 0) is an optimal solution to the NLP problem!

Let us now apply the second-order necessary conditions to test whether (0, 0) is a local minimum of the NLP problem. The first part of Theorem 5.6 is already satisfied, since (x*, u*) = (0, 0; 2) is a Kuhn-Tucker point. To verify the second-order conditions, compute

    H_L(x, u) = | 2      0      |
                | 0   2 − 2u₁   |

At (x*, u*),

    H_L(x*, u*) = | 2    0 |
                  | 0   −2 |

We have to verify whether

    yᵀ H_L(x*, u*) y ≥ 0

for all y = (y₁, y₂) satisfying

    ∇g₁(x*) y = (−1, 0)(y₁, y₂)ᵀ = 0

In other words, we need to consider only vectors of the form (0, y₂) to satisfy Eq. (5.57). Now,

    (0, y₂) | 2    0 | (0, y₂)ᵀ = −2y₂² < 0   for all y₂ ≠ 0
            | 0   −2 |

Thus, x* = (0, 0) does not satisfy the second-order necessary conditions, and hence it is not a local minimum of the NLP problem.

Sufficient Conditions. When a point satisfies the second-order necessary conditions given by Theorem 5.6, it becomes a candidate for a local minimum. To show that it is in fact a minimum point, we need the second-order sufficient conditions. Of course, when a nonlinear program satisfies the assumptions of Theorem 5.2 (the Kuhn-Tucker sufficiency theorem), a Kuhn-Tucker point is automatically the global minimum. However, Theorem 5.2 requires that the objective function be convex, the inequality constraints concave, and the equality constraints linear. These assumptions are too rigid and are often not satisfied in practice. In such situations, the second-order sufficiency conditions may be helpful in showing that a Kuhn-Tucker point is a local minimum.

Theorem 5.7  Second-Order Sufficiency Theorem

Sufficient conditions for a point x* to be a strict local minimum of the NLP problem P1, where f, g, and h are twice-differentiable functions, are as follows:

(i) There exists (u*, v*) such that (x*, u*, v*) is a Kuhn-Tucker point.
(ii) For every nonzero (1 × N) vector y satisfying

        ∇g_j(x*) y = 0   for j ∈ I₁ = {j | g_j(x*) = 0, u_j* > 0}     (5.58)
        ∇g_j(x*) y ≥ 0   for j ∈ I₂ = {j | g_j(x*) = 0, u_j* = 0}     (5.59)
        ∇h_k(x*) y = 0   for k = 1, 2, ..., K     (5.60)
        y ≠ 0

     it follows that

        yᵀ H_L(x*, u*, v*) y > 0     (5.61)

Note: I₁ ∪ I₂ = I, the set of all active constraints at x*.

Comparing Theorems 5.6 and 5.7, it is clear that the sufficient conditions add very few new restrictions to the necessary conditions, and no additional assumptions about the properties of the functions are needed. The minor
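The restricted-curvature test of Example 5.7 can be automated: form H_L, compute a basis for the directions allowed by Eqs. (5.55)-(5.56), and inspect yᵀH_L y on that basis. A sketch, assuming the problem data as reconstructed here (f = (x₁ − 1)² + x₂², g₁ = −x₁ + x₂² ≥ 0, with x* = (0, 0), u₁* = 2):

```python
import numpy as np

# Second-order necessary check at the Kuhn-Tucker point of Example 5.7:
# H_L = H_f - u* H_g; test y^T H_L y >= 0 on {y : grad g1(x*) y = 0}.
H_f = np.diag([2.0, 2.0])          # Hessian of (x1 - 1)^2 + x2^2
H_g = np.diag([0.0, 2.0])          # Hessian of g1 = -x1 + x2^2
u_star = 2.0
H_L = H_f - u_star * H_g           # = diag(2, -2)

grad_g = np.array([[-1.0, 0.0]])   # Jacobian of the active constraint at x*
# Null space of the active-constraint Jacobian via SVD
_, s, Vt = np.linalg.svd(grad_g)
null_basis = Vt[len(s):]           # rows spanning {y : grad_g @ y = 0}

# Here the null space is spanned by y = (0, 1)
for y in null_basis:
    curvature = y @ H_L @ y        # = -2

# Negative restricted curvature: condition (5.57) fails, so x* is not
# a local minimum even though it is a Kuhn-Tucker point.
assert curvature < 0
```

The same skeleton applies to Theorem 5.7 by restricting the basis to the directions of Eqs. (5.58)-(5.60) and requiring strict positivity instead.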

changes are that Eq. (5.55) need not be satisfied for all active constraints and that inequality (5.57) has to be satisfied as a strict inequality.

Remark

The restrictions on the vector y given by Eqs. (5.58) and (5.59) use information about the multipliers u*. Han and Mangasarian [6] have given an equivalent form that avoids the Kuhn-Tucker multipliers u*. They prove that Eqs. (5.58) and (5.59) are equivalent to the following set of conditions:

    ∇f(x*) y ≤ 0      ∇g_j(x*) y ≥ 0   for j ∈ I = {j | g_j(x*) = 0}

We now illustrate the use of the second-order sufficient conditions.

Example 5.8

    Minimize f(x) = (x₁ − 1)² + x₂²
    Subject to g₁(x) = −x₁ + (1/5)x₂² ≥ 0

This problem is very similar to the one given in Example 5.7. Suppose we want to verify whether x* = (0, 0) is a local minimum. Note that the region S = {x | g₁(x) ≥ 0} is not a convex set. For example, x = (0.2, 1) and (0.2, −1) are feasible, but their midpoint (0.2, 0) is not. The KTCs for this problem are

    2(x₁ − 1) + u₁ = 0
    2x₂ − (2/5)x₂u₁ = 0
    u₁(−x₁ + (1/5)x₂²) = 0
    u₁ ≥ 0

Here x* = (0, 0), u₁* = 2 satisfies the KTCs. Using Theorem 5.1, we can conclude that x* = (0, 0) satisfies the necessary conditions for a minimum. But we cannot yet conclude that x* is a local minimum, since the function g₁(x) is convex and hence violates the assumptions of Theorem 5.2 (the Kuhn-Tucker sufficiency theorem). Using the second-order sufficient conditions, we find that

    H_L(x*, u*) = | 2     0  |
                  | 0    1.2 |

The vectors y = (y₁, y₂) satisfying Eqs. (5.58) and (5.59) are of the form (0, y₂), as in Example 5.7. Inequality (5.61) reduces to

    (0, y₂) | 2     0  | (0, y₂)ᵀ = 1.2y₂² > 0   for all y₂ ≠ 0
            | 0    1.2 |

Hence, by Theorem 5.7, x* = (0, 0) is a strict local minimum. Fiacco [7] has extended Theorem 5.7 to give sufficient conditions for a weak (not necessarily strict) minimum as well.

5.8 GENERALIZED LAGRANGE MULTIPLIER METHOD

The usefulness of the Lagrange multiplier method for solving constrained optimization problems is not limited to differentiable functions. Many engineering problems involve discontinuous or nondifferentiable functions to be optimized. Everett [4] generalized the Lagrange multiplier method presented earlier to handle such problems. Consider the NLP problem

    Minimize f(x)
    Subject to g_j(x) ≥ b_j   for j = 1, 2, ..., J
               x ∈ S

where S is a subset of R^N imposing additional restrictions on the variables x (e.g., S may be a discrete set). Everett's generalized Lagrangian function corresponding to this NLP problem is given by

    E(x; λ) = f(x) − Σ_{j=1}^{J} λ_j g_j(x)     (5.62)

where the λ_j's are nonnegative multipliers. Suppose the unconstrained minimum of E(x; λ) over all x ∈ S is attained at the point x̄ for a fixed value of λ. Then Everett [4] proved that x̄ is an optimal solution to the following mathematical program:

    Minimize f(x)
    Subject to g_j(x) ≥ g_j(x̄)   j = 1, ..., J
               x ∈ S

Hence, to solve the original NLP problem, it is sufficient to find nonnegative multipliers λ* (called Everett's multipliers) such that the unconstrained minimum of E(x; λ*) over all x ∈ S occurs at a point x* such that

    g_j(x*) = b_j   for j = 1, ..., J     (5.63)

We call this Everett's condition. Since any of the search methods discussed in Chapter 3 could be used to minimize Eq. (5.62), Everett's method looks computationally attractive. However, for many practical problems Everett's condition is too restrictive, since Eq. (5.63) implies that all the constraints must be active at the optimum. Hence, Everett indicates that if the constraint values g_j(x*) are close to the right-hand-side constants b_j, then x* is a good approximate optimal solution to the given NLP problem.

Thus, the basic steps of Everett's method for finding an approximate optimal solution to the NLP problem are as follows:

Step 1. Choose an initial set of nonnegative multipliers λ_j^(1) for j = 1, ..., J.
Step 2. Find the unconstrained minimum of E(x; λ^(1)) over x ∈ S by any direct-search method. Let the unconstrained minimum occur at x = x^(1).
Step 3. Compute g_j(x^(1)) for j = 1, ..., J. If the values of g_j(x^(1)) are close to b_j for all j (e.g., within a specified error tolerance), then terminate. Otherwise, update the multipliers to a new set of values λ^(2) and repeat the optimization process.

Everett also provided a rule for updating the multipliers systematically in step 3. Recall that the multipliers λ_j can be interpreted as shadow prices. In other words, if g_j(x) ≥ b_j corresponds to the minimum level of production for product j, then λ_j can be interpreted as the break-even price of product j. Hence, if λ_j is increased while all the other shadow prices are kept fixed, you would expect to produce and sell more of product j. This is the basis of Everett's theorem, which shows that g_j(x) increases monotonically with λ_j while all other λ's are held fixed.
Theorem 5.8

Let λ^(1) and λ^(2) be nonnegative vectors such that

    λ_i^(2) ≥ λ_i^(1)   and   λ_j^(1) = λ_j^(2)   for all j ≠ i

If x^(1) and x^(2) minimize E(x; λ^(1)) and E(x; λ^(2)), respectively, where E is given by Eq. (5.62), then

    g_i[x^(2)] ≥ g_i[x^(1)]

We shall illustrate Everett's basic algorithm with the following example:

Example 5.9

    Minimize f(x) = x₁² + x₂²
    Subject to g₁(x) = 2x₁ + x₂ ≥ 2

Everett's function is given by

    E(x; λ) = x₁² + x₂² − λ(2x₁ + x₂)

We begin Everett's method with λ^(1) = 0. The unconstrained minimum of E(x; 0) occurs at the point x^(1) = (0, 0). Since g₁(x^(1)) = 0, which is less than 2, we increase λ in order to increase g₁(x). Choose λ^(2) = 1. The unconstrained minimum of E(x; 1) occurs at the point x^(2) = (1, 0.5), and g₁(x^(2)) = 2(1) + 0.5 = 2.5 > 2. Hence, λ has to be decreased to obtain a solution with a smaller constraint value. The remaining steps are shown in Table 5.1. Note that the value of λ in each step is simply the midpoint of the two bracketing λ's, since we know that the optimal λ lies between 0 and 1. Convergence is achieved at step 8.

When several constraints are present, the adjustment of the λ_j's becomes more difficult, since Theorem 5.8 does not hold when several λ_j's are changed simultaneously. Even if we decide to change one λ_j at a time, it is quite possible that the change may affect the other constraint values. This is especially true if there are many dependencies among the constraints. On the other hand, if the constraints are relatively independent, then Everett's method may eventually converge.

Table 5.1  Everett's Method for Example 5.9

Step t    λ^(t)    x^(t) = (x₁^(t), x₂^(t))    g₁(x^(t))    Constraint Violation
1         0        (0, 0)                      0            g₁ < 2
2         1        (1, 0.5)                    2.5          g₁ > 2
3         0.5      (0.5, 0.25)                 1.25         g₁ < 2
4         0.75     (0.75, 0.375)               1.875        g₁ < 2
5         0.88     (0.88, 0.44)                2.2          g₁ > 2
6         0.82     (0.82, 0.41)                2.05         g₁ > 2
7         0.78     (0.78, 0.39)                1.95         g₁ < 2
8         0.8      (0.8, 0.4)                  2.0          converged
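The bisection underlying Table 5.1 can be reproduced in a few lines, using the fact that for Example 5.9 the unconstrained minimizer of E(x; λ) is available in closed form: setting ∇E = (2x₁ − 2λ, 2x₂ − λ) = 0 gives x = (λ, λ/2). (The iteration count and tolerance below are our own choices.)

```python
# Everett's method for Example 5.9, reproducing the bisection of Table 5.1.
def unconstrained_min(lam):
    # argmin of E(x; lam) = x1^2 + x2^2 - lam*(2*x1 + x2)
    return (lam, lam / 2.0)

def g1(x):
    return 2 * x[0] + x[1]

b1 = 2.0
lo, hi = 0.0, 1.0                 # steps 1-2 bracket the optimal multiplier
for _ in range(30):               # step 3: bisect on the multiplier
    lam = 0.5 * (lo + hi)
    x = unconstrained_min(lam)
    if g1(x) < b1:                # constraint under-satisfied: raise lambda
        lo = lam
    else:                         # over-satisfied: lower lambda
        hi = lam

assert abs(lam - 0.8) < 1e-6              # Everett multiplier lambda* = 0.8
assert abs(x[0] - 0.8) < 1e-6 and abs(x[1] - 0.4) < 1e-6
```

The monotonicity that justifies bisecting on λ is exactly Theorem 5.8: here g₁(x_λ) = 2.5λ, which is increasing in λ.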

Improvements to Everett's Method. Brooks and Geoffrion [8] observed that when there are inactive constraints in the NLP problem, Everett's conditions are unnecessarily restrictive: when a multiplier λ_j is zero, it is not necessary to require that the corresponding constraint be satisfied as a strict equality (or an approximate equality). Thus, they weakened Everett's conditions in such a way that when their conditions are satisfied, a global optimum is still guaranteed.

Modified Everett's Conditions

1. x* minimizes E(x; λ*) given by Eq. (5.62) over all x ∈ S, with λ* ≥ 0.
2. For j = 1, ..., J:
       when λ_j* > 0:  g_j(x*) = b_j
       when λ_j* = 0:  g_j(x*) ≥ b_j

If the above conditions are satisfied, then x* is optimal for the NLP problem

    Minimize f(x)
    Subject to g_j(x) ≥ b_j   j = 1, ..., J
               x ∈ S

From Theorem 5.5, it is clear that the modified Everett's conditions are nothing more than the requirement that (x*, λ*) be a saddlepoint of the Lagrangian function

    L(x, λ) = f(x) − Σ_{j=1}^{J} λ_j [g_j(x) − b_j]

Thus, with the modification by Brooks and Geoffrion, Everett's method becomes an iterative technique for constructing a saddlepoint for the KTSP. Of course, the existence of a saddlepoint is guaranteed only under the assumptions given in Theorem 5.4.

In addition to weakening Everett's optimality conditions, Brooks and Geoffrion also gave a method for updating the multipliers required in step 3 of Everett's method. They observed that for an LP problem (i.e., when S is the nonnegative orthant and f and the g_j are linear functions), a solution (x*, λ*) satisfies the modified Everett's conditions if and only if x* is optimal for the given linear program and λ* is optimal for its dual. We know from the results of Chapter 4 that the optimal dual variables are simply the shadow prices of the primal constraints. We also know that the optimal Lagrange multipliers correspond to the shadow prices in the nonlinear program as well.
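For Example 5.9 the modified Everett's conditions can be verified directly at x* = (0.8, 0.4) with λ* = 0.8. The sketch below is only a spot-check (random sampling stands in for a proof of condition 1, and the tolerances are arbitrary):

```python
import numpy as np

# Modified Everett's conditions at the Example 5.9 solution.
lam_star = 0.8
x_star = np.array([0.8, 0.4])

def E(x):
    # E(x; lam*) = f(x) - lam* g1(x)
    return x[0]**2 + x[1]**2 - lam_star * (2 * x[0] + x[1])

# Condition 1: x* minimizes E(.; lam*); spot-check against random points
rng = np.random.default_rng(0)
trials = x_star + rng.uniform(-1, 1, size=(1000, 2))
assert all(E(x) >= E(x_star) - 1e-12 for x in trials)

# Condition 2: lam* > 0, so g1(x*) must equal b1 = 2 exactly
assert abs((2 * x_star[0] + x_star[1]) - 2.0) < 1e-12
```

Here E(·; λ*) is a convex quadratic, so the sampled check agrees with the exact minimizer (0.8, 0.4) obtained by setting ∇E = 0.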
Hence, an approximation to Everett's multipliers λ can be obtained from a linear approximation of the nonlinear program.
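The multiplier-update LP can be sketched with an off-the-shelf solver by solving its dual directly in the variables (σ₀, σ₁, ..., σ_J). Feeding it the first two points of Example 5.9 reproduces the multiplier λ = 0.5 used at step 3 of Table 5.1 (SciPy is assumed available; the data arrays are from that example):

```python
import numpy as np
from scipy.optimize import linprog

# Dual of the Brooks-Geoffrion update LP:
#   maximize  sigma0 + sum_j b_j sigma_j
#   s.t.      sigma0 + sum_j sigma_j g_j(x_i) <= f(x_i),  i = 1..t-1
#             sigma0 free, sigma_j >= 0
# Data from Example 5.9: x1 = (0, 0), x2 = (1, 0.5);
# f = x1^2 + x2^2, g1 = 2*x1 + x2, b1 = 2.
f_vals = np.array([0.0, 1.25])
g_vals = np.array([[0.0], [2.5]])   # one column per constraint
b = np.array([2.0])

c = -np.concatenate(([1.0], b))     # linprog minimizes, so negate
A_ub = np.hstack([np.ones((len(f_vals), 1)), g_vals])
res = linprog(c, A_ub=A_ub, b_ub=f_vals,
              bounds=[(None, None)] + [(0, None)] * len(b), method="highs")

new_lambda = res.x[1:]              # updated Everett multipliers
assert res.status == 0
assert abs(new_lambda[0] - 0.5) < 1e-9   # matches step 3 of Table 5.1
```

Each later step only appends one constraint row, which is why the text recommends the dual simplex method for warm-starting; the sketch above simply re-solves from scratch.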

Brooks and Geoffrion's Updating Scheme. Suppose we have the solutions x^(1), x^(2), ..., x^(t−1) corresponding to the unconstrained minima of E(x; λ) at λ^(1), λ^(2), ..., λ^(t−1). To determine the updated values of the multipliers λ^(t) for use in step t, we solve the following linear program in the variables y_i:

    Minimize   Σ_{i=1}^{t−1} f(x^(i)) y_i
    Subject to Σ_{i=1}^{t−1} y_i = 1
               Σ_{i=1}^{t−1} g_j(x^(i)) y_i ≥ b_j   for j = 1, ..., J
               y_i ≥ 0

The optimal values of the dual variables λ_j^(t) corresponding to the inequality constraints (j = 1, ..., J) become the new multipliers needed for step t. Note from the duality theory of linear programming that the dual variables λ_j^(t) will automatically be nonnegative.

Since we are interested only in the optimal dual solution to the above LP problem (and are not concerned with the optimal y_i values), one could simply solve the dual LP problem directly:

    Maximize   σ₀ + Σ_{j=1}^{J} b_j σ_j
    Subject to σ₀ + Σ_{j=1}^{J} g_j(x^(i)) σ_j ≤ f(x^(i))   for i = 1, 2, ..., t − 1
               σ₀ unrestricted in sign
               σ_j ≥ 0   for j = 1, ..., J

Note that the dual LP problem to be solved at step t + 1 in order to determine λ^(t+1) has just one additional constraint. Hence, the dual simplex method described in Chapter 4 can be used to determine the new dual optimal solution without re-solving the problem. Of course, the algorithm terminates when we find a λ* such that (x*, λ*) satisfies the modified Everett's conditions.

5.9 GENERALIZATION OF CONVEX FUNCTIONS

In Section 5.5 we developed the KTCs for NLP problems. When the sufficiency conditions of Theorem 5.2 hold, a Kuhn-Tucker point automatically

becomes a global minimum of the NLP problem. The sufficiency conditions require that the greater-than-or-equal-to type inequality constraints be concave functions, the equality constraints be linear, and the objective function to be minimized be convex. Note that the presence of one nonlinear equality constraint is enough to violate the sufficiency conditions.

The sufficiency conditions of Theorem 5.2 have been generalized further to nonconcave inequality constraints, nonconvex objective functions, and nonlinear equality constraints. They use generalizations of convex functions such as pseudoconvex and quasi-convex functions.

Definition: Pseudoconvex Function

A differentiable function f(x) defined on an open convex set S is pseudoconvex on S if and only if, for all x^(1), x^(2) ∈ S,

    ∇f(x^(1))(x^(2) − x^(1)) ≥ 0   implies   f(x^(2)) ≥ f(x^(1))

Remarks

1. f(x) is pseudoconcave if −f(x) is pseudoconvex.
2. Every convex function is also pseudoconvex, but a pseudoconvex function may not be convex.

Figures 5.2 and 5.3 illustrate pseudoconvex and pseudoconcave functions.

Figure 5.2. Pseudoconvex function.

Definition: Strictly Quasi-Convex Function

A function f(x) defined on a convex set S is strictly quasi-convex on S if and only if

Figure 5.3. Pseudoconcave function.

    f(λx^(1) + (1 − λ)x^(2)) < max[f(x^(1)), f(x^(2))]

for all x^(1), x^(2) ∈ S with f(x^(1)) ≠ f(x^(2)) and all 0 < λ < 1.

Definition: Quasi-Convex Function

A function f(x) defined on a convex set S is quasi-convex on S if and only if

    f(λx^(1) + (1 − λ)x^(2)) ≤ max[f(x^(1)), f(x^(2))]

for all x^(1), x^(2) ∈ S and 0 ≤ λ ≤ 1.

Figures 5.4 and 5.5 illustrate quasi-convex and strictly quasi-convex functions.

Figure 5.4. Quasi-convex function.
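The quasi-convexity definition can be spot-checked numerically by sampling pairs of points and convex combinations; a violation is a certificate that the property fails, while success on samples is only suggestive. The two sample functions below are our own illustrations, not from the text:

```python
import numpy as np

# Sampling-based check of f(l*x1 + (1-l)*x2) <= max(f(x1), f(x2)).
def is_quasi_convex_on_samples(f, points, lambdas):
    for x1 in points:
        for x2 in points:
            for l in lambdas:
                xm = l * x1 + (1 - l) * x2
                if f(xm) > max(f(x1), f(x2)) + 1e-12:
                    return False   # definition violated at (x1, x2, l)
    return True

pts = np.linspace(-3, 3, 61)
lams = np.linspace(0, 1, 21)

# sqrt(|x|) is quasi-convex (its sublevel sets are intervals), not convex
assert is_quasi_convex_on_samples(lambda x: np.sqrt(abs(x)), pts, lams)

# x - x^3 has a local maximum at x = 1/sqrt(3), so it is not quasi-convex
# on [-3, 3]; the sampler finds a violating triple
assert not is_quasi_convex_on_samples(lambda x: x - x**3, pts, lams)
```

Replacing `max` with `min` and reversing the inequality gives the analogous check for quasi-concavity.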

Figure 5.5. Strictly quasi-convex function.

Remarks

1. A pseudoconvex function is also a quasi-convex function. But a quasi-convex function may not be pseudoconvex.
2. A strictly quasi-convex function need not be quasi-convex unless f(x) is assumed to be continuous on the convex set S.

Example 5.10

    f(x) = x       for x ≤ 0
           0       for 0 ≤ x ≤ 1
           x − 1   for x ≥ 1

This function is both quasi-convex and quasi-concave, but it is neither strictly quasi-convex nor strictly quasi-concave.

Example 5.11

    f(x) = x³   for all real x

This function is both strictly quasi-convex and strictly quasi-concave, but it is neither pseudoconvex nor pseudoconcave because of the inflection point at x = 0.

Theorem 5.9

Let f be a pseudoconvex function defined on an (open) convex set S. If ∇f(x̄) = 0, then x̄ minimizes f(x) over all x ∈ S.
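The claim of Example 5.11 about f(x) = x³ can be exhibited with a single pair of points: the premise of the pseudoconvexity implication holds at x^(1) = 0 (where the derivative vanishes) while the conclusion fails. A small sketch:

```python
# Example 5.11 revisited numerically: f(x) = x^3 is strictly increasing,
# hence strictly quasi-convex, but it is not pseudoconvex.
def f(x):
    return x**3

def df(x):
    return 3 * x**2

x1, x2 = 0.0, -1.0
# Pseudoconvexity would require: df(x1)*(x2 - x1) >= 0  =>  f(x2) >= f(x1)
assert df(x1) * (x2 - x1) >= 0   # premise holds (derivative is 0 at x1 = 0)
assert f(x2) < f(x1)             # conclusion fails: f is not pseudoconvex
```

This is also consistent with Theorem 5.9: df(0) = 0 yet x = 0 does not minimize x³, which is no contradiction precisely because x³ is not pseudoconvex.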

Theorem 5.10

Let f be a strictly quasi-convex function defined on a convex set S. Then a local minimum of f is also a global minimum.

Remarks

1. For proofs of Theorems 5.9 and 5.10, the reader is referred to Mangasarian [2].
2. Theorem 5.9 need not be true for strictly quasi-convex and quasi-convex functions.
3. Theorem 5.10 holds for pseudoconvex functions also.
4. Theorem 5.10 need not hold for quasi-convex functions.
5. Every nonnegative linear combination of convex functions is also a convex function. This is not true in general for pseudoconvex, strictly quasi-convex, and quasi-convex functions.

Theorem 5.11  Generalization of the Kuhn-Tucker Sufficient Optimality Theorem

Consider the NLP problem

    Minimize f(x)
    Subject to g_j(x) ≥ 0   for j = 1, 2, ..., J
               h_k(x) = 0   for k = 1, 2, ..., K
               x = (x₁, x₂, ..., x_N)

The KTP is as follows: Find x̄, ū, and v̄ such that

    ∇f(x̄) − Σ_j ū_j ∇g_j(x̄) − Σ_k v̄_k ∇h_k(x̄) = 0
    ū_j ≥ 0      v̄_k unrestricted in sign
    ū_j g_j(x̄) = 0   for all j
    g_j(x̄) ≥ 0      h_k(x̄) = 0

Let f(x) be pseudoconvex, the g_j be quasi-concave, and the h_k be both quasi-convex and quasi-concave. If (x̄, ū, v̄) solves the KTP, then x̄ solves the NLP problem. For a proof of Theorem 5.11, the reader is referred to Mangasarian [2] and Bazaraa et al. [1].

5.10 SUMMARY

In this chapter we developed necessary and sufficient conditions of optimality for constrained optimization problems. We began with a discussion of the Lagrangian optimality conditions for problems with equality constraints. These were then extended to inequality constraints in the form of the Kuhn-Tucker optimality conditions, which are first-order conditions involving the gradients of the objective function and the constraints.

We learned that the Kuhn-Tucker conditions are necessary when the functions are differentiable and the constraints satisfy a regularity condition known as the constraint qualification. The Kuhn-Tucker conditions become sufficient for a global minimum when the objective function is convex, the inequality constraints are concave functions, and the equality constraints are linear. We also discussed saddlepoint optimality conditions that are applicable when the functions are not differentiable.

Since there could be several points satisfying the Kuhn-Tucker necessary conditions, we developed second-order necessary conditions that must be satisfied for a point to be a local minimum. Similarly, the assumptions under which the Kuhn-Tucker sufficiency conditions hold are quite rigid, and hence second-order sufficiency conditions were developed that do not require convexity of the functions or linearity of the equality constraints. Both the necessary and the sufficient second-order conditions impose additional restrictions over and above those given by Kuhn and Tucker and hence can be useful in reducing the set of candidate optima.

REFERENCES

1. Bazaraa, M. S., D. Sherali, and C. M. Shetty, Nonlinear Programming: Theory and Algorithms, 2nd ed., Wiley, New York, 1993.
2. Mangasarian, O. L., Nonlinear Programming, McGraw-Hill, New York, 1969.
3. Lasdon, L. S., Optimization Theory for Large Systems, Macmillan, New York, 1979.
4.
Everett, H., "Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources," Oper. Res., 11, 399-417 (1963).
5. McCormick, G. P., "Second Order Conditions for Constrained Optima," SIAM J. Appl. Math., 15, 641-652 (1967).
6. Han, S. P., and O. L. Mangasarian, "Exact Penalty Functions in Nonlinear Programming," Math. Programming, 17, 251-269 (1979).
7. Fiacco, A. V., "Second Order Sufficient Conditions for Weak and Strict Constrained Minima," SIAM J. Appl. Math., 16, 105-108 (1968).
8. Brooks, R., and A. Geoffrion, "Finding Everett's Lagrange Multipliers by Linear Programming," Oper. Res., 14, 1149-1153 (1966).