# Optimality Conditions

Save this PDF as:
Size: px
Start display at page:

## Transcription

1 Chapter 2 Optimality Conditions 2.1 Global and Local Minima for Unconstrained Problems When a minimization problem does not have any constraints, the problem is to find the minimum of the objective function. We distinguish between two kinds of minima. The global minimum is the minimum of the function over the entire domain of interest, while a local minimum is the minimum over a smaller subdomain. The design point where the objective function f reaches a minimum is called the minimizer and is denoted by an asterisk as x. The global minimizer is the point x which satisfies f(x ) f(x). (2.1.1) A point x is a local minimizer if you for some r > 0 f(x ) f(x). if x x < r. (2.1.2) That is, x is a local minimizer, if you can find a sphere around it in which it is the minimizer. 2.2 Taylor series Expansion To find conditions for a point to be a minimizer, we use the Taylor series expansion around a presumed minimizer x n f(x) = f(x )+ (x i x i ) (x )+ 1 n n (x i x i )(x k x x i 2 k) (x )+higher order terms x k=1 i x k (2.2.1) When x is very close to x, we can neglect even the second derivative terms. Furthermore, if x and x are identical except for the jth component, then Eq becomes f(x) f(x ) + (x j x j) x j (x ). (2.2.2) Then, for Eq to hold for both positive and negative values of (x j x j ), we must have the familiar first order condition x j (x ) = 0,, j = 1,..., n (2.2.3) 15

2 CHAPTER 2. OPTIMALITY CONDITIONS Equation is not applicable only to a minimum; the same condition is derived in the same manner for a maximum. It is called a stationarity condition, and a point satisfying it is called a stationary point. To obtain conditions specific to a minimizer, we need to use the second derivative terms in the Taylor expansion. However, to facilitate working with these terms, we would like to rewrite this expansion in matrix notation, which is more compact. We define the gradient vector to be the vector whose components are the first derivatives, and denote it as f. That is, f = x 1 x 2... x n (2.2.4) Similarly, we define the Hessian (after the German mathematician) Otto Ludwig Hesse ( )) matrix H, to be the symmetric matrix of second derivatives. That is, h ij, the element of the matrix H at the ith row and jth column is 2 x i x j, H = x 2 1 x 2 x 1 x 1 x 2... x x nx 1 x nx 2... x 1 x n x 2 x n x 2 n. (2.2.5) Finally, we define x = x x, and then we can write the Taylor expansion, Eq as f(x) = f(x ) + x T f(x ) xt H(x ) x + higher order terms (2.2.6) For a minimum, the gradient vector is zero, so that if x T H(x ) x > 0 (2.2.7) then for small enough x, the higher order terms can be neglected, and we will be assured that x is at least a local minimizer of f. A symmetric matrix which satisfies such positivity condition for any vector x is called positive definite. It can be shown that a matrix is positive definite if and only if all of its eigenvalues are positive. In one dimension, the condition of positive definiteness reduces to the condition that the second derivative is positive, indicating a positive curvature or a convex function. In two dimensions, a positive definite Hessian means that the curvature is positive in any direction, or again that f is convex. The condition that H is positive definite is a sufficient condition for a local minimizer, but it is not a necessary condition. It can be shown that the necessary condition is a slightly milder version of Eq x T H(x ) x 0. (2.2.8) With this condition, the second-order terms in the Taylor series can be zero, and then the higherorder terms determine whether the point is a minimizer or not. In one dimension, the condition 16

3 2.2. TAYLOR SERIES EXPANSION is that the second derivative, or the curvature, is non-negative, and a zero second derivative implies that the minimum is decided on the basis of higher derivatives. For example, f = x 4 has zero first and second derivatives at x = 0, and this point is a minimum. However, f = x 4 satisfies the same conditions, but has a maximum at that point. Finally, f = x 3 satisfies this condition, and it has an inflection point. A matrix H which satisfies Eq for every vector x is called positive semi-definite, and all of its eigenvalues are non-negative (but some may be zero). Since minimizing f is equivalent to maximizing f, it follows that at a stationary point if H is positive definite, we have sufficient conditions for a local maximizer. Such a matrix is called negative definite and all of its eigenvalues are negative. Similarly, if H is positive semi-definite, we call H negative semi-definite, and the matrix has eigenvalues which are all non-positive. In one dimension the second derivative can be positive, negative, or zero. corresponding to a minimum, maximum, or uncertainty about the nature of the stationary point. In higher dimensions we have the 4 possibilities that we have already listed, plus the possibility that the some of the eigenvalues of the matrix are positive and some are negative. A matrix with both positive and negative eigenvalues is called indefinite. In two dimensions, a function with an indefinite matrix will have one positive eigenvalue and one negative eigenvalue. These two eigenvalues will correspond to the extreme values of the curvature of the function in different directions. The two directions will be perpendicular to each other, and with one direction having positive curvature and the other having negative curvature, the function will locally look like a saddle. Therefore, a stationary point with an indefinite Hessian is called a saddle point. To summarize, at a stationary point we need to inspect the Hessian. If it is positive definite we have a minimum. If it is positive semi-definite we may have a minimum, an inflection point or a saddle point. If the Hessian is indefinite we have a saddle point. If it is negative definite we must have a maximum, while if it is negative semi-definite we may have a maximum or an inflection point or a saddle point. Example Find the stationary point of the quadratic function f = x x rx 1 x 2 2x 1. Determine the nature of the stationary point for all possible values of r. The stationary point is found by setting the first derivatives to zero For any value of r beside r = 2, the solution is The Hessian is x 1 = 2x 1 + rx 2 2 = 0 (2.2.9) x 2 = rx 1 + 2x 2 = 0 (2.2.10) x 1 = 2 4 r 2, x 2 = 2r 4 r 2. (2.2.11) H = [ ] 2 r. (2.2.12) r 2 The eigenvalues µ of H are found be setting the determinant of H µi, where I is the identity matrix H µi = 2 µ r r 2 µ = (2 µ)2 r 2 = 0. (2.2.13) 17

4 CHAPTER 2. OPTIMALITY CONDITIONS f x Figure 2.1: Function with several local minima So that the two eigenvalues are µ 1,2 = 2 ± r. For r < 2 we have two positive eigenvalues so that H is positive definite and the stationary point is a minimum. For r > 2, we have one positive and one negative eigenvalue, so that H is indefinite and the stationary point is a saddle point. For r = 2 one of the eigenvalues is zero so that the matrix is positive semidefinite. However, we do not have a stationary point for r = Condition for global minimum Up to now we discussed the condition for a local minimizer. A function can have several local minima, as shown in the Fig. 2.1, and have a positive definite Hessian at each one of the minima. Therefore, when we find a stationary point where the Hessian is positive definite, we usually cannot tell whether the point is a local minimizer or a global one. However, if the function is convex everywhere it can be shown that any local minimizer is also a global minimizer. If f is twice differentiable everywhere, then convexity just means that the Hessian is positive semidefinite everywhere. However, it is useful to define convex functions even if they do not 18

5 2.4. NECESSARY CONDITIONS FOR CONSTRAINED LOCAL MINIMIZER posses second derivatives. For example, the function f = x is convex looking and has a global minimum at x = 0. Similarly, we can replace a smooth convex function by short straight segments, see Fig. 2.2, we would still have a convex looking function. Therefore, the definition of convex function is generalized to include these cases, and it can be shown that the proper definition is the following: A function f(x) is defined to be convex if for every two points x 1 and x 2 and every scalar 0 < α < 1 f[αx 1 + (1 α)x 2 ] αf(x 1 ) + (1 α)f(x 2 ). (2.3.1) Equation simply states that if we connect any two points on the surface of the function by a straight segment, the segment will lie above the function. If we can have strict inequality in Eq. /refeq:convex, then the function f is strictly convex. A local minimum of a convex function can be shown to be also the global minimum, or the only minimum of the function. If the function is strictly convex, the minimizer is a single point, while a more general convex function can have the minimum be attained in entire region of design space. It is usually not convenient to check on the convexity of a function from the definition of a convex function. If the function is C 2 (twice continuously differentiable), then convexity is assured if the Hessian of the function is positive semi-definite everywhere, and strict convexity is assured if the function is positive definite everywhere. One implication of this result is that the minimum of a quadratic function is the global minimum, because if a quadratic function is positive definite at one point, it is positive definite everywhere (the Hessian is a constant matrix). 2.4 Necessary conditions for constrained local minimizer We will consider first the case when we have only a single equality constraint minimize f(x), such that h(x) = 0, (2.4.1) (2.4.2) In this case, if we want to check whether a candidate point x is a minimizer, we can no longer compare it with any neighboring point. Instead we have to limit ourselves only to points that satisfy the equality constraint. That is, instead of Eq the condition should be f(x ) f(x). if x x < r, and h(x) = 0, (2.4.3) (2.4.4) We can obtain a first order optimality condition by considering a point x which is infinitesimally close to x, that is x = x + dx. Then df = f(x) f(x ) = n x i dx i. (2.4.5) 19

6 CHAPTER 2. OPTIMALITY CONDITIONS f x Figure 2.2: Convex function 20

7 2.4. NECESSARY CONDITIONS FOR CONSTRAINED LOCAL MINIMIZER Similarly, with h(x) h(x ) = dh = 0 we have dh = n h x i dx i = 0. (2.4.6) This equation indicates that we cannot choose the components of dx independently. Rather, we can express one of them, say the jth one, in terms of the other n 1, provided that h/ x j 0. That is dx j = 1 h dx i. (2.4.7) h/ x j x i We will therefore assume that we can choose all the components of dx arbitrarily except for the jth one. We now calculate df + λdh where λ is any number. Because dh = 0 we have df = df + λdh = n i j ( + λ h ) dx i (2.4.8) x i x i We now choose λ to eliminate the contribution of the dependent component dx j, that is + λ h = 0. (2.4.9) x j x j Then in Eq if we choose the dx k 0 and dx i = 0 for i k and i j then ( df = + λ h ) dx k. (2.4.10) x k x k Then, because dx k can be either positive or negative, to insure that df 0 we must have + λ h = 0 (2.4.11) x k x k This applies to every k j, but for j we have Eq , which is also the same. These equations are usually made to look the same as the conditions for an unconstrained minimization by defining a new function L called the Lagrangian L = f + λh, (2.4.12) and call λ a Lagrange multiplier. Then the condition for optimality is that we can find a Lagrange multiplier that satisfies the following equation L x i = x i + λ h x i = 0,, i = 1,..., n. (2.4.13) As in the case of unconstrained minimization, Eq is only a stationarity condition, which applies equally well to minima and maxima. Equations together with the condition h = 0 can be viewed as n + 1 equations for the components of the minimizer x and the Lagrange multiplier λ. However, we usually calculate the minimizer by some graphical or numerical method, and then use Eq to check whether we are at the minimum. When we have only a single design variable, we can always find a Lagrange multiplier that will satisfy Eq , so 21

8 CHAPTER 2. OPTIMALITY CONDITIONS that it does not seem that the optimality condition is much of a condition. However, indeed, with a single design variable and a single constraint, we are not left with any freedom to optimize. If we have a point that satisfies the constraint, in most cases we will not be able to move the single design variable from this point without violating the constraint. When we have multiple design variables, then Eq will give us a different equation for each design variable. At the optimum point all of these equations should give us the same value of λ. Another useful function of the Lagrangian is to estimate the effect of changing the data in a problem on the optimum design. That is, consider a situation where the objective function f and the constraint h are a function of a parameter p, then the minimizer x and the minimum value of the objective function f will depend on p, f = f(x, p), h = h(x, p), x = x (p), f (p) = f(x, p). (2.4.14) It can be shown that the derivative of f with respect to p is equal to the partial derivative of the Lagrangian with respect to p, df dp = L p (x, p). (2.4.15) Example For the soft-drink can design, Example1.2.1, assume that l, the design variable which decides the aspect ratio constraint on the can, is an outside parameter rather than a design variable. Further assume that the aspect ratio condition is imposed as an equality constraint, with H = 2D for l = 0 and H = 2.2D for l = 1. Check that for l = 0, the maximum profit per ounce, p o is achieved for D = 2.04 in. Calculate the derivative of p o with respect to l, and use it to estimate the effect on the profit of changing l from 0 to 1. Check your answer. The equality constraint dictates that H = 4.08 in. and V o = ounces. Because this design point is not at the limits of either the dimensions or the volume, we can reformulate the design problem as minimize such that p o = C P h = 1 V o H ( l)D = 0. (2.4.16) For convenience the equations for the cost, C, price, P, and volume in ounces V o are copied from Example C = 0.8V o + 0.1S, P = 2.5V o 0.02V 2 o + 5l. (2.4.17) where V o = 0.25πD 2 H/1.8, S = 2(0.25πD 2 ) + πdh, (2.4.18) For the candidate point, these equations give us p o = cents per ounce. In order to check the optimality condition, Eq , it may be tempting to substitute the expressions for the cost, price, volume and area into the objective function. However, differentiating the complicated expression that will result from the substitution is difficult. Instead, it is more convenient to differentiate the individual expressions as shown below. The Lagrangian function is given as L = C P V o + λ [ ] H 1 ( l)D, (2.4.19) 22

9 2.4. NECESSARY CONDITIONS FOR CONSTRAINED LOCAL MINIMIZER and the optimality condition, Eq is obtained by differentiating the Lagrangian with respect to the two design variables L D = ( 1 C V o D P D L H = ( 1 C V o H P H ) C P V 2 o ) C P V 2 o V o D + λ V o H H (2+0.2l)D, 2 λ (2+0.2l)D. (2.4.20) To evaluate the derivatives appearing in the optimality conditions we differentiate the equations for the cost C D = 0.8 V o S D D, C H = 0.8 V o S H H, (2.4.21) and the price P D = ( V o) V o D, Finally, we differentiate the equations for V o V o D = 0.5πDH/1.8, P H = ( V o) V o H. (2.4.22) V o H = 0.25πD2 /1.8, (2.4.23) and S S S = πd + πh, = πd. (2.4.24) D H Now we can substitute the values of D and H at the candidate optimum into the above equations in reverse order. First we evaluate the derivatives of the volume and the area. For D = 2.04 in. and H = 4.08 in. we get V o D = oz/in, V o H = oz/in, S S = in, D Next we evaluate the derivatives of the price from Eq P P = cents/in, D and the derivatives of the cost from Eq C C = cents/in, D H H H Finally, we can substitute these values into the optimality conditions, Eq L D L H = in. (2.4.25) = cents/in, (2.4.26) = cents/in. (2.4.27) = λ = 0, (2.4.28) = λ = 0. (2.4.29) The first equation gives us a value of λ = , while the other gives us λ = The difference reflects the fact that the candidate point, which we found by graphical optimization, is not exactly the optimum. However, the closeness of the two values indicates that we are close to the optimum design. For this problem we can check the exact optimum by substituting H = 2D from the constraint into the objective function. After some algebra we get p o = D D2. (2.4.30) Setting the derivative of p o to zero we get D = in., so that H = 4.072, and p o = cents per ounce. That is, the small change in the design variables did not change the objective function 23

10 CHAPTER 2. OPTIMALITY CONDITIONS even in the sixth significant digit. This insensitivity of the value of the optimum to a small error in the design variables is extreme. However, in most problems the value of the optimum objective is not very sensitive to such small errors in the design variables. Also, in most problems we do not have the luxury of solving the optimization problem analytically, and so we will continue with the slightly inaccurate candidate optimum, and take an average value of the Lagrange multiplier, λ to estimate the effect of changing the aspect ratio constraint. From Eq we have dp o dl = df = L = 1 P dl l V o l λ h l = (2.4.31) 5 0.2H λ V o ( l) 2 = 0.67 cents,. D (2.4.32) Note that the second term, involving the Lagrange multiplier, is only cents per ounce. That is, most of the change in profit comes from the 5 cents higher price that can be charged for the taller can, and the decrease in efficiency due to the higher aspect ratio has only a small effect. To predict the change in profit per ounce we can use the derivative p o(l = 1) = p o(l = 0) + dp o dl = = 1.78 cents/ounce. (2.4.33) To check this prediction, we can again use the aspect ratio constraint, H = 2.2D, and obtain the optimum can corresponding to this constraint. With some more algebra we get the following expression for the profit per ounce p o = D D D 3. (2.4.34) This time we cannot find a real solution that maximizes p o. The last term, corresponding to the 5 cents extra that a taller can commands, dominates. With this extra charge, it pays to make the can as small as possible, so that the additional 5 cents will have large effect on the profit per volume. We assume then that we will go to the smallest can possible, with V o = 5 oz., and calculate the dimensions of D = in., H = corresponding to p o = cents per ounce. This is substantially higher than the predicted value of cents. The discrepancy is due to the fact that Eq is only a linear extrapolation. Next we consider the case of multiple equality constraints. written as minimize f(x), The optimization problem is such that h i (x) = 0, i = 1,..., n e. (2.4.35) For this case we define Lagrange multipliers for each constraint, λ i, i = 1,..., n e, and a Lagrangian function L given as n e L = f + λ i h i. (2.4.36) Then, if the gradients of the constraints are linearly independent at a point x, it can be shown that a necessary condition for an optimum is that the derivatives of the Lagrangian with respect to all the design variables are zero, that is 24 x j (x ) + n e λ i h i x j = 0, j = 1,..., n (2.4.37)

11 2.4. NECESSARY CONDITIONS FOR CONSTRAINED LOCAL MINIMIZER This is again a stationarity condition, equally applicable to a minimum or a maximum. As in the case of a single constraint, the derivative of the Lagrangian with respect to a parameter is equal to the derivative of the optimum objective function with respect to that parameter. That is, if the objective function and constraints depend on a parameter p, f = f(x, p), and h i = h i (x, p), Eq applies. When we have inequality constraints the treatment remains very similar. Consider the standard format minimize f(x), such that h i (x) = 0, i = 1,..., n e, (2.4.38) g j (x) 0, j = 1,..., n g, (2.4.39) where any upper and lower limits on the design variables are assumed here to be included in the inequality constraints. We now construct a Lagrangian as n e L = f + λ i h i + ne+ng i=n e +1 λ i g i ne. (2.4.40) The conditions for a stationary point are again that the derivatives of the Lagrangian with respect to all the design variables are equal to zero. Again, if the objective function and constraints depend on a parameter p, then the derivative of the optimum objective function f with respect to that parameter, are given by Eq That is df dp = ne p + ng h i λ i p + g i λ i+ne p. (2.4.41) From Eq it is seen that the Lagrange multipliers measure the sensitivity of the objective function to changes in the constraints. For these reasons, the Lagrange multipliers are sometimes called shadow prices. It should also be clear that when a constraint is not active, it should not contribute anything to the derivative of the optimum objective with respect to the parameter. The reason is that for small changes in p, the constraint will remain inactive, so that the solution of the optimization problem will not depend at all on this constraint. Furthermore, consider the case when the parameter p enters an inequality constraints in the form of g i (x, p) = ḡ(x) + p 0, (2.4.42) so that g i / p = 1. In this case, it is clear that as p increases, it becomes more difficult to satisfy the constraint, so that the optimum value of the objective function could become worse but not improve. That is, df /dp 0. From Eq this indicates that λ i has to be positive. Altogether, for a problem with inequality constraints, we have two other conditions besides the vanishing of the derivatives of the Lagrangian. The Lagrange multipliers associated with inactive constraints have to be zero, and the Lagrange multipliers associated with active inequality constraints have to be positive. These conditions are summarized as x k + n e λ i h i x k + ng λ i+ne g i x k = 0,, k = 1,..., n g i λ i+ne = 0, i = 1,..., n g, (2.4.43) λ i+ne 0, i = 1,..., n g. 25

12 CHAPTER 2. OPTIMALITY CONDITIONS The second equation, which stipulates that the product of the inequality constraint times its Lagrange multiplier is equal to zero is a convenient way of requiring zero Lagrange multipliers for inactive constraints, as inactive constraints have values different from zero. These conditions, which need to be satisfied at the minimizer x are called the Kuhn-Tucker conditions. The Kuhn-Tucker conditions are necessary but not sufficient for a minimum. To get sufficient conditions, we again need to examine the second derivatives. However, now we need to consider only directions that are tangent to the active constraints. That is, we define the tangent space T as T (x ) = {y; y T g aj (x ) = 0, y T h i (x ) = 0, i = 1, n e }, (2.4.44) where f is the gradient of f, that is the vector of first derivatives of f, and where g aj indicates the active inequality constraints at the minimizer, x. That is T is the space of all vectors which are tangent to the active constraints, that is vectors that are perpendicular to the gradient of these constraints. A necessary second-order condition for a minimum is that y T H L (x )y 0, for every yint, (2.4.45) where H L is the Hessian matrix of the Lagrangian. A strict equality in the equation is a sufficient condition for a minimum, when the point satisfies the Kuhn-Tucker conditions. 2.5 Conditions for global optimality We have seen that the condition for global optimality of unconstrained minimum requiress the convexity of the objective function. A similar condition for a global minimum requires the convexity of the inequality constraints. However, more rigorously, the condition is that the feasible domain is convex, and so we start by defining a convex domain. A set of points S is convex, if for any pair of points x 1 and x 2 that belong to the set, the line segment connecting them also belongs to the set. That is, αx 1 + (1 α)x 2 belong to S, (2.5.1) It can be shown that the feasible domain is convex if all the inequality constraint function are convex and all the equality constraints are linear. This condition, however, means not only that the local minimizer is a global minimizer, but also that there is only one local minimum. This condition, therefore, is quite strong, and is rarely satisfied. For most engineering problems, it is impossible to prove that a point is a global minimum, and instead we have to keep looking for local minima, and be satisfied with probabilistic estimates of the chance that we have found the local minimum which is also the global one. 2.6 Exercises Find all the stationary points for the following functions and characterize them as minima, maxima, saddle points or inflection points: 1. f = 2x x 1x 2 + 3x x 1 2. f = x x 1x 2 + 3x f = cos x 1 sin x 2 26

13 2.6. EXERCISES 4. For the soft-drink can problem in Chapter 1, the optimum design for a standard-size can (l = 0) was D = in., H = in., and the profit per can was cents. The two active constraints for that problem were the aspect ratio and the volume constraints. Ignoring the other constraints we can formulate the problem as minimize such that p c = C P H g 1 = 1 ( l)D 0, (2.6.1) g 2 = V o Find the Lagrange multipliers associated with this solution, and use them to estimate the profit per can for the tall can design (l = 1). Check how well the estimate agrees with the actual optimum at l = 1. 27

### September Math Course: First Order Derivative

September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

### MATH2070 Optimisation

MATH2070 Optimisation Nonlinear optimisation with constraints Semester 2, 2012 Lecturer: I.W. Guo Lecture slides courtesy of J.R. Wishart Review The full nonlinear optimisation problem with equality constraints

### Gradient Descent. Dr. Xiaowei Huang

Gradient Descent Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Three machine learning algorithms: decision tree learning k-nn linear regression only optimization objectives are discussed,

### where u is the decision-maker s payoff function over her actions and S is the set of her feasible actions.

Seminars on Mathematics for Economics and Finance Topic 3: Optimization - interior optima 1 Session: 11-12 Aug 2015 (Thu/Fri) 10:00am 1:00pm I. Optimization: introduction Decision-makers (e.g. consumers,

### Optimality Conditions for Constrained Optimization

72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

### TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM H. E. Krogstad, IMF, Spring 2012 Karush-Kuhn-Tucker (KKT) Theorem is the most central theorem in constrained optimization, and since the proof is scattered

### Review of Optimization Methods

Review of Optimization Methods Prof. Manuela Pedio 20550 Quantitative Methods for Finance August 2018 Outline of the Course Lectures 1 and 2 (3 hours, in class): Linear and non-linear functions on Limits,

### In view of (31), the second of these is equal to the identity I on E m, while this, in view of (30), implies that the first can be written

11.8 Inequality Constraints 341 Because by assumption x is a regular point and L x is positive definite on M, it follows that this matrix is nonsingular (see Exercise 11). Thus, by the Implicit Function

### Performance Surfaces and Optimum Points

CSC 302 1.5 Neural Networks Performance Surfaces and Optimum Points 1 Entrance Performance learning is another important class of learning law. Network parameters are adjusted to optimize the performance

### Introduction to Machine Learning Lecture 7. Mehryar Mohri Courant Institute and Google Research

Introduction to Machine Learning Lecture 7 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Convex Optimization Differentiation Definition: let f : X R N R be a differentiable function,

### Seminars on Mathematics for Economics and Finance Topic 5: Optimization Kuhn-Tucker conditions for problems with inequality constraints 1

Seminars on Mathematics for Economics and Finance Topic 5: Optimization Kuhn-Tucker conditions for problems with inequality constraints 1 Session: 15 Aug 2015 (Mon), 10:00am 1:00pm I. Optimization with

### Nonlinear Programming (NLP)

Natalia Lazzati Mathematics for Economics (Part I) Note 6: Nonlinear Programming - Unconstrained Optimization Note 6 is based on de la Fuente (2000, Ch. 7), Madden (1986, Ch. 3 and 5) and Simon and Blume

### 5 Handling Constraints

5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

### Constrained Optimization

1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

### FINANCIAL OPTIMIZATION

FINANCIAL OPTIMIZATION Lecture 1: General Principles and Analytic Optimization Philip H. Dybvig Washington University Saint Louis, Missouri Copyright c Philip H. Dybvig 2008 Choose x R N to minimize f(x)

### Nonlinear Programming (Hillier, Lieberman Chapter 13) CHEM-E7155 Production Planning and Control

Nonlinear Programming (Hillier, Lieberman Chapter 13) CHEM-E7155 Production Planning and Control 19/4/2012 Lecture content Problem formulation and sample examples (ch 13.1) Theoretical background Graphical

### SECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING

Nf SECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING f(x R m g HONOUR SCHOOL OF MATHEMATICS, OXFORD UNIVERSITY HILARY TERM 5, DR RAPHAEL

### MATHEMATICAL ECONOMICS: OPTIMIZATION. Contents

MATHEMATICAL ECONOMICS: OPTIMIZATION JOÃO LOPES DIAS Contents 1. Introduction 2 1.1. Preliminaries 2 1.2. Optimal points and values 2 1.3. The optimization problems 3 1.4. Existence of optimal points 4

### Generalization to inequality constrained problem. Maximize

Lecture 11. 26 September 2006 Review of Lecture #10: Second order optimality conditions necessary condition, sufficient condition. If the necessary condition is violated the point cannot be a local minimum

### g(x,y) = c. For instance (see Figure 1 on the right), consider the optimization problem maximize subject to

1 of 11 11/29/2010 10:39 AM From Wikipedia, the free encyclopedia In mathematical optimization, the method of Lagrange multipliers (named after Joseph Louis Lagrange) provides a strategy for finding the

### Tangent spaces, normals and extrema

Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

### Math (P)refresher Lecture 8: Unconstrained Optimization

Math (P)refresher Lecture 8: Unconstrained Optimization September 2006 Today s Topics : Quadratic Forms Definiteness of Quadratic Forms Maxima and Minima in R n First Order Conditions Second Order Conditions

### Unconstrained Optimization

1 / 36 Unconstrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University February 2, 2015 2 / 36 3 / 36 4 / 36 5 / 36 1. preliminaries 1.1 local approximation

### Functions of Several Variables

Functions of Several Variables The Unconstrained Minimization Problem where In n dimensions the unconstrained problem is stated as f() x variables. minimize f()x x, is a scalar objective function of vector

### The Kuhn-Tucker Problem

Natalia Lazzati Mathematics for Economics (Part I) Note 8: Nonlinear Programming - The Kuhn-Tucker Problem Note 8 is based on de la Fuente (2000, Ch. 7) and Simon and Blume (1994, Ch. 18 and 19). The Kuhn-Tucker

### Chapter 2: Unconstrained Extrema

Chapter 2: Unconstrained Extrema Math 368 c Copyright 2012, 2013 R Clark Robinson May 22, 2013 Chapter 2: Unconstrained Extrema 1 Types of Sets Definition For p R n and r > 0, the open ball about p of

CHAPTER 2: QUADRATIC PROGRAMMING Overview Quadratic programming (QP) problems are characterized by objective functions that are quadratic in the design variables, and linear constraints. In this sense,

### N. L. P. NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP. Optimization. Models of following form:

0.1 N. L. P. Katta G. Murty, IOE 611 Lecture slides Introductory Lecture NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP does not include everything

### Optimization. A first course on mathematics for economists

Optimization. A first course on mathematics for economists Xavier Martinez-Giralt Universitat Autònoma de Barcelona xavier.martinez.giralt@uab.eu II.3 Static optimization - Non-Linear programming OPT p.1/45

### The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints.

1 Optimization Mathematical programming refers to the basic mathematical problem of finding a maximum to a function, f, subject to some constraints. 1 In other words, the objective is to find a point,

### Nonlinear Optimization: What s important?

Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global

### 1. Introduction. 2. Outlines

1. Introduction Graphs are beneficial because they summarize and display information in a manner that is easy for most people to comprehend. Graphs are used in many academic disciplines, including math,

### Date: July 5, Contents

2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........

### Scientific Computing: Optimization

Scientific Computing: Optimization Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 March 8th, 2011 A. Donev (Courant Institute) Lecture

### Nonlinear Programming and the Kuhn-Tucker Conditions

Nonlinear Programming and the Kuhn-Tucker Conditions The Kuhn-Tucker (KT) conditions are first-order conditions for constrained optimization problems, a generalization of the first-order conditions we

### Examination paper for TMA4180 Optimization I

Department of Mathematical Sciences Examination paper for TMA4180 Optimization I Academic contact during examination: Phone: Examination date: 26th May 2016 Examination time (from to): 09:00 13:00 Permitted

### Constrained Optimization Theory

Constrained Optimization Theory Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. IMA, August 2016 Stephen Wright (UW-Madison) Constrained Optimization Theory IMA, August

### Computational Finance

Department of Mathematics at University of California, San Diego Computational Finance Optimization Techniques [Lecture 2] Michael Holst January 9, 2017 Contents 1 Optimization Techniques 3 1.1 Examples

### Optimization Methods

Optimization Methods Decision making Examples: determining which ingredients and in what quantities to add to a mixture being made so that it will meet specifications on its composition allocating available

### AM 205: lecture 18. Last time: optimization methods Today: conditions for optimality

AM 205: lecture 18 Last time: optimization methods Today: conditions for optimality Existence of Global Minimum For example: f (x, y) = x 2 + y 2 is coercive on R 2 (global min. at (0, 0)) f (x) = x 3

### Math 164-1: Optimization Instructor: Alpár R. Mészáros

Math 164-1: Optimization Instructor: Alpár R. Mészáros First Midterm, April 20, 2016 Name (use a pen): Student ID (use a pen): Signature (use a pen): Rules: Duration of the exam: 50 minutes. By writing

### Taylor Series and stationary points

Chapter 5 Taylor Series and stationary points 5.1 Taylor Series The surface z = f(x, y) and its derivatives can give a series approximation for f(x, y) about some point (x 0, y 0 ) as illustrated in Figure

### Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema

Chapter 7 Extremal Problems No matter in theoretical context or in applications many problems can be formulated as problems of finding the maximum or minimum of a function. Whenever this is the case, advanced

### CE 191: Civil and Environmental Engineering Systems Analysis. LEC 05 : Optimality Conditions

CE 191: Civil and Environmental Engineering Systems Analysis LEC : Optimality Conditions Professor Scott Moura Civil & Environmental Engineering University of California, Berkeley Fall 214 Prof. Moura

### CE 191: Civil & Environmental Engineering Systems Analysis. LEC 17 : Final Review

CE 191: Civil & Environmental Engineering Systems Analysis LEC 17 : Final Review Professor Scott Moura Civil & Environmental Engineering University of California, Berkeley Fall 2014 Prof. Moura UC Berkeley

### Lagrange multipliers. Portfolio optimization. The Lagrange multipliers method for finding constrained extrema of multivariable functions.

Chapter 9 Lagrange multipliers Portfolio optimization The Lagrange multipliers method for finding constrained extrema of multivariable functions 91 Lagrange multipliers Optimization problems often require

### Paul Schrimpf. October 17, UBC Economics 526. Constrained optimization. Paul Schrimpf. First order conditions. Second order conditions

UBC Economics 526 October 17, 2012 .1.2.3.4 Section 1 . max f (x) s.t. h(x) = c f : R n R, h : R n R m Draw picture of n = 2 and m = 1 At optimum, constraint tangent to level curve of function Rewrite

### Numerical optimization

Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal

### Constrained Optimization

Constrained Optimization Joshua Wilde, revised by Isabel Tecu, Takeshi Suzuki and María José Boccardi August 13, 2013 1 General Problem Consider the following general constrained optimization problem:

### Math 273a: Optimization Basic concepts

Math 273a: Optimization Basic concepts Instructor: Wotao Yin Department of Mathematics, UCLA Spring 2015 slides based on Chong-Zak, 4th Ed. Goals of this lecture The general form of optimization: minimize

### Notes on Constrained Optimization

Notes on Constrained Optimization Wes Cowan Department of Mathematics, Rutgers University 110 Frelinghuysen Rd., Piscataway, NJ 08854 December 16, 2016 1 Introduction In the previous set of notes, we considered

### Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

### Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

### Mathematical Economics. Lecture Notes (in extracts)

Prof. Dr. Frank Werner Faculty of Mathematics Institute of Mathematical Optimization (IMO) http://math.uni-magdeburg.de/ werner/math-ec-new.html Mathematical Economics Lecture Notes (in extracts) Winter

### ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints Instructor: Prof. Kevin Ross Scribe: Nitish John October 18, 2011 1 The Basic Goal The main idea is to transform a given constrained

### EC /11. Math for Microeconomics September Course, Part II Problem Set 1 with Solutions. a11 a 12. x 2

LONDON SCHOOL OF ECONOMICS Professor Leonardo Felli Department of Economics S.478; x7525 EC400 2010/11 Math for Microeconomics September Course, Part II Problem Set 1 with Solutions 1. Show that the general

### Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems

1 Numerical optimization Alexander & Michael Bronstein, 2006-2009 Michael Bronstein, 2010 tosca.cs.technion.ac.il/book Numerical optimization 048921 Advanced topics in vision Processing and Analysis of

### OPTIMIZATION THEORY IN A NUTSHELL Daniel McFadden, 1990, 2003

OPTIMIZATION THEORY IN A NUTSHELL Daniel McFadden, 1990, 2003 UNCONSTRAINED OPTIMIZATION 1. Consider the problem of maximizing a function f:ú n 6 ú within a set A f ú n. Typically, A might be all of ú

### Optimality conditions for unconstrained optimization. Outline

Optimality conditions for unconstrained optimization Daniel P. Robinson Department of Applied Mathematics and Statistics Johns Hopkins University September 13, 2018 Outline 1 The problem and definitions

### EC /11. Math for Microeconomics September Course, Part II Lecture Notes. Course Outline

LONDON SCHOOL OF ECONOMICS Professor Leonardo Felli Department of Economics S.478; x7525 EC400 20010/11 Math for Microeconomics September Course, Part II Lecture Notes Course Outline Lecture 1: Tools for

### Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7

Mathematical Foundations -- Constrained Optimization Constrained Optimization An intuitive approach First Order Conditions (FOC) 7 Constraint qualifications 9 Formal statement of the FOC for a maximum

### Lecture 18: Optimization Programming

Fall, 2016 Outline Unconstrained Optimization 1 Unconstrained Optimization 2 Equality-constrained Optimization Inequality-constrained Optimization Mixture-constrained Optimization 3 Quadratic Programming

### Calculus and optimization

Calculus an optimization These notes essentially correspon to mathematical appenix 2 in the text. 1 Functions of a single variable Now that we have e ne functions we turn our attention to calculus. A function

### More on Lagrange multipliers

More on Lagrange multipliers CE 377K April 21, 2015 REVIEW The standard form for a nonlinear optimization problem is min x f (x) s.t. g 1 (x) 0. g l (x) 0 h 1 (x) = 0. h m (x) = 0 The objective function

### Constrained optimization: direct methods (cont.)

Constrained optimization: direct methods (cont.) Jussi Hakanen Post-doctoral researcher jussi.hakanen@jyu.fi Direct methods Also known as methods of feasible directions Idea in a point x h, generate a

### A Project Report Submitted by. Devendar Mittal 410MA5096. A thesis presented for the degree of Master of Science

PARTICULARS OF NON-LINEAR OPTIMIZATION A Project Report Submitted by Devendar Mittal 410MA5096 A thesis presented for the degree of Master of Science Department of Mathematics National Institute of Technology,

### Microeconomics I. September, c Leopold Sögner

Microeconomics I c Leopold Sögner Department of Economics and Finance Institute for Advanced Studies Stumpergasse 56 1060 Wien Tel: +43-1-59991 182 soegner@ihs.ac.at http://www.ihs.ac.at/ soegner September,

### ARE202A, Fall 2005 CONTENTS. 1. Graphical Overview of Optimization Theory (cont) Separating Hyperplanes 1

AREA, Fall 5 LECTURE #: WED, OCT 5, 5 PRINT DATE: OCTOBER 5, 5 (GRAPHICAL) CONTENTS 1. Graphical Overview of Optimization Theory (cont) 1 1.4. Separating Hyperplanes 1 1.5. Constrained Maximization: One

### Week 4: Calculus and Optimization (Jehle and Reny, Chapter A2)

Week 4: Calculus and Optimization (Jehle and Reny, Chapter A2) Tsun-Feng Chiang *School of Economics, Henan University, Kaifeng, China September 27, 2015 Microeconomic Theory Week 4: Calculus and Optimization

### Econ Slides from Lecture 14

Econ 205 Sobel Econ 205 - Slides from Lecture 14 Joel Sobel September 10, 2010 Theorem ( Lagrange Multipliers ) Theorem If x solves max f (x) subject to G(x) = 0 then there exists λ such that Df (x ) =

### Constrained optimization

Constrained optimization In general, the formulation of constrained optimization is as follows minj(w), subject to H i (w) = 0, i = 1,..., k. where J is the cost function and H i are the constraints. Lagrange

### MVE165/MMG631 Linear and integer optimization with applications Lecture 13 Overview of nonlinear programming. Ann-Brith Strömberg

MVE165/MMG631 Overview of nonlinear programming Ann-Brith Strömberg 2015 05 21 Areas of applications, examples (Ch. 9.1) Structural optimization Design of aircraft, ships, bridges, etc Decide on the material

### Lectures 9 and 10: Constrained optimization problems and their optimality conditions

Lectures 9 and 10: Constrained optimization problems and their optimality conditions Coralia Cartis, Mathematical Institute, University of Oxford C6.2/B2: Continuous Optimization Lectures 9 and 10: Constrained

### Unconstrained optimization

Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

### Numerical Optimization

Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,

### The Karush-Kuhn-Tucker conditions

Chapter 6 The Karush-Kuhn-Tucker conditions 6.1 Introduction In this chapter we derive the first order necessary condition known as Karush-Kuhn-Tucker (KKT) conditions. To this aim we introduce the alternative

### Math 5311 Constrained Optimization Notes

ath 5311 Constrained Optimization otes February 5, 2009 1 Equality-constrained optimization Real-world optimization problems frequently have constraints on their variables. Constraints may be equality

### Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30

Optimization Escuela de Ingeniería Informática de Oviedo (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Unconstrained optimization Outline 1 Unconstrained optimization 2 Constrained

### Preliminary draft only: please check for final version

ARE211, Fall2012 CALCULUS4: THU, OCT 11, 2012 PRINTED: AUGUST 22, 2012 (LEC# 15) Contents 3. Univariate and Multivariate Differentiation (cont) 1 3.6. Taylor s Theorem (cont) 2 3.7. Applying Taylor theory:

### Calculus Example Exam Solutions

Calculus Example Exam Solutions. Limits (8 points, 6 each) Evaluate the following limits: p x 2 (a) lim x 4 We compute as follows: lim p x 2 x 4 p p x 2 x +2 x 4 p x +2 x 4 (x 4)( p x + 2) p x +2 = p 4+2

### Lagrange Multipliers

Lagrange Multipliers (Com S 477/577 Notes) Yan-Bin Jia Nov 9, 2017 1 Introduction We turn now to the study of minimization with constraints. More specifically, we will tackle the following problem: minimize

### Calculus 2502A - Advanced Calculus I Fall : Local minima and maxima

Calculus 50A - Advanced Calculus I Fall 014 14.7: Local minima and maxima Martin Frankland November 17, 014 In these notes, we discuss the problem of finding the local minima and maxima of a function.

### Lecture 3: Basics of set-constrained and unconstrained optimization

Lecture 3: Basics of set-constrained and unconstrained optimization (Chap 6 from textbook) Xiaoqun Zhang Shanghai Jiao Tong University Last updated: October 9, 2018 Optimization basics Outline Optimization

### 8. Constrained Optimization

8. Constrained Optimization Daisuke Oyama Mathematics II May 11, 2018 Unconstrained Maximization Problem Let X R N be a nonempty set. Definition 8.1 For a function f : X R, x X is a (strict) local maximizer

### Written Examination

Division of Scientific Computing Department of Information Technology Uppsala University Optimization Written Examination 202-2-20 Time: 4:00-9:00 Allowed Tools: Pocket Calculator, one A4 paper with notes

### II. An Application of Derivatives: Optimization

Anne Sibert Autumn 2013 II. An Application of Derivatives: Optimization In this section we consider an important application of derivatives: finding the minimum and maximum points. This has important applications

### Convex Optimization & Lagrange Duality

Convex Optimization & Lagrange Duality Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Convex optimization Optimality condition Lagrange duality KKT

### LAGRANGE MULTIPLIERS

LAGRANGE MULTIPLIERS MATH 195, SECTION 59 (VIPUL NAIK) Corresponding material in the book: Section 14.8 What students should definitely get: The Lagrange multiplier condition (one constraint, two constraints

### 1 Computing with constraints

Notes for 2017-04-26 1 Computing with constraints Recall that our basic problem is minimize φ(x) s.t. x Ω where the feasible set Ω is defined by equality and inequality conditions Ω = {x R n : c i (x)

6-1: Introduction to gradient descent Prof. J.C. Kao, UCLA Introduction to gradient descent Derivation and intuitions Hessian 6-2: Introduction to gradient descent Prof. J.C. Kao, UCLA Introduction Our

### NONLINEAR. (Hillier & Lieberman Introduction to Operations Research, 8 th edition)

NONLINEAR PROGRAMMING (Hillier & Lieberman Introduction to Operations Research, 8 th edition) Nonlinear Programming g Linear programming has a fundamental role in OR. In linear programming all its functions

### Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2011/12 IGPM, RWTH Aachen Numerisches Rechnen

### , b = 0. (2) 1 2 The eigenvectors of A corresponding to the eigenvalues λ 1 = 1, λ 2 = 3 are

Quadratic forms We consider the quadratic function f : R 2 R defined by f(x) = 2 xt Ax b T x with x = (x, x 2 ) T, () where A R 2 2 is symmetric and b R 2. We will see that, depending on the eigenvalues

### ARE211, Fall 2005 CONTENTS. 5. Characteristics of Functions Surjective, Injective and Bijective functions. 5.2.

ARE211, Fall 2005 LECTURE #18: THU, NOV 3, 2005 PRINT DATE: NOVEMBER 22, 2005 (COMPSTAT2) CONTENTS 5. Characteristics of Functions. 1 5.1. Surjective, Injective and Bijective functions 1 5.2. Homotheticity

### UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

### Transpose & Dot Product

Transpose & Dot Product Def: The transpose of an m n matrix A is the n m matrix A T whose columns are the rows of A. So: The columns of A T are the rows of A. The rows of A T are the columns of A. Example:

### Differentiation. f(x + h) f(x) Lh = L.

Analysis in R n Math 204, Section 30 Winter Quarter 2008 Paul Sally, e-mail: sally@math.uchicago.edu John Boller, e-mail: boller@math.uchicago.edu website: http://www.math.uchicago.edu/ boller/m203 Differentiation

### Lagrange Multipliers

Optimization with Constraints As long as algebra and geometry have been separated, their progress have been slow and their uses limited; but when these two sciences have been united, they have lent each