IE 5531: Engineering Optimization I
Lecture 12: Nonlinear optimization, continued
Prof. John Gunnar Carlsson
October 20, 2010
Administrivia
- PS4 posted later today
Recap: optimality conditions for NLP
Single variable:
- Necessary conditions for a minimizer $x^*$: $\frac{df}{dx}\big|_{x^*} = 0$ and $\frac{d^2 f}{dx^2}\big|_{x^*} \geq 0$
- Sufficient conditions: $\frac{df}{dx}\big|_{x^*} = 0$ and $\frac{d^2 f}{dx^2}\big|_{x^*} > 0$
- If $f(x)$ is convex, then $\frac{df}{dx}\big|_{x^*} = 0$ is a sufficient condition
Multiple variables:
- Necessary conditions for a minimizer $x^*$: $\nabla f(x^*) = 0$ and $H(x^*) \succeq 0$
- Sufficient conditions: $\nabla f(x^*) = 0$ and $H(x^*) \succ 0$
- If $f(x)$ is convex, then $\nabla f(x^*) = 0$ is a sufficient condition
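The single-variable conditions are easy to check numerically with finite differences; a minimal Python sketch, using an illustrative function not taken from the slides:

```python
# Numeric check of the single-variable optimality conditions for
# f(x) = (x - 3)^2 + 1, an illustrative example with minimizer x* = 3.
f = lambda x: (x - 3) ** 2 + 1
x_star = 3.0
h = 1e-5

# Central-difference approximations of df/dx and d^2f/dx^2 at x*
df = (f(x_star + h) - f(x_star - h)) / (2 * h)
d2f = (f(x_star + h) - 2 * f(x_star) + f(x_star - h)) / h ** 2

print(abs(df) < 1e-6, d2f > 0)  # both conditions hold at x* = 3
```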
Recap: descent directions
- Let $f(x)$ be a differentiable function; if $\bar{x} \in \mathbb{R}^n$ and there exists a vector $d$ such that $\nabla f(\bar{x})^T d < 0$, then there exists a scalar $\bar{\tau} > 0$ such that $f(\bar{x} + \tau d) < f(\bar{x})$ for all $\tau \in (0, \bar{\tau})$
- The vector $d$ is called a descent direction at $\bar{x}$
- An obvious descent direction is $-\nabla f(x)$
- Thus the necessary condition for optimality, $\nabla f(x^*) = 0$, simply says that no descent direction exists at $x^*$
Recap: feasible directions
- At a feasible point $\bar{x}$, a feasible direction is a direction $d \neq 0$ such that $\bar{x} + \lambda d \in F$ for sufficiently small $\lambda > 0$
- The set of all feasible directions at $\bar{x}$ is denoted $D(\bar{x}; F)$
- For example, if $F = \{x : Ax = b\}$ then $D(\bar{x}; F) = \{d : Ad = 0\}$
- If $F = \{x : Ax \geq b\}$ then $D(\bar{x}; F) = \{d : A_i d \geq 0 \;\forall i \in \mathcal{A}(\bar{x})\}$, where $\mathcal{A}(\bar{x})$ is the set of indices of active constraints at $\bar{x}$
Today: nonlinear optimization, continued
- Applications of optimality conditions
- Lagrange multipliers
- Linear inequality constraints
- KKT conditions for NLP
Two examples
- A quadratic function $f(x) = x^T Q x - 2c^T x$: the gradient is $\nabla f(x) = 2Qx - 2c$, and thus a necessary condition for optimality is that $Qx = c$
- A profit function $f(x) = q\,u(x) - c^T x$: the gradient is $q \nabla u(x) - c$, and thus a necessary condition for optimality is that $q \nabla u(x) = c$
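The quadratic case says the stationary point is found by solving one linear system. A short sketch with made-up data (the matrix $Q$ and vector $c$ below are illustrative, not from the lecture):

```python
import numpy as np

# Illustrative data: a symmetric positive definite Q, so the
# stationary point of f(x) = x^T Q x - 2 c^T x is the unique minimizer.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])
c = np.array([1.0, 2.0])

# Necessary condition for optimality: Qx = c
x_star = np.linalg.solve(Q, c)

# The gradient 2Qx - 2c vanishes at x_star
grad = 2 * Q @ x_star - 2 * c
print(np.allclose(grad, np.zeros(2)))  # True
```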
Linear equality-constrained problems
- Consider the problem: minimize $f(x)$ s.t. $Ax = b$, with $x \in \mathbb{R}^n$ and $A$ of full rank
- The set $\{x : Ax = b\}$ is an $(n - m)$-dimensional subspace
- Consider a feasible point $\bar{x}$; the constraint $A(\bar{x} + p) = b$ is equivalent to $Ap = 0$
- Thus the original problem is equivalent to: minimize $g(p) := f(\bar{x} + p)$ s.t. $Ap = 0$
Linear equality-constrained problems
- Let $Z \in \mathbb{R}^{n \times (n-m)}$ be a basis matrix for the null space $\{p : Ap = 0\}$; thus $AZ = 0$
- The original problem is then equivalent to the unconstrained problem: minimize $h(v) := f(\bar{x} + Zv)$, with $v \in \mathbb{R}^{n-m}$
- By the chain rule, $\nabla h(v) = Z^T \nabla f(\bar{x} + Zv)$, and therefore at optimality $\nabla h(v^*) = Z^T \nabla f(\bar{x} + Zv^*) = Z^T \nabla f(x^*) = 0$
- If $Z^T \nabla f(x^*) = 0$ then $\nabla f(x^*) = A^T y$ for some $y$
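The null-space reduction above can be sketched numerically. The constraint data and the toy objective $f(x) = \tfrac{1}{2}\|x\|^2$ (so $\nabla f(x) = x$) below are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0, 0.0]])   # m = 1 constraint, n = 3 variables
b = np.array([1.0])

Z = null_space(A)                              # columns span {p : Ap = 0}
x_bar = np.linalg.lstsq(A, b, rcond=None)[0]   # some feasible point

# For f(x) = ||x||^2 / 2 the reduced condition Z^T grad f(x_bar + Zv) = 0
# is the linear system (Z^T Z) v = -Z^T x_bar.
v = np.linalg.solve(Z.T @ Z, -Z.T @ x_bar)
x_star = x_bar + Z @ v

# At the reduced optimum, grad f(x*) = x* lies in the row space of A,
# i.e. grad f(x*) = A^T y for some y.
y = np.linalg.lstsq(A.T, x_star, rcond=None)[0]
print(np.allclose(A.T @ y, x_star))  # True
```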
Linear equality-constrained problems
Theorem. A necessary condition for optimality at a point $\bar{x}$ for the problem minimize $f(x)$ s.t. $Ax = b$ is that $\nabla f(\bar{x}) = A^T y$ for some vector $y \in \mathbb{R}^m$.
- The geometric interpretation is that the gradient vector must be perpendicular to the constraint hyperplanes
Example
- Consider the problem of finding the nearest point to a fixed point $\hat{x} = (1, 1)$, subject to a linear equality constraint: minimize $\|x - (1, 1)\|$ s.t. $x_1 + x_2 = 1$
- We can square the objective function to make differentiation easier: minimize $(x_1 - 1)^2 + (x_2 - 1)^2$ s.t. $x_1 + x_2 = 1$
- The optimal point is clearly $x^* = (1/2, 1/2)$, at which point we have
$$\nabla f(x^*) = \begin{pmatrix} 2x_1 - 2 \\ 2x_2 - 2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} = A^T y \text{ with } y = -1$$
as desired
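The numbers in this example are easy to verify in a couple of lines of Python:

```python
import numpy as np

# Verify the example: min (x1-1)^2 + (x2-1)^2  s.t.  x1 + x2 = 1
A = np.array([[1.0, 1.0]])
x_star = np.array([0.5, 0.5])

grad = 2 * (x_star - 1.0)     # gradient of the squared objective: (-1, -1)
y = np.array([-1.0])          # the claimed Lagrange multiplier

print(np.allclose(A.T @ y, grad))       # True: grad f = A^T y
print(np.isclose((A @ x_star)[0], 1.0)) # True: x* is feasible
```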
General equality-constrained problems
- We next consider a problem with a single (nonlinear) equality constraint: minimize $f(x)$ s.t. $g(x) = 0$
- The Lagrange multiplier theorem describes the necessary conditions for optimality of a general equality-constrained problem
Proof sketch
[Four slides of figures illustrating the proof; the images are not recoverable from the text.]
Lagrange multiplier theorem
Theorem. Let $\bar{x}$ be a local minimizer of the problem minimize $f(x)$ s.t. $g_i(x) = 0$, $i \in \{1, \ldots, m\}$. If the functions $f(x)$ and $g_i(x)$ are continuously differentiable at $\bar{x}$ and the Jacobian matrix $\nabla g(\bar{x})$ has rank $m$, then there exist scalars $y_1, \ldots, y_m$ such that
$$\nabla f(\bar{x}) = \sum_{i=1}^m y_i \nabla g_i(\bar{x})$$
where the $y_i$'s are called Lagrange multipliers.
Lagrange function
- The function $L(x, y) = f(x) - \sum_{i=1}^m y_i g_i(x)$ is called the Lagrangian for the equality-constrained problem
- The preceding theorem simply states the necessary conditions for optimality of $L(x, y)$:
$$\nabla_x L(x, y) = \nabla f(x) - \sum_{i=1}^m y_i \nabla g_i(x) = 0$$
$$\nabla_y L(x, y) = -g(x) = 0$$
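The stationarity conditions of the Lagrangian form a square nonlinear system in $(x, y)$ that a root-finder can attack directly. A sketch for the illustrative problem min $x_1 + x_2$ s.t. $x_1^2 + x_2^2 = 2$ (not from the slides), whose minimizer is $(-1, -1)$ with multiplier $y = -1/2$:

```python
from scipy.optimize import fsolve

# Stationarity of L(x, y) = f(x) - y g(x) for
# f(x) = x1 + x2,  g(x) = x1^2 + x2^2 - 2.
def lagrange_system(z):
    x1, x2, y = z
    return [1 - 2 * y * x1,          # dL/dx1 = 0
            1 - 2 * y * x2,          # dL/dx2 = 0
            x1 ** 2 + x2 ** 2 - 2]   # -dL/dy = g(x) = 0

# Start near the minimizer so the root-finder picks it over the maximizer
x1, x2, y = fsolve(lagrange_system, x0=[-1.0, -0.5, -1.0])
print(round(x1, 4), round(x2, 4), round(y, 4))
```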
Linear inequality constraints
- Consider the problem: minimize $f(x)$ s.t. $Ax \geq b$
- At optimality, some of the constraints are active (hold with equality) and the remaining constraints are inactive
- Write $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$, $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$, where $A_1 x^* = b_1$ and $A_2 x^* > b_2$
Linear inequality constraints
- By the previous argument for linear equality constraints we know that at optimality, $\nabla f(x^*) = A_1^T y^*$ for some $y^*$
- Note that the feasible directions at $x^*$ are all $p$ such that $A_1 p \geq 0$
- If $x^*$ is a local minimizer then $\nabla f(x^*)^T p \geq 0$ for all feasible directions $p$
- Therefore, we require $\nabla f(x^*)^T p = (A_1^T y^*)^T p = (y^*)^T A_1 p \geq 0$ for all feasible directions $p$, which implies that $y^* \geq 0$
Linear inequality-constrained problems
Theorem. A necessary condition for optimality at a point $\bar{x}$ for the problem minimize $f(x)$ s.t. $Ax \geq b$ is that $\nabla f(\bar{x}) = A^T y$ for some vector $y \in \mathbb{R}^m$ with $y \geq 0$. Furthermore, we must have $y_i = 0$ if $A_i \bar{x} > b_i$.
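Both conclusions of the theorem (nonnegative multipliers, zero multipliers on inactive constraints) can be seen on a tiny example; the problem below is an illustrative assumption, not from the slides:

```python
import numpy as np

# Illustrative check: min (x1+1)^2 + (x2-1)^2  s.t.  x >= 0,
# i.e. Ax >= b with A = I, b = 0. The minimizer is x* = (0, 1):
# the first constraint is active, the second inactive.
A = np.eye(2)
x_star = np.array([0.0, 1.0])

grad = np.array([2 * (x_star[0] + 1.0), 2 * (x_star[1] - 1.0)])  # (2, 0)
y = np.linalg.solve(A.T, grad)                                   # (2, 0)

print(np.all(y >= 0))   # True: multipliers are nonnegative
print(y[1] == 0.0)      # True: the inactive constraint gets y_i = 0
```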
General inequality constraints
- We finally consider the inequality-constrained problem: minimize $f(x)$ s.t. $c(x) \geq 0$, where $c(x) = (c_1(x), \ldots, c_m(x))$
Karush-Kuhn-Tucker optimality conditions
Theorem. If $\bar{x}$ is a local minimizer for the problem minimize $f(x)$ s.t. $c(x) \geq 0$, with $c(x) \in \mathbb{R}^m$, and certain (technical) constraint qualifications are satisfied at $\bar{x}$, then there exist scalars $y_1, \ldots, y_m$ such that
$$\nabla f(\bar{x}) = \sum_{i=1}^m y_i \nabla c_i(\bar{x})$$
$$y_i \geq 0 \;\forall i, \qquad y_i c_i(\bar{x}) = 0 \;\forall i$$
Karush-Kuhn-Tucker optimality conditions
Theorem. If $\bar{x}$ is a local minimizer for the problem minimize $f(x)$ s.t. $c(x) \geq 0$, $h(x) = 0$, with $c(x) \in \mathbb{R}^m$ and $h(x) \in \mathbb{R}^p$, and certain (technical) constraint qualifications are satisfied at $\bar{x}$, then there exist scalars $y_1, \ldots, y_m$ and $z_1, \ldots, z_p$ such that
$$\nabla f(\bar{x}) = \sum_{i=1}^m y_i \nabla c_i(\bar{x}) + \sum_{i=1}^p z_i \nabla h_i(\bar{x})$$
$$y_i \geq 0 \;\forall i, \qquad y_i c_i(\bar{x}) = 0 \;\forall i$$
Example
- Suppose we want to build a rectangular box with a given volume of at least 64 cubic inches
- We want to minimize the total amount of material used
- The formulation is: minimize $2xy + 2xz + 2yz$ s.t. $xyz \geq 64$
- We have $\nabla f = 2\,(y + z,\; x + z,\; x + y)$ and $\nabla g = (yz,\; xz,\; xy)$
Example
- The optimality conditions require that
$$2 \begin{pmatrix} y + z \\ x + z \\ x + y \end{pmatrix} = \lambda \begin{pmatrix} yz \\ xz \\ xy \end{pmatrix}$$
- It is not hard to see that this is satisfied at the point $x = y = z = 4$ (with $\lambda = 1$)
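The claimed point can be checked against the KKT conditions in a few lines; this is a verification sketch, not part of the original slides:

```python
import numpy as np

# Check x = y = z = 4 for  min 2xy + 2xz + 2yz  s.t.  xyz >= 64.
x = y = z = 4.0

grad_f = 2 * np.array([y + z, x + z, x + y])   # = (16, 16, 16)
grad_g = np.array([y * z, x * z, x * y])       # = (16, 16, 16)

lam = grad_f[0] / grad_g[0]                    # lambda = 1
print(np.allclose(grad_f, lam * grad_g))       # True: grad f = lambda grad g
print(lam >= 0 and np.isclose(x * y * z, 64))  # True: feasible and lam >= 0
```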