MATH2070 Optimisation Nonlinear optimisation with constraints Semester 2, 2012 Lecturer: I.W. Guo Lecture slides courtesy of J.R. Wishart
Review The full nonlinear optimisation problem with equality constraints Method of Lagrange multipliers Dealing with Inequality Constraints and the Kuhn-Tucker conditions Second order conditions with constraints
The full nonlinear optimisation problem with equality constraints Method of Lagrange multipliers Dealing with Inequality Constraints and the Kuhn-Tucker conditions Second order conditions with constraints
Non-linear optimisation Interested in optimising a function of several variables with constraints. Multivariate framework Variables x = (x 1, x 2,..., x n ) D R n Objective Function Z = f (x 1, x 2,..., x n ) Constraint equations, g 1 (x 1, x 2,..., x n ) = 0, h 1 (x 1, x 2,..., x n ) 0, g 2 (x 1, x 2,..., x n ) = 0, h 2 (x 1, x 2,..., x n ) 0,.. g m (x 1, x 2,..., x n ) = 0, h p (x 1, x 2,..., x n ) 0.
Vector notation Multivariate framework Minimise Z = f (x) such that g(x) = 0, h(x) 0, where g : R n R m and h : R n R p. Objectives? Find extrema of above problem. Consider equality constraints first. Extension: Allow inequality constraints.
Initial Example Minimise the heat loss through the surface area of an open lid rectangular container. Formulated as a problem Minimise, S = 2xy + 2yz + xz such that, V = xyz constant and x, y, z > 0. Parameters : (x, y, z) Objective function : f = S Constraints : g = xyz V
Graphical interpretation of problem level contours of f(x, y) g(x,y) = c P Feasible points occur at the tangential intersection of g(x, y) and f(x, y). Gradient of f is parallel to gradient of g. Recall from linear algebra and vectors for some constant, λ. f + λ g = 0,
Lagrangian Definition Define the Lagrangian for our problem, L(x, y, z, λ) = S(x, y, z) + λg(x, y, z) = 2xy + 2yz + xz + λ (xyz V ) Necessary conditions, L x = L y = L z = L λ = 0. This will be a system of four equations with four unknowns (x, y, z, λ) that needs to be solved
Solving the system Calculate derivatives, L x = 2y + z λyz, L y = 2x + 2z λxz, L z = 2y + x λxy, L λ = xyz V, Ensure constraint, L λ = 0 z = V xy (1) Eliminate λ, solve for (x, y, z) L x = 0 2y + z = λyz λ = 2 z + 1 y L y = 0 2x + 2z = λxz λ = 2 z + 2 x L z = 0 2y + 2x = λxy λ = 2 x + 1 y (2) (3) (4)
Solving system (continued) From (2) and (3) it follows x = 2y From (2) and (4) it follows x = z Use (1) with x = z = 2y then simplify, ( ) 3 3 V x = (x, y, z) = 2V, 4, 3 2V
The full nonlinear optimisation problem with equality constraints Method of Lagrange multipliers Dealing with Inequality Constraints and the Kuhn-Tucker conditions Second order conditions with constraints
Constrained optimisation Wish to generalised previous argument to multivariate framework, Minimise Z = f (x), x D R n such that g(x) = 0, h(x) 0, where g : R n R m and h : R n R p. Geometrically, the constraints define a subset Ω of R n, the feasible set, which is of dimension less than n m, in which the minimum lies.
Constrained Optimisation A value x D is called feasible if both g(x) = 0 and h(x) 0. An inequality constraint h i (x) 0 is Active if the feasible x D satisfies hi (x) = 0 Inactive otherwise.
Equality Constraints A point x which satisfies the constraint g(x ) = 0 is a regular point if { g 1 (x ), g 2 (x )...., g m (x )} is linearly independent. g 1 x 1 Rank.... g m x 1... g 1 x n. g m x n = m.
Tangent Subspace Definition Define the constrainted tangent subspace T = {d R n : g(x )d = 0}, where g(x ) is an m n matrix defined, g 1 x 1 g(x ) :=.... g m x 1... ( ) and gi T := gi x 1,..., g i x n g 1 x n. g m x n g 1 (x ) T =. g m (x ) T
Tangent Subspace (Example)
Generalised Lagrangian Definition Define the Lagrangian for the objective function f(x) with constraint equations g(x) = 0, L(x, λ) := f + m λ i g i = f + λ T g. i=1 Introduce a lagrange multiplier for each equality constraint.
First order necessary conditions Theorem Assume x is a regular point of g(x) = 0 and let x be a local extremum of f subject to g(x) = 0, with f, g C 1. Then there exists λ R m such that, L(x, λ) = f(x ) + λ T g(x ) = 0. In sigma notation: L(x ) x j = f(x ) x j + m i=1 λ i g i (x ) x j = 0, j = 1,..., n ; and L(x ) λ i = g i (x ) = 0, i = 1,..., m.
Another example Example Find the extremum for f(x, y) = 1 2 (x 1)2 + 1 2 (y 2)2 + 1 subject to the single constraint x + y = 1. Solution: Construct the Lagrangian: L(x, y, λ) = 1 2 (x 1)2 + 1 2 (y 2)2 + 1 + λ(x + y 1). Then the necessary conditions are: L x = x 1 + λ = 0 L y = y 2 + λ = 0 L λ = 0 x + y = 1.
Solve system Can solve this by converting problem into matrix form. i.e. solve for a in Ma = b 1 0 1 x 1 0 1 1 y = 2 1 1 0 λ 1 Invert matrix M and solve gives, x 1 0 1 y = 0 1 1 λ 1 1 0 1 0 2 = 1 1 1 1
Comparison with unconstrained problem Example Minimize f(x, y) = 1 2 (x 1)2 + 1 2 (y 2)2 + 1 subject to the single constraint x + y = 1. In unconstrained problem, minimum attained at x = (1, 2). Minimum at f = f(1, 2) = 1 Constraint modifies solution. Constrained minimum at f = f(0, 1) = 2. Note: Constrained minimum is bounded from below by the unconstrained minumum.
Use of convex functions : Example Example Minimise: f(x, y) = x 2 + y 2 + xy 5y such that: x + 2y = 5. The constraint is linear (therefore convex) and Hessian matrix of f is positive definite so f is convex as well. The Lagrangian is L(x, y, λ) = x 2 + y 2 + xy 5y + λ(x + 2y 5).
Solve Finding first order necessary conditions, L = 2x + y + λ = 0 x (5) L = 2y + x 5 + 2λ = 0 y (6) L = x + 2y 5 = 0. λ (7) In matrix form the system is, 2 1 1 x 0 1 2 2 y = 5 1 2 0 λ 5 Inverting and solving yields, x y = 1 5 10 3 λ 0
The full nonlinear optimisation problem with equality constraints Method of Lagrange multipliers Dealing with Inequality Constraints and the Kuhn-Tucker conditions Second order conditions with constraints
Inequality constraints We now consider constrained optimisation problems with p inequality constraints, Minimise: f(x 1, x 2,..., x n ) subject to: h 1 (x 1, x 2,..., x n ) 0 In vector form the problem is Minimise: h 2 (x 1, x 2,..., x n ) 0 h p (x 1, x 2,..., x n ) 0. f(x) subject to: h(x) 0..
The Kuhn-Tucker Conditions To deal with the inequality constraints, introduce the so called Kuhn-Tucker conditions. f x j + p i=1 λ i h i x j = 0, j = 1,..., n ; λ i h i = 0, i = 1,..., m ; λ i 0, i = 1,..., m. Extra conditions for λ apply in this sense, h i (x) = 0, if λ i > 0, i = 1,..., p or h i (x) < 0, if λ i = 0, i = 1,..., p.
Graphical Interpretation level contours of f(x, y) h(x,y) < 0 (λ = 0) P h(x,y) > 0 h(x,y) = 0 (λ > 0) If λ > 0 then find solution on line h(x, y) = 0 If λ = 0 then find solution in region h(x, y) < 0
Graphical Interpretation level contours of f(x, y) (x,y ) h(x,y) = 0 (λ > 0) Consider h(x, y) = 0, then λ > 0. Optimum at x = (x, y )
Graphical Interpretation (x,y ) level contours of f(x, y) h(x,y) < 0 (λ = 0) h(x,y) = 0 Consider h(x, y) < 0, then λ = 0. Optimum at x = (x, y )
Kuhn Tucker Kuhn-Tucker conditions are always necessary. Kuhn-Tucker conditions are sufficient if the objective function and the constraint functions are convex. Need a procedure to deal with these conditions.
Introduce slack variables Slack Variables To deal with the inequality constraint introduce a variable, s 2 i for each constraint. New constrained optimisation problem with m equality constraints, Minimise: f(x 1, x 2,..., x n ) subject to: h 1 (x 1, x 2,..., x n ) + s 2 1 = 0 h 2 (x 1, x 2,..., x n ) + s 2 2 = 0 h p (x 1, x 2,..., x n ) + s 2 p = 0..
New Lagrangian The Lagrangian for constrained problem is, L(x, λ, s) = f(x) + p i=1 ( λ i hi (x) + s 2 ) i = f(x) + λ T h(x) + λ T s, where s is a vector in R p with i-th component s 2 i. L depends on the n + 2m independent variables x 1, x 2,..., x n, λ 1, λ 2,..., λ p and s 1, s 2,..., s p. To find the constrained minimum we must differentiate L with respect to each of these variables and set the result to zero in each case.
Solving the system Need to solve n + 2m variables given by the n + 2m equations generated by, L = L = = L = 0 x 1 x 2 x n L = L = = L = L = L = = L = 0. λ 1 λ 2 λ p s 1 s 2 s p Having λ i 0 ensures that L = f(x) + has a well defined minimum. p i=1 ( λ i hi (x) + s 2 ) i If λ i < 0, we can always make L smaller due to the λ i s 2 i term.
New first order equations The differentiation, L/ λ i = 0, ensures the constraints hold. L λi = 0 h i + s 2 i = 0, i = 1,..., p. The differentiation, L s i minimising, = 0, controls observed region of L si = 0 2λ i s i = 0, i = 1,..., p. If λi > 0 we are on the boundary since s i = 0 If λi = 0 we are inside the feasible region since s i 0. If λi < 0 we are on the boundary again, but the point is not a local minimum. So only focus on λ i 0.
Simple application Example Consider the general problem Minimise: f(x) subject to: a x b. Re-express the problem to allow the Kuhn-Tucker conditions. Minimise: subject to: f(x) x a x b. The Lagrangian L is given by L = f + λ 1 ( x + s 2 1 + a) + λ 2 (x + s 2 2 b).
Solving system Find first order equations L x = f (x) λ 1 + λ 2 = 0, L = x + a + s 2 L 1 = 0 = x b + s 2 2 = 0, λ 1 λ 2 L L = 2λ 1 s 1 = 0 = 2λ 2 s 2 = 0. s 1 s 2 Consider the scenarios for when the constraints are at the boundary or inside the feasible region.
Possible scenarios y y y f (x) > 0 f (x) = 0 f (x) < 0 a DOWN b x a IN b x a UP b x Down variable scenario (lower boundary). s 1 = 0 λ 1 > 0 and s 2 2 > 0 λ 2 = 0. Implies that x = a, and f (x) = f (a) = λ 1 > 0, so that the minimum lies at x = a.
Possible scenarios y y y f (x) > 0 f (x) = 0 f (x) < 0 a DOWN b x a IN b x a UP b x In variable scenario (Inside feasible region). s 2 1 > 0 λ 1 = 0 and s 2 2 > 0 λ 2 = 0. Implies f (x) = 0, for some x [a, b] determined by the specific f(x), and thus there is a turning point somewhere in the interval [a, b]
Possible scenarios y y y f (x) > 0 f (x) = 0 f (x) < 0 a DOWN b x a IN b x a UP b x Up variable scenario (upper boundary). s 2 1 > 0 λ 1 = 0 and s 2 = 0 λ 2 > 0. Implies x = b, so that f (x) = f (b) = λ 2 < 0 and we have the minimum at x = b.
Numerical example Example Minimise: f(x 1, x 2, x 3 ) = x 1 (x 1 10) + x 2 (x 2 50) 2x 3 subject to: x 1 + x 2 10 x 3 10. Write down Lagrangian, L(x, λ, s) = f(x) + λ T (h(x) + s), where, h(x) = ( ) x1 + x 2 10 x 3 10 s = ( ) s 2 1 s 2 2
Solve the system of first order equations L = 2x 1 10 + λ 1 = 0 x 1 (8) L = 2x 2 50 + λ 1 = 0 x 2 (9) L = 2 + λ 2 = 0 x 3 (10) The Kuhn-Tucker conditions are λ 1, λ 2 0 and λ 1 (x 1 + x 2 10) = 0 (11) λ 2 (x 3 10) = 0 (12) Immediately from (10) and (12), λ 2 = 2 and x 3 = 10.
Two remaining cases to consider Inside the feasible set λ 1 = 0. From equations (8) and (9) find, x = (x 1, x 2, x 3 ) = (5, 25, 10) However, constraint x 1 + x 2 10, is violated Reject this solution! At the boundary, λ 1 > 0 Using constraint equation (11) x 1 + x 2 = 10. Find the system: 2x 1 30 + λ 1 = 0 2x 1 10 + λ 1 = 0, Solving gives, λ 1 = 20 and x 1 = 5, x 2 = 15.
The full nonlinear optimisation problem with equality constraints Method of Lagrange multipliers Dealing with Inequality Constraints and the Kuhn-Tucker conditions Second order conditions with constraints
Motivation Determine sufficient conditions for minima (maxima) given a set of constraints. Define the operator 2 to denote the Hessian matrices, 2 f := H f. What { are the necessary/sufficient conditions on 2 f and 2 } m g j that ensure a constrained minimum? j=1 Consider equality constraints first. Extend to inequality constraints.
Notation Hessian of the constraint equations in the upcoming results needs care. Define as follows, Constraint Hessian involving Lagrange multipliers, λ T 2 g(x) := m λ i 2 g i (x) = i=1 m λ i H gi (x) i=1
Necessary second order condition for a constrained minimum with equality constraints Theorem Suppose that x is a regular point of g C 2 and is a local minimum of f such that g(x) = 0 and x is a regular point of g. Then there exists a λ R m such that, f(x ) + λ T g(x ) = 0. Also, from the tangent plane T = {y : g(x )y = 0}, the Lagragian Hessian at x, 2 L(x ) := 2 f(x ) + λ T 2 g(x ), is positive semi-definite on T. i.e. y T 2 L(x )y 0 for all y T.
Sufficient second order condition for a constrained minimum with equality constraints Theorem Suppose that x and λ satisfy, f(x ) + λ T g(x ) = 0. Suppose further that the Lagragian Hessian at x, 2 L(x ) := 2 f(x ) + λ T 2 g(x ). If y T 2 L(x )y > 0 for all y T. Then f(x ) is a local minimum that satisfies g(x ) = 0.
Determine if positive/negative definite Options for determining definiteness of 2 L(x ) on the subspace T. Complete the square of y T 2 L(x )y. Compute the eigenvalues of y T 2 L(x )y. Example Max: Z = x 1 x 2 + x 1 x 3 + x 2 x 3 subject to: x 1 + x 2 + x 3 = 3.
Extension: Inequality constraints Problem: where f, g, h C 1. Min f(x) subject to: h(x) 0 g(x) = 0. Extend arguments to include both equality and inequality constraints. Notation: Define the index set J : Definition Let J (x ) be the index set of active constraints. h j (x ) = 0, j J ; h j (x ) < 0, j / J.
First order necessary conditions The point x is called a regular point of the constraints, g(x) = 0 and h(x) 0, if are linearly independent. Theorem { g i } m i=1 and { h j } j J Let x be a local minimum of f(x) subject the constraints g(x) = 0 and h(x) 0. Suppose further that x is a regular point of the constraints. Then there exists λ R m and µ R p with µ 0 such that, f(x ) + λ T g(x ) + µ T h(x ) = 0 µ T h(x ) = 0. (13)
Second order necessary conditions Theorem Let x be a regular point of g(x) = 0, h(x) 0 with f, g, h C 2. If x is a local minimum point of the fully constrained problem then there exists λ R m and µ R p such that (13) holds and 2 f(x ) + m λ i 2 g i (x ) + i=1 p µ j 2 h j (x ) j=1 is positive semi-definite on the tangent space of the active constraints. i.e. on the space, { } T = d R n g(x )d = 0; h j (x )d = 0, j J
Second order sufficient conditions Theorem Let x satisfy g(x) = 0, h(x) 0. Suppose there exists λ R m and µ R p such that µ 0 and (13) holds and 2 f(x ) + m λ i 2 g i (x ) + i=1 p µ j 2 h j (x ) is positive definite on the tangent space of the active constraints. i.e. on the space, { } T = d R n g(x )d = 0; h j (x )d = 0, j J, j=1 { } where J = j h j (x ) = 0, µ j > 0. Then x is a strict local minimum of f subject to the full constraints.
Ways to compute the sufficient conditions The nature of the Hessian of the Lagrangian function and with its auxiliary conditions can be determined. 1. Eigenvalues on the subspace T. 2. Find the nature of the Bordered Hessian.
Eigenvalues on subspaces Let {e i } m i=1 be an orthonormal basis of T. Then d T can be written, d = Ez, where z R m and, E = [e 1 e 2... e m ]. Then use, d T 2 Ld =...,
Bordered Hessian Alternative method to determining the behaviour of the Lagragian Hessian. Definition Given an objective function f with constraints g : R n R m, define the Bordered Hessian matrix, ( 2 L(x, λ) g(x) T ) B(x, λ) := g(x) 0 Find the nature of the Bordered Hessian.