Outline

Roadmap for the NPP segment:
 1. Preliminaries: the role of convexity
 2. Existence of a solution
 3. Necessary conditions for a solution: inequality constraints
 4. The constraint qualification
 5. The Lagrangian approach
 6. Interpretation of the Lagrange multipliers
 7. Constraints that are binding vs. satisfied with equality
 8. Necessary conditions for a solution: equality and inequality constraints
 9. Sufficiency conditions
    a. Bordered Hessians
    b. Pseudo-concavity (the relationship between quasi- and pseudo-concavity)
 10. The basic sufficiency theorem

November 8, 2012
Existence

Preliminaries: convexity, quasi-ness, and constrained optimization.

The canonical form of the nonlinear programming problem (NPP):

    maximize $f(x)$ subject to $g(x) \le b$

Conditions for existence of a solution to the NPP:

Theorem: If $f : A \to \mathbb{R}$ is continuous and strictly quasi-concave, and $A$ is compact, nonempty, and convex, then $f$ attains a unique maximum on $A$.

Conditions relating a local maximum on the constraint set to a unique solution to the NPP:

Theorem: If $f : \mathbb{R}^n \to \mathbb{R}$ is continuous and strictly quasi-concave and $g_j : \mathbb{R}^n \to \mathbb{R}$ is quasi-convex for each $j$, then if $f$ attains a local maximum on $A = \{x \in \mathbb{R}^n : \forall j,\ g_j(x) \le b_j\}$, this maximum is the unique global maximum of $f$ on $A$.
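The existence theorem can be illustrated numerically. The objective, the constraint set, and the brute-force grid search below are my own choices for this sketch, not part of the slides: $f(x) = -(x_1-1)^2 - (x_2-1)^2$ is strictly concave (hence strictly quasi-concave), and $A = \{x \in [0,1]^2 : x_1 + x_2 \le 1\}$ is compact, nonempty, and convex, so the theorem guarantees a unique maximizer, which sits at $(0.5, 0.5)$ where the constraint binds.

```python
import numpy as np

# f(x) = -(x1-1)^2 - (x2-1)^2: strictly concave, hence strictly quasi-concave.
# A = {x in [0,1]^2 : x1 + x2 <= 1}: compact, nonempty, convex.
xs = np.linspace(0, 1, 201)
X1, X2 = np.meshgrid(xs, xs)
feasible = X1 + X2 <= 1
F = -((X1 - 1) ** 2 + (X2 - 1) ** 2)
F = np.where(feasible, F, -np.inf)          # restrict f to the feasible set A
i = np.unravel_index(np.argmax(F), F.shape)
x_star = np.array([X1[i], X2[i]])           # grid approximation of the maximizer
```

A grid search is of course only a sanity check, but it shows the maximizer is unique and lies on the boundary where the constraint holds with equality.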
Convexity and quasi-ness properties

[Figure: two panels. Left: the constraint set is not convex, so the FOC doesn't imply a max. Right: the upper contour set of the objective is not convex, so the FOC gives a local max but not the solution.]
Separating Hyperplanes

[Figure: a hyperplane separating the upper contour set of the objective function from the constraint set.]
We want to be able to separate, by a hyperplane, the upper contour set of the objective function from the constraint set.

Role of quasi-ness properties of functions:
- A function is quasi-concave (quasi-convex) if all of its upper (lower) contour sets are convex.
- A function is strictly quasi-concave (strictly quasi-convex) if
  - all of its upper (lower) contour sets are strictly convex sets (no flat edges), and
  - all of its level sets have empty interior.

To make everything work nicely:
 1. Require your objective function to be strictly quasi-concave.
 2. Define your feasible set as the intersection of lower contour sets of quasi-convex functions, e.g., $g_j(x) \le b_j$ for $j = 1, \dots, m$.
    - The intersection of convex sets is convex, so your feasible set is convex.
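The contour-set definition can be sketched numerically. The example function $x \mapsto x^3$ is my own choice (any monotone function works): it is quasi-concave because its value at any point of a segment is at least the smaller endpoint value, yet it is not concave.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 3          # monotone, hence quasi-concave, but not concave
ok = True
for _ in range(500):
    a, b = rng.uniform(-2, 2, size=2)
    t = rng.uniform()
    # quasi-concavity: value on a segment >= the worse endpoint value
    ok &= f(t * a + (1 - t) * b) >= min(f(a), f(b)) - 1e-12

# concavity fails: f(midpoint of 0 and 2) = 1 < 4 = average of f(0) and f(2)
concave_here = f(1.0) >= 0.5 * f(0.0) + 0.5 * f(2.0)
```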
KKT necessary conditions for a solution: inequality constraints

The KKT conditions in mantra format: (except for a bizarre exception) a necessary condition for $\bar{x}$ to solve a constrained maximization problem is that the gradient vector of the objective function at $\bar{x}$ belongs to the nonnegative cone defined by the gradient vectors of the constraints that are satisfied with equality at $\bar{x}$.

The KKT conditions in math format: if $\bar{x}$ solves the maximization problem and the constraint qualification holds at $\bar{x}$, then there exists a vector $\lambda \in \mathbb{R}^m_+$ such that
$$\nabla f(\bar{x}) = \lambda \, Jg(\bar{x}).$$
Moreover, $\lambda$ has the property: $g_j(\bar{x}) < b_j \implies \lambda_j = 0$.

The KKT conditions vs. the Lagrangian: exactly the same. The Lagrangian is the long-hand version.
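A minimal numerical check of the math-format statement, using a made-up problem (the functions and the point below are assumptions for illustration): maximize $f(x) = -(x_1-1)^2 - (x_2-1)^2$ subject to $g(x) = x_1 + x_2 \le 1$. The solution is $\bar{x} = (0.5, 0.5)$, where the single constraint holds with equality and $\nabla f(\bar{x})$ is a nonnegative multiple of $\nabla g(\bar{x})$.

```python
import numpy as np

x_bar = np.array([0.5, 0.5])
grad_f = 2 * (1 - x_bar)          # ∇f(x̄) for f(x) = -(x1-1)^2 - (x2-1)^2
grad_g = np.array([1.0, 1.0])     # ∇g(x̄) for g(x) = x1 + x2
lam = grad_f[0] / grad_g[0]       # the multiplier: here λ = 1
in_cone = lam >= 0 and np.allclose(grad_f, lam * grad_g)
```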
The KKT and the mantra

"A necessary condition ... is that the gradient vector of the objective function ... belongs to the nonnegative cone defined by the gradient vectors of the constraints satisfied with equality."

[Figure: three dotted curves represent level sets of three different objective functions; in each case the dashed arrows (gradients of the objectives) belong to the appropriate nonnegative cone, defined by the solid arrows (gradients of the binding constraints).]
Necessary conditions for a maximum

In this picture the mantra fails:
- $\nabla f$ doesn't belong to the cone defined by $\nabla g_1$ and $\nabla g_2$.
- The dashed line perpendicular to $\nabla f$ points into the interior of the constraint set.
- The angle between $dx$ and $\nabla f$ is acute, so the direction $dx$ increases $f$ and strictly decreases BOTH constraints (the angles between $dx$ and the $\nabla g_i$ are obtuse).
- Hence the tail of the red arrow can't solve the NPP.

[Figure: gradients $\nabla g_1$, $\nabla g_2$, $\nabla f$, and a direction $dx$ with $\theta(dx, \nabla g_2) > 90°$ and $\theta(dx, \nabla f) < 90°$.]
Important issues

The role of the constraint qualification (CQ):
 1. It ensures that the linearized version of the constraint set is, locally, a good approximation of the true nonlinear constraint set.
 2. The CQ will be satisfied at $\bar{x}$ if the gradients of the constraints that are satisfied with equality at $\bar{x}$ form a linearly independent set.

Other issues:
- Interpretation of the Lagrange multipliers
- Constraints satisfied with equality vs. binding constraints
- The KKT and equality constraints
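The linear-independence sufficient condition in item 2 is mechanical to check: stack the gradients of the active constraints and test for full row rank. The gradient values below are made up for illustration.

```python
import numpy as np

# gradients (as rows) of the constraints satisfied with equality at x̄;
# the numbers are hypothetical
J_indep = np.array([[1.0, 1.0],
                    [1.0, 0.0]])
J_dep = np.array([[1.0, 1.0],
                  [2.0, 2.0]])    # second gradient is a multiple of the first

def cq_full_rank(J):
    # linear independence of the active gradients <=> full row rank
    return np.linalg.matrix_rank(J) == J.shape[0]

cq1 = cq_full_rank(J_indep)   # CQ holds
cq2 = cq_full_rank(J_dep)     # this sufficient condition fails
```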
The Constraint Qualification

[Figure: the true (nonlinear) constraint set and its linearized version at $x^*$, with the gradients $\nabla f(x^*)$, $\nabla g_1(x^*)$, and $\nabla g_2(x^*)$.]
KKT and the Lagrangian Approach

The Lagrangian function for $f : \mathbb{R}^n \to \mathbb{R}$, $g : \mathbb{R}^n \to \mathbb{R}^m$, and $b \in \mathbb{R}^m$:
$$L(x, \lambda) = f(x) + \lambda(b - g(x)) = f(x) + \sum_{j=1}^m \lambda_j (b_j - g_j(x)).$$

The first-order conditions for an extremum of $L$ on $\mathbb{R}^n \times \mathbb{R}^m_+$ are: $(\bar{x}, \bar\lambda)$ s.t.

    for $i = 1, \dots, n$:  $\partial L(\bar{x}, \bar\lambda)/\partial x_i = 0$      (1)
    for $j = 1, \dots, m$:  $\partial L(\bar{x}, \bar\lambda)/\partial \lambda_j \ge 0$      (2)
    for $j = 1, \dots, m$:  $\bar\lambda_j \, \partial L(\bar{x}, \bar\lambda)/\partial \lambda_j = 0$      (3)

Compare KKT with (1): $\nabla f(\bar{x})^T = \bar\lambda^T Jg(\bar{x})$, i.e., $\partial f(\bar{x})/\partial x_i = \sum_{j=1}^m \bar\lambda_j \, \partial g_j(\bar{x})/\partial x_i$.

(2) says: for $j = 1, \dots, m$, $(b_j - g_j(\bar{x})) \ge 0$, i.e., $(b - g(\bar{x})) \ge 0$.

(3) is complementary slackness: $(b_j - g_j(\bar{x})) > 0$ implies $\bar\lambda_j = 0$.
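The three first-order conditions can be verified mechanically for a small made-up problem (the functions, point, and multipliers below are my own assumptions): maximize $f(x) = -(x_1-1)^2 - (x_2-1)^2$ subject to $g_1(x) = x_1 + x_2 \le 1$ and $g_2(x) = x_1 \le 5$. At $\bar{x} = (0.5, 0.5)$ the first constraint binds and the second is slack, so $\bar\lambda = (1, 0)$.

```python
import numpy as np

x = np.array([0.5, 0.5])          # candidate solution
lam = np.array([1.0, 0.0])        # candidate multipliers
b = np.array([1.0, 5.0])
grad_f = 2 * (1 - x)              # ∇f(x̄) = (1, 1)
Jg = np.array([[1.0, 1.0],        # row j of Jg is ∇g_j
               [1.0, 0.0]])
slack = b - np.array([x[0] + x[1], x[0]])

foc1 = np.allclose(grad_f - lam @ Jg, 0)   # (1): ∂L/∂x_i = 0
foc2 = bool(np.all(slack >= 0))            # (2): feasibility, b - g(x̄) >= 0
foc3 = np.allclose(lam * slack, 0)         # (3): complementary slackness
```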
The Lagrange multiplier as the shadow value of the constraint

Theorem: At a solution $(\bar{x}, \bar\lambda)$ to the NPP, $L(\bar{x}, \bar\lambda) = f(\bar{x})$.

Fact: $\bar\lambda_j$ is the shadow value of the $j$-th constraint:
$$M(b) = f(\bar{x}(b)) = L(\bar{x}(b), \bar\lambda(b)) = f(\bar{x}) + \bar\lambda(b - g(\bar{x})).$$
$$\frac{dM(b)}{db_j} = \frac{dL(\bar{x}(b), \bar\lambda(b))}{db_j} = \sum_{i=1}^{n} \left\{ \frac{\partial f(\bar{x}(b))}{\partial x_i} - \sum_{k=1}^{m} \bar\lambda_k(b) \frac{\partial g_k(\bar{x}(b))}{\partial x_i} \right\} \frac{d\bar{x}_i}{db_j} + \sum_{k=1}^{m} \big(b_k - g_k(\bar{x}(b))\big) \frac{d\bar\lambda_k}{db_j} + \bar\lambda_j(b)$$

- For each $i$, the term in curly brackets is zero, by the KKT conditions.
- For each $k$: if $(b_k - g_k(\bar{x}(b))) = 0$, then $\frac{d\bar\lambda_k}{db_j}(b_k - g_k(\bar{x}(b)))$ is zero; if $(b_k - g_k(\bar{x}(b))) > 0$, then $\bar\lambda_k(\cdot) = 0$ on a neighborhood of $b$, so $\frac{d\bar\lambda_k}{db_j} = 0$.
- The only term remaining is $\bar\lambda_j(b)$. Conclude that $\frac{dM(b)}{db_j} = \bar\lambda_j(b)$.
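The envelope result $dM(b)/db_j = \bar\lambda_j(b)$ can be checked by finite differences on a one-constraint example of my own (not from the slides): maximize $f(x) = -(x_1-1)^2 - (x_2-1)^2$ subject to $x_1 + x_2 \le b$ with $b \le 2$. The solution is $\bar{x}(b) = (b/2, b/2)$ with multiplier $\bar\lambda(b) = 2 - b$, so the value function is $M(b) = -(2-b)^2/2$.

```python
def M(b):
    # maximized value of f(x) = -(x1-1)^2 - (x2-1)^2 s.t. x1 + x2 <= b, b <= 2;
    # at the solution x̄(b) = (b/2, b/2) this equals -2(1 - b/2)^2 = -(2-b)^2/2
    return -(2 - b) ** 2 / 2

b, eps = 1.0, 1e-6
shadow = (M(b + eps) - M(b - eps)) / (2 * eps)   # numerical dM/db
lam = 2 - b                                      # λ̄(b) at the solution
```

The central difference matches the multiplier: relaxing the constraint by one unit raises the maximized value by $\bar\lambda$.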
Interpretation of the Lagrange multiplier: one inequality constraint

- The gradients of the objective and the constraint are colinear; $\lambda$ is the ratio of the norms of the two gradients.
- In both figures, the dashed line represents relaxation of the constraint by $\Delta b = 1$; the arrows represent $\nabla f(x)$ and $\nabla g(x)$.
- In one figure $\|\nabla f(x)\|$ is small and $\|\nabla g(x)\|$ is big; in the other, $\|\nabla f(x)\|$ is big and $\|\nabla g(x)\|$ is small. Which is which?

[Figure 1: Interpretation of the Lagrange multiplier.]
Interpretation of the Lagrange multiplier: one inequality constraint

$\lambda$ is the shadow value of the constraint: "bang for the buck."

[Figure 1, annotated: in both panels the constraint relaxes from $g(x^*) = b$ to $g(x^*) = b + 1$.
One panel: $\|\nabla f(x^*)\|$ small and $\|\nabla g(x^*)\|$ big, so $\lambda = \frac{\|\nabla f(x^*)\|}{\|\nabla g(x^*)\|}$ is small: the unit relaxation moves the boundary only a little, and $f$ rises by a small amount.
Other panel: $\|\nabla f(x^*)\|$ big and $\|\nabla g(x^*)\|$ small, so $\lambda$ is big: the same unit relaxation produces a BIG increase in $f$.]
Interpretation of the Lagrange multipliers: two inequality constraints

- The gradient of the objective and one constraint are nearly colinear.
- $\lambda_1$ is relatively large: a relatively large gain from $\Delta b_1 = \varepsilon$.
- $\lambda_2$ is relatively small: a relatively small gain from $\Delta b_2 = \varepsilon$.

[Figure: constraint sets $g_1(x) \le b_1$ and $g_2(x) \le b_2$, each relaxed by $\epsilon$, with gradients $\nabla g_1(x^*)$, $\nabla g_2(x^*)$, and $\nabla f(x^*)$. Caption: Two multipliers, one big, one small.]
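When two constraints bind, the multipliers come from solving the linear system $\nabla f(\bar{x}) = \lambda_1 \nabla g_1(\bar{x}) + \lambda_2 \nabla g_2(\bar{x})$. The problem below is made up for illustration: maximize $f(x) = -(x_1-3)^2 - (x_2-1)^2$ subject to $g_1(x) = x_1 + x_2 \le 2$ and $g_2(x) = x_1 \le 1.5$; both constraints bind at $\bar{x} = (1.5, 0.5)$.

```python
import numpy as np

x_bar = np.array([1.5, 0.5])
grad_f = np.array([-2 * (x_bar[0] - 3),
                   -2 * (x_bar[1] - 1)])   # ∇f(x̄) = (3, 1)
Jg = np.array([[1.0, 1.0],                 # ∇g1
               [1.0, 0.0]])                # ∇g2
# columns of Jg.T are the constraint gradients; solve ∇f = λ1∇g1 + λ2∇g2
lam = np.linalg.solve(Jg.T, grad_f)        # λ = (1, 2)
```

Here $\nabla f$ is more nearly colinear with $\nabla g_2$, and correspondingly $\lambda_2 > \lambda_1$.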
Constraint satisfied with equality vs. binding

Informal definition: a constraint is binding if the maximized value of the objective increases when the constraint is slightly relaxed.

[Figure: constraint sets $g_1(x) \le b_1$ and $g_2(x) \le b_2$, with gradients $\nabla g_1(x^*)$, $\nabla g_2(x^*)$, and $\nabla f(x^*)$ at the optimum.]
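A one-dimensional sketch of the distinction (the example is mine, not from the slides): with $f(x) = -(x-1)^2$ and constraint $x \le 1$, the optimum $x = 1$ satisfies the constraint with equality, yet the constraint is not binding: relaxing it leaves the maximized value unchanged, and the multiplier is $\lambda = f'(1) = 0$.

```python
def M(b):
    # maximized value of f(x) = -(x-1)^2 subject to x <= b
    x = min(b, 1.0)               # the unconstrained peak is at x = 1
    return -(x - 1.0) ** 2

gain = M(1.5) - M(1.0)            # relax b from 1 to 1.5: no gain
lam = -2 * (1.0 - 1.0)            # f'(x) at x = 1, i.e. λ = 0
```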
Relax the first constraint: the optimum shifts northeast

[Figure: the boundary $g_1(x) = b_1$ shifts out to $g_1(x) = b_1 + \epsilon$; the optimum moves northeast along $g_2(x) = b_2$.]
Relax the second constraint: the optimum doesn't move

[Figure: the boundary $g_2(x) = b_2$ shifts out to $g_2(x) = b_2 + \epsilon$; the optimum stays put, so the second constraint is satisfied with equality but not binding.]
KKT with equality constraints defined as inequalities

Technically, you don't absolutely need to introduce equality constraints:
- $g(x) = b$ iff $g(x) \le b$ and $-g(x) \le -b$, so you can represent an equality constraint as two inequality constraints and apply the mantra as usual.
- But there are high costs of doing this:
  - you lose uniqueness of the Lagrange multiplier;
  - the full-rank sufficient condition for the CQ is never satisfied (the gradients of the two inequality constraints are $\nabla g$ and $-\nabla g$, which are always linearly dependent).

[Figure: the level set $g(x) = b$ as the intersection of the upper contour set of $g$ corresponding to $b$ and the lower contour set of $g$ corresponding to $b$, with $\nabla f$ and $\nabla g$ drawn at several points $x^1, x^2, x^3$.]
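A sketch of the lost uniqueness (the numbers are made up): if the true equality multiplier is $\mu$, then writing the constraint as the pair $g(x) \le b$ and $-g(x) \le -b$ admits any multiplier pair $(\lambda^+, \lambda^-) = (\mu + t, t)$ with $t \ge 0$, since stationarity only pins down the difference $\lambda^+ - \lambda^-$.

```python
import numpy as np

mu = 2.0                          # hypothetical equality multiplier
grad_g = np.array([1.0, 1.0])
grad_f = mu * grad_g              # stationarity: ∇f = μ∇g
all_valid = all(
    np.allclose(grad_f, (mu + t) * grad_g - t * grad_g)  # λ⁺ = μ+t, λ⁻ = t
    and mu + t >= 0 and t >= 0
    for t in [0.0, 1.0, 5.0]
)
```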
KKT necessary conditions: equality and inequality constraints

The KKT conditions in full generality. Let $f : \mathbb{R}^n \to \mathbb{R}$, $g : \mathbb{R}^n \to \mathbb{R}^m$, and $h : \mathbb{R}^n \to \mathbb{R}^l$ be continuously differentiable, with $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^l$.

The canonical form of the nonlinear programming problem (NPP):

    maximize $f(x)$ subject to $g(x) \le b$ and $h(x) = c$

The KKT conditions: if $\bar{x}$ solves the maximization problem and the constraint qualification holds at $\bar{x}$, then there exist vectors $\lambda \in \mathbb{R}^m_+$ and $\mu \in \mathbb{R}^l$ such that
$$\nabla f(\bar{x}) = \lambda \, Jg(\bar{x}) + \mu \, Jh(\bar{x}).$$
Moreover, $\lambda$ has the property: $g_j(\bar{x}) < b_j \implies \lambda_j = 0$.

Note the two differences between $\lambda$ and $\mu$:
- $\lambda$ must be nonnegative, while $\mu$ is unrestricted in sign;
- there is no complementary slackness condition on $\mu$.
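A numerical check on a made-up mixed problem (the functions and point are my assumptions): maximize $f(x) = -x_1^2 - x_2^2$ subject to $h(x) = x_1 + x_2 = 1$ and $g(x) = x_1 \le 1$. The solution is $\bar{x} = (0.5, 0.5)$; the inequality is slack, so $\lambda = 0$, while the equality multiplier is $\mu = -1$, which is allowed precisely because $\mu$ is not sign-restricted.

```python
import numpy as np

x_bar = np.array([0.5, 0.5])
grad_f = -2 * x_bar               # ∇f(x̄) = (-1, -1)
grad_g = np.array([1.0, 0.0])     # ∇g for the slack inequality x1 <= 1
grad_h = np.array([1.0, 1.0])     # ∇h for the equality x1 + x2 = 1
lam, mu = 0.0, -1.0               # λ = 0 by complementary slackness; μ < 0 is fine
stationary = np.allclose(grad_f, lam * grad_g + mu * grad_h)
```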