University of California, Davis
Department of Agricultural and Resource Economics
ARE 5 Lecture Notes
Quirino Paris

Karush-Kuhn-Tucker conditions
Specification of a nonlinear programming problem (NPP)
Review of the traditional Lagrange method
Conversion of the NPP into a traditional Lagrange problem
KKT conditions in vector notation
Geometric interpretation of KKT conditions
Constraint qualification
Matching the KKT conditions with the structure of the Economic Equilibrium

Karush-Kuhn-Tucker conditions

The formulation of mathematical programming problems began in 1939 with the publication of the Master of Science thesis by William Karush (W. Karush (1939). "Minima of Functions of Several Variables with Inequalities as Side Constraints". M.Sc. Dissertation. Dept. of Mathematics, Univ. of Chicago, Chicago, Illinois). This work was ignored for a long time, to the point that H. W. Kuhn and A. W. Tucker published their 1951 article without citing the seminal work of Karush (H. W. Kuhn and A. W. Tucker, "Nonlinear Programming", Proc. Second Berkeley Symp. on Math. Statist. and Prob. (Univ. of Calif. Press, 1951), 481-492). Many authors still cite only Kuhn and Tucker when discussing Kuhn-Tucker conditions. The omission of Karush's name is unacceptable. His name belongs to the Karush-Kuhn-Tucker theory of nonlinear programming (KKT conditions, for short).

Specification of a nonlinear programming problem

Most of the time, in this course, we will be dealing with a primal nonlinear programming problem (NPP) like

$$
\begin{aligned}
\text{NPP:} \quad \max_{x_1, x_2, \ldots, x_n} \quad & f(x) \\
\text{subject to} \quad & g_1(x) \le b_1 \\
& g_2(x) \le b_2 \\
& \qquad \vdots \\
& g_m(x) \le b_m \\
& x_j \ge 0, \quad j = 1, \ldots, n
\end{aligned}
$$

$f(x)$ is a differentiable concave (quasiconcave) function called the objective function (to be optimized). The other relations are stated in terms of differentiable convex (quasiconvex) functions $g_i(x)$ and are called constraints (including the non-negativity of the endogenous variables $x_j$, $j = 1, \ldots, n$). As we will see soon, to solve the nonlinear problem stated above we will also have to state a constraint qualification to rule out strange behavior of the constraints.
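To make the specification concrete, the sketch below encodes one small NPP and tests a few points for feasibility. The objective, the constraints, and the supplies `b` are hypothetical numbers chosen only for illustration (they do not come from these notes); the function names are likewise illustrative.

```python
# Hypothetical NPP instance (illustration only, not from the notes):
#   max  f(x) = 10*x1 + 6*x2 - x1**2 - x2**2      (concave objective)
#   s.t. g1(x) = x1 + x2   <= 3                   (linear, hence convex)
#        g2(x) = 2*x1 + x2 <= 5
#        x1 >= 0, x2 >= 0

def f(x):
    """Objective function to be maximized."""
    x1, x2 = x
    return 10 * x1 + 6 * x2 - x1 ** 2 - x2 ** 2

def g(x):
    """Left-hand sides of the two constraints g_i(x) <= b_i."""
    x1, x2 = x
    return [x1 + x2, 2 * x1 + x2]

b = [3.0, 5.0]  # assumed right-hand sides

def is_feasible(x, tol=1e-9):
    """True if g_i(x) <= b_i for every i and x_j >= 0 for every j."""
    within_limits = all(gi <= bi + tol for gi, bi in zip(g(x), b))
    non_negative = all(xj >= -tol for xj in x)
    return within_limits and non_negative

candidates = [(0.0, 0.0), (2.0, 1.0), (3.0, 1.0), (-1.0, 0.0)]
report = [(x, is_feasible(x), f(x)) for x in candidates]
for x, ok, value in report:
    print(x, "feasible" if ok else "infeasible", value)
```

Feasibility alone says nothing about optimality; it only delimits the set over which the objective function is to be maximized.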
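Question: How should the first-order necessary conditions (FONC) of the NPP be stated? The answer constitutes the KKT conditions. To understand the FONC of the NPP we will transform the NPP into a problem subject to equations and unrestricted variables in order to be able to apply the traditional Lagrange method.

Review of the traditional Lagrange method
(See also the Symmetric Programming textbook.)

One constraint is sufficient for this review. For differentiable functions,

$$
\max_{x_1, x_2, \ldots, x_n} f(x) \quad \text{subject to} \quad g(x) = b .
$$

The Lagrange function is stated as

$$
L = f(x) + \lambda [\, b - g(x) \,]
$$

with FONC

$$
\begin{aligned}
\frac{\partial L}{\partial x_j} &= \frac{\partial f}{\partial x_j} - \lambda \frac{\partial g}{\partial x_j} = 0, \qquad j = 1, \ldots, n \\
\frac{\partial L}{\partial \lambda} &= b - g(x_1, \ldots, x_n) = 0 .
\end{aligned}
$$

The traditional Lagrange problem involves only equations and has no restricted variables. The solution of the FONC will provide the equilibrium values of the endogenous variables, $x_j^*$, $j = 1, \ldots, n$, and $\lambda^*$.

Recall that, at equilibrium,

$$
\frac{\partial L(x^*, \lambda^*)}{\partial b} = \frac{\partial f(x^*)}{\partial b} = \lambda^*
$$

and $\lambda^*$ may be either a positive or a negative value, depending on the derivatives in

$$
\lambda^* = \frac{\partial f / \partial x_j}{\partial g / \partial x_j} .
$$

Lagrange multipliers are (always) interpreted as shadow prices of the corresponding constraint. Figure 1 illustrates the shape of a Lagrange function.

Figure 1. The graph of a Lagrange function. y is the Lagrange multiplier.

Conversion of the NPP into a traditional Lagrange problem
(See also chapter 3, Symmetric Programming textbook.)

To make it simple, we will deal with only two constraints and two endogenous variables:

$$
\begin{aligned}
\max_{x_1, x_2} \quad & f(x) \\
\text{subject to} \quad & g_1(x) \le b_1 \\
& g_2(x) \le b_2 \\
& x_1 \ge 0, \quad x_2 \ge 0
\end{aligned}
$$

Let us hypothesize that the above NPP represents the problem of maximizing total revenue, $TR = f(x)$, subject to the technology, $g_1(x) \le b_1$, $g_2(x) \le b_2$, of transforming the supply of two inputs, $b_1, b_2$, into two outputs, $x_1, x_2$. To convert this NPP into a traditional Lagrange problem (with only equations and unrestricted variables), add squared slack variables, $s_h^2$, $h = 1, \ldots, 4$, to the inequalities to transform them into equations in unrestricted variables (with $b_3 \equiv 0$ and $b_4 \equiv 0$):

$$
\begin{aligned}
\max_{x_1, x_2} \quad & f(x) & & \text{multiplier} \\
\text{subject to} \quad & g_1(x) + s_1^2 = b_1 & & \lambda_1 \\
& g_2(x) + s_2^2 = b_2 & & \lambda_2 \\
& x_1 - s_3^2 = b_3 & & \lambda_3 \\
& x_2 - s_4^2 = b_4 & & \lambda_4
\end{aligned}
$$

In principle, in this version of the NPP there are no inequalities and no restricted variables. Therefore, we can apply the traditional Lagrange method in a straightforward way: the Lagrange function is

$$
L = f(x) + \lambda_1 [\, b_1 - s_1^2 - g_1(x) \,] + \lambda_2 [\, b_2 - s_2^2 - g_2(x) \,] + \lambda_3 (b_3 + s_3^2 - x_1) + \lambda_4 (b_4 + s_4^2 - x_2)
$$

with FONC

$$
\begin{aligned}
1.\quad & \frac{\partial L}{\partial x_1} = \frac{\partial f}{\partial x_1} - \lambda_1 \frac{\partial g_1}{\partial x_1} - \lambda_2 \frac{\partial g_2}{\partial x_1} - \lambda_3 = 0 \\
2.\quad & \frac{\partial L}{\partial x_2} = \frac{\partial f}{\partial x_2} - \lambda_1 \frac{\partial g_1}{\partial x_2} - \lambda_2 \frac{\partial g_2}{\partial x_2} - \lambda_4 = 0 \\
3.\quad & \frac{\partial L}{\partial s_1} = -2 \lambda_1 s_1 = 0 \\
4.\quad & \frac{\partial L}{\partial s_2} = -2 \lambda_2 s_2 = 0 \\
5.\quad & \frac{\partial L}{\partial s_3} = 2 \lambda_3 s_3 = 0 \\
6.\quad & \frac{\partial L}{\partial s_4} = 2 \lambda_4 s_4 = 0 \\
7.\quad & \frac{\partial L}{\partial \lambda_1} = b_1 - s_1^2 - g_1(x) = 0 \\
8.\quad & \frac{\partial L}{\partial \lambda_2} = b_2 - s_2^2 - g_2(x) = 0 \\
9.\quad & \frac{\partial L}{\partial \lambda_3} = b_3 + s_3^2 - x_1 = 0 \\
10.\quad & \frac{\partial L}{\partial \lambda_4} = b_4 + s_4^2 - x_2 = 0
\end{aligned}
$$

In the programming literature, the KKT conditions are presented as a set of inequalities and a set of complementary slackness relations. To achieve that specification it is sufficient to eliminate the slack variables introduced above and discuss the remaining relations using the structure of the Economic Equilibrium problem discussed in the Lecture Notes.

Beginning with FONC 7 (and similarly for FONC 8), moving the slack term to the right-hand side gives

$$
b_1 - g_1(x) = s_1^2 \ge 0 ,
$$

so that, once the slack variable is eliminated, $\partial L / \partial \lambda_1 = b_1 - g_1(x) \ge 0$. Note that FONC 7 is interpreted as $S \ge D$ using the Economic Equilibrium structure.

From FONC 9 (and similarly for FONC 10), with $b_3 \equiv 0$,

$$
x_1 = s_3^2 \ge 0 .
$$

Note that FONC 9 is interpreted as $Q \ge 0$ using the Economic Equilibrium structure.

Furthermore, multiplying FONC 7 by $\lambda_1$ (and similarly FONC 8 by $\lambda_2$),

$$
\lambda_1 \frac{\partial L}{\partial \lambda_1} = \lambda_1 [\, b_1 - s_1^2 - g_1(x) \,] = \lambda_1 [\, b_1 - g_1(x) \,] = 0
$$

because of FONC 3, $\lambda_1 s_1 = 0$ (and similarly FONC 4 for FONC 8). Note that this result is interpreted as the complementary slackness condition $P(S - D) = 0$ of the Economic Equilibrium structure.

Consider now FONC 1 and rewrite it as (similarly for FONC 2)

$$
\underbrace{\frac{\partial f}{\partial x_1}}_{MR} - \underbrace{\left( \lambda_1 \frac{\partial g_1}{\partial x_1} + \lambda_2 \frac{\partial g_2}{\partial x_1} \right)}_{MC} = \lambda_3 , \qquad \text{that is,} \qquad MR - MC = \lambda_3 .
$$

Therefore, the Lagrange multiplier $\lambda_3$ is interpreted as the (negative) Opportunity Cost of producing (or not producing) output $x_1$. Since $\lambda_3$ is non-positive according to the Economic Equilibrium structure, the elimination of $\lambda_3$ from FONC 1 (and similarly of $\lambda_4$ from FONC 2) results in the inequality

$$
\frac{\partial f}{\partial x_1} - \lambda_1 \frac{\partial g_1}{\partial x_1} - \lambda_2 \frac{\partial g_2}{\partial x_1} \le 0 , \qquad \text{that is,} \qquad MR \le MC .
$$

Furthermore, note that from FONC 9 (with $b_3 \equiv 0$) and FONC 5 (and similarly from FONC 10 and FONC 6)

$$
x_1 = s_3^2 , \qquad \lambda_3 x_1 = \lambda_3 s_3^2 = 0
$$

and, therefore, by multiplying FONC 1 by $x_1$ (and similarly FONC 2 by $x_2$) and using FONC 5,

$$
x_1 \left[ \frac{\partial f}{\partial x_1} - \lambda_1 \frac{\partial g_1}{\partial x_1} - \lambda_2 \frac{\partial g_2}{\partial x_1} - \lambda_3 \right] = x_1 \left[ \frac{\partial f}{\partial x_1} - \lambda_1 \frac{\partial g_1}{\partial x_1} - \lambda_2 \frac{\partial g_2}{\partial x_1} \right] = 0
$$

which corresponds to the complementary slackness condition $Q(MR - MC) = 0$ of the Economic Equilibrium structure.

There remains to sign the Lagrange multipliers $\lambda_1, \lambda_2$. Since, by hypothesis, $f(\cdot)$ is the total revenue function and $b_1, b_2$ are supplies of inputs, an increase of input supply will not decrease the total revenue and, therefore,

$$
\frac{\partial f}{\partial b_1} = \lambda_1 \ge 0 , \qquad \frac{\partial f}{\partial b_2} = \lambda_2 \ge 0 .
$$

We have derived all six (6) sets of KKT conditions, which match the six (6) sets of relations of the Economic Equilibrium structure. It is a little difficult, however, to list them all clearly as they were developed in this simplified problem. For this reason, we now restate the KKT conditions using vector notation.

KKT conditions in vector notation

Let the problem to solve be stated as the maximization of a differentiable and concave (quasi-concave) function $f(x)$ subject to a differentiable and convex (quasi-convex) vector function $g(x)$ such as

$$
\begin{aligned}
\max_{x} \quad & f(x) = f(x_1, \ldots, x_n) \\
\text{subject to} \quad & g(x) = \begin{bmatrix} g_1(x_1, \ldots, x_n) \\ \vdots \\ g_m(x_1, \ldots, x_n) \end{bmatrix} \le \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix} = b \\
& x \ge 0
\end{aligned}
$$

The dimension of the vector $x$ is $(n \times 1)$ while the dimension of the vector $b$ is $(m \times 1)$, with $m < n$ simply as an arbitrary choice. The Lagrange function is

$$
L = f(x) + \lambda' [\, b - g(x) \,]
$$

and the KKT conditions are (here $\partial g / \partial x$ is the $(n \times m)$ matrix whose columns are the gradients of the constraints)

$$
\begin{aligned}
1.\quad & \frac{\partial L}{\partial x} = \frac{\partial f}{\partial x} - \frac{\partial g}{\partial x} \lambda \le 0 \\
2.\quad & x' \frac{\partial L}{\partial x} = x' \left[ \frac{\partial f}{\partial x} - \frac{\partial g}{\partial x} \lambda \right] = 0 & & \text{Dual relations} \\
3.\quad & x \ge 0 \\
4.\quad & \frac{\partial L}{\partial \lambda} = b - g(x) \ge 0 \\
5.\quad & \lambda' \frac{\partial L}{\partial \lambda} = \lambda' [\, b - g(x) \,] = 0 & & \text{Primal relations} \\
6.\quad & \lambda \ge 0
\end{aligned}
$$

Geometric interpretation of KKT conditions

In order to be able to graph the relevant KKT conditions we return to the example with 2 constraints and 2 endogenous variables. The relevant KKT conditions are assumed to be binding:

$$
\begin{aligned}
\frac{\partial L}{\partial x_1} &= \frac{\partial f}{\partial x_1} - \lambda_1 \frac{\partial g_1}{\partial x_1} - \lambda_2 \frac{\partial g_2}{\partial x_1} = 0 \\
\frac{\partial L}{\partial x_2} &= \frac{\partial f}{\partial x_2} - \lambda_1 \frac{\partial g_1}{\partial x_2} - \lambda_2 \frac{\partial g_2}{\partial x_2} = 0
\end{aligned}
$$

It is convenient to rewrite these KKT conditions in gradient notation, where the subscript $x$ denotes the gradient with respect to $x$. Let $x^*$ be the optimal point. Then

$$
L_x(x^*) = f_x(x^*) - g_{1x}(x^*) \lambda_1^* - g_{2x}(x^*) \lambda_2^* = 0
$$

which corresponds to

$$
f_x(x^*) = g_{1x}(x^*) \lambda_1^* + g_{2x}(x^*) \lambda_2^*
$$

with $\lambda_1^* > 0$ and $\lambda_2^* > 0$. This means that the gradient of the objective function evaluated at the optimal point $x^*$ represents the main diagonal of the parallelogram constructed using the gradients of the constraints as a basis (mathematical ruler). Figure 2 illustrates this fact.

Figure 2. NPP and KKT conditions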
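The parallelogram statement can be checked numerically by solving for the two multipliers. The sketch below does so for a hypothetical problem that is not taken from these notes: $f(x) = 10x_1 + 6x_2 - x_1^2 - x_2^2$ with constraints $x_1 + x_2 \le 3$ and $2x_1 + x_2 \le 5$, both binding at the assumed optimal point $x^* = (2, 1)$. The helper name `multipliers` is illustrative.

```python
# Hypothetical example (not from the notes): at x* = (2, 1) both constraints
#   g1(x) = x1 + x2 <= 3   and   g2(x) = 2*x1 + x2 <= 5
# are binding for f(x) = 10*x1 + 6*x2 - x1**2 - x2**2.

def grad_f(x):
    """Gradient of the assumed objective function."""
    x1, x2 = x
    return (10 - 2 * x1, 6 - 2 * x2)

GRAD_G1 = (1.0, 1.0)  # gradient of g1(x) = x1 + x2
GRAD_G2 = (2.0, 1.0)  # gradient of g2(x) = 2*x1 + x2

def multipliers(gf, a1, a2):
    """Solve gf = lam1*a1 + lam2*a2 by Cramer's rule (2 x 2 case only)."""
    det = a1[0] * a2[1] - a2[0] * a1[1]
    if det == 0:
        raise ValueError("constraint gradients are linearly dependent")
    lam1 = (gf[0] * a2[1] - a2[0] * gf[1]) / det
    lam2 = (a1[0] * gf[1] - gf[0] * a1[1]) / det
    return lam1, lam2

x_star = (2.0, 1.0)
lam1, lam2 = multipliers(grad_f(x_star), GRAD_G1, GRAD_G2)
in_cone = lam1 >= 0 and lam2 >= 0
print("lambda1 =", lam1, "lambda2 =", lam2, "inside cone:", in_cone)

# A vector outside the cone needs a negative weight on one constraint gradient.
outside = multipliers((1.0, 3.0), GRAD_G1, GRAD_G2)
print("weights for (1, 3):", outside)
```

Both multipliers being non-negative is what places the objective gradient inside the cone of the constraint gradients; a negative weight, as for the second vector, signals a violation of the KKT conditions.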
Figure 2 is drawn in two dimensions. The parallelogram expresses the KKT conditions, with the gradient of the objective function being the main diagonal of the parallelogram constructed on the gradients of the constraints multiplied by the positive scalars $\lambda_1$ and $\lambda_2$. If the gradient of the objective function, $f_x(x^*)$, were to fall outside the cone generated by the two gradients of the binding constraints, $g_{1x}(x^*)$ and $g_{2x}(x^*)$, the KKT conditions would be violated. (In a space of two dimensions, a cone is an arrangement of two vectors that form an angle of less than 180 degrees.)

Constraint Qualification

In solving a nonlinear programming problem by KKT theory it is necessary to rule out a peculiar arrangement of the constraints' graph in the neighborhood of the optimal point. This assumption is called the constraint qualification. It is needed because, if the optimal point happens to lie at a cusp, the KKT conditions break down. The Webster dictionary says that a cusp is "a fixed point on a mathematical curve at which a point tracing the curve would exactly reverse its direction of motion". Hence, it is wise to stay away from this event by invoking the constraint qualification. A simple numerical example is sufficient to illustrate this unfavorable mathematical event:

$$
\max f(x) = x_1 \quad \text{subject to} \quad x_2 - (1 - x_1)^3 \le 0 , \quad x_1 \ge 0 , \quad x_2 \ge 0 .
$$

Figure 3 illustrates the problem's graph.

Figure 3. Violation of KKT conditions

This problem fails to satisfy the associated KKT conditions. In other words, it is not possible to express the gradient of the objective function as a linear combination (let alone a positive linear combination) of the gradient of the constraint. To verify this assertion we specify the Lagrange function

$$
L = x_1 + \lambda [\, (1 - x_1)^3 - x_2 \,]
$$

with KKT conditions for $x_1$

$$
\begin{aligned}
\frac{\partial L}{\partial x_1} &= 1 - 3 \lambda (1 - x_1)^2 \le 0 \\
x_1 \frac{\partial L}{\partial x_1} &= x_1 [\, 1 - 3 \lambda (1 - x_1)^2 \,] = 0
\end{aligned}
$$
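The expression for $\partial L / \partial x_1$ can be evaluated at the cusp, $x_1 = 1$, $x_2 = 0$. Feasibility requires $0 \le x_2 \le (1 - x_1)^3$ and hence $x_1 \le 1$, so $x_1 = 1$ is the largest feasible value of the objective. The sketch below tries a few values of the multiplier; the particular values of `lam` are arbitrary choices for illustration.

```python
def dL_dx1(x1, lam):
    """Derivative of L = x1 + lam*((1 - x1)**3 - x2) with respect to x1."""
    return 1 - 3 * lam * (1 - x1) ** 2

x1_star = 1.0  # largest feasible x1, located at the cusp

# Arbitrary trial multipliers: the term 3*lam*(1 - x1)**2 vanishes at x1 = 1,
# so the derivative stays equal to 1 whatever the value of lam.
values = [dL_dx1(x1_star, lam) for lam in (0.0, 1.0, 10.0, 1e6)]
kkt_possible = any(v <= 0 for v in values)
print(values, "KKT inequality can hold:", kkt_possible)
```

No choice of the multiplier among those tried brings the derivative down to zero or below, because the constraint gradient has a zero component in the $x_1$ direction at the cusp.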
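Looking at figure 3, the objective function is maximized at $x_1^* = 1$. But at this optimal value of the endogenous variable the KKT conditions result in a contradiction, because they require $1 \le 0$.

Matching the KKT conditions with the structure of the Economic Equilibrium

$$
\begin{aligned}
1.\quad & \frac{\partial L}{\partial x} = \frac{\partial f}{\partial x} - \frac{\partial g}{\partial x} \lambda \le 0 & & MR - MC \le 0 \\
2.\quad & x' \frac{\partial L}{\partial x} = x' \left[ \frac{\partial f}{\partial x} - \frac{\partial g}{\partial x} \lambda \right] = 0 & & Q(MR - MC) = 0 \\
3.\quad & x \ge 0 & & Q \ge 0 \\
4.\quad & \frac{\partial L}{\partial \lambda} = b - g(x) \ge 0 & & S \ge D \\
5.\quad & \lambda' \frac{\partial L}{\partial \lambda} = \lambda' [\, b - g(x) \,] = 0 & & P(S - D) = 0 \\
6.\quad & \lambda \ge 0 & & P \ge 0
\end{aligned}
$$

As stated in the Lecture Notes, the Economic Equilibrium structure is more general than the KKT conditions since it does not require differentiability, integrability, or optimization. This matching of the two structures is very important for all the material that we will discuss in this course. I recommend learning all these relationships by heart: how they look and what they mean geometrically and economically.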
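One way to practice the matching is to evaluate the six relations at a candidate point and label each with its Economic Equilibrium counterpart. The sketch below does this for a hypothetical problem that is not taken from these notes: $f(x) = 10x_1 + 6x_2 - x_1^2 - x_2^2$ subject to $x_1 + x_2 \le 3$, $2x_1 + x_2 \le 5$, $x \ge 0$. The function name `kkt_report` and the tolerance are illustrative assumptions.

```python
def kkt_report(x, lam, tol=1e-9):
    """Evaluate the six KKT relations for the assumed two-variable problem."""
    x1, x2 = x
    grad_f = (10 - 2 * x1, 6 - 2 * x2)   # marginal revenue, MR
    grad_g = ((1.0, 1.0), (2.0, 1.0))    # rows: gradients of g1 and g2
    g = (x1 + x2, 2 * x1 + x2)           # input demand, D
    b = (3.0, 5.0)                       # input supply, S (assumed values)

    # dL/dx_j = MR_j - MC_j, with MC_j = sum_i lam_i * dg_i/dx_j
    dL_dx = [grad_f[j] - sum(lam[i] * grad_g[i][j] for i in range(2))
             for j in range(2)]
    # dL/dlam_i = S_i - D_i
    dL_dlam = [b[i] - g[i] for i in range(2)]

    return {
        "1. MR - MC <= 0": all(d <= tol for d in dL_dx),
        "2. Q(MR - MC) = 0": abs(sum(xj * d for xj, d in zip(x, dL_dx))) <= tol,
        "3. Q >= 0": all(xj >= -tol for xj in x),
        "4. S >= D": all(d >= -tol for d in dL_dlam),
        "5. P(S - D) = 0": abs(sum(l * d for l, d in zip(lam, dL_dlam))) <= tol,
        "6. P >= 0": all(l >= -tol for l in lam),
    }

optimum = kkt_report((2.0, 1.0), (2.0, 2.0))
for name, holds in optimum.items():
    print(name, holds)

# A feasible but non-optimal point with zero prices fails relation 1.
interior = kkt_report((1.0, 1.0), (0.0, 0.0))
print("interior point satisfies relation 1:", interior["1. MR - MC <= 0"])
```

At the candidate $x = (2, 1)$ with $\lambda = (2, 2)$ all six relations hold, while the interior point with zero multipliers leaves marginal revenue above marginal cost.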