
UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

Robert M. Freund

February 2016

© 2016 Massachusetts Institute of Technology. All rights reserved.

1 Introduction

We consider constrained optimization problems of the form:

(P)  min_x  f(x)
     s.t.   g(x) ≤ 0
            h(x) = 0
            x ∈ X,

where X is an open set and g(x) = (g_1(x), ..., g_m(x)) : R^n → R^m, h(x) = (h_1(x), ..., h_l(x)) : R^n → R^l. Let F denote the feasible region of (P), namely,

F := {x ∈ X : g(x) ≤ 0, h(x) = 0}.

Then the problem (P) can be written as min_{x ∈ F} f(x).

Recall that x̄ is a local minimum of (P) if there exists ε > 0 such that f(x̄) ≤ f(y) for all y ∈ F ∩ B(x̄, ε). Also recall that local and global minima and maxima, strict and non-strict, are defined analogously.

We will often use the following shorthand notation:

∇g(x) = [∇g_1(x)^T ; ... ; ∇g_m(x)^T]   and   ∇h(x) = [∇h_1(x)^T ; ... ; ∇h_l(x)^T].

Here ∇g(x) ∈ R^{m×n} and ∇h(x) ∈ R^{l×n} are each Jacobian matrices, the i-th row of which is the transpose of the corresponding gradient.
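To make the Jacobian convention above concrete, here is a minimal numerical sketch (not part of the original notes); the constraint map used is hypothetical and chosen only to illustrate the row-of-gradient-transposes layout, with a finite-difference sanity check.

```python
import numpy as np

# Hypothetical constraint map g : R^2 -> R^2, used only to illustrate notation.
def g(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0,   # g_1(x)
                     x[0] - x[1]])              # g_2(x)

def grad_g1(x):
    return np.array([2.0*x[0], 2.0*x[1]])

def grad_g2(x):
    return np.array([1.0, -1.0])

x = np.array([0.6, 0.8])

# Jacobian of g at x: the i-th row is the transpose of the gradient of g_i.
jac_g = np.vstack([grad_g1(x), grad_g2(x)])     # shape (m, n) = (2, 2)

# Forward-difference check of the Jacobian.
eps = 1e-6
fd = np.array([(g(x + eps*np.eye(2)[j]) - g(x)) / eps for j in range(2)]).T
print(jac_g)
print(np.allclose(jac_g, fd, atol=1e-4))        # True
```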

2 Necessary Optimality Conditions

2.1 Geometric Necessary Conditions

A set C ⊂ R^n is a cone if for every x ∈ C, αx ∈ C for any α ≥ 0. A set C is a convex cone if C is a cone and C is a convex set. Suppose x̄ ∈ F. We have the following definitions:

F_0 := {d : ∇f(x̄)^T d < 0} is the cone of improving directions of f(x) at x̄ (not including the origin 0 in the cone).

I := {i : g_i(x̄) = 0} is the set of indices of the binding inequality constraints at x̄.

G_0 := {d : ∇g_i(x̄)^T d < 0 for all i ∈ I} is the cone of inward pointing directions for the binding constraints at x̄ (not including the origin 0 in the cone).

H_0 := {d : ∇h_i(x̄)^T d = 0 for all i = 1, ..., l} is the set of tangent directions for the equality constraints at x̄.

Theorem 2.1 (Geometric First-order Necessary Conditions) If x̄ is a local minimum of (P) and either:
(i) h(x) is a linear function, i.e., h(x) = Ax − b for A ∈ R^{l×n}, or
(ii) ∇h_i(x̄), i = 1, ..., l, are linearly independent,
then F_0 ∩ G_0 ∩ H_0 = ∅.

Proof: Assume h(x) is the linear function h(x) = Ax − b, whereby ∇h_i(x̄) = A_i for i = 1, ..., l and H_0 = {d : Ad = 0}. Suppose d ∈ F_0 ∩ G_0 ∩ H_0, and suppose that λ > 0 is sufficiently small. Then g_i(x̄ + λd) < g_i(x̄) = 0 for i ∈ I because d ∈ G_0. Also g_i(x̄ + λd) < 0 for i ∉ I from continuity and the fact that g_i(x̄) < 0. Furthermore, h(x̄ + λd) = (Ax̄ − b) + λAd = 0.

Therefore x̄ + λd ∈ F for all λ > 0 sufficiently small. On the other hand, for all sufficiently small λ > 0 it holds that f(x̄ + λd) < f(x̄). This contradicts the assumption that x̄ is a local minimum of (P).

The proof when h(x) is nonlinear is rather involved, as it relies on the Implicit Function Theorem. We present the proof in Section 6 at the end of this lecture note.

Note that Theorem 2.1 essentially states that if a point x̄ is (locally) optimal, there is no direction d which is both a feasible direction (i.e., such that g(x̄ + λd) ≤ 0 and h(x̄ + λd) = 0 for small λ > 0) and an improving direction (i.e., such that f(x̄ + λd) < f(x̄) for small λ > 0), which makes sense intuitively.

2.2 Separation of Convex Sets

We will shortly attempt to restate the geometric necessary local optimality conditions F_0 ∩ G_0 ∩ H_0 = ∅ into a constructive and computable algebraic statement about the gradients of the objective function and the constraint functions. The vehicle that will make this happen involves the separation theory of convex sets. We start with some definitions:

If p ≠ 0 is a vector in R^n and α is a scalar, H := {x ∈ R^n : p^T x = α} is a hyperplane, and H^+ = {x ∈ R^n : p^T x ≥ α}, H^− = {x ∈ R^n : p^T x ≤ α} are halfspaces.

Let S and T be two non-empty sets in R^n. A hyperplane H = {x : p^T x = α} is said to separate S and T if p^T x ≥ α for all x ∈ S and p^T x ≤ α for all x ∈ T, i.e., if S ⊂ H^+ and T ⊂ H^−. If, in addition, S ∪ T ⊄ H, then H is said to properly separate S and T. H is said to strictly separate S and T if p^T x > α for all x ∈ S and p^T x < α for all x ∈ T. H is said to strongly separate S and T if for some ε > 0 it holds that p^T x ≥ α + ε for all x ∈ S and p^T x ≤ α − ε for all x ∈ T.
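The following small sketch (ours, not from the notes) checks these definitions on finite point samples; the hyperplane and the two point clouds are made up for illustration, and for finite samples strict separation automatically implies strong separation.

```python
import numpy as np

def separation_type(p, alpha, S, T, eps=1e-9):
    """Classify how the hyperplane {x : p^T x = alpha} relates to the finite
    point samples S and T (rows are points), per the definitions above."""
    vS = S @ p                      # values p^T x over S
    vT = T @ p                      # values p^T x over T
    if vS.min() < alpha - eps or vT.max() > alpha + eps:
        return "does not separate S and T"
    if vS.min() > alpha + eps and vT.max() < alpha - eps:
        return "strongly (hence strictly) separates S and T"
    return "separates S and T"

# Illustrative data: two clouds on either side of the hyperplane x_1 = 0.
rng = np.random.default_rng(0)
S = rng.normal(loc=[+2.0, 0.0], scale=0.3, size=(50, 2))
T = rng.normal(loc=[-2.0, 0.0], scale=0.3, size=(50, 2))
print(separation_type(np.array([1.0, 0.0]), 0.0, S, T))
```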

Proposition 2.1 Let S be a nonempty closed convex set in R^n, and y ∉ S. Then there exists a unique point x̄ ∈ S of minimum distance to y. Furthermore, x̄ is the minimizing point if and only if

(y − x̄)^T (x − x̄) ≤ 0 for all x ∈ S.   (1)

Proof: Let f(x) = ‖x − y‖^2. Let x̂ be an arbitrary point in S, and let Ŝ = S ∩ {x : ‖x − y‖ ≤ ‖x̂ − y‖}. Then Ŝ is a compact set and Ŝ ⊂ S. Furthermore, since f(x) is a continuous function it follows from the Weierstrass Theorem that f(·) attains its minimum over the set Ŝ at some point x̄ ∈ Ŝ. As a consequence of the construction of Ŝ it also follows that x̄ minimizes f(·) over the set S. Note that x̄ ≠ y. Furthermore, because f(·) is a strictly convex function, x̄ is unique.

We now establish that x̄ is the point in S that minimizes the distance to y if and only if (1) holds. Let us first assume that (1) holds; then note that for any x ∈ S,

‖x − y‖^2 = ‖(x − x̄) − (y − x̄)‖^2 = ‖x − x̄‖^2 + ‖y − x̄‖^2 − 2(x − x̄)^T (y − x̄) ≥ ‖x̄ − y‖^2.

Therefore f(x) ≥ f(x̄), and so x̄ is the point in S that minimizes the distance to y.

Conversely, let us assume that x̄ is the point in S that minimizes the distance to y. For any x ∈ S, λx + (1 − λ)x̄ ∈ S for any λ ∈ [0, 1]. Therefore ‖λx + (1 − λ)x̄ − y‖ ≥ ‖x̄ − y‖. Thus:

‖x̄ − y‖^2 ≤ ‖λx + (1 − λ)x̄ − y‖^2 = ‖λ(x − x̄) + (x̄ − y)‖^2 = λ^2 ‖x − x̄‖^2 + 2λ(x − x̄)^T (x̄ − y) + ‖x̄ − y‖^2,

which when rearranged yields:

λ^2 ‖x − x̄‖^2 ≥ 2λ(y − x̄)^T (x − x̄).

Restricting attention to when λ > 0, we divide by λ to yield:

λ ‖x − x̄‖^2 ≥ 2(y − x̄)^T (x − x̄).

Now let λ → 0, whereby we obtain (y − x̄)^T (x − x̄) ≤ 0.
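Here is a quick numerical illustration of Proposition 2.1 (ours, not from the notes), using the Euclidean unit ball as S, for which the minimum-distance point from y has a closed form; the variational inequality (1) is then sampled at random points of S.

```python
import numpy as np

def project_to_unit_ball(y):
    """Closed-form minimum-distance point in S = {x : ||x|| <= 1} from y."""
    ny = np.linalg.norm(y)
    return y if ny <= 1.0 else y / ny

y = np.array([3.0, -4.0])           # a point outside S
xbar = project_to_unit_ball(y)      # the unique closest point in S, here (0.6, -0.8)

# Sample check of characterization (1): (y - xbar)^T (x - xbar) <= 0 for all x in S.
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.normal(size=2)
    x = x / max(1.0, np.linalg.norm(x))   # force x into the unit ball
    assert (y - xbar) @ (x - xbar) <= 1e-9
print("inequality (1) holds at all sampled points; xbar =", xbar)
```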

Theorem 2.2 Let S be a nonempty closed convex set in R^n, and suppose that y ∉ S. Then there exists p ≠ 0 and α such that H = {x : p^T x = α} strongly separates S and {y}.

Proof: Let x̄ be the point in S that minimizes the distance from y to the set S. Note that x̄ ≠ y. Let p = y − x̄, α = (1/2)(p^T y + p^T x̄), and ε = (1/2)(p^T y − p^T x̄) = (1/2)‖y − x̄‖^2 > 0. From Proposition 2.1 it follows that for any x ∈ S, (x − x̄)^T (y − x̄) ≤ 0, and so

p^T x = (y − x̄)^T x ≤ (y − x̄)^T x̄ = p^T x̄ = α − ε.

Therefore p^T x ≤ α − ε for all x ∈ S. On the other hand, p^T y = α + ε, which completes the proof.

We now use Theorem 2.2 to prove two useful results about the solvability of linear inequalities:

Theorem 2.3 (Farkas Lemma) Given an m×n matrix A and an n-vector c, exactly one of the following two systems has a solution:

(i) Ax ≤ 0, c^T x > 0, or
(ii) A^T y = c, y ≥ 0.

Proof: First note that both systems (i) and (ii) cannot have a solution, since then we would have 0 < c^T x = y^T Ax ≤ 0. Suppose the system (ii) has no solution. Let S = {x : x = A^T y for some y ≥ 0}. Then c ∉ S. S is easily seen to be a convex set. Also, S is a closed set.¹ Therefore there exists p ≠ 0 and α such that c^T p > α and p^T x ≤ α for all x ∈ S. It then follows by construction of S that p^T (A^T y) = (Ap)^T y ≤ α for all y ≥ 0. If (Ap)_i > 0 for some i, one could set y_i sufficiently large so that (Ap)^T y > α, a contradiction. Thus Ap ≤ 0. Taking y = 0, we also have that α ≥ 0, and so c^T p > 0, whereby p is a solution of (i).

¹ For an exact proof of this fact, see Appendix B.3 of Nonlinear Programming by Dimitri Bertsekas, Athena Scientific.
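As a computational aside (not in the original notes), the two Farkas alternatives can be distinguished numerically with a pair of linear programs; the sketch below uses scipy.optimize.linprog and an illustrative pair (A, c).

```python
import numpy as np
from scipy.optimize import linprog

def farkas_alternative(A, c):
    """Decide which Farkas alternative holds for the pair (A, c).

    Returns ('ii', y) if A^T y = c, y >= 0 is feasible, and otherwise
    ('i', x) where x satisfies Ax <= 0 and c^T x > 0."""
    m, n = A.shape
    # System (ii) as a feasibility LP with zero objective.
    res = linprog(c=np.zeros(m), A_eq=A.T, b_eq=c,
                  bounds=[(0, None)] * m, method="highs")
    if res.status == 0:
        return "ii", res.x
    # System (i): maximize c^T x over {Ax <= 0, -1 <= x_j <= 1}; the box just
    # keeps the LP bounded, and any x with c^T x > 0 certifies alternative (i).
    res = linprog(c=-c, A_ub=A, b_ub=np.zeros(m),
                  bounds=[(-1, 1)] * n, method="highs")
    return "i", res.x

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
c = np.array([1.0, -1.0])
print(farkas_alternative(A, c))   # ('i', x) with x ≈ (0, -1): Ax <= 0, c^T x = 1 > 0
```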

Lemma 2.1 [Key Lemma] Given matrices Ā, B, and H of appropriate dimensions, exactly one of the following two systems has a solution:

(i) Āx < 0, Bx ≤ 0, Hx = 0, or
(ii) Ā^T u + B^T w + H^T v = 0, u ≥ 0, w ≥ 0, e^T u = 1.

Proof: It is easy to show that both (i) and (ii) cannot have a solution. Suppose (i) does not have a solution. Then the system:

Āx + eθ ≤ 0, θ > 0, Bx ≤ 0, Hx ≤ 0, −Hx ≤ 0

has no solution. This system can be re-written in the form:

[Ā e; B 0; H 0; −H 0] (x; θ) ≤ 0,   (0, ..., 0, 1) (x; θ) > 0.

From Farkas Lemma, there exists a vector (u; w; v_1; v_2) ≥ 0 such that:

[Ā e; B 0; H 0; −H 0]^T (u; w; v_1; v_2) = (0, ..., 0, 1).

This can be rewritten as:

Ā^T u + B^T w + H^T (v_1 − v_2) = 0,   e^T u = 1.

Setting v = v_1 − v_2 completes the proof.

We conclude this subsection with some extensions/consequences of Theorem 2.2. The first extension is to the strong separation of two convex sets when one of the sets is closed and bounded, as follows:

Theorem 2.4 Suppose S_1 and S_2 are disjoint nonempty closed convex sets and S_1 is bounded. Then S_1 and S_2 can be strongly separated by a hyperplane.

Proof: Let T = {x ∈ R^n : x = y − z, where y ∈ S_1, z ∈ S_2}. Then it is easy to show that T is a convex set. We also claim that T is a closed set. To see this, let {x^i} ⊂ T, and suppose x̄ = lim_{i→∞} x^i. Then x^i = y^i − z^i for {y^i} ⊂ S_1 and {z^i} ⊂ S_2. By the Weierstrass Theorem, some subsequence of {y^i} converges to a point ȳ ∈ S_1. Then z^i = y^i − x^i → ȳ − x̄ (over this subsequence), so that z̄ = ȳ − x̄ is a limit point of {z^i}. Since S_2 is also closed, z̄ ∈ S_2, and therefore x̄ = ȳ − z̄ ∈ T, proving that T is a closed set.

By hypothesis S_1 ∩ S_2 = ∅, whereby 0 ∉ T. Since T is convex and closed, there exists a hyperplane H = {x : p^T x = ᾱ} such that p^T x > ᾱ for x ∈ T and p^T 0 < ᾱ (and hence ᾱ > 0). Let y ∈ S_1 and z ∈ S_2. Then x = y − z ∈ T, and so p^T (y − z) = p^T x > ᾱ > 0 for any y ∈ S_1 and z ∈ S_2. Let α_1 = inf{p^T y : y ∈ S_1} and α_2 = sup{p^T z : z ∈ S_2}, and note that 0 < ᾱ ≤ α_1 − α_2. Define α = (1/2)(α_1 + α_2) and ε = (1/2)ᾱ > 0. Then for all y ∈ S_1 and z ∈ S_2 we have

p^T y ≥ α_1 = (1/2)(α_1 + α_2) + (1/2)(α_1 − α_2) ≥ α + (1/2)ᾱ = α + ε

and

p^T z ≤ α_2 = (1/2)(α_1 + α_2) − (1/2)(α_1 − α_2) ≤ α − (1/2)ᾱ = α − ε.

Therefore S_1 and S_2 are strongly separated by the hyperplane H := {x ∈ R^n : p^T x = α}.

Corollary 2.1 If S is a closed convex set in R^n, then S is the intersection of all halfspaces that contain S.

Theorem 2.5 Let S ⊂ R^n, and let C be the intersection of all halfspaces containing S. Then C is the smallest closed convex set containing S.

2.3 Algebraic Necessary Conditions

In this subsection we restate the geometric necessary local optimality conditions of Theorem 2.1, F_0 ∩ G_0 ∩ H_0 = ∅, into a constructive and computable algebraic statement about the gradient of the objective function and gradients of the constraint functions.

Theorem 2.6 (Fritz John Necessary Conditions) Let x̄ be a feasible solution of (P). If x̄ is a local minimum of (P), then there exists (u_0, u, v) such that

u_0 ∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0,

u_0 ≥ 0, u ≥ 0, (u_0, u, v) ≠ 0,

u_i g_i(x̄) = 0, i = 1, ..., m.

(Note that the first equation can be rewritten as u_0 ∇f(x̄) + ∇g(x̄)^T u + ∇h(x̄)^T v = 0.)

Proof: If the vectors ∇h_i(x̄) are linearly dependent, then there exists v ≠ 0 such that Σ_{i=1}^l v_i ∇h_i(x̄) = 0. Setting (u_0, u) = 0 establishes the result. Suppose now that the vectors ∇h_i(x̄) are linearly independent. Then we can apply Theorem 2.1 and assert that F_0 ∩ G_0 ∩ H_0 = ∅. Assume for simplicity that I = {1, ..., p}. Let

Ā = [∇f(x̄)^T ; ∇g_1(x̄)^T ; ... ; ∇g_p(x̄)^T],   H = [∇h_1(x̄)^T ; ... ; ∇h_l(x̄)^T].

Then there is no d that satisfies Ād < 0, Hd = 0. From the Key Lemma (Lemma 2.1), there exists (u_0, u_1, ..., u_p) and (v_1, ..., v_l) such that

u_0 ∇f(x̄) + Σ_{i=1}^p u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0,

with u_0 + u_1 + ⋯ + u_p = 1 and (u_0, u_1, ..., u_p) ≥ 0. Define u_{p+1} = ⋯ = u_m = 0. Then (u_0, u) ≥ 0, (u_0, u) ≠ 0, and for any i, either g_i(x̄) = 0 or u_i = 0. Furthermore,

u_0 ∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0.

When u_0 > 0, the Fritz John conditions state that the gradient of the objective function and the gradients of the active constraints must line up in an intuitively obvious way having to do with the local geometry of feasible directions at the point x̄.

The condition u_i g_i(x̄) = 0, i = 1, ..., m, can be equivalently stated as "either g_i(x̄) = 0 or u_i = 0, i = 1, ..., m," and is called the complementary slackness condition or simply the complementarity condition.

Theorem 2.7 (Karush-Kuhn-Tucker (KKT) Necessary Conditions) Let x̄ be a feasible solution of (P) and let I = {i : g_i(x̄) = 0}. Further, suppose that ∇h_i(x̄) for i = 1, ..., l and ∇g_i(x̄) for i ∈ I are linearly independent. If x̄ is a local minimum, there exists (u, v) such that

∇f(x̄) + ∇g(x̄)^T u + ∇h(x̄)^T v = 0,

u ≥ 0,

u_i g_i(x̄) = 0, i = 1, ..., m.

Proof: x̄ must satisfy the Fritz John conditions. If u_0 > 0, we can redefine u ← u/u_0 and v ← v/u_0, whereby the redefined values (u, v) satisfy the conditions of the theorem. If u_0 = 0, then

Σ_{i∈I} u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0,

and so the above gradients are linearly dependent. This contradicts the assumptions of the theorem.

Example 1 Consider the problem:

min  6(x_1 − 10)^2 + 4(x_2 − 12.5)^2
s.t. x_1^2 + (x_2 − 5)^2 ≤ 50
     x_1^2 + 3x_2^2 ≤ 200
     (x_1 − 6)^2 + x_2^2 ≤ 37

In this problem, we have:

f(x) = 6(x_1 − 10)^2 + 4(x_2 − 12.5)^2
g_1(x) = x_1^2 + (x_2 − 5)^2 − 50
g_2(x) = x_1^2 + 3x_2^2 − 200
g_3(x) = (x_1 − 6)^2 + x_2^2 − 37

We also have:

∇f(x) = (12(x_1 − 10), 8(x_2 − 12.5))
∇g_1(x) = (2x_1, 2(x_2 − 5))
∇g_2(x) = (2x_1, 6x_2)
∇g_3(x) = (2(x_1 − 6), 2x_2)

Let us determine whether or not the point x̄ = (x̄_1, x̄_2) = (7, 6) is a candidate to be an optimal solution to this problem. We first check for feasibility:

g_1(x̄) = 0 ≤ 0
g_2(x̄) = −43 < 0
g_3(x̄) = 0 ≤ 0

To check for optimality, we compute all gradients at x̄:

∇f(x̄) = (−36, −52)
∇g_1(x̄) = (14, 2)
∇g_2(x̄) = (14, 36)
∇g_3(x̄) = (2, 12)

We next check to see if the gradients line up, by trying to solve for u_1 ≥ 0, u_2 = 0, u_3 ≥ 0 in the following system:

(−36, −52) + u_1 (14, 2) + u_2 (14, 36) + u_3 (2, 12) = (0, 0).

Notice that ū = (ū_1, ū_2, ū_3) = (2, 0, 4) solves this system, and that ū ≥ 0 and ū_2 = 0. Therefore x̄ is a candidate to be an optimal solution of this problem.
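A minimal numerical recheck of this calculation (ours, using the problem data as written above): evaluate the gradients at x̄ = (7, 6), fix u_2 = 0 by complementarity, and solve the remaining 2×2 linear system for (u_1, u_3).

```python
import numpy as np

# Example 1 data evaluated at the candidate point xbar = (7, 6).
x1, x2 = 7.0, 6.0
grad_f  = np.array([12*(x1 - 10), 8*(x2 - 12.5)])   # = (-36, -52)
grad_g1 = np.array([2*x1, 2*(x2 - 5)])              # = ( 14,   2)
grad_g2 = np.array([2*x1, 6*x2])                    # = ( 14,  36)
grad_g3 = np.array([2*(x1 - 6), 2*x2])              # = (  2,  12)

# g_2 is inactive at xbar, so u_2 = 0; solve grad_f + u_1*grad_g1 + u_3*grad_g3 = 0.
G = np.column_stack([grad_g1, grad_g3])
u1, u3 = np.linalg.solve(G, -grad_f)
print(u1, u3)                                               # 2.0 4.0
print(np.allclose(grad_f + u1*grad_g1 + u3*grad_g3, 0.0))   # True
```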

Example 2 Consider the problem (P):

(P) :  max_x  x^T Qx
       s.t.   ‖x‖ ≤ 1,

where Q is symmetric. This is equivalent to:

min_x  −x^T Qx
s.t.   x^T x ≤ 1.

The Fritz John conditions are:

−2u_0 Qx + 2ux = 0
x^T x ≤ 1
(u_0, u) ≥ 0
(u_0, u) ≠ 0
u(1 − x^T x) = 0.

Suppose that u_0 = 0. Then u ≠ 0. However, the gradient condition implies that x = 0, and then from complementarity we have u = 0, which is a contradiction. Therefore we can assume that u_0 > 0 and so instead work directly with the KKT conditions:

−2Qx + 2ux = 0
x^T x ≤ 1
u ≥ 0
u(1 − x^T x) = 0.

First note that if x and u satisfy the KKT conditions, then x^T Qx = u x^T x = u, whereby we see that u is the objective function value of the problem (P) at x. One solution to the KKT system is x = 0, u = 0, with objective function value u = x^T Qx = 0. Are there any better solutions to the KKT system? If x ≠ 0 is a solution of the KKT system together with some value u, then x is an eigenvector of Q with nonnegative eigenvalue u, so the objective value of this solution is u. Therefore the solution of (P) with the largest objective function value is x = 0 if the largest eigenvalue of Q is nonpositive. If the largest eigenvalue of Q is positive, then the optimal objective value of (P) is the largest eigenvalue, and the optimal solution is any eigenvector x corresponding to this eigenvalue, normalized so that ‖x‖ = 1.
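Here is a short numerical sketch (ours) of this conclusion on a randomly generated symmetric Q: the best KKT point is a unit eigenvector for the largest eigenvalue when that eigenvalue is positive, and x = 0 otherwise.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))
Q = (M + M.T) / 2.0                      # an arbitrary symmetric matrix

eigvals, eigvecs = np.linalg.eigh(Q)     # eigenvalues in ascending order
lam_max = eigvals[-1]
x_opt = eigvecs[:, -1] if lam_max > 0 else np.zeros(4)   # unit eigenvector or 0

# KKT check: Qx = u*x with u = max(lam_max, 0), x^T x <= 1, u >= 0,
# and complementarity u*(1 - x^T x) = 0.
u = max(lam_max, 0.0)
print(np.allclose(Q @ x_opt, u * x_opt))          # stationarity (True)
print(abs(u * (1.0 - x_opt @ x_opt)) < 1e-10)     # complementarity (True)
print("objective value x^T Q x =", x_opt @ Q @ x_opt, " = u =", u)
```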

Example 3 Consider the problem:

min  (x_1 − 12)^2 + (x_2 + 6)^2
s.t. g_1(x) ≤ 0
     (x_1 − 9)^2 + x_2^2 ≤ 64
     8x_1 + 4x_2 = 20

In this problem, we have:

f(x) = (x_1 − 12)^2 + (x_2 + 6)^2
g_1(x) = x x 1 + x x
g_2(x) = (x_1 − 9)^2 + x_2^2 − 64
h_1(x) = 8x_1 + 4x_2 − 20

Let us determine whether or not the point x̄ = (x̄_1, x̄_2) = (2, 1) is a candidate to be an optimal solution to this problem. We first check for feasibility:

g_1(x̄) = 0 ≤ 0
g_2(x̄) = −14 < 0
h_1(x̄) = 0

To check for optimality, we compute all gradients at x̄.

We next check to see if the gradients line up, by trying to solve for u_1 ≥ 0, u_2 = 0, v_1 in the following system:

∇f(x̄) + u_1 ∇g_1(x̄) + u_2 ∇g_2(x̄) + v_1 ∇h_1(x̄) = 0.

Notice that (ū, v̄) = (ū_1, ū_2, v̄_1) = (4, 0, 1) solves this system and that ū ≥ 0 and ū_2 = 0. Therefore x̄ is a candidate to be an optimal solution of this problem.

2.4 Other Constraint Qualifications for KKT Necessary Conditions

Recall that the statement of the KKT necessary conditions established herein in Theorem 2.7 has the form "if x̄ is a local minimum of (P) and (some requirement for the constraints), then the KKT conditions must hold at x̄." This additional requirement for the constraints that enables us to proceed with the proof of the KKT conditions is called a constraint qualification.

In Theorem 2.7 we established the following constraint qualification:

Linear Independence Constraint Qualification: The gradients ∇g_i(x̄), i ∈ I, and ∇h_i(x̄), i = 1, ..., l, are linearly independent.

We will now establish two alternative (and useful) constraint qualifications. We start with a definition:

Definition 2.1 A point x is called a Slater point if x satisfies g(x) < 0 and h(x) = 0, that is, x is feasible and satisfies all inequalities strictly.

Theorem 2.8 (Slater condition) Suppose that g_i(·), i = 1, ..., m, are convex, h_i(·), i = 1, ..., l, are linear, and ∇h_i(x), i = 1, ..., l, are linearly independent for every x, and (P) has a Slater point. Then the KKT conditions are necessary to characterize a local optimal solution.

Proof: Let x̄ be a local minimum. The Fritz-John conditions are necessary for this problem, whereby there must exist (u_0, u, v) ≠ 0 such that (u_0, u) ≥ 0 and

u_0 ∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0,

u_i g_i(x̄) = 0, i = 1, ..., m.

If u_0 > 0, dividing through by u_0 demonstrates the KKT conditions. Now suppose u_0 = 0. Let x^0 be a Slater point, and define d := x^0 − x̄. Then for each i ∈ I, 0 = g_i(x̄) > g_i(x^0), and by the convexity of g_i(·) we have ∇g_i(x̄)^T d < 0. Also, since h_i(x) is linear, ∇h_i(x̄)^T d = 0, i = 1, ..., l. Suppose that u_i > 0 for some i ∈ I. It then would be true that

0 = 0^T d = (u_0 ∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄))^T d < 0,

which yields a contradiction. Therefore u_i = 0 for all i ∈ I. But this then implies that v ≠ 0 and Σ_{i=1}^l v_i ∇h_i(x̄) = 0, which violates the linear independence assumption. This is a contradiction, and so u_0 > 0.

Theorem 2.9 (Linear constraints) If all constraints are linear, the KKT conditions are necessary to characterize a local optimal solution.

Proof: Our problem is

(P)  min_x  f(x)
     s.t.   Ax ≤ b
            Mx = g.

Suppose x̄ is a local optimum. Without loss of generality, we can partition the constraints Ax ≤ b into groups A_I x ≤ b_I and A_Ī x ≤ b_Ī such that A_I x̄ = b_I and A_Ī x̄ < b_Ī. Then at x̄, the set S := {d : A_I d ≤ 0, Md = 0} is precisely the set of feasible directions. Thus, in particular, for every d as above, ∇f(x̄)^T d ≥ 0, for otherwise d would be a feasible descent direction at x̄, violating its local optimality. Therefore, the linear system

∇f(x̄)^T d < 0,  A_I d ≤ 0,  Md = 0

has no solution. Applying the Key Lemma (Lemma 2.1) using Ā := ∇f(x̄)^T, B := A_I, and H := M, there exists (u_0, w, v) satisfying u_0 = 1, w ≥ 0, and

u_0 ∇f(x̄) + A_I^T w + M^T v = 0,

which are precisely the KKT conditions.

3 Sufficient Conditions for Optimality for Convex Optimization Problems

We start with a definition. If f(·) : X → R, then the α level set of f(·) is defined as:

S_α := {x ∈ X : f(x) ≤ α}

for each α ∈ R.

Proposition 3.1 If f(·) is convex on X, then S_α is a convex set for every α ∈ R.

Proof: Suppose that f(·) is convex. For any given value of α, suppose that x, y ∈ S_α. Let z = λx + (1 − λ)y for some λ ∈ [0, 1]. Then

f(z) ≤ λf(x) + (1 − λ)f(y) ≤ λα + (1 − λ)α = α,

so z ∈ S_α, which shows that S_α is a convex set.

The program

(P)  min_x  f(x)
     s.t.   g(x) ≤ 0
            h(x) = 0
            x ∈ X

is called a convex problem if f(x), g_i(x), i = 1, ..., m, are convex functions, h_i(x), i = 1, ..., l, are linear functions, and X is an open convex set.

Proposition 3.2 If the optimization problem (P) is a convex problem, then the feasible region of (P) is a convex set.

Proof: Because each function g_i(·) is convex, the level sets C_i := {x ∈ X : g_i(x) ≤ 0}, i = 1, ..., m, are convex sets. Also, because each function h_i(·) is linear, the sets D_i = {x ∈ X : h_i(x) = 0}, i = 1, ..., l, are convex sets. Thus, since the intersection of convex sets is also a convex set, the feasible region

F = {x ∈ X : g_i(x) ≤ 0, i = 1, ..., m, h_i(x) = 0, i = 1, ..., l}

is a convex set.

Theorem 3.1 (KKT Sufficient Conditions for Convex Problems) Suppose that (P) is a convex problem. Let x̄ be a feasible solution of (P), and suppose x̄ together with multipliers (u, v) satisfies

∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0,

u ≥ 0,

u_i g_i(x̄) = 0, i = 1, ..., m.

Then x̄ is a global optimal solution of (P).

Proof: We know from Proposition 3.2 that the feasible region of (P) is a convex set. Let I = {i : g_i(x̄) = 0} denote the index of active constraints at x̄. Let x ∈ F be any point different from x̄. Thus for i ∈ I we have g_i(x) ≤ 0 and g_i(x̄) = 0, whereby:

0 ≥ g_i(x) ≥ g_i(x̄) + ∇g_i(x̄)^T (x − x̄) = ∇g_i(x̄)^T (x − x̄),

whereby ∇g_i(x̄)^T (x − x̄) ≤ 0 for all i ∈ I. Also, since h_i(·) is a linear function, we have h_i(x̄ + λ(x − x̄)) = 0 for all λ ∈ [0, 1], whereby ∇h_i(x̄)^T (x − x̄) = 0 for each i = 1, ..., l. Thus, from the KKT conditions,

∇f(x̄)^T (x − x̄) = [− Σ_{i∈I} u_i ∇g_i(x̄) − Σ_{i=1}^l v_i ∇h_i(x̄)]^T (x − x̄) ≥ 0,

and by the gradient inequality, f(x) ≥ f(x̄) + ∇f(x̄)^T (x − x̄) ≥ f(x̄). As this is true for all x ∈ F, it follows that x̄ is a global optimal solution of (P).
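As a quick numerical complement (not in the original notes), the convexity hypotheses invoked for Example 1 in Example 4 below can be confirmed by checking that the (constant) Hessians of its objective and constraint functions are positive semidefinite:

```python
import numpy as np

# Constant Hessians of the functions in Example 1; each is positive
# semidefinite, so f, g_1, g_2, g_3 are all convex and Theorem 3.1 applies.
hessians = {
    "f" : np.diag([12.0, 8.0]),
    "g1": np.diag([2.0, 2.0]),
    "g2": np.diag([2.0, 6.0]),
    "g3": np.diag([2.0, 2.0]),
}
for name, H in hessians.items():
    eigs = np.linalg.eigvalsh(H)
    print(name, "eigenvalues:", eigs, "PSD:", bool(eigs.min() >= 0))
```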

Example 4 Continuing Example 1, note that f(·), g_1(·), g_2(·), and g_3(·) are all convex functions. Therefore the problem is a convex problem, and the KKT conditions are sufficient for optimality. Since the point x̄ = (7, 6) was shown to satisfy the KKT conditions, it follows that x̄ = (7, 6) is a global minimum.

Example 5 Continuing Example 3, note that f(·), g_1(·), g_2(·) are all convex functions and that h_1(·) is a linear function. Therefore the problem is a convex problem, and the KKT conditions are sufficient for optimality. Since the point x̄ = (2, 1) was shown to satisfy the KKT conditions, it follows that x̄ = (2, 1) is a global minimum.

4 Generalizations of Convexity and Weaker Sufficient Conditions for Optimality

In Section 3 we showed the sufficiency of the KKT conditions for optimality under the condition that all relevant functions are convex. Herein we explore ways that the convexity conditions can be relaxed while still maintaining the sufficiency of the KKT conditions. It turns out that the KKT conditions will be sufficient for optimality so long as:

(i) f(·) is a pseudoconvex function (to be defined shortly),
(ii) g_1(·), ..., g_m(·) are quasiconvex functions (also to be defined shortly), and
(iii) h_1(·), ..., h_l(·) are linear functions.

Suppose X is a convex set in R^n. The differentiable function f(·) : X → R is a pseudoconvex function if for every x, y ∈ X the following holds:

∇f(x)^T (y − x) ≥ 0  ⇒  f(y) ≥ f(x).

It is straightforward to show that a differentiable convex function is pseudoconvex, which we will prove at the end of this subsection.

Incidentally, we define a differentiable function f(·) : X → R to be pseudoconcave if for every x, y ∈ X the following holds:

∇f(x)^T (y − x) ≤ 0  ⇒  f(y) ≤ f(x).

Suppose X is a convex set in R^n. The function f(·) : X → R is a quasiconvex function if for all x, y ∈ X and for all λ ∈ [0, 1],

f(λx + (1 − λ)y) ≤ max{f(x), f(y)}.

f(·) is quasiconcave if for all x, y ∈ X and for all λ ∈ [0, 1],

f(λx + (1 − λ)y) ≥ min{f(x), f(y)}.

It is also straightforward to show that a convex function is quasiconvex, which will also be proved at the end of this subsection.

Proposition 4.1 A function f(·) is quasiconvex on X if and only if S_α is a convex set for every α ∈ R.

Proof: Suppose that f(x) is quasiconvex. For any given value of α, suppose that x, y ∈ S_α. Let z = λx + (1 − λ)y for some λ ∈ [0, 1]. Then f(z) ≤ max{f(x), f(y)} ≤ α, so z ∈ S_α, which shows that S_α is a convex set.

Conversely, suppose S_α is a convex set for every α. Let x and y be given, let α = max{f(x), f(y)}, and hence x, y ∈ S_α. Then for any λ ∈ [0, 1], f(λx + (1 − λ)y) ≤ α = max{f(x), f(y)}, and so f(x) is a quasiconvex function.
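To make these definitions concrete, here is a small sampling sketch (ours, not from the notes) for the one-dimensional function f(t) = t + t^3, which is strictly increasing and hence pseudoconvex and quasiconvex on R, but not convex since f''(t) = 6t < 0 for t < 0.

```python
import numpy as np

f  = lambda t: t + t**3
df = lambda t: 1 + 3*t**2

rng = np.random.default_rng(7)
ok_pseudo, ok_quasi, ok_convex = True, True, True
for _ in range(10000):
    x, y = rng.uniform(-3, 3, size=2)
    lam = rng.uniform()
    z = lam*x + (1 - lam)*y
    # pseudoconvexity:  f'(x)(y - x) >= 0  implies  f(y) >= f(x)
    if df(x)*(y - x) >= 0 and f(y) < f(x) - 1e-12:
        ok_pseudo = False
    # quasiconvexity:   f(z) <= max(f(x), f(y))
    if f(z) > max(f(x), f(y)) + 1e-12:
        ok_quasi = False
    # convexity (expected to FAIL for this f):  f(z) <= lam f(x) + (1-lam) f(y)
    if f(z) > lam*f(x) + (1 - lam)*f(y) + 1e-12:
        ok_convex = False
print(ok_pseudo, ok_quasi, ok_convex)   # True True False (typically)
```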

Theorem 4.1 (KKT Sufficient Conditions for Generalized Convex Problems) Suppose that f(x) is a pseudoconvex function, g_i(x), i = 1, ..., m, are quasiconvex functions, and h_i(x), i = 1, ..., l, are linear functions. Let x̄ be a feasible solution of (P), and suppose x̄ together with multipliers (u, v) satisfies

∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0,

u ≥ 0,

u_i g_i(x̄) = 0, i = 1, ..., m.

Then x̄ is a global optimal solution of (P).

Proof: Because each g_i(·) is quasiconvex, the level sets C_i := {x ∈ X : g_i(x) ≤ 0}, i = 1, ..., m, are convex sets. Also, because each h_i(·) is linear, the sets D_i = {x ∈ X : h_i(x) = 0}, i = 1, ..., l, are convex sets. Thus, since the intersection of convex sets is also a convex set, the feasible region

F = {x ∈ X : g(x) ≤ 0, h(x) = 0}

is a convex set.

Let I = {i : g_i(x̄) = 0} denote the index of active constraints at x̄. Let x ∈ F be any point different from x̄. Then λx + (1 − λ)x̄ is feasible for all λ ∈ [0, 1]. Thus for i ∈ I we have

g_i(x̄ + λ(x − x̄)) = g_i(λx + (1 − λ)x̄) ≤ 0 = g_i(x̄)

for any λ ∈ [0, 1], and since the value of g_i(·) does not strictly increase by moving in the direction x − x̄, we must have ∇g_i(x̄)^T (x − x̄) ≤ 0 for all i ∈ I. Similarly, h_i(x̄ + λ(x − x̄)) = 0, and so ∇h_i(x̄)^T (x − x̄) = 0 for all i = 1, ..., l. Thus, from the KKT conditions,

∇f(x̄)^T (x − x̄) = −(∇g(x̄)^T u + ∇h(x̄)^T v)^T (x − x̄) ≥ 0,

and by pseudoconvexity, f(x) ≥ f(x̄) for any feasible x.

The following summarizes how these different generalizations of convexity are related.

Proposition 4.2 It holds that:

(i) A differentiable convex function is pseudoconvex.
(ii) A convex function is quasiconvex.
(iii) A pseudoconvex function is quasiconvex.

Proof: To prove the first claim, note that if f(·) is convex and differentiable, then f(y) ≥ f(x) + ∇f(x)^T (y − x). Hence, if ∇f(x)^T (y − x) ≥ 0, then f(y) ≥ f(x), and so f(x) is pseudoconvex.

To prove the second claim, observe from Proposition 3.1 that the level sets of a convex function f(·) are convex sets, whereby from Proposition 4.1 it follows that f(·) is quasiconvex.

To show the third claim, suppose f(x) is pseudoconvex. Let x, y and λ ∈ [0, 1] be given, and let z = λx + (1 − λ)y. If λ = 0 or λ = 1, then f(z) ≤ max{f(x), f(y)} trivially; therefore, assume 0 < λ < 1. Let d = y − x. If ∇f(z)^T d ≥ 0, then ∇f(z)^T (y − z) = ∇f(z)^T (λ(y − x)) = λ∇f(z)^T d ≥ 0, so f(z) ≤ f(y) ≤ max{f(x), f(y)}. On the other hand, if ∇f(z)^T d ≤ 0, then ∇f(z)^T (x − z) = ∇f(z)^T (−(1 − λ)(y − x)) = −(1 − λ)∇f(z)^T d ≥ 0, so f(z) ≤ f(x) ≤ max{f(x), f(y)}. Thus f(x) is quasiconvex.

Example 6 Let D = R^n be the domain of the convex function f(·) and let R_++ denote the positive half-line, that is, R_++ := {α ∈ R : α > 0}. For every x ∈ R^n define the new function h(x, t) : R^n × R_++ → R as follows:

h(x, t) := f(x/t).

Then it is a straightforward exercise to show that h(x, t) is a quasiconvex function on its domain D̄ := R^n × R_++.
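A quick sampling check of Example 6 (ours): take the convex function f(x) = ‖x‖^2 and test the quasiconvexity inequality for h(x, t) = f(x/t) at random points of R^2 × R_++.

```python
import numpy as np

# With the convex function f(x) = ||x||^2, sample the quasiconvexity
# inequality for h(x, t) = f(x / t) on R^2 x R_++.
f = lambda x: float(x @ x)
h = lambda x, t: f(x / t)

rng = np.random.default_rng(11)
violations = 0
for _ in range(20000):
    x1, x2 = rng.uniform(-5, 5, size=(2, 2))
    t1, t2 = rng.uniform(0.1, 5.0, size=2)       # keep t strictly positive
    lam = rng.uniform()
    xz, tz = lam*x1 + (1 - lam)*x2, lam*t1 + (1 - lam)*t2
    if h(xz, tz) > max(h(x1, t1), h(x2, t2)) + 1e-9:
        violations += 1
print("violations of the quasiconvexity inequality:", violations)   # 0
```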

5 Second-Order Optimality Conditions

To describe the second order conditions for optimality, we will define the following function, known as the Lagrangian function, or simply the Lagrangian:

L(x, u, v) = f(x) + Σ_{i=1}^m u_i g_i(x) + Σ_{i=1}^l v_i h_i(x) = f(x) + u^T g(x) + v^T h(x).

Using the Lagrangian, we can, for example, re-write the gradient conditions of the KKT necessary conditions as follows:

∇_x L(x̄, u, v) = 0,  u^T g(x̄) = 0,  u ≥ 0,  g(x̄) ≤ 0,   (2)

since ∇_x L(x, u, v) = ∇f(x) + ∇g(x)^T u + ∇h(x)^T v. Also, note that

∇²_xx L(x, u, v) = ∇²f(x) + Σ_{i=1}^m u_i ∇²g_i(x) + Σ_{i=1}^l v_i ∇²h_i(x).

Here we use the standard notation: ∇²q(x) denotes the Hessian of the function q(x), and ∇²_xx L(x, u, v) denotes the submatrix of the Hessian of L(x, u, v) corresponding to the second partial derivatives with respect to the x variables only.

Theorem 5.1 (KKT second order necessary conditions) Suppose x̄ is a local minimum of (P), and ∇g_i(x̄), i ∈ I, and ∇h_i(x̄), i = 1, ..., l, are linearly independent. Then x̄ must satisfy the KKT conditions. Furthermore, every d that satisfies:

∇g_i(x̄)^T d = 0, i ∈ I,  ∇h_i(x̄)^T d = 0, i = 1, ..., l

must also satisfy

d^T ∇²_xx L(x̄, u, v) d ≥ 0.

Theorem 5.2 (KKT second order sufficient conditions) Suppose the point x̄ ∈ F together with multipliers (u, v) satisfies the KKT conditions. Let I^+ = {i ∈ I : u_i > 0} and I^0 = {i ∈ I : u_i = 0}. Additionally, suppose that every d ≠ 0 that satisfies

∇g_i(x̄)^T d = 0, i ∈ I^+,  ∇g_i(x̄)^T d ≤ 0, i ∈ I^0,  ∇h_i(x̄)^T d = 0, i = 1, ..., l,

also satisfies

d^T ∇²_xx L(x̄, u, v) d > 0.

Then x̄ is a strict local minimum of (P).

6 A Proof of Theorem 2.1

The proof of Theorem 2.1 relies on the Implicit Function Theorem. To motivate the Implicit Function Theorem, consider a system of linear functions: h(x) := Ax − b, and suppose that we are interested in solving h(x) = Ax − b = 0.

Let us assume that A ∈ R^{l×n} has full row rank (i.e., its rows are linearly independent). Then we can partition the columns of A and the elements of x as follows: A = [B N], x = (y; z), so that B ∈ R^{l×l} is non-singular, and h(x) = By + Nz − b. Let s(z) = B^{-1}b − B^{-1}Nz. Then for any z, h(s(z), z) = Bs(z) + Nz − b = 0, i.e., x = (s(z), z) solves h(x) = 0. This idea of invertibility of a system of equations is generalized (although only locally) by the following version of the Implicit Function Theorem, where we will preserve the notation used above:

Theorem 6.1 (Implicit Function Theorem) Let h(x) : R^n → R^l and x̄ = (ȳ_1, ..., ȳ_l, z̄_1, ..., z̄_{n−l}) = (ȳ, z̄) satisfy:

1. h(x̄) = 0
2. h(x) is continuously differentiable in a neighborhood of x̄
3. The l×l Jacobian matrix

[∂h_1(x̄)/∂y_1  ⋯  ∂h_1(x̄)/∂y_l]
[      ⋮                ⋮      ]
[∂h_l(x̄)/∂y_1  ⋯  ∂h_l(x̄)/∂y_l]

is nonsingular.

Then there exists ε > 0 along with functions s(z) = (s_1(z), ..., s_l(z)) such that for all z ∈ B(z̄, ε), h(s(z), z) = 0. Moreover, s_k(z) is continuously differentiable, k = 1, ..., l. Furthermore, for all i = 1, ..., l and j = 1, ..., n−l we have:

Σ_{k=1}^l (∂h_i(y, z)/∂y_k)(∂s_k(z)/∂z_j) + ∂h_i(y, z)/∂z_j = 0.
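A tiny one-dimensional illustration of Theorem 6.1 (ours, not from the notes): for h(y, z) = y^2 + z − 2 with (ȳ, z̄) = (1, 1), the implicit function is s(z) = √(2 − z), and the derivative identity above reduces to (∂h/∂y)·s'(z) + ∂h/∂z = 0.

```python
import numpy as np

# Illustrative h : R^2 -> R^1 with the variables split as x = (y; z).
h     = lambda y, z: y**2 + z - 2.0     # h(ybar, zbar) = 0 at (1, 1)
dh_dy = lambda y, z: 2.0*y              # the 1x1 "Jacobian" in y, nonsingular at (1, 1)
dh_dz = lambda y, z: 1.0

s = lambda z: np.sqrt(2.0 - z)          # explicit implicit function near zbar = 1

for z in [0.8, 0.9, 1.0, 1.1, 1.2]:
    y = s(z)
    assert abs(h(y, z)) < 1e-12                         # h(s(z), z) = 0
    # Derivative identity from Theorem 6.1:  dh/dy * s'(z) + dh/dz = 0
    sp_formula = -dh_dz(y, z) / dh_dy(y, z)
    sp_numeric = (s(z + 1e-6) - s(z)) / 1e-6
    assert abs(sp_formula - sp_numeric) < 1e-4
print("implicit function identities verified on sample z values")
```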

Proof of Theorem 2.1: Let A = ∇h(x̄) ∈ R^{l×n}. Then A has full row rank, and its columns (along with the corresponding elements of x) can be re-arranged so that A = [B N] and x̄ = (ȳ; z̄), where B is non-singular. Let z lie in a small neighborhood of z̄. Then, from the Implicit Function Theorem, there exists s(z) such that h(s(z), z) = 0.

Suppose that d ∈ F_0 ∩ G_0 ∩ H_0, and let us write d = (q; p). Then 0 = Ad = Bq + Np, whereby q = −B^{-1}Np. Let z(θ) = z̄ + θp, y(θ) = s(z(θ)) = s(z̄ + θp), and x(θ) = (y(θ), z(θ)). We will derive a contradiction by showing that d is an improving feasible direction, i.e., for small θ > 0, x(θ) is feasible and f(x(θ)) < f(x̄).

To show feasibility of x(θ), note that for θ > 0 sufficiently small, it follows from the Implicit Function Theorem that:

h(x(θ)) = h(s(z(θ)), z(θ)) = 0.

Furthermore, for i = 1, ..., l we have:

0 = ∂h_i(x(θ))/∂θ = Σ_{k=1}^l (∂h_i(s(z(θ)), z(θ))/∂y_k)(∂s_k(z(θ))/∂θ) + Σ_{k=1}^{n−l} (∂h_i(s(z(θ)), z(θ))/∂z_k)(∂z_k(θ)/∂θ).

Let r_k = ∂s_k(z(θ))/∂θ, and recall that ∂z_k(θ)/∂θ = p_k. At θ = 0, the above equation system can then be re-written as 0 = Br + Np, or r = −B^{-1}Np = q. Therefore, at θ = 0, ∂x_k(θ)/∂θ = d_k for k = 1, ..., n.

For i ∈ I,

g_i(x(θ)) = g_i(x̄) + θ (∂g_i(x(θ))/∂θ)|_{θ=0} + θ α_i(θ) = θ Σ_{k=1}^n (∂g_i(x(θ))/∂x_k · ∂x_k(θ)/∂θ)|_{θ=0} + θ α_i(θ) = θ ∇g_i(x̄)^T d + θ α_i(θ),

where α_i(θ) → 0 as θ → 0. Hence g_i(x(θ)) < 0 for all i = 1, ..., m for θ > 0

sufficiently small, and therefore x(θ) is feasible for any θ > 0 sufficiently small. On the other hand,

f(x(θ)) = f(x̄) + θ ∇f(x̄)^T d + θ α(θ) < f(x̄)

for θ > 0 sufficiently small, where α(θ) → 0 as θ → 0. But this contradicts the local optimality of x̄. Therefore no such d can exist, and the theorem is proved.

7 Summary Road Map of Constrained Optimization Optimality Conditions

Here is a road map of the structure of the constrained optimization optimality conditions we have developed herein.

Necessary Conditions: If x̄ solves (P), then it holds that ...

1. Geometric Conditions (Theorem 2.1): F_0 ∩ G_0 ∩ H_0 = ∅

2. Analytic Conditions:

(i) Fritz John Necessary Conditions (Theorem 2.6):

u_0 ∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0

(ii) KKT Necessary Conditions:

∇f(x̄) + Σ_{i=1}^m u_i ∇g_i(x̄) + Σ_{i=1}^l v_i ∇h_i(x̄) = 0

The KKT conditions require a constraint qualification such as:

(a) linear independence of active gradients (Theorem 2.7), or
(b) Slater condition (Theorem 2.8), or
(c) all constraints are linear (Theorem 2.9), or
(d) some other constraint qualification.

Sufficient Conditions: If (P) satisfies ... and x̄ satisfies ..., then x̄ is an optimal solution.

1. If convexity conditions hold, then the KKT conditions are sufficient for optimality (Theorem 3.1).

2. If generalized convexity conditions hold, then the KKT conditions are sufficient for optimality (Theorem 4.1).


More information

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018 MATH 57: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 18 1 Global and Local Optima Let a function f : S R be defined on a set S R n Definition 1 (minimizers and maximizers) (i) x S

More information

Convex Optimization and Modeling

Convex Optimization and Modeling Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

More information

Final Exam - Math Camp August 27, 2014

Final Exam - Math Camp August 27, 2014 Final Exam - Math Camp August 27, 2014 You will have three hours to complete this exam. Please write your solution to question one in blue book 1 and your solutions to the subsequent questions in blue

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

The Kuhn-Tucker Problem

The Kuhn-Tucker Problem Natalia Lazzati Mathematics for Economics (Part I) Note 8: Nonlinear Programming - The Kuhn-Tucker Problem Note 8 is based on de la Fuente (2000, Ch. 7) and Simon and Blume (1994, Ch. 18 and 19). The Kuhn-Tucker

More information

Assignment 1: From the Definition of Convexity to Helley Theorem

Assignment 1: From the Definition of Convexity to Helley Theorem Assignment 1: From the Definition of Convexity to Helley Theorem Exercise 1 Mark in the following list the sets which are convex: 1. {x R 2 : x 1 + i 2 x 2 1, i = 1,..., 10} 2. {x R 2 : x 2 1 + 2ix 1x

More information

Lecture 19 Algorithms for VIs KKT Conditions-based Ideas. November 16, 2008

Lecture 19 Algorithms for VIs KKT Conditions-based Ideas. November 16, 2008 Lecture 19 Algorithms for VIs KKT Conditions-based Ideas November 16, 2008 Outline for solution of VIs Algorithms for general VIs Two basic approaches: First approach reformulates (and solves) the KKT

More information

8. Constrained Optimization

8. Constrained Optimization 8. Constrained Optimization Daisuke Oyama Mathematics II May 11, 2018 Unconstrained Maximization Problem Let X R N be a nonempty set. Definition 8.1 For a function f : X R, x X is a (strict) local maximizer

More information

Convex Sets with Applications to Economics

Convex Sets with Applications to Economics Convex Sets with Applications to Economics Debasis Mishra March 10, 2010 1 Convex Sets A set C R n is called convex if for all x, y C, we have λx+(1 λ)y C for all λ [0, 1]. The definition says that for

More information

Nonlinear Programming Models

Nonlinear Programming Models Nonlinear Programming Models Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Nonlinear Programming Models p. Introduction Nonlinear Programming Models p. NLP problems minf(x) x S R n Standard form:

More information

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

In English, this means that if we travel on a straight line between any two points in C, then we never leave C. Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from

More information

Vector Spaces. Addition : R n R n R n Scalar multiplication : R R n R n.

Vector Spaces. Addition : R n R n R n Scalar multiplication : R R n R n. Vector Spaces Definition: The usual addition and scalar multiplication of n-tuples x = (x 1,..., x n ) R n (also called vectors) are the addition and scalar multiplication operations defined component-wise:

More information

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games 6.254 : Game Theory with Engineering Applications Lecture 7: Asu Ozdaglar MIT February 25, 2010 1 Introduction Outline Uniqueness of a Pure Nash Equilibrium for Continuous Games Reading: Rosen J.B., Existence

More information

Written Examination

Written Examination Division of Scientific Computing Department of Information Technology Uppsala University Optimization Written Examination 202-2-20 Time: 4:00-9:00 Allowed Tools: Pocket Calculator, one A4 paper with notes

More information

Enhanced Fritz John Optimality Conditions and Sensitivity Analysis

Enhanced Fritz John Optimality Conditions and Sensitivity Analysis Enhanced Fritz John Optimality Conditions and Sensitivity Analysis Dimitri P. Bertsekas Laboratory for Information and Decision Systems Massachusetts Institute of Technology March 2016 1 / 27 Constrained

More information

Computational Optimization. Convexity and Unconstrained Optimization 1/29/08 and 2/1(revised)

Computational Optimization. Convexity and Unconstrained Optimization 1/29/08 and 2/1(revised) Computational Optimization Convexity and Unconstrained Optimization 1/9/08 and /1(revised) Convex Sets A set S is convex if the line segment joining any two points in the set is also in the set, i.e.,

More information

Math 341: Convex Geometry. Xi Chen

Math 341: Convex Geometry. Xi Chen Math 341: Convex Geometry Xi Chen 479 Central Academic Building, University of Alberta, Edmonton, Alberta T6G 2G1, CANADA E-mail address: xichen@math.ualberta.ca CHAPTER 1 Basics 1. Euclidean Geometry

More information

1. Introduction. Consider the following quadratically constrained quadratic optimization problem:

1. Introduction. Consider the following quadratically constrained quadratic optimization problem: ON LOCAL NON-GLOBAL MINIMIZERS OF QUADRATIC OPTIMIZATION PROBLEM WITH A SINGLE QUADRATIC CONSTRAINT A. TAATI AND M. SALAHI Abstract. In this paper, we consider the nonconvex quadratic optimization problem

More information

Lecture: Duality.

Lecture: Duality. Lecture: Duality http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghe s lecture notes Introduction 2/35 Lagrange dual problem weak and strong

More information