
Lecture 11 (26 September 2006)

Review of Lecture #10: Second-order optimality conditions (necessary condition, sufficient condition). If the necessary condition is violated, the point cannot be a local minimum point. If the sufficiency condition is not met, the point may not be an isolated minimum point.

Comments on use of optimality conditions: always use the standard form of the NLP problem; make sure to check all the KKT conditions; check regularity of the candidate solution points.

Review for midterm exam. Exam will be on Thursday, 9/28/06.

Duality in Nonlinear Programming: Lagrangian duality, or local duality.

Equality constrained problem: x* is a local minimum for the equality constrained problem as well as for the Lagrangian function at u*. Given the optimum u, the optimum x can be found by minimizing the Lagrangian. Given u in a neighborhood of its optimum, the x found by minimizing the Lagrangian is also in a neighborhood of its optimum value. Thus, there is a unique correspondence between u and x; x = x(u), and x(u) is a differentiable function of u.

Dual function: definition of the dual function; gradient and Hessian of the dual function. Local duality theorem: maximize the dual function. Example problem.

Generalization to the inequality constrained problem: maximize the dual function subject to non-negativity of the dual variables. Strong duality theorem; weak duality theorem. Saddle points; saddle point theorem. Example problem.

Read: Duality in NLP.

Project #2: report due today. 53:235 Applied Optimal Design, HW#9: Solve Exercise 5.4 using the KKT optimality conditions; check the duality assumption; calculate the dual function; maximize the dual function; show that x* = x(u*) and f(x*) = φ(u*). No need to submit.

4.8 DUALITY IN NONLINEAR PROGRAMMING (J.S. Arora)

4.8.1 Introduction

Given a nonlinear programming problem, there is another nonlinear programming problem closely associated with it. The former is called the primal problem, and the latter is called the Lagrangian dual problem, or simply the dual problem. Under certain convexity assumptions, the primal and dual problems have equal optimal cost values, and therefore it is possible to solve the primal problem indirectly by solving the dual problem. As a by-product of one of the duality theorems, we obtain the saddle point necessary optimality conditions that are explained later.

In recent years, duality has played a very important role in the development of optimization theory and numerical methods. Development of the duality theory requires assumptions about convexity of the problem. However, to be broadly applicable, the theory should require a minimum of convexity assumptions. This leads to the concept of requiring only local convexity, and thus to the local duality theory. In this section, we shall present only the local duality theory and discuss its computational aspects. The theory can be used to develop computational methods for solving optimization problems. We shall see later that it can be used to develop the so-called multiplier or augmented Lagrangian methods.

4.8.2 Local Duality

4.8.2.1 EQUALITY CONSTRAINT CASE. For the sake of developing the local duality theory, we consider the equality-constrained problem first:

Problem PE: Minimize f(x), x ∈ R^n (4.8.1)
Subject to g_i(x) = 0; i = 1 to p (4.8.2)

Later on we will extend the theory to problems with both equality and inequality constraints. The theory we are going to present is sometimes called strong duality or Lagrangian duality. We will assume that f and g_i ∈ C², i = 1 to p.

Let x* be a local minimum of Problem PE that is also a regular point of the constraint set. Then there exists a unique Lagrange multiplier vector u* ∈ R^p such that

∇_x f(x*) + ∇_x g(x*) u* = 0 (4.8.3)

where ∇_x g is an n × p matrix whose columns are the gradients of the constraints. Also, the Hessian of the Lagrange function, ∇²_x L(x*,u*), where

L(x,u) = f(x) + (u, g(x)) (4.8.4)

must be at least positive semidefinite (second-order necessary condition) on the tangent subspace

M = {y ∈ R^n | (∇_x g_i(x*), y) = 0, i = 1 to p; y ≠ 0} (4.8.5)

Or,

(y, ∇²_x L(x*,u*) y) ≥ 0 for all y ∈ M (4.8.6)

Now we introduce the assumption that ∇²_x L(x*,u*) is actually positive definite; i.e.,

(y, ∇²_x L(x*,u*) y) > 0 for all y ∈ R^n, y ≠ 0 (4.8.7)

This assumption is necessary for the development of the local duality theory. It guarantees that the Lagrangian of Eq. (4.8.4) is locally convex at x*. It also satisfies the sufficiency condition for x* to be an isolated local minimum of Problem PE. With this assumption, the point x* is not only a local minimum of Problem PE; it is also a local minimum of the unconstrained problem:

Minimize L(x,u*) = f(x) + (u*, g(x)) (4.8.8)

where u* is the vector of Lagrange multipliers at x*. The necessary and sufficient conditions for this unconstrained problem are the same as for the constrained Problem PE (with ∇²_x L(x*,u*) positive definite). In addition, for any u sufficiently close to u*, the Lagrange function f(x) + (u, g(x)) will have a local minimum at a point x near x*.

Now we shall establish the conditions under which x(u) exists and is a differentiable function of u. The Karush-Kuhn-Tucker necessary condition is

∇_x L(x,u) ≡ ∇_x f + (∇_x g) u = 0 (4.8.9)

Since ∇²_x L(x*,u*) is positive definite, it is nonsingular. Also, because of positive definiteness, ∇²_x L(x,u) is nonsingular in a neighborhood of (x*,u*). This is a generalization of a theorem from calculus: if a function is positive at a point, it is positive in a neighborhood of that point. ∇²_x L(x,u) is also the Jacobian of the necessary conditions of Eq. (4.8.9) with respect to x. Therefore, Eq. (4.8.9) has a solution x near x* when u is near u*. Thus, locally there is a unique correspondence between u and x through the solution of the unconstrained problem

Minimize L(x,u) = f(x) + (u, g(x)) (4.8.10)

Furthermore, for a given u, x(u) is a differentiable function (by the Implicit Function Theorem of calculus). The necessary condition for problem (4.8.10) can be written as

∇_x f(x) + (∇_x g(x)) u = 0 (4.8.11)

and ∇²_x L(x,u) is positive definite since ∇²_x L(x*,u*) is positive definite.

Def. 4.8.1 (Dual Function): Near u*, we define the dual function φ by the equation

φ(u) = min_x [f(x) + (u, g(x))] = min_x L(x,u) (4.8.12)

In the above definition, the minimum is taken locally with respect to x near x*. With this definition of the dual function, we can show that locally the original constrained Problem PE is equivalent to unconstrained local maximization of the dual function φ with respect to u. Thus, we can establish an equivalence between a constrained problem in x and an unconstrained problem in u. To establish the duality relation, we must prove two lemmas.

Lemma 4.8.1: The dual function φ(u) has the gradient

∇_u φ(u) = g(x(u)) (4.8.13)

Proof: Let x(u) represent a local minimum of the Lagrange function

L(x,u) = f(x) + (u, g(x)) (4.8.14)

Then the dual function can be written explicitly from Eq. (4.8.12) as

φ(u) = f(x(u)) + (u, g(x(u))) (4.8.15)

Therefore,

∇_u φ(u) ≡ dφ/du = ∂φ/∂u + (dx/du) ∂φ/∂x = g(x(u)) + (dx/du) ∂L/∂x (4.8.16)

But ∂L/∂x in Eq. (4.8.16) is zero because x(u) minimizes the Lagrange function of Eq. (4.8.14). This proves the result of Eq. (4.8.13).

Lemma 4.8.1 is of extreme practical importance, since it shows that the gradient of the dual function is quite simple to calculate. Once the dual function is evaluated by minimization with respect to x, the corresponding g(x), which is the gradient of φ(u), can be evaluated without any further calculation.
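
To make Def. 4.8.1 and Lemma 4.8.1 concrete, here is a minimal numerical sketch (added for illustration, not part of Arora's text). It assumes a hypothetical toy problem, minimize f(x) = x_1² + x_2² subject to g(x) = x_1 + x_2 - 2 = 0, whose solution is x* = (1, 1) with u* = -2 and f* = 2, and it uses scipy.optimize for the inner minimization; all names and tolerances are illustrative choices.

```python
# Sketch of Def. 4.8.1 and Lemma 4.8.1 on an assumed toy problem:
#   minimize f(x) = x1^2 + x2^2   subject to g(x) = x1 + x2 - 2 = 0
# with x* = (1, 1), u* = -2, f(x*) = 2.
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + x[1]**2
g = lambda x: x[0] + x[1] - 2.0
L = lambda x, u: f(x) + u * g(x)            # Lagrange function, Eq. (4.8.14)

def x_of_u(u):
    # local minimizer of L(., u): the unique correspondence x = x(u)
    return minimize(lambda x: L(x, u), x0=np.zeros(2)).x

def phi(u):
    # dual function phi(u) = min_x L(x, u), Eq. (4.8.12)
    return L(x_of_u(u), u)

u = -1.5                                     # any u near u* = -2
h = 1e-5
grad_phi_fd = (phi(u + h) - phi(u - h)) / (2 * h)   # finite-difference d(phi)/du
print(grad_phi_fd, g(x_of_u(u)))             # Lemma 4.8.1: both ~ g(x(u)) = -u - 2
```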

Lemma 4.8.2: The Hessian of the dual function is

∇²_u φ(u) = -∇_x g(x)^T [∇²_x L(x)]⁻¹ ∇_x g(x) (4.8.17)

Proof: By Lemma 4.8.1,

∇²_u φ(u) = ∇_u(∇_u φ) = ∇_u g(x(u)) = (∇_u x) ∇_x g(x) (4.8.18)

To calculate ∇_u x, we observe that

∇_x L(x,u) ≡ ∇_x f(x) + ∇_x g(x) u = 0 (4.8.19)

where L(x,u) is defined in Eq. (4.8.14). Differentiating Eq. (4.8.19) with respect to u,

∇_u(∇_x L) ≡ ∇_x g(x)^T + (∇_u x) ∇²_x L(x) = 0, so ∇_u x = -∇_x g(x)^T [∇²_x L(x)]⁻¹ (4.8.20)

Substituting Eq. (4.8.20) into Eq. (4.8.18), we obtain the result of Eq. (4.8.17), which was to be proved.

Since [∇²_x L(x)]⁻¹ is positive definite, and since ∇_x g(x) has full column rank near x*, the p × p matrix ∇²_u φ(u) (the Hessian of φ) is negative definite. This observation and the Hessian of φ play a dominant role in the analysis of dual methods.
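
As a quick sanity check of Eq. (4.8.17) (again an added sketch, not from the original notes): for the hypothetical toy problem used above, ∇_x g = (1, 1)^T and ∇²_x L = 2I, and the dual function works out to φ(u) = -u²/2 - 2u, whose second derivative is -1. The formula gives the same value and is indeed negative definite.

```python
# Sketch: evaluate the Hessian formula of Eq. (4.8.17) for the assumed toy problem
# min x1^2 + x2^2 s.t. x1 + x2 - 2 = 0 (illustrative example, not from the notes).
import numpy as np

grad_g = np.array([[1.0], [1.0]])                        # n x p matrix of constraint gradients
hess_L = 2.0 * np.eye(2)                                  # Hessian of L(x, u) with respect to x
hess_phi = -grad_g.T @ np.linalg.solve(hess_L, grad_g)    # Eq. (4.8.17): a 1 x 1 matrix here
print(hess_phi)                                           # [[-1.]], matching d2/du2 of -u^2/2 - 2u
```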

Theorem 4.8.1 (Local Duality Theorem): Consider the problem

Minimize f(x) subject to g(x) = 0

Let (i) x* be a local minimum, (ii) x* be a regular point, (iii) u* be the Lagrange multiplier vector at x*, and (iv) ∇²_x L(x*,u*) be positive definite. Then the dual problem

Maximize φ(u)

has a local solution at u* with x* = x(u*). The maximum value of the dual function is equal to the minimum value of f(x); i.e.,

φ(u*) = f(x*)

Proof: It is clear that x* = x(u*) by the definition of φ. Now at u*, we have, by Lemma 4.8.1, ∇_u φ(u*) = g(x*) = 0, and by Lemma 4.8.2 the Hessian of φ is negative definite. Thus, u* satisfies the first-order necessary and second-order sufficiency conditions for an unconstrained maximum point of φ. Substituting u* into the definition of φ in Eq. (4.8.15),

φ(u*) = f(x(u*)) + (u*, g(x(u*))) = f(x*) + (u*, g(x*)) = f(x*)

which was to be proved.

4.8.2.2 INEQUALITY CONSTRAINT CASE. Consider the inequality-constrained problem:

Problem P: Minimize f(x)
Subject to x ∈ S
S = {x ∈ R^n | g_i(x) = 0, i = 1 to p; g_i(x) ≤ 0, i = p+1 to m} (4.8.21)

Define the Lagrange function as

L(x,u) = f(x) + (u, g(x)) with u_i ≥ 0, i > p (4.8.22)

The dual function for Problem P is defined as

φ(u) = min_x L(x,u); u_i ≥ 0, i > p (4.8.23)

The dual problem is defined as

Maximize φ(u)
subject to u_i ≥ 0, i > p (4.8.24)

Theorem 4.8.2 (Strong Duality Theorem). Let (i) x* be a local minimum of Problem P, (ii) x* be a regular point, (iii) ∇²_x L(x*) be positive definite, and (iv) u* be the Lagrange multiplier vector at the optimum point x*. Then u* solves the dual problem defined in Eq. (4.8.24) with f(x*) = φ(u*) and x* = x(u*).

If the assumption of positive definiteness of ∇²_x L(x*) is not made, we get the weak duality theorem.

Theorem 4.8.3 (Weak Duality Theorem). Let x be a feasible solution to Problem P and let u be a feasible solution for the dual problem defined in Eq. (4.8.24); i.e., g_i(x) = 0, i = 1 to p; g_i(x) ≤ 0, i = p+1 to m; and u_i ≥ 0, i = p+1 to m. Then φ(u) ≤ f(x).

Proof: By definition,

φ(u) = min_x L(x,u) = min_x [f(x) + (u, g(x))] ≤ f(x) + (u, g(x)) ≤ f(x)

since u_i ≥ 0 and g_i(x) ≤ 0 for i = p+1 to m, and g_i(x) = 0 for i = 1 to p. (A small numerical illustration of this bound follows the list of results below.)

From Theorem 4.8.3, we obtain the following results:
1. Minimum [f(x), x ∈ S] ≥ Maximum [φ(u), u_i ≥ 0, i = p+1 to m]
2. If f(x*) = φ(u*) with u*_i ≥ 0, i = p+1 to m and x* ∈ S, then x* and u* solve the primal and dual problems, respectively.
3. If Minimum [f(x), x ∈ S] = -∞, then the dual is infeasible, and vice versa (i.e., if the dual is infeasible, the primal is unbounded).
4. If Maximum [φ(u), u_i ≥ 0, i > p] = ∞, then the primal problem has no feasible solution, and vice versa (i.e., if the primal is infeasible, the dual is unbounded).
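
As a quick numerical illustration of Theorem 4.8.3 (an added sketch, not from the original notes), consider a hypothetical one-variable problem: minimize f(x) = x² subject to g(x) = 1 - x ≤ 0, for which x* = 1, u* = 2, f* = 1, and the inner minimization of L(x,u) = x² + u(1 - x) gives x(u) = u/2 and φ(u) = u - u²/4 for u ≥ 0. The code simply checks φ(u) ≤ f(x) over grids of dual-feasible u and primal-feasible x.

```python
# Sketch of weak duality (Theorem 4.8.3) on an assumed one-variable toy problem:
#   minimize f(x) = x^2  subject to g(x) = 1 - x <= 0,  x* = 1, u* = 2, f* = 1,
# where the dual function works out to phi(u) = u - u^2/4 for u >= 0.
import numpy as np

f = lambda x: x**2
phi = lambda u: u - u**2 / 4.0

for u in np.linspace(0.0, 4.0, 9):          # dual-feasible multipliers (u >= 0)
    for x in np.linspace(1.0, 3.0, 9):      # primal-feasible points (1 - x <= 0)
        assert phi(u) <= f(x) + 1e-12       # weak duality: phi(u) <= f(x)
print(phi(2.0), f(1.0))                     # gap closes at the optimum: 1.0 1.0
```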

Lemma 4.8.3 (Lower Bound on the Primal Cost Function): Let u ∈ R^m. Then for any u with u_i ≥ 0, i = p+1 to m,

φ(u) ≤ f(x*)

Proof: φ(u) ≤ maximum [φ(u); u_i ≥ 0, i = p+1 to m] = f(x*).

The above lemma is quite useful for practical applications. It tells us how to find a lower bound on the optimal primal cost function. The dual cost function, for arbitrary u_i, i = 1 to p and u_i ≥ 0, i = p+1 to m, provides a lower bound on the optimal cost. For any x ∈ S, f(x) provides an upper bound on the optimal cost.

Def. 4.8.2 (Saddle Points): Let L(x,u) be the Lagrange function with u ∈ R^m. L has a saddle point at (x*,u*) subject to u_i ≥ 0, i = p+1 to m if

L(x*,u) ≤ L(x*,u*) ≤ L(x,u*)

holds for all x near x* and all u near u* with u_i ≥ 0 for i = p+1 to m.

Theorem 4.8.4 (Saddle Point Theorem). Consider the NLP problem: Minimize f(x) with x ∈ S. Let f and g_i ∈ C², i = 1 to m, and let L(x,u) be defined as

L(x,u) = f(x) + (u, g(x))

Let L(x*,u*) exist with u*_i ≥ 0, i = p+1 to m. Also let ∇²_x L(x*,u*) be positive definite. Then x* satisfying a suitable constraint qualification is a local minimum of the NLP problem if and only if (x*,u*) is a saddle point of the Lagrangian, i.e.,

L(x*,u) ≤ L(x*,u*) ≤ L(x,u*)

for all x near x* and all u near u* with u_i ≥ 0 for i = p+1 to m. See Bazaraa and Shetty (1979), p. 185, for a proof.

Example: Consider the following problem in two variables (Ref. ):

minimize f = -x_1 x_2
subject to (x_1 - 3)² + x_2² = 5

Let us first solve the problem using the optimality conditions. The Lagrangian for the problem is defined as

L = -x_1 x_2 + u[(x_1 - 3)² + x_2² - 5] (a)

The first-order necessary conditions are

-x_2 + (2x_1 - 6)u = 0 (b)
-x_1 + 2x_2 u = 0 (c)

together with the equality constraint. These equations have the solution

x_1 = 4, x_2 = 2, u = 1, f = -8 (d)

The Hessian of the Lagrangian at this point is

∇²_x L = [  2  -1
           -1   2 ] (e)

Since this is positive definite, we conclude that the solution obtained is an isolated local minimum. Since ∇²_x L(x*) is positive definite, we can apply the local duality theory near the solution. Define the dual function as

φ(u) = min_x L(x,u) (f)

Solving Eqs. (b) and (c), we get x_1 and x_2 in terms of u as

x_1 = 12u²/(4u² - 1) (g)
x_2 = 6u/(4u² - 1) (h)

provided 4u² - 1 ≠ 0. Substituting Eqs. (g) and (h) into Eq. (f), the dual function is given as

φ(u) = (4u³ + 4u - 80u⁵)/(4u² - 1)² (i)

valid for u ≠ ±1/2. This φ has a local maximum at u* = 1. Substituting u = 1 into Eqs. (g) and (h), we get the same solution as in Eq. (d). Note that φ(u*) = -8 (= f*).
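
As a closing check (an added sketch, not part of the original notes), the example can be verified numerically from the closed-form expressions in Eqs. (g)-(i). The use of scipy.optimize.minimize_scalar and the bracket (0.6, 2.0) around u* = 1 are illustrative choices.

```python
# Numerical check of the worked example: maximize phi(u) of Eq. (i) near u* = 1
# and recover x* from Eqs. (g)-(h). The bracket avoids the singularity at u = 1/2.
from scipy.optimize import minimize_scalar

def phi(u):
    # dual function of Eq. (i), valid for u != +/- 1/2
    return (4*u**3 + 4*u - 80*u**5) / (4*u**2 - 1)**2

res = minimize_scalar(lambda u: -phi(u), bounds=(0.6, 2.0), method='bounded')
u_star = res.x
x1 = 12*u_star**2 / (4*u_star**2 - 1)       # Eq. (g)
x2 = 6*u_star / (4*u_star**2 - 1)           # Eq. (h)
print(u_star, phi(u_star), -x1*x2)          # ~ 1.0, -8.0, -8.0:  phi(u*) = f(x*) = -8
```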