Lagrange Multipliers with Optimal Sensitivity Properties in Constrained Optimization

Dimitri P. Bertsekas
Dept. of Electrical Engineering and Computer Science, M.I.T., Cambridge, Mass., 02139, USA (dimitrib@mit.edu)
Research supported by NSF Grant ECS-0218328.

Summary. We consider optimization problems with inequality and abstract set constraints, and we derive sensitivity properties of Lagrange multipliers under very weak conditions. In particular, we do not assume uniqueness of a Lagrange multiplier or continuity of the perturbation function. We show that the Lagrange multiplier of minimum norm defines the optimal rate of improvement of the cost per unit constraint violation.

Key words: constrained optimization, Lagrange multipliers, sensitivity.

1 Introduction

We consider the constrained optimization problem

    minimize   $f(x)$
    subject to $x \in X$,   $g_j(x) \le 0$,   $j = 1, \ldots, r$,        (P)

where $X$ is a nonempty subset of $\Re^n$, and $f : \Re^n \to \Re$ and $g_j : \Re^n \to \Re$ are smooth (continuously differentiable) functions. In our notation, all vectors are viewed as column vectors, and a prime denotes transposition, so $x'y$ denotes the inner product of the vectors $x$ and $y$. We will use throughout the standard Euclidean norm $\|x\| = (x'x)^{1/2}$. The gradient vector of a smooth function $h : \Re^n \to \Re$ at a vector $x$ is denoted by $\nabla h(x)$. The positive part of the constraint function $g_j(x)$ is denoted by

    $g_j^+(x) = \max\{0,\, g_j(x)\},$

and we write

    $g^+(x) = \big(g_1^+(x), \ldots, g_r^+(x)\big)'.$

The tangent cone of $X$ at a vector $x \in X$ is denoted by $T_X(x)$. It is the set of vectors $y$ such that either $y = 0$ or there exists a sequence $\{x_k\} \subset X$ such that $x_k \ne x$ for all $k$ and

    $x_k \to x, \qquad \dfrac{x_k - x}{\|x_k - x\|} \to \dfrac{y}{\|y\|}.$

An equivalent definition often found in the literature (e.g., Bazaraa, Sherali, and Shetty [BSS93], Rockafellar and Wets [RoW98]) is that $T_X(x)$ is the set of vectors $y$ such that there exists a sequence $\{x_k\} \subset X$ with $x_k \to x$, and a positive sequence $\{\alpha_k\}$ such that $\alpha_k \to 0$ and $(x_k - x)/\alpha_k \to y$. Note that $T_X(x)$ is a closed cone, but it need not be convex (it is convex if $X$ is convex, or more generally, if $X$ is regular at $x$ in the terminology of nonsmooth optimization; see [BNO03] or [RoW98]). For any cone $N$, we denote by $N^*$ its polar cone, $N^* = \{z \mid z'y \le 0,\ \forall\, y \in N\}$.

This paper is related to research on optimality conditions of the Fritz John type and associated subjects, described in the papers by Bertsekas and Ozdaglar [BeO02], Bertsekas, Ozdaglar, and Tseng [BOT04], and the book [BNO03]. We generally use the terminology of these works. A Lagrange multiplier associated with a local minimum $x^*$ is a vector $\mu = (\mu_1, \ldots, \mu_r)$ such that

    $\Big(\nabla f(x^*) + \sum_{j=1}^r \mu_j \nabla g_j(x^*)\Big)' y \ge 0, \quad \forall\, y \in T_X(x^*),$

    $\mu_j \ge 0, \quad j = 1, \ldots, r, \qquad \mu_j = 0, \quad \forall\, j \notin A(x^*),$

where $A(x^*) = \{j \mid g_j(x^*) = 0\}$ is the index set of inequality constraints that are active at $x^*$. The set of Lagrange multipliers corresponding to $x^*$ is a (possibly empty) closed and convex set. Conditions for existence of at least one Lagrange multiplier are given in many sources, including the books [BSS93], [Ber99], and [BNO03], and the survey [Roc93].

We will show the following sensitivity result. The proof is given in the next section.

Proposition 1. Let $x^*$ be a local minimum of problem (P), assume that the set of Lagrange multipliers is nonempty, and let $\mu^*$ be the vector of minimum norm on this set. Then for every sequence $\{x_k\} \subset X$ of infeasible vectors such that $x_k \to x^*$, we have

    $f(x^*) - f(x_k) \le \|\mu^*\|\,\|g^+(x_k)\| + o(\|x_k - x^*\|).$        (1.3)

Furthermore, if $\mu^* \ne 0$ and $T_X(x^*)$ is convex, the preceding inequality is sharp in the sense that there exists a sequence of infeasible vectors $\{x_k\} \subset X$ such that $x_k \to x^*$ and

    $\lim_{k \to \infty} \dfrac{f(x^*) - f(x_k)}{\|g^+(x_k)\|} = \|\mu^*\|.$        (1.4)

For this sequence, we have

    $\lim_{k \to \infty} \dfrac{g^+(x_k)}{\|g^+(x_k)\|} = \dfrac{\mu^*}{\|\mu^*\|}.$        (1.5)
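
To make the minimum-norm multiplier $\mu^*$ concrete, the following sketch (not part of the paper; Python with NumPy/SciPy is an assumption of the illustration) treats the simplest setting $X = \Re^n$, where $T_X(x^*) = \Re^n$ and the defining condition above reduces to $\nabla f(x^*) + \sum_j \mu_j \nabla g_j(x^*) = 0$ with $\mu \ge 0$. In a hypothetical one-dimensional instance where the constraint $-x \le 0$ is imposed twice, the multiplier set is the segment $\{\mu \ge 0 \mid \mu_1 + \mu_2 = 1\}$, and its element of minimum norm can be computed by a small quadratic program:

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical instance (not from the paper): minimize f(x) = x over X = R
    # subject to g_1(x) = -x <= 0 and g_2(x) = -x <= 0; both constraints are
    # active at the local minimum x* = 0.
    grad_f = np.array([1.0])              # gradient of f at x*
    grad_g = np.array([[-1.0], [-1.0]])   # rows: gradients of g_1, g_2 at x*

    # Minimum-norm element of {mu >= 0 : grad_f + sum_j mu_j grad_g_j = 0}.
    res = minimize(
        lambda mu: mu @ mu,               # squared norm of mu
        x0=np.array([1.0, 0.0]),          # a feasible multiplier to start from
        method="SLSQP",
        bounds=[(0.0, None), (0.0, None)],
        constraints=[{"type": "eq",
                      "fun": lambda mu: grad_f + grad_g.T @ mu}],
    )
    print(res.x)                          # approximately [0.5, 0.5]

Every $(\mu_1, 1 - \mu_1)$ with $0 \le \mu_1 \le 1$ is a Lagrange multiplier here; the computation singles out $(1/2, 1/2)$, the vector of minimum norm, which is the multiplier to which Proposition 1 refers.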

A sensitivity result of this type was first given by Bertsekas, Ozdaglar, and Tseng [BOT04], for the case of a convex, possibly nondifferentiable problem. In that paper, $X$ was assumed convex, and the functions $f$ and $g_j$ were assumed convex over $X$ (rather than smooth). Using the definition of the dual function, $q(\mu) = \inf_{x \in X}\{f(x) + \mu' g(x)\}$, it can be seen that

    $f(x) \ge q^* - \|\mu^*\|\,\|g^+(x)\|, \qquad \forall\, x \in X,$

where $q^*$ is the dual optimal value (assumed finite), and $\mu^*$ is the dual optimal solution of minimum norm (assuming a dual optimal solution exists). The inequality was shown to be sharp, assuming that $\mu^* \ne 0$, in the sense that there exists a sequence of infeasible vectors $\{x_k\} \subset X$ such that

    $\lim_{k \to \infty} \dfrac{q^* - f(x_k)}{\|g^+(x_k)\|} = \|\mu^*\|.$

This result is consistent with Prop. 1. However, the line of analysis of the present paper is different, and in fact simpler, because it relies on the machinery of differentiable calculus rather than convex analysis (there is a connection with convex analysis, but it is embodied in Lemma 1, given in the next section).

Note that Prop. 1 establishes the optimal rate of cost improvement with respect to infeasible constraint perturbations, under much weaker assumptions than earlier results for nonconvex problems. For example, classical sensitivity results include second order sufficiency assumptions guaranteeing that the Lagrange multiplier is unique and that the perturbation function is differentiable (see, e.g., [Ber99]). More recent analyses (see, e.g., Bonnans and Shapiro [BoS00], Section 5.2) also require considerably stronger conditions than ours.

Note also that under our weak assumptions, a sensitivity analysis based on the directional derivative of the perturbation function

    $p(u) = \inf_{x \in X,\ g_j(x) \le u_j,\ j = 1, \ldots, r} f(x)$

is not appropriate. The reason is that our assumptions do not preclude the possibility that $p$ has a discontinuous directional derivative at $u = 0$, as illustrated by the following example, first discussed in [BOT04].
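
The convex bound quoted above can be checked on a hypothetical one-dimensional instance (an illustration only, not an example taken from [BOT04]; Python/NumPy is assumed): take $X = \Re$, $f(x) = x$, $g(x) = -x$, so that $q(\mu) = \inf_x\{x - \mu x\}$ is finite only for $\mu = 1$, giving $q^* = 0$ and $\mu^* = 1$.

    import numpy as np

    # Toy convex instance (not from [BOT04]): X = R, f(x) = x, g(x) = -x.
    # Dual function q(mu) = inf_x {x - mu*x} is 0 for mu = 1 and -inf otherwise,
    # so q* = 0 and the minimum-norm dual optimal solution is mu* = 1.
    q_star, mu_star = 0.0, 1.0
    f = lambda x: x
    g_plus = lambda x: max(0.0, -x)       # positive part of g(x) = -x

    for x in np.linspace(-2.0, 2.0, 9):
        assert f(x) >= q_star - mu_star * g_plus(x) - 1e-12
    # For infeasible points x < 0 the bound holds with equality, so
    # (q* - f(x)) / g^+(x) = 1 = ||mu*||, matching the sharpness statement.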

Example 1. Consider the two-dimensional problem

    minimize   $-x_2$
    subject to $x \in X = \{x \mid x_2^2 \le x_1\}$,   $g_1(x) = x_1 \le 0$,   $g_2(x) = x_2 \le 0$.

Here the perturbation function is

    $p(u) = \begin{cases} -u_2 & \text{if } u_2^2 \le u_1, \\ -\sqrt{u_1} & \text{if } u_1 < u_2^2,\ u_1 \ge 0,\ u_2 \ge 0, \\ \infty & \text{otherwise.} \end{cases}$

It can be verified that $x^* = 0$ is the global minimum (in fact the unique feasible solution) and that the set of Lagrange multipliers is

    $\{(\mu_1, 1) \mid \mu_1 \ge 0\},$

so the Lagrange multiplier of minimum norm is $\mu^* = (0, 1)$. Consistently with the preceding proposition, for the sequence $x_k = (1/k^2, 1/k)$, we have

    $\lim_{k \to \infty} \dfrac{f(x^*) - f(x_k)}{\|g^+(x_k)\|} = 1 = \|\mu^*\|.$

However, $\mu^* = (0, 1)$ is not a direction of steepest descent of $p$, since starting at $u = 0$ and going along the direction $(0, 1)$, $p(u)$ is equal to 0, so $p'(0; \mu^*) = 0$. In fact $p$ has no direction of steepest descent at $u = 0$, because $p'(0; \cdot)$ is not continuous or even lower semicontinuous. However, one may achieve the optimal improvement rate of $\|\mu^*\|$ by using constraint perturbations that lie on the curved boundary of $X$.
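
The limit above is easy to verify numerically (an illustrative script, not part of the paper; Python/NumPy is assumed, and the values are those derived in Example 1):

    import numpy as np

    f = lambda x: -x[1]
    g_plus = lambda x: np.maximum(x, 0.0)      # here g(x) = (x_1, x_2) = x
    x_star = np.zeros(2)
    mu_star = np.array([0.0, 1.0])             # minimum-norm Lagrange multiplier

    for k in [10, 100, 1000, 10000]:
        x_k = np.array([1.0 / k**2, 1.0 / k])  # infeasible points on the curved boundary of X
        ratio = (f(x_star) - f(x_k)) / np.linalg.norm(g_plus(x_k))
        print(k, ratio)                        # tends to 1 = ||mu_star||

By contrast, perturbations along the straight direction $\mu^* = (0, 1)$ leave $p$ equal to 0; the optimal rate is attained only along the curved boundary $x_1 = x_2^2$, in agreement with the discussion above.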

Finally, let us illustrate with an example how our sensitivity result fails when the convexity assumption on $T_X(x^*)$ is violated. In this connection, it is worth noting that nonconvexity of $T_X(x^*)$ implies that $X$ is not regular at $x^*$ (in the terminology of nonsmooth analysis; see [BNO03] and [RoW98]), and this is a major source of exceptional behavior in relation to Lagrange multipliers (see [BNO03], Chapter 5).

Example 2. In this two-dimensional example, there are two linear constraints,

    $g_1(x) = x_1 + x_2 \le 0, \qquad g_2(x) = -x_1 + x_2 \le 0,$

and the set $X$ is the (nonconvex) cone

    $X = \{x \mid (x_1 + x_2)(x_1 - x_2) = 0\}$

(see Fig. 1). Let the cost function be

    $f(x) = x_1^2 + (x_2 - 1)^2.$

[Fig. 1: Constraints of Example 2. The set $X$ consists of the two lines shown; the feasible region is the lower portion, where $x_2 \le 0$.]

The vector $x^* = (0, 0)$ is a local minimum, and we have $T_X(x^*) = X$, so $T_X(x^*)$ is not convex. A Lagrange multiplier is a nonnegative vector $(\mu_1^*, \mu_2^*)$ such that

    $\big(\nabla f(x^*) + \mu_1^* \nabla g_1(x^*) + \mu_2^* \nabla g_2(x^*)\big)' d \ge 0, \qquad \forall\, d \in T_X(x^*),$

from which, since $T_X(x^*)$ contains the vectors $(1, 1)$, $(-1, -1)$, $(1, -1)$, and $(-1, 1)$, we obtain

    $\mu_1^* - \mu_2^* = 0, \qquad \mu_1^* + \mu_2^* - 2 = 0.$

Thus the unique Lagrange multiplier vector is $\mu^* = (1, 1)$.

There are two types of sequences $\{x_k\} \subset X$ (and mixtures of these two) that are infeasible and converge to $x^*$: those that approach $x^*$ along the boundary of the constraint $x_1 + x_2 \le 0$ [these have the form $(-\xi_k, \xi_k)$, where $\xi_k > 0$ and $\xi_k \to 0$], and those that approach $x^*$ along the boundary of the constraint $-x_1 + x_2 \le 0$ [these have the form $(\xi_k, \xi_k)$, where $\xi_k > 0$ and $\xi_k \to 0$]. For any of these sequences, we have $f(x_k) = \xi_k^2 + (\xi_k - 1)^2$ and $\|g^+(x_k)\| = 2\xi_k$, so

    $\lim_{k \to \infty} \dfrac{f(x^*) - f(x_k)}{\|g^+(x_k)\|} = 1 < \sqrt{2} = \|\mu^*\|.$

Thus $\|\mu^*\|$ is strictly larger than the optimal rate of cost improvement, and the conclusion of Prop. 1 fails.
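
As with Example 1, the limiting ratio can be checked numerically (an illustrative script, not part of the paper; Python/NumPy is assumed, and it uses the cost and constraints as written above):

    import numpy as np

    f = lambda x: x[0]**2 + (x[1] - 1.0)**2
    g = lambda x: np.array([x[0] + x[1], -x[0] + x[1]])
    g_plus = lambda x: np.maximum(g(x), 0.0)
    x_star = np.zeros(2)
    mu_star = np.array([1.0, 1.0])               # the unique Lagrange multiplier

    for xi in [1e-1, 1e-2, 1e-3]:
        for x_k in (np.array([-xi, xi]), np.array([xi, xi])):  # infeasible points of X
            ratio = (f(x_star) - f(x_k)) / np.linalg.norm(g_plus(x_k))
            print(xi, ratio)                     # tends to 1, while ||mu_star|| = sqrt(2)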

2 Proof

Let $\{x_k\} \subset X$ be a sequence of infeasible vectors such that $x_k \to x^*$. We will show the bound (1.3). The sequence $\{(x_k - x^*)/\|x_k - x^*\|\}$ is bounded and each of its limit points belongs to $T_X(x^*)$. Without loss of generality, we assume that $\{(x_k - x^*)/\|x_k - x^*\|\}$ converges to a vector $d \in T_X(x^*)$. Then for the minimum norm Lagrange multiplier $\mu^*$, we have

    $\Big(\nabla f(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*)\Big)' d \ge 0.$        (2.1)

Denote

    $\xi_k = \dfrac{x_k - x^*}{\|x_k - x^*\|} - d.$

We have

    $\Big(\nabla f(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*)\Big)'(x_k - x^*) = \|x_k - x^*\| \Big(\nabla f(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*)\Big)'(d + \xi_k),$

so using Eq. (2.1) and the fact $\xi_k \to 0$, we have

    $\Big(\nabla f(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*)\Big)'(x_k - x^*) \ge o(\|x_k - x^*\|).$        (2.2)

Using Eq. (2.2), a Taylor expansion, and the fact $\mu^{*\prime} g(x^*) = 0$, we have

    $f(x_k) + \mu^{*\prime} g(x_k) = f(x^*) + \Big(\nabla f(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*)\Big)'(x_k - x^*) + o(\|x_k - x^*\|) \ge f(x^*) + o(\|x_k - x^*\|).$

We thus obtain, using the fact $\mu^* \ge 0$,

    $f(x^*) - f(x_k) \le \mu^{*\prime} g(x_k) + o(\|x_k - x^*\|) \le \mu^{*\prime} g^+(x_k) + o(\|x_k - x^*\|),$

and using the Cauchy-Schwarz inequality,

    $f(x^*) - f(x_k) \le \|\mu^*\|\,\|g^+(x_k)\| + o(\|x_k - x^*\|),$

which is the desired bound (1.3).

For the proof that the bound is sharp, we will need the following lemma, first given in Bertsekas and Ozdaglar [BeO02] (see also [BNO03], Lemma 5.3.1).

Lemma 1. Let $N$ be a closed convex cone in $\Re^n$, and let $a_0, \ldots, a_r$ be given vectors in $\Re^n$. Suppose that the set

    $M = \Big\{\mu \ge 0 \ \Big|\ -\Big(a_0 + \sum_{j=1}^r \mu_j a_j\Big) \in N\Big\}$

is nonempty, and let $\mu^*$ be the vector of minimum norm in $M$. Then, there exists a sequence $\{d_k\} \subset N^*$ such that

    $a_0' d_k \to -\|\mu^*\|^2, \qquad \big(a_j' d_k\big)^+ \to \mu_j^*, \quad j = 1, \ldots, r.$

For simplicity, we assume that all the constraints are active at $x^*$. Inactive inequality constraints can be neglected, since the subsequent analysis focuses on a small neighborhood of $x^*$, within which these constraints remain inactive. We will use Lemma 1 with the following identifications:

    $N = T_X(x^*)^*$,   $a_0 = \nabla f(x^*)$,   $a_j = \nabla g_j(x^*)$,   $j = 1, \ldots, r$,
    $M$ = set of Lagrange multipliers,
    $\mu^*$ = Lagrange multiplier of minimum norm.

Since $T_X(x^*)$ is closed and is assumed convex, we have $N^* = T_X(x^*)$, so Lemma 1 yields a sequence $\{d_k\} \subset T_X(x^*)$ such that

    $\nabla f(x^*)' d_k \to -\|\mu^*\|^2, \qquad \big(\nabla g_j(x^*)' d_k\big)^+ \to \mu_j^*, \quad j = 1, \ldots, r.$

Since $d_k \in T_X(x^*)$, for each $k$ we can select a sequence $\{x_{k,t}\} \subset X$ such that $x_{k,t} \ne x^*$ for all $t$ and

    $\lim_{t \to \infty} x_{k,t} = x^*, \qquad \lim_{t \to \infty} \dfrac{x_{k,t} - x^*}{\|x_{k,t} - x^*\|} = d_k.$

Denote

    $\xi_{k,t} = \dfrac{x_{k,t} - x^*}{\|x_{k,t} - x^*\|} - d_k.$

For each $k$, we select $t_k$ sufficiently large so that

    $\|x_{k,t_k} - x^*\| \le \dfrac{1}{k}, \qquad \|\xi_{k,t_k}\| \le \dfrac{1}{k},$

and we denote

    $x_k = x_{k,t_k}, \qquad \xi_k = \xi_{k,t_k}.$

Thus, we have $x_k \to x^*$ and $\xi_k \to 0$, with

    $\dfrac{x_k - x^*}{\|x_k - x^*\|} = d_k + \xi_k.$

Using a first order expansion for the cost function $f$, we have for each $k$,

    $f(x_k) - f(x^*) = \nabla f(x^*)'(x_k - x^*) + o(\|x_k - x^*\|)$
    $\quad = \nabla f(x^*)'(d_k + \xi_k)\,\|x_k - x^*\| + o(\|x_k - x^*\|)$
    $\quad = \|x_k - x^*\| \Big(\nabla f(x^*)' d_k + \nabla f(x^*)' \xi_k + \dfrac{o(\|x_k - x^*\|)}{\|x_k - x^*\|}\Big),$

and, since $\xi_k \to 0$ and $\nabla f(x^*)' d_k \to -\|\mu^*\|^2$,

    $f(x_k) - f(x^*) = -\|x_k - x^*\|\,\|\mu^*\|^2 + o(\|x_k - x^*\|).$        (2.5)

Similarly, using also the fact $g_j(x^*) = 0$, we have for each $k$,

    $g_j(x_k) = \|x_k - x^*\|\,\mu_j^* + o(\|x_k - x^*\|), \quad j = 1, \ldots, r,$

from which we also have

    $g_j^+(x_k) = \|x_k - x^*\|\,\mu_j^* + o(\|x_k - x^*\|), \quad j = 1, \ldots, r.$        (2.6)

We thus obtain

    $\|g^+(x_k)\| = \|x_k - x^*\|\,\|\mu^*\| + o(\|x_k - x^*\|),$        (2.7)

which, since $\|\mu^*\| \ne 0$, also shows that $\|g^+(x_k)\| \ne 0$ for all sufficiently large $k$. Without loss of generality, we assume that

    $\|g^+(x_k)\| \ne 0, \quad k = 0, 1, \ldots$        (2.8)

By multiplying Eq. (2.7) with $\|\mu^*\|$, we see that

    $\|\mu^*\|\,\|g^+(x_k)\| = \|x_k - x^*\|\,\|\mu^*\|^2 + o(\|x_k - x^*\|).$        (2.9)

Combining Eqs. (2.5) and (2.9), we obtain

    $f(x^*) - f(x_k) = \|\mu^*\|\,\|g^+(x_k)\| + o(\|x_k - x^*\|),$

which together with Eqs. (2.7) and (2.8), shows that

    $\dfrac{f(x^*) - f(x_k)}{\|g^+(x_k)\|} = \|\mu^*\| + \dfrac{o(\|x_k - x^*\|)}{\|x_k - x^*\|\,\|\mu^*\| + o(\|x_k - x^*\|)}.$

Taking the limit as $k \to \infty$ and using the fact $\|\mu^*\| \ne 0$, we obtain

    $\lim_{k \to \infty} \dfrac{f(x^*) - f(x_k)}{\|g^+(x_k)\|} = \|\mu^*\|,$

which is Eq. (1.4). Finally, from Eqs. (2.6) and (2.7), we see that

    $\dfrac{g_j^+(x_k)}{\|g^+(x_k)\|} = \dfrac{\mu_j^*}{\|\mu^*\|} + o(1), \quad j = 1, \ldots, r,$

from which Eq. (1.5) follows.  □

References

[BSS93] Bazaraa, M. S., Sherali, H. D., and Shetty, C. M., Nonlinear Programming: Theory and Algorithms, 2nd Ed., Wiley, N.Y. (1993).
[BNO03] Bertsekas, D. P., with Nedić, A., and Ozdaglar, A. E., Convex Analysis and Optimization, Athena Scientific, Belmont, MA (2003).
[BOT04] Bertsekas, D. P., Ozdaglar, A. E., and Tseng, P., Enhanced Fritz John Conditions for Convex Programming, Lab. for Information and Decision Systems Report 2631, MIT (2004). To appear in SIAM J. on Optimization.
[BeO02] Bertsekas, D. P., and Ozdaglar, A. E., Pseudonormality and a Lagrange Multiplier Theory for Constrained Optimization, J. Opt. Theory Appl., Vol. 114, pp. 287-343 (2002).
[Ber99] Bertsekas, D. P., Nonlinear Programming, 2nd Edition, Athena Scientific, Belmont, MA (1999).
[BoS00] Bonnans, J. F., and Shapiro, A., Perturbation Analysis of Optimization Problems, Springer-Verlag, N.Y. (2000).
[RoW98] Rockafellar, R. T., and Wets, R. J.-B., Variational Analysis, Springer-Verlag, Berlin (1998).
[Roc93] Rockafellar, R. T., Lagrange Multipliers and Optimality, SIAM Review, Vol. 35, pp. 183-238 (1993).
