October 2003

The Relation Between Pseudonormality and Quasiregularity in Constrained Optimization 1

by Asuman E. Ozdaglar and Dimitri P. Bertsekas 2

Abstract

We consider optimization problems with equality, inequality, and abstract set constraints. Our goal is to investigate the connections between various characteristics of the constraint set that imply the existence of Lagrange multipliers. For problems with no abstract set constraint, the classical condition of quasiregularity provides the connecting link between the most common constraint qualifications and the existence of Lagrange multipliers. In earlier work, we introduced a new and general condition, pseudonormality, that is central within the theory of constraint qualifications, exact penalty functions, and existence of Lagrange multipliers. In this paper, we explore the relations between pseudonormality, quasiregularity, and the existence of Lagrange multipliers in the presence of an abstract set constraint. In particular, under a regularity assumption on the abstract constraint set, we show that pseudonormality implies quasiregularity. However, contrary to pseudonormality, quasiregularity does not imply the existence of Lagrange multipliers, except under additional assumptions.

1 Research supported by NSF under Grant ACI-9873339.
2 Dept. of Electrical Engineering and Computer Science, M.I.T., Cambridge, MA 02139.

1. INTRODUCTION

In this paper, we consider finite-dimensional optimization problems of the form

$$\text{minimize } f(x) \quad \text{subject to } x \in C, \qquad (1.1)$$

where the constraint set $C$ consists of equality and inequality constraints as well as an additional abstract set constraint $x \in X$:

$$C = X \cap \{x \mid h_1(x) = 0, \ldots, h_m(x) = 0\} \cap \{x \mid g_1(x) \le 0, \ldots, g_r(x) \le 0\}. \qquad (1.2)$$

We assume throughout the paper that $f$, $h_i$, $g_j$ are smooth (continuously differentiable) functions from $\mathbb{R}^n$ to $\mathbb{R}$, and $X$ is a nonempty closed set.

The analysis of constrained optimization problems is centered around characterizing how the cost function behaves as we move from a local minimum to neighboring feasible points. The relevant variations can be studied in terms of various conical approximations to the constraint set. In this paper, we use two such approximations, namely the tangent cone and the normal cone, which are particularly useful in characterizing local optimality of feasible solutions of the above problem (see [BNO03]).

In our terminology, a vector $y$ is a tangent of a set $S \subset \mathbb{R}^n$ at a vector $x \in S$ if either $y = 0$ or there exists a sequence $\{x_k\} \subset S$ such that $x_k \ne x$ for all $k$, $x_k \to x$, and
$$\frac{x_k - x}{\|x_k - x\|} \to \frac{y}{\|y\|}.$$
The set of all tangents of $S$ at $x$ is denoted by $T_S(x)$ and is also referred to as the tangent cone of $S$ at $x$. The polar cone of any cone $T$ is defined by
$$T^* = \{z \mid z'y \le 0,\ \forall\, y \in T\}.$$

The normal cone is another conical approximation that is useful in the context of optimality conditions and is of central importance in nonsmooth analysis (see Mordukhovich [Mor76], Aubin and Frankowska [AuF90], Rockafellar and Wets [RoW98], and Borwein and Lewis [BoL00]). For a closed set $X$ and a point $x \in X$, the normal cone of $X$ at $x$, denoted by $N_X(x)$, is obtained from the polar cone $T_X(x)^*$ by means of a closure operation. In particular, we have $z \in N_X(x)$ if there exist sequences $\{x_k\} \subset X$ and $\{z_k\}$ such that $x_k \to x$, $z_k \to z$, and $z_k \in T_X(x_k)^*$ for all $k$.
From the definition of the normal cone, an important closedness property follows: if $\{x_k\} \subset X$ is a sequence that converges to some $x^* \in X$, and $\{z_k\}$ is a sequence that converges to some $z^*$ with $z_k \in N_X(x_k)$ for all $k$, then $z^* \in N_X(x^*)$ (see Rockafellar and Wets [RoW98], Proposition 6.6).

It can be seen that, for any $x \in X$, we have $T_X(x)^* \subset N_X(x)$. Note that $N_X(x)$ may not always be equal to $T_X(x)^*$. When $T_X(x)^* = N_X(x)$, we say that $X$ is regular at $x$ (see Rockafellar and Wets [RoW98], p. 199). Regularity is an important property, which distinguishes problems that have a satisfactory Lagrange multiplier theory from those that do not, as has been emphasized in earlier work ([Roc93], [BeO00], [BNO03]). In this paper, we further elaborate on the significance of this property of the constraint set in relation to the existence of Lagrange multipliers.

A classical necessary condition for a vector $x^* \in C$ to be a local minimum of $f$ over $C$ is
$$\nabla f(x^*)' y \ge 0, \qquad \forall\, y \in T_C(x^*), \qquad (1.3)$$
where $T_C(x^*)$ is the tangent cone of $C$ at $x^*$ (see, e.g., Bazaraa, Sherali, and Shetty [BSS93], Bertsekas [Ber99], Hestenes [Hes75], Rockafellar [Roc93], Rockafellar and Wets [RoW98]). When the constraint set is specified in terms of equality and inequality constraints, as in problem (1.1)-(1.2), more refined optimality conditions can be obtained. In particular, we say that the constraint set $C$ of Eq. (1.2) admits Lagrange multipliers at a point $x^* \in C$ if for every smooth cost function $f$ for which $x^*$ is a local minimum of problem (1.1), there exist vectors $\lambda^* = (\lambda_1^*, \ldots, \lambda_m^*)$ and $\mu^* = (\mu_1^*, \ldots, \mu_r^*)$ that satisfy the following conditions:
$$\Big(\nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*)\Big)' y \ge 0, \qquad \forall\, y \in T_X(x^*), \qquad (1.4)$$
$$\mu_j^* \ge 0, \qquad j = 1, \ldots, r, \qquad (1.5)$$
$$\mu_j^* = 0, \qquad \forall\, j \notin A(x^*), \qquad (1.6)$$
where $A(x^*) = \{j \mid g_j(x^*) = 0\}$ is the index set of inequality constraints that are active at $x^*$. We refer to such a pair $(\lambda^*, \mu^*)$ as a Lagrange multiplier vector corresponding to $f$ and $x^*$, or simply a Lagrange multiplier.
In the case where $X = \mathbb{R}^n$, a typical approach to asserting the admittance of Lagrange multipliers is to assume structure in the constraint set which guarantees that the tangent cone $T_C(x^*)$ has the form $T_C(x^*) = V(x^*)$, where $V(x^*)$ is the cone of first order feasible variations at $x^*$, given by
$$V(x^*) = \big\{y \mid \nabla h_i(x^*)' y = 0,\ i = 1, \ldots, m;\ \ \nabla g_j(x^*)' y \le 0,\ j \in A(x^*)\big\}. \qquad (1.7)$$
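For $X = \mathbb{R}^n$, these objects are easy to compute on a concrete instance. The following sketch uses a hypothetical two-variable problem (not taken from the paper): it recovers the multiplier of conditions (1.4)-(1.6) and checks the variational inequality $\nabla f(x^*)'y \ge 0$ over sampled elements of $V(x^*)$ from Eq. (1.7).

```python
import numpy as np

# Hypothetical instance with X = R^2 (so T_X(x*) = R^2 and N_X(x*) = {0}):
#   minimize f(x) = x1 + x2   subject to g(x) = x1^2 + x2^2 - 2 <= 0.
# The minimum is attained at x* = (-1, -1), where g is active.

x_star = np.array([-1.0, -1.0])
grad_f = np.array([1.0, 1.0])     # gradient of f (constant)
grad_g = 2.0 * x_star             # gradient of g at x*, i.e. (-2, -2)

# Lagrange multiplier from stationarity grad_f + mu * grad_g = 0
# (least-squares solve; here the system is consistent).
sol, *_ = np.linalg.lstsq(grad_g.reshape(-1, 1), -grad_f, rcond=None)
mu = float(sol[0])
assert mu >= 0.0                                 # condition (1.5)
assert np.allclose(grad_f + mu * grad_g, 0.0)    # condition (1.4) with T_X = R^2

# Sanity check of Eq. (1.3): f'(x*) y >= 0 for sampled y in V(x*), where
# V(x*) = {y : grad_g' y <= 0} since the only constraint is active at x*.
rng = np.random.default_rng(0)
for y in rng.standard_normal((1000, 2)):
    if grad_g @ y <= 0.0:                        # y lies in V(x*)
        assert grad_f @ y >= -1e-12
print("mu =", mu)                                # mu = 0.5
```

The same computation with a nonpolyhedral $X$ would require the extra term $z \in T_X(x^*)^*$ in the stationarity condition, which is exactly where the abstract-set theory of this paper enters.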

When $T_C(x^*) = V(x^*)$ holds, we say that $x^*$ is a quasiregular point, or that quasiregularity holds at $x^*$ [other terms used are: $x^*$ satisfies Abadie's constraint qualification (Abadie [Aba67], Bazaraa, Sherali, and Shetty [BSS93]), or $x^*$ is a regular point (Hestenes [Hes75])]. When there is no abstract set constraint, it is well known (see, e.g., Bertsekas [Ber99], p. 332) that for a given smooth $f$ for which $x^*$ is a local minimum, there exist Lagrange multipliers if and only if
$$\nabla f(x^*)' y \ge 0, \qquad \forall\, y \in V(x^*).$$
If $x^*$ is a quasiregular local minimum, it follows from Eq. (1.3) that the above condition holds, so the constraint set admits Lagrange multipliers at $x^*$. Thus, a common line of analysis when $X = \mathbb{R}^n$ is to establish various conditions, also known as constraint qualifications, which imply quasiregularity, and therefore imply that the constraint set admits Lagrange multipliers (see, e.g., Bertsekas [Ber99], or Bazaraa, Sherali, and Shetty [BSS93]). This line of analysis, however, requires fairly complicated proofs to establish the relations of constraint qualifications to quasiregularity.

A general constraint qualification, called quasinormality, was introduced for the special case where $X = \mathbb{R}^n$ by Hestenes in [Hes75]. Hestenes also showed that quasinormality implies quasiregularity (see also Bertsekas [Ber99], Proposition 3.3.17). Since it is simple to show that the major classical constraint qualifications imply quasinormality (see, e.g., Bertsekas [Ber99]), this provides an alternative line of proof that these constraint qualifications imply quasiregularity.

The extension of quasinormality to the case where $X \ne \mathbb{R}^n$ was investigated in [BeO00]. In particular, we say that a feasible vector $x^*$ of problem (1.1)-(1.2) is quasinormal if there are no scalars $\lambda_1, \ldots, \lambda_m, \mu_1, \ldots, \mu_r$, and no sequence $\{x_k\} \subset X$ such that:

(i) $-\Big(\sum_{i=1}^m \lambda_i \nabla h_i(x^*) + \sum_{j=1}^r \mu_j \nabla g_j(x^*)\Big) \in N_X(x^*)$.

(ii) $\mu_j \ge 0$, for all $j = 1, \ldots, r$.

(iii) $\lambda_1, \ldots, \lambda_m, \mu_1, \ldots, \mu_r$ are not all equal to 0.

(iv) $\{x_k\}$ converges to $x^*$, and for each $k$, $\lambda_i h_i(x_k) > 0$ for all $i$ with $\lambda_i \ne 0$, and $\mu_j g_j(x_k) > 0$ for all $j$ with $\mu_j \ne 0$.

A slightly stronger notion, known as pseudonormality, was also introduced in [BeO00], and was shown to form the connecting link between major constraint qualifications, admittance of Lagrange multipliers, and exact penalty function theory, for the general case where $X \ne \mathbb{R}^n$. In particular, we say that a feasible vector $x^*$ of problem (1.1)-(1.2) is pseudonormal if there are no scalars $\lambda_1, \ldots, \lambda_m, \mu_1, \ldots, \mu_r$, and no sequence $\{x_k\} \subset X$ such that:

(i) $-\Big(\sum_{i=1}^m \lambda_i \nabla h_i(x^*) + \sum_{j=1}^r \mu_j \nabla g_j(x^*)\Big) \in N_X(x^*)$.

(ii) $\mu_j \ge 0$, for all $j = 1, \ldots, r$, and $\mu_j = 0$ for all $j \notin A(x^*)$.

(iii) $\{x_k\}$ converges to $x^*$ and
$$\sum_{i=1}^m \lambda_i h_i(x_k) + \sum_{j=1}^r \mu_j g_j(x_k) > 0, \qquad \forall\, k. \qquad (1.8)$$

In this paper, we focus on the following extension of the notion of quasiregularity, adopted in several treatments of Lagrange multiplier theory (see, e.g., Rockafellar [Roc93]), for the case where $X$ may be a strict subset of $\mathbb{R}^n$. We say that a feasible vector $x^*$ of problem (1.1)-(1.2) is quasiregular if
$$T_C(x^*) = V(x^*) \cap T_X(x^*),$$
where $V(x^*)$ is the cone of first order feasible variations [cf. Eq. (1.7)].

We investigate the connections between pseudonormality, quasiregularity, and admittance of Lagrange multipliers when $X \ne \mathbb{R}^n$. In Section 2, we show that under a regularity assumption on $X$, pseudonormality implies quasiregularity. Our line of proof is not only more general than the one of Hestenes (which applies only to the case $X = \mathbb{R}^n$), but is also considerably simpler. In Section 3, we focus on the relation of quasiregularity and admittance of Lagrange multipliers. We show that, contrary to the case where $X = \mathbb{R}^n$, quasiregularity by itself is not sufficient to guarantee the existence of a Lagrange multiplier. Thus the importance of quasiregularity, which constitutes the classical pathway to Lagrange multipliers when $X = \mathbb{R}^n$, diminishes when $X \ne \mathbb{R}^n$. On the other hand, pseudonormality still provides a unification of the theory.

Regarding notation, all vectors are viewed as column vectors, and a prime denotes transposition, so $x'y$ denotes the inner product of the vectors $x$ and $y$. We use throughout the standard Euclidean norm $\|x\| = (x'x)^{1/2}$. We denote the convex hull and the closure of a set $C$ by $\mathrm{conv}(C)$ and $\mathrm{cl}(C)$, respectively.

2. PSEUDONORMALITY, QUASINORMALITY, AND QUASIREGULARITY

In this section, we investigate the connection of pseudonormality and quasiregularity. In particular, we show that under a regularity assumption on $X$, quasinormality implies quasiregularity even when $X \ne \mathbb{R}^n$. This shows that any constraint qualification that implies quasinormality also implies quasiregularity. Moreover, since pseudonormality implies quasinormality, it follows that under the given assumption, pseudonormality also implies quasiregularity.

We first give some basic results regarding cones and their polars that will be useful in our analysis. For the proofs, see [Roc70] or [BNO03].

Lemma 1:

(a) Let $C$ be a cone. Then $(C^*)^* = \mathrm{cl}\big(\mathrm{conv}(C)\big)$. In particular, if $C$ is closed and convex, $(C^*)^* = C$.

(b) Let $C_1$ and $C_2$ be two cones. If $C_1 \subset C_2$, then $C_2^* \subset C_1^*$.

(c) Let $C_1$ and $C_2$ be two cones. Then $(C_1 + C_2)^* = C_1^* \cap C_2^*$ and $C_1^* + C_2^* \subset (C_1 \cap C_2)^*$. In particular, if $C_1$ and $C_2$ are closed and convex, $(C_1 \cap C_2)^* = \mathrm{cl}(C_1^* + C_2^*)$.

Next, we prove the following lemma, which relates to the properties of quasinormality. From here on in our analysis, we assume for simplicity that all the constraints of problem (1.1)-(1.2) are inequalities. (Equality constraints can be handled by conversion to inequality constraints.)

Lemma 2: If a vector $x^* \in C$ is quasinormal, then all feasible vectors in a neighborhood of $x^*$ are quasinormal.

Proof: Assume that the claim is not true. Then we can find a sequence $\{x^k\} \subset C$ such that $x^k \ne x^*$ for all $k$, $x^k \to x^*$, and $x^k$ is not quasinormal for all $k$. This implies, for each $k$, the existence of scalars $\xi_1^k, \ldots, \xi_r^k$ and a sequence $\{x_l^k\} \subset X$ such that:

(a) $-\sum_{j=1}^r \xi_j^k \nabla g_j(x^k) \in N_X(x^k)$. $\qquad (2.1)$

(b) $\xi_j^k \ge 0$ for all $j = 1, \ldots, r$, and $\xi_1^k, \ldots, \xi_r^k$ are not all equal to 0.

(c) $\lim_{l \to \infty} x_l^k = x^k$, and for all $l$, $\xi_j^k g_j(x_l^k) > 0$ for all $j$ with $\xi_j^k > 0$.

For each $k$, denote
$$\delta^k = \sqrt{\sum_{j=1}^r (\xi_j^k)^2}, \qquad \mu_j^k = \frac{\xi_j^k}{\delta^k}, \quad j = 1, \ldots, r.$$
Since $\delta^k > 0$ and $N_X(x^k)$ is a cone, conditions (a)-(c) for the scalars $\xi_1^k, \ldots, \xi_r^k$ yield the following conditions, which hold for each $k$ for the scalars $\mu_1^k, \ldots, \mu_r^k$:

(i) $-\sum_{j=1}^r \mu_j^k \nabla g_j(x^k) \in N_X(x^k)$. $\qquad (2.2)$

(ii) $\mu_j^k \ge 0$ for all $j = 1, \ldots, r$, and $\mu_1^k, \ldots, \mu_r^k$ are not all equal to 0.

(iii) There exists a sequence $\{x_l^k\} \subset X$ such that $\lim_{l \to \infty} x_l^k = x^k$, and for all $l$, $\mu_j^k g_j(x_l^k) > 0$ for all $j$ with $\mu_j^k > 0$.

Since by construction we have
$$\sum_{j=1}^r (\mu_j^k)^2 = 1, \qquad (2.3)$$
the sequence $\{\mu_1^k, \ldots, \mu_r^k\}$ is bounded and must contain a subsequence that converges to some nonzero limit $\{\mu_1^*, \ldots, \mu_r^*\}$. Assume without loss of generality that $\{\mu_1^k, \ldots, \mu_r^k\}$ converges to $\{\mu_1^*, \ldots, \mu_r^*\}$. Taking the limit in Eq. (2.2) and using the closedness of the normal cone, we see that this limit must satisfy
$$-\sum_{j=1}^r \mu_j^* \nabla g_j(x^*) \in N_X(x^*). \qquad (2.4)$$
Moreover, from condition (ii) and Eq. (2.3), it follows that $\mu_j^* \ge 0$ for all $j$, and $\mu_1^*, \ldots, \mu_r^*$ are not all equal to 0.

Finally, let $J = \{j \mid \mu_j^* > 0\}$. Then there exists some $k_0$ such that for all $k \ge k_0$, we have $\mu_j^k \ge 0$ for all $j = 1, \ldots, r$, and $\mu_j^k > 0$ for all $j \in J$. From condition (iii), it follows that for each $k \ge k_0$, there exists a sequence $\{x_l^k\} \subset X$ with
$$\lim_{l \to \infty} x_l^k = x^k, \qquad g_j(x_l^k) > 0, \quad \forall\, l,\ \forall\, j \in J.$$
For each $k \ge k_0$, choose an index $l_k$ such that $l_1 < \cdots < l_{k-1} < l_k$ and
$$\lim_{k \to \infty} x_{l_k}^k = x^*.$$
Consider the sequence $\{y^k\}$ defined by
$$y^k = x_{l_{k_0+k-1}}^{k_0+k-1}, \qquad k = 1, 2, \ldots$$
It follows from the preceding relations that $\{y^k\} \subset X$ and
$$\lim_{k \to \infty} y^k = x^*, \qquad g_j(y^k) > 0, \quad \forall\, k,\ \forall\, j \in J.$$
The existence of scalars $\mu_1^*, \ldots, \mu_r^*$ that satisfy Eq. (2.4) together with the sequence $\{y^k\}$ that satisfies the preceding relation violates the quasinormality of $x^*$, thus completing the proof. Q.E.D.
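The polar-cone identities of Lemma 1 can also be verified numerically on small polyhedral cones. The sketch below (cone generators chosen purely for illustration, with the polar generators computed by hand) checks part (a), $(C^*)^* = C$, by dense sampling in $\mathbb{R}^2$:

```python
import numpy as np

# Hypothetical polyhedral cone in R^2: columns of G generate C = cone{(1,0),(1,1)}.
G = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Polar C* = {z : z'g <= 0 for every generator g}; for this C one can check by
# hand that C* = cone{(0,-1), (-1,1)} (columns of P).
P = np.array([[0.0, -1.0],
              [-1.0, 1.0]])

def in_generated_cone(G, x, tol=1e-9):
    """Membership x in cone(G): x = G @ mu with mu >= 0 (G is 2x2 invertible here)."""
    mu = np.linalg.solve(G, x)
    return bool(np.all(mu >= -tol))

def in_polar(G, z, tol=1e-9):
    """Membership in the polar of cone(G): z'g <= 0 for all generators g."""
    return bool(np.all(G.T @ z <= tol))

rng = np.random.default_rng(1)
for x in rng.standard_normal((2000, 2)):
    # (C*)* = {y : y'z <= 0 for all z in C*} -- test against the generators of C*.
    assert in_generated_cone(G, x) == in_polar(P, x)
```

For general (non-invertible, higher-dimensional) generator matrices, the membership test `in_generated_cone` would need a nonnegative least-squares or linear-programming solve instead of a direct linear solve; the 2x2 case keeps the sketch dependency-free.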

We mentioned that a classical necessary condition for a vector $x^* \in C$ to be a local minimum of the function $f$ over the set $C$ is
$$-\nabla f(x^*) \in T_C(x^*)^*. \qquad (2.5)$$
An interesting converse was given by Gould and Tolle [GoT71], namely that every vector in $T_C(x^*)^*$ is equal to the negative of the gradient of some function having $x^*$ as a local minimum over $C$. Rockafellar and Wets [RoW98, p. 205] showed that this function can be taken to be smooth over $\mathbb{R}^n$, as in the following lemma.

Lemma 3: Let $x$ be a vector in $C$. For every $y \in T_C(x)^*$, there is a smooth function $F$ with $-\nabla F(x) = y$, which achieves a strict global minimum over $C$ at $x$.

We use this result to obtain a specific representation of a vector that belongs to $T_C(x)^*$ for some $x \in C$ under a quasinormality condition, as given in the following proposition. This result will be central in showing the relation of quasinormality to quasiregularity.

Proposition 1: If $x$ is a quasinormal vector of $C$, then any $y \in T_C(x)^*$ can be represented as
$$y = z + \sum_{j=1}^r \mu_j \nabla g_j(x),$$
where $z \in N_X(x)$ and $\mu_j \ge 0$ for all $j = 1, \ldots, r$. Furthermore, there exists a sequence $\{x^k\} \subset X$ that converges to $x$ and is such that $\mu_j g_j(x^k) > 0$ for all $k$ and all $j$ with $\mu_j > 0$.

Proof: Let $y$ be a vector that belongs to $T_C(x)^*$. By Lemma 3, there exists a smooth function $F$ that achieves a strict global minimum over $C$ at $x$, with $-\nabla F(x) = y$. We use a quadratic penalty function approach: choose a scalar $\epsilon > 0$, and for each $k = 1, 2, \ldots$, consider the penalized problem
$$\text{minimize } F^k(x) \quad \text{subject to } x \in X \cap S,$$
where
$$F^k(x) = F(x) + \frac{k}{2} \sum_{j=1}^r \big(g_j^+(x)\big)^2, \qquad S = \{\bar{x} \mid \|\bar{x} - x\| \le \epsilon\},$$
and $g_j^+(x) = \max\{0, g_j(x)\}$. Since $X \cap S$ is compact, by the Weierstrass theorem, there exists an optimal solution $x^k$ of the above problem. We have for all $k$
$$F(x^k) + \frac{k}{2} \sum_{j=1}^r \big(g_j^+(x^k)\big)^2 = F^k(x^k) \le F^k(x) = F(x), \qquad (2.6)$$
and since $F(x^k)$ is bounded over $X \cap S$, we obtain
$$\lim_{k \to \infty} g_j^+(x^k) = 0, \qquad j = 1, \ldots, r;$$
otherwise the left-hand side of Eq. (2.6) would become unbounded from above as $k \to \infty$. Therefore, every limit point $\bar{x}$ of $\{x^k\}$ is feasible, i.e., $\bar{x} \in C$. Furthermore, Eq. (2.6) yields $F(x^k) \le F(x)$ for all $k$, so by taking the limit along the relevant subsequence as $k \to \infty$, we obtain $F(\bar{x}) \le F(x)$. Since $\bar{x}$ is feasible, we have $F(x) < F(\bar{x})$ unless $\bar{x} = x$ (since $F$ achieves a strict global minimum over $C$ at $x$), which when combined with the preceding inequality yields $\bar{x} = x$. Thus the sequence $\{x^k\}$ converges to $x$, and it follows that $x^k$ is an interior point of the closed sphere $S$ for all $k$ greater than some $\bar{k}$.

For $k \ge \bar{k}$, we have the necessary optimality condition $\nabla F^k(x^k)' y \ge 0$ for all $y \in T_X(x^k)$, or equivalently $-\nabla F^k(x^k) \in T_X(x^k)^*$, which is written as
$$-\Big(\nabla F(x^k) + \sum_{j=1}^r \zeta_j^k \nabla g_j(x^k)\Big) \in T_X(x^k)^*, \qquad (2.7)$$
where
$$\zeta_j^k = k\, g_j^+(x^k). \qquad (2.8)$$
Denote
$$\delta^k = \sqrt{1 + \sum_{j=1}^r (\zeta_j^k)^2}, \qquad (2.9)$$
$$\mu_0^k = \frac{1}{\delta^k}, \qquad \mu_j^k = \frac{\zeta_j^k}{\delta^k}, \quad j = 1, \ldots, r. \qquad (2.10)$$
Then, by dividing Eq. (2.7) by $\delta^k$, we get
$$-\Big(\mu_0^k \nabla F(x^k) + \sum_{j=1}^r \mu_j^k \nabla g_j(x^k)\Big) \in T_X(x^k)^*. \qquad (2.11)$$
Since by construction the sequence $\{\mu_0^k, \mu_1^k, \ldots, \mu_r^k\}$ is bounded, it must contain a subsequence that converges to some nonzero limit $\{\mu_0, \mu_1, \ldots, \mu_r\}$. From Eq. (2.11) and the defining property of the normal cone $N_X(x)$ [$x^k \to x$, $z^k \to z$, and $z^k \in T_X(x^k)^*$ for all $k$ imply that $z \in N_X(x)$], we see that $\mu_0$ and the $\mu_j$ must satisfy
$$-\Big(\mu_0 \nabla F(x) + \sum_{j=1}^r \mu_j \nabla g_j(x)\Big) \in N_X(x). \qquad (2.12)$$
Furthermore, from Eqs. (2.8) and (2.10), we have $g_j(x^k) > 0$ for all $j$ such that $\mu_j > 0$ and $k$ sufficiently large. By the quasinormality of $x$, it follows that we cannot have $\mu_0 = 0$; since $N_X(x)$ is a cone, we can normalize by dividing by $\mu_0$, and thus take $\mu_0 = 1$ to obtain
$$-\Big(\nabla F(x) + \sum_{j=1}^r \mu_j \nabla g_j(x)\Big) \in N_X(x).$$
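As an aside, the quadratic penalty construction used in this proof can be watched converging on a concrete instance. The following sketch is a hypothetical problem with $X = \mathbb{R}^2$ and one inequality constraint (all names illustrative): it minimizes $F^k$ for increasing $k$ and recovers the Lagrange multiplier as the limit of $\zeta^k = k\, g^+(x^k)$ [cf. Eq. (2.8)].

```python
import numpy as np

# Hypothetical smooth instance:
#   F(x) = (x1 - 1)^2 + (x2 - 1)^2,   g(x) = x1 + x2 <= 0,
# whose strict global minimum over C = {g <= 0} is x = (0, 0), with
# multiplier mu = 2 (from grad F(0) + mu * grad g(0) = 0).

def grad_Fk(x, k):
    # gradient of F_k(x) = F(x) + (k/2) * (g^+(x))^2
    gplus = max(0.0, x[0] + x[1])
    return 2.0 * (x - 1.0) + k * gplus * np.ones(2)

def minimize_Fk(k, iters=2000):
    # simple gradient descent; F_k is strongly convex, step < 2/L with L ~ 2 + 2k
    x = np.ones(2)
    step = 1.0 / (2.0 + 2.0 * k)
    for _ in range(iters):
        x = x - step * grad_Fk(x, k)
    return x

for k in (1.0, 10.0, 100.0):
    xk = minimize_Fk(k)
    zeta = k * max(0.0, xk[0] + xk[1])   # zeta^k = k * g^+(x^k), cf. Eq. (2.8)
    print(k, xk, zeta)

# As k grows, x^k -> (0, 0) and the penalty multiplier estimate zeta -> 2.
xk = minimize_Fk(100.0)
assert np.allclose(xk, [1.0 / 101.0] * 2, atol=1e-6)
assert abs(100.0 * max(0.0, xk.sum()) - 2.0) < 0.05
```

For this instance the penalized minimizer can also be found in closed form, $x^k = \big(\tfrac{1}{1+k}, \tfrac{1}{1+k}\big)$, so $\zeta^k = \tfrac{2k}{1+k} \to 2$, matching the limit multiplier in the proof.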

Since $\nabla F(x) = -y$, we see that
$$y = z + \sum_{j=1}^r \mu_j \nabla g_j(x),$$
where $z \in N_X(x)$, and the scalars $\mu_1, \ldots, \mu_r$ and the sequence $\{x^k\}$ satisfy the desired properties, thus completing the proof. Q.E.D.

We are now ready to prove the main result of this section.

Proposition 2: If $x^*$ is a quasinormal vector of $C$ and $X$ is regular at $x^*$, then $x^*$ is quasiregular.

Proof: We must show that $T_C(x^*) = T_X(x^*) \cap V(x^*)$, and to this end, we first show that $T_C(x^*) \subset T_X(x^*) \cap V(x^*)$. Indeed, since $C \subset X$, using the definition of the tangent cone, we have
$$T_C(x^*) \subset T_X(x^*). \qquad (2.13)$$
To show that $T_C(x^*) \subset V(x^*)$, let $y$ be a nonzero tangent of $C$ at $x^*$. Then there exist sequences $\{\xi^k\}$ and $\{x^k\} \subset C$ such that $x^k \ne x^*$ for all $k$,
$$\xi^k \to 0, \qquad x^k \to x^*,$$
and
$$\frac{x^k - x^*}{\|x^k - x^*\|} = \frac{y}{\|y\|} + \xi^k.$$
By the mean value theorem, we have for all $j \in A(x^*)$ and all $k$
$$0 \ge g_j(x^k) = g_j(x^*) + \nabla g_j(\tilde{x}^k)'(x^k - x^*) = \nabla g_j(\tilde{x}^k)'(x^k - x^*),$$
where $\tilde{x}^k$ is a vector that lies on the line segment joining $x^k$ and $x^*$. This relation can be written as
$$\nabla g_j(\tilde{x}^k)' y^k \le 0, \qquad y^k = y + \xi^k \|y\|.$$
Taking the limit as $k \to \infty$, we obtain $\nabla g_j(x^*)' y \le 0$ for all $j \in A(x^*)$, thus proving that $y \in V(x^*)$. Hence, $T_C(x^*) \subset V(x^*)$. Together with Eq. (2.13), this shows that
$$T_C(x^*) \subset T_X(x^*) \cap V(x^*). \qquad (2.14)$$

To show the reverse inclusion $T_X(x^*) \cap V(x^*) \subset T_C(x^*)$, we first show that
$$N_C(x^*) \subset T_X(x^*)^* + V(x^*)^*.$$
Let $y$ be a vector that belongs to $N_C(x^*)$. By the definition of the normal cone, this implies the existence of a sequence $\{x^k\} \subset C$ that converges to $x^*$ and a sequence $\{y^k\}$ that converges to $y$, with $y^k \in T_C(x^k)^*$ for all $k$. In view of the assumption that $x^*$ is quasinormal, it follows from Lemma 2 that for all sufficiently large $k$, $x^k$ is quasinormal. Then, by Proposition 1, for each sufficiently large $k$, there exist a vector $z^k \in N_X(x^k)$ and nonnegative scalars $\mu_1^k, \ldots, \mu_r^k$ such that
$$y^k = z^k + \sum_{j=1}^r \mu_j^k \nabla g_j(x^k). \qquad (2.15)$$
Furthermore, there exists a sequence $\{x_l^k\} \subset X$ such that
$$\lim_{l \to \infty} x_l^k = x^k,$$
and for all $l$, $\mu_j^k g_j(x_l^k) > 0$ for all $j$ with $\mu_j^k > 0$.

We will show that the sequence $\{\mu_1^k, \ldots, \mu_r^k\}$ is bounded. Suppose, to arrive at a contradiction, that this sequence is unbounded, and assume without loss of generality that for each $k$, at least one of the $\mu_j^k$ is nonzero. For each $k$, denote
$$\delta^k = \frac{1}{\sqrt{\sum_{j=1}^r (\mu_j^k)^2}}, \qquad \xi_j^k = \delta^k \mu_j^k, \quad j = 1, \ldots, r.$$
It follows that $\delta^k > 0$ for all $k$ and $\delta^k \to 0$ as $k \to \infty$ (along a subsequence, assumed without loss of generality to be the whole sequence). Then, by multiplying Eq. (2.15) by $\delta^k$, we obtain
$$\delta^k y^k = \delta^k z^k + \sum_{j=1}^r \xi_j^k \nabla g_j(x^k),$$
or equivalently, since $z^k \in N_X(x^k)$ and $\delta^k > 0$,
$$\delta^k z^k = \delta^k y^k - \sum_{j=1}^r \xi_j^k \nabla g_j(x^k) \in N_X(x^k).$$
Note that by construction the sequence $\{\xi_1^k, \ldots, \xi_r^k\}$ is bounded, and therefore has a nonzero limit point $\{\xi_1^*, \ldots, \xi_r^*\}$. Taking the limit in the preceding relation along the relevant subsequence and using the facts $\delta^k \to 0$, $y^k \to y$, and $x^k \to x^*$, together with the closedness of the normal cone $N_X(x^*)$, we see that $\delta^k z^k$ converges to some vector $z^*$ in $N_X(x^*)$, where
$$z^* = -\sum_{j=1}^r \xi_j^* \nabla g_j(x^*).$$
Furthermore, by choosing an index $l_k$ for each $k$ such that $l_1 < \cdots < l_{k-1} < l_k$ and
$$\lim_{k \to \infty} x_{l_k}^k = x^*,$$
we see that for all $j$ with $\xi_j^* > 0$, we have $g_j(x_{l_k}^k) > 0$ for all sufficiently large $k$. The existence of such scalars $\xi_1^*, \ldots, \xi_r^*$ violates the quasinormality of the vector $x^*$, thus showing that the sequence $\{\mu_1^k, \ldots, \mu_r^k\}$ is bounded.

Let $\{\mu_1^*, \ldots, \mu_r^*\}$ be a limit point of the sequence $\{\mu_1^k, \ldots, \mu_r^k\}$, and assume without loss of generality that $\{\mu_1^k, \ldots, \mu_r^k\}$ converges to $\{\mu_1^*, \ldots, \mu_r^*\}$. Taking the limit as $k \to \infty$ in Eq. (2.15), we see that $z^k$ converges to some $z^*$, where
$$z^* = y - \sum_{j=1}^r \mu_j^* \nabla g_j(x^*). \qquad (2.16)$$
By closedness of the normal cone $N_X(x^*)$, and in view of the assumption that $X$ is regular at $x^*$, so that $N_X(x^*) = T_X(x^*)^*$, we have that $z^* \in T_X(x^*)^*$. Furthermore, by choosing an index $l_k$ for each $k$ such that $l_1 < \cdots < l_{k-1} < l_k$ and $\lim_{k \to \infty} x_{l_k}^k = x^*$, we see that for all $j$ with $\mu_j^* > 0$, we have $g_j(x_{l_k}^k) > 0$ for all sufficiently large $k$, showing that $g_j(x^*) = 0$. Hence it follows that $\mu_j^* = 0$ for all $j \notin A(x^*)$, and using Eq. (2.16), we can write $y$ as
$$y = z^* + \sum_{j \in A(x^*)} \mu_j^* \nabla g_j(x^*).$$
By Farkas' Lemma, $V(x^*)^*$ is the cone generated by the vectors $\nabla g_j(x^*)$, $j \in A(x^*)$. Hence it follows that $y \in T_X(x^*)^* + V(x^*)^*$, and we conclude that
$$N_C(x^*) \subset T_X(x^*)^* + V(x^*)^*. \qquad (2.17)$$
Finally, using the properties relating cones and their polars given in Lemma 1 and the fact that $T_X(x^*)$ is convex (which follows from the regularity of $X$ at $x^*$; see [RoW98], p. 221), we obtain
$$\big(T_X(x^*)^* + V(x^*)^*\big)^* = T_X(x^*) \cap V(x^*) \subset N_C(x^*)^*. \qquad (2.18)$$
Using the relation $N_C(x^*)^* \subset T_C(x^*)$ (see [RoW98], Propositions 6.26 and 6.28), this shows that $T_X(x^*) \cap V(x^*) \subset T_C(x^*)$, which together with Eq. (2.14) concludes the proof. Q.E.D.

Note that in the preceding proof, we showed that
$$T_C(x^*) \subset T_X(x^*) \cap V(x^*),$$

which, using the relations given in Lemma 1, implies that
$$T_X(x^*)^* + V(x^*)^* \subset \big(T_X(x^*) \cap V(x^*)\big)^* \subset T_C(x^*)^*.$$
We also proved that if $X$ is regular at $x^*$ and $x^*$ is quasinormal, then
$$N_C(x^*) \subset T_X(x^*)^* + V(x^*)^*$$
[cf. Eq. (2.17)]. Combining the preceding two relations with the relation $T_C(x^*)^* \subset N_C(x^*)$, we obtain $T_C(x^*)^* = N_C(x^*)$, thus showing that quasinormality of $x^*$ together with regularity of $X$ at $x^*$ implies that $C$ is regular at $x^*$.

3. QUASIREGULARITY AND ADMITTANCE OF LAGRANGE MULTIPLIERS

In this section, we explore the relation of quasiregularity with the admittance of Lagrange multipliers. First, we provide a necessary and sufficient condition for the constraint set $C$ of Eq. (1.2) to admit Lagrange multipliers. This condition was given in various forms by Gould and Tolle [GoT72], Guignard [Gui69], and Rockafellar [Roc93]. For completeness, we provide the proof of this result using the gradient characterization of vectors in $T_C(x^*)^*$ given in Lemma 3.

Proposition 3: Let $x^*$ be a feasible vector of problem (1.1)-(1.2). The constraint set $C$ of Eq. (1.2) admits Lagrange multipliers at $x^*$ if and only if
$$T_C(x^*)^* = T_X(x^*)^* + V(x^*)^*. \qquad (3.1)$$

Proof: Denote by $D(x^*)$ the set of negative gradients $-\nabla f(x^*)$ of all smooth cost functions $f$ for which $x^*$ is a local minimum of problem (1.1)-(1.2). We claim that $D(x^*) = T_C(x^*)^*$. Indeed, by the necessary condition for optimality [cf. Eq. (1.3)], we have $D(x^*) \subset T_C(x^*)^*$. To show the reverse inclusion, let $y \in T_C(x^*)^*$. By Lemma 3, there exists a smooth function $F$ with $-\nabla F(x^*) = y$, which achieves a strict global minimum over $C$ at $x^*$. Thus $y \in D(x^*)$, showing that
$$D(x^*) = T_C(x^*)^*. \qquad (3.2)$$

We now note that, by definition, the constraint set $C$ admits Lagrange multipliers at $x^*$ if and only if $D(x^*) \subset T_X(x^*)^* + V(x^*)^*$. In view of Eq. (3.2), this implies that the constraint set $C$ admits Lagrange multipliers at $x^*$ if and only if
$$T_C(x^*)^* \subset T_X(x^*)^* + V(x^*)^*. \qquad (3.3)$$
On the other hand, we have shown in the proof of Proposition 2 [cf. Eq. (2.14)] that
$$T_C(x^*) \subset T_X(x^*) \cap V(x^*).$$
Using the properties of polar cones given in Lemma 1, this implies
$$T_X(x^*)^* + V(x^*)^* \subset \big(T_X(x^*) \cap V(x^*)\big)^* \subset T_C(x^*)^*,$$
which, combined with Eq. (3.3), yields the desired relation (3.1) and concludes the proof. Q.E.D.

Note that the condition given in Eq. (3.1) is equivalent to the following two conditions:

(a) $V(x^*) \cap \mathrm{cl}\big(\mathrm{conv}(T_X(x^*))\big) = \mathrm{cl}\big(\mathrm{conv}(T_C(x^*))\big)$,

(b) $V(x^*)^* + T_X(x^*)^*$ is a closed set.

Quasiregularity is a weaker condition, even under the assumption that $X$ is regular, since the vector sum $V(x^*)^* + T_X(x^*)^*$ need not be closed even if both of these cones are themselves closed, as shown in the following example.

Example 1: Consider the constraint set $C \subset \mathbb{R}^3$ specified by
$$C = \{x \in X \mid h(x) = 0\},$$
where
$$X = \{(x_1, x_2, x_3) \mid x_1^2 + x_2^2 \le x_3^2,\ x_3 \le 0\}, \qquad h(x) = x_2 + x_3.$$
Let $x^*$ denote the origin. In view of the convexity of $X$, the set $X$ is regular at $x^*$, and $T_X(x^*)$ is given by the closure of the set of feasible directions at $x^*$. Since $x^*$ is the origin and $X$ is a closed cone, it follows that $T_X(x^*) = X$.

The cone of first order feasible variations, $V(x^*)$, is given by

$$V(x^*) = \{(x_1, x_2, x_3) \mid x_2 + x_3 = 0\}.$$

It can be seen that the set $V(x^*)^* + T_X(x^*)^*$ is not closed (cf. [BNO03], Exercise 3.5), implying that $C$ does not admit Lagrange multipliers. On the other hand, we have $T_C(x^*) = T_X(x^*) \cap V(x^*)$, i.e., $x^*$ is quasiregular.

The preceding example shows that quasiregularity is not powerful enough to assert the existence of Lagrange multipliers in the general case where $X \neq \mathbb{R}^n$, unless additional assumptions are imposed. It is effective only in special cases; for instance, when $T_X(x^*)$ is a polyhedral cone, the sum $V(x^*)^* + T_X(x^*)^*$ is closed, since it is the vector sum of two polyhedral sets, and quasiregularity implies the admittance of Lagrange multipliers. Thus the importance of quasiregularity, the classical pathway to Lagrange multipliers when $X = \mathbb{R}^n$, diminishes when $X \neq \mathbb{R}^n$. By contrast, as shown in [BeO00], pseudonormality still provides a unified treatment of constraint qualifications and the existence of Lagrange multipliers.
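The failure of closedness in Example 1 can also be verified numerically. The sketch below is illustrative, not part of the original analysis: it uses the cones computed by hand from the example's data, namely $T_X(x^*)^* = \{y \mid \sqrt{y_1^2 + y_2^2} \le -y_3\}$ (the polar of the ice cream cone $X$) and $V(x^*)^* = \mathrm{span}\{(0,1,1)\}$, and exhibits a sequence in the sum $V(x^*)^* + T_X(x^*)^*$ converging to the point $(1, 0, 0)$, which lies outside the sum.

```python
import numpy as np

# Illustrative numerical check for Example 1 (a sketch; the cones below are
# computed by hand from the example's data):
#   T_X(x*)^* = {y : sqrt(y1^2 + y2^2) <= -y3}   (polar of the ice cream cone X)
#   V(x*)^*   = span{(0, 1, 1)}                  (polar of the subspace V(x*))

V_DIR = np.array([0.0, 1.0, 1.0])

def in_polar_TX(y, tol=1e-9):
    """Membership in T_X(x*)^*, up to a small tolerance."""
    return np.hypot(y[0], y[1]) <= -y[2] + tol

def in_sum(y, t):
    """Does y decompose as (element of T_X(x*)^*) + t*(0, 1, 1)?"""
    return in_polar_TX(y - t * V_DIR)

# The points y_k = (1, -k, -sqrt(1+k^2)) + k*(0,1,1) = (1, 0, k - sqrt(1+k^2))
# all lie in the sum V(x*)^* + T_X(x*)^* ...
for k in [1.0, 10.0, 100.0, 1000.0]:
    y_k = np.array([1.0, 0.0, k - np.sqrt(1.0 + k**2)])
    assert in_sum(y_k, k)

# ... and converge to (1, 0, 0).  But (1, 0, 0) - t*(0,1,1) = (1, -t, -t)
# would need 1 + t^2 <= t^2, which fails for every t: the limit lies outside
# the sum, so the sum is not closed.
limit = np.array([1.0, 0.0, 0.0])
print(any(in_sum(limit, t) for t in np.linspace(-1e6, 1e6, 2001)))  # False
```

The sequence is exactly the standard witness for the non-closedness of (second-order cone) + (line); the membership tests are carried out only up to the tolerance shown.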
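By contrast, the polyhedral case can be illustrated with a minimal sketch. The problem data below (the cost $f$, the constraint $h$, the set $X$, the minimizer, and the multiplier) are assumptions chosen for this illustration, not anything from the paper: with $X = \{x \mid x \ge 0\}$ polyhedral, the sum $T_X(x^*)^* + V(x^*)^*$ is closed, and a Lagrange multiplier exists at the minimizer.

```python
import numpy as np

# Illustrative problem (an assumption for this sketch, not from the paper):
#   minimize f(x) = x1^2 + x2  over  C = {x in X : h(x) = 0},
# with X = {x : x >= 0} polyhedral and h(x) = x1 + x2 - 1.  Since T_X(x*) is
# polyhedral, T_X(x*)^* + V(x*)^* is closed, and a Lagrange multiplier exists.

def f(x):
    return x[0] ** 2 + x[1]

def grad_f(x):
    return np.array([2.0 * x[0], 1.0])

def grad_h(x):  # gradient of h(x) = x1 + x2 - 1
    return np.array([1.0, 1.0])

# Eliminating x2 = 1 - x1 gives x1^2 - x1 + 1, minimized at x1 = 1/2.
x_star = np.array([0.5, 0.5])
lam = -1.0  # candidate Lagrange multiplier

# x* has strictly positive components, so T_X(x*) = R^2 and T_X(x*)^* = {0};
# the multiplier condition reduces to grad_f(x*) + lam * grad_h(x*) = 0.
residual = grad_f(x_star) + lam * grad_h(x_star)
print(np.allclose(residual, 0.0))  # True
```

Here the stationarity residual vanishes exactly, confirming that $\lambda = -1$ serves as a Lagrange multiplier for this toy instance.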

REFERENCES

[Aba67] Abadie, J., 1967. On the Kuhn-Tucker Theorem, in Nonlinear Programming, Abadie, J., (Ed.), North-Holland, Amsterdam.

[AuF90] Aubin, J.-P., and Frankowska, H., 1990. Set-Valued Analysis, Birkhauser, Boston.

[BSS93] Bazaraa, M. S., Sherali, H. D., and Shetty, C. M., 1993. Nonlinear Programming: Theory and Algorithms (2nd Ed.), Wiley, N. Y.

[BaG82] Bazaraa, M. S., and Goode, J. J., 1982. Sufficient Conditions for a Globally Exact Penalty Function without Convexity, Math. Programming Stud., Vol. 19, pp. 1-15.

[BNO03] Bertsekas, D. P., Nedic, A., and Ozdaglar, A. E., 2003. Convex Analysis and Optimization, Athena Scientific, Belmont, MA.

[Ber99] Bertsekas, D. P., 1999. Nonlinear Programming (2nd Ed.), Athena Scientific, Belmont, MA.

[BeO00] Bertsekas, D. P., and Ozdaglar, A. E., 2000. Pseudonormality and Lagrange Multiplier Theory for Constrained Optimization, Report LIDS-P-2489, Lab. for Information and Decision Systems, MIT; JOTA, to appear.

[BoL00] Borwein, J. M., and Lewis, A. S., 2000. Convex Analysis and Nonlinear Optimization, Springer-Verlag, N. Y.

[BoS00] Bonnans, J. F., and Shapiro, A., 2000. Perturbation Analysis of Optimization Problems, Springer-Verlag, N. Y.

[GoT71] Gould, F. J., and Tolle, J., 1971. A Necessary and Sufficient Condition for Constrained Optimization, SIAM J. Applied Math., Vol. 20, pp. 164-172.

[GoT72] Gould, F. J., and Tolle, J., 1972. Geometry of Optimality Conditions and Constraint Qualifications, Math. Programming, Vol. 2, pp. 1-18.

[Gui69] Guignard, M., 1969. Generalized Kuhn-Tucker Conditions for Mathematical Programming Problems in a Banach Space, SIAM J. on Control, Vol. 7, pp. 232-241.

[Hes75] Hestenes, M. R., 1975. Optimization Theory: The Finite Dimensional Case, Wiley, N. Y.

[Mor76] Mordukhovich, B. S., 1976. Maximum Principle in the Problem of Time Optimal Response with Nonsmooth Constraints, J. of Applied Mathematics and Mechanics, Vol. 40, pp. 960-969.

[RoW98] Rockafellar, R. T., and Wets, R. J.-B., 1998. Variational Analysis, Springer-Verlag, Berlin.

[Roc70] Rockafellar, R. T., 1970. Convex Analysis, Princeton Univ. Press, Princeton, N. J.

[Roc93] Rockafellar, R. T., 1993. Lagrange Multipliers and Optimality, SIAM Review, Vol. 35, pp. 183-238.