A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions


Angelia Nedić and Asuman Ozdaglar

April 16, 2006

Abstract

In this paper, we study a unifying framework for the analysis of duality schemes and penalty methods constructed using augmenting functions that need not be nonnegative or bounded from below. We consider two geometric optimization problems that are dual to each other and characterize primal-dual problems for which the optimal values of the two problems are equal. To establish this, we show that we can use general concave surfaces to separate nonconvex sets with certain properties. We apply our framework to study augmented optimization duality and general classes of penalty methods without imposing compactness assumptions.

Key words: Augmenting functions, augmented Lagrangian functions, recession directions, duality gap, penalty methods.
MSC2000 subject classification: Primary: 90C30, 90C46.
OR/MS subject classification: Primary: Programming/nonlinear.

Angelia Nedić: Department of Industrial and Enterprise Systems Engineering, University of Illinois, andjelija03@yahoo.com
Asuman Ozdaglar: Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, asuman@mit.edu

1 Introduction

Convex optimization duality has been traditionally established using hyperplanes that can separate the epigraph of the primal (or perturbation) function of the constrained optimization problem from a point that does not belong to the closure of the epigraph (see, for example, Rockafellar [15], Hiriart-Urruty and Lemaréchal [9], Bonnans and Shapiro [7], Borwein and Lewis [8], Bertsekas, Nedić, and Ozdaglar [5], [6], Auslender and Teboulle [2]). For a nonconvex optimization problem, the epigraph of the primal function is typically nonconvex, and separation using linear surfaces may not be possible.

Recent literature has considered dual problems that are constructed using augmented Lagrangian functions, where the augmenting functions are nonnegative and satisfy either coercivity assumptions (Rockafellar and Wets [16], Huang and Yang [10]) or peak-at-zero-type assumptions (see Rubinov, Huang, and Yang [17] and Rubinov and Yang [18]). In our earlier work [12], we presented a geometric approach that provides a unified framework to study augmented optimization duality and penalty methods. We considered dual problems constructed using convex nonnegative augmenting functions, and provided necessary and sufficient conditions for zero duality gap without explicitly imposing coercivity assumptions.

In this paper, we consider a similar geometric framework to analyze nonconvex optimization duality using a more general class of augmented Lagrangian functions. Here, the augmenting functions that are used to construct the Lagrangian functions need not be nonnegative or even bounded from below. We consider two simple geometric optimization problems that are defined in terms of a nonempty set V ⊂ R^m × R that intersects the vertical axis {(0, w) | w ∈ R}, i.e., the w-axis. In particular, we consider the following:

Geometric Primal Problem: We would like to find the minimum value intercept of the set V and the w-axis.

Geometric Dual Problem: Consider concave surfaces {(u, φ_{c,µ}(u)) | u ∈ R^m} that lie below the set V, where φ_{c,µ}: R^m → R has the following form
  φ_{c,µ}(u) = −σ(cu)/c − µ′u + ξ,   (1)
where σ is a convex augmenting function, c > 0 is a scaling (or penalty) parameter, µ is a vector in R^m, and ξ is a scalar. We would like to find the maximum intercept of such a surface with the w-axis.

It is geometrically straightforward to see that the optimal value of the dual problem is no higher than the optimal value of the primal problem, a primal-dual relation known as weak duality. We are interested in characterizing the primal-dual problems for which the two optimal values are equal, i.e., the problems for which strong duality holds. In particular, our objective is to provide necessary and sufficient conditions for strong duality between the geometric primal and dual problems.

To establish sufficient conditions for strong duality, we first analyze sufficient conditions for the strong separation of the set V from a point of the form (0, w) with w ∈ R that does not lie in the closure of the set V. This separation is realized through the use of concave surfaces of the form given in Eq. (1). In particular, we identify sufficient conditions on the set V under which the separation can be established. Then, we consider separating surfaces constructed by using bounded-below augmenting functions, unbounded augmenting functions, and asymptotic augmenting functions (see Figure 1). Subsequently, we show that our geometric approach provides a unified framework both for establishing zero duality gap for augmented optimization duality and for studying the convergence of general classes of penalty methods to the optimal value of the constrained optimization problem. Asymptotic convergence behavior of general penalty methods was analyzed by Auslender, Cominetti, and Haddou in [1] for convex optimization problems under the assumption that the optimal solution set of the primal problem is nonempty and compact.

[Figure 1: General augmenting functions σ(u) for u ∈ R. Part (a) illustrates a bounded-below augmenting function, e.g., σ(u) = a(e^u − 1). Part (b) illustrates an unbounded augmenting function, e.g., σ(u) = −log(1 − u), which is also an asymptotic augmenting function, i.e., σ(u) goes to plus infinity along a finite asymptote.]

Here, we generalize these results to larger classes of penalty functions and to nonconvex optimization problems. We also provide necessary and sufficient conditions for convergence without imposing the existence of an optimal solution or compactness assumptions.

In our paper [12], we studied the geometric dual problem that involves the separation of the set V from a point not belonging to its closure using concave surfaces of the form
  φ_{c,µ}(u) = −cσ(u) − µ′u + ξ,
i.e., we considered scaling the augmenting functions linearly by a parameter c ≥ 0. Under some conditions, we have shown that for nonnegative augmenting functions σ and for any possible dent in the bottom shape of the set V, there exists a large enough scalar c > 0 such that the desired separation can be achieved using linear scaling. However, it is straightforward to see in our geometric framework that such a linear scaling is not appropriate for augmenting functions that can take negative values. For this reason, we consider here a scaling of the form σ(cu)/c. As we will see, such a scaling is appropriate for the achievement of the desired separation with a large enough value of the scaling parameter c.
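As a simple numerical illustration of this point (an illustration added here, not part of the original development; the function and sample points are chosen only for this sketch), the following Python snippet contrasts the linear scaling cσ(u) with the scaling σ(cu)/c for the exponential augmenting function σ(u) = e^u − 1, which is bounded below by −1 but takes negative values:

    import numpy as np

    def sigma(u):
        # exponential augmenting function, bounded below by -1 but negative for u < 0
        return np.exp(u) - 1.0

    for c in [1.0, 10.0, 100.0]:
        for u in [-0.5, 0.0, 0.5]:
            linear = c * sigma(u)        # linear scaling c*sigma(u), as in [12]
            scaled = sigma(c * u) / c    # scaling sigma(cu)/c, as used in this paper
            print(f"c={c:6.1f}  u={u:+.1f}   c*sigma(u)={linear:12.4g}   sigma(cu)/c={scaled:12.4g}")

For u < 0 the linearly scaled term cσ(u) tends to −∞ as c grows, so the penalized values w + cσ(u) cannot stay above any fixed intercept, whereas σ(cu)/c tends to 0 from below; for u > 0 both scalings blow up. This is the behavior that makes σ(cu)/c suitable for separation with augmenting functions that take negative values.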

The paper is organized as follows: In Section 2, we introduce our geometric approach in terms of a nonempty set V and separating augmenting functions. In Section 3, we present some properties of the set V and the augmenting functions, and analyze the implications related to separation properties. In Section 4, we present our separation theorems. In particular, we discuss sufficient conditions on augmenting functions and the set V that guarantee the separation of this set and a vector (0, w₀) that does not belong to the closure of the set V. We present separation results for exponential-type, unbounded, and asymptotic augmenting functions. In Section 5, we provide necessary and sufficient conditions for strong duality for the geometric primal and dual problems. In Sections 5 and 6, we apply our results to nonconvex optimization duality and penalty methods using the primal function of the constrained optimization problem.

1.1 Notation, Terminology, and Basics

We view a vector as a column vector, and we denote the inner product of two vectors x and y by x′y. For a scalar sequence {γ_k} approaching the zero value monotonically from above, we write γ_k ↓ 0. For any vector u ∈ R^n, we can write u = u⁺ + u⁻ with u⁺ ≥ 0 and u⁻ ≤ 0, where the vector u⁺ is the component-wise maximum of u and the zero vector, i.e., u⁺ = (max{0, u₁}, ..., max{0, u_n}), and the vector u⁻ is the component-wise minimum of u and the zero vector, i.e., u⁻ = (min{0, u₁}, ..., min{0, u_n}).

For a function f: R^n → [−∞, ∞], we denote the domain of f by dom(f), i.e.,
  dom(f) = {x ∈ R^n | f(x) < ∞}.
We denote the epigraph of f by epi(f), i.e.,
  epi(f) = {(x, w) ∈ R^n × R | f(x) ≤ w}.
For any scalar γ, we denote the (lower) γ-level set of f by L_f(γ), i.e.,
  L_f(γ) = {x ∈ R^n | f(x) ≤ γ}.
We say that the function f is level-bounded when the set L_f(γ) is bounded for every scalar γ.

We denote the closure of a set X by cl(X). We define a cone K as a set of all vectors x such that λx ∈ K whenever x ∈ K and λ ≥ 0. For a given nonempty set X, the cone generated by the set X is denoted by cone(X) and is given by
  cone(X) = {y | y = λx for some x ∈ X and λ ≥ 0}.

When establishing the separation results, we use the notion of a recession cone of a set. In particular, the recession cone of a set C is denoted by C∞ and is defined as follows.

Definition 1 (Recession Cone) The recession cone C∞ of a nonempty set C is given by
  C∞ = {d | λ_k x_k → d for some {x_k} ⊂ C and {λ_k} ⊂ R with λ_k ≥ 0, λ_k → 0}.
A direction d ∈ C∞ is referred to as a recession direction of the set C.

Some basic properties of a recession cone are given in the following lemma (cf. Rockafellar and Wets [16], Section 3B).

Lemma 1 (Recession Cone Properties) The recession cone C∞ of a nonempty set C is a closed cone. Furthermore, when C is a cone, we have C∞ = (cl(C))∞ = cl(C).

We also use the notion of a recession function in the subsequent analysis. The recession function of a function f, denoted by f∞, is defined through the recession cone of the epigraph epi(f) of the function f, as follows.

Definition 2 (Recession Function) The recession function f∞ of a proper function f: R^n → (−∞, ∞] is defined in terms of its epigraph as
  epi(f∞) = (epi(f))∞.

Some basic properties of a recession function are given in the following lemma (cf. Rockafellar and Wets [16], Theorem 3.21).

Lemma 2 (Recession Function Properties) Let f∞ be the recession function of a proper function f: R^n → (−∞, ∞]. Then, we have:
(a) The following relation holds:
  f∞(y) = inf { lim inf_{k→∞} f(c_k y_k)/c_k | c_k → ∞, y_k → y },
where the infimum is taken over all sequences {y_k} with y_k → y and all sequences {c_k} with c_k → ∞.
(b) The function f∞ is closed and positively homogeneous, i.e., f∞(λy) = λf∞(y) for all y and all λ > 0.
(c) When f is proper, closed, and convex, the function f∞ is also convex, and we have
  f∞(y) = lim_{c→∞} [f(ȳ + cy) − f(ȳ)]/c   for all ȳ ∈ dom(f).
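As an aside (an illustration added here, not part of the paper), the limit in Lemma 2(c) can be approximated numerically by fixing a large value of c. The Python sketch below does this for two convex penalty functions that reappear as augmenting functions in Section 4.1; the choice c = 500 is arbitrary and only a crude stand-in for the limit c → ∞.

    import math

    def horizon_estimate(f, y, ybar=0.0, c=500.0):
        # Lemma 2(c): for a proper, closed, convex f and ybar in dom(f),
        # f_inf(y) = lim_{c -> infinity} (f(ybar + c*y) - f(ybar)) / c
        return (f(ybar + c * y) - f(ybar)) / c

    quad = lambda x: max(0.0, x) ** 2          # quadratic penalty (max{0,x})^2
    expo = lambda x: math.exp(x) - 1.0         # exponential penalty e^x - 1

    for name, f in [("(max{0,x})^2", quad), ("e^x - 1", expo)]:
        print(name, " f_inf(-1) ~", horizon_estimate(f, -1.0),
              "  f_inf(+1) ~", horizon_estimate(f, 1.0))

For both functions the estimate at y = −1 is essentially 0, while the estimate at y = +1 keeps growing as c increases, reflecting a true horizon value of +∞; these are exactly the one-dimensional conditions θ∞(−1) = 0 and θ∞(1) = +∞ that appear later in Lemma 8.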

2 Geometric Approach

In this section, we introduce the primal and the dual problems in our geometric framework. We also establish the weak duality relation between the primal optimal and the dual optimal values.

2.1 Geometric Primal and Dual Problems

Consider a nonempty set V ⊂ R^m × R that intersects the w-axis {(0, w) | w ∈ R}. The geometric primal problem consists of determining the minimum value intercept of V and the w-axis, i.e.,
  inf_{(0,w)∈V} w.
We denote the primal optimal value by w*.

To define the geometric dual problem, we consider the following class of convex augmenting functions.

Definition 3 A function σ: R^m → (−∞, ∞] is called an augmenting function if it is convex, not identically equal to 0, and takes the value zero at the origin, i.e., σ(0) = 0.

This definition of an augmenting function is motivated by the one introduced by Rockafellar and Wets [16] (see Definition 11.55 there). Note that an augmenting function is a proper function.

The geometric dual problem considers concave surfaces {(u, φ_{c,µ}(u)) | u ∈ R^m} that lie below the set V. The function φ_{c,µ}: R^m → R has the following form
  φ_{c,µ}(u) = −σ(cu)/c − µ′u + ξ,
where σ is an augmenting function, c > 0 is a scaling (or penalty) parameter, µ ∈ R^m, and ξ ∈ R. This surface can be expressed as
  {(u, w) ∈ R^m × R | w + σ(cu)/c + µ′u = ξ},
and thus it intercepts the vertical axis {(0, w) | w ∈ R} at the level ξ. Furthermore, it lies below the set V if and only if
  w + σ(cu)/c + µ′u ≥ ξ   for all (u, w) ∈ V.
Therefore, among all surfaces that are defined by an augmenting function σ, a scalar c > 0, and a vector µ ∈ R^m, and that support the set V from below, the maximum intercept with the w-axis is given by
  d(c, µ) = inf_{(u,w)∈V} { w + σ(cu)/c + µ′u }.
The dual problem consists of determining the maximum of these intercepts over c > 0 and µ ∈ R^m, i.e.,
  sup_{c>0, µ∈R^m} d(c, µ).
We denote the dual optimal value by d*. From the construction of the dual problem, we can see that the dual optimal value d* does not exceed w* (see Figure 2), i.e., the weak duality relation d* ≤ w* holds. We say that there is zero duality gap when d* = w*, and that there is a duality gap when d* < w*.
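To make these objects concrete, the following Python sketch (an illustration added here, not taken from the paper; the toy problem, the grid, and the quadratic augmenting function are our own choices) evaluates d(c, µ) when V is taken to be the image set {(g(x), f(x)) | x ∈ X} of a small nonconvex constrained problem, extended upward:

    import numpy as np

    # Toy problem: minimize f(x) subject to g(x) <= 0 over a grid X.
    X = np.linspace(-2.0, 2.0, 4001)
    f = np.sin(3 * X) + 0.5 * X          # nonconvex objective
    g = X**2 - 1.0                       # constraint function, feasible when g(x) <= 0

    sigma = lambda u: np.maximum(0.0, u) ** 2    # quadratic augmenting function

    w_star = f[g <= 0].min()                     # primal optimal value w*

    def d(c, mu=0.0):
        # d(c, mu) = inf_{(u,w) in V} { w + sigma(c*u)/c + mu*u } over the sampled points
        return (f + sigma(c * g) / c + mu * g).min()

    print("w* =", round(w_star, 4))
    for c in [1, 10, 100, 1000]:
        print("c =", c, "  d(c, 0) =", round(d(c), 4))

Every value d(c, 0) stays below w*, as weak duality requires, and the values increase toward w* as c grows, previewing the zero-duality-gap and penalty-convergence results discussed in the later sections.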

[Figure 2: Geometric primal and dual problems. The figure illustrates the maximum intercept d(c, µ) of a concave surface supporting the set V from below, for scaling parameters c₂ ≥ c₁.]

The weak duality relation is formally established in the following proposition.

Proposition 1 (Weak Duality) The dual optimal value does not exceed the primal optimal value, i.e., d* ≤ w*.

Proof. For any augmenting function σ, scalar c > 0, and vector µ ∈ R^m, we have
  d(c, µ) = inf_{(u,w)∈V} { w + σ(cu)/c + µ′u } ≤ inf_{(0,w)∈V} w = w*,
where the inequality uses σ(0) = 0. Therefore
  d* = sup_{c>0, µ∈R^m} d(c, µ) ≤ w*.
Q.E.D.

2.2 Separating Surfaces

We consider a nonempty set V ⊂ R^m × R and a vector (0, w₀) that does not belong to the closure of the set V. We are interested in conditions guaranteeing that cl(V) and the given vector can be separated by a concave surface {(u, φ_{c,µ}(u)) | u ∈ R^m}, where the function φ_{c,µ} has the form
  φ_{c,µ}(u) = −σ(cu)/c − µ′u + ξ
for an augmenting function σ, a vector µ ∈ R^m, and some scalars ξ and c with c > 0. In the following sections, we will provide conditions under which the separation can be realized using an augmenting function only, i.e., µ can be taken equal to 0 in the separating concave surface {(u, φ_{c,µ}(u)) | u ∈ R^m}. In particular, we say that the augmenting function σ separates the set V and the vector (0, w₀) ∉ cl(V) when for some c > 0,
  w + σ(cu)/c ≥ w₀   for all (u, w) ∈ V.

We also say that the augmenting function σ strongly separates the set V and the vector (0, w₀) ∉ cl(V) when for some c > 0 and ξ ∈ R,
  w + σ(cu)/c ≥ ξ > w₀   for all (u, w) ∈ V.

3 Preliminary Results

In this section, we establish some properties of the set V and the augmenting function σ, which will be essential in our subsequent separation results.

3.1 Properties of the set V

We first present some properties of the set V and establish their implications.

Definition 4 We say that a set V ⊂ R^m × R is extending upward in w-space if for every vector (ū, w̄) ∈ V, the half-line {(ū, w) | w ≥ w̄} is contained in V.

Definition 5 We say that a set V ⊂ R^m × R is extending upward in u-space if for every vector (ū, w̄) ∈ V, the cone {(u, w̄) | u ≥ ū} is contained in V.

The two properties just defined are satisfied, for example, when the set V is the epigraph of a nonincreasing function. For sets that are extending upward in w-space and u-space, we have the following two lemmas, which will be used in the subsequent analysis. The proofs of these results can be found in our earlier work [12].

Lemma 3 Let V ⊂ R^m × R be a nonempty set.
(a) When the set V is extending upward in w-space, the closure cl(V) of the set V is also extending upward in w-space. Furthermore, for any vector (0, w₀) ∉ cl(V), we have
  w₀ < inf_{(0,w)∈cl(V)} w.
(b) When the set V is extending upward in u-space, the closure cl(V) of the set V is also extending upward in u-space.

Lemma 4 Let V ⊂ R^m × R be a nonempty set extending upward in u-space. Then, the closure of the cone generated by the set V is also extending upward in u-space, i.e., for any (ū, w̄) ∈ cl(cone(V)), the cone {(u, w̄) | u ≥ ū} is contained in cl(cone(V)).

The proofs of some of our separation results require the separation of the half-line {(0, w) | w ≤ w̄}, for some w̄ < 0, from the cone generated by the set V. To avoid complications that may arise when the set V has an infinite slope around the origin, we consider the cone generated by a slight upward translation of the set V in w-space, a set we denote by Ṽ (see Figure 3). For the separation results, another important characteristic of the set V is the bottom shape of V. In particular, it is desirable that the changes in w are commensurate with the changes in u for (u, w) ∈ V, i.e., that the ratio of u- and w-values is asymptotically finite as w decreases.

[Figure 3: Illustration of the closures of the cones generated, respectively, by the set V and by the set Ṽ = V + (0, ε), an upward translation of the set V.]

To characterize this, we use the notion of recession directions and the recession cone of a nonempty set (see Section 1.1). In the next lemma, we study the implications of the recession directions of the set V for the properties of the cone generated by the set Ṽ. This result plays a key role in the subsequent development. The result is an extension of a similar result established in our paper [12].

Lemma 5 Let V ⊂ R^m × R be a nonempty set. Assume that w* = inf_{(0,w)∈V} w is finite, and that V extends upward in u-space and w-space. Consider a nonzero vector (ū, w̄) ∈ R^m × R that satisfies either ū = 0, w̄ < 0, or ū ≤ 0, w̄ = 0. Assume that (ū, w̄) is not a recession direction of V, i.e., (ū, w̄) ∉ V∞. For a given ɛ > 0, consider the set Ṽ given by
  Ṽ = {(u, w) | (u, w − ɛ) ∈ V},   (2)
and the cone generated by Ṽ, denoted by K. Then, the vector (ū, w̄) does not belong to the closure of the cone K generated by Ṽ, i.e., (ū, w̄) ∉ cl(K).

Proof. Since w* is finite, by using a translation of space along the w-axis if necessary, we may assume without loss of generality that
  inf_{(0,w)∈cl(V)} w = 0.   (3)
For a given ɛ > 0, consider the set Ṽ defined in Eq. (2) and let K be the cone generated by Ṽ. To obtain a contradiction, assume that (ū, w̄) ∈ cl(K).

Since cl(K) is a cone, by the Recession Cone Properties [cf. Lemma 1] it follows that cl(K) = K∞. Therefore, the vector (ū, w̄) is a recession direction of K, and hence there exist a vector sequence {(u_k, w_k)} and a scalar sequence {λ_k} such that
  λ_k(u_k, w_k) → (ū, w̄)   with {(u_k, w_k)} ⊂ K, λ_k ≥ 0, λ_k → 0.
Since the cone K is generated by the set Ṽ, for the sequence {(u_k, w_k)} ⊂ K we have
  (u_k, w_k) = β_k(ū_k, w̄_k),   (ū_k, w̄_k) ∈ Ṽ, β_k ≥ 0 for all k.
Consider now the sequence {λ_k β_k(ū_k, w̄_k)}. From the preceding two relations, we have λ_k β_k(ū_k, w̄_k) = λ_k(u_k, w_k) and
  λ_k β_k(ū_k, w̄_k) → (ū, w̄)   with (ū_k, w̄_k) ∈ Ṽ and λ_k β_k ≥ 0.   (4)
Because the set V extends upward in w-space and because the set Ṽ is a translation of V up the w-axis [cf. Eq. (2)], we have that (u, w) ∈ V for any (u, w) ∈ Ṽ. In particular, since (ū_k, w̄_k) ∈ Ṽ for all k, it follows that
  (ū_k, w̄_k) ∈ V   for all k.   (5)
In view of λ_k β_k ≥ 0, we have lim inf_{k→∞} λ_k β_k ≥ 0, and there are three cases to consider.

Case 1: lim inf_{k→∞} λ_k β_k = 0. We can assume without loss of generality that λ_k β_k → 0. Then, from Eq. (4) and the fact that (ū_k, w̄_k) ∈ V for all k [cf. Eq. (5)], it follows that the vector (ū, w̄) is a recession direction of the set V, a contradiction.

Case 2: lim inf_{k→∞} λ_k β_k = ∞. In this case, we have lim_{k→∞} λ_k β_k = ∞, and by using this relation in Eq. (4), we obtain lim_{k→∞}(ū_k, w̄_k) = (0, 0). Therefore (0, 0) ∈ cl(Ṽ), and by the definition of the set Ṽ [cf. Eq. (2)], it follows that (0, −ɛ) ∈ cl(V). But this contradicts relation (3).

Case 3: lim inf_{k→∞} λ_k β_k = ξ with 0 < ξ < ∞. We can assume without loss of generality that λ_k β_k → ξ > 0 and λ_k β_k > 0 for all k. Then, from Eq. (4) we have for the sequence {(ū_k, w̄_k)},
  lim_{k→∞}(ū_k, w̄_k) = lim_{k→∞} λ_k β_k(ū_k, w̄_k)/(λ_k β_k) = (ū, w̄)/ξ   with ξ > 0.   (6)
Assume now that ū = 0 and w̄ < 0. From the preceding relation, it follows that (ū_k, w̄_k) → (0, ŵ) with ŵ = w̄/ξ < 0. Because (ū_k, w̄_k) ∈ V for all k [cf. Eq. (5)], it follows that the vector (0, ŵ) with ŵ < 0 lies in the closure of the set V. But this contradicts relation (3).

Assume now that ū ≤ 0 and w̄ = 0. Then, from relation (6), we obtain (ū_k, w̄_k) → (ũ, 0) with ũ = ū/ξ ≤ 0. This relation, together with (ū_k, w̄_k) ∈ Ṽ for all k [cf. Eq. (4)], implies that the vector (ũ, 0) ∈ cl(Ṽ). By assumption the set V extends upward in u-space, and so does the set Ṽ, and therefore the set cl(Ṽ) extends upward in u-space by Lemma 3(b). Since ũ ≤ 0, we have that the vector (0, 0) ∈ cl(Ṽ). By the definition of the set Ṽ [cf. Eq. (2)], this implies that (0, −ɛ) ∈ cl(V), contradicting relation (3). Q.E.D.

3.2 Separation Properties of Augmenting Functions

In this section, we analyze the properties of augmenting functions under some conditions. These properties are crucial for our proofs of the separation results in Section 4.

We start with augmenting functions that are bounded from below. Here, we consider a nonempty cone C ⊂ R^m × R and a family of nonempty, convex sets parametrized by a scalar c > 0. Each of these convex sets is related to a level set of the function σ_c, a scaled version of the augmenting function σ. Under some conditions on the function σ, we show that if this family of convex sets and the cone C have no vectors in common, then there is a large enough scalar c and a convex set X_c ⊂ R^m × R, defined in terms of σ_c, having no vector in common with the cone C. In particular, we have the following lemma.

Lemma 6 Assume that:
(a) The function σ: R^m → (−∞, ∞] is an augmenting function satisfying the following two conditions:
  (i) The function σ is bounded below, i.e., σ(u) ≥ σ₀ for some scalar σ₀ and for all u.
  (ii) For any vector sequence {u_k} ⊂ R^m and any positive scalar sequence {c_k} with c_k → ∞, if the relation lim sup_{k→∞} σ(c_k u_k)/c_k < ∞ holds, then the nonnegative part of the sequence {u_k} converges to zero, i.e.,
    lim sup_{k→∞} σ(c_k u_k)/c_k < ∞ with {u_k} ⊂ R^m and c_k → ∞   implies   u_k⁺ → 0,
  where u⁺ = (max{0, u₁}, ..., max{0, u_m}).
(b) The cone C ⊂ R^m × R is nonempty and closed. Furthermore, it extends upward in u-space and does not contain any vector of the form (0, w) with w < 0.
(c) For all γ > 0, there exists c̄ > 0 such that
  {(u, w̄) | u ∈ L_{σ_c}(γ)} ∩ C = ∅   for all c ≥ c̄,
where w̄ < 0 and L_{σ_c}(γ) is the γ-level set of the function σ_c(u) = σ(cu)/c for all u.

Then, there exists a large enough c₀ > 0 such that, for all c ≥ c₀, the set X_c defined by
  X_c = {(u, w) ∈ R^m × R | w ≤ −σ_c(u) + w̄}   (7)
has no vector in common with the cone C.

Proof. To arrive at a contradiction, assume that there is no c₀ > 0 such that X_c ∩ C = ∅ for all c ≥ c₀. Then, there exist a positive scalar sequence {c_k} with c_k → ∞ and a vector sequence {(u_k, w_k)} ⊂ R^m × R such that
  (u_k, w_k) ∈ X_{c_k} ∩ C   for all k.   (8)
In particular, by the definition of X_c in Eq. (7), we have that
  w_k ≤ −σ(c_k u_k)/c_k + w̄   for all k,   (9)
where w̄ < 0. We now consider separately the two possible cases: w_k ≥ w̄ for infinitely many indices k, and w_k < w̄ for infinitely many indices k.

Case 1: w_k ≥ w̄ infinitely often. By taking an appropriate subsequence if necessary, we may assume without loss of generality that w_k ≥ w̄ for all k. Then, from relation (9) we have that
  w̄ ≤ w_k ≤ −σ(c_k u_k)/c_k + w̄   for all k.   (10)
This relation implies that σ(c_k u_k)/c_k ≤ 0 for all k, and therefore
  lim sup_{k→∞} σ(c_k u_k)/c_k ≤ 0.   (11)
Since σ(u) ≥ σ₀ [cf. condition (i) in part (a)] and c_k → ∞, it follows that
  lim inf_{k→∞} σ(c_k u_k)/c_k ≥ 0.
From the preceding inequality and relation (11) we see that σ(c_k u_k)/c_k → 0, and by using this in relation (10), we obtain lim_{k→∞} w_k = w̄. We have that (u_k, w_k) ∈ C [cf. Eq. (8)] and u_k ≤ u_k⁺ for all k, and because the cone C extends upward in u-space [cf. part (b) of the assumptions], it follows that (u_k⁺, w_k) ∈ C for all k. In view of relation (11) and condition (ii) in part (a) of the assumptions, we have u_k⁺ → 0. Because C is closed, the relations u_k⁺ → 0 and w_k → w̄ imply that
  (0, w̄) ∈ C   with w̄ < 0.

This, however, contradicts the assumption that C does not contain any vector of the form (0, w) with w < 0 [cf. part (b) of the assumptions].

Case 2: w_k < w̄ infinitely often. Again, by taking an appropriate subsequence, we may assume without loss of generality that
  w_k < w̄   for all k,   (12)
with w̄ < 0. Note that the set X_c can be viewed as the zero-level set of the function F_c(u, w) = w − w̄ + σ_c(u) with σ_c(u) = σ(cu)/c for all u, i.e.,
  X_c = {(u, w) ∈ R^m × R | F_c(u, w) ≤ 0}.
By the definition of an augmenting function [cf. Definition 3], we have that σ is convex and σ(0) = 0. Therefore, for any c > 0, the function F_c(u, w) is convex and F_c(0, w̄) = 0. Consequently, for any c > 0, the set X_c is convex and contains the vector (0, w̄). Consider now the scalars
  α_k = w̄/w_k,
for which we have α_k ∈ (0, 1) for all k by Eq. (12) and w̄ < 0. From the convexity of the sets X_{c_k}, and the relations (0, w̄) ∈ X_{c_k} and (u_k, w_k) ∈ X_{c_k} for all k [cf. Eq. (8)], it follows that
  α_k(u_k, w_k) + (1 − α_k)(0, w̄) = (α_k u_k, w̄ + (1 − α_k)w̄) ∈ X_{c_k}   for all k.
Using the definition of the set X_c [cf. Eq. (7)], we obtain
  w̄ + (1 − α_k)w̄ ≤ −σ_{c_k}(α_k u_k) + w̄,
implying that
  (1 − α_k)w̄ ≤ −σ_{c_k}(α_k u_k).
Since w̄ < 0 and α_k ∈ (0, 1) for all k, it follows that
  σ_{c_k}(α_k u_k) ≤ −(1 − α_k)w̄ < −w̄   for all k.
Hence, for all k, the vector α_k u_k belongs to the level set L_{σ_{c_k}}(γ) with γ = −w̄ > 0, and therefore
  (α_k u_k, w̄) ∈ {(u, w̄) | u ∈ L_{σ_{c_k}}(γ)}   for γ = −w̄ and for all k.
On the other hand, the vector (u_k, w_k) lies in the cone C for all k [cf. Eq. (8)], and because α_k > 0, the vector α_k(u_k, w_k) = (α_k u_k, w̄) also lies in the cone C. Therefore
  (α_k u_k, w̄) ∈ {(u, w̄) | u ∈ L_{σ_{c_k}}(γ)} ∩ C   for all k,
with c_k → ∞. However, this contradicts the assumption in part (c) that the set {(u, w̄) | u ∈ L_{σ_c}(γ)} and the cone C have no vector in common for any γ > 0 and for all large enough c. Q.E.D.

We note here that the analysis of Case 2 in the preceding proof does not use any of the assumptions on the augmenting function given in part (a). In particular, it only requires the convexity of the function σ and the relation σ(0) = 0, which hold for any augmenting function [cf. Definition 3].

The implication of Lemma 6 is that there exists a large enough c₀ > 0 such that, for all c ≥ c₀, the set S_c = {(u, 2w̄) | u ∈ L_{σ_c}(−w̄)} and the cone C are separated by the concave function
  φ_c(u) = −σ(cu)/c + w̄,   u ∈ R^m.
In particular, the lemma asserts that for all c ≥ c₀,
  w ≤ φ_c(u) < z   for all (u, w) ∈ X_c and (u, z) ∈ C.
Moreover, one can verify that the set S_c is contained in the set X_c when w̄ < 0 and c > 0. Therefore, for all c ≥ c₀,
  w ≤ φ_c(u) < z   for all (u, w) ∈ S_c and (u, z) ∈ C,
thus showing that φ_c separates the set S_c and the cone C.

We now consider augmenting functions σ that may be unbounded from below. Under some conditions on σ, we show a result similar to that of Lemma 6.

Lemma 7 Assume that:
(a) The function σ: R^m → (−∞, ∞] is an augmenting function satisfying the following two conditions:
  (i) For any vector sequence {u_k} ⊂ R^m with u_k → ū for some ū ∈ R^m, and for any positive scalar sequence {c_k} with c_k → ∞, if the relation lim sup_{k→∞} σ(c_k u_k)/c_k < ∞ holds, then the vector ū is nonpositive, i.e.,
    lim sup_{k→∞} σ(c_k u_k)/c_k < ∞ with u_k → ū and c_k → ∞   implies   ū ≤ 0.
  (ii) For any vector sequence {u_k} ⊂ R^m with u_k → ū and ū ≤ 0, and for any positive scalar sequence {c_k} with c_k → ∞, we have
    lim inf_{k→∞} σ(c_k u_k)/c_k ≥ 0.
(b) The cone C ⊂ R^m × R is nonempty and closed. It extends upward in u-space, and does not contain any vector of the form (0, w) with w < 0 or a vector of the form (u, 0) with u ≤ 0, u ≠ 0.
(c) For all γ > 0, there exists c̄ > 0 such that
  {(u, w̄) | u ∈ L_{σ_c}(γ)} ∩ C = ∅   for all c ≥ c̄,
where w̄ < 0 and L_{σ_c}(γ) is the γ-level set of the function σ_c(u) = σ(cu)/c for all u.

Then, there exists a large enough c₀ > 0 such that, for all c ≥ c₀, the set X_c defined by
  X_c = {(u, w) ∈ R^m × R | w ≤ −σ_c(u) + w̄}   (13)
has no vector in common with the cone C.

Proof. To arrive at a contradiction, assume that there is no c₀ such that X_c ∩ C = ∅ for all c ≥ c₀. Then, there exist a scalar sequence {c_k} with c_k → ∞ and a vector sequence {(u_k, w_k)} ⊂ R^m × R such that
  (u_k, w_k) ∈ X_{c_k} ∩ C   for all k.   (14)
In particular, by the definition of X_c in Eq. (13), we have that
  w_k ≤ −σ(c_k u_k)/c_k + w̄   for all k,   (15)
where w̄ < 0. We now consider separately the two cases: w_k ≥ w̄ for infinitely many indices k, and w_k < w̄ for infinitely many indices k.

Case 1: w_k ≥ w̄ infinitely often. By taking an appropriate subsequence if necessary, we may assume without loss of generality that w_k ≥ w̄ for all k. Then, from relation (15) we have that
  w̄ ≤ w_k ≤ −σ(c_k u_k)/c_k + w̄   for all k.   (16)
Suppose that {u_k} is bounded; without loss of generality we may assume that u_k → ū. From the preceding relation it follows that
  lim sup_{k→∞} σ(c_k u_k)/c_k ≤ 0.
Since c_k → ∞ and u_k → ū, by condition (i) in part (a), we have that ū ≤ 0. Then, by condition (ii) in part (a), we further have
  lim inf_{k→∞} σ(c_k u_k)/c_k ≥ 0.
Therefore, σ(c_k u_k)/c_k → 0, and by using this in Eq. (16), we obtain w_k → w̄. Because C is closed [see the assumptions in part (b)] and (u_k, w_k) ∈ C for all k [cf. Eq. (14)] with u_k → ū and w_k → w̄, it follows that
  (ū, w̄) ∈ C   where ū ≤ 0 and w̄ < 0.
By the assumptions in part (b), the cone C is extending upward in u-space, and therefore (0, w̄) ∈ C with w̄ < 0. This, however, contradicts the assumption that the cone C does not contain any vector of the form (0, w) with w < 0 [cf. the assumptions in part (b)].

Suppose now that {u_k} is unbounded, and let us assume without loss of generality that ‖u_k‖ → ∞. By dividing by ‖u_k‖ in Eq. (16), we obtain
  w̄/‖u_k‖ ≤ w_k/‖u_k‖ ≤ −σ(λ_k v_k)/λ_k + w̄/‖u_k‖   for all k,   (17)
where
  λ_k = c_k‖u_k‖,   v_k = u_k/‖u_k‖.
Note that {v_k} is bounded, and again, let us assume without loss of generality that v_k → v̄ for some vector v̄ with v̄ ≠ 0. Relation (17) implies that
  lim sup_{k→∞} σ(λ_k v_k)/λ_k ≤ 0.
Since λ_k → ∞ and v_k → v̄, by condition (i) in part (a), we have that v̄ ≤ 0. Then, by condition (ii) in part (a), we also have
  lim inf_{k→∞} σ(λ_k v_k)/λ_k ≥ 0.
In view of the preceding two relations, it follows that σ(λ_k v_k)/λ_k → 0. By using this in Eq. (17), together with ‖u_k‖ → ∞, we obtain
  lim_{k→∞} w_k/‖u_k‖ = 0.
Since (u_k, w_k) belongs to the cone C [cf. Eq. (14)], it follows that (v_k, w_k/‖u_k‖) ∈ C. In view of the relations v_k → v̄ with v̄ ≤ 0, v̄ ≠ 0, and w_k/‖u_k‖ → 0, and the closedness of C [cf. the assumptions in part (b)], it follows that
  (v̄, 0) ∈ C   with v̄ ≤ 0 and v̄ ≠ 0.
This, however, contradicts the assumption in part (b) that the cone C does not contain any vector of the form (u, 0) with u ≤ 0 and u ≠ 0.

Case 2: w_k < w̄ infinitely often. The proof is the same as for Case 2 in the proof of Lemma 6. Q.E.D.

4 Separation Theorems

In this section, we discuss some sufficient conditions on augmenting functions and the set V that guarantee the separation of this set and a vector (0, w₀) not belonging to the closure of the set V. In Section 4.1, we focus on separation results for a set V that does not have (0, −1) as a recession direction. In Section 4.2, we relax this condition and establish separation results using local conditions around the origin for the set V.

4.1 Global Conditions on the Set V

Throughout this section, we consider a set V that has a nonempty intersection with the w-axis, extends upward both in u-space and in w-space, and does not have (0, −1) as a recession direction. These properties of V are formally imposed in the following assumption.

Assumption 1 Let V ⊂ R^m × R be a nonempty set that satisfies the following:
(a) The primal optimal value is finite, i.e., w* = inf_{(0,w)∈V} w is finite.
(b) The set V extends upward in u-space and in w-space.
(c) The vector (0, −1) is not a recession direction of V, i.e., (0, −1) ∉ V∞.

Assumption 1(a) formally states the requirement that the set V intersects the w-axis. Assumption 1(b) is satisfied, for example, when V is the epigraph of a nonincreasing function. Assumption 1(c) formalizes the requirement that the changes in w are commensurate with the changes in u for (u, w) ∈ V. It can be viewed as a requirement on the rate of decrease of w with respect to u. As an example, suppose that V is the epigraph of the function f(u) = −‖u‖^β for u ∈ R^m and some scalar β, i.e., V = epi(f). Then, we have
  (1/|f(u)|)(u, f(u)) = (u/‖u‖^β, −1).
For β < 0, as ‖u‖ → 0, we have f(u) → −∞ and ‖u‖^{1−β} → 0. Hence, for β < 0, (1/|f(u)|)(u, f(u)) → (0, −1), implying that the direction (0, −1) is a recession direction of the set V. However, for β > 0, we have f(u) → −∞ as ‖u‖ → ∞. Furthermore, as ‖u‖ → ∞, we have (1/|f(u)|)(u, f(u)) → (0, −1) if and only if 1 − β < 0. Thus, the vector (0, −1) is again a recession direction of V for β > 1, and it is not a recession direction of V for 0 ≤ β ≤ 1.

We note that all of the subsequent separation results are established using augmenting functions only. In other words, we can take µ = 0 in the separating concave surface {(u, φ_{c,µ}(u)) | u ∈ R^m}, where φ_{c,µ}(u) = −σ(cu)/c − µ′u + ξ. Thus, the separation of the set V from a vector that does not belong to the closure of V can be realized using the augmenting function solely.
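The preceding example can be checked numerically. The following Python sketch (an illustration added here, not part of the paper; the scalar case m = 1 and the sample points are our own choices) scales the boundary points (u, f(u)) of V = epi(f) by 1/|f(u)| and shows whether they approach (0, −1):

    import numpy as np

    def scaled_boundary_point(u, beta):
        # boundary point (u, f(u)) of epi(f) with f(u) = -|u|^beta, scaled by 1/|f(u)|
        f = -abs(u) ** beta
        return np.array([u, f]) / abs(f)

    for beta in [-0.5, 0.5, 1.0, 2.0]:
        # for beta < 0 the dent of epi(f) is at the origin, so look at u -> 0;
        # for beta > 0 look at u -> infinity
        us = [1e-3, 1e-6] if beta < 0 else [1e3, 1e6]
        print("beta =", beta, [np.round(scaled_boundary_point(u, beta), 4) for u in us])

For β = −0.5 and β = 2 the scaled points approach (0, −1), consistent with (0, −1) being a recession direction of V, while for β = 0.5 and β = 1 they do not.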

4.1.1 Bounded-Below Augmenting Functions

In this section, we establish a separation result for an augmenting function σ that is bounded from below. In particular, we consider a class of augmenting functions σ satisfying the following assumption.

Assumption 2 Let σ be an augmenting function with the following properties:
(a) The function σ is bounded below, i.e., σ(u) ≥ σ₀ for some scalar σ₀ and for all u.
(b) For any sequence {u_k} ⊂ R^m and any positive scalar sequence {c_k} with c_k → ∞, if the relation lim sup_{k→∞} σ(c_k u_k)/c_k < ∞ holds, then the nonnegative part of the sequence {u_k} converges to zero, i.e.,
  lim sup_{k→∞} σ(c_k u_k)/c_k < ∞ with {u_k} ⊂ R^m and c_k → ∞   implies   u_k⁺ → 0,
where u⁺ = (max{0, u₁}, ..., max{0, u_m}).

Note that any nonnegative augmenting function [i.e., σ₀ = 0] satisfies Assumption 2(a). The following are some examples of nonnegative augmenting functions that satisfy Assumption 2(b):
  σ(u) = Σ_{i=1}^m (max{0, u_i})^β with β > 1
(cf. Luenberger [11]), where β = 2 is a popular choice (for example, see Polyak [13]);
  σ(u) = (u⁺)′Qu⁺ with u_i⁺ = max{0, u_i}
(cf. Luenberger [11]), where Q is a symmetric positive definite matrix. The following is an example of an augmenting function that satisfies Assumption 2 with σ₀ < 0:
  σ(u) = a₁(e^{u₁} − 1) + ... + a_m(e^{u_m} − 1)
for u = (u₁, ..., u_m) ∈ R^m and a_i > 0 for all i. This function has been considered by Tseng and Bertsekas [19] in constructing penalty and multiplier methods for convex optimization problems.
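The scaled behavior required by Assumption 2(b) is easy to see numerically. The Python sketch below (an added illustration; the one-dimensional functions and evaluation points are our own choices) evaluates σ(cu)/c for the quadratic penalty and for the exponential penalty above with m = 1 and a₁ = 1:

    import numpy as np

    quad = lambda u: np.maximum(0.0, u) ** 2     # nonnegative, sigma_0 = 0
    expo = lambda u: np.exp(u) - 1.0             # bounded below by sigma_0 = -1

    for c in [1.0, 10.0, 1000.0]:
        rows = [(u, round(quad(c * u) / c, 4), round(expo(c * u) / c, 4))
                for u in (-1.0, 0.0, 0.1)]
        print("c =", c, "  (u, quad(cu)/c, expo(cu)/c):", rows)

As c grows, the scaled values at u ≤ 0 tend to 0 while the values at u = 0.1 grow without bound; so keeping σ(c_k u_k)/c_k bounded along a sequence forces the positive parts u_k⁺ toward zero, which is the content of Assumption 2(b) for these two functions.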

We next state our separation result for an augmenting function bounded from below.

Proposition 2 (Bounded-Below Augmenting Function) Let V ⊂ R^m × R be a nonempty set satisfying Assumption 1. Let σ be an augmenting function satisfying Assumption 2. Then, the set V and a vector (0, w₀) that does not belong to the closure of V can be strongly separated by the function σ, i.e., there exist scalars c₀ > 0 and ξ₀ such that for all c ≥ c₀,
  w + σ(cu)/c ≥ ξ₀ > w₀   for all (u, w) ∈ V.

Proof. By Lemma 3(a), we have that
  w₀ < inf_{(0,w)∈cl(V)} w.
By this relation and Assumption 1(a), it follows that inf_{(0,w)∈cl(V)} w is finite. By using a translation of space along the w-axis if necessary, we may assume without loss of generality that inf_{(0,w)∈cl(V)} w = 0, so that w₀ < 0. Consider the set Ṽ given by
  Ṽ = {(u, w) | (u, w + w₀/4) ∈ V},   (18)
and the cone generated by Ṽ, denoted by K. The proof relies on constructing a family of convex sets {X_c | c > 0}, where for each c, the set X_c is defined in terms of the function σ_c(u) = σ(cu)/c. Moreover, each of these sets contains the vector (0, w₀/2) and extends downward along the w-axis. We show that at least one of these sets does not have any vector in common with the closure of the cone K. That set then provides a surface that separates (0, w₀/2) and the closure of K, and therefore it also separates (0, w₀) and V. In particular, the proof is given in the following steps.

Step 1: We first show that for all γ > 0, there exists some c̄ > 0 such that
  {(u, w₀/2) | u ∈ L_{σ_c}(γ)} ∩ cl(K) = ∅   for all c ≥ c̄,   (19)
where σ_c is the function given by
  σ_c(u) = σ(cu)/c   for all u.
To arrive at a contradiction, suppose that relation (19) does not hold. Then, for some γ > 0, there exist a scalar sequence {c_k} with c_k → ∞ and c_k > 0 for all k, and a vector sequence {u_k} such that
  σ(c_k u_k)/c_k ≤ γ,   (u_k, w₀/2) ∈ cl(K)   for all k.   (20)
By taking the limit superior in relation (20), we obtain
  lim sup_{k→∞} σ(c_k u_k)/c_k ≤ γ < ∞.
Because c_k → ∞, by Assumption 2(b) on the augmenting function, we have that u_k⁺ → 0. The set V extends upward in u-space, and so does the set Ṽ, an upward translation of the set V along the w-axis [cf. Eq. (18)]. Then, by Lemma 4, the closure of the cone K generated by Ṽ also extends upward in u-space. Since (u_k, w₀/2) ∈ cl(K) [cf. Eq. (20)] and u_k ≤ u_k⁺ for all k, it follows that (u_k⁺, w₀/2) ∈ cl(K) for all k. Furthermore, because u_k⁺ → 0, we obtain (0, w₀/2) ∈ cl(K), and therefore (0, w₀) ∈ cl(K).

On the other hand, since (0, w 0 ) cl(v ), by Lemma 5 we have that (0, w 0 ) cl(k), contradicting the preceding relation. Thus, relation (19) must hold. Step 2: We consider the sets X c given by { X c = (u, w) R m R w σ c (u) + w } 0 2 for c > 0. (21) We apply Lemma 6 to assert that there exists some sufficiently large c 0 > 0 such that X c cl(k) = for all c c 0. (22) In particular, we use Lemma 6 with the following identification: C = cl(k) and w = w 0 2. To verify that the assumptions of Lemma 6 are satisfied, note that Assumption 2 is identical to the assumptions in part (a) of Lemma 6. Furthermore, since the set V extends upward in u-space, so does the cone cl(k) by Lemma 4. Thus, the cone C = cl(k) satisfies the assumptions in part (b) of Lemma 6. In view of relation (19), the condition (c) of Lemma 6 is also satisfied. Hence, by this lemma, relation (22) holds. By using relation (22) and the definition of the set X c in Eq. (21), it follows that for all c c 0, w > σ(cu) c The cone K is generated by the set (u, w) Ṽ, implying that for all c c 0, w w 0 4 + σ(cu) c + w 0 for all (u, w) cl(k). 2 Ṽ, so that the preceding relation holds for all > w 0 2 Furthermore, since w 0 < 0, it follows that for all c c 0, w + σ(cu) c thus completing the proof. for all (u, w) V. ξ 0 > w 0 for all (u, w) V and ξ 0 = 3w 0 4, Q.E.D. 4.1.2 Unbounded Augmenting Functions In this section, we extend the separation result of the preceding section to augmenting functions that are not necessarily bounded from below. In particular, we consider augmenting functions σ that satisfy the following assumption. Assumption 3 Let σ be an augmenting function with the following properties: (a) For any sequence {u k } R m with u k ū and for any positive scalar sequence σ(c { } with, the relation lim sup k u k ) < implies that the vector ū is nonpositive, i.e., lim sup σ( u k ) < with u k ū and ū 0. 20

(b) For any sequence {u_k} ⊂ R^m with u_k → ū and ū ≤ 0, and for any positive scalar sequence {c_k} with c_k → ∞, we have
  lim inf_{k→∞} σ(c_k u_k)/c_k ≥ 0.

For example, Assumption 3 is satisfied for an augmenting function σ of the form
  σ(u) = Σ_{i=1}^m θ(u_i)
with the following choices of the scalar function θ:
  θ(x) = −log(1 − x) for x < 1, and θ(x) = +∞ for x ≥ 1
(cf. the modified barrier method of Polyak [14]);
  θ(x) = x/(1 − x) for x < 1, and θ(x) = +∞ for x ≥ 1
(cf. the hyperbolic MBF method of Polyak [14]);
  θ(x) = x + x²/2 for x ≥ −1/2, and θ(x) = −(1/4)log(−2x) − 3/8 for x < −1/2
(cf. the quadratic logarithmic method of Ben-Tal and Zibulevsky [3]).
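A quick numerical check (an added illustration, not from the paper) of the key property behind these examples is that each θ above has recession function values θ∞(−1) = 0 and θ∞(1) = +∞, the conditions required in Lemma 8 below. Using the limit formula of Lemma 2(c) with ȳ = 0 and a large but finite c as a crude approximation:

    import numpy as np

    def mod_barrier(x):                   # -log(1-x), finite only for x < 1
        return -np.log(1.0 - x) if x < 1.0 else np.inf
    def hyperbolic(x):                    # x/(1-x), finite only for x < 1
        return x / (1.0 - x) if x < 1.0 else np.inf
    def quad_log(x):                      # quadratic logarithmic penalty
        return x + 0.5 * x**2 if x >= -0.5 else -0.25 * np.log(-2.0 * x) - 0.375

    c = 1e6   # stand-in for the limit c -> infinity in Lemma 2(c)
    for name, th in [("-log(1-x)", mod_barrier), ("x/(1-x)", hyperbolic), ("quad-log", quad_log)]:
        print(f"{name:10s}  theta(-c)/c ~ {th(-c)/c:.2e}   theta(+c)/c = {th(c)/c}")

The estimates at −1 are essentially zero, while at +1 the first two functions are already +∞ (the argument exceeds the asymptote) and the third grows without bound as c increases.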

More generally, we have the following result.

Lemma 8 Let θ: R → (−∞, ∞] be a closed convex function with θ(0) = 0, and having a recession function θ∞ that satisfies θ∞(−1) = 0 and θ∞(1) = +∞. Then, the function σ: R^m → (−∞, ∞] defined by
  σ(u) = Σ_{i=1}^m θ(u_i)
is an augmenting function that satisfies Assumption 3.

Proof. Since θ is a convex function, not identically equal to 0 [because θ∞(1) = +∞], with θ(0) = 0, it follows immediately from Definition 3 that σ is an augmenting function. By the assumption that θ∞(−1) = 0 and θ∞(1) = +∞, and by the positive homogeneity of the function θ∞ [cf. Lemma 2(b)], we have
  θ∞(x) = 0   for all x < 0,   (23)
  θ∞(x) = +∞   for all x > 0.   (24)
The function θ is proper, closed, and convex. Moreover, any scalar x with x < 0 is in the domain of θ [cf. Eq. (23)]. Then, by using Lemma 2(c) with y = 0 and ȳ < 0, we obtain
  θ∞(0) = 0.   (25)
Now, consider a vector sequence {u_k} ⊂ R^m with u_k → ū, and a positive scalar sequence {c_k} with c_k → ∞. Furthermore, let {u_{ki}} denote the scalar sequence formed by the i-th components of the vectors u_k. Note that θ∞(x) ≥ 0 for all scalars x, and by using Lemma 2(a), we can see that
  0 ≤ θ∞(ū_i) ≤ lim inf_{k→∞} θ(c_k u_{ki})/c_k   for all i = 1, ..., m.   (26)
By summing over all i = 1, ..., m, we obtain
  0 ≤ Σ_{i=1}^m lim inf_{k→∞} θ(c_k u_{ki})/c_k ≤ lim inf_{k→∞} Σ_{i=1}^m θ(c_k u_{ki})/c_k = lim inf_{k→∞} σ(c_k u_k)/c_k.   (27)
This relation implies that σ satisfies Assumption 3(b). Suppose now that
  lim sup_{k→∞} σ(c_k u_k)/c_k < +∞.
Then, in view of relation (27), we have that
  Σ_{i=1}^m lim inf_{k→∞} θ(c_k u_{ki})/c_k ≤ lim inf_{k→∞} σ(c_k u_k)/c_k ≤ lim sup_{k→∞} σ(c_k u_k)/c_k < +∞,
implying by Eq. (26) that θ∞(ū_i) < +∞ for all i. Since θ∞(x) is finite only for x ≤ 0 [see Eqs. (23)-(25)], it follows that ū_i ≤ 0 for all i. Thus, σ satisfies Assumption 3(a). Q.E.D.

The convergence behavior of penalty functions that satisfy the assumptions of Lemma 8 was studied by Auslender, Cominetti, and Haddou [1] for convex optimization problems (see the second class of penalty methods studied there).

To establish a separation result for an augmenting function with possibly unbounded values, we use an additional assumption on the set V.

Assumption 4 For any ū ∈ R^m with ū ≤ 0 and ū ≠ 0, the vector (ū, 0) is not a recession direction of the set V, i.e.,
  (ū, 0) ∉ V∞   for any ū ≤ 0, ū ≠ 0.

We next state our separation result for unbounded augmenting functions. The proof uses a construction similar to that in the proof of Proposition 2. However, here the proof is more involved since we deal with augmenting functions that need not be bounded from below.

Proposition 3 (Unbounded Augmenting Function) Let V ⊂ R^m × R be a nonempty set satisfying Assumption 1 and Assumption 4. Let σ be an augmenting function satisfying Assumption 3. Then, the set V and a vector (0, w₀) that does not belong to the closure of V can be strongly separated by the augmenting function σ, i.e., there exist scalars c₀ > 0 and ξ₀ such that for all c ≥ c₀,
  w + σ(cu)/c ≥ ξ₀ > w₀   for all (u, w) ∈ V.

Proof. By Lemma 3(a), we have that
  w₀ < inf_{(0,w)∈cl(V)} w.
This relation and Assumption 1(a) imply that inf_{(0,w)∈cl(V)} w is finite. By using a translation of space along the w-axis if necessary, we may assume without loss of generality that inf_{(0,w)∈cl(V)} w = 0, so that w₀ < 0. Again, we consider the set Ṽ given by
  Ṽ = {(u, w) | (u, w + w₀/4) ∈ V},   (28)
and the cone K generated by Ṽ. The proof is given in the following steps.

Step 1: We first show that for all γ > 0, there exists some c̄ > 0 such that
  {(u, w₀/2) | u ∈ L_{σ_c}(γ)} ∩ cl(K) = ∅   for all c ≥ c̄,   (29)
where σ_c is the function given by σ_c(u) = σ(cu)/c for all u.

To arrive at a contradiction, suppose that relation (29) does not hold. Then, for some γ > 0, there exist a sequence {c_k} with c_k → ∞ and c_k > 0 for all k, and a sequence {u_k} such that
  σ(c_k u_k)/c_k ≤ γ,   (u_k, w₀/2) ∈ cl(K)   for all k.   (30)
Suppose that {u_k} is bounded. Then, {u_k} has a limit point ū ∈ R^m, and without loss of generality we may assume that u_k → ū. By taking the limit superior in the first relation of Eq. (30), we obtain
  lim sup_{k→∞} σ(c_k u_k)/c_k ≤ γ < ∞.
Because c_k → ∞ and u_k → ū, by Assumption 3(a), it follows that ū ≤ 0. In view of the relations (u_k, w₀/2) ∈ cl(K) for all k [cf. Eq. (30)] and u_k → ū, we have (ū, w₀/2) ∈ cl(K). The set V extends upward in u-space and so does the set Ṽ [cf. Eq. (28)]. Then, by Lemma 4, the closure of the cone K generated by the set Ṽ also extends upward in u-space. Since (ū, w₀/2) ∈ cl(K) and ū ≤ 0, it follows that (0, w₀/2) ∈ cl(K), implying that (0, w₀) ∈ cl(K).

On the other hand, since (0, w 0 ) cl(v ), by Lemma 5 we have that (0, w 0 ) cl(k), contradicting the preceding relation. Suppose now that {u k } is unbounded. Again, by taking an appropriate subsequence, without loss of generality we may assume that u k. Then, from Eq. (30) we obtain We can write σ( u k ) u k γ u k, ( uk u k, w 0 2 u k σ(λ k v k ) λ k γ u k, ) cl(k) for all k. (31) for λ k = u k and v k = u k / u k. Note that v k is bounded and, therefore, it has a limit v R m with v 0. Without loss of generality, we may assume that v k v. Furthermore, from the preceding relation and u k, it follows that lim sup σ(λ k v k ) λ k 0. Because and v k v, by Assumption 3(a), it follows that v 0. In view of the relations ( ) uk u k, w 0 cl(k) for all k 2 u k [cf. Eq. (31)], and u k / u k v and u k, we obtain ( v, 0) cl(k) with v 0 and v 0. (32) At the same time, according to Assumption 4, the set V has no recession direction of the form (ū, 0) with ū 0 and ū 0. Then, according to Lemma 5, the cone cl(k) does not have recession directions of the form (ū, 0) with ū 0 and ū 0, contradicting relation (32). Thus, relation (29) must hold. Step 2: We consider the sets X c given by { X c = (u, w) R m R w σ c (u) + w } 0 2 for c > 0. (33) We apply Lemma 7 to assert that there exists some sufficiently large c 0 such that X c cl(k) = for all c c 0. (34) In particular, we use Lemma 7 with the following identification: C = cl(k) and w = w 0 2. To verify that the assumptions of Lemma 7 are satisfied, note that Assumption 3 is identical to the assumptions in part (a) of Lemma 7. We have also argued that the closure of the cone K generated by the set Ṽ extends upward in u-space and does not have recession directions of the form (0, w) with w < 0 or (u, 0) with u 0 and u 0. 24

Thus, the cone C = cl(k) satisfies the assumptions in part (b) of Lemma 7. In view of relation (29), the condition (c) in Lemma 7 is also satisfied. Hence, by this lemma, relation (34) holds. From relation (34) and the definition of the set X c in Eq. (33), it follows that for all c c 0, w > σ(cu) c + w 0 2 for all (u, w) cl(k). The cone K is generated by the set Ṽ, so that the preceding relation holds for all (u, w) Ṽ, implying that for all c c 0, w w 0 4 + σ(cu) c > w 0 2 Furthermore, since w 0 < 0, it follows that for all c c 0, w + σ(cu) c thus completing the proof. for all (u, w) V. ξ 0 > w 0 for all (u, w) V and ξ 0 = 3w 0 4, Q.E.D. 4.2 Local Conditions on the set V In this section, we present separation results when the condition (0, 1) / V is relaxed in Assumption 1. In particular, we consider sufficient conditions for the separation of a vector (0, w 0 ) / cl(v ) and the set V by an augmenting function for the case when the direction (0, 1) may be a recession direction of V. However, we still assume that the set V extends upward both in u-space and w-space. Furthermore, instead of assuming that the value w = inf (0,w) V w is finite, we assume that the w-components of the set V are bounded from below in some neighborhood of the origin. These properties of V are formally imposed in the following assumption. Assumption 5 Let V R m R be a nonempty set that satisfies the following: (a) There exists a neighborhood of u = 0 such that the w-components of the vectors (u, w) V are bounded for all u in that neighborhood, i.e., there exists δ > 0 such that inf (u,w) V u δ w >. (b) The set V extends upward in u-space and w-space. 4.2.1 Asymptotic Augmenting Functions Here, we establish separation results for the sets that satisfy Assumption 5 by using augmenting functions that go to infinity along a finite positive asymptote. In particular, we consider the augmenting functions that satisfy the following assumption. Assumption 6 Let σ be an augmenting function such that: 25

(a) For some scalars a_i > 0, i = 1, ..., m, we have
  σ(u) = +∞   for all u ∈ R^m with u_i ≥ a_i for some i ∈ {1, ..., m}.
(b) For any vector sequence {u_k} ⊂ R^m with u_k → ū and ū ≤ 0, and for any positive scalar sequence {c_k} with c_k → ∞, we have
  lim inf_{k→∞} σ(c_k u_k)/c_k ≥ 0.

Note that the unbounded augmenting functions of the form given by the modified barrier method and the hyperbolic MBF method of Polyak satisfy this assumption. Note also that any augmenting function σ that satisfies Assumption 6(a) also satisfies Assumption 3(a). To see this, let {u_k} be a sequence with u_k → ū and {c_k} be a sequence with c_k → ∞, and assume that
  lim sup_{k→∞} σ(c_k u_k)/c_k < ∞.   (35)
Assume, to arrive at a contradiction, that ū_i > 0 for some i. Then, since c_k → ∞, we have c_k u_{ki} > a_i for all sufficiently large k, implying that σ(c_k u_k) = +∞ for all sufficiently large k, and contradicting Eq. (35).
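The effect of a finite asymptote on the scaled function σ(cu)/c can be seen in the following Python sketch (an added illustration with m = 1; the sample points are our own choices), which uses the modified barrier σ(u) = −log(1 − u):

    import numpy as np

    def sigma(u):
        # modified barrier: -log(1-u) for u < 1, +infinity at and beyond the asymptote u = 1
        return -np.log(1.0 - u) if u < 1.0 else np.inf

    for c in [1.0, 10.0, 1000.0]:
        vals = {u: round(sigma(c * u) / c, 4) for u in (-1.0, 0.0, 0.01, 0.5)}
        print("c =", c, vals)

As c grows, σ(cu)/c equals +∞ as soon as u ≥ 1/c, while for u ≤ 0 it tends to 0 even though σ itself is unbounded below; the scaled surfaces thus develop a near-vertical wall just to the right of the origin, which is the behavior exploited in the separation results of this section.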

We now present sufficient conditions for the separation of a vector (0, w₀) ∉ cl(V) and the set V.

Proposition 4 (Asymptotic Augmenting Function) Let V ⊂ R^m × R be a nonempty set and σ be an augmenting function. Assume also that one of the following holds:
(a) The set V satisfies Assumption 5, and the augmenting function σ is nonnegative and satisfies Assumption 6(a).
(b) The set V satisfies Assumption 4 and Assumption 5, and the augmenting function σ satisfies Assumption 6.
Then, the set V and a vector (0, w₀) ∈ R^m × R that does not belong to the closure of V can be separated by the function σ, i.e., there exist scalars c₀ > 0 and ξ₀ such that for all c ≥ c₀,
  w + σ(cu)/c ≥ ξ₀ > w₀   for all (u, w) ∈ V.

Proof. To arrive at a contradiction, assume that there are no scalars c₀ > 0 and ξ₀ such that the preceding relation holds. Then, there exist a positive scalar sequence {c_k} with c_k → ∞ and a vector sequence {(u_k, w_k)} ⊂ V such that
  w_k + σ(c_k u_k)/c_k ≤ w₀   for all k.   (36)
We show that, under Assumption 6(a) on the augmenting function σ, Eq. (36) implies that u = 0 is a limit point of the sequence {u_k⁺}, i.e., {u_k⁺} converges to u = 0 along some subsequence. Suppose that this is not the case. Then, because u_k⁺ ≥ 0 for all k, there exist a coordinate index i and a scalar ɛ > 0 such that u_{ki} ≥ ɛ for all sufficiently large k. Then, since c_k → ∞, we have c_k u_{ki} ≥ c_k ɛ > a_i for all sufficiently large k, where the scalar a_i is such that σ(u) = +∞ for any vector u with u_i ≥ a_i [cf. Assumption 6(a)]. Hence, σ(c_k u_k) = +∞ for all sufficiently large k, thus contradicting Eq. (36). Therefore u = 0 is a limit point of the sequence {u_k⁺}. By taking an appropriate subsequence if necessary, we may assume without loss of generality that
  u_k⁺ → 0 and ‖u_k⁺‖ ≤ δ for all k,   (37)
where δ is the scalar defining the local property of V, as given in Assumption 5(a). The set V extends upward in u-space by Assumption 5(b), and since u ≤ u⁺ for all u, we have
  (u_k⁺, w_k) ∈ V   for all k.   (38)
By Assumption 5(a), the components w are bounded from below for all (u, w) ∈ V with ‖u‖ ≤ δ. In view of this, from the preceding two relations we conclude that the sequence {w_k} is bounded from below, i.e., for some scalar w_δ we have
  w_k ≥ w_δ   for all k.
From now on, the arguments for parts (a) and (b) are different.

(a) Since the function σ is nonnegative [i.e., σ(u) ≥ 0 for all u], from Eq. (36) it follows that w_k ≤ w₀ for all k. The preceding two relations imply that {w_k} is bounded, and therefore it has a limit point w̃ with w̃ ≤ w₀. We may assume that w_k → w̃. In view of the relations u_k⁺ → 0 and (u_k⁺, w_k) ∈ V for all k [cf. Eqs. (37) and (38)], it follows that (0, w̃) ∈ cl(V). The set V extends upward in w-space by Assumption 5(b), and so does the closure of V by Lemma 3(a). Thus, since (0, w̃) ∈ cl(V) and w̃ ≤ w₀, we have (0, w₀) ∈ cl(V), a contradiction.

(b) Suppose that the sequence {u_k} is bounded. Then, it has a limit point ū, and we may assume that u_k → ū. Since u_k⁺ → 0 [cf. Eq. (37)], it follows that ū ≤ 0. Therefore, by Assumption 6(b), we have that
  lim inf_{k→∞} σ(c_k u_k)/c_k ≥ 0.
By taking the limit inferior in Eq. (36) and using the preceding relation, we obtain
  lim inf_{k→∞} w_k ≤ lim inf_{k→∞} w_k + lim inf_{k→∞} σ(c_k u_k)/c_k ≤ lim inf_{k→∞} { w_k + σ(c_k u_k)/c_k } ≤ w₀.
Because w_k ≥ w_δ for all k, we see that the value w̃ = lim inf_{k→∞} w_k is finite, and in view of the preceding relation, we have that w̃ ≤ w₀. Thus, w_k → w̃ along some subsequence, and since u_k⁺ → 0 and (u_k⁺, w_k) ∈ V for all k [cf. Eqs. (37) and (38)], it follows that (0, w̃) ∈ cl(V) with w̃ ≤ w₀. The set V extends upward in w-space by Assumption