An Enhanced Spatial Branch-and-Bound Method in Global Optimization with Nonconvex Constraints


Oliver Stein, Peter Kirst, Paul Steuermann

March 22, 2013

Abstract. We discuss some difficulties in determining valid upper bounds in spatial branch-and-bound methods for global minimization in the presence of nonconvex constraints. In fact, two examples illustrate that standard techniques for the construction of upper bounds may fail in this setting. Instead, we propose to perturb infeasible iterates along Mangasarian-Fromovitz directions to feasible points whose objective function values serve as upper bounds. These directions may be calculated by the solution of a single linear optimization problem per iteration. Numerical results show that our enhanced algorithm performs well even for optimization problems where the standard branch-and-bound method does not converge to the correct optimal value.

Keywords: Branch-and-bound, convergence, consistency, Mangasarian-Fromovitz constraint qualification.

AMS subject classifications: 90C26.

Affiliations: Oliver Stein, Institute of Operations Research, Karlsruhe Institute of Technology (KIT), Germany, stein@kit.edu. Peter Kirst, Institute of Operations Research, Karlsruhe Institute of Technology (KIT), Germany, kirst@kit.edu. Paul Steuermann, Blumenstraße 5, 41363 Jüchen, Germany, numerik@gmx.de.

1 Introduction

In this article we present an improvement of branch-and-bound methods for solving global minimization problems of the form

P(B):   min_{x ∈ R^n} f(x)   s.t.   g_i(x) ≤ 0, i ∈ I,   x ∈ B.

All functions f, g_i, i ∈ I = {1, ..., p} with p ∈ N_0 are assumed to be at least continuously differentiable on some open set containing the box B = [b̲, b̄] = {x ∈ R^n | b̲ ≤ x ≤ b̄} with b̲, b̄ ∈ R^n, where the inequalities are meant componentwise. The feasible set of P(B) is denoted by M(B) = {x ∈ B | g_i(x) ≤ 0, i ∈ I}. Our goal is to determine a point x* ∈ M(B) with f(x*) ≤ f(x) + ε for all x ∈ M(B), with a given tolerance ε > 0.

Branch-and-bound methods are applied in order to solve this kind of problem, as we describe in terms of a general framework in Section 2. In [9] an appropriate theory is given which can be useful in proving convergence of explicit algorithms. However, a strong assumption with respect to upper bounds is made which can be difficult to verify in the presence of nonconvex constraints, even if it is fulfilled. This will be illustrated in Section 3 by means of two examples for the αBB method [3], a well established algorithm in global optimization. In Section 4 an enhanced version of the framework is proposed to overcome these difficulties. Numerical results are presented in Section 5, and Section 6 concludes the paper with some final remarks.

The notation in this paper is standard. In particular, Df denotes the row vector of partial derivatives of the function f, and the Hessian of a twice differentiable function f is denoted by D²f.

2 The Branch-and-Bound Framework

The main algorithmic idea of branch-and-bound methods is heavily based upon the construction of lower and upper bounds on function values over certain domains. This can be achieved in many ways, for example by the αBB

method from [3], using Lipschitz constants as examined in [16], or by exploiting duality [5, 6]. Further examples are centered forms [13] or optimal centered forms [4]. The direct application of interval arithmetic [15] is also possible (as opposed to its indirect application in some of the aforementioned alternatives). Independently of which bounding procedures are chosen, we have to ensure some properties regarding their convergence. In order to explain this, we first distinguish two types of bounding procedures.

Definition 2.1.
a) A function l from the set of all sub-boxes X of B to R is called an M-dependent lower bounding procedure for the objective function of P(B) if

l(X) ≤ inf_{x ∈ M(X)} f(x)

holds for all sub-boxes X ⊆ B and any choice of the functions f, g_i, i ∈ I.
b) A function l from the set of all sub-boxes X of B to R is called an M-independent lower bounding procedure for a function f if it satisfies

l(X) ≤ min_{x ∈ X} f(x)

for all sub-boxes X ⊆ B and any choice of f.
c) A lower bounding procedure l is called monotone if l(X_1) ≥ l(X_2) holds for all boxes X_1 ⊆ X_2 ⊆ B.

The corresponding notions for upper bounding procedures u are defined analogously (e.g., u(X) ≥ inf_{x ∈ M(X)} f(x) for all X ⊆ B, etc.).

Note that, under our assumptions, for X ⊆ B the optimal value v(X) = inf_{x ∈ M(X)} f(x) in the definition of M-dependent bounding procedures is attained as a real number if and only if M(X) is nonempty, and that otherwise, as usual, we put v(X) = +∞. While for the chosen data functions f, g_i, i ∈ I, of the problem P(B) one may refer to the M-dependent lower bounding procedure by l_{f, g_i, i ∈ I}, we will omit these subscripts, as we will not study the dependence of P(B) on its data functions. On the other hand, an M-independent lower bounding procedure may be applied to any single one of the functions f, g_i, i ∈ I, giving rise to lower bounding procedures l_f and l_{g_i}, i ∈ I. Hence, in the following, an M-independent lower bounding procedure will always be indexed by the corresponding function, and an M-dependent one will not. The analogous convention holds for upper bounding procedures.

To define our convergence concept for upper and lower bounding procedures, we will apply them to exhaustive sequences of boxes (X_k)_{k∈N}, by which we mean nested sequences (i.e., X_k ⊆ X_{k−1}, k ∈ N) of nonempty boxes X_k ⊆ B

with lim_{k→∞} diag(X_k) = 0, where diag(X) = ‖x̄ − x̲‖_2 denotes the diagonal length of a box X = [x̲, x̄]. For any exhaustive sequence of boxes (X_k)_{k∈N}, the set ∩_{k∈N} X_k is a singleton, say {x̄}. As the sequence (min_{x∈X_k} f(x))_{k∈N} is nondecreasing and bounded above by f(x̄), it converges. The continuity of f actually yields

lim_{k→∞} min_{x∈X_k} f(x) = f(x̄).

We call M-independent lower and upper bounding procedures convergent if

lim_{k→∞} l_f(X_k) = lim_{k→∞} min_{x∈X_k} f(x)   and   lim_{k→∞} u_f(X_k) = lim_{k→∞} min_{x∈X_k} f(x)

hold for any exhaustive sequence of boxes (X_k)_{k∈N} and any choice of the function f, respectively.

For M-dependent lower and upper bounding procedures one may distinguish whether x̄ ∈ M(B) holds or not. In the first case we have x̄ ∈ M(X_k) for all k ∈ N, so that the continuity of f again yields lim_{k→∞} min_{x∈M(X_k)} f(x) = f(x̄). In the case x̄ ∉ M(B), on the other hand, for all sufficiently large k ∈ N we have M(X_k) = ∅ and inf_{x∈M(X_k)} f(x) = +∞. We may thus define the extended-valued limit lim_{k→∞} inf_{x∈M(X_k)} f(x) = +∞. With this definition, M-dependent lower and upper bounding procedures are called convergent if

lim_{k→∞} l(X_k) = lim_{k→∞} inf_{x∈M(X_k)} f(x)   and   lim_{k→∞} u(X_k) = lim_{k→∞} inf_{x∈M(X_k)} f(x)

hold for any exhaustive sequence of boxes (X_k)_{k∈N} and any choice of the functions f, g_i, i ∈ I, respectively.

The following proposition shows that both types of bounding procedures can easily be constructed from each other.

Proposition 2.2.
a) Let l denote a convergent M-dependent lower bounding procedure. Then the choice g_i ≡ 0, i ∈ I, results in a convergent M-independent lower bounding procedure l_f.
b) Let l_f, l_{g_i}, i ∈ I, be convergent M-independent lower bounding procedures. Then

l(X) = l_f(X)  if max_{i∈I} l_{g_i}(X) ≤ 0,   and   l(X) = +∞  if max_{i∈I} l_{g_i}(X) > 0,

is a convergent M-dependent lower bounding procedure.
c) Analogous statements hold for upper bounding procedures.

Proof. As the choice of the functions g_i ≡ 0, i ∈ I, results in M(X) = X for all sub-boxes X of B, the assertion of part a) is clear.

To see part b), we first show that l is an M-dependent lower bounding procedure in the sense of Definition 2.1a). To this end, let X be a sub-box of B. In the case M(X) ≠ ∅ we choose some x̃ ∈ X with g_i(x̃) ≤ 0 for all i ∈ I. This implies max_{i∈I} l_{g_i}(X) ≤ max_{i∈I} g_i(x̃) ≤ 0, so that l(X) = l_f(X) ≤ min_{x∈X} f(x) ≤ min_{x∈M(X)} f(x). In the case M(X) = ∅ the assertion is trivial since inf_{x∈M(X)} f(x) = +∞.

Next, to show the convergence of l, choose an exhaustive sequence of boxes (X_k)_{k∈N} and let x̄ denote the unique element of ∩_{k∈N} X_k. We distinguish two cases. First assume that x̄ ∈ M(B). Then, as above, for any k ∈ N we have max_{i∈I} l_{g_i}(X_k) ≤ max_{i∈I} g_i(x̄) ≤ 0 and, thus, l(X_k) = l_f(X_k). This implies

lim_{k→∞} l(X_k) = lim_{k→∞} l_f(X_k) = lim_{k→∞} min_{x∈X_k} f(x) = f(x̄) = lim_{k→∞} min_{x∈M(X_k)} f(x)

by the convergence of l_f, the continuity of f, and x̄ ∈ M(X_k) for all k ∈ N.

For the second case assume that x̄ ∉ M(B). Then there is an i ∈ I with g_i(x̄) > 0. By the exhaustiveness of the sequence of boxes (X_k)_{k∈N} and the continuity of g_i, the sequence (min_{x∈X_k} g_i(x))_{k∈N} converges to g_i(x̄). Hence, for all sufficiently large k ∈ N we have min_{x∈X_k} g_i(x) > 0 and, as l_{g_i} is a convergent M-independent bounding procedure, for possibly larger k ∈ N we also obtain l_{g_i}(X_k) > 0. This implies l(X_k) = +∞ and, thus, lim_{k→∞} l(X_k) = +∞, as required for x̄ ∉ M(B).

The assertion of part c) is shown analogously.
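The construction in Proposition 2.2 b) is straightforward to implement. The following Python sketch is only an illustration under our own assumptions (the box representation and the bounding callables are supplied by the caller); it is not taken from the paper, whose implementation is in C++.

```python
import math

def m_dependent_lower_bound(l_f, l_g_list, X):
    """Proposition 2.2 b): combine convergent M-independent lower bounding
    procedures l_f and l_{g_i} into an M-dependent lower bound for P(X).

    l_f      : callable mapping a box X to a lower bound for min_{x in X} f(x)
    l_g_list : callables mapping X to lower bounds for min_{x in X} g_i(x)
    X        : the current sub-box, in whatever representation the callables accept
    """
    # If some constraint is provably positive on all of X, the box contains
    # no feasible point, so +infinity is a valid lower bound for inf over M(X).
    if any(l_g(X) > 0 for l_g in l_g_list):
        return math.inf
    # Otherwise the M-independent bound on the objective is already valid,
    # since the minimum over M(X) is at least the minimum over X.
    return l_f(X)
```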

Remark 2.3. Many well-known bounding procedures lack the stated convergence properties at first glance, such as some of the aforementioned procedures based on centered forms or interval arithmetic. In these cases, however, usually a similar type of convergence holds, for example

lim_{diag(X)→0} ( u_f(X) − l_f(X) ) = 0.   (2.1)

With an exhaustive sequence of boxes (X_k)_{k∈N} we then have

0 ≤ ( u_f(X_k) − min_{x∈X_k} f(x) ) + ( min_{x∈X_k} f(x) − l_f(X_k) ) = u_f(X_k) − l_f(X_k)

for all k ∈ N, where both bracketed terms are nonnegative. In view of (2.1) we immediately obtain the M-independent convergence of both bounding procedures as well. In fact, while u_f(X) denotes an upper bound for min_{x∈X} f(x), for centered forms or interval arithmetic (2.1) even holds with upper bounds for max_{x∈X} f(x), as, for example, l̄_f(X). The convergence of such upper bounding procedures follows as above.

A convenient way of computing lower bounds in global optimization is the so-called αBB relaxation. We will briefly sketch the method in Example 2.4. For a more detailed explanation we refer to [7]. Tight convex underestimators tailored for certain types of functions such as bi- and trilinear, univariate concave and fractional terms can be found in [1]. For implementation details we refer to [2].

Example 2.4. The main idea of the αBB method to determine lower and upper bounds for nonconvex twice differentiable functions on boxes is to construct convex underestimators. Say, for example, that we want to construct a convex underestimator f̃ for the objective function f on the box X = [x̲, x̄] ⊆ B, that is, f̃ is convex on X and satisfies f̃(x) ≤ f(x) for all x ∈ X. This can be achieved by adding a convex term scaled with a suitably large parameter α ≥ 0 to compensate for the non-convexities of f. In fact, for α ≥ 0 the function

ψ_α: R^n → R,   x ↦ (α/2) (x̲ − x)ᵀ (x̄ − x)

is easily seen to be non-positive on X, so that the function f̃(x) := f(x) + ψ_α(x) is an underestimator for f on X. Moreover, in view of D²ψ_α(x) = αI_n (with the identity matrix I_n), ψ_α is strongly convex for α > 0. For sufficiently

large values of α the function f̃ thus is convex on B [1]. Methods to compute appropriate values for α using, for example, interval arithmetic are discussed in [1].

Let x̃ denote a (globally) minimal point of the convex function f̃ on X. Then l_f(X) := f̃(x̃) and u_f(X) := f(x̃) obviously give rise to M-independent lower and upper bounding procedures for f on X, respectively. Since ψ_α attains its unique minimal point at the midpoint mid(X) = ½(x̲ + x̄) of the box X, it is not hard to see [3] that the so-called maximum separation distance between f and f̃ on X satisfies

max_{x∈X} ( f(x) − f̃(x) ) = (α/8) diag(X)².   (2.2)

This leads to

u_f(X) − l_f(X) = f(x̃) − f̃(x̃) ≤ max_{x∈X} ( f(x) − f̃(x) ) = (α/8) diag(X)²,

so that the lower and upper bounding procedures satisfy (2.1). The discussion in Remark 2.3 now implies that both l_f and u_f are convergent. In view of l_f(X) = min_{x∈X} f̃(x), the lower bounding procedure l_f is also monotone.

To construct M-dependent lower and upper bounding procedures for the objective function of P(B), for X ⊆ B we may relax the nonlinear optimization problem P(X) and solve the convex problem

P̂(X):   min_{x ∈ R^n} f̃(x)   s.t.   x ∈ M̂(X),

where M̂(X) denotes a convex set containing the feasible set M(X). Such a set can, for example, be obtained by M̂(X) = {x ∈ X | ĝ_i(x) ≤ 0, i ∈ I}, where ĝ_i is a convex underestimator of g_i on X constructed as described above for the objective function. This approach is proposed in [3]. In case of solvability, the minimal value v̂(X) of the convex optimization problem P̂(X) provides an M-dependent lower bounding procedure l(X) for the globally minimal value v(X) of P(X). Moreover, for any point x ∈ M(X) the value f(x) provides an M-dependent upper bounding procedure u(X) for v(X). Unfortunately, in the presence of nonconvex constraints it may be complicated to determine such a point x ∈ M(X), so that stating a well-defined upper bounding procedure is a challenging issue. We will discuss this in more general terms below.
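To make the construction of Example 2.4 concrete, the following Python sketch builds the underestimator f̃ = f + ψ_α on a box and returns the bounds l_f(X) and u_f(X). It is an illustrative sketch under our own assumptions, not the paper's C++ implementation: α is passed in by the caller instead of being computed via interval arithmetic, and a general-purpose local solver stands in for any convex minimization routine.

```python
import numpy as np
from scipy.optimize import minimize

def psi_alpha(x, lo, hi, alpha):
    """Perturbation psi_alpha(x) = (alpha/2) (lo - x)^T (hi - x); nonpositive on [lo, hi]."""
    return 0.5 * alpha * np.dot(lo - x, hi - x)

def alpha_bb_bounds(f, lo, hi, alpha):
    """M-independent lower/upper bounds for min_{x in X} f(x) on X = [lo, hi]
    via the alphaBB underestimator f_tilde = f + psi_alpha (cf. Example 2.4).
    Assumes alpha is large enough to make f_tilde convex on the box."""
    f_tilde = lambda x: f(x) + psi_alpha(x, lo, hi, alpha)
    x0 = 0.5 * (lo + hi)                        # start at the midpoint of the box
    res = minimize(f_tilde, x0, bounds=list(zip(lo, hi)))
    x_min = res.x
    return f_tilde(x_min), f(x_min)             # (l_f(X), u_f(X))

if __name__ == "__main__":
    # Example: f(x) = -x_1*x_2 on [0,1]^2; alpha = 1 already convexifies f here.
    f = lambda x: -x[0] * x[1]
    lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])
    l, u = alpha_bb_bounds(f, lo, hi, alpha=1.0)
    # The gap u - l is at most the maximum separation distance (alpha/8) diag(X)^2 from (2.2).
    print(l, u, (1.0 / 8.0) * np.dot(hi - lo, hi - lo))
```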

We will now give a short overview of the branch-and-bound mechanism before we formulate the procedure in detail. We start by dividing the original box B = [b̲, b̄] into the boxes X^1 and X^2 along a longest edge. Given an M-dependent lower bounding procedure l, for j ∈ {1, 2} we calculate a valid lower bound v^j := l(X^j) for the globally minimal value of P(X^j). We save these values together with the corresponding boxes in a list L = {(X^1, v^1), (X^2, v^2)}.

If we know elements x^1 ∈ M(X^1) and x^2 ∈ M(X^2), we can insert them into the objective function to obtain real upper bounds f(x^1) and f(x^2) for v(B). Then we update u(B) to the smaller of the two bounds and store the corresponding point x^j as the best known feasible point so far, x*. To illustrate this, in the αBB approach (cf. Ex. 2.4) we can check whether at least one of the optimal points x̃^j of the convex subproblems P̂(X^j), j ∈ {1, 2}, is feasible. Among the feasible points x̃^j, j ∈ {1, 2}, the one with smallest value f(x̃^j) is stored as x*. However, in case of infeasibility of both x̃^1 and x̃^2, we cannot construct an upper bound u(B) as above. We will describe the difficulties arising in this step in detail in Section 3 and propose a solution to this problem in Section 4.

For the description of the next step assume that we know a valid upper bound u(B). If there is a pair (X̃, ṽ) ∈ L with ṽ > u(B), we can remove it from L, because a globally minimal point can certainly not be found in X̃. This is known as the fathoming step. If in this step all list entries are fathomed, resulting in L = ∅, then this is a certificate for inconsistency of M(B). On the other hand, for a consistent set M(B) the box containing the best known point so far is always a list entry.

All pairs (X̃, ṽ) in the list L contain lower bounds ṽ = l(X̃) for v(X̃). Hence, we can compute a valid lower bound for v(B), that is, on the whole domain, by l(B) = min{ṽ ∈ R | (X̃, ṽ) ∈ L}. If the lower and upper bounds are sufficiently close to each other in the sense that u(B) − l(B) ≤ ε holds for a prescribed tolerance ε > 0, the algorithm can terminate, since

f(x*) = u(B) ≤ l(B) + ε ≤ f(x) + ε

is satisfied for all x ∈ M(B), as desired. Otherwise, we choose a pair (X̃, ṽ) with minimal value ṽ from L and repeat the above procedure to improve the bounds. This leads, in particular, to a tessellation of the original box B into sub-boxes.

We are now ready to state a preliminary version of the branch-and-bound framework.

Algorithm 2.5 (Conceptual branch-and-bound method).

Step 1: Initialization. Choose a tolerance ε > 0, set the lower bound to l_0(B) = −∞, the upper bound to u_0(B) = +∞, and the list to L = {(B, l_0(B))}. Initialize the best known point so far by x*_0 = mid(B) and set the iteration counter to k = 1.

Step 2: Select and divide box. Choose a pair (X_k, v_k) ∈ L with smallest lower bound v_k and remove it from L. Divide X_k along a longest edge into X_k^1 and X_k^2. If the longest edge is not unique, choose the edge parallel to the coordinate axis with smallest index.

Step 3: Calculate lower bounds. Determine lower bounds v_k^1 = l(X_k^1) and v_k^2 = l(X_k^2) for the minimal values of P(X_k^1) and P(X_k^2). Add the pairs (X_k^1, v_k^1) and (X_k^2, v_k^2) to L.

Step 4: Update upper bound. Choose points x_k^1 ∈ X_k^1 and x_k^2 ∈ X_k^2.
If x_k^1 ∈ M(X_k^1) and x_k^2 ∈ M(X_k^2), put u_k(B) = min{u_{k−1}(B), f(x_k^1), f(x_k^2)} and choose x*_k ∈ {x*_{k−1}, x_k^1, x_k^2} with f(x*_k) = u_k(B).
If x_k^1 ∈ M(X_k^1) and x_k^2 ∉ M(X_k^2), put u_k(B) = min{u_{k−1}(B), f(x_k^1)} and choose x*_k ∈ {x*_{k−1}, x_k^1} with f(x*_k) = u_k(B).
If x_k^1 ∉ M(X_k^1) and x_k^2 ∈ M(X_k^2), put u_k(B) = min{u_{k−1}(B), f(x_k^2)} and choose x*_k ∈ {x*_{k−1}, x_k^2} with f(x*_k) = u_k(B).
If x_k^1 ∉ M(X_k^1) and x_k^2 ∉ M(X_k^2), put u_k(B) = u_{k−1}(B).

Step 5: Fathoming. Remove all pairs (X̃, ṽ) with ṽ > u_k(B) or ṽ = +∞ from L.

Step 6: Update lower bound. If L ≠ ∅, compute the lower bound l_k(B) = min{ṽ ∈ R | (X̃, ṽ) ∈ L}.

Step 7: Termination criterion. If u_k(B) − l_k(B) ≤ ε or L = ∅, stop.

Step 8: Increment iteration counter. Increment k, go to Step 2.
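For readers who prefer the framework in executable form, the following Python sketch mirrors the structure of Algorithm 2.5. It is only an illustration under our own simplifying assumptions (boxes are pairs of numpy arrays, the caller supplies an M-dependent lower bounding procedure and a rule for choosing the trial points of Step 4, and feasibility is tested by direct constraint evaluation); it is not the paper's C++ implementation.

```python
import numpy as np

def conceptual_bb(f, gs, lo, hi, lower_bound, choose_point, eps=1e-3, max_iter=100000):
    """Conceptual branch-and-bound method (Algorithm 2.5), illustrative sketch.

    f            : objective function
    gs           : list of constraint functions g_i (feasible means g_i(x) <= 0 for all i)
    lo, hi       : numpy arrays describing the initial box B
    lower_bound  : M-dependent lower bounding procedure, maps (lo, hi) to a real or +inf
    choose_point : maps (lo, hi) to a trial point inside the box (e.g. its midpoint)
    """
    feasible = lambda x: all(g(x) <= 0 for g in gs)

    # Step 1: initialization
    L = [((lo, hi), -np.inf)]
    u_B, x_best = np.inf, 0.5 * (lo + hi)

    for _ in range(max_iter):
        # Step 2: select a pair with smallest lower bound, divide along a longest edge
        idx = min(range(len(L)), key=lambda i: L[i][1])
        (xlo, xhi), _ = L.pop(idx)
        j = int(np.argmax(xhi - xlo))            # ties: smallest coordinate index
        m = 0.5 * (xlo[j] + xhi[j])
        children = []
        for a, b in ((xlo[j], m), (m, xhi[j])):
            clo, chi = xlo.copy(), xhi.copy()
            clo[j], chi[j] = a, b
            children.append((clo, chi))

        # Step 3: lower bounds for the children; Step 4: upper bound from feasible trial points
        for clo, chi in children:
            L.append(((clo, chi), lower_bound(clo, chi)))
            x_trial = choose_point(clo, chi)
            if feasible(x_trial) and f(x_trial) < u_B:
                u_B, x_best = f(x_trial), x_trial

        # Step 5: fathoming
        L = [entry for entry in L if entry[1] <= u_B and entry[1] < np.inf]

        # Steps 6 and 7: lower bound on B and termination criterion
        if not L:
            return x_best, u_B, u_B
        l_B = min(v for _, v in L)
        if u_B - l_B <= eps:
            return x_best, u_B, l_B

    return x_best, u_B, min(v for _, v in L)
```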

As mentioned before, we will focus on problems arising in Step 4 of Algorithm 2.5. In fact, in general there is no guarantee that for any k ∈ N one of the chosen points x_k^1 and x_k^2 is feasible, so that the initial upper bound u_0(B) might never be updated. This is, of course, not an issue in the merely box constrained case M(X) = X, and also for problems with convex constraints g_i, i ∈ I, one can efficiently determine feasible points. For example, in the αBB approach (cf. Example 2.4) the optimal points x̃_k^j of the convex subproblems P̂(X_k^j), j ∈ {1, 2}, are elements of M̂(X_k^j) = M(X_k^j) and, thus, necessarily feasible. In Theorem 4.7 below we will see under which additional conditions such upper bounding procedures lead to finite termination of Algorithm 2.5 for any ε > 0.

In the presence of nonconvex constraints, however, before one can study finite termination of Algorithm 2.5, in the first place one has to make sure that the initial bound does not remain constant. In fact, in Section 3 we will review a simple upper bounding procedure from [1] and show in Example 3.1 that the initial upper bound is never updated.

An obvious sufficient condition for the termination criterion in Step 7 to hold after finitely many iterations is convergence of the infinite branch-and-bound procedure corresponding to the theoretical choice ε = 0, that is, that the difference u_k(B) − l_k(B) tends to zero for k → ∞. This sufficient condition can, however, be weakened in view of the next result.

Lemma 2.6. In Algorithm 2.5 with ε = 0, let l be a monotone M-dependent lower bounding procedure, and let u_k(B) ≢ +∞, that is, let u_k(B) be finite for some k ∈ N. Then the sequence (u_k(B) − l_k(B))_{k∈N} is convergent.

Proof. By construction, the sequence (u_k(B))_{k∈N} is nonincreasing, and by the assumption it is real for all sufficiently large k ∈ N. Moreover, by the selection rule in Step 2, for each k ∈ N we choose some X_k with minimal v_k = l(X_k), and the latter coincides with l_{k−1}(B). The next lower bound l_k(B) then either coincides with some ṽ for a pair (X̃, ṽ) which was part of the list before the deletion of X_k, or it is one of the lower bounds v_k^1 = l(X_k^1), v_k^2 = l(X_k^2) which result from splitting X_k. In the first case we have l_k(B) ≥ l_{k−1}(B) by the construction of l_{k−1}(B), and in the second case the same relation holds in view of the monotonicity of l. This shows that the sequence (l_k(B))_{k∈N} is nondecreasing. Thus, the sequence (u_k(B) − l_k(B))_{k∈N} turns out to be nonincreasing and bounded below by zero and, hence, to be convergent.

This immediately leads to the following convergence condition.

Proposition 2.7. In Algorithm 2.5 with ε = 0, let l be a monotone M-dependent lower bounding procedure, let u_k(B) ≢ +∞, and let (u_k(B) − l_k(B))_{k∈N} possess zero as a cluster point. Then, for any ε > 0, Algorithm 2.5 terminates after finitely many iterations.

Remark 2.8. According to [9, Theorem IV.3] any branch-and-bound procedure is convergent if its selection operation is bound improving and its bounding operations are consistent. Here, a selection operation is called bound improving if at least one tessellation element at which the current lower bound is attained is selected for further subdivision after a finite number of steps. In Algorithm 2.5 this is obviously the case by the formulation of Step 2 (see also [7]). Consistency is less obvious and, actually, challenging in the presence of nonconvex constraints. In fact, a bounding operation is called consistent [9] if at every step any unfathomed partition element can be further refined, and if any nested subsequence (X_{k_j})_{j∈N} of boxes selected in Step 2 satisfies

lim_{j→∞} ( u_{k_j}(B) − l(X_{k_j}) ) = 0.   (2.3)

As seen above, in Algorithm 2.5 we have l(X_{k_j}) = l_{k_j−1}(B) and, thus,

u_{k_j}(B) − l(X_{k_j}) ≤ u_{k_j−1}(B) − l_{k_j−1}(B),

so that the convergence of the infinite branch-and-bound procedure itself is a natural sufficient condition for consistency. In other words, for the convergence proof of Algorithm 2.5 it will not be helpful to use the consistency concept.

As mentioned before, from our point of view the main challenge in any convergence proof for Algorithm 2.5 in the presence of nonconvex constraints is a well-defined statement of the upper bounding procedure. In particular, for the upper bounding procedures in the αBB framework described in [7] we are not aware of a consistency or convergence proof.

3 Difficulties with Nonconvex Constraints

As discussed above, determining appropriate upper bounds for the globally minimal value is crucial for termination of Algorithm 2.5 in Step 7. The first idea to gain valid upper bounds is to insert points from the boxes X_k^1 and X_k^2 into the objective function [1]. If these points are feasible, we obtain valid upper bounds.

In the framework of the αBB method (cf. Ex. 2.4), a natural choice for these points are the optimal points x̃_k^1 and x̃_k^2 of the convex relaxations P̂(X_k^1) and P̂(X_k^2), respectively. Unfortunately both x̃_k^1 and x̃_k^2 may be infeasible, so that we cannot update the upper bound. Hence, it is possible that no good upper bound is found, or even none at all. Example 3.1 below illustrates this fact for the αBB method and shows that this procedure of computing upper bounds does not work in general.

Example 3.1. Consider the nonconvex optimization problem

P(B):   min_{x ∈ R²} f(x) := x_1 − x_2   s.t.   g(x) := −x_1² − (x_2 − 5)² + 25 + √2 ≤ 0,   x ∈ B := [1, 2] × [0, 1].

It is not hard to see that the unique globally minimal point is x* = (⁴√2, 0) with globally minimal value f(x*) = ⁴√2. The proof is omitted here for brevity. In the following we will show that the αBB method does not generate any feasible iterates and, thus, no upper bounds for problem P(B) with the strategy described above.

As a first step we shall show that the upper left vertex of any box created by the algorithm is infeasible. This is easily seen for the initial box B as well as for any sub-box X = [x̲, x̄] of B with x̲_1 < ⁴√2. Moreover, as B is a square, due to the division rule all sub-boxes X = [x̲, x̄] created by the algorithm are either squares or rectangles with x̄_2 − x̲_2 = 2(x̄_1 − x̲_1). As all vertices of these boxes have rational coordinates, in the following we only need to consider boxes with ⁴√2 < x̲_1 ≤ b̄_1 = 2.

In any iteration let (X, v) ∈ L be the pair chosen in Step 2 of Algorithm 2.5. Then the upper left vertex (x̲_1, x̄_2) of X necessarily satisfies

x̲_1 − x̄_2 = min_{x ∈ X} f(x) ≤ min_{x ∈ M̂(X)} f(x) = v ≤ f(x*) = ⁴√2.   (3.4)

First consider the case that X is a rectangle with x̄_2 − x̲_2 = 2(x̄_1 − x̲_1). Then both sub-boxes X^1 and X^2 resulting from the division rule are squares, where the upper left vertex of X^1 is (x̲_1, x̄_2), and the upper left vertex of X^2 is (x̲_1, x̂_2) with x̂_2 := ½(x̲_2 + x̄_2). In the following we will show that both these points are infeasible, starting with (x̲_1, x̂_2). In fact, due to X ⊆ B we have x̲_2 ≥ b̲_2 = 0 and, thus, x̂_2 ≥ ½ x̄_2. Using (3.4), this leads to x̂_2 ≥ ½(x̲_1 − ⁴√2). The monotonicity of g(x̲_1, ·) on [0, 1] now yields

g(x̲_1, x̂_2) ≥ −x̲_1² − ( ½(x̲_1 − ⁴√2) − 5 )² + 25 + √2.   (3.5)

As the right hand side of (3.5) is strictly increasing for x̲_1 ∈ (⁴√2, 2] and attains the value zero at ⁴√2, we obtain g(x̲_1, x̂_2) > 0, that is, the infeasibility of the upper left vertex of X^2. The monotonicity of g(x̲_1, ·) on [0, 1] and the relation x̂_2 ≤ x̄_2 now yield g(x̲_1, x̄_2) ≥ g(x̲_1, x̂_2) > 0, that is, also the infeasibility of the upper left vertex of X^1. Analogous arguments yield the infeasibility of the upper left vertices of X^1 and X^2 in the case that the box X is a square. Altogether, we have shown that the upper left vertex of any box created by the algorithm is infeasible.

Now we apply the αBB method to problem P(B). Consider an arbitrary iteration in which we have to solve the convex subproblem P̂(X̃) with some sub-box X̃ of B. We will show that any optimal point x̃* of P̂(X̃), if it exists, is infeasible for P(B). In fact, this is trivial in the case M(X̃) = ∅, so that in the sequel we will assume M(X̃) ≠ ∅ and, thus, solvability of P(X̃) and P̂(X̃).

In the formulation of the relaxed problem P̂(X̃), as f is linear, we only have to construct a convex underestimator for the restriction g on X̃. It is not hard to see that this is achieved by setting ĝ(x) := g(x) + ψ_α(x) with α = 2. Note that then ĝ(x) < g(x) holds for all x from the box X̃, except for its vertices. In fact, let x be a vertex of X̃. Then x has rational entries, so that the equation 0 = g(x) = −x_1² − (x_2 − 5)² + 25 + √2 cannot hold or, in other words, g is not active at any vertex of X̃.

On the other hand, let x̄* be some optimal point of the (nonconvex) problem P(X̃). Then g must be active at x̄*, as the objective function f(x) = x_1 − x_2 attains its minimal value at a boundary point of M(X̃), and the upper left vertex of X̃ is infeasible. Hence, x̄* is not a vertex of X̃, and we obtain ĝ(x̄*) < g(x̄*) = 0. This excludes that f(x̄*) is also the minimal value of P̂(X̃), as we will see next. In fact, by the continuity of ĝ there is a neighborhood U of x̄* with ĝ(x) < 0 for all x ∈ U. Since x̄* is, in particular, not the upper left vertex of X̃, the objective function f(x) = x_1 − x_2 can be strictly improved by shifting x̄* along one of the directions (−1, 0) or (0, 1) while staying in X̃ ∩ U and, hence, in M̂(X̃). This implies that the minimal value of P̂(X̃) is strictly lower than the minimal value of P(X̃). We conclude that any optimal point x̃* of P̂(X̃) is infeasible for P(X̃) and, thus, for P(B).
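As a quick numerical illustration of this effect, the following Python sketch (our own check, not part of the paper) solves the αBB relaxation of Example 3.1 on the initial box B with a general-purpose solver and evaluates the original constraint at the relaxation optimum; the same phenomenon occurs on the sub-boxes created by the algorithm.

```python
import math
import numpy as np
from scipy.optimize import minimize

# Data of Example 3.1 on the initial box B = [1, 2] x [0, 1]
lo, hi = np.array([1.0, 0.0]), np.array([2.0, 1.0])
f = lambda x: x[0] - x[1]
g = lambda x: -x[0]**2 - (x[1] - 5.0)**2 + 25.0 + math.sqrt(2.0)

# alphaBB underestimator of g with alpha = 2 (the Hessian of g is -2*I)
alpha = 2.0
psi = lambda x: 0.5 * alpha * np.dot(lo - x, hi - x)
g_hat = lambda x: g(x) + psi(x)

# Convex relaxation: minimize the linear objective over {x in B : g_hat(x) <= 0}
res = minimize(f, x0=0.5 * (lo + hi), bounds=list(zip(lo, hi)),
               constraints=[{"type": "ineq", "fun": lambda x: -g_hat(x)}],
               method="SLSQP")
x_rel = res.x
print("relaxation optimum:", x_rel)     # approximately (1.138, 0)
print("g at this point:   ", g(x_rel))  # positive: infeasible for P(B), so Step 4 yields no upper bound
```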

If, as in Example 3.1, feasible points are not available for the generation of upper bounds, a popular alternative is to accept also ε_f-feasible points, that is, points x ∈ R^n with max_{i∈I} g_i(x) ≤ ε_f for a given tolerance ε_f > 0 [7]. While infeasible points clearly do not necessarily lead to valid upper bounds, the idea behind ε_f-feasibility appears to be that the distance of any ε_f-feasible point to the feasible set is small, so that the possible error in the objective function evaluation may be controlled by ε_f. The following Example 3.2 illustrates the well-known fact that the latter argument is wrong.

Example 3.2. Consider the optimization problem

P(B):   min_{x ∈ R} f(x) := −2x   s.t.   g(x) := (3/(8π)) x + ½ sin(x) ≤ 0,   x ∈ B := [−(3/2)π, (3/2)π].

It is not hard to see that the minimal value of P(B) is zero. After the initialization of Algorithm 2.5, in Step 2 the box B is divided into X¹ = [−(3/2)π, 0] and X² = [0, (3/2)π]. In the following we will focus on the solution of the relaxed problem P̂(X²). Since the objective function is linear, we only have to construct a convex underestimator of g on X². In view of g''(x) = −½ sin(x) we may set ĝ(x) = g(x) + ψ_α(x) with α = ½. The unique minimal point of P̂(X²) then turns out to be x̃*_2 ≈ 4.6633. In view of g(x̃*_2) ≈ 0.0572, the optimal point of the relaxation is infeasible for P(B) but, for example for ε_f = 0.06, it is at least ε_f-feasible. Evaluating the objective function at this point leads to f(x̃*_2) ≈ −9.3266. This value is significantly smaller than the common optimal value zero of P(X²) and P(B) and, thus, utterly useless as an upper bound.

For completeness we also mention the strategy to solve the nonconvex optimization problem P(X) locally on a given sub-box X and to use the resulting optimal value as an upper bound [7]. Since in this so-called upper level problem approach ε_f-feasible points also have to be accepted, similar problems as in Example 3.2 may occur.
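The numbers in Example 3.2 are easy to reproduce. The following Python sketch is our own illustrative check (not part of the paper): it builds the underestimator ĝ on X² = [0, 3π/2], locates the largest root of ĝ by bisection, which is the minimizer of the linear objective over the relaxed feasible set, and evaluates g and f there.

```python
import math

# Data of Example 3.2 on the sub-box X^2 = [0, 3*pi/2]
lo, hi = 0.0, 1.5 * math.pi
f = lambda x: -2.0 * x
g = lambda x: 3.0 / (8.0 * math.pi) * x + 0.5 * math.sin(x)

# alphaBB underestimator g_hat = g + psi_alpha with alpha = 1/2 (convex on X^2)
alpha = 0.5
psi = lambda x: 0.5 * alpha * (lo - x) * (hi - x)
g_hat = lambda x: g(x) + psi(x)

# The relaxation minimizes -2x over {x in X^2 : g_hat(x) <= 0}; since g_hat(hi) > 0,
# its minimizer is the largest root of g_hat, found here by bisection.
a, b = 0.5 * hi, hi                    # g_hat(a) < 0 < g_hat(b) for this data
for _ in range(100):
    m = 0.5 * (a + b)
    if g_hat(m) <= 0.0:
        a = m
    else:
        b = m

x_rel = a
print("relaxation minimizer:", round(x_rel, 4))    # ~ 4.6633
print("g at this point:     ", round(g(x_rel), 4))  # ~ 0.0572 > 0, but eps_f-feasible for eps_f = 0.06
print("f at this point:     ", round(f(x_rel), 4))  # ~ -9.3266, far below the true optimal value 0
```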

4 An Enhanced Branch-and-Bound Framework

In this section we derive a convergent branch-and-bound framework in the presence of nonconvex constraints.

As a first step, we need the following lemma, which ensures that a box not containing any feasible point affects the algorithm for at most a finite number of iterations.

Lemma 4.1. Let M(B) be nonempty and let X̃ ⊆ B \ M(B) be a box such that (X̃, ṽ) ∈ L holds with some ṽ ∈ R. Further assume that in Algorithm 2.5 some convergent M-dependent lower bounding procedure is used. Then in Step 2 of the algorithm a box X ⊆ X̃ is selected in at most finitely many iterations.

Proof. We derive a contradiction. In fact, if the assertion is wrong, then it is not hard to see that the sequence of boxes created by the algorithm contains an exhaustive subsequence of boxes X_{k_j} ⊆ X̃, j ∈ N. Due to X̃ ⊆ B \ M(B) we have inf_{x∈M(X_{k_j})} f(x) = +∞ for each j ∈ N and, by the convergence of the lower bounding procedure, lim_{j→∞} l(X_{k_j}) = lim_{j→∞} inf_{x∈M(X_{k_j})} f(x) = +∞. On the other hand, since M(B) is nonempty there is some x ∈ M(B) with a finite objective function value f(x). Therefore in every iteration of the algorithm only boxes X with lower bounds v smaller than f(x) are selected and, in particular, we have l(X_{k_j}) = l_{k_j−1}(B) ≤ f(x) for all j ∈ N. This contradicts the unboundedness of (l(X_{k_j}))_{j∈N}.

Proposition 4.2. Let M(B) be nonempty and assume that in Algorithm 2.5 some convergent and monotone M-dependent lower bounding procedure is used. Then, if the infinite branch-and-bound procedure corresponding to ε = 0 does not terminate, we have lim_{k→∞} l_k(B) = v(B).

Proof. First recall from the proof of Lemma 2.6 that the sequence (l_k(B))_{k∈N} is not only bounded above by the finite value v(B) but also nondecreasing. Therefore it is convergent, and its limit satisfies lim_{k→∞} l_k(B) ≤ v(B). Under the division rule from Step 2 it is not hard to see that the infinite sequence of created boxes contains an exhaustive subsequence (X_{k_j})_{j∈N}. Let x̄ denote the unique element of ∩_{j∈N} X_{k_j}. Assume that x̄ ∉ M(B) holds. Then, for all sufficiently large j ∈ N we have X_{k_j} ⊆ B \ M(B), in contradiction to the assertion of Lemma 4.1. Hence we find x̄ ∈ M(B) and, by the selection rule in Step 2 and the convergence of the lower bounding procedure,

lim_{j→∞} l_{k_j−1}(B) = lim_{j→∞} l(X_{k_j}) = lim_{j→∞} min_{x∈M(X_{k_j})} f(x) = f(x̄).

As any subsequence of the convergent sequence (l_k(B))_{k∈N} inherits its limit, we obtain lim_{k→∞} l_k(B) = f(x̄) ≥ v(B) and, thus, the assertion.

Remark 4.3. In the framework presented in [9], convergence of lower bounds to v(B) can also be proved, but under different assumptions on the lower bounding procedure. In fact, there one uses that the selection operation is bound improving, which is clearly fulfilled in our case, as mentioned above. Secondly, one has to show that the lower bounding operation is strongly consistent. For the formulation of strong consistency consider any nested subsequence of boxes (X_{k_j})_{j∈N} selected in Step 2. Strongly consistent bounding operations involve mainly two items in our case. The first one is M̃ ∩ M(B) ≠ ∅, where M̃ := ∩_{j∈N} X_{k_j}, which follows immediately from Lemma 4.1. The second one is

lim_{j→∞} l(X_{k_j}) = min f( M̃ ∩ M(B) ),

which is likewise used in our proof of Proposition 4.2.

After this analysis of an approximation procedure for the optimal value, we turn to an approximation procedure for an optimal point.

Proposition 4.4. Let M(B) be nonempty, assume that in Algorithm 2.5 some convergent and monotone M-dependent lower bounding procedure is used, and that the infinite branch-and-bound procedure does not terminate. Furthermore, let (X_k)_{k∈N} be a subsequence of the boxes chosen in Step 2, and let (x_k)_{k∈N} be a sequence of points with x_k ∈ X_k, k ∈ N. Then (x_k)_{k∈N} possesses a cluster point, and any such cluster point is a globally minimal point of P(B).

Proof. As the sequence (x_k)_{k∈N} is contained in the bounded set B, the existence of a cluster point is clear. Let x̄ be such a cluster point. We have to prove the feasibility of x̄ as well as f(x̄) = v(B). Choose a subsequence (x_{k_j})_{j∈N} with lim_{j→∞} x_{k_j} = x̄. Then, without loss of generality, we may assume that the sequence (X_{k_j})_{j∈N} is exhaustive. The feasibility of x̄ now follows from Lemma 4.1 as in the proof of Proposition 4.2. Furthermore, the convergence of the lower bounding procedure and the continuity of f yield

lim_{j→∞} l_{k_j−1}(B) = lim_{j→∞} l(X_{k_j}) = lim_{j→∞} min_{x∈M(X_{k_j})} f(x) = f(x̄),

so that Proposition 4.2 implies f(x̄) = v(B).

Remark 4.5. In [9, Corollary IV.2] one finds a similar result for the best known points so far, denoted by x*_k. However, there consistent bounding operations are assumed, in contrast to our approach.

Lemma 4.6. Under the assumptions of Proposition 4.4, further assume that the subsequence (X_k)_{k∈N} and the points (x_k)_{k∈N} can be chosen such that x_k ∈ M(B) holds for all k ∈ N. Then the (complete) sequence (u_k(B) − l_k(B))_{k∈N} possesses zero as a cluster point.

Proof. Consider the subsequence (u_k(B) − l_k(B))_{k∈N} corresponding to the chosen boxes. For each k ∈ N the feasibility of x_k leads to v(B) ≤ u_k(B) ≤ f(x_k). Proposition 4.4 and the continuity of f then yield that the sequence (u_k(B))_{k∈N} possesses the cluster point v(B). Together with Proposition 4.2 the assertion follows.

The combination of Proposition 2.7 and Lemma 4.6 immediately yields the following result.

Theorem 4.7. Assume that in Algorithm 2.5 some convergent and monotone M-dependent lower bounding procedure is used, and that in each iteration k at least one of the points x_k^1 and x_k^2 chosen in Step 4 is an element of M(B). Then, for any ε > 0, Algorithm 2.5 terminates after finitely many iterations.

The feasibility assumption for the points x_k^1, x_k^2 from Theorem 4.7 clearly is satisfied for purely box constrained problems. It also holds for problems with convex constraints if, for example, the αBB method is used. As illustrated by Example 3.1, this feasibility assumption may not hold in the presence of nonconvex constraints. Then only the approximation of v(B) by lower bounds is clear, accompanied by an approximation of some globally minimal point by infeasible points. This, however, does not guarantee the termination tolerance ε.

In the following we will thus derive an upper bounding procedure which guarantees convergence under a mild assumption. Its main idea is an appropriate perturbation of the points x_k from Proposition 4.4 to feasible points. In fact, our main assumption is that the Mangasarian-Fromovitz Constraint Qualification (MFCQ) is fulfilled at every globally minimal point x* of P(B) (cf. also Rem. 4.13 below).

For its formulation, we need to write the box constraints x ∈ B as explicit inequality constraints, that is, we put M(B) = {x ∈ R^n | g̃_i(x) ≤ 0, i ∈ Ĩ} with Ĩ = {1, ..., p + 2n} and

g̃_i(x) = g_i(x), i ∈ I = {1, ..., p},   g̃_{p+i}(x) = b̲_i − x_i, i = 1, ..., n,   g̃_{p+n+i}(x) = x_i − b̄_i, i = 1, ..., n.

Then, with the active index set Ĩ_0(x*) = {i ∈ Ĩ | g̃_i(x*) = 0}, MFCQ holds at x* if there exists a direction d ∈ R^n with Dg̃_i(x*) d < 0 for all i ∈ Ĩ_0(x*).

If MFCQ holds at x*, for reasons of continuity any point x̃ ∈ R^n sufficiently close to x* also satisfies Dg̃_i(x̃) d < 0, i ∈ Ĩ_0(x*), even if x̃ is infeasible. This is illustrated in Figure 1.

[Figure 1: Point x̃ close to the feasible set (the figure shows x*, x̃, the direction d, and the perturbed point x̃ + λd).]

Assuming temporarily that Ĩ_0(x*) is known, we can compute such a direction d by solving the linear program

LP(x̃):   min_{(d,z) ∈ R^{n+1}} z   s.t.   Dg̃_i(x̃) d ≤ z, i ∈ Ĩ_0(x*),   ‖d‖_∞ ≤ r,

where the last constraint bounds d with some search radius r > 0. Note that LP(x̃) is solvable and that for any optimal point (d̄, z̄) the relation z̄ = max_{i∈Ĩ_0(x*)} Dg̃_i(x̃) d̄ holds. As seen above, for x̃ sufficiently close to x* the optimal value z̄ of LP(x̃) is negative, and we obtain a promising direction d̄ at x̃ for searching for feasible points.
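To illustrate how such a direction can be computed in practice, the following Python sketch sets up LP(x̃) with scipy.optimize.linprog. It is an illustrative sketch under our own assumptions (explicit gradient rows, the ∞-norm ball as the search-radius constraint, and a given active index set); in Algorithm 4.10 below the unknown set Ĩ_0(x*) is replaced by the interval-based estimate I_0(X_k) and x̃ by mid(X_k).

```python
import numpy as np
from scipy.optimize import linprog

def mf_direction(grads_active, r):
    """Solve LP(x_tilde): min z  s.t.  Dg_i(x_tilde) d <= z for all active i,  |d|_inf <= r.

    grads_active : list of gradient row vectors Dg_i(x_tilde) for the active indices
    r            : search radius bounding the direction d
    Returns (d, z); a value z < 0 indicates a Mangasarian-Fromovitz-type direction.
    """
    m = len(grads_active)
    n = len(grads_active[0])
    c = np.zeros(n + 1)
    c[-1] = 1.0                                  # objective: minimize z
    # Constraints Dg_i d - z <= 0 for every active index i.
    A_ub = np.hstack([np.asarray(grads_active, dtype=float), -np.ones((m, 1))])
    b_ub = np.zeros(m)
    bounds = [(-r, r)] * n + [(None, None)]      # |d_j| <= r componentwise, z free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1]

# Example: gradients of two active constraints at an infeasible trial point.
d, z = mf_direction([np.array([1.0, 2.0]), np.array([-3.0, 0.5])], r=1.0)
```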

Actually, still assuming that we know the set Ĩ_0(x*), we can prove the following result.

Lemma 4.8. Under the assumptions of Proposition 4.4, let x* be a cluster point of the sequence (x_k)_{k∈N}, which is then a globally minimal point of P(B). Furthermore, let MFCQ hold at x*. Then, for any r > 0, (x_k)_{k∈N} contains a subsequence (x_{k_j})_{j∈N} such that, for all sufficiently large j ∈ N, any optimal point (d_{k_j}, z_{k_j}) of LP(x_{k_j}) satisfies z_{k_j} < 0 as well as x_{k_j} + λ_{k_j} d_{k_j} ∈ M(B) with some λ_{k_j} > 0, where lim_{j→∞} λ_{k_j} = 0.

Proof. First we choose a subsequence (x_{k_j})_{j∈N} of (x_k)_{k∈N} with lim_{j→∞} x_{k_j} = x*. For j ∈ N, let (d_{k_j}, z_{k_j}) be some optimal point of the (solvable) linear program LP(x_{k_j}). Choosing, if necessary, another subsequence of (x_{k_j})_{j∈N}, by the boundedness of (d_{k_j})_{j∈N} we may assume without loss of generality that the vectors d_{k_j} converge to some d* ∈ R^n. The values z_{k_j} = max_{i∈Ĩ_0(x*)} Dg̃_i(x_{k_j}) d_{k_j} then obviously converge to z* = max_{i∈Ĩ_0(x*)} Dg̃_i(x*) d*. As the optimal point mapping of LP(·) is outer semi-continuous [17], the limit (d*, z*) turns out to be an optimal point of LP(x*). The assumption of MFCQ at x* implies the existence of some feasible point (d', z') of LP(x*) with z' < 0, so that also z* < 0 and, hence, Dg̃_i(x*) d* < 0 hold for all i ∈ Ĩ_0(x*). Standard arguments then show that for all sufficiently small λ > 0 we have g̃_i(x* + λ d*) < 0 for all i ∈ Ĩ. In particular, we may choose some sufficiently large l ∈ N with g̃_i(x* + (1/l) d*) < 0, i ∈ Ĩ. Then the continuity of the functions g̃_i, i ∈ Ĩ, implies g̃_i(x_{k_j} + (1/l) d_{k_j}) < 0 for all i ∈ Ĩ and all j ≥ j_l with some j_l ∈ N. Setting λ_{k_{j_l}} := 1/l shows g̃_i(x_{k_{j_l}} + λ_{k_{j_l}} d_{k_{j_l}}) < 0 for all i ∈ Ĩ and all sufficiently large l ∈ N and, thus, the assertion.

Note that Lemma 4.8 relies on solutions of the linear programs LP(x_{k_j}), j ∈ N, in whose formulation the active index set Ĩ_0(x*) appears. Unfortunately, this set is not known a priori, so that we need a certain substitute. This motivates us to define the sets

I_0(X) = { i ∈ Ĩ | l_{g̃_i}(X) ≤ 0 ≤ l̄_{g̃_i}(X) }

for boxes X ⊆ B, where l_{g̃_i} and l̄_{g̃_i} are convergent M-independent bounding

procedures for i ∈ {1, ..., p}, and where we set

l_{g̃_{p+i}}(X) = b̲_i − x̄_i,   l̄_{g̃_{p+i}}(X) = b̲_i − x̲_i,   l_{g̃_{p+n+i}}(X) = x̲_i − b̄_i,   l̄_{g̃_{p+n+i}}(X) = x̄_i − b̄_i

for i = 1, ..., n and X = [x̲, x̄]. It is not hard to see that these are also M-independent bounding procedures. Note that l̄_{g_i}(X) is an upper bound for max_{x∈X} g_i(x), as opposed to u_{g_i}(X), which is defined to be only an upper bound for min_{x∈X} g_i(x).

Lemma 4.9 (Identification of active indices). Let (X_k)_{k∈N} be an exhaustive sequence of boxes and let x* be an element of each box X_k, k ∈ N. Then, for all sufficiently large k ∈ N, we have Ĩ_0(x*) = I_0(X_k).

Proof. For any i ∈ Ĩ_0(x*) we have g̃_i(x*) = 0. Since for any k ∈ N we have x* ∈ X_k, this results in l_{g̃_i}(X_k) ≤ g̃_i(x*) = 0 ≤ l̄_{g̃_i}(X_k) and, thus, Ĩ_0(x*) ⊆ I_0(X_k) for any k ∈ N. Next, consider any i ∈ Ĩ with g̃_i(x*) < 0. Then, for all sufficiently large k ∈ N the convergence of the upper bounding procedure l̄_{g̃_i} towards g̃_i(x*) guarantees the inequality l̄_{g̃_i}(X_k) < 0 and, hence, i ∉ I_0(X_k). An analogous argument for any i ∈ Ĩ with g̃_i(x*) > 0 completes the proof.

Lemma 4.9 shows that, in Lemma 4.8, for all sufficiently large k ∈ N in problem LP(x_k) we may replace the index set Ĩ_0(x*) by I_0(X_k). Moreover, after the special choice x_k := mid(X_k), the linear program is parametrized merely by the choice of the box X_k, and we may replace LP(x_k) by

LP(X_k):   min_{(d,z) ∈ R^{n+1}} z   s.t.   Dg̃_i(mid(X_k)) d ≤ z, i ∈ I_0(X_k),   ‖d‖_∞ ≤ r.

We are now ready to state our final version of the branch-and-bound method in Algorithm 4.10. Let us first explain in detail how we use Lemma 4.8 in Step 4 of the algorithm: if in Step 4 of iteration k both chosen points x_k^1 ∈ X_k^1 and x_k^2 ∈ X_k^2 are infeasible, then the midpoint of the box X_k is perturbed to a feasible point by the construction illustrated in Figure 1. To this end, an optimal point (d_k, z_k) of LP(X_k) is determined. In the case z_k ≥ 0 this information is not used, and the algorithm continues with Step 5.

For sufficiently large k, however, we will find z_k < 0. In this case there is a chance for mid(X_k) + λ_k d_k to be feasible for some λ_k ∈ (0, 1]. As we cannot be sure that k is already sufficiently large for this and, at the same time, we have to search for λ_k ∈ (0, 1], we suggest to proceed as follows: we discretize the interval (0, 1] uniformly by the points s/N_k with s = 1, ..., N_k and some number N_k which is initialized to N_1 = 1. If none of the points mid(X_k) + (s/N_k) d_k with s = 1, ..., N_k turns out to be feasible, we do not further refine the discretization by increasing N_k, as k might still be too small to find a feasible point this way. Instead, we only increment N_k to N_{k+1} = N_k + 1 and continue with Step 5. If, on the other hand, mid(X_k) + (s_k/N_k) d_k ∈ M(B) holds for some s_k ∈ {1, ..., N_k}, we update the upper bound and the best known feasible point so far accordingly. For the next iteration, we reset N_{k+1} = 1.

Unfortunately, the step sizes λ_k = s_k/N_k determined by this strategy do not necessarily converge to zero as in the assertion of Lemma 4.8, but are only bounded. To compensate for this, we enforce convergence of the points mid(X_k) + λ_k d_k to x* by choosing a geometrically decreasing sequence of search radii (r_k)_{k∈N}. Note that this modification of LP(X_k) does not interfere with our previous arguments.
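The following Python sketch illustrates this perturbation step (Step 4b of Algorithm 4.10 below). It is only a sketch under our own assumptions: the direction and LP value come from a solver such as the mf_direction helper sketched above, mid and d support vector arithmetic (e.g. numpy arrays), and feasibility is tested by direct evaluation of all constraints g̃_i.

```python
def perturb_to_feasible(mid, d, z, N_k, g_tilde_list):
    """Step-4b-style perturbation: try the points mid + (s/N_k) * d for s = 1, ..., N_k.

    mid          : midpoint of the selected box X_k
    d, z         : optimal point (d_k, z_k) of LP(X_k); the direction is only used if z < 0
    N_k          : current number of discretization points of the interval (0, 1]
    g_tilde_list : all constraint functions g_tilde_i (original and box constraints)
    Returns (feasible_point_or_None, N_for_the_next_iteration).
    """
    if z >= 0:
        # No strictly feasible direction detected yet; the information is not used.
        return None, N_k
    for s in range(1, N_k + 1):
        x_try = mid + (s / N_k) * d
        if all(g(x_try) <= 0 for g in g_tilde_list):
            # Feasible point found: it yields a valid upper bound f(x_try);
            # the discretization counter is reset (and the search radius is shrunk outside).
            return x_try, 1
    # No feasible point among the trial steps: refine the discretization next time.
    return None, N_k + 1
```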

Algorithm 4.10 (Convergent spatial branch-and-bound method).

Step 1: Initialization. Choose a tolerance ε > 0 and some γ ∈ (0, 1), set the lower bound to l_0(B) = −∞, the upper bound to u_0(B) = +∞, the search radius to r_1 = diag(B), the number of discretization points to N_1 = 1, and the list to L = {(B, l_0(B))}. Initialize the best known point so far by x*_0 = mid(B) and set the iteration counter to k = 1.

Step 2: Select and divide box. Choose a pair (X_k, v_k) ∈ L with smallest lower bound v_k and remove it from L. Divide X_k along a longest edge into X_k^1 and X_k^2. If the longest edge is not unique, choose the edge parallel to the coordinate axis with smallest index.

Step 3: Calculate lower bounds. Determine lower bounds v_k^1 = l(X_k^1) and v_k^2 = l(X_k^2) for the minimal values of P(X_k^1) and P(X_k^2). Add the pairs (X_k^1, v_k^1) and (X_k^2, v_k^2) to L.

Step 4: Update upper bound. Choose points x_k^1 ∈ X_k^1 and x_k^2 ∈ X_k^2.

Step 4a: Feasibility check.
If x_k^1 ∈ M(X_k^1) and x_k^2 ∈ M(X_k^2), put u_k(B) = min{u_{k−1}(B), f(x_k^1), f(x_k^2)} and choose x*_k ∈ {x*_{k−1}, x_k^1, x_k^2} with f(x*_k) = u_k(B).
If x_k^1 ∈ M(X_k^1) and x_k^2 ∉ M(X_k^2), put u_k(B) = min{u_{k−1}(B), f(x_k^1)} and choose x*_k ∈ {x*_{k−1}, x_k^1} with f(x*_k) = u_k(B).
If x_k^1 ∉ M(X_k^1) and x_k^2 ∈ M(X_k^2), put u_k(B) = min{u_{k−1}(B), f(x_k^2)} and choose x*_k ∈ {x*_{k−1}, x_k^2} with f(x*_k) = u_k(B).

Step 4b: Perturbation to feasibility. If x_k^1 ∉ M(B) and x_k^2 ∉ M(B), compute some optimal point (d_k, z_k) of

min_{(d,z) ∈ R^{n+1}} z   s.t.   Dg̃_i(mid(X_k)) d ≤ z, i ∈ I_0(X_k),   ‖d‖_∞ ≤ r_k.

If z_k < 0, put s = 1. While x_{k,s} = mid(X_k) + (s/N_k) d_k ∉ M(B) and s ≤ N_k, increment s. If x_{k,s} = mid(X_k) + (s/N_k) d_k ∈ M(B), put u_k(B) = min{u_{k−1}(B), f(x_{k,s})}, choose x*_k ∈ {x*_{k−1}, x_{k,s}} with f(x*_k) = u_k(B), and put N_{k+1} = 1 and r_{k+1} = γ r_k. Else put N_{k+1} = N_k + 1.

Step 5: Fathoming. Remove all pairs (X̃, ṽ) with ṽ > u_k(B) or ṽ = +∞ from L.

Step 6: Update lower bound. If L ≠ ∅, compute the new lower bound l_k(B) = min{ṽ ∈ R | (X̃, ṽ) ∈ L}.

Step 7: Termination criterion. If u_k(B) − l_k(B) ≤ ε or L = ∅, stop.

Step 8: Increment iteration counter. Increment k, go to Step 2.

The following result states that after finitely many iterations the feasible points x*_k, k ∈ N, identified by Algorithm 4.10 approximate some globally minimal point of P(B) arbitrarily well.

Proposition 4.11. Let M(B) be nonempty, let MFCQ hold at all globally minimal points of P(B), assume that in Algorithm 4.10 some convergent and monotone M-dependent lower bounding procedure is used, and that the infinite branch-and-bound procedure does not terminate. Then, for any δ > 0 there exist a globally minimal point x* of P(B) and some k ∈ N such that the feasible point x*_k satisfies ‖x*_k − x*‖ ≤ δ.

Proof. Let (X_{k_j})_{j∈N} be an exhaustive subsequence of the boxes selected in Step 2. If infinitely many of the points x*_{k_j}, j ∈ N, are chosen in Step 4a and, hence, are elements of X_{k_j}, then the assertion follows from Proposition 4.4. If, on the other hand, Step 4a is invoked only finitely often, then infinitely many of the points x*_{k_j} are generated in Step 4b by perturbation of mid(X_{k_j}) to a point in M(B). By Lemma 4.8, for some sufficiently large j ∈ N the point x*_{k_j} is feasible. Moreover, the sequence (mid(X_{k_j}))_{j∈N} has a cluster point at a globally minimal point of P(B) by Proposition 4.4, and the perturbation term λ_{k_j} d_{k_j} tends to zero for j → ∞. This shows the assertion.

In analogy to the proofs of Lemma 4.6 and Theorem 4.7, Proposition 4.11 implies our main convergence result.

Theorem 4.12. Let M(B) be nonempty, let MFCQ hold at all globally minimal points of P(B), assume that in Algorithm 4.10 some convergent and monotone M-dependent lower bounding procedure is used, and that the infinite branch-and-bound procedure does not terminate. Then, for any ε > 0, Algorithm 4.10 terminates after finitely many iterations.

Remark 4.13. Note that the assumption of MFCQ at all globally minimal points of P(B) is mild in the following sense: MFCQ is weaker than the so-called linear independence constraint qualification (LICQ), and in [10] it is shown that generically even LICQ holds at every point in M(B).

Remark 4.14. As discussed in Remark 2.8, under the assumptions of Theorem 4.12 the bounding operation of Algorithm 4.10 is consistent.

5 Numerical Results

In this section we present our numerical results. The algorithm is implemented in C++ and compiled with GNU g++, version 4.6. To integrate interval arithmetic, the library PROFIL/BIAS V2, see [12], is linked in. All occurring linear optimization problems are solved with the GNU Linear Programming Kit 4.45, see [14]. Bounds on the restrictions g_i are provided through the use of optimal centered forms as described in [4]. As a bounding procedure for calculating lower bounds on the objective function we apply the αBB relaxation from Example 2.4. In order to make sure that we really obtain lower bounds, we used an outer approximation method to solve the convex subproblems, see [11].

We applied the enhanced branch-and-bound framework to different test problems. In addition to the problems from Examples 3.1 and 3.2, a selection of problems from the Hock-Schittkowski collection [8] was examined. We identify these problems by HSk, where k denotes the corresponding number from the test collection. A brief overview of the optimization problems is given in Table 1. Here, n denotes the number of variables and p refers to the number of inequality constraints other than box constraints.

Problem        n    p    f(x*)
Example 3.1    2    1    ⁴√2 ≈ 1.18921
Example 3.2    1    1    0.0
HS005          2    0    -√3/2 - π/3 ≈ -1.91322
HS018          2    2    5.0
HS019          2    2    -6961.81381
HS021          2    1    -99.96
HS023          2    5    2.0
HS030          3    1    1.0
HS031          3    2    6.0
HS034          3    2    -ln(ln(10.0)) ≈ -0.83403
HS036          3    1    -3300.0
HS045          5    0    1.0
HS065          3    1    0.9535288567
HS095          6    4    0.015619514

Table 1: Overview of the test problems

Next we present the numerical results of our algorithm. The tolerance for the termination of the branch-and-bound mechanism is set to ε = 0.001, while the feasibility tolerance for the convex subproblems is set to ε_sub = 0.0001. As can be seen in Table 2, all problems are solved by our algorithm. The column LPs contains the number of linear programs which are solved in order to compute Mangasarian-Fromovitz directions. A comparison with Table 1 shows that we could determine valid lower and upper bounds within the given tolerance for all problems.

We also implemented the algorithm stated in Example 2.4 (compare also [3]). In Table 3 the results are presented for the case in which we do not allow updates of the upper bound at ε_f-feasible points. Observe that Example 3.1 is not solved correctly, as we expected from the analysis in Section 3. In that case the algorithm was terminated after a maximum number of 100,000 iterations.

Problem Iter. LPs u(b) l(b) u(b) f(x ) f(x ) l(b) Example 3.1 98 98 1.1902 1.1892 0.0010 0.0000 Example 3.2 4 3-0.0000-0.0001 0.0000 0.0001 HS005 10 6-1.9132-1.9133 0.0000 0.0001 HS018 51 46 5.0005 4.9999 0.0005 0.0001 HS019 531 531-6961.8138-6961.8139 0.0000 0.0001 HS021 1 0-99.9600-99.9600 0.0000 0.0000 HS023 55 55 2.0009 2.0000 0.0009 0.0000 HS030 1 0 1.0000 1.0000 0.0000 0.0000 HS031 169 162 6.0005 6.0000 0.0005 0.0000 HS034 38 38-0.8331-0.8340 0.0009 0.0000 HS036 40 34-3300.0000-3300.0003 0.0000 0.0003 HS045 166 144 1.0010 1.0000 0.0010 0.0000 HS065 29 22 0.9544 0.9534 0.0009 0.0001 HS095 21 17 0.0157 0.0156 0.0001 0.0000 Table 2: Results of Algorithm 4.10 Problem Iter. LPs u(b) l(b) u(b) f(x ) f(x ) l(x) Example 3.1 - - - 1.1892-0.0000 Example 3.2 4 0-0.0000-0.0001 0.0000 0.0001 HS005 10 0-1.9132-1.9133 0.0009 0.0001 HS018 53 0 5.0005 5.0000 0.0005 0.0000 HS019 673 0-6961.8138-6961.8139 0.0000 0.0001 HS021 1 0-99.9600-99.9600 0.0000 0.0000 HS023 98 0 2.0000 2.0000 0.0000 0.0000 HS030 1 0 1.0000 1.0000 0.0000 0.0000 HS031 209 0 6.0005 6.0000 0.0005 0.0000 HS034 40 0-0.8331-0.8340 0.0009 0.0000 HS036 40 0-3300.0000-3300.0003 0.0000 0.0003 HS045 166 0 1.0010 1.0000 0.0010 0.0000 HS065 32 0 0.9544 0.9535 0.0009 0.0000 HS095 21 0 0.0157 0.0156 0.0001 0.0000 Table 3: Standard αbb without acceptance of ε f -feasible points bound cannot be seen as a practical solution to solve the difficulties as we show in Table 4. Here, the value of u(b) for the problems Example 3.1, Example 3.2, HS018, HS019, HS023 and HS031 is lower than the globally minimal value from Table 1 and therefore not an upper bound. 25

Problem Iter. LPs u(b) l(b) u(b) f(x ) f(x ) l(b) Example 3.1 1 0 1.1657 1.1657-0.0235 0.0235 Example 3.2 1 0-9.3266-9.3266-9.3266 9.3266 HS005 10 0-1.9132-1.9133 0.0000 0.0000 HS018 24 0 4.9853 4.9852-0.0147 0.0148 HS019 20 0-6993.5395-6989.0644-31.7257 27.2506 HS021 1 0-99.9600-99.9600 0.0000 0.0000 HS023 37 0 1.8313 1.8313-0.1687 0.1687 HS030 1 0 1.0000 1.0000 0.0000 0.0000 HS031 21 0 5.6149 5.6148-0.3851 0.3852 HS034 4 0-0.8340-0.8340 0.0000 0.0000 HS036 40 0-3300.0000-3300.0003 0.0000 0.0000 HS045 166 0 1.0010 1.0000 0.0010 0.0000 HS065 1 0 0.9535 0.9535 0.0000 0.0000 HS095 21 0 0.0156 0.0156 0.0000 0.0000 Table 4: Standard αbb with acceptance of ε f -feasible points 6 Final Remarks In this paper we presented an enhanced branch-and-bound framework which leads to convergent algorithms for solving global optimization problems under weak conditions at the bounding procedures and the minimal points. However, there are still a few issues we would like to mention. First we point out that further research is necessary to include nonlinear equality restrictions. Although formally restrictions of the form h(x) = 0 with h : R n R can be integrated in the algorithm via the reformulation ±h(x) 0, this approach destroys the MFCQ which is essential for our concept of determining upper bounds. At least all results concerning the lower bounds as well as Proposition 4.4 remain true, but without valid upper bounds the termination tolerance cannot be guaranteed. As a second issue we would like to mention that we experimented with an alternative to calculate valid upper bounds. The main idea is to not only divide the box with the lowest bound in the list but also all boxes in a certain neighbourhood of this box. Then all midpoints of these boxes are checked for feasiblity. Although we could prove convergence, we do not recommend this approach due to the fact that the number of items in the list may increase exponentially. Our numerical tests support this observation. Finally, we believe that our approach may be adapted to other partition strategies as, for example, simplicial subdivision. The details are left for future research. 26