Projected Gradient Methods for Nonlinear Complementarity Problems via Normal Maps


Recent Advances in Nonsmooth Optimization, Eds. D.-Z. Du, L. Qi and R.S. Womersley, (c) 1995 World Scientific Publishers.

Projected Gradient Methods for Nonlinear Complementarity Problems via Normal Maps

Michael C. Ferris (1), University of Wisconsin-Madison, Computer Sciences Department, Madison, WI 53706, USA

Daniel Ralph (2), University of Melbourne, Department of Mathematics, Melbourne, Australia

Abstract

We present a new approach to solving nonlinear complementarity problems based on the normal map and adaptations of the projected gradient algorithm. We characterize a Gauss-Newton point for nonlinear complementarity problems and show that it is sufficient to check at most two cells of the related normal manifold to determine such points. Our algorithm uses the projected gradient method on one cell and $n$ rays to reduce the normed residual at the current point. Global convergence is shown under very weak assumptions using a property called nonstationary repulsion. A hybrid algorithm maintains global convergence, with quadratic local convergence under appropriate assumptions.

1 Introduction

The nonlinear complementarity problem is to find a vector $z \in \mathbb{R}^n$ satisfying
\[ f(z) \geq 0, \quad z \geq 0, \quad \langle f(z), z \rangle = 0, \tag{NCP} \]

(1) The work of this author was based on research supported by the National Science Foundation grant CCR and the Air Force Office of Scientific Research grant F.
(2) The work of this author was based on research partially supported by the U.S. Army Research Office through the Mathematical Sciences Institute, Cornell University, the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research, under grant DMS, and the Australian Research Council.

where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a smooth function and all vector inequalities are taken component-wise. In this paper, we will describe an algorithm for solving nonlinear complementarity problems that is computationally based on the projected gradient algorithm, and uses a reformulation of (NCP) as a system of nonsmooth equations. The algorithm is conceptually simple to implement and has a low cost per iteration, and we demonstrate its convergence properties assuming only that $f$ is continuously differentiable.

The problem (NCP) can be reformulated using a normal map:
\[ 0 = f_+(x) := f(x_+) + x - x_+, \tag{NE} \]
where $x_+$ is the Euclidean projection of $x$ onto $\mathbb{R}^n_+$. Note that $z$ solves (NCP) if and only if $z - f(z)$ solves (NE), and $x$ solves (NE) if and only if $x_+$ solves (NCP). Normal maps were introduced by Robinson in [32] (see also [29, 30]) and we note here simply that the formulation (NE) has some advantages over (NCP). For example, it is an equation rather than a system of inequalities and equalities, hence its examination from the viewpoint of equations may yield insight difficult to obtain otherwise. This has proven to be the case, as demonstrated by recent advances on nonsmooth Newton-like algorithms for (NE) in [5, 4, 12, 28, 34]. Nonsmoothness of the normal map, however, is the difficulty assumed.

In fact, normal maps such as $f_+$ can be cast in a more general framework, where $x_+$ is replaced by $\pi_C(x)$, the projection of $x$ onto a nonempty closed convex set $C$. In this context, finding a zero of the normal map
\[ f_C(x) := f(\pi_C(x)) + x - \pi_C(x) \]
is equivalent to a nonlinear variational inequality [11] defined by the set $C$ and the function $f$. In the special case where $C = \mathbb{R}^n_+$, $f_C = f_+$. For polyhedral $C$, the normal map [31, 33] $f_C$ is intimately related to the normal manifold [32]. This manifold is constructed using the faces of the set $C$; it is a collection of $n$-dimensional polyhedral sets (called cells) which partition $\mathbb{R}^n$. The normal map $f_C$ is smooth in each cell of $\mathbb{R}^n$; nondifferentiability can only occur as $x$ moves from one cell to another. A cell is sometimes called a piece of linearity. In the particular example resulting from nonlinear complementarity problems, where $C = \mathbb{R}^n_+$, the cells of the normal manifold are precisely the orthants of $\mathbb{R}^n$.

Practical Newton-like methods for (NE) solve a linear or piecewise linear model based at the $k$th iterate, $x^k$, to obtain the next iterate $x^{k+1}$. Unfortunately, this model is not always invertible and this creates problems for defining algorithms and in computing $x^{k+1}$. In this paper, we are concerned with defining practical algorithms with strong global convergence properties for finding zeros of normal maps. Our goal is to obtain convergence, at least on a subsequence, to a Gauss-Newton point for normal maps. This generalizes the familiar notion from nonlinear equation theory where a Gauss-Newton point is a stationary point for the problem of minimizing the Euclidean norm residual of the function.
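The normal-map residual is straightforward to compute. The following Python sketch (ours, not from the paper; the function names and the small linear test problem are illustrative assumptions) evaluates $f_+$ and the merit function used later in the paper, and illustrates the correspondence $z \mapsto z - f(z)$ between (NCP) and (NE).

```python
# Minimal sketch of the normal-map reformulation (NE); names are ours.
import numpy as np

def normal_map(f, x):
    """Evaluate f_+(x) = f(x_+) + x - x_+, where x_+ = max(x, 0)."""
    x_plus = np.maximum(x, 0.0)
    return f(x_plus) + x - x_plus

def residual(f, x):
    """theta(x) = 0.5 * ||f_+(x)||^2, the merit function minimized in the paper."""
    r = normal_map(f, x)
    return 0.5 * np.dot(r, r)

# Illustrative linear complementarity problem: f(z) = M z + q.
M = np.array([[2.0, 1.0], [1.0, 3.0]])
q = np.array([-1.0, -2.0])
f = lambda z: M @ z + q

z = np.array([0.4, 0.6])      # a candidate NCP point
x = z - f(z)                  # z solves (NCP) iff z - f(z) solves (NE)
print(normal_map(f, x), residual(f, x))
```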

We are ultimately interested in zeros of $f_C$, but finding one may be on the level of difficulty of finding zeros of general nonlinear functions. We revert to considering minimization of the residual function
\[ \theta(x) := \tfrac{1}{2} \| f_C(x) \|^2, \]
which gives us a measure of the violation of satisfying $f_C(x) = 0$. Our aim in this paper is to develop a robust algorithm for minimizing $\theta$ that has a low cost per iteration. Note that $\theta$ is a piecewise smooth function.

In order to motivate our definition of Gauss-Newton points, let us first examine the notion of a Gauss-Newton point for nonlinear equations. This corresponds to the case where $C = \mathbb{R}^n$ and $f_C = f$. A Gauss-Newton point for the smooth function $f$ is a point $x^* \in \mathbb{R}^n$ such that $x = x^*$ minimizes the first-order model $\tfrac{1}{2}\|f(x^*) + \nabla f(x^*)(x - x^*)\|^2$ of $\theta(x)$ over $\mathbb{R}^n$. For general $C$, we construct a piecewise linear model of the residual function based on the directional derivative $f_C'(x^*; \cdot)$.

There are several key ideas on which the development of this paper is based.

(i) The characterization of Gauss-Newton points for normal maps requires the stationarity of the residual function with respect to every cell that contains that Gauss-Newton point. Thus, for complementarity problems, we must examine up to $2^n$ orthants to determine whether or not $x$ is a Gauss-Newton point of $f_+$. Our first key result is to show that it is sufficient to check at most two of these cells, independent of the magnitude of $n$. An alternative characterization given in this paper shows that one cell and at most $n$ rays in neighboring cells need to be examined to verify stationarity of $\theta$ (or give a descent direction).

(ii) The inherent difficulty in defining an algorithm to determine a Gauss-Newton point is that one must be sure that the limit point of the algorithm is stationary for $\theta$ in each piece of smoothness (orthant) containing that limit point. The second key idea, motivated by the characterizations above, is to apply variants of the projected gradient method [2] simultaneously to a single cell and $n$ rays, to reduce $\theta$. This means that the work performed by the projected gradient algorithm at each step of the Gauss-Newton method is comparable to performing just two projected gradient steps.

(iii) Our algorithm depends heavily on the projected gradient method having Non-Stationary Repulsion or NSR (see Section 3). Simply stated, if an algorithm has NSR, then each nonstationary point has a neighborhood that can be visited by at most one iterate of the algorithm. The third key result is that the projected gradient algorithm and the adaptations that we use in our algorithm have NSR. This property forces our algorithm to generate a better point in a neighboring orthant if the limit point of the sequence is not stationary in such an orthant.

The paper is organized as follows. In Section 2 we define the notion of a Gauss-Newton point for $f_+$ and $\theta$ and prove several equivalent characterizations (Proposition 2.3).

We give a testable regularity condition (Definition 2.4) that guarantees that such Gauss-Newton points are solutions of (NE). Section 3 outlines the nonstationary repulsion property and shows that any algorithm having NSR possesses strong global convergence properties (Theorem 3.2). We prove several technical results that are key to the convergence of our algorithms. A special case of these results is used to show that the projected gradient algorithm has NSR (Theorem 3.6). Section 4 contains a description of three algorithms and their convergence properties. Our main convergence result, Theorem 4.3, proves that the Gauss-Newton method we present is extremely robust: assuming only continuous differentiability of $f$, every limit point of the method is stationary for $\theta$. No regularity assumptions on limit points are required. However, before proving this result, we outline a basic algorithm that can easily be shown to have NSR and hence global convergence under the same assumptions. Theorem 4.3 proves convergence of an extension to the basic algorithm that is motivated by the practical considerations of reducing the number of function and Jacobian evaluations. A Newton based hybrid method with global and local quadratic convergence is given in Subsection 4.3. Some simple examples of the use of these algorithms conclude the paper.

There have been many other research papers devoted to solving nonlinear complementarity problems. Some of the more recent papers are mentioned below. There are several types of Newton methods for solving nonsmooth equations; see Subsection 4.3 for a brief introduction. Here we mention the following references on Newton methods for nonsmooth equations and extensions: [5, 4, 6, 12, 13, 15, 16, 20, 21, 26, 27, 28, 34]. A feature shared by "pure" Newton methods is the need for an invertible model function at the current iteration; applying the inverse of this model yields the next iterate. However, singularities occur in many problems, for instance see [12], causing numerical difficulties for, or outright failure of, these methods. To circumvent the singularity problem, several Gauss-Newton techniques for solving nonlinear complementarity problems have been proposed. These can be found in the references [1, 9, 19, 23, 22, 24]. Alternative techniques can be found in [8, 10, 14, 17, 18, 36].

Most of the notation in this paper is standard. We use $\mathbb{R}^n$ to denote the $n$-dimensional real vector space, $\langle \cdot, \cdot \rangle$ for the inner product of two elements in this space, $\|\cdot\|$ for the associated Euclidean norm, and $\mathbb{B}$ for the corresponding ball of vectors $x$ such that $\|x\| \leq 1$. For a differentiable function $h : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla h(x) \in \mathbb{R}^{m \times n}$ represents the Jacobian of $h$ evaluated at $x$, and $\nabla h(x)^T$ represents the transpose of this matrix. If $h$ is only directionally differentiable, we denote the directional derivative mapping at $x$ by $h'(x; \cdot)$. Calligraphic upper case letters in general represent sets of indices, upper case letters represent sets or operators. If $C$ is a convex set, the normal cone to $C$ at a point $x \in C$ is
\[ N_C(x) := \{ y : \langle y, c - x \rangle \leq 0, \ \forall c \in C \}. \]

The tangent cone at $x \in C$ is defined by $T_C(x) := N_C(x)^\circ$, where for a given convex cone $K$, the polar cone is defined by
\[ K^\circ := \{ y : \langle y, k \rangle \leq 0, \ \forall k \in K \}. \]
Both the tangent and normal cones are empty at points $x \notin C$. The Euclidean projection of $x$ onto the set $C$ is represented by $\pi_C(x)$. A function $h : C \to \mathbb{R}^m$ is $C^1$ (continuously differentiable) if it is differentiable in the relative interior of $C$ and, for each sequence $\{x^k\}$ in the relative interior of $C$ that converges (to a general point of $C$), $\{\nabla h(x^k)\}$ is also convergent. If $C$ is a polyhedral convex set and $F$ is a face of $C$, then $N_C(x)$ is the same set for every $x$ in the relative interior of $F$ [32]. We call this set $N_C(F)$. A facet of $C$ is a face that has dimension 1 less than $C$. Further definitions from convex analysis can be found in [35]. We may abuse notation, when there is no possibility of confusion, by writing $\theta_O$ instead of $\theta|_O$ to mean the restriction of $\theta$ to an orthant $O$ (to be distinguished from a normal map involving $\pi_O$). Finally, throughout the paper the function $f : C \to \mathbb{R}^n$ is assumed to be $C^1$; and usually $C = \mathbb{R}^n_+$.

2 Gauss-Newton Points and Regularity

As we outlined in the introduction, a Gauss-Newton point for the smooth function $f$ is a point $x^* \in \mathbb{R}^n$ such that $x = x^*$ minimizes the first-order model $\tfrac{1}{2}\|f(x^*) + \nabla f(x^*)(x - x^*)\|^2$ of $\theta(x) = \tfrac{1}{2}\|f(x)\|^2$ over $\mathbb{R}^n$. Equivalently, $x^*$ is a stationary point of $\theta$, that is $\nabla\theta(x^*) = \nabla f(x^*)^T f(x^*) = 0$. Note again that for the remainder of this paper we assume that $f$ is continuously differentiable on its domain ($C$ or $\mathbb{R}^n_+$).

In the general case, we approximate the normal map $f_C(x)$ by the piecewise linear model $f_C(x^*) + f_C'(x^*; x - x^*)$, where the directional derivative $f_C'(x^*; \cdot)$ is a piecewise linear map. We can now define the notion of a Gauss-Newton point of $f_C$, which is based on this directional derivative.

Definition 2.1  Let $x^* \in \mathbb{R}^n$. We say $x^*$ is a Gauss-Newton point for $f_C$ if $x = x^*$ solves the problem
\[ \min_x \ \tfrac{1}{2} \| f_C(x^*) + f_C'(x^*; x - x^*) \|^2. \tag{1} \]
Equivalently, $x^*$ is a Gauss-Newton point if
\[ \tfrac{1}{2} \| f_C(x^*) \|^2 \leq \tfrac{1}{2} \| f_C(x^*) + f_C'(x^*; x - x^*) \|^2, \quad \forall x \in \mathbb{R}^n. \]

For the remainder of this paper we will consider only the special case of nonlinear complementarity problems, where $C = \mathbb{R}^n_+$. However, many of the results have analogues in the general polyhedral case.

2.1 Gauss-Newton points of complementarity problems

Using Definition 2.1, we see that $x^*$ is a Gauss-Newton point of $f_+$ if it solves (1) with $f_C = f_+$. To understand this more fully, we now investigate the directional derivative $f_+'$ in more detail. We can easily calculate the directional derivative of the function $x_+$ at $x$ in the direction $d$: it is the vector $x_+'(d)$ in $\mathbb{R}^n$ whose $i$th component is given by
\[ [x_+'(d)]_i = \begin{cases} d_i & \text{if } x_i > 0, \\ (d_i)_+ & \text{if } x_i = 0, \\ 0 & \text{if } x_i < 0. \end{cases} \]
In fact $x_+'(d)$ is exactly the projection of $d$ onto the critical cone of $\mathbb{R}^n_+$ at $x$, $K(x)$. This critical cone is the Cartesian product of $n$ intervals in $\mathbb{R}$, the $i$th interval being
\[ K_i = \begin{cases} \mathbb{R} & \text{if } x_i > 0, \\ \mathbb{R}_+ & \text{if } x_i = 0, \\ \{0\} & \text{if } x_i < 0. \end{cases} \]
Since $f$ is continuously differentiable, $f_+$ is directionally differentiable: for $x, d \in \mathbb{R}^n$,
\[ f_+'(x; d) = \nabla f(x_+)\,\pi_K(d) + d - \pi_K(d), \]
where the notation $K = K(x)$ is used. As a function of $d$, the mapping on the right is exactly the normal map induced by the matrix $\nabla f(x_+)$ and the convex cone $K$, so
\[ f_+'(x; d) = \nabla f(x_+)_K(d). \]

As mentioned above, the difficulty in determining whether a point $x$ is a Gauss-Newton point is that we must examine potentially exponentially many pieces of smoothness of $f_+$, or pieces of linearity of $\nabla f(x_+)_K$. In fact, the number of pieces of linearity of $\nabla f(x_+)_K$ is the number of orthants containing $x$, and is given by $2^m$ where $m$ is the number of components of $x$ equal to zero. The next result removes this difficulty by showing that at most two pieces of linearity need to be considered.

We introduce some notation. Given an orthant $O$, let $H_i$ be the half-line $\mathbb{R}_+$ or $-\mathbb{R}_+$, $i = 1, \ldots, n$, such that
\[ O = H_1 \times \cdots \times H_n. \]
The complement of $O$ at a point $x \in O$ is the orthant $\tilde O$ given as the Cartesian product of half-lines $\tilde H_i$, where
\[ \tilde H_i = \begin{cases} H_i & \text{if } x_i \neq 0, \\ -H_i & \text{if } x_i = 0. \end{cases} \]
It may seem odd that the complement of $O$ at an interior point $x$ is $O$ itself. This is actually quite natural in the context of stationary points of $\theta$, because $\theta$ is differentiable at each interior point $x$ of an orthant, hence the question of stationarity of $\theta$ at $x$ is independent of other orthants.
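A minimal Python sketch of these formulas, under the same assumptions as the earlier sketch (a callable $f$ and its Jacobian; the names are ours): it projects a direction onto the critical cone $K(x)$ and evaluates $f_+'(x; d)$. The exact zero tests on the components of $x$ are, of course, an idealization of the piecewise structure.

```python
import numpy as np

def proj_critical_cone(x, d):
    """Project d onto the critical cone K(x) of R^n_+ at x:
    K_i = R if x_i > 0, R_+ if x_i = 0, {0} if x_i < 0."""
    k = d.copy()
    k[x == 0] = np.maximum(d[x == 0], 0.0)   # (d_i)_+ on the components with x_i = 0
    k[x < 0] = 0.0
    return k

def dir_deriv_normal_map(f, jac, x, d):
    """f_+'(x; d) = Jf(x_+) pi_K(d) + d - pi_K(d)."""
    x_plus = np.maximum(x, 0.0)
    k = proj_critical_cone(x, d)
    return jac(x_plus) @ k + d - k
```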

We next introduce the formal definition of a stationary point.

Definition 2.2  If $\theta$ is directionally differentiable and $C$ is a nonempty convex set, then $x^*$ is a stationary point for $\min_{x \in C} \theta(x)$ if
\[ \theta'(x^*; d) \geq 0, \quad \forall d \in T_C(x^*). \]

Note that if $C = \mathbb{R}^n$, then a stationary point satisfies $\theta'(x^*; d) \geq 0$ for all $d \in \mathbb{R}^n$.

Proposition 2.3  Given $x^* \in \mathbb{R}^n$, let $K$ be the critical cone to $\mathbb{R}^n_+$ at $x^*$, $O$ be any orthant containing $x^*$ and $\tilde O$ be the complement of $O$ at $x^*$. Suppose $f$ is continuously differentiable; then the function $\theta$, defined by
\[ \theta(x) := \tfrac{1}{2}\|f_+(x)\|^2, \]
is directionally differentiable and
\[ \theta'(x^*; d) = \left\langle f_+(x^*),\, f_+'(x^*; d) \right\rangle, \quad \forall d \in \mathbb{R}^n. \tag{2} \]
The following statements are equivalent:

1. $x^*$ is a Gauss-Newton point of $f_+$.
2. $x^*$ is a stationary point of $\min\{\theta(x) : x \in \mathbb{R}^n\}$.
3. $0 \in \nabla f(x^*_+)^T f_+(x^*) + K^\circ$ and $0 \in f_+(x^*) + K$.
4. $x^*$ is stationary for both $\min\{\theta(x) : x \in O\}$ and $\min\{\theta(x) : x \in \tilde O\}$.
5. $x^*$ is stationary for $\min\{\theta(x) : x \in O\}$ and for each 1-dimensional problem
\[ \min\{\theta(x) : x \in x^* + N_O(F)\}, \]
where $F$ is a facet of $O$ containing $x^*$.

Proof  If statement 1 holds, then we define
\[ \Theta(x) := \tfrac{1}{2}\left\|f_+(x^*) + f_+'(x^*; x - x^*)\right\|^2, \]
and note that $\Theta'(x^*; h) = \theta'(x^*; h)$ for all $h$. Since $x^*$ is a Gauss-Newton point, it follows that $\lambda\theta'(x^*; d) + o(\lambda) \geq 0$ for all $d$, and hence that statement 2 holds by positive homogeneity. Conversely, if statement 2 holds, then for all $d$ and $\lambda > 0$, $0 \leq \langle f_+(x^*), f_+'(x^*; d)\rangle$, so that
\[ \Theta(x^* + \lambda d) = \tfrac{1}{2}\|f_+(x^*) + f_+'(x^*; \lambda d)\|^2 = \tfrac{1}{2}\|f_+(x^*) + \lambda f_+'(x^*; d)\|^2 \geq \tfrac{1}{2}\|f_+(x^*)\|^2 + \tfrac{1}{2}\lambda^2\|f_+'(x^*; d)\|^2 \geq \Theta(x^*). \]
Hence statement 1 holds.

Statement 2 means that $\langle f_+(x^*), \nabla f(x^*_+)_K(d)\rangle \geq 0$ for all $d \in \mathbb{R}^n$. If $d \in K$, then $\nabla f(x^*_+)_K(d) = \nabla f(x^*_+)d$, so that $\langle f_+(x^*), \nabla f(x^*_+)k\rangle \geq 0$ for all $k \in K$. Similarly, $\langle f_+(x^*), \nu\rangle \geq 0$ for all $\nu \in K^\circ$. This is exactly statement 3. Conversely, let $d \in \mathbb{R}^n$, and recall from the Moreau decomposition that $d = k + \nu$ where $k = \pi_K(d)$ and $\nu = \pi_{K^\circ}(d)$. Using statement 3,
\[ \left\langle f_+(x^*), \nabla f(x^*_+)_K(d)\right\rangle = \left\langle f_+(x^*), \nabla f(x^*_+)k\right\rangle + \left\langle f_+(x^*), \nu\right\rangle \geq 0. \]
Thus statement 2 holds.

Clearly statement 2 implies statement 4. Suppose statement 4 holds. Consider a facet $F$ of $O$ containing $x^*$. There is a unique index $1 \leq i \leq n$ such that neither $e^i$ nor $-e^i$ lies in $F$, where $e^i$ is the vector in $\mathbb{R}^n$ with component $i$ equal to 1 and all other components equal to zero. Choose $s = \pm 1$ such that $s e^i \notin O$; then $N_O(F) = \{\lambda s e^i : \lambda \geq 0\}$. Note further that $s e^i \in \tilde O$. Thus statement 4 implies statement 5.

Suppose statement 5 holds and consider $e = \pm e^i$ for any index $i$. If $x^*_i \neq 0$, then stationarity of $x^*$ for $\min\{\theta(x) : x \in O\}$ yields that $\theta'(x^*; e) \geq 0$. If $x^*_i = 0$, then either $e \in O$ or $e \in N_O(F)$ for some facet $F$ of $O$ containing $x^*$. Therefore $\theta'(x^*; e) \geq 0$. It follows by linearity of $\theta'(x^*; \cdot)$ on each orthant that $\theta'(x^*; d) \geq 0$ for each $d$ in each orthant, hence for $d \in \mathbb{R}^n$. This is statement 2.
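Because $\theta'(x; \cdot)$ is linear on each orthant of directions, nonnegativity of $\theta'(x; \pm e^i)$ for every index $i$ already implies statement 2. The sketch below (ours; it reuses `normal_map` and `dir_deriv_normal_map` from the earlier sketches) tests this finite set of directions.

```python
import numpy as np

def is_gauss_newton_point(f, jac, x, tol=1e-10):
    """Check statement 2 of Proposition 2.3 by testing theta'(x; d) >= 0 for the
    2n signed coordinate directions; linearity of theta'(x; .) on each orthant
    of directions then gives nonnegativity for every d."""
    n = x.size
    r = normal_map(f, x)                      # f_+(x), from the earlier sketch
    for i in range(n):
        for s in (+1.0, -1.0):
            d = np.zeros(n)
            d[i] = s
            if np.dot(r, dir_deriv_normal_map(f, jac, x, d)) < -tol:
                return False
    return True
```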

The proof of the equivalence between statements 1, 2 and 3 in Proposition 2.3 can be immediately adapted to the case of a general polyhedral set $C$, with $K$ then representing the critical cone to $C$ at the point $x^*$.

2.2 Regularity

We now turn to the question of when a Gauss-Newton point for $f_+$ is a solution of $f_+(x) = 0$. This is commonly called regularity, and we introduce a notion of regularity that is pertinent to our Gauss-Newton formulation. Recall from Proposition 2.3 that $x$ is a Gauss-Newton point if and only if
\[ -f_+(x) \in K, \quad -\nabla f(x_+)^T f_+(x) \in K^\circ, \]
where $K$ is the critical cone to $\mathbb{R}^n_+$ at $x$. A simple regularity condition would be
\[ -f_+(x) \in K, \ -\nabla f(x_+)^T f_+(x) \in K^\circ \ \Longrightarrow \ f_+(x) = 0. \]
However, this condition is difficult to verify in most practical instances.

In order to generate a more testable notion of regularity, we follow the development of Moré [19]. Here, $f_+(x)$ is replaced by a general vector $z$, and extra conditions that are satisfied by $f_+(x)$ are used to weaken the regularity assumption.

Thus we define
\[ P := \{i : x_i > 0,\ [f_+(x)]_i > 0\}, \quad N := \{i : x_i > 0,\ [f_+(x)]_i < 0\}, \quad C := \{i : [f_+(x)]_i = 0\}, \]
and we note that $[f_+(x)]_P > 0$, $[f_+(x)]_N < 0$ and $[f_+(x)]_C = 0$.

Definition 2.4  A point $x \in \mathbb{R}^n$ is said to be regular if the only $z$ satisfying
\[ -z \in K, \quad -\nabla f(x_+)^T z \in K^\circ, \quad z_P \geq 0, \quad z_N \leq 0, \quad z_C = 0 \]
is $z = 0$.

This condition is closely related to [19, Definition 3.1]. This is because $x$ is regular if and only if
\[ z \neq 0, \ -z \in K, \ z_P \geq 0, \ z_N \leq 0, \ z_C = 0 \ \Longrightarrow \ -\nabla f(x_+)^T z \notin K^\circ, \]
and the condition on the right is equivalent to existence of $p \in -K$ such that $z^T\nabla f(x_+)p > 0$. In contrast to [19, 22], the point $x$ is not constrained to be nonnegative. Using Definition 2.4, we can prove the following result.

Lemma 2.5  $x$ is a regular stationary point for $\theta$ if and only if $x$ solves (NE).

Proof  If $f_+(x) = 0$ then $C = \{1, \ldots, n\}$, so $z = z_C = 0$. Further, using (2), $x$ is stationary for $\theta$. Conversely, if $x$ is stationary, then Proposition 2.3 shows that $z = f_+(x)$ satisfies all the relations required in the definition of regularity, and hence $f_+(x) = 0$.

We turn to the question of testing whether a point $x$ is regular. [19, 22, 28] give several conditions on the Jacobian of $f$ to ensure that $x$ is regular in the sense defined in the corresponding paper. For brevity we only discuss the s-regularity condition of Pang and Gabriel [22], and do not repeat definitions here. Moré [19] argues that s-regularity is stronger than his regularity condition; a similar comparison between Definition 2.4 and s-regularity can be made.

Here we make a new observation about s-regularity. To explain this, recall that the goal of [22] is to solve $0 = \Phi(x) := (1/2)\|\min\{f(x), x\}\|^2$, where the min is taken component-wise; a solution of this equation solves (NCP) and vice versa. If $\bar x$ is nonstationary for $\Phi$, then s-regularity of $\bar x$ ensures that for some direction $y \in \mathbb{R}^n$ and all $x$ near $\bar x$, $y$ is a (strict) descent direction for $\Phi$ at $x$, i.e. $\Phi'(x; y) < 0$; see [22, Lemmas 2, 6 and 7]. However $\Phi$ (like $\theta$) is only piecewise smooth, and may have a nonstationary point which is a local minimum of some piece of smoothness of $\Phi$, contradicting the existence of such a direction $y$. So s-regularity is too strong in the context of this investigation.

In what follows, we give conditions that ensure $x$ to be regular in the sense of Definition 2.4. These results are proven by adapting arguments from Moré [19]. A key construct in the results is the matrix
\[ J(x) := T^{-1}\nabla f(x_+)T^{-1}, \quad \text{where } T = \operatorname{diag}(t_i), \quad t_i = \begin{cases} 1 & \text{if } i \in P, \\ -1 & \text{if } i \notin P. \end{cases} \]
$T$ is chosen so that every component of $\tilde z := Tz$ is nonnegative. Under this transformation, $x$ is regular if
\[ 0 \neq \tilde z \geq 0, \ \tilde z_C = 0, \ \tilde z \in -TK \ \Longrightarrow \ \exists\,\tilde p \in -TK, \ \tilde z^T J(x)\tilde p > 0, \]
where $\Delta := \{i : [f_+(x)]_i \neq 0,\ x_i \geq 0\}$; note that these conditions force $\tilde z_i = 0$ when $i \notin \Delta$. The results we now give impose conditions on $J(x)$ to guarantee regularity. We note that $A \in \mathbb{R}^{n \times n}$ is an S-matrix if there is an $x > 0$ with $Ax > 0$, see [3].

Theorem 2.6  Let $J(x) = T^{-1}\nabla f(x_+)T^{-1}$. If $[J(x)]_{EE}$ is an S-matrix for some index set $E$ with $\Delta \subseteq E \subseteq \{i : x_i \geq 0\}$, then $x$ is regular.

Proof  Since $[J(x)]_{EE}$ is an S-matrix, there is some $\tilde p_E > 0$ such that $[J(x)]_{EE}\tilde p_E > 0$. Let $\tilde p$ be the vector in $\mathbb{R}^n$ obtained by setting the other elements to zero, so that $[J(x)\tilde p]_E > 0$. Now $0 \neq \tilde z \geq 0$ and $\Delta \subseteq E$, so $\tilde z^T J(x)\tilde p > 0$. Also, $-TK$ is the Cartesian product of
\[ (-TK)_i = \begin{cases} \mathbb{R} & \text{if } x_i > 0, \\ \mathbb{R}_+ & \text{if } x_i = 0, \\ \{0\} & \text{if } x_i < 0, \end{cases} \quad i = 1, \ldots, n. \tag{3} \]
Thus $\tilde p \in -TK$ and hence $x$ is regular.

$A$ is a P-matrix if all its principal minors are positive. P-matrices are S-matrices [3, Corollary 3.3.5]. The following corollary is now immediate.

Corollary 2.7  If $[\nabla f(x_+)]_{\Delta\Delta}$ is a P-matrix, then $x$ is regular.

Proof  The hypotheses imply that $[J(x)]_{\Delta\Delta}$ is a P-matrix and hence an S-matrix.
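The S-matrix test in Theorem 2.6 can be checked by linear programming: $A$ is an S-matrix iff the system $\{x \geq 0,\ Ax \geq \mathbf{1}\}$ is feasible (scale a solution of $x > 0$, $Ax > 0$, or perturb a feasible $x$ by a small positive vector). Below is a hedged Python sketch with the particular choice $E = \Delta$, using SciPy's `linprog`; the construction of $T$, $\Delta$ and $J(x)$ follows the discussion above and is our reading of the text, not the authors' code.

```python
import numpy as np
from scipy.optimize import linprog

def is_s_matrix(A):
    """A is an S-matrix iff some x > 0 has Ax > 0, iff {x >= 0, Ax >= 1} is feasible."""
    n = A.shape[0]
    res = linprog(c=np.zeros(n), A_ub=-A, b_ub=-np.ones(n),
                  bounds=[(0, None)] * n, method="highs")
    return res.status == 0        # 0: solved (feasible), 2: infeasible

def regular_by_theorem_2_6(f, jac, x):
    """Sufficient test for regularity (Theorem 2.6) with E = Delta:
    check that [J(x)]_{Delta,Delta} is an S-matrix, where J(x) = T Jf(x_+) T."""
    x_plus = np.maximum(x, 0.0)
    r = f(x_plus) + x - x_plus                      # f_+(x)
    P = (x > 0) & (r > 0)
    t = np.where(P, 1.0, -1.0)                      # T = diag(t); T^{-1} = T
    J = (jac(x_plus) * t) * t[:, None]              # T Jf(x_+) T
    delta = (r != 0) & (x >= 0)                     # index set Delta
    if not delta.any():
        return True                                 # only z = 0 is admissible
    return is_s_matrix(J[np.ix_(delta, delta)])
```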

To complete our discussion of tests for regularity, we give the following result. Recall that if $A$ is partitioned in the form
\[ A = \begin{bmatrix} A_{NN} & A_{NM} \\ A_{MN} & A_{MM} \end{bmatrix}, \]
and the matrix $A_{NN}$ is nonsingular, then $(A / A_{NN}) := A_{MM} - A_{MN}A_{NN}^{-1}A_{NM}$ is called the Schur complement of $A_{NN}$ in $A$. The proof of the following result is modeled after [19, Corollary 4.6].

Theorem 2.8  If $[\nabla f(x_+)]_{NN}$ is nonsingular and the Schur complement of $[\nabla f(x_+)]_{NN}$ in $[J(x)]_{\Delta\Delta}$ is an S-matrix, then $x$ is regular.

Proof  Let $A = [J(x)]_{\Delta\Delta}$ and partition $A$ into
\[ \begin{bmatrix} A_{NN} & A_{NM} \\ A_{MN} & A_{MM} \end{bmatrix}, \]
where $A_{NN} = [\nabla f(x_+)]_{NN}$ and $M := \Delta \setminus N$. We construct $\tilde p_N$, $\tilde p_M$ such that
\[ [J(x)]_{\Delta\Delta}\begin{bmatrix} \tilde p_N \\ \tilde p_M \end{bmatrix} > 0. \tag{4} \]
Let $a > 0$; then $\tilde p_N$, $\tilde p_M$ solve
\[ \begin{bmatrix} A_{NN} & A_{NM} \\ A_{MN} & A_{MM} \end{bmatrix}\begin{bmatrix} \tilde p_N \\ \tilde p_M \end{bmatrix} = \begin{bmatrix} a \\ q \end{bmatrix} \]
if and only if $\tilde p_N$, $\tilde p_M$ solve
\[ \begin{bmatrix} A_{NN} & A_{NM} \\ 0 & (A/A_{NN}) \end{bmatrix}\begin{bmatrix} \tilde p_N \\ \tilde p_M \end{bmatrix} = \begin{bmatrix} a \\ q - A_{MN}A_{NN}^{-1}a \end{bmatrix}. \]
Since $(A/A_{NN})$ is an S-matrix by assumption, there exists $\tilde p_M > 0$ with $(A/A_{NN})\tilde p_M > 0$. Multiplying $\tilde p_M$ by an appropriately large number gives $(A/A_{NN})\tilde p_M + A_{MN}A_{NN}^{-1}a > 0$. It follows that $q := (A/A_{NN})\tilde p_M + A_{MN}A_{NN}^{-1}a > 0$, and taking $\tilde p_N = A_{NN}^{-1}(a - A_{NM}\tilde p_M)$ implies (4). Let $\tilde p \in \mathbb{R}^n$ be the vector constructed from $\tilde p_M$ and $\tilde p_N$ by adding appropriate zeros. Then it is easy to see that $\tilde p \in -TK$, see (3). Furthermore, $\tilde z^T J(x)\tilde p = \tilde z_\Delta^T[J(x)\tilde p]_\Delta > 0$. Hence $x$ is regular.

Note that [5, 12, 28] all assume that
\[ [\nabla f(x_+)]_{EE} \text{ is nonsingular and } ([\nabla f(x_+)]_{LL} / [\nabla f(x_+)]_{EE}) \text{ is a P-matrix}. \tag{5} \]
Here $E := \{i : x_i > 0\}$ contains $N$, and $L := \{i : x_i \geq 0\}$ contains $\Delta$. Theorem 2.8 requires the non-singularity of a smaller matrix and a weaker assumption on the Schur complement. However, (5) guarantees regularity in the sense of Definition 2.4, as we now show.

Lemma 2.9  If (5) holds or, equivalently, the B-derivative $f_+'(x; \cdot)$ is invertible, then
\[ -z \in K, \ -\nabla f(x_+)^T z \in K^\circ \ \Longrightarrow \ z = 0, \]
and so $x$ is regular.

Proof  The equivalence between (5) and the existence of a Lipschitz inverse of $f_+'(x; \cdot)$ is given by [28, Proposition 12]. Since all piecewise linear functions are Lipschitz, the claimed equivalence holds. Suppose $-z \in K$ and $-\nabla f(x_+)^T z \in K^\circ$. It follows that $z_i = 0$, $i \notin L$. Also $-\nabla f(x_+)^T z \in K^\circ$ implies
\[ \left[\nabla f(x_+)^T z\right]_E = 0, \quad \left[\nabla f(x_+)^T z\right]_M \geq 0, \]
where $M := L \setminus E$. Using the invertibility assumption from (5) and $-z \in K$ again, we see that
\[ \left([\nabla f(x_+)^T]_{MM} - [\nabla f(x_+)^T]_{ME}[\nabla f(x_+)^T]_{EE}^{-1}[\nabla f(x_+)^T]_{EM}\right)z_M \geq 0, \quad z_M \leq 0. \]
The Schur complement is a P-matrix and hence $z = 0$ follows from [3, Theorem 3.3.4].

3 Nonstationary Repulsion (NSR) of the Projected Gradient Method

Let $\Omega$ be a nonempty closed convex set in $\mathbb{R}^n$ and $\psi : \Omega \to \mathbb{R}$ be $C^1$. (We are thinking of $\Omega$ being an orthant $O$ and $\psi = \theta|_O$.) We paraphrase the description of the projected gradient (PG) algorithm given by Calamai and Moré [2] for the problem
\[ \min_{x \in \Omega} \psi(x). \tag{6} \]
For any $\lambda > 0$, the first-order necessary condition for $x$ to be a local minimizer of this problem is that
\[ \pi_\Omega(x - \lambda\nabla\psi(x)) = x. \]
When $x^k \in \Omega$ is nonstationary, a step length $\lambda_k > 0$ is chosen by searching the path
\[ x^k(\lambda) := \pi_\Omega(x^k - \lambda\nabla\psi(x^k)), \quad \lambda > 0. \]
Given constants $\gamma_1 > 0$, $\gamma_2 \in (0, 1)$, and $\mu_1$ and $\mu_2$ with $0 < \mu_1 \leq \mu_2 < 1$, the step length $\lambda_k$ must satisfy
\[ \psi(x^k(\lambda_k)) \leq \psi(x^k) + \mu_1\left\langle\nabla\psi(x^k), x^k(\lambda_k) - x^k\right\rangle \tag{7} \]
and
\[ \lambda_k \geq \gamma_1 \quad \text{or} \quad \lambda_k \geq \gamma_2\bar\lambda_k > 0, \tag{8} \]
where $\bar\lambda_k$ satisfies
\[ \psi(x^k(\bar\lambda_k)) > \psi(x^k) + \mu_2\left\langle\nabla\psi(x^k), x^k(\bar\lambda_k) - x^k\right\rangle. \tag{9} \]
Condition (7) forces $\lambda_k$ not to be too large; it is the analogue of the condition used in the standard Armijo line search for unconstrained optimization. Condition (8) forces $\lambda_k$ not to be too small; in the case that $\lambda_k < \gamma_1$, this requirement is the analogue of the standard Wolfe-Goldstein [7] condition from unconstrained optimization.
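A backtracking implementation of one PG step is enough to satisfy (7)-(9): accepting the first $\lambda$ in $\{\gamma_1, \gamma_2\gamma_1, \gamma_2^2\gamma_1, \ldots\}$ that passes (7) either returns $\lambda \geq \gamma_1$, or returns $\gamma_2\bar\lambda$ where the last rejected $\bar\lambda$ violates (7) and hence (9) for any $\mu_2 \geq \mu_1$. The Python sketch below is ours (the symbols $\Omega$, $\psi$ are abstracted through callables); it is reused as a subroutine in the later sketches.

```python
import numpy as np

def pg_step(grad, psi, proj, x, lam0=1.0, mu1=1e-4, gamma2=0.5, max_back=50):
    """One projected-gradient step on min{psi(x): x in Omega}: backtrack along
    x(lam) = proj(x - lam*grad(x)) until the sufficient-decrease condition (7)
    holds.  Starting at lam0 (= gamma_1) and scaling by gamma2 enforces (8)-(9)
    for any mu2 >= mu1, since the last rejected trial violates (7)."""
    g, psi_x = grad(x), psi(x)
    lam = lam0
    for _ in range(max_back):
        x_new = proj(x - lam * g)
        if psi(x_new) <= psi_x + mu1 * np.dot(g, x_new - x):
            return x_new, lam
        lam *= gamma2
    return x, 0.0                 # no acceptable step found (x is near-stationary)

# Projection onto the orthant with sign pattern s (s_i = +/-1):
proj_orthant = lambda s: (lambda y: np.where(s > 0, np.maximum(y, 0.0),
                                             np.minimum(y, 0.0)))
```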

The PG method is a feasible point algorithm in that it requires a starting point $x^0$ in $\Omega$ and produces a sequence of iterates $\{x^k\} \subseteq \Omega$. It is also monotonic, that is, if $x^k \in \Omega$ is nonstationary, then $\psi(x^{k+1}) \leq \psi(x^k)$. We claim that the PG method has NSR:

Definition 3.1  An iterative feasible point algorithm for (6) has nonstationary repulsion (NSR) if for each nonstationary $\bar x \in \Omega$, there exists a neighborhood $V$ of $\bar x$ such that if any iterate $x^k$ lies in $V \cap \Omega$, then $\psi(x^{k+1}) < \psi(\bar x)$.

The fact that the steepest descent method, i.e. the PG method when $\Omega = \mathbb{R}^n$, has NSR is easy to see. Also, Polak [25, Chapter 1] discusses a general descent property that is similar to NSR and provides convergence results like Theorem 3.2 below. It is trivial but important that NSR yields strong global convergence properties:

Theorem 3.2  Suppose $\mathcal{A}$ is a monotonic feasible point algorithm for (6) with NSR. Let $x^0 \in \Omega$.

1. Any limit point of the sequence generated by $\mathcal{A}$ is stationary.

2. Let $\mathcal{B}$ be any monotonic feasible point algorithm for (6). Suppose $\{x^k\}$ is a sequence defined by applying either $\mathcal{A}$ or $\mathcal{B}$ to each $x^k$. Then any limit point of $\{x^k\}_{k \in \mathcal{K}}$ is stationary if $\mathcal{A}$ is applied infinitely many times, where $\mathcal{K} = \{k : x^{k+1} \text{ is generated by } \mathcal{A}\}$.

Proof  1. This is a corollary of part 2 of the theorem.

2. Let $\bar x \in \Omega$ be nonstationary for (6) and $\mathcal{K}$ have infinite cardinality. NSR gives $\epsilon > 0$ such that $\psi(x^{k+1}) < \psi(\bar x)$ if $x^k \in (\bar x + \epsilon\mathbb{B}) \cap \Omega$. If the subsequence $\{x^k\}_{\mathcal{K}}$ does not intersect $(\bar x + \epsilon\mathbb{B}) \cap \Omega$, then $\bar x$ is not a limit point of this subsequence. So we assume that $x^k \in (\bar x + \epsilon\mathbb{B}) \cap \Omega$ for some $k \in \mathcal{K}$, hence $\psi(x^{k+1}) < \psi(\bar x)$. By continuity of $\psi$ there is $\epsilon_1 \in (0, \epsilon)$ such that $\psi(x) > \psi(x^{k+1})$ if $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$. By monotonicity of $\mathcal{A}$ and $\mathcal{B}$, $\psi(x^{k+j}) \leq \psi(x^{k+1})$ for each $j \geq 1$; hence $\bar x$ is not a limit point of $\{x^{k+j}\}_{j \geq 1}$, or of $\{x^k\}_{\mathcal{K}}$.

Of course NSR is not a guarantee of convergence. To guarantee existence of a limit point of a sequence produced by a method with NSR we need additional knowledge, for instance boundedness of the lower level set
\[ \{x : \psi(x) \leq \psi(x^0)\}, \]
where $x^0$ is the starting iterate.

To prove that the PG method has NSR we need to establish that the rate of descent obtained along the path $x(\lambda) = \pi_\Omega(x - \lambda\nabla\psi(x))$ is uniform for feasible $x$ in a neighborhood of a given nonstationary $\bar x \in \Omega$. The lemma below states a uniform descent property for all small perturbations $\phi$ about a given function $\psi$; the reader may consider $\phi = \psi$ for simplicity. In the case where many functions are present we use the notation
\[ x_\phi(\lambda) := \pi_\Omega(x - \lambda\nabla\phi(x)). \]

Definition 3.3
1. Let $\bar x \in \mathbb{R}^n$ and $\bar\Delta > 0$. If $\phi : \Omega \to \mathbb{R}$ is $C^1$, the modulus of continuity of $\nabla\phi$ at $\bar x$ is the function of $\bar\Delta$ and $\delta > 0$ (and $\bar x$, $\Omega$)
\[ \omega_\phi(\bar\Delta, \delta) := \sup\left\{\|\nabla\phi(y) - \nabla\phi(x)\| : x, y \in \Omega,\ \|x - \bar x\| \leq \bar\Delta,\ \|x - y\| \leq \delta\right\}. \]
2. Let $\bar x \in \mathbb{R}^n$, $\bar\Delta > 0$, and $\psi : \mathbb{R}^n \to \mathbb{R}$ be $C^1$. Given $\epsilon > 0$, let $U(\epsilon) = U(\epsilon; \bar\Delta, \bar x, \psi)$ be the set of all $C^1$ functions $\phi : \Omega \to \mathbb{R}$ such that
\[ \sup\left\{|\phi(x) - \psi(x)| + \|\nabla\phi(x) - \nabla\psi(x)\| : x \in (\bar x + \bar\Delta\mathbb{B}) \cap \Omega\right\} < \epsilon, \tag{10} \]
and
\[ \omega_\phi(\bar\Delta, \delta) \leq (1 + \epsilon)\,\omega_\psi(\bar\Delta, \delta), \quad \forall\delta \in (0, \bar\Delta). \tag{11} \]

Lemma 3.4  Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be $C^1$, $\bar\Delta > 0$ and $\bar x \in \Omega$ be nonstationary for $\min\{\psi(x) : x \in \Omega\}$. There exist positive constants $\epsilon$, $\bar\lambda$ and $\rho$ such that for each $\phi \in U(\epsilon) = U(\epsilon; \bar\Delta, \bar x, \psi)$, $x \in (\bar x + \epsilon\mathbb{B}) \cap \Omega$, and $\lambda \geq 0$,
\[ \left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle \leq -\rho\min\{\lambda, \bar\lambda\}. \tag{12} \]

Proof  Let $\phi : \Omega \to \mathbb{R}$ be $C^1$, $x \in \Omega$ and $\lambda > 0$. According to [2, (2.4)],
\[ \left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle \leq -\|x_\phi(\lambda) - x\|^2/\lambda, \quad \forall x \in \Omega,\ \lambda > 0. \]
Moreover, [2, Lemma 2.2] says that, as a function of $\lambda > 0$, $\|x_\phi(\lambda) - x\|/\lambda$ is antitone (nonincreasing); in particular for any $\bar\lambda > 0$,
\[ \|x_\phi(\lambda) - x\|/\lambda \geq \|x_\phi(\bar\lambda) - x\|/\bar\lambda, \quad \forall\lambda \in (0, \bar\lambda). \]
Using this with the previous inequality, we deduce for any $\bar\lambda > 0$ that
\[ \left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle \leq -\lambda\left(\|x_\phi(\lambda) - x\|/\lambda\right)^2, \quad \forall x \in \Omega,\ \lambda > 0, \qquad \leq -\lambda\left(\|x_\phi(\bar\lambda) - x\|/\bar\lambda\right)^2, \quad \forall x \in \Omega,\ 0 < \lambda < \bar\lambda. \tag{13} \]

Fix $\bar\lambda > 0$. By hypothesis, the point $\bar x$ is such that $\|\bar x_\psi(\bar\lambda) - \bar x\| > 0$. Also by (10), if $x \to \bar x$ and $\phi \to \psi$, where convergence of $\phi$ means that $\phi \in U(\epsilon)$ and $\epsilon \downarrow 0$, then $x_\phi(\bar\lambda)$ converges to $\bar x_\psi(\bar\lambda)$. Hence there are $\epsilon > 0$, $\rho > 0$ such that for $\phi \in U(\epsilon)$ and $x \in \bar x + \epsilon\mathbb{B}$,
\[ \|x_\phi(\bar\lambda) - x\| \geq \bar\lambda\sqrt{\rho}. \]
Together with (13), this yields
\[ \left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle \leq -\rho\lambda, \quad \forall x \in (\bar x + \epsilon\mathbb{B}) \cap \Omega,\ \lambda \in (0, \bar\lambda], \]
so that (12) holds for $0 \leq \lambda \leq \bar\lambda$. Using the well known antitone property of $\langle\nabla\phi(x), x_\phi(\lambda) - x\rangle$ in $\lambda \geq 0$, see [2, (2.6)], we see that (12) also holds for $\lambda > \bar\lambda$.

The following result gives some technical properties of the PG method that will be important for our main algorithm. We use it later to prove that the PG method has NSR, though in this case NSR follows from the simpler case in which $\phi = \psi$ is a fixed function.

Proposition 3.5  Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be $C^1$, $\bar\Delta > 0$ and $\bar x \in \Omega$ be nonstationary for $\min\{\psi(x) : x \in \Omega\}$. Then there are positive constants $\bar\epsilon$ and $\lambda_*$ such that for each $x \in (\bar x + \bar\epsilon\mathbb{B}) \cap \Omega$ and $\phi \in U(\bar\epsilon) = U(\bar\epsilon; \bar\Delta, \bar x, \psi)$:

1. For each $\lambda \in [0, \lambda_*]$,
\[ \phi(x_\phi(\lambda)) \leq \phi(x) + \mu_1\left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle. \]
2. One step of PG on (6) from $x$ generates $x_\phi(\lambda)$ with $\lambda \geq \lambda_*$.

Proof  Suppose $\psi$, $\bar x$ and $\bar\Delta$ are as stated. Let $\gamma_1 > 0$, $\gamma_2 \in (0, 1)$ and $0 < \mu_1 \leq \mu_2 < 1$ be the constants of the PG method. Let $\epsilon_1$, $\bar\lambda$ and $\rho$ be given by Lemma 3.4, and $\phi \in U(\epsilon_1)$; and assume without loss of generality that $\epsilon_1 \in (0, \min\{\bar\Delta, \bar\lambda\}]$, so that in particular (10) and (11) hold with $\epsilon = \bar\Delta = \epsilon_1$. We estimate the error term
\[ \varepsilon(\phi; x, y) := \phi(y) - \phi(x) - \left\langle\nabla\phi(x), y - x\right\rangle, \]
where $y, x \in \Omega$. By choice of $\phi \in U(\epsilon_1)$, specifically (11) with $\epsilon = \bar\Delta = \epsilon_1$, for each $\delta \in (0, \epsilon_1)$, $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$ and $y \in (x + \delta\mathbb{B}) \cap \Omega$,
\[ \|\nabla\phi(x) - \nabla\phi(y)\| \leq (1 + \epsilon_1)\,\omega_\psi(\bar\Delta, \delta). \]

Thus, for $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$ and $y \in (x + \epsilon_1\mathbb{B}) \cap \Omega$,
\[ |\varepsilon(\phi; y, x)| = \left|\int_0^1\left\langle\nabla\phi(x + t(y - x)) - \nabla\phi(x), y - x\right\rangle dt\right| \leq (1 + \epsilon_1)\,\omega_\psi(\bar\Delta, \|x - y\|)\,\frac{\|y - x\|}{2}. \tag{14} \]
By continuity of $\nabla\psi$ on the compact set $(\bar x + \epsilon_1\mathbb{B}) \cap \Omega$, there is a finite upper bound $\beta$ on $\|\nabla\psi(x)\|$ for $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$. Define $\bar\beta = \beta + \epsilon_1$; by choice of $\phi \in U(\epsilon_1)$, specifically (10) with $\epsilon = \bar\Delta = \epsilon_1$, $\|\nabla\phi(x)\| \leq \bar\beta$ for $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$. It follows for such $x$ and any $\lambda \geq 0$ that
\[ \|x_\phi(\lambda) - x\| \leq \lambda\bar\beta, \tag{15} \]
because $\pi_\Omega$ is Lipschitz of modulus 1. Furthermore, since $\nabla\psi$ is uniformly continuous on compact sets, $\omega_\psi(\bar\Delta, \delta) \downarrow 0$ as $\delta \downarrow 0$. Thus, using the fact that $\omega_\psi(\bar\Delta, \cdot)$ is nondecreasing, there exists $\delta_2 \in (0, \epsilon_1)$ such that for $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$ and $\lambda \in (0, \delta_2)$, both
\[ \lambda\bar\beta \leq \epsilon_1 \quad \text{and} \quad (1 + \epsilon_1)\,\omega_\psi(\bar\Delta, \lambda\bar\beta)\,\bar\beta \leq 2\rho(1 - \mu_2). \]
From these inequalities and the inequalities (14) and (15), we see that for $x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$ and $\lambda \in (0, \delta_2)$,
\[ \varepsilon(\phi; x_\phi(\lambda), x) \leq \rho\lambda(1 - \mu_2). \tag{16} \]
Now for such $x$ and $\lambda$,
\[ \phi(x_\phi(\lambda)) = \phi(x) + [\mu_2 + (1 - \mu_2)]\left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle + \varepsilon(\phi; x_\phi(\lambda), x) \leq \phi(x) + \mu_2\left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle - \rho\lambda(1 - \mu_2) + \rho\lambda(1 - \mu_2), \]
where the inequality relies on the uniform descent property of Lemma 3.4 and (16). Thus
\[ \phi(x_\phi(\lambda)) \leq \phi(x) + \mu_2\left\langle\nabla\phi(x), x_\phi(\lambda) - x\right\rangle, \quad \forall x \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega,\ \lambda \in (0, \delta_2), \]
and this inequality with $\mu_1$ replacing $\mu_2$ also holds. Finally, for any $x^k \in (\bar x + \epsilon_1\mathbb{B}) \cap \Omega$, the auxiliary scalar $\bar\lambda_k$ satisfying (9) is bounded below by $\delta_2$; hence the step size $\lambda_k$ is bounded below by $\lambda_* := \min\{\gamma_1, \gamma_2\delta_2\}$. Since $0 < \gamma_2 < 1$, we have $\lambda_* \leq \delta_2$, and parts 1 and 2 of the proposition hold with $\bar\epsilon := \epsilon_1$.

Theorem 3.6  The PG method applied to (6) has NSR.

Proof  Let $\bar x \in \mathbb{R}^n$ be nonstationary, so according to Proposition 3.5 and Lemma 3.4, if $x^k \in (\bar x + \bar\epsilon\mathbb{B}) \cap \Omega$ then $\lambda_k \geq \lambda_*$ and
\[ \psi(x^{k+1}) \leq \psi(x^k) + \mu_1\left\langle\nabla\psi(x^k), x^k(\lambda_k) - x^k\right\rangle \leq \psi(x^k) - 2\sigma, \tag{17} \]
where $\sigma = \mu_1\rho\lambda_*/2 > 0$. Now by continuity of $\psi$ there is $\epsilon' \in (0, \bar\epsilon)$ such that
\[ |\psi(x) - \psi(\bar x)| \leq \sigma, \quad \forall x \in (\bar x + \epsilon'\mathbb{B}) \cap \Omega. \]
Take $V := (\bar x + \epsilon'\mathbb{B}) \cap \Omega$, and use the above inequality with (17) to see that for any $x^k \in V$,
\[ \psi(x^{k+1}) \leq \psi(\bar x) - \sigma. \]
The NSR property of Definition 3.1 follows.

4 Projected Gradient Algorithms for NCP

Our main goal here is to present a method for minimizing $\theta$ that has a low computational cost, and has NSR. Before proceeding we will make a few comments on guaranteeing convergence, at least on a subsequence. Existence of a (stationary) limit point of a sequence produced by a method with NSR follows from boundedness of the lower level set
\[ \{x \in \mathbb{R}^n : \|f_+(x)\| \leq \|f_+(x^0)\|\}, \]
where $x^0$ is the initial point. This boundedness property holds in many cases, for instance if $f$ is a uniform P-function, see Harker and Xiao [12]; hence if $f$ is strongly monotone. However, the uniform P-function property implies that $f_+'(x; \cdot)$ is invertible for each $x$, a condition that we believe is too strong in general (cf. Lemma 2.9). A weaker condition yielding boundedness of the above level set is that $f_+$ is proper, namely that the inverse image $f_+^{-1}(S)$ of any compact set $S \subseteq \mathbb{R}^n$ is compact.

4.1 A simple globally convergent algorithm

Given statement 4 of Proposition 2.3, it is tempting to use the following steepest descent idea in algorithms for minimizing $\theta$. Given the $k$th iterate $x^k \in \mathbb{R}^n$, an orthant $O^k$ containing $x^k$ and the complement $\tilde O^k$ of $O^k$ at $x^k$, let $d^k$ solve
\[ \min_d\ \theta'(x^k; d) \quad \text{subject to} \quad \|d\| \leq 1,\ d \in O^k \cup \tilde O^k. \]
This essentially requires two $n$-dimensional convex quadratic programs to be solved (a polyhedral norm on $d$ may be used), one for each orthant. If $d = 0$ is a solution, then $x^k$ is stationary for $\theta$.

Otherwise $\theta'(x^k; d^k) < 0$, and we can perform a line search to establish $\lambda_k > 0$ such that for $x^{k+1} = x^k + \lambda_k d^k$, $\theta(x^k + \lambda_k d^k)$ is strictly less than $\theta(x^k)$. However, if $\theta$ is nonsmooth, there seems to be little global convergence theory for algorithms based on this idea. For instance, it is not known if the step length $\lambda_k$ can be chosen to be uniformly large in a neighborhood of a nonstationary point while still retaining a certain rate of descent; hence it is hard to show that the sequence produced will not accumulate at a nonstationary point. Pang, Han and Rangaraj [23, Corollary 1] give an additional smoothness assumption at a limit point that is required to prove stationarity.

Alternatively, given the stationarity characterization of Proposition 2.3.5, we can design a naive steepest descent algorithm for minimizing $\theta$, each iteration of which is based on a projected gradient step over an orthant $O^k$ containing the current iterate $x^k$, and an additional $m$ projected gradient steps on 1-dimensional problems corresponding to moving in directions normal to the $m$ facets of $O^k$ that contain $x^k$ (so $m$ is the number of zero components of $x^k$). It is significant that, to obtain global convergence, we only need to increase the number of 1-dimensional subproblems at each iteration from $m$ to $n$, i.e. normals to all facets of $O^k$ must be examined.

The algorithm below introduces notation not strictly required for its statement; this notation is presented in preparation for the main algorithm, Algorithm 2, which appears in the next subsection. By $\theta_O$ we mean the restriction $\theta|_O$ of $\theta$ to $O$.

Algorithm 1.  Let $x^0 \in \mathbb{R}^n$. Given $k \in \{0, 1, 2, \ldots\}$ and $x^k \in \mathbb{R}^n$, define $x^{k+1}$ as follows.

Choose any orthant $O^k$ containing $x^k$, let $y^0(\lambda) := \pi_{O^k}[x^k - \lambda\nabla\theta_{O^k}(x^k)]$, and let $\lambda_0$ be the step size determined by one step of the projected gradient algorithm applied to $\min\{\theta(x) : x \in O^k\}$ from $x^k$.

Suppose $F_1, \ldots, F_n$ are the facets of $O^k$. For $j = 1, \ldots, n$, let $y^j := \pi_{F_j}(x^k)$, $N_j := N_{O^k}(F_j)$, $y^j(\lambda) := y^j + \pi_{N_j}[-\lambda\nabla\theta(y^j)]$, and let $\lambda_j$ be the step size determined by one step of the projected gradient algorithm applied to
\[ \min\{\theta(x) : x \in y^j + N_j\}, \tag{18} \]
starting from $y^j$. Let
\[ x^{k+1} := y^{\hat j}(\lambda_{\hat j}), \quad \text{where } \hat j \in \operatorname{argmin}\{\theta(y^j(\lambda_j)) : j = 0, 1, 2, \ldots, n\}. \]
If $\theta(x^{k+1}) = \theta(x^k)$ then STOP; $x^k$ is a Gauss-Newton point of $f_+$.

Remark.  In Algorithm 1 the projected gradient method is used as a subroutine. Therefore we assume that if the starting point of a subproblem is stationary, then the projected gradient method merely returns this point; the decision of whether or not the main algorithm should continue is made elsewhere.
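A condensed Python sketch of one iteration of Algorithm 1 follows (ours, reusing `pg_step` and `proj_orthant` from the Section 3 sketch and `normal_map`/`residual` from Section 1; the choice of $O^k$ via the sign of $x^k$ and the handling of ties are simplifications).

```python
import numpy as np

def grad_theta_on_orthant(f, jac, y, sign):
    """Gradient of theta restricted to the orthant with sign pattern `sign`
    (sign_i = +1 means the i-th half-line is R_+).  On that orthant y_+ = D y
    with D = diag(sign > 0), so theta restricted there is smooth."""
    D = (sign > 0).astype(float)
    y_plus = D * y
    r = f(y_plus) + y - y_plus                        # normal-map residual on this cell
    return (jac(y_plus) * D).T @ r + r - D * r        # (Jf(y_+) D + I - D)^T r

def algorithm1_iteration(f, jac, x):
    """One (simplified) iteration of Algorithm 1."""
    theta = lambda z: residual(f, z)
    n = x.size
    sign = np.where(x >= 0, 1.0, -1.0)                # an orthant O^k containing x^k
    grad_O = lambda y: grad_theta_on_orthant(f, jac, y, sign)
    candidates = [pg_step(grad_O, theta, proj_orthant(sign), x)[0]]
    for j in range(n):
        y = x.copy(); y[j] = 0.0                      # y^j = projection of x^k onto facet F_j
        nu = np.zeros(n); nu[j] = -sign[j]            # N_j = N_{O^k}(F_j) = R_+ * nu
        sign_j = sign.copy(); sign_j[j] = -sign[j]    # theta is smooth on F_j + N_j
        grad_j = lambda z, s=sign_j: grad_theta_on_orthant(f, jac, z, s)
        proj_ray = lambda z, y0=y, v=nu: y0 + max(np.dot(z - y0, v), 0.0) * v
        candidates.append(pg_step(grad_j, theta, proj_ray, y)[0])
    return min(candidates, key=theta)                 # x^{k+1}: best of the n+1 candidates
```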

Theorem 4.1  Algorithm 1 is well defined and has NSR.

Proof  Since the projected gradient method is well defined, for each $k$ and $x^k$ the algorithm produces $x^{k+1}$. If $\theta(x^{k+1}) = \theta(x^k)$, then none of the subproblems of the form (18) produced a point with a lower function value than $\theta(x^k)$. So $x^k$ is stationary for $\min\{\theta(x) : x \in O^k\}$ and for each subproblem (18) for which $F_j$ is a facet of $O^k$ containing $x^k$, and by Proposition 2.3, $x^k$ is also a Gauss-Newton point of $f_+$. Thus Algorithm 1 is well defined.

We show that the algorithm has NSR. Suppose $\bar x$ is not a Gauss-Newton point of $f_+$. For $x^k$ sufficiently close to $\bar x$, $\bar x \in O^k$. So consider the case when $O^k = \bar O$ for some fixed orthant $\bar O$ containing $\bar x$. By Proposition 2.3, $\bar x$ is nonstationary either for $\min\{\theta(x) : x \in \bar O\}$ or for $\min\{\theta(x) : x \in \bar x + N_{\bar O}(F)\}$, where $F$ is some facet of $\bar O$ containing $\bar x$. In the former case, for some $\epsilon = \epsilon(\bar O) > 0$ and each $x^k \in \bar x + \epsilon\mathbb{B}$, we have from Theorem 3.6 with $\Omega = \bar O$ and $\psi = \theta_{\bar O}$ that the candidate $y^0(\lambda_0)$ for the next iterate $x^{k+1}$ yields $\theta(y^0(\lambda_0)) < \theta(\bar x)$. Hence our choice of $x^{k+1}$ also yields $\theta(x^{k+1}) < \theta(\bar x)$. In the latter case, we can apply Proposition 3.5 by reformulating the subproblem (18) as $\min\{\theta(y^j + d) : d \in N_{\bar O}(F)\}$, i.e. define $\phi(d) = \theta(y^j + d)$, $\psi(d) = \theta(\bar x + d)$, $\Omega = N_{\bar O}(F)$, and $\bar\Delta$ as any positive constant, and let $\epsilon_1 > 0$ be the constant given by Proposition 3.5. Given the simple form of $\Omega$, it is easy to check that there is $\epsilon = \epsilon(\bar O) > 0$ such that if $x^k \in \bar x + \epsilon\mathbb{B}$, then $\phi \in U(\epsilon_1)$. For such $x^k$, Proposition 3.5 says that the candidate iterate $y^j(\lambda_j)$ yields $\theta(y^j(\lambda_j)) < \theta(\bar x)$, hence $\theta(x^{k+1}) < \theta(\bar x)$. Since there are only finitely many orthants, we conclude that for some $\epsilon > 0$ independent of $O^k$, and each $x^k \in \bar x + \epsilon\mathbb{B}$, we have $\theta(x^{k+1}) < \theta(\bar x)$.

This algorithm is extremely robust: under the single assumption that $f$ is $C^1$ on $\mathbb{R}^n_+$, the method is well defined and accumulation points are always Gauss-Newton points. It is also reasonably simple, using the projected gradient method as the work horse. A serious drawback of Algorithm 1 is that we need at least $n + 1$ function and Jacobian evaluations per iteration, in order to carry out the projected gradient method on the $n + 1$ subproblems. By contrast, the use of 1-dimensional subproblems means the linear algebra performed by Algorithm 1 is only around twice as expensive as the linear algebra needed to perform one projected gradient step on an orthant.

4.2 An efficient globally convergent algorithm

We present a globally convergent method for finding Gauss-Newton points of $f_+$ based on the PG method. It is efficient in the sense that, per iteration, the number of function evaluations is comparable to that needed for the PG method applied to minimizing a smooth function over an orthant, and the linear algebra computation involves about double the work required for linear algebra in the PG method.

At each iteration, we approximate $\theta$ by linearizing $f$ about $x^k_+$. Let
\[ A^k(x) := \tfrac{1}{2}\left\|L^k_+(x)\right\|^2, \]
where
\[ L^k_+(x) := f(x^k_+) + \nabla f(x^k_+)(x_+ - x^k_+) + x - x_+. \tag{19} \]
The "linearization" $L^k_+$ is a local point-based approximation [34] when $\nabla f$ is locally Lipschitz, and more generally a uniform first-order approximation near $x^k$ [28]; such approximations are more powerful than directional derivatives in that they approximate $f_+$ uniformly well for all $x$ near $x^k$. In [5, 4, 28, 34] these approximation properties have been exploited to give strong convergence results for Newton methods applied to nonsmooth equations like $f_+(x) = 0$. Our main algorithm, below, and its extremely robust convergence behavior also rely on these approximation properties.

Lemma 4.2  Let $\bar x \in \mathbb{R}^n$ and $\bar\Delta > 0$. There is a non-decreasing function $\varepsilon : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\varepsilon(\delta) = o(\delta)$ as $\delta \downarrow 0$, and for each $x^k, x \in \bar x + \bar\Delta\mathbb{B}$,
\[ |\theta(x) - A^k(x)| \leq \varepsilon(\|x - x^k\|). \]

Proof  We have
\[ |\theta(x) - A^k(x)| = (1/2)\left|\left\langle f_+(x) + L^k_+(x),\ f_+(x) - L^k_+(x)\right\rangle\right| \leq c\left\|f_+(x) - L^k_+(x)\right\|, \]
where $c \in (0, \infty)$ is the maximum value of $(1/2)\|f_+(x) + L^k_+(x)\|$ for $x^k, x \in \bar x + \bar\Delta\mathbb{B}$. Let $\omega$ be the modulus of continuity of $\nabla f$ on $\mathbb{R}^n_+ \cap (\bar x_+ + \bar\Delta\mathbb{B})$ (see Definition 3.3). Similar to (14) in the proof of Proposition 3.5,
\[ \left\|f_+(x) - L^k_+(x)\right\| \leq \omega(\|x - x^k\|)\,\|x - x^k\|/2, \]
where $\omega(\delta) \to 0$ as $\delta \downarrow 0$. Take $\varepsilon(\delta) := c\,\omega(\delta)\,\delta/2$.

We will search several paths during an iteration but, unlike Algorithm 1, our criteria for choosing the path parameter will use derivatives of the approximation $A^k$ rather than of $\theta$. Let $\mu_0 \in (0, \mu_1)$ and $\beta \in (0, 1)$. Suppose we are also given an orthant $O$ containing a point $y$ (but not necessarily $x^k$), and a path $y(\cdot) : [0, \infty) \to \mathbb{R}^n$ with $y(0) = y$. Given $\lambda > 0$, $y(\lambda)$ is a candidate for $x^{k+1}$ if
\[ \theta(y(\lambda)) \leq A^k(y) + \mu_0\left\langle\nabla A^k_O(y),\ y(\lambda) - y\right\rangle. \tag{20} \]
Here $A^k_O$ is the restriction $A^k|_O$. If $\lambda$ fails the above test, we can try $\beta\lambda$. Note that if $y = x^k$ then $\nabla A^k_O(y) = \nabla\theta_O(x^k)$, and the obvious choice for $y(\lambda)$ is $\pi_O(x^k - \lambda\nabla\theta_O(x^k))$. In this case (20) is equivalent to (7) with $\psi = \theta_O$ and $\Omega = O$.
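A small Python sketch of the point-based approximation (19) and the model $A^k$ (ours; the callable conventions are the same as in the earlier sketches).

```python
import numpy as np

def L_plus_k(f, jac, xk, x):
    """Point-based approximation (19): linearize f about x^k_+ but keep the
    nonsmooth part x - x_+ exact."""
    xk_plus = np.maximum(xk, 0.0)
    x_plus = np.maximum(x, 0.0)
    return f(xk_plus) + jac(xk_plus) @ (x_plus - xk_plus) + x - x_plus

def A_k(f, jac, xk, x):
    """A^k(x) = 0.5 * ||L^k_+(x)||^2, the model used in Part I of Algorithm 2."""
    r = L_plus_k(f, jac, xk, x)
    return 0.5 * np.dot(r, r)
```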

The first part of Algorithm 2 is a single step of Algorithm 1 applied to $A^k$ instead of $\theta$. The second part determines the path and the corresponding step length that will define the next iterate $x^{k+1}$.

Algorithm 2.  Let $x^0 \in \mathbb{R}^n$ and (in addition to the constants used for the PG method) $\mu_0 \in (0, \mu_1)$, $\beta \in (0, 1)$. Given $k \in \{0, 1, 2, \ldots\}$ and $x^k \in \mathbb{R}^n$, define $x^{k+1}$ as follows.

Part I.  Choose any orthant $O^k$ containing $x^k$, let $y^0 := x^k$, $y^0(\lambda) := \pi_{O^k}[x^k - \lambda\nabla A^k_{O^k}(x^k)]$, and let $\lambda_0$ be the step size determined by one step of the projected gradient algorithm applied to $\min\{A^k(x) : x \in O^k\}$ from $y^0$. Suppose $F_1, \ldots, F_n$ are the facets of $O^k$. For $j = 1, \ldots, n$, let $y^j := \pi_{F_j}(x^k)$, $N_j := N_{O^k}(F_j)$, $O_j := F_j + N_j$, $y^j(\lambda) := y^j + \pi_{N_j}[-\lambda\nabla A^k_{O_j}(y^j)]$, and let $\lambda_j$ be the step size determined by one step of the projected gradient algorithm applied to $\min\{A^k(x) : x \in y^j + N_j\}$ from $y^j$.

Part II.  Path search: Let $M := \{0, \ldots, n\}$, $\hat j := 0$ and $\lambda_0 := \lambda_0/\beta$.
REPEAT
  Let $\lambda_{\hat j} := \beta\lambda_{\hat j}$. If $\lambda_{\hat j} \leq \|y^{\hat j} - x^k\|$ then $M := M \setminus \{\hat j\}$.
  If $M = \emptyset$, STOP; $x^k$ is a Gauss-Newton point of $f_+$.
  Else let $\hat j \in \operatorname{argmin}\{A^k(y^j) + \mu_0\langle\nabla A^k_{O_j}(y^j), y^j(\lambda_j) - y^j\rangle : j \in M\}$.
UNTIL (20) holds for $y(\lambda) = y^{\hat j}(\lambda_{\hat j})$, $y = y^{\hat j}$ and $O = O_{\hat j}$.
Let $x^{k+1} := y^{\hat j}(\lambda_{\hat j})$.

Remark.  For the algorithm to work properly, we assume that Part I returns $\lambda_j = 0$ if $y^j$ is already stationary for the corresponding subproblem.

Theorem 4.3  Algorithm 2 is well defined and has NSR.

Proof  First we show that each step of the algorithm is well defined. Consider one step of the algorithm given $k \in \{0, 1, 2, \ldots\}$ and $x^k \in \mathbb{R}^n$. Part I is well defined because the projected gradient method is well defined. For Part II we see that each iteration of the REPEAT loop is well defined; we claim that the loop terminates after finitely many iterations. Certainly if $j \in \{0, \ldots, n\}$ and $y^j \neq x^k$, then after a finite number of loop iterations in which $\hat j = j$ and $\lambda_j := \beta\lambda_j$, we have
\[ \lambda_j \leq \|y^j - x^k\|; \tag{21} \]

hence in any subsequent loop iterations $j \notin M$ and $\hat j \neq j$. Instead suppose $j$ is such that $y^j = x^k$. Either $y^j$ is stationary for the $j$th subproblem, hence $\lambda_j = 0$ and, by construction of $M$, $\hat j$ equals $j$ for at most one loop iteration; or, using Proposition 3.5.1, initially $\lambda_j > 0$ and after finitely many loop iterations in which $\hat j = j$ and $\lambda_j := \beta\lambda_j$, (20) holds, terminating the loop. It is only left to check that $x^k$ is a Gauss-Newton point of $f_+$ if $M = \emptyset$. In this case, $\lambda_j \leq \|y^j - x^k\|$ for each $j$; in particular $\lambda_j = 0$ if $y^j = x^k$, i.e. for $j = 0$ and each $j$ in $M_0 = \{j : 1 \leq j \leq n,\ x^k \in F_j\}$. This is only possible if $x^k$ is stationary for each subproblem $\min\{A^k(x) : x \in O^k\}$ and $\min\{A^k(x) : x \in x^k + N_{O^k}(F_j)\}$, where $j \in M_0$. Since for each orthant $O$ containing $x^k$ we have $\nabla A^k_O(x^k) = \nabla\theta_O(x^k)$, it follows that $x^k$ is also stationary for $\min\{\theta(x) : x \in O^k\}$ and $\min\{\theta(x) : x \in x^k + N_{O^k}(F_j)\}$, where $j \in M_0$. Proposition 2.3 says $x^k$ is indeed a Gauss-Newton point of $f_+$.

We now prove the NSR property. Suppose that $\bar x$ is nonstationary for $\theta$. As in the proof of Theorem 4.1 we assume $O^k = \bar O$ for some fixed orthant $\bar O$ containing $\bar x$. Observe from Proposition 2.3 that either $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar O\}$ or $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar x + N_{\bar O}(F)\}$, for some facet $F$ of $\bar O$ containing $\bar x$. Below we assume the latter, and deduce for $x^k$ near $\bar x$ that $\theta(x^{k+1}) < \theta(\bar x)$.

Let $\bar O$ be an orthant containing $\bar x$ and $F$ be a facet of $\bar O$ containing $\bar x$. Assume $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar x + N_{\bar O}(F)\}$. Assume further that $x^k$ is some iterate with $O^k = \bar O$, so if $F_1, \ldots, F_n$ are the facets of $O^k$, then $F = F_{\tilde j}$ for some index $\tilde j$. To simplify notation we omit the superscript or subscript $\tilde j$ where possible. Let $N = N_{\bar O}(F)$, $\hat O = F + N$, $y = \pi_F(x^k)$, $y(\lambda) = y + \pi_N[-\lambda\nabla A^k_{\hat O}(y)]$, and
\[ \bar A(x) := (1/2)\|f(\bar x_+) + \nabla f(\bar x_+)(x_+ - \bar x_+) + x - x_+\|^2. \]
Observe, since $\nabla\theta_{\hat O}(\bar x) = \nabla\bar A_{\hat O}(\bar x)$, that $\bar x$ is nonstationary for $\min\{\bar A(x) : x \in \bar x + N\}$. Rewriting the $\tilde j$th subproblem, $\min\{A^k(x) : x \in y + N\}$, as $\min\{A^k(y + d) : d \in N\}$, defining $\phi(d) = A^k_{\hat O}(y + d)$, $\psi(d) = \bar A_{\hat O}(\bar x + d)$, $\Omega = N$ and choosing $\bar\Delta > 0$, enables us to apply Lemma 3.4 and Proposition 3.5. Then there exist $\epsilon_1 > 0$, $\lambda_1 > 0$ and $\rho > 0$ such that if $\|x^k - \bar x\| \leq \epsilon_1$ and $\phi \in U(\epsilon_1)$ (see Definition 3.3), then
\[ A^k(y(\lambda)) \leq A^k(y) + \mu_1\left\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\right\rangle, \quad \forall\lambda \in [0, \lambda_1], \tag{22} \]
\[ \left\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\right\rangle \leq -\rho\min\{\lambda, \lambda_1\}, \quad \forall\lambda \geq 0, \tag{23} \]
and the initial step size $\lambda_{\tilde j}$ chosen in Part I of the algorithm is bounded below by $\lambda_1$.

Now $\psi$ and $\phi$ are quadratic functions defined on the half-line $N$; hence, by continuity of $\nabla f$, it follows easily that there exists $\epsilon_2 \in (0, \epsilon_1]$ such that $\phi \in U(\epsilon_1)$ if $\|x^k - \bar x\| \leq \epsilon_2$. Thus (22) and (23) hold for such $x^k$ and $\lambda \in [0, \lambda_1]$.

Let $\|x^k - \bar x\| \leq \epsilon_2$ and $0 \leq \lambda \leq \lambda_1$. We have
\[ \theta(y(\lambda)) - \theta(x^k) = [A^k(y(\lambda)) - A^k(y)] + [A^k(y) - \theta(x^k)] + [\theta(y(\lambda)) - A^k(y(\lambda))] \leq \mu_1\left\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\right\rangle + [A^k(y) - \theta(x^k)] + [\theta(y(\lambda)) - A^k(y(\lambda))], \tag{24} \]
using (22). Let $L$ be an upper bound on $\|\nabla A^k_{\hat O}(y)\|$ for $x^k \in \bar x + \epsilon_2\mathbb{B}$, and observe
\[ \|y(\lambda) - x^k\| \leq \|y(\lambda) - y\| + \|y - x^k\| \leq \lambda L + \|y - x^k\|. \]
Also $y = \pi_F(x^k)$ is bounded on $\bar x + \epsilon_2\mathbb{B}$; therefore Lemma 4.2 provides a non-decreasing error bound $\varepsilon(t) = o(t)$ such that for each $x^k \in \bar x + \epsilon_2\mathbb{B}$ and $\lambda \in [0, \lambda_1]$,
\[ \theta(y(\lambda)) - A^k(y(\lambda)) \leq \varepsilon(\lambda L + \|y - x^k\|). \tag{25} \]
Let $\hat\sigma := (\mu_1 - \mu_0)\rho/2$ and choose $\lambda'' \in (0, \lambda_1]$ such that $\varepsilon(2\lambda L) \leq 2\hat\sigma\lambda$ for all $\lambda \in (0, \lambda'']$; set $\lambda' := \beta\lambda''$. Now choose $\epsilon_3 \in (0, \epsilon_2)$ such that if $\|x^k - \bar x\| \leq \epsilon_3$, then both $\|y - x^k\| \leq \min\{\lambda', \lambda' L\}/2$ and $|A^k(y) - \theta(x^k)| \leq \hat\sigma\lambda'$.

Let $x^k \in \bar x + \epsilon_3\mathbb{B}$ and $\lambda \in [\lambda', \lambda'']$. Subtracting $A^k(y) - \theta(x^k)$ from both sides of (24) and using (25) together with $\|y - x^k\| \leq \lambda' L \leq \lambda L$, we obtain
\[ \theta(y(\lambda)) - A^k(y) \leq \mu_1\left\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\right\rangle + \varepsilon(2\lambda L) \leq \mu_1\left\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\right\rangle + 2\hat\sigma\lambda. \]
From (23) and $\lambda \leq \lambda_1$, $(\mu_1 - \mu_0)\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\rangle \leq -(\mu_1 - \mu_0)\rho\lambda = -2\hat\sigma\lambda$; therefore
\[ \theta(y(\lambda)) \leq A^k(y) + \mu_0\left\langle\nabla A^k_{\hat O}(y), y(\lambda) - y\right\rangle, \quad \forall\lambda \in [\lambda', \lambda''], \tag{26} \]
that is, (20) holds along the path $y(\cdot)$ for every $\lambda \in [\lambda', \lambda'']$.

From above, the initial step size $\lambda_{\tilde j}$ and the point $y^{\tilde j} = y$ are such that $\lambda_{\tilde j} \geq \lambda_1 \geq \lambda''$ and $\lambda' > \|y - x^k\|$. We claim it follows from (26) that, during the REPEAT loop of Part II, $\tilde j$ remains in $M$ and $\lambda_{\tilde j} \geq \lambda'$. To see this, suppose that $\lambda_{\tilde j}$ decreases in some loop iteration after the first loop iteration. Then at the end of the previous loop iteration ($\hat j = \tilde j$ and) the condition (20) fails for $y(\lambda) = y^{\tilde j}(\lambda_{\tilde j})$, $y = y^{\tilde j}$ and $O = O_{\tilde j}$; so it follows from (26) that $\lambda_{\tilde j} > \lambda''$. Thus the new value $\beta\lambda_{\tilde j}$ of $\lambda_{\tilde j}$ is bounded below by $\beta\lambda'' = \lambda'$, hence also $\lambda_{\tilde j} > \|y - x^k\|$ and $\tilde j$ is not deleted from $M$. Therefore, after the REPEAT loop terminates, $\tilde j \in M$ and $\lambda_{\tilde j} \geq \lambda'$; and the selection of $x^{k+1}$, whether or not it uses $y^{\tilde j}(\lambda_{\tilde j})$, satisfies
\[ \theta(x^{k+1}) \leq \min\{A^k(y^j) + \mu_0\langle\nabla A^k_{O_j}(y^j), y^j(\lambda_j) - y^j\rangle : j \in M\} \leq A^k(y) + \mu_0\left\langle\nabla A^k_{\hat O}(y), y(\lambda_{\tilde j}) - y\right\rangle \leq A^k(y) - \mu_0\rho\min\{\lambda_{\tilde j}, \lambda_1\} \leq A^k(y) - \sigma, \]
where the third inequality follows from (23), and $\sigma := \mu_0\rho\lambda'$ is a positive constant independent of $x^k$.

As noted above, $A^k(y) \to \theta(\bar x)$ as $x^k \to \bar x$, so $\theta(x^{k+1}) < \theta(\bar x)$ for $x^k$ sufficiently close to $\bar x$.

A similar argument can be made for the case when $O^k = \bar O$ and $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar O\}$. In this case $\tilde j = 0$, $y = x^k$ and $y(\lambda) = \pi_{\bar O}(x^k - \lambda\nabla A^k_{\bar O}(x^k))$. We do not give details, but only note that this process is somewhat simpler than that above because the inequality corresponding to (24) only has two summands on the right:
\[ \theta(y(\lambda)) - \theta(x^k) \leq \mu_1\left\langle\nabla A^k_{\bar O}(x^k), y(\lambda) - x^k\right\rangle + [\theta(y(\lambda)) - A^k(y(\lambda))]. \]
Since there are only finitely many choices of $\bar O$, the NSR property of Algorithm 2 is established.

4.3 A hybrid algorithm with quadratic local convergence

Both of the algorithms given above have at best a linear rate of convergence because the projected gradient method is only a first-order method. However, if an algorithm for finding a Gauss-Newton point of $f_+$ has NSR (such as Algorithms 1 and 2), then this lends itself to hybrid methods that alternate between steps of the original algorithm and Newton-like steps, and therefore admit the possibility of quadratic local convergence. For such a hybrid algorithm, let $\mathcal{K}$ be the set of indices $k$ for which the original algorithm determines $x^{k+1}$. If $\mathcal{K}$ has infinitely many elements and monotonicity of the algorithm is maintained, accumulation points of the subsequence $\{x^k\}_{k \in \mathcal{K}}$ are Gauss-Newton points of $f_+$. If such a limit point $\bar x$ is in fact a point of attraction of a Newton method, and a Newton step is taken every $\ell$th iteration, then convergence will be $\ell$-step superlinear, or $\ell$-step quadratic if $\nabla f$ is Lipschitz. See [2] for details on a related hybrid algorithm in the context of quadratic programming.

We briefly sketch three popular Newton methods for solving the nonsmooth equation $f_+(x) = 0$, which often produce Q-quadratically convergent sequences of iterates. To make comparisons easy, we use the general notion of a Newton path [28] which, given the iterate $x^k$, is some function $p^k : [0, 1] \to \mathbb{R}^n$ with $p^k(0) = x^k$; the next iterate $x^{k+1}$ is defined as $p^k(\tau)$ for some $\tau \in [0, 1]$ (details are given below). We say a Newton iterate or Newton step is taken if $x^{k+1} = p^k(1)$. We may not take a Newton step, however, if it does not yield "sufficient progress". A simple damping strategy is used to ensure sufficient progress: recall the constants $\mu_0, \beta \in (0, 1)$, and define $\tau$ as the largest member of $\{1, \beta, \beta^2, \ldots\}$ such that
\[ \|f_+(p^k(\tau))\| \leq (1 - \mu_0\tau)\|f_+(x^k)\|. \tag{27} \]
Then $x^{k+1} := p^k(\tau)$; this is the damped Newton iterate.

Newton path 1.  Given $k$ and $x^k$, let $O^k$ be an orthant containing $x^k$, $M^k = \nabla(f_+|_{O^k})(x^k)$, $d^k = -(M^k)^{-1}f_+(x^k)$, and $p^k(\tau) = x^k + \tau d^k$.
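A hedged Python sketch of Newton path 1 with the damping rule (27) follows (ours; it simply falls back when the piece Jacobian $M^k$ is singular or no step passes (27), which is where the globally convergent Algorithms 1 and 2 of the hybrid scheme would take over).

```python
import numpy as np

def damped_newton_step(f, jac, x, mu0=1e-4, beta=0.5, max_back=30):
    """Newton path 1 with damping (27): solve M^k d = -f_+(x^k) on the piece
    of linearity O^k containing x^k, then backtrack tau in {1, beta, beta^2, ...}."""
    sign = np.where(x >= 0, 1.0, -1.0)
    D = (sign > 0).astype(float)
    x_plus = D * x
    r = f(x_plus) + x - x_plus                           # f_+(x^k)
    M = jac(x_plus) * D + np.eye(x.size) - np.diag(D)    # Jacobian of f_+ on O^k
    try:
        d = np.linalg.solve(M, -r)                       # Newton direction on O^k
    except np.linalg.LinAlgError:
        return x, 0.0                                    # singular model: defer to the PG-based method
    tau = 1.0
    for _ in range(max_back):
        x_new = x + tau * d
        x_new_plus = np.maximum(x_new, 0.0)
        r_new = f(x_new_plus) + x_new - x_new_plus       # f_+ at the trial point
        if np.linalg.norm(r_new) <= (1.0 - mu0 * tau) * np.linalg.norm(r):
            return x_new, tau                            # damped Newton iterate
        tau *= beta
    return x, 0.0                                        # insufficient progress: defer
```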


More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information

Spectral gradient projection method for solving nonlinear monotone equations

Spectral gradient projection method for solving nonlinear monotone equations Journal of Computational and Applied Mathematics 196 (2006) 478 484 www.elsevier.com/locate/cam Spectral gradient projection method for solving nonlinear monotone equations Li Zhang, Weijun Zhou Department

More information

Implications of the Constant Rank Constraint Qualification

Implications of the Constant Rank Constraint Qualification Mathematical Programming manuscript No. (will be inserted by the editor) Implications of the Constant Rank Constraint Qualification Shu Lu Received: date / Accepted: date Abstract This paper investigates

More information

A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees

A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees Noname manuscript No. (will be inserted by the editor) A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees Frank E. Curtis Xiaocun Que May 26, 2014 Abstract

More information

Stationary Points of Bound Constrained Minimization Reformulations of Complementarity Problems1,2

Stationary Points of Bound Constrained Minimization Reformulations of Complementarity Problems1,2 JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 94, No. 2, pp. 449-467, AUGUST 1997 Stationary Points of Bound Constrained Minimization Reformulations of Complementarity Problems1,2 M. V. SOLODOV3

More information

Nonlinear Programming

Nonlinear Programming Nonlinear Programming Kees Roos e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos LNMB Course De Uithof, Utrecht February 6 - May 8, A.D. 2006 Optimization Group 1 Outline for week

More information

ARE202A, Fall Contents

ARE202A, Fall Contents ARE202A, Fall 2005 LECTURE #2: WED, NOV 6, 2005 PRINT DATE: NOVEMBER 2, 2005 (NPP2) Contents 5. Nonlinear Programming Problems and the Kuhn Tucker conditions (cont) 5.2. Necessary and sucient conditions

More information

IE 5531: Engineering Optimization I

IE 5531: Engineering Optimization I IE 5531: Engineering Optimization I Lecture 14: Unconstrained optimization Prof. John Gunnar Carlsson October 27, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I October 27, 2010 1

More information

Linear Algebra (part 1) : Matrices and Systems of Linear Equations (by Evan Dummit, 2016, v. 2.02)

Linear Algebra (part 1) : Matrices and Systems of Linear Equations (by Evan Dummit, 2016, v. 2.02) Linear Algebra (part ) : Matrices and Systems of Linear Equations (by Evan Dummit, 206, v 202) Contents 2 Matrices and Systems of Linear Equations 2 Systems of Linear Equations 2 Elimination, Matrix Formulation

More information

A Proximal Method for Identifying Active Manifolds

A Proximal Method for Identifying Active Manifolds A Proximal Method for Identifying Active Manifolds W.L. Hare April 18, 2006 Abstract The minimization of an objective function over a constraint set can often be simplified if the active manifold of the

More information

2 B. CHEN, X. CHEN AND C. KANZOW Abstract: We introduce a new NCP-function that reformulates a nonlinear complementarity problem as a system of semism

2 B. CHEN, X. CHEN AND C. KANZOW Abstract: We introduce a new NCP-function that reformulates a nonlinear complementarity problem as a system of semism A PENALIZED FISCHER-BURMEISTER NCP-FUNCTION: THEORETICAL INVESTIGATION AND NUMERICAL RESULTS 1 Bintong Chen 2, Xiaojun Chen 3 and Christian Kanzow 4 2 Department of Management and Systems Washington State

More information

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a WHEN IS A MAP POISSON N.G.Bean, D.A.Green and P.G.Taylor Department of Applied Mathematics University of Adelaide Adelaide 55 Abstract In a recent paper, Olivier and Walrand (994) claimed that the departure

More information

NONSMOOTH VARIANTS OF POWELL S BFGS CONVERGENCE THEOREM

NONSMOOTH VARIANTS OF POWELL S BFGS CONVERGENCE THEOREM NONSMOOTH VARIANTS OF POWELL S BFGS CONVERGENCE THEOREM JIAYI GUO AND A.S. LEWIS Abstract. The popular BFGS quasi-newton minimization algorithm under reasonable conditions converges globally on smooth

More information

Unconstrained minimization of smooth functions

Unconstrained minimization of smooth functions Unconstrained minimization of smooth functions We want to solve min x R N f(x), where f is convex. In this section, we will assume that f is differentiable (so its gradient exists at every point), and

More information

20 J.-S. CHEN, C.-H. KO AND X.-R. WU. : R 2 R is given by. Recently, the generalized Fischer-Burmeister function ϕ p : R2 R, which includes

20 J.-S. CHEN, C.-H. KO AND X.-R. WU. : R 2 R is given by. Recently, the generalized Fischer-Burmeister function ϕ p : R2 R, which includes 016 0 J.-S. CHEN, C.-H. KO AND X.-R. WU whereas the natural residual function ϕ : R R is given by ϕ (a, b) = a (a b) + = min{a, b}. Recently, the generalized Fischer-Burmeister function ϕ p : R R, which

More information

The iterative convex minorant algorithm for nonparametric estimation

The iterative convex minorant algorithm for nonparametric estimation The iterative convex minorant algorithm for nonparametric estimation Report 95-05 Geurt Jongbloed Technische Universiteit Delft Delft University of Technology Faculteit der Technische Wiskunde en Informatica

More information

1 Matrices and Systems of Linear Equations

1 Matrices and Systems of Linear Equations Linear Algebra (part ) : Matrices and Systems of Linear Equations (by Evan Dummit, 207, v 260) Contents Matrices and Systems of Linear Equations Systems of Linear Equations Elimination, Matrix Formulation

More information

Key words. linear complementarity problem, non-interior-point algorithm, Tikhonov regularization, P 0 matrix, regularized central path

Key words. linear complementarity problem, non-interior-point algorithm, Tikhonov regularization, P 0 matrix, regularized central path A GLOBALLY AND LOCALLY SUPERLINEARLY CONVERGENT NON-INTERIOR-POINT ALGORITHM FOR P 0 LCPS YUN-BIN ZHAO AND DUAN LI Abstract Based on the concept of the regularized central path, a new non-interior-point

More information

y Ray of Half-line or ray through in the direction of y

y Ray of Half-line or ray through in the direction of y Chapter LINEAR COMPLEMENTARITY PROBLEM, ITS GEOMETRY, AND APPLICATIONS. THE LINEAR COMPLEMENTARITY PROBLEM AND ITS GEOMETRY The Linear Complementarity Problem (abbreviated as LCP) is a general problem

More information

IE 5531: Engineering Optimization I

IE 5531: Engineering Optimization I IE 5531: Engineering Optimization I Lecture 19: Midterm 2 Review Prof. John Gunnar Carlsson November 22, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I November 22, 2010 1 / 34 Administrivia

More information

A NOTE ON A GLOBALLY CONVERGENT NEWTON METHOD FOR SOLVING. Patrice MARCOTTE. Jean-Pierre DUSSAULT

A NOTE ON A GLOBALLY CONVERGENT NEWTON METHOD FOR SOLVING. Patrice MARCOTTE. Jean-Pierre DUSSAULT A NOTE ON A GLOBALLY CONVERGENT NEWTON METHOD FOR SOLVING MONOTONE VARIATIONAL INEQUALITIES Patrice MARCOTTE Jean-Pierre DUSSAULT Resume. Il est bien connu que la methode de Newton, lorsqu'appliquee a

More information

58 Appendix 1 fundamental inconsistent equation (1) can be obtained as a linear combination of the two equations in (2). This clearly implies that the

58 Appendix 1 fundamental inconsistent equation (1) can be obtained as a linear combination of the two equations in (2). This clearly implies that the Appendix PRELIMINARIES 1. THEOREMS OF ALTERNATIVES FOR SYSTEMS OF LINEAR CONSTRAINTS Here we consider systems of linear constraints, consisting of equations or inequalities or both. A feasible solution

More information

A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints

A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints Journal of Computational and Applied Mathematics 161 (003) 1 5 www.elsevier.com/locate/cam A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality

More information

1 Lyapunov theory of stability

1 Lyapunov theory of stability M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability

More information

Lecture 4 - The Gradient Method Objective: find an optimal solution of the problem

Lecture 4 - The Gradient Method Objective: find an optimal solution of the problem Lecture 4 - The Gradient Method Objective: find an optimal solution of the problem min{f (x) : x R n }. The iterative algorithms that we will consider are of the form x k+1 = x k + t k d k, k = 0, 1,...

More information

Lecture 4 - The Gradient Method Objective: find an optimal solution of the problem

Lecture 4 - The Gradient Method Objective: find an optimal solution of the problem Lecture 4 - The Gradient Method Objective: find an optimal solution of the problem min{f (x) : x R n }. The iterative algorithms that we will consider are of the form x k+1 = x k + t k d k, k = 0, 1,...

More information

2 Sequences, Continuity, and Limits

2 Sequences, Continuity, and Limits 2 Sequences, Continuity, and Limits In this chapter, we introduce the fundamental notions of continuity and limit of a real-valued function of two variables. As in ACICARA, the definitions as well as proofs

More information

Zangwill s Global Convergence Theorem

Zangwill s Global Convergence Theorem Zangwill s Global Convergence Theorem A theory of global convergence has been given by Zangwill 1. This theory involves the notion of a set-valued mapping, or point-to-set mapping. Definition 1.1 Given

More information

Garrett: `Bernstein's analytic continuation of complex powers' 2 Let f be a polynomial in x 1 ; : : : ; x n with real coecients. For complex s, let f

Garrett: `Bernstein's analytic continuation of complex powers' 2 Let f be a polynomial in x 1 ; : : : ; x n with real coecients. For complex s, let f 1 Bernstein's analytic continuation of complex powers c1995, Paul Garrett, garrettmath.umn.edu version January 27, 1998 Analytic continuation of distributions Statement of the theorems on analytic continuation

More information

The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints.

The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints. 1 Optimization Mathematical programming refers to the basic mathematical problem of finding a maximum to a function, f, subject to some constraints. 1 In other words, the objective is to find a point,

More information

min f(x). (2.1) Objectives consisting of a smooth convex term plus a nonconvex regularization term;

min f(x). (2.1) Objectives consisting of a smooth convex term plus a nonconvex regularization term; Chapter 2 Gradient Methods The gradient method forms the foundation of all of the schemes studied in this book. We will provide several complementary perspectives on this algorithm that highlight the many

More information

Absolute value equations

Absolute value equations Linear Algebra and its Applications 419 (2006) 359 367 www.elsevier.com/locate/laa Absolute value equations O.L. Mangasarian, R.R. Meyer Computer Sciences Department, University of Wisconsin, 1210 West

More information

University of California. Berkeley, CA fzhangjun johans lygeros Abstract

University of California. Berkeley, CA fzhangjun johans lygeros Abstract Dynamical Systems Revisited: Hybrid Systems with Zeno Executions Jun Zhang, Karl Henrik Johansson y, John Lygeros, and Shankar Sastry Department of Electrical Engineering and Computer Sciences University

More information

Preface These notes were prepared on the occasion of giving a guest lecture in David Harel's class on Advanced Topics in Computability. David's reques

Preface These notes were prepared on the occasion of giving a guest lecture in David Harel's class on Advanced Topics in Computability. David's reques Two Lectures on Advanced Topics in Computability Oded Goldreich Department of Computer Science Weizmann Institute of Science Rehovot, Israel. oded@wisdom.weizmann.ac.il Spring 2002 Abstract This text consists

More information

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Zhaosong Lu October 5, 2012 (Revised: June 3, 2013; September 17, 2013) Abstract In this paper we study

More information

Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3. Optimization Problems and Iterative Algorithms Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

More information

McMaster University. Advanced Optimization Laboratory. Title: A Proximal Method for Identifying Active Manifolds. Authors: Warren L.

McMaster University. Advanced Optimization Laboratory. Title: A Proximal Method for Identifying Active Manifolds. Authors: Warren L. McMaster University Advanced Optimization Laboratory Title: A Proximal Method for Identifying Active Manifolds Authors: Warren L. Hare AdvOl-Report No. 2006/07 April 2006, Hamilton, Ontario, Canada A Proximal

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

1 Introduction We consider the problem nd x 2 H such that 0 2 T (x); (1.1) where H is a real Hilbert space, and T () is a maximal monotone operator (o

1 Introduction We consider the problem nd x 2 H such that 0 2 T (x); (1.1) where H is a real Hilbert space, and T () is a maximal monotone operator (o Journal of Convex Analysis Volume 6 (1999), No. 1, pp. xx-xx. cheldermann Verlag A HYBRID PROJECTION{PROXIMAL POINT ALGORITHM M. V. Solodov y and B. F. Svaiter y January 27, 1997 (Revised August 24, 1998)

More information

and P RP k = gt k (g k? g k? ) kg k? k ; (.5) where kk is the Euclidean norm. This paper deals with another conjugate gradient method, the method of s

and P RP k = gt k (g k? g k? ) kg k? k ; (.5) where kk is the Euclidean norm. This paper deals with another conjugate gradient method, the method of s Global Convergence of the Method of Shortest Residuals Yu-hong Dai and Ya-xiang Yuan State Key Laboratory of Scientic and Engineering Computing, Institute of Computational Mathematics and Scientic/Engineering

More information

A projection-type method for generalized variational inequalities with dual solutions

A projection-type method for generalized variational inequalities with dual solutions Available online at www.isr-publications.com/jnsa J. Nonlinear Sci. Appl., 10 (2017), 4812 4821 Research Article Journal Homepage: www.tjnsa.com - www.isr-publications.com/jnsa A projection-type method

More information

Date: July 5, Contents

Date: July 5, Contents 2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

DO NOT OPEN THIS QUESTION BOOKLET UNTIL YOU ARE TOLD TO DO SO

DO NOT OPEN THIS QUESTION BOOKLET UNTIL YOU ARE TOLD TO DO SO QUESTION BOOKLET EECS 227A Fall 2009 Midterm Tuesday, Ocotober 20, 11:10-12:30pm DO NOT OPEN THIS QUESTION BOOKLET UNTIL YOU ARE TOLD TO DO SO You have 80 minutes to complete the midterm. The midterm consists

More information

Combinatorial Structures in Nonlinear Programming

Combinatorial Structures in Nonlinear Programming Combinatorial Structures in Nonlinear Programming Stefan Scholtes April 2002 Abstract Non-smoothness and non-convexity in optimization problems often arise because a combinatorial structure is imposed

More information

460 HOLGER DETTE AND WILLIAM J STUDDEN order to examine how a given design behaves in the model g` with respect to the D-optimality criterion one uses

460 HOLGER DETTE AND WILLIAM J STUDDEN order to examine how a given design behaves in the model g` with respect to the D-optimality criterion one uses Statistica Sinica 5(1995), 459-473 OPTIMAL DESIGNS FOR POLYNOMIAL REGRESSION WHEN THE DEGREE IS NOT KNOWN Holger Dette and William J Studden Technische Universitat Dresden and Purdue University Abstract:

More information

ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS

ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS A. F. Izmailov and A. S. Kurennoy December 011 ABSTRACT In the context of mixed complementarity problems various concepts of solution regularity are

More information

1 Introduction Let F : < n! < n be a continuously dierentiable mapping and S be a nonempty closed convex set in < n. The variational inequality proble

1 Introduction Let F : < n! < n be a continuously dierentiable mapping and S be a nonempty closed convex set in < n. The variational inequality proble A New Unconstrained Dierentiable Merit Function for Box Constrained Variational Inequality Problems and a Damped Gauss-Newton Method Defeng Sun y and Robert S. Womersley z School of Mathematics University

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

290 J.M. Carnicer, J.M. Pe~na basis (u 1 ; : : : ; u n ) consisting of minimally supported elements, yet also has a basis (v 1 ; : : : ; v n ) which f

290 J.M. Carnicer, J.M. Pe~na basis (u 1 ; : : : ; u n ) consisting of minimally supported elements, yet also has a basis (v 1 ; : : : ; v n ) which f Numer. Math. 67: 289{301 (1994) Numerische Mathematik c Springer-Verlag 1994 Electronic Edition Least supported bases and local linear independence J.M. Carnicer, J.M. Pe~na? Departamento de Matematica

More information

Midterm 1. Every element of the set of functions is continuous

Midterm 1. Every element of the set of functions is continuous Econ 200 Mathematics for Economists Midterm Question.- Consider the set of functions F C(0, ) dened by { } F = f C(0, ) f(x) = ax b, a A R and b B R That is, F is a subset of the set of continuous functions

More information

Metric Spaces. DEF. If (X; d) is a metric space and E is a nonempty subset, then (E; d) is also a metric space, called a subspace of X:

Metric Spaces. DEF. If (X; d) is a metric space and E is a nonempty subset, then (E; d) is also a metric space, called a subspace of X: Metric Spaces DEF. A metric space X or (X; d) is a nonempty set X together with a function d : X X! [0; 1) such that for all x; y; and z in X : 1. d (x; y) 0 with equality i x = y 2. d (x; y) = d (y; x)

More information

Set, functions and Euclidean space. Seungjin Han

Set, functions and Euclidean space. Seungjin Han Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,

More information

Lecture 8 Plus properties, merit functions and gap functions. September 28, 2008

Lecture 8 Plus properties, merit functions and gap functions. September 28, 2008 Lecture 8 Plus properties, merit functions and gap functions September 28, 2008 Outline Plus-properties and F-uniqueness Equation reformulations of VI/CPs Merit functions Gap merit functions FP-I book:

More information

A derivative-free nonmonotone line search and its application to the spectral residual method

A derivative-free nonmonotone line search and its application to the spectral residual method IMA Journal of Numerical Analysis (2009) 29, 814 825 doi:10.1093/imanum/drn019 Advance Access publication on November 14, 2008 A derivative-free nonmonotone line search and its application to the spectral

More information

R. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the

R. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the A Multi{Parameter Method for Nonlinear Least{Squares Approximation R Schaback Abstract P For discrete nonlinear least-squares approximation problems f 2 (x)! min for m smooth functions f : IR n! IR a m

More information

A TOUR OF LINEAR ALGEBRA FOR JDEP 384H

A TOUR OF LINEAR ALGEBRA FOR JDEP 384H A TOUR OF LINEAR ALGEBRA FOR JDEP 384H Contents Solving Systems 1 Matrix Arithmetic 3 The Basic Rules of Matrix Arithmetic 4 Norms and Dot Products 5 Norms 5 Dot Products 6 Linear Programming 7 Eigenvectors

More information

6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE. Three Alternatives/Remedies for Gradient Projection

6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE. Three Alternatives/Remedies for Gradient Projection 6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE Three Alternatives/Remedies for Gradient Projection Two-Metric Projection Methods Manifold Suboptimization Methods

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

INTRODUCTION TO NETS. limits to coincide, since it can be deduced: i.e. x

INTRODUCTION TO NETS. limits to coincide, since it can be deduced: i.e. x INTRODUCTION TO NETS TOMMASO RUSSO 1. Sequences do not describe the topology The goal of this rst part is to justify via some examples the fact that sequences are not sucient to describe a topological

More information

arxiv: v1 [math.oc] 1 Jul 2016

arxiv: v1 [math.oc] 1 Jul 2016 Convergence Rate of Frank-Wolfe for Non-Convex Objectives Simon Lacoste-Julien INRIA - SIERRA team ENS, Paris June 8, 016 Abstract arxiv:1607.00345v1 [math.oc] 1 Jul 016 We give a simple proof that the

More information

Alternative theorems for nonlinear projection equations and applications to generalized complementarity problems

Alternative theorems for nonlinear projection equations and applications to generalized complementarity problems Nonlinear Analysis 46 (001) 853 868 www.elsevier.com/locate/na Alternative theorems for nonlinear projection equations and applications to generalized complementarity problems Yunbin Zhao a;, Defeng Sun

More information

1 Newton s Method. Suppose we want to solve: x R. At x = x, f (x) can be approximated by:

1 Newton s Method. Suppose we want to solve: x R. At x = x, f (x) can be approximated by: Newton s Method Suppose we want to solve: (P:) min f (x) At x = x, f (x) can be approximated by: n x R. f (x) h(x) := f ( x)+ f ( x) T (x x)+ (x x) t H ( x)(x x), 2 which is the quadratic Taylor expansion

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

More information

Werner Romisch. Humboldt University Berlin. Abstract. Perturbations of convex chance constrained stochastic programs are considered the underlying

Werner Romisch. Humboldt University Berlin. Abstract. Perturbations of convex chance constrained stochastic programs are considered the underlying Stability of solutions to chance constrained stochastic programs Rene Henrion Weierstrass Institute for Applied Analysis and Stochastics D-7 Berlin, Germany and Werner Romisch Humboldt University Berlin

More information

Value and Policy Iteration

Value and Policy Iteration Chapter 7 Value and Policy Iteration 1 For infinite horizon problems, we need to replace our basic computational tool, the DP algorithm, which we used to compute the optimal cost and policy for finite

More information

Kaisa Joki Adil M. Bagirov Napsu Karmitsa Marko M. Mäkelä. New Proximal Bundle Method for Nonsmooth DC Optimization

Kaisa Joki Adil M. Bagirov Napsu Karmitsa Marko M. Mäkelä. New Proximal Bundle Method for Nonsmooth DC Optimization Kaisa Joki Adil M. Bagirov Napsu Karmitsa Marko M. Mäkelä New Proximal Bundle Method for Nonsmooth DC Optimization TUCS Technical Report No 1130, February 2015 New Proximal Bundle Method for Nonsmooth

More information

Economics Bulletin, 2012, Vol. 32 No. 1 pp Introduction. 2. The preliminaries

Economics Bulletin, 2012, Vol. 32 No. 1 pp Introduction. 2. The preliminaries 1. Introduction In this paper we reconsider the problem of axiomatizing scoring rules. Early results on this problem are due to Smith (1973) and Young (1975). They characterized social welfare and social

More information

SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction

SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction ACTA MATHEMATICA VIETNAMICA 271 Volume 29, Number 3, 2004, pp. 271-280 SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM NGUYEN NANG TAM Abstract. This paper establishes two theorems

More information

PM functions, their characteristic intervals and iterative roots

PM functions, their characteristic intervals and iterative roots ANNALES POLONICI MATHEMATICI LXV.2(1997) PM functions, their characteristic intervals and iterative roots by Weinian Zhang (Chengdu) Abstract. The concept of characteristic interval for piecewise monotone

More information

Approximation Algorithms for Maximum. Coverage and Max Cut with Given Sizes of. Parts? A. A. Ageev and M. I. Sviridenko

Approximation Algorithms for Maximum. Coverage and Max Cut with Given Sizes of. Parts? A. A. Ageev and M. I. Sviridenko Approximation Algorithms for Maximum Coverage and Max Cut with Given Sizes of Parts? A. A. Ageev and M. I. Sviridenko Sobolev Institute of Mathematics pr. Koptyuga 4, 630090, Novosibirsk, Russia fageev,svirg@math.nsc.ru

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

University of Maryland at College Park. limited amount of computer memory, thereby allowing problems with a very large number

University of Maryland at College Park. limited amount of computer memory, thereby allowing problems with a very large number Limited-Memory Matrix Methods with Applications 1 Tamara Gibson Kolda 2 Applied Mathematics Program University of Maryland at College Park Abstract. The focus of this dissertation is on matrix decompositions

More information

CONSTRAINED NONLINEAR PROGRAMMING

CONSTRAINED NONLINEAR PROGRAMMING 149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach

More information

Proximal and First-Order Methods for Convex Optimization

Proximal and First-Order Methods for Convex Optimization Proximal and First-Order Methods for Convex Optimization John C Duchi Yoram Singer January, 03 Abstract We describe the proximal method for minimization of convex functions We review classical results,

More information