Projected Gradient Methods for Nonlinear Complementarity Problems via Normal Maps
Recent Advances in Nonsmooth Optimization, Eds. D.-Z. Du, L. Qi and R. S. Womersley, ©1995 World Scientific Publishers

Michael C. Ferris¹, University of Wisconsin–Madison, Computer Sciences Department, Madison, WI 53706, USA

Daniel Ralph², University of Melbourne, Department of Mathematics, Melbourne, Australia

Abstract

We present a new approach to solving nonlinear complementarity problems based on the normal map and adaptations of the projected gradient algorithm. We characterize a Gauss–Newton point for nonlinear complementarity problems and show that it is sufficient to check at most two cells of the related normal manifold to determine such points. Our algorithm uses the projected gradient method on one cell and n rays to reduce the normed residual at the current point. Global convergence is shown under very weak assumptions using a property called nonstationary repulsion. A hybrid algorithm maintains global convergence, with quadratic local convergence under appropriate assumptions.

1 Introduction

The nonlinear complementarity problem is to find a vector z ∈ IR^n satisfying

    f(z) ≥ 0,  z ≥ 0,  ⟨f(z), z⟩ = 0,   (NCP)

¹ The work of this author was based on research supported by the National Science Foundation grant CCR and the Air Force Office of Scientific Research grant F.
² The work of this author was based on research partially supported by the U.S. Army Research Office through the Mathematical Sciences Institute, Cornell University, the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research under grant MS, and the Australian Research Council.
where f: IR^n → IR^n is a smooth function and all vector inequalities are taken component-wise. In this paper, we will describe an algorithm for solving nonlinear complementarity problems that is computationally based on the projected gradient algorithm, and uses a reformulation of (NCP) as a system of nonsmooth equations. The algorithm is conceptually simple to implement and has a low cost per iteration, and we demonstrate its convergence properties assuming only that f is continuously differentiable. The problem (NCP) can be reformulated using a normal map:

    0 = f_+(x) := f(x_+) + x − x_+,   (NE)

where x_+ is the Euclidean projection of x onto IR^n_+. Note that z solves (NCP) if and only if z − f(z) solves (NE), and x solves (NE) if and only if x_+ solves (NCP). Normal maps were introduced by Robinson in [32] (see also [29, 30]), and we note here simply that the formulation (NE) has some advantages over (NCP). For example, it is an equation rather than a system of inequalities and equalities, hence its examination from the viewpoint of equations may yield insight difficult to obtain otherwise. This has proven to be the case, as demonstrated by recent advances on nonsmooth Newton-like algorithms for (NE) in [5, 4, 12, 28, 34]. Nonsmoothness of the normal map, however, is the difficulty that must be faced. In fact, normal maps such as f_+ can be cast in a more general framework, where x_+ is replaced by π_C(x), the projection of x onto a nonempty closed convex set C. In this context, finding a zero of the normal map

    f_C(x) := f(π_C(x)) + x − π_C(x)

is equivalent to a nonlinear variational inequality [11] defined by the set C and the function f. In the special case where C = IR^n_+, f_C = f_+. For polyhedral C, the normal map [31, 33] f_C is intimately related to the normal manifold [32]. This manifold is constructed using the faces of the set C; it is a collection of n-dimensional polyhedral sets (called cells) which partition IR^n.
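As a quick numerical illustration of these definitions (a made-up two-dimensional f, not an example from the paper; NumPy is assumed), the normal map f_+(x) = f(x_+) + x − x_+ and the stated equivalence between (NCP) and (NE) can be checked directly:

```python
import numpy as np

def f(z):
    # A made-up smooth (affine) function, used only for illustration.
    return np.array([2*z[0] + z[1] - 1.0, z[0] + 2*z[1] + 1.0])

def normal_map(x):
    # f_+(x) = f(x_+) + x - x_+, where x_+ projects x onto IR^n_+.
    xp = np.maximum(x, 0.0)
    return f(xp) + x - xp

# z = (0.5, 0) solves (NCP) for this f: f(z) = (0, 1.5) >= 0, z >= 0,
# and <f(z), z> = 0.  Then x = z - f(z) should be a zero of f_+.
z = np.array([0.5, 0.0])
x = z - f(z)
print(normal_map(x))   # -> [0. 0.]
```

Here x = (0.5, −1.5) has x_+ = z, so the negative component records the slack in f(z) ≥ 0 while the positive component carries z itself.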
The normal map f_C is smooth in each cell of IR^n; nondifferentiability can occur only as x moves from one cell to another. A cell is sometimes called a piece of linearity. In the particular example resulting from nonlinear complementarity problems, where C = IR^n_+, the cells of the normal manifold are precisely the orthants of IR^n. Practical Newton-like methods for (NE) solve a linear or piecewise linear model based at the kth iterate, x^k, to obtain the next iterate x^{k+1}. Unfortunately, this model is not always invertible, and this creates problems for defining algorithms and in computing x^{k+1}. In this paper, we are concerned with defining practical algorithms with strong global convergence properties for finding zeros of normal maps. Our goal is to obtain convergence, at least on a subsequence, to a Gauss–Newton point for normal maps. This generalizes the familiar notion from nonlinear equation theory, where a Gauss–Newton point is a stationary point for the problem of minimizing the Euclidean norm residual of the function.
We are ultimately interested in zeros of f_C, but finding one may be on the level of difficulty of finding zeros of general nonlinear functions. We revert to considering the residual function

    θ(x) := (1/2)‖f_C(x)‖²,

which gives us a measure of the violation of satisfying f_C(x) = 0. Our aim in this paper is to develop a robust algorithm for minimizing θ that has a low cost per iteration. Note that θ is a piecewise smooth function. In order to motivate our definition of Gauss–Newton points, let us first examine the notion of a Gauss–Newton point for nonlinear equations. This corresponds to the case where C = IR^n, and f_C = f. A Gauss–Newton point for the smooth function f is a point x* ∈ IR^n such that x = x* minimizes the first-order model (1/2)‖f(x*) + ∇f(x*)(x − x*)‖² of θ(x) over IR^n. For general C, we construct a piecewise linear model of the residual function based on the directional derivative f_C′(x*; ·). There are several key ideas on which the development of this paper is based. (i) The characterization of Gauss–Newton points for normal maps requires the stationarity of the residual function with respect to every cell that contains that Gauss–Newton point. Thus, for complementarity problems, we must examine up to 2^n orthants to determine whether or not x is a Gauss–Newton point of f_+. Our first key result is to show that it is sufficient to check at most two of these cells, independent of the magnitude of n. An alternative characterization given in this paper shows that one cell and at most n rays in neighboring cells need to be examined to verify stationarity of θ (or give a descent direction). (ii) The inherent difficulty in defining an algorithm to determine a Gauss–Newton point is that one must be sure that the limit point of the algorithm is stationary for θ in each piece of smoothness (orthant) containing that limit point.
The second key idea, motivated by the characterizations above, is to apply variants of the projected gradient method [2] simultaneously to a single cell and n rays, to reduce θ. This means that the work performed by the projected gradient algorithm at each step of the Gauss–Newton method is comparable to performing just two projected gradient steps. (iii) Our algorithm depends heavily on the projected gradient method having Non-Stationary Repulsion or NSR (see Section 3). Simply stated, if an algorithm has NSR, then each nonstationary point has a neighborhood that can be visited by at most one iterate of the algorithm. The third key result is that the projected gradient algorithm and the adaptations that we use in our algorithm have NSR. This property forces our algorithm to generate a better point in a neighboring orthant if the limit point of the sequence is not stationary in such an orthant.

The paper is organized as follows. In Section 2 we define the notion of a Gauss–Newton point for f_+ and θ and prove several equivalent characterizations (Proposition 2.3). We give a testable regularity condition (Definition 2.4) that guarantees that such Gauss–Newton points are solutions of (NE). Section 3 outlines the nonstationary repulsion property and shows that any algorithm having NSR possesses strong global convergence properties (Theorem 3.2). We prove several technical results that are key to the convergence of our algorithms. A special case of these results is used to show that the projected gradient algorithm has NSR (Theorem 3.6). Section 4 contains a description of three algorithms and their convergence properties. Our main convergence result, Theorem 4.3, proves that the Gauss–Newton method we present is extremely robust: assuming only continuous differentiability of f, every limit point of the method is stationary for θ. No regularity assumptions on limit points are required. However, before proving this result, we outline a basic algorithm that can easily be shown to have NSR and hence global convergence under the same assumptions. Theorem 4.3 proves convergence of an extension to the basic algorithm that is motivated by the practical considerations of reducing the number of function and Jacobian evaluations. A Newton-based hybrid method with global and local quadratic convergence is given in Subsection 4.3. Some simple examples of the use of these algorithms conclude the paper.

There have been many other research papers devoted to solving nonlinear complementarity problems. Some of the more recent papers are mentioned below. There are several types of Newton methods for solving nonsmooth equations; see Subsection 4.3 for a brief introduction. Here we mention the following references on Newton methods for nonsmooth equations and extensions: [5, 4, 6, 12, 13, 15, 16, 20, 21, 26, 27, 28, 34]. A feature shared by "pure" Newton methods is the need for an invertible model function at the current iteration; applying the inverse of this model yields the next iterate.
However, singularities occur in many problems (for instance, see [12]), causing numerical difficulties for, or outright failure of, these methods. To circumvent the singularity problem, several Gauss–Newton techniques for solving nonlinear complementarity problems have been proposed. These can be found in the references [1, 9, 19, 23, 22, 24]. Alternative techniques can be found in [8, 10, 14, 17, 18, 36].

Most of the notation in this paper is standard. We use IR^n to denote the n-dimensional real vector space, ⟨·, ·⟩ for the inner product of two elements in this space, ‖·‖ for the associated Euclidean norm, and IB for the corresponding ball of vectors x such that ‖x‖ ≤ 1. For a differentiable function Φ: IR^n → IR^m, ∇Φ(x) ∈ IR^{m×n} represents the Jacobian of Φ evaluated at x, and ∇Φ(x)ᵀ represents the transpose of this matrix. If Φ is only directionally differentiable, we denote the directional derivative mapping at x by Φ′(x; ·). Calligraphic upper case letters in general represent sets of indices; upper case letters represent sets or operators. If C is a convex set, the normal cone to C at a point x ∈ C is

    N_C(x) := {y : ⟨y, c − x⟩ ≤ 0, ∀c ∈ C}.
The tangent cone at x ∈ C is defined by T_C(x) := N_C(x)°, where for a given convex cone K the polar cone is defined by

    K° := {y : ⟨y, k⟩ ≤ 0, ∀k ∈ K}.

Both the tangent and normal cones are empty at points x ∉ C. The Euclidean projection of x onto the set C is represented by π_C(x). A function Φ: C → IR^m is C¹ (continuously differentiable) if it is differentiable in the relative interior of C and, for each sequence {x^k} in the relative interior of C that converges (to a general point of C), {∇Φ(x^k)} is also convergent. If C is a polyhedral convex set and F is a face of C, then N_C(x) is the same set for every x in the relative interior of F [32]. We call this set N_C(F). A facet of C is a face that has dimension 1 less than C. Further definitions from convex analysis can be found in [35]. We may abuse notation, when there is no possibility of confusion, by writing θ_O instead of θ|_O to mean the restriction of θ to an orthant O (to be distinguished from a normal map involving O). Finally, throughout the paper the function f: C → IR^m is assumed to be C¹, and usually C = IR^n_+.

2 Gauss–Newton Points and Regularity

As we outlined in the introduction, a Gauss–Newton point for the smooth function f is a point x* ∈ IR^n such that x = x* minimizes the first-order model (1/2)‖f(x*) + ∇f(x*)(x − x*)‖² of θ(x) = (1/2)‖f(x)‖² over IR^n. Equivalently, x* is a stationary point of θ, that is, ∇θ(x*) = ∇f(x*)ᵀf(x*) = 0. Note again that for the remainder of this paper we assume that f is continuously differentiable on its domain (C or IR^n_+). In the general case, we approximate the normal map f_C(x) by the piecewise linear model f_C(x*) + f_C′(x*; x − x*), where the directional derivative f_C′(x*; ·) is a piecewise linear map. We can now define the notion of a Gauss–Newton point of f_C, which is based on this directional derivative.

Definition 2.1 Let x* ∈ IR^n.
We say x* is a Gauss–Newton point for f_C if x = x* solves the problem

    min_x (1/2)‖f_C(x*) + f_C′(x*; x − x*)‖².   (1)

Equivalently, x* is a Gauss–Newton point if

    (1/2)‖f_C(x*)‖² ≤ (1/2)‖f_C(x*) + f_C′(x*; x − x*)‖², ∀x ∈ IR^n.

For the remainder of this paper we will consider only the special case of nonlinear complementarity problems, where C = IR^n_+. However, many of the results have analogues in the general polyhedral case.
2.1 Gauss–Newton points of complementarity problems

Using Definition 2.1, we see that x is a Gauss–Newton point of f_+ if it solves (1) with f_C = f_+. To understand this more fully, we now investigate the directional derivative f_+′ in more detail. We can easily calculate the directional derivative of the function x_+ at x in the direction d: it is the vector x_+′(d) in IR^n whose ith component is given by

    [x_+′(d)]_i = d_i if x_i > 0;  (d_i)_+ if x_i = 0;  0 if x_i < 0.

In fact, x_+′(d) is exactly the projection of d onto the critical cone of IR^n_+ at x, K(x). This critical cone is the Cartesian product of n intervals in IR, the ith interval being

    K_i = IR if x_i > 0;  IR_+ if x_i = 0;  {0} if x_i < 0.

Since f is continuously differentiable, f_+ is directionally differentiable: for x, d ∈ IR^n,

    f_+′(x; d) = ∇f(x_+)π_K(d) + d − π_K(d),

where the notation K = K(x) is used. As a function of d, the mapping on the right is exactly the normal map induced by the matrix ∇f(x_+) and the convex cone K, so

    f_+′(x; d) = (∇f(x_+))_K(d).

As mentioned above, the difficulty in determining whether a point x is a Gauss–Newton point is that we must examine potentially exponentially many pieces of smoothness of f_+, or pieces of linearity of (∇f(x_+))_K. In fact, the number of pieces of linearity of (∇f(x_+))_K is the number of orthants containing x, and is given by 2^m, where m is the number of components of x equal to zero. The next result removes this difficulty by showing that at most two pieces of linearity need to be considered. We introduce some notation. Given an orthant O, let H_i be the half-line ±IR_+, i = 1, …, n, such that

    O = H_1 × … × H_n.

The complement of O at a point x ∈ O is the orthant Õ given as the Cartesian product of half-lines H̃_i, where

    H̃_i = H_i if x_i ≠ 0,  −H_i if x_i = 0.

It may seem odd that the complement of O at an interior point x is O itself.
This is actually quite natural in the context of stationary points of θ, because θ is differentiable at each interior point x of an orthant; hence the question of stationarity of θ at x is independent of other orthants. We next introduce the formal definition of a stationary point.
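As an aside, the componentwise formula for x_+′(d) and the construction of the complement orthant above can be transcribed directly. The sketch below is illustrative only (NumPy assumed; the function names are ours, and an orthant is encoded by a ±1 sign vector):

```python
import numpy as np

def proj_deriv(x, d):
    # [x'_+(d)]_i = d_i if x_i > 0; (d_i)_+ if x_i = 0; 0 if x_i < 0.
    # This is the projection of d onto the critical cone K(x).
    return np.where(x > 0, d, np.where(x == 0, np.maximum(d, 0.0), 0.0))

def complement_orthant(signs, x):
    # The complement of O at x flips the half-lines of the coordinates
    # where x_i = 0 and keeps all the others.
    return np.where(x == 0, -signs, signs)

x = np.array([2.0, 0.0, -1.0])
d = np.array([1.0, -3.0, 5.0])
print(proj_deriv(x, d))                             # -> [1. 0. 0.]
print(complement_orthant(np.array([1, 1, -1]), x))  # flips only index 1
```

Note that if x has no zero components, `complement_orthant` returns the same sign vector, matching the observation that the complement of O at an interior point is O itself.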
Definition 2.2 If θ is directionally differentiable and C is a nonempty convex set, then x* is a stationary point for min_{x∈C} θ(x) if

    θ′(x*; d) ≥ 0, ∀d ∈ T_C(x*).

Note that if C = IR^n, then a stationary point satisfies θ′(x*; d) ≥ 0 for all d ∈ IR^n.

Proposition 2.3 Given x* ∈ IR^n, let K be the critical cone to IR^n_+ at x*, O be any orthant containing x*, and Õ be the complement of O at x*. Suppose f is continuously differentiable; then the function θ, defined by

    θ(x) := (1/2)‖f_+(x)‖²,

is directionally differentiable and

    θ′(x*; d) = ⟨f_+(x*), f_+′(x*; d)⟩, ∀d ∈ IR^n.   (2)

The following statements are equivalent:

1. x* is a Gauss–Newton point of f_+.
2. x* is a stationary point of min{θ(x): x ∈ IR^n}.
3. 0 ∈ ∇f(x*_+)ᵀf_+(x*) + K° and 0 ∈ f_+(x*) + K.
4. x* is stationary for both min{θ(x): x ∈ O} and min{θ(x): x ∈ Õ}.
5. x* is stationary for min{θ(x): x ∈ O} and for each 1-dimensional problem min{θ(x): x ∈ x* + N_O(F)}, where F is a facet of O containing x*.

Proof If statement 1 holds, then we define

    Γ(x) := (1/2)‖f_+(x*) + f_+′(x*; x − x*)‖²,

and note that Γ′(x*; h) = θ′(x*; h) for all h. Since x* is a Gauss–Newton point, it follows that θ′(x*; d) + o(d) ≥ 0 for all d, and hence that statement 2 holds by positive homogeneity. Conversely, if statement 2 holds, then for all d and λ > 0, 0 ≤ λ⟨f_+(x*), f_+′(x*; d)⟩, so that

    Γ(x* + λd) = (1/2)‖f_+(x*) + λf_+′(x*; d)‖²
               = (1/2)‖f_+(x*)‖² + λ⟨f_+(x*), f_+′(x*; d)⟩ + (λ²/2)‖f_+′(x*; d)‖²
               ≥ Γ(x*).
Hence statement 1 holds. Statement 2 means that ⟨f_+(x*), (∇f(x*_+))_K(d)⟩ ≥ 0 for all d ∈ IR^n. If d ∈ K, then (∇f(x*_+))_K(d) = ∇f(x*_+)d, so that ⟨f_+(x*), ∇f(x*_+)k⟩ ≥ 0 for all k ∈ K. Similarly, ⟨f_+(x*), ν⟩ ≥ 0 for all ν ∈ K°. This is exactly statement 3. Conversely, let d ∈ IR^n, and recall from the Moreau decomposition that d = k + ν, where k = π_K(d) and ν = π_{K°}(d). Using statement 3,

    ⟨f_+(x*), (∇f(x*_+))_K(d)⟩ = ⟨f_+(x*), ∇f(x*_+)k⟩ + ⟨f_+(x*), ν⟩ ≥ 0.

Thus statement 2 holds. Clearly statement 2 implies statement 4. Suppose statement 4 holds. Consider a facet F of O containing x*. There is a unique index 1 ≤ i ≤ n such that neither e_i nor −e_i lies in F, where e_i is the vector in IR^n with component i equal to 1 and all other components equal to zero. Choose s = ±1 such that se_i ∉ O; then N_O(F) = {λse_i : λ ≥ 0}. Note further that se_i ∈ Õ. Thus statement 4 implies statement 5. Suppose statement 5 holds and consider e = ±e_i for any index i. If x*_i ≠ 0, then stationarity of x* for min{θ(x): x ∈ O} yields that θ′(x*; e) ≥ 0. If x*_i = 0, then either e ∈ O or e ∈ N_O(F) for some facet F of O containing x*. Therefore θ′(x*; e) ≥ 0. It follows by linearity of θ′(x*; ·) on each orthant that θ′(x*; d) ≥ 0 for each d in each orthant, hence for d ∈ IR^n. This is statement 2.

The proof of the equivalence between statements 1, 2 and 3 in Proposition 2.3 can be immediately adapted to the case of a general polyhedral set C, with K then representing the critical cone to C at the point x*.

2.2 Regularity

We now turn to the question of when a Gauss–Newton point for f_+ is a solution of f_+(x) = 0. This is commonly called regularity, and we introduce a notion of regularity that is pertinent to our Gauss–Newton formulation. Recall from Proposition 2.3 that x is a Gauss–Newton point if and only if

    f_+(x) ∈ −K,  ∇f(x_+)ᵀf_+(x) ∈ −K°,

where K is the critical cone to IR^n_+ at x.
A simple regularity condition would be

    f_+(x) ∈ −K, ∇f(x_+)ᵀf_+(x) ∈ −K° ⟹ f_+(x) = 0.

However, this condition is difficult to verify in most practical instances. In order to generate a more testable notion of regularity, we follow the development of Moré [19]. Here, f_+(x) is replaced by a general vector z, and extra conditions that
are satisfied by f_+(x) are used to weaken the regularity assumption. Thus we define

    P := {i : x_i > 0, [f_+(x)]_i > 0},
    N := {i : x_i > 0, [f_+(x)]_i < 0},
    C := {i : [f_+(x)]_i = 0},

and we note that [f_+(x)]_P > 0, [f_+(x)]_N < 0 and [f_+(x)]_C = 0.

Definition 2.4 A point x ∈ IR^n is said to be regular if the only z satisfying

    z ∈ −K,  ∇f(x_+)ᵀz ∈ −K°,  z_P ≥ 0,  z_N ≤ 0,  z_C = 0

is z = 0.

This condition is closely related to [19, Definition 3.1]. This is because x is regular if and only if

    z ≠ 0, z ∈ −K, z_P ≥ 0, z_N ≤ 0, z_C = 0 ⟹ ∇f(x_+)ᵀz ∉ −K°,

and the condition on the right is equivalent to the existence of p ∈ −K such that zᵀ∇f(x_+)p > 0. In contrast to [19, 22], the point x is not constrained to be nonnegative. Using Definition 2.4, we can prove the following result.

Lemma 2.5 x is a regular stationary point for θ if and only if x solves (NE).

Proof If f_+(x) = 0, then C = {1, …, n}, so z = z_C = 0. Further, using (2), x is stationary for θ. Conversely, if x is stationary, then Proposition 2.3 shows that z = f_+(x) satisfies all the relations required in the definition of regularity, and hence f_+(x) = 0.

We turn to the question of testing whether a point x is regular. [19, 22, 28] give several conditions on the Jacobian of f to ensure that x is regular in the sense defined in the corresponding paper. For brevity we only discuss the s-regularity condition of Pang and Gabriel [22], and do not repeat definitions here. Moré [19] argues that s-regularity is stronger than his regularity condition; a similar comparison between Definition 2.4 and s-regularity can be made. Here we make a new observation about s-regularity. To explain this, recall that the goal of [22] is to solve 0 = Θ(x) := (1/2)‖min{f(x), x}‖², where the min is taken component-wise; a solution of this equation solves (NCP) and vice versa.
If x is nonstationary for Θ, then s-regularity of x ensures that for some direction y ∈ IR^n and all x′ near x, y is a (strict) descent direction for Θ at x′, i.e. Θ′(x′; y) < 0; see [22, Lemmas 2, 6 and 7]. However, Θ (like θ) is only piecewise smooth, and may have a nonstationary point which is a local minimum of some piece of smoothness of Θ, contradicting the existence of such a direction y. So s-regularity is too strong in the context of this investigation.
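Returning to Definition 2.4, the index sets P, N and C are cheap to form. The helper below is an illustrative transcription (hypothetical names, NumPy arrays, zero-based indices):

```python
import numpy as np

def regularity_index_sets(x, fplus_x):
    # P = {i : x_i > 0, [f_+(x)]_i > 0},
    # N = {i : x_i > 0, [f_+(x)]_i < 0},
    # C = {i : [f_+(x)]_i = 0},  as in Definition 2.4.
    n = len(x)
    P = [i for i in range(n) if x[i] > 0 and fplus_x[i] > 0]
    N = [i for i in range(n) if x[i] > 0 and fplus_x[i] < 0]
    C = [i for i in range(n) if fplus_x[i] == 0]
    return P, N, C

x = np.array([1.0, 2.0, -1.0, 0.5])
fplus_x = np.array([3.0, -1.0, 2.0, 0.0])
print(regularity_index_sets(x, fplus_x))   # -> ([0], [1], [3])
```

With floating-point data one would in practice replace the exact comparisons by tolerance tests, but the exact version matches the definition as stated.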
In what follows, we give conditions that ensure x to be regular in the sense of Definition 2.4. These results are proven by adapting arguments from Moré [19]. A key construct in the results is the matrix

    J(x) := T⁻¹∇f(x_+)T⁻¹,  where T = diag(t_i), t_i = −1 if i ∈ P, 1 if i ∉ P.

T is chosen so that every component of z̃ := Tz is nonnegative. Under this transformation, x is regular if

    0 ≠ z̃ ≥ 0, z̃_C = 0, z̃ ∈ TK ⟹ ∃p̃ ∈ TK, z̃ᵀJ(x)p̃ > 0,

where Δ := {i : [f_+(x)]_i ≠ 0, x_i ≥ 0}. Note that z̃_i = 0 when i ∉ Δ. The results we now give impose conditions on J(x) to guarantee regularity. We note that A ∈ IR^{n×n} is an S-matrix if there is an x > 0 with Ax > 0; see [3].

Theorem 2.6 Let J(x) = T⁻¹∇f(x_+)T⁻¹. If [J(x)]_EE is an S-matrix for some index set E with Δ ⊆ E ⊆ {i : x_i ≥ 0}, then x is regular.

Proof Since [J(x)]_EE is an S-matrix, there is some p̃_E > 0 such that [J(x)]_EE p̃_E > 0. Let p̃ be the vector in IR^n obtained by setting the other elements to zero, so that [J(x)p̃]_E > 0. Now 0 ≠ z̃ ≥ 0 and Δ ⊆ E, so z̃ᵀJ(x)p̃ > 0. Also, TK is the Cartesian product of

    (TK)_i = IR if x_i > 0;  IR_+ if x_i = 0;  {0} if x_i < 0;  i = 1, …, n.   (3)

Thus p̃ ∈ TK, and hence x is regular.

A is a P-matrix if all its principal minors are positive. P-matrices are S-matrices [3, Corollary 3.3.5]. The following corollary is now immediate.

Corollary 2.7 If [∇f(x_+)]_ΔΔ is a P-matrix, then x is regular.

Proof The hypotheses imply that [J(x)]_ΔΔ is a P-matrix and hence an S-matrix.

To complete our discussion of tests for regularity, we give the following result. Recall that if A is partitioned in the form

    A = [ A_NN  A_NM ]
        [ A_MN  A_MM ],

and the matrix A_NN is nonsingular, then (A\A_NN) := A_MM − A_MN A_NN⁻¹ A_NM is called the Schur complement of A_NN in A. The proof of the following result is modeled after [19, Corollary 4.6].
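As an aside before that result: Corollary 2.7 reduces a regularity check to a P-matrix test, and for small dimensions the P-matrix property can be tested directly from its definition (all principal minors positive). The brute-force sketch below is illustrative only and exponential in n; it is not a test from the paper:

```python
import numpy as np
from itertools import combinations

def is_P_matrix(A):
    # A is a P-matrix iff every principal minor is positive.
    n = A.shape[0]
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            if np.linalg.det(A[np.ix_(idx, idx)]) <= 0:
                return False
    return True

# Every P-matrix is an S-matrix (there is p > 0 with Ap > 0), so this
# check also certifies the S-matrix hypothesis of Theorem 2.6.
A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # positive definite, hence P
B = np.array([[0.0, 1.0], [1.0, 0.0]])     # has zero principal minors
print(is_P_matrix(A))   # -> True
print(is_P_matrix(B))   # -> False
```

For larger problems one would use a structured test rather than enumerating all 2ⁿ − 1 principal submatrices.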
Theorem 2.8 If [∇f(x_+)]_NN is nonsingular and the Schur complement of [∇f(x_+)]_NN in [J(x)]_ΔΔ is an S-matrix, then x is regular.

Proof Let A = [J(x)]_ΔΔ and partition A into

    [ A_NN  A_NM ]
    [ A_MN  A_MM ],

where A_NN = [∇f(x_+)]_NN and M := Δ \ N. We construct p̃_N, p̃_M such that

    [J(x)]_ΔΔ [p̃_N; p̃_M] > 0.   (4)

Let a > 0; then p̃_N, p̃_M solve

    [ A_NN  A_NM ] [p̃_N]   [ a ]
    [ A_MN  A_MM ] [p̃_M] = [ q ]

if and only if p̃_N, p̃_M solve

    [ A_NN  A_NM      ] [p̃_N]   [ a                 ]
    [ 0     (A\A_NN)  ] [p̃_M] = [ q − A_MN A_NN⁻¹ a ].

Since (A\A_NN) is an S-matrix by assumption, there exists p̃_M > 0 with (A\A_NN)p̃_M > 0. Multiplying p̃_M by an appropriately large number gives (A\A_NN)p̃_M + A_MN A_NN⁻¹a > 0. It follows that q := (A\A_NN)p̃_M + A_MN A_NN⁻¹a > 0, and taking p̃_N = A_NN⁻¹(a − A_NM p̃_M) implies (4). Let p̃ ∈ IR^n be the vector constructed from p̃_M and p̃_N by adding appropriate zeros. Then it is easy to see that p̃ ∈ TK; see (3). Furthermore, z̃ᵀJ(x)p̃ = z̃_Δᵀ[J(x)p̃]_Δ > 0. Hence x is regular.

Note that [5, 12, 28] all assume that [∇f(x_+)]_EE is nonsingular and

    ([∇f(x_+)]_LL \ [∇f(x_+)]_EE) is a P-matrix.   (5)

Here E := {i : x_i > 0} contains N, and L := {i : x_i ≥ 0} contains Δ. Theorem 2.8 requires the nonsingularity of a smaller matrix and a weaker assumption on the Schur complement. However, (5) guarantees regularity in the sense of Definition 2.4, as we now show.

Lemma 2.9 If (5) holds or, equivalently, the B-derivative f_+′(x; ·) is invertible, then

    z ∈ −K, ∇f(x_+)ᵀz ∈ −K° ⟹ z = 0,

and so x is regular.
Proof The equivalence between (5) and the existence of a Lipschitz inverse of f_+′(x; ·) is given by [28, Proposition 12]. Since all piecewise linear functions are Lipschitz, the claimed equivalence holds. Suppose z ∈ −K and ∇f(x_+)ᵀz ∈ −K°. It follows that z_i = 0, i ∉ L. Also, ∇f(x_+)ᵀz ∈ −K° implies

    [∇f(x_+)ᵀz]_E = 0,  [∇f(x_+)ᵀz]_M ≥ 0,

where M := L \ E. Using the invertibility assumption from (5) and z ∈ −K again, we see that

    ([∇f(x_+)ᵀ]_MM − [∇f(x_+)ᵀ]_ME [∇f(x_+)ᵀ]_EE⁻¹ [∇f(x_+)ᵀ]_EM) z_M ≥ 0,  z_M ≤ 0.

The Schur complement is a P-matrix, and hence z = 0 follows from [3, Theorem 3.3.4].

3 Nonstationary Repulsion (NSR) of the Projected Gradient Method

Let Ω be a nonempty closed convex set in IR^n and ψ: Ω → IR be C¹. (We are thinking of Ω being an orthant and ψ = θ|_Ω.) We paraphrase the description of the projected gradient (PG) algorithm given by Calamai and Moré [2] for the problem

    min_{x∈Ω} ψ(x).   (6)

For any λ > 0, the first-order necessary condition for x to be a local minimizer of this problem is that

    π_Ω(x − λ∇ψ(x)) = x.

When x^k ∈ Ω is nonstationary, a step length α_k > 0 is chosen by searching the path

    x^k(α) := π_Ω(x^k − α∇ψ(x^k)),  α > 0.

Given constants γ₁ > 0, γ₂ ∈ (0, 1), and μ₁ and μ₂ with 0 < μ₁ ≤ μ₂ < 1, the step length α_k must satisfy

    ψ(x^k(α_k)) ≤ ψ(x^k) + μ₁⟨∇ψ(x^k), x^k(α_k) − x^k⟩   (7)

and

    α_k ≥ γ₁  or  α_k ≥ γ₂ᾱ_k > 0,   (8)

where ᾱ_k satisfies

    ψ(x^k(ᾱ_k)) > ψ(x^k) + μ₂⟨∇ψ(x^k), x^k(ᾱ_k) − x^k⟩.   (9)
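As a rough illustration of one PG step (a sketch, not the authors' implementation), a backtracking search along the projection path enforces the sufficient-decrease condition (7); since μ₂ ≥ μ₁, the last rejected trial step can serve as the auxiliary ᾱ_k in (9), and the accepted step then satisfies (8) with γ₁ the initial trial and γ₂ the backtracking factor. NumPy and all function names here are our assumptions:

```python
import numpy as np

def pg_step(grad, theta, proj, x, mu1=1e-4, gamma1=1.0, gamma2=0.5):
    # One projected-gradient step: backtrack along the path
    # x(alpha) = proj(x - alpha * grad(x)), starting from alpha = gamma1
    # and shrinking by the factor gamma2 until condition (7) holds.
    g = grad(x)
    alpha = gamma1
    while True:
        xa = proj(x - alpha * g)
        if theta(xa) <= theta(x) + mu1 * g.dot(xa - x):
            return xa, alpha
        alpha *= gamma2

# Illustration on a simple smooth function over the nonnegative orthant
# (a made-up example, not one from the paper):
theta = lambda x: 0.5 * np.sum((x - np.array([1.0, -2.0]))**2)
grad = lambda x: x - np.array([1.0, -2.0])
proj = lambda x: np.maximum(x, 0.0)

x = np.array([3.0, 3.0])
for _ in range(30):
    x, _ = pg_step(grad, theta, proj, x)
print(np.round(x, 6))   # -> [1. 0.], the constrained minimizer
```

The backtracking loop terminates for any C¹ function, because (7) holds for all sufficiently small α along the projection path.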
Condition (7) forces α_k not to be too large; it is the analogue of the condition used in the standard Armijo line search for unconstrained optimization. Condition (8) forces α_k not to be too small; in the case that α_k < γ₁, this requirement is the analogue of the standard Wolfe–Goldstein [7] condition from unconstrained optimization. The PG method is a feasible point algorithm in that it requires a starting point x⁰ in Ω and produces a sequence of iterates {x^k} ⊆ Ω. It is also monotonic; that is, if x^k ∈ Ω is nonstationary, then ψ(x^{k+1}) < ψ(x^k). We claim that the PG method has NSR:

Definition 3.1 An iterative feasible point algorithm for (6) has nonstationary repulsion (NSR) if for each nonstationary x̄ ∈ Ω, there exists a neighborhood V of x̄ such that if any iterate x^k lies in V ∩ Ω, then ψ(x^{k+1}) < ψ(x̄).

The fact that the steepest descent method, i.e. the PG method when Ω = IR^n, has NSR is easy to see. Also, Polak [25, Chapter 1] discusses a general descent property that is similar to NSR and provides convergence results like Theorem 3.2 below. It is trivial but important that NSR yields strong global convergence properties:

Theorem 3.2 Suppose A is a monotonic feasible point algorithm for (6) with NSR. Let x⁰ ∈ Ω.

1. Any limit point of the sequence generated by A is stationary.
2. Let B be any monotonic feasible point algorithm for (6). Suppose {x^k} is a sequence defined by applying either A or B to each x^k. Then any limit point of {x^k}_{k∈K} is stationary if A is applied infinitely many times, where K = {k : x^{k+1} is generated by A}.

Proof 1. This is a corollary of part 2 of the theorem.

2. Let x̄ ∈ Ω be nonstationary for (6) and K have infinite cardinality. NSR gives ε > 0 such that ψ(x^{k+1}) < ψ(x̄) if x^k ∈ (x̄ + εIB) ∩ Ω. If the subsequence {x^k}_K does not intersect (x̄ + εIB) ∩ Ω, then x̄ is not a limit point of this subsequence. So we assume that x^k ∈ (x̄ + εIB) ∩ Ω for some k ∈ K, hence ψ(x^{k+1}) < ψ(x̄).
By continuity of ψ there is ε₁ ∈ (0, ε) such that ψ(x) > ψ(x^{k+1}) if x ∈ (x̄ + ε₁IB) ∩ Ω. By monotonicity of A and B, ψ(x^{k+j}) ≤ ψ(x^{k+1}) for each j ≥ 1; hence x̄ is not a limit point of {x^{k+j}}_{j≥1}, or of {x^k}_K.

Of course, NSR is not a guarantee of convergence. To guarantee existence of a limit point of a sequence produced by a method with NSR we need additional knowledge, for instance boundedness of the lower level set

    {x : ψ(x) ≤ ψ(x⁰)},
14 70 M. C. Ferris and. Ralph where x 0 is the starting iterate. To prove that the PG method has NSR we need to establish that the rate of descent obtained along the path x() = (x r(x)) is uniform for feasible x in a neighborhood of a given nonstationary x 2. The lemma below states a uniform descent property for all small perturbations about a given function ; the reader may consider = for simplicity. In the case where many functions are present we use the notation enition 3.3 x () def = (x r(x)): 1. Let x 2 IR n and > 0. If :! IR is C 1, the modulus of continuity of r at x 2 is the function of and > 0 (and x, )!(; ) def = sup fkr(y) r(x)k : x; y 2 ; kx xk ; kx yk g : 2. Let x 2 IR n, > 0, and : IR n! IR be C 1. Given > 0, let U() = U(; ; x; ) be the set of all C 1 functions :! IR n such that sup n j(x) (x)j + r(x) r (x) : x 2 (x + IB) \ o < ; (10) and!(; ) (1 + )!( ; ); 8 2 (0; ): (11) Lemma 3.4 Let : IRn! IR be C 1, > 0 and x 2 be nonstationary for min n o (x): x 2. There exist positive constants and such that for each 2 U() = U(; ; x; ), x 2 (x + IB) \, and 0, hr(x); x () xi minf; g: (12) Proof Let :! IR be C 1, x 2 and > 0. According to [2, (2.4)], hr(x); x () xi kx () xk 2 =; 8x 2 ; > 0: Moreover, [2, Lemma 2.2] says that, as a function of > 0, kx () xk = is antitone (nonincreasing); in particular for any > 0, kx () xk = kx () xk =; 8 2 (0; ): Using this with the previous inequality, we deduce for any > 0 that hr(x); x () xi (kx () xk =) 2 ; 8x 2 ; > 0 (kx () xk =) 2 ; 8x 2 ; 0 < < : (13)
Fix ᾱ > 0. By hypothesis, the point x̄ is such that ‖x̄_ψ(ᾱ) − x̄‖ > 0. Also by (10), if x → x̄ and φ → ψ, where convergence of φ means that φ ∈ U(ε) and ε ↓ 0, then x_φ(ᾱ) converges to x̄_ψ(ᾱ). Hence there are ε > 0, σ > 0 such that for φ ∈ U(ε) and x ∈ x̄ + εIB,

    ‖x_φ(ᾱ) − x‖ ≥ ᾱ√σ.

Together with (13), this yields

    ⟨∇φ(x), x_φ(α) − x⟩ ≤ −σα, ∀x ∈ (x̄ + εIB) ∩ Ω, α ∈ [0, ᾱ],

so that (12) holds for α ≤ ᾱ. Using the well-known antitone property of −⟨∇φ(x), x_φ(α) − x⟩ in α ≥ 0, see [2, (2.6)], we see that (12) also holds for α > ᾱ.

The following result gives some technical properties of the PG method that will be important for our main algorithm. We use it later to prove that the PG method has NSR, though in this case NSR follows from the simpler case in which φ is the fixed function ψ.

Proposition 3.5 Let ψ: IR^n → IR be C¹, Λ > 0 and x̄ ∈ Ω be nonstationary for min{ψ(x): x ∈ Ω}. Then there are positive constants ε₁ and ᾱ such that for each x ∈ (x̄ + ε₁IB) ∩ Ω and φ ∈ U(ε₁) = U(ε₁; ψ, x̄, Λ):

1. For each α ∈ [0, ᾱ],

    φ(x_φ(α)) ≤ φ(x) + μ₁⟨∇φ(x), x_φ(α) − x⟩.

2. One step of PG on (6) from x generates x_φ(α) with α ≥ ᾱ.

Proof Suppose ψ, x̄ and Λ are as stated. Let γ₁ > 0, γ₂ ∈ (0, 1) and 0 < μ₁ ≤ μ₂ < 1 be the constants of the PG method. Let ε₁ and σ be given by Lemma 3.4, and φ ∈ U(ε₁); and assume without loss of generality that ε₁ ∈ (0, Λ], i.e. (10) and (11) hold with ε = ε₁. We estimate the error term

    E(φ; x, y) := φ(y) − φ(x) − ⟨∇φ(x), y − x⟩,

where y, x ∈ Ω. By choice of φ ∈ U(ε₁), specifically (11) with ε = ε₁, for each δ ∈ (0, ε₁), x ∈ (x̄ + ε₁IB) ∩ Ω and y ∈ (x + δIB) ∩ Ω,

    ‖∇φ(x) − ∇φ(y)‖ ≤ (1 + ε₁)ω_ψ(Λ, δ).
Thus, for x ∈ (x̄ + ε₁IB) ∩ Ω and y ∈ (x + ε₁IB) ∩ Ω,

    |E(φ; y, x)| = |∫₀¹ ⟨∇φ(x + t(y − x)) − ∇φ(x), y − x⟩ dt|
                 ≤ (1 + ε₁)ω_ψ(Λ, ‖x − y‖)‖y − x‖.   (14)

By continuity of ∇ψ on the compact set (x̄ + ΛIB) ∩ Ω, there is a finite upper bound Γ on ‖∇ψ(x)‖ for x ∈ (x̄ + ΛIB) ∩ Ω. Define Γ̄ := Γ + ε₁; by choice of φ ∈ U(ε₁), specifically (10) with ε = ε₁, ‖∇φ(x)‖ ≤ Γ̄ for x ∈ (x̄ + ε₁IB) ∩ Ω. It follows for such x and any α ≥ 0 that

    ‖x_φ(α) − x‖ ≤ αΓ̄,   (15)

because π_Ω is Lipschitz of modulus 1. Furthermore, since ∇ψ is uniformly continuous on compact sets, ω_ψ(Λ, δ) ↓ 0 as δ ↓ 0. Thus, using the fact that ω_ψ(Λ, ·) is nondecreasing, there exists ε₂ ∈ (0, ε₁) such that for x ∈ (x̄ + ε₁IB) ∩ Ω and α ∈ (0, ε₂), both αΓ̄ ≤ ε₁ and

    (1 + ε₁)ω_ψ(Λ, αΓ̄)Γ̄ ≤ σ(1 − μ₂).

From these inequalities and the inequalities (14) and (15), we see that for x ∈ (x̄ + ε₁IB) ∩ Ω and α ∈ (0, ε₂),

    E(φ; x_φ(α), x) ≤ σα(1 − μ₂).   (16)

Now for such x and α,

    φ(x_φ(α)) = φ(x) + [μ₂ + (1 − μ₂)]⟨∇φ(x), x_φ(α) − x⟩ + E(φ; x_φ(α), x)
              ≤ φ(x) + μ₂⟨∇φ(x), x_φ(α) − x⟩ − (1 − μ₂)σα + (1 − μ₂)σα,

where the second inequality relies on the uniform descent property of Lemma 3.4 and (16). Thus

    φ(x_φ(α)) ≤ φ(x) + μ₂⟨∇φ(x), x_φ(α) − x⟩, ∀x ∈ (x̄ + ε₁IB) ∩ Ω, α ∈ (0, ε₂),

and this inequality with μ₁ replacing μ₂ also holds. Finally, for any x^k ∈ (x̄ + ε₁IB) ∩ Ω, the auxiliary scalar ᾱ_k satisfying (9) is bounded below by ε₂; hence the step size α_k is bounded below by ᾱ := min{γ₁, γ₂ε₂}. Since 0 < γ₂ < 1, ᾱ ≤ ε₂ < ε₁, and parts 1 and 2 of the proposition hold.

Theorem 3.6 The PG method applied to (6) has NSR.
Proof Let $\bar x \in I\!R^n$ be nonstationary, so according to Proposition 3.5 and Lemma 3.4, if $x^k \in (\bar x + \delta I\!B) \cap \Omega$ then $\tau_k \ge \bar\tau$ and
$$ \phi(x^{k+1}) \le \phi(x^k) + \sigma_1 \langle \nabla\phi(x^k),\, x^k(\tau_k) - x^k \rangle \le \phi(x^k) - 2\nu, \quad (17) $$
where $\nu = \sigma_1 \mu \bar\tau / 2 > 0$. Now by continuity of $\phi$ there is $\delta' \in (0, \delta)$ such that
$$ |\phi(x) - \phi(\bar x)| \le \nu, \qquad \forall x \in (\bar x + \delta' I\!B) \cap \Omega. $$
Take $V := (\bar x + \delta' I\!B) \cap \Omega$, and use the above inequality with (17) to see that for any $x^k \in V$,
$$ \phi(x^{k+1}) \le \phi(\bar x) - \nu. $$
The NSR property of Definition 3.1 follows.

4 Projected Gradient Algorithms for NCP

Our main goal here is to present a method for minimizing $\theta$ that has a low computational cost, and has NSR. Before proceeding we make a few comments on guaranteeing convergence, at least on a subsequence. Existence of a (stationary) limit point of a sequence produced by a method with NSR follows from boundedness of the lower level set
$$ \{ x \in I\!R^n : \|f_+(x)\| \le \|f_+(x^0)\| \}, $$
where $x^0$ is the initial point. This boundedness property holds in many cases, for instance if $f$ is a uniform P-function, see Harker and Xiao [12]; hence it holds if $f$ is strongly monotone. However, the uniform P-function property implies that $f_+'(x; \cdot)$ is invertible for each $x$, a condition that we believe is too strong in general (cf. Lemma 2.9). A weaker condition yielding boundedness of the above level set is that $f_+$ is proper, namely that the inverse image $f_+^{-1}(S)$ of any compact set $S \subset I\!R^n$ is compact.

4.1 A simple globally convergent algorithm

Given statement 4 of Proposition 2.3, it is tempting to use the following steepest descent idea in algorithms for minimizing $\theta$. Given the $k$th iterate $x^k \in I\!R^n$, an orthant $O^k$ containing $x^k$ and the complement $\tilde O^k$ of $O^k$ at $x^k$, let $d^k$ solve
$$ \min_d\ \theta'(x^k; d) \quad \text{subject to}\quad \|d\| \le 1,\ d \in O^k \cup \tilde O^k. $$
This essentially requires two $n$-dimensional convex quadratic programs to be solved (a polyhedral norm on $d$ may be used), one for each orthant. If $d = 0$ is a solution,
then $x^k$ is stationary for $\theta$. Otherwise $\theta'(x^k; d^k) < 0$, and we can perform a line search to establish $\tau_k > 0$ such that for $x^{k+1} = x^k + \tau_k d^k$, $\theta(x^k + \tau_k d^k)$ is strictly less than $\theta(x^k)$. However, if $\theta$ is nonsmooth there seems to be little global convergence theory for algorithms based on this idea. For instance, it is not known whether the step length $\tau_k$ can be chosen to be uniformly large in a neighborhood of a nonstationary point while still retaining a certain rate of descent; hence it is hard to show that the sequence produced will not accumulate at a nonstationary point. Pang, Han and Rangaraj [23, Corollary 1] give an additional smoothness assumption at a limit point that is required to prove stationarity.

Alternatively, given the stationarity characterization of Proposition 2.3.5, we can design a naive steepest descent algorithm for minimizing $\theta$, each iteration of which is based on a projected gradient step over an orthant $O^k$ containing the current iterate $x^k$, and an additional $m$ projected gradient steps on 1-dimensional problems corresponding to moving in directions normal to the $m$ facets of $O^k$ that contain $x^k$ (so $m$ is the number of zero components of $x^k$). It is significant that to obtain global convergence, we only need to increase the number of 1-dimensional subproblems at each iteration from $m$ to $n$, i.e. normals to all facets of $O^k$ must be examined. The algorithm below introduces notation not strictly required for its statement; this notation is presented in preparation for the main algorithm, Algorithm 2, which appears in the next subsection. By $\theta_O$ we mean the restriction $\theta|_O$ of $\theta$ to $O$.

Algorithm 1. Let $x^0 \in I\!R^n$. Given $k \in \{0, 1, 2, \ldots\}$ and $x^k \in I\!R^n$, define $x^{k+1}$ as follows. Choose any orthant $O^k$ containing $x^k$, let $y^0(\tau) := \Pi_{O^k}[x^k - \tau \nabla\theta_{O^k}(x^k)]$, and let $\tau_0$ be the step size determined by one step of the projected gradient algorithm applied to $\min\{\theta(x) : x \in O^k\}$ from $x^k$. Suppose $F_1, \ldots, F_n$ are the facets of $O^k$.
For $j = 1, \ldots, n$, let $y^j := \Pi_{F_j}(x^k)$, $N_j := N_{O^k}(F_j)$, $y^j(\tau) := y^j + \Pi_{N_j}[-\tau \nabla\theta(y^j)]$, and let $\tau_j$ be the step size determined by one step of the projected gradient algorithm applied to
$$ \min\{ \theta(x) : x \in y^j + N_j \}, \quad (18) $$
starting from $y^j$. Let
$$ x^{k+1} := y^{\hat\jmath}(\tau_{\hat\jmath}), \qquad \text{where } \hat\jmath \in \operatorname{argmin}\{ \theta(y^j(\tau_j)) : j = 0, 1, 2, \ldots, n \}. $$
If $\theta(x^{k+1}) = \theta(x^k)$ then STOP; $x^k$ is a Gauss-Newton point of $f_+$.

Remark. In Algorithm 1 the projected gradient method is used as a subroutine. Therefore we assume that if the starting point of a subproblem is stationary, then the projected gradient method merely returns this point; the decision of whether or not the main algorithm should continue is made elsewhere.
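One iteration of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the affine example function `f`, its Jacobian, and the simple Armijo backtracking rule inside `pg_step` are assumptions made for the sketch.

```python
import numpy as np

def f(z):
    # Illustrative affine NCP function f(z) = Mz + q (an assumption, not from the paper).
    return np.array([2*z[0] + z[1] - 3.0, z[0] + 3*z[1] - 4.0])

def jac_f(z):
    return np.array([[2.0, 1.0], [1.0, 3.0]])

def f_plus(x):
    # Normal map: f_+(x) = f(x_+) + x - x_+.
    xp = np.maximum(x, 0.0)
    return f(xp) + x - xp

def theta(x):
    # Merit function theta(x) = (1/2) ||f_+(x)||^2.
    return 0.5 * np.dot(f_plus(x), f_plus(x))

def grad_theta_orthant(x, s):
    # Gradient of theta restricted to the orthant with sign vector s:
    # there f_+ is smooth with Jacobian J = f'(x_+) D + I - D, D = diag(s > 0).
    D = np.diag((s > 0).astype(float))
    xp = D @ x
    J = jac_f(xp) @ D + np.eye(x.size) - D
    return J.T @ (f(xp) + x - xp)

def pg_step(y, g, project, sigma=1e-4, beta=0.5, tau=1.0):
    # One projected-gradient step with Armijo backtracking (a simple stand-in
    # for the PG step-size rule of the paper).
    for _ in range(60):
        y_new = project(y - tau * g)
        if theta(y_new) <= theta(y) + sigma * np.dot(g, y_new - y):
            return y_new
        tau *= beta
    return y

def algorithm1_step(x):
    # One iteration of Algorithm 1: a PG step on an orthant O^k containing x^k,
    # plus one 1-dimensional PG step along the normal ray to each of the n
    # facets of O^k; the best of the n+1 candidates becomes x^{k+1}.
    n = x.size
    s = np.where(x >= 0, 1.0, -1.0)                   # orthant O^k containing x^k
    proj_O = lambda z: np.where(s * z >= 0.0, z, 0.0)
    cands = [pg_step(x, grad_theta_orthant(x, s), proj_O)]
    for j in range(n):
        y = x.copy(); y[j] = 0.0                      # y^j = projection of x^k onto facet F_j
        sj = s.copy(); sj[j] = -s[j]                  # the normal ray enters the adjacent orthant
        d = -s[j]                                     # N_j = {t * d * e_j : t >= 0}
        def proj_ray(z, j=j, y=y, d=d):
            w = y.copy()
            w[j] = d * max(d * z[j], 0.0)             # project onto the ray y^j + N_j
            return w
        cands.append(pg_step(y, grad_theta_orthant(y, sj), proj_ray))
    return min(cands, key=theta)
```

For this example the normed residual decreases at every nonstationary iterate, as the NSR analysis above requires.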
Theorem 4.1 Algorithm 1 is well defined and has NSR.

Proof Since the projected gradient method is well defined, for each $k$ and $x^k$ the algorithm produces $x^{k+1}$. If $\theta(x^{k+1}) = \theta(x^k)$ then none of the subproblems of the form (18) produced a point with a lower function value than $\theta(x^k)$. So $x^k$ is stationary for each subproblem for which $F_j$ is a facet of $O^k$ containing $x^k$, and by Proposition 2.3, $x^k$ is also a Gauss-Newton point of $f_+$. Thus Algorithm 1 is well defined.

We show that the algorithm has NSR. Suppose $\bar x$ is not a Gauss-Newton point of $f_+$. For $x^k$ sufficiently close to $\bar x$, $\bar x \in O^k$. So consider the case when $O^k = \bar O$ for some fixed orthant $\bar O$ containing $\bar x$. By Proposition 2.3, $\bar x$ is nonstationary either for $\min\{\theta(x) : x \in \bar O\}$ or for $\min\{\theta(x) : x \in \bar x + N_{\bar O}(\bar F)\}$, where $\bar F$ is some facet of $\bar O$ containing $\bar x$. In the former case, for some $\delta = \delta(\bar O) > 0$ and each $x^k \in \bar x + \delta I\!B$, we have from Theorem 3.6, with $\phi = \theta_{\bar O}$ and $\Omega = \bar O$, that the candidate $y^0(\tau_0)$ for the next iterate $x^{k+1}$ yields $\theta(y^0(\tau_0)) < \theta(\bar x)$. Hence our choice of $x^{k+1}$ also yields $\theta(x^{k+1}) < \theta(\bar x)$. In the latter case, we can apply Proposition 3.5 by reformulating the subproblem (18) as $\min\{\theta(y^j + d) : d \in N_{\bar O}(\bar F)\}$, i.e. define $\phi(d) = \theta(y^j + d)$, $\Phi(d) = \theta(\bar x + d)$, $\Omega = N_{\bar O}(\bar F)$, and $\rho$ as any positive constant, and let $\epsilon_1 > 0$ be the constant given by Proposition 3.5. Given the simple form of $\Omega$, it is easy to check that there is $\delta = \delta(\bar O) > 0$ such that if $x^k \in \bar x + \delta I\!B$, then $\phi \in U(\epsilon_1; \Phi, \bar x, \Omega)$. For such $x^k$, Proposition 3.5 says that the candidate iterate $y^j(\tau_j)$ yields $\theta(y^j(\tau_j)) < \theta(\bar x)$, hence $\theta(x^{k+1}) < \theta(\bar x)$. Since there are only finitely many orthants, we conclude that for some $\delta > 0$ independent of $O^k$, and each $x^k \in \bar x + \delta I\!B$, we have $\theta(x^{k+1}) < \theta(\bar x)$.

This algorithm is extremely robust: under the single assumption that $f$ is $C^1$ on $I\!R^n_+$, the method is well defined and accumulation points are always Gauss-Newton points. It is also reasonably simple, using the projected gradient method as the work horse.
A serious drawback of Algorithm 1 is that we need at least $n + 1$ function and Jacobian evaluations per iteration, in order to carry out the projected gradient method on the $n + 1$ subproblems. By contrast, the use of 1-dimensional subproblems means the linear algebra performed by Algorithm 1 is only around twice as expensive as the linear algebra needed to perform one projected gradient step on an orthant.

4.2 An efficient globally convergent algorithm

We present a globally convergent method for finding Gauss-Newton points of $f_+$ based on the PG method. It is efficient in the sense that, per iteration, the number of function evaluations is comparable to that needed for the PG method applied to minimizing a smooth function over an orthant, and the linear algebra computation involves about double the work required for linear algebra in the PG method.
At each iteration, we approximate $\theta$ by linearizing $f$ about $x^k_+$. Let
$$ A^k(x) := \tfrac12 \|L^k_+(x)\|^2, $$
where
$$ L^k_+(x) := f(x^k_+) + \nabla f(x^k_+)(x_+ - x^k_+) + x - x_+. \quad (19) $$
The "linearization" $L^k_+$ is a local point-based approximation [34] when $\nabla f$ is locally Lipschitz, and more generally a uniform first-order approximation near $x^k$ [28]; such approximations are more powerful than directional derivatives in that they approximate $f_+$ uniformly well for all $x$ near $x^k$. In [5, 4, 28, 34] these approximation properties have been exploited to give strong convergence results for Newton methods applied to nonsmooth equations like $f_+(x) = 0$. Our main algorithm, below, and its extremely robust convergence behavior also rely on these approximation properties.

Lemma 4.2 Let $\bar x \in I\!R^n$ and $\delta > 0$. There is a non-decreasing function $\varepsilon : I\!R_+ \to I\!R_+$ such that $\varepsilon(t) = o(t)$ as $t \downarrow 0$, and for each $x^k, x \in \bar x + \delta I\!B$,
$$ |\theta(x) - A^k(x)| \le \varepsilon(\|x - x^k\|). $$

Proof We have
$$ |\theta(x) - A^k(x)| = \tfrac12 \big| \langle f_+(x) + L^k_+(x),\, f_+(x) - L^k_+(x) \rangle \big| \le c\, \|f_+(x) - L^k_+(x)\|, $$
where $c \in (0, \infty)$ is the maximum value of $\tfrac12 \|f_+(x) + L^k_+(x)\|$ for $x^k, x \in \bar x + \delta I\!B$. Let $\omega$ be the modulus of continuity of $\nabla f$ on $I\!R^n_+ \cap (\bar x_+ + \delta I\!B)$ (see Definition 3.3). Similar to (14) in the proof of Proposition 3.5,
$$ \|f_+(x) - L^k_+(x)\| \le \omega(\|x - x^k\|)\,\|x - x^k\| / 2, $$
where $\omega(t) \to 0$ as $t \downarrow 0$. Take $\varepsilon(t) := c\,\omega(t)\,t / 2$.

We will search several paths during an iteration but, unlike Algorithm 1, our criteria for choosing the path parameter will use derivatives of the approximation $A^k$ rather than of $\theta$. Let $\sigma_0 \in (0, \sigma_1)$ and $\beta \in (0, 1)$. Suppose we are also given an orthant $O$ containing a point $y$ (but not necessarily $x^k$), and a path $y : [0, \infty) \to I\!R^n$ with $y(0) = y$. Given $\tau > 0$, $y(\tau)$ is a candidate for $x^{k+1}$ if
$$ \theta(y(\tau)) \le A^k(y) + \sigma_0 \langle \nabla A^k_O(y),\, y(\tau) - y \rangle. \quad (20) $$
Here $A^k_O$ is the restriction $A^k|_O$. If $\tau$ fails the above test, we can try $\beta\tau$. Note that if $y = x^k$ then $\nabla A^k_O(y) = \nabla\theta_O(x^k)$, and the obvious choice for $y(\tau)$ is $\Pi_O(x^k - \tau \nabla\theta_O(x^k))$.
In this case (20) is equivalent to (7) with $\phi = \theta_O$ and $\Omega = O$. The first part of Algorithm 2 is a single step of Algorithm 1 applied to $A^k$ instead of $\theta$. The second part determines the path and the corresponding step length that will define the next iterate $x^{k+1}$.
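The linearization (19), the model $A^k$, and the acceptance test (20) can be sketched concretely. The mildly nonlinear example function `f`, its Jacobian, and the test tolerance are assumptions made for this illustration, not material from the paper.

```python
import numpy as np

def f(z):
    # Illustrative smooth NCP function with a quadratic term (an assumption).
    return np.array([2*z[0] + z[1] - 3.0 + 0.5*z[0]**2, z[0] + 3*z[1] - 4.0])

def jac_f(z):
    return np.array([[2.0 + z[0], 1.0], [1.0, 3.0]])

def f_plus(x):
    # Normal map: f_+(x) = f(x_+) + x - x_+.
    xp = np.maximum(x, 0.0)
    return f(xp) + x - xp

def theta(x):
    return 0.5 * np.dot(f_plus(x), f_plus(x))

def L_plus(xk, x):
    # Point-based linearization (19): f is linearized about x^k_+,
    # while the nonsmooth part x - x_+ is kept exact.
    xkp = np.maximum(xk, 0.0)
    xp = np.maximum(x, 0.0)
    return f(xkp) + jac_f(xkp) @ (xp - xkp) + x - xp

def A(xk, x):
    # A^k(x) = (1/2) ||L^k_+(x)||^2, the piecewise-smooth model of theta.
    r = L_plus(xk, x)
    return 0.5 * np.dot(r, r)

def candidate_test(xk, y, y_tau, grad_A_O, sigma0=1e-2):
    # Acceptance test (20): y(tau) is a candidate for x^{k+1} if
    # theta(y(tau)) <= A^k(y) + sigma0 * <grad A^k_O(y), y(tau) - y>.
    return theta(y_tau) <= A(xk, y) + sigma0 * np.dot(grad_A_O, y_tau - y)
```

As Lemma 4.2 states, $A^k$ agrees with $\theta$ at $x^k$ and the gap grows only like $o(\|x - x^k\|)$ near $x^k$.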
Algorithm 2. Let $x^0 \in I\!R^n$ and (in addition to the constants used for the PG method) $\sigma_0 \in (0, \sigma_1)$, $\beta \in (0, 1)$. Given $k \in \{0, 1, 2, \ldots\}$ and $x^k \in I\!R^n$, define $x^{k+1}$ as follows.

Part I. Choose any orthant $O^k$ containing $x^k$, let $y^0 := x^k$,
$$ y^0(\tau) := \Pi_{O^k}[x^k - \tau \nabla\theta_{O^k}(x^k)], $$
and let $\tau_0$ be the step size determined by one step of the projected gradient algorithm applied to $\min\{A^k(x) : x \in O^k\}$ from $y^0$. Suppose $F_1, \ldots, F_n$ are the facets of $O^k$. For $j = 1, \ldots, n$, let $y^j := \Pi_{F_j}(x^k)$, $N_j := N_{O^k}(F_j)$, $O^j := F_j + N_j$,
$$ y^j(\tau) := y^j + \Pi_{N_j}[-\tau \nabla A^k_{O^j}(y^j)], $$
and let $\tau_j$ be the step size determined by one step of the projected gradient algorithm applied to $\min\{A^k(x) : x \in y^j + N_j\}$ from $y^j$.

Part II. Path search: Let $M := \{0, \ldots, n\}$, $\hat\jmath := 0$ and $\tau_0 := \tau_0 / \beta$.
REPEAT
Let $\tau_{\hat\jmath} := \beta \tau_{\hat\jmath}$. If $\tau_{\hat\jmath} \le \|y^{\hat\jmath} - x^k\|$ then $M := M \setminus \{\hat\jmath\}$.
If $M = \emptyset$, STOP; $x^k$ is a Gauss-Newton point of $f_+$.
Else let
$$ \hat\jmath \in \operatorname{argmin}\{ A^k(y^j) + \sigma_0 \langle \nabla A^k_{O^j}(y^j),\, y^j(\tau_j) - y^j \rangle : j \in M \}. $$
UNTIL (20) holds for $y(\tau) = y^{\hat\jmath}(\tau_{\hat\jmath})$, $y = y^{\hat\jmath}$ and $O = O^{\hat\jmath}$.
Let $x^{k+1} := y^{\hat\jmath}(\tau_{\hat\jmath})$.

Remark. For the algorithm to work properly, we assume that Part I returns $\tau_j = 0$ if $y^j$ is already stationary for the corresponding subproblem.

Theorem 4.3 Algorithm 2 is well defined and has NSR.

Proof First we show that each step of the algorithm is well defined. Consider one step of the algorithm given $k \in \{0, 1, 2, \ldots\}$ and $x^k \in I\!R^n$. Part I is well defined because the projected gradient method is well defined. For Part II we see that each iteration of the REPEAT loop is well defined; we claim that the loop terminates after finitely many iterations. Certainly if $j \in \{0, \ldots, n\}$ and $y^j \ne x^k$, then after a finite number of loop iterations in which $\hat\jmath = j$ and $\tau_j := \beta\tau_j$, we have
$$ \tau_j \le \|y^j - x^k\|, \quad (21) $$
hence in any subsequent loop iterations $j \notin M$ and $\hat\jmath \ne j$. Instead suppose $j$ is such that $y^j = x^k$. Either $y^j$ is stationary for the $j$th subproblem, hence $\tau_j = 0$ and, by construction of $M$, $\hat\jmath$ equals $j$ for at most one loop iteration; or, using Proposition 3.5.1, initially $\tau_j > 0$ and after finitely many loop iterations in which $\hat\jmath = j$ and $\tau_j := \beta\tau_j$, (20) holds, terminating the loop.

It is only left to check that $x^k$ is a Gauss-Newton point of $f_+$ if $M = \emptyset$. In this case, $\tau_j \le \|y^j - x^k\|$ for each $j$; in particular $\tau_j = 0$ if $y^j = x^k$, i.e. for $j = 0$ and each $j$ in $M_0 = \{ j : 1 \le j \le n,\ x^k \in F_j \}$. This is only possible if $x^k$ is stationary for each subproblem $\min\{A^k(x) : x \in O^k\}$ and $\min\{A^k(x) : x \in x^k + N_{O^j}(F_j)\}$ where $j \in M_0$. Since for each orthant $O$ containing $x^k$ we have $\nabla A^k_O(x^k) = \nabla\theta_O(x^k)$, it follows that $x^k$ is also stationary for $\min\{\theta(x) : x \in O^k\}$ and $\min\{\theta(x) : x \in x^k + N_{O^j}(F_j)\}$ where $j \in M_0$. Proposition 2.3 says $x^k$ is indeed a Gauss-Newton point of $f_+$.

We now prove the NSR property. Suppose that $\bar x$ is nonstationary for $\theta$. As in the proof of Theorem 4.1 we assume $O^k = \bar O$ for some fixed orthant $\bar O$ containing $\bar x$. Observe from Proposition 2.3 that either $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar O\}$ or $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar x + N_{\bar O}(\bar F)\}$, for some facet $\bar F$ of $\bar O$ containing $\bar x$. Below we assume the latter, and deduce for $x^k$ near $\bar x$ that $\theta(x^{k+1}) < \theta(\bar x)$.

Let $\bar O$ be an orthant containing $\bar x$ and $\bar F$ be a facet of $\bar O$ containing $\bar x$. Assume $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar x + N_{\bar O}(\bar F)\}$. Assume further that $x^k$ is some iterate with $O^k = \bar O$, so if $F_1, \ldots, F_n$ are the facets of $O^k$, then $\bar F = F_{\tilde\jmath}$ for some index $\tilde\jmath$. To simplify notation we omit the superscript or subscript $\tilde\jmath$ where possible. Let $N = N_{\bar O}(\bar F)$, $O = \bar F + N$, $y = \Pi_{\bar F}(x^k)$, $y(\tau) = y + \Pi_N[-\tau \nabla A^k_O(y)]$, and
$$ \bar A(x) := \tfrac12 \| f(\bar x_+) + \nabla f(\bar x_+)(x_+ - \bar x_+) + x - x_+ \|^2. $$
Observe, since $\nabla\theta_O(\bar x) = \nabla\bar A_O(\bar x)$, that $\bar x$ is nonstationary for $\min\{\bar A(x) : x \in \bar x + N\}$.
Rewriting the $\tilde\jmath$th subproblem $\min\{A^k_O(x) : x \in y + N\}$ as $\min\{A^k_O(y + d) : d \in N\}$, defining $\phi(d) = A^k(y + d)$, $\Phi(d) = \bar A(\bar x + d)$, $\Omega = N$ and choosing $\rho > 0$, enables us to apply Lemma 3.4 and Proposition 3.5. Then there exist $\epsilon_1 > 0$, $\mu > 0$ such that if $\|x^k - \bar x\| \le \epsilon_1$ and $\phi \in U(\epsilon_1) = U(\epsilon_1; \Phi, \bar x, \Omega)$ (see Definition 3.3), then
$$ A^k(y(\tau)) \le A^k(y) + \sigma_1 \langle \nabla A^k_O(y),\, y(\tau) - y \rangle, \quad (22) $$
$$ \langle \nabla A^k_O(y),\, y(\tau) - y \rangle \le -\mu \min\{\tau, \epsilon_1\}, \quad (23) $$
and the initial step size $\tau_{\tilde\jmath}$ chosen in Part I of the algorithm is bounded below by $\epsilon_1$. Now $\bar A_N$ and $A^k_N$ are quadratic functions defined on the half-line $N$, hence, by continuity of $\nabla f$, it follows easily that there exists $\epsilon_2 \in (0, \epsilon_1]$ such that $\phi \in U(\epsilon_1)$ if $\|x^k - \bar x\| \le \epsilon_2$. Thus (22) and (23) hold for such $x^k$ and $\tau \in [0, \epsilon_2]$.
Let $\|x^k - \bar x\| \le \epsilon_2$ and $0 \le \tau \le \epsilon_2$. We have
$$ \theta(y(\tau)) - \theta(x^k) = [A^k(y(\tau)) - A^k(y)] + [A^k(y) - \theta(x^k)] + [\theta(y(\tau)) - A^k(y(\tau))] \le \sigma_1 \langle \nabla A^k_O(y),\, y(\tau) - y \rangle + [A^k(y) - \theta(x^k)] + [\theta(y(\tau)) - A^k(y(\tau))], \quad (24) $$
using (22). Let $L$ be an upper bound on $\|\nabla A^k_O(y)\|$ for $x^k \in \bar x + \epsilon_2 I\!B$, and observe
$$ \|y(\tau) - x^k\| \le \|y(\tau) - y\| + \|y - x^k\| \le \tau L + \|y - x^k\|. $$
Also $y = \Pi_{\bar F}(x^k)$ is bounded on $\bar x + \epsilon_2 I\!B$, therefore Lemma 4.2 provides a non-decreasing error bound $\varepsilon(t) = o(t)$ such that for each $x^k \in \bar x + \epsilon_2 I\!B$, $\tau \in [0, \epsilon_2]$,
$$ \theta(y(\tau)) - A^k(y(\tau)) \le \varepsilon(\tau L + \|y - x^k\|). \quad (25) $$
Let $\hat\mu = (\sigma_1 - \sigma_0)\mu/2$ and choose $\tau^* \in (0, \epsilon_2)$ such that $\varepsilon(2\tau^* L) \le \tau^* \beta \hat\mu$. Now choose $\epsilon_3 \in (0, \epsilon_2)$ such that if $\|x^k - \bar x\| \le \epsilon_3$, then both $\|y - x^k\| \le \min\{\tau^* \beta, \tau^* L\}$ and $|A^k(y) - \theta(x^k)| \le \tau^* \beta \hat\mu$. Let $x^k \in \bar x + \epsilon_3 I\!B$. For $\tau \in [\beta\tau^*, \tau^*]$, (24) and (25) yield
$$ \theta(y(\tau)) - \theta(x^k) \le \sigma_1 \langle \nabla A^k_O(y),\, y(\tau) - y \rangle + \tau^* \beta \hat\mu + \varepsilon(\tau L + \tau^* L) \le \sigma_1 \langle \nabla A^k_O(y),\, y(\tau) - y \rangle + \tau (\sigma_1 - \sigma_0)\mu. $$
From (23), if $\tau \le \epsilon_1$ then $\langle \nabla A^k_O(y),\, y(\tau) - y \rangle \le -\tau\mu$; therefore
$$ \theta(y(\tau)) - \theta(x^k) \le \sigma_0 \langle \nabla A^k_O(y),\, y(\tau) - y \rangle, \qquad \forall \tau \in [\beta\tau^*, \tau^*]. \quad (26) $$
From above, the initial step size $\tau_{\tilde\jmath}$ and the point $y^{\tilde\jmath} = y$ are such that $\tau_{\tilde\jmath} \ge \tau^*$ and $\tau^* > \|y - x^k\|$. We claim it follows from (26) that, during the REPEAT loop of Part II, $\tilde\jmath \in M$ and $\tau_{\tilde\jmath} \ge \beta\tau^*$. To see this, suppose that $\tau_{\tilde\jmath}$ decreases in some loop iteration after the first loop iteration. Then at the end of the previous loop iteration ($\hat\jmath = \tilde\jmath$ and) the condition (20) fails for $y(\tau) = y^{\tilde\jmath}(\tau_{\tilde\jmath})$, $y = y^{\tilde\jmath}$ and $O = O^{\tilde\jmath}$; so it follows from (26) that $\tau_{\tilde\jmath} > \tau^*$. Thus the new value $\beta\tau_{\tilde\jmath}$ of $\tau_{\tilde\jmath}$ is bounded below by $\beta\tau^*$, hence also $\tau_{\tilde\jmath} > \|y - x^k\|$ and $\tilde\jmath$ is not deleted from $M$. Therefore after the REPEAT loop terminates, $\tilde\jmath \in M$ and $\tau_{\tilde\jmath} \ge \beta\tau^*$; and the selection of $x^{k+1}$, whether or not using $y^{\tilde\jmath}(\tau_{\tilde\jmath}) = y(\tau_{\tilde\jmath})$, satisfies
$$ \theta(x^{k+1}) \le \min\{ A^k(y^j) + \sigma_0 \langle \nabla A^k_{O^j}(y^j),\, y^j(\tau_j) - y^j \rangle : j \in M \} \le A^k(y) + \sigma_0 \langle \nabla A^k_O(y),\, y(\tau_{\tilde\jmath}) - y \rangle \le A^k(y) - \sigma_0 \mu \min\{\tau_{\tilde\jmath}, \epsilon_1\} \quad \text{(from (23))} \le A^k(y) - \nu, $$
where $\nu := \sigma_0 \mu \beta \tau^*$ is a positive constant independent of $x^k$. As noted above, $A^k(y) \to \theta(\bar x)$ as $x^k \to \bar x$, so $\theta(x^{k+1}) < \theta(\bar x)$ for $x^k$ sufficiently close to $\bar x$.

A similar argument can be made for the case when $O^k = \bar O$ and $\bar x$ is nonstationary for $\min\{\theta(x) : x \in \bar O\}$. In this case $\tilde\jmath = 0$, $y = x^k$ and $y(\tau) = \Pi_{\bar O}(x^k - \tau \nabla A^k_{\bar O}(x^k))$. We do not give details, but only note that this process is somewhat simpler than that above because the inequality corresponding to (24) has only two summands on the right:
$$ \theta(y(\tau)) - \theta(x^k) \le \sigma_1 \langle \nabla A^k_{\bar O}(x^k),\, y(\tau) - x^k \rangle + [\theta(y(\tau)) - A^k(y(\tau))]. $$
Since there are only finitely many choices of $\bar O$, the NSR property of Algorithm 2 is established.

4.3 A hybrid algorithm with quadratic local convergence

Both of the algorithms given above have at best a linear rate of convergence, because the projected gradient method is only a first-order method. However, if an algorithm for finding a Gauss-Newton point of $f_+$ has NSR (such as Algorithms 1 and 2), then it lends itself to hybrid methods that alternate between steps of the original algorithm and Newton-like steps, and therefore admit the possibility of quadratic local convergence. For such a hybrid algorithm, let $K$ be the set of indices $k$ for which the original algorithm determines $x^{k+1}$. If $K$ has infinitely many elements and monotonicity of the algorithm is maintained, accumulation points of the subsequence $\{x^k\}_{k \in K}$ are Gauss-Newton points of $f_+$. If such a limit point $\bar x$ is in fact a point of attraction of a Newton method, and a Newton step is taken every $\ell$th iteration, then convergence will be $\ell$-step superlinear, or $\ell$-step quadratic if $\nabla f$ is Lipschitz. See [2] for details on a related hybrid algorithm in the context of quadratic programming.

We briefly sketch three popular Newton methods for solving the nonsmooth equation $f_+(x) = 0$, which often produce Q-quadratically convergent sequences of iterates.
To make comparisons easy, we use the general notion of a Newton path [28] which, given the iterate $x^k$, is some function $p^k : [0, 1] \to I\!R^n$ with $p^k(0) = x^k$; the next iterate $x^{k+1}$ is defined as $p^k(\tau)$ for some $\tau \in [0, 1]$ (details are given below). We say a Newton iterate or Newton step is taken if $x^{k+1} = p^k(1)$. We may not take a Newton step, however, if it does not yield "sufficient progress". A simple damping strategy is used to ensure sufficient progress: recall the constants $\sigma_0, \beta \in (0, 1)$, and define $\tau$ as the largest member of $\{1, \beta, \beta^2, \ldots\}$ such that
$$ \|f_+(p^k(\tau))\| \le (1 - \sigma_0 \tau)\,\|f_+(x^k)\|. \quad (27) $$
Then $x^{k+1} := p^k(\tau)$; this is the damped Newton iterate.

Newton path 1. Given $k$ and $x^k$, let $O^k$ be an orthant containing $x^k$,
$$ M_k = \nabla (f_+|_{O^k})(x^k), \qquad d^k = -(M_k)^{-1} f_+(x^k), \qquad p^k(\tau) = x^k + \tau d^k. $$
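Newton path 1 with the damping rule (27) can be sketched as follows. The affine example function `f` is an assumption made for illustration; the matrix $M_k$ is formed from the Jacobian of the smooth piece of $f_+$ on the orthant containing the current iterate.

```python
import numpy as np

def f(z):
    # Illustrative affine NCP function f(z) = Mz + q (an assumption, not from the paper).
    return np.array([2*z[0] + z[1] - 3.0, z[0] + 3*z[1] - 4.0])

def jac_f(z):
    return np.array([[2.0, 1.0], [1.0, 3.0]])

def f_plus(x):
    # Normal map: f_+(x) = f(x_+) + x - x_+.
    xp = np.maximum(x, 0.0)
    return f(xp) + x - xp

def damped_newton_step(xk, sigma0=1e-2, beta=0.5):
    # Newton path 1: d^k solves M_k d = -f_+(x^k), where M_k is the Jacobian
    # of the smooth piece of f_+ on an orthant O^k containing x^k; the step
    # is then damped by rule (27).
    s = xk >= 0                                   # sign pattern of O^k
    D = np.diag(s.astype(float))
    xkp = np.where(s, xk, 0.0)
    Mk = jac_f(xkp) @ D + np.eye(xk.size) - D     # Jacobian of f_+ restricted to O^k
    d = np.linalg.solve(Mk, -f_plus(xk))
    r0 = np.linalg.norm(f_plus(xk))
    tau = 1.0
    for _ in range(60):                           # try tau in {1, beta, beta^2, ...}
        if np.linalg.norm(f_plus(xk + tau * d)) <= (1 - sigma0 * tau) * r0:
            break
        tau *= beta
    return xk + tau * d
```

For an affine $f$ whose zero of the normal map lies in the same orthant as $x^k$, the full step $\tau = 1$ passes (27) and solves $f_+(x) = 0$ exactly in one iteration.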
More informationThe general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints.
1 Optimization Mathematical programming refers to the basic mathematical problem of finding a maximum to a function, f, subject to some constraints. 1 In other words, the objective is to find a point,
More informationmin f(x). (2.1) Objectives consisting of a smooth convex term plus a nonconvex regularization term;
Chapter 2 Gradient Methods The gradient method forms the foundation of all of the schemes studied in this book. We will provide several complementary perspectives on this algorithm that highlight the many
More informationAbsolute value equations
Linear Algebra and its Applications 419 (2006) 359 367 www.elsevier.com/locate/laa Absolute value equations O.L. Mangasarian, R.R. Meyer Computer Sciences Department, University of Wisconsin, 1210 West
More informationUniversity of California. Berkeley, CA fzhangjun johans lygeros Abstract
Dynamical Systems Revisited: Hybrid Systems with Zeno Executions Jun Zhang, Karl Henrik Johansson y, John Lygeros, and Shankar Sastry Department of Electrical Engineering and Computer Sciences University
More informationPreface These notes were prepared on the occasion of giving a guest lecture in David Harel's class on Advanced Topics in Computability. David's reques
Two Lectures on Advanced Topics in Computability Oded Goldreich Department of Computer Science Weizmann Institute of Science Rehovot, Israel. oded@wisdom.weizmann.ac.il Spring 2002 Abstract This text consists
More informationIterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming
Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Zhaosong Lu October 5, 2012 (Revised: June 3, 2013; September 17, 2013) Abstract In this paper we study
More informationLecture 3. Optimization Problems and Iterative Algorithms
Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex
More informationMcMaster University. Advanced Optimization Laboratory. Title: A Proximal Method for Identifying Active Manifolds. Authors: Warren L.
McMaster University Advanced Optimization Laboratory Title: A Proximal Method for Identifying Active Manifolds Authors: Warren L. Hare AdvOl-Report No. 2006/07 April 2006, Hamilton, Ontario, Canada A Proximal
More informationSeptember Math Course: First Order Derivative
September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which
More information1 Introduction We consider the problem nd x 2 H such that 0 2 T (x); (1.1) where H is a real Hilbert space, and T () is a maximal monotone operator (o
Journal of Convex Analysis Volume 6 (1999), No. 1, pp. xx-xx. cheldermann Verlag A HYBRID PROJECTION{PROXIMAL POINT ALGORITHM M. V. Solodov y and B. F. Svaiter y January 27, 1997 (Revised August 24, 1998)
More informationand P RP k = gt k (g k? g k? ) kg k? k ; (.5) where kk is the Euclidean norm. This paper deals with another conjugate gradient method, the method of s
Global Convergence of the Method of Shortest Residuals Yu-hong Dai and Ya-xiang Yuan State Key Laboratory of Scientic and Engineering Computing, Institute of Computational Mathematics and Scientic/Engineering
More informationA projection-type method for generalized variational inequalities with dual solutions
Available online at www.isr-publications.com/jnsa J. Nonlinear Sci. Appl., 10 (2017), 4812 4821 Research Article Journal Homepage: www.tjnsa.com - www.isr-publications.com/jnsa A projection-type method
More informationDate: July 5, Contents
2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........
More informationDouglas-Rachford splitting for nonconvex feasibility problems
Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying
More informationDO NOT OPEN THIS QUESTION BOOKLET UNTIL YOU ARE TOLD TO DO SO
QUESTION BOOKLET EECS 227A Fall 2009 Midterm Tuesday, Ocotober 20, 11:10-12:30pm DO NOT OPEN THIS QUESTION BOOKLET UNTIL YOU ARE TOLD TO DO SO You have 80 minutes to complete the midterm. The midterm consists
More informationCombinatorial Structures in Nonlinear Programming
Combinatorial Structures in Nonlinear Programming Stefan Scholtes April 2002 Abstract Non-smoothness and non-convexity in optimization problems often arise because a combinatorial structure is imposed
More information460 HOLGER DETTE AND WILLIAM J STUDDEN order to examine how a given design behaves in the model g` with respect to the D-optimality criterion one uses
Statistica Sinica 5(1995), 459-473 OPTIMAL DESIGNS FOR POLYNOMIAL REGRESSION WHEN THE DEGREE IS NOT KNOWN Holger Dette and William J Studden Technische Universitat Dresden and Purdue University Abstract:
More informationON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS
ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS A. F. Izmailov and A. S. Kurennoy December 011 ABSTRACT In the context of mixed complementarity problems various concepts of solution regularity are
More information1 Introduction Let F : < n! < n be a continuously dierentiable mapping and S be a nonempty closed convex set in < n. The variational inequality proble
A New Unconstrained Dierentiable Merit Function for Box Constrained Variational Inequality Problems and a Damped Gauss-Newton Method Defeng Sun y and Robert S. Womersley z School of Mathematics University
More informationConvex Functions and Optimization
Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized
More information290 J.M. Carnicer, J.M. Pe~na basis (u 1 ; : : : ; u n ) consisting of minimally supported elements, yet also has a basis (v 1 ; : : : ; v n ) which f
Numer. Math. 67: 289{301 (1994) Numerische Mathematik c Springer-Verlag 1994 Electronic Edition Least supported bases and local linear independence J.M. Carnicer, J.M. Pe~na? Departamento de Matematica
More informationMidterm 1. Every element of the set of functions is continuous
Econ 200 Mathematics for Economists Midterm Question.- Consider the set of functions F C(0, ) dened by { } F = f C(0, ) f(x) = ax b, a A R and b B R That is, F is a subset of the set of continuous functions
More informationMetric Spaces. DEF. If (X; d) is a metric space and E is a nonempty subset, then (E; d) is also a metric space, called a subspace of X:
Metric Spaces DEF. A metric space X or (X; d) is a nonempty set X together with a function d : X X! [0; 1) such that for all x; y; and z in X : 1. d (x; y) 0 with equality i x = y 2. d (x; y) = d (y; x)
More informationSet, functions and Euclidean space. Seungjin Han
Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,
More informationLecture 8 Plus properties, merit functions and gap functions. September 28, 2008
Lecture 8 Plus properties, merit functions and gap functions September 28, 2008 Outline Plus-properties and F-uniqueness Equation reformulations of VI/CPs Merit functions Gap merit functions FP-I book:
More informationA derivative-free nonmonotone line search and its application to the spectral residual method
IMA Journal of Numerical Analysis (2009) 29, 814 825 doi:10.1093/imanum/drn019 Advance Access publication on November 14, 2008 A derivative-free nonmonotone line search and its application to the spectral
More informationR. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the
A Multi{Parameter Method for Nonlinear Least{Squares Approximation R Schaback Abstract P For discrete nonlinear least-squares approximation problems f 2 (x)! min for m smooth functions f : IR n! IR a m
More informationA TOUR OF LINEAR ALGEBRA FOR JDEP 384H
A TOUR OF LINEAR ALGEBRA FOR JDEP 384H Contents Solving Systems 1 Matrix Arithmetic 3 The Basic Rules of Matrix Arithmetic 4 Norms and Dot Products 5 Norms 5 Dot Products 6 Linear Programming 7 Eigenvectors
More information6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE. Three Alternatives/Remedies for Gradient Projection
6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE Three Alternatives/Remedies for Gradient Projection Two-Metric Projection Methods Manifold Suboptimization Methods
More informationIntroduction to Real Analysis Alternative Chapter 1
Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces
More informationINTRODUCTION TO NETS. limits to coincide, since it can be deduced: i.e. x
INTRODUCTION TO NETS TOMMASO RUSSO 1. Sequences do not describe the topology The goal of this rst part is to justify via some examples the fact that sequences are not sucient to describe a topological
More informationarxiv: v1 [math.oc] 1 Jul 2016
Convergence Rate of Frank-Wolfe for Non-Convex Objectives Simon Lacoste-Julien INRIA - SIERRA team ENS, Paris June 8, 016 Abstract arxiv:1607.00345v1 [math.oc] 1 Jul 016 We give a simple proof that the
More informationAlternative theorems for nonlinear projection equations and applications to generalized complementarity problems
Nonlinear Analysis 46 (001) 853 868 www.elsevier.com/locate/na Alternative theorems for nonlinear projection equations and applications to generalized complementarity problems Yunbin Zhao a;, Defeng Sun
More information1 Newton s Method. Suppose we want to solve: x R. At x = x, f (x) can be approximated by:
Newton s Method Suppose we want to solve: (P:) min f (x) At x = x, f (x) can be approximated by: n x R. f (x) h(x) := f ( x)+ f ( x) T (x x)+ (x x) t H ( x)(x x), 2 which is the quadratic Taylor expansion
More information3 Integration and Expectation
3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ
More informationStructural and Multidisciplinary Optimization. P. Duysinx and P. Tossings
Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be
More informationWerner Romisch. Humboldt University Berlin. Abstract. Perturbations of convex chance constrained stochastic programs are considered the underlying
Stability of solutions to chance constrained stochastic programs Rene Henrion Weierstrass Institute for Applied Analysis and Stochastics D-7 Berlin, Germany and Werner Romisch Humboldt University Berlin
More informationValue and Policy Iteration
Chapter 7 Value and Policy Iteration 1 For infinite horizon problems, we need to replace our basic computational tool, the DP algorithm, which we used to compute the optimal cost and policy for finite
More informationKaisa Joki Adil M. Bagirov Napsu Karmitsa Marko M. Mäkelä. New Proximal Bundle Method for Nonsmooth DC Optimization
Kaisa Joki Adil M. Bagirov Napsu Karmitsa Marko M. Mäkelä New Proximal Bundle Method for Nonsmooth DC Optimization TUCS Technical Report No 1130, February 2015 New Proximal Bundle Method for Nonsmooth
More informationEconomics Bulletin, 2012, Vol. 32 No. 1 pp Introduction. 2. The preliminaries
1. Introduction In this paper we reconsider the problem of axiomatizing scoring rules. Early results on this problem are due to Smith (1973) and Young (1975). They characterized social welfare and social
More informationSOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction
ACTA MATHEMATICA VIETNAMICA 271 Volume 29, Number 3, 2004, pp. 271-280 SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM NGUYEN NANG TAM Abstract. This paper establishes two theorems
More informationPM functions, their characteristic intervals and iterative roots
ANNALES POLONICI MATHEMATICI LXV.2(1997) PM functions, their characteristic intervals and iterative roots by Weinian Zhang (Chengdu) Abstract. The concept of characteristic interval for piecewise monotone
More informationApproximation Algorithms for Maximum. Coverage and Max Cut with Given Sizes of. Parts? A. A. Ageev and M. I. Sviridenko
Approximation Algorithms for Maximum Coverage and Max Cut with Given Sizes of Parts? A. A. Ageev and M. I. Sviridenko Sobolev Institute of Mathematics pr. Koptyuga 4, 630090, Novosibirsk, Russia fageev,svirg@math.nsc.ru
More informationPart V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory
Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite
More informationMAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9
MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended
More informationUniversity of Maryland at College Park. limited amount of computer memory, thereby allowing problems with a very large number
Limited-Memory Matrix Methods with Applications 1 Tamara Gibson Kolda 2 Applied Mathematics Program University of Maryland at College Park Abstract. The focus of this dissertation is on matrix decompositions
More informationCONSTRAINED NONLINEAR PROGRAMMING
149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach
More informationProximal and First-Order Methods for Convex Optimization
Proximal and First-Order Methods for Convex Optimization John C Duchi Yoram Singer January, 03 Abstract We describe the proximal method for minimization of convex functions We review classical results,
More information