IMPLEMENTING GENERATING SET SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION


Technical Report WM CS, Department of Computer Science, College of William & Mary, Williamsburg, Virginia. Revised July 18, 2005.

ROBERT MICHAEL LEWIS, ANNE SHEPHERD, AND VIRGINIA TORCZON

Abstract. We discuss an implementation of a generating set search method for linearly constrained minimization with no assumption of nondegeneracy placed on the constraints. The convergence guarantees for generating set search methods require that the set of search directions possesses certain geometrical properties that allow it to approximate the feasible region near the current iterate. In the hard case, the calculation of the search directions corresponds to finding the extreme rays of a cone with a degenerate vertex at the origin, a difficult problem. We discuss here how state-of-the-art computational geometry methods make it tractable to solve this problem in connection with generating set search. We also discuss a number of other practical issues of implementation, such as the desirability of augmenting the set of search directions beyond the theoretically minimal set. We illustrate the behavior of the implementation on several problems from the CUTEr test suite.

Key words. nonlinear programming, nonlinear optimization, linear constraints, direct search, pattern search, generating set search, Fourier-Motzkin, double description algorithm

1. Introduction. We consider ways to implement direct search methods for solving linearly constrained nonlinear optimization problems. These problems we write in the form

(1.1)    minimize   f(x)
         subject to x_L ≤ x ≤ x_U,
                    c_L ≤ A_I x ≤ c_U,
                    A_E x = b.

The objective function is f : R^n → R with x ∈ R^n. We partition the constraints into three sets: lower and upper bounds on the variables, general linear inequality constraints, and linear equality constraints.
We allow for the possibility that decision variables may be unbounded above or below by allowing components of x_L to be −∞ and components of x_U to be +∞. We also allow general linear inequalities to be unbounded above or below by allowing components of c_L to be −∞ and components of c_U to be +∞. A key step in generating set search algorithms for linearly constrained minimization is the computation of the requisite search directions. These are computed from a working set of constraints chosen in each iteration to be treated as nearly binding. The search directions are computed as the generators of the polar of the cone formed

Department of Mathematics, College of William & Mary, P.O. Box 8795, Williamsburg, Virginia; buckaroo@math.wm.edu. This research was supported by the National Aeronautics and Space Administration under Grant NCC, by the Computer Science Research Institute at Sandia National Laboratories, and by the National Science Foundation under Grant DMS.

Department of Computer Science, College of William & Mary, P.O. Box 8795, Williamsburg, Virginia; plshep@cs.wm.edu. This research was conducted under the appointment of a Department of Energy High-Performance Computer Science Graduate Fellowship administered by the Krell Institute and funded by the Computer Science Research Institute at Sandia National Laboratories.

Department of Computer Science, College of William & Mary, P.O. Box 8795, Williamsburg, Virginia; va@cs.wm.edu. This research was supported by the Computer Science Research Institute at Sandia National Laboratories.

by the working set. If the working set defines a cone with a nondegenerate vertex at the origin, we have what we call the easy case. Computational approaches for solving the nondegenerate case have been known for some time [21, 25, 18]. The hard case arises if the working set defines a cone with a degenerate vertex at the origin. Analysis of algorithms that allow degenerate working sets is well-established [18, 20]. On the other hand, the resulting calculation of the search directions quickly becomes too difficult for a naive approach. As noted in [18], sophisticated and efficient computational geometry algorithms are required. A goal here is to show that by using state-of-the-art algorithms from computational geometry the hard case can be dealt with in a computationally effective way, and that the naive bounds on the combinatorial complexity involved in managing the hard case are unduly pessimistic. For example, in Section 9 we illustrate a problem for which the estimate on the number of possible vectors to be considered is astronomically large when, in fact, only one vector is needed, and that vector is correctly identified in less than one-seventh of a second on a conventional workstation. Our specific implementation makes use of the analysis of generating set search found in [16], which synthesizes the analytical results found in [18, 20]. A new feature of the algorithm presented in [16] is the way in which we choose the set of search directions, which we show here to be effective in practice. We use Ω to denote the feasible region for problem (1.1):

Ω = { x ∈ R^n : x_L ≤ x ≤ x_U, c_L ≤ A_I x ≤ c_U, and A_E x = b }.

The algorithm we present here, like all of the approaches referenced previously, is a feasible iterates method for solving (1.1): we require the initial iterate x_0 to be feasible and arrange for all subsequent iterates x_k to satisfy x_k ∈ Ω.
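Because every iterate must remain in Ω, a feasibility test for the constraint format of (1.1) is one of the first ingredients an implementation needs. The pure-Python sketch below is our illustration, not the authors' code; the names follow the paper's notation, the tolerance is an assumption, and infinite entries encode absent bounds as described above.

```python
# Sketch: feasibility test for problem (1.1). Variable names (x_L, x_U,
# c_L, c_U, A_I, A_E, b) follow the paper's notation; tol is an assumption.
import math

def is_feasible(x, x_L, x_U, A_I, c_L, c_U, A_E, b, tol=1e-12):
    """True if x satisfies the bounds, range inequalities, and equalities."""
    # Bound constraints: x_L <= x <= x_U (entries may be +/- math.inf).
    for xi, lo, hi in zip(x, x_L, x_U):
        if xi < lo - tol or xi > hi + tol:
            return False
    # General linear inequalities: c_L <= A_I x <= c_U.
    for row, lo, hi in zip(A_I, c_L, c_U):
        v = sum(a * xi for a, xi in zip(row, x))
        if v < lo - tol or v > hi + tol:
            return False
    # Linear equalities: A_E x = b.
    for row, bi in zip(A_E, b):
        v = sum(a * xi for a, xi in zip(row, x))
        if abs(v - bi) > tol:
            return False
    return True
```

A feasible-iterates method calls such a test on every trial point before it is ever compared by objective value.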
We assume that f is continuously differentiable on Ω but that gradient information is not computationally available; i.e., no procedure exists for computing the gradient and it cannot be approximated accurately. We do not assume that the constraints are nondegenerate. We discuss several topics regarding our implementation in detail. The first is how to dynamically identify a set of search directions sufficient to guarantee desirable convergence properties. Generating set search methods for linearly constrained problems achieve convergence without explicit recourse to the gradient or the directional derivative of the objective. They also do not attempt to estimate Lagrange multipliers. Instead, when the search is close to the boundary of the feasible region, the set of search directions must include directions that conform to the geometry of the nearby boundary. The general idea is that the set must contain search directions that comprise a set of generators for the cone of feasible directions, hence the name generating set search. An important decision to be addressed in any implementation of this approach is what is meant by "close to the boundary" (i.e., which constraints belong in the current working set), since the current working set determines the requirements on the set of search directions needed for the iteration. Once the current working set has been identified, the next question becomes how to obtain a set of search directions that contains a set of generators for the cone of feasible directions. If the current working set is not degenerate, then a straightforward procedure is outlined in [18] (which is equivalent to that outlined in [21]). Equalities in the working set were not explicitly treated previously, so we give an elaboration that deals with this case. In the presence of degeneracy, the situation becomes considerably more complex. Thus obtaining a sufficient set of search directions will be the second of our concerns.
We also discuss how augmenting the set of search directions can accelerate progress toward a solution.

Once a set of search directions has been identified, a step of appropriate length along each of the search directions must be found. The analysis makes this straightforward through the use of a scalar step-length control parameter Δ_k. In our implementation, the way in which we choose the working set is tied to Δ_k. Furthermore, our most recent sentiment, in line with the approach used by Lucidi and Sciandrone in [19] and Lucidi, Sciandrone, and Tseng in [20], is that there is value in decoupling the lengths of the steps along each of the search directions to ensure that, as much as possible, the requirement of maintaining feasibility does not overly limit the steps that can be tried. Finally, there are the usual implementation details of optimization algorithms. These include step acceptance criteria, stopping criteria, scaling of the variables, and data caching. We begin in Section 2 with an example and then give some preliminary definitions and notation in Section 3. The remainder of the paper, as well as the overall organization of the software, is described in Figure 1.1.

Fig. 1.1. A map for the organization of the paper and the software:
- Identifying nearby constraints (Section 4)
- Generating core search directions (Section 5)
- Augmenting core search directions (Section 6)
- Choosing the step lengths (Section 7)
- Linearly constrained GSS (Section 8, Figure 8.1): finding an initial feasible iterate (Section 8.1), accepting steps (Section 8.2), stopping criteria (Section 8.3), scaling issues (Section 8.4), caching data (Section 8.5)

2. An illustrative example. Figure 2.1 illustrates the first iterations of a greatly simplified linearly constrained GSS method applied to the two-dimensional modified Broyden tridiagonal function [4, 22], to which we have added three linear constraints. Level curves of f are shown in the background. The three constraints form a triangle.
In each figure, a dot denotes the current iterate x_k, which is the best point, i.e., the feasible point with the lowest value of f found so far. We partition the indices of the iterations into two sets: S and U. The set S corresponds to all successful iterations, those for which x_k ≠ x_{k+1} (i.e., at iteration k the search identified a feasible point with a lower value of the objective and that point will become the next iterate). The set U corresponds to all unsuccessful iterations, those for which x_{k+1} = x_k since the search was unable to identify a feasible point with a lower value of the objective. Each of the six subfigures in Figure 2.1 represents one iteration of a linearly constrained GSS method. The four squares represent the trial points under consideration at that iteration. Taking a step of length Δ_k along each of the four search directions

Fig. 2.1. Linearly constrained GSS applied to the modified Broyden tridiagonal function:
(a) k = 0. The initial set of search directions. Move Northeast; k ∈ S.
(b) k = 1. Keep the set of search directions. Move Northeast; k ∈ S.
(c) k = 2. Change the set of search directions. Contract; k ∈ U.
(d) k = 3. Change the set of search directions. Move Northeast; k ∈ S.
(e) k = 4. Change the set of search directions. Contract; k ∈ U.
(f) k = 5. Change the set of search directions. Move Southeast; k ∈ S.

yields the trial points. In subfigures (b)-(f), the trial points from the previous iteration are shown in the background for comparison. In subfigure (a), the solution to the problem is marked with a star. Notice in subfigure (a) that the initial iterate x_0 is near two constraints. Thus two of the initial search directions are chosen so that they are parallel to the two nearby constraints. In this instance, the result is two feasible descent directions; i.e., moving along two of the four search directions would allow the search to realize descent on the value of f at x_0 while remaining feasible. (Under appropriate assumptions, the analysis guarantees at least one feasible descent direction.) Either of the two trial points produced by these two feasible descent directions is acceptable; in this instance we choose the one that gives the larger decrease in the value of the objective f to become x_1. Since the step of length Δ_0 from x_0 to x_1 was sufficient to produce decrease, for the purposes of this simple example we set Δ_1 to Δ_0. In subfigure (b), the same two constraints remain nearby so we leave the set of search directions unchanged. Again, there are two feasible descent directions, each of which leads to a trial point that decreases the value of f at the current iterate. We accept the trial point that yields the larger decrease as x_2 and we set Δ_2 to Δ_1. In subfigure (c), the set of nearby constraints changes.
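The accept-or-contract mechanics walked through in this example can be condensed into a short loop. The sketch below is our illustration, not the authors' implementation: it handles only bound constraints with a fixed compass set ±e_i, and the halving factor, tolerances, and iteration limit are assumptions. It polls each direction, accepts the feasible trial point with the largest decrease (a successful iteration, k ∈ S), and otherwise halves the step-length control parameter Δ_k (an unsuccessful iteration, k ∈ U).

```python
# Minimal illustrative GSS loop for a bound-constrained problem
# (a sketch under stated assumptions, not the paper's software).

def gss_bounds(f, x0, lo, hi, delta0=1.0, delta_tol=1e-6, max_iter=200):
    n = len(x0)
    dirs = []
    for i in range(n):                    # compass set {+e_i, -e_i}
        e = [0.0] * n
        e[i] = 1.0
        dirs.append(e)
        dirs.append([-v for v in e])
    x, fx, delta = list(x0), f(x0), delta0
    S, U = [], []                         # successful / unsuccessful indices
    for k in range(max_iter):
        if delta < delta_tol:
            break
        best = None
        for d in dirs:
            trial = [xi + delta * di for xi, di in zip(x, d)]
            # feasible-iterates method: reject infeasible trial points
            if any(t < l or t > h for t, l, h in zip(trial, lo, hi)):
                continue
            ft = f(trial)
            if ft < fx and (best is None or ft < best[1]):
                best = (trial, ft)
        if best is not None:              # accept: k is in S
            x, fx = best
            S.append(k)
        else:                             # contract: k is in U
            delta *= 0.5
            U.append(k)
    return x, fx, S, U
```

On a smooth problem the loop stalls only when no compass step of length Δ_k improves f, at which point Δ_k is halved, mirroring the contractions in subfigures (c) and (e).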
We drop one of the constraints that had existed in our working set of nearby constraints in favor of a newly identified constraint that yields a different working set. The set of search directions that results is shown. As guaranteed by the analysis, we do have one feasible descent direction, but now the difficulty is that the step along that direction takes the search outside the feasible region. Thus, the best iterate is unchanged, so we set x_3 to x_2. Further, we contract by setting Δ_3 to half the value of Δ_2, which has the

effect of halving the length of the steps taken at the next iteration. The result of halving the length of the steps is illustrated in subfigure (d). First notice that in addition to reducing the length of the step allowed, we also have changed the set of working constraints; in fact, we have reduced the set to only one working constraint. (The mechanism by which we identify nearby constraints and its connection to the value of Δ_k are discussed in more detail in Section 4.) Given both a new set of search directions and a reduction in the length of the steps allowed, we now have a feasible trial point that gives decrease on the value of the objective, which we make x_4. In addition, we set Δ_4 to Δ_3. In subfigure (e), the search moves to the new iterate. Once again the search is near the same two constraints seen in subfigure (c), so we have the same working set and the same set of search directions, though now we search with a reduced step length (i.e., Δ_4 = Δ_3). Once again, the length of the steps along the two descent directions is too long. So we set x_5 to x_4 and Δ_5 to half the value of Δ_4. The result of halving is illustrated in subfigure (f). In addition to reducing the length of the steps allowed, we have changed the set of working constraints; once again, we have reduced the set to only one working constraint. A new feasible trial point that gives decrease on the value of f is identified (to the Southeast) and at the next iteration the search will proceed from there. This example illustrates the essential features the implementation must address: how to identify the nearby constraints, how to obtain a set of search directions that contains a set of generators for the cone of feasible directions, how to find a step of an appropriate length, and how to categorize the iteration as either successful (k ∈ S) or unsuccessful (k ∈ U).

3. Notation and definitions.
Unless otherwise noted, norms and inner products are assumed to be the Euclidean norm and inner product. A cone K is a set that is closed under nonnegative scalar multiplication: K is a cone if x ∈ K implies ax ∈ K for all a ≥ 0. A cone is called finitely generated if there exists a finite set of vectors v_1, ..., v_r such that

K = { λ_1 v_1 + · · · + λ_r v_r : λ_1, ..., λ_r ≥ 0 }.

The vectors v_1, ..., v_r are called the generators of K. Conversely, the cone generated by a set of vectors v_1, ..., v_r is the set of all nonnegative combinations of these vectors. A special case occurs if K is a vector space. In that case, a set of generators is called a positive spanning set for K [7]. Thus a positive spanning set is like a linear spanning set but with the additional requirement that all the coefficients be nonnegative. One particular choice of generating set for R^n to which we will have recourse is the set of the positive and negative unit coordinate vectors

G = { e_1, e_2, ..., e_n, −e_1, −e_2, ..., −e_n }.

Clearly any vector in R^n is a nonnegative combination of the vectors in G. The maximal linear subspace contained in a cone K is called its lineality space. If L is the lineality space of K, then we call K ∩ L⊥ the pointy part of K. Finally, the polar of a cone K, denoted by K°, is the set

K° = { v : w^T v ≤ 0 for all w ∈ K }.

The polar K° is a convex cone, and if K is finitely generated, then so is K°.
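Because the polar is defined by inner-product inequalities against the generators, membership in K° is cheap to test once a finite generator set for K is in hand. A small sketch of ours (the tolerance is an assumption):

```python
# Sketch: v lies in the polar of K exactly when w^T v <= 0 for every
# generator w of K, so membership needs only dot products.

def in_polar(v, generators, tol=1e-12):
    return all(sum(wi * vi for wi, vi in zip(w, v)) <= tol
               for w in generators)
```

For example, if K is generated by e_1 and e_2 in R^2, then K° is the third quadrant, so (−1, −1) is in K° while (1, 0) is not.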

Given a vector v, let v_K and v_K° denote the projections (in the Euclidean norm) of v onto K and its polar K°. The polar decomposition [23, 26] says that any vector v can be written as the sum of its projection onto K and its projection onto K°, and the two projections are orthogonal:

v = v_K + v_K°, with v_K^T v_K° = 0.

This means that R^n is positively spanned by a set of generators of K together with a set of generators of K°, a generalization of the direct sum decomposition given by a subspace and its orthogonal complement.

4. Constructing the working set from the nearby constraints. We next discuss how we determine the working set of constraints. From this working set we compute the minimal set of necessary search directions. For convenience we combine the bounds and general linear constraints: let

A = ( A_I )
    (  I  ).

Let a_i^T denote the rows of A, indexed by i ∈ I. Let C_L,i and C_U,i be the sets where the lower and upper bounds on a_i^T x are binding:

C_L,i = { y : a_i^T y = (c_L)_i }  and  C_U,i = { y : a_i^T y = (c_U)_i }.

These sets are faces of the feasible polyhedron. Of interest to us are the outward-pointing normals to the faces within a prescribed distance of x. Let N(A_E) denote the nullspace of the equality constraint matrix A_E. Given a point x that is feasible for (1.1) and any set S, we define

dist(x, S) = min { ‖x − y‖ : y ∈ S ∩ (x + N(A_E)) }  if S ∩ (x + N(A_E)) ≠ ∅,
dist(x, S) = +∞  otherwise.

That is, we measure the distance from x to S inside x + N(A_E). We then define the sets of inequalities at their bounds that are within distance ε of x:

I_L(x, ε) = { i ∈ I : dist(x, C_L,i) ≤ ε }
I_U(x, ε) = { i ∈ I : dist(x, C_U,i) ≤ ε }.

We now define the working set of constraints. Given feasible x and ε > 0, the corresponding working set W(x, ε) is made up of two pieces. The first piece consists of the vectors W_E that correspond to the equality constraints together with every inequality for which the faces for both its lower and upper bounds are within distance ε of x.
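In the special case of bound constraints only (no equalities, so N(A_E) = R^n and dist(x, C_L,i) reduces to x_i − (x_L)_i), the index sets I_L(x, ε) and I_U(x, ε) can be computed directly. The following is our sketch, not the authors' code; the example values are assumptions:

```python
# Sketch: index sets of nearby bound faces when only bounds are present.
# Infinite bounds never enter, since x_i - (-inf) = +inf exceeds eps.
import math

def index_sets_bounds(x, x_L, x_U, eps):
    I_L = [i for i in range(len(x)) if x[i] - x_L[i] <= eps]
    I_U = [i for i in range(len(x)) if x_U[i] - x[i] <= eps]
    return I_L, I_U

# e.g., x near its first lower bound only:
I_L, I_U = index_sets_bounds([0.05, 0.5], [0.0, -math.inf], [1.0, 1.0], 0.1)
```

Here only the lower bound on the first variable is within ε = 0.1, so I_L = [0] and I_U is empty.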
If the rows a_e^T of the equality constraint matrix A_E are indexed by e ∈ E, then

W_E(x, ε) = { a_e : e ∈ E } ∪ { a_i : i ∈ I_L(x, ε) ∩ I_U(x, ε) }.

The second piece of the working set consists of the vectors W_I that are outward-pointing normals to the inequalities for which the face of exactly one of their lower or upper bounds is within distance ε of x. Let I_E(x, ε) = I_L(x, ε) ∩ I_U(x, ε); then

W_I(x, ε) = { −a_i : i ∈ I_L(x, ε) \ I_E(x, ε) } ∪ { a_i : i ∈ I_U(x, ε) \ I_E(x, ε) }.

Given x, we define the ε-normal cone N(x, ε) to be the cone generated by the linear span of the vectors in W_E(x, ε) together with the positive span of the vectors in

Fig. 4.1. The outward-pointing normals when constraints {1, 2} and {2, 3} are the working sets at x_1 and x_2. Since the distance from x_3 to the boundary is greater than ε, the last working set is empty.

W_I(x, ε). If both of the latter sets are empty, then N(x, ε) = {0}. The ε-tangent cone, denoted by T(x, ε), is the polar of N(x, ε):

T(x, ε) = { v : w^T v ≤ 0 for all w ∈ N(x, ε) }.

If N(x, ε) = {0}, then T(x, ε) = R^n (in other words, if the boundary is more than distance ε away, and one looks only within distance ε of x, then the problem looks unconstrained). Several examples are illustrated in Figure 4.2.

Fig. 4.2. The cones N(x, ε) and T(x, ε) for the values ε_1, ε_2, and ε_3. Note that for this example, as ε varies from ε_1 to 0, there are only three distinct pairs of cones (N(x, ε_3) = {0}).

The significance of N(x, ε) is that for suitable choices of ε, its polar T(x, ε) approximates the polyhedron near x, where "near" is in terms of ε. (More precisely, x + T(x, ε) approximates the feasible region near x.) The cone T(x, ε) is important because if T(x, ε) ≠ {0}, then one can proceed from x along all directions in T(x, ε) for a distance of at least ε and still remain inside the feasible region [18, 16]. Figure 4.2 illustrates this point. In the algorithm, we allow different values of ε_k at every iteration k. There is a lot of flexibility in the choice of ε_k, as discussed in [16]. We have chosen to yoke ε_k to Δ_k. A connection between ε_k and Δ_k makes sense. If every step likely to be considered at the current iteration would remain inside the feasible region, there seems little point in burdening the search with the need to consider nearby constraints. When the length of the steps allowed means that infeasible trial points could be constructed, however, then it makes sense to construct a working set

using these nearby constraints and to generate directions that allow the search to move along them. Revisiting the example in Figure 2.1, in Figure 4.3 we show the choice of ε_k that was used to identify the working set of nearby constraints.

Fig. 4.3. One strategy for choosing the ε_k used in linearly constrained GSS applied to the modified Broyden tridiagonal function; subfigures (a)-(f) correspond to iterations k = 0, ..., 5 of Figure 2.1.

This illustration uses a simple way of choosing ε_k: ε_k is the same as Δ_k. In practice we have found that the choice need not be much more complicated, a point we discuss in more detail later.

5. Choosing the set of search directions. Once the nearby constraints for a given iteration have been identified, we can define the set of search directions. The preceding figures contain clues to the process. In order to ensure that the convergence results in [16] hold, we must arrange for the following. The search must conform to the local geometry of Ω in the sense that at each iteration k, there is at least one direction in D_k, the set of search directions at iteration k, along which the search can take an acceptably long step and still remain feasible. We decompose D_k into G_k and H_k, where G_k is made up of the directions necessary for convergence and H_k contains any additional directions. Several options for choosing G_k have been considered:
1. G_k contains generators for all the cones T(x_k, ε), for all ε ∈ [0, ε*], where ε* > 0 is independent of k [18].
2. G_k is exactly the set of generators for the cone T(x_k, ε_k), where ε_k ≥ 0 is updated according to whether acceptable steps are found, and H_k = ∅ [20].

Fig. 5.1. Condition 1 is needed to avoid a poor choice when choosing G_k: (a) poor choice; (b) reasonable choice; (c) optimal choice (with respect to κ(G_k)).

3. G_k contains generators for the cone T(x_k, ε_k), where ε_k = Δ_k β_max, and β_max is described below in Condition 2 [16].

In the work discussed here, we use the third option. As is necessary for any of the three options, the following is required of G_k [15, 16]:

Condition 1. For every k, G_k generates T_k = T(x_k, ε_k). In addition, there exists a constant κ_min > 0, independent of k, such that for all k, if G_k ≠ {0} then

(5.1)  κ(G_k) = min_{v ∈ R^n, v_{T_k} ≠ 0} max_{d ∈ G_k} ( v_{T_k}^T d ) / ( ‖v_{T_k}‖ ‖d‖ ) ≥ κ_min.

The lower bound in Condition 1 is needed when there is a lineality space present in T(x_k, ε_k). In this case we have freedom in the choice of generators. Figure 5.1 depicts the situation where T(x_k, ε_k) contains a halfspace. We require a lower bound on κ(G_k) to avoid the situation illustrated in Figure 5.1(a), in which the search directions can be almost orthogonal to the direction of steepest descent. Figure 5.1(c), which uses a minimal number of vectors but ensures as large a value for κ(G_k) as possible, represents an optimal choice in the absence of explicit knowledge of the gradient. All possible cones N(x, ε) and T(x, ε) can be generated using a finite number of vectors, as formalized in the following proposition from [16].

Proposition 5.1. There exists a finite set A such that A contains generators for every set N(x, ε) with x ∈ Ω and ε ≥ 0. As a consequence, there also exists a finite set G such that G contains generators for every set T(x, ε) with x ∈ Ω and ε ≥ 0.

The definition of A ensures that for every x ∈ Ω and ε ≥ 0 there exists A ⊆ A such that A generates N(x, ε). Furthermore, there exists a constant ν_min > 0, such that for every x ∈ Ω and ε ≥ 0 the following holds:

(5.2)  κ(A) ≥ ν_min.
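To make the quantity in (5.1) concrete, the sketch below (our numerical illustration, not from the paper) estimates κ(G) in the unconstrained special case T_k = R^2, where v_{T_k} = v, by sampling unit vectors v on a fine angular grid. For the compass set the worst v lies midway between adjacent generators, so the estimate should come out near cos(π/4) ≈ 0.707.

```python
# Sketch: sampled estimate of kappa(G) from (5.1), assuming T_k = R^2
# so that v_{T_k} = v; all generators here are unit vectors.
import math

def kappa_estimate(G, samples=360):
    worst = 1.0
    for s in range(samples):
        theta = 2.0 * math.pi * s / samples
        v = (math.cos(theta), math.sin(theta))          # unit vector
        best = max(v[0] * d[0] + v[1] * d[1] for d in G)  # cosine with G
        worst = min(worst, best)
    return worst

G = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]   # compass set
kappa = kappa_estimate(G)    # close to cos(pi/4)
```

The orthonormality of the compass set is exactly what gives this comfortably positive κ, which is why the implementation favors it when the working set is empty.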
There is also the set H_k of additional directions we choose to add to improve the search. We discuss this further in Section 6.

In addition, as discussed in more detail in [16], we must enforce bounds on the lengths of the search directions themselves:

Condition 2. There exist β_min > 0 and β_max > 0, independent of k, such that for all k, β_min ≤ ‖d‖ ≤ β_max for all d ∈ G_k.

In our implementation we choose to satisfy Condition 2 by normalizing all search directions so that β_min = β_max = 1. The necessary conditions on the set of search directions all involve the calculation of generators for the polars of cones determined by the current working set. We treat the determination of the search directions in increasing order of difficulty. Figure 5.2 shows the various cases to consider.

Fig. 5.2. The components involved in the generation of the search directions (Section 5):
- Easy cases: empty working set (Section 5.1); only bound constraints in the working set (Section 5.1); only equality constraints in the working set (Section 5.2)
- Straightforward case: nondegenerate working set (uses LAPACK)
- Hard case: degenerate working set (uses cddlib)

5.1. When the working set contains only bound constraints or is empty. If only bound constraints are present the situation is quite simple since the generators for the ε-normal and ε-tangent cones can be drawn from the set of coordinate directions. We follow the prescription in [17] and include in G_k the unit coordinate vectors ±e_i. If some of the variables do not have bounds, a smaller number of the unit coordinate vectors could be used to form G_k [18], but in our testing we have found that the coordinate directions are a consistently effective set of search directions. This has proven particularly true in the engineering problems we have seen, where many of the responses are monotonic (at least locally) with respect to each variable. The coordinate directions have physical meaning for such applications; thus, including them in the set of search directions seems to lead to better overall performance.
A similar situation obtains when the working set is empty, so N(x_k, ε_k) = {0} and T(x_k, ε_k) = R^n. In this case, any positive basis suffices [15]. We rely on the set

of unit coordinate vectors ±e_i because it is apt for many engineering problems and because the orthonormality of the vectors ensures a reasonable value for κ_min in (5.1).

5.2. When the working set consists only of equality constraints. If the working set consists only of equality constraints, then the generators of T(x_k, ε_k) correspond to a positive spanning set for the nullspace of A_E. In this situation by default we compute an orthonormal basis Z for this nullspace and take as our generators the compass-like set consisting of the columns of Z and their negatives. We also allow for a nullspace basis defined by a basic/nonbasic division of the variables, as in the simplex method.

5.3. When the working set contains general linear constraints. In the case shown in Figure 5.3, once the nearby constraints have been identified using x_k and ε_k, their outward-pointing normals a_1 and a_2 are extracted from the rows of the constraint matrix A to form the set of working constraints W_k = {a_1, a_2}. The vectors in W_k are the generators for the ε-normal cone N(x_k, ε_k).

Fig. 5.3. The outward-pointing normals a_1 and a_2 of the nearby (working) constraints, which are then translated to x_k to form the cone N(x_k, ε_k) and define its polar T(x_k, ε_k). The generators for N(x_k, ε_k) and T(x_k, ε_k) form the set of search directions. In this example, notice that the same vectors are generators for the normal cone and the tangent cone at v_0.

5.3.1. When the working set is known to be nondegenerate. Figure 5.3 depicts the computationally straightforward case. The generators of N(x_k, ε_k) are linearly independent. As a consequence, N(x_k, ε_k) is linearly isomorphic to the positive orthant in R^2, and the same isomorphism maps T(x_k, ε_k) to the negative orthant. Computing the generators of T(x_k, ε_k) becomes a simple matter of linear algebra.
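For the equality-only case just described, an orthonormal nullspace basis Z can be computed with a Gram-Schmidt sweep. The pure-Python sketch below is ours (the production code would use LAPACK as Figure 5.2 indicates, and the example matrix A_E is an assumption): it orthonormalizes the rows of A_E, projects them out of each coordinate vector, and orthonormalizes what survives.

```python
# Sketch: orthonormal basis Z for the nullspace of A_E via Gram-Schmidt;
# the compass-like generator set is then the columns of Z and -Z.

def gram_schmidt(vectors, tol=1e-12):
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:                       # remove existing components
            c = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - c * bi for wi, bi in zip(w, b)]
        nrm = sum(wi * wi for wi in w) ** 0.5
        if nrm > tol:                         # drop (numerically) zero vectors
            basis.append([wi / nrm for wi in w])
    return basis

def nullspace_basis(A_E, n):
    row_basis = gram_schmidt(A_E)             # orthonormalize the rows of A_E
    candidates = []
    for i in range(n):                        # project rows out of each e_i
        e = [0.0] * n
        e[i] = 1.0
        for b in row_basis:
            c = sum(ei * bi for ei, bi in zip(e, b))
            e = [ei - c * bi for ei, bi in zip(e, b)]
        candidates.append(e)
    return gram_schmidt(candidates)

Z = nullspace_basis([[1.0, 1.0, 1.0]], 3)     # nullspace of x1 + x2 + x3 = 0
directions = Z + [[-zi for zi in z] for z in Z]
```

For the example constraint the nullspace is two-dimensional, so Z has two orthonormal vectors and the compass-like set has four directions.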
More generally, in the easy case N(x_k, ε_k) is linearly isomorphic to a cone in some Euclidean space that is the sum of the positive orthant and a vector space, and T(x_k, ε_k) is linearly isomorphic to the corresponding polar, which is the sum of the negative orthant and a vector space. The following proposition describes this easier case of computing generators for T(x_k, ε_k). Proposition 5.2 treats equality constraints explicitly, an elaboration of the construction given in [18]. We explain shortly how the result is used in the actual algorithm.

Proposition 5.2. Suppose N(x_k, ε_k) is generated by the positive span of the columns of the matrix V_P and the linear span of the columns of the matrix V_L:

N(x_k, ε_k) = { v : v = V_P λ + V_L α, λ ≥ 0 }.

Let Z be a matrix whose columns are a basis for the nullspace of V_L^T, and N be a

matrix whose columns are a basis for the nullspace of V_P^T Z. Finally, suppose a right inverse R exists for V_P^T Z. Then T(x_k, ε_k) is the positive span of the columns of −ZR together with the linear span of the columns of ZN:

(5.3)  T(x_k, ε_k) = { w : w = −ZRu + ZNξ, u ≥ 0 }.

Proof. We begin with d = −ZRu + ZNξ, where u ≥ 0. Then V_L^T d = 0 because V_L^T Z = 0. Meanwhile, since V_P^T ZR = I and V_P^T ZN = 0, we have

V_P^T d = V_P^T ( −ZRu + ZNξ ) = −u ≤ 0.

Thus d has a nonpositive inner product with all the generators of N(x_k, ε_k), so d ∈ T(x_k, ε_k). Conversely, suppose d ∈ T(x_k, ε_k). Observe that because the linear span of the columns of V_L is contained in N(x_k, ε_k), we must have V_L^T d = 0. Now consider the vector

d̄ = ZR V_P^T d = ( −ZR )( −V_P^T d ).

Since −V_P^T d ≥ 0, we see that d̄ lies in the positive span of the columns of −ZR. Then

(5.4)  V_P^T (d − d̄) = V_P^T d − V_P^T ZR V_P^T d = 0.

Meanwhile, we know that V_L^T (d − d̄) = 0, so for some w we have d − d̄ = Zw. From (5.4) we obtain V_P^T Z w = 0. This means that w is in the nullspace of V_P^T Z, so w = Nξ for some ξ, whence d − d̄ = ZNξ. The result then follows from the identity d = −ZRu + ZNξ with u = −V_P^T d ≥ 0.

The adjective "nondegenerate" comes about because the easy case corresponds to the situation where the cone generated by the columns of V_P is known to have a nondegenerate vertex at the origin. In applying this construction, we include as columns of V_L all equality constraints along with the inequality constraints for which we know that both a_i and −a_i are in the working set. The latter situation can occur if ε_k is large enough that the faces associated with both the lower and upper bound on a_i^T x are within distance ε_k of x_k. The columns of V_P are the remaining outward-pointing normals for the constraints in the working set. We then compute a basis Z for the nullspace of V_L^T and check whether a right inverse R exists for V_P^T Z. If so, the generators for T(x_k, ε_k) are given by (5.3).
Observe that if R exists, then ZR is a right inverse for V_P^T, so we actually check for the existence of a right inverse of that form. By default, our implementation computes the pseudoinverse to serve as the right inverse in Proposition 5.2. This choice is particularly attractive because the columns of the pseudoinverse are orthogonal to any vector in the lineality space of T(x_k, ε_k), which helps improve the conditioning of the resulting search directions (e.g., in the sense of Condition 1). Within our implementation we include as an option a right inverse found by a basic/nonbasic division of the variables. To complete the set of generators of T(x_k, ε_k), we construct an orthonormal basis N for the nullspace of V_P^T Z. We then take as the remaining generators the columns of ±ZN; this corresponds to a compass-like search in the nullspace of V_P^T.
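In the simplest nondegenerate setting, with no equalities (so Z = I and N is empty) and two linearly independent working-set normals in R^2, the right inverse is just (V_P^T)^{-1} and the construction of Proposition 5.2 reduces to a 2x2 inversion, the tangent-cone generators being the columns of −R. A sketch of ours (the example normals are assumptions, and the construction follows the proposition as reconstructed above):

```python
# Sketch: generators of T(x_k, eps_k) in R^2 via Proposition 5.2 when
# Z = I, N is empty, and V_P has two independent columns a1, a2.

def generators_2d(a1, a2):
    m = [[a1[0], a1[1]], [a2[0], a2[1]]]          # rows of V_P^T
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-12:
        raise ValueError("working set is degenerate")
    # R = (V_P^T)^{-1} by the 2x2 inverse formula
    R = [[ m[1][1] / det, -m[0][1] / det],
         [-m[1][0] / det,  m[0][0] / det]]
    # the columns of -R generate the eps-tangent cone
    return [(-R[0][0], -R[1][0]), (-R[0][1], -R[1][1])]

d1, d2 = generators_2d((0.0, 1.0), (1.0, 1.0))
```

Each returned direction makes a nonpositive inner product with both working-set normals, as the proposition requires, and one of them lies along the face a_2^T v = 0.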

IMPLEMENTING GSS FOR LINEARLY CONSTRAINED PROBLEMS

When the working set may be degenerate. If Proposition 5.2 cannot be applied (the degenerate case), the construction of the search directions becomes more complex, both computationally and algorithmically. In this case the rows of V_P^T are not linearly independent, and N(x_k, ε_k) is a cone with a degenerate vertex at the origin. The problem of finding generators for T(x_k, ε_k) is a special case of the problem of enumerating the vertices and extreme rays of a polyhedron (the V-representation of the polyhedron), given the description of the polyhedron as an intersection of halfspaces (the H-representation). This is a well-studied problem in computational geometry (see [2] for a survey), and effective algorithms have been developed. The two main classes of algorithms for solving this type of problem are pivoting algorithms and insertion algorithms. Pivoting algorithms move from vertex to vertex along the edges of the polyhedron by generating feasible bases, as in the simplex method for linear programming. The reverse search algorithm [3, 1] is an example of such a method. Insertion algorithms compute the V-representation by working with the intersection of an increasing number of halfspaces. Typically, these algorithms start with a maximal linearly independent set of normals defining the halfspaces and construct an initial set of extreme rays as in Proposition 5.2. Additional halfspaces are then introduced and the vertices and extreme rays are recomputed; this process continues until the entire set of halfspaces has been introduced. The Fourier–Motzkin Double Description method [8, 24] is an example of an insertion algorithm. Others have observed that pivoting algorithms can be inefficient if they encounter a highly degenerate vertex [2]. This was our experience using the lrs reverse search code [1] in the highly degenerate case.
For that reason we instead use the Double Description method, an insertion method, computing the necessary search directions using library procedures from Fukuda's cddlib code [9]. This package is based on an efficient implementation of the Double Description algorithm and has been shown to be effective in dealing with highly degenerate vertices [10]. We found it more efficient to retain equalities in the problem rather than to eliminate them by reparameterizing the problem in terms of a basis for the nullspace of the equality constraints. In some cases elimination of equalities led to a significant deterioration in the performance of cddlib. This is because the constraint coefficients went from being sparse and frequently integral to being dense arrays of relatively random floating-point numbers, which made the Fourier–Motzkin elimination process more difficult. Thus, eliminating equalities had some serious disadvantages, which in our testing were not offset by any advantages.

6. Augmenting the core set of search directions. While generators of T(x_k, ε_k) are theoretically sufficient for the convergence analysis, in practice there are significant benefits to including additional search directions. The situation illustrated in Figure 6.1 shows the desirability of additional directions, since the generators of T(x_k, ε_k) point into the interior of the feasible region, while it may be preferable to move toward the boundary. At the very least, we would like the total set of search directions to be a positive basis for R^n or, if equality constraints are present, a positive basis for the nullspace of the equality constraints. One simple way to accomplish this is to include in H_k the set N_k of outward-pointing normals to the working set of constraints. Since these directions generate N(x_k, ε_k), and T(x_k, ε_k) is polar to N(x_k, ε_k), the polar decomposition gives R^n = N(x_k, ε_k) + T(x_k, ε_k). Thus, adding the directions generating N(x_k, ε_k) means our search directions constitute a positive spanning set for R^n, as illustrated in Figure 6.2. See Section 9 for a computational illustration of the advantage of including these directions.

Fig. 6.1. Minimizing the number of function evaluations at a single iteration might require more total function evaluations, since in this instance it will take at least five function evaluations to find descent. (a) The first iteration is necessarily unsuccessful, and so Δ_1 is halved for the next iteration. (b) The second iteration is also unsuccessful, and so Δ_2 is halved again. (c) Now, locally, the problem looks unconstrained and a successful step toward the boundary is possible.

Fig. 6.2. Why including the generators for the ε-normal cone N(x, ε_1) might be a good idea: descent will be found after at most three function evaluations. (a) The generators of N(x, ε_1) are both descent directions. (b) Limiting the steps along the generators of N(x, ε_1) ensures feasible steps.

The preceding discussion needs to be made more precise if equality constraints are present. Typically, in this case the outward-pointing normals a_i to constraints in the working set are infeasible directions: they do not lie in the nullspace of the equality constraints, so a step along a direction a_i will lead to violation of the equality constraints. In this situation we replace the outward-pointing normals by their projections into the nullspace of the equality constraint matrix A_E.

We have tested other augmentations of the search directions as well, but have found no real advantage. For instance, if the directions in G_k linearly span R^n, then adding the directions −G_k yields a positive basis. This follows from the simple observation that c_1 v_1 + ··· + c_r v_r = |c_1| sign(c_1) v_1 + ··· + |c_r| sign(c_r) v_r. We have also tried adding generators for T(x_k, ε) for values of ε less than ε_k. This allows us to anticipate the effect of reducing ε_k in subsequent iterations. The operation of the Fourier–Motzkin Double Description algorithm is such that appending generators to a set of directions and then recomputing the polar can be done efficiently.

7. Determining the length of the step. The step-length control parameter Δ_k determines the length of a step along each direction for a given set of search directions. In the unconstrained case, where the set of directions at iteration k is

D_k = { d_k^(1), d_k^(2), ..., d_k^(p_k) },

the associated set of trial points would be

{ x_k + Δ_k d_k^(i) | i = 1, ..., p_k }.

When constraints are introduced, however, we must consider the possibility that some of the trial points will be infeasible. With this in mind, we choose the maximum Δ_k^(i) ∈ [0, Δ_k] such that x_k + Δ_k^(i) d_k^(i) ∈ Ω, and define the set of trial points as

{ x_k + Δ_k^(i) d_k^(i) | i = 1, ..., p_k }.

In choosing Δ_k^(i), a full step should be used if feasible, as stated in Condition 3.

Condition 3. If x_k + Δ_k d_k^(i) ∈ Ω, then Δ_k^(i) = Δ_k.

The simplest way of satisfying Condition 3 is to choose Δ_k^(i) = Δ_k if the result will be feasible and Δ_k^(i) = 0 otherwise, thereby simply rejecting infeasible points. We have found it much more efficient, however, to allow steps that stop at the boundary, as in [19, 20]. This is illustrated in Figure 6.2(b). More generally, for the search directions in H_k we choose Δ_k^(i) to be the maximum value in [σ_tol Δ_k, Δ_k] satisfying x_k + Δ_k^(i) d_k^(i) ∈ Ω, where σ_tol ≥ 0; our implementation uses a small positive default value for σ_tol. The factor σ_tol prevents excessively small steps. Such steps can yield only a very tiny improvement, which has the effect of retarding decrease in Δ_k.

8. GSS algorithms for problems with linear constraints. The variant of GSS that we have implemented for problems with linear constraints is outlined in Algorithm 8.1.
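The step-length rule of Section 7 (take the largest step in [σ_tol Δ_k, Δ_k] that keeps the trial point feasible, otherwise 0) reduces to a standard ratio test when the feasible region is given by inequalities Ax ≤ b. A sketch, with a function name and signature of our own choosing:

```python
import numpy as np

def feasible_step(x, d, A, b, delta, sigma_tol=1e-3):
    """Largest step length in [0, delta] along d keeping A(x + t d) <= b,
    reduced to 0 if it falls below the floor sigma_tol * delta.

    A sketch of the rule in Section 7 via a standard ratio test;
    x is assumed feasible, so the slacks b - Ax are nonnegative."""
    x, d = np.asarray(x, float), np.asarray(d, float)
    A, b = np.asarray(A, float), np.asarray(b, float)
    Ad = A @ d
    slack = b - A @ x
    t = delta                       # full step if no constraint binds
    for ad, s in zip(Ad, slack):
        if ad > 1e-15:              # constraint tightens along d
            t = min(t, s / ad)
    if t < sigma_tol * delta:       # prevent excessively small steps
        return 0.0
    return t
```

Note that Condition 3 is respected automatically: if the full step x + Δd is feasible, no ratio binds and t = Δ is returned.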
See [15, 16] for a discussion of some of the many other possibilities. The remainder of this section is devoted to further explanation of Algorithm 8.1.

8.1. Finding an initial feasible iterate. GSS methods for linearly constrained problems are feasible-iterates methods: all iterates x_k, k = 0, 1, 2, ..., satisfy x_k ∈ Ω. If an infeasible initial iterate is given, then we compute its Euclidean projection onto the feasible polyhedron; this is a quadratic program. For some of the test problems discussed in Section 9 the starting value provided is infeasible, and for these we used this projection strategy. Alternatively, one could use Phase 1 of the simplex method to obtain a starting point at a vertex of the feasible region.
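For the bound-constrained special case, the Euclidean projection of Section 8.1 has a closed form: componentwise clipping. A minimal sketch (the name is ours; projecting onto a general polyhedron requires an actual QP solver):

```python
import numpy as np

def project_onto_bounds(x, lower, upper):
    """Euclidean projection onto the box { x : lower <= x <= upper }.

    This closed form covers only the bound-constrained special case of
    the projection discussed in Section 8.1; with general linear
    constraints the projection is a quadratic program."""
    return np.minimum(np.maximum(np.asarray(x, float), lower), upper)
```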

Algorithm 8.1 (Linearly constrained GSS using a sufficient decrease globalization strategy).

Initialization.
  Let f : R^n → R be given.
  Let Ω be the feasible region.
  Let x_0 ∈ Ω be the initial guess.
  Let Δ_tol > 0 be the step-length convergence tolerance.
  Let Δ_0 > Δ_tol be the initial value of the step-length control parameter.
  Let 0 < θ_max < 1, where the θ_k are the contraction parameters.
  Let ε_max > β_max Δ_tol be the maximum distance used to identify nearby constraints (ε_max = +∞ is permissible).
  Let ρ(Δ) = αΔ² be the forcing function, with α ∈ (0, 1).

Algorithm. For each iteration k = 0, 1, 2, ...

Step 1. Let ε_k = min{ ε_max, β_max Δ_k }. Choose a set of search directions D_k = G_k ∪ H_k satisfying Conditions 1 and 2.

Step 2. If there exists d_k ∈ D_k and a corresponding Δ̃_k ∈ [0, Δ_k] satisfying Condition 3 such that x_k + Δ̃_k d_k ∈ Ω and

  f(x_k + Δ̃_k d_k) < f(x_k) − ρ(Δ_k),

then:
  Set x_{k+1} = x_k + Δ̃_k d_k (a successful step).
  Set Δ_{k+1} = φ_k Δ_k for any choice of φ_k ≥ 1.

Step 3. Otherwise, for every d ∈ G_k, either x_k + Δ_k d ∉ Ω or

  f(x_k + Δ_k d) ≥ f(x_k) − ρ(Δ_k).

In this case:
  Set x_{k+1} = x_k (an unsuccessful step).
  Set Δ_{k+1} = θ_k Δ_k for some θ_k such that 0 < θ_k ≤ θ_max.
  If Δ_{k+1} < Δ_tol, then terminate.

Fig. 8.1. Linearly constrained GSS using a sufficient decrease globalization strategy.

8.2. Step acceptance criteria. An iteration k is successful, i.e., k ∈ S, if it satisfies the decrease condition; that is, there exists a d_k ∈ D_k for which

(8.1)  f(x_k + Δ_k d_k) < f(x_k) − ρ(Δ_k).

The nonnegative function ρ defined on [0, +∞) is called the forcing function and must satisfy one of the following two requirements. Either

(8.2)  ρ(Δ) ≡ 0,

or

(8.3)  ρ is continuous, ρ(Δ) = o(Δ) as Δ → 0, and ρ(Δ₁) ≤ ρ(Δ₂) for Δ₁ < Δ₂.

Choosing ρ to be identically zero as in (8.2) imposes a simple decrease condition on the acceptance of the step. Otherwise, choosing ρ as in (8.3) imposes a sufficient decrease condition. We use ρ(Δ) = αΔ² in the acceptance criterion (8.1). With this choice of ρ, all the terms on the right-hand side of (8.5) are O(Δ_k) [16]. By default, the value of α is 10⁻⁴; however, for any given problem the value of α should be chosen to reflect the relative magnitudes of f and Δ.

8.3. Stopping criteria. We stop the algorithm once the step-length control parameter Δ_k falls below a specified tolerance Δ_tol. This stopping criterion is motivated by the relationship between Δ_k and χ(x_k), a particular measure of progress toward a Karush–Kuhn–Tucker (KKT) point of (1.1). This connection is stated in Theorem 8.3. For x ∈ Ω, let

χ(x) = max { −∇f(x)^T w : x + w ∈ Ω, ||w|| ≤ 1 }.

This quantity is discussed at length in [5, 6]. Of interest to us here are the following properties: (1) χ(x) is continuous on Ω, (2) χ(x) ≥ 0, and (3) χ(x) = 0 if and only if x is a KKT point for (1.1). We therefore wish to see χ(x_k) tend to zero. Theorem 8.3 says that this occurs at unsuccessful iterations as Δ_k tends to zero.

Theorem 8.3 is most simply stated under the following assumptions. Relaxations of these conditions, and a more general version of the theorem, are found in [16].

Assumption 8.1. The set F = { x ∈ Ω | f(x) ≤ f(x_0) } is bounded.

Assumption 8.2. The gradient of f is Lipschitz continuous with constant M on F.

If both assumptions hold, then there exists γ > 0 such that for all x ∈ F,

(8.4)  ||∇f(x)|| < γ.

Theorem 8.3. Suppose that Assumptions 8.1 and 8.2 hold. Consider the linearly constrained GSS algorithm given in Figure 8.1, which satisfies Conditions 1, 2, and 3.
If k is an unsuccessful iteration and ε_k = β_max Δ_k, then

(8.5)  χ(x_k) ≤ ( M/κ_min + γ/ν_min ) β_max Δ_k + ( 1/(κ_min β_min) ) ρ(Δ_k)/Δ_k.

Here, κ_min is from Condition 1, ν_min is from (5.2), M is from Assumption 8.2, γ is from (8.4), and β_max and β_min are from Condition 2.

Obviously, one can also stop after a specified number of objective evaluations, as we do in our testing in Section 9.
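To make the control flow of Algorithm 8.1 concrete, the following is a minimal sketch (not the authors' code) specialized to bound constraints, with the compass directions ±e_i as D_k, steps truncated at the boundary, φ_k = 1, and θ_k = 1/2; all names are ours:

```python
import numpy as np

def gss_bounds(f, x0, lo, hi, delta0=1.0, delta_tol=1e-6, alpha=1e-4):
    """Minimal GSS sketch for min f(x) subject to lo <= x <= hi.

    Search directions are the compass set {+-e_i}; a trial step is
    truncated at the boundary, and the sufficient decrease condition
    rho(delta) = alpha * delta**2 must hold before a step is accepted."""
    x = np.clip(np.asarray(x0, dtype=float), lo, hi)
    delta, fx, n = delta0, None, np.asarray(x0).size
    fx = f(x)
    dirs = np.vstack([np.eye(n), -np.eye(n)])
    while delta >= delta_tol:
        success = False
        for d in dirs:
            # For compass directions, clipping realizes the truncated
            # step x + t*d with t the largest feasible value in [0, delta].
            trial = np.clip(x + delta * d, lo, hi)
            ft = f(trial)
            if ft < fx - alpha * delta**2:   # sufficient decrease (8.1)
                x, fx, success = trial, ft, True
                break
        if not success:
            delta *= 0.5                     # contraction, theta_k = 1/2
    return x, fx
```

On f(x) = (x − 0.7)² over [0, 0.5], for instance, the iterates march to the constrained minimizer x = 0.5 and then Δ contracts until the tolerance is reached.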

8.4. Scaling. The relative scaling of the variables can have a profound effect on the efficiency of optimization algorithms. We allow as an option the common technique of shifting and rescaling the variables so that they have a similar range and size. We work in a computational space whose variables w are related to the original variables x via x = Dw + c, where D is a diagonal matrix with positive diagonal entries and c is a constant vector, both provided by the user. We solve the transformed problem

minimize  f(Dw + c)
subject to  D⁻¹(x_L − c) ≤ w ≤ D⁻¹(x_U − c)
            c_L − A_I c ≤ A_I D w ≤ c_U − A_I c
            A_E D w = b − A_E c.

The scaling scheme we use in the numerical tests presented in Section 9 is based on one described in [12]. The latter scaling is based on the range of the variables and transforms the variables to lie between −1 and 1. We modify the scheme slightly by not scaling variables whose original range is narrow. One advantage of scaling is that it makes it easier to define a reasonable default value for Δ_0. For the results reported in Section 9, whenever scaling is applied to a problem, the default value for Δ_0 is 2.

8.5. Data caching. The potentially regular pattern of the steps we take means that we may revisit points. This is particularly true for unconstrained and bound-constrained problems. To avoid reevaluating known values of f, we store the points x at which we evaluate the objective, together with the corresponding objective values. Before evaluating f at a trial point x_t, we examine the data cache to see whether we have already evaluated f at a point within some relative distance of x_t: we do not evaluate f at x_t if we have already evaluated f at a point x for which ||x_t − x|| < r_tol ||x_t||, where r_tol is a small relative tolerance. Caching yields only minor improvements in efficiency if general linear inequality constraints are present.
If a considerable number of general linear constraints (i.e., constraints other than bounds) are in the working set, then there is only a small chance of reevaluating f near a previous iterate. Still, if each evaluation of the objective function is expensive, then it is worth avoiding any reevaluations.

9. Some numerical illustrations. We present some illustrations of the behavior of our implementation of Algorithm 8.1 using problems from the CUTEr test suite [14, 13]. In several of these problems we must deal repeatedly with highly degenerate instances of the search direction calculation discussed in Section 5.3. As these examples show, the solution of such problems is quite tractable given the power of a good implementation of the Fourier–Motzkin Double Description algorithm.

All of the problems considered here have nonlinear objectives. We start with the value of x_0 specified by CUTEr, though we often had to project it into the feasible region (as discussed in Section 8.1). In the tests we present here, we terminated the program when the number of evaluations of the objective reached that which would have been needed for 30 forward-difference evaluations of the gradient if one had used the equality constraints to eliminate variables. When the problem is scaled (as it is for all but one of the examples), we use Δ_0 = 2, as discussed in Section 8.4. While our implementation supports an option to increase Δ_k after a successful iteration, for the results reported here Δ_{k+1} = Δ_k whenever k ∈ S (i.e., φ_k = 1 for all k). Whenever k ∈ U, Δ_{k+1} = Δ_k/2 (i.e., θ_k = 1/2 for all k).
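The cache test of Section 8.5 (skip an evaluation when a previously evaluated point lies within relative distance r_tol) can be sketched as follows; the class name is ours, and a linear scan is used for clarity:

```python
import numpy as np

class ObjectiveCache:
    """Sketch of the data cache of Section 8.5: reuse a stored value when
    a previously evaluated point lies within relative distance r_tol."""

    def __init__(self, f, r_tol=1e-8):
        self.f, self.r_tol = f, r_tol
        self.points, self.values = [], []
        self.evals = 0  # number of true objective evaluations

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        for xp, fp in zip(self.points, self.values):
            # ||x - xp|| < r_tol * ||x||  =>  reuse the cached value.
            if np.linalg.norm(x - xp) < self.r_tol * np.linalg.norm(x):
                return fp
        self.evals += 1
        fx = self.f(x)
        self.points.append(x.copy())
        self.values.append(fx)
        return fx
```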

We set ε_max = +∞ so that it plays no role in the results reported here. Since we normalize all the search directions, β_max = 1; thus ε_k = Δ_k for all k. We use ρ(Δ) = αΔ² as our forcing function with α = 10⁻⁴, as noted in Section 8.2. For the results presented here, G_k consists of the generators for the ε-tangent cone T(x_k, ε_k). With one exception (for the purposes of illustrating the potential advantage of allowing a nonempty set H_k), H_k consists of the generators for the ε-normal cone N(x_k, ε_k), projected into the nullspace of any equality constraints in the working set, as described in Section 6. Recall that by definition D_k = G_k ∪ H_k. As discussed in Section 7, we allow steps that stop at the boundary: for the search directions in H_k, we choose Δ_k^(i) to be the maximum value in [σ_tol Δ_k, Δ_k] satisfying x_k + Δ_k^(i) d_k^(i) ∈ Ω, where we use the default value of σ_tol. Since we have occasion to mention run-times in passing, we note that the tests were run on a dual 3.2 GHz Xeon workstation with 1 GB of memory.

Fig. 9.1. Results for the avion2 test problem: 49 variables, 15 linear equality constraints, and upper and lower bounds on all variables. (Left panel: value of the objective and value of Δ versus number of objective evaluations. Right panel: size of the working set, number of core directions, and number of extra directions at each iteration.)

The first problem is avion2, an aeronautical design problem. In this case the equality constraints have rank 15, leaving 34 degrees of freedom in the problem. The plot on the left in Figure 9.1 gives the iteration history for the value of the objective and the value of Δ, both plotted against the number of evaluations of the objective. Circles represent the value of the objective, with filled circles indicating unsuccessful iterations; the value of Δ is plotted with a distinct marker. The optimal objective value (computed using npsol [11]) is the lowest value on the vertical scale.
The plot on the right in Figure 9.1 shows the size of the working set at each iteration, as well as the number of vectors in G_k, the set of generators for T(x_k, ε_k) (core directions), and the number of vectors in H_k, the extra search directions we include (extra directions). The set H_k has fewer vectors than the working set because we do not include outward-pointing normals for equality constraints in the working set. The solid line indicates the number of variables in the problem; when the size of the working set is above this line, the algorithm must deal with the degeneracy discussed in Section 5.3. For avion2, a total of 0.12 seconds is spent in cddlib treating the degenerate cases.

The next test problem is spanhyd, a model of a hydroelectric system. This problem includes network constraints. Since the ranges allowed for some of the variables are small, we scale only a subset of the variables, though we still start with Δ_0 = 2. The plot on the left in Figure 9.2 shows that the algorithm makes good progress


More information

1 Computing with constraints

1 Computing with constraints Notes for 2017-04-26 1 Computing with constraints Recall that our basic problem is minimize φ(x) s.t. x Ω where the feasible set Ω is defined by equality and inequality conditions Ω = {x R n : c i (x)

More information

1 Review Session. 1.1 Lecture 2

1 Review Session. 1.1 Lecture 2 1 Review Session Note: The following lists give an overview of the material that was covered in the lectures and sections. Your TF will go through these lists. If anything is unclear or you have questions

More information

OPRE 6201 : 3. Special Cases

OPRE 6201 : 3. Special Cases OPRE 6201 : 3. Special Cases 1 Initialization: The Big-M Formulation Consider the linear program: Minimize 4x 1 +x 2 3x 1 +x 2 = 3 (1) 4x 1 +3x 2 6 (2) x 1 +2x 2 3 (3) x 1, x 2 0. Notice that there are

More information

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

Key words. Pattern search, linearly constrained optimization, derivative-free optimization, degeneracy, redundancy, constraint classification

Key words. Pattern search, linearly constrained optimization, derivative-free optimization, degeneracy, redundancy, constraint classification PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION IN THE PRESENCE OF DEGENERACY OLGA A. BREZHNEVA AND J. E. DENNIS JR. Abstract. This paper deals with generalized pattern search (GPS) algorithms

More information

Lecture V. Numerical Optimization

Lecture V. Numerical Optimization Lecture V Numerical Optimization Gianluca Violante New York University Quantitative Macroeconomics G. Violante, Numerical Optimization p. 1 /19 Isomorphism I We describe minimization problems: to maximize

More information

Numerical Optimization Professor Horst Cerjak, Horst Bischof, Thomas Pock Mat Vis-Gra SS09

Numerical Optimization Professor Horst Cerjak, Horst Bischof, Thomas Pock Mat Vis-Gra SS09 Numerical Optimization 1 Working Horse in Computer Vision Variational Methods Shape Analysis Machine Learning Markov Random Fields Geometry Common denominator: optimization problems 2 Overview of Methods

More information

In Chapters 3 and 4 we introduced linear programming

In Chapters 3 and 4 we introduced linear programming SUPPLEMENT The Simplex Method CD3 In Chapters 3 and 4 we introduced linear programming and showed how models with two variables can be solved graphically. We relied on computer programs (WINQSB, Excel,

More information

Scientific Computing: Optimization

Scientific Computing: Optimization Scientific Computing: Optimization Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 March 8th, 2011 A. Donev (Courant Institute) Lecture

More information

Spring 2017 CO 250 Course Notes TABLE OF CONTENTS. richardwu.ca. CO 250 Course Notes. Introduction to Optimization

Spring 2017 CO 250 Course Notes TABLE OF CONTENTS. richardwu.ca. CO 250 Course Notes. Introduction to Optimization Spring 2017 CO 250 Course Notes TABLE OF CONTENTS richardwu.ca CO 250 Course Notes Introduction to Optimization Kanstantsin Pashkovich Spring 2017 University of Waterloo Last Revision: March 4, 2018 Table

More information

CONSTRAINED NONLINEAR PROGRAMMING

CONSTRAINED NONLINEAR PROGRAMMING 149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach

More information

Nonlinear Programming and the Kuhn-Tucker Conditions

Nonlinear Programming and the Kuhn-Tucker Conditions Nonlinear Programming and the Kuhn-Tucker Conditions The Kuhn-Tucker (KT) conditions are first-order conditions for constrained optimization problems, a generalization of the first-order conditions we

More information

UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009

UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009 UC Berkeley Department of Electrical Engineering and Computer Science EECS 227A Nonlinear and Convex Optimization Solutions 5 Fall 2009 Reading: Boyd and Vandenberghe, Chapter 5 Solution 5.1 Note that

More information

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Robert M. Freund March, 2004 2004 Massachusetts Institute of Technology. The Problem The logarithmic barrier approach

More information

Uniqueness of the Solutions of Some Completion Problems

Uniqueness of the Solutions of Some Completion Problems Uniqueness of the Solutions of Some Completion Problems Chi-Kwong Li and Tom Milligan Abstract We determine the conditions for uniqueness of the solutions of several completion problems including the positive

More information

which arises when we compute the orthogonal projection of a vector y in a subspace with an orthogonal basis. Hence assume that P y = A ij = x j, x i

which arises when we compute the orthogonal projection of a vector y in a subspace with an orthogonal basis. Hence assume that P y = A ij = x j, x i MODULE 6 Topics: Gram-Schmidt orthogonalization process We begin by observing that if the vectors {x j } N are mutually orthogonal in an inner product space V then they are necessarily linearly independent.

More information

Computational Finance

Computational Finance Department of Mathematics at University of California, San Diego Computational Finance Optimization Techniques [Lecture 2] Michael Holst January 9, 2017 Contents 1 Optimization Techniques 3 1.1 Examples

More information

Chapter 5 Linear Programming (LP)

Chapter 5 Linear Programming (LP) Chapter 5 Linear Programming (LP) General constrained optimization problem: minimize f(x) subject to x R n is called the constraint set or feasible set. any point x is called a feasible point We consider

More information

Semidefinite Programming

Semidefinite Programming Semidefinite Programming Notes by Bernd Sturmfels for the lecture on June 26, 208, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra The transition from linear algebra to nonlinear algebra has

More information

Conditions for Robust Principal Component Analysis

Conditions for Robust Principal Component Analysis Rose-Hulman Undergraduate Mathematics Journal Volume 12 Issue 2 Article 9 Conditions for Robust Principal Component Analysis Michael Hornstein Stanford University, mdhornstein@gmail.com Follow this and

More information

Direct search based on probabilistic feasible descent for bound and linearly constrained problems

Direct search based on probabilistic feasible descent for bound and linearly constrained problems Direct search based on probabilistic feasible descent for bound and linearly constrained problems S. Gratton C. W. Royer L. N. Vicente Z. Zhang March 8, 2018 Abstract Direct search is a methodology for

More information

THE NEWTON BRACKETING METHOD FOR THE MINIMIZATION OF CONVEX FUNCTIONS SUBJECT TO AFFINE CONSTRAINTS

THE NEWTON BRACKETING METHOD FOR THE MINIMIZATION OF CONVEX FUNCTIONS SUBJECT TO AFFINE CONSTRAINTS THE NEWTON BRACKETING METHOD FOR THE MINIMIZATION OF CONVEX FUNCTIONS SUBJECT TO AFFINE CONSTRAINTS ADI BEN-ISRAEL AND YURI LEVIN Abstract. The Newton Bracketing method [9] for the minimization of convex

More information

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods AM 205: lecture 19 Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods Optimality Conditions: Equality Constrained Case As another example of equality

More information

Design and Analysis of Algorithms Lecture Notes on Convex Optimization CS 6820, Fall Nov 2 Dec 2016

Design and Analysis of Algorithms Lecture Notes on Convex Optimization CS 6820, Fall Nov 2 Dec 2016 Design and Analysis of Algorithms Lecture Notes on Convex Optimization CS 6820, Fall 206 2 Nov 2 Dec 206 Let D be a convex subset of R n. A function f : D R is convex if it satisfies f(tx + ( t)y) tf(x)

More information

Vector Spaces. Addition : R n R n R n Scalar multiplication : R R n R n.

Vector Spaces. Addition : R n R n R n Scalar multiplication : R R n R n. Vector Spaces Definition: The usual addition and scalar multiplication of n-tuples x = (x 1,..., x n ) R n (also called vectors) are the addition and scalar multiplication operations defined component-wise:

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

1. Introduction. We analyze a trust region version of Newton s method for the optimization problem

1. Introduction. We analyze a trust region version of Newton s method for the optimization problem SIAM J. OPTIM. Vol. 9, No. 4, pp. 1100 1127 c 1999 Society for Industrial and Applied Mathematics NEWTON S METHOD FOR LARGE BOUND-CONSTRAINED OPTIMIZATION PROBLEMS CHIH-JEN LIN AND JORGE J. MORÉ To John

More information

Algorithms for constrained local optimization

Algorithms for constrained local optimization Algorithms for constrained local optimization Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Algorithms for constrained local optimization p. Feasible direction methods Algorithms for constrained

More information

CS6375: Machine Learning Gautam Kunapuli. Support Vector Machines

CS6375: Machine Learning Gautam Kunapuli. Support Vector Machines Gautam Kunapuli Example: Text Categorization Example: Develop a model to classify news stories into various categories based on their content. sports politics Use the bag-of-words representation for this

More information

Duality Theory of Constrained Optimization

Duality Theory of Constrained Optimization Duality Theory of Constrained Optimization Robert M. Freund April, 2014 c 2014 Massachusetts Institute of Technology. All rights reserved. 1 2 1 The Practical Importance of Duality Duality is pervasive

More information

The Simplex Method: An Example

The Simplex Method: An Example The Simplex Method: An Example Our first step is to introduce one more new variable, which we denote by z. The variable z is define to be equal to 4x 1 +3x 2. Doing this will allow us to have a unified

More information

Generating p-extremal graphs

Generating p-extremal graphs Generating p-extremal graphs Derrick Stolee Department of Mathematics Department of Computer Science University of Nebraska Lincoln s-dstolee1@math.unl.edu August 2, 2011 Abstract Let f(n, p be the maximum

More information

A LINEAR PROGRAMMING BASED ANALYSIS OF THE CP-RANK OF COMPLETELY POSITIVE MATRICES

A LINEAR PROGRAMMING BASED ANALYSIS OF THE CP-RANK OF COMPLETELY POSITIVE MATRICES Int J Appl Math Comput Sci, 00, Vol 1, No 1, 5 1 A LINEAR PROGRAMMING BASED ANALYSIS OF HE CP-RANK OF COMPLEELY POSIIVE MARICES YINGBO LI, ANON KUMMER ANDREAS FROMMER Department of Electrical and Information

More information

3E4: Modelling Choice. Introduction to nonlinear programming. Announcements

3E4: Modelling Choice. Introduction to nonlinear programming. Announcements 3E4: Modelling Choice Lecture 7 Introduction to nonlinear programming 1 Announcements Solutions to Lecture 4-6 Homework will be available from http://www.eng.cam.ac.uk/~dr241/3e4 Looking ahead to Lecture

More information

Data Mining. Linear & nonlinear classifiers. Hamid Beigy. Sharif University of Technology. Fall 1396

Data Mining. Linear & nonlinear classifiers. Hamid Beigy. Sharif University of Technology. Fall 1396 Data Mining Linear & nonlinear classifiers Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1396 1 / 31 Table of contents 1 Introduction

More information

A projection algorithm for strictly monotone linear complementarity problems.

A projection algorithm for strictly monotone linear complementarity problems. A projection algorithm for strictly monotone linear complementarity problems. Erik Zawadzki Department of Computer Science epz@cs.cmu.edu Geoffrey J. Gordon Machine Learning Department ggordon@cs.cmu.edu

More information

Linear Algebra and Eigenproblems

Linear Algebra and Eigenproblems Appendix A A Linear Algebra and Eigenproblems A working knowledge of linear algebra is key to understanding many of the issues raised in this work. In particular, many of the discussions of the details

More information

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM H. E. Krogstad, IMF, Spring 2012 Karush-Kuhn-Tucker (KKT) Theorem is the most central theorem in constrained optimization, and since the proof is scattered

More information

An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints

An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints Klaus Schittkowski Department of Computer Science, University of Bayreuth 95440 Bayreuth, Germany e-mail:

More information

Linear Programming Redux

Linear Programming Redux Linear Programming Redux Jim Bremer May 12, 2008 The purpose of these notes is to review the basics of linear programming and the simplex method in a clear, concise, and comprehensive way. The book contains

More information

arxiv: v1 [math.na] 5 May 2011

arxiv: v1 [math.na] 5 May 2011 ITERATIVE METHODS FOR COMPUTING EIGENVALUES AND EIGENVECTORS MAYSUM PANJU arxiv:1105.1185v1 [math.na] 5 May 2011 Abstract. We examine some numerical iterative methods for computing the eigenvalues and

More information

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education MTH 3 Linear Algebra Study Guide Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education June 3, ii Contents Table of Contents iii Matrix Algebra. Real Life

More information

A strongly polynomial algorithm for linear systems having a binary solution

A strongly polynomial algorithm for linear systems having a binary solution A strongly polynomial algorithm for linear systems having a binary solution Sergei Chubanov Institute of Information Systems at the University of Siegen, Germany e-mail: sergei.chubanov@uni-siegen.de 7th

More information

Pattern Classification, and Quadratic Problems

Pattern Classification, and Quadratic Problems Pattern Classification, and Quadratic Problems (Robert M. Freund) March 3, 24 c 24 Massachusetts Institute of Technology. 1 1 Overview Pattern Classification, Linear Classifiers, and Quadratic Optimization

More information

"SYMMETRIC" PRIMAL-DUAL PAIR

SYMMETRIC PRIMAL-DUAL PAIR "SYMMETRIC" PRIMAL-DUAL PAIR PRIMAL Minimize cx DUAL Maximize y T b st Ax b st A T y c T x y Here c 1 n, x n 1, b m 1, A m n, y m 1, WITH THE PRIMAL IN STANDARD FORM... Minimize cx Maximize y T b st Ax

More information

OPTIMISATION 3: NOTES ON THE SIMPLEX ALGORITHM

OPTIMISATION 3: NOTES ON THE SIMPLEX ALGORITHM OPTIMISATION 3: NOTES ON THE SIMPLEX ALGORITHM Abstract These notes give a summary of the essential ideas and results It is not a complete account; see Winston Chapters 4, 5 and 6 The conventions and notation

More information

The value of a problem is not so much coming up with the answer as in the ideas and attempted ideas it forces on the would be solver I.N.

The value of a problem is not so much coming up with the answer as in the ideas and attempted ideas it forces on the would be solver I.N. Math 410 Homework Problems In the following pages you will find all of the homework problems for the semester. Homework should be written out neatly and stapled and turned in at the beginning of class

More information

B553 Lecture 5: Matrix Algebra Review

B553 Lecture 5: Matrix Algebra Review B553 Lecture 5: Matrix Algebra Review Kris Hauser January 19, 2012 We have seen in prior lectures how vectors represent points in R n and gradients of functions. Matrices represent linear transformations

More information

3. THE SIMPLEX ALGORITHM

3. THE SIMPLEX ALGORITHM Optimization. THE SIMPLEX ALGORITHM DPK Easter Term. Introduction We know that, if a linear programming problem has a finite optimal solution, it has an optimal solution at a basic feasible solution (b.f.s.).

More information

Lecture: Algorithms for LP, SOCP and SDP

Lecture: Algorithms for LP, SOCP and SDP 1/53 Lecture: Algorithms for LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html wenzw@pku.edu.cn Acknowledgement:

More information

Optimization Tutorial 1. Basic Gradient Descent

Optimization Tutorial 1. Basic Gradient Descent E0 270 Machine Learning Jan 16, 2015 Optimization Tutorial 1 Basic Gradient Descent Lecture by Harikrishna Narasimhan Note: This tutorial shall assume background in elementary calculus and linear algebra.

More information

Linear Programming Algorithms for Sparse Filter Design

Linear Programming Algorithms for Sparse Filter Design Linear Programming Algorithms for Sparse Filter Design The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher

More information

The Newton Bracketing method for the minimization of convex functions subject to affine constraints

The Newton Bracketing method for the minimization of convex functions subject to affine constraints Discrete Applied Mathematics 156 (2008) 1977 1987 www.elsevier.com/locate/dam The Newton Bracketing method for the minimization of convex functions subject to affine constraints Adi Ben-Israel a, Yuri

More information

Chapter 7. Optimization and Minimum Principles. 7.1 Two Fundamental Examples. Least Squares

Chapter 7. Optimization and Minimum Principles. 7.1 Two Fundamental Examples. Least Squares Chapter 7 Optimization and Minimum Principles 7 Two Fundamental Examples Within the universe of applied mathematics, optimization is often a world of its own There are occasional expeditions to other worlds

More information