arxiv: v2 [math.oc] 6 Oct 2014

Size: px
Start display at page:

Download "arxiv: v2 [math.oc] 6 Oct 2014"

Transcription

1 EFFICIENT FIRST-ORDER METHODS FOR LINEAR PROGRAMMING AND SEMIDEFINITE PROGRAMMING arxiv: v2 [math.oc] 6 Oct 204 JAMES RENEGAR. Introduction The study of first-order methods has largely dominated research in continuous optimization for the last decade, yet still the range of problems for which optimal first-order methods have been developed is surprisingly limited, even though much has been achieved in some areas with high profile, such as compressed sensing. Even if one restricts attention to, say, linear programming, the problems proven to be solvable by first-order methods in O(/ǫ) iterations all possess noticeably strong structure. We present a simple transformation of any linear program or semidefinite program into an equivalent convex optimization problem whose only constraints are linear equations. The objective function is defined on the whole space, making virtually all subgradient methods be immediately applicable. We observe, moreover, that the objective function is naturally smoothed, thereby allowing most first-order methods to be applied. We develop complexity bounds in the unsmoothed case for a particular subgradient method, and in the smoothed case for Nesterov s original optimal first-order method for smooth functions. We achieve the desired bounds on the number of iterations, O(/ǫ 2 ) and O(/ǫ), respectively. However, contrary to most of the literature on first-order methods, we measure error relatively, not absolutely. On the other hand, also unlike most of the literature, we require only the level sets to be bounded, not the entire feasible region to be bounded. Perhaps most surprising is that the transformation from a linear program or a semidefinite program is simple and so is the basic theory, and yet the approach has been overlooked until now, a blind spot. Once the transformation is realized, the remaining effort in establishing complexity bounds is mainly straightforward, by making use of various works of Nesterov. The following section presents the transformation and basic theory. At the end of the section we observe that the transformation and theory extend far beyond semidefinite programming with proofs virtually identical to the ones given. Thereafterweturn toalgorithms, firstforthe unsmoothed case. Thisis whereweactually rely on structure possessed by linear programs and semidefinite programs but not by conic optimization problems in general. A forthcoming paper [5] generalizes the results to all of hyperbolic programming. That paper depends on this one. Date: September 9, 204. Thanks to Yurii Nesterov for interesting conversations, and for encouragement, during a recent visit to Cornell and special thanks for his creative research in the papers without which this one would not exist.

2 2 J. RENEGAR As a linear programming problem 2. Basic Theory min c T x s.t. Ax = b x 0 is a special case duality aside of a semidefinite program in which all off-diagonal entries are constrained to equal 0, in developing the theory we focus on semidefinite programming, as there is no point in doing proofs twice, once for linear programming and again for semidefinite programming. After proving the first theorem, we digress to make certain the reader is clear on how to determine the implications of the paper for the special case of linear programming. (We sometimes digress to consider the special case of linear programming in later sections as well.) For C,A,...,A m S n n (n n symmetric matrices), and b R m, consider the semidefinite program inf C, X s.t. A(X) = b SDP X 0 where, is the trace inner product, where A(X) := ( A,X,..., A m,x ), and where X 0 is shorthand for X S n n (cone of positive semidefinite matrices). + Let opt val be the optimal value of SDP. Assume C is not orthogonalto the nullspace of A, as otherwise all feasible points are optimal. Assume a strictly feasible matrix E is known. Until section 5, assume E = I, the identity matrix. Assuming the identity is feasible makes the ideas and analysis particularly transparent. In section 5, it is shown that the results for E = I are readily converted to results when the known feasible matrix E is a positive-definite matrix other than the identity. Until section 5, however, the assumption E = I stands, but is not made explicit in the formal statement of results. For symmetric matrices X, let λ min (X) denote the minimum eigenvalue of X. It is well known that X λ min (X) is a concave function. Lemma 2.. Assume SDP has bounded optimal value. If X S n n satisfies A(X) = b and C,X < C,I, then λ min (X) <. Proof: If λ min (X), then I +t(x I) is feasible for all t 0. As the function t C,I +t(x I) is strictly decreasing (because C,X < C,I ), this implies SDP has unbounded optimal value, contrary to assumption. For all X S n n for which λ min <, let Z(X) denote the matrix where the line from I in direction X I intersects the boundary of S n n +, that is, Z(X) := I + λ min(x) (X I). We refer to Z(X) as the projection (from I) of X to the boundary of the semidefinite cone. The following result shows that SDP is equivalent to a particular eigenvalue optimization problem for which the only constraints are linear equations. Although the proof is straightforward, the centrality of the result to the development makes the result be a theorem.

3 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 3 Theorem 2.2. Let val be any value satisfying val < C,I. If X solves max λ min (X) s.t. A(X) = b C,X = val, then Z(X ) is optimal for SDP. Conversely, if Z is optimal for SDP, then X := I + (Z I) is optimal for (), and Z = Z(X ). Proof: Fix a value satisfying val < C,I, and consider the affine space that forms the feasible region for (): {X S n n : A(X) = b and C,X = val}. (2) Since val < C,I, it is easily proven from the convexity of S n n + that X Z(X) gives a one-to-one map from the set (2) onto {Z S n n + : A(Z) = b and C,Z < C,I }, (3) where S n n + denotes the boundary of S n n +. For X in the set (2), the objective value of Z(X) is C,Z(X) = C,I + λ min(x) (X I) () = C,I + λ min(x) (val C,I ), (4) a strictly-decreasing function of λ min (X). Since the map X Z(X) is a bijection between the sets (2) and (3), solving SDP is thus equivalent to solving (). SDP has been transformed into an equivalent linearly-constrained maximization problem with concave albeit nonsmooth objective function. Virtually any subgradient method can be applied to this problem, the main cost per iteration being in computing a subgradient and projecting it onto the subspace {V : A(V) = 0 and C,V = 0}. In section 6, it is observed that the objective function has a natural smoothing, allowing almost all first-order methods to be applied, not just subgradient methods. We digress to interpret the implications of the development thus far for the linear programming problem min c T x s.t. Ax = b LP (5) x 0. LP can easily be expressed as a semidefinite program in variable X S n n constrained to have all off-diagonal entries equal to zero, where the diagonal entries correspond to the original variables x,...,x n. In particular, the standing assumption that I is feasible for SDP becomes, in the special case of LP, a standing assumption that (the vector of all ones) is feasible. The eigenvalues of X become the coordinates x,...,x n. The map X λ min (X) becomes x min j x j. Lemma 2. becomes the statement that if x satisfies Ax = b and c T x < c T, then min j x j <. Finally, Theorem 2.2 becomes the result that for any value satisfying val < c T, LP is equivalent to max x min j x j s.t. Ax = b (6) c T x = val,

4 4 J. RENEGAR in that, for example, x is optimal for (6) if and only if the projection z(x ) = + min j x (x ) is optimal for LP. j In this straightforward manner, the reader can realize the implications for LP of all results in the paper. Before leaving the simple setting of linear programming, we make observations pertinent to applying subgradient methods to solving (6), the problem equivalent to LP. The subgradients of x min j x j at x are the convex combinations of the standardbasisvectorse(k) forwhichx k = min j x j. Consequently, theprojectedsubgradients at x are the convex combinations of the vectors P k for which x k = min j x j, where P k is the k th column of the matrix projecting R n onto the nullspace of Ā = [ A c T ], that is P := I Ā T (ĀĀT ) Ā. (7) In particular, if for a subgradient method the current iterate is x, then the chosen projected subgradient can simply be any of the columns P k for which x k = min j x j. Choosing the projected subgradient in this way gives the subgradient method a combinatorial feel. If, additionally, the subgradient method does exact line searches, then the algorithm possesses distinct combinatorial structure. (In this regard it should be noted that the work required for an exact line search is only O(nlogn), dominated by the cost of sorting.) If m n, then P is not computed in its entirety, but instead the matrix M = (ĀĀT ) if formed as a preprocessing step, at cost O(m 2 n). Then, for any iterate x and an index k satisfying x k = min j x j, the projected subgradient P k is computed according to u = M Āk v = ĀT u P k = e(k) v, foracostof O(m 2 +#non zero entries in A+nlogn) periteration,whereo(nlogn) is the cost of finding a smallest coordinate of x. Now we return to the theory, expressed for SDP, but interpretable for LP in the straightforward manner explained above. Assume, henceforth, that SDP has at least one optimal solution. Thus, the equivalent problem () has at least one optimal solution. Let Xval denote any of the optimal solutions for the equivalent problem. Lemma 2.3. λ min (X val) = val opt val Proof: By Theorem2.2, Z(X val ) is optimal for SDP in particular, C,Z(X val ) = opt val. Thus, according to (4), Rearrangement completes the proof. opt val = C,I + λ min(x val ) (val C,I ). We focus on the goal of computing a matrix Z that is feasible for SDP and has objective value which is significantly better than the objective value for I, in the sense that C,Z opt val ǫ, (8)

5 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 5 where ǫ > 0 is user-chosen. Thus, for the problem of main interest, SDP (or the special case, LP), the focus is on relative improvement in the objective value. The following proposition provides a useful characterization of the accuracy needed in approximately solving the SDP equivalent problem () so as to ensure that for the computed matrix X, the projection Z = Z(X) satisfies (8). Proposition 2.4. Let 0 ǫ <, and let val be a value satisfying val < C,I. If X is feasible for the SDP equivalent problem (), then C,Z(X) opt val ǫ (9) if and only if λ min (X val ) λ min(x) ǫ ǫ. (0) Proof: Assume X is feasible for the equivalent problem (). For Y = X,X val, we have the equality (4), that is, Thus, Hence, C,Z(Y) = C,I + λ min(y ) (val C,I ). C,Z(X) opt val = C,Z(X) C,Z(X val ) C,I C,Z(X val ) = λ min(x) λ min(xval ) λ min(xval ) = λ min(x val ) λ min(x) λ min (X) C,Z(X) opt val ǫ λ min (X val ) λ min(x) ǫ( λ min (X)) ( ǫ)(λ min (X val ) λ min(x)) ǫ( λ min (X val )) λ min (Xval ) λ min(x) ǫ ǫ ( λ min(xval )). UsingLemma2.3tosubstitute forthe rightmostoccurrenceofλ min (Xval )completes the proof. It might seem that to make use in complexity analysis of the equivalence of (0) with (9), it would be necessary to assume as input to algorithms a lower bound on opt val. Such is not the case, as is shown in the following sections. In concluding the section, we observe that the basic theory holds far more generally. In particular, let K be a closed, pointed, convex cone in R n, and assume e lies in the interior of K. For x R n, define λ min,e (x) = inf{λ R : x λe / K}. ().

6 6 J. RENEGAR It is easy to show x λ min,e (x) is a closed, concave function with finite value for all x R n. Assume additionally that e is feasible for the conic optimization problem min c, x s.t. Ax = b x K (2) (where, is any fixed inner product). The same proof as for Theorem 2.2 then shows that for any value satisfying val < c,e, (2) is equivalent to the linearlyconstrained optimization problem max λ min,e (x) s.t. Ax = b c,x = val, (3) in the sensethat ifxis optimalfor(3), then z(x) := e+ λ min,e(x) (x e)is optimal for (2), and conversely, if z is optimal for (2), then x = e+ c,e val c,e opt val (z e) is optimal for (3); moreover, z = z(x ). Likewise, Proposition 2.4 carries over with the same proof. Furthermore, analogous results are readily developed for a variety of different forms of conic optimization problems. Consider, for example, a problem min c,x s.t. Ax b K. (4) Nowassumeknownafeasiblepoint e inthe interiorofthe feasibleregion. Then, for any value satisfying val < c,e, the conic optimization problem (4) is equivalent to a problem for which there is only one linear constraint: max λ min,e (Ax b) s.t. c,x = val, (5) where e := Ae b, and where λ min,e is as defined in (). The problems are equivalent in that if x is optimal for (5), then the projection z(x ) of x from e to the boundary of the feasible region is optimal for (4) that is, z(x ) := e + λ min,e(ax b) (x e ) is optimal for (4) and, conversely, if z is optimal for (4), then x = e + c,e val c,e opt val (z e ) is optimal for (5); moreover, z = z(x ). We focus on the concrete setting of semidefinite programming (and linear programming) because the algebraic structure thereby provided is sufficient for designing provably-efficient first-order methods, in both smoothed and unsmoothed settings. We now begin validating the claim. 3. Corollaries for a Subgradient Method Continue to assume SDP has an optimal solution and I is feasible.

7 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 7 Given ǫ > 0 and a value satisfying val < C,I, we wish to approximately solve the SDP equivalent problem max λ min (X) s.t. A(X) = b C,X = val, (6) where by approximately solve we mean that feasible X is computed for which with ǫ satisfying λ min (X val) λ min (X) ǫ ǫ ǫ ǫ. Indeed, according to Proposition 2.4, the projection Z = Z(X) will then satisfy C,Z opt val ǫ. (7) We begin by recalling a well-known complexity result for a subgradient method, interpreted for when the method is applied to solving the SDP equivalent problem (6). From this is deduced a bound on the number of iterations sufficient to obtain X whose projection Z = Z(X) satisfies (7). We observe, however, that in a certain respect, the result is disappointing. In the next section, the framework is embellished by applying the subgradient method not to (6) for only one value val, but to (6) for a small and carefully chosen sequence of values, val = val l. The embellishment results in a computational scheme which possesses the desired improvement. For specifying a subgradient method and stating a bound on its complexity, we follow Nesterov s book [2]: Subgradient Method (0) Inputs: Number of iterations: N Initial iterate: X 0 satisfying A(X 0 ) = b and C,X 0 < C,I. Let val := C,X 0. Distance upper bound: R, a value for which there exists X val satisfying X 0 X val R. Initial best iterate: X = X 0 Initial counter value: k = () Update counter: k + k (2) Iteration: Computeasubgradient λ min (X k )andorthogonallyproject it onto the subspace {V : A(V) = 0 and C,V = 0}. (8) Denoting the projection by G k, compute X k+ := X k + R N Gk G k. (3) If λ min (X k+ ) > λ min (X), then make the replacement X k+ X. (4) Check for termination: If k = N, then output X and terminate. Else, go to Step.

8 8 J. RENEGAR Theorem 3.. For Subgradient Method, the output X satisfies λ min (X val) λ min (X) R/ N, where val := C,X 0 and X 0 is the input matrix. Proof: The function X λ min (X) is Lipschitz continuous with constant : λ min (X) λ min (Y) X Y = ( λ j (X Y) 2) /2. The result is thus a simple corollaryof Theorem in Nesterov s book, by choosing the parameter values there to be h k = R/ N for k = 0,...,N. We briefly digress to the special case of linear programming. Recall for LP the linear program (5) the projected subgradients at x are the convex combinations of the columns P k for which x k = min j x j, where P is the matrix projecting R n orthogonally onto the nullspace of Ā = [ A c T ], that is, P = I ĀT (ĀĀT ) Ā. In particular, if x is the current iterate for Subgradient Method, a projected subgradient can be selected simply by computing any column P k for which x k = min j x j. R Subgradient Method then moves from x to x+ N Pk P k. The geometry is interesting in that each step is being chosen from among only the vectors R N P Pj j for j =,...,n. The geometry is made even more interesting by Theorem 3. asserting that even for the choice of steps coming from this limited set of vectors, still it holds that the final output x satisfies minx val,j min x j R/ N, j j where x val,j denotes the jth coordinate of the optimal solution x for the LP equivalent problem max x min j x j s.t. Ax = b c T x = val. Now we return to the more general setting of semidefinite programming. Below, the input matrix X 0 to Subgradient Method is required to be feasible for SDP, mainly so that the input R can be chosen as a value with clear relevance to SDP, a value we now describe. The level sets for SDP are the sets Level val = {X 0 : A(X) = b and C,X = val}, where val is any fixed value. Of course Level val = if val < opt val. If some nonempty level set is bounded, then all level sets are bounded. On the other hand, if a level set is unbounded, then either SDP has unbounded optimal value or can be made to have unbounded value with an arbitrarily small perturbation of C. Thus, in developing numerical methods for approximating optimal solutions, it is natural to focus on the case that level sets for SDP are bounded j

9 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 9 (equivalently, the dual problem is strictly feasible). Hence, we assume the level sets are bounded. Let diam be a known value satisfying diam max{ X Y : X,Y Level val } for all val < C,I, that is, an upper bound on the diameters of all level sets for better objective values than the value for the level set containing I. Although the assumption of knowing the upper bound diam is strong, it is consistent with assumptions found throughout the literature on first-order methods, such as the requirement for Subgradient Method that the input R be an upper bound on X 0 X val, where X 0 is the input matrix. Moreover, even though the assumption of knowing diam is strong, still there are many interesting situations in which the assumption is valid, particularly when a problem is specifically modeled in such a way as to make the diameter of the level sets (for val < C,I ) be of reasonable magnitude. For example, when I is on (or near) the central path, the choice of upper bound diam = n is valid, albeit for various carefully modeled semidefinite programs in which I is explicitly made to be near the central path, stronger upper bounds hold (e.g., diam = O( n) in numerous interesting cases, some of which are displayed in the forthcoming paper [5]). In most of the literature on optimal first-order methods, the feasible region is required to be bounded, not just the level sets. By focusing on relative error (7) ratherthanabsoluteerror,weareabletorequireonlythatthelevelsetsbebounded, not the feasible region. In the following corollary regarding Subgradient Method, the choice of input N depends on the optimal value for SDP. Naturally the reader will infer that in addition to knowing the upper bound diam, our algorithmic scheme will require knowing a lower bound on opt val, but this is not the case for the scheme. The corollary is used for motivating the next step in specifying the scheme. Corollary 3.2. Assume X 0 is feasible for SDP and satisfies C,X 0 < C,I. Define val := C,X 0. Let 0 < ǫ <. If X 0 and R = diam are inputs to Subgradient Method, along with an integer N satisfying ( ) 2 diam N, (9) ǫ then for the output X, the projection Z(X) satisfies C,Z(X) opt val ǫ. Proof: Since X 0 is feasible, so is Xval (because 0 λ min(x 0 ) λ min (Xval )). Thus, X 0 Xval diam, making R = diam a valid input to Subgradient Method. For inputs as specified, Theorem 3. immediately implies the output X for Subgradient Method satisfies λ min (X val) λ min (X) ǫ Invoking Proposition 2.4 completes the proof..

10 0 J. RENEGAR The dependence ofthe iterationlowerbound (9) onǫ 2 is unfortunate but probably unavoidablewithout smoothing the objective function X λ min (X), as is done in sections 6 and 7. Likewise, a significant dependence on diam or some other meaningful quantity capturing the distance X 0 Xval probably is unavoidable. However, the dependence on is disconcerting. To understand why the dependence is disconcerting, consider that the most natural choice for the input matrix is X 0 = I λ max(π(c)) π(c), where π(c) is the orthogonal projection of C onto the subspace {V : A(V) = 0}. This is the choice for X 0 obtained by moving from I in direction π(c) until the boundary of the semidefinite cone is reached. However, even when I is on the central path (in which case the direction π(c) is tangent to the central path), it can happen that the value is of magnitude n for val = C,X 0 and X 0 = I λ max(π(c)) π(c). Thus, even for problems modeled carefully so that I is on the central path and diam is of limited size, the iteration lower bound (9) can grow significantly with n regardless of the value for ǫ. This is disconcerting. Moreover, we want an algorithm for which opt val does not explicitly figure into choosing the inputs. We already assume the upper bound diam is known. We want to avoid also assuming a lower bound on opt val is known. These matters are handled in the following section. 4. The NonSmoothed Scheme The observations concluding the preceding section raise a question: Is it possible to efficiently move from an initial feasible matrix U 0 satisfying C,U 0 < C,I, to a feasible matrix Y for which val = C,Y satisfies, say, 3? We begin this section by providing an affirmative answer, but first let us again display the pertinent optimization problem: max λ min (X) s.t. A(X) = b C,X = val. Recall that X val denotes any optimal solution of (20), an optimization problem which is equivalent to SDP (assuming val < C,I ). (20) Consider the following computational procedure: NonSmoothed SubScheme (0) Initiation: Input: AmatrixU 0 thatisfeasibleforsdpandsatisfies C,U 0 < C,I. Let val 0 = C,U 0 Let l =. () Outer Iteration Counter Step: l+ l (2) Inner Iterations: Apply Subgradient Method with inputs X 0 = U l, R = diam and N = 9diam 2. Rename the output X as V l.

11 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP (3) Check for Termination: If λ min (V l ) /3, then output Y = U l and terminate. Else, compute the projection and go to Step. U l+ := Z(V l ), let val l+ := C,U l+, Proposition 4.. NonSmoothed SubScheme outputs Y that is feasible for SDP and satisfies 3, where val := C,Y. The total number of outer iterations does not exceed log 3/2 ( 0 where val 0 = C,U 0 and U 0 is the input matrix. ), Proof: It is easily verified that all of the matrices U l and V l computed by Non- Smooth SubScheme satisfy the SDP equations A(X) = b. Moreover, U l is clearly feasible for SDP, lying in the boundary of the feasible region. Fix l to be any value attained by the counter. We now examine the effects of Steps 2 and 3. Corollary 3.2 with N = 9diam 2 shows that in Step 2, the output X from Subgradient Method satisfies that is, Observe λ min (X val l ) λ min (X) /3, λ min (X val l ) λ min (V l ) /3. (2) l = λ min(x val l ) (by Lemma 2.3) 2 3 λ min(v l ) (by (2)) Hence, if the method terminates in Step 3 that is, if λ min (V l ) /3 then the output matrix Y = U l satisfies 3, (22) where val := C,Y = val l. We have now verified that if NonSmoothed SubScheme terminates, then the output Y is indeed feasible for SDP and satisfies the desired inequality (22). On the other hand, if the method does not terminate in Step 3, it computes the matrix U l+ and its objective value, val l+. Here, observe val l+ = C,I + λ min(v l ) (val l C,I ) C,I 3 2 ( l), because val l < C,I and λ min (V l ) /3 (due to no termination). Hence, l+ 3 2 l.

12 2 J. RENEGAR Since all values val l computed by the algorithm satisfy val l opt val (as U l is feasible for SDP), it immediately follows that ( ) log 3/2 0 is an upper bound on the number of outer iterations. Specifying our overall computational scheme relying on the subgradient method, and analyzing the scheme s complexity, both are now easily accomplished: NonSmoothed Scheme (0) Inputs: A value 0 < ǫ <, and a matrix U 0 which both is feasible for SDP and satisfies C,U 0 < C,I. For example, the matrix U 0 = I λ π(c). max(π(c)) () Apply NonSmoothed SubScheme with input U 0. Let Y denote the output. (2) Apply Subgradient Method with inputs X 0 = Y, R = diam and N = (3diam/ǫ) 2. Let X denote the output. (3) Compute and output the projection Z = Z(X), then terminate. In stating the following theorem, we make explicit that I is being assumed as feasible. The generalization to assuming known a strictly feasible matrix, but not necessarily the identity, is presented in section 5. Theorem 4.2. Assume I is feasible for SDP. NonSmoothed Scheme outputs Z which is feasible for SDP and satisfies C,Z opt val ǫ. The total number of iterations of Subgradient Method is bounded above by ( 9 diam 2 + ) ( ( )) ǫ 2 + log 3/2, 0 where val 0 := C,U 0 and U 0 is the input matrix. Proof: Proposition 4. shows the output matrix Y from Step is feasible for SDP and satisfies 3, whereval = C,Y. Thus, by Corollary3.2, when X 0 = Y is input into Subgradient Method, along with R = diam and N = (3diam/ǫ) 2, the projection Z(X) of the output X satisfies C,Z(X) opt val ǫ, establishing correctness of NonSmoothed Scheme. The bound for total iterations of Subgradient Method is immediate from the outer iteration bound of Proposition 4., and the choices for the number of iterations in Step 2 of NonSmoothed SubScheme and in Step 2 of NonSmoothed Scheme.

13 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 3 The following corollary is useful when an optimization problem is modeled so as to make I be on (or near) the central path. The proof follows standard lines in interior-point method theory, but nonetheless we include the proof for completeness. Corollary 4.3. If I is on the central path and the input matrix is chosen as U 0 = I λ max(π(c)) π(c), then the same conclusions as in Theorem 4.2 apply but now with the number of Subgradient Method iterations bounded above by ( 9 diam 2 + ) ( ) ǫ 2 + log 3/2 (n). Proof: Assume I is on the central path, that is, assume for some µ > 0 that I is the optimal solution for min C,X µ ln(det(x)) s.t. A(X) = b. Since the gradient of the objective function at X 0 is C X, a first-order optimality condition satisfied by I is that there exists a vector y for which C I = A y, where A y = i y ia i is the adjoint of A. This implies that the projection of C and µi onto the nullspace of A are identical. Consequently, C,X C,I and A(X) = b tr(x) n and A(X) = b. Hence, all X which are both feasible for SDP and have better objective value lie within the set {X 0 : tr(x) n}, a set which is contained within the ball of radius n centered at I. Thus, all feasible X for SDP satisfy that is, C,I C,X = π(c),i X π(c) n, n π(c). (23) On the other hand, the feasible matrix U 0 = I λ max(π(c)) π(c) lies distance at least from I, because the unit ball centered at I is contained in S n n and because + U 0 lies in the boundary of S n n +. Hence, Consequently, Combining (23) and (24) gives I U 0 = α π(c) π(c) for some α. C,I C,U 0 = π(c),i U 0 = α π(c) π(c). (24) C,I C,U 0 n. Substitution into Theorem 4.2 completes the proof.

14 4 J. RENEGAR It is interesting to observe that for any fixed value of ǫ, if one is able to model a family of optimization problems as semidefinite programs SDP(n) parameterized by n (the number of variables) in such a way that for every n, both I n is on (or near) the central path and for some p < /4, diam(n) = O(n p ), then the iteration bound provided by the corollary is better than the best iteration bound established for interior-point methods, i.e., O( n) iterations when ǫ is fixed. As each iteration of Subgradient Method is cheap relative to the cost of an interior-point method iteration, in this case NonSmoothed Scheme wins hands down. On the other hand, of course, if n is held fixed and ǫ goes to zero, the bound O(log(/ǫ)) on the number of iterations for interior-point methods is massively better than the bound O(/ǫ 2 ) for NonSmoothed Scheme. 5. Starting Points E I In this section it is observed that the theory and algorithms from previous sections are readily converted to the case that the starting point is a strictly-feasible matrix E I. As in the remarks closing section 2, the relevant concave function is λ min,e (X) := inf{λ R : X λe / S n n + }, (25) (the smallest eigenvalue of the matrix E /2 XE /2, where E /2 is the positive definite matrix satisfying E = E /2 E /2 ). As those remarks noted, the theory of that section is easily generalized, which for the present situation means replacing all occurrencesofλ min appearinginsection2byλ min,e, whilesimultaneouslyreplacing I by E, assuming E is strictly feasible. The algorithms and theory from sections 3 and 4, however, are not so obviously extended. At issue is that unlike X λ min (X), the function X λ min,e (X) need not be Lipschitz continuous with constant, a fact that was critical in the proof of Theorem 3.. Thankfully, the issue is easily handled by changing from the trace inner product to the inner product on S n n used in interior-point method theory: + U,V E := tr(e UE V) = tr ( (E /2 UE /2 )(E /2 UE /2 ) ). Thus, for example, when SubGradient Method computes a subgradient for iterate X k, the subgradient should be with respect to, E rather than with respect to,. Likewise, when SubGradient Method projects a subgradient onto the subspace (8), the projection should be orthogonal with respect to, E. Finally, the value diam should be replaced by a value diam E satisfying diam E max{ X Y E : X,Y Level val } for all val < C,E. With these changes, all of the results of previous sections are valid with E in place of I. For linear programming, the resulting changes to Subgradient Method are quickly described. Letting e denote the known strictly-feasible point (not necessarily the vector of all ones), and letting val be any value satisfying val < c T e, the problem

15 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 5 equivalent to LP is max x min j x j /e j s.t. Ax = b c T x = val. In applying Subgradient Method, the relevant inner product is u,v e = j u j v j e 2 j. With respect to this inner product, the subgradients at x R n are the convex combinations of the vectors e (k) for which x k /e k = min j x j /e j, where e (k) has all coordinates equal to zero except for the k th coordinate, which is equal to e k. The, e -orthogonal projections of the vectors e (j) (j =,...,n) onto the nullspace of [ ] A Ā = c T are the columns of the matrix P e that, e -orthogonally projects R n onto the nullspace, that is, P e = I (e) 2 Ā T (Ā (e)2 Ā T ) Ā, where (e) is the diagonal matrix with j th diagonal entry equal to e j. Thus, the projected subgradients relied upon by Subgradient Method are now the convex combination of the columns of P e. Perhaps the easiest way to understand why all of the results of previous sections remain valid when I is replaced by E and, is replaced by, E is to use the standard trick in the interior-point method literature of scaling SDP to an equivalent semidefinite program for which I is feasible. The equivalent semidefinite program (in variable Y) is min Y E /2 CE /2,Y A(E /2 YE /2 ) = b Y 0 scaled SDP TheequivalenceisseenbynotingX isfeasibleforsdpifandonlyify = E /2 XE /2 is feasible for scaled SDP, and the objective value of X for SDP is the same as the objective value of Y for scaled SDP. Moreover, the inner product, E is transformed into the trace inner product, in that X,X 2 E = Y,Y 2 for Y j = E /2 X j E /2 (j =,2). Thus, for example, the, E -diameter of level set Level val (SDP) is the same as the, -diameter of level set Level val (scaled SDP). Lastly, as is straightforward but tedious to verify, each of the algorithms transforms as well. For example, consider Subgradient Method, and fix two of the inputs, R and N. Assume the algorithm is applied with, E to solving the linearly-constrained problem equivalent to SDP: max λ min,e (X) s.t. A(X) = b C,X = val.

16 6 J. RENEGAR Then X 0,X,...,X N is a possible resulting sequence if and only if Y 0,Y,...,Y N where Y j := E /2 X j E /2 is a possible resulting sequence when Subgradient Method is applied using the trace inner product to the problem equivalent to scaled SDP: max λ min (Y) s.t. A(E /2 YE /2 ) = b E /2 CE /2,Y = val. Applying any of the algorithms with strictly-feasible E and with inner product, E is, in other words, equivalent to scaling SDP, applying the algorithm with I and the trace inner product, and then unscaling the answer. We choose to assume I is feasible only to reduce notational clutter and make evident the simplicity of the main ideas. (Unfortunately, the trick of scaling does not generalize to hyperbolic programming, making the proofs in the forthcoming paper [5] necessarily more abstract than the ones here.) In closing the section, we remark that the results throughout the paper can be developed just as readily for semidefinite programs of, say, the form min x s.t. c T x m i= x ia i B. One assumes knowna strictly feasible point e, and relies upon the concavefunction X λ min,e (X) specified in (25), letting E := m i= e i A i B. The equivalent problem solvable by a subgradient method is max λ min,e ( m i= x ia i B) s.t. c T x = val, for any value satisfying val < c T e. The relevant inner product at e is m u,v e := u i A i, i= m v i A i E. i= For the special case of a linear program min c T x s.t. Ax b, letting α T i denote the i th row of A and letting e := Ae b, the equivalent problem for val < c T e is The relevant inner product is max min i e i (α T i x b i) s.t. c T x = val. u,v e = u T A T (e) 2 Av, where (e) is the diagonal matrix with i th diagonal entry equal to e i.

17 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 7 6. Corollaries for Nesterov s First First-Order Method Assume SDP has an optimal solution and I is feasible. (In exactly the same manner as previous results generalize to an arbitrary strictly-feasible initial matrix E, so do all of the remaining results.) Recall that given ǫ > 0 and a value val satisfying val < C,I, we wish to approximately solve the SDP equivalent problem max λ min (X) s.t. A(X) = b C,X = val, where by approximately solve we mean that feasible X is computed for which with ǫ satisfying (26) λ min (X val ) λ min(x) ǫ (27) ǫ ǫ ǫ. Indeed, according to Proposition 2.4, the projection Z = Z(X) will then satisfy C,Z opt val ǫ. With [3], Nesterov initiated a huge wave of research, by displaying that some significant non-smooth optimization problems can be efficiently solved by smoothing the problem and then applying optimal first-order methods for smooth functions. In [4], he extended the approach to include some problems within the domain of semidefinite programming. Here he gave emphasis to the nonsmooth convex objective function X λ max (X), but the results trivially adapt to the concave function of interest to us, X λ min (X). For the nonsmooth concave function X λ min (X), the useful smoothing is f µ (X) := µln j e λj(x)/µ, where µ > 0 is user-chosen, and where λ (X),...,λ n (X) are the eigenvalues of X. For motivation, observe that for all X S n n, Not obvious, but which Nesterov proved, λ min (X) µ lnn f µ (X) λ min (X). (28) f µ (X) f µ (Y) µ X Y, where is the Frobenius norm. That is, the gradientoff µ is Lipschitzcontinuous, with constant /µ. The smoothed version of (26) is max f µ (X) s.t. A(X) = b C,X = val. Let Xval (µ) denote an optimal solution (which the reader should be careful to distinguish from Xval, an optimal solution for (26)). (29)

18 8 J. RENEGAR In passing, we note that the gradient of f µ at X is the the matrix ] f µ (X) = j e λ j (X)/µ Q [ λ(x) ] [ e λ (X)/µ... e λn(x)/µ Q T, where X = Q... Q T is an eigendecomposition of X. λn(x) For the special case of linear programming, the function f µ becomes f µ (x) = µln j e xj/µ, for which the gradient at x is the vector with j th coordinate equal to f µ (x) j = e xj/µ k e x k/µ. It is readily seen from (28) that for any value ǫ > 0 and for all X S n n, if µ = 2 ǫ /ln(n), then f µ (X val (µ)) f µ(x) 2 ǫ λ min (X val ) λ min(x) ǫ. Consequently, in order to compute X which is feasible for (26) and satisfies (27), it suffices to fix µ = 2 ǫ /ln(n) and compute X which is feasible for (29) and has objective value within ǫ /2 of f µ (X val (µ)). Since the objective function in (29) is smooth and the only constraints are linear equations, we can apply many first-order methods. It is only fitting that we rely on the original optimal first-order method for smooth functions, due to Nesterov [], and which we refer to as Nesterov s first first-order method, or for brevity, Nesterov s first method. Letting X 0 denote an initial feasible point, then according to Theorem in [2], Nesterov s first method produces a sequence of iterates X 0,X,... satisfying f µ (X val (µ) f µ(x k ) 4 µ(k+2) 2 X 0 X val (µ)) 2 (where we have used the fact that the Lipschitz constant for the gradient is /µ). Thus, f µ (X k ) is within ǫ /2 of the optimal value if 8 k µǫ X 0 Xval (µ)) 2, that is, if k 4 lnn X 0 Xval (µ)) 2 (using µ = 2 ǫ /ln(n)). We summarize these results in a theorem. ǫ Theorem 6.. (Nesterov) Let ǫ be any positive value, and µ = 2 ǫ /ln(n). Assume X satisfies A(X) = b and C,X = val < C,I. If Nesterov s first first-order method is applied to solving the smoothed problem (29), then the resulting iterates X 0 = X,X,X 2,... satisfy k 4 lnn X X val (µ)) ǫ 2 λ min (X val) λ min (X k ) ǫ.

19 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 9 Recall that diam is assumed to be a known quantity satisfying diam max{ X Y : X,Y Level val } for all val < C,I. Corollary 6.2. Assume X is strictly feasible for SDP and C,X = val < C,I. For any value 0 < ǫ 2λ min (X), by letting µ := 2 ǫ /ln(n), if Nesterov s first first-order method is applied to solving the smoothed problem (29), then the resulting iterates X 0 = X,X,X 2,... satisfy k 4 lnn diam ǫ 2 λ min (X val) λ min (X k ) ǫ. Proof: Assume X, ǫ and µ are as in the statement. Then f µ (X) 0, by (28) and µ λ min /ln(n). Hence, f µ (X val (µ)) 0 (because f µ(x val (µ)) f µ(x)). Also by (28), Y / S n n + f µ (Y) < 0. Thus, Xval (µ) is feasible for SDP. Hence, X Xval (µ) diam. Substitution into Theorem 6. completes the proof. Corollary 6.3. Assume X is feasible for SDP and satisfies λ min (X) 6 and 3, (30) where val := C,X. Let 0 < ǫ <. For µ = 6ǫ/ln(n), if Nesterov s first first-order method is applied to solving the smoothed problem (29), then the resulting iterates X 0 = X,X,X 2,... satisfy k 2 lnn diam ǫ 2 C,Z(X k) opt val ǫ. Proof: Let ǫ := ǫ/3 /3. Then, by assumption, ǫ 2λ min (X), and hence Corollary 6.2 can be applied with giving k 2 lnn diam ǫ µ = 2 ǫ /ln(n) = 6 ǫ/ln(n), 2 λ min (X val ) λ min(x) ǫ 3 λ min (X val ) λ min(x) ǫ ǫ, where the last implication is due to the rightmost inequality assumed in (30). Invoking Proposition 2.4 completes the proof. 7. The Smoothed Scheme The presentation of the smoothed scheme is done in the same manner as the presentation of NonSmoothed Scheme in section 4, but now beginning with the following question: Is it possible to efficiently move from an initial matrix U 0 satisfying A(U 0 ) = b and C,U 0 < C,I, to a matrix Y satisfying the conditions of Corollary 6.3?

20 20 J. RENEGAR Before providing an affirmative answer, for ease of reference we again display the pertinent optimization problems: max λ min (X) s.t. A(X) = b C,X = val, (3) max f µ (X) s.t. A(X) = b C,X = val. Recall that Xval denotes any optimal solution for (3) a problem equivalent to SDP (assuming val < C,I ) whereas Xval (µ) denotes an optimal solution for (32) the smoothed version of (3). Consider the following computational procedure: (32) Smoothed SubScheme (0) Initiation: Input: A matrix U 0 that is feasible for SDP and satisfies both C,U 0 < C,I and λ min (U 0 ) = /6. If a matrix U is available satisfying only A(U) = b and λ min(u) C,U < C,I, then U 0 = I (U I) is acceptable input. Let val 0 = C,U 0 Let µ := /(6 lnn) and N := 2 lnn diam 2. Let l =. () Outer Iteration Counter Step: Let l+ l (2) Inner Iterations: Beginningat U l, apply N iterations ofnesterov sfirst first-order method to the smoothed problem (32), with val := val l. Let V l denote the final iterate. (3) Check for Termination: If λ min (V l ) /3, then output Y = U l and terminate. Else, compute U l+ := I and go to Step. λ min(v l ) (V l I), val l+ := C,U l+, Proposition 7.. Smoothed SubScheme terminates with a matrix Y which is feasible for SDP and satisfies λ min (Y) = 6, 3, (33) where val := C,Y. Moreover, the number of outer iterations does not exceed ( ) log 5/4 0 where val 0 = C,U 0 and U 0 is the input matrix.

21 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 2 Proof: The proof especially the last half parallels that of Proposition 4.. Nonetheless, we include the proof in its entirety. It is easily verified that all of the matrices U l and V l computed by Smoothed SubScheme satisfy the SDP equations A(X) = b. Moreover, U l is feasible for SDP, because λ min (U l ) = /6 (by construction). Fix l 0 to be a value attained by the counter. We now examine the effects of Steps 2 and 3. Applying Corollary 6.2 with ǫ = /3 shows that in Step 2, the final iterate X N computed by Nesterov s first method satisfies λ min (X val l ) λ min (X N ) /3, that is, λ min (Xval l ) λ min (V l ) /3. (34) Observe that l = λ min(xval l ) (by Lemma 2.3) 2 3 λ min(v l ) (by (34)). Hence, if the method terminates in Step 3 that is, if λ min (V l ) /3 then the output matrix Y = U l satisfies 3, where val := C,Y = val l. We have now verified that if the Smoothed SubScheme terminates, then the output Y is indeed feasible for SDP and satisfies the inequalities (33). On the other hand, if the method does not terminate in Step 3, it computes the matrix U l+ and its objective value, val l+. Here, observe val l+ = C,I λ min(v l ) (val l C,I ) C,I 5 4 ( l), because val l < C,I and λ min (V l ) /3 (due to no termination). Hence, l+ 5 4 l. Since all values val l computed by the algorithm satisfy val l opt val (because U l is feasible for SDP), it immediately follows that ( ) log 5/4 0 is an upper bound on the number of outer iterations. Specifying our scheme based on Nesterov s first method, and analyzing the scheme s complexity, both are now easily accomplished:

22 22 J. RENEGAR Smoothed Scheme (0) Input: A value 0 < ǫ <, and U 0 S n n satisfying A(U 0 ) = b and λ min (U 0 ) = /6. λ max(π(c)) π(c). For example, the matrix U 0 = I 5 6 () Apply Smoothed SubScheme with input U 0. Let Y denote the output. 2 (2) Beginning at Y, apply lnn diam ǫ 2 iterations of Nesterov s first first-order method to the smoothed problem (32), with val := C,Y. Let X denote the output. (3) Compute and output the projection Z = Z(X), then terminate. Theorem 7.2. Assume I is feasible. Smoothed Scheme outputs Z which is feasible for SDP and satisfies C,Z opt val ǫ. (35) The total number of iterations of Nesterov s first first-order method is bounded above by 2 ( ( )) lnn diam ǫ + log 5/4, 0 where val 0 := C,U 0 and U 0 is the input matrix. Proof: Proposition 7. shows the matrix Y in Step satisfies the conditions of Corollary 6.3, which in turn shows the final output Z = Z(X) from Step 3 is feasible for SDP and satisfies (35). The bound on the total number of iterations of Nesterov s first method is a straightforward consequence of the outer iteration bound from Proposition 7., the choice for N in Smoothed SubScheme, and the number of iterates in Step 2 of Smoothed Scheme. The proof of the following corollary proceeds exactly as does the proof of Corollary 4.3. The added value is due to λ min (U 0 ) = /6 in Smoothed Scheme unlike λ min (U 0 ) = 0 in NonSmoothed Scheme and log 5/4 ( λ min(u 0) ) = log 5/4(6/5) <. Corollary 7.3. If I is on the central path and U 0 = I 5 6 λ max(π(c)) π(c), then the same conclusions as in Theorem 7.2 apply but now with the number of iterations of Nesterov s first first-order method bounded above by 2 ( ) lnn diam ǫ + log 5/4 (n) Closing Remarks Similar to the observation immediately following Corollary 4.3, we see from Corollary 7.3 that for fixed ǫ, if one models a family of problems as semidefinite programs SDP(n) where I n is on (or near) the central path and for which there exists p < /2 satisfying diam(n) = O(n p ), then Smoothed Scheme wins hands down over interior-point methods even on iteration count, let alone on total cost. Interior-point methods, of course, win hands down if n is fixed and ǫ goes to zero.

23 EFFICIENT FIRST-ORDER METHODS FOR LP AND SDP 23 Forfixedn, ouriterationboundofo(/ǫ)ismuchworsethantheboundo(/ ǫ) found in literature related to compressed sensing, where problems can be reduced to ones involving only the feasible regions {x 0 : j x j = } or {X 0 : tr(x) = } for which tractable prox functions are known. However, the approaches used there fail upon including additional constraints, such as requiring X to satisfy a specific sparsity pattern, as is relevant in statistics for fitting a concentration matrix (the inverse of a covariance matrix) to empirical data, and as is relevant in some applications of semidefinite programming to combinatorial problems pertaining to graphs. Among the obstacles is that tractable proximal operators remain unknown except for an extremely small universe of sets. In this vein, we note that for the algorithms herein, imposing a specific sparsity pattern actually reduces work, assuming the known feasible matrix is E = I. Indeed, assuming the sparsity pattern is X ij = 0 for (i,j) Zeros, and using A k,x = b i (k =,...,l) to denote the remaining constraints, projecting a subgradient G is accomplished by overwriting by 0 the entries G i,j for (i,j) Zeros, and orthogonally projecting the resulting matrix onto the subspace {V : C,V = 0 and Āk,V = 0 for k =,...l}, where C (resp.,āk)isthematrixobtainedbyoverwritingby0the(i,j) th coordinate ofc (resp., A k ), for(i,j) Zeros. If l the number of complicating constraints is small and the set Zeros is large, this approach results in significant computational savings per iteration. Moreover, the resulting iteration cost is very cheap relative to the cost of an iteration of an interior-point method, where sparsity constraints must be handled like any other constraints, due to the inner product, Xk being dependent on the iterate, unlike first-order methods where the inner product, =, I is held fixed throughout, an inner product for which sparsity constraints are ideally structured. Additional examples of the relevance of the algorithms and results, and their extensions to hyperbolic programming, are given in the forthcoming paper [5]. References [] Yurii Nesterov. A method of solving a convex programming problem with convergence rate O(/k 2 ). Soviet Mathematics Doklady, 27(2): , 983. [2] Yurii Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Springer, [3] Yurii Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 03():27 52, [4] Yurii Nesterov. Smoothing technique and its applications in semidefinite optimization. Mathematical Programming, 0(2): , [5] James Renegar. Efficient first-order methods for hyperbolic programming. In preparation. School of Operations Research and Information Engineering, Cornell University, Ithaca, NY, U.S.

ACCELERATED FIRST-ORDER METHODS FOR HYPERBOLIC PROGRAMMING

ACCELERATED FIRST-ORDER METHODS FOR HYPERBOLIC PROGRAMMING ACCELERATED FIRST-ORDER METHODS FOR HYPERBOLIC PROGRAMMING JAMES RENEGAR Abstract. A framework is developed for applying accelerated methods to general hyperbolic programming, including linear, second-order

More information

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min.

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min. MA 796S: Convex Optimization and Interior Point Methods October 8, 2007 Lecture 1 Lecturer: Kartik Sivaramakrishnan Scribe: Kartik Sivaramakrishnan 1 Conic programming Consider the conic program min s.t.

More information

Lecture 5. Theorems of Alternatives and Self-Dual Embedding

Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 1 Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 2 A system of linear equations may not have a solution. It is well known that either Ax = c has a solution, or A T y = 0, c

More information

More First-Order Optimization Algorithms

More First-Order Optimization Algorithms More First-Order Optimization Algorithms Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/ yyye Chapters 3, 8, 3 The SDM

More information

Lecture 9 Monotone VIs/CPs Properties of cones and some existence results. October 6, 2008

Lecture 9 Monotone VIs/CPs Properties of cones and some existence results. October 6, 2008 Lecture 9 Monotone VIs/CPs Properties of cones and some existence results October 6, 2008 Outline Properties of cones Existence results for monotone CPs/VIs Polyhedrality of solution sets Game theory:

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 17 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 29, 2012 Andre Tkacenko

More information

Chapter 1. Preliminaries

Chapter 1. Preliminaries Introduction This dissertation is a reading of chapter 4 in part I of the book : Integer and Combinatorial Optimization by George L. Nemhauser & Laurence A. Wolsey. The chapter elaborates links between

More information

Largest dual ellipsoids inscribed in dual cones

Largest dual ellipsoids inscribed in dual cones Largest dual ellipsoids inscribed in dual cones M. J. Todd June 23, 2005 Abstract Suppose x and s lie in the interiors of a cone K and its dual K respectively. We seek dual ellipsoidal norms such that

More information

Lecture 7: Semidefinite programming

Lecture 7: Semidefinite programming CS 766/QIC 820 Theory of Quantum Information (Fall 2011) Lecture 7: Semidefinite programming This lecture is on semidefinite programming, which is a powerful technique from both an analytic and computational

More information

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008. 1 ECONOMICS 594: LECTURE NOTES CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS W. Erwin Diewert January 31, 2008. 1. Introduction Many economic problems have the following structure: (i) a linear function

More information

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming CSC2411 - Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming Notes taken by Mike Jamieson March 28, 2005 Summary: In this lecture, we introduce semidefinite programming

More information

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize

More information

Radial Subgradient Descent

Radial Subgradient Descent Radial Subgradient Descent Benja Grimmer Abstract We present a subgradient method for imizing non-smooth, non-lipschitz convex optimization problems. The only structure assumed is that a strictly feasible

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment he Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment William Glunt 1, homas L. Hayden 2 and Robert Reams 2 1 Department of Mathematics and Computer Science, Austin Peay State

More information

Topological properties

Topological properties CHAPTER 4 Topological properties 1. Connectedness Definitions and examples Basic properties Connected components Connected versus path connected, again 2. Compactness Definition and first examples Topological

More information

Interior-Point Methods for Linear Optimization

Interior-Point Methods for Linear Optimization Interior-Point Methods for Linear Optimization Robert M. Freund and Jorge Vera March, 204 c 204 Robert M. Freund and Jorge Vera. All rights reserved. Linear Optimization with a Logarithmic Barrier Function

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

Optimality Conditions for Constrained Optimization

Optimality Conditions for Constrained Optimization 72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

Acyclic Semidefinite Approximations of Quadratically Constrained Quadratic Programs

Acyclic Semidefinite Approximations of Quadratically Constrained Quadratic Programs 2015 American Control Conference Palmer House Hilton July 1-3, 2015. Chicago, IL, USA Acyclic Semidefinite Approximations of Quadratically Constrained Quadratic Programs Raphael Louca and Eilyan Bitar

More information

15-780: LinearProgramming

15-780: LinearProgramming 15-780: LinearProgramming J. Zico Kolter February 1-3, 2016 1 Outline Introduction Some linear algebra review Linear programming Simplex algorithm Duality and dual simplex 2 Outline Introduction Some linear

More information

Inequality Constraints

Inequality Constraints Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the

More information

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................

More information

Assignment 1: From the Definition of Convexity to Helley Theorem

Assignment 1: From the Definition of Convexity to Helley Theorem Assignment 1: From the Definition of Convexity to Helley Theorem Exercise 1 Mark in the following list the sets which are convex: 1. {x R 2 : x 1 + i 2 x 2 1, i = 1,..., 10} 2. {x R 2 : x 2 1 + 2ix 1x

More information

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,

More information

Problem 1 (Exercise 2.2, Monograph)

Problem 1 (Exercise 2.2, Monograph) MS&E 314/CME 336 Assignment 2 Conic Linear Programming January 3, 215 Prof. Yinyu Ye 6 Pages ASSIGNMENT 2 SOLUTIONS Problem 1 (Exercise 2.2, Monograph) We prove the part ii) of Theorem 2.1 (Farkas Lemma

More information

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 4

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 4 Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 4 Instructor: Farid Alizadeh Scribe: Haengju Lee 10/1/2001 1 Overview We examine the dual of the Fermat-Weber Problem. Next we will

More information

Solving Dual Problems

Solving Dual Problems Lecture 20 Solving Dual Problems We consider a constrained problem where, in addition to the constraint set X, there are also inequality and linear equality constraints. Specifically the minimization problem

More information

Frank-Wolfe Method. Ryan Tibshirani Convex Optimization

Frank-Wolfe Method. Ryan Tibshirani Convex Optimization Frank-Wolfe Method Ryan Tibshirani Convex Optimization 10-725 Last time: ADMM For the problem min x,z f(x) + g(z) subject to Ax + Bz = c we form augmented Lagrangian (scaled form): L ρ (x, z, w) = f(x)

More information

An Algorithm for Solving the Convex Feasibility Problem With Linear Matrix Inequality Constraints and an Implementation for Second-Order Cones

An Algorithm for Solving the Convex Feasibility Problem With Linear Matrix Inequality Constraints and an Implementation for Second-Order Cones An Algorithm for Solving the Convex Feasibility Problem With Linear Matrix Inequality Constraints and an Implementation for Second-Order Cones Bryan Karlovitz July 19, 2012 West Chester University of Pennsylvania

More information

Spectral Theorem for Self-adjoint Linear Operators

Spectral Theorem for Self-adjoint Linear Operators Notes for the undergraduate lecture by David Adams. (These are the notes I would write if I was teaching a course on this topic. I have included more material than I will cover in the 45 minute lecture;

More information

Convex Optimization and Modeling

Convex Optimization and Modeling Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

More information

Singular Value Decomposition

Singular Value Decomposition Chapter 6 Singular Value Decomposition In Chapter 5, we derived a number of algorithms for computing the eigenvalues and eigenvectors of matrices A R n n. Having developed this machinery, we complete our

More information

The speed of Shor s R-algorithm

The speed of Shor s R-algorithm IMA Journal of Numerical Analysis 2008) 28, 711 720 doi:10.1093/imanum/drn008 Advance Access publication on September 12, 2008 The speed of Shor s R-algorithm J. V. BURKE Department of Mathematics, University

More information

An E cient A ne-scaling Algorithm for Hyperbolic Programming

An E cient A ne-scaling Algorithm for Hyperbolic Programming An E cient A ne-scaling Algorithm for Hyperbolic Programming Jim Renegar joint work with Mutiara Sondjaja 1 Euclidean space A homogeneous polynomial p : E!R is hyperbolic if there is a vector e 2E such

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

Linear Programming Redux

Linear Programming Redux Linear Programming Redux Jim Bremer May 12, 2008 The purpose of these notes is to review the basics of linear programming and the simplex method in a clear, concise, and comprehensive way. The book contains

More information

Summer School: Semidefinite Optimization

Summer School: Semidefinite Optimization Summer School: Semidefinite Optimization Christine Bachoc Université Bordeaux I, IMB Research Training Group Experimental and Constructive Algebra Haus Karrenberg, Sept. 3 - Sept. 7, 2012 Duality Theory

More information

Linear Algebra, Summer 2011, pt. 2

Linear Algebra, Summer 2011, pt. 2 Linear Algebra, Summer 2, pt. 2 June 8, 2 Contents Inverses. 2 Vector Spaces. 3 2. Examples of vector spaces..................... 3 2.2 The column space......................... 6 2.3 The null space...........................

More information

The Simplex Method: An Example

The Simplex Method: An Example The Simplex Method: An Example Our first step is to introduce one more new variable, which we denote by z. The variable z is define to be equal to 4x 1 +3x 2. Doing this will allow us to have a unified

More information

Key words. Complementarity set, Lyapunov rank, Bishop-Phelps cone, Irreducible cone

Key words. Complementarity set, Lyapunov rank, Bishop-Phelps cone, Irreducible cone ON THE IRREDUCIBILITY LYAPUNOV RANK AND AUTOMORPHISMS OF SPECIAL BISHOP-PHELPS CONES M. SEETHARAMA GOWDA AND D. TROTT Abstract. Motivated by optimization considerations we consider cones in R n to be called

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Copositive Programming and Combinatorial Optimization

Copositive Programming and Combinatorial Optimization Copositive Programming and Combinatorial Optimization Franz Rendl http://www.math.uni-klu.ac.at Alpen-Adria-Universität Klagenfurt Austria joint work with I.M. Bomze (Wien) and F. Jarre (Düsseldorf) IMA

More information

OPTIMISATION 3: NOTES ON THE SIMPLEX ALGORITHM

OPTIMISATION 3: NOTES ON THE SIMPLEX ALGORITHM OPTIMISATION 3: NOTES ON THE SIMPLEX ALGORITHM Abstract These notes give a summary of the essential ideas and results It is not a complete account; see Winston Chapters 4, 5 and 6 The conventions and notation

More information

Constructions with ruler and compass.

Constructions with ruler and compass. Constructions with ruler and compass. Semyon Alesker. 1 Introduction. Let us assume that we have a ruler and a compass. Let us also assume that we have a segment of length one. Using these tools we can

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Semidefinite Programming

Semidefinite Programming Semidefinite Programming Notes by Bernd Sturmfels for the lecture on June 26, 208, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra The transition from linear algebra to nonlinear algebra has

More information

SEMIDEFINITE PROGRAM BASICS. Contents

SEMIDEFINITE PROGRAM BASICS. Contents SEMIDEFINITE PROGRAM BASICS BRIAN AXELROD Abstract. A introduction to the basics of Semidefinite programs. Contents 1. Definitions and Preliminaries 1 1.1. Linear Algebra 1 1.2. Convex Analysis (on R n

More information

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema Chapter 7 Extremal Problems No matter in theoretical context or in applications many problems can be formulated as problems of finding the maximum or minimum of a function. Whenever this is the case, advanced

More information

Proximal and First-Order Methods for Convex Optimization

Proximal and First-Order Methods for Convex Optimization Proximal and First-Order Methods for Convex Optimization John C Duchi Yoram Singer January, 03 Abstract We describe the proximal method for minimization of convex functions We review classical results,

More information

Linear Systems. Carlo Tomasi. June 12, r = rank(a) b range(a) n r solutions

Linear Systems. Carlo Tomasi. June 12, r = rank(a) b range(a) n r solutions Linear Systems Carlo Tomasi June, 08 Section characterizes the existence and multiplicity of the solutions of a linear system in terms of the four fundamental spaces associated with the system s matrix

More information

Conditional Gradient (Frank-Wolfe) Method

Conditional Gradient (Frank-Wolfe) Method Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties

More information

A Primal-Dual Second-Order Cone Approximations Algorithm For Symmetric Cone Programming

A Primal-Dual Second-Order Cone Approximations Algorithm For Symmetric Cone Programming A Primal-Dual Second-Order Cone Approximations Algorithm For Symmetric Cone Programming Chek Beng Chua Abstract This paper presents the new concept of second-order cone approximations for convex conic

More information

Solving large Semidefinite Programs - Part 1 and 2

Solving large Semidefinite Programs - Part 1 and 2 Solving large Semidefinite Programs - Part 1 and 2 Franz Rendl http://www.math.uni-klu.ac.at Alpen-Adria-Universität Klagenfurt Austria F. Rendl, Singapore workshop 2006 p.1/34 Overview Limits of Interior

More information

A priori bounds on the condition numbers in interior-point methods

A priori bounds on the condition numbers in interior-point methods A priori bounds on the condition numbers in interior-point methods Florian Jarre, Mathematisches Institut, Heinrich-Heine Universität Düsseldorf, Germany. Abstract Interior-point methods are known to be

More information

Fidelity and trace norm

Fidelity and trace norm Fidelity and trace norm In this lecture we will see two simple examples of semidefinite programs for quantities that are interesting in the theory of quantum information: the first is the trace norm of

More information

Math Linear Algebra II. 1. Inner Products and Norms

Math Linear Algebra II. 1. Inner Products and Norms Math 342 - Linear Algebra II Notes 1. Inner Products and Norms One knows from a basic introduction to vectors in R n Math 254 at OSU) that the length of a vector x = x 1 x 2... x n ) T R n, denoted x,

More information

10. Smooth Varieties. 82 Andreas Gathmann

10. Smooth Varieties. 82 Andreas Gathmann 82 Andreas Gathmann 10. Smooth Varieties Let a be a point on a variety X. In the last chapter we have introduced the tangent cone C a X as a way to study X locally around a (see Construction 9.20). It

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725 Proximal Gradient Descent and Acceleration Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: subgradient method Consider the problem min f(x) with f convex, and dom(f) = R n. Subgradient method:

More information

CO 250 Final Exam Guide

CO 250 Final Exam Guide Spring 2017 CO 250 Final Exam Guide TABLE OF CONTENTS richardwu.ca CO 250 Final Exam Guide Introduction to Optimization Kanstantsin Pashkovich Spring 2017 University of Waterloo Last Revision: March 4,

More information

Introduction to Semidefinite Programming I: Basic properties a

Introduction to Semidefinite Programming I: Basic properties a Introduction to Semidefinite Programming I: Basic properties and variations on the Goemans-Williamson approximation algorithm for max-cut MFO seminar on Semidefinite Programming May 30, 2010 Semidefinite

More information

Projection methods to solve SDP

Projection methods to solve SDP Projection methods to solve SDP Franz Rendl http://www.math.uni-klu.ac.at Alpen-Adria-Universität Klagenfurt Austria F. Rendl, Oberwolfach Seminar, May 2010 p.1/32 Overview Augmented Primal-Dual Method

More information

A New Look at the Performance Analysis of First-Order Methods

A New Look at the Performance Analysis of First-Order Methods A New Look at the Performance Analysis of First-Order Methods Marc Teboulle School of Mathematical Sciences Tel Aviv University Joint work with Yoel Drori, Google s R&D Center, Tel Aviv Optimization without

More information

On duality gap in linear conic problems

On duality gap in linear conic problems On duality gap in linear conic problems C. Zălinescu Abstract In their paper Duality of linear conic problems A. Shapiro and A. Nemirovski considered two possible properties (A) and (B) for dual linear

More information

Throughout these notes we assume V, W are finite dimensional inner product spaces over C.

Throughout these notes we assume V, W are finite dimensional inner product spaces over C. Math 342 - Linear Algebra II Notes Throughout these notes we assume V, W are finite dimensional inner product spaces over C 1 Upper Triangular Representation Proposition: Let T L(V ) There exists an orthonormal

More information

UNDERSTANDING THE DIAGONALIZATION PROBLEM. Roy Skjelnes. 1.- Linear Maps 1.1. Linear maps. A map T : R n R m is a linear map if

UNDERSTANDING THE DIAGONALIZATION PROBLEM. Roy Skjelnes. 1.- Linear Maps 1.1. Linear maps. A map T : R n R m is a linear map if UNDERSTANDING THE DIAGONALIZATION PROBLEM Roy Skjelnes Abstract These notes are additional material to the course B107, given fall 200 The style may appear a bit coarse and consequently the student is

More information

A Review of Linear Programming

A Review of Linear Programming A Review of Linear Programming Instructor: Farid Alizadeh IEOR 4600y Spring 2001 February 14, 2001 1 Overview In this note we review the basic properties of linear programming including the primal simplex

More information

Geometric problems. Chapter Projection on a set. The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as

Geometric problems. Chapter Projection on a set. The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as Chapter 8 Geometric problems 8.1 Projection on a set The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as dist(x 0,C) = inf{ x 0 x x C}. The infimum here is always achieved.

More information

Convex Optimization. (EE227A: UC Berkeley) Lecture 6. Suvrit Sra. (Conic optimization) 07 Feb, 2013

Convex Optimization. (EE227A: UC Berkeley) Lecture 6. Suvrit Sra. (Conic optimization) 07 Feb, 2013 Convex Optimization (EE227A: UC Berkeley) Lecture 6 (Conic optimization) 07 Feb, 2013 Suvrit Sra Organizational Info Quiz coming up on 19th Feb. Project teams by 19th Feb Good if you can mix your research

More information

Diagonalization by a unitary similarity transformation

Diagonalization by a unitary similarity transformation Physics 116A Winter 2011 Diagonalization by a unitary similarity transformation In these notes, we will always assume that the vector space V is a complex n-dimensional space 1 Introduction A semi-simple

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Lagrange Duality. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST)

Lagrange Duality. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST) Lagrange Duality Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2017-18, HKUST, Hong Kong Outline of Lecture Lagrangian Dual function Dual

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

Convex Optimization. Ofer Meshi. Lecture 6: Lower Bounds Constrained Optimization

Convex Optimization. Ofer Meshi. Lecture 6: Lower Bounds Constrained Optimization Convex Optimization Ofer Meshi Lecture 6: Lower Bounds Constrained Optimization Lower Bounds Some upper bounds: #iter μ 2 M #iter 2 M #iter L L μ 2 Oracle/ops GD κ log 1/ε M x # ε L # x # L # ε # με f

More information

CS 6820 Fall 2014 Lectures, October 3-20, 2014

CS 6820 Fall 2014 Lectures, October 3-20, 2014 Analysis of Algorithms Linear Programming Notes CS 6820 Fall 2014 Lectures, October 3-20, 2014 1 Linear programming The linear programming (LP) problem is the following optimization problem. We are given

More information

Copositive Plus Matrices

Copositive Plus Matrices Copositive Plus Matrices Willemieke van Vliet Master Thesis in Applied Mathematics October 2011 Copositive Plus Matrices Summary In this report we discuss the set of copositive plus matrices and their

More information

CS261: A Second Course in Algorithms Lecture #12: Applications of Multiplicative Weights to Games and Linear Programs

CS261: A Second Course in Algorithms Lecture #12: Applications of Multiplicative Weights to Games and Linear Programs CS26: A Second Course in Algorithms Lecture #2: Applications of Multiplicative Weights to Games and Linear Programs Tim Roughgarden February, 206 Extensions of the Multiplicative Weights Guarantee Last

More information

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010 I.3. LMI DUALITY Didier HENRION henrion@laas.fr EECI Graduate School on Control Supélec - Spring 2010 Primal and dual For primal problem p = inf x g 0 (x) s.t. g i (x) 0 define Lagrangian L(x, z) = g 0

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization

More information

MAT-INF4110/MAT-INF9110 Mathematical optimization

MAT-INF4110/MAT-INF9110 Mathematical optimization MAT-INF4110/MAT-INF9110 Mathematical optimization Geir Dahl August 20, 2013 Convexity Part IV Chapter 4 Representation of convex sets different representations of convex sets, boundary polyhedra and polytopes:

More information

A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE

A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE Yugoslav Journal of Operations Research 24 (2014) Number 1, 35-51 DOI: 10.2298/YJOR120904016K A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE BEHROUZ

More information

Lecture 17: Primal-dual interior-point methods part II

Lecture 17: Primal-dual interior-point methods part II 10-725/36-725: Convex Optimization Spring 2015 Lecture 17: Primal-dual interior-point methods part II Lecturer: Javier Pena Scribes: Pinchao Zhang, Wei Ma Note: LaTeX template courtesy of UC Berkeley EECS

More information

Convex Feasibility Problems

Convex Feasibility Problems Laureate Prof. Jonathan Borwein with Matthew Tam http://carma.newcastle.edu.au/drmethods/paseky.html Spring School on Variational Analysis VI Paseky nad Jizerou, April 19 25, 2015 Last Revised: May 6,

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 57 Table of Contents 1 Sparse linear models Basis Pursuit and restricted null space property Sufficient conditions for RNS 2 / 57

More information

Spring 2017 CO 250 Course Notes TABLE OF CONTENTS. richardwu.ca. CO 250 Course Notes. Introduction to Optimization

Spring 2017 CO 250 Course Notes TABLE OF CONTENTS. richardwu.ca. CO 250 Course Notes. Introduction to Optimization Spring 2017 CO 250 Course Notes TABLE OF CONTENTS richardwu.ca CO 250 Course Notes Introduction to Optimization Kanstantsin Pashkovich Spring 2017 University of Waterloo Last Revision: March 4, 2018 Table

More information

16.1 L.P. Duality Applied to the Minimax Theorem

16.1 L.P. Duality Applied to the Minimax Theorem CS787: Advanced Algorithms Scribe: David Malec and Xiaoyong Chai Lecturer: Shuchi Chawla Topic: Minimax Theorem and Semi-Definite Programming Date: October 22 2007 In this lecture, we first conclude our

More information

Checking Consistency. Chapter Introduction Support of a Consistent Family

Checking Consistency. Chapter Introduction Support of a Consistent Family Chapter 11 Checking Consistency 11.1 Introduction The conditions which define a consistent family of histories were stated in Ch. 10. The sample space must consist of a collection of mutually orthogonal

More information

Optimization: Then and Now

Optimization: Then and Now Optimization: Then and Now Optimization: Then and Now Optimization: Then and Now Why would a dynamicist be interested in linear programming? Linear Programming (LP) max c T x s.t. Ax b αi T x b i for i

More information

An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace

An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace Takao Fujimoto Abstract. This research memorandum is aimed at presenting an alternative proof to a well

More information

Convex Optimization. (EE227A: UC Berkeley) Lecture 28. Suvrit Sra. (Algebra + Optimization) 02 May, 2013

Convex Optimization. (EE227A: UC Berkeley) Lecture 28. Suvrit Sra. (Algebra + Optimization) 02 May, 2013 Convex Optimization (EE227A: UC Berkeley) Lecture 28 (Algebra + Optimization) 02 May, 2013 Suvrit Sra Admin Poster presentation on 10th May mandatory HW, Midterm, Quiz to be reweighted Project final report

More information

Spanning and Independence Properties of Finite Frames

Spanning and Independence Properties of Finite Frames Chapter 1 Spanning and Independence Properties of Finite Frames Peter G. Casazza and Darrin Speegle Abstract The fundamental notion of frame theory is redundancy. It is this property which makes frames

More information

Hilbert spaces. 1. Cauchy-Schwarz-Bunyakowsky inequality

Hilbert spaces. 1. Cauchy-Schwarz-Bunyakowsky inequality (October 29, 2016) Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/fun/notes 2016-17/03 hsp.pdf] Hilbert spaces are

More information

Examination paper for TMA4180 Optimization I

Examination paper for TMA4180 Optimization I Department of Mathematical Sciences Examination paper for TMA4180 Optimization I Academic contact during examination: Phone: Examination date: 26th May 2016 Examination time (from to): 09:00 13:00 Permitted

More information

Lecture 8 Plus properties, merit functions and gap functions. September 28, 2008

Lecture 8 Plus properties, merit functions and gap functions. September 28, 2008 Lecture 8 Plus properties, merit functions and gap functions September 28, 2008 Outline Plus-properties and F-uniqueness Equation reformulations of VI/CPs Merit functions Gap merit functions FP-I book:

More information

Polynomiality of Linear Programming

Polynomiality of Linear Programming Chapter 10 Polynomiality of Linear Programming In the previous section, we presented the Simplex Method. This method turns out to be very efficient for solving linear programmes in practice. While it is

More information

Division of the Humanities and Social Sciences. Supergradients. KC Border Fall 2001 v ::15.45

Division of the Humanities and Social Sciences. Supergradients. KC Border Fall 2001 v ::15.45 Division of the Humanities and Social Sciences Supergradients KC Border Fall 2001 1 The supergradient of a concave function There is a useful way to characterize the concavity of differentiable functions.

More information

1 Last time: least-squares problems

1 Last time: least-squares problems MATH Linear algebra (Fall 07) Lecture Last time: least-squares problems Definition. If A is an m n matrix and b R m, then a least-squares solution to the linear system Ax = b is a vector x R n such that

More information