An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP


An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP

Kaifeng Jiang, Defeng Sun, and Kim-Chuan Toh

March 3, 2012

Abstract. The accelerated proximal gradient (APG) method, first proposed by Nesterov for minimizing smooth convex functions, later extended by Beck and Teboulle to composite convex objective functions, and studied in a unifying manner by Tseng, has proven to be highly efficient in solving some classes of large scale structured convex optimization (possibly nonsmooth) problems, including nuclear norm minimization problems in matrix completion and l1 minimization problems in compressed sensing. The method has superior worst-case iteration complexity over the classical projected gradient method, and usually has good practical performance on problems with appropriate structures. In this paper, we extend the APG method to the inexact setting where the subproblem in each iteration is only solved approximately, and show that it enjoys the same worst-case iteration complexity as the exact counterpart if the subproblems are progressively solved to sufficient accuracy. We apply our inexact APG method to solve large scale convex quadratic semidefinite programming (QSDP) problems of the form min{ (1/2)⟨x, Q(x)⟩ + ⟨c, x⟩ : A(x) = b, x ⪰ 0 }, where Q, A are given linear maps and b, c are given data. The subproblem in each iteration is solved by a semismooth Newton-CG (SSNCG) method with warm-start using the iterate from the previous iteration. Our APG-SSNCG method is demonstrated to be efficient for QSDP problems whose positive semidefinite linear maps Q are highly ill-conditioned or rank deficient.

Department of Mathematics, National University of Singapore, 10 Lower Kent Ridge Road, Singapore (kaifengjiang@nus.edu.sg). Department of Mathematics and Risk Management Institute, National University of Singapore, 10 Lower Kent Ridge Road, Singapore (matsundf@nus.edu.sg). Department of Mathematics, National University of Singapore, 10 Lower Kent Ridge Road, Singapore (mattohkc@nus.edu.sg); and Singapore-MIT Alliance, 4 Engineering Drive 3, Singapore.

1 Introduction

Let S^n be the space of n x n real symmetric matrices endowed with the standard trace inner product ⟨·,·⟩ and Frobenius norm ‖·‖, and let S^n_+ (S^n_++) be the set of positive semidefinite (definite) matrices in S^n. We consider the following linearly constrained convex semidefinite programming problem:

(P)   min{ f(x) : A(x) = b, x ⪰ 0, x ∈ S^n },

where f is a smooth convex function on S^n_+, A : S^n → R^m is a linear map, b ∈ R^m, and x ⪰ 0 means that x ∈ S^n_+. Let A^* be the adjoint of A. The dual problem associated with (P) is given by

(D)   max{ f(x) − ⟨∇f(x), x⟩ + ⟨b, p⟩ : ∇f(x) − A^*p − z = 0, p ∈ R^m, z ⪰ 0, x ⪰ 0 }.

We assume that the linear map A is surjective, and that strong duality holds for (P) and (D). Let x* be an optimal solution of (P) and (x*, p*, z*) be an optimal solution of (D). Then, as a consequence of strong duality, they must satisfy the following KKT conditions:

A(x*) = b,   ∇f(x*) − A^*p* − z* = 0,   ⟨x*, z*⟩ = 0,   x*, z* ⪰ 0.

The problem (P) contains the following important special case of convex quadratic semidefinite programming (QSDP):

min{ (1/2)⟨x, Q(x)⟩ + ⟨c, x⟩ : A(x) = b, x ⪰ 0 },   (1)

where Q : S^n → S^n is a given self-adjoint positive semidefinite linear operator and c ∈ S^n. Note that the Lagrangian dual problem of (1) is given by

max{ −(1/2)⟨x, Q(x)⟩ + ⟨b, p⟩ : A^*(p) − Q(x) + z = c, z ⪰ 0 }.   (2)

A typical example of QSDP is the nearest correlation matrix problem, where given a symmetric matrix u ∈ S^n and a linear map L : S^n → R^{n x n}, one intends to solve

min{ (1/2)‖L(x − u)‖^2 : diag(x) = e, x ⪰ 0 },   (3)

where e ∈ R^n is the vector of all ones. If we let Q = L^*L and c = −L^*L(u) in (3), then we get the QSDP problem (1). A well studied special case of (3) is the W-weighted nearest correlation matrix problem, where L = W^{1/2} ⊛ W^{1/2} for a given W ∈ S^n_++ and Q = W ⊛ W. Here, for U ∈ R^{n x r} and V ∈ R^{n x s}, U ⊛ V : R^{r x s} → S^n is the symmetrized Kronecker product linear map defined by U ⊛ V (M) = (U M V^T + V M^T U^T)/2.
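For concreteness, the following small MATLAB sketch (my own illustration, not code from the paper) shows the symmetrized Kronecker product map U ⊛ V acting on a matrix, and verifies that the W-weighted operator Q = W ⊛ W reduces to Q(x) = W*x*W for symmetric x; the variable names are mine.

symkron = @(U,V,M) (U*M*V' + V*M'*U')/2;   % U (*) V applied to M

n = 5;
W = gallery('lehmer',n);              % some symmetric positive definite W
x = randn(n); x = (x + x')/2;         % a symmetric test matrix

Qx = symkron(W,W,x);
norm(Qx - W*x*W, 'fro')               % ~0: confirms Q(x) = W*x*W for symmetric x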

There are several methods available for solving this special case of (3), which include the alternating projection method [5], the quasi-Newton method [7], the inexact semismooth Newton-CG method [11] and the inexact interior-point method [14]. All these methods, excluding the inexact interior-point method, rely critically on the fact that the projection of a given matrix x ∈ S^n onto S^n_+ has an analytical formula with respect to the norm ‖W^{1/2}(·)W^{1/2}‖. However, the above mentioned techniques cannot be extended to efficiently solve the H-weighted case [5] of (3), where L(x) = H ∘ x for some H ∈ S^n with nonnegative entries and Q(x) = (H ∘ H) ∘ x, with ∘ denoting the Hadamard product of two matrices defined by (A ∘ B)_{ij} = A_{ij} B_{ij}. The aforementioned methods are not well suited for the H-weighted case of (3) because there is no explicitly computable formula for the solution of the following problem

min{ (1/2)‖H ∘ (x − u)‖^2 : x ⪰ 0 },   (4)

where u ∈ S^n is a given matrix. To tackle the H-weighted case of (3), Toh [13] proposed an inexact interior-point method for a general convex QSDP including the H-weighted nearest correlation matrix problem. Recently, Qi and Sun [12] introduced an augmented Lagrangian dual method for solving the H-weighted version of (3), where the inner subproblem is solved by a semismooth Newton-CG (SSNCG) method. The augmented Lagrangian dual method avoids solving (4) directly and it can be much faster than the inexact interior-point method [13]. However, if the weight matrix H is very sparse or ill-conditioned, the conjugate gradient (CG) method would have great difficulty in solving the linear system of equations in the semismooth Newton method, and the augmented Lagrangian method would not be efficient or could even fail.

Another example of QSDP comes from the civil engineering problem of estimating a positive semidefinite stiffness matrix for a stable elastic structure from r measurements of its displacements {u_1, ..., u_r} ⊂ R^n in response to a set of static loads {f_1, ..., f_r} ⊂ R^n [18]. In this application, one is interested in the QSDP problem min{ ‖f − L(x)‖^2 : x ∈ S^n_+ }, where L : S^n → R^{n x r} is defined by L(x) = xu, with f = [f_1, ..., f_r] and u = [u_1, ..., u_r]. In this case, the corresponding map Q = L^*L is given by Q(x) = (xB + Bx)/2 with B = uu^T.

The main purpose of this paper is to design an efficient algorithm to solve the problem (P). The algorithm we propose here is based on the accelerated proximal gradient (APG) method of Beck and Teboulle [1] (the method is called FISTA in [1]), where in the k-th iteration with iterate x_k, a subproblem of the following form must be solved:

min{ ⟨∇f(x_k), x − x_k⟩ + (1/2)⟨x − x_k, H_k(x − x_k)⟩ : A(x) = b, x ⪰ 0 },   (5)

where H_k : S^n → S^n is a given self-adjoint positive definite linear operator. In FISTA [1], H_k is restricted to L I, where I denotes the identity map and L is a Lipschitz constant for ∇f. More significantly, for FISTA in [1], the subproblem (5) must be solved exactly to generate the next iterate x_{k+1}. In this paper, we design an inexact APG method which

overcomes the two limitations just mentioned. Specifically, in our inexact algorithm, the subproblem (5) is only solved approximately and H_k is not restricted to be a scalar multiple of I. In addition, we are able to show that if the subproblem (5) is progressively solved with sufficient accuracy, then the number of iterations needed to achieve ε-optimality (in terms of the function value) is also proportional to 1/√ε, just as in the exact algorithm.

Another strong motivation for designing an inexact APG algorithm comes from the recent paper [2], which considered the following regularized inverse problem:

min_x { (1/2)‖Φx − y‖^2 + λ‖x‖_B },   (6)

where Φ : R^p → R^n is a given linear map and ‖x‖_B is the atomic norm induced by a given compact set of atoms B in R^p. It appears that the APG algorithm is highly suited for solving (6). But note that in each iteration of the APG algorithm, a subproblem of the form

min_z { (1/2)‖z − x‖^2 + μ‖z‖_B },   or its constrained counterpart   min{ (1/2)‖y − x‖^2 : ‖y‖_B ≤ μ },

must be solved. However, for most choices of B, the subproblem does not admit an analytical solution and has to be solved numerically. As a result, the subproblem is never solved exactly. In fact, it could be computationally very expensive to solve the subproblem to high accuracy. Our inexact APG algorithm thus has the attractive computational advantage that the subproblems need only be solved with progressively better accuracy while still maintaining the global iteration complexity.

We should mention that the fast gradient method of Nesterov [9] has also been extended in [4] to the problem min{ f(x) : x ∈ Q }, where the function f is convex (not necessarily smooth) on the closed convex set Q and is equipped with the so-called first-order (δ, L)-oracle: for any y ∈ Q, we can compute a pair (f_{δ,L}(y), g_{δ,L}(y)) such that

0 ≤ f(x) − f_{δ,L}(y) − ⟨g_{δ,L}(y), x − y⟩ ≤ (L/2)‖x − y‖^2 + δ   for all x ∈ Q.

In the inexact-oracle fast gradient method in [4], the subproblem of the form min{ ⟨g, x − y⟩ + (γ/2)‖x − y‖^2 : x ∈ Q } in each iteration must be solved exactly. Thus the kind of inexactness considered in [4] is very different from what we consider in this paper.

From a practical perspective, the extension of the APG algorithm in [1] to the inexact setting would not be as interesting if we cannot demonstrate it to be computationally viable or offer any computational advantage over alternative algorithms for solving a problem such as (1). Hence, even though the focus of this paper is on designing some theoretical inexact APG algorithms for solving the problem (P), and on establishing their iteration complexities, we do present some preliminary numerical results to demonstrate the practical viability of our proposed inexact algorithms. In particular, as we will demonstrate later in the paper, if the linear operator H_k is chosen appropriately so that the subproblem (5) is amenable to computation via the SSNCG method, then our inexact APG algorithm can be much more efficient than the state-of-the-art algorithm (the augmented Lagrangian method in [12]) for solving some convex QSDP problems arising from the H-weighted case of the nearest correlation matrix problem (3). In fact, from our preliminary numerical results, we observe that when the weight matrix H in (4) is

highly ill-conditioned, our inexact APG algorithm can be many times faster than the state-of-the-art augmented Lagrangian method designed in [12].

The paper is organized as follows. In section 2, we propose an inexact APG algorithm for solving a minimization problem of the form min{ f(x) + g(x) : x ∈ X }, where f is a smooth convex function with Lipschitz continuous gradient and g is a proper lower semicontinuous convex function. We also prove that the proposed inexact APG algorithm enjoys the same iteration complexity as the FISTA algorithm in [1]. In section 3, we propose and analyse an inexact APG algorithm for the problem (P) for which the semidefinite least squares subproblem in each iteration is not required to satisfy a stringent primal feasibility condition. In section 4, we conduct some preliminary numerical experiments to evaluate the practical performance of our proposed inexact APG algorithms for solving QSDP problems (1) arising from H-weighted nearest correlation matrix problems. We also evaluate the performance of the proposed algorithms on randomly generated QSDP problems for which the map Q takes the form arising in the stiffness matrix estimation problem in [18].

2 An inexact accelerated proximal gradient method

For more generality, we consider the following minimization problem

min{ F(x) := f(x) + g(x) : x ∈ X },   (7)

where X is a finite-dimensional Hilbert space. The functions f : X → R and g : X → R ∪ {+∞} are proper, lower semi-continuous convex functions (possibly nonsmooth). We assume that dom(g) := {x ∈ X : g(x) < ∞} is closed, f is continuously differentiable on X and its gradient ∇f is Lipschitz continuous with modulus L on X, i.e.,

‖∇f(x) − ∇f(y)‖ ≤ L‖x − y‖   for all x, y ∈ X.

We also assume that the problem (7) is solvable with an optimal solution x* ∈ dom(g). The inexact APG algorithm we propose for solving (7) is described as follows.

Algorithm 1. Given a tolerance ε > 0. Input y_1 = x_0 ∈ dom(g), t_1 = 1. Set k = 1. Iterate the following steps.

Step 1. Find an approximate minimizer

x_k ≈ argmin_{y ∈ X} { f(y_k) + ⟨∇f(y_k), y − y_k⟩ + (1/2)⟨y − y_k, H_k(y − y_k)⟩ + g(y) },   (8)

where H_k is a self-adjoint positive definite linear operator that is chosen by the user.

Step 2. Compute t_{k+1} = (1 + √(1 + 4 t_k^2))/2.

Step 3. Compute y_{k+1} = x_k + ((t_k − 1)/t_{k+1})(x_k − x_{k−1}).

Notice that Algorithm 1 is an inexact version of the algorithm FISTA in [1], where x_k need not be the exact minimizer of the subproblem (8). In addition, the quadratic term is not restricted to the form (L/2)‖y − y_k‖^2, where L is a positive scalar. Given any positive definite linear operator H_j : X → X and y_j ∈ X, we define q_j(·) : X → R by

q_j(x) = f(y_j) + ⟨∇f(y_j), x − y_j⟩ + (1/2)⟨x − y_j, H_j(x − y_j)⟩.   (9)

Note that if we choose H_j = L I, then we have f(x) ≤ q_j(x) for all x ∈ dom(g). Let {ξ_k}, {ϵ_k} be given convergent sequences of nonnegative numbers such that

Σ_{k=1}^∞ ξ_k < ∞   and   Σ_{k=1}^∞ ϵ_k < ∞.

Suppose for each j, we have an approximate minimizer

x_j ≈ argmin{ q_j(x) + g(x) : x ∈ X }   (10)

that satisfies the conditions

F(x_j) ≤ q_j(x_j) + g(x_j) + ξ_j/(2 t_j^2),   (11)

∇f(y_j) + H_j(x_j − y_j) + γ_j = δ_j   with   ‖H_j^{-1/2} δ_j‖ ≤ ϵ_j/(√2 t_j),   (12)

where γ_j ∈ ∂g(x_j; ξ_j/(2 t_j^2)) (the set of ξ_j/(2 t_j^2)-subgradients of g at x_j). Note that for x_j to be an approximate minimizer, we must have x_j ∈ dom(g). We should mention that the condition (11) is usually easy to satisfy. For example, if H_j is chosen such that f(x) ≤ q_j(x) for all x ∈ dom(g), then (11) is automatically satisfied.

To establish the iteration complexity result analogous to the one in [1] for Algorithm 1, we need to establish a series of lemmas whose proofs are extensions of those in [1] to

7 account for the inexactness in x. We should note that though the ideas in the proofs are similar, but as the reader will notice later, the technical details become much more involved due to the error terms induced by the inexact solutions of the subproblems. Lemma 2.1. Given y j X and a positive definite linear operator H j on X such that the conditions (11) and (12) hold. Then for any x X, we have F (x) F (x j ) 1 2 x j y j, H j (x j y j ) + y j x, H j (x j y j ) + δ j, x x j ξ j. (13) t 2 j Proof. The proof follows similar arguments as in [1, Lemma 2.3] and we omit it here. For later purpose, we define the following quantities: v = F (x ) F (x ) 0, u = t x (t 1)x 1 x, (14) a = t 2 v 0, b = 1 2 u, H (u ) 0, e = t δ, u, (15) τ = 1 2 x 0 x, H 1 (x 0 x ), ϵ = ϵ j, ξ = (ξ j + ϵ 2 j). (16) Note that for the choice where ϵ j = 1/j α = ξ j for all j 1, where α > 1 is fixed, we have ϵ 1 α 1, ξ α 1 1. Lemma 2.2. Suppose that H 1 H 0 for all. Then a 1 + b 1 a + b e ξ. (17) Proof. The proof follows similar arguments as in [1, Lemma 4.1] and we omit it here. Lemma 2.3. Suppose that H 1 H 0 for all. Then a ( τ + ϵ ) ξ. (18) Proof. Note that we have e H 1/2 δ H 1/2 u t ϵ H 1/2 u / 2 = ϵ b. First, we show that a 1 + b 1 τ + ϵ 1 b1 + ξ 1. Note that a 1 = F (x 1 ) F (x ) and b 1 = 1 2 x 1 x, H 1 (x 1 x ). By applying the inequality (13) to x = x with j = 1, and noting that y 1 = x 0, we have that a x 1 y 1, H 1 (x 1 y 1 ) + y 1 x, H 1 (x 1 y 1 ) + δ 1, x x 1 ξ 1 = 1 2 x 1 x, H 1 (x 1 x ) 1 2 y 1 x, H 1 (y 1 x ) + δ 1, x x 1 ξ 1 = b x 0 x, H 1 (x 0 x ) + δ 1, x x 1 ξ 1. 7

8 Hence, by using the fact that H 1/2 1 δ 1 ϵ 1 / 2, we get a 1 + b x 0 x, H 1 (x 0 x ) δ 1, x x 1 + ξ 1 τ + ϵ 1 b1 + ξ 1. (19) Let By Lemma 2.2, we have s = ϵ 1 b1 + + ϵ b + ξ ξ. τ a 1 + b 1 ϵ 1 b1 ξ 1 a 2 + b 2 ϵ 1 b1 ϵ 2 b2 ξ 1 ξ 2 a + b s. (20) Thus we have a + b τ + s, and hence s = s 1 + ϵ b + ξ s 1 + ϵ τ + s + ξ. (21) Note that since τ b 1 ϵ 1 b1 ξ 1, we have b (ϵ 1+ ϵ (τ + ξ 1 )) ϵ 1 + τ + ξ 1. Hence s 1 = ϵ 1 b1 + ξ 1 ϵ 1 (ϵ 1 + τ + ξ 1 ) + ξ 1 ϵ ξ 1 + ϵ 1 ( τ + ξ 1 ). The inequality (21) implies that Hence we must have (τ + s ) ϵ τ + s (τ + s 1 + ξ ) 0. τ + s 1 2 ( ) ϵ + ϵ 2 + 4(τ + s 1 + ξ ). Consequently s s ϵ2 + ξ ϵ ϵ 2 + 4(τ + s 1 + ξ ) s 1 + ϵ 2 + ξ + ϵ ( τ + s 1 + ξ ). This implies that s s 1 + ϵ 2 j + j=2 ξ + τ ϵ + ξ j + τ j=2 ϵ j sj ϵ j + j=2 ϵ j sj 1 + ξ j ξ + τ ϵ + s ϵ. (22) In the last inequality, we used the fact that s j 1 + ξ j s j and 0 s 1 s. The inequality (22) implies that s 1 ( ) ϵ + ( ϵ ξ + 4 ϵ τ) 1/ j=2

From here, we get s_k ≤ ϵ̄_k^2 + ξ̄_k + 2 ϵ̄_k √τ, and the required result follows from the fact that a_k ≤ τ + s_k in (20).

Now we are ready to state the iteration complexity result for the inexact APG algorithm described in Algorithm 1.

Theorem 2.1. Suppose the conditions (11) and (12) hold, and H_{k−1} ⪰ H_k ≻ 0 for all k. Then

0 ≤ F(x_k) − F(x*) ≤ (4/(k+1)^2) ( (√τ + ϵ̄_k)^2 + ξ̄_k ),   (23)

where τ = (1/2)⟨x_0 − x*, H_1(x_0 − x*)⟩, ϵ̄_k = Σ_{j=1}^k ϵ_j and ξ̄_k = Σ_{j=1}^k (ξ_j + ϵ_j^2) are as defined in (16).

Proof. By Lemma 2.3 and the fact that t_k ≥ (k+1)/2, we have

F(x_k) − F(x*) = a_k/t_k^2 ≤ (4/(k+1)^2) ( (√τ + ϵ̄_k)^2 + ξ̄_k ).

From the assumption on the sequences {ξ_k} and {ϵ_k}, we know that both {ϵ̄_k} and {ξ̄_k} are bounded. Then the required complexity result follows.

Observe that in Theorem 2.1, we recover the complexity result established in [1] if ϵ_j = 0 = ξ_j for all j.

2.1 Specialization to the case where g = δ_Ω

Problem (P) can be expressed in the form (7) with g = δ_Ω, where δ_Ω denotes the indicator function of the set

Ω = {x ∈ S^n : A(x) = b, x ⪰ 0}.   (24)

The subproblem (8), for a fixed y_k, then becomes the following constrained minimization problem:

min{ ⟨∇f(y_k), x − y_k⟩ + (1/2)⟨x − y_k, H_k(x − y_k)⟩ : A(x) = b, x ⪰ 0 }.   (25)

Suppose we have an approximate solution (x_k, p_k, z_k) to the KKT optimality conditions for (25). More specifically,

∇f(y_k) + H_k(x_k − y_k) − A^*p_k − z_k =: δ_k ≈ 0,
A(x_k) − b = 0,   (26)
⟨x_k, z_k⟩ =: ε_k ≈ 0,   x_k, z_k ⪰ 0.

To apply the complexity result established in Theorem 2.1, we need δ_k and ε_k to be sufficiently small so that the conditions (11) and (12) are satisfied. Observe that we need

x_k to be contained in Ω in (26). Note that the first equation in (26) is the feasibility condition for the dual problem of (25), and it corresponds to the condition in (12) with γ_k = −A^*p_k − z_k. Indeed, as we shall show next, γ_k is an ε_k-subgradient of g at x_k ∈ Ω if z_k ⪰ 0. Given any v ∈ Ω, we need to show that g(v) ≥ g(x_k) + ⟨γ_k, v − x_k⟩ − ε_k. We have g(v) = 0 and g(x_k) = 0 since v, x_k ∈ Ω, and

⟨γ_k, v − x_k⟩ = ⟨A^*p_k + z_k, x_k − v⟩ = ⟨p_k, A(x_k) − A(v)⟩ + ⟨z_k, x_k⟩ − ⟨z_k, v⟩ ≤ ⟨z_k, x_k⟩ = ε_k.   (27)

Note that in deriving (27), we used the fact that ⟨z_k, v⟩ ≥ 0 since v ⪰ 0 and z_k ⪰ 0. Thus the condition (12) is satisfied if ‖H_k^{-1/2} δ_k‖ ≤ ϵ_k/(√2 t_k) and ε_k ≤ ξ_k/(2 t_k^2).

As we have already noted in the last paragraph, the approximate solution x_k obtained by solving the subproblem (25) should be feasible, i.e., x_k ∈ Ω. In practice we can maintain the positive semidefiniteness of x_k by performing a projection onto S^n_+. But the residual vector r_k := A(x_k) − b is usually not exactly equal to 0, except in some special cases. For the nearest correlation matrix problem (3), one can indeed obtain an approximate solution x_k which is contained in Ω by performing a simple diagonal scaling to make the diagonal entries of the resulting matrix equal to one. In the following, we propose a strategy to find a feasible solution x̄_k ∈ Ω given an approximate solution x_k of (26) for which r_k is not necessarily 0, but (x_k, p_k, z_k) satisfies the conditions that x_k ⪰ 0, z_k ⪰ 0, ‖H_k^{-1/2} δ_k‖ ≤ (1/2) ϵ_k/(√2 t_k) and ε_k ≤ (1/2) ξ_k/(2 t_k^2).

Suppose that there exists x̂ ≻ 0 such that A(x̂) = b. Since A is surjective, AA^* is nonsingular. Let ω_k = A^*(AA^*)^{-1}(r_k). We note that ‖ω_k‖_2 ≤ ‖r_k‖/σ_min(A), and A(x_k + ω_k) = b, where ‖·‖_2 denotes the spectral norm. However, x_k + ω_k may not be positive semidefinite. Thus we consider the following iterate:

x̄_k = λ(x_k + ω_k) + (1 − λ) x̂ = λ x_k + (λ ω_k + (1 − λ) x̂),

where λ ∈ [0, 1]. It is clear that A(x̄_k) = b. By choosing λ = 1 − ‖ω_k‖_2/(‖ω_k‖_2 + λ_min(x̂)), we can guarantee that x̄_k is positive semidefinite. For x̄_k, we have

0 ≤ ⟨x̄_k, z_k⟩ ≤ λ ε_k + λ n ‖ω_k‖_2 ‖z_k‖ + (‖ω_k‖_2 n λ_max(x̂))/(‖ω_k‖_2 + λ_min(x̂)) ‖z_k‖
  ≤ ε_k + n ‖ω_k‖_2 ‖z_k‖ (1 + λ_max(x̂)/λ_min(x̂))
  ≤ 2 ε_k,   if ‖ω_k‖_2 ≤ (ε_k/(n ‖z_k‖)) (1 + λ_max(x̂)/λ_min(x̂))^{-1}.

Moreover,

∇f(y_k) + H_k(x̄_k − y_k) − (A^*p_k + z_k) = δ_k + H_k(x̄_k − x_k) =: δ̄_k.

Thus γ_k = −A^*p_k − z_k is a 2ε_k-subgradient of g at x̄_k ∈ Ω. Now ‖H_k^{-1/2} δ̄_k‖ ≤ ‖H_k^{-1/2} δ_k‖ + ‖H_k^{1/2}(x̄_k − x_k)‖, and

‖H_k^{1/2}(x̄_k − x_k)‖^2 = ⟨x̄_k − x_k, H_k(x̄_k − x_k)⟩ ≤ n ‖ω_k‖_2^2 λ_max(H_1) (1 + ‖x_k − x̂‖_2/λ_min(x̂))^2.

Thus we have ‖H_k^{-1/2} δ̄_k‖ ≤ ϵ_k/(√2 t_k) if

‖ω_k‖_2 ≤ [ ϵ_k / ( 2 √(2n) t_k (λ_max(H_1))^{1/2} ) ] (1 + ‖x_k − x̂‖_2/λ_min(x̂))^{-1}.

To conclude, (x̄_k, p_k, z_k) would satisfy the condition (12) if

‖ω_k‖_2 ≤ min{ [ ξ_k / (4 t_k^2 n ‖z_k‖) ] (1 + λ_max(x̂)/λ_min(x̂))^{-1},  [ ϵ_k / ( 2 √(2 n λ_max(H_1)) t_k ) ] (1 + ‖x_k − x̂‖_2/λ_min(x̂))^{-1} }.   (28)

We should note that even though we have succeeded in constructing a feasible x̄_k in Ω, the accuracy requirement in (28) could be too stringent for computational efficiency. For example, when σ_min(A) is small, or ‖z_k‖ is large, or x̂ has a large condition number, or λ_max(H_1) is large, we would expect that x_k must be computed to rather high accuracy so that ‖r_k‖ is small enough for (28) to be satisfied.

3 Analysis of an inexact APG method for (P)

To apply Algorithm 1 to solve the problem (P), the requirement that x_k must be primal feasible, i.e., x_k ∈ Ω, can be restrictive as it limits our flexibility in choosing a non-primal-feasible algorithm for solving (25). Even though the modification outlined in the last paragraph of section 2.1 is able to produce a primal feasible x̄_k, the main drawback is that the residual norm ‖ω_k‖ must satisfy the stringent accuracy condition in (28). To overcome the drawbacks just mentioned, here we propose an inexact APG algorithm for solving (P) for which the iterate x_k need not be strictly contained in Ω. As the reader will observe later, the analysis of the iteration complexity of the proposed inexact APG method becomes even more challenging than the analysis done in the previous section.

We let (x*, p*, z*) be an optimal solution pair of (P) and (D). In this section, we let

q_k(x) = f(y_k) + ⟨∇f(y_k), x − y_k⟩ + (1/2)⟨x − y_k, H_k(x − y_k)⟩,   x ∈ X.   (29)

Note that X = S^n. The inexact APG algorithm we propose for solving (P) is given as follows.

Algorithm 2. Given a tolerance ε > 0. Input y_1 = x_0 ∈ X, t_1 = 1. Set k = 1. Iterate the following steps.

Step 1. Find an approximate minimizer

x_k ≈ argmin{ q_k(x) : x ∈ Ω },   (30)

where H_k is a self-adjoint positive definite operator that is chosen by the user, and x_k is allowed to be contained in a suitable enlargement Ω_k of Ω.

Step 2. Compute t_{k+1} = (1 + √(1 + 4 t_k^2))/2.

Step 3. Compute y_{k+1} = x_k + ((t_k − 1)/t_{k+1})(x_k − x_{k−1}).

Note that when Ω_k = Ω, the dual problem of (30) is given by

max{ q_k(x) − ⟨∇q_k(x), x⟩ + ⟨b, p⟩ : ∇q_k(x) − A^*p − z = 0, z ⪰ 0, x ⪰ 0 }.   (31)

Let {ξ_k}, {ϵ_k}, {μ_k} be given convergent sequences of nonnegative numbers such that

Σ_{k=1}^∞ ξ_k < ∞,   Σ_{k=1}^∞ ϵ_k < ∞,   and   Σ_{k=1}^∞ μ_k < ∞.

We assume that the approximate minimizer x_k in (30) and its corresponding dual variables (p_k, z_k) satisfy the following conditions:

f(x_k) ≤ q_k(x_k) + ξ_k/(2 t_k^2),
∇q_k(x_k) − A^*p_k − z_k = δ_k,   with ‖H_k^{-1/2} δ_k‖ ≤ ϵ_k/(√2 t_k),
‖r_k‖ ≤ μ_k/t_k^2,   (32)
⟨x_k, z_k⟩ ≤ ξ_k/(2 t_k^2),
x_k ⪰ 0,   z_k ⪰ 0,

where r_k := A(x_k) − b. We assume that μ_k/t_k^2 ≥ μ_{k+1}/t_{k+1}^2 and ϵ_k/t_k ≥ ϵ_{k+1}/t_{k+1} for all k. Observe that the last five conditions in (32) stipulate that (x_k, p_k, z_k) is an approximate optimal solution of (30) and (31).

Just as in the previous section, we need to establish a series of lemmas to analyse the iteration complexity of Algorithm 2. However, we should mention that the lack of feasibility of x_k (i.e., x_k may not be contained in Ω) introduces nontrivial technical

13 difficulties in the proof of the complexity result for Algorithm 2. For example, F (x ) F (x ) no longer holds as in the feasible case when x Ω. Lemma 3.1. Given y j X and a positive definite linear operator H j on X such that the conditions in (32) hold. Then for any x S n +, we have f(x) f(x j ) 1 2 x j y j, H j (x j y j ) + y j x, H j (x j y j ) Proof. Since f(x j ) q j (x j ) + ξ j /(2t 2 j), we have + δ j + A p j, x x j ξ j /t 2 j. (33) f(x) f(x j ) f(x) q j (x j ) ξ j /(2t 2 j) = f(x) f(y j ) f(y j ), x j y j 1 2 x j y j, H j (x j y j ) ξ j /(2t 2 j) f(y j ), x x j 1 2 x j y j, H j (x j y j ) ξ j /(2t 2 j). Note that in the last inequality, we have used the fact that f(x) f(y j ) f(y j ), x y j for all x X. Now, by using (32), we get f(x) f(x j ) f(y j ), x x j 1 2 x j y j, H j (x j y j ) ξ j /(2t 2 j) = δ j + A p j H j (x j y j ), x x j + z j, x x j 1 2 x j y j, H j (x j y j ) ξ j /(2t 2 j) δ j + A p j H j (x j y j ), x x j 1 2 x j y j, H j (x j y j ) ξ j /t 2 j. From here, the required inequality (33) follows readily. Note that in deriving the last inequality, we have used the fact that z j, x j ξ j /(2t 2 j) and z j, x 0. For later purpose, we define the following quantities for 1: v = f(x ) f(x ), u = t x (t 1)x 1 x, a = t 2 v, b = 1 2 u, H (u ) 0, e = t δ, u, (34) η = p, t 2 r t 2 1 r 1, with η 1 = p 1, r 1, χ = p 1 p µ 1, with χ 1 = 0, ϵ = ϵ j, ξ = (ξ j + ϵ 2 j), χ = χ j, τ = 1 2 x 0 x, H 1 (x 0 x ). Note that unlie the analysis in the previous section, a may be negative. 13

14 Lemma 3.2. Suppose that H 1 H 0 for all. Then a 1 + b 1 a + b e ξ η. (35) Proof. By applying the inequality (33) to x = x 1 0 with j =, we get v 1 v = f(x 1 ) f(x ) (36) 1 2 x y, H (x y ) + y x 1, H (x y ) + δ + A p, x 1 x ξ /t 2. Similarly, by applying the inequality (33) to x = x 0 with j =, we get v = f(x ) f(x ) (37) 1 2 x y, H (x y ) + y x, H (x y ) + δ + A p, x x ξ /t 2. By multiplying (36) throughout by t 1 (note that t 1 for all 1) and adding that to (37), we get (t 1)v 1 t v t 2 x y, H (x y ) + t y (t 1)x 1 x, H (x y ) δ + A p, t x (t 1)x 1 x ξ /t. (38) Now, by multiplying (38) throughout by t and using the fact that t 2 1 = t (t 1), we get a 1 a = t 2 1v 1 t 2 v t2 2 x y, H (x y ) + t t y (t 1)x 1 x, H (x y ) δ + A p, t 2 x t 2 1x 1 t x ξ. Let a = t y, b = t x, and c = (t 1)x 1 + x. By using the fact that b a, H (b a) + 2 a c, H (b a) = b c, H (b c) a c, H (a c), we get a 1 a 1 2 b c, H (b c) 1 2 a c, H (a c) δ + A p, t 2 x t 2 1x 1 t x ξ. (39) Now a c = t y c = t x 1 + (t 1 1)(x 1 x 2 ) c = u 1, b c = u, and t 2 x t 2 1 x 1 t x = t u. Thus (39) implies that a 1 a 1 2 u, H (u ) 1 2 u 1, H (u 1 ) δ + A p, t u ξ b b 1 δ, t u p, A(t u ) ξ. (40) 14

15 Note that in deriving (40), we have used the fact that H 1 H. Now p, A(t u ) = p, t 2 (Ax b) t 2 1(Ax 1 b) = p, t 2 r t 2 1r 1. From here, the required result is proved. Lemma 3.3. Suppose that H 1 H 0 and the conditions in (32) are satisfied for all. Then where ω = ϵ j Aj, and a + b ( τ + ϵ ) 2 + p µ + 2( ξ + χ + ω ) (41) A j = p j µ j + a j, with a j = max{0, a j. Proof. Note that we have e H 1/2 δ H 1/2 u t ϵ H 1/2 u / 2 = ϵ b. First, we show that a 1 + b 1 τ + p 1, r 1 + ϵ 1 b1 + ξ 1. Note that a 1 = f(x 1 ) f(x ) and b 1 = 1 2 x 1 x, H 1 (x 1 x ). By applying the inequality (33) to x = x with j = 1, and noting that y 1 = x 0, we have that a x 1 y 1, H 1 (x 1 y 1 ) + y 1 x, H 1 (x 1 y 1 ) + δ 1 + A p 1, x x 1 ξ 1 = 1 2 x 1 x, H 1 (x 1 x ) 1 2 y 1 x, H 1 (y 1 x ) + δ 1 + A p 1, x x 1 ξ 1 = b x 0 x, H 1 (x 0 x ) + δ 1 + A p 1, x x 1 ξ 1. Hence, by using the fact that H 1/2 1 δ 1 ϵ 1 / 2, we get a 1 + b x 0 x, H 1 (x 0 x ) δ 1 + A p 1, x x 1 + ξ 1 τ + ϵ 1 b1 + p 1, r 1 + ξ 1 τ + p 1, r 1 + ϵ 1 b1 + ξ 1. Let s 1 = ϵ 1 b1 + ξ 1 and for 2, s = ϵ j bj + By Lemma 3.2, we have ξ j + τ a 1 + b 1 ϵ 1 b1 ξ 1 η 1 χ j. a 2 + b 2 ϵ 2 b2 ϵ 1 b1 ξ 1 ξ 2 η 1 η 2 a + b a + b +1 ϵ j bj ξ j ϵ j bj 15 η j ξ j p, t 2 r χ j.

16 Note that in the last inequality, we used the fact that 1 η j = p, t 2 r + p j p j+1, t 2 jr j p, t 2 r + Thus we have a + b τ + p, t 2 r + s, and this implies that Hence χ j. b τ + s where τ := τ + p, t 2 r a τ + A. (42) s = s 1 + ϵ b + ξ + χ s 1 + ϵ τ + s + ξ + χ. (43) Note that since τ 1 b 1 ϵ 1 b1 ξ 1, we have b 1 1(ϵ ϵ (τ 1 + ξ 1 )) ϵ 1 + τ1 + ξ 1. Hence s 1 = ϵ 1 b1 + ξ 1 ϵ 1 (ϵ 1 + τ 1 + ξ 1 ) + ξ 1 ϵ ξ 1 + ϵ 1 ( τ 1 + ξ 1 ). The inequality (43) implies that ) (τ + s ) ϵ τ + s (τ + s 1 + ξ + χ 0. Hence we must have τ + s 1 2 ( ) ϵ + ϵ 2 + 4(τ + s 1 + ξ + χ ). Consequently, by using the fact that a + b a + b for any a, b 0, we have s s ϵ2 + ξ + χ ϵ ϵ 2 + 4(τ + s 1 + ξ + χ ) s ϵ2 + ξ + χ ϵ ϵ 2 + 4(τ + A + s 1 + ξ + χ ) This implies that s 1 + ϵ 2 + ξ + χ + ϵ ( τ + A + s 1 + ξ + χ ). (44) s s 1 + (ϵ 2 j + ξ j + χ j ) + j=2 ξ + χ + ϵ j τ + Aj + j=2 ϵ j τ + Aj + ϵ j sj ϵ j sj 1 + ξ j + χ j ξ + χ + ω + ϵ τ + ϵ s. (45) In the last inequality, we used the fact that s j 1 + ξ j + χ j s j, and 0 s 1 s. The inequality (45) implies that s 1 ( ) ϵ + ϵ θ, (46) 16 j=2

17 where θ = ξ + χ + ω + ϵ τ. From here, we get s ϵ 2 + 2θ. (47) The required result follows from (47) and the fact that a + b τ + s + p, t 2 r τ + s + p µ. Let Ω := {x S n : A(x) b µ /t 2, x 0. (48) We assume that the following minimization problem min {f(x) : x Ω has an optimal solution x. Since x, x Ω, we have f(x ) f(x ) and f(x ) f(x ). Hence v = f(x ) f(x ) f(x ) f(x ). Also, since µ /t 2 µ +1/t 2 +1, we have f(x +1 ) f(x ) and Ω +1 Ω. Lemma 3.4. For all 1, we have Proof. By the convexity of f, we have 0 f(x ) f(x ) p µ /t 2. (49) f(x ) f(x ) f(x ), x x = A p + z, x x = p, A(x ) A(x ) + z, x z, x p b A(x ) p µ /t 2. Note that in deriving the last inequality, we have used the fact that z, x = 0, z, x 0, and A(x ) = b. Theorem 3.1. Suppose that H 1 H 0 for all. Let M := max 1 j { ( p + p j )µ j. Then we have 4 p µ ( + 1) f(x 4 ( ) f(x 2 ) ( ) τ + ϵ ( + 1) 2 ) 2 + p µ + 2 ϵ M + 2( ξ + χ ).(50) Proof. The inequality on the left-hand side of (50) follows from Lemma 3.4 and the fact that t ( + 1)/2 and f(x ) f(x ) f(x ) f(x ). Next, we prove the inequality on the right-hand side of (50). By Lemma 3.3 and noting that b 0, we have t 2 (f(x ) f(x )) = a ( τ + ϵ ) 2 + p µ + 2( ξ + χ ) + 2ω. 17
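The complexity arguments in this section and the previous one repeatedly invoke the bound t_k ≥ (k+1)/2 for the sequence produced in Step 2 of Algorithms 1 and 2. The following short MATLAB check (my own sanity check, not part of the paper) confirms the bound numerically; the induction argument is that t_{k+1} ≥ (1 + 2 t_k)/2 whenever t_k ≥ (k+1)/2.

t = 1; ok = true;
for k = 1:1000
    ok = ok && (t >= (k+1)/2);        % check t_k >= (k+1)/2
    t = (1 + sqrt(1 + 4*t^2))/2;      % Step 2 update for t_{k+1}
end
ok                                    % prints 1 (true)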

18 Now a j = t 2 j(f(x ) f(x j )) t 2 j(f(x ) f(x j )) p µ j. Hence a j p µ j, and ω ϵ j p j µ j + p µ j M ϵ. (51) From here, the required result follows. From the assumption on the sequences {ϵ, {ξ, and {µ, we now that the sequences { ϵ and { ξ are bounded. In order to show that the sequence of function values f(x ) converges to the optimal function value f(x ) with the convergent rate O(1/ 2 ), it is enough to show that the sequence { p is bounded under certain conditions, from which we can also have the boundedness of {M and { χ. Then the desired convergent rate of O(1/ 2 ) for our inexact APG method follows. 3.1 Boundedness of {p In this subsection, we consider sufficient conditions to ensure the boundedness of the sequence {p. Lemma 3.5. Suppose that there exists ( x, p, z) such that A( x) = b, x 0, f( x) = A p + z, z 0. (52) If the sequence {f(x ) is bounded from above, then the sequence {x is bounded. Proof. By using the convexity of f, we have Thus f( x) f(x ) f( x), x x = A p + z, x x = p, A( x) A(x ) + z, x z, x p b A(x ) + z, x z, x p µ /t 2 + z, x z, x p µ 1 + z, x z, x. λ min ( z)tr(x ) z, x p µ 1 + z, x f( x) + f(x ). (53) From here, the required result is proved. Remar 3.1. The condition that {f(x ) is bounded from above appears to be fairly wea. But unfortunately we are not able to prove that this condition holds true. In many cases of interest, such as the nearest correlation matrix problem (3), the condition that {f(x ) is bounded above or that {x is bounded can be ensured since Ω 1 is bounded. 18

19 Lemma 3.6. Suppose that H 1 H 0 for all, {x is bounded and there exists ˆx such that A(ˆx) = b, ˆx 0. Then the sequence {z is bounded. In addition, the sequence {p is also bounded. Proof. From (32), we have λ min (ˆx)Tr(z ) ˆx, z = ˆx, q (x ) A p δ = b, p + ˆx, q (x ) ˆx, δ + ˆx x, q (x ) ˆx, δ = + ˆx x, f(y ) + H (x y ) ˆx, δ + ˆx x f(y ) + H (x y ) + H 1/2 ˆx ϵ /( 2t ) + ˆx x f(y ) + H (x y ) + H 1/2 1 ˆx ϵ 1 / 2. (54) Since {x is bounded, it is clear that {y is also bounded. By the continuity of f and that fact that 0 H H 1, the sequence { f(y ) + H (x y ) is also bounded. From (54), we can now conclude that {z is bounded. Next, we show that {p is bounded. Let A = (AA ) 1 A. Note that the matrix AA is nonsingular since A is assumed to be surjective. From (32), we have p = A ( q (x ) z δ ), and hence ( ) p A q (x ) z δ A f(y ) + H (x y ) + z + δ. Since H H 1 λ max (H 1 )I, we have δ λ max (H 1 ) H 1/2 δ λ max (H 1 )ϵ 1 / 2. By using the fact that the sequences { f(y ) + H (x y ) and {z are bounded, the boundedness of {p follows. 3.2 A semismooth Newton-CG method for solving the inner subproblem (30) In Section 3, an inexact APG method (Algorithm 2) was presented for solving (P ) with the desired convergent rate of O(1/ 2 ). However, an important issue on how to efficiently solve the inner subproblem (30) has not been addressed. In this section, we propose the use of a SSNCG method to solve (30) with warm-start using the iterate from the previous iteration. Note that the self-adjoint positive definite linear operator H can be chosen by the user. Suppose that at each iteration we are able to choose a linear operator of the form: H := w w, where w S n ++ 19

such that f(x) ≤ q_k(x) for all x ∈ Ω. (Note that we can always choose w = L I if there are no other better choices.) Then the objective function q_k(·) in (30) can equivalently be written as

q_k(x) = (1/2)‖w^{1/2}(x − u_k)w^{1/2}‖^2 + f(y_k) − (1/2)‖w^{-1/2} ∇f(y_k) w^{-1/2}‖^2,

where u_k = y_k − w^{-1} ∇f(y_k) w^{-1}. By dropping the last two constant terms in the above equation, we can equivalently write (30) as the following well-studied W-weighted semidefinite least squares problem

min{ (1/2)‖w^{1/2}(x − u_k)w^{1/2}‖^2 : A(x) = b, x ⪰ 0 }.   (55)

Let x̄ = w^{1/2} x w^{1/2}, ū_k = w^{1/2} u_k w^{1/2}, and define the linear map Ā : S^n → R^m by Ā(x̄) = A(w^{-1/2} x̄ w^{-1/2}). Then (55) can equivalently be written as

min{ (1/2)‖x̄ − ū_k‖^2 : Ā(x̄) = b, x̄ ⪰ 0 },   (56)

whose Lagrangian dual problem is given by

max{ θ(p) := b^T p − (1/2)‖Π_{S^n_+}(ū_k + Ā^*p)‖^2 : p ∈ R^m },   (57)

where Π_{S^n_+}(u) denotes the metric projection of u ∈ S^n onto S^n_+. The problem (57) is an unconstrained continuously differentiable convex optimization problem, and it can be efficiently solved by the SSNCG method developed in [11]. Note that once an approximate solution p_k is computed from (57), an approximate solution for (55) can be computed by setting x_k = w^{-1/2} x̄_k w^{-1/2} with x̄_k = Π_{S^n_+}(ū_k + Ā^*p_k), and its complementary dual slack variable is z_k = w^{1/2}(x̄_k − ū_k − Ā^*p_k)w^{1/2}.

Note that the problem (57) is an unconstrained continuously differentiable convex optimization problem which can also be solved by a gradient ascent method. In our numerical implementation, we use a gradient method to solve (57) during the initial phase of Algorithm 2 when high accuracy solutions are not required. When the gradient method encounters difficulty in solving the subproblem to the required accuracy or becomes inefficient, we switch to the SSNCG method to solve (57). We should note that the approximate solution computed for the current subproblem can be used to warm-start the SSNCG and gradient methods for solving the next subproblem. In fact, the strategy of solving a semidefinite least squares subproblem (30) in each iteration of our inexact APG algorithm is practically viable precisely because we are able to warm-start the SSNCG or gradient methods when solving the subproblems. In our numerical experience, the SSNCG method typically takes less than 5 Newton steps to solve each subproblem with warm-start.
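The following hedged MATLAB sketch (my own illustration, not the paper's solver) evaluates the dual function θ(p) of (57) and its gradient for the unweighted case w = I with A(x) = diag(x) and b = e, and runs a plain gradient ascent on it; the paper uses such a gradient phase only initially and then switches to SSNCG. All data and names below are synthetic assumptions.

n = 30;
m0 = randn(n); ubar = (m0 + m0')/2;          % the matrix u_bar in (56) (synthetic)
b = ones(n,1);

theta  = @(p) b'*p - 0.5*norm(proj_psd(ubar + diag(p)), 'fro')^2;
gtheta = @(p) b - diag(proj_psd(ubar + diag(p)));   % gradient of theta

p = zeros(n,1);
for it = 1:500
    p = p + gtheta(p);                        % step 1 = 1/L, since the gradient is 1-Lipschitz here
end
xbar = proj_psd(ubar + diag(p));              % primal recovery: x_bar = Pi_{S+}(u_bar + Abar^* p)
norm(diag(xbar) - b)                          % residual ||A(x) - b||, should be small

function q = proj_psd(m)
% metric projection of a symmetric matrix onto the PSD cone
[v,d] = eig((m + m')/2);
q = v*max(d,0)*v';
end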

To successfully apply the SSNCG method to solve (30), we must find a suitable symmetrized Kronecker product approximation of Q. Note that for the H-weighted nearest correlation matrix problem where Q is a diagonal operator defined by Q(x) = (H ∘ H) ∘ x, a positive definite symmetrized Kronecker product approximation of Q can be derived as follows. Consider a rank-one approximation dd^T of H ∘ H; then diag(d) ⊛ diag(d) is a symmetrized Kronecker product approximation of Q. For the vector d ∈ R^n, one can simply take

d_j = max{ ϵ, max_{1 ≤ i ≤ n} H_{ij} },   j = 1, ..., n,   (58)

where ϵ > 0 is a small positive number.

For the convex QSDP problem (1) where the linear operator Q is defined by

Q(x) = B ⊛ I(x) = (Bx + xB)/2,   (59)

where B ∈ S^n_+, we propose the following strategy for constructing a suitable symmetrized Kronecker product approximation of Q = B ⊛ I. Suppose we have the eigenvalue decomposition B = P Λ P^T, where Λ = diag(λ) and λ = (λ_1, ..., λ_n)^T is the vector of eigenvalues of B. Then

⟨x, B ⊛ I(x)⟩ = (1/2)⟨x̂, Λ x̂ + x̂ Λ⟩ = (1/2) Σ_{i=1}^n Σ_{j=1}^n x̂_{ij}^2 (λ_i + λ_j) = Σ_{i=1}^n Σ_{j=1}^n x̂_{ij}^2 M_{ij},

where x̂ = P^T x P and M = (1/2)(λ e^T + e λ^T) with e ∈ R^n being the vector of all ones. For the choice of w, one may simply choose w = max(M) I, where max(M) is the largest element of M. However, if the matrix B is ill-conditioned, this choice of w may not work very well in practice since max(M) I ⊛ I may not be a good approximation of Q = B ⊛ I. To find a better approximation of Q, we propose to consider the following nonconvex minimization problem:

min{ Σ_{i=1}^n Σ_{j=1}^n h_i h_j : h_i h_j − M_{ij} ≥ 0, i, j = 1, ..., n, h ∈ R^n_+ }.   (60)

Thus if ĥ is a feasible solution to the above problem, then we have

⟨x, B ⊛ I(x)⟩ = Σ_{i=1}^n Σ_{j=1}^n x̂_{ij}^2 M_{ij} ≤ Σ_{i=1}^n Σ_{j=1}^n x̂_{ij}^2 ĥ_i ĥ_j = ⟨x, w x w⟩   for all x ∈ S^n,   with w = P diag(ĥ) P^T.

Note that the above strategy can also be used to obtain a suitable symmetrized Kronecker product approximation of the form diag(d) ⊛ diag(d) when Q is a diagonal operator.
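The following MATLAB sketch (my own illustration; all names are mine) builds the two ingredients just described: the vector d of (58) for the H-weighted case, and the matrix M of eigenvalue averages for Q = B ⊛ I, together with a numerical check of the identity ⟨x, B ⊛ I(x)⟩ = Σ_{ij} x̂_{ij}^2 M_{ij}.

n = 8; eps0 = 1e-3;
H = abs(randn(n)); H = (H + H')/2;
d = max(eps0, max(H,[],1))';                 % d_j = max(eps0, max_i H(i,j)), as in (58)

B = gallery('lehmer',n);
[P,Lam] = eig(B); lambda = diag(Lam);
e = ones(n,1);
M = (lambda*e' + e*lambda')/2;               % M_ij = (lambda_i + lambda_j)/2

% check <x, B(*)I(x)> = sum_ij xhat_ij^2 * M_ij with xhat = P'*x*P
x = randn(n); x = (x + x')/2; xhat = P'*x*P;
abs( sum(sum(x.*((B*x + x*B)/2))) - sum(sum(xhat.^2 .* M)) )   % ~0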

To find a good feasible solution of (60), we consider the following strategy. Suppose we are given an initial vector u ∈ R^n_+ such that uu^T − M ≥ 0 (entrywise). For example, we may take u = max(M) e. Our purpose is to find a correction vector ξ ∈ R^n_+ such that h := u − ξ satisfies the constraints in (60) while the objective value is reduced. Note that we have

h_i h_j − M_{ij} = u_i u_j − M_{ij} − (u_i ξ_j + u_j ξ_i) + ξ_i ξ_j ≥ u_i u_j − M_{ij} − (u_i ξ_j + u_j ξ_i).

Thus the constraints in (60) are satisfied if ξ ≤ u and

u_i ξ_j + u_j ξ_i ≤ u_i u_j − M_{ij},   i, j = 1, ..., n.

Since Σ_{i=1}^n Σ_{j=1}^n h_i h_j = (e^T ξ)^2 − 2(e^T u)(e^T ξ) + (e^T u)^2, and noting that 0 ≤ e^T ξ ≤ e^T u, the objective value in (60) is minimized if e^T ξ is maximized. Thus we propose to consider the following LP:

max{ e^T ξ : u_i ξ_j + u_j ξ_i ≤ u_i u_j − M_{ij}, i, j = 1, ..., n, 0 ≤ ξ ≤ u }.   (61)

Observe that the correction vector ξ depends on the given vector u. Thus if necessary, after a new u is obtained, one may repeat the process by solving the LP associated with the new u. (A small MATLAB sketch of this LP is given after the stopping criteria described below.)

4 Numerical Experiments

In this section, we report the performance of the inexact APG algorithm (Algorithm 2) on large scale linearly constrained QSDP problems. We implemented our algorithm in MATLAB 2008a (version 7.6), and the numerical experiments were run in MATLAB under a Linux operating system on an Intel Core 2 Duo 2.40GHz CPU with 4GB memory.

We measure the infeasibilities for the primal and dual problems (1) and (2) as follows:

R_P = ‖b − A(x)‖/(1 + ‖b‖),   R_D = ‖Q(x) + c − A^*p − z‖/(1 + ‖c‖),   (62)

where x, p, z are computed from (57). In our numerical experiments, we stop the inexact APG algorithm when

max{R_P, R_D} ≤ Tol,   (63)

where Tol is a pre-specified accuracy tolerance. Unless otherwise specified, we set Tol = 10^{-6} as the default tolerance. In addition, we also stop the inexact APG algorithm when the number of outer iterations exceeds 300. When solving the subproblem (57) at iteration k of our inexact APG method, we stop the SSNCG or gradient method when

‖∇θ(p_k)‖/(1 + ‖b‖) < min{ 1/t_k^{3.1},  0.2 ‖∇f(x_{k−1}) − A^*p_{k−1} − z_{k−1}‖/(1 + ‖c‖) }.
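As promised above, here is a hedged MATLAB sketch of the LP (61) that improves the initial vector u (it uses linprog from the Optimization Toolbox, which is assumed to be available; all variable names are mine, and the data is synthetic).

n = 8;
B = gallery('lehmer',n);
lambda = eig(B); e = ones(n,1);
M = (lambda*e' + e*lambda')/2;
u = max(M(:))*e;                       % initial u with u*u' - M >= 0 (valid here since max(M) >= 1)

% constraints u_i*xi_j + u_j*xi_i <= u_i*u_j - M_ij for all i <= j
rows = []; cols = []; vals = []; rhs = []; cnt = 0;
for i = 1:n
    for j = i:n
        cnt = cnt + 1;
        rows = [rows, cnt, cnt];  cols = [cols, j, i];
        vals = [vals, u(i), u(j)];     % duplicate entries at i==j sum to 2*u(i)
        rhs  = [rhs; u(i)*u(j) - M(i,j)];
    end
end
A  = sparse(rows, cols, vals, cnt, n);
xi = linprog(-ones(n,1), A, rhs, [], [], zeros(n,1), u);   % maximize e'*xi
h  = u - xi;                           % improved feasible vector for (60)
min(min(h*h' - M))                     % should be >= 0 up to solver tolerance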

4.1 Example 1

In this example, we consider the following H-weighted nearest correlation matrix problem:

min{ (1/2)‖H ∘ (x − u)‖^2 : diag(x) = e, x ⪰ 0 }.   (64)

We compare the performance of our inexact APG (IAPG) method and the augmented Lagrangian dual method (AL) studied by Qi and Sun in [12], whose MATLAB codes are available from D.F. Sun's homepage (matsundf). We consider the gene correlation matrices û from [6]. For testing purposes we perturb û to u := (1 − α)û + αE, where α ∈ (0, 1) and E is a randomly generated symmetric matrix with entries in [−1, 1]. We also set u_ii = 1, i = 1, ..., n. The weight matrix H is a sparse random symmetric matrix with about 50% nonzero entries. The MATLAB code for generating H and E is as follows: H = sprand(n,n,0.5); H = triu(H) + triu(H,1)'; H = (H + H')/2; E = 2*rand(n,n) - 1; E = triu(E) + triu(E,1)'; E = (E + E')/2.

In order to generate a good initial point, we use the SSNCG method in [11] to solve the following unweighted nearest correlation matrix problem:

min{ (1/2)‖x − u‖^2 : diag(x) = e, x ⪰ 0 }.   (65)

Due to the difference in stopping criteria for different algorithms, we set different accuracy tolerances for the IAPG and augmented Lagrangian methods. For the IAPG method, we set the tolerance Tol as above. For the augmented Lagrangian method, its stopping criterion depends on a tolerance parameter Tol1 which controls the three conditions in the KKT system (26).

Table 1 presents the numerical results obtained by the IAPG method (Algorithm 2) and the augmented Lagrangian dual method (AL) for various instances of Example 1. We use the primal infeasibility, the primal objective value and the computing time to compare the performance of the two algorithms. For each instance in the table, we report the matrix dimension (n), the noise level (α), the number of outer iterations (iter), the total number of Newton systems solved (newt), the primal infeasibility (R_P), the dual infeasibility (R_D), the primal objective value (pobj) in (64), the computation time (in the format hours:minutes:seconds) and the rank of the computed solution (rankx). We may observe from the table that the IAPG method can solve (64) very efficiently. For each instance, the IAPG method achieves nearly the same primal objective value as the augmented Lagrangian method, while achieving much better primal infeasibility and taking less than 50% of the time needed by the augmented Lagrangian method.
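To make the outer iteration concrete, the following toy MATLAB rendering (my own, not the paper's APG-SSNCG code) applies the iteration of Algorithm 2 to a synthetic instance of (64) with H_k = L I and a deliberately crude approximate subproblem solver: a PSD projection followed by the diagonal rescaling mentioned in section 2.1. It only illustrates the outer loop; the paper solves the subproblems by SSNCG and enforces the accuracy conditions (32).

n = 100;
H = full(abs(sprandsym(n,0.5))) + 1e-2;       % positive symmetric weights (synthetic)
u = 2*rand(n) - 1; u = (u + u')/2; u(1:n+1:end) = 1;
L = max(max(H.^2));                           % Lipschitz constant of grad f
gradf = @(x) (H.*H).*(x - u);

x = eye(n); y = x; t = 1;
for k = 1:100
    % Step 1 (approximate): project the gradient step onto the PSD cone,
    % then rescale the diagonal toward one so that the iterate stays near Omega
    xnew = proj_psd(y - gradf(y)/L);
    d = max(diag(xnew), 1e-8);
    xnew = diag(1./sqrt(d))*xnew*diag(1./sqrt(d));
    % Steps 2-3: update t_k and the extrapolated point y_{k+1}
    tnew = (1 + sqrt(1 + 4*t^2))/2;
    y = xnew + ((t - 1)/tnew)*(xnew - x);
    x = xnew; t = tnew;
end
0.5*norm(H.*(x - u),'fro')^2                  % objective value of (64) at the final iterate

function q = proj_psd(m)
[v,d] = eig((m + m')/2);
q = v*max(d,0)*v';
end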

Algo. problem n α iter/newt R_P R_D pobj time rankx IAPG Lymph / e e e+0 2: / e e e-1 4: AL Lymph e e e+0 5: e e e-1 30: IAPG ER / e e e+1 3: / e e e+0 3: AL ER e e e+1 9: e e e+0 14: IAPG Arabidopsis / e e e+1 4: / e e e+0 4: AL Arabidopsis e e e+1 12: e e e+0 22: IAPG Leukemia / e e e+2 9: / e e e+1 8: AL Leukemia e e e+2 22: e e e+1 28: IAPG hereditarybc / e e e+2 17: / e e e+2 17: AL hereditarybc e e e+2 38: e e e+2 36:

Table 1: Comparison of the inexact APG (IAPG) and augmented Lagrangian dual (AL) algorithms on (64) using sample correlation matrices from gene data sets. The weight matrix H is a sparse random matrix with about 50% nonzero entries.

4.2 Example 2

We consider the same problem as in Example 1, but the weight matrix H is generated from a weight matrix H0 used by a hedge fund company. The matrix H0 is a symmetric matrix with all positive entries. It has about 24% of the entries equal to 10^{-5} and the rest are distributed in the interval [2, ]. It has 28 eigenvalues in the interval [−520, −0.04], 11 eigenvalues in the interval [ , ], and the remaining 54 eigenvalues in the interval [10^{-4}, ]. The MATLAB code for generating the matrix H is given by: tmp = kron(ones(25,25),H0); H = tmp([1:n],[1:n]); H = (H+H')/2.

We use the same implementation techniques as in Example 1. The stopping tolerance for the IAPG method is set to Tol = 10^{-6}, while the tolerance for the augmented Lagrangian method is set to Tol1 = 10^{-2}. Table 2 presents the numerical results obtained by the IAPG and augmented Lagrangian dual (AL) methods. In the table, a dash means that the augmented Lagrangian method cannot achieve the required tolerance of 10^{-2} in 24 hours. As we can see from Table 2, the IAPG method is much more efficient than the augmented Lagrangian method, and it achieves much better primal infeasibility. For the last gene correlation matrix of size 1869, the IAPG method can find a good approximate solution within half an hour.

Algo. problem n α iter/newt R_P R_D pobj time rankx IAPG Lymph / e e e+6 1: / e e e+6 1: AL Lymph e e e+6 56: e e e+6 29: IAPG ER / e e e+7 2: / e e e+6 2: AL ER e e e+7 2:05: e e e+6 53: IAPG Arabidopsis / e e e+7 4: / e e e+6 3: AL Arabidopsis e e e+7 4:49: e e e+6 1:28: IAPG Leukemia / e e e+7 11: / e e e+7 10: AL Leukemia e e e+7 5:55: IAPG hereditarybc / e e e+8 29: / e e e+7 26: AL hereditarybc

Table 2: Same as Table 1, but with a bad weight matrix H.

For the augmented Lagrangian method, because the map Q associated with the weight matrix H is highly ill-conditioned, the CG method has great difficulty in solving the ill-conditioned linear system of equations arising in the semismooth Newton method.

Note that we have also used Algorithm 1 to solve the problems in Tables 1 and 2, since we can obtain an approximate solution x_k which is feasible for (64) for the subproblems (25) appearing in Algorithm 1. For these problems, the numerical results obtained by Algorithm 1 are quite similar to those of Algorithm 2, and hence we shall not report them here.

4.3 Example 3

In this example, we report the performance of the inexact APG method on the linearly constrained QSDP problem (1). The linear operator Q is given by Q(x) = (1/2)(Bx + xB) for a given B ⪰ 0, and the linear map A is given by A(x) = diag(x). We generate a positive definite matrix x and set b = A(x). Similarly, we generate a random vector p ∈ R^m and a positive definite matrix z, and set c = A^*(p) + z − Q(x). The MATLAB code for generating the matrix B is given by: randvec = 1 + 9*rand(n,1); tmp = randn(n,ceil(n/4)); B = diag(randvec) + (tmp*tmp')/n; B = (B+B')/2. Note that the matrix B generated is rather well conditioned.
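For reference, the following hedged MATLAB sketch puts the recipe just described into a self-contained script generating an Example-3-type QSDP instance of (1) with Q(x) = (B*x + x*B)/2 and A(x) = diag(x); the variable names are mine and the random data of course differs from the paper's instances.

n = 200;
randvec = 1 + 9*rand(n,1);
tmp = randn(n,ceil(n/4));
B = diag(randvec) + (tmp*tmp')/n;  B = (B + B')/2;   % well-conditioned B

Q = @(x) (B*x + x*B)/2;
m0 = randn(n); X = m0*m0' + eye(n);                  % a positive definite X
b  = diag(X);                                        % b = A(X)
p  = randn(n,1);
m1 = randn(n); Z = m1*m1' + eye(n);                  % a positive definite Z
c  = diag(p) + Z - Q(X);                             % c = A^*(p) + Z - Q(X)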

As discussed in section 3.2, we are able to find a good symmetrized Kronecker product approximation w ⊛ w of Q. By noting that

(1/2)⟨x, w ⊛ w(x)⟩ + ⟨c, x⟩ = (1/2)‖w^{1/2}(x − u)w^{1/2}‖^2 − (1/2)‖w^{-1/2} c w^{-1/2}‖^2,

where u = −w^{-1} c w^{-1}, and dropping the constant term, we propose to solve the following problem to generate a good initial point for the inexact APG method:

min{ (1/2)‖w^{1/2}(x − u)w^{1/2}‖^2 : A(x) = b, x ⪰ 0 },

which can be efficiently solved by the SSNCG method in [11].

n m cond(B) iter/newt R_P R_D pobj dobj time e+0 9/9 3.24e e e e e+0 9/9 3.68e e e e+4 1: e+0 9/9 3.16e e e e+5 8: e+0 9/9 3.32e e e e+5 16: e+0 9/9 2.98e e e e+5 29:02

Table 3: Numerical results of the inexact APG algorithm on (1), where the positive definite matrix B for the linear operator Q is well-conditioned.

The performance results of our IAPG method on convex QSDP problems are given in Table 3, where pobj and dobj are the primal and dual objective values for QSDP, respectively. We may see from the table that the IAPG method can solve all five instances of QSDP problems very efficiently with very good primal infeasibility.

4.4 Example 4

We consider the same problem as in Example 3, but the linear map A is generated by using the first generator in [8] with order p = 3. The positive definite matrix B is generated by using MATLAB's built-in function B = gallery('lehmer',n). The condition number cond(B) of the generated Lehmer matrix B is within the range [n, 4n^2]. For this example, the simple choice of w = λ_max(B) I in the symmetrized Kronecker product w ⊛ w for approximating Q does not work well. In our numerical implementation, we employ the strategy described in section 3.2 to find a suitable w. Table 4 presents the performance results of our IAPG method on convex QSDP problems where the matrix B is very ill-conditioned. As observed from the table, the condition numbers of B are large. We may see from the table that the IAPG method can solve the problem very efficiently with a very accurate approximate optimal solution.
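As a concrete rendering of the initial-point construction described at the beginning of this subsection, the short MATLAB sketch below forms u = −w^{-1} c w^{-1} for a simple scalar choice of w (my own illustration with synthetic data; section 3.2 gives better choices of w).

n = 50;
B = gallery('lehmer',n);
c = randn(n); c = (c + c')/2;
w = max(eig(B))*eye(n);                % the simple scalar choice mentioned in the text
u = -(w\c)/w;                          % u = -inv(w)*c*inv(w)
% the starting point is then argmin{ 0.5*||w^(1/2)*(x-u)*w^(1/2)||^2 : A(x) = b, x >= 0 },
% which is the W-weighted problem (55) and can be solved by the SSNCG method of [11].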

27 n m cond(b) iter/newt R P R D pobj dobj time e+5 51/ e e e e+3 1: e+6 62/ e e e e+4 11: e+6 76/ e e e e+4 1:14: e+6 80/ e e e e+4 2:11:01 Table 4: Same as Table 3, but the matrix B for the linear operator Q is ill-conditioned and the linear map A is randomly generated as in [8]. References [1] A. Bec, and M. Teboulle, A fast iterative shrinage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sciences, 2 (2009), pp [2] V. Chandrasearan, B. Recht, P.A. Parrilo, and A.S. Willsy, The convex geometry of linear inverse problems, prepint, December, [3] F.H. Clare, Optimization and Nonsmooth Analysis, John Wiley and Sons, New Yor, [4] O. Devolder, F. Glineur, and Y. Nesterov, First-order methods of smooth convex optimization with inexact oracle, preprint, [5] N.J. Higham, Computing the nearest correlation matrix a problem from finance, IMA Journal of Numerical Analysis, 22 (2002), [6] L. Li and K.C. Toh, An inexact interior point method for L1-regularized sparse covariance selection, Mathematical Programming Computation, 2 (2010), [7] J. Malic, A dual approach to semidefinite least-squares problems, SIAM Journal on Matrix Analysis and Applications, 26 (2004), [8] J. Malic, J. Povh, F. Rendl, and A. Wiegele, Regularization methods for semidefinite programming, SIAM Journal on Optimization, 20 (2009), [9] Y. Nesterov, A method of solving a convex programming problem with convergence rate O(1/ 2 ), Soviet Mathematics Dolady 27 (1983), [10] Y. Nesterov, Gradient method for minimizing composite objective function, preprint, [11] H. Qi and D.F. Sun, A quadratically convergent Newton method for computing the nearest correlation matrix, SIAM J. on Matrix Analysis and Applications, 28 (2006),


Fast Algorithms for SDPs derived from the Kalman-Yakubovich-Popov Lemma Fast Algorithms for SDPs derived from the Kalman-Yakubovich-Popov Lemma Venkataramanan (Ragu) Balakrishnan School of ECE, Purdue University 8 September 2003 European Union RTN Summer School on Multi-Agent

More information

A two-phase augmented Lagrangian approach for linear and convex quadratic semidefinite programming problems

A two-phase augmented Lagrangian approach for linear and convex quadratic semidefinite programming problems A two-phase augmented Lagrangian approach for linear and convex quadratic semidefinite programming problems Defeng Sun Department of Mathematics, National University of Singapore December 11, 2016 Joint

More information

A Multiple-Cut Analytic Center Cutting Plane Method for Semidefinite Feasibility Problems

A Multiple-Cut Analytic Center Cutting Plane Method for Semidefinite Feasibility Problems A Multiple-Cut Analytic Center Cutting Plane Method for Semidefinite Feasibility Problems Kim-Chuan Toh, Gongyun Zhao, and Jie Sun Abstract. We consider the problem of finding a point in a nonempty bounded

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization

More information

Block Coordinate Descent for Regularized Multi-convex Optimization

Block Coordinate Descent for Regularized Multi-convex Optimization Block Coordinate Descent for Regularized Multi-convex Optimization Yangyang Xu and Wotao Yin CAAM Department, Rice University February 15, 2013 Multi-convex optimization Model definition Applications Outline

More information

Coordinate Update Algorithm Short Course Proximal Operators and Algorithms

Coordinate Update Algorithm Short Course Proximal Operators and Algorithms Coordinate Update Algorithm Short Course Proximal Operators and Algorithms Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 36 Why proximal? Newton s method: for C 2 -smooth, unconstrained problems allow

More information

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization

Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information

Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles

Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles Arkadi Nemirovski H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Joint research

More information

Convex Optimization Boyd & Vandenberghe. 5. Duality

Convex Optimization Boyd & Vandenberghe. 5. Duality 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization

A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization Panos Parpas Department of Computing Imperial College London www.doc.ic.ac.uk/ pp500 p.parpas@imperial.ac.uk jointly with D.V.

More information

Sparse Approximation via Penalty Decomposition Methods

Sparse Approximation via Penalty Decomposition Methods Sparse Approximation via Penalty Decomposition Methods Zhaosong Lu Yong Zhang February 19, 2012 Abstract In this paper we consider sparse approximation problems, that is, general l 0 minimization problems

More information

Rank Constrained Matrix Optimization Problems

Rank Constrained Matrix Optimization Problems Rank Constrained Matrix Optimization Problems Defeng Sun Department of Mathematics and Risk Management Institute National University of Singapore This talk is based on a joint work with Yan Gao at NUS

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT -09 Computational and Sensitivity Aspects of Eigenvalue-Based Methods for the Large-Scale Trust-Region Subproblem Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug

More information

arxiv: v1 [math.oc] 10 May 2016

arxiv: v1 [math.oc] 10 May 2016 Fast Primal-Dual Gradient Method for Strongly Convex Minimization Problems with Linear Constraints arxiv:1605.02970v1 [math.oc] 10 May 2016 Alexey Chernov 1, Pavel Dvurechensky 2, and Alexender Gasnikov

More information

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 Masao Fukushima 2 July 17 2010; revised February 4 2011 Abstract We present an SOR-type algorithm and a

More information

Some Inexact Hybrid Proximal Augmented Lagrangian Algorithms

Some Inexact Hybrid Proximal Augmented Lagrangian Algorithms Some Inexact Hybrid Proximal Augmented Lagrangian Algorithms Carlos Humes Jr. a, Benar F. Svaiter b, Paulo J. S. Silva a, a Dept. of Computer Science, University of São Paulo, Brazil Email: {humes,rsilva}@ime.usp.br

More information

Lagrangian-Conic Relaxations, Part I: A Unified Framework and Its Applications to Quadratic Optimization Problems

Lagrangian-Conic Relaxations, Part I: A Unified Framework and Its Applications to Quadratic Optimization Problems Lagrangian-Conic Relaxations, Part I: A Unified Framework and Its Applications to Quadratic Optimization Problems Naohiko Arima, Sunyoung Kim, Masakazu Kojima, and Kim-Chuan Toh Abstract. In Part I of

More information

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method Optimization Methods and Software Vol. 00, No. 00, Month 200x, 1 11 On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method ROMAN A. POLYAK Department of SEOR and Mathematical

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods Renato D.C. Monteiro B. F. Svaiter May 10, 011 Revised: May 4, 01) Abstract This

More information

A DECOMPOSITION PROCEDURE BASED ON APPROXIMATE NEWTON DIRECTIONS

A DECOMPOSITION PROCEDURE BASED ON APPROXIMATE NEWTON DIRECTIONS Working Paper 01 09 Departamento de Estadística y Econometría Statistics and Econometrics Series 06 Universidad Carlos III de Madrid January 2001 Calle Madrid, 126 28903 Getafe (Spain) Fax (34) 91 624

More information

SIAM Conference on Imaging Science, Bologna, Italy, Adaptive FISTA. Peter Ochs Saarland University

SIAM Conference on Imaging Science, Bologna, Italy, Adaptive FISTA. Peter Ochs Saarland University SIAM Conference on Imaging Science, Bologna, Italy, 2018 Adaptive FISTA Peter Ochs Saarland University 07.06.2018 joint work with Thomas Pock, TU Graz, Austria c 2018 Peter Ochs Adaptive FISTA 1 / 16 Some

More information

Convex Optimization and l 1 -minimization

Convex Optimization and l 1 -minimization Convex Optimization and l 1 -minimization Sangwoon Yun Computational Sciences Korea Institute for Advanced Study December 11, 2009 2009 NIMS Thematic Winter School Outline I. Convex Optimization II. l

More information

Bulletin of the. Iranian Mathematical Society

Bulletin of the. Iranian Mathematical Society ISSN: 1017-060X (Print) ISSN: 1735-8515 (Online) Bulletin of the Iranian Mathematical Society Vol. 41 (2015), No. 5, pp. 1259 1269. Title: A uniform approximation method to solve absolute value equation

More information

Lecture 9: September 28

Lecture 9: September 28 0-725/36-725: Convex Optimization Fall 206 Lecturer: Ryan Tibshirani Lecture 9: September 28 Scribes: Yiming Wu, Ye Yuan, Zhihao Li Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

You should be able to...

You should be able to... Lecture Outline Gradient Projection Algorithm Constant Step Length, Varying Step Length, Diminishing Step Length Complexity Issues Gradient Projection With Exploration Projection Solving QPs: active set

More information

SEMI-SMOOTH SECOND-ORDER TYPE METHODS FOR COMPOSITE CONVEX PROGRAMS

SEMI-SMOOTH SECOND-ORDER TYPE METHODS FOR COMPOSITE CONVEX PROGRAMS SEMI-SMOOTH SECOND-ORDER TYPE METHODS FOR COMPOSITE CONVEX PROGRAMS XIANTAO XIAO, YONGFENG LI, ZAIWEN WEN, AND LIWEI ZHANG Abstract. The goal of this paper is to study approaches to bridge the gap between

More information

Lecture: Duality of LP, SOCP and SDP

Lecture: Duality of LP, SOCP and SDP 1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:

More information

Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014

Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014 Convex Optimization Dani Yogatama School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA February 12, 2014 Dani Yogatama (Carnegie Mellon University) Convex Optimization February 12,

More information

ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications

ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications Professor M. Chiang Electrical Engineering Department, Princeton University March

More information

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min.

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min. MA 796S: Convex Optimization and Interior Point Methods October 8, 2007 Lecture 1 Lecturer: Kartik Sivaramakrishnan Scribe: Kartik Sivaramakrishnan 1 Conic programming Consider the conic program min s.t.

More information

Positive semidefinite matrix approximation with a trace constraint

Positive semidefinite matrix approximation with a trace constraint Positive semidefinite matrix approximation with a trace constraint Kouhei Harada August 8, 208 We propose an efficient algorithm to solve positive a semidefinite matrix approximation problem with a trace

More information

A priori bounds on the condition numbers in interior-point methods

A priori bounds on the condition numbers in interior-point methods A priori bounds on the condition numbers in interior-point methods Florian Jarre, Mathematisches Institut, Heinrich-Heine Universität Düsseldorf, Germany. Abstract Interior-point methods are known to be

More information

Primal-dual first-order methods with O(1/ǫ) iteration-complexity for cone programming

Primal-dual first-order methods with O(1/ǫ) iteration-complexity for cone programming Mathematical Programming manuscript No. (will be inserted by the editor) Primal-dual first-order methods with O(1/ǫ) iteration-complexity for cone programming Guanghui Lan Zhaosong Lu Renato D. C. Monteiro

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems

An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems arxiv:1712.05910v1 [math.oc] 16 Dec 2017 An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems Yangjing Zhang Ning Zhang Defeng Sun Kim-Chuan Toh December 15, 2017 Abstract.

More information

10 Numerical methods for constrained problems

10 Numerical methods for constrained problems 10 Numerical methods for constrained problems min s.t. f(x) h(x) = 0 (l), g(x) 0 (m), x X The algorithms can be roughly divided the following way: ˆ primal methods: find descent direction keeping inside

More information

Homework 4. Convex Optimization /36-725

Homework 4. Convex Optimization /36-725 Homework 4 Convex Optimization 10-725/36-725 Due Friday November 4 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Low-rank matrix recovery via convex relaxations Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

A polynomial-time inexact primal-dual infeasible path-following algorithm for convex quadratic SDP

A polynomial-time inexact primal-dual infeasible path-following algorithm for convex quadratic SDP A polynomial-time inexact primal-dual infeasible path-following algorithm for convex quadratic SDP Lu Li, and Kim-Chuan Toh April 6, 2009 Abstract Convex quadratic semidefinite programming (QSDP) has been

More information

An Augmented Lagrangian Dual Approach for the H-Weighted Nearest Correlation Matrix Problem

An Augmented Lagrangian Dual Approach for the H-Weighted Nearest Correlation Matrix Problem An Augmented Lagrangian Dual Approach for the H-Weighted Nearest Correlation Matrix Problem Houduo Qi and Defeng Sun March 3, 2008 Abstract In [15], Higham considered two types of nearest correlation matrix

More information

arxiv: v1 [math.oc] 13 Dec 2018

arxiv: v1 [math.oc] 13 Dec 2018 A NEW HOMOTOPY PROXIMAL VARIABLE-METRIC FRAMEWORK FOR COMPOSITE CONVEX MINIMIZATION QUOC TRAN-DINH, LIANG LING, AND KIM-CHUAN TOH arxiv:8205243v [mathoc] 3 Dec 208 Abstract This paper suggests two novel

More information

Primal-Dual Interior-Point Methods

Primal-Dual Interior-Point Methods Primal-Dual Interior-Point Methods Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 Outline Today: Primal-dual interior-point method Special case: linear programming

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

A direct formulation for sparse PCA using semidefinite programming

A direct formulation for sparse PCA using semidefinite programming A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley A. d Aspremont, INFORMS, Denver,

More information

Iteration-complexity of first-order penalty methods for convex programming

Iteration-complexity of first-order penalty methods for convex programming Iteration-complexity of first-order penalty methods for convex programming Guanghui Lan Renato D.C. Monteiro July 24, 2008 Abstract This paper considers a special but broad class of convex programing CP)

More information

Research Article Solving the Matrix Nearness Problem in the Maximum Norm by Applying a Projection and Contraction Method

Research Article Solving the Matrix Nearness Problem in the Maximum Norm by Applying a Projection and Contraction Method Advances in Operations Research Volume 01, Article ID 357954, 15 pages doi:10.1155/01/357954 Research Article Solving the Matrix Nearness Problem in the Maximum Norm by Applying a Projection and Contraction

More information

Gradient based method for cone programming with application to large-scale compressed sensing

Gradient based method for cone programming with application to large-scale compressed sensing Gradient based method for cone programming with application to large-scale compressed sensing Zhaosong Lu September 3, 2008 (Revised: September 17, 2008) Abstract In this paper, we study a gradient based

More information

A Smoothing Newton Method for Solving Absolute Value Equations

A Smoothing Newton Method for Solving Absolute Value Equations A Smoothing Newton Method for Solving Absolute Value Equations Xiaoqin Jiang Department of public basic, Wuhan Yangtze Business University, Wuhan 430065, P.R. China 392875220@qq.com Abstract: In this paper,

More information

A direct formulation for sparse PCA using semidefinite programming

A direct formulation for sparse PCA using semidefinite programming A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon

More information

Copositive Programming and Combinatorial Optimization

Copositive Programming and Combinatorial Optimization Copositive Programming and Combinatorial Optimization Franz Rendl http://www.math.uni-klu.ac.at Alpen-Adria-Universität Klagenfurt Austria joint work with I.M. Bomze (Wien) and F. Jarre (Düsseldorf) IMA

More information

Math 273a: Optimization Overview of First-Order Optimization Algorithms

Math 273a: Optimization Overview of First-Order Optimization Algorithms Math 273a: Optimization Overview of First-Order Optimization Algorithms Wotao Yin Department of Mathematics, UCLA online discussions on piazza.com 1 / 9 Typical flow of numerical optimization Optimization

More information

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming CSC2411 - Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming Notes taken by Mike Jamieson March 28, 2005 Summary: In this lecture, we introduce semidefinite programming

More information

A projection method based on KKT conditions for convex quadratic semidefinite programming with nonnegative constraints

A projection method based on KKT conditions for convex quadratic semidefinite programming with nonnegative constraints A projection method based on KKT conditions for convex quadratic semidefinite programming with nonnegative constraints Xiaokai Chang Sanyang Liu Pengjun Zhao December 4, 07 Abstract The dual form of convex

More information

Multi-stage convex relaxation approach for low-rank structured PSD matrix recovery

Multi-stage convex relaxation approach for low-rank structured PSD matrix recovery Multi-stage convex relaxation approach for low-rank structured PSD matrix recovery Department of Mathematics & Risk Management Institute National University of Singapore (Based on a joint work with Shujun

More information

Applications of Linear Programming

Applications of Linear Programming Applications of Linear Programming lecturer: András London University of Szeged Institute of Informatics Department of Computational Optimization Lecture 9 Non-linear programming In case of LP, the goal

More information

On the Moreau-Yosida regularization of the vector k-norm related functions

On the Moreau-Yosida regularization of the vector k-norm related functions On the Moreau-Yosida regularization of the vector k-norm related functions Bin Wu, Chao Ding, Defeng Sun and Kim-Chuan Toh This version: March 08, 2011 Abstract In this paper, we conduct a thorough study

More information

Proximal Newton Method. Zico Kolter (notes by Ryan Tibshirani) Convex Optimization

Proximal Newton Method. Zico Kolter (notes by Ryan Tibshirani) Convex Optimization Proximal Newton Method Zico Kolter (notes by Ryan Tibshirani) Convex Optimization 10-725 Consider the problem Last time: quasi-newton methods min x f(x) with f convex, twice differentiable, dom(f) = R

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

A Unified Approach to Proximal Algorithms using Bregman Distance

A Unified Approach to Proximal Algorithms using Bregman Distance A Unified Approach to Proximal Algorithms using Bregman Distance Yi Zhou a,, Yingbin Liang a, Lixin Shen b a Department of Electrical Engineering and Computer Science, Syracuse University b Department

More information

Complexity Analysis of Interior Point Algorithms for Non-Lipschitz and Nonconvex Minimization

Complexity Analysis of Interior Point Algorithms for Non-Lipschitz and Nonconvex Minimization Mathematical Programming manuscript No. (will be inserted by the editor) Complexity Analysis of Interior Point Algorithms for Non-Lipschitz and Nonconvex Minimization Wei Bian Xiaojun Chen Yinyu Ye July

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information

A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function

A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function Zhongyi Liu, Wenyu Sun Abstract This paper proposes an infeasible interior-point algorithm with

More information

Nonlinear Optimization for Optimal Control

Nonlinear Optimization for Optimal Control Nonlinear Optimization for Optimal Control Pieter Abbeel UC Berkeley EECS Many slides and figures adapted from Stephen Boyd [optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9 11 [optional]

More information

Homotopy methods based on l 0 norm for the compressed sensing problem

Homotopy methods based on l 0 norm for the compressed sensing problem Homotopy methods based on l 0 norm for the compressed sensing problem Wenxing Zhu, Zhengshan Dong Center for Discrete Mathematics and Theoretical Computer Science, Fuzhou University, Fuzhou 350108, China

More information