A Constraint-Reduced Algorithm for Semidefinite Optimization Problems with Superlinear Convergence

A Constraint-Reduced Algorithm for Semidefinite Optimization Problems with Superlinear Convergence

Sungwoo Park

February 14, 2016

Abstract. Constraint reduction is an essential technique because it can substantially lower the computational cost of interior point methods. Park and O'Leary proposed a constraint-reduced predictor-corrector algorithm for semidefinite programming with polynomial global convergence, but they did not show its superlinear convergence. We develop the first constraint-reduced algorithm for semidefinite programming that has both polynomial global and superlinear local convergence. The new algorithm repeats a corrector step so that the iterate approaches the central path tangentially, and this tangential convergence is what enables the superlinear rate. We prove the convergence rate and demonstrate the resulting cost savings in numerical experiments.

Keywords: Semidefinite programming, Interior point methods, Constraint reduction, Primal-dual infeasible, Local convergence.
AMS Classification: 90C22, 65K05, 90C51

1 Introduction

Constraint reduction methods originated from the question of whether we can save computational cost by ignoring a subset of constraints during the iterations of interior point methods (IPMs). For semidefinite programming (SDP), constructing the Schur complement matrix is the most expensive part of each iteration of most IPMs (see [1]). A well-designed constraint reduction method can reduce this cost by ignoring unimportant constraints without degrading the convergence rate. This paper studies a constraint-reduced IPM for SDP that has both polynomial global and superlinear local convergence.

I would like to thank Professor Dianne P. O'Leary for her careful review of the manuscript and insightful advice that enhanced the convergence analysis.

KCG Holdings, 545 Washington Blvd, Jersey City, NJ 07310, USA. swpark81@gmail.com

For IPMs for SDP problems, several search directions have been proposed: the HKM direction (Helmberg-Rendl-Vanderbei-Wolkowicz / Kojima-Shindoh-Hara / Monteiro) [2, 3, 4], the AHO direction (Alizadeh-Haeberly-Overton) [1], and the NT direction (Nesterov-Todd) [5]. SDP algorithms [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] adopt different directions and have different global and local convergence rates. For example, Potra et al. [14] developed a predictor-corrector infeasible IPM algorithm using the HKM direction with polynomial global convergence and superlinear local convergence. Kojima et al. [8] proposed a modified algorithm that repeats a corrector step to achieve tangential convergence, by which the fast-centering assumption in [14] can be removed. Later, Potra et al. [13] proved the local convergence of these algorithms without the nondegeneracy assumption used in [8].

Many practical SDP packages [16, 17] exploit a block diagonal structure to avoid unnecessary computations for off-diagonal blocks. More recently, Fukuda et al. [18] developed two algorithms: one converts a sparse SDP into one with multiple small block variables using positive semidefinite matrix completion, and the other incorporates the completion method into a primal-dual IPM. Our constraint-reduced algorithm adaptively excludes unnecessary block constraints to save computational cost. From the algorithmic perspective, our method is therefore potentially applicable on top of these methods that exploit block structure.

There have been many efforts to apply constraint reduction to optimization problems, for example linear programming (LP) [19, 20, 21, 22, 23, 24, 25], support vector machines (SVM) [26, 27], and quadratic programming (QP) [28]. Most recently, Park and O'Leary [29, 30] established a constraint-reduced predictor-corrector algorithm for block diagonal SDP using a constraint-reduced HKM direction. They proved polynomial global convergence of their algorithm, but they did not show superlinear convergence. This paper extends their study: we develop a new constraint-reduced algorithm having not only polynomial global convergence but also superlinear local convergence. We use the idea of repeating the corrector step to achieve tangential convergence, so our algorithm is a constraint-reduced version of the algorithms of Kojima et al. and Potra et al. [8, 13]. To the author's best knowledge, this is the first constraint-reduced IPM for SDP that achieves superlinear local convergence.

This paper is organized as follows. In Sect. 2, we introduce the block-constrained SDP problem. In Sect. 3, we summarize the constraint-reduced predictor-corrector algorithm established by Park and O'Leary [29, 30] and its polynomial convergence properties, which will be used later in the local convergence analysis. In Sect. 4, we propose a new algorithm that adopts new constraint reduction criteria and show its superlinear convergence. In Sect. 5, we demonstrate through numerical experiments how effectively constraint reduction saves computational cost. Finally, in Sect. 6, we conclude with a summary of this study and future work.

2 Block-constrained Semidefinite Programming

We frequently use the notation of Table 1 in this paper. The primal and dual SDP problems are defined as

    Primal SDP:   min_X  C • X   s.t.  A_i • X = b_i  for i = 1, ..., m,   X ⪰ 0,          (1)

    Dual SDP:     max_{y,Z}  b^T y   s.t.  Σ_{i=1}^m y_i A_i + Z = C,   Z ⪰ 0,            (2)

where C ∈ S^n, A_i ∈ S^n, X ∈ S^n, and Z ∈ S^n. We focus on problems in which the matrices A_i and C are block diagonal:

    A_i = diag(A_{i1}, ..., A_{ip}),    C = diag(C_1, ..., C_p),

where A_{ij} ∈ S^{n_j} and C_j ∈ S^{n_j} for i = 1, ..., m and j = 1, ..., p. Many SDP problems have this block diagonal structure; see [31, Section 4.4.1] for examples. We let X_j and Z_j denote the corresponding blocks of X and Z. Because of the block structure, we can decompose each constraint in (1) and (2) into p independent block constraints.

Block diagonal SDP is of interest because the Schur complement can be broken down into matrices corresponding to each diagonal block, and constraint-reduced algorithms save computation by ignoring the unnecessary ones. We discuss the relation between the block structure and constraint reduction of the Schur complement matrix in Section 3.

In the iterations of an IPM, some block constraints make relatively insignificant contributions to determining the search direction. We call these inactive constraints. The goal of constraint reduction is to save computational cost by ignoring such block constraints while preserving the convergence properties of the original IPM. In Sect. 4 we introduce rigorous criteria for selecting the inactive block constraints so as to achieve superlinear convergence.

Without loss of generality, we assume that we can partition the matrices into the active blocks and the inactive blocks, with appropriate reordering, so that

    A_i = [ Â_i  0 ;  0  Ã_i ],    C = [ Ĉ  0 ;  0  C̃ ],

where Â_i and Ĉ contain the active blocks while Ã_i and C̃ contain the inactive blocks. In the same way, we partition the optimization variables X and Z into X̂, X̃, Ẑ, and Z̃. In addition, let A ∈ R^{m×n²} denote the matrix whose i-th row is vec(A_i)^T. The matrix A can then be partitioned into the active part Â and the inactive part Ã, whose rows are vec(Â_i)^T and vec(Ã_i)^T, so A = [Â, Ã]. We also define matrices A_j whose rows are vec(A_{ij})^T, for j = 1, ..., p. There is a computational benefit in using svec(·) instead of vec(·), because duplicate computations for the off-diagonal elements of the symmetric A_i can be avoided; for simplicity of notation, however, we use vec(·) for the rest of this paper.
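To make the vec/svec distinction concrete, the sketch below shows one common svec convention (off-diagonal entries scaled by √2 so that inner products of symmetric matrices are preserved). The scaling convention and helper names are assumptions of this sketch for illustration only; the paper only states that duplicate off-diagonal work is avoided.

```python
import numpy as np

def vec(X):
    """Stack the columns of X into a single vector."""
    return X.reshape(-1, order="F")

def svec(K):
    """Symmetric vectorization (one common convention): keep each distinct
    entry of the symmetric matrix K once, scaling off-diagonal entries by
    sqrt(2) so that svec(A) @ svec(B) == trace(A @ B) for symmetric A, B."""
    idx = np.tril_indices(K.shape[0])
    scale = np.where(idx[0] == idx[1], 1.0, np.sqrt(2.0))
    return scale * K[idx]

# Quick check of the inner-product property on random symmetric matrices.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2
B = rng.standard_normal((4, 4)); B = (B + B.T) / 2
assert np.isclose(svec(A) @ svec(B), np.trace(A @ B))
assert np.isclose(vec(A) @ vec(B), np.trace(A @ B.T))
```

Note that svec(K) has length n(n+1)/2 rather than n², which is the source of the saving mentioned above.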

Table 1: Notation for the SDP.

    S^n                          The set of n×n symmetric matrices
    S^n_+                        The set of n×n symmetric positive semidefinite matrices
    S^n_{++}                     The set of n×n symmetric positive definite matrices
    X ≻ 0                        A positive definite matrix
    X ⪰ 0                        A positive semidefinite matrix
    A • B = tr(AB^T)             The dot product of matrices
    µ = (X • Z)/n                The duality gap
    x = vec(X)                   The vectorization of a matrix X, a stack of the columns of X
    k = svec(K)                  The symmetric vectorization of a symmetric matrix K
    mat(x)                       The inverse of vec(X)
    symm(X) = (X + X^T)/2        The symmetric part of X
    G ⊗ H                        The Kronecker product of matrices G and H
    ||A||                        The 2-norm of a matrix A
    ||A||_F = (Σ_{ij} a_ij²)^{1/2}   The Frobenius norm of a matrix A
    G_k = O(1)                   ∃ Γ > 0 such that ||G_k|| ≤ Γ
    G_k = Ω(1)                   ∃ Γ > 0 such that 1/Γ ≤ ||G_k|| ≤ Γ
    G_k = O(η_k)                 ||G_k|| / η_k = O(1)
    G_k = Ω(η_k)                 ||G_k|| / η_k = Ω(1)

3 Constraint-reduced SDP Method

3.1 Constraint-reduced HKM direction

In this section, we introduce the constraint-reduced HKM direction. Throughout this paper, we assume the Slater condition.

Assumption 3.1 (Slater condition). There exists a primal and dual feasible point (X, y, Z) such that X ≻ 0 and Z ≻ 0.

Under this assumption, (X, y, Z) is a solution of (1) and (2) if and only if

    A_i • X = b_i  for i = 1, ..., m,                                                      (3)
    (Σ_{i=1}^m y_i A_i) + Z = C,                                                           (4)
    XZ = 0,                                                                                (5)
    X ⪰ 0,  Z ⪰ 0.                                                                         (6)
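An interior point method works with iterates that violate (3)-(5), and the next step is to quantify those violations as residuals. The following minimal numpy sketch computes the primal residuals, the dual residual matrix, and the duality gap for a candidate point; the helper name and data layout are ours, not the paper's.

```python
import numpy as np

def residuals(A_list, b, C, X, y, Z):
    """Violation of the optimality conditions (3)-(5) at a point (X, y, Z):
    primal residuals, dual residual matrix, and duality gap (a sketch)."""
    r_p = np.array([b_i - np.trace(A_i @ X) for A_i, b_i in zip(A_list, b)])
    R_d = C - Z - sum(y_i * A_i for y_i, A_i in zip(y, A_list))
    mu = np.trace(X @ Z) / X.shape[0]          # duality gap (X • Z) / n
    return r_p, R_d, mu
```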

For a given current iterate (X, y, Z), the primal, dual, and complementarity residuals are defined as

    r_{p,i} = b_i − A_i • X   for i = 1, ..., m,
    R_d = C − Z − Σ_{i=1}^m y_i A_i,
    R_c = µI − XZ,

where µ is a target duality gap. The HKM direction (ΔX, Δy, ΔZ) ∈ S^n × R^m × S^n can be found by solving the following equations:

    A_i • ΔX = r_{p,i}   for i = 1, ..., m,                                                (7)
    (Σ_{i=1}^m Δy_i A_i) + ΔZ = R_d,                                                       (8)
    symm( Z^{1/2}(XΔZ + ΔXZ)Z^{−1/2} ) = µI − Z^{1/2} X Z^{1/2}.                           (9)

Kojima et al. [3, Theorem 4.2] and Monteiro [4, Lemma 2.1 ff.] showed that the equations above have a unique solution for (X, Z) ∈ S^n_{++} × S^n_{++}. Alternatively, we can obtain a solution of (7)-(9) by solving a reduced equation,

    M Δy = g,                                                                              (10)

where

    M = M̂ + M̃,    M̂ = Â (X̂ ⊗ Ẑ^{−1}) Â^T,    M̃ = Ã (X̃ ⊗ Z̃^{−1}) Ã^T,
    g = r_p + A(X ⊗ Z^{−1}) r_d − A(I ⊗ Z^{−1}) r_c,

with r_d = vec(R_d) and r_c = vec(R_c). When solving (10), the most computationally expensive part is constructing the Schur complement matrix M, at O(mn³ + m²n²) flops, which is even more expensive than its Cholesky decomposition, O(m³), as Alizadeh et al. [1] explained. For a detailed flop count analysis of constructing the Schur complement matrix, see Fujisawa et al. [32]. To save computational cost, Park and O'Leary [29, 30] proposed to solve the equations below:

    M̂ Δy = g,                                                                             (11)
    ΔX = symm( mat( [ (I ⊗ Ẑ^{−1}) r̂_c − (X̂ ⊗ Ẑ^{−1})(r̂_d − Â^T Δy) ;
                      (I ⊗ Z̃^{−1}) r̃_c − (X̃ ⊗ Z̃^{−1}) r̃_d ] ) ),                        (12)
    ΔZ = R_d − Σ_{i=1}^m Δy_i A_i,                                                         (13)

where hats and tildes on the residual vectors denote their active and inactive parts.
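The reduction acts block by block on the Schur complement: entry (i, k) of block j's contribution is tr(A_{ij} Z_j^{−1} A_{kj} X_j), so skipping a block removes its entire term from the sum used in (11). The numpy sketch below illustrates this assembly; the function name, array layout, and the use of an explicit inverse are assumptions of this sketch, not the paper's MATLAB implementation (linked in Sect. 5).

```python
import numpy as np

def reduced_schur_complement(A_blocks, X_blocks, Z_blocks, active):
    """Assemble M_hat = sum over active blocks j of A_j (X_j kron Z_j^{-1}) A_j^T.

    A_blocks[j] has shape (m, n_j, n_j) and holds the j-th diagonal block of
    A_1, ..., A_m; X_blocks[j] and Z_blocks[j] are the corresponding blocks of
    the current iterate.  Entry (i, k) of block j's contribution is
    trace(A_ij Z_j^{-1} A_kj X_j), so the Kronecker product is never formed."""
    m = A_blocks[0].shape[0]
    M_hat = np.zeros((m, m))
    for j in active:                       # inactive blocks are simply skipped
        Aj, Xj, Zj = A_blocks[j], X_blocks[j], Z_blocks[j]
        Wj = np.linalg.inv(Zj) @ Aj @ Xj   # Wj[k] = Z_j^{-1} A_kj X_j
        M_hat += np.einsum("iab,kba->ik", Aj, Wj)   # += trace(A_ij @ Wj[k])
    return M_hat
```

Passing all block indices in `active` recovers the full matrix M; the constraint-reduced direction simply solves (11) with the partial sum M̂.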

Compared to (10), equation (11) replaces M with M̂ to avoid the computation of M̃; the constraints associated with M̃ are thereby implicitly reduced, so we call the resulting direction a constraint-reduced HKM direction. Further discussion of efficient ways to update M̂ and to solve the linear system (11) can be found in Park and O'Leary [29, Section 2.3]. The following lemma explains how constraint reduction affects the original HKM direction.

Lemma 3.1 (Constraint-reduced HKM direction). A solution (ΔX, Δy, ΔZ) ∈ S^n × R^m × S^n of (11)-(13) satisfies (7), (8), and the perturbed complementarity equation

    symm( Z^{1/2}(XΔZ + ΔXZ)Z^{−1/2} ) = µI − Z^{1/2}(X + ΔX_ǫ)Z^{1/2},                    (9*)

where

    ΔX_ǫ = symm( [ 0  0 ;  0  mat( (X̃ ⊗ Z̃^{−1}) Ã^T Δy ) ] ).

Proof. See Park and O'Leary [30, Lemma 2.1 ff.].

As a result of constraint reduction, a perturbation term ΔX_ǫ is added to the complementarity equation (9*).

3.2 Constraint-reduced Algorithm

We now summarize Algorithm SDP:Reduced, the constraint-reduced predictor-corrector IPM of Park and O'Leary [30], which uses the constraint-reduced HKM direction. The algorithm defines a constant ρ based on the unknown optimal X* and Z*. Our convergence results hold for any ρ > 0, and in a practical implementation we can choose ρ based on the given input matrices, as discussed in Toh et al. [16, Section 3.4]. We develop a modified algorithm in Sect. 4.

First, we establish a few essential pieces of notation to explain the algorithm. We define the set of feasible solutions, the set of optimal solutions, and a neighborhood N(γ, τ) of the central path as

    F  = { (X, y, Z) ∈ S^n_+ × R^m × S^n_+ : (X, y, Z) satisfies (3) and (4) },
    F* = { (X, y, Z) ∈ F : X • Z = 0 },
    N(γ, τ) = { (X, Z) ∈ S^n_{++} × S^n_{++} : ||Z^{1/2} X Z^{1/2} − τI||_F ≤ γτ }.

The algorithm uses two fixed positive parameters α and β satisfying

    β² / (2(1 − β)²) < α < β ≤ β/(1 − β) < 1.                                              (14)

For example, we can choose (α, β) = (0.17, 0.3). For a starting point, we can use the standard choice (X⁰, y⁰, Z⁰) = (ρ_p I, 0, ρ_d I) with ρ_p > 0 and ρ_d > 0, which satisfies (X⁰, Z⁰) ∈ N(α, τ⁰) as required in Step 2 of the algorithm.
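Membership in N(γ, τ) is a single norm test on the scaled matrix Z^{1/2} X Z^{1/2}. A minimal numpy sketch of that test follows; the helper name and the eigendecomposition-based square root are our choices, and a careful implementation would reuse a Cholesky factor of Z instead.

```python
import numpy as np

def in_neighborhood(X, Z, tau, gamma):
    """Test (X, Z) in N(gamma, tau), i.e. ||Z^{1/2} X Z^{1/2} - tau*I||_F <= gamma*tau,
    for symmetric positive definite X and Z (a plain sketch)."""
    n = X.shape[0]
    w, V = np.linalg.eigh(Z)
    Z_half = (V * np.sqrt(w)) @ V.T            # symmetric square root of Z
    S = Z_half @ X @ Z_half
    return np.linalg.norm(S - tau * np.eye(n), "fro") <= gamma * tau

# The standard starting point X0 = rho_p*I, Z0 = rho_d*I lies exactly on the
# central path, so it is in N(alpha, tau0) when tau0 = rho_p*rho_d = X0•Z0/n.
rho_p, rho_d, n = 2.0, 3.0, 5
assert in_neighborhood(rho_p * np.eye(n), rho_d * np.eye(n),
                       tau=rho_p * rho_d, gamma=0.17)
```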

Algorithm 1 SDP:Reduced (Primal-Dual Infeasible Constraint-Reduced Predictor-Corrector Algorithm for Block Diagonal SDP)

1. Input: A, b, C; α and β satisfying (14); convergence tolerance τ*; ω ∈ (0, 0.5), the perturbation bound for the primal direction in the predictor step.

2. Choose (X⁰, y⁰, Z⁰) such that (X⁰, Z⁰) ∈ N(α, τ⁰), and set τ = τ⁰.

3. Repeat until τ < τ*: for k = 0, 1, ...,

   (a) (Predictor step): Set (X, y, Z) = (X^k, y^k, Z^k), r_p = r_p^k, r_d = r_d^k, and τ = τ_k.
       i.  Find M̂_p̂ = Σ_{j=1}^{p̂} A_j (X_j ⊗ Z_j^{−1}) A_j^T such that p̂ ≤ p, where p is the number of blocks, M̂_p̂ is full rank, and Condition 3.1 is satisfied.
       ii. Solve (11) with M̂ = M̂_p̂ and µ = 0 in r_c to find (ΔX, Δy, ΔZ) satisfying (7), (8), and (9*). Choose a step length θ ∈ [θ̲, θ̄] defined by (19) and (20). Set X̄ = X + θΔX, ȳ = y + θΔy, Z̄ = Z + θΔZ.
       iii. If θ = 1, terminate the iteration with optimal solution (X̄, ȳ, Z̄).

   (b) (Corrector step): Set τ = (1 − θ)τ.
       i.  Find M̂_p̂ = Σ_{j=1}^{p̂} A_j (X̄_j ⊗ Z̄_j^{−1}) A_j^T such that p̂ ≤ p, where p is the number of blocks, M̂_p̂ is full rank, and Condition 3.2 is satisfied.
       ii. Solve (11) with M̂ = M̂_p̂, r_p = 0, r_d = 0, and µ = τ in r_c to find (ΔX, Δy, ΔZ) satisfying (7), (8), and (9*). Take a full step: X⁺ = X̄ + ΔX, y⁺ = ȳ + Δy, Z⁺ = Z̄ + ΔZ.

   (c) Update (X^{k+1}, y^{k+1}, Z^{k+1}) = (X⁺, y⁺, Z⁺), r_p^{k+1} = b − A x^{k+1}, r_d^{k+1} = c − z^{k+1} − A^T y^{k+1}, and τ_{k+1} = τ.

In the algorithm, both the predictor and the corrector step solve the constraint-reduced equations (11)-(13), but with different settings of r_p, r_d, and r_c. First, for a given (X, y, Z) ∈ N(α, τ), the predictor step finds a solution (ΔX, Δy, ΔZ) by setting µ = 0 in r_c and updates the iterate as (X̄, ȳ, Z̄) = (X, y, Z) + θ(ΔX, Δy, ΔZ), where θ is the predictor's step size. Second, using (X̄, ȳ, Z̄) in place of (X, y, Z) in the equations, the corrector step finds (ΔX, Δy, ΔZ) by setting (r_p, r_d, µ) = (0, 0, (1 − θ)τ) and updates the iterate as (X⁺, y⁺, Z⁺) = (X̄, ȳ, Z̄) + (ΔX, Δy, ΔZ).

For the predictor step's direction (ΔX, Δy, ΔZ), we define

    δ := (1/τ) ||Z^{1/2} ΔX ΔZ Z^{−1/2}||_F.                                               (15)

Additionally, let ΔX_ǫ and ΔX̄_ǫ be the perturbation matrices in (9*), the former for the predictor step and the latter for the corrector step. The perturbations caused by constraint reduction are quantified as

    δ_ǫ := (1/τ) ||Z^{1/2} ΔX_ǫ Z^{1/2}||_F,                                               (16)
    δ̄_ǫ := (1/τ̄) ||Z̄^{1/2} ΔX̄_ǫ Z̄^{1/2}||_F.                                            (17)

Based on these perturbations, the conditions below determine how many block constraints must be included when constructing the Schur complement matrix. We say that the included blocks are active and the others inactive.

Condition 3.1 (Requirement on the predictor step's perturbation). For a given constant ω,

    δ_ǫ ≤ ω √τ δ_x,   where 0 < ω < 0.5 and δ_x := ||Z^{1/2} ΔX Z^{1/2}||_F.

Condition 3.2 (Requirement on the corrector step's perturbation).

    δ̄_ǫ < (1 − θ) ( √(s² + t) − s ),   where s = β² − β + 1 and t = 2α(1 − β)² − β².

For the construction of the Schur complement matrix in steps 3(a)i and 3(b)i of Algorithm SDP:Reduced, we could incrementally build the matrix M̂_p̂ as

    M̂_p̂ ← M̂_{p̂−1} + A_p̂ (X_p̂ ⊗ Z_p̂^{−1}) A_p̂^T,

until Condition 3.1 or Condition 3.2 is satisfied. In Algorithm ExploratoryConstruction, we propose two adaptations to improve efficiency. First, Lemma 3.1 suggests adding the blocks in order of decreasing ||X̃_j ⊗ Z̃_j^{−1}||, but for efficiency we add them in order of decreasing ||X_j Z_j^{−1}||. Second, checking the condition after each increment is too expensive, so we make at most four tries. We first form M̂_p̂ for a p̂ somewhat smaller than the one used at the previous iteration. If that matrix fails to satisfy the condition, we add blocks to reach the previous p̂, and then a somewhat larger p̂. If all of those fail, we use all blocks, that is, no reduction. This exploratory construction exploits the fact that inactive blocks at the previous iteration tend to be inactive again.
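Before the procedure is stated formally as Algorithm 2 below, here is a compact sketch of its control flow: at most four condition checks on an incrementally grown partial sum. The callables `block_term` and `condition` and the ratio values are assumptions of this sketch; in particular, the paper's criteria depend on the excluded blocks, which this simplified signature glosses over.

```python
def exploratory_construction(block_term, condition, p_total, p_prev,
                             r_s=0.75, r_e=1.5):
    """Sketch of the exploratory construction.  block_term(j) returns
    A_j (X_j kron Z_j^{-1}) A_j^T for the j-th block, with blocks assumed
    already sorted by decreasing ||X_j Z_j^{-1}||; condition(M, p) checks the
    reduction criterion for the partial sum M built from the first p blocks."""
    candidates = [max(1, int(r_s * p_prev)),        # shrink
                  p_prev,                           # rollback
                  min(p_total, int(r_e * p_prev)),  # expand
                  p_total]                          # full Schur complement
    M, built = None, 0
    for p in candidates:
        if p <= built:
            continue
        for j in range(built, p):                   # incrementally add blocks
            term = block_term(j)
            M = term if M is None else M + term
        built = p
        if p == p_total or condition(M, p):
            return M, p        # reduced Schur complement and number of blocks used
```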

Algorithm 2 ExploratoryConstruction (Exploratory Construction of the Reduced Schur Complement Matrix)

1. Input: r_s < 1, r_e > 1; A_j, X_j, and Z_j for j = 1, ..., p; the number of active blocks p̂_0 at the previous iteration (initially p̂_0 = p); a constraint reduction condition C (Condition 3.1 or Condition 3.2).

2. Sort the block constraints in decreasing order of ||X_j Z_j^{−1}||.

3. (Shrink step) Build M̂_p̂ for p̂ = max(1, r_s p̂_0). If M̂_p̂ satisfies C, return M̂_p̂.

4. (Rollback step) Incrementally build M̂_p̂ for p̂ = p̂_0 from the M̂_p̂ of step 3. If M̂_p̂ satisfies C, return M̂_p̂.

5. (Expand step) Incrementally build M̂_p̂ for p̂ = min(p, r_e p̂_0) from the M̂_p̂ of step 4. If M̂_p̂ satisfies C, return M̂_p̂.

6. (Full Schur complement) Incrementally build M̂_p̂ for p̂ = p from the M̂_p̂ of step 5, and return M̂_p̂.

Later, in Lemma 4.4, the asymptotic behavior of X and Z will make this property precise.

Park and O'Leary [29, 30] presented a range for the predictor's step size θ:

    θ̲ ≤ θ ≤ θ̄,                                                                          (18)

where

    θ̲ = [ (α − β − δ_ǫ) + √( (α − β − δ_ǫ)² + 4δ(β − α) ) ] / (2δ),                       (19)
    θ̄ = max{ θ̄ ∈ [0, 1] : (X + θΔX, y + θΔy, Z + θΔZ) ∈ N(β, (1 − θ)τ) for all θ ∈ [0, θ̄] }.   (20)

They proved polynomial convergence using the following results.

Lemma 3.2 (After the predictor step). For (X, Z) ∈ N(α, τ), after the predictor step,

    r_p⁺ = (1 − θ) r_p,    r_d⁺ = (1 − θ) r_d.

In addition, if θ < 1, then (X̄, Z̄) ∈ N(β, τ⁺), where τ⁺ = τ̄ = (1 − θ)τ. Otherwise, (X̄, ȳ, Z̄) is a solution of the SDP.

Proof. See Park and O'Leary [30, Lemmas 3.1-3.2].

Algorithm 3 SDP:ReducedLocal (Modified Primal-Dual Infeasible Constraint-Reduced Predictor-Corrector Algorithm for Block Diagonal SDP with a repeated corrector step)

Steps 1 and 2 are the same as in SDP:Reduced, but using α, β, and ζ satisfying (14) and (23).

3. Repeat until τ < τ*: for k = 0, 1, ...,

   (a) (Predictor step): Same as in SDP:Reduced, but replacing Condition 3.1 with Condition 4.1.

   (b) (Corrector step): Initially set (X_0, y_0, Z_0) = (X̄, ȳ, Z̄). Repeat step 3(b) of SDP:Reduced until (X_q, Z_q) ∈ N(min(τ^σ, α), τ) for σ > 0: for q = 0, 1, ..., set (X_{q+1}, y_{q+1}, Z_{q+1}) = (X⁺, y⁺, Z⁺).

   (c) Update (X^{k+1}, y^{k+1}, Z^{k+1}) = (X⁺, y⁺, Z⁺), r_p^{k+1} = b − A x^{k+1}, r_d^{k+1} = c − z^{k+1} − A^T y^{k+1}, and τ_{k+1} = τ.

Lemma 3.3 (After the corrector step). For (X̄, Z̄) ∈ N(β, τ⁺), after the corrector step, (X⁺, Z⁺) ∈ N(α, τ⁺).

Proof. See Park and O'Leary [30, Lemma 3.3].

4 Algorithm with Polynomial and Superlinear Convergence

In this section, we propose a new constraint-reduced algorithm, presented as Algorithm SDP:ReducedLocal, and show its superlinear local convergence. The new algorithm replaces the previous constraint reduction criteria, Conditions 3.1 and 3.2, with Conditions 4.1 and 4.2. In addition, the corrector step has its own inner iteration, so that the iterate approaches the central path tangentially in the sense that

    || (Z^k)^{1/2} X^k (Z^k)^{1/2} − τ_k I ||_F / τ_k  →  0.

We define a few parameters to explain the new algorithm. We use indices k and q to denote quantities at the k-th outer iteration and at the corrector step's q-th inner iteration; when the meaning is obvious from context, we omit the indices to simplify notation. First, at the k-th iteration, we define φ_k as

    φ_k := max( ||(Z^k)^{1/2} X^k (Z^k)^{1/2} − τ_k I||_F / τ_k ,  √τ_k ).                 (21)

Second, we let (X_q, y_q, Z_q) denote the q-th iterate of the corrector step's inner iteration in SDP:ReducedLocal, and we define its relative distance γ_q to the central path as

    γ_q := || (Z_q)^{1/2} X_q (Z_q)^{1/2} − τ̄ I ||_F / τ̄,                                (22)

where τ̄ = (1 − θ)τ. By definition, (X_q, Z_q) ∈ N(γ_q, τ̄) whenever (X_q, Z_q) ∈ S^n_{++} × S^n_{++}. Third, we define a constant ζ such that

    0 < ζ < 1/β − 2.                                                                       (23)

Because 0 < β < 0.5 by (14), such a ζ exists. Using these parameters, we introduce the new constraint reduction criteria.

Condition 4.1 (New requirement on the predictor step's perturbation). For a given positive constant C_ǫ,

    δ_ǫ ≤ min( ω √τ_k δ_x , C_ǫ φ_k ),

where ω and δ_x are as defined in Condition 3.1.

Condition 4.2 (New requirement on the corrector step's perturbation).

    δ̄_ǫ < ζ γ_q² (1 − θ).

As Lemma 3.1 reveals, we can reduce δ_ǫ and δ̄_ǫ by including more constraints in the active set. When all constraints are taken as active, δ_ǫ and δ̄_ǫ become zero and no constraints are reduced. Therefore, we can always satisfy Conditions 4.1 and 4.2 by taking enough block constraints.

4.1 Polynomial Convergence

In this section, we show that SDP:ReducedLocal has polynomial convergence, using the fact that Lemmas 3.2 and 3.3 hold for SDP:Reduced. First, Condition 4.1 is clearly stricter than Condition 3.1, so the result of Lemma 3.2 also holds for SDP:ReducedLocal. Second, the stopping condition of the corrector step guarantees that

    (X⁺, Z⁺) ∈ N( min(τ̄^σ, α), τ̄ ),                                                      (24)

which is even stricter than the result of Lemma 3.3. For the stopping condition to be reached, the corrector step's inner iteration must converge, and its convergence rate must be fast enough for the algorithm to be practical. The following lemmas show that the corrector step's iteration converges at a quadratic rate. Note that initially γ_0 ≤ β because (X̄, Z̄) ∈ N(β, τ̄) by Lemma 3.2.

Lemma 4.1 (Quadratic convergence of γ). Let γ be the relative distance to the central path defined in (22). For γ ≤ β, after each iteration of the corrector step 3(b),

    (X⁺, Z⁺) ∈ N(γ⁺, τ̄)   with   γ⁺ < γ²/β ≤ γ,

and so γ converges quadratically to 0.

Proof. See the Appendix.
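Concretely, the corrector phase of SDP:ReducedLocal can be read as the loop sketched below, with `corrector_step` (returning a solution of (11)-(13) with r_p = 0, r_d = 0, µ = τ̄) and `in_neighborhood` as hypothetical helpers, the latter as in the earlier sketch. Lemma 4.2 below shows that only O(log log(1/τ^σ)) passes are needed, which justifies the small cap on inner iterations.

```python
def repeated_corrector(X, y, Z, tau_bar, sigma, alpha,
                       corrector_step, in_neighborhood, max_inner=50):
    """Inner loop of step 3(b) of SDP:ReducedLocal (a schematic sketch):
    reapply the corrector until (X, Z) is in N(min(tau_bar^sigma, alpha), tau_bar)."""
    target_gamma = min(tau_bar ** sigma, alpha)
    for q in range(max_inner):          # Lemma 4.2: O(log log(1/tau^sigma)) passes suffice
        if in_neighborhood(X, Z, tau_bar, target_gamma):
            break
        dX, dy, dZ = corrector_step(X, y, Z, tau_bar)
        X, y, Z = X + dX, y + dy, Z + dZ     # full corrector step
    return X, y, Z
```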

Lemma 4.2. The corrector step 3(b) in SDP:ReducedLocal requires O(log(log(1/τ^σ))) iterations to satisfy the stopping condition (24), (X_q, Z_q) ∈ N(min(τ̄^σ, α), τ̄).

Proof. See the Appendix.

By Lemma 4.2, the stopping condition (24) of the corrector step is satisfied after a finite number of iterations, so the iterate tangentially approaches the central path through the repeated corrector step. From Lemma 3.2 and (24), SDP:ReducedLocal has polynomial global convergence just like SDP:Reduced.

Theorem 4.1 (Polynomial Global Convergence). After the k-th iteration of Algorithm SDP:ReducedLocal, the iterate (X^k, y^k, Z^k) satisfies

    r_p^k = ψ_k r_p^0,   r_d^k = ψ_k r_d^0,   τ_k = ψ_k τ_0,                               (25)
    (X^k, Z^k) ∈ N(α, τ_k), and                                                            (26)
    (1 − α/√n) τ_k ≤ µ_k = (1/n)(X^k • Z^k) ≤ (1 + α/√n) τ_k,                              (27)

where ψ_k = Π_{i=0}^{k−1} (1 − θ_i). In addition, the step length θ_k is bounded away from zero, so Algorithm SDP:ReducedLocal is globally convergent. Defining ǫ_k := max(X^k • Z^k, ||r_p^k||, ||r_d^k||), Algorithm SDP:ReducedLocal converges in O(n ln(ǫ_0/ǫ)) iterations for a given tolerance ǫ.

Proof. (25) and (26) are direct consequences of Lemma 3.2 and (24). We can derive (27) from (26); see Park and O'Leary [30, Theorem 3.1]. The rest of the result follows from (25)-(27) as in Park and O'Leary [30, Sect. 3.2].

4.2 Superlinear Convergence

We now discuss the asymptotic bounds on X^k and Z^k under a strict complementarity assumption. The notation for asymptotic bounds, such as O(·) and Ω(·), is defined in Table 1. In these bounds we treat the matrix size n as constant, because the iteration-by-iteration change of the matrix norms is what interests us. Using the asymptotic bounds, we prove that SDP:ReducedLocal converges superlinearly.

Assumption 4.1 (Strict Complementarity). The SDP problem has a solution (X*, y*, Z*) such that X* + Z* ≻ 0.

The following proofs are organized as follows. Lemma 4.3 shows that the optimal matrices X* and Z* share eigenvectors. Based on this property, the asymptotic behavior of X^k and Z^k is described in Lemma 4.4. Lemma 4.5 introduces inequalities that will be used frequently in the subsequent proofs. Using Lemmas 4.4 and 4.5, we show an asymptotic bound on (1 − θ_k) in Lemma 4.6. Finally, Theorem 4.2 establishes the superlinear convergence from Theorem 4.1, Lemma 4.4, and Lemma 4.6.

Lemma 4.3. Let (X*, y*, Z*) ∈ F*. Then there exists an orthogonal matrix Q = [q_1, ..., q_n] such that Q^T X* Q and Q^T Z* Q are diagonal matrices.

Proof. See Alizadeh, Haeberly, and Overton [6, Lemma 1 ff.].

Let (X*, y*, Z*) be the strictly complementary solution, and let Q be the orthogonal matrix whose columns are the eigenvectors of X* and Z*. By strict complementarity, we define two sets B and N as

    B := { i : q_i^T X* q_i > 0 },    N := { i : q_i^T Z* q_i > 0 },

where B ∪ N = {1, 2, ..., n} and q_i is the i-th column of Q. We also define a set M* of points (X*, y*, Z*) as

    M* := { (X*, y*, Z*) ∈ F_0 : q_i^T X* q_j = 0 if i or j ∈ N,  q_i^T Z* q_j = 0 if i or j ∈ B },

where F_0 := { (X, y, Z) ∈ S^n × R^m × S^n : (X, y, Z) satisfies (3) and (4) }. Then we consider the following minimization problem:

    min_{X*, y*, Z*}  || (Z^k)^{1/2} (X^k − X*)(Z^k − Z*) (Z^k)^{−1/2} ||_F                (28)
    such that (X*, y*, Z*) ∈ M* and ||[X*, Z*]|| ≤ Γ*,

where Γ* is a constant such that ||[X^k, Z^k]|| ≤ Γ* for all k. Let (X̄^k, ȳ^k, Z̄^k) denote the solution¹ of (28), and define η_k as

    η_k := (1/τ_k) || (Z^k)^{1/2} (X^k − X̄^k)(Z^k − Z̄^k) (Z^k)^{−1/2} ||_F.               (29)

By the definition of M*,

    X̄^k Z̄^k = 0.                                                                         (30)

Now, the lemma below, from Potra et al. [12, 14], gives asymptotic bounds for X^k, Z^k, and η_k.

Lemma 4.4 (Asymptotic Bounds). Let (X^k, y^k, Z^k) denote the iterate satisfying properties (25)-(27) of Theorem 4.1, and let η_k be as defined in (29). Then

    || (X^k)^{1/2} (Z^k)^{1/2} ||² = || (Z^k)^{1/2} X^k (Z^k)^{1/2} || ≤ (1 + α) τ_k.      (31)

¹ See Potra et al. [14, pp. 18-19] for the existence of the solution.

Under Assumption 4.1, we have [ Q T X k ) 1/2 O1) O τ Q = k ) O τ k ) O τ k ) Q T Z k ) 1/2 Q = and [ Q T X k Q = [ O τk ) O τ k ) O τ k ) O1) O1) O τ k ) O τ k ) Oτ k ) ] [, Q T X k ) 1/2 O1) O1) Q = O1) O1/ τ k ) ], Q T Z k ) 1/2 Q = ] [, Q T Z k Oτk ) O τ Q = k ) O τ k ) O1) ], 32) [ ] O1/ τk ) O1), O1) O1) 33) ], 34) η k = Oφ k ). 35) Proof. See Potra et al. [14, Corollary 3.3 and Lemma 4.4] and[12, Theorem 4.2]. Note that, in the course of their proofs, the roles of X k and Z k are switched. As Theorem 4.1 suggests, the superlinear convergence can be established by showing that lim k 1 θ k) 0. Potra et al. [14, Theorem 4.7] proved this by showing that 1 θ k ) = Oη k ). However, their result is not directly applicable to our algorithm due to the perturbation δ ǫ by constraint reduction. Instead, Lemma 4.6 shows that 1 θ k ) = Oη k +δ ǫ ). Then, Theorem4.2willestablishthe superlinearconvergencebyshowingη k 0 and δ ǫ 0. In the proofs, we utilize the following preliminary lemma. Lemma 4.5. or X,Z ) Nγ,τ ) and H R n n, let X, y, Z ) be a solution of Then A i X = 0 for i = 1,...,m, 36) m y ia i + Z = 0, 37) i=1 symm Z 1/2 X Z + X Z )Z 1/2) = H. 38) δ x δ z 1 2 δ 2 x +δ 2 z ) H 2 21 γ) 2, where δ x H 1 γ, δ z H 1 γ, δ x = Z 1/2 X Z 1/2 and δ z = τ Z 1/2 Z Z 1/2. 14

Proof. See Monteiro [4, Lemma 4.4 in p.671], in which the roles of X and Z in H are switched. Lemma 4.6 Similar to [14, Theorem 4.7]). Under Assumption 4.1, 1 θ k ) = Oη k +δ ǫ ). Proof. or simplicity, we omit the index k in the following equations. By Theorem 4.1, X,Z) Nα,τ), and it is easy to show that 36) and 37) in Lemma 4.5 are satisfied by substituting X, y, Z ) = X+X X, y+y y, Z+Z Z), where X, y, Z)isasolutionof28). Thus,wecanuseLemma4.5bysubstituting X,Z ) with X,Z). By using 30) and 9*) in Lemma 3.1 with µ = 0, we can rewrite H in Lemma 4.5 as H = symm Z 1/2 X Z+Z Z)+ X+X X)Z )Z 1/2) = symm Z 1/2 X Z+XZ X Z+ XZ+XZ XZ )Z 1/2) = symm + symm = symm = symm ) ǫ )Z 1/2) Z 1/2 XZ X Z XZ+ X Z Z 1/2 X Z+ XZ+XZ)Z 1/2) X Z = 0 by 30).) Z 1/2 X X)Z Z)Z 1/2) Z 1/2 X ǫ Z 1/2 9 ) with µ = 0) where and ǫ are defined as := Z 1/2 X X)Z Z)Z 1/2 and ǫ := Z 1/2 X ǫ Z 1/2, and, by 16) and 29), Thus, we have ǫ = δ ǫ τ, = η k τ. 39) H = symm ) ǫ + ǫ = η k τ +δ ǫ τ = η k +δ ǫ )τ. We also define x and z as x := Z 1/2 X+X X)Z 1/2 and z := Z 1/2 Z+Z Z)Z 1/2. By Lemma 4.5, because X,Z) Nα,τ), we have δ z = τ z z H 1 α η k +δ ǫ )τ 1 α η k +δ ǫ 1 α. 40) 15

Similarly, δ x = x H 1 α η k +δ ǫ )τ 1 α. 41) Let v i denote columns of of Q T X) 1/2 Q, so Q T X) 1/2 Q = [v 1,...,v n ]. Then, by using 32) in Lemma 4.4 together with X, y, Z) M and [ X, Z] Γ, X) 1/2 X X)X) 1/2 = I X) 1/2 XX) 1/2 I + X) 1/2 XX) 1/2 = n+ Q T X) 1/2 QQ T XQQ T X) 1/2 Q = n+ [v 1,...,v n ]Q T XQ[v1,...,v n ] T = n+ q T Xq i j )v i v T j X, y, Z) M) i,j B = n+γ v i v T j [ X, Z] Γ) i,j B = O1). 32)) 42) Similarly, we can show that Z 1/2 Z Z)Z 1/2 = O1). 43) Next, = = Z 1/2 X ZZ Z 1/2 1/2 XZ 1/2) Z 1/2 ZZ 1/2) x Z 1/2 X X)Z 1/2) z Z 1/2 Z Z)Z 1/2) = x z x Z 1/2 Z Z)Z 1/2 Z 1/2 X X)Z 1/2 z +. Z)Z 1/2 = x z x Z 1/2 Z Z 1/2 X 1/2 ) X 1/2 X X)X 1/2) X 1/2 Z 1/2 ) z +. Thus, by 31) in Lemma 4.4, and 39) 43), we can calculate the upper bound 16

of δ in 15) as δ = 1 τ Z 1/2 X ZZ 1/2 1 τ x z + 1 Z τ x 1/2 Z Z)Z 1/2 + 1 X τ z 1/2 Z 1/2 2 X 1/2 X X)X 1/2 + 1 τ 2 = Oη k +δ ǫ ). 44) By 18) and the definition of θ in 19), 1 θ 1 θ = 1 α β δ ǫ)+ α β δ ǫ ) 2 +4δβ α) 2δ 2β α) = 1 β α+δǫ ) 2 +4δβ α)+β α+δ ǫ ) = = β α+δǫ ) 2 +4δβ α) β α+δ ǫ )+2δ ǫ β α+δǫ ) 2 +4δβ α)+β α+δ ǫ ) 4δβ α) β α+δǫ ) 2 +4δβ α)+β α+δ ǫ )) 2 2δ ǫ + β α+δǫ ) 2 +4δβ α)+β α+δ ǫ ) 4δβ α) β α)+β α)) 2 + 2δ ǫ β α)+β α) = δ +δ ǫ β α. Thus, by 44), 1 θ δ +δ ǫ β α = Oη k +δ ǫ )+Oδ ǫ ) = Oη k +δ ǫ ). inally, we provethe superlinear local convergenceby using that η k = Oφ k ) and δ ǫ = Oφ k ). Theorem 4.2 Superlinear Convergence). Under Assumption 4.1, Algorithm SDP:ReducedLocal converges superlinearly with Q-order of at least 1+minσ,0.5). Proof. Since τ = τ + in 24), we can rewrite it as X k,z k ) Nminτk σ,α),τ k), so Z k ) 1/2 X k Z k ) 1/2 τ k I minτk σ,α)τ k τ k ) 1+σ, Z k ) 1/2 X k Z k ) 1/2 τ k I = Oτk σ τ ). k 17

Thus, by the definition of φ_k,

    φ_k = O( max(τ_k^σ, τ_k^{0.5}) ) = O( τ_k^{min(σ, 0.5)} ).

Then, by Lemma 4.6, Condition 4.1, and (35) in Lemma 4.4,

    (1 − θ_k) = O(η_k + δ_ǫ) = O(φ_k) = O( τ_k^{min(σ, 0.5)} ),

which implies superlinear convergence by Theorem 4.1. Therefore,

    τ_{k+1} = (1 − θ_k) τ_k = O( τ_k^{1 + min(σ, 0.5)} ).

5 Experiments

This section summarizes numerical experiments that evaluate the effectiveness of the constraint-reduced algorithms. MATLAB code for the experiments is available at http://www.mathworks.com/matlabcentral/fileexchange/54117.

To evaluate the constraint-reduced algorithms, we compare their computational costs with an algorithm that uses no constraint reduction. We can disable constraint reduction by constructing the full Schur complement matrix with no condition checks in SDP:ReducedLocal, which is then equivalent to the algorithm of Kojima et al. [8]. We use this unreduced algorithm as our benchmark.

Reducing zero blocks of X makes the perturbations δ_ǫ and δ̄_ǫ zero by Lemma 3.1, so the constraint reduction conditions are immediately satisfied. From this perspective, the number of blocks with X*_j = 0 at the solution (X*, y*, Z*) may determine the applicability of constraint reduction. In this experiment, we consider three groups of SDP problems with different proportions of zero X*_j blocks: 0%, 25%, and 50%. For each group, we randomly generate SDP problems for different m, n, and block sizes n_j. For simplicity, we only consider the case of identical block sizes n_j.

First, we evaluate constraint reduction in the predictor step, disabling it in the corrector step. Fig. 1 shows the average number of iterations and the average number of reduced blocks as the parameter ω of Condition 3.1 and Condition 4.1 varies. When ω = 0, no constraint reduction is performed, so the iteration counts for the two algorithms are the same as for the unreduced algorithm. When ω ≥ 0.1, the algorithms reduce blocks but require additional iterations, which generally increases the total amount of work in the experiment. We can understand this from the relation between θ̲ and δ_ǫ in (19), together with its effect on the convergence rate in Lemma 3.2. Results for different dimensions and different ratios of zero X*_j blocks are similar.

Next, we evaluate constraint reduction in the corrector step, disabling it in the predictor step. First, we use m = 256, n_j = 16, and vary n = 128, 256, 512, 1024, and 2048. Second, we fix m = 256, n = 512, and vary n_j = 4, 8, 16, 32, and 64. Varying m gives similar results, not presented here. For each set of dimensions, we generate 5 random SDP problems and average the results. To evaluate the computational cost saving, we count the total number of flops needed to construct all of the Schur complement matrices in the corrector steps until each algorithm converges.

Figure 1: When m = 64, n = 256, n_j = 16, and 0% zero X*_j blocks: the number of iterations until convergence and the average number of reduced blocks per iteration for SDP:Reduced and SDP:ReducedLocal, with ω = 0.0, 0.01, 0.03, 0.1, and 0.3.

In the case of the constraint-reduced algorithms, all of the additional computations for constraint reduction are also included in the flop counts. Fig. 2 shows the percentage of flops saved in the corrector steps by SDP:Reduced and SDP:ReducedLocal relative to the unreduced algorithm, for varying matrix size n and block size n_j. For the same settings of n and n_j, Fig. 3 shows the average number of reduced blocks per iteration. As the matrix size n grows, constraint reduction saves more flops because the algorithm has more candidate blocks to reduce. Constraint reduction also tends to save more flops for larger block sizes, because the Schur complement matrix construction costs O(m n_j³ + m² n_j²) for each block. At block size n_j = 64, however, the percentage of savings drops because there are only a few blocks, and hence few candidates for constraint reduction.

Constraint reduction effectively reduces flop counts even for SDP problems whose zero X*_j block ratio is 0%, which indicates that constraint reduction is not limited to SDP problems having many zero X*_j blocks. On the other hand, compared to the case of 25% zero X*_j blocks, the flop savings do not improve much in the case of 50%. Thus, the effectiveness of constraint reduction is related more to the contributions of the blocks along the iterations than to the number of zero blocks at the final optimum.

When we enable constraint reduction only for the corrector step, both SDP:Reduced and SDP:ReducedLocal converge as fast as the unreduced algorithm in terms of the number of iterations, but SDP:ReducedLocal saves more flops than SDP:Reduced. Fig. 3 shows that SDP:ReducedLocal reduces more blocks than SDP:Reduced, so the constraint reduction conditions of SDP:ReducedLocal are more effective than those of SDP:Reduced while also achieving a theoretically faster convergence rate.
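As a rough sanity check on these percentages, the per-block cost model quoted above already predicts the scale of the savings when blocks are dropped. A tiny sketch using that model (constants are dropped, so only relative numbers are meaningful, and this is our illustration rather than the paper's flop counter):

```python
def schur_flops(m, block_sizes):
    """Order-of-magnitude flop estimate for building the Schur complement,
    using the O(m*n_j^3 + m^2*n_j^2) per-block cost cited in the text."""
    return sum(m * nj**3 + m**2 * nj**2 for nj in block_sizes)

# Example: m = 256 constraints, 32 blocks of size 16; dropping the 8 least
# important blocks from the corrector step removes 25% of this estimate.
full    = schur_flops(256, [16] * 32)
reduced = schur_flops(256, [16] * 24)
print(f"estimated saving: {100 * (1 - reduced / full):.0f}%")
```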

Figure 2: When m = 256, the percentage of flops saved in the corrector steps for varying matrix size n and block size n_j by SDP:Reduced and SDP:ReducedLocal, compared with the unreduced algorithm (curves for 0%, 25%, and 50% zero X*_j blocks; left panel: fixed block size n_j = 16, right panel: fixed matrix size n = 512).

Figure 3: When m = 256, the average number of reduced blocks in the corrector steps for varying matrix size n and block size n_j by SDP:Reduced and SDP:ReducedLocal (curves for 0%, 25%, and 50% zero X*_j blocks; left panel: fixed block size n_j = 16, right panel: fixed matrix size n = 512).

6 Conclusions

In this paper, we developed a constraint-reduced predictor-corrector algorithm for block-constrained SDP and showed its polynomial global and superlinear local convergence under the Slater and strict complementarity assumptions. To achieve the superlinear convergence, Algorithm SDP:ReducedLocal adopts new constraint reduction criteria and repeats the corrector step so that the iterate approaches the central path tangentially. Through numerical experiments, we demonstrated its computational cost savings, especially in the corrector step.

We applied the constraint reduction method to predictor-corrector methods using the HKM direction. It would also be interesting to apply constraint reduction to algorithms using other directions. For example, Kojima et al.

[9] developed an algorithm using the AHO direction that has quadratic local convergence. Inspired by the fast centering effect of the AHO direction, Potra et al. [12] developed an algorithm using the HKM direction in the predictor step and the AHO direction in the corrector step, which has superlinear local convergence. Later, Ji, Potra, and Sheng [7] generalized the idea by revealing the relation between the convergence rate and the condition numbers of the scaling matrices for MZ-family directions. The advantage of the algorithms using the AHO direction is that neither a fast centering assumption nor repeated corrector steps are needed for local convergence. The study of constraint reduction methods for these algorithms would extend the scope of application of constraint reduction.

A Appendix: Proofs

Lemma A.1. Suppose that W ∈ R^{n×n} is a nonsingular matrix. Then, for any E ∈ S^n, we have

    −||E||_F ≤ λ_i(E) ≤ ||E||_F,    ||E||_F ≤ || symm(WEW^{−1}) ||_F.

Proof. The first inequality comes from

    |λ_i(E)| ≤ σ_max(E) = √(σ_max(E)²) ≤ √( Σ_{i=1}^n σ_i(E)² ) = ||E||_F.

For the second inequality, see Monteiro [4, Lemma 3.3].

Proof of Lemma 4.1. By the definition of γ,

    γ = || Z^{1/2} X Z^{1/2} − τ̄I ||_F / τ̄   and   γ⁺ = || (Z⁺)^{1/2} X⁺ (Z⁺)^{1/2} − τ̄I ||_F / τ̄,

where (X⁺, y⁺, Z⁺) = (X + ΔX, y + Δy, Z + ΔZ). Because (X_0, Z_0) ∈ N(β, (1 − θ)τ) = N(β, τ̄) by Lemma 3.2, the initial relative distance γ_0 satisfies γ_0 ≤ β. Thus, it suffices to show that, for 0 < γ ≤ β,

    γ⁺ < γ, and                                                                            (45)
    γ⁺ = O(γ²).                                                                            (46)

By Lemma 4.5, we have Z 1/2 ZZ 1/2 H Z 1/2 X Z+ XZ)Z 1/2 1 γ)τ 1 γ)τ τi Z 1/2 XZ 1/2 +Z 1/2 X ǫ Z 1/2 = 9 )) 1 γ)τ τi Z 1/2 XZ 1/2 + Z 1/2 X ǫ Z 1/2 1 γ)τ Thus, by Lemma A.1, = γτ +τδ ǫ 1 γ)τ = γ 1 γ + δ ǫ 1 γ)1 θ) γ 1 γ + ζγ2 1 θ) 1 γ)1 θ) γ = 1 γ 1+ζγ) < γ 1 γ [ 1 = γ 1 γ + γ γ 1 γ + 1 γ [ ) )] 1 β γ = γ 1+ γ [ 1+ β 1 γ γ 17) and τ = 1 θ)τ) Condition 4.2) ) 1 1+ β 2 ) )] 1 β 1 ) γ 23)) 1 γ )] = 2γ < 1 γ β < 0.5) ) γ 1 γ λ min Z 1/2 ZZ 1/2 ) > 1, which implies that I+Z 1/2 ZZ 1/2 0, so Z + = Z+ Z = Z 1/2 I+Z 1/2 ZZ 1/2 )Z 1/2 0, so Z + 0. Therefore, Z + ) 1/2 exists and is invertible. Define Q, E, and W as Then, WEW 1 = Z 1/2 Z + ) 1/2 ) Q := Z 1/2 X + Z + τi)z 1/2, E := Z + ) 1/2 X + Z + ) 1/2 τi, W := Z 1/2 Z + ) 1/2. = Z 1/2 X + Z + τi)z 1/2 = Q. [ ] Z + ) 1/2 X + Z + ) 1/2 τi Z 1/2 Z + ) 1/2 ) 1 22

Thus, by Lemma A.1 and the equation above, τ γ + = Z + ) 1/2 X + Z + ) 1/2 τi = E symm WEW 1 ) = symmq). 47) On the other hand, by 9*) in Lemma 3.1 with µ = τ, X,Z) = X,Z), and X, Z, X ǫ ) = X, Z, X ǫ ), we have symm Z 1/2 X Z+ XZ)Z 1/2) = µi Z 1/2 X+ X ǫ )Z 1/2, 48) Thus, by using the equation above, symmq) = symm Z 1/2 X + Z + τi)z 1/2) = symm Z 1/2 X+ X)Z+ Z) τi)z 1/2) = symm Z 1/2 XZ+X Z+ XZ+ X Z τi)z 1/2) = symm Z 1/2 [ ] ) 1/2 XZ τi)+x Z+ XZ+ X Z) Z [ = Z 1/2 XZ 1/2 τi)+ symm Z 1/2 X Z+ XZ)Z 1/2)] + symm Z 1/2 X ZZ 1/2) = Z 1/2 X ǫ Z 1/2 + symm Z 1/2 X ZZ 1/2) 48)). By the definition of γ, X,Z) Nγ,τ). By using 17), 48), and Lemma 4.5 23

together with the fact X,Z) Nγ,τ), symmq) = 1 2 Q+QT ) Q Z 1/2 X ǫ Z 1/2 + Z 1/2 X ZZ 1/2 Z 1/2 X ǫ Z 1/2 + Z 1/2 X Z 1/2 Z 1/2 ) ZZ 1/2 Z 1/2 X ǫ Z 1/2 + Z 1/2 XZ 1/2 Z 1/2 1/2 ZZ In addition, by Condition 4.2, τδ ǫ + H 2 2τ1 γ) 2 17) and Lemma 4.5) Z 1/2 X Z+ XZ)Z 1/2 2 τδ ǫ + 2τ1 γ) 2 τi Z 1/2 XZ 1/2 Z 1/2 X ǫ Z 1/2 2 τδ ǫ + 2τ1 γ) 2 48)) [ 1 τi Z 1/2 1/2 2 Z τδ ǫ + XZ 2τ1 γ) 2 + 1/2 Xǫ Z 1/2 ] 2 2 τi Z 1/2 1/2 Z 1/2 + XZ 2τ1 γ) 2 Xǫ Z 1/2 A+B 2 A 2 + B 2 +2 A B ) ) τ 2 γ 2 +τ 2 δ 2 ǫ +2ττγδ ǫ τδ ǫ + 2τ1 γ) 2 X,Z) Nγ,τ) and 17)). symmq) τζ1 θ)γ 2 1 + τ 2 2τ1 γ) 2 γ 2 +τ 2 ζ 2 1 θ) 2 γ 4 +2ττζ1 θ)γ 3) = τζγ 2 1 + τ 2 2τ1 γ) 2 γ 2 +ζ 2 τ 2 γ 4 +2ζτ 2 γ 3) τ = 1 θ)τ) [ = τγ 2 ζ + 1 ) ] 2 1+ζγ. 2 1 γ Therefore, together with 47), we have τ γ + symmq) τγ 2 [ γ + γ 2 [ By 23), for any γ β, ζ + 1 2 ) ] 2 1+ζγ, 1 γ ζ + 1 2 ) ] 2 1+ζγ. 49) 1 γ ζ < 1 β 2 1 2, γ 50) ζγ < 1 2γ. 51) 24

Thus, by 49) 51) together with γ β, we have [ γ + τγ 2 ζ + 1 ) ] [ 2 1+ζγ < γ 2 ζ + 1 ) ] 2 1+1 2γ) = γ 2 ζ+2) γ2 2 1 γ 2 1 γ β γ, Now, we finish the proof by showing X + 0. By the inequality above, Z + ) 1/2 XZ + ) 1/2 τi < γτ. 52) By Lemma A.1 and 52), ) λ min Z + ) 1/2 XZ + ) 1/2 τi Z + ) 1/2 XZ + ) 1/2 τi > γτ, λ min Z + ) 1/2 XZ + ) 1/2) > 1 γ)τ > 0, so Z + ) 1/2 X + Z + ) 1/2 0. Therefore, X + 0 because Z + 0 as we showed above. Proof. of Lemma 4.2. To simplify notation, let ǫ denote the target distance, minτ σ,α). By mathematical induction, for q 2, we can rewrite the inequality in Lemma 4.1 as γ q < β ) 2 q 1 γ1. β Thus, at worst, we can reach the target distance by γ q < β ) 2 q 1 γ1 < ǫ β 2 q 1 > logǫ/β) logγ 1 /β) = logβ/ǫ) logβ/γ 1 ) q > 1+log 2 logβ/ǫ) logβ/γ 1 ) ). γ 1 < γ 0 β) Therefore, the required iteration Q is bounded by ) logβ/ǫ) Q 2+log 2 logβ/γ 1 ) so, Q = Ologlog1/τ σ ))) for given α, β, and γ 1. = 2+log 2 logβ/minτ σ,α)) logβ/γ 1 ) ), 25

References

[1] Alizadeh, F., Haeberly, J.P.A., Overton, M.L.: Primal-dual interior-point methods for semidefinite programming: Convergence rates, stability and numerical results. SIAM J. Opt. 8(3), 746-768 (1998)

[2] Helmberg, C., Rendl, F., Vanderbei, R.J., Wolkowicz, H.: An interior-point method for semidefinite programming. SIAM J. Opt. 6(2), 342-361 (1996)

[3] Kojima, M., Shindoh, S., Hara, S.: Interior-point methods for the monotone semidefinite linear complementarity problem in symmetric matrices. SIAM J. Opt. 7(1), 86-125 (1997)

[4] Monteiro, R.D.C.: Primal-dual path-following algorithms for semidefinite programming. SIAM J. Opt. 7(3), 663-678 (1997)

[5] Nesterov, Y.E., Todd, M.J.: Primal-dual interior-point methods for self-scaled cones. SIAM J. Opt. 8(2), 324-364 (1998)

[6] Alizadeh, F., Haeberly, J.P.A., Overton, M.L.: Complementarity and nondegeneracy in semidefinite programming. Math. Prog. 77(1), 111-128 (1997)

[7] Ji, J., Potra, F.A., Sheng, R.: On the local convergence of a predictor-corrector method for semidefinite programming. SIAM J. Opt. 10(1), 195-210 (1999)

[8] Kojima, M., Shida, M., Shindoh, S.: Local convergence of predictor-corrector infeasible-interior-point algorithms for SDPs and SDLCPs. Math. Prog. 80(2), 129-160 (1998)

[9] Kojima, M., Shida, M., Shindoh, S.: A predictor-corrector interior-point algorithm for the semidefinite linear complementarity problem using the Alizadeh-Haeberly-Overton search direction. SIAM J. Opt. 9(2), 444-465 (1999)

[10] Monteiro, R.D.C.: Polynomial convergence of primal-dual algorithms for semidefinite programming based on Monteiro and Zhang family of directions. SIAM J. Opt. 8(3), 797-812 (1998)

[11] Monteiro, R.D.C., Zhang, Y.: A unified analysis for a class of long-step primal-dual path-following interior point algorithms for semidefinite programming. Math. Prog. 81(3), 281-299 (1998)

[12] Potra, F.A., Sheng, R.: Superlinear convergence of a predictor-corrector method for semidefinite programming without shrinking central path neighborhood. Tech. Rep. 91, Reports on Computational Mathematics, Department of Mathematics, Univ. of Iowa (1996)

[13] Potra, F.A., Sheng, R.: Superlinear convergence of interior-point algorithms for semidefinite programming. J. Opt. Theory Appl. 99(1), 103-119 (1998)

[14] Potra, F.A., Sheng, R.: A superlinearly convergent primal-dual infeasible-interior-point algorithm for semidefinite programming. SIAM J. Opt. 8(4), 1007-1028 (1998)

[15] Zhang, Y.: On extending some primal-dual interior-point algorithms from linear programming to semidefinite programming. SIAM J. Opt. 8(2), 365-386 (1998)

[16] Toh, K.C., Todd, M.J., Tütüncü, R.H.: On the implementation and usage of SDPT3 - a MATLAB software package for semidefinite-quadratic-linear programming, version 4.0. In: M.F. Anjos, J.B. Lasserre (eds.) Handbook on Semidefinite, Conic and Polynomial Optimization, vol. 166, pp. 715-754. Springer, New York (2012)

[17] Fujisawa, K., Kojima, M., Nakata, K.: SDPA (Semidefinite Programming Algorithm) User's Manual B-308. Tech. rep., Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Japan (1996). URL ftp://ftp.is.titech.ac.jp/pub/opres/software/sdpa

[18] Fukuda, M., Kojima, M., Murota, K., Nakata, K.: Exploiting sparsity in semidefinite programming via matrix completion I: implementation and numerical results. SIAM J. Opt. 11(3), 647-674 (2000)

[19] Dantzig, G.B., Ye, Y.: A build-up interior-point method for linear programming: Affine scaling form. Tech. rep., Stanford Univ. (1991)

[20] Hertog, D.D., Roos, C., Terlaky, T.: Adding and deleting constraints in the path-following method for LP. In: Advances in Optimization and Approximation, vol. 1, pp. 166-185. Springer, New York (1994)

[21] Kaliski, J.A., Ye, Y.: A decomposition variant of the potential reduction algorithm for linear programming. Manage. Sci. 39, 757-776 (1993)

[22] Tits, A.L., Absil, P.A., Woessner, W.P.: Constraint reduction for linear programs with many inequality constraints. SIAM J. Opt. 17(1), 119-146 (2006)

[23] Tone, K.: An active-set strategy in an interior point method for linear programming. Math. Prog. 59(3), 345-360 (1993)

[24] Winternitz, L.B., Nicholls, S.O., Tits, A.L., O'Leary, D.P.: A constraint-reduced variant of Mehrotra's predictor-corrector algorithm. Comput. Optim. Appl. 51(3), 1001-1036 (2012)

[25] Ye, Y.: An O(n³L) potential reduction algorithm for linear programming. Math. Prog. 50(2), 239-258 (1991)

[26] Jung, J.H., O'Leary, D.P., Tits, A.L.: Adaptive constraint reduction for training support vector machines. Elec. Trans. Numer. Anal. 31, 156-177 (2008)

[27] Williams, J.A.: The use of preconditioning for training support vector machines. Master's thesis, Applied Mathematics and Scientific Computing Program, Univ. of Maryland, College Park, MD (2008)

[28] Jung, J.H., O'Leary, D.P., Tits, A.L.: Adaptive constraint reduction for convex quadratic programming. Comput. Optim. Appl. 51(1), 125-157 (2012)

[29] Park, S., O'Leary, D.P.: A polynomial time constraint-reduced algorithm for semidefinite optimization problems. J. Opt. Theory Appl. (2015)

[30] Park, S., O'Leary, D.P.: A polynomial time constraint reduced algorithm for semidefinite optimization problems, with convergence proofs. Tech. rep., Univ. of Maryland, College Park, MD (2015). URL http://www.optimization-online.org/db_html/2013/08/4011.html

[31] Park, S.: Matrix reduction in numerical optimization. Ph.D. thesis, Computer Science Department, Univ. of Maryland, College Park, MD (2011). URL http://drum.lib.umd.edu/handle/1903/11751

[32] Fujisawa, K., Kojima, M., Nakata, K.: Exploiting sparsity in primal-dual interior-point methods for semidefinite programming. Math. Prog. 79(1), 235-253 (1997)