A Primal-Dual Second-Order Cone Approximations Algorithm For Symmetric Cone Programming

A Primal-Dual Second-Order Cone Approximations Algorithm For Symmetric Cone Programming

Chek Beng Chua

Abstract. This paper presents the new concept of second-order cone approximations for convex conic programming. Given any open convex cone $K$, a logarithmically homogeneous self-concordant barrier for $K$ and any positive real number $r \le 1$, we associate, with each direction $x \in K$, a second-order cone $\hat K_r(x)$ containing $K$. We show that $K$ is the intersection of the second-order cones $\hat K_r(x)$, as $x$ ranges through all directions in $K$. Using these second-order cones as approximations to cones of symmetric positive definite matrices, we develop a new polynomial-time primal-dual interior-point algorithm for semi-definite programming. The algorithm is extended to symmetric cone programming via the relation between symmetric cones and Euclidean Jordan algebras.

0 Introduction

This paper aims to present the new concept of second-order cone approximations for convex conic programming. Convex conic programming is a generalization of linear programming, which is the class of optimization problems with linear objective functions, linear equality constraints and non-negativity constraints (i.e. the variables are required to be non-negative). In a convex conic programming problem, the non-negativity constraints are generalized to the conic constraint $x \in \operatorname{cl} K$, where $K$ is a finite-dimensional open convex cone and $\operatorname{cl} K$ is its closure. Every convex programming problem (i.e. minimization of a convex function over a finite-dimensional convex set) can be expressed as a convex conic programming problem.

The analytical foundation for the study of interior-point methods for convex conic programming was provided by Nesterov and Nemirovskii [6] in their seminal work. Instrumental to their work is a special class of functionals which they termed logarithmically homogeneous self-concordant barriers.
This class of functionals captures the essential properties of the standard logarithmic barrier

    $(x_1, \dots, x_n) \mapsto -\sum_{i=1}^n \ln x_i$

that are responsible for several polynomial-time algorithms for linear programming.

School of Operations Research and Industrial Engineering, Cornell University, Ithaca, New York, 14853, USA (chua@orie.cornell.edu). This research was performed as part of the author's Ph.D. study at Cornell University.

The Hessian of a logarithmically homogeneous self-concordant barrier at each point $x \in K$ defines an inner product called the local inner product at $x$. Under the metric induced by the local inner product at $x$, the open ball of radius $r$ centered at $x$ is contained entirely in $K$ whenever $r \le 1$. We consider the smallest open cone $K_r(x)$ containing this open ball, which is a second-order cone. The second-order cone $K_r(x)$ can be used to approximate $K$. An advantage of this approximation is that the resulting approximating problem can be solved exactly.

In the context of semi-definite programming, where $K$ is a cone of symmetric positive definite matrices, we develop a new primal-dual algorithm that alternates between the primal problem and its Lagrangian dual (which is again a semi-definite programming problem), solving an approximating second-order cone programming problem at each step. This algorithm has a provable polynomial bound on the number of iterations that matches the best bound known for general semi-definite programming problems.

This algorithm is extended to symmetric cone programming, where $K$ is a symmetric cone (also known as a self-dual homogeneous cone or a self-scaled cone). Nesterov and Todd ([7], [8]) were the first to develop efficient primal-dual interior-point algorithms for symmetric cone programming, in the context of self-scaled cones. Symmetric cones are closely related to Euclidean Jordan algebras. To the author's knowledge, this relation was first brought to the attention of optimizers by O. Güler. Using the relation between symmetric cones and Euclidean Jordan algebras, many interior-point algorithms for semi-definite programming were extended to symmetric cone programming (see e.g. [1], [3]). Using this relation, our algorithm can also be extended to symmetric cone programming.

This paper is organized as follows. We begin Section 1 with a brief introduction to logarithmically homogeneous self-concordant barriers.
This sets up the discussion on second-order cone approximations in the second part of the section. In Section 2, we look at semi-definite programming and its standard logarithmic barrier. We highlight and prove several properties of the standard logarithmic barrier that are essential to the development and analysis of a primal-dual second-order cone approximations algorithm for semi-definite programming. This is followed by the description and analysis of the algorithm, which shows that the algorithm is polynomial-time. Section 3 is devoted to the extension of the algorithm to symmetric cone programming. We briefly discuss the relation between symmetric cones and Euclidean Jordan algebras, and show that this relation provides a simple extension of the algorithm. Specifically, through Euclidean Jordan algebras, we define standard logarithmic barriers for symmetric cones and show that they also possess the same essential properties that are highlighted in Section 2.

1 Second-Order Cone Approximations

Among all open convex cones, second-order cones form the class of cones that are easiest to deal with. In fact, each convex conic programming problem with a second-order cone as the underlying cone can be solved exactly. In this section, we discuss the concept of using a special class of second-order cones as approximations to an arbitrary open convex cone.

Throughout this section, let $E$ be a finite-dimensional real vector space with inner product $\langle\cdot,\cdot\rangle$ and induced norm $\|\cdot\|$.

1.1 Self-Concordant Logarithmically-Homogeneous Barriers

A convex conic programming problem is the following minimization problem:

    min $\langle \hat s, x\rangle$   s.t. $x \in L + \hat x$, $x \in \operatorname{cl} K$

where $\hat s, \hat x \in E$, $L \subseteq E$ is a vector subspace, $K \subseteq E$ is an open convex cone and $\operatorname{cl} K$ is its closure. Without loss of generality, we may assume that $K$ is regular, i.e. $K$ is non-empty and does not contain any vector subspace. The dual problem is

    min $\langle \hat x, s\rangle$   s.t. $s \in L^\perp + \hat s$, $s \in \operatorname{cl} K^*$

where $K^* := \{z \in E : \langle z, y\rangle > 0 \text{ for all } y \in K\}$ is the dual cone of $K$ and $L^\perp$ is the orthogonal complement of $L$. Henceforth, let $K \subseteq E$ be a regular open convex cone, and let $K^*$ be its dual cone.

A function $f : K \to \mathbb{R}$ is a (non-degenerate and strongly) $\vartheta$-self-concordant barrier for $K$ if it is a strictly convex, three-times continuously differentiable barrier for $K$ (i.e. $f$ diverges to infinity as its argument approaches any point on the boundary $\partial K$ of $K$) that satisfies

    $|D^3 f(x)[h,h,h]| \le 2\,(D^2 f(x)[h,h])^{3/2}$    (1.1)

and

    $|D f(x)[h]| \le (\vartheta\, D^2 f(x)[h,h])^{1/2}$    (1.2)

for all $x \in K$ and $h \in E$. Here, $D^k f(x)[h,\dots,h]$ denotes the $k$-th directional derivative of $f$ along $h$, i.e. $\frac{d^k}{dt^k} f(x+th)\big|_{t=0}$. Self-concordancy was introduced by Nesterov and Nemirovskii [6]. All results in this subsection were proven in [6]; hence we do not give proofs nor justifications for these results. We refer interested readers to [6] for a comprehensive discussion on the theory of self-concordancy.

If $f$ is a $\vartheta$-self-concordant barrier for $K$, then it can be shown that $\vartheta \ge 1$, and the duality map $x \mapsto -g(x)$, where $g$ is the gradient of $f$, takes $K$ onto its dual cone $K^*$. If $f$ further satisfies

    $f(tx) = f(x) - \vartheta \ln t$    (1.3)

for all $x \in K$ and $t > 0$, then $f$ is called a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$.
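The defining inequalities (1.1)–(1.3) are easy to check numerically for the standard logarithmic barrier on the positive orthant, for which $\vartheta = n$. The following sketch (our own illustration, not from the paper) verifies all three at a random point:

```python
import numpy as np

# Standard logarithmic barrier f(x) = -sum_i ln x_i on the positive
# orthant, a theta-self-concordant barrier with theta = n.
rng = np.random.default_rng(0)
n = 5
theta = n
x = rng.uniform(0.5, 2.0, n)   # a point of the cone K
h = rng.normal(size=n)         # an arbitrary direction

f = lambda v: -np.log(v).sum()
Df  = -(h / x).sum()             # Df(x)[h]
D2f = (h**2 / x**2).sum()        # D^2 f(x)[h,h]
D3f = -2 * (h**3 / x**3).sum()   # D^3 f(x)[h,h,h]

assert abs(D3f) <= 2 * D2f**1.5 + 1e-9          # (1.1)
assert abs(Df) <= np.sqrt(theta * D2f) + 1e-9   # (1.2)
t = 3.7
assert np.isclose(f(t * x), f(x) - theta * np.log(t))  # (1.3)
print("(1.1), (1.2) and (1.3) hold at a random point")
```

Here (1.1) reduces to a power-mean inequality and (1.2) to Cauchy–Schwarz, which is why the checks pass for every positive $x$ and every $h$.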

Theorem 1.1. Let $f$ be a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$. Let $g : K \to E$ and $H : K \to \mathcal{L}[E,E]$ be the gradient and Hessian of $f$ respectively. For any $x \in K$ and $t > 0$,

    $g(tx) = \frac{1}{t}\, g(x)$,   $H(tx) = \frac{1}{t^2}\, H(x)$    (1.4)

    $H(x)x = -g(x)$    (1.5)

and

    $\langle g(x), x\rangle = -\vartheta.$    (1.6)

Theorem 1.2. If $f$ is a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$, then the functional

    $f^* : K^* \to \mathbb{R} : s \mapsto -\inf_{x \in K} \{\langle s, x\rangle + f(x)\}$

is a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K^*$.

We call the functional $f^*$ in the theorem the conjugate barrier of $f$.

Theorem 1.3. Let $f$ be a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$, and let $f^*$ be its conjugate barrier. Let $g$ (resp. $g^*$) and $H$ (resp. $H^*$) be the gradient and Hessian of $f$ (resp. $f^*$) respectively. For any $x \in K$ and $s \in K^*$,

    $f(x) + f^*(-g(x)) = f(-g^*(s)) + f^*(s) = -\vartheta$    (1.7)

    $-g^*(-g(x)) = x$,   $-g(-g^*(s)) = s$    (1.8)

and

    $H^*(-g(x)) = H(x)^{-1}$,   $H(-g^*(s)) = H^*(s)^{-1}.$    (1.9)

1.2 Second-Order Cones and Their Dual Cones

The $n$-dimensional second-order cone is the open cone

    $\{x \in \mathbb{R}^n : x_1 > \sqrt{x_2^2 + \cdots + x_n^2}\,\}.$

It is also called the Lorentz cone, light cone or ice-cream cone. Notice that the direction $d = (1, 0, \dots, 0)$ can be considered as the center direction of the second-order cone. Indeed, under the usual dot product, the angle between any direction along the boundary of the $n$-dimensional second-order cone and $d$ is constant. In an arbitrary finite-dimensional real vector space $E$, a second-order cone is the open cone

    $\{x \in E : \langle d, x\rangle > \|x - \operatorname{Pr}_d x\|\} = \{x \in E : \langle d, x\rangle > \tfrac{1}{\sqrt{2}}\|x\|\}$

where $d \in E$ with $\|d\| = 1$ is the center direction, and $\operatorname{Pr}_d$ is the orthogonal projection onto the subspace spanned by $d$. It is a well-known fact that second-order cones are self-dual. This is stated formally in the following proposition. We leave the proof to the reader.

Proposition 1.1. For any $d \in E$ with $\|d\| = 1$, the second-order cone

    $\{x \in E : \langle d, x\rangle > \|x - \operatorname{Pr}_d x\|\}$

is identical to its dual cone under $\langle\cdot,\cdot\rangle$.

1.3 Second-Order Cone Approximations

Throughout this subsection, let $f$ be a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$ and let $f^*$ be its conjugate barrier. Denote the gradient and Hessian of $f$ by $g$ and $H$ respectively, and denote the gradient and Hessian of $f^*$ by $g^*$ and $H^*$ respectively. For each $x \in K$, we define the inner product

    $\langle\cdot,\cdot\rangle_x : E \times E \to \mathbb{R} : (u, v) \mapsto D^2 f(x)[u, v].$

This is called the local inner product at $x$. The induced norm $\|\cdot\|_x : u \mapsto \sqrt{\langle u, u\rangle_x}$ is called the local norm at $x$. Similarly for the dual cone $K^*$, the local inner product at $s \in K^*$ is

    $\langle\cdot,\cdot\rangle_s : E \times E \to \mathbb{R} : (u, v) \mapsto D^2 f^*(s)[u, v]$

and the local norm at $s$ is $\|\cdot\|_s : u \mapsto \sqrt{\langle u, u\rangle_s}$.

For $x \in K$ and $r > 0$, the local ball of radius $r$ at $x$ is the set $\{z \in E : \|z - x\|_x < r\}$. It is denoted by $B_r(x)$. It was shown in [6] that for each $x \in K$, the local ball of radius 1 at $x$ (also called the local unit ball at $x$) is contained in $K$. Hence, for $r \in (0,1)$, $B_r(x) \subseteq K$. Consider the smallest open cone containing $B_r(x)$. This cone is called the second-order cone of radius $r$ along $x$ and it is denoted by $K_r(x)$. For $r \in (0,1)$, $B_r(x) \subseteq K$ implies $K_r(x) \subseteq K$.

Theorem 1.4. For each $x \in K$ and $r \in (0,1)$,

    $K_r(x) = \{z \in E : \langle x, z\rangle_x > \sqrt{\vartheta - r^2}\, \|z\|_x\}.$

Consequently, $K_r(x)$ is a second-order cone, and its dual cone is

    $(K_r(x))^* = \{s \in E : \langle -g(x), s\rangle_{-g(x)} > r\, \|s\|_{-g(x)}\}.$

Proof. Suppose $z \in K_r(x)$. Then there exists $\mu > 0$ such that $\mu z \in B_r(x)$. Therefore,

    $r^2 > \|\mu z - x\|_x^2 = \mu^2\|z\|_x^2 - 2\mu\langle z,x\rangle_x + \|x\|_x^2$
    $= \left(\mu\|z\|_x - \langle z,x\rangle_x/\|z\|_x\right)^2 + \vartheta - \langle z,x\rangle_x^2/\|z\|_x^2$
    $\ge \vartheta - \langle z,x\rangle_x^2/\|z\|_x^2,$

implying that $\langle z,x\rangle_x^2 > (\vartheta - r^2)\|z\|_x^2$. Furthermore, since $r \le 1 \le \sqrt{\vartheta} = \|x\|_x$, we deduce from

    $-2\mu\langle z,x\rangle_x + \|x\|_x^2 \le \mu^2\|z\|_x^2 - 2\mu\langle z,x\rangle_x + \|x\|_x^2 < r^2 \le \|x\|_x^2$

that $\langle z,x\rangle_x > 0$, and hence $\langle z,x\rangle_x > \sqrt{\vartheta - r^2}\,\|z\|_x$.

Conversely, suppose $\langle z,x\rangle_x > \sqrt{\vartheta - r^2}\,\|z\|_x$. For $\mu = \langle z,x\rangle_x/\|z\|_x^2 > 0$,

    $\|\mu z - x\|_x^2 = \mu^2\|z\|_x^2 - 2\mu\langle z,x\rangle_x + \|x\|_x^2 = \vartheta - \langle z,x\rangle_x^2/\|z\|_x^2 < r^2,$

implying that $z \in K_r(x)$.

For the dual, we rewrite $K_r(x)$ as

    $K_r(x) = \{z \in E : \langle H(x)^{1/2}x, H(x)^{1/2}z\rangle > \sqrt{\vartheta - r^2}\,\|H(x)^{1/2}z\|\}$
    $= H(x)^{-1/2}\{z \in E : \langle H(x)^{1/2}x, z\rangle > \sqrt{\vartheta - r^2}\,\|z\|\}$
    $= H(x)^{-1/2}\{z \in E : \tfrac{r}{\sqrt{\vartheta - r^2}}\,\tfrac{1}{\sqrt{\vartheta}}\langle H(x)^{1/2}x, z\rangle > \|z - \operatorname{Pr}_{H(x)^{1/2}x} z\|\}$
    $= H(x)^{-1/2}\, T\, \{z \in E : \tfrac{1}{\sqrt{\vartheta}}\langle H(x)^{1/2}x, z\rangle > \|z - \operatorname{Pr}_{H(x)^{1/2}x} z\|\}$

where $T : z \mapsto \operatorname{Pr}_{H(x)^{1/2}x} z + \tfrac{r}{\sqrt{\vartheta - r^2}}\,(z - \operatorname{Pr}_{H(x)^{1/2}x} z)$. By Proposition 1.1,

    $(K_r(x))^* = H(x)^{1/2}\, T^{-1}\, \{s \in E : \tfrac{1}{\sqrt{\vartheta}}\langle H(x)^{1/2}x, s\rangle > \|s - \operatorname{Pr}_{H(x)^{1/2}x} s\|\}$
    $= H(x)^{1/2}\{s \in E : \tfrac{\sqrt{\vartheta - r^2}}{r}\,\tfrac{1}{\sqrt{\vartheta}}\langle H(x)^{1/2}x, s\rangle > \|s - \operatorname{Pr}_{H(x)^{1/2}x} s\|\}$
    $= H(x)^{1/2}\{s \in E : \langle H(x)^{1/2}x, s\rangle > r\,\|s\|\}$
    $= \{s \in E : \langle H(x)^{1/2}x, H(x)^{-1/2}s\rangle > r\,\|H(x)^{-1/2}s\|\}$
    $= \{s \in E : \langle x, s\rangle > r\,\|s\|_{-g(x)}\}$
    $= \{s \in E : \langle -g(x), s\rangle_{-g(x)} > r\,\|s\|_{-g(x)}\}.$
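For the positive orthant with barrier $-\sum_i \ln x_i$ (so $\vartheta = n$ and $\|u\|_x = \|u/x\|$ with componentwise division), both the containment $K_r(x) \subseteq K$ and the membership test of Theorem 1.4 can be exercised numerically. A small sketch, with our own sampling scheme:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 0.9
x = rng.uniform(0.5, 2.0, n)    # x in K = positive orthant; theta = n

# For this barrier ||u||_x = ||u/x||, so the local ball B_r(x) is
# {x*(1+v) : ||v|| < r}.  Points of the cone K_r(x) are positive
# multiples mu * b of points b in B_r(x).
for _ in range(1000):
    v = rng.normal(size=n)
    v *= rng.uniform(0, r) / np.linalg.norm(v)      # ||v|| < r
    z = rng.uniform(0.1, 10.0) * x * (1 + v)        # a point of K_r(x)
    w = z / x
    # Theorem 1.4 membership test: <x,z>_x > sqrt(theta - r^2) * ||z||_x
    assert w.sum() > np.sqrt(n - r**2) * np.linalg.norm(w)
    # K_r(x) is contained in K: every sampled point is positive
    assert (z > 0).all()
print("K_r(x) inside K confirmed on 1000 samples")
```

The first assertion checks that cone points generated from the local ball do satisfy the theorem's inequality; the second illustrates $K_r(x) \subseteq K$ for $r < 1$.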

Similarly for the dual cone $K^*$, we define, for each $s \in K^*$ and $r > 0$, the dual second-order cone of radius $r$ along $s$ as the smallest open cone containing the local ball of radius $r$ at $s$. It is given by

    $K^*_r(s) = \{y \in E : \langle s, y\rangle_s > \sqrt{\vartheta - r^2}\,\|y\|_s\}.$

Since $K^*_r(-g(x)) \subseteq K^*$ for all $x \in K$ and $r \in (0,1)$, we deduce from elementary duality theory that $(K^*_r(-g(x)))^* \supseteq (K^*)^* = K$. For simplicity, we denote the dual cone of $K^*_r(-g(x))$ by $\hat K_r(x)$.

Theorem 1.5. For any $r \in (0,1)$,

    $K = \bigcap_{x \in K} \hat K_r(x).$

Proof. Since $\hat K_r(x) \supseteq K$ for all $x \in K$,

    $K \subseteq \bigcap_{x \in K} \hat K_r(x).$

Take any $z \notin K$. Since $K = (K^*)^*$, there exists $s^* \in K^*$ such that $\langle z, s^*\rangle \le 0$. Let $x^* = -g^*(s^*) \in K$. By (1.8), $s^* = -g(-g^*(s^*)) = -g(x^*) \in K^*_r(-g(x^*))$. Together with $\langle z, s^*\rangle \le 0$, we conclude that $z \notin \hat K_r(x^*)$, which is the dual cone of $K^*_r(-g(x^*))$. Consequently, $z \notin \bigcap_{x \in K} \hat K_r(x)$. Hence,

    $K \supseteq \bigcap_{x \in K} \hat K_r(x).$

The theorem suggests that we can approximate $K$ by the simpler cone

    $\bigcap_{x \in S} \hat K_r(x)$

where $S \subseteq K$. The approximating problem is a second-order cone programming problem. Of course, the problem may not be tractable if $S$ is large when compared to the dimension of $K$. In the extreme case, we have $|S| = 1$, i.e. we approximate $K$ by a single second-order cone. We call the approximations in this extreme case the second-order cone approximations of $K$. In the following sections, we study the use of the second-order cone approximations in solving certain classes of convex conic programming problems.

2 The Primal-Dual Second-Order Cone Approximations Algorithm

One of the most well-studied classes of convex conic programming problems is semi-definite programming (SDP), which is the class of convex conic programming problems with cones

of symmetric positive definite matrices as underlying cones. The study of semi-definite programming is well motivated by its wide applicability in various areas (see e.g. [2]).

In this section, we apply the concept of second-order cone approximations to SDP. We develop a primal-dual interior-point algorithm for SDP, and show that for an underlying cone of $n$-by-$n$ symmetric positive definite matrices, the algorithm requires at most $O(\sqrt{n}\ln\frac{1}{\varepsilon})$ iterations to reduce the duality gap by a factor of $\varepsilon$. This complexity bound matches the best bound known for SDP. Actually, the primal-dual interior-point algorithm and its analysis can be extended to a wider class of optimization problems. In fact, in the section following this, we show that the extension can be made to symmetric cone programming. The discussion in this section is restricted to SDP because we need additional tools from the theory of Euclidean Jordan algebras for the extension to symmetric cone programming, and the primary purpose of this paper is to present the second-order cone approximations algorithm and its analysis rather than the algebraic tools.

2.1 Semi-Definite Programming

Let $S^n$ be the space of $n$-by-$n$ symmetric matrices and

    $\langle\cdot,\cdot\rangle : S^n \times S^n \to \mathbb{R} : (X, Y) \mapsto \operatorname{tr} XY = \sum_{i,j=1}^n X_{ij} Y_{ij}$

be the trace inner product. A semi-definite programming problem is the following minimization problem:

    min $\langle \hat S, X\rangle$   s.t. $X \in L + \hat X$, $X \in \operatorname{cl} S^n_{++}$    (P)

where $\hat X, \hat S \in S^n$, $L \subseteq S^n$ is a vector subspace and $S^n_{++}$ is the cone of symmetric positive definite $n$-by-$n$ matrices (also called the positive definite cone). The positive definite cone is self-dual, i.e. $(S^n_{++})^* = S^n_{++}$. The Lagrangian dual of the SDP problem is

    min $\langle \hat X, S\rangle$   s.t. $S \in L^\perp + \hat S$, $S \in \operatorname{cl} S^n_{++}$.    (D)

Assume that both the primal and dual problems have strictly feasible solutions. This implies that the duality gap is zero, and the sets of optimal solutions for the primal and dual problems are non-empty and bounded. Assume further that $\hat X \notin L$ and $\hat S \notin L^\perp$.
For otherwise, if $\hat X \in L$, then the zero matrix is optimal for the primal SDP problem, and if $\hat S \in L^\perp$, then the primal SDP problem has constant objective value.

The standard logarithmic barrier for $S^n_{++}$ is

    $X \mapsto -\ln\det X.$

It is an $n$-logarithmically homogeneous self-concordant barrier. Under the trace inner product, its gradient and Hessian are, respectively,

    $g(X) = -X^{-1}$   and   $H(X) : S^n \to S^n : U \mapsto X^{-1} U X^{-1}.$

The duality map is $X \mapsto -g(X) = X^{-1}$. The local inner product at $X \in S^n_{++}$ is

    $\langle\cdot,\cdot\rangle_X : (U, V) \mapsto \langle H(X)U, V\rangle = \operatorname{tr} X^{-1} U X^{-1} V$

and the local norm at $X$ is

    $\|\cdot\|_X : U \mapsto \sqrt{\operatorname{tr}\,(X^{-1} U)^2}.$

In the next proposition, we state several properties of standard logarithmic barriers for positive definite cones that are essential for the development and analysis of the primal-dual second-order cone approximations algorithm in subsequent subsections.

Proposition 2.1. Let $f$ be the standard logarithmic barrier for a positive definite cone $S^n_{++}$. The following are true.

1. The gradient and Hessian of the conjugate barrier of $f$ are respectively identical to the gradient $g$ and Hessian $H$ of $f$.

2. The duality map $X \mapsto -g(X)$ has a unique fixed point $e \in S^n_{++}$.

3. For any $X, Z \in S^n_{++}$, $H(X)Z \in S^n_{++}$ and

    $H(H(X)Z) = H(X)^{-1} H(Z) H(X)^{-1}.$    (2.1)

4. For any $X \in S^n_{++}$, there exists $X^{1/2} \in S^n_{++}$ such that

    $H(X^{1/2}) = H(X)^{1/2}$    (2.2)

where $H(X)^{1/2}$ denotes the self-adjoint positive definite linear operator that satisfies $H(X)^{1/2} H(X)^{1/2} = H(X)$.¹

5. For any $X \in S^n_{++}$,

    $H(X)^{-1/2} e = X.$    (2.3)

6. For any $X \in S^n_{++}$, there exist $\lambda_1, \dots, \lambda_n > 0$ such that for $p = 1, 2, \dots$,

    $\langle H(X)^{-p/2} e, e\rangle = \sum_{i=1}^n \lambda_i^p.$    (2.4)

¹ In general, for any self-adjoint positive definite linear operator $H$ and any positive integers $p$ and $q$, we denote by $H^{p/q}$ the self-adjoint positive definite linear operator that satisfies $(H^{p/q})^q = H^p$, and denote by $H^{-p/q}$ the operator $(H^{-1})^{p/q} = (H^{p/q})^{-1}$.
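Properties 3 and 6 are concrete matrix identities and lend themselves to a quick numerical sanity check before the proof. A sketch with randomly generated matrices (`rand_spd` is our own helper, not the paper's notation):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def rand_spd(rng, n):
    # random symmetric positive definite matrix
    A = rng.normal(size=(n, n))
    return A @ A.T + n * np.eye(n)

X, Z = rand_spd(rng, n), rand_spd(rng, n)
A = rng.normal(size=(n, n))
U = (A + A.T) / 2                          # arbitrary symmetric matrix
Xi, Zi = np.linalg.inv(X), np.linalg.inv(Z)

HXZ = Xi @ Z @ Xi                          # H(X)Z
assert np.allclose(HXZ, HXZ.T)
assert (np.linalg.eigvalsh(HXZ) > 0).all()   # property 3: H(X)Z in S^n_++

# (2.1): H(H(X)Z)U = H(X)^{-1} H(Z) H(X)^{-1} U = X Z^{-1} X U X Z^{-1} X
lhs = np.linalg.inv(HXZ) @ U @ np.linalg.inv(HXZ)
rhs = X @ Zi @ X @ U @ X @ Zi @ X
assert np.allclose(lhs, rhs)

# (2.4) with p = 2: <H(X)^{-1} e, e> = tr X^2 = sum of squared eigenvalues
assert np.isclose(np.trace(X @ X), (np.linalg.eigvalsh(X) ** 2).sum())
print("(2.1) and (2.4) verified on random matrices")
```

The case $p = 2$ of (2.4) avoids matrix square roots: $H(X)^{-1}e = XeX = X^2$, so the left side is just $\operatorname{tr} X^2$.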

Proof. Let $X, Z \in S^n_{++}$ be arbitrary.

1. By (1.7),

    $f^*(X) = f^*(-g(X^{-1})) = -f(X^{-1}) - n = \ln\det X^{-1} - n = f(X) - n.$

2. Since $-g(X) = X^{-1}$, it is clear that the unique fixed point of the duality map is the $n$-by-$n$ identity matrix $I_n$.

3. Clearly, if both $X$ and $Z$ are symmetric and positive definite, then so is $H(X)Z = X^{-1} Z X^{-1}$. For any $U \in S^n$,

    $H(H(X)Z)U = H(X^{-1}ZX^{-1})U = (X^{-1}ZX^{-1})^{-1} U (X^{-1}ZX^{-1})^{-1} = XZ^{-1}XUXZ^{-1}X = H(X)^{-1}H(Z)H(X)^{-1}U.$

4. Let $X^{1/2}$ be the symmetric positive definite square root of $X$. Then, for any $U \in S^n$,

    $H(X^{1/2})H(X^{1/2})U = X^{-1/2}(X^{-1/2} U X^{-1/2})X^{-1/2} = H(X)U,$

implying that $H(X^{1/2}) = H(X)^{1/2}$.

5. Since $e$ is the identity matrix $I_n$,

    $H(X)^{-1/2} e = X^{1/2} I_n X^{1/2} = X.$

6. Let $\lambda_1, \dots, \lambda_n$ be the $n$ positive eigenvalues of $X$. Then,

    $\langle H(X)^{-p/2} e, e\rangle = \operatorname{tr} X^{p/2} I_n X^{p/2} I_n = \operatorname{tr} X^p = \sum_{i=1}^n \lambda_i^p.$

Corollary 2.1. Let $K$ be a self-dual regular open convex cone in a finite-dimensional real vector space $E$ with inner product $\langle\cdot,\cdot\rangle$ and induced norm $\|\cdot\|$. Let $f$ be a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$, and let $g$ and $H$ be the gradient and Hessian of $f$ respectively. If

f satisfies the first three properties listed in Proposition 2.1, then for any $X, Z \in K$ and $S \in K^* = K$,

    $H(H(X)^{1/2}Z) = H(X)^{-1/2} H(Z) H(X)^{-1/2}$    (2.5)

and

    $\|S\|_{-g(X)} = \|X\|_{-g(S)}.$    (2.6)

Proof. Let $X, Z \in K$ be arbitrary. It follows from (2.1) and (2.2) that

    $H(H(X)^{1/2}Z) = H(H(X^{1/2})Z) = H(X^{1/2})^{-1} H(Z) H(X^{1/2})^{-1} = H(X)^{-1/2} H(Z) H(X)^{-1/2}.$

Let $S \in K^* = K$ be arbitrary. Let $W = H(S^{1/2})^{-1}X = H(-g(S^{1/2}))X \in K$ and $Z = H(S^{1/2}) W^{1/2} \in K$. Then, we deduce using (2.1), (2.2) and (2.3) that

    $H(Z)X = H(H(S^{1/2})W^{1/2})X = H(S^{1/2})^{-1} H(W^{1/2}) H(S^{1/2})^{-1} X = H(S)^{-1/2} H(W)^{1/2} W = H(S)^{-1/2} e = S.$

Therefore, we deduce using (1.9) and (2.1) that

    $\|S\|_{-g(X)}^2 = \langle H(X)^{-1} S, S\rangle = \langle H(X)^{-1} H(Z)X, H(Z)X\rangle = \langle (H(Z)^{-1}H(X)H(Z)^{-1})^{-1} X, X\rangle$
    $= \langle H(H(Z)X)^{-1} X, X\rangle = \langle H(S)^{-1} X, X\rangle = \|X\|_{-g(S)}^2.$

2.2 Description of Algorithm

In this subsection, we describe a primal-dual interior-point algorithm for semi-definite programming that uses the concept of second-order cone approximations.

Suppose we have a strictly feasible primal solution $\bar X$, i.e. $\bar X$ is positive definite and feasible for the primal SDP problem. As suggested in Section 1.3, we may use the second-order cone $\hat K_r(\bar X)$ to approximate $K = S^n_{++}$, for any $r \in (0,1)$. Consider the second-order cone approximating problem

    min $\langle \hat S, X\rangle$   s.t. $X \in L + \hat X$, $X \in \operatorname{cl} \hat K_r(\bar X)$    (P′)

and its dual

    min $\langle \hat X, S\rangle$   s.t. $S \in L^\perp + \hat S$, $S \in \operatorname{cl} K^*_r(-g(\bar X))$.    (D′)

By Theorem 1.4,

    $K^*_r(-g(\bar X)) = \{S \in S^n : \langle -g(\bar X), S\rangle_{-g(\bar X)} > \sqrt{n - r^2}\,\|S\|_{-g(\bar X)}\} = \{S \in S^n : \langle \bar X, S\rangle > \sqrt{n - r^2}\,\|S\|_{-g(\bar X)}\}$

and

    $\hat K_r(\bar X) = (K^*_r(-g(\bar X)))^* = \{X \in S^n : \langle \bar X, X\rangle_{\bar X} > r\,\|X\|_{\bar X}\}.$

Since $\bar X \in K \subseteq \hat K_r(\bar X)$, (P′) is strictly feasible. This implies that the set of optimal solutions of (D′) is bounded. However, (D′) may not be feasible, in which case (P′) does not have a finite optimal solution. Clearly, (D′) is feasible if and only if $\bar X$ satisfies the following condition:

    $\exists S \in L^\perp + \hat S$ s.t. $\langle \bar X, S\rangle \ge \sqrt{n - r^2}\,\|S\|_{-g(\bar X)}.$    (2.7)

Similarly, if we start from a strictly feasible dual solution $\bar S$, we would consider the following primal-dual pair of second-order cone approximating problems:

    min $\langle \hat X, S\rangle$   s.t. $S \in L^\perp + \hat S$, $S \in \operatorname{cl} \hat K_r(\bar S)$    (D′′)

    min $\langle \hat S, X\rangle$   s.t. $X \in L + \hat X$, $X \in \operatorname{cl} K_r(-g(\bar S))$.    (P′′)

The problem (P′′) has a non-empty bounded set of optimal solutions if and only if $\bar S$ satisfies the condition

    $\exists X \in L + \hat X$ s.t. $\langle \bar S, X\rangle \ge \sqrt{n - r^2}\,\|X\|_{-g(\bar S)}.$    (2.8)

Suppose the strictly feasible primal solution $\bar X$ satisfies (2.7). Then (D′) has an optimal solution $\bar S$. Since $r < 1$, $\operatorname{cl} K^*_r(-g(\bar X)) \setminus \{0\} \subseteq K^*$. Therefore, $\bar S$ is strictly feasible for (D). Furthermore, by (2.6), it satisfies (2.8) with the choice $X = \bar X$. Thus, (P′′) has an optimal solution. A similar argument shows that this optimal solution is strictly feasible for (P), and satisfies (2.7). We then repeat the process until we arrive at a pair of strictly feasible primal-dual solutions with small duality gap.
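Before the formal listing, here is a compact numerical sketch of this alternation for the simplest instance of the scheme — diagonal matrices, i.e. a linear program over the nonnegative orthant with barrier $-\sum_i \ln x_i$ (so $\vartheta = n$, $H(\bar S)^{-1/2}X$ becomes the componentwise product $\bar s \cdot x$, and the subproblems are solved by the projection-plus-quadratic recipe derived in the next subsection). All function names are ours, and this is an illustrative sketch rather than the paper's implementation:

```python
import numpy as np

def soc_step(w, c_lin, anchor, B, r):
    """Exactly minimize <c_lin, z> over z in anchor + range(B), subject to
    the second-order cone constraint <w, z> >= sqrt(n - r^2) * ||w * z||.
    Follows Section 2.3: the optimum is X(alpha) = alpha*p + q with alpha a
    root of a quadratic, where p, q are projections in the local metric."""
    n = len(w)
    W = w ** 2                                   # local metric weights
    M = B.T @ (W[:, None] * B)
    p = B @ np.linalg.solve(M, B.T @ w)          # projection of -g onto span(B)
    q = anchor - B @ np.linalg.solve(M, B.T @ (W * anchor))
    u, v = w * p, w * q
    k = n - r ** 2
    a1, a2 = w @ p, w @ q
    qa = a1 * a1 - k * (u @ u)
    qb = 2 * (a1 * a2 - k * (u @ v))
    qc = a2 * a2 - k * (v @ v)
    disc = max(qb * qb - 4 * qa * qc, 0.0)
    roots = [(-qb - np.sqrt(disc)) / (2 * qa), (-qb + np.sqrt(disc)) / (2 * qa)]
    cands = [al * p + q for al in roots if (al * a1 + a2) > 0]  # correct branch
    return min(cands, key=lambda z: c_lin @ z)

rng = np.random.default_rng(3)
m, n, r = 3, 6, 0.9
A = rng.normal(size=(m, n))
B_L = np.linalg.svd(A)[2][m:].T     # basis of L = ker A; L-perp = range(A^T)
x_hat = rng.uniform(0.5, 2.0, n)    # strictly feasible primal point
s_hat = 1.0 / x_hat                 # dual point with x*s = 1, so (2.9) holds
x, s = x_hat.copy(), s_hat.copy()

gaps = [x @ s]
while gaps[-1] > 1e-6 * gaps[0] and len(gaps) < 10:
    s = soc_step(x, x_hat, s_hat, A.T, r)   # step (a): dual update
    x = soc_step(s, s_hat, x_hat, B_L, r)   # step (b): primal update
    gaps.append(x @ s)

assert (x > 0).all() and (s > 0).all()      # iterates stay strictly feasible
assert gaps[-1] < gaps[0]                   # the duality gap decreases
print("duality gaps:", [float(f"{g:.3e}") for g in gaps])
```

Both subproblem calls share one routine because the primal and dual steps are mirror images of each other under the local metrics; in the full SDP setting, the componentwise products above become the operators $H(\bar S)^{\pm 1/2}$.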

Second-Order Relaxation Algorithm

Given a strictly feasible primal-dual pair $(X_{in}, S_{in})$ satisfying

    $\langle X_{in}, S_{in}\rangle \ge \sqrt{n - r^2}\,\|X_{in}\|_{-g(S_{in})}$    (2.9)

and $\varepsilon > 0$:

1. Set $k = 0$, $X(0) = X_{in}$ and $S(0) = S_{in}$.

2. While $\langle X(k), S(k)\rangle > \varepsilon\,\langle X_{in}, S_{in}\rangle$,

   (a) Solve for the optimal solution $S(k+1)$ of

       min $\langle \hat X, S\rangle$   s.t. $S \in L^\perp + \hat S$, $S \in \operatorname{cl} K^*_r(-g(X(k)))$.

   (b) Solve for the optimal solution $X(k+1)$ of

       min $\langle \hat S, X\rangle$   s.t. $X \in L + \hat X$, $X \in \operatorname{cl} K_r(-g(S(k+1)))$.

   (c) Set $k = k + 1$.

3. Output $(X_{out}, S_{out}) = (X(k), S(k))$.

2.3 Optimization Over A Second-Order Cone

In this subsection, we deal with the issue of solving the second-order cone programming subproblems in each iteration. Each subproblem takes the form

    min $\langle \hat S, X\rangle$   s.t. $X \in L + \hat X$, $X \in \operatorname{cl} K_r(-g(\bar S))$    (P′)

where $\hat S, \hat X \in S^n$, $L \subseteq S^n$ is a vector subspace, $\bar S \in L^\perp + \hat S$ and

    $K_r(-g(\bar S)) = \{X \in S^n : \langle \bar S, X\rangle^2 > (n - r^2)\,\|H(\bar S)^{-1/2}X\|^2,\ \langle \bar S, X\rangle > 0\}.$

The dual problem is

    min $\langle \hat X, S\rangle$   s.t. $S \in L^\perp + \hat S$, $S \in \operatorname{cl} \hat K_r(\bar S)$    (D′)

where

    $\hat K_r(\bar S) = \{S \in S^n : \langle -g(\bar S), S\rangle^2 > r^2\,\|H(\bar S)^{1/2}S\|^2,\ \langle -g(\bar S), S\rangle > 0\}.$

We maintain the assumption that $\hat X \notin L$ and $\hat S \notin L^\perp$. Since the iterates in the algorithm satisfy (2.7) and (2.8), (P′) is feasible. Furthermore, the strict feasibility of (D′) implies that the set of optimal solutions (also called the optimal set) of (P′) is non-empty and bounded. Since the optimal set is a face of the feasible set $(L + \hat X) \cap \operatorname{cl} K_r(-g(\bar S))$, and proper faces of $\operatorname{cl} K_r(-g(\bar S))$ are extreme rays, the optimal set can only be either the whole feasible set itself, an extreme ray of $\operatorname{cl} K_r(-g(\bar S))$, or an extreme point of the feasible set. In the first case, if the feasible set is not a single point, then $\hat S \in L^\perp$, contradicting our assumption. In the second case, the origin is in the optimal set, implying that $\hat X \in L$, which contradicts our assumption. Hence, (P′) has a unique optimal solution.

Theorem 2.1. The unique optimal solution of (P′) can be obtained exactly.

Proof. The optimal solution $X_{opt}$ of (P′) satisfies the following Fritz John necessary conditions:

    $\mu \hat S - \nu\left(\langle \bar S, X_{opt}\rangle \bar S - (n - r^2)\, H(\bar S)^{-1} X_{opt}\right) - \tau \bar S \in L^\perp$    (2.10a)
    $X_{opt} - \hat X \in L$    (2.10b)
    $\langle \bar S, X_{opt}\rangle^2 - (n - r^2)\,\|H(\bar S)^{-1/2} X_{opt}\|^2 \ge 0$    (2.10c)
    $\langle \bar S, X_{opt}\rangle \ge 0$    (2.10d)
    $\nu\left(\langle \bar S, X_{opt}\rangle^2 - (n - r^2)\,\|H(\bar S)^{-1/2} X_{opt}\|^2\right) = 0$    (2.10e)
    $\tau\,\langle \bar S, X_{opt}\rangle = 0$    (2.10f)
    $\mu, \nu, \tau \ge 0$    (2.10g)
    $(\mu, \nu, \tau) \ne (0, 0, 0)$    (2.10h)

If (2.10d) holds with equality, then it follows from (2.10c) that $X_{opt}$ is the origin. We then conclude from (2.10b) that $\hat X \in L$, contradicting our assumption. Thus, the inequality (2.10d) is strict, and from (2.10f) we have $\tau = 0$. If $\nu = 0$, then we conclude from (2.10h) that $\mu \ne 0$, and from (2.10a) that $\hat S \in L^\perp$, contradicting our assumption. Thus, $\nu > 0$ and, without loss of generality, we may assume $\nu = 1$. By (2.10e), (2.10c) holds with equality. Hence, we can strengthen the conditions to:

    $\mu \hat S - \left(\langle \bar S, X_{opt}\rangle \bar S - (n - r^2)\, H(\bar S)^{-1} X_{opt}\right) \in L^\perp$    (2.11a)
    $X_{opt} - \hat X \in L$    (2.11b)
    $\langle \bar S, X_{opt}\rangle^2 - (n - r^2)\,\|H(\bar S)^{-1/2} X_{opt}\|^2 = 0$    (2.11c)
    $\langle \bar S, X_{opt}\rangle > 0$    (2.11d)
    $\mu \ge 0$    (2.11e)

The condition (2.11a) is equivalent to

    $(\mu - \langle \bar S, X_{opt}\rangle)\bar S + (n - r^2)\, H(\bar S)^{-1} X_{opt} \in L^\perp$
    $\iff (\langle \bar S, X_{opt}\rangle - \mu)(-g(\bar S)) - (n - r^2)\, X_{opt} \in H(\bar S)\, L^\perp = L^{\perp_{-g(\bar S)}}$

where $L^{\perp_{-g(\bar S)}}$ denotes the orthogonal complement of $L$ under the local inner product at $-g(\bar S) \in K$. Therefore, together with (2.11b), we have

    $X_{opt} = \frac{\langle \bar S, X_{opt}\rangle - \mu}{n - r^2}\, \operatorname{Pr}_{L, -g(\bar S)}(-g(\bar S)) + \hat X - \operatorname{Pr}_{L, -g(\bar S)}\hat X$

where $\operatorname{Pr}_{L, -g(\bar S)}$ denotes the orthogonal projection onto $L$ under the local inner product at $-g(\bar S) \in K$. Observe that $\operatorname{Pr}_{L, -g(\bar S)}(-g(\bar S))$ and $\hat X - \operatorname{Pr}_{L, -g(\bar S)}\hat X$ can be obtained by solving for $X$ in

    $X \in L$,   $H(\bar S)^{-1} X - \bar S \in L^\perp$

and

    $X - \hat X \in L$,   $H(\bar S)^{-1} X \in L^\perp$

respectively. Let

    $X(\alpha) = \alpha\, \operatorname{Pr}_{L, -g(\bar S)}(-g(\bar S)) + \hat X - \operatorname{Pr}_{L, -g(\bar S)}\hat X.$

Then $X_{opt} = X(\alpha)$ for some $\alpha \in \mathbb{R}$. By (2.11c), $\alpha$ is a solution of the quadratic equation

    $\langle \bar S, X(\alpha)\rangle^2 - (n - r^2)\,\|H(\bar S)^{-1/2} X(\alpha)\|^2 = 0.$

Consequently, the optimal solution can be obtained analytically.

2.4 Analysis of Algorithm

Consider the rate of decrease of the duality gap at each iteration, i.e.

    $\frac{\langle X(k), S(k)\rangle - \langle X(k+1), S(k+1)\rangle}{\langle X(k), S(k)\rangle}.$

We claim that the rate of decrease is no less than $c_r/\sqrt{n}$ at each iteration, where $c_r$ is some constant depending on $r$.

Theorem 2.2. Suppose $n > 2$. Let $(X(k), S(k))$ be the $k$-th primal-dual pair of iterates of the primal-dual second-order cone approximations algorithm. For all $k \ge 1$,

    $\frac{\langle X(k), S(k)\rangle - \langle X(k+1), S(k+1)\rangle}{\langle X(k), S(k)\rangle} > \frac{c_r}{\sqrt{n}}$

where

    $c_r = \frac{(1-r)^2}{(r+1)(r+3)} \min\left\{ r^2,\ \frac{1}{3\sqrt{(r+1)(r+3)}} \right\} > 0.$

Proof. Fix a k 1. To simplify notations, let X = X(k), S = S(k), X = X(k + 1) and S = S(k + 1). Clearly, X, S X, S X, S For each X S n ++ and r (0, 1), let Q(X, r) : S n S n be defined by Q(X, r)u, V = X, U X, V (n r ) H(X) 1 U, V Consider the necessary optimality conditions (.11a), (.11b), (.11c) and (.11e) at S and X. and µ ˆX Q( X, r) S L, S Ŝ L S, Q( X, r) S = 0 and µ 0 νŝ Q( S, r) X L, X ˆX L X, Q( S, r) X = 0 and ν 0 and From these conditions, we deduce that Q( X, r) S µ X, S S = 0 = µ X, S S = S, Q( X, r) S (.1) Q( S, r) X ν S, X X = 0 = ν X X, S = X, Q( S, r) X (.13) Q( X, r) S µ X, Q( S, r) X ν S = 0 = µν X, S = Q( X, r) S, Q( S, r) X (.14) From these equations, we deduce the following strict inequalities. We prove the inequalities in Appendix A. Lemma.1. The following inequalities hold. and X, Q( S, r) X > r (1 r) n(r + 1)(r + 3) X, S X, S (.15) S, Q( X, r) S > r (1 r) n(r + 1)(r + 3) X, S X, S (.16) Q( X, r r) S, Q( S, r) X (r + 1)(r + 3) < n X, S X, S (.17) We now consider two cases. Case 1 : µ X, S n. Using (.1) and (.16), we have X, S X, S S µ X, S S n = S, Q( X, r) S > r (1 r) n(r + 1)(r + 3) X, S X, S 16

Rearranging the terms, we have r (1 r) n(r + 1)(r + 3) < X, S X, S X, S < X, S X, S X, S Case : µ > X, S n. In this case, r (r + 1)(r + 3) n X, S X, S X X, S > Q( X, r) S, Q( S, r) X X X, S by (.17) =µν X, S X X, S by (.14) =µ X, Q( S, r) X X, S by (.13) > X, S r (1 r) n n(r + 1)(r + 3) X, S X, S r (1 r) = n n(r + 1)(r + 3) X, S X, S Rearranging the terms, we have Hence, X, S X, S X, S > (1 r) (r + 1) 3/ (r + 3) 3/ n by (.15) X, S X, S X, S = 1 X, S X, S 1 X, S X, S = X, S X, S X, S = 1 X, S X, S X, S + 1 1 > (r+1) 3/ (r+3) 3/ n + 1 (1 r) (1 r) = (1 r) + (r + 1) 3/ (r + 3) 3/ n (1 r) > 3(r + 1) 3/ (r + 3) 3/ n By combining both cases, we conclude that X, S X, S X, S > c r n 17

where { } r (1 r) c r = min (r + 1)(r + 3), (1 r) 3(r + 1) 3/ (r + 3) { 3/ } (1 r) r = (r + 1)(r + 3) min, 1 3 (r + 1)(r + 3) > 0 Once we have a lower bound on the rate of decrease that is inversely proportional to some polynomial in the dimension of the problem, we can invoke the following theorem to conclude that the algorithm is polynomial-time. Theorem.3. Let ε (0, 1), δ > 0 and ω > 0 be given. Suppose that a sequence of real numbers {µ k } k=1 satisfies ( µ k+1 1 δ ) µ n ω k for all k = 1,,.... Then there exists an index K with K = O(n ω ln 1 ε ) such that µ k εµ 1 Proof. See [9] Theorem 3.. Corollary.. Fix r (0, 1). Given any semi-definite programming problem over symmetric n-by-n matrices and a pair of strictly feasible primal-dual solutions (X in, S in ) satisfying (.9), the primal-dual second-order cone approximations algorithm requires at most O( n ln 1 ε ) iterations to produce a strictly feasible primal-dual pair (Xout, S out ) satisfying X out, S out ε X in, S in Proof. Apply Theorem.3 with δ = c r, ω = 1 and µ k = X(k), S(k), k = 1,,.... 3 Extension To Symmetric Programming At this point, we note that the development of the primal-dual second-order cone approximations algorithm and its analysis depend on the standard logarithmic barrier only through its properties listed in Proposition.1 and the fact that it is logarithmically homogeneous and self-concordant. Although these properties is easily proven for standard logarithmic barriers for positive definite cones, they are not specific to these barriers. Indeed, the standard logarithmic barriers for symmetric cones also possess these properties. This allows for 18

the extension of the primal-dual second-order cone approximations algorithm to symmetric cone programming. Symmetric cones are self-dual homogeneous cones. A regular open convex cone K in a finite-dimensional real vector space E with inner product, is homogeneous if the group of linear automorphisms of K acts transitively on it, i.e. for every x, y K, there exists a linear map A L[E, E] such that AK = K and Ax = y. The cone K is symmetric if it is self-dual (i.e. K = K ) and homogeneous. The class of symmetric cones consist of the following five classes of cones, and their direct sums. 1. The class of second-order cones.. The class of cones of real symmetric positive definite matrices. 3. The class of cones of complex Hermitian positive definite matrices. 4. The class of cones of Hermitian positive definite quaternion matrices. 5. An exceptional 7-dimensional cone. Thus, symmetric cone programming includes linear programming, semi-definite programming and second-order cone programming. Symmetric cones can also be characterized as cones of squares of Euclidean Jordan algebras. With this characterization, we define the standard logarithmic barriers for symmetric cones and show that these barriers possess the properties listed in Proposition.1. In the next subsection, we give a brief description of Euclidean Jordan algebras and present the minimal properties that are useful for the purpose of this paper. 3.1 Symmetric Cones And Euclidean Jordan Algebras An excellent exposition on the relation between symmetric cones and Euclidean Jordan algebras can be found in the book by Faraut and Korányi [4]. In this subsection, we review concepts in Euclidean Jordan algebras that are necessary for the purpose of this paper. Interested readers are referred to the second and third chapters of [4] for a more complete discussion on the theory of Euclidean Jordan algebras. 
A (finite-dimensional) vector space J over the reals is called a (finite-dimensional) algebra over the reals if a bilinear map : J J J, called the product, is defined. We denote an algebra by (J, ). The algebras (J 1, 1 ) and (J, ) are said to be isomorphic if there exists a isomorphism between the vector spaces J 1 and J that preserves multiplication. Since is bilinear, for each a J, there exists a linear map L(x) : J J such that L(x)y = x y. For each a J, the quadratic representation of a is P (a) := L (a) L(a ) A Jordan algebra is a finite-dimensional algebra (J, ) over the reals such that for all a, b J, 19

1. a b = b a, and. a (a b) = a (a b), where a = a a. For the sake of simplicity, we denote a Jordan algebra by J when the product is understood from the context. For any a J and each p = 1,,..., we define, inductively, a p+1 := a a p, with a 1 = a. Although the operator is non-associative for Jordan algebras in general, it is, nonetheless, power associative, i.e. a p+q = a p a q for all a J and all positive integers p and q (see [4], Theorem II.1.). The Jordan algebra J is said to be Euclidean if there exists a symmetric positive definite bilinear functional Q : J J R that is associative with respect to, i.e. for all a, b, c J, Q(a b, c) = Q(a, b c) It is known that all Euclidean Jordan algebras have an identity element 1 J such that for all a J, a 1 = 1 a = a. Clearly, the identity element is unique. Henceforth, let J be a Euclidean Jordan algebra with identity element 1. The cone of squares of J is the set {a : a J} Its interior is denoted by K(J). The next theorem relates symmetric cones with Euclidean Jordan algebras. Theorem 3.1. A cone is symmetric if and only if it is the interior of the cone of squares of a Euclidean Jordan algebra. Furthermore, the Euclidean Jordan algebra is unique up to isomorphism. Proof. See [4], Theorems III..1 and III.3.1. For each a J, let r be the least positive integer for which {e, a,..., a r } is a linearly dependent set. The integer r is called the degree of a, and it is denoted by deg(a). The rank of J, denoted by rk(j), is the greatest degree among all its elements. An element a J is called an idempotent if a = a. An idempotent is called primitive if it is not the sum of two other idempotents. A set of idempotents {c 1,..., c k } is called a complete system of orthogonal idempotents if c i c j is the zero element of J for all i j, and c 1 + + c k = e. A complete system of orthogonal primitive idempotents is called a Jordan frame. In Jordan frames, k = rk(j). Theorem 3. (Spectral Decomposition). 
Let J be an Euclidean Jordan algebra with rank r. Then, for each a J, there exists a Jordan frame {c 1,..., c r } and real numbers λ 1... λ r such that a = λ 1 c 1 + + λ r c r Furthermore, the λ i s are unique. 0

Proof. See [4], Theorem III.1.. The real numbers λ 1... λ r in a spectral decomposition of a are called the eigenvalues of a. We denote the eigenvalues of a by λ 1 (a) λ r (a). For each a J with a spectral decomposition λ 1 (a)c 1 + + λ r (a)c r, and each real number t, we define the element a t by a t := λ 1 (a) t c 1 + + λ k (a) t c k whenever λ 1 (a) t,..., λ k (a) t are all well-defined, and undefined otherwise. It should be noted that the notation a t is consistent with the previous notation a p when t is a positive integer. Theorem 3.3. Let J be an Euclidean Jordan algebra with rank r. The cone of squares of J is given by K(J) = {a J : λ i (a) > 0, i = 1,..., r} = {a J : λ r (a) > 0} Proof. This follows easily from λ i (a ) = λ i (a). The trace of a J, denoted by tr a, is the sum λ 1 (a) + + λ r (a). It is a linear functional on J. By definition, (a, b) a b is symmetric and bilinear. Therefore, we conclude that (a, b) tr a b is a symmetric bilinear functional. Furthermore, it follows from Theorem 3.3 that tr a 0 for all a J, and equality holds only when a is the zero element. Thus, the functional, : J J R defined by, : (a, b) tr a b is an inner product on J. It is called the trace inner product of the Euclidean Jordan algebra J. Under the trace inner product, the linear map L(a) is self-adjoint for all a J (see [4], Theorem II.4.3). Hence the quadratic representation P (a) is self-adjoint for all a J. The determinant of a J, denoted by det a, is the product λ 1 (a) λ r (a). Clearly, det a t = (det a) t whenever a t is defined. The next theorem provides a list of properties of the quadratic representation that are useful for the purpose of this paper. Theorem 3.4. Let J be an Euclidean Jordan algebra. The following are true. 1. For any a K(J), P (a) is self-adjoint and positive definite.. For invertible a J, 3. For any a J, P (a)k(j) = K(J) (3.1) P (a)1 = a (3.) 1

4. For any a K(J) and any rational t, P (a t ) = P (a) t (3.3) 5. For any a, b J, 6. For any a, b J, 7. For any a, b J, P (P (a)b) = P (a)p (b)p (a) (3.4) tr P (a)b = tr a b (3.5) det P (a)b = det a det b (3.6) Proof. Let a, b J be arbitrary. 1. Since L(a) is self-adjoint for all a J, so is P (a). From [4], Theorems II.3.1 and III.., we see that the set {P (a) : a K(J)} is the connected component of the identity map in the set {P (a) : a J, P (a) invertible}. Thus, P (a) is positive definite for any a K(J).. See [4], Theorem III... 3. By definition, P (a)1 = L(a) 1 L(a )1 = a (a 1) (a a) 1 = a 4. From [5], Chapter IV, Theorem 1 and the paragraph following it, we have that P (a) p = P (a p ) for any positive integer p. It then follows that for any positive integer p, P (a) = P ((a 1/p ) p ) = P (a 1/p ) p = P (a) 1/p = P (a 1/p ) for any a K(J). Consequently, for any a K(J) and any positive integers p and q, P (a p/q ) = P ((a 1/q ) p ) = P (a 1/q ) p = P (a) p/q Finally, it follows from [4], Theorem II.3.1 that 5. See [4], Theorem II.3.. 6. By definition, P (a p/q ) = P ((a p/q ) 1 ) = P (a p/q ) 1 = P (a) p/q tr P (a)b = a, L(a)b tr a b = L(a)a, b tr a b = tr a b 7. See [4], Theorem III.4..

3. Primal-Dual Second-Order Cone Approximations Algorithm For Symmetric Cones Let K be a symmetric cone, and J be a Euclidean Jordan algebra such that K = K(J). The standard logarithmic barrier for K is the functional f : K(J) R : a ln det(a) Since J is unique up to isomorphism, the standard logarithmic barrier is well-defined. The standard logarithmic barrier is a r-logarithmically homogeneous self-concordant barrier for K(J), where r = rk(j). Under the trace inner product, the gradient and Hessian of the standard logarithmic barrier are g(a) = a 1 and H(a) = P (a 1 ) = P (a) 1 (see [4], Theorems III.4. and II.3.3). In this subsection, we show that the standard logarithmic barrier of a symmetric cone possesses the properties listed in Proposition.1. This means that the primal-dual secondorder cone approximations algorithm and its analysis applies to symmetric cones with their standard logarithmic barriers. Proposition 3.1. Let f be the standard logarithmic barrier for the symmetric cone K(J) and let e be the unique fixed point of the duality map. The following are true. 1. The gradient and Hessian of the conjugate barrier of f are respectively identical to the gradient g and Hessian H of f.. The duality map a g(a) has a unique fixed point e K(J). 3. For any a, b K(J), H(a)b K(J) and H(H(a)b) = H(a) 1 H(b)H(a) 1 4. For any a K(J), there exists a 1/ K(J) such that H(a 1/ ) = H(a) 1/ 5. For any a K(J), there exists λ 1,..., λ n > 0 such that H(a) p/ e, e = r λ p i for all p = 1,,.... 6. For any a K(J), H(a) 1/ e = a 3

Proof. Let a, b K(J) be arbitrary. 1. From (1.7), we see that f (a) = f ( g(a 1 )) = f(a 1 ) r = ln det a 1 r = ln(det a) 1 r = ln det a r = f(a) r. Since g(a) = a 1, it is clear that the identity 1 is the unique fixed point e of the duality map. 3. It follows from (3.1) that From (3.4), we deduce that H(a)b = P (a 1 )b K(J) = K(J) H(H(a)b) = P (P (a 1 )b) 1 = (P (a 1 )P (b)p (a 1 )) 1 = P (a 1 ) 1 P (b) 1 P (a 1 ) 1 = H(a) 1 H(b)H(a) 1 4. Let a 1/ = a 1/. Using (3.3), we have H(a 1/ ) = P (a 1/ ) 1 = P (a) 1/ = H(a) 1/ 5. For i = 1,..., n, let λ i = λ i (a) > 0, the eigenvalues of a. It follows from (3.) and (3.3) that for all p = 1,,..., H(a) p/ e, e = P (a) p/ 1, 1 = P (a p/ )1, 1 = a p, 1 = tr a p = r λ i (a) p = r λ p i 6. It follows from (3.) and (3.3) that H(a) 1/ e = P (a) 1/ 1 = P (a 1/ )1 = (a 1/ ) = a 4

4 Acknowledgements The author thanks Professor James Renegar for his guidance and encouragements, and for his invaluable suggestions on the presentation of this paper. References [1] Farid Alizadeh, Stefan Schmieta; Potential reduction methods for symmetric cone programming, Technical Report RRR0-99, RUTCOR, Rutgers University, New Brunswick, NJ, September 1999 [] Aharon Ben-Tal, Arkadi Nemirovski; Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, Mps-Siam Series on Optimization, SIAM Publication, Philadelphia, PA, NY, 001. [3] Leonid Faybusovich; Linear systems in Jordan algebras and primal-dual interior-point algorithms, Journal of Computational and Applied Mathematics 86 (1997), 149-175 [4] Jacques Faraut, Adam Korányi; Analysis on Symmetric Cones, Oxford Press, NY, USA, 1994. [5] Max Koecher; The Minnesota Notes on Jordan Algebras and their Applications, Springer-Verlag, Berlin-Heidelberg-New York, 1999. [6] Yurii E. Nesterov, Arkadii S. Nemirovskii; Interior point polynomial algorithms in convex programming, SIAM Publication, Philidelphia, Pennsylvania (1994) [7] Yurii E. Nesterov, Michael J. Todd; Self-scaled barriers and interior-point methods for convex programming, Mathematics of Operations Research, (1997), 1-46. [8] Yurii E. Nesterov, Michael J. Todd; Primal-dual interior-point methods for self-scaled cones, SIAM Journal on Optimization, 8 (1998), 34-364. [9] Stephen J. Wright; Primal-Dual Interior-Point Methods, SIAM Publication, Philadelphia, PA, 1997. A Technical Lemmas We need the following two technical lemmas for the proof of Lemma.1. Lemma A.1. If λ 1,..., λ m > 0 (m > 1) satisfies ( m λ i) = (m r ) m λ i for some r (0, 1), then m (m r ) λ 3 i (1 + r m r4 m + r3 5 )( (m ) m r m ) 3 λ m i m 1

and m (m r ) λ 3 i (1 + r m r4 m r3 )( (m ) m r m ) 3 λ m i m 1 Proof. Let D := {x R m : m λ i = 1, ( m λ i) = (m r ) m λ i }. Due to homogeneity of the inequalities, it suffices to show that (m r ) max{ λ 3 i : x D} 1 + r m r4 m + r3 (m ) m r m m 1 and (m r ) min{ λ 3 i : x D} 1 + r m r4 m r3 (m ) m r m m 1 Both optimization problems have the same Karush-Kuhn-Tucker optimality conditions given below. ( 3λ µe ν λ i = 1 and ) λ i (m r )λ = 0 ( m ) λ i = (m r ) x where λ = (λ 1,..., λ m ). Suppose λ = ( λ 1,..., λ m ) satisfies these optimality conditions. The first condition implies that components of x takes only two possible values, since they are roots of a quadratic polynomial. Suppose that p components of λ take the value of α and m p components take the value of β, with α > β > 0. Since r > 0, we deduce from the last Karush-Kuhn-Tucker condition that 0 < p < m. Now, Solving for α, we get ( m ) ( m λ i = (m r ) = (pα + (m p)β) = (m r )(pα + (m p)β ) α = p(m p)β ± p (m p) β p(m p)(p r )(m p r )β p(m p r ) = p(m p) ± r p(m p)(m r ) β p(m p r ) It follows from α > β that α = p(m p) + r p(m p)(m r ) β p(m p r ) 6 λ i )

The condition m λ i = 1 implies that pα + qβ = 1 = β = m p r (m r )(m p) + r p(m p)(m r ) and hence The lemma then follows from α = 1 m + r m p m p(m r ) m (m r ) λ 3 i = (m r ) (pα 3 + qβ 3 ) = 1 + r m r4 m r3 (p m) m r m p(m p) m r 1 + r m r4 m + r3 (m ) m m 1 where equality holds when p = 1, and m (m r ) λ 3 i = 1 + r m r4 m + r3 (m p) m r m p(m p) m r 1 + r m r4 m r3 (m ) m m 1 where equality holds when p = m 1. Lemma A.. If λ 1,..., λ m > 0 (m > 1) satisfies ( m λ i) = (m r ) m λ i for some r (0, 1), then m (m r ) 3 λ 4 i ( )( 1 + r (r + 1)(r + 3) m ) 4 λ i m Proof. Let D := {x R m : m λ i = 1, ( m λ i) = (m r ) m λ i }. Due to homogeneity of the inequalities, it suffices to show that (m r ) 3 max{ λ 4 i : x D} ( ) 1 + r (r + 1)(r + 3) m 7

Let λ = ( λ 1,..., λ m ) be the global maximum. λ satisfies the following Karush-Kuhn- Tucker optimality conditions. ( ) 4λ 3 µe ν λ i (m r )x = 0 λ i = 1 and x, e = (m r ) where λ = (λ 1,..., λ m ). The inner product of the first condition with λ gives (( m ) 4 λ 4 i µ λ i ν λ i (m r ) which implies that µ = 4 m λ 4 i under the last two conditions. The inner product of the first condition with e gives 4 λ 3 i mµ νr m λ i = 0 λ i λ i ) = 0 which implies that ν = 1 r (4 m λ 3 i mµ) = 4 r ( m λ 3 i m m λ 4 i ). Therefore, Now, = ν + λ = 4 m λ 3 i mµ r + µ = 4 m λ 3 i (m r )µ r = 4 [ m λ 3 r i (m r ) ( m ) ( m λ 3 i ( m = λ 4 i λ 4 i )( m )( m ( m = (m r ) ( m (m r ) ( m λ 3 i (m r ) 8 λ i λ i λ 4 i λ 4 i λ 4 i ) λ 4 i ] )( m ) λ i )( m )( m ) λ i λ 3 i ) )( m ) λ i

where the first two inequalities follow from the Cauchy-Schwartz inequality. Thus, ν +µ 0. This means that at least one root of the cubic polynomial t 4t 3 µ ν( (m r )t) is non-positive. It follows that the components of λ, which are positive roots of this cubic polynomial, can only take the values of the other two roots. Thus, we conclude, as in the previous proof, that p of the components of λ take the value of α = 1 m + r m p m p(m r ) and m p of the components take the value of β = m p r (m r )(m p) + r p(m p)(m r ) where 0 < p < m and α > β. The greatest root of the cubic polynomial is then α = 1 m + r m p m p(m r ) 1 m + r m 1 m m r where equality holds at p = 1. Hence, at z := 1 + r m 1 m m, the cubic polynomial must have a non-negative value, for m r otherwise there will a root greater than z. So, = 0 4z 3 µ ν(1 (m r )z) = 4z 3 4 λ 4 i 4 (4 λ 3 r i m = 4 r (r z 3 (1 (m r )z) 4 r (r + m(m r )z m) λ 4 i r z 3 + ((m r )z 1) m λ 3 i r + m(m r )z m r z 3 + ((m r )z 1)U (m r )(mz 1) (m ) m r ) m m 1 where U = ( 1 + r r4 + r3 m m second inequality in Lemma A.1 and λ 3 i λ 4 i ) λ 4 i ) (1 (m r )z) 1. The last inequality follows from the (m r ) (m r )z 1 = r m ( (m 1)(m r ) 1) > 0 9

Finally, it can be shown that (m r ) 3 (r z 3 + ((m r )z 1)U (m r )(nz 1) =1 + 4r3 (m )(m r ) m r m 3 m 1 + 3r m + r4 (m 1m + 1) r6 (m 8m + 8) m (m 1) m 3 (m 1) 1 + 4r3 m + 3r m + r4 m =1 + r (r + 1)(r + 3) m We are now ready to prove the main lemma. Proof of Lemma.1. From the necessary optimality conditions (.11a), (.11b), (.11c) and (.11e) at S and X, we deduce that 0 = S, Q( X, r) S = X, S n r = S g( X) = X g( S) (A.1) and 0 = X, Q( S, r) X = X, S n r = X g( S) = S g( X) (A.) Since X, S, µ and ν are non-negative, we deduce from (.14) that for U = H( S) 1/ X, V = H( S) 1/ X and D = e, V V (n r )H(V ) 1 e, where e is the unique fixed point of the duality map, 0 Q( X, r) S, Q( S, r) X = (n r ) Q( X, r) S, H( S) 1 X = (n r )( X, S X, H( S) 1 X (n r ) H( X) 1 S, H( S) 1 X ) = (n r ) X, H( S) 1/ e H( S) 1/ X, U (n r ) H( S) 1/ H( X) 1 H( S) 1/ e, U by (.3) = (n r ) e, V V, U (n r ) (H( S) 1/ H( X)H( S) 1/ ) 1 e, U = (n r ) e, V V, U (n r ) H(H( S) 1/ X) 1 e, U by (.1) = (n r ) D, U By (.4), there exists λ 1,..., λ n > 0 such that for all p = 1,,..., H(V ) p/ e, e = n λ p i (A.3) 30

From (1.9), (.3), (A.1) and (A.3) we see that ( n ) λ i = H(V ) 1/ e, e = V, e = X, H( S) 1/ e = X, S = (n r ) X = (n r g( S) ) X, H( g( S)) X = (n r ) V, V = (n r ) H(V ) 1/ e, H(V ) 1/ e n = (n r ) H(V ) 1 e, e = (n r ) Thus, the λ i s satisfies the conditions for Lemmas A.1 and A.. Therefore, using (.3), Lemma A.1, (A.1) and (A.3), we deduce that λ i D, V = e, V V (n r ) H(V ) 1 e, V = 1 n r e, V 3 (n r ) H(V ) 3/ e, e = 1 ( n ) 3 ( n ) λ n r i (n r ) λ 3 i 1 n r ( n = r a(n, r) n(n r ) ) 3 ( λ i r n + r4 n ( n ) 3 λ i ) (n ) n r n n 1 + r3 where a(n, r) = 1 r n r n (n ) n r n 1. Since r (0, 1) and n >, it follows that (r 1) > 0 = r > 1 r = n r n = 1 + r n > 1 + 1 r n 1 > n r n 1 = n > r + (n ) n r n 1 = r > r r(n ) n r + n n n 1 = a(n, r) > 1 r Therefore, ( n ) 3 D, V < r (1 r) λ n(n r i 0 ) 31

Thus, D is not the origin. Let H + be the half-space {Z S n : D, Z 0}. It follows from (n r ) D, U 0 that U H +. Using (.3), Lemma A.1, Lemma A., (A.1) and (A.3), we deduce that D = e, V V (n r ) e, V V, H(V ) 1 e + (n r) H(V ) 1 e = 1 n r e, V 4 (n r ) e, V e, H(V ) 3/ e + (n r) e, H(V ) e = 1 ( n ) 4 ( n )( n ) ( n ) λ n r i (n r ) λ i λ 3 i + (n r) λ 4 i 1 n r + 1 n r ( n = r (r + 1)(r + 3) n(n r ) < r (r + 1)(r + 3) n(n r ) λ i ) 4 n r (1 + r a(n, r)) ( 1 + r (r + 1)(r + 3) n ( n ( n ) 4 λ i )( n ) 4 λ i ( n ) 4 λ i λ i ) 4 n r r a(n, r) ( n ) 4 λ i (A.4) where the last inequality follows from r a(n, r) > r (1 r) > 0. Hence, the distance from V to the half-space V H+ is where b r = D, V D V = D, V D V > r(1 r) > 0. (r+1)(r+3) r (1 r) ( n n(n r ) λ i) 3 ( r (r+1)(r+3) ( )( n n(n r ) λ i) 1 ( ) = b r n n r λ n i) Since U U H+, it follows that the distance between Therefore, which implies that b r n < V V U V, U U = V U = (n r ) V, U e, V e, U U and U = (n r ) H( S) 1 X, X X, S X, S by (.3) V V by (A.1) and (A.) is greater than br n. b r n X, S X, S < X, S X, S (n r ) H( S) 1 X, X = X, Q( S, r) X (A.5) 3

This proves (.15). We can prove (.16) in a similar way. For (.17), we have 1 n r Q( X, r) S, Q( S, r) X = D, U U, e = D, U n e + e, V V (n r )H(V ) 1 U, e e, n e U, e U, e = D, U e + n n ( e, V (n r ) V ) U, e D U n e ( n ) r (r + 1)(r + 3) U, < λ i U e U, e + n(n r ) n n r (r + 1)(r + 3) = V, e X, S n(n r ) n r X, S n = r (r + 1)(r + 3) n(n r ) X, S X, S 33