© 2008 Society for Industrial and Applied Mathematics


SIAM J. OPTIM. Vol. 19, No. 3 © 2008 Society for Industrial and Applied Mathematics

ON THE SECOND-ORDER FEASIBILITY CONE: PRIMAL-DUAL REPRESENTATION AND EFFICIENT PROJECTION

ALEXANDRE BELLONI AND ROBERT M. FREUND

Abstract. We study the second-order feasibility cone F = {y ∈ R^n : ‖My‖ ≤ g^T y} for given data (M, g). We construct a new representation for this cone and its dual based on the spectral decomposition of the matrix M^T M − gg^T. This representation is used to efficiently solve the problem of projecting an arbitrary point x ∈ R^n onto F: min_y {‖y − x‖ : ‖My‖ ≤ g^T y}, which aside from theoretical interest also arises as a necessary subroutine in the rescaled perceptron algorithm. We develop a method for solving the projection problem to an accuracy ε, whose computational complexity is bounded by O(mn^2 + n ln ln(1/ε) + n ln ln(1/min{width(F), width(F*)})) operations. Here width(F) and width(F*) denote the width of F and F*, respectively. We also perform computational tests that indicate that the method is extremely efficient in practice.

Key words. second-order cone, convex cone, projection, computational complexity, Newton method

AMS subject classifications. 90C60, 90C5, 90C5, 49M5, 49M9

DOI. 10.1137/…X

1. Introduction and main results. Our notation is as follows: let K* denote the dual of a convex cone K ⊆ R^k, i.e., K* := {z ∈ R^k : z^T y ≥ 0 for all y ∈ K}. A convex cone K is regular if it is closed, has nonempty interior, and contains no lines, in which case K* is also regular; see Rockafellar [7]. Define the standard second-order cone in R^k to be Q^k := {y ∈ R^k : ‖(y_1, ..., y_{k−1})‖ ≤ y_k}, where ‖·‖ denotes the Euclidean norm. Let B(y, r) denote the Euclidean ball of radius r centered at y. Given data (M, g) ∈ (R^{m×n}, R^n), our interest lies in the second-order feasibility cone

F := {y ∈ R^n : ‖My‖ ≤ g^T y} = {y ∈ R^n : (My, g^T y) ∈ Q^{m+1}}

and its dual cone F*. We first take care of some trivial cases.
When rank(m) = 0, it follows trivially that F = {y R n : g T y 0}, wherebyf is either a half-space or all of R n, depending on whether g 0org = 0, respectively. When rank(m) =,M = fc T for some f,c, and My = f c T y for any y. ThisimpliesthatF = {y R n :(g f c) T y 0, (g + f c) T y 0}, and hence F is the intersection of either one or two half-spaces. We dispose of these trivial cases by making the following assumption about the data. Assumption. rank(m) andg 0. We now describe our main representation result for F and F.Itiselementaryto establish that M T M gg T has at most one negative eigenvalue, and we can write its eigendecomposition as M T M gg T = QDQ T,whereQ is orthonormal (Q = Q T ) and D is the diagonal matrix of eigenvalues. For notational convenience we denote D i and Q i as the ith diagonal component of D and the ith column of Q, respectively. By Received by the editors October 0, 006; accepted for publication (in revised form) May 7, 008; published electronically October 3, 008. This research has been partially supported through the Singapore-MIT Alliance and an IBM Ph.D. Fellowship. Fuqua School of Business, Duke University, Durham, NC (abn5@duke.edu). MIT Sloan School of Management, Cambridge, MA 04 (rfreund@mit.edu). 073

reordering the columns of Q, we can presume that D_1 ≥ ... ≥ D_n and D_1, ..., D_{n−1} ≥ 0. By choosing either Q_n or −Q_n, we can further presume that g^T Q_n ≥ 0. We implicitly assume Q and D can be computed to within machine precision (in the relative sense) in O(mn^2) operations, consistent with computational practice. Our interest lies mainly in the case when F is a regular cone, so we will hypothesize that F is a regular cone for the remainder of this section. This hypothesis implies that n − 1 ≤ rank(M) ≤ min{n, m}. (We indicate how to amend our results and proofs to relax this hypothesis at the end of sections 2 and 3.) Our main representation result is as follows.

Theorem 1. Suppose that F is a regular cone. Then D_1, ..., D_{n−1} > 0 > D_n, and
(i) F = {y : y^T QDQ^T y ≤ 0, y^T Q_n ≥ 0};
(ii) F* = {z : z^T QD^{−1}Q^T z ≤ 0, z^T Q_n ≥ 0};
(iii) if y ∈ F and α ≥ 0, then z := −αQDQ^T y ∈ F*. Furthermore, if y ∈ ∂F, then z ∈ ∂F* and z^T y = 0;
(iv) if z ∈ F* and α ≥ 0, then y := −αQD^{−1}Q^T z ∈ F. Furthermore, if z ∈ ∂F*, then y ∈ ∂F and z^T y = 0.

Note that (i) and (ii) of Theorem 1 describe easily computable representations of F and F* that have the same computational structure, in that checking membership in each cone uses similar data, operations, etc., in a manner that is symmetric between the dual cones. Parts (iii) and (iv) indicate that the same matrices in (i) and (ii) can be used constructively to map points on the boundary of one cone to their orthogonal counterpart in the dual cone.

Remark 1 (geometry of F and F*). Examining (i) and the property that D_n < 0, the orthonormal transformation y → s := Q^T y maps F onto the axes-aligned ellipsoidal cone S := {s ∈ R^n : √(Σ_{i=1}^{n−1} D_i s_i^2) ≤ √|D_n| s_n}, so that F is the image of S under Q, F = {y : √(Σ_{i=1}^{n−1} D_i (Q_i^T y)^2) ≤ √|D_n| Q_n^T y}, and F* = {z : √(Σ_{i=1}^{n−1} (1/D_i)(Q_i^T z)^2) ≤ (1/√|D_n|) Q_n^T z}. This establishes that F is indeed simply an ellipsoidal cone whose axes are the eigenvectors of Q with dilations corresponding to the eigenvalues of M^T M − gg^T.
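The conventions just described (eigenvalues sorted in decreasing order, at most one negative eigenvalue, the sign of Q_n chosen so that g^T Q_n ≥ 0) and the width formulas stated later in this section can be sketched in NumPy. This is an illustrative reconstruction on assumed random data, not the authors' code (their experiments used MatLab):

```python
import numpy as np

def cone_decomposition(M, g):
    """Eigendecomposition M^T M - g g^T = Q D Q^T with the paper's
    conventions: D_1 >= ... >= D_n and g^T Q_n >= 0 (sketch)."""
    A = M.T @ M - np.outer(g, g)
    vals, vecs = np.linalg.eigh(A)       # eigh returns ascending eigenvalues
    D = vals[::-1]                       # reorder to descending
    Q = vecs[:, ::-1].copy()
    if g @ Q[:, -1] < 0:                 # flip the sign of the last column Q_n
        Q[:, -1] *= -1.0
    return Q, D

def widths(D):
    """Widths of F and F* from the eigenvalues, in the regular case
    D_{n-1} > 0 > D_n, following the formulas quoted in this section."""
    aDn = abs(D[-1])
    tau_F = np.sqrt(aDn / (aDn + D[0]))
    tau_Fstar = np.sqrt((1.0 / aDn) / (1.0 / aDn + 1.0 / D[-2]))
    return tau_F, tau_Fstar

# assumed random instance, dimensions (m, n) = (5, 3)
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 3))
g = 3.0 * rng.standard_normal(3)
Q, D = cone_decomposition(M, g)
```

For the circular cone {y : y_1^2 + y_2^2 ≤ 3 y_3^2} (eigenvalues 1, 1, −3), the width formulas give τ_F = √3/2 and τ_F* = 1/2, matching the half-angles of 60° and 30° of that cone and its dual.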
From this perspective, the representation of F* via (ii) makes natural geometric sense. Also, the central axis of both F and F* is the ray {αQ_n : α ≥ 0}. Last of all, note that ∂F = {y : y^T QDQ^T y = 0, y^T Q_n ≥ 0} and ∂F* = {z : z^T QD^{−1}Q^T z = 0, z^T Q_n ≥ 0}.

It turns out that the eigendecomposition of M^T M − gg^T = QDQ^T, while useful both conceptually and algorithmically (as we shall see), is not even necessary for the above representation of F and F*. Indeed, Theorem 1 can alternatively be stated replacing QDQ^T and QD^{−1}Q^T by M^T M − gg^T and (M^T M − gg^T)^{−1}. Under the further hypothesis that rank(M) = n, the theorem can be restated as follows.

Corollary 1. Suppose that F is a regular cone and rank(M) = n. Then
(i) F = {y : √(y^T (M^T M) y) ≤ g^T y};
(ii) F* = {z : √(z^T (M^T M)^{−1} z) ≤ g^T (M^T M)^{−1} z / √(g^T (M^T M)^{−1} g − 1)};
(iii) if y ∈ F and α ≥ 0, then z := −α(M^T M − gg^T) y ∈ F*. Furthermore, if y ∈ ∂F, then z ∈ ∂F* and z^T y = 0;
(iv) if z ∈ F* and α ≥ 0, then y := −α[(M^T M)^{−1} − (M^T M)^{−1} gg^T (M^T M)^{−1} / (g^T (M^T M)^{−1} g − 1)] z ∈ F. Furthermore, if z ∈ ∂F*, then y ∈ ∂F and z^T y = 0.

The proofs of Theorem 1 and Corollary 1 are presented in section 2, along with proofs that all of the stated quantities are well defined: in particular, D^{−1} exists and g^T (M^T M)^{−1} g − 1 > 0 under the given hypotheses.
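Parts (i) and (ii) of Theorem 1 give inexpensive membership tests once Q and D are at hand. The following self-contained check, on an assumed circular-cone instance rather than data from the paper, verifies that the test of (i) agrees with the definition ‖My‖ ≤ g^T y:

```python
import numpy as np

# assumed instance: M^T M = I and g = (0, 0, 2), so F = {y : ||y|| <= 2 y_3}
M = np.vstack([np.eye(3), np.zeros((2, 3))])
g = np.array([0.0, 0.0, 2.0])
vals, vecs = np.linalg.eigh(M.T @ M - np.outer(g, g))
D, Q = vals[::-1], vecs[:, ::-1].copy()      # descending eigenvalues
if g @ Q[:, -1] < 0:                         # sign convention g^T Q_n >= 0
    Q[:, -1] *= -1.0

def in_F(y, tol=1e-12):
    """Membership test of Theorem 1(i): y^T QDQ^T y <= 0, Q_n^T y >= 0."""
    s = Q.T @ y
    return bool(s @ (D * s) <= tol and s[-1] >= -tol)

def in_F_star(z, tol=1e-12):
    """Membership test of Theorem 1(ii): z^T QD^{-1}Q^T z <= 0, Q_n^T z >= 0."""
    s = Q.T @ z
    return bool(s @ (s / D) <= tol and s[-1] >= -tol)
```

For example, in_F agrees with the defining inequality ‖My‖ ≤ g^T y on sample points such as (1, 1, 1) (inside) and (2, 2, 1) (outside).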

These representation results are used to solve the following dual pair of optimization problems, where x ∈ R^n is a given point:

(1)  P : t* := min_y ‖y − x‖ s.t. y ∈ F;   D : t* := max_z −x^T z s.t. ‖z‖ ≤ 1, z ∈ F*.

The problem P is the classical projection problem onto the cone F, whose solution is the point in F closest to x, and strong duality is easily established for this pair of problems. The problem D arises as a necessary subroutine in the rescaled perceptron algorithm in []: the subroutine needs to efficiently solve D using x = x^k that arises at each outer iteration k of the algorithm. It is this latter problem that motivated our interest in efficiently representing F* and solving both P and D. Notice that P/D involve intersections of a Euclidean ball and a second-order feasibility cone. This dual pair of problems is therefore a modest generalization of the trust region problem of optimizing a quadratic function over a Euclidean ball, for which Ye [0] showed how to combine a binary search and Newton's method to obtain double-logarithmic complexity. Using the representation results above and extending ideas from [0], we develop an algorithm for solving (1) in section 3. The complexity of the algorithm depends on the widths of the cones F and F*, where the width τ_K of a cone K is defined to be the radius of the largest ball contained in K that is centered at unit distance from the origin:

τ_K := max_{y,r} {r : B(y, r) ⊆ K, ‖y‖ ≤ 1}.

It readily follows from Theorem 1 that the widths of F and F* are simple functions of the largest and smallest positive eigenvalues and the negative eigenvalue of M^T M − gg^T, and it is straightforward to derive the following:

τ_F = √|D_n| / √(|D_n| + D_1)   and   τ_F* = √(1/|D_n|) / √(1/|D_n| + 1/D_{n−1}).

The main complexity result, which is proved in section 3, is as follows.

Theorem 2. Suppose that F is a regular cone, and x ∈ R^n satisfying ‖x‖ = 1 is given.
Then feasible solutions (y, z) of (P, D) satisfying a duality gap of at most σ are computable in O(mn^2 + n ln ln(1/σ) + n ln ln(1/min{τ_F, τ_F*})) operations.

In section 4, we complement this theoretical computational complexity bound with experimental computational results that indicate that the method is also extremely efficient in practice. Last of all, we note that the hypothesis that F is regular can be removed with no loss of strength of the results herein but with substantial expositional overhead. The case when F is nonregular is discussed at the end of sections 2 and 3.

2. Proofs of representation results. Recall the eigendecomposition of M^T M − gg^T = QDQ^T, with D_1 ≥ ... ≥ D_n. A simple dimension argument establishes that M^T M − gg^T has at most one negative eigenvalue, whereby D_1, ..., D_{n−1} ≥ 0. By choosing either Q_n or −Q_n, we can ensure that g^T Q_n ≥ 0. In preparation for the proof of Theorem 1, we first prove some preliminary results.

Proposition 1. Suppose that int F ≠ ∅. Then D_n < 0, and there exists y satisfying ‖My‖ < g^T y.

Proof. We first suppose that there exists ȳ that satisfies ‖Mȳ‖ < g^T ȳ. In this case it easily follows that 0 > ȳ^T (M^T M − gg^T) ȳ = ȳ^T QDQ^T ȳ, whereby D_n < 0. Next suppose that every y ∈ F satisfies ‖My‖ = g^T y, and let ȳ ∈ int F. Since ȳ ∈ int F, we have ‖M(ȳ + βd)‖ = g^T (ȳ + βd) for all d ∈ B(0, 1) and all sufficiently small positive β. Squaring the previous equation, then rearranging and cancelling terms yields 2(d^T M^T M ȳ − ȳ^T gg^T d) + β(d^T M^T M d − d^T gg^T d) = 0, which is true only if g^T d = 0 implies Md = 0. This in turn implies that rank(M) ≤ 1, violating Assumption 1. Therefore there exists y satisfying ‖My‖ < g^T y.

One characterization of F* is as follows:

(2)  F* = cl {M^T λ + gα : ‖λ‖ ≤ α}.

This result admits an elementary proof by a separating hyperplane argument and has been part of the folklore of convex analysis for several decades. For a standard proof, see, for example, Theorem 3. of Berman [3] applied to the second-order cone. The lack of closure of T := {M^T λ + gα : ‖λ‖ ≤ α} can arise easily. Let M = [1 0; 0 1] and g = (1, 0). In this case, T = {(λ_1 + α, λ_2) : ‖(λ_1, λ_2)‖ ≤ α}. It is easy to verify that (0, 1) ∉ T but (ε, 1) ∈ T for every ε > 0 (set λ_1 = (ε − 1/ε)/2, λ_2 = 1, and α = (ε + 1/ε)/2), which shows that T is not closed. For an analysis of cases when T is guaranteed to be closed, see Pataki [5].

Proof of Theorem 1. Since int F ≠ ∅, Proposition 1 implies that D_n < 0, and so, for the sake of this proof, we rescale (M, g) by 1/√|D_n| in order to conveniently satisfy D_n = −1.

(i) Define H := {y : y^T QDQ^T y ≤ 0, y^T Q_n ≥ 0} and L := {y : y^T QDQ^T y ≤ 0, y^T g ≥ 0}. We need to prove that H = F. It is straightforward to check that F = L and, indeed, int F = int L = {y : y^T QDQ^T y < 0, y^T g > 0} from Proposition 1, and this also readily establishes that Q_n ∈ int F. For any y we can write

(3)  y^T Q_n = y^T (−QDQ^T) Q_n = y^T (gg^T − M^T M) Q_n = (−My, g^T y)^T (MQ_n, g^T Q_n),

where the final term's two parenthetic vectors lie in R^{m+1}. Notice that (MQ_n, g^T Q_n) ∈ int Q^{m+1}, since Q_n ∈ int F.
If y ∈ F, then both vectors in the last term of (3) are in Q^{m+1}; hence y^T Q_n ≥ 0 follows from the self-duality of Q^{m+1}. Therefore y ∈ H, showing that F ⊆ H. Next suppose that y ∈ H. Then y ∈ F unless g^T y < 0, in which case −y ∈ F. Using (3) we have

0 ≤ y^T Q_n = (−My, g^T y)^T (MQ_n, g^T Q_n) = −(My, −g^T y)^T (MQ_n, g^T Q_n) ≤ 0,

again because the two vectors in the last term lie in the self-dual cone Q^{m+1}. This implies that equality holds throughout, and hence (My, −g^T y) = (0, 0) since (MQ_n, g^T Q_n) ∈ int Q^{m+1}, yielding the contradiction that g^T y = 0. This establishes that H ⊆ F, completing the proof of (i).

(ii) Having established (i), suppose that D_i = 0 for some i ∈ {1, ..., n−1}. Then (θQ_i)^T QDQ^T (θQ_i) = 0 and Q_n^T (θQ_i) = 0, whereby θQ_i ∈ F for all θ, violating the hypothesis that F is regular. Therefore D_i > 0 for all i ∈ {1, ..., n−1}, and hence D^{−1} exists. Define J := {z : z^T QD^{−1}Q^T z ≤ 0, z^T Q_n ≥ 0}. Suppose that z ∈ J

and y ∈ F, in which case

y^T z = y^T Q Q^T z = Σ_{i=1}^{n−1} (√D_i Q_i^T y)((1/√D_i) Q_i^T z) + (Q_n^T y)(Q_n^T z)
  ≥ −√(Σ_{i=1}^{n−1} D_i (Q_i^T y)^2) √(Σ_{i=1}^{n−1} (1/D_i)(Q_i^T z)^2) + (Q_n^T y)(Q_n^T z) ≥ 0,

where the first inequality is an application of the Cauchy–Schwarz inequality and the second inequality follows, since z ∈ J and y ∈ F using part (i). Thus z ∈ F*, which shows that J ⊆ F*. Next let Q̄ denote the matrix of the first n−1 columns of Q, and let D̄ denote the diagonal matrix composed of the n−1 diagonal components D_1, ..., D_{n−1}. Then from part (i) we have F = {y : y^T Q̄ D̄ Q̄^T y ≤ (Q_n^T y)^2, Q_n^T y ≥ 0} = {y : ‖D̄^{1/2} Q̄^T y‖ ≤ Q_n^T y}, and using (2) we know that F* = cl T, where T = {Q̄ D̄^{1/2} λ + Q_n α : ‖λ‖ ≤ α}. Let z ∈ T, where z = Q̄ D̄^{1/2} λ + Q_n α and ‖λ‖ ≤ α. Then

z^T QD^{−1}Q^T z = (Q̄ D̄^{1/2} λ + Q_n α)^T QD^{−1}Q^T (Q̄ D̄^{1/2} λ + Q_n α) = λ^T λ − α^2 ≤ 0,

and furthermore Q_n^T z = α ≥ 0, whereby z ∈ J. Thus T ⊆ J. It then follows that F* = cl T ⊆ cl J = J, which completes the proof of (ii).

To prove (iii), notice that Q_n^T z = −αD_n Q_n^T y ≥ 0 and

z^T QD^{−1}Q^T z = α^2 y^T QDQ^T QD^{−1}Q^T QDQ^T y = α^2 y^T QDQ^T y ≤ (=) 0,

since y ∈ F (y ∈ ∂F) implies that y^T QDQ^T y ≤ (=) 0, and hence z ∈ F* (z ∈ ∂F*) from part (ii). Furthermore y^T z = −α y^T QDQ^T y = 0 when y ∈ ∂F, completing the proof of (iii). The proof of (iv) follows similar logic.

Before proving Corollary 1 we first prove the following.

Proposition 2. Suppose that int F ≠ ∅ and rank(M) = n. Then g^T (M^T M)^{−1} g > 1 and ȳ := (M^T M)^{−1} g ∈ int F.

Proof. Let ᾱ := g^T (M^T M)^{−1} g > 0, since g ≠ 0 from Assumption 1. From Proposition 1 we know there exists ŷ satisfying ‖Mŷ‖ < g^T ŷ, and rescale ŷ if necessary so that g^T ŷ = ᾱ. Notice that ȳ optimizes the function f(y) = y^T M^T M y − 2g^T y, whose optimal objective function value is −ᾱ. Therefore

ᾱ ≤ ŷ^T M^T M ŷ = ‖Mŷ‖^2 < (g^T ŷ)^2 = ᾱ^2,

which implies that ᾱ^2 > ᾱ > 0, and hence ᾱ > 1. Next observe that ‖Mȳ‖ = √(ȳ^T M^T M ȳ) = √ᾱ < ᾱ = g^T ȳ, whereby ȳ ∈ int F.

Proof of Corollary 1.
(i) is a restatement of the definition of F, (iii) is a restatement of part (iii) of Theorem 1, and (iv) is a restatement of part (iv) of Theorem 1 using the Sherman–Morrison formula

QD^{−1}Q^T = (M^T M − gg^T)^{−1} = (M^T M)^{−1} − (M^T M)^{−1} gg^T (M^T M)^{−1} / (g^T (M^T M)^{−1} g − 1),

together with the fact from Proposition 2 that g^T (M^T M)^{−1} g > 1. It remains to prove (ii). Let K := {z ∈ R^n : z^T QD^{−1}Q^T z ≤ 0}. Then from Theorem 1 we have K = F* ∪ (−F*). Let ȳ = (M^T M)^{−1} g, and note that ȳ ∈ int F

from Proposition 2. Define H := {z ∈ R^n : ȳ^T z ≥ 0}, and note that H ∩ F* = F* and H ∩ (−F*) = {0}. Therefore

F* = K ∩ H = {z ∈ R^n : z^T QD^{−1}Q^T z ≤ 0, g^T (M^T M)^{−1} z ≥ 0}.

Using the Sherman–Morrison formula we obtain

F* = {z : z^T ((M^T M)^{−1} − (M^T M)^{−1} gg^T (M^T M)^{−1} / (g^T (M^T M)^{−1} g − 1)) z ≤ 0, g^T (M^T M)^{−1} z ≥ 0},

which after rearranging yields the expression in (ii).

Remark 2 (the case when F is not regular). Let Z and N partition the set of indices according to zero and nonzero values of D_i. If D_n = 0, then one can show that F is a half-subspace in the subspace spanned by the Q_i for i ∈ Z. If D_n > 0, then F = {0}. If D_n < 0, then F has an interior, and we can interpret D_i^{−1} = ∞ for i ∈ Z. Then Theorem 1 remains valid if we interpret z^T QD^{−1}Q^T z ≤ 0 in (ii) as Σ_{i∈N} D_i^{−1}(Q_i^T z)^2 ≤ 0, (Q^T z)_i = 0 for i ∈ Z, and y := −αQD^{−1}Q^T z in (iv) as Q_i^T y := −αD_i^{−1} Q_i^T z for i ∈ N and Q_i^T y set arbitrarily for i ∈ Z.

3. An algorithm for approximately solving (1).

3.1. Basic properties of (1) and the polar problem pair. Returning to (1) where x is the given vector, consider the following conditions in (y, z, θ):

(4)  y − θz = x,  y ∈ F,  z ∈ F*,  ‖z‖ ≤ 1,  θ ≥ 0,  θ‖z‖ = θ.

Examining (4), we see that x is decomposed into x = y − θz, where y ∈ F and −θz ∈ −F*, and (y, z) is feasible for the problems (1). Let G denote the duality gap for (1), namely, G = ‖y − x‖ + x^T z. We also consider the following pair of conic problems that are polar to (1):

(5)  P̄ : f* := min_v ‖v − x‖ s.t. v ∈ −F*;   D̄ : f* := max_w −x^T w s.t. ‖w‖ ≤ 1, w ∈ −F,

together with the following conditions in (v, w, ρ):

(6)  v − ρw = x,  v ∈ −F*,  w ∈ −F,  ‖w‖ ≤ 1,  ρ ≥ 0,  ρ‖w‖ = ρ;

here x is decomposed into x = v − ρw, where now (v, w) is feasible for the problems (5), −ρw ∈ F, and v ∈ −F*. Let Ḡ denote the duality gap for (5), namely, Ḡ = ‖v − x‖ + x^T w. It is a straightforward exercise to show that conditions (4) together with the complementarity condition y^T z = 0 constitute necessary and sufficient optimality conditions for (1), and similarly, (6) together with v^T w = 0 are necessary and sufficient for optimality for (5).
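The conditions (4) and (6) amount to a Moreau-type orthogonal decomposition of x between F and −F*. The identity (t*)^2 + (f*)^2 = ‖x‖^2, stated below as part (v) of Proposition 3, can be checked numerically on a circular cone using the standard closed-form projection onto {(u, t) : ‖u‖ ≤ a t}. The cone instance and the helper below are illustrative assumptions, not constructions from the paper:

```python
import numpy as np

def project_circular_cone(x, a):
    """Euclidean projection onto K = {(u, t) : ||u|| <= a*t}
    (standard closed form, used here only to illustrate (4)/(6))."""
    u, t = x[:-1], x[-1]
    r = np.linalg.norm(u)
    if r <= a * t:                       # x already in K
        return x.copy()
    if a * r + t <= 0:                   # x lies in the polar cone of K
        return np.zeros_like(x)
    beta = (a * r + t) / (1.0 + a * a)   # projection onto the boundary ray
    return np.append(a * beta * u / r, beta)

# K = {y : ||(y_1, y_2)|| <= sqrt(3) y_3}; decompose an assumed point x
x = np.array([3.0, 0.0, -1.0])
y = project_circular_cone(x, np.sqrt(3.0))   # nearest point in K
t_star = np.linalg.norm(y - x)               # distance from x to K
f_star = np.linalg.norm(y)                   # = distance from x to -K* (Moreau)
```

Here y and x − y are orthogonal, and t_star² + f_star² recovers ‖x‖², exactly the Pythagorean relation of Proposition 3(v).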
Furthermore, the solutions of (4) and (6) transform to one another:

(y, z, θ) → (v, w, ρ) = (−θz, −y/‖y‖, ‖y‖),   (v, w, ρ) → (y, z, θ) = (−ρw, −v/‖v‖, ‖v‖),

with necessary modifications for the cases when y = 0 (set w = 0) and/or v = 0 (set z = 0).

Proposition 3. Suppose (y, z, θ) satisfy (4) and (v, w, ρ) satisfy (6). Then (y, z) and (v, w) are feasible for their respective problems with respective duality gaps:
(i) G = y^T z;
(ii) Ḡ = v^T w.
Furthermore,
(iii) if (y, z) is optimal for (1), then t* = θ;
(iv) if (v, w) is optimal for (5), then f* = ρ;
(v) (t*)^2 + (f*)^2 = ‖x‖^2.

Proof. To prove (i), observe that y^T z = z^T x + θ‖z‖^2 = z^T x + θ = z^T x + ‖y − x‖ = G, and a similar argument establishes (ii). To prove (iii), observe that t* = ‖x − y‖ = ‖θz‖ = θ, with similar arguments for (iv). To prove (v), notice that (y, z, θ) satisfy (4), and y^T z = 0 if and only if (y, z) is optimal for (1), in which case it is easy to verify that (v, w, ρ) ← (−θz, −y/‖y‖, ‖y‖) satisfy (6) and (v, w) is optimal for (5). Therefore ‖x‖^2 = (y − θz)^T (y − θz) = y^T y + θ^2 = ρ^2 + θ^2 = (f*)^2 + (t*)^2.

Proposition 4. If Q_n^T x ≤ 0, then t* ≥ τ_F* ‖x‖.

Proof. We assume for the proof that ‖x‖ = 1, since t*, f* scale positively with ‖x‖. If f* = 0, the result follows trivially since τ_F* ≤ 1, and t* = 1 from Proposition 3. If f* > 0, define c := −(t*/f*) Q_n, and note that ‖c‖ = t*/f*. By the definition of the width, B(c, (t*/f*) τ_F*) ⊆ −F*. Note that

‖x − c‖^2 = x^T x + 2(t*/f*) Q_n^T x + (t*/f*)^2 Q_n^T Q_n ≤ 1 + (t*/f*)^2 = (1/f*)^2.

Therefore ‖x − c‖ ≤ 1/f*. Next observe that c + τ_F* ‖c‖ (x − c)/‖x − c‖ ∈ −F*, which is equivalent to c + τ_F* (t*/f*) (x − c)/‖x − c‖ ∈ −F*. By the previous inequality, we have c + τ_F* t* (x − c) ∈ −F*. Thus we have

‖c + τ_F* t* (x − c) − x‖ = (1 − τ_F* t*) ‖x − c‖ ≤ (1 − τ_F* t*)/f*.

Therefore f* ≤ (1 − τ_F* t*)/f*, i.e., (f*)^2 = 1 − (t*)^2 ≤ 1 − τ_F* t*, which implies that t* ≥ τ_F*.

Proposition 5. Given x satisfying ‖x‖ = 1 and Q_n^T x ≤ 0, suppose that (v, w, ρ) satisfies (6), with duality gap Ḡ ≤ στ_F*/2 for (5), where σ ≤ 1. Consider the assignment (y, z, θ) ← (−ρw, −v/‖v‖, ‖v‖) (with the necessary modification that z = 0 if v = 0). Then (y, z, θ) satisfies (4), with duality gap G ≤ σ for (1).

Proof.
Note that y^T z = (w^T v) ρ/‖v‖ ≤ στ_F* ρ/(2‖v‖), and we have the following relations: (i) w^T v ≤ στ_F*/2 ≤ 1/2; (ii) ‖v‖ = θ = ‖y − x‖ ≥ t* ≥ τ_F* from Proposition 4; and (iii) ρ = v^T w − w^T x ≤ 1/2 + f* ≤ 3/2 from Proposition 3. Therefore

y^T z ≤ (στ_F*/2)(3/2)(1/τ_F*) = (3/4)σ ≤ σ.

3.2. The six cases. We assume here that the given x has unit norm, i.e., ‖x‖ = 1, and that we seek feasible solutions to (1) with a duality gap of at most σ, where σ ≤ 1. Armed with Propositions 3, 4, and 5, we now show how to compute a feasible solution (y, z) of (1) with duality gap G ≤ σ. Our method is best understood with the help of Figure 1. We know from section 3.1 and the conditions (4) and/or (6) that we need to decompose x into the sum of a vector in F plus a vector in −F* and that the central axes of F and −F* are the rays corresponding to Q_n and −Q_n, respectively. Define the dividing hyperplane L_F := {y : Q_n^T y = 0} perpendicular to the central axes of F and −F*, and define L_F^+ := {y ∈ R^n : Q_n^T y ≥ 0} and L_F^− := −L_F^+. We divide L_F^+ into three regions: region 1 corresponds to points in F, region 2 corresponds to points in L_F^+ near the dividing hyperplane (where our nearness criterion will be defined

Fig. 1. The geometry of the sets F, −F*, and L_F and the six cases. The central axes of F and −F* are the rays generated by ±Q_n, respectively, which are orthogonal to the hyperplane L_F. The regions corresponding to the six cases are shown as well.

shortly), and region 3 corresponds to points in L_F^+ \ F that are far from L_F. We divide L_F^− similarly, into regions 4, 5, and 6. For each of the three regions in L_F^+, we will work with the problem pair (1) and show how to compute a feasible solution (y, z) of (1) with duality gap G ≤ σ. For each of the three regions in L_F^−, we will instead work with the problem pair (5) and show how to compute a feasible solution (v, w) of (5) with duality gap Ḡ ≤ στ_F*/2, whereby from Proposition 5 we obtain a feasible solution (y, z) of (1) with duality gap G ≤ σ.

We will consider six cases, one for each of the regions described above and in Figure 1. We first describe how we choose whether x is in region 2 or 3. For x ∈ L_F^+ \ F, define

(7)  ε_P = ε_P(x) := √|D_n| Q_n^T x / √(Σ_{i=1}^{n−1} D_i (Q_i^T x)^2),

and notice that x ∈ L_F^+ implies that ε_P ≥ 0, x ∉ F implies that ε_P < 1, and smaller values of ε_P correspond to Q_n^T x closer to zero and hence x closer to L_F. We specify a tolerance ε̄ and determine whether x is in region 2 or 3 depending on whether ε_P ≤ ε̄ or ε_P > ε̄, respectively, where we set ε̄ = ε̄_P := στ_F.

Case 1: Q_n^T x ≥ 0 and x^T QDQ^T x ≤ 0. From Theorem 1 we know that x ∈ F. Then it is elementary to show that (y, z, θ) ← (x, 0, 0) satisfies (4), with y^T z = 0, whereby from Proposition 3 the duality gap is G = 0.

Case 2: Q_n^T x ≥ 0 and x^T QDQ^T x > 0, ε_P ≤ ε̄_P := στ_F. Let ŷ solve the following system of equations:

(8)  [I + (1/|D_n|) D] Q^T ŷ = Q^T x − e_n Q_n^T x,  Q_n^T ŷ = 0,

where e_n = (0, ..., 0, 1) ∈ R^n. Notice that the last row of the first equation system

has all zero entries. Therefore this system is not overdetermined, and one can write the closed-form solution (Q^T ŷ)_i = (Q^T x)_i/(1 + D_i/|D_n|) for i = 1, ..., n−1 and (Q^T ŷ)_n = 0, in the transformed variables ŝ := Q^T ŷ. Having computed ŷ, next compute α := √(ŷ^T QDQ^T ŷ/|D_n|), and then make the following assignments to variables:

ȳ ← ŷ + αQ_n,  θ ← √(ȳ^T QD^2 Q^T ȳ)/|D_n|,  z ← −QDQ^T ȳ/(|D_n| θ),  y ← ȳ + Q_n^T x Q_n.

Proposition 6. Suppose that ‖x‖ = 1, σ ≤ 1, and ε_P ≤ ε̄ < 1 and that (y, z, θ) are computed according to Case 2 above. Then (y, z, θ) is feasible for (4) with duality gap G ≤ ε̄/τ_F for (1).

Applying Proposition 6 using ε̄ = ε̄_P := στ_F ensures that the resulting duality gap satisfies G ≤ ε̄/τ_F = σ. Note that the complexity of the computations in Case 2 is O(mn) (assuming that square roots are sufficiently accurately computed in O(1) operations).

Proof of Proposition 6. It is easy to establish that (Q_1^T x, ..., Q_{n−1}^T x) ≠ 0, and hence α > 0. This in turn implies that Q_n^T ȳ = α > 0, and hence θ > 0, so z is well defined. It is straightforward to verify that

ȳ^T QDQ^T ȳ = (ŷ + αQ_n)^T QDQ^T (ŷ + αQ_n) = ŷ^T QDQ^T ŷ − α^2 |D_n| = 0,

which shows via Theorem 1 that ȳ ∈ ∂F, and therefore z ∈ ∂F* and z^T ȳ = 0. It is also straightforward to verify that ‖z‖ = 1. Finally, we have from (8) that

[I + (1/|D_n|) D] Q^T ȳ = [I + (1/|D_n|) D](Q^T ŷ + αe_n) = [I + (1/|D_n|) D] Q^T ŷ = Q^T (x − Q_n Q_n^T x)

(where the second equality above follows since the last row and column of the matrix are zero); hence ȳ + (1/|D_n|) QDQ^T ȳ = x − Q_n Q_n^T x. Substituting the values of y, z, θ into this expression yields y − θz = x, which then shows that (y, z, θ) satisfy (4). Therefore from Proposition 3 (y, z) is feasible for (1) with duality gap

G = z^T y = z^T ȳ + z^T Q_n Q_n^T x ≤ Q_n^T x = ε_P √(Σ_{i=1}^{n−1} D_i (Q_i^T x)^2)/√|D_n| ≤ ε̄ √D_1/√|D_n| ≤ ε̄ √(D_1 + |D_n|)/√|D_n| = ε̄/τ_F.

Case 3: Q_n^T x ≥ 0 and x^T QDQ^T x > 0, ε_P > ε̄_P := στ_F.
Here x is on the same side of the dividing hyperplane L_F as F but is neither in F nor close enough to L_F in the nearness measure. Consider the following univariate function in γ:

(9)  f(γ) := x^T Q[I + γD]^{−1} D [I + γD]^{−1} Q^T x = Σ_{i=1}^{n} D_i (x^T Q_i)^2/(1 + D_i γ)^2,

shown canonically in Figure 2. Notice that f(0) = x^T QDQ^T x > 0, and since D_n < 0, we have f(γ) → −∞ as γ ↗ 1/|D_n|. Furthermore, f′(γ) = −2 Σ_{i=1}^{n} D_i^2 (x^T Q_i)^2/(1 + γD_i)^3 < 0 for γ ∈

Fig. 2. The function f on the interval (−1/D_1, 1/|D_n|). Among many desirable properties, f is strictly decreasing and analytic and has a unique root γ* ∈ (0, 1/|D_n|). Moreover, f is convex over (−1/D_1, γ̄) and concave over (γ̄, 1/|D_n|), where γ̄ is the unique point satisfying f″(γ̄) = 0. Note that one can have γ̄ ≤ γ* or γ̄ ≥ γ*.

[0, 1/|D_n|). Therefore f(γ) is strictly decreasing in the domain [0, 1/|D_n|), whereby from the intermediate value theorem there is a unique value γ* ∈ (0, 1/|D_n|) for which f(γ*) = 0. We show in section 5 how to combine a binary search and Newton's method to very efficiently compute γ ∈ (0, 1/|D_n|) satisfying f(γ) ≤ 0 and f(γ) ≈ 0 (and γ ≥ γ*). Presuming that this can be done very efficiently, consider the following variable assignment:

(10)  y ← Q[I + γD]^{−1} Q^T x,  θ ← γ √(y^T QD^2 Q^T y),  z ← −γ QDQ^T y/θ.

We now show that (y, θ, z) satisfy (4). First note that Q_n^T y = Q_n^T x/(1 − γ|D_n|) > 0, and furthermore this shows that θ > 0, and so z is well defined. By the hypothesis that f(γ) ≤ 0 we have

y^T QDQ^T y = x^T Q[I + γD]^{−1} D [I + γD]^{−1} Q^T x = f(γ) ≤ 0,

which implies that y ∈ F, and hence z ∈ F* from Theorem 1. It is also straightforward to verify that ‖z‖ = 1. Finally, rearranging the formula for y yields x = y + γQDQ^T y = y − θz, which shows that (4) is satisfied. From Proposition 3, (y, z) is feasible for (1), and using the above assignments the duality gap works out to be

G = y^T z = −f(γ)/√(x^T QD^2 [I + γD]^{−2} Q^T x),

whereby G will be small if f(γ) ≈ 0. To make this more precise requires a detailed analysis of a binary search and Newton's method, which is postponed to section 5 where we will prove the following.

Proposition 7. Suppose that ‖x‖ = 1, 1 > ε_P > ε̄, and g > 0 is a given gap tolerance. If Q_n^T x > 0 and x^T QDQ^T x > 0, then a solution (y, z, θ) of (4) with duality gap G ≤ g for (1) is computable in O(n ln ln(1/τ_F + 1/ε̄ + 1/g)) operations.

Substituting ε̄ = ε̄_P := στ_F and g = σ, it follows that the complexity of computing a feasible solution (y, z) of (1) with duality gap at most σ is O(n ln ln(1/τ_F + 1/σ)) = O(n ln ln(1/min{τ_F, τ_F*} + 1/σ)) operations.

Case 4: Q_n^T x ≤ 0 and x^T QD^{−1}Q^T x ≤ 0. From Theorem 1 we know that x ∈ −F*. Then it is elementary to show that (y, z, θ) ← (0, −x/‖x‖, ‖x‖) satisfies (4), with y^T z = 0, whereby from Proposition 3 the duality gap is G = 0.

Before describing how we treat Cases 5 and 6 (corresponding to regions 5 and 6), we need to describe how we choose whether x is in region 5 or 6. We use a parallel concept to that used to distinguish regions 2 and 3, except that F is replaced by −F*; see Figure 1. For x ∈ L_F^− \ (−F*), define the following quantity analogous to (7):

(11)  ε_P̄ = ε_P̄(x) := (−Q_n^T x) / (√|D_n| √(Σ_{i=1}^{n−1} (1/D_i)(Q_i^T x)^2)),

and notice that x ∈ L_F^− implies ε_P̄ ≥ 0, x ∉ −F* implies ε_P̄ < 1, and smaller values of ε_P̄ correspond to Q_n^T x closer to zero and hence x closer to L_F. We specify a tolerance ε̄ and determine whether x is in region 5 or 6 depending on whether ε_P̄ ≤ ε̄ or ε_P̄ > ε̄, respectively, where we set ε̄ = ε̄_P̄ := στ_F*^2/2.

Case 5: Q_n^T x ≤ 0 and x^T QD^{−1}Q^T x > 0, and ε_P̄ ≤ ε̄_P̄ := στ_F*^2/2. This case is an exact analogue of Case 2, with F replaced by −F* and the pair (1) replaced by (5). Therefore the methodology of Case 2 can be used to compute (v, w, ρ) satisfying (6), and hence (v, w) is feasible for (5). Applying Proposition 6 to the context of the polar pair (5) with ε̄ = ε̄_P̄, it follows that the duality gap for (5) will be Ḡ = v^T w and will satisfy Ḡ ≤ ε̄/τ_F* = στ_F*^2/(2τ_F*) = στ_F*/2. Converting (v, w, ρ) to (y, z, θ) using Proposition 5, we obtain (y, z) feasible for (1) with duality gap G ≤ σ.
Here the complexity of the computations is of the same order as Case 2.

Case 6: Q_n^T x ≤ 0 and x^T QD^{−1}Q^T x > 0, and ε_P̄ > ε̄_P̄ := στ_F*^2/2. In concert with the previous case, this case is an exact analogue of Case 3, with F replaced by −F* and the pair (1) replaced by (5). Therefore the methodology of Case 3 can be used to compute (v, w, ρ) satisfying (6), and hence (v, w) is feasible for (5). Applying Proposition 7 to the context of the polar pair (5) with ε̄ = ε̄_P̄ and g = στ_F*/2, it follows that a solution (v, w, ρ) of (6) with duality gap Ḡ ≤ g = στ_F*/2 for (5) is computable in O(n ln ln(1/τ_F* + 1/ε̄ + 1/g)) = O(n ln ln(1/min{τ_F, τ_F*} + 1/σ)) operations. Converting (v, w, ρ) to (y, z, θ) using Proposition 5, we obtain (y, z) feasible for (1) with duality gap G ≤ σ.

Proof of Theorem 2. The spectral decomposition of M^T M − gg^T = QDQ^T is assumed to take O(mn^2) operations. The computations in Cases 1 and 4 are trivial after checking the conditions of the cases, which is O(mn) operations, and similarly for Cases 2 and 5. Regarding Cases 3 and 6, the discussion in the description of these cases establishes the desired operation bound.

Remark 3 (the case when F is not regular, again). As in Remark 2, let Z and N partition the set of indices according to zero and nonzero values of D_i. Consider the case when D_n < 0 (the cases when D_n > 0 and D_n = 0 were discussed in Remark 2). We interpret D_i^{−1} = ∞ for i ∈ Z. Consider the orthonormal transformation Q^T x and Q^T y, Q^T z of the given vector x and the variables y, z. Then for i ∈ Z simply set Q_i^T y = Q_i^T x and Q_i^T z = 0 and work in the lower-dimensional problem in the subspace spanned by Q_i, i ∈ N.
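Cases 1, 2, and 3 above (the L_F^+ side) can be assembled into a single routine. The sketch below is a reconstruction in the notation of this section, with plain bisection standing in for the binary search/Newton scheme of section 5; it is illustrative, on an assumed circular-cone instance, and is not the authors' implementation:

```python
import numpy as np

def upper_cases(x, Q, D, sigma=1e-6):
    """Cases 1-3 for x with ||x|| = 1 and Q_n^T x >= 0: returns (y, z, theta)
    satisfying (4) with duality gap y^T z <= sigma (illustrative sketch)."""
    aDn = abs(D[-1])
    s = Q.T @ x
    if s @ (D * s) <= 0:                            # Case 1: x already in F
        return x, np.zeros_like(x), 0.0
    tau_F = np.sqrt(aDn / (aDn + D[0]))
    eps_P = np.sqrt(aDn) * s[-1] / np.sqrt(np.sum(D[:-1] * s[:-1] ** 2))
    if eps_P <= sigma * tau_F:                      # Case 2: closed form (8)
        sh = np.zeros_like(s)
        sh[:-1] = s[:-1] / (1.0 + D[:-1] / aDn)
        sh[-1] = np.sqrt(sh @ (D * sh) / aDn)       # y_bar = y_hat + alpha*Q_n
        theta = np.linalg.norm(D * sh) / aDn
        z = -Q @ (D * sh) / (aDn * theta)
        y = Q @ sh + s[-1] * Q[:, -1]
        return y, z, theta
    # Case 3: root of f(gamma) from (9) by bisection (stand-in for the
    # binary search + Newton scheme of section 5), then assignment (10)
    f = lambda gam: np.sum(D * s**2 / (1.0 + gam * D) ** 2)
    lo, hi = 0.0, 1.0 / aDn
    for _ in range(100):                            # f strictly decreasing
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    gamma = hi                                      # f(gamma) <= 0, so y in F
    y = Q @ (s / (1.0 + gamma * D))
    w = Q @ (D * (Q.T @ y))                         # QDQ^T y
    theta = gamma * np.linalg.norm(w)
    z = -gamma * w / theta
    return y, z, theta

# circular cone F = {y : y_1^2 + y_2^2 <= 3 y_3^2, y_3 >= 0} (assumed data)
Q, D = np.eye(3), np.array([1.0, 1.0, -3.0])
x = np.array([1.0, 0.0, 0.4]); x /= np.linalg.norm(x)   # region 3 point
y, z, theta = upper_cases(x, Q, D)
```

In all three branches the output satisfies y − θz = x with y ∈ F, z ∈ F*, and ‖z‖ = 1 (when θ > 0), as required by (4).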

Table 1
Average computational results from 100 randomly generated sparse problems.
Columns: Dimension n; Iterations (theoretical bound, actual, SDPT3); Running time in seconds (proposed method: EIG, total; SDPT3); Range of widths (minimum, maximum).
[The table's numerical entries are not recoverable from this transcription.]

4. Comparison with interior-point methods. The primal-dual pair of problems (1) can be reformulated as second-order cone programs; one formulation of the primal problem is min_{y,t} {t : (My, g^T y) ∈ Q^{m+1}, (y − x, t) ∈ Q^{n+1}}, for example. It then follows from interior-point complexity theory that approximate solutions to (1) with duality gap at most δ can be computed in O(ln(1/δ)) interior-point iterations. This iteration bound follows from Theorem 2.4.1 of Renegar [6], noting that an interior starting feasible solution with good symmetry is easy to precompute. Unlike the complexity bound for the proposed method in Theorem 2, the interior-point method bound has a stronger dependence on the duality gap δ (global linear convergence), but, unlike the proposed method, there is no dependence on the widths τ_F, τ_F*. Therefore from a complexity viewpoint one cannot assert that one algorithm dominates the other. (There is a theory of local quadratic convergence for interior-point methods (see, for example, []) that could possibly be used to prove a weaker interior-point method dependence on δ for this class of problems.)

In terms of computational practice, it is relevant to compare the two methods on randomly generated problems. We generated 100 relatively sparse random problem instances ((M, g) has density, respectively, 10% and 30% on average) for each of dimensions (m, n) = (2n, n) for n = 10, 20, 50, 100, 200, and 500 and solved them using both our method and the conic convex interior-point method software SDPT3 [9]. All computation was performed in MatLab on a recent model laptop computer. We used MatLab's EIG command to compute the eigendecomposition of M^T M − gg^T for our proposed method.
Table 1 shows our computational results. Columns 2 and 3 of the table show the average theoretical iteration bound and the average actual iterations of our method. Columns 5, 6, and 7 report running time information. Note as expected that the eigendecomposition is the dominant computation in our method. Our method substantially outperforms SDPT3, as one would expect, since SDPT3 is not optimized for the problem class (1). A fairer comparison can be done by presuming that the problem instances are preprocessed and transformed by the eigendecomposition, replacing Q^T y with y, whereby the second-order cone formulation takes the diagonal form:

min_{y,t} {t : (√D_1 y_1, ..., √D_{n−1} y_{n−1}, √|D_n| y_n) ∈ Q^n, (y − x, t) ∈ Q^{n+1}},

and SDPT3 naturally exploits the sparse structure of this problem. We generated 100 random problem instances each, for a range of n from n = 10 to n = 5000, where each instance was generated by randomly choosing D but ensuring that τ_F = τ_F* = 10^{−7}. These instances need no eigendecomposition for our method; also SDPT3 can take good advantage of the problem's natural sparsity as well. Table 2 shows our

SECOND-ORDER FEASIBILITY CONE REPRESENTATION

Table 2. Average computational results from 100 randomly generated diagonal problems with τ_F = τ_{F*} = 10^{−7}. (Columns: dimension n; theoretical iteration bound and actual iterations of the proposed method; running times in seconds for the proposed method and for SDPT3. [table entries omitted])

computational results. Our method still substantially outperforms SDPT3, but not as dramatically when n is very large. However, the running-time numbers in Table 2 are the running times until the stopping criteria are met for each method. The stopping criteria for SDPT3 include stopping when the duality gap is sufficiently small or when insufficient progress is made in satisfying primal/dual feasibility/optimality. For the diagonal problems generated, this latter stopping criterion is unfortunately encountered quite often: the relative error of the final solution from SDPT3 was at least 0.01 for 65% of the diagonal problem instances and was at least 0.001 for 8% of the instances. In fact, SDPT3 stopped with a relative error of at most 10^{−6} in only a small fraction of the instances. In contrast, our proposed method terminated with a negligible relative error in all instances.

5. Proof of Proposition 7. This section is devoted to the proof of Proposition 7. Our algorithmic approach is motivated by Ye [10], and it consists of a combination of a binary search and Newton's method to approximately solve f(γ) = 0 for the function f defined in section 5.2. An alternative approach would be to use the interpolation methods presented and analyzed in Melman [4], for which global quadratic convergence is proved but for which there is no complexity analysis of the associated constants. While Proposition 7 indicates that a solution (y, z, θ) with duality gap G ≤ σ for the projection problem can be computed extremely efficiently, unfortunately our proof is not nearly as efficient as we or the reader might wish. We assume throughout this section that the hypotheses of Proposition 7 hold. We start with a review of Smale's main result for Newton's method in [8].
5.1. Newton's method and Smale's results. Let g be an analytic function, and consider the Newton iterate from a given point γ̂:

    γ_+ = γ̂ − g(γ̂)/g′(γ̂),

and let {γ_k}_{k≥0} denote the sequence of points generated starting from γ̂ = γ_0.

Definition. A point γ_0 is said to be an approximate zero of g if |γ_k − γ_{k−1}| ≤ (1/2)^{2^{k−1}−1} |γ_1 − γ_0| for all k ≥ 1.

For an approximate zero γ_0, let γ* := lim_{k→∞} γ_k. Then γ* is a zero of g, and Newton's method starting from γ_0 converges quadratically to γ* from the very first iteration. The main result in [8] can be restated as follows.
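Before the formal statement, the Newton iteration and the approximate-zero decrement condition can be illustrated concretely. This is a hedged sketch on a toy function (γ² − 2, not the paper's f), checking numerically that the iterate gaps shrink at the doubly exponential rate in the definition.

```python
# Illustration (not from the paper): Newton's method on an analytic g,
# together with a check of the approximate-zero decrement
# |γ_k - γ_{k-1}| <= (1/2)^(2^(k-1) - 1) |γ_1 - γ_0|.

def newton(g, dg, gamma0, iters):
    """Return the Newton iterates γ_0, γ_1, ..., γ_iters."""
    seq = [gamma0]
    for _ in range(iters):
        gamma = seq[-1]
        seq.append(gamma - g(gamma) / dg(gamma))
    return seq

# Toy example: g(γ) = γ^2 - 2 with root sqrt(2); γ0 = 1.5 turns out to
# be an approximate zero in the sense of the definition above.
g = lambda t: t * t - 2.0
dg = lambda t: 2.0 * t
seq = newton(g, dg, 1.5, 5)

step1 = abs(seq[1] - seq[0])
ok = all(abs(seq[k] - seq[k - 1]) <= 0.5 ** (2 ** (k - 1) - 1) * step1
         for k in range(1, len(seq)))
print(ok)  # → True
```

The doubly exponential exponent 2^{k−1} − 1 is what makes the later complexity bounds depend on the desired accuracy only through a ln ln term.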

Theorem 3 (Smale [8]). Let g be an analytic function. If γ̂ satisfies

(22)    sup_{k>1} | g^{(k)}(γ̂) / (k! g′(γ̂)) |^{1/(k−1)} ≤ |g′(γ̂)| / (8 |g(γ̂)|),

then γ̂ is an approximate zero of g. Furthermore, if γ̂ is an approximate zero of g, then |γ_k − γ*| ≤ (1/2)^{2^{k−1}−1} |γ_0 − γ*| for all k ≥ 1.

5.2. Properties of f(γ). We employ the change of variables s = Q^T x, whereby from the hypotheses of Proposition 7 we have s_n > 0, s^T D s > 0, and

    ε_P = s_n √|D_n| / ( Σ_{j=1}^{n−1} D_j s_j² )^{1/2} > ε.

We consider computing a zero of our function of interest:

(23)    f(γ) = s^T (I + γD)^{−2} D s = Σ_{i=1}^n D_i s_i² / (1 + γD_i)².

Lemma 1. Under the hypotheses of Proposition 7, f has the following properties:
(i) f(0) > 0, lim_{γ ↑ 1/|D_n|} f(γ) = −∞, and f has a unique root γ* ∈ (0, 1/|D_n|).
(ii) f is analytic on (−1/D_1, 1/|D_n|), and for k ≥ 1 the kth derivative of f is

    f^{(k)}(γ) = (−1)^k (k+1)! s^T (I + γD)^{−(k+2)} D^{k+1} s = (−1)^k (k+1)! Σ_{i=1}^n D_i^{k+1} s_i² / (1 + γD_i)^{k+2}.

(iii) sup_{k>1} | f^{(k)}(γ) / (k! f′(γ)) |^{1/(k−1)} ≤ (3/2) max{ D_1/(1 + γD_1), |D_n|/(1 − γ|D_n|) }.
(iv) (1 − ε_P)/(|D_n| + ε_P D_1) ≤ γ* ≤ (1 − ε_P)/|D_n|, where ε_P is as defined above.
(v) There exists a unique value γ̄ ∈ (−1/D_1, 1/|D_n|) such that f is convex on (−1/D_1, γ̄] and concave on [γ̄, 1/|D_n|).

Proof. (i) follows from the mean value theorem and the observation that f is strictly decreasing, and (ii) follows using a standard derivation. To prove (iii), observe that

    | f^{(k)}(γ) / (k! f′(γ)) |^{1/(k−1)}
      = [ ((k+1)!/(2 · k!)) · ( s^T (I + γD)^{−(k+2)} D^{k+1} s ) / ( s^T (I + γD)^{−3} D² s ) ]^{1/(k−1)}
      ≤ (3/2) [ max_{v ≠ 0} (v^T P^{k−1} v)/(v^T v) ]^{1/(k−1)}
      ≤ (3/2) max_{i=1,...,n} | D_i/(1 + γD_i) |,

where v := (I + γD)^{−3/2} D s, P := (I + γD)^{−1} D, and we used ((k+1)/2)^{1/(k−1)} ≤ 3/2 for k ≥ 2. Therefore

    | f^{(k)}(γ) / (k! f′(γ)) |^{1/(k−1)} ≤ (3/2) max_i | D_i/(1 + γD_i) | ≤ (3/2) max{ D_1/(1 + γD_1), |D_n|/(1 − γ|D_n|) },

which proves (iii). To prove the first inequality of (iv), note that

    f(γ) = Σ_{i=1}^n D_i s_i²/(1 + γD_i)² ≥ (1/(1 + γD_1)²) Σ_{i=1}^{n−1} D_i s_i² − |D_n| s_n²/(1 − γ|D_n|)².

The right-hand side of the expression above equals zero only at γ̃ := (1 − ε_P)/(|D_n| + ε_P D_1). This implies that f(γ̃) ≥ 0, whereby γ* ≥ γ̃, since f is strictly decreasing. For the second inequality, note that ε_P ∈ (0, 1) since s_n > 0 and s^T Ds > 0. We have f(γ) < Σ_{i=1}^{n−1} D_i s_i² − |D_n| s_n²/(1 − γ|D_n|)², and substituting γ = (1 − ε_P)/|D_n| into this strict inequality yields f((1 − ε_P)/|D_n|) < 0, which then implies that γ* < (1 − ε_P)/|D_n|. To prove (v), examine the derivatives of f in (ii), and notice that f^{(k)}(γ) < 0 for any odd value of k, whereby f″ is strictly decreasing. Let γ̄ be the unique point in (−1/D_1, 1/|D_n|) such that f″(γ̄) = 0. Since f″ is strictly decreasing, f is convex on (−1/D_1, γ̄] and concave on [γ̄, 1/|D_n|).

Figure 2 illustrates the geometry underlying some of the analytical properties of f described by Lemma 1.

Remark 4. In the interval (−1/D_1, M] the maximum in (iii) of Lemma 1 is D_1/(1 + γD_1), and in the interval [M, 1/|D_n|) the maximum is |D_n|/(1 − γ|D_n|), where M := 1/(2|D_n|) − 1/(2D_1).

5.3. Locating an approximate zero of f by binary search. From Lemma 1 we know that γ* ∈ (0, Ū], where Ū := (1 − ε)/|D_n|. We will cover this interval with subintervals and use a binary search to locate an approximate zero of f, motivated by the method of Ye [10]. Noticing from Remark 4 that the maximum in (iii) of Lemma 1 depends on the midpoint M := 1/(2|D_n|) − 1/(2D_1), we will consider two types of subintervals: the left intervals will cover [0, max{0, M}], and the right intervals will cover [max{0, M}, Ū]. (Of course, in the case when M ≤ 0, there is no need to create the left intervals.) The left intervals will be of the form [L_{i−1}, L_i], where L_i := (1/(2D_1))((3/2)^i − 1) for i = 0, 1, .... The right intervals will have the form [R_i, R_{i−1}], where R_i := 1/|D_n| − (1/|D_n| − Ū)(3/2)^i for i = 0, 1, .... Let [a, b] denote one of these intervals (either [L_{i−1}, L_i] or [R_i, R_{i−1}] for some i).
Note that if f(a) ≥ 0 and f(b) ≤ 0, then γ* ∈ [a, b]. Supposing that this is the case, it follows from Lemma 1 that f is either convex on [a, γ*] or concave on [γ*, b] (or both); consider starting Newton's method from γ̂ = a in the first case or γ̂ = b in the second case. Then the Newton step γ_+ = γ̂ − f(γ̂)/f′(γ̂) satisfies

(24)    | γ_+ − γ̂ | = | f(γ̂)/f′(γ̂) | ≤ | γ* − γ̂ | ≤ b − a,

where the first inequality follows from either the convexity of f on [a, γ*] or the concavity of f on [γ*, b]. In particular, we have

(25)    | f(γ̂)/f′(γ̂) | ≤ | γ* − γ̂ |,

which relates the value of the function at an approximate solution to the error in our approximation.
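The overall scheme, bracketing γ* between endpoints where f changes sign and then refining by Newton's method, can be sketched as follows. This is a simplified illustration, not the paper's algorithm: a plain bisection over [0, Ū] stands in for the geometrically spaced left/right intervals, and the data D, s are made up (with D_n < 0 < D_1, s_n > 0, and s^T Ds > 0 as the hypotheses require).

```python
# f is the function of interest f(γ) = Σ D_i s_i^2/(1+γD_i)^2 from
# section 5.2, for illustrative data; f(0) > 0 and f(γ) → -∞ as γ → 1/|D_n|.

def f(gamma, D, s):
    return sum(Di * si * si / (1.0 + gamma * Di) ** 2 for Di, si in zip(D, s))

def df(gamma, D, s):
    # First derivative: f'(γ) = -2 Σ D_i^2 s_i^2/(1+γD_i)^3 (always < 0 here).
    return -2.0 * sum(Di ** 2 * si * si / (1.0 + gamma * Di) ** 3
                      for Di, si in zip(D, s))

def bracket_then_newton(D, s, eps=1e-2, newton_steps=8):
    Ubar = (1.0 - eps) / abs(D[-1])
    a, b = 0.0, Ubar
    # Bisection keeps the sign bracket f(a) >= 0 >= f(b), mimicking the
    # binary search over interval endpoints.
    for _ in range(20):
        m = 0.5 * (a + b)
        if f(m, D, s) >= 0.0:
            a = m
        else:
            b = m
    # Newton refinement from an endpoint of the (now tiny) bracket.
    gamma = b
    for _ in range(newton_steps):
        gamma -= f(gamma, D, s) / df(gamma, D, s)
    return gamma

# Illustrative data (not from the paper): D sorted decreasing, D_n < 0.
D = [3.0, 1.0, -2.0]
s = [1.0, 0.5, 0.5]
gamma_star = bracket_then_newton(D, s)
print(abs(f(gamma_star, D, s)) < 1e-9)  # → True
```

Here the root lies near γ* ≈ 0.197, strictly inside (0, 1/|D_n|) = (0, 0.5), and the residual after the Newton refinement is at machine-precision level.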

Lemma 2. Under the hypotheses of Proposition 7, the intervals described herein have the following properties:
(i) The total number of left intervals and right intervals needed to cover [0, Ū] is

    K_L := ⌈ (ln(1/2) + 2 ln(1/τ_F)) / ln(3/2) ⌉₊   and   K_R := ⌈ ln(1/ε) / ln(3/2) ⌉,

respectively.
(ii) Let [a, b] denote one of these intervals, and suppose that f(a) ≥ 0 and f(b) ≤ 0. Then either a or b is an approximate zero of f, and γ* ∈ [a, b].
(iii) R_{i−1} − R_i ≤ 1/(2|D_n|) for i = 1, ..., K_R, and L_i − L_{i−1} ≤ 1/(2|D_n|) for i = 1, ..., K_L.

Proof. We first prove (i) for the right intervals. We have R_0 = Ū and, since (3/2)^{K_R} ≥ 1/ε,

    R_{K_R} = 1/|D_n| − (ε/|D_n|)(3/2)^{K_R} ≤ 1/|D_n| − min{ 1/|D_n|, 1/(2|D_n|) + 1/(2D_1) } = max{ 0, 1/(2|D_n|) − 1/(2D_1) } = max{0, M},

and thus the right intervals cover [max{0, M}, Ū]. Note that, using the above reasoning, one easily shows that, because K_R ≤ 1 + ln(1/ε)/ln(3/2), one also has

(26)    (3/2)^{K_R} ≤ 3/(2ε).

For the left intervals, first consider the case when M ≥ 0. Then D_1 ≥ |D_n| and τ_F ≤ 1/2, whereby there is no need to take the nonnegative part in the definition of K_L. We have L_0 = 0 and

    L_{K_L} = (1/(2D_1))((3/2)^{K_L} − 1) ≥ (1/(2D_1))(1/(2τ_F²) − 1) = (1/(2D_1))(D_1/|D_n|) = 1/(2|D_n|) ≥ M,

where the middle step uses the expression of τ_F in terms of D, and thus the left intervals cover [0, M] = [0, max{0, M}]. Note that, using the above reasoning, one easily shows that, because K_L ≤ 1 + (ln(1/2) + 2 ln(1/τ_F))/ln(3/2), one also has

(27)    (3/2)^{K_L} ≤ 3/(4τ_F²).

When M ≤ 0 there is nothing to prove. To prove (ii), we consider the two cases of [a, b] being either a left or a right interval. If [a, b] is a left interval, then M ≥ 0 and b = (3/2)a + 1/(4D_1). In this case, for one of γ̂ = a or γ̂ = b, we have for all k > 1:

    |f′(γ̂)| / (8|f(γ̂)|) ≥ 1/(8(b − a)) = 1/(4(a + 1/(2D_1))) ≥ (3/2) · D_1/(1 + γ̂D_1) ≥ | f^{(k)}(γ̂)/(k! f′(γ̂)) |^{1/(k−1)},

where the first inequality uses (24), the second inequality uses a ≤ γ̂, and the third inequality uses Remark 4 and the fact that γ̂ ≤ M in conjunction with Lemma 1. Therefore γ̂ is an approximate zero of f. If [a, b] is a right interval, then a = (3/2)b − 1/(2|D_n|) and M ≤ a ≤ b.
In this case, for one of γ̂ = a or γ̂ = b, we have for all k > 1:

    |f′(γ̂)| / (8|f(γ̂)|) ≥ 1/(8(b − a)) = |D_n|/(4(1 − b|D_n|)) ≥ (3/2) · |D_n|/(1 − γ̂|D_n|) ≥ | f^{(k)}(γ̂)/(k! f′(γ̂)) |^{1/(k−1)},

where the first inequality uses (24), the second inequality uses M ≤ a ≤ γ̂ ≤ b, and the third inequality uses Remark 4 and the fact that γ̂ ≥ M in conjunction with Lemma 1. Therefore γ̂ is an approximate zero of f. To prove (iii), for the right intervals we have

    R_{i−1} − R_i = (ε/(3|D_n|))(3/2)^i ≤ (ε/(3|D_n|))(3/2)^{K_R} ≤ (ε/(3|D_n|)) · (3/(2ε)) = 1/(2|D_n|)

for i = 1, ..., K_R, by the definition of K_R, where the second inequality derives from (26). For the left intervals, we can assume M ≥ 0 (otherwise they are not constructed), in which case D_1 ≥ |D_n|. In this case, we have

    L_i − L_{i−1} = (1/(6D_1))(3/2)^i ≤ (1/(6D_1))(3/2)^{K_L} ≤ (1/(6D_1)) · (3/(4τ_F²)) = (1/4)(1/D_1 + 1/|D_n|) ≤ 1/(2|D_n|),

by the definition of K_L, where the second inequality derives from (27) and the last equality uses the expression of τ_F in terms of D.

Based on these properties, consider the following method for locating an approximate zero of f. Perform a binary search on the endpoints of the intervals, testing the endpoints to locate an interval [a, b] for which f(a) ≥ 0 and f(b) ≤ 0. Then either a or b is an approximate zero of f. Then initiate Newton's method from both a and b, either in parallel or iterate-sequentially. Notice that, in order to perform a binary search on the left and right intervals, there is no need to compute and evaluate f at all of the endpoints. In fact, the operation complexity of the binary search will be O(n ln K_L) and O(n ln K_R), respectively, since each function evaluation of f requires O(n) operations.

5.4. Computing a solution with duality gap at most σ. Under the hypotheses of Proposition 7, suppose that [a, b] is one of the constructed intervals, f(a) ≥ 0, and f(b) ≤ 0. Then, from Lemmas 1 and 2, γ* ∈ [a, b] and either f is convex on [a, γ*] or concave on [γ*, b] (or both). We first analyze the latter case, i.e., when f is concave on [γ*, b], whereby b is an approximate zero of f, and we analyze the iterates of Newton's method for k iterations starting at γ_0 = b. Let γ := γ_k be the final iterate. It follows from the concavity of f on [γ*, b] that γ ≥ γ* and consequently f(γ) ≤ 0.
Then the analysis in Case 3 shows that the resulting assignment yields a feasible solution with duality gap G = −f(γ)/(s^T D²[I + γD]^{−2} s)^{1/2}. The following result bounds the value of this duality gap.

Lemma 3. Let σ ∈ (0, 1] be the desired duality gap, and let

    k := 1 + ⌈ log₂ log₂ ( (3/σ)(1/τ_F² + 1/ε) ) ⌉.

Under the hypotheses of Proposition 7 and the setup above where b is an approximate zero of f, let γ_0 := b and γ_1, ..., γ_k be the Newton iterates, and define γ := γ_k. Then the resulting assignment will be feasible with duality gap at most σ.

Proof. We have |f(γ)/f′(γ)| ≤ γ − γ* from the concavity of f on [γ*, b]. Also, we have

    |f′(γ)| = 2 Σ_{i=1}^{n−1} D_i² s_i²/(1 + γD_i)³ + 2 D_n² s_n²/(1 + γD_n)³.

Let G = y^T z denote the duality gap. Using the bound on |f′(γ)| above together with (s^T D²[I + γD]^{−2} s)^{1/2} ≥ |D_n| s_n/(1 + γD_n) in the last inequality, we obtain

    G = −f(γ)/(s^T D²[I + γD]^{−2} s)^{1/2} ≤ |f′(γ)| (γ − γ*) / (s^T D²[I + γD]^{−2} s)^{1/2} ≤ (2D_1 + 2|D_n|/(1 + γD_n))(γ − γ*).

Next note that γ ≤ Ū = (1 − ε)/|D_n|, which implies that ε ≤ 1 + γD_n. Therefore, recalling that γ is the kth iterate, we have

    G ≤ (2D_1 + 2|D_n|/ε)(γ − γ*) ≤ 2|D_n|(1/τ_F² + 1/ε)(γ − γ*) ≤ 2|D_n|(1/τ_F² + 1/ε)(1/2)^{2^{k−1}−1}|γ* − γ_0| ≤ 2(1/τ_F² + 1/ε)(1/2)^{2^{k−1}},

where we used Theorem 3 for the third inequality and Lemma 2 for the fourth inequality (|γ* − γ_0| ≤ b − a ≤ 1/(2|D_n|)). Substituting the value of k above yields G ≤ σ.

Last of all, we analyze the case when f is convex on [a, γ*], whereby a is an approximate zero of f, and we analyze the iterates of Newton's method for k iterations starting at γ_0 = a. Let γ_k be the final iterate. It follows from the convexity of f on [a, γ*] that γ_k ≤ γ* and consequently f(γ_k) ≥ 0, in which case the resulting assignment is not necessarily feasible. However, invoking Theorem 3, we know that γ* ≤ γ_k + (1/2)^{2^{k−1}−1}|γ* − γ_0|; we also know that γ* ≤ Ū, and so we can set γ := min{γ_k + (1/2)^{2^{k−1}−1}|γ* − γ_0|, Ū}. Then the analysis in Case 3 shows that the resulting assignment yields a feasible solution with duality gap G = −f(γ)/(s^T D²[I + γD]^{−2} s)^{1/2}. The following result bounds the value of this duality gap.

Lemma 4. Let σ ∈ (0, 1] be the desired duality gap, and let

    k := 1 + ⌈ log₂ log₂ ( (48/σ)(1/τ_F² + 1/ε) ) ⌉.

Under the hypotheses of Proposition 7 and the setup above where a is an approximate zero of f, let γ_0 := a and γ_1, ..., γ_k be the Newton iterates, and define γ := min{γ_k + (1/2)^{2^{k−1}−1}|γ* − γ_0|, Ū}. Then the resulting assignment will be feasible with duality gap at most σ.

Proof. Define δ := γ − γ_k; it follows that δ ≥ 0 and γ_k + δ ≤ Ū. Furthermore,

(28)    δ ≤ (1/2)^{2^{k−1}−1}|γ* − γ_0| ≤ (σ/48)[1/τ_F² + 1/ε]^{−1}(1/|D_n|) ≤ min{ε, τ_F²}/(2|D_n|) ≤ min{ ε/(2|D_n|), 1/(2D_1) }.

Therefore δ ≤ ε/(2|D_n|), whereby

    1 + γ_k D_n − 2δ|D_n| = 1 − (γ_k + δ)|D_n| − δ|D_n| ≥ ε − ε/2 > 0,

where we also used γ_k + δ ≤ Ū = (1 − ε)/|D_n|. Therefore

(29)    1 + γ_k D_n ≤ 2(1 + (γ_k + δ)D_n) ≤ 2(1 + tD_n)    for all t ∈ [γ_k, γ_k + δ].

We also have from (28) that δ ≤ 1/(2D_1) ≤ 1/(2D_i) ≤ (1/D_i)(1 + γ_k D_i) for i = 1, ..., n−1; hence

(30)    1 + γ_k D_i + δD_i ≤ 2(1 + γ_k D_i),    i = 1, ..., n−1.

The duality gap of the resulting assignment is

    G = y^T z = −f(γ)/(s^T D²[I + γD]^{−2} s)^{1/2} = −f(γ_k + δ)/(s^T D²[I + (γ_k + δ)D]^{−2} s)^{1/2}.

We now proceed to bound the numerator and denominator of the rightmost expression. For the numerator we have

    −f(γ_k + δ) = −f(γ_k) − ∫_{γ_k}^{γ_k+δ} f′(t) dt ≤ ∫_{γ_k}^{γ_k+δ} |f′(t)| dt,

where we observe that f(γ_k) ≥ 0, f(γ_k + δ) ≤ 0, and f′(t) ≤ 0 for all t ∈ [0, 1/|D_n|). Using (29) for t ∈ [γ_k, γ_k + δ], we have

    |f′(t)| = 2 Σ_{i=1}^{n−1} D_i² s_i²/(1 + tD_i)³ + 2 D_n² s_n²/(1 + tD_n)³ ≤ 2 Σ_{i=1}^{n−1} D_i² s_i²/(1 + γ_k D_i)³ + 16 D_n² s_n²/(1 + γ_k D_n)³ ≤ 8 |f′(γ_k)|,

and it follows that −f(γ_k + δ) ≤ 8δ|f′(γ_k)|. To bound the denominator, simply notice from (30) and 1 + γ_k D_n + δD_n ≥ (1/2)(1 + γ_k D_n) that

    s^T D²[I + (γ_k + δ)D]^{−2} s ≥ (1/4) s^T D²[I + γ_k D]^{−2} s.

Therefore

    G = −f(γ_k + δ)/(s^T D²[I + (γ_k + δ)D]^{−2} s)^{1/2} ≤ 16 δ |f′(γ_k)| / (s^T D²[I + γ_k D]^{−2} s)^{1/2}.

Next notice from the logic of the proof of Lemma 3 that

    |f′(γ_k)| / (s^T D²[I + γ_k D]^{−2} s)^{1/2} ≤ 2|D_n|(1/τ_F² + 1/ε);

therefore

    G ≤ 32 δ |D_n| (1/τ_F² + 1/ε) ≤ 32 |D_n| (1/τ_F² + 1/ε) · (σ/48)[1/τ_F² + 1/ε]^{−1}(1/|D_n|) ≤ σ,

where the last inequality uses the second inequality of (28).

Proof of Proposition 7. Note from the discussion at the end of section 5.3 that the operation complexity of the binary search is O(n ln K_L + n ln K_R) = O(n ln ln(1/τ_F + 1/ε)) from Lemma 2. The number of Newton steps is O(ln ln(1/τ_F + 1/ε + 1/σ)) from Lemmas 3 and 4, with each Newton step requiring O(n) operations, yielding the desired complexity bound.

Acknowledgments. We are grateful to two anonymous referees for their suggestions on ways to improve the paper.

REFERENCES

[1] F. Alizadeh, J.-P. A. Haeberly, and M. L. Overton, Primal-dual interior-point methods for semidefinite programming: Convergence rates, stability, and numerical results, SIAM J. Optim., 8 (1998), pp. 746-768.
[2] A. Belloni, R. M. Freund, and S. Vempala, Efficiency of a Re-scaled Perceptron Algorithm for Conic Systems, Working paper, MIT Operations Research Center, Cambridge, MA, 2006.
[3] A. Berman, Cones, Matrices, and Mathematical Programming, Springer-Verlag, New York, 1973.
[4] A. Melman, A unifying convergence analysis of second-order methods for secular equations, Math. Comp., 66 (1997), pp. 333-344.
[5] G. Pataki, On the closedness of the linear image of a closed convex cone, Math. Oper. Res., 32 (2007), pp. 395-412.
[6] J. Renegar, A Mathematical View of Interior-Point Methods in Convex Optimization, SIAM, Philadelphia, 2001.
[7] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
[8] S. Smale, Newton's method estimates from data at one point, in The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, R. Ewing, K. Gross, and C. Martin, eds., Springer-Verlag, New York, 1986, pp. 185-196.
[9] J. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11/12 (1999), pp. 625-653.
[10] Y. Ye, A new complexity result for minimizing a general quadratic function with a sphere constraint, in Recent Advances in Global Optimization, C. Floudas and P. Pardalos, eds., Princeton University Press, Princeton, NJ, 1992, pp. 19-31.


Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Primal-dual relationship between Levenberg-Marquardt and central trajectories for linearly constrained convex optimization Roger Behling a, Clovis Gonzaga b and Gabriel Haeser c March 21, 2013 a Department

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Duality in Linear Programming

Duality in Linear Programming Duality in Linear Programming Gary D. Knott Civilized Software Inc. 1219 Heritage Park Circle Silver Spring MD 296 phone:31-962-3711 email:knott@civilized.com URL:www.civilized.com May 1, 213.1 Duality

More information

Two-Term Disjunctions on the Second-Order Cone

Two-Term Disjunctions on the Second-Order Cone Noname manuscript No. (will be inserted by the editor) Two-Term Disjunctions on the Second-Order Cone Fatma Kılınç-Karzan Sercan Yıldız the date of receipt and acceptance should be inserted later Abstract

More information

Some notes on Coxeter groups

Some notes on Coxeter groups Some notes on Coxeter groups Brooks Roberts November 28, 2017 CONTENTS 1 Contents 1 Sources 2 2 Reflections 3 3 The orthogonal group 7 4 Finite subgroups in two dimensions 9 5 Finite subgroups in three

More information

Optimization Theory. A Concise Introduction. Jiongmin Yong

Optimization Theory. A Concise Introduction. Jiongmin Yong October 11, 017 16:5 ws-book9x6 Book Title Optimization Theory 017-08-Lecture Notes page 1 1 Optimization Theory A Concise Introduction Jiongmin Yong Optimization Theory 017-08-Lecture Notes page Optimization

More information

c 1995 Society for Industrial and Applied Mathematics Vol. 37, No. 1, pp , March

c 1995 Society for Industrial and Applied Mathematics Vol. 37, No. 1, pp , March SIAM REVIEW. c 1995 Society for Industrial and Applied Mathematics Vol. 37, No. 1, pp. 93 97, March 1995 008 A UNIFIED PROOF FOR THE CONVERGENCE OF JACOBI AND GAUSS-SEIDEL METHODS * ROBERTO BAGNARA Abstract.

More information

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012.

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012. Math 5620 - Introduction to Numerical Analysis - Class Notes Fernando Guevara Vasquez Version 1990. Date: January 17, 2012. 3 Contents 1. Disclaimer 4 Chapter 1. Iterative methods for solving linear systems

More information

Chapter 1. Preliminaries. The purpose of this chapter is to provide some basic background information. Linear Space. Hilbert Space.

Chapter 1. Preliminaries. The purpose of this chapter is to provide some basic background information. Linear Space. Hilbert Space. Chapter 1 Preliminaries The purpose of this chapter is to provide some basic background information. Linear Space Hilbert Space Basic Principles 1 2 Preliminaries Linear Space The notion of linear space

More information

Fast ADMM for Sum of Squares Programs Using Partial Orthogonality

Fast ADMM for Sum of Squares Programs Using Partial Orthogonality Fast ADMM for Sum of Squares Programs Using Partial Orthogonality Antonis Papachristodoulou Department of Engineering Science University of Oxford www.eng.ox.ac.uk/control/sysos antonis@eng.ox.ac.uk with

More information

Additional Homework Problems

Additional Homework Problems Additional Homework Problems Robert M. Freund April, 2004 2004 Massachusetts Institute of Technology. 1 2 1 Exercises 1. Let IR n + denote the nonnegative orthant, namely IR + n = {x IR n x j ( ) 0,j =1,...,n}.

More information

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS Igor V. Konnov Department of Applied Mathematics, Kazan University Kazan 420008, Russia Preprint, March 2002 ISBN 951-42-6687-0 AMS classification:

More information

Lecture 5. The Dual Cone and Dual Problem

Lecture 5. The Dual Cone and Dual Problem IE 8534 1 Lecture 5. The Dual Cone and Dual Problem IE 8534 2 For a convex cone K, its dual cone is defined as K = {y x, y 0, x K}. The inner-product can be replaced by x T y if the coordinates of the

More information

On Semicontinuity of Convex-valued Multifunctions and Cesari s Property (Q)

On Semicontinuity of Convex-valued Multifunctions and Cesari s Property (Q) On Semicontinuity of Convex-valued Multifunctions and Cesari s Property (Q) Andreas Löhne May 2, 2005 (last update: November 22, 2005) Abstract We investigate two types of semicontinuity for set-valued

More information

Uniqueness of the Solutions of Some Completion Problems

Uniqueness of the Solutions of Some Completion Problems Uniqueness of the Solutions of Some Completion Problems Chi-Kwong Li and Tom Milligan Abstract We determine the conditions for uniqueness of the solutions of several completion problems including the positive

More information

Second-order cone programming

Second-order cone programming Outline Second-order cone programming, PhD Lehigh University Department of Industrial and Systems Engineering February 10, 2009 Outline 1 Basic properties Spectral decomposition The cone of squares The

More information

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize

More information

PROJECTIVE PRE-CONDITIONERS FOR IMPROVING THE BEHAVIOR OF A HOMOGENEOUS CONIC LINEAR SYSTEM

PROJECTIVE PRE-CONDITIONERS FOR IMPROVING THE BEHAVIOR OF A HOMOGENEOUS CONIC LINEAR SYSTEM Mathematical Programming manuscript No. (will be inserted by the editor) Alexandre Belloni Robert M. Freund PROJECTIVE PRE-CONDITIONERS FOR IMPROVING THE BEHAVIOR OF A HOMOGENEOUS CONIC LINEAR SYSTEM Draft,

More information

The Newton Bracketing method for the minimization of convex functions subject to affine constraints

The Newton Bracketing method for the minimization of convex functions subject to affine constraints Discrete Applied Mathematics 156 (2008) 1977 1987 www.elsevier.com/locate/dam The Newton Bracketing method for the minimization of convex functions subject to affine constraints Adi Ben-Israel a, Yuri

More information

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018 MATH 57: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 18 1 Global and Local Optima Let a function f : S R be defined on a set S R n Definition 1 (minimizers and maximizers) (i) x S

More information

4.2. ORTHOGONALITY 161

4.2. ORTHOGONALITY 161 4.2. ORTHOGONALITY 161 Definition 4.2.9 An affine space (E, E ) is a Euclidean affine space iff its underlying vector space E is a Euclidean vector space. Given any two points a, b E, we define the distance

More information

2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers.

2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers. Chapter 3 Duality in Banach Space Modern optimization theory largely centers around the interplay of a normed vector space and its corresponding dual. The notion of duality is important for the following

More information

ON THE ARITHMETIC-GEOMETRIC MEAN INEQUALITY AND ITS RELATIONSHIP TO LINEAR PROGRAMMING, BAHMAN KALANTARI

ON THE ARITHMETIC-GEOMETRIC MEAN INEQUALITY AND ITS RELATIONSHIP TO LINEAR PROGRAMMING, BAHMAN KALANTARI ON THE ARITHMETIC-GEOMETRIC MEAN INEQUALITY AND ITS RELATIONSHIP TO LINEAR PROGRAMMING, MATRIX SCALING, AND GORDAN'S THEOREM BAHMAN KALANTARI Abstract. It is a classical inequality that the minimum of

More information

Convex Analysis and Optimization Chapter 2 Solutions

Convex Analysis and Optimization Chapter 2 Solutions Convex Analysis and Optimization Chapter 2 Solutions Dimitri P. Bertsekas with Angelia Nedić and Asuman E. Ozdaglar Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

More information

A strongly polynomial algorithm for linear systems having a binary solution

A strongly polynomial algorithm for linear systems having a binary solution A strongly polynomial algorithm for linear systems having a binary solution Sergei Chubanov Institute of Information Systems at the University of Siegen, Germany e-mail: sergei.chubanov@uni-siegen.de 7th

More information

Assignment 1: From the Definition of Convexity to Helley Theorem

Assignment 1: From the Definition of Convexity to Helley Theorem Assignment 1: From the Definition of Convexity to Helley Theorem Exercise 1 Mark in the following list the sets which are convex: 1. {x R 2 : x 1 + i 2 x 2 1, i = 1,..., 10} 2. {x R 2 : x 2 1 + 2ix 1x

More information

CHAPTER 7. Connectedness

CHAPTER 7. Connectedness CHAPTER 7 Connectedness 7.1. Connected topological spaces Definition 7.1. A topological space (X, T X ) is said to be connected if there is no continuous surjection f : X {0, 1} where the two point set

More information

2 Sequences, Continuity, and Limits

2 Sequences, Continuity, and Limits 2 Sequences, Continuity, and Limits In this chapter, we introduce the fundamental notions of continuity and limit of a real-valued function of two variables. As in ACICARA, the definitions as well as proofs

More information

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES JOEL A. TROPP Abstract. We present an elementary proof that the spectral radius of a matrix A may be obtained using the formula ρ(a) lim

More information

3.10 Lagrangian relaxation

3.10 Lagrangian relaxation 3.10 Lagrangian relaxation Consider a generic ILP problem min {c t x : Ax b, Dx d, x Z n } with integer coefficients. Suppose Dx d are the complicating constraints. Often the linear relaxation and the

More information

Lecture 2: Linear Algebra Review

Lecture 2: Linear Algebra Review EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1

More information

Permutation invariant proper polyhedral cones and their Lyapunov rank

Permutation invariant proper polyhedral cones and their Lyapunov rank Permutation invariant proper polyhedral cones and their Lyapunov rank Juyoung Jeong Department of Mathematics and Statistics University of Maryland, Baltimore County Baltimore, Maryland 21250, USA juyoung1@umbc.edu

More information

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint

More information

Semidefinite Programming

Semidefinite Programming Semidefinite Programming Notes by Bernd Sturmfels for the lecture on June 26, 208, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra The transition from linear algebra to nonlinear algebra has

More information

Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then

Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then 1. x S is a global minimum point of f over S if f (x) f (x ) for any x S. 2. x S

More information

L. Vandenberghe EE236C (Spring 2016) 18. Symmetric cones. definition. spectral decomposition. quadratic representation. log-det barrier 18-1

L. Vandenberghe EE236C (Spring 2016) 18. Symmetric cones. definition. spectral decomposition. quadratic representation. log-det barrier 18-1 L. Vandenberghe EE236C (Spring 2016) 18. Symmetric cones definition spectral decomposition quadratic representation log-det barrier 18-1 Introduction This lecture: theoretical properties of the following

More information

On the Power of Robust Solutions in Two-Stage Stochastic and Adaptive Optimization Problems

On the Power of Robust Solutions in Two-Stage Stochastic and Adaptive Optimization Problems MATHEMATICS OF OPERATIONS RESEARCH Vol. 35, No., May 010, pp. 84 305 issn 0364-765X eissn 156-5471 10 350 084 informs doi 10.187/moor.1090.0440 010 INFORMS On the Power of Robust Solutions in Two-Stage

More information

IMPLEMENTATION OF INTERIOR POINT METHODS

IMPLEMENTATION OF INTERIOR POINT METHODS IMPLEMENTATION OF INTERIOR POINT METHODS IMPLEMENTATION OF INTERIOR POINT METHODS FOR SECOND ORDER CONIC OPTIMIZATION By Bixiang Wang, Ph.D. A Thesis Submitted to the School of Graduate Studies in Partial

More information

2. Review of Linear Algebra

2. Review of Linear Algebra 2. Review of Linear Algebra ECE 83, Spring 217 In this course we will represent signals as vectors and operators (e.g., filters, transforms, etc) as matrices. This lecture reviews basic concepts from linear

More information

Key words. Complementarity set, Lyapunov rank, Bishop-Phelps cone, Irreducible cone

Key words. Complementarity set, Lyapunov rank, Bishop-Phelps cone, Irreducible cone ON THE IRREDUCIBILITY LYAPUNOV RANK AND AUTOMORPHISMS OF SPECIAL BISHOP-PHELPS CONES M. SEETHARAMA GOWDA AND D. TROTT Abstract. Motivated by optimization considerations we consider cones in R n to be called

More information

Pattern Classification, and Quadratic Problems

Pattern Classification, and Quadratic Problems Pattern Classification, and Quadratic Problems (Robert M. Freund) March 3, 24 c 24 Massachusetts Institute of Technology. 1 1 Overview Pattern Classification, Linear Classifiers, and Quadratic Optimization

More information

Convex Quadratic Approximation

Convex Quadratic Approximation Convex Quadratic Approximation J. Ben Rosen 1 and Roummel F. Marcia 2 Abstract. For some applications it is desired to approximate a set of m data points in IR n with a convex quadratic function. Furthermore,

More information

SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction

SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction ACTA MATHEMATICA VIETNAMICA 271 Volume 29, Number 3, 2004, pp. 271-280 SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM NGUYEN NANG TAM Abstract. This paper establishes two theorems

More information

An Infeasible Interior-Point Algorithm with full-newton Step for Linear Optimization

An Infeasible Interior-Point Algorithm with full-newton Step for Linear Optimization An Infeasible Interior-Point Algorithm with full-newton Step for Linear Optimization H. Mansouri M. Zangiabadi Y. Bai C. Roos Department of Mathematical Science, Shahrekord University, P.O. Box 115, Shahrekord,

More information

MIT LIBRARIES. III III 111 l ll llljl II Mil IHII l l

MIT LIBRARIES. III III 111 l ll llljl II Mil IHII l l MIT LIBRARIES III III 111 l ll llljl II Mil IHII l l DUPL 3 9080 02246 1237 [DEWEy )28 1414 \^^ i MIT Sloan School of Management Sloan Working Paper 4176-01 July 2001 ON THE PRIMAL-DUAL GEOMETRY OF

More information

LMI MODELLING 4. CONVEX LMI MODELLING. Didier HENRION. LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ. Universidad de Valladolid, SP March 2009

LMI MODELLING 4. CONVEX LMI MODELLING. Didier HENRION. LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ. Universidad de Valladolid, SP March 2009 LMI MODELLING 4. CONVEX LMI MODELLING Didier HENRION LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ Universidad de Valladolid, SP March 2009 Minors A minor of a matrix F is the determinant of a submatrix

More information

A Generalized Uncertainty Principle and Sparse Representation in Pairs of Bases

A Generalized Uncertainty Principle and Sparse Representation in Pairs of Bases 2558 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 48, NO 9, SEPTEMBER 2002 A Generalized Uncertainty Principle Sparse Representation in Pairs of Bases Michael Elad Alfred M Bruckstein Abstract An elementary

More information

Nonsymmetric potential-reduction methods for general cones

Nonsymmetric potential-reduction methods for general cones CORE DISCUSSION PAPER 2006/34 Nonsymmetric potential-reduction methods for general cones Yu. Nesterov March 28, 2006 Abstract In this paper we propose two new nonsymmetric primal-dual potential-reduction

More information

Refined optimality conditions for differences of convex functions

Refined optimality conditions for differences of convex functions Noname manuscript No. (will be inserted by the editor) Refined optimality conditions for differences of convex functions Tuomo Valkonen the date of receipt and acceptance should be inserted later Abstract

More information

I teach myself... Hilbert spaces

I teach myself... Hilbert spaces I teach myself... Hilbert spaces by F.J.Sayas, for MATH 806 November 4, 2015 This document will be growing with the semester. Every in red is for you to justify. Even if we start with the basic definition

More information

Using Schur Complement Theorem to prove convexity of some SOC-functions

Using Schur Complement Theorem to prove convexity of some SOC-functions Journal of Nonlinear and Convex Analysis, vol. 13, no. 3, pp. 41-431, 01 Using Schur Complement Theorem to prove convexity of some SOC-functions Jein-Shan Chen 1 Department of Mathematics National Taiwan

More information

Course 212: Academic Year Section 1: Metric Spaces

Course 212: Academic Year Section 1: Metric Spaces Course 212: Academic Year 1991-2 Section 1: Metric Spaces D. R. Wilkins Contents 1 Metric Spaces 3 1.1 Distance Functions and Metric Spaces............. 3 1.2 Convergence and Continuity in Metric Spaces.........

More information

Math 541 Fall 2008 Connectivity Transition from Math 453/503 to Math 541 Ross E. Staffeldt-August 2008

Math 541 Fall 2008 Connectivity Transition from Math 453/503 to Math 541 Ross E. Staffeldt-August 2008 Math 541 Fall 2008 Connectivity Transition from Math 453/503 to Math 541 Ross E. Staffeldt-August 2008 Closed sets We have been operating at a fundamental level at which a topological space is a set together

More information