An unconstrained smooth minimization reformulation of the second-order cone complementarity problem


Math. Program., Ser. B 104, 293–327 (2005)
Digital Object Identifier (DOI) 10.1007/s10107-005-0617-0

Jein-Shan Chen · Paul Tseng

An unconstrained smooth minimization reformulation of the second-order cone complementarity problem

In honor of Terry Rockafellar on his 70th birthday

Received: July 1, 2004 / Accepted: May 5, 2005 / Published online: July 14, 2005 / © Springer-Verlag 2005

Abstract. A popular approach to solving the nonlinear complementarity problem (NCP) is to reformulate it as the global minimization of a certain merit function over $\mathbb{R}^n$. A popular choice of the merit function is the squared norm of the Fischer-Burmeister function, shown to be smooth over $\mathbb{R}^n$ and, for monotone NCP, each stationary point is a solution of the NCP. This merit function and its analysis were subsequently extended to the semidefinite complementarity problem (SDCP), although only differentiability, not continuous differentiability, was established. In this paper, we extend this merit function and its analysis, including continuous differentiability, to the second-order cone complementarity problem (SOCCP). Although the SOCCP is reducible to an SDCP, the reduction does not allow for easy translation of the analysis from the SDCP to the SOCCP. Instead, our analysis exploits properties of the Jordan product and spectral factorization associated with the second-order cone. We also report preliminary numerical experience with solving DIMACS second-order cone programs using a limited-memory BFGS method to minimize the merit function.

Key words. Second-order cone – Complementarity – Merit function – Spectral factorization – Jordan product – Level set – Error bound
1. Introduction

We consider the following conic complementarity problem of finding $x, y \in \mathbb{R}^n$ and $\zeta \in \mathbb{R}^n$ satisfying

$\langle x, y\rangle = 0, \qquad x \in K, \quad y \in K,$  (1)

$x = F(\zeta), \qquad y = G(\zeta),$  (2)

where $\langle\cdot,\cdot\rangle$ is the Euclidean inner product, $F:\mathbb{R}^n\to\mathbb{R}^n$ and $G:\mathbb{R}^n\to\mathbb{R}^n$ are smooth (i.e., continuously differentiable) mappings, and $K$ is a closed convex cone in $\mathbb{R}^n$ that is self-dual in the sense that $K$ equals its dual cone $K^* := \{y \mid \langle x, y\rangle \ge 0 \ \forall x \in K\}$. We will focus on the case where $K$ is the Cartesian product of second-order cones (SOC), also called Lorentz cones [11]. In other words,

$K = K^{n_1} \times \cdots \times K^{n_N},$  (3)

J.-S. Chen: Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan. e-mail: jschen@math.ntnu.edu.tw

P. Tseng: Department of Mathematics, University of Washington, Seattle, Washington 98195, USA. e-mail: tseng@math.washington.edu

Mathematics Subject Classification (1991): 26B05, 26B35, 90C33, 65K05
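As a concrete editorial illustration (not part of the original paper), membership in the Cartesian product of second-order cones in (3) can be checked block by block; the helper names below are ours.

```python
import numpy as np

def in_soc(x, tol=1e-12):
    """Membership in K^n = {(x1, x2) in R x R^{n-1} : ||x2|| <= x1}; K^1 = R_+."""
    x = np.asarray(x, dtype=float)
    if x.size == 1:
        return bool(x[0] >= -tol)
    return bool(np.linalg.norm(x[1:]) <= x[0] + tol)

def in_product_cone(x, block_sizes):
    """Membership in the product cone K = K^{n_1} x ... x K^{n_N} of (3)."""
    x = np.asarray(x, dtype=float)
    ok, i = True, 0
    for n_i in block_sizes:
        ok = ok and in_soc(x[i:i + n_i])
        i += n_i
    return ok

# K = K^3 x K^1; all blocks of size 1 would recover the nonnegative orthant
assert in_product_cone([2.0, 1.0, 1.0, 0.5], [3, 1])      # ||(1,1)|| <= 2, 0.5 >= 0
assert not in_product_cone([1.0, 1.0, 1.0, 0.5], [3, 1])  # ||(1,1)|| > 1
```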

where $N, n_1, \dots, n_N \ge 1$, $n_1 + \cdots + n_N = n$, and

$K^{n_i} := \{(x_1, x_2) \in \mathbb{R}\times\mathbb{R}^{n_i - 1} \mid \|x_2\| \le x_1\},$

with $\|\cdot\|$ denoting the Euclidean norm and $K^1$ denoting the set of nonnegative reals $\mathbb{R}_+$. A special case of (3) is $K = \mathbb{R}^n_+$, the nonnegative orthant in $\mathbb{R}^n$, which corresponds to $N = n$ and $n_1 = \cdots = n_N = 1$. We will refer to (1), (2), (3) as the second-order cone complementarity problem (SOCCP).

An important special case of the SOCCP corresponds to $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$. Then (1) and (2) reduce to

$\langle F(\zeta), \zeta\rangle = 0, \qquad F(\zeta) \in K, \quad \zeta \in K.$  (4)

If $K = \mathbb{R}^n_+$, then (4) reduces to the nonlinear complementarity problem (NCP), and (1), (2) reduce to the vertical NCP [9]. The NCP plays a fundamental role in optimization theory and has many applications in engineering and economics; see, e.g., [9, 13–15].

Another important special case of the SOCCP corresponds to the Karush-Kuhn-Tucker (KKT) optimality conditions for the convex second-order cone program (CSOCP):

minimize $g(x)$ subject to $Ax = b$, $x \in K$,  (5)

where $A \in \mathbb{R}^{m\times n}$ has full row rank, $b \in \mathbb{R}^m$, and $g:\mathbb{R}^n\to\mathbb{R}$ is a convex twice continuously differentiable function. When $g$ is linear, this reduces to the SOCP, which has numerous applications in engineering design, finance, and robust optimization, and includes as special cases convex quadratically constrained quadratic programs and linear programs (LP); see [1, 33] and references therein. The KKT optimality conditions for (5), which are sufficient but not necessary for optimality, are (1) and

$Ax = b, \qquad y = \nabla g(x) - A^T\zeta_d \ \text{ for some } \zeta_d \in \mathbb{R}^m.$

Choose any $d \in \mathbb{R}^n$ satisfying $Ad = b$. (If no such $d$ exists, then (5) has no feasible solution.) Let $B \in \mathbb{R}^{n\times(n-m)}$ be any matrix whose columns span the null space of $A$. Then $x$ satisfies $Ax = b$ if and only if $x = d + B\zeta_p$ for some $\zeta_p \in \mathbb{R}^{n-m}$. Thus, the KKT optimality conditions can be written in the form of (1) and (2) with $\zeta := (\zeta_p, \zeta_d)$,

$F(\zeta) := d + B\zeta_p, \qquad G(\zeta) := \nabla g(F(\zeta)) - A^T\zeta_d.$  (6)
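To make (6) concrete, here is a small NumPy sketch with made-up data (ours, not from the paper): $d$ is one particular solution of $Ad = b$ and the columns of $B$, taken from the SVD of $A$, span the null space of $A$, so every $x = d + B\zeta_p$ is feasible.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))   # full row rank almost surely (m = 3, n = 6)
b = rng.standard_normal(3)

d = np.linalg.lstsq(A, b, rcond=None)[0]   # one particular solution of A d = b

# columns of B span the null space of A (last n - m right singular vectors)
_, _, Vt = np.linalg.svd(A)
B = Vt[3:].T                               # shape (6, 3)

zeta_p = rng.standard_normal(3)
x = d + B @ zeta_p
assert np.allclose(A @ B, 0.0)             # B spans null(A)
assert np.allclose(A @ x, b)               # x = d + B*zeta_p satisfies A x = b
```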
Alternatively, since any $\zeta \in \mathbb{R}^n$ can be decomposed into the sum of its orthogonal projections onto the column space of $A^T$ and the null space of $A$,

$F(\zeta) := d + (I - A^T(AA^T)^{-1}A)\zeta, \qquad G(\zeta) := \nabla g(F(\zeta)) - A^T(AA^T)^{-1}A\zeta$  (7)

can also be used in place of (6). For large problems where $A$ is sparse, (7) has the advantage that the main cost of evaluating the Jacobians $\nabla F$ and $\nabla G$ lies in inverting $AA^T$, which can be done efficiently via sparse Cholesky factorization. In contrast, (6) entails multiplication by the matrix $B$, which can be dense.

Various methods have been proposed for solving the CSOCP and the SOCCP. They include interior-point methods [2, 3, 33, 36, 37, 42, 52], reformulating SOC constraints as smooth convex constraints [4], non-interior smoothing Newton methods [6, 19],

and smoothing-regularization methods [22]. These methods require solving a nontrivial system of linear equations at each iteration. For the case where $G \equiv I$ and $F$ is affine with $\nabla F$ strictly $K$-copositive, a matrix splitting method has been proposed [21].

In this paper, we study an alternative approach based on reformulating the CSOCP and the SOCCP as an unconstrained smooth minimization problem. In particular, we aim to find a smooth function $\psi:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}_+$ such that

$\psi(x, y) = 0 \iff (x, y) \text{ satisfies } (1).$  (8)

We call such a $\psi$ a merit function. Then the SOCCP can be expressed as an unconstrained smooth (global) minimization problem:

$\min_{\zeta\in\mathbb{R}^n} f(\zeta) := \psi(F(\zeta), G(\zeta)).$  (9)

Various gradient methods, such as conjugate gradient methods and limited-memory quasi-Newton methods [5, 18, 38], can now be applied to solve (9). They have the advantage of requiring less work per iteration than interior-point methods and non-interior Newton methods. This approach can also be combined with smoothing and nonsmooth Newton methods to improve the efficiency and robustness of the latter, as was done in the case of the NCP [7, 8, 12, 17, 24, 27, 30].

For this approach to be effective, the choice of $\psi$ is crucial. In the case of the NCP, corresponding to (4) and $K = \mathbb{R}^n_+$, a popular choice is

$\psi(x, y) = \frac{1}{2}\sum_{i=1}^n \phi(x_i, y_i)^2 \quad \text{for all } x = (x_1, \dots, x_n)^T, \ y = (y_1, \dots, y_n)^T \in \mathbb{R}^n,$

where $\phi$ is the well-known Fischer-Burmeister (FB) NCP-function [16, 17] defined by

$\phi(x_i, y_i) = \sqrt{x_i^2 + y_i^2} - x_i - y_i.$

It has been shown that $\psi$ is smooth (even though $\phi$ is not differentiable) and satisfies (8) [10, 25, 26]. Moreover, when $F$ is monotone or, more generally, a $P_0$-function, every stationary point of $\zeta \mapsto \psi(F(\zeta), \zeta)$ is a solution of the NCP [10, 20]. This is an important property since (i) gradient methods are guaranteed to find stationary points only, and (ii) when an LP is reformulated as an NCP, the resulting $F$ is monotone, but neither strongly monotone nor a uniformly $P$-function.
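As a quick editorial sanity check of the scalar FB function and the merit function $\psi$ above (helper names are ours):

```python
import numpy as np

def phi_fb(a, b):
    """Scalar Fischer-Burmeister NCP-function: sqrt(a^2 + b^2) - a - b."""
    return np.sqrt(a * a + b * b) - a - b

def psi(x, y):
    """NCP merit function psi(x, y) = (1/2) * sum_i phi(x_i, y_i)^2."""
    return 0.5 * np.sum(phi_fb(x, y) ** 2)

# phi(a, b) = 0 exactly on the complementarity set a >= 0, b >= 0, a*b = 0
assert phi_fb(0.0, 3.0) == 0.0
assert phi_fb(2.0, 0.0) == 0.0
assert psi(np.array([1.0, 0.0]), np.array([0.0, 2.0])) == 0.0
assert psi(np.array([1.0]), np.array([1.0])) > 0.0   # infeasible pair penalized
```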
In contrast, other smooth merit functions for the NCP, such as the implicit Lagrangian and the D-gap function [8, 35, 40, 45, 51, 54], require $F$ to be a uniformly $P$-function in order for stationary points to be solutions of the NCP. Thus these other merit functions cannot be used for LP. Subsequently, a number of variants of $\psi$ with additional desirable properties have been proposed, e.g., [6, 10, 29, 31, 34, 41, 47, 49, 53]. A recent discussion of these variants can be found in the paper [47].

Moreover, the above merit function $\psi$, as well as a related merit function of Yamashita and Fukushima [53], have been extended to the semidefinite complementarity problem (SDCP), which has the form (1), (2), but with $x, y$ being $q\times q$ ($q \ge 1$) real symmetric block-diagonal matrices of fixed block sizes, $\langle\cdot,\cdot\rangle$ being the trace inner product, and $K$ being the cone of $q\times q$ block-diagonal positive semidefinite matrices of fixed block

sizes [50, 53]. However, the analysis in [50] showed $\psi$ to be differentiable, but did not show it to be smooth.¹

Can the above merit functions for the NCP be extended to the SOCCP? To our knowledge, this question has not been studied previously. We study it in this paper. We are motivated by previous work on extending merit functions from the NCP to the SDCP [50, 53]. We are further motivated by a recent work [19] showing that the FB function extends from the NCP to the SOCCP using the Jordan product associated with the SOC [11]. Nice properties of the FB function, such as strong semismoothness, are preserved when extended to the SOCCP [48].

More specifically, for any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$, we define their Jordan product associated with $K^n$ as

$x \circ y := (\langle x, y\rangle,\ y_1 x_2 + x_1 y_2).$  (10)

The identity element under this product is $e := (1, 0, \dots, 0)^T \in \mathbb{R}^n$. We write $x^2$ to mean $x \circ x$ and write $x + y$ to mean the usual componentwise addition of vectors. It is known that $x^2 \in K^n$ for all $x \in \mathbb{R}^n$. Moreover, if $x \in K^n$, then there exists a unique vector in $K^n$, denoted by $x^{1/2}$, such that $(x^{1/2})^2 = x^{1/2}\circ x^{1/2} = x$. Then,

$\phi(x, y) := (x^2 + y^2)^{1/2} - x - y$  (11)

is well defined for all $(x, y) \in \mathbb{R}^n\times\mathbb{R}^n$ and maps $\mathbb{R}^n\times\mathbb{R}^n$ to $\mathbb{R}^n$. It was shown in [19] that $\phi(x, y) = 0$ if and only if $(x, y)$ satisfies (1). Thus,

$\psi_{\mathrm{FB}}(x, y) := \frac{1}{2}\sum_{i=1}^N \|\phi(x_i, y_i)\|^2,$  (12)

where $x = (x_1, \dots, x_N)^T,\ y = (y_1, \dots, y_N)^T \in \mathbb{R}^{n_1}\times\cdots\times\mathbb{R}^{n_N}$, is a merit function for the SOCCP. We will show that, like the NCP case, $\psi_{\mathrm{FB}}$ is smooth and, when $\nabla F$ and $-\nabla G$ are column monotone, every stationary point of (9) solves the SOCCP; see Propositions 2 and 3. The same holds for the following analog of the SDCP merit function studied by Yamashita and Fukushima [53]:

$\psi_{\mathrm{YF}}(x, y) := \psi_0(\langle x, y\rangle) + \psi_{\mathrm{FB}}(x, y),$  (13)

where $\psi_0:\mathbb{R}\to[0,\infty)$ is any smooth function satisfying

$\psi_0(t) = 0 \ \forall t \le 0 \quad\text{and}\quad \psi_0'(t) > 0 \ \forall t > 0;$  (14)

see Proposition 4. In [53], $\psi_0(t) = \frac{1}{4}\max\{0, t\}^4$ was considered.
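The Jordan product (10), the square root $x^{1/2}$, and the SOC FB function (11) can be prototyped directly from the spectral factorization recalled in Section 2 below; the following NumPy sketch is an editorial illustration of the definitions, with helper names of our choosing.

```python
import numpy as np

def jordan(x, y):
    """Jordan product (10): x o y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def soc_sqrt(x):
    """Square root of x in K^n via the spectral factorization (Property 1(b))."""
    x2 = x[1:]
    nx2 = np.linalg.norm(x2)
    lam1, lam2 = x[0] - nx2, x[0] + nx2
    w = x2 / nx2 if nx2 > 0 else np.r_[1.0, np.zeros(len(x2) - 1)]
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0], w))
    return np.sqrt(lam1) * u1 + np.sqrt(lam2) * u2

def phi_soc(x, y):
    """SOC Fischer-Burmeister function (11): (x^2 + y^2)^{1/2} - x - y."""
    return soc_sqrt(jordan(x, x) + jordan(y, y)) - x - y

x = np.array([2.0, 1.0, 0.5])
assert np.allclose(jordan(soc_sqrt(x), soc_sqrt(x)), x)   # (x^{1/2})^2 = x
# complementary pair on the boundary of K^3: <x, y> = 0, both in K^3
x = np.array([1.0, 1.0, 0.0]); y = np.array([1.0, -1.0, 0.0])
assert np.allclose(phi_soc(x, y), 0.0)
```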
Analogous to the NCP and SDCP cases, when $\nabla G(\zeta)$ is invertible, a $\nabla F$-free descent direction for

$f_{\mathrm{FB}}(\zeta) := \psi_{\mathrm{FB}}(F(\zeta), G(\zeta))$  (15)

and

$f_{\mathrm{YF}}(\zeta) := \psi_{\mathrm{YF}}(F(\zeta), G(\zeta))$  (16)

¹ During the revising of this paper, a proof of smoothness was reported in [43].

can be found. The function $f_{\mathrm{YF}}$, compared to $f_{\mathrm{FB}}$, has additional bounded level-set and error bound properties; see Section 5. Our proof of the smoothness of $\psi_{\mathrm{FB}}$ in Section 3 is quite technical, but further simplification seems difficult. In particular, neither general properties of the Jordan product associated with symmetric cones [11] nor the strong semismoothness proof for $\phi$ given in [48] lend themselves readily to a smoothness proof for $\psi_{\mathrm{FB}}$.

In Section 6, we report our numerical experience with solving SOCPs (5) from the DIMACS library by using a limited-memory BFGS (L-BFGS) method to minimize $f_{\mathrm{FB}}$, with $F$ and $G$ given by (7). On problems with $n \gg m$ and for low-to-medium solution accuracy, L-BFGS appears to be competitive with interior-point methods. We also report our experience with solving the CSOCP using a BFGS method to minimize $f_{\mathrm{FB}}$.

It is known that the SOCCP can be reduced to an SDCP by observing that, for any $x = (x_1, x_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$, we have $x \in K^n$ if and only if

$L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix}$

is positive semidefinite (also see [19, p. 437] and [44]). However, this reduction increases the problem dimension from $n$ to $n(n+1)/2$ and it is not known whether this increase can be mitigated by exploiting the special "arrow" structure of $L_x$.

Throughout this paper, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors and $^T$ denotes transpose. For any differentiable function $f:\mathbb{R}^n\to\mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For any differentiable mapping $F = (F_1, \dots, F_m)^T:\mathbb{R}^n\to\mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \cdots \nabla F_m(x)]$ denotes the transposed Jacobian of $F$ at $x$. For any symmetric matrices $A, B \in \mathbb{R}^{n\times n}$, we write $A \succeq B$ (respectively, $A \succ B$) to mean $A - B$ is positive semidefinite (respectively, positive definite). For nonnegative scalars $\alpha$ and $\beta$, we write $\alpha = O(\beta)$ to mean $\alpha \le C\beta$, with $C$ independent of $\alpha$ and $\beta$.
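The SOCCP-to-SDCP reduction above rests on the equivalence $x \in K^n \iff L_x \succeq 0$, together with the identity $L_x y = x \circ y$ of Property 1(d) below; a small numerical illustration (our sketch, not from the paper):

```python
import numpy as np

def arrow(x):
    """Arrow matrix L_x = [[x1, x2^T], [x2, x1*I]]."""
    n = len(x)
    L = x[0] * np.eye(n)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

x = np.array([2.0, 1.0, -0.5])          # in K^3: ||(1, -0.5)|| < 2
y = np.array([0.3, -1.0, 0.7])
x_dot_y = np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))
assert np.allclose(arrow(x) @ y, x_dot_y)          # L_x y equals the Jordan product
assert np.min(np.linalg.eigvalsh(arrow(x))) >= 0   # x in K^3  <=>  L_x is PSD
```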
2. Jordan product and spectral factorization

It is known that $K^n$ is a closed convex self-dual cone with nonempty interior given by

$\mathrm{int}(K^n) = \{(x_1, x_2) \in \mathbb{R}\times\mathbb{R}^{n-1} \mid \|x_2\| < x_1\}.$

The Jordan product (10), unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of the SOCCP. For any $x = (x_1, x_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$, its determinant is defined by

$\det(x) := x_1^2 - \|x_2\|^2.$

In general, $\det(x \circ y) \ne \det(x)\det(y)$ unless $x = y$. We next recall from [19] that each $x = (x_1, x_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$ admits a spectral factorization, associated with $K^n$, of the form

$x = \lambda_1 u^{(1)} + \lambda_2 u^{(2)},$

where $\lambda_1, \lambda_2$ and $u^{(1)}, u^{(2)}$ are the spectral values and the associated spectral vectors of $x$, given by

$\lambda_i = x_1 + (-1)^i\|x_2\|, \qquad u^{(i)} = \begin{cases}\frac{1}{2}\left(1,\ (-1)^i\frac{x_2}{\|x_2\|}\right) & \text{if } x_2 \ne 0;\\ \frac{1}{2}\left(1,\ (-1)^i w\right) & \text{if } x_2 = 0,\end{cases}$

for $i = 1, 2$, with $w$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|w\| = 1$. If $x_2 \ne 0$, the factorization is unique.

The above spectral factorization of $x$, as well as $x^2$ and $x^{1/2}$ and the matrix $L_x$, have various interesting properties; see [19]. We list four properties that we will use later.

Property 1. For any $x = (x_1, x_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$, with spectral values $\lambda_1, \lambda_2$ and spectral vectors $u^{(1)}, u^{(2)}$, the following results hold.

(a) $x^2 = \lambda_1^2 u^{(1)} + \lambda_2^2 u^{(2)} \in K^n$.

(b) If $x \in K^n$, then $0 \le \lambda_1 \le \lambda_2$ and $x^{1/2} = \sqrt{\lambda_1}\, u^{(1)} + \sqrt{\lambda_2}\, u^{(2)}$.

(c) If $x \in \mathrm{int}(K^n)$, then $0 < \lambda_1 \le \lambda_2$, $\det(x) = \lambda_1\lambda_2$, and $L_x$ is invertible with

$L_x^{-1} = \dfrac{1}{\det(x)}\begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}.$

(d) $x \circ y = L_x y$ for all $y \in \mathbb{R}^n$, and $L_x \succ 0$ if and only if $x \in \mathrm{int}(K^n)$.

3. Smoothness property of merit functions

In this section we show that the functions (12) and (13) are smooth functions satisfying (8). For simplicity, we focus on the special case of $N = 1$, i.e.,

$\psi_{\mathrm{FB}}(x, y) = \frac{1}{2}\|\phi(x, y)\|^2$  (17)

in this and the next two sections. Extension of our analyses to the general case of $N \ge 1$ is straightforward.

We begin with the following result from [19] showing that the FB function $\phi$ given by (11) has properties analogous to the NCP and SDCP cases. Additional properties of $\phi$ are studied in [19, 48].

Lemma 1. [19, Proposition 2.1] Let $\phi:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^n$ be given by (11). Then

$\phi(x, y) = 0 \iff x, y \in K^n,\ x \circ y = 0 \iff x, y \in K^n,\ \langle x, y\rangle = 0.$

Since $x^2 + y^2 \in K^n$ for any $x, y \in \mathbb{R}^n$, we have that

$x^2 + y^2 = (\|x\|^2 + \|y\|^2,\ 2(x_1 x_2 + y_1 y_2)) \in K^n.$

Thus

$x^2 + y^2 \notin \mathrm{int}(K^n) \iff \|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|.$  (18)

The spectral values of $x^2 + y^2$ are

$\lambda_1 := \|x\|^2 + \|y\|^2 - 2\|x_1 x_2 + y_1 y_2\|, \qquad \lambda_2 := \|x\|^2 + \|y\|^2 + 2\|x_1 x_2 + y_1 y_2\|.$  (19)
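The closed form for $L_x^{-1}$ in Property 1(c) can be verified numerically; the sketch below is ours and assumes $x \in \mathrm{int}(K^n)$.

```python
import numpy as np

def arrow(x):
    """Arrow matrix L_x = [[x1, x2^T], [x2, x1*I]]."""
    n = len(x)
    L = x[0] * np.eye(n)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def arrow_inv(x):
    """Closed form for L_x^{-1} from Property 1(c), valid for x in int(K^n)."""
    x1, x2 = x[0], x[1:]
    det = x1 ** 2 - x2 @ x2
    top = np.concatenate(([x1], -x2))
    bottom = np.hstack((-x2[:, None],
                        (det / x1) * np.eye(len(x2)) + np.outer(x2, x2) / x1))
    return np.vstack((top, bottom)) / det

x = np.array([3.0, 1.0, -2.0])     # int(K^3): ||(1, -2)|| = sqrt(5) < 3
assert np.allclose(arrow_inv(x) @ arrow(x), np.eye(3))
```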

Then, by Property 1(b), $z := (x^2 + y^2)^{1/2}$ has the spectral values $\sqrt{\lambda_1}, \sqrt{\lambda_2}$ and

$z = (z_1, z_2) = \left(\frac{\sqrt{\lambda_1} + \sqrt{\lambda_2}}{2},\ \frac{\sqrt{\lambda_2} - \sqrt{\lambda_1}}{2}\, w\right),$  (20)

where $w := \dfrac{x_1 x_2 + y_1 y_2}{\|x_1 x_2 + y_1 y_2\|}$ if $x_1 x_2 + y_1 y_2 \ne 0$ and otherwise $w$ is any vector in $\mathbb{R}^{n-1}$ satisfying $\|w\| = 1$.

The next key lemma, describing special properties of $x, y$ with $x^2 + y^2 \notin \mathrm{int}(K^n)$, will be used to prove Propositions 1, 2, and Lemma 6.

Lemma 2. For any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$ with $x^2 + y^2 \notin \mathrm{int}(K^n)$, we have

$x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2.$

Proof. By (18), $\|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|$. Squaring both sides yields

$(\|x\|^2 + \|y\|^2)^2 = 4\|x_1 x_2 + y_1 y_2\|^2 = 4(x_1 x_2 + y_1 y_2)^T(x_1 x_2 + y_1 y_2).$

Notice that $\|x\|^2 = x_1^2 + \|x_2\|^2$ and $\|y\|^2 = y_1^2 + \|y_2\|^2$. Thus,

$(x_1^2 + \|x_2\|^2)^2 + 2(x_1^2 + \|x_2\|^2)(y_1^2 + \|y_2\|^2) + (y_1^2 + \|y_2\|^2)^2 = 4x_1^2\|x_2\|^2 + 8x_1 y_1 x_2^T y_2 + 4y_1^2\|y_2\|^2.$

Simplifying the above expression yields

$(x_1^2 - \|x_2\|^2)^2 + (y_1^2 - \|y_2\|^2)^2 + 2\big((x_1^2 + \|x_2\|^2)(y_1^2 + \|y_2\|^2) - 4x_1 y_1 x_2^T y_2\big) = 0.$

The first two terms are nonnegative. The third term is also nonnegative because

$(x_1^2 + \|x_2\|^2)(y_1^2 + \|y_2\|^2) \ge (2|x_1|\,\|x_2\|)(2|y_1|\,\|y_2\|) = 4|x_1 y_1|\,\|x_2\|\,\|y_2\| \ge 4x_1 y_1\, x_2^T y_2.$

Hence

$x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad (x_1^2 + \|x_2\|^2)(y_1^2 + \|y_2\|^2) - 4x_1 y_1 x_2^T y_2 = 0.$

Substituting $x_1^2 = \|x_2\|^2$ and $y_1^2 = \|y_2\|^2$ into the last equation, the resulting three equations imply $x_1 y_1 = x_2^T y_2$.

It remains to prove that $x_1 y_2 = y_1 x_2$. If $x_1 = 0$, then $\|x_2\|^2 = x_1^2 = 0$, so this relation is true. Symmetrically, if $y_1 = 0$, then this relation is also true. Suppose that $x_1 \ne 0$ and $y_1 \ne 0$. Then $x_2 \ne 0$, $y_2 \ne 0$, and

$x_1 y_1 = x_2^T y_2 = \|x_2\|\,\|y_2\|\cos\theta = |x_1|\,|y_1|\cos\theta,$

where $\theta$ is the angle between $x_2$ and $y_2$. Thus $\cos\theta \in \{-1, 1\}$, i.e., $y_2 = \alpha x_2$ for some $\alpha \ne 0$. Then

$x_1 y_1 = x_2^T y_2 = \alpha\|x_2\|^2 = \alpha x_1^2,$

so that $y_1/x_1 = \alpha$. Thus $y_2 = x_2\, y_1/x_1$. ⊓⊔

The next technical lemma shows that two squared terms are upper bounded by a quantity that measures how close $x^2 + y^2$ comes to the boundary of $K^n$ (cf. (18)). This lemma will be used to prove Lemma 4 and Proposition 2.

Lemma 3. For any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R}\times\mathbb{R}^{n-1}$ with $x_1 x_2 + y_1 y_2 \ne 0$, we have

$\left(x_1 - \frac{(x_1 x_2 + y_1 y_2)^T x_2}{\|x_1 x_2 + y_1 y_2\|}\right)^2 \le \left\|x_2 - \frac{x_1 x_2 + y_1 y_2}{\|x_1 x_2 + y_1 y_2\|}\, x_1\right\|^2 \le \|x\|^2 + \|y\|^2 - 2\|x_1 x_2 + y_1 y_2\|.$

Proof. The first inequality can be seen by expanding the square on both sides and using the Cauchy-Schwarz inequality. It remains to prove the second inequality. Write $w := (x_1 x_2 + y_1 y_2)/\|x_1 x_2 + y_1 y_2\|$. Let us multiply both sides of this inequality by $\|x_1 x_2 + y_1 y_2\|$ and note that

$\|x_1 x_2 + y_1 y_2\|^2 = x_1^2\|x_2\|^2 + 2x_1 y_1 x_2^T y_2 + y_1^2\|y_2\|^2 = x_1 (x_1 x_2 + y_1 y_2)^T x_2 + y_1 (x_1 x_2 + y_1 y_2)^T y_2.$

Let $L$ and $R$ denote, respectively, the resulting left-hand side and right-hand side. Since $x_1 x_2 + y_1 y_2 \ne 0$, the second inequality is equivalent to $R - L \ge 0$. We have

$L = \big(\|x_2\|^2 - 2x_1 w^T x_2 + x_1^2\big)\,\|x_1 x_2 + y_1 y_2\|,$

and

$R = \big(\|x\|^2 + \|y\|^2 - 2\|x_1 x_2 + y_1 y_2\|\big)\,\|x_1 x_2 + y_1 y_2\| = \big(x_1^2 + \|x_2\|^2 + y_1^2 + \|y_2\|^2 - 2\|x_1 x_2 + y_1 y_2\|\big)\,\|x_1 x_2 + y_1 y_2\|.$

Thus, taking the difference and using the identity for $\|x_1 x_2 + y_1 y_2\|^2$ displayed above yields

$R - L = \big(y_1^2 + \|y_2\|^2\big)\,\|x_1 x_2 + y_1 y_2\| - 2\|x_1 x_2 + y_1 y_2\|^2 + 2x_1 (x_1 x_2 + y_1 y_2)^T x_2 = \big(y_1^2 + \|y_2\|^2\big)\,\|x_1 x_2 + y_1 y_2\| - 2y_1 (x_1 x_2 + y_1 y_2)^T y_2.$

By the Cauchy-Schwarz inequality, $2y_1 (x_1 x_2 + y_1 y_2)^T y_2 \le 2|y_1|\,\|y_2\|\,\|x_1 x_2 + y_1 y_2\|$, so

$R - L \ge \big(|y_1| - \|y_2\|\big)^2\,\|x_1 x_2 + y_1 y_2\| \ge 0.$ ⊓⊔

Using Lemmas 1, 2, 3, and [19, Proposition 5.2], we prove our first main result showing that $\psi_{\mathrm{FB}}$ is differentiable and its gradient has a computable formula.

Proposition 1. Let $\phi$ be given by (11). Then $\psi_{\mathrm{FB}}$ given by (17) has the following properties.

(a) $\psi_{\mathrm{FB}}:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}_+$ and satisfies (8).

(b) $\psi_{\mathrm{FB}}$ is differentiable at every $(x, y) \in \mathbb{R}^n\times\mathbb{R}^n$. Moreover, $\nabla_x\psi_{\mathrm{FB}}(0, 0) = \nabla_y\psi_{\mathrm{FB}}(0, 0) = 0$. If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \mathrm{int}(K^n)$, then

$\nabla_x\psi_{\mathrm{FB}}(x, y) = \big(L_x L^{-1}_{(x^2+y^2)^{1/2}} - I\big)\,\phi(x, y), \qquad \nabla_y\psi_{\mathrm{FB}}(x, y) = \big(L_y L^{-1}_{(x^2+y^2)^{1/2}} - I\big)\,\phi(x, y).$  (21)

If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \mathrm{int}(K^n)$, then $x_1^2 + y_1^2 \ne 0$ and

$\nabla_x\psi_{\mathrm{FB}}(x, y) = \left(\frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1\right)\phi(x, y),$  (22)

$\nabla_y\psi_{\mathrm{FB}}(x, y) = \left(\frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1\right)\phi(x, y).$  (23)

Proof. (a) This follows from Lemma 1.

(b) Case 1: $x = y = 0$. For any $h, k \in \mathbb{R}^n$, let $\mu_1 \le \mu_2$ be the spectral values and let $v^{(1)}, v^{(2)}$ be the corresponding spectral vectors of $h^2 + k^2$. Then, by Property 1(b),

$\|(h^2 + k^2)^{1/2} - h - k\| = \|\sqrt{\mu_1}\,v^{(1)} + \sqrt{\mu_2}\,v^{(2)} - h - k\| \le \|\sqrt{\mu_1}\,v^{(1)} + \sqrt{\mu_2}\,v^{(2)}\| + \|h + k\| = \sqrt{(\mu_1 + \mu_2)/2} + \|h + k\|.$

Also,

$\mu_1 \le \mu_2 = \|h\|^2 + \|k\|^2 + 2\|h_1 h_2 + k_1 k_2\| \le \|h\|^2 + \|k\|^2 + 2(|h_1|\,\|h_2\| + |k_1|\,\|k_2\|) \le 2(\|h\|^2 + \|k\|^2).$

Combining the above two inequalities yields

$\psi_{\mathrm{FB}}(h, k) - \psi_{\mathrm{FB}}(0, 0) = \frac{1}{2}\|(h^2 + k^2)^{1/2} - h - k\|^2 \le \frac{1}{2}\big(\sqrt{(\mu_1 + \mu_2)/2} + \|h + k\|\big)^2 \le \frac{1}{2}\big(\sqrt{\|h\|^2 + \|k\|^2} + \|h\| + \|k\|\big)^2 = O(\|h\|^2 + \|k\|^2).$

This shows that $\psi_{\mathrm{FB}}$ is differentiable at $(0, 0)$ with $\nabla_x\psi_{\mathrm{FB}}(0, 0) = \nabla_y\psi_{\mathrm{FB}}(0, 0) = 0$.

Case 2: $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \mathrm{int}(K^n)$. Since $x^2 + y^2 \in \mathrm{int}(K^n)$, Proposition 5.2 of [19] implies that $\phi$ is continuously differentiable at $(x, y)$. Since $\psi_{\mathrm{FB}}$ is the composition of $x \mapsto \frac{1}{2}\|x\|^2$ with $\phi$, $\psi_{\mathrm{FB}}$ is continuously differentiable at $(x, y)$. The expressions (21) for $\nabla_x\psi_{\mathrm{FB}}(x, y)$ and $\nabla_y\psi_{\mathrm{FB}}(x, y)$ follow from the chain rule for differentiation and the expression for the Jacobian of $\phi$ given in [19, Proposition 5.2] (also see [19, Corollary 5.4]).

Case 3: $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \mathrm{int}(K^n)$. By (18), $\|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|$. Since $(x, y) \ne (0, 0)$, this also implies $x_1 x_2 + y_1 y_2 \ne 0$, so Lemmas 2 and 3 are applicable. By (20),

$(x^2 + y^2)^{1/2} = \left(\frac{\sqrt{\lambda_1} + \sqrt{\lambda_2}}{2},\ \frac{\sqrt{\lambda_2} - \sqrt{\lambda_1}}{2}\, w\right),$

where $\lambda_1, \lambda_2$ are given by (19) and $w := (x_1 x_2 + y_1 y_2)/\|x_1 x_2 + y_1 y_2\|$. Thus $\lambda_1 = 0$ and $\lambda_2 > 0$. Since $x_1 x_2 + y_1 y_2 \ne 0$, we have $x'_1 x'_2 + y'_1 y'_2 \ne 0$ for all $(x', y') \in \mathbb{R}^n\times\mathbb{R}^n$ sufficiently near to $(x, y)$. Moreover,

$\psi_{\mathrm{FB}}(x', y') = \frac{1}{2}\|(x'^2 + y'^2)^{1/2} - x' - y'\|^2 = \frac{1}{2}\|(x'^2 + y'^2)^{1/2}\|^2 + \frac{1}{2}\|x' + y'\|^2 - \langle (x'^2 + y'^2)^{1/2}, x' + y'\rangle = \frac{1}{2}\big(\|x'\|^2 + \|y'\|^2 + \|x' + y'\|^2\big) - \langle (x'^2 + y'^2)^{1/2}, x' + y'\rangle,$

where the third equality uses the observation that $\|z\|^2 = \langle z^2, e\rangle$ for any $z \in \mathbb{R}^n$. Since $\frac{1}{2}(\|x'\|^2 + \|y'\|^2 + \|x' + y'\|^2)$ is clearly differentiable in $(x', y')$, it suffices to show that

$\langle (x'^2 + y'^2)^{1/2}, x' + y'\rangle = \frac{\sqrt{\mu_2} + \sqrt{\mu_1}}{2}(x'_1 + y'_1) + \frac{\sqrt{\mu_2} - \sqrt{\mu_1}}{2}\cdot\frac{(x'_1 x'_2 + y'_1 y'_2)^T (x'_2 + y'_2)}{\|x'_1 x'_2 + y'_1 y'_2\|} = \frac{\sqrt{\mu_2}}{2}\left(x'_1 + y'_1 + \frac{(x'_1 x'_2 + y'_1 y'_2)^T (x'_2 + y'_2)}{\|x'_1 x'_2 + y'_1 y'_2\|}\right) + \frac{\sqrt{\mu_1}}{2}\left(x'_1 + y'_1 - \frac{(x'_1 x'_2 + y'_1 y'_2)^T (x'_2 + y'_2)}{\|x'_1 x'_2 + y'_1 y'_2\|}\right)$  (24)

is differentiable at $(x', y') = (x, y)$, where $\mu_1, \mu_2$ are the spectral values of $x'^2 + y'^2$, i.e., $\mu_i = \|x'\|^2 + \|y'\|^2 + (-1)^i\, 2\|x'_1 x'_2 + y'_1 y'_2\|$. Since $\lambda_2 > 0$, we see that the first term on the right-hand side of (24) is differentiable at $(x', y') = (x, y)$. We claim that the second term on the right-hand side of (24) is $o(\|h\| + \|k\|)$ with $h := x' - x$, $k := y' - y$, i.e., it is differentiable with zero gradient. To see this, notice that $x_1 x_2 + y_1 y_2 \ne 0$, so that $\mu_1 = \|x'\|^2 + \|y'\|^2 - 2\|x'_1 x'_2 + y'_1 y'_2\|$, viewed as a function of $(x', y')$, is differentiable at $(x', y') = (x, y)$. Moreover, $\mu_1 = \lambda_1 = 0$ when $(x', y') = (x, y)$. Thus, a first-order Taylor expansion of $\mu_1$ at $(x, y)$ yields

$\mu_1 = O(\|x' - x\| + \|y' - y\|) = O(\|h\| + \|k\|).$

Also, since $x_1 x_2 + y_1 y_2 \ne 0$, by the product and quotient rules for differentiation, the function

$x'_1 + y'_1 - \frac{(x'_1 x'_2 + y'_1 y'_2)^T (x'_2 + y'_2)}{\|x'_1 x'_2 + y'_1 y'_2\|}$  (25)

is differentiable at $(x', y') = (x, y)$. Moreover, the function (25) has value 0 at $(x', y') = (x, y)$. This is because

$x_1 + y_1 - \frac{(x_1 x_2 + y_1 y_2)^T (x_2 + y_2)}{\|x_1 x_2 + y_1 y_2\|} = (x_1 - w^T x_2) + (y_1 - w^T y_2) = 0 + 0,$

where $w := (x_1 x_2 + y_1 y_2)/\|x_1 x_2 + y_1 y_2\|$ and the last equality uses the fact that, by Lemma 3 and $\|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|$, we have $w^T x_2 = x_1$ and $w^T y_2 = y_1$. (By symmetry, Lemma 3 still holds when $x$ and $y$ are switched.) Thus, the function (25) is $O(\|h\| + \|k\|)$ in magnitude. This together with $\mu_1 = O(\|h\| + \|k\|)$ shows that the second term on the right of (24) is $O((\|h\| + \|k\|)^{3/2}) = o(\|h\| + \|k\|)$. Thus, we have shown that $\psi_{\mathrm{FB}}$ is differentiable at $(x, y)$. Moreover, the preceding argument shows that $\nabla_x\psi_{\mathrm{FB}}(x, y)$ is the sum of the gradient of $\frac{1}{2}(\|x'\|^2 + \|y'\|^2 + \|x' + y'\|^2)$

and the gradient of the negative of the first term on the right of (24), evaluated at $(x', y') = (x, y)$. The gradient of $\frac{1}{2}(\|x'\|^2 + \|y'\|^2 + \|x' + y'\|^2)$ with respect to $x$, evaluated at $(x', y') = (x, y)$, is $2x + y$. Using the product and quotient rules for differentiation, the gradient of the first term on the right of (24) with respect to $x_1$, evaluated at $(x', y') = (x, y)$, works out to be

$\frac{x_1(x_1 + y_1)}{\sqrt{x_1^2 + y_1^2}} + \sqrt{x_1^2 + y_1^2},$

where the computation uses Lemma 2 and the fact that, by Lemma 3 and $\|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|$, we have $w^T x_2 = x_1$ and $w^T y_2 = y_1$. Similarly, the gradient of the first term on the right of (24) with respect to $x_2$, evaluated at $(x', y') = (x, y)$, works out to be

$\frac{x_1 + y_1}{\sqrt{x_1^2 + y_1^2}}\, x_2 + \sqrt{x_1^2 + y_1^2}\; w.$

In particular, the computation uses the fact that, by Lemma 2, we have $x_1 y_2 = y_1 x_2$ and $\|x_1 x_2 + y_1 y_2\| = x_1^2 + y_1^2$, so that $x_1 w = x_2$ and $\lambda_2 = 4(x_1^2 + y_1^2)$. Thus, combining the preceding gradient expressions yields

$\nabla_x\psi_{\mathrm{FB}}(x, y) = 2x + y - \frac{x_1 + y_1}{\sqrt{x_1^2 + y_1^2}}\, x - \sqrt{x_1^2 + y_1^2}\,(1, w).$

Using $\|x_1 x_2 + y_1 y_2\| = x_1^2 + y_1^2$ and $\lambda_2 = 4(x_1^2 + y_1^2)$, we can also write

$(x^2 + y^2)^{1/2} = \left(\sqrt{x_1^2 + y_1^2},\ \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}}\right) = \sqrt{x_1^2 + y_1^2}\,(1, w),$

so that

$\phi(x, y) = \left(\sqrt{x_1^2 + y_1^2} - x_1 - y_1,\ \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}} - x_2 - y_2\right).$  (26)

Using the fact that $x_1 y_2 = y_1 x_2$ (so that $x_1(1, w) = x$ and $x_1(x + y) = (x_1 + y_1)x$), we can rewrite the above expression for $\nabla_x\psi_{\mathrm{FB}}(x, y)$ in the form of (22). By symmetry, (23) also holds. ⊓⊔
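As an editorial check of the boundary-case formulas (22) and (23), one can compare them with central finite differences of $\psi_{\mathrm{FB}}$ at a pair $(x, y)$ with $x^2 + y^2$ on the boundary of $K^3$. All helper names below are ours; the tiny clipping in `soc_sqrt` guards against roundoff driving $\lambda_1$ slightly negative.

```python
import numpy as np

def jordan(x, y):
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def soc_sqrt(x):
    x2 = x[1:]
    nx2 = np.linalg.norm(x2)
    lam1, lam2 = max(x[0] - nx2, 0.0), x[0] + nx2   # clip tiny negative lam1
    w = x2 / nx2 if nx2 > 0 else np.r_[1.0, np.zeros(len(x2) - 1)]
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0], w))
    return np.sqrt(lam1) * u1 + np.sqrt(lam2) * u2

def psi(x, y):
    phi = soc_sqrt(jordan(x, x) + jordan(y, y)) - x - y
    return 0.5 * phi @ phi

# boundary pair: x^2 + y^2 = (2, -2, 0) lies on the boundary of K^3
x = np.array([-1.0, 1.0, 0.0]); y = np.zeros(3)
phi = soc_sqrt(jordan(x, x) + jordan(y, y)) - x - y
c = np.sqrt(x[0] ** 2 + y[0] ** 2)
gx = (x[0] / c - 1.0) * phi      # formula (22)
gy = (y[0] / c - 1.0) * phi      # formula (23)

h = 1e-5
fd_x = np.array([(psi(x + h * e, y) - psi(x - h * e, y)) / (2 * h) for e in np.eye(3)])
fd_y = np.array([(psi(x, y + h * e) - psi(x, y - h * e)) / (2 * h) for e in np.eye(3)])
assert np.allclose(fd_x, gx, atol=1e-3)
assert np.allclose(fd_y, gy, atol=1e-3)
```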

Proposition 1 gives a formula for $\nabla\psi_{\mathrm{FB}}$ when $\psi_{\mathrm{FB}}$ is given by (11) and (17). Using Lemma 3, we have the following lemma on the uniform boundedness of the matrices in (21). This will be used to prove the smoothness of $\psi_{\mathrm{FB}}$ at $(0, 0)$.

Lemma 4. There exists a scalar constant $C > 0$ such that

$\|L_x L^{-1}_{(x^2+y^2)^{1/2}}\|_F \le C, \qquad \|L_y L^{-1}_{(x^2+y^2)^{1/2}}\|_F \le C$

for all $(x, y) \ne (0, 0)$ satisfying $x^2 + y^2 \in \mathrm{int}(K^n)$. (Here $\|A\|_F$ denotes the Frobenius norm of $A \in \mathbb{R}^{n\times n}$.)

Proof. Consider any $(x, y) \ne (0, 0)$ satisfying $x^2 + y^2 \in \mathrm{int}(K^n)$. Let $\lambda_1, \lambda_2$ be the spectral values of $x^2 + y^2$ and let $z := (x^2 + y^2)^{1/2}$. Then $z$ is given by (20), i.e.,

$z_1 = \frac{\sqrt{\lambda_1} + \sqrt{\lambda_2}}{2}, \qquad z_2 = \frac{\sqrt{\lambda_2} - \sqrt{\lambda_1}}{2}\, w,$

with $\lambda_1, \lambda_2$ given by (19), and $w := (x_1 x_2 + y_1 y_2)/\|x_1 x_2 + y_1 y_2\|$ if $x_1 x_2 + y_1 y_2 \ne 0$; otherwise $w$ is any vector satisfying $\|w\| = 1$. Using Property 1(c), we have that

$L_x L_z^{-1} = \frac{1}{\det(z)}\begin{bmatrix} x_1 z_1 - x_2^T z_2 & \dfrac{\det(z)}{z_1}\, x_2^T + \Big(\dfrac{x_2^T z_2}{z_1} - x_1\Big) z_2^T \\[1ex] z_1 x_2 - x_1 z_2 & \dfrac{x_1 \det(z)}{z_1}\, I + \dfrac{x_1}{z_1}\, z_2 z_2^T - x_2 z_2^T \end{bmatrix},$  (27)

where $\det(z) = z_1^2 - \|z_2\|^2 = \sqrt{\lambda_1\lambda_2}$. Since $\lambda_2 \ge \|x\|^2$, we see that $\sqrt{\lambda_2} \ge |x_1|$ and $\sqrt{\lambda_2} \ge \|x_2\|$. Also, $\|w\| = 1$. Thus, upon substituting the above expressions for $z_1, z_2$ and $\det(z)$ into (27), all terms that involve dividing $x_1$ or $x_2$ or $x_1 w$ or $x_2^T w$ or $x_1 w w^T$ or $x_2 w^T$ by $\sqrt{\lambda_2}$ or by $\sqrt{\lambda_1} + \sqrt{\lambda_2}$ are uniformly bounded, with bound independent of $(x, y)$. Also, $\sqrt{\lambda_1}/\sqrt{\lambda_2} \le 1$.

The only remaining terms in $L_x L_z^{-1}$ are, up to uniformly bounded factors, of the form $(x_1 - x_2^T w)/(2\sqrt{\lambda_1})$ or $(x_2 - x_1 w)/(2\sqrt{\lambda_1})$; for example, the $(1,1)$ entry of (27) equals

$\frac{x_1 z_1 - x_2^T z_2}{\det(z)} = \frac{x_1 + x_2^T w}{2\sqrt{\lambda_2}} + \frac{x_1 - x_2^T w}{2\sqrt{\lambda_1}}.$

By Lemma 3, if $x_1 x_2 + y_1 y_2 \ne 0$, then

$(x_1 - x_2^T w)^2 \le \|x_2 - x_1 w\|^2 \le \lambda_1.$

If $x_1 x_2 + y_1 y_2 = 0$, then $\lambda_1 = \|x\|^2 + \|y\|^2$ so that, by choosing $w$ to further satisfy $x_2^T w = 0$ in addition to $\|w\| = 1$, we obtain

$(x_1 - x_2^T w)^2 \le \|x_2 - x_1 w\|^2 = \|x_2\|^2 + x_1^2 = \|x\|^2 \le \lambda_1.$

Thus, all terms in $L_x L_z^{-1}$ are uniformly bounded. ⊓⊔

Using Lemmas 2, 3, 4 and Proposition 1, we now prove the smoothness of $\psi_{\mathrm{FB}}$. This has been proven for the NCP case [10, 25, 26] but not for the SOCCP case. (A proof for the SDP case was only recently reported in [43].) Our proof for the SOCCP case is fairly involved due to the structure of the SOC and its associated Jordan product.

Proposition 2. Let $\phi$ be given by (11). Then $\psi_{\mathrm{FB}}$ given by (17) is smooth everywhere on $\mathbb{R}^n\times\mathbb{R}^n$.

Proof. By Proposition 1, $\psi_{\mathrm{FB}}$ is differentiable everywhere on $\mathbb{R}^n\times\mathbb{R}^n$. We will show that $\nabla\psi_{\mathrm{FB}}$ is continuous at every $(a, b) \in \mathbb{R}^n\times\mathbb{R}^n$. By the symmetry between $x$ and $y$ in $\psi_{\mathrm{FB}}$, it suffices to show that $\nabla_x\psi_{\mathrm{FB}}$ is continuous at every $(a, b) \in \mathbb{R}^n\times\mathbb{R}^n$.

Case 1: $a = b = 0$. By Proposition 1, $\nabla_x\psi_{\mathrm{FB}}(0, 0) = 0$. Thus, we need to show that $\nabla_x\psi_{\mathrm{FB}}(x, y) \to 0$ as $(x, y) \to (0, 0)$. We consider two subcases: (i) $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \mathrm{int}(K^n)$, and (ii) $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \mathrm{int}(K^n)$. In subcase (i), we have from Proposition 1 that $\nabla_x\psi_{\mathrm{FB}}(x, y)$ is given by the expression (21). By Lemma 4, $L_x L^{-1}_{(x^2+y^2)^{1/2}}$ is uniformly bounded, with bound independent of $(x, y)$. Also, $\phi$ given by (11) is continuous at $(0, 0)$, so that $\phi(x, y) \to 0$ as $(x, y) \to (0, 0)$. It follows from (21) that $\nabla_x\psi_{\mathrm{FB}}(x, y) \to 0$ as $(x, y) \to (0, 0)$ in subcase (i). In subcase (ii), we have from Proposition 1 that $\nabla_x\psi_{\mathrm{FB}}(x, y)$ is given by the expression (22). Clearly $x_1/\sqrt{x_1^2 + y_1^2}$ is uniformly bounded, with bound independent of $(x, y)$. Also, $\phi(x, y) \to 0$ as $(x, y) \to (0, 0)$.
It follows from (22) that ∇_x ψ_FB(x, y) → 0 as (x, y) → (0, 0) in subcase (ii).

Case 2: (a, b) ≠ (0, 0) and a^2 + b^2 ∈ int K^n. It was already shown in the proof of Proposition 1 that ψ_FB is continuously differentiable at (a, b).

An unconstrained smooth minimization reformulation of the second-order cone 307 Case 3: a, b 0, 0 and a + b intk n. By 18, a + b = a 1 a + b 1 b. By Proposition 1, we have a1 + b 1 > 0 and x ψ FB a, b = a 1 1 φa,b. a1 + b 1 We need to show that x ψ FB x, y x ψ FB a, b. We consider two cases: i x, y 0, 0 and x + y intk n and ii x, y 0, 0 and x + y intk n. In subcase ii, we have from Proposition 1 that x ψ FB x, y is given by the expression. This expression is continuous at a, b. Thus x ψ FB x, y x ψ FB a, b as x, y a, b in subcase ii. The remainder of our proof treats subcase i. In subcase i, we have from Proposition 1 that x ψ FB x, y is given by the expression 1, i.e., x ψ FB x, y = L x L 1 x +y 1/ I φx,y = L x L 1 x + y 1/ L x +y 1/ x L 1 x + y φx,y x +y 1/ = x L x L 1 x + y φx,y. x +y 1/ Also, by Lemma, we have a 1 a +b 1 b = 1 a + 1 b = a1 +b 1 and a 1b = b 1 a, implying that see 19, 0 a 1 a + b 1/ a 1 = a a1 + b 1 a1 + 1 + b 1, a 1a + b 1 b b 1 a1 + b 1 = a 1, a 1 a + a 1 b 1 b a1 + b 1 = a 1, a 1 a + b1 a a1 + b 1 = a 1,a = a. This together with yields x ψ FB a, b = a 1 1 φa,b a1 + b 1 = = a 1 a 1 + b 1 a 1 a + b 1/ a + b a + b 1/ a1 + b 1 = a a 1 a 1 a 1 + b 1 a + b φa,b. a1 + b 1 φa,b a + b φa,b

308 J.-S. Chen and P. Tseng Since φ is continuous, to prove x ψ FB x, y x ψ FB a, b as x, y a, b, it suffices to show that L x L 1 x a 1 a as x, y a, b, 8 x +y 1/ a1 + b 1 L x L 1 y a 1 b as x, y a, b. 9 x +y 1/ a1 + b 1 Since a + b = a 1 a + b 1 b and a, b 0, 0, then a 1 a + b 1 b 0. Thus, by taking x, y sufficiently near to a, b, we can assume that x 1 x + y 1 y 0. Let z := x + y 1/. Then z is given by 0 with λ 1,λ given by 19 and w := x 1 x + y 1 y x 1 x + y 1 y. In addition, detz = z 1 z = λ 1 λ. Let ζ 1,ζ := L x L 1 z x. Then 8 reduces to ζ 1 a 1 a1 + b 1 and ζ a 1 a as x, y a, b. 30 a1 + b 1 We prove 30 below. By Lemma, as x, y a, b, λ 1 0, λ a + b + a 1 a + b 1 b =4a 1 + b 1, z 1 Using 7, we calculate the first component of L x L 1 z x to be ζ 1 := 1 detz x1 z 1 x T z x 1 x 1 z T x + detz x + xt z z 1 z 1 1 x 1 z 1 detz z1 xt z x 1 z 1 + x T z + x 1z 1 x T z. z 1 z 1 detz = x z 1 + = x Also, using Lemma and 31, x a z 1 a1 + b 1 = a 1. a1 + b 1 Thus, to prove the first relation in 30, it suffices to show that a 1 + b 1. 31, x 1 z 1 x T z z 1 detz 0 as x, y a, b.

An unconstrained smooth minimization reformulation of the second-order cone 309 We have that x 1 z 1 x T z z 1 detz 1 = z 1 λ1 λ 1 = z 1 λ1 λ = 1 z 1 λ λ1 + λ x 1 x 1 x 1 λ1 + λ1 λ + x T w λ λ 1 x 1 x T w λ1 + x 1 λ λ 1 x 1 x T w + λ λ 1 4 λ 1 x 1 x T w. 3 We also have from 31 that λ 1 0, λ a1 + b 1 > 0, and z 1 a1 + b 1 > 0. Moreover, by Lemma 3 and w = x 1x + y 1 y x 1 x + y 1 y, x 1 x T w λ1 0 as x, y a, b. Thus the right-hand side of 3 tends to zero as x, y a, b. This proves the first relation in 30. Using 7, we calculate the last n 1 components of L x L 1 z x to be Also, by 31, ζ := 1 x 1 x z 1 x1 detz z x T z x + x 1detz x + x 1 z z T z 1 z x 1 = x 1 x + 1 x x 1 z 1 x T T z 1 detz z x + x z 1 x 1 z z 1 = x 1 x + x 1z 1 x T z x x 1 z. z 1 detz z 1 x 1 x a 1 z 1 a1 + b 1 a. Thus, to prove the second relation in 30, it suffices to show that x 1 z 1 x T z x x 1 z 0 detz z 1 asx, y a, b.

First, (x_1 z_1 − x_2^T z_2)/det z is bounded as (x, y) → (a, b) because, by (20),

  (x_1 z_1 − x_2^T z_2)/det z = [x_1 (√λ_1 + √λ_2) − (√λ_2 − √λ_1) x_2^T w] / (2 √λ_1 √λ_2)
                              = (x_1 + x_2^T w)/(2√λ_2) + (x_1 − x_2^T w)/(2√λ_1),

and the first term on the right-hand side converges to (a_1 + a_2^T w̄)/(4√(a_1^2 + b_1^2)), where w̄ := (a_1 a_2 + b_1 b_2)/‖a_1 a_2 + b_1 b_2‖ (see (31)), while the second term is bounded, by (19) and Lemma 3. Second,

  x_2 − (x_1/z_1) z_2 → 0  as (x, y) → (a, b)

because, by (20) and (31),

  x_2 − (x_1/z_1) z_2 → a_2 − a_1 (a_1 a_2 + b_1 b_2)/(a_1^2 + b_1^2) = a_2 − (a_1^2 + b_1^2) a_2/(a_1^2 + b_1^2) = a_2 − a_2 = 0,

where the second equality is due to Lemma 2, so that a_1 b_2 = b_1 a_2 and ‖a_1 a_2 + b_1 b_2‖ = a_1^2 + b_1^2. This proves the second relation in (30). Thus, we have proven (28). An analogous argument can be used to prove (29), which we omit for simplicity. This shows that ∇_x ψ_FB(x, y) → ∇_x ψ_FB(a, b) as (x, y) → (a, b) in subcase (i).

It follows from Proposition 2 that the merit function ψ_YF given by (13), with ψ_0 a smooth function, is also smooth.

4. Stationary points of merit functions for monotone SOCCP

In this section we consider the case where the SOCCP has a monotonicity property and show that every stationary point of (9) is a solution of the SOCCP. As in the previous section, we focus our analysis on the case of N = 1 for simplicity. We first need the following technical lemma from [19].

Lemma 5. [19, Proposition 3.4] For any x, y ∈ IR^n and w ∈ K^n such that w^2 − x^2 − y^2 ∈ K^n, we have L_w^2 ⪰ L_x^2 + L_y^2.

Using Lemmas 1, 2, 5 and Proposition 1, we prove the following key properties of ψ_FB. Similar properties have been proven for the case of the NCP [10, 20, 34] and the SDCP [50, 53]. However, our proof is quite different from these other proofs due to the different structures of the SOC and its associated Jordan product.

Lemma 6. Let φ be given by (11) and let ψ_FB be given by (17). For any (x, y) ∈ IR^n × IR^n, we have the following results.
(a) ⟨x, ∇_x ψ_FB(x, y)⟩ + ⟨y, ∇_y ψ_FB(x, y)⟩ = ‖φ(x, y)‖^2. (33)
(b) ⟨∇_x ψ_FB(x, y), ∇_y ψ_FB(x, y)⟩ ≥ 0, (34)
with equality holding if and only if φ(x, y) = 0.

Proof. Case 1: x = y = 0. By Proposition 1, ∇_x ψ_FB(x, y) = ∇_y ψ_FB(x, y) = 0, so the result holds.

Case 2: (x, y) ≠ (0, 0) and x^2 + y^2 ∈ int K^n. By Proposition 1, we have

  ∇_x ψ_FB(x, y) = (L_x L_z^{-1} − I) φ(x, y),  ∇_y ψ_FB(x, y) = (L_y L_z^{-1} − I) φ(x, y),

where we let z := (x^2 + y^2)^{1/2}. For simplicity, we will write φ(x, y) as φ. Thus,

  ⟨x, ∇_x ψ_FB(x, y)⟩ + ⟨y, ∇_y ψ_FB(x, y)⟩
    = ⟨x, (L_x L_z^{-1} − I)φ⟩ + ⟨y, (L_y L_z^{-1} − I)φ⟩
    = ⟨(L_z^{-1} L_x − I)x, φ⟩ + ⟨(L_z^{-1} L_y − I)y, φ⟩
    = ⟨L_z^{-1}(L_x x + L_y y) − x − y, φ⟩
    = ⟨L_z^{-1}(x^2 + y^2) − x − y, φ⟩
    = ⟨L_z^{-1} z^2 − x − y, φ⟩
    = ⟨z − x − y, φ⟩
    = ‖φ‖^2,

where the next-to-last equality follows from L_z z = z^2, so that L_z^{-1} z^2 = z. This proves (33). Similarly,

  ⟨∇_x ψ_FB(x, y), ∇_y ψ_FB(x, y)⟩ = ⟨(L_x L_z^{-1} − I)φ, (L_y L_z^{-1} − I)φ⟩
    = ⟨(L_x − L_z) L_z^{-1} φ, (L_y − L_z) L_z^{-1} φ⟩
    = ⟨(L_y − L_z)(L_x − L_z) L_z^{-1} φ, L_z^{-1} φ⟩. (35)

Let S be the symmetric part of (L_y − L_z)(L_x − L_z). Then

  S = (1/2) [(L_y − L_z)(L_x − L_z) + (L_x − L_z)(L_y − L_z)]
    = (1/2) [L_x L_y + L_y L_x − L_z(L_x + L_y) − (L_x + L_y)L_z + 2 L_z^2]
    = (1/2) (L_z^2 − L_x^2 − L_y^2) + (1/2) (L_z − L_x − L_y)^2.

Since z ∈ K^n and z^2 = x^2 + y^2, Lemma 5 yields L_z^2 − L_x^2 − L_y^2 ⪰ O. Then (35) yields

  ⟨∇_x ψ_FB(x, y), ∇_y ψ_FB(x, y)⟩ = ⟨S L_z^{-1} φ, L_z^{-1} φ⟩
    = (1/2) ⟨(L_z^2 − L_x^2 − L_y^2) L_z^{-1} φ, L_z^{-1} φ⟩ + (1/2) ⟨(L_z − L_x − L_y)^2 L_z^{-1} φ, L_z^{-1} φ⟩
    ≥ (1/2) ⟨(L_z − L_x − L_y)^2 L_z^{-1} φ, L_z^{-1} φ⟩
    = (1/2) ‖L_φ L_z^{-1} φ‖^2,

where the last equality uses L_z − L_x − L_y = L_{z−x−y} = L_φ. This proves (34). If the inequality in (34) holds with equality, then the above relation yields L_φ L_z^{-1} φ = 0 and hence φ ∘ (L_z^{-1} φ) = L_φ L_z^{-1} φ = 0. Then, the definition (10) of the Jordan product yields ⟨φ, L_z^{-1} φ⟩ = 0. Since z = (x^2 + y^2)^{1/2} ∈ int K^n, so that L_z^{-1} ≻ O (see Property 1(d)), this implies φ = 0. Conversely, if φ = 0, then it follows from (21) that ⟨∇_x ψ_FB(x, y), ∇_y ψ_FB(x, y)⟩ = 0.

Case 3: (x, y) ≠ (0, 0) and x^2 + y^2 ∉ int K^n. By Proposition 1, we have

  ∇_x ψ_FB(x, y) = (x_1/√(x_1^2 + y_1^2) − 1) φ(x, y),
  ∇_y ψ_FB(x, y) = (y_1/√(x_1^2 + y_1^2) − 1) φ(x, y).

Thus,

  ⟨x, ∇_x ψ_FB(x, y)⟩ + ⟨y, ∇_y ψ_FB(x, y)⟩
    = (x_1/√(x_1^2 + y_1^2) − 1) ⟨x, φ(x, y)⟩ + (y_1/√(x_1^2 + y_1^2) − 1) ⟨y, φ(x, y)⟩
    = ⟨(x_1 x + y_1 y)/√(x_1^2 + y_1^2) − x − y, φ(x, y)⟩
    = ⟨φ(x, y), φ(x, y)⟩,

where the last equality uses (26). This proves (33). Similarly,

  ⟨∇_x ψ_FB(x, y), ∇_y ψ_FB(x, y)⟩ = (x_1/√(x_1^2 + y_1^2) − 1)(y_1/√(x_1^2 + y_1^2) − 1) ‖φ(x, y)‖^2 ≥ 0.

This proves (34). If the inequality in (34) holds with equality, then either φ(x, y) = 0 or x_1/√(x_1^2 + y_1^2) = 1 or y_1/√(x_1^2 + y_1^2) = 1. In the second case, we have y_1 = 0 and x_1 ≥ 0, so that Lemma 2 yields y = 0 and x_1 = ‖x_2‖. In the third case, we have x_1 = 0 and y_1 ≥ 0, so that Lemma 2 yields x = 0 and y_1 = ‖y_2‖. Thus, in these two cases, we have x ∘ y = 0, x ∈ K^n, y ∈ K^n. Then, by Lemma 1, φ(x, y) = 0.

Below we assume that

  (∇F(ζ), −∇G(ζ)) are column monotone² for all ζ ∈ IR^n; (36)

see [9, p. 1014], [34]. In the case of (4), corresponding to ∇G(ζ) = I, (36) is equivalent to F being monotone. More generally, if ∇G(ζ) is invertible, then (36) is equivalent to ∇G(ζ)^{-1} ∇F(ζ) ⪰ O for all ζ ∈ IR^n. In the case of (6), (36) is satisfied always. To see this, note that ∇F(ζ) = [B 0]^T and ∇G(ζ) = [B 0]^T ∇^2 g(F(ζ)) − [0 A^T]^T. Hence ∇F(ζ)u − ∇G(ζ)v = 0 is equivalent to

  [B 0]^T u − [B 0]^T ∇^2 g(F(ζ)) v + [0 A^T]^T v = 0

for any u, v ∈ IR^n. This yields B^T u = B^T ∇^2 g(F(ζ)) v and Av = 0. The second equation implies v = Bw for some w ∈ IR^{n−m}, so that multiplying the first equation on the left by w^T and using ∇^2 g(F(ζ)) ⪰ O (since g is convex) yields

  w^T B^T u = v^T u = v^T ∇^2 g(F(ζ)) v ≥ 0.

2 M, N ∈ IR^{n×n} are column monotone if, for any u, v ∈ IR^n, Mu + Nv = 0 implies u^T v ≥ 0.
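All of the ingredients of Lemma 6 are directly computable: the Jordan product (10), the square root z = (x^2 + y^2)^{1/2} from (20), φ from (11) and ψ_FB from (17). Identity (33) can therefore be sanity-checked numerically against central finite differences. The following is a minimal pure-Python sketch (the function names are ours, not the paper's):

```python
import math

def jordan(x, y):
    """Jordan product (10): x o y = (<x, y>, x1*y2 + y1*x2)."""
    return [sum(a * b for a, b in zip(x, y))] + \
           [x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:])]

def soc_sqrt(x):
    """z = x^{1/2} for x in K^n: apply sqrt to the spectral values (19)-(20)."""
    n2 = math.sqrt(sum(t * t for t in x[1:]))
    w = [t / n2 for t in x[1:]] if n2 > 0 else [1.0] + [0.0] * (len(x) - 2)
    s1 = math.sqrt(max(x[0] - n2, 0.0))   # clamp guards tiny negative round-off
    s2 = math.sqrt(x[0] + n2)
    return [(s1 + s2) / 2.0] + [((s2 - s1) / 2.0) * t for t in w]

def phi(x, y):
    """Fischer-Burmeister function (11): (x^2 + y^2)^{1/2} - x - y."""
    z = soc_sqrt([a + b for a, b in zip(jordan(x, x), jordan(y, y))])
    return [c - a - b for c, a, b in zip(z, x, y)]

def psi(x, y):
    """Merit function (17): psi_FB = (1/2) ||phi(x, y)||^2."""
    return 0.5 * sum(t * t for t in phi(x, y))

def grad_x(x, y, h=1e-6):
    """Central-difference approximation of the partial gradient in x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((psi(xp, y) - psi(xm, y)) / (2 * h))
    return g

x = [2.0, 0.5, 0.3]
y = [3.0, 0.1, -0.2]
gx = grad_x(x, y)
gy = grad_x(y, x)   # psi is symmetric in its two arguments
lhs = sum(a * b for a, b in zip(x, gx)) + sum(a * b for a, b in zip(y, gy))
rhs = sum(t * t for t in phi(x, y))
print(abs(lhs - rhs))   # tiny: finite-difference error only
```

The same harness confirms ψ_FB = 0 at a complementary pair on the boundary of K^3, e.g. x = (1, 1, 0), y = (1, −1, 0), for which x ∘ y = 0.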

In the case of (7), (36) is also satisfied always, as can be argued similarly. Moreover, the argument extends to the more general problem where g is replaced by any differentiable monotone mapping from IR^n to IR^n.

Using Lemma 6(b), we prove below the first main result of this section, based on the merit function ψ_FB. Analogous results have been proven for the NCP case [10, 20] and the SDCP case [50].

Proposition 3. Let φ be given by (11) and let ψ_FB be given by (17). Let f_FB be given by (15), where F and G are differentiable mappings from IR^n to IR^n satisfying (36). Then, for every ζ ∈ IR^n, either (i) f_FB(ζ) = 0 or (ii) ∇f_FB(ζ) ≠ 0. In case (ii), if ∇G(ζ) is invertible, then ⟨d_FB(ζ), ∇f_FB(ζ)⟩ < 0, where

  d_FB(ζ) := −(∇G(ζ)^{-1})^T ∇_x ψ_FB(F(ζ), G(ζ)).

Proof. Fix any ζ ∈ IR^n. By Proposition 2, ψ_FB is smooth, so the chain rule for differentiation yields

  ∇f_FB(ζ) = ∇F(ζ) ∇_x ψ_FB(F(ζ), G(ζ)) + ∇G(ζ) ∇_y ψ_FB(F(ζ), G(ζ)).

Suppose ∇f_FB(ζ) = 0. The column monotone property of (∇F(ζ), −∇G(ζ)) yields

  ⟨∇_x ψ_FB(F(ζ), G(ζ)), ∇_y ψ_FB(F(ζ), G(ζ))⟩ ≤ 0.

By Lemma 6(b), the above inequality must hold with equality and hence φ(F(ζ), G(ζ)) = 0. Thus f_FB(ζ) = (1/2)‖φ(F(ζ), G(ζ))‖^2 = 0.

Suppose ∇f_FB(ζ) ≠ 0 and ∇G(ζ) is invertible. Then, dropping the argument ζ for simplicity,

  ⟨d_FB, ∇f_FB⟩ = ⟨−(∇G^{-1})^T ∇_x ψ_FB(F, G), ∇F ∇_x ψ_FB(F, G) + ∇G ∇_y ψ_FB(F, G)⟩
    = −⟨∇_x ψ_FB(F, G), ∇G^{-1} ∇F ∇_x ψ_FB(F, G)⟩ − ⟨∇_x ψ_FB(F, G), ∇_y ψ_FB(F, G)⟩
    ≤ −⟨∇_x ψ_FB(F, G), ∇_y ψ_FB(F, G)⟩,

where the inequality follows from ∇G^{-1} ∇F ⪰ O. By Lemma 6(b), the right-hand side is non-positive and equals zero if and only if φ(F, G) = 0, i.e., ζ is a global minimum of f_FB. Since ∇f_FB(ζ) ≠ 0, the right-hand side cannot equal zero, so it must be negative.

The direction d_FB(ζ) has the advantage that, unlike −∇f_FB(ζ), it does not require ∇F(ζ) for its evaluation. However, for the CSOCP reformulations (6) or (7), ∇G(ζ) is not invertible, so this direction cannot be used.

Using Lemma 6, we prove below the second main result of this section, based on the merit function ψ_YF given by (13).
Similar results have been proven for the NCP case [34] and the SDCP case [53].

Proposition 4. Let φ be given by (11), let ψ_FB be given by (17), and let ψ_YF be given by (13), with ψ_0 : IR → [0, ∞) being any smooth function satisfying (14). Let f_YF be given by (16), where F and G are differentiable mappings from IR^n to IR^n satisfying

(36). Then, for every ζ ∈ IR^n, either (i) f_YF(ζ) = 0 or (ii) ∇f_YF(ζ) ≠ 0. In case (ii), if ∇G(ζ) is invertible, then ⟨d_YF(ζ), ∇f_YF(ζ)⟩ < 0, where

  d_YF(ζ) := −(∇G(ζ)^{-1})^T [ψ_0′(⟨F(ζ), G(ζ)⟩) G(ζ) + ∇_x ψ_FB(F(ζ), G(ζ))].

Proof. Fix any ζ ∈ IR^n. By Proposition 2, ψ_FB is smooth. Since ψ_0 is smooth, (13) shows that ψ_YF is smooth. Then the chain rule for differentiation yields

  ∇f_YF(ζ) = α (∇F(ζ) G(ζ) + ∇G(ζ) F(ζ)) + ∇F(ζ) ∇_x ψ_FB(F(ζ), G(ζ)) + ∇G(ζ) ∇_y ψ_FB(F(ζ), G(ζ)),

where we let α := ψ_0′(⟨F(ζ), G(ζ)⟩). Suppose ∇f_YF(ζ) = 0. Then, dropping the argument ζ for simplicity, we have

  ∇F (αG + ∇_x ψ_FB(F, G)) + ∇G (αF + ∇_y ψ_FB(F, G)) = 0.

The column monotone property of (∇F, −∇G) yields

  ⟨αG + ∇_x ψ_FB(F, G), αF + ∇_y ψ_FB(F, G)⟩ ≤ 0.

Upon expanding the terms on the left-hand side, we have

  α^2 ⟨F, G⟩ + α (⟨F, ∇_x ψ_FB(F, G)⟩ + ⟨G, ∇_y ψ_FB(F, G)⟩) + ⟨∇_x ψ_FB(F, G), ∇_y ψ_FB(F, G)⟩ ≤ 0.

Our assumption (14) on ψ_0 implies the first term is nonnegative. By Lemma 6, the second and the third terms are also nonnegative. Thus, the third term must be zero, so Lemma 6(b) implies φ(F, G) = 0. Thus f_YF(ζ) = ψ_0(⟨F(ζ), G(ζ)⟩) + ψ_FB(F(ζ), G(ζ)) = 0, since φ(F(ζ), G(ζ)) = 0 implies ⟨F(ζ), G(ζ)⟩ = 0 and ψ_FB(F(ζ), G(ζ)) = 0.

Suppose ∇f_YF(ζ) ≠ 0 and ∇G(ζ) is invertible. Again, we drop the argument ζ for simplicity. Then,

  ⟨d_YF, ∇f_YF⟩ = ⟨−(∇G^{-1})^T (αG + ∇_x ψ_FB(F, G)), ∇F (αG + ∇_x ψ_FB(F, G)) + ∇G (αF + ∇_y ψ_FB(F, G))⟩
    = −⟨αG + ∇_x ψ_FB(F, G), ∇G^{-1} ∇F (αG + ∇_x ψ_FB(F, G))⟩ − ⟨αG + ∇_x ψ_FB(F, G), αF + ∇_y ψ_FB(F, G)⟩
    ≤ −⟨αG + ∇_x ψ_FB(F, G), αF + ∇_y ψ_FB(F, G)⟩
    = −α^2 ⟨F, G⟩ − α (⟨F, ∇_x ψ_FB(F, G)⟩ + ⟨G, ∇_y ψ_FB(F, G)⟩) − ⟨∇_x ψ_FB(F, G), ∇_y ψ_FB(F, G)⟩,

where the first inequality follows from ∇G^{-1} ∇F ⪰ O. We argued earlier that all three terms on the right-hand side are non-positive. Moreover, by Lemma 6(b), the third term is zero if and only if φ(F, G) = 0, i.e., ζ is a global minimum of f_YF and hence a stationary point of f_YF. Since ∇f_YF(ζ) ≠ 0, the right-hand side cannot equal zero, so it must be negative.

5. Bounded level sets and error bounds for f_YF

In this section, we consider the merit function f_YF given by (16). We show that, analogous to the NCP and SDCP cases [34, 53], if F and G have a joint monotonicity property and a strictly feasible solution exists, then f_YF has bounded level sets. If F and G have a joint strong monotonicity property, then f_YF has bounded level sets and provides a global error bound on the distance to a solution of the SOCCP. In contrast, the merit function f_FB given by (15) lacks these properties due to the absence of the term ψ_0(⟨F(ζ), G(ζ)⟩). As in the previous two sections, we focus our analysis on the case of N = 1 (i.e., K = K^n) for simplicity. In what follows, for each x ∈ IR^n, (x)_+ denotes the nearest-point (in the Euclidean norm) projection of x onto K^n. We begin with the following lemma.

Lemma 7. Let K be any closed convex cone in IR^n. For each x ∈ IR^n, let x_K^+ and x_K^− denote the nearest-point (in the Euclidean norm) projections of x onto K and −K^*, respectively. The following results hold.
(a) For any x ∈ IR^n, we have x = x_K^+ + x_K^− and ‖x‖^2 = ‖x_K^+‖^2 + ‖x_K^−‖^2.
(b) For any x ∈ IR^n and y ∈ K, we have ⟨x, y⟩ ≤ ⟨x_K^+, y⟩.
(c) If K is self-dual, then for any x ∈ IR^n and y ∈ K, we have ‖(x + y)_K^+‖ ≥ ‖x_K^+‖.
(d) For any x ∈ K^n and y ∈ IR^n with x^2 − y^2 ∈ K^n, we have x − y ∈ K^n.

Proof. (a) These are well-known results in convex geometry on representing x as the sum of its projections onto K and onto the polar cone −K^*.

(b) Since x_K^− ∈ −K^* and y ∈ K, we have ⟨x_K^−, y⟩ ≤ 0. By (a), ⟨x, y⟩ = ⟨x_K^+, y⟩ + ⟨x_K^−, y⟩ ≤ ⟨x_K^+, y⟩.

(c) Since K is self-dual, we have y ∈ K = K^*. Then (x + y)_K^− − y ∈ −K. Since x_K^− is the nearest point in −K to x, this implies

  ‖x_K^− − x‖ ≤ ‖(x + y)_K^− − y − x‖.
By (a), this simplifies to ‖x_K^+‖ ≤ ‖(x + y)_K^+‖.

(d) This is Proposition 3.4 of [19].

Lemma 7(c) generalizes [53, Lemma 2.4]. Using Lemma 7, we obtain the following two lemmas that are analogs of [53, Lemmas 2.5, 2.6] for the SDCP.

Lemma 8. Let ψ_FB be given by (11) and (17). For any (x, y) ∈ IR^n × IR^n, we have

  4 ψ_FB(x, y) ≥ 2 ‖(φ(x, y))_+‖^2 ≥ ‖(−x)_+‖^2 + ‖(−y)_+‖^2.

Proof. The first inequality follows from Lemma 7(a). It remains to show the second inequality. By Lemma 7(d), (x^2 + y^2)^{1/2} − x ∈ K^n. Since K^n is self-dual, Lemma 7(c) yields

  ‖(φ(x, y))_+‖ = ‖(−y + (x^2 + y^2)^{1/2} − x)_+‖ ≥ ‖(−y)_+‖.

By a symmetric argument,

  ‖(φ(x, y))_+‖ ≥ ‖(−x)_+‖.

Adding the squares of the above two inequalities yields the desired second inequality.

Lemma 9. Let ψ_FB be given by (11) and (17). For any {(x^k, y^k)}_{k=1}^∞ ⊆ IR^n × IR^n, let λ_1^k ≤ λ_2^k and µ_1^k ≤ µ_2^k denote the spectral values of x^k and y^k, respectively. Then the following results hold.
(a) If λ_1^k → −∞ or µ_1^k → −∞, then ψ_FB(x^k, y^k) → ∞.
(b) Suppose that {λ_1^k} and {µ_1^k} are bounded below. If λ_2^k → ∞ or µ_2^k → ∞, then ⟨x̄, x^k⟩ + ⟨ȳ, y^k⟩ → ∞ for any (x̄, ȳ) ∈ int K^n × int K^n.

Proof. (a) This follows from Lemma 8 and the fact that ‖(−x^k)_+‖^2 = (1/2) Σ_{i=1,2} max{0, −λ_i^k}^2, and similarly for ‖(−y^k)_+‖^2; see [19, Property 2.2 and Proposition 3.3].

(b) Fix any x̄ = (x̄_1, x̄_2), ȳ = (ȳ_1, ȳ_2) ∈ IR × IR^{n−1} with ‖x̄_2‖ < x̄_1 and ‖ȳ_2‖ < ȳ_1. Using the spectral decomposition

  x^k = (λ_1^k/2)(1, −w^k) + (λ_2^k/2)(1, w^k), with ‖w^k‖ = 1,

we have

  ⟨x̄, x^k⟩ = (λ_1^k/2)(x̄_1 − x̄_2^T w^k) + (λ_2^k/2)(x̄_1 + x̄_2^T w^k). (37)

Since ‖w^k‖ = 1, we have x̄_1 − x̄_2^T w^k ≥ x̄_1 − ‖x̄_2‖ > 0 and x̄_1 + x̄_2^T w^k ≥ x̄_1 − ‖x̄_2‖ > 0. Since {λ_1^k} is bounded below, the first term on the right-hand side of (37) is bounded below. If λ_2^k → ∞, then the second term on the right-hand side of (37) tends to infinity. Hence, ⟨x̄, x^k⟩ → ∞. A similar argument shows that ⟨ȳ, y^k⟩ is bounded below. Thus, ⟨x̄, x^k⟩ + ⟨ȳ, y^k⟩ → ∞. If µ_2^k → ∞, the argument is symmetric to the one above.
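The projections appearing in Lemmas 7-9 are computable in closed form from the spectral factorization of x: (x)_+ = max{0, λ_1} u_1 + max{0, λ_2} u_2, with u_i = (1, ∓w)/2. A minimal pure-Python sketch (our own helper name):

```python
import math

def soc_project(x):
    """Nearest-point projection of x = (x1, x2) onto K^n:
    clamp each spectral value at zero and recombine."""
    n2 = math.sqrt(sum(t * t for t in x[1:]))
    w = [t / n2 for t in x[1:]] if n2 > 0 else [1.0] + [0.0] * (len(x) - 2)
    m1 = max(0.0, x[0] - n2)   # max{0, lam1}
    m2 = max(0.0, x[0] + n2)   # max{0, lam2}
    return [(m1 + m2) / 2.0] + [((m2 - m1) / 2.0) * t for t in w]
```

Since ‖u_i‖^2 = 1/2 and u_1 ⊥ u_2, this gives ‖(−x)_+‖^2 = (1/2)(max{0, −λ_1}^2 + max{0, −λ_2}^2), which is exactly the fact driving the proof of Lemma 9(a).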

In what follows, we say that F and G are jointly monotone if

  ⟨F(ζ) − F(ξ), G(ζ) − G(ξ)⟩ ≥ 0 for all ζ, ξ ∈ IR^n.

Similarly, F and G are jointly strongly monotone if there exists ρ > 0 such that

  ⟨F(ζ) − F(ξ), G(ζ) − G(ξ)⟩ ≥ ρ ‖ζ − ξ‖^2 for all ζ, ξ ∈ IR^n.

In the case where G(ζ) = ζ for all ζ ∈ IR^n, the above notions are equivalent to the well-known notions of F being, respectively, monotone and strongly monotone [9, Section 2.3]. Since F is differentiable, F being monotone is equivalent to ∇F(ζ) ⪰ O for all ζ ∈ IR^n; see, e.g., [9, Proposition 2.3.2].³ It can be seen that F, G given by (6) or (7) are jointly monotone, but not jointly strongly monotone. It is not difficult to see that if F, G are jointly strongly monotone, then the SOCCP has at most one solution. Sufficient conditions for the SOCCP to have a solution are given in, e.g., [9, Sections 2.2, 2.4], [3, Chapter 6], as well as Proposition 6.

Using Lemmas 7(b) and 8, we obtain the following global error bound result for the SOCCP that is an analog of [53, Theorem 4.2] for the SDCP. The proof, based on Lemmas 7(b) and 8, is similar to the proofs of [34, Theorem 3.4] and [53, Theorem 4.2] and is included for completeness.

Proposition 5. Suppose that F and G are jointly strongly monotone mappings from IR^n to IR^n. Also, suppose that the SOCCP has a solution ζ̄. Then there exists a scalar τ > 0 such that

  τ ‖ζ − ζ̄‖^2 ≤ max{0, ⟨F(ζ), G(ζ)⟩} + ‖(−F(ζ))_+‖ + ‖(−G(ζ))_+‖ for all ζ ∈ IR^n. (38)

Moreover,

  τ ‖ζ − ζ̄‖^2 ≤ ψ_0^{-1}(f_YF(ζ)) + 2√2 f_YF(ζ)^{1/2} for all ζ ∈ IR^n, (39)

where f_YF is given by (13), (16), (17), ψ_0 : IR → [0, ∞) is a smooth function satisfying (14), and ψ_0^{-1} denotes the inverse function of ψ_0 on [0, ∞).⁴

Proof. Since F and G are jointly strongly monotone, there exists a scalar ρ > 0 such that, for any ζ ∈ IR^n,

  ρ ‖ζ − ζ̄‖^2 ≤ ⟨F(ζ) − F(ζ̄), G(ζ) − G(ζ̄)⟩
    = ⟨F(ζ), G(ζ)⟩ − ⟨F(ζ̄), G(ζ)⟩ − ⟨F(ζ), G(ζ̄)⟩ + ⟨F(ζ̄), G(ζ̄)⟩
    ≤ max{0, ⟨F(ζ), G(ζ)⟩} + ⟨F(ζ̄), (−G(ζ))_+⟩ + ⟨(−F(ζ))_+, G(ζ̄)⟩
    ≤ max{0, ⟨F(ζ), G(ζ)⟩} + ‖F(ζ̄)‖ ‖(−G(ζ))_+‖ + ‖G(ζ̄)‖ ‖(−F(ζ))_+‖
    ≤ max{1, ‖F(ζ̄)‖, ‖G(ζ̄)‖} (max{0, ⟨F(ζ), G(ζ)⟩} + ‖(−F(ζ))_+‖ + ‖(−G(ζ))_+‖),

3 However, F and G being jointly monotone seems not equivalent to (∇F(ζ), −∇G(ζ)) being column monotone for all ζ ∈ IR^n.
4 ψ_0^{-1} is well defined since, by (14), ψ_0 is strictly increasing on [0, ∞).

where the second inequality uses ⟨F(ζ̄), G(ζ̄)⟩ = 0 and Lemma 7(b). Setting τ := ρ / max{1, ‖F(ζ̄)‖, ‖G(ζ̄)‖} yields (38). Using (13), (14) and (16), we have

  max{0, ⟨F(ζ), G(ζ)⟩} ≤ ψ_0^{-1}(f_YF(ζ)) and ψ_FB(F(ζ), G(ζ)) ≤ f_YF(ζ).

Using Lemma 8 and the second inequality, we have

  ‖(−F(ζ))_+‖ + ‖(−G(ζ))_+‖ ≤ √2 (‖(−F(ζ))_+‖^2 + ‖(−G(ζ))_+‖^2)^{1/2} ≤ 2√2 ψ_FB(F(ζ), G(ζ))^{1/2} ≤ 2√2 f_YF(ζ)^{1/2}.

Thus,

  max{0, ⟨F(ζ), G(ζ)⟩} + ‖(−F(ζ))_+‖ + ‖(−G(ζ))_+‖ ≤ ψ_0^{-1}(f_YF(ζ)) + 2√2 f_YF(ζ)^{1/2}.

This together with (38) yields (39).

If in addition F is continuous and G(ζ) = ζ for all ζ ∈ IR^n, then the assumption that the SOCCP has a solution can be dropped from Proposition 5; see, e.g., [9, Proposition 2.2.7]. Also, the exponent 2 in the definition of joint strong monotonicity can be replaced by any q > 1, and Proposition 5 would generalize accordingly.

By using Lemma 9 and Proposition 4, we have the following analog of [53, Theorem 4.1] on solution existence and boundedness of the level sets of f_YF.

Proposition 6. Suppose that F and G are differentiable, jointly monotone mappings from IR^n to IR^n satisfying

  lim_{‖ζ‖→∞} ‖F(ζ)‖ + ‖G(ζ)‖ = ∞. (40)

Suppose also that the SOCCP is strictly feasible, i.e., there exists ζ̄ ∈ IR^n such that F(ζ̄), G(ζ̄) ∈ int K^n. Then the level set

  L(γ) := {ζ ∈ IR^n | f_YF(ζ) ≤ γ}

is bounded for all γ ≥ 0, where f_YF is given by (13), (16), (17), and ψ_0 : IR → [0, ∞) is a smooth function satisfying (14). If in addition F, G satisfy (36), then L(γ) ≠ ∅ for all γ ≥ 0.

Proof. For any γ ≥ 0, if {ζ^k}_{k=1}^∞ ⊆ L(γ), then {f_YF(ζ^k)} is bounded and the joint monotonicity of F and G yields

  ⟨F(ζ^k), G(ζ̄)⟩ + ⟨F(ζ̄), G(ζ^k)⟩ ≤ ⟨F(ζ^k), G(ζ^k)⟩ + ⟨F(ζ̄), G(ζ̄)⟩, k = 1, 2, ...

Using this together with Lemma 9 and an argument analogous to the proof of [53, Theorem 4.1], we obtain that {‖F(ζ^k)‖ + ‖G(ζ^k)‖} is bounded. Then (40) implies {ζ^k} is bounded. This shows that L(γ) is bounded. The proof of L(γ) ≠ ∅ uses Proposition 4 and is nearly identical to the proof of [53, Theorem 4.1].
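The conclusions of Propositions 3-6 can be exercised end-to-end on a toy instance of our own devising (it is not from the paper): with G(ζ) = ζ and F(ζ) = ζ + q, q = (−1, 0, 0), the pair is jointly strongly monotone, the unique SOCCP solution is ζ = (1, 0, 0), and steepest descent with the Armijo rule on f_FB(ζ) = ψ_FB(F(ζ), G(ζ)), using finite-difference gradients for brevity, drives the merit value to zero from the non-interior start ζ = 0:

```python
import math

def jordan(x, y):
    return [sum(a * b for a, b in zip(x, y))] + \
           [x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:])]

def soc_sqrt(x):
    n2 = math.sqrt(sum(t * t for t in x[1:]))
    w = [t / n2 for t in x[1:]] if n2 > 0 else [1.0] + [0.0] * (len(x) - 2)
    s1, s2 = math.sqrt(max(x[0] - n2, 0.0)), math.sqrt(x[0] + n2)
    return [(s1 + s2) / 2.0] + [((s2 - s1) / 2.0) * t for t in w]

def psi_fb(x, y):
    z = soc_sqrt([a + b for a, b in zip(jordan(x, x), jordan(y, y))])
    return 0.5 * sum((c - a - b) ** 2 for c, a, b in zip(z, x, y))

q = [-1.0, 0.0, 0.0]

def f_fb(zt):                            # f_FB(zeta) = psi_FB(F(zeta), G(zeta))
    x = [a + b for a, b in zip(zt, q)]   # F(zeta) = zeta + q
    return psi_fb(x, zt)                 # G(zeta) = zeta

def grad(zt, h=1e-6):                    # central-difference gradient
    g = []
    for i in range(len(zt)):
        zp, zm = list(zt), list(zt)
        zp[i] += h
        zm[i] -= h
        g.append((f_fb(zp) - f_fb(zm)) / (2 * h))
    return g

zeta = [0.0, 0.0, 0.0]                   # non-interior starting point
for _ in range(5000):
    fz = f_fb(zeta)
    if fz < 1e-12:
        break
    g = grad(zeta)
    gg = sum(t * t for t in g)
    a = 1.0                              # Armijo rule, initial trial stepsize 1
    while f_fb([zi - a * gi for zi, gi in zip(zeta, g)]) > fz - 1e-4 * a * gg:
        a *= 0.5
        if a < 1e-14:
            break
    zeta = [zi - a * gi for zi, gi in zip(zeta, g)]

print(zeta)   # approaches the solution (1, 0, 0)
```

Because the pair is jointly strongly monotone, the error bound of Proposition 5 guarantees that a small final merit value certifies proximity to the unique solution.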

It is straightforward to verify that F, G given by (6) or (7) are jointly monotone. Also, we saw in Section 4 that they satisfy (36). If g is linear or, more generally, ‖∇g(x)‖/‖x‖ → 0 as ‖x‖ → ∞, then (40) holds and, by Proposition 6, the CSOCP has nonempty bounded optimal primal and dual solution sets whenever it has strictly feasible primal and dual solutions. This result in fact extends to the more general problem where g is replaced by any differentiable monotone mapping from IR^n to IR^n. This result also holds when F is differentiable monotone and G(ζ) = ζ for all ζ ∈ IR^n.

6. Preliminary numerical experience

Propositions 2 and 3 show that the SOCP and, more generally, the CSOCP (5) may be reformulated as the unconstrained minimization of the smooth merit function f_FB or f_YF, with F, G given by either (6) or (7). In particular, the merit function has a stationary point if and only if both primal and dual optimal solutions of the CSOCP exist and there is no duality gap. And each stationary point yields primal and dual optimal solutions. Thus, we can solve the CSOCP by applying any unconstrained minimization method to the merit function. In contrast to primal-dual interior-point methods for the SOCP, this approach does not require the SOCP or its dual to have an interior feasible solution, and it opens the SOCP to solution by unconstrained optimization methods. It also allows non-interior starting points. In this section, we report our preliminary experience with solving SOCPs from the DIMACS library and randomly generated CSOCPs by this approach.

In our tests, we use the merit function f_FB. Comparable results are expected with f_YF. We consider F, G given by (7). We evaluate F, G using the Cholesky factorization of AA^T, which is efficient when A is sparse.⁵ In particular, given such a factorization LL^T = AA^T, we can compute x = F(ζ) and y = G(ζ) for each ζ via two sparse matrix-vector multiplications and two forward/backward solves:

  Lu = Aζ, L^T v = u, w = A^T v, x = d + ζ − w, y = ∇g(x) − w.
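The factor-once evaluation just displayed can be sketched as follows (a pure-Python stand-in for the Matlab/Fortran setup; chol, fwd, bwd are our helper names, the small A, b, c, d form a toy instance, and we take g(x) = c^T x so that ∇g(x) = c):

```python
import math

def chol(M):
    """Dense Cholesky factor L with L L^T = M, for symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(M[i][i] - s) if i == j else (M[i][j] - s) / L[j][j]
    return L

def fwd(L, b):       # forward solve L u = b
    u = []
    for i in range(len(b)):
        u.append((b[i] - sum(L[i][k] * u[k] for k in range(i))) / L[i][i])
    return u

def bwd(L, b):       # backward solve L^T v = b
    n = len(b)
    v = [0.0] * n
    for i in reversed(range(n)):
        v[i] = (b[i] - sum(L[k][i] * v[k] for k in range(i + 1, n))) / L[i][i]
    return v

def matvec(M, x):
    return [sum(r[j] * x[j] for j in range(len(x))) for r in M]

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 1.0]
d = [1.0, 0.0, 1.0]                       # any d with A d = b
c = [1.0, 2.0, 3.0]                       # gradient of g(x) = c^T x
AAT = [[sum(r1[j] * r2[j] for j in range(3)) for r2 in A] for r1 in A]
L = chol(AAT)                             # factored once, reused for every zeta

def F_and_G(zeta):
    """Evaluate x = F(zeta), y = G(zeta) following the display above."""
    u = fwd(L, matvec(A, zeta))                                      # L u = A zeta
    v = bwd(L, u)                                                    # L^T v = u
    w = [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(3)]  # w = A^T v
    x = [di + zi - wi for di, zi, wi in zip(d, zeta, w)]             # x = d + zeta - w
    y = [ci - wi for ci, wi in zip(c, w)]                            # y = grad g(x) - w
    return x, y
```

Since A w = A ζ by construction, A F(ζ) = A d = b for every ζ, so the x-iterates stay primal feasible, and the factor L is never recomputed.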
In contrast to interior-point methods, the Cholesky factorization needs to be computed only once, thus allowing f_FB and its gradient to be efficiently evaluated. All computer codes are written in Matlab, except for the evaluation of φ(x, y) and ψ_FB(x, y), which are more efficiently written in Fortran and called from Matlab as Mex files since their evaluations require looping N times through the SOCs. In fact, coding these evaluations in Fortran instead of Matlab reduced the overall cpu time by a factor of about 10, despite some loss in accuracy which results in higher iteration counts. The Cholesky factorization is computed using the Matlab routine chol. For the vector d satisfying Ad = b, which is effectively the initial x (see below), we compute it as a solution of min_d ‖Ad − b‖ using Matlab's least-squares solver. It would be worthwhile to explore other choices. For the unconstrained optimization method

  ζ^{new} := ζ + α d,

5 We also experimented with a version that uses a pre-conditioned conjugate gradient method instead of the Cholesky factorization, but it did not seem to improve the cpu time significantly. Precomputing and storing the n × n matrix A^T(AA^T)^{-1}A also did not improve the cpu time, even on problems with dense A.

we compute the direction d by either the conjugate gradient (CG) method, using Polak-Ribière or Fletcher-Reeves updates, or the BFGS method, or the limited-memory BFGS (L-BFGS) method, and we compute the stepsize α by the Armijo rule with 1 as the initial trial stepsize, which is typically accepted [5, 18, 38]. We do not enforce the Wolfe condition [18, Chapter 2] since it is expensive, requiring an extra gradient evaluation per stepsize. To ensure convergence, we revert to the steepest descent direction −∇f_FB(ζ) whenever the current direction d fails to satisfy the sufficient descent condition

  ⟨∇f_FB(ζ), d⟩ ≤ −10^{-5} ‖∇f_FB(ζ)‖ ‖d‖.

The initial point is chosen to be ζ^{init} = 0, so that x^{init} = d and y^{init} = c. It may be worthwhile to explore other choices. The method terminates when

  max{f_FB(ζ), |x^T y|} ≤ accur, (41)

where accur is a user-specified solution accuracy. The duality gap |x^T y| is included to facilitate comparison with interior-point methods. The method requires 1 gradient evaluation and at least 1 function evaluation per iteration. This is the dominant computation for CG and L-BFGS.

6.1. Solving SOCP with sparse A

We consider the special case of the CSOCP where A is sparse and g(x) = c^T x for some c ∈ IR^n. The test problems are drawn from the DIMACS Implementation Challenge library [39], a collection of nontrivial medium-to-large SOCPs arising from applications. In our tests, L-BFGS is found to be clearly superior to CG and BFGS. Thus we focus on L-BFGS from here on. The recommended memory length of 5 [38, Section 9.1] is found to work the best. However, for the scaling matrix H_0 = γI, the choice

  1/γ = p^T q / q^T q

is found to work better than the four choices used by Liu and Nocedal [32], including the recommended choice of γ = p^T q / q^T q [38, p. 226], where p := ζ − ζ^{old} and q := ∇f_FB(ζ) − ∇f_FB(ζ^{old}). We do not have a good explanation for this. We will refer to the above method as L-BFGS-Merit.
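For reference, the L-BFGS direction referred to above is the standard two-loop recursion with memory 5 and scaling H_0 = γI. The sketch below (ours, not the paper's code) uses the textbook choice γ = p^T q / q^T q (the paper's preferred variant differs, as noted) and is demonstrated on a small quadratic rather than on f_FB:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lbfgs_dir(g, S, Y, gamma):
    """Two-loop recursion: returns d = -H g, where H is built from the stored
    pairs (s, y) in S, Y (oldest first) and H0 = gamma * I."""
    q = list(g)
    rho = [1.0 / dot(s, y) for s, y in zip(S, Y)]
    alpha = [0.0] * len(S)
    for i in reversed(range(len(S))):          # newest to oldest
        alpha[i] = rho[i] * dot(S[i], q)
        q = [qj - alpha[i] * yj for qj, yj in zip(q, Y[i])]
    r = [gamma * qj for qj in q]
    for i in range(len(S)):                    # oldest to newest
        beta = rho[i] * dot(Y[i], r)
        r = [rj + (alpha[i] - beta) * sj for rj, sj in zip(r, S[i])]
    return [-rj for rj in r]

# demo: minimize f(x) = 1/2 sum (i+1) x_i^2 with an Armijo line search
def f(x):
    return 0.5 * sum((i + 1) * t * t for i, t in enumerate(x))

def gf(x):
    return [(i + 1) * t for i, t in enumerate(x)]

x = [1.0] * 5
S, Y, m = [], [], 5                            # memory length 5
for _ in range(50):
    g = gf(x)
    if dot(g, g) < 1e-20:
        break
    gamma = dot(S[-1], Y[-1]) / dot(Y[-1], Y[-1]) if S else 1.0
    d = lbfgs_dir(g, S, Y, gamma)
    a, fx, slope = 1.0, f(x), dot(g, d)
    while f([xi + a * di for xi, di in zip(x, d)]) > fx + 1e-4 * a * slope:
        a *= 0.5
        if a < 1e-12:
            break
    xn = [xi + a * di for xi, di in zip(x, d)]
    S.append([p - r for p, r in zip(xn, x)])
    Y.append([p - r for p, r in zip(gf(xn), g)])
    if len(S) > m:
        S.pop(0)
        Y.pop(0)
    x = xn
```

On a convex quadratic every curvature pair satisfies s^T y > 0, so the recursion always produces a descent direction and the Armijo loop terminates.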
We also tested an alternative implementation whereby the public-domain Fortran L-BFGS code of Nocedal (1990 version) [32], with the default choice of H_0, is called from Matlab as a Mex file. Nocedal's code uses the stepsize procedure of Moré and Thuente, which enforces a curvature condition as well as sufficient descent. However, on the DIMACS problems, this alternative implementation requires more iterations and cpu time than L-BFGS-Merit to reach the same solution accuracy.

In our tests on the DIMACS problems, we find that L-BFGS-Merit can solve SOCPs to low-medium accuracy (accur ≥ 1e-5) fairly fast on problems where n is much bigger than m (in particular, nb, nb_L2, nb_L2_bessel). This can be seen from the cpu times reported in Table 1, comparing L-BFGS-Merit with SeDuMi (Version 1.05) by Jos Sturm [46] with varying termination accuracy. SeDuMi is a primal-dual interior-point code that, in the benchmarking of Mittelmann [36, p. 44], is within a factor of