On Conically Ordered Convex Programs


Shuzhong Zhang

May 2003; revised December 2004

Abstract

In this paper we study a special class of convex optimization problems called conically ordered convex programs (COCP), where the feasible region is given as the level set of a vector-valued nonlinear mapping, expressed as a nonnegative combination of convex functions. The nonnegativity of the vectors is defined using a prescribed conic ordering. The new model extends the ordinary convex programming models, where the feasible sets are the level sets of convex functions, and it also extends the well-known linear conic optimization models. We introduce a condition on the barrier function for the order-defining cone, termed the cone-consistency property. The relationship between cone-consistent barriers and self-concordant barriers is investigated. We prove that if the order-defining cone admits a self-concordant and cone-consistent barrier function, and moreover, if the constraint functions are all convex quadratic, then the overall composite barrier function is self-concordant. The problem is thus solvable in polynomial time, following Nesterov and Nemirovskii, by means of the path-following method. If the constraint functions are not quadratic, but harmonically convex, then we propose a variant of the Iri-Imai type potential reduction method. In addition to the self-concordance and cone-consistency conditions, we assume that the barrier function for the order-defining cone has the property that the image of the cone under its Hessian matrix is contained in its dual cone. All these conditions are satisfied by the familiar self-scaled cones. Under these conditions we show that the Iri-Imai type potential reduction algorithm converges in polynomial time. Duality issues related to this class of optimization problems are discussed as well.

Keywords: convex programming, conic ordering, self-concordant barriers, potential reduction.
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong. Research supported by Hong Kong RGC Earmarked Grants CUHK433/0E and CUHK474/03E.

1 The problem formulation

Let K ⊆ R^l be a solid closed convex cone. Let h_j ∈ K, j = 1, ..., k, and h_0 ∈ R^l. Consider the following optimization problem, which we shall call the conically ordered convex program (COCP) in this paper,

(P_K)   minimize    c^T x
        subject to  Ax = b
                    h_0 − Σ_{j=1}^k f_j(x) h_j ∈ K,

where A ∈ R^{m×n}, c ∈ R^n and b ∈ R^m. If we use the conic ordering defined by K, i.e., for y, z ∈ R^l, y ⪯_K z if and only if z − y ∈ K, then (P_K) can be written as

minimize    c^T x
subject to  Ax = b
            Σ_{j=1}^k f_j(x) h_j ⪯_K h_0.

Note that all the results in this paper extend straightforwardly to the setting where more than one conic constraint is present, namely,

minimize    c^T x
subject to  Ax = b
            Σ_j f^i_j(x) h^i_j ⪯_{K_i} h^i_0,   i = 1, ..., p.

To keep the analysis simple, however, we shall restrict ourselves to a single conic constraint in the subsequent discussion. It is evident that (P_K) is indeed a convex program, as stated below.

Lemma 1.1 If the f_j's are all convex functions, j = 1, ..., k, then (P_K) is a convex program.

Since K is solid, i.e. K + (−K) = R^l, any vector in R^l can be written as the difference of two vectors in K. Therefore, the ordinary linear conic constraint h_0 − Σ_{j=1}^k x_j h_j ∈ K, where h_j ∈ R^l for all j, can be written as

h_0 − Σ_{j=1}^k x_j (h^+_j − h^−_j) = h_0 − Σ_{j=1}^k x_j h^+_j − Σ_{j=1}^k (−x_j) h^−_j ∈ K,

where h_j = h^+_j − h^−_j and h^+_j, h^−_j ∈ K for all j; the latter expression is in the form of a COCP. Obviously, if K = R^l_+ and the h_j's are unit vectors, then (P_K) is the usual convex program where the constraints are given by the inequalities f_j(x) ≤ 0, j = 1, ..., k.
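The two reductions just described can be spot-checked numerically. The following Python sketch is illustrative only (the function names and data are assumptions, not from the paper): it verifies that with K = R^l_+ and unit vectors h_j the conic constraint reduces to componentwise inequalities f_j(x) ≤ (h_0)_j, and that a monomial becomes exp(a^T x) under the substitution x = log t, anticipating the geometric-programming case discussed next.

```python
import numpy as np

def cocp_feasible(x, f_list, h0, h_vecs, A, b, tol=1e-9):
    """Feasibility test for (P_K) with K = R^l_+ :
    require Ax = b and h0 - sum_j f_j(x) h_j in the nonnegative orthant."""
    if not np.allclose(A @ x, b, atol=tol):
        return False
    slack = h0 - sum(f(x) * h for f, h in zip(f_list, h_vecs))
    return bool(np.all(slack >= -tol))

# With h_j the unit vectors, the conic constraint is just f_j(x) <= (h0)_j.
A, b = np.zeros((0, 2)), np.zeros(0)         # no equality constraints
f_list = [lambda x: x[0]**2 + x[1]**2,       # convex quadratic
          lambda x: np.exp(x[0])]            # convex
h0 = np.array([2.0, 3.0])
h_vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
ok_in = cocp_feasible(np.array([0.5, 0.5]), f_list, h0, h_vecs, A, b)   # f_1 = 0.5 <= 2
ok_out = cocp_feasible(np.array([2.0, 0.0]), f_list, h0, h_vecs, A, b)  # f_1 = 4 > 2

# A monomial f(t) = t_1^{a_1} ... t_n^{a_n} turns into exp(a^T x) under x = log t.
t = np.array([0.7, 1.3, 2.1])
a = np.array([1.5, -2.0, 0.7])
same = abs(np.prod(t**a) - np.exp(a @ np.log(t))) < 1e-12

print(ok_in, ok_out, same)  # True False True
```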

Another interesting special case of (P_K) is the following:

(P_{R^l_+})   minimize    c^T x
              subject to  Ax = b
                          Σ_{j=1}^k f_j(x) h_j ⪯ h_0,

where f_j(x) = e^{ι_j^T x + ι_{j0}} with ι_j ∈ R^n and ι_{j0} ∈ R, and h_j ∈ R^l_+, j = 0, 1, ..., k. This problem is known as a geometric program. Note that in its original form, a general geometric program is given as

(GP)   minimize    Σ_{j=1}^m c_j f_{0j}(t)
       subject to  Σ_{j=1}^m a_{ij} f_{ij}(t) ≤ b_i,   i = 1, ..., p
                   f_{ij}(t) = b_i,   i = p + 1, ..., p + q
                   t_1 > 0, ..., t_n > 0,

where c_j > 0 and a_{ij} > 0, and

f_{ij}(t) = t_1^{ι^{(ij)}_1} · · · t_n^{ι^{(ij)}_n},   i = 0, 1, ..., p + q,   j = 1, ..., m.

Such f_{ij}(t)'s are called monomials. Using the variable transformation

x_1 = log t_1, x_2 = log t_2, ..., x_n = log t_n,

the monomial f_{ij}(t) becomes e^{(ι^{(ij)})^T x}, where (ι^{(ij)})^T = [ι^{(ij)}_1, ..., ι^{(ij)}_n], i = 0, 1, ..., p + q, j = 1, ..., m. This way, (GP) is transformed into a COCP problem (P_{R^l_+}).

Next let us consider another special case of COCP:

(P_{PSD})   minimize    c^T x
            subject to  Ax = b
                        H_0 − Σ_{j=1}^k f_j(x) H_j ⪰ 0,

where H_0, ..., H_k ⪰ 0 are l × l positive semidefinite matrices, and f_1(x), ..., f_k(x) are convex functions. Consider the following barrier function

F(x) = − log det ( H_0 − Σ_{j=1}^k f_j(x) H_j )   (1)

for the cone of l × l positive semidefinite matrices S^l_+. A natural question to ask is: under what conditions is the barrier function F(x) defined in (1) self-concordant? For details on the theory of self-concordant barriers, one is referred to the excellent texts [5, 4, 7]. The notion of self-concordant barrier function was introduced by Nesterov and Nemirovskii in their seminal work [5]. For convenience, we shall include the definition below. First, a barrier (convex) function F(x) for the cone K is defined to have the property that F(x) < ∞ for all x ∈ int K, and F(x_k) → ∞ as x_k → x̄, where x̄ is on the boundary of K. Moreover, it is called self-concordant (see [5]) if it further satisfies

|∇^3 F(x)[u, u, u]| ≤ C_1 ( ∇^2 F(x)[u, u] )^{3/2}   (2)

and

|∇F(x)[u]| ≤ C_2 ( ∇^2 F(x)[u, u] )^{1/2}   (3)

for any x ∈ int K and any direction u ∈ R^n. The positive constants C_1 and C_2 can be scaled so as to leave only one degree of freedom. Usually C_1 is set to be 2, and C_2^2 is then referred to as the complexity parameter of the cone K. As noted by Renegar [7], just like convexity, the self-concordance property is in fact a line property: if a function is self-concordant along every line (thus one-dimensional) restricted to its domain, then the function is self-concordant in the whole domain, and vice versa. For this purpose, let us consider a given x in the interior of the feasible domain of (P_{PSD}), and let u ∈ R^n be any given direction. Consider the one-dimensional matrix function

X(t) := H_0 − Σ_{j=1}^k f_j(x + tu) H_j

in the domain where X(t) ≻ 0. Obviously,

X'(t) = − Σ_{j=1}^k ∇f_j(x + tu)[u] H_j = − Σ_{j=1}^k ( ∇f_j(x + tu)^T u ) H_j,

X''(t) = − Σ_{j=1}^k ∇^2 f_j(x + tu)[u, u] H_j = − Σ_{j=1}^k ( u^T ∇^2 f_j(x + tu) u ) H_j ⪯ 0,

X'''(t) = − Σ_{j=1}^k ∇^3 f_j(x + tu)[u, u, u] H_j,

and

[X(t)^{−1}]' = − X(t)^{−1} X'(t) X(t)^{−1}.
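Two quick numeric sanity checks of the displays above (a sketch; the specific matrices, coefficients and tolerances are assumptions, not from the paper): the 1-D barrier −log x attains the self-concordance constants C_1 = 2 and C_2 = 1 in (2)-(3) exactly, and the first derivative of the log-determinant of X(t), analyzed in the sequel, matches a finite-difference estimate.

```python
import numpy as np

# (a) For F(x) = -log x on R_+ : F' = -1/x, F'' = 1/x^2, F''' = -2/x^3,
#     so |F'''| = 2 (F'')^{3/2} and |F'| = (F'')^{1/2}, i.e. C_1 = 2, C_2 = 1.
for x in [0.1, 1.0, 7.5]:
    d1, d2, d3 = -1.0 / x, 1.0 / x**2, -2.0 / x**3
    assert np.isclose(abs(d3), 2.0 * d2**1.5)
    assert np.isclose(abs(d1), d2**0.5)

# (b) Finite-difference check of g'(t) = -tr(X(t)^{-1} X'(t)) for
#     g(t) = -log det X(t), with X(t) = X0 - sum_j q_j(t) H_j and q_j(t) = c_j t^2 + t.
rng = np.random.default_rng(0)
l = 4
H = [B @ B.T for B in (rng.normal(size=(l, l)) for _ in range(3))]  # PSD coefficient matrices
coeffs = [0.3, 0.1, 0.2]
X0 = 10.0 * np.eye(l)                    # keeps X(t) positive definite near t = 0

def X(t):
    return X0 - sum((c * t**2 + t) * Hj for c, Hj in zip(coeffs, H))

def Xprime(t):
    return -sum((2 * c * t + 1.0) * Hj for c, Hj in zip(coeffs, H))

def g(t):
    return -np.log(np.linalg.det(X(t)))

h = 1e-6
numeric = (g(h) - g(-h)) / (2 * h)                           # central difference
analytic = -np.trace(np.linalg.solve(X(0.0), Xprime(0.0)))   # -tr(X^{-1} X')
assert abs(numeric - analytic) < 1e-5
print("chain-rule check passed")
```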

Let f(t) = − log det X(t). Applying the chain rule to the composite function f(t) we get

f'(t) = − tr ( X(t)^{−1} X'(t) )   (4)

f''(t) = tr ( X(t)^{−1} X'(t) X(t)^{−1} X'(t) ) − tr ( X(t)^{−1} X''(t) )   (5)

f'''(t) = 3 tr ( X(t)^{−1} X''(t) X(t)^{−1} X'(t) ) − 2 tr ( ( X(t)^{−1} X'(t) )^3 ) − tr ( X(t)^{−1} X'''(t) ).   (6)

Theorem 1.2 If the f_j(x)'s, j = 1, ..., k, are convex quadratic functions, then F(x) as defined in (1) is a self-concordant barrier.

Proof. As discussed before, we need only verify the self-concordance condition along a given line. Let

M_1 := X(t)^{−1/2} X'(t) X(t)^{−1/2}   and   M_2 := − X(t)^{−1/2} X''(t) X(t)^{−1/2} ⪰ 0.   (7)

It follows from (4), (5) and (6), respectively, that

f'(t) = − tr M_1,   (8)

f''(t) = tr M_1^2 + tr M_2 = ||M_1||_F^2 + ||M_2^{1/2}||_F^2,   (9)

and

f'''(t) = − 3 tr M_1 M_2 − 2 tr M_1^3 − tr ( X(t)^{−1} X'''(t) ).   (10)

By the Cauchy-Schwarz inequality, we have

|f'(t)| = |tr M_1| ≤ √l ( tr M_1^2 )^{1/2} ≤ √l ( f''(t) )^{1/2},

which is (3) with C_2 = √l. Since the f_j's are all quadratic, we have X'''(t) = 0, and so (10) reduces to

f'''(t) = − 3 tr M_1 M_2 − 2 tr M_1^3.   (11)

Clearly, |tr M_1^3|^{1/3} ≤ ( tr M_1^2 )^{1/2} ≤ ( f''(t) )^{1/2}. Hence,

|tr M_1^3| ≤ ( f''(t) )^{3/2}.   (12)

Moreover,

|tr M_1 M_2| ≤ ||M_1||_F ||M_2||_F ≤ ||M_1||_F ||M_2^{1/2}||_F^2 ≤ (2/(3√3)) ( ||M_1||_F^2 + ||M_2^{1/2}||_F^2 )^{3/2} = (2/(3√3)) ( f''(t) )^{3/2},   (13)

where the last step is due to (9). Therefore, from (11), (12) and (13) we obtain

|f'''(t)| ≤ ( 2 + 2/√3 ) ( f''(t) )^{3/2},

which is (2) with C_1 = 2 + 2/√3. Q.E.D.

In case all the f_j's are affine linear, we have M_2 = 0 and |f'''(t)| ≤ 2 ( f''(t) )^{3/2}, which corresponds to the well-known fact that the logarithmic determinant barrier is self-concordant for Linear Matrix Inequalities.

We note that if all the f_j's are convex quadratic functions, then the feasible set of (P_{PSD}) can also be described by an enlarged system of linear matrix inequalities using Schur's complement lemma. In particular, suppose that

f_j(x) = Σ_{i=1}^n ( q_{ij}^T x − q_i )^2 + q_{0j},   j = 1, ..., k.

Then H_0 − Σ_{j=1}^k f_j(x) H_j ⪰ 0 if and only if the arrow-shaped linear matrix inequality

[ H_0 − Σ_{j=1}^k q_{0j} H_j    (q_{11}^T x − q_1) H_1^{1/2}    · · ·    (q_{nk}^T x − q_n) H_k^{1/2} ]
[ (q_{11}^T x − q_1) H_1^{1/2}    I    · · ·    0 ]
[        ⋮                        ⋮    ⋱        ⋮ ]
[ (q_{nk}^T x − q_n) H_k^{1/2}    0    · · ·    I ]   ⪰ 0

holds. However, the complexity parameter of the resulting LMI system becomes (kn + 1)l, instead of l.

2 The cone-consistent barriers

The aim of this section is to derive the essential and general property of the barrier function that enabled Theorem 1.2 in the previous section.

Let us call a barrier function F(x) consistent with the cone K (denoted as C_3-consistent in the sequel) if it satisfies the following condition:

( ∇^2 F(x)[u, u] )^{1/2} ≤ − C_3 ∇F(x)[u]   (14)

for all x ∈ int K and u ∈ K, where C_3 > 0 is a parameter. It is obvious that if K = R^n_+ and F(x) = − Σ_{i=1}^n log x_i, then

( ∇^2 F(x)[u, u] )^{1/2} = ( Σ_{i=1}^n u_i^2/x_i^2 )^{1/2} ≤ Σ_{i=1}^n u_i/x_i = − ∇F(x)[u]

for x > 0 and u ≥ 0. So (14) is satisfied with C_3 = 1. It is easy to verify that (14) is also satisfied by the logarithmic determinant barrier function for the positive semidefinite matrix cone. Similarly, it can be shown that the additional inequality (14) is satisfied by all self-scaled barrier functions for the self-scaled cones (see Nesterov and Todd [6]) with C_3 = 1; in a different but equivalent framework, the self-scaled barrier function is the log-determinant barrier for the symmetric cones (see Faybusovich [] or Sturm [8, 9]).

One may therefore think that (14) is a property of the self-scaled cones only. However, this guess is incorrect. Let us consider the following cone in R^3, the epigraph of the l_4-norm in R^2:

K_4 := { x = [x_1, x_2, x_3]^T : x_1 ≥ 0, x_1^4 − (x_2^4 + x_3^4) ≥ 0 }.

The above cone is not self-scaled. Also, the barrier function

B_4(x) = − log ( x_1^4 − (x_2^4 + x_3^4) )

is not self-concordant, but it satisfies (14) with C_3 = 1. To see this, consider x = [x_1, x_2, x_3]^T ∈ int K_4. We denote b_4(x) = x_1^4 − (x_2^4 + x_3^4) > 0 and compute that

∇B_4(x) = b_4(x)^{−1} [ −4x_1^3, 4x_2^3, 4x_3^3 ]^T

and

∇^2 B_4(x) = ∇B_4(x) ∇B_4(x)^T − (12/b_4(x)) diag ( x_1^2, −x_2^2, −x_3^2 ).

Take any y = [y_1, y_2, y_3]^T ∈ K_4. We have

y^T ∇^2 B_4(x) y = ( ⟨∇B_4(x), y⟩ )^2 − (12/b_4(x)) ( x_1^2 y_1^2 − x_2^2 y_2^2 − x_3^2 y_3^2 ).

Since x_1^2 ≥ ( x_2^4 + x_3^4 )^{1/2} and y_1^2 ≥ ( y_2^4 + y_3^4 )^{1/2}, we have

x_1^2 y_1^2 ≥ ( (x_2^4 + x_3^4)(y_2^4 + y_3^4) )^{1/2} = ( x_2^4 y_2^4 + x_2^4 y_3^4 + x_3^4 y_2^4 + x_3^4 y_3^4 )^{1/2} ≥ ( x_2^4 y_2^4 + 2 x_2^2 y_2^2 x_3^2 y_3^2 + x_3^4 y_3^4 )^{1/2} = x_2^2 y_2^2 + x_3^2 y_3^2.

Therefore, y^T ∇^2 B_4(x) y ≤ ( ⟨∇B_4(x), y⟩ )^2. Inequality (14) thus holds for the barrier function B_4(x) with C_3 = 1.

One may also suspect that the cone-consistency condition (14) is implied by the self-concordance property. However, again this is not the case. In Zhang [] it is shown that if b(x) is a convex quadratic function on R^n, then the barrier function

B_h(x̄) = − log ( q − p b(x/p) )

is a logarithmically homogeneous self-concordant barrier for the convex cone

K_h = cl { x̄ = (p, q, x) ∈ R_{++} × R × R^n : q − p b(x/p) ≥ 0 }.

Consider y = [1, 1 + b(0), 0^T]^T ∈ int K_h and x̄ = (p, q, x) ∈ int K_h. One computes that

∇B_h(x̄) = ( 1/(q − p b(x/p)) ) [ b(x/p) − ∇b(x/p)^T (x/p), −1, ∇b(x/p)^T ]^T

and

∇^2 B_h(x̄) = ∇B_h(x̄) ∇B_h(x̄)^T + ( 1/( p (q − p b(x/p)) ) ) ·
[ (x/p)^T ∇^2 b(x/p) (x/p)    0    − (x/p)^T ∇^2 b(x/p) ]
[ 0                           0    0                     ]
[ − ∇^2 b(x/p) (x/p)          0    ∇^2 b(x/p)            ].

Therefore,

⟨ −∇B_h(x̄), y ⟩ = ( 1 + b(0) − b(x/p) + ∇b(x/p)^T (x/p) ) / ( q − p b(x/p) ) > 0,   (15)

where positivity follows from the convexity of b, since b(0) ≥ b(x/p) − ∇b(x/p)^T (x/p),

and

∇^2 B_h(x̄)[y, y] = ( ⟨∇B_h(x̄), y⟩ )^2 + (x/p)^T ∇^2 b(x/p) (x/p) / ( p (q − p b(x/p)) ).   (16)

Thus, from (15) and (16) we have

∇^2 B_h(x̄)[y, y] / ( ⟨∇B_h(x̄), y⟩ )^2 = 1 + ( (q − p b(x/p)) (x/p)^T ∇^2 b(x/p) (x/p) ) / ( p ( 1 + b(0) − b(x/p) + ∇b(x/p)^T (x/p) )^2 ).

The above quantity cannot be uniformly bounded from above for all x̄ ∈ int K_h if b(x) is strictly convex, because q can take arbitrarily large values while p and x are fixed. This implies that for the barrier function B_h(x̄), the cone-consistency condition (14) is not satisfied, although the function is self-concordant and logarithmically homogeneous. These two examples show that the new condition (14) is a related but different property as compared with the self-concordance property, the self-scaled property, or the logarithmic homogeneity property of the barrier functions.

Consider (P_K). Let F(·) be a consistent self-concordant barrier function for the cone K, satisfying (2), (3) and (14) altogether. Let

p(x) := h_0 − Σ_{j=1}^k f_j(x) h_j ∈ int K

and

f(x) := F(p(x)).   (17)

Let ξ ∈ R^n be any given direction. For a chosen coordinate system we have

∇f(x)[ξ] = − ⟨ ∇F(p(x)), Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ⟩.   (18)

Consequently, by the chain rule,

∇^2 f(x)[ξ, ξ] = ∇^2 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ] − ⟨ ∇F(p(x)), Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j ⟩   (19)

and

∇^3 f(x)[ξ, ξ, ξ] = − ∇^3 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ] + 3 ∇^2 F(p(x))[ Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ]

− ⟨ Σ_{j=1}^k ∇^3 f_j(x)[ξ, ξ, ξ] h_j, ∇F(p(x)) ⟩.   (20)

Since the f_j's are convex functions, ξ^T ∇^2 f_j(x) ξ ≥ 0 for all j = 1, ..., k, and so Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j ∈ K. Moreover, because − ∇F(p(x)) ∈ int K^* we obtain

∇^2 f(x)[ξ, ξ] ≥ ∇^2 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ] ≥ (1/C_2^2) ( ⟨ ∇F(p(x)), Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ⟩ )^2 = (1/C_2^2) ( ∇f(x)[ξ] )^2,   (21)

which is the second inequality required by self-concordance, (3). Therefore we can use condition (14) to obtain from (19) that

∇^2 f(x)[ξ, ξ] ≥ ∇^2 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ] + (1/C_3) ( ∇^2 F(p(x))[ Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j, Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j ] )^{1/2}.   (22)

Denote

d̄_1 := ∇^2 F(p(x))^{1/2} Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j

and

d̄_2 := ∇^2 F(p(x))^{1/2} Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j.

We rewrite (22) as

∇^2 f(x)[ξ, ξ] ≥ ||d̄_1||^2 + (1/C_3) ||d̄_2||;   (23)

maximizing the product ||d̄_1|| ||d̄_2|| subject to this bound yields

||d̄_1|| ||d̄_2|| ≤ ( 2C_3/(3√3) ) ( ∇^2 f(x)[ξ, ξ] )^{3/2}.

Theorem 2.1 Suppose that F(·) is a self-concordant barrier function for K, and moreover it is consistent with K. Suppose that the f_j(x)'s are convex quadratic functions, j = 1, ..., k. Then f(x) as defined in (17) is a self-concordant barrier function for (P_K).

Proof. If the f_j's are all quadratic functions, then it follows from (20) that

|∇^3 f(x)[ξ, ξ, ξ]| ≤ C_1 ( ∇^2 f(x)[ξ, ξ] )^{3/2} + 3 |d̄_2^T d̄_1|
≤ C_1 ( ∇^2 f(x)[ξ, ξ] )^{3/2} + 3 ||d̄_1|| ||d̄_2||
≤ C_1 ( ∇^2 f(x)[ξ, ξ] )^{3/2} + ( 2C_3/√3 ) ( ∇^2 f(x)[ξ, ξ] )^{3/2}
= ( C_1 + 2C_3/√3 ) ( ∇^2 f(x)[ξ, ξ] )^{3/2},   (24)

where the third inequality is due to (23). Inequalities (24) and (21) show the self-concordance of the function f(x). Q.E.D.

3 A potential reduction approach

Theorem 2.1 suggests that if the conic barrier F is self-concordant and consistent with the cone, and the f_j's are all convex quadratic, then the composite barrier function f is self-concordant. Therefore we can apply the general theory of self-concordant barriers developed by Nesterov and Nemirovskii [5]. However, there are interesting cases that are not included in this model. Consider for instance the following geometric semidefinite program:

(P̃_{PSD})   minimize    f_0(t)
             subject to  l_i ≤ f_i(t) ≤ u_i,   i = 1, ..., k + m
                         t_j > 0,   j = 1, ..., n
                         H_0 − Σ_{j=1}^k f_j(t) H_j ⪰ 0,

where H_0, ..., H_k ⪰ 0 are l × l positive semidefinite matrices, and f_i(t) = t_1^{ι_{i1}} · · · t_n^{ι_{in}} is a monomial, i = 1, ..., k + m. After the variable transformation x_j = log t_j, j = 1, ..., n, the above problem can be turned into

(P̃_{PSD})   minimize    ι_0^T x
             subject to  log l_i ≤ ι_i^T x ≤ log u_i,   i = 1, ..., k + m
                         H_0 − Σ_{j=1}^k e^{ι_j^T x} H_j ⪰ 0.

Clearly, (P̃_{PSD}) is a conically ordered convex program. Unfortunately, in this case the function e^{ι_j^T x} is not quadratic, and so Theorem 2.1 does not apply: the barrier function f(x) is not self-concordant in general. However, the functions e^{ι_j^T x} are still well structured. In the above particular case,

∇^2 e^{ι_j^T x} = e^{ι_j^T x} ι_j ι_j^T.

Therefore, it holds that e^{ι_j^T x} ≤ λ e^{ι_j^T y}, and hence ∇^2 e^{ι_j^T x} ⪯ λ ∇^2 e^{ι_j^T y}, for all x and y in the feasible region, where λ = max{ u_1/l_1, ..., u_k/l_k }.
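The rank-one Hessian identity above can be checked directly. In this sketch (with made-up data; here λ is computed as the exact ratio of the two exponentials at one pair of points, rather than the feasible-region bound max u_i/l_i):

```python
import numpy as np

# hess exp(iota^T x) = exp(iota^T x) * iota iota^T : two such Hessians differ
# only by the positive scalar exp(iota^T (x - y)), so one dominates the other.
rng = np.random.default_rng(4)
iota = rng.normal(size=3)
x, y = rng.normal(size=3), rng.normal(size=3)

def hess(z):
    return np.exp(iota @ z) * np.outer(iota, iota)

lam = np.exp(iota @ (x - y))        # scalar ratio of the two rank-1 Hessians
M = lam * hess(y) - hess(x)         # equals 0 up to rounding, hence PSD
assert np.all(np.linalg.eigvalsh(M) >= -1e-9)
print("hess f(x) <= lam * hess f(y) holds with lam =", float(lam))
```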

This example motivates us to study algorithms that can solve COCP problems without an overall easily computable self-concordant barrier. As a matter of fact, most known interior point methods for nonlinear convex programs with provable polynomial-time computational complexity are constructed within the central-path-following framework using a self-concordant barrier function, except for a perhaps less known member of the IPM family called the Iri-Imai method [3], which is a variant of the potential reduction method. Zhang [], and Sturm and Zhang [10], studied the Iri-Imai method (or the potential reduction method) applied to the so-called harmonically convex programs, and showed that such algorithms also run in polynomial time. The main proof techniques in [], and also the proofs in this section, are inspired by a paper of Iri [].

In this section, we shall study an Iri-Imai type potential reduction method for the conically ordered convex program (P_K). First we recall the notion of harmonically convex functions; see [].

Definition 3.1 A twice differentiable function f(·) is called λ-harmonically convex on the convex domain Ω if there is a constant λ such that

0 ⪯ ∇^2 f(x) ⪯ λ ∇^2 f(y)

holds for all x, y ∈ Ω.

It is easy to see that if λ > 0, then (1/λ) ∇^2 f(y) ⪯ ∇^2 f(x) ⪯ λ ∇^2 f(y) for all x, y ∈ Ω. Obviously, the affine linear functions (0-harmonically convex), the convex quadratic functions (1-harmonically convex), and the strongly convex functions are all examples of harmonically convex functions. It is shown in [] that the (direct) sum of such functions characterizes the whole class of harmonically convex functions. If a convex program is to minimize a harmonically convex objective function subject to harmonically convex inequality constraints, then [] and [10] show that such a problem can be solved by Iri-Imai type potential reduction algorithms in polynomial time. Note that even if f(x) is harmonically convex and negative on a domain Ω, it does not follow that − log(− f(x)) is self-concordant.
Consider for example f(x_1, x_2) = − x_1 + x_2^2 + sin(x_2), which is the sum of two harmonically convex functions, − x_1 and x_2^2 + sin(x_2), and hence is harmonically convex. However, − log ( x_1 − x_2^2 − sin(x_2) ) is not self-concordant.

Consider (P_K). Let F(·) be a barrier function for K. Moreover, let us impose the following conditions.

Condition 3.1 F(·) is ν-logarithmically homogeneous for K, namely

F(ty) = F(y) − ν log t

for all y ∈ int K and t > 0.

It is well known (see e.g. [5]) that if F is ν-logarithmically homogeneous for K, then the following

identities hold, where y ∈ int K and t > 0:

∇F(ty) = (1/t) ∇F(y);   (25)

∇^2 F(ty) = (1/t^2) ∇^2 F(y);   (26)

∇^2 F(y) y = − ∇F(y);   (27)

∇F(y)^T y = − ν.   (28)

Condition 3.2 F(·) is a self-concordant barrier function for K. In particular, the following two inequalities hold:

|∇^3 F(y)[u, u, u]| ≤ 2 ( ∇^2 F(y)[u, u] )^{3/2}   and   |∇F(y)[u]| ≤ √ν ( ∇^2 F(y)[u, u] )^{1/2},

for all y ∈ int K and u ∈ span K.

Notice that the second inequality in Condition 3.2 follows from (27), (28) and the Cauchy-Schwarz inequality; it is in fact implied by Condition 3.1.

As in (17), denote f(x) := F(p(x)), with p(x) = h_0 − Σ_{j=1}^k f_j(x) h_j. Clearly, f(·) is a barrier function for the feasible set of (P_K).

Lemma 3.2 Suppose that F(·) is a ν-logarithmically homogeneous barrier function for K (Condition 3.1). Denote by F the feasible set of (P_K). Then

|∇f(x)[ξ]| ≤ √ν ( ∇^2 f(x)[ξ, ξ] )^{1/2}

for all x ∈ rel int F and ξ ∈ R^n.

Proof. As x ∈ rel int F, we have p(x) = h_0 − Σ_{j=1}^k f_j(x) h_j ∈ int K. Relations (18) and (19) give that

∇f(x)[ξ] = − ⟨ ∇F(p(x)), Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ⟩

and

∇^2 f(x)[ξ, ξ] = ∇^2 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ] − ⟨ ∇F(p(x)), Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j ⟩.

Since − ∇F(p(x)) ∈ K^* and Σ_{j=1}^k ( ξ^T ∇^2 f_j(x) ξ ) h_j ∈ K, due to the convexity of the f_j's, we obtain from the above equation that

∇^2 f(x)[ξ, ξ] ≥ ∇^2 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ] ≥ (1/ν) ( ⟨ ∇F(p(x)), Σ_{j=1}^k ( ∇f_j(x)^T ξ ) h_j ⟩ )^2 = (1/ν) ( ∇f(x)[ξ] )^2,

where the second step is due to Condition 3.2. Q.E.D.

Suppose that (P_K) has a finite optimal value, denoted by v^*. Let v ≤ v^*. For a given parameter γ > 0, let us introduce

G(x; v) := ( c^T x − v )^γ exp( f(x) ).   (29)

Note that G(x; v) is an extension of the so-called multiplicative barrier function of Iri and Imai; see [3]. The following theorem extends Iri and Imai's result.

Theorem 3.3 Suppose that F(·) is a ν-logarithmically homogeneous barrier function for K (Condition 3.1). If γ ≥ ν + 1 then G(x; v) is a convex function on int F; if γ > ν + 1 then G(x; v) is a strictly convex function on int F.

Proof. Differentiation yields

∇G(x; v)/G(x; v) = γ c/(c^T x − v) + ∇f(x)   (30)

∇^2 G(x; v)/G(x; v) = ( ∇G(x; v)/G(x; v) ) ( ∇G(x; v)/G(x; v) )^T − γ cc^T/(c^T x − v)^2 + ∇^2 f(x).   (31)

Take any 0 ≠ η ∈ R^n. Denote a := c^T η/(c^T x − v), b := η^T ∇f(x) and e := η^T ∇^2 f(x) η > 0. By Lemma 3.2 we have e ≥ b^2/ν. Using (30) and (31) it follows that

η^T ∇^2 G(x; v) η / G(x; v) = ( γa + b )^2 − γ a^2 + e

= γ(γ − 1) a^2 + 2γ ab + b^2 + e
≥ γ(γ − 1) a^2 + 2γ ab + (1 + 1/ν) b^2.

The last expression is a quadratic form with respect to a and b, which is nonnegative since its discriminant satisfies

γ^2 − γ(γ − 1)(1 + 1/ν) = γ (ν + 1 − γ)/ν ≤ 0.

The quantity η^T ∇^2 G(x; v) η is strictly positive if γ > ν + 1, because in this case the discriminant of the quadratic form is negative while e > 0. The theorem is thus proven. Q.E.D.

For simplicity of the analysis, from now on we shall set γ to be ν + √ν.

Condition 3.3 F(·) is κ-consistent (κ > 0) with the cone K, i.e.,

( ∇^2 F(y)[u, u] )^{1/2} ≤ − κ ∇F(y)[u]

for all y ∈ int K and u ∈ K.

Condition 3.4 All the f_j(·)'s, j = 1, ..., k, are harmonically convex on F. In particular, there is λ > 0 such that ∇^2 f_j(x) ⪯ λ ∇^2 f_j(y) for all x, y ∈ F and j = 1, ..., k.

Condition 3.5 F(·) satisfies ∇^2 F(y)[u, v] ≥ 0 for all y ∈ int K and u, v ∈ K. In other words, ∇^2 F(y) K ⊆ K^* for all y ∈ int K.

A few comments are in order here concerning the last condition. If K is a self-scaled cone and F(·) is the corresponding self-scaled barrier function, then as a part of the definition it holds that ∇^2 F(y) K = K^* for all y ∈ int K. In that respect, Condition 3.5 is naturally satisfied by all self-scaled cones with the corresponding self-scaled barrier functions. Thus, one may conjecture that Condition 3.5 is satisfied if and only if K is a self-scaled cone. However, this is not the case. The key is that we only require ∇^2 F(y) K ⊆ K^* here, rather than ∇^2 F(y) K = K^* for all y ∈ int K. In fact, K can be any pointed closed convex cone. To see this, let C be a second order cone that properly (strictly) contains K. Then C^* is properly (strictly) contained in K^*. Let F_c be the corresponding self-scaled barrier function for C. Thus, for any y ∈ int K ⊆ int C we have

∇^2 F_c(y) K ⊆ ∇^2 F_c(y) C = C^* ⊆ K^*.

For a given barrier function F(·) for K, consider F_µ(·) := F(·) + µ F_c(·), where µ > 0 is a given constant. Clearly, F_µ(·) is also a barrier function for K. If µ is chosen sufficiently large, then ∇^2 F_µ(y) K ⊆ K^* for any y ∈ int K.
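The familiar barrier F(y) = − Σ log y_i on K = R^n_+ (a self-scaled cone) satisfies all of the conditions above. The sketch below (illustrative only; the sample ranges and tolerances are assumptions) numerically verifies the homogeneity identities (25)-(28) with ν = n, the κ-consistency of Condition 3.3 with κ = 1, and Condition 3.5.

```python
import numpy as np

# Barrier F(y) = -sum log y_i on R^n_+ : grad F(y) = -1/y, hess F(y) = diag(1/y^2), nu = n.
rng = np.random.default_rng(7)
n = 6
grad = lambda y: -1.0 / y
hess = lambda y: np.diag(1.0 / y**2)

y = rng.uniform(0.5, 2.0, n)
t = 3.7
assert np.allclose(grad(t * y), grad(y) / t)          # (25)
assert np.allclose(hess(t * y), hess(y) / t**2)       # (26)
assert np.allclose(hess(y) @ y, -grad(y))             # (27)
assert np.isclose(grad(y) @ y, -n)                    # (28), nu = n

for _ in range(200):
    y = rng.uniform(0.1, 5.0, n)                      # y in int K
    u = rng.uniform(0.0, 5.0, n)                      # u in K
    v = rng.uniform(0.0, 5.0, n)                      # v in K
    H = hess(y)
    # Condition 3.3 with kappa = 1: (u^T H u)^{1/2} <= -grad F(y)[u] = sum u_i/y_i
    assert np.sqrt(u @ H @ u) <= np.sum(u / y) + 1e-9
    # Condition 3.5: hess F(y)[u, v] >= 0, i.e. hess F(y) K lies in K* = R^n_+
    assert v @ H @ u >= -1e-12 and np.all(H @ u >= 0)
print("Conditions 3.1, 3.3 (kappa = 1) and 3.5 verified for the log-barrier")
```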

Lemma 3.4 Suppose that F(·) satisfies Condition 3.5, and v_i, i = 1, ..., m, are m arbitrary vectors in K. Let y ∈ int K, and for ∇^2 F(y) ⪰ 0 let ||x||_{∇^2 F(y)} := ( x^T ∇^2 F(y) x )^{1/2}. Then

max { || Σ_{i=1}^m t_i v_i ||_{∇^2 F(y)} : 0 ≤ t_i ≤ t̂_i, i = 1, ..., m } = || Σ_{i=1}^m t̂_i v_i ||_{∇^2 F(y)}.

Proof. Under Condition 3.5, we have v_i^T ∇^2 F(y) v_j ≥ 0 for any i and j. The lemma is obvious since

|| Σ_{i=1}^m t_i v_i ||^2_{∇^2 F(y)} = Σ_{i,j} t_i t_j v_i^T ∇^2 F(y) v_j.

Q.E.D.

Theorem 3.5 Suppose that F(·) satisfies Conditions 3.1, 3.3 and 3.5, and the f_j(·)'s, j = 1, ..., k, satisfy Condition 3.4. Let x ∈ rel int F be any feasible but non-optimal point. Let γ = ν + √ν. Let the Newton direction at x be

d_N(x) := − ( ∇^2 G(x; v^*) )^{−1} ∇G(x; v^*).

Then

− ∇G(x; v^*)^T d_N(x)/G(x; v^*) = ∇G(x; v^*)^T ( ∇^2 G(x; v^*) )^{−1} ∇G(x; v^*)/G(x; v^*) ≥ δ̄,

where δ̄ := 1/( 2 + 2κ^2 + 2λκ ) > 0.

Proof. Let

δ(x) := ∇G(x; v^*)^T ( ∇^2 G(x; v^*) )^{−1} ∇G(x; v^*)/G(x; v^*).

Since ∇^2 G(x; v^*) ⪰ 0, by the Cauchy-Schwarz inequality we know that for any η ≠ 0 it holds that

δ(x) ≥ ( ∇G(x; v^*)^T η/G(x; v^*) )^2 / ( η^T ∇^2 G(x; v^*) η/G(x; v^*) ).   (32)

Let us take η = x − x^* ≠ 0. Then, noting c^T(x − x^*)/(c^T x − v^*) = 1 and using (30) and (31), we have

∇G(x; v^*)^T (x − x^*)/G(x; v^*) = γ + ∇f(x)^T (x − x^*).   (33)

By the mean-value theorem, there is an x̄_j between x and x^*, j = 1, ..., k, such that

p(x) − p(x^*) = − Σ_{j=1}^k ( ∇f_j(x)^T (x − x^*) ) h_j + (1/2) Σ_{j=1}^k ( (x − x^*)^T ∇^2 f_j(x̄_j) (x − x^*) ) h_j.

By letting

ũ := (1/2) Σ_{j=1}^k ( (x − x^*)^T ∇^2 f_j(x̄_j) (x − x^*) ) h_j ∈ K,   (34)

we have

∇f(x)^T (x − x^*) = − ∇F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T (x − x^*) ) h_j ]
= ∇F(p(x))[ p(x) − p(x^*) − ũ ]
= − ν − ∇F(p(x))[ p(x^*) + ũ ],   (35)

where in the last step we used property (28). Using the estimate (35) in (33), and the fact that p(x), p(x^*), ũ ∈ K, we have

∇G(x; v^*)^T (x − x^*)/G(x; v^*) = γ − ν − ∇F(p(x))[ p(x^*) + ũ ] ≥ √ν + (1/κ) ( ∇^2 F(p(x))[ p(x^*) + ũ, p(x^*) + ũ ] )^{1/2}.   (36)

Similarly, since − ∇F(p(x))[ p(x^*) ] ≥ 0, we have

∇G(x; v^*)^T (x − x^*)/G(x; v^*) ≥ √ν − ∇F(p(x))[ũ] ≥ √ν + (1/κ) ( ∇^2 F(p(x))[ũ, ũ] )^{1/2}.   (37)

On the other hand,

(x − x^*)^T ∇^2 G(x; v^*) (x − x^*)/G(x; v^*) − ( ∇G(x; v^*)^T (x − x^*)/G(x; v^*) )^2
= − γ + (x − x^*)^T ∇^2 f(x) (x − x^*)
= − γ + ∇^2 F(p(x))[ Σ_{j=1}^k ( ∇f_j(x)^T (x − x^*) ) h_j, Σ_{j=1}^k ( ∇f_j(x)^T (x − x^*) ) h_j ] − ∇F(p(x))[ Σ_{j=1}^k ( (x − x^*)^T ∇^2 f_j(x) (x − x^*) ) h_j ]
≤ − γ + 2 ∇^2 F(p(x))[ p(x), p(x) ] + 2 ∇^2 F(p(x))[ p(x^*) + ũ, p(x^*) + ũ ] + 2λ √ν ( ∇^2 F(p(x))[ũ, ũ] )^{1/2}
≤ − γ + 2ν + 2κ^2 ( ∇G(x; v^*)^T (x − x^*)/G(x; v^*) )^2 + 2λκ √ν ( ∇G(x; v^*)^T (x − x^*)/G(x; v^*) )
≤ ( 1 + 2κ^2 + 2λκ ) ( ∇G(x; v^*)^T (x − x^*)/G(x; v^*) )^2,   (38)

where the first step is due to (31) (pre- and post-multiplying both sides by (x − x^*)^T and (x − x^*), respectively); at the fourth step we used

(x − x^*)^T ∇^2 f_j(x) (x − x^*) ≤ λ (x − x^*)^T ∇^2 f_j(x̄_j) (x − x^*),   j = 1, ..., k,

due to Condition 3.4, and then applied Lemma 3.4; and finally at the fifth and sixth steps we used (36) and (37). The theorem follows by combining (32) and (38) to obtain δ(x) ≥ δ̄. Q.E.D.

Now let us study the effect of applying the Newton method (with line-minimization) to reduce the value of G(x; v), where v ≤ v^*. Denote d := − ( ∇^2 G(x; v) )^{−1} ∇G(x; v). Multiplying both sides of (30) by d^T, pre-multiplying by d^T and post-multiplying by d both sides of (31), and denoting a := c^T d/(c^T x − v) and, if there is no confusion, still denoting δ(x) := ∇G(x; v)^T ( ∇^2 G(x; v) )^{−1} ∇G(x; v)/G(x; v), we obtain

− δ(x) = γ a + d^T ∇f(x)   (39)

δ(x) − δ(x)^2 = − γ a^2 + d^T ∇^2 f(x) d.   (40)

Eliminating a from (39) and (40) we get

γ ( δ(x) − δ(x)^2 ) + ( δ(x) + d^T ∇f(x) )^2 = γ d^T ∇^2 f(x) d.   (41)

Relations (18) and (19) yield

∇f(x)^T d = − ∇F(p(x))[ Σ_{j=1}^k ( d^T ∇f_j(x) ) h_j ]   (42)

d^T ∇^2 f(x) d = ∇^2 F(p(x))[ Σ_{j=1}^k ( d^T ∇f_j(x) ) h_j, Σ_{j=1}^k ( d^T ∇f_j(x) ) h_j ] − ∇F(p(x))[ Σ_{j=1}^k ( d^T ∇^2 f_j(x) d ) h_j ].   (43)

Since F(·) is ν-logarithmically homogeneous and the f_j's are convex, we obtain from (42) and (43) that

γ ( δ(x) − δ(x)^2 ) + ( δ(x) + d^T ∇f(x) )^2 ≥ γ ∇^2 F(p(x))[ Σ_{j=1}^k ( d^T ∇f_j(x) ) h_j, Σ_{j=1}^k ( d^T ∇f_j(x) ) h_j ]
≥ (γ/ν) ( ∇F(p(x))[ Σ_{j=1}^k ( d^T ∇f_j(x) ) h_j ] )^2
= (γ/ν) ( ∇f(x)^T d )^2.   (44)

Noting that γ = ν + √ν and that d^T ∇f(x) is a real number, checking the discriminant of the above quadratic inequality in terms of d^T ∇f(x) we conclude that

δ(x) ≤ ( ν + √ν )/( ν − 1 ).   (45)

The quadratic inequality (44) can be rearranged as

α_1(x) ≤ d^T ∇f(x) ≤ α_2(x),   (46)

with

α_1(x) := √ν δ(x) − ( ν δ(x)^2 + √ν ( γ − (γ − 1) δ(x) ) δ(x) )^{1/2}

α_2(x) := √ν δ(x) + ( ν δ(x)^2 + √ν ( γ − (γ − 1) δ(x) ) δ(x) )^{1/2}.

Simplifying, and using the bound (45) on δ(x), we obtain a constant α, depending only on ν, such that − α ≤ α_1(x) ≤ α_2(x) ≤ α.

Now we use the κ-consistency, Condition 3.3, to obtain from (42) and (43) that

(1/κ) ( ∇^2 F(p(x))[ Σ_{j=1}^k ( d^T ∇^2 f_j(x) d ) h_j, Σ_{j=1}^k ( d^T ∇^2 f_j(x) d ) h_j ] )^{1/2} ≤ − ∇F(p(x))[ Σ_{j=1}^k ( d^T ∇^2 f_j(x) d ) h_j ]
≤ d^T ∇^2 f(x) d
= δ(x) − δ(x)^2 + ( δ(x) + d^T ∇f(x) )^2/γ
≤ ( ν + √ν )/( ν − 1 ) + ( ( ν + √ν )/( ν − 1 ) + α )^2/γ =: α_3,   (47)

where the second step is due to (43), the third step is due to (41), and the last step is due to (45) and (46). This yields

∇^2 F(p(x))[ Σ_{j=1}^k ( d^T ∇^2 f_j(x) d ) h_j, Σ_{j=1}^k ( d^T ∇^2 f_j(x) d ) h_j ] ≤ κ^2 α_3^2.   (48)

Along the same line, from (43) and (47) we get

F″(p(x))[∑_{j=1}^k (d^T ∇f_j(x)) h_j, ∑_{j=1}^k (d^T ∇f_j(x)) h_j] ≤ d^T ∇²f(x) d ≤ α₃.   (49)

Consider an iterative point x + td along the Newton direction d with step size t > 0. Using the mean-value theorem, there are x̃_j ∈ [x, x + td], j = 1, ..., k, such that

p(x + td) − p(x) = −∑_{j=1}^k (f_j(x + td) − f_j(x)) h_j = −t ∑_{j=1}^k (d^T ∇f_j(x)) h_j − (t²/2) ∑_{j=1}^k (d^T ∇²f_j(x̃_j) d) h_j.   (50)

Let ũ := (1/2) ∑_{j=1}^k (d^T ∇²f_j(x̃_j) d) h_j. Remember that the norm induced by the Hessian of F at y is denoted by ‖u‖_{F″(y)} = √(u^T F″(y) u). The lemma below is well known and it is a crucial property of the self-concordant barrier functions; see [4] and [7].

Lemma 3.6 Suppose that F(·) is self-concordant, i.e., it satisfies Condition 3.1. Suppose that p(x) ∈ int K. If ‖y − p(x)‖_{F″(p(x))} < 1 then y ∈ int K. Moreover,

‖u‖_{F″(y)} ≤ ‖u‖_{F″(p(x))}/(1 − ‖y − p(x)‖_{F″(p(x))})

for all u. In this notation, (48) can be written as

‖∑_{j=1}^k (d^T ∇²f_j(x) d) h_j‖_{F″(p(x))} ≤ κα₃,   (51)

and (49) as

‖∑_{j=1}^k (d^T ∇f_j(x)) h_j‖_{F″(p(x))} ≤ √α₃.   (52)

Lemma 3.7 Suppose that F(·) satisfies Conditions 3.1, 3.2, 3.3, and 3.5, and the f_j(·)'s, j = 1, ..., k, satisfy Condition 3.4. Let x ∈ rel int F be any feasible but non-optimal point. Let γ = ν + √ν. Let the Newton direction at x be d = −(∇²G(x; v))⁻¹ ∇G(x; v) with v ≤ v*. Let α₄ := 2/(√α₃ (1 + √(1 + 2λκ))).

Then, for any step-size 0 < t < α₄, the iterative point satisfies p(x + td) ∈ int K, i.e., x + td ∈ rel int F.

Proof. It follows from (50) that

p(x + td) − p(x) = −t ∑_{j=1}^k (d^T ∇f_j(x)) h_j − t² ũ.

By Lemma 3.4 and Condition 3.4, also noting (51), we have

‖ũ‖_{F″(p(x))} ≤ (λ/2) ‖∑_{j=1}^k (d^T ∇²f_j(x) d) h_j‖_{F″(p(x))} ≤ λκα₃/2.

Using the above estimation and (52) we know that if 0 < t < α₄ then

‖p(x + td) − p(x)‖_{F″(p(x))} ≤ t ‖∑_{j=1}^k (d^T ∇f_j(x)) h_j‖_{F″(p(x))} + t² ‖ũ‖_{F″(p(x))} ≤ t √α₃ + t² λκα₃/2 < 1,

because α₄ is the largest root of the quadratic equation t √α₃ + t² λκα₃/2 = 1. By Lemma 3.6, the result follows. Q.E.D.

Remark that, in a similar way, one can show that if 0 < t < α₅ with α₅ := 1/(√α₃ (1 + √(1 + λκ))), then

‖p(x + td) − p(x)‖_{F″(p(x))} ≤ t √α₃ + t² λκα₃/2 < 1/2.   (53)

Theorem 3.8 Suppose that F(·) satisfies Conditions 3.1, 3.2, 3.3, and 3.5, and the f_j(·)'s, j = 1, ..., k, satisfy Condition 3.4. Let x ∈ rel int F be any feasible but non-optimal point. Let γ = ν + √ν. Let the Newton direction at x be d = −(∇²G(x; v))⁻¹ ∇G(x; v). Suppose that −∇G(x; v)^T d/G(x; v) ≥ δ̄. Then there exists β > 0 such that

min_{0 ≤ t ≤ α₅} (log G(x + td; v) − log G(x; v)) ≤ −β.

Proof. Denote x(t) := x + td, and let a := c^T d/(c^T x − v). Lemma 3.7 and the remark thereafter assure that if 0 < t < α₅ (< α₄) then x(t) ∈ F, and ‖p(x(t)) − p(x)‖_{F″(p(x))} < 1/2. Now, consider a fixed 0 < t < α₅. By the mean-value theorem, there is 0 ≤ t̃ ≤ t < α₅ such that the following estimation holds:

log G(x + td; v) − log G(x; v)
= γ log(1 + at) + (f(x(t)) − f(x))
= (γa + ∇f(x)^T d)t − γa²t²/(2(1 + at̃)²) + (t²/2) d^T ∇²f(x(t̃)) d
≤ −δ̄t + (t²/2) F″(p(x(t̃)))[∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j, ∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j] + (t²/2) √ν ‖∑_{j=1}^k (d^T ∇²f_j(x(t̃)) d) h_j‖_{F″(p(x(t̃)))}
≤ −δ̄t + (t²/2) [ (‖∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j‖_{F″(p(x))}/(1 − ‖p(x(t̃)) − p(x)‖_{F″(p(x))}))² + √ν ‖∑_{j=1}^k (d^T ∇²f_j(x(t̃)) d) h_j‖_{F″(p(x))}/(1 − ‖p(x(t̃)) − p(x)‖_{F″(p(x))}) ].   (54)

Notice that

‖∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j‖_{F″(p(x))}
≤ ‖∑_{j=1}^k (d^T ∇f_j(x)) h_j‖_{F″(p(x))} + t̃λ ‖∑_{j=1}^k (d^T ∇²f_j(x) d) h_j‖_{F″(p(x))}
≤ √α₃ + λκ√α₃/2,

where the last step follows from (51), (52) and t̃ < α₅ ≤ 1/(2√α₃), and

‖∑_{j=1}^k (d^T ∇²f_j(x(t̃)) d) h_j‖_{F″(p(x))} ≤ λ ‖∑_{j=1}^k (d^T ∇²f_j(x) d) h_j‖_{F″(p(x))} ≤ λκα₃.

The coefficient of the second-order term in (54) can be further estimated as

(‖∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j‖_{F″(p(x))}/(1 − ‖p(x(t̃)) − p(x)‖_{F″(p(x))}))² + √ν ‖∑_{j=1}^k (d^T ∇²f_j(x(t̃)) d) h_j‖_{F″(p(x))}/(1 − ‖p(x(t̃)) − p(x)‖_{F″(p(x))})

≤ 8α₃ + 2λ²κ²α₃ + 2λκα₃√ν =: α₆.

Using these estimations we rewrite (54) as

log G(x + td; v) − log G(x; v) ≤ −δ̄t + (α₆/2)t².

Therefore, if we take the step-length as t* := min{δ̄/α₆, α₅}, then log G(x + t*d; v) − log G(x; v) ≤ −β, where β := δ̄t* − (α₆/2)t*². Q.E.D.

If K is a symmetric cone, say K = S^l₊ and F(p) = −log det p, then we can improve the bound in (54). In that case, we have

F′(p)[h] = −tr(p⁻¹h),  F″(p)[h, h] = tr(p⁻¹hp⁻¹h).

Therefore, if ‖p̃ − p‖_{F″(p)} = √(F″(p)[p̃ − p, p̃ − p]) = ‖p^{−1/2}(p̃ − p)p^{−1/2}‖_F = ‖p^{−1/2}p̃p^{−1/2} − I‖_F < 1/2, then (1/2)p ≺ p̃ ≺ (3/2)p, and since h ∈ K we have

0 < −F′(p̃)[h] = tr(p̃⁻¹h) ≤ 2 tr(p⁻¹h) = −2F′(p)[h].

Instead of the estimations in (54), we now have

log G(x + td; v) − log G(x; v)
= γ log(1 + at) + (f(x(t)) − f(x))
= (γa + ∇f(x)^T d)t − γa²t²/(2(1 + at̃)²) + (t²/2) d^T ∇²f(x(t̃)) d
≤ −δ̄t + (t²/2) F″(p(x(t̃)))[∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j, ∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j] − (t²/2) F′(p(x(t̃)))[∑_{j=1}^k (d^T ∇²f_j(x(t̃)) d) h_j]
≤ −δ̄t + (t²/2) [ (‖∑_{j=1}^k (d^T ∇f_j(x(t̃))) h_j‖_{F″(p(x))}/(1 − ‖p(x(t̃)) − p(x)‖_{F″(p(x))}))² − 2λ F′(p(x))[∑_{j=1}^k (d^T ∇²f_j(x) d) h_j] ]
≤ −δ̄t + (t²/2) (8α₃ + 2λ²κ²α₃ + 2λα₃).

By letting α₇ := 8α₃ + 2λ²κ²α₃ + 2λα₃ and t* := min{δ̄/α₇, α₅} we have the estimation that log G(x + t*d; v) − log G(x; v) ≤ −β′, with β′ := δ̄t* − (α₇/2)t*².
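The two derivative identities for the log-det barrier quoted above can be checked numerically by finite differences. The sketch below is only an illustrative sanity check (the matrices and step size are arbitrary choices, not from the paper):

```python
import numpy as np

def F(p):
    # log-det barrier for the positive semidefinite cone: F(p) = -log det p
    return -np.log(np.linalg.det(p))

rng = np.random.default_rng(1)
a = rng.normal(size=(4, 4))
p = a @ a.T + 4.0 * np.eye(4)    # a positive definite point
b = rng.normal(size=(4, 4))
h = b + b.T                      # a symmetric direction

pinv = np.linalg.inv(p)
t = 1e-4
# First directional derivative: F'(p)[h] = -tr(p^{-1} h)
d1 = (F(p + t * h) - F(p - t * h)) / (2.0 * t)
assert abs(d1 + np.trace(pinv @ h)) < 1e-5
# Second directional derivative: F''(p)[h, h] = tr(p^{-1} h p^{-1} h)
d2 = (F(p + t * h) - 2.0 * F(p) + F(p - t * h)) / t**2
assert abs(d2 - np.trace(pinv @ h @ pinv @ h)) < 1e-3
print("log-det barrier derivative identities verified")
```

The second identity also makes the norm computation above concrete: ‖q‖_{F″(p)} is exactly the Frobenius norm of p^{−1/2}qp^{−1/2}.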

One easily verifies that all the constants, α₁ to α₇, δ̄, β and β′, are rationally related to the problem parameters ν, κ and λ. This gives rise to a polynomial-time complexity proof for the following extended version of the Iri-Imai method (or, potential reduction method).

An Iri-Imai Type Potential Reduction Method for COCP

Input data: An initial point x⁰ ∈ rel int F and an initial lower bound v⁰ ≤ v* of the optimal value.

Output: A feasible solution x ∈ rel int F such that c^T x − v* ≤ ɛ, where ɛ > 0 is the desired precision.

Step 0. Let x := x⁰ and v := v⁰. Go to Step 1.

Step 1. Check whether c^T x − v ≤ ɛ or not. If yes, stop with x as the output; if no, continue with Step 2.

Step 2. Compute the Newton direction at x,

d := −(∇²G(x; v))⁻¹ ∇G(x; v).

If −∇G(x; v)^T d/G(x; v) < δ̄ then increase the value of v until −∇G(x; v)^T d/G(x; v) = δ̄; otherwise continue with Step 3.

Step 3. Apply the line-minimization for log G(x; v) along the direction d, starting from x. Let t* = argmin_{t ≥ 0} log G(x + td; v). Let x := x + t*d, and go to Step 1.

The next theorem is a consequence of Theorem 3.8 and the remarks thereafter.

Theorem 3.9 Suppose that F(·) satisfies Conditions 3.1, 3.2, 3.3, and 3.5, and the f_j(·)'s, j = 1, ..., k, satisfy Condition 3.4. Let γ = ν + √ν. Then the above described Iri-Imai type potential reduction algorithm terminates in a number of steps which is bounded by a polynomial in ν, κ, λ and log(1/ɛ), provided that:

i) an optimal solution for (P_K) exists;

ii) |f(x)| ≤ L for all x ∈ F, where L is bounded by a polynomial in ν, κ and λ;

iii) the initial potential function value log G(x⁰; v⁰) is bounded by a polynomial in ν, κ, λ, and log(1/ɛ).
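For illustration only, the loop below instantiates the potential reduction method just described on a tiny problem that is an assumption of this sketch, not the paper's general COCP: minimize x₁ + x₂ subject to ‖x‖² ≤ 2 (so K = R₊, k = 1, ν = 1, no equality constraints), with the exact lower bound v = v* = −2 supplied up front, so the lower-bound update of Step 2 is skipped, and with a crude grid search standing in for exact line-minimization:

```python
import numpy as np

c = np.array([1.0, 1.0])
v = -2.0                        # valid lower bound: optimum is x* = (-1,-1)
gamma = 1.0 + np.sqrt(1.0)      # gamma = nu + sqrt(nu) with nu = 1
eps = 0.06

def phi(x):                     # log G(x; v) = gamma*log(c'x - v) + f(x)
    return gamma * np.log(c @ x - v) - np.log(2.0 - x @ x)

def newton_dir(x):              # d = -(Hess G)^{-1} grad G
    gap, slack = c @ x - v, 2.0 - x @ x
    g = gamma * c / gap + 2.0 * x / slack                # grad log G
    H = (-gamma * np.outer(c, c) / gap**2
         + 2.0 * np.eye(2) / slack
         + 4.0 * np.outer(x, x) / slack**2)              # Hess log G
    # Hess G / G = Hess log G + (grad log G)(grad log G)^T
    return -np.linalg.solve(H + np.outer(g, g), g)

def max_step(x, d):             # largest t keeping x + t*d strictly feasible
    a, b, c0 = d @ d, 2.0 * (x @ d), x @ x - 2.0
    return (-b + np.sqrt(b * b - 4.0 * a * c0)) / (2.0 * a)

x = np.array([0.5, -0.3])       # strictly feasible starting point
for _ in range(500):
    if c @ x - v <= eps:        # Step 1: stop when the gap is small
        break
    d = newton_dir(x)           # Step 2 (lower bound already exact)
    ts = np.linspace(1e-3, 0.999, 200) * max_step(x, d)
    x = x + ts[np.argmin([phi(x + t * d) for t in ts])] * d   # Step 3
print("x =", x, " objective =", c @ x)
```

The iterates move toward x* = (−1, −1) while the potential log G decreases at every step; the grid line search is a simplification and any line-minimization routine could be used instead.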

Proof. Let the iterates produced by the algorithm be {x⁰, x¹, ..., x^j, ...}, and the sequence of the lower bounds be {v⁰, v¹, ..., v^j, ...}. By Theorem 3.8 we have

log G(x^j; v^j) ≤ log G(x^{j−1}; v^{j−1}) − β ≤ log G(x⁰; v⁰) − jβ,

where 1/β is bounded from above by a polynomial in ν, κ and λ. Thus,

log(c^T x^j − v^j) = (log G(x^j; v^j) − f(x^j))/γ ≤ (log G(x⁰; v⁰) − jβ + L)/γ.   (55)

Therefore, if

j ≥ (log G(x⁰; v⁰) + L + γ log(1/ɛ))/β

then it follows from (55) that c^T x^j − v^j ≤ ɛ. Q.E.D.

We observe that Theorem 3.9 provides explicit computational complexity bounds for some new classes of convex programming problems. To give one such example, let us call a convex function f(x) quadratically dominated if it can be split into two parts, f(x) = q(x) + h(x), where q(x) is convex quadratic and h(x) is any other function whose Hessian is dominated by ∇²q(x) = Q ⪰ 0, i.e., there is 0 ≤ θ < 1 such that −θQ ⪯ ∇²h(x) ⪯ θQ. Our example at the beginning of Section 3, f(x) = x² + sin x, is such a convex quadratically dominated function. Now consider the following COCP: K is a symmetric cone (κ = 1); the f_j's are convex quadratically dominated functions (all with a uniform and constant bound θ). Since convex quadratically dominated functions are harmonically convex with λ = (1 + θ)/(1 − θ), we have λ = O(1), as θ is assumed to be fixed here. It follows that

α₂ = O(√ν)
α₃ = (ν + √ν)/(ν − 1) + ((ν + √ν)/(ν − 1) + α₂)²/γ = O(1)
α₄ = 2/(√α₃ (1 + √(1 + 2λκ))) = O(1)
α₅ = 1/(√α₃ (1 + √(1 + λκ))) = O(1)
α₇ = 8α₃ + 2λ²κ²α₃ + 2λα₃ = O(1)
δ̄ = 1/(2 + κ + λκ) = O(1)
t* = min{δ̄/α₇, α₅} = O(1)
β′ = δ̄t* − (α₇/2)t*² = O(1).

Therefore, the Iri-Imai type potential reduction algorithm would solve this problem in O(ν log(1/ɛ)) iterations to reach an ɛ-optimal solution, if ɛ > 0 is small enough, as asserted by Theorem 3.9. Similarly, the geometric semidefinite program (P_PSD) that we introduced at the beginning of this section can be solved in O((m + k + l) log(1/ɛ)) iterations if max_{1≤i≤k} u_i/l_i is a constant.
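The definition of a quadratically dominated function can be illustrated numerically (this check is not part of the paper): for f(x) = x² + sin x, take q(x) = x² with Q = q″ = 2 and h(x) = sin x with h″(x) = −sin x, so the domination bound holds with θ = 1/2, and the harmonic-convexity constant is λ = (1 + θ)/(1 − θ) = 3:

```python
import numpy as np

theta = 0.5
Q = 2.0
xs = np.linspace(-100.0, 100.0, 200001)

h2 = -np.sin(xs)                           # h''(x) for h(x) = sin(x)
assert np.all(np.abs(h2) <= theta * Q)     # -theta*Q <= h''(x) <= theta*Q
assert np.all(Q + h2 >= 1.0)               # f''(x) = 2 - sin(x) >= 1: convex

lam = (1.0 + theta) / (1.0 - theta)
print("theta =", theta, " lambda =", lam)  # lambda = 3.0
```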

Even if K is not a symmetric cone (assuming, however, that κ is fixed) and the f_j's are either quadratically dominated convex functions or generated from monomials, since

α₆ = 8α₃ + 2λ²κ²α₃ + 2λκα₃√ν = O(√ν),

we can still apply Theorem 3.8 and Theorem 3.9 to conclude that these problems can be solved in no more than O(ν^{1.5} log(1/ɛ)) iterations to reach an ɛ-optimal solution.

4 Duality and lower bounds

What remains to be discussed is how to obtain a lower bound v ≤ v* in order to apply the Iri-Imai type potential reduction method introduced in the previous section. This naturally leads to duality issues. Consider the following COCP problem:

(P_K) minimize c^T x
      subject to Ax = b
                 h_0 − ∑_{j=1}^k f_j(x) h_j ∈ K,

where A ∈ R^{m×n}, c ∈ R^n and b ∈ R^m, h_0, h_1, ..., h_k ∈ K, and the f_j(·)'s are convex functions. For this purpose, let us note that the conjugate of a convex function f(x), which is also a convex function, is defined as

f*(s) = sup{(−s)^T x − f(x) | x ∈ dom f},

where dom f stands for the domain of the function f. Define a closed convex cone as follows:

C := cl{(p, q, x) | p > 0, q − p f(x/p) ≥ 0}.

Then, the dual of C can be computed as

C* = cl{(u, v, s) | v > 0, u − v f*(s/v) ≥ 0}.

One is referred to [12] for discussions on this and other related issues. Let

K_j := cl{(p_j, q_j, x_j) | p_j > 0, q_j − p_j f_j(x_j/p_j) ≥ 0}, j = 1, ..., k.
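Note the sign convention used here: f*(s) = sup{(−s)^T x − f(x)} is the usual Legendre-Fenchel conjugate evaluated at −s. For a convex quadratic f(x) = ½x^T Qx this still gives f*(s) = ½s^T Q⁻¹s, which the following sketch (illustrative only, with arbitrarily chosen Q and s) checks numerically:

```python
import numpy as np

# f(x) = 0.5 x'Qx; under f*(s) = sup_x {(-s)'x - f(x)}, f*(s) = 0.5 s'Q^{-1}s.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])       # positive definite
s = np.array([1.0, -2.0])

x_star = -np.linalg.solve(Q, s)              # the sup is attained at -Q^{-1}s
analytic = 0.5 * s @ np.linalg.solve(Q, s)
attained = -s @ x_star - 0.5 * x_star @ Q @ x_star
assert abs(attained - analytic) < 1e-12

# a brute-force sup over a grid never exceeds, and nearly reaches, f*(s)
g = np.linspace(-10.0, 10.0, 201)
X, Y = np.meshgrid(g, g)
vals = (-s[0] * X - s[1] * Y
        - 0.5 * (Q[0, 0] * X**2 + 2.0 * Q[0, 1] * X * Y + Q[1, 1] * Y**2))
assert vals.max() <= analytic + 1e-9
assert vals.max() >= analytic - 1e-2
print("conjugate of the quadratic verified:", analytic)
```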

Then,

K_j* = cl{(u_j, v_j, s_j) | v_j > 0, u_j − v_j f_j*(s_j/v_j) ≥ 0}, j = 1, ..., k.

In light of these new cones, we may rewrite our original COCP as

(P_K') minimize c^T x
       subject to Ax = b
                  (1, q_j, x) ∈ K_j, j = 1, ..., k
                  h_0 − ∑_{j=1}^k q_j h_j ∈ K.

The above description is in the form of linear conic optimization. Hence its dual problem can be found by standard procedures. Let us derive the dual using the Lagrangian multipliers. In particular, let y be the multipliers for the first set of (equality) constraints, and (u_j, v_j, s_j) ∈ K_j* be the multipliers for the second set of (conic) constraints. Let z ∈ K* be the multiplier for the last conic constraint. Then the Lagrangian function is

L(x, q; y, u, v, s, z) = c^T x − y^T(Ax − b) − ∑_{j=1}^k (u_j + v_j q_j + s_j^T x) − z^T(h_0 − ∑_{j=1}^k q_j h_j),

from which the dual problem can be derived as

(D_K') maximize b^T y − h_0^T z − ∑_{j=1}^k u_j
       subject to A^T y + ∑_{j=1}^k s_j = c
                  (u_j, h_j^T z, s_j) ∈ K_j*, j = 1, ..., k
                  z ∈ K*.
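As a sanity check of the derivation, weak duality can be verified numerically on a toy instance (an assumption of this sketch, not from the paper): K = R₊, k = 1, f₁(x) = ½‖x‖², h₀ = 2, h₁ = 1. Every multiplier choice with A^T y + s = c and z > 0 produces a lower bound on the primal objective at a feasible point:

```python
import numpy as np

# Toy primal: minimize c'x s.t. Ax = b, h0 - f1(x)*h1 >= 0, f1(x) = 0.5||x||^2.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
h0, h1 = 2.0, 1.0
x_hat = np.array([1.0, 0.0])                # a feasible primal point
assert np.allclose(A @ x_hat, b) and 0.5 * x_hat @ x_hat <= h0 / h1
primal = c @ x_hat

rng = np.random.default_rng(0)
for _ in range(1000):
    y = rng.normal(size=1)
    z = rng.uniform(0.01, 5.0)              # z in K* = R_+, z > 0
    s = c - A.T @ y                         # dual equality A'y + s = c
    # dual objective with f1*(s) = 0.5 s's: b'y - h0*z - s's/(2*h1*z)
    dual = b @ y - h0 * z - (s @ s) / (2.0 * h1 * z)
    assert dual <= primal + 1e-9            # weak duality
print("weak duality verified; primal value =", primal)
```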

This problem can be further explicitly expressed as

(D_K') maximize b^T y − h_0^T z − ∑_{j=1}^k (h_j^T z) f_j*(s_j/(h_j^T z))
       subject to A^T y + ∑_{j=1}^k s_j = c
                  z ∈ K*.

Therefore, a lower bound v ≤ v* for (P_K) can be obtained from any finite feasible solution of the above problem (D_K').

If f_j(x) = (1/2) x^T Q_j x with Q_j ≻ 0 for all j, then f_j*(s) = (1/2) s^T Q_j⁻¹ s, and the dual problem becomes

maximize b^T y − h_0^T z − ∑_{j=1}^k s_j^T Q_j⁻¹ s_j/(2 h_j^T z)
subject to A^T y + ∑_{j=1}^k s_j = c
           z ∈ K*.

Consider now the case of conically ordered geometric programming, where f_j(x) = e^{ι_j^T x} for all j. It is easy to compute that

f_j*(s_j) = (‖s_j‖/‖ι_j‖) log(‖s_j‖/‖ι_j‖) − ‖s_j‖/‖ι_j‖, if s_j = −(‖s_j‖/‖ι_j‖) ι_j;
f_j*(s_j) = +∞, if s_j ≠ −(‖s_j‖/‖ι_j‖) ι_j.

Therefore, its dual is

maximize b^T y − h_0^T z − ∑_{j=1}^k (w_j log w_j − (log(h_j^T z) + 1) w_j)
subject to A^T y − [ι_1, ..., ι_k] w = c
           z ∈ K*, w ≥ 0.

References

[1] L. Faybusovich, Jordan algebras, symmetric cones and interior point methods, Research Report, University of Notre Dame, 1995.

[2] M. Iri, A proof of the polynomiality of the Iri-Imai method, Journal of Complexity 9, 269-290, 1993.

[3] M. Iri and H. Imai, A multiplicative barrier function method for linear programming, Algorithmica 1, 455-482, 1986.

[4] Yu. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, Dordrecht, 2004.

[5] Yu. Nesterov and A. Nemirovskii, Interior Point Polynomial Methods in Convex Programming, Studies in Applied Mathematics 13, SIAM, Philadelphia, PA, 1994.

[6] Yu. Nesterov and M.J. Todd, Self-scaled barriers and interior-point methods for convex programming, Mathematics of Operations Research 22, 1-42, 1997.

[7] J. Renegar, A Mathematical View of Interior-Point Methods in Convex Optimization, MPS-SIAM Series on Optimization, SIAM, Philadelphia, PA, 2001.

[8] J.F. Sturm, Unifying semi-definite and second-order cone programming: symmetric cones, Research Report, 1998.

[9] J.F. Sturm, Similarity and other spectral relations for symmetric cones, Linear Algebra and its Applications 312, 135-154, 2000.

[10] J.F. Sturm and S. Zhang, A potential reduction method for harmonically convex programming, Journal of Optimization Theory and Applications 84, 181-205, 1995.

[11] S. Zhang, Convergence property of the Iri-Imai algorithm for some smooth convex programming problems, Journal of Optimization Theory and Applications 82, 121-138, 1994.

[12] S. Zhang, A new self-dual embedding method for convex programming, Journal of Global Optimization 29, 479-496, 2004.