Primal Central Paths and Riemannian Distances for Convex Sets

Similar documents
Primal Central Paths and Riemannian Distances for Convex Sets

Nonsymmetric potential-reduction methods for general cones

Self-Concordant Barrier Functions for Convex Optimization

Supplement: Universal Self-Concordant Barrier Functions

Constructing self-concordant barriers for convex cones

On self-concordant barriers for generalized power cones

IE 521 Convex Optimization Homework #1 Solution

Augmented self-concordant barriers and nonlinear optimization problems with finite complexity

Exercises: Brunn, Minkowski and convex pie

Primal-dual IPM with Asymmetric Barrier

(Linear Programming program, Linear, Theorem on Alternative, Linear Programming duality)

Gradient methods for minimizing composite functions

Lecture 5. The Dual Cone and Dual Problem

Gradient methods for minimizing composite functions

Iteration-complexity of first-order penalty methods for convex programming

Course: Interior Point Polynomial Time Algorithms in Convex Programming ISyE 8813 Spring Lecturer: Prof. Arkadi Nemirovski, on sabbatical

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Multi-Parameter Surfaces of Analytic Centers and Long-step Surface-Following Interior Point Methods

10. Unconstrained minimization

Largest dual ellipsoids inscribed in dual cones

LECTURE 15: COMPLETENESS AND CONVEXITY

GEORGIA INSTITUTE OF TECHNOLOGY H. MILTON STEWART SCHOOL OF INDUSTRIAL AND SYSTEMS ENGINEERING LECTURE NOTES OPTIMIZATION III

Chapter 2 Convex Analysis

Primal-Dual Symmetric Interior-Point Methods from SDP to Hyperbolic Cone Programming and Beyond

Optimization and Optimal Control in Banach Spaces

Lecture 6: Conic Optimization September 8

Course Summary Math 211

On Conically Ordered Convex Programs

Cubic regularization of Newton s method for convex problems with constraints

Barrier Method. Javier Peña Convex Optimization /36-725

Note on the convex hull of the Stiefel manifold

Hessian Riemannian Gradient Flows in Convex Programming

Complexity bounds for primal-dual methods minimizing the model of objective function

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17

Optimality Conditions for Nonsmooth Convex Optimization

APPLICATIONS OF DIFFERENTIABILITY IN R n.

What can be expressed via Conic Quadratic and Semidefinite Programming?

Constrained Optimization and Lagrangian Duality

Unconstrained minimization

Full Newton step polynomial time methods for LO based on locally self concordant barrier functions

Convex Feasibility Problems

1 Directional Derivatives and Differentiability

Real Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi

Metric Spaces and Topology

PRIMAL-DUAL INTERIOR-POINT METHODS FOR SELF-SCALED CONES

Constraint qualifications for convex inequality systems with applications in constrained optimization

THE MIXING SET WITH FLOWS

2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers.

INFORMATION-BASED COMPLEXITY CONVEX PROGRAMMING

A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function

Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization

THE INVERSE FUNCTION THEOREM

Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version

MA651 Topology. Lecture 10. Metric Spaces.

On duality gap in linear conic problems

Convex Geometry. Carsten Schütt

DO NOT OPEN THIS QUESTION BOOKLET UNTIL YOU ARE TOLD TO DO SO

Topological properties

L. Vandenberghe EE236C (Spring 2016) 18. Symmetric cones. definition. spectral decomposition. quadratic representation. log-det barrier 18-1

The proximal mapping

EE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1

15. Conic optimization

Mathematics for Economists

Lecture 7: Semidefinite programming

DIFFERENTIAL GEOMETRY, LECTURE 16-17, JULY 14-17

We describe the generalization of Hazan s algorithm for symmetric programming

Math 6455 Nov 1, Differential Geometry I Fall 2006, Georgia Tech

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.

New stopping criteria for detecting infeasibility in conic optimization

Set, functions and Euclidean space. Seungjin Han

On John type ellipsoids

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Convex Analysis Background

Accelerating the cubic regularization of Newton s method on convex problems

Weierstraß-Institut. für Angewandte Analysis und Stochastik. im Forschungsverbund Berlin e.v. Preprint ISSN

Lecture 5. Theorems of Alternatives and Self-Dual Embedding

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University

Distances, volumes, and integration

A Brief Review on Convex Optimization

Convex Optimization M2

Selected Examples of CONIC DUALITY AT WORK Robust Linear Optimization Synthesis of Linear Controllers Matrix Cube Theorem A.

Convex Optimization and Modeling

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

Math 341: Convex Geometry. Xi Chen

Exercises from other sources REAL NUMBERS 2,...,

Convex Optimization Boyd & Vandenberghe. 5. Duality

Some topics in analysis related to Banach algebras, 2

5. Duality. Lagrangian

Math 421, Homework #9 Solutions

1. Subspaces A subset M of Hilbert space H is a subspace of it is closed under the operation of forming linear combinations;i.e.,

Introduction to Real Analysis Alternative Chapter 1

Convex Optimization. 9. Unconstrained minimization. Prof. Ying Cui. Department of Electrical Engineering Shanghai Jiao Tong University

Local Superlinear Convergence of Polynomial-Time Interior-Point Methods for Hyperbolic Cone Optimization Problems

CSCI : Optimization and Control of Networks. Review on Convex Optimization

Banach Journal of Mathematical Analysis ISSN: (electronic)

Lecture 9 Sequential unconstrained minimization

A SET OF LECTURE NOTES ON CONVEX OPTIMIZATION WITH SOME APPLICATIONS TO PROBABILITY THEORY INCOMPLETE DRAFT. MAY 06

Ordinary Differential Equations. Iosif Petrakis

Transcription:

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 1» 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 16 17 18 19 2 21 22 23 24 25 26 27 28 29 3 31 32 33 34 35 36 37 38 39 4 41 42 43 44 45 46 47 DOI 1.17/s128-7-919-4 Primal Central Paths and Riemannian Distances for Convex Sets Y. Nesterov A. Nemirovski Received: 2 November 25 / Final version received: 29 May 27 / Accepted: 5 September 27 SFoCM 27 Abstract In this paper, we study the Riemannian length of the primal central path in a convex set computed with respect to the local metric defined by a self-concordant function. Despite some negative examples, in many important situations, the length of this path is quite close to the length of a geodesic curve. We show that in the case of a bounded convex set endowed with a ν-self-concordant barrier, the length of the central path is within a factor O(ν 1/4 ) of the length of the shortest geodesic curve. Keywords Riemannian geometry Convex optimization Structural optimization Interior-point methods Path-following methods Self-concordant functions Polynomial-time methods 1 Introduction Most modern interior-point schemes for convex optimization problems are based on tracing a path in the interior of a convex set. A generic case of this type is as follows. We are given a self-concordant function f(x) with convex open domain dom f in This paper presents research results of the Belgian Program on Interuniversity Poles of Attraction initiated by the Belgian State, Prime Minister s Office, Science Policy Programming. Y. Nesterov ( ) CORE, Catholic University of Louvain, 34 voie du Roman Pays, 1348 Louvain-la-Neuve, Belgium e-mail: nesterov@core.ucl.ac.be A. Nemirovski Technion-Israel Institute of Technology, Haifa, Israel e-mail: nemirovs@ie.technion.ac.il PDF-OUTPUT

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 2» 48 49 5 51 52 53 54 55 56 57 58 59 6 61 62 63 64 65 66 67 68 69 7 71 72 73 74 75 76 77 78 79 8 81 82 83 84 85 86 87 88 89 9 91 92 93 94 a finite-dimensional space E (see Sect. 2 to recall the definitions). Let A be a linear operator from E to another linear space E d and d E d. We are interested in tracing the path x(t) = argmin x dom f : Ax=d { ft (x) def = f(x)+ te,x }. (1.1) Note that this framework covers seemingly all standard short-step interior-point pathfollowing schemes (e.g., [5, 7], and [1]): The primal path-following method solving the optimization program { } min c,x :x cl dom f x traces, as t,theprimal central path (1.1) given by e = c, A, b =, and a ν-self-concordant barrier f for cl dom f. Let K be a closed pointed cone with nonempty interior in a finite-dimensional linear space G, A be a linear operator from G to R m with conjugate operator A, G be a linear space dual to G, and K G be a cone dual to K. And let F be a ν-logarithmically homogeneous self-concordant barrier F for K with Legendre transform F. The feasible-start primal-dual path-following method for solving the primal-dual pair of conic problems { } (P ) : min c,u :Au = b, u K, u (D) : { max b,y :c A } y K, traces, as t, the primal-dual central path (1.1) given by x (u, y) E = G R m, y f(x)= F (u) + F (c A y), A(x) = Au, d = b, e = (c, b). For the infeasible-start primal-dual path-following method as applied to (P ), (D), we need to fix a reference points u int K and s int K. Then we can trace, as t, the path (1.1) given by E = (G R) R m+1, x (u,τ,y,κ), Ax = ( Au τb,c,u b,y +κ ), d = ( Ax b,c,x +1 ), f(x)= F (u) + F ( s + (τ 1)c A y ) ln τ ln κ, with e defined by the condition that starting point x = (u, 1,, 1) coincides with x(1) (see [7] for details).

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 3» 95 96 97 98 99 1 11 12 13 14 15 16 17 18 19 11 111 112 113 114 115 116 117 118 119 12 121 122 123 124 125 126 127 128 129 13 131 132 133 134 135 136 137 138 139 14 141 It can be shown that in all situations an upper complexity bound for tracing the corresponding trajectories by a short-step strategy is of the order O( ν ln ν ɛ ) iterations, where ɛ is a relative accuracy of the final approximate solution of the problem (see, for example, [4, 5], and [7]). Recently in [6], there was developed a technique for establishing lower complexity bounds for short-step path-following schemes. This technique is based on a combination of the theory of self-concordant functions [5] with the concepts of Riemannian geometry (see [1 3, 8, 9] for the main definitions and their applications in optimization). Let f be a nondegenerate self-concordant function on an open convex domain Q E. We can use the Hessian of f in order to define a Riemannian structure on Q. Namely, for a continuously differentiable curve γ(t) Q, t t t 1, we define the length of this curve as follows: ρ [ ] t1 γ( ), t,t 1 = f (γ (t))γ (t), γ (t) 1/2 dt. t Definition 1.1 The infimum of the quantities ρ[γ( ), t,t 1 ] over all continuously differentiable curves γ( ) in Q linking a given pair of points x and y of the set (i.e., γ(t ) = x,γ(t 1 ) = y) is called the Riemannian distance between x and y. We denote it by σ(x,y). Clearly, σ(x,y) is a distance on Q.For Q Q, we define σ(x, Q) = inf y {σ(x,y): y Q}. Note that the Riemannian metric defined by f (x) arises very naturally in the theory of interior-point schemes. Indeed, it can be shown that the Dikin ellipsoid W r (x) = { y : f (x)(y x),y x r 2} centered at an arbitrary point x int Q always belongs to Q for r<1. Moreover, short-step interior-point methods usually generate a sequence of points {x k } k= satisfying the condition f (x k )(x k+1 x k ), x k+1 x k r 2 < 1, k. Hence, given a starting point x, a target x, both in int Q, and a step-size bound r (, 1), it would be interesting to estimate from below and from above the minimal number of steps N f ( x,x ), which is sufficient for connecting the end points by a short-step sequence. If x is an approximate solution of certain optimization problem, then an upper bound for N f ( x,x ) can be extracted from the complexity estimate of some numerical scheme. Thus, a lower bound for N f ( x,x ) delivers a lower complexity bound for corresponding methods. In [6], it was shown that the value O(σ(x,y)) provides us with a lower bound for N f (x, y) (see [6], Sect. 3). Moreover, in [6], Sect. 5, it was shown that the upper bounds for this value delivered by primal-dual feasible and infeasible path-following schemes coincide with the lower bounds up to a constant factor. Thus, for the corresponding problem setting, these methods are optimal.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 4» 142 143 144 145 146 147 148 149 15 151 152 153 154 155 156 157 158 159 16 161 162 163 164 165 166 167 168 169 17 171 172 173 174 175 176 177 178 179 18 181 182 183 184 185 186 187 188 Note that the conclusion of [6] was derived from some special symmetry of the primal dual feasible cone K K. This technique is not applicable to the pure primal setting. Actually, it is easy to see that in this case the central path trajectories can be very bad. Let us look at the following example. Example 1.1 Let Q R+ n be the positive orthant in Rn. Let us endow Q with the standard self-concordant barrier f(x)= n ln x (i), ν = n. Then the Riemannian distance in Q is defined as follows (see (6.2), [6]): σ(x,y) = i=1 [ n ln 2 x(i) y (i) i=1 ] 1/2. Let us form a central path, which connects the point x = e = (1,...,1) T R n with the simplex { } n Δ(β) = x Q : x (i) = n (1 + β), β >. i=1 That is a solution of the following problem x(t) = arg min x Δ(t) f(x), t β. Clearly, x(t) = (1 + t) e. Thus, using the central path, we can travel from x = e to the set Δ(β) in O(σ(e,x(β))) iterations of a path-following scheme with σ ( e,x(β) ) = n ln(1 + β). However, it is easy to see that there exists a shortcut: y = e + nβe 1 Δ(β), where e 1 = (1,,...,) T R n. σ (e, y) = ln(1 + nβ), Despite referring to a slightly different problem setting, 1 the above observation is quite discouraging. Fortunately, the situation is not always so bad. The main goal of this paper is to show that for several important problems the primal central paths are O(ν 1/4 )-geodesic (we use terminology of [6]). The paper is organized as follows. In Sect. 2, we recall the reader the main facts on the theory of self-concordant functions and prove several new inequalities, which are necessary for our analysis. In Sect. 3, we establish different lower bounds on the 1 In Example 1.1, we speak about Riemannian distance between a point and a convex set.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 5» 189 19 191 192 193 194 195 196 197 198 199 2 21 22 23 24 25 26 27 28 29 21 211 212 213 214 215 216 217 218 219 22 221 222 223 224 225 226 227 228 229 23 231 232 233 234 235 Riemannian distances in convex sets in terms of the local norms defined by a selfconcordant function. In Sect. 4, we prove the main result of this paper. That is an upper bound on the Riemannian length of a segment of a central path in terms of variation of the value of corresponding self-concordant function and the logarithm of the variation of the path parameter. In Sect. 5, we apply this result to different problem instances: finding a minimum of self-concordant function (Sect. 5.1), feasibility problem (Sect. 5.2) and the standard minimization problem (Sect. 5.3). We show that in Example 1.1 the presence of the unpleasant factor O( ν/ln ν) in the ratio of the length of the central path and the corresponding Riemannian distance is due to unboundedness of the basic feasible set Q.IfQ is bounded, this ratio can be at most of the order O(ν 1/4 ). 2 Self-Concordant Functions and Barriers In order to make the paper self-contained, in this section, we summarize the main results on self-concordant functions, which can be found in [5], Chap. 2, and in [4], Chap. 4. We prove also some new inequalities, which are useful for working with the Riemannian distances. Let E be a finite-dimensional real vector space and Q E be an open convex domain. A three-times continuously differentiable convex function f(x): Q R is called self-concordant, if the sets {x Q : f(x) a} are closed for every a R and d 3 ( f(x+ th) d 2 ) 3/2 dt3 2 f(x+ th) t= dt2, x Q, h E. (2.1) t= Such a function is called nondegenerate, if its Hessian is positive definite at some (and then at every) point of Q. This happens, for example, if Q contains no straight line. An equivalent condition to (2.1) can be written in terms of relation between the second and third differentials: D 3 f(x)[h 1,h 2,h 3 ] 2 3 f 1/2, (x)h i,h i x Q, h1,h 2,h 3 E. (2.2) i=1 Denote by E the space dual to E. Forh E and g E denote by g,h the value of the linear function g on the vector h. Letf be a nondegenerate selfconcordant function on Q. For every x Q, we have f (x) E. Thus, the Hessian f (x) defines a nondegenerate linear operator: h f (x)h E, h E. Hence, we can define a local primal norm: h x = f (x)h, h 1/2 : E R,

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 6» 236 237 238 239 24 241 242 243 244 245 246 247 248 249 25 251 252 253 254 255 256 257 258 259 26 261 262 263 264 265 266 267 268 269 27 271 272 273 274 275 276 277 278 279 28 281 282 and, using the standard definition η x = max h: h x 1η,h, the corresponding local dual norm: Denote by g x = g, [ f (x) ] 1 g 1/2: E R. λ(x) = f (x), [ f (x) ] 1 f (x) 1/2 = f (x) x the local norm of the gradient f (x). A nondegenerate self-concordant function f is called a self-concordant barrier with parameter ν, if λ 2 (x) f (x), [ f (x) ] 1 f (x) ν, x Q. Note that ν cannot be smaller than one. Let us mention first the well-known facts. Proposition 2.1 Let f be a nondegenerate self-concordant function on an open convex domain Q E. Then (i) For every x Q and r [, 1), the ellipsoid W r (x) {y : y x x <r} is contained in Q. For any y W r (x) and any h E, we have ( ) h x 1 y x x h x h y. (2.3) 1 y x x Moreover, for any x and y from Q and y x y y x x, (2.4) 1 + y x x f (x) f (y), x y x y 2 x. (2.5) 1 + x y x (ii) The following facts are related to existence of a minimizer x f of f(x)on Q. f attains its minimum on Q if and only if it is below bounded. f attains its minimum on Q if and only if the set {x : λ(x) < 1} is nonempty. If λ(x) < 1, then f(x) f(x f ) λ(x) ln(1 λ(x)). If x f exists, then it is unique. For every ρ<1, the set {x Q : λ(x) ρ} is compact. Denote r f (x) = x x f xf. If λ(x) < 1, then r f (x) ln ( 1 + r f (x) ) λ(x) ln ( 1 λ(x) ). Hence, the set {x : λ(x) 1 2 } is contained in the ellipsoid {x : x x f xf 3 4 }.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 7» 283 284 285 286 287 288 289 29 291 292 293 294 295 296 297 298 299 3 31 32 33 34 35 36 37 38 39 31 311 312 313 314 315 316 317 318 319 32 321 322 323 324 325 326 327 328 329 (iii) For every x Q, the damped Newton iterate belongs to Q, and x + = x 1 [ f (x) ] 1 f (x) 1 + λ(x) f(x + ) f(x) [ λ(x) ln ( 1 + λ(x) )], λ(x + ) 2λ 2 (x). Hence, the damped Newton method converges quadratically as λ(x) < 1 2. (iv) The domain of the Legendre transformation [ ] f (ξ) = sup ξ,x f(x) : E R {+ } x (2.6) of f is an open convex set which is exactly the image of Q under the one-to-one C 2 mapping f (x) : Q E. The function f is a nondegenerate self-concordant function on its domain, and the Legendre transformation of f is f. If x f exists, then by (2.3) as applied to f, we have ( ) 1 λ(x) g x g x f g x 1 λ(x), x Q : λ(x) < 1, g E. (2.7) (v) Let f be ν-self-concordant barrier for cl Q. Let x and y belong to Q. Then f (x), y x ν. If in addition, f (x), y x, then y x x ν + 2 ν. Let us prove now some new inequalities. Let f be a nondegenerate self-concordant function with dom f E. Denote ζ(t) = ln(1 + t) t 1 + t, t > 1, ζ (t) = ln(1 t)+ t 1 t, t <1. In what follows, we assume that these functions are equal + outside their natural domains. Lemma 2.1 For every x,y dom f we have: f(y) f(x)+ f (x), y x + ζ ( y x y ), (2.8) f(y) f(x)+ f (x), y x + ζ ( y x y ), (2.9)

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 8» 33 331 332 333 334 335 336 337 338 339 34 341 342 343 344 345 346 347 348 349 35 351 352 353 354 355 356 357 358 359 36 361 362 363 364 365 366 367 368 369 37 371 372 373 374 375 376 f(x) f(y)+ f (y), x y + ζ ( f (x) f (y) y), (2.1) f(x) f(y)+ f (y), x y ( + ζ f (x) f (y) y). (2.11) Proof For t [, 1] consider the function φ(t)= f(y) f ( y + t(x y) ) + f ( y + t(x y) ),t(x y). Note that φ() =. Then φ (t) = t f ( y + t(x y) ) (x y),x y). Denote r = x y y. Since f is self-concordant, we have (see (2.4)). Hence, f ( y + t(x y) ) (x y),x y r 2 (1 + tr) 2 f(y) f(x)+ f (x), x y = φ(1) φ() = as required in (2.8). If r<1, we have also (see (2.3)). Hence, 1 1 tr 2 dt (1 + tr) 2 = ζ(r), f ( y + t(x y) ) (x y),x y r 2 (1 tr) 2 f(y) f(x)+ f (x), x y = φ(1) φ() = 1 φ (t) dt 1 φ (t) dt tr 2 dt (1 tr) 2 = ζ (r), and that is (2.9). In order to prove two others inequalities, note that the Legendre transformation of f [ ] f (s) = max s,x f(x) x is nondegenerate and self-concordant along with f (Proposition 2.1(iv)). Therefore, using (2.8) and (2.9) at the points u = f (x), v = f (y) we have f (v) f (u) f (u), v u ζ ( ) v u v, f (v) f (u) f (u), v u ( ) ζ v u v.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 9» 377 378 379 38 381 382 383 384 385 386 387 388 389 39 391 392 393 394 395 396 397 398 399 4 41 42 43 44 45 46 47 48 49 41 411 412 413 414 415 416 417 418 419 42 421 422 423 It remains to note that f (u) = f (x), x f(x), f (u) = x, f (v) = f (y), y f(y), f (v) = y, and that h v =f (v)h, h 1/2 =[f (y)] 1 h, h 1/2 = h y for h E. We will need some bounds on the variation of the gradient of a self-concordant barrier in terms of a Minkowski function. Let π z (x) be the Minkowski function of Q with the pole at z Q: π z (x) = inf { t> z + t 1 (x z) Q }. Lemma 2.2 Let u be an arbitrary point in Q. Then for any v Q, we have f (v) u ν 1 π u (v). (2.12) Moreover, if f (u), v u for some v Q, then f (v) u π u (v) (ν + 2 ν)(1 π u (v)). (2.13) Finally, if f (u), v w = for some v,w Q, then 1 π u (v) 1 π u(w) 1 + ν + 2 ν. (2.14) Proof The set cl Q contains a u -ball of radius 1, which is centered at u. Consequently, this set contains a u -ball B of radius 1 π u (v), which is centered at v. Since f is a ν-self-concordant barrier, from Proposition 2.1(v), we have f (v), x v ν, x B cl Q, and (2.12) follows. In order to prove (2.13), let us choose v Q such that f (u), v u, and let r = v u u. The case of r = is trivial. Assuming r>, let φ(t)= f ( u + tr 1 (v u) ), t Δ = { t u + tr 1 (v u) Q }. Note that φ is self-concordant barrier for Δ and the right endpoint of Δ is the point T = r/π u (v). By Proposition 2.1(i), Δ contains the set {s : (s t) 2 φ (t) < 1}, whence t + (φ (t)) 1/2 T for all t Δ. Thus, for t Δ, t, we have φ (t) 1 (T t) 2. (2.15)

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 1» 424 425 426 427 428 429 43 431 432 433 434 435 436 437 438 439 44 441 442 443 444 445 446 447 448 449 45 451 452 453 454 455 456 457 458 459 46 461 462 463 464 465 466 467 468 469 47 Combining (2.15) with the relation φ () = r 1 f (u), v u, we get r φ (r) 1 (T t) 2 dt = r T(T r) = π 2 u (v) r(1 π u (v)). On the other hand, φ (r) =f (v), v u r 1 f (v) u, and we come to f (v) u π u(v) 1 π u (v) πu(v). (2.16) r Setting x = u + πu 1(v)(v u),wehavex cl Q and f (u), x u, whence by Proposition 2.1(v) we get r π u (v) = x u u ν + 2 ν, which combined with (2.16) implies (2.13). To prove (2.14), let f (v), w v =, and let Q v ={x Q f (v), x v =}. Since v is the minimizer of a ν-self-concordant barrier on Q v, the set Q v, regarded as a full-dimensional subset of its affine span, contains an ellipsoid centered at v, and is contained in a (ν + 2 ν) times larger concentric ellipsoid (Proposition 2.1(i), (v)). It follows that there exists x cl Q v, such that 1 v = 1 + ν + 2 ν w + ν + 2 ν 1 + ν + 2 ν x. Thus, 1 π u (v) 1 + ν + 2 ν π u(w) + ν + 2 ν 1 + ν + 2 ν π u(x). Note that π u (x) 1. Hence, 1 π u (v) 1 π u(w) 1 + ν + 2 ν, as required in (2.14). We conclude this section with two lower bounds on the size of the gradient of self-concordant function computed with respect to the local norm defined by its minimizer. Lemma 2.3 Assume that there exists a minimizer x f of a ν-self-concordant barrier f(x). Then for any x dom f we have f( x) f(x f ) ν ln ( 1 + 3 ( 1 + 2ν 1/2) f ( x) x f ) ν ln ( 1 + 9 f ( x) x f ). Proof Denote δ = x x f, r = δ xf, φ(t)= f (x f + tr δ ), Δ= dom φ.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 11» 471 472 473 474 475 476 477 478 479 48 481 482 483 484 485 486 487 488 489 49 491 492 493 494 495 496 497 498 499 5 51 52 53 54 55 56 57 58 59 51 511 512 513 514 515 516 517 Then φ( ) is a ν-self-concordant barrier for the segment cl Δ. Note that r Δ and φ attains its minimum at t =. Moreover, φ () = 1. By Proposition 2.1(v), we have and r ν + 2 ν. Besides this, Thus, φ (t)(r t) ν, t [,r], φ (t) φ (r), t [,r]. f( x) f(x f ) = φ(r) φ() = r r φ (t) dt min [ φ (r), ν(r t) 1] dt { rφ = (r), if φ (r)r ν, ν(1 + ln(rφ (r)/ν)), otherwise ( ) ν ln 1 + 3 rφ (r) ν ( φ (r) = 1 f ( x),δ ) ν ln ( 1 + 3 ( 1 + 2ν 1/2) f ( x) ) r x. f Lemma 2.4 Assume that there exists a minimizer x f of a self-concordant barrier f(x). Then for any x dom f, we have x x f x ( 1 + 2 x x f xf ) f (x) x f. (2.17) If in addition f is a ν-self-concordant barrier, then x x f x ( 2ν + 4 ν + 1 ) f (x) x f. (2.18) Proof Denote δ = x x f, r = δ xf, φ(t)= f (x f + tr ) δ. Note that φ is self-concordant and φ () =. Therefore, in view of (2.5), we have whence φ rφ (r) (r) 1 + r φ (r), φ (r) ( φ (r) ) ( 2 1 + 1 r φ (r) ) 2. (2.19)

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 12» 518 519 52 521 522 523 524 525 526 527 528 529 53 531 532 533 534 535 536 537 538 539 54 541 542 543 544 545 546 547 548 549 55 551 552 553 554 555 556 557 558 559 56 561 562 563 564 Further, φ (t) = 1 r 2 f (x f + t r δ)δ,δ. Thus, φ () = 1 and φ (r) (1 + r) 2 by (2.4). Therefore, (2.19) implies that Note that φ (r) ( φ (r) ) 2( 2 + r 1 ) 2. φ (r) = r 1 f (x), δ f (x) x f. Combining the two last inequalities, we get f (x)δ, δ = r 2 φ (r) (1 + 2r) 2( φ (r) ) 2 (1 + 2r) 2 ( f (x) ) 2, x f as required in (2.17). Inequality (2.18) follows from (2.17) and the second part of Proposition 2.1(v). 3 Lower Bounds on Riemannian Distances Let us establish lower bounds for the Riemannian distance between two points in Q in terms of different local norms defined by f(x). Lemma 3.1 Let u and v belong to Q. Then for any h E\{}, we have ln h u h σ(u,v), (3.1) v and for any η E \{} Moreover, and ln η u η σ(u,v). (3.2) v λ(u) + 1 ln λ(v) + 1 σ(u,v), (3.3) ln f (u) v + 1 f (v) σ(u,v). (3.4) v + 1 Besides this, if f is a ν-self-concordant barrier for cl Q, then f (u) f(v) νσ (u, v). (3.5) Proof By continuity reasons, we may assume that both u, v are distinct from x f.let us fix ɛ>. Consider a C 1 -curve γ(t) Q, t 1, which satisfies the following

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 13» 565 566 567 568 569 57 571 572 573 574 575 576 577 578 579 58 581 582 583 584 585 586 587 588 589 59 591 592 593 594 595 596 597 598 599 6 61 62 63 64 65 66 67 68 69 61 611 conditions: γ() = u, γ (1) = v, γ(t) x f, t [, 1], ρ [ γ( ),, 1 ] σ(u,v)+ ɛ. To prove (3.1), let us fix h E\{} and set ψ(t)=f (γ (t))h, h. Then whence ψ (t) = D 3 f(γ(t)) [ γ (t), h, h ] (2.2) 2 f ( γ(t) ) γ (t), γ (t) 1/2 ψ(t), ln f (u)h, h 1/2 f (v)h, h 1/2 = 1 ψ(1) ln 2 ψ() ρ[ γ( ),, 1 ] σ(u,v)+ ɛ, and (3.1) follows. Relation (3.2) can be derived from (3.1) using the definition of the dual norm. Further, denoting ψ(t)= λ(γ (t)), wehave d dt ψ2 (t) = D 3 f(γ(t)) [ γ (t), [ f ( γ(t) )] 1 f ( γ(t) ), [ f ( γ(t) )] 1 f ( γ(t) )] + 2 [ f ( γ(t) )] 1 f ( γ(t) ) γ (t), f ( γ(t) ) 2 f ( γ(t) ) γ (t), γ (t) 1/2 f ( γ(t) )[ f ( γ(t) )] 1 f ( γ(t) ), [ f ( γ(t) )] 1 f ( γ(t) ) + 2 [ f ( γ(t) )] 1 f ( γ(t) ),f ( γ(t) ) 1/2 f ( γ(t) ) γ (t), γ (t) 1/2 = 2ψ(t) ( ψ(t)+ 1 ) f ( γ(t) ) γ (t), γ (t) 1/2. Since ψ( )>on[, 1], we get whence and (3.3) follows. d dt ln( 1 + ψ(t) ) f ( γ(t) ) γ (t), γ (t) 1/2, λ(u) + 1 ln λ(v) + 1 ρ[ γ( ),, 1 ] σ(u,v)+ ɛ,

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 14» 612 613 614 615 616 617 618 619 62 621 622 623 624 625 626 627 628 629 63 631 632 633 634 635 636 637 638 639 64 641 642 643 644 645 646 647 648 649 65 651 652 653 654 655 656 657 658 To prove (3.4), let ψ(t)= f (γ (t)) v and r(t) = ρ[γ( ),,t]. Then d dt ψ2 (t) = d f ( γ(t) ), [ f (v) ] 1 f ( γ(t) ) dt = 2 f ( γ(t) ) γ (t), [ f (v) ] 1 f ( γ(t) ) 2 γ (t) γ(t) [ f (v) ] 1 f ( γ(t) ) γ(t) }{{} r (t) (by (3.1)) 2r (t)e r(t) [ f (v) ] 1 f ( γ(t) ) v = 2r (t)e r(t) ψ(t), whence ψ (t) r (t)e r(t), so that f (u) v + 1 = ψ(1) + 1 ψ() + 1 + [ e r(1) e r()] f (v) v + exp{ σ(u,v)+ ɛ } ( f (v) v + 1) exp { σ(u,v)+ ɛ }, and (3.4) follows. Finally, to prove (3.5), it suffices to note that if f is a ν-self-concordant barrier, then d dt f ( γ(t) ) = f ( γ(t) ),γ (t) λ ( γ(t) ) f ( γ(t) ) γ (t), γ (t) 1/2 whence and (3.5) follows. ν f ( γ(t) ) γ (t), γ (t) 1/2, f (u) f(v) νρ [ γ( ),, 1 ] ν [ σ(u,v)+ ɛ ], 4 Riemannian Length of Central Path Let f be a nondegenerate self-concordant function with domain Q E. Given a nonzero vector e E, consider the associated central path [ ] x(t) = argmin te,x +f(x). (4.1) x By Proposition 2.1(iv), the domain of this curve is an open interval Δ on the axis. From now on, we assume that this interval contains a given segment [t,t 1 ] with

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 15» 659 66 661 662 663 664 665 666 667 668 669 67 671 672 673 674 675 676 677 678 679 68 681 682 683 684 685 686 687 688 689 69 691 692 693 694 695 696 697 698 699 7 71 72 73 74 75 t <t 1 <. Note that by the implicit function theorem the path x(t) is continuously differentiable on its domain and satisfies the relations f ( x(t) ) = te; x (t) = [ f ( x(t) )] 1 e = t 1 [ f ( x(t) )] 1 f ( x(t) ). (4.2) Our goal now is to obtain a useful upper bound on the Riemannian length ρ[x( ), t,t 1 ] of the central path. Lemma 4.1 (i) We always have If in addition λ(x(t )) 1 2, then (ii) If λ(x(t )) < 1 2, then ρ [ ] [f ( x( ), t,t 1 x(t1 ) ) f ( x(t ) )] ln t 1. (4.3) t ln t 1 = ln f (x(t 1 )) x(t ) t f (x(t )) σ ( x(t ), x(t 1 ) ) + ln 3. (4.4) x(t ) ρ [ ] [f ( x( ), t,t 1 ln 2 + x(t1 ) ) f ( x( t) )] ln t 1 t, (4.5) where t is the largest t [t,t 1 ] such that λ(x(t)) 1 2, and Proof Let ln t 1 ln ( max { 1, 4 f ( x(t 1 ) ) }) ( t x σ x(t ), x(t f 1 ) ) + ln 12. (4.6) φ(t)= f ( x(t) ), r(t)= ρ [ x( ), t,t ], t t t 1. Since from (4.2) x (t) = t 1 [f (x(t))] 1 f (x(t)), wehave φ (t) = f ( x(t) ),x (t) = t 1[ f ( x(t) )] 1 f ( x(t) ),f ( x(t) ) = t 1 λ 2( x(t) ), r (t) = [ f ( x(t) )] x (t), x (t) 1/2 = t 1 [ f ( x(t) )] 1 f ( x(t) ),f ( x(t) ) 1/2 = t 1 λ ( x(t) ), whence by the Cauchy inequality ( ρ 2[ ] t1 x( ), t,t 1 = t 1 λ ( x(t) ) ) 2 dt = ( φ(t 1 ) φ(t ) ) ln t 1, t as required in (4.3). t ( t1 t t 1 λ 2( x(t) ) )( t1 ) dt t 1 dt t

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 16» 76 77 78 79 71 711 712 713 714 715 716 717 718 719 72 721 722 723 724 725 726 727 728 729 73 731 732 733 734 735 736 737 738 739 74 741 742 743 744 745 746 747 748 749 75 751 752 Let us prove now inequality (4.4). Assume that λ(x(t )) 1 2. Since f ( x(t 1 ) ) = t 1 t f ( x(t ) ), using (3.4) and the fact that λ(x(t )) = f (x(t )) x(t ) 1 2, we get ln t 1 = ln f (x(t 1 )) x(t ) t f (x(t )) ln(3 f (x(t 1 )) x(t ) + 1 x(t ) f (x(t )) x(t ) + 1) σ ( x(t ), x(t 1 ) ) + ln 3, as required in (4.4). Now assume that λ(x(t )) < 1 2. By Proposition 2.1(ii), under this assumption f attains its minimum on Q at a unique point x f, the quantity T = max { t : t Δ,λ ( x(t) ) 1 } 2 is well defined and λ(x(t )) = 1 2. Note that the multiplication of vector e in (4.1) by an appropriate constant changes only the scale of the time and does not change the trajectory. Hence, for the sake of notation, we may assume that T = 1. Since λ(x(t )) < 1 2,wehave t < t T = 1. Let us first prove that ρ [ x( ),, 1 ] ln 2. (4.7) Note that e = f (x(1)). Denote by f the Legendre transformation of f. Since f attains it minimum on Q, wehave dom f and since 1 = T Δ, wehave e dom f. Besides this, f is nondegenerate and self-concordant on its domain in view of Proposition 2.1(iv). Thus, e,f (e)e = f ( x(1) ), [ f ( x(1) )] 1 f ( x(1) ) = λ 2( x(1) ) = 1 4, whence, by Proposition 2.1(i) Hence, 1 e,f (te)e λ 2 (x(1)) (1 (1 t)λ(x(1))) 2, t 1. 1 e,f (te)e 1/2 λ(x(1)) dt 1 (1 t)λ(x(1)) dt = ln( 1 λ ( x(1) )) = ln 2. On the other hand, we have f (x(t)) = tf (x(1)), whence f ( x(t) ) x (t) = f ( x(1) ) = t 1 f ( x(t) ),

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 17» 753 754 755 756 757 758 759 76 761 762 763 764 765 766 767 768 769 77 771 772 773 774 775 776 777 778 779 78 781 782 783 784 785 786 787 788 789 79 791 792 793 794 795 796 797 798 799 and consequently ρ [ x( ),, 1 ] = 1 f ( x(t) ) x (t), x (t) 1/2 dt = 1 f ( x(1) ), [ f ( x(t) )] 1 f ( x(1) ) 1/2 dt = 1 ( e,f f ( x(t) )) e 1 1/2 dt = e,f (te)e 1/2 dt ln 2, as required in (4.7). Now let us prove (4.5). In the case of t 1 1 [ T ], inequality (4.5) immediately follows from (4.7). In the case of 1 <t 1, we have t = 1, and ρ [ x( ), t,t 1 ] ρ [ x( ),,t1 ] = ρ [ x( ),, 1 ] + ρ [ x( ), 1,t1 ] ln 2 + ρ [ x( ), 1,t1 ]. Bounding ρ[x( ), 1,t 1 ]=ρ[x( ), t,t 1 ] from above by (4.3), we get (4.5). It remains to prove (4.6). There is nothing to prove if t = t 1. Thus, we may assume that t <t 1, whence, in particular, λ(x( t))= 1 2. In view of the latter observation, we can apply (4.4) and get or, which is the same, ln t 1 = ln f (x(t 1 )) x( t) t f (x( t) σ ( x( t),x(t 1 ) ) + ln 3, x( t) ln t 1 = ln ( 2 f ( x(t 1 ) ) ( t x( t)) σ x( t),x(t1 ) ) + ln 3. (4.8) By (4.7), we have σ ( x(t ), x( t) ) ρ [ x( ), t, t ] ρ [ x( ),, t ] ρ [ x( ),, 1 ] ln 2, (4.9) whence by triangle inequality σ ( x( t),x(t 1 ) ) ln 2 + σ ( x(t ), x(t 1 ) ). Note that by (2.7) 1 f ( x(t 1 ) ) f ( x(t 2 x f 1 ) ) x( t) 2 f ( x(t 1 ) ). x f Combining these relations and (4.8), we come to (4.6). Remark 4.1 Note that in the proof of (4.3), we did not use the fact that f is selfconcordant. We can establish now the O(ν 1/4 )-geodesic property of the central path associated with a self-concordant barrier (see [6] for terminology).

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 18» 8 81 82 83 84 85 86 87 88 89 81 811 812 813 814 815 816 817 818 819 82 821 822 823 824 825 826 827 828 829 83 831 832 833 834 835 836 837 838 839 84 841 842 843 844 845 846 Theorem 4.1 Let f be a nondegenerate ν-self-concordant barrier for cl Q, and let x(t) be the central path given by f ( x(t) ) = te, t t t 1. Then ρ [ ] x( ), t,t 1 ln 2 + ν 1/4 σ ( x(t ), x(t 1 ) )[ σ ( x(t ), x(t 1 ) ) + ln 12 ]. (4.1) If λ(x(t )) 1 2, then ρ [ ] x( ), t,t 1 ν 1/4 σ ( x(t ), x(t 1 ) )[ σ ( x(t ), x(t 1 ) ) + ln 3 ]. (4.11) Proof It suffices to combine (4.3), (4.5), (3.5), (4.4), and (4.6). Recall that the primal-dual central paths are 2-geodesic (see Theorem 5.2 in [6]). 5 Applications In this section, we apply the results of Sect. 4 to the analysis of several short-step interior-point methods. We consider the following three problems: 1. Finding an approximation to the minimizer of a self-concordant function f. 2. Finding a point in a nonempty intersection of a bounded convex open domain Q and an affine plane. We assume that Q is represented by a ν-self-concordant barrier f with dom f = Q, and that the minimizer x f is known. 3. Finding an ɛ-solution to the optimization problem min cl Q c,x, with Q represented in the same way as in item 2. Our main results state that in problems 2 and 3 (as in problem 1, provided that f is a ν-self-concordant barrier), appropriate well-known short-step path-following methods are suboptimal within the factor ν 1/4. Namely, the number of Newton steps, which is required by the methods, coincides, up to a factor O(ν 1/4 ), with the Riemannian distance, between the starting point and the set of solutions. Recall that this distance is a natural lower bound on the number of iterations of the short-step interior-point methods. 5.1 Minimization of a Self-Concordant Function Consider the following problem: min f(x), (5.1) x Q where f is a nondegenerate bounded-below self-concordant function with dom f = Q. Assume that we have a starting point x Q. Let us analyze the efficiency of a short-step path-following method M( x), which traces the central path x(t): f ( x(t) ) = tf ( x), t [, 1]. (5.2)

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 19» 847 848 849 85 851 852 853 854 855 856 857 858 859 86 861 862 863 864 865 866 867 868 869 87 871 872 873 874 875 876 877 878 879 88 881 882 883 884 885 886 887 888 889 89 891 892 893 as t decreases. 2 By Proposition 2.1(ii), dom f, and of course f ( x) dom f. Consequently, the path is well defined (Proposition 2.1(iv)), and x() is the minimizer x f of f(x)over Q. To avoid trivialities, we assume that λ( x) > 1 2, i.e., that x does not belong to the domain of quadratic convergence of the Newton method as applied to f. Then f( x) f(x f ) O(1), f ( x) O(1), σ (x x f f, x) O(1) (5.3) (from now on, O(1) s are positive absolute constants). Indeed, the first inequality is readily given by Proposition 2.1(iii); the second inequality follows from the first one in view of (2.11) applied with x = x,y = x f, and the third inequality follows from the second one in view of (3.4). Using Lemma 4.1(ii) with t =, t 1 = 1, in view of (5.3), we get the following result. Theorem 5.1 Let λ( x) > 1 2. Then the short-step method M( x) justifies the following upper bound: [ N f ( x,x f ) O(1) f( x) f(xf ) ] ln ( 1 + f ( x) ) xf (5.4) O(1) [f( x) f(xf ) ] σ(x f, x). Let us discuss the bound (5.4). 1. The only previously known efficiency estimate for the problem (5.1)was O(1) ( f( x) f(x f ) ) (5.5) Newton steps (recall that we do not assume f to be a self-concordant barrier). The simplest method providing us with this estimate is the usual damped Newton method. However, using a standard argument, it is not difficult to see that the short-step pathfollowing scheme as applied to (5.1) also share the same estimate (5.5). Let us demonstrate that the bound (5.4) is sharper. Indeed, using inequality (2.1), we have f( x) f(x f ) ln ( 1 + f ( x) x f ) 1, so that the bound (5.5) onn f ( x,x f ) follows from the first inequality in (5.4) combined with initial conditions (5.3). Note that the bound (5.4) can be much smaller than (5.5). Example 5.1 Let B n be the unit n-dimensional box: B n = { x R n : x (i) 1,i = 1,...,n }, 2 For a precise definition of the notion of short-step path-following method, see Definition 3.1 in [6].

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 2» 894 895 896 897 898 899 9 91 92 93 94 95 96 97 98 99 91 911 912 913 914 915 916 917 918 919 92 921 922 923 924 925 926 927 928 929 93 931 932 933 934 935 936 937 938 939 94 and f(x)= n i=1 ln(1 (x (i) ) 2 ). Without loss of generality, we may assume that x (i), so that x (i) =[1 ɛ i ] 1/2, for some ɛ i (, 1], i = 1,...,n. Then At the same time, i=1 f( x) = n ln 1. ɛ i i=1 ( f ( x) ) 2 1 n 4( x (i) ) 2 n x = f 2 (1 (x (i) ) 2 ) 2 = 2 1 ɛ i n 1 2n ɛi 2 2 min 1 i n ɛi 2. i=1 ɛ 2 i=1 i It follows that the ratio of the complexity bound (5.4) to the bound (5.5) does not exceed [ O(1)( 1 + max ln n ]/ [ ]) 1/2 n 1 + ln 1, 1 i n ɛ i ɛ i and the latter quantity can be arbitrary close to n 1/2. 2. Consider a particular case of problem (5.1), with f being a ν-self-concordant barrier. In this case, the known complexity estimate for a short-step path-following scheme is O(1) ν ln ( 1 + ν f ( x) ) (5.6) x f (see, for example, [4]). However, from Lemma 2.3, it is clear that the estimate (5.4) is sharper. 3. As it is shown in [6], a natural lower bound on the number of Newton steps in every short-step interior-point method for solving (5.1) is the Riemannian distance from x to x f. In the case when f is a ν-self-concordant barrier, the performance of the path-following scheme, in view of Theorem 4.1 (orinviewof(5.4) combined with (3.5)), is at most O(ν 1/4 ) times worse than this lower bound. 5.2 Finding a Feasible Point Consider the following problem: i=1 Find a point x F ={x Q, Ax = b}, (5.7) where Q is an open and bounded convex set endowed with a ν-self-concordant barrier f(x), and A : x Ax is a linear mapping from E onto a linear space F. Since the mapping A is an onto one, the conjugate mapping A : F E is an embedding. From now on, we assume that problem (5.7) is feasible, and that we know the minimizer x f of f on Q. Without loss of generality, we assume that x f =. Let f be the Legendre transformation of f. Since x f =, we have f () =. (5.8)

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 21» 941 942 943 944 945 946 947 948 949 95 951 952 953 954 955 956 957 958 959 96 961 962 963 964 965 966 967 968 969 97 971 972 973 974 975 976 977 978 979 98 981 982 983 984 985 986 987 Note that f (s) is a self-concordant function with dom f = E such that s,f (s)s ν, s E (5.9) (see [5], Theorem 2.4.2). In order to avoid trivial cases, let us assume that {x : Ax = b} { x : x < 1 } = (5.1) (recall that xf ). Indeed, otherwise a solution to (5.7) can be found by projecting the origin onto the plane Ax = b in the Euclidean metric (see Proposition 2.1(i)). In order to solve (5.7), we can trace the path of minimizers of f on the sets E t, x(t) = argmin f(x), E t ={x Q Ax = tb}, E t as t varies from to 1. Note that this path is well defined: E and E 1 are nonempty and bounded by assumption. Therefore, all E t, t (, 1) are also nonempty and bounded. As it is shown in [6], the path x( ) can be traced by an appropriate short-step path-following sequence, which length is proportional to ρ[x( ),, 1]. Thus, in order to establish the complexity of our problem, we need to find some bounds on the Riemannian length of the path x(t). Observe that tracing x(t) as t varies from 1 to is equivalent to tracing a dual central path y(t) : h ( y(t) ) = th (), (5.11) but with t varying from to 1. This path is associated with a nondegenerate (since A is an embedding) self-concordant function h(y) = f ( A y ) y,b :F R. The relation between the paths x(t) and y(t) is given by the following lemma. Lemma 5.1 For any t [, 1], we have x(1 t)= f ( A y(t) ). (5.12) Proof Indeed, from the origin of x(1 t), it follows that f (x(1 t)) (Ker A). This means that f (x(1 t)) = A y(t) for certain uniquely defined y(t), and this is equivalent to (5.12). Besides this, Ax(1 t) = (1 t)b, whence Af (A y(t)) = (1 t)b,or h ( y(t) ) = Af ( A y(t) ) b = tb = th (), where the concluding equality is readily given by (5.8). We have arrived at (5.11).

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 22» 988 989 99 991 992 993 994 995 996 997 998 999 1 11 12 13 14 15 16 17 18 19 11 111 112 113 114 115 116 117 118 119 12 121 122 123 124 125 126 127 128 129 13 131 132 133 134 It turns out that the Riemannian length ρ[y( ),, 1] of the path y( ) in the Riemannian structure given by h( ) is equal to ρ[x( ),, 1]: ρ [ x( ),, 1 ] 1 = f ( x(t) ) x (t), x (t) 1/2 dt = = = 1 f 1 1 f ( f ( A y(1 t) )) f ( A y(1 t) ) A y (1 t), ( A y(1 t) ) A y (1 t) 1/2 dt A y (1 t),f ( A y(1 t) ) A y (1 t) 1/2 dt h ( y(1 t) ) y (1 t),y (1 t) 1/2 dt = ρ [ y( ),, 1 ]. Observe that by (5.11), y(1) =, while y() is the minimizer of h( ). Inviewofthe latter fact and Lemma 4.1,wehave ρ [ x( ),, 1 ] = ρ [ y( ),, 1 ] [ ] [h ( ) ( )] 1 O(1) ln 2 + y(1) h y() ln, (5.13) t where t is the largest t [, 1] such that λ h (y(t)) 1 2, λ h(y) being the local norm of the gradient of h( ) at y dom h( ) F. Since y(1) =, we have h (y(1)) = b. Moreover, the point z = [ f () ] 1 A [ A [ f () ] 1 A ] 1 b clearly belongs to E 1, so that by (5.1), we have 1 z 2 = b, [ A [ f () ] 1 A ] 1 [ b = b, Af ()A ] 1 b = h (), [ h () ] 1 h () = λ 2 h () = λ2 h( y(1) ). From λ h (y(1)) 1 and λ h (y()) =, it follows that h ( y(1) ) h ( y() ) = h ( y(1) ) min h 1 ln 2, λ h ( t)= 1 2, ln O(1) 1 t (5.14) (see Proposition 2.1). Consequently, (5.13) implies that ρ [ y( ),, 1 ] [h ( ) ( )] 1 y(1) h y() ln t. (5.15)

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 23» 135 136 137 138 139 14 141 142 143 144 145 146 147 148 149 15 151 152 153 154 155 156 157 158 159 16 161 162 163 164 165 166 167 168 169 17 171 172 173 174 175 176 177 178 179 18 181 The next statement expresses the complexity bound (5.15) in terms of f. Theorem 5.2 For every x F E 1 Q, we have ρ [ x( ),, 1 ] [f(x) O(1) f(xf ) ][ O(1) + ln ν + ln f (x) ] xf O(1) ν ln ( O(1)ν f (x) x f ). (5.16) Proof Recall that we have assumed x f =. Observe first that Therefore, using (5.12), we get Thus, h ( y(1) ) = h() = f () = min f = f(). h ( y() ) ( = f A y() ) y(), b = A y(), f ( A y() ) f ( f ( A y() )) y(), b = y(), Ax(1) b f ( x(1) ) = f ( x(1) ). h ( y(1) ) h ( y() ) = f ( x(1) ) min f = f ( x(1) ) f(). (5.17) Q Let us prove now that Q ln 1 t O(1) ln f ( x(1) ) x(1), x(1). (5.18) Denote d = h (y(1)) y(). Then, either d 1, or d<1. In the latter case applying (2.11) with h playing the role of f and x = y(1), y = y(), we get h ( y(1) ) h ( y() ) ln(1 d)+ d 1 d, which combined with the first relation in (5.14) results in ln(1 d)+ d 1 d O(1), whence in any case d O(1). The last conclusion combined with Lemma 4.1(ii), (see (4.6)) implies that ln O(1) ln d. (5.19) 1 t Recalling that y(1) =, h () = b, we get d 2 = h ( y(1) ), [ h ( y() )] 1 h ( y(1) ) = b, [ Af ( A y() ) A ] 1 b = Ax(1), [ A [ f ( x(1) )] 1 A ] 1 Ax(1) f ( x(1) ) x(1), x(1), and (5.18) follows from (5.19).

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 24» 182 183 184 185 186 187 188 189 19 191 192 193 194 195 196 197 198 199 11 111 112 113 114 115 116 117 118 119 111 1111 1112 1113 1114 1115 1116 1117 1118 1119 112 1121 1122 1123 1124 1125 1126 1127 1128 Combining (5.15), (5.17), (5.18), and (2.18), we come to the following inequality ρ [ x( ),, 1 ] [f ( ) ] ( O(1) x(1) f() ln O(1)ν f ( x(1) ) ). (5.2) Let us derive inequality (5.16) from (5.2). Since x(1) is the minimizer of f on F and x() =, for every x F, we have f ( x(1) ) f ( x() ) f(x) f() f (x), x x f (x). We already have mentioned that x ν + 2 ν for every x Q. Thus, the first part of (5.16) is proved. Further, let x F. Applying (2.12) with u = and v = x(1), we have f ( x(1) ) ν 1 π (x(1)) [(2.14) with u =,v= x(1), w = x] ν(1 + ν + 2 ν) 1 π (x) [(2.13) with u =,v= x] ν(ν + 2 ν)(1 + ν + 2 ν) f (x) π (x) O(1)ν 4 f (x), the concluding inequality being given by (5.1) combined with the fact that Q is contained in -ball of the radius ν + 2 ν centered at (Proposition 2.1(v)). Thus, we get the second part of inequality (5.16). Corollary 5.1 Under Assumption (5.1), the Riemannian length of the central path can be bounded from above as ρ [ x( ),, 1 ] O(1)ν 1/4( σ(, F) + ln ν ), (5.21) where σ(x,x) is the infimum of Riemannian lengths of curves starting at a point x and ending at a point from the set X. Proof Let x F. By(3.5), we have f(x) f() νσ(,x), while by (3.4)wehave ln ( 1 + f (x) ) σ(,x). Combining these inequalities and (5.16), we get ρ [ x( ),, 1 ] O(1)ν 1/4( σ(,x)+ ln ν ) ; since the resulting inequality is valid for every x F and σ(,x) O(1) by (5.1), (5.21) follows. Note that in view of Example 1.1 this statement is not valid for an unbounded Q.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 25» 1129 113 1131 1132 1133 1134 1135 1136 1137 1138 1139 114 1141 1142 1143 1144 1145 1146 1147 1148 1149 115 1151 1152 1153 1154 1155 1156 1157 1158 1159 116 1161 1162 1163 1164 1165 1166 1167 1168 1169 117 1171 1172 1173 1174 1175 5.3 Central Path in Standard Minimization Problem Now let us look at the optimization problem min { c,x :x cl Q }, (5.22) where Q is a bounded open convex set endowed with a ν-self-concordant barrier f. Assuming that the minimizer x f of f on Q is available, consider the standard f -generated short-step path-following method for solving (5.22), where one traces the central path x(t) : f ( x(t) ) = tc, t, as t. From the standard complexity bound for the number of Newton steps required to trace the segment <t t t 1 of the path, we can derive that ( N f x(t ), x(t 1 ) ) ( O(1) 1 + ν ln t ) 1. (5.23) On the other hand, Lemma 4.1 combined with approach [6] leads to another bound: N f ( x(t ), x(t 1 ) ) O(1) ( 1 + ρ [ x( ), t,t 1 ]), ρ [ ] [f ( x( ), t,t 1 x(t1 ) ) f ( x(t ) )] ln t 1. t t (5.24) Moreover, the value ρ[x( ), t,t 1 ] is an exact two-side estimate (up to constant factors) for the number of steps of any short-step path following scheme (see [6]). Note that the new bound (5.24) is sharper than the old one. Indeed, by (3.5), we have f ( x(t 1 ) ) f ( x(t ) ) νσ ( x(t ), x(t 1 ) ) νρ [ x( ), t,t 1 ], so that the second inequality in (5.24) says that ρ[x( ), t,t 1 ] ν ln t 1 t. With this upper bound for ρ[x( ), t,t 1 ], the first inequality in (5.24) implies (5.23). As it is shown in Example 5.1, the ratio of the right-hand side of (5.24) to that of (5.23) can be as small as O( 1 ). ν Bound (5.24) allows 1/2 to obtain certain result on suboptimality of the path-following method in the family of all short-step interior point methods associated with the same self-concordant barrier. Namely, consider a segment t [t,t 1 ], t >, t 1 <, of the central path and assume that at certain moment we stand at the point x(t ). Starting from this moment, the path-following method reaches the point x(t 1 ) with the value of the objective c,x(t 1 ) < c,x(t ) in O(ρ[x( ), t,t 1 ]) Newton steps. Let us ask ourselves whether it is possible to reach a point with the value of the objective at least c,x(t 1 ) by a short-step interior point method, associated with f,for which the number of iterations is essentially smaller than O(ρ[x( ), t,t 1 ]). Asitis shownin[6], the number of steps of any competing method is bounded below by O(σ(x(t ), Q t1 )), where Q t1 = { x Q c,x c,x(t 1 ) }.

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 26» 1176 1177 1178 1179 118 1181 1182 1183 1184 1185 1186 1187 1188 1189 119 1191 1192 1193 1194 1195 1196 1197 1198 1199 12 121 122 123 124 125 126 127 128 129 121 1211 1212 1213 1214 1215 1216 1217 1218 1219 122 1221 1222 Thus, the aforementioned question can be posed as follows: (Q) How large could be the ratio ρ[x( ), x(t ), x(t 1 )]/σ (x(t ), Q t1 )? The answer is given by the following theorem. Theorem 5.3 Assume that the ellipsoid W 1 1 (x(t )) does not intersect Q t1. 3 Then ρ [ ] x( ), t,t 1 O(1)ν 1/4 σ ( )[ ( ) ] x(t ), Q t1 σ x(t ), Q t1 + ln ν. (5.25) Proof (1) Let H ={x Q c,x =c,x(t 1 ) }. Since x(t ) Q\Q t1, we clearly have σ ( x(t ), H ) σ ( x(t ), Q t1 ). (5.26) Note that f attains its minimum on H at the point x(t 1 ), so that f ( x(t 1 ) ),x x(t 1 ) =, x H. (5.27) Since f (x(t 1 )) = t 1 c, f (x(t )) = t c and c,x(t 1 ) < c,x(t ),wehavealso f ( x(t ) ),x x(t ) = f ( x(t ) ),x(t 1 ) x(t ), x H. (5.28) (2) Let x H. Since f attains its minimum on H at the point x(t 1 ) and in view of (3.5), we have f ( x(t 1 ) ) f ( x(t ) ) f(x) f ( x(t ) ) νσ ( x(t ), x ). (5.29) Furthermore, by (3.4) f (x) x(t ) + 1 exp{ σ ( x(t ), x )}( f ( x(t ) ) x(t ) + 1) Using (2.12) with u = x(t ) and v = x(t 1 ) we get exp { σ ( x(t ), x )}( ν + 1 ). (5.3) f ( x(t 1 ) ) x(t ) ν 1 π x(t )(x(t 1 )) [(2.14) with u = x(t ), v = x(t 1 ), w = x and (5.27)] ν(1 + ν + 2 ν) 1 π x(t )(x) [(2.13) with u = x(t ), v = x and (5.28)] ν(ν + 2 ν)(1 + ν + 2 ν) f (x) π x(t x(t )(x) ) 3??? O(1)ν 4 f (x) x(t ),

«FOCM 128 layout: Small Extended v.1.2 reference style: focm file: focm919.tex (Michail) aid: 919 doctopic: OriginalPaper class: spr-small-v1.1 v.27/11/26 Prn:5/12/27; 11:4 p. 27» 1223 1224 1225 1226 1227 1228 1229 123 1231 1232 1233 1234 1235 1236 1237 1238 1239 124 1241 1242 1243 1244 1245 1246 1247 1248 1249 125 1251 1252 1253 1254 1255 1256 1257 1258 1259 126 1261 1262 1263 1264 1265 1266 1267 1268 1269 where the concluding inequality follows from the fact that, on one hand, W 1 (x(t )) 1 does not intersect H, whence x x(t ) x(t ) 1 1, and, on the other hand, the x(t )-distance from x(t ) to the boundary of Q in the direction x x(t ) does not exceed ν + 2 ν in view of (5.28) and Proposition 2.1(v). Thus, f ( x(t 1 ) ) x(t ) O(1)ν4 f (x) x(t ) O(1)ν5 exp { σ ( x(t ), x )}, (5.31) the concluding inequality being given by (5.3). Since the ellipsoid W 1 (x(t )) does not intersect Q t1 and x Q t1,wehave 1 σ ( x(t ), x ) O(1). (5.32) (3) Assume first that λ(x(t )) 1 2. Then by Lemma 4.1, wehave ρ [ ] [f ( x( ), t,t 1 x(t1 ) ) f ( x(t ) )] ln f (x(t 1 )) x(t ) f (x(t )) x(t ) [by (5.29) and λ(x(t )) 1 2 ν ] 1/2 σ ( x(t ), x ) ln ( 2 f ( x(t 1 ) ) ) x(t ) [by (5.31) and (5.32)] O(1) ν 1/2 σ ( x(t ), x )( σ ( x(t ), x ) + ln ν ). (5.33) Now let λ(x(t )) 1 2. Then by (2.7), we have 1 2 g x(t ) / g x f 2, g E \{}. (5.34) Now by Lemma 4.1(ii), we have ρ [ ] [f ( x( ), t,t 1 ln 2 + x(t1 ) ) f ( x(t ) )] ln ( max { 1, 4 f ( x(t 1 ) ) }) x f [by (5.29), (5.34)] ln 2 + ν 1/2 σ ( x(t ), x ) ln ( max { 1, 8 f ( x(t 1 ) ) }) x(t ) [by (5.31), (5.32)] ln 2 + O(1) ν 1/2 σ ( x(t ), x )( σ ( x(t ), x ) + ln ν ). Recalling (5.32), we conclude that ρ [ ] x( ), t,t 1 O(1)ν 1/4 σ ( x(t ), x )( σ ( x(t ), x ) + ln ν ). Since x H is arbitrary, we get ρ [ ] x( ), t,t 1 O(1)ν 1/4 σ ( x(t ), H )( σ ( x(t ), H ) + ln ν ). We conclude that the ratio in (Q) is up to logarithmic in ν terms, at most ν 1/4. Acknowledgements suggestions. We would like to thank the anonymous referees for very useful comments and