Nonsymmetric potential-reduction methods for general cones


CORE DISCUSSION PAPER 2006/34

Nonsymmetric potential-reduction methods for general cones

Yu. Nesterov

March 28, 2006

Abstract

In this paper we propose two new nonsymmetric primal-dual potential-reduction methods for conic problems. The methods are based on the primal-dual lifting [5]. This procedure allows one to construct a strictly feasible primal-dual pair related by an exact scaling relation even if the cones are not symmetric. It is important that all necessary elements of our methods can be obtained from the standard solvers for the primal Newton system. The first of the proposed schemes is based on the usual affine-scaling direction. For the second one, we apply a new first-order affine-scaling direction, which incorporates in a symmetric way the gradients of the primal and dual barriers. For both methods we prove the standard $O(\sqrt{\nu} \ln \frac{1}{\epsilon})$ complexity estimate, where $\nu$ is the parameter of the barrier and $\epsilon$ is the required accuracy.

Keywords: convex optimization, conic problems, interior-point methods, potential-reduction methods, self-concordant barriers, self-scaled barriers, affine-scaling direction.

Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (UCL), 34 voie du Roman Pays, 1348 Louvain-la-Neuve, Belgium; e-mail: nesterov@core.ucl.ac.be. The research results presented in this paper have been supported by a grant "Action de recherche concertée ARC 04/09-315" from the Direction de la recherche scientifique - Communauté française de Belgique. The scientific responsibility rests with its author.

1 Introduction

Motivation. In recent years, the main activity in the field of interior-point schemes has been related to symmetric cones (see [1], [8], [9]). The general cones did not attract too much attention, maybe because of the difficulties related to constructing a good and computable self-concordant barrier and its Fenchel transform (see [2], [10]). However, recently in [6] it was shown that our abilities in constructing good barriers for convex sets and for convex cones are basically the same. For example, it is possible to develop self-concordant barriers for conic hulls of epigraphs of many important functions of one variable. The values of the parameters of the proposed barriers vary from three to four. This opens a possibility to solve efficiently by conic interior-point schemes many problems of Separable Optimization.

On the other hand, very often the computation of the Fenchel transform of the primal barrier is not so easy. That was the main motivation for developing in [5] a framework for nonsymmetric interior-point methods for conic problems. The main idea of [5] was to attribute the main part of the computations to the primal problem. For that, we treat an auxiliary problem of finding a point in a close neighborhood of the primal central path as a process of finding a scaling point. Using this point $w$, we can construct a feasible primal-dual pair $z = (x, s)$ satisfying the exact scaling relation $s = \frac{1}{t} F''(w) x$, where $F$ is the primal barrier and $t$ is the current penalty parameter. This operation is called primal-dual lifting. It appears that the point $z$ belongs to a neighborhood of the primal-dual central path. Moreover, using the Hessian $F''(w)$ it is possible to define the affine-scaling direction by exactly the same expression as for a self-scaled barrier. Note that now, even for self-scaled barriers, we obtain the standard search direction applying no special machinery. For all cones, we only need to solve a system of linear equations for the primal Newton direction. In order to perform a prediction step, we need a procedure which can compute a value of the dual barrier. The main advantage of this approach is that computationally it is very cheap. In [5], a nonsymmetric primal-dual path-following scheme based on the primal-dual lifting was analyzed. In this paper we show that this lifting can be used for developing nonsymmetric primal-dual potential-reduction methods.

Contents. The paper is organized as follows. In Section 2 we introduce a primal-dual pair of conic problems and the primal-dual central path. We also present the definition and the main properties of the primal-dual lifting. In Section 3, using the framework of [3], we introduce the first potential-reduction method, based on the affine-scaling direction. In this scheme, we successively apply three stages: the primal correction process, the primal-dual lifting, and the prediction step. At each stage we can guarantee a certain decrease of a penalty potential. It is shown that even the lifting can cause a significant drop in the value of the potential. We also prove an $O(\sqrt{\nu} \ln \frac{1}{\epsilon})$ complexity bound, where $\nu$ is the parameter of the primal self-concordant barrier and $\epsilon$ is the required accuracy. Finally, in Section 4 we propose a first-order affine-scaling direction, which incorporates the gradients of the primal and dual barriers computed at the lifted point. We show that our definition possesses a primal-dual symmetry. At the same time, it ensures the same complexity bound as the standard affine-scaling direction.

Notation and generalities. Let $E$ be a finite-dimensional real vector space with dual space $E^*$. We denote the corresponding scalar product by $\langle s, x \rangle$, where $x \in E$ and $s \in E^*$. If $E = \mathbb{R}^n$, then $E^* = \mathbb{R}^n$ and we use the standard scalar product

$\langle s, x \rangle = \sum_{i=1}^{n} s^{(i)} x^{(i)}, \qquad \|x\| = \langle x, x \rangle^{1/2}, \qquad s, x \in \mathbb{R}^n.$

The actual meaning of the notation $\langle \cdot, \cdot \rangle$ can always be clarified by the space containing the arguments. For a linear operator $A : E \to E_1$ we define its adjoint operator $A^* : E_1^* \to E^*$ in the standard way:

$\langle A x, y \rangle = \langle A^* y, x \rangle, \qquad x \in E, \; y \in E_1^*.$

If $E_1 = E^*$, we can consider self-adjoint operators: $A = A^*$.

Let $Q$ be a closed convex set in $E$. For interior-point methods (IPM), $Q$ must be represented by a self-concordant barrier $F(x)$, $x \in \operatorname{int} Q$, with parameter $\nu$ (see Chapter 4 in [4] for definitions, examples and main results). At any $x \in \operatorname{int} Q$ we use the Hessian $F''(x) : E \to E^*$ for defining the following local Euclidean norms:

$\|h\|_x = \langle F''(x) h, h \rangle^{1/2}, \; h \in E, \qquad \|s\|_x^* = \langle s, [F''(x)]^{-1} s \rangle^{1/2}, \; s \in E^*.$

It is well known that for any $x \in \operatorname{int} Q$ the corresponding Dikin ellipsoid is feasible:

$W(x) = \{ u \in E : \|u - x\|_x \le 1 \} \subseteq Q. \qquad (1.1)$

We often use two important inequalities:

$F(u) \le F(x) + \langle F'(x), u - x \rangle + \omega(r), \qquad (1.2)$

$F''(u) \succeq (1 - r)^2 F''(x), \qquad (1.3)$

where $x \in \operatorname{int} Q$, $r = \|u - x\|_x < 1$, and $\omega(t) = -t - \ln(1 - t)$. Note that the sum of a self-concordant barrier $F(x)$ and a linear function $\langle c, x \rangle$ is a self-concordant function. An iterate of the damped Newton method as applied to $f(x) = \langle c, x \rangle + F(x)$ looks as follows:

$x_+ = x - \frac{[F''(x)]^{-1} f'(x)}{1 + \|f'(x)\|_x^*}, \qquad x \in \operatorname{int} Q. \qquad (1.4)$

It can be shown that

$f(x) - f(x_+) \ge \omega_1( \|f'(x)\|_x^* ), \qquad (1.5)$

where $\omega_1(t) = t - \ln(1 + t)$.

In many applications, the feasible set $Q$ can be represented as an intersection of an affine subspace and a convex cone $K \subset E$. We call $K$ proper if it is a closed pointed cone with nonempty interior. For a proper cone, its dual cone

$K^* = \{ s \in E^* : \langle s, x \rangle \ge 0 \;\; \forall x \in K \}$

is also proper.
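For intuition, the damped Newton step (1.4) and the decrease guarantee (1.5) are easy to verify numerically. The following sketch is not from the paper; it assumes the concrete barrier $F(x) = -\sum_i \ln x^{(i)}$ of the cone $\mathbb{R}^n_+$ (for which $\nu = n$), so that all operations are explicit:

```python
import numpy as np

# Minimal numerical sketch (hypothetical, not from the paper) of the
# damped Newton step (1.4) for f(x) = <c, x> + F(x), using the barrier
# F(x) = -sum(ln x_i) of R^n_+; then F'(x) = -1/x, F''(x) = diag(1/x^2).

def omega(t):   return -t - np.log(1.0 - t)    # omega(t)   = -t - ln(1-t)
def omega_1(t): return  t - np.log(1.0 + t)    # omega_1(t) =  t - ln(1+t)

def damped_newton_step(c, x):
    g = c - 1.0 / x                        # f'(x)
    H_inv_g = x**2 * g                     # [F''(x)]^{-1} f'(x)
    lam = np.sqrt(g @ H_inv_g)             # ||f'(x)||_x^*
    return x - H_inv_g / (1.0 + lam), lam  # step (1.4) stays in int Q

rng = np.random.default_rng(0)
n = 5
c = rng.uniform(1.0, 2.0, n)               # c > 0, so f is bounded below
x = rng.uniform(0.5, 1.5, n)
f = lambda v: c @ v - np.sum(np.log(v))

x_plus, lam = damped_newton_step(c, x)
# (1.5) guarantees f(x) - f(x_+) >= omega_1(lambda):
print(f(x) - f(x_plus), ">=", omega_1(lam))
```

Here omega is included only to exhibit both auxiliary functions of this section; it reappears in (2.14) and in Section 3.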

The natural barriers for cones are logarithmically homogeneous:

$F(\tau x) = F(x) - \nu \ln \tau, \qquad x \in \operatorname{int} K, \; \tau > 0. \qquad (1.6)$

Let us point out some straightforward consequences of this identity:

$F'(\tau x) = \frac{1}{\tau} F'(x), \qquad F''(\tau x) = \frac{1}{\tau^2} F''(x), \qquad (1.7)$

$F''(x) x = -F'(x), \qquad (1.8)$

$\langle F'(x), x \rangle = -\nu, \qquad (1.9)$

$\langle F''(x) x, x \rangle = \nu, \qquad \langle F'(x), [F''(x)]^{-1} F'(x) \rangle = \nu \qquad (1.10)$

(for proofs, see Section 2.3 in [7]). In what follows, we always assume that $F(x)$ is logarithmically homogeneous. It is important that the dual barrier

$F_*(s) = \max_x \{ -\langle s, x \rangle - F(x) : x \in \operatorname{int} K \}, \qquad s \in \operatorname{int} K^*,$

is a $\nu$-self-concordant logarithmically homogeneous barrier for $K^*$. The pair of primal-dual barriers satisfies the following duality relations:

$-F'(x) \in \operatorname{int} K^*, \qquad -F_*'(s) \in \operatorname{int} K, \qquad (1.11)$

$F_*(-F'(x)) = \langle F'(x), x \rangle - F(x) = -\nu - F(x), \qquad F(-F_*'(s)) = -\nu - F_*(s), \qquad (1.12)$

$F_*'(-F'(x)) = -x, \qquad F'(-F_*'(s)) = -s, \qquad (1.13)$

$F_*''(-F'(x)) = [F''(x)]^{-1}, \qquad F''(-F_*'(s)) = [F_*''(s)]^{-1}, \qquad (1.14)$

$F(x) + F_*(s) \ge -\nu - \nu \ln \frac{\langle s, x \rangle}{\nu}, \qquad (1.15)$

and the last inequality is satisfied as an equality if and only if $s = -\tau F'(x)$ for some $\tau > 0$ (see Section 2.4 in [7]).

2 Primal-dual pair of conic problems

Consider the standard conic optimization problem

$f^* = \min_x \{ \langle c, x \rangle : x \in F_P = \{ x \in K : A x = b \} \}, \qquad (2.1)$

where $K \subset E$ is a proper cone, $c \in E^*$, $b \in \mathbb{R}^m$, and the linear operator $A$ maps $E$ to $\mathbb{R}^m$. Then we can write down the dual problem

$\max_{s, y} \{ \langle b, y \rangle : (s, y) \in F_D = \{ (s, y) : s + A^* y = c, \; s \in K^*, \; y \in \mathbb{R}^m \} \}. \qquad (2.2)$

For a feasible primal-dual point $z = (x, s, y)$ the following relation holds:

$0 \le \langle s, x \rangle = \langle c - A^* y, x \rangle = \langle c, x \rangle - \langle A x, y \rangle = \langle c, x \rangle - \langle b, y \rangle. \qquad (2.3)$

In what follows, we always assume the existence of a strictly feasible primal-dual point $(x^0, s^0, y^0)$:

$A x^0 = b, \; x^0 \in \operatorname{int} K, \qquad s^0 + A^* y^0 = c, \; s^0 \in \operatorname{int} K^*. \qquad (2.4)$

In this case, strong duality holds for problems (2.1), (2.2). We assume that the primal cone is endowed with a $\nu$-logarithmically homogeneous self-concordant barrier $F(x)$. Then the barriers $F(x)$ and $F_*(s)$ define the primal-dual central path (see, for example, [3]).

Theorem 1 Under assumption (2.4), the primal-dual central path

$x(t) = \arg\min_x \{ t \langle c, x \rangle + F(x) : A x = b \},$
$y(t) = \arg\max_y \{ t \langle b, y \rangle - F_*(c - A^* y) \}, \qquad t > 0, \qquad (2.5)$
$s(t) = c - A^* y(t),$

is well defined. Moreover, for any $t > 0$ the following identities hold:

$\langle s(t), x(t) \rangle = \langle c, x(t) \rangle - \langle b, y(t) \rangle = \frac{\nu}{t}, \qquad (2.6)$

$F(x(t)) + F_*(s(t)) = -\nu + \nu \ln t, \qquad (2.7)$

$s(t) = -\frac{1}{t} F'(x(t)), \qquad x(t) = -\frac{1}{t} F_*'(s(t)). \qquad (2.8)$

Hence, the optimal values of problems (2.1), (2.2) coincide and their optimal sets are bounded.

Modern IPMs usually work directly with the primal-dual problem

$\min_{x, s, y} \{ \langle c, x \rangle - \langle b, y \rangle : (x, s, y) \in F \}, \qquad F = \{ (x, s, y) : A x = b, \; s + A^* y = c, \; x \in K, \; s \in K^* \}. \qquad (2.9)$

Note that (2.6) justifies the unboundedness of $F$ (consider $t \to 0$). The main advantage of (2.9) lies in the very useful relations (2.6)-(2.8), which allow one to define different global proximity measures for the primal-dual central path. One of the most natural is the functional measure (see [3], [8])

$\Omega(x, s, y) \overset{(1.15)}{=} F(x) + F_*(s) + \nu \ln \frac{\langle s, x \rangle}{\nu} + \nu \overset{(2.3)}{=} F(x) + F_*(s) + \nu \ln \frac{\langle c, x \rangle - \langle b, y \rangle}{\nu} + \nu \ge 0. \qquad (2.10)$

This function vanishes only at the points of the primal-dual central path.
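As a sanity check (a hypothetical example, not taken from the paper), the measure (2.10) can be evaluated in closed form for $K = K^* = \mathbb{R}^n_+$, where $F(x) = -\sum_i \ln x^{(i)}$, $F_*(s) = -\sum_i \ln s^{(i)} - n$ and $\nu = n$; by (1.15) it vanishes exactly when $s = -\tau F'(x)$ for some $\tau > 0$, i.e. when all products $s^{(i)} x^{(i)}$ coincide:

```python
import numpy as np

# Hypothetical sanity check (not from the paper): the proximity measure
# Omega of (2.10) for K = K* = R^n_+, with F(x) = -sum(ln x_i),
# F_*(s) = -sum(ln s_i) - n, and nu = n.

def Omega(x, s):
    n = len(x)
    F  = -np.sum(np.log(x))
    Fs = -np.sum(np.log(s)) - n
    return F + Fs + n * np.log((s @ x) / n) + n

rng = np.random.default_rng(1)
x = rng.uniform(0.5, 2.0, 4)

t = 3.0
s_central = 1.0 / (t * x)          # scaling relation s = -(1/t) F'(x)
print(Omega(x, s_central))         # ~0: the pair is perfectly centered

s_off = s_central * rng.uniform(0.5, 1.5, 4)
print(Omega(x, s_off))             # > 0 off the central path
```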

Recently, in [5] a primal-dual lifting procedure was proposed, which transforms a well-centered point $u \in \operatorname{rint} F_P$ into a well-centered primal-dual point $z_t(u)$. Namely, let us fix a penalty parameter $t > 0$ and a point $u \in \operatorname{rint} F_P$. Consider the directions $\delta = \delta_t(u)$ and $y = y_t(u)$, which provide the unique solution to the following linear system:

$t c + F'(u) + F''(u) \delta = t A^* y, \qquad A \delta = 0. \qquad (2.11)$

Note that this system corresponds to the standard Newton step:

$\delta_t(u) = \arg\min_{A \delta = 0} \left\{ \langle t c + F'(u), \delta \rangle + \frac{1}{2} \langle F''(u) \delta, \delta \rangle \right\}.$

Define the primal-dual lifting $z_t(u) = (x_t(u), s_t(u), y_t(u))$ of the point $u$ as follows:

$x_t(u) = u - \delta_t(u), \qquad s_t(u) = c - A^* y_t(u). \qquad (2.12)$

Note that $x_t(u)$ is formed by a shift from $u$ along a direction pointing away from the central path. Denote $\lambda_t(u) = \|\delta_t(u)\|_u$.

Theorem 2 [5] If $\lambda_t(u) \le \beta < 1$, then $z_t(u) \in \operatorname{rint} F$ satisfies the scaling relations

$s_t(u) = \frac{1}{t} F''(u) x_t(u), \qquad \left\| F'(x_t(u)) - \frac{1}{t} F''(u) F_*'(s_t(u)) \right\|_u^* \le 2 \beta^2. \qquad (2.13)$

Moreover, it is well centered:

$\Omega(z_t(u)) \le 2 \omega(\beta) + \beta^2, \qquad (2.14)$

and its duality gap is bounded as follows:

$\ln \left( 1 - \frac{\beta}{\sqrt{\nu}} \right) \le \frac{1}{2} \ln \left( \frac{t}{\nu} \langle s_t(u), x_t(u) \rangle \right) \le \ln \left( 1 + \frac{\beta}{\sqrt{\nu}} \right). \qquad (2.15)$

Finally, we have the following bounds on the Hessians:

$F''(x_t(u)) \preceq \frac{1}{(1 - \beta)^2} F''(u), \qquad F_*''(s_t(u)) \preceq \frac{t^2}{(1 - \beta)^2} [F''(u)]^{-1}. \qquad (2.16)$

In [5] it was shown how to use the primal-dual lifting in a predictor-corrector path-following scheme. In the next section we apply this procedure in the potential-reduction framework.
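To make the construction concrete, here is a hypothetical numerical sketch (again for $K = \mathbb{R}^n_+$, $F(x) = -\sum_i \ln x^{(i)}$, so $F''(u) = \operatorname{diag}(1/(u^{(i)})^2)$; all names are illustrative). One solve of the primal Newton system (2.11) yields the lifted pair (2.12), and the first scaling relation in (2.13) holds exactly:

```python
import numpy as np

# Hypothetical illustration (not the paper's code) of the primal-dual
# lifting (2.11)-(2.12) for K = R^n_+. The only linear-algebra work is
# one primal Newton (KKT) system solve.

def lift(A, c, u, t):
    m, n = A.shape
    H = np.diag(1.0 / u**2)                   # F''(u)
    # Solve (2.11):  F''(u) delta - t A^T y = -(t c + F'(u)),  A delta = 0.
    K = np.block([[H, -t * A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-(t * c - 1.0 / u), np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    delta, y = sol[:n], sol[n:]
    x = u - delta                             # x_t(u)
    s = c - A.T @ y                           # s_t(u)
    lam = np.sqrt(delta @ (delta / u**2))     # lambda_t(u) = ||delta||_u
    return x, s, y, lam

rng = np.random.default_rng(2)
m, n = 2, 6
A = rng.standard_normal((m, n))
u = rng.uniform(0.5, 1.5, n)                  # strictly feasible primal point
c = A.T @ rng.standard_normal(m) + 1.0 / u    # makes u central at t = 1

t = 1.2
x, s, y, lam = lift(A, c, u, t)
# Exact scaling relation (2.13): s = (1/t) F''(u) x.
print(np.allclose(s, x / u**2 / t), "lambda =", lam)
```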

3 Potential-reduction IPM

One of the most useful functional characteristics of a point $z = (x, s, y) \in \operatorname{rint} F$ is the primal-dual potential

$\Phi(x, s, y) = F(x) + F_*(s) + 2 \nu \ln \langle s, x \rangle \overset{(2.3)}{=} F(x) + F_*(s) + 2 \nu \ln [ \langle c, x \rangle - \langle b, y \rangle ]. \qquad (3.1)$

This potential provides us with an easily computable upper bound for the duality gap:

$\langle c, x \rangle - \langle b, y \rangle \le \frac{1}{\nu} \exp \left( 1 + \frac{1}{\nu} \Phi(z) \right). \qquad (3.2)$

It can be proved [3] that the region of the fastest decrease of $\Phi$ is located in a small neighborhood of the primal-dual central path. Therefore, potential-reduction IPMs attempt to decrease the penalty potential

$P_\gamma(z) = \Phi(z) + \gamma \Omega(z) = (1 + \gamma) P_\gamma^0(z), \qquad \gamma > 0,$

$P_\gamma^0(z) = F(x) + F_*(s) + (\nu + \rho) \ln \langle s, x \rangle + \frac{\gamma \nu}{1 + \gamma} (1 - \ln \nu) = \Omega(z) + \rho \ln \langle s, x \rangle - \frac{\nu}{1 + \gamma} (1 - \ln \nu), \qquad z \in \operatorname{rint} F, \qquad (3.3)$

where $\rho = \frac{\nu}{1 + \gamma} < \nu$. Similarly to (3.2), this implies

$\langle c, x \rangle - \langle b, y \rangle \le \frac{1}{\nu} \exp \left( 1 + \frac{1}{\rho} P_\gamma^0(z) \right). \qquad (3.4)$

Let us discuss different strategies for decreasing the normalized penalty potential $P_\gamma^0(z)$. We are going to present two nonsymmetric primal-dual methods employing mainly the primal barrier $F(x)$, which is assumed to be easily available. We need the following non-restrictive assumption.

Assumption 1 The primal feasible set $F_P$ is bounded.

Note that assumption (2.4) and Theorem 1 guarantee only the boundedness of the optimal set of the primal-dual problem (2.9). Hence, $F_P$ may be unbounded. However, if we know a point $x_0 \in \operatorname{rint} F_P$, then we can modify the initial problem (2.1) as follows:

$\min_{\kappa, x} \{ -\kappa : \langle c, x \rangle + \kappa = \langle c, x_0 \rangle + 1, \; A x = b, \; x \in K, \; \kappa \ge 0 \}. \qquad (3.5)$

This problem has the same structure as (2.1), but now its feasible set is bounded.

The nonsymmetric primal-dual potential-reduction IPM consists of four stages.

0. Initialization. Choose an arbitrary $x_0 \in \operatorname{rint} F_P$ and an estimate $f_0 < f^*$. Since $F_P$ is bounded, the dual feasible set must be unbounded. Therefore, for any $f_0 < f^*$ there exists a point $(s_0, y_0) \in \operatorname{rint} F_D$ such that $f_0 = \langle b, y_0 \rangle$. This point is used only for interpreting our lower bound. Denote

$\psi_k(x) = (\nu + \rho) \ln ( \langle c, x \rangle - f_k ) + F(x), \qquad k \ge 0.$

We need to choose a tolerance parameter $\beta \in (0, 1)$.

1. Primal $k$th stage ($k \ge 0$). Set $u_0 = x_k$, $i = 0$. Repeat:

a) Find the solution $(\delta_i, y_i)$ of the following linear system:

$\frac{\nu + \rho}{\langle c, u_i \rangle - f_k} c + F'(u_i) + F''(u_i) \delta_i = A^* y_i, \qquad A \delta_i = 0. \qquad (3.6)$

Denote $g_i = \frac{\nu + \rho}{\langle c, u_i \rangle - f_k} c + F'(u_i)$.

b) Compute $\lambda_i = \|\delta_i\|_{u_i}$. If $\lambda_i > \beta$, then set $u_{i+1} = u_i + \frac{\delta_i}{1 + \lambda_i}$,

until $\lambda_i \le \beta$.

Note that

$\psi_k(u_i + \delta) - \psi_k(u_i) = (\nu + \rho) \ln \left( 1 + \frac{\langle c, \delta \rangle}{\langle c, u_i \rangle - f_k} \right) + F(u_i + \delta) - F(u_i) \le \langle g_i, \delta \rangle + F(u_i + \delta) - F(u_i).$

Therefore Step b) in (3.6) can be interpreted as a damped Newton step (1.4) with $\lambda_i$ being the local norm of the gradient $g_i$. Hence, in view of (1.5), if $\lambda_i > \beta$, then $\psi_k(u_i) - \psi_k(u_{i+1}) \ge \omega_1(\beta)$. Hence, if the termination criterion is satisfied at step $i_k$, then for any $i$, $0 \le i \le i_k$, we have

$P_\gamma^0(u_i, s_k) \le P_\gamma^0(x_k, s_k) - i \, \omega_1(\beta). \qquad (3.7)$
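A hypothetical sketch of this correction stage for $K = \mathbb{R}^n_+$ follows (the row of ones in $A$ keeps $F_P$ bounded, in the spirit of Assumption 1; all names and data are illustrative):

```python
import numpy as np

# Hypothetical sketch of the k-th primal correction stage (3.6) for
# K = R^n_+, F(x) = -sum(ln x_i): damped Newton steps on
# psi_k(x) = (nu + rho) ln(<c,x> - f_k) + F(x) restricted to {Ax = b}.

def primal_stage(A, c, u, f_k, rho, beta):
    m, n = A.shape
    nu = float(n)
    while True:
        g = (nu + rho) / (c @ u - f_k) * c - 1.0 / u   # gradient g_i
        H = np.diag(1.0 / u**2)                        # F''(u_i)
        # System (3.6): g_i + F''(u_i) delta = A^T y,  A delta = 0.
        K = np.block([[H, -A.T], [A, np.zeros((m, m))]])
        sol = np.linalg.solve(K, np.concatenate([-g, np.zeros(m)]))
        delta = sol[:n]
        lam = np.sqrt(delta @ (delta / u**2))          # ||delta_i||_{u_i}
        if lam <= beta:
            return u, lam                              # termination criterion
        u = u + delta / (1.0 + lam)                    # damped step b)

rng = np.random.default_rng(3)
m, n = 2, 6
A = np.vstack([np.ones(n), rng.standard_normal(n)])   # sum(x) fixed => F_P bounded
u0 = rng.uniform(0.5, 1.5, n)                          # strictly feasible start
c = rng.uniform(1.0, 2.0, n)                           # c > 0 => f* > 0 here
u, lam = primal_stage(A, c, u0, f_k=0.0, rho=np.sqrt(n), beta=0.25)
print(lam)                                             # <= beta on exit
```

Each executed step decreases psi_k by at least omega_1(beta), so on the bounded feasible set the loop terminates.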

2. Primal-dual lifting. We come to this stage after termination of the $k$th primal process with $\lambda_{i_k} \le \beta < 1$. Denote

$t_k = \frac{\nu + \rho}{\langle c, u_{i_k} \rangle - f_k} \overset{(2.3)}{=} \frac{\nu + \rho}{\langle s_k, u_{i_k} \rangle}, \qquad \hat z_k = \left( \hat x_k = x_{t_k}(u_{i_k}), \; \hat s_k = s_{t_k}(u_{i_k}), \; \hat y_k = y_{t_k}(u_{i_k}) \right). \qquad (3.8)$

Then

$P_\gamma^0(\hat z_k) \overset{(3.3)}{=} \Omega(\hat z_k) + \rho \ln \langle \hat s_k, \hat x_k \rangle - \frac{\nu}{1 + \gamma} (1 - \ln \nu)$

$\overset{(2.10)}{\le} P_\gamma^0(u_{i_k}, s_k) + \Omega(\hat z_k) + \rho \ln \frac{\langle \hat s_k, \hat x_k \rangle}{\langle s_k, u_{i_k} \rangle}$

$\overset{(3.8)}{=} P_\gamma^0(u_{i_k}, s_k) + \Omega(\hat z_k) + \rho \ln \frac{t_k \langle \hat s_k, \hat x_k \rangle}{\nu} + \rho \ln \frac{\nu}{\nu + \rho}$

$\overset{(2.14),(2.15)}{\le} P_\gamma^0(u_{i_k}, s_k) + 2 \omega(\beta) + \beta^2 + \frac{2 \beta \rho}{\sqrt{\nu}} - \frac{\rho^2}{\nu + \rho}.$

Thus, we have proved the following inequality:

$P_\gamma^0(\hat z_k) \le P_\gamma^0(u_{i_k}, s_k) - \Delta_1, \qquad \Delta_1 = \frac{\rho^2}{\nu + \rho} - \frac{2 \beta \rho}{\sqrt{\nu}} - 2 \omega(\beta) - \beta^2. \qquad (3.9)$

Note that we do not need to keep $\Delta_1$ positive.

3. Affine-scaling prediction. In accordance with (4.2) in [5], define the affine-scaling direction $\Delta z_k = (\Delta x_k, \Delta s_k, \Delta y_k)$ as the unique solution to the following system of linear equations:

$\Delta s_k + \frac{1}{t_k} F''(u_{i_k}) \Delta x_k = \hat s_k, \qquad A \Delta x_k = 0, \qquad \Delta s_k + A^* \Delta y_k = 0. \qquad (3.10)$

Note that

$\langle \hat s_k, \Delta x_k \rangle + \langle \Delta s_k, \hat x_k \rangle = \langle \hat s_k, \hat x_k \rangle, \qquad (3.11)$

and for any $\alpha \in \left[ 0, \frac{\sqrt{\nu}}{\beta + \sqrt{\nu}} \right)$ we have (see Theorem 3 in [5])

$\Omega(\hat z_k - \alpha \Delta z_k) \le \Omega(\hat z_k) + \alpha \beta^2 \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}} + \omega \left( \alpha \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}} \right). \qquad (3.12)$

Therefore,

$P_\gamma^0(\hat z_k - \alpha \Delta z_k) \overset{(3.11),(3.12)}{\le} P_\gamma^0(\hat z_k) + \rho \ln (1 - \alpha) + \alpha \beta^2 \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}} + \omega \left( \alpha \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}} \right) \le P_\gamma^0(\hat z_k) - \alpha \left( \rho - \beta^2 \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}} \right) + \omega \left( \alpha \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}} \right).$

Denoting $\tau = \alpha \frac{\beta + \sqrt{\nu}}{\sqrt{\nu}}$, we can see that for finding the optimal step size we need to minimize $-\Delta_2 \tau + \omega(\tau)$, where

$\Delta_2 = \frac{\rho \sqrt{\nu}}{\beta + \sqrt{\nu}} - \beta^2. \qquad (3.13)$

Of course, we need to assume that $\Delta_2$ is positive. This gives the optimal $\tau^* = \frac{\Delta_2}{1 + \Delta_2}$. Thus, we conclude that the affine-scaling prediction step ensures at least the following decrease of the normalized penalty potential:

$P_\gamma^0(\hat z_k - \alpha^* \Delta z_k) \le P_\gamma^0(\hat z_k) - \omega_1(\Delta_2), \qquad \alpha^* = \frac{\sqrt{\nu}}{\beta + \sqrt{\nu}} \cdot \frac{\Delta_2}{1 + \Delta_2}. \qquad (3.14)$

Hence, we can form

$z_{k+1} = \hat z_k - \alpha_k \Delta z_k, \qquad f_{k+1} = \langle b, y_{k+1} \rangle, \qquad (3.15)$
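The system (3.10) is again a primal Newton-type system. In the following hypothetical sketch (once more $K = \mathbb{R}^n_+$), the lifted pair is taken exactly on the central path, so that $\hat s = B \hat x$ with $B = \frac{1}{t} F''(u)$ holds by construction, and both the identity (3.11) and the resulting linear shrinkage of the duality gap can be verified:

```python
import numpy as np

# Hypothetical sketch of the affine-scaling prediction system (3.10)
# for K = R^n_+:  Ds + (1/t) F''(u) Dx = s_hat,  A Dx = 0,  Ds = -A^T Dy.

def affine_scaling_direction(A, u, t, s_hat):
    m, n = A.shape
    B = np.diag(1.0 / u**2) / t                 # B = (1/t) F''(u)
    K = np.block([[B, -A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([s_hat, np.zeros(m)]))
    Dx, Dy = sol[:n], sol[n:]
    return Dx, -A.T @ Dy, Dy

rng = np.random.default_rng(4)
m, n = 2, 6
A = np.vstack([np.ones(n), rng.standard_normal(n)])
u = rng.uniform(0.5, 1.5, n)
t = 2.0
x_hat = u                       # lifted pair with lambda_t(u) = 0 ...
s_hat = 1.0 / (t * u)           # ... satisfies s_hat = B x_hat exactly

Dx, Ds, Dy = affine_scaling_direction(A, u, t, s_hat)
print(np.isclose(s_hat @ Dx + Ds @ x_hat, s_hat @ x_hat))   # identity (3.11)
a = 0.3                         # any admissible step size
gap = (s_hat - a * Ds) @ (x_hat - a * Dx)
print(np.isclose(gap, (1 - a) * (s_hat @ x_hat)))           # gap factor (1 - alpha)
```

The second check uses (3.11) together with the orthogonality of Dx and Ds, which holds since A Dx = 0 and Ds lies in the range of A^T.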

where $\alpha_k$ is equal to $\alpha^*$ or to any other positive value ensuring a sufficient decrease of the normalized penalty potential.

Since we do not know in advance how many steps of the process (3.6) we are going to perform between the affine-scaling prediction steps, it is necessary to establish a sufficiently large uniform lower bound on the decrease of the potential at any iteration. That is,

$\Delta = \min \{ \omega_1(\beta), \; \Delta_1 + \omega_1(\Delta_2) \}.$

Taking $\rho = \sqrt{\nu}$, we obtain

$\Delta_1 \overset{(3.9)}{=} \frac{\nu}{\nu + \sqrt{\nu}} - 2 \beta - \beta^2 - 2 \omega(\beta) \ge \frac{1}{2} - 2 \beta - \beta^2 - 2 \omega(\beta), \qquad \Delta_2 \overset{(3.13)}{=} \frac{\nu}{\beta + \sqrt{\nu}} - \beta^2 \ge \frac{\sqrt{\nu}}{1 + \beta} - \beta^2.$

Clearly, for $\beta$ small enough we can make $\Delta_2$ positive and ensure

$\Delta_1 + \omega_1(\Delta_2) \ge \omega_1(\beta) > 0. \qquad (3.16)$

Thus, we have proved the following statement.

Theorem 3 Let $\rho = \sqrt{\nu}$. For any $\beta \in (0, 1)$ satisfying condition (3.16) and ensuring $\Delta_2 > 0$, each step (3.6) or (3.8) with (3.15) ensures a decrease of the normalized penalty potential at least by the value $\omega_1(\beta)$. We get an $\epsilon$-solution of problem (2.9) in at most

$N = \frac{1}{\omega_1(\beta)} \left[ \Omega(z_0) + \sqrt{\nu} \ln \frac{\langle c, x_0 \rangle - f_0}{\epsilon} \right] \qquad (3.17)$

iterations.

Proof: Indeed, we have seen that any step of the potential-reduction method decreases the value of the penalty potential at least by $\omega_1(\beta)$. Denoting by $z_j$ the current primal-dual point after the $j$th iteration of the scheme, we have

$\sqrt{\nu} \ln \langle s_j, x_j \rangle \overset{(3.3)}{\le} P_\gamma^0(z_j) + \frac{\nu}{1 + \gamma} (1 - \ln \nu) \le P_\gamma^0(z_0) + \frac{\nu}{1 + \gamma} (1 - \ln \nu) - \omega_1(\beta) j \overset{(3.3)}{=} \Omega(z_0) + \sqrt{\nu} \ln \langle s_0, x_0 \rangle - \omega_1(\beta) j.$

4 First-order prediction step

Note that in the definition of the affine-scaling direction (3.10) we do not use any new information computed at the points $\hat x_k$ and $\hat s_k$. Let us show that this can be done in a natural way.

Consider a point $u \in \operatorname{rint} F_P$ with $\lambda_t(u) \le \beta < 1$. Denote $x = x_t(u)$ and $s = s_t(u)$. Assume that we are able to compute the gradients of the primal and dual barriers, $F'(x)$ and $F_*'(s)$. Our goal is to ensure a better decrease of the normalized penalty potential $P_\gamma^0(z)$ using this additional information. However, we would like to keep the algebraic complexity of the iteration unchanged. This means that we agree to solve only some variants of the linear system (3.10) with different right-hand sides $g \in E^*$:

$\delta s + \frac{1}{t} F''(u) \delta x = g, \qquad A \delta x = 0, \qquad \delta s + A^* \delta y = 0. \qquad (4.1)$

Denote $B = \frac{1}{t} F''(u)$ and $P = B^{-1/2} A^* [A B^{-1} A^*]^{-1} A B^{-1/2}$. Note that the operator $P$ is a projector: $P = P^2$. It can be easily checked that the solution $\delta z(g) = (\delta x(g), \delta s(g), \delta y(g))$ of system (4.1) is given by the following expressions:

$\delta x(g) = B^{-1/2} (I - P) B^{-1/2} g,$
$\delta s(g) = B^{1/2} P B^{-1/2} g, \qquad (4.2)$
$\delta y(g) = -[A B^{-1} A^*]^{-1} A B^{-1} g.$

Note that for a feasible displacement $\delta z = (\delta x, \delta s, \delta y)$ we have

$P_\gamma^0(z - \delta z) - P_\gamma^0(z) \le -\frac{\nu + \rho}{\langle s, x \rangle} [ \langle s, \delta x \rangle + \langle \delta s, x \rangle ] + F(x - \delta x) - F(x) + F_*(s - \delta s) - F_*(s).$

At the same time,

$F(x - \delta x) - F(x) + \langle F'(x), \delta x \rangle \overset{(1.2)}{\le} \omega( \|\delta x\|_x ) \overset{(2.16)}{\le} \omega \left( \frac{t^{1/2}}{1 - \beta} \langle B \delta x, \delta x \rangle^{1/2} \right),$

$F_*(s - \delta s) - F_*(s) + \langle \delta s, F_*'(s) \rangle \overset{(1.2)}{\le} \omega( \|\delta s\|_s ) \overset{(2.16)}{\le} \omega \left( \frac{t^{1/2}}{1 - \beta} \langle \delta s, B^{-1} \delta s \rangle^{1/2} \right).$

Since $\omega(\xi) + \omega(\tau) \le \omega( [\xi^2 + \tau^2]^{1/2} )$ for any $\xi, \tau \ge 0$ with $\xi^2 + \tau^2 < 1$ (see inequality (4.8) in [5]), we conclude that

$P_\gamma^0(z - \delta z) - P_\gamma^0(z) \le -\left\langle \frac{\nu + \rho}{\langle s, x \rangle} s + F'(x), \delta x \right\rangle - \left\langle \delta s, \frac{\nu + \rho}{\langle s, x \rangle} x + F_*'(s) \right\rangle + \omega \left( \frac{t^{1/2}}{1 - \beta} \left[ \langle B \delta x, \delta x \rangle + \langle \delta s, B^{-1} \delta s \rangle \right]^{1/2} \right). \qquad (4.3)$

Hence, it seems reasonable to compute the prediction direction as the solution to the following minimization problem:

$\min_g \left\{ -\left\langle \frac{\nu + \rho}{\langle s, x \rangle} s + F'(x), \delta x(g) \right\rangle - \left\langle \delta s(g), \frac{\nu + \rho}{\langle s, x \rangle} x + F_*'(s) \right\rangle + \frac{1}{2} \langle B \delta x(g), \delta x(g) \rangle + \frac{1}{2} \langle \delta s(g), B^{-1} \delta s(g) \rangle \right\}. \qquad (4.4)$
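The closed-form expressions (4.2) are easy to validate numerically. A hypothetical check for $K = \mathbb{R}^n_+$ (diagonal $B$, so $B^{1/2}$ is explicit; all names illustrative):

```python
import numpy as np

# Hypothetical check (K = R^n_+ again) that the expressions (4.2) built
# from the projector P = B^{-1/2} A^T [A B^{-1} A^T]^{-1} A B^{-1/2}
# solve system (4.1):  ds + B dx = g,  A dx = 0,  ds + A^T dy = 0.

rng = np.random.default_rng(4)
m, n = 2, 5
A = rng.standard_normal((m, n))
u = rng.uniform(0.5, 1.5, n)
t = 2.0
B = np.diag(1.0 / u**2) / t                   # B = (1/t) F''(u)
B_inv = np.linalg.inv(B)

Bh  = np.diag((1.0 / u) / np.sqrt(t))         # B^{1/2} (diagonal case)
Bih = np.linalg.inv(Bh)                       # B^{-1/2}
P = Bih @ A.T @ np.linalg.inv(A @ B_inv @ A.T) @ A @ Bih
print(np.allclose(P, P @ P))                  # P is a projector: P = P^2

g = rng.standard_normal(n)
dx = Bih @ (np.eye(n) - P) @ Bih @ g          # dx(g) = B^{-1/2}(I-P)B^{-1/2} g
ds = Bh @ P @ Bih @ g                         # ds(g) = B^{1/2} P B^{-1/2} g
dy = -np.linalg.solve(A @ B_inv @ A.T, A @ B_inv @ g)

print(np.allclose(ds + B @ dx, g),            # first equation of (4.1)
      np.allclose(A @ dx, np.zeros(m)),       # dx in the null space of A
      np.allclose(ds, -A.T @ dy))             # ds in the range of A^T
```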

It appears that the solution of this problem can be obtained by solving the linear system (4.1) twice with different right-hand sides.

Theorem 4 The solution $\delta z^* = (\delta x^*, \delta s^*, \delta y^*)$ of problem (4.4) can be obtained in the following way:

1. Compute $g_0 = \delta s( F'(x) - B F_*'(s) )$.

2. Define $g^* = \frac{\nu + \rho}{\langle s, x \rangle} s + F'(x) - g_0$. $\qquad (4.5)$

3. Compute $\delta z^* = \delta z(g^*)$.

Moreover, for the optimal predictor direction $\delta z^*$ we have

$\left\langle \frac{\nu + \rho}{\langle s, x \rangle} s + F'(x), \delta x^* \right\rangle + \left\langle \delta s^*, \frac{\nu + \rho}{\langle s, x \rangle} x + F_*'(s) \right\rangle = \langle B \delta x^*, \delta x^* \rangle + \langle \delta s^*, B^{-1} \delta s^* \rangle = \langle g^*, B^{-1} g^* \rangle \ge \frac{1}{\langle s, x \rangle} \left[ \rho - \beta^2 (\beta + \sqrt{\nu}) \right]^2, \qquad (4.6)$

where the last inequality is valid for $\rho \ge \beta^2 (\beta + \sqrt{\nu})$.

Proof: First of all, note that the quadratic terms in the objective function of problem (4.4) can be written as follows:

$\langle B \delta x(g), \delta x(g) \rangle + \langle \delta s(g), B^{-1} \delta s(g) \rangle = \langle \delta s(g) + B \delta x(g), B^{-1} ( \delta s(g) + B \delta x(g) ) \rangle = \langle g, B^{-1} g \rangle.$

Denoting now $w = B^{-1/2} g$, $\hat s = \frac{\nu + \rho}{\langle s, x \rangle} s$, and $\hat x = \frac{\nu + \rho}{\langle s, x \rangle} x$, we get the following minimization problem:

$\min_w \left\{ -\langle \hat s + F'(x), B^{-1/2} (I - P) w \rangle - \langle B^{1/2} P w, \hat x + F_*'(s) \rangle + \frac{1}{2} \|w\|^2 \right\}. \qquad (4.7)$

Since $\hat s = B \hat x$, its solution can be represented as follows:

$w^* = (I - P) B^{-1/2} ( \hat s + F'(x) ) + P B^{1/2} ( \hat x + F_*'(s) ) = B^{-1/2} \hat s + (I - P) B^{-1/2} F'(x) + P B^{1/2} F_*'(s) = B^{-1/2} \left[ \hat s + F'(x) - B^{1/2} P B^{-1/2} ( F'(x) - B F_*'(s) ) \right].$

Hence, the optimal $g = g^* = \hat s + F'(x) - g_0$ with $g_0 \overset{(4.2)}{=} \delta s( F'(x) - B F_*'(s) )$. Further, the first two equalities in (4.6) follow from the form of the objective function in (4.7). Let us prove the remaining inequality. Note that

$g^* = \hat s + \frac{1}{2} [ F'(x) + B F_*'(s) ] + \frac{1}{2} [ F'(x) - B F_*'(s) ] - B^{1/2} P B^{-1/2} [ F'(x) - B F_*'(s) ] = \hat s + g_1 + g_2,$

$g_1 = \frac{1}{2} [ F'(x) + B F_*'(s) ], \qquad g_2 = \frac{1}{2} B^{1/2} (I - 2P) B^{-1/2} [ F'(x) - B F_*'(s) ],$

and the vector $g_2$ is not too big:

$\|g_2\|_u^* = \frac{1}{2} \left\| [F''(u)]^{1/2} (I - 2P) [F''(u)]^{-1/2} [ F'(x) - B F_*'(s) ] \right\|_u^* = \frac{1}{2} \left\| (I - 2P) [F''(u)]^{-1/2} [ F'(x) - B F_*'(s) ] \right\| = \frac{1}{2} \left\| [F''(u)]^{-1/2} [ F'(x) - B F_*'(s) ] \right\| = \frac{1}{2} \| F'(x) - B F_*'(s) \|_u^* \overset{(2.13)}{\le} \beta^2.$

Hence, $\|g^*\|_u^* \ge \|\hat s + g_1\|_u^* - \beta^2$. At the same time,

$t \, ( \|\hat s + g_1\|_u^* )^2 = t \frac{(\nu + \rho)^2}{\langle s, x \rangle^2} \langle s, [F''(u)]^{-1} s \rangle + 2 t \frac{\nu + \rho}{\langle s, x \rangle} \langle g_1, [F''(u)]^{-1} s \rangle + t \, ( \|g_1\|_u^* )^2 \overset{(2.13)}{=} \frac{(\nu + \rho)^2}{\langle s, x \rangle} + 2 \frac{\nu + \rho}{\langle s, x \rangle} \langle g_1, x \rangle + t \, ( \|g_1\|_u^* )^2.$

Since

$2 \langle g_1, x \rangle = \langle F'(x) + B F_*'(s), x \rangle \overset{(2.13)}{=} \langle F'(x), x \rangle + \langle s, F_*'(s) \rangle \overset{(1.9)}{=} -2 \nu,$

we get the equality

$t \, ( \|\hat s + g_1\|_u^* )^2 = \frac{\rho^2 - \nu^2}{\langle s, x \rangle} + t \, ( \|g_1\|_u^* )^2.$

Note that

$t \langle s, x \rangle ( \|g_1\|_u^* )^2 \overset{(2.13)}{=} \|x\|_u^2 \, ( \|g_1\|_u^* )^2 \ge \langle g_1, x \rangle^2 = \frac{1}{4} \langle F'(x) + B F_*'(s), x \rangle^2 \overset{(2.13)}{=} \frac{1}{4} [ \langle F'(x), x \rangle + \langle s, F_*'(s) \rangle ]^2 \overset{(1.9)}{=} \nu^2.$

Therefore,

$t^{1/2} \|g^*\|_u^* \ge \frac{\rho}{\langle s, x \rangle^{1/2}} - \beta^2 t^{1/2} \overset{(2.15)}{\ge} \frac{1}{\langle s, x \rangle^{1/2}} \left[ \rho - \beta^2 ( \beta + \sqrt{\nu} ) \right].$

Since $t \, ( \|g^*\|_u^* )^2 = \langle g^*, B^{-1} g^* \rangle$, we obtain (4.6).

Note that for self-scaled barriers we have $g_0 = 0$ in (4.5).

Let us now investigate the efficiency of the step along the direction $\delta z^*$. In view of inequality (4.3) and relation (4.6), we have

$P_\gamma^0(z - \alpha \delta z^*) - P_\gamma^0(z) \le -\alpha \langle g^*, B^{-1} g^* \rangle + \omega \left( \alpha \frac{t^{1/2}}{1 - \beta} \langle g^*, B^{-1} g^* \rangle^{1/2} \right) \qquad \left( \tau = \alpha \frac{t^{1/2}}{1 - \beta} \langle g^*, B^{-1} g^* \rangle^{1/2} \right) \qquad = -\tau \frac{1 - \beta}{t^{1/2}} \langle g^*, B^{-1} g^* \rangle^{1/2} + \omega(\tau).$

Thus, the optimal step is

$\tau^* = \frac{(1 - \beta) \langle g^*, B^{-1} g^* \rangle^{1/2}}{t^{1/2} + (1 - \beta) \langle g^*, B^{-1} g^* \rangle^{1/2}}, \qquad \alpha^* = \frac{(1 - \beta) \tau^*}{t^{1/2} \langle g^*, B^{-1} g^* \rangle^{1/2}},$

and we prove the following statement.

Theorem 5 Assume that $(1 - \beta) \rho \ge \beta^2 (\beta + \sqrt{\nu})$. Then, with the step size $\alpha^*$, the decrease of the normalized penalty potential along the direction $\delta z^*$ can be estimated as follows:

$P_\gamma^0(z) - P_\gamma^0(z - \alpha^* \delta z^*) \ge \omega_1(\tau^*) \ge \omega_1 \left( \frac{(1 - \beta) \rho - \beta^2 (\beta + \sqrt{\nu})}{(1 - \beta^2)(\beta + \sqrt{\nu}) + (1 - \beta) \rho} \right). \qquad (4.8)$

Proof: Indeed, in view of inequality (4.6) we have

$\tau^* = \frac{(1 - \beta) \langle g^*, B^{-1} g^* \rangle^{1/2}}{t^{1/2} + (1 - \beta) \langle g^*, B^{-1} g^* \rangle^{1/2}} \ge \frac{(1 - \beta) \rho - \beta^2 (\beta + \sqrt{\nu})}{t^{1/2} \langle s, x \rangle^{1/2} + (1 - \beta) \rho - \beta^2 (\beta + \sqrt{\nu})} \overset{(2.15)}{\ge} \frac{(1 - \beta) \rho - \beta^2 (\beta + \sqrt{\nu})}{(1 - \beta^2)(\beta + \sqrt{\nu}) + (1 - \beta) \rho}.$

Thus, we can see that the prediction direction $\delta z^*$ can be used at Step 3 of the potential-reduction scheme. Its complexity estimate is similar to (3.17). However, this direction can be better than the standard affine-scaling direction (3.10) since it takes into account the gradients of the barrier functions. Of course, the final conclusion about the quality of these directions can be derived only from intensive computational testing.

To conclude, let us shortly discuss reasonable strategies for choosing the parameters in the new potential-reduction schemes. The most important parameter is, of course, $\rho$. From the viewpoint of the worst-case complexity analysis, we need to choose $\rho = \kappa \sqrt{\nu}$, where $\kappa$ is an absolute constant. Then the impact of all three stages of the scheme is balanced and we can guarantee a constant decrease of the normalized penalty potential at any step of any stage. However, note that it is possible to choose, for example, $\rho = 2 \nu$. Then the primal-dual lifting decreases the potential by $O(\rho)$ (see (3.9)). This means that for problems with an easy correction phase, we can gain a lot by increasing $\rho$. Unfortunately, up to now we cannot convert this reasoning into a complexity bound.
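To illustrate Theorem 4, the following hypothetical sketch (once more for $K = \mathbb{R}^n_+$, with the lifted pair taken exactly on the central path so that $s = B x$) carries out the three steps of (4.5) with two solves of system (4.1) and verifies the first identity of (4.6):

```python
import numpy as np

# Hypothetical sketch of the three-step computation (4.5) of g*
# (K = R^n_+, F'(x) = -1/x, F_*'(s) = -1/s, B = (1/t) F''(u)).
# It needs exactly two solves of the linear system (4.1).

def solve_41(A, B, g):
    """Return (dx, ds) with ds + B dx = g, A dx = 0, ds in range(A^T)."""
    m, n = A.shape
    K = np.block([[B, -A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([g, np.zeros(m)]))
    return sol[:n], -A.T @ sol[n:]

rng = np.random.default_rng(5)
m, n = 2, 5
A = np.vstack([np.ones(n), rng.standard_normal(n)])
u = rng.uniform(0.5, 1.5, n); t = 2.0
B = np.diag(1.0 / u**2) / t
x = u
s = B @ x                             # scaling relation (2.13): s = B x
nu, rho = float(n), np.sqrt(n)
Fp, Fsp = -1.0 / x, -1.0 / s          # F'(x), F_*'(s)
s_hat = (nu + rho) / (s @ x) * s
x_hat = (nu + rho) / (s @ x) * x

_, g0 = solve_41(A, B, Fp - B @ Fsp)  # Step 1: g0 = ds(F'(x) - B F_*'(s))
g_star = s_hat + Fp - g0              # Step 2
dx, ds = solve_41(A, B, g_star)       # Step 3: dz* = dz(g*)

# First identity of (4.6): the linear decrease term equals <g*, B^{-1} g*>.
lin = (s_hat + Fp) @ dx + ds @ (x_hat + Fsp)
print(np.isclose(lin, g_star @ np.linalg.solve(B, g_star)))
```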

References

[1] F. Alizadeh. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM Journal on Optimization, 5(1), 13-51 (1995).

[2] R. W. Freund, F. Jarre, and S. Schaible. On self-concordant barrier functions for conic hulls and fractional programming. Mathematical Programming, 74, 237-246 (1996).

[3] Yu. Nesterov. Long-step strategies in interior-point primal-dual methods. Mathematical Programming, 76(1), 47-94 (1996).

[4] Yu. Nesterov. Introductory Lectures on Convex Optimization. Kluwer, Boston, 2004.

[5] Yu. Nesterov. Towards nonsymmetric conic optimization. CORE Discussion Paper #2006/28 (2006).

[6] Yu. Nesterov. Constructing self-concordant barriers for convex cones. CORE Discussion Paper #2006/30 (2006).

[7] Yu. Nesterov and A. Nemirovsky. Interior Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia, 1994.

[8] Yu. Nesterov and M. J. Todd. Primal-dual interior-point methods for self-scaled cones. SIAM Journal on Optimization, 8, 324-364 (1998).

[9] M. J. Todd, K. C. Toh, and R. H. Tutuncu. On the Nesterov-Todd direction in semidefinite programming. SIAM Journal on Optimization, 8, 769-796 (1998).

[10] G. Xue and Y. Ye. An efficient algorithm for minimizing a sum of p-norms. SIAM Journal on Optimization, 10(2), 551-579 (1998).