Full Newton step polynomial time methods for LO based on locally self-concordant barrier functions


Full Newton step polynomial time methods for LO based on locally self-concordant barrier functions
(work in progress)

Kees Roos and Hossein Mansouri
e-mail: [C.Roos,H.Mansouri]@ewi.tudelft.nl
URL: http://www.isa.ewi.tudelft.nl/~roos

Georgia Tech, Atlanta, GA
November 21, A.D. 2005

Outline

Self-concordant (barrier) functions
  Definitions
  Newton step and proximity measure
  Algorithm with full Newton steps
  Complexity analysis
Minimization of a linear function over a convex domain
  Algorithm with full Newton steps
  Complexity analysis
Kernel-function-based approach
  Linear optimization via self-dual embedding
  Central path of the self-dual problem
  Kernel-function-based barrier functions
  Complexity results
Local self-concordance of kernel-function-based barrier functions
Analysis of the full Newton step method
Concluding remarks
Some references

Self-concordant univariate functions

We start by considering a univariate function $\varphi : D \to \mathbb{R}$. The domain $D$ of the function $\varphi$ must be an open interval in $\mathbb{R}$. One calls $\varphi$ a $\kappa$-self-concordant (SC) function if there exists a nonnegative number $\kappa$ such that
\[ \bigl|\varphi'''(x)\bigr| \le 2\kappa\,\bigl(\varphi''(x)\bigr)^{3/2}, \qquad x \in D. \tag{1} \]
Note that this definition assumes that $\varphi''(x)$ is nonnegative, whence $\varphi$ is convex, and moreover that $\varphi$ is three times differentiable. Moreover, if $\varphi''(x) > 0$ for all $x \in D$, then $\varphi$ is SC if and only if
\[ \frac{\varphi'''(x)^2}{\varphi''(x)^3} \]
is bounded above (by $4\kappa^2$).

Self-concordant (multivariate) functions

Let $\phi : D \to \mathbb{R}$ be a strictly convex function, where the domain $D$ is an open convex subset of $\mathbb{R}^n$, with $n > 1$. So $\phi$ is a multivariate function. Then $\phi$ is called a $\kappa$-SC function if its restriction to an arbitrary line in its domain is $\kappa$-SC. In other words, $\phi$ is $\kappa$-SC if and only if $\varphi(t) = \phi(x + th)$ is $\kappa$-SC for all $x \in D$ and all $h \in \mathbb{R}^n$. The domain of $\varphi(t)$ is defined in the natural way: given $x$ and $h$ it consists of all $t$ such that $x + th \in D$. We want to find the minimal value of $\phi$ on its domain (if it exists) by Newton's method.

Newton step and proximity measure

Let $\phi : D \to \mathbb{R}$ be a strictly convex $\kappa$-SC function having a minimizer, and such that the minimal value equals $0$. The Newton step at $x$ is defined by
\[ \Delta x = -H(x)^{-1} g(x), \tag{2} \]
where $g(x)$ and $H(x)$ denote the gradient and the Hessian of $\phi$ at $x$, respectively. In the sequel we always assume that $\phi$ is strictly convex. As a consequence, the quantity
\[ \lambda(x) = \sqrt{\Delta x^T H(x)\,\Delta x} = \|\Delta x\|_{H(x)} = \sqrt{g(x)^T H(x)^{-1} g(x)} \]
can be used as a measure for the distance of $x$ to the minimizer of $\phi$. The quantity $\lambda(x)$ plays a crucial role in the analysis of Newton's method. Many results can be nicely expressed by using the univariate (nonnegative) function $\omega(t)$ defined by
\[ \omega(t) = t - \ln(1 + t), \qquad t > -1. \tag{3} \]
For example, if $\lambda(x) < \frac{1}{\kappa}$ then one has
\[ \phi(x) \le \frac{-\kappa\lambda(x) - \ln\bigl(1 - \kappa\lambda(x)\bigr)}{\kappa^2} = \frac{\omega\bigl(-\kappa\lambda(x)\bigr)}{\kappa^2}. \]
Hence, since $\omega(t)$ is monotonically decreasing for $t \in (-1, 0]$, we obtain
\[ \lambda(x) \le \frac{1}{4\kappa} \;\Longrightarrow\; \phi(x) \le \frac{\omega\bigl(-\tfrac14\bigr)}{\kappa^2} = \frac{0.0376821}{\kappa^2} \le \frac{1}{26\kappa^2}. \]
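As a small numerical illustration (ours, not from the talk), the Newton step and the proximity measure can be computed directly. The helper below and the toy function $\phi(x) = \sum_i (x_i - \ln x_i - 1)$, which is $1$-SC with minimizer $x = \mathbf 1$ and minimal value $0$, are illustrative assumptions.

import numpy as np

def newton_quantities(grad, hess, x):
    """Newton step and proximity measure lambda(x) = ||Delta x||_{H(x)}."""
    g, H = grad(x), hess(x)
    dx = -np.linalg.solve(H, g)          # Delta x = -H(x)^{-1} g(x)
    lam = np.sqrt(dx @ (H @ dx))         # lambda(x) = sqrt(dx^T H(x) dx)
    return dx, lam

# Toy 1-SC function phi(x) = sum(x_i - ln x_i - 1): our stand-in, not the
# barrier used later in the talk.
grad = lambda x: 1.0 - 1.0 / x
hess = lambda x: np.diag(1.0 / x**2)

x = np.array([0.8, 1.3, 2.0])
dx, lam = newton_quantities(grad, hess, x)
print(lam, x + dx)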

Quadratic convergence result

A major result in the theory of self-concordant functions states that the Newton process is quadratically convergent if $3\kappa\lambda(x) < 1$. This is because of the following result.

Lemma 1. If $\kappa\lambda(x) < 1$ then $x + \Delta x \in D$ and
\[ \lambda(x + \Delta x) \le \kappa\left(\frac{\lambda(x)}{1 - \kappa\lambda(x)}\right)^2. \]

Corollary 1. If $3\kappa\lambda(x) < 1$ then $x + \Delta x \in D$ and
\[ \lambda(x + \Delta x) \le \left(\tfrac{3}{2}\,\lambda(x)\sqrt{\kappa}\right)^2. \]

Algorithm with full Newton steps

Assuming that we know a point $x \in D$ with $\lambda(x) \le \frac{1}{3\kappa}$, we can easily obtain a point $x \in D$ such that $\lambda(x) \le \epsilon$, for prescribed $\epsilon > 0$, with the following algorithm.

Input:
  an accuracy parameter $\epsilon \in (0,1)$;
  $x \in D$ such that $\lambda(x) \le \frac{1}{3\kappa}$.
while $\lambda(x) \ge \epsilon$ do
  $x := x + \Delta x$
endwhile

Theorem 1. Let $x \in D$ and $\lambda(x) \le \frac{1}{3\kappa}$. Then the algorithm with full Newton steps requires at most
\[ \log_2\frac{\log\epsilon}{\log\frac34} \quad\left(\le\ \log_2\!\left(3.5\,\log\frac1\epsilon\right)\right) \]
iterations. The output is a point $x \in D$ such that $\lambda(x) \le \epsilon$.
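A direct transcription of this loop (our sketch, reusing the illustrative newton_quantities helper from the previous snippet; the max_iter cap is only a safeguard):

def full_newton(grad, hess, x, eps, max_iter=50):
    """Repeat full Newton steps until the proximity measure drops below eps.
    Assumes lambda(x) <= 1/(3*kappa) at the start, as in Theorem 1."""
    dx, lam = newton_quantities(grad, hess, x)
    it = 0
    while lam >= eps and it < max_iter:
        x = x + dx                        # full (undamped) Newton step
        dx, lam = newton_quantities(grad, hess, x)
        it += 1
    return x, lam, it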

Minimization of a linear function over a convex domain

We consider the problem of minimizing a linear function over a closed convex domain $\mathcal{D}$:
\[ (P)\qquad \min\,\{\, c^T x : x \in \mathcal{D} \,\}. \]
We assume that we have a self-concordant barrier function $\phi : D \to \mathbb{R}$, where $D = \operatorname{int}\mathcal{D}$, and also that $H(x) = \nabla^2\phi(x)$ is positive definite for every $x \in D$. For $\mu > 0$ we define
\[ \phi_\mu(x) := \frac{c^T x}{\mu} + \phi(x), \qquad x \in D, \]
\[ (P_\mu)\qquad \inf\,\{\, \phi_\mu(x) : x \in D \,\}. \]
We have
\[ g_\mu(x) := \nabla\phi_\mu(x) = \frac{c}{\mu} + \nabla\phi(x) = \frac{c}{\mu} + g(x), \qquad H_\mu(x) := \nabla^2\phi_\mu(x) = \nabla^2\phi(x) = H(x), \qquad \nabla^3\phi_\mu(x) = \nabla^3\phi(x). \]
Note that the two higher derivatives do not depend on $\mu$. It follows that $\phi_\mu(x)$ is self-concordant. The minimizer of $\phi_\mu(x)$, if it exists, is denoted as $x(\mu)$. When $\mu$ runs through all positive numbers, $x(\mu)$ runs through the central path of $(P)$. We expect that $x(\mu)$ converges to an optimal solution of $(P)$ when $\mu$ approaches $0$. Therefore we are going to follow the central path. This approach is likely to be feasible because $\phi_\mu(x)$ is self-concordant, so its minimizer can be computed efficiently.

Newton step and proximity measure

\[ \phi_\mu(x) := \frac{c^T x}{\mu} + \phi(x), \qquad x \in D, \]
\[ g_\mu(x) := \nabla\phi_\mu(x) = \frac{c}{\mu} + \nabla\phi(x) = \frac{c}{\mu} + g(x), \qquad H_\mu(x) := \nabla^2\phi_\mu(x) = \nabla^2\phi(x) = H(x), \qquad \nabla^3\phi_\mu(x) = \nabla^3\phi(x). \]
The Newton step at $x$ is now given by $\Delta x = -H(x)^{-1} g_\mu(x)$, and the distance of $x \in D$ to the $\mu$-center $x(\mu)$ is measured by the quantity
\[ \lambda_\mu(x) = \sqrt{\Delta x^T H(x)\,\Delta x} = \sqrt{g_\mu(x)^T H(x)^{-1} g_\mu(x)} = \|g_\mu(x)\|_{H^{-1}}. \]

Effect of a µ-update and the barrier parameter ν

Let $\lambda = \lambda_\mu(x)$ and $\mu^+ = (1-\theta)\mu$. Our aim is to estimate $\lambda_{\mu^+}(x)$. We have
\[ g_{\mu^+}(x) = \frac{c}{\mu^+} + \nabla\phi(x) = \frac{c}{(1-\theta)\mu} + \nabla\phi(x) = \frac{1}{1-\theta}\left(\frac{c}{\mu} + \nabla\phi(x) - \theta\,\nabla\phi(x)\right) = \frac{1}{1-\theta}\bigl(g_\mu(x) - \theta\,g(x)\bigr). \]
Hence, denoting $H(x)$ shortly as $H$,
\[ \lambda_{\mu^+}(x) = \frac{1}{1-\theta}\,\bigl\|g_\mu(x) - \theta\,g(x)\bigr\|_{H^{-1}} \le \frac{1}{1-\theta}\Bigl(\underbrace{\|g_\mu(x)\|_{H^{-1}}}_{\lambda_\mu(x)} + \theta\,\|g(x)\|_{H^{-1}}\Bigr). \]

Definition 1. Let $\nu \ge 0$. The self-concordant barrier function $\phi$ is called a $\nu$-barrier if
\[ \lambda(x)^2 = \|g(x)\|_{H^{-1}}^2 \le \nu, \qquad x \in D. \]
An immediate consequence of this definition is

Lemma 2. If $\phi$ is a self-concordant $\nu$-barrier then
\[ \lambda_{\mu^+}(x) \le \frac{\lambda_\mu(x) + \theta\sqrt{\nu}}{1-\theta}. \]
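A standard example, added here for concreteness: for the logarithmic barrier of the nonnegative orthant the barrier parameter equals $n$:
\[ \phi(x) = -\sum_{i=1}^n \ln x_i, \qquad g(x) = -x^{-1}, \qquad H(x) = \operatorname{diag}\bigl(x^{-2}\bigr), \]
\[ \|g(x)\|_{H^{-1}}^2 = g(x)^T H(x)^{-1} g(x) = \sum_{i=1}^n x_i^2\cdot\frac{1}{x_i^2} = n, \]
so $\phi$ is an $n$-barrier (and it is $1$-self-concordant).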

Algorithm with full Newton steps

Input:
  an accuracy parameter $\epsilon > 0$;
  a proximity parameter $\tau > 0$;
  an update parameter $\theta$, $0 < \theta < 1$;
  $x = x^0 \in D$ and $\mu = \mu^0 > 0$ such that $\lambda_\mu(x) \le \tau < \frac{1}{\kappa}$.
while $\mu\left(\nu + \dfrac{\tau(\tau + \sqrt{\nu})}{1 - \kappa\tau}\right) \ge \epsilon$ do
  $\mu := (1-\theta)\mu$;
  $x := x + \Delta x$;
endwhile

Theorem 2. If $\tau = \dfrac{1}{9\kappa}$ and $\theta = \dfrac{5}{9 + 36\kappa\sqrt{\nu}}$, then the algorithm with full Newton steps requires not more than
\[ 2\bigl(1 + 4\kappa\sqrt{\nu}\bigr)\ln\frac{2\mu^0\nu}{\epsilon} \]
iterations. The output is a point $x \in D$ such that $c^T x \le c^T x^* + \epsilon$, where $x^*$ denotes an optimal solution of $(P)$.
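The outer loop of Theorem 2 can be sketched as follows (our illustration; the stopping test and the values of $\tau$ and $\theta$ follow the theorem, while grad_phi and hess_phi stand for the gradient and Hessian of the chosen barrier):

import numpy as np

def path_following(c, grad_phi, hess_phi, x, mu, kappa, nu, eps):
    """Full-Newton-step barrier method for min c^T x over the domain of phi.
    tau and theta are chosen as in Theorem 2."""
    tau = 1.0 / (9.0 * kappa)
    theta = 5.0 / (9.0 + 36.0 * kappa * np.sqrt(nu))
    while mu * (nu + tau * (tau + np.sqrt(nu)) / (1.0 - kappa * tau)) >= eps:
        mu *= (1.0 - theta)                       # mu-update
        g_mu = c / mu + grad_phi(x)               # gradient of phi_mu
        H = hess_phi(x)
        x = x - np.linalg.solve(H, g_mu)          # one full Newton step
    return x, mu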

Graphical illustration of the full-Newton-step path-following method (one iteration): starting from an iterate $z^k = x^k s^k$ in the neighborhood $\lambda(x) \le \tau$ of the $\mu$-center $\mu e$ on the central path, the barrier parameter is reduced to $(1-\theta)\mu$ and a full Newton step brings the iterate back into the neighborhood of the new target $(1-\theta)\mu e$.

Relevant part of the analysis of the algorithm

At the start of the first iteration we have $x \in D$ and $\mu = \mu^0$ such that $\lambda_\mu(x) \le \tau$. When the barrier parameter is updated to $\mu^+ = (1-\theta)\mu$, Lemma 2 gives
\[ \lambda_{\mu^+}(x) \le \frac{\lambda_\mu(x) + \theta\sqrt{\nu}}{1-\theta} \le \frac{\tau + \theta\sqrt{\nu}}{1-\theta}. \tag{4} \]
Then after the Newton step, the new iterate is $x^+ = x + \Delta x$ and
\[ \lambda_{\mu^+}(x^+) \le \kappa\left(\frac{\lambda_{\mu^+}(x)}{1 - \kappa\lambda_{\mu^+}(x)}\right)^2. \tag{5} \]
The algorithm is well defined if we choose $\tau$ and $\theta$ such that $\lambda_{\mu^+}(x^+) \le \tau$. To get the lowest iteration bound, we need at the same time to maximize $\theta$. From (5) we deduce that $\lambda_{\mu^+}(x^+) \le \tau$ certainly holds if
\[ \frac{\lambda_{\mu^+}(x)}{1 - \kappa\lambda_{\mu^+}(x)} \le \sqrt{\frac{\tau}{\kappa}}, \]
which is equivalent to
\[ \lambda_{\mu^+}(x) \le \frac{\sqrt{\tau/\kappa}}{1 + \sqrt{\tau\kappa}}. \]
According to (4) this will hold if
\[ \frac{\tau + \theta\sqrt{\nu}}{1-\theta} \le \frac{\sqrt{\tau/\kappa}}{1 + \sqrt{\tau\kappa}}. \]
This leads to the following condition on $\theta$:
\[ \theta \le \frac{\sqrt{\tau}\,\bigl(1 - \sqrt{\kappa\tau} - \kappa\tau\bigr)}{\sqrt{\tau} + \sqrt{\nu\kappa}\,\bigl(1 + \sqrt{\kappa\tau}\bigr)}. \]
We choose $\tau = \frac{1}{9\kappa}$. The upper bound for $\theta$ then gets the value $\frac{5}{9 + 36\kappa\sqrt{\nu}} \ge \frac{1}{2 + 8\kappa\sqrt{\nu}}$, and then $\lambda_{\mu^+}(x) \le \frac{1}{4\kappa}$. This justifies the choice of the values of $\tau$ and $\theta$ in the theorem. For the rest of the proof we refer to the relevant references.
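For concreteness, the computation behind the last sentence (added here) is:
\[ \tau = \frac{1}{9\kappa} \;\Longrightarrow\; \sqrt{\kappa\tau} = \tfrac13, \qquad \frac{\sqrt{\tau/\kappa}}{1+\sqrt{\tau\kappa}} = \frac{1/(3\kappa)}{4/3} = \frac{1}{4\kappa}, \]
\[ \theta \le \frac{\frac{1}{3\sqrt\kappa}\bigl(1-\frac13-\frac19\bigr)}{\frac{1}{3\sqrt\kappa} + \frac{4}{3}\sqrt{\nu\kappa}} = \frac{5/9}{1 + 4\kappa\sqrt\nu} = \frac{5}{9+36\kappa\sqrt\nu}, \]
which explains the values of $\tau$ and $\theta$ in Theorem 2.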

Linear optimization via self-dual embedding (1)

It is now well known that every linear optimization problem can be solved efficiently if we can find in polynomial time a strictly complementary solution of problems of the form
\[ (SP)\qquad \min\,\{\, q^T x : Mx + q \ge 0,\ x \ge 0 \,\}, \]
where the $n \times n$ matrix $M$ is skew-symmetric (i.e., $M^T = -M$) and $q = (0;\ldots;0;n) \in \mathbb{R}^n$, and under the assumption that the all-one vector $\mathbf{1}$ is feasible with $M\mathbf{1} + q = \mathbf{1}$. The problem $(SP)$ is trivial in the sense that it has a trivial optimal solution, namely $x = 0$, with $0$ as optimal value. But this observation is not sufficient for our goal, since we need a strictly complementary solution of $(SP)$. What this means requires some explanation.

Linear optimization via self-dual embedding (2)

We associate to any vector $x \in \mathbb{R}^n$ its slack vector $s(x)$ according to $s(x) = Mx + q$. In the sequel we simply denote $s(x)$ as $s$, and $s$ will always have this meaning. Since $M$ is skew-symmetric we have $z^T M z = 0$ for every vector $z \in \mathbb{R}^n$. Hence we have
\[ q^T x = (s - Mx)^T x = s^T x - x^T M x = s^T x. \]
Therefore, if $x$ is feasible, then $x$ is optimal if and only if $s^T x = 0$. Since $x$ and $s$ are nonnegative this holds if and only if $x_i s_i = 0$ for each $i$. This shows that $x$ is optimal if and only if the vectors $x$ and $s$ are complementary vectors. We say that $x$ is a strictly complementary solution if moreover $x_i + s_i > 0$ for each $i$. Summarizing these facts, we have that $x$ is feasible if $x \ge 0$ and $s \ge 0$. A feasible $x$ is optimal if $xs = 0$, and $x$ is a strictly complementary solution if moreover $x + s > 0$. Thus we need to solve the system
\[ s = Mx + q,\qquad x \ge 0,\qquad s \ge 0,\qquad xs = 0,\qquad x + s > 0. \]

Central path

The basic idea of IPMs is to replace the so-called complementarity condition $xs = 0$ for $(SP)$ by the parameterized equation $xs = \mu\mathbf{1}$, with $\mu > 0$. Thus we consider the system
\[ s = Mx + q,\qquad x \ge 0,\qquad s \ge 0,\qquad xs = \mu\mathbf{1}. \]
Clearly, any solution $(x,s)$ will satisfy $x > 0$ and $s > 0$. Note that $x = s = \mathbf{1}$ and $\mu = 1$ satisfy this system. Surprisingly enough, a solution exists for each $\mu > 0$, and this solution is unique. It is denoted as $(x(\mu), s(\mu))$ and we call $x(\mu)$ the $\mu$-center of $(SP)$; $s(\mu)$ is the corresponding slack vector. The set of $\mu$-centers (with $\mu$ running through all positive real numbers) gives a homotopy path, which is called the central path of $(SP)$. If $\mu \to 0$ then the limit of the central path exists and, since the limit point satisfies the complementarity condition, the limit yields an optimal solution for $(SP)$. Moreover, this solution can be shown to be strictly complementary. We will start our method at $x = s = \mathbf{1}$ and $\mu = 1$. The method uses nonnegative barrier functions $\phi_\mu(x,s)$, for each $\mu > 0$, such that $\phi_\mu(x(\mu), s(\mu)) = 0$. If $s = Mx + q$ then we denote $\phi_\mu(x,s)$ as $\Phi_\mu(x)$.
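As a numerical illustration (ours, not from the talk): the $\mu$-centers of a small instance of $(SP)$ can be computed by a damped Newton method applied directly to the system above. The tiny skew-symmetric example below is hypothetical; it is built only so that $M\mathbf 1 + q = \mathbf 1$, and it is not the specific embedding with $q = (0;\ldots;0;n)$.

import numpy as np

def mu_center(M, q, mu, x=None, tol=1e-10, max_iter=100):
    """Solve s = Mx + q, x*s = mu*1 (componentwise) by damped Newton steps.
    Starts from the all-one vector, which is exactly the 1-center here."""
    n = len(q)
    x = np.ones(n) if x is None else x.copy()
    for _ in range(max_iter):
        s = M @ x + q
        F = x * s - mu * np.ones(n)
        if np.linalg.norm(F) < tol:
            break
        J = np.diag(s) + np.diag(x) @ M        # Jacobian of x*(Mx+q)
        dx = np.linalg.solve(J, -F)
        alpha = 1.0
        while np.any(x + alpha * dx <= 0) or np.any(M @ (x + alpha * dx) + q <= 0):
            alpha *= 0.5                        # damp to stay strictly feasible
        x = x + alpha * dx
    return x, M @ x + q

# Hypothetical skew-symmetric data with M 1 + q = 1.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
M = np.block([[np.zeros((2, 2)), A], [-A.T, np.zeros((2, 2))]])
q = np.ones(4) - M @ np.ones(4)
x, s = mu_center(M, q, mu=0.1)
print(x * s)   # approximately 0.1 * ones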

Kernel-function-based barrier functions

First we choose a kernel function $\psi : (0,\infty) \to [0,\infty)$. We require that $\psi(t)$ is three times differentiable and strictly convex, and moreover that $\psi(t)$ is minimal at $t = 1$, with $\psi(1) = 0$. Then we define
\[ \Phi_\mu(x) := \phi_\mu(x,s) = 2\sum_{i=1}^n \psi(v_i), \qquad\text{where } v := \sqrt{\frac{xs}{\mu}},\quad s = Mx + q. \]
The barrier function $\Phi_\mu(x)$ based on the kernel function $\psi(t)$ is defined on the interior of the domain of $(SP)$. $\phi_\mu(x,s)$ is strictly convex and minimal when $v = \mathbf{1}$, and then $x = x(\mu)$ (and $s = s(\mu)$). Provided that $\theta$ is small enough, after a full Newton step we get a good enough approximation of $x = x(\mu)$. Then we repeat the above process: reduce $\mu$ by the factor $1-\theta$, do a full Newton step, etc., until $\mu$ is close enough to zero. At the end this yields an $\epsilon$-solution of the problem $(SP)$. In earlier papers we used a search direction determined by the system
\[ M\Delta x = \Delta s, \qquad s\,\Delta x + x\,\Delta s = -\mu\, v\,\psi'(v). \]
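Eliminating $\Delta s = M\Delta x$ turns this system into $(S + XM)\Delta x = -\mu v\,\psi'(v)$ with $X = \operatorname{diag}(x)$, $S = \operatorname{diag}(s)$; the following sketch (ours) solves it:

import numpy as np

def kernel_direction(M, x, s, mu, dpsi):
    """Search direction from M dx = ds, s dx + x ds = -mu * v * psi'(v)."""
    v = np.sqrt(x * s / mu)
    rhs = -mu * v * dpsi(v)
    S, X = np.diag(s), np.diag(x)
    dx = np.linalg.solve(S + X @ M, rhs)   # eliminate ds = M dx
    ds = M @ dx
    return dx, ds

# Example kernel: psi(t) = (t^2 - 1)/2 - ln t, with psi'(t) = t - 1/t.
dpsi_log = lambda t: t - 1.0 / t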

Complexity results

$i=1$: $\psi_1(t) = \frac{t^2-1}{2} - \ln t$; small-update $O\!\left(\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(n\,\ln\frac n\epsilon\right)$ [RTV]
$i=2$: $\psi_2(t) = \frac12\left(t - \frac1t\right)^2$; small-update $O\!\left(\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(n^{2/3}\,\ln\frac n\epsilon\right)$ [PRT]
$i=3$: $\psi_3(t) = \frac{t^2-1}{2} + \frac{t^{1-q}-1}{q-1}$, $q>1$; small-update $O\!\left(q^2\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(q\,n^{\frac{q+1}{2q}}\,\ln\frac n\epsilon\right)$ [PRT]
$i=4$: $\psi_4(t) = \frac{t^2-1}{2} + \frac{t^{1-q}-1}{q(q-1)} - \frac{q-1}{q}(t-1)$, $q>1$; small-update $O\!\left(q\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(q\,n^{\frac{q+1}{2q}}\,\ln\frac n\epsilon\right)$ [PRT]
$i=5$: $\psi_5(t) = \frac{t^2-1}{2} + e^{\frac1t} - e$; small-update $O\!\left(\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(\sqrt n\,(\ln n)^2\,\ln\frac n\epsilon\right)$ [BER]
$i=6$: $\psi_6(t) = \frac{t^2-1}{2} - \int_1^t e^{\frac1\xi-1}\,d\xi$; small-update $O\!\left(\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(\sqrt n\,(\ln n)^2\,\ln\frac n\epsilon\right)$ [BER]
$i=7$: $\psi_7(t) = t - 1 + \frac{t^{1-q}-1}{q-1}$, $q>1$; small-update $O\!\left(q^2\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(q\,n\,\ln\frac n\epsilon\right)$ [BR]

In all cases the iteration bound for small-update methods is $O\!\left(\sqrt n\,\log\frac n\epsilon\right)$. The best bound for large-update methods is obtained for $i \in \{3,4\}$ by taking $q = \frac12\log n$. This gives the iteration bound $O\!\left(\sqrt n\,(\log n)\,\log\frac n\epsilon\right)$, which is currently the best known bound for large-update methods.
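The choice $q = \frac12\log n$ works because the factor $n^{\frac{q+1}{2q}}$ then collapses to a constant times $\sqrt n$; a one-line check (added here, with $\log = \ln$):
\[ q\, n^{\frac{q+1}{2q}} = q\,\sqrt n\; n^{\frac{1}{2q}} = \tfrac12(\log n)\,\sqrt n\; n^{\frac{1}{\log n}} = \tfrac e2\,\sqrt n\,\log n = O\!\left(\sqrt n\,\log n\right), \]
since $n^{1/\log n} = e$.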

Local self-concordance of the barrier function

We define $\phi : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, to be locally $\kappa$-SC at $x \in D$ if $\phi(x + th)$ is $\kappa$-SC for all $h \in \mathbb{R}^n$; to express the dependence of $\kappa$ on $x$ we use the notation $\kappa(x)$. Clearly $\phi$ is $\kappa$-SC if and only if $\kappa(x)$ is bounded above by some (finite) constant on the domain of $\phi$. It is well known that the classical logarithmic barrier function, whose kernel function is $\frac{t^2-1}{2} - \ln t$, is SC. But this is quite exceptional. In general kernel-function-based barrier functions are not SC, but they are locally SC. The following table shows this for the kernel function $\psi_2(t)$.

$i=1$: $\psi_1(t) = \frac{t^2-1}{2} - \ln t$; small-update $O\!\left(\sqrt n\,\ln\frac n\epsilon\right)$; large-update $O\!\left(n\,\ln\frac n\epsilon\right)$; local $\kappa$ of $\psi(t)$: $1$; local $\kappa$ of $\Phi_\mu(x)$: $1$
$i=2$: $\psi_2(t) = \frac12\left(t - \frac1t\right)^2$; small-update $O\!\left(\sqrt n\,\ln\frac n\epsilon\right)$; large-update: ?; local $\kappa$ of $\psi(t)$: $\frac{2t}{\sqrt3}$; local $\kappa$ of $\Phi_\mu(x)$: $\frac{2\,\|v\|_\infty}{\sqrt3}$

At the start of the algorithm we have $v = \mathbf 1$, where the local value of $\kappa$ is $2/\sqrt3$. During the course of the algorithm the iterates stay so close to the central path that $v$ stays in a very small neighborhood of $\mathbf 1$, and hence the barrier function is SC for some suitable value of $\kappa$, slightly larger than $2/\sqrt3$.

Assumptions on the kernel function

We assume that
\[ \psi(t) = \tfrac12\bigl(t^2 - 1\bigr) + \psi_b(t) \tag{6} \]
and we make the following assumptions:
\[ \psi_b'(t) < 0, \qquad \psi_b''(t) > 0, \qquad \psi_b'''(t) < 0, \qquad t > 0. \]
It will be convenient to use the following notations (for $t > 0$):
\[ \xi(t) := \psi''(t) - \frac{\psi'(t)}{t}, \qquad \xi_b(t) := \psi_b''(t) - \frac{\psi_b'(t)}{t}. \tag{7} \]
Note that these definitions imply $\xi(t) = \xi_b(t) > 0$, $t > 0$.

Consequences of the assumption

For $x > 0$ and $s > 0$ we have
\[ \phi_\mu(x,s) = 2\sum_{i=1}^n \psi(v_i) = \sum_{i=1}^n\bigl(v_i^2 - 1\bigr) + 2\sum_{i=1}^n \psi_b(v_i). \]
Hence, if $s := s(x) = Mx + q$ then
\[ \Phi_\mu(x) = \phi_\mu(x,s) = \frac{x^T s - n\mu}{\mu} + 2\sum_{j=1}^n \psi_b(v_j) = \frac{q^T x - n\mu}{\mu} + 2\sum_{j=1}^n \psi_b\!\left(\sqrt{\frac{x_j s_j}{\mu}}\right). \]
In the special case that $\psi(t)$ is the kernel function of the logarithmic barrier function we have $\psi_b(t) = -\ln t$, whence
\[ \phi_\mu(x,s) = \frac{q^T x}{\mu} - \sum_{j=1}^n \ln\bigl(x_j s_j\bigr) + n\ln\mu - n, \]
which is (up to the constant term $n\ln\mu - n$) the classical primal-dual logarithmic barrier function.

Results on local self-concordance (1)

Let
\[ N(t) = \frac{\psi_b'(t)}{\sqrt{\psi_b''(t)}}, \qquad t > 0. \]

Theorem 3. $\nu(x,s) = 2\,\|N(v)\|^2$.

It is quite surprising that the local value of $\nu$ depends only on the vector $v$. Recall that if $x \approx x(\mu)$ and $s \approx s(\mu)$ then $v \approx \mathbf 1$. We give two examples.

$i=1$: $\psi_1(t) = \frac{t^2-1}{2} - \ln t$, $\psi_b(t) = -\ln t$: $\psi_b'(t) = -\frac1t$, $\psi_b''(t) = \frac1{t^2}$, $\psi_b'''(t) = -\frac2{t^3}$, $N(t) = -1$, $\nu(x,s) = 2\,\|N(v)\|^2 = 2n$.
$i=2$: $\psi_2(t) = \frac12\left(t - \frac1t\right)^2$, $\psi_b(t) = \frac12\left(\frac1{t^2} - 1\right)$: $\psi_b'(t) = -\frac1{t^3}$, $\psi_b''(t) = \frac3{t^4}$, $\psi_b'''(t) = -\frac{12}{t^5}$, $N(t) = -\frac{1}{\sqrt3\,t}$, $\nu(x,s) = \frac23\,\|v^{-1}\|^2$.

Proof of Theorem 3

We apply the composition rule, which is well known.

Lemma 3. Let $\phi_i$ be $(\kappa_i, \nu_i)$-SCBs on $D_i$, for $i = 1,2$. Then $\phi_1 + \phi_2$ is a $(\kappa,\nu)$-SCB for $D_1 \cap D_2$, where $\kappa = \max\{\kappa_1, \kappa_2\}$ and $\nu = \nu_1 + \nu_2$.

Since the linear part in $\phi_\mu(x,s)$ is $0$-self-concordant, with $\nu = 0$, it suffices to consider
\[ f(x,s) = 2\sum_{j=1}^n \psi_b\!\left(\sqrt{\frac{x_j s_j}{\mu}}\right), \]
where $s = s(x) = Mx + q$. In the sequel we will neglect this relation between $s$ and $x$. Thus we will prove that $f(x,s)$ is $(\kappa,\nu)$-self-concordant on the set $\{(x,s) : x \in \mathbb{R}^n_+,\ s \in \mathbb{R}^n_+\}$. This will imply that $f(x,s)$ is a $(\kappa,\nu)$-self-concordant barrier function for the domain of $(SP)$, which is the intersection of this set and the affine space determined by $s = Mx + q$. We do this by considering each of the terms in the definition separately and then applying the composition rules of Lemma 3.

The case n = 1 (1)

\[ f(x,s) = 2\,\psi_b\!\left(\sqrt{\frac{xs}{\mu}}\right), \qquad x > 0,\ s > 0. \]
Now let $\sigma, \tau \in \mathbb{R}$ and $\alpha$ be such that $x + \alpha\sigma > 0$ and $s + \alpha\tau > 0$. We define
\[ \varphi(\alpha) = f(x + \alpha\sigma,\, s + \alpha\tau) = 2\,\psi_b\bigl(v(\alpha)\bigr). \]
Writing
\[ v = \sqrt{\frac{xs}{\mu}}, \qquad v(\alpha) = \sqrt{\frac{(x+\alpha\sigma)(s+\alpha\tau)}{\mu}}, \qquad h = \frac{\sigma}{x}, \qquad k = \frac{\tau}{s}, \]
we have, using $xs = \mu v^2$,
\[ (x+\alpha\sigma)(s+\alpha\tau) = xs\,(1+\alpha h)(1+\alpha k) = \mu v^2 (1+\alpha h)(1+\alpha k), \]
and hence $v(\alpha)^2 = v^2(1+\alpha h)(1+\alpha k)$.

The case n = 1 (2)

\[ v(\alpha)^2 = v^2(1+\alpha h)(1+\alpha k). \]
Taking successive derivatives with respect to $\alpha$ on both sides we obtain
\[ 2\,v(\alpha)\,v'(\alpha) = v^2\bigl(h(1+\alpha k) + k(1+\alpha h)\bigr), \]
\[ v(\alpha)\,v''(\alpha) + v'(\alpha)^2 = v^2 hk, \]
\[ v(\alpha)\,v'''(\alpha) + 3\,v'(\alpha)\,v''(\alpha) = 0. \]
Substitution of $\alpha = 0$ gives
\[ v(0)^2 = v^2, \qquad 2\,v\,v'(0) = v^2(h+k), \qquad v\,v''(0) + v'(0)^2 = v^2 hk, \qquad v\,v'''(0) + 3\,v'(0)\,v''(0) = 0. \]
This gives
\[ v(0) = v, \qquad v'(0) = \tfrac12\,v\,(h+k), \qquad v''(0) = -\tfrac14\,v\,(h-k)^2, \qquad v'''(0) = \tfrac38\,v\,(h+k)(h-k)^2. \]

The case n = 1 (3)

Since
\[ \varphi'(\alpha) = 2\,\psi_b'\bigl(v(\alpha)\bigr)\,v'(\alpha), \]
\[ \varphi''(\alpha) = 2\bigl[\psi_b''\bigl(v(\alpha)\bigr)\,v'(\alpha)^2 + \psi_b'\bigl(v(\alpha)\bigr)\,v''(\alpha)\bigr], \]
\[ \varphi'''(\alpha) = 2\bigl[\psi_b'''\bigl(v(\alpha)\bigr)\,v'(\alpha)^3 + 3\,\psi_b''\bigl(v(\alpha)\bigr)\,v'(\alpha)\,v''(\alpha) + \psi_b'\bigl(v(\alpha)\bigr)\,v'''(\alpha)\bigr], \]
it follows that
\[ \varphi'(0) = 2\,\psi_b'(v)\,v'(0), \qquad \varphi''(0) = 2\bigl[\psi_b''(v)\,v'(0)^2 + \psi_b'(v)\,v''(0)\bigr], \qquad \varphi'''(0) = 2\bigl[\psi_b'''(v)\,v'(0)^3 + 3\,\psi_b''(v)\,v'(0)\,v''(0) + \psi_b'(v)\,v'''(0)\bigr]. \]
Substitution of the above expressions for $v(0)$, $v'(0)$, $v''(0)$ and $v'''(0)$ yields
\[ \varphi'(0) = \psi_b'(v)\,v\,(h+k), \qquad \varphi''(0) = \tfrac12\bigl[\psi_b''(v)\,v^2(h+k)^2 - \psi_b'(v)\,v\,(h-k)^2\bigr], \qquad \varphi'''(0) = \tfrac14\bigl[\psi_b'''(v)\,v^2(h+k)^2 - 3\,\xi_b(v)\,v\,(h-k)^2\bigr]\,v\,(h+k). \tag{8} \]

Lemma 4. $\phi_\mu(x,s)$ is strictly convex. (Indeed, since $\psi_b''>0$ and $-\psi_b'>0$, (8) shows that $\varphi''(0)>0$ whenever $(h,k) \neq (0,0)$.)

Computation of ν

To compute the barrier parameter $\nu$ we need to find an upper bound for
\[ \frac{\bigl(\varphi'(0)\bigr)^2}{\varphi''(0)} = \frac{\bigl[\psi_b'(v)\,v\,(h+k)\bigr]^2}{\tfrac12\bigl[\psi_b''(v)\,v^2(h+k)^2 - \psi_b'(v)\,v\,(h-k)^2\bigr]}. \tag{9} \]
Substituting
\[ y = h + k, \qquad z = h - k, \]
we have
\[ \nu = \max_{y,z}\; \frac{\bigl[\psi_b'(v)\,v\,y\bigr]^2}{\tfrac12\bigl[\psi_b''(v)\,v^2 y^2 - \psi_b'(v)\,v\,z^2\bigr]} = \frac{2\bigl[\psi_b'(v)\bigr]^2}{\psi_b''(v)} = 2\,N(v)^2. \]
Thus we have proved the following lemma.

Lemma 5. If $n = 1$, and $N(t)$ is as defined before, then $\nu(x,s) = 2\,N(v)^2$.

Theorem 3. If $n \ge 1$ then $\nu = 2\,\|N(v)\|^2$.

Proof: This is an immediate consequence of Lemma 3 and Lemma 5.

Results on local self-concordance (2)

We define
\[ K(t) = \frac{1}{\bar\rho(t)\sqrt{2\,\bigl(3-\bar\rho(t)\bigr)}}\cdot\frac{-\psi_b'''(t)}{\bigl(\psi_b''(t)\bigr)^{3/2}}, \]
where
\[ \rho(t) = \frac{\psi_b'(t)\,\psi_b'''(t)}{\xi_b(t)\,\psi_b''(t)}, \qquad \bar\rho(t) = \min\{2,\rho(t)\}, \qquad \xi_b(t) = \psi_b''(t) - \frac{\psi_b'(t)}{t}. \]

Theorem 4. $\kappa(x,s) = \max_i K(v_i)$.

$i=1$: $\psi_1(t) = \frac{t^2-1}{2} - \ln t$, $\psi_b(t) = -\ln t$: $\psi_b' = -\frac1t$, $\psi_b'' = \frac1{t^2}$, $\psi_b''' = -\frac2{t^3}$, $\xi(t) = \frac2{t^2}$, $\rho(t) = 1$, $\kappa(t) = 1$, $\kappa(v) = 1$.
$i=2$: $\psi_2(t) = \frac12\left(t - \frac1t\right)^2$, $\psi_b(t) = \frac12\left(\frac1{t^2} - 1\right)$: $\psi_b' = -\frac1{t^3}$, $\psi_b'' = \frac3{t^4}$, $\psi_b''' = -\frac{12}{t^5}$, $\xi(t) = \frac4{t^4}$, $\rho(t) = 1$, $\kappa(t) = \frac{2t}{\sqrt3}$, $\kappa(v) = \frac{2\,\|v\|_\infty}{\sqrt3}$.
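These table entries can be re-derived mechanically; the following sympy sketch (ours) recomputes $N(t)$, $\xi_b(t)$, $\rho(t)$ and $K(t)$ for the two example kernels:

import sympy as sp

t = sp.symbols('t', positive=True)

def local_params(psi_b):
    """N(t), xi_b(t), rho(t), K(t) for a barrier term psi_b(t), per Theorems 3 and 4."""
    d1, d2, d3 = (sp.diff(psi_b, t, k) for k in (1, 2, 3))
    N = d1 / sp.sqrt(d2)
    xi_b = d2 - d1 / t
    rho = d1 * d3 / (xi_b * d2)
    rho_bar = sp.Min(2, rho)
    K = -d3 / (rho_bar * sp.sqrt(2 * (3 - rho_bar)) * d2 ** sp.Rational(3, 2))
    return [sp.simplify(e) for e in (N, xi_b, rho, K)]

print(local_params(-sp.log(t)))            # -> [-1, 2/t**2, 1, 1]
print(local_params((t**-2 - 1) / 2))       # -> [-1/(sqrt(3)*t), 4/t**4, 1, 2*sqrt(3)*t/3]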

Proof of Theorem 4 (1)

We first consider the case where $n = 1$. Then $\kappa = \kappa(x,s)$ is defined by
\[ 2\kappa = \max_{h,k}\;\frac{\bigl|\varphi'''(0)\bigr|}{\bigl(\varphi''(0)\bigr)^{3/2}} = \max_{h,k}\;\frac{\tfrac14\bigl|\psi_b'''(v)\,v^2(h+k)^2 - 3\,\xi(v)\,v\,(h-k)^2\bigr|\,v\,|h+k|}{\Bigl[\tfrac12\bigl(\psi_b''(v)\,v^2(h+k)^2 - \psi_b'(v)\,v\,(h-k)^2\bigr)\Bigr]^{3/2}}. \]
Substituting
\[ y = h + k, \qquad z = h - k, \]
we get
\[ 2\sqrt2\,\kappa = \max_{y,z}\;\frac{\bigl|\psi_b'''(v)\,v^2 y^2 - 3\,\xi(v)\,v\,z^2\bigr|\,v\,|y|}{\bigl[\psi_b''(v)\,v^2 y^2 - \psi_b'(v)\,v\,z^2\bigr]^{3/2}}. \]
The last expression is homogeneous in $(y,z)$. It follows that
\[ 2\sqrt2\,\kappa = \max\Bigl\{\,\bigl|\psi_b'''(v)\,v^2 y^2 - 3\,\xi(v)\,v\,z^2\bigr|\,v\,|y| \;:\; \psi_b''(v)\,v^2 y^2 - \psi_b'(v)\,v\,z^2 = 1 \,\Bigr\}. \]
Before proceeding we recall the definitions of $\rho(t)$ and $\bar\rho(t)$:
\[ \rho(t) = \frac{\psi_b'(t)\,\psi_b'''(t)}{\xi(t)\,\psi_b''(t)}, \qquad \bar\rho(t) = \min\{2,\rho(t)\}. \tag{10} \]
Note that $\bar\rho(t) \in (0,2]$.

Proof of Theorem 4 (2)

\[ 2\sqrt2\,\kappa = \max\Bigl\{\,\bigl|\psi_b'''(v)\,v^2 y^2 - 3\,\xi(v)\,v\,z^2\bigr|\,v\,|y| \;:\; \psi_b''(v)\,v^2 y^2 - \psi_b'(v)\,v\,z^2 = 1 \,\Bigr\}. \tag{11} \]
The optimality conditions are, for some suitable multiplier $\lambda$,
\[ 3\,\psi_b'''(v)\,v^3 y^2 - 3\,\xi(v)\,v^2 z^2 = 3\lambda\,\psi_b''(v)\,v^2 y, \qquad -6\,\xi(v)\,v^2 y z = 3\lambda\,\bigl[-\psi_b'(v)\,v\,z\bigr], \]
or, equivalently,
\[ \psi_b'''(v)\,v\,y^2 - \xi(v)\,z^2 = \lambda\,\psi_b''(v)\,y, \qquad 2\,\xi(v)\,v\,y\,z = \lambda\,\psi_b'(v)\,z. \tag{12} \]
We see that either $z = 0$ or $2\,\xi(v)\,v\,y = \lambda\,\psi_b'(v)$. If $z = 0$ then the constraint in our problem implies that $\psi_b''(v)\,v^2 y^2 = 1$, and hence (since $\psi_b'''(v) < 0$) $\kappa$ is in this case given by
\[ 2\sqrt2\,\kappa = \frac{-\psi_b'''(v)}{\bigl(\psi_b''(v)\bigr)^{3/2}}. \tag{13} \]

Proof of Theorem 4 (3)

Now assuming $z \neq 0$, we can eliminate $\lambda$ by substituting $2\,\xi(v)\,v\,y = \lambda\,\psi_b'(v)$ into (12), which gives
\[ \psi_b'(v)\bigl[\psi_b'''(v)\,v\,y^2 - \xi(v)\,z^2\bigr] = \lambda\,\psi_b'(v)\,\psi_b''(v)\,y = 2\,\xi(v)\,\psi_b''(v)\,v\,y^2. \]
Rearranging the terms, and using (10), we obtain
\[ -\psi_b'(v)\,\xi(v)\,z^2 = \bigl[2\,\xi(v)\,\psi_b''(v) - \psi_b'(v)\,\psi_b'''(v)\bigr]\,v\,y^2 = \bigl(2 - \rho(v)\bigr)\,\xi(v)\,\psi_b''(v)\,v\,y^2, \]
yielding
\[ -\psi_b'(v)\,z^2 = \bigl(2 - \rho(v)\bigr)\,\psi_b''(v)\,v\,y^2. \tag{14} \]
Since $\psi_b''(v) > 0$ and $-\psi_b'(v) > 0$, this equation has no solution with $z \neq 0$ if $\rho(v) > 2$, and hence $\kappa$ is then given by (13). If $\rho(v) \le 2$, substitution of (14) into the constraint $\psi_b''(v)\,v^2 y^2 - \psi_b'(v)\,v\,z^2 = 1$ yields
\[ \psi_b''(v)\,v^2 y^2 + \bigl(2 - \rho(v)\bigr)\,\psi_b''(v)\,v^2 y^2 = 1, \]
or, equivalently,
\[ \bigl[3 - \rho(v)\bigr]\,\psi_b''(v)\,v^2 y^2 = 1. \tag{15} \]
Hence
\[ v\,y = \pm\frac{1}{\sqrt{\bigl[3-\rho(v)\bigr]\,\psi_b''(v)}}. \tag{16} \]

Proof of Theorem 4 (4)

The rest of the proof consists of computing the value of the objective function using the relations found so far. Using (14), (10) and (15), respectively, we may write
\[ 2\sqrt2\,\kappa = \pm\Bigl[\psi_b'''(v) + \frac{3\,\bigl(2-\rho(v)\bigr)\,\xi(v)\,\psi_b''(v)}{\psi_b'(v)}\Bigr]\,v^3 y^3 = \pm\frac{1}{\psi_b'(v)}\Bigl[\psi_b'(v)\,\psi_b'''(v) + 3\,\bigl(2-\rho(v)\bigr)\,\xi(v)\,\psi_b''(v)\Bigr]\,v^3 y^3 \]
\[ = \pm\frac{\xi(v)\,\psi_b''(v)}{\psi_b'(v)}\,\bigl[\rho(v) + 3\bigl(2-\rho(v)\bigr)\bigr]\,v^3 y^3 = \pm\frac{2\,\xi(v)}{\psi_b'(v)}\,\bigl[\bigl(3-\rho(v)\bigr)\,\psi_b''(v)\,v^2 y^2\bigr]\,v\,y = \pm\frac{2\,\xi(v)}{\psi_b'(v)}\,v\,y. \]
Finally, using (10) and (16) respectively, we get (since we are maximizing)
\[ 2\sqrt2\,\kappa = \frac{2\,\xi(v)}{-\psi_b'(v)\,\sqrt{\bigl(3-\rho(v)\bigr)\,\psi_b''(v)}} = \frac{2}{\rho(v)\,\sqrt{3-\rho(v)}}\cdot\frac{-\psi_b'''(v)}{\bigl(\psi_b''(v)\bigr)^{3/2}}. \]
For $\rho(v) = 2$ this yields exactly the same value as in (13). Thus the following holds.

Lemma 6. If $n = 1$, and with $K(t)$ as defined above, we have $\kappa(x,s) = K(v)$.

Theorem 4. If $n \ge 1$ then $\kappa = \max_i K(v_i)$.

Proof: This is an immediate consequence of Lemma 3 and Lemma 6.

Summary of results

From now on we assume that $s = Mx + q$. Our ingredients are:

Theorem 3. $\nu(x) = 2\,\|N(v)\|^2$, where $N(t) = \dfrac{\psi_b'(t)}{\sqrt{\psi_b''(t)}}$.

Theorem 4. $\kappa(x) = \max_i K(v_i)$, where
\[ K(t) = \frac{1}{\bar\rho(t)\sqrt{2\,\bigl(3-\bar\rho(t)\bigr)}}\cdot\frac{-\psi_b'''(t)}{\bigl(\psi_b''(t)\bigr)^{3/2}}, \]
with
\[ \rho(t) = \frac{\psi_b'(t)\,\psi_b'''(t)}{\xi_b(t)\,\psi_b''(t)}, \qquad \bar\rho(t) = \min\{2,\rho(t)\}, \qquad \xi_b(t) = \psi_b''(t) - \frac{\psi_b'(t)}{t}. \]

Lemma 7. During the course of the algorithm we have $\lambda(x) \le \dfrac{1}{4\kappa}$.

This lemma implies
\[ \phi_\mu(x,s) = 2\sum_{i=1}^n \psi(v_i) \le \frac{\omega\bigl(-\tfrac14\bigr)}{\kappa^2} = \frac{0.0376821}{\kappa^2} \le \frac{1}{26\kappa^2}. \]

Some examples of barrier functions and their local κ and ν values

$i=1$: $\psi_b(t) = -\ln t$: $\psi_b' = -\frac1t$, $\psi_b'' = \frac1{t^2}$, $\psi_b''' = -\frac2{t^3}$, $\xi_b = \frac2{t^2}$, $\rho = 1$, $\nu(t) = 2$, $\kappa(t) = 1$.
$i=2$: $\psi_b(t) = \frac12\bigl(t^{-2}-1\bigr)$: $\psi_b' = -\frac1{t^3}$, $\psi_b'' = \frac3{t^4}$, $\psi_b''' = -\frac{12}{t^5}$, $\xi_b = \frac4{t^4}$, $\rho = 1$, $\nu(t) = \frac{2}{3t^2}$, $\kappa(t) = \frac{2t}{\sqrt3}$.
$i=3$: $\psi_b(t) = \frac{t^{1-q}-1}{q-1}$, $q>1$: $\psi_b' = -t^{-q}$, $\psi_b'' = q\,t^{-q-1}$, $\psi_b''' = -q(q+1)\,t^{-q-2}$, $\xi_b = (q+1)\,t^{-q-1}$, $\rho = 1$, $\nu(t) = \frac{2\,t^{1-q}}{q}$, $\kappa(t) = \frac{(q+1)\,t^{\frac{q-1}{2}}}{2\sqrt q}$.
$i=4$: $\psi_b(t) = e^{\frac1t-1}-1$: $\psi_b' = -\frac{e^{\frac1t-1}}{t^2}$, $\psi_b'' = \frac{(1+2t)\,e^{\frac1t-1}}{t^4}$, $\psi_b''' = -\frac{(1+6t+6t^2)\,e^{\frac1t-1}}{t^6}$, $\xi_b = \frac{(1+3t)\,e^{\frac1t-1}}{t^4}$, $\rho = \frac{1+6t+6t^2}{1+5t+6t^2}$, $\nu(t) = \frac{2\,e^{\frac1t-1}}{1+2t}$, $\kappa(t) = \frac{(1+3t)^{3/2}\,e^{\frac{t-1}{2t}}}{\sqrt{2\,(2+9t+12t^2)}}$.
$i=5$: $\psi_b(t) = -\int_1^t e^{\frac1\xi-1}\,d\xi$: $\psi_b' = -e^{\frac1t-1}$, $\psi_b'' = \frac{e^{\frac1t-1}}{t^2}$, $\psi_b''' = -\frac{(1+2t)\,e^{\frac1t-1}}{t^4}$, $\xi_b = \frac{(1+t)\,e^{\frac1t-1}}{t^2}$, $\rho = \frac{1+2t}{1+t}$, $\nu(t) = 2t^2\,e^{\frac1t-1}$, $\kappa(t) = \frac{(1+t)^{3/2}\,e^{\frac{t-1}{2t}}}{t\,\sqrt{2\,(2+t)}}$.
$i=6$: $\psi_b(t) = e^{\sigma(1-t)}-1$, $\sigma > 0$: $\psi_b' = -\sigma\,e^{\sigma(1-t)}$, $\psi_b'' = \sigma^2\,e^{\sigma(1-t)}$, $\psi_b''' = -\sigma^3\,e^{\sigma(1-t)}$, $\xi_b = \frac{\sigma(1+\sigma t)\,e^{\sigma(1-t)}}{t}$, $\rho = \frac{\sigma t}{1+\sigma t}$, $\nu(t) = 2\,e^{\sigma(1-t)}$, $\kappa(t) = \frac{(1+\sigma t)^{3/2}\,e^{-\frac{\sigma(1-t)}{2}}}{t\,\sqrt{2\,(3+2\sigma t)}}$.

Analysis of the algorithm (1)

Note that $\psi(t)$ is monotonically decreasing for $t \le 1$ and monotonically increasing for $t \ge 1$. In the sequel we denote by $\varrho : [0,\infty) \to [1,\infty)$ the inverse function of $\psi(t)$ for $t \ge 1$, and by $\chi : [0,\infty) \to (0,1]$ the inverse function of $\psi(t)$ for $0 < t \le 1$. So we have
\[ \varrho(s) = t \;\Longleftrightarrow\; s = \psi(t), \qquad s \ge 0,\ t \ge 1, \tag{17} \]
and
\[ \chi(s) = t \;\Longleftrightarrow\; s = \psi(t), \qquad s \ge 0,\ 0 < t \le 1. \tag{18} \]
Note that $\chi(s)$ is monotonically decreasing and $\varrho(s)$ is monotonically increasing in $s \ge 0$.

Lemma 8. Let $t > 0$ and $\psi(t) \le s$. Then $\chi(s) \le t \le \varrho(s)$.

Proof: This is almost obvious. Since $\psi(t)$ is strictly convex and minimal at $t = 1$, with $\psi(1) = 0$, $\psi(t) \le s$ implies that $t$ belongs to a closed interval whose extremal points are $\chi(s)$ and $\varrho(s)$.
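Numerically, $\chi(s)$ and $\varrho(s)$ are easy to evaluate by bisection; a small helper of ours, illustrated with the classical logarithmic kernel:

import numpy as np

def inverse_psi(psi, s, lower=True, tol=1e-12):
    """chi(s) (lower=True) or varrho(s) (lower=False): the point t with psi(t) = s
    on the decreasing branch (0,1] or the increasing branch [1, inf)."""
    if lower:
        lo, hi = 1e-16, 1.0                 # psi decreases from +inf to 0 on (0,1]
    else:
        lo, hi = 1.0, 2.0
        while psi(hi) < s:                  # bracket the root on [1, inf)
            hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        too_small = (psi(mid) > s) if lower else (psi(mid) < s)
        lo, hi = (mid, hi) if too_small else (lo, mid)
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

psi_log = lambda t: (t * t - 1.0) / 2.0 - np.log(t)   # classical kernel
s = 0.5
print(inverse_psi(psi_log, s, lower=True), inverse_psi(psi_log, s, lower=False))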

Graphical illustration of the functions $\chi(s)$ and $\varrho(s)$: the graph of $\psi(t)$; for a given value $s$ on the vertical axis, $\chi(s) \le 1$ and $\varrho(s) \ge 1$ are the two points at which $\psi(t) = s$.

Analysis of the algorithm (2)

The local values of $\kappa$ and $\nu$ are given by
\[ \kappa(x) = \max_i K(v_i), \qquad \nu(x) = 2\sum_{i=1}^n N(v_i)^2. \]
We need to find values of $\bar\kappa$ and $\bar\nu$ such that
\[ \kappa(x) \le \bar\kappa, \qquad \nu(x) \le \bar\nu \]
for each $v$ that occurs during the course of the algorithm. This certainly holds if
\[ \phi_\mu(x,s) = 2\sum_{i=1}^n \psi(v_i) \le \frac{1}{26\bar\kappa^2} \quad\Longrightarrow\quad \max_i K(v_i) \le \bar\kappa, \quad 2\sum_{i=1}^n N(v_i)^2 \le \bar\nu. \]
The left-hand side of this implication implies
\[ \psi(v_i) \le \frac{1}{52\bar\kappa^2}, \qquad i = 1,\ldots,n. \]
According to Lemma 8 this implies
\[ \chi\!\left(\frac{1}{52\bar\kappa^2}\right) \le v_i \le \varrho\!\left(\frac{1}{52\bar\kappa^2}\right), \qquad i = 1,\ldots,n. \]

Analysis of the algorithm (3)

\[ \chi\!\left(\frac{1}{52\bar\kappa^2}\right) \le v_i \le \varrho\!\left(\frac{1}{52\bar\kappa^2}\right), \qquad i = 1,\ldots,n. \]
If we choose $\bar\kappa$ such that
\[ K\!\left(\varrho\!\left(\frac{1}{52\bar\kappa^2}\right)\right) \le \bar\kappa \tag{19} \]
then the barrier function is locally $\bar\kappa$-SC. The above inequality certainly has a solution, because if $\bar\kappa$ goes to infinity then the left-hand side approaches $K(1)$, which is finite, whereas the right-hand side goes to $\infty$. Let $\bar\kappa$ denote the smallest solution of (19). Finally, if we take $\bar\nu$ such that
\[ \bar\nu = 2n\left[N\!\left(\chi\!\left(\frac{1}{52\bar\kappa^2}\right)\right)\right]^2 \]
then the barrier function is a locally $(\bar\kappa, \bar\nu)$-SC barrier function.
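Numerically, the smallest solution of (19) and the corresponding $\bar\nu$ can be found by a simple bisection (our sketch, reusing inverse_psi and psi_log from the previous snippet; $K$ and $N$ are passed in as functions of $t$):

def local_sc_parameters(psi, K, N, n, kappa_hi=10.0, iters=100):
    """Smallest kappa_bar with K(varrho(1/(52 kappa_bar^2))) <= kappa_bar, and the
    corresponding nu_bar = 2 n N(chi(1/(52 kappa_bar^2)))^2."""
    lo, hi = 0.0, kappa_hi                  # assumes the condition holds at kappa_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        t_up = inverse_psi(psi, 1.0 / (52.0 * mid * mid), lower=False)
        lo, hi = (mid, hi) if K(t_up) > mid else (lo, mid)
    kappa_bar = hi
    t_lo = inverse_psi(psi, 1.0 / (52.0 * kappa_bar**2), lower=True)
    nu_bar = 2.0 * n * N(t_lo) ** 2
    return kappa_bar, nu_bar

# For the classical log kernel (K = 1, N = -1) this gives kappa_bar = 1, nu_bar = 2n.
print(local_sc_parameters(psi_log, lambda t: 1.0, lambda t: -1.0, n=10))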

Analysis of the algorithm (4)

Substitution of the chosen values of $\bar\kappa$ and $\bar\nu$ yields (also using that $\mu^0 = 1$) the following iteration bound for the algorithm:
\[ 2\bigl(1 + 4\bar\kappa\sqrt{\bar\nu}\bigr)\ln\frac{2\bar\nu}{\epsilon} = 2\left(1 + 4\bar\kappa\sqrt{2n}\,\left|N\!\left(\chi\!\left(\frac{1}{52\bar\kappa^2}\right)\right)\right|\right)\ln\frac{4n\left[N\!\left(\chi\!\left(\frac{1}{52\bar\kappa^2}\right)\right)\right]^2}{\epsilon}. \]
Note that, apart from the factor $\sqrt n$ and the $n$ inside the logarithm, the coefficients occurring in this expression depend only on the kernel function $\psi$, and not on $n$. Thus we may safely state that for every kernel function satisfying our conditions the iteration bound is
\[ O\!\left(\sqrt n\,\log\frac n\epsilon\right). \]

Concluding remarks

Recently we have used kernel-function-based barrier functions (including so-called self-regular kernel functions) to improve the iteration bound for large-update methods from $O(n\log\frac n\epsilon)$ to $O(\sqrt n\,(\log n)\log\frac n\epsilon)$. We were surprised to observe (most of the time after a tedious analysis, for each kernel function separately) that the iteration bounds for small-update methods based on these barrier functions always turned out to be $O(\sqrt n\log\frac n\epsilon)$. The current results seem to explain this phenomenon.

The results presented in this talk can easily be generalized to other (symmetric) cone optimization problems, like second-order cone optimization and semidefinite optimization.

The next challenge is to find out whether we can obtain the improved bounds for large-update methods by using this approach.

Some references

Y.Q. Bai, M. El Ghami, and C. Roos. A comparative study of kernel functions for primal-dual interior-point algorithms in linear optimization. SIAM Journal on Optimization, 15(1):101-128 (electronic), 2004.

J. Peng, C. Roos, and T. Terlaky. Self-Regularity. A New Paradigm for Primal-Dual Interior-Point Algorithms. Princeton University Press, 2002.

M. Salahi, T. Terlaky, and G. Zhang. The complexity of self-regular proximity based infeasible IPMs. Technical Report 2003/3, Advanced Optimization Laboratory, McMaster University, Hamilton, Ontario, Canada, 2003.

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.

Y. Nesterov. Introductory Lectures on Convex Optimization. Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004.

F. Glineur. Topics in Convex Optimization: Interior-Point Methods, Conic Duality and Approximations. PhD thesis, Faculté Polytechnique de Mons, Mons, Belgium, 2001.

Y.E. Nesterov and A.S. Nemirovskii. Interior Point Polynomial Methods in Convex Programming: Theory and Algorithms. SIAM, Philadelphia, USA, 1993.