Interior Point Methods for Mathematical Programming

Interior Point Methods for Mathematical Programming. Clóvis C. Gonzaga, Federal University of Santa Catarina, Florianópolis, Brazil. EURO 2013, Roma.

Our heroes: Cauchy, Newton, Lagrange.

Early results. Unconstrained minimization: minimize $f(x)$. Inequality constraints: minimize $f(x)$ subject to $g(x) \le 0$. Cauchy (steepest descent): $x^+ = x - \lambda \nabla f(x)$. Newton: $x^+ = x - H(x)^{-1} \nabla f(x)$. For the constrained problem, Frisch, Fiacco and McCormick proposed the barrier $F(x) = f(x) - \mu \sum_i \log(-g_i(x))$.

The barrier function (figure: Problem P1)

Near the boundary, the barrier function tends to $+\infty$.
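As a small illustration of the barrier idea (a sketch of my own, not from the talk), consider the toy problem of minimizing $(x-2)^2$ subject to $x \le 1$: the barrier function $(x-2)^2 - \mu\log(1-x)$ is minimized by damped Newton steps for a decreasing sequence of $\mu$, and the minimizers approach the constrained solution $x^* = 1$.

```python
def barrier_newton(x, mu, iters=50):
    """Damped Newton on F_mu(x) = (x - 2)**2 - mu*log(1 - x), defined for x < 1."""
    for _ in range(iters):
        grad = 2.0 * (x - 2.0) + mu / (1.0 - x)
        hess = 2.0 + mu / (1.0 - x) ** 2
        step = -grad / hess
        t = 1.0
        while x + t * step >= 1.0:      # damp the step to stay strictly inside x < 1
            t *= 0.5
        x += t * step
        if abs(grad) < 1e-10:
            break
    return x

x, mu = 0.0, 1.0                        # strictly feasible start, initial barrier weight
for _ in range(30):
    x = barrier_newton(x, mu)
    mu *= 0.5                           # shrink the barrier parameter
print(x)                                # approaches the constrained minimizer x* = 1
```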

The Linear Programming Problem. Inequality-constrained form: minimize $c^T x$ subject to $Ax \le b$. Equality and non-negativity form: minimize $c^T x$ subject to $Ax = b$, $x \ge 0$. Interior point: $x > 0$. Solution: $Ax = b$. Feasible solution: $Ax = b$, $x \ge 0$. Interior (feasible) solution: $Ax = b$, $x > 0$.

Simplex Method (Dantzig, 1948): exponential in the worst case, but good for typical problems.

Centers of gravity. Conclusions: worst-case performance is $O(n \ln(1/\varepsilon))$ calls to the oracle. The method is generally not implementable: computing the center of gravity is too difficult. Work with simpler shapes: ellipsoids.

The ellipsoid method (Shor, Nemirovsky, Yudin, Khachiyan). Performance: the computation of each new ellipsoid needs $O(n^2)$ arithmetic operations. Worst-case performance: $O(n^2 \ln(1/\varepsilon))$ calls to the oracle, and $O(n^4 \ln(1/\varepsilon))$ operations.

The Khachiyan Sputnik, 1978. Khachiyan applied the ellipsoid method to LP and defined the number $L$ = number of bits in the input for a problem with rational data. A solution with precision $2^{-L}$ can be purified to an exact vertex solution of the LP. Performance: $O(n^4 L)$ computations.

Interior point methods. Karmarkar 1984: projective IP method, $O(n^{3.5} L)$ computations. Renegar 1986: path-following algorithm, $O(n^{3.5} L)$ computations. Vaidya and Gonzaga (independently) 1987: path-following algorithms, $O(n^3 L)$ computations.

Extensions. Linear and quadratic programming: Jacek Gondzio solved a portfolio optimization problem based on Markowitz analysis, with 16 million scenarios, 1.10 billion variables and 353 million constraints, in 53 iterations of an IP method and 50 minutes on a system of 1280 parallel processors. Second-order cone programming: extension of LP to the Lorentz (second-order) cone, with applications to quadratically constrained quadratic programming, robust LP, robust least squares, filter design, portfolio optimization, truss design, ... Semidefinite programming: a huge body of theory and applications to structural optimization, optimal control, combinatorial optimization, ...

The Affine-Scaling direction. Projection matrix: given $c \in \mathbb{R}^n$ and a matrix $A$, $c$ can be decomposed as $c = P_A c + A^T y$, where $P_A c \in \mathcal{N}(A)$ is the projection of $c$ onto the null space $\mathcal{N}(A)$.

The Affine-Scaling direction. Linearly constrained problem: minimize $f(x)$ subject to $Ax = b$, $x \ge 0$. Define $c = \nabla f(x^0)$. The projected gradient (Cauchy) direction is $c_P = P_A c$, and the steepest descent direction is $d = -c_P$. It solves the trust-region problem $\min\{\,c^T h \mid Ah = 0,\ \|h\| \le \|d\|\,\}$.


Largest ball centered at $x^k$

Largest simple ellipsoid centered at $x^k$

The affine scaling direction (figure)

Largest ball centered at $e = (1,1,\dots,1)$

The Affine-Scaling direction. Given a feasible point $x^0$, let $X = \operatorname{diag}(x^0)$ and $c = \nabla f(x^0)$. The problem: minimize $c^T x$ subject to $Ax = b$, $x \ge 0$, is rescaled by $x = X\bar{x}$, $d = X\bar{d}$ into: minimize $(Xc)^T \bar{x}$ subject to $AX\bar{x} = b$, $\bar{x} \ge 0$, so that $x^0$ is scaled into $e$. Scaled steepest descent direction: $\bar{d} = -P_{AX} Xc$. Dikin's direction: $d = X\bar{d} = -X P_{AX} Xc$, or, normalized, $d = X\bar{d}/\|\bar{d}\|$.
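A minimal numerical sketch of this construction, with toy data of my own (the function and variable names are mine): scale by $X$, project the scaled gradient onto the null space of $AX$, and map the direction back.

```python
import numpy as np

def affine_scaling_direction(A, c, x):
    X = np.diag(x)
    AX = A @ X
    # projector onto N(AX): P = I - (AX)^T ((AX)(AX)^T)^{-1} (AX)
    proj = np.eye(len(x)) - AX.T @ np.linalg.solve(AX @ AX.T, AX)
    d_scaled = -proj @ (X @ c)          # scaled steepest-descent direction
    return X @ d_scaled                  # Dikin's direction in the original space

# toy data: two equality constraints in R^4, strictly feasible x
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.array([1.0, 2.0, 1.0, 3.0])
b = A @ x                                # right-hand side consistent with x
c = np.array([1.0, -1.0, 2.0, 0.5])

d = affine_scaling_direction(A, c, x)
print(np.round(A @ d, 10))               # stays in the null space of A: ~0
print(c @ d <= 0)                        # descent direction for c^T x: True
```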

Dikin's algorithm (figures: Problem P1)

Affine scaling algorithm (figures: Problem P1)

The logarithmic barrier function. For $x \in \mathbb{R}^n_{++}$, $p(x) = -\sum_{i=1}^n \log x_i$. Scaling: for a diagonal matrix $D > 0$, $p(Dx) = p(x) + \text{constant}$, so $p(Dx) - p(Dy) = p(x) - p(y)$. Derivatives: $\nabla p(x) = -x^{-1}$, $\nabla p(e) = -e$, $\nabla^2 p(x) = X^{-2}$, $\nabla^2 p(e) = I$. At $x = e$, the Hessian matrix is the identity, and hence the Newton direction coincides with the Cauchy direction. At any $x > 0$, the affine scaling direction coincides with the Newton direction.
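A quick numerical check of these properties (illustrative only; the helper names are mine):

```python
import numpy as np

def p(x):      return -np.sum(np.log(x))
def grad_p(x): return -1.0 / x
def hess_p(x): return np.diag(1.0 / x**2)

rng = np.random.default_rng(0)
x, y = rng.uniform(0.5, 2.0, 4), rng.uniform(0.5, 2.0, 4)
D = np.diag(rng.uniform(0.5, 2.0, 4))

# p(Dx) - p(Dy) = p(x) - p(y): the barrier "sees" only relative scaling
print(np.isclose(p(D @ x) - p(D @ y), p(x) - p(y)))   # True

e = np.ones(4)
print(grad_p(e))           # -e
print(hess_p(e))           # the identity matrix
```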

The penalized function in linear programming. For $x > 0$, $\mu > 0$ and $\alpha = 1/\mu$, define $f_\alpha(x) = \alpha c^T x + p(x)$, or $f_\mu(x) = c^T x + \mu p(x)$. The problem: minimize $c^T x$ subject to $Ax = b$, $x \ge 0$, is replaced by: minimize $c^T x + \mu p(x)$ subject to $Ax = b$, $x > 0$. For $\alpha \ge 0$, $f_\alpha$ is strictly convex and grows indefinitely near the boundary of the feasible set. Whenever the minimizers exist, they are uniquely defined by $x_\alpha = \operatorname{argmin}_{x \in \Omega} f_\alpha(x)$. In particular, if $\Omega$ is bounded, $x_0$ is the analytic center of $\Omega$. If the optimal face of the problem is bounded, then the curve $\alpha > 0 \mapsto x_\alpha$ is well defined and is called the primal central path.

The central path (figure: Problem P1)

Equivalent definitions of the central path. There are four equivalent ways of defining central points: minimizers of the penalized function, $\operatorname{argmin}_{x \in \Omega} f_\alpha(x)$; analytic centers of constant-cost slices, $\operatorname{argmin}_{x \in \Omega} \{\,p(x) \mid c^T x = K\,\}$; Renegar centers, i.e. analytic centers of $\Omega$ with an extra constraint on $c^T x$, $\operatorname{argmin}_{x \in \Omega} \{\,p(x) - \log(K - c^T x) \mid c^T x < K\,\}$; and primal-dual central points (seen ahead).

Constant cost slices (interactive demo)

Renegar cuts (figures: Problem Ptemp)

Centering. The most important problem in interior point methods is the following. Centering problem: given a feasible interior point $x^0$ and a value $\alpha \ge 0$, solve approximately the problem minimize$_{x \in \Omega^0}\ \alpha c^T x + p(x)$. The Newton direction from $x^0$ coincides with the affine-scaling direction, and hence is the best possible. It is given by $d = X\bar{d}$, $\bar{d} = -P_{AX} X(\alpha c - x^{-1}) = -\alpha P_{AX} Xc + P_{AX} e$.

Efficiency of the Newton step for centering. Newton direction: $d = X\bar{d}$, $\bar{d} = -P_{AX} X(\alpha c - x^{-1}) = -\alpha P_{AX} Xc + P_{AX} e$. We define the proximity to the central point as $\delta(x,\alpha) = \|\bar{d}\| = \|\alpha P_{AX} Xc - P_{AX} e\|$. The following important theorem says how efficient the step is. Theorem: consider a feasible point $x$ and a parameter $\alpha$, and let $x^+ = x + d$ be the point resulting from a Newton centering step. If $\delta(x,\alpha) = \delta < 1$, then $\delta(x^+,\alpha) < \delta^2$. If $\delta(x,\alpha) \le 0.5$, then this value is a very good approximation of the Euclidean distance between $e$ and $X^{-1} x_\alpha$, i.e., between $x$ and $x_\alpha$ in the scaled space.
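The following sketch, on a toy feasible LP of my own, computes $\bar{d}$ and the proximity $\delta(x,\alpha)$ and repeats the centering step for a fixed $\alpha$; the printed proximities shrink roughly quadratically once below 1, as the theorem predicts.

```python
import numpy as np

def newton_centering_direction(A, c, x, alpha):
    X = np.diag(x)
    AX = A @ X
    P = np.eye(len(x)) - AX.T @ np.linalg.solve(AX @ AX.T, AX)   # projector onto N(AX)
    dbar = -P @ (alpha * (X @ c) - np.ones(len(x)))               # scaled Newton direction
    return X @ dbar, np.linalg.norm(dbar)                         # direction d, proximity delta

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.ones(4)                      # strictly feasible for b := A x
c = np.array([1.0, 2.0, 0.5, 1.0])
alpha = 0.1

for _ in range(6):                  # repeated centering steps for this fixed alpha
    d, delta = newton_centering_direction(A, c, x, alpha)
    print(round(delta, 8))          # shrinks roughly quadratically once below 1
    if delta >= 1.0:                # crude safeguard, not needed for this data
        d *= 0.5 / delta
    x = x + d
```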

Short steps

Primal results, as we saw, are important to give a geometrical meaning to the procedures and to develop intuition. These results can also be generalized to a large class of problems by generalizing the idea of barrier functions. From now on we shall deal with primal-dual methods, which are more efficient for linear and nonlinear programming.

The Linear Programming Problem. (LP): minimize $c^T x$ subject to $Ax = b$, $x \ge 0$. (LD): maximize $b^T y$ subject to $A^T y \le c$, or, introducing the slack $s$, maximize $b^T y$ subject to $A^T y + s = c$, $s \ge 0$. KKT conditions for (LP), with multipliers $y, s$: $A^T y + s = c$, $Ax = b$, $xs = 0$, $x, s \ge 0$. The KKT conditions for (LD), with multipliers $x$, are the same. Primal-dual optimality conditions: $A^T y + s = c$, $Ax = b$, $xs = 0$, $x, s \ge 0$. Duality gap: for $x, y, s$ feasible, $c^T x - b^T y = x^T s \ge 0$. (LP) has a solution $x^*$ and (LD) has a solution $y^*, s^*$ if and only if the optimality conditions have a solution $x, y, s$.
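A tiny numerical check of the duality gap identity on toy data constructed to be primal and dual feasible (the data are mine, chosen only so that feasibility holds by construction):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.array([1.0, 2.0, 1.0, 3.0]); b = A @ x      # primal feasible, x > 0
y = np.array([0.5, -0.25]); s = np.array([1.0, 0.5, 2.0, 1.5])
c = A.T @ y + s                                     # dual feasible by construction

gap = c @ x - b @ y
print(np.isclose(gap, x @ s))      # c^T x - b^T y = x^T s
print(gap >= 0)                    # nonnegative for feasible (x, y, s)
```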

Primal-dual centering. Let us write the KKT conditions for the centering problem (now using $\mu$ instead of $\alpha = 1/\mu$): minimize $c^T x - \mu \sum \log x_i$ subject to $Ax = b$, $x > 0$. Primal-dual center (KKT conditions): $xs = \mu e$, $A^T y + s = c$, $Ax = b$, $x, s > 0$.

Generalization. Let us write the KKT conditions for the convex quadratic programming problem: minimize $c^T x + \frac{1}{2} x^T Hx$ subject to $Ax = b$, $x \ge 0$. The first KKT condition is written as $c + Hx - A^T y - s = 0$. To obtain a symmetric formulation for the problem, we may multiply this equation by a matrix $B$ whose rows form a basis for the null space of $A$. Then $BA^T y = 0$, and we obtain the conditions: $xs = 0$, $Bs - BHx = Bc$, $Ax = b$, $x, s \ge 0$.

Horizontal linear complementarity problem. In any case, the problem can be written as $xs = 0$, $Qx + Rs = b$, $x, s \ge 0$. This is a linear complementarity problem, which includes linear and quadratic programming as particular cases. The techniques studied here apply to these problems as long as the following monotonicity condition holds: for any feasible pair of directions $(u,v)$ such that $Qu + Rv = 0$, we have $u^T v \ge 0$. The optimal face: the optimal solutions must satisfy $x_i s_i = 0$ for $i = 1,\dots,n$. This is a combinatorial constraint, responsible for all the difficulty of the solution.
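As a sketch of the monotonicity condition in the LP case (my own construction): take $B$ with rows spanning $\mathcal{N}(A)$, so that the feasible directions satisfy $Au = 0$ and $Bv = 0$, and then $u^T v = 0$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 5))                 # a random full-rank 2x5 matrix

# rows of B = orthonormal basis of N(A), taken from the SVD of A
_, _, Vt = np.linalg.svd(A)
B = Vt[2:]                                      # the last n - m right singular vectors
print(np.allclose(B @ A.T, 0))                  # B A^T = 0

# a feasible direction pair: A u = 0 and B v = 0 (so v lies in the row space of A)
u = B.T @ rng.standard_normal(3)
v = A.T @ rng.standard_normal(2)
print(np.isclose(u @ v, 0.0))                   # u^T v = 0 >= 0: monotone (skew) case
```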

Primal-dual centering: the Newton step. Given $x, s$ feasible and $\mu > 0$, find $x^+ = x + u$, $s^+ = s + v$ such that $x^+ s^+ = \mu e$ and $Qx^+ + Rs^+ = b$, that is, $xs + su + xv + uv = \mu e$, $Qu + Rv = 0$. Newton step (dropping the second-order term $uv$): $Xv + Su = -xs + \mu e$, $Qu + Rv = 0$. Solving this linear system is all the work. In the case of linear programming one should keep the multipliers $y$ and simplify the resulting system of equations.
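A minimal sketch of one feasible primal-dual Newton centering step for LP, assembling the full linear system ($Au = 0$, $A^T\Delta y + v = 0$, $Su + Xv = \mu e - xs$) with dense toy data of my own; a real implementation would exploit structure instead of forming the dense matrix.

```python
import numpy as np

def newton_centering_step(A, x, y, s, mu):
    """Solve A u = 0, A^T dy + v = 0, S u + X v = mu*e - x*s as one dense system."""
    m, n = A.shape
    K = np.zeros((2 * n + m, 2 * n + m))
    rhs = np.zeros(2 * n + m)
    K[:m, :n] = A                               # A u = 0
    K[m:m + n, n:n + m] = A.T                   # A^T dy + v = 0
    K[m:m + n, n + m:] = np.eye(n)
    K[m + n:, :n] = np.diag(s)                  # S u + X v = mu*e - x*s
    K[m + n:, n + m:] = np.diag(x)
    rhs[m + n:] = mu - x * s
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:n + m], sol[n + m:]   # u, dy, v

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.ones(4); b = A @ x
y = np.zeros(2); s = np.array([1.0, 1.2, 0.9, 1.1]); c = A.T @ y + s

mu = x @ s / len(x)
u, dy, v = newton_centering_step(A, x, y, s, mu)
x, y, s = x + u, y + dy, s + v                  # a full step is safe when close to the path
print(np.allclose(A @ x, b), np.all(x > 0), np.all(s > 0))
print(round(np.linalg.norm(x * s / mu - 1.0), 6))   # proximity to the mu-center drops sharply
```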

Primal-dual centering: proximity measure. Given $x, s$ feasible and $\mu > 0$, we want $xs = \mu e$, or equivalently $\frac{xs}{\mu} - e = 0$. The error in this equation gives the proximity measure $\delta(x,s,\mu) = \left\|\frac{xs}{\mu} - e\right\|$. Theorem: given a feasible pair $(x,s)$ and a parameter $\mu$, let $x^+ = x + u$ and $s^+ = s + v$ be the point resulting from a Newton centering step. If $\delta(x,s,\mu) = \delta < 1$, then $\delta(x^+,s^+,\mu) < \frac{1}{8}\,\frac{\delta^2}{1-\delta}$. In particular, if $\delta \le 0.7$, then $\delta(x^+,s^+,\mu) < \delta^2$.
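The proximity measure itself is one line of code (a tiny helper sketch with data of my own; the same helper reappears in the sketches below):

```python
import numpy as np

def proximity(x, s, mu):
    """delta(x, s, mu) = || x*s/mu - e ||, with the componentwise product x*s."""
    return np.linalg.norm(x * s / mu - 1.0)

x = np.ones(4)
s = np.array([1.0, 1.2, 0.9, 1.1])
print(round(proximity(x, s, x @ s / len(x)), 4))    # about 0.21: well inside delta < 1
```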

Primal-dual path following: traditional approach. Assume that we have $x, s, \mu$ such that $(x,s)$ is feasible and $\delta(x,s,\mu) \le \alpha < 1$. Choose $\mu^+ = \gamma\mu$, with $\gamma < 1$. Use Newton's algorithm (with line searches to avoid infeasible points) to find $(x^+,s^+)$ such that $\delta(x^+,s^+,\mu^+) \le \alpha$. Neighborhood of the central path: given $\beta \in (0,1)$, we define the neighborhood $\eta(\beta)$ as the set of all feasible pairs $(x,s)$ such that $\delta(x,s,\mu) \le \beta$ for some $\mu > 0$. The methods must ensure, using line searches, that all iterates stay in such a neighborhood.

A neighborhood of the central path

Short steps. Using $\gamma$ near 1, we obtain short steps. With $\gamma = 1 - 0.4/\sqrt{n}$, only one Newton step is needed at each iteration, and the algorithm is polynomial: it finds a solution with precision $2^{-L}$ in $O(\sqrt{n}\,L)$ iterations.
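A sketch of the short-step recursion on toy data of my own, with the Newton system reduced to normal equations in $\Delta y$ (this reduction is one way of "simplifying the system" mentioned earlier; the helper names are mine):

```python
import numpy as np

def newton_step(A, x, s, mu):
    """Feasible primal-dual Newton direction for target mu, via normal equations in dy."""
    r = mu - x * s                       # right-hand side of S u + X v = mu*e - x*s
    M = (A * (x / s)) @ A.T              # A S^{-1} X A^T
    dy = np.linalg.solve(M, -(A @ (r / s)))
    v = -A.T @ dy
    u = (r - x * v) / s
    return u, v, dy

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.ones(4); b = A @ x
y = np.zeros(2); s = np.array([1.0, 1.2, 0.9, 1.1]); c = A.T @ y + s

n = len(x)
mu = x @ s / n
gamma = 1.0 - 0.4 / np.sqrt(n)
for k in range(60):
    mu *= gamma                          # shrink the target along the central path
    u, v, dy = newton_step(A, x, s, mu)
    x, s, y = x + u, s + v, y + dy       # one full Newton step per iteration
print(x @ s)                             # duality gap, reduced by the factor gamma^60
```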

Short steps (figure: Problem P1)

Large steps. Using $\gamma$ small, say $\gamma = 0.1$, we obtain large steps. This tends to work well in practice, but some sort of line search is needed to avoid leaving the neighborhood. Predictor-corrector methods are better, as we shall see.

Large steps (figure: Problem P1)

Adaptive methods. Assume that a feasible pair $(x,s)$ is given in $\eta(\beta)$, but no value of $\mu$ is given. If $(x,s)$ is a central point, then $xs = \mu e$ implies $x^T s = n\mu$; hence the best choice is $\mu = x^T s/n$. If $(x,s)$ is not a central point, the value $\mu(x,s) = x^T s/n$ still gives, in a certain sense, the best possible parameter value. An adaptive algorithm does not reuse a value of $\mu$ coming from a former iteration: it computes $\mu(x,s)$ and then chooses $\gamma\mu(x,s)$ as the new target. The target may be far: compute a direction $(u,v)$ and follow it until $\delta(x+\lambda u,\, s+\lambda v,\, \mu(x+\lambda u, s+\lambda v)) = \beta$, as sketched below.
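A sketch of this adaptive step-length rule (my own helper, using simple bisection on $\lambda$); the same idea is reused inside the predictor-corrector sketch below.

```python
import numpy as np

def proximity(x, s):
    """Proximity to the mu-center with the adaptive choice mu(x, s) = x^T s / n."""
    mu = x @ s / len(x)
    return np.linalg.norm(x * s / mu - 1.0)

def adaptive_step(x, s, u, v, beta=0.5, max_lambda=10.0, bisections=60):
    """Largest step length (found by bisection) keeping (x, s) in the neighborhood eta(beta)."""
    lo, hi = 0.0, max_lambda
    for _ in range(bisections):
        lam = 0.5 * (lo + hi)
        xl, sl = x + lam * u, s + lam * v
        if np.all(xl > 0) and np.all(sl > 0) and proximity(xl, sl) <= beta:
            lo = lam                     # still inside the neighborhood: try a longer step
        else:
            hi = lam
    return lo
```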

Adaptive steps

Predictor-corrector methods. Alternate two kinds of iterations. Predictor: the iteration starts with $(x,s)$ near the central path and computes a Newton step $(u,v)$ with target $\gamma\mu(x,s)$, $\gamma$ small; follow it until $\delta(x+\lambda u, s+\lambda v, \mu(x+\lambda u, s+\lambda v)) = \beta$. Corrector: set $x^+ = x + \lambda u$, $s^+ = s + \lambda v$, compute $\mu = \mu(x^+,s^+)$ and take a Newton step with target $\mu$. If the predictor uses $\gamma = 0$, it is called the affine-scaling step: it has no centering and tries to solve the original problem in one step. Using a neighborhood with $\beta = 0.5$, the resulting algorithm (the Mizuno-Todd-Ye algorithm) converges quadratically to an optimal solution, keeping the complexity at its best value of $O(\sqrt{n}\,L)$ iterations.
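A compact sketch of the predictor-corrector idea on toy data of my own: the predictor is the affine-scaling step ($\gamma = 0$) followed to the boundary of the $\beta = 0.5$ neighborhood, and the corrector is one Newton step back towards the central path. This is only an illustration of the scheme described above, not a faithful implementation of the Mizuno-Todd-Ye algorithm.

```python
import numpy as np

def newton_step(A, x, s, mu):
    """Feasible primal-dual Newton direction for target mu (same reduction as before)."""
    r = mu - x * s
    M = (A * (x / s)) @ A.T
    dy = np.linalg.solve(M, -(A @ (r / s)))
    v = -A.T @ dy
    return (r - x * v) / s, v

def proximity(x, s):
    mu = x @ s / len(x)
    return np.linalg.norm(x * s / mu - 1.0)

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.ones(4); b = A @ x
s = np.array([1.0, 1.2, 0.9, 1.1])                  # dual slack of a feasible pair (y = 0)
beta = 0.5

for k in range(30):
    u, v = newton_step(A, x, s, 0.0)                # predictor: affine-scaling step (gamma = 0)
    lo, hi = 0.0, 1.0
    for _ in range(60):                             # follow it to the boundary of eta(beta)
        lam = 0.5 * (lo + hi)
        xl, sl = x + lam * u, s + lam * v
        if np.all(xl > 0) and np.all(sl > 0) and proximity(xl, sl) <= beta:
            lo = lam
        else:
            hi = lam
    x, s = x + lo * u, s + lo * v
    u, v = newton_step(A, x, s, x @ s / len(x))     # corrector: recenter at mu(x, s)
    x, s = x + u, s + v
    if x @ s < 1e-8:
        break
print(x @ s)                                        # duality gap, driven toward 0
```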

Predictor-corrector

Mehrotra predictor-corrector method: second order. When computing the Newton step, we eliminated the nonlinear term $uv$ in the equations $xs + su + xv + uv = \mu e$, $Qu + Rv = 0$. The second-order method corrects the values of $u, v$ by estimating the term $uv$ with a predictor step. Predictor: the iteration starts with $(x,s)$ near the central path and computes a Newton step $(u,v)$ with target $\mu^+$ small; the first equation is $xv + su = -xs + \mu^+ e$. Then compute a correction $(\Delta u, \Delta v)$ from $x\Delta v + s\Delta u = -uv$. Line search: set $x^+ = x + \lambda u + \lambda^2 \Delta u$, $s^+ = s + \lambda v + \lambda^2 \Delta v$, with $\lambda$ chosen by a line search so that $\delta(x^+, s^+, \mu(x^+,s^+)) = \beta$.
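A sketch of this second-order correction on the same kind of toy data (my own): the coefficient matrix of the Newton system is reused with the new right-hand side $-uv$, and the combined step $x + \lambda u + \lambda^2\Delta u$ is chosen by a line search on the proximity.

```python
import numpy as np

def solve_kkt(A, x, s, r):
    """Solve S u + X v = r, A u = 0, A^T dy + v = 0 (normal equations in dy)."""
    M = (A * (x / s)) @ A.T
    dy = np.linalg.solve(M, -(A @ (r / s)))
    v = -A.T @ dy
    return (r - x * v) / s, v

def proximity(x, s):
    mu = x @ s / len(x)
    return np.linalg.norm(x * s / mu - 1.0)

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
x = np.ones(4); b = A @ x
s = np.array([1.0, 1.2, 0.9, 1.1])
beta = 0.5

for k in range(30):
    mu_target = 0.1 * (x @ s) / len(x)              # aggressive target, gamma = 0.1
    u, v = solve_kkt(A, x, s, mu_target - x * s)    # first-order (Newton) step
    du, dv = solve_kkt(A, x, s, -u * v)             # second-order correction for the uv term
    lo, hi = 0.0, 1.0
    for _ in range(60):                             # line search on the combined step
        lam = 0.5 * (lo + hi)
        xl = x + lam * u + lam ** 2 * du
        sl = s + lam * v + lam ** 2 * dv
        if np.all(xl > 0) and np.all(sl > 0) and proximity(xl, sl) <= beta:
            lo = lam
        else:
            hi = lam
    x = x + lo * u + lo ** 2 * du
    s = s + lo * v + lo ** 2 * dv
    if x @ s < 1e-8:
        break
print(x @ s)                                        # duality gap after the run
```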

Mehrotra Predictor-corrector

Thank you