Interior Point Methods for Convex Quadratic and Convex Nonlinear Programming

School of Mathematics, The University of Edinburgh. Interior Point Methods for Convex Quadratic and Convex Nonlinear Programming. Jacek Gondzio. Email: J.Gondzio@ed.ac.uk URL: http://www.maths.ed.ac.uk/~gondzio Paris, January 2018 1

Outline. Part 1: IPM for QP: quadratic forms; duality in QP; first order optimality conditions; primal-dual framework. Part 2: IPM for NLP: NLP notation; Lagrangian; first order optimality conditions; primal-dual framework; Algebraic Modelling Languages; self-concordant barriers. Paris, January 2018 2

IPM for Convex QP Paris, January 2018 3

Convex Quadratic Programs. The quadratic function f(x) = x^T Qx is convex if and only if the matrix Q is positive semidefinite (and strictly convex if Q is positive definite). In that case the quadratic programming problem min c^T x + (1/2) x^T Qx s.t. Ax = b, x ≥ 0, is well defined. If it has a feasible solution and its objective is bounded below on the feasible set, then it has an optimal solution. Paris, January 2018 4

QP Background: Def. A matrix Q ∈ R^{n×n} is positive semidefinite if x^T Qx ≥ 0 for any x. We write Q ⪰ 0. Def. A matrix Q ∈ R^{n×n} is positive definite if x^T Qx > 0 for any x ≠ 0. We write Q ≻ 0. Example: Consider quadratic functions f(x) = x^T Qx with the following matrices: Q_1 = [1 0; 0 2], Q_2 = [1 0; 0 −1], Q_3 = [5 4; 4 3]. Q_1 is positive definite (hence f_1 is convex). Q_2 and Q_3 are indefinite (f_2, f_3 are not convex). Paris, January 2018 5
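
A quick way to check these definitions numerically is to inspect the eigenvalues of each symmetric matrix. The sketch below (Python/NumPy, not part of the original slides; Q_2 is written with the sign pattern that makes it indefinite, as the slide asserts) classifies the three example matrices.

```python
import numpy as np

def classify(Q, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(Q)           # eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "positive definite"
    if np.all(eig >= -tol):
        return "positive semidefinite"
    if np.all(eig < -tol):
        return "negative definite"
    return "indefinite"

Q1 = np.array([[1.0, 0.0], [0.0, 2.0]])   # positive definite
Q2 = np.array([[1.0, 0.0], [0.0, -1.0]])  # indefinite: one eigenvalue of each sign
Q3 = np.array([[5.0, 4.0], [4.0, 3.0]])   # indefinite: det = -1 < 0

for name, Q in [("Q1", Q1), ("Q2", Q2), ("Q3", Q3)]:
    print(name, np.linalg.eigvalsh(Q), classify(Q))
```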

QP with IPMs. Consider the convex quadratic programming problem. The primal min c^T x + (1/2) x^T Qx s.t. Ax = b, x ≥ 0, and the dual max b^T y − (1/2) x^T Qx s.t. A^T y + s − Qx = c, x, s ≥ 0. Apply the usual procedure: replace inequalities with log barriers; form the Lagrangian; write the first order optimality conditions; apply Newton method to them. Paris, January 2018 6

QP with IPMs: Log Barriers. Replace the primal QP min c^T x + (1/2) x^T Qx s.t. Ax = b, x ≥ 0, with the primal barrier QP min c^T x + (1/2) x^T Qx − µ Σ_{j=1}^n ln x_j s.t. Ax = b. Paris, January 2018 7

QP with IPMs: Log Barriers. Replace the dual QP max b^T y − (1/2) x^T Qx s.t. A^T y + s − Qx = c, y free, s ≥ 0, with the dual barrier QP max b^T y − (1/2) x^T Qx + µ Σ_{j=1}^n ln s_j s.t. A^T y + s − Qx = c. Paris, January 2018 8

First Order Optimality Conditions. Consider the primal barrier quadratic program min c^T x + (1/2) x^T Qx − µ Σ_{j=1}^n ln x_j s.t. Ax = b, where µ ≥ 0 is a barrier parameter. Write out the Lagrangian L(x,y,µ) = c^T x + (1/2) x^T Qx − y^T (Ax − b) − µ Σ_{j=1}^n ln x_j. Paris, January 2018 9

First Order Optimality Conditions (cont'd). The conditions for a stationary point of the Lagrangian L(x,y,µ) = c^T x + (1/2) x^T Qx − y^T (Ax − b) − µ Σ_{j=1}^n ln x_j are ∇_x L(x,y,µ) = c − A^T y − µX^{-1}e + Qx = 0, ∇_y L(x,y,µ) = Ax − b = 0, where X^{-1} = diag{x_1^{-1}, x_2^{-1}, ..., x_n^{-1}}. Let us denote s = µX^{-1}e, i.e. XSe = µe. The First Order Optimality Conditions are: Ax = b, A^T y + s − Qx = c, XSe = µe. Paris, January 2018 10

Apply Newton Method to the FOC. The first order optimality conditions for the barrier problem form a large system of nonlinear equations F(x,y,s) = 0, where F : R^{2n+m} → R^{2n+m} is the mapping defined as follows: F(x,y,s) = [Ax − b; A^T y + s − Qx − c; XSe − µe]. Actually, the first two terms of it are linear; only the last one, corresponding to the complementarity condition, is nonlinear. Note that ∇F(x,y,s) = [A, 0, 0; −Q, A^T, I; S, 0, X]. Paris, January 2018 11

Newton Method for the FOC (cont'd). Thus, for a given point (x,y,s) we find the Newton direction (Δx, Δy, Δs) by solving the system of linear equations: [A, 0, 0; −Q, A^T, I; S, 0, X] [Δx; Δy; Δs] = [b − Ax; c − A^T y − s + Qx; µe − XSe]. Paris, January 2018 12
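
The Newton system above can be assembled and solved directly. The following sketch (illustrative only; the function name, the dense block assembly and the dense solve are my own choices, and a practical IPM would work with the reduced/augmented system instead) computes one Newton direction for given (x, y, s) and a target µ.

```python
import numpy as np

def qp_newton_direction(A, Q, c, b, x, y, s, mu):
    """One Newton step for the perturbed KKT system of the barrier QP:
         Ax = b,  A^T y + s - Qx = c,  XSe = mu e."""
    m, n = A.shape
    X, S = np.diag(x), np.diag(s)
    # Block matrix [[A, 0, 0], [-Q, A^T, I], [S, 0, X]]
    K = np.block([
        [A,  np.zeros((m, m)), np.zeros((m, n))],
        [-Q, A.T,              np.eye(n)       ],
        [S,  np.zeros((n, m)), X               ],
    ])
    rhs = np.concatenate([b - A @ x,
                          c - A.T @ y - s + Q @ x,
                          mu * np.ones(n) - x * s])
    d = np.linalg.solve(K, rhs)
    return d[:n], d[n:n+m], d[n+m:]      # dx, dy, ds
```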

Interior-Point QP Algorithm. Initialize: k = 0, (x^0, y^0, s^0) ∈ F^0, µ_0 = (1/n)(x^0)^T s^0, α_0 = 0.9995. Repeat until optimality: k = k + 1; µ_k = σµ_{k−1}, where σ ∈ (0,1); Δ = Newton direction towards the µ-center; Ratio test: α_P := max{α > 0 : x + αΔx ≥ 0}, α_D := max{α > 0 : s + αΔs ≥ 0}; Make step: x^{k+1} = x^k + α_0 α_P Δx, y^{k+1} = y^k + α_0 α_D Δy, s^{k+1} = s^k + α_0 α_D Δs. Paris, January 2018 13
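
Putting the direction computation and the ratio test together gives a bare-bones version of this scheme. The sketch below reuses the qp_newton_direction helper from the previous sketch; it is didactic only (dense linear algebra, a fixed σ, a naive stopping test, the step length capped at 1), not Gondzio's implementation, and it assumes a strictly feasible starting point with x > 0, s > 0 is supplied.

```python
import numpy as np

def ipm_qp(A, Q, c, b, x, y, s, sigma=0.1, alpha0=0.9995, tol=1e-8, max_iter=50):
    """Primal-dual IPM sketch for min c^T x + 0.5 x^T Q x, Ax = b, x >= 0."""
    n = len(x)
    mu = x @ s / n                        # duality measure
    for _ in range(max_iter):
        if mu < tol:
            break
        mu = sigma * mu                   # target the mu-center
        dx, dy, ds = qp_newton_direction(A, Q, c, b, x, y, s, mu)
        # Ratio test: largest alpha (capped at 1) keeping x + alpha*dx >= 0, s + alpha*ds >= 0
        alpha_p = min([1.0] + [-xi / dxi for xi, dxi in zip(x, dx) if dxi < 0])
        alpha_d = min([1.0] + [-si / dsi for si, dsi in zip(s, ds) if dsi < 0])
        x = x + alpha0 * alpha_p * dx
        y = y + alpha0 * alpha_d * dy
        s = s + alpha0 * alpha_d * ds
        mu = x @ s / n
    return x, y, s
```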

From LP to QP. QP problem: min c^T x + (1/2) x^T Qx s.t. Ax = b, x ≥ 0. First order conditions (for the barrier problem): Ax = b, A^T y + s − Qx = c, XSe = µe. Paris, January 2018 14

IPMs for Convex NLP Paris, January 2018 15

Convex Nonlinear Optimization. Consider the nonlinear optimization problem min f(x) s.t. g(x) ≤ 0, where x ∈ R^n, and f : R^n → R and g : R^n → R^m are convex, twice differentiable. Assumptions: f and g are convex: if there exists a local minimum then it is a global one. f and g are twice differentiable: we can use the second order Taylor approximations. Some additional (technical) conditions: we need them to prove that the point which satisfies the first order optimality conditions is the optimum. We won't use them in this course. Paris, January 2018 16

Taylor Expansion of f : R → R. Let f : R → R. If all derivatives of f are continuously differentiable at x_0, then f(x) = Σ_{k=0}^∞ (f^{(k)}(x_0)/k!)(x − x_0)^k, where f^{(k)}(x_0) is the k-th derivative of f at x_0. The first order approximation of the function: f(x) = f(x_0) + f'(x_0)(x − x_0) + r_2(x − x_0), where the remainder satisfies: lim_{x→x_0} r_2(x − x_0)/(x − x_0) = 0. Paris, January 2018 17

Taylor Expansion (cont'd). The second order approximation: f(x) = f(x_0) + f'(x_0)(x − x_0) + (1/2) f''(x_0)(x − x_0)^2 + r_3(x − x_0), where the remainder satisfies: lim_{x→x_0} r_3(x − x_0)/(x − x_0)^2 = 0. Paris, January 2018 18

Derivatives of f : R^n → R. The vector (∇f(x))^T = (∂f/∂x_1 (x), ∂f/∂x_2 (x), ..., ∂f/∂x_n (x)) is called the gradient of f at x. The matrix ∇²f(x) = [∂²f/∂x_1² (x), ∂²f/∂x_1∂x_2 (x), ..., ∂²f/∂x_1∂x_n (x); ∂²f/∂x_2∂x_1 (x), ∂²f/∂x_2² (x), ..., ∂²f/∂x_2∂x_n (x); ...; ∂²f/∂x_n∂x_1 (x), ∂²f/∂x_n∂x_2 (x), ..., ∂²f/∂x_n² (x)] is called the Hessian of f at x. Paris, January 2018 19

Taylor Expansion of f : R^n → R. Let f : R^n → R. If all derivatives of f are continuously differentiable at x_0, then f(x) = Σ_{k=0}^∞ (1/k!) f^{(k)}(x_0)(x − x_0)^k, where f^{(k)}(x_0) is the k-th derivative of f at x_0. The first order approximation of the function: f(x) = f(x_0) + ∇f(x_0)^T (x − x_0) + r_2(x − x_0), where the remainder satisfies: lim_{x→x_0} r_2(x − x_0)/‖x − x_0‖ = 0. Paris, January 2018 20

Taylor Expansion (cont'd). The second order approximation of the function: f(x) = f(x_0) + ∇f(x_0)^T (x − x_0) + (1/2)(x − x_0)^T ∇²f(x_0)(x − x_0) + r_3(x − x_0), where the remainder satisfies: lim_{x→x_0} r_3(x − x_0)/‖x − x_0‖² = 0. Paris, January 2018 21
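
The remainder statement can be made concrete numerically: the error of the second order model shrinks faster than ‖x − x_0‖². A small check with a hand-picked smooth function (the example function and its derivatives are my own, not from the slides):

```python
import numpy as np

def f(x):
    return np.exp(x[0]) + x[0] * x[1] ** 2          # example smooth function

def grad_f(x):
    return np.array([np.exp(x[0]) + x[1] ** 2, 2 * x[0] * x[1]])

def hess_f(x):
    return np.array([[np.exp(x[0]), 2 * x[1]],
                     [2 * x[1],     2 * x[0]]])

x0 = np.array([0.5, 1.0])
d = np.array([1.0, -2.0])
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = x0 + t * d
    model = f(x0) + grad_f(x0) @ (x - x0) + 0.5 * (x - x0) @ hess_f(x0) @ (x - x0)
    # ratio |f(x) - model| / ||x - x0||^2 tends to 0 as x -> x0
    print(t, abs(f(x) - model) / np.linalg.norm(x - x0) ** 2)
```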

Convexity: Reminder. Property 1. For any collection {C_i | i ∈ I} of convex sets, the intersection ∩_{i∈I} C_i is convex. Property 4. If C is a convex set and f : C → R is a convex function, the level sets {x ∈ C | f(x) ≤ α} and {x ∈ C | f(x) < α} are convex for all scalars α. Lemma 1: If g : R^n → R^m is a convex function, then the set {x ∈ R^n | g(x) ≤ 0} is convex. Proof: Since every function g_i : R^n → R, i = 1, 2, ..., m is convex, from Property 4 we conclude that every set X_i = {x ∈ R^n | g_i(x) ≤ 0} is convex. Next, from Property 1 we deduce that the intersection X = ∩_{i=1}^m X_i = {x ∈ R^n | g(x) ≤ 0} is convex, which completes the proof. Paris, January 2018 22

Differentiable Convex Functions. Property 8. Let C ⊆ R^n be a convex set and f : C → R be twice continuously differentiable over C. (a) If ∇²f(x) is positive semidefinite for all x ∈ C, then f is convex. (b) If ∇²f(x) is positive definite for all x ∈ C, then f is strictly convex. (c) If f is convex, then ∇²f(x) is positive semidefinite for all x ∈ C. Let the second order approximation of the function be given: f(x) ≈ f(x_0) + c^T (x − x_0) + (1/2)(x − x_0)^T Q (x − x_0), where c = ∇f(x_0) and Q = ∇²f(x_0). From Property 8, it follows that when f is convex and twice differentiable, then Q exists and is a positive semidefinite matrix. Conclusion: If f is convex and twice differentiable, then the minimization of f(x) can (locally) be replaced with the minimization of its quadratic model. Paris, January 2018 23

Nonlinear Optimization with IPMs. Nonlinear Optimization via QPs: Sequential Quadratic Programming (SQP). Repeat until optimality: approximate the NLP (locally) with a QP; solve (approximately) the QP. Nonlinear Optimization with IPMs works similarly to the SQP scheme. However, the (local) QP approximations are not solved to optimality. Instead, only one step in the Newton direction corresponding to a given QP approximation is made, and then a new QP approximation is computed. Paris, January 2018 24

Nonlinear Optimization with IPMs Derive an IPM for NLP: replace inequalities with log barriers; form the Lagrangian; write the first order optimality conditions; apply Newton method to them. Paris, January 2018 25

NLP Notation. Consider the nonlinear optimization problem min f(x) s.t. g(x) ≤ 0, where x ∈ R^n, and f : R^n → R and g : R^n → R^m are convex, twice differentiable. The vector-valued function g : R^n → R^m has a derivative A(x) = ∇g(x) = [∂g_i/∂x_j]_{i=1..m, j=1..n} ∈ R^{m×n}, which is called the Jacobian of g. Paris, January 2018 26

NLP Notation (cont'd). The Lagrangian associated with the NLP is: L(x,y) = f(x) + y^T g(x), where y ∈ R^m, y ≥ 0 are Lagrange multipliers (dual variables). The first derivatives of the Lagrangian: ∇_x L(x,y) = ∇f(x) + ∇g(x)^T y, ∇_y L(x,y) = g(x). The Hessian of the Lagrangian, Q(x,y) ∈ R^{n×n}: Q(x,y) = ∇²_{xx} L(x,y) = ∇²f(x) + Σ_{i=1}^m y_i ∇²g_i(x). Paris, January 2018 27

Convexity in NLP. Lemma 2: If f : R^n → R and g : R^n → R^m are convex, twice differentiable, then the Hessian of the Lagrangian Q(x,y) = ∇²f(x) + Σ_{i=1}^m y_i ∇²g_i(x) is positive semidefinite for any x and any y ≥ 0. If f is strictly convex, then Q(x,y) is positive definite for any x and any y ≥ 0. Proof: Using Property 8, the convexity of f implies that ∇²f(x) is positive semidefinite for any x. Similarly, the convexity of g implies that for all i = 1, 2, ..., m, ∇²g_i(x) is positive semidefinite for any x. Since y_i ≥ 0 for all i = 1, 2, ..., m and Q(x,y) is the sum of positive semidefinite matrices, we conclude that Q(x,y) is positive semidefinite. If f is strictly convex, then ∇²f(x) is positive definite and so is Q(x,y). Paris, January 2018 28

IPM for NLP. Add slack variables to the nonlinear inequalities: min f(x) s.t. g(x) + z = 0, z ≥ 0, where z ∈ R^m. Replace the inequality z ≥ 0 with the logarithmic barrier: min f(x) − µ Σ_{i=1}^m ln z_i s.t. g(x) + z = 0. Write out the Lagrangian L(x,y,z,µ) = f(x) + y^T (g(x) + z) − µ Σ_{i=1}^m ln z_i. Paris, January 2018 29

IPM for NLP. For the Lagrangian L(x,y,z,µ) = f(x) + y^T (g(x) + z) − µ Σ_{i=1}^m ln z_i, write the conditions for a stationary point: ∇_x L(x,y,z,µ) = ∇f(x) + ∇g(x)^T y = 0, ∇_y L(x,y,z,µ) = g(x) + z = 0, ∇_z L(x,y,z,µ) = y − µZ^{-1}e = 0, where Z^{-1} = diag{z_1^{-1}, z_2^{-1}, ..., z_m^{-1}}. The First Order Optimality Conditions are: ∇f(x) + ∇g(x)^T y = 0, g(x) + z = 0, YZe = µe. Paris, January 2018 30
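
These three stationarity conditions can be bundled into a single residual function whose zero is sought by Newton's method on the next slide. A minimal sketch (generic Python; f, g and their derivatives are supplied by the caller, and the function name is my own):

```python
import numpy as np

def nlp_foc_residual(grad_f, jac_g, g, x, y, z, mu):
    """Residual of the perturbed first order conditions:
         grad f(x) + A(x)^T y = 0,  g(x) + z = 0,  YZe = mu e."""
    A = jac_g(x)                              # Jacobian of g, shape (m, n)
    r_dual = grad_f(x) + A.T @ y
    r_prim = g(x) + z
    r_comp = y * z - mu * np.ones_like(z)
    return np.concatenate([r_dual, r_prim, r_comp])
```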

Newton Method for the FOC. The first order optimality conditions for the barrier problem form a large system of nonlinear equations F(x,y,z) = 0, where F : R^{n+2m} → R^{n+2m} is the mapping defined as follows: F(x,y,z) = [∇f(x) + ∇g(x)^T y; g(x) + z; YZe − µe]. Note that all three terms of it are nonlinear. (In LP and QP the first two terms were linear.) Paris, January 2018 31

Newton Method for the FOC. Observe that ∇F(x,y,z) = [Q(x,y), A(x)^T, 0; A(x), 0, I; 0, Z, Y], where A(x) is the Jacobian of g and Q(x,y) is the Hessian of L. They are defined as follows: A(x) = ∇g(x) ∈ R^{m×n}, Q(x,y) = ∇²f(x) + Σ_{i=1}^m y_i ∇²g_i(x) ∈ R^{n×n}. Paris, January 2018 32

Newton Method (cont'd). For a given point (x,y,z) we find the Newton direction (Δx, Δy, Δz) by solving the system of linear equations: [Q(x,y), A(x)^T, 0; A(x), 0, I; 0, Z, Y] [Δx; Δy; Δz] = [−∇f(x) − A(x)^T y; −g(x) − z; µe − YZe]. Using the third equation we eliminate Δz = µY^{-1}e − Ze − ZY^{-1}Δy from the second equation and get [Q(x,y), A(x)^T; A(x), −ZY^{-1}] [Δx; Δy] = [−∇f(x) − A(x)^T y; −g(x) − µY^{-1}e]. Paris, January 2018 33
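
The elimination of Δz can be carried out explicitly once the blocks are available. The sketch below (dense algebra, illustrative only; Q, A, the gradient and g(x) are assumed to be NumPy arrays already evaluated at the current point) forms the augmented system and recovers Δz afterwards.

```python
import numpy as np

def nlp_newton_step(Q, A, grad_f, g_val, y, z, mu):
    """Solve the reduced (augmented) Newton system
         [Q        A^T   ] [dx]   [-grad_f - A^T y    ]
         [A    -Z Y^{-1}  ] [dy] = [-g(x) - mu Y^{-1} e]
       and recover dz = mu Y^{-1} e - z - Z Y^{-1} dy."""
    n = Q.shape[0]
    ZYinv = np.diag(z / y)
    K = np.block([[Q, A.T],
                  [A, -ZYinv]])
    rhs = np.concatenate([-grad_f - A.T @ y,
                          -g_val - mu / y])
    sol = np.linalg.solve(K, rhs)
    dx, dy = sol[:n], sol[n:]
    dz = mu / y - z - (z / y) * dy
    return dx, dy, dz
```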

Interior-Point NLP Algorithm. Initialize: k = 0, (x^0, y^0, z^0) such that y^0 > 0 and z^0 > 0, µ_0 = (1/m)(y^0)^T z^0. Repeat until optimality: k = k + 1; µ_k = σµ_{k−1}, where σ ∈ (0,1); compute A(x) and Q(x,y); Δ = Newton direction towards the µ-center; Ratio test: α_1 := max{α > 0 : y + αΔy ≥ 0}, α_2 := max{α > 0 : z + αΔz ≥ 0}; Choose the step (use trust region or line search): α ≤ min{α_1, α_2}; Make step: x^{k+1} = x^k + αΔx, y^{k+1} = y^k + αΔy, z^{k+1} = z^k + αΔz. Paris, January 2018 34

From QP to NLP. Newton direction for QP: [−Q, A^T, I; A, 0, 0; S, 0, X] [Δx; Δy; Δs] = [ξ_d; ξ_p; ξ_µ]. Augmented system for QP: [−Q − SX^{-1}, A^T; A, 0] [Δx; Δy] = [ξ_d − X^{-1}ξ_µ; ξ_p]. Paris, January 2018 35

From QP to NLP. Newton direction for NLP: [Q(x,y), A(x)^T, 0; A(x), 0, I; 0, Z, Y] [Δx; Δy; Δz] = [−∇f(x) − A(x)^T y; −g(x) − z; µe − YZe]. Augmented system for NLP: [Q(x,y), A(x)^T; A(x), −ZY^{-1}] [Δx; Δy] = [−∇f(x) − A(x)^T y; −g(x) − µY^{-1}e]. Conclusion: NLP is a natural extension of QP. Paris, January 2018 36

Linear Algebra in IPM for NLP. Newton direction for NLP: [Q(x,y), A(x)^T, 0; A(x), 0, I; 0, Z, Y] [Δx; Δy; Δz] = [−∇f(x) − A(x)^T y; −g(x) − z; µe − YZe]. The corresponding augmented system: [Q(x,y), A(x)^T; A(x), −ZY^{-1}] [Δx; Δy] = [−∇f(x) − A(x)^T y; −g(x) − µY^{-1}e], where A(x) ∈ R^{m×n} is the Jacobian of g and Q(x,y) ∈ R^{n×n} is the Hessian of L: A(x) = ∇g(x), Q(x,y) = ∇²f(x) + Σ_{i=1}^m y_i ∇²g_i(x). Paris, January 2018 37

Linear Algebra in IPM for NLP (cont'd). Automatic differentiation is very useful... get Q(x,y) and A(x) from an Algebraic Modeling Language. (Diagram of the solution pipeline: Model, AML, Num. Anal. Package, SOLVER, Output, Solution.) Paris, January 2018 38

Automatic Differentiation AD in the Internet: ADIFOR (FORTRAN code for AD): http://www-unix.mcs.anl.gov/autodiff/adifor/ ADOL-C (C/C++ code for AD): http://www-unix.mcs.anl.gov/autodiff/ AD Tools/adolc.anl/adolc.html AD page at Cornell: http://www.tc.cornell.edu/~averma/ad/ Paris, January 2018 39
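
The tools listed above are Fortran/C oriented. As a modern illustration of the same idea (my own addition, not part of the slides), JAX can produce ∇f, the Jacobian A(x), and the Hessian of the Lagrangian Q(x,y) directly from the Python functions defining f and g; the example functions below are hypothetical.

```python
import jax
import jax.numpy as jnp

def f(x):                      # example objective
    return jnp.sum(x ** 2) + jnp.exp(x[0])

def g(x):                      # example constraint function g: R^n -> R^m
    return jnp.array([x[0] + x[1] - 1.0, x[0] ** 2 - x[1]])

def lagrangian(x, y):
    return f(x) + y @ g(x)

x = jnp.array([0.3, 0.7])
y = jnp.array([0.5, 1.2])

grad_f = jax.grad(f)(x)                        # gradient of f
A = jax.jacfwd(g)(x)                           # Jacobian A(x), shape (m, n)
Q = jax.hessian(lagrangian, argnums=0)(x, y)   # Hessian of the Lagrangian w.r.t. x
print(grad_f, A, Q, sep="\n")
```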

IPMs: Remarks Interior Point Methods provide the unified framework for convex optimization. IPMs provide polynomial algorithms for LP, QP and NLP. The linear algebra in LP, QP and NLP is very similar. Use IPMs to solve very large problems. Further Extensions: Nonconvex optimization. IPMs in the Internet: LP FAQ (Frequently Asked Questions): http://www-unix.mcs.anl.gov/otc/guide/faq/ Interior Point Methods On-Line: http://www-unix.mcs.anl.gov/otc/interiorpoint/ NEOS (Network Enabled Optimization Services): http://www-neos.mcs.anl.gov/ Paris, January 2018 40

Newton Method and Self-concordant Barriers Paris, January 2018 41

Another View of Newton M. for Optimization. Newton Method for Optimization. Let f : R^n → R be a twice continuously differentiable function. Suppose we build a quadratic model f̃ of f around a given point x^k, i.e., we define Δx = x − x^k and write: f̃(x) = f(x^k) + ∇f(x^k)^T Δx + (1/2) Δx^T ∇²f(x^k) Δx. Now we optimize the model f̃ instead of optimizing f. A minimum (or, more generally, a stationary point) of the quadratic model satisfies: ∇f̃(x) = ∇f(x^k) + ∇²f(x^k) Δx = 0, i.e. Δx = x − x^k = −(∇²f(x^k))^{-1} ∇f(x^k), which reduces to the usual equation: x^{k+1} = x^k − (∇²f(x^k))^{-1} ∇f(x^k). Paris, January 2018 42
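
Written as code, one Newton iteration is exactly the linear solve derived above. A minimal sketch (my own; it assumes ∇²f(x^k) is invertible and does no globalization such as a line search):

```python
import numpy as np

def newton_minimize(grad, hess, x, tol=1e-10, max_iter=50):
    """Plain Newton iteration x_{k+1} = x_k - (hess f(x_k))^{-1} grad f(x_k)."""
    for _ in range(max_iter):
        gk = grad(x)
        if np.linalg.norm(gk) < tol:
            break
        dx = np.linalg.solve(hess(x), -gk)    # Newton direction
        x = x + dx
    return x

# Example: minimize f(x) = (x1 - 1)^2 + 10 (x2 + 2)^2
grad = lambda x: np.array([2 * (x[0] - 1), 20 * (x[1] + 2)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 20.0]])
print(newton_minimize(grad, hess, np.array([5.0, 5.0])))   # -> [1, -2]
```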

−log x Barrier Function. Consider the primal barrier linear program min c^T x − µ Σ_{j=1}^n ln x_j s.t. Ax = b, where µ ≥ 0 is a barrier parameter. Write out the Lagrangian L(x,y,µ) = c^T x − y^T (Ax − b) − µ Σ_{j=1}^n ln x_j, and the conditions for a stationary point: ∇_x L(x,y,µ) = c − A^T y − µX^{-1}e = 0, ∇_y L(x,y,µ) = Ax − b = 0, where X^{-1} = diag{x_1^{-1}, x_2^{-1}, ..., x_n^{-1}}. Paris, January 2018 43

−log x Barrier Function (cont'd). Let us denote s = µX^{-1}e, i.e. XSe = µe. The First Order Optimality Conditions are: Ax = b, A^T y + s = c, XSe = µe. Paris, January 2018 44

−log x bf: Newton Method. The first order optimality conditions for the barrier problem form a large system of nonlinear equations F(x,y,s) = 0, where F : R^{2n+m} → R^{2n+m} is the mapping defined as follows: F(x,y,s) = [Ax − b; A^T y + s − c; XSe − µe]. Actually, the first two terms of it are linear; only the last one, corresponding to the complementarity condition, is nonlinear. Note that ∇F(x,y,s) = [A, 0, 0; 0, A^T, I; S, 0, X]. Paris, January 2018 45

−log x bf: Newton Method (cont'd). Thus, for a given point (x,y,s) we find the Newton direction (Δx, Δy, Δs) by solving the system of linear equations: [A, 0, 0; 0, A^T, I; S, 0, X] [Δx; Δy; Δs] = [b − Ax; c − A^T y − s; µe − XSe]. Paris, January 2018 46

1/x^α, α > 0 Barrier Function. Consider the primal barrier linear program min c^T x + µ Σ_{j=1}^n 1/x_j^α s.t. Ax = b, where µ ≥ 0 is a barrier parameter and α > 0. Write out the Lagrangian L(x,y,µ) = c^T x − y^T (Ax − b) + µ Σ_{j=1}^n 1/x_j^α, and the conditions for a stationary point: ∇_x L(x,y,µ) = c − A^T y − µα X^{−α−1}e = 0, ∇_y L(x,y,µ) = Ax − b = 0, where X^{−α−1} = diag{x_1^{−α−1}, x_2^{−α−1}, ..., x_n^{−α−1}}. Paris, January 2018 47

1/x^α, α > 0 Barrier Function (cont'd). Let us denote s = µα X^{−α−1}e, i.e. X^{α+1}Se = µαe. The First Order Optimality Conditions are: Ax = b, A^T y + s = c, X^{α+1}Se = µαe. Paris, January 2018 48

1/x^α, α > 0 bf: Newton Method. The first order optimality conditions for the barrier problem are F(x,y,s) = 0, where F : R^{2n+m} → R^{2n+m} is the mapping defined as follows: F(x,y,s) = [Ax − b; A^T y + s − c; X^{α+1}Se − µαe]. As before, only the last term, corresponding to the complementarity condition, is nonlinear. Note that ∇F(x,y,s) = [A, 0, 0; 0, A^T, I; (α+1)X^α S, 0, X^{α+1}]. Paris, January 2018 49

1/x^α, α > 0 bf: Newton Method (cont'd). Thus, for a given point (x,y,s) we find the Newton direction (Δx, Δy, Δs) by solving the system of linear equations: [A, 0, 0; 0, A^T, I; (α+1)X^α S, 0, X^{α+1}] [Δx; Δy; Δs] = [b − Ax; c − A^T y − s; µαe − X^{α+1}Se]. Paris, January 2018 50

e^{1/x} Barrier Function. Consider the primal barrier linear program min c^T x + µ Σ_{j=1}^n e^{1/x_j} s.t. Ax = b, where µ ≥ 0 is a barrier parameter. Write out the Lagrangian L(x,y,µ) = c^T x − y^T (Ax − b) + µ Σ_{j=1}^n e^{1/x_j}, and the conditions for a stationary point: ∇_x L(x,y,µ) = c − A^T y − µX^{−2} exp(X^{−1})e = 0, ∇_y L(x,y,µ) = Ax − b = 0, where exp(X^{−1}) = diag{e^{1/x_1}, e^{1/x_2}, ..., e^{1/x_n}}. Paris, January 2018 51

e^{1/x} Barrier Function (cont'd). Let us denote s = µX^{−2} exp(X^{−1})e, i.e. X² exp(−X^{−1})Se = µe. The First Order Optimality Conditions are: Ax = b, A^T y + s = c, X² exp(−X^{−1})Se = µe. Paris, January 2018 52

e^{1/x} bf: Newton Method. The first order optimality conditions are F(x,y,s) = 0, where F : R^{2n+m} → R^{2n+m} is defined as follows: F(x,y,s) = [Ax − b; A^T y + s − c; X² exp(−X^{−1})Se − µe]. As before, only the last term, corresponding to the complementarity condition, is nonlinear. Note that ∇F(x,y,s) = [A, 0, 0; 0, A^T, I; (2X+I) exp(−X^{−1})S, 0, X² exp(−X^{−1})]. Paris, January 2018 53

e^{1/x} bf: Newton Method (cont'd). The Newton direction (Δx, Δy, Δs) solves the following system of linear equations: [A, 0, 0; 0, A^T, I; (2X+I) exp(−X^{−1})S, 0, X² exp(−X^{−1})] [Δx; Δy; Δs] = [b − Ax; c − A^T y − s; µe − X² exp(−X^{−1})Se]. Paris, January 2018 54

Why Log Barrier is the Best? The First Order Optimality Conditions: −log x: XSe = µe; 1/x^α: X^{α+1}Se = µαe; e^{1/x}: X² exp(−X^{−1})Se = µe. The log barrier ensures the symmetry between the primal and the dual. 3rd row in the Newton Equation System: −log x: ∇F_3 = [S, 0, X]; 1/x^α: ∇F_3 = [(α+1)X^α S, 0, X^{α+1}]; e^{1/x}: ∇F_3 = [(2X+I) exp(−X^{−1})S, 0, X² exp(−X^{−1})]. The log barrier produces the weakest nonlinearity. Paris, January 2018 55

Self-concordant Functions. There is a nice property of the function that is responsible for a good behaviour of the Newton method. Def. Let C ⊆ R^n be an open nonempty convex set. Let f : C → R be a three times continuously differentiable convex function. A function f is called self-concordant if there exists a constant p > 0 such that |∇³f(x)[h,h,h]| ≤ 2p^{−1/2} (∇²f(x)[h,h])^{3/2}, for all x ∈ C and all h such that x + h ∈ C. (We then say that f is p-self-concordant.) Note that a self-concordant function is always well approximated by the quadratic model because the error of such an approximation can be bounded by the 3/2 power of ∇²f(x)[h,h]. Paris, January 2018 56

Self-concordant Barriers. Lemma. The barrier function −log x is self-concordant on R_+. Proof. Consider f(x) = −log x. Compute f'(x) = −x^{−1}, f''(x) = x^{−2} and f'''(x) = −2x^{−3}, and check that the self-concordance condition is satisfied for p = 1. Lemma. The barrier function 1/x^α, with α ∈ (0, ∞), is not self-concordant on R_+. Lemma. The barrier function e^{1/x} is not self-concordant on R_+. Use self-concordant barriers in optimization. Paris, January 2018 57
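
The proof for −log x can be echoed numerically: the ratio |f'''(x)| / f''(x)^{3/2} equals 2 for the log barrier at every x > 0 (so p = 1), while for 1/x (the case α = 1) and for e^{1/x} the same ratio grows without bound as x → ∞, so no constant p can work. A small check (my own sketch; the second and third derivatives of the three barrier functions are hand-computed):

```python
import numpy as np

def ratio(f2, f3, x):
    """|f'''(x)| / f''(x)^{3/2}; bounded by 2/sqrt(p) for a p-self-concordant f."""
    return abs(f3(x)) / f2(x) ** 1.5

log_barrier = (lambda x: x ** -2,      lambda x: -2 * x ** -3)        # f(x) = -log x
inv_barrier = (lambda x: 2 * x ** -3,  lambda x: -6 * x ** -4)        # f(x) = 1/x (alpha = 1)
exp_barrier = (lambda x: np.exp(1 / x) * (2 * x ** -3 + x ** -4),     # f(x) = e^{1/x}
               lambda x: -np.exp(1 / x) * (6 * x ** -4 + 6 * x ** -5 + x ** -6))

for x in [1.0, 10.0, 100.0, 1000.0]:
    print(x,
          ratio(*log_barrier, x),    # constant 2 for all x > 0
          ratio(*inv_barrier, x),    # grows like sqrt(x) as x -> infinity
          ratio(*exp_barrier, x))    # also grows like sqrt(x) for large x
```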