CSCI 1951-G Optimization Methods in Finance Part 09: Interior Point Methods

CSCI 1951-G Optimization Methods in Finance Part 09: Interior Point Methods March 23, 2018 1 / 35

This material is covered in S. Boyd and L. Vandenberghe's book Convex Optimization, https://web.stanford.edu/~boyd/cvxbook/. Some of the material and the figures are taken from it. 2 / 35

Context
Two weeks ago: unconstrained problems, solved with descent methods.
Last week: linearly constrained problems, solved with Newton's method.
This week: inequality constrained problems, solved with interior point methods.
3 / 35

Inequality constrained minimization problems
$\min f_0(x)$ s.t. $f_i(x) \le 0$, $i = 1, \dots, m$; $Ax = b$
$f_0, \dots, f_m$: convex and twice continuously differentiable; $A \in \mathbb{R}^{p \times n}$ with $\mathrm{rank}(A) = p < n$.
Assume:
an optimal solution $x^*$ exists, with objective value $p^*$;
the problem is strictly feasible (i.e., the feasible region has interior points), so Slater's condition holds: there exist $\lambda^*$ and $\nu^*$ that, together with $x^*$, satisfy the KKT conditions.
4 / 35

Hierarchy of algorithms
Transforming a constrained problem into an unconstrained one: always possible, but has drawbacks.
Solving the constrained problem directly: leverages the problem structure.
What is the easiest class of constrained problems to solve? Quadratic Problems with Linear equality Constraints (LCQP): they only require solving...a system of linear equations.
How did we solve generic problems with linear equality constraints? With Newton's method, which solves a sequence of...LCQPs!
We will solve inequality constrained problems with interior point methods, which solve a sequence of linearly constrained problems!
5 / 35

Problem Transformation
Goal: approximate the Inequality Constrained Problem (ICP) with an Equality Constrained Problem (ECP) solvable with Newton's method.
We start by transforming the ICP into an equivalent ECP.
From: $\min f_0(x)$ s.t. $f_i(x) \le 0$, $i = 1, \dots, m$; $Ax = b$
To: $\min g(x)$ s.t. $Ax = b$
where $g(x) = f_0(x) + \sum_{i=1}^m I_-(f_i(x))$ and $I_-(u) = 0$ for $u \le 0$, $I_-(u) = \infty$ for $u > 0$.
So we just use Newton's method and we are done. The End. Nope.
6 / 35

Logarithmic barrier
$\min f_0(x) + \sum_{i=1}^m I_-(f_i(x))$ s.t. $Ax = b$
The objective function is in general not differentiable: we can't use Newton's method.
We want to approximate $I_-(u)$ with a differentiable function:
$\hat{I}_-(u) = -\frac{1}{t}\log(-u)$
with domain $-\mathbb{R}_{++}$ (i.e., $u < 0$), and where $t > 0$ is a parameter.
7 / 35

Logarithmic barrier
The reformulated problem above has no inequality constraints, but its objective function is not (in general) differentiable, so Newton's method cannot be applied.
$\hat{I}_-(u)$ is a convex and differentiable function.
[Figure 11.1: the dashed lines show the function $I_-(u)$, and the solid curves show $\hat{I}_-(u) = -(1/t)\log(-u)$ for $t = 0.5, 1, 2$. The curve for $t = 2$ gives the best approximation.]
8 / 35
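To make the approximation concrete, here is a minimal numerical sketch (mine, not from the slides; assumes numpy) that evaluates $\hat{I}_-(u) = -(1/t)\log(-u)$ for the three values of $t$ shown in Figure 11.1.

```python
import numpy as np

def I_hat(u, t):
    """Differentiable approximation -(1/t)*log(-u) of the indicator I_(u); defined for u < 0 only."""
    return -np.log(-np.asarray(u, dtype=float)) / t

u = np.linspace(-3.0, -0.01, 5)           # sample points in the domain u < 0
for t in (0.5, 1.0, 2.0):                 # the three curves shown in Figure 11.1
    print(f"t={t}:", np.round(I_hat(u, t), 3))
# As t grows, I_hat(u, t) -> 0 for every fixed u < 0, i.e., it approaches the ideal indicator.
```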

Logarithmic barrier
$\min f_0(x) - \frac{1}{t}\sum_{i=1}^m \log(-f_i(x))$ s.t. $Ax = b$
The objective function is convex and differentiable: we can use Newton's method.
$\phi(x) = -\sum_{i=1}^m \log(-f_i(x))$ is called the logarithmic barrier for the problem.
9 / 35

Example: Inequality form linear programming
$\min c^T x$ s.t. $Ax \le b$
The logarithmic barrier for this problem is
$\phi(x) = -\sum_{i=1}^m \log(b_i - a_i^T x)$
where the $a_i^T$ are the rows of $A$.
10 / 35

How to choose t?
$\min f_0(x) + \frac{1}{t}\phi(x)$ s.t. $Ax = b$
is an approximation of the original problem. How does the quality of the approximation change with $t$?
As $t$ grows, $\frac{1}{t}\phi(x)$ tends to $\sum_{i=1}^m I_-(f_i(x))$, so the approximation quality increases.
So let's just use a large $t$? Nope.
11 / 35

Why not use a large t immediately?
What's the intuition behind Newton's method? Replace the objective function with its 2nd-order Taylor approximation at $x$:
$f(x + v) \approx f(x) + \nabla f(x)^T v + \frac{1}{2} v^T \nabla^2 f(x) v$
When does this approximation (and Newton's method) work well? When the Hessian changes slowly.
Is that the case for the barrier function?
12 / 35

Back to the example
$\min c^T x$ s.t. $Ax \le b$
$\phi(x) = -\sum_{i=1}^m \log(b_i - a_i^T x)$
$\nabla^2 \phi(x) = \sum_{i=1}^m \frac{1}{(b_i - a_i^T x)^2} a_i a_i^T$
The Hessian changes fast as $x$ gets close to the boundary of the feasible region.
13 / 35
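A small sketch (mine, not from the slides; assumes numpy and a toy box instance) of the LP barrier and its derivatives; it prints the largest Hessian eigenvalue as $x$ approaches the boundary, illustrating how quickly the curvature blows up.

```python
import numpy as np

def lp_barrier(A, b, x):
    """Value, gradient, and Hessian of phi(x) = -sum_i log(b_i - a_i^T x)."""
    s = b - A @ x                        # slacks; phi is defined only when all s_i > 0
    val = -np.sum(np.log(s))
    grad = A.T @ (1.0 / s)
    hess = A.T @ np.diag(1.0 / s**2) @ A
    return val, grad, hess

# Unit box -1 <= x_i <= 1 written as A x <= b (an illustrative toy instance)
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
for d in (0.5, 0.1, 0.01, 0.001):        # distance to the boundary along x_1
    x = np.array([1.0 - d, 0.0])
    _, _, H = lp_barrier(A, b, x)
    print(f"dist {d:>6}: max Hessian eigenvalue {np.linalg.eigvalsh(H).max():.1e}")
```

The largest eigenvalue grows roughly like $1/d^2$, which is the "Hessian changes fast near the boundary" behavior named above.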

Why not use a large t immediately?
The Hessian of the function $f_0 + \frac{1}{t}\phi$ varies rapidly near the boundary of the feasible set. This makes directly using a large $t$ inefficient.
Instead, we will solve a sequence of problems of the form
$\min f_0(x) + \frac{1}{t}\phi(x)$ s.t. $Ax = b$
for increasing values of $t$. We start each Newton minimization at the solution of the problem for the previous value of $t$.
14 / 35

The central path
Slight rewrite: $\min t f_0(x) + \phi(x)$ s.t. $Ax = b$
Assume it has a unique solution $x^*(t)$ for each $t > 0$.
Central path: $\{x^*(t) : t > 0\}$ (made of central points)
15 / 35

The central path
Necessary and sufficient conditions for $x^*(t)$:
Strict feasibility: $Ax^*(t) = b$, $f_i(x^*(t)) < 0$, $i = 1, \dots, m$
Zero of the Lagrangian (centrality condition): there exists $\hat\nu$ such that
$0 = t \nabla f_0(x^*(t)) + \nabla\phi(x^*(t)) + A^T \hat\nu = t \nabla f_0(x^*(t)) + \sum_{i=1}^m \frac{1}{-f_i(x^*(t))} \nabla f_i(x^*(t)) + A^T \hat\nu$
16 / 35

Back to the example
$\min c^T x$ s.t. $Ax \le b$, with $\phi(x) = -\sum_{i=1}^m \log(b_i - a_i^T x)$
Centrality condition (there are no equality constraints here, so the $A^T\hat\nu$ term drops):
$0 = t \nabla f_0(x^*(t)) + \nabla\phi(x^*(t)) = t c + \sum_{i=1}^m \frac{1}{b_i - a_i^T x} a_i$
17 / 35

Back to the example
$0 = t c + \sum_{i=1}^m \frac{1}{b_i - a_i^T x} a_i$
From this condition we see that $x^*(t)$ minimizes the Lagrangian (details on the next slides).
[Figure 11.2: central path for an LP with $n = 2$ and $m = 6$. The dashed curves show three contour lines of the logarithmic barrier function $\phi$. The central path converges to the optimal point $x^*$ as $t \to \infty$. Also shown is the point on the central path with $t = 10$. The optimality condition above can be verified geometrically at this point: the line $c^T x = c^T x^*(10)$ is tangent to the contour line of $\phi$ through $x^*(10)$.]
18 / 35

Dual point from the central path
Every central point $x^*(t)$ yields a dual feasible point $(\lambda^*(t), \nu^*(t))$, thus a...lower bound on the optimal objective value $p^*$:
$\lambda_i^*(t) = -\frac{1}{t f_i(x^*(t))}$, $i = 1, \dots, m$, and $\nu^*(t) = \frac{\hat\nu}{t}$
The proof gives us a lot of information.
19 / 35

Proof
$\lambda_i^*(t) > 0$ because $f_i(x^*(t)) < 0$.
Rewrite the centrality condition (and divide by $t$):
$0 = t \nabla f_0(x^*(t)) + \sum_{i=1}^m \frac{1}{-f_i(x^*(t))} \nabla f_i(x^*(t)) + A^T \hat\nu$
$0 = \nabla f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t) \nabla f_i(x^*(t)) + A^T \nu^*(t)$
The above equals $\nabla_x L(x^*(t), \lambda^*(t), \nu^*(t)) = 0$, i.e., $x^*(t)$...minimizes the Lagrangian at $(\lambda^*(t), \nu^*(t))$.
20 / 35

Proof
Let's look at the dual function:
$g(\lambda^*(t), \nu^*(t)) = f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t) f_i(x^*(t)) + \nu^*(t)^T (A x^*(t) - b)$
It holds that $g(\lambda^*(t), \nu^*(t)) = f_0(x^*(t)) - m/t$.
So $f_0(x^*(t)) - p^* \le m/t$, i.e., $x^*(t)$ is no more than $m/t$-suboptimal!
$x^*(t)$ converges to $x^*$ as $t \to \infty$.
21 / 35
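A minimal sketch (my own naming, not from the slides; assumes numpy) that turns a central point into the dual point and the certified gap, directly following the formulas above.

```python
import numpy as np

def dual_point_from_central(t, f_vals, nu_hat):
    """Given a central point x*(t) with constraint values f_vals = [f_1(x*(t)), ..., f_m(x*(t))]
    (all strictly negative) and the multiplier nu_hat from the centrality condition,
    return the dual feasible point (lambda*(t), nu*(t)) and the certified duality gap m/t."""
    f_vals = np.asarray(f_vals, dtype=float)
    lam = -1.0 / (t * f_vals)            # lambda_i*(t) = -1 / (t f_i(x*(t))) > 0
    nu = np.asarray(nu_hat, dtype=float) / t
    gap = len(f_vals) / t                # f_0(x*(t)) - g(lambda*(t), nu*(t)) = m/t
    return lam, nu, gap
```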

The barrier method
To get an $\varepsilon$-approximation we could just set $t = m/\varepsilon$ and solve
$\min \frac{m}{\varepsilon} f_0(x) + \phi(x)$ s.t. $Ax = b$
This method does not scale well with the size of the problem and with $\varepsilon$.
Barrier method: compute $x^*(t)$ for an increasing sequence of values of $t$, until $t \ge m/\varepsilon$.
22 / 35

The barrier method
input: strictly feasible $x = x^{(0)}$, $t = t^{(0)} > 0$, $\mu > 1$, $\varepsilon > 0$
repeat:
1. Centering step: compute $x^*(t)$ by minimizing $t f_0 + \phi$ subject to $Ax = b$, starting at $x$
2. Update: $x \leftarrow x^*(t)$
3. Stopping criterion: quit if $m/t < \varepsilon$
4. Increase $t$: $t \leftarrow \mu t$
What can we ask about this algorithm?
23 / 35
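The following is a minimal sketch of this loop for the inequality-form LP running example ($\min c^T x$ s.t. $Ax \le b$, no equality constraints), so the centering step is plain damped Newton. It assumes numpy; the function names and the tiny box instance are mine, not from the course.

```python
import numpy as np

def centering_step(c, A, b, x, t, tol=1e-8, max_iter=50):
    """Minimize t*c^T x + phi(x), phi(x) = -sum log(b - A x), by damped Newton steps.
    Assumes x is strictly feasible (A x < b componentwise)."""
    for _ in range(max_iter):
        s = b - A @ x                            # slacks, must stay > 0
        grad = t * c + A.T @ (1.0 / s)           # gradient of t*f0 + phi (f0 is linear)
        hess = A.T @ np.diag(1.0 / s**2) @ A     # Hessian of phi
        v = np.linalg.solve(hess, -grad)         # Newton direction
        if -grad @ v / 2 < tol:                  # Newton decrement stopping rule
            break
        step = 1.0
        while np.min(b - A @ (x + step * v)) <= 0:   # shrink until strictly feasible
            step *= 0.5
        f_cur = t * (c @ x) - np.sum(np.log(s))
        while t * (c @ (x + step * v)) - np.sum(np.log(b - A @ (x + step * v))) \
                > f_cur + 0.25 * step * (grad @ v):  # backtracking (Armijo) line search
            step *= 0.5
        x = x + step * v
    return x

def barrier_method_lp(c, A, b, x0, t0=1.0, mu=10.0, eps=1e-6):
    """Barrier method for min c^T x s.t. A x <= b, started from a strictly feasible x0."""
    x, t, m = x0, t0, len(b)
    while True:
        x = centering_step(c, A, b, x, t)        # 1. centering step, warm-started at x
        if m / t < eps:                          # 3. stop: certified duality gap m/t < eps
            return x
        t *= mu                                  # 4. increase t

# Tiny usage example: minimize x_1 + x_2 over the unit box -1 <= x_i <= 1.
A = np.vstack([np.eye(2), -np.eye(2)]); b = np.ones(4)
c = np.array([1.0, 1.0]); x0 = np.zeros(2)
print(barrier_method_lp(c, A, b, x0))            # approaches (-1, -1)
```

The warm start in the outer loop is exactly the point of the method: each centering problem is started at the previous center, so Newton needs only a few steps per outer iteration.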

The barrier method
What can we ask about this algorithm?
1. How many iterations does it take to converge?
2. Do we need to optimally solve the centering step?
3. What is a good value for $\mu$?
4. How to choose $t^{(0)}$?
24 / 35

Convergence
The algorithm stops when $m/t < \varepsilon$; $t$ starts at $t^{(0)}$ and is increased to $\mu t$ at each iteration.
How to compute the number of iterations needed? We must find the smallest $i$ such that
$\frac{m}{\varepsilon} < t^{(0)} \mu^i$
It holds:
$i = \left\lceil \frac{\log\left(m / (\varepsilon t^{(0)})\right)}{\log \mu} \right\rceil$
Is there anything important that this analysis does not tell us? It does not tell us whether, as $t$ grows, the centering step becomes more difficult. (It does not.)
25 / 35
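As a quick check of the formula, a tiny sketch (the function name is mine):

```python
import math

def outer_iterations(m, eps, t0, mu):
    """Smallest i with t0 * mu**i > m/eps, i.e., the number of outer (centering) iterations."""
    return max(0, math.ceil(math.log(m / (eps * t0)) / math.log(mu)))

print(outer_iterations(m=100, eps=1e-6, t0=2.0, mu=10.0))  # -> 8, since 2 * 10**8 > 1e8 = m/eps
```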

[Figure 11.8: average number of Newton steps required to solve 100 randomly generated LPs of different dimensions, with $n = 2m$. Error bars show the standard deviation around the average value, for each value of $m$. The growth in the number of Newton steps required, as the problem dimensions range over a 100:1 ratio, is very small.]
26 / 35

The barrier method
What can we ask about this algorithm?
1. How many iterations does it take to converge?
2. Do we need to optimally solve the centering step?
3. What is a good value for $\mu$?
4. How to choose $t^{(0)}$?
27 / 35

Solving the centering step optimally?
Computing $x^*(t)$ exactly is not necessary: the central path has no significance in itself, it just leads to a solution of the original problem.
Inexact centering will still lead to the solution, but the points $(\lambda^*(t), \nu^*(t))$ may not be dual feasible. This issue can be corrected (homework).
Additionally, getting an extremely accurate minimizer of $t f_0 + \phi$ only takes a few more Newton iterations than a good minimizer, so why not just go for it?
28 / 35

The barrier method
What can we ask about this algorithm?
1. How many iterations does it take to converge?
2. Do we need to optimally solve the centering step?
3. What is a good value for $\mu$?
4. How to choose $t^{(0)}$?
29 / 35

Choosing µ
The choice of $\mu$ involves a trade-off between the number of outer iterations of the barrier method and the number of inner iterations of Newton's method.
For small $\mu$, $t$ grows...slowly: successive centers $x^*(t)$, $x^*(\mu t)$ are close, so the initial point for Newton's method is very good and it converges to the next $x^*(t)$ in few inner iterations; but more outer iterations are needed.
For larger $\mu$, the opposite holds.
The two effects roughly cancel out: the total number of inner iterations stays nearly constant for sufficiently large $\mu$.
30 / 35

[Figure 11.4: progress of the barrier method for a small LP, showing duality gap versus cumulative number of Newton steps. Three plots are shown, corresponding to three values of the parameter $\mu$: 2, 50, and 150. In each case, we have approximately linear convergence of the duality gap. The stopping criterion for Newton's method is $\lambda(x)^2/2 \le 10^{-5}$, where $\lambda(x)$ is the Newton decrement.]
31 / 35

[Figure 11.5: trade-off in the choice of the parameter $\mu$, for a small LP. The vertical axis shows the total number of Newton steps required to reduce the duality gap from 100 to $10^{-3}$, and the horizontal axis shows $\mu$. The plot shows that the barrier method works well for values of $\mu$ larger than around 3, but is otherwise not sensitive to the value of $\mu$.]
32 / 35

The barrier method
What can we ask about this algorithm?
1. How many iterations does it take to converge?
2. Do we need to optimally solve the centering step?
3. What is a good value for $\mu$?
4. How to choose $t^{(0)}$?
33 / 35

How to choose t^(0)
A very large initial $t$ incurs more inner iterations at the first outer iteration.
A very small initial $t$ incurs more outer iterations.
$m/t^{(0)}$ is the duality gap after the first centering step. We want to choose $t^{(0)}$ so that $m/t^{(0)} \approx \mu\,(f_0(x^{(0)}) - p^*)$.
If we have feasible dual points $(\lambda, \nu)$, with duality gap $\eta = f_0(x^{(0)}) - g(\lambda, \nu)$, then we can take $t^{(0)} = m/\eta$. Thus after the first outer iteration we get (roughly) the same duality gap as the initial primal and dual points.
34 / 35
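A two-line sketch of this rule (names are mine, not from the slides):

```python
def initial_t(m, f0_x0, g_dual):
    """t^(0) = m / eta, where eta = f0(x^(0)) - g(lambda, nu) is the initial duality gap."""
    return m / (f0_x0 - g_dual)

print(initial_t(m=4, f0_x0=1.0, g_dual=-1.0))  # eta = 2, so t^(0) = 2.0
```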

Recap
Inequality constrained problems
Up and down a hierarchy of algorithms
The central path
Getting the dual points and the optimality certificate
The barrier method
Convergence, parameters, and other details
35 / 35