Continuous Optimisation, Chpt 6: Solution methods for Constrained Optimisation


Continuous Optimisation, Chpt 6: Solution methods for Constrained Optimisation
Peter J.C. Dickinson, DMMP, University of Twente
p.j.c.dickinson@utwente.nl
http://dickinson.website/teaching/2017co.html
version: 06/11/17, Monday 6th November 2017

Problem
$$\min_x \; f(x) \quad \text{s.t.} \quad g_j(x) \le 0 \text{ for all } j = 1,\dots,m, \quad x \in \mathbb{R}^n. \tag{C}$$
Here $f, g_1, \dots, g_m \in C^1$, $f, g_1, \dots, g_m : \mathbb{R}^n \to \mathbb{R}$, and
$$\mathcal{F} := \{x \in \mathbb{R}^n : g_j(x) \le 0 \text{ for all } j = 1,\dots,m\}.$$
We will not make any convexity assumptions.
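To make the later algorithmic sketches concrete, an instance of (C) can be encoded as plain Python callables. The instance below (a smooth quadratic objective with two linear inequality constraints) is a hypothetical example chosen for illustration, not one taken from the lecture.

```python
import numpy as np

# Hypothetical instance of (C): minimise f(x) = (x1 - 2)^2 + (x2 - 1)^2
# subject to g_1(x) = x1 + x2 - 2 <= 0 and g_2(x) = -x1 <= 0.
def f(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)])

# Constraint functions g_j : R^n -> R and their gradients.
gs = [lambda x: x[0] + x[1] - 2.0, lambda x: -x[0]]
grad_gs = [lambda x: np.array([1.0, 1.0]), lambda x: np.array([-1.0, 0.0])]

def is_feasible(x, tol=1e-9):
    """Membership test for the feasible set F = {x : g_j(x) <= 0 for all j}."""
    return all(g(x) <= tol for g in gs)
```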

Table of Contents
1 Introduction
2 Feasible descent method
   Basic idea
   Naive choice of direction
   Alternative choice of direction
3 Unconstrained optimisation
4 Penalty method
5 Barrier method

Basic idea
1 Start at a point $x_0 \in \mathcal{F}$ ($k = 0$).
2 If $x_k$ is a John point then STOP.
3 If it is not a John point then there is a strictly feasible descent direction $d_k$.
4 Line search: find $\lambda_k = \arg\min_\lambda \{f(x_k + \lambda d_k) : \lambda \in \mathbb{R},\; x_k + \lambda d_k \in \mathcal{F}\}$ (or just require $f(x_k + \lambda_k d_k) < f(x_k)$ and $x_k + \lambda_k d_k \in \mathcal{F}$).
5 Let $x_{k+1} = x_k + \lambda_k d_k \in \mathcal{F}$ and $k \leftarrow k + 1$.
6 If the stopping criteria are satisfied then STOP, else go to step 2.
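A minimal sketch of this loop, assuming the two subproblems are supplied as callables: find_direction(x) returns a direction together with the optimal value of a direction-finding subproblem (the next slides give two concrete choices), and line_search(x, d) returns a feasible step length. Both names are placeholders, and the John-point test is replaced by a numerical tolerance.

```python
import numpy as np

def feasible_descent(x0, find_direction, line_search, tol=1e-8, max_iter=100):
    """Generic feasible descent loop.

    find_direction(x) -> (d, z): direction d and subproblem value z;
        z >= -tol signals that x is (numerically) a John point.
    line_search(x, d) -> lam: step length with f(x + lam*d) < f(x)
        and x + lam*d feasible.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d, z = find_direction(x)
        if z >= -tol:              # no strictly feasible descent direction found
            break
        lam = line_search(x, d)
        x = x + lam * d
    return x
```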

Choosing $d_k$: Naive method
If there is a strictly feasible descent direction, then the following problem will provide one:
$$\min_{d,z} \; z \quad \text{s.t.} \quad \nabla f(x_k)^T d \le z, \quad \nabla g_i(x_k)^T d \le z \text{ for all } i \text{ such that } g_i(x_k) = 0, \quad -1 \le d_j \le 1 \text{ for all } j = 1,\dots,n.$$
Remark 6.1
(+) This is a relatively simple method for choosing $d_k$.
(-) It ignores constraints with $g_i(x_k) < 0$ but $g_i(x_k) \approx 0$. This can lead to bad convergence, and possibly even to converging to points which are not John points.

Choosing $d_k$: Topkis and Veinott method
Ex. 6.1 Consider the following optimisation problem:
$$\min_{d,z} \; z \quad \text{s.t.} \quad \nabla f(x_k)^T d \le z, \quad \nabla g_i(x_k)^T d \le z - g_i(x_k) \text{ for all } i = 1,\dots,m, \quad -1 \le d_j \le 1 \text{ for all } j = 1,\dots,n.$$
1 Prove that if $(d^*, z^*)$ is an optimal solution to the problem above with $z^* < 0$, then $d^*$ is a strictly feasible descent direction in (C).
2 Prove that if there is a strictly feasible descent direction in (C), then the optimal value of the problem above is strictly negative.
(+) Relatively simple method for choosing $d_k$.
(+) All constraints are taken into account.
(+) If there is an $\bar x \in \mathcal{F}$ such that a subsequence of the iterates tends towards $\bar x$, then $\bar x$ is a John point. [FKS, Th.12.5]
(-) This gives a first-order method (only gradients are taken into account), and such methods generally have slow convergence.
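The Topkis-Veinott subproblem is a linear programme in the variables (d, z), so it can be handed to any LP solver. A sketch using scipy.optimize.linprog, with grad_f, gs and grad_gs callables as in the earlier encoding of (C); the last decision variable plays the role of z.

```python
import numpy as np
from scipy.optimize import linprog

def topkis_veinott_direction(x, grad_f, gs, grad_gs):
    """Solve min_{d,z} z  s.t.  grad f(x)^T d <= z,
    grad g_i(x)^T d <= z - g_i(x) for all i,  -1 <= d_j <= 1.
    Returns (d, z); z < 0 indicates a strictly feasible descent direction."""
    n, m = len(x), len(gs)
    c = np.zeros(n + 1)
    c[-1] = 1.0                                    # objective: minimise z
    A_ub = np.zeros((m + 1, n + 1))
    b_ub = np.zeros(m + 1)
    A_ub[0, :n], A_ub[0, -1] = grad_f(x), -1.0     # grad f(x)^T d - z <= 0
    for i, (g, dg) in enumerate(zip(gs, grad_gs), start=1):
        A_ub[i, :n], A_ub[i, -1] = dg(x), -1.0     # grad g_i(x)^T d - z <= -g_i(x)
        b_ub[i] = -g(x)
    bounds = [(-1.0, 1.0)] * n + [(None, None)]    # d in [-1, 1]^n, z free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1]
```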

Table of Contents
1 Introduction
2 Feasible descent method
3 Unconstrained optimisation
   Newton's method
   Interpretations
4 Penalty method
5 Barrier method

Newton's method
To minimise $f : C \to \mathbb{R}$, $f \in C^2$, we do the following:
1 Start at a point $x_0 \in C$ ($k = 0$).
2 If $\nabla f(x_k) = 0$ then STOP.
3 Assuming $\nabla^2 f(x_k) \succ O$, let $h_k = -(\nabla^2 f(x_k))^{-1} \nabla f(x_k)$.
4 Let $x_{k+1} = x_k + h_k$ and $k \leftarrow k + 1$.
5 If the stopping criteria are satisfied then STOP, else go to step 2.
Remark 6.2
We could penalise moving too far away from $x_k$ by exchanging $f(x)$ for $f_{k,\mu}(x) = f(x) + \mu \|x - x_k\|_2^2$, with parameter $\mu > 0$. Then
$$\nabla f_{k,\mu}(x) = \nabla f(x) + 2\mu(x - x_k), \qquad \nabla f_{k,\mu}(x_k) = \nabla f(x_k),$$
$$\nabla^2 f_{k,\mu}(x) = \nabla^2 f(x) + 2\mu I, \qquad \nabla^2 f_{k,\mu}(x_k) = \nabla^2 f(x_k) + 2\mu I.$$
For $\mu$ high enough we then have $\nabla^2 f_{k,\mu}(x) \succ O$.
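A direct transcription of these steps, assuming callables grad_f and hess_f for $\nabla f$ and $\nabla^2 f$; the optional parameter mu applies the Hessian shift of Remark 6.2, and no line search or positive-definiteness check is included.

```python
import numpy as np

def newton(x0, grad_f, hess_f, tol=1e-10, max_iter=50, mu=0.0):
    """Newton's method for unconstrained minimisation.
    With mu > 0 the Hessian is shifted by 2*mu*I, as in Remark 6.2."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= tol:       # step 2: gradient (numerically) zero
            break
        H = hess_f(x) + 2.0 * mu * np.eye(n)
        h = -np.linalg.solve(H, g)         # step 3: Newton step
        x = x + h                          # step 4: update the iterate
    return x
```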

Interpretations
Interpretation 1
We want to find $h$ in order to minimise $f(x_k + h)$. We have
$$f(x_k + h) \approx f(x_k) + \nabla f(x_k)^T h + \tfrac{1}{2} h^T (\nabla^2 f(x_k)) h.$$
Assuming $\nabla^2 f(x_k) \succ O$ and considering the RHS of the above as a function of $h$, it is minimised at $h = -(\nabla^2 f(x_k))^{-1} \nabla f(x_k)$.
Interpretation 2
We want to find $h$ such that $\nabla f(x_k + h) = 0$. We have
$$\nabla f(x_k + h) \approx \nabla f(x_k) + \nabla^2 f(x_k) h.$$
Assuming $\nabla^2 f(x_k)$ is nonsingular, the RHS of the above is equal to $0$ if and only if $h = -(\nabla^2 f(x_k))^{-1} \nabla f(x_k)$.

Table of Contents
1 Introduction
2 Feasible descent method
3 Unconstrained optimisation
4 Penalty method
   Basic idea
   Basic results
   (Dis)advantages
   Choices for p
   Example
   Implementation
5 Barrier method

Basic idea
Definition 6.3
$p : \mathbb{R}^n \to \mathbb{R}$ is a penalty function with respect to $\mathcal{F}$ if $p \in C^0$, $p(x) = 0$ for all $x \in \mathcal{F}$, and $p(x) > 0$ for all $x \in \mathbb{R}^n \setminus \mathcal{F}$.
Penalty method
In the penalty method we solve the following unconstrained optimisation problem (for a suitable parameter $r > 0$ and penalty function $p$):
$$\min_x \{f(x) + r\, p(x)\}.$$
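Given any penalty function p, this is a single call to an off-the-shelf unconstrained solver. A minimal sketch using scipy.optimize.minimize; the solver (the default BFGS) and the starting point are arbitrary choices, not prescribed by the lecture.

```python
import numpy as np
from scipy.optimize import minimize

def solve_penalty_problem(f, p, r, x0):
    """Minimise f(x) + r * p(x) for a fixed penalty parameter r > 0."""
    res = minimize(lambda x: f(x) + r * p(x), np.asarray(x0, dtype=float))
    return res.x
```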

Basic results
Lemma 6.4
For $r > 0$ we have
$$\min_x \{f(x) + r\,p(x)\} \;\le\; \min_x \{f(x) + r\,p(x) : x \in \mathcal{F}\} \;=\; \mathrm{val}(C).$$
If $\mathcal{F} \cap \arg\min_x \{f(x) + r\,p(x)\} \neq \emptyset$ then we have equality above.
Theorem 6.5
Suppose we have $\{r_k : k \in \mathbb{N}\} \subseteq \mathbb{R}_{++}$ with $\lim_{k\to\infty} r_k = \infty$ and $x_k \in \arg\min_x \{f(x) + r_k\,p(x)\}$ for all $k \in \mathbb{N}$, such that $\bar x = \lim_{k\to\infty} x_k$ for some $\bar x \in \mathbb{R}^n$. Then $\bar x \in \mathcal{F}$ and $\bar x$ is a global minimiser of (C).

(Dis)advantages
(+) This is an unconstrained problem, and thus we can use our methods from unconstrained optimisation.
(+) The optimal value of the penalty problem for a given $r > 0$ gives a lower bound on the optimal value of (C).
(+) If for some $r > 0$ we have an optimal solution to the penalty problem in $\mathcal{F}$, then this is also an optimal solution to the original problem. (Under some conditions we can guarantee this happens for $r$ large enough.)
(+) If $\bar x$ is a limit point of a subsequence of optimal solutions $x_r$ as $r \to \infty$, then $\bar x \in \mathcal{F}$ and $\bar x$ is a global minimiser of (C).
(-) In general we will get optimal solutions $x_r \notin \mathcal{F}$.

Choice of p
Letting $g_j^+(x) = \max\{0, g_j(x)\}$, two common choices are:
$$p(x) = \sum_{j=1}^m g_j^+(x), \qquad \text{and} \qquad p(x) = \sum_{j=1}^m \big(g_j^+(x)\big)^2.$$
Ex. 6.2 Show that:
1 if $g$ is convex then $g^+$ is also convex;
2 if $g$ is convex then $(g^+(x))^2$ is also convex.
If $g \in C^1$ then $(g^+(x))^2$ also has a continuous derivative. In general $g^+ \notin C^1$.
If LICQ is satisfied at a local minimiser $\bar x \in \mathcal{F}$ of (C), and $\bar y \in \mathbb{R}^m_+$ are the KKT multipliers, then for $p(x) = \sum_{j=1}^m g_j^+(x)$ and $r > \max\{\bar y_j : j \in J_{\bar x}\}$, we have that $\bar x$ is a local minimiser of the penalty problem. [FKS, Th.12.10]
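Both choices translate directly into code; gs is assumed to be a list of the constraint callables $g_j$ as in the earlier sketches.

```python
def penalty_l1(x, gs):
    """p(x) = sum_j max(0, g_j(x)); in general not continuously differentiable."""
    return sum(max(0.0, g(x)) for g in gs)

def penalty_quadratic(x, gs):
    """p(x) = sum_j max(0, g_j(x))^2; C^1 whenever every g_j is C^1."""
    return sum(max(0.0, g(x)) ** 2 for g in gs)
```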

Example (GeoGebra: https://ggbm.at/szsqwcpu)
Ex. 6.3 Consider the problems
$$\min_x \{x : x \ge 1\}, \qquad \min_x \{x^3 : x \ge 1\}.$$
For each of these problems:
1 What is the global minimiser, denoted $x^*$, and the optimal value of this problem?
2 For $p(x) = \sum_{j=1}^m g_j^+(x)$ and $p(x) = \sum_{j=1}^m (g_j^+(x))^2$:
   1 For $r > 0$, is the derivative of $f_r(x) := f(x) + r\,p(x)$ with respect to $x$ continuous or not?
   2 Find all the stationary points of $f_r(x)$, as a function of $r$.
   3 Find the optimal value and solution of $\min_x \{f_r(x) : x \in \mathbb{R}\}$ as a function of $r > 0$.
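A quick numerical look at the first of these problems (no substitute for the analytic exercise): writing the constraint $x \ge 1$ as $g(x) = 1 - x \le 0$, the script below minimises $f_r$ for both penalty choices and a few values of $r$, so the behaviour of the two penalties can be compared.

```python
from scipy.optimize import minimize_scalar

f = lambda x: x
g = lambda x: 1.0 - x          # the constraint x >= 1 written as g(x) <= 0

for r in (2.0, 20.0, 200.0):
    x_l1 = minimize_scalar(lambda x: f(x) + r * max(0.0, g(x)),
                           bounds=(-5.0, 5.0), method="bounded").x
    x_sq = minimize_scalar(lambda x: f(x) + r * max(0.0, g(x)) ** 2,
                           bounds=(-5.0, 5.0), method="bounded").x
    print(r, x_l1, x_sq)       # compare each penalty minimiser with the constrained minimiser
```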

Implementation
One implementation would be to solve the penalty problem once for a very large value of $r$. Alternatively, we could note that we are only interested in the limit as $r \to \infty$, and not in the solutions to the penalty problem for any fixed $r > 0$. We could thus use something like Newton's method to attempt to find a solution to the penalty problem, and increase $r$ in each iteration.
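A sketch of one variant of the second strategy: (approximately) solve the penalty problem for the current r, warm-start the next round at the result, and increase r between rounds. The generic solver, the geometric schedule and the round count are illustrative choices, not prescribed by the lecture.

```python
import numpy as np
from scipy.optimize import minimize

def penalty_method(f, p, x0, r0=1.0, growth=10.0, rounds=6):
    """Solve a sequence of penalty problems min_x f(x) + r * p(x) with
    increasing r, warm-starting each round at the previous solution."""
    x, r = np.asarray(x0, dtype=float), r0
    for _ in range(rounds):
        x = minimize(lambda y: f(y) + r * p(y), x).x
        r *= growth
    return x
```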

Table of Contents
1 Introduction
2 Feasible descent method
3 Unconstrained optimisation
4 Penalty method
5 Barrier method
   Basic idea
   Basic results
   (Dis)advantages
   Frisch's barrier function
   Implementation

Basic idea
We will let $\mathcal{F}^\circ = \{x \in \mathbb{R}^n : g_i(x) < 0 \text{ for all } i\}$ and assume $\mathcal{F} = \operatorname{cl} \mathcal{F}^\circ$.
Lemma 6.6
$$\inf_x \{f(x) : x \in \mathcal{F}^\circ\} = \inf_x \{f(x) : x \in \mathcal{F}\}.$$
Definition 6.7
$b : \mathcal{F}^\circ \to \mathbb{R}$, $b \in C^0$, is a barrier function for (C) if for all $\bar x \in \operatorname{bd} \mathcal{F}$ we have $\lim_{x \to \bar x} b(x) = \infty$.
Barrier method
In the barrier method we solve the following (effectively unconstrained) optimisation problem (for a suitable parameter $\rho > 0$ and a suitable barrier function $b$):
$$\min_x \{f(x) + \rho\, b(x) : x \in \mathcal{F}^\circ\}.$$
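In code this is again a single call to an unconstrained solver, provided the solver never accepts a point outside $\mathcal{F}^\circ$. A minimal sketch that maps non-finite barrier values to $+\infty$ and uses a derivative-free method; b is any barrier callable (for instance Frisch's barrier from the next subsection) and x0 must be strictly feasible.

```python
import numpy as np
from scipy.optimize import minimize

def solve_barrier_problem(f, b, x0, rho):
    """Minimise f(x) + rho * b(x) over the strictly feasible set,
    starting from a strictly feasible point x0."""
    def phi(x):
        val = b(x)
        # Treat undefined barrier values (outside the strictly feasible set) as +inf.
        return np.inf if not np.isfinite(val) else f(x) + rho * val
    return minimize(phi, np.asarray(x0, dtype=float), method="Nelder-Mead").x
```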

Basic results
Lemma 6.8
We have $\mathcal{F}^\circ \subseteq \mathcal{F}$, and thus for every $\hat x \in \mathcal{F}^\circ$ we get an upper bound of $f(\hat x)$ on the optimal value of (C).
Theorem 6.9
Suppose we have $\{\rho_k : k \in \mathbb{N}\} \subseteq \mathbb{R}_{++}$ with $\lim_{k\to\infty} \rho_k = 0$ and $x_k \in \arg\min_x \{f(x) + \rho_k b(x) : x \in \mathcal{F}^\circ\}$ for all $k \in \mathbb{N}$, such that $\bar x = \lim_{k\to\infty} x_k$ for some $\bar x \in \mathbb{R}^n$. Then $\bar x \in \mathcal{F}$ and $\bar x$ is a global minimiser of (C).

(Dis)advantages
(+) This is an unconstrained problem, and thus we can use our methods from unconstrained optimisation.
(+) $\mathcal{F}^\circ \subseteq \mathcal{F}$, and thus all feasible points for this problem are feasible for (C).
(+) If $\bar x$ is a limit point of a subsequence of optimal solutions $x_\rho$ as $\rho \to 0^+$, then $\bar x \in \mathcal{F}$ and $\bar x$ is a global minimiser of (C).

Frisch's barrier function
Frisch's barrier function is
$$b(x) = -\sum_{i=1}^m \ln(-g_i(x)).$$
Ex. 6.4 Consider $g \in C^2$ and $b : \{x \in \mathbb{R}^n : g(x) < 0\} \to \mathbb{R}$, $b(x) = -\ln(-g(x))$. Find $\nabla^2 b(x)$ and, using this, show that if $g$ is a convex function then so is $b$.

Parameterised KKT conditions
Theorem 6.10
For Frisch's barrier function we have
$$\nabla\big(f(x) + \rho\, b(x)\big) = \nabla f(x) + \sum_{i=1}^m \frac{\rho}{-g_i(x)} \nabla g_i(x).$$
We have that $x$ is a stationary point of the barrier problem if and only if its gradient is zero, or equivalently there exists $\lambda \in \mathbb{R}^m$ such that:
$$x \in \mathbb{R}^n, \quad \lambda \in \mathbb{R}^m_+, \quad 0 = \nabla f(x) + \sum_{i=1}^m \lambda_i \nabla g_i(x), \quad g_i(x) < 0 \text{ and } \lambda_i g_i(x) = -\rho \text{ for all } i = 1,\dots,m.$$
This system is known as the parameterised KKT conditions.
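The last condition, $\lambda_i g_i(x) = -\rho$, suggests reading off multiplier estimates directly from a (near-)minimiser of the barrier problem. A small helper, assuming gs is the list of constraint callables and x is strictly feasible:

```python
import numpy as np

def multiplier_estimates(x, gs, rho):
    """Estimate the KKT multipliers from a barrier minimiser via
    lambda_i = -rho / g_i(x), i.e. the perturbed complementarity
    condition lambda_i * g_i(x) = -rho."""
    return np.array([-rho / g(x) for g in gs])
```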

Parameterised KKT conditions continued
Theorem 6.11
Suppose we have $\{\rho_k : k \in \mathbb{N}\} \subseteq \mathbb{R}_{++}$ with $\lim_{k\to\infty} \rho_k = 0$ and $(x_k, \lambda_k)$ are solutions to the parameterised KKT conditions (with $\rho = \rho_k$). Then $x_k \in \mathcal{F}$ and $\lambda_k \in \mathbb{R}^m_+$, implying that $\psi(\lambda_k) \le \mathrm{val}(C) \le f(x_k)$ for all $k \in \mathbb{N}$. If $(\bar x, \bar\lambda) = \lim_{k\to\infty} (x_k, \lambda_k)$ for some $(\bar x, \bar\lambda) \in \mathbb{R}^n \times \mathbb{R}^m$, then $\bar x$ is a KKT point for (C) with multipliers $\bar\lambda$.
Recall that if (C) is convex, this implies that $(\bar x, \bar\lambda)$ is a saddle point of the Lagrangian function, and thus $\bar x$ is an optimal solution to the primal problem, whilst $\bar\lambda$ is an optimal solution to the dual problem.

Example (GeoGebra: https://ggbm.at/szsqwcpu)
Ex. 6.5 Consider the problems
$$\min_x \{x : x \ge 1\}, \qquad \min_x \{x : (x - 1)\exp(x^2) \ge 0\}.$$
For each of these problems:
1 What is the global minimiser, denoted $x^*$, and the optimal value of this problem?
2 For Frisch's barrier function, determine the optimal value of the barrier problem as a function of $\rho$.

Implementation
One implementation would be to solve the barrier problem once for $\rho > 0$ very small. Alternatively, we could note that we are only interested in the limit as $\rho \to 0^+$, and not in the solutions to the barrier problem for any fixed $\rho > 0$. We could thus use something like Newton's method to attempt to find a solution to the barrier problem, and decrease $\rho$ in each iteration.
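A sketch of that second strategy with Frisch's barrier: shrink $\rho$ each round and warm-start at the previous solution. The derivative-free inner solver, the geometric schedule and the round count are illustrative choices; a practical implementation would use Newton's method for the inner minimisation, as suggested above.

```python
import numpy as np
from scipy.optimize import minimize

def barrier_method(f, gs, x0, rho0=1.0, shrink=0.2, rounds=8):
    """Sequential barrier method with Frisch's barrier b(x) = -sum_i ln(-g_i(x)),
    decreasing rho geometrically and warm-starting each round at the last iterate."""
    def phi(x, rho):
        vals = np.array([g(x) for g in gs])
        if np.any(vals >= 0.0):
            return np.inf                      # outside the strictly feasible set
        return f(x) - rho * np.sum(np.log(-vals))

    x, rho = np.asarray(x0, dtype=float), rho0
    for _ in range(rounds):
        x = minimize(lambda y: phi(y, rho), x, method="Nelder-Mead").x
        rho *= shrink
    return x
```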