Penalty and Barrier Methods
(Optimization in Engineering Design, Georgia Institute of Technology, Systems Realization Laboratory)

The general classical constrained minimization problem is

$$\text{minimize } f(x) \quad \text{subject to} \quad g(x) \le 0, \quad h(x) = 0.$$

Penalty methods are motivated by the desire to use unconstrained optimization techniques to solve constrained problems. This is achieved either by adding a penalty for infeasibility, which forces the iterates toward feasibility and then to the optimum, or by adding a barrier that ensures a feasible iterate never becomes infeasible.

Penalty Methods

Penalty methods use a mathematical function that increases the objective for any constraint violation. The general transformation of the constrained problem into an unconstrained one is

$$\min\; T(x) = f(x) + r_k P(x)$$

where $f(x)$ is the objective function of the constrained problem, $r_k$ is a scalar known as the penalty or controlling parameter, $P(x)$ is a function that imposes penalties for infeasibility (note that the weight of $P(x)$ is controlled by $r_k$), and $T(x)$ is the transformed (pseudo-)objective.

Two approaches exist for choosing the transformation: 1) sequential penalty transformations, and 2) exact penalty transformations.

Sequential Penalty Transformations

Sequential penalty transformations are the oldest penalty methods. They are also known as Sequential Unconstrained Minimization Techniques (SUMT), based on the work of Fiacco and McCormick (1968). Consider the following frequently used general-purpose penalty function:

$$T(x) = f(x) + r_k \left\{ \sum_{i=1}^{m} \big(\max[0,\, g_i(x)]\big)^2 + \sum_{j=1}^{l} \big[h_j(x)\big]^2 \right\}$$

Sequential transformations work as follows. Choose $P(x)$ and a sequence $r_k$ such that, as $k \to \infty$, the solution $x^*$ of the original problem is found. For example, for $k = 1$ set $r_1 = 1$ and solve the problem; for the second iteration $r_k$ is increased by a factor of 10 and the problem is re-solved starting from the previous solution. Note that an increasing value of $r_k$ increases the effect of the penalties on $T(x)$. The process is terminated when no improvement in $T(x)$ is found and all constraints are satisfied.
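As a concrete illustration of this loop, here is a minimal SUMT-style sketch in Python (assuming SciPy is available; the test problem, the factor-of-10 update, and the stopping tolerance are illustrative choices, not part of the original slides):

```python
import numpy as np
from scipy.optimize import minimize

def sumt(f, ineq, eq, x0, r0=1.0, factor=10.0, tol=1e-5, max_outer=15):
    """Sequential penalty transformation: T(x) = f(x) + r_k * P(x),
    with P(x) = sum(max(0, g_i(x))^2) + sum(h_j(x)^2)."""
    def T(x, r):
        P = sum(max(0.0, g(x)) ** 2 for g in ineq) + sum(h(x) ** 2 for h in eq)
        return f(x) + r * P

    x, r = np.asarray(x0, dtype=float), r0
    for _ in range(max_outer):
        x = minimize(T, x, args=(r,)).x          # unconstrained subproblem
        viol = max([max(0.0, g(x)) for g in ineq] + [abs(h(x)) for h in eq],
                   default=0.0)
        if viol < tol:                            # feasible enough: stop
            break
        r *= factor                               # amplify the penalty
    return x

# Illustrative problem: min x1 + x2  s.t.  x1^2 + x2^2 - 2 <= 0  (solution near (-1, -1))
sol = sumt(lambda x: x[0] + x[1],
           ineq=[lambda x: x[0] ** 2 + x[1] ** 2 - 2.0], eq=[],
           x0=[3.0, 3.0])
print(sol)
```

Each outer iteration re-solves the unconstrained subproblem from the previous solution, exactly as described above.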

Two Classes of Sequential Methods

Two major classes of sequential methods exist:
1) The first class uses a sequence of infeasible points, and feasibility is obtained only at the optimum. These are referred to as penalty function or exterior-point penalty function methods.
2) The second class is characterized by the property of preserving feasibility at all times. These are referred to as barrier function methods or interior-point penalty function methods.

The general barrier function transformation is

$$T(x) = f(x) + r_k B(x)$$

where $B(x)$ is a barrier function and $r_k$ is the penalty parameter, which is driven to zero as $k$ approaches infinity. Typical barrier functions are the inverse or logarithmic barriers, i.e. (for constraints $g_i(x) \le 0$):

$$B(x) = -\sum_{i=1}^{m} \frac{1}{g_i(x)} \qquad \text{or} \qquad B(x) = -\sum_{i=1}^{m} \ln[-g_i(x)]$$

What to Choose?

Some prefer barrier methods because even if they do not converge, you still have a feasible solution. Others prefer penalty function methods because you are less likely to get stuck in a feasible pocket containing only a local minimum, and because penalty methods are more robust: in practice you often have an infeasible starting point. However, penalty functions typically require more function evaluations. The choice becomes simple if you have equality constraints. (Why?)

Exact Penalty Methods

In theory, an infinite sequence of penalty problems must be solved in the sequential penalty approach, because the effect of the penalty has to be amplified ever more strongly by the penalty parameter as the iterations proceed. Exact transformation methods avoid this long sequence by constructing penalty functions that are exact, in the sense that the solution of the penalty problem yields the exact solution of the original problem for a finite value of the penalty parameter. However, it can be shown that such exact penalty functions are not differentiable. Exact methods are newer and less well established than sequential methods.
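A standard example of such a function (stated here for concreteness; it is not spelled out on these slides) is the $\ell_1$ penalty

$$\phi_1(x;\sigma) = f(x) + \sigma\left(\sum_{j} \lvert h_j(x)\rvert + \sum_{i} \max[0,\, g_i(x)]\right),$$

which, under standard regularity assumptions, is exact once $\sigma$ exceeds the largest Lagrange multiplier magnitude at the solution; the absolute values and the max are precisely what make it non-differentiable.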

Closing Remarks

Typically, you will encounter sequential approaches. Various penalty functions $P(x)$ exist in the literature. Various approaches to selecting the penalty parameter sequence also exist; the simplest is to keep it constant during all iterations. Always ensure that the penalty does not dominate the objective function during the initial iterations of an exterior-point method.

Numerical Optimization
Unit 9: Penalty Method and Interior Point Method
Unit 10: Filter Method and the Maratos Effect
Che-Rung Lee
Scribe: May 1, 2011

Penalty method

The idea is to add penalty terms to the objective function, which turns a constrained optimization problem into an unconstrained one.

Quadratic penalty function. Example (for equality constraints):

$$\min\; x_1 + x_2 \quad \text{subject to} \quad x_1^2 + x_2^2 - 2 = 0 \qquad (x^* = (-1, -1))$$

Define $Q(x, \mu) = x_1 + x_2 + \frac{\mu}{2}(x_1^2 + x_2^2 - 2)^2$.

For $\mu = 1$,

$$\nabla Q(x, 1) = \begin{pmatrix} 1 + 2(x_1^2 + x_2^2 - 2)x_1 \\ 1 + 2(x_1^2 + x_2^2 - 2)x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\Rightarrow\; \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \approx \begin{pmatrix} -1.1 \\ -1.1 \end{pmatrix}$$

For $\mu = 10$,

$$\nabla Q(x, 10) = \begin{pmatrix} 1 + 20(x_1^2 + x_2^2 - 2)x_1 \\ 1 + 20(x_1^2 + x_2^2 - 2)x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\Rightarrow\; \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \approx \begin{pmatrix} -1.01 \\ -1.01 \end{pmatrix}$$

which is closer to the true solution.
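A quick numerical check of this example (a sketch assuming SciPy; the digits printed depend on the optimizer's tolerances):

```python
import numpy as np
from scipy.optimize import minimize

def Q(x, mu):
    # Q(x, mu) = x1 + x2 + (mu/2) * (x1^2 + x2^2 - 2)^2
    return x[0] + x[1] + 0.5 * mu * (x[0] ** 2 + x[1] ** 2 - 2.0) ** 2

for mu in (1.0, 10.0, 100.0):
    xk = minimize(Q, x0=[-1.5, -1.5], args=(mu,)).x
    print(mu, xk)   # the minimizers approach the true solution (-1, -1) as mu grows
```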

Size of μ

It seems that the larger $\mu$ is, the better the solution. But when $\mu$ is large, the Hessian $\nabla^2 Q$ is ill-conditioned: for

$$Q(x, \mu) = f(x) + \frac{\mu}{2}\big(c(x)\big)^2,$$

$$\nabla Q = \nabla f + \mu\, c\, \nabla c, \qquad \nabla^2 Q = \nabla^2 f + \mu\, \nabla c\, \nabla c^T + \mu\, c\, \nabla^2 c,$$

and the term $\mu\, \nabla c\, \nabla c^T$ dominates for large $\mu$.

However, $\mu$ cannot be too small either.

Example: $\min_x\; -5x_1^2 + x_2^2$ s.t. $x_1 = 1$, with $Q(x, \mu) = -5x_1^2 + x_2^2 + \frac{\mu}{2}(x_1 - 1)^2$. For $\mu < 10$, the problem $\min_x Q(x, \mu)$ is unbounded below.
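A small numerical illustration of the ill-conditioning (a sketch; for this example the Hessian happens to be constant, so its condition number can be read off directly):

```python
import numpy as np

# Hessian of Q(x, mu) = -5*x1^2 + x2^2 + (mu/2)*(x1 - 1)^2 is diag(mu - 10, 2),
# so min Q is bounded only for mu > 10 and becomes ill-conditioned as mu grows.
for mu in (20.0, 1e3, 1e6):
    H = np.diag([mu - 10.0, 2.0])
    print(mu, np.linalg.cond(H))   # condition number ~ mu/2
```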

Quadratic penalty function

Pick a proper initial guess of $\mu$ and gradually increase it.

Algorithm: Quadratic penalty method
1. Given $\mu_0 > 0$ and $x_0$.
2. For $k = 0, 1, 2, \ldots$
   1. Solve $\min_x Q(x; \mu_k) = f(x) + \frac{\mu_k}{2} \sum_{i \in E} c_i^2(x)$.
   2. If converged, stop.
   3. Increase $\mu_{k+1} > \mu_k$ and find a new $x_{k+1}$.

Problem: the solution is not exact for any finite $\mu$.

Augmented Lagrangian method

Use the Lagrangian function to remedy the inexactness problem. Let

$$L_A(x, \rho, \mu) = f(x) - \sum_{i \in E} \rho_i\, c_i(x) + \frac{\mu}{2} \sum_{i \in E} c_i^2(x).$$

Then

$$\nabla L_A = \nabla f(x) - \sum_{i \in E} \underbrace{\big(\rho_i - \mu\, c_i(x)\big)}_{\lambda_i}\, \nabla c_i(x).$$

By Lagrangian theory, at the solution of the subproblem $c_i(x) = -\frac{1}{\mu}(\lambda_i - \rho_i)$. If we can make $\rho_i$ approximate $\lambda_i^*$, then $\mu_k$ need not be increased indefinitely. This suggests the multiplier update

$$\lambda_i^{k+1} = \lambda_i^k - \mu_k\, c_i(x_k).$$

Algorithm: update the multiplier estimates $\rho_i$ at each iteration.
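A minimal sketch of this scheme for equality constraints (assuming SciPy; the fixed μ, the iteration counts, and the toy problem reused from the quadratic-penalty example are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(f, cons, x0, mu=10.0, outer=10, tol=1e-8):
    """L_A(x) = f(x) - sum_i lam_i*c_i(x) + (mu/2)*sum_i c_i(x)^2,
    with the first-order multiplier update lam_i <- lam_i - mu*c_i(x)."""
    lam = np.zeros(len(cons))
    x = np.asarray(x0, dtype=float)
    for _ in range(outer):
        def LA(v):
            c = np.array([ci(v) for ci in cons])
            return f(v) - lam @ c + 0.5 * mu * c @ c
        x = minimize(LA, x).x                      # inner unconstrained solve
        c = np.array([ci(x) for ci in cons])
        lam = lam - mu * c                         # multiplier update
        if np.linalg.norm(c) < tol:
            break
    return x, lam

# min x1 + x2  s.t.  x1^2 + x2^2 - 2 = 0; expect x ~ (-1, -1), lam ~ -0.5 from this start
x_s, lam_s = augmented_lagrangian(lambda v: v[0] + v[1],
                                  [lambda v: v[0] ** 2 + v[1] ** 2 - 2.0],
                                  x0=[-0.5, -0.5])
print(x_s, lam_s)
```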

Inequality constraints

There are two approaches to handling inequality constraints:
1. Make the objective function nonsmooth (non-differentiable at some points).
2. Add slack variables to turn the inequality constraints into equality constraints:
   $c_i(x) - s_i = 0$, $s_i \ge 0$.

But then we have bound constraints on the slack variables. We will focus on the second approach here.

Inequality constraints

Suppose the augmented Lagrangian method is used and all inequality constraints have been converted to bound constraints. For fixed $\mu$ and $\lambda$, solve

$$\min_x\; L_A(x, \lambda, \mu) = f(x) - \sum_{i=1}^{m} \lambda_i\, c_i(x) + \frac{\mu}{2} \sum_{i=1}^{m} c_i^2(x) \quad \text{s.t.} \quad l \le x \le u.$$

The first-order necessary condition for $x$ to be a solution of this problem is

$$x = P\big(x - \nabla_x L_A(x, \lambda, \mu),\, l,\, u\big),$$

where the projection is defined componentwise, for $i = 1, 2, \ldots, n$, by

$$P(g, l, u)_i = \begin{cases} l_i, & \text{if } g_i \le l_i; \\ g_i, & \text{if } g_i \in (l_i, u_i); \\ u_i, & \text{if } g_i \ge u_i. \end{cases}$$
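The projection $P$ is just a componentwise clamp onto the box. A small NumPy sketch of it, together with the corresponding first-order stationarity check (the one-dimensional test problem is hypothetical):

```python
import numpy as np

def project(g, l, u):
    """Componentwise projection onto the box [l, u]."""
    return np.minimum(np.maximum(g, l), u)

def is_stationary(x, grad, l, u, tol=1e-8):
    """First-order check x = P(x - grad, l, u) for a bound-constrained problem."""
    return np.linalg.norm(x - project(x - grad, l, u)) <= tol

# Hypothetical example: min (x - 2)^2 on [0, 1] has its solution at the bound x = 1.
x = np.array([1.0])
grad = 2.0 * (x - 2.0)                     # gradient of (x - 2)^2
print(is_stationary(x, grad, np.array([0.0]), np.array([1.0])))   # True
```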

Nonlinear gradient projection method

Sequential quadratic programming + trust region method to solve $\min_x f(x)$ s.t. $l \le x \le u$.

Algorithm: Nonlinear gradient projection method
1. At each iteration, build a quadratic model
   $$q(x) = \frac{1}{2}(x - x_k)^T B_k (x - x_k) + \nabla f_k^T (x - x_k)$$
   where $B_k$ is an SPD approximation of $\nabla^2 f(x_k)$.
2. For some trust-region radius $\Delta_k$, use the gradient projection method to solve
   $$\min_x\; q(x) \quad \text{s.t.} \quad \max(l,\, x_k - \Delta_k) \le x \le \min(u,\, x_k + \Delta_k).$$
3. Update $\Delta_k$ and repeat steps 1-3 until convergence.

Interior point method

Consider the problem

$$\min_x\; f(x) \quad \text{s.t.} \quad C_E(x) = 0, \quad C_I(x) - s = 0, \quad s \ge 0,$$

where $s$ are slack variables. The interior point method starts from a point inside the feasible region and builds walls on the boundary of the feasible region: a barrier function goes to infinity as its argument approaches zero (e.g. $-\log x \to \infty$ as $x \to 0^+$). With barrier parameter $\mu$, the barrier problem is

$$\min_{x, s}\; f(x) - \mu \sum_{i=1}^{m} \log(s_i) \quad \text{s.t.} \quad C_E(x) = 0, \quad C_I(x) - s = 0. \tag{1}$$

An example

Example: $\min\; -x + 1$ s.t. $x \le 1$. The barrier subproblem is $\min_x\; -x + 1 - \mu \ln(1 - x)$, whose minimizer approaches the constrained solution $x = 1$ as $\mu \to 0$:

μ = 1: x ≈ 0.00005
μ = 0.1: x ≈ 0.89999
μ = 0.01: x ≈ 0.989999
μ = 10⁻⁵: x ≈ 0.99993

The Lagrangian of (1) is

$$L(x, s, y, z) = f(x) - \mu \sum_{i=1}^{m} \log(s_i) - y^T C_E(x) - z^T \big(C_I(x) - s\big)$$

1. The vector $y$ contains the Lagrange multipliers of the equality constraints.
2. The vector $z$ contains the Lagrange multipliers of the inequality constraints.
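A quick check of this example (a sketch assuming SciPy and the reconstruction above, i.e. objective $-x + 1$ and constraint $x \le 1$; analytically the barrier minimizer is $x(\mu) = 1 - \mu$):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Barrier subproblem for: min -x + 1  s.t.  x <= 1
for mu in (1.0, 0.1, 0.01, 1e-5):
    phi = lambda x, mu=mu: -x + 1.0 - mu * np.log(1.0 - x)
    res = minimize_scalar(phi, bounds=(-5.0, 1.0 - 1e-12), method="bounded")
    print(mu, res.x)   # roughly x(mu) = 1 - mu, approaching the solution x = 1
```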

The KKT conditions

The KKT conditions for (1) are

$$\begin{aligned}
\nabla_x L = 0:&\quad \nabla f - A_E^T y - A_I^T z = 0 \\
\nabla_s L = 0:&\quad S Z - \mu I = 0 \\
\nabla_y L = 0:&\quad C_E(x) = 0 \\
\nabla_z L = 0:&\quad C_I(x) - s = 0
\end{aligned} \tag{2}$$

Here $S = \mathrm{diag}(s)$ and $Z = \mathrm{diag}(z)$; $A_E$ is the Jacobian of $C_E$ and $A_I$ is the Jacobian of $C_I$.

Newton's step

Let

$$F = \begin{pmatrix} \nabla_x L \\ \nabla_s L \\ \nabla_y L \\ \nabla_z L \end{pmatrix} = \begin{pmatrix} \nabla f - A_E^T y - A_I^T z \\ S Z e - \mu e \\ C_E(x) \\ C_I(x) - s \end{pmatrix}.$$

The interior point method uses Newton's method to solve $F = 0$:

$$\nabla F = \begin{pmatrix} \nabla_{xx} L & 0 & -A_E^T(x) & -A_I^T(x) \\ 0 & Z & 0 & S \\ A_E(x) & 0 & 0 & 0 \\ A_I(x) & -I & 0 & 0 \end{pmatrix}$$

Newton's step solves

$$\nabla F \begin{pmatrix} p_x \\ p_s \\ p_y \\ p_z \end{pmatrix} = -F,$$

and the iterates are updated by $x_{k+1} = x_k + \alpha_x p_x$, $s_{k+1} = s_k + \alpha_s p_s$, $y_{k+1} = y_k + \alpha_y p_y$, $z_{k+1} = z_k + \alpha_z p_z$.

Algorithm: Interior point method (IPM)

1. Given initial $x_0, s_0, y_0, z_0$, and $\mu_0$.
2. For $k = 0, 1, 2, \ldots$ until convergence:
   (a) Compute $p_x, p_s, p_y, p_z$ and $\alpha_x, \alpha_s, \alpha_y, \alpha_z$.
   (b) $(x_{k+1}, s_{k+1}, y_{k+1}, z_{k+1}) = (x_k, s_k, y_k, z_k) + (\alpha_x p_x, \alpha_s p_s, \alpha_y p_y, \alpha_z p_z)$.
   (c) Adjust $\mu_{k+1} < \mu_k$.

Some comments on IPM

Some comments on the interior point method:

1. The complementarity slackness condition says $s_i z_i = 0$ at the optimal solution. Therefore the parameter $\mu$ in $SZ = \mu I$ needs to decrease to zero as the current iterate approaches the optimal solution.
2. Why can we not set $\mu$ to zero (or very small) at the beginning? Because that would push $x_k$ to the nearest constraint boundary, and the entire process would then move along the boundary constraint by constraint, which again becomes an exponential algorithm.
3. To keep $s$ and $z$ from getting too close to the boundary, IPM also limits their step sizes:
   $$\alpha_s^{\max} = \max\{\alpha \in (0, 1] : s + \alpha p_s \ge (1 - \tau)s\}$$
   $$\alpha_z^{\max} = \max\{\alpha \in (0, 1] : z + \alpha p_z \ge (1 - \tau)z\}$$
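A small sketch of this fraction-to-boundary rule (NumPy); it returns the largest $\alpha \in (0, 1]$ with $v + \alpha p \ge (1 - \tau)v$ componentwise, assuming $v > 0$:

```python
import numpy as np

def max_step(v, p, tau=0.995):
    """Largest alpha in (0, 1] such that v + alpha*p >= (1 - tau)*v (v > 0)."""
    neg = p < 0
    if not np.any(neg):
        return 1.0
    return min(1.0, float(np.min(-tau * v[neg] / p[neg])))

s  = np.array([1.0, 0.5])
ps = np.array([-2.0, 0.3])
print(max_step(s, ps))   # 0.4975: the first component limits the step
```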

Interior point method for linear programming

We will use linear programming to illustrate the details of IPM.

The primal: $\min_x\; c^T x$ s.t. $Ax = b$, $x \ge 0$.
The dual: $\max_\lambda\; b^T \lambda$ s.t. $A^T \lambda + s = c$, $s \ge 0$.

KKT conditions:

$$A^T \lambda + s = c, \qquad Ax = b, \qquad x_i s_i = 0 \;\; (i = 1, \ldots, n), \qquad x \ge 0, \; s \ge 0.$$

Solving the problem

Let $X = \mathrm{diag}(x_1, x_2, \ldots, x_n)$, $S = \mathrm{diag}(s_1, s_2, \ldots, s_n)$, and

$$F = \begin{pmatrix} A^T \lambda + s - c \\ A x - b \\ X S e - \mu e \end{pmatrix}.$$

The problem is to solve $F = 0$.

Newton's method

Using Newton's method, solve

$$\nabla F \begin{pmatrix} p_x \\ p_\lambda \\ p_s \end{pmatrix} = -F, \qquad \nabla F = \begin{pmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{pmatrix},$$

and update $x_{k+1} = x_k + \alpha_x p_x$, $\lambda_{k+1} = \lambda_k + \alpha_\lambda p_\lambda$, $s_{k+1} = s_k + \alpha_s p_s$.

How do we decide $\mu_k$? The quantity $\mu_k = \frac{1}{n} x_k^T s_k$ is called the duality measure.

The central path

The central path is the set of points $p(\tau) = (x_\tau, \lambda_\tau, s_\tau)$ defined by

$$A^T \lambda + s = c, \qquad A x = b, \qquad x_i s_i = \tau \;\; (i = 1, 2, \ldots, n), \qquad x, s > 0.$$

Algorithm

Algorithm: The interior point method for solving linear programming
1. Given an interior point $x_0$ and an initial guess of the slack variables $s_0$.
2. For $k = 0, 1, \ldots$
   (a) Solve
   $$\begin{pmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S_k & 0 & X_k \end{pmatrix} \begin{pmatrix} \Delta x_k \\ \Delta \lambda_k \\ \Delta s_k \end{pmatrix} = \begin{pmatrix} c - A^T \lambda_k - s_k \\ b - A x_k \\ -X_k S_k e + \sigma_k \mu_k e \end{pmatrix}$$
   for some $\sigma_k \in [0, 1]$.
   (b) Compute $(\alpha_x, \alpha_\lambda, \alpha_s)$ such that
   $$(x_{k+1}, \lambda_{k+1}, s_{k+1}) = (x_k + \alpha_x \Delta x_k,\; \lambda_k + \alpha_\lambda \Delta \lambda_k,\; s_k + \alpha_s \Delta s_k)$$
   remains in a neighborhood of the central path,
   $$\mathcal{N}(\theta) = \{(x, \lambda, s) \in \mathcal{F} : \|XSe - \mu e\| \le \theta \mu\}, \quad \text{for some } \theta \in (0, 1],$$
   where $\mathcal{F}$ denotes the set of strictly feasible points.
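A compact, runnable sketch of this path-following scheme (NumPy), with a fixed centering parameter σ, a fraction-to-boundary step rule instead of an explicit central-path neighborhood test, and dense linear algebra; the test LP at the bottom is illustrative:

```python
import numpy as np

def lp_ipm(A, b, c, x, lam, s, sigma=0.1, tau=0.995, tol=1e-8, iters=100):
    """Primal-dual path-following sketch for min c'x s.t. Ax = b, x >= 0.
    x and s must start strictly positive; lam is the free dual vector."""
    m, n = A.shape
    def step(v, dv):                          # fraction-to-boundary step length
        neg = dv < 0
        return min(1.0, float(np.min(-tau * v[neg] / dv[neg]))) if np.any(neg) else 1.0
    for _ in range(iters):
        mu  = x @ s / n                       # duality measure
        r_d = A.T @ lam + s - c               # dual residual
        r_p = A @ x - b                       # primal residual
        r_c = x * s - sigma * mu              # perturbed complementarity X S e - sigma*mu*e
        if max(np.linalg.norm(r_d), np.linalg.norm(r_p), mu) < tol:
            break
        K = np.block([[np.zeros((n, n)), A.T,              np.eye(n)],
                      [A,                np.zeros((m, m)), np.zeros((m, n))],
                      [np.diag(s),       np.zeros((n, m)), np.diag(x)]])
        d = np.linalg.solve(K, -np.concatenate([r_d, r_p, r_c]))
        dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
        ax, asz = step(x, dx), step(s, ds)
        x, lam, s = x + ax * dx, lam + asz * dlam, s + asz * ds
    return x, lam, s

# Illustrative LP: min -x1 - 2*x2  s.t.  x1 + x2 + x3 = 1, x >= 0  (optimum x = (0, 1, 0))
A = np.array([[1.0, 1.0, 1.0]]); b = np.array([1.0]); c = np.array([-1.0, -2.0, 0.0])
x, lam, s = lp_ipm(A, b, c, np.ones(3), np.zeros(1), np.ones(3))
print(np.round(x, 6))
```

With σ fixed this is only the basic scheme; practical codes adapt σ per iteration (e.g. Mehrotra's predictor-corrector).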

Filter method

There are two goals in constrained optimization:
1. Minimize the objective function.
2. Satisfy the constraints.

Suppose the problem is

$$\min_x\; f(x) \quad \text{s.t.} \quad c_i(x) = 0 \text{ for } i \in E, \qquad c_i(x) \ge 0 \text{ for } i \in I.$$

Define a penalty (constraint-violation) function

$$h(x) = \sum_{i \in E} |c_i(x)| + \sum_{i \in I} [c_i(x)]^-, \qquad \text{where } [z]^- = \max\{0, -z\}.$$

The goals then become: minimize $f(x)$ and minimize $h(x)$.

The filter method

A pair $(f_k, h_k)$ dominates $(f_l, h_l)$ if $f_k < f_l$ and $h_k < h_l$. A filter is a list of pairs $(f_k, h_k)$ such that no pair dominates any other. The filter method only accepts steps that are not dominated by any pair in the filter.

Algorithm: The filter method
1. Given an initial $x_0$ and an initial trust region $\Delta_0$.
2. For $k = 0, 1, 2, \ldots$ until convergence:
   1. Compute a trial point $x^+$ by solving a local quadratic programming model.
   2. If $(f^+, h^+)$ is accepted by the filter:
      set $x_{k+1} = x^+$, add $(f^+, h^+)$ to the filter, and remove pairs dominated by it.
      Else: set $x_{k+1} = x_k$ and decrease $\Delta_k$.
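A tiny sketch of the filter bookkeeping (plain Python): a trial pair is accepted unless some stored pair dominates it, and on acceptance the entries it dominates are removed:

```python
def dominates(a, b):
    """Pair a = (f_a, h_a) dominates b = (f_b, h_b) if it is better in both measures."""
    return a[0] < b[0] and a[1] < b[1]

def filter_accept(filt, trial):
    """Return (updated filter, accepted?) for a trial pair (f, h)."""
    if any(dominates(entry, trial) for entry in filt):
        return filt, False
    filt = [entry for entry in filt if not dominates(trial, entry)] + [trial]
    return filt, True

filt = [(3.0, 0.5), (1.0, 2.0)]
print(filter_accept(filt, (2.0, 1.0)))   # accepted: no stored pair dominates it
print(filter_accept(filt, (4.0, 3.0)))   # rejected: both stored pairs dominate it
```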

The Maratos effect

The Maratos effect shows that the filter method can reject a good step. Example:

$$\min_{x_1, x_2}\; f(x_1, x_2) = 2(x_1^2 + x_2^2 - 1) - x_1 \quad \text{s.t.} \quad x_1^2 + x_2^2 - 1 = 0.$$

The optimal solution is $x^* = (1, 0)$. Suppose

$$x_k = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}, \qquad p_k = \begin{pmatrix} \sin^2\theta \\ -\sin\theta\cos\theta \end{pmatrix}, \qquad x_{k+1} = x_k + p_k = \begin{pmatrix} \cos\theta + \sin^2\theta \\ \sin\theta(1 - \cos\theta) \end{pmatrix}.$$

Reject a good step

$$\|x_k - x^*\| = \left\|\begin{pmatrix} \cos\theta - 1 \\ \sin\theta \end{pmatrix}\right\| = \sqrt{\cos^2\theta - 2\cos\theta + 1 + \sin^2\theta} = \sqrt{2(1 - \cos\theta)}$$

$$\|x_{k+1} - x^*\| = \left\|\begin{pmatrix} \cos\theta + \sin^2\theta - 1 \\ \sin\theta(1 - \cos\theta) \end{pmatrix}\right\| = \left\|\begin{pmatrix} \cos\theta(1 - \cos\theta) \\ \sin\theta(1 - \cos\theta) \end{pmatrix}\right\| = 1 - \cos\theta$$

Therefore $\dfrac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} = \dfrac{1}{2}$, so this step gives quadratic convergence. However, the filter method will reject it, because

$$f(x_k) = -\cos\theta, \quad c(x_k) = 0, \qquad f(x_{k+1}) = \sin^2\theta - \cos\theta > f(x_k), \quad c(x_{k+1}) = \sin^2\theta > 0 = c(x_k).$$
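A quick numerical check of these claims at, say, $\theta = 0.5$ (a sketch; it only evaluates $f$, $c$, and the distances before and after the step):

```python
import numpy as np

f = lambda v: 2.0 * (v[0] ** 2 + v[1] ** 2 - 1.0) - v[0]
c = lambda v: v[0] ** 2 + v[1] ** 2 - 1.0

theta = 0.5
xk  = np.array([np.cos(theta), np.sin(theta)])
pk  = np.array([np.sin(theta) ** 2, -np.sin(theta) * np.cos(theta)])
xk1 = xk + pk
xs  = np.array([1.0, 0.0])

print(np.linalg.norm(xk - xs), np.linalg.norm(xk1 - xs))  # ~0.495 -> ~0.122: much closer to x*
print(f(xk), f(xk1))                                      # ~-0.878 -> ~-0.648: f increases
print(c(xk), c(xk1))                                      # 0 -> ~0.230: constraint now violated
```

So the step moves much closer to the solution, yet it is worse in both filter measures, which is exactly the rejection described above.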

The second order correction

A second-order correction can help with this problem. Instead of the linearized constraint $\nabla c(x_k)^T p + c(x_k) = 0$, use the quadratic approximation

$$c(x_k) + \nabla c(x_k)^T d_k + \tfrac{1}{2} d_k^T \nabla^2_{xx} c(x_k)\, d_k = 0. \tag{3}$$

Suppose $d_k - p_k$ is small. Using a Taylor expansion to approximate the quadratic term,

$$c(x_k + p_k) \approx c(x_k) + \nabla c(x_k)^T p_k + \tfrac{1}{2} p_k^T \nabla^2_{xx} c(x_k)\, p_k,$$

so

$$\tfrac{1}{2} d_k^T \nabla^2_{xx} c(x_k)\, d_k \approx \tfrac{1}{2} p_k^T \nabla^2_{xx} c(x_k)\, p_k \approx c(x_k + p_k) - c(x_k) - \nabla c(x_k)^T p_k.$$

Equation (3) can then be rewritten as

$$\nabla c(x_k)^T d_k + c(x_k + p_k) - \nabla c(x_k)^T p_k = 0.$$

That is, use the corrected linearized constraint

$$\nabla c(x_k)^T p + c(x_k + p_k) - \nabla c(x_k)^T p_k = 0$$

(the original linearized constraint is $\nabla c(x_k)^T p + c(x_k) = 0$).