Trust-Region SQP Methods with Inexact Linear System Solves for Large-Scale Optimization

Denis Ridzal
Department of Computational and Applied Mathematics, Rice University, Houston, Texas
dridzal@caam.rice.edu
March 24, 2006
Rice University CAAM 699 Seminar, University of Houston

Outline
- Motivation: large-scale problems in PDE-constrained optimization; inexactness in linear system solves arising in an SQP algorithm.
- Trust-region SQP algorithm with inexact linear system solves: existing work on inexactness in optimization algorithms; review of the SQP methodology; mechanisms of inexactness control.
- Numerical results.
- Conclusion.

Motivation: A PDE-Constrained Optimization Problem

[Figures: velocity field u; computed concentration c.]

    minimize   \frac{1}{2} \int_\Omega c^2 \, d\omega + \frac{\alpha^2}{2} \int_{\partial\Omega_c} (\nabla v)^2 + v^2 \, d\gamma

    subject to
        \rho (u \cdot \nabla) u - \nabla \cdot \mu (\nabla u + \nabla u^T) + \nabla p = 0   in \Omega,
        \nabla \cdot u = 0   in \Omega,
        -p n + \mu (\nabla u + \nabla u^T) n = 0   on \partial\Omega_o,
        u = 0   on \partial\Omega \setminus (\partial\Omega_c \cup \partial\Omega_o),
        u = v n   on \partial\Omega_c,
        -\nabla \cdot (\epsilon \nabla c) + u \cdot \nabla c = f   in \Omega,
        c = 0   on \partial\Omega \setminus \partial\Omega_c,
        n \cdot \epsilon \nabla c = g   on \partial\Omega_c.

Large-Scale Optimization Problems: Common Features

Other applications: optimal design / shape optimization, parameter estimation, inverse problems.

Common features:
- they can be solved as constrained nonlinear programming problems (NLPs) using all-at-once techniques;
- the number of variables can easily be in the millions in 3D;
- the discretized linear operators are often not available in matrix form;
- even if available explicitly, the resulting linear systems usually require specialized solvers, such as multigrid or domain decomposition;
- regardless of which optimization algorithm is used, linear systems must be solved iteratively!

Use of Sequential Quadratic Programming Methods

SQP methods have been used successfully for the solution of smooth NLPs in R^n. Most available SQP codes (NPSOL, SNOPT, KNITRO, LOQO) are based on direct (dense or sparse) linear algebra:
- impossible to apply to many large-scale optimization problems, in particular PDE-constrained optimization problems;
- not suitable for parallel computing environments.

Contribution: incorporated iterative linear algebra in an SQP framework:
- iterative linear system solvers are inherently inexact;
- rigorous theoretical analysis of inexactness within an SQP algorithm;
- practical approaches to inexactness control.


Inexactness in Optimization Algorithms: Existing Work

Early results for inexact Newton methods in optimization: e.g., Dembo, Eisenstat, Steihaug, Dennis, Walker (1980s).

Connection with inexact SQP methods: Dembo and Tulowitzki (1985) and Fontecilla (1985); limited to local convergence analysis!

Global results for inexact Newton methods for nonlinear equations: e.g., Brown and Saad (1990, 1994), Eisenstat and Walker (1994).

Jäger and Sachs (1997): line-search reduced-space SQP; first global convergence result; dependence on Lipschitz constants and derivative bounds.

Biros and Ghattas (2002): quasi-Newton reduced-space SQP; dependence on derivative bounds.

Heinkenschloss and Vicente (2001): reduced-space TR-SQP; established a theoretical convergence framework that does not rely on Lipschitz constants or derivative bounds; limited to the reduced-space SQP approach.

Review of Trust-Region SQP

Solve the NLP
    \min f(x)   s.t.   c(x) = 0,
where f : X -> R and c : X -> Y for some Hilbert spaces X and Y, and f and c are twice continuously Fréchet differentiable.

Define the Lagrangian functional L : X x Y -> R,
    L(x, \lambda) = f(x) + \langle \lambda, c(x) \rangle_Y.

If a regular point x_* is a local solution of the NLP, then there exists a \lambda_* in Y satisfying the first-order necessary optimality conditions
    \nabla_x f(x_*) + c_x(x_*)^* \lambda_* = 0,
    c(x_*) = 0.

Newton's method applied to the first-order optimality conditions:
    \begin{pmatrix} \nabla_{xx} L(x_k, \lambda_k) & c_x(x_k)^* \\ c_x(x_k) & 0 \end{pmatrix}
    \begin{pmatrix} s_k^x \\ s_k^\lambda \end{pmatrix}
    = - \begin{pmatrix} \nabla_x f(x_k) + c_x(x_k)^* \lambda_k \\ c(x_k) \end{pmatrix}.

If \nabla_{xx} L(x_k, \lambda_k) is positive definite on the null space of c_x(x_k), the above KKT system is necessary and sufficient for the solution of the quadratic programming problem (QP)
    \min \; \tfrac{1}{2} \langle \nabla_{xx} L(x_k, \lambda_k) s_k^x, s_k^x \rangle_X + \langle \nabla_x L(x_k, \lambda_k), s_k^x \rangle_X
    s.t. \; c_x(x_k) s_k^x + c(x_k) = 0.

To globalize the convergence, we add a trust-region constraint:
    \min \; \tfrac{1}{2} \langle H_k s_k^x, s_k^x \rangle_X + \langle \nabla_x L_k, s_k^x \rangle_X
    s.t. \; c_x(x_k) s_k^x + c(x_k) = 0, \quad \| s_k^x \|_X \le \Delta_k.

Possible incompatibility of the constraints: composite-step approach.
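
As a concrete illustration (not from the talk), the following minimal NumPy sketch assembles and solves the KKT saddle-point system above for a tiny, made-up equality-constrained quadratic model; H, g, A, and c are illustrative data only.

```python
import numpy as np

# Hypothetical small model: minimize 1/2 s'Hs + g's  s.t.  A s + c = 0.
H = np.array([[4.0, 1.0], [1.0, 3.0]])   # Hessian of the Lagrangian (model Hessian)
g = np.array([1.0, -2.0])                 # gradient of the Lagrangian at x_k
A = np.array([[1.0, 1.0]])                # constraint Jacobian c_x(x_k)
c = np.array([0.5])                       # constraint value c(x_k)

# Assemble the KKT (saddle-point) system
#   [ H  A^T ] [ s^x      ]   [ -g ]
#   [ A   0  ] [ s^lambda ] = [ -c ]
n, m = H.shape[0], A.shape[0]
K = np.block([[H, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-g, -c])

sol = np.linalg.solve(K, rhs)
s, s_lam = sol[:n], sol[n:]
print("step s =", s, "multiplier step =", s_lam)
print("linearized constraint residual:", A @ s + c)  # should be ~0
```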

Composite-Step Approach for the Solution of the Quadratic Subproblem

TR-SQP step: s_k = n_k + t_k.
- Quasi-normal step n_k: moves toward feasibility, aiming at the linearized constraint c_x(x_k) s_k^x + c(x_k) = 0 within the shrunken trust region \| n_k \|_X \le \zeta_k \Delta_k (see the sketch after this list).
- Tangential step t_k: moves toward optimality while staying in the null space of the linearized constraints, c_x(x_k) t = 0.

See, e.g., Omojokun [1989]; Byrd, Hribar, Nocedal [1997]; Dennis, El-Alem, Maciel [1997]; Dennis, Heinkenschloss, Vicente [1998]; Conn, Gould, Toint [2000].
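
One common (illustrative) realization of the quasi-normal step, not necessarily the talk's exact rule, is a scaled minimum-norm step for the linearized constraints; the following sketch uses dense NumPy data with made-up A, c, and the fraction zeta*Delta.

```python
import numpy as np

def quasi_normal_step(A, c, zeta_delta):
    """Minimum-norm solution of A n = -c, scaled back into the shrunken
    trust region ||n|| <= zeta_delta if necessary (illustrative choice)."""
    n = np.linalg.lstsq(A, -c, rcond=None)[0]   # minimum-norm least-squares solution
    nrm = np.linalg.norm(n)
    if nrm > zeta_delta:
        n *= zeta_delta / nrm                    # stay inside the fractional trust region
    return n

A = np.array([[1.0, 2.0, 0.0]])   # c_x(x_k), made-up data
c = np.array([3.0])               # c(x_k)
n = quasi_normal_step(A, c, zeta_delta=0.8)
print("quasi-normal step:", n, "new linearized residual:", A @ n + c)
```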

Acceptance of the Step

Merit function:
    \phi(x, \lambda; \rho) = f(x) + \langle \lambda, c(x) \rangle_Y + \rho \| c(x) \|_Y^2 = L(x, \lambda) + \rho \| c(x) \|_Y^2.

Actual reduction at step k:
    ared(s_k^x; \rho_k) = \phi(x_k, \lambda_k; \rho_k) - \phi(x_k + s_k, \lambda_{k+1}; \rho_k).

Predicted reduction at step k:
    pred(s_k^x; \rho_k) = \phi(x_k, \lambda_k; \rho_k) - \Big[ L(x_k, \lambda_k) + \langle g_k, s_k \rangle_X + \tfrac{1}{2} \langle H_k s_k^x, s_k^x \rangle_X
        + \langle \lambda_{k+1} - \lambda_k, c_x(x_k) s_k^x + c(x_k) \rangle_Y + \rho_k \| c_x(x_k) s_k^x + c(x_k) \|_Y^2 \Big].
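
The following hedged sketch evaluates ared and pred for a generic equality-constrained problem with user-supplied callables f and c; all names and the toy data at the bottom are illustrative, not the talk's code.

```python
import numpy as np

def merit(f, c, x, lam, rho):
    """phi(x, lam; rho) = f(x) + <lam, c(x)> + rho * ||c(x)||^2."""
    cval = c(x)
    return f(x) + lam @ cval + rho * (cval @ cval)

def actual_reduction(f, c, x, lam, lam_new, s, rho):
    """ared = phi(x_k, lam_k; rho_k) - phi(x_k + s_k, lam_{k+1}; rho_k)."""
    return merit(f, c, x, lam, rho) - merit(f, c, x + s, lam_new, rho)

def predicted_reduction(f, c, grad_L, H, A, x, lam, lam_new, s, rho):
    """pred = merit value minus its model:
    L + <g, s> + 1/2 <H s, s> + <lam_new - lam, A s + c> + rho ||A s + c||^2."""
    cval = c(x)
    L = f(x) + lam @ cval
    lin_c = A @ s + cval
    model = (L + grad_L @ s + 0.5 * s @ (H @ s)
             + (lam_new - lam) @ lin_c + rho * (lin_c @ lin_c))
    return merit(f, c, x, lam, rho) - model

# Acceptance test with made-up data: accept if ared/pred >= eta (e.g. eta = 1e-2).
f = lambda x: 0.5 * float(x @ x)
c = lambda x: np.array([x[0] + x[1] - 1.0])
x, lam, lam_new = np.array([1.0, 1.0]), np.array([0.1]), np.array([0.2])
s, rho = np.array([-0.3, -0.2]), 10.0
grad_L = x + lam[0] * np.array([1.0, 1.0])        # gradient of L at x for this f, c
H, A = np.eye(2), np.array([[1.0, 1.0]])
ratio = actual_reduction(f, c, x, lam, lam_new, s, rho) / \
        predicted_reduction(f, c, grad_L, H, A, x, lam, lam_new, s, rho)
print("accept step:", ratio >= 1e-2)
```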

Composite-Step Trust-Region SQP Algorithm

1. Compute the quasi-normal step n_k.
   One linear system involving c_x(x_k). Possible inexactness!
2. Compute the tangential step t_k.
   Multiple linear systems involving c_x(x_k). Possible inexactness!
   Depends on the already (inexactly) computed quantities n_k and \lambda_k.
3. Compute the new Lagrange multiplier estimate \lambda_{k+1}.
   One linear system involving c_x(x_k). Possible inexactness!
4. Update the penalty parameter \rho_k.
   The penalty parameter update must be modified!
5. Compute ared_k and pred_k.
   The definition of pred_k must be modified!
6. Decide whether to accept the new iterate x_{k+1} = x_k + n_k + t_k, and update \Delta_{k+1} from \Delta_k, based on ared_k / pred_k.
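
For orientation only, here is a hedged Python sketch of the control flow of steps 1-6; the inner computations (quasi-normal step, tangential step, multipliers, penalty update, ared/pred, KKT residual) are assumed to be supplied by the caller through the hypothetical `steps` bundle, and the acceptance/radius rules shown are generic placeholders.

```python
def tr_sqp_outer_loop(x, lam, delta, steps, tol=1e-6, eta=1e-2, max_iter=100):
    """Outer loop of a composite-step TR-SQP method (illustrative sketch only).
    `steps` bundles user-supplied routines: kkt_residual, quasi_normal,
    tangential, multipliers, penalty_update, ared, pred."""
    rho = 1.0
    for k in range(max_iter):
        if steps.kkt_residual(x, lam) < tol:
            break                                   # approximate KKT point found
        n = steps.quasi_normal(x, delta)            # 1. feasibility step (inexact solve)
        t = steps.tangential(x, lam, n, delta)      # 2. optimality step (inexact solves)
        lam_new = steps.multipliers(x, n + t)       # 3. multiplier estimate (inexact solve)
        rho = steps.penalty_update(x, lam, lam_new, n + t, rho)   # 4. penalty parameter
        ared = steps.ared(x, lam, lam_new, n + t, rho)            # 5. actual reduction
        pred = steps.pred(x, lam, lam_new, n + t, rho)            #    predicted reduction
        if ared / pred >= eta:                      # 6. accept/reject, resize trust region
            x, lam = x + n + t, lam_new
            delta *= 2.0
        else:
            delta *= 0.5
    return x, lam
```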

Balancing Inexactness in the Quasi-Normal and the Tangential Step

[Figure: the quasi-normal step targets the linearized feasibility condition c_x(x_k) s_k^x + c(x_k) = 0 inside the shrunken trust region of radius \zeta_k \Delta_k; the tangential step moves along the null-space condition c_x(x_k) t_k = 0.]

Inexactness in TR-SQP: Summary of My Contributions

Iterative linear system solves arise in the computation of (1) Lagrange multipliers, (2) the quasi-normal step, and (3) the tangential step.

Global convergence theory for TR-SQP methods gives a rather generic treatment of the issue of inexactness. My work ties these generic requirements to inexactness specific to linear system solves, for each of the above.

The devised stopping criteria for iterative linear system solves
- are dynamically adjusted by the SQP algorithm, based on its current progress toward a KKT point,
- trade gains in feasibility for gains in optimality and vice versa,
- can be easily implemented and are sufficient to guarantee first-order global convergence of the algorithm,
- allow for a rigorous integration of preconditioners for KKT systems.

Tangential Step

The exact model requires that t_k approximately solve the problem
    \min \; \tfrac{1}{2} \langle H_k (t + n_k), t + n_k \rangle_X + \langle \nabla_x L_k, t + n_k \rangle_X
    s.t. \; c_x(x_k) t = 0, \quad \| t + n_k \|_X \le \Delta_k.

Assume that there exists a bounded linear operator W_k : Z -> X, where Z is a Hilbert space, such that Range(W_k) = Null(c_x(x_k)). This covers all existing implementations for handling c_x(x_k) t = 0.

Drop the constant term from the QP, ignore n_k in the trust-region constraint, set g_k = H_k n_k + \nabla_x L_k, and let t = W_k w. We obtain the equivalent reduced QP
    \min \; q_k(w) = \tfrac{1}{2} \langle W_k^* H_k W_k w, w \rangle_Z + \langle W_k^* g_k, w \rangle_Z
    s.t. \; \| W_k w \|_X \le \Delta_k.
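
As a small dense analogue of the operator setting above (illustrative only), the sketch below builds a null-space basis W with SciPy's null_space and forms the reduced Hessian W^T H W and reduced gradient W^T g; the data A, H, g are made up, and the trust-region constraint is ignored here.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, -1.0]])          # c_x(x_k), made-up data
H = np.diag([4.0, 2.0, 3.0])              # model Hessian H_k
g = np.array([1.0, -1.0, 0.5])            # g_k = H_k n_k + grad_x L_k

W = null_space(A)                          # columns span Null(c_x(x_k)); here W maps R^2 -> R^3
H_red = W.T @ H @ W                        # reduced Hessian  W^* H W
g_red = W.T @ g                            # reduced gradient W^* g

w = np.linalg.solve(H_red, -g_red)         # unconstrained reduced Newton step
t = W @ w                                  # tangential step, satisfies A t = 0
print("A t =", A @ t)                      # ~0 up to roundoff
```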

Tangential Step: Steihaug-Toint CG

0. Let w_0 = 0 in Z. Let r_0 = -W_k^* g_k, p_0 = r_0.
1. For i = 0, 1, 2, ...
   1.1 If \langle p_i, W_k^* H_k W_k p_i \rangle_Z \le 0, extend w_i to the boundary of the trust region and stop.
   1.2 \alpha_i = \langle r_i, r_i \rangle_Z / \langle p_i, W_k^* H_k W_k p_i \rangle_Z
   1.3 w_{i+1} = w_i + \alpha_i p_i
   1.4 If \| W_k w_{i+1} \| \ge \Delta_k, extend w_i to the boundary of the trust region and stop.
   1.5 r_{i+1} = r_i - \alpha_i W_k^* H_k W_k p_i
   1.6 \beta_i = \langle r_{i+1}, r_{i+1} \rangle_Z / \langle r_i, r_i \rangle_Z
   1.7 p_{i+1} = r_{i+1} + \beta_i p_i
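
A hedged dense-matrix sketch of this reduced-space Steihaug-Toint CG follows; B stands in for W_k^* H_k W_k and b for W_k^* g_k, the Euclidean norm replaces the X-norm of W_k w, and the boundary-extension helper solves the positive root of ||w + tau p|| = Delta.

```python
import numpy as np

def steihaug_toint_cg(B, b, delta, tol=1e-8, max_iter=200):
    """Approximately minimize 1/2 w'Bw + b'w subject to ||w|| <= delta.
    Illustrative dense sketch, not the talk's implementation."""
    w = np.zeros_like(b)
    r = -b
    p = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        if np.sqrt(rr) < tol:
            return w
        Bp = B @ p
        curv = p @ Bp
        if curv <= 0.0:                       # negative curvature: go to the boundary
            return w + _to_boundary(w, p, delta) * p
        alpha = rr / curv
        w_new = w + alpha * p
        if np.linalg.norm(w_new) >= delta:    # trust-region boundary hit
            return w + _to_boundary(w, p, delta) * p
        r = r - alpha * Bp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        w, rr = w_new, rr_new
    return w

def _to_boundary(w, p, delta):
    """Positive tau with ||w + tau p|| = delta."""
    a, b_, c = p @ p, 2.0 * (w @ p), w @ w - delta ** 2
    return (-b_ + np.sqrt(b_ ** 2 - 4.0 * a * c)) / (2.0 * a)

B = np.array([[3.0, 0.5], [0.5, 2.0]]); b = np.array([1.0, -1.0])
print(steihaug_toint_cg(B, b, delta=0.4))
```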

Tangential Step: Linear Systems

The application of W_k and W_k^* requires linear system solves. Example: W_k is the orthogonal projector onto Null(c_x(x_k)). Any computation z = W_k p can be performed by solving the augmented system
    \begin{pmatrix} I & c_x(x_k)^* \\ c_x(x_k) & 0 \end{pmatrix}
    \begin{pmatrix} z \\ y \end{pmatrix}
    = \begin{pmatrix} p \\ 0 \end{pmatrix}.

If I is replaced by G_k \approx H_k, and W_k^* G_k W_k is positive definite, this leads to preconditioning of the reduced Hessian W_k^* H_k W_k [Keller, Gould, Wathen 2000]. Attractive if we have a good preconditioner for KKT systems: [Heinkenschloss, Nguyen 2004], [Bartlett, Heinkenschloss, Ridzal, van Bloemen Waanders 2006].

We have the tools to efficiently solve large-scale KKT systems or the above augmented systems iteratively.
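
The dense sketch below applies the orthogonal projector onto Null(c_x) by solving exactly this augmented system; in the large-scale setting the solve would of course be iterative (and hence inexact), but the algebra is the same. The data A and p are made up.

```python
import numpy as np

def apply_nullspace_projector(A, p):
    """Apply W = orthogonal projector onto Null(A) to p by solving
        [ I  A^T ] [ z ]   [ p ]
        [ A   0  ] [ y ] = [ 0 ].
    Dense direct solve for illustration only."""
    n, m = A.shape[1], A.shape[0]
    K = np.block([[np.eye(n), A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([p, np.zeros(m)])
    return np.linalg.solve(K, rhs)[:n]

A = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])   # made-up c_x(x_k)
p = np.array([1.0, 2.0, 3.0])
z = apply_nullspace_projector(A, p)
print("A z =", A @ z)          # ~0: z lies in Null(A)
```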

Tangential Step with Inexactness (Projector Case)

Issues:
- Augmented systems are solved iteratively. Every CG iteration uses a different W_k.
- The CG operator W_k^* H_k W_k is nonsymmetric.
- The CG operator W_k^* H_k W_k is effectively nonlinear. Which quadratic functional are we minimizing?

Conventional proofs of global convergence for SQP methods require us to replace the reduced QP with the following inexact problem:
    \min \; \tfrac{1}{2} \langle \widetilde{W_k^* H_k W_k} \, w, w \rangle_Z + \langle \widetilde{W_k^* g_k}, w \rangle_Z
    s.t. \; \| \widetilde{W}_k w \|_X \le \Delta_k.
    \widetilde{W_k^* H_k W_k} = ?  \qquad  \widetilde{W_k^* g_k} = ?

Tangential Step with Inexactness (Projector Case)

Outline of the solution:
- Use a full-space approach, in which the CG operator is H_k (exact), and the inexactness is moved into a preconditioner W_k(\cdot) (the inexact application of W_k):
    \min \; \tfrac{1}{2} \langle H_k t, t \rangle_X + \langle g_k, t \rangle_X
    s.t. \; t \in Range(W_k), \quad \| t \|_X \le \Delta_k.
- Find a fixed (with respect to every CG iteration) linear representation \widetilde{W}_k = W_k + E_k of the inexact null-space operator:
    \widetilde{W_k^* H_k W_k} = \widetilde{W}_k^* H_k \widetilde{W}_k, \qquad \widetilde{W_k^* g_k} = \widetilde{W}_k^* g_k.
- Establish bounds on E_k that can be controlled in practice.

Tangential Step: Inexact CG with Full Orthogonalization

0. Let t_0 = 0 in X. Let r_0 = g_k. Set i_max, set i = 0.
1. While W_k(r_i) \ne 0 and i < i_max:
   1.1 z_i = W_k(r_i)
   1.2 p_i = -z_i + \sum_{j=0}^{i-1} \frac{\langle z_i, H_k p_j \rangle_X}{\langle p_j, H_k p_j \rangle_X} \, p_j
   1.3 If \langle p_i, H_k p_i \rangle_X \le 0, extend t_i to the boundary of the trust region and stop.
   1.4 \alpha_i = -\langle r_i, p_i \rangle_X / \langle p_i, H_k p_i \rangle_X
   1.5 t_{i+1} = t_i + \alpha_i p_i
   1.6 If \| t_{i+1} \| \ge \Delta_k, extend t_i to the boundary of the trust region and stop.
   1.7 r_{i+1} = r_i + \alpha_i H_k p_i
   1.8 i <- i + 1
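
A hedged Python sketch of this full-space CG with full orthogonalization is given below; the inexact projector application is passed in as a callable (in practice an iterative augmented-system solve), dense matrices stand in for the operators, and the sign conventions follow the reconstruction above rather than any published listing.

```python
import numpy as np

def inexact_cg_full_orth(H, g, apply_W, delta, i_max=50, tol=1e-10):
    """Full-space CG with full orthogonalization for min 1/2 t'Ht + g't,
    ||t|| <= delta, preconditioned by an (inexact) null-space projector
    apply_W(r). Illustrative sketch only."""
    t = np.zeros_like(g)
    r = g.copy()                      # gradient of the model at t
    P, HP = [], []                    # stored directions and H*directions
    for _ in range(i_max):
        z = apply_W(r)                # inexact projected residual
        if np.linalg.norm(z) < tol:
            break
        p = -z
        for pj, Hpj in zip(P, HP):    # full orthogonalization against old directions
            p += (z @ Hpj) / (pj @ Hpj) * pj
        Hp = H @ p
        curv = p @ Hp
        if curv <= 0.0:               # negative curvature: stop at the boundary
            return t + _tau(t, p, delta) * p
        alpha = -(r @ p) / curv
        t_new = t + alpha * p
        if np.linalg.norm(t_new) >= delta:
            return t + _tau(t, p, delta) * p
        r = r + alpha * Hp
        P.append(p); HP.append(Hp)
        t = t_new
    return t

def _tau(t, p, delta):
    """Positive tau with ||t + tau p|| = delta."""
    a, b, c = p @ p, 2.0 * (t @ p), t @ t - delta ** 2
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
```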

Inexact CG with Full Orthogonalization: Theory

Theorem 1. If W_k(\cdot) = W_k is a fixed (exact) linear operator, then the inexact CG algorithm in the full space is equivalent to the traditional Steihaug-Toint CG algorithm applied to the tangential subproblem in the reduced space.

Proof: straightforward.

If linear system solves can be performed with high accuracy, we recover the convergence properties of traditional CG.

Inexact CG with Full Orthogonalization: Theory

Theorem 2. There exists a fixed linear operator \widetilde{W}_k such that W_k(r_i) = \widetilde{W}_k r_i for every iteration i of the inexact CG algorithm.

Proof: It can be shown that the residual vectors r_i, i = 0, 1, ..., m, are linearly independent, so the matrix R_m = [r_0, r_1, ..., r_m] has full column rank. Introduce the matrices
    Y_m = [W_k r_0, W_k r_1, ..., W_k r_m],   \widetilde{Y}_m = [W_k(r_0), W_k(r_1), ..., W_k(r_m)].
One possible choice of the inexact operator is
    \widetilde{W}_k = W_k + (\widetilde{Y}_m - Y_m)(R_m^* R_m)^{-1} R_m^*,
since \widetilde{W}_k R_m = \widetilde{Y}_m.

Inexact CG with Full Orthogonalization: Theory (continued)

By Theorem 2, inexact CG effectively solves the inexact tangential subproblem
    \min \; \tfrac{1}{2} \langle \widetilde{W}_k^* H_k \widetilde{W}_k w, w \rangle_Z + \langle \widetilde{W}_k^* g_k, w \rangle_Z
    s.t. \; \| \widetilde{W}_k w \|_X \le \Delta_k,
so we can use conventional theory for the global convergence of SQP methods.

Remark: For analytical purposes, we use the inexact operator
    \widetilde{W}_k = W_k + E_k = W_k + (\widetilde{Y}_m - Y_m)(\widetilde{Y}_m^* R_m)^{-1} \widetilde{Y}_m^*
(after establishing the conditions for the invertibility of \widetilde{Y}_m^* R_m).

Tangential Step: Global Convergence Requirements

(C1)  \| \widetilde{W}_k^* g_k - W_k^* g_k \|_X \le \kappa_1 \min ( \| \widetilde{W}_k^* g_k \|_X, \Delta_k ),
(C2)  \langle \widetilde{W}_k^* H_k \widetilde{W}_k w_k, w_k \rangle_X \le \kappa_2 \| w_k \|_X^2,
(C3)  \tfrac{1}{2} \langle \widetilde{W}_k^* H_k \widetilde{W}_k w_k, w_k \rangle_X + \langle \widetilde{W}_k^* g_k, w_k \rangle_X \le -\kappa_3 \| \widetilde{W}_k^* g_k \|_X \min \{ \kappa_4 \| \widetilde{W}_k^* g_k \|_X, \kappa_5 \Delta_k \},

for positive constants \kappa_1, ..., \kappa_5 independent of k.

The true difficulty is in proving the global convergence condition (C1), related to the inexact reduced gradient.

Inexact CG with Full Orthogonalization: Theory

Theorem 3. If at every iteration i of the inexact CG algorithm
    \| W_k(r_i) - W_k r_i \| \le \xi \min \Big\{ \frac{\| \widetilde{W}_k g_k \|}{\| g_k \|}, \frac{\Delta_k}{\| g_k \|}, \beta \Big\} \| W_k(r_i) \|, \quad \xi > 0,
and
    c_1 \| \widetilde{W}_k g_k \| \le \| W_k g_k \| \le c_2 \| \widetilde{W}_k g_k \|, \quad c_1, c_2 > 0,
then the convergence requirements (C1)-(C2) are satisfied.

Proof: relies on a bound for the quantity E_k = (\widetilde{Y}_m - Y_m)(\widetilde{Y}_m^* R_m)^{-1} \widetilde{Y}_m^*.

Notes: (1) Even though the inexact reduced gradient \widetilde{W}_k g_k is computed in the very first CG iteration, in order to guarantee (C1) our theoretical framework puts restrictions on all subsequent applications of W_k. (2) The theorem gives a sufficient condition that works extremely well in practice.

Application of the Inexact Operator W_k

Recall: (i) At every iteration k of the SQP algorithm, inexact CG is called. (ii) At every CG iteration i, we compute iteratively an inexact projected residual z_i = W_k(r_i) = \widetilde{W}_k r_i such that
    \begin{pmatrix} I & c_x(x_k)^* \\ c_x(x_k) & 0 \end{pmatrix}
    \begin{pmatrix} z_i \\ y \end{pmatrix}
    = \begin{pmatrix} r_i \\ 0 \end{pmatrix}
    + \begin{pmatrix} e_i^1 \\ e_i^2 \end{pmatrix}.

Control global SQP convergence by controlling e_i!

Theory: the sufficient condition of Theorem 3 on \| W_k(r_i) - W_k r_i \| guarantees the convergence requirements (C1)-(C2).

Practice: It is sufficient to require
    \| e_i \| \le \underbrace{\min \{ \| \widetilde{W}_k g_k \| / \| g_k \|, \; \Delta_k / \| g_k \|, \; \beta \}}_{\gamma} \, \| z_i \|,
where \beta = 10^{-3} (a fixed small constant). Note that \widetilde{W}_k g_k = z_0.

Implementation:
- First CG iteration: stop the linear system solver at iteration m if \| e_0^{(m)} \| \le \gamma \| z_0^{(m)} \|.
- Subsequent CG iterations: heuristic, reuse the size of the iterate returned by the previous solve, \| e_i \| \le \gamma \| z_{i-1} \|.
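
To make the tolerance logic concrete, the hedged sketch below computes gamma as on the Practice slide and uses it as an absolute residual tolerance for an iterative augmented-system solve; the function names are illustrative, SciPy's GMRES is a stand-in for the talk's preconditioned solver, and the rtol/atol keywords assume a recent SciPy.

```python
import numpy as np
from scipy.sparse.linalg import gmres

def projected_residual_tolerance(norm_z_prev, norm_red_grad, norm_g, delta, beta=1e-3):
    """gamma = min{ ||W~ g_k|| / ||g_k||, Delta_k / ||g_k||, beta };
    the absolute tolerance for the augmented-system solve is gamma * ||z_{i-1}||
    (heuristic from the implementation notes above)."""
    gamma = min(norm_red_grad / norm_g, delta / norm_g, beta)
    return gamma * norm_z_prev

def apply_W_inexact(A, r, abs_tol):
    """Inexactly apply the null-space projector by solving the augmented system
    with GMRES to the prescribed absolute residual tolerance."""
    n, m = A.shape[1], A.shape[0]
    K = np.block([[np.eye(n), A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([r, np.zeros(m)])
    sol, info = gmres(K, rhs, rtol=0.0, atol=abs_tol)
    return sol[:n]

# Tiny usage with made-up quantities:
A = np.array([[1.0, 1.0, 0.0]])
r = np.array([1.0, 2.0, 3.0])
tol = projected_residual_tolerance(norm_z_prev=1.5, norm_red_grad=0.8,
                                   norm_g=2.0, delta=1.0, beta=1e-3)
z = apply_W_inexact(A, r, tol)
print("inexact projected residual:", z)
```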


Example 1: Burgers Equation in 1D

    minimize   \frac{1}{2} \int_0^1 (y(x) - y_d(x))^2 \, dx + \frac{\alpha}{2} \int_0^1 u^2(x) \, dx
    subject to -\nu y_{xx}(x) + y(x) y_x(x) = f(x) + u(x),   x \in (0, 1),
               y(0) = 0,  y(1) = 0.

- Finite element discretization with linear elements.
- \nu = 10^{-2}, \alpha = 10^{-5}, 100 equidistant subintervals.
- SQP stopping criteria: \| c(x_k) \| < 10^{-6}, \| \nabla_x L(x_k, \lambda_k) \| < 10^{-6}.
- For augmented system solves, use GMRES with incomplete LU preconditioning.

Example 1: Inexactness Control in the Tangential Step

[Figure: absolute inner-solver stopping tolerance (10^{-4} to 10^{-12}) vs. CG iterations, over all SQP iterations. Legend: controlled tolerance in the first CG iteration (one for every SQP iteration); * controlled tolerance in all other CG iterations.]

With controlled tolerances: total number of GMRES iterations 2544, runtime 11 seconds.

How do we pick a fixed tolerance for comparison? Pick the largest tolerance that recovers the same convergence profile (in terms of the number of SQP iterations and the quality of the solution).

Fixed tolerance 10^{-11}: total number of GMRES iterations 5652 (was 2544), runtime 33 seconds (was 11).

Example 1: Inexactness Control in the Tangential Step (relative tolerances)

[Figure: relative inner-solver stopping tolerance (10^{-3} to 10^{-8}) vs. CG iterations, over all SQP iterations.]

Relative linear-solver stopping tolerances never need to surpass the desired SQP stopping tolerances!

Example 2: Nonlinear Elliptic Problem in 2D

    minimize   \frac{1}{2} \int_\Omega (y(x) - y_0(x))^2 \, dx + \frac{1}{2} \int_{\partial\Omega} u^2(x) \, dx
    subject to -\Delta y(x) + y^3(x) - y(x) = 0   in \Omega,
               \frac{\partial y}{\partial n}(x) = u(x)   on \partial\Omega.

- The computational domain is the [0, 1] x [0, 1] square.
- Unstructured meshes generated by Triangle, partitioned using Metis.
- Mesh sizes: 32K, 64K, 128K, 256K (approximately the total number of variables).
- Partition sizes: 2, 4, 8, 16 (= number of processors).
- For augmented system solves, use GMRES with domain-decomposition (DD) preconditioning.
- Beowulf cluster (Mike Heroux, CSBSJU, MN and Sandia, NM): 16 Athlon 2.0 GHz nodes / 1 GB RAM / 100 Mbps Ethernet.

Example 2: Inexactness Control in the Tangential Step

[Figure: absolute inner-solver stopping tolerance (10^{-4} to 10^{-10}) vs. CG iterations for each SQP iteration. Legend: controlled tolerance in the first CG iteration (one for every SQP iteration); * controlled tolerance in all other CG iterations.]

How do we pick a fixed tolerance for comparison? Pick the largest tolerance that recovers the same convergence profile (in terms of the number of SQP iterations and the quality of the solution). Fixed tolerance: 5 x 10^{-9}.

Example 2: Inexactness Control in the Tangential Step

Total number of GMRES iterations (fixed tol / controlled tol):

  Mesh \ Part.      2         4         8         16
  32K            297/197   396/252   495/327   605/402
  64K            254/166   318/190   432/273   526/337
  128K           378/275   504/363   652/464   750/544
  256K           425/283   564/401   730/521   906/665

Savings of roughly 30% (tangential step computation only).

Wall time in seconds (fixed tol / controlled tol):

  Mesh \ Part.      2         4         8         16
  32K             51/46     41/34     44/40     60/75
  64K             82/71     57/47     53/47     63/58
  128K           268/243   185/167   144/126   140/130
  256K           661/575   426/376   301/265   221/182

Savings of roughly 15%.

Example 3: Navier-Stokes Problem in 2D

- Finite element discretization with the Taylor-Hood element pair.
- \nu = 5 x 10^{-3}, \alpha = 10^{-1}, \delta = 10^{-5}.
- SQP stopping criteria: \| c(x_k) \| < 10^{-6}, \| \nabla_x L(x_k, \lambda_k) \| < 10^{-6}.
- For augmented system solves, use GMRES with incomplete LU preconditioning (drop tolerance 5 x 10^{-5}).
- Use full reorthogonalization for all tangential step computations.

Example 3: Inexactness Control in the Tangential Step

[Figure: absolute inner-solver stopping tolerance (10^{-2} to 10^{-14}) vs. CG iterations, over all SQP iterations. Legend: controlled tolerance in the first CG iteration (one for every SQP iteration); * controlled tolerance in all other CG iterations.]

With controlled tolerances: total number of GMRES iterations 2672.

How do we pick a fixed tolerance for comparison? Pick the largest tolerance, by trial and error, that recovers the same convergence profile (in terms of the number of SQP iterations and the quality of the solution).

Fixed tolerance 10^{-10}: total number of GMRES iterations 3404 (was 2672).

Example 3: Inexactness Control in the Tangential Step (more details)

Stopping tolerances for the linear system solver:

  Tolerance       inx. ctrl   1e-12   1e-11   1e-10    1e-9     1e-8
  converges          YES       YES     YES     YES      NO       NO
  GMRES iter's       2670      4020    3728    3404   >10000   >10000
  CG iter's           162       142     142     142     >500     >500
  SQP iter's            8         7       7       7      >50      >50

No theoretical justification.


Conclusion

- Integrated iterative linear solvers in a trust-region SQP algorithm.
- Global convergence of the SQP algorithm is guaranteed through a mechanism of inexpensive and easily implementable stopping conditions for iterative linear system solvers.
- Eliminated the need to guess fixed solver tolerances, at the expense of a few vector norm computations and a full reorthogonalization in the tangential step computation; the extra work is less than 1% of the cost of the linear system solves (for a simple medium-scale problem).
- Numerical results indicate that the dynamic stopping conditions effectively reduce oversolves.
- The local convergence behavior of the algorithm remains to be investigated.