Numerical Optimal Control Overview. Moritz Diehl


Simplified Optimal Control Problem in ODE

[figure: states x(t) starting from the initial value x_0, controls u(t), path constraints h(x, u) ≤ 0, and terminal constraint r(x(T)) ≤ 0, on the horizon 0 ≤ t ≤ T]

minimize over x(·), u(·):   ∫_0^T L(x(t), u(t)) dt + E(x(T))

subject to
  x(0) - x_0 = 0,                          (fixed initial value)
  ẋ(t) - f(x(t), u(t)) = 0,  t ∈ [0, T],   (ODE model)
  h(x(t), u(t)) ≤ 0,         t ∈ [0, T],   (path constraints)
  r(x(T)) ≤ 0.                             (terminal constraints)

More general optimal control problems. Many features are left out here for simplicity of presentation:
- multiple dynamic stages
- differential algebraic equations (DAE) instead of ODE
- explicit time dependence
- constant design parameters
- multipoint constraints r(x(t_0), x(t_1), ..., x(t_end)) = 0

Optimal Control Family Tree. Three basic families:
- Hamilton-Jacobi-Bellman equation / dynamic programming
- Indirect methods / calculus of variations / Pontryagin
- Direct methods (control discretization)

Principle of Optimality. Any subarc of an optimal trajectory is also optimal.

[figure: optimal states x(t) and controls u(t) from the initial value x_0, passing through an intermediate value x̄ at time t̄]

The subarc on [t̄, T] is the optimal solution for the initial value x̄.

Dynamic Programming Cost-to-go

IDEA: Introduce the optimal cost-to-go function on [t̄, T]:

  J(x̄, t̄) := min_{x,u} ∫_{t̄}^T L(x, u) dt + E(x(T))   s.t.  x(t̄) = x̄, ...

Introduce a grid 0 = t_0 < ... < t_N = T and use the principle of optimality on the intervals [t_k, t_{k+1}]:

  J(x_k, t_k) = min_{x,u} ∫_{t_k}^{t_{k+1}} L(x, u) dt + J(x(t_{k+1}), t_{k+1})   s.t.  x(t_k) = x_k, ...

[figure: trajectory piece from x_k at t_k to x(t_{k+1}) at t_{k+1}, on the horizon up to T]

Dynamic Programming Recursion

Starting from J(x, t_N) = E(x), compute recursively backwards, for k = N-1, ..., 0:

  J(x_k, t_k) := min_{x,u} ∫_{t_k}^{t_{k+1}} L(x, u) dt + J(x(t_{k+1}), t_{k+1})   s.t.  x(t_k) = x_k, ...

by solution of short-horizon problems for all possible x_k, and tabulation in state space.

[figure, built up over three slides: the tabulated functions J(·, t_N), J(·, t_{N-1}), ..., J(·, t_0) over the state axes x_N, x_{N-1}, ..., x_0]
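
To make the recursion concrete, here is a minimal tabulation sketch in Python (not from the slides): it applies the backward recursion to the scalar test problem that appears later in this talk. The grids, the explicit Euler step, and the quadratic penalty standing in for the terminal constraint x(T) = 0 are all my own illustrative assumptions.

```python
import numpy as np

# Backward dynamic-programming tabulation for the scalar test problem
#   minimize  integral of x^2 + u^2   s.t.  xdot = (1+x)x + u,  |x|,|u| <= 1,
# discretized with an explicit Euler step (illustrative choices throughout).

T, N = 3.0, 30
dt = T / N
xs = np.linspace(-1.0, 1.0, 201)    # state grid for tabulation
us = np.linspace(-1.0, 1.0, 51)     # candidate controls
BIG = 1e6                           # finite stand-in for "infeasible"

J = 100.0 * xs**2                   # soft version of the terminal constraint x(T) = 0

for k in range(N - 1, -1, -1):      # backward recursion, k = N-1, ..., 0
    Jnew = np.full_like(J, BIG)
    for i, x in enumerate(xs):
        for u in us:
            xn = x + dt * ((1.0 + x) * x + u)     # explicit Euler step of the ODE
            if abs(xn) > 1.0:                     # path constraint |x| <= 1
                continue
            cand = dt * (x**2 + u**2) + np.interp(xn, xs, J)
            Jnew[i] = min(Jnew[i], cand)          # short-horizon minimization
    J = Jnew                                      # tabulated J(., t_k)

print("approximate cost-to-go J(0.05, 0):", np.interp(0.05, xs, J))
```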

Hamilton-Jacobi-Bellman (HJB) Equation

Dynamic programming with infinitely small timesteps leads to the Hamilton-Jacobi-Bellman (HJB) equation:

  -∂J/∂t (x, t) = min_u ( L(x, u) + ∂J/∂x (x, t) f(x, u) )   s.t.  h(x, u) ≤ 0.

Solve this partial differential equation (PDE) backwards for t ∈ [0, T], starting at the end of the horizon with J(x, T) = E(x).

NOTE: The optimal controls for state x at time t are obtained from

  u*(x, t) = arg min_u ( L(x, u) + ∂J/∂x (x, t) f(x, u) )   s.t.  h(x, u) ≤ 0.

Dynamic Programming / HJB. Dynamic programming applies to discrete time systems, HJB to continuous time systems.

Pros and Cons:
+ Searches the whole state space, finds the global optimum.
+ Optimal feedback controls are precomputed.
+ Analytic solution to some problems possible (linear systems with quadratic cost: the Riccati equation).
+ Viscosity solutions (Lions et al.) exist for quite general nonlinear problems.
- But in general intractable, because it is a partial differential equation (PDE) in a high dimensional state space: the "curse of dimensionality".

Possible remedy: approximate J, e.g. in the framework of neuro-dynamic programming [Bertsekas 1996]. Used for practical optimal control of small scale systems e.g. by Bonnans, Zidani, Lee, Back, ...
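
The linear-quadratic case mentioned above is the one where the HJB equation can be solved analytically: the ansatz J(x, t) = x^T P(t) x reduces it to the differential Riccati equation -Ṗ = A^T P + P A - P B R^{-1} B^T P + Q with P(T) = Q_T. A short sketch under assumed data (the double-integrator matrices below are my own example, not from the slides):

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (assumed example)
B = np.array([[0.0], [1.0]])
Q, R, QT = np.eye(2), np.array([[1.0]]), np.eye(2)
T = 5.0

def riccati_rhs(t, p):
    P = p.reshape(2, 2)
    Pdot = -(A.T @ P + P @ A - P @ B @ np.linalg.inv(R) @ B.T @ P + Q)
    return Pdot.ravel()

# integrate the Riccati ODE backwards from t = T to t = 0
sol = solve_ivp(riccati_rhs, [T, 0.0], QT.ravel())
P0 = sol.y[:, -1].reshape(2, 2)
K0 = np.linalg.inv(R) @ B.T @ P0         # optimal feedback u = -K(t) x at t = 0
print("P(0) =\n", P0, "\nK(0) =", K0)
```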

Indirect Methods

For simplicity, regard only the problem without inequality constraints:

[figure: states x(t) starting from the initial value x_0, controls u(t), terminal cost E(x(T)), on the horizon 0 ≤ t ≤ T]

minimize over x(·), u(·):   ∫_0^T L(x(t), u(t)) dt + E(x(T))

subject to
  x(0) - x_0 = 0,                          (fixed initial value)
  ẋ(t) - f(x(t), u(t)) = 0,  t ∈ [0, T].   (ODE model)

Pontryagin's Minimum Principle

OBSERVATION: In HJB, the optimal controls

  u*(t) = arg min_u ( L(x, u) + ∂J/∂x (x, t) f(x, u) )

depend only on the derivative ∂J/∂x (x, t), not on J itself!

IDEA: Introduce adjoint variables λ(t) ≙ ∂J/∂x (x(t), t)^T ∈ R^{n_x} and get the controls from Pontryagin's Minimum Principle:

  u*(t, x, λ) = arg min_u  L(x, u) + λ^T f(x, u),

where the minimized expression is the Hamiltonian H(x, u, λ) := L(x, u) + λ^T f(x, u).

QUESTION: How to obtain λ(t)?

Adjoint Differential Equation

Differentiate the HJB equation

  -∂J/∂t (x, t) = min_u H(x, u, ∂J/∂x (x, t)^T)

with respect to x and obtain the adjoint differential equation

  -λ̇(t)^T = ∂H/∂x (x(t), u*(t, x, λ), λ(t)).

Likewise, differentiate J(x, T) = E(x) and obtain the terminal condition

  λ(T)^T = ∂E/∂x (x(T)).

How to obtain an explicit expression for the controls?

In the simplest case,

  u*(t) = arg min_u H(x(t), u, λ(t))

is defined by

  ∂H/∂u (x(t), u*(t), λ(t)) = 0      (calculus of variations, Euler-Lagrange).

In the presence of path constraints, the expression for u*(t) changes whenever the set of active constraints changes. This leads to state dependent switches.

If the minimum of the Hamiltonian is locally not unique, singular arcs occur. Their treatment needs higher order derivatives of H.

Necessary Optimality Conditions

Summarize the optimality conditions as a boundary value problem:

  x(0) = x_0,                                           (initial value)
  ẋ(t) = f(x(t), u*(t)),                t ∈ [0, T],     (ODE model)
  -λ̇(t) = ∂H/∂x (x(t), u*(t), λ(t))^T,  t ∈ [0, T],     (adjoint equations)
  u*(t) = arg min_u H(x(t), u, λ(t)),   t ∈ [0, T],     (minimum principle)
  λ(T) = ∂E/∂x (x(T))^T.                                (adjoint final value)

Solve with so-called gradient methods, shooting methods, or collocation.
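
As a sketch of how such a boundary value problem can be solved numerically, the snippet below applies scipy's collocation-based solve_bvp to an unconstrained variant of the test problem from these slides (terminal constraint and bounds dropped, a simplification of mine). Here H = x² + u² + λ((1+x)x + u), so ∂H/∂u = 0 gives the explicit expression u* = -λ/2 and the adjoint equation λ̇ = -2x - λ(1 + 2x).

```python
import numpy as np
from scipy.integrate import solve_bvp

x0, T = 0.05, 3.0

def rhs(t, y):
    x, lam = y
    u = -lam / 2.0                        # explicit expression from H_u = 0
    return np.vstack([(1.0 + x) * x + u,          # state ODE
                      -2.0 * x - lam * (1.0 + 2.0 * x)])  # adjoint ODE

def bc(ya, yb):
    return np.array([ya[0] - x0, yb[1]])  # x(0) = x0,  lam(T) = 0 (E = 0)

t = np.linspace(0.0, T, 50)
y = np.zeros((2, t.size))                 # trivial initial guess for (x, lam)
sol = solve_bvp(rhs, bc, t, y)
print("converged:", sol.status == 0, " x(T) =", sol.y[0, -1])
```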

Indirect Methods: First optimize, then discretize.

Pros and Cons:
+ Boundary value problem with only 2 n_x ODEs.
+ Can treat large scale systems.
- Only necessary conditions for local optimality.
- Need an explicit expression for u*(t); singular arcs are difficult to treat.
- The ODE is strongly nonlinear and unstable.
- Inequalities lead to an ODE with state dependent switches.

Possible remedy: use an interior point method in function space for the inequalities, e.g. Weiser and Deuflhard, Bonnans and Laurent-Varin. Used for optimal control e.g. in satellite orbit planning at CNES ...

Direct Methods: First discretize, then optimize. Transcribe the infinite dimensional problem into a finite dimensional Nonlinear Programming Problem (NLP), and solve the NLP.

Pros and Cons:
+ Can use state-of-the-art methods for the NLP solution.
+ Can treat inequality constraints and multipoint constraints much more easily.
- Obtains only a suboptimal/approximate solution.

Nowadays the most commonly used methods, due to their easy applicability and robustness.

Direct Methods Overview. We treat three direct methods:
- Direct single shooting (sequential simulation and optimization)
- Direct collocation (simultaneous simulation and optimization)
- Direct multiple shooting (simultaneous/hybrid)

Direct Single Shooting [Hicks 1971, Sargent 1978]

Discretize the controls u(t) on a fixed grid 0 = t_0 < t_1 < ... < t_N = T, and regard the states x(t) on [0, T] as dependent variables.

[figure: states x(t; q) starting from x_0, driven by the piecewise constant discretized controls u(t; q) with values q_0, q_1, ..., q_{N-1}]

Use numerical integration to obtain the state as a function x(t; q) of the finitely many control parameters q = (q_0, q_1, ..., q_{N-1}).

NLP in Direct Single Shooting

After control discretization and numerical ODE solution, obtain the NLP:

minimize over q:   ∫_0^T L(x(t; q), u(t; q)) dt + E(x(T; q))

subject to
  h(x(t_i; q), u(t_i; q)) ≤ 0,  i = 0, ..., N,   (discretized path constraints)
  r(x(T; q)) ≤ 0.                                (terminal constraints)

Solve with a finite dimensional optimization solver, e.g. Sequential Quadratic Programming (SQP).

Solution by Standard SQP

Summarize the problem as:  min_q F(q)  s.t.  H(q) ≤ 0.

Solve e.g. by Sequential Quadratic Programming (SQP), starting with a guess q_0 for the controls, k := 0:

1. Evaluate F(q_k), H(q_k) by ODE solution, and their derivatives!
2. Compute a correction Δq_k by solution of the QP:
     min_{Δq} ∇F(q_k)^T Δq + ½ Δq^T A_k Δq   s.t.   H(q_k) + ∇H(q_k)^T Δq ≤ 0.
3. Perform the step q_{k+1} = q_k + α_k Δq_k with step length α_k determined by line search.
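
To illustrate the three steps, here is a deliberately tiny hand-rolled SQP loop on a scalar toy problem of my own choosing, where the single linearized constraint lets us solve the QP subproblem in closed form; real implementations use a QP solver, a Hessian approximation A_k, and a line search.

```python
# Hand-rolled SQP on the scalar toy problem (assumed example)
#   min_q F(q) = q^2   s.t.   H(q) = 1 - q <= 0,
# with exact Hessian A_k = 2 and full steps alpha_k = 1.

def sqp(q, iters=5):
    for k in range(iters):
        gF, A = 2.0 * q, 2.0          # gradient and Hessian of F
        H, gH = 1.0 - q, -1.0         # constraint value and gradient
        dq = -gF / A                  # unconstrained QP minimizer
        if H + gH * dq > 0.0:         # linearized constraint violated?
            dq = H / -gH              #  -> step exactly onto the constraint
        q = q + dq                    # full step, alpha_k = 1
        print(f"iter {k}: q = {q:.6f}")
    return q

sqp(q=3.0)                            # converges to the solution q* = 1
```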

ODE Sensitivities

How to compute the sensitivity ∂x(t; q)/∂q of a numerical ODE solution x(t; q) with respect to the controls q? Four ways:

1. External Numerical Differentiation (END)
2. Variational differential equations
3. Automatic differentiation
4. Internal Numerical Differentiation (IND)
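
A small sketch (my own example) comparing two of the four options on ẋ = (1 + x)x + u with a single constant control: external numerical differentiation re-integrates perturbed trajectories, while the variational differential equation Ġ = (∂f/∂x) G + ∂f/∂u is integrated along with the state.

```python
import numpy as np
from scipy.integrate import solve_ivp

x0, u, T = 0.05, 0.0, 1.0

def f(x, u): return (1.0 + x) * x + u

def augmented(t, y):
    x, G = y
    fx = 1.0 + 2.0 * x                  # df/dx
    fu = 1.0                            # df/du
    return [f(x, u), fx * G + fu]       # state ODE + variational ODE

sol = solve_ivp(augmented, [0.0, T], [x0, 0.0], rtol=1e-10, atol=1e-12)
G_var = sol.y[1, -1]                    # variational sensitivity dx(T)/du

eps = 1e-6                              # END: perturb u and re-integrate twice
xp = solve_ivp(lambda t, y: [f(y[0], u + eps)], [0, T], [x0],
               rtol=1e-10, atol=1e-12).y[0, -1]
xm = solve_ivp(lambda t, y: [f(y[0], u - eps)], [0, T], [x0],
               rtol=1e-10, atol=1e-12).y[0, -1]
print("variational:", G_var, " END:", (xp - xm) / (2 * eps))
```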

Numerical Test Problem

minimize over x(·), u(·):   ∫_0^3 x(t)² + u(t)² dt

subject to
  x(0) = x_0,                                    (initial value)
  ẋ = (1 + x) x + u,             t ∈ [0, 3],     (ODE model)
  -1 ≤ x(t) ≤ 1,  -1 ≤ u(t) ≤ 1, t ∈ [0, 3],     (bounds)
  x(3) = 0.                                      (zero terminal constraint)

Remark: uncontrollable growth for (1 + x_0) x_0 - 1 ≥ 0, i.e. for x_0 ≥ 0.618.
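
A quick numerical illustration of this remark (my own check, not from the slides): with the strongest admissible braking u = -1, the equilibrium of ẋ = (1 + x)x - 1 is at x* = (√5 - 1)/2 ≈ 0.618, and it separates recoverable from uncontrollable initial values.

```python
import numpy as np
from scipy.integrate import solve_ivp

# u = -1 is the hardest admissible braking; start just below and just above
# the unstable equilibrium x* = (sqrt(5) - 1)/2 and watch the trajectories.
xstar = (np.sqrt(5.0) - 1.0) / 2.0
for x0 in (xstar - 0.01, xstar + 0.01):
    sol = solve_ivp(lambda t, x: (1.0 + x) * x - 1.0, [0.0, 2.0], [x0])
    print(f"x0 = {x0:.3f} -> x(2) = {sol.y[0, -1]:.3f}")  # decays vs. grows
```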

Single Shooting Optimization for x_0 = 0.05

Choose N = 30 equal control intervals. Initialize with the steady state controls u(t) ≡ 0. The initial value x_0 = 0.05 is the maximum possible, because the initial trajectory explodes otherwise.
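
The following sketch reproduces this experiment with off-the-shelf tools, assuming piecewise constant controls, explicit Euler integration, and scipy's SLSQP as the SQP solver; these simplifications are mine, so iterates and iteration counts will differ from the slides' figures.

```python
import numpy as np
from scipy.optimize import minimize

T, N, x0 = 3.0, 30, 0.05
dt = T / N

def rollout(q):
    """Simulate x(t; q) and the Lagrange cost by explicit Euler."""
    x, cost, xs = x0, 0.0, [x0]
    for u in q:
        cost += dt * (x**2 + u**2)
        x = x + dt * ((1.0 + x) * x + u)
        xs.append(x)
    return cost, np.array(xs)

F = lambda q: rollout(q)[0]
cons = [
    {"type": "ineq", "fun": lambda q: 1.0 - np.abs(rollout(q)[1])},  # |x| <= 1
    {"type": "eq",   "fun": lambda q: rollout(q)[1][-1]},            # x(T) = 0
]
res = minimize(F, np.zeros(N), method="SLSQP",      # u(t) = 0 initialization
               bounds=[(-1.0, 1.0)] * N, constraints=cons)
print("optimal cost:", res.fun, " success:", res.success)
```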

Single Shooting: Initialization

Single Shooting: First Iteration

Single Shooting: 2nd Iteration

Single Shooting: 3rd Iteration

Single Shooting: 4th Iteration

Single Shooting: 5th Iteration

Single Shooting: 6th Iteration

Single Shooting: 7th Iteration and Solution

Direct Single Shooting: Pros and Cons

Sequential simulation and optimization.
+ Can use state-of-the-art ODE/DAE solvers.
+ Few degrees of freedom even for large ODE/DAE systems.
+ Active set changes are easily treated.
+ Need only an initial guess for the controls q.
- Cannot use knowledge of x in the initialization (e.g. in tracking problems).
- The ODE solution x(t; q) can depend very nonlinearly on q.
- Unstable systems are difficult to treat.

Often used in engineering applications, e.g. in the packages gopt (PSE), DYOS (Marquardt), ...

Direct Collocation (Sketch) [Tsang 1975]

Discretize the controls and the states on a fine grid, with node values s_i ≈ x(t_i). Replace the infinite ODE

  0 = ẋ(t) - f(x(t), u(t)),  t ∈ [0, T]

by finitely many equality constraints, e.g.

  c_i(q_i, s_i, s_{i+1}) = 0,  i = 0, ..., N-1,
  c_i(q_i, s_i, s_{i+1}) := (s_{i+1} - s_i)/(t_{i+1} - t_i) - f((s_i + s_{i+1})/2, q_i).

Approximate also the integrals, e.g.

  ∫_{t_i}^{t_{i+1}} L(x(t), u(t)) dt  ≈  l_i(q_i, s_i, s_{i+1}) := L((s_i + s_{i+1})/2, q_i) (t_{i+1} - t_i).

NLP in Direct Collocation

After discretization, obtain a large scale but sparse NLP:

minimize over s, q:   Σ_{i=0}^{N-1} l_i(q_i, s_i, s_{i+1}) + E(s_N)

subject to
  s_0 - x_0 = 0,                                   (fixed initial value)
  c_i(q_i, s_i, s_{i+1}) = 0,  i = 0, ..., N-1,    (discretized ODE model)
  h(s_i, q_i) ≤ 0,             i = 0, ..., N,      (discretized path constraints)
  r(s_N) ≤ 0.                                      (terminal constraints)

Solve e.g. with an SQP method for sparse problems.
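
A compact sketch of this NLP for the scalar test problem, using the midpoint discretization c_i and l_i from the previous slide; the solver choice (scipy's SLSQP, which does not exploit sparsity) and the grid are my own assumptions.

```python
import numpy as np
from scipy.optimize import minimize

T, N, x0 = 3.0, 30, 0.05
dt = T / N
f = lambda x, u: (1.0 + x) * x + u

def split(w):
    return w[:N + 1], w[N + 1:]               # node states s, controls q

def objective(w):                             # sum of l_i(q_i, s_i, s_{i+1})
    s, q = split(w)
    smid = 0.5 * (s[:-1] + s[1:])
    return np.sum(dt * (smid**2 + q**2))

def defects(w):                               # c_i(q_i, s_i, s_{i+1}) = 0
    s, q = split(w)
    return (s[1:] - s[:-1]) / dt - f(0.5 * (s[:-1] + s[1:]), q)

cons = [{"type": "eq", "fun": defects},
        {"type": "eq", "fun": lambda w: split(w)[0][0] - x0},   # s_0 = x_0
        {"type": "eq", "fun": lambda w: split(w)[0][-1]}]       # s_N = 0

w0 = np.zeros(2 * N + 1)                      # states could be initialized too
w0[0] = x0
res = minimize(objective, w0, method="SLSQP",
               bounds=[(-1.0, 1.0)] * (2 * N + 1), constraints=cons)
print("cost:", res.fun, " success:", res.success)
```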

What is a sparse NLP?

The general NLP

  min_w F(w)   s.t.   G(w) = 0,  H(w) ≤ 0

is called sparse if the Jacobians (derivative matrices)

  ∇_w G^T = ∂G/∂w = ( ∂G_i/∂w_j )_{ij}   and   ∇_w H^T

contain many zero elements. In SQP methods, this makes the QP much cheaper to build and to solve.

Direct Collocation: Pros and Cons

Simultaneous simulation and optimization.
+ Large scale, but very sparse NLP.
+ Can use knowledge of x in the initialization.
+ Can treat unstable systems well.
+ Robust handling of path and terminal constraints.
- Adaptivity needs a new grid, which changes the NLP dimensions.

Successfully used for practical optimal control e.g. by Biegler and Wächter (IPOPT), Betts, ...

Direct Multiple Shooting [Bock 1984]

Discretize the controls piecewise constant on a coarse grid:

  u(t) = q_i   for   t ∈ [t_i, t_{i+1}].

Solve the ODE on each interval [t_i, t_{i+1}] numerically, starting from the artificial initial value s_i:

  ẋ_i(t; s_i, q_i) = f(x_i(t; s_i, q_i), q_i),  t ∈ [t_i, t_{i+1}],
  x_i(t_i; s_i, q_i) = s_i.

Obtain the trajectory pieces x_i(t; s_i, q_i). Also numerically compute the integrals

  l_i(s_i, q_i) := ∫_{t_i}^{t_{i+1}} L(x_i(t; s_i, q_i), q_i) dt.

Sketch of Direct Multiple Shooting

[figure: trajectory pieces x_i(t; s_i, q_i) started at the node values s_0, s_1, ..., s_{N-1} with controls q_i on the grid t_0, t_1, ..., t_N; the continuity gaps between x_i(t_{i+1}; s_i, q_i) and s_{i+1} are closed by the NLP]

NLP in Direct Multiple Shooting

minimize over s, q:   Σ_{i=0}^{N-1} l_i(s_i, q_i) + E(s_N)

subject to
  s_0 - x_0 = 0,                                          (initial value)
  s_{i+1} - x_i(t_{i+1}; s_i, q_i) = 0,  i = 0, ..., N-1,  (continuity)
  h(s_i, q_i) ≤ 0,                       i = 0, ..., N,    (discretized path constraints)
  r(s_N) ≤ 0.                                             (terminal constraints)
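
A sketch of this NLP for the scalar test problem, under my own assumptions (N = 10 intervals, scipy's solve_ivp as the adaptive integrator, SLSQP as the NLP solver, no structure exploitation): each interval is simulated from its artificial initial value s_i, and continuity is imposed as an equality constraint.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

T, N, x0 = 3.0, 10, 0.05
dt = T / N

def shoot(si, qi):
    """Integrate one interval; return end state and interval cost l_i."""
    rhs = lambda t, y: [(1.0 + y[0]) * y[0] + qi,   # state ODE
                        y[0]**2 + qi**2]            # cost quadrature
    y = solve_ivp(rhs, [0.0, dt], [si, 0.0], rtol=1e-8).y[:, -1]
    return y[0], y[1]

def unpack(w): return w[:N + 1], w[N + 1:]          # node values s, controls q

def objective(w):
    s, q = unpack(w)
    return sum(shoot(s[i], q[i])[1] for i in range(N))

def continuity(w):                                  # s_{i+1} - x_i(t_{i+1}; s_i, q_i)
    s, q = unpack(w)
    return np.array([s[i + 1] - shoot(s[i], q[i])[0] for i in range(N)])

cons = [{"type": "eq", "fun": continuity},
        {"type": "eq", "fun": lambda w: unpack(w)[0][0] - x0},   # s_0 = x_0
        {"type": "eq", "fun": lambda w: unpack(w)[0][-1]}]       # s_N = 0
res = minimize(objective, np.zeros(2 * N + 1), method="SLSQP",
               bounds=[(-1.0, 1.0)] * (2 * N + 1), constraints=cons)
print("cost:", res.fun, " success:", res.success)
```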

Structured NLP

Summarize all variables as w := (s_0, q_0, s_1, q_1, ..., s_N). Obtain the structured NLP

  min_w F(w)   s.t.   G(w) = 0,  H(w) ≤ 0.

The Jacobian ∇G(w_k)^T contains the dynamic model equations. The Jacobians and the Hessian of the NLP are block sparse, which can be exploited in the numerical solution procedure.

Test Example: Initialization with u(t) ≡ 0

Multiple Shooting: First Iteration

Multiple Shooting: 2nd Iteration

Multiple Shooting: 3rd Iteration and Solution

Direct Multiple Shooting: Pros and Cons

Simultaneous simulation and optimization.
+ Uses adaptive ODE/DAE solvers,
+ but the NLP has fixed dimensions.
+ Can use knowledge of x in the initialization (here bounds; more important in an online context).
+ Can treat unstable systems well.
+ Robust handling of path and terminal constraints.
+ Easy to parallelize.
- Not as sparse as collocation.

Used for practical optimal control e.g. by Franke (ABB, "HQP"), Terwen (Daimler); Bock et al. ("MUSCOD-II"); in the ACADO Toolkit; ...

Conclusions: Optimal Control Family Tree

- Hamilton-Jacobi-Bellman equation: tabulation in state space.
- Indirect methods, Pontryagin: solve a boundary value problem.
- Direct methods: transform into a Nonlinear Program (NLP), in three variants:
  - Single shooting: only discretized controls in the NLP (sequential).
  - Collocation: discretized controls and states in the NLP (simultaneous).
  - Multiple shooting: controls and node start values in the NLP (simultaneous/hybrid).

Literature

- T. Binder, L. Blank, H. G. Bock, R. Bulirsch, W. Dahmen, M. Diehl, T. Kronseder, W. Marquardt, J. P. Schlöder, and O. v. Stryk: Introduction to Model Based Optimization of Chemical Processes on Moving Horizons. In Grötschel, Krumke, Rambau (eds.): Online Optimization of Large Scale Systems: State of the Art, Springer, 2001, pp. 295-340.
- John T. Betts: Practical Methods for Optimal Control Using Nonlinear Programming. SIAM, Philadelphia, 2001. ISBN 0-89871-488-5.
- Dimitri P. Bertsekas: Dynamic Programming and Optimal Control. Athena Scientific, Belmont, 2000 (Vol. I, ISBN 1-886529-09-4) & 2001 (Vol. II, ISBN 1-886529-27-2).
- A. E. Bryson and Y. C. Ho: Applied Optimal Control. Hemisphere/Wiley, 1975.