
Optimal Control, Lecture 18: Hamilton-Jacobi-Bellman Equation, Cont.
John T. Wen
March 29, 2004
Ref: Bryson & Ho, Chapter 4

Outline
- Hamilton-Jacobi-Bellman (HJB) Equation
- Iterative solution of the HJB Equation

Continuous Time Systems

Consider ẋ = f(x,u,t); x(t₀) = x₀, with cost

J(x(t₀), t₀) = φ(x(T), T) + ∫_{t₀}^T L(x,u,t) dt.

Cost-to-go J(x(t), t):

J(x(t), t) = φ(x(T), T) + ∫_t^T L(x,u,τ) dτ.

We mimic the discrete-time approach:
1. Assume we know the optimal cost-to-go J*(x + Δx, t + Δt).
2. Find the optimal u*(τ) for τ ∈ [t, t + Δt].
3. Let Δt → 0 (worked out below).
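
Carrying out step 3 explicitly (a worked version of the limit, not spelled out on the slide): by the principle of optimality,
\[
J^*(x,t) = \min_{u} \Big\{ L(x,u,t)\,\Delta t + J^*\big(x + f(x,u,t)\Delta t,\; t + \Delta t\big) \Big\} + o(\Delta t).
\]
Expanding J* to first order,
\[
J^*(x,t) = \min_{u} \Big\{ L\,\Delta t + J^* + \frac{\partial J^*}{\partial x} f\,\Delta t + \frac{\partial J^*}{\partial t}\,\Delta t \Big\} + o(\Delta t);
\]
cancelling J*, dividing by Δt, and letting Δt → 0 yields the HJB equation of the next slide.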

HJB Equation

Hamilton-Jacobi-Bellman (HJB) Equation:

−∂J*/∂t = min_{u(t)} { L(x,u,t) + (∂J*/∂x) f(x,u,t) }

with boundary condition J*(ξ, T) = φ(ξ, T) for all ξ. If x(T) is required to satisfy ψ(x(T), T) = 0, then the boundary condition becomes J*(ξ, T) = φ(ξ, T) for all ξ that satisfy ψ(ξ, T) = 0.

Alternatively, write

J*(x*(t), t) = φ(x*(T), T) + ∫_t^T L(x*, u*, τ) dτ.

Differentiating with respect to t, we obtain the HJB equation directly:

dJ*/dt = −L(x*, u*, t) = ∂J*/∂t + (∂J*/∂x) f(x*, u*, t).

Properties of the HJB Equation
- Partial differential equation in J*(x, t) (x and t are independent variables; x does not depend on t), with a specified boundary condition at (x(T), T).
- The solution is in feedback form (u* is expressed in terms of x).
- Almost never solvable exactly, but approximate solutions are sometimes possible.

Time Invariant and Infinite Horizon Case

Consider a time-invariant system and an infinite-horizon optimization:

ẋ = f(x,u); f(0,0) = 0, x(0) = x₀,
J = ∫₀^∞ L(x(τ), u(τ)) dτ; L(0,0) = 0, L(x,u) ≥ 0.

In this case, J* = J*(x) (no explicit dependence on t) and therefore ∂J*/∂t = 0. HJB then becomes:

min_u { L(x,u) + (∂J*/∂x) f(x,u) } = 0; J*(0) = 0, J*(x) positive definite.
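
As a quick sanity check (not on the slide), specializing to LQR data with the quadratic guess J*(x) = ½xᵀPx turns this stationary HJB into the algebraic Riccati equation:
\[
\min_u \Big\{ \tfrac12 x^\top Q x + \tfrac12 u^\top R u + x^\top P (Ax + Bu) \Big\} = 0
\;\Rightarrow\; u^* = -R^{-1} B^\top P x,
\]
\[
A^\top P + P A + Q - P B R^{-1} B^\top P = 0.
\]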

Scalar Examples

Plant: ẋ = x + u; x(0) = x₀.
Cost: J = ½x²(T) + ½∫₀ᵀ ru² dt.

Plant (affine nonlinear system): ẋ = f(x) + g(x)u; x(0) = x₀, x ∈ ℝ, u ∈ ℝ.
Cost: J(x₀) = ½∫₀^∞ (q(x) + u²) dt.
Example: ẋ = x³ + u.
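
For the first example, a worked reduction (not spelled out on the slide): with the quadratic guess J*(x,t) = ½P(t)x², the HJB equation gives a scalar Riccati ODE,
\[
u^* = -\frac{P(t)}{r}\,x, \qquad -\dot P = 2P - \frac{P^2}{r}, \qquad P(T) = 1,
\]
which can be integrated backward from t = T to recover the time-varying feedback gain.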

General Continuous Time LQR

Plant: ẋ = Ax + Bu; x(0) = x₀.
Cost: J(x₀, 0) = ½xᵀ(T)Q(T)x(T) + ½∫₀ᵀ (xᵀQx + uᵀRu) dt.
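
A minimal numerical sketch of this finite-horizon LQR via the matrix Riccati ODE implied by the HJB equation with J*(x,t) = ½xᵀP(t)x; the system matrices, weights, and horizon below are assumed illustrative values, not from the lecture:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # assumed example: double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
QT = np.eye(2)                            # terminal weight Q(T)
T = 5.0

def riccati_rhs(t, p):
    # dP/dt = -(A'P + PA + Q - P B R^{-1} B' P), integrated backward in time
    P = p.reshape(2, 2)
    dP = -(A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T) @ P)
    return dP.ravel()

# Integrate backward from t = T to t = 0 (solve_ivp accepts a decreasing span)
sol = solve_ivp(riccati_rhs, [T, 0.0], QT.ravel(), dense_output=True)
P0 = sol.y[:, -1].reshape(2, 2)
K0 = np.linalg.solve(R, B.T @ P0)         # feedback gain u = -K(t) x at t = 0
print(P0)
print(K0)
```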

Euler-Lagrange Equations

Let λᵀ = ∂J*/∂x. Then HJB becomes −∂J*/∂t = min_u (L + λᵀf). If u is constrained, then u* must minimize the Hamiltonian H(x,u,λ,t) = L + λᵀf, which is the same as Pontryagin's minimum principle. If u is unconstrained, we recover the Euler-Lagrange equations.
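
The link to the costate equation, filled in here (this step is implicit on the slide): differentiating λᵀ = ∂J*/∂x along the optimal trajectory,
\[
\dot\lambda^\top = \frac{\partial^2 J^*}{\partial t\,\partial x} + f^\top \frac{\partial^2 J^*}{\partial x^2},
\]
while differentiating the HJB equation with respect to x (using ∂H/∂u = 0 at the unconstrained minimum) gives
\[
-\frac{\partial^2 J^*}{\partial t\,\partial x} = \frac{\partial H}{\partial x} + f^\top \frac{\partial^2 J^*}{\partial x^2},
\qquad\text{so}\qquad
\dot\lambda = -\Big(\frac{\partial H}{\partial x}\Big)^{\!\top},\quad
\lambda^\top(T) = \frac{\partial \phi}{\partial x}\Big|_{x(T)}.
\]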

Iterative Performance Improvement

The HJB equation is difficult to solve in general. For example, for a quadratic control penalty and affine state dynamics,

L(x,u) = q(x) + ½uᵀRu; ẋ = f(x) + g(x)u,

the HJB solution is

u* = argmin_u { q(x) + ½uᵀRu + (∂J*/∂x)(f(x) + g(x)u) } = −R⁻¹g(x)ᵀ(∂J*/∂x)ᵀ.

HJB is a PDE in J*(x):

q(x) − ½(∂J*/∂x) g(x) R⁻¹ gᵀ(x) (∂J*/∂x)ᵀ + (∂J*/∂x) f(x) = 0; J*(0) = 0.
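
The minimizer comes from the first-order condition in u (a one-line step left implicit on the slide):
\[
\frac{\partial}{\partial u}\Big[\tfrac12 u^\top R u + \frac{\partial J^*}{\partial x} g(x)\, u\Big]
= R u + g(x)^\top \Big(\frac{\partial J^*}{\partial x}\Big)^{\!\top} = 0
\;\Rightarrow\; u^* = -R^{-1} g(x)^\top \Big(\frac{\partial J^*}{\partial x}\Big)^{\!\top}.
\]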

Relaxation Method

It may be easier to apply a relaxation method to solve the HJB iteratively:
1. Start with a (locally) stabilizing control u(x) and solve for V(x) from the generalized HJB (GHJB) equation (for the given u): L(x,u) + (∂V/∂x) f(x,u) = 0; V(0) = 0. Note that this PDE is linear in V.
2. Solve for û (with V fixed from step 1): û = argmin_v { (∂V/∂x) f(x,v) + L(x,v) }.
3. Solve for V̂ from the GHJB equation again. It has been shown (Saridis & Lee, 1979) that V̂(x) ≤ V(x).
4. Repeat until V converges.

For L(x,u) = q(x) + ½uᵀRu with q positive definite, the V in each iteration is a Lyapunov function for the stabilizing controller u. (A linear-systems instance of this iteration is sketched below.)
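
A minimal sketch of this relaxation for the special case of linear dynamics and quadratic cost, where the GHJB equation for V = ½xᵀPx collapses to a Lyapunov equation and the iteration is the classical Kleinman algorithm; the system matrices and initial stabilizing gain below are assumed for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [1.0, 0.0]])       # assumed open-loop unstable system
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[3.0, 3.0]])                   # assumed initial gain; A - BK Hurwitz
for _ in range(50):
    Acl = A - B @ K
    # Step 1 (GHJB): Acl' P + P Acl + Q + K' R K = 0, linear in P
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    # Step 2 (improvement): u = -K_new x minimizes (dV/dx) f + L for this V
    K_new = np.linalg.solve(R, B.T @ P)
    if np.linalg.norm(K_new - K) < 1e-10:    # Step 4: stop when V converges
        break
    K = K_new

print(P)                                      # limit matches the ARE solution
print(solve_continuous_are(A, B, Q, R))
```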

Approximate Solution of the HJB Equation

GHJB is easier to solve than HJB, but it is still difficult in general:

L(x,u) + (∂V/∂x) f(x,u) = 0; V(0) = 0.

We can apply a Galerkin approximation

V(x) = Σᵢ₌₁ᴺ cᵢφᵢ(x); φᵢ(0) = 0,

where the cᵢ's are determined from

⟨ L(x,u) + Σᵢ₌₁ᴺ cᵢ (dφᵢ(x)/dx) f(x,u), φⱼ ⟩ = 0, j = 1,...,N.

The residual error can be made small for N sufficiently large, and the approximate V can be made arbitrarily close to the true V. (A numerical sketch follows below.)
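
A minimal numerical sketch of the Galerkin step for the scalar example of the next slide (ẋ = x³ + u, L = ½(x² + u²)); the initial stabilizing control u(x) = −2x, the domain [−1, 1], and the even-power polynomial basis are all assumed illustrative choices:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: x**3 - 2.0 * x                 # closed-loop drift f(x, u(x))
L = lambda x: 0.5 * (x**2 + (2.0 * x)**2)    # running cost with u = -2x

N = 4                                         # basis phi_i(x) = x^(2i), i = 1..N
phi = [lambda x, k=k: x**(2 * k) for k in range(1, N + 1)]
dphi = [lambda x, k=k: 2 * k * x**(2 * k - 1) for k in range(1, N + 1)]

# Galerkin conditions <L + sum_i c_i phi_i' f, phi_j> = 0 give a linear system
M = np.array([[quad(lambda x: dphi[i](x) * f(x) * phi[j](x), -1.0, 1.0)[0]
               for i in range(N)] for j in range(N)])
b = -np.array([quad(lambda x: L(x) * phi[j](x), -1.0, 1.0)[0] for j in range(N)])
c = np.linalg.solve(M, b)

V = lambda x: sum(ci * p(x) for ci, p in zip(c, phi))
V_exact = lambda x: 1.25 * np.log(2.0 / (2.0 - x**2))  # closed form for this u
print(c)
print(V(0.8), V_exact(0.8))                   # Galerkin vs exact cost-to-go
```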

Example

Consider a simple scalar example: ẋ = x³ + u; J = ½∫₀^∞ (x² + u²) dt.

HJB: J*′(x)(x³ + u) + ½(x² + u²) = 0, or u* = −J*′(x). Substituting back:

−½(J*′)² + J*′x³ + ½x² = 0.

We can solve for J*′ explicitly (choose the solution so that J*(x) is positive definite). Alternatively, we can use GHJB to iterate on u and V:

V′(x)(x³ + u) + ½(x² + u²) = 0.
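
Solving the quadratic explicitly (worked out here for completeness; not on the original slide):
\[
(J^{*\prime})^2 - 2x^3 J^{*\prime} - x^2 = 0
\;\Rightarrow\;
J^{*\prime}(x) = x^3 + x\sqrt{x^4 + 1},
\]
taking the root with the same sign as x so that J* is positive definite. The resulting feedback
\[
u^*(x) = -x^3 - x\sqrt{x^4 + 1}
\]
gives the stable closed loop ẋ = −x√(x⁴ + 1).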

Control Lyapunov Function

From the HJB equation, (∂V/∂x) f(x,u*) = −L(x,u*) < 0. Suppose we can find a positive definite function V such that

min_u { (∂V/∂x) f(x,u) } < 0.

Then V is called a control Lyapunov function (clf). The feedback control u generated this way is called the inverse optimal control (since there is an optimal control problem to which u corresponds). It is easy to find a clf when the system is feedback linearizable, but tough in general.
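
One standard way to generate such a feedback from a clf (not on the slide; this is Sontag's universal formula for control-affine systems ẋ = f(x) + g(x)u): writing a(x) = (∂V/∂x)f(x) and b(x) = (∂V/∂x)g(x),
\[
u(x) =
\begin{cases}
-\dfrac{a(x) + \sqrt{a(x)^2 + \|b(x)\|^4}}{\|b(x)\|^2}\, b(x)^\top, & b(x) \neq 0,\\[2ex]
0, & b(x) = 0,
\end{cases}
\]
which gives V̇ = −√(a² + ‖b‖⁴) < 0 wherever b(x) ≠ 0 and is one instance of an inverse optimal design.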