Introduction to Optimal Control Theory and Hamilton-Jacobi equations. Seung Yeal Ha Department of Mathematical Sciences Seoul National University

A priori message from SYHA The main purpose of this series of lectures is to help you understand the ABCs of optimal control theory (OCT) and to prepare you for an advanced course on stochastic optimal control theory that you might take in the future.

Lecture Schedule Lecture 1: ABC of Optimal Control Theory. Lecture 2: PMP vs. Bellman's dynamic programming. Lecture 3: Hamilton-Jacobi equations (classical theory). Lecture 4: Hamilton-Jacobi equations (modern theory).

References An Introduction to Mathematical Optimal Control Theory by Lawrence C. (Craig) Evans, available at www.math.berkeley.edu/~evans. Optimal Control Theory by Donald E. Kirk. Introduction to the Mathematical Theory of Control by Alberto Bressan and Benedetto Piccoli.

What do we mean by control of a system? Control of a system has a double meaning: (Weak sense) checking or testing whether the system's behavior is satisfactory. (Strong sense) putting things in order so as to guarantee that the system behaves as desired.

Maxims "Since the building of the universe is perfect and is created by the wisest Creator, nothing arises in the universe in which one cannot see the sense of some maximum or minimum." - Leonhard Euler. "The words control theory are, of course, of recent origin, but the subject itself is much older, since it contains the classical calculus of variations as a special case, and the first calculus of variations problems go back to classical Greece." - Hector J. Sussmann.

Lecture 1: ABC of Optimal Control Theory

Goal of OCT The objective of OCT is to determine the control signals that will cause a process to satisfy the physical constraints and, at the same time, minimize (or maximize) some performance criterion: minimize cost, maximize payoff or utility. "A problem well put is a problem half solved."

Key words of Lecture 1: controlled system, state, control, performance measure, controllability, reachable set, linear time-invariant system, bang-bang control (principle), Kalman's rank theorem, observability.

Three giants of modern control theory Richard Bellman (August 26, 1920 - March 19, 1984): Bellman's dynamic programming (1953)

Lev Pontryagin (3 September 1908 - 3 May 1988): Pontryagin's Maximum Principle (PMP) (1956)

Rudolf E. Kalman (born 19 May 1930): Kalman filter, Kalman's rank theorem (1960)

Controlled dynamical systems (Dynamical system) ẋ = f(x, t), x = (x_1, ..., x_n), x = x(t) ∈ R^n. Here x is the state (variable). (Controlled dynamical system) ẋ = f(x, u, t), u = (u_1, ..., u_m), u = u(t) ∈ R^m. Here u is the control (parameter).
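
As a minimal illustration of how a control signal enters the dynamics (a sketch, not part of the original slides; the dynamics f, the control u, and the step size below are placeholder assumptions), one can integrate a controlled ODE numerically:

```python
import numpy as np

def euler_controlled(f, u, x0, t0, tf, dt=1e-3):
    """Forward-Euler integration of the controlled system x' = f(x, u(t), t)."""
    ts = np.arange(t0, tf, dt)
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for t in ts[:-1]:
        x = x + dt * f(x, u(t), t)
        traj.append(x.copy())
    return ts, np.array(traj)

# Hypothetical example: scalar dynamics x' = x + u, steered by the constant control u = -2.
f = lambda x, u, t: x + u
u = lambda t: -2.0
ts, xs = euler_controlled(f, u, x0=[1.0], t0=0.0, tf=2.0)
print(xs[-1])   # state reached at t = 2 under this particular control
```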

Example 1: Merton's optimal investment and consumption. Step A: (Modeling of the problem) Consider an individual whose wealth today is W_0 and who will live exactly T years. His task is to plan the rate of consumption of wealth C(s) for 0 < s < T. All wealth not yet consumed earns interest at a fixed rate r. For simplicity, we assign no utility to final-time wealth (a bequest). Let W(t) be the amount of wealth at time t. Then Ẇ = rW − C, 0 < t ≤ T, W(0) = W_0.

Step B: (Identification of physical constraints) W(t) ≥ 0, W(T) = 0, C(t) ≥ 0. Step C: (Performance measure) P[C] = ∫_0^T e^{−ρs} U(C(s)) ds, max P[C], where ρ is a discounting rate and U is the utility function of consumption. Reformulation as a calculus of variations problem: max over W(·) of ∫_0^T e^{−ρs} U(rW − Ẇ) ds, subject to W(0) = W_0.
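
As a numerical illustration (not the optimal policy), one can simulate the wealth equation under the constant consumption rate that exactly exhausts wealth at time T; the square-root utility and all parameter values below are assumptions made only for this sketch.

```python
import numpy as np

# Illustrative (not optimal) consumption plan for the Merton example:
# consume at the constant rate C that exactly exhausts wealth at time T.
# The utility U(c) = sqrt(c) and all parameter values are assumptions.
r, rho, T, W0 = 0.05, 0.03, 10.0, 100.0
C = r * W0 / (1.0 - np.exp(-r * T))        # constant rate with W(T) = 0

dt = 1e-3
W, payoff = W0, 0.0
for t in np.arange(0.0, T, dt):
    payoff += np.exp(-rho * t) * np.sqrt(C) * dt   # discounted utility of consumption
    W += (r * W - C) * dt                          # wealth dynamics  W' = rW - C
print(f"terminal wealth ~ {W:.3f}, payoff P[C] ~ {payoff:.3f}")
```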

Example 2 (automobile problem): a minimum-time optimal control problem. Step A: (Modeling of the problem) The car is to be driven in a straight line away from the point 0 to the point e. The distance of the car from 0 at time t is given by d(t). For simplicity, we model the car as a unit point mass that can be accelerated by using the throttle or decelerated by using the brake. We set d̈(t) = α(t) + β(t), where α and β stand for throttle acceleration and braking deceleration, respectively.

Again we set x_1 := d, x_2 := ḋ, u_1 := α, u_2 := β. Then our controlled dynamical system is given by ẋ(t) = (0 1; 0 0) x(t) + (0 0; 1 1) u(t), with the two-point boundary conditions x(t_0) = (0, 0)^T, x(t_f) = (e, 0)^T.

Step B: (Identification of physical constraints) State constraints: 0 ≤ x_1 ≤ e, 0 ≤ x_2. Control constraints: 0 ≤ u_1 ≤ M_1, −M_2 ≤ u_2 ≤ 0. Assume that the amount of gasoline at time t = t_0 is G gallons and that the rate of gas consumption is proportional to both acceleration and speed; thus the amount of gasoline used satisfies ∫_{t_0}^{t_f} (k_1 u_1(t) + k_2 x_2(t)) dt ≤ G.

Step C: (Performance measure) Minimum-time control: minimize J := t_f − t_0. In general, the performance measure takes the form J = g(x(t_f), t_f) + ∫_{t_0}^{t_f} r(x(t), u(t), t) dt; g ≡ 0: Lagrange problem, r ≡ 0: Mayer problem. Admissible controls: controls satisfying the physical constraints; the corresponding trajectory is an admissible trajectory.
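
For concreteness, the following sketch simulates the car model under a candidate (not necessarily optimal) accelerate-then-brake control and evaluates both the elapsed time J and the gasoline constraint; the bounds M_1 = M_2 = 1, the target e, and the constants k_1, k_2, G are assumptions chosen for illustration.

```python
import numpy as np

# Candidate (not necessarily optimal) control for the car example, under the
# assumed bounds M1 = M2 = 1 and target distance e = 100: full throttle for
# tau = sqrt(e) seconds, then full braking for another tau seconds.
e, k1, k2, G = 100.0, 0.1, 0.01, 50.0      # illustrative parameter values
tau = np.sqrt(e)

dt = 1e-3
x = np.array([0.0, 0.0])                   # x1 = position, x2 = velocity
J, fuel = 0.0, 0.0
for t in np.arange(0.0, 2 * tau, dt):
    u1, u2 = (1.0, 0.0) if t < tau else (0.0, -1.0)
    x += dt * np.array([x[1], u1 + u2])    # x1' = x2,  x2' = u1 + u2
    J += dt                                # running cost of the minimum-time problem
    fuel += dt * (k1 * u1 + k2 * x[1])     # gas consumption rate k1*u1 + k2*x2
print(f"final state (d, v) = ({x[0]:.2f}, {x[1]:.2f}), time J = {J:.2f}, fuel = {fuel:.2f} (budget {G})")
```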

Optimal control problem The optimal control problem is to find an admissible control u which causes the system ẋ = f(x, u, t) to follow an admissible trajectory x that maximizes the performance measure (payoff) P = g(x(t_f), t_f) + ∫_{t_0}^{t_f} r(x(t), u(t), t) dt, where the first term is the terminal payoff and the integral is the running payoff. In this lecture, U_ad denotes the set of all admissible controls: U_ad := {u : R_+ → U : u = u(·) is measurable and satisfies the constraints},

and U = [−1, 1]^m is symmetric and convex. Basic problem: find a control u* = u*(t) ∈ U_ad which maximizes the payoff, i.e., P[u*] ≥ P[u] for all u ∈ U_ad.

Main questions 1. Does an optimal control exist? (existence of optimal control) 2. How can we characterize an optimal control mathematically? (characterization of optimal control) 3. How can we construct an optimal control? (realization of optimal control)

Two examples 1. Control of production and consumption. x(t) := amount of output produced at time t ≥ 0. Assumptions: we consume some fraction of the output at each time and reinvest the remaining fraction.

Let u = u(t) be the fraction of output reinvested at time t ≥ 0, 0 ≤ u ≤ 1. In this case ẋ = ux, 0 < t ≤ T, x(0) = x_0, with the constraints x ≥ 0, 0 ≤ u ≤ 1, U = [0, 1], and the payoff P[u] = ∫_0^T (1 − u(t)) x(t) dt.
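
A short sketch can evaluate P[u] for simple "reinvest until time s, then consume" policies and scan over the switch time s; the values x_0 = 1 and T = 2 are assumptions, and the search is only over this particular family of controls.

```python
import numpy as np

# Evaluating the payoff P[u] of the production/consumption problem for simple
# "reinvest until time s, then consume everything" policies (x0 = 1, T = 2 assumed).
x0, T = 1.0, 2.0

def payoff(s, dt=1e-3):
    x, P = x0, 0.0
    for t in np.arange(0.0, T, dt):
        u = 1.0 if t < s else 0.0
        P += (1.0 - u) * x * dt      # running payoff (1 - u) x
        x += u * x * dt              # dynamics x' = u x
    return P

switch_times = np.linspace(0.0, T, 41)
vals = [payoff(s) for s in switch_times]
best = switch_times[int(np.argmax(vals))]
print(f"best switch time on this grid: s = {best:.2f}, payoff = {max(vals):.3f}")
# For comparison, the closed form for these policies is x0 * exp(s) * (T - s),
# which is maximized at s = T - 1 (here s = 1).
```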

2. A pendulum problem: θ̈ + λθ̇ + ω² sin θ = u, 0 < t ≤ T, θ(0) = θ_0, θ̇(0) = ω_0. We use the approximation sin θ ≈ θ for |θ| ≪ 1 to get the linear approximate equation θ̈ + λθ̇ + ω²θ = u, 0 < t ≤ T, θ(0) = θ_0, θ̇(0) = ω_0. The main question is to determine the control u so that (θ, θ̇) approaches (0, 0) as soon as possible (a minimum-time control problem).

We set x_1 = θ, x_2 = θ̇. Then we have d/dt (x_1, x_2)^T = (0 1; −ω² −λ) (x_1, x_2)^T + (0, u)^T. We now set τ = τ[u(·)]: the first time such that (x_1, x_2) = (0, 0). We also define P[u] = −∫_0^τ 1 dt = −τ.
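
A rough numerical illustration of the minimum-time question (a sketch only: the parameters, initial data, and the saturated velocity feedback below are assumptions, not the time-optimal control) is to simulate the linearized system and record when it first enters a small ball around the origin.

```python
import numpy as np

# Linearized pendulum under a hypothetical saturated velocity feedback
# u = -clip(x2, -1, 1); lambda, omega, the initial data and the feedback
# law are illustrative assumptions, not the time-optimal control.
lam, omega = 0.1, 1.0
x = np.array([1.0, 0.0])          # x1 = theta, x2 = theta'
dt, t, t_hit = 1e-3, 0.0, None
while t < 50.0:
    u = -np.clip(x[1], -1.0, 1.0)
    x += dt * np.array([x[1], -omega**2 * x[0] - lam * x[1] + u])
    t += dt
    if t_hit is None and np.linalg.norm(x) < 1e-2:
        t_hit = t
print(f"first time within 0.01 of the origin: {t_hit}")
```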

Controllability In order to be able to do whatever we want with the given dynamical system under a control input, the system must be controllable.

Controllability Consider a controlled dynamics ẋ = f(x, u), t_0 < t < ∞, x(t_0) = x_0, and denote the solution of this controlled system by x(t; t_0, x_0). A natural a priori question before optimal control: for a fixed initial state x_0 and desired state x_f ∈ R^n, can we find a control u ∈ U_ad and t_f < ∞ such that x(t_f; t_0, x_0) = x_f?

Controllability question Given the initial point x_0 ∈ R^n and a set S ⊂ R^n, does there exist a control u steering the system to the set S in finite time? For the case S = {x_f}, the controllability question asks: do there exist T < ∞ and u ∈ U_ad such that ẋ = f(x, u), 0 < t < ∞, x(0) = x_0, and x(T) = x_f?

Controllability for a linear system From now on, consider a linear control system and S = {0}, i.e., ẋ = Mx + Nu, M ∈ M_{n×n}, N ∈ M_{n×m}, x(0) = x_0. (1) Definition: 1. The linear control system (1) is completely controllable ⟺ for any x_0, x_f ∈ R^n, there exists a control u : [0, t_f] → R^m such that x(t_f) = x_f.

2. Reachable set: C(t) := {x_0 : ∃ u ∈ U_ad such that x(t; u, x_0) = 0}, C := ∪_{t ≥ 0} C(t): reachable set. Simple observation: 0 ∈ C(t), and x_0 ∈ C(t), t̂ > t ⟹ x_0 ∈ C(t̂).

Looking for a necessary and sufficient condition for controllability.

Snapshot of ODE theory Consider a homogeneous linear ODE: ẋ = Mx, t > 0, with M ∈ M_{n×n} a constant matrix and x(0) = x_0. Then we have x(t) = Φ(t)x_0, where Φ is the fundamental matrix defined by Φ(t) = e^{tM} := Σ_{k=0}^∞ t^k M^k / k!.

Consider an inhomogeneous linear system: ẋ = Mx + f(t), t > 0, x(0) = x_0. Then the solution x is given by the variation of parameters formula (Duhamel's formula) x(t) = Φ(t)x_0 + ∫_0^t Φ(t − s) f(s) ds, where Φ(t − s) = Φ(t)Φ^{-1}(s).
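
Duhamel's formula is easy to verify numerically; the sketch below compares it with a direct ODE solve for an arbitrarily chosen M, f, and x_0 (all assumptions made only for this check).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Checking Duhamel's formula x(t) = Phi(t) x0 + int_0^t Phi(t-s) f(s) ds
# on an arbitrarily chosen example (M, f and x0 are assumptions for the check).
M = np.array([[0.0, 1.0], [-2.0, -0.5]])
f = lambda s: np.array([0.0, np.sin(s)])
x0, t = np.array([1.0, 0.0]), 3.0

# Direct numerical integration of x' = Mx + f(t).
sol = solve_ivp(lambda s, x: M @ x + f(s), (0.0, t), x0, rtol=1e-9, atol=1e-12)

# Variation-of-parameters formula, with the integral approximated by a Riemann sum.
ss = np.linspace(0.0, t, 3001)
ds = ss[1] - ss[0]
duhamel = expm(t * M) @ x0 + sum(expm((t - s) * M) @ f(s) * ds for s in ss[:-1])
print(sol.y[:, -1], duhamel)   # the two results should agree to a few decimals
```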

We now return to ẋ = Mx + Nu, t > 0, x(0) = x_0. Then by Duhamel's formula, we have x(t) = Φ(t)x_0 + Φ(t) ∫_0^t Φ^{-1}(s) N u(s) ds. Note that x_0 ∈ C(t) ⟺ x(t) = 0 ⟺ x_0 = −∫_0^t Φ^{-1}(s) N u(s) ds for some u ∈ U_ad.

Theorem (Geometry of the reachable set) The reachable set C is symmetric and convex, i.e., (i) x_0 ∈ C ⟹ −x_0 ∈ C; (ii) x_0, x̂_0 ∈ C, λ ∈ [0, 1] ⟹ λx_0 + (1 − λ)x̂_0 ∈ C.
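
For a fixed time t, symmetry and convexity of C(t) follow from the linearity of the map u ↦ x_0(u) = −∫_0^t Φ^{-1}(s) N u(s) ds together with the symmetry and convexity of U. A Monte Carlo sketch of this argument (the matrices M, N, the horizon t, and the piecewise-constant controls below are assumptions):

```python
import numpy as np
from scipy.linalg import expm

# Every admissible control u produces a point x0(u) = -int_0^t Phi(s)^{-1} N u(s) ds
# of C(t), and the map u -> x0(u) is linear; M, N, t and the controls are assumptions.
M = np.array([[0.0, 1.0], [0.0, 0.0]])
N = np.array([[0.0], [1.0]])
t, steps = 1.0, 100
ds = t / steps
grid = np.arange(steps) * ds

def point_of_C(u_vals):
    """Point of C(t) generated by the piecewise-constant control with values u_vals."""
    return -sum(expm(-s * M) @ N @ np.array([u]) * ds for s, u in zip(grid, u_vals))

rng = np.random.default_rng(0)
u1 = rng.uniform(-1.0, 1.0, steps)
u2 = rng.uniform(-1.0, 1.0, steps)
x1, x2 = point_of_C(u1), point_of_C(u2)
print(np.allclose(point_of_C(-u1), -x1))                          # symmetry
print(np.allclose(point_of_C(0.5 * (u1 + u2)), 0.5 * (x1 + x2)))  # convexity via linearity
```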

Kalman's rank theorem (1960) Consider ẋ = Mx + Nu, M ∈ M_{n×n}, N ∈ M_{n×m}. (2) We define the controllability matrix G(M, N) := [N | MN | ⋯ | M^{n−1}N]. Definition: the linear control system (2) is controllable ⟺ C = R^n.

Theorem rankg(m, N) = n 0 C 0. Proof. ( =) Suppose that 0 C 0. Note that rankg(m, N) n. If rankg(m, N) < n, then there exists b 0 such that b t G(M, N) = 0.

This yields b^t N = b^t MN = ⋯ = b^t M^{n−1}N = 0. By the Cayley-Hamilton theorem, we also have b^t M^k N = 0 for all k ≥ 0, and hence b^t Φ^{-1}(t) N = 0 (since Φ^{-1}(t) = e^{−tM} is a power series in M).

We now claim: b is perpendicular to C(t), so that C° = ∅, a contradiction. Indeed, if x_0 ∈ C(t), then x_0 = −∫_0^t Φ^{-1}(s) N u(s) ds for some u ∈ U_ad, and therefore b^t x_0 = −∫_0^t b^t Φ^{-1}(s) N u(s) ds = 0.

(⟹) Suppose that 0 ∉ C°, i.e., 0 ∈ (C°)^c ⊂ ∩_{t ≥ 0} (C°(t))^c. Then 0 ∉ C°(t) for all t ≥ 0, and therefore 0 ∈ ∂C(t). Since C(t) is convex, there exists b ≠ 0 such that b^t x_0 ≤ 0 for all x_0 ∈ C(t). For x_0 ∈ C(t) we have x_0 = −∫_0^t Φ^{-1}(s) N u(s) ds for some u ∈ U_ad, and thus b^t x_0 = −∫_0^t b^t Φ^{-1}(s) N u(s) ds ≤ 0, i.e., ∫_0^t b^t Φ^{-1}(s) N u(s) ds ≥ 0 for every admissible u.

This yields b t Φ 1 (s)n = 0. By differentiating the above relation, we have b t N = b t MN = = b t M n 1 N = O, i.e., b t G(M, N) = 0. Hence rankg(m, N) < n.

Theorem Suppose rank G(M, N) = n and Re(λ) ≤ 0 for every eigenvalue λ of M. Then the system (2) is controllable.

Theorem (Kalman, 1960) The system (2) is controllable ⟺ rank G(M, N) = n. This rank indicates how many components of the system are sensitive to the action of the control.

Examples (n = 2, m = 1, U = [−1, 1]): 1. ẋ_1 = 0, ẋ_2 = u(t), t > 0. 2. ẋ_1 = x_2, ẋ_2 = u(t), t > 0. 3. ẋ_1 = u(t), ẋ_2 = u(t), t > 0.
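
These examples can be checked against the rank condition mechanically; a small sketch (the helper function name is mine, not from the slides):

```python
import numpy as np

def ctrb_rank(M, N):
    """Rank of the controllability matrix G(M, N) = [N  MN ... M^(n-1) N]."""
    n = M.shape[0]
    blocks = [N]
    for _ in range(n - 1):
        blocks.append(M @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks))

examples = {
    "1: x1'=0,  x2'=u": (np.zeros((2, 2)),                   np.array([[0.0], [1.0]])),
    "2: x1'=x2, x2'=u": (np.array([[0.0, 1.0], [0.0, 0.0]]), np.array([[0.0], [1.0]])),
    "3: x1'=u,  x2'=u": (np.zeros((2, 2)),                   np.array([[1.0], [1.0]])),
}
for name, (M, N) in examples.items():
    print(name, "rank G =", ctrb_rank(M, N))   # rank 2 means the rank condition holds
```

In examples 1 and 3 the rank is 1, so the rank condition fails, while example 2 satisfies it.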

Observability In order to see what is going on inside the system under observation, the system must be observable.

Observability Consider the uncontrolled system ẋ = Mx, t > 0, M ∈ M_{n×n}, x ∈ R^n, x(0) = x_0 unknown, (3) so that x(t) = e^{tM} x_0. Once we know x_0, we know everything! Suppose that we have observed data y: y(t) = Nx(t), N ∈ M_{m×n}, m < n. (4)

Observability question Given the observation y = y(·), which is low-dimensional, can we in principle reconstruct the high-dimensional true dynamics x = x(·)? Thus the problem is how to recover x_0 ∈ R^n from the observed data y. Note that y(0) = Nx(0) = Nx_0, ẏ(0) = Nẋ(0) = NMx_0, ..., y^{(n−1)}(0) = Nx^{(n−1)}(0) = NM^{n−1}x_0.

This yields (y(0), ẏ(0), ..., y^{(n−1)}(0))^T = [N; NM; ...; NM^{n−1}] x_0. Thus, the system (3)-(4) is observable ⟺ rank [N; NM; ...; NM^{n−1}] = n.

Definition The system ẋ = Mx, t > 0, x(0) = x_0, with observation y = Nx, is observable ⟺ for any two solutions x_1 and x_2 such that Nx_1 = Nx_2 on [0, t], we have x_1(0) = x_2(0). Example: N = I (observable), N = 0 (not observable). Theorem (observability is a dual concept of controllability) The system ẋ = Mx, y = Nx, t > 0, is observable ⟺ ż = M^t z + N^t u is controllable.
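
A corresponding sketch computes the rank of the observability matrix and illustrates the duality statement by checking the controllability matrix of (M^t, N^t); the particular M and N below are assumptions.

```python
import numpy as np

def obsv_rank(M, N):
    """Rank of the observability matrix [N; NM; ...; N M^(n-1)]."""
    n = M.shape[0]
    blocks, row = [], N
    for _ in range(n):
        blocks.append(row)
        row = row @ M
    return np.linalg.matrix_rank(np.vstack(blocks))

def ctrb_rank(M, N):
    """Rank of the controllability matrix [N  MN ... M^(n-1) N]."""
    n = M.shape[0]
    blocks = [N]
    for _ in range(n - 1):
        blocks.append(M @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks))

# Arbitrary example: observe only the first coordinate of a 2-d system.
M = np.array([[0.0, 1.0], [-1.0, -0.3]])
N = np.array([[1.0, 0.0]])
print("observability rank:", obsv_rank(M, N))
# Duality: observability of (M, N) corresponds to controllability of (M^t, N^t).
print("controllability rank of the dual system:", ctrb_rank(M.T, N.T))
```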

LaSalle's bang-bang control U_ad := {u : [0, ∞) → U = [−1, 1]^m : u is measurable}. Definition Let u ∈ U_ad. u = (u_1, ..., u_m) is a bang-bang control ⟺ |u_i(t)| = 1 for all t > 0 and i = 1, ..., m. Theorem (bang-bang principle) Consider ẋ = Mx + Nu, x(0) = x_0. If x_0 ∈ C(t), then there exists a bang-bang control u* = u*(·) such that x_0 = −∫_0^t Φ^{-1}(s) N u*(s) ds.

Preliminaries for the bang-bang principle L^∞ = L^∞(0, t; R^m) = {u : [0, t] → R^m : ess sup_{0 ≤ s ≤ t} |u(s)| < ∞}, with ‖u‖_{L^∞} := ess sup_{0 ≤ s ≤ t} |u(s)|; L^∞ = (L^1)^*. Definition Let u_n, u ∈ L^∞. u_n → u in the weak-star topology ⟺ ∫_0^t u_n(s)·ϕ(s) ds → ∫_0^t u(s)·ϕ(s) ds for every L^1 test function ϕ, i.e., every ϕ with ∫_0^t |ϕ(s)| ds < ∞.

Theorem (Banach-Alaoglu) Any bounded set in L^∞ is weak-star precompact. Corollary If u_n ∈ U_ad := {u : [0, t] → [−1, 1]^m : u is measurable}, then there exists a subsequence {u_{n_k}} of {u_n} such that u_{n_k} → u in the weak-star topology.

Definition Let K ⊂ L^∞ and z ∈ K. 1. K is convex ⟺ for all x, y ∈ K and 0 ≤ λ ≤ 1, λx + (1 − λ)y ∈ K. 2. z ∈ K is an extreme point ⟺ there do not exist x, x̂ ∈ K with x ≠ x̂ and λ ∈ (0, 1) such that z = λx + (1 − λ)x̂. Theorem (Krein-Milman) If K is convex and weak-star compact, then K has at least one extreme point.

Proof of the bang-bang principle Let x_0 ∈ C(t). We set K := {u ∈ U_ad : u steers x_0 to 0 at time t}. Lemma K is convex and weak-star compact.