EN Applied Optimal Control Lecture 8: Dynamic Programming October 10, 2018


Lecturer: Marin Kobilarov

Dynamic Programming (DP) is concerned with the computation of an optimal policy, i.e. an optimal control signal expressed as a function of the state and time, $u = u(x, t)$. This means that we seek solutions from any state $x(t)$ to a desired goal, defined either as reaching a single state $x_f$ at time $t_f$, or reaching a surface defined by $\psi(x(t_f), t_f) = 0$. This global viewpoint is in contrast to the calculus of variations approach, where we typically look at local variations of trajectories in the vicinity of a given initial path $x(\cdot)$, often starting from a given state $x(0) = x_0$.

The DP principle is based on the idea that once an optimal path $x^*(t)$ for $t \in [t_0, t_f]$ is computed, then any segment of this path is optimal, i.e. $x^*(t)$ for $t \in [t_1, t_f]$ is the optimal path from $x^*(t_1)$ to $x^*(t_f)$.

[Figure: an optimal trajectory from $x_0$ to $x_f$ passing through $x_1 = x^*(t_1)$, with an alternative point $x_2$ off the optimal path.]

For instance, consider an optimal path $x^*(t)$ between $x_0$ and $x_f$, i.e. a path with cost $J^*(x_0, x_f)$ that is less than the cost of any other trajectory between these two points. According to the DP principle, for any given time $t_1 \in (t_0, t_f)$ the trajectory $x^*(t)$ from $x_1 = x^*(t_1)$ to $x^*(t_f)$ is also optimal. The principle can be easily proved by contradiction. If it were not optimal, then the optimal path from $x_1$ to $x_f$ must pass through a state $x_2$ which does not lie on $x^*(t)$, i.e.

$$J^*(x_1, x_2, x_f) < J^*(x_1, x_f).$$

But then we have

$$J^*(x_0, x_1) + J^*(x_1, x_2, x_f) < J^*(x_0, x_1) + J^*(x_1, x_f) = J^*(x_0, x_f),$$

which is a contradiction. This property is key in dynamic programming, since it allows the computation of optimal trajectory segments which can then be pieced together into one globally optimal trajectory, rather than, say, enumerating and comparing all possible trajectories ending at the goal.

DP applies to both continuous and discrete decision domains. The fully discrete setting occurs when the state $x$ takes values from a finite set, e.g. $x \in \{a, b, c, \dots\}$, and when the time variable $t$ can only take a finite number of values, e.g. $t \in \{t_0, t_1, t_2, \dots\}$. As we will see, there are efficient algorithms which can solve discrete problems and compute an optimal policy from any state, as long as the size of these discrete sets is not too large.

For instance, consider the problem (Kirk 3.4) of a motorist navigating through a maze-like environment consisting of one-way roads and intersections marked by letters.

[Figure: a maze of one-way roads connecting intersections $a, b, c, d, e, f, g, h$, with the cost of traversing each road marked on the corresponding edge.]

The goal is to travel between $a$ and $h$ in an optimal way, where the total cost is the sum of the costs along each road, given by the numbers marked on the figure. One way to solve the problem is to enumerate all paths and pick the least-cost one. This is doable in this problem but does not scale if there were many more roads and intersections. Another way is to solve the problem recursively by computing optimal subpaths starting backwards from the goal $h$. Denote the cost along an edge $ab$ by $J_{ab}$ and the optimal cost of going between two nodes $a$ and $b$ by $J^*_{ab}$. Then we have

$$J_{eh} = 8, \quad J_{fg} = 3, \quad J_{gh} = 2.$$

Clearly, we also have $J^*_{gh} = 2$ and $J^*_{fh} = J_{fg} + J^*_{gh} = 5$, but

$$J^*_{eh} = \min\{J_{eh},\; J_{ef} + J^*_{fh}\} = \min\{8,\; 2 + 5\} = 7.$$

In other words, the optimal cost at $e$ can be computed by looking at the already computed optimal cost at $f$ and the increment $J_{ef} = 2$ to get from $e$ to $f$. Once we have $J^*_{eh}$, the procedure is repeated recursively from all nodes connected to $e$ and $f$, propagating backwards until the start $a$ is reached and the cost $J^*_{ah}$ is computed. Costs of the form $J^*_{eh}$ are called optimal costs-to-go, since they denote the optimal cost to reach the goal.

In the optimal control setting, we are typically dealing with continuous spaces, e.g. $x \in \mathbb{R}^n$ and $t \in [t_0, t_f]$. One way to apply the discrete DP approach is to discretize space and time using a grid of $N_x$ discrete values along each state dimension and $N_t$ discrete values along time. The possible number of discrete states (or nodes) will then be

$$\#\text{of nodes} = N_t \cdot N_x^n,$$

or in other words, their number is exponential in the state dimension.
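Returning to the maze example, the backward cost-to-go recursion takes only a few lines of code. Below is a minimal Python sketch on a routing graph like the one above. Since the figure's numbers are garbled in the source, all edge costs not quoted in the text ($J_{eh} = 8$, $J_{ef} = 2$, $J_{fg} = 3$, $J_{gh} = 2$) are hypothetical placeholders; the structure of the recursion is the point.

```python
# Backward dynamic programming on a small directed graph of one-way roads.
from functools import lru_cache

GOAL = 'h'
edges = {
    ('a', 'b'): 5, ('a', 'd'): 8,   # hypothetical costs
    ('b', 'c'): 9, ('c', 'f'): 3,   # hypothetical costs
    ('d', 'e'): 3,                  # hypothetical cost
    ('e', 'h'): 8, ('e', 'f'): 2,   # from the text
    ('f', 'g'): 3, ('g', 'h'): 2,   # from the text
}
succ = {}
for (u, v), c in edges.items():
    succ.setdefault(u, []).append((v, c))

@lru_cache(maxsize=None)
def V(node):
    """Optimal cost-to-go from node to GOAL (Bellman recursion)."""
    if node == GOAL:
        return 0.0
    return min(c + V(v) for v, c in succ.get(node, []))

print(V('f'))   # 5 = J_fg + J_gh
print(V('e'))   # 7 = min{J_eh, J_ef + V(f)} = min{8, 2 + 5}
print(V('a'))   # cost-to-go from the start (depends on the hypothetical costs)
```

Because each cost-to-go is cached, every node is solved exactly once, which is precisely what makes DP cheaper than enumerating all paths.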

The case $n = 1$ is illustrated below, where each cell can be regarded as a node in a graph similar to the maze navigation.

[Figure: a grid over the $(t, x)$ plane with $N_x$ discrete values along the state axis and $N_t$ segments along the time axis.]

This illustrates one of the fundamental problems in control and DP, known as the curse of dimensionality. In practice, this means that applying the discrete DP procedure is only feasible for a few dimensions, e.g. $n \le 5$, and coarse discretizations, e.g. $N_x, N_t < 100$. But there are exceptions. An additional complication when transforming a continuous problem into a discrete one is that when the system can only take on a finite number of states, its continuous dynamics can be grossly violated. This motivates the study of DP theory not only in discrete domains but also in

- continuous state space $X$ and discrete time $t \in \{t_0, t_1, t_2, \dots\}$;
- continuous state space $X$ and continuous time $t \in [t_0, t_f]$ (our standard case).

To proceed we will consider the restricted optimal control problem

$$J = \phi(x(t_f), t_f) + \int_{t_0}^{t_f} L(x(t), u(t), t)\, dt, \qquad (1)$$

subject to

$$\dot{x}(t) = f(x(t), u(t), t),$$

with given $x(t_0)$, $t_0$, $t_f$.

1 Discrete-time DP: Bellman equations

Assume that the time line is discretized into the discrete set $\{t_0, t_1, \dots, t_N\}$. When time is discrete, we work with discrete trajectories, i.e. sequences of states at these discrete times. A continuous trajectory $x(t)$ is approximated by a discrete trajectory $x_{0:N}$ given by

$$x_{0:N} = \{x_0, x_1, \dots, x_N\},$$

where $x_i \approx x(t_i)$, $i = 0, \dots, N$. The dynamics must then be expressed in discrete form according to

$$x_{i+1} = f_i(x_i, u_i),$$

where $f_i$ is a numerical approximation of the dynamics. For instance, using the simplest Euler rule resulting from the finite difference

$$\dot{x}(t_i) \approx \frac{x(t_i + \Delta t) - x(t_i)}{\Delta t},$$

we have

$$f_i(x_i, u_i) = x_i + \Delta t_i\, f(t_i, x_i, u_i),$$

where $\Delta t_i$ is the time step defined by $\Delta t_i = t_{i+1} - t_i$. The cost function along each segment $[t_i, t_{i+1}]$ is approximated by

$$\int_{t_i}^{t_{i+1}} L(x, u, t)\, dt \approx L_i(x_i, u_i),$$

where $L_i$ is the discrete cost. For instance, in the simplest first-order Euler approximation we have

$$L_i(x, u) = \Delta t_i\, L(x(t_i), u(t_i), t_i).$$

To summarize, we have approximated our continuous problem (1) by the discrete problem

$$J \approx J_0 = \phi(x_N, t_N) + \sum_{i=0}^{N-1} L_i(x_i, u_i), \qquad (2)$$

subject to $x_{i+1} = f_i(x_i, u_i)$, with given $x(t_0)$, $t_0$, $t_N$.

In order to establish a link to DP, define the cost from discrete stage $i$ to $N$ by

$$J_i = \phi(x_N, t_N) + \sum_{k=i}^{N-1} L_k(x_k, u_k).$$

Similarly to the maze navigation problem, let us define the optimal cost-to-go function (or optimal value function) at stage $i$ as the optimal cost from state $x$ at stage $i$ to the end stage $N$:

$$V_i(x) \triangleq J_i^*(x) = \min_{u_{i:N-1}} J_i(x, u_{i:N-1}).$$

The optimal cost-to-go at $x$ is computed by looking at all states $x' = f_i(x, u)$ that can be reached using control inputs $u$ and selecting the $u$ which results in minimum cost-to-go from $x'$ in addition to the cost to get to $x'$. More formally,

$$V_i(x) = \min_u \left[ L_i(x, u) + V_{i+1}(x') \right], \quad \text{where } x' = f_i(x, u).$$

This can be equivalently written as the Bellman equation

$$V_i(x) = \min_u \left[ L_i(x, u) + V_{i+1}(f_i(x, u)) \right],$$

with $V_N(x) = \phi(x, t_N)$.
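To make the Bellman recursion concrete, the following sketch performs backward value iteration for a scalar system on a uniform state grid, with Euler-discretized dynamics and cost as above. The dynamics, cost, grid ranges, and resolutions are illustrative assumptions, not part of the notes; off-grid successor states are handled by linear interpolation (which `np.interp` clamps at the grid boundary).

```python
# Minimal backward value iteration for a scalar system on a state grid,
# using the Euler-discretized dynamics f_i and stage cost L_i from above.
import numpy as np

dt, N = 0.05, 40                      # time step and number of stages
xs = np.linspace(-2.0, 2.0, 101)      # discretized state values
us = np.linspace(-1.0, 1.0, 21)       # discretized control values

f = lambda x, u: x + dt * (-x**3 + u)       # f_i(x,u) = x + dt*f(t,x,u)
L = lambda x, u: dt * 0.5 * (x**2 + u**2)   # L_i(x,u) = dt*L(x,u)
phi = lambda x: 0.5 * x**2                  # terminal cost

V = phi(xs)                           # V_N(x) = phi(x, t_N)
for i in range(N - 1, -1, -1):        # Bellman recursion, backwards in time
    Xn = f(xs[:, None], us[None, :])  # successor x' for every (x, u) pair
    Q = L(xs[:, None], us[None, :]) + np.interp(Xn, xs, V)
    V = Q.min(axis=1)                 # V_i(x) = min_u [L_i + V_{i+1}(f_i(x,u))]

print(V[np.searchsorted(xs, 0.5)])    # approximate optimal cost-to-go from x = 0.5
```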

1.1 Discrete-time linear-quadratic case

The Bellman equation has a closed-form solution when $f_i$ is linear, i.e. when

$$x_{i+1} = A_i x_i + B_i u_i,$$

and when the cost is quadratic, i.e.

$$\phi(x) = \frac{1}{2} x^T P_f x, \qquad L_i(x, u) = \frac{1}{2} x^T Q_i x + \frac{1}{2} u^T R_i u.$$

To see that, assume that the value function is of the form

$$V_i(x) = \frac{1}{2} x^T P_i x,$$

for $P_i > 0$, with boundary condition $P_N = P_f$. Bellman's principle requires that

$$\frac{1}{2} x^T P_i x = \min_u \left\{ \frac{1}{2} u^T R_i u + \frac{1}{2} x^T Q_i x + \frac{1}{2} (A_i x + B_i u)^T P_{i+1} (A_i x + B_i u) \right\}.$$

The minimization results in

$$u = -(R_i + B_i^T P_{i+1} B_i)^{-1} B_i^T P_{i+1} A_i x \triangleq K_i x,$$

where

$$K_i = -(R_i + B_i^T P_{i+1} B_i)^{-1} B_i^T P_{i+1} A_i.$$

Substituting $u$ back into the Bellman equation, we obtain

$$x^T P_i x = x^T \left[ K_i^T R_i K_i + Q_i + (A_i + B_i K_i)^T P_{i+1} (A_i + B_i K_i) \right] x.$$

Note that setting

$$P_i = K_i^T R_i K_i + Q_i + (A_i + B_i K_i)^T P_{i+1} (A_i + B_i K_i),$$

with $P_N = P_f$, will satisfy the Bellman equation. This relationship can be cycled backwards from $i = N-1$ down to $i = 0$ to obtain $P_{N-1}, P_{N-2}, \dots, P_0$. The recurrence can also be expressed without the gains $K_i$ according to

$$P_i = Q_i + A_i^T \left[ P_{i+1} - P_{i+1} B_i (R_i + B_i^T P_{i+1} B_i)^{-1} B_i^T P_{i+1} \right] A_i.$$

Note that we have expressed the optimal control at stage $i$ according to $u_i = K_i x_i$, i.e. in linear feedback form, similarly to the continuous case. This means that the gains $K_i$ can be computed only once in the beginning and then used from any state $x_i$. This solution is equivalent to the discrete Linear-Quadratic Regulator.
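The backward Riccati recursion is straightforward to implement. Here is a minimal sketch; the matrices $A$, $B$, $Q$, $R$, $P_f$ and the horizon are small illustrative choices, not from the notes.

```python
# Discrete-time LQR via the backward Riccati recursion derived above.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2) * 0.1
R = np.eye(1) * 0.01
Pf = np.eye(2)
N = 50

P = Pf
Ks = []
for i in range(N - 1, -1, -1):
    # K_i = -(R + B^T P B)^{-1} B^T P A  (sign convention u_i = K_i x_i)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    Ks.append(K)
    # P_i = Q + A^T [P - P B (R + B^T P B)^{-1} B^T P] A
    P = Q + A.T @ (P - P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P)) @ A
Ks.reverse()                       # Ks[i] is now the gain at stage i

x = np.array([[1.0], [0.0]])
for i in range(N):
    u = Ks[i] @ x                  # linear feedback u_i = K_i x_i
    x = A @ x + B @ u
print(x.ravel())                   # state after N steps, driven toward zero
```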

2 Continuous Dynamic Programming

We next consider the fully continuous case and derive an analog of Bellman's equation. The continuous analog is called the Hamilton-Jacobi-Bellman equation and is a key result in optimal control. Consider the continuous value function $V(x, t)$ computed over the time interval $[t, t_f]$, defined by

$$V(x(t), t) = \min_{u(\tau),\, \tau \in [t, t_f]} \left[ \phi(x(t_f), t_f) + \int_t^{t_f} L(x(\tau), u(\tau), \tau)\, d\tau \right].$$

As noted earlier, Bellman's principle of optimality states that if a trajectory over $[t_0, t_f]$ is optimal, then it is also optimal on any subinterval $[t, t + \Delta t] \subset [t_0, t_f]$. This can be expressed more formally through the recursive relationship

$$V(x(t), t) = \min_{u(\tau),\, \tau \in [t, t+\Delta t]} \left[ \int_t^{t+\Delta t} L(x(\tau), u(\tau), \tau)\, d\tau + V(x(t + \Delta t), t + \Delta t) \right], \qquad (3)$$

where the optimization is over the continuous control signal $u(\tau)$ over the interval $[t, t + \Delta t]$. We can now state the key principle of dynamic programming in continuous space: optimal trajectories must satisfy the Hamilton-Jacobi-Bellman (HJB) equation, given by

$$-\partial_t V(x, t) = \min_{u(t)} \left\{ L(x, u, t) + \partial_x V(x, t)^T f(x, u, t) \right\}. \qquad (4)$$

To prove HJB, assume that $\Delta t$ is small and expand $V(x(t + \Delta t), t + \Delta t)$ according to

$$V(x(t + \Delta t), t + \Delta t) = V(x(t), t) + \Delta t \left[ \partial_t V(x(t), t) + \partial_x V(x(t), t)^T \dot{x} \right] + o(\Delta t).$$

Substituting the above in (3) and taking $\Delta t \to 0$, we have

$$V(x(t), t) = \min_{u(t)} \left\{ \Delta t\, L(x(t), u(t), t) + V(x(t), t) + \Delta t \left[ \partial_t V(x(t), t) + \partial_x V(x(t), t)^T \dot{x} \right] \right\},$$

which is equivalent to

$$0 = \min_{u(t)} \left\{ L(x(t), u(t), t) + \partial_t V(x(t), t) + \partial_x V(x(t), t)^T f(x(t), u(t), t) \right\} \Delta t.$$

Since this must hold for any $\Delta t$, we obtain the HJB equation.

Example 1: a simple nonlinear system. Consider the scalar system $\dot{x} = -x^3 + u$ with cost

$$J = \frac{1}{2} \int \left[ q(x) + u^2 \right] dt.$$

First we find

$$u^* = \arg\min_u \left\{ \frac{1}{2} \left[ q(x) + u^2 \right] + \partial_x V(x, t) \left( -x^3 + u \right) \right\}.$$

The solution is

$$u^* = -\partial_x V(x, t).$$
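The minimization is an unconstrained quadratic in $u$, so setting the derivative with respect to $u$ to zero gives this in one line:

$$\frac{\partial}{\partial u} \left\{ \frac{1}{2} \left[ q(x) + u^2 \right] + \partial_x V(x, t) \left( -x^3 + u \right) \right\} = u + \partial_x V(x, t) = 0 \quad \Longrightarrow \quad u^* = -\partial_x V(x, t).$$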

The HJB equation becomes

$$-\partial_t V(t, x) = \frac{1}{2} \left[ q(x) - \partial_x V(t, x)^2 \right] - \partial_x V(t, x)\, x^3.$$

If we can solve this PDE globally, then we are done. But this might be difficult.

Example 2: linear-quadratic case. Consider the scalar system $\dot{x} = x + u$ with cost

$$J = \frac{1}{2} x(t_f)^2 + \frac{1}{2} \int_0^{t_f} u(t)^2\, dt.$$

We have

$$H(x, u, \partial_x V, t) = \frac{1}{2} u^2 + \partial_x V \cdot [x + u].$$

The optimal control is computed by satisfying $\partial_u H = 0$, i.e.

$$u^*(t) = -\partial_x V(x, t).$$

Furthermore, we have $\partial_u^2 H = 1 > 0$, so $u^*$ is globally optimal! Substituting the control, the HJB equation becomes

$$-\partial_t V(t, x) = -\frac{1}{2} \partial_x V(t, x)^2 + \partial_x V(t, x)\, x.$$

Consider the candidate value function

$$V(t, x) = \frac{1}{2} P(t) x^2.$$

The HJB equation becomes

$$-\frac{1}{2} \dot{P} x^2 = -\frac{1}{2} P^2 x^2 + P x^2,$$

which is equivalent to

$$\dot{P} + 2P - P^2 = 0.$$

Using separation of variables and the final condition $P(t_f) = 1$, the solution is

$$P(t) = \frac{2}{e^{2(t - t_f)} + 1}.$$

The optimal (feedback) control law becomes

$$u = -P(t) x.$$
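As a quick sanity check on the separation-of-variables solution, one can integrate $\dot{P} = P^2 - 2P$ backward from $P(t_f) = 1$ numerically and compare against the closed form. The horizon $t_f = 1$ and the step count are arbitrary choices for this sketch.

```python
# Numerical check of the scalar Riccati ODE dP/dt = P^2 - 2P against the
# closed-form solution P(t) = 2 / (exp(2(t - t_f)) + 1).
import numpy as np

tf, M = 1.0, 1000
dt = tf / M
P, t = 1.0, tf              # boundary condition P(t_f) = 1
for _ in range(M):          # simple Euler integration backwards in time
    P -= dt * (P**2 - 2*P)
    t -= dt
P_exact = 2.0 / (np.exp(2*(t - tf)) + 1)
print(P, P_exact)           # the two values agree to about 1e-3
```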

2.1 General Linear-Quadratic Regulator (LQR)

Consider the linear system $\dot{x} = Ax + Bu$ with cost function

$$J = \frac{1}{2} x^T(t_f) P_f x(t_f) + \frac{1}{2} \int_0^{t_f} \left( x^T Q x + u^T R u \right) dt.$$

The Hamiltonian is

$$H = \frac{1}{2} \left( x^T Q x + u^T R u \right) + \partial_x V^T (Ax + Bu).$$

It is positive definite in $u$ since $\partial_u^2 H = R > 0$. The optimal control is

$$u = -R^{-1} B^T \partial_x V,$$

which results in the HJB equation

$$-\partial_t V = \frac{1}{2} x^T Q x + \frac{1}{2} \partial_x V^T B R^{-1} B^T \partial_x V + \partial_x V^T \left( Ax - B R^{-1} B^T \partial_x V \right).$$

Consider the value function

$$V(x(t), t) = \frac{1}{2} x^T(t) P(t) x(t),$$

for a positive definite symmetric $P(t)$. The optimal control law becomes

$$u(t) = -R^{-1} B^T P(t) x(t).$$

We obtain

$$-\frac{1}{2} x^T \dot{P} x = \frac{1}{2} x^T \left( Q - P B R^{-1} B^T P + 2PA \right) x = \frac{1}{2} x^T \left( Q - P B R^{-1} B^T P + PA + A^T P \right) x,$$

which will always hold true if

$$-\dot{P}(t) = A^T P(t) + P(t) A - P(t) B R^{-1} B^T P(t) + Q,$$

which is the Riccati ODE for $P(t)$. Note that we replaced $2PA$ with $PA + A^T P$ above in order to ensure that $P(t)$ remains symmetric for all $t$. The matrix $P$ is known at the terminal time, i.e.

$$P(t_f) = P_f.$$

Therefore, $P(t)$ is computed by backward integration of the Riccati ODE. Note that $P$ is symmetric, so only $n(n+1)/2$ equations are integrated.
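A minimal sketch of this backward integration, using a fixed-step Euler scheme and an illustrative double-integrator system (the matrices and horizon are assumptions, not from the notes):

```python
# Backward integration of the matrix Riccati ODE
#   -dP/dt = A^T P + P A - P B R^{-1} B^T P + Q,   P(t_f) = P_f.
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])    # double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
Rinv = np.linalg.inv(np.eye(1) * 0.1)
Pf = np.eye(2)

tf, M = 5.0, 5000
dt = tf / M
P = Pf.copy()
for _ in range(M):                        # integrate from t_f down to 0
    Pdot = -(A.T @ P + P @ A - P @ B @ Rinv @ B.T @ P + Q)
    P = P - dt * Pdot                     # Euler step backwards in time
    P = 0.5 * (P + P.T)                   # enforce symmetry against drift
print(P)                                  # P(0); feedback is u = -R^{-1} B^T P(t) x
```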

2.2 Link between MP and DP

We have the relationship

$$L(x, u, t) + \partial_x V(x, t)^T f(x, u, t) = H(x, u, \partial_x V(x, t), t),$$

i.e. $\partial_x V$ plays the role of the multiplier $\lambda$. The HJB equation can then be written as

$$-\partial_t V(x, t) = \min_u H(x, u, \partial_x V(x, t), t).$$

The optimal control becomes

$$u^* = \arg\min_u H(x, u, \partial_x V, t).$$

If the corresponding optimal trajectory is $x^*$, then

$$-\partial_t V(x^*, t) = H(x^*, u^*, \partial_x V(x^*, t), t).$$

Think of $\lambda$ as guiding the evolution along the gradient of $V$. Visually, if $V$ can be thought of as an expanding front emanating from the goal set, then the vector $\lambda$ is orthogonal to that front and pointing outward, i.e. it gives the direction of fastest increase of $V$.

DP in practice

If we knew $V(x, t)$, then DP would give a closed-loop control law

$$u^* = \arg\min_u H(x, u, \partial_x V, t).$$

But computing $V(x, t)$ globally is extremely challenging. Various approximate solutions are possible. Simplifications (though not drastic ones) are obtained in the time-independent case with an infinite time horizon.
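To illustrate what such a closed-loop law looks like computationally, here is a sketch that minimizes the Hamiltonian over a sampled set of controls, given some numerically known value function. Everything below, including the stand-in value function $V(x, t) = \frac{1}{2} x^2$ and its gradient, is a hypothetical example, not a method from the notes.

```python
# Sketch: extracting a closed-loop control from a (numerically known) value
# function by minimizing the Hamiltonian over a sampled control set.
import numpy as np

def u_star(x, t, Vx, f, L, us):
    """argmin over sampled controls of H = L(x,u,t) + Vx(x,t)^T f(x,u,t)."""
    H = [L(x, u, t) + Vx(x, t) @ f(x, u, t) for u in us]
    return us[int(np.argmin(H))]

# Example with the scalar system xdot = -x^3 + u from Example 1:
f = lambda x, u, t: np.array([-x[0]**3 + u])
L = lambda x, u, t: 0.5 * (x[0]**2 + u**2)
Vx = lambda x, t: np.array([x[0]])          # gradient of the stand-in V
us = np.linspace(-2, 2, 201)

print(u_star(np.array([1.0]), 0.0, Vx, f, L, us))   # prints -1.0, i.e. -Vx
```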
