Optimal Control Theory

The theory. Optimal control theory is a mature mathematical discipline which provides algorithms to solve various control problems. The elaborate mathematical machinery behind optimal control models is rarely exposed to the computer animation community, and most controllers designed in practice are theoretically suboptimal. This lecture closely follows the excellent tutorial by Dr. Emo Todorov (http://www.cs.washington.edu/homes/todorov/papers/optimality_chapter.pdf).

Outline
- Discrete control: Bellman equations
- Continuous control: HJB equations
- Maximum principle
- Linear quadratic regulator (LQR)

Standard problem. Find an action sequence $(u_0, u_1, \ldots, u_{n-1})$ and a corresponding state sequence $(x_0, x_1, \ldots, x_n)$ minimizing the total cost $\sum_{k=0}^{n-1} \mathrm{cost}(x_k, u_k)$, where $x_{k+1} = \mathrm{next}(x_k, u_k)$. The initial state $x_0$ and the destination state $x_n$ are given.

Discrete control. [Figure: a directed graph of states whose edges are labeled with transition costs in dollars ($120, $150, $200, ..., $500); next(x, u) gives the successor state and cost(x, u) the edge cost.]

Dynamic programming. Bellman optimality principle: if a given state-action sequence is optimal and we remove the first state and action, the remaining sequence is also optimal. The choice of optimal actions in the future is independent of the past actions which led to the present state. The optimal state-action sequences can therefore be constructed by starting at the final state and extending backwards.

Optimal value function. Define $v(x)$ as the minimal total cost for completing the task starting from state $x$. To find the optimal action:
1. Consider every action available at the current state.
2. Add its immediate cost to the optimal value of the resulting next state.
3. Choose an action for which the sum is minimal.

Optimal control policy. A mapping $u = \pi(x)$ from states to actions is called a control policy or control law. Once we have a control policy, we can start at any state and reach the destination state by following it. The optimal control policy satisfies
$$\pi(x) = \arg\min_{u \in U(x)} \left\{ \mathrm{cost}(x, u) + v(\mathrm{next}(x, u)) \right\}$$
and its corresponding optimal value function satisfies the Bellman equation
$$v(x) = \min_{u \in U(x)} \left\{ \mathrm{cost}(x, u) + v(\mathrm{next}(x, u)) \right\}$$

Value iteration. The Bellman equations cannot be solved in a single backward pass if the state transitions are cyclic. Value iteration starts with a guess $v^{(0)}$ of the optimal value function and constructs a sequence of improved guesses:
$$v^{(i+1)}(x) = \min_{u \in U(x)} \left\{ \mathrm{cost}(x, u) + v^{(i)}(\mathrm{next}(x, u)) \right\}$$
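A minimal value-iteration sketch in Python; the four-state graph, the action names, and the dollar costs are hypothetical stand-ins for the route graph in the figure, not values taken from the slides.

```python
# Value iteration on a tiny deterministic graph (hypothetical example).
import math

# next_state[(x, u)] and cost[(x, u)] play the roles of next(x, u) and cost(x, u).
next_state = {("A", "toB"): "B", ("A", "toC"): "C",
              ("B", "toD"): "D", ("C", "toD"): "D"}
cost = {("A", "toB"): 120.0, ("A", "toC"): 150.0,
        ("B", "toD"): 200.0, ("C", "toD"): 250.0}
states = {"A", "B", "C", "D"}
goal = "D"

def actions(x):
    return [u for (s, u) in next_state if s == x]

# Start from a guess v(0) and sweep until the values stop changing.
v = {x: 0.0 if x == goal else math.inf for x in states}
for _ in range(100):
    v_new = {}
    for x in states:
        if x == goal:
            v_new[x] = 0.0
        else:
            v_new[x] = min(cost[(x, u)] + v[next_state[(x, u)]] for u in actions(x))
    if v_new == v:
        break
    v = v_new

# The greedy policy with respect to the converged value function is optimal.
policy = {x: min(actions(x), key=lambda u: cost[(x, u)] + v[next_state[(x, u)]])
          for x in states if x != goal}
print(v, policy)
```

On this acyclic graph the sweep converges in a couple of iterations; on cyclic graphs the same loop still converges, just more slowly, which is exactly why value iteration is needed there.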


Continuous control. The state space and the control space are continuous. Dynamics of the system: in continuous time, $\dot{x} = f(x, u)$; in discrete time with step $\Delta$, $x_{k+1} = x_k + \Delta\, f(x_k, u_k)$. Objective function: $J = h(x(t_f)) + \int_0^{t_f} \ell(x(t), u(t), t)\, dt$, where $\ell$ is the cost rate and $h$ is the final cost.

HJB equation. The HJB equation is a nonlinear PDE with respect to the unknown function $v$:
$$-v_t(x,t) = \min_{u \in U(x)} \left\{ \ell(x, u, t) + f(x, u)^{\mathsf T} v_x(x, t) \right\}$$
An optimal control $\pi(x, t)$ is a value of $u$ which achieves the minimum in the HJB equation:
$$\pi(x, t) = \arg\min_{u \in U(x)} \left\{ \ell(x, u, t) + f(x, u)^{\mathsf T} v_x(x, t) \right\}$$

Numerical solution. Nonlinear differential equations do not always have classic solutions which satisfy them everywhere. Numerical methods guarantee convergence, but they rely on a discretization of the state space, which grows exponentially with the state space dimension. Nevertheless, the HJB equations have motivated a number of methods for approximate solution.

Parametric value function. Consider an approximation $\tilde{v}(x; \theta)$ to the optimal value function, along with its derivative $\tilde{v}_x(x; \theta)$ with respect to $x$. Choose a large enough set of states, evaluate the right-hand side of the HJB equation using the approximate value function, and adjust $\theta$ so that $\tilde{v}$ gets closer to these target values.
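Although the slide states the idea for the HJB equation, the same fit-to-targets scheme is easiest to demonstrate in discrete time, where the Bellman targets of a scalar linear-quadratic problem have a closed form. Everything below (the dynamics coefficients, the quadratic ansatz $\tilde{v}(x; \theta) = \theta x^2$, the sample states) is a made-up illustration, not the method from the slides.

```python
# Fitted value function on a scalar linear-quadratic problem (hypothetical).
import numpy as np

a, b = 1.1, 0.5                    # scalar dynamics x' = a x + b u
q, r = 1.0, 0.2                    # cost q x^2 + r u^2
xs = np.linspace(-2.0, 2.0, 21)    # a "large enough set of states"

theta = 0.0
for _ in range(200):
    # Minimizing u of q x^2 + r u^2 + theta (a x + b u)^2 is analytic:
    # set d/du = 2 r u + 2 theta b (a x + b u) = 0.
    u = -(theta * a * b) / (r + theta * b * b) * xs
    # Right-hand-side targets under the current approximation.
    targets = q * xs**2 + r * u**2 + theta * (a * xs + b * u) ** 2
    # Least-squares fit of theta * x^2 to the targets.
    theta_new = np.dot(targets, xs**2) / np.dot(xs**2, xs**2)
    if abs(theta_new - theta) < 1e-10:
        break
    theta = theta_new
print(theta)   # converges to the scalar Riccati fixed point
```

Because the targets here happen to be exactly quadratic, the least-squares step is exact and the loop reduces to a Riccati iteration; with a richer feature set or nonlinear dynamics, the same fit is only approximate, which is the general case the slide describes.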


Maximum principle. The maximum principle solves optimal control for a deterministic dynamic system with boundary conditions. It can be derived via the HJB equations or via Lagrange multipliers, and it can be generalized to other types of optimal control problems: free final time, intermediate constraints, first exit time, control constraints, etc.

Derivation via HJB. The finite-horizon HJB equation is
$$-v_t(x,t) = \min_{u \in U(x)} \left\{ \ell(x, u, t) + f(x, u)^{\mathsf T} v_x(x, t) \right\}, \qquad v(x, t_f) = h(x)$$
If an optimal control policy $\pi(x, t)$ is given, we can set $u = \pi(x, t)$ and drop the min operator in the HJB equation.
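Carrying this step one move further (following Todorov's tutorial): differentiating the resulting identity with respect to $x$ along the optimal trajectory, and defining the costate $\lambda(t) = v_x(x(t), t)$, turns the PDE into the ODE system
$$\dot{x}(t) = f(x(t), u(t)), \qquad -\dot{\lambda}(t) = \ell_x(x, u, t) + f_x(x, u)^{\mathsf T} \lambda(t), \qquad \lambda(t_f) = h_x(x(t_f))$$
with $u(t)$ minimizing $\ell(x, u, t) + f(x, u)^{\mathsf T} \lambda(t)$ at each time.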

Maximum principle. The remarkable property of the maximum principle is that it is an ODE, even though we derived it starting from a PDE. An ODE is a consistency condition which singles out specific trajectories without reference to neighboring trajectories. Extremal trajectories which solve the above optimization remove the dependence on neighboring trajectories.

Hamiltonian function. The maximum principle can be written in a more compact and symmetric form with the help of the Hamiltonian function
$$H(x, u, \lambda, t) = \ell(x, u, t) + f(x, u)^{\mathsf T} \lambda$$
The maximum principle can then be restated as
$$\dot{x} = \frac{\partial H}{\partial \lambda}, \qquad -\dot{\lambda} = \frac{\partial H}{\partial x}, \qquad u(t) = \arg\min_{u} H(x(t), u, \lambda(t), t)$$
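In practice the maximum principle yields a two-point boundary-value problem, commonly solved by shooting. Here is a minimal sketch for a hypothetical problem chosen so the answer is known analytically: minimize $\tfrac{1}{2}\int_0^1 u^2\, dt + \tfrac{1}{2} x(1)^2$ with $\dot{x} = u$ and $x(0) = 1$; then $H = \tfrac{1}{2}u^2 + \lambda u$, so $u^* = -\lambda$, $\dot{\lambda} = -H_x = 0$, and the boundary condition is $\lambda(1) = x(1)$.

```python
# Shooting method for a maximum-principle boundary-value problem (hypothetical).
x0, T, N = 1.0, 1.0, 1000
dt = T / N

def shoot(lam0):
    """Integrate the state/costate ODEs forward; return the residual lam(T) - x(T)."""
    x, lam = x0, lam0
    for _ in range(N):
        u = -lam          # minimizer of the Hamiltonian H = 0.5 u^2 + lam u
        x += dt * u       # xdot = f(x, u) = u
        # lamdot = -H_x = 0 for this problem, so lam stays constant
    return lam - x

# Bisection on the unknown initial costate lam(0); the residual is monotone here.
lo, hi = 0.0, x0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if shoot(mid) < 0.0:
        lo = mid
    else:
        hi = mid
print(0.5 * (lo + hi))    # -> 0.5, matching the analytic solution lam = x0 / 2
```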


Linear quadratic regulator. Most optimal control problems do not have closed-form solutions; one exception is the LQR case. LQR is the class of problems in which the dynamics are linear and the cost is quadratic:
dynamics: $\dot{x} = Ax + Bu$
cost rate: $\ell(x, u) = \tfrac{1}{2}\left( x^{\mathsf T} Q x + u^{\mathsf T} R u \right)$
final cost: $h(x) = \tfrac{1}{2} x^{\mathsf T} Q_f x$
$R$ is symmetric positive definite, and $Q$ and $Q_f$ are symmetric. $A$, $B$, $R$, $Q$ can be made time-varying.

Optimal value function. For an LQR problem, the optimal value function is quadratic in $x$ and can be expressed as $v(x, t) = \tfrac{1}{2} x^{\mathsf T} V(t)\, x$, where $V(t)$ is a symmetric matrix. We can obtain an ODE for $V(t)$ via the HJB equation:
$$-\dot{V}(t) = Q + A^{\mathsf T} V(t) + V(t) A - V(t) B R^{-1} B^{\mathsf T} V(t), \qquad V(t_f) = Q_f$$
and the resulting optimal control is linear in the state: $u = -R^{-1} B^{\mathsf T} V(t)\, x$.
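A concrete sketch (not from the slides): integrating this Riccati ODE backward from $V(t_f) = Q_f$ by explicit Euler, for a hypothetical double-integrator system with made-up cost weights.

```python
# Backward Euler integration of the continuous-time Riccati ODE (hypothetical system).
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])    # double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.array([[1.0]]); Qf = 10.0 * np.eye(2)
tf, dt = 5.0, 1e-3

V = Qf.copy()
for _ in range(int(tf / dt)):              # march backward from tf toward 0
    Vdot = -(Q + A.T @ V + V @ A - V @ B @ np.linalg.inv(R) @ B.T @ V)
    V = V - dt * Vdot                      # step from t to t - dt
K = np.linalg.inv(R) @ B.T @ V             # feedback gain at t = 0: u = -K x
print(K)
```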

Discrete LQR. LQR is defined as follows when time is discretized:
dynamics: $x_{k+1} = A x_k + B u_k$
cost rate: $\tfrac{1}{2}\left( x_k^{\mathsf T} Q x_k + u_k^{\mathsf T} R u_k \right)$
final cost: $\tfrac{1}{2} x_n^{\mathsf T} Q_f x_n$
Let $n = t_f / \Delta$; the correspondence to the continuous-time problem is $A \leftrightarrow I + \Delta A^c$, $B \leftrightarrow \Delta B^c$, $Q \leftrightarrow \Delta Q^c$, $R \leftrightarrow \Delta R^c$, where the superscript $c$ marks the continuous-time quantities.

Optimal value function. We derive the optimal value function from the Bellman equation. Again, the optimal value function is quadratic in $x$ and changes over time: $v_k(x) = \tfrac{1}{2} x^{\mathsf T} V_k\, x$. Plugging this into the Bellman equation, we obtain a recursive relation for $V_k$:
$$V_k = Q + A^{\mathsf T} V_{k+1} A - A^{\mathsf T} V_{k+1} B \left( R + B^{\mathsf T} V_{k+1} B \right)^{-1} B^{\mathsf T} V_{k+1} A, \qquad V_n = Q_f$$
The optimal control law is linear in $x$:
$$u_k = -\left( R + B^{\mathsf T} V_{k+1} B \right)^{-1} B^{\mathsf T} V_{k+1} A\, x_k$$
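A minimal sketch of this backward recursion and the resulting linear feedback law; the 2-state system and the weights are hypothetical, not from the slides.

```python
# Discrete-time Riccati recursion and LQR rollout (hypothetical system).
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])    # discretized double integrator
B = np.array([[0.0], [0.1]])
Q = 0.1 * np.eye(2); R = np.array([[0.01]]); Qf = np.eye(2)
n = 100                                    # horizon length

# Backward pass: V_n = Qf, then
#   L_k = (R + B^T V_{k+1} B)^{-1} B^T V_{k+1} A
#   V_k = Q + A^T V_{k+1} (A - B L_k)
V = Qf.copy()
gains = [None] * n
for k in reversed(range(n)):
    S = R + B.T @ V @ B
    L = np.linalg.solve(S, B.T @ V @ A)
    V = Q + A.T @ V @ (A - B @ L)
    gains[k] = L

# Forward pass: apply u_k = -L_k x_k from an arbitrary initial state.
x = np.array([1.0, 0.0])
for k in range(n):
    u = -gains[k] @ x
    x = A @ x + B @ u
print(x)   # driven toward the origin
```

Note that the gains are computed backward in time but applied forward, mirroring the structure of the Bellman recursion itself.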