Real Time Stochastic Control and Decision Making: From theory to algorithms and applications
1 Real Time Stochastic Control and Decision Making: From theory to algorithms and applications Evangelos A. Theodorou Autonomous Control and Decision Systems Lab
2 Challenges in control
Uncertainty: stochastic uncertainty, parametric uncertainty, stochastic and parametric uncertainty, total uncertainty.
Scalability: state space, control parameterization.
Computational efficiency: real time, few data/interactions.
Optimality: global optimality.
3 First principles
Stochastic control theory (Richard Bellman, Lev Pontryagin), statistical physics (Richard Feynman, Mark Kac), parallel computation, machine learning.
[Figure: velocity profiles \|(u_x, u_y)\| over time for the 6 m/s and 8 m/s targets.]
4 Theoretical areas and applications
Theoretical areas: control theory and dynamical systems; stochastic control (SC); sampling, Feynman-Kac and information-theoretic methods; machine learning; linearization and uncertainty representation; physics and mechanics; neuroscience and bio-control.
Applications: autonomy, aerospace engineering, robotics, neuroscience and bioinspired robotics.
5 Dynamic Programming
Problem formulation. Cost under minimization:
V(x,t_i) = \min_u E_Q\left[ \phi(x,t_N) + \int_{t_i}^{t_N} L(x,u,t)\,dt \right]
Running cost:
L(x,u,t) = q_0(x,t) + q_1(x,t)^T u + \frac{1}{2} u^T R u
Controlled diffusion dynamics (drift plus noise):
dx = F(x,u)\,dt + B(x)\,d\omega, \qquad F(x,u) = f(x) + g(x)u
E_Q: expectation w.r.t. the controlled dynamics; E_P: expectation w.r.t. the uncontrolled dynamics.
Bellman principle (discrete): cost-to-go(current state) = min[ cost to reach(next state) + cost-to-go(next state) ].
[Figure: a four-stage graph from Start (node 1) to Goal (node 9), with edge costs L_{4,6}, L_{4,7}, L_{4,8}, L_{6,9}, L_{7,9}, L_{8,9}, illustrating the backward recursion for V(x_1), ..., V(x_8).]

6 Dynamic Programming (continued)
Two difficulties with the Bellman recursion: 1) it is a backward process, and 2) in continuous time it becomes a partial differential equation that is hard to solve.
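The discrete Bellman principle on the slide's stage graph can be sketched in a few lines. The edge costs below are illustrative placeholders (the slide only names the edges, not their values); only the backward recursion itself is from the slide.

```python
# Backward dynamic programming on the small stage graph from the slide:
# V(goal) = 0 and V(n) = min over successors m of [ L(n,m) + V(m) ].
# Edge costs are made-up illustrative numbers.
edges = {
    1: {2: 1.0, 3: 4.0},            # stage 1 -> stage 2
    2: {4: 2.0, 5: 1.0},            # stage 2 -> stage 3
    3: {4: 1.0, 5: 3.0},
    4: {6: 2.0, 7: 1.0, 8: 5.0},    # stage 3 -> stage 4 (L_{4,6}, L_{4,7}, L_{4,8})
    5: {7: 2.0, 8: 1.0},
    6: {9: 1.0}, 7: {9: 3.0}, 8: {9: 1.0},  # stage 4 -> goal (L_{6,9}, L_{7,9}, L_{8,9})
}
V = {9: 0.0}                        # cost-to-go at the goal
for node in sorted(edges, reverse=True):   # successors are always numbered higher
    V[node] = min(cost + V[succ] for succ, cost in edges[node].items())
```

Sweeping the nodes in reverse order is exactly the "backward process" the slide refers to; in continuous time this recursion becomes the HJB PDE.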
7-8 Making Control Linear
Hamilton-Jacobi-Bellman equation:
-\partial_t V = q_0 + (\nabla_x V)^T f - \frac{1}{2} (\nabla_x V)^T G R^{-1} G^T (\nabla_x V) + \frac{1}{2} \mathrm{tr}\big( (\nabla_{xx} V) B B^T \big)
Desirability: \Psi(x,t) = \exp\!\big( -\tfrac{1}{\lambda} V(x,t) \big)
Noise regulates control authority: \lambda\, G(x) R^{-1} G(x)^T = B(x) B(x)^T
Backward Chapman-Kolmogorov equation (linear in \Psi):
-\partial_t \Psi = -\tfrac{1}{\lambda} q_0 \Psi + f^T (\nabla_x \Psi) + \frac{1}{2} \mathrm{tr}\big( (\nabla_{xx} \Psi) B B^T \big)
Statistical physics, Feynman-Kac (Richard Feynman, Mark Kac): sample from the uncontrolled dynamics
dx = f(x)\,dt + B(x)\,d\omega
and evaluate the expectation:
\Psi(x,t_i) = E_P\Big[ \exp\!\Big( -\tfrac{1}{\lambda} \int_{t_i}^{t_N} q(x)\,dt \Big)\, \Psi(x,t_N) \Big]
Find the optimal controls:
u_{PI}(x,t) = \lambda R^{-1} G(x)^T \frac{\nabla_x \Psi(x,t)}{\Psi(x,t)}
They push the dynamics to states with high desirability.

9 Making Control Linear (continued)
Lower bound:
-\lambda \log E_P\Big[ \exp\!\Big( -\tfrac{1}{\lambda} \Big( \phi(x,t_N) + \int_{t_i}^{t_N} q(x,t)\,dt \Big) \Big) \Big] \le E_Q\Big[ \phi(x,t_N) + \int_{t_i}^{t_N} L(x,u,t)\,dt \Big]
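The Feynman-Kac expectation above can be estimated by plain Monte Carlo: simulate the uncontrolled SDE with Euler-Maruyama and average the exponentiated path costs. Everything in this sketch (the scalar dynamics f(x) = -x, noise scale B, costs q and phi, temperature lambda) is a toy assumption, not a model from the talk.

```python
import math, random

random.seed(0)

# Toy 1-D Feynman-Kac estimate of the desirability Psi(x0, 0).
# All model choices are illustrative: f(x) = -x, B = 0.5,
# q(x) = x^2, terminal cost phi(x) = x^2, temperature lam = 1.0.
f = lambda x: -x
B, lam, dt, T = 0.5, 1.0, 0.01, 1.0
q = lambda x: x * x
phi = lambda x: x * x

def psi_estimate(x0, n_samples=2000):
    """Psi(x0,0) = E_P[ exp(-(1/lam)(integral of q dt + phi(x_T))) ]."""
    total = 0.0
    for _ in range(n_samples):
        x, path_cost = x0, 0.0
        for _ in range(int(T / dt)):
            path_cost += q(x) * dt                      # accumulate state cost
            x += f(x) * dt + B * math.sqrt(dt) * random.gauss(0.0, 1.0)
        total += math.exp(-(path_cost + phi(x)) / lam)  # Feynman-Kac weight
    return total / n_samples

psi0 = psi_estimate(0.0)
value0 = -lam * math.log(psi0)      # recover the value: V(x,t) = -lam * log Psi(x,t)
```

Since every sampled path has strictly positive cost, the estimate satisfies 0 < Psi < 1 and the recovered value is positive; no PDE solver and no backward sweep are needed.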
10 Generalized Path Integral Control
Local controls:
u_L(x_{t_i}, t_i) = R^{-1} G(x_{t_i})^T \big( G(x_{t_i}) R^{-1} G(x_{t_i})^T \big)^{-1} G(x_{t_i})\, dw(t_i)
Trajectory probability:
P(\tilde{x}) = \frac{ \exp(-\tfrac{1}{\lambda} S(\tilde{x})) }{ \int \exp(-\tfrac{1}{\lambda} S(\tilde{x}))\, d\tilde{x} }
Intuition: sample M trajectories from the uncontrolled dynamics, starting at Start with noise realizations dw^{(1)}_{t_0}, ..., dw^{(M)}_{t_0} and path costs S_1(\tilde{x}), ..., S_M(\tilde{x}); then
u = \sum_i P_i\, dw^{(i)}_{t_0}, \qquad P_i = \frac{ \exp(-\tfrac{1}{\lambda} S_i(\tilde{x})) }{ \sum_j \exp(-\tfrac{1}{\lambda} S_j(\tilde{x})) }
E. Theodorou, J. Buchli, S. Schaal. AISTATS 2009; JMLR 2010.
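The softmax weighting of noise realizations on this slide is a one-liner in practice. The path costs S_i and noise draws dw^{(i)} below are made-up numbers purely to exercise the formula; subtracting the minimum cost before exponentiating is a standard numerical-stability trick.

```python
import math

# Path-integral weighting: P_i = exp(-S_i/lam) / sum_j exp(-S_j/lam),
# control update u = sum_i P_i * dw_i.  Costs and noise draws are illustrative.
lam = 1.0
S = [3.0, 1.0, 2.0, 10.0]        # path costs S_i(x~)
dw = [0.5, -0.2, 0.1, 4.0]       # noise realizations dw^{(i)} at t_0

m = min(S)                       # shift by the min for numerical stability
w = [math.exp(-(s - m) / lam) for s in S]
P = [wi / sum(w) for wi in w]    # normalized path probabilities
u = sum(Pi * dwi for Pi, dwi in zip(P, dw))
```

Note how the high-cost rollout (S = 10) contributes almost nothing: its weight is suppressed exponentially, so the averaged control follows the cheap trajectories.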
11 Iterative Path Integral Control
Sampling with the uncontrolled dynamics:
\Psi(x,t_i) = \int \exp\!\Big( -\tfrac{1}{\lambda} \int q(x_t,t)\,dt \Big)\, \Psi(x,t_N)\, dP \quad \text{(uncontrolled)}
Sampling with the controlled dynamics:
\Psi(x,t_i) = \int \exp\!\Big( -\tfrac{1}{\lambda} \int q(x_t,t)\,dt \Big)\, \Psi(x,t_N)\, \frac{dP}{dQ}\, dQ \quad \text{(controlled; } dP/dQ \text{ is the Radon-Nikodym derivative)}
Find the optimal controls:
u_{PI}(x,t) = \lambda R^{-1} G(x)^T \frac{\nabla_x \Psi(x,t)}{\Psi(x,t)}
Iterative path integral control:
u_{new}(t_k) = u_{old}(t_k) + \sum_i P_i\, dw^{(i)}(t_k)
Path probability:
P_i = \frac{ \exp\!\big( -\tfrac{1}{\lambda} ( S(\tilde{x}_i) + \text{correction} ) \big) }{ \sum_j \exp\!\big( -\tfrac{1}{\lambda} ( S(\tilde{x}_j) + \text{correction} ) \big) }
E. Theodorou, J. Buchli, S. Schaal, JMLR 2010. E. Theodorou, E. Todorov, CDC 2012.
12 Overview
Optimal control theory (constrained optimization, dynamic programming) leads to the Hamilton-Jacobi-Bellman PDE for V(x,t). The transformation V(x,t) = -\lambda \log \Psi(x,t) yields the backward Chapman-Kolmogorov PDE for \Psi(x,t), which the Feynman-Kac lemma solves by sampling, and which also gives the fundamental bound.
E. Theodorou, E. Todorov, CDC 2012. E. Theodorou, Entropy 2015.
13 Information Theoretic View
Free energy:
F = -\lambda \log \int \exp\!\big( -\tfrac{1}{\lambda} J(x) \big)\, dP(\omega)
Generalized entropy:
S(Q \| P) = - \int \frac{dQ(\omega)}{dP(\omega)} \log \frac{dQ(\omega)}{dP(\omega)}\, dP(\omega)
Fundamental inequality:
-\lambda \log \int \exp\!\big( -\tfrac{1}{\lambda} J(x) \big)\, dP(\omega) \le E_Q\big[ J(x) \big] + \lambda\, KL(Q \| P)
that is, Free Energy \le Work - Temperature \times Generalized Entropy.
Optimal measure:
dQ^* = \frac{ \exp(-\tfrac{1}{\lambda} J(x))\, dP }{ \int \exp(-\tfrac{1}{\lambda} J(x))\, dP }
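The free-energy inequality and the optimality of dQ* can be checked numerically on a finite sample space, where the integrals become sums. The cost values and the candidate distribution Q below are arbitrary illustrative numbers; the point is that the bound holds for any Q and is tight exactly at the exponentially tilted measure Q*.

```python
import math

# Discrete check of  -lam*log E_P[exp(-J/lam)] <= E_Q[J] + lam*KL(Q||P).
# Costs and distributions are arbitrary illustrative numbers.
lam = 0.5
J = [1.0, 2.0, 0.5, 3.0]
P = [0.25] * 4                                  # "uncontrolled" measure

F = -lam * math.log(sum(p * math.exp(-j / lam) for p, j in zip(P, J)))

def rhs(Q):
    """Right-hand side of the bound: E_Q[J] + lam * KL(Q||P)."""
    EJ = sum(q * j for q, j in zip(Q, J))
    KL = sum(q * math.log(q / p) for q, p in zip(Q, P) if q > 0)
    return EJ + lam * KL

Q_any = [0.1, 0.2, 0.6, 0.1]                    # some controlled measure
Z = sum(p * math.exp(-j / lam) for p, j in zip(P, J))
Q_star = [p * math.exp(-j / lam) / Z for p, j in zip(P, J)]  # optimal measure
```

Evaluating `rhs(Q_any)` gives a number strictly above the free energy `F`, while `rhs(Q_star)` matches `F` to machine precision, which is the content of the "optimal measure" line on the slide.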
14 Non-Classical View
Uncontrolled dynamics:
P: \; dx = f(x)\,dt + \sqrt{\lambda}\, B(x)\, d\omega^{(0)}
Controlled dynamics:
Q: \; dx = f(x)\,dt + B(x)\big( u\,dt + \sqrt{\lambda}\, d\omega^{(1)} \big)
Fundamental relationship:
-\lambda \log E_P\big[ \exp(-\tfrac{1}{\lambda} J(x)) \big] \le E_Q\big[ J(x) \big] + \lambda\, KL(Q \| P)
Radon-Nikodym derivative (Girsanov):
\frac{dQ}{dP} = \exp\!\Big( -\frac{1}{2\lambda} \int_{t_i}^{t_N} u^T u\, dt + \frac{1}{\sqrt{\lambda}} \int_{t_i}^{t_N} u^T dw^{(1)}(t) \Big)
Substituting gives the fundamental relationship between free energy and cost function:
\xi(x,t_i) = -\lambda \log E_P\big[ \exp(-\tfrac{1}{\lambda} J(x)) \big] \le E_Q\Big[ J(x) + \int_{t_i}^{t_N} \tfrac{1}{2} u^T u\, dt \Big]
There is a lower bound on the cost function. How do we find this lower bound? The formulation permits generalizations to other classes of stochastic dynamics.
15 Non-Classical View
Basic relationship:
\xi(x,t_i) = -\lambda \log \int \exp\!\big( -\tfrac{1}{\lambda} J(x) \big)\, dP(\omega) \le E_Q\Big[ J(x) + \int_{t_i}^{t_N} \tfrac{1}{2} u^T R u\, dt \Big]
Backward Chapman-Kolmogorov equation:
-\partial_t \Phi = -\tfrac{1}{\lambda} q_0 \Phi + f^T (\nabla_x \Phi) + \frac{1}{2} \mathrm{tr}\big( (\nabla_{xx} \Phi) B B^T \big)
Exponential transformation: \xi(x,t) = -\lambda \log \Phi(x,t)
Hamilton-Jacobi equation:
-\partial_t \xi = q_0 + (\nabla_x \xi)^T f - \frac{1}{2\lambda} (\nabla_x \xi)^T B B^T (\nabla_x \xi) + \frac{1}{2} \mathrm{tr}\big( (\nabla_{xx} \xi) B B^T \big)
\xi(x,t) is a value function; \Phi(x,t) is a desirability function.
16 Overview
Two routes to the same structure. From optimal control theory: constrained optimization and dynamic programming give the Hamilton-Jacobi-Bellman PDE for V(x,t); the transformation \Psi(x,t) = \exp(-\tfrac{1}{\lambda} V(x,t)) gives the backward Chapman-Kolmogorov PDE, the Feynman-Kac lemma, and the fundamental bound. From information theory: free energy and relative entropy, via Jensen's inequality and Girsanov's theorem, give the fundamental bound \xi(x,t) = -\lambda \log \Phi(x,t); the backward Chapman-Kolmogorov PDE and the Feynman-Kac lemma then recover the Hamilton-Jacobi-Bellman PDE.
E. Theodorou, E. Todorov, CDC 2012. E. Theodorou, Entropy 2015.
17-22 Non-Classical View (generalized dynamics)
Fundamental relationship:
-\lambda \log \int \exp\!\big( -\tfrac{1}{\lambda} J(x) \big)\, dP(\omega) \le E_Q\big[ J(x) \big] + \lambda\, KL(Q \| P)
Uncontrolled dynamics:
P: \; dx = F(x,0)\,dt + C(x)\, dw^{(0)}
Controlled dynamics:
Q: \; dx = F(x,u)\,dt + C(x)\, dw^{(1)}
Optimal measure:
dQ^* = \frac{ \exp(-\tfrac{1}{\lambda} J(x))\, dP }{ \int \exp(-\tfrac{1}{\lambda} J(x))\, dP }
New optimal control formulation:
u^* = \arg\min_u KL\big( Q^* \| Q(u) \big)
with the Radon-Nikodym decomposition \frac{dQ^*}{dQ(u)} = \frac{dQ^*}{dP} \frac{dP}{dQ(u)}.
Control parameterization (piecewise constant):
u_t = u_0 \text{ if } 0 \le t < \Delta t; \; u_1 \text{ if } \Delta t \le t < 2\Delta t; \; \ldots; \; u_j \text{ if } j\Delta t \le t < (j+1)\Delta t; \; \ldots
Generalized importance sampling:
E_P\big[ f(x) \big] = E_{Q_{m,\nu}}\Big[ f(x)\, \frac{dP}{dQ_{m,\nu}} \Big]
G. Williams et al., ICRA 2016.
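One MPPI-style update built from these pieces (piecewise-constant controls, sampled rollouts, exponential weighting of the noise) can be sketched on a toy system. The dynamics (a 1-D double integrator driven toward a target position), the cost, and every constant below are illustrative assumptions, not the vehicle models from the talk.

```python
import math, random

random.seed(1)

# Minimal MPPI-style update on a toy 1-D double integrator.
# All model details are illustrative assumptions.
dt, H, K = 0.05, 20, 256            # step, horizon, number of rollouts
lam, sigma, target = 0.5, 2.0, 1.0  # temperature, exploration noise, goal

def rollout_cost(u_seq, noise):
    """Deterministic cost of applying controls u_seq + noise."""
    x, v, cost = 0.0, 0.0, 0.0
    for u, eps in zip(u_seq, noise):
        v += (u + eps) * dt
        x += v * dt
        cost += (x - target) ** 2 * dt
    return cost

u_nom = [0.0] * H                   # nominal piecewise-constant control sequence
noises = [[sigma * random.gauss(0.0, 1.0) for _ in range(H)] for _ in range(K)]
costs = [rollout_cost(u_nom, n) for n in noises]

m = min(costs)                      # subtract the min for numerical stability
w = [math.exp(-(c - m) / lam) for c in costs]
W = sum(w)
u_new = [u_nom[t] + sum(w[k] * noises[k][t] for k in range(K)) / W
         for t in range(H)]
```

Because the cost is convex in the control sequence, Jensen's inequality guarantees the updated sequence costs no more than the weighted average of the rollout costs, and in practice it beats the nominal (zero) controls after a single update.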
23 MPPI with offline model learning, 7-8 m/s
24 Applications in Robotics: High-Speed MPPI with offline learned dynamics
[Figure: paths taken (Y vs. X position) for the 6 m/s and 8 m/s targets; multi-step prediction errors over time in position (P_x, P_y), heading, velocity (U_x, U_y), and heading rate.]
Learned dynamics model (per output dimension i):
f_i(x) \sim \mathcal{N}\big( \mu_i(x),\, \sigma_i^2(x) \big)
Mean: \mu_i(x) = \sigma_n^{-2}\, \phi(x)^T A_i^{-1} \Phi(X) Y_i
Variance: \sigma_i^2(x) = \phi(x)^T A_i^{-1} \phi(x)
A_i = \sigma_n^{-2}\, \Phi(X) \Phi(X)^T + \Sigma_p^{-1}
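Read as Bayesian linear regression in feature space (which is how the reconstructed mean/variance equations above are structured), the predictive equations collapse to scalars for a single feature. The feature map, data, and noise level below are made-up for illustration; only the form of the equations follows the slide.

```python
# Scalar instance of the predictive equations (one-feature Bayesian linear
# regression with unit prior):
#   A      = sigma_n^{-2} * sum_i phi(x_i)^2 + 1
#   mu(x*) = sigma_n^{-2} * phi(x*) * A^{-1} * sum_i phi(x_i) * y_i
#   var(x*)= phi(x*)^2 / A
# Data and feature map are illustrative.
sigma_n = 0.1
phi = lambda x: x                       # trivial one-dimensional feature map
X = [0.0, 0.5, 1.0, 1.5, 2.0]
Y = [0.05, 1.02, 2.1, 2.95, 4.1]        # roughly y = 2x plus noise

A = sum(phi(x) ** 2 for x in X) / sigma_n ** 2 + 1.0
b = sum(phi(x) * y for x, y in zip(X, Y)) / sigma_n ** 2

def predict(x_star):
    """Predictive mean and variance at a query point."""
    mu = phi(x_star) * b / A
    var = phi(x_star) ** 2 / A
    return mu, var
```

At a query point inside the data range the mean recovers the underlying slope and the variance is small, which is what makes such models usable inside an MPPI optimizer.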
25 Applications in Robotics: MPPI with online model learning, 8-9 m/s
26 Model Predictive Path Integral Control using Artificial Neural Networks
27 Applications in Multi-vehicle
28 Applications in Robotics: Comparisons
[Figure: average running cost vs. number of rollouts (log scale) for MPPI at several parameter settings, compared against the DDP solution; cornering-maneuver snapshots under MPC-DDP and MPPI.]
29 Applications in Multi-vehicle Systems
[Figure: success rates with static and moving obstacles for 1, 3, and 9 quadrotors; real-time CUDA optimization times (ms) per iteration vs. number of rollouts (thousands).]
30 Learning and optimization in different time scales
Control parameterizations: u(x,t,\theta) = \Phi(x,t)\,\theta, updated by path-integral steps \theta^{(k)}_{PI}(t_i), \theta^{(k)}_{PI}(t_{i+1}), \ldots, \theta^{(k)}_{PI}(t_N); and u(x,t,\theta) = \Phi(x)\,\theta(t).
31 Applications in Robotics-Manipulation
32 Applications in Robotics-Manipulation
33 Nonlinear Feynman-Kac
Cost function:
J(t,x; u(\cdot)) = E\Big[ g(x(T)) + \int_t^T \big( q_0(s,x(s)) + q_1(s,x(s))^T |u(s)| \big)\, ds \Big]
Stochastic dynamics:
dx(t) = f(t,x(t))\,dt + G(t,x(t))\, u(t)\,dt + \Sigma(t,x(t))\, dW_t
Hamilton-Jacobi-Bellman:
v_t + \inf_{u \in U} \Big[ \tfrac{1}{2} \mathrm{tr}(v_{xx} \Sigma \Sigma^T) + v_x^T f + \big( v_x^T G + q_1^T D(\mathrm{sgn}(u)) \big) u + q_0 \Big] = 0
Carrying out the minimization over the box-constrained controls gives:
v_t + \tfrac{1}{2} \mathrm{tr}(v_{xx} \Sigma \Sigma^T) + v_x^T f + q_0 + \sum_i \min\Big\{ (v_x^T G + q_1^T)_i\, u_i^{max},\; 0,\; (v_x^T G - q_1^T)_i\, u_i^{min} \Big\} = 0
34-35 Nonlinear Feynman-Kac (FBSDE formulation)
Forward SDE:
dX_s = b(s, X_s)\, ds + \Sigma(s, X_s)\, dW_s
Backward SDE:
dY_s = -h(s, X_s^{t,x}, Y_s, Z_s)\, ds + Z_s^T dW_s
with
b(t,x) = f(t,x), \qquad h(t,x,z) = q_0(t,x) + \sum_i \min\Big\{ (z^T + q_1^T)_i\, u_i^{max},\; 0,\; (z^T - q_1^T)_i\, u_i^{min} \Big\}
Importance-sampled forward SDE (modified drift):
d\tilde{X}_s = \big[ b(s, \tilde{X}_s) + \Sigma(s, \tilde{X}_s) K_s \big]\, ds + \Sigma(s, \tilde{X}_s)\, dW_s
Compensated backward SDE:
d\tilde{Y}_s = \big[ -h(s, \tilde{X}_s, \tilde{Y}_s, \tilde{Z}_s) + \tilde{Z}_s^T K_s \big]\, ds + \tilde{Z}_s^T dW_s
The compensated pair solves the same PDE, since the drift change cancels:
v_t + \tfrac{1}{2} \mathrm{tr}(v_{xx} \Sigma \Sigma^T) + v_x^T (b + \Sigma K) + h(t, x, v, \Sigma^T v_x) - v_x^T \Sigma K = 0
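The effect of the modified drift in the importance-sampled forward SDE can be seen with a plain Euler-Maruyama simulation: adding the term Sigma*K shifts where the forward samples concentrate (the role of importance sampling here), while the compensated backward equation absorbs that shift. The scalar coefficients b(s,x) = -x, Sigma, K, and the horizon below are toy assumptions.

```python
import math, random

random.seed(2)

# Euler-Maruyama sketch of the forward SDE pair from the slide:
#   plain:     dX  = b(s,X) ds + Sigma dW
#   modified:  dX~ = [b(s,X~) + Sigma*K] ds + Sigma dW
# Toy scalar coefficients; b, Sigma, K, horizon are illustrative assumptions.
b = lambda s, x: -x
Sigma, K, dt, N = 0.3, 1.0, 0.01, 100   # noise scale, drift shift, step, steps

def simulate(shift):
    """One terminal sample X_T; shift=1.0 enables the modified drift."""
    x = 0.0
    for i in range(N):
        dw = math.sqrt(dt) * random.gauss(0.0, 1.0)
        x += (b(i * dt, x) + Sigma * K * shift) * dt + Sigma * dw
    return x

plain = [simulate(0.0) for _ in range(500)]
shifted = [simulate(1.0) for _ in range(500)]
mean_plain = sum(plain) / len(plain)
mean_shifted = sum(shifted) / len(shifted)
```

The shifted ensemble ends up centered away from the origin while the plain one stays near zero, which is exactly the steering of forward samples that the compensation term in the backward SDE is designed to account for.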
36 Nonlinear Feynman-Kac: Results
[Figure: cost mean and variance per iteration; mean controlled-system trajectories (angle and angular velocity) per iteration for deterministic open-loop, deterministic feedback, and stochastic feedback schemes.]
Extensions: stochastic cooperative games; risk-sensitive stochastic control; control-affine and state/control-multiplicative dynamics.
37 Summary
Control theory (Richard Bellman, Lev Pontryagin), statistical physics (Richard Feynman, Mark Kac), parallel and neuromorphic computation, machine learning.
[Figure: velocity profiles \|(U_x, U_y)\| over time for the 6 m/s and 8 m/s targets.]
38 Autonomous Control and Decision Systems Lab
"My world is delayed." "My world is quantum mechanical." "My world is infinite dimensional." "My world is variational." "My world is parallel." "My world is probabilistic." "My world is multiple." "What are these guys talking about?"
Collaborators. Sponsors.
39 Thanks!
More informationStochastic Nonlinear Stabilization Part II: Inverse Optimality Hua Deng and Miroslav Krstic Department of Mechanical Engineering h
Stochastic Nonlinear Stabilization Part II: Inverse Optimality Hua Deng and Miroslav Krstic Department of Mechanical Engineering denghua@eng.umd.edu http://www.engr.umd.edu/~denghua/ University of Maryland
More informationMaximum Process Problems in Optimal Control Theory
J. Appl. Math. Stochastic Anal. Vol. 25, No., 25, (77-88) Research Report No. 423, 2, Dept. Theoret. Statist. Aarhus (2 pp) Maximum Process Problems in Optimal Control Theory GORAN PESKIR 3 Given a standard
More informationSolving First Order PDEs
Solving Ryan C. Trinity University Partial Differential Equations Lecture 2 Solving the transport equation Goal: Determine every function u(x, t) that solves u t +v u x = 0, where v is a fixed constant.
More informationReflected Brownian Motion
Chapter 6 Reflected Brownian Motion Often we encounter Diffusions in regions with boundary. If the process can reach the boundary from the interior in finite time with positive probability we need to decide
More information. Frobenius-Perron Operator ACC Workshop on Uncertainty Analysis & Estimation. Raktim Bhattacharya
.. Frobenius-Perron Operator 2014 ACC Workshop on Uncertainty Analysis & Estimation Raktim Bhattacharya Laboratory For Uncertainty Quantification Aerospace Engineering, Texas A&M University. uq.tamu.edu
More informationProbabilistic Differential Dynamic Programming
Probabilistic Differential Dynamic Programming Yunpeng Pan and Evangelos A. Theodorou Daniel Guggenheim School of Aerospace Engineering Institute for Robotics and Intelligent Machines Georgia Institute
More informationMatthew Zyskowski 1 Quanyan Zhu 2
Matthew Zyskowski 1 Quanyan Zhu 2 1 Decision Science, Credit Risk Office Barclaycard US 2 Department of Electrical Engineering Princeton University Outline Outline I Modern control theory with game-theoretic
More informationSolving First Order PDEs
Solving Ryan C. Trinity University Partial Differential Equations January 21, 2014 Solving the transport equation Goal: Determine every function u(x, t) that solves u t +v u x = 0, where v is a fixed constant.
More informationSolution Methods. Richard Lusby. Department of Management Engineering Technical University of Denmark
Solution Methods Richard Lusby Department of Management Engineering Technical University of Denmark Lecture Overview (jg Unconstrained Several Variables Quadratic Programming Separable Programming SUMT
More informationMan Kyu Im*, Un Cig Ji **, and Jae Hee Kim ***
JOURNAL OF THE CHUNGCHEONG MATHEMATICAL SOCIETY Volume 19, No. 4, December 26 GIRSANOV THEOREM FOR GAUSSIAN PROCESS WITH INDEPENDENT INCREMENTS Man Kyu Im*, Un Cig Ji **, and Jae Hee Kim *** Abstract.
More informationProf. Erhan Bayraktar (University of Michigan)
September 17, 2012 KAP 414 2:15 PM- 3:15 PM Prof. (University of Michigan) Abstract: We consider a zero-sum stochastic differential controller-and-stopper game in which the state process is a controlled
More informationTools of stochastic calculus
slides for the course Interest rate theory, University of Ljubljana, 212-13/I, part III József Gáll University of Debrecen Nov. 212 Jan. 213, Ljubljana Itô integral, summary of main facts Notations, basic
More informationControl of McKean-Vlasov Dynamics versus Mean Field Games
Control of McKean-Vlasov Dynamics versus Mean Field Games René Carmona (a),, François Delarue (b), and Aimé Lachapelle (a), (a) ORFE, Bendheim Center for Finance, Princeton University, Princeton, NJ 8544,
More informationQuadrotor Modeling and Control for DLO Transportation
Quadrotor Modeling and Control for DLO Transportation Thesis dissertation Advisor: Prof. Manuel Graña Computational Intelligence Group University of the Basque Country (UPV/EHU) Donostia Jun 24, 2016 Abstract
More informationMulti-Robot Active SLAM with Relative Entropy Optimization
Multi-Robot Active SLAM with Relative Entropy Optimization Michail Kontitsis, Evangelos A. Theodorou and Emanuel Todorov 3 Abstract This paper presents a new approach for Active Simultaneous Localization
More informationTheory in Model Predictive Control :" Constraint Satisfaction and Stability!
Theory in Model Predictive Control :" Constraint Satisfaction and Stability Colin Jones, Melanie Zeilinger Automatic Control Laboratory, EPFL Example: Cessna Citation Aircraft Linearized continuous-time
More informationExact Simulation of Diffusions and Jump Diffusions
Exact Simulation of Diffusions and Jump Diffusions A work by: Prof. Gareth O. Roberts Dr. Alexandros Beskos Dr. Omiros Papaspiliopoulos Dr. Bruno Casella 28 th May, 2008 Content 1 Exact Algorithm Construction
More informationEULER MARUYAMA APPROXIMATION FOR SDES WITH JUMPS AND NON-LIPSCHITZ COEFFICIENTS
Qiao, H. Osaka J. Math. 51 (14), 47 66 EULER MARUYAMA APPROXIMATION FOR SDES WITH JUMPS AND NON-LIPSCHITZ COEFFICIENTS HUIJIE QIAO (Received May 6, 11, revised May 1, 1) Abstract In this paper we show
More informationChapter 5. Pontryagin s Minimum Principle (Constrained OCP)
Chapter 5 Pontryagin s Minimum Principle (Constrained OCP) 1 Pontryagin s Minimum Principle Plant: (5-1) u () t U PI: (5-2) Boundary condition: The goal is to find Optimal Control. 2 Pontryagin s Minimum
More informationStochastic Differential Dynamic Programming
Stochastic Differential Dynamic Programming Evangelos heodorou, Yuval assa & Emo odorov Abstract Although there has been a significant amount of work in the area of stochastic optimal control theory towards
More informationXt i Xs i N(0, σ 2 (t s)) and they are independent. This implies that the density function of X t X s is a product of normal density functions:
174 BROWNIAN MOTION 8.4. Brownian motion in R d and the heat equation. The heat equation is a partial differential equation. We are going to convert it into a probabilistic equation by reversing time.
More informationStrauss PDEs 2e: Section Exercise 2 Page 1 of 6. Solve the completely inhomogeneous diffusion problem on the half-line
Strauss PDEs 2e: Section 3.3 - Exercise 2 Page of 6 Exercise 2 Solve the completely inhomogeneous diffusion problem on the half-line v t kv xx = f(x, t) for < x
More informationRisk-Sensitive Filtering and Smoothing via Reference Probability Methods
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 42, NO. 11, NOVEMBER 1997 1587 [5] M. A. Dahleh and J. B. Pearson, Minimization of a regulated response to a fixed input, IEEE Trans. Automat. Contr., vol.
More informationThe Mathematics of Continuous Time Contract Theory
The Mathematics of Continuous Time Contract Theory Ecole Polytechnique, France University of Michigan, April 3, 2018 Outline Introduction to moral hazard 1 Introduction to moral hazard 2 3 General formulation
More information