Controlled Diffusions and Hamilton-Jacobi Bellman Equations
1 Controlled Diffusions and Hamilton-Jacobi-Bellman Equations

Emo Todorov
Applied Mathematics and Computer Science & Engineering
University of Washington
Winter 2014

Emo Todorov (UW), AMATH/CSE 579, Winter 2014
2 Continuous-time formulation

Notation and terminology:

  x(t) ∈ R^n                      state vector
  u(t) ∈ R^m                      control vector
  ω(t) ∈ R^k                      Brownian motion (integral of white noise)
  dx = f(x,u) dt + G(x,u) dω      continuous-time dynamics
  Σ(x,u) = G(x,u) G(x,u)^T        noise covariance
  ℓ(x,u) ≥ 0                      cost for choosing control u in state x
  q_T(x) ≥ 0                      (optional) scalar cost at terminal states x ∈ T
  π(x) ∈ R^m                      control law
  v^π(x) ≥ 0                      value/cost-to-go function
  π*(x), v*(x)                    optimal control law and its value function
3 Stochastic differential equations and integrals

Ito diffusion / stochastic differential equation (SDE):

  dx = f(x) dt + g(x) dω

This cannot be written as ẋ = f(x) + g(x) ω̇ because ω̇ does not exist. The SDE means that the time-integrals of the two sides are equal:

  x(T) - x(0) = ∫₀^T f(x(t)) dt + ∫₀^T g(x(t)) dω(t)

The last term is an Ito integral. For an Ito process y(t) adapted to ω(t), i.e. depending on the sample path only up to time t, this integral is

Definition (Ito integral)
  ∫₀^T y(t) dω(t) ≜ lim_{N→∞} Σ_{i=0}^{N-1} y(t_i) (ω(t_{i+1}) - ω(t_i)),   0 = t_0 < t_1 < ... < t_N = T

Replacing y(t_i) with y((t_{i+1} + t_i)/2) yields the Stratonovich integral.
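The slides contain no code, but the definition above is easy to probe numerically. A minimal sketch (my own illustration, not from the lecture): simulate a Brownian path, form the left-endpoint (Ito) and midpoint (Stratonovich) sums for ∫ w dω, and observe that the two differ by half the quadratic variation, which itself converges to T.

```python
# Illustration of the Ito integral as a left-endpoint Riemann sum, with
# y(t) = w(t) itself as the integrand. For this choice the sums satisfy
# exact algebraic identities:
#   Ito sum          = (w(T)^2 - sum(dw_i^2)) / 2
#   Strat - Ito      = (1/2) * sum(dw_i^2)
# and sum(dw_i^2) -> T as the partition is refined.
import math, random

random.seed(0)
T, N = 1.0, 100_000
dt = T / N

# sample a Brownian path w(t_i) on the grid
w = [0.0]
for _ in range(N):
    w.append(w[-1] + random.gauss(0.0, math.sqrt(dt)))

dw = [w[i + 1] - w[i] for i in range(N)]

ito = sum(w[i] * dw[i] for i in range(N))                        # left endpoint
strat = sum(0.5 * (w[i] + w[i + 1]) * dw[i] for i in range(N))   # midpoint

quad_var = sum(d * d for d in dw)   # -> T  ([w, w]_T = T)

print(ito, strat, strat - ito, quad_var)
```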
4-5 Stochastic chain rule and integration by parts

A twice-differentiable function a(x) of an Ito diffusion dx = f(x) dt + g(x) dω is an Ito process (not necessarily a diffusion) which satisfies:

Lemma (Ito)
  da(x(t)) = a'(x(t)) dx(t) + (1/2) a''(x(t)) g(x(t))² dt

This is the stochastic version of the chain rule. There is also a stochastic version of integration by parts:

  x(T) y(T) - x(0) y(0) = ∫₀^T x(t) dy(t) + ∫₀^T y(t) dx(t) + [x, y]_T

The last term (which would be 0 if x(t) or y(t) were differentiable) is

Definition (quadratic covariation)
  [x, y]_T ≜ lim_{N→∞} Σ_{i=0}^{N-1} (x(t_{i+1}) - x(t_i)) (y(t_{i+1}) - y(t_i)),   0 = t_0 < t_1 < ... < t_N = T

For a diffusion with constant noise amplitude we have [x, x]_T = g² T.
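A minimal numerical sketch of the integration-by-parts formula (my own illustration; the two toy SDEs are arbitrary choices): on any grid the discrete analogue is an exact telescoping identity, and the quadratic-variation term [x, x]_T of a constant-amplitude diffusion approaches g²T.

```python
# Discrete integration by parts is exact on every partition:
#   x_N y_N - x_0 y_0 = sum x_i dy_i + sum y_i dx_i + sum dx_i dy_i
# The last sum is the discrete quadratic covariation; for dx = f dt + g dw
# with constant g, the quadratic variation sum(dx_i^2) -> g^2 T.
import math, random

random.seed(1)
T, N, g = 1.0, 100_000, 0.7
dt = T / N

x, y = [1.0], [2.0]
for _ in range(N):
    dw = random.gauss(0.0, math.sqrt(dt))          # shared Brownian increment
    x.append(x[-1] - x[-1] * dt + g * dw)          # dx = -x dt + g dw
    y.append(y[-1] + math.sin(y[-1]) * dt + g * dw)  # dy = sin(y) dt + g dw

dx = [x[i + 1] - x[i] for i in range(N)]
dy = [y[i + 1] - y[i] for i in range(N)]

lhs = x[-1] * y[-1] - x[0] * y[0]
rhs = (sum(x[i] * dy[i] for i in range(N))
       + sum(y[i] * dx[i] for i in range(N))
       + sum(dx[i] * dy[i] for i in range(N)))     # quadratic covariation term
qv_x = sum(d * d for d in dx)                      # -> g^2 * T

print(lhs, rhs, qv_x, g * g * T)
```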
6-7 Forward and backward equations, generator

Let p(y, s | x, t), s ≥ t denote the transition probability density under the Ito diffusion dx = f(x) dt + g(x) dω. Then p satisfies the following PDEs:

Theorem (Kolmogorov equations)
  forward (FP) equation:   ∂_s p = -∂_y (f(y) p) + (1/2) ∂²_y (g(y)² p)
  backward equation:      -∂_t p = f(x) ∂_x p + (1/2) g(x)² ∂²_x p = L[p(y, s | ·, t)]

The operator L which computes expected directional derivatives is called the generator of the stochastic process. It satisfies (in the vector case):

Theorem (generator)
  L[v(·)](x) ≜ lim_{Δ→0} ( E_{x(0)=x}[v(x(Δ))] - v(x) ) / Δ = f(x)^T v_x(x) + (1/2) tr(Σ(x) v_xx(x))
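The defining limit of the generator can be checked without sampling by using the exact Gaussian expectation of one Euler step. A scalar sketch (the toy drift, noise level, and test function are my choices, not from the slides):

```python
# Check  L[v](x) = f(x) v'(x) + (1/2) g^2 v''(x)  against the limit
# (E[v(x(Delta))] - v(x)) / Delta.  For one Euler step
#   x(Delta) = x + f(x) Delta + g sqrt(Delta) eps,  eps ~ N(0, 1),
# and v(x) = x^2, the expectation is available in closed form:
#   E[v(x(Delta))] = (x + f(x) Delta)^2 + g^2 Delta.
def f(x): return -x          # toy drift
g = 0.8                      # constant noise amplitude

def v(x): return x * x
def v1(x): return 2 * x      # v'
def v2(x): return 2.0        # v''

x0 = 1.5
gen = f(x0) * v1(x0) + 0.5 * g * g * v2(x0)     # generator formula

for dt in (1e-1, 1e-2, 1e-3):
    ev = (x0 + f(x0) * dt) ** 2 + g * g * dt    # exact one-step expectation
    print(dt, (ev - v(x0)) / dt, gen)           # difference quotient -> gen
```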
8 Discretizing the time axis

Consider the explicit Euler discretization with time step Δ:

  x(t + Δ) = x(t) + Δ f(x(t), u(t)) + √Δ G(x(t), u(t)) ε(t)

where ε(t) ~ N(0, I). The term √Δ appears because the variance grows linearly with time. Thus the transition probability p(x' | x, u) is Gaussian, with mean x + Δ f(x, u) and covariance matrix Δ Σ(x, u). The one-step cost is Δ ℓ(x, u). Now we can apply the Bellman equation (in the finite horizon setting):

  v(x, t) = min_u { Δ ℓ(x, u) + E_{x' ~ p(·|x,u)} [v(x', t + Δ)] }
          = min_u { Δ ℓ(x, u) + E_{d ~ N(Δ f(x,u), Δ Σ(x,u))} [v(x + d, t + Δ)] }

Next we use the Taylor-series expansion of v...
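A quick scalar sanity check of the Euler transition density (my own sketch; the particular f, G, and state are illustrative): sample many one-step transitions and confirm the stated Gaussian mean and variance.

```python
# For  x' = x + Delta*f(x,u) + sqrt(Delta)*G(x,u)*eps,  eps ~ N(0,1),
# the transition is Gaussian with mean x + Delta*f and variance Delta*G^2;
# the sqrt(Delta) factor is what makes the variance grow linearly in Delta.
import math, random

random.seed(2)
x, u, Delta = 0.5, 0.2, 0.01
def f(x, u): return -x + u       # toy drift
def G(x, u): return 0.6          # toy noise scale

samples = [x + Delta * f(x, u) + math.sqrt(Delta) * G(x, u) * random.gauss(0, 1)
           for _ in range(200_000)]
m = sum(samples) / len(samples)
var = sum((s - m) ** 2 for s in samples) / len(samples)

print(m, x + Delta * f(x, u))       # sample mean vs  x + Delta*f
print(var, Delta * G(x, u) ** 2)    # sample variance vs  Delta*G^2
```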
9-10 Hamilton-Jacobi-Bellman (HJB) equation

  v(x + d, t + Δ) = v(x, t) + Δ v_t(x, t) + O(Δ²) + d^T v_x(x, t) + (1/2) d^T v_xx(x, t) d + O(‖d‖³)

Using the fact that E[d^T M d] = tr(cov[d] M) + O(Δ²), the expectation is

  E_d [v(x + d, t + Δ)] = v(x, t) + Δ v_t(x, t) + Δ f(x, u)^T v_x(x, t) + (Δ/2) tr(Σ(x, u) v_xx(x, t)) + o(Δ)

Substituting in the Bellman equation,

  v(x, t) = min_u { Δ ℓ(x, u) + v(x, t) + Δ v_t(x, t) + Δ f(x, u)^T v_x(x, t) + (Δ/2) tr(Σ(x, u) v_xx(x, t)) + o(Δ) }

Simplifying, dividing by Δ and taking Δ → 0 yields the HJB equation

  -v_t(x, t) = min_u [ ℓ(x, u) + f(x, u)^T v_x(x, t) + (1/2) tr(Σ(x, u) v_xx(x, t)) ]
11 HJB equations for different problem formulations

Definition (Hamiltonian)
  H[x, u, v(·)] ≜ ℓ(x, u) + f(x, u)^T v_x(x) + (1/2) tr(Σ(x, u) v_xx(x)) = ℓ + L[v]

The HJB equations for the optimal cost-to-go v* are

Theorem (HJB equations)
  first exit:       0 = min_u H[x, u, v*(·)],               v*(x ∈ T) = q_T(x)
  finite horizon:  -v*_t(x, t) = min_u H[x, u, v*(·, t)],   v*(x, T) = q_T(x)
  discounted:      (1/τ) v*(x) = min_u H[x, u, v*(·)]
  average:          c = min_u H[x, u, ṽ*(·)]

Discounted cost-to-go: v^π(x) = E ∫₀^∞ exp(-t/τ) ℓ(x(t), u(t)) dt.
12 Existence and uniqueness of solutions

The HJB equation has at most one classic solution (i.e. a function which satisfies the PDE everywhere). If a classic solution exists, then it is the optimal cost-to-go function.

The HJB equation may not have a classic solution; in that case the optimal cost-to-go function is non-smooth (e.g. bang-bang control).

The HJB equation always has a unique viscosity solution, which is the optimal cost-to-go function.

Approximation schemes based on MDP discretization (see below) are guaranteed to converge to the unique viscosity solution / optimal cost-to-go function. Most continuous function approximation schemes (which scale better) are unable to represent non-smooth solutions. All examples of non-smoothness seem to be deterministic; noise tends to smooth the optimal cost-to-go function.
13 Example of noise smoothing

(figure)
14 More tractable problems

Consider a restricted family of problems with dynamics and cost

  dx = (a(x) + B(x) u) dt + C(x) dω
  ℓ(x, u) = q(x) + (1/2) u^T R(x) u

For such problems the Hamiltonian can be minimized analytically w.r.t. u. Suppressing the dependence on x for clarity, we have

  min_u H = min_u [ q + (1/2) u^T R u + (a + B u)^T v_x + (1/2) tr(C C^T v_xx) ]

The minimum is achieved at u* = -R⁻¹ B^T v_x, and the result is

  min_u H = q + a^T v_x + (1/2) tr(C C^T v_xx) - (1/2) v_x^T B R⁻¹ B^T v_x

Thus the HJB equations become 2nd-order quadratic PDEs, no longer involving the min operator.
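Because H is quadratic in u, the analytic minimizer can be confirmed by brute force. A scalar sketch (all constants here are illustrative, not from the slides):

```python
# For the scalar Hamiltonian
#   H(u) = q + (1/2) r u^2 + (a + b u) v_x + (1/2) c^2 v_xx
# the minimizer is u* = -(1/r) b v_x, and the minimized value is
#   q + a v_x + (1/2) c^2 v_xx - (1/2) b^2 v_x^2 / r.
# We confirm both against a fine grid search.
q, r, a, b, c = 2.0, 0.5, -1.0, 1.3, 0.4
v_x, v_xx = 0.9, 2.0

def H(u):
    return q + 0.5 * r * u * u + (a + b * u) * v_x + 0.5 * c * c * v_xx

u_star = -(1.0 / r) * b * v_x                      # analytic minimizer
u_grid = [-5 + 1e-4 * i for i in range(100_001)]   # grid on [-5, 5]
u_best = min(u_grid, key=H)

H_min = q + a * v_x + 0.5 * c * c * v_xx - 0.5 * b * b * v_x * v_x / r

print(u_star, u_best, H(u_star), H_min)
```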
15-16 More tractable problems (generalizations)

Allowing control-multiplicative noise:

  Σ(x, u) = C₀(x) C₀(x)^T + Σ_{k=1}^K C_k(x) u u^T C_k(x)^T

The optimal control law becomes:

  u* = -(R + Σ_{k=1}^K C_k^T v_xx C_k)⁻¹ B^T v_x

Allowing more general control costs:

  ℓ(x, u) = q(x) + Σ_i r(u_i),   r : convex

The optimal control law becomes:

  u* = argmin_u { Σ_i r(u_i) + u^T B^T v_x } = -(r')⁻¹ (B^T v_x)

  r(u) = ∫₀^|u| s atanh(w / u_max) dw   ⟹   u* = -u_max tanh(s⁻¹ B^T v_x)
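A scalar sketch of the atanh cost and its tanh control law (the constants, and the closed form I use for the integral defining r, are my own; the closed form is ∫ atanh(w/a) dw = w·atanh(w/a) + (a/2)·ln(1 - w²/a²)):

```python
# With cost rate  r(u) = integral_0^|u| s * atanh(w / u_max) dw,
# minimizing  r(u) + u * c  (where c = B^T v_x) gives
#   u* = -u_max * tanh(c / s),
# a control law that saturates smoothly at the limit u_max.
import math

s, u_max, c = 1.0, 2.0, 0.8

def r(u):
    # closed form of the integral (assumed derivation, easy to verify
    # by differentiating): w*atanh(w/a) + (a/2)*ln(1 - (w/a)^2)
    a, w = u_max, abs(u)
    return s * (w * math.atanh(w / a) + 0.5 * a * math.log(1 - (w / a) ** 2))

u_star = -u_max * math.tanh(c / s)     # analytic minimizer

# brute-force check on a grid strictly inside (-u_max, u_max)
lo = -0.999 * u_max
n = int(round(2 * 0.999 * u_max / 1e-4)) + 1
grid = [lo + 1e-4 * i for i in range(n)]
u_best = min(grid, key=lambda u: r(u) + u * c)

print(u_star, u_best)
```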
17 Pendulum example

  θ̈ = k sin(θ) + u

First-order form:

  x = (x₁, x₂) = (θ, θ̇)
  a(x) = (x₂, k sin(x₁)),   B = (0, 1)^T

Stochastic dynamics:

  dx = a(x) dt + B (u dt + σ dω)

Cost and optimal control:

  ℓ(x, u) = q(x) + (r/2) u²
  u*(x) = -(1/r) v_{x₂}(x)

HJB equation (discounted):

  (1/τ) v = q + x₂ v_{x₁} + k sin(x₁) v_{x₂} + (σ²/2) v_{x₂x₂} - (1/2r) v_{x₂}²
18 Pendulum example continued

Parameters: k = σ = r = 1, τ = 0.3, q = 1 - exp(-2θ²), β = 0.99.

Discretize the state space, approximate the derivatives via finite differences, and iterate:

  v^(n+1) = β v^(n) + (1 - β) τ min_u H^(n)
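A minimal sketch of this iteration (the grid sizes, boundary handling, and iteration count are my choices, not from the lecture): discretize (θ, θ̇), approximate v's derivatives by finite differences, and apply the damped fixed-point update above with the control already eliminated analytically.

```python
# Damped fixed-point iteration  v <- beta*v + (1-beta)*tau*min_u H  for the
# discounted pendulum HJB, with central differences (periodic in theta,
# clamped in theta_dot). Convergence is monitored via the max update size.
import math

k = sigma = r = 1.0
tau, beta = 0.3, 0.99

n1, n2 = 40, 41                        # theta (periodic) x theta_dot grid
h1 = 2 * math.pi / n1
h2 = 8.0 / (n2 - 1)
th = [-math.pi + i * h1 for i in range(n1)]
thd = [-4.0 + j * h2 for j in range(n2)]
q = [1 - math.exp(-2 * t * t) for t in th]   # state cost, zero at theta = 0
si = [k * math.sin(t) for t in th]

v = [[0.0] * n2 for _ in range(n1)]
res = []
for _ in range(2000):
    vn = [[0.0] * n2 for _ in range(n1)]
    for i in range(n1):
        ip, im = (i + 1) % n1, (i - 1) % n1             # periodic in theta
        for j in range(n2):
            jp, jm = min(j + 1, n2 - 1), max(j - 1, 0)  # clamped in theta_dot
            v1 = (v[ip][j] - v[im][j]) / (2 * h1)            # v_x1
            v2 = (v[i][jp] - v[i][jm]) / ((jp - jm) * h2)    # v_x2
            v22 = (v[i][jp] - 2 * v[i][j] + v[i][jm]) / (h2 * h2)
            # min_u H, with u* = -(1/r) v_x2 substituted analytically:
            Hmin = (q[i] + thd[j] * v1 + si[i] * v2
                    + 0.5 * sigma * sigma * v22 - v2 * v2 / (2 * r))
            vn[i][j] = beta * v[i][j] + (1 - beta) * tau * Hmin
    res.append(max(abs(vn[i][j] - v[i][j]) for i in range(n1) for j in range(n2)))
    v = vn

# cost-to-go should be low near the target (theta = 0) and high opposite it
print(res[0], res[-1], v[n1 // 2][n2 // 2], v[0][n2 // 2])
```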
19 MDP discretization

Define discrete state and control spaces X^(h) ⊂ R^n, U^(h) ⊂ R^m and discrete time step Δ^(h), where h is a "coarseness" parameter and h → 0 corresponds to infinitely dense discretization. Construct p^(h)(x'^(h) | x^(h), u^(h)) s.t.

Definition (local consistency)
  d ≜ x'^(h) - x^(h)
  E[d] = Δ^(h) f(x^(h), u^(h)) + o(Δ^(h))
  cov[d] = Δ^(h) Σ(x^(h), u^(h)) + o(Δ^(h))

In the limit h → 0 the MDP solution v^(h) converges to the solution v* of the continuous problem, even when v* is non-smooth (Kushner and Dupuis).
20-22 Constructing the MDP

For each x^(h), u^(h) choose vectors {v_i}_{i=1..K} such that all possible next states are x'^(h) = x^(h) + h v_i. Then compute p_i^(h) = p^(h)(x^(h) + h v_i | x^(h), u^(h)) in one of three ways:

(1) Find w_i, y_i s.t.
      Σ_i w_i v_i = f,   Σ_i w_i = 1,   w_i ≥ 0
      Σ_i y_i v_i v_i^T = Σ,   Σ_i y_i v_i = 0,   Σ_i y_i = 1,   y_i ≥ 0
    and set
      p_i^(h) = (h w_i + y_i) / (h + 1),   Δ^(h) = h² / (h + 1)

(2) Set Δ^(h) = h and minimize
      ‖ Σ/h - Σ_i p_i^(h) (v_i - f)(v_i - f)^T ‖²
    s.t.
      Σ_i p_i^(h) v_i = f,   Σ_i p_i^(h) = 1,   p_i^(h) ≥ 0

(3) Set Δ^(h) = h and
      p_i^(h) ∝ N(x^(h) + h v_i ; x^(h) + h f, h Σ)
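A one-dimensional sketch of a locally consistent construction (this is the standard Kushner-Dupuis upwind recipe; the specific scalar drift, variance, and spacing are my choices): next states are x ± h, and the probabilities and time step are chosen so the first two moments of d = x' - x match Δf and Δσ² up to o(Δ).

```python
# 1-D locally consistent MDP: transitions to x + h and x - h with
#   p+ = (sigma^2/2 + h*max(f,0)) / Q,   p- = (sigma^2/2 + h*max(-f,0)) / Q,
#   Q  = sigma^2 + h*|f|,                Delta = h^2 / Q.
# Then E[d] = Delta*f exactly, and var[d] = Delta*sigma^2 + o(Delta).
f, sigma2, h = 0.7, 0.5, 0.01      # drift, noise variance, grid spacing

Q = sigma2 + h * abs(f)
p_plus = (0.5 * sigma2 + h * max(f, 0.0)) / Q     # P(x -> x + h)
p_minus = (0.5 * sigma2 + h * max(-f, 0.0)) / Q   # P(x -> x - h)
Delta = h * h / Q                                 # time step Delta(h)

mean_d = h * p_plus - h * p_minus
var_d = h * h * (p_plus + p_minus) - mean_d ** 2

print(p_plus + p_minus)             # probabilities sum to 1
print(mean_d, Delta * f)            # equal by construction
print(var_d, Delta * sigma2)        # equal up to o(Delta)
```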
Physics 7b: Statistical Mechanics Brownian Motion Brownian motion is the motion of a particle due to the buffeting by the molecules in a gas or liquid. The particle must be small enough that the effects
More information2. As we shall see, we choose to write in terms of σ x because ( X ) 2 = σ 2 x.
Section 5.1 Simple One-Dimensional Problems: The Free Particle Page 9 The Free Particle Gaussian Wave Packets The Gaussian wave packet initial state is one of the few states for which both the { x } and
More informationKolmogorov Equations and Markov Processes
Kolmogorov Equations and Markov Processes May 3, 013 1 Transition measures and functions Consider a stochastic process {X(t)} t 0 whose state space is a product of intervals contained in R n. We define
More informationLecture 6: Bayesian Inference in SDE Models
Lecture 6: Bayesian Inference in SDE Models Bayesian Filtering and Smoothing Point of View Simo Särkkä Aalto University Simo Särkkä (Aalto) Lecture 6: Bayesian Inference in SDEs 1 / 45 Contents 1 SDEs
More informationFinite-Horizon Optimal State-Feedback Control of Nonlinear Stochastic Systems Based on a Minimum Principle
Finite-Horizon Optimal State-Feedbac Control of Nonlinear Stochastic Systems Based on a Minimum Principle Marc P Deisenroth, Toshiyui Ohtsua, Florian Weissel, Dietrich Brunn, and Uwe D Hanebec Abstract
More informationSTOCHASTIC PERRON S METHOD AND VERIFICATION WITHOUT SMOOTHNESS USING VISCOSITY COMPARISON: OBSTACLE PROBLEMS AND DYNKIN GAMES
STOCHASTIC PERRON S METHOD AND VERIFICATION WITHOUT SMOOTHNESS USING VISCOSITY COMPARISON: OBSTACLE PROBLEMS AND DYNKIN GAMES ERHAN BAYRAKTAR AND MIHAI SÎRBU Abstract. We adapt the Stochastic Perron s
More informationRobust control and applications in economic theory
Robust control and applications in economic theory In honour of Professor Emeritus Grigoris Kalogeropoulos on the occasion of his retirement A. N. Yannacopoulos Department of Statistics AUEB 24 May 2013
More informationHandbook of Stochastic Methods
C. W. Gardiner Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences Third Edition With 30 Figures Springer Contents 1. A Historical Introduction 1 1.1 Motivation I 1.2 Some Historical
More informationMalliavin Calculus in Finance
Malliavin Calculus in Finance Peter K. Friz 1 Greeks and the logarithmic derivative trick Model an underlying assent by a Markov process with values in R m with dynamics described by the SDE dx t = b(x
More informationIntroduction to Reachability Somil Bansal Hybrid Systems Lab, UC Berkeley
Introduction to Reachability Somil Bansal Hybrid Systems Lab, UC Berkeley Outline Introduction to optimal control Reachability as an optimal control problem Various shades of reachability Goal of This
More informationViscosity Solutions for Dummies (including Economists)
Viscosity Solutions for Dummies (including Economists) Online Appendix to Income and Wealth Distribution in Macroeconomics: A Continuous-Time Approach written by Benjamin Moll August 13, 2017 1 Viscosity
More informationChap. 1. Some Differential Geometric Tools
Chap. 1. Some Differential Geometric Tools 1. Manifold, Diffeomorphism 1.1. The Implicit Function Theorem ϕ : U R n R n p (0 p < n), of class C k (k 1) x 0 U such that ϕ(x 0 ) = 0 rank Dϕ(x) = n p x U
More informationStochastic Homogenization for Reaction-Diffusion Equations
Stochastic Homogenization for Reaction-Diffusion Equations Jessica Lin McGill University Joint Work with Andrej Zlatoš June 18, 2018 Motivation: Forest Fires ç ç ç ç ç ç ç ç ç ç Motivation: Forest Fires
More informationXt i Xs i N(0, σ 2 (t s)) and they are independent. This implies that the density function of X t X s is a product of normal density functions:
174 BROWNIAN MOTION 8.4. Brownian motion in R d and the heat equation. The heat equation is a partial differential equation. We are going to convert it into a probabilistic equation by reversing time.
More informationStochastic differential equation models in biology Susanne Ditlevsen
Stochastic differential equation models in biology Susanne Ditlevsen Introduction This chapter is concerned with continuous time processes, which are often modeled as a system of ordinary differential
More information04. Random Variables: Concepts
University of Rhode Island DigitalCommons@URI Nonequilibrium Statistical Physics Physics Course Materials 215 4. Random Variables: Concepts Gerhard Müller University of Rhode Island, gmuller@uri.edu Creative
More informationp 1 ( Y p dp) 1/p ( X p dp) 1 1 p
Doob s inequality Let X(t) be a right continuous submartingale with respect to F(t), t 1 P(sup s t X(s) λ) 1 λ {sup s t X(s) λ} X + (t)dp 2 For 1 < p
More informationPDEs in Image Processing, Tutorials
PDEs in Image Processing, Tutorials Markus Grasmair Vienna, Winter Term 2010 2011 Direct Methods Let X be a topological space and R: X R {+ } some functional. following definitions: The mapping R is lower
More informationSteady State Kalman Filter
Steady State Kalman Filter Infinite Horizon LQ Control: ẋ = Ax + Bu R positive definite, Q = Q T 2Q 1 2. (A, B) stabilizable, (A, Q 1 2) detectable. Solve for the positive (semi-) definite P in the ARE:
More informationLecture 3: Hamilton-Jacobi-Bellman Equations. Distributional Macroeconomics. Benjamin Moll. Part II of ECON Harvard University, Spring
Lecture 3: Hamilton-Jacobi-Bellman Equations Distributional Macroeconomics Part II of ECON 2149 Benjamin Moll Harvard University, Spring 2018 1 Outline 1. Hamilton-Jacobi-Bellman equations in deterministic
More informationDUBLIN CITY UNIVERSITY
DUBLIN CITY UNIVERSITY SAMPLE EXAMINATIONS 2017/2018 MODULE: QUALIFICATIONS: Simulation for Finance MS455 B.Sc. Actuarial Mathematics ACM B.Sc. Financial Mathematics FIM YEAR OF STUDY: 4 EXAMINERS: Mr
More information