Optimal Control of Partially Observable Piecewise Deterministic Markov Processes


1 Optimal Control of Partially Observable Piecewise Deterministic Markov Processes
Nicole Bäuerle, based on joint work with D. Lange
Wien, April 2018

2 Outline
Motivation
PDMP with Partial Observation
Controlled PDMP with Partial Observation
Reduction to a Discrete-Time Problem
The Filter
Existence Results
Application

3 Motivation
[Figure: a typical path of a PDMP: the process state over time, with the 1st and 2nd jump times and the running cost indicated.]

4–7 Ingredients of a PDMP
In order to define a PDMP we need the following data:
- The drift $\Phi : \mathbb{R}^d \times \mathbb{R}_+ \to \mathbb{R}^d$ is continuous and satisfies the flow property $\Phi(y, t+s) = \Phi(\Phi(y, s), t)$ for all $y \in \mathbb{R}^d$ and $s, t > 0$.
- The jump times $0 =: T_0 < T_1 < \ldots$ are $\mathbb{R}_+$-valued random variables with sojourn times $S_n := T_n - T_{n-1}$, $n \in \mathbb{N}$, $S_0 := 0$. They are generated by an intensity $\lambda : \mathbb{R}^d \to (0, \infty)$.
- A transition kernel $Q$ from $\mathbb{R}^d$ to $\mathbb{R}^d$ gives the probability $Q(B \mid y)$ that the process jumps into $B$ given the state $y$.
The PDMP is then defined by
$$Y_t := \Phi(Y_{T_n}, t - T_n) \quad \text{for } T_n \le t < T_{n+1},\ n \in \mathbb{N}_0.$$

8 A PDMP with Partial Observation
Let $(\epsilon_n)_{n \in \mathbb{N}}$ be iid $\mathbb{R}^d$-valued random variables. We assume that the controller is only able to observe $X_n := Y_{T_n} + \epsilon_n$. Define $\Lambda(y, t) := \int_0^t \lambda(\Phi(y, s))\, ds$. Then
$$\mathbb{P}_{x,y}\big(S_n \le t,\, Y_{T_n} \in C,\, X_n \in D \mid S_0, Y_{T_0}, X_0, \ldots, S_{n-1}, Y_{T_{n-1}}, X_{n-1}\big) = \int_0^t \exp\big(-\Lambda(Y_{T_{n-1}}, s)\big)\, \lambda\big(\Phi(Y_{T_{n-1}}, s)\big) \int_C Q^{\epsilon}(D \mid y')\, Q\big(dy' \mid \Phi(Y_{T_{n-1}}, s)\big)\, ds.$$
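To make the dynamics concrete, here is a minimal simulation sketch of an (uncontrolled) PDMP observed only through noisy marks at the jump times. All ingredients (`phi`, `lam`, `jump`, the noise scale) are hypothetical stand-ins rather than anything from the talk; the jump times are drawn by thinning against the upper bound `lam_max`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ingredients (illustrative stand-ins, not the talk's example):
phi = lambda y, t: y * np.exp(-t)            # deterministic flow Phi(y, t)
lam = lambda y: 1.0 + 0.5 * np.tanh(y)       # jump intensity lambda(y), bounded
lam_max = 1.5                                # upper bound on lambda, used for thinning
jump = lambda y: y + rng.normal()            # draw from the jump kernel Q(. | y)

def simulate_popdmp(y0, horizon, noise_sd=0.1):
    """Simulate the marked point process (T_n, Y_{T_n}, X_n) up to `horizon`."""
    t, y = 0.0, y0
    marks = [(0.0, y0, y0 + noise_sd * rng.normal())]   # (T_0, Y_{T_0}, X_0)
    while True:
        # Thinning: propose rate-lam_max exponential waiting times and accept
        # a proposal at elapsed time s with probability lambda(Phi(y, s)) / lam_max.
        s = 0.0
        while True:
            s += rng.exponential(1.0 / lam_max)
            if rng.uniform() < lam(phi(y, s)) / lam_max:
                break
        if t + s > horizon:
            return marks
        t, y = t + s, jump(phi(y, s))             # jump from the flowed state
        x = y + noise_sd * rng.normal()           # noisy observation X_n = Y_{T_n} + eps_n
        marks.append((t, y, x))

print(simulate_popdmp(y0=1.0, horizon=5.0))
```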

9–13 Policies for POPDMP
Consider the marked point process $(T_n, \hat Y_n := Y_{T_n}, X_n)$.
Relaxed controls on the action space $A$: $\mathcal{R} := \{ r : [0, \infty) \to \mathcal{P}(A) \mid r \text{ measurable} \}$.
Observable histories: $H_0 := \mathbb{R}^d$ and, for $n \in \mathbb{N}$, $H_n := H_{n-1} \times \mathcal{R} \times \mathbb{R}_+ \times \mathbb{R}^d$ with elements $h_n = (x_0, r_0, s_1, x_1, \ldots, r_{n-1}, s_n, x_n) \in H_n$.
A discrete-time history-dependent relaxed control policy is $\pi^D := (\pi^D_0, \pi^D_1, \ldots)$ where each $\pi^D_n : H_n \to \mathcal{R}$ is measurable.
Continuous control:
$$\pi_t := \sum_{n=0}^{\infty} 1_{\{T_n \le t < T_{n+1}\}}(t)\, \pi^D_n(H_n)(t - T_n).$$
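Operationally, the continuous control is a lookup: find the last jump time before $t$ and evaluate the relaxed control chosen there at the elapsed time. A small sketch, assuming Dirac relaxed controls represented directly by action-valued functions (names are illustrative):

```python
import bisect

def continuous_control(t, jump_times, relaxed_controls):
    """Evaluate pi_t: take the relaxed control chosen at the last jump time
    T_n <= t and evaluate it at the elapsed time t - T_n."""
    n = bisect.bisect_right(jump_times, t) - 1
    return relaxed_controls[n](t - jump_times[n])

# Example with two Dirac relaxed controls, encoded as functions of elapsed time:
jump_times = [0.0, 1.3]
relaxed_controls = [lambda s: 1.0 if s <= 0.5 else 0.0,    # chosen at T_0
                    lambda s: -1.0]                         # chosen at T_1
print(continuous_control(0.4, jump_times, relaxed_controls))   # -> 1.0
print(continuous_control(2.0, jump_times, relaxed_controls))   # -> -1.0
```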

14–17 Controlled Data
We assume that the policy influences the drift, the intensity and the transition kernel:
- Drift $\Phi : \mathcal{R} \times \mathbb{R}^d \times \mathbb{R}_+ \to \mathbb{R}^d$, written $\Phi^r(y, t)$.
- Jump rate $\lambda^A : \mathbb{R}^d \times A \to (0, \infty)$.
- Jump kernel $Q^A$ from $\mathbb{R}^d \times A$ to $\mathbb{R}^d$.

18 Controlled Process
A Controlled Partially Observable Piecewise Deterministic Markov Process with local characteristics $(\Phi^r, \lambda^A, Q^A, Q^{\epsilon})$ satisfies
$$\mathbb{P}^{\pi^D}_{x,y}\big(S_n \le t,\, \hat Y_n \in C,\, X_n \in D \mid S_0, \ldots, S_{n-1}, \hat Y_{n-1}, X_{n-1}, \pi^D_{n-1}\big) = \int_0^t e^{-\Lambda^{\pi^D_{n-1}}(\hat Y_{n-1}, s)} \int_A \lambda^A\big(\Phi^{\pi^D_{n-1}}(\hat Y_{n-1}, s), a\big) \int_C Q^{\epsilon}(D \mid y')\, Q^A\big(dy' \mid \Phi^{\pi^D_{n-1}}(\hat Y_{n-1}, s), a\big)\, \pi^D_{n-1}(H_{n-1}, s)(da)\, ds,$$
where $\Lambda^r(y, t) := \int_0^t \int_A \lambda^A(\Phi^r(y, s), a)\, r_s(da)\, ds$.

19–21 The Optimization Problem
For a fixed policy $\pi^D$ we define the cost as
$$J(x, \pi^D) := \int \mathbb{E}^{\pi^D}_{x,y}\left[ \int_0^{\infty} e^{-\beta t} \int_A c(Y_t, a)\, \pi_t(da)\, dt \right] Q_0(dy \mid x).$$
The value function is defined as $J(x) := \inf_{\pi^D} J(x, \pi^D)$ for all $x \in \mathbb{R}^d$.
A policy $\pi^*$ is optimal if $J(x) = J(x, \pi^*)$ for all $x \in \mathbb{R}^d$.

22–26 A discrete-time POMDP
We define a POMDP as follows:
- State space $\mathbb{R}_+ \times \mathbb{R}^{2d} \ni (s, y, x)$.
- Action space $\mathcal{R} \ni r$.
- Transition law
$$\tilde Q\big([0, t] \times C \times D \mid y, r\big) = \int_0^t e^{-\Gamma^r(y, u)} \int_A \lambda^A\big(\Phi^r(y, u), a\big) \int_C Q^{\epsilon}(D \mid y')\, Q^A\big(dy' \mid \Phi^r(y, u), a\big)\, r_u(da)\, du$$
with $\Gamma^r(y, t) := \beta t + \int_0^t \int_A \lambda^A(\Phi^r(y, u), a)\, r_u(da)\, du$.
- The one-stage cost
$$g(y, r) = \int_0^{\infty} e^{-\Gamma^r(y, t)} \int_A c(\Phi^r(y, t), a)\, r_t(da)\, dt.$$
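For a deterministic (Dirac) relaxed control $r_t = \delta_{a(t)}$ the one-stage cost $g(y, r)$ reduces to a one-dimensional integral along the controlled flow, which can be approximated on a time grid. A sketch with hypothetical cost, rate and drift functions (not the talk's example):

```python
import numpy as np

# Hypothetical model ingredients (illustrative only):
beta = 1.0
c = lambda y, a: y**2 + 0.1 * a**2        # running cost c(y, a)
lam_A = lambda y, a: 1.0                  # controlled jump rate lambda^A(y, a)

def one_stage_cost(y0, action, drift, t_max=20.0, n=4000):
    """Approximate g(y, r) for a Dirac relaxed control r_t = delta_{action(t)}:
    integrate e^{-Gamma^r(y,t)} c(Phi^r(y,t), a(t)) dt on a uniform grid,
    updating the flow and the exponent Gamma^r by forward Euler steps."""
    dt = t_max / n
    y, gamma, g = y0, 0.0, 0.0
    for k in range(n):
        a = action(k * dt)
        g += np.exp(-gamma) * c(y, a) * dt
        gamma += (beta + lam_A(y, a)) * dt     # Gamma^r(y, t) = beta*t + int lambda^A
        y += drift(y, a) * dt                  # Euler step for the controlled flow
    return g

# Example: drift dy/dt = a, constant action a = -1, started from y0 = 2.
print(one_stage_cost(2.0, action=lambda t: -1.0, drift=lambda y, a: a))
```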

27–29 The Problem for the discrete-time POMDP
For a fixed policy $\pi^D$ we define the cost as
$$\tilde J(x, \pi^D) := \tilde{\mathbb{E}}^{\pi^D}_x\left[ \sum_{k=0}^{\infty} g\big(\hat Y_k, \pi^D_k(H_k)\big) \right].$$
The value function is defined as $\tilde J(x) := \inf_{\pi^D} \tilde J(x, \pi^D)$, $x \in \mathbb{R}^d$.
A policy $\pi^*$ is optimal if $\tilde J(x) = \tilde J(x, \pi^*)$, $x \in \mathbb{R}^d$.

30 Equivalence of the two Problems
Theorem. Let $x \in \mathbb{R}^d$ be an initial observation and $\pi^D$ a policy. Then it holds that $J(x, \pi^D) = \tilde J(x, \pi^D)$.

31–35 Assumptions
(C1) The action space $A$ is a compact metric space.
(C2) $\lambda^A : \mathbb{R}^d \times A \to (0, \infty)$ is continuous and $0 < \underline{\lambda} < \lambda^A < \bar{\lambda}$.
(C3) $Q^A$ is weakly continuous, i.e. $(x, a) \mapsto \int v(z)\, Q^A(dz \mid x, a)$ is continuous and bounded for all continuous and bounded $v : \mathbb{R}^d \to \mathbb{R}$.
(B1) There exists $E_0 = \{y_1, \ldots, y_d\} \subset \mathbb{R}^d$ such that for all $y \in \mathbb{R}^d$ and $a \in A$ we have $Q^A(E_0 \mid y, a) = 1$ and $Q_0(E_0) = 1$.
(B2) $Q^{\epsilon}$ has a bounded density $f^{\epsilon}$ with respect to some $\sigma$-finite measure $\nu$.

36–37 The Filter
Assumption (B2) implies that the transition law has the density
$$q(s, y', x \mid y, r) = e^{-\Gamma^r(y, s)}\, f^{\epsilon}(x \mid y') \int_A \lambda^A\big(\Phi^r(y, s), a\big)\, Q^A\big(y' \mid \Phi^r(y, s), a\big)\, r_s(da).$$
Define for $y' \in E_0$ the updating operator
$$\Psi(\rho, r, s, x)(y') := \frac{\sum_{y \in E_0} q(s, y', x \mid y, r)\, \rho(y)}{\sum_{\hat y \in E_0} \sum_{y \in E_0} q(s, \hat y, x \mid y, r)\, \rho(y)}$$
and, for $h_n = (x_0, r_0, s_1, x_1, \ldots, r_{n-1}, s_n, x_n)$,
$$\mu_0(\cdot \mid x_0) := Q_0(\cdot \mid x_0), \qquad \mu_n(\cdot \mid h_n) = \mu_n(\cdot \mid h_{n-1}, r_{n-1}, s_n, x_n) := \Psi\big(\mu_{n-1}(\cdot \mid h_{n-1}), r_{n-1}, s_n, x_n\big).$$
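Because $E_0$ is finite, the filter lives on the probability simplex over $E_0$ and the updating operator $\Psi$ is a single Bayes step. A minimal sketch, where the matrix of density values $q(s, y', x \mid y, r)$ is a made-up placeholder:

```python
import numpy as np

def filter_update(rho, q_matrix):
    """One Bayes step of the updating operator Psi on the finite set E_0.

    rho      : current filter, a probability vector over E_0
    q_matrix : q_matrix[i, j] = q(s, y_j, x | y_i, r), the transition density
               evaluated at the observed sojourn time s and observation x for
               the relaxed control r actually used.
    """
    unnormalized = rho @ q_matrix              # sum_y q(s, y', x | y, r) rho(y)
    return unnormalized / unnormalized.sum()   # normalize over y' in E_0

# Illustrative numbers only (three points in E_0, arbitrary density values):
rho = np.array([0.5, 0.3, 0.2])
q_matrix = np.array([[0.10, 0.60, 0.30],
                     [0.20, 0.20, 0.60],
                     [0.05, 0.15, 0.80]])
print(filter_update(rho, q_matrix))
```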

38 The Filter
Theorem. It holds that
$$\mathbb{P}^{\pi^D}_x\big(\hat Y_n \in C \mid X_0, R_0, S_1, X_1, \ldots, R_{n-1}, S_n, X_n\big) = \mu_n\big(C \mid X_0, R_0, S_1, X_1, \ldots, R_{n-1}, S_n, X_n\big).$$

39–43 Filtered MDP
We define an MDP as follows:
- State space $\mathcal{P}(E_0) \ni \rho$.
- Action space $\mathcal{R} \ni r$.
- Transition law
$$\hat Q(B \mid \rho, r) = \sum_{y \in E_0} \int \int 1_B\big(\Psi(\rho, r, s, x)\big)\, q^{SX}(s, x \mid y, r)\, \nu(dx)\, ds\, \rho(y),$$
where $q^{SX}(s, x \mid y, r) := \sum_{y' \in E_0} q(s, y', x \mid y, r)$.
- The one-stage cost $\hat g(\rho, r) := \sum_{y \in E_0} g(y, r)\, \rho(y)$.

44–46 Filtered MDP
A policy $\pi = (f_0, f_1, \ldots)$ with $f_n : \mathcal{P}(E_0) \to \mathcal{R}$ can be seen as a special policy $\pi^D \in \Pi^D$ by setting $\pi^D_n(h_n) := f_n\big(\mu_n(\cdot \mid h_n)\big)$.
We denote the cost of policy $\pi$ by
$$V(\rho, \pi) := \hat{\mathbb{E}}^{\pi}_{\rho}\left[ \sum_{n=0}^{\infty} \hat g\big(\mu_n, f_n(\mu_n)\big) \right].$$
The value function is defined as $V(\rho) := \inf_{\pi \in \Pi} V(\rho, \pi)$ for all $\rho \in \mathcal{P}(E_0)$.

47 Equivalence of the two Problems
Theorem. Let $x \in \mathbb{R}^d$ be an initial observation, and let $\pi$ and $\pi^D$ be given as on the previous slide. Then it holds that $V\big(Q_0(\cdot \mid x), \pi\big) = J(x, \pi^D)$.

48–49 Further Assumptions
(C4) $r \mapsto \Phi^r(y, t)$ is continuous for all $y \in E_0$, $t \ge 0$.
(C5) $c : \mathbb{R}^d \times A \to \mathbb{R}_+$ is lower semi-continuous with respect to the product topology.

50–51 Regularization of the Filter
Let $h_{\sigma} : \mathbb{R} \to \mathbb{R}$, $\sigma > 0$, be a regularization kernel, i.e.
(i) $h_{\sigma}(t) \ge 0$ for all $t \in \mathbb{R}$,
(ii) $\int_{\mathbb{R}} h_{\sigma}(t)\, dt = 1$,
(iii) $\lim_{\sigma \to 0} \int_{-a}^{a} h_{\sigma}(t)\, dt = 1$ for all $a > 0$.
We use a regularized filter of the form
$$\hat\Psi(\rho, r, s, x)(y') := \frac{\int_{\mathbb{R}} \sum_{y \in E_0} q(u, y', x \mid y, r)\, \rho(y)\, h_{\sigma}(s - u)\, du}{\int_{\mathbb{R}} \sum_{\hat y \in E_0} \sum_{y \in E_0} q(u, \hat y, x \mid y, r)\, \rho(y)\, h_{\sigma}(s - u)\, du}.$$
Note that we have $\lim_{\sigma \to 0} \hat\Psi = \Psi$.
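Numerically, the regularization just smooths the density in its time argument before the Bayes step. A sketch with a Gaussian choice of $h_\sigma$ and a made-up density matrix (both are illustrative assumptions, not the talk's specification):

```python
import numpy as np

def regularized_filter_update(rho, q_of_u, s, sigma, u_grid):
    """Regularized update: smooth the density in its time argument with a
    Gaussian kernel h_sigma before normalizing. `q_of_u(u)` must return the
    matrix q(u, y', x | y, r) over E_0 x E_0 for the observed x and control r."""
    h = lambda t: np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    du = u_grid[1] - u_grid[0]
    smoothed = sum((rho @ q_of_u(u)) * h(s - u) * du for u in u_grid)
    return smoothed / smoothed.sum()

# Illustrative use with a made-up density whose shape varies in u:
rho = np.array([0.5, 0.3, 0.2])
base = np.array([[0.10, 0.60, 0.30],
                 [0.20, 0.20, 0.60],
                 [0.05, 0.15, 0.80]])
q_of_u = lambda u: base * np.exp(-u) * np.array([1.0, 1.0 + u, 1.0])
u_grid = np.linspace(0.0, 10.0, 1001)
print(regularized_filter_update(rho, q_of_u, s=1.2, sigma=0.3, u_grid=u_grid))
```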

52 Solution of the Problem
Theorem. Under all assumptions we have for the regularized version of the problem:
a) $V_{\sigma}(\rho) = \inf_{r \in \mathcal{R}} \big\{ \hat g(\rho, r) + \int_{\mathcal{P}(E_0)} V_{\sigma}(\rho')\, \hat Q(d\rho' \mid \rho, r) \big\}$.
b) There exists an $f$ which attains the minimum in a), and $(f, f, \ldots)$ is optimal for the filtered problem. The optimal policy for the original problem is $(\pi^D_0, \pi^D_1, \ldots)$ with
$$\pi^D_0(x)(t) = f\big(Q_0(\cdot \mid x)\big)(t), \quad x \in \mathbb{R}^d, \qquad \pi^D_n(h_n)(t) = f\big(\mu_n(\cdot \mid h_n)\big)(t), \quad h_n \in H_n.$$
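Part a) is a fixed-point equation, so once the belief space $\mathcal{P}(E_0)$ and the relaxed controls are discretized it can be solved by value iteration. The sketch below runs the iteration on a made-up finite stand-in (three belief grid points, two controls); the discount is taken to be absorbed into a substochastic $\hat Q$, in line with the $e^{-\Gamma^r}$ factor above:

```python
import numpy as np

def value_iteration(g_hat, Q_hat, tol=1e-10):
    """Iterate V <- min_r { g_hat + Q_hat V } to the fixed point of a).

    g_hat : (S, R) array of one-stage costs on the discretized belief grid
    Q_hat : (S, R, S) array of substochastic transition probabilities
            (the discount is already inside Q_hat in this formulation).
    """
    V = np.zeros(g_hat.shape[0])
    while True:
        q_values = g_hat + np.einsum('srq,q->sr', Q_hat, V)   # cost-to-go per (rho, r)
        V_new = q_values.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, q_values.argmin(axis=1)              # value and greedy control index
        V = V_new

# Tiny made-up instance: 3 belief grid points, 2 controls, rows summing to 0.9.
g_hat = np.array([[1.0, 2.0], [0.5, 0.2], [2.0, 1.0]])
Q_hat = 0.9 * np.array([[[0.8, 0.2, 0.0], [0.1, 0.8, 0.1]],
                        [[0.3, 0.6, 0.1], [0.2, 0.7, 0.1]],
                        [[0.0, 0.5, 0.5], [0.1, 0.1, 0.8]]])
print(value_iteration(g_hat, Q_hat))
```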

53 Application: Toy Example
(i) The state space is $\mathbb{R}$, $E_0 := \{-2, 0, 2\}$, the action space is $A := [-1, 1]$.
(ii) The controlled drift is given by $\frac{d}{dt}\Phi^r(y, t) = \int_A a\, r_t(da)$, $\Phi(y, 0) = y$.
(iii) We set $\lambda^A \equiv 1$ and $\beta := 1$.
(iv) The jump transition kernel $Q^A$ and the cost function are given in the picture on the next slide.
(vii) Signal density: $f^{\epsilon}(-1) = f^{\epsilon}(0) = f^{\epsilon}(1) = \frac{1}{3}$.

54 Application: Transition Kernel and Cost Function
[Figure: the cost function $c(y)$ and the jump kernel, which is a Dirac measure $Q(y; \cdot) \in \{\delta_{-2}(\cdot), \delta_0(\cdot), \delta_2(\cdot)\}$ depending on the region in which $y$ lies.]

55 Solution of the Problem: Optimal Policy
Theorem. In this POPDMP all assumptions which we previously made are satisfied, and there exists an optimal policy.
In this example we have also computed the value function and the optimal policy numerically by value iteration. The optimal policy we obtained is of the form
$$\pi_n(h_n, t) := \pm 1_{\{t \le \frac{1}{2}\}},$$
i.e. the controller moves at full speed in one direction up to time $\frac{1}{2}$ after a jump, with the direction determined by comparing the filter components $\mu^1_n$ and $\mu^3_n$.

56 Solution of the Problem: Value Function

57 References
- N.B. and Lange, D.: Optimal control of partially observable piecewise deterministic Markov processes. SIAM J. Control Optim., 56, 2018.
- N.B. and Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, 2011.
- Brandejsky, A., de Saporta, B. and Dufour, F.: Optimal stopping for partially observed piecewise-deterministic Markov processes. Stochastic Process. Appl., 123, 2013.
- Davis, M.H.A.: Markov Models and Optimization. Chapman and Hall, 1993.
- Yushkevich, A. A.: On reducing a jump controllable Markov model to a model with discrete time. Theory Probab. Appl., 25, 58-69, 1980.

58 References Thank you very much for your attention!
