Stochastic Modelling of Biological Processes Lecture Notes


Stochastic Modelling of Biological Processes
Lecture Notes

Ruth Baker
Hilary Term 2018

Abstract

These lecture notes have been written to accompany the Part B course Stochastic Modelling of Biological Processes. They are based heavily on lecture notes originally written by Radek Erban for the Part C course of the same name, which will form part of the planned book R. Erban and S. J. Chapman, Stochastic Modelling of Reaction-Diffusion Processes. A huge thanks to Radek for letting me adapt this material for Part B.

Matlab codes for most of the examples in these notes can be found on the course webpages. Although the final examination does not include computer practicals, you are encouraged to implement the algorithms contained in these notes, as doing so will greatly aid understanding. General suggested reading material is listed on the course webpages, and more specific references can be found at the end of each lecture.

Please email me with any mistakes, however large or small!

Ruth Baker, HT 2018.

Contents

1 Modelling of degradation
    1.1 The chemical master equation
    1.2 Stochastic simulation
    1.3 Connection with the reaction rate equation
    1.4 References and further reading
    1.5 Tasks
    1.6 Example Matlab code
2 Modelling production and degradation
    2.1 The chemical master equation
    2.2 The stationary distribution
    2.3 Stochastic simulation
    2.4 References and further reading
    2.5 Tasks
    2.6 Example Matlab code
3 Modelling general chemical reactions
    3.1 The chemical master equation
    3.2 Stochastic simulation
    3.3 Example
    3.4 References and further reading
    3.5 Tasks
    3.6 Example Matlab code
4 Stochastic versus deterministic modelling
    4.1 Stochastic modelling of dimerization
    4.2 Stochastic focussing
    4.3 References and further reading
    4.4 Tasks
    4.5 Example Matlab code
5 Connection to stochastic differential equations
    5.1 The tau-leap method
    5.2 The chemical Langevin equation
    5.3 References and further reading

    5.4 Tasks
    5.5 Example Matlab code
6 Introduction to stochastic differential equations
    6.1 A computational definition of a stochastic differential equation
    6.2 Example
    6.3 Example
    6.4 Example
    6.5 References and further reading
    6.6 Tasks
    6.7 Example Matlab code
7 The Fokker-Planck equation
    7.1 Derivation of the Fokker-Planck equation
    7.2 The stationary distribution
    7.3 References and further reading
    7.4 Tasks
    7.5 Example Matlab code
8 The backward Kolmogorov equation
    8.1 Derivation of the backward Kolmogorov equation
    8.2 The diffusion coefficient
    8.3 Average switching times
    8.4 Example 3 of Lecture
    8.5 References and further reading
    8.6 Tasks
    8.7 Example Matlab code
9 The chemical Fokker-Planck equation
    9.1 Example: production and degradation
    9.2 The chemical Fokker-Planck equation
    9.3 References and further reading
    9.4 Tasks
    9.5 Example Matlab code
10 A simple model of diffusion
    10.1 A compartment-based approach to diffusion
    10.2 Connection to a macroscale diffusion coefficient
    10.3 Analysis of the variance
    10.4 References and further reading
    10.5 Tasks
    10.6 Example Matlab code

11 The reaction-diffusion master equation
    11.1 A compartment-based model for production, degradation and diffusion
    11.2 A compartment-based model for higher order reactions
    11.3 Choice of compartment size, h
    11.4 Models of pattern formation
    11.5 References and further reading
    11.6 Tasks
12 Diffusion and stochastic differential equations
    12.1 The Fokker-Planck equation and the diffusion equation
    12.2 One-dimensional diffusion and boundary conditions
    12.3 References and further reading
    12.4 Tasks
    12.5 Example Matlab code
13 Molecular approaches to reaction-diffusion
    13.1 Molecular-based model of diffusion, production and degradation
    13.2 Molecular-based approaches for second-order reactions
    13.3 Reaction radius and reaction probability
    13.4 References and further reading
    13.5 Tasks
14 A simple velocity-jump process
    14.1 A simple velocity-jump model in one dimension
    14.2 Analysis of the simple velocity-jump model
    14.3 A simple velocity-jump model with boundary conditions
    14.4 References and further reading
    14.5 Tasks
    14.6 Example Matlab code
15 A more general velocity-jump process
    15.1 Large friction limit
    15.2 Einstein-Smoluchowski relation
    15.3 References and further reading
    15.4 Tasks

Chapter 1: Modelling of degradation

Consider degradation of the chemical species A according to the following reaction:

A --k--> ∅,  (1.1)

where k > 0 is the rate constant of the reaction (units s^-1). It is defined so that k dt is the probability that a given molecule of A degrades during the time interval [t, t + dt), where t is the time and dt is an (infinitesimally) small time step. Denote the number of molecules of A present at time t by A(t). Then we have

P(no reactions in [t, t + dt)) = 1 - A(t)k dt + O(dt^2),  (1.2)
P(one reaction in [t, t + dt)) = A(t)k dt + O(dt^2),  (1.3)
P(more than one reaction in [t, t + dt)) = O(dt^2).  (1.4)

1.1 The chemical master equation

Let p_n(t) denote the probability that there are n molecules of A present in the system at time t. Then

p_n(t + dt) = (1 - kn dt) p_n(t) + k(n + 1) dt p_{n+1}(t) + O(dt^2),  (1.5)

so that, rearranging and taking the limit dt → 0, we have the chemical master equation:

dp_n/dt = k(n + 1) p_{n+1}(t) - kn p_n(t).  (1.6)

Equation (1.6) is a system of ordinary differential equations for the probabilities p_n, where n = 0, 1, 2, ..., N - 1 and A(0) = N is the initial number of A molecules in the system. The equation for p_N is

dp_N/dt = -kN p_N(t)  with  p_N(0) = 1,  so that  p_N(t) = e^{-kNt}.  (1.7)

Inductively, one can show that

p_n(t) = (N choose n) e^{-knt} (1 - e^{-kt})^{N-n},  n = 0, 1, ..., N,  (1.8)

i.e. A(t) ∼ B(N, e^{-kt}). This means that the mean and variance of the number of A molecules present in the system at time t are, respectively, given by

M(t) = N e^{-kt}  and  V(t) = N e^{-kt} (1 - e^{-kt}).  (1.9)

The chemical master equation, (1.6) and (1.7), and its solution, (1.8), enable us to quantify the stochastic fluctuations around the mean.
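The binomial solution (1.8) can be cross-checked by integrating the master equation (1.6)-(1.7) directly. The notes' own codes are Matlab; the sketch below is an illustrative Python alternative (the function name, step size and tolerance are choices of this sketch, not from the notes), using forward Euler on the truncated system of ODEs and comparing the resulting mean with M(t) = N e^{-kt}.

```python
import math

def cme_mean(N=20, k=0.1, t_final=5.0, dt=1e-3):
    """Forward-Euler integration of the degradation master equation
    dp_n/dt = k (n+1) p_{n+1} - k n p_n, with p_N(0) = 1.
    Returns the mean molecule number at t_final."""
    p = [0.0] * (N + 1)
    p[N] = 1.0
    for _ in range(int(t_final / dt)):
        new = p[:]
        for n in range(N + 1):
            inflow = k * (n + 1) * p[n + 1] if n < N else 0.0
            new[n] += dt * (inflow - k * n * p[n])
        p = new
    return sum(n * pn for n, pn in enumerate(p))

mean = cme_mean()
exact = 20 * math.exp(-0.1 * 5.0)   # M(t) = N e^{-kt} from (1.9)
```

With these parameters the numerically computed mean agrees with the analytical expression to well within Euler's discretisation error.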

1.2 Stochastic simulation

We would like to be able to generate individual trajectories from the model using a computational algorithm (this is most useful when we are unable to solve the chemical master equation explicitly). An efficient algorithm involves generating random numbers that represent the waiting times between successive degradation events. In order to do this, we need to understand, for each time t, how to compute the time, t + τ, at which the next molecule degrades. We note that τ is a random variable, so we need to calculate its probability distribution function, and then determine how to draw random variates from this distribution.

1.2.1 Waiting times

Let f(A(t), s) ds denote the probability that, given A(t) molecules of A in the system at time t, the next degradation reaction occurs in the time interval [t + s, t + s + ds), where ds is an (infinitesimally) small time step. For this to happen, we know that there cannot be a reaction in the time interval [t, t + s), and then a reaction must occur in the time interval [t + s, t + s + ds). Hence we can write

f(A(t), s) ds = g(A(t), s) A(t + s) k ds = g(A(t), s) A(t) k ds,  (1.10)

where g(A(t), s) is the probability that no reaction occurs during the time interval [t, t + s) when there are A(t) molecules at time t (so that A(t + s) = A(t)). For any σ > 0, the probability that no reaction happens in the interval [t, t + σ + dσ) is given by

g(A(t), σ + dσ) = g(A(t), σ) [1 - A(t + σ) k dσ] = g(A(t), σ) [1 - A(t) k dσ].  (1.11)

Rearranging and taking the limit as dσ → 0 gives

dg(A(t), σ)/dσ = -kA(t) g(A(t), σ),  so that  g(A(t), σ) = e^{-kA(t)σ},  (1.12)

upon noting that g(A(t), 0) = 1. Substituting into Equation (1.10) gives

f(A(t), s) ds = kA(t) e^{-kA(t)s} ds.  (1.13)

1.2.2 Generating random numbers

To generate stochastic sample paths, we now need a means by which to generate random numbers distributed according to Equation (1.13). We start by considering the function

F(τ) = e^{-kA(t)τ},  (1.14)

which is monotone decreasing for A(t) > 0 and such that F : (0, ∞) → (0, 1). For a, b ∈ (0, 1) with a < b we then have that

P(F(τ) ∈ (a, b)) = P(τ ∈ (F^{-1}(b), F^{-1}(a))),  (1.15)

or, equivalently,

P(F(τ) ∈ (a, b)) = ∫_{F^{-1}(b)}^{F^{-1}(a)} f(A(t), s) ds = ∫_{F^{-1}(b)}^{F^{-1}(a)} kA(t) e^{-kA(t)s} ds = ∫_{F^{-1}(b)}^{F^{-1}(a)} (-dF/ds) ds = -F(F^{-1}(a)) + F(F^{-1}(b)) = b - a.  (1.16)

This means that if τ is a random number distributed according to Equation (1.13), then F(τ) is a random number uniformly distributed in (0, 1). As such, if we can generate a random number, r, uniformly distributed on (0, 1), then we can generate the time of the next reaction by solving

r = F(τ) = e^{-kA(t)τ},  (1.17)

to give

τ = (1/(kA(t))) ln(1/r).  (1.18)

1.2.3 Stochastic simulation algorithm for degradation

The stochastic simulation algorithm for a degradation reaction can then be written:

1. Set t = 0 and A(t) = N.
2. Generate a random number r ∼ U(0, 1) and set τ = (1/(kA(t))) ln(1/r).
3. (a) If t + τ ≤ t_final, set A(t + τ) = A(t) - 1 and t = t + τ. If A(t) > 0, return to Step 2; otherwise exit.
   (b) If t + τ > t_final, set A(t_final) = A(t), t = t_final and exit.

Figure 1.1 shows a number of sample paths generated using this stochastic simulation algorithm.

1.3 Connection with the reaction rate equation

Using the chemical master equation, (1.6) and (1.7), we can show that the mean number of A molecules,

M(t) = Σ_{n=0}^{N} n p_n(t),  (1.19)

satisfies (as we might have expected from consideration of the corresponding deterministic model)

dM/dt = -kM  with  M(0) = N,  (1.20)

which has solution given by (1.9). Evolution of the mean molecule number is plotted in Figure 1.1, alongside a number of sample paths.
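The inverse-transform step (1.17)-(1.18) is easy to test in isolation. The following Python sketch (an illustrative addition; the notes' codes are Matlab) draws many waiting times with the parameter values of Figure 1.1 and checks that the sample mean is close to 1/(kA), the mean of the exponential distribution (1.13).

```python
import math
import random

def next_reaction_time(k, A, rng=random.random):
    """Inverse transform (1.18): tau = ln(1/r)/(k A), with r ~ U(0,1)."""
    return math.log(1.0 / rng()) / (k * A)

random.seed(0)
k, A = 0.1, 20   # parameter values of Figure 1.1
samples = [next_reaction_time(k, A) for _ in range(200000)]
mean_tau = sum(samples) / len(samples)
# f(A, s) = kA e^{-kAs} is exponential with rate kA = 2, so mean_tau ≈ 1/(kA) = 0.5
```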

Figure 1.1: Sample paths from a degradation reaction system. Left: four different sample paths. Right: twenty different sample paths, with the mean (black dashed line). Parameters are A(0) = 20 and k = 0.1 s^-1.

1.4 References and further reading

A practical guide to stochastic simulations of reaction-diffusion processes. R. Erban, S. J. Chapman and P. K. Maini. arXiv (2007).

A rigorous derivation of the chemical master equation. D. T. Gillespie. Physica A 188 (1992).

Exact stochastic simulation of coupled chemical reactions. D. T. Gillespie. J. Phys. Chem. 81 (1977).

1.5 Tasks

Implement a stochastic simulation algorithm that models the degradation reaction (1.1), and plot some sample paths generated using the algorithm and the parameter values in Figure 1.1.

Use your algorithm to estimate the evolution of the mean and variance in the number of A molecules by averaging over a large number of sample paths. By plotting both on the same axes, compare your result with the analytical expressions for the mean and variance given in Equation (1.9).

1.6 Example Matlab code

To simulate sample paths from the degradation model using the Gillespie algorithm.

function degradation_lecture()
clear all; close all;

N=20;        % initial number of A molecules
k=0.1;       % reaction rate
t_final=30;  % final time

% create variables to store the results
no_paths=5;          % number of sample paths
A=cell(no_paths,1);  % cell array to record numbers of A molecules
t=cell(no_paths,1);  % cell array to record reaction times

%%
% analytic expressions for the mean and variance
t_grid=0:0.5:t_final;
M=N*exp(-k*t_grid);
V=N*exp(-k*t_grid).*(1-exp(-k*t_grid));

%%
% use the Gillespie algorithm to generate sample paths
for ii=1:no_paths
    % set and record the initial time and molecule numbers
    jj=1; t{ii}(jj)=0; A{ii}(jj)=N;
    % perform reactions until the final time is reached
    while t{ii}(jj)<t_final
        tau=1/(k*A{ii}(jj))*log(1/rand); % time until the next reaction
        if t{ii}(jj)+tau<=t_final
            % update molecule numbers and time
            A{ii}(jj+1)=A{ii}(jj)-1;
            t{ii}(jj+1)=t{ii}(jj)+tau;
            jj=jj+1;
            if A{ii}(jj)==0 % stop if all molecules of A have decayed
                break;
            end
        else
            A{ii}(jj+1)=A{ii}(jj);
            t{ii}(jj+1)=t_final;
            break;
        end
    end
end

%%
% plot the results
figure(1); clf; hold on; box on
for ii=1:no_paths
    stairs(t{ii},A{ii},'linewidth',1)
end
plot(t_grid,M,'k--','linewidth',2)
axis([0 30 0 N])
set(gca,'xtick',0:10:30)
set(gca,'ytick',0:5:20)
xlabel('time [sec]'); ylabel('number of A molecules')

end

Chapter 2: Modelling production and degradation

Consider the production and degradation of the chemical species A according to the following reactions:

A --k1--> ∅,  (2.1)
∅ --k2--> A,  (2.2)

where k1 > 0 is the rate of degradation (units s^-1) and k2 > 0 is the rate of production of A per unit volume (units m^-3 s^-1). This means that one molecule of A is produced during the time interval [t, t + dt) with probability k2 ν dt, where ν is the volume of the system. As before, we denote the number of molecules of A present at time t by A(t). Then we have

P(no reactions in [t, t + dt)) = 1 - (k1 A(t) + k2 ν) dt + O(dt^2),  (2.3)
P(one A molecule decays in [t, t + dt)) = k1 A(t) dt + O(dt^2),  (2.4)
P(one A molecule produced in [t, t + dt)) = k2 ν dt + O(dt^2),  (2.5)
P(more than one reaction in [t, t + dt)) = O(dt^2).  (2.6)

2.1 The chemical master equation

As before, let p_n(t) denote the probability that there are n molecules of A present in the system at time t. Then, for n > 0, we have

p_n(t + dt) = (1 - k1 n dt - k2 ν dt) p_n(t) + k1 (n + 1) dt p_{n+1}(t) + k2 ν dt p_{n-1}(t) + O(dt^2),  (2.7)

so that, rearranging and taking the limit dt → 0, we arrive at

dp_n/dt = k1 (n + 1) p_{n+1} - k1 n p_n + k2 ν p_{n-1} - k2 ν p_n.  (2.8)

For the case n = 0 we have

dp_0/dt = k1 p_1 - k2 ν p_0.  (2.9)

Equations (2.8) and (2.9) constitute the chemical master equation, a system of ordinary differential equations for the probabilities p_n, where n = 0, 1, 2, ... and A(0) = N is the initial number of A molecules in the system.

2.1.1 Mean and variance of molecule number

We can use the chemical master equation, (2.8) and (2.9), to derive equations for the mean and variance of A(t):

M(t) = Σ_{n=0}^∞ n p_n(t)  and  V(t) = Σ_{n=0}^∞ (n - M(t))^2 p_n(t).  (2.10)

Multiplying the chemical master equation by n and summing over n, we have

d/dt Σ_{n=0}^∞ n p_n = k1 Σ_{n=0}^∞ n(n + 1) p_{n+1} - k1 Σ_{n=0}^∞ n^2 p_n + k2 ν Σ_{n=0}^∞ n p_{n-1} - k2 ν Σ_{n=0}^∞ n p_n,  (2.11)

where we define p_{-1} ≡ 0 to write (2.9) in the same form as (2.8). Changing indices on the right-hand side (e.g. n ± 1 → n) gives

dM/dt = k1 Σ_{n=0}^∞ (n - 1) n p_n - k1 Σ_{n=0}^∞ n^2 p_n + k2 ν Σ_{n=0}^∞ (n + 1) p_n - k2 ν Σ_{n=0}^∞ n p_n
      = -k1 Σ_{n=0}^∞ n p_n + k2 ν Σ_{n=0}^∞ p_n
      = -k1 M + k2 ν,  (2.12)

with

M(t) = (k2 ν / k1)(1 - e^{-k1 t}) + N e^{-k1 t} → k2 ν / k1  as  t → ∞.  (2.13)

To derive an expression for the variance, we multiply the chemical master equation by n^2, sum over n and change indices on the right-hand side to obtain

d/dt Σ_{n=0}^∞ n^2 p_n = k1 Σ_{n=0}^∞ n^2 (n + 1) p_{n+1} - k1 Σ_{n=0}^∞ n^3 p_n + k2 ν Σ_{n=0}^∞ n^2 p_{n-1} - k2 ν Σ_{n=0}^∞ n^2 p_n
                      = k1 Σ_{n=0}^∞ (-2n^2 + n) p_n + k2 ν Σ_{n=0}^∞ (2n + 1) p_n.  (2.14)

Substituting into the expression for V(t), we have

dV/dt = d/dt Σ_{n=0}^∞ n^2 p_n(t) - 2M dM/dt
      = -2k1 (V + M^2) + k1 M + 2 k2 ν M + k2 ν - 2M(-k1 M + k2 ν)
      = -2k1 V + k1 M + k2 ν.  (2.15)

We can then see that the stationary behaviour (t → ∞) is such that

M_s = lim_{t→∞} M(t) = k2 ν / k1  and  V_s = lim_{t→∞} V(t) = k2 ν / k1 = M_s.  (2.16)

Evolution of the mean molecule number towards this steady state is plotted in Figure 2.1.

2.2 The stationary distribution

The variance, V(t), gives us some information about the fluctuations in molecule number. However, we can learn more about the fluctuations about the (quasi-)steady state by considering the stationary distribution:

φ(n) := lim_{t→∞} p_n(t),  n = 0, 1, 2, ...  (2.17)

We can compute φ(n) by considering the steady states of the chemical master equation:

0 = k1 φ(1) - k2 ν φ(0);  (2.18)
0 = k1 (n + 1) φ(n + 1) - k1 n φ(n) + k2 ν φ(n - 1) - k2 ν φ(n);  (2.19)

for n ≥ 1, to arrive at the recursive definition

φ(1) = (k2 ν / k1) φ(0),  (2.20)
φ(n + 1) = (1/(k1 (n + 1))) [k1 n φ(n) + k2 ν φ(n) - k2 ν φ(n - 1)],  n ≥ 1.  (2.21)

We can eliminate the remaining constant, φ(0), by noting that Σ_{n=0}^∞ φ(n) = 1. The stationary distribution is plotted in Figure 2.1.

2.3 Stochastic simulation

As previously, we would like to be able to generate individual sample paths from the model using a computational algorithm. This time we have two decisions: when the next reaction occurs; and which reaction (production or degradation) occurs.

2.3.1 Waiting times

We can use the same principles as before to find the waiting time, τ, until the next reaction: let f(A(t), s) ds denote the probability that, given A(t) molecules of A in the system at time t, the next reaction occurs in the time interval [t + s, t + s + ds), where ds is an (infinitesimally) small time step. For this to happen, we know that there cannot be a reaction in the time interval [t, t + s), and then a reaction must occur in the time interval [t + s, t + s + ds). Hence we can write

f(A(t), s) ds = g(A(t), s) (A(t + s) k1 + k2 ν) ds = g(A(t), s) (A(t) k1 + k2 ν) ds,  (2.22)

where g(A(t), s) is the probability that no reaction occurs during the time interval [t, t + s). For any σ > 0, the probability that no reaction happens in the interval [t, t + σ + dσ) is given by

g(A(t), σ + dσ) = g(A(t), σ) [1 - (A(t + σ) k1 + k2 ν) dσ] = g(A(t), σ) [1 - (A(t) k1 + k2 ν) dσ].  (2.23)

Rearranging and taking the limit as dσ → 0 gives

dg(A(t), σ)/dσ = -a_0(t) g(A(t), σ)  where  a_0(t) = A(t) k1 + k2 ν,  (2.24)

hence

g(A(t), σ) = e^{-a_0(t)σ}.  (2.25)

Substituting into Equation (2.22) gives

f(A(t), s) ds = a_0(t) e^{-a_0(t)s} ds.  (2.26)

2.3.2 Generating random numbers for the waiting time

To generate stochastic sample paths, we now need a means by which to generate random numbers distributed according to Equation (2.26). Following the same arguments as in Section 1.2.2, if we can generate a random number, r_1, uniformly distributed on (0, 1), then we can generate the time of the next reaction by solving

r_1 = F(τ) = e^{-a_0(t)τ},  (2.27)

to give

τ = (1/a_0(t)) ln(1/r_1),  (2.28)

where, as in Equation (2.24), a_0(t) = k1 A(t) + k2 ν.

2.3.3 Choosing which reaction occurs

Whether the reaction that occurs is a production or degradation reaction depends on the relative probabilities of the two reactions:

P(degradation reaction occurs) = k1 A(t) / (k1 A(t) + k2 ν) = k1 A(t) / a_0(t);  (2.29)
P(production reaction occurs) = k2 ν / (k1 A(t) + k2 ν) = k2 ν / a_0(t).  (2.30)

This means that if we can draw a random number, r_2, uniformly distributed on (0, 1), then we can decide which reaction occurs using the following rule:

degradation reaction occurs if r_2 a_0(t) ∈ [0, k1 A(t));  (2.31)
production reaction occurs if r_2 a_0(t) ∈ [k1 A(t), k1 A(t) + k2 ν).  (2.32)

2.3.4 Stochastic simulation algorithm for production and degradation

The stochastic simulation algorithm for the production-degradation system can then be written:

1. Set t = 0 and A(t) = N.
2. Calculate a_0(t).
3. Generate a random number r_1 ∼ U(0, 1) and set τ = (1/a_0(t)) ln(1/r_1).
4. (a) If t + τ ≤ t_final, then generate a random number r_2 ∼ U(0, 1).
      i. If r_2 a_0(t) ∈ [0, k1 A(t)), then set A(t + τ) = A(t) - 1 and t = t + τ.
      ii. If r_2 a_0(t) ∈ [k1 A(t), k1 A(t) + k2 ν), then set A(t + τ) = A(t) + 1 and t = t + τ.
      Return to Step 2.
   (b) If t + τ > t_final, exit.

Five sample paths generated using this algorithm are plotted in Figure 2.1. A large number (10^7) of sample paths generated using the algorithm are also used to approximate the stationary distribution on the right of Figure 2.1.

2.4 References and further reading

A practical guide to stochastic simulations of reaction-diffusion processes. R. Erban, S. J. Chapman and P. K. Maini. arXiv (2007).

A rigorous derivation of the chemical master equation. D. T. Gillespie. Physica A 188 (1992).

Exact stochastic simulation of coupled chemical reactions. D. T. Gillespie. J. Phys. Chem. 81 (1977).
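The moment equations (2.12) and (2.15) can themselves be checked by direct numerical integration. This short Python sketch (an illustrative addition, not from the notes, which use Matlab) integrates the pair of ODEs with the parameter values of Figure 2.1 and confirms the stationary values M_s = V_s = k2ν/k1 = 10.

```python
def moments(N=0, k1=0.1, k2nu=1.0, t_final=200.0, dt=1e-3):
    """Forward-Euler integration of dM/dt = -k1 M + k2nu (2.12)
    and dV/dt = -2 k1 V + k1 M + k2nu (2.15), from M(0) = N, V(0) = 0."""
    M, V = float(N), 0.0
    for _ in range(int(t_final / dt)):
        M, V = (M + dt * (-k1 * M + k2nu),
                V + dt * (-2.0 * k1 * V + k1 * M + k2nu))
    return M, V

M, V = moments()
# both approach the stationary values M_s = V_s = k2nu/k1 = 10
```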

Figure 2.1: Sample paths from a production-degradation reaction system. Left: five different sample paths with the mean (black dashed line). Right: the stationary distribution calculated using both equations (2.20) and (2.21) (orange) and repeated stochastic simulation (grey). Parameters are A(0) = 0, k1 = 0.1 s^-1 and k2 ν = 1.0 s^-1.

2.5 Tasks

Implement a stochastic simulation algorithm that models the production-degradation system (2.1)-(2.2), and plot some sample paths generated using the algorithm and the parameter values in Figure 2.1.

Use your algorithm to estimate the stationary distribution, as in Figure 2.1. In order to make this computationally efficient you can, for example, collect data from a single sample path every second for 10^7 seconds. By plotting both on the same axes, compare your result with the analytical expression specified in (2.20)-(2.21). Note that to do this, you should show inductively that

φ(n) = (C/n!) (k2 ν / k1)^n  with  C = exp(-k2 ν / k1).  (2.33)

2.6 Example Matlab code

To generate sample paths from the production-degradation model using the Gillespie algorithm, and use a long sample path to estimate the stationary distribution.

function production_degradation_lecture()
clear all; close all;

N=0;          % initial number of A molecules
k1=0.1;       % decay rate
k2V=1;        % production rate
t_final=100;  % final time

%%
% create variables to store the results

no_paths=5;
A=cell(no_paths,1); % cell array to record numbers of A molecules
t=cell(no_paths,1); % cell array to record reaction times

% analytic expression for the mean
t_grid=0:0.5:t_final;
M=k2V/k1*(1-exp(-k1*t_grid));

% generate sample paths using the Gillespie algorithm
for i=1:no_paths
    % set and record the initial time and molecule numbers
    j=1; t{i}(j)=0; A{i}(j)=N;
    while t{i}(j)<t_final
        a0=k1*A{i}(j)+k2V;    % calculate a0
        tau=1/a0*log(1/rand); % time until the next reaction
        if t{i}(j)+tau<=t_final
            % update molecule numbers and time
            r2=a0*rand;
            if r2<k1*A{i}(j)
                A{i}(j+1)=A{i}(j)-1;
            else
                A{i}(j+1)=A{i}(j)+1;
            end
            t{i}(j+1)=t{i}(j)+tau;
            j=j+1;
        else
            break;
        end
    end
end

%%
% plot the results
figure(1); clf; hold on; box on
for i=1:no_paths
    stairs(t{i},A{i},'linewidth',1)
end
plot(t_grid,M,'k--','linewidth',2)
axis([0 80 0 20])
set(gca,'xtick',0:20:80)
set(gca,'ytick',0:5:20)
xlabel('time [sec]'); ylabel('number of A molecules')

%%
% estimate the stationary distribution using a very long sample path
path_length=1e7;           % length of the sample path (seconds)
t=0; A=k2V/k1;             % initial value of the path
A_stationary=zeros(101,1); % vector to hold the observed frequencies of A

for i=1:path_length
    % record the value of A every second
    while t<i
        a0=k1*A+k2V;
        tau=1/a0*log(1/rand);
        % update molecule numbers and time
        r2=a0*rand;
        if r2<k1*A
            A=A-1;
        else
            A=A+1;
        end
        t=t+tau;
    end
    if A<=100
        A_stationary(A+1)=A_stationary(A+1)+1;
    end
end

% analytic expression for the stationary distribution
C=exp(-k2V/k1);
n_phi=0:1:25;
phi=C./factorial(n_phi).*(k2V/k1).^n_phi;

% plot the results
figure(2); clf; cla; hold on; box on
bar(0:1:100,A_stationary/path_length,'facecolor',[.7 .7 .7],...
    'LineWidth',0.5,'BarWidth',0.6);
plot(n_phi,phi,'*','color',[0 0 0],'markersize',2)
axis([0 25 0 0.15])
set(gca,'xtick',0:5:25)
set(gca,'ytick',0:0.05:0.15)
set(gca,'yticklabel',arrayfun(@(s)sprintf('%.2f',s),cellfun(@(s)str2num(s),...
    get(gca,'yticklabel')),'uniformoutput',false))
ylabel('frequency'); xlabel('number of A molecules')

end
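As a quick cross-check of the closed form (2.33), the recursion (2.20)-(2.21) can be evaluated numerically and compared against the Poisson distribution directly. A short Python sketch (illustrative, not part of the notes; the truncation point n_max is an arbitrary choice):

```python
import math

def stationary_phi(k1=0.1, k2nu=1.0, n_max=30):
    """Evaluate phi(n) from the recursion (2.20)-(2.21) with phi(0) = 1
    provisionally, then normalise so that sum_n phi(n) = 1."""
    phi = [1.0, (k2nu / k1) * 1.0]                       # (2.20)
    for n in range(1, n_max):
        phi.append((k1 * n * phi[n] + k2nu * phi[n] - k2nu * phi[n - 1])
                   / (k1 * (n + 1)))                      # (2.21)
    total = sum(phi)
    return [p / total for p in phi]

phi = stationary_phi()
lam = 1.0 / 0.1                                           # k2nu/k1 = 10
poisson = [lam ** n / math.factorial(n) for n in range(len(phi))]
s = sum(poisson)
poisson = [q / s for q in poisson]  # Poisson(10), truncated and renormalised
err = max(abs(a - b) for a, b in zip(phi, poisson))
```

The maximum pointwise difference between the recursion and the Poisson form is at the level of floating-point round-off, consistent with (2.33).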

Chapter 3: Modelling general chemical reactions

The reactions we have considered up until now have either been zeroth order (production of A) or first order (degradation of A). We would also like to be able to simulate second order reactions of the form

A + B --k--> C,  (3.1)
A + A --k--> B,  (3.2)

where, in each case, k > 0 is the rate of reaction.

We have to think carefully about the probabilities of each reaction and also how the probability of a reaction occurring during the time interval [t, t + dt) scales with the volume of the system, ν. For example, we would expect one molecule of A and one molecule of B to collide and react twice as often in a system that has volume ν/2 as compared to one with volume ν. This means that, for one molecule of A and one molecule of B, the probability of a reaction occurring in the time interval [t, t + dt) is k dt/ν, and that the rate constant k has units m^3 s^-1. When we have more than one molecule of A and/or B, we need to consider the number of different pairs of A and B molecules. This is equal to the product A(t)B(t), where A(t) and B(t) are the numbers of A and B molecules, respectively. The probability that reaction (3.1) occurs during the time interval [t, t + dt) is therefore A(t)B(t) k dt/ν.

chemical reaction       order    propensity function, a(t)          units of k
∅ --k--> A              zeroth   kν                                 m^-3 s^-1
A --k--> ∅              first    kA(t)                              s^-1
A + B --k--> C          second   kA(t)B(t)/ν                        m^3 s^-1
A + A --k--> B          second   kA(t)(A(t) - 1)/ν                  m^3 s^-1
A + B + C --k--> D      third    kA(t)B(t)C(t)/ν^2                  m^6 s^-1
A + A + B --k--> C      third    kA(t)(A(t) - 1)B(t)/ν^2            m^6 s^-1
A + A + A --k--> B      third    kA(t)(A(t) - 1)(A(t) - 2)/ν^2      m^6 s^-1

Table 3.1: The basic types of reactions with their order, propensity function and the units of the rate constant. For each reaction, the propensity function is defined such that the probability of reaction R_i occurring in the infinitesimally small time interval [t, t + dt) is a_i(t) dt. We can use the propensity functions to understand how the units of the reaction rate constant change for each type of reaction.

Table 3.1 lists the reactions we have discussed thus far, along with their order, propensity function and the units of k. Note that for reaction (3.2) the number of different

pairs of A molecules is

(A(t) choose 2) = A(t)(A(t) - 1)/2,  (3.3)

and that it is common practice to absorb the factor 1/2 into the rate constant, k. Similar choices are made for third and higher order reactions also. Note that this might be different notation to that used in the Part B Further Mathematical Biology course.

3.1 The chemical master equation

To write down the chemical master equation for a general reaction system, we need some additional terminology. We will consider a biochemical network consisting of N species, S_1, ..., S_N, that may be involved in M possible reactions R_1, ..., R_M. The population size of S_i is known as its copy number and is denoted by X_i(t) at time t. The state vector is then defined as

X(t) := (X_1(t), ..., X_N(t))^T.  (3.4)

With every reaction, j, we associate two quantities. The first is the propensity function, a_j(X(t)), which we have already discussed. The second is the stoichiometric, or state-change, vector,

ν_j := (ν_{1j}, ..., ν_{Nj})^T,  (3.5)

where ν_{ij} is the change in species i caused by the firing of reaction j.

As before, we construct the chemical master equation by considering how the probability that the system is in a given state changes through time. Define

P(x, t | x_0, t_0) = P(X(t) = x given X(t_0) = x_0).  (3.6)

Then, by considering the possible changes in species numbers brought about by a single reaction taking place, we have

dP(x, t | x_0, t_0)/dt = Σ_{j=1}^{M} [a_j(x - ν_j) P(x - ν_j, t | x_0, t_0) - a_j(x) P(x, t | x_0, t_0)].  (3.7)

Notes

The chemical master equation in fact constitutes a (possibly infinite) system of ODEs that is closed by specifying an initial condition.

The description of stochastic chemical kinetics used here is a Markov jump process, and the chemical master equation is otherwise known as Kolmogorov's forward equation for the Markov jump process.
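To see the bookkeeping in (3.7) in action, the sketch below (Python, added for illustration; the function name and the test probabilities are arbitrary choices) evaluates the general right-hand side for the one-species production-degradation network of Chapter 2 and checks it term by term against the explicit form (2.8).

```python
def cme_rhs(p, propensities, stoich):
    """Right-hand side of the general CME (3.7) for a one-species network.
    p: dict n -> p_n; propensities: functions a_j(n); stoich: integers nu_j."""
    rhs = {}
    for n in p:
        total = 0.0
        for a, nu in zip(propensities, stoich):
            total += a(n - nu) * p.get(n - nu, 0.0) - a(n) * p[n]
        rhs[n] = total
    return rhs

# production-degradation network of Chapter 2: A -> ∅ with a_1(n) = k1 n
# (nu_1 = -1) and ∅ -> A with a_2(n) = k2 nu (nu_2 = +1)
k1, k2nu = 0.1, 1.0
props = [lambda n: k1 * max(n, 0), lambda n: k2nu]
stoich = [-1, +1]
p = {0: 0.5, 1: 0.3, 2: 0.2}     # an arbitrary test distribution
rhs = cme_rhs(p, props, stoich)
# explicit form (2.8): dp_n/dt = k1 (n+1) p_{n+1} - k1 n p_n + k2nu p_{n-1} - k2nu p_n
direct = {n: k1 * (n + 1) * p.get(n + 1, 0.0) - k1 * n * p[n]
             + k2nu * p.get(n - 1, 0.0) - k2nu * p[n] for n in p}
```

The two expressions agree exactly, since (2.8) is just (3.7) written out for this network.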
3.2 Stochastic simulation

The stochastic simulation algorithms outlined in Lectures 1-2 are special forms of the Gillespie Direct Method stochastic simulation algorithm. We will now generalise it to allow the generation of sample paths from Equation (3.7). Once again, we have two decisions: when the next reaction occurs; and which reaction occurs.

3.2.1 Generating sample paths for the waiting time

Following the same arguments as in Lectures 1-2, for a system in state X(t) at time t, if we can generate a random number, r_1, uniformly distributed on (0, 1), then we can generate the time of the next reaction as

τ = (1/a_0(t)) ln(1/r_1)  where  a_0(t) = Σ_{j=1}^{M} a_j(t),  (3.8)

with a_j(t) the propensity of reaction j and, for brevity, we have dropped the explicit dependence of the a_j upon the state X(t).

3.2.2 Choosing which reaction occurs

Again, the same arguments as in Lectures 1-2 tell us that

P(reaction R_j occurs) = a_j(t) / Σ_{j=1}^{M} a_j(t) = a_j(t)/a_0(t).  (3.9)

This means that if we can draw a random number, r_2, uniformly distributed on (0, 1), then we can decide which reaction occurs using the following rule:

reaction R_j occurs if  Σ_{i=1}^{j-1} a_i(t) ≤ r_2 a_0(t) < Σ_{i=1}^{j} a_i(t).  (3.10)

3.2.3 Gillespie stochastic simulation algorithm

The Gillespie (direct method) stochastic simulation algorithm can then be written:

1. Set t = t_0 and X(t) = X(t_0).
2. Calculate a_j(X(t)) for j = 1, ..., M, and a_0(t).
3. Generate a random number r_1 ∼ U(0, 1) and set τ = (1/a_0(t)) ln(1/r_1).
4. (a) If t + τ ≤ t_final then:
      i. generate a random number r_2 ∼ U(0, 1);
      ii. find j such that Σ_{i=1}^{j-1} a_i(t) ≤ r_2 a_0(t) < Σ_{i=1}^{j} a_i(t).
      Set X(t + τ) = X(t) + ν_j, t = t + τ and return to Step 2.
   (b) If t + τ > t_final, exit.

3.3 Example

Consider two chemical species, A and B, that undergo the following chemical reactions

A + A --k1--> ∅,  (3.11)
A + B --k2--> ∅,  (3.12)
∅ --k3--> A,  (3.13)
∅ --k4--> B,  (3.14)

in a system with volume ν, where k1, k2, k3 and k4 are all positive rate constants.

3.3.1 Chemical master equation

Let

p_{n,m}(t) = P(A(t) = n and B(t) = m | A(t_0) = n_0 and B(t_0) = m_0),  (3.15)

where we adopt the previous convention that A(t) denotes the number of A molecules present at time t, and similarly for B. We then have

dp_{n,m}/dt = (k1/ν)(n + 2)(n + 1) p_{n+2,m} - (k1/ν) n(n - 1) p_{n,m}
            + (k2/ν)(n + 1)(m + 1) p_{n+1,m+1} - (k2/ν) nm p_{n,m}
            + k3 ν p_{n-1,m} - k3 ν p_{n,m} + k4 ν p_{n,m-1} - k4 ν p_{n,m},  (3.16)

for n, m ≥ 0, with the convention, as before, that p_{n,m} ≡ 0 if n < 0 or m < 0. The stationary distribution is then defined as

φ(n, m) = lim_{t→∞} p_{n,m}(t),  (3.17)

and one can also compute the stationary distribution of A only as

φ(n) = Σ_{m=0}^∞ φ(n, m).  (3.18)

The stationary distributions for reaction system (3.11)-(3.14) are plotted in Figure 3.2.

3.3.2 Stochastic simulation

Since reactions (3.11) and (3.12) are second order, we cannot solve Equation (3.16) analytically, nor can we obtain closed evolution equations for the stochastic mean and variance. This means that to make progress one option is to generate statistics for the system using repeated stochastic simulation. For the reaction system (3.11)-(3.14) we need to compute the following reaction propensities and stoichiometric vectors:

a_1(t) = (k1/ν) A(t)(A(t) - 1),  ν_1 = (-2, 0)^T;  (3.19)
a_2(t) = (k2/ν) A(t)B(t),  ν_2 = (-1, -1)^T;  (3.20)
a_3(t) = k3 ν,  ν_3 = (+1, 0)^T;  (3.21)
a_4(t) = k4 ν,  ν_4 = (0, +1)^T;  (3.22)

with a_0(t) = a_1(t) + a_2(t) + a_3(t) + a_4(t). Five different sample paths generated using this algorithm are shown in Figure 3.1.
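Propensities and state-change vectors such as (3.19)-(3.22) slot straight into a generic direct-method routine. Below is a Python sketch of the algorithm of Section 3.2.3 (the notes' codes are Matlab; the function and its sanity check on the Chapter 1 degradation reaction, where the answer is known, are illustrative additions):

```python
import math
import random

def gillespie(x0, propensities, stoich, t_final, rng):
    """Gillespie direct method (Section 3.2.3).
    propensities: list of functions x -> a_j(x); stoich: state-change vectors.
    Returns the state at time t_final."""
    x, t = list(x0), 0.0
    while True:
        a = [f(x) for f in propensities]
        a0 = sum(a)
        if a0 == 0.0:
            return x                       # nothing can fire any more
        tau = math.log(1.0 / rng.random()) / a0
        if t + tau > t_final:
            return x
        t += tau
        r2a0 = rng.random() * a0
        cum, j = 0.0, 0
        # find j with sum_{i<j} a_i <= r2 a0 < sum_{i<=j} a_i, as in (3.10)
        while j < len(a) - 1 and cum + a[j] <= r2a0:
            cum += a[j]
            j += 1
        x = [xi + d for xi, d in zip(x, stoich[j])]

# For system (3.11)-(3.14) one would pass, e.g.,
#   propensities = [lambda x: k1/nu*x[0]*(x[0]-1), lambda x: k2/nu*x[0]*x[1],
#                   lambda x: k3*nu, lambda x: k4*nu]
#   stoich = [(-2, 0), (-1, -1), (+1, 0), (0, +1)]
# Sanity check on degradation (Chapter 1), where the mean is N e^{-kT}:
rng = random.Random(2)
k, N, T = 0.1, 20, 5.0
final = [gillespie([N], [lambda x: k * x[0]], [(-1,)], T, rng)[0]
         for _ in range(5000)]
avg = sum(final) / len(final)
# avg should be close to 20 e^{-0.5} ≈ 12.13
```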

Figure 3.1: Sample paths from system (3.11)-(3.14). Left: numbers of A molecules; right: numbers of B molecules. Numerical solutions of the reaction rate equations (3.28)-(3.29) are indicated by black dashed lines. Parameters are A(0) = 0, B(0) = 0, k1/ν = 0.001 s^-1, k2/ν = 0.01 s^-1, k3 = 1.2 s^-1 and k4 = 1.0 s^-1.

3.3.3 Reaction rate equations

The connection with the reaction rate equations can be found by using Equation (3.16) to derive expressions for the evolution of the mean A and B molecule numbers:

⟨A⟩ = Σ_{n=0}^∞ Σ_{m=0}^∞ n p_{n,m}(t)  and  ⟨B⟩ = Σ_{n=0}^∞ Σ_{m=0}^∞ m p_{n,m}(t).  (3.23)

We have

d⟨A⟩/dt = -(2k1/ν) ⟨A(A - 1)⟩ - (k2/ν) ⟨AB⟩ + k3 ν,  (3.24)
d⟨B⟩/dt = -(k2/ν) ⟨AB⟩ + k4 ν,  (3.25)

where

⟨A^2⟩ = Σ_{n=0}^∞ Σ_{m=0}^∞ n^2 p_{n,m}(t)  and  ⟨AB⟩ = Σ_{n=0}^∞ Σ_{m=0}^∞ nm p_{n,m}(t).  (3.26)

The reaction rate equations can be deduced from equations (3.24)-(3.25) by taking limits as the molecule numbers, A(t) and B(t), and the system size, ν, tend to infinity in such a way that

a(t) = A(t)/ν  and  b(t) = B(t)/ν,  (3.27)

are held constant, and assuming ⟨A^2⟩ = ⟨A⟩^2 and ⟨AB⟩ = ⟨A⟩⟨B⟩, to arrive at

da/dt = -2k1 a^2 - k2 ab + k3,  (3.28)
db/dt = -k2 ab + k4.  (3.29)

Note that this process of taking limits means that the reaction rate equations (3.28)-(3.29) do not exactly describe the mean behaviour of the system. In the next lecture and Problem Sheet 1 we shall explore this in more detail.
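As a sanity check on (3.28)-(3.29): setting the right-hand sides to zero gives ab = k4/k2 and 2 k1 a^2 = k3 - k4, i.e. a_s = sqrt((k3 - k4)/(2 k1)) and b_s = k4/(k2 a_s). The Python sketch below (an illustration; the parameter values are those reconstructed for Figure 3.1, and the step size and final time are arbitrary choices) integrates the reaction rate equations by forward Euler and confirms convergence to this steady state.

```python
import math

def rre_steady(k1=0.001, k2=0.01, k3=1.2, k4=1.0, dt=0.01, t_final=2000.0):
    """Forward Euler for da/dt = -2 k1 a^2 - k2 a b + k3 (3.28)
    and db/dt = -k2 a b + k4 (3.29), from a(0) = b(0) = 0."""
    a, b = 0.0, 0.0
    for _ in range(int(t_final / dt)):
        a, b = (a + dt * (-2.0 * k1 * a * a - k2 * a * b + k3),
                b + dt * (-k2 * a * b + k4))
    return a, b

a_s = math.sqrt((1.2 - 1.0) / (2 * 0.001))   # analytical steady state, = 10
b_s = 1.0 / (0.01 * a_s)                     # = 10
a, b = rre_steady()
```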

Figure 3.2: The stationary distribution for system (3.11)-(3.14) estimated using long time paths generated using the stochastic simulation algorithm. Left: φ(n, m). Right: φ(n), with the mean predicted by the reaction rate equations (3.28)-(3.29) indicated in orange. Parameters are A(0) = 0, B(0) = 0, k1/ν = 0.001 s^-1, k2/ν = 0.01 s^-1, k3 = 1.2 s^-1 and k4 = 1.0 s^-1.

3.4 References and further reading

A practical guide to stochastic simulations of reaction-diffusion processes. R. Erban, S. J. Chapman and P. K. Maini. arXiv (2007).

A rigorous derivation of the chemical master equation. D. T. Gillespie. Physica A 188 (1992).

Exact stochastic simulation of coupled chemical reactions. D. T. Gillespie. J. Phys. Chem. 81 (1977).

3.5 Tasks

Implement a stochastic simulation algorithm that models the two-species system (3.11)-(3.14), and plot some sample paths generated using the algorithm using the parameter values in Figure 3.1.

Use your algorithm to generate a long time sample path to estimate the stationary distributions for both A and B jointly, and for each of A and B individually, as in Figure 3.2.

Derive the reaction rate equations (3.28)-(3.29) from the chemical master equation (3.16) and compare the steady states predicted by the reaction rate equations with the stationary distribution. How do the average values predicted by long time simulation compare with those of the reaction rate equations?

3.6 Example Matlab code

To generate sample paths from the two species model using the Gillespie algorithm.

function example_lecture()
clear all;

24 Chapter 3 Stochastic modelling of biological processes 19 close all; initial A=; % initial number of A molecules initial B=; % initial number of B molecules % set the values of the rate constants k1=.1; k2=.1; k3=1.2; k4=1; t final=12; % final time %% % generate sample paths using the Gillespie algorithm % create data structures to hold the results no repeats=5; A=cell(no repeats,1); B=cell(no repeats,1); t=cell(no repeats,1); for i=1:no repeats % set the initial time and molecule numbers j=1; t{i}(j)=; A{i}(j)=initial A; B{i}(j)=initial B; while t{i}(j)<t final A=k1*A{i}(j)*(A{i}(j)-1)+k2*A{i}(j)*B{i}(j)+k3+k4; % calculate a tau=1/a*log(1/rand); % calculate the time to the next reaction if t{i}(j)+tau<=t final % update molecule numbers and time % figure out which reaction has taken place r2=a*rand; ss=k1*a{i}(j)*(a{i}(j)-1); % check to see if first reaction if r2<ss A{i}(j+1)=A{i}(j)-2; B{i}(j+1)=B{i}(j); else % if not, check to see if second reaction ss=ss+k2*a{i}(j)*b{i}(j); if r2<ss A{i}(j+1)=A{i}(j)-1; B{i}(j+1)=B{i}(j)-1; else % if not, check to see if third reaction ss=ss+k3; if r2<ss A{i}(j+1)=A{i}(j)+1; B{i}(j+1)=B{i}(j); else % if not, then it must be the fourth reaction

25 Chapter 3 Stochastic modelling of biological processes 2 A{i}(j+1)=A{i}(j); B{i}(j+1)=B{i}(j)+1; % update time t{i}(j+1)=t{i}(j)+tau; j=j+1; else break; %% % solve the reaction rate equations using a forward Euler method dt=.1; % time step t grid=:dt:t final; A det=zeros(length(t),1); B det=zeros(length(t),1); A det(1)=initial A; % initial A value B det(1)=initial B; % initial B value for i=1:length(t grid)-1 A det(i+1)=a det(i)+dt*(-2*k1*a det(i)ˆ2-k2*a det(i)*b det(i)+k3); B det(i+1)=b det(i)+dt*(-k2*a det(i)*b det(i)+k4); %% % plot the results figure(1); clf; hold on; box on for i=1:no repeats stairs(t{i},a{i},'linewidth',1) plot(t grid,a det,'k--','linewidth',2) axis([ 1 25]) set(gca,'xtick',:25:1) set(gca,'ytick',:5:25) xlabel('time [sec]'); ylabel('number of A molecules') figure(2); clf; hold on; box on for i=1:no repeats stairs(t{i},b{i},'linewidth',1) plot(t grid,b det,'k--','linewidth',2) axis([ 1 25]) set(gca,'xtick',:25:1) set(gca,'ytick',:5:25)

26 Chapter 3 Stochastic modelling of biological processes 21 xlabel('time [sec]'); ylabel('number of B molecules') To use a long sample path to estimate the stationary distribution. function example stationary density() clear all; close all; % set the parameter values k1=.1; k2=.1; k3=1.2; k4=1; % calculate the deterministic steady state initial A=sqrt((k3-k4)/(2*k1)); initial B=k4/(k2*initial A); %% t=; % round the initial molecule numbers A=round(initial A); B=round(initial B); path length=1e7; % set the path length % create matrices to hold values of A and B AB stationary=zeros(11,11); A stationary=zeros(11,1); A RRE=zeros(11,1); for i=1:path length % record the values of A and B every 1 seconds while t<i A=k1*A*(A-1)+k2*A*B+k3+k4; % calculate a tau=1/a*log(1/rand); % calculate the time until the next reaction % figure out which reaction has taken place r2=a*rand; ss=k1*a*(a-1); % check to see if first reaction if r2<ss A=A-2; B=B; else % if not, check to see if second reaction ss=ss+k2*a*b; if r2<ss A=A-1; B=B-1; else % if not, check to see if second reaction ss=ss+k3;

27 Chapter 3 Stochastic modelling of biological processes 22 if r2<ss A=A+1; B=B; % if not, must be fourth reaction else A=A; B=B+1; t=t+tau; % save the values of A and B if A<=1 && B<=1 A stationary(a+1)=a stationary(a+1)+1; AB stationary(a+1,b+1)=ab stationary(a+1,b+1)+1; A RRE(round(initial A)+1)=A stationary(round(initial A)+1); %% % plot the results % plot the joint stationary distribution of A and B figure(1); clf; hold on; box on imagesc(ab stationary/path length); axis([ 3 3]) colormap gray colormap(flipud(colormap)) plot(1,1,'+','color',[ ]); set(gca,'xtick',:5:3) set(gca,'ytick',:5:3) xlabel('number of A molecules'); ylabel('number of B molecules') % plot the marginal stationary distribution of A figure(2); clf; hold on; box on bar(:1:1,a stationary/path length,'facecolor',[.7.7.7],... 'LineWidth',.5,'BarWidth',.6) bar(:1:1,a RRE/path length,'facecolor',[ ],... 'LineWidth',.5,'BarWidth',.6) axis([ 3.1]) set(gca,'xtick',:5:3) set(gca,'ytick',:.2:.1) set(gca,'yticklabel',arrayfun(@(s)sprintf('%.2f', s),cellfun(@(s)str2num(s),... get(gca,'yticklabel')),'uniformoutput',false)) xlabel('number of A molecules'); ylabel('frequency')
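The Gillespie loop in the Matlab listings above can be summarised compactly in Python (a stand-in for Matlab here; the rate constants are those assumed for Figure 3.1 with ν = 1). Each step draws an exponential waiting time from the total propensity and then selects which of the four channels fires.

```python
import math
import random

def gillespie(A, B, k1, k2, k3, k4, t_final, rng):
    """Simulate one sample path of (3.11)-(3.14); return (A, B) at time t_final."""
    t = 0.0
    while True:
        a = [k1 * A * (A - 1), k2 * A * B, k3, k4]   # propensities of the four channels
        a0 = sum(a)                                  # a0 > 0 always, since k3, k4 > 0
        tau = -math.log(1.0 - rng.random()) / a0     # time until the next reaction
        if t + tau > t_final:
            return A, B
        t += tau
        r = a0 * rng.random()                        # decide which reaction fires
        if r < a[0]:
            A -= 2                  # A + A -> 0
        elif r < a[0] + a[1]:
            A -= 1; B -= 1          # A + B -> 0
        elif r < a[0] + a[1] + a[2]:
            A += 1                  # 0 -> A
        else:
            B += 1                  # 0 -> B

rng = random.Random(0)
A, B = gillespie(0, 0, 0.001, 0.01, 1.2, 1.0, 100.0, rng)
```

Because the bimolecular propensities vanish whenever too few molecules are present, the molecule numbers can never become negative.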

Chapter 4: Stochastic versus deterministic modelling

In the first two lectures, we saw that when only zeroth or first order reactions are present, the reaction rate equations exactly predict the evolution of the mean particle numbers. In this lecture, we shall explore a range of cases in which the predictions of stochastic and deterministic models differ when second or higher order reactions are included.

4.1 Stochastic modelling of dimerization

Consider a system consisting of the following reactions:

A + A →(k1) ∅,   (4.1)

∅ →(k2) A,   (4.2)

where k1 and k2 are positive rate constants. The chemical master equation, describing the evolution of p_n(t) := P(A(t) = n), can be written

dp_n/dt = (k1/ν)(n+2)(n+1) p_{n+2} − (k1/ν) n(n−1) p_n + k2 ν p_{n−1} − k2 ν p_n,   n = 0, 1, 2, ...,   (4.3)

where we define p_{−1} ≡ 0. As dimerization is a second order reaction, we know that we cannot obtain a closed evolution equation for the mean or variance of A(t). Stochastic simulation provides one approach to investigate the dynamics of the system, but if we are interested only in the stationary values, M_s and V_s, and the stationary distribution φ(n), then we can make some progress analytically using probability generating functions.

4.1.1 Probability generating function approach

Define G : [−1, 1] × [0, ∞) → R by

G(x, t) := Σ_{n=0}^∞ x^n p_n(t).   (4.4)

Differentiating G(x, t) with respect to x we have

∂G/∂x = Σ_{n=1}^∞ n x^{n−1} p_n(t),   (4.5)

∂²G/∂x² = Σ_{n=1}^∞ n(n−1) x^{n−2} p_n(t).   (4.6)

Using the definitions provided in Equation (2.1) gives

M(t) = (∂G/∂x)(1, t),   (4.7)

V(t) = (∂²G/∂x²)(1, t) + (∂G/∂x)(1, t) − [(∂G/∂x)(1, t)]².   (4.8)

Chapter 4 Stochastic modelling of biological processes

Using induction, it is also possible to show that

p_n(t) = (1/n!) (∂ⁿG/∂xⁿ)(0, t),   n ≥ 0.   (4.9)

To derive an equation for G(x, t) we multiply the chemical master equation (4.3) by x^n and sum over n to arrive at

Σ_{n=0}^∞ x^n ∂p_n/∂t = (k1/ν) Σ_{n=0}^∞ x^n (n+2)(n+1) p_{n+2} − (k1/ν) Σ_{n=0}^∞ x^n n(n−1) p_n + k2 ν Σ_{n=0}^∞ x^n p_{n−1} − k2 ν Σ_{n=0}^∞ x^n p_n,   (4.10)

and then we substitute using Equations (4.4)-(4.6) to obtain a partial differential equation for G:

∂G/∂t = (k1/ν)(1 − x²) ∂²G/∂x² + k2 ν (x − 1) G.   (4.11)

Together with appropriate boundary and initial conditions, this equation can be solved numerically to give information about M(t), V(t) and p_n(t). Here we will focus on the stationary distribution, and the associated stationary probability generating function

G_s(x) = lim_{t→∞} G(x, t) = Σ_{n=0}^∞ x^n φ(n),   (4.12)

which satisfies the ordinary differential equation

0 = (k1/ν)(1 − x²) d²G_s/dx² + k2 ν (x − 1) G_s,   (4.13)

or, equivalently,

d²G_s/dx² = (k2 ν²/k1) G_s/(1 + x).   (4.14)

The nontrivial solution of this equation is

G_s(x) = C sqrt(1 + x) I_1( 2 sqrt( (k2 ν²/k1)(1 + x) ) ),   (4.15)

where I_1 is the modified Bessel function of the first kind, a solution of the equation

z² I_1''(z) + z I_1'(z) − (z² + 1) I_1(z) = 0.   (4.16)

We can evaluate C by noting that

G_s(1) = Σ_{n=0}^∞ φ(n) = 1   ⟹   C = [ sqrt(2) I_1( 2 sqrt(2 k2 ν²/k1) ) ]⁻¹.   (4.17)

Differentiating Equation (4.15) with respect to x and substituting x = 1, we have

M_s = G_s'(1) = sqrt( k2 ν²/(2 k1) ) I_0( 2 sqrt(2 k2 ν²/k1) ) / I_1( 2 sqrt(2 k2 ν²/k1) ),   (4.18)

and

V_s = k2 ν²/(2 k1) + M_s − M_s²,   (4.19)

φ(n) = C (1/n!) (k2 ν²/k1)^{n/2} I_{n−1}( 2 sqrt(k2 ν²/k1) ),   n = 1, 2, 3, ....   (4.20)
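These formulas can be checked numerically. The sketch below (Python rather than Matlab; ν = 1 and the rate constants assumed for Figure 4.1, k1 = 0.005 and k2 = 1) evaluates the modified Bessel function from its power series, so nothing beyond the standard library is needed; the n = 0 case of the stationary distribution uses the identity I_{−1} = I_1.

```python
import math

def bessel_i(n, z, terms=60):
    """I_n(z) = sum_m (z/2)^(2m+n) / (m! (m+n)!), summed iteratively."""
    term = (z / 2.0) ** n / math.factorial(n)   # m = 0 term
    total = term
    for m in range(1, terms):
        term *= (z / 2.0) ** 2 / (m * (m + n))
        total += term
    return total

k1, k2, nu = 0.005, 1.0, 1.0          # rates assumed for Figure 4.1, nu = 1
beta = k2 * nu * nu / k1              # k2*nu^2/k1 = 200
z = 2.0 * math.sqrt(2.0 * beta)
C = 1.0 / (math.sqrt(2.0) * bessel_i(1, z))     # normalisation constant (4.17)

def phi(n):
    """Stationary distribution (4.20); n = 0 handled via I_{-1} = I_1."""
    return C * beta ** (n / 2.0) / math.factorial(n) * bessel_i(abs(n - 1), 2.0 * math.sqrt(beta))

M_s = math.sqrt(beta / 2.0) * bessel_i(0, z) / bessel_i(1, z)   # stationary mean (4.18)
V_s = beta / 2.0 + M_s - M_s ** 2                               # stationary variance (4.19)
```

Summing phi(n) confirms the normalisation, and the first two moments of phi agree with M_s and V_s; M_s itself comes out close to the value quoted in the comparison of Section 4.1.3.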

Chapter 4 Stochastic modelling of biological processes

4.1.2 Reaction rate equations

The reaction rate equation for system (4.1)-(4.2) is

da/dt = −2 k1 a² + k2,   (4.21)

where a(t) is the concentration of A molecules in the volume ν. We can then compare the output from repeated stochastic simulation of system (4.1)-(4.2) with

dĀ/dt = −(2 k1/ν) Ā² + k2 ν,   (4.22)

where Ā(t) is the mean molecule number predicted by the reaction rate equation at time t.

4.1.3 Comparison of stochastic versus deterministic model

Figure 4.1 shows a comparison of the results predicted by the stochastic and deterministic models. On the left-hand side five sample paths of the stochastic model are plotted, along with the solution of the reaction rate equation, and on the right-hand side the stationary distribution is shown. Equation (4.22) predicts a steady state A population of Ā_s = 10, whereas M_s = 10.13 (to 2 d.p.). Although the difference between the two predictions is not large, we see that in this case the reaction rate equations cannot exactly predict the mean evolution of the system. In general this is true whenever second and higher order reactions are present. In the following sections, and on Problem Sheet 1, we will explore situations where the differences between stochastic and deterministic models are much more significant.

Figure 4.1: Left: sample paths from the dimerization system (4.1)-(4.2), alongside the solution of the reaction rate equation (4.21). Right: the stationary distribution for system (4.1)-(4.2) generated using both long time path simulation (grey) and the analytic formula (4.20) (orange). Parameters are A(0) = 0, k1/ν = 0.005 s⁻¹ and k2 ν = 1.0 s⁻¹.

4.2 Stochastic focussing

Consider the following chemical reactions:

∅ →(k1) C →(k2) B →(k3) ∅,   (4.23)

A + C →(k4) A,   (4.24)

∅ →(k5) A →(k6) ∅,   (4.25)

Chapter 4 Stochastic modelling of biological processes

where k1, ..., k6 are positive rate constants. We will refer to A as the signal and B as the product, and study how changes in the number of signal molecules affect the number of product molecules in the system.

4.2.1 Reaction rate equations

The reaction rate equations for system (4.23)-(4.25) can be written as

da/dt = k5 − k6 a,   (4.26)

db/dt = k2 c − k3 b,   (4.27)

dc/dt = k1 − k2 c − k4 ac,   (4.28)

which means that the mean molecule numbers predicted by the reaction rate equations, Ā(t), B̄(t), C̄(t), are given by

dĀ/dt = k5 ν − k6 Ā,   (4.29)

dB̄/dt = k2 C̄ − k3 B̄,   (4.30)

dC̄/dt = k1 ν − k2 C̄ − (k4/ν) Ā C̄.   (4.31)

The steady states predicted by the reaction rate equations are therefore

Ā_s = k5 ν/k6,   B̄_s = k1 ν k2 k6 / (k3 (k2 k6 + k4 k5)),   C̄_s = k1 ν k6 / (k2 k6 + k4 k5).   (4.32)

4.2.2 Comparison of stochastic versus deterministic model

First, we look at the predictions of the two models using the parameter values

k1 ν = 1 s⁻¹,   k2 = 10 s⁻¹,   k3 = 10⁻⁴ s⁻¹,   k4/ν = 99 s⁻¹,   (4.33)

k5 ν = 10 s⁻¹ for t < 10 min, 5 s⁻¹ for t ≥ 10 min,   k6 = 1 s⁻¹,   (4.34)

with initial conditions A(0) = 0, B(0) = 100, C(0) = 0. In Figure 4.2 we see the comparison between stochastic and deterministic models of system (4.23)-(4.25). The deterministic model predicts that, as k5ν is halved, the number of signal molecules (A) halves and, in response, the number of product molecules (B) doubles. However, simulations of the stochastic model show that the number of product molecules in fact nearly triples, i.e. it is more sensitive to the change in signal than the deterministic model predicts.

Ignoring fluctuations in the signal

We know that the deterministic model correctly predicts the evolution of the stochastic mean number of A molecules. To try to understand the differences between the predictions of the stochastic and deterministic models, we can fix

A(t) = M_A,s = 10 for t < 10 min, 5 for t ≥ 10 min,   (4.35)

Figure 4.2: Five sample paths (numbers of A and B molecules against time) from system (4.23)-(4.25), together with the solution of the reaction rate equations (4.29)-(4.31). Parameter values are as given in (4.33)-(4.34).

and then use a stochastic simulation algorithm to generate sample paths from the reduced system

∅ →(k1) C →(k2) B →(k3) ∅,   (4.36)

C →(k4 A(t)) ∅.   (4.37)

The corresponding reaction rate equations give

dB̄/dt = k2 C̄ − k3 B̄,   (4.38)

dC̄/dt = k1 ν − k2 C̄ − (k4/ν) A(t) C̄.   (4.39)

In each case we use the initial conditions B(0) = 100, C(0) = 0.

In Figure 4.3 we see that the deterministic model correctly predicts the time evolution of the number of B molecules. This is to be expected since there are only zeroth and first order reactions in the reduced system, and it indicates that fluctuations in the number of A molecules are what give rise to the differences between the model predictions.

Figure 4.3: Five sample paths from system (4.36)-(4.37), together with the solution of the reaction rate equations (4.38)-(4.39). Parameter values are as given in (4.33) and (4.35).
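The steady-state formulas (4.32) can be verified directly by integrating the reaction rate equations (4.29)-(4.31). A minimal Python sketch (forward Euler, ν = 1, and illustrative rate constants rather than those of the figures, so that the slow timescale 1/k3 stays short):

```python
# Forward-Euler check of the steady states (4.32), with nu = 1 and
# illustrative rate constants (not those used in Figures 4.2-4.5).
k1, k2, k3, k4, k5, k6 = 1.0, 1.0, 0.1, 9.0, 1.0, 1.0

A, B, C = 0.0, 0.0, 0.0
dt = 1e-3
for _ in range(int(round(200.0 / dt))):
    dA = k5 - k6 * A
    dB = k2 * C - k3 * B
    dC = k1 - k2 * C - k4 * A * C
    A, B, C = A + dt * dA, B + dt * dB, C + dt * dC

A_s = k5 / k6                          # steady state of (4.29)
C_s = k1 * k6 / (k2 * k6 + k4 * k5)    # steady state of (4.31)
B_s = k2 * C_s / k3                    # steady state of (4.30)
```

Since an equilibrium of the ODEs is an exact fixed point of the Euler map, the integrated values converge to the formulas as t grows.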

Chapter 4 Stochastic modelling of biological processes

Looking at the C population

The parameter values listed in Equations (4.33)-(4.34) entail

C̄_s ≈ 10⁻³ for t < 10 min,   2 × 10⁻³ for t ≥ 10 min.   (4.40)

This means that it is not sensible to interpret C̄_s as the number of C molecules present in the system. Instead, the stochastic model predicts that there is either zero or one molecule of C present. This low number of C molecules, and the role of C in the second order reaction, is what causes the large differences in the number of B molecules predicted by the two different models. To further illustrate, we now look at the predictions of the models when the parameter values are

k1 ν = 1 s⁻¹,   k2 = 10⁻³ s⁻¹,   k3 = 10⁻⁴ s⁻¹,   k4/ν = 0.0099 s⁻¹,   (4.41)

k5 ν = 10 s⁻¹ for t < 10 min, 5 s⁻¹ for t ≥ 10 min,   k6 = 1 s⁻¹,   (4.42)

together with initial conditions A(0) = 0, B(0) = 100, C(0) = 0.

Figure 4.4: Five sample paths from system (4.23)-(4.25), together with the solution of the reaction rate equations (4.29)-(4.31). Parameter values are as given in (4.41)-(4.42).

Figure 4.4 shows that, with these parameter values, the predictions of the stochastic and deterministic models are very similar. We note that Ā_s and B̄_s are the same for both sets of parameter values, but in the second case we have

C̄_s = 10 for t < 10 min,   19.8 for t ≥ 10 min.   (4.43)

This means that the number of C molecules is large enough that it can be approximated well by C̄_s. Figure 4.5 shows the time evolution of the number of C molecules for each parameter set and confirms our hypotheses.

Further analysis of the model

We have seen that second (or higher) order reactions, together with low copy numbers, can lead to significant differences between the average behaviour of the stochastic model and that of the deterministic model.

Chapter 4 Stochastic modelling of biological processes

Figure 4.5: Comparison of the C population for system (4.23)-(4.25). Left: parameter values as given in (4.33)-(4.34). Right: parameter values as given in (4.41)-(4.42).

From our work in Lecture 2 we know that if the rate constants k5 and k6 are fixed then we can estimate the distribution of the number of A molecules in the system as

φ(n) = (1/n!) (M_A,s)^n exp(−M_A,s).   (4.44)

Assuming now that the number of A molecules is fixed and equal to n, the probability of there being a C molecule in the system is approximately

k1 ν / (k2 + n k4/ν).   (4.45)

Putting the two together, the average probability of finding a C molecule in the system is

Σ_{n=0}^∞ [k1 ν / (k2 + n k4/ν)] φ(n) = Σ_{n=0}^∞ [k1 ν / (k2 + n k4/ν)] (1/n!) (M_A,s)^n exp(−M_A,s).   (4.46)

From here we can estimate the average number of B molecules in the system as

M_B,s ≈ Σ_{n=0}^∞ [k1 ν k2 / (k3 (k2 + n k4/ν))] (1/n!) (M_A,s)^n exp(−M_A,s).   (4.47)

Using Equation (4.35) we have

M_B,s ≈ 113.1 for t < 10 min,   316.7 for t ≥ 10 min.   (4.48)

Comparison with Figure 4.2 shows this to be a good estimate!

4.3 References and further reading

Stochastic focusing: fluctuation-enhanced sensitivity of intracellular regulation. J. Paulsson, O. G. Berg and M. Ehrenberg. Proc. Natl. Acad. Sci. USA 97 (2000).

4.4 Tasks

Verify that (4.15) is a solution of the generating function equation (4.13).

Complete Problem Sheet 1!

35 Chapter 4 Stochastic modelling of biological processes Example Matlab code To generate sample paths and estimate the stationary distribution of the single species dimerization model using the Gillespie algorithm. function dimerization lecture() clear all; close all; initial A=; % initial number of A molecules k1=.5; % annihilation rate k2=1.; % production rate t final=1; % final time %% no paths=5; % number of sample paths A=cell(no paths,1); t=cell(no paths,1); % use the Gillespie algorithm to generate sample paths for i=1:no paths % set the initial time and molecule numbers j=1; t{i}(j)=; A{i}(j)=initial A; while t{i}(j)<t final A=k1*A{i}(j)*(A{i}(j)-1)+k2; % calculate a tau=1/a*log(1/rand); % calculate the time until the next reaction if t{i}(j)+tau<=t final % update molecule numbers and time r2=a*rand; ss=k1*a{i}(j)*(a{i}(j)-1); if r2<ss A{i}(j+1)=A{i}(j)-2; else A{i}(j+1)=A{i}(j)+1; t{i}(j+1)=t{i}(j)+tau; j=j+1; else break; %% % solve the reaction rate equations using a forward Euler method dt=.1; t grid=:dt:t final;

36 Chapter 4 Stochastic modelling of biological processes 31 A det=zeros(length(t grid),1); A det(1)=initial A; for i=1:length(t grid)-1 A det(i+1)=a det(i)+dt*(-2*k1*a det(i)ˆ2+k2); %% % plot the results figure(1); clf; hold on; box on for i=1:no paths stairs(t{i},a{i},'linewidth',1) plot(t grid,a det,'k--','linewidth',2) axis([ 1 2]) set(gca,'xtick',:25:1) set(gca,'ytick',:5:2) xlabel('time [sec]'); ylabel('number of A molecules') %% % estimate the stationary distribution using a long sample path path length=1e8; % path length A stationary=zeros(11,1); t=; A=round(sqrt(k2/(2*k1))); % initial number of A molecules for i=1:path length % record the number of A molecules every second while t<i A=k1*A*(A-1)+k2; tau=1/a*log(1/rand); r2=a*rand; ss=k1*a*(a-1); if r2<ss A=A-2; else A=A+1; t=t+tau; if A<=1 A stationary(a+1)=a stationary(a+1)+1; % record the mean number of A molecules A mean=sum(a stationary.*(:1:1)')/path length

37 Chapter 4 Stochastic modelling of biological processes 32 %% % calculate the analytic stationary distribution n phi=1:1:11; C=(sqrt(2)*besseli(1,2*sqrt(2*k2/k1)))ˆ(-1); phi=c./factorial(n phi).*(k2/k1).ˆ(n phi/2).*besseli(n phi-1,2*sqrt(k2/k1)); phi=[c phi]; n phi=[ n phi]; %% % save the results (useful if the code takes some time to run) % save dimerization.mat % load the results % load dimerization.mat %% % plot the results figure(2); clf; hold on; box on bar(:1:1,a stationary/path length,'facecolor',[.7.7.7],... 'LineWidth',.2,'BarWidth',.6) plot(n phi,phi,'*','color',[ ],'markersize',1); axis([ ]) set(gca,'xtick',:5:25) set(gca,'ytick',:.5:.15) set(gca,'yticklabel',arrayfun(@(s)sprintf('%.2f', s),cellfun(@(s)str2num(s),... get(gca,'yticklabel')), 'UniformOutput', false)) xlabel('number of A molecules'); ylabel('frequency')

Chapter 5: Connection to stochastic differential equations

In this lecture we will consider successive approximations to the chemical master equation that facilitate both analytical and efficient numerical interrogation of the dynamics of biochemical reaction networks.

5.1 The tau-leap method

Suppose that we are considering the biochemical reaction system defined in Section 3.1, that is, a biochemical network consisting of N species, S1, ..., SN, that may be involved in M possible reactions, R1, ..., RM. We restate the chemical master equation (3.7):

dP(x, t | x0, t0)/dt = Σ_{j=1}^M [ a_j(x − ν_j) P(x − ν_j, t | x0, t0) − a_j(x) P(x, t | x0, t0) ].

The Gillespie stochastic simulation algorithm is termed event-driven as it advances through time one reaction at a time. A means to speed up the generation of sample paths is to leap over an interval of length τ and work out approximately how many reactions of each type fire in this interval. We will now describe how to do this.

With the system in state X at time t, suppose that there exists τ > 0 such that during [t, t + τ) no propensity function, a_j(X), j = 1, ..., M, changes its value significantly. It then follows that, for j = 1, ..., M, the number of times reaction channel R_j fires during [t, t + τ) is (approximately) a Poisson random variable with mean (and variance) a_j(X(t))τ. That is,

number of firings of reaction channel R_j in [t, t + τ) ≈ P_j(a_j(X(t))τ),   j = 1, ..., M,   (5.1)

where the P_j(m_j), j = 1, ..., M, are statistically independent Poisson random variables with mean (and variance) m_j. This means that we can approximately leap the system forward in time by a step τ by taking

X(t + τ) ≈ X(t) + Σ_{j=1}^M P_j(a_j(X(t))τ) ν_j.   (5.2)

5.1.1 The tau-leap approximate stochastic simulation algorithm

Equation (5.2) provides a computational definition for the tau-leap approximate stochastic simulation algorithm:

1. Set t = t0 and X(t) = X(t0). Then, while t < t_final,

2. Calculate a_j(X(t)) for j = 1, ..., M.

3. Generate random numbers R_j ~ P_j(a_j(X(t))τ).

Chapter 5 Stochastic modelling of biological processes

4. Let X(t + τ) = X(t) + Σ_{j=1}^M R_j ν_j, set t = t + τ and return to step 2.

Note that the Matlab function poissrnd(lambda) generates Poisson distributed random variates with mean lambda.

The results of using the tau-leap algorithm to generate sample paths from the degradation reaction (1.1) are shown in Figure 5.1. Four sample paths generated using τ = 0.1 are shown on the left-hand side. On the right-hand side the predicted mean molecule number is plotted for τ = 1.0 and τ = 5.0, alongside the analytical solution. We see that the accuracy of the method decreases as τ increases.

Figure 5.1: Sample paths from a degradation reaction system generated using the tau-leap algorithm. Left: four different sample paths, each generated using τ = 0.1. Right: mean number of A molecules predicted using τ = 1.0 (blue) and τ = 5.0 (orange). In each case 10⁴ sample paths were used to estimate the mean. The exact solution for the mean number of A molecules is plotted in black. Parameters are A(0) = 200 and k = 0.1 s⁻¹.

5.1.2 Connection to the random time-change representation

An alternative way to derive the tau-leap algorithm is to consider a different description of the biochemical reaction system. In constructing the random time-change representation our aim is to be able to write

X(t) = X(0) + Σ_{j=1}^M R_j(t) ν_j,   (5.3)

where R_j(t) is the number of times reaction j fires in the interval (0, t). We assume that two R_j reactions cannot take place at the same time, so that R_j(t) is a counting process, that is, R_j(0) = 0 and R_j is constant except for jumps of plus one. To make progress in understanding how to calculate R_j(t) we will first remind ourselves of the definition of a Poisson process.

A constant rate Poisson process

A Poisson process is a model for a series of random observations occurring in time. Let Y(t) denote the number of observations by time t. Note that, for t < s, Y(s) − Y(t) is the number of observations in the time interval (t, s]. We make the following assumptions about the model.

1. Observations occur one at a time.

Chapter 5 Stochastic modelling of biological processes

2. The numbers of observations in disjoint time intervals are independent random variables, i.e. if t0 < t1 < ... < tm, then Y(t_k) − Y(t_{k−1}), k = 1, ..., m, are independent random variables.

3. The distribution of Y(t + a) − Y(t) does not depend on t.

When these assumptions are satisfied, there is a constant λ > 0 such that, for t < s, Y(s) − Y(t) is Poisson distributed with parameter λ(s − t), that is,

P( Y(s) − Y(t) = k ) = [λ(s − t)]^k exp[−λ(s − t)] / k!.   (5.4)

If λ = 1 then we denote the process Y_1(t) and call it the unit rate Poisson process.

An alternative way to think about the unit rate Poisson process is to let ξ_i, i = 1, 2, ..., be independent, identically distributed exponential random variables with parameter one, and put points down on a line with spacing equal to the ξ_i. Then Y_1(t) is simply the number of points hit when we run along the time frame at rate one. We can then define a Poisson process with parameter a to be Y_a(t) := Y_1(at), where Y_1 is a unit rate Poisson process. This means that the Poisson process with rate a is simply the number of points (of the unit rate Poisson process) hit when we run along the time frame at rate a.

An inhomogeneous Poisson process

There is no reason a needs to be constant in time, in which case we can define

Y_a(t) = Y_1( ∫_0^t a(s) ds ),   (5.5)

to be an inhomogeneous Poisson process.

The random time-change representation

The inhomogeneous Poisson process gives us a natural means by which to write R_j(t):

R_j(t) = Y_1^j( ∫_0^t a_j(X(s−)) ds ),   (5.6)

where Y_1^j is a unit rate Poisson process. We can then write the random time-change representation of the system as

X(t) = X(0) + Σ_{j=1}^M Y_1^j( ∫_0^t a_j(X(s−)) ds ) ν_j.   (5.7)

Connection to the tau-leap algorithm

The tau-leap algorithm can then be deduced by taking a forward Euler approximation of Equation (5.7) with time step τ. Let N_τ = t/τ and denote the approximation to X(t) by Z_τ(t), to give

Z_τ(t) = Z_τ(0) + Σ_{k=1}^{N_τ} Σ_{j=1}^M Y_1^j( a_j(Z_τ((k−1)τ)) τ ) ν_j,   (5.8)

or

Z_τ(t + τ) = Z_τ(t) + Σ_{j=1}^M Y_1^j( a_j(Z_τ(t)) τ ) ν_j.   (5.9)
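For a single reaction channel the update (5.2)/(5.9) is only a few lines of code. The sketch below (Python standing in for Matlab) applies it to the degradation reaction A → ∅ with rate k, using the parameters assumed for Figure 5.1 (A(0) = 200, k = 0.1); the Poisson variates are drawn with Knuth's method so the sketch is self-contained.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's algorithm for a Poisson variate (adequate for modest lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def tau_leap_degradation(A0, k, tau, t_final, rng):
    """Tau-leap for A -> 0: one channel, propensity a(A) = k*A, stoichiometry -1."""
    A, t = A0, 0.0
    while t < t_final - 1e-9:
        A -= poisson(k * A * tau, rng)   # firings of the channel in [t, t + tau)
        if A < 0:
            A = 0                        # leaping can overshoot; clamp at zero
        t += tau
    return A

rng = random.Random(1)
paths = [tau_leap_degradation(200, 0.1, 0.5, 30.0, rng) for _ in range(500)]
```

Note that the scheme is biased for finite τ: each leap multiplies the mean by (1 − kτ) rather than exp(−kτ), which is exactly the loss of accuracy with increasing τ seen in Figure 5.1.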

Chapter 5 Stochastic modelling of biological processes

5.2 The chemical Langevin equation

With the system in state X at time t, now suppose that, in addition to being able to find a τ such that during [t, t + τ) no propensity function, a_j(X), j = 1, ..., M, changes its value significantly, this τ is also large enough that the expected number of firings of R_j during [t, t + τ) is much greater than one, that is,

a_j(X(t)) τ ≫ 1.   (5.10)

By the Central Limit Theorem, we know that for large λ the Poisson random variable P(λ) can be approximated using a normal random variable with mean and variance λ, i.e.

P(λ) ≈ N(λ, λ) = λ + sqrt(λ) N(0, 1).   (5.11)

This means that we can write

P_j(a_j(X(t))τ) ≈ a_j(X(t))τ + sqrt(a_j(X(t))τ) N_j(0, 1),   (5.12)

for j = 1, ..., M, and hence further approximate Equation (5.2) as

X(t + τ) ≈ X(t) + Σ_{j=1}^M a_j(X(t))τ ν_j + Σ_{j=1}^M sqrt(a_j(X(t))τ) N_j(0, 1) ν_j.   (5.13)

Whilst we won't concern ourselves with the proof here, it can be shown using the theory of continuous time Markov processes that Equation (5.13) can also be formally written as a stochastic differential equation in white noise form:

dX(t)/dt = Σ_{j=1}^M a_j(X(t)) ν_j + Σ_{j=1}^M sqrt(a_j(X(t))) dW_j ν_j.   (5.14)

The dW_j, j = 1, ..., M, are statistically independent Gaussian white noise processes satisfying

⟨dW_j(t) dW_j'(t')⟩ = δ_{j,j'} δ(t − t'),   (5.15)

where the first δ represents the Kronecker delta and the second the Dirac delta function. In the next lectures, we will spend some time exploring the properties of stochastic differential equations, before returning to discuss their use in modelling biochemical reaction systems.

5.3 References and further reading

Stochastic simulation of chemical kinetics. D. T. Gillespie. Annu. Rev. Phys. Chem. 58 (2007).

Continuous time Markov chain models for chemical reaction networks. D. F. Anderson and T. G. Kurtz. Chapter 1, Design and analysis of biomolecular circuits (2011).

The chemical Langevin equation. D. T. Gillespie. J. Chem. Phys. 113 (2000).

5.4 Tasks

Implement a tau-leaping approximate stochastic simulation algorithm that models the death process of Equation (1.1). How do your results change as you vary the value of τ? Why is this the case?
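For the death process, the chemical Langevin equation (5.14) reads dX/dt = −kX − sqrt(kX) dW, and it can be simulated with the scheme (5.13). A minimal Python sketch (parameters match those assumed for Figure 5.1; the clamp at zero is an ad hoc fix, since the CLE state is real-valued and can dip below zero):

```python
import math
import random

def cle_path(x0, k, dt, t_final, rng):
    """Euler-Maruyama for the CLE of A -> 0 with rate k."""
    x = float(x0)
    for _ in range(int(round(t_final / dt))):
        x += -k * x * dt - math.sqrt(k * x * dt) * rng.gauss(0.0, 1.0)
        if x < 0.0:
            x = 0.0       # keep the (real-valued) state nonnegative
    return x

rng = random.Random(2)
k, x0 = 0.1, 200
final = [cle_path(x0, k, 0.02, 10.0, rng) for _ in range(2000)]
```

Averaging the final values over many paths recovers the exact mean x0·exp(−kt) closely, because the drift of this CLE is linear.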

42 Chapter 5 Stochastic modelling of biological processes Example Matlab code To generate tau-leap sample paths, and compare results as tau is varied. function degradation lecture() clear all; close all; N=2; % initial number of A molecules k=.1; % reaction rate t final=3; % final time % no paths=4; A=cell(no paths,1); t=cell(no paths,1); tau=.1; % value of tau no steps=t final/tau; % number of steps required A=zeros(no paths,no steps+1); A(:,1)=N*ones(no paths,1); for i=1:no steps % generate path by repeatedly drawing Poisson random variates A(:,i+1)=A(:,i)-poissrnd(k*A(:,i)*tau); % if molecule numbers go negative, set to zero neg=find(a(:,i+1)<); A(neg)==; % plot the results figure(1); clf; hold on; box on for i=1:no paths stairs(:tau:t final,a(i,:),'linewidth',1); axis([ 3 N]) set(gca,'xtick',:1:3) set(gca,'ytick',:5:2) xlabel('time [sec]'); ylabel('number of A molecules'); %% % compare results using different values of tau against the analytic result % deterministic solution t grid=:.5:t final; M=N*exp(-k*t grid); no paths=1e4; tau=[1 5]; % values of tau no steps=t final./tau;

43 Chapter 5 Stochastic modelling of biological processes 38 % generate the paths using the tau leap algorithm for j=1:length(tau) A=zeros(no paths,no steps(j)+1); A(:,1)=N*ones(no paths,1); for i=1:no steps(j) A(:,i+1)=A(:,i)-poissrnd(k*A(:,i)*tau(j)); neg=find(a<); A(neg)=; % average the results A mean{j}=mean(a); % plot the results figure(1); clf; hold on; box on plot(:tau(1):t final,a mean{1},'+','markersize',5,'color',[ ]) plot(:tau(2):t final,a mean{2},'b+','markersize',5,'color',[ ]) plot(t grid,m,'k','linewidth',1) axis([ 3 N]) set(gca,'xtick',:1:3) set(gca,'ytick',:5:2) xlabel('time [sec]'); ylabel('mean number of A molecules');

44 Chapter 6: Introduction to stochastic differential equations In this lecture we will give an informal introduction to some simple stochastic differential equations. We will look at a computational definition of a stochastic differential equation and study three simple examples. 6.1 A computational definition of a stochastic differential equation Suppose that x(t) evolves according to dx dt = f(x, t) with x() = x, (6.1) where f : R [, ) R is a given, sufficiently nice function. equation (6.1) can be re-written as The ordinary differential dx = f(x, t)dt, (6.2) to specify the infinitesimal change in x(t), dx(t) = x(t + dt) x(t). This means that we can then write (6.1) as x(t + dt) = x(t) + f(x(t), t)dt. (6.3) Equation (6.3) gives us a simple means to compute an approximate solution to the ordinary differential equation (6.1). Choosing a small time step t then, given x(t), we can write x(t + t) = x(t) + f(x(t), t) t with x() = x. (6.4) This method is called the forward Euler method, and the error of the approximation can be decreased by making the time step, t, smaller. In simple terms, a stochastic differential equation is just an ordinary differential equation with an additional noise term describing stochastic fluctuations. If Equation (6.4) is the computational definition of an ordinary differential equation, then we can write the computational definition of the corresponding stochastic differential equation as X(t + t) = X(t) + f(x(t), t) t + g(x(t), t) tξ with X() = x, (6.5) where g : R [, ) R is the strength of the noise and ξ N (, 1). The stochastic differential equation can be formally written in the form X(t + t) = X(t) + f(x(t), t) t + g(x(t), t)dw with X() = x, (6.6) where dw is the so-called white noise. Note that other noise terms are possible, but here we will only use Gaussian (white) noise. For our purposes, it will be sufficient to know that the meaning of Equation (6.6) is given by the computational definition, (6.5). 
In fact, we will only need to know how to simulate stochastic 39

45 X mean(x) / variance(x) Chapter 6 Stochastic modelling of biological processes 4 differential equations numerically and to use them for the analysis of reaction-diffusion processes. This means that whenever we write a stochastic differential equation in the form (5.14) we can replace the dw by tn (, 1) where t is the time step of the algorithm. Equation (6.5) is often called the Euler-Maruyama method for solving the stochastic differential equation, and one can immediately see how this method relates to the forward Euler method for ordinary differential equations discussed above. 6.2 Example 1 Suppose that both time, t, and the variable X(t) are dimensionless and f(x, t), g(x, t) 1 so that X(t) satisfies the stochastic differential equation X(t + dt) = X(t) + dw, with X() =. (6.7) The corresponding computational definition is X(t + t) = X(t) + tξ, with X() =, (6.8) where t is the small time step and ξ N (, 1). Four sample paths generated using Equation (6.8) to define a stochastic simulation algorithm are shown in Figure time [sec] time [sec] Figure 6.1: Left: four sample paths generated using Equation (6.8). Right: the mean (orange) and variance (blue) estimated from 5 sample paths, together with the analytic results (M(t) = and V (t) = t, black dashed). Parameters are t =.1s. Let M(t) := E[X(t)] and V (t) := Var(X(t)) = E[X(t) 2 ] M(t) 2 where E( ) denotes the average over (infinitely) many realisations of the stochastic simulation algorithm. computational definition, (6.8), we have M(t + t) = E [X(t + t)] [ = E X(t) + ] tξ = E [X(t)] + te [ξ ] Using the = M(t), (6.9)

and

V(t + Δt) = E[X(t + Δt)²] − M(t + Δt)²
 = E[(X(t) + √(Δt) ξ)²] − M(t)²
 = E[X(t)²] + 2√(Δt) E[X(t)] E[ξ] + Δt E[ξ²] − M(t)²
 = E[X(t)²] + Δt − M(t)²
 = V(t) + Δt, (6.10)

where we have used the facts that E[ξ] = 0 and E[ξ²] = 1. The initial conditions are M(0) = 0 and V(0) = 0, so that we have M(t) ≡ 0 and V(t) = t (see Figure 6.1). This means that both M(t) and V(t) are independent of the time step Δt. This is in fact true for any moment, E[X(t)ᵏ] for k = 1, 2, 3, ... (see Problem Sheet 2), and is one of the reasons why we chose the computational definition of dW as √(Δt) ξ.

6.3 Example 2

The function f(x, t) in Equation (6.6) is often called the drift coefficient. We will now extend Example 1 by letting f(x, t) ≡ 1, g(x, t) ≡ 1, so that

X(t + dt) = X(t) + dt + dW, with X(0) = 0, (6.11)

with corresponding computational definition

X(t + Δt) = X(t) + Δt + √(Δt) ξ, with X(0) = 0. (6.12)

Four sample paths generated using Equation (6.12) to define a stochastic simulation algorithm are shown in Figure 6.2.

Figure 6.2: Left: four sample paths generated using Equation (6.12). Right: the mean (orange) and variance (blue) estimated from 500 sample paths, together with the analytic results (M(t) = t and V(t) = t, black dashed). Parameters are Δt = 0.1 s.

Using Equation (6.12) we have

M(t + Δt) = E[X(t + Δt)] = E[X(t) + Δt + √(Δt) ξ] = E[X(t)] + Δt + √(Δt) E[ξ] = M(t) + Δt, (6.13)

so that M(t) = t (see Figure 6.2). This means that, as one might expect, solutions of Equation (6.11) fluctuate around a mean value that depends only on the drift coefficient. Similarly, one can show that V(t) = t (see Figure 6.2).

6.4 Example 3

The final example is motivated by considering the bistability example we looked at in Problem Sheet 1, Question 3. We take

f(x, t) = −k₁x³ + k₂x² − k₃x + k₄, (6.14)
g(x, t) = k₅, (6.15)

where k₁, ..., k₅ are positive constants, to give

X(t + dt) = X(t) + [−k₁X(t)³ + k₂X(t)² − k₃X(t) + k₄] dt + k₅ dW. (6.16)

Figure 6.3: Two sample paths generated by solving Equation (6.16) with X(0) = 0 (left) and X(0) = 500 (right) are shown in blue. The corresponding solution of Equation (6.17) is plotted in orange. Parameters are k₁ = 10⁻³, k₂ = 0.75, k₃ = 165, k₄ = 10⁴, k₅ = 200 and Δt = 0.1.

We note that if k₅ = 0 then Equation (6.16) becomes the ordinary differential equation derived in Question 3 of Problem Sheet 1:

dx/dt = −k₁x³ + k₂x² − k₃x + k₄. (6.17)

Choosing k₁ = 10⁻³, k₂ = 0.75, k₃ = 165 and k₄ = 10⁴ gives three steady states, two stable and one unstable, with

x_s¹ = 100, x_u = 250, x_s² = 400. (6.18)

Two sample paths generated by solving Equation (6.16) are shown in Figure 6.3, along with the corresponding solutions of Equation (6.17).

6.5 References and further reading

Stochastic simulation of chemical kinetics. D. T. Gillespie. Annu. Rev. Phys. Chem. 58 (2007).

The chemical Langevin equation. D. T. Gillespie. J. Chem. Phys. 113 (2000).
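The steady states in (6.18) are the roots of the cubic on the right-hand side of (6.17) and can be checked numerically. A Python sketch (the parameter values are written out in full here, k₁ = 10⁻³, k₂ = 0.75, k₃ = 165, k₄ = 10⁴, since the printed notes have lost trailing zeros in extraction):

```python
import numpy as np

# parameters of Example 3
k1, k2, k3, k4 = 1e-3, 0.75, 165.0, 1e4

# steady states of dx/dt = -k1 x^3 + k2 x^2 - k3 x + k4 are the roots of
# the cubic; numpy.roots takes coefficients in decreasing order of degree
roots = np.sort(np.roots([-k1, k2, -k3, k4]).real)
print(roots)  # two stable states with the unstable state between them
```

The cubic factors exactly as −10⁻³(x − 100)(x − 250)(x − 400), confirming the three steady states.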

6.6 Tasks

Show that V(t) = t for Example 2.

Implement a stochastic simulation algorithm to generate a large number of sample paths from Example 2. Compare the mean and variance of your sample with the theoretical predictions M(t) = t and V(t) = t.

Implement a stochastic simulation algorithm to generate a large number of sample paths from Example 3. Compare your results with the solution of Equation (6.17).

6.7 Example Matlab code

To generate sample paths from Example 1.

function example1_lecture()
clear all; close all;
%%
no_paths=4;            % number of sample paths
t_final=10;            % final time
dt=0.1;                % time step
no_steps=t_final/dt;   % total number of steps
X=zeros(no_paths,no_steps+1);
dW=sqrt(dt)*randn(no_paths,no_steps);  % generate all the noise increments
X(:,2:end)=cumsum(dW,2);  % take cumulative sums to generate the sample paths
% plot the results
figure(1); clf; hold on; box on
for i=1:no_paths
    plot(0:dt:t_final,X(i,:),'LineWidth',1);
end
axis([0 t_final -6 6])
set(gca,'XTick',0:2:10)
set(gca,'YTick',-6:3:6)
xlabel('time [sec]'); ylabel('X')
%%
no_paths=500;          % number of sample paths
t_final=10;            % final time
dt=0.1;                % time step
no_steps=t_final/dt;   % total number of steps
X=zeros(no_paths,no_steps+1);
dW=sqrt(dt)*randn(no_paths,no_steps);  % generate all the noise increments
X(:,2:end)=cumsum(dW,2);  % take cumulative sums to generate the sample paths
% plot the mean and variance of the sample paths, compare to analytic results
figure(2); clf; hold on; box on

plot(0:dt:t_final,mean(X),'LineWidth',1);                   % mean (orange)
plot(0:dt:t_final,zeros(1,no_steps+1),'k--','LineWidth',1)  % M(t) = 0
plot(0:dt:t_final,var(X),'LineWidth',1);                    % variance (blue)
plot(0:dt:t_final,0:dt:t_final,'k--','LineWidth',1)         % V(t) = t
axis([0 t_final 0 10])
set(gca,'XTick',0:2:10)
set(gca,'YTick',0:2:10)
xlabel('time [sec]')
ylabel('mean(X) / variance(X)')

To generate sample paths from Example 2.

function example2_lecture()
clear all; close all;
%%
no_paths=4;            % number of paths
t_final=10;            % final time
dt=0.1;                % time step
no_steps=t_final/dt;   % total number of steps
X=zeros(no_paths,no_steps+1);
dT=dt*ones(no_paths,no_steps);
dW=sqrt(dt)*randn(no_paths,no_steps);  % generate all random increments
X(:,2:end)=cumsum(dT,2)+cumsum(dW,2);  % generate sample paths using cumulative sums
% plot the results
figure(1); clf; hold on; box on
for i=1:no_paths
    plot(0:dt:t_final,X(i,:),'LineWidth',1);
end
axis([0 t_final -1 15])
set(gca,'XTick',0:2:10)
set(gca,'YTick',0:5:15)
xlabel('time [sec]'); ylabel('X')
%%
no_paths=500;
t_final=10;
dt=0.1;
no_steps=t_final/dt;
X=zeros(no_paths,no_steps+1);
dT=dt*ones(no_paths,no_steps);
dW=sqrt(dt)*randn(no_paths,no_steps);
X(:,2:end)=cumsum(dT,2)+cumsum(dW,2);
% plot the mean and the variance of the sample paths

figure(2); clf; hold on; box on
plot(0:dt:t_final,mean(X),'LineWidth',1);            % mean (orange)
plot(0:dt:t_final,var(X),'LineWidth',1);             % variance (blue)
plot(0:dt:t_final,0:dt:t_final,'k--','LineWidth',1)  % M(t) = V(t) = t
axis([0 t_final 0 15])
set(gca,'XTick',0:2:10)
set(gca,'YTick',0:5:15)
xlabel('time [sec]'); ylabel('mean(X) / variance(X)')
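The Matlab listings above can be mirrored in Python. A vectorised sketch for Example 2, checking the predictions M(t) = t and V(t) = t against the sample mean and variance (path counts and step sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
no_paths, t_final, dt = 5000, 10.0, 0.1
no_steps = int(t_final / dt)

# X(t + dt) = X(t) + dt + sqrt(dt) xi; build all paths at once with
# cumulative sums of the increments, as in the Matlab code
dW = np.sqrt(dt) * rng.standard_normal((no_paths, no_steps))
X = np.cumsum(dt + dW, axis=1)
X = np.hstack([np.zeros((no_paths, 1)), X])  # prepend X(0) = 0

t = np.linspace(0.0, t_final, no_steps + 1)
M = X.mean(axis=0)   # should be close to M(t) = t
V = X.var(axis=0)    # should be close to V(t) = t
```

With a few thousand paths the sample mean and variance track the dashed analytic lines of Figure 6.2 closely.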

Chapter 7: The Fokker-Planck equation

In this lecture we will show how to derive the Fokker-Planck equation using our computational definition of a stochastic differential equation. Suppose that X(t) evolves according to the stochastic differential equation (6.6). We define its probability distribution, p(x, t), such that

p(x, t) dx = P(X(t) ∈ [x, x + dx) | X(0) = x₀). (7.1)

Roughly speaking, p(x, t) quantifies the probability of finding a given trajectory of the stochastic differential equation around the point x at time t, given it started at x₀ at time t = 0. For any time t, p(x, t) satisfies the normalisation condition

∫_R p(x, t) dx = 1. (7.2)

We will show that p(x, t) evolves according to the partial differential equation

∂p/∂t (x, t) = ∂²/∂x² [ (g(x, t)²/2) p(x, t) ] − ∂/∂x [ f(x, t) p(x, t) ]. (7.3)

Equation (7.3) is often called the Fokker-Planck equation or the forward Kolmogorov equation.

7.1 Derivation of the Fokker-Planck equation

For s < t let

p(x, t | y, s) dx := P(X(t) ∈ [x, x + dx) | X(s) = y). (7.4)

Then we can write down the Chapman-Kolmogorov equation to describe the value of X at time t + Δt:

p(z, t + Δt | y, s) = ∫_R p(z, t + Δt | x, t) p(x, t | y, s) dx, (7.5)

where s < t and Equation (7.5) is valid for all Δt ≥ 0. To derive the Fokker-Planck equation we will take the limit Δt → 0. However, we first multiply both sides by a smooth test function ϕ(z) and integrate over z to give

∫_R p(z, t + Δt | y, s) ϕ(z) dz = ∫_R [ ∫_R p(z, t + Δt | x, t) ϕ(z) dz ] p(x, t | y, s) dx. (7.6)

We now rename the integration variable z to x on the left-hand side so that

∫_R p(x, t + Δt | y, s) ϕ(x) dx = ∫_R [ ∫_R p(z, t + Δt | x, t) ϕ(z) dz ] p(x, t | y, s) dx. (7.7)

On the right-hand side, we Taylor expand ϕ(z) around the point x, i.e. we write

ϕ(z) = ϕ(x) + (z − x) ϕ′(x) + (1/2)(z − x)² ϕ″(x) + o((z − x)²), (7.8)

so that

∫_R p(x, t + Δt | y, s) ϕ(x) dx
 = ∫_R [ ∫_R p(z, t + Δt | x, t) { ϕ(x) + (z − x) ϕ′(x) + (1/2)(z − x)² ϕ″(x) + o((z − x)²) } dz ] p(x, t | y, s) dx
 = ∫_R [ ϕ(x) ∫_R p(z, t + Δt | x, t) dz
  + ϕ′(x) ∫_R (z − x) p(z, t + Δt | x, t) dz
  + (1/2) ϕ″(x) ∫_R (z − x)² p(z, t + Δt | x, t) dz
  + ∫_R o((z − x)²) p(z, t + Δt | x, t) dz ] p(x, t | y, s) dx. (7.9)

We simplify the right-hand side of Equation (7.9) by considering each term individually. For the first term, we have

∫_R p(z, t + Δt | x, t) dz = 1. (7.10)

We note that the second term can also be written

∫_R (z − x) p(z, t + Δt | x, t) dz = E[X(t + Δt) − x | X(t) = x]. (7.11)

Then we can use the computational definition (6.5) to write

E[X(t + Δt) − x | X(t) = x] = E[f(x, t) Δt + g(x, t) √(Δt) ξ] = f(x, t) Δt + g(x, t) √(Δt) E[ξ] = f(x, t) Δt. (7.12)

Together, this means that

∫_R (z − x) p(z, t + Δt | x, t) dz = f(x, t) Δt. (7.13)

To use the same approach for the third term, we note that

∫_R (z − x)² p(z, t + Δt | x, t) dz = E[(X(t + Δt) − x)² | X(t) = x], (7.14)

and again use the computational definition (6.5) to write

E[(X(t + Δt) − x)² | X(t) = x] = E[(f(x, t) Δt + g(x, t) √(Δt) ξ)²]
 = f(x, t)² (Δt)² + 2 f(x, t) g(x, t) (Δt)^{3/2} E[ξ] + g(x, t)² Δt E[ξ²]
 = g(x, t)² Δt + O((Δt)²). (7.15)

Together, this means that

∫_R (z − x)² p(z, t + Δt | x, t) dz = g(x, t)² Δt + O((Δt)²). (7.16)

Substituting equations (7.10), (7.13) and (7.16) into Equation (7.9) gives

∫_R p(x, t + Δt | y, s) ϕ(x) dx = ∫_R [ ϕ(x) + ϕ′(x) f(x, t) Δt + ϕ″(x) (g(x, t)²/2) Δt ] p(x, t | y, s) dx + O((Δt)²), (7.17)

which can be rearranged to obtain

∫_R [p(x, t + Δt | y, s) − p(x, t | y, s)]/Δt · ϕ(x) dx
 = ∫_R ϕ′(x) f(x, t) p(x, t | y, s) dx + ∫_R ϕ″(x) (g(x, t)²/2) p(x, t | y, s) dx + O(Δt). (7.18)

We can then use integration by parts on the right-hand side to give

∫_R [p(x, t + Δt | y, s) − p(x, t | y, s)]/Δt · ϕ(x) dx
 = −∫_R ϕ(x) ∂/∂x [ f(x, t) p(x, t | y, s) ] dx + ∫_R ϕ(x) ∂²/∂x² [ (g(x, t)²/2) p(x, t | y, s) ] dx + O(Δt). (7.19)

The above expression can now be written as one integral:

∫_R ϕ(x) [ (p(x, t + Δt | y, s) − p(x, t | y, s))/Δt + ∂/∂x ( f(x, t) p(x, t | y, s) ) − ∂²/∂x² ( (g(x, t)²/2) p(x, t | y, s) ) ] dx = O(Δt). (7.20)

Since the test function, ϕ(x), is arbitrary, we can conclude that the term inside the square brackets must be zero, to arrive at

(p(x, t + Δt | y, s) − p(x, t | y, s))/Δt = ∂²/∂x² [ (g(x, t)²/2) p(x, t | y, s) ] − ∂/∂x [ f(x, t) p(x, t | y, s) ] + O(Δt). (7.21)

Taking the limit Δt → 0 we obtain the Fokker-Planck equation:

∂/∂t p(x, t | y, s) = ∂²/∂x² [ (g(x, t)²/2) p(x, t | y, s) ] − ∂/∂x [ f(x, t) p(x, t | y, s) ]. (7.22)

Note that to write the Fokker-Planck equation in the same form as Equation (7.3), we set y = x₀ and s = 0, and note that the function we denoted as p(x, t) should more formally be written as p(x, t | x₀, 0).

7.1.1 Example 1 of Lecture 6

For Example 1 of Lecture 6 we have f(x, t) ≡ 0, g(x, t) ≡ 1 and X(0) = 0. The corresponding Fokker-Planck equation is then

∂/∂t p(x, t) = (1/2) ∂²/∂x² p(x, t) with p(x, 0) = δ(x), (7.23)

and it has solution

p(x, t) = (1/√(2πt)) exp(−x²/(2t)). (7.24)

Figure 7.1 shows a comparison of p(x, 1) with the results of simulating many sample paths using Equation (6.8).

Figure 7.1: Solution of the Fokker-Planck equation, (7.23), given by Equation (7.24) at t = 1 (orange), together with a histogram of sample path positions at t = 1 (grey) generated using 10⁵ sample paths and Δt = 0.1.

7.2 The stationary distribution

If f(x, t) ≡ f(x) and g(x, t) ≡ g(x) then we can evaluate the long-time behaviour of the corresponding stochastic differential equations by considering the stationary distribution:

p_s(x) := lim_{t→∞} p(x, t). (7.25)

This function can be found by solving the stationary problem corresponding to (7.3), which is the ordinary differential equation

d²/dx² [ (g(x)²/2) p_s(x) ] − d/dx [ f(x) p_s(x) ] = 0, (7.26)

with solution

p_s(x) = (C/g(x)²) exp[ ∫₀ˣ 2f(y)/g(y)² dy ]. (7.27)

Using the normalisation condition, Equation (7.2), we have

C = ( ∫_R (1/g(x)²) exp[ ∫₀ˣ 2f(y)/g(y)² dy ] dx )⁻¹. (7.28)

7.2.1 Example 3 of Lecture 6

Recall the stochastic differential equation of Lecture 6, Example 3,

X(t + dt) = X(t) + [−k₁X(t)³ + k₂X(t)² − k₃X(t) + k₄] dt + k₅ dW. (7.29)

Using the method outlined above we can calculate the stationary distribution as

p_s(x) = C̄ exp[ (−3k₁x⁴ + 4k₂x³ − 6k₃x² + 12k₄x) / (6k₅²) ], (7.30)

where

C̄ = ( ∫_R exp[ (−3k₁x⁴ + 4k₂x³ − 6k₃x² + 12k₄x) / (6k₅²) ] dx )⁻¹. (7.31)

Figure 7.2: Stationary distribution for Example 3 of Lecture 6, given by Equation (7.30) (orange), together with an estimate of the stationary distribution generated using stochastic simulation (grey) with 10⁵ sample paths and Δt = 0.1.

7.3 References and further reading

An Introduction to Stochastic Processes with Applications to Biology. L. J. S. Allen.

Stochastic Processes in Physics and Chemistry. N. G. van Kampen.

7.4 Tasks

Reproduce the results shown in Figure 7.1.

Solve Equation (7.26) to show that the stationary distribution is as stated in Equation (7.27), and use the normalisation condition, Equation (7.2), to show that C is as stated in Equation (7.28).

7.5 Example Matlab code

To generate sample paths from Example 1.

function example1_lecture()
clear all; close all;
%%
no_paths=1e5;          % number of sample paths
t_final=1;             % final time
dt=0.1;                % time step
dx=0.25;               % bin width
no_steps=t_final/dt;   % total number of time steps
X=zeros(no_paths,no_steps+1);

dW=sqrt(dt)*randn(no_paths,no_steps);  % generate all random increments
X(:,2:end)=cumsum(dW,2);               % generate sample paths using cumulative sum
bins=-5:dx:5;
SDE_hist=hist(X(:,end),bins)/(no_paths*dx);  % bin the results to plot a histogram
% plot the results
figure(1); clf; hold on; box on
% plot the averaged stochastic results
bar(bins-dx/2,SDE_hist','FaceColor',[.7 .7 .7],'LineWidth',.5,'BarWidth',.6)
% plot the analytic result
plot(bins-dx/2,1/sqrt(2*pi)*exp(-bins.^2/2),'LineWidth',1);
axis([-5 5 0 0.4])
set(gca,'XTick',-4:4:4)
set(gca,'YTick',0:.1:.4)
xlabel('x')
ylabel('p(x,1)')

To generate sample paths from Example 3.

function example3_lecture()
clear all; close all;
t_final=1e6;           % final time
dt=0.1;                % time step
save_step=1;           % save the state every save_step seconds
dx=1.0;                % bin width
no_steps=t_final/dt;   % total number of time steps
no_saves=t_final/save_step+1;  % number of save steps
SDE_hist=zeros(500/dx+1,1);
% parameters
k1=1e-3; k2=0.75; k3=165; k4=1e4; k5=200;
X=100;                 % initial condition
% SDE realisation
dW=sqrt(dt)*randn(no_steps,1);  % generate all random increments
for ii=1:no_steps
    % save the results every second
    if mod((ii-1)*dt,save_step)==0
        jj=round(X/dx)+1;
        SDE_hist(jj)=SDE_hist(jj)+1;
    end
    % update the sample path
    X=X+(-k1*X.^3+k2*X.^2-k3*X+k4)*dt+k5*dW(ii);
end

% analytic expression for the stationary distribution
x=0:1:500;
ps=exp((-3*k1*x.^4+4*k2*x.^3-6*k3*x.^2+12*k4*x)/(6*k5^2));
ps=ps/sum(ps);  % normalise (the constant \bar{C})
% save the results to file
save lec7_example3.mat
%%
% load the results
load lec7_example3.mat
% plot the results
figure(1); clf; hold on; box on
bins=0:dx:500;
bar(bins+dx/2,SDE_hist'/(sum(SDE_hist)*dx),'FaceColor',[.7 .7 .7],...
    'LineWidth',.5,'BarWidth',.6);
plot(x,ps,'LineWidth',1)
axis([0 500 0 0.01])
set(gca,'XTick',0:100:500)
set(gca,'YTick',0:0.005:0.01)
xlabel('x'); ylabel('p_s(x)')
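The stationary distribution (7.30) can be evaluated and normalised numerically without computing C̄ in closed form. A Python sketch (the parameter values, in particular k₅ = 200, are reconstructions consistent with the Matlab codes and the figure scales, since the printed notes have lost trailing zeros):

```python
import numpy as np

# Example 3 parameters (assumed values with trailing zeros restored)
k1, k2, k3, k4, k5 = 1e-3, 0.75, 165.0, 1e4, 200.0

x = np.linspace(0.0, 500.0, 5001)
dx = x[1] - x[0]

# exponent of (7.30); subtract its maximum before exponentiating so the
# unnormalised values stay within floating-point range
log_ps = (-3*k1*x**4 + 4*k2*x**3 - 6*k3*x**2 + 12*k4*x) / (6*k5**2)
ps = np.exp(log_ps - log_ps.max())
ps /= ps.sum() * dx   # impose the normalisation condition (7.2)
```

The resulting density is bimodal, with peaks at the two stable states x = 100 and x = 400 and a trough at the unstable state x = 250, as in Figure 7.2.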

Chapter 8: The backward Kolmogorov equation

Sometimes we might want to understand how the likelihood of ending up in a given state depends on the starting state. This means that, in contrast to Lecture 7, the final position is known while the starting position is undetermined. We would like to be able to understand the evolution of p(x, t | y, s) in terms of the initial time, s, and initial state, y.

8.1 Derivation of the backward Kolmogorov equation

We start by renaming variables in the Chapman-Kolmogorov equation, (7.5), to obtain

p(x, t | y, s − Δs) = ∫_R p(x, t | z, s) p(z, s | y, s − Δs) dz. (8.1)

As in Lecture 7, this equation is valid for any Δs ≥ 0, and we will eventually take the limit Δs → 0. First, we Taylor expand about the point z = y to write

p(x, t | z, s) = p(x, t | y, s) + (z − y) ∂/∂y p(x, t | y, s) + (1/2)(z − y)² ∂²/∂y² p(x, t | y, s) + o((z − y)²), (8.2)

and substitute into the right-hand side of Equation (8.1) so that we have

p(x, t | y, s − Δs) = p(x, t | y, s) ∫_R p(z, s | y, s − Δs) dz
 + ∂/∂y p(x, t | y, s) ∫_R (z − y) p(z, s | y, s − Δs) dz
 + (1/2) ∂²/∂y² p(x, t | y, s) ∫_R (z − y)² p(z, s | y, s − Δs) dz + O((Δs)²). (8.3)

Using Equations (7.10), (7.13) and (7.16) we can then obtain

[p(x, t | y, s − Δs) − p(x, t | y, s)]/Δs = f(y, s) ∂/∂y p(x, t | y, s) + (g(y, s)²/2) ∂²/∂y² p(x, t | y, s) + O(Δs). (8.4)

Taking the limit Δs → 0 gives the so-called backward Kolmogorov equation:

−∂/∂s p(x, t | y, s) = f(y, s) ∂/∂y p(x, t | y, s) + (g(y, s)²/2) ∂²/∂y² p(x, t | y, s). (8.5)

Both the Fokker-Planck equation and the backward Kolmogorov equation provide an exact description of p(x, t | y, s) corresponding to the stochastic differential equation (6.6).

8.2 The diffusion coefficient

We can simplify notation by defining the diffusion coefficient,

d(x, t) = (1/2) g(x, t)². (8.6)

In this case, the Fokker-Planck equation becomes

∂p/∂t (x, t) = ∂²/∂x² [ d(x, t) p(x, t) ] − ∂/∂x [ f(x, t) p(x, t) ], (8.7)

the backward Kolmogorov equation can be written

−∂/∂s p(x, t | y, s) = f(y, s) ∂/∂y p(x, t | y, s) + d(y, s) ∂²/∂y² p(x, t | y, s), (8.8)

and the stationary distribution is

p_s(x) = (C/d(x)) exp[ ∫₀ˣ f(y)/d(y) dy ], (8.9)

where

C = ( ∫_R (1/d(x)) exp[ ∫₀ˣ f(y)/d(y) dy ] dx )⁻¹. (8.10)

8.3 Average switching times

Recall that the stochastic differential equation of Lecture 6, Example 3,

X(t + dt) = X(t) + [−k₁X(t)³ + k₂X(t)² − k₃X(t) + k₄] dt + k₅ dW, (8.11)

with parameters k₁ = 10⁻³, k₂ = 0.75, k₃ = 165 and k₄ = 10⁴, has two favourable states, x_s¹ = 100 and x_s² = 400, and an unfavourable state, x_u = 250. In Figure 6.3 we saw that, most of the time, X(t) stays close to those favourable states and occasionally it switches between them. In this section we will learn how to calculate the average time it takes to switch between the two favourable states.

Since the trajectories jump around each of the favourable states, it is not clear quite how to define mathematically when the system is in a given state (i.e. when it is close to x_s¹ or x_s²). This makes it hard to define when a switch takes place, and so we need to take a different approach. We define τ to be the average time for a trajectory to reach x_u, given that X(0) = x_s¹. If a trajectory reaches x_u there is a 50% chance it will return back to x_s¹ and a 50% chance it will continue on to x_s². The average switching time is therefore 2τ.

8.3.1 Numerical estimation of the average switching time

We can average over a number of trajectories generated using the stochastic simulation algorithm to estimate τ. We start each trajectory at x_s¹ and wait until the trajectory first leaves the interval (−∞, x_u).
For the parameters used in Lecture 6, Example 3, taking Δt = 10⁻³ and generating 10⁵ trajectories, we obtain τ_sim = 64.7.

8.3.2 Analytical expression for the average switching time

We will derive an analytical expression for τ for any stochastic differential equation of the form (6.6) where f(x, t) ≡ f(x) and g(x, t) ≡ g(x). Let

h(y, t) := P( X(t′) ∈ (−∞, x_u) for all t′ ∈ (0, t) | X(0) = y ∈ (−∞, x_u) ). (8.12)

Then

h(y, t) = ∫_{−∞}^{x_u} p(x, t | y, 0) dx, (8.13)

Figure 8.1: The distribution of switching times for Equation (6.16) as estimated using 10⁵ sample paths. Parameters are k₁ = 10⁻³, k₂ = 0.75, k₃ = 165, k₄ = 10⁴, k₅ = 200 and Δt = 10⁻³, and in each case X(0) = x_s¹ = 100.

where p(x, t | y, s) represents the probability that the trajectory remains in (−∞, x_u) and lies in the interval [x, x + dx) at time t, given that it started at y at time s < t. From our results thus far, we know that p satisfies the Fokker-Planck and backward Kolmogorov equations with the boundary conditions

p(x_u, t | y, s) = p(x, t | x_u, s) = 0, (8.14)

so that

p(x, t | y, s) = 0 if y ≥ x_u or x ≥ x_u. (8.15)

Since the coefficients f and g are assumed not to depend on time, we can shift time in the definition of p in Equation (8.13), writing p(x, y, t) for the density at x after elapsed time t having started from y:

h(y, t) = ∫_{−∞}^{x_u} p(x, y, t) dx. (8.16)

Using the same transformation in the backward Kolmogorov equation, (8.8), we obtain

∂/∂t p(x, y, t) = f(y) ∂/∂y p(x, y, t) + d(y) ∂²/∂y² p(x, y, t). (8.17)

Integrating this equation with respect to x, and using Equation (8.16), gives

∂/∂t h(y, t) = f(y) ∂/∂y h(y, t) + d(y) ∂²/∂y² h(y, t). (8.18)

Let τ(y) be the average time for a trajectory, X(t), with X(0) = y, to leave the interval (−∞, x_u). The probability that X first leaves the interval (−∞, x_u) during the time interval [t, t + dt) is then

h(y, t) − h(y, t + dt) ≈ −∂h/∂t (y, t) dt. (8.19)

This means that τ(y) can be computed as

τ(y) = −∫₀^∞ t ∂h/∂t (y, t) dt = ∫₀^∞ h(y, t) dt. (8.20)

Integrating Equation (8.18) with respect to t between 0 and ∞ gives

h(y, ∞) − h(y, 0) = f(y) d/dy ∫₀^∞ h(y, t) dt + d(y) d²/dy² ∫₀^∞ h(y, t) dt. (8.21)

Substituting for τ(y), and using the facts that h(y, ∞) = 0 and h(y, 0) = 1, we obtain

−1 = f(y) dτ/dy + d(y) d²τ/dy² for y ∈ (−∞, x_u). (8.22)

The system is closed by specifying two boundary conditions. First, p(x, t | x_u, s) = 0 implies h(x_u, t) = 0 and so

τ(x_u) = 0. (8.23)

The other boundary condition depends on the problem under consideration. Suppose, for example, that f(y) → ∞ as y → −∞: if we start trajectories further and further to the left, we expect that the exit time will not depend on the starting position. Thus, we impose

dτ/dy (−∞) = 0. (8.24)

Using an integrating factor to integrate (8.22) with respect to y, and using the boundary condition at y = −∞, we have

dτ/dy = −exp[ −∫₀ʸ f(z)/d(z) dz ] ∫_{−∞}^{y} (1/d(z)) exp[ ∫₀ᶻ f(x)/d(x) dx ] dz
 = −(1/(d(y) p_s(y))) ∫_{−∞}^{y} p_s(x) dx, (8.25)

where p_s(x) is the stationary distribution, (8.9). Integrating again with respect to y, and using the boundary condition at y = x_u, gives

τ(y) = ∫_y^{x_u} (1/(d(z) p_s(z))) ∫_{−∞}^{z} p_s(x) dx dz. (8.26)

By definition, we can now compute τ as

τ = τ(x_s¹) = ∫_{x_s¹}^{x_u} (1/(d(z) p_s(z))) ∫_{−∞}^{z} p_s(x) dx dz. (8.27)

8.4 Example 3 of Lecture 6

Recall again the stochastic differential equation of Lecture 6, Example 3,

X(t + dt) = X(t) + [−k₁X(t)³ + k₂X(t)² − k₃X(t) + k₄] dt + k₅ dW. (8.28)

The stationary distribution, p_s(x), for this example of a bistable system is given in equations (7.30)-(7.31). Substituting p_s(x) into Equation (8.27) and evaluating the resulting integrals numerically gives the theoretical value of τ. The mean exit time is plotted as a function of starting position in Figure 8.2. There is an error (approximately 9%) between the theoretical value of τ and the value τ_sim = 64.7 obtained in Section 8.3.1. The main reason for this is that we simulated the stochastic differential equation using a finite time step of Δt = 10⁻³. Decreasing the time step improves the accuracy of the stochastic

Figure 8.2: Left: trajectories of (8.28) computed using Δt = 10⁻³ (blue) and Δt = 10⁻⁵ (orange). The trajectory with the smaller time step leaves the domain (−∞, x_u) whilst the trajectory with the larger time step does not. Right: the mean exit time as a function of starting position, with the analytic estimate, Equation (8.26), shown in orange and the averaged discrete results in blue. Parameters are k₁ = 10⁻³, k₂ = 0.75, k₃ = 165, k₄ = 10⁴, k₅ = 200 and x_u = 250.

simulation algorithm. This is essentially because, even if X(t) < x_u and X(t + Δt) < x_u, there is some probability that the trajectory left the domain (−∞, x_u) during the time interval (t, t + Δt) (see Figure 8.2). To estimate this probability, we suppose that during (t, t + Δt) the particle diffuses only, with diffusion coefficient d = k₅²/2. This is a good approximation close to x_u because we know that the drift coefficient satisfies f(x_u) = 0. In this case, the probability that the trajectory left (−∞, x_u) during the time interval (t, t + Δt) is approximately (see Problem Sheet 2)

P( left (−∞, x_u) during (t, t + Δt) ) ≈ exp[ −(X(t) − x_u)(X(t + Δt) − x_u) / (d Δt) ]. (8.29)

8.5 References and further reading

An Introduction to Stochastic Processes with Applications to Biology. L. J. S. Allen.

Stochastic Processes in Physics and Chemistry. N. G. van Kampen.

8.6 Tasks

Use the computational definition of Equation (8.28) to numerically estimate the switching time τ (recall that this is defined as the average time for a trajectory to reach x_u, given that X(0) = x_s¹).

Evaluate the switching time, τ(y), from Equation (8.26) and compare your result with that estimated using repeated simulation of SDE sample paths. Repeat this exercise, but with a numerical estimate that is corrected using the result in Equation (8.29).

Complete Problem Sheet 2!

8.7 Example Matlab code

To estimate the switching time distribution for Lecture 6, Example 3.

function example3_switching_SDE_lecture()
clear all; close all;
no_paths=1e5;          % number of sample paths
dt=1e-3;               % time step
t_stop=zeros(no_paths,1);
% parameters
k1=1e-3; k2=0.75; k3=165; k4=1e4; k5=200;
X_unstable=250;
% SDE realisations
dW=sqrt(dt)*randn(1e6,1);  % generate initial set of random increments
jj=1;
for ii=1:no_paths
    X=100; tt=0;
    % until x_u is reached, evolve the sample path
    while X<X_unstable
        tt=tt+dt;
        X=X+(-k1*X.^3+k2*X.^2-k3*X+k4)*dt+k5*dW(jj);
        jj=jj+1;
        if jj>1e6  % if more random increments are needed, generate them
            dW=sqrt(dt)*randn(1e6,1);
            jj=1;
        end
    end
    % record the time at which x_u is reached
    t_stop(ii)=tt;
end
% output the mean and variance of the switching time
mean(t_stop)
var(t_stop)/no_paths
%%
% save the results to file
% save example3_switching_pdf.mat
% load the results from file
% load example3_switching_pdf.mat

%%
% create a histogram of switching times
bin_width=5;
t_stop_hist=hist(t_stop,0:bin_width:1000);
% plot the results
figure(1); clf; hold on; box on
stairs(bin_width/2:bin_width:1000+bin_width/2,...
    t_stop_hist'/(sum(t_stop_hist)*bin_width),'LineWidth',1.0);
axis([0 250 0 0.02])
set(gca,'XTick',0:50:250)
set(gca,'YTick',0:0.01:0.02)
xlabel('switching time'); ylabel('frequency')

To compare the analytic estimate for the switching time with that obtained using repeated stochastic simulation.

function example3_switching_analytical_lecture()
clear all; close all;
% parameters
k1=1e-3; k2=0.75; k3=165; k4=1e4; k5=200;
x_unstable=250;
%% analytical calculations
% analytical calculation of the switching time as a function of y
x0=0;        % left side of domain
xN=600;      % right side of domain
delta=1.0;   % step size
% regular mesh for integrals
xseries=x0:delta:xN;
% regular mesh for evaluating tau
tauseries=x0:delta:x_unstable;
N=length(xseries);
tauN=length(tauseries);
p=zeros(1,length(xseries));
% drift and diffusion functions for the SDE
f=@(x)(-k1*x.^3+k2*x.^2-k3*x+k4);
d=k5^2/2;
% compute the stationary distribution

% define the function for the stationary distribution
pfunc=@(x)exp(integral(@(y)(f(y)./d),0,x))/d;
p=arrayfun(pfunc,xseries);
% compute the norm of p
norm_p=trapz(xseries,p);
% normalise p
p=p/norm_p;
% compute the exit time
% compute the function Z(x) = \int_0^x p(y) dy for values x = x0 to xN
Z=cumtrapz(xseries,p);
% compute the function Z(x)/(d(x)*p(x)) for values x = x0 to xN
tauintegrand=Z./(d.*p);
% compute the exit time; integration limits reversed to work well with cumtrapz
tauy=tauintegrand(1:tauN);
tau=cumtrapz(tauseries(end:-1:1),tauy(end:-1:1));
% analytical estimate for the switching time from x_s^1
-tau(151)
%% computational approximation
no_paths=1e4;
dt=1e-3;
t_stop=zeros(no_paths,tauN);
% SDE realisations
dW=sqrt(dt)*randn(1e6,1);
correct=rand(1e6,1);
jj=1; kk=1;
for rr=1:tauN
    rr
    for ii=1:no_paths
        X=tauseries(rr); xold=tauseries(rr); tt=0;
        while X<x_unstable
            tt=tt+dt;
            X=X+(-k1*X.^3+k2*X.^2-k3*X+k4)*dt+k5*dW(jj);
            jj=jj+1;
            if jj>1e6
                dW=sqrt(dt)*randn(1e6,1);
                jj=1;
            end
            % correction probability
            if correct(kk)<exp(-(xold-x_unstable)*(X-x_unstable)/(dt*k5^2/2))
                X=1e10;
            end
            kk=kk+1;
            if kk>1e6
                correct=rand(1e6,1);
                kk=1;
            end

            xold=X;
        end
        t_stop(ii,rr)=tt;
    end
end
average_time=mean(t_stop,1);
var_time=var(t_stop,1)/no_paths;
%%
% save example3_switching_diffy.mat
% load example3_switching_diffy.mat
%%
figure(1); clf; hold on; box on
stairs(tauseries,average_time,'LineWidth',1.0);
plot(tauseries(end:-1:1),-tau,'LineWidth',1.0);
axis([0 250 0 70])
set(gca,'XTick',0:50:250)
set(gca,'YTick',0:10:70)
xlabel('X(0)'); ylabel('mean exit time')
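The double integral in Equation (8.26) is straightforward to evaluate with cumulative quadrature, mirroring the Matlab listing above. A Python sketch (the parameter values, in particular k₅ = 200, are reconstructions with trailing zeros restored; d = k₅²/2):

```python
import numpy as np

# reconstructed Example 3 parameters and constant diffusion coefficient
k1, k2, k3, k4, k5 = 1e-3, 0.75, 165.0, 1e4, 200.0
d = k5**2 / 2
x_u = 250.0   # unstable steady state

x = np.linspace(0.0, x_u, 2501)
dx = x[1] - x[0]

# unnormalised stationary distribution on (0, x_u); the normalisation
# constant cancels between numerator and denominator in (8.26)
f = -k1*x**3 + k2*x**2 - k3*x + k4
log_ps = np.cumsum(f / d) * dx        # \int_0^x f(y)/d dy (Riemann sum)
ps = np.exp(log_ps - log_ps.max())

inner = np.cumsum(ps) * dx            # \int_0^z p_s(x) dx
integrand = inner / (d * ps)

# tau(y) = \int_y^{x_u} [\int_0^z p_s(x) dx] / (d p_s(z)) dz
tau = np.cumsum(integrand[::-1])[::-1] * dx
tau_switch = tau[np.argmin(np.abs(x - 100.0))]  # tau at x_s^1 = 100
```

As expected from (8.23), τ(y) decreases monotonically to zero as y approaches x_u, and τ(x_s¹) comes out on the order of the simulated value τ_sim reported in the text.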

Chapter 9: The chemical Fokker-Planck equation

In Lecture 5, we saw that it was possible to approximate the dynamics of the chemical master equation using the chemical Langevin equation:

X(t + dt) ≈ X(t) + dt Σⱼ₌₁ᴹ aⱼ(X(t)) νⱼ + Σⱼ₌₁ᴹ √(aⱼ(X(t))) dWⱼ νⱼ, (9.1)

where the dWⱼ, j = 1, ..., M, are statistically independent Gaussian white noise processes. The computational definition of the chemical Langevin equation is

X(t + Δt) ≈ X(t) + Δt Σⱼ₌₁ᴹ aⱼ(X(t)) νⱼ + √(Δt) Σⱼ₌₁ᴹ √(aⱼ(X(t))) ξⱼ νⱼ, (9.2)

where ξⱼ ∼ N(0, 1) for j = 1, ..., M. In this lecture, we will learn how to connect the chemical Langevin equation with the chemical Fokker-Planck equation, proceeding by means of example.

9.1 Example: production and degradation

For the production/degradation system considered in Lecture 2,

A →(k₁) ∅, (9.3)
∅ →(k₂) A, (9.4)

the chemical Langevin equation becomes

X(t + dt) = X(t) + [−k₁X(t) + k₂ν] dt − √(k₁X(t)) dW₁ + √(k₂ν) dW₂, (9.5)

with computational definition

X(t + Δt) = X(t) + [−k₁X(t) + k₂ν] Δt − √(k₁X(t)) √(Δt) ξ₁ + √(k₂ν) √(Δt) ξ₂, (9.6)

where ξ₁ and ξ₂ are two random numbers sampled from the unit normal distribution. Equation (9.5) is different from those we have studied in the last three lectures because it has two independent white noises. However, on Problem Sheet 3 we will show that this does not complicate the derivation of the corresponding Fokker-Planck equation. For this example, we have drift and diffusion coefficients

f(x) = −k₁x + k₂ν and d(x) = (1/2)(k₁x + k₂ν), (9.7)

and

∂/∂t p(x, t) = ∂²/∂x² [ d(x) p(x, t) ] − ∂/∂x [ f(x) p(x, t) ]. (9.8)

We are now in a position to apply the theory that we developed over the last three lectures. The stationary distribution is given by

p_s(x) = (C/d(x)) exp[ ∫₀ˣ f(y)/d(y) dy ]
 = (2C/(k₁x + k₂ν)) exp[ ∫₀ˣ 2(−k₁y + k₂ν)/(k₁y + k₂ν) dy ]
 = (2C/(k₁x + k₂ν)) exp[ ∫₀ˣ ( −2 + 4k₂ν/(k₁y + k₂ν) ) dy ]
 = (2C/(k₁x + k₂ν)) exp[ −2x + (4k₂ν/k₁) log(k₁x + k₂ν) ]
 = 2C̄ exp[ −2x + (4k₂ν/k₁ − 1) log(k₁x + k₂ν) ], (9.9)

where the constant arising from the lower limit of the integral has been absorbed into C̄, with

C̄ = ( 2 ∫_R exp[ −2x + (4k₂ν/k₁ − 1) log(k₁x + k₂ν) ] dx )⁻¹. (9.10)

Figure 9.1: Right: comparison of the stationary distribution as estimated using repeated stochastic simulation (grey bars) and analytically using Equation (9.9) (blue). Left: mean exit time as estimated using Equation (9.11) (red) and using repeated stochastic simulation (blue). Parameters are A(0) = 0, k₁ = 0.1 s⁻¹ and k₂ν = 1.0 s⁻¹.

We can look at how well the chemical Fokker-Planck equation performs when it comes to estimating mean transition times. Define τ_SSA(n), n = 1, 2, 3, ..., 18, to be the average time predicted by the Gillespie stochastic simulation algorithm for trajectories to leave the interval (0, 18] when the initial condition is A(0) = n. We can approximate τ_SSA(y) using our estimate derived from the Fokker-Planck equation:

τ_FPE(y) = ∫_y^{19} (1/(d(z) p_s(z))) ∫ᶻ p_s(x) dx dz, (9.11)

where p_s(x) is as given by equations (9.9) and (9.10). We can evaluate the integrals in τ_FPE numerically. The results are shown in Figure 9.1, and demonstrate that Equation (9.11) is a good approximation of τ_SSA(n).

9.2 The chemical Fokker-Planck equation

Now consider a general well-stirred reaction system with N species, S₁, ..., S_N, that may be involved in M possible reactions, R₁, ..., R_M, as in Lectures 3 and 5. Then the chemical Langevin equation is

X(t + dt) ≈ X(t) + dt Σⱼ₌₁ᴹ aⱼ(X(t)) νⱼ + Σⱼ₌₁ᴹ √(aⱼ(X(t))) dWⱼ νⱼ, (9.12)

and the chemical Fokker-Planck equation is

∂/∂t p(x, t) = − Σᵢ₌₁ᴺ ∂/∂xᵢ [ Σⱼ₌₁ᴹ νⱼᵢ aⱼ(x) p(x, t) ]
 + (1/2) Σᵢ₌₁ᴺ ∂²/∂xᵢ² [ Σⱼ₌₁ᴹ νⱼᵢ² aⱼ(x) p(x, t) ]
 + Σᵢ₌₁ᴺ Σₖ₌₁^{i−1} ∂²/∂xᵢ∂xₖ [ Σⱼ₌₁ᴹ νⱼᵢ νⱼₖ aⱼ(x) p(x, t) ]. (9.13)

9.3 References and further reading

An Introduction to Stochastic Processes with Applications to Biology. L. J. S. Allen.

Stochastic Processes in Physics and Chemistry. N. G. van Kampen.

9.4 Tasks

Estimate the stationary distribution for the production/degradation system using repeated stochastic simulation, and compare your result with the corresponding analytical estimate derived in equations (9.9) and (9.10).

Use repeated stochastic simulation to estimate τ_SSA, the average time to leave the interval (0, 18] when the initial condition is A(0) = n. Evaluate the corresponding approximate mean exit time using the Fokker-Planck equation and compare your results.

9.5 Example Matlab code

To estimate the stationary distribution for the production-degradation example.

function production_degradation_stationary_lecture()
clear all; close all;
k1=0.1;      % decay rate
k2V=1;       % production rate
t_final=1;   % final time
%%

% estimate stationary distribution using a long sample path
no_samples=1e7;
A_stationary=zeros(101,1);
t=0; A=k2V/k1;                   % set the initial time and molecule numbers
for i=1:no_samples
    while t<i
        a=k1*A+k2V;              % calculate a0
        tau=1/a*log(1/rand);     % calculate time until next reaction
        % update molecule numbers and time
        r2=a*rand;
        if r2<k1*A
            A=A-1;
        else
            A=A+1;
        end
        t=t+tau;
    end
    if A<=100
        A_stationary(A+1)=A_stationary(A+1)+1;
    end
end

%%
% analytic expression for the stationary distribution
dx=0.1;
x=0:dx:50;
ps=2*exp(-2*x+(4*k2V/k1-1)*log(k1*x+k2V));
ps=ps/(sum(ps)*dx);

%%
% plot results
figure(2); clf; cla; hold on; box on
bar(0:1:100,A_stationary/no_samples,'FaceColor',[.7 .7 .7],...
    'LineWidth',0.5,'BarWidth',0.6);
plot(x,ps,'LineWidth',1.0)
axis([0 25 0 0.15])
set(gca,'XTick',0:5:25)
set(gca,'YTick',0:0.05:0.15)
set(gca,'YTickLabel',arrayfun(@(s)sprintf('%.2f', s),...
    cellfun(@(s)str2num(s),get(gca,'YTickLabel')),'UniformOutput',false))
ylabel('frequency'); xlabel('number of A molecules')

Chapter 10: A simple model of diffusion

In Lectures 1-5 we learnt how to analyse and simulate models of chemical reactions where the systems were all assumed to be well-mixed, that is, the concentrations of reacting species were assumed spatially homogeneous throughout the reaction volume, ν. The goal of this lecture is to begin to extend this approach in a simple way to systems that are not well-mixed. To do this we need a model of diffusion.

10.1 A compartment-based approach to diffusion

Suppose that we want to model the diffusion of chemical species A on the domain [0, L] × [0, h] × [0, h], where L = Kh. To do this we will divide the domain along the x axis into K compartments of length h. We denote the number of A molecules in the ith compartment, x ∈ [(i−1)h, ih), by A_i, i = 1, ..., K. As a result of Brownian motion, the molecules jump between neighbouring compartments. This means that we can model diffusion as the following chain of chemical reactions:

A_1 ⇌ A_2 ⇌ A_3 ⇌ ... ⇌ A_K (each jump at rate d),    (10.1)

where

A_i ⇌ A_{i+1}  means  A_i →^{d} A_{i+1}  and  A_{i+1} →^{d} A_i.    (10.2)

The rate constant, d, has units s^{−1}. This means that the propensity function for a diffusion event from box i to box i+1 is d A_i. The system of chemical reactions (10.1) can be simulated using the Gillespie stochastic simulation algorithm outlined in Lecture 3.

Example

We illustrate this method using an example in which we simulate 10,000 molecules starting from position x = 0.4 mm in the interval x ∈ [0, L], where L = 1 mm. We will take d = 0.16 s^{−1} and K = 40 (so that h = 0.025 mm). Since x = 0.4 mm is on the boundary between the 16th and 17th compartments, we take the initial condition to be A_16(0) = 5000, A_17(0) = 5000 and A_i(0) = 0 for i ≠ 16, 17. The left-hand plot of Figure 10.1 shows six sample paths of individual molecules diffusing in the system, whereas the right-hand plot of Figure 10.1 shows the density profile at time t = 4 minutes.
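The compartment-based SSA for the diffusion chain above can be sketched in a few lines. The following is a minimal, self-contained Python sketch (the course codes are in Matlab); the function name, default arguments and seeding are illustrative choices, not part of the course code.

```python
import math
import random

def diffuse(K=40, d=0.16, n0=10000, t_final=240.0, start=16, seed=1):
    """Gillespie SSA for a chain of K compartments with jump rate d.

    Half of the n0 molecules start in compartment `start`, half in
    `start + 1` (mimicking an initial position on a compartment boundary).
    Returns the vector of compartment occupancies at time t_final.
    """
    rng = random.Random(seed)
    A = [0] * K
    A[start - 1] = n0 // 2
    A[start] = n0 - n0 // 2
    t = 0.0
    while True:
        # total propensity: each molecule jumps at rate d per allowed direction;
        # boundary compartments can only jump inwards
        a0 = d * (2 * sum(A) - A[0] - A[-1])
        t += math.log(1.0 / rng.random()) / a0
        if t > t_final:
            return A
        r = rng.random() * a0
        for i in range(K - 1):          # rightward jumps from boxes 1..K-1
            r -= d * A[i]
            if r < 0:
                A[i] -= 1
                A[i + 1] += 1
                break
        else:
            for i in range(1, K):       # leftward jumps from boxes 2..K
                r -= d * A[i]
                if r < 0:
                    A[i] -= 1
                    A[i - 1] += 1
                    break
```

Since the molecules only hop between boxes, the total count is conserved; averaging many such runs reproduces the behaviour of the diffusion equation derived below.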
10.2 Connection to a macroscale diffusion coefficient

We would now like to think about how to relate the jump rate, d, to a macroscale diffusion coefficient, D. We denote by p(n, t) the joint probability that A_i(t) = n_i, i = 1, ..., K, where n = [n_1, n_2, ..., n_K]. Let us define the operators R_i, L_i : ℕ^K → ℕ^K (where ℕ is the set of non-negative integers) by

R_i : [n_1, ..., n_i, n_{i+1}, ..., n_K] → [n_1, ..., n_i + 1, n_{i+1} − 1, ..., n_K],    (10.3)

for i = 1, ..., K−1, and

L_i : [n_1, ..., n_{i−1}, n_i, ..., n_K] → [n_1, ..., n_{i−1} − 1, n_i + 1, ..., n_K],    (10.4)

for i = 2, ..., K.

Figure 10.1: Left: the paths of six individual molecules. Right: histogram of the number of A molecules in each compartment (grey bars) together with the solution of the diffusion equation (10.12) (blue). For further details see the example in Section 10.1.

Then the (diffusion) chemical master equation, which corresponds to the system of chemical reactions given by Equation (10.1), can be written as follows:

∂p(n,t)/∂t = d \sum_{j=1}^{K−1} { (n_j + 1) p(R_j n, t) − n_j p(n, t) } + d \sum_{j=2}^{K} { (n_j + 1) p(L_j n, t) − n_j p(n, t) }.    (10.5)

The mean is defined as the vector M(t) ≡ [M_1, M_2, ..., M_K], where

M_i(t) = \sum_n n_i p(n, t) ≡ \sum_{n_1=0}^{∞} \sum_{n_2=0}^{∞} ... \sum_{n_K=0}^{∞} n_i p(n, t)    (10.6)

gives the mean number of molecules in the ith compartment, i = 1, 2, ..., K. To derive an evolution equation for the mean vector M(t) we can follow the method from Section 2.1. Multiplying (10.5) by n_i and summing over all the possible values the state vector, n, can take, we obtain (see Problem Sheet 3) a system of equations for the M_i of the form

∂M_i/∂t = d (M_{i+1} − 2M_i + M_{i−1}), for i = 2, ..., K−1,    (10.7)
∂M_1/∂t = d (M_2 − M_1),    (10.8)
∂M_K/∂t = d (M_{K−1} − M_K).    (10.9)

The classical deterministic description of diffusion is written in terms of the concentration a(x, t), which can be approximated as a(x_i, t) ≈ M_i(t)/h, where x_i is the centre of the ith compartment, i = 1, 2, ..., K. Dividing (10.7) by h, we obtain

∂a(x_i, t)/∂t ≈ d ( a(x_i + h, t) − 2a(x_i, t) + a(x_i − h, t) ).    (10.10)

By Taylor expanding the right-hand side we arrive at

∂a(x_i, t)/∂t = d h² ∂²a(x_i, t)/∂x² + O(h⁴).    (10.11)

This means that, in the limit h → 0, the system of equations (10.10) is equivalent to the diffusion equation with D = dh². Since molecules of A cannot move left out of compartment 1 or right out of compartment K, we see from Equations (10.8) and (10.9) that zero-flux boundary conditions are appropriate for the diffusion equation, so that we have

∂a/∂t = D ∂²a/∂x²  for x ∈ (0, L),  with  ∂a/∂x |_{x=0,L} = 0.    (10.12)

The system is closed by specifying appropriate initial conditions. A comparison of the solution of Equation (10.12) with the results from stochastic simulation of the system is shown in Figure 10.1.

10.3 Analysis of the variance

We can extract further information from the (diffusion) chemical master equation by also considering the evolution of the variance vector V(t) ≡ [V_1(t), V_2(t), ..., V_K(t)], where

V_i(t) = \sum_n (n_i − M_i(t))² p(n, t) ≡ \sum_{n_1=0}^{∞} \sum_{n_2=0}^{∞} ... \sum_{n_K=0}^{∞} (n_i − M_i(t))² p(n, t)    (10.13)

gives the variance in the number of A molecules in compartment i. To derive the evolution equation for the vector V(t), we define more generally the covariance matrix {V_ij} by

V_ij = \sum_n n_i n_j p(n, t) − M_i M_j,  for i, j = 1, 2, ..., K.    (10.14)

From Equation (10.14) we see that the variance vector comprises the diagonal entries of this matrix: V_i = V_ii for i = 1, 2, ..., K. Multiplying Equation (10.5) by n_i² and summing over n, we obtain

∂/∂t \sum_n n_i² p(n, t) = d \sum_{j=1}^{K−1} { \sum_n n_i² (n_j + 1) p(R_j n, t) − \sum_n n_i² n_j p(n, t) }
                        + d \sum_{j=2}^{K} { \sum_n n_i² (n_j + 1) p(L_j n, t) − \sum_n n_i² n_j p(n, t) }.    (10.15)

Let us consider the case i ∈ {2, ..., K−1}. We first evaluate the term corresponding to j = i in the first sum on the right-hand side. We have

\sum_n n_i² (n_i + 1) p(R_i n, t) − \sum_n n_i² n_i p(n, t) = \sum_n (n_i − 1)² n_i p(n, t) − \sum_n n_i² n_i p(n, t)
  = \sum_n ( −2n_i² + n_i ) p(n, t)    (10.16)
  = −2V_i − 2M_i² + M_i.
Here we changed indices in the first sum, R_i n → n, and then used Equations (10.6) and (10.14). Similarly, the term corresponding to j = i − 1 in the first sum on the right-hand side of Equation (10.15) can be rewritten as

\sum_n n_i² (n_{i−1} + 1) p(R_{i−1} n, t) − \sum_n n_i² n_{i−1} p(n, t) = \sum_n ( 2 n_i n_{i−1} + n_{i−1} ) p(n, t) = 2V_{i,i−1} + 2M_i M_{i−1} + M_{i−1}.    (10.17)

Other terms, corresponding to j ≠ i, i−1, in the first sum on the right-hand side of Equation (10.15) are equal to zero. The second sum on the right-hand side of Equation (10.15) can be handled analogously to give, finally,

∂/∂t \sum_n n_i² p(n, t) = d { 2V_{i,i−1} + 2M_i M_{i−1} + M_{i−1} − 2V_i − 2M_i² + M_i }
                        + d { 2V_{i,i+1} + 2M_i M_{i+1} + M_{i+1} − 2V_i − 2M_i² + M_i }.    (10.18)

Now using Equation (10.14) and Equation (10.7) on the left-hand side of Equation (10.18), we obtain

∂/∂t \sum_n n_i² p(n, t) = ∂V_i/∂t + 2M_i ∂M_i/∂t = ∂V_i/∂t + d ( 2M_i M_{i+1} + 2M_i M_{i−1} − 4M_i² ).    (10.19)

Substituting this into Equation (10.18), we have

∂V_i/∂t = 2d { V_{i,i+1} + V_{i,i−1} − 2V_i } + d { M_{i+1} + M_{i−1} + 2M_i },    (10.20)

for i = 2, ..., K−1. A similar analysis gives

∂V_1/∂t = 2d { V_{1,2} − V_1 } + d { M_2 + M_1 },    (10.21)
∂V_K/∂t = 2d { V_{K,K−1} − V_K } + d { M_{K−1} + M_K }.    (10.22)

We see that the evolution equation for the variance vector, V(t), depends on the mean, M, the variance, V, and the off-diagonal terms of the covariance matrix, {V_ij}. To get a closed system of equations, we have to derive evolution equations for the V_ij too. This can be done by multiplying (10.5) by n_i n_j, summing over n, and following the same arguments as before.

10.4 References and further reading

A practical guide to stochastic simulations of reaction-diffusion processes. R. Erban, S. J. Chapman and P. K. Maini. arXiv (2007).

10.5 Tasks

Generate a number of sample paths from system (10.1) using the Gillespie algorithm. Show that the averaged discrete dynamics is consistent with the diffusion equation. [Note that to do this you could, for example, use a forward Euler method for the time stepping, with centred finite differences for the second spatial derivative.]

10.6 Example Matlab code

To generate sample paths for a simple model of diffusion.

function example10_sample_paths_lecture()

clear all; close all;

%%
D=0.0001;                 % diffusion coefficient [mm^2/s]
L=1;                      % domain length [mm]
t_final=10*60;            % final time [s]
no_boxes=40;
h=L/no_boxes;             % compartment width
no_realisations=6;
X_initial=0.4;            % initial position [mm]

X=cell(no_realisations,1);
t=cell(no_realisations,1);
for ii=1:no_realisations
    % initial position corresponds to a compartment boundary
    % put half into each neighbouring compartment
    if ii<=no_realisations/2
        X{ii}(1)=X_initial-h/2; t{ii}(1)=0;
    else
        X{ii}(1)=X_initial+h/2; t{ii}(1)=0;
    end
    time=0; kk=1;
    while time<t_final
        r1=rand; r2=rand;
        a=2*D/(h*h);
        time=time+(1/a)*log(1/r1);   % time of the next reaction
        % check to see if left-ward jump
        if r2*a<D/h^2
            X{ii}(kk+1)=X{ii}(kk)-h;
        else
            % if not, must be right-ward jump
            X{ii}(kk+1)=X{ii}(kk)+h;
        end
        % if jumped left out of domain, reflect back into domain
        if X{ii}(kk+1)<0
            X{ii}(kk+1)=h/2;
        end
        % if jumped right out of domain, reflect back into domain
        if X{ii}(kk+1)>L
            X{ii}(kk+1)=L-h/2;
        end
        t{ii}(kk+1)=time;
        kk=kk+1;
    end
end

%%

% plot the results
figure(1); clf; hold on; box on
for ii=1:no_realisations
    t{ii}=t{ii}/60;                  % convert seconds to minutes
    stairs(X{ii},t{ii},'LineWidth',1)
end
axis([0 1 0 10])
set(gca,'XTick',0:0.2:1.0)
set(gca,'YTick',0:2:10)
xlabel('x [mm]')
ylabel('time [min]')
set(gca,'XTickLabel',arrayfun(@(s)sprintf('%.1f', s),...
    cellfun(@(s)str2num(s),get(gca,'XTickLabel')),'UniformOutput',false))

To estimate the distribution of particle positions.

function example10_distribution_lecture()

clear all; close all;

%%
D=0.0001;                 % diffusion coefficient [mm^2/s]
L=1;                      % domain length [mm]
t_final=4*60;             % final time [s]
no_molecules=1e4;         % total number of molecules
no_boxes=40;              % number of compartments
h=L/no_boxes;             % width of compartments
mesh=[h/2:h:L-h/2];
A=zeros(no_boxes,1);
A(16)=no_molecules/2;     % put half the molecules into box 16 initially
A(17)=no_molecules/2;     % put half the molecules into box 17 initially

time=0;
while time<t_final
    r1=rand;
    a=2*D/(h^2)*(no_molecules-A(1)/2-A(no_boxes)/2);  % calculate a0
    time=time+(1/a)*log(1/r1);   % calculate time of next reaction
    ss=0; k=0;
    % decide which reaction occurred
    r2=rand;
    % check to see if molecule jumped to the right
    while ss<=r2*a && k<no_boxes-1
        k=k+1;
        ss=ss+D/h^2*A(k);
    end
    % implement rightwards jump from correct compartment
    if ss>r2*a

        A(k)=A(k)-1;
        A(k+1)=A(k+1)+1;
    else
        % else molecule must have jumped to the left
        k=1;
        while ss<=r2*a && k<no_boxes
            k=k+1;
            ss=ss+D/h^2*A(k);
        end
        % implement leftwards jump from correct compartment
        A(k)=A(k)-1;
        A(k-1)=A(k-1)+1;
    end
end

%%
% solve the diffusion equation numerically using the forward Euler method
dt=0.1;
M=zeros(no_boxes,1);
M(16)=no_molecules/2;
M(17)=no_molecules/2;
hh=dt*D/(h^2);
time_PDE=0;
while time_PDE<=t_final
    M_old=M;
    M(1)=M_old(1)+hh*(M_old(2)-M_old(1));
    M(2:end-1)=M_old(2:end-1)+hh*(M_old(3:end)+M_old(1:end-2)-2*M_old(2:end-1));
    M(end)=M_old(end)+hh*(M_old(end-1)-M_old(end));
    time_PDE=time_PDE+dt;
end

%%
% plot the results
figure(1); clf; hold on; box on
bar(mesh,A,'FaceColor',[.7 .7 .7],'LineWidth',0.5,'BarWidth',0.6)
plot(mesh,M,'LineWidth',1)
axis([0 1 0 500])
set(gca,'XTick',0:0.2:1.0)
set(gca,'YTick',0:100:500)
xlabel('x [mm]')
ylabel('A')
set(gca,'XTickLabel',arrayfun(@(s)sprintf('%.1f', s),...
    cellfun(@(s)str2num(s),get(gca,'XTickLabel')),'UniformOutput',false))

Chapter 11: The reaction-diffusion master equation

Now that we have outlined a simple model for diffusion, we are in a position to write down a model for chemical reactions that take place in a reaction volume that cannot be assumed well-mixed.

11.1 A compartment-based model for production, degradation and diffusion

Suppose that we want to model the production, degradation and diffusion of chemical species A on the domain [0, L] × [0, h] × [0, h], where L = Kh. As in Lecture 10, we divide the domain along the x axis into K compartments of length h, and denote the number of A molecules in the ith compartment, x ∈ [(i−1)h, ih), by A_i, i = 1, ..., K. The reaction-diffusion process we will consider can be described by the following set of chemical reactions:

A_1 ⇌ A_2 ⇌ A_3 ⇌ ... ⇌ A_K (each jump at rate d),    (11.1)
A_i →^{k_1} ∅, for i = 1, 2, ..., K,    (11.2)
∅ →^{k_2 h³} A_i, for i = 1, 2, ..., K/5.    (11.3)

Here, Equation (11.2) describes the decay of A molecules in each compartment at rate k_1 s^{−1}, and Equation (11.3) describes the production of A molecules in each of the first K/5 compartments at rate k_2 h³ s^{−1}.

Figure 11.1: Histogram of the number of A molecules in each compartment (grey bars) together with the solution of the reaction-diffusion equation (11.9) (blue) at (a) t = 10 minutes and (b) t = 30 minutes. Parameters are: L = 1 mm, D = 100 µm² s^{−1}, K = 40 (h = 25 µm), k_1 = 10^{−3} s^{−1} and k_2 = 2 × 10^{−5} µm^{−3} s^{−1}.

This model is relatively easy to simulate using the Gillespie algorithm. Each diffusion reaction has propensity function d A_i(t) s^{−1}, where D = dh², whilst the production reactions have

propensity function k_2 h³ s^{−1} and the decay reactions have propensity function k_1 A_i(t) s^{−1}. Figure 11.1 shows the results from stochastic simulation, starting with no molecules of A in the system.

11.2 The reaction-diffusion master equation

This model can be analysed using the reaction-diffusion master equation. Let p(n, t) = P(A(t) = n | A(0) = A_0), where A = [A_1, A_2, ..., A_K] and n = [n_1, n_2, ..., n_K]. Then the reaction-diffusion master equation for system (11.1)-(11.3) can be written as follows:

∂p(n,t)/∂t = d \sum_{i=1}^{K−1} { (n_i + 1) p(R_i n, t) − n_i p(n, t) } + d \sum_{i=2}^{K} { (n_i + 1) p(L_i n, t) − n_i p(n, t) }
           + k_1 \sum_{i=1}^{K} { (n_i + 1) p(n_1, ..., n_i + 1, ..., n_K, t) − n_i p(n, t) }
           + k_2 h³ \sum_{i=1}^{K/5} { p(n_1, ..., n_i − 1, ..., n_K, t) − p(n, t) }.    (11.4)

Following similar derivations to those previously, we can show that

∂M_1/∂t = d (M_2 − M_1) + k_2 h³ − k_1 M_1,    (11.5)
∂M_i/∂t = d (M_{i+1} − 2M_i + M_{i−1}) + k_2 h³ − k_1 M_i, for i = 2, ..., K/5,    (11.6)
∂M_i/∂t = d (M_{i+1} − 2M_i + M_{i−1}) − k_1 M_i, for i = K/5 + 1, ..., K−1,    (11.7)
∂M_K/∂t = d (M_{K−1} − M_K) − k_1 M_K.    (11.8)

Expanding using Taylor series, we can show that, in the limit h → 0, the concentration of A molecules is given by

∂a/∂t = D ∂²a/∂x² + k_2 χ_{[0, L/5]} − k_1 a,  with  ∂a/∂x |_{x=0,L} = 0.    (11.9)

Again, the system is closed by specifying appropriate initial conditions. Figure 11.1 compares the solution of Equation (11.9) with the results generated using stochastic simulation with the initial condition a(x, 0) = 0.

11.3 A compartment-based model for higher order reactions

We now consider the reaction and diffusion of species A and B on the domain [0, L] × [0, h] × [0, h], where L = Kh, with L = 1 mm and h = 25 µm, and

A + A →^{k_1} ∅,  A + B →^{k_2} ∅,    (11.10)
A →^{k_3} ∅,  B →^{k_4} ∅,  ∅ →^{k_5} A,    (11.11)
∅ →^{k_6} B in the subdomain [3L/5, L] × [0, h] × [0, h].    (11.12)

We model this system using a compartment-based approach: we divide the computational domain into 40 compartments of volume h³ and denote the number of A (B) molecules in compartment i by A_i(t) (B_i(t)) for i = 1, ..., K. Denoting the diffusion coefficients of A and B by D_A and D_B, respectively, the diffusion reactions are

A_1 ⇌ A_2 ⇌ A_3 ⇌ ... ⇌ A_K (each jump at rate d_A),    (11.13)
B_1 ⇌ B_2 ⇌ B_3 ⇌ ... ⇌ B_K (each jump at rate d_B),    (11.14)

where d_A = D_A/h² and d_B = D_B/h².

Second order reactions are implemented in the compartment-based approach by assuming that only molecules that are in the same compartment can react with each other. This means that we have the following chemical reactions:

A_i + A_i →^{k_1} ∅,  A_i + B_i →^{k_2} ∅,  i = 1, 2, ..., K,    (11.15)
A_i →^{k_3} ∅,  B_i →^{k_4} ∅,  ∅ →^{k_5} A_i,  i = 1, 2, ..., K,    (11.16)
∅ →^{k_6} B_i,  i = 3K/5 + 1, ..., K.    (11.17)

Figure 11.2: Left: histogram of the number of A molecules in each compartment (grey bars) together with the solution of Equation (11.18) (blue). Right: histogram of the number of B molecules in each compartment (grey bars) together with the solution of Equation (11.19) (blue). Parameters are: L = 1 mm, D_A = D_B = 1.0 µm² s^{−1}, K = 40 (h = 25 µm), with k_1, k_2 in µm³ s^{−1}, k_3, k_4 in s^{−1}, k_5 = 10^{−7} µm^{−3} s^{−1} and k_6 = 10^{−6} µm^{−3} s^{−1}.

As with the well-stirred case, the introduction of second order reactions means that we cannot write down closed-form solutions for the mean numbers of A and B molecules in the system. To make progress, we can use the Law of Mass Action to write down partial differential equations for a(x, t) ≈ A_i(t) and b(x, t) ≈ B_i(t), the mean numbers of A and B molecules in the compartment containing x ≈ ih:

∂a/∂t = D_A ∂²a/∂x² − (2k_1/h³) a² − (k_2/h³) ab − k_3 a + k_5 h³,    (11.18)
∂b/∂t = D_B ∂²b/∂x² − (k_2/h³) ab − k_4 b + k_6 h³ χ_{[3L/5, L]},    (11.19)

together with zero-flux boundary conditions

∂a/∂x |_{x=0,L} = 0  and  ∂b/∂x |_{x=0,L} = 0.    (11.20)

Figure 11.2 shows both the results from stochastic simulation and the solution of the approximate partial differential equation description of the system.

11.4 Choice of compartment size, h

An important question that was not addressed in the previous sections is: what is the appropriate choice of the compartment size, h? Until now, we have mostly considered linear models, for which we were able to derive exact equations for the mean molecule numbers, for example (11.5)-(11.8), and derive the corresponding deterministic reaction-diffusion partial differential equation (11.9) for the concentration of A by dividing by h³ and taking the limit h → 0. Equation (11.9) can also be viewed as an equation for the probability distribution function of a single molecule. Consequently, for reaction-diffusion systems involving only zero- and first-order chemical reactions we can increase the accuracy of the stochastic simulation algorithm by decreasing h.

The situation is much more delicate when the system involves second or higher order reactions. In this case, although diffusion is modelled more accurately as h is decreased, the reactions might be modelled less accurately, so that we lose accuracy if we choose h too small. We demonstrate this phenomenon using the following illustrative example. We consider chemical species A and B that diffuse in the cubic domain [0, L] × [0, L] × [0, L] and are subject to the following two chemical reactions:

A + B →^{k_1} B,    (11.21)
∅ →^{k_2} A.    (11.22)

Since the number of B molecules is preserved, the dynamics of the model are simple: some molecules of A are produced by the second reaction and some are destroyed by the first reaction.
Thus, after an initial transient, the number of A molecules fluctuates around its equilibrium value.

The well-stirred case

In Lecture 2, we investigated this chemical system under the assumption that the reactor is well-stirred. Then the propensity of the first reaction, (11.21), is α_1(t) = A(t) B k_1/ν, where B is the (constant) number of molecules of B and ν = L³ is the volume of the cubic domain [0, L] × [0, L] × [0, L]. The propensity of the second reaction, (11.22), is α_2(t) = k_2 ν. In particular, the system is equivalent to the production/degradation example (2.1)-(2.2), and we know that the stationary distribution can be written

φ(n) = (1/n!) ( k_2 ν² / (k_1 B) )^n exp[ −k_2 ν² / (k_1 B) ],  n = 0, 1, 2, 3, ...    (11.23)

Including spatial heterogeneity

Our goal is to highlight that a very small compartment size h leads to large computational errors. We divide the cubic domain [0, L] × [0, L] × [0, L] into K³ cubic compartments of volume h³, where K ≥ 1 and h = L/K. To formulate precisely the compartment-based stochastic simulation

algorithm for the illustrative chemical system (11.21)-(11.22) in the reactor [0, L] × [0, L] × [0, L], we denote the compartments by indices from the set

I_all = { (i, j, k) | i, j, k are integers such that 1 ≤ i, j, k ≤ K }.    (11.24)

Let A_ijk(t) (respectively, B_ijk(t)) be the number of molecules of the chemical species A (respectively, B) in the (i, j, k)-th compartment at time t, where (i, j, k) ∈ I_all. Diffusion is modelled as a jump process between neighbouring compartments. Let us define the set of possible directions of jumps

E = { [1, 0, 0], [−1, 0, 0], [0, 1, 0], [0, −1, 0], [0, 0, 1], [0, 0, −1] }.    (11.25)

For every (i, j, k) ∈ I_all, we also define

E_ijk = { e ∈ E | ((i, j, k) + e) ∈ I_all },    (11.26)

i.e. E_ijk is the set of possible directions of jumps from the (i, j, k)-th compartment. For most compartments E_ijk will be the full set of possible jumps E, but for compartments on the boundary the set of jumps is restricted. The notation E_ijk avoids us having to write down separate equations for each boundary compartment.

The main idea of the compartment-based approach is that the small compartments are assumed to be well-mixed, and that only molecules in the same compartment can react according to bimolecular reactions. Thus the compartment-based reaction-diffusion model can be written using the chemical reaction formalism as follows:

A_ijk + B_ijk →^{k_1} B_ijk,  ∅ →^{k_2} A_ijk,  for (i, j, k) ∈ I_all,    (11.27)
A_ijk →^{D_A/h²} A_{ijk+e},  for (i, j, k) ∈ I_all, e ∈ E_ijk,    (11.28)
B_ijk →^{D_B/h²} B_{ijk+e},  for (i, j, k) ∈ I_all, e ∈ E_ijk,    (11.29)

where D_A (respectively, D_B) is the diffusion constant of A (respectively, B). The propensity functions of the reactions (11.27) are

α_{ijk,1}(t) = A_ijk(t) B_ijk(t) k_1 / h³,  α_{ijk,2}(t) = k_2 h³,    (11.30)

where h³ is the volume of the compartment. The reactions (11.28)-(11.29) correspond to diffusive jumps between neighbouring compartments.
The propensity functions of these reactions are A_ijk(t) D_A/h² and B_ijk(t) D_B/h², respectively. The number of molecules of A in the whole container [0, L] × [0, L] × [0, L] is given by

A(t) = \sum_{(i,j,k) ∈ I_all} A_ijk(t).    (11.31)

Let p_n(t) be the probability that A(t) = n, and let φ_K(n) be the stationary distribution

φ_K(n) = lim_{t→∞} p_n(t),    (11.32)

so that φ_K(n) is the probability that there are n molecules of A in the system when it is simulated using K³ compartments, provided that the system is observed for a long time. Since A molecules are produced uniformly in the volume, we do not expect any spatial variation in the probability distribution for the number of A molecules. This means that we expect φ_K(n) = φ(n) for all K.
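This expectation is easy to probe in the well-stirred case K = 1, where the system reduces to a production/degradation process. The following Python sketch (the course codes are in Matlab; the function name, parameter values and seed here are illustrative placeholders, not the values of Figure 11.3) estimates the time-averaged number of A molecules, which should approach the mean k_2 ν² / (k_1 B) of the Poisson distribution (11.23).

```python
import math
import random

def time_avg_A(k1=1.0, k2=0.01, B=10, nu=100.0, t_final=2000.0, seed=3):
    """Well-mixed SSA for A + B -> B (rate k1) and 0 -> A (rate k2).

    Returns the time-averaged copy number of A over [0, t_final].
    """
    rng = random.Random(seed)
    n, t, acc = 0, 0.0, 0.0
    while t < t_final:
        a1 = n * B * k1 / nu        # degradation propensity, Section 11.4
        a2 = k2 * nu                # production propensity
        a0 = a1 + a2
        tau = math.log(1.0 / rng.random()) / a0
        acc += n * min(tau, t_final - t)   # occupancy-weighted accumulation
        t += tau
        if rng.random() * a0 < a1:
            n -= 1
        else:
            n += 1
    return acc / t_final

# for these illustrative values, k2*nu**2/(k1*B) = 0.01*100**2/(1*10) = 10
```

The long-run average fluctuates around the Poisson mean; it is this quantity that shifts spuriously to the right when the same system is simulated with compartments that are too small.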

Results from stochastic simulation

In Figure 11.3, we present the stationary distributions φ_K(n) for increasing values of K. We observe that the peak of φ_K(n) moves to the right as K is increased (i.e. as h is decreased), and that φ_K(n) does not converge to φ_1(n) as h → 0. In fact, φ_K(n) does not converge to any distribution as h → 0; it moves further and further to the right as h is decreased. The shift of φ_K(n) to the right in Figure 11.3 is caused by the bimolecular reaction being lost in the limit h → 0 (i.e. this reaction does not occur as frequently as it should when h is too small). This makes the assessment of the accuracy of computations more challenging than in the deterministic case.

Figure 11.3: Stationary distribution φ_K(n), given by (11.32), for K = 1 (grey bars), K = 2 (blue), K = 20 (orange) and K = 100 (yellow), computed from long-time simulations generated using the Gillespie stochastic simulation algorithm. Parameters are k_1 = 0.2 µm³ s^{−1}, k_2 = 1 µm^{−3} s^{−1}, D_A = D_B = 1 µm² s^{−1} and L = 1 µm, with B held constant.

Conclusions

The compartment-based stochastic simulation algorithm is generally considered valid only for a range of values of h; in particular, h must not be too small. For second-order reactions like (11.21) this constraint is usually stated in the form h ≫ k/(D_A + D_B), where k is the reaction rate constant (of any second or higher order reactions present). To satisfy this condition in our particular example, we could simply choose h = L. However, if the system under consideration has some spatial variation, then we obviously want to choose h small enough to capture the desired spatial resolution. This leads to a restriction on h from above, namely h ≪ L. Thus it is often suggested to choose h small (to satisfy h ≪ L) but not too small (in order to satisfy h ≫ k/(D_A + D_B)).
The optimal choice of h is the subject of current research.

11.5 Models of pattern formation

In this section, we will discuss the stochastic equivalents of two models for spatial patterning. The first is the French flag model, which relates to patterning via concentration gradients, whilst the second is the mechanism of diffusion-driven instability.

The French flag model

Here one assumes that the domain is prepatterned: in our example from Section 11.1, we considered a chemical A that is produced in only part of the domain [0, L], specifically, in

[0, L/5]. We assume that the interval [0, L] describes a layer of cells which are sensitive to the concentration of the chemical A. In particular, we suppose that a cell can have three different fates (e.g. different genes are switched on or off) depending on the concentration of A. The concentration gradient of A can then be used to distinguish three different regions in [0, L]; see Figure 11.4. If the concentration of A is high enough (above a certain threshold), a cell follows the blue program. The white program is followed for medium concentrations of A, and the red program for low concentrations.

Figure 11.4: Left: the deterministic (partial differential equation) version of the French flag model. Right: the corresponding stochastic version.

Diffusion-driven instability

Consider a system of two chemical species A and B in the elongated computational domain [0, L] × [0, h] × [0, h], where L = 1 mm and h = 25 µm, which react according to the Schnakenberg system of chemical reactions (see Sheet 1, Question 4):

2A + B →^{k_1} 3A;  ∅ →^{k_2} A;  A →^{k_3} ∅;  ∅ →^{k_4} B.    (11.33)

If molecules of A and B are well-mixed then, for the parameter values in Figure 11.5, the corresponding deterministic system of ordinary differential equations has one non-negative stable steady state, equal to a_s = 200 and b_s = 75 molecules per compartment volume, h³. Introducing diffusion to the model, one steady state solution of the spatial problem is the constant one, (a_s, b_s), everywhere. However, this solution might not be stable (so might not be seen in reality) if the diffusion constants of A and B differ significantly.
To simulate the reaction-diffusion problem with the Schnakenberg system of chemical reactions (11.33), we follow the compartment-based method of Section 11.1. Starting with a uniform distribution of chemicals, A_i(0) = a_s = 200 and B_i(0) = b_s = 75, i = 1, 2, ..., K, at time t = 0, we plot the numbers of molecules in each compartment at time t = 30 minutes, computed by the Gillespie stochastic simulation algorithm, in Figure 11.5. To demonstrate the idea of patterning, compartments with numbers above the steady state values, a_s or b_s, are plotted in blue and other compartments are plotted in red. We see in Figure 11.5 that the chemical A can clearly be used to divide our computational domain into several regions. There are two and a half blue peaks in this figure. The number of blue peaks depends on the size of the computational domain [0, L], and it is not a unique number in general: the reaction-diffusion system has several favourable states, each with a different number of blue peaks.

Figure 11.5: Turing patterns. Left: numbers of molecules of chemical species A in each compartment at time t = 30 minutes. Right: the same plot for chemical species B. Parameters are k_1/h⁶ = 10^{−6} s^{−1}, k_2 h³ = 1 s^{−1}, k_3 = 0.02 s^{−1}, k_4 h³ = 3 s^{−1}, D_A = 10^{−5} mm² s^{−1} and D_B = 10^{−3} mm² s^{−1}.

11.6 References and further reading

A convergent reaction-diffusion master equation. S. A. Isaacson. J. Chem. Phys. 139 (2013).

11.7 Tasks

Implement a stochastic simulation algorithm to generate a number of sample paths from system (11.1)-(11.3). Compare your results with those generated by solving the corresponding partial differential equation model.
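For the first task, the SSA for system (11.1)-(11.3) combines the diffusion chain with per-compartment decay and localised production. The following compact Python sketch shows one way to organise the three reaction classes (the course codes are in Matlab; the rate values, function name and seed below are illustrative placeholders, not the parameters of Figure 11.1).

```python
import math
import random

def rd_ssa(K=20, d=0.1, k1=0.02, k2h3=0.5, t_final=200.0, seed=7):
    """SSA for system (11.1)-(11.3): diffusion at rate d = D/h^2, decay at
    rate k1 everywhere, production at rate k2*h^3 in the first K/5 boxes.
    Returns compartment occupancies at t_final."""
    rng = random.Random(seed)
    A = [0] * K
    t = 0.0
    while True:
        N = sum(A)
        a_diff = d * (2 * N - A[0] - A[-1])   # all allowed diffusive jumps
        a_dec = k1 * N                        # total decay propensity
        a_prod = k2h3 * (K // 5)              # production in first K/5 boxes
        a0 = a_diff + a_dec + a_prod
        t += math.log(1.0 / rng.random()) / a0
        if t > t_final:
            return A
        r = rng.random() * a0
        if r < a_prod:                        # production, uniform over source boxes
            A[rng.randrange(K // 5)] += 1
        elif r < a_prod + a_dec:              # decay of a uniformly chosen molecule
            r -= a_prod
            for i in range(K):
                r -= k1 * A[i]
                if r < 0:
                    A[i] -= 1
                    break
        else:                                 # diffusive jump
            r -= a_prod + a_dec
            for i in range(K - 1):            # right jumps from boxes 1..K-1
                r -= d * A[i]
                if r < 0:
                    A[i] -= 1
                    A[i + 1] += 1
                    break
            else:
                for i in range(1, K):         # left jumps from boxes 2..K
                    r -= d * A[i]
                    if r < 0:
                        A[i] -= 1
                        A[i - 1] += 1
                        break
```

With production confined to the left fifth of the domain and decay everywhere, the long-time profile decays away from the source, which is the shape the corresponding PDE (11.9) predicts.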

Chapter 12: Diffusion and stochastic differential equations

Consider a typical protein molecule immersed in the aqueous medium of a living cell. As with any small particle, it has a non-zero kinetic energy which is proportional to the absolute temperature. In particular, the protein molecule has a non-zero instantaneous speed. However, it cannot travel far before it bumps into other molecules (e.g. water molecules) in the solution. As a result, the trajectory of the molecule is not straight: it executes a random walk, the well-known Brownian motion. This means that the position of the molecule evolves according to

X(t + dt) = X(t) + \sqrt{2D} dW_x,    (12.1)
Y(t + dt) = Y(t) + \sqrt{2D} dW_y,    (12.2)
Z(t + dt) = Z(t) + \sqrt{2D} dW_z,    (12.3)

where [X(t), Y(t), Z(t)] ∈ ℝ³ is the position of the diffusing molecule at time t, and D is the diffusion constant. We can simulate sample paths from (12.1)-(12.3) by using the computational definition we used in Lecture 6, i.e. we choose a time step Δt and compute the solution iteratively using

X(t + Δt) = X(t) + \sqrt{2D Δt} ξ_x,    (12.4)
Y(t + Δt) = Y(t) + \sqrt{2D Δt} ξ_y,    (12.5)
Z(t + Δt) = Z(t) + \sqrt{2D Δt} ξ_z,    (12.6)

where ξ_x, ξ_y, ξ_z ∼ N(0, 1).

Figure 12.1: Left: six trajectories generated using equations (12.4)-(12.6). All trajectories start at the origin, and the end points are marked with an asterisk. Right: corresponding solution of the diffusion equation (12.13). Parameters are D = 10^{−4} mm² s^{−1} and t = 10 minutes.
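The iteration (12.4)-(12.6) can be written in a few lines of Python (the course codes are in Matlab; the default step size, duration and seeding below are illustrative choices). Averaging the squared displacement over many such paths recovers E|X(t)|² = 6Dt, the three-dimensional analogue of the mean square displacement formula.

```python
import math
import random

def brownian_path(D=1e-4, dt=0.1, t_final=600.0, seed=0):
    """Discretised Brownian motion (12.4)-(12.6) in three dimensions.

    D is in mm^2/s and times are in seconds; the path starts at the origin.
    Returns the list of positions [(x, y, z), ...] at each time step.
    """
    rng = random.Random(seed)
    pos = [0.0, 0.0, 0.0]
    path = [tuple(pos)]
    for _ in range(int(t_final / dt)):
        for k in range(3):
            # each coordinate receives an independent N(0,1) increment,
            # scaled by sqrt(2 D dt) as in (12.4)-(12.6)
            pos[k] += math.sqrt(2.0 * D * dt) * rng.gauss(0.0, 1.0)
        path.append(tuple(pos))
    return path
```

Because each increment is an exact sample of the Brownian transition density, the statistics of the end point do not depend on the choice of Δt, only on the total time simulated.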


More information

Reaction time distributions in chemical kinetics: Oscillations and other weird behaviors

Reaction time distributions in chemical kinetics: Oscillations and other weird behaviors Introduction The algorithm Results Summary Reaction time distributions in chemical kinetics: Oscillations and other weird behaviors Ramon Xulvi-Brunet Escuela Politécnica Nacional Outline Introduction

More information

Stochastic Chemical Kinetics

Stochastic Chemical Kinetics Stochastic Chemical Kinetics Joseph K Scott November 10, 2011 1 Introduction to Stochastic Chemical Kinetics Consider the reaction I + I D The conventional kinetic model for the concentration of I in a

More information

Extending the Tools of Chemical Reaction Engineering to the Molecular Scale

Extending the Tools of Chemical Reaction Engineering to the Molecular Scale Extending the Tools of Chemical Reaction Engineering to the Molecular Scale Multiple-time-scale order reduction for stochastic kinetics James B. Rawlings Department of Chemical and Biological Engineering

More information

Lecture 1: Pragmatic Introduction to Stochastic Differential Equations

Lecture 1: Pragmatic Introduction to Stochastic Differential Equations Lecture 1: Pragmatic Introduction to Stochastic Differential Equations Simo Särkkä Aalto University, Finland (visiting at Oxford University, UK) November 13, 2013 Simo Särkkä (Aalto) Lecture 1: Pragmatic

More information

The concentration of a drug in blood. Exponential decay. Different realizations. Exponential decay with noise. dc(t) dt.

The concentration of a drug in blood. Exponential decay. Different realizations. Exponential decay with noise. dc(t) dt. The concentration of a drug in blood Exponential decay C12 concentration 2 4 6 8 1 C12 concentration 2 4 6 8 1 dc(t) dt = µc(t) C(t) = C()e µt 2 4 6 8 1 12 time in minutes 2 4 6 8 1 12 time in minutes

More information

16. Working with the Langevin and Fokker-Planck equations

16. Working with the Langevin and Fokker-Planck equations 16. Working with the Langevin and Fokker-Planck equations In the preceding Lecture, we have shown that given a Langevin equation (LE), it is possible to write down an equivalent Fokker-Planck equation

More information

Advanced Physical Chemistry CHAPTER 18 ELEMENTARY CHEMICAL KINETICS

Advanced Physical Chemistry CHAPTER 18 ELEMENTARY CHEMICAL KINETICS Experimental Kinetics and Gas Phase Reactions Advanced Physical Chemistry CHAPTER 18 ELEMENTARY CHEMICAL KINETICS Professor Angelo R. Rossi http://homepages.uconn.edu/rossi Department of Chemistry, Room

More information

Derivations for order reduction of the chemical master equation

Derivations for order reduction of the chemical master equation 2 TWMCC Texas-Wisconsin Modeling and Control Consortium 1 Technical report number 2006-02 Derivations for order reduction of the chemical master equation Ethan A. Mastny, Eric L. Haseltine, and James B.

More information

Brownian Motion: Fokker-Planck Equation

Brownian Motion: Fokker-Planck Equation Chapter 7 Brownian Motion: Fokker-Planck Equation The Fokker-Planck equation is the equation governing the time evolution of the probability density of the Brownian particla. It is a second order differential

More information

Lecture 6: Bayesian Inference in SDE Models

Lecture 6: Bayesian Inference in SDE Models Lecture 6: Bayesian Inference in SDE Models Bayesian Filtering and Smoothing Point of View Simo Särkkä Aalto University Simo Särkkä (Aalto) Lecture 6: Bayesian Inference in SDEs 1 / 45 Contents 1 SDEs

More information

SMSTC (2007/08) Probability.

SMSTC (2007/08) Probability. SMSTC (27/8) Probability www.smstc.ac.uk Contents 12 Markov chains in continuous time 12 1 12.1 Markov property and the Kolmogorov equations.................... 12 2 12.1.1 Finite state space.................................

More information

Derivation of Itô SDE and Relationship to ODE and CTMC Models

Derivation of Itô SDE and Relationship to ODE and CTMC Models Derivation of Itô SDE and Relationship to ODE and CTMC Models Biomathematics II April 23, 2015 Linda J. S. Allen Texas Tech University TTU 1 Euler-Maruyama Method for Numerical Solution of an Itô SDE dx(t)

More information

Lecture 7: Simple genetic circuits I

Lecture 7: Simple genetic circuits I Lecture 7: Simple genetic circuits I Paul C Bressloff (Fall 2018) 7.1 Transcription and translation In Fig. 20 we show the two main stages in the expression of a single gene according to the central dogma.

More information

Universal examples. Chapter The Bernoulli process

Universal examples. Chapter The Bernoulli process Chapter 1 Universal examples 1.1 The Bernoulli process First description: Bernoulli random variables Y i for i = 1, 2, 3,... independent with P [Y i = 1] = p and P [Y i = ] = 1 p. Second description: Binomial

More information

Series Solutions. 8.1 Taylor Polynomials

Series Solutions. 8.1 Taylor Polynomials 8 Series Solutions 8.1 Taylor Polynomials Polynomial functions, as we have seen, are well behaved. They are continuous everywhere, and have continuous derivatives of all orders everywhere. It also turns

More information

CDA6530: Performance Models of Computers and Networks. Chapter 3: Review of Practical Stochastic Processes

CDA6530: Performance Models of Computers and Networks. Chapter 3: Review of Practical Stochastic Processes CDA6530: Performance Models of Computers and Networks Chapter 3: Review of Practical Stochastic Processes Definition Stochastic process X = {X(t), t2 T} is a collection of random variables (rvs); one rv

More information

M4A42 APPLIED STOCHASTIC PROCESSES

M4A42 APPLIED STOCHASTIC PROCESSES M4A42 APPLIED STOCHASTIC PROCESSES G.A. Pavliotis Department of Mathematics Imperial College London, UK LECTURE 1 12/10/2009 Lectures: Mondays 09:00-11:00, Huxley 139, Tuesdays 09:00-10:00, Huxley 144.

More information

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes Lecture Notes 7 Random Processes Definition IID Processes Bernoulli Process Binomial Counting Process Interarrival Time Process Markov Processes Markov Chains Classification of States Steady State Probabilities

More information

Langevin Methods. Burkhard Dünweg Max Planck Institute for Polymer Research Ackermannweg 10 D Mainz Germany

Langevin Methods. Burkhard Dünweg Max Planck Institute for Polymer Research Ackermannweg 10 D Mainz Germany Langevin Methods Burkhard Dünweg Max Planck Institute for Polymer Research Ackermannweg 1 D 55128 Mainz Germany Motivation Original idea: Fast and slow degrees of freedom Example: Brownian motion Replace

More information

6 Continuous-Time Birth and Death Chains

6 Continuous-Time Birth and Death Chains 6 Continuous-Time Birth and Death Chains Angela Peace Biomathematics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology.

More information

1. Introduction to Chemical Kinetics

1. Introduction to Chemical Kinetics 1. Introduction to Chemical Kinetics objectives of chemical kinetics 1) Determine empirical rate laws H 2 + I 2 2HI How does the concentration of H 2, I 2, and HI change with time? 2) Determine the mechanism

More information

Stochastic process. X, a series of random variables indexed by t

Stochastic process. X, a series of random variables indexed by t Stochastic process X, a series of random variables indexed by t X={X(t), t 0} is a continuous time stochastic process X={X(t), t=0,1, } is a discrete time stochastic process X(t) is the state at time t,

More information

Stochastic Simulation Methods for Solving Systems with Multi-State Species

Stochastic Simulation Methods for Solving Systems with Multi-State Species Stochastic Simulation Methods for Solving Systems with Multi-State Species Zhen Liu Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of

More information

1.3 Forward Kolmogorov equation

1.3 Forward Kolmogorov equation 1.3 Forward Kolmogorov equation Let us again start with the Master equation, for a system where the states can be ordered along a line, such as the previous examples with population size n = 0, 1, 2,.

More information

An Introduction to Stochastic Simulation

An Introduction to Stochastic Simulation Stephen Gilmore Laboratory for Foundations of Computer Science School of Informatics University of Edinburgh PASTA workshop, London, 29th June 2006 Background The modelling of chemical reactions using

More information

1. Stochastic Process

1. Stochastic Process HETERGENEITY IN QUANTITATIVE MACROECONOMICS @ TSE OCTOBER 17, 216 STOCHASTIC CALCULUS BASICS SANG YOON (TIM) LEE Very simple notes (need to add references). It is NOT meant to be a substitute for a real

More information

Chapter 6 - Random Processes

Chapter 6 - Random Processes EE385 Class Notes //04 John Stensby Chapter 6 - Random Processes Recall that a random variable X is a mapping between the sample space S and the extended real line R +. That is, X : S R +. A random process

More information

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS PROBABILITY: LIMIT THEOREMS II, SPRING 218. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please

More information

Stochastic modelling of epidemic spread

Stochastic modelling of epidemic spread Stochastic modelling of epidemic spread Julien Arino Centre for Research on Inner City Health St Michael s Hospital Toronto On leave from Department of Mathematics University of Manitoba Julien Arino@umanitoba.ca

More information

18.175: Lecture 8 Weak laws and moment-generating/characteristic functions

18.175: Lecture 8 Weak laws and moment-generating/characteristic functions 18.175: Lecture 8 Weak laws and moment-generating/characteristic functions Scott Sheffield MIT 18.175 Lecture 8 1 Outline Moment generating functions Weak law of large numbers: Markov/Chebyshev approach

More information

An efficient approach to stochastic optimal control. Bert Kappen SNN Radboud University Nijmegen the Netherlands

An efficient approach to stochastic optimal control. Bert Kappen SNN Radboud University Nijmegen the Netherlands An efficient approach to stochastic optimal control Bert Kappen SNN Radboud University Nijmegen the Netherlands Bert Kappen Examples of control tasks Motor control Bert Kappen Pascal workshop, 27-29 May

More information

Rate Laws. many elementary reactions. The overall stoichiometry of a composite reaction tells us little about the mechanism!

Rate Laws. many elementary reactions. The overall stoichiometry of a composite reaction tells us little about the mechanism! Rate Laws We have seen how to obtain the differential form of rate laws based upon experimental observation. As they involve derivatives, we must integrate the rate equations to obtain the time dependence

More information

Longtime behavior of stochastically modeled biochemical reaction networks

Longtime behavior of stochastically modeled biochemical reaction networks Longtime behavior of stochastically modeled biochemical reaction networks David F. Anderson Department of Mathematics University of Wisconsin - Madison ASU Math Biology Seminar February 17th, 217 Overview

More information

Latent voter model on random regular graphs

Latent voter model on random regular graphs Latent voter model on random regular graphs Shirshendu Chatterjee Cornell University (visiting Duke U.) Work in progress with Rick Durrett April 25, 2011 Outline Definition of voter model and duality with

More information

In terms of measures: Exercise 1. Existence of a Gaussian process: Theorem 2. Remark 3.

In terms of measures: Exercise 1. Existence of a Gaussian process: Theorem 2. Remark 3. 1. GAUSSIAN PROCESSES A Gaussian process on a set T is a collection of random variables X =(X t ) t T on a common probability space such that for any n 1 and any t 1,...,t n T, the vector (X(t 1 ),...,X(t

More information

On a class of stochastic differential equations in a financial network model

On a class of stochastic differential equations in a financial network model 1 On a class of stochastic differential equations in a financial network model Tomoyuki Ichiba Department of Statistics & Applied Probability, Center for Financial Mathematics and Actuarial Research, University

More information

1 R.V k V k 1 / I.k/ here; we ll stimulate the action potential another way.) Note that this further simplifies to. m 3 k h k.

1 R.V k V k 1 / I.k/ here; we ll stimulate the action potential another way.) Note that this further simplifies to. m 3 k h k. 1. The goal of this problem is to simulate a propagating action potential for the Hodgkin-Huxley model and to determine the propagation speed. From the class notes, the discrete version (i.e., after breaking

More information

The Derivative of a Function

The Derivative of a Function The Derivative of a Function James K Peterson Department of Biological Sciences and Department of Mathematical Sciences Clemson University March 1, 2017 Outline A Basic Evolutionary Model The Next Generation

More information

Probability Distributions

Probability Distributions Lecture : Background in Probability Theory Probability Distributions The probability mass function (pmf) or probability density functions (pdf), mean, µ, variance, σ 2, and moment generating function (mgf)

More information

Formal Modeling of Biological Systems with Delays

Formal Modeling of Biological Systems with Delays Universita degli Studi di Pisa Dipartimento di Informatica Dottorato di Ricerca in Informatica Ph.D. Thesis Proposal Formal Modeling of Biological Systems with Delays Giulio Caravagna caravagn@di.unipi.it

More information

2008 Hotelling Lectures

2008 Hotelling Lectures First Prev Next Go To Go Back Full Screen Close Quit 1 28 Hotelling Lectures 1. Stochastic models for chemical reactions 2. Identifying separated time scales in stochastic models of reaction networks 3.

More information

Persistence and Stationary Distributions of Biochemical Reaction Networks

Persistence and Stationary Distributions of Biochemical Reaction Networks Persistence and Stationary Distributions of Biochemical Reaction Networks David F. Anderson Department of Mathematics University of Wisconsin - Madison Discrete Models in Systems Biology SAMSI December

More information

Lecture 4 The stochastic ingredient

Lecture 4 The stochastic ingredient Lecture 4 The stochastic ingredient Luca Bortolussi 1 Alberto Policriti 2 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste Via Valerio 12/a, 34100 Trieste. luca@dmi.units.it

More information

Stochastic Differential Equations.

Stochastic Differential Equations. Chapter 3 Stochastic Differential Equations. 3.1 Existence and Uniqueness. One of the ways of constructing a Diffusion process is to solve the stochastic differential equation dx(t) = σ(t, x(t)) dβ(t)

More information

Brownian Motion and Langevin Equations

Brownian Motion and Langevin Equations 1 Brownian Motion and Langevin Equations 1.1 Langevin Equation and the Fluctuation- Dissipation Theorem The theory of Brownian motion is perhaps the simplest approximate way to treat the dynamics of nonequilibrium

More information

CDA5530: Performance Models of Computers and Networks. Chapter 3: Review of Practical

CDA5530: Performance Models of Computers and Networks. Chapter 3: Review of Practical CDA5530: Performance Models of Computers and Networks Chapter 3: Review of Practical Stochastic Processes Definition Stochastic ti process X = {X(t), t T} is a collection of random variables (rvs); one

More information

Numerical solution of stochastic epidemiological models

Numerical solution of stochastic epidemiological models Numerical solution of stochastic epidemiological models John M. Drake & Pejman Rohani 1 Introduction He we expand our modeling toolkit to include methods for studying stochastic versions of the compartmental

More information

Consensus on networks

Consensus on networks Consensus on networks c A. J. Ganesh, University of Bristol The spread of a rumour is one example of an absorbing Markov process on networks. It was a purely increasing process and so it reached the absorbing

More information

When do diffusion-limited trajectories become memoryless?

When do diffusion-limited trajectories become memoryless? When do diffusion-limited trajectories become memoryless? Maciej Dobrzyński CWI (Center for Mathematics and Computer Science) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands Abstract Stochastic description

More information

Stochastic modelling of epidemic spread

Stochastic modelling of epidemic spread Stochastic modelling of epidemic spread Julien Arino Department of Mathematics University of Manitoba Winnipeg Julien Arino@umanitoba.ca 19 May 2012 1 Introduction 2 Stochastic processes 3 The SIS model

More information

Gaussian processes for inference in stochastic differential equations

Gaussian processes for inference in stochastic differential equations Gaussian processes for inference in stochastic differential equations Manfred Opper, AI group, TU Berlin November 6, 2017 Manfred Opper, AI group, TU Berlin (TU Berlin) inference in SDE November 6, 2017

More information

1.1 Definition of BM and its finite-dimensional distributions

1.1 Definition of BM and its finite-dimensional distributions 1 Brownian motion Brownian motion as a physical phenomenon was discovered by botanist Robert Brown as he observed a chaotic motion of particles suspended in water. The rigorous mathematical model of BM

More information

1 Types of stochastic models

1 Types of stochastic models 1 Types of stochastic models Models so far discussed are all deterministic, meaning that, if the present state were perfectly known, it would be possible to predict exactly all future states. We have seen

More information

CHAPTER 9 LECTURE NOTES

CHAPTER 9 LECTURE NOTES CHAPTER 9 LECTURE NOTES 9.1, 9.2: Rate of a reaction For a general reaction of the type A + 3B 2Y, the rates of consumption of A and B, and the rate of formation of Y are defined as follows: Rate of consumption

More information

Stability of Stochastic Differential Equations

Stability of Stochastic Differential Equations Lyapunov stability theory for ODEs s Stability of Stochastic Differential Equations Part 1: Introduction Department of Mathematics and Statistics University of Strathclyde Glasgow, G1 1XH December 2010

More information

6.2.2 Point processes and counting processes

6.2.2 Point processes and counting processes 56 CHAPTER 6. MASTER EQUATIONS which yield a system of differential equations The solution to the system of equations is d p (t) λp, (6.3) d p k(t) λ (p k p k 1 (t)), (6.4) k 1. (6.5) p n (t) (λt)n e λt

More information

(Infinite) Series Series a n = a 1 + a 2 + a a n +...

(Infinite) Series Series a n = a 1 + a 2 + a a n +... (Infinite) Series Series a n = a 1 + a 2 + a 3 +... + a n +... What does it mean to add infinitely many terms? The sequence of partial sums S 1, S 2, S 3, S 4,...,S n,...,where nx S n = a i = a 1 + a 2

More information

Stochastic Modelling Unit 1: Markov chain models

Stochastic Modelling Unit 1: Markov chain models Stochastic Modelling Unit 1: Markov chain models Russell Gerrard and Douglas Wright Cass Business School, City University, London June 2004 Contents of Unit 1 1 Stochastic Processes 2 Markov Chains 3 Poisson

More information

Simulating stochastic epidemics

Simulating stochastic epidemics Simulating stochastic epidemics John M. Drake & Pejman Rohani 1 Introduction This course will use the R language programming environment for computer modeling. The purpose of this exercise is to introduce

More information

arxiv: v2 [q-bio.qm] 12 Jan 2017

arxiv: v2 [q-bio.qm] 12 Jan 2017 Approximation and inference methods for stochastic biochemical kinetics - a tutorial review arxiv:1608.06582v2 [q-bio.qm] 12 Jan 2017 David Schnoerr 1,2,3, Guido Sanguinetti 2,3, and Ramon Grima 1,3,*

More information

Poisson Jumps in Credit Risk Modeling: a Partial Integro-differential Equation Formulation

Poisson Jumps in Credit Risk Modeling: a Partial Integro-differential Equation Formulation Poisson Jumps in Credit Risk Modeling: a Partial Integro-differential Equation Formulation Jingyi Zhu Department of Mathematics University of Utah zhu@math.utah.edu Collaborator: Marco Avellaneda (Courant

More information

where R = universal gas constant R = PV/nT R = atm L mol R = atm dm 3 mol 1 K 1 R = J mol 1 K 1 (SI unit)

where R = universal gas constant R = PV/nT R = atm L mol R = atm dm 3 mol 1 K 1 R = J mol 1 K 1 (SI unit) Ideal Gas Law PV = nrt where R = universal gas constant R = PV/nT R = 0.0821 atm L mol 1 K 1 R = 0.0821 atm dm 3 mol 1 K 1 R = 8.314 J mol 1 K 1 (SI unit) Standard molar volume = 22.4 L mol 1 at 0 C and

More information

Linear Algebra Section 2.6 : LU Decomposition Section 2.7 : Permutations and transposes Wednesday, February 13th Math 301 Week #4

Linear Algebra Section 2.6 : LU Decomposition Section 2.7 : Permutations and transposes Wednesday, February 13th Math 301 Week #4 Linear Algebra Section. : LU Decomposition Section. : Permutations and transposes Wednesday, February 1th Math 01 Week # 1 The LU Decomposition We learned last time that we can factor a invertible matrix

More information

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS PROBABILITY: LIMIT THEOREMS II, SPRING 15. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please

More information

Introduction. Stochastic Processes. Will Penny. Stochastic Differential Equations. Stochastic Chain Rule. Expectations.

Introduction. Stochastic Processes. Will Penny. Stochastic Differential Equations. Stochastic Chain Rule. Expectations. 19th May 2011 Chain Introduction We will Show the relation between stochastic differential equations, Gaussian processes and methods This gives us a formal way of deriving equations for the activity of

More information

Math 345 Intro to Math Biology Lecture 19: Models of Molecular Events and Biochemistry

Math 345 Intro to Math Biology Lecture 19: Models of Molecular Events and Biochemistry Math 345 Intro to Math Biology Lecture 19: Models of Molecular Events and Biochemistry Junping Shi College of William and Mary, USA Molecular biology and Biochemical kinetics Molecular biology is one of

More information

UNDERSTANDING BOLTZMANN S ANALYSIS VIA. Contents SOLVABLE MODELS

UNDERSTANDING BOLTZMANN S ANALYSIS VIA. Contents SOLVABLE MODELS UNDERSTANDING BOLTZMANN S ANALYSIS VIA Contents SOLVABLE MODELS 1 Kac ring model 2 1.1 Microstates............................ 3 1.2 Macrostates............................ 6 1.3 Boltzmann s entropy.......................

More information

Kinetic Monte Carlo. Heiko Rieger. Theoretical Physics Saarland University Saarbrücken, Germany

Kinetic Monte Carlo. Heiko Rieger. Theoretical Physics Saarland University Saarbrücken, Germany Kinetic Monte Carlo Heiko Rieger Theoretical Physics Saarland University Saarbrücken, Germany DPG school on Efficient Algorithms in Computational Physics, 10.-14.9.2012, Bad Honnef Intro Kinetic Monte

More information

Solving a system of Master Equations for parallel chemical interactions

Solving a system of Master Equations for parallel chemical interactions Technical Report CoSBi 21/2007 Solving a system of Master Equations for parallel chemical interactions Paola Lecca The Microsoft Research - University of Trento Centre for Computational and Systems Biology

More information

Brownian motion and the Central Limit Theorem

Brownian motion and the Central Limit Theorem Brownian motion and the Central Limit Theorem Amir Bar January 4, 3 Based on Shang-Keng Ma, Statistical Mechanics, sections.,.7 and the course s notes section 6. Introduction In this tutorial we shall

More information

TMS165/MSA350 Stochastic Calculus, Lecture on Applications

TMS165/MSA350 Stochastic Calculus, Lecture on Applications TMS165/MSA35 Stochastic Calculus, Lecture on Applications In this lecture we demonstrate how statistical methods such as the maximum likelihood method likelihood ratio estimation can be applied to the

More information

Fokker-Planck Equation with Detailed Balance

Fokker-Planck Equation with Detailed Balance Appendix E Fokker-Planck Equation with Detailed Balance A stochastic process is simply a function of two variables, one is the time, the other is a stochastic variable X, defined by specifying: a: the

More information

MA22S3 Summary Sheet: Ordinary Differential Equations

MA22S3 Summary Sheet: Ordinary Differential Equations MA22S3 Summary Sheet: Ordinary Differential Equations December 14, 2017 Kreyszig s textbook is a suitable guide for this part of the module. Contents 1 Terminology 1 2 First order separable 2 2.1 Separable

More information

Lecture 3: Analysis of Variance II

Lecture 3: Analysis of Variance II Lecture 3: Analysis of Variance II http://www.stats.ox.ac.uk/ winkel/phs.html Dr Matthias Winkel 1 Outline I. A second introduction to two-way ANOVA II. Repeated measures design III. Independent versus

More information

Statistics 150: Spring 2007

Statistics 150: Spring 2007 Statistics 150: Spring 2007 April 23, 2008 0-1 1 Limiting Probabilities If the discrete-time Markov chain with transition probabilities p ij is irreducible and positive recurrent; then the limiting probabilities

More information

Statistics 992 Continuous-time Markov Chains Spring 2004

Statistics 992 Continuous-time Markov Chains Spring 2004 Summary Continuous-time finite-state-space Markov chains are stochastic processes that are widely used to model the process of nucleotide substitution. This chapter aims to present much of the mathematics

More information

Accelerated stochastic simulation of the stiff enzyme-substrate reaction

Accelerated stochastic simulation of the stiff enzyme-substrate reaction THE JOURNAL OF CHEMICAL PHYSICS 123, 144917 2005 Accelerated stochastic simulation of the stiff enzyme-substrate reaction Yang Cao a Department of Computer Science, University of California, Santa Barbara,

More information

Chapter 2 Event-Triggered Sampling

Chapter 2 Event-Triggered Sampling Chapter Event-Triggered Sampling In this chapter, some general ideas and basic results on event-triggered sampling are introduced. The process considered is described by a first-order stochastic differential

More information

Chemical reaction network theory for stochastic and deterministic models of biochemical reaction systems

Chemical reaction network theory for stochastic and deterministic models of biochemical reaction systems Chemical reaction network theory for stochastic and deterministic models of biochemical reaction systems University of Wisconsin at Madison anderson@math.wisc.edu MBI Workshop for Young Researchers in

More information

Linear Equations in Linear Algebra

Linear Equations in Linear Algebra 1 Linear Equations in Linear Algebra 1.1 SYSTEMS OF LINEAR EQUATIONS LINEAR EQUATION x 1,, x n A linear equation in the variables equation that can be written in the form a 1 x 1 + a 2 x 2 + + a n x n

More information

Notes for Math 450 Stochastic Petri nets and reactions

Notes for Math 450 Stochastic Petri nets and reactions Notes for Math 450 Stochastic Petri nets and reactions Renato Feres Petri nets Petri nets are a special class of networks, introduced in 96 by Carl Adam Petri, that provide a convenient language and graphical

More information

On the Interpretation of Delays in Delay Stochastic Simulation of Biological Systems

On the Interpretation of Delays in Delay Stochastic Simulation of Biological Systems On the Interpretation of Delays in Delay Stochastic Simulation of Biological Systems Roberto Barbuti Giulio Caravagna Andrea Maggiolo-Schettini Paolo Milazzo Dipartimento di Informatica, Università di

More information

10.34 Numerical Methods Applied to Chemical Engineering. Quiz 2

10.34 Numerical Methods Applied to Chemical Engineering. Quiz 2 10.34 Numerical Methods Applied to Chemical Engineering Quiz 2 This quiz consists of three problems worth 35, 35, and 30 points respectively. There are 4 pages in this quiz (including this cover page).

More information

Problem Set 5. 1 Waiting times for chemical reactions (8 points)

Problem Set 5. 1 Waiting times for chemical reactions (8 points) Problem Set 5 1 Waiting times for chemical reactions (8 points) In the previous assignment, we saw that for a chemical reaction occurring at rate r, the distribution of waiting times τ between reaction

More information

Engineering Physics 1 Dr. B. K. Patra Department of Physics Indian Institute of Technology-Roorkee

Engineering Physics 1 Dr. B. K. Patra Department of Physics Indian Institute of Technology-Roorkee Engineering Physics 1 Dr. B. K. Patra Department of Physics Indian Institute of Technology-Roorkee Module-05 Lecture-04 Maxwellian Distribution Law of Velocity Part 02 So, we have already told to experiment

More information

A Brief Introduction to Numerical Methods for Differential Equations

A Brief Introduction to Numerical Methods for Differential Equations A Brief Introduction to Numerical Methods for Differential Equations January 10, 2011 This tutorial introduces some basic numerical computation techniques that are useful for the simulation and analysis

More information

Continuous and Discrete random process

Continuous and Discrete random process Continuous and Discrete random and Discrete stochastic es. Continuous stochastic taking values in R. Many real data falls into the continuous category: Meteorological data, molecular motion, traffic data...

More information

16.4. Power Series. Introduction. Prerequisites. Learning Outcomes

16.4. Power Series. Introduction. Prerequisites. Learning Outcomes Power Series 6.4 Introduction In this Section we consider power series. These are examples of infinite series where each term contains a variable, x, raised to a positive integer power. We use the ratio

More information

Lecture 6: Multiple Model Filtering, Particle Filtering and Other Approximations

Lecture 6: Multiple Model Filtering, Particle Filtering and Other Approximations Lecture 6: Multiple Model Filtering, Particle Filtering and Other Approximations Department of Biomedical Engineering and Computational Science Aalto University April 28, 2010 Contents 1 Multiple Model

More information

Assignment 4. u n+1 n(n + 1) i(i + 1) = n n (n + 1)(n + 2) n(n + 2) + 1 = (n + 1)(n + 2) 2 n + 1. u n (n + 1)(n + 2) n(n + 1) = n

Assignment 4. u n+1 n(n + 1) i(i + 1) = n n (n + 1)(n + 2) n(n + 2) + 1 = (n + 1)(n + 2) 2 n + 1. u n (n + 1)(n + 2) n(n + 1) = n Assignment 4 Arfken 5..2 We have the sum Note that the first 4 partial sums are n n(n + ) s 2, s 2 2 3, s 3 3 4, s 4 4 5 so we guess that s n n/(n + ). Proving this by induction, we see it is true for

More information