Computational methods for continuous time Markov chains with applications to biological processes

Computational methods for continuous time Markov chains with applications to biological processes. David F. Anderson, anderson@math.wisc.edu, Department of Mathematics, University of Wisconsin - Madison. Penn State, January 13th, 2012.

Stochastic Models of Biochemical Reaction Systems

The most common stochastic models of biochemical reaction systems are continuous time Markov chains, often called chemical master equation type models in the biosciences. Common examples include:
1. Gene regulatory networks.
2. Models of viral infection.
3. General population models (epidemic, predator-prey, etc.).

Path-wise simulation methods include:

| Language in Biology | Language in Math |
| Gillespie's algorithm | Simulate the embedded DTMC |
| Next reaction method | Simulate the random time change representation of Tom Kurtz |
| First reaction method | Simulate using exponential alarm clocks |
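
To make the math-to-algorithm dictionary concrete, here is a minimal sketch of Gillespie's algorithm in Python (my own illustration, not code from the talk; the names `gillespie`, `zeta`, and `propensities` are mine):

```python
import numpy as np

def gillespie(x0, zeta, propensities, T, rng=np.random.default_rng()):
    """Exact CTMC simulation (Gillespie's algorithm): exponential holding
    times at the total rate, then a jump of the embedded DTMC."""
    x, t = np.array(x0, dtype=float), 0.0
    while True:
        lam = propensities(x)             # intensities lambda_k(x)
        lam0 = lam.sum()
        if lam0 == 0.0:                   # absorbing state: nothing can fire
            return x
        t += rng.exponential(1.0 / lam0)  # holding time ~ Exp(lam0)
        if t > T:
            return x
        k = rng.choice(len(lam), p=lam / lam0)  # embedded-DTMC jump
        x += zeta[k]                      # update state by reaction vector
```

The embedded DTMC is visible in the last two steps: an exponential holding time at total rate $\lambda_0$, then a jump chosen with probabilities $\lambda_k/\lambda_0$.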

Stochastic Models of Biochemical Reaction Systems

Path-wise methods can approximate values such as $Ef(X(t))$. For example,
1. Means: $f(x) = x_i$.
2. Moments/variances: $f(x) = x_i^2$.
3. Probabilities: $f(x) = 1_{\{x \in A\}}$.

They can also compute sensitivities $\frac{d}{d\kappa} Ef(X^\kappa(t))$.

Problem: solving using these algorithms can be computationally expensive.

First problem: joint with Des Higham

Our first problem: approximate $Ef(X(T))$ to some desired tolerance, $\epsilon > 0$.

Easy! Simulate the CTMC exactly, generate independent paths $X_{[i]}(t)$, use the unbiased estimator
$$\mu_n = \frac{1}{n}\sum_{i=1}^n f(X_{[i]}(t)),$$
and stop when the desired confidence interval is $\pm\,\epsilon$.

What is the computational cost?

Recall,
$$\mu_n = \frac{1}{n}\sum_{i=1}^n f(X_{[i]}(t)).$$
Thus,
$$\mathrm{Var}(\mu_n) = O\Big(\frac{1}{n}\Big).$$
So, if we want $\sigma_{\mu_n} = O(\epsilon)$, we need
$$\frac{1}{\sqrt{n}} = O(\epsilon) \implies n = O(\epsilon^{-2}).$$
If $N$ gives the average cost (steps) of a path using the exact algorithm:
$$\text{Total computational complexity} = (\text{cost per path}) \times (\#\text{ paths}) = O(N\epsilon^{-2}).$$
This can be bad if (i) $N$ is large, or (ii) $\epsilon$ is small.
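
As a sketch of where $n = O(\epsilon^{-2})$ comes from in practice (again my own illustration, with an assumed sampler callback `sample_f`), crude Monte Carlo with a confidence-interval stopping rule looks like:

```python
import numpy as np

def crude_mc(sample_f, eps, z=1.96, batch=10_000, rng=np.random.default_rng()):
    """Average i.i.d. samples of f(X(T)) until the 95% CI half-width is <= eps.
    The half-width is about z*sigma/sqrt(n), so this stops at n = O(eps**-2)."""
    vals = []
    while True:
        vals.extend(sample_f(rng) for _ in range(batch))
        half_width = z * np.std(vals, ddof=1) / np.sqrt(len(vals))
        if half_width <= eps:
            return np.mean(vals), len(vals)
```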

Benefits/drawbacks

Benefits:
1. Easy to implement.
2. The estimator $\mu_n = \frac{1}{n}\sum_{i=1}^n f(X_{[i]}(t))$ is unbiased.

Drawbacks:
1. The cost of $O(N\epsilon^{-2})$ could be prohibitively large.
2. For our models, we often have that $N$ is very large.

We need to develop the model for better ideas...

Build up model: random time change representation of Tom Kurtz

Consider the simple system
$$A + B \to C,$$
where one molecule each of A and B is converted to one of C.

Simple book-keeping: if $X(t) = (X_A(t), X_B(t), X_C(t))^T$ gives the state at time $t$, then
$$X(t) = X(0) + R(t)\begin{pmatrix}-1\\-1\\1\end{pmatrix},$$
where $R(t)$ is the number of times the reaction has occurred by time $t$, and $X(0)$ is the initial condition.

Build up model: random time change representation of Tom Kurtz

Assuming the intensity (or propensity) of the reaction is $\kappa X_A(s) X_B(s)$, we can model
$$R(t) = Y\Big(\int_0^t \kappa X_A(s) X_B(s)\,ds\Big),$$
where $Y$ is a unit-rate Poisson process. Hence
$$X(t) = X(0) + Y\Big(\int_0^t \kappa X_A(s) X_B(s)\,ds\Big)\begin{pmatrix}-1\\-1\\1\end{pmatrix}.$$

Build up model: random time change representation of Tom Kurtz

Now consider a network of reactions involving $d$ chemical species, $S_1,\dots,S_d$:
$$\sum_{i=1}^d \nu_{ik} S_i \to \sum_{i=1}^d \nu'_{ik} S_i.$$
Denote the reaction vector as $\zeta_k = \nu'_k - \nu_k$. The intensity (or propensity) of the $k$th reaction is $\lambda_k : \mathbb{Z}^d \to \mathbb{R}_{\ge 0}$. By analogy with before,
$$X(t) = X(0) + \sum_k Y_k\Big(\int_0^t \lambda_k(X(s))\,ds\Big)\zeta_k,$$
where the $Y_k$ are independent, unit-rate Poisson processes.

Example

Consider a model of gene transcription and translation:
$$G \xrightarrow{25} G + M \ \text{(transcription)}, \quad M \xrightarrow{1000} M + P \ \text{(translation)}, \quad P + P \xrightarrow{0.001} D \ \text{(dimerization)},$$
$$M \xrightarrow{0.1} \emptyset \ \text{(degradation of mRNA)}, \quad P \xrightarrow{1} \emptyset \ \text{(degradation of protein)}.$$
Then, if $X = [X_M, X_P, X_D]^T$,
$$X(t) = X(0) + Y_1(25t)\begin{pmatrix}1\\0\\0\end{pmatrix} + Y_2\Big(1000\int_0^t X_M(s)\,ds\Big)\begin{pmatrix}0\\1\\0\end{pmatrix} + Y_3\Big(0.001\int_0^t X_P(s)(X_P(s)-1)\,ds\Big)\begin{pmatrix}0\\-2\\1\end{pmatrix}$$
$$\qquad + Y_4\Big(0.1\int_0^t X_M(s)\,ds\Big)\begin{pmatrix}-1\\0\\0\end{pmatrix} + Y_5\Big(\int_0^t X_P(s)\,ds\Big)\begin{pmatrix}0\\-1\\0\end{pmatrix}.$$
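
In code, this model is just a stoichiometry matrix and an intensity function; the following sketch (names are mine) could be fed to the `gillespie` routine sketched earlier:

```python
import numpy as np

# State X = (M, P, D); the gene G is held fixed at one copy.
zeta = np.array([[ 1,  0, 0],   # G -> G + M     (transcription)
                 [ 0,  1, 0],   # M -> M + P     (translation)
                 [ 0, -2, 1],   # P + P -> D     (dimerization)
                 [-1,  0, 0],   # M -> 0         (mRNA degradation)
                 [ 0, -1, 0]])  # P -> 0         (protein degradation)

def propensities(x):
    M, P, D = x
    return np.array([25.0, 1000.0 * M, 0.001 * P * (P - 1.0), 0.1 * M, 1.0 * P])

# One exact path of the state at T = 1, using gillespie() from above:
# M, P, D = gillespie([0, 0, 0], zeta, propensities, T=1.0)
```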

Back to our problem

Recall:
Benefits:
1. Easy to implement.
2. The estimator $\mu_n = \frac{1}{n}\sum_{i=1}^n f(X_{[i]}(t))$ is unbiased.
Drawbacks:
1. The cost of $O(N\epsilon^{-2})$ could be prohibitively large.
2. For our models, we often have that $N$ is very large.

Let's try an approximate scheme.

Tau-leaping: Euler's method

Explicit tau-leaping,¹ or Euler's method, was first formulated by Dan Gillespie in this setting. Tau-leaping is essentially an Euler approximation of $\int_0^t \lambda_k(X(s))\,ds$:
$$Z(h) = Z(0) + \sum_k Y_k\Big(\int_0^h \lambda_k(Z(s))\,ds\Big)\zeta_k \approx Z(0) + \sum_k Y_k\big(\lambda_k(Z(0))\,h\big)\zeta_k \overset{d}{=} Z(0) + \sum_k \mathrm{Poisson}\big(\lambda_k(Z(0))\,h\big)\zeta_k.$$

¹ D. T. Gillespie, J. Chem. Phys., 115, 1716-1733 (2001).

Euler's method

The path-wise representation for $Z(t)$ generated by Euler's method is
$$Z(t) = X(0) + \sum_k Y_k\Big(\int_0^t \lambda_k(Z(\eta(s)))\,ds\Big)\zeta_k,$$
where $\eta(s) = t_n$ if $t_n \le s < t_{n+1} = t_n + h$ is a step function giving the left endpoints of the time discretization.
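
A minimal sketch of this scheme (my own; `tau_leap` and its arguments are illustrative, and the negativity issue discussed later is only crudely guarded against):

```python
import numpy as np

def tau_leap(x0, zeta, propensities, T, h, rng=np.random.default_rng()):
    """Explicit (Euler) tau-leaping: freeze lambda_k over each step of length h
    and fire Poisson(lambda_k(Z(t_n)) * h) copies of each reaction."""
    z = np.array(x0, dtype=float)
    for _ in range(int(round(T / h))):
        lam = np.maximum(propensities(z), 0.0)  # crude guard if counts go negative
        z += rng.poisson(lam * h) @ zeta        # batched reaction counts on [t_n, t_n + h)
    return z
```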

Return to approximating $Ef(X(T))$

Let $Z_L$ denote an approximate process generated with time discretization step $h_L$. Let
$$\mu_n = \frac{1}{n}\sum_{i=1}^n f(Z_{L,[i]}(t)).$$
We note
$$Ef(X(t)) - \mu_n = \big[Ef(X(t)) - Ef(Z_L(t))\big] + Ef(Z_L(t)) - \mu_n.$$
Suppose we have an order one method:
$$Ef(X(t)) - Ef(Z_L(t)) = O(h_L).$$
We need:
1. $h_L = O(\epsilon)$.
2. $n = O(\epsilon^{-2})$.

Suppose a path costs $O(\epsilon^{-1})$ steps. Then
$$\text{Total computational complexity} = (\#\text{ paths}) \times (\text{cost per path}) = O(\epsilon^{-3}).$$

Benefits/drawbacks

Benefits:
1. Can drastically lower the computational complexity of a problem if $\epsilon^{-1} \ll N$: the CC of using the exact method is $N\epsilon^{-2}$, while the CC of using the approximate method is $\epsilon^{-1}\cdot\epsilon^{-2}$.

Drawbacks:
1. Convergence results usually give only the order of convergence and can't give a precise $h_L$; bias is a problem.
2. Tau-leaping has problems: what happens if you go negative?
3. We have gone away from an unbiased estimator.

Multi-level Monte Carlo and control variates

Suppose I want
$$EX \approx \frac{1}{n}\sum_{i=1}^n X_{[i]},$$
but realizations of $X$ are expensive. Suppose $X \approx \hat Z_L$, and $\hat Z_L$ is cheap. Suppose $X$ and $\hat Z_L$ can be generated simultaneously so that $\mathrm{Var}(X - \hat Z_L)$ is small. Then use
$$EX = E[X - \hat Z_L] + E\hat Z_L \approx \frac{1}{n_1}\sum_{i=1}^{n_1}\big(X_{[i]} - \hat Z_{L,[i]}\big) + \frac{1}{n_2}\sum_{i=1}^{n_2}\hat Z_{L,[i]}.$$
Multi-level Monte Carlo (Mike Giles, Stefan Heinrich) = keep going:
$$EX = E(X - \hat Z_L) + E\hat Z_L = E(X - \hat Z_L) + E(\hat Z_L - \hat Z_{L-1}) + E\hat Z_{L-1} = \cdots$$

Multi-level Monte Carlo: an unbiased estimator

In our setting:
$$Ef(X(t)) = E[f(X(t)) - f(Z_L(t))] + \sum_{l=l_0+1}^{L} E[f(Z_l(t)) - f(Z_{l-1}(t))] + Ef(Z_{l_0}(t)).$$
For appropriate choices of $n_{l_0}$, $n_l$, and $n_E$, we define the estimators for the three terms above via
$$\hat Q_E \overset{\text{def}}{=} \frac{1}{n_E}\sum_{i=1}^{n_E}\big(f(X_{[i]}(T)) - f(Z_{L,[i]}(T))\big),$$
$$\hat Q_l \overset{\text{def}}{=} \frac{1}{n_l}\sum_{i=1}^{n_l}\big(f(Z_{l,[i]}(T)) - f(Z_{l-1,[i]}(T))\big), \quad\text{for } l \in \{l_0+1,\dots,L\},$$
$$\hat Q_{l_0} \overset{\text{def}}{=} \frac{1}{n_{l_0}}\sum_{i=1}^{n_{l_0}} f(Z_{l_0,[i]}(T)),$$
and note that
$$\hat Q \overset{\text{def}}{=} \hat Q_E + \sum_{l=l_0+1}^{L} \hat Q_l + \hat Q_{l_0}$$
is an unbiased estimator for $Ef(X(T))$.

So what is the coupling, and what is the variance of the estimator?
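
Assembled as code, $\hat Q$ is just three independent Monte Carlo averages; this sketch (my own) assumes hypothetical sampler callbacks for the coupled pairs, which are constructed on the next slides:

```python
import numpy as np

def mlmc_estimate(sample_exact_pair, sample_level_pair, sample_coarsest,
                  n_E, n_l, n_l0, levels):
    """Unbiased MLMC: Q = Q_E + sum_l Q_l + Q_{l0}.

    sample_exact_pair()  -> one coupled draw (f(X(T)), f(Z_L(T)))
    sample_level_pair(l) -> one coupled draw (f(Z_l(T)), f(Z_{l-1}(T)))
    sample_coarsest()    -> one draw of f(Z_{l0}(T))
    levels               -> [l0+1, ..., L]; n_l maps level -> sample count.
    """
    Q_E = np.mean([a - b for a, b in (sample_exact_pair() for _ in range(n_E))])
    Q_mid = sum(
        np.mean([a - b for a, b in (sample_level_pair(l) for _ in range(n_l[l]))])
        for l in levels)
    Q_l0 = np.mean([sample_coarsest() for _ in range(n_l0)])
    return Q_E + Q_mid + Q_l0
```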

How do we generate processes simultaneously?

Suppose I want to generate:
1. a Poisson process with intensity 13.1, and
2. a Poisson process with intensity 13.

We could let $Y_1$ and $Y_2$ be independent, unit-rate Poisson processes, and set
$$Z_{13.1}(t) = Y_1(13.1\,t), \qquad Z_{13}(t) = Y_2(13\,t).$$
Using this representation, the processes are independent and, hence, not coupled. The variance of the difference is large:
$$\mathrm{Var}(Z_{13.1}(t) - Z_{13}(t)) = \mathrm{Var}(Y_1(13.1\,t)) + \mathrm{Var}(Y_2(13\,t)) = 26.1\,t.$$

How do we generate processes simultaneously?

Alternatively, we could let $Y_1$ and $Y_2$ be independent unit-rate Poisson processes, and set
$$Z_{13.1}(t) = Y_1(13\,t) + Y_2(0.1\,t), \qquad Z_{13}(t) = Y_1(13\,t).$$
The variance of the difference is much smaller:
$$\mathrm{Var}(Z_{13.1}(t) - Z_{13}(t)) = \mathrm{Var}(Y_2(0.1\,t)) = 0.1\,t.$$
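
A quick numerical check of the two constructions (my own snippet): under the coupling the difference is exactly $Y_2(0.1\,t)$, so its variance is $0.1\,t$ rather than $26.1\,t$:

```python
import numpy as np

rng = np.random.default_rng(0)
t, n = 1.0, 200_000

# Independent construction: Var = 13.1*t + 13*t = 26.1*t
d_indep = rng.poisson(13.1 * t, n) - rng.poisson(13.0 * t, n)

# Coupled construction shares Y_1(13 t), so the difference is Y_2(0.1 t)
d_coupled = rng.poisson(0.1 * t, n)

print(d_indep.var(), d_coupled.var())   # roughly 26.1 and 0.1
```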

How do we generate processes simultaneously?

More generally, suppose we want
1. a non-homogeneous Poisson process with intensity $f(t)$, and
2. a non-homogeneous Poisson process with intensity $g(t)$.

We can let $Y_1$, $Y_2$, and $Y_3$ be independent, unit-rate Poisson processes and define
$$Z_f(t) = Y_1\Big(\int_0^t f(s)\wedge g(s)\,ds\Big) + Y_2\Big(\int_0^t f(s) - f(s)\wedge g(s)\,ds\Big),$$
$$Z_g(t) = Y_1\Big(\int_0^t f(s)\wedge g(s)\,ds\Big) + Y_3\Big(\int_0^t g(s) - f(s)\wedge g(s)\,ds\Big),$$
where $a\wedge b = \min\{a,b\}$ and we are using that, for example,
$$Y_1\Big(\int_0^t f(s)\wedge g(s)\,ds\Big) + Y_2\Big(\int_0^t f(s) - f(s)\wedge g(s)\,ds\Big) \overset{d}{=} Y\Big(\int_0^t f(s)\,ds\Big),$$
where $Y$ is a unit-rate Poisson process.

Back to our processes

$$X(t) = X(0) + \sum_k Y_k\Big(\int_0^t \lambda_k(X(s))\,ds\Big)\zeta_k,$$
$$Z(t) = X(0) + \sum_k Y_k\Big(\int_0^t \lambda_k(Z(\eta(s)))\,ds\Big)\zeta_k.$$
Now couple:
$$X(t) = X(0) + \sum_k Y_{k,1}\Big(\int_0^t \lambda_k(X(s)) \wedge \lambda_k(Z_l(\eta_l(s)))\,ds\Big)\zeta_k + \sum_k Y_{k,2}\Big(\int_0^t \lambda_k(X(s)) - \lambda_k(X(s)) \wedge \lambda_k(Z_l(\eta_l(s)))\,ds\Big)\zeta_k,$$
$$Z_l(t) = Z_l(0) + \sum_k Y_{k,1}\Big(\int_0^t \lambda_k(X(s)) \wedge \lambda_k(Z_l(\eta_l(s)))\,ds\Big)\zeta_k + \sum_k Y_{k,3}\Big(\int_0^t \lambda_k(Z_l(\eta_l(s))) - \lambda_k(X(s)) \wedge \lambda_k(Z_l(\eta_l(s)))\,ds\Big)\zeta_k.$$
The algorithm for simulation is equivalent to the next reaction method or Gillespie's algorithm.

For approximate processes

$$Z_l(t) = Z_l(0) + \sum_k Y_{k,1}\Big(\int_0^t \lambda_k(Z_l(\eta_l(s))) \wedge \lambda_k(Z_{l-1}(\eta_{l-1}(s)))\,ds\Big)\zeta_k + \sum_k Y_{k,2}\Big(\int_0^t \lambda_k(Z_l(\eta_l(s))) - \lambda_k(Z_l(\eta_l(s))) \wedge \lambda_k(Z_{l-1}(\eta_{l-1}(s)))\,ds\Big)\zeta_k,$$
$$Z_{l-1}(t) = Z_{l-1}(0) + \sum_k Y_{k,1}\Big(\int_0^t \lambda_k(Z_l(\eta_l(s))) \wedge \lambda_k(Z_{l-1}(\eta_{l-1}(s)))\,ds\Big)\zeta_k + \sum_k Y_{k,3}\Big(\int_0^t \lambda_k(Z_{l-1}(\eta_{l-1}(s))) - \lambda_k(Z_l(\eta_l(s))) \wedge \lambda_k(Z_{l-1}(\eta_{l-1}(s)))\,ds\Big)\zeta_k.$$
The algorithm for simulation is equivalent in cost to τ-leaping.
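
A sketch of one coupled draw $(Z_l(T), Z_{l-1}(T))$ with $h_{l-1} = M\,h_l$ (my own, with illustrative names; the shared and leftover Poisson draws per reaction mirror $Y_{k,1}$, $Y_{k,2}$, $Y_{k,3}$ above):

```python
import numpy as np

def coupled_tau_leap_pair(x0, zeta, propensities, T, h_coarse, M,
                          rng=np.random.default_rng()):
    """One coupled draw (Z_l(T), Z_{l-1}(T)) with fine step h = h_coarse / M.

    Each fine step fires Poisson(min(a, b) * h) shared jumps, where a and b
    are the frozen fine/coarse intensities, plus Poisson((a - min)*h) jumps
    in the fine process only and Poisson((b - min)*h) in the coarse only.
    The shared channel is what keeps Var(f(Z_l) - f(Z_{l-1})) small.
    """
    zf = np.array(x0, dtype=float)    # fine level Z_l
    zc = np.array(x0, dtype=float)    # coarse level Z_{l-1}
    h = h_coarse / M
    for _ in range(int(round(T / h_coarse))):
        b = np.maximum(propensities(zc), 0.0)      # frozen over the coarse step
        for _ in range(M):
            a = np.maximum(propensities(zf), 0.0)  # frozen over the fine step
            m = np.minimum(a, b)
            shared = rng.poisson(m * h)
            zf += (shared + rng.poisson((a - m) * h)) @ zeta
            zc += (shared + rng.poisson((b - m) * h)) @ zeta
    return zf, zc
```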

Multi-level Monte Carlo: chemical kinetic setting

Can prove:¹

Theorem (Anderson, Higham 2011). Suppose $(X, Z_l)$ satisfy the coupling. Then
$$\sup_{t \le T} E\,|X(t) - Z_l(t)|^2 \le C_1(T)\,N^{-\rho}\,h_l + C_2(T)\,h_l^2.$$

Theorem (Anderson, Higham 2011). Suppose $(Z_l, Z_{l-1})$ satisfy the coupling. Then
$$\sup_{t \le T} E\,|Z_l(t) - Z_{l-1}(t)|^2 \le C_1(T)\,N^{-\rho}\,h_l + C_2(T)\,h_l^2.$$

¹ David F. Anderson and Desmond J. Higham, Multi-level Monte Carlo for stochastically modeled chemical kinetic systems. To appear in SIAM Multiscale Modeling and Simulation. Available at arXiv:1107.2181 and at www.math.wisc.edu/~anderson.

Multi-level Monte Carlo: an unbiased estimator

For well chosen $n_{l_0}$, $n_l$, and $n_E$, we have
$$\mathrm{Var}(\hat Q) = \mathrm{Var}\Big(\hat Q_E + \sum_{l=l_0+1}^{L}\hat Q_l + \hat Q_{l_0}\Big) = O(\epsilon^2),$$
with
$$\text{Comp. cost} = \epsilon^{-2}\big(N^{-\rho}h_L + h_L^2\big)N + \epsilon^{-2}\Big(h_{l_0}^{-1} + \ln(\epsilon^{-2})\,N^{-\rho} + \ln(\epsilon^{-1})\,\frac{1}{M-1}\,\frac{1}{h_{l_0}}\Big).$$

Multi-level Monte Carlo: an unbiased estimator

Some observations:
1. The weak error plays no role in the analysis: we are free to choose $h_L$.
2. Common problems associated with tau-leaping (negativity of species numbers) do not matter. Just define the process in a sensible way.
3. The method is unbiased.

Example

Consider a model of gene transcription and translation:
$$G \xrightarrow{25} G + M, \quad M \xrightarrow{1000} M + P, \quad P + P \xrightarrow{0.001} D, \quad M \xrightarrow{0.1} \emptyset, \quad P \xrightarrow{1} \emptyset.$$
Suppose:
1. we initialize with $G = 1$, $M = 0$, $P = 0$, $D = 0$;
2. we want to estimate the expected number of dimers at time $T = 1$;
3. to an accuracy of $\pm 1.0$ with 95% confidence.

Example

Method: exact algorithm with crude Monte Carlo.

| Approximation | # paths | CPU Time | # updates |
| 3,714.2 ± 1.0 | 4,740,000 | 149,000 CPU s (41 hours!) | 8.27 × 10^10 |

Method: Euler tau-leaping with crude Monte Carlo.

| Step-size | Approximation | # paths | CPU Time | # updates |
| h = 3^-7 | 3,712.3 ± 1.0 | 4,750,000 | 13,374.6 s | 6.2 × 10^10 |
| h = 3^-6 | 3,707.5 ± 1.0 | 4,750,000 | 6,207.9 s | 2.1 × 10^10 |
| h = 3^-5 | 3,693.4 ± 1.0 | 4,700,000 | 2,803.9 s | 6.9 × 10^9 |
| h = 3^-4 | 3,654.6 ± 1.0 | 4,650,000 | 1,219.0 s | 2.6 × 10^9 |

Method: unbiased MLMC with $l_0 = 2$, and M and L detailed below.

| Step-size parameters | Approx. | CPU Time | # updates |
| M = 3, L = 6 | 3,713.9 ± 1.0 | 1,063.3 s | 1.1 × 10^9 |
| M = 3, L = 5 | 3,714.7 ± 1.0 | 1,114.9 s | 9.4 × 10^8 |
| M = 3, L = 4 | 3,714.2 ± 1.0 | 1,656.6 s | 1.0 × 10^9 |
| M = 4, L = 4 | 3,714.2 ± 1.0 | 1,334.8 s | 1.1 × 10^9 |
| M = 4, L = 5 | 3,713.8 ± 1.0 | 1,104.9 s | 1.1 × 10^9 |

The exact algorithm with crude Monte Carlo demanded roughly 140 times more CPU time than our unbiased MLMC estimator!

Example

Method: exact algorithm with crude Monte Carlo.

| Approximation | # paths | CPU Time | # updates |
| 3,714.2 ± 1.0 | 4,740,000 | 149,000 CPU s (41 hours!) | 8.27 × 10^10 |

Unbiased multi-level Monte Carlo with M = 3, L = 5, and $l_0 = 2$:

| Level | # paths | CPU Time | Var. estimator | # updates |
| (X, Z_{3^-5}) | 3,900 | 279.6 s | 0.0658 | 6.8 × 10^7 |
| (Z_{3^-5}, Z_{3^-4}) | 30,000 | 49.0 s | 0.0217 | 8.8 × 10^7 |
| (Z_{3^-4}, Z_{3^-3}) | 150,000 | 71.7 s | 0.0179 | 1.5 × 10^8 |
| (Z_{3^-3}, Z_{3^-2}) | 510,000 | 112.3 s | 0.0319 | 1.7 × 10^8 |
| Tau-leap with h = 3^-2 | 8,630,000 | 518.4 s | 0.1192 | 4.7 × 10^8 |
| Totals | N.A. | 1,031.0 s | 0.2565 | 9.5 × 10^8 |

Some conclusions about this method

1. Gillespie's algorithm is by far the most common way to compute expectations:
   1.1 Means.
   1.2 Variances.
   1.3 Probabilities.
2. The new method (MLMC) also performs this task with no bias (exact).
3. It will be at worst the same speed as Gillespie (exact algorithm + crude Monte Carlo).
4. It will commonly be many orders of magnitude faster.
5. It is applicable to essentially all continuous time Markov chains:
$$X(t) = X(0) + \sum_k Y_k\Big(\int_0^t \lambda_k(X(s))\,ds\Big)\zeta_k.$$
6. Con: it is substantially harder to implement; good software is needed.
7. It makes no use of any specific structure or scaling in the problem.

Another example: viral infection

Let
1. T = viral template,
2. G = viral genome,
3. S = viral structure,
4. V = virus.

Reactions:
R1: $T + \text{stuff} \xrightarrow{\kappa_1} T + G$, with $\kappa_1 = 1$
R2: $G \xrightarrow{\kappa_2} T$, with $\kappa_2 = 0.025$
R3: $T + \text{stuff} \xrightarrow{\kappa_3} T + S$, with $\kappa_3 = 1000$
R4: $T \xrightarrow{\kappa_4} \emptyset$, with $\kappa_4 = 0.25$
R5: $S \xrightarrow{\kappa_5} \emptyset$, with $\kappa_5 = 2$
R6: $G + S \xrightarrow{\kappa_6} V$, with $\kappa_6 = 7.5 \times 10^{-6}$

R. Srivastava, L. You, J. Summers, and J. Yin, J. Theoret. Biol., 2002. E. Haseltine and J. Rawlings, J. Chem. Phys., 2002. K. Ball, T. Kurtz, L. Popovic, and G. Rempala, Annals of Applied Probability, 2006. W. E, D. Liu, and E. Vanden-Eijnden, J. Comput. Phys., 2006.

Another example: viral infection

The stochastic equations for $X = (X_G, X_S, X_T, X_V)$ are
$$X_1(t) = X_1(0) + Y_1\Big(\int_0^t X_3(s)\,ds\Big) - Y_2\Big(0.025\int_0^t X_1(s)\,ds\Big) - Y_6\Big(7.5\times10^{-6}\int_0^t X_1(s)X_2(s)\,ds\Big),$$
$$X_2(t) = X_2(0) + Y_3\Big(1000\int_0^t X_3(s)\,ds\Big) - Y_5\Big(2\int_0^t X_2(s)\,ds\Big) - Y_6\Big(7.5\times10^{-6}\int_0^t X_1(s)X_2(s)\,ds\Big),$$
$$X_3(t) = X_3(0) + Y_2\Big(0.025\int_0^t X_1(s)\,ds\Big) - Y_4\Big(0.25\int_0^t X_3(s)\,ds\Big),$$
$$X_4(t) = X_4(0) + Y_6\Big(7.5\times10^{-6}\int_0^t X_1(s)X_2(s)\,ds\Big).$$

Another example: viral infection

Recall reactions R1-R6 above. If $T > 0$, reactions 3 and 5 are much faster than the others, so, conditional on the template count, $S$ looks approximately Poisson$(500\,T)$. We can average out the fast components to get an approximate process $Z(t)$.

Another example: viral infection

The approximate (averaged) process satisfies
$$Z_1(t) = X_1(0) + Y_1\Big(\int_0^t Z_3(s)\,ds\Big) - Y_2\Big(0.025\int_0^t Z_1(s)\,ds\Big) - Y_6\Big(3.75\times10^{-3}\int_0^t Z_1(s)Z_3(s)\,ds\Big),$$
$$Z_3(t) = X_3(0) + Y_2\Big(0.025\int_0^t Z_1(s)\,ds\Big) - Y_4\Big(0.25\int_0^t Z_3(s)\,ds\Big),$$
$$Z_4(t) = X_4(0) + Y_6\Big(3.75\times10^{-3}\int_0^t Z_1(s)Z_3(s)\,ds\Big). \tag{1}$$
Now use
$$Ef(X(t)) = E[f(X(t)) - f(Z(t))] + Ef(Z(t)).$$

Another example: Viral infection ( t min{x 3 (s, Z 3 (s}ds ζ 1 + Y 1,2 ( t X(t = X( + Y 1,1 t ( t + Y 2,1 (.25 min{x 1 (s, Z 1 (s}ds ζ 2 + Y 2,2.25 t + Y 3 (1 X 3 (sds ζ 3 t ( t + Y 4,1 (.25 min{x 3 (s, Z 3 (s}(sds ζ 4 + Y 4,2.25 t + Y 5 (2 X 2 (sds ζ 5 ( t + Y 6,1 X 3 (s min{x 3 (s, Z 3 (s}ds ζ 1 X 1 (s min{x 1 (s, Z 1 (s}ds ζ 2 X 3 (s min{x 3 (s, Z 3 (s}(sds ζ 4 ( t min{λ 6 (X(s, Λ 6 (Z (s}ds ζ 6 Y 6,2 λ 6 (X(s min{λ 6 (X(s, Λ 6 (Z (s}ds ζ 6 ( t ( t Z (t = Y 1,1 min{x 3 (s, Z 3 (s}ds ζ 1 + Y 1,3 Z 3 (s min{x 3 (s, Z 3 (s}ds ζ 1 t ( t + Y 2,1 (.25 min{x 1 (s, Z 1 (s}ds ζ 2 + Y 2,3.25 Z 1 (s min{x 1 (s, Z 1 (s}ds ζ 2 t ( t + Y 4,1 (.25 min{x 3 (s, Z 3 (s}(sds ζ 4 + Y 4,3.25 Z 3 (s min{x 3 (s, Z 3 (s}(sds ζ 4 ( t ( t + Y 6,1 min{λ 6 (X(s, Λ 6 (Z (s}ds ζ 6 Y 6,3 Λ 6 (Z (s min{λ 6 (X(s, Λ 6 (Z (s}ds ζ 6,

Another example: viral infection

Suppose we want $E X_{\text{virus}}(20)$, given $T(0) = 10$ and all other counts zero.

Method: exact algorithm with crude Monte Carlo.

| Approximation | # paths | CPU Time | # updates |
| 13.85 ± 0.07 | 75,000 | 24,800 CPU s | 1.45 × 10^10 |

Method: $Ef(X(t)) = E[f(X(t)) - f(Z(t))] + Ef(Z(t))$.

| Approximation | CPU Time | # updates |
| 13.91 ± 0.07 | 1,118.5 CPU s | 2.41 × 10^8 |

Exact + crude Monte Carlo used:
1. 60 times more total steps, and
2. 22 times more CPU time.

Mathematical Analysis

We had
$$X(t) = X(0) + \sum_k Y_k\Big(\int_0^t \lambda_k(X(s))\,ds\Big)\zeta_k,$$
and we assumed $\sum_k \lambda_k(X(0)) \approx N \gg 1$.

There are therefore two extreme parameters floating around our models:
1. Some parameter $N \gg 1$ (inherent to the model).
2. $h$, the stepsize (inherent to the approximation).

To quantify errors, we need to account for both.

Mathematical Analysis: scaling in the style of Thomas Kurtz

For each species $i$, define the normalized abundance
$$X_i^N(t) = N^{-\alpha_i} X_i(t),$$
where $\alpha_i$ should be selected so that $X_i^N = O(1)$.

Rate constants, $\kappa_k$, may also vary over several orders of magnitude. We write $\kappa_k = \kappa'_k N^{\beta_k}$, where the $\beta_k$ are selected so that $\kappa'_k = O(1)$.

This eventually leads to the scaled model
$$X^N(t) = X^N(0) + \sum_k Y_k\Big(N^{\gamma}\int_0^t N^{\beta_k + \alpha\cdot\nu_k - \gamma}\,\lambda'_k(X^N(s))\,ds\Big)\zeta_k^N.$$

Results

Let $\rho_k$ satisfy $|\zeta_k^N| \le N^{-\rho_k}$, set $\rho = \min_k\{\rho_k\}$, and write
$$X^N(t) = X^N(0) + \sum_k Y_k\Big(N^{\gamma}\int_0^t N^{c_k}\,\lambda'_k(X^N(s))\,ds\Big)\zeta_k^N.$$

Theorem (A., Higham 2011). Suppose $(Z_l^N, Z_{l-1}^N)$ satisfy the coupling with $Z_l^N(0) = Z_{l-1}^N(0)$. Then
$$\sup_{t\le T} E\,|Z_l^N(t) - Z_{l-1}^N(t)|^2 \le C_1(T, N, \gamma)\,N^{-\rho}\,h_l + C_2(T, N, \gamma)\,h_l^2.$$

Theorem (A., Higham 2011). Suppose $(X^N, Z_l^N)$ satisfy the coupling with $X^N(0) = Z_l^N(0)$. Then
$$\sup_{t\le T} E\,|X^N(t) - Z_l^N(t)|^2 \le C_1(T, N, \gamma)\,N^{-\rho}\,h_l + C_2(T, N, \gamma)\,h_l^2.$$

Flavor of the proof

Theorem (A., Higham 2011). Suppose $(X^N, Z_l^N)$ satisfy the coupling with $X^N(0) = Z_l^N(0)$. Then
$$\sup_{t\le T} E\,|X^N(t) - Z_l^N(t)|^2 \le C_1(T, N, \gamma)\,N^{-\rho}\,h_l + C_2(T, N, \gamma)\,h_l^2.$$
The coupled pair satisfies
$$X^N(t) = X^N(0) + \sum_k Y_{k,1}\Big(N^{\gamma}\int_0^t N^{c_k}\,\lambda'_k(X^N(s)) \wedge \lambda'_k(Z_l^N(\eta_l(s)))\,ds\Big)\zeta_k^N + \sum_k Y_{k,2}\Big(N^{\gamma}\int_0^t N^{c_k}\big[\lambda'_k(X^N(s)) - \lambda'_k(X^N(s)) \wedge \lambda'_k(Z_l^N(\eta_l(s)))\big]\,ds\Big)\zeta_k^N,$$
$$Z_l^N(t) = Z_l^N(0) + \sum_k Y_{k,1}\Big(N^{\gamma}\int_0^t N^{c_k}\,\lambda'_k(X^N(s)) \wedge \lambda'_k(Z_l^N(\eta_l(s)))\,ds\Big)\zeta_k^N + \sum_k Y_{k,3}\Big(N^{\gamma}\int_0^t N^{c_k}\big[\lambda'_k(Z_l^N(\eta_l(s))) - \lambda'_k(X^N(s)) \wedge \lambda'_k(Z_l^N(\eta_l(s)))\big]\,ds\Big)\zeta_k^N.$$

Flavor of the proof

So,
$$X^N(t) - Z_l^N(t) = \sum_k Y_{k,2}\Big(N^{\gamma}\int_0^t N^{c_k}\big[\lambda'_k(X^N(s)) - \lambda'_k(X^N(s)) \wedge \lambda'_k(Z_l^N(\eta_l(s)))\big]\,ds\Big)\zeta_k^N - \sum_k Y_{k,3}\Big(N^{\gamma}\int_0^t N^{c_k}\big[\lambda'_k(Z_l^N(\eta_l(s))) - \lambda'_k(X^N(s)) \wedge \lambda'_k(Z_l^N(\eta_l(s)))\big]\,ds\Big)\zeta_k^N.$$
Hence,
$$X^N(t) - Z_l^N(t) = M^N(t) + \sum_k N^{\gamma}\,\zeta_k^N\,N^{c_k}\int_0^t \big(\lambda'_k(X^N(s)) - \lambda'_k(Z_l^N(\eta_l(s)))\big)\,ds.$$
Now work.

Next problem: parameter sensitivities

Motivated by Jim Rawlings. We have
$$X^\theta(t) = X^\theta(0) + \sum_k Y_k\Big(\int_0^t \lambda_k^\theta(X^\theta(s))\,ds\Big)\zeta_k,$$
and we define
$$J(\theta) = Ef(X^\theta(t)).$$
We want
$$J'(\theta) = \frac{d}{d\theta} Ef(X^\theta(t)).$$
There are multiple methods. We consider finite differences:
$$J'(\theta) = \frac{J(\theta + \epsilon) - J(\theta)}{\epsilon} + O(\epsilon).$$

Next problem: parameter sensitivities

Noting that
$$J'(\theta) = \frac{d}{d\theta} Ef(X^\theta(t)) = \frac{Ef(X^{\theta+\epsilon}(t)) - Ef(X^\theta(t))}{\epsilon} + O(\epsilon),$$
the usual finite difference estimator is
$$D_R(\epsilon) = \epsilon^{-1}\Big(\frac{1}{R}\sum_{i=1}^R f(X^{\theta+\epsilon}_{[i]}(t)) - \frac{1}{R}\sum_{j=1}^R f(X^\theta_{[j]}(t))\Big).$$
If the paths are generated independently, then $\mathrm{Var}(D_R(\epsilon)) = O(R^{-1}\epsilon^{-2})$.

Next problem: parameter sensitivities

Couple the processes:
$$X^{\theta+\epsilon}(t) = X^{\theta+\epsilon}(0) + \sum_k Y_{k,1}\Big(\int_0^t \lambda_k^{\theta+\epsilon}(X^{\theta+\epsilon}(s)) \wedge \lambda_k^{\theta}(X^{\theta}(s))\,ds\Big)\zeta_k + \sum_k Y_{k,2}\Big(\int_0^t \lambda_k^{\theta+\epsilon}(X^{\theta+\epsilon}(s)) - \lambda_k^{\theta+\epsilon}(X^{\theta+\epsilon}(s)) \wedge \lambda_k^{\theta}(X^{\theta}(s))\,ds\Big)\zeta_k,$$
$$X^{\theta}(t) = X^{\theta}(0) + \sum_k Y_{k,1}\Big(\int_0^t \lambda_k^{\theta+\epsilon}(X^{\theta+\epsilon}(s)) \wedge \lambda_k^{\theta}(X^{\theta}(s))\,ds\Big)\zeta_k + \sum_k Y_{k,3}\Big(\int_0^t \lambda_k^{\theta}(X^{\theta}(s)) - \lambda_k^{\theta+\epsilon}(X^{\theta+\epsilon}(s)) \wedge \lambda_k^{\theta}(X^{\theta}(s))\,ds\Big)\zeta_k.$$
Use:
$$D_R(\epsilon) = \epsilon^{-1}\,\frac{1}{R}\sum_{i=1}^R \big[f(X^{\theta+\epsilon}_{[i]}(t)) - f(X^\theta_{[i]}(t))\big].$$
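
The coupled pair $(X^{\theta+\epsilon}, X^{\theta})$ is itself a CTMC with $3K$ channels (shared, perturbed-only, nominal-only), so it can be simulated exactly; a sketch of my own, with illustrative names:

```python
import numpy as np

def coupled_fd_pair(x0, zeta, lam, theta, eps, T, rng=np.random.default_rng()):
    """One coupled draw (X^{theta+eps}(T), X^{theta}(T)) via the split coupling.

    lam(theta, x) returns the length-K vector of intensities. Reaction k is
    split into a shared channel with rate min(lam_k^{theta+eps}, lam_k^{theta})
    that fires in both processes, plus two residual channels firing in one.
    """
    xp = np.array(x0, dtype=float)    # perturbed process X^{theta+eps}
    xm = np.array(x0, dtype=float)    # nominal process X^{theta}
    t, K = 0.0, len(zeta)
    while True:
        a, b = lam(theta + eps, xp), lam(theta, xm)
        m = np.minimum(a, b)
        rates = np.concatenate([m, a - m, b - m])  # shared / perturbed-only / nominal-only
        total = rates.sum()
        if total == 0.0:
            return xp, xm
        t += rng.exponential(1.0 / total)
        if t > T:
            return xp, xm
        j = rng.choice(3 * K, p=rates / total)
        k = j % K
        if j < K:                 # shared channel: both processes jump
            xp += zeta[k]
            xm += zeta[k]
        elif j < 2 * K:           # perturbed-only channel
            xp += zeta[k]
        else:                     # nominal-only channel
            xm += zeta[k]
```

Averaging $(f(x_p) - f(x_m))/\epsilon$ over $R$ such coupled pairs gives $D_R(\epsilon)$ above.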

Next problem: parameter sensitivities

Theorem (Anderson, 2011).¹ Suppose $(X^{\theta+\epsilon}, X^\theta)$ satisfy the coupling. Then, for any $T > 0$ there is a $C_{T,f} > 0$ for which
$$\sup_{t \le T} E\Big[\big(f(X^{\theta+\epsilon}(t)) - f(X^\theta(t))\big)^2\Big] \le C_{T,f}\,\epsilon.$$
This lowers the variance of the estimator from $O(R^{-1}\epsilon^{-2})$ to $O(R^{-1}\epsilon^{-1})$: lowered by an order of magnitude (in $\epsilon$).

¹ David F. Anderson, An efficient finite difference method for parameter sensitivities of continuous time Markov chains. Submitted. Available at arXiv:1109.2890 and at www.math.wisc.edu/~anderson.

Parameter Sensitivities

$$G \xrightarrow{2} G + M, \quad M \xrightarrow{10} M + P, \quad M \xrightarrow{k} \emptyset, \quad P \xrightarrow{1} \emptyset.$$
We want
$$\frac{\partial}{\partial k}\,E\big[X^k_{\text{protein}}(30)\big], \qquad k = 1/4.$$

| Method | # paths | Approximation | # updates | CPU Time |
| Likelihood ratio | 689,600 | -312.1 ± 6.0 | 2.9 × 10^9 | 3,506.6 s |
| Exact/Naive FD | 246,200 | -318.8 ± 6.0 | 2.1 × 10^9 | 3,282.1 s |
| CRP | 26,320 | -320.7 ± 6.0 | 2.2 × 10^8 | 410.0 s |
| Coupled | 4,780 | -321.2 ± 6.0 | 2.1 × 10^7 | 35.3 s |

Analysis

Theorem. Suppose $(X^{\theta+\epsilon}, X^\theta)$ satisfy the coupling. Then, for any $T > 0$ there is a $C_{T,f} > 0$ for which
$$E\Big[\sup_{t \le T}\big(f(X^{\theta+\epsilon}(t)) - f(X^\theta(t))\big)^2\Big] \le C_{T,f}\,\epsilon.$$
Proof: the key observation is that
$$X^{\theta+\epsilon}(t) - X^\theta(t) = M^{\theta,\epsilon}(t) + \int_0^t \big(F_{\theta+\epsilon}(X^{\theta+\epsilon}(s)) - F_{\theta}(X^{\theta}(s))\big)\,ds,$$
where $M^{\theta,\epsilon}$ is a martingale and $F_\theta(x) = \sum_k \lambda_k^\theta(x)\,\zeta_k$. Now work on the martingale and the absolutely continuous part.

Thanks!

References:
1. David F. Anderson and Desmond J. Higham, Multi-level Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics, to appear in SIAM Multiscale Modeling and Simulation. Available at arXiv:1107.2181 and at www.math.wisc.edu/~anderson.
2. David F. Anderson, An efficient finite difference method for parameter sensitivities of continuous time Markov chains, submitted. Available at arXiv:1109.2890 and at www.math.wisc.edu/~anderson.

Funding: NSF-DMS-1009275.