Stochastic models of biochemical systems
David F. Anderson, anderson@math.wisc.edu
Department of Mathematics, University of Wisconsin-Madison
University of Amsterdam, November 14th, 2012
Stochastic models of biochemical systems
Goal: give a broad introduction to stochastic models of biochemical systems, with minimal technical details.
Outline
1. Construct a useful representation for the most common continuous time Markov chain model for population processes.
2. Discuss some computational methods for sensitivity analysis.
3. Discuss various approximate models for these CTMCs.
Example: ODE Lotka-Volterra predator-prey model
Think of A as a prey and B as a predator:
A -> 2A (rate constant κ1), A + B -> 2B (κ2), B -> ∅ (κ3),
with κ1 = 2, κ2 = 0.002, κ3 = 2.
Deterministic model. Let x(t) = [# prey at t, # predators at t]^T. Then
x'(t) = κ1 x1(t) [1, 0]^T + κ2 x1(t)x2(t) [-1, 1]^T + κ3 x2(t) [0, -1]^T,
or, in integrated form,
x(t) = x(0) + κ1 ∫_0^t x1(s)ds [1, 0]^T + κ2 ∫_0^t x1(s)x2(s)ds [-1, 1]^T + κ3 ∫_0^t x2(s)ds [0, -1]^T.
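A minimal sketch (not from the talk) of integrating the deterministic Lotka-Volterra ODE by forward Euler. The rate constants follow the slide; the step size, horizon, initial condition, and function name are illustrative assumptions.

```python
# Forward Euler for x1' = k1*x1 - k2*x1*x2, x2' = k2*x1*x2 - k3*x2.
# Parameters k1 = 2, k2 = 0.002, k3 = 2 are the slide's rate constants;
# dt, T, and the starting point are illustrative choices.

def lv_euler(x1, x2, k1=2.0, k2=0.002, k3=2.0, T=5.0, dt=1e-4):
    for _ in range(int(T / dt)):
        dx1 = k1 * x1 - k2 * x1 * x2       # prey: birth minus predation
        dx2 = k2 * x1 * x2 - k3 * x2       # predator: predation minus death
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2

x1, x2 = lv_euler(800.0, 800.0)
```

Starting at the equilibrium (k3/k2, k1/k2) = (1000, 1000) the trajectory stays put; starting off-equilibrium it orbits around that point.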
Lotka-Volterra
Think of A as a prey and B as a predator:
A -> 2A (κ1), A + B -> 2B (κ2), B -> ∅ (κ3), with κ1 = 2, κ2 = 0.002, κ3 = 2.
[Figure: simulated prey and predator populations oscillating out of phase over time.]
Biological example: transcription-translation
Gene transcription & translation:
G -> G + M (κ1, transcription)
M -> M + P (κ2, translation)
M -> ∅ (κ3, degradation)
P -> ∅ (κ4, degradation)
G + P <-> B (κ5, κ-5, binding/unbinding of gene)
[Cartoon representation of the network omitted.]
1 J. Paulsson, Physics of Life Reviews 2 (2005), 157-175.
Another example: Viral infection
Let
1. T = viral template.
2. G = viral genome.
3. S = viral structure.
4. V = virus.
Reactions:
R1) T + stuff -> T + G, κ1 = 1
R2) G -> T, κ2 = 0.025
R3) T + stuff -> T + S, κ3 = 1000
R4) T -> ∅, κ4 = 0.25
R5) S -> ∅, κ5 = 2
R6) G + S -> V, κ6 = 7.5 × 10^-6
R. Srivastava, L. You, J. Summers, and J. Yin, J. Theoret. Biol., 2002. E. Haseltine and J. Rawlings, J. Chem. Phys., 2002. K. Ball, T. Kurtz, L. Popovic, and G. Rempala, Annals of Applied Probability, 2006. W. E, D. Liu, and E. Vanden-Eijnden, J. Comput. Phys., 2006.
Some examples
E. coli Heat Shock Response Model: 9 species, 18 reactions.
2 Hye Won Kang, presentation at SPA in 2007.
Modeling
1. These models (and much more complicated ones) have historically been modeled predominantly using ODEs.
2. However:
2.1 there are often low numbers of molecules, which makes the timing of reactions more random (less averaging),
2.2 when a reaction occurs, the system jumps to a new state by a non-trivial amount (e.g., ±1 molecule).
3. Researchers (mostly) lived with these shortcomings until the late 1990s and early 2000s, when it was shown that ODE models cannot capture important qualitative behavior of certain models: the λ-phage lysis-lysogeny decision mechanism (Arkin-McAdams 1998); green fluorescent protein. ODEs were often the wrong modeling choice.
Specifying infinitesimal behavior
Q: What is a better modeling choice? It should be
1. discrete space, since we are counting molecules, and
2. stochastic dynamics.
Let's return to the development of ODEs. An ordinary differential equation is specified by describing how a function should vary over a small period of time:
X(t + Δt) - X(t) ≈ F(X(t)) Δt.
A more precise description (consider a telescoping sum):
X(t) = X(0) + ∫_0^t F(X(s)) ds.
Infinitesimal behavior for jump processes
We are interested in functions that are piecewise constant and random. Changes, when they occur, won't be small. If reaction k occurs at time t,
X(t) - X(t-) = ζ_k ∈ Z^d.
What is small? The probability of seeing a jump of a particular size:
P{X(t + Δt) - X(t) = ζ_k | F_t} ≈ λ_{ζ_k}(t) Δt.
Question: Can we specify the λ_{ζ_k} in some way that determines X? For the ODE, F depended on X. Maybe λ_{ζ_k} should depend on X?
Simple model
For example, consider the simple system
A + B -> C,
where one molecule each of A and B is being converted to one of C.
Intuition for the standard stochastic model:
P{reaction occurs in (t, t + Δt] | F_t} ≈ κ X_A(t) X_B(t) Δt,
where κ is a positive constant, the reaction rate constant, and F_t is all the information pertaining to the process up through time t.
Can we specify a reasonable model satisfying this assumption?
Background information: The Poisson process
Will view a Poisson process, Y(·), through the lens of an underlying point process.
(a) Let {e_i} be i.i.d. exponential random variables with parameter one.
(b) Now, put points down on a line with spacing equal to the e_i.
Let Y(t) denote the number of points hit by time t. In the figure, Y(t) = 6.
[Figure: points on a time line with spacings e_1, e_2, e_3, ..., and a sample path of the unit-rate (λ = 1) counting process.]
The Poisson process
Let Y_1 be a unit-rate Poisson process. Define
Y_λ(t) ≡ Y_1(λt).
Then Y_λ is a Poisson process with parameter λ.
Intuition: the Poisson process with rate λ is simply the number of points hit (of the unit-rate point process) when we run along the time frame at rate λ.
[Figure: sample path of a Poisson process with λ = 3.]
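A small sketch of this construction (not from the talk): build the jump times of a rate-λ Poisson process from i.i.d. unit exponentials, running along the point process at speed λ. The function name and parameters are illustrative.

```python
import numpy as np

def poisson_jump_times(T, lam=1.0, seed=0):
    """Jump times on [0, T] of a Poisson process with rate `lam`, built from
    i.i.d. unit-exponential spacings e_i of the underlying point process."""
    rng = np.random.default_rng(seed)
    times = []
    t = 0.0
    while True:
        t += rng.exponential(1.0) / lam   # spacing e_i of the sped-up process
        if t > T:
            return np.array(times)
        times.append(t)

jumps = poisson_jump_times(20.0, lam=3.0)
# Y_lam(t) = number of points hit by time t, so len(jumps) is Y_lam(20),
# which averages lam * T = 60 over many realizations
```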
The Poisson process
There is no reason λ needs to be constant in time, in which case
Y_λ(t) ≡ Y(∫_0^t λ(s) ds)
is a non-homogeneous Poisson process with propensity/intensity λ(t). Thus
P{Y_λ(t + Δt) - Y_λ(t) > 0 | F_t} = 1 - exp{-∫_t^{t+Δt} λ(s) ds} ≈ λ(t) Δt.
Points:
1. We have changed time to convert a unit-rate Poisson process to one which has rate or intensity or propensity λ(t).
2. Will use similar time changes of unit-rate processes to build the models of interest.
Return to models of interest
Consider the simple system A + B -> C, where one molecule each of A and B is being converted to one of C.
Intuition for the standard stochastic model:
P{reaction occurs in (t, t + Δt] | F_t} ≈ κ X_A(t) X_B(t) Δt,
where κ is a positive constant, the reaction rate constant, and F_t is all the information pertaining to the process up through time t.
Models of interest
A + B -> C
Simple book-keeping says: if
X(t) = [X_A(t), X_B(t), X_C(t)]^T
gives the state at time t, then
X(t) = X(0) + R(t) [-1, -1, 1]^T,
where R(t) is the # of times the reaction has occurred by time t and X(0) is the initial condition.
Goal: represent R(t) in terms of a Poisson process.
Models of interest
Recall that for A + B -> C our intuition was to specify the infinitesimal behavior
P{reaction occurs in (t, t + Δt] | F_t} ≈ κ X_A(t) X_B(t) Δt,
and that for a counting process with specified intensity λ(t) we have
P{Y_λ(t + Δt) - Y_λ(t) = 1 | F_t} ≈ λ(t) Δt.
This suggests we can model
R(t) = Y(∫_0^t κ X_A(s) X_B(s) ds),
where Y is a unit-rate Poisson process. Hence
X(t) = X(0) + Y(∫_0^t κ X_A(s) X_B(s) ds) [-1, -1, 1]^T.
This equation uniquely determines X for all t ≥ 0.
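A minimal simulation sketch (not from the talk) for this one-reaction system: between jumps the intensity κ X_A X_B is constant, so the next firing of the time-changed unit-rate clock is exponential with that rate. The function name and parameter values are illustrative.

```python
import numpy as np

def simulate_A_plus_B_to_C(xA, xB, xC, kappa, T, seed=0):
    """Exact path of A + B -> C: each holding time is exponential with the
    current rate kappa * xA * xB (the time-changed unit-rate Poisson clock)."""
    rng = np.random.default_rng(seed)
    t = 0.0
    while xA > 0 and xB > 0:
        t += rng.exponential(1.0 / (kappa * xA * xB))
        if t > T:
            break
        # one firing of R(t): X(t) = X(0) + R(t) * (-1, -1, 1)
        xA, xB, xC = xA - 1, xB - 1, xC + 1
    return xA, xB, xC

xA, xB, xC = simulate_A_plus_B_to_C(30, 20, 0, kappa=0.1, T=100.0)
```

Note the conservation relations built into the stoichiometry: X_A - X_B and X_A + X_C are constant along every path.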
Build up model: Random time change representation of Kurtz
Now consider a network of reactions involving d chemical species, S_1, ..., S_d:
Σ_{i=1}^d ν_ik S_i -> Σ_{i=1}^d ν'_ik S_i.
Denote the reaction vector as ζ_k = ν'_k - ν_k, so that if reaction k occurs at time t,
X(t) = X(t-) + ζ_k.
The intensity (or propensity) of the kth reaction is λ_k : Z^d -> R≥0. By analogy with before:
X(t) = X(0) + Σ_k R_k(t) ζ_k, with R_k(t) = Y_k(∫_0^t λ_k(X(s)) ds),
so that
X(t) = X(0) + Σ_k Y_k(∫_0^t λ_k(X(s)) ds) ζ_k,
where the Y_k are independent, unit-rate Poisson processes.
Mass-action kinetics
The standard intensity function chosen is mass-action kinetics:
λ_k(x) = κ_k Π_i (x_i choose ν_ik) ν_ik! = κ_k Π_i x_i! / (x_i - ν_ik)!.
Example: If S_1 -> anything, then λ_k(x) = κ_k x_1.
Example: If S_1 + S_2 -> anything, then λ_k(x) = κ_k x_1 x_2.
Example: If 2 S_2 -> anything, then λ_k(x) = κ_k x_2 (x_2 - 1).
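The formula above can be sketched directly (not from the talk; the function name is illustrative): the product of falling factorials counts the ordered ways of choosing the required reactant molecules.

```python
from math import factorial

def mass_action_intensity(kappa, x, nu):
    """lambda_k(x) = kappa * prod_i x_i! / (x_i - nu_i)!  (falling factorials),
    where x is the state and nu the reactant (source) stoichiometry."""
    rate = float(kappa)
    for xi, vi in zip(x, nu):
        if xi < vi:
            return 0.0                       # not enough molecules to react
        rate *= factorial(xi) // factorial(xi - vi)
    return rate

# S1 -> anything:       kappa * x1
# S1 + S2 -> anything:  kappa * x1 * x2
# 2 S2 -> anything:     kappa * x2 * (x2 - 1)
```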
Other ways to understand model
The infinitesimal generator of a Markov process determines the process:
Af(x) := lim_{h->0} (1/h) [E_x f(X(h)) - f(x)]
= lim_{h->0} (1/h) [Σ_k (f(x + ζ_k) - f(x)) P(R_k(h) = 1) + O(h)]
= lim_{h->0} (1/h) [Σ_k (f(x + ζ_k) - f(x)) λ_k(x) h + O(h)]
= Σ_k λ_k(x) (f(x + ζ_k) - f(x)).
Other ways to understand model
And we have Dynkin's formula (see Ethier and Kurtz, 1986, Ch. 1):
E f(X(t)) - f(X(0)) = E ∫_0^t Af(X(s)) ds.
Letting f(y) = 1_x(y) above, so that E[f(X(t))] = P{X(t) = x} = p_t(x), gives the Kolmogorov forward equation (chemical master equation):
p_t'(x) = Σ_k λ_k(x - ζ_k) p_t(x - ζ_k) - p_t(x) Σ_k λ_k(x).
Equivalence of formulations
We now have three ways of making the infinitesimal specification
P{X(t + Δt) - X(t) = ζ_k | F_t^X} ≈ λ_k(X(t)) Δt
precise:
1. The stochastic equation: X(t) = X(0) + Σ_k Y_k(∫_0^t λ_k(X(s)) ds) ζ_k.
2. The process is Markov with infinitesimal generator (Af)(x) = Σ_k λ_k(x) (f(x + ζ_k) - f(x)).
3. The master (forward) equation for the probability distributions: p_t'(x) = Σ_k λ_k(x - ζ_k) p_t(x - ζ_k) - p_t(x) Σ_k λ_k(x).
Fortunately, if the solution of the stochastic equation doesn't blow up, the three are equivalent. This model is an example of a continuous time Markov chain.
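A small sketch of formulation 3 (not from the talk): integrate the forward equation for a pure-death network S -> ∅ with per-molecule rate λ, for which p_t'(x) = λ(x+1) p_t(x+1) - λ x p_t(x) and the exact mean is n e^{-λt}. The forward Euler step size and parameters are illustrative assumptions.

```python
import numpy as np

def death_master_equation(n, lam, T, dt=1e-4):
    """Forward Euler for p_t'(x) = lam*(x+1)*p_t(x+1) - lam*x*p_t(x),
    the master equation of S -> 0 started from X(0) = n."""
    p = np.zeros(n + 1)
    p[n] = 1.0                               # p_0 = point mass at n
    x = np.arange(n + 1)
    for _ in range(int(T / dt)):
        inflow = np.zeros_like(p)
        inflow[:-1] = lam * x[1:] * p[1:]    # probability mass arriving from x+1
        p = p + dt * (inflow - lam * x * p)
    return p

p = death_master_equation(20, 1.0, 1.0)
mean = float((np.arange(21) * p).sum())      # exact answer: 20 * exp(-1)
```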
Example: ODE Lotka-Volterra predator-prey model
Think of A as a prey and B as a predator:
A -> 2A (κ1), A + B -> 2B (κ2), B -> ∅ (κ3), with κ1 = 2, κ2 = 0.002, κ3 = 2.
Deterministic model. Let x(t) = [# prey, # predators]^T:
x(t) = x(0) + κ1 ∫_0^t x1(s)ds [1, 0]^T + κ2 ∫_0^t x1(s)x2(s)ds [-1, 1]^T + κ3 ∫_0^t x2(s)ds [0, -1]^T.
Stochastic model. Let X(t) = [# prey, # predators]^T:
X(t) = X(0) + Y_1(κ1 ∫_0^t X_1(s)ds) [1, 0]^T + Y_2(κ2 ∫_0^t X_1(s)X_2(s)ds) [-1, 1]^T + Y_3(κ3 ∫_0^t X_2(s)ds) [0, -1]^T.
Another example: Viral infection
The stochastic equations for X = (X_1, X_2, X_3, X_4) = (X_G, X_S, X_T, X_V) are
X_1(t) = X_1(0) + Y_1(∫_0^t X_3(s)ds) - Y_2(0.025 ∫_0^t X_1(s)ds) - Y_6(7.5 × 10^-6 ∫_0^t X_1(s)X_2(s)ds)
X_2(t) = X_2(0) + Y_3(1000 ∫_0^t X_3(s)ds) - Y_5(2 ∫_0^t X_2(s)ds) - Y_6(7.5 × 10^-6 ∫_0^t X_1(s)X_2(s)ds)
X_3(t) = X_3(0) + Y_2(0.025 ∫_0^t X_1(s)ds) - Y_4(0.25 ∫_0^t X_3(s)ds)
X_4(t) = X_4(0) + Y_6(7.5 × 10^-6 ∫_0^t X_1(s)X_2(s)ds).
Computational methods
These are continuous time Markov chains! Simulation/computation should be easy. The most common simulation methods include:
1. Gillespie's algorithm: answer "where" and "when" independently.
2. The next reaction method of Gibson and Bruck.
Each is an example of discrete event simulation.
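A generic sketch of Gillespie's algorithm (the skeleton is standard; the function names and the birth-death usage example are illustrative assumptions, not from the talk): "when" is exponential with the total rate, "where" is chosen proportionally to the individual propensities.

```python
import numpy as np

def gillespie(x0, zetas, propensities, T, rng):
    """Gillespie's algorithm: draw *when* from an exponential with total rate
    lam0, then draw *where* (which reaction) with probabilities lam_k / lam0."""
    x = np.array(x0, dtype=np.int64)
    t = 0.0
    while True:
        lams = np.array([lam(x) for lam in propensities], dtype=float)
        lam0 = lams.sum()
        if lam0 <= 0.0:                      # absorbing state: nothing can fire
            return x
        t += rng.exponential(1.0 / lam0)
        if t > T:
            return x
        k = rng.choice(len(lams), p=lams / lam0)
        x = x + zetas[k]

# illustrative birth-death network: 0 -> S at rate 10, S -> 0 at rate x
rng = np.random.default_rng(1)
zetas = [np.array([1]), np.array([-1])]
props = [lambda x: 10.0, lambda x: float(x[0])]
finals = [gillespie([0], zetas, props, T=5.0, rng=rng)[0] for _ in range(200)]
# the stationary distribution of this network is Poisson(10)
```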
Numerical methods
Each exact method produces sample paths that can approximate values such as (which I will talk about tomorrow at CWI)
E f(X(t)) ≈ (1/n) Σ_{i=1}^n f(X_[i](t)),
or sensitivities d/dθ E f(θ, X^θ(t)). For example:
1. Means: expected virus yield.
2. Variances.
3. Probabilities.
Problem: solving using these algorithms can be computationally expensive:
1. Each path may require a significant number of computational steps.
2. May require a significant number of paths.
Solution: need to use novel stochastic representations to get good methods.
Specific computational problem: Gradient estimation/sensitivity analysis
We have
X^θ(t) = X^θ(0) + Σ_k Y_k(∫_0^t λ_k(θ, X^θ(s)) ds) ζ_k,
with θ ∈ R^s, and we define J(θ) = E f(θ, X^θ(t)). We know how to estimate J(θ) using Monte Carlo.
However, what if we want
J'(θ) = d/dθ E f(θ, X^θ(t))?
Thus, we want to know how sensitive our statistic is to perturbations in θ. This tells us, for example:
1. Robustness of the system to perturbations in parameters.
2. Which parameters we need to estimate well from data, etc.
There are multiple methods. We will consider: finite differences.
Finite differencing
This method is pretty straightforward and is therefore used most. Simply note that
J'(θ) = (J(θ + ε) - J(θ))/ε + O(ε) = E[(f(θ + ε, X^{θ+ε}(t)) - f(θ, X^θ(t)))/ε] + O(ε).
Centered differencing reduces the bias to O(ε²). The usual finite difference estimator is
D_N(ε) = (1/N) Σ_{i=1}^N (f(θ + ε, X^{θ+ε}_[i](t)) - f(θ, X^θ_[i](t)))/ε.
Letting δ > 0 be some desired accuracy (for the confidence interval), we need N so that sqrt(Var(D_N(ε))) ≤ δ.
Finite differencing
Want sqrt(Var(D_N(ε))) ≤ δ, with
D_N(ε) = (1/N) Σ_{i=1}^N (f(θ + ε, X^{θ+ε}_[i](t)) - f(θ, X^θ_[i](t)))/ε.
If the paths are generated independently, then
Var(D_N(ε)) = N^-1 ε^-2 Var(f(θ + ε, X^{θ+ε}_[i](t)) - f(θ, X^θ_[i](t))) = O(N^-1 ε^-2),
implying
1/(sqrt(N) ε) = O(δ)  =>  N = O(ε^-2 δ^-2).
Terrible. Worse than expectations. How about common random numbers for variance reduction?
Common random numbers
It's exactly what it sounds like: reuse the random numbers used in the generation of X^{θ+ε}_[i](t) and X^θ_[i](t). Why? Because
Var(f(θ + ε, X^{θ+ε}_[i](t)) - f(θ, X^θ_[i](t))) = Var(f(θ + ε, X^{θ+ε}_[i](t))) + Var(f(θ, X^θ_[i](t))) - 2 Cov(f(θ + ε, X^{θ+ε}_[i](t)), f(θ, X^θ_[i](t))).
So, if we can couple the random variables, we can get a variance reduction! Sometimes substantial.
Common random numbers
In the context of Gillespie's algorithm, we simply reuse all the same random numbers (uniforms). This can be achieved by setting the seed of the random number generator before generating X^{θ+ε} and X^θ.
Common random numbers
CRN + Gillespie is a good idea:
1. It costs little in terms of implementation.
2. The variance reduction and gains in efficiency can be huge.
Thus, it is probably the most common method used today. But: over time, the processes decouple, often completely. Can we do better?
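A toy sketch of the seed trick (not from the talk; the model, function name, and parameter values are illustrative): simulate a pure-death process at θ and θ + ε, once with shared seeds (CRN) and once with independent seeds, and compare the variance of the differences.

```python
import numpy as np

def death_path(x0, theta, T, seed):
    """Endpoint of S -> 0 with per-molecule rate theta; the seed fixes the
    underlying stream of uniforms, so equal seeds = common random numbers."""
    rng = np.random.default_rng(seed)
    x, t = x0, 0.0
    while x > 0:
        t += rng.exponential(1.0 / (theta * x))
        if t > T:
            break
        x -= 1
    return x

theta, eps, T, N = 0.5, 0.05, 3.0, 400
# CRN: the same seed (same exponentials) drives both parameter values
crn = [death_path(50, theta + eps, T, s) - death_path(50, theta, T, s)
       for s in range(N)]
# independent: fresh randomness for each parameter value
ind = [death_path(50, theta + eps, T, s) - death_path(50, theta, T, N + s)
       for s in range(N)]
# np.var(crn) should be far below np.var(ind)
```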
Coupling
Using common random numbers in the previous fashion is a way of coupling the two processes together. Is there a natural way to couple processes using the random time change? Can we couple the Poisson processes?
Answer: yes. Multiple ways. I will show one which works very well.
How do we generate processes simultaneously
Suppose I want to generate:
- A Poisson process with intensity 13.1.
- A Poisson process with intensity 13.
We could let Y_1 and Y_2 be independent, unit-rate Poisson processes, and set
Z_13.1(t) = Y_1(13.1 t), Z_13(t) = Y_2(13 t).
Using this representation, these processes are independent and, hence, not coupled. The variance of the difference is large:
Var(Z_13.1(t) - Z_13(t)) = Var(Y_1(13.1 t)) + Var(Y_2(13 t)) = 26.1 t.
How do we generate processes simultaneously
Suppose I want to generate:
- A Poisson process with intensity 13.1.
- A Poisson process with intensity 13.
We could instead let Y_1 and Y_2 be independent unit-rate Poisson processes, and set
Z_13.1(t) = Y_1(13 t) + Y_2(0.1 t), Z_13(t) = Y_1(13 t).
The variance of the difference is much smaller:
Var(Z_13.1(t) - Z_13(t)) = Var(Y_2(0.1 t)) = 0.1 t.
Here we use the fact that the sum of independent homogeneous Poisson processes is again a Poisson process.
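A quick numerical check of the two constructions (not from the talk; sample sizes and the seed are illustrative). At a fixed time t = T the counts are Poisson random variables, so we can compare the variance of the difference under each construction directly.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 10.0, 2000

# Independent construction: Z_13.1(T) = Y1(13.1 T), Z_13(T) = Y2(13 T)
diff_indep = rng.poisson(13.1 * T, n) - rng.poisson(13.0 * T, n)

# Coupled construction: Z_13.1(T) = Y1(13 T) + Y2(0.1 T), Z_13(T) = Y1(13 T),
# so the shared Y1 cancels and the difference is just the thin stream Y2(0.1 T)
diff_coupled = rng.poisson(0.1 * T, n)

# theory: Var(independent difference) = 26.1 T = 261,
#         Var(coupled difference)     =  0.1 T = 1
```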
How do we generate processes simultaneously
More generally, suppose we want
1. a non-homogeneous Poisson process with intensity f(t), and
2. a non-homogeneous Poisson process with intensity g(t).
We can let Y_1, Y_2, and Y_3 be independent, unit-rate Poisson processes and define
Z_f(t) = Y_1(∫_0^t f(s) ∧ g(s) ds) + Y_2(∫_0^t f(s) - (f(s) ∧ g(s)) ds),
Z_g(t) = Y_1(∫_0^t f(s) ∧ g(s) ds) + Y_3(∫_0^t g(s) - (f(s) ∧ g(s)) ds),
where a ∧ b = min(a, b), and where we are using that, for example,
Y_1(∫_0^t f(s) ∧ g(s) ds) + Y_2(∫_0^t f(s) - (f(s) ∧ g(s)) ds)
has the same distribution as Y(∫_0^t f(s) ds) for a unit-rate Poisson process Y.
Parameter sensitivities. Couple the processes:
X^{θ+ε}(t) = X^{θ+ε}(0) + Σ_k Y_{k,1}(∫_0^t λ_k^{θ+ε}(X^{θ+ε}(s)) ∧ λ_k^θ(X^θ(s)) ds) ζ_k
  + Σ_k Y_{k,2}(∫_0^t λ_k^{θ+ε}(X^{θ+ε}(s)) - λ_k^{θ+ε}(X^{θ+ε}(s)) ∧ λ_k^θ(X^θ(s)) ds) ζ_k,
X^θ(t) = X^θ(0) + Σ_k Y_{k,1}(∫_0^t λ_k^{θ+ε}(X^{θ+ε}(s)) ∧ λ_k^θ(X^θ(s)) ds) ζ_k
  + Σ_k Y_{k,3}(∫_0^t λ_k^θ(X^θ(s)) - λ_k^{θ+ε}(X^{θ+ε}(s)) ∧ λ_k^θ(X^θ(s)) ds) ζ_k.
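A small sketch of this coupling on a toy network (not from the talk; the model, function name, and parameter values are illustrative assumptions). For the linear birth-death network ∅ -> S (rate θ), S -> ∅ (rate x), with X(0) = 0 we have E X^θ(T) = θ(1 - e^{-T}), so the exact sensitivity is 1 - e^{-T}. Because the intensities are piecewise constant, the coupled pair can be simulated as a single CTMC whose channels are the shared (minimum) clock and the three leftover clocks.

```python
import numpy as np

def coupled_pair(theta, eps, T, rng):
    """One sample of (X^{theta+eps}(T), X^theta(T)) for 0 -> S (rate theta),
    S -> 0 (rate x), coupled through a shared clock running at the minimum
    of the two intensities, as in the coupling displayed above."""
    x1 = x2 = 0                     # x1 uses theta + eps, x2 uses theta
    t = 0.0
    while True:
        rates = [theta,             # shared birth clock: both +1
                 eps,               # leftover birth: x1 only
                 min(x1, x2),       # shared death clock: both -1
                 x1 - min(x1, x2),  # leftover death: x1 only
                 x2 - min(x1, x2)]  # leftover death: x2 only
        total = sum(rates)
        t += rng.exponential(1.0 / total)
        if t > T:
            return x1, x2
        u = rng.random() * total    # pick a channel proportionally to its rate
        k, acc = 0, rates[0]
        while u > acc:
            k += 1
            acc += rates[k]
        if k == 0: x1 += 1; x2 += 1
        elif k == 1: x1 += 1
        elif k == 2: x1 -= 1; x2 -= 1
        elif k == 3: x1 -= 1
        else: x2 -= 1

theta, eps, T = 10.0, 0.1, 3.0
rng = np.random.default_rng(2)
diffs = [(lambda p: (p[0] - p[1]) / eps)(coupled_pair(theta, eps, T, rng))
         for _ in range(1000)]
est = float(np.mean(diffs))
# exact sensitivity: d/dtheta E X^theta(T) = 1 - exp(-T) ≈ 0.95
```

Note the shared clocks force the two paths to move together, so the difference process (and hence the estimator variance) stays small.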
Parameter sensitivities.
Theorem.^3 Suppose (X^{θ+ε}, X^θ) satisfy the coupling. Then, for any T > 0 there is a C_{T,f} > 0 for which
E sup_{t ≤ T} (f(θ + ε, X^{θ+ε}(t)) - f(θ, X^θ(t)))^2 ≤ C_{T,f} ε.
This lowers the variance of the estimator from O(N^-1 ε^-2) to O(N^-1 ε^-1): lowered by an order of magnitude (in ε).
Point: a deeper mathematical understanding led to a better computational method.
3 David F. Anderson, An Efficient Finite Difference Method for Parameter Sensitivities of Continuous Time Markov Chains, SIAM Journal on Numerical Analysis, Vol. 50, No. 5, 2012.
Analysis
Theorem. Suppose (X^{θ+ε}, X^θ) satisfy the coupling. Then, for any T > 0 there is a C_{T,f} > 0 for which
E sup_{t ≤ T} (f(θ + ε, X^{θ+ε}(t)) - f(θ, X^θ(t)))^2 ≤ C_{T,f} ε.
Key observation of the proof:
X^{θ+ε}(t) - X^θ(t) = M_{θ,ε}(t) + ∫_0^t (F_{θ+ε}(X^{θ+ε}(s)) - F_θ(X^θ(s))) ds,
where most of the jumps have vanished. Now work on the martingale part and the absolutely continuous part.
Example: gene transcription and translation
G -> G + M (rate 2), M -> M + P (rate 10), M -> ∅ (rate θ), P -> ∅ (rate 1).
Want d/dθ E[X^θ_protein(30)] at θ = 1/4.
Method | R | 95% CI | # updates | CPU time
Likelihood | 689,600 | -312.1 ± 6.0 | 2.9 × 10^9 | 3,056.6 s
CMC | 246,000 | -319.3 ± 6.0 | 2.1 × 10^9 | 2,364.8 s
CRP/CRN | 25,980 | -316.7 ± 6.0 | 2.2 × 10^8 | 270.9 s
CFD | 4,580 | -319.9 ± 6.0 | 2.0 × 10^7 | 29.2 s
Table: Each finite difference method used ε = 1/4. The exact value is -318.73.
Comparison from 5,000 samples each with ε = 1/4
[Figure: estimator variance as a function of time for the coupled finite difference, common reaction path, crude Monte Carlo, and Girsanov transformation estimators. The coupled finite difference estimator has the smallest variance, the Girsanov transformation the largest.]
Example: genetic toggle switch
∅ <-> X_1 (rates λ_1(X), λ_2(X)), ∅ <-> X_2 (rates λ_3(X), λ_4(X)),   (1)
with intensity functions
λ_1(X(t)) = α_1/(1 + X_2(t)^β), λ_2(X(t)) = X_1(t), λ_3(X(t)) = α_2/(1 + X_1(t)^γ), λ_4(X(t)) = X_2(t),
and parameter choice
α_1 = 50, α_2 = 16, β = 2.5, γ = 1.
Begin the process with initial condition [0, 0] and consider the sensitivity of X_1 as a function of α_1.
Example: genetic toggle switch
[Figure: time plot of the variance of the coupled finite difference estimator versus the common reaction path estimator for the model (1), out to time T = 40. Each plot was generated using 1,000 sample paths. A perturbation of ε = 1/10 was used.]
Are these representations only good for simulation? LLN and ODEs. Tom Kurtz, 1970s
Suppose X^N(t) = O(N). Denote concentrations via
X̄^N(t) = N^-1 X^N(t) = O(1).
Under mild assumptions, we have
λ_k(X^N(t)) = λ_k(N X̄^N(t)) ≈ N λ̄_k(X̄^N(t)).
Then
X^N(t) = X^N(0) + Σ_k Y_k(∫_0^t λ_k(X^N(s)) ds) ζ_k
becomes
X̄^N(t) = X̄^N(0) + Σ_k N^-1 Y_k(N ∫_0^t λ̄_k(X̄^N(s)) ds) ζ_k.
Use that
lim_{N->∞} sup_{u ≤ U} |N^-1 Y(Nu) - u| = 0
to find that X̄^N(t) converges to the solution of the classical ODE
x(t) = x(0) + Σ_k ∫_0^t λ̄_k(x(s)) ds ζ_k ≡ x(0) + ∫_0^t F(x(s)) ds.
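A toy check of this law of large numbers (not from the talk; the network, function name, and parameter values are illustrative). Take the birth-death network with birth rate Nκ and per-capita death rate 1, so λ̄_birth(x̄) = κ, λ̄_death(x̄) = x̄, and the limiting ODE is x' = κ - x with solution κ(1 - e^{-t}) from x(0) = 0.

```python
import numpy as np

def scaled_birth_death(N, kappa, T, rng):
    """CTMC with birth rate N*kappa and death rate x (per-capita rate 1);
    returns the concentration X^N(T) / N."""
    x, t = 0, 0.0
    while True:
        birth, death = N * kappa, float(x)
        total = birth + death
        t += rng.exponential(1.0 / total)
        if t > T:
            return x / N
        x += 1 if rng.random() < birth / total else -1

rng = np.random.default_rng(3)
xbar = scaled_birth_death(N=2000, kappa=5.0, T=2.0, rng=rng)
ode = 5.0 * (1 - np.exp(-2.0))   # solution of x' = kappa - x, x(0) = 0
# xbar should sit within O(1/sqrt(N)) of the ODE value
```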
Diffusions? Argument due to Tom Kurtz
Suppose X^N(t) = O(N). Denote concentrations via X̄^N(t) = N^-1 X^N(t) = O(1). As before,
X̄^N(t) = X̄^N(0) + Σ_k N^-1 Y_k(N ∫_0^t λ̄_k(X̄^N(s)) ds) ζ_k.
Use that
N^{-1/2} (Y_k(Nu) - Nu) ≈ W_k(u),
a Brownian motion, to find that X̄^N(t) is well approximated by the chemical Langevin process
X̂(t) = X̂(0) + Σ_k ζ_k ∫_0^t λ̄_k(X̂(s)) ds + N^{-1/2} Σ_k ζ_k ∫_0^t sqrt(λ̄_k(X̂(s))) dW_k(s),
or
dX̂(t) = F(X̂(t)) dt + N^{-1/2} Σ_k ζ_k sqrt(λ̄_k(X̂(t))) dW_k(t).
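A sketch of simulating the chemical Langevin approximation by Euler-Maruyama (not from the talk; the network, function name, step size, and parameters are illustrative). For the birth-death network with λ̄_birth = κ and λ̄_death = x, the Langevin SDE is dX = (κ - X)dt + N^{-1/2}(√κ dW_1 - √X dW_2).

```python
import numpy as np

def langevin_birth_death(N, kappa, T, dt=1e-3, seed=4):
    """Euler-Maruyama for dX = (kappa - X) dt
       + N^{-1/2} (sqrt(kappa) dW1 - sqrt(X) dW2), X(0) = 0."""
    rng = np.random.default_rng(seed)
    x = 0.0
    for _ in range(int(T / dt)):
        dw1, dw2 = rng.normal(0.0, np.sqrt(dt), 2)
        x += (kappa - x) * dt \
             + (np.sqrt(kappa) * dw1 - np.sqrt(max(x, 0.0)) * dw2) / np.sqrt(N)
        x = max(x, 0.0)              # guard: keep the sqrt argument nonnegative
    return x

vals = [langevin_birth_death(N=1000, kappa=5.0, T=2.0, seed=s) for s in range(200)]
ode = 5.0 * (1 - np.exp(-2.0))
# paths fluctuate about the ODE value with O(N^{-1/2}) noise
```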
Central limit theorem - Kurtz / Van Kampen
Suppose X^N(t) = O(N). Denote concentrations via X̄^N(t) = N^-1 X^N(t) = O(1), and let x(t) be the ODE solution. Let
U^N(t) = sqrt(N) (X̄^N(t) - x(t)) = (X^N(t) - N x(t))/sqrt(N).
Then, with Ỹ_k(u) = Y_k(u) - u the centered processes,
U^N(t) = Σ_k ζ_k N^{-1/2} Ỹ_k(N ∫_0^t λ̄_k(X̄^N(s)) ds) + sqrt(N) ∫_0^t (F(X̄^N(s)) - F(x(s))) ds
≈ Σ_k ζ_k N^{-1/2} Ỹ_k(N ∫_0^t λ̄_k(X̄^N(s)) ds) + ∫_0^t DF(x(s)) U^N(s) ds.
Use the martingale central limit theorem to show that N^{-1/2} Ỹ_k(N ·) => W_k(·), and get U^N => U, where
U(t) = Σ_k ζ_k W_k(∫_0^t λ̄_k(x(s)) ds) + ∫_0^t DF(x(s)) U(s) ds.
Thanks!
References:
1. David F. Anderson, An Efficient Finite Difference Method for Parameter Sensitivities of Continuous Time Markov Chains, SIAM Journal on Numerical Analysis, Vol. 50, No. 5, 2012.
2. David F. Anderson and Thomas G. Kurtz, Continuous Time Markov Chain Models for Chemical Reaction Networks, in Design and Analysis of Biomolecular Circuits, Springer, 2011, Eds. Heinz Koeppl et al.
Funding: NSF-DMS-1009275.