Persistence and Stationary Distributions of Biochemical Reaction Networks

David F. Anderson, Department of Mathematics, University of Wisconsin - Madison

Discrete Models in Systems Biology, SAMSI, December 3rd, 2008

Overview

1. Biochemical systems can be large and are typically quite complex.
2. There are potentially millions of such systems that could warrant study at one time or another.
3. The dynamics of these systems can be modeled stochastically or deterministically (or a combination of both).
4. Rate constants (key system parameter values) are oftentimes unknown and even vary from cell to cell.
5. Network structure induces equations governing dynamics.
6. Want to infer qualitative properties of the dynamics from the network structure without the need for details of system parameter values.
7. (If time) Simulation methods for stochastic systems.

Examples: Gene Regulation and Enzyme Kinetics [1, 2]

[Figure: gene regulatory network within the plant A. thaliana. Pointed arrows reflect an activation relationship, whereas blunt arrows reflect a repression relationship between genes.]

Enzyme kinetics: $S + E_1 \rightleftharpoons SE_1 \to P + E_1$, $E_1 \rightleftharpoons A + E_2$, $\emptyset \rightleftharpoons E_2$.

1. Shahrezaei and Swain, Curr. Opinion in Biotech., 2008.
2. Gambin et al., In Silico, 2006.

Example: Metabolism

Chemical Reaction Networks

Standard notation for chemical reactions: $A + B \to C$ is interpreted as "a molecule of $A$ combines with a molecule of $B$ to give a molecule of $C$."

We make the association:
$$A + B = \nu = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \qquad C = \nu' = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$

Note: each instance of the reaction $A + B \to C$, or $\nu \to \nu'$, changes the state of the system by the vector $\nu' - \nu$.

Chemical Reaction Networks: Species, Complexes, and Reactions

A chemical reaction network consists of:

1. $N \ge 1$ chemical species, $\{S_1, \dots, S_N\} = \mathcal{S}$, undergoing a series of chemical reactions.
2. Vectors $\nu_k, \nu'_k \in \mathbb{Z}^N_{\ge 0}$, representing the numbers of molecules of each species consumed and created in the $k$th reaction, respectively. Each vector may appear as both a source ($\nu$) and a product ($\nu'$). We associate with each $\nu_k$ (and $\nu'_k$) a linear combination of the species in which the coefficient of $S_i$ is $\nu_{ki}$. For example, if $\nu_k = (2, 0, 1)^T$, then $\nu_k$ is associated with $2S_1 + S_3$. Under this association, $\{\nu_k, \nu'_k\} = \mathcal{C}$ are the complexes of the network.
3. $\mathcal{R} = \{\nu_k \to \nu'_k\}$ are the reactions.

Definition. The triple $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ is called a chemical reaction network.

Reaction Diagrams

To each reaction network, $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$, there is a unique directed graph, called a reaction diagram, constructed in the following manner:

1. The nodes of the graph are the complexes, $\mathcal{C}$.
2. A directed edge $(\nu, \nu')$ exists if and only if $\nu \to \nu' \in \mathcal{R}$.

Each connected component of the resulting graph is termed a linkage class of the graph.

Definition. The chemical reaction network is said to be weakly reversible if each linkage class of the corresponding reaction diagram is strongly connected. A network is said to be reversible if $\nu \to \nu' \in \mathcal{R} \implies \nu' \to \nu \in \mathcal{R}$.

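Chemical Reaction Networks

Example. Consider the reaction network with the following diagram:
$$S + E_1 \rightleftharpoons SE_1 \to P + E_1, \qquad E_1 \rightleftharpoons A + E_2, \qquad \emptyset \rightleftharpoons E_2.$$

1. $\mathcal{S} = \{S, E_1, SE_1, P, A, E_2\}$.
2. $\mathcal{C} = \{S + E_1,\ SE_1,\ P + E_1,\ E_1,\ A + E_2,\ \emptyset,\ E_2\}$.
3. $\mathcal{R} = \{S + E_1 \to SE_1,\ SE_1 \to S + E_1,\ SE_1 \to P + E_1,\ E_1 \to A + E_2,\ A + E_2 \to E_1,\ \emptyset \to E_2,\ E_2 \to \emptyset\}$.

Also, this system is not weakly reversible: there is no directed path from $P + E_1$ back to $SE_1$.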
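The weak-reversibility check for this example can be sketched in a few lines of Python. This is only an illustration: the string labels for complexes (with "0" standing for the empty complex) and the function names are assumptions, not notation from the talk. It uses the fact that every linkage class is strongly connected exactly when each reaction lies on a directed cycle.

```python
from collections import defaultdict

def reachable(start, adjacency):
    """Return the set of complexes reachable from `start` along directed edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_weakly_reversible(reactions):
    """reactions: list of (source_label, product_label) pairs (hypothetical encoding)."""
    adjacency = defaultdict(set)
    for src, dst in reactions:
        adjacency[src].add(dst)
    # Weakly reversible iff every edge src -> dst admits a directed path dst -> ... -> src.
    return all(src in reachable(dst, adjacency) for src, dst in reactions)

# Enzyme example from the slide; "0" denotes the empty complex.
enzyme = [
    ("S+E1", "SE1"), ("SE1", "S+E1"), ("SE1", "P+E1"),
    ("E1", "A+E2"), ("A+E2", "E1"),
    ("0", "E2"), ("E2", "0"),
]
enzyme_is_wr = is_weakly_reversible(enzyme)

# Dropping the irreversible step SE1 -> P+E1 leaves a reversible network.
reversible_part = [r for r in enzyme if r != ("SE1", "P+E1")]
reversible_is_wr = is_weakly_reversible(reversible_part)
```

Here `enzyme_is_wr` comes out `False` because of the single irreversible step, while the remaining reversible reactions alone give `True`.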

Deterministic Dynamics: Mass Action Kinetics

We will assume that when the concentration vector of the system is $x \in \mathbb{R}^N_{\ge 0}$, the rate of the reaction $\nu_k \to \nu'_k$ is
$$R_k(x) = \kappa_k\, x_1^{\nu_{k1}} x_2^{\nu_{k2}} \cdots x_N^{\nu_{kN}} = \kappa_k x^{\nu_k}.$$

Example: If $S_1 + 2S_2 \to$ anything, then $R_k(x) = \kappa_k x_1 x_2^2$.

Equations governing dynamics are the nonlinear ODEs:
$$\dot{x}(t) = \sum_k R_k(x(t))\,(\nu'_k - \nu_k) = \sum_k \kappa_k\, x(t)^{\nu_k} (\nu'_k - \nu_k).$$

A system with kinetics given by $R_k(x) = \kappa_k x^{\nu_k}$ is said to have deterministic mass action kinetics.

Note: up to the parameter values $\kappa_k$, the equations arise solely from the network structure.
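As a small illustration of how the right-hand side of these ODEs is assembled from the network, here is a NumPy sketch. The function name, the row-per-reaction array layout, and the choice of all rate constants equal to 1 are assumptions made for the example; the network is 2A ⇌ A + B, B ⇌ C with species ordered (A, B, C).

```python
import numpy as np

def mass_action_rhs(x, kappa, nu, nu_prime):
    """Evaluate sum_k kappa_k * x^{nu_k} * (nu'_k - nu_k).

    nu, nu_prime: arrays of shape (K, N), one row per reaction.
    """
    x = np.asarray(x, dtype=float)
    nu = np.asarray(nu)
    nu_prime = np.asarray(nu_prime)
    rates = np.asarray(kappa, dtype=float) * np.prod(x ** nu, axis=1)  # R_k(x)
    return rates @ (nu_prime - nu)

# Reactions: 2A -> A+B, A+B -> 2A, B -> C, C -> B (rate constants assumed to be 1).
nu = [[2, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
nu_prime = [[1, 1, 0], [2, 0, 0], [0, 0, 1], [0, 1, 0]]
kappa = [1.0, 1.0, 1.0, 1.0]

at_equilibrium = mass_action_rhs([1.0, 1.0, 1.0], kappa, nu, nu_prime)
away_from_equilibrium = mass_action_rhs([2.0, 1.0, 3.0], kappa, nu, nu_prime)
```

With these assumed constants the point (1, 1, 1) is an equilibrium, and at (2, 1, 3) the components of the vector field sum to zero, reflecting conservation of the total A + B + C.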

ω-limit points, Persistence, and a Conjecture

Definition. Let $x(t)$ be a trajectory with $x(0) \in \mathbb{R}^N_{>0}$. The set of ω-limit points for this trajectory is:
$$\omega(x(0)) = \{x \in \mathbb{R}^N_{\ge 0} \mid x(t_n) \to x \text{ for some sequence } t_n \to \infty\}.$$

Definition. A system is persistent if $\omega(x(0)) \cap \partial\mathbb{R}^N_{\ge 0} = \emptyset$ for all $x(0) \in \mathbb{R}^N_{>0}$.

Conjecture (Feinberg - 1987): Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a weakly reversible network with deterministic mass action kinetics. Then this system is persistent.

Preliminary Observations and Definitions

1. Any solution $x(t)$ satisfies
$$x(t) - x(0) = \sum_k \left( \int_0^t R_k(x(s))\, ds \right) (\nu'_k - \nu_k) \in \operatorname{span}\{\nu'_k - \nu_k\}_k.$$
Definition. $S = \operatorname{span}\{\nu'_k - \nu_k\}_k$ is the stoichiometric subspace of the network. Let $\dim(S) = s$. We see that for all $t \ge 0$:
$$x(t) \in x(0) + S = \{z \in \mathbb{R}^N \mid z = x(0) + v \text{ for some } v \in S\}.$$

2. Both $\mathbb{R}^N_{\ge 0}$ and $\mathbb{R}^N_{>0}$ are forward invariant.

Definition. We let
$$P_{x(0)} = P = (x(0) + S) \cap \mathbb{R}^N_{\ge 0}$$
be the positive stoichiometric compatibility class associated with $x(0)$.

The Faces of P: Boundary Points

The set $P = (x(0) + S) \cap \mathbb{R}^N_{\ge 0}$ is a polyhedron of dimension $\dim(S) = s$.

For $W \subset \mathcal{S}$, let
$$Z_W = \{x \in \mathbb{R}^N \mid x_i = 0 \text{ if } S_i \in W\}$$
be the linear space whose zero-set is $W$.

For each face of $P$, call it $F$, there is a $W \subset \mathcal{S}$ such that $F = F_W = P \cap Z_W$.

The dimension of each face satisfies $0 \le \dim(F_W) \le s$. A facet is a face such that $\dim(F_W) = \dim(P) - 1 = s - 1$. A vertex is a face such that $\dim(F_W) = 0$.

Note: $F_W = P \cap Z_W = \emptyset$ if $P$ does not intersect the face of the positive orthant associated with $W$. Clearly, no face of $P$ corresponds with such a $W$. In this case we say that $Z_W$ is stoichiometrically unattainable.

Example

Consider a system with reaction diagram:
$$2A \rightleftharpoons A + B, \qquad B \rightleftharpoons C.$$

$S \subset \mathbb{R}^3$ is the plane spanned by $(-1, 1, 0)$ and $(0, -1, 1)$. For $T > 0$, the positive compatibility classes are two-dimensional triangles
$$P = \{(x_a, x_b, x_c) \in \mathbb{R}^3_{\ge 0} \mid x_a + x_b + x_c = T\}.$$

The three facets of $P$ are the one-dimensional segments $F_{\{A\}}, F_{\{B\}}, F_{\{C\}}$. The three vertices are the three points $F_{\{A,B\}}, F_{\{A,C\}}, F_{\{B,C\}}$. The set $Z_{\{A,B,C\}} = \{\vec{0}\}$ is stoichiometrically unattainable.

Semilocking Sets

Definition. A nonempty subset, $W$, of the set of species, $\mathcal{S}$, is called a semilocking set if for each reaction in which there is an element of $W$ in the product complex, there is an element of $W$ in the reactant complex.

Example: $X_1 + X_2 \to X_3$. If $X_3$ is in a semilocking set, then so too must $X_1$ and/or $X_2$.

Example: $2A \rightleftharpoons A + B$, $2B \rightleftharpoons A + C$, $D \rightleftharpoons E$. Semilocking sets are $W_1 = \{A, B\}$, $W_2 = \{D, E\}$, $W_3 = \{A, B, C\}$, $W_4 = \{A, B, D, E\}$.

Intuition: If the concentration of each element of $W$ is zero, then any flux which affects the concentrations of the species of $W$ is zero, and the concentrations of the elements of $W$ are locked at zero for all time.

Semilocking Sets

Theorem (Feinberg - 1987, Angeli, De Leenheer, Sontag - 2007, A. - 2008). Let $x(0) \in \mathbb{R}^N_{>0}$, and let $P = (x(0) + S) \cap \mathbb{R}^N_{\ge 0}$ denote the corresponding positive class. If there exists an ω-limit point, $z \in \omega(x(0))$, and a subset of the species, $W$, such that $z$ is contained within the interior of the face $F_W$, then $W$ is a semilocking set.

Corollary. Suppose that $F_W = P \cap Z_W = \emptyset$ for each semilocking set $W \subset \mathcal{S}$ (i.e. $Z_W$ is stoichiometrically unattainable). Then the system is persistent.

This corollary has been widely applied to biological examples:

1. Sontag - 2001: Model of T-cell receptor signal transduction.
2. Angeli, De Leenheer, Sontag - 2007: Different enzymatic reactions and cascades of reactions (e.g. MAPK).
3. Gnacadja - 2008: Reversible binding reactions.

Example

Consider again the system with reaction diagram:
$$2A \rightleftharpoons A + B, \qquad B \rightleftharpoons C.$$

$S \subset \mathbb{R}^3$ is the plane spanned by $(-1, 1, 0)$ and $(0, -1, 1)$. For $T > 0$, the positive compatibility classes are two-dimensional triangles
$$P = \{(x_a, x_b, x_c) \in \mathbb{R}^3_{\ge 0} \mid x_a + x_b + x_c = T\}.$$

The only semilocking sets are $W_1 = \{A\}$ and $W_2 = \{A, B, C\}$. $Z_{W_2} = Z_{\{A,B,C\}} = \{\vec{0}\}$ is stoichiometrically unattainable. ω-limit points can only be found on $F_{\{A\}}$.
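The semilocking sets of a small network can be found by brute force directly from the definition. The sketch below does this for the example above; encoding each reaction as a pair of species sets (reactant species, product species) and the function names are illustrative assumptions.

```python
from itertools import combinations

def is_semilocking(W, reactions):
    """W: iterable of species; reactions: list of (reactant_species, product_species) sets."""
    W = set(W)
    if not W:
        return False
    # Every reaction producing a species of W must also consume a species of W.
    return all(reactants & W for reactants, products in reactions if products & W)

def semilocking_sets(species, reactions):
    """Enumerate all semilocking sets by checking every nonempty subset of species."""
    return [
        set(W)
        for size in range(1, len(species) + 1)
        for W in combinations(species, size)
        if is_semilocking(W, reactions)
    ]

# 2A <-> A+B and B <-> C, written as species supports of each complex.
reactions = [
    ({"A"}, {"A", "B"}), ({"A", "B"}, {"A"}),
    ({"B"}, {"C"}), ({"C"}, {"B"}),
]
found = semilocking_sets(["A", "B", "C"], reactions)
```

For this network `found` contains exactly `{A}` and `{A, B, C}`, matching the sets listed on the slide. The enumeration is exponential in the number of species, so it is only meant for small examples.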

ω-limit Points and Facets

Theorem (A., Shiu - 2008). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a weakly reversible network with deterministic mass-action kinetics. Let $x(0) \in \mathbb{R}^N_{>0} \cap P$. Suppose that $z \in \omega(x(0))$ lies in the interior of the face $F_W$ of $P$, for some $W \subset \mathcal{S}$. Then $F_W$ is not a facet of $P$; that is, $\dim(F_W) < \dim(P) - 1$.

In words: No ω-limit points exist within the facets of $P$.

Corollary. Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a weakly reversible network with deterministic mass-action kinetics. Suppose that each semilocking set, $W$, satisfies $\dim(F_W) = \dim(P) - 1$ or $F_W = \emptyset$. Then the system is persistent.

This shows that the example on the previous slide is persistent: $F_{\{A\}}$ is a facet and $Z_{\{A,B,C\}}$ is stoichiometrically unattainable.

Idea of Proof

Proof: Suppose $z \in \omega(x(0)) \cap F_W$ with $\dim(F_W) = \dim(P) - 1$. We will find a contradiction. The proof consists of three parts.

1. Argue that the span of $\{\nu'_k - \nu_k\}$ restricted to $W$ is one-dimensional, say spanned by $v \in \mathbb{R}^W$.
2. Argue that no subset of $W$ can be conserved and that this, combined with the above point, implies that $v$ can be chosen so that $v \in \mathbb{R}^W_{>0}$.
3. Argue that $v \in \mathbb{R}^W_{>0}$ implies there is a neighborhood of $z$ which is repelling.

Application: Complex-Balanced Systems

Definition. An equilibrium $c \in \mathbb{R}^N_{\ge 0}$ is said to be complex-balanced if the following equality holds for each $\eta \in \mathcal{C}$:
$$\sum_{\{k \,:\, \nu_k = \eta\}} \kappa_k c^{\nu_k} = \sum_{\{k \,:\, \nu'_k = \eta\}} \kappa_k c^{\nu_k}.$$

A complex-balanced system is a system that admits a complex-balanced equilibrium.

Some chemists/physicists believe that all chemical systems are detailed-balanced, and hence, complex-balanced.
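The complex-balance condition says that, at the state $c$, the total flux out of each complex equals the total flux into it. A minimal sketch of that check follows; the dictionary encoding, the function name, and the unit rate constants are assumptions chosen for illustration, applied to the network 2A ⇌ A + B, B ⇌ C with species ordered (A, B, C).

```python
import numpy as np

def complex_balance_residuals(c, complexes, reactions):
    """Inflow minus outflow at each complex for the state c.

    complexes: dict mapping a label to its composition vector.
    reactions: list of (source_label, product_label, kappa) triples.
    """
    c = np.asarray(c, dtype=float)
    residual = {label: 0.0 for label in complexes}
    for src, dst, kappa in reactions:
        flux = kappa * np.prod(c ** np.asarray(complexes[src]))  # kappa_k * c^{nu_k}
        residual[src] -= flux  # flux leaving the source complex
        residual[dst] += flux  # flux arriving at the product complex
    return residual

complexes = {"2A": (2, 0, 0), "A+B": (1, 1, 0), "B": (0, 1, 0), "C": (0, 0, 1)}
# Rate constants are assumed to be 1 for every reaction.
reactions = [("2A", "A+B", 1.0), ("A+B", "2A", 1.0), ("B", "C", 1.0), ("C", "B", 1.0)]

balanced = complex_balance_residuals((1.0, 1.0, 1.0), complexes, reactions)
unbalanced = complex_balance_residuals((2.0, 1.0, 1.0), complexes, reactions)
```

All residuals vanish at $c = (1, 1, 1)$, so under these assumed constants that point is a complex-balanced equilibrium, whereas at $(2, 1, 1)$ the complex 2A loses more flux than it receives.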

Deficiency and complex-balancing

Definition. The deficiency of a network is $\delta = n - l - s$, where $n$ is the number of complexes, $l$ is the number of linkage classes, and $s = \dim(S) = \dim(P)$.

Theorem (Feinberg - 1979). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be weakly reversible and satisfy $\delta = 0$. Then, independent of the choice of rate constants, the system with deterministic mass action kinetics is complex-balanced.

A partial converse holds:

Theorem (Feinberg - 1979, Gunawardena - 2003). A complex-balanced system is weakly reversible.

Behavior of Complex-Balanced Systems

Theorem (Horn, Jackson, Feinberg - 1972). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a complex-balanced system. Then, within the interior of each positive class, $P$, there is precisely one equilibrium value, $c$, and that equilibrium is:

1. Complex-balanced.
2. Locally asymptotically stable relative to its compatibility class.

Global Attractor Conjecture (GAC): For a complex-balanced system, each of the equilibria $c \in \mathbb{R}^N_{>0}$ is globally asymptotically stable relative to the interior of its compatibility class $P$.

Example

Consider the system
$$2A \rightleftharpoons A + B \rightleftharpoons B + C.$$

There are 3 complexes, 1 linkage class, and $s = 2$, so $\delta = 3 - 1 - 2 = 0$.

The stoichiometric subspace is of dimension two, and the quantity $A + B + C$ is conserved. Thus, each stoichiometric compatibility class is a triangle.

Within each positive stoichiometric compatibility class, there is a unique equilibrium value that is locally asymptotically stable relative to its compatibility class.
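The count $\delta = n - l - s$ can be reproduced mechanically. In this sketch (names and data layout are illustrative assumptions) the linkage classes are found with a small union-find on the undirected reaction graph and $s$ is the rank of the reaction vectors. Since none of $n$, $l$, $s$ depends on arrow direction, each pair of connected complexes is listed only once.

```python
import numpy as np

def deficiency(complexes, reactions):
    """complexes: dict label -> composition vector; reactions: list of (src, dst) labels."""
    parent = {label: label for label in complexes}

    def find(label):
        # Follow parent pointers to the representative of the linkage class.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for src, dst in reactions:
        parent[find(src)] = find(dst)

    n = len(complexes)
    l = len({find(label) for label in complexes})
    vectors = np.array([np.subtract(complexes[dst], complexes[src]) for src, dst in reactions])
    s = int(np.linalg.matrix_rank(vectors))
    return n - l - s

# Species ordered (A, B, C).
chain = deficiency(
    {"2A": (2, 0, 0), "A+B": (1, 1, 0), "B+C": (0, 1, 1)},
    [("2A", "A+B"), ("A+B", "B+C")],
)
two_classes = deficiency(
    {"2A": (2, 0, 0), "A+B": (1, 1, 0), "B": (0, 1, 0), "C": (0, 0, 1)},
    [("2A", "A+B"), ("B", "C")],
)
```

Both networks come out with deficiency zero: 3 - 1 - 2 for the chain and 4 - 2 - 2 for the network with two linkage classes.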

Partial Results Towards the GAC

It can be shown that persistence guarantees global asymptotic stability. $\implies$ Semilocking sets and ω-limit points are of interest.

There is more structure in complex-balanced systems, so more is already known:

Theorem (A. - 2008, Craciun, Dickenstein, Shiu, Sturmfels - 2008). No vertex of a positive class $P$ can be an ω-limit point.

Theorem (Craciun, Dickenstein, Shiu, Sturmfels - 2008). If a system is detailed-balanced, $P$ is two-dimensional, and the network is conservative, then the GAC holds.

Can generalize this:

Theorem (A., Shiu - 2008). If a system is complex-balanced and $P$ is two-dimensional, then the GAC holds.

Towards the GAC

Most general statement about the GAC to date:

Theorem (A., Shiu - 2008). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a complex-balanced system. Suppose that for each semilocking set $W \subset \mathcal{S}$, one of the following holds:

1. $F_W = Z_W \cap P = \emptyset$ ($Z_W$ is stoichiometrically unattainable).
2. $\dim(F_W) = 0$ ($F_W$ is a vertex).
3. $\dim(F_W) = \dim(P) - 1$ ($F_W$ is a facet).

Then the GAC holds for this system.

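Example

Consider either of the systems
$$2A \rightleftharpoons A + B \rightleftharpoons B + C \qquad \text{or} \qquad 2A \rightleftharpoons A + B, \quad B \rightleftharpoons C.$$

Each has $\delta = 0$ and is weakly reversible $\implies$ they are complex-balanced systems.

They have the same stoichiometric compatibility classes $P$: $\dim(P) = 2$, so the GAC holds.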

Dynamics: A Stochastic Model

1. Typically used when abundances of molecules are low.
2. Treats the system as a continuous time Markov chain with state $X \in \mathbb{Z}^N_{\ge 0}$ representing the abundance of each species present and with each reaction modeled as a possible transition for the state.
3. If reaction $k$ occurs at time $t$, then $X(t) = X(t-) + \nu'_k - \nu_k$.
4. The $k$th reaction has an associated intensity function, $\lambda_k(x)$, such that
$$P\big(X(t + \Delta t) = X(t) + \nu'_k - \nu_k \mid X(t)\big) = \lambda_k(X(t))\,\Delta t + o(\Delta t).$$
5. Kolmogorov's forward equation (Chemical Master Equation)
$$\frac{d}{dt} P(x, t) = \sum_k \lambda_k(x - \nu'_k + \nu_k)\, P(x - \nu'_k + \nu_k, t) - \sum_k \lambda_k(x)\, P(x, t)$$
describes the time evolution of the distribution of the state of the system.

Stochastic Mass Action Kinetics

One standard intensity function is stochastic mass-action kinetics:
$$\lambda_k(x) = \kappa_k \prod_{i=1}^N \nu_{ki}! \binom{x_i}{\nu_{ki}} = \kappa_k \prod_{i=1}^N \frac{x_i!}{(x_i - \nu_{ki})!}.$$

Example: If $S_1 + S_2 \to$ anything, then $\lambda_k(x) = \kappa_k x_1 x_2$.

Example: If $S_1 + 2S_2 \to$ anything, then $\lambda_k(x) = \kappa_k x_1 x_2 (x_2 - 1)$.

Idea: the rate is proportional to the number of distinct subsets of the molecules present that can form the inputs for the reaction. Intuitively, this assumption reflects the idea that the system is well-stirred in the sense that all molecules are equally likely to be at any location at any time.
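The intensity formula translates directly into code. The following sketch (function name and argument layout are assumptions) uses the binomial form, which automatically returns zero when there are too few molecules for the reaction to fire.

```python
from math import comb, factorial, prod

def mass_action_intensity(x, kappa, nu):
    """kappa * prod_i nu_i! * C(x_i, nu_i) for integer state x and source complex nu."""
    return kappa * prod(factorial(n) * comb(xi, n) for xi, n in zip(x, nu))

# S1 + S2 -> anything: kappa * x1 * x2
bimolecular = mass_action_intensity((3, 4), 1.0, (1, 1))

# S1 + 2 S2 -> anything: kappa * x1 * x2 * (x2 - 1)
trimolecular = mass_action_intensity((3, 4), 2.0, (1, 2))

# Only one S2 molecule present, so the second reaction cannot fire.
blocked = mass_action_intensity((3, 1), 2.0, (1, 2))
```

With the state (3, 4) these evaluate to 3·4 = 12 and 2·3·4·3 = 72, and the blocked case gives 0.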

Qualitative Properties: Stochastically Modeled Systems

Recall that stochastically modeled systems are continuous time Markov chains that satisfy Kolmogorov's forward equation
$$\frac{d}{dt} P(x, t) = \sum_k \lambda_k(x - \nu'_k + \nu_k)\, P(x - \nu'_k + \nu_k, t) - \sum_k \lambda_k(x)\, P(x, t).$$

We see that a stationary distribution, $\pi(x)$, must satisfy
$$\sum_k \pi(x - \nu'_k + \nu_k)\, \lambda_k(x - \nu'_k + \nu_k) = \pi(x) \sum_k \lambda_k(x)$$
for all $x$ in some closed, irreducible subset of the state space.

Stochastic Version of Complex-Balanced Theorem

Theorem (Anderson, Craciun, Kurtz - 2008). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a stochastically modeled chemical reaction network with rate constants $\kappa_k$. Suppose that

1. the associated deterministic mass-action system with rate constants $\kappa_k$ is complex-balanced with equilibrium $c \in \mathbb{R}^N_{>0}$.

Then for any closed, irreducible communicating equivalence class, $\Gamma$, the stochastic system has a product-form stationary distribution
$$\pi(x) = M \frac{c^x}{x!} = M \prod_{i=1}^N \frac{c_i^{x_i}}{x_i!}, \qquad x \in \Gamma,$$
where $M$ is a normalizing constant.

Recall: Feinberg showed that weak reversibility and zero deficiency imply complex-balancing. These are easily checked conditions that depend only upon the network structure.
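A quick numerical sanity check of the product form can be done on the smallest closed example, A ⇌ B with a conserved total. Everything specific here is an assumption made for illustration: the rate constants 2 and 3, the total of 5 molecules, and the helper names. The code evaluates the stationary balance equation (probability inflow minus outflow) at every state of Γ.

```python
from math import factorial

kappa_ab, kappa_ba = 2.0, 3.0   # hypothetical rate constants for A -> B and B -> A
c = (kappa_ba, kappa_ab)        # complex-balanced: kappa_ab * c_A == kappa_ba * c_B
n_total = 5                     # conserved total, so Gamma = {(a, n_total - a)}

def weight(a, b):
    """Unnormalized product form c^x / x!, set to zero outside the state space."""
    if a < 0 or b < 0:
        return 0.0
    return c[0] ** a * c[1] ** b / (factorial(a) * factorial(b))

M = 1.0 / sum(weight(a, n_total - a) for a in range(n_total + 1))

def pi(a, b):
    return M * weight(a, b)

def balance_residual(a, b):
    """Inflow minus outflow of probability at state (a, b) under mass-action intensities."""
    inflow = (pi(a + 1, b - 1) * kappa_ab * (a + 1)      # A -> B fired from (a+1, b-1)
              + pi(a - 1, b + 1) * kappa_ba * (b + 1))   # B -> A fired from (a-1, b+1)
    outflow = pi(a, b) * (kappa_ab * a + kappa_ba * b)
    return inflow - outflow

residuals = [balance_residual(a, n_total - a) for a in range(n_total + 1)]
total_mass = sum(pi(a, n_total - a) for a in range(n_total + 1))
```

The residuals are zero up to rounding, so the normalized product form satisfies the stationary equation on this Γ. Because Γ is a line segment rather than a full orthant, the two species counts are not independent here.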

Example: Open, first-order system

If the network is an open, first-order reaction network, then $S = \mathbb{R}^N$, $\Gamma = \mathbb{Z}^N_{\ge 0}$, and so
$$\pi(x) = e^{-\sum_{i=1}^N c_i} \prod_{i=1}^N \frac{c_i^{x_i}}{x_i!} = \prod_{i=1}^N e^{-c_i} \frac{c_i^{x_i}}{x_i!}, \qquad x \in \mathbb{Z}^N_{\ge 0},$$
where $c \in \mathbb{R}^N_{>0}$ is the equilibrium of the associated (linear) deterministic system.

Thus, when in distributional equilibrium, the species numbers are independent and have Poisson distributions. This is well known. However:

1. Neither the independence nor the Poisson distribution resulted from the fact that the system under consideration was a first-order system.
2. Instead, both facts followed from $\Gamma = \mathbb{Z}^N_{\ge 0}$.

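Example: Enzyme kinetics

Consider the possible model of enzyme kinetics given by
$$E + S \rightleftharpoons ES \rightleftharpoons E + P, \qquad E \rightleftharpoons \emptyset \rightleftharpoons S.$$

It is easy to see that $\Gamma = \mathbb{Z}^4_{\ge 0}$. Thus, in distributional equilibrium, the species numbers are independent and have Poisson distributions.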

Generalizations

$$\lambda_k(x) = \kappa_k \prod_{i=1}^N \theta_i(x_i)\,\theta_i(x_i - 1) \cdots \theta_i(x_i - (\nu_{ki} - 1)),$$
where $\theta_i(\cdot)$ is the rate of association of the $i$th species.

Examples:

1. $\theta_i(x_i) = x_i$ gives mass-action kinetics.
2. $\theta_i(x_i) = \dfrac{v_i x_i}{k_i + x_i}$ gives a stochastic Michaelis-Menten kinetics.
3. Can have different kinetics in the same model. Example: species $A$ governed by Michaelis-Menten and species $B$ governed by mass-action. Then $A + B \to \nu'_k$ has intensity
$$\lambda_k(x) = \kappa_k \frac{v_1 x_1}{k_1 + x_1}\, x_2.$$
4. If (1) $\nu_k \in \{0, 1\}$ and (2) $\theta_i(x_i) = \min\{n_i, x_i\}$, then the system is an M/M/n queueing network.

Generalizations

$$\lambda_k(x) = \kappa_k \prod_{i=1}^N \theta_i(x_i)\,\theta_i(x_i - 1) \cdots \theta_i(x_i - (\nu_{ki} - 1)) \qquad (1)$$

Theorem (Anderson, Craciun, Kurtz - 2008). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a stochastically modeled chemical reaction network with intensity functions (1). Suppose that

1. the associated deterministic mass-action system with rate constants $\kappa_k$ is complex-balanced with equilibrium $c \in \mathbb{R}^N_{>0}$.

Then for any closed, irreducible communicating equivalence class, $\Gamma$, the stochastic system has a product-form stationary distribution
$$\pi(x) = M \prod_{i=1}^N \frac{c_i^{x_i}}{\prod_{j=1}^{x_i} \theta_i(j)}, \qquad x \in \Gamma, \qquad (2)$$
where $M$ is a normalizing constant, provided that (2) is summable.

Example: Michaelis-Menten kinetics

Suppose we have a system with only Michaelis-Menten kinetics. Then
$$\prod_{j=1}^{x_i} \theta_i(j) = \prod_{j=1}^{x_i} \frac{v_i\, j}{k_i + j} = v_i^{x_i} \Big/ \binom{k_i + x_i}{x_i}.$$

So
$$\pi(x) = M \prod_{i=1}^N \frac{c_i^{x_i}}{\prod_{j=1}^{x_i} \theta_i(j)} = M \prod_{i=1}^N \binom{k_i + x_i}{x_i} \left( \frac{c_i}{v_i} \right)^{x_i}.$$

Note:
$$\binom{k_i + x_i}{x_i} = O(x_i^{k_i}), \qquad x_i \to \infty,$$
and so the above $\pi(x)$ is summable if for each species $S_i$ whose abundances are unbounded, we have $c_i < v_i$.

More Generalizations

$$\lambda_k(x) = \kappa_k \frac{\theta(x)}{\theta(x - \nu_k)}. \qquad (3)$$

Note: $\theta(x) = \prod_{i=1}^N \prod_{j=1}^{x_i} \theta_i(j)$ gives the previous (natural) case.

Theorem (Anderson, Craciun, Kurtz - 2008). Let $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$ be a stochastically modeled chemical reaction network with intensity functions (3). Suppose that

1. the associated mass-action deterministic system with rate constants $\kappa_k$ has a complex-balanced equilibrium $c \in \mathbb{R}^N_{>0}$.

Then for any closed, irreducible communicating equivalence class, $\Gamma$, the stochastic system has a stationary distribution
$$\pi(x) = M \frac{1}{\theta(x)} c^x = M \frac{1}{\theta(x)} \prod_{i=1}^N c_i^{x_i}, \qquad x \in \Gamma, \qquad (4)$$
where $M$ is a normalizing constant, provided that (4) is summable.

Simulation of Stochastically Modeled Systems

Oftentimes mathematical analysis is not sufficient and/or timely. In such cases, simulation methods are utilized.

This is not problematic in the deterministic setting. This is problematic in the stochastic setting, as we want to incorporate Monte Carlo methods.

Make use of the random time change representation:
$$X(t) = X(0) + \sum_k Y_k\!\left( \int_0^t \lambda_k(X(s))\, ds \right) (\nu'_k - \nu_k),$$
where the $Y_k$ are independent, unit-rate Poisson processes.

Simulation of Stochastically Modeled Systems

$$X(t) = X(0) + \sum_k Y_k\!\left( \int_0^t \lambda_k(X(s))\, ds \right) (\nu'_k - \nu_k)$$

This representation can be utilized to:

(A. - 2007) Develop efficient and statistically exact algorithms that are easily adaptable to such typical situations as:

1. Time varying (even stochastically driven) rate constants.
2. Delays in the initiation and completion of reactions (such as is common with gene transcription and/or translation models).

(A. - 2008) Develop approximate algorithms that are both more stable and more accurate than existing algorithms.

(A., Ganguly, Kurtz - 2008) Perform the first rigorous error analysis of the approximate methods using an appropriate scaling.
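One way the random time change representation leads to an exact simulation scheme is to track, for each reaction, the internal time $\int_0^t \lambda_k(X(s))\,ds$ and the next jump time of its unit-rate Poisson process $Y_k$. The sketch below is written in the spirit of the modified next reaction method cited in the references, for time-homogeneous rate constants only. It is not the author's code, and the function signature, the `intensity(k, x)` callback, and the A ⇌ B test network with rate constants 2 and 3 are all assumptions.

```python
import numpy as np

def simulate_random_time_change(x0, nu, nu_prime, intensity, t_end, rng):
    """Simulate X up to time t_end; intensity(k, x) returns lambda_k(x)."""
    x = np.array(x0, dtype=np.int64)
    zeta = np.asarray(nu_prime) - np.asarray(nu)   # net change nu'_k - nu_k per reaction
    K = len(zeta)
    internal = np.zeros(K)                 # T_k = integral of lambda_k(X(s)) ds so far
    next_jump = rng.exponential(1.0, K)    # next jump time of each unit-rate process Y_k
    t = 0.0
    while True:
        lam = np.array([intensity(k, x) for k in range(K)], dtype=float)
        # Real time until Y_k's next jump if intensities stay fixed (inf when lambda_k = 0).
        wait = np.divide(next_jump - internal, lam, out=np.full(K, np.inf), where=lam > 0)
        k = int(np.argmin(wait))
        if not np.isfinite(wait[k]) or t + wait[k] > t_end:
            return x
        t += wait[k]
        internal += lam * wait[k]             # advance every internal clock
        x += zeta[k]                          # fire reaction k
        next_jump[k] += rng.exponential(1.0)  # schedule the next jump of Y_k

# Hypothetical test network A <-> B with mass-action intensities.
nu = [[1, 0], [0, 1]]
nu_prime = [[0, 1], [1, 0]]

def ab_intensity(k, x):
    return (2.0 * x[0], 3.0 * x[1])[k]

final_state = simulate_random_time_change(
    (10, 0), nu, nu_prime, ab_intensity, t_end=5.0, rng=np.random.default_rng(0)
)
```

Because both reactions conserve the total, `final_state` always sums to 10, which gives a simple check that the bookkeeping is consistent.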

Tau-Leaping Error Analysis

Tau-leaping is equivalent to an Euler approximation of the integral:
$$X(\tau) = X(0) + \sum_k Y_k\!\left( \int_0^\tau \lambda_k(X(s))\, ds \right) (\nu'_k - \nu_k)$$
$$\approx X(0) + \sum_k Y_k\big( \tau\, \lambda_k(X(0)) \big) (\nu'_k - \nu_k)$$
$$\overset{d}{=} X(0) + \sum_k \text{Poisson}\big( \lambda_k(X(0))\, \tau \big) (\nu'_k - \nu_k).$$

Previous error analysis was performed under the scaling $\tau \to 0$. However, in this scaling, exact methods are faster.
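The Euler step above amounts to drawing one Poisson number of firings per reaction per leap. A minimal NumPy sketch of standard tau-leaping follows; the function name and signature are assumptions, there is no safeguard against negative populations, and the test case is the unit-rate decay network X → 0 with X(0) = 10^6 and τ = 1/10^2 that appears in the isomerization example of this talk.

```python
import numpy as np

def tau_leap(x0, nu, nu_prime, intensity, tau, n_steps, rng):
    """Standard tau-leaping: freeze the intensities over each step of length tau."""
    x = np.array(x0, dtype=np.int64)
    zeta = np.asarray(nu_prime) - np.asarray(nu)   # net change nu'_k - nu_k per reaction
    for _ in range(n_steps):
        lam = np.array([intensity(k, x) for k in range(len(zeta))], dtype=float)
        # Number of firings of reaction k in this leap ~ Poisson(lambda_k(X) * tau).
        firings = rng.poisson(np.maximum(lam, 0.0) * tau)
        x = x + firings @ zeta
    return x

# Decay network X -> 0 with unit rate constant, run to time T = 100 * tau = 1.
V = 10**6
x_final = tau_leap(
    x0=(V,), nu=[[1]], nu_prime=[[0]],
    intensity=lambda k, x: float(x[0]),
    tau=1e-2, n_steps=100, rng=np.random.default_rng(0),
)
euler_mean = V * (1 - 1e-2) ** 100   # mean of the tau-leap process at T = 1
exact_mean = V * np.exp(-1.0)        # mean of the exact process at T = 1
```

The mean of the tau-leap process, about 366,032, sits below the exact mean of about 367,879; this gap is the kind of bias the error analysis quantifies.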

Tau-Leaping Error Analysis

Using the random time change representation you can perform error analysis in the following way:

Let $V$ be the order of magnitude of typical abundance. Let $\tau = c/V^\beta$ for some $c > 0$ and $0 < \beta < 1$. As $V \to \infty$, $\tau \to 0$ and $\lambda_k(X)\tau \to \infty$, which is the appropriate regime for the use of approximate methods.

In this scaling regime, one can show the following:

1. Standard tau-leaping is an order $\tau = 1/V^\beta$ method in both the weak and strong sense.
2. Midpoint tau-leaping has a strong order $1/V^{g(\beta)}$, where
$$g(\beta) = \begin{cases} 2\beta, & \beta < 1/3 \\ \beta + \tfrac{1}{2}(1 - \beta), & \beta \ge 1/3 \end{cases}$$
and a weak order of $\tau^2 = 1/V^{2\beta}$.

This result is in contrast with previous error analysis ($\tau \to 0$), but in agreement with numerical examples.

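Example: Irreversible Isomerization ($X \to 0$)

Consider the simple example
$$X \xrightarrow{1} 0, \qquad X(0) = 10^6 = V.$$

Solutions satisfy
$$X(t) = X(0) - Y_1\!\left( \int_0^t X(s)\, ds \right).$$

Simulating 100,000 sample paths with $\tau = 1/10^2 = 1/V^{1/3}$ yields

[Results table/figure not recovered in this transcript.]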

Partial List of References

1. David F. Anderson and Anne Shiu, Persistence of deterministic population processes and the Global Attractor Conjecture, in preparation (to be submitted very soon).
2. David F. Anderson, Global asymptotic stability for a class of nonlinear chemical equations, SIAM J. Appl. Math., 68(5), 1464-1476, May 2008.
3. David F. Anderson, Gheorghe Craciun, and Thomas G. Kurtz, Product-form stationary distributions for deficiency zero chemical reaction networks, submitted, arXiv:0803.3042, link on my website.
4. David F. Anderson, Arnab Ganguly, and Thomas G. Kurtz, Error analysis of the tau-leap simulation method for stochastically modeled chemical reaction systems, in preparation.
5. David F. Anderson, Incorporating postleap checks in tau-leaping, Journal of Chemical Physics, 128(5), 054103, February 2008.
6. David F. Anderson, A modified Next Reaction Method for simulating chemical systems with time dependent propensities and delays, Journal of Chemical Physics, 127(21), 214107, December 2007.
7. Martin Feinberg, Lectures on chemical reaction networks, delivered at Univ. of Wisconsin, Madison, 1979. Available for download at: http://www.che.eng.ohio-state.edu/~feinberg/lecturesonreactionnetworks