From RNA-seq Time Series Data to Models of Regulatory Networks Konstantin Mischaikow Dept. of Mathematics, Rutgers mischaik@math.rutgers.edu Brown, Dec 2017
A Model Problem
Malaria
Estimated number of malaria cases in 2010: between 219 and 550 million.
Estimated number of deaths due to malaria in 2010: 600,000 to 1,240,000.
"Malaria may have killed half of all the people that ever lived. And more people are now infected than at any point in history. There are up to half a billion cases every year, and about 2 million deaths - half of those are children in sub-Saharan Africa." (J. Whitfield, Nature, 2002)
"Malaria is of great public health concern, and seems likely to be the vector-borne disease most sensitive to long-term climate change." (World Health Organization)
"Resistance is now common against all classes of antimalarial drugs apart from artemisinins. Malaria strains found on the Cambodia-Thailand border are resistant to combination therapies that include artemisinins, and may therefore be untreatable." (World Health Organization)
Malaria: P. falciparum
Task: Understand the regulation on the genetic/biomolecular level with the goal of affecting the dynamics with drugs.
[Figure: expression heat maps for all genes (5409) and for periodic genes (43); 48 hour cycle; 1-2 minutes; x-axis: time in vitro (hours), 0 to 60; color scale: standard deviations from mean expression (z-score), from -1.5 (low) to 1.5 (high).]
A differential equation $dx/dt = f(x, \lambda)$ is probably a reasonable model for the dynamics, but I do not have an analytic description of $f$ or estimates of the parameters.
Malaria is: sequenced; poorly annotated.
Walter Reed Army Inst. Research, Duke University.
A proposed network.
A Philosophical Interlude
The Lac Operon (Ozbudak et al., Nature 2004)
ODEs are great modeling tools, but should be handled with care.
Network model: $R(x) = \dfrac{R_T}{1 + (x/x_0)^n}$
ODE model: $\tau_y \dot{y} = \alpha \dfrac{R_T}{R_T + R(x)} - y$, $\quad \tau_x \dot{x} = \beta y - x$
Data (parameter values): $\alpha = \dfrac{84.4}{1 + (G/8.1)^{1.2}} + 16.1$, $\quad \beta = \dots$
What does it mean to solve an ODE?
Classical: Precise, Not Accurate, Not Rigorous.
Qualitative Representation of Dynamics (Conley-Morse Chain Complex; Dynamic Signature (Morse Graph)): Not Precise, Accurate, Rigorous.
[Figure labels: truth, model, parameter.]
Combinatorial Dynamics
State Transition Graph $F : X \to X$
Vertices: States. Edges: Dynamics.
Don't know exact current state, so don't know exact next state.
Simple decomposition of Dynamics: Recurrent; Nonrecurrent (gradient-like). Linear time Algorithm!
Morse Graph of state transition graph: a poset with nodes p3, p2, p1, p0.
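The decomposition into recurrent and nonrecurrent parts can be sketched with a standard strongly connected component computation. The block below is a minimal illustration, not the DSGRN implementation: it uses Tarjan's algorithm (linear in vertices plus edges) and then a plain reachability search to order the recurrent components, which is not linear time but is enough for a small example. The seven-vertex graph and its vertex names are hypothetical.

```python
def tarjan_scc(graph):
    """Strongly connected components of a dict {vertex: [successors]}.

    Recursive Tarjan, O(V + E); fine for small graphs (deep graphs would
    need an iterative version because of Python's recursion limit).
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            components.append(frozenset(comp))

    for v in graph:
        if v not in index:
            visit(v)
    return components


def morse_graph(graph):
    """Recurrent components and the reachability order between them."""
    comps = tarjan_scc(graph)
    # A component is recurrent if it contains at least one edge.
    recurrent = [c for c in comps
                 if len(c) > 1 or any(v in graph[v] for v in c)]

    def reachable(start):
        seen, todo = set(start), list(start)
        while todo:
            v = todo.pop()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        return seen

    order = set()                       # (p, q) means p > q: p can reach q
    for p in recurrent:
        reach = reachable(p)
        for q in recurrent:
            if q != p and q <= reach:
                order.add((p, q))
    return recurrent, order


# Hypothetical state transition graph: "c" is the only nonrecurrent vertex.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["d"],
    "d": ["d", "e", "f"], "e": ["e"], "f": ["g"], "g": ["f"],
}
recurrent, order = morse_graph(graph)
for p, q in sorted(order, key=lambda pq: (sorted(pq[0]), sorted(pq[1]))):
    print(sorted(p), ">", sorted(q))
```

Here the recurrent components play the role of the Morse graph nodes, and the `order` pairs are the poset relations between them.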
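What is observable?
$A \subset X$ is an attractor if $F(A) = A$.
Observable: the lattice of attractors of $F : X \to X$, with $\vee = \cup$ and $\wedge =$ maximal attractor in $\cap$.
Computable: the Morse graph of $F : X \to X$, a poset.
Birkhoff's theorem implies that the Morse graph and the lattice of attractors are equivalent: lower sets $O(M)$ take the poset to the lattice, and join-irreducibles $J^{\vee}(A)$ take the lattice back to the poset.
[Figure: Morse graph with nodes p3, p2, p1, p0 and its lower sets {p3, p2, p1, p0}, {p2, p1, p0}, {p1, p0}, {p1}, {p0}, $\emptyset$.]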
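The correspondence given by Birkhoff's theorem can be checked by brute force on a small poset. The sketch below assumes the order p3 > p2 > p1 and p2 > p0 (with p1 and p0 incomparable), which is how the lower sets listed on the slide read. It enumerates the lower sets, then recovers the poset elements as the join-irreducible lower sets. The function names are illustrative only.

```python
from itertools import combinations

# Assumed order relation: below[p] lists every element strictly below p.
below = {
    "p0": set(),
    "p1": set(),
    "p2": {"p0", "p1"},
    "p3": {"p0", "p1", "p2"},
}


def lower_sets(below):
    """All subsets that contain everything below each of their elements."""
    elems = sorted(below)
    result = []
    for r in range(len(elems) + 1):
        for subset in combinations(elems, r):
            s = frozenset(subset)
            if all(below[p] <= s for p in s):
                result.append(s)
    return result


def join_irreducibles(lattice):
    """Map each join-irreducible a to the part of a not covered from below.

    In a finite lattice of sets closed under union, a nonempty element is
    join-irreducible exactly when the union of all strictly smaller
    elements is not the element itself.
    """
    found = {}
    for a in lattice:
        smaller = [b for b in lattice if b < a]
        if not smaller:                 # bottom element (empty set)
            continue
        pred = frozenset().union(*smaller)
        if pred != a:
            found[a] = a - pred
    return found


lattice = lower_sets(below)
irreducible = join_irreducibles(lattice)
recovered = {next(iter(diff)) for diff in irreducible.values()}
print(len(lattice), "lower sets;", "recovered poset elements:", sorted(recovered))
```

Each join-irreducible lower set adds exactly one new element over its predecessor, so the four poset elements are recovered from the six-element lattice.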
Topology (differential equations are not defined on discrete sets)
Topology
Let $X$ be a compact metric space (phase space).
Let $R(X)$ denote the lattice of regular closed subsets of $X$ (infinite unbounded lattice).
Let $L$ be a finite bounded sublattice of $R(X)$ (level of measurement; applicable scale for model).
$G(L)$ denotes the atoms of $L$ (smallest elements of $L$).
Dynamics
Declare a bounded sublattice $A \subset L$ to be a lattice of forward invariant regions (attracting blocks).
Use Birkhoff to define the poset $(P := J^{\vee}(A), <)$.
For each $p \in P$ define a Morse tile $M(p) := \mathrm{cl}(a \setminus \mathrm{pred}(a))$.
Remark: I have purposefully ignored the relation between $L$ and $F : X \to X$.
Example
Phase space: $X = [-4, 4] \subset \mathbb{R}$
Atoms of lattice: $G(L) = \{[n, n+1] \mid n = -4, \dots, 3\}$
Lattice of attracting blocks: $A = \{[-3, -1],\ [1, 3],\ [-3, -1] \cup [1, 3],\ [-4, 4]\}$
Birkhoff: $A \mapsto P$, a poset with elements 1, 2, 3 (3 above 1 and 2).
How does this relate to a differential equation $dx/dt = f(x)$? Let $F'(x) = -f(x)$. Attracting blocks are regions of phase space that are forward invariant with time.
[Figure: graph of $F$ over $[-4, 4]$ with the Morse tiles $M(p)$ marked.]
Remark: This leads to a homology theory.
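The Morse tiles for this example can be computed directly from the lattice of attracting blocks. In the sketch below each block is stored as the set of atoms it contains, with the integer n standing for the atom [n, n+1]. Because every block is a union of closed atoms, removing atoms and taking the union of what is left gives the closure of the set difference. The helper names are assumptions made for illustration.

```python
def block(lo, hi):
    """The closed interval [lo, hi] as a set of atoms n <-> [n, n+1]."""
    return frozenset(range(lo, hi))


# Lattice of attracting blocks from the example.
A = [block(-3, -1), block(1, 3), block(-3, -1) | block(1, 3), block(-4, 4)]


def morse_tiles(A):
    """Map each join-irreducible block a to the atoms of cl(a minus pred(a))."""
    tiles = {}
    for a in A:
        pred = frozenset().union(*[b for b in A if b < a])
        if a and pred != a:             # a is join-irreducible
            tiles[a] = a - pred
    return tiles


def as_intervals(atom_set):
    """Merge adjacent atoms into maximal closed intervals."""
    out = []
    for n in sorted(atom_set):
        if out and out[-1][1] == n:
            out[-1][1] = n + 1
        else:
            out.append([n, n + 1])
    return [tuple(iv) for iv in out]


tiles = morse_tiles(A)
for a, tile in tiles.items():
    print(as_intervals(a), "->", as_intervals(tile))
```

The two small blocks are their own tiles, and the tile of [-4, 4] consists of the three intervals left over once both small blocks are removed.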
Switching Systems (an example of how to use these ideas)
Choosing $L$ and $F : X \rightrightarrows X$
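Biological Model
How do I want to interpret this information? What differential equation do I want to use?
$x_i$ denotes amount of species $i$. Assume $x_i$ decays.
Proposed model: $\dfrac{dx_i}{dt} = -\gamma_i x_i + \Lambda_i(x)$
Parameters: 1/node, 3/edge.
1 ⊣ 2: $x_1$ represses the production of $x_2$. 1 → 2: $x_1$ activates the production of $x_2$.
For $x_1 < \theta_{2,1}$ we ask about $\mathrm{sign}(-\gamma_2 x_2 + u_{2,1})$. For $x_1 > \theta_{2,1}$ we ask about $\mathrm{sign}(-\gamma_2 x_2 + l_{2,1})$.
$\sigma^-_{j,i}(x_i) = \begin{cases} u_{j,i} & \text{if } x_i < \theta_{j,i} \\ l_{j,i} & \text{if } x_i > \theta_{j,i} \end{cases}$
Focus on sign of $-\gamma_i x_i + \sigma^-_{i,j}(x_j)$ and $-\gamma_i x_i + \sigma^+_{i,j}(x_j)$.
[Figure: $dx_2/dt$ as a step function of $x_1$ with levels $u_{2,1}$, $l_{2,1}$ and threshold $\theta_{2,1}$.]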
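The sign question on this slide is easy to make concrete. The sketch below writes the decay rate as `gamma`, the threshold as `theta`, and the two production levels as `l` and `u`. The repressing switch follows the slide, and the activating switch is assumed to be its mirror image (low level below the threshold, high level above). All numerical values are hypothetical.

```python
def sigma_minus(x, theta, l, u):
    """Repression: production is u below the threshold and l above it."""
    if x == theta:
        raise ValueError("switching function is not defined at the threshold")
    return u if x < theta else l


def sigma_plus(x, theta, l, u):
    """Activation (assumed mirror image): l below the threshold, u above."""
    if x == theta:
        raise ValueError("switching function is not defined at the threshold")
    return l if x < theta else u


def rate_sign(x_target, gamma, production):
    """Sign of -gamma * x_target + production, returned as +1, 0 or -1."""
    rate = -gamma * x_target + production
    return (rate > 0) - (rate < 0)


# Hypothetical parameters for the edge from node 1 to node 2.
gamma2, theta21, l21, u21 = 1.0, 1.0, 0.5, 2.0
x2 = 1.0

below = rate_sign(x2, gamma2, sigma_minus(0.5, theta21, l21, u21))
above = rate_sign(x2, gamma2, sigma_minus(2.0, theta21, l21, u21))
print("x1 represses x2: sign below threshold", below, ", above threshold", above)
```

With these values x2 increases while x1 is below its threshold and decreases once x1 crosses it, which is the repression picture; swapping in `sigma_plus` reverses the two signs.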
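Example: network on nodes 1 and 2
We care about sign of: $-\gamma_1 \theta_{2,1} + \sigma^-_{1,2}(x_2)$ and $-\gamma_2 \theta_{1,2} + \sigma^+_{2,1}(x_1)$.
Parameter space is a subset of $(0, \infty)^8$. Phase space: $X = (0, \infty)^2$.
Fix $z$ a regular parameter value.
$z$ is a regular parameter value if $0 < \gamma_i$, $0 < l_{i,j} < u_{i,j}$, $0 < \theta_{i,k} \neq \theta_{j,k}$, and $0 \neq -\gamma_i \theta_{j,i} + \Lambda_i(x)$.
[Figure: $(x_1, x_2)$ phase plane with thresholds $\theta_{2,1}$ on the $x_1$ axis and $\theta_{1,2}$ on the $x_2$ axis, showing the two cases: $x_1 < \theta_{2,1}$ and $-\gamma_2 \theta_{1,2} + \sigma^+_{2,1}(x_1) > 0$; $x_1 < \theta_{2,1}$ and $-\gamma_2 \theta_{1,2} + \sigma^+_{2,1}(x_1) < 0$.]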
Fix $z$ a regular parameter value.
Need to construct the state transition graph $F_z : X \to X$.
Vertices: $X$ corresponds to all rectangular domains and co-dimension 1 faces defined by thresholds.
Edges: Faces pointing in map to their domain. Domains map to their faces pointing out. If a domain has no outpointing faces, it maps to itself.
[Figure: $(x_1, x_2)$ phase plane divided by the thresholds $\theta_{2,1}$ and $\theta_{1,2}$.]
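Example: network on nodes 1 and 2
Constructing the state transition graph $F_z : X \to X$. Fix $z$ a regular parameter value.
Check signs of $-\gamma_i \theta_{j,i} + \sigma^{\pm}_{i,j}(x_j)$.
Assume: $l_{1,2} < \gamma_1 \theta_{2,1} < u_{1,2}$ and $\gamma_2 \theta_{1,2} < l_{2,1} < u_{2,1}$. Morse graph: FP{0,1}.
Assume: $l_{1,2} < \gamma_1 \theta_{2,1} < u_{1,2}$ and $l_{2,1} < \gamma_2 \theta_{1,2} < u_{2,1}$. Morse graph: FC.
Dynamics orders maxima and minima $(M_1, M_2, m_1, m_2)$.
[Figure: the two $(x_1, x_2)$ phase planes with thresholds $\theta_{2,1}$ and $\theta_{1,2}$.]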
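The two cases can be reproduced numerically with a simplified graph. The sketch below is not the construction described in the slides, which uses both domains and co-dimension 1 faces; it keeps only the four rectangular domains and sends each domain to the neighbours it flows into, or to itself if it has none. It assumes that x2 represses x1 and x1 activates x2, and every parameter value is hypothetical.

```python
from itertools import product


def domain_graph(g1, g2, th21, th12, l12, u12, l21, u21):
    """Successor map on the four domains (a, b).

    a = 1 means x1 > th21 and b = 1 means x2 > th12.
    """
    edges = {}
    for a, b in product((0, 1), repeat=2):
        s12 = l12 if b else u12         # sigma^-_{1,2}(x2): x2 represses x1
        s21 = u21 if a else l21         # sigma^+_{2,1}(x1): x1 activates x2
        ta = int(-g1 * th21 + s12 > 0)  # side of th21 that x1 is driven to
        tb = int(-g2 * th12 + s21 > 0)  # side of th12 that x2 is driven to
        succ = [d for d in ((ta, b), (a, tb)) if d != (a, b)]
        edges[(a, b)] = succ or [(a, b)]
    return edges


def classify(edges):
    """Return ('FP', fixed domains) or ('FC', cycle through every domain)."""
    fixed = [d for d, succ in edges.items() if succ == [d]]
    if fixed:
        return "FP", fixed
    path = [next(iter(edges))]
    while edges[path[-1]][0] not in path:
        path.append(edges[path[-1]][0])
    cycle = path[path.index(edges[path[-1]][0]):]
    return ("FC" if len(cycle) == len(edges) else "cycle"), cycle


# l12 < g1*th21 < u12 and l21 < g2*th12 < u21
fc_params = dict(g1=1.0, g2=1.0, th21=1.0, th12=1.0,
                 l12=0.5, u12=2.0, l21=0.5, u21=2.0)
# l12 < g1*th21 < u12 and g2*th12 < l21 < u21
fp_params = dict(fc_params, l21=1.5)

print(classify(domain_graph(**fc_params)))
print(classify(domain_graph(**fp_params)))
```

Only the ordering of the numbers matters here, not their values, which is why a whole region of parameter space shares one Morse graph.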
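DSGRN Database
Input: Regulatory Network (nodes 1 and 2). Output: DSGRN database.
(1) FP(1,1): $\gamma_1 \theta_{2,1} < l_{1,2} < u_{1,2}$ and $\gamma_2 \theta_{1,2} < l_{2,1} < u_{2,1}$
(2) FP(0,1): $l_{1,2} < \gamma_1 \theta_{2,1} < u_{1,2}$ and $\gamma_2 \theta_{1,2} < l_{2,1} < u_{2,1}$
(3) FP(0,1): $l_{1,2} < u_{1,2} < \gamma_1 \theta_{2,1}$ and $\gamma_2 \theta_{1,2} < l_{2,1} < u_{2,1}$
(4) FP(1,1): $\gamma_1 \theta_{2,1} < l_{1,2} < u_{1,2}$ and $l_{2,1} < \gamma_2 \theta_{1,2} < u_{2,1}$
(5) FC: $l_{1,2} < \gamma_1 \theta_{2,1} < u_{1,2}$ and $l_{2,1} < \gamma_2 \theta_{1,2} < u_{2,1}$
(6) FP(0,0): $l_{1,2} < u_{1,2} < \gamma_1 \theta_{2,1}$ and $l_{2,1} < \gamma_2 \theta_{1,2} < u_{2,1}$
(7) FP(1,0): $\gamma_1 \theta_{2,1} < l_{1,2} < u_{1,2}$ and $l_{2,1} < u_{2,1} < \gamma_2 \theta_{1,2}$
(8) FP(1,0): $l_{1,2} < \gamma_1 \theta_{2,1} < u_{1,2}$ and $l_{2,1} < u_{2,1} < \gamma_2 \theta_{1,2}$
(9) FP(0,0): $l_{1,2} < u_{1,2} < \gamma_1 \theta_{2,1}$ and $l_{2,1} < u_{2,1} < \gamma_2 \theta_{1,2}$
Parameter graph provides explicit partition of entire 8-D parameter space.
Observe that we can query this database for local or global dynamics.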
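The nine entries can be regenerated from the orderings alone, without choosing numerical parameters. The sketch below is a hand-rolled enumeration for this one network and does not use the DSGRN software or its API. Each node is described by where its decay-times-threshold product sits relative to l < u. Domains are the four threshold rectangles, as a simplification of the full state transition graph. The numbering is chosen to match the list of nine parameter nodes.

```python
from itertools import product


def target(position, input_high, activating):
    """Side of its threshold (0 low, 1 high) that a variable is driven to.

    position 0: gamma*theta < l < u   always driven high
    position 1: l < gamma*theta < u   depends on the input
    position 2: l < u < gamma*theta   always driven low
    """
    if position == 0:
        return 1
    if position == 2:
        return 0
    # Production is u exactly when an activator is high or a repressor is low.
    return int(bool(input_high) == activating)


def morse_label(pos1, pos2):
    """Label the dynamics, assuming x2 represses x1 and x1 activates x2."""
    succ = {}
    for a, b in product((0, 1), repeat=2):
        ta = target(pos1, b, activating=False)
        tb = target(pos2, a, activating=True)
        succ[(a, b)] = [d for d in ((ta, b), (a, tb)) if d != (a, b)]
    fixed = [d for d, s in succ.items() if not s]
    if fixed:
        return "FP({},{})".format(*fixed[0])
    # No fixed domain: follow successors and confirm a cycle through all four.
    d, seen = (0, 0), []
    while d not in seen:
        seen.append(d)
        d = succ[d][0]
    return "FC" if len(seen) == 4 and d == (0, 0) else "other"


table = {3 * pos2 + pos1 + 1: morse_label(pos1, pos2)
         for pos2 in range(3) for pos1 in range(3)}
for node in sorted(table):
    print(node, table[node])
```

A query such as "which parameter nodes have a full cycle" is then a filter over `table`, which is the small-scale analogue of querying the database.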
Back to Malaria Remark: there are a variety of statistical methods for generating possible regulatory networks from this type of time series data.
In vitro data (WRAIR + Haase)
[Figure: expression heat map for putative TF genes (456); x-axis: time in vitro (hours), 0 to 60; color scale: standard deviations from mean expression (z-score), from -1.5 (low) to 1.5 (high).]
Remarks about dynamics:
1. Gene expression is cyclic in nature.
2. We know relative times of expression of genes.
Assumption: Expression of important functions must be robust to perturbations.
Simple Test
Cyclic feedback system: well understood using classical dynamical systems techniques.
[Figure: experimental time series for associated genes.]
Under the assumption of monotone switches, if parameter values are chosen such that there exists a stable periodic orbit, then the maxima in the network must occur in the order: (188, 93, 184, 395) (green, blue, cyan, red).
Conclusion: This network does not generate observed dynamics.
No mathematical theory
DSGRN computation produces a parameter graph with approximately 45,000 nodes. Computation time on laptop approximately 1 second.
[Figure: time series for associated genes.]
SQL Query: A stable cycle involving oscillations in all genes.
96 parameter graph nodes with Morse graph that has a minimal node consisting of a Full Cycle (FC).
DSGRN Analysis (II): Max-Min Matching
[Figure: time series annotated with maxima (M) and minima (m).]
Have developed a polynomial time algorithm that takes paths in the state transition graph and identifies sequences of possible maxima and minima.
Tested all max-min sequences from state transition graphs from all 96 parameter graph nodes against 17,280 experimental patterns. No match.
Conclusion: This network does not generate observed dynamics.
DSGRN strategy
I. Start with a proposed gene regulatory network.
Extract from experimental data the poset indicating possible max-min orderings.
Perform DSGRN computation to identify parameter nodes for which the minimal Morse node is FC.
Reject a parameter node if the max-min sequences of FC are not linear extensions of the poset.
Compute the fraction of parameter nodes that match experimental data.
Assumption: Expression of important functions must be robust to perturbations.
IF the fraction is small, THEN reject the regulatory network.
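The rejection test and the final fraction can be sketched in a few lines. The event labels, the poset relations, and the sequences attached to each parameter node below are all hypothetical placeholders, not experimental data. The sketch also assumes that a parameter node is kept when at least one of its sequences is a linear extension, which is one possible reading of the rejection rule, and it ignores cyclic shifts of the sequences.

```python
def is_linear_extension(sequence, relations):
    """True if every relation (earlier, later) is respected by the sequence."""
    position = {event: i for i, event in enumerate(sequence)}
    return all(position[a] < position[b] for a, b in relations)


def fraction_matching(node_sequences, relations):
    """Fraction of parameter nodes with at least one compatible sequence."""
    matching = [node for node, sequences in node_sequences.items()
                if any(is_linear_extension(s, relations) for s in sequences)]
    return len(matching) / len(node_sequences), matching


# Hypothetical poset extracted from data: pairs mean "first precedes second".
relations = {("M1", "m1"), ("M2", "m2"), ("M1", "M2")}

# Hypothetical max-min sequences produced for four parameter nodes.
node_sequences = {
    "node_a": [["M1", "M2", "m1", "m2"]],
    "node_b": [["M2", "M1", "m1", "m2"], ["M1", "m1", "M2", "m2"]],
    "node_c": [["m1", "M1", "M2", "m2"]],
    "node_d": [["M2", "M1", "m2", "m1"]],
}

fraction, matching = fraction_matching(node_sequences, relations)
print(fraction, matching)
```

Under the robustness assumption, a network whose `fraction` comes out small would be the one to reject.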
DSGRN strategy
II. Start with an acceptable gene regulatory network.
Create a new regulatory network via random perturbations: Add/Remove edge(s); Add/Remove node.
Then apply I, starting with the newly proposed gene regulatory network.
Current favorite network:
90% of parameter nodes* induce a minimal FC node.
80% of parameter nodes* induce a minimal FC which agrees with the experimentally determined max-min ordering.
Thank you for your attention
Rutgers: S. Harker
MSU: T. Gedeon, B. Cummings
FAU: W. Kalies
VU Amsterdam: R. Vandervorst
Homology + Database Software: chomp.rutgers.edu
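Example (continued)
$X = [-4, 4] \subset \mathbb{R}$, $A \mapsto (P, <)$
The homology Conley index of $M(p)$ is $CH_*(p) := H_*(a, \mathrm{pred}(a); k)$, $k$ a field.
$CH_*(1) = (k, 0, \dots)$, $\quad CH_*(3) = (0, k, 0, \dots)$, $\quad CH_*(2) = (k, 0, \dots)$
Conley index can be used to guarantee existence of equilibria, periodic orbits, heteroclinic and homoclinic orbits, and chaotic dynamics.
Theorem (R. Franzosa): There exists a strictly upper triangular (with respect to $<$) boundary operator $\Delta : \bigoplus_{p \in P} CH_*(p) \to \bigoplus_{p \in P} CH_*(p)$ such that the induced homology is isomorphic to $H_*(X)$.
In this example: $\Delta = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$
[Figure: graph of $F$ over $[-4, 4]$ with Morse tiles labeled 1, 3, 2 from left to right.]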
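As a check on the theorem in this example, the homology of the boundary operator can be worked out by hand. The computation below takes the matrix entries as printed and orders the basis as (1, 2, 3). Since only the rank matters over a field, the conclusion does not depend on the signs of the nonzero entries.

```latex
\begin{align*}
  C_0 &= CH_0(1) \oplus CH_0(2) \cong k^2, \qquad
  C_1 = CH_1(3) \cong k, \qquad C_n = 0 \ \text{for } n \ge 2, \\
  \Delta_1 &: C_1 \to C_0, \qquad
  \Delta_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
  \operatorname{rank} \Delta_1 = 1, \\
  \dim H_0 &= \dim C_0 - \operatorname{rank} \Delta_1 = 2 - 1 = 1, \\
  \dim H_1 &= \dim \ker \Delta_1 = 1 - 1 = 0 .
\end{align*}
```

So the induced homology is $(k, 0, \dots)$, which agrees with $H_*([-4, 4]; k)$, the homology of an interval.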