Gene Network Science Diagrammatic Cell Language and Visual Cell Mr. Tan Chee Meng Scientific Programmer, System Biology Group, Bioinformatics Institute
Overview Introduction; Why?; Challenges; Diagrammatic Cell Language TM; Visual Cell TM; Minimal Cell Model; Simulation; Conclusion
Introduction GNS has created the largest known network of interconnected signal transduction pathways and gene expression networks controlling human cell growth, combining mitogenic signaling pathways with the processes of proliferation and apoptosis. 2,000 variables, >500 genes and proteins.
Introduction The model describes the processes of endocytosis, receptor signaling, protein degradation, signal transduction, transcriptional control of gene expression networks, and protein translation.
Introduction It predicts various physiological outcomes, such as: cell cycle progression and arrest through G1-S and G2-M, starting from mitogenic signaling; cell cycle arrest and apoptosis induction via p53; interplay between survival signals and apoptosis.
Why? Why do we need a Molecular Interaction Map (MIM)? (Kohn, 1999) It records known interactions. It serves as a road map or electronic circuit diagram. It suggests new interpretations or questions for experiment. It imposes a discipline of logic and critique on the formulation of functional models. It provides a shorthand for recording complicated findings or hypotheses.
Why? [Diagram relating: Molecular Interaction Map; CellML; Cell Language; SBML; Simulation Engine]
Why? SBML is aimed at exchanging information about pathway and reaction models between several existing applications. CellML is an XML-based exchange format for describing cellular models and any associated metadata. Metadata can be included to facilitate searches of collections of models and model components.
Why? CellML supports both quantitative and qualitative pathway models. It describes the structure and underlying mathematics of cellular models in a very general way. <role> element: "reactant", "product", "catalyst", "activator", "inhibitor", "modifier", or "rate".
Challenges Challenges in the construction of a MIM: Complexity? Multisubunit complexes, protein modifications, enzyme interactions and protein regulation. Ambiguous? Compactness? Readability? Standardized? Incompleteness and uncertainty of knowledge. Limited scope of applicability of some interactions. In a nutshell, does it reflect the peculiar nature of the computational machinery of the cell?
Challenges Characteristics of a good MIM (Kohn, 1999): Each molecular species should ideally appear only once in the model. It should facilitate tracing of all the known interactions of any given molecular species. It needs a concise method to represent multi-molecular complexes. It needs representations of protein modifications such as phosphorylations. It should visualize the actions and effects of each molecular species or interaction.
Pre-Diagrammatic Cell Language Kohn (1999) suggested a standard set of symbols and rules. In the paper, Kohn built a MIM of the regulatory network that controls the mammalian cell cycle and DNA repair systems.
Diagrammatic Cell Language Diagrammatic Cell Language is a patent-pending product of GNS. Compact visual representation of millions of chemical states and reactions. Complete, grammatically and mathematically, containing Nouns (atoms, linkboxes, and likeboxes) and Verbs (Reactions).
Diagrammatic Cell Language Atoms: the smallest computational units, e.g. gene, protein, mRNA, metaphase. An atom can be a molecular species, a single-copy entity, or a condition switch. Three basic types: Uniques: global and single occurrence, e.g. the centrosome, a metaphase switch. Commons: found in many copies but have dynamics, e.g. proteins, mRNA, calcium ions. Ubiques: small atoms that diffuse and reach equilibrium so quickly that they have no dynamics at all, e.g. phosphate groups, ATP, ions.
Diagrammatic Cell Language Actions: an action is a process that transforms any set of nouns into another. Three types: Bindings; Reactions; Processes (slow), e.g. mRNA transcription, protein folding/transport processes.
Diagrammatic Cell Language LinkBoxes represent the physical joining of one, two or more nouns: a protein is a linkbox of its binding sites; DNA is a linkbox of its genes. LikeBoxes define objects that act alike. Equivalence represents equivalent actions.
Diagrammatic Cell Language This linkbox can be in 8 states: (1) Chemical A Unbound, Chemical B Unbound, Chemical C Unbound; (2) Chemical A Bound, Chemical B Unbound, Chemical C Unbound; (3) Chemical A Unbound, Chemical B Bound, Chemical C Unbound; (4) Chemical A Bound, Chemical B Bound, Chemical C Unbound; (5) Chemical A Unbound, Chemical B Unbound, Chemical C Bound; (6) Chemical A Bound, Chemical B Unbound, Chemical C Bound; (7) Chemical A Unbound, Chemical B Bound, Chemical C Bound; (8) Chemical A Bound, Chemical B Bound, Chemical C Bound. When a linkbox has N binding sites, the number of different states is 2^N.
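The 2^N count can be checked by enumerating the states directly. The following is a minimal Python sketch, assuming each binding site is independently either bound or unbound; the function name `linkbox_states` and the site labels are illustrative and are not part of DCL itself.

```python
from itertools import product

def linkbox_states(sites):
    """Enumerate every bound/unbound combination for the given binding sites."""
    # Each site is independently unbound (False) or bound (True).
    return [dict(zip(sites, combo))
            for combo in product((False, True), repeat=len(sites))]

states = linkbox_states(["A", "B", "C"])
print(len(states))  # number of states for N = 3 sites
for state in states:
    print(", ".join(f"Chemical {site} {'Bound' if bound else 'Unbound'}"
                    for site, bound in state.items()))
```

Adding a fourth site doubles the list again, which is why a single linkbox symbol is so much more compact than drawing each state.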
Diagrammatic Cell Language The diagram with likeboxes is equivalent to six diagrams
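The original figure is not reproduced here, so the exact likeboxes behind the count of six are unknown. As a hedged illustration only, the sketch below assumes two likeboxes with two and three members; every name in it (`expand_likeboxes`, `kinase`, `substrate`, `K1`, `S1`, and so on) is hypothetical. It shows how one diagram written with likeboxes stands for one concrete diagram per combination of members.

```python
from itertools import product

def expand_likeboxes(template, likeboxes):
    """Expand a reaction template into one concrete reaction per
    combination of likebox members."""
    names = list(likeboxes)
    expanded = []
    for members in product(*(likeboxes[name] for name in names)):
        expanded.append(template.format(**dict(zip(names, members))))
    return expanded

# Hypothetical likeboxes: 2 kinase-like members x 3 substrate-like members.
likeboxes = {
    "kinase": ["K1", "K2"],
    "substrate": ["S1", "S2", "S3"],
}
diagrams = expand_likeboxes("{kinase} phosphorylates {substrate}", likeboxes)
for diagram in diagrams:
    print(diagram)
```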
Diagrammatic Cell Language [Figure: Kohn notation compared with DCL] A scaffold protein can bind several proteins independently.
Diagrammatic Cell Language [Figure: Kohn notation compared with DCL for E2F1, DP1 and pRb; labels: E2, (0,1), (0,1,2)]
Visual Cell
Minimal Cell Model "Towards the Development of a Minimal Cell Model by Generalization of a Model of Escherichia coli: Use of Dimensionless Rate Parameters" (Browning and Shuler, 2001). A minimal cell is the simplest free-living microbe that is capable of growth and self-replication. Essential genes: the minimal gene set that is necessary and sufficient to sustain the existence of a modern-type cell. Experiments and theoretical arguments suggest that such an organism would have as few as 250 genes. The model was developed based on Escherichia coli (E. coli).
Minimal Cell Model Dynamic model. Lumps pools of related components; for instance, a single pool represents all RNA. Important chemical species for cellular control are represented explicitly. Explicitly accounts for all essential functions. Provides a modular framework in which molecular details can be added to each module. Has a different set of kinetic requirements than E. coli due to the reduced genome size.
Minimal Cell Model Hypothesis: the relative rate constant of a reaction, in relation to the rate constants of related reactions, is the key factor in determining cellular function. To remain consistent, the mass balances for the individual species must add up to the values for the lumped species. Open question: how to broaden the specificity of enzymes?
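The consistency requirement on lumped pools can be expressed as a simple check. This is a minimal Python sketch, not the model's actual code: the function name `lumped_pool_consistent`, the split of the RNA pool into mRNA, tRNA and rRNA, and the numeric amounts are all assumptions made for illustration.

```python
import math

def lumped_pool_consistent(lumped_total, components, rel_tol=1e-9):
    """Return True when the component amounts sum to the lumped pool value."""
    return math.isclose(sum(components.values()), lumped_total, rel_tol=rel_tol)

# Hypothetical amounts (arbitrary units) for a single lumped RNA pool.
rna_components = {"mRNA": 0.5, "tRNA": 1.5, "rRNA": 8.0}
rna_pool = 10.0
print(lumped_pool_consistent(rna_pool, rna_components))
```

Running such a check whenever molecular detail is added to a module would flag a detailed module that no longer agrees with its lumped pool.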
Simulation Time-course experiments measuring mRNA abundance and protein activity are conducted on Caco and other human colonic cell lines. Unknown regulatory interactions and kinetic parameters are constrained via BioMetrics TM, a statistical inference tool used to integrate and analyze diverse biological data types in a statistically meaningful fashion, together with network inference, sensitivity analysis, and Digital Cell TM parameter optimization methods.
Simulation Simulated with Digital Cell TM (available through alliances and collaborations), a data-driven, scalable computer simulation platform that consists of a 192-processor Linux supercomputing cluster from strategic partner IBM Corp.
Conclusion GNS has made substantial progress in MIM construction. Can we borrow the idea and integrate it into Cellware?
Are we doing this?
End of Presentation Thank you