Biological Concepts and Information Technology (Systems Biology)
Janaina de Andréa Dernowsek, Postdoctoral Researcher, Division of 3D Technologies (DT3D), Center for Information Technology Renato Archer (CTI)
Janaina.dernowsek@cti.gov.br
Lecture overview
What I am going to talk about: systems biology; biological concepts; omics; databanks; standard analysis tools for large datasets; information technology (IT).
The goal of integrating biology and information technology (systems biology) is to understand biological systems: how a combination of experimental approaches and computational tools can give us an understanding of how the components within a cell interact to form cell aggregates, tissues and organs.
Information technology (IT)
Information technology plays a fundamental role in many different areas. There are many non-conflicting definitions of IT, and the following topics can be part of them: data collection; data classification; data organization; data analysis; decision making; modeling; simulation; prediction.
With the help of IT, biology is moving from small-scale biology to big data biology.
What is Systems Biology?
The term is used in many different contexts. Systems biology is the study of how molecules interact and come together to give rise to subcellular machinery that forms the functional units capable of the operations needed for physiological functions at the cell, tissue and organ levels.
The term systems biology came into wide use in the early 2000s, although the field was often called complex systems. The term complex system encompasses studies not only in biology, but also in physics, chemistry, social sciences and economics, among others.
Why is it difficult to define Systems Biology?
Systems Biology
Systems biology involves: (1) collection of large sets of experimental data; (2) proposal of mathematical models that might account for at least some significant aspects of this data set; (3) accurate computer solution of the mathematical equations to obtain numerical predictions; and (4) assessment of the quality of the model by comparing numerical simulations with the experimental data.
Systems biology uses the molecular biology and biochemistry of cellular components to understand HOW physiological functions at the cell, tissue and organ levels arise.
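The cycle listed on this slide (data, model, numerical solution, comparison with data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the lecture: the first-order decay model dx/dt = -k*x, the measurement values and the candidate rate constants are all invented for the example.

```python
import math

def simulate_decay(x0, k, times):
    """Closed-form solution of dx/dt = -k*x, i.e. x(t) = x0 * exp(-k*t)."""
    return [x0 * math.exp(-k * t) for t in times]

def rmse(predicted, observed):
    """Root-mean-square error between model predictions and measurements."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(observed))

# Step 1: experimental data (hypothetical values, not real measurements).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [10.0, 6.1, 3.6, 2.3, 1.3]

# Steps 2-3: propose the model and compute predictions for several
# assumed candidate values of the rate constant k.
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
errors = {k: rmse(simulate_decay(10.0, k, times), observed) for k in candidates}

# Step 4: assess model quality by comparing simulation with data.
best_k = min(errors, key=errors.get)
print("best k:", best_k, "RMSE:", round(errors[best_k], 3))
```

A real study would replace the grid of candidate values with a proper parameter estimation routine and would test the fitted model on data not used for fitting.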
Systems Biology: a multidisciplinary field
Fields involved: engineering principles; nonlinear systems analysis; network theory; abstract mathematics (representation theory, group theory and graph theory); nonlinear thermodynamics; physics; chemistry; biology.
History of Systems Biology: two roots
(1) Molecular biology, with its emphasis on individual molecules; (2) formal analysis of the new functional states that arise when multiple molecules interact at the same time.
Westerhoff & Palsson, Nature Biotechnology, 22 (10), 2004
History of Systems Biology for Biofabrication
[Timeline figure, 1990 to 2020, with labels: Genomics; Proteomics; Systems Biology for Biofabrication ("it is crawling").]
Systems Biology vs. traditional cell and molecular biology
Experimental techniques in systems biology are high throughput. Intensive computation is involved from the start in systems biology, in order to organize the data into usable, computable databases. Exploration in traditional biology proceeds by successive cycles of hypothesis formation and testing; data accumulate during these cycles. Systems biology initially gathers data without prior hypothesis formation; hypothesis formation and testing come during post-experiment data analysis and modeling.
Systems Biology vs. traditional cell and molecular biology
Experimentally, the development of microarrays to measure the levels of thousands of mRNAs simultaneously allowed us to see HOW MANY COMPONENTS IN A CELL CHANGE IN RESPONSE TO STIMULI.
Techniques: polymerase chain reaction (PCR); Southern blot, Northern blot and Western blot; microarrays, which provide a first step towards understanding gene function and regulation on a global scale; next-generation sequencing (NGS); mass spectrometry experiments; nuclear magnetic resonance (NMR).
The combination of experiments and computation (IT) showed HOW THE SYSTEM EXHIBITS ROBUST BEHAVIOR in response to many perturbations.
Overview of the organization of life
Nucleus = library; chromosomes = bookshelves; genes = books.
Almost every cell in an organism contains the same libraries and the same sets of books. The books represent all the information (DNA) that every cell in the body needs so that it can grow and carry out its various functions.
Gene: a discrete unit of hereditary information located on the chromosomes and consisting of DNA.
Genome: a set of chromosomes (Winkler, 1920).
Overview of the organization of approaches
Omics: a mostly experimental approach that measures the many individual entities that make up a system; currently an overused and sometimes abused term.
Genomics: study of all (or most/many) genes involved in a certain physiological function, or even in the functioning of an entire organism.
Transcriptomics: study of many mRNAs.
Proteomics: study of many proteins at a time, often less clearly identified with a specific function; most often uses mass spectrometry.
Metabolomics: study of many metabolites in an organism, tissue or cells; most often uses liquid chromatography and mass spectrometry.
Organizing the information from system-wide surveys: databases and tools
Databases of genes and gene expression patterns are freely available in databanks. These large survey-type experiments provide a lot of information and data, and these data need to be organized in a way that allows us to extract knowledge.
Languages: Systems Biology Markup Language (SBML); CellML, an open standard based on XML; Systems Biology Workbench.
Some commonly used databases:
mRNA profiling: GEO (Gene Expression Omnibus)
microRNA: TargetScan
Protein: Swiss-Prot
Genome-wide association studies: dbGaP (Database of Genotypes and Phenotypes)
Disease genes: OMIM (Online Mendelian Inheritance in Man)
Drugs: PharmGKB (Pharmacogenomics Knowledgebase)
Computing from Databases: Networks
1. Statistical correlations between entities in a database.
2. Statistical correlations between entities in different databases.
From these types of statistical analysis we can generate lists (of genes, proteins, microRNAs, etc.) that can be related to a specific physiological or pathophysiological state. From lists, we can build networks. Networks are systems consisting of entities (such as genes or proteins) and relationships (such as direct interactions) between these entities. The entities are called nodes and the relationships are called edges. Networks are computable systems, and the computation can provide knowledge of HOW the system is organized.
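The path from statistical correlation to a list and then to a network can be sketched as follows. This is a minimal illustration under assumptions: the gene names, expression profiles and the 0.9 correlation cutoff are hypothetical, and an edge here means only a strong statistical association, not a confirmed direct interaction.

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical expression profiles over four conditions (not real data).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.1, 1.9, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}

THRESHOLD = 0.9  # assumed cutoff on |r| for drawing an edge

# Edges: pairs of nodes whose profiles are strongly (anti-)correlated.
edges = [
    (a, b)
    for a, b in combinations(profiles, 2)
    if abs(pearson(profiles[a], profiles[b])) >= THRESHOLD
]

# A simple computation on the network: the degree of each node,
# i.e. the number of edges that touch it.
degree = {node: 0 for node in profiles}
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

print(edges)
print(degree)
```

Node degree is one of the simplest quantities that can be computed on a network; it already hints at how the system is organized, for example by exposing highly connected nodes.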
Cell Signaling Network
A cell signaling network takes information from the environment and transduces it into the cell. Cell signaling networks are commonly represented as signed mixed graphs.
[Figure: inflammatory cell showing composite changes in apparent expression over 24 hr, identifying nodes and interactions. Calvano et al., originally published in Nature 437:1032-37.]
Nodes are mostly proteins, but can also be metabolites, lipids, mRNAs, microRNAs or peptides. Interactions designate information flow and can be activation or inhibition; most of these interactions are direct and physical.
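A signed mixed graph can be represented very simply: "signed" means each edge carries a sign (activation or inhibition), and "mixed" means the graph contains both directed and undirected edges. The sketch below is an illustration only; the node names and interactions are hypothetical and do not come from the Calvano et al. network.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    sign: int       # +1 = activation, -1 = inhibition, 0 = unsigned
    directed: bool  # False for e.g. a binding interaction with no direction

# Hypothetical signaling network (names invented for illustration).
edges = [
    Edge("Ligand", "Receptor", +1, True),
    Edge("Receptor", "KinaseA", +1, True),
    Edge("KinaseA", "Phosphatase", -1, True),
    Edge("Phosphatase", "TF", -1, True),
    Edge("KinaseA", "Scaffold", 0, False),
]

def path_sign(path, edges):
    """Net effect along a directed path: the product of the edge signs."""
    lookup = {(e.source, e.target): e.sign for e in edges if e.directed}
    sign = 1
    for a, b in zip(path, path[1:]):
        sign *= lookup[(a, b)]  # raises KeyError if the step is not an edge
    return sign

# Two inhibitions in a row act as a net activation.
print(path_sign(["Ligand", "Receptor", "KinaseA", "Phosphatase", "TF"], edges))
```

Multiplying signs along a path is a common first check of whether a stimulus is expected to increase or decrease a downstream node, although it ignores kinetics and competing paths.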
Cell Signaling Network
There are several network databases, such as: WikiPathways; Kyoto Encyclopedia of Genes and Genomes (KEGG); Cell Signaling Networks Database (CSNDB); BioCarta; Pathway Interaction Database (PID).
Typically, biologists organize their knowledge about pathways in such databases.
Cell Signaling Network
The principal motivation for building pathway databases and information systems is to facilitate qualitative and quantitative modeling of biological systems, beyond the direct capacity of the human brain, using software on powerful computers. A wide range of techniques has been developed that use pathway data of varying detail to answer specific biological questions.
Pathway data integration for systems biology
The diversity among pathway databases makes integration challenging. Differences in data models, data access methods and file formats, together with subtle semantic differences in shared terms, create numerous difficulties for those attempting to gather and analyze data from multiple sources. Thus, integrating the different omics data sets (transcriptomics, proteomics, metabolomics, interactomics, RNomics, fluxomics) and the different areas is the main challenge for the future of systems biology approaches. Cary et al., 2005
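The kind of difficulty described on this slide can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the two record layouts, the synonym table and the effect vocabulary are hypothetical and do not correspond to the actual formats of any pathway database.

```python
# Hypothetical synonym table mapping source-specific names to one identifier.
SYNONYMS = {
    "genex": "GENEX", "gx": "GENEX",
    "geney": "GENEY", "gy": "GENEY",
    "genez": "GENEZ", "gz": "GENEZ",
}

# Hypothetical vocabularies: one source uses words, the other uses arrows.
EFFECTS = {"activation": 1, "->": 1, "inhibition": -1, "-|": -1}

def normalize(source, target, effect):
    """Map one interaction to a common (source, target, sign) schema."""
    return (SYNONYMS[source.lower()], SYNONYMS[target.lower()], EFFECTS[effect])

# Source A: dictionaries with named fields.
source_a = [
    {"from": "GeneX", "to": "GeneY", "type": "activation"},
    {"from": "GeneY", "to": "GeneZ", "type": "inhibition"},
]

# Source B: (source, effect, target) tuples with different names and symbols.
source_b = [
    ("GX", "->", "GY"),   # same interaction as the first record of source A
    ("GZ", "-|", "GX"),
]

# Normalizing into a set merges the sources and removes the duplicate.
merged = {normalize(r["from"], r["to"], r["type"]) for r in source_a}
merged |= {normalize(s, t, e) for s, e, t in source_b}

for interaction in sorted(merged):
    print(interaction)
```

Even in this toy case, merging requires an explicit identifier mapping and an agreed vocabulary for effects, which is where much of the real integration effort goes.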
Top-Down and Bottom-Up Approaches
Top-down: start from a description of the system as a whole to understand system characteristics and capabilities. Top-down models typically give us a big and sometimes comprehensive picture; however, relations are typically identified by correlation, and causal inference is often not possible.
Bottom-up: start with cellular components (e.g., genes, proteins, lipids, sugars) to develop an understanding of HOW functional systems, such as subcellular machines or other parts, are assembled, controlled and operated. Bottom-up models can provide a mechanistic understanding of HOW things work, but as the systems get bigger one can get lost in the detail: a case of not seeing the forest by looking only at the leaves.
Bottom-up modeling
Another approach to understanding how systems are put together is to identify one component at a time, along with the binary relationships between these components. Many signaling pathways have been identified by this approach. Dynamical models based on differential equations can quantitatively estimate how such pathways respond to certain stimuli.
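As an illustration of a dynamical model based on differential equations, the sketch below simulates a two-step signaling cascade with the forward Euler method. The equations, the rate constants and the stimulus values are assumptions chosen for the example, not a model from the lecture: a stimulus S activates component X, and active X activates component Y, with x and y denoting the active fractions.

```python
def simulate_cascade(stimulus, t_end=50.0, dt=0.01,
                     k1=1.0, d1=0.5, k2=1.0, d2=0.5):
    """Integrate a two-step cascade with the forward Euler method.

    Assumed model (active fractions x, y between 0 and 1):
        dx/dt = k1 * S * (1 - x) - d1 * x
        dy/dt = k2 * x * (1 - y) - d2 * y
    Returns the values of x and y at time t_end.
    """
    x = y = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dx = k1 * stimulus * (1.0 - x) - d1 * x
        dy = k2 * x * (1.0 - y) - d2 * y
        x += dx * dt
        y += dy * dt
    return x, y

# Response of the pathway to increasing stimulus strength.
for s in (0.0, 0.1, 1.0):
    x_end, y_end = simulate_cascade(s)
    print(f"S={s}: x={x_end:.3f}, y={y_end:.3f}")
```

For this assumed model the steady state can also be derived by hand, x = k1*S / (k1*S + d1) and y = k2*x / (k2*x + d2), which gives a convenient check on the numerical result. Forward Euler is used only for transparency; a dedicated ODE solver would be the usual choice for larger or stiffer models.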
The scales of organization
Multiscale systems are an intrinsic part of the study of systems biology: molecular components to subcellular machines (transcriptional machinery, cell motility machinery); subcellular machines to cells; cells to tissues and organs; organs to whole organisms.
Increasing levels of organization give rise to new properties and capabilities. Sometimes multiscale also refers to functions on different time and space scales.
Multi-scale of a Biological System
Part of the challenge in model building is to choose the right level of abstraction despite the complexity of biological processes. In other words, we need to work out what aspects of this biological complexity we can ignore and still gain critical insights about a biological phenomenon.
Hierarchical approach
The computational methods used include analytical methods, mathematical modelling and simulation. New theoretical and computational models are needed to make sense of the abundance of data.
[Figure: hierarchical approach to improved biological understanding. Available at http://cfsib.com/science/computational-genomics/]
Biology and Information Technology
Biological networks are analogous to concurrent computer systems in many respects. Concurrent systems are built up using basic concepts such as choice, recursion, modularity, synchronization and mobility. By exploiting these analogies, we can apply the existing tools and formalisms for computing systems to biological networks.
Sources: Bioalgorithms.Info, cnx.org
Cell Biology and Electronics
Many systems biologists view aspects of cell biology and electronics as comparable, but with molecules, ions, proteins and DNA replacing electrons and transistors.
[Figure: correspondence between electronic and biological systems, showing similarities between the components at different levels of complexity. Equinox Graphics/Photo Researchers]
Major challenges and limitations
Measurement of chemical kinetics parameters and molecular concentrations; differences between in vitro and in vivo data; interoperability; data (information); parameter estimation; software; hardware.
Some Conclusions
Systems biology builds on physiology, molecular biology, biochemistry and cell biology to understand HOW molecular components INTERACT with one another to form functional units.
Experiments in systems biology often involve measuring many cellular entities simultaneously.
Systems biology integrates experiments and computational modeling to understand HOW biological systems FUNCTION.
The diversity among databases is a big challenge for us.
The use of computation is a KEY feature of SYSTEMS BIOLOGY.
Systems biology helps several fields, including tissue engineering, biofabrication and bioprinting of tissues and organs.
Systems biology can be done by breaking down each system into modules. Many problems remain unsolved in exactly how to do this, but independent efforts are being developed in most areas and may one day merge.
Today, our group wants to use several biological concepts and computational approaches (IT) to understand some steps of the biofabrication of tissues and organs.
Thank you for your kind attention!