Terminologies and Ontologies


Page 1

Terminologies and Ontologies
Iwei Yeh, yeh@smi.stanford.edu, 04/16/2002

Outline
1. Why do we need terminologies and ontologies?
2. Controlled Terminologies
   Enzyme Classification
   Gene Ontology
3. Ontologies
   Definition
   Representations
4. Example Ontologies
   RiboWeb
   EcoCyc
   TAMBIS
   Peleg Process Model

Communication and Computation
Communication (conveying the meaning we intend): there is a lot of biological and medical knowledge to be communicated.
Computation: making use of background knowledge in computational tasks
   organizing information meaningfully to help find significant relationships
   creating data structures for powerful algorithms

Communication
In order to communicate effectively we need:
   a common language
   a common understanding
Example: Metabolic Pathways
   language: names of products, enzymes, substrates and pathways
   knowledge: what is a reaction, how do enzymes and substrates participate, what makes a pathway

Terminologies and Vocabularies
Terminology: set of words, often hierarchical
Vocabulary: terminology with definitions, semantics
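
The distinction between a terminology and a vocabulary can be sketched as a data structure. This is only an illustration: the class names `Term` and `VocabularyEntry` and the example terms are assumptions, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Term:
    """Terminology entry: a word plus an optional parent (the hierarchy)."""
    name: str
    parent: Optional["Term"] = None

    def ancestors(self):
        """Return the names of all broader terms, nearest first."""
        names, node = [], self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names

@dataclass
class VocabularyEntry(Term):
    """Vocabulary entry: a term that also carries a definition."""
    definition: str = ""

# Illustrative terms only (hypothetical, not from the slides).
enzyme = Term("enzyme")
transferase = VocabularyEntry(
    "transferase",
    enzyme,
    definition="An enzyme that transfers a functional group between molecules.",
)
```

Here `transferase.ancestors()` walks the hierarchy, while `transferase.definition` carries the semantics that a bare terminology lacks.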

Page 2

Enzyme Classification (e.g. EC 2.8.3.X)
http://prowl.rockefeller.edu/enzymes/enzymes.htm

Enzyme Classification
First controlled terminology for enzyme function.
Maps to function; one gene can have multiple functions.
Only covers metabolic enzymes, not other functionality.

Other Classification Systems
SwissPROT
EGAD
GenProtEC
TIGR role
InterPRO

Gene Ontology (http://www.geneontology.org/)
Used to classify function in human genome draft.
A controlled listing of three types of function:
   Molecular Function
   Biological Process
   Cellular Component

Molecular Function
<molecular_function ; GO:0003674
%anti-toxin ; GO:0015643
%lipoprotein anti-toxin ; GO:0015644
%anticoagulant ; GO:0008435
%antifreeze ; GO:0016172
%ice nucleation inhibitor ; GO:0016173
%antioxidant ; GO:0016209
%glutathione reductase (NADPH) ; GO:0004362 ; EC:1.6.4.2 % flavin-containing electron transporter ; GO:0015933 % oxidoreductase\, acting on NADH or NADPH\, disulfide as acceptor ; GO:0016654
%thioredoxin reductase (NADPH) ; GO:0004791 ; EC:1.6.4.5 % flavin-containing electron transporter ; GO:0015933 % oxidoreductase\, acting on NADH or NADPH\, disulfide as acceptor ; GO:0016654

Biological Process
<biological_process ; GO:0008150
%behavior ; GO:0007610
%adult behavior (sensu Insecta) ; GO:0008044
%adult feeding behavior ; GO:0008343 % feeding behavior ; GO:0007631
%response to cocaine ; GO:0008341
%chemosensory behavior ; GO:0007635
%chemosensory jump behavior ; GO:0007636
%proboscis extension reflex ; GO:0007637
%feeding behavior ; GO:0007631
%adult feeding behavior ; GO:0008343 % adult behavior (sensu Insecta) ; GO:0008044
%larval feeding behavior ; GO:0008342 % larval behavior (sensu Insecta) ; GO:0007627
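
The GO listings above can be read mechanically. The following is a minimal sketch, assuming the conventions of the legacy GO flat-file format: a leading `%` marks an is-a relation and `<` a part-of relation, ` ; ` separates the term name, GO identifier and cross-references such as EC numbers, `\,` is an escaped comma, and further ` % ` or ` < ` chunks on the same line name additional parents. The function name `parse_go_line` and the returned dictionary layout are hypothetical.

```python
import re

RELATIONS = {"%": "is_a", "<": "part_of"}

def _split_fields(chunk):
    """Split 'name ; GO:id ; xref...' into its fields, unescaping commas."""
    fields = [f.strip() for f in chunk.split(" ; ")]
    name = fields[0].replace("\\,", ",")
    return name, fields[1], fields[2:]

def parse_go_line(line):
    """Parse one hierarchy line into a dictionary (assumed layout)."""
    line = line.strip()
    relation = RELATIONS[line[0]]
    # Additional parents follow as ' % ...' or ' < ...' chunks.
    parts = re.split(r"\s([%<])\s", line[1:])
    name, go_id, xrefs = _split_fields(parts[0])
    other_parents = []
    for symbol, chunk in zip(parts[1::2], parts[2::2]):
        parent_name, parent_id, _ = _split_fields(chunk)
        other_parents.append((RELATIONS[symbol], parent_name, parent_id))
    return {
        "relation": relation,
        "name": name,
        "id": go_id,
        "xrefs": xrefs,
        "other_parents": other_parents,
    }

example = parse_go_line(
    r"%glutathione reductase (NADPH) ; GO:0004362 ; EC:1.6.4.2 "
    r"% flavin-containing electron transporter ; GO:0015933 "
    r"% oxidoreductase\, acting on NADH or NADPH\, disulfide as acceptor ; GO:0016654"
)
```

The `xrefs` field is where the mapping from a GO term to the Enzyme Classification shows up, here as `EC:1.6.4.2`.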

Page 3

Cellular Component
<cellular_component ; GO:0005575
%cell ; GO:0005623
%ascus ; GO:0005627
<ascus lipid droplet ; GO:0005633 % lipid particle ; GO:0005811
<prospore membrane ; GO:0005628 % membrane ; GO:0016020
<spore wall (sensu Fungi) ; GO:0005619 % cell wall (sensu Fungi) ; GO:0009277 % extracellular ; GO:0005576
<chitosan layer of spore wall ; GO:0005631
<dityrosine layer of spore wall ; GO:0005630
<inner layer of spore wall ; GO:0005632

Current genome annotations
http://www.geneontology.org/#currentannot

Annotation of Human Genome

UMLS Metathesaurus and MeSH
Unified Medical Language System
   contains biomedical concepts formed from several different medical vocabularies
Medical Subject Headings (MeSH)
   15 different axes
   indexers at the NLM assign MeSH headings to articles

Page 4

What is an ontology?
A formal specification of the conceptualization of some domain of discourse.
What are the concepts and how do they interact?
Concepts: represent a set of objects with common properties
   example: enzymatic reaction
      catalyzed by enzyme
      acts on substrates
      characterized by rate constant

Methods of Representing Ontologies
Frames (Minsky, 1987): a concept is represented by a frame
   Frame: Enzymatic Reaction
      Slot: Enzyme of Type Protein
      Slot: Substrates of Type Chemical
      Slot: Rate constant of Type Number
   Hierarchical structure, attributes and relationships.

Description Logics: using a subset of first-order logic, we represent concepts with sets of axioms
   a, b are variables which can be T or F
   (and) $a \land b$
   (or) $a \lor b$
   (equivalence) $a \Leftrightarrow b$
   ( a b)
   (implies) $a \Rightarrow b$
   Satisfiability: sometimes true
   Validity: always true
   Entailment: $KB \models \alpha$, i.e. $\alpha$ is true in all possible worlds of KB
   Predicates where arguments are constants:
      Female(Joan)
      Mother(Joan, Paul)
      $\mathrm{Mother}(A, B) \Rightarrow \mathrm{Female}(A)$ : Valid
   Overall goal: given a KB (set of axioms), is a given axiom entailed? Satisfiable?
   Can create a model of a system and see if it is consistent (satisfiable?)

Why use frame-based ontologies?
These are data structures.
Data structures exist to facilitate the creation of algorithms that are effective.
Ontologies should enable powerful applications to be built.
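
Satisfiability, validity and entailment can be made concrete with a brute-force truth-table check. This is a sketch for plain propositional logic, not a description logic reasoner; the function names and the example knowledge base are assumptions made for illustration.

```python
from itertools import product

def worlds(variables):
    """Enumerate every truth assignment (possible world) over the variables."""
    for values in product([True, False], repeat=len(variables)):
        yield dict(zip(variables, values))

def satisfiable(formula, variables):
    """Sometimes true: the formula holds in at least one world."""
    return any(formula(w) for w in worlds(variables))

def valid(formula, variables):
    """Always true: the formula holds in every world."""
    return all(formula(w) for w in worlds(variables))

def entails(kb, alpha, variables):
    """KB |= alpha: alpha holds in every world where all KB axioms hold."""
    return all(alpha(w) for w in worlds(variables) if all(ax(w) for ax in kb))

def implies(p, q):
    return (not p) or q

VARS = ["a", "b"]
a_and_b = lambda w: w["a"] and w["b"]
excluded_middle = lambda w: w["a"] or not w["a"]
# Hypothetical knowledge base: a implies b, and a holds.
KB = [lambda w: implies(w["a"], w["b"]), lambda w: w["a"]]
```

With these definitions, `a_and_b` is satisfiable but not valid, `excluded_middle` is valid, and `KB` entails `b`. Enumerating worlds is exponential in the number of variables, which is one reason well-formulated decision procedures matter.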

Page 5

Why use description logic ontologies?
Expressive.
Well formulated solutions for decision problems.
Compositional and Dynamic.

Raw data in this form not useful for computers.
Large data set summarized graphically.

Powers & Noller: 400 observations on RNA, sufficient to compute a low-resolution model.
   A-priori-target
   Footprinting-Agent
   Protecting-Moiety
   Footprinting-Strength
   Protected-part

Page 6

Powers Paper, Sample Data
   A-priori-target: Intact 30S subunits
   Footprinting-agent: Hydroxyl radical
   Protecting-moiety: S2
   Protected-part: Base 698
   Footprinting-strength: Strong
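
The sample observation above maps naturally onto a frame with named slots. The sketch below encodes it as a Python dataclass; the class name, the field spellings and the helper `strong_protections` are assumptions for illustration and do not reflect how RiboWeb itself stores the data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FootprintingObservation:
    """One frame instance: each field is a slot from the sample data."""
    a_priori_target: str
    footprinting_agent: str
    protecting_moiety: str
    protected_part: str
    footprinting_strength: str

# The single observation listed on the slide.
obs = FootprintingObservation(
    a_priori_target="Intact 30S subunits",
    footprinting_agent="Hydroxyl radical",
    protecting_moiety="S2",
    protected_part="Base 698",
    footprinting_strength="Strong",
)

def strong_protections(observations, moiety):
    """List the parts strongly protected by the given moiety."""
    return [
        o.protected_part
        for o in observations
        if o.protecting_moiety == moiety and o.footprinting_strength == "Strong"
    ]
```

Once observations are structured like this, a query such as `strong_protections([obs], "S2")` becomes a simple filter rather than a reading of raw figures.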

Page 7

Page 8

EcoCyc (Karp et al.)
http://ecocyc.pangeasystems.com/ecocyc/ecocyc.html
http://www.sciencemag.org/cgi/reprint/293/5537/2040.pdf
An ontology-based environment for representing metabolic compounds, reactions and pathways.
Careful representation of these allows for:
   Graphical display
   Query
   Building new databases by analogy

A Reaction in EcoCyc

EcoCyc Ontology

TAMBIS: Transparent Access to Multiple Biological Information Sources
Bioinformatics 16(2), p 184-5, 2000

Page 9

TAMBIS Integration Tool

Biological Ontology for TAMBIS (Bioinformatics 15 (6) p 510, 1999)

Example Concepts from TAMBIS

Ontologies for Biological Processes and Functions
Not just controlled terminology
Need to represent sequential aspects
Need to represent temporal aspects
Need to represent hierarchical decomposition of processes
Need to be able to attach quantitative information

Page 10

Modeling biological processes using workflow and Petri net models
Mor Peleg, Iwei Yeh, and Russ Altman
BMI 210A

Components of a biological model
   Biological process
   Molecular function
      Proteolysis
      Transport
      Gene regulation
   Gene products
   Cellular location
   Sequence components
   DB entries

3 aspects of a biological system
   Static-structural (concept model)
      Biological concepts (e.g., protein, cellular component)
      Relationships
         gene, gene product
         protein complex, protein sub-units
         protein, cellular location
   Functional
      Biological reactions
      Diagram labels: substrate, catalyst, Reaction, Inhibitor, product
   Dynamic
      F1 is before F2
      F4 is done in parallel to {1,2,3}
      If we knock out or inhibit F4, can we get from A to B?
      Diagram labels: State A, AND, Molecular function 1, Molecular function 2, Molecular function 3, Molecular function 4, State B

Desired properties of a biological processes model
   3 aspects of a biological system
   Biological concept model
   Graphical representation
   Hierarchical model (complexity)
   Formal semantics to verify correctness
   Biological queries (reasoning)
      Proteins that have the same (diff.) functional roles, scoped to cellular location
      Reactions that have same substrates, products, inhibitors
      All processes that are a kind of Adhesion

Mapping business workflow to biological systems
   Business Workflow model
      Process model
      Organizational model
         Organizational Unit (Medical School)
         member
         Role
         Human (Dean)
   Biological Process Model
      Process model
      Structural model
         Biomolecular complex (Replication complex)
         member
         Biopolymer (Helicase)
         Role (DNA unwinding)
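
One of the biological queries listed above, finding reactions that share substrates, products and inhibitors, can be sketched once reactions are stored as structured records. The record layout, the function name `group_by_participants` and the placeholder reactions R1 to R3 are all hypothetical; they only show the shape of such a query.

```python
from collections import defaultdict

def group_by_participants(reactions):
    """Group reaction names that share substrates, products and inhibitors."""
    groups = defaultdict(list)
    for name, reaction in reactions.items():
        key = (
            frozenset(reaction["substrates"]),
            frozenset(reaction["products"]),
            frozenset(reaction.get("inhibitors", ())),
        )
        groups[key].append(name)
    # Keep only the groups where more than one reaction matches.
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Placeholder data (not real reactions).
REACTIONS = {
    "R1": {"substrates": ["A", "B"], "products": ["C"], "inhibitors": ["X"]},
    "R2": {"substrates": ["B", "A"], "products": ["C"], "inhibitors": ["X"]},
    "R3": {"substrates": ["A"], "products": ["C"]},
}

matches = group_by_participants(REACTIONS)
```

Using sets as keys makes the comparison independent of the order in which participants are listed.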

Page 11

Graphic dynamic & functional model

Mapping to Petri Nets
   Explicitly represent states
   Verification of properties: liveness, safety, soundness
   Reasoning on dynamics: without t1, can we reach P4 from P1?
   Diagram labels: P1, t1, P2, t2, P3, AND, t4, t3, P4, AND

Queries:
   All processes inhibited by neuraminidase
   All adhesion activities that occur in erythrocytic plasma membrane

Conclusions
The development of ontologies is linked to the design of associated applications.
Ontologies are being used for:
   Controlled terminologies
   Standard representation of data
   Modeling biological processes
They are a stepping stone to full quantitative mathematical models.

Thanks
yeh@smi.stanford.edu
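
The reachability question "without t1, can we reach P4 from P1?" can be answered by searching the markings of a Petri net. The sketch below is a minimal version under stated assumptions: each place holds at most one token, and the arcs in `NET` are hypothetical, because the diagram's actual connections are not recoverable from the slide text. Only the place and transition names are taken from the slide.

```python
from collections import deque

# Assumed topology: transition name -> (input places, output places).
NET = {
    "t1": (frozenset({"P1"}), frozenset({"P2"})),
    "t2": (frozenset({"P2"}), frozenset({"P3"})),
    "t3": (frozenset({"P3"}), frozenset({"P4"})),
    "t4": (frozenset({"P1"}), frozenset({"P3"})),
}

def reachable(net, start, goal, disabled=()):
    """Breadth-first search over markings; True if the goal place gets a token."""
    start = frozenset(start)
    seen, queue = {start}, deque([start])
    while queue:
        marking = queue.popleft()
        if goal in marking:
            return True
        for name, (inputs, outputs) in net.items():
            # A transition fires only if it is not knocked out
            # and every one of its input places is marked.
            if name in disabled or not inputs <= marking:
                continue
            successor = (marking - inputs) | outputs
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return False
```

Under the assumed arcs, `reachable(NET, {"P1"}, "P4", disabled={"t1"})` is true because t4 bypasses t1, while disabling t3 makes P4 unreachable. Knocking out or inhibiting a molecular function corresponds to adding its transition to `disabled`.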