The Role of Ontology in the Era of Big (Military) Data

Distributed Common Ground System - Army (DCGS-A)
The Role of Ontology in the Era of Big (Military) Data
Barry Smith, Director, National Center for Ontological Research

Distributed Development of a Shared Semantic Resource (SSR) in support of US Army's Distributed Common Ground System Standard Cloud (DSC) initiative
with thanks to: Tanya Malyuta, Ron Rudnicki
Background materials: http://x.co/yyxn

Making data (re-)usable through common controlled vocabularies
- Allow multiple databases to be treated as if they were a single data source by eliminating terminological redundancy in the ways data are described: not Person, and Human, and Human Being, and Pn, and HB, but simply: person
- Allow development and use of common tools and techniques, common training, and single validation of data, focused around semantic technology
- Coordinated ontology development and use
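The redundancy point can be illustrated with a minimal sketch. The synonym table, the two toy databases, and the function names below are all hypothetical; only the labels Person, Human, Human Being, Pn, and HB come from the slide.

```python
# Hypothetical synonym table: each local label maps to one preferred term.
PREFERRED_TERM = {
    "person": "person",
    "human": "person",
    "human being": "person",
    "pn": "person",
    "hb": "person",
}


def normalize_label(label: str) -> str:
    """Map a local field label to its preferred term (unknown labels pass through)."""
    key = " ".join(label.lower().split())
    return PREFERRED_TERM.get(key, key)


def merge_sources(*sources):
    """Combine records from several sources under the shared vocabulary."""
    merged = []
    for source in sources:
        for record in source:
            merged.append({normalize_label(k): v for k, v in record.items()})
    return merged


# Two toy databases that describe the same kind of thing with different labels.
db_a = [{"Person": "J. Doe"}]
db_b = [{"HB": "A. Roe"}, {"Human Being": "B. Poe"}]

records = merge_sources(db_a, db_b)
print(records)
```

Once every label resolves to the single term person, the two sources can be queried as if they were one.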

Ontology =def. controlled vocabulary organized as a graph
- nodes in the graph are terms representing types in reality
- each node is associated with a definition and synonyms
- edges in the graph represent well-defined relations between these types
- the graph is structured hierarchically via subtype relations
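The definition above can be sketched as a small data structure. The class and method names are assumptions made for illustration, as are the synonym and the "wheeled tractor is_a tractor" edge; the term definitions are taken from the vehicle example later in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A node: a term with its definition and synonyms."""
    label: str
    definition: str
    synonyms: list = field(default_factory=list)


class Ontology:
    """A controlled vocabulary stored as a graph of terms and typed edges."""

    def __init__(self):
        self.terms = {}
        self.edges = set()  # (subject, relation, object) triples

    def add_term(self, label, definition, synonyms=()):
        self.terms[label] = Term(label, definition, list(synonyms))

    def relate(self, subject, relation, obj):
        # Edges may only connect terms that are already in the vocabulary.
        if subject not in self.terms or obj not in self.terms:
            raise KeyError("both ends of an edge must be defined terms")
        self.edges.add((subject, relation, obj))

    def supertypes(self, label):
        """Follow is_a (subtype) edges upward and return every supertype found."""
        found, stack = [], [label]
        while stack:
            current = stack.pop()
            for subject, relation, obj in self.edges:
                if subject == current and relation == "is_a" and obj not in found:
                    found.append(obj)
                    stack.append(obj)
        return found


onto = Ontology()
onto.add_term("vehicle", "an object used for transporting people or goods")
onto.add_term("tractor", "a vehicle that is used for towing", synonyms=["tow vehicle"])
onto.add_term("wheeled tractor", "a tractor that has a wheeled platform")
onto.relate("tractor", "is_a", "vehicle")
onto.relate("wheeled tractor", "is_a", "tractor")

print(onto.supertypes("wheeled tractor"))
```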

Ontologies
- computer-tractable representations of types in specific areas of reality
- divided into more and less general
- upper = organizing ontologies, provide common architecture and thus promote interoperability
- lower = domain ontologies, provide grounding in reality
- reflecting top-down and bottom-up strategy

Success story in biomedicine
Goal: integration of biological and clinical data
- across different species
- across levels of granularity (organ, organism, cell, molecule)
- across different perspectives (physical, biological, clinical)
- within and across domains (growth, aging, environment, genetic disease, toxicity, ...)

The Open Biomedical Ontologies (OBO) Foundry
[Table: granularity (rows) by relation to time (columns: independent continuant, dependent continuant, occurrent)]
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

Population-level ontologies
[Table: granularity (rows) by relation to time (columns: independent continuant, dependent continuant, occurrent)]
COMPLEX OF ORGANISMS
  Independent continuant: Family, Community, Population
  Dependent continuant: Population Phenotype
  Occurrent: Population Process
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

Environment Ontology
[Table: granularity (rows) by relation to time (columns: independent continuant, dependent continuant, occurrent)]
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

Rationale of OBO Foundry coverage
[Table: granularity (rows) by relation to time (columns: independent continuant, dependent continuant, occurrent)]
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
  Occurrent: Cellular Process (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RNAO, PRO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

OBO Foundry approach extended into other domains
- NIF Standard: Neuroscience Information Framework
- ISF Ontologies: Integrated Semantic Framework
- OGMS and Extensions: Ontology for General Medical Science
- IDO Consortium: Infectious Disease Ontology
- cROP: Common Reference Ontologies for Plants

Modular organization + Extension strategy
top level
- Basic Formal Ontology (BFO)
domain level
- Anatomy Ontology (FMA*, CARO)
- Cell Ontology (CL)
- Cellular Component Ontology (FMA*, GO*)
- Environment Ontology (EnvO)
- Subcellular Anatomy Ontology (SAO)
- Infectious Disease Ontology (IDO*)
- Phenotypic Quality Ontology (PaTO)
- Biological Process Ontology (GO*)
- Sequence Ontology (SO*)
- Protein Ontology (PRO*)
- Molecular Function (GO*)

~100 ontologies using BFO
- US Army Biometrics Ontology
- Brucella Ontology (IDO-BRU)
- eagle-i and VIVO (NCRR)
- Financial Report Ontology (to support SEC through XBRL)
- IDO Infectious Disease Ontology (NIAID)
- Malaria Ontology (IDO-MAL)
- Nanoparticle Ontology (NPO)
- Ontology for Risks Against Patient Safety (RAPS/REMINE)
- Parasite Experiment Ontology (PEO)
- Subcellular Anatomy Ontology (SAO)
- Vaccine Ontology (VO)

Basic Formal Ontology (BFO)
Thursday, April 18, 2013

BFO:Entity
  BFO:Continuant
    BFO:Independent Continuant
    BFO:Dependent Continuant
      BFO:Disposition
  BFO:Occurrent
    BFO:Process

Basic Formal Ontology and Mental Functioning Ontology (MFO)
[Figure: class hierarchy; legend: BFO, MFO]
BFO classes shown: BFO:Entity, BFO:Continuant, BFO:Occurrent, BFO:Independent Continuant, BFO:Dependent Continuant, BFO:Process, BFO:Disposition, BFO:Quality
MFO classes shown: Organism, Mental Functioning Related Anatomical Structure, Behaviour inducing state, Cognitive Representation, Affective Representation, Bodily Process, Mental Process

Emotion Ontology extends MFO
[Figure: class hierarchy; legend: BFO, MFO, MFO-EM]
BFO classes shown: BFO:Entity, BFO:Continuant, BFO:Occurrent, BFO:Independent Continuant, BFO:Dependent Continuant, BFO:Process, BFO:Disposition
MFO and MFO-EM classes shown: Organism, Cognitive Representation, Affective Representation, Appraisal, Emotional Action Tendencies, Subjective Emotional Feeling, Bodily Process, Mental Process, Physiological Response to Emotion Process, Appraisal Process, Emotional Behavioural Process, Emotion Occurrent
Relations shown: inheres_in, is_output_of, has_part, agent_of

Sample from Emotion Ontology: Types of Feeling

The problem of joint / coalition operations
- Intelligence
- Fire Support
- Targeting
- Maneuver & Blue Force Tracking
- Air Operations
- Civil-Military Operations
- Logistics

US DoD Civil Affairs strategy for non-classified information sharing

Ontologies / semantic technology can help to solve this problem
- Intelligence
- Fire Support
- Targeting
- Maneuver & Blue Force Tracking
- Air Operations
- Civil-Military Operations
- Logistics

But if each community produces its own ontology, this will merely create new, semantic siloes
- Intelligence
- Fire Support
- Targeting
- Maneuver & Blue Force Tracking
- Air Operations
- Civil-Military Operations
- Logistics

What we are doing to avoid the problem of semantic siloes
- Distributed Development of a Shared Semantic Resource
- Pilot testing to demonstrate feasibility

Creating the analog of this in the military domain
top level
- Basic Formal Ontology (BFO)
domain level
- Anatomy Ontology (FMA*, CARO)
- Cell Ontology (CL)
- Cellular Component Ontology (FMA*, GO*)
- Environment Ontology (EnvO)
- Subcellular Anatomy Ontology (SAO)
- Infectious Disease Ontology (IDO*)
- Phenotypic Quality Ontology (PaTO)
- Biological Process Ontology (GO*)
- Sequence Ontology (SO*)
- Protein Ontology (PRO*)
- Molecular Function (GO*)

Semantic Enhancement
Annotation (tagging) of source data models using terms from coordinated ontologies
- data remain in their original state (are treated at arm's length)
- tagged using interoperable ontologies created in tandem
- can be as complete as needed, lossless, long-lasting because flexible and responsive
- big bang for buck: measurable benefit even from first small investments
Coordination through shared governance and training
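A minimal sketch of the arm's-length idea follows. It assumes two toy sources and a hand-written annotation table; the source names, column names, and the function values_for_term are hypothetical and are not part of DCGS-A. The annotations are kept in a separate structure, so the source rows are never rewritten.

```python
# Toy source data, left exactly as delivered by each (hypothetical) system.
SOURCE_A = {"name": "personnel_db", "rows": [{"Pn": "J. Doe", "veh": "T33"}]}
SOURCE_B = {"name": "logistics_db", "rows": [{"HumanBeing": "A. Roe"}]}

# Annotations live outside the sources: (source, column) -> ontology term.
ANNOTATIONS = {
    ("personnel_db", "Pn"): "person",
    ("personnel_db", "veh"): "vehicle",
    ("logistics_db", "HumanBeing"): "person",
}


def values_for_term(term, sources, annotations):
    """Collect values from every column tagged with the given ontology term."""
    hits = []
    for source in sources:
        for row in source["rows"]:
            for column, value in row.items():
                if annotations.get((source["name"], column)) == term:
                    hits.append((source["name"], column, value))
    return hits


people = values_for_term("person", [SOURCE_A, SOURCE_B], ANNOTATIONS)
print(people)
```

Adding a new source in this sketch means adding entries to the annotation table only, which is one way to read the claim that benefit is measurable even from small first investments.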

Main challenge: Will it scale?
The problem of scalability turns on the ability to accommodate ever increasing volumes and types of data and numbers of users
- can we preserve coordination (consistency, non-redundancy) as ever more domains become involved?
- can we respond in agile fashion to ever changing bodies of source data?

Strategy for agile ontology creation
- Identify or create carefully validated general purpose plug-and-play reference ontology modules for principal domains
- Develop a method whereby these reference ontologies can be extended very easily to cope with specific, local data through creation of application ontologies

Reference Ontology
- vehicle =def. an object used for transporting people or goods
- tractor =def. a vehicle that is used for towing
- crane =def. a vehicle that is used for lifting and moving heavy objects
- vehicle platform =def. means of providing mobility to a vehicle
- wheeled platform =def. a vehicle platform that provides mobility through the use of wheels
- tracked platform =def. a vehicle platform that provides mobility through the use of continuous tracks
Application Ontology
- artillery vehicle =def. vehicle designed for the transport of one or more artillery weapons
- wheeled tractor =def. a tractor that has a wheeled platform
- Russian wheeled tractor type T33 =def. a wheeled tractor of type T33 manufactured in Russia
- Ukrainian wheeled tractor type T33 =def. a wheeled tractor of type T33 manufactured in Ukraine
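The split between reference and application terms can be sketched as two dictionaries. The definitions are the ones given above; the parent links are an assumption read off the opening words of each definition (for example, "a vehicle that ..." is taken to mean a subtype of vehicle), and the helper names are hypothetical.

```python
# Reference ontology: general, validated terms (parent links are assumed).
REFERENCE = {
    "vehicle": {"parent": None, "definition": "an object used for transporting people or goods"},
    "tractor": {"parent": "vehicle", "definition": "a vehicle that is used for towing"},
    "vehicle platform": {"parent": None, "definition": "means of providing mobility to a vehicle"},
    "wheeled platform": {
        "parent": "vehicle platform",
        "definition": "a vehicle platform that provides mobility through the use of wheels",
    },
}

# Application ontology: local terms defined by extending reference terms.
APPLICATION = {
    "wheeled tractor": {"parent": "tractor", "definition": "a tractor that has a wheeled platform"},
    "Russian wheeled tractor type T33": {
        "parent": "wheeled tractor",
        "definition": "a wheeled tractor of type T33 manufactured in Russia",
    },
}


def dangling_terms(reference, application):
    """Return application terms whose parent is defined in neither ontology."""
    known = set(reference) | set(application)
    return sorted(t for t, entry in application.items() if entry["parent"] not in known)


def lineage(term, reference, application):
    """Walk parent links from an application term up to a top-level reference term."""
    merged = {**reference, **application}
    chain = [term]
    while merged[chain[-1]]["parent"] is not None:
        chain.append(merged[chain[-1]]["parent"])
    return chain


print(dangling_terms(REFERENCE, APPLICATION))
print(lineage("Russian wheeled tractor type T33", REFERENCE, APPLICATION))
```

A check like dangling_terms is one simple way to confirm that every local term is anchored in the reference module before the application ontology is used for annotation.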

Basic Formal Ontology (BFO)
- Extended Relation Ontology
- Agent Ontology
- Artifact Ontology
- Event Ontology
- Geospatial Ontology
- Information Entity Ontology
- Quality Ontology
- Time Ontology

http://milportal.org

An example of agile application ontology development: The Bioweapons Ontology (BWO)

Kinds of chemical and biological weapons
Chemical
- Nerve agents (sarin gas)
- Blister agents (mustard gas)
- Blood agents (cyanide gas)
Biological
- Infectious agents: BWO(I)
- Toxic agents (botulinum toxin, ricin): BWO(T)

We focus here on BWO(I): Infectious agents
- Bacterial (anthrax, bubonic plague, tularemia, brucellosis, cholera, ...)
- Viral (Ebola, Marburg, ...)

Examples of ontology terms

BFO                     IDO                        StaphIDO
Independent Continuant  Infectious disorder        Staph. aureus disorder
Dependent Continuant    Infectious disease         MRSA
                        Protective resistance      Methicillin resistance
Occurrent               Infectious disease course  MRSA course

Infectious Disease Ontology (IDO)
with thanks to Lindsay Cowell (University of Texas SW Medical Center) and Albert Goldfain (Blue Highway, Inc.)
IDO Core (Reference Ontology)
- General terms in the ID domain.
IDO Extensions (Application Ontologies)
- Disease-, host-, pathogen-specific.
- Developed by subject matter experts.
The hub-and-spokes strategy ensures that the logical content of IDO Core is automatically inherited by the IDO Extensions.
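The inheritance claim can be illustrated with a small sketch using Python's collections.ChainMap. This is only an analogy for the hub-and-spokes idea, not how IDO is implemented; the dictionary names are hypothetical and the definitions are placeholders rather than IDO's actual text.

```python
from collections import ChainMap

# Hub: core terms (labels from the talk; definitions are placeholders).
ido_core = {
    "pathogen": "core definition (placeholder)",
    "infection": "core definition (placeholder)",
}

# Spoke: an extension adds its own terms and falls back to the hub for the rest.
staph_terms = {"MRSA": "extension definition (placeholder)"}
staph_extension = ChainMap(staph_terms, ido_core)

# Core content is visible through the extension without being copied.
print("pathogen" in staph_extension)

# A later addition to the hub shows up in every spoke automatically.
ido_core["colonization"] = "core definition (placeholder)"
print("colonization" in staph_extension)

# Terms added through the extension stay local to it and do not alter the hub.
staph_extension["Methicillin resistance"] = "extension definition (placeholder)"
print("Methicillin resistance" in ido_core)
```

The last check reflects the direction of the arrangement: spokes depend on the hub, while the hub stays independent of any single extension.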

IDO Core
- Contains general terms in the ID domain, e.g., colonization, pathogen, infection
- A contract between IDO extension ontologies and the datasets that use them
- Intended to represent information along several dimensions:
  - biological scale (gene, cell, organ, organism, population)
  - discipline (clinical, immunological, microbiological)
  - organisms involved (host, pathogen, and vector types)

IDO Extensions
- IDO Brucellosis
- IDO Dengue Fever
- IDO Influenza
- IDO Malaria
- IDO Staphylococcus Aureus Bacteremia
- IDO Vector Surveillance and Management
- IDO Plant
- VO Vaccine Ontology
- BWO(I) Bioweapons Ontology (Infectious Agents)

How IDO evolves: the case of Staph. aureus
[Figure: IDOCore as hub, with extension ontologies arranged around it]
HUB and SPOKES: Domain ontologies
SEMI-LATTICE: By subject matter experts in different communities of interest
Ontologies shown: IDOCore, IDOMAL, IDOFLU, IDOHIV, IDOSa, IDOStrep, IDORatSa, IDORatStrep, IDOHumanSa, IDOHumanStrep, IDOMRSa, IDOAntibioticResistant, IDOHumanBacterial

BWO:disease by infectious agent = def. a disease that is the consequence of the presence of pathogenic microbial agents, including pathogenic viruses, pathogenic bacteria, fungi, protozoa, multicellular parasites, and aberrant proteins known as prions

Strategy used to build BWO(I)
with thanks to Lindsay Cowell and Oliver He (Michigan)
1. Start with a glossary such as: http://www.emedicinehealth.com/biological_warfare/
2. Select corresponding terms from IDO Core and related ontologies, such as the ChEBI chemistry ontology, that are needed to describe bioweapons
3. All ontology terms keep their original definitions and IDs.
4. The result is a spreadsheet

5. Where glossary terms have no ontology equivalent, create BWO ontology terms and definitions as needed

6. Use the Ontofox tool (http://ontofox.hegroup.org/) to create the first version of the BWO(I) application ontology
7. Use BWO(I) in annotations, and where gaps are identified create extension terms, for instance:
   - weaponized brucella
   - aerosol anthrax
   - smallpox incubation period
This establishes a virtuous cycle between ontology development and use in annotations
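Steps 1 to 5 of this strategy can be sketched in a few lines. The glossary entries, the term IDs, and the function names below are placeholders invented for illustration; the sketch does not call Ontofox and makes no claim about its input format.

```python
import csv
import io

# Hypothetical lookup of existing ontology terms: label -> (ontology, ID).
# The IDs are placeholders, not real IDO identifiers.
EXISTING_TERMS = {
    "pathogen": ("IDO", "IDO:placeholder-1"),
    "infection": ("IDO", "IDO:placeholder-2"),
}


def build_term_table(glossary, existing_terms, new_prefix="BWO"):
    """Reuse an existing term (keeping its ID) where one matches; otherwise mint a new one."""
    rows, counter = [], 0
    for label in glossary:
        key = label.lower()
        if key in existing_terms:
            ontology, term_id = existing_terms[key]
            rows.append((label, ontology, term_id, "reused"))
        else:
            counter += 1
            rows.append((label, new_prefix, f"{new_prefix}:new-{counter:04d}", "created"))
    return rows


def to_spreadsheet(rows):
    """Serialize the term table as CSV text, standing in for the spreadsheet of step 4."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["glossary term", "ontology", "id", "status"])
    writer.writerows(rows)
    return buffer.getvalue()


glossary = ["Pathogen", "weaponized brucella", "Infection"]
rows = build_term_table(glossary, EXISTING_TERMS)
print(to_spreadsheet(rows))
```

Rows marked "created" correspond to the gaps that step 5 fills with new BWO terms; rows marked "reused" keep the ID of the ontology they came from, as step 3 requires.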

Potential uses of BWO
- semantic enhancement of bioweapons intelligence data: results will be automatically interoperable with relevant bioinformatics and public health IT tools for dealing with infections, epidemics, vaccines, forensics, ...
- to annotate research literature and research data on bioweapons
- to create computable definitions to substitute for definitions in free text glossaries

Why do people think they need lexicons?
- Training
- Compiling lessons learned
- Compiling results of testing, e.g. of proposed new doctrine
- Collective inferencing
- Official reporting
- Doctrinal development
- Standard operating procedures
- Sharing of data
People need to (ensure that they) understand each other