Ontologies for nanotechnology Colin Batchelor batchelorc@rsc.org 2009-03-25
Ontologies for nanotechnology: Outline
- What is an ontology and how does it work?
- What do people use ontologies for?
- How are we using and developing ontologies at the RSC?
- InChIs for nanoparticles
What is ontology (philosophy)?
Barry Smith: "Ontology as a branch of philosophy is the science of what is, of the kinds and structures of objects, properties, events, processes and relations in every area of reality."
What is an ontology (computing)?
Ontologies are controlled vocabularies enriched with quantified relations between terms.
Examples of relations:
- Methane is an alkane (ChEBI)
- All GC-MS experiments have participant some mass spectrometer (Chemical Methods Ontology)
- All eukaryotic mature RNAs are derived from some eukaryotic primary RNA (Sequence Ontology)
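As a rough illustration of what "quantified relations between terms" means, the three examples above can be written as subject/relation/target records and read back with an "every ... some ..." quantifier. This is only a sketch: the `Axiom` type, the `render` function and the relation identifiers `has_participant` and `derived_from` are names chosen here for illustration, not taken from ChEBI, the Chemical Methods Ontology or the Sequence Ontology.

```python
from typing import NamedTuple


class Axiom(NamedTuple):
    """One quantified relation between two terms (illustrative type)."""
    subject: str
    relation: str
    target: str


# The three example relations from the slide, as plain records.
AXIOMS = [
    Axiom("methane", "is_a", "alkane"),
    Axiom("GC-MS experiment", "has_participant", "mass spectrometer"),
    Axiom("eukaryotic mature RNA", "derived_from", "eukaryotic primary RNA"),
]


def render(axiom: Axiom) -> str:
    """Spell out the implicit quantifiers of an axiom."""
    if axiom.relation == "is_a":
        # Subsumption: every instance of the subject is an instance of the target.
        return f"every {axiom.subject} is_a {axiom.target}"
    # Other relations are read as all-some: every subject relates to some target.
    return f"every {axiom.subject} {axiom.relation} some {axiom.target}"


if __name__ == "__main__":
    for axiom in AXIOMS:
        print(render(axiom))
```

The point of the all-some reading is that the statement is about every GC-MS experiment, but only about some mass spectrometer: it does not claim that every mass spectrometer takes part in a GC-MS experiment.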
What do people use ontologies for?
- Manufacturing requirements
- Software testing
- Microarray data and genome annotation (http://geneontology.org/)
- Semantic Web (http://www.w3.org/2001/sw/hcls/)
- Annotating and searching journal articles (http://www.projectprospect.org/)
Motivations for using ontologies for markup
- Definitions (human- and machine-readable)
- Rich set of synonyms
- Backoffs (e.g. ChEBI terms instead of InChIs)
- Semantic Web
Enhanced HTML
RSS
- 2007: First ever journal article RSS feeds containing OBO terms
- 2009: RSS feed bringing together all Prospected articles from across the RSC
RSS: the gory details
<item rdf:about="http://xlink.rsc.org/?doi=b716356h&amp;rss=1">
  <title>[ title ]</title>
  <link>http://xlink.rsc.org/?doi=b716356h&amp;rss=1</link>
  <description>[ blah ]</description>
  <content:encoded>[ human-readable stuff ]</content:encoded>
  [ Dublin Core stuff ]
  <content:items>
    <rdf:Bag>
      <rdf:li>
        <content:item rdf:about="info:inchi/InChI=1/C22H22NO4/c1-13-16-11-21(26-4)20(25-3)10-15(16)8-18-17-12-22(27-5)19(24-2)9-14(17)6-7-23(13)18/h6-12H,1-5H3/q+1"/>
      </rdf:li>
      <rdf:li>
        <content:item rdf:about="http://purl.org/obo/owl/so#so:0000028"/>
      </rdf:li>
    </rdf:Bag>
  </content:items>
</item>
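A consumer of such a feed mainly needs the `rdf:about` values of the `content:item` elements. The sketch below shows one way to pull them out with Python's standard library. It assumes the feed is RSS 1.0 with the usual RDF and content-module namespace declarations on the root element; `SAMPLE` is a trimmed stand-in for a real item and `annotation_uris` is a hypothetical helper name, not part of any RSC tooling.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and the RSS 1.0 content module.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CONTENT = "http://purl.org/rss/1.0/modules/content/"

# Trimmed, well-formed stand-in for one annotated feed item (assumption:
# a real feed declares these namespaces on its rdf:RDF root element).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <item rdf:about="http://xlink.rsc.org/?doi=b716356h&amp;rss=1">
    <title>[ title ]</title>
    <content:items>
      <rdf:Bag>
        <rdf:li>
          <content:item rdf:about="http://purl.org/obo/owl/so#so:0000028"/>
        </rdf:li>
      </rdf:Bag>
    </content:items>
  </item>
</rdf:RDF>"""


def annotation_uris(feed_xml: str) -> list:
    """Return the rdf:about URI of every content:item in the feed."""
    root = ET.fromstring(feed_xml)
    about = f"{{{RDF}}}about"
    # The plain RSS <item> lives in a different namespace, so only
    # content:item elements match here.
    return [el.get(about) for el in root.iter(f"{{{CONTENT}}}item")]


if __name__ == "__main__":
    print(annotation_uris(SAMPLE))
```

Each returned URI is either an identifier for a compound (the `info:inchi/...` form) or an ontology term, so the same loop serves both kinds of annotation.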
How do we develop new ontologies?
Identify scope in terms of category and granularity.
- Category: is the term an object, a property of an object, or a process? (http://ifomis.org/bfo)
- Granularity: molecular or bulk? Cellular or organismal? Organism or species? (Tom Bittner)
New ontologies
- Name reaction ontology (RXNO)
  Category: processes
  Granularity: molecular
- Chemical methods ontology (CMO)
  Categories: processes (methods) + objects (instruments)
  Granularity: laboratory-scale
http://www.rsc.org/ontologies
Existing ontologies for nanotechnology
- Objects
  NCI Cancer Thesaurus (subset)
  ChEBI (http://www.ebi.ac.uk/chebi)
- Properties of objects
  ChEBI
- Processes?
The problem
- Ontologies are finite dictionary-like lists of entities and the relations between them.
- Reinvention of the wheel
- We don't know what kinds of nanoparticles people will make in future.
Example nanoparticle terms from ChEBI
- Quantum dot (CHEBI:50853)
- Cadmium selenide nanoparticle (CHEBI:50835)
- Armchair nanotube (CHEBI:50798)
These all have textual definitions and parents. But can we do for nanoparticles what has been done for small molecules with the layered InChI?
Formal definitions: step one
- Quantum dot (CHEBI:50853) = nanoparticle that is semiconducting
- Cadmium selenide nanoparticle (CHEBI:50835) = nanoparticle that is made of cadmium selenide
- Armchair nanotube (CHEBI:50798) = nanoparticle that has tubular morphology, is made of carbon, and is conducting
Formal relations
Properties need to be divided into two kinds:
- Qualities are always there
- Dispositions are only realized sometimes (and are always associated with a process)
Relations:
- is_a
- has_part_only
- has_quality
- has_disposition
Formal definitions: step two
- Quantum dot (CHEBI:50853) is_a nanoparticle (CHEBI:50803) that has_disposition semiconducting
- Cadmium selenide nanoparticle (CHEBI:50835) is_a nanoparticle (CHEBI:50803) that has_part_only cadmium selenide
- Armchair nanotube (CHEBI:50798) is_a nanoparticle (CHEBI:50803) that has_part_only carbon (CHEBI:33415) and has_quality single-walled and has_quality tubular morphology and has_disposition conducting
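The step-two definitions have a regular shape: a parent term plus a set of (relation, filler) pairs. The sketch below encodes them that way and classifies a described particle by set inclusion. It is a deliberate simplification, not how an OWL reasoner works: it assumes each definition is both necessary and sufficient, and the `Definition` type and `classify` function are illustrative names introduced here.

```python
from dataclasses import dataclass

NANOPARTICLE = "nanoparticle (CHEBI:50803)"


@dataclass(frozen=True)
class Definition:
    """A term defined as a parent (genus) plus distinguishing properties."""
    genus: str
    differentiae: frozenset  # set of (relation, filler) pairs


# The three step-two definitions, transcribed as data.
DEFINITIONS = {
    "quantum dot (CHEBI:50853)": Definition(
        NANOPARTICLE,
        frozenset({("has_disposition", "semiconducting")}),
    ),
    "cadmium selenide nanoparticle (CHEBI:50835)": Definition(
        NANOPARTICLE,
        frozenset({("has_part_only", "cadmium selenide")}),
    ),
    "armchair nanotube (CHEBI:50798)": Definition(
        NANOPARTICLE,
        frozenset({
            ("has_part_only", "carbon (CHEBI:33415)"),
            ("has_quality", "single-walled"),
            ("has_quality", "tubular morphology"),
            ("has_disposition", "conducting"),
        }),
    ),
}


def classify(genus, properties):
    """Return, sorted by name, every defined term whose conditions are all met."""
    props = frozenset(properties)
    return sorted(
        name
        for name, definition in DEFINITIONS.items()
        if definition.genus == genus and definition.differentiae <= props
    )


if __name__ == "__main__":
    # A semiconducting particle made only of cadmium selenide.
    particle = [
        ("has_part_only", "cadmium selenide"),
        ("has_disposition", "semiconducting"),
    ]
    print(classify(NANOPARTICLE, particle))
```

Under these assumptions the example particle falls under both the cadmium selenide nanoparticle and quantum dot terms, which is the kind of multiple classification that a purely hand-built list of terms would have to anticipate in advance.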
Formal definitions: step three
With a small ontology of properties, and ChEBI to provide the chemistry, we can now generate arbitrary nanoparticle representations and share them in RDF, whether in HTML or RSS.
(See Chris Mungall et al., "Representing Phenotypes in OWL", in Proceedings of the OWLED 2007 Workshop, Innsbruck, Austria, 2007.)
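To make "generate arbitrary nanoparticle representations and share them in RDF" concrete, here is a minimal sketch that writes one particle description as N-Triples lines. Everything under `http://example.org/nano#` is a hypothetical namespace invented for this example, as are the local names `particle1`, `cadmium_selenide` and `semiconducting`. Flat triples are also a simplification: an OWL encoding along the lines of the Mungall et al. paper would use class restrictions rather than direct property assertions.

```python
# Hypothetical namespace for illustration only; real feeds would use
# the published ChEBI and property-ontology URIs instead.
EX = "http://example.org/nano#"


def uri(local_name: str) -> str:
    """Wrap a local name as an N-Triples URI reference."""
    return f"<{EX}{local_name}>"


def to_ntriples(subject, genus, differentiae):
    """Serialise a parent term plus (relation, filler) pairs as N-Triples."""
    lines = [f"{uri(subject)} {uri('is_a')} {uri(genus)} ."]
    for relation, filler in differentiae:
        lines.append(f"{uri(subject)} {uri(relation)} {uri(filler)} .")
    return "\n".join(lines)


if __name__ == "__main__":
    doc = to_ntriples(
        "particle1",
        "CHEBI_50803",  # nanoparticle
        [
            ("has_part_only", "cadmium_selenide"),
            ("has_disposition", "semiconducting"),
        ],
    )
    print(doc)
```

Because the output is one statement per line, the same description can be embedded in an RSS item or an HTML page without the publisher having minted a new ontology term for that particular particle.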
Summary
- What ontologies are
- How we're using them at the RSC
- How we're developing them at the RSC
- What ontologies for nanotechnology exist
- InChIs for nanoparticles
Questions?