1 Workshop on CEFIC LRI Project EEM9.4: LRI AMBIT with IUCLID6 support and extended search capabilities. IUCLID Substance Data. Nikolay Kochev, Ideaconsult Ltd., Sofia, Bulgaria
2 Chemical structure vs. substance. A chemical structure describes a well-defined molecule, e.g. 1,2-dimethoxyethane. Chemicals synthesized in practice are not pure substances; they are mixtures of several components. A real substance therefore cannot be associated with a unique structure. In contrast, each component (constituent, impurity or additive) can be clearly characterized by a defined structure. REACH defines the concept of a substance precisely, and this definition is implemented in the IUCLID database.
3 Substances under REACH. Under REACH, a chemical substance is composed of: constituents (n >= 1), impurities (n >= 0) and additives (n >= 0). A chemical substance can have several compositions, e.g. crude, distilled, etc. The type of a chemical substance can be: either mono-constituent (a substance, defined by its composition, in which one main constituent is present to at least 80% (w/w)); or multi-constituent (a substance, defined by its composition, in which more than one main constituent is present in a concentration >= 10% (w/w) and < 80% (w/w)); or UVCB (substance of Unknown or Variable composition, Complex reaction products or Biological materials).
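The concentration thresholds above can be turned into a small decision rule. The sketch below is a simplification for illustration only: the function name and the "undetermined" label are hypothetical, and a UVCB substance cannot be recognised from concentrations alone, so it is not returned here.

```python
def classify_substance(constituent_percentages):
    """Classify a composition from constituent concentrations in % (w/w).

    Simplified reading of the REACH thresholds; it does not detect UVCB.
    """
    if not constituent_percentages:
        raise ValueError("a substance needs at least one constituent (n >= 1)")

    # Mono-constituent: one main constituent present to at least 80% (w/w).
    if any(c >= 80.0 for c in constituent_percentages):
        return "mono-constituent"

    # Multi-constituent: more than one main constituent at >= 10% and < 80%.
    main = [c for c in constituent_percentages if 10.0 <= c < 80.0]
    if len(main) > 1:
        return "multi-constituent"

    # Anything else needs expert judgement (possibly UVCB).
    return "undetermined"
```

For example, a composition with constituents at 45%, 30% and 20% (w/w) has three main constituents in the 10% to 80% range and is classified as multi-constituent.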
4 REACH substance definition implemented in IUCLID Example: mono-constituent substance Three different compositions
7 REACH substance definition implemented in IUCLID Example: UVCB N,N-dimethyl-C12-14-(even numbered)-alkyl-1-amines
8 REACH substance definition implemented in IUCLID Example: multi-constituent substance The substance has 3 constituents and 3 impurities characterized by different structures
9 IUCLID6 support in AMBIT. Given: a completely new XML schema for all objects (372 schema files, 111 endpoint study record files) and a different approach to linking between objects (compared to IUCLID5). Implementation: Java classes are generated from the XML schema (via JAXB); AMBIT code converts the generated classes to the internal data model so that they can be stored in the database; the existing code is reused for writing into the database, and the existing UI for showing the data. Transparent from the user's point of view: select .i6z or .i5z.
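The "select .i6z or .i5z" step can be pictured as a dispatch on the file extension. This is an illustrative sketch only: the function name and the returned labels are hypothetical and do not reflect AMBIT's actual (Java) implementation.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the IUCLID generation it holds.
IUCLID_FORMATS = {".i5z": "IUCLID5", ".i6z": "IUCLID6"}


def detect_iuclid_format(filename: str) -> str:
    """Return "IUCLID5" or "IUCLID6" based on the file extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return IUCLID_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"not an IUCLID export file: {filename!r}") from None
```

An importer could call such a function once and then hand the file to the matching parser, so the user never has to state which IUCLID version a file comes from.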
10 IUCLID6 support in AMBIT. Files (both IUCLID5 and IUCLID6): transparent from the user's point of view, select .i6z or .i5z. Web services: IUCLID5 (SOAP) and IUCLID6 (REST). All endpoint study records supported previously (and more), with the potential to support all endpoint study records. The test material is no longer a checkbox: each study record links to a test material (a substance, identified by UUID). Substance and compositions. Reference substances.
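The link from a study record to its test material can be sketched as a lookup by UUID. The class and field names below are hypothetical and chosen for illustration; they are not taken from the IUCLID6 schema or from AMBIT's data model.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass(frozen=True)
class Substance:
    uuid: UUID
    name: str


@dataclass(frozen=True)
class StudyRecord:
    endpoint: str
    test_material_uuid: UUID  # reference to a substance, not a checkbox


def resolve_test_material(record: StudyRecord, substances: dict) -> Optional[Substance]:
    """Follow the UUID link; return None if the substance was not imported."""
    return substances.get(record.test_material_uuid)
```

Keeping only the UUID in the study record means the test material can be a different substance from the one the dossier describes, which a simple checkbox could not express.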
11 IUCLID6 new composition types: legal entity composition of the substance (default); boundary composition of the substance; composition of the substance generated upon use; other. An IUCLID5 composition is migrated to a legal entity composition. The composition record includes study information. The new types were introduced mostly because of nanomaterials, as a REACH substance is defined by its main constituent (e.g. all TiO2 materials, regardless of their coatings, are one substance). All different nanoforms are described as different compositions of the same substance, and they have different shape, size, etc. (i.e. characterisation).
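One way to picture "one substance, several compositions" is the small data model below. It is a sketch under stated assumptions: the class names, field names and the migration helper are hypothetical, and only the four composition type labels come from the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class CompositionType(Enum):
    LEGAL_ENTITY = "legal entity composition of the substance"
    BOUNDARY = "boundary composition of the substance"
    GENERATED_UPON_USE = "composition of the substance generated upon use"
    OTHER = "other"


@dataclass
class Composition:
    name: str
    composition_type: CompositionType = CompositionType.LEGAL_ENTITY  # default
    characterisation: dict = field(default_factory=dict)  # e.g. shape, size


@dataclass
class Substance:
    name: str
    compositions: list = field(default_factory=list)


def migrate_iuclid5_composition(name: str) -> Composition:
    """IUCLID5 compositions become legal entity compositions."""
    return Composition(name=name, composition_type=CompositionType.LEGAL_ENTITY)
```

Under this model, two TiO2 nanoforms with different coatings would be two `Composition` entries in the `compositions` list of a single `Substance`, each with its own characterisation.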
12 Detailed information: Composition (1). Every constituent, impurity and additive is described in detail by a reference substance with several identifiers.
13 Detailed information: Composition (2). The structure associated with the reference substance is stored in IUCLID only as a picture, which is normally not searchable. InChI notation could be used for structure identification. SMILES notation could be used for structure identification only if unique SMILES strings are used both on data import and in the query definition.
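The caveat about unique SMILES can be shown with plain string matching. The two strings below are both valid SMILES for 1,2-dimethoxyethane; the index and the lookup function are hypothetical and stand in for any text-based identifier search.

```python
# Two different, equally valid SMILES strings for 1,2-dimethoxyethane.
SMILES_AT_IMPORT = "COCCOC"
SMILES_AT_QUERY = "C(COC)OC"

# A text index keyed by the SMILES string exactly as it was imported.
index = {SMILES_AT_IMPORT: "1,2-dimethoxyethane"}


def lookup(smiles: str):
    """Exact text match only; no chemistry is involved."""
    return index.get(smiles)


found_same_string = lookup(SMILES_AT_IMPORT)  # the stored name
found_other_string = lookup(SMILES_AT_QUERY)  # None: same molecule, no hit
```

The second lookup misses even though both strings describe the same molecule. This is why a text-based search is reliable only when the same unique (canonical) SMILES generation is applied on import and on query.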
14 Full structure support in AMBIT for all substance components Various chemoinformatics approaches for handling chemical structures
15 Motivation to transfer IUCLID data to the Ambit chemoinformatic system. IUCLID limitation: IUCLID allows queries on the substance data but has no functionality to search chemical structures (exact, similar, or substructure). Queries using the SMILES and InChI notation are possible. In addition, IUCLID describes endpoints in great detail, so extracting the key information relevant for substance evaluation is not convenient. The IUCLID substance composition and IUCLID endpoint data can be transferred to the Ambit system and updated there. During this process, structures are assigned automatically to the constituents, impurities and additives of the substance. In contrast to IUCLID, Ambit allows structure and data search.
16 Motivation to transfer IUCLID data to the Ambit chemoinformatic system. Ambit advantages: chemical structure searching (exact, similarity and substructure search); read-across workflow; flexible faceted and free-text searching for structure and data; export to various data formats preferred by industry and the scientific community; modelling, data analysis and visualization utilities; support for chemical substances including nanomaterials; programmatic access via REST API; user-friendly web interface.
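To make "similarity search" concrete, the sketch below ranks structures by the Tanimoto coefficient of their fingerprints. The slides do not say which fingerprints or similarity measure AMBIT uses, so Tanimoto over sets of feature identifiers is an assumption here, and the function names are hypothetical.

```python
def tanimoto(fp_a, fp_b) -> float:
    """Tanimoto coefficient: shared features divided by all features."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def similarity_search(query_fp, library: dict, threshold: float = 0.5):
    """Return (name, score) pairs with score >= threshold, best first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A real system would derive the fingerprints from the chemical structures; here they are plain sets of integers, so that only the ranking logic is shown.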
17 Extracting data from IUCLID. Substances that should be transferred to AMBIT have to be flagged in IUCLID. In IUCLID chapter 1.3 Identifiers, company-specific flags can be added. Examples of company-specific flags: TRA number, to identify trade products in the SAP system; CompTox Ambit Transfer, to mark substances that will be transferred to Ambit. All flags are transferred to Ambit and are searchable there.
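Selecting flagged substances amounts to a simple filter. In the sketch below the dictionary layout (`name`, `flags`) and the function name are hypothetical; only the flag text "CompTox Ambit Transfer" comes from the slide.

```python
TRANSFER_FLAG = "CompTox Ambit Transfer"


def select_for_transfer(substances, flag: str = TRANSFER_FLAG):
    """Keep only the substances that carry the given company-specific flag."""
    return [s for s in substances if flag in s.get("flags", ())]


# Illustrative records only; the names and extra flags are made up.
example = [
    {"name": "substance A", "flags": ["CompTox Ambit Transfer", "TRA 0001"]},
    {"name": "substance B", "flags": ["TRA 0002"]},
    {"name": "substance C"},
]
selected_names = [s["name"] for s in select_for_transfer(example)]
```

Because all flags travel with the substance, the same flag text can later be used as a search term on the Ambit side.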
18 Public, LRI Project EEM9.3, IUCLID Substance Data. Import criteria specify which studies will be imported into AMBIT. Where can I find these fields in IUCLID? In each endpoint study record, the relevant fields are located in Administrative Data and Data source.
19 Why is a selection reasonable? Only high-quality study records of the IUCLID substance itself should be imported into AMBIT. We therefore recommend selecting only: key studies and supporting studies (Adequacy of study / Purpose flag), because records flagged as weight of evidence or disregarded study are not high-quality information; reliability 1 and 2 (Reliability), because 3 (not reliable) and 4 (not assignable) do not help to characterize the relevant endpoint information; experimental results (Study result type), because read-across information is transferred to AMBIT with the original IUCLID substance and should not be selected again; study reports, publications and review articles (Reference type), because secondary sources and grey literature should not be imported.
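The recommended selection can be written as a single predicate over a study record. This is a sketch under stated assumptions: the field names and the lower-case labels are hypothetical simplifications, and the actual IUCLID picklist phrases (and AMBIT's filter options) may be worded differently.

```python
from dataclasses import dataclass

ALLOWED_PURPOSE_FLAGS = {"key study", "supporting study"}
ALLOWED_RELIABILITY = {1, 2}
ALLOWED_RESULT_TYPES = {"experimental result"}
ALLOWED_REFERENCE_TYPES = {"study report", "publication", "review article"}


@dataclass(frozen=True)
class StudyRecord:
    purpose_flag: str    # Adequacy of study / Purpose flag
    reliability: int     # 1 to 4
    result_type: str     # Study result type
    reference_type: str  # Reference type


def should_import(record: StudyRecord) -> bool:
    """True only if the record passes all four recommended criteria."""
    return (
        record.purpose_flag in ALLOWED_PURPOSE_FLAGS
        and record.reliability in ALLOWED_RELIABILITY
        and record.result_type in ALLOWED_RESULT_TYPES
        and record.reference_type in ALLOWED_REFERENCE_TYPES
    )
```

A key study with reliability 2, an experimental result and a study report as reference passes; the same record flagged as weight of evidence, or with reliability 3, does not.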
20 Import IUCLID files in AMBIT. In Ambit, some import filters can be selected.
21 Retrieve substances in AMBIT from an IUCLID server. In Ambit, some import filters can be selected.