Data Bases and Data Modeling

J. Gasteiger
Computer-Chemie-Centrum, Universität Erlangen-Nürnberg, D-91052 Erlangen
http://www2.chemie.uni-erlangen.de

Overview
- types of environmental databases
- selection of environmental databases
- metadatabases
- Google search on atrazine
- toxicity information
- prediction of toxic mode of action

Categories of Environmental Databases
Environmental databases: online, CD-ROMs, Internet
Each: general interest, science and technology, business and legal

Types of Environmental Databases
Text-based databases: Bibliographic Databases (BI), Full-text Databases (FT)
Fact-based databases: Chemical Name Directories (CN), Numerical Databases (NU), Metadatabases (MD), Research Databases (RD), Catalogs (CA)
Integrated databases: Structure Databases (ST), Reaction Databases (RE)

Environmental Bibliographic Databases

Database    Hosts                             Free Internet                          No. of documents
Enviroline  DataStar, DIALOG, DIMDI           no                                     278,000 (04/02)
Pollution   DIALOG, STN                       no                                     284,000 (04/02)
ULIT        DataStar, FIZ-Technik, GBI, STN   http://isis.uba.de:3001/un/opacs.html  425,000 (11/01)

Selection of Numerical Databases
- Biocatalysis/Biodegradation Database
- Chemicals Information System for Consumer-relevant Substances (CIVS)
- ECOTOX (several databases)
- Envirofacts (several databases)
- Environmental Fate Database (several databases)
- Environmental Health Criteria Monographs (EHCs)

Selection of Numerical Databases (continued)
- Biocatalysis/Biodegradation Database
- EXTOXNET (several databases)
- HSDB
- NRA Chemical Review Program (profiles)
- Pesticide Database, Japan (profiles)
- SIDS (profiles)
- UmweltInfo

Environmental Databases on the Free Internet

Name                      URL                                                        File type  Comments
ULIT                      http://isis.uba.de:3001/un/opacs.html                      BI         German, English
UFOR                      http://isis.uba.de:3001/vh/opacs.html                      RD         German, English
ECOTOX                    http://www.epa.gov/ecotox/                                 NU         Peer-reviewed data
HSDB                      http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?hsdb         NU         Peer-reviewed data
ChemFinder                http://chemfinder.cambridgesoft.com/                       CN         Structural formula
ChemIDplus                http://chem.sis.nlm.nih.gov/chemidplus/                    CN         Structural formula
CORDIS Research Database  http://dbs.cordis.lu/search/en/simple/EN_PROJ_simple.html  RD         Several languages
Sigma, Aldrich            http://www.sigmaaldrich.com/                               CA         90,000 MSDSs
Fisher Scientific         http://www.fishersci.com                                   CA         Several companies

DAIN (metadatabase of INternet resources)
http://www.wiz.uni-kassel.de/dain/

Topic                          Total no. of databases  Fact-based  Text-based
Pesticides                     48                      34          14
Solvents                       8                       8           0
Warfare Agents                 12                      12          0
Pharmaceuticals                11                      10          1
Polymers                       9                       8           1

Ecotoxicity                    8                       8           0
Toxicity                       35                      27          8
Biodegradation                 7                       6           1
Detection in air               44                      44          0
Physical-chemical parameters   18                      17          1

U.S. EPA: Atrazine: Interim Reregistration Eligibility Decision

Results
Considerable data gaps, especially in:
- environmental fate and pathway parameters
- chronic ecotoxicity parameters
Even High Production Volume (HPV) chemicals are not covered in every environmental database.
Even the largest Internet databases provide insufficient data.

Publication
Kristina Voigt, Databases on Environmental Information, in: Handbook of Chemoinformatics, 4 volumes, J. Gasteiger (editor), Wiley-VCH, Weinheim, 2003, pp. 722-742.

http://www.epa.gov/nheerl/dsstox/

DSSTox Databases
- CPDB: Carcinogenic Potency Database
- DBPCAN: Water Disinfection By-Products Database with Carcinogenicity Estimates
- EPAFHM: EPA Fathead Minnow Aquatic Toxicity Database
- NCTRER: FDA's National Center for Toxicological Research - Estrogen Receptor Binding Database

Polar Narcotics
[Figure: log(1/IGC50) of Tetrahymena pyriformis for polar narcotic phenols versus log P]

The Larger Picture
[Figure: log(1/IGC50) versus log P for polar narcotics, uncouplers of oxidative phosphorylation, precursors to soft electrophiles, and soft electrophiles]

Why Prediction of Modes of Action (MOA)?
Most QSARs in toxicology focus on a certain class of compounds.
However, compounds of the same class [structures of two chlorinated phenols] can have different biological responses and then require different QSAR equations.
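As a sketch of what "different QSAR equations" can mean here, assume a linear dependence of toxicity on log P within each mode of action, as the log(1/IGC50) versus log P plots suggest. The slope a_c and intercept b_c for class c are placeholder symbols, not values taken from the slides:

```latex
\log\!\left(\frac{1}{\mathrm{IGC}_{50}}\right) = a_c \,\log P + b_c ,
\qquad c \in \{\text{polar narcotics},\ \text{uncouplers},\ \text{pro-electrophiles},\ \text{soft electrophiles}\}
```

Under this assumption, a compound first has to be assigned to a class c before the matching pair (a_c, b_c) can be applied.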

Definition of Mode of Action (MOA)
MOA: common set of physiological and behavioural signs characterizing a type of adverse biological response
Toxic mechanism: biochemical process underlying a given MOA
Rand G. (1995) Fundamentals of Aquatic Toxicology

Data Set: MOA of Phenols
1. polar narcotics (157 cpds)
2. uncouplers of oxidative phosphorylation (20 cpds)
3. precursors to soft electrophiles (23 cpds)
4. soft electrophiles (21 cpds)
Total: 221 cpds
Aptula A. O., Netzeva T. I., Valkova I. V., Cronin M. T. D., Schultz T. W., Kühne R. and Schüürmann G. (2002): Quant. Struct.-Act. Relat., 21, 12-22.

Typical Structures
[Structures: one typical phenol for each class: polar narcotics, uncouplers, precursors to soft electrophiles, soft electrophiles]

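Architecture of the Kohonen Network
Self-Organizing Map (SOM)
[Figure: an object with m input data is presented to a competitive layer; each neuron carries m weights]
Winning neuron:
\[ \mathrm{out}_c \leftarrow \min_j \left[ \sum_{i=1}^{m} (x_{si} - w_{ji})^2 \right] \]
Euclidean distance:
\[ d_{sj}^2 = \sum_{i=1}^{m} (x_{si} - w_{ji})^2 \]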
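The winner selection on this slide can be sketched in a few lines of Python. Only the squared Euclidean distance and the choice of the closest neuron come from the slide. The function names, the learning rate, the neighbourhood radius, and the linearly decaying neighbourhood function are assumptions chosen for illustration, not the settings used in the study.

```python
import numpy as np

def winning_neuron(x, weights):
    """Index of the neuron whose weight vector has the smallest
    squared Euclidean distance to the input x.
    x: shape (m,), weights: shape (n_neurons, m)."""
    d2 = np.sum((weights - x) ** 2, axis=1)
    return int(np.argmin(d2))

def train_step(x, weights, grid, rate=0.5, radius=1.0):
    """One training step (assumed update rule): move the winner and its
    grid neighbours towards x. grid holds the (row, col) position of
    each neuron on a rectangular map. weights is modified in place."""
    c = winning_neuron(x, weights)
    # Grid distance to the winner, measured in rings around it.
    rings = np.max(np.abs(grid - grid[c]), axis=1)
    # Assumed neighbourhood function: linear decay, zero outside the radius.
    influence = np.where(rings <= radius, 1.0 - rings / (radius + 1.0), 0.0)
    weights += rate * influence[:, None] * (x - weights)
    return c

# Hypothetical toy map: 3 neurons in a row, 2 input values per object.
weights = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
grid = np.array([[0, 0], [0, 1], [0, 2]])
x = np.array([0.9, 1.2])
print(winning_neuron(x, weights))
```

In the study itself each object would be a 28-dimensional descriptor vector rather than the two-value toy input used here.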

Structure Representation
3D structure: radial distribution function of sigma- and lone-pair electronegativity (16 values)
Molecular surfaces: autocorrelation of hydrogen bonding potential (12 values)
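To make the radial distribution function descriptor more concrete, here is a minimal sketch assuming the commonly used form g(r) = Σ a_i a_j exp(-B (r - r_ij)²) summed over atom pairs, where a_i is an atomic property such as an electronegativity value. The smoothing parameter B, the sampling distances, and the toy two-atom input are assumptions. The slide only states that 8 values per property are used.

```python
import numpy as np

def rdf_code(coords, prop, r_values, smoothing=25.0):
    """Radial distribution function code of a 3D structure.
    coords: (n_atoms, 3) Cartesian coordinates
    prop: (n_atoms,) atomic property values (e.g. electronegativity)
    r_values: distances at which the function is sampled
    smoothing: assumed value of the exponent factor B"""
    coords = np.asarray(coords, dtype=float)
    prop = np.asarray(prop, dtype=float)
    i, j = np.triu_indices(len(prop), k=1)       # all atom pairs i < j
    r_ij = np.linalg.norm(coords[i] - coords[j], axis=1)
    a_ij = prop[i] * prop[j]
    r = np.asarray(r_values, dtype=float)[:, None]
    # One Gaussian per atom pair, centred at the pair distance.
    return np.sum(a_ij * np.exp(-smoothing * (r - r_ij) ** 2), axis=1)

# Hypothetical input: two atoms 1.5 apart with property values 2 and 3,
# sampled at 8 distances as on the slide (the range itself is assumed).
r_values = np.linspace(1.0, 4.5, 8)
code = rdf_code([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]], [2.0, 3.0], r_values)
print(code.shape)
```

Two such 8-value codes (sigma and lone-pair electronegativity) plus the 12 autocorrelation values would give the 28-dimensional vector used as network input.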

Classification According to Mode of Action (4 output layers)
Input: 28-dimensional vector (RDF + HBP)
Output: 4 modes of action (baseline toxicants, ..., soft electrophiles)

Kohonen maps, one per mode of action, each marking active versus non-active compounds (221 phenols with 12 surface HBP AC, 2 x 8 RDF (χ LP, χ σ), rectangular):
[Figure] Polar Narcotics
[Figure] Uncouplers of Oxidative Phosphorylation
[Figure] Soft Electrophiles
[Figure] Pro-electrophiles

Focus into the Neural Network
Figure labels: NH2 groups in ortho- and para-position; OH groups in ortho-position; OH groups in para-position

Final Model
12 HBP surface autocorrelation values, 2 x 8 RDF values (χ LP, χ σ)
5-fold cross-validation (rows: predicted, columns: actual):

predicted         narcotic  uncouplers  pro-electroph.  soft electroph.
narcotic          155       0           4               0
uncouplers        0         19          0               1
pro-electroph.    1         0           19              0
soft electroph.   1         1           0               19
undecided         0         0           0               1

Overall 95.9 % correct
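The overall figure can be checked directly from the cross-validation table. The sketch below copies the counts from the slide (rows predicted, columns actual). The variable names and the per-class breakdown are additions for illustration.

```python
import numpy as np

classes = ["narcotic", "uncouplers", "pro-electroph.", "soft electroph."]

# Rows: predicted class (last row: undecided); columns: actual class.
confusion = np.array([
    [155,  0,  4,  0],
    [  0, 19,  0,  1],
    [  1,  0, 19,  0],
    [  1,  1,  0, 19],
    [  0,  0,  0,  1],
])

# Correct predictions sit on the diagonal of the first four rows.
correct = int(np.trace(confusion[:4]))
total = int(confusion.sum())
overall = 100.0 * correct / total

# Fraction of each actual class that was predicted correctly.
per_class = {
    name: confusion[k, k] / confusion[:, k].sum()
    for k, name in enumerate(classes)
}

print(f"{correct}/{total} = {overall:.1f} % correct")
for name, value in per_class.items():
    print(f"{name}: {100.0 * value:.1f} %")
```

The column sums reproduce the class sizes of the data set (157, 20, 23, 21), which confirms that the columns hold the actual classes.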

Classification in 5-fold Cross-Validation
[Structures: two chlorinated phenols, one a polar narcotic, the other an uncoupler of oxidative phosphorylation]
Classification successful!

Conclusions
- good predictive models for the MOA classification
- self-organizing maps can be used as a transparent and robust neural network method
- the two-step process, first MOA classification, then building of a QSAR model, is the right way to go

Acknowledgements
Toxic Modes of Action: Simon Spycher, CCC, Erlangen
Environmental Databases: Dr. Kristina Voigt, GSF, Neuherberg
bmb+f