Introduction. OntoChem

Transcription:

Introduction
OntoChem: providing drug discovery knowledge and small molecules
- Supports the task of medicinal chemistry
- Allows selection of the best possible small-molecule starting point
- From target to lead candidates within a few months
- Generates new intellectual property

OntoChem drug design process
[Process flow chart]
Design of protein interaction inhibitors using a novel chemo- and bioinformatic platform based on information theory and rational design; see L. Weber, QSAR & Combinatorial Science, 2005, 809-823.

OntoChem example 1: MDM2-p53 inhibitor discovery project
- A protein-protein interaction with promise in oncology
- Has been a challenge in drug discovery
- Roche Nutlins

OntoChem example 1: MDM2-p53 inhibitors, step 1 (ligand-based design)
- Generate surface and pharmacophore properties of the ligand
- Other criteria: MW < 500, Lipinski rules
- Search the terabyte server for the subset that satisfies the criteria
- 2D surface property scoring of the selected molecules against the input ligand

OntoChem example 1: MDM2-p53 inhibitors, step 2
- Docking of the selected molecules into the protein pharmacophore
- Docking of the best molecules into the protein
- Scoring (orthogonal fusion)
- Success rate: 0.000009% (9 out of 100 million)
- Result: a ranked hit list of molecule proposals to be synthesized by wet-lab chemistry
- Found molecules (2 scaffold series) that are active and selective

OntoChem example 1: inhibitor series properties

Property                    Inhibitor series               Roche Nutlin series
MW                          < 400                          583.5
Polar surface area (PSA)    41                             78
H-bond donors               0                              2
H-bond acceptors            5                              8
clogP                       4                              5
Rotatable bonds             3                              7
Synthesis                   two steps, one is an MCR       12 sequential steps

Inhibitor series: straightforward MedChem optimization possible.
Filed two patents.
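The "MW < 500, Lipinski rules" criterion from step 1 can be illustrated on the two property sets above. The following is only a minimal sketch: the slides do not say how the filter is implemented, so the `Properties` class, the function names, and the choice of "at most one rule-of-five violation" are assumptions made for illustration. The thresholds are the commonly quoted rule-of-five limits.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed descriptors for one compound or compound series."""
    name: str
    mw: float     # molecular weight
    clogp: float  # calculated logP
    hbd: int      # H-bond donors
    hba: int      # H-bond acceptors


def lipinski_violations(p: Properties) -> int:
    """Count violations of the classic rule-of-five thresholds."""
    checks = [p.mw > 500, p.clogp > 5, p.hbd > 5, p.hba > 10]
    return sum(checks)


def passes_prefilter(p: Properties, max_mw: float = 500.0) -> bool:
    """Hard MW cutoff plus at most one rule-of-five violation (assumed policy)."""
    return p.mw < max_mw and lipinski_violations(p) <= 1


# Values taken from the slide; 399.0 is a stand-in for "MW < 400" (assumption).
inhibitor = Properties("inhibitor series", 399.0, 4.0, 0, 5)
nutlin = Properties("Nutlin series", 583.5, 5.0, 2, 8)

for p in (inhibitor, nutlin):
    print(p.name, lipinski_violations(p), passes_prefilter(p))
```

With these numbers the Nutlin series has a single violation (molecular weight) and would be excluded by the hard MW cutoff, while the inhibitor series passes.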

OntoChem terabyte server
Numbers (are not everything but may help):
- New search technology
- Innovative content: MCRs are the most efficient way to construct molecules
- Products of >4000 worked MCRs and 100 classical transforms
- From proprietary or otherwise accessible starting materials
- Preselected by MW < 500 and a fuzzy Lipinski filter
[Chart, logarithmic axis from 1 to 10,000,000,000 compounds. Categories: biotech compound library, pharma compound library, Beilstein, ChemNavigator, accessible compounds, CAS, Terabyte Server. Legend: classical chemistry, MCR chemistry.]

OntoChem CARD + DayCart access
CARD + DayCart provides straightforward access from anywhere to:
- Reaction + transform database (currently >4000 on file)
  - Basis for generating synthetically accessible compounds
  - Updated with worked reactions: scope and limitations
- Project databases for collaborative projects
  - Integrating biological data
  - Calculated physico-chemical data
- Terabyte server compound database (10^10 compounds on file)
  - Selected for Lipinski rules
  - Accessible by 1-3 steps of straightforward chemistry
  - Basis for HT in silico screening

OntoChem CARD + DayCart access
CARD + DayCart advantages:
- No need for large centralized or distributed infrastructure
- Integration with the CABINET federation of other databases
- Highest flexibility
- Easy to set up and run

OntoChem CARD + DayCart access
CARD + DayCart reaction db example

Terabyte Server search methods: chemical similarity, today's problem
E.g., 1a is calculated to be most similar to 2b using Tanimoto similarity of bitstrings = wrong! (M. Stahl et al., J. Med. Chem. 2005, 48, 4358)
[Structures: 1a, 1b (dopamine D4 antagonists); 2a, 2b (histamine H3 receptor ligands)]

Daylight fingerprints
        1a      1b      2a      2b
1a      1.00    0.27    0.26    0.32
1b      0.27    1.00    0.31    0.19
2a      0.26    0.31    1.00    0.37
2b      0.32    0.19    0.37    1.00

MACCS keys
        1a      1b      2a      2b
1a      1.00    0.56    0.53    0.54
1b      0.56    1.00    0.64    0.50
2a      0.53    0.64    1.00    0.62
2b      0.54    0.50    0.62    1.00
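For reference, the Tanimoto similarity of two bitstrings is the number of bits set in both divided by the number of bits set in either. The sketch below uses hypothetical toy fingerprints (sets of on-bit indices), not Daylight or MACCS fingerprints, and the helper names are illustrative only.

```python
from __future__ import annotations


def tanimoto(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    common = len(bits_a & bits_b)
    return common / (len(bits_a) + len(bits_b) - common)


def nearest_neighbour(query: str, fps: dict[str, set[int]]) -> str:
    """Name of the fingerprint most similar to `query` (excluding itself)."""
    others = [name for name in fps if name != query]
    return max(others, key=lambda name: tanimoto(fps[query], fps[name]))


# Hypothetical toy fingerprints, for illustration only.
fps = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5, 6},
    "c": {1, 2, 3, 9},
}

print(tanimoto(fps["a"], fps["b"]))  # 2 shared bits out of 6 set in either
print(nearest_neighbour("a", fps))
```

Applied to the Daylight matrix above, the same nearest-neighbour logic picks 2b for 1a (0.32), which is the misclassification the slide points out.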

Topological Torsion (TT) validation
TT similarity allows better classification of compounds than other known methods (from Nilakantan 1987 to Sheridan 2004).
[Structures: 1a, 1b (dopamine D4 antagonists); 2a, 2b (histamine H3 receptor ligands)]

TT - Tanimoto
        1a      1b      2a      2b
1a      1.00    0.37    0.19    0.20
1b      0.37    1.00    0.16    0.15
2a      0.19    0.16    1.00    0.30
2b      0.20    0.15    0.30    1.00

TT - Dice
        1a      1b      2a      2b
1a      1.00    0.54    0.31    0.33
1b      0.54    1.00    0.27    0.27
2a      0.31    0.27    1.00    0.47
2b      0.33    0.27    0.47    1.00
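The Tanimoto and Dice values above are related: for the same pair of descriptor sets, Dice = 2T / (1 + T), where T is the Tanimoto value. The sketch below computes both coefficients on counted descriptors. Whether the slides use counts or plain presence/absence is not stated, so the count-based form and the toy descriptor labels are assumptions.

```python
from collections import Counter


def overlap(a: Counter, b: Counter) -> int:
    """Number of shared descriptors, counting multiplicity."""
    return sum((a & b).values())


def tanimoto_counts(a: Counter, b: Counter) -> float:
    """Shared descriptors divided by descriptors present in either molecule."""
    c = overlap(a, b)
    total = sum(a.values()) + sum(b.values())
    return c / (total - c) if total else 1.0


def dice_counts(a: Counter, b: Counter) -> float:
    """Twice the shared descriptors divided by the combined descriptor count."""
    c = overlap(a, b)
    total = sum(a.values()) + sum(b.values())
    return 2 * c / total if total else 1.0


def dice_from_tanimoto(t: float) -> float:
    """Convert a Tanimoto value into the corresponding Dice value."""
    return 2 * t / (1 + t)


# Hypothetical descriptor counts, for illustration only.
mol_a = Counter({"t1": 2, "t2": 1, "t3": 1})
mol_b = Counter({"t1": 1, "t3": 2, "t4": 1})

print(tanimoto_counts(mol_a, mol_b), dice_counts(mol_a, mol_b))
print(round(dice_from_tanimoto(0.37), 2))  # 1a/1b pair from the tables above
```

The conversion reproduces the 1a/1b entry (Tanimoto 0.37, Dice 0.54); small deviations for other entries are expected because the tabulated values are rounded.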

TT validation
Example of the linear multi-pharmacophore model (LMPM) algorithm: predicting LogKw for 336 GPCR compounds (T. Oprea) with different scaffolds and active at different GPCRs. Note that clogP values are typically calculated by commercially available, dedicated algorithms; the good fit achieved with LMPM nevertheless demonstrates the universality of the method.
[Scatter plot: predicted LogKw (LMPM) versus measured LogKw]

TT validation
HIV protease inhibitor PNU-96988 and a new inhibitor:
- Daylight fingerprint similarity: 0.16
- TT Tanimoto similarity: 0.302

TT validation: Tudor Oprea dataset
- 341 GPCR ligands
- Average TT Tanimoto similarity: 0.131
- 49 have TT similarity > 0.2; 34 (!) of these are 5HT2A ligands, out of 50 in total
- [Structures shown] Not ligands but very similar

TT diversity design
Shannon information entropy:
- Introduced to drug design by J. Bajorath
- Information is always a measure of the decrease of uncertainty
- Medicinal chemistry can be regarded as a process to obtain maximal information from an input (compounds) and an output (test results)
- Maximal information can be obtained if the number of information channels is maximized: diverse chemical structures = different TTs
- Defined here as $H = -\sum_i \frac{p_i}{n} \log_2 \frac{p_i}{n}$, with i a state out of n total states and $p_i$ the occupation of state i
- TTs may be used as state descriptors for small molecules
- Approx. n = 100 for a small molecule with MW < 500, so approx. $H_{max}$ = 6.64; e.g. $H_{benzene}$ = 0
- H = 4.57 (32 different TTs, n = 92) for [structure shown on slide]
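The entropy definition above can be tried out on a hydrogen-suppressed molecular graph. The sketch below enumerates linear paths of four bonded atoms and computes H from their counts. It is a simplification made for illustration: atoms are typed only by element and heavy-atom degree (published topological torsions also encode pi electrons), and the function names are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from math import log2


def torsions(elements: list[str], bonds: list[tuple[int, int]]) -> Counter:
    """Count simplified topological torsions (4-atom paths) in a molecular graph."""
    nbrs: dict[int, set[int]] = {i: set() for i in range(len(elements))}
    for a, b in bonds:
        nbrs[a].add(b)
        nbrs[b].add(a)
    # Simplified atom type: element symbol plus heavy-atom degree (assumption).
    atom_type = [(el, len(nbrs[i])) for i, el in enumerate(elements)]
    found: Counter = Counter()
    for a in nbrs:
        for b in nbrs[a]:
            for c in nbrs[b] - {a}:
                for d in nbrs[c] - {a, b}:
                    if (a, b, c, d) < (d, c, b, a):  # visit each path once
                        fwd = tuple(atom_type[i] for i in (a, b, c, d))
                        found[min(fwd, fwd[::-1])] += 1  # direction-independent key
    return found


def entropy(counts: Counter) -> float:
    """H = sum (p_i/n) * log2(n/p_i), equivalent to -sum (p_i/n) * log2(p_i/n)."""
    n = sum(counts.values())
    return sum((p / n) * log2(n / p) for p in counts.values()) if n else 0.0


benzene = torsions(["C"] * 6, [(i, (i + 1) % 6) for i in range(6)])
hexane = torsions(["C"] * 6, [(i, i + 1) for i in range(5)])

print(len(benzene), entropy(benzene))           # one torsion type only
print(entropy(hexane))
print(round(entropy(Counter(range(100))), 2))   # 100 distinct states
```

Benzene yields six torsions of a single type, so its entropy is zero, in line with the slide; 100 equally occupied states give the quoted maximum of about 6.64.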

TT diversity design
Information entropy diversity design method:
- Maximize the number of different TTs in a molecule
- Minimize the number of equal TTs in different molecules
- What is the gain in H from adding a new molecule to the library, with regard to new and different TTs?

Max-min diversity design:
- Optimally diverse library: minimize the TT similarity matrix (4 molecules that do not share common TTs)

  1.00    0.00    0.00    0.00
  0.00    1.00    0.00    0.00
  0.00    0.00    1.00    0.00
  0.00    0.00    0.00    1.00

- Optimally diverse library: maximize the H entropy of each molecule: 4.57  4.57  4.57  4.57
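One common way to approach the max-min criterion is a greedy ordering: repeatedly add the molecule whose highest similarity to anything already selected is lowest. The slides do not specify the selection algorithm, so the greedy procedure and the function name below are assumptions. The input is the TT Tanimoto matrix for compounds 1a, 1b, 2a and 2b given earlier in the deck.

```python
from __future__ import annotations


def maxmin_order(sim: list[list[float]], start: int = 0) -> list[int]:
    """Greedy max-min ordering over a symmetric similarity matrix."""
    picked = [start]
    remaining = set(range(len(sim))) - {start}
    while remaining:
        # Candidate whose closest already-picked neighbour is least similar.
        nxt = min(sorted(remaining), key=lambda j: max(sim[j][i] for i in picked))
        picked.append(nxt)
        remaining.remove(nxt)
    return picked


labels = ["1a", "1b", "2a", "2b"]
tt_tanimoto = [
    [1.00, 0.37, 0.19, 0.20],
    [0.37, 1.00, 0.16, 0.15],
    [0.19, 0.16, 1.00, 0.30],
    [0.20, 0.15, 0.30, 1.00],
]

order = maxmin_order(tt_tanimoto)
print([labels[i] for i in order])
```

Starting from 1a, the ordering picks a compound from the other activity class next, which is the behaviour a diversity-driven selection is meant to show. The result depends on the starting compound.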

Library information entropy
TT allows straightforward information entropy calculation.
[Structures: 1a, 1b (dopamine D4 antagonists); 2a, 2b (histamine H3 receptor ligands)]

TT            1a      1b      2a      2b
1a            1.00    0.37    0.19    0.20
1b            0.37    1.00    0.16    0.15
2a            0.19    0.16    1.00    0.30
2b            0.20    0.15    0.30    1.00
H molecule    5.18    4.57    5.62    5.53

H library = 6.30 (n = 400, only 142 different), maximum = 8.64
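The library figure above reads as the entropy of the pooled TT counts of all molecules (400 TTs in total, 142 distinct), with log2(400) = 8.64 as the upper bound. Under that reading, the gain in H from adding a molecule can be sketched as follows. The pooling interpretation, the function names, and the toy TT labels are assumptions.

```python
from __future__ import annotations

from collections import Counter
from math import log2


def entropy(counts: Counter) -> float:
    """H = sum (p_i/n) * log2(n/p_i) over the occupied states."""
    n = sum(counts.values())
    return sum((p / n) * log2(n / p) for p in counts.values()) if n else 0.0


def library_entropy(molecules: list[Counter]) -> float:
    """Entropy of the pooled descriptor counts of all molecules."""
    pooled: Counter = Counter()
    for mol in molecules:
        pooled.update(mol)
    return entropy(pooled)


def entropy_gain(library: list[Counter], candidate: Counter) -> float:
    """Change in library entropy when `candidate` is added."""
    return library_entropy(library + [candidate]) - library_entropy(library)


# Hypothetical TT counts, for illustration only.
library = [Counter({"A": 2, "B": 2})]
novel = Counter({"C": 2, "D": 2})      # shares no TTs with the library
redundant = Counter({"A": 2, "B": 2})  # repeats TTs already present

print(entropy_gain(library, novel))
print(entropy_gain(library, redundant))
print(round(log2(400), 2))  # upper bound for n = 400
```

A molecule that contributes only new TTs raises the library entropy, while a molecule that repeats existing TTs in the same proportions leaves it unchanged.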

Reaction classification
- Using TTs, similar to InfoChem's CLASSIFY
- Transform TT into SMIRKS...

TT Utility: medchem selector
Medicinal chemistry applications of TT similarity analysis:
- Diversity analysis of a corporate/vendor chemical library
  - Analyze the information content of a library
  - Gap analysis: synthesize molecules to fill the gap
  - Maximize diversity
- SAR prediction of small molecules
  - Target-class-specific fingerprints (e.g. using known kinase inhibitors)
  - Target-specific fingerprints (e.g. using known Aurora inhibitors)
- Property prediction allows efficient substructure design
  - Exclusion of molecules
  - Activity, clogP, ADMET...
- Selection of the most promising candidates in silico