Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME


Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME Iván Solt Solutions for Cheminformatics

Drug Discovery Strategies for known targets
High-Throughput Screening (HTS): cells or recombinant protein; fluorescent or luminescent readout; automated, miniaturized; thousands of samples / day; number of primary actives: ~1%.
Virtual Screening (VS): ligand or structure based; virtual or real libraries; similarity search, 2D or 3D; can lead to thousands of possible actives, so further processing is needed.
Measurement: enrichment ratio, ROC curves for known actives.
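The two measurements named on this slide can be sketched as follows. This is a minimal illustration, not the evaluation code used for the talk: the function names and the input format (a best-first ranked list of booleans marking known actives, with no tied scores) are assumptions made for the example.

```python
def enrichment_factor(ranked_labels, fraction):
    """Enrichment at the top `fraction` of a ranked screening list.

    ranked_labels: booleans sorted best-score-first, True for a known active.
    Returns the hit rate in the top slice divided by the overall hit rate.
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(ranked_labels[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def roc_auc(ranked_labels):
    """Area under the ROC curve for a ranked list without tied scores.

    Equals the probability that a randomly chosen active is ranked
    above a randomly chosen inactive.
    """
    n_actives = sum(ranked_labels)
    n_inactives = len(ranked_labels) - n_actives
    actives_seen = 0
    correctly_ordered_pairs = 0
    for is_active in ranked_labels:
        if is_active:
            actives_seen += 1
        else:
            # every active already seen is ranked above this inactive
            correctly_ordered_pairs += actives_seen
    return correctly_ordered_pairs / (n_actives * n_inactives)


if __name__ == "__main__":
    # Made-up ranking of 10 compounds containing 3 known actives.
    ranking = [True, True, False, True, False,
               False, False, False, False, False]
    print(enrichment_factor(ranking, 0.2), roc_auc(ranking))
```

An enrichment factor of 1 means the screen does no better than random selection; larger values mean the known actives are concentrated at the top of the list.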

Virtual Library Design Workflow
Databases: reactions, molecules, queries.
Fragmentation: R-group decomposition, fragmentation, reagent clipping.
Compound selection: similarity searches, substructure searches.
Enumeration: fuse fragments, R-group composition, reaction enumeration.
Library analysis: clustering, 2D similarity screen, 3D shape similarity screen.

Find or Virtually Create Candidates
Virtual screening of existing compounds. Pros: fast; hits are readily available for in vitro experiments. Cons: limitation on available compounds.
De novo design. Pros: no limitation on virtual compound space; structural novelty. Cons: are hits synthetically available?


Virtual Screening Workflow
Input: molecules from databases, in-house or commercially available.
1. Reactions (virtual synthetic path), giving synthetically accessible compounds
2. Filtering
3. Similarity search
4. 3D alignment
5. Clustering
Outcome: in vivo experiment?

Step 1: Reaction Enumeration
Reaction schema for accessible syntheses.
Combinatorial or sequential enumeration.
Reaction rules: phrase + apply public and in-house chemical knowledge.
Rule types: selectivity with tolerance, reactivity, exclusion rules.
Example exclusion rule:
EXCLUDE: match(reactant(1), "[Cl,Br,I]C(=[O,S])C=C") or
    match(reactant(0), "[H][O,S]C=[O,S]") or
    match(reactant(0), "[P][H]") or
    (max(pka(reactant(0), filter(reactant(0), "match('[o,s;h1]')"), "acidic")) > 14.5) or
    (max(pka(reactant(0), filter(reactant(0), "match('[#7:1][h]', 1)"), "basic")) > 0)
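The difference between combinatorial and sequential enumeration, and the effect of an exclusion rule, can be sketched in plain Python. This is only an illustration of the bookkeeping: it does no chemistry, it is not how Reactor works internally, and the record layout (`name`, `flags`), the flag names and the function names are all hypothetical. A real workflow would evaluate rules such as the EXCLUDE expression above on actual structures.

```python
from itertools import product


def is_excluded(reactant, blocked_flags):
    """Exclusion rule stand-in: reject a reactant carrying any blocked flag.

    The flags are assumed to be precomputed substructure matches.
    """
    return bool(reactant["flags"] & blocked_flags)


def enumerate_pairs(reactants_a, reactants_b, blocked_flags,
                    mode="combinatorial", rules_on=True):
    """Return (name_a, name_b) pairs that would be sent to the reaction.

    combinatorial: every reactant A with every reactant B.
    sequential:    first A with first B, second A with second B, ...
    rules_on:      drop pairs in which either reactant is excluded.
    """
    if mode == "combinatorial":
        pairs = product(reactants_a, reactants_b)
    else:
        pairs = zip(reactants_a, reactants_b)
    kept = []
    for a, b in pairs:
        if rules_on and (is_excluded(a, blocked_flags)
                         or is_excluded(b, blocked_flags)):
            continue
        kept.append((a["name"], b["name"]))
    return kept


if __name__ == "__main__":
    # Made-up reagent lists; "reactive_group" is a hypothetical flag.
    list_a = [{"name": "A1", "flags": set()},
              {"name": "A2", "flags": {"reactive_group"}}]
    list_b = [{"name": "B1", "flags": set()},
              {"name": "B2", "flags": set()}]
    print(enumerate_pairs(list_a, list_b, {"reactive_group"}))
```

Calling the same function with `rules_on=False` returns every pair, which mirrors the debugging use of switching reaction rules off.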


Step 1: Reaction Enumeration
Reaction rules ON: fewer results than theoretical; unfeasible starting materials eliminated; feasible products only; custom rules can be added to increase selectivity.
Reaction rules OFF: more results; best for debugging purposes; products may be incorrect because chemical rules are neglected.

Step 2: Filtering
Lead likeness, drug likeness: Chemical Terms.
Could it fit the active centre? Basic analysis: size, mass...
Could it get to the active centre? ADME properties: solubility, pKa, polar surface, partition coefficients...
Structural filtering, e.g. reactive groups.
Toxicity, environmental concerns, etc...
Calculator plugins:
Elemental Analysis: Elemental Analysis
IUPAC Name, Structure to Name
Protonation: pKa, Microspecies, Isoelectric Point
Partitioning: logP, logD
Charge: Charge, Polarizability, Orbital Electronegativity
Isomers: Tautomerization, Stereoisomer
Conformation: Conformer, Flexible 3D Alignment, Molecular Dynamics
Geometry: Topology Analysis, Geometry, Polar Surface Area (2D), Molecular Surface Area (3D)
Markush: Markush Enumeration
Other: Hydrogen Bond Donor-Acceptor, Huckel Analysis, Refractivity, Structural Framework, Resonance
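A drug-likeness filter of this kind can be sketched over precomputed properties. The slides do not say which thresholds were used; the example below substitutes Lipinski's rule of five (molecular mass at most 500, logP at most 5, at most 5 H-bond donors, at most 10 H-bond acceptors) as one widely used choice. The property keys, the function name and the sample values are assumptions, and the properties are assumed to come from calculator plugins or a similar tool.

```python
def rule_of_five_violations(props):
    """Count rule-of-five violations for one compound.

    props: dict with precomputed 'mass', 'logp', 'hbd' and 'hba' values
    (hypothetical key names chosen for this sketch).
    """
    return sum([
        props["mass"] > 500,   # molecular mass
        props["logp"] > 5,     # octanol/water partition coefficient
        props["hbd"] > 5,      # hydrogen bond donors
        props["hba"] > 10,     # hydrogen bond acceptors
    ])


def filter_library(library, max_violations=1):
    """Keep compounds with at most `max_violations` violations."""
    return [c for c in library
            if rule_of_five_violations(c) <= max_violations]


if __name__ == "__main__":
    # Made-up property values for three hypothetical compounds.
    library = [
        {"id": "cpd-1", "mass": 320.4, "logp": 2.1, "hbd": 2, "hba": 5},
        {"id": "cpd-2", "mass": 612.8, "logp": 6.3, "hbd": 1, "hba": 7},
        {"id": "cpd-3", "mass": 505.0, "logp": 4.9, "hbd": 3, "hba": 9},
    ]
    print([c["id"] for c in filter_library(library)])
```

Tolerating one violation is a common convention; setting `max_violations=0` gives a stricter filter.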

Step 3: Similarity search
Screen 2D + Descriptor package: screen against known bioactives.
Chemical fingerprints: topology.
Pharmacophore fingerprints: custom atomic properties + their topological relationship (H-bond donors / acceptors, cationic / anionic groups, hydrophobic groups, aromatic groups, etc.).
ECFP/FCFP.
Similarity searches: Tanimoto, Euclidean, Tversky metrics.
Metrics optimization. [Figure: example values compared for regular and optimized Tanimoto; values on the slide: 0.57, 0.47, 0.55 and 0.20, 0.28, 0.06]
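The three metrics named on this slide have standard definitions for binary fingerprints, sketched below with fingerprints represented as Python sets of "on" bit positions. That representation and the function names are assumptions for illustration; this is not the Screen implementation, and the "optimized Tanimoto" of the slide is not reproduced here because the transcript does not define it.

```python
from math import sqrt


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity: shared bits / bits set in either fingerprint."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def tversky(fp_a, fp_b, alpha, beta):
    """Tversky similarity, an asymmetric generalization of Tanimoto.

    alpha weights bits only in fp_a, beta weights bits only in fp_b;
    alpha = beta = 1 reproduces Tanimoto.
    """
    common = len(fp_a & fp_b)
    only_a = len(fp_a - fp_b)
    only_b = len(fp_b - fp_a)
    denominator = alpha * only_a + beta * only_b + common
    return common / denominator if denominator else 0.0


def euclidean(fp_a, fp_b):
    """Euclidean distance between binary fingerprints:
    square root of the number of differing bits."""
    return sqrt(len(fp_a ^ fp_b))


if __name__ == "__main__":
    # Made-up fingerprints given as sets of "on" bit positions.
    query = {1, 2, 3, 4}
    candidate = {3, 4, 5, 6}
    print(tanimoto(query, candidate),
          tversky(query, candidate, 1.0, 0.0),
          euclidean(query, candidate))
```

Tanimoto and Tversky are similarities (higher means more alike), while Euclidean is a distance (lower means more alike), so thresholds point in opposite directions.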

Step 4: Screen 3D
Align the candidates to the known active in 3D.
Treat the candidate as flexible!
Consider pharmacophore atom types (align cationic to cationic, etc.)!
Problem: complicated conformational space.

Step 4: Screen 3D
Simple sampling of the conformational space: minimum and maximum distance between atom pairs in the full torsion space.
Select atoms: colors (e.g. pharmacophore types); topological features (e.g. longest chain start/end/center); ring centers (aromatic, aliphatic).
Calculate: min/max internal distance ranges; distance histograms for selected atoms; only once for each molecule.
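The min/max internal distance ranges can be illustrated as below. The slide takes the ranges over the full torsion space without saying how; this sketch substitutes an explicit list of sampled conformers, which is an assumption, as are the function name and the data layout (each conformer is a list of (x, y, z) tuples indexed by atom).

```python
from itertools import combinations
from math import dist


def distance_ranges(conformers, selected_atoms):
    """Min/max distance for every pair of selected atoms.

    conformers:     list of conformers, each a list of (x, y, z) tuples.
    selected_atoms: atom indices of interest (e.g. pharmacophore points).
    Returns {(i, j): (min_distance, max_distance)} with i < j.
    """
    ranges = {}
    for i, j in combinations(sorted(selected_atoms), 2):
        distances = [dist(conf[i], conf[j]) for conf in conformers]
        ranges[(i, j)] = (min(distances), max(distances))
    return ranges


if __name__ == "__main__":
    # Two made-up conformers of a three-atom chain: extended and bent.
    extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    for pair, bounds in distance_ranges([extended, bent], [0, 1, 2]).items():
        print(pair, bounds)
```

Because the ranges depend only on the molecule, they can be computed once and reused for every alignment, which matches the "only once for each molecule" note on the slide.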

Step 4: Screen 3D
Hybrid alignment: separate translation and rotation from torsions.
Robust and fast.
Needs a good guess of the atom-atom mapping: same colors; distance ranges must be allowed for all mapped pairs; triangle inequality must be fulfilled for any atom triplet.
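One way to read the three mapping conditions is sketched below. This is an interpretation, not the Screen3D algorithm: it assumes both molecules are described by per-pair (min, max) distance ranges keyed by sorted atom-index pairs, that the mapping is one-to-one, and that the triangle inequality is tested on the intersected bounds. All names are hypothetical.

```python
from itertools import combinations


def _bounds(ranges, x, y):
    """Look up the (min, max) range of an atom pair regardless of order."""
    return ranges[tuple(sorted((x, y)))]


def mapping_is_feasible(mapping, colors_a, colors_b, ranges_a, ranges_b):
    """Check a guessed atom-atom mapping between molecules A and B.

    mapping: list of (atom_in_a, atom_in_b) pairs, assumed one-to-one.
    """
    # 1. Mapped atoms must carry the same color (e.g. pharmacophore type).
    if any(colors_a[a] != colors_b[b] for a, b in mapping):
        return False

    # 2. Distance ranges must overlap for every two mapped pairs.
    shared = {}
    for (p, (a1, b1)), (q, (a2, b2)) in combinations(enumerate(mapping), 2):
        lo_a, hi_a = _bounds(ranges_a, a1, a2)
        lo_b, hi_b = _bounds(ranges_b, b1, b2)
        lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
        if lo > hi:
            return False
        shared[(p, q)] = (lo, hi)

    # 3. Triangle inequality on the shared bounds: no side may be forced
    #    to be longer than the other two sides can be at their longest.
    for p, q, r in combinations(range(len(mapping)), 3):
        sides = [shared[(p, q)], shared[(p, r)], shared[(q, r)]]
        total_upper = sum(hi for _, hi in sides)
        for lo, hi in sides:
            if lo > total_upper - hi:
                return False
    return True


if __name__ == "__main__":
    # Made-up three-point example: flexible candidate A, rigid reference B.
    colors = ["donor", "acceptor", "aromatic"]
    ranges_a = {(0, 1): (2.0, 4.0), (0, 2): (3.0, 6.0), (1, 2): (2.0, 5.0)}
    ranges_b = {(0, 1): (3.0, 3.0), (0, 2): (5.0, 5.0), (1, 2): (4.0, 4.0)}
    mapping = [(0, 0), (1, 1), (2, 2)]
    print(mapping_is_feasible(mapping, colors, colors, ranges_a, ranges_b))
```

A check like this only rejects impossible mappings cheaply; a mapping that passes would still have to go through the actual alignment.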

Screen 3D: Test on DUD
[Figure: % of the actives retrieved, average of 1% enrichments]
Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Screen 3D: Test on DUD
[Figure: % of the actives retrieved, average of 10% enrichments]
Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Screen 3D: Test on DUD
Average time per compound (without precalculations)
Method               Time
ChemAxon Screen3D    0.07
ROCS                 0.5
FRED                 1.0
ICMsim               2.4
Surflex-sim          6.7
FlexS                6.9
Surflex-dock         14.6
FLEXX                15.6
ICM                  17.7
Speed: Intel Q6600 2.4 GHz; Intel Xeon 2.4 GHz
Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Step 5: Clustering, library analysis
JKlustor: wide range of methods.
Unsupervised, agglomerative clustering.
Hierarchical and non-hierarchical methods.
Similarity based and structure based techniques.
Flexible search options: Tanimoto and Euclidean metrics, weighting.
Maximum common substructure identification: chemical property matching including atom type, bond type, hybridization, charge.
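Unsupervised agglomerative clustering on a similarity metric can be sketched as follows. This is a generic single-linkage procedure with a Tanimoto threshold, written for illustration; it is not JKlustor's implementation, and the function names, the set-based fingerprint representation and the stopping rule are assumptions.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def agglomerative_clusters(fingerprints, threshold):
    """Single-linkage agglomerative clustering.

    Starts with one cluster per compound and repeatedly merges the two
    most similar clusters until no pair reaches `threshold`.
    Returns clusters as sorted lists of compound indices.
    """
    clusters = [[i] for i in range(len(fingerprints))]

    def linkage(c1, c2):
        # single linkage: similarity of the closest pair of members
        return max(tanimoto(fingerprints[i], fingerprints[j])
                   for i in c1 for j in c2)

    while len(clusters) > 1:
        best_sim, (x, y) = max(
            (linkage(clusters[x], clusters[y]), (x, y))
            for x, y in combinations(range(len(clusters)), 2))
        if best_sim < threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]  # y > x, so index x stays valid
    return clusters


if __name__ == "__main__":
    # Made-up fingerprints forming two obvious groups.
    fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12},
           {10, 11, 13}, {1, 2, 3, 4, 5}]
    print(agglomerative_clusters(fps, 0.5))
```

The naive search over all cluster pairs is fine for a demonstration but scales poorly; library-sized data sets need a dedicated tool.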

JChem Extensions in KNIME
Workflow management in KNIME.
JChem extension nodes developed by InfoCom, Japan.
Constantly developing palette of available JChem tools.

JChem Extensions in KNIME
IO: molecule and reaction import, export, drawing
Visualization
Manipulators
Calculator plugins
Reactor
Similarity and structure-based search
Fingerprint calculation
Fragmentation
Clustering
R-group composition, decomposition
Standardization...
Database management
Molecular format conversion
Web search services


JChem Extensions in KNIME
Workflow: 1. Reactions (virtual synthetic path), giving synthetically accessible compounds; 2. Filtering; 3. Similarity search; 4. 3D alignment; in vivo experiment?
KNIME workflow steps:
1. Import reactants
2. Enumerate reaction; carry out topology analysis
3. Calculate properties; filter
4. Screen for similarity against known active
5. Export results

Conclusions
Virtual libraries and virtual screening are essential tools in modern drug discovery.
No special hardware, short experiment cycles, variety of approaches.
A database of synthetically accessible compounds can be designed with reaction libraries and custom in-house synthetic knowledge.
Powerful 3D alignment techniques allow high-throughput conformational screening with great efficiency.
Straightforward integration into KNIME.

Contributors: Tímea Polgár, Attila Tajti

www.chemaxon.com