Web tools for Monomer selection, Library Design and Compound Acquisition. Andrew Leach GlaxoSmithKline Research and Development Stevenage

Historical perspective. Bench scientists were unused to handling and manipulating large numbers of structures, as required for library design, HTS analysis and compound acquisition. Expert computational chemistry and chemoinformatics groups were producing many rapidly changing developments. The web allows delivery to large numbers of scientists (internationally) without the need for traditional IT infrastructure.

ADEPT: integration of monomer and compound selection, library enumeration, profiling and design (A.R. Leach et al., J. Chem. Inf. Comput. Sci. 39(6) (1999) 1161-1172). [Workflow diagram with the following elements: chemical transform (e.g. via IsisDraw); enumerate virtual library; select pool of possible monomers / compounds; refine list according to functional groups, availability and simple properties; compound databases; data sources (e.g. Oracle; chemical knowledge); calculate properties / profile / apply model / fit to structure etc.; select monomers / compounds for experiment (product-based selection); virtual screening.]

Database searching. A wide variety of databases is available: ex-GW, ex-SB, ACD, BACD/XCD, WDI, natural products, virtual monomers/libraries... Substructure search via Chime/IsisDraw, with conversion to SMARTS using an in-house algorithm. Structure lookup: which of these molecules have we made before, and which could I buy? Lookup relies on a canonical representation (SMILES, easily manipulated as a string). Similarity search uses Merlin, which is fast.
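A minimal sketch of the structure lookup idea above: once every structure is reduced to a canonical SMILES string, "have we made it?" and "can I buy it?" become plain string lookups. The two registries and the `lookup` function are hypothetical, and the sketch assumes the strings have already been canonicalised by a chemistry toolkit.

```python
# Hypothetical registries keyed by canonical SMILES. A real system would
# canonicalise every structure with a chemistry toolkit before lookup.
made_before = {"CC(=O)O", "c1ccccc1"}
purchasable = {"c1ccccc1", "CCO"}


def lookup(canonical_smiles):
    """Report, per structure, whether it was made before or can be bought."""
    return {
        smi: {"made": smi in made_before, "buy": smi in purchasable}
        for smi in canonical_smiles
    }


result = lookup(["CC(=O)O", "CCO", "CCN"])
for smi, status in result.items():
    print(smi, status)
```

Because the key is an ordinary string, the same lookup works against a database index or an in-memory set.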

Filters. Can have a dramatic impact on the size of hit-lists. Functional group inclusion/exclusion and counts (e.g. only one carboxylic acid). Protecting groups. Reactivity/inappropriateness filters. Widely used for selecting compounds for screens, for HTS analysis and for compound acquisition. Specific filters for popular reactions (e.g. amide formation, reductive amination) were developed in collaboration with in-house experts. Users can define their own filter sets for re-use.
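The reusable filter sets described above can be sketched as named lists of count rules. This is an illustration only: the functional-group counts are assumed to be precomputed (in practice by SMARTS matching), and `CountFilter`, `AMIDE_ACID_FILTERS` and the group names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountFilter:
    """Require min_count <= count of a functional group <= max_count."""
    group: str
    min_count: int = 0
    max_count: int = 10**9

    def passes(self, counts):
        n = counts.get(self.group, 0)
        return self.min_count <= n <= self.max_count


# Hypothetical reusable filter set for the acid component of an amide formation:
# exactly one carboxylic acid, no acid chloride.
AMIDE_ACID_FILTERS = [
    CountFilter("carboxylic_acid", 1, 1),
    CountFilter("acid_chloride", 0, 0),
]


def apply_filters(compounds, filters):
    """Return names of compounds whose group counts pass every filter."""
    return [
        name for name, counts in compounds.items()
        if all(f.passes(counts) for f in filters)
    ]


# Precomputed functional-group counts (assumed to come from SMARTS matching).
compounds = {
    "mono_acid": {"carboxylic_acid": 1},
    "di_acid": {"carboxylic_acid": 2},
    "acid_chloride": {"carboxylic_acid": 1, "acid_chloride": 1},
}
passed = apply_filters(compounds, AMIDE_ACID_FILTERS)
print(passed)
```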

Selecting commercially available monomers using chemist's expertise and GaP.

Monomer class            In ACD   Passed filters   Chemist selection   Additional GaP selection   Total
Acids                    25,524   1109             247                 54                         301
Alcohols                 33,250   652              162                 39                         201
Aldehydes                2555     268              99                  19                         118
Primary alkyl amines     9356     380              106                 37                         143
Primary aryl amines      9875     231              66                  25                         91
Secondary alkyl amines   4112     199              89                  24                         113

Procedure: apply filters; chemist's selection (small, simple monomers; medicinal chemist's intuition); review and supplement with GaP. [Figure: example monomer structures.]

Library enumeration. Reaction transform approach: Daylight SMIRKS and the GW-MTZ language. A wide variety of reaction types must be covered (not just combichem reactions!): simple reactions (A + B → P); multi-step reactions (resin coupling, protection/deprotection), typically using one transform for each distinct chemical step; and specific cases, especially for lead optimisation, which can be very project-specific. IsisDraw/Chime: conversion of an rxn file to SMIRKS for user transformations. Problematic situations need to be handled: differential reactivity, multi-functional reagents, protecting group removal.
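To illustrate the simple A + B → P case, here is a toy enumeration of an amide library. It is a sketch under strong assumptions: the `amide_coupling` function is a naive string manipulation standing in for a real SMIRKS transform, and it only handles acids written with a trailing `C(=O)O` and amines written with a leading `N`.

```python
from itertools import product


def amide_coupling(acid, amine):
    """Toy transform: acid '...C(=O)O' + amine 'N...' -> amide '...C(=O)N...'.

    A stand-in for a real reaction transform; returns None when the
    reagent is not written in the expected form.
    """
    if not acid.endswith("C(=O)O") or not amine.startswith("N"):
        return None
    return acid[:-1] + amine  # drop the acid OH, attach the amine nitrogen


def enumerate_library(acids, amines):
    """Apply the transform to every acid/amine pair (full combinatorial array)."""
    library = {}
    for acid, amine in product(acids, amines):
        product_smiles = amide_coupling(acid, amine)
        if product_smiles is not None:
            library[(acid, amine)] = product_smiles
    return library


lib = enumerate_library(["CC(=O)O", "c1ccccc1C(=O)O"], ["NCC", "NC1CC1"])
for reagents, prod in lib.items():
    print(reagents, "->", prod)
```

A multi-step scheme would chain several such transforms, one per chemical step, feeding each product into the next.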

Multi-step reactions. [Figure: multi-step reaction scheme with substituents R1, R2 and R3, from Boojamra et al., J. Org. Chem. 62 (1997) 1240-1256.] Maps onto the chemistry actually performed. A small number of transforms can be used for many libraries. Electronic rehearsal helps identify problematic monomers early. May be slower, but one can always write a single-step transform if desired.

Property calculations (profile). Rapidly calculated from the 2D structure. Focus on properties understood by medicinal chemists. Simple counts: donors, acceptors, positively/negatively ionisable groups, rotatable bonds etc. Properties: logP, CMR, MW etc. Graphical display (histogram) or tabular output. Simultaneously filter on multiple properties and iterate.
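A small sketch of the profile step above: bin a property into a histogram and filter on several properties at once. The property values are invented for illustration, and `histogram` and `within_ranges` are hypothetical helper names.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    return Counter((v // bin_width) * bin_width for v in values)


def within_ranges(props, ranges):
    """True if every listed property lies inside its (low, high) range."""
    return all(low <= props[name] <= high for name, (low, high) in ranges.items())


# Invented property table; real values would be calculated from 2D structures.
profile = {
    "cpd1": {"MW": 180, "donors": 1, "logP": 1.2},
    "cpd2": {"MW": 420, "donors": 3, "logP": 3.9},
    "cpd3": {"MW": 610, "donors": 6, "logP": 5.8},
}

mw_hist = histogram([p["MW"] for p in profile.values()], 100)
ranges = {"MW": (0, 500), "donors": (0, 5), "logP": (-1.0, 5.0)}
kept = [name for name, props in profile.items() if within_ranges(props, ranges)]
print(sorted(mw_hist.items()), kept)
```

Iterating means adjusting `ranges` and re-running the filter until the profile looks acceptable.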

Property definitions. HB donors, acceptors, +/- charged/ionisable groups, rotatable bonds. Some issues: heavy-atom counts or hydrogen counts? Orthogonal definitions? Do some functional groups count as more than one? Through-bond effects (e.g. for basic groups). Hierarchical set of definitions, based on SMARTS: each definition is associated with an integral contribution (may be zero); an atom/group is flagged once identified and cannot match again.
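The hierarchical matching rule above (most specific definition first, each with an integer contribution, atoms flagged once matched) can be sketched as follows. The SMARTS matching itself is assumed to have been done already, so each definition carries its matches as tuples of atom indices; the definition names and the example molecule are hypothetical.

```python
def count_property(definitions, n_atoms):
    """Sum contributions of ordered definitions, flagging atoms as they match.

    definitions: list of (name, contribution, matches), most specific first;
    matches is a list of tuples of atom indices (assumed to come from SMARTS).
    """
    flagged = [False] * n_atoms
    total = 0
    for _name, contribution, matches in definitions:
        for atoms in matches:
            if any(flagged[a] for a in atoms):
                continue  # already claimed by an earlier definition
            for a in atoms:
                flagged[a] = True
            total += contribution  # may be zero: the match only blocks later ones
    return total


# Hypothetical 8-atom molecule with nitrogens at atoms 4 (amide) and 7 (amine).
positively_ionisable = [
    ("amide nitrogen", 0, [(4,)]),           # specific case, contributes nothing
    ("any aliphatic nitrogen", 1, [(4,), (7,)]),  # generic case
]
n_basic = count_property(positively_ionisable, n_atoms=8)
print(n_basic)
```

The zero-contribution definition is what stops the amide nitrogen from being counted by the generic rule that follows it.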

Selection methods. Library design: monomer selection. Monomer frequency analysis. PLUMS (SELECT). Cluster analysis: the aim is to re-order a list of compounds to facilitate visual inspection, rather than to provide an automatic selection (e.g. via cluster centroids). MaxMin diversity selection.
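For readers unfamiliar with MaxMin diversity selection, named above, here is a minimal sketch of the usual greedy form: repeatedly pick the compound whose nearest already-selected neighbour is furthest away. Fingerprints are modelled as sets of "on" bits with Tanimoto distance; the starting compound and the example fingerprints are arbitrary assumptions.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def maxmin_select(fps, k, seed=0):
    """Greedy MaxMin: pick the item furthest from its nearest selected item."""
    selected = [seed]
    # Distance from each item to its nearest selected item so far.
    min_dist = [1.0 - tanimoto(fps[seed], fp) for fp in fps]
    while len(selected) < min(k, len(fps)):
        candidates = [i for i in range(len(fps)) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], 1.0 - tanimoto(fps[nxt], fp))
    return selected


fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {7, 8}]
picked = maxmin_select(fps, k=3)
print(picked)
```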

Storing reaction schemes and filters. Re-entry of information is a pain: chemists often use the same reaction schemes and filter sets. A mechanism is wanted to store (and search) reaction schemes and sets of filters. Has someone else already done this chemical transformation? Reaction scheme database: normalisation (e.g. one scheme, multiple users); a mix of Oracle and Daylight. Reaction hierarchy: scheme → stage → step. Retrieval according to userid or reaction SMARTS.
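The scheme → stage → step hierarchy and the "one scheme, multiple users" normalisation could be modelled roughly as below. This is an in-memory sketch, not the Oracle/Daylight implementation; all class names, field names and example entries are hypothetical, and the transforms are placeholders rather than real SMIRKS.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    transform: str  # placeholder for a reaction transform string


@dataclass
class Stage:
    monomer_sets: int  # number of monomer sets introduced at this stage
    steps: list


@dataclass
class Scheme:
    name: str
    stages: list
    users: set = field(default_factory=set)  # one scheme, many users


def schemes_for_user(schemes, userid):
    """Retrieve the names of all schemes a given user has used."""
    return [s.name for s in schemes if userid in s.users]


resin_amide = Scheme(
    name="resin-bound amide",
    stages=[
        Stage(1, [Step("couple to resin", "<transform>"),
                  Step("remove Fmoc", "<transform>")]),
        Stage(1, [Step("amide formation", "<transform>")]),
    ],
    users={"user_a", "user_b"},
)
n_steps = sum(len(stage.steps) for stage in resin_amide.stages)
print(schemes_for_user([resin_amide], "user_a"), n_steps)
```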

[Figure: reaction scheme on resin, annotated with the stored steps: couple to resin (12), remove Fmoc (5), thiazolidinone formation (21), esterification (4), general deprotection (15), remove from resin (13). Annotations: "one stage, three components, one step" and "one stage, one component, two steps".] Overall: 3 monomer sets, 4 stages, 6 steps. C.P. Holmes et al., J. Org. Chem. 60(22) (1995) 7328-7333; G.C. Look et al., Bioorg. Med. Chem. Lett. 6(6) (1996) 707-712.

Added benefits. Automated enumeration, inspired by 1-click shopping: define initial monomer sets, specify filters and a reaction scheme; this facilitates the creation of virtual libraries. Data-mining: log all library enumerations and monomer sets. What are the most popular reaction schemes and monomers?
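Once every enumeration is logged, the data-mining questions above reduce to counting. A minimal sketch with an invented log format (the field names and entries are assumptions, not the actual ADEPT log):

```python
from collections import Counter

# Invented log entries: one record per library enumeration.
enumeration_log = [
    {"user": "user_a", "scheme": "amide formation", "monomers": ["CC(=O)O", "NCC"]},
    {"user": "user_b", "scheme": "amide formation", "monomers": ["CC(=O)O", "NC1CC1"]},
    {"user": "user_a", "scheme": "reductive amination", "monomers": ["CC=O", "NCC"]},
]

scheme_usage = Counter(entry["scheme"] for entry in enumeration_log)
monomer_usage = Counter(m for entry in enumeration_log for m in entry["monomers"])

print(scheme_usage.most_common(1))
print(monomer_usage.most_common(2))
```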

Reaction scheme usage. [Bar chart: usage count per stored reaction scheme (vertical axis 0 to 300); each scheme is drawn as a generic transform with R1, R2 and R3 groups and [Cl,Br,I] or [Cl,Br] halides.]

Monomer usage. [Chart: usage count per monomer (vertical axis 0 to 250) for monomers numbered 1 to 196, with example monomer structures.]

Profile comparison. Comparison of property profiles provided a visual indication of problems with early libraries. [Two histograms comparing the sets wdi, gl41, gl53, gl307, gl357, gl435 and core96: rotatable bonds (0 to 20) and molecular weight (bins 12.5 to 837.5).]

Profile comparison in library design and sample set selection. Library design: compare the profile of the initial/designed library with pre-defined sets; incorporate concepts of drug-likeness, lead-likeness, 7TM-likeness etc.; SELECT program. Set selection: plates previously constructed for focussed screening are much easier to dispense than cherry-picks. Which set(s) are most suitable for my target? In what order should the sets be screened?

Quantitative profile comparison. Reinforces visual comparison. Chi-squared and Kolmogorov-Smirnov methods. Combine several properties: different weights, data fusion. [Histogram comparing the sets ss54, ss57, ss67, ss290 and ss604; bins 12.5 to 512.5.]

ID      ss54      ss57      ss67      ss290     ss604
ss54    0         4.88257   5.77708   5.6378    9.00608
ss57    4.88257   0         8.10324   5.97339   8.11271
ss67    5.77708   8.10324   0         8.6757    12.5477
ss290   5.6378    5.97339   8.6757    0         6.2015
ss604   9.00608   8.11271   12.5477   6.2015    0
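As a rough illustration of comparing two binned property profiles, the sketch below computes a symmetric chi-squared distance and a Kolmogorov-Smirnov statistic on normalised histograms. These are common textbook forms chosen for the example; the slide does not say which exact variants or scaling produced the values in the table, so the numbers are not expected to match.

```python
from itertools import accumulate


def normalise(counts):
    """Convert bin counts to proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def chi_squared(a, b):
    """Symmetric chi-squared distance between two binned profiles."""
    p, q = normalise(a), normalise(b)
    return sum((x - y) ** 2 / (x + y) for x, y in zip(p, q) if x + y > 0)


def ks_statistic(a, b):
    """Largest gap between the two cumulative distributions."""
    cp = accumulate(normalise(a))
    cq = accumulate(normalise(b))
    return max(abs(x - y) for x, y in zip(cp, cq))


# Invented bin counts for one property in two sample sets (same bins).
set_a = [5, 20, 40, 25, 10]
set_b = [10, 30, 35, 20, 5]
print(chi_squared(set_a, set_b), ks_statistic(set_a, set_b))
```

Several properties could then be combined by a weighted sum of the per-property distances, in the spirit of the "different weights" note on the slide.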

Lead-likeness vs. drug-likeness. Is there a difference between a lead and a drug? W. Sneader, Drug Prototypes and their Exploitation, provides the drug prototype; ca. 500 compounds selected. Compare properties of prototype and final drug*. [Two percentage histograms comparing wdi, sneader_start and sneader_end: MW (bins 12.5 to 1013) and number of acceptors (0 to 16).] * M. Hann, A. Leach and G. Harper, in press. See also Teague et al., Angew. Chem., Int. Ed. 1999, 38(24), 3743-3748.

[Workflow diagram with the following boxes: active compounds / library samples; filter to remove undesirable compounds; substructure search; active structures of interest; library design; similarity search; cluster compounds; define chemistry; identify possible monomers; profiling; profile-based set selection; pharmacophore generation; 3D search; library enumeration; profiling; monomer selection; final monomer selection; order compounds and test; VSVL; order monomers, synthesise and test.]

Other in-house web applications. RECAP: identify privileged monomers. GaP (Gridding and Partitioning): pharmacophore analysis for monomers and molecules. HTS analysis tools. SIV: Selection by Interactive Visualisation. Visualisation of protein-ligand crystal structures: facilitates understanding by medicinal chemists.

New concepts: education issues. Gridding and Partitioning (GaP): monomer analysis using pharmacophores, for monomer acquisition and monomer selection (hits-to-leads, lead optimisation). [Figure: an acid and an aromatic ring monomer drawn on x, y, z axes with the attachment group at the origin, free rotation about the attachment bond along the x-axis, and pharmacophore points marked.] Track the locations of pharmacophores within a regular grid to give a pharmacophore key, e.g. ...001101000111000101110001100010... How can we communicate these concepts to the local chemistry community? A.R. Leach et al., J. Chem. Inf. Comput. Sci. 40 (2000) 1262-1269.
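The idea of tracking pharmacophore locations in a regular grid and recording them as a bit string can be sketched as below. This is a simplification: it takes a fixed list of 3D points for one pharmacophore type and ignores conformational sampling and rotation about the attachment bond; the grid size, cell size and the `gap_key` name are assumptions for the example.

```python
def gap_key(points, cell=1.0, n=2, corner=(0.0, 0.0, 0.0)):
    """Return a bit string with one bit per cell of an n x n x n grid.

    points: (x, y, z) pharmacophore positions with the attachment group
    assumed to sit at the origin; points outside the grid are ignored.
    """
    bits = ["0"] * (n ** 3)
    for point in points:
        idx = [int((c - lo) // cell) for c, lo in zip(point, corner)]
        if all(0 <= i < n for i in idx):
            ix, iy, iz = idx
            bits[ix * n * n + iy * n + iz] = "1"
    return "".join(bits)


# Invented donor positions for one monomer.
key = gap_key([(0.5, 0.5, 0.5), (1.5, 0.5, 1.5), (3.0, 0.0, 0.0)])
print(key)
```

Keys for different monomers can then be combined with bitwise OR to see which cells a whole monomer set covers.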

Run a competition! The aim is to select the optimal set of 10 monomers out of 50. The scoring function f incorporates pharmacophore coverage and physical properties (logP). Compare to Genetic Algorithm selection. The score f is built from two terms: a pharmacophore term, $\frac{1}{\text{number of centres}} \sum_{\text{pharmacophore centres}} \frac{\text{boxes filled by subset}}{\text{boxes occupied by subset}}$, and a clogP term, $\frac{\text{number of clogP bins filled by subset}}{\text{total number of clogP bins occupied by entire set}}$.
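A hedged sketch of how such a score might be computed. Two things are assumptions, since the transcript does not preserve them: the two terms are multiplied together, and the pharmacophore term is normalised by the boxes occupied by the entire pool (by analogy with the clogP term). The data layout and the `score` function name are hypothetical.

```python
def score(subset, pool, boxes, clogp_bin):
    """Coverage score for a monomer subset relative to the whole pool.

    boxes: monomer -> {pharmacophore centre: set of occupied grid boxes}
    clogp_bin: monomer -> clogP bin index
    ASSUMPTIONS: terms are multiplied; denominators use the entire pool.
    """
    centres = sorted({c for m in pool for c in boxes[m]})
    coverage = 0.0
    for centre in centres:
        pool_boxes = set().union(*(boxes[m].get(centre, set()) for m in pool))
        subset_boxes = set().union(*(boxes[m].get(centre, set()) for m in subset))
        coverage += len(subset_boxes) / len(pool_boxes)
    coverage /= len(centres)  # average over pharmacophore centres
    pool_bins = {clogp_bin[m] for m in pool}
    subset_bins = {clogp_bin[m] for m in subset}
    return coverage * len(subset_bins) / len(pool_bins)


# Invented three-monomer pool.
boxes = {
    "a": {"donor": {1, 2}},
    "b": {"donor": {2, 3}, "acceptor": {5}},
    "c": {"acceptor": {5, 6}},
}
clogp_bin = {"a": 0, "b": 1, "c": 1}
pool = ["a", "b", "c"]
print(score(["a", "b"], pool, boxes, clogp_bin))
```

A competition entry or a genetic algorithm would call `score` on each candidate subset of 10 and keep the best.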

score         name            tries
0.646201006   jw              21
0.646201006   kvh             120
0.646201006   ag              29
0.646201006   GA
0.632384107   reb             26
0.631906009   gmr             339
0.631742362   jpr             9
0.11222195    nka             2
0.084873329   drp             1
0.068448476   2d_clustering

Summary. The web (intranet) provides a straightforward mechanism to deliver useful tools to a large (international) audience of both experts and non-experts. Installation and updating are straightforward. Everyone knows how to use a browser: keep it simple! Task-oriented. A valuable method for education and training (especially true for ADEPT/library design). Enables non-experts to do it for themselves, so experts can focus on more complex problems.

Acknowledgements. Gavin Harper, Mike Hann, Darren Green, Francis Atkinson, Gianpaolo Bravi, Alfonso Pozzan, Duncan Judd, Val Gillet, Peter Willett.