Web tools for Monomer Selection, Library Design and Compound Acquisition
Andrew Leach
GlaxoSmithKline Research and Development, Stevenage
Historical perspective
- Bench scientists unused to dealing with and manipulating large numbers of structures
  - library design, HTS analysis, compound acquisition
- Many rapidly changing developments within expert computational chemistry and chemoinformatics groups
- Web facilitates delivery to large numbers of scientists (internationally) without need for traditional IT infrastructure
ADEPT: integration of monomer & compound selection, library enumeration, profiling and design
A.R. Leach et al., J. Chem. Inf. Comput. Sci. 39(6) (1999) 1161-1172
[Workflow diagram; boxes in slide order]
- Chemical transform (e.g. via IsisDraw)
- Enumerate virtual library
- Select pool of possible monomers / compounds
- Refine list according to functional groups; availability; simple properties
- Compound databases
- Data sources (e.g. Oracle; chemical knowledge)
- Calculate properties / profile / apply model / fit to structure etc.
- Select monomers / compounds for experiment: product-based selection
- Virtual screening
Database searching
- Wide variety of databases available
  - ex-gw, ex-sb, ACD, BACD/XCD, WDI, natural products, virtual monomers/libraries...
- Substructure search via Chime/IsisDraw: conversion to SMARTS using in-house algorithm
- Structure lookup: which of these molecules have we made before, and which could I buy?
  - Canonical representation (SMILES: easily manipulated as a string)
- Similarity search
- Merlin: fast
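The structure lookup above relies on canonical strings, so "have we made this before?" becomes a plain string-membership test. A minimal sketch, assuming every SMILES has already been canonicalised by the same toolkit (the canonicalisation step is not shown, and the database names and contents here are hypothetical):

```python
# Hypothetical databases: sets of SMILES assumed to be ALREADY canonicalised
# by one and the same toolkit. Canonicalisation itself is not shown.
DATABASES = {
    "in_house": {"CC(=O)O", "Nc1ccccc1"},
    "ACD": {"CC(=O)O", "OCC"},
}

def lookup(canonical_smiles, databases=DATABASES):
    """Return, for each structure, the sorted names of databases containing it."""
    return {
        smi: sorted(name for name, members in databases.items() if smi in members)
        for smi in canonical_smiles
    }

hits = lookup(["CC(=O)O", "OCC", "CCCC"])
for smi, found_in in hits.items():
    print(smi, found_in or "not found")
```

An empty list means the structure is in neither collection, i.e. neither made before nor available to buy in this toy data.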
Filters
- Can have a dramatic impact on size of hit-lists
- Functional group inclusion/exclusion, counts (e.g. only one carboxylic acid)
- Protecting groups
- Reactivity/inappropriateness filters
- Widely used for selecting compounds for screens, for HTS analysis and compound acquisition
- Specific filters for popular reactions (e.g. amide formation, reductive amination) developed in collaboration with in-house experts
- Can define one's own filter sets for re-use
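One way to picture a reusable filter set is as a table of allowed count ranges per functional group. The sketch below assumes the functional-group counts have already been obtained by substructure matching (not shown); the compound names, group names and the filter set itself are hypothetical.

```python
# Hypothetical per-compound functional-group counts; in practice these would
# come from substructure (e.g. SMARTS) matching, which is not shown here.
COMPOUNDS = {
    "cpd1": {"carboxylic_acid": 1, "acid_chloride": 0},
    "cpd2": {"carboxylic_acid": 2, "acid_chloride": 0},
    "cpd3": {"carboxylic_acid": 1, "acid_chloride": 1},
}

# A reusable filter set: group name -> (minimum count, maximum count).
ACID_MONOMER_FILTERS = {
    "carboxylic_acid": (1, 1),  # inclusion: exactly one carboxylic acid
    "acid_chloride": (0, 0),    # exclusion: no reactive acid chloride
}

def passes(counts, filter_set):
    """True if every group count lies inside its allowed range."""
    return all(lo <= counts.get(group, 0) <= hi
               for group, (lo, hi) in filter_set.items())

def apply_filters(compounds, filter_set):
    """Return the names of the compounds that pass all filters."""
    return [name for name, counts in compounds.items()
            if passes(counts, filter_set)]

selected = apply_filters(COMPOUNDS, ACID_MONOMER_FILTERS)
print(selected)
```

Keeping the filter set as plain data is what makes it easy to store and re-use across searches.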
Selecting commercially available monomers using chemist's expertise and GaP

Monomer class        In ACD   Passed Filters   Chemist Selection   Additional GaP Selection   Total
Acids                25,524   1109             247                 54                         301
Alcohols             33,250   652              162                 39                         201
Aldehydes            2555     268              99                  19                         118
1ary Alkyl Amines    9356     380              106                 37                         143
1ary Aryl Amines     9875     231              66                  25                         91
2ary Alkyl Amines    4112     199              89                  24                         113

Procedure:
- Apply Filters
- Chemist's Selection
  - Small, Simple Monomers
  - Medicinal Chemist's intuition
- Review and supplement with GaP
[Figure: example monomer structures]
Library enumeration
- Reaction transform approach: Daylight SMIRKS and GW-MTZ language
- Wide variety of reaction types must be covered (not just combichem reactions!)
  - Simple reactions: A + B -> P
  - Multi-step reactions
  - Resin coupling, protection/deprotection
- Typically use one transform for each distinct chemical step
- Specific cases, especially for lead optimisation, can be very project-specific
- IsisDraw/Chime: conversion of rxn file to SMIRKS for user transformations
- Problematic situations need to be handled
  - differential reactivity
  - multi-functional reagents
  - protecting group removal
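The A + B -> P case amounts to applying one transform to every pair of monomers and noting which monomers the transform cannot handle. The sketch below is not SMIRKS and not the in-house tooling: it uses a deliberately naive string manipulation as a stand-in for a real reaction transform, and it only works for acid SMILES written to end in "C(=O)O" and amine SMILES written to start with "N". All names are hypothetical.

```python
from itertools import product

def toy_amide_coupling(acid, amine):
    """Toy stand-in for a reaction transform (NOT a real SMIRKS engine).

    Joins an acid SMILES ending in 'C(=O)O' to an amine SMILES starting
    with 'N'. Returns None when a monomer does not fit the expected pattern.
    """
    if not acid.endswith("C(=O)O") or not amine.startswith("N"):
        return None
    return acid[:-1] + amine  # drop the acid OH oxygen, attach the amine N

def enumerate_library(acids, amines, transform=toy_amide_coupling):
    """Apply the transform to every acid/amine pair.

    Returns (products, problems), where problems lists the monomer pairs
    the transform could not process.
    """
    products, problems = [], []
    for acid, amine in product(acids, amines):
        result = transform(acid, amine)
        if result is None:
            problems.append((acid, amine))
        else:
            products.append(result)
    return products, problems

products, problems = enumerate_library(
    ["CC(=O)O", "c1ccccc1C(=O)O"], ["NCC", "OCC"])
print(products)
print(problems)
```

Reporting the failed pairs rather than silently dropping them is the useful part: it flags monomers that do not fit the chemistry before any synthesis is attempted.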
Multi-step reactions
[Reaction scheme: multi-step synthesis with variable groups R1, R2 and R3]
- Maps onto the chemistry actually performed
- Small number of transforms can be used for many libraries
- Electronic rehearsal helps identify problematic monomers early
- May be slower, but can always write a single step transform if desired
Boojamra et al. J. Org. Chem. 62 (1997) 1240-1256
Property calculations (profile)
- Rapidly calculated from 2D structure
- Focus on properties understood by medicinal chemists
  - Simple counts: donors, acceptors, positively/negatively ionisable groups, rotatable bonds etc.
  - Properties: logP, CMR, MW etc.
- Graphical display (histogram) or tabular output
- Simultaneously filter on multiple properties and iterate
Property Definitions
- HB donors, acceptors, +/- charged/ionisable, rot. bonds
- Some issues:
  - Heavy-atom counts or hydrogen-count?
  - Orthogonal definitions?
  - Do some functional groups count as more than one?
  - Through-bond effects (e.g. for basic groups)
- Hierarchical set of definitions
  - Based on SMARTS: each definition is associated with an integral contribution (may be zero)
  - Flag atom/group once identified: cannot match again
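The "flag once, cannot match again" rule can be sketched as a single pass over an ordered list of definitions. The example assumes the SMARTS matching has already been done and is supplied as atom-index tuples; the definition names, contributions and matches are hypothetical and chosen only to show the mechanism.

```python
def count_property(definitions, matches):
    """Sum contributions over an ordered hierarchy of definitions.

    definitions: list of (name, integer contribution), most specific first.
    matches: name -> list of atom-index tuples matched by that definition
             (as a SMARTS matcher would return; matching is not shown).
    """
    flagged = set()
    total = 0
    for name, contribution in definitions:
        for atoms in matches.get(name, []):
            if flagged.intersection(atoms):
                continue  # an atom was already claimed by an earlier definition
            flagged.update(atoms)
            total += contribution
    return total

# Hypothetical hydrogen-bond donor definitions for a molecule with an
# amide N-H (atom 3) and a primary amine NH2 (atom 7).
DONOR_DEFINITIONS = [
    ("amide_NH", 1),
    ("primary_amine", 2),  # hydrogen-count convention: NH2 counts twice
    ("any_NH", 1),         # generic fallback, tried last
]
MATCHES = {
    "amide_NH": [(3,)],
    "primary_amine": [(7,)],
    "any_NH": [(3,), (7,)],
}

donors = count_property(DONOR_DEFINITIONS, MATCHES)
print(donors)
```

Because atoms 3 and 7 are flagged by the specific definitions, the generic fallback adds nothing. A definition with contribution zero would work the same way, claiming atoms so that later definitions cannot count them.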
Selection methods
- Library design: monomer selection
  - Monomer frequency analysis
  - PLUMS (SELECT)
- Cluster analysis
  - Aim is to re-order a list of compounds to facilitate visual inspection, rather than provide an automatic selection (e.g. via cluster centroids)
- MaxMin diversity selection
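MaxMin diversity selection is commonly implemented as a greedy loop: start from a seed compound and repeatedly add the candidate whose nearest already-selected neighbour is furthest away. A minimal sketch, assuming fingerprints are held as sets of on-bits and using Tanimoto distance (the slides do not state which descriptor or distance was used; the fingerprints below are made up):

```python
def tanimoto_distance(a, b):
    """1 - Tanimoto similarity for fingerprints held as sets of on-bits."""
    union = len(a | b)
    similarity = len(a & b) / union if union else 1.0
    return 1.0 - similarity

def maxmin_select(fingerprints, n_select, seed_index=0):
    """Greedy MaxMin selection; returns indices in the order they were picked."""
    selected = [seed_index]
    # Distance from each candidate to its nearest selected compound so far.
    nearest = [tanimoto_distance(fp, fingerprints[seed_index])
               for fp in fingerprints]
    while len(selected) < min(n_select, len(fingerprints)):
        candidate = max(
            (i for i in range(len(fingerprints)) if i not in selected),
            key=lambda i: nearest[i],
        )
        selected.append(candidate)
        for i, fp in enumerate(fingerprints):
            nearest[i] = min(nearest[i],
                             tanimoto_distance(fp, fingerprints[candidate]))
    return selected

# Hypothetical fingerprints.
FPS = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
picked = maxmin_select(FPS, 3)
print(picked)
```

The result depends on the seed, which is one reason such selections are often repeated from several starting points.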
Storing reaction schemes, filters
- Re-entry of information is a pain: chemists often use the same reaction scheme, filter sets
- Want mechanism to store (and search) reaction schemes and sets of filters
  - Has someone else already done this chemical transformation?
- Reaction scheme database
  - normalisation (e.g. one scheme; multiple users)
  - Mix of Oracle and Daylight
  - Reaction hierarchy: scheme -> stage -> step
  - Retrieval according to userid, reaction SMARTS
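The scheme -> stage -> step hierarchy, with one stored scheme shared by several users, might be modelled along the following lines. This is only an in-memory illustration with hypothetical class, field and user names; the actual storage described above is a mix of Oracle and Daylight, and retrieval by reaction SMARTS would need a chemistry toolkit and is not shown.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    transform: str  # reaction transform text; placeholder strings used below

@dataclass
class Stage:
    steps: list = field(default_factory=list)

@dataclass
class Scheme:
    name: str
    users: set = field(default_factory=set)  # one scheme, multiple users
    stages: list = field(default_factory=list)

    def step_count(self):
        """Total number of steps across all stages."""
        return sum(len(stage.steps) for stage in self.stages)

def schemes_for_user(schemes, userid):
    """Retrieve the names of the schemes a given user has used."""
    return [s.name for s in schemes if userid in s.users]

# Hypothetical content.
coupling = Scheme(
    name="resin amide coupling",
    users={"user_a", "user_b"},
    stages=[
        Stage(steps=[Step("couple to resin", "<transform 1>"),
                     Step("remove fmoc", "<transform 2>")]),
        Stage(steps=[Step("remove from resin", "<transform 3>")]),
    ],
)
alkylation = Scheme(name="alkylation", users={"user_b"},
                    stages=[Stage(steps=[Step("alkylate", "<transform 4>")])])

print(schemes_for_user([coupling, alkylation], "user_b"))
print(coupling.step_count())
```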
[Reaction scheme drawing, annotated with stages and steps]
- Stage annotations: "One stage, three components, one step"; "One stage, one component, two steps"
- Steps (numbers as shown on the slide): couple to resin (12); remove fmoc (5); thiazolidinone formation (21); esterification (4); general deprotection (15); remove from resin (13)
- Overall: 3 monomer sets, 4 stages, 6 steps
C.P. Holmes et al. J. Org. Chem. 60(22) (1995) 7328-7333
G.C. Look et al. Bioorg. Med. Chem. Lett. 6(6) (1996) 707-712
Added benefits
- Automated enumeration
  - inspired by 1-click shopping
  - define initial monomer sets, specify filters and reaction scheme
  - facilitate creation of virtual libraries
- Data-mining
  - log all library enumerations, monomer sets
  - What are the most popular reaction schemes, monomers?
Reaction scheme usage
[Bar chart: usage count (y-axis 0 to 300) for the most frequently used reaction schemes, each labelled with its reaction drawing]
Monomer usage
[Chart: monomer usage counts; y-axis 0 to 250, x-axis 1 to 196; example monomer structures shown]
Profile comparison
- Comparison of property profiles provided visual indication of problems with early libraries
[Histograms for sets wdi, gl41, gl53, gl307, gl357, gl435, core96: rotatable bonds (x-axis 0 to 20) and molecular weight (x-axis bins 12.5 to 837.5)]
Profile comparison in library design and sample set selection
- Library design
  - Compare profile of initial/design library with pre-defined sets
  - Incorporate concepts of drug-likeness, lead-likeness, 7TM-likeness etc.
  - SELECT program
- Set selection
  - Plates previously constructed for focussed screening
  - Much easier to dispense than cherry-picks
  - Which set(s) are most suitable for my target?
  - In what order should the sets be screened?
Quantitative profile comparison
[Histogram: property profiles of sets ss54, ss57, ss67, ss290, ss604; y-axis 0 to 25, x-axis bins 12.5 to 512.5]

ID      ss54      ss57      ss67      ss290     ss604
ss54    0         4.88257   5.77708   5.6378    9.00608
ss57    4.88257   0         8.10324   5.97339   8.11271
ss67    5.77708   8.10324   0         8.6757    12.5477
ss290   5.6378    5.97339   8.6757    0         6.2015
ss604   9.00608   8.11271   12.5477   6.2015    0

- Reinforce visual comparison
- Chi-squared, Kolmogorov-Smirnov methods
- Combine several properties
  - different weights
  - data fusion
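Both measures named above can be computed directly from binned profiles. The sketch below is a generic illustration: it uses a symmetric chi-squared distance on normalised histograms and the Kolmogorov-Smirnov statistic on their cumulative sums, plus a weighted sum across properties. The exact formulas and weights behind the matrix on the slide are not given in the source, so these choices are assumptions, and the histograms are made up.

```python
def normalise(hist):
    """Convert bin counts to fractions summing to 1."""
    total = sum(hist)
    return [h / total for h in hist]

def chi_squared(hist_a, hist_b):
    """Symmetric chi-squared distance between two histograms on the same bins."""
    p, q = normalise(hist_a), normalise(hist_b)
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)

def kolmogorov_smirnov(hist_a, hist_b):
    """Largest absolute gap between the two cumulative distributions."""
    p, q = normalise(hist_a), normalise(hist_b)
    cum_p = cum_q = gap = 0.0
    for a, b in zip(p, q):
        cum_p += a
        cum_q += b
        gap = max(gap, abs(cum_p - cum_q))
    return gap

def combined_distance(profiles_a, profiles_b, weights):
    """Weighted sum of per-property chi-squared distances."""
    return sum(w * chi_squared(profiles_a[k], profiles_b[k])
               for k, w in weights.items())

# Hypothetical binned profiles for two sets.
set_a = {"mw": [10, 20, 10], "rot_bonds": [5, 5]}
set_b = {"mw": [20, 10, 10], "rot_bonds": [5, 5]}

print(chi_squared(set_a["mw"], set_b["mw"]))
print(kolmogorov_smirnov(set_a["mw"], set_b["mw"]))
print(combined_distance(set_a, set_b, {"mw": 1.0, "rot_bonds": 0.5}))
```

Identical profiles give zero under both measures, matching the zero diagonal of a pairwise distance matrix.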
Lead-likeness vs. drug-likeness
- Is there a difference between a lead and a drug?
- W. Sneader: Drug Prototypes and their Exploitation
  - Provides the drug prototype
  - ca. 500 compounds selected
- Compare properties of prototype and final drug *
[Histograms: percentage of compounds for wdi, sneader_start and sneader_end, plotted against MW (x-axis bins 12.5 to 1013) and against number of acceptors (x-axis 0 to 16)]
* M. Hann, A. Leach and G. Harper, in press
See also Teague et al. Angew. Chem., Int. Ed. 1999, 38(24), 3743-3748
[Workflow diagram; boxes in slide order]
- Active compounds / library samples
- Filter to remove undesirable compounds
- Substructure search
- Active structures of interest
- Library design
- Similarity search
- Cluster compounds
- Define chemistry
- Identify possible monomers
- Profiling
- Profile-based set selection
- Pharmacophore generation
- 3D search
- Library enumeration
- Profiling
- Monomer selection
- Final monomer selection
- Order compounds & test
- VSVL
- Order monomers, synthesise & test
Other in-house web applications
- RECAP
  - Identify privileged monomers
- GaP: Gridding and Partitioning
  - Pharmacophore analysis for monomers and molecules
- HTS analysis tools
  - SIV: Selection by Interactive Visualisation
- Visualisation of protein-ligand crystal structures
  - Facilitate understanding by medicinal chemists
New concepts: Education issues
- Gridding and Partitioning (GaP)
- GaP: monomer analysis using pharmacophores
  - Monomer acquisition
  - Monomer selection (hits-to-leads, lead optimisation)
[Figure: acid and aromatic ring monomers drawn on x, y, z axes with the attachment group at the origin, free rotation about the attachment bond (x-axis), and pharmacophore points marked]
- Track locations of pharmacophores within regular grid
- Pharmacophore key: ...001101000111000101110001100010...
- How can we communicate these concepts to the local chemistry community?
A.R. Leach et al., J. Chem. Inf. Comput. Sci. 40 (2000) 1262-1269
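The idea of tracking pharmacophore locations in a regular grid and recording them as a bit string can be sketched as follows. This is a heavily simplified illustration, not the published GaP procedure: it handles a single pharmacophore type, takes the 3D point locations as given (no conformational sampling or rotation about the attachment bond), and the grid size, cell size and origin are hypothetical.

```python
def pharmacophore_key(points, cell=1.0, n_cells=4, origin=(-2.0, -2.0, -2.0)):
    """Map 3D point locations onto a regular grid and return a bit string.

    One bit per grid cell; '1' means at least one pharmacophore point falls
    inside that cell. Points outside the grid are ignored.
    """
    bits = [0] * n_cells ** 3
    for point in points:
        idx = [int((c - o) // cell) for c, o in zip(point, origin)]
        if all(0 <= i < n_cells for i in idx):
            i, j, k = idx
            bits[(i * n_cells + j) * n_cells + k] = 1
    return "".join(map(str, bits))

# Hypothetical pharmacophore point locations (attachment group at the origin).
POINTS = [(-1.5, -1.5, -1.5), (1.5, 1.5, 1.5), (1.6, 1.4, 1.7), (5.0, 0.0, 0.0)]
key = pharmacophore_key(POINTS)
print(key)
```

Keys built this way can be combined across a monomer set with a bitwise OR, which is what makes coverage of the grid easy to measure.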
Run a competition!
- Aim is to select the optimal set of 10 monomers out of 50
- Scoring function incorporates pharmacophore coverage and physical properties (logP)
- Compare to Genetic Algorithm selection
- Scoring function f combines two terms:
  - pharmacophore term: $\frac{1}{\text{number of centres}} \sum_{\text{pharmacophore centres}} \frac{\text{boxes filled by subset}}{\text{boxes occupied by subset}}$
  - clogP term: $\frac{\text{number of clogP bins filled by subset}}{\text{total number of clogP bins occupied by entire set}}$
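A scoring function of this kind can be sketched in a few lines. Two assumptions are made loudly here because the slide text does not settle them: the two terms are multiplied together, and the denominator of the pharmacophore term is the number of boxes occupied by the entire monomer set (by analogy with the clogP term). The data layout and all names are hypothetical.

```python
def score(subset, full_set, boxes, clogp_bin):
    """Toy competition score for a subset of monomers.

    boxes[m][centre] -> set of grid boxes monomer m occupies for that
                        pharmacophore centre type.
    clogp_bin[m]     -> clogP bin index of monomer m.
    ASSUMPTIONS: the terms are multiplied, and coverage is measured relative
    to the boxes occupied by the entire set.
    """
    centres = sorted({c for m in full_set for c in boxes[m]})
    coverage, counted = 0.0, 0
    for c in centres:
        occupied = set().union(*(boxes[m].get(c, set()) for m in full_set))
        if not occupied:
            continue  # nothing to cover for this centre type
        filled = set().union(*(boxes[m].get(c, set()) for m in subset))
        coverage += len(filled) / len(occupied)
        counted += 1
    coverage = coverage / counted if counted else 0.0
    bins_filled = {clogp_bin[m] for m in subset}
    bins_total = {clogp_bin[m] for m in full_set}
    return coverage * len(bins_filled) / len(bins_total)

# Hypothetical three-monomer set.
FULL_SET = ["a", "b", "c"]
BOXES = {
    "a": {"donor": {1, 2}},
    "b": {"donor": {2, 3}, "acceptor": {5}},
    "c": {"acceptor": {6}},
}
CLOGP_BIN = {"a": 0, "b": 1, "c": 1}

print(score(["a", "b"], FULL_SET, BOXES, CLOGP_BIN))
print(score(["b", "c"], FULL_SET, BOXES, CLOGP_BIN))
```

With only 10 monomers chosen from 50 there are far too many subsets to score exhaustively, which is why a search method such as a genetic algorithm is a natural benchmark for human entrants.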
Competition results

score         name            tries
0.646201006   jw              21
0.646201006   kvh             120
0.646201006   ag              29
0.646201006   GA
0.632384107   reb             26
0.631906009   gmr             339
0.631742362   jpr             9
0.11222195    nka             2
0.084873329   drp             1
0.068448476   2d_clustering
Summary
- Web (intranet) provides a straightforward mechanism to deliver useful tools to a large (international) audience: both experts and non-experts
- Installation and updating straightforward
- Everyone knows how to use a browser: keep it simple!
- Task-oriented
- Valuable method for education and training (esp. true for ADEPT/library design)
- Enables non-experts to do it for themselves: experts can focus on more complex problems
Acknowledgements
Gavin Harper, Mike Hann, Darren Green, Francis Atkinson, Gianpaolo Bravi, Alfonso Pozzan, Duncan Judd, Val Gillet, Peter Willett