The PhilOEsophy. There are only two fundamental molecular descriptors

Where can we use shape?
- Virtual screening: more effective than 2D
- Lead-hopping: shape analogues are not graph analogues
- Molecular alignment: no requirement for (manual) atom matching
- Pose generation/prediction: matching a binding site; matching a bound ligand

Where can we use electrostatics?
- Lead-hopping: electrostatic analogues are not graph analogues
- Solvent treatment: continuum and semi-continuum

Virtual Screening
[Diagram labels: Protein Preparation; Compound Collection; Database Preparation; Screening Database; Ligand Preparation; Structure-based; 3D & 2D; Ligand-based; Hybrid & Consensus.]

Using a protein structure
- Pose v. protein: FRED
- Score v. protein: FRED/SZYBKI
- Score v. ligand: ROCS/EON
- Pose v. ligand: ROCS
- HYBRID (VS) & POSIT (posing)

Virtual Screening with OpenEye
- QUACPAC: tautomers, charges
- FILTER: remove undesirables
- OMEGA: conformations
- FRED/HYBRID: posing and SBVS
- ROCS: shape alignment & scoring

OMEGA: Would you like to?
- Generate high quality conformer ensembles rapidly.
- Store large ensembles in very compact databases for rapid searching.
- Calculate useful conformer energetics in a variety of environments.

ROCS: Would you like to?
- Efficiently align molecules by shape and chemical features.
- Rapidly screen large databases for non-obvious actives.
- Obtain informative overlays between active and untested compounds.

The ROCS GUI: vrocs
- Generate custom queries

FRED: Would you like to?
- Perform structure-based VS rapidly.
- Identify binding mode(s) of molecules in an active site.
- Utilize more information to achieve better results: HYBRID

POSIT: Would you like to?
- Produce good quality predictions of ligand poses with very high frequency.
- Accurately estimate the probability that a predicted pose is accurate.
- Automatically determine the best protein structure from a set to pose a molecule against.

OMEGA: conformation generation
- Knowledge-based conformation generation
- Virtual screening
- Crystal structure reproduction
- Ensemble properties

OMEGA: The best validated conformer generator
- Carefully selected crystallographic structures: PDB and CSD
- Multiple measures of success: closeness and coverage
- Rigorous statistical analysis
- DOI: 10.1021/ci100031x

OMEGA: The process
- Input molecule (1D, 2D, 3D)
- Find fragments, using the 3D fragment library (built-in or custom)
- Assemble fragments -> 3D structure
- Torsion driving, using the torsion library (built-in & extensible)
- Complete conformer ensemble
- Pruned conformer ensemble
- Knowledge base: the fragment and torsion libraries

The file size problem
- SD/MOL2 files are too large to store large numbers of molecules or conformers
- OpenEye binary (OEB) is much smaller: 10x or more
- Can we do better? How is this done?
[Figure: file size (MB, axis 0-14000) for 22 million conformers by file format: MOL2, SDF, OEB, ROC-OEB.]

Rotor-offset compression (ROC)
- Store one set of coordinates; all other conformers are defined by torsion angles.
- Speeds up downstream tools by 10-15%.
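
The storage saving implied by rotor-offset compression can be estimated by counting the numbers that have to be stored. The sketch below assumes a simplified model: one full set of Cartesian coordinates, plus one torsion value per rotatable bond for every further conformer. The actual OEB encoding is not described in these slides, and the function names and the example molecule size are hypothetical.

```python
def cartesian_values(n_atoms, n_confs):
    """Numbers stored when every conformer keeps full x, y, z coordinates."""
    return 3 * n_atoms * n_confs


def rotor_offset_values(n_atoms, n_rotors, n_confs):
    """Numbers stored under the simplified rotor-offset model.

    Assumption: one reference conformer is stored in full and each
    remaining conformer is stored as one torsion angle per rotor.
    """
    return 3 * n_atoms + n_rotors * (n_confs - 1)


# Hypothetical example: 30 atoms, 6 rotatable bonds, 200 conformers.
full = cartesian_values(30, 200)
compact = rotor_offset_values(30, 6, 200)
print(full, compact, round(full / compact, 1))
```

Under this model the saving grows with the number of conformers per molecule, because each extra conformer costs only as many values as there are rotors.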

OMEGA: accuracy on a carefully chosen dataset
- Mean RMSD: 0.67 Å (0.655, 0.688)
- Median RMSD: 0.53 Å
[Figure: histogram of RMSD (0-2.5 Å) against count (0-200).]
J. Chem. Inf. Model., 50, 572 (2010).
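
The RMSD quoted here measures how closely a generated conformer reproduces the crystallographic coordinates. A minimal sketch follows, assuming both conformers list the same atoms in the same order and are already superimposed. It does no alignment or symmetry handling, which a real evaluation needs, and the function names are illustrative rather than part of any OpenEye toolkit.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def best_rmsd(ensemble, reference):
    """Lowest RMSD of any conformer in the ensemble to the reference."""
    return min(rmsd(conf, reference) for conf in ensemble)
```

The "closeness" of an ensemble to a crystal structure is then the value returned by `best_rmsd`, which is the kind of per-molecule number a histogram like the one on this slide would summarise.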

OMEGA: relative accuracy
[Figure: bar chart (vertical axis 0-100) at RMSD thresholds <0.5, <1, <1.5 and <2 for MOE/Import, Catalyst/BEST, ConfGen/CompMin, ConfGen/CombMin and OMEGA2.]
Watts et al., J. Chem. Inf. Model., 50, 534 (2010).

OMEGA: speed
[Figure: NumConfs and Time(s) (axis 0-150) for MOE/Import, Catalyst/BEST, ConfGen/CompMin, ConfGen/CombMin and OMEGA2.]
- Average OMEGA time = 2.7 secs/molecule
J. Chem. Inf. Model., 50, 822 (2010).

OMEGA Summary
- Speed: 0.5-2 molecules/sec; fastest of all commercial applications
- Quality: excellent reproduction of X-ray poses; best overall at highly precise reproduction (< 0.5 Å)
- Flexibility in generation of conformers: focus/diversity of conformer sets can easily be controlled; in vacuo, solution, protein-bound

The Shape of Ligand-based Design: ROCS
- ROCS compares molecules by shape & chemistry
- Rigid overlay of query conformer(s) with a set of conformers of database molecules
- Scoring by shape similarity and chemical (color) similarity (in 3D)

ROCS: Shape overlay and scoring
- Effective virtual screening
- Identify shared features
- Molecular alignment
- Lead-hopping
[Figure: ROCS enrichment curve, per cent actives (0-100) against per cent screened (0-35).]

Shape similarity & graph similarity are not the same
- CDK2 inhibitors: 10 nM and 10-32 nM
- ROCS (shape) similarity = 0.90
- Fingerprint (2D) similarity = 0.40

ROCS: Overlays + Scores
- Shape Tanimoto = 0.90
- Color Tanimoto = 0.17
- TanimotoCombo = 1.07
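
The scores on this slide add up: TanimotoCombo is the sum of the shape and color Tanimoto values, so it ranges from 0 to 2. The sketch below shows that arithmetic together with the usual Tanimoto form for overlap quantities, T = O_AB / (O_AA + O_BB - O_AB). The function names and the overlap numbers in the usage note are hypothetical and are not taken from the ROCS API.

```python
def tanimoto(self_overlap_a, self_overlap_b, overlap_ab):
    """Tanimoto coefficient from self-overlaps and the mutual overlap."""
    denominator = self_overlap_a + self_overlap_b - overlap_ab
    if denominator <= 0:
        raise ValueError("overlaps must give a positive denominator")
    return overlap_ab / denominator


def tanimoto_combo(shape_tanimoto, color_tanimoto):
    """Sum of shape and color Tanimoto; ranges from 0 to 2."""
    return shape_tanimoto + color_tanimoto


# Values from the slide: 0.90 (shape) + 0.17 (color).
print(round(tanimoto_combo(0.90, 0.17), 2))
```

For example, two molecules with hypothetical self-overlaps of 100 and 80 and a mutual overlap of 60 would have a Tanimoto of 60 / (100 + 80 - 60) = 0.5.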

VS Comparison from Merck
- Virtual screening on 11 targets: CA, CDK2, COX-2, DHFR, ER, HIV-PR, HIV-RT, NA, PTP-1B, thrombin, TS
- Structure-based and ligand-based compared: ROCS and docking
- Same X-ray structures; ligand as query
- McGaughey et al., J. Chem. Inf. Model., 2007, 47, 1504.

ROCS is better than docking: VS by Merck
[Figure: enrichment at 1%, E (1%), axis 0-30; mean, standard deviation and median for GLIDE and ROCS.]
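
Enrichment at 1% compares the hit rate in the top 1% of a ranked database with the hit rate of the whole database. The sketch below uses the common definition EF = (actives in top fraction / size of top fraction) / (total actives / database size). The exact formula used in the Merck study may differ, and the function name and example ranking are hypothetical.

```python
def enrichment_factor(ranked_is_active, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked list.

    ranked_is_active: booleans ordered from best score to worst,
    True where the molecule is a known active.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * fraction))
    hits_in_top = sum(ranked_is_active[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Hypothetical ranking: 1000 molecules, 10 actives, 5 of them in the top 10.
ranking = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 985
print(enrichment_factor(ranking, 0.01))
```

An enrichment of 1 means the method does no better than random selection. The maximum possible value at 1% is 100, reached only when the top 1% consists entirely of actives.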

Conclusion
- Extensive Merck study shows that ROCS is the best overall VS tool available: fast; reliable; high hit rate; diverse hit structures
- Merck no longer uses docking for VS

VS against GPCRs
- Evers et al., J. Med. Chem., 2005, 48, 5448.
- Targets: 5HT2A, A1A, D2, M1
- Various 3D techniques
- Docking to homology models: Gold, FlexX-Pharm
- Ligand-based methods: Catalyst, FlexS
- Compare to ROCS

Mean of Results
[Figure: enrichment (0-20) at 1%, 5% and 10% screened for GOLD, FlexX-Pharm, Catalyst, FlexS, ROCS and 2D_MACCS.]
- FlexS, ROCS: 1 query molecule, 1 computed conformation
- Catalyst: 15-20 query molecules -> 1 pharmacophore

ROCS Summary
- Powerful VS application: frequently outperforms docking
- Success does NOT require a bioactive conformation for the query
- Only low database conformational sampling required: 25-50 confs/molecule
- Fast: up to 40 molecules/second; 1000-2000 conformers/second

The ROCS GUI: vrocs
- Generate queries from molecules
- Customized queries
- Multi-molecule queries

Why vrocs? Enhanced Virtual Screening
[Diagram labels: Active Compound(s); vrocs (Query Creation, Query Editing, Query Validation); ROCS.]

How does FRED work?
- Build/customize receptor model: GUI fred_receptor
- Input conformer database: optimized with OMEGA
- Exhaustive posing
- Structure-based & ligand-based scoring
- Consensus pose selection

Global Exhaustive Search
- Systematic rotations
- Systematic translations
- Filtering of clashing poses
- Poses for scoring
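
The idea of an exhaustive rigid-body search can be illustrated with a toy enumeration. The sketch below is a simplified stand-in, not FRED's algorithm: it rotates a rigid ligand only about the z axis, applies a caller-supplied list of translations, and discards any pose with a ligand atom closer than a cutoff to a protein atom. All names, the 2.0 Å cutoff and the single-axis rotation are assumptions made for illustration.

```python
import math
from itertools import product


def rotate_z(coords, angle):
    """Rotate coordinates about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def translate(coords, shift):
    """Shift coordinates by a (dx, dy, dz) vector."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]


def clashes(ligand, protein, min_dist=2.0):
    """True if any ligand atom lies within min_dist of a protein atom."""
    return any(math.dist(a, b) < min_dist for a in ligand for b in protein)


def exhaustive_poses(ligand, protein, angles, shifts, min_dist=2.0):
    """Enumerate every rotation/translation pair and keep clash-free poses."""
    kept = []
    for angle, shift in product(angles, shifts):
        pose = translate(rotate_z(ligand, angle), shift)
        if not clashes(pose, protein, min_dist):
            kept.append((angle, shift, pose))
    return kept
```

The surviving poses are what a scoring stage would then rank. A real search samples full 3D rotations at a fixed angular and translational resolution inside the receptor site.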

Scoring
- Operates on the best poses from the Exhaustive Search
- Protein-based scoring: PLP, ChemScore, ScreenScore, ChemGauss3, ShapeGauss, PB (electrostatic interactions)
- Ligand-based scoring: CGO
- Consensus

FRED: Self Docking Results
[Figure: fraction (0-1) against RMSD (0-2 Å) for the top scoring pose, top 5, top 10 and top 20.]
This is a completely irrelevant problem.

Cross-docking is difficult
[Figure: average self-docking success compared with average cross-docking success.]
J. Chem. Inf. Model., 50, 1432 (2010).

VS Comparison from Merck
- Virtual screening on 11 targets: CA, CDK2, COX-2, DHFR, ER, HIV-PR, HIV-RT, NA, PTP-1B, thrombin, TS
- Structure-based and ligand-based compared
- McGaughey et al., J. Chem. Inf. Model., 2007, 47, 1504.

FRED = GLIDE for VS
[Figure: enrichment at 1% (axis 0-20); mean, standard deviation and median for FRED and GLIDE.]
- FRED: lower standard deviation, higher consistency
- Best indicator of future performance

FRED - Summary
- Does well at posing
- Cross-docking is very difficult
- Virtual screening performance is good: reliable
- Can we do better?

Is a co-crystal structure available?
- Yes: use docking
- No: ligand-based (2D & 3D); docking to apo structure is risky
- Best answer: use BOTH

Hybrid Docking: Using what you know
- Hybrid docking combines docking (e.g. FRED) with ligand-based design (e.g. ROCS)
- Bound ligand structure guides docking

FRED Hybrid vs. Standard Docking

Hybrid docking: speed
Docking time per compound (one CPU, 2.4 GHz Xeon)
- FRED 2.2 standard docking: 5 sec
- HYBRID: 1 sec

Hybrid docking: Posing
[Figure: success rate at 2 Å (0-100%) for HYBRID and FRED, considering the best pose, top 5, top 10 and top 20.]
- HYBRID: 86% @ 2 Å
- FRED: 70% @ 2 Å

Virtual Screening
- Cross et al., J. Chem. Inf. Model., 49, 1455 (2009); McGann, ibid. 51, 578 (2011).
- Results from DUD dataset
- Thick bars are the 95% confidence interval for the true average AUC
- Whiskers are the 95% confidence interval for the result of a single trial
- Performance of docking tools is very variable.
- Is it possible to show statistically meaningful differences between tools?

Virtual Screening Comparison
- Probability that the mean performance of HYBRID is better than FRED: 93%
- Probability that HYBRID will do better than FRED on one system: 62%
- Use more information -> better results

FRED Summary
- Efficient virtual screening: 3-5 sec/molecule
- Good pose prediction: 70% < 2 Å RMSD
- Variety of scoring: unique ligand-based
- Using more information gives better results: hybrid docking

Why structure-based design?
- Pose prediction: POSIT
- Virtual screening: FRED/HYBRID
- Binding affinity prediction

Structure-based design with OpenEye
- POSIT: ligand-based posing
- SZMAP: solvent mapping
- FRED/HYBRID: posing and SBVS
- SZYBKI: MMFF94 optimisation
- BROOD: fragment replacement
- ROCS: shape alignment & scoring
- EON: electrostatic similarity

POSIT: Accurate and reliable analogue posing
- Flexibly fit a new molecule to the shape of a known ligand
- > 90% at 0-0.5 Å RMSD
[Figure: histogram of count (0-60) against RMSD.]

Cross-docking pose prediction
J. Chem. Inf. Model., 50, 1432 (2010).
[Figure: average self-docking success compared with average cross-docking success.]
- How can we improve?
- Predict reliability?
- Identify likely failure cases?

POSIT: analogue posing
- CDK2 inhibitors
- Shape analogues are not obvious graph analogues, BUT obvious graph analogues ARE shape analogues.
- Shape Tanimoto = 0.903
- Fingerprint Tanimoto = 0.45

Molecular Similarity in 3D: How POSIT defines an analogue
- Shape Tanimoto = 0.90
- Color Tanimoto = 0.17
- TanimotoCombo = 1.07

How to use what you know: POSIT
- Inputs: X-ray structure of known ligand; new molecule
- Pose the new molecule using the known ligand.
- Score AND estimate quality with TanimotoCombo.

What POSIT knows
- Ligand information: symbol $E_{\mathrm{Overlay}}$; potential: TanimotoCombo; goal: match shape and chemistry of bound ligand
- Protein structure: symbol $E_{\mathrm{MMFF}}$; potential: Merck Molecular Force Field (MMFF94); goal: maximize interaction with the protein
- $E_{\mathrm{POSIT}} = (1 - \lambda)\,E_{\mathrm{Overlay}} + \lambda\,E_{\mathrm{MMFF}}$, where $\lambda$ is a scaling factor
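
The combined objective can be written as a one-line weighted sum. The sketch below reads the slide's formula as E_POSIT = (1 - λ) E_Overlay + λ E_MMFF. The minus sign is an assumption, since it appears to have been lost in the transcript, and the slides give neither the sign convention nor the units of the two terms. The function name and example values are hypothetical.

```python
def posit_energy(e_overlay, e_mmff, lam):
    """Weighted sum of a ligand-overlay term and a force-field term.

    lam is the scaling factor: 0 uses only the overlay term,
    1 uses only the MMFF term (assumed reading of the slide).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam is assumed to lie between 0 and 1")
    return (1.0 - lam) * e_overlay + lam * e_mmff


# Hypothetical values, chosen only to show the weighting.
print(posit_energy(-1.0, -3.0, 0.25))
```

With λ = 0 the pose is driven purely by similarity to the bound ligand, and with λ = 1 purely by the protein interaction term. Intermediate values trade the two off.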

Cross-Docking: the real problem
[Figure: per cent of poses < 2.0 Å (0-100) against TanimotoCombo of X-ray v. fit ligands (0-1.9) for Posit*, Gold_PLP, Glide_PLP, AD4 and Fred.]
Tuccinardi et al., J. Chem. Inf. Model., 2010, 50, 1432-1440.
* Currently validating that the same data set was used

Prospective Results
- POSIT predicted crystal structures versus X-ray
- Active projects at Abbott Labs
- 17/20 (85%): TC > 1.4, RMSD <= 2 Å
[Figure axis: TanimotoCombo of pose to crystallographic ligand.]

POSIT Summary
- POSIT gives a pose, a score and a CONFIDENCE
- POSIT knows when to fail: pose quality can be accurately predicted in POSIT; docking scores cannot predict quality
- POSIT works PROSPECTIVELY
- Fast: 10-20 sec/molecule
- Reliable: pose + confidence
- Accurate: 98% poses < 2 Å; 90% < 0.5 Å

POSIT

The PhilOEsophy There are two fundamental molecular descriptors

> omega2 -in my_filtered_molecules.ism -out my_dbase.oeb.gz
> rocs -query my_query.oeb -dbase my_dbase.oeb.gz -besthits 500 -prefix my_rocs_results(.sdf)
(virtual screening)
> fred -rec my_receptor.oeb.gz -dbase my_dbase.oeb.gz -prefix my_fred_results -oformat sdf.gz -num_alt_poses 4
(hybrid)
> fred -rec my_receptor.oeb.gz -dbase my_dbase.oeb.gz -prefix my_hybrid_results -oformat sdf.gz -exhaustive_scoring cgo -opt Chemgauss3 -chemgauss3 true -num_alt_poses 4