Pose and affinity prediction by ICM in D3R GC3. Max Totrov Molsoft

Pose prediction method: ICM-dock
- Pre-sampling of ligand conformers
- Multiple-trajectory Monte Carlo with gradient minimization in internal coordinates
- Receptor represented by grid potentials
- Multiple receptor conformations used when needed to address flexibility
- Pose re-ranking with the ICM VLS score
- Optional chemical biasing by APF from available experimental ligand structures/templates
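The Monte Carlo plus minimization idea in the list above can be illustrated with a toy sketch. This is not ICM's implementation: the 1D `energy` function, the step sizes, the temperature, and all function names are hypothetical stand-ins for a ligand's internal-coordinate energy on receptor grids.

```python
import math
import random

def energy(x):
    # Hypothetical rugged 1D landscape standing in for a docking energy.
    return 0.1 * x * x + math.sin(3.0 * x)

def minimize(x, step=0.01, iters=200):
    # Crude gradient descent with a numerical derivative
    # (stand-in for gradient minimization in internal coordinates).
    for _ in range(iters):
        grad = (energy(x + 1e-5) - energy(x - 1e-5)) / 2e-5
        x -= step * grad
    return x

def mc_minimize(x0, n_steps=200, kick=1.0, temperature=0.5, seed=0):
    # One trajectory: random move, local minimization, Metropolis test.
    rng = random.Random(seed)
    x = minimize(x0)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_steps):
        trial = minimize(x + rng.uniform(-kick, kick))
        e_trial = energy(trial)
        # Metropolis criterion applied to the minimized energies.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / temperature):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

def multi_trajectory(starts):
    # Several independent trajectories; keep the lowest-energy result.
    runs = (mc_minimize(x0, seed=i) for i, x0 in enumerate(starts))
    return min(runs, key=lambda r: r[1])
```

Calling `multi_trajectory([-4.0, 0.0, 4.0])` returns the lowest-energy `(x, energy)` pair found across the three trajectories.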

Atomic Property Fields (APF)
Totrov M. Atomic Property Fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008;71(1):15-27.

APF: 3D pharmacophoric potential
3D pharmacophore: arrangement of molecular properties in space that confers activity.
Generalization of the point pharmacophore concept:
- Discrete pharmacophoric points → continuous distributions
- Moieties represented as Ph4 types → vectors of atomic properties
Definitions:
- $f_j^i$: vector of properties $i$ for atom $j$
- Atomic similarity measure: dot product of property vectors, $\sum_i f_j^i f_k^i$
- Atomic Property Field (APF), a continuous 3D potential: $P^i(\mathbf{r}) = \sum_j f_j^i \exp\left(-(\mathbf{r}-\mathbf{r}_j)^2 / l_{APF}^2\right)$
- Pseudo-energy (score) of a compound in APF: $E_{APF} = -\sum_j \sum_i f_j^i P^i(\mathbf{r}_j)$
Implementation:
- On a 3D (multi)grid
- Continuous derivatives (spline)
- Fast potential for molecular mechanics/optimization in combination with force-field energy
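A minimal sketch of the APF potential and pseudo-energy defined above, evaluated directly rather than on a splined grid. The atom representation (a position plus a property tuple), the two-component property vectors, and the function names are assumptions made for illustration.

```python
import math

def apf_potential(template_atoms, r, length=1.0):
    """P^i(r) = sum_j f_j^i * exp(-|r - r_j|^2 / l^2), as a list over components i."""
    n_props = len(template_atoms[0][1])
    field = [0.0] * n_props
    for r_j, f_j in template_atoms:
        d2 = sum((a - b) ** 2 for a, b in zip(r, r_j))
        w = math.exp(-d2 / length ** 2)
        for i in range(n_props):
            field[i] += f_j[i] * w
    return field

def apf_pseudo_energy(template_atoms, ligand_atoms, length=1.0):
    """E_APF = -sum_j sum_i f_j^i * P^i(r_j), summed over ligand atoms j."""
    e = 0.0
    for r_j, f_j in ligand_atoms:
        field = apf_potential(template_atoms, r_j, length)
        e -= sum(f * p for f, p in zip(f_j, field))
    return e

# Hypothetical toy template: two atoms, each with a 2-component property vector.
template = [((0.0, 0.0, 0.0), (1.0, 0.0)), ((1.5, 0.0, 0.0), (0.0, 1.0))]
# The same molecule shifted 3 units along x, i.e. poorly superimposed.
shifted = [((x + 3.0, y, z), f) for (x, y, z), f in template]
```

In this toy case a ligand placed exactly on the template scores lower (more favorable) than the shifted copy, which is the behavior a superposition or docking bias relies on.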

Accuracy of Flexible Ligand Superposition
Independent broad benchmark: ligands without X-ray structures but similar chemotype to a solved complex.
Assessment of superposition quality: 2/1/0 = good / acceptable / poor.
11 targets from DUD (out of 40). Values are percentages per quality score.

Method       Score  ADA    CDK2   DHFR   ER     FXA    HIVRT  NA     P38    THR    TK     TRP    mean
                    (39)   (72)   (410)  (39)   (146)  (43)   (49)   (454)  (72)   (22)   (49)   (1100)
--------------------------------------------------------------------------------------------------------
Surflex-sim  2      12.82  12.5   44.39  56.41  4.11   18.6   18.37  9.69   4.17   68.18  40.82  23.15
             1      35.9   51.39  53.66  43.59  33.56  72.09  75.51  69.6   93.06  31.82  59.18  59.07
             0      51.28  36.11  1.95   0      62.33  9.3    6.12   20.7   2.78   0      0      17.78
ROCS         2      12.82  43.06  74.15  41.03  14.38  30.23  79.59  9.47   2.78   86.36  8.16   35.63
             1      20.51  36.11  14.39  56.41  28.77  34.88  14.29  41.19  69.44  9.09   81.63  32.83
             0      66.67  20.83  11.46  2.56   56.85  34.88  6.12   49.34  27.78  4.55   10.2   31.54
FlexS        2      15.38  25     56.1   48.72  35.62  16.28  36.73  14.98  30.56  81.82  18.37  33.48
             1      20.51  19.44  11.71  43.59  13.7   46.51  57.14  74.01  5.56   13.64  2.04   35.77
             0      64.1   55.56  32.2   7.69   50.68  37.21  6.12   11.01  63.89  4.55   79.59  30.75
ICM/APF      2      46.15  12.5   86.83  51.28  70.55  18.6   75.51  20.04  88.89  90.91  69.39  54.48
             1      23.08  68.06  11.95  46.15  16.44  46.51  14.29  68.28  9.72   9.09   28.57  36.49
             0      30.77  19.44  1.22   2.56   13.01  34.88  10.2   11.67  1.39   0      2.04   9.03

Giganti et al. J Chem Inf Model 2010, 50, 992-1004

Ligand-biased docking with APF
- MC docking simulations: APF potentials in addition to physical interaction term grids
- Pose ranking: composite score combining the physics-based ICM VLS score and the APF pseudo-energy
[Figure: visualization of APF used for ligand bias in Cathepsin S docking]
Lam PC, Abagyan R, Totrov M. Ligand-biased ensemble receptor docking (LigBEnD): a hybrid ligand/receptor structure-based approach. J Comput Aided Mol Des. 2018;32(1):187-198.

D3R Cathepsin S: pose prediction
- Average RMSD, top pose: 2.82 Å
- Average RMSD, best pose of 5: 1.31 Å
- Median RMSD, top pose: 1.7 Å
- Median RMSD, best pose of 5: 1.06 Å
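The four summary statistics above can be computed from per-ligand RMSD lists as sketched below. The ligand names and RMSD values are made-up placeholders, not D3R data, and `pose_summary` is a hypothetical helper name; poses are assumed to be listed in ranked order, top pose first.

```python
from statistics import mean, median

def pose_summary(rmsd_by_ligand, n_poses=5):
    """Summarize top-pose and best-of-n RMSD over all ligands."""
    top = [poses[0] for poses in rmsd_by_ligand.values()]
    best = [min(poses[:n_poses]) for poses in rmsd_by_ligand.values()]
    return {
        "mean_top": mean(top),
        "mean_best": mean(best),
        "median_top": median(top),
        "median_best": median(best),
    }

# Placeholder RMSD values in Angstrom, ranked pose order per ligand.
example = {
    "lig_a": [1.0, 0.5, 3.0],
    "lig_b": [4.0, 2.0, 1.5],
    "lig_c": [0.8, 0.9, 2.0],
}
summary = pose_summary(example)
```

A large gap between the top-pose and best-of-5 averages, as in the Cathepsin S results, indicates that a near-native pose was sampled but not ranked first.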

[Figure: CatS ligands, RMSD for top 5 poses. Legend: most accurate pose of 5]

Apparent crystal contact effects
[Figure panel labels: crystal neighbor Cathepsin, ligands, primary receptor Cathepsin]
- Superimposed answer X-rays nmxm (CatS_7), rpwj (CatS_9) and gabj (CatS_14). Extensive contacts between the ligands and the crystallographic neighbor (~150 Å²) are visible.
- Superimposed answer X-ray yrpk (CatS_16) and its top predicted pose. Also shown are top poses for CatS_7, CatS_9 and CatS_14 (thin wires).

Kinases and FXR: flexibility ensembles
Ensemble construction:
- PDB structures collected/aligned via Pocketome database
- Up to 10 representative X-ray structures selected by iterative procedure to maximize number of compatible ligands
- For each receptor conformation, compatible ligands are used as APF templates in docking
[Figure: ligand/receptor conformation compatibility matrix heatmap for VEGFR2; axes: X-ray receptor conformation, X-ray ligand bound]
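One plausible reading of the "iterative procedure" above is a greedy maximum-coverage selection over the compatibility matrix. The slides do not give the actual algorithm, so the sketch below is an assumption; the function name, the dictionary layout, and the receptor and ligand identifiers are hypothetical.

```python
def select_ensemble(compat, max_size=10):
    """Greedily pick receptor conformations covering the most ligands.

    compat: dict mapping receptor id -> set of compatible ligand ids.
    Returns (chosen receptor ids in pick order, set of covered ligands).
    """
    chosen, covered = [], set()
    while len(chosen) < max_size:
        # Pick the conformation that adds the most not-yet-covered ligands.
        best = max(
            (r for r in compat if r not in chosen),
            key=lambda r: len(compat[r] - covered),
            default=None,
        )
        if best is None or not (compat[best] - covered):
            break  # nothing left that improves coverage
        chosen.append(best)
        covered |= compat[best]
    return chosen, covered

# Hypothetical compatibility data.
compat = {"recA": {1, 2, 3}, "recB": {3, 4}, "recC": {1, 2}, "recD": {5}}
ensemble, covered = select_ensemble(compat)
```

Here `recC` is never selected because every ligand it accommodates is already covered by `recA`, which mirrors the goal of keeping the ensemble small.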

Pocketome: comprehensive collection of ligand-binding pockets from PDB
Instant access to all relevant PDB X-ray structures, optimally pre-aligned around the binding pocket.
Kufareva I, Ilatovskiy AV, Abagyan R. Pocketome: an encyclopedia of small-molecule binding sites in 4D. Nucleic Acids Res. 2012;40:D535-40.

FXR (GC2) pose prediction results
- Average RMSD, top pose: 1.95 Å
- Average RMSD, best pose of 5: 1.69 Å
- Median RMSD, top pose: 1.95 Å
- Median RMSD, best pose of 5: 1.95 Å

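Affinity prediction approaches
Docking is used to generate aligned poses.

Receptor/physics-based approach, ICM VLS score:
$$\Delta G = \alpha_1 \Delta E_{FF} + \alpha_2 \Delta E_{GB} + \alpha_3 \Delta E_{HP} + \alpha_4 \Delta E_{HB} + \alpha_5 \Delta E_{PD} + \alpha_6 T\Delta S_{TO}$$

Ligand/APF-based approach:
$$\Delta G \approx \sum_m \sum_i f_m^i P^i_{APF\text{-}QSAR}(\mathbf{r}_m), \qquad P^i_{APF\text{-}QSAR}(\mathbf{r}) = \sum_k w_k^i \sum_{j \in k} f_j^i \exp\left(-(\mathbf{r}-\mathbf{r}_j)^2 / l^2\right)$$
with $7 \times N_{train}$ weights $w_k^i$ for the contributions of each molecule $k$ in the training set into each APF component $i$.

Written per ligand (the subscript $l$ indexes the ligand; $l$ in the exponent is the APF length parameter):
$$\Delta G_l \approx \sum_k \sum_i w_k^i E^{APF,i}_{kl}, \qquad E^{APF,i}_{kl} = \sum_{m \in l} \sum_{j \in k} f_m^i f_j^i \exp\left(-(\mathbf{r}_m-\mathbf{r}_j)^2 / l^2\right)$$

Partial Least Squares (PLS) is used to determine the weights $w_k^i$.
Totrov M. Atomic Property Fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008;71(1):15-27.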
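A minimal sketch of the APF-QSAR descriptors and the resulting linear prediction follows. In the method described above the weights come from a PLS fit; here they are hypothetical constants passed in by hand. The molecule representation and function names are also assumptions.

```python
import math

def apf_overlap(mol_l, mol_k, length=1.0):
    """E_kl^i per component i: sum over atom pairs (m in l, j in k) of
    f_m^i * f_j^i * exp(-|r_m - r_j|^2 / l^2)."""
    n_props = len(mol_l[0][1])
    e = [0.0] * n_props
    for r_m, f_m in mol_l:
        for r_j, f_j in mol_k:
            d2 = sum((a - b) ** 2 for a, b in zip(r_m, r_j))
            g = math.exp(-d2 / length ** 2)
            for i in range(n_props):
                e[i] += f_m[i] * f_j[i] * g
    return e

def predict_dg(mol_l, training_mols, weights, length=1.0):
    """dG_l ~ sum_k sum_i w_k^i * E_kl^i, with weights[k][i] given
    (in the slides these weights are determined by PLS)."""
    total = 0.0
    for mol_k, w_k in zip(training_mols, weights):
        e_kl = apf_overlap(mol_l, mol_k, length)
        total += sum(w * e for w, e in zip(w_k, e_kl))
    return total

# Hypothetical one-atom molecules with 2-component property vectors.
mol_a = [((0.0, 0.0, 0.0), (1.0, 0.0))]
mol_b = [((0.0, 0.0, 0.0), (0.0, 2.0))]
toy_weights = [(-1.5, 0.0), (0.0, -0.5)]  # made-up values, not fitted
```

Each training molecule contributes one descriptor per APF component, so a real model has one column per (molecule, component) pair, which is why a latent-variable method such as PLS is a natural fit.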

Training APF 3D QSAR: Cathepsin S
- 302 related compounds from ChEMBL v2.3
- Docked 3D poses used to build the APF 3D QSAR model
[Figure: visualization of APF fields of the pKd model for Cathepsin S]

Training sets of activity data
Source: ChEMBL v2.3. Varying number and relevance.

Target    N of data points
--------------------------
Cath S    1754
VEGFR2    5733
JAK2      1618
p38a      4183

[Figure: distributions of Tanimoto distances to the closest training set compound for each challenge compound; panels: Cath. S, VEGFR2, JAK2, p38a]
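The "Tanimoto distance to the closest training set compound" can be sketched as below. The slides do not say which fingerprints were used, so representing each fingerprint as a set of on-bit indices is an assumption, as are the function names.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - |A & B| / |A | B| for fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union

def nearest_training_distance(challenge_fp, training_fps):
    """Distance from one challenge compound to its closest training compound."""
    return min(tanimoto_distance(challenge_fp, fp) for fp in training_fps)

# Hypothetical fingerprints.
training = [{1, 2, 3}, {7, 8, 9}]
d = nearest_training_distance({2, 3, 4}, training)
```

Computing this value for every challenge compound and plotting the histogram gives the kind of distribution shown per target; small distances mean the training set contains close analogs.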

Training/testing set generation
LOO cross-validation or simple N-fold random test subsets don't adequately reflect a realistic challenge.
Stringent 3-fold cluster cross-validation:
- Cluster the full training set (APF 3D chemical distance, 0.25 cutoff)
- Randomly assign clusters to three groups
- Use any 2 groups to train, the 3rd group to test (Q²/RMSE)
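The cluster cross-validation steps above can be sketched as follows. The clustering algorithm is not specified in the slides; a simple leader clustering is swapped in here as an assumption, and Q² is computed with the common definition 1 - SS_res/SS_tot on the test set. All function names are hypothetical.

```python
import random

def leader_cluster(items, distance, cutoff=0.25):
    """Assign each item to the first cluster whose leader is within cutoff."""
    clusters = []  # each cluster is a list; its first member is the leader
    for item in items:
        for cluster in clusters:
            if distance(item, cluster[0]) <= cutoff:
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters

def cluster_folds(clusters, n_folds=3, seed=0):
    """Shuffle clusters, deal them into groups, yield (train, test) per fold."""
    rng = random.Random(seed)
    shuffled = clusters[:]
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_folds)]
    for idx, cluster in enumerate(shuffled):
        groups[idx % n_folds].extend(cluster)
    for k in range(n_folds):
        train = [x for g, grp in enumerate(groups) if g != k for x in grp]
        yield train, groups[k]

def q2_rmse(y_true, y_pred):
    """Return (Q2, RMSE) for predictions on a held-out test group."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, (ss_res / n) ** 0.5
```

Because whole clusters are held out, no test compound has a close analog (within the cutoff of its cluster leader) in the training groups, which is what makes the estimate more stringent than random N-fold splits.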

Improving upon APF 3D QSAR
Combining APF and physics-based terms:
- APF produces better models provided sufficient training data. Physics-based terms are typically noisier, but more general.
- Can the two combine for better performance?
- Investigate single and staged models combining chemical and physical terms (PLS and/or RFR).
Dynamic focused models:
- Some evidence that a large training set dilutes local activity trends
- Investigate focused models trained on subsets of data related to challenge molecules

Dynamic/focused model training
1. Dock ligands
2. Cluster by 3D poses in APF
3. For each cluster: find ~300 nearest known ligands as training set
4. Train a model for each cluster
[Figure labels: challenge ligands, ChEMBL ligands]
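Step 3 above, picking the nearest known ligands for a cluster of challenge ligands, might look like the sketch below. How "nearest to a cluster" is defined is not stated in the slides; using the distance to the closest cluster member is an assumption, and the function name is hypothetical.

```python
def focused_training_set(cluster_members, known_ligands, distance, n_nearest=300):
    """Return the n_nearest known ligands, ranked by their distance
    to the closest member of the challenge-ligand cluster."""
    def dist_to_cluster(lig):
        return min(distance(lig, member) for member in cluster_members)
    return sorted(known_ligands, key=dist_to_cluster)[:n_nearest]

# Toy example with numbers standing in for ligands and |a - b| as the distance.
subset = focused_training_set(
    cluster_members=[0.0, 1.0],
    known_ligands=[5.0, 0.2, 0.9, 3.0, -0.5],
    distance=lambda a, b: abs(a - b),
    n_nearest=3,
)
```

In practice `distance` would be the APF 3D chemical distance between docked poses, and one model would then be trained per cluster on its own subset.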

Kinase models cross-validation

Training set     Terms              Method       VEGFR2 Q²  VEGFR2 RMSE  JAK2 Q²  JAK2 RMSE  P38a Q²  P38a RMSE
----------------------------------------------------------------------------------------------------------------
Full/Static      Physics/VLS score  No training  0.12       NA           0.23     NA         0.13     NA
Full/Static      Physics only       PLS          0.13       1.2          0.30     1.1        0.25     1.0
Full/Static      Physics only       RFR          0.12       1.2          0.23     1.2        0.18     1.1
Full/Static      APF only           PLS          0.22       1.4          0.30     1.2        0.29     1.1
Full/Static      Physics + APF      1-stage PLS  0.26       1.3          0.36     1.2        0.29     1.1
Full/Static      Physics + APF      2-stage PLS  0.22       1.4          0.32     1.2        0.30     1.1
Full/Static      Physics + APF      PLS/RFR      0.25       1.2          0.33     1.1        0.33     1.0
Focused/Dynamic  APF only           PLS          0.26       1.2          0.35     1.2        0.32     1.0
Focused/Dynamic  Physics + APF      PLS/RFR      0.28       1.1          0.40     1.1        0.33     1.0

RMSE is shown in pKd units.

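Challenge set performance

Training set     Terms          Method   VEGFR2 R  VEGFR2 RMSE  JAK2 R  JAK2 RMSE  P38a R  P38a RMSE
-----------------------------------------------------------------------------------------------------
Static           APF            PLS      0.54      1.4          0.55    1.2        0.55    1.1
Static           Physics + APF  PLS      0.61      1.2          0.65    1.0        0.56    1.1
Static           Physics + APF  PLS/RFR  0.68      1.0          0.61    1.0        0.63    1.0
Focused/Dynamic  APF            PLS      0.53      1.5          0.53    1.2        0.51    1.3
Focused/Dynamic  Physics + APF  PLS/RFR  0.67      1.0          0.59    1.0        0.56    1.3

R is the correlation coefficient. Annotations on the last row:
- VEGFR2: Q_C3F = 0.53, <d_min> = 0.2
- JAK2: Q_C3F = 0.63, <d_min> = 0.3
- P38a: Q_C3F = 0.57, <d_min> = 0.27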

D3R affinity prediction: ligand ranking, Kendall τ
- Cathepsin S stage 1: τ = 0.45
- VEGFR2: τ = 0.45
- JAK2_SC2: τ = 0.47
- p38a: τ = 0.41
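For reference, Kendall τ measures how well predicted values rank the ligands relative to experiment. The sketch below implements the simple tau-a variant; which variant the D3R evaluation used is not stated in the slides, so treat this as an illustration only.

```python
def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs.
    Tied pairs count as neither concordant nor discordant."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical predicted vs. experimental values: one swapped pair out of six.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```

With four ligands and one swapped pair, five of six pairs are concordant and one is discordant, giving τ = 4/6.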

Conclusions
- Ligand-biased docking (ICM-dock + APF) consistently produces good pose accuracy
- Atomic Property Field-based 3D QSAR activity models outperform models based on physical terms
- Cluster cross-validation is adequate to assess model quality
- Using dynamic/focused training sets did not result in consistently better predictions
- Composite models, and in particular PLS(APF)/RFR(Phys), are consistently most predictive

Acknowledgments
- Polo Lam
- Eugene Rausch
- Ruben Abagyan
- D3R organizers