The reuse of structural data for fragment binding site prediction

Similar documents
User Guide for LeDock

Fragment Hotspot Maps: A CSD-derived Method for Hotspot identification

Using Phase for Pharmacophore Modelling. 5th European Life Science Bootcamp March, 2017

Towards Physics-based Models for ADME/Tox. Tyler Day

Computational chemical biology to address non-traditional drug targets. John Karanicolas

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs

Using AutoDock for Virtual Screening

Ligand Scout Tutorials

Pose and affinity prediction by ICM in D3R GC3. Max Totrov Molsoft

Introduction. OntoChem

Detection of Protein Binding Sites II

The PhilOEsophy. There are only two fundamental molecular descriptors

Introduction to FBDD Fragment screening methods and library design

György M. Keserű H2020 FRAGNET Network Hungarian Academy of Sciences

Dr. Sander B. Nabuurs. Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre

Machine Learning Concepts in Chemoinformatics

Virtual Screening: How Are We Doing?

Chemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller

Retrieving hits through in silico screening and expert assessment M. N. Drwal a,b and R. Griffith a

Docking. GBCB 5874: Problem Solving in GBCB

Functional Group Fingerprints CNS Chemistry Wilmington, USA

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics.

Studying the effect of noise on Laplacian-modified Bayesian Analysis and Tanimoto Similarity

Binding Response: A Descriptor for Selecting Ligand Binding Site on Protein Surfaces

Improving protein-ligand binding site prediction accuracy by classification of inner pocket points using local features

Performing a Pharmacophore Search using CSD-CrossMiner

Machine-learning scoring functions for docking

5.1. Hardwares, Softwares and Web server used in Molecular modeling

Bridging the Dimensions:

Biologically Relevant Molecular Comparisons. Mark Mackey

Identifying Interaction Hot Spots with SuperStar

Data Quality Issues That Can Impact Drug Discovery

Comparative Analysis of Pharmacophore Screening Tools

Ultra High Throughput Screening using THINK on the Internet

What is Protein-Ligand Docking?

Structure-Activity Modeling - QSAR. Uwe Koch

proteins Comparison of structure-based and threading-based approaches to protein functional annotation Michal Brylinski, and Jeffrey Skolnick*

Supplementary Material

Performance Evaluation

New approaches to scoring function design for protein-ligand binding affinities. Richard A. Friesner Columbia University

Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner

Molecular Complexity Effects and Fingerprint-Based Similarity Search Strategies

DOCKING TUTORIAL. A. The docking Workflow

Computational Chemistry in Drug Design. Xavier Fradera Barcelona, 17/4/2007

How IJC is Adding Value to a Molecular Design Business

ICM-Chemist-Pro How-To Guide. Version 3.6-1h Last Updated 12/29/2009

Molecular Interactions F14NMI. Lecture 4: worked answers to practice questions

An Integrated Approach to in-silico

Overview of IslandPick pipeline and the generation of GI datasets

MM-GBSA for Calculating Binding Affinity A rank-ordering study for the lead optimization of Fxa and COX-2 inhibitors

Introduction to Chemoinformatics and Drug Discovery

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Dock Ligands from a 2D Molecule Sketch

CSCE555 Bioinformatics. Protein Function Annotation

CS4445 Data Mining and Knowledge Discovery in Databases. B Term 2014 Solutions Exam 2 - December 15, 2014

Hit Finding and Optimization Using BLAZE & FORGE

Structure based drug design and LIE models for GPCRs

A.I. in health informatics lecture 2 clinical reasoning & probabilistic inference, I. kevin small & byron wallace

A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery

Cross Discipline Analysis made possible with Data Pipelining. J.R. Tozer SciTegic

CHAPTER 3. Pharmacophore modelling studies:

Structural biology and drug design: An overview

CSD. CSD-Enterprise. Access the CSD and ALL CCDC application software

Introduction to Structure Preparation and Visualization

Similarity Search. Uwe Koch

Development of Pharmacophore Model for Indeno[1,2-b]indoles as Human Protein Kinase CK2 Inhibitors and Database Mining

An Enhancement of Bayesian Inference Network for Ligand-Based Virtual Screening using Features Selection

Integrated Cheminformatics to Guide Drug Discovery

Chemical library design

Bioengineering & Bioinformatics Summer Institute, Dept. Computational Biology, University of Pittsburgh, PGH, PA

Cheminformatics platform for drug discovery application

Virtual screening for drug discovery. Markus Lill Purdue University

Structure-based maximal affinity model predicts small-molecule druggability

Overview. Descriptors. Definition. Descriptors. Overview 2D-QSAR. Number Vector Function. Physicochemical property (log P) Atom

PerspectiVe. Recent Developments in Fragment-Based Drug Discovery. 1. Introduction. 2. Trends and Developments

Receptor Based Drug Design (1)

Expanded Interaction Fingerprint Method for Analyzing Ligand Binding Modes in Docking and Structure-Based Drug Design

KNIME-based scoring functions in Muse 3.0. KNIME User Group Meeting 2013 Fabian Bös

Medicinal Chemistry/ CHEM 458/658 Chapter 4- Computer-Aided Drug Design

Development of a Structure Generator to Explore Target Areas on Chemical Space

Softwares for Molecular Docking. Lokesh P. Tripathi NCBS 17 December 2007

11. IN SILICO DOCKING ANALYSIS

Rationalizing Tight Ligand Binding through Cooperative Interaction Networks

Drug Design 2. Oliver Kohlbacher. Winter 2009/ QSAR Part 4: Selected Chapters

Scoring functions for of protein-ligand docking: New routes towards old goals

A Tiered Screen Protocol for the Discovery of Structurally Diverse HIV Integrase Inhibitors

An optimized energy potential can predict SH2 domainpeptide

Joana Pereira Lamzin Group EMBL Hamburg, Germany. Small molecules How to identify and build them (with ARP/wARP)

Drug Informatics for Chemical Genomics...

Virtual screening in drug discovery

Week 10: Homology Modelling (II) - HHpred

Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets

PL-PatchSurfer: A Novel Molecular Local Surface-Based Method for Exploring Protein-Ligand Interactions

bcl::cheminfo Suite Enables Machine Learning-Based Drug Discovery Using GPUs Edward W. Lowe, Jr. Nils Woetzel May 17, 2012

Mechanistic insight into inhibition of two-component system signaling

Computational Methods and Drug-Likeness. Benjamin Georgi und Philip Groth Pharmakokinetik WS 2003/2004

The Long and Rocky Road from a PDB File to a Protein Ligand Docking Score. Protein Structures: The Starting Point for New Drugs 2

est Drive K20 GPUs! Experience The Acceleration Run Computational Chemistry Codes on Tesla K20 GPU today

How Diverse Are Diversity Assessment Methods? A Comparative Analysis and Benchmarking of Molecular Descriptor Space

Model Accuracy Measures

Transcription:

The reuse of structural data for fragment binding site prediction Richard Hall 1

Motivation many examples of fragments binding in a phenyl shaped pocket or a kinase slot good shape complementarity between the ligand and receptor can we ascertain the shapes of pockets filled by fragments use the wealth of structural data from the Astex database can we generate a descriptor for this kind of pocket 2

The Phenyl Shaped Pocket 3

Astex Diverse Set Fragments 1N1M: Dipeptidyl Peptidase IV 1W1P: Chitinase B 1N2V: TGT 1Q41: GSK-3beta 1LRH: Auxin binding protein 1 1SG0: Quinone Reductase 2 www.ccdc.cam.ac.uk/products/life_sciences/gold/validation/astex_diverse 4

Other Pocket Finding Algorithms Are Available Pérot, S., Sperandio, O., Miteva, M.A., Camproux, A.-C., Villoutreix, B.O., Druggable pockets and binding site centric chemical space: a paradigm shift in drug discovery, Drug Discovery Today (2010), doi:10.1016/j.drudis.2010.05.015 lists 30 algorithms to search for binding pockets we are particularly interested in areas where fragments are likely to bind 5

Ligsite ligsite is a simple algorithm for finding pockets in receptor structures create an excluded volume grid at each point on the grid count number of vectors along which an atom hits the surface each point scores between 0 (very exposed) and 7 (very enclosed) 6

ADS Fragments 7

ADS Ligsite 8

Digsite: Distant Dependant Ligsite digsite evolved from the ligsite algorithm added to the number of vectors (additional 6), reducing rotational variance bin each distance (0-6 Å in 0.5A bins, plus one mop bucket) from ligand atoms to the excluded volume of the receptor generate a fingerprint spectrum smooth spectrum using a 25:50:25 function # mop r 9

Digsite Training Set we selected good binding fragments for 17 Astex targets (including 6 kinase, 5 protease) HAC <= 15 MW <= 300 ClogP <= 3 NDon <= 3 NAcc <= 3 ranked by LE diverse 127 complexes average HAC = 11.7 calculated the spectrum for each atom 1492 spectra 10

Clustering The Spectra 0 200 400 600 800 1000 spectra were clustered in R (Euclidean distance, Wards method) dendrogram was cut to produce 15 clusters 11

Cluster Pruning we averaged the spectra of each cluster average for each target and again for kinases we wanted to find descriptors for tight binding any average cluster spectrum with >2 distances in the mop bucket was ignored rank average cluster spectra by a consensus count of distances in the low distance bins top three clusters used as prototype spectra 7 6 5 prototype spectra # 4 3 2 1 0-1 0 2 4 6 8 distance Spectrum 1 Spectrum 2 Spectrum 3 12

Digsite Grid Generation at each grid point outside the excluded volume: generate a digsite spectrum calculate the minimum average z score to each of the three prototype spectra transform scores to max(2-zscore, 0) expand/smooth resulting grids 13

ADS Fragments 14

ADS Ligsite 15

ADS Digsite 16

A Poor Digsite Score 17

A Confession when we started this project, the idea was to weight scoring functions so as to bias docking towards fragment binding regions... it didn t improve results http://en.wikipedia.org/wiki/hype_cycle 18

Fragment Binding Site Prediction build a digsite or ligsite grid for the entire protein look at how many good scoring points (digsite > 1.2, ligsite >6) overlap with fragment volume(s) (VdW+1Å) do not count any point in the receptor excluded volume (<VdW + 1Å) or distant (>VdW+5Å) from the receptor how well do we perform for pdb and astex structures tn fp fn tp fp tp fn 19

PDB Fragments the astex diverse set are 85 diverse protein ligand targets that are relevant to drug discovery ligands are drug like but only 7/85 are fragment like there are other pdb structures that have good sequence identity with the astex diverse set CLASP database clusters where sequence similarity > 0.7 ligands extracted and stored separately can we find examples of fragment ligands bound to these similar structures not rigorously qc ed as per the original 85 20

PDB Fragment Selection similar fragment binding site within 5A of the centroid of the original ads ligand only organic atoms, >2 carbon atoms rule of three compliant 6 <= HAC <= 15 found 114 fragments 80 unique smiles 27/85 clusters have at least one fragment bound choose one pdb code to represent each cluster conserve original ADS fragments 21

PDB Fragment Analysis use a confusion matrix to quantify success for each complex true +ve false +ve true +ve false +ve 115 1273 104 187 false ve 1074 true ve 388574 false ve 1085 true ve 389660 ligsite digsite compare precision, (true +ve)/(true +ve and false +ve) above example ligsite 0.08, digsite 0.36 (best case, 3KGP) can be misleading when ligsite true+ve >> digsite true+ve compare false +ve ratio for the two methods 22

Precision Explanation 23

PDB Fragment Results both methods find a ligand binding site in 100% of cases digsite is more precise than ligsite average ratio 1.7, median 1.5 ligsite produces far more false positives than digsite average ratio 7.4, median 6.5 24

Astex Fragments 67 reference structures (23 targets, 724 fragments) all fragment ligands aligned to reference structure and used to mark true binding regions HAC <= 15 atoms metrics computed for ligsite and digsite 91% of digsite grids found true positives, 96% ligsite two targets were not amenable to digsite open nature of binding sites 25

Astex Fragment Results digsite is more precise than ligsite average ratio 1.7, median 1.2 ligsite produces far more false positives than digsite average ratio 12.1, median 6.5 26

Conclusions we used our database of crystal structures to derive a descriptor that can be used to highlight fragment binding sites a retrospective analysis of PDB and Astex data has shown that digsite can be used to find fragment binding sites digsite grids are much more focussed than ligsite grids 27

Future Work more extensive pdb evaluation compare digsite to other pocket finding methods eg ASP grids, pocketome method of An et al perform a prospective analysis of in house structures, to ensure we have found all potential binding sites look at island counts, rather than precision/false positives consider a coloured digsite spectra (metals, donors, acceptors, etc) 28

Acknowledgements Fred Ludlow Ilenia Giangreco Marcel Verdonk Astex CCI CCDC 29

Thank you www.astex-therapeutics.com