Joana Pereira Lamzin Group EMBL Hamburg, Germany. Small molecules How to identify and build them (with ARP/wARP)

Similar documents
Automated Protein Model Building with ARP/wARP

Web-based Auto-Rickshaw for validation of the X-ray experiment at the synchrotron beamline

1.b What are current best practices for selecting an initial target ligand atomic model(s) for structure refinement from X-ray diffraction data?!

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Garib N Murshudov MRC-LMB, Cambridge

Generating Small Molecule Conformations from Structural Data

Pipelining Ligands in PHENIX: elbow and REEL

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Full wwpdb NMR Structure Validation Report i

Receptor Based Drug Design (1)

Protein Structure Determination from Pseudocontact Shifts Using ROSETTA

MR model selection, preparation and assessing the solution

Retrieving hits through in silico screening and expert assessment M. N. Drwal a,b and R. Griffith a

Docking. GBCB 5874: Problem Solving in GBCB

Likelihood and SAD phasing in Phaser. R J Read, Department of Haematology Cambridge Institute for Medical Research

Direct Method. Very few protein diffraction data meet the 2nd condition

Analyzing Molecular Conformations Using the Cambridge Structural Database. Jason Cole Cambridge Crystallographic Data Centre

Tools for Cryo-EM Map Fitting. Paul Emsley MRC Laboratory of Molecular Biology

Supporting Information. Synthesis of Aspartame by Thermolysin : An X-ray Structural Study

In silico pharmacology for drug discovery

Dr. Sander B. Nabuurs. Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre

Ligand Scout Tutorials

Supporting Information

Phaser: Experimental phasing

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs

Ab-initio protein structure prediction

Cross Discipline Analysis made possible with Data Pipelining. J.R. Tozer SciTegic

NMR, X-ray Diffraction, Protein Structure, and RasMol

Softwares for Molecular Docking. Lokesh P. Tripathi NCBS 17 December 2007

Experimental phasing in Crank2

Validation of Experimental Crystal Structures

User Guide for LeDock

1. Protein Data Bank (PDB) 1. Protein Data Bank (PDB)

Manipulating Ligands Using Coot. Paul Emsley May 2013

CMPS 3110: Bioinformatics. Tertiary Structure Prediction

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

X-ray Crystallography

Full wwpdb X-ray Structure Validation Report i

Computational chemical biology to address non-traditional drug targets. John Karanicolas

Dictionary of ligands

Structural biology and drug design: An overview

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction

Machine-learning scoring functions for docking

GC and CELPP: Workflows and Insights

Portal. User Guide Version 1.0. Contributors

Structural Bioinformatics (C3210) Molecular Docking

DOCKING TUTORIAL. A. The docking Workflow

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it?

The PhilOEsophy. There are only two fundamental molecular descriptors

Full wwpdb X-ray Structure Validation Report i

Full wwpdb X-ray Structure Validation Report i

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics

Chemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller

CSD. CSD-Enterprise. Access the CSD and ALL CCDC application software

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Nitrile Groups as Hydrogen-Bond Acceptors in a Donor-Rich Hydrogen-Bonding Network. Supplementary Information

Full wwpdb X-ray Structure Validation Report i

Introduction to" Protein Structure

PAN-modular Structure of Parasite Sarcocystis muris Microneme Protein SML-2 at 1.95 Å Resolution and the Complex with 1-Thio-β-D-Galactose

Ranking of HIV-protease inhibitors using AutoDock

ID14-EH3. Adam Round

Preparing a PDB File

MM-GBSA for Calculating Binding Affinity A rank-ordering study for the lead optimization of Fxa and COX-2 inhibitors

CSD. Unlock value from crystal structure information in the CSD

Full wwpdb X-ray Structure Validation Report i

wwpdb X-ray Structure Validation Summary Report

Fragment Hotspot Maps: A CSD-derived Method for Hotspot identification

Full wwpdb X-ray Structure Validation Report i

Full wwpdb X-ray Structure Validation Report i

Using Phase for Pharmacophore Modelling. 5th European Life Science Bootcamp March, 2017

Week 10: Homology Modelling (II) - HHpred

Molecular Biology Course 2006 Protein Crystallography Part II

Full wwpdb X-ray Structure Validation Report i

Structure to Function. Molecular Bioinformatics, X3, 2006

Progress of Compound Library Design Using In-silico Approach for Collaborative Drug Discovery

Introduction to Structure Preparation and Visualization

Bioengineering & Bioinformatics Summer Institute, Dept. Computational Biology, University of Pittsburgh, PGH, PA

Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner

Schrodinger ebootcamp #3, Summer EXPLORING METHODS FOR CONFORMER SEARCHING Jas Bhachoo, Senior Applications Scientist

Full wwpdb X-ray Structure Validation Report i

Experimental phasing in Crank2

Improving Protein Function Prediction with Molecular Dynamics Simulations. Dariya Glazer Russ Altman

Modelling Macromolecules with Coot

Protein Structures. 11/19/2002 Lecture 24 1

CCP4 Diamond 2014 SHELXC/D/E. Andrea Thorn

Kd = koff/kon = [R][L]/[RL]

Prediction and refinement of NMR structures from sparse experimental data

Life Science Webinar Series

Assignment 2 Atomic-Level Molecular Modeling

Molecular Interactions F14NMI. Lecture 4: worked answers to practice questions

Protein surface descriptors for binding sites comparison and ligand prediction

Molecular Modeling lecture 2

Performing a Pharmacophore Search using CSD-CrossMiner

Pose and affinity prediction by ICM in D3R GC3. Max Totrov Molsoft

5.1. Hardwares, Softwares and Web server used in Molecular modeling

Cks1 CDK1 CDK1 CDK1 CKS1. are ice- lobe. conserved. conserved

Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland

Orientational degeneracy in the presence of one alignment tensor.

Machine Learning Concepts in Chemoinformatics

Detection of Protein Binding Sites II

Transcription:

Joana Pereira Lamzin Group EMBL Hamburg, Germany Small molecules How to identify and build them (with ARP/wARP)

The task at hand To find ligand density and build it!

Fitting a ligand We have: electron density map (good resolution) + protein model shows excess density not accounted for by protein difference density map to fit the appropriate ligand into

Fitting a ligand Challenges: different resolutions and data quality different ligand complexity / topology partial disorder of a ligand different ligands at the same site?

The protocol of a modelling task What we know at the beginning will determine the protocol: Binding site Known Unknown 1 density cluster 1 ligand Ligand identity Known Unknown X

The protocol of a modelling task What we know at the beginning will determine the protocol: Binding site N density clusters Known Unknown 1 ligand Ligand identity Known Unknown X

The protocol of a modelling task What we know at the beginning will determine the protocol: N ligand candidates Binding site Known Unknown 1 density cluster Ligand identity Known Unknown X

The protocol of a modelling task What we know at the beginning will determine the protocol: N density clusters N ligand candidates Binding site Known Unknown Ligand identity Known Unknown X

Researcher s needs #1 I am trying to solve a structure of a protein with some inhibitor. I want to know how I can put in my inhibitor in the density map of the data I got. I can see some density in the active site where the inhibitor should be I am not sure of how to do it. Which case is this? 1 density cluster 1 ligand

Ligand fitting with ARP/wARP Organised as a pipeline of core modules for specific sub-tasks following an intuitive approach Prepare Identify ligand binding site and /or ligand Sparse density map and generate ligand topology Construct Construct ensemble of ligand models in plausible conformation to fit sparsed density Refine Refine ligand coordinates to satisfy geometric constraints and maximise fit to density Choose best model

Constructing the ligand To address the different ligand sizes and complexities encountered to increase the robustness of the software We use 2 separate construction methods: Label swapping on the sparsed grid Metropolis search in conformation space

Constructing the ligand: graph search Uses the sparse grid representation of the electron density at the chosen binding site & topology of known ligand Knowledge about ligand Connectivity Distances Angles Chirality Knowledge about grid Connectivity Distances Density NO IDENTITIES 4 2 3 1 8 7 5 6 10 9 25 33 42 44 50 70 62 51 34 46 24 43 23 27 16 19 45 61 26 28 35 36 37 59 58 60 69 79 32 40 49 56 55 57 73 85 86 87 84 74 81 67 72 68 78 66 76 47 41 48 64 65 77 13 18 21 39 22 14 54 12 31 53 71 80 15 17 20 63 75 11 38 52 30 29 90 89 91 92 88 82 83

Constructing the ligand: graph search Label swapping Ligand is expanded on the sparse density, preserving connectivities Every dummy atom of the sparse grid is tried as a start point An exhaustive graph search is performed Models are scored by their fit to density and expected stereochemical features 4 2 3 1 8 7 5 6 10 9 Start Atom 25 33 42 44 50 70 62 51 34 46 24 43 23 27 16 19 45 61 26 28 35 36 37 59 58 60 69 79 32 40 49 56 55 57 73 85 86 87 90 89 91 92 88 84 74 81 67 72 68 78 66 76 47 41 48 64 65 77 13 18 21 39 22 14 54 12 11 15 17 20 31 38 53 52 63 71 75 80 82 83 30 29 Tim Joana Wiegels Pereira 08.11.2014

37 Constructing the ligand: graph search Index of starting grid point Dead branches No of ligand atoms placed 4 2 3 1 8 7 5 6 10 9 Start Atom 16 25 24 33 23 19 34 27 42 44 26 32 46 43 28 50 35 40 45 36 51 49 70 62 61 59 58 60 56 55 57 69 73 79 85 86 87 84 90 89 88 91 92 Surviving start points 74 81 67 72 68 78 66 76 47 41 48 64 65 77 13 18 21 39 22 14 54 12 11 15 17 20 31 38 53 52 63 71 75 80 82 83 30 29

Constructing the ligand: metropolis search Perform a random walk in parameter space biased towards the optimum of a score function Parameter space: position, orientation and conformation Score function: pseudo map correlation Advantage: less degrees of freedom Rigid groups Rotatable bonds

Constructing the ligand: metropolis search Evolution of a model during a metropolis optimisation A B C D FMN to 1jqv at 2.1Å: 0.23Å rmsd. Final result: 9000 moves + refinement

Researcher s needs #2 Is there a simple way to determine whether ligand is bound or not by comparing the diffraction patterns between ligand- free (structure known) and ligand- soaked protein crystals? I would like to solve the ligand bound protein structure, but before I do so, I have to find out if the ligand is actually bound and if so, where. Thank you very much! Which case is this? N density clusters 1 ligand

Automatic binding site identification The difference density map from low to high contour threshold

Automatic binding site identification Capturing the dependence of the difference density map on changing the contour thresholds: the fragmentation tree 10 5 10 4 Deposited Selection Ligand Cluster volume [Å 3 ] 10 3 10 2 10 1 10 0 10-1 1 2 3 4 5 6 Contour threshold [σ]

Shape features to measure similarity An electron density map calculated from the ligand to be fit is compared to each potential density cluster: Find the matching cluster for this ligand! 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1) Surface to volume ratio 2) Bounding box limits 3) Moments of inertia 4) Rotation match score 5) Eigenvalues 6) Distance histogram 7) Geodesic distance histogram Decision is based on feature vectors

Researcher s needs #3 We are working on protein/inhibitor complex structure. However, we did find a strange density at the active site, it looks really like GSH, the natural co- enzyme of this protein. We tried to use very simple solution to get crystal then exclude the possibility of buffer molecules, but that density is always there. I want to identify this ligand. Are there any methods to do this work? Which case is this? 1 density cluster N ligand candidates

It s all in the shape, baby!

Shape features to measure similarity To assign a likelihood to a ligand knowing the binding site Final ligand ranking

High throughput ligand identification Segmentation Projection/ normalisation Density map (or otherwise surface of protein pocket) Calculation of shape descriptors Ranking Comparison Ligand Database 84 of the most common ligands in crystallography C. Carolan & V. Lamzin, Acta D (July 2014)

Ligand identification with ARP/wARP Flexibility - Ligand Conformations! full ligand flexibility is considered Currently, a database of 84 of the most common ligands in crystallography is screened. Beware of multiple ligands, possibly covalently bound, and coordinated solvent among other things!

Researcher s needs #4 I have a ligand in the binding pocket and mediocre data. The core of the ligand is well defined, however there are chains/tails of the ligand which are not.the question here is how the chains/tails should be modelled (if at all) Which case is this? Tim Joana Wiegels Pereira 08.11.2014

Partial Disorder, Partially Occupied Ligands Deposited in PDB Built automatically with ARP/wARP when the full ligand is given 27 Artificial Cocktail SAM in 1v2x at 1.5Å Compound chosen in cocktail case Tim Joana Wiegels Pereira 08.11.2014

Partial ligand building in ArpNavigator Complete partial ligand building in under 2 minutes!!! TimJoana Wiegels Pereira 09.04.2014 08.11.2014

Success rate Tested on > 20k PDB entries from EDS X-ray resolution limits: 1.0 to 3.0 Å Building the largest fully occupied ligand Correctness criterion: r.m.s.d. < 1.0 Å For ARP/wARP 7.2 5 6 atoms, any map corr. 7 100 atoms, any map corr. 10 40 atoms, map corr. > 80% 20 40 atoms, map corr. > 80%

Success criteria: r.m.s.d < 1A Correct Correct rmsd=0.4å rmsd=1.0å Incorrect Incorrect rmsd=1.8å In yellow: the deposited ligand rmsd=1.5å

Ligand building methods Sparse grids Conformational fit Fine skeletons High-throughput ligand identification using shape descriptors Building models which do NOT have repetitive motifs like peptides or nucleotides TimJoana Wiegels Pereira 09.04.2014 08.11.2014

The people - The power! Collaborators EMBL Hamburg: Gleb Bourenkov, Santosh Panjikar MRC Laboratory, Cambridge : Garib Murshudov s group Rutherford Appleton Laboratory: CCP4 team Previous Members Ciaran Carolan, Tim Wiegels Funding