Using Bayesian Statistics to Predict Water Affinity and Behavior in Protein Binding Sites. J. Andrew Surface

Similar documents
Chemical properties that affect binding of enzyme-inhibiting drugs to enzymes

Chemical properties that affect binding of enzyme-inhibiting drugs to enzymes

Protein-Ligand Interactions* and Energy Evaluation Methods

Dihedral Angles. Homayoun Valafar. Department of Computer Science and Engineering, USC 02/03/10 CSCE 769

Introduction to Chemoinformatics and Drug Discovery

Using AutoDock for Virtual Screening

Other Cells. Hormones. Viruses. Toxins. Cell. Bacteria

Receptor Based Drug Design (1)

High Throughput In-Silico Screening Against Flexible Protein Receptors

Virtual Screening: How Are We Doing?

Structural Bioinformatics (C3210) Molecular Mechanics

Hydrogen Bonding & Molecular Design Peter

Docking. GBCB 5874: Problem Solving in GBCB

Structural biology and drug design: An overview

Retrieving hits through in silico screening and expert assessment M. N. Drwal a,b and R. Griffith a

Solutions and Non-Covalent Binding Forces

CHEM 4170 Problem Set #1

Relative Drug Likelihood: Going beyond Drug-Likeness

Advanced Medicinal Chemistry SLIDES B

Patrick: An Introduction to Medicinal Chemistry 5e Chapter 01

Ligand-based QSAR Studies on the Indolinones Derivatives Bull. Korean Chem. Soc. 2004, Vol. 25, No

Medicinal Chemistry/ CHEM 458/658 Chapter 4- Computer-Aided Drug Design

Molecular Mechanics, Dynamics & Docking

Plan. Day 2: Exercise on MHC molecules.

Life Sciences 1a Lecture Slides Set 10 Fall Prof. David R. Liu. Lecture Readings. Required: Lecture Notes McMurray p , O NH

Introduction. OntoChem

Nonlinear QSAR and 3D QSAR

Biophysics II. Hydrophobic Bio-molecules. Key points to be covered. Molecular Interactions in Bio-molecular Structures - van der Waals Interaction

Dispensing Processes Profoundly Impact Biological, Computational and Statistical Analyses

Bioengineering & Bioinformatics Summer Institute, Dept. Computational Biology, University of Pittsburgh, PGH, PA

In silico pharmacology for drug discovery

Dr. Sander B. Nabuurs. Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre

Introduction to FBDD Fragment screening methods and library design

Virtual screening for drug discovery. Markus Lill Purdue University

User Guide for LeDock

Why Proteins Fold. How Proteins Fold? e - ΔG/kT. Protein Folding, Nonbonding Forces, and Free Energy

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics.

Fragment Hotspot Maps: A CSD-derived Method for Hotspot identification

An introduction to Molecular Dynamics. EMBO, June 2016

Alchemical free energy calculations in OpenMM

Robust Classification of Relevant Water Molecules in Putative Protein Binding Sites

Alkane/water partition coefficients and hydrogen bonding. Peter Kenny

Analysis of a Large Structure/Biological Activity. Data Set Using Recursive Partitioning and. Simulated Annealing

Molecular Interactions F14NMI. Lecture 4: worked answers to practice questions

Using Phase for Pharmacophore Modelling. 5th European Life Science Bootcamp March, 2017

Protein-Ligand Docking Evaluations

Biochemistry,530:,, Introduc5on,to,Structural,Biology, Autumn,Quarter,2015,

Introduction into Biochemistry. Dr. Mamoun Ahram Lecture 1

LigandScout. Automated Structure-Based Pharmacophore Model Generation. Gerhard Wolber* and Thierry Langer

5.1. Hardwares, Softwares and Web server used in Molecular modeling

Molecular docking, 3D-QSAR studies of indole hydrazone as Staphylococcus aureus pyruvate kinase inhibitor

Life Science Webinar Series

Notes of Dr. Anil Mishra at 1

11. IN SILICO DOCKING ANALYSIS

Virtual affinity fingerprints in drug discovery: The Drug Profile Matching method

Aqueous solutions. Solubility of different compounds in water

Electronic Supplementary Information Effective lead optimization targeted for displacing bridging water molecule

Chemical Physics. Themodynamics Chemical Equilibria Ideal Gas Van der Waals Gas Phase Changes Ising Models Statistical Mechanics etc...

BIOC : Homework 1 Due 10/10

Saba Al Fayoumi. Tamer Barakat. Dr. Mamoun Ahram + Dr. Diala Abu-Hassan

Concept review: Binding equilibria

Microcalorimetry for the Life Sciences

Towards Physics-based Models for ADME/Tox. Tyler Day

FRAUNHOFER IME SCREENINGPORT

ChemBioNet: Chemical Biology supported by Networks of Chemists and Biologists. Affinity Proteomics Meeting Alpbach. Michael Lisurek 15.3.

MM-GBSA for Calculating Binding Affinity A rank-ordering study for the lead optimization of Fxa and COX-2 inhibitors

3rd Advanced in silico Drug Design KFC/ADD Molecular mechanics intro Karel Berka, Ph.D. Martin Lepšík, Ph.D. Pavel Polishchuk, Ph.D.

Data Quality Issues That Can Impact Drug Discovery

Development of empirical molecular interaction models that incorporate hydrophobicity and hydropathy. The HINT paradigm

Problem solving steps

October 6 University Faculty of pharmacy Computer Aided Drug Design Unit

Building 3D models of proteins

Drug Design 2. Oliver Kohlbacher. Winter 2009/ QSAR Part 4: Selected Chapters

Various approximations for describing electrons in metals, starting with the simplest: E=0 jellium model = particle in a box

Glycosaminoglycan Protein Interactions: Molecular Modeling and Simulation Methods

György M. Keserű H2020 FRAGNET Network Hungarian Academy of Sciences

Protein-Ligand Docking Methods

Carbohydrate- Protein interac;ons are Cri;cal in Life and Death. Other Cells. Hormones. Viruses. Toxins. Cell. Bacteria

schematic diagram; EGF binding, dimerization, phosphorylation, Grb2 binding, etc.

Biological Thermodynamics

Kd = koff/kon = [R][L]/[RL]

Chimica Farmaceutica

Atomic and molecular interaction forces in biology

Toward an Understanding of GPCR-ligand Interactions. Alexander Heifetz

DISCRETE TUTORIAL. Agustí Emperador. Institute for Research in Biomedicine, Barcelona APPLICATION OF DISCRETE TO FLEXIBLE PROTEIN-PROTEIN DOCKING:

Full file at Chapter 2 Water: The Solvent for Biochemical Reactions

Three Dimensional Pharmacophore Modelling of Monoamine oxidase-a (MAO-A) inhibitors

Chem 204. Mid-Term Exam I. July 21, There are 3 sections to this exam: Answer ALL questions

AGS Chemistry 2007 Correlated to: Prentice Hall Chemistry (Wilbraham) including AGS Differentiated Instruction Strategies

Supplementary Methods

Chemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller

3. Solutions W = N!/(N A!N B!) (3.1) Using Stirling s approximation ln(n!) = NlnN N: ΔS mix = k (N A lnn + N B lnn N A lnn A N B lnn B ) (3.

Lattice protein models

PATTERN RECOGNITION AND DISTRIBUTED COMPUTING

Chemical Reactions: The Law of Conservation of Mass

Similarity Search. Uwe Koch

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Presentation Microcalorimetry for Life Science Research

THERMODYNAMICS IN DRUG DESIGN. HIGH AFFINITY AND SELECTIVITY

Vol. 114 (2008) ACTA PHYSICA POLONICA A No. 6 A

Transcription:

Using Bayesian Statistics to Predict Water Affinity and Behavior in Protein Binding Sites Introduction J. Andrew Surface Hampden-Sydney College / Virginia Commonwealth University In the past several decades drug development has come a long way in testing assays of medicinal compounds for drug activity potential. Particularly, the areas of development have been in computerized molecular modeling and in silico assays that enable scientists to test thousands of drug compounds simultaneously for drug potential on various biological molecules, including proteins. Proteins have been of particular interest as drug targets because proteins control body function and have many specific drug targets that are highly specific to certain biological systems within the body. Highly specific targets are ideal for medicinal research because they have the lowest potential for negative sideeffects. Simulated docking procedures have been developed to fully understand how ligands bind to proteins and these systems have enabled medicinal chemists to build ligands that have high biological activity on proteins. Understanding docking sites and docking mechanisms have given chemists even more information on the process of making ligand-protein complexes with desirable properties. One aspect of ligand-protein binding that has not been fully understood to date is the role that water plays in both the docking process of ligands and the stasis energetics of proteins. Biological environments are aqueous, and proteins are not only full of crevices and caverns, ideal for water molecules to "slip" into, but also are covered with both hydrophobic and hydrophilic regions. These regions attract or inhibit water from interacting with the protein's different surfaces, and the multiple hydrophilic regions produce multiple H-bonding sites, filled with both H-bond acceptors and donors. The different regions on a protein that are filled with waters can be strengthened or weakened depending upon the arrangement of water molecules within the protein. They can furthermore inhibit or aid (by building H-bonding bridges between ligand and protein) by slowing the conformational change of a protein needed during a docking procedure of a ligand or by being in the ligand's way within the docking site on a protein. Obviously, then, water is a large contributor to protein conformational change and complexing abilities, both promoting and inhibiting it in various ways. Understanding how water interacts with the protein is therefore necessary to fully understand how the protein and different ligands interact in different biological environments. Further understanding the water/protein relationship and incorporating that knowledge into a molecular modeling program would greatly increase the accuracy of in silico assays in medicinal chemistry applications, aiding structure-based drug design.

To date, much work has been done by Dr. Glen Kellogg at Virginia Commonwealth University, as well as many others. The primary methodology developed to date has been a system known as HINT scoring (Hydropathic INTeractions). HINT is a developed method of scoring water molecules by their hydropathic interactions, giving a number that shows the strength of the interactions that the water molecule has with the protein at a given site. The HINT score is a number developed from a non-newtonian forcefield that is based on the partition constant between 1-octanol and water (log P o/w ) [1, 2]. It works as follows: Figure 1 [1] The HINT equation is a simple representation of the interaction between two atoms [1]. i and j are the two atoms represented in the above equation. b ij is the interaction score between the two atoms, a is the hydrophobic atom constant, S is the atomic solventaccessible surface area, T ij is a logic function that assumes the polar nature of interacting atoms (-1 or +1), and R ij is an exponential function that relates the distance between the two atoms i and j. r ij tells the distance between i and j but is related to the Lennard-Jones function [1]. Inherent within the HINT model is an incorporation of enthalpy, entropy, salvation and other energies, although they are not fully quantifiable [4]. As such, HINT score is a valuable number that represents a combinatorial data set that aids in estimating water behavior within a protein. Another factor that can be taken into account when trying to understand th ebehavior of water molecules in proteins is called the Rank. The Rank is essentially a method that measures the number of H-bonds that a water molecule has the ability to make in a given location [3]. The Rank algorithm works as follows: Figure 2 [1] Where 2.80 A is the length of an ideal hydrogen bond, i.e., between the water and an external donor or acceptor, and r n is the actual distance between the water molecule and the target hydrogen binding site on the protein. n is maximally 4, as that is the maximum number of H-bonds for a single water molecule. θ Td is the ideal tetrahedral angle between all of the hydrogen bonds and θ nm is the actual angle. This value is divided by 6 because there is a maximum number of 6 angles.

Experimentally, Rank and HINT have both been used in various applications to aid medicinal chemists in classifying water behavior in certain proteins. What has not been done, to date, is relating the Rank and the HINT score together with the goal of being able to predict the probability of a particular water molecule's behavior in any region of the protein when the protein is complexed. It is believed that using the Rank values and the HINT scores of various waters in various proteins would lead to several equations whose results could be weighted and then run through a statistical filter, out of which would come the percent-probability of water being retained in a protein. Since HINT score and Rank both incorporate the necessary data to calculate such problems, it is believed that this is the only data needed to compute the probability. Methods The proposed method of developing the algorithm to predict the water molecule behavior is as follows: to first prepare a training set of data of proteins that contain water molecules of known HINT score and Rank. This training set will then be used to develop the functions that will relate HINT score and Rank to percent-conserved waters. Using these two functions, a proper mathematical weight will be established for each one before using the results of each function in a Bayesian statistical function that will take the two percent-conserved values predicted from both Rank and HINT score functions, and output one total percent-conserved value that will establish the likelihood of the individual water being conserved. Training sets have already been established to develop the HINT score function and the Rank function. They can be seen below: % conserved/functionally displaced waters 120 100 80 60 40 20 0-200 0 200 400 600 800 1000 1200 HINT score Figure 3

% conserved/functionally displaced waters 120 100 80 60 40 20 0 0 1 2 3 4 5 6 Rank value Figure 4 The Bayesian statistical analysis will also incorporate a training set of waters with known Ranks and HINT scores. This data set will use to further train the Bayesian algorithm. Once the Bayesian algorithm has been designed based on the training set, another set of selected waters in proteins with known affinities will be tested against the algorithm by comparing the predicted percent-conserved water prediction of the algorithm from the realistic values. This will test the accuracy of the function. Potential Results Using Bayesian statistics to analyze the percent-conservation of waters has the potential of giving molecular modeling programs the ability to not only know the properties of water molecules in a particular protein, but also how they will affect the ligand-protein binding process by determining if they will remain (be conserved) or not after the protein has been complexed. The goal is to get the predictive Bayesian algorithm within 90% or greater accuracy. This will be an excellent and very useful addition to the HINT tool in molecular modeling programs such as Sybyl.

References: [1] Amadasi, A., Spyrakis, F., Cozzini, P/, Abraham, D/J., Kellogg, G.E., and Mozzarelli, A. (2006). Mapping the Energetics of Water-Protein and Water-Ligand Interactions with the Natural HINT Forcefiled: Predictive Tools for Characterizing the Roles of Water in Biomolecules.J. Mol. Bio. 358, 289-309. [2] Kellogg, G. E., Semus, S.F. & Abraham, D.J. (1991). HINT: a new method of empirical hydrophobic field calculation for CoMFA. J. Comput. Aided Mol. Des. 5, 545-552. [3] Chen, D/L. & Kellogg, G. E. (2005). A computational tool to optimize ligand selectivity between two similar biomacromolecular targets. J. Comput. Aided Mol. Des. 19, 69-82. [4] Kellogg, G.E., Abraham, D. J. (2000). Hydrophobicity: is LogP o/w more than the sum of its parts? Eur. J. Med. Chem. 35, 651-661.