Nov Julien Michel

Similar documents
Introd uction to Num erical Analysis for Eng ineers

Procedures for Computing Classification Consistency and Accuracy Indices with Multiple Categories

Protein- Ligand Interactions

OPTIMISATION PROCESSES IN TIDAL ANALYSIS

Few thoughts on PFA, from the calorim etric point of view

С-4. Simulation of Smoke Particles Coagulation in the Exhaust System of Piston Engine

APPLICATION OF AUTOMATION IN THE STUDY AND PREDICTION OF TIDES AT THE FRENCH NAVAL HYDROGRAPHIC SERVICE

GENETIC ALGORlIHMS WITH AN APPLICATION TO NONLINEAR TRANSPORTATION PROBLEMS

TIDAL PREDICTION WITH A SMALL PERSONAL COMPUTER

Large chunks. voids. Use of Shale in Highway Embankments

Presented by Arkajit Dey, Matthew Low, Efrem Rensi, Eric Prawira Tan, Jason Thorsen, Michael Vartanian, Weitao Wu.

Chapter 5 Workshop on Fitting of Linear Data

EC 219 SA M PLING AND INFERENCE. One and One HalfH ours (1 1 2 H ours) Answerallparts ofquestion 1,and ONE other question.

Real Gas Equation of State for Methane

Author's personal copy

MAST ACADEMY OUTREACH. WOW (Weather on Wheels)

The M echanism of Factor VIII Inactivation by H um an Antibodies

A Study of Drink Driving and Recidivism in the State of Victoria Australia, during the Fiscal Years 1992/ /96 (Inclusive)

3B/16. L a force portante des fondations en coins

C o r p o r a t e l i f e i n A n c i e n t I n d i a e x p r e s s e d i t s e l f

(2009) Journal of Rem ote Sensing (, 2006) 2. 1 (, 1999), : ( : 2007CB714402) ;

Nuclear Science Research Group

Archit ect ure s w it h

The Ability C ongress held at the Shoreham Hotel Decem ber 29 to 31, was a reco rd breaker for winter C ongresses.

Straight Lines. Distance Formula. Gradients. positive direction. Equation of a Straight Line. Medians. hsn.uk.net

On the convergence of solutions of the non-linear differential equation

Alles Taylor & Duke, LLC Bob Wright, PE RECORD DRAWINGS. CPOW Mini-Ed Conf er ence Mar ch 27, 2015

Impact of Drink-drive Enforcement and Public Education Programs in Victoria, Australia

The Ind ian Mynah b ird is no t fro m Vanuat u. It w as b ro ug ht here fro m overseas and is now causing lo t s o f p ro b lem s.

BOGOLIUBOV LABORATORY OF THEORETICAL PHYSICS. Theoretical Physics at BLTP. Precision tests of the Standard Model at LHC

The Effectiveness of the «Checkpoint Tennessee» Program

Light-absorbing capacity of phytoplankton in the Gulf of Gdańsk in May, 1987

Chapter - 3. ANN Approach for Efficient Computation of Logarithm and Antilogarithm of Decimal Numbers

On the M in imum Spann ing Tree Determ ined by n Poin ts in the Un it Square

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

F O R M T H R E E K enya C ertificate of Secondary E ducation

Mark Schem e (Results) March GCSE Mathem atics (Linear) 1MA0 Higher (Calculator) Paper 2H

Interpretation of Laboratory Results Using M ultidimensional Scaling and Principal C om ponent Analysis*

Chapter 5. Self-condensation of acetophenone

(IGBP) km 2,? PRO GR ESS IN GEO GRA PH Y. V o l. 20, N o. 4 D ec., 2001 : (2001) m m 3,

M A T R IX M U L T IP L IC A T IO N A S A T E C H N IQ U E O F P O P U L A T IO N AN A L Y S IS

Oscar Cubo M edina fi.upm.es)

Effect of Methods of Platelet Resuspension on Stored Platelets

HYBRIDISATION OF A WAVE PROPAGATION MODEL (SWASH) AND A MESHFREE PARTICLE METHOD (SPH) FOR REAL APPLICATIONS

(Received March 30, 1953)

To link to this article:

Designing the Human Machine Interface of Innovative Emergency Handling Systems in Cars

Exploring the energy landscape

Introduction to Machine Learning CMU-10701

The S tructure of L iquid H ydrogen Peroxide

ECE 302 Division 1 MWF 10:30-11:20 (Prof. Pollak) Final Exam Solutions, 5/3/2004. Please read the instructions carefully before proceeding.

Jia, Yudan (2011) Numerical modelling of shaft lining stability. PhD thesis, University of Nottingham.

INERTIAL SURVEY SYSTEM (ISS)

Boris Backovic, B.Eng.

Applying Metrics to Rule-Based Systems

Lectures 5 & 6: Hypothesis Testing

CHEM-UA 652: Thermodynamics and Kinetics

UNIT 1: CHEMISTRY AND SOCIETY

Some Filtering Techniques for Digital Image Processing

Strategies for E stim ating B ehaviou ral F requency in Survey Interview s

Prim ary I onization Track (Gases)

JP3.7 SHORT-RANGE ENSEMBLE PRECIPITATION FORECASTS FOR NWS ADVANCED HYDROLOGIC PREDICTION SERVICES (AHPS): PARAMETER ESTIMATION ISSUES

M ultiple O rg anisations of E ven ts in M em ory

HYDROGRAPHIC SOUNDINGS IN SURF AREAS

O p tim ization o f P iezo electric A ctu a to r C onfigu ration on a F lexib le F in for V ib ration C ontrol U sin g G en etic A lgorith m s

Monte Carlo integration (naive Monte Carlo)

A GRAPHICAL METHOD OF ADJUSTING A DOUBLY-BRACED QUADRILATERAL

The effect of pollutants of crude oil origin on the diffusive reflectance of the ocean*

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

The use of small airplanes to gather swan data in Alaska

Mark Schem e (Results) Novem ber GCSE Mathem atics (Linear) 1MA0 Higher (Non-Calculator) Paper 1H

2 Semester Final Exam Study Guide

Analysis and Experimental Study of Vibration System Characteristics of Ultrasonic Compound Electrical Machining

THERMAL MASS FLOW MEASUREMENT

Fr anchi s ee appl i cat i on for m

Bayesian Methods with Monte Carlo Markov Chains II

Soil disturbance during continuous flight auger piling in sand

LSU Historical Dissertations and Theses

T h e C S E T I P r o j e c t

Sodium-Initiated Polymerization of Alpha- Methylstyrene in the Vicinity of Its Reported Ceiling Temperature

Practical Baraminology

Contr, Mo..?? M arin e B iologies! L ab o rato ry H singer D anm ark

T he poisoning of a palladium catalyst by carbon m onoxide

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

NORTHW ESTERN UNIVERSITY. A Case S tudy of Three Universities A DISSERTATION SUBMITTED TO THE GRADUATE SCHOOL

M a rtin H. B r e e n, M.S., Q u i T. D a n g, M.S., J o se p h T. J a in g, B.S., G reta N. B o y d,

Monte Carlo. Lecture 15 4/9/18. Harvard SEAS AP 275 Atomistic Modeling of Materials Boris Kozinsky

Geometric Predicates P r og r a m s need t o t es t r ela t ive p os it ions of p oint s b a s ed on t heir coor d ina t es. S im p le exa m p les ( i

Basics of Statistical Mechanics

Today: Fundamentals of Monte Carlo

H am iltonian Therm ostatting Techniques for. M olecular D ynam ics Sim ulation.

Basic Semantics of Prolog. CAS 706 Program m ing Language. Xinjun Wu mcmaster.ca

ML estimation: Random-intercepts logistic model. and z

Benchm ar k One St ud y Guid e: Science Benchm ar k Wed. Oct. 2nd

for Global Optimization with a Square-Root Cooling Schedule Faming Liang Simulated Stochastic Approximation Annealing for Global Optim

MATH 3510: PROBABILITY AND STATS July 1, 2011 FINAL EXAM

Multiple time step Monte Carlo

Essentials on the Analysis of Randomized Algorithms

Introduction to Design of Experiments

7.1 Coupling from the Past

Transcription:

Int roduct ion t o Free Ene rgy Calculat ions ov. 2005 Julien Michel

Predict ing Binding Free Energies A A Recept or B B Recept or Can we predict if A or B will bind? Can we predict the stronger binder? Can we do this reliably and quickly?

W hat is in t he t oolbox? Rigorous Free Energy M e t hods Accuracy Em pirical Scoring Funct ions Approxim at e Free Energy M e t hods seconds / com pound CPU t im e days / com pound

The im port ance of t he ent ropy Better Lennard Jones interactions with the protein But configurat ionaly rest rict ed Experim ental Binding Free Energy difference : + 1.8 kcal.m ol -1 Free Energy calculations capture aspects of ligand binding that are overlooked (or crudely considered) by em pirical scoring funct ions : Ent ropic effect s Solvat ion

The concept of Phase space 1 dim ensional harm onic oscillator U ( x) U ( x,y) y x x U(x) = k 1 (x x o ) 2 2 dim ensional harm onic oscillator x y U(x,y) = k 1 (x x o ) 2 + k 2 (y y 0 ) 2 x Serious sim ulations typically requires thousands of degrees of freedom and hence the phase space is in thousands of dim ensions Mom otina I love you!

St at ist ical Physics in one equat ion P ens = P( r exp ) exp ( βu ( r )) ( ) βu ( r ) dr dr P P ens = ( r ) π ( r ) dr Boltzm ann dist ribut ion Ensem ble Average of P Value of P for one configuration r P = P (Post ulat e) macroscopic ens Point in Phase space

um erical int egrat ion : Quadrat ure ( ) Two atom s on a 7*7 checkerboard : 49*49 = 2401 different states contributes to eq (2) Lot s of states! In eq (2), the integral is in 3 dim ensions where is the num ber of atom s sim ulated. And there are thousands of atom s to sim ulate Direct Integration is not feasible Im portant rem ark : Many, m any of the possible states involve overlaps of atom s and hence are highly unlikely

um erical e int e grat ion : M ont e Carlo Generate random ly the position of the atom s on the checkerboard, calculate the property of interest I s (X i ) Repeat tim es The quantity I est is an estim ate of the integrant I I est = 1 i =1 ( X The essential advantage over quadrature techniques is that the m ethod can be applied to system s of high dim ensionality without having to build a m esh that covers space I S i )

W hat is t he volum e of a sphere? 1 2 3 k 2 V π = V R k Γ + 1 2 2 k um ber of dim ensions k k 1 2 3 5 10 50 100 V/V R 1.00E+ 00 7.85E-01 5.24E-01 1.64E-01 2.49E-03 1.54E-27 1.87E-69 As k increases, the vast m ajority of the points in the k-dim ension space lies outside of the sphere Random selection of points doesn t work Strong analogy with system s in the condensed phase. There are few low energy states that contributes m eaningfully to the integral and m any high energy states (e.g atom ic overlaps) that do not contribute.

Im port ance Sam pling Instead of drawing random points from an uniform distribution, draw points from a distribution π. The Monte Carlo integration equation becom es I est = 1 I ( X i i = 1 from π π ( X i ) ) π is selected so that points are in the region of space which contributes the m ost to the integrand (e.g, in the sphere) The bias on the selection of X i is rem oved when the contribution to the integrand is estim ated.

Im port ance Sam pling : Exam ple f ( x) = 3x I = 1 0 2 f ( x) π π ( x) k k ( x) dx π π 0 1 = = 1 4x 3 4 3.5 3 2.5 Estim ate of I after draw ing 100 sam ples w ith tw o different im portance sam pler Y 2 1.5 1 Funct ion Average St d. Dev 0.5 π 0 1.027 0.111 0 π 1 0.999.036 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X f(x) =3x**2 importance distribution uniform distribution

Im port ance Sam pling in St at ist ical Physics P ens = P( r exp ( βu ( r )) ( βu ( r )) dr ) exp If P(r ) does not dom inate the product in the num erator, then an ideal im portance sam pling function to estim ate equation (2) is : dr π ( r ) i = exp exp ( βu ( r )) ( βu ( r ) ) dr i Problem : Im possible to draw sam ples from π(r i ) without knowing the denom inator, which we can t (as it involves solving directly a very difficult int egral)

M arkov Chains ( in a nut shell) A Markov Chain is a set of probabilistic rules which governs transitions between states and is often represented as a transition m atrix Π p11 p12 p13 Probability of m oving from state 1 Π = p21 p31 p22 p32 p23 p33 to 3 Assum ing Π obeys a num ber of m athem atical properties, then the following is true π (1) ( n) = lim n p Π p (1) represent an arbitrary initial distribution (e.g, a random s starting point) and Π (n) represents n applications of the transition m atrix Π This equations m eans : Repeated applications of the transition m atrix Πconverges any initial arbitrary distribution p (1) towards a unique lim iting distribution π If the transition m atrix is properly constructed, we can draw sam ples from p without having to specify π a priori ( e.g, not have to solve the bad integral )

M arkov Chains M ont e Carlo : M et ropolis M ont e Carlo ( 1 9 5 3 ) 1. Start in state I 2. Attem pt a m ove to state j with probabiliy pij 3. Accept this m ove with probability α ij π j exp( βu j ) / Z = = = exp( β ( U π exp( βu ) / Z i i 6. If the m ove is accepted, set i = j, otherwise i = i 7. Accum ulate any property of interest A(i) 8. Return to 1 or term inate after iterations j U i )) Perhaps the m ost im portant 20th century contribution to num erical science

( U ( r )) π ( r dr A = exp + β ) Absolut e free ene rgy calculat ion : P P ens = ( r ) π ( r ) dr ( ) U ( r ) π ( r dr A = exp + β ) A = exp( + βu ( r ) The first term in the integral is significant com pared to the second term. Regions of space that contributes to the integral are not always those that have a significant value of π (r) Very slow convergence of t he free energy Anyway What is the free energy of a cactus?

* probability would be m ore accurate Relat ive free energy calculat ion : Zwanzig equat ion (1954) 1 AA B = ln exp β β ( ( U ) B ( r ) U A( r ) A If A and B are sim ilar, then m ost configurations of the system A will have a sim ilar energy to the system B and the energy difference is 0 Much better convergence because only a sm all num ber of configurations have to be generated (those that differs in energy* between A and B) Energy Ua Ub (Ub-Ua) Phase Space

The coupling param et er λ : A and B have to be very very sim ilar for the previous equation to yield converged results in a reasonable tim e Solution, m ulti stage the calculation with as m any interm ediates potential as it takes A B A A = A = 1 k= 0 1 β ln exp ( β ( U ( r ) U ( r ))) λ k + 1 λ k λ k The interm ediates potentials U k can be anything you want. Typically taken by linear com bination of the potential for system A and B U k λ = (1 λk ) U A + λ k U B

Therm odynam ic Int egrat ion : A = λ 1 A B = λ = 0 U λ λ dλ Another way to get the free energy difference between A and B The quantity in brackets is the ensem ble average of the gradient of the energy with respect to the coupling param eter Calculated by im plem enting λ derivatives in the sim ulation code Or by finite difference approxim ation U U λ λ If the change A to B is big, the gradients varies a lot and m any points are necessary to obtain a good estim ate of A by num erical integration

Sum m ary G? 0.0 λ 1.0

Sum m ary H λ G = 1 = λ= 0 λ λ dλ λ = 0. 0 λ = 0. 5 λ = 1. 0 G

Relat ive Binding Free e Energy Calculat ions by a The rm odynam ic cycle A in water A in protein G 3 G 1 G 2 G 4 B in water B in protein G = G 4 - G 3 = G 2 - G 1

Problem s and pit falls : Forcefield The potentials U A and U B are approxim ate and this affect the quality of the results (ligand polarisation in water and binding site?) Convergence Long sim ulation tim es required to get a good estim ate of the free energy change (= m any hours of CPUs) Many interm ediate potentials have to be run to get a god estim ate of the free energy change (= m any CPUs) It is im possible to know if your results have converged! Som e calculations give the sam e answer if you start from a different point in phase space and run with a different random num ber. Other never gives the sam e answer This restrict the applicability of the m ethod to system that are very sim ilar.

Diagnosing convergence: Block averaging Cut the raw data into chunks and analyse independently each chunk to get a free energy. Measure the variance of the result ing dist ribut ion of free energies. Independent runs Run the com plete sim ulation m any tim es, from different starting points Therm odynam ic cycle closure A G A B C B G A C G C B G A B - G C B G A C = 0 Deviations are a m easure of lack of convergence one of the m ethods ensure your results are trully converged

W hy is it so difficult? U Phase space In docking, we only care about the single lowest energy point in phase space and the search algorithm s can cut any corner to get there But in statistical physics, you need to know the probability of each low energy point in phase Very difficult to explore quickly phase space and sim ultaneously obtain a good estim ate of the probability of the low energy states

Current scope of free e energy calculat ions: Biot in/ Sprept avidin COX-2. COX-1 Trypsin HIV-1 prot ease P38 ( ) We need Better algorithm s to run m ore quickly Typical error in a successful free energy sim ulation = 1 kcal.m ol -1 (e.g., a 5-6 fold difference in IC 50 at room tem perature) Better forcefields to get m ore accurate results Better sam pling m ethods to get m ore precise results Better ways to sim ulate different ligands to deal with a m ore diverse set of com pounds

Quest ions.