Molecular Interactions F14NMI. Lecture 4: worked answers to practice questions

http://comp.chem.nottingham.ac.uk/teaching/f14nmi
jonathan.hirst@nottingham.ac.uk

(1)
(a) Describe the Monte Carlo algorithm and discuss its advantages and disadvantages with respect to classical molecular dynamics. [10]
(b) In a Monte Carlo simulation at 298 K, if a trial configuration has an energy that is 10 kJ/mol higher than the current configuration, what is the probability that the trial configuration will be accepted? [Boltzmann's constant × Avogadro's number, $k N_A$ = 8.314 J/mol/K] [5]
(c) Write down an expression describing how the potential energy, E, of a bond varies as it is distorted away from its equilibrium bond length, $r_0$. [5]
(d) Differentiate the expression you wrote down for part (c) to show that the force experienced by the atoms in a bond at its equilibrium separation is zero. [5]

Monte Carlo
- Accept a new configuration if its energy is lower, or (if its energy is higher by $\Delta E$) with probability $p \sim \exp(-\Delta E/kT)$.
- For thermodynamics, we require an ensemble that obeys (generates) the Boltzmann distribution.

MC vs MD
MD
- Time-dependent properties
- Good for large molecules (need small steps)
- Better for exploring local phase space
MC
- Constant T, P simulations are easier
- For simple liquids, thermodynamic properties converge quickly
- Non-physical moves can explore phase space more fully
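The acceptance rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the one-dimensional harmonic potential, the function names `metropolis_accept` and `sample_harmonic`, and the step size are all assumptions chosen so that the result can be checked against the known average potential energy, kT/2.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng=random):
    """Metropolis rule: accept downhill moves, uphill with p = exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def sample_harmonic(n_steps, kT=1.0, k=1.0, max_step=1.0, seed=0):
    """Metropolis sampling of x for the toy potential U(x) = 0.5*k*x**2.

    Returns the average potential energy over the run (hypothetical example).
    """
    rng = random.Random(seed)
    x, u = 0.0, 0.0
    total = 0.0
    for _ in range(n_steps):
        # Propose a random displacement (a non-physical move is fine in MC).
        x_trial = x + rng.uniform(-max_step, max_step)
        u_trial = 0.5 * k * x_trial ** 2
        if metropolis_accept(u_trial - u, kT, rng):
            x, u = x_trial, u_trial
        # A rejected move counts the current configuration again.
        total += u
    return total / n_steps

if __name__ == "__main__":
    print(sample_harmonic(200000))  # should be close to kT/2 for this potential
```

Counting the old configuration again after a rejection is what makes the visited configurations follow the Boltzmann distribution.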

Timescales
- Sampling should exceed the timescales of interest by a factor of 10.
- Size and complexity also increase the required timescales, e.g., ions, complex landscapes.
- kT per mole = RT = 8.314 × 300 J/mol ≈ 2.5 kJ/mol ≈ 0.6 kcal/mol

In a Monte Carlo simulation at 298 K, if a trial configuration has an energy that is 10 kJ/mol higher than the current configuration, what is the probability that the trial configuration will be accepted?
- kT per mole = RT = 8.314 × 298 J/mol ≈ 2.5 kJ/mol
- Probability $p \sim \exp(-\Delta E/kT)$
- $p = \exp(-10/2.5) = \exp(-4) \approx 0.018$
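The arithmetic in the answer above can be checked numerically. The sketch below uses the unrounded value of RT rather than 2.5 kJ/mol; the function name `acceptance_probability` is a hypothetical helper, not part of any library.

```python
import math

R = 8.314  # gas constant in J/mol/K, as given in the question

def acceptance_probability(delta_e_kj_per_mol, temperature_k):
    """Metropolis acceptance probability for an energy change in kJ/mol."""
    rt = R * temperature_k / 1000.0  # convert J/mol to kJ/mol
    return min(1.0, math.exp(-delta_e_kj_per_mol / rt))

p = acceptance_probability(10.0, 298.0)
print(f"RT = {R * 298.0 / 1000.0:.3f} kJ/mol, p = {p:.4f}")
```

Using RT ≈ 2.478 kJ/mol instead of the rounded 2.5 kJ/mol changes the answer only in the third decimal place, so roughly 2 trial moves in 100 of this kind would be accepted.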

Write down an expression describing how the potential energy, E, of a bond varies as it is distorted away from its equilibrium bond length, $r_0$.
$U(r) = \frac{k}{2}(r - r_0)^2$

Differentiate the expression to show that the force experienced by the atoms in a bond at its equilibrium separation is zero.
$\frac{dU}{dr} = \frac{2k}{2}(r - r_0) = k(r - r_0)$
$\mathrm{Force} = -\frac{dU}{dr} = -k(r - r_0)$
At $r = r_0$: $\mathrm{Force} = -k(r_0 - r_0) = 0$

(2)
(a) Define the following concepts: entropy, heat, internal energy. [6]
(b) Identify three entropic factors that could contribute significantly to the free energy of binding in the formation of a protein-ligand complex, and state whether they are likely to favour or disfavour binding. [6]
(c) Write bullet point notes on Free Energy Perturbation calculations. [10]
(d) How does the Helmholtz free energy differ from the Gibbs free energy, and why might it be more convenient to calculate the former rather than the latter? [3]

Define entropy, heat, internal energy
- Entropy: a measure of disorder; $S = k \ln W$ (k is the Boltzmann constant; W is the number of accessible states). For a spontaneous change, $\Delta S > q/T$, whereas $\Delta S = q_{rev}/T$, where $q_{rev}$ is the heat absorbed in a reversible process.
- Heat: $Q = \Delta U + P\Delta V$; the heat absorbed is the change in the internal energy of the system plus the work done by the system.
- Internal energy: U = KE + PE (kinetic energy + potential energy). PE comprises energy from bonded and non-bonded interactions.

Entropic factors that could contribute significantly to the free energy of binding
- Change in the entropy associated with protein motions: disfavours binding.
- If the ligand accesses a wide range of conformations in the free state, but only a narrow range when bound, its configurational entropy will drop and disfavour binding.
- Hydrophobic effect: burial of hydrophobic surface area increases the entropy of the solvent and will favour binding.

FEP theory uses non-physical legs of a thermodynamic cycle.

Helmholtz and Gibbs Free Energies
- Helmholtz free energy: $A = U - TS$
- Gibbs free energy: $G = H - TS = U + PV - TS$
- Spontaneity criteria, which refer to the system (rather than the Universe): $(\Delta A_{sys})_{V,T} \le 0$ and $(\Delta G_{sys})_{P,T} \le 0$
- $\Delta G = \Delta A + \Delta(PV)$

Free Energies and Partition Functions
If the energy is fixed (E), only microstates with that energy can be visited; the probability of finding the system in a particular microstate is:
$P_i(N,V,E) = \frac{1}{\Omega(N,V,E)}$ (microcanonical or NVE ensemble probability)
If the temperature is fixed (T), other energy levels can be visited:
$P_i(N,V,T) = \frac{\exp(-E_i/kT)}{\sum_j \exp(-E_j/kT)} = \frac{\exp(-E_i/kT)}{Q(N,V,T)}$ (canonical or NVT ensemble probability)
$Q(N,V,T) = \sum_j \exp(-E_j/kT) = \sum_E \Omega(N,V,E) \exp(-E/kT)$ (canonical partition function)

Free Energies as Ensemble Averages
$A = -kT \ln Q = kT \ln \langle \exp(H/kT) \rangle$ (up to an additive constant)
- High energy configurations contribute more to the average.
- But the system preferentially explores low energy regions of the phase space.
- We will find convergence problems in the average. Very long simulation times are required.
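For a system with a handful of discrete energy levels, the canonical probabilities and the Helmholtz free energy can be evaluated directly from the partition function. The sketch below is only an illustration: the three energy levels are made-up numbers, and `canonical_probabilities` is a hypothetical helper name.

```python
import math

def canonical_probabilities(energies, kT):
    """Return (probabilities, Helmholtz free energy) for discrete levels.

    Energies and kT must share the same units. Energies are shifted by
    their minimum before exponentiating to avoid overflow; the shift is
    added back when computing A = -kT ln Q.
    """
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    q_shifted = sum(weights)  # equals Q * exp(+e_min/kT)
    probs = [w / q_shifted for w in weights]
    free_energy = e_min - kT * math.log(q_shifted)
    return probs, free_energy

# Made-up levels at 0, 2.5 and 5.0 kJ/mol with kT = 2.5 kJ/mol (about 300 K).
probs, a = canonical_probabilities([0.0, 2.5, 5.0], 2.5)
print(probs, a)
```

The lowest level carries the largest probability, and A lies below the lowest energy because the extra accessible states add entropy.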

Free Energy Perturbation: how to do it
$\Delta A(\mathrm{I} \to \mathrm{II}) = -kT \ln \langle \exp[-(H_{II} - H_{I})/kT] \rangle_{I}$
$\Delta A(\mathrm{II} \to \mathrm{I}) = -kT \ln \langle \exp[-(H_{I} - H_{II})/kT] \rangle_{II}$
i) Simulate state I (or state II).
ii) Select a number of configurations of state I (or state II).
iii) Evaluate the energy of state II (or state I) in these configurations.
iv) Average the exponential of minus the energy difference divided by kT.
v) Obtain the Helmholtz free energy difference.
Problems:
- If states I and II are too different (their phase spaces do not overlap), then the energy difference will be too large and the configurations will not contribute significantly to the average.
- However, we can imagine intermediate states between I and II.

Free Energy Perturbation: how to do it
With N states numbered from 1 (state I) to N (state II):
$\Delta A = A_{II} - A_{I} = (A_{II} - A_{N-1}) + (A_{N-1} - A_{N-2}) + \dots + (A_{2} - A_{I})$
$\Delta A = -kT \sum_{i=1}^{N-1} \ln \langle \exp[-(H_{i+1} - H_{i})/kT] \rangle_{i} = kT \sum_{i=1}^{N-1} \ln \langle \exp[-(H_{i} - H_{i+1})/kT] \rangle_{i+1}$
We can create as many intermediate states as needed, using a coupling parameter between states I and II:
$H(\lambda) = (1 - \lambda) H_{I} + \lambda H_{II}$ (linear coupling, the most common)
- When λ = 0 we are in state I; when λ = 1 we are in state II.
- λ can take any value between 0 and 1.
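The staged procedure can be tried on a toy problem where the exact answer is known. The sketch below is an illustration under stated assumptions, not a production FEP protocol: states I and II are one-dimensional harmonic wells with force constants `k_I` and `k_II` (hypothetical), each window is sampled exactly from a Gaussian instead of by running a simulation, and the function names are invented for this example. For this system the exact result is $\Delta A = \frac{kT}{2}\ln(k_{II}/k_{I})$.

```python
import math
import random

def fep_step(k_from, k_to, kT, n_samples, rng):
    """Forward FEP estimate of A(to) - A(from) for U(x) = 0.5*k*x**2."""
    # Boltzmann distribution of a harmonic well is Gaussian, sigma^2 = kT/k.
    sigma = math.sqrt(kT / k_from)
    acc = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)                # configuration of state 'from'
        delta_h = 0.5 * (k_to - k_from) * x * x  # H_to - H_from at that x
        acc += math.exp(-delta_h / kT)
    return -kT * math.log(acc / n_samples)

def staged_fep(k_I, k_II, kT=1.0, n_windows=10, n_samples=20000, seed=1):
    """Sum the FEP estimates over windows along a linear coupling in lambda."""
    rng = random.Random(seed)
    lambdas = [i / n_windows for i in range(n_windows + 1)]
    # H(lam) = (1 - lam)*H_I + lam*H_II gives a force constant linear in lam.
    ks = [(1.0 - lam) * k_I + lam * k_II for lam in lambdas]
    return sum(fep_step(ks[i], ks[i + 1], kT, n_samples, rng)
               for i in range(n_windows))

if __name__ == "__main__":
    estimate = staged_fep(1.0, 4.0)
    exact = 0.5 * math.log(4.0)
    print(estimate, exact)
```

Each window perturbs the force constant only slightly, so neighbouring states overlap well and the exponential average converges quickly; a single jump between very different states would converge far more slowly.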

Structure-based drug design, lock and key mechanism, induced fit
- Structure-based drug design: using insight from either (i) the atomic-resolution structure of the protein (from X-ray or NMR) for de novo design or virtual screening, or (ii) the structure of the protein-ligand complex.
- Induced fit: binding of substrate or ligand to an enzyme induces a change in the enzyme structure (enhancing binding).

Induced fit
[Figures: example trajectories from http://lorentz.dynstr.pasteur.fr/trajectories/examples.php for adenylate kinase, hexokinase and Bacillus DNA polymerase]

Lock & key
"... the intimate contact between the molecules... is possible only with similar geometrical configurations. To use a picture, I would say that the enzyme and the substrate must fit together like a lock and key."
Emil Fischer, Ber. Dtsch. Chem. Ges. 1894, 27, 2985.
versus induced fit (Koshland), Proc. Natl. Acad. Sci. USA 44:98-104 (1958)

Steps in a typical docking algorithm (DOCK)
1) Add protons, van der Waals (vdW) parameters, and partial charges for both target and small molecule.
2) Calculate the solvent accessible surface area of the target.
3) Create a negative image of the surface features surrounding the active site using spheres.
4) Calculate an energy grid for the target. Each grid point stores the vdW score and charge for that area of space.
5) Match ligand atoms to sphere centres and score against the grid.
6) Rank the best scoring poses.

5 freely rotatable bonds. Assume 3 minima per bond (trans, gauche+, gauche-).
Number of low energy conformations = $3^5$ = 243

Comparative Molecular Field Analysis (CoMFA)
- Make 3D models of molecules
- Align them (non-trivial)
- Sample molecular field (steric and/or electrostatic) values
- Correlate field values with activity, using PLS

Partial Least Squares (PLS)
- Number of observations, n; number of variables, p
- p >> n: grossly underdetermined
- Extract orthogonal components
- Maximise the covariance between X and y
- Regression is part of the algorithm
$y = \beta_1 t_1 + \beta_2 t_2 + \beta_3 t_3 + \dots$
$t_1 = w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots$

PLS is a rotation
[Figure: plot showing the component axes $t_1$ and $t_2$]
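The ideas above (orthogonal components, weights that maximise covariance with y, regression built into the algorithm) can be illustrated with a small pure-Python sketch. The slides do not name a specific algorithm; the version below assumes the common NIPALS-style PLS1 procedure for a single response. The function names and the tiny data set (4 observations, 6 variables, so p > n) are invented for illustration only.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def pls1(X, y, n_components):
    """NIPALS-style PLS1 sketch. Returns (fitted y values, score vectors)."""
    n, p = len(X), len(X[0])
    # Centre each column of X and centre y.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    fitted = [y_mean] * n
    scores = []
    for _ in range(n_components):
        # Weights proportional to X^T y maximise cov(Xw, y) for unit-length w.
        w = [dot([row[j] for row in Xc], yc) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in Xc]  # scores: t = w1*x1 + w2*x2 + ...
        tt = dot(t, t)
        beta = dot(t, yc) / tt           # regression of y on this component
        load = [dot([row[j] for row in Xc], t) / tt for j in range(p)]
        # Deflate X and y so that the next component is orthogonal to t.
        Xc = [[row[j] - t[i] * load[j] for j in range(p)]
              for i, row in enumerate(Xc)]
        yc = [yc[i] - beta * t[i] for i in range(n)]
        fitted = [fitted[i] + beta * t[i] for i in range(n)]
        scores.append(t)
    return fitted, scores

# Invented data: 4 observations, 6 variables (underdetermined, p > n).
X_demo = [[1, 2, 0, 1, 3, 2], [2, 1, 1, 0, 2, 4],
          [0, 3, 2, 1, 1, 1], [3, 0, 1, 2, 0, 3]]
y_demo = [1.0, 2.0, 0.5, 3.0]
fitted_demo, scores_demo = pls1(X_demo, y_demo, 2)
print(fitted_demo)
```

Because X is deflated after each component, successive score vectors are orthogonal, and each added component can only reduce the residual in y.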

2D or not 2D?
3D
- Time-consuming: structure generation; alignment
2D
- Based only on properties derivable from atom types and connection tables
- Faster
- Better (on occasion*)
*Brown, R. D.; Martin, Y. C. Use of Structure Activity Data to Compare Structure-Based Clustering Methods and Descriptors for Use in Compound Selection. J. Chem. Inf. Comput. Sci. 1996, 36, 572-584.

Computing BioMolecular Interactions
- Protein-ligand interactions
- Drug design
- Quantitative Structure-Activity Relationships (QSAR)
- Docking
- Molecular Dynamics Simulation