Molecular Mechanics. I. Quantum mechanical treatment of molecular systems

Similar documents
Semi Empirical Force Fields and Their Limitations. Potential Energy Surface (PES)

Figure 1. Molecules geometries of 5021 and Each neutral group in CHARMM topology was grouped in dash circle.

Potential Energy (hyper)surface

Lecture 11: Potential Energy Functions

Bioengineering 215. An Introduction to Molecular Dynamics for Biomolecules

Biochemistry,530:,, Introduc5on,to,Structural,Biology, Autumn,Quarter,2015,

Molecular Modelling. part of Bioinformatik von RNA- und Proteinstrukturen. Sonja Prohaska. Leipzig, SS Computational EvoDevo University Leipzig

Session 1. Introduction to Computational Chemistry. Computational (chemistry education) and/or (Computational chemistry) education

CE 530 Molecular Simulation

Molecular Mechanics, Dynamics & Docking

( ) x10 8 m. The energy in a mole of 400 nm photons is calculated by: ' & sec( ) ( & % ) 6.022x10 23 photons' E = h! = hc & 6.

Conformational Geometry of Peptides and Proteins:

Dihedral Angles. Homayoun Valafar. Department of Computer Science and Engineering, USC 02/03/10 CSCE 769

Example questions for Molecular modelling (Level 4) Dr. Adrian Mulholland

TOPOLOGIES AND FORCE FIELD PARAMETERS FOR NITROXIDE SPIN LABELS

Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations

Molecular dynamics simulations of anti-aggregation effect of ibuprofen. Wenling E. Chang, Takako Takeda, E. Prabhu Raman, and Dmitri Klimov

Computational Chemistry. An Introduction to Molecular Dynamic Simulations

The Molecular Dynamics Method

k θ (θ θ 0 ) 2 angles r i j r i j

Lecture 2 and 3: Review of forces (ctd.) and elementary statistical mechanics. Contributions to protein stability

Introduction to Comparative Protein Modeling. Chapter 4 Part I

From Amino Acids to Proteins - in 4 Easy Steps

Water models in classical simulations

Molecular dynamics simulation of Aquaporin-1. 4 nm

THE TANGO ALGORITHM: SECONDARY STRUCTURE PROPENSITIES, STATISTICAL MECHANICS APPROXIMATION

CAP 5510 Lecture 3 Protein Structures

Lecture 2-3: Review of forces (ctd.) and elementary statistical mechanics. Contributions to protein stability

Peptide folding in non-aqueous environments investigated with molecular dynamics simulations Soto Becerra, Patricia

Introduction to Computational Chemistry

Homology modeling. Dinesh Gupta ICGEB, New Delhi 1/27/2010 5:59 PM

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES

Intro/Review of Quantum

Force Fields for Classical Molecular Dynamics simulations of Biomolecules. Emad Tajkhorshid

Why study protein dynamics?

Intro/Review of Quantum

Structural Bioinformatics (C3210) Molecular Mechanics

Section Week 3. Junaid Malek, M.D.

Structure of Cement Phases from ab initio Modeling Crystalline C-S-HC

Protein Structure Basics

An introduction to Molecular Dynamics. EMBO, June 2016

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy

DENSITY FUNCTIONAL THEORY FOR NON-THEORISTS JOHN P. PERDEW DEPARTMENTS OF PHYSICS AND CHEMISTRY TEMPLE UNIVERSITY

Biomolecules: lecture 10

LS1a Fall 2014 Problem Set #2 Due Monday 10/6 at 6 pm in the drop boxes on the Science Center 2 nd Floor

BCMP 201 Protein biochemistry

Peptides And Proteins

Dominant Paths in Protein Folding

Polypeptide Folding Using Monte Carlo Sampling, Concerted Rotation, and Continuum Solvation

The protein folding problem consists of two parts:

Density Functional Theory

Computational Biology & Computational Medicine

Born-Oppenheimer Approximation

Why Is Molecular Interaction Important in Our Life

Chemistry in Biology. Section 1. Atoms, Elements, and Compounds

Lecture 26: Polymers: DNA Packing and Protein folding 26.1 Problem Set 4 due today. Reading for Lectures 22 24: PKT Chapter 8 [ ].

Unfolding CspB by means of biased molecular dynamics

All-atom Molecular Mechanics. Trent E. Balius AMS 535 / CHE /27/2010

PROTEIN STRUCTURE AMINO ACIDS H R. Zwitterion (dipolar ion) CO 2 H. PEPTIDES Formal reactions showing formation of peptide bond by dehydration:

Molecular dynamics simulation. CS/CME/BioE/Biophys/BMI 279 Oct. 5 and 10, 2017 Ron Dror

CHAPTER 29 HW: AMINO ACIDS + PROTEINS

SUPPLEMENTAL MATERIAL

Applications of Molecular Dynamics

Protein Dynamics. The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron.

The Effect of Motional Averaging on the Calculation of NMR-Derived Structural Properties

Outline. The ensemble folding kinetics of protein G from an all-atom Monte Carlo simulation. Unfolded Folded. What is protein folding?

Chem 442 Review for Exam 2. Exact separation of the Hamiltonian of a hydrogenic atom into center-of-mass (3D) and relative (3D) components.

Subject of the Lecture:

510 Subject Index. Hamiltonian 33, 86, 88, 89 Hamilton operator 34, 164, 166

Protein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror

Biochemistry Prof. S. DasGupta Department of Chemistry Indian Institute of Technology Kharagpur. Lecture - 06 Protein Structure IV

Solutions and Non-Covalent Binding Forces

BIOCHEMISTRY GUIDED NOTES - AP BIOLOGY-

Fondamenti di Chimica Farmaceutica. Computer Chemistry in Drug Research: Introduction

Molecular Mechanics. Yohann Moreau. November 26, 2015

VALENCE Hilary Term 2018

F. Piazza Center for Molecular Biophysics and University of Orléans, France. Selected topic in Physical Biology. Lecture 1

Introduction and theoretical background

Modeling Biological Systems Opportunities for Computer Scientists

23 The Born-Oppenheimer approximation, the Many Electron Hamiltonian and the molecular Schrödinger Equation M I

A Molecular Dynamics Simulation of a Homogeneous Organic-Inorganic Hybrid Silica Membrane

Connecting NMR data to biomolecular structure and dynamics

Secondary and sidechain structures

Theory and Applications of Residual Dipolar Couplings in Biomolecular NMR

The Chemical Basis of Life

Why Proteins Fold. How Proteins Fold? e - ΔG/kT. Protein Folding, Nonbonding Forces, and Free Energy

Aqueous solutions. Solubility of different compounds in water

2: CHEMICAL COMPOSITION OF THE BODY

16 years ago TODAY (9/11) at 8:46, the first tower was hit at 9:03, the second tower was hit. Lecture 2 (9/11/17)

Chemistry. for the life and medical sciences. Mitch Fry and Elizabeth Page. second edition

Molecular mechanics. classical description of molecules. Marcus Elstner and Tomáš Kubař. April 29, 2016

Protein Folding. I. Characteristics of proteins. C α

Atomic and molecular interaction forces in biology

Statistical Mechanics for Proteins

The Molecular Dynamics Method

Proton Acidity. (b) For the following reaction, draw the arrowhead properly to indicate the position of the equilibrium: HA + K + B -

Packing of Secondary Structures

Problem Set 1

Introduction to molecular dynamics

T6.2 Molecular Mechanics

Transcription:

Molecular Mechanics I. Quantum mechanical treatment of molecular systems The first principle approach for describing the properties of molecules, including proteins, involves quantum mechanics. For example, consider an atom, which consists of nuclei and electrons. On can write a time-independent (stationary) Schrödinger equation where H ˆ ψ = Eψ, 2 H ˆ h 2 = + V 2m is the Hamiltonian, which includes the contribution from kinetic energy (the first term) and potential energy (the second term). ψ and E are the wavefunction and energy of the system (including both, kinetic and potential, parts), which can be treated as the 2 eigenfunction and eigenvalue of the operator Ĥ. The wavefunction (more precisely ψ ) describes the spatial probability distribution for a particle. Generally, Schrödinger equation can be analytically solved for a very limited set of problems, such as hydrogen atom. For molecules, especially macromolecules, direct numerical solution of Schrödinger equation is computationally unfeasible. To simplify the quantum description of molecular systems, a Born-Oppenheimer (BO) approximation is applied, which assumes that motions of nuclei and electrons can be decoupled due to large disparity in their masses. This implies that nuclei is considered fixed and only the motions of electrons are considered. As a result E and ψ in the Schrödinger equation now represent electronic properties. BO approximation serves as foundation for ab initio and semi-empiric quantum approaches. Ab initio approach neglects all the relativistic effects and further assumes that the molecular orbitals (i.e., the wavefunctions of the entire molecule) can be represented as a linear combination of atomic orbitals, which are essentially electron wavefunctions. The related coefficients in the linear combinations are determined by energy minimization. The energy of a molecule constructed in this way is referred to as Hartree-Fock energy. The Hartree-Fock energy can then be used to adjust the positions of nuclei to reflect the computed molecular orbitals. After this Hartee-Fock energy must be again recomputed. This process is repeated self-consistently until no further adjustments to the nuclei positions are necessary. The ab initio method is computationally expensive and is suitable for the systems containing < 1000 atoms. Semi-empirical method uses additional approximation by assuming that the matrix elements associated with the combinations of various wavefunctions ψ n and ψ m may be obtained from the predetermined set of atomic orbitals, which are consistent with the experimental data. Because semi-empirical method

does not require computations of complex integrals of wavefunctions, larger quantum systems can be examined (number of atoms < 10,000). An interesting simulation methodology is based on the combination of quantum and classical approaches. For example, protein atoms may be considered using quantum approach, whereas solvent and the interactions of solvent with a protein are treated classically. The advantage of quantum description is that high frequency motions or bond formation/breakage can be studied, yet classical simulations of solvent significantly reduce the computational burden. The example of such mixed, quantum/classical simulations is reported for crambin. Quantum computations were performed using semi-empirical method and the crambin with explicit water was simulated for 350 ps (Protein Structure Function Genetics 44, 484 (2001)). II. Applicability of classical approach Generally, molecular properties have quantum characteristics at low temperatures or when high-frequency (fast) motions are involved. Let us determine an approximate timescale of characteristic motions in a protein, which serves as a border between quantum and classical descriptions. The classical physics is applicable, if the thermal energy k B T B >> hν, where hν can be thought of as energy gap between quantum energy levels (h is the Planck s constant). Therefore, the frequency of molecular motions, at which classical -23-1 -34-1 description is not valid, is ν q = k B T/h B = 1.38 10 J K 300K/6.62 10 J s 6 ps. Therefore, the characteristic timescale τ q =ν q -1 0.2ps~10-13 s and any motions taking place on a timescale faster than τ q are quantum in nature. Consider now a covalent bond between an oxygen and hydrogen atoms and evaluate classically the frequency of vibration of this bond. In the CHARMM force field the hydrogen atom is linked to oxygen through the bond-length potential V=k a (r-a) 2, where r is the H-O distance, k a =400 kcal/(mol Å 2 ) is the spring constant, and a is the equilibrium distance between H and O. Applying Newton equation of motion V m& r = = 2k a ( r a) r and taking into account the fact that 1 2k ν = a, 2π m

where m =1.67 10-27 kg is the hydrogen mass, we find 2 20 1 2 400J /( mol m ) 10 4184 1 ν = 90 ps. (1) 27 23 1 2π 1.67 10 kg 6.02 10 mol Therefore, the characteristic time scale is τ=ν -1 = 0.01ps=10-14 s. We see now that the time scale of hydrogen motion is about an order of magnitude faster than the time scale of the crossover between quantum and classical physics. Therefore, the motions of hydrogen must be described by quantum physics. The same conclusion is applicable for the covalent bonds between heavy atoms (other than hydrogens) and for the bond angle potential. There is, however, a solution designed to retain classical energy terms, which is based on applying a stiff spring potential (to prevent large bond length deviations) or even simply constraining the distance r to a. Indeed, rigorous quantum calculations show that the fluctuations in r, <Δr 2 > ~ 0.1Å or about 10% of the equilibrium bond length that justifies keeping the stiff classical potential for bond lengths. Still one has to be cautions on drawing any conclusions from the fast motions (on the time scales less or about τ q =0.2ps), because these may potentially be the artifacts of classical approximation. If the vibrations of atom around their bond lengths fall in the realm of quantum mechanics, what are the types of motions, for which classical description suffices? The local motions in proteins span a wide time scale range from 10-15 to 10-1 s and include atomic fluctuations, side chain or loop motions. Thus, only extremely fast motions of this type are quantum in nature, while most of them can still be treated classically. Furthermore, classical description is clearly sufficient for rigid body motions, which involve helix and strand motions, rearrangements of several secondary structure elements in protein etc. The times scales of these motions is from 10-9 to 10 1 s. Structural transitions in proteins (helix formation, folding and unfolding, aggregation) are also well above the boundary between classical and quantum physics. The arguments presented above outline the applicability of classical description of molecules and demonstrate that many biological processes can be considered on the basis of classical physics. This lays the foundation of molecular mechanics. III. Basic principles of molecular mechanics The main idea behind the molecular mechanics (MM) is that a molecular system can be viewed as a microscopic mechanical system. According to this idea the atoms in the molecular system are linked by mechanical springs, which control their covalent bonds, angles between successive bonds, rotations around the bonds, etc. Atoms interact with each other (attract or repel) according to classical non-bonded potentials, which determine the non-bonded interatomic forces. In mathematical terms, this description requires the construction of a potential function incorporating exclusively classical terms.

This potential (or energy) function is then used to compute all the relevant forces for the Newton equation of motions, which ultimately describe the microscopic dynamics of the molecular system. Fig. 1 The idea of MM is illustrated for four atom fragment of polypeptide chain. Four types of interactions indicated make up the basis of energy function in MM.. There are three fundamental principles of MM, namely, thermodynamic hypothesis, additivity, and transferability. Thermodynamic hypothesis states that the native state of a protein corresponds to the minimum of potential energy. This is essentially a rephrased version of Anfinsen idea that folding of proteins is largely controlled by thermodynamic principles. A unique native state is encoded in the protein sequence, because the complex network interactions in the heterogeneous polypeptide chain guides a protein toward a native conformation, which is the structure of minimum energy. These ideas also lay in the center of new view on protein folding. Although it is impossible to prove this assertion rigorously, the experimental and theoretical data support it. Spontaneous folding of many single domain proteins does not rely on any external help and appears to be following internal sequence instructions. Native structures of protein are generally stable (albeit not always) against mutations, which tend to increase their native energies. Additivity assumes that the total potential function E p may be decomposed into individual terms associated with electrostatic, van-der-waals, and mechanical-like terms describing bond stretching, bond angle fluctuations and rotations. In other words, E p can be separated into local and non-local energy terms. The local part of E p is VBL + VBA + local E = V, (2) p bonds angles DA dihedrals

where V BL, V BA, V DA are bond-length, bond-angle, and dihedral potentials, respectively. Note that no cross terms are usually considered in Eq. (2), i.e., possible couplings between the bond stretching and bond angles are neglected. Because the total number of bonds, angles and dihedrals is proportional to the length of protein, E p local scales with N, the number of amino acids, as N 1. The non-local (non-bonded) part of E p is non local E p = Vij, (3) ij where V ij are non-bonded interactions between atoms i and j, including electrostatic and/or van-der-waals interactions. It is clear from Eq. (3) that E p non-local scales as N 2 that turns their calculations for large molecular systems in computational bottleneck. Note also that Eq. (3) assumes that all terms in non-local potential energy are represented by pairwise interactions. This assumption is generally not correct, because, for example, electrostatic interactions between a pair of atoms affect the charge distribution on other atoms (through polarization effects). Therefore, the more consistent form of Eq. (3) should include multi-body terms as well. However, in the majority of current force fields these polarization effects are neglected and only two-body terms are incorporated. The additivity of potential functions may be exploited in the design of multistep algorithms, which take into account faster variations in local terms as compared with non-local interactions. Transferability hypothesis suggests that the properties of atoms in large molecules can be deduced from the study of representative set of small molecules. The derived energy parameters are then transferred without modification to proteins or other larger and more complex molecular systems. Roughly speaking, the transferability implies that the bond lengths, angles etc are the same in small test molecules as well as in much larger proteins. This approach is employed in the development of all potential functions for proteins. For example, the parameters for the potential function ( force field ) OPLS (Optimal Parameters for Liquid Simulations) were derived by Jorgensen and coworkers by performing Monte Carlo simulations of 36 organic molecules and their water solutions, which have similar structures to various protein side chains (JACS 110, 1657 (1988)). The aim of Monte Carlo simulations was to choose such values of interaction parameters for the organic molecules, which closely reproduce their experimental data on density or heats of vaporization. The OPLS parameters for the potential function were then shown to reproduce the fragments of X-ray native structures of proteins. The properties of atoms depend on the specific local environment, such as bond character or electron hybridization state. For example, the properties of a carbon atom depend on whether the atom is a part of aromatic or aliphatic side chain. Similarly, hydrogen may adopt polar or non-polar properties depending on the local environment. The solution to this problem is the use of atom types, i.e., to assign properties to the same atom depending on particular environment. One of the standard current force fields, CHARMM22, uses 57 different atom types. Specifically, there are 21 different carbons, 12 hydrogens, 11 nitrogens, seven oxygens, etc.

The fragment of the CHARMM22 topology file is shown in Fig. 2. Hydrogen atom H appears as polar hydrogen (atom type H), which is found in backbone amides (Figs. 2 and 3), or as non-polar hydrogen (atom type HA), which is used in Val or Ala side chains (Figs. 2 and 3). There are also aromatic hydrogens (atom type HP) used in aromatic side chains or backbone hydrogen (atom type HB) bonded to C α -carbon (Figs. 2 and 3). Similarly, carbon C (atom type C) is used in peptide backbone (carbonyl group, Figs. 2 and 3), aromatic C (atom type CA) appears, e.g., in Phe side chain. Several aliphatic carbons (atom types CT1, CT2, CT3), which are used in backbone or Val side chains, are also included (Figs. 2 and 3). Atom type atomic mass specific environment MASS 1 H 1.00800 H! polar H MASS 2 HC 1.00800 H! N-ter H MASS 3 HA 1.00800 H! nonpolar H MASS 4 HT 1.00800 H! TIPS3P WATER HYDROGEN MASS 5 HP 1.00800 H! aromatic H MASS 6 HB 1.00800 H! backbone H. MASS 20 C 12.01100 C! carbonyl C, peptide backbone MASS 21 CA 12.01100 C! aromatic C MASS 22 CT1 12.01100 C! aliphatic sp3 C for CH MASS 23 CT2 12.01100 C! aliphatic sp3 C for CH2 MASS 24 CT3 12.01100 C! aliphatic sp3 C for CH3.. MASS 54 NH1 14.00700 N! peptide nitrogen MASS 55 NH2 14.00700 N! amide nitrogen MASS 56 NH3 14.00700 N! ammonium nitrogen. MASS 70 O 15.99900 O! carbonyl oxygen. MASS 72 OC 15.99900 O! carboxylate oxygen MASS 73 OH1 15.99900 O! hydroxyl oxygen. MASS 75 OT 15.99940 O! TIP3P WATER OXYGEN Fig.2 Fragment of CHARMM22 topology file, showing different atom types. It is necessary to differentiate the atom types from atom names as they are used in the coordinate files (Figs. 3 and 4). The atoms names must be unique for a specific residue, whereas an atom of a particular type may appear in many amino acids in a given protein. Note also that atom type is defined by a particular set of van-der-waals parameters, whereas the specific partial charge depends in a particular environment. Consider, for example, C α - and C β - carbons in Val amino acid. Both are represented by an aliphatic atom type CT1, which carry nevertheless different partial charges (Fig. 3).

Atom name Atom type Partial charge (as it appears in coordinate files) RESI VAL 0.00 ATOM N NH1-0.47! HG11 HG12 ATOM HN H 0.31! HN-N / ATOM CA CT1 0.07! CG1--HG13 ATOM HA HB 0.09! /! HA-CA--CB-HB ATOM CB CT1-0.09! \ ATOM HB HA 0.09! CG2--HG21! O=C / \ ATOM CG1 CT3-0.27! HG21 HG22 ATOM HG11 HA 0.09 ATOM HG12 HA 0.09 ATOM HG13 HA 0.09 ATOM CG2 CT3-0.27 ATOM HG21 HA 0.09 ATOM HG22 HA 0.09 ATOM HG23 HA 0.09 ATOM C C 0.51 ATOM O O -0.51 Fig. 3. Structure and composition of Val residue from CHARMM22 force field. Atom number Atom name Residue name x, y, z coordinates and number. ATOM 23 N VAL 2-2.277 1.367 0.626 1.00 0.00 ATOM 24 HN VAL 2-2.422 0.441 1.000 1.00 0.00 ATOM 25 CA VAL 2-0.968 2.037 0.765 1.00 0.00 ATOM 26 HA VAL 2-0.792 2.733-0.045 0.00 0.00 ATOM 27 CB VAL 2-0.812 2.616 2.186 1.00 0.00 ATOM 28 HB VAL 2-0.885 1.765 2.907 0.00 0.00 ATOM 29 CG1 VAL 2 0.569 3.342 2.456 1.00 0.00 ATOM 30 HG11 VAL 2 0.588 3.764 3.484 0.00 0.00 ATOM 31 HG12 VAL 2 1.424 2.640 2.371 0.00 0.00 ATOM 32 HG13 VAL 2 0.720 4.177 1.739 0.00 0.00. ATOM 36 HG23 VAL 2-2.952 3.208 2.593 0.00 0.00 ATOM 37 C VAL 2-0.018 0.898 0.591 1.00 0.00 ATOM 38 O VAL 2-0.164-0.010 1.353 1.00 0.00. Fig. 4 Fragment of coordinate file containing Val coordinates as used in CHARM22 MD simulations.

The CHARMM22 or OPLS force fields are reasonably successful in capturing many properties of proteins. One of the directions of further improvement is the incorporation of more realistic distribution in partial charges. In CHARMM22, OPLS, and many other force fields partial charges are fixed at the centers of atoms. For example, in TIP3P water (Fig. 5) +q=0.417e charges are assigned to hydrogens and -2q=-0.834e is placed on oxygen. This assumption distorts higher order electric moments, which could be correctly included if partial charges are placed off the nuclei centers. Furthermore, in the force fields with fixed partial charges the radial distribution of electrons is assumed isotropic (spherical). For example, the water oxygen has four sp3 pairs of electrons, two of which are engaged in covalent bonding with hydrogens. Two other electrons make up the two lone pairs used in hydrogen bond formation, but their spatial distribution is highly anisotropic. Yet in the TIP3P water model in CHRAMM22 this anisotropy and even the total number of free lone pairs are not taken into account. As a result the number and geometry of HBs may become distorted. The fact that water forms the right number of HBs in CHARMM22 force field (i.e., four) is a consequence of purely steric effects. Finally, many atoms or molecules may experience polarization effects when subject to external electric field. The polarization would change the distribution and the value of partial charges. New versions of force fields, which take into account the effects outlined above, are being developed, but they have not become the mainstream of MD simulations as yet. RESI TIP3 0.000! tip3p water model ATOM OH2 OT -0.834 ATOM H1 HT 0.417 ATOM H2 HT 0.417 Fig. 5 Representation of TIP3P water in CHARMM22 force field. It is important to keep in mind that MM is used to develop a force field (i.e., a potential function), which can later be used for Monte Carlo, molecular dynamics, minimization calculations, or Langevin dynamics. In other words, MM itself does not make any assumptions about the type of dynamics used in the molecular simulations. In the case of molecular dynamics, the potential function constructed using MM principles is used to compute the forces in the Newton equation of motions which act upon the atom i r E p m & i r i = Fi = r, i where r i is the radius vector of an atom i.