Simulating Folding of Helical Proteins with Coarse Grained Models

Progress of Theoretical Physics Supplement No. 138, 2000, pp. 366-371

Shoji Takada
Department of Chemistry, Kobe University, Kobe 657-8501, Japan

(Received October 11, 1999)

We describe how the potential parameters in a coarse-grained model of proteins can be optimized using the available database of three-dimensional protein structures. With the optimized potential, we simulated a three-helix-bundle protein and found that all trajectories reach the native structure within 1 microsecond. Interestingly, a quasi-mirror image is successfully discriminated from the native topology.

1. Introduction

Protein folding has been studied intensively for about 40 years, since Anfinsen's famous experiments. 1) The problem has two aspects: physical understanding of folding mechanisms 2), 3) and prediction of the three-dimensional structure. For both aspects, it is crucial to construct a model that is realistic enough to discriminate the native structure from the many non-native ones, yet simple enough to sample a wide range of conformational space with currently available computers. Models that include all atoms and solvent molecules, such as CHARMM or AMBER, are still too demanding for this purpose. On the other hand, so-called minimal models with one bead per amino acid seem to be too crude, at least for prediction. Recently, models that lie between these two limits have been proposed and studied. In this paper, we describe our recent work in this direction.

Our model includes 4 united atoms per amino acid (3 for glycine). The backbone dynamics is thereby modeled quite realistically, while the side-chain atoms are grouped into a single bead and solvent effects are taken into account only indirectly. The functional form of the interactions is devised to be consistent with physico-chemical knowledge; in particular, solvent effects are carefully taken into account via a context-dependent dielectric constant.
With this model, we performed a simulation of a 54-residue protein-like peptide (PRO54) made of three kinds of amino acids: hydrophobic, polar, and flexible. 4) The sequence of PRO54 is designed to form a three-helix bundle, imitating a laboratory-designed four-helix bundle from DeGrado's group. For PRO54, keeping the parameters within ranges not too different from experimentally anticipated values, we tuned them empirically so that the peptide can reach the three-helix-bundle form starting from any random-coil structure. After months of trial, we arrived at a set of parameters that indeed enables the peptide to fold within a microsecond. Despite this promising result, the simulated PRO54 differs significantly from natural proteins in several properties. 4) Among them: 1) the native-like state of PRO54 has significant residual fluctuation, in which the three-helix-bundle form is kept but the relative alignment of the helices changes; 2) the folding-unfolding transition is much less cooperative for PRO54; and 3) most seriously, the two quasi-mirror images of the three-helix bundle have almost the same stability for PRO54. The first two features were also observed in DeGrado's laboratory-designed peptide. The third may need more explanation. For the three-helix-bundle topology, there are two different ways of aligning the three helices. In both, all amino acids in the core are hydrophobic and the surface amino acids in the helices are polar. Thus it is natural that there is no energy gap between the two topologies. Taking these points together, we concluded that the major reason for the differences mentioned above is the three-letter code, rather than inappropriate modeling.

We now go a step further and try to simulate a natural protein, namely one made of 20 types of amino acids (although only 17 types actually occur in the protein studied in this paper). The model then includes many more parameters, and ad hoc determination of them is hopeless. Thus, we need a systematic way to determine them. We use an idea developed by Wolynes and his co-workers: 5) optimize the parameters so that the stability of the native structure relative to misfolded ones, normalized by the standard deviation of the energy fluctuation, is maximal. This is discussed in detail in the next section. With the optimized potential parameters, a folding simulation is performed for a three-helix-bundle protein, the albumin binding domain with 47 residues (pdb code: 1prb). Some preliminary results are reported in Section 3. Conclusions are given in the last section.

2. Optimization of energy parameters

We start with a short summary of the model used. An amino acid is modeled as three backbone united atoms, NH, CH, and CO, and a bead for the side chain. The latter is located near the center of mass of the non-hydrogen atoms. Molecular dynamics simulation is performed with the position Langevin equation, where Stokes' law is used to determine the friction coefficients of the atoms. All chemical bond lengths (real for the backbone, and virtual between CH and a side chain) and bond angles are fixed by the LINCS algorithm, 6) which is significantly better than the well-known SHAKE algorithm. The systematic force in the Langevin equation is calculated from the derivative of the potential function, which consists of various interactions,

$$V = V_{\omega} + V_{\phi} + V_{\psi} + V_{\mathrm{Rama}} + V_{\mathrm{vdw}} + V_{\mathrm{HB}} + V_{\mathrm{HP}} + V_{\mathrm{EL}}. \tag{2.1}$$

The terms are, in order: the hindered rotation around the $\omega$ dihedral angle (1st), that around $\phi$ (2nd), that around $\psi$ (3rd), the side-chain entropy effect representing the secondary-structure propensity (4th), the van der Waals potential (5th), the hydrogen bonding interaction (6th), the hydrophobic interaction (7th), and the electrostatic interactions (8th). The explicit expressions will be described elsewhere.

The potential function includes many energetic parameters $\epsilon$, which enter the potential energy linearly:

$$V(r, \epsilon) = \sum_i \epsilon_i u_i(r), \tag{2.2}$$

where $\epsilon_i$ is a parameter and $u_i(r)$ is a function of the protein conformation, collectively denoted as $r$. We now introduce the so-called Z score,

$$Z(\epsilon) = \frac{V(r_{\mathrm{nat}}, \epsilon) - \langle V(r, \epsilon) \rangle_D}{\Delta V(\epsilon)}, \tag{2.3}$$

that is, the (potential) energy of the native structure, $V(r_{\mathrm{nat}})$, relative to the average energy $\langle V(r) \rangle_D$ of the denatured ensemble, divided by the standard deviation of the energy fluctuation, $\Delta V$. (Note that the opposite sign to this definition is sometimes used.) It has been shown theoretically that, for a protein to fold quickly while avoiding severe trapping in misfolded states, it has to have a reasonably small Z score, i.e., negative and large in absolute value in the current definition. 5)

Our strategy to optimize the parameters is as follows. We first choose a training set of proteins for which the native three-dimensional structures are known from experiments. We use 39 proteins in this paper. Our goal is to find a set of potential parameters that can be used for the simulation of any protein. Therefore we decided to use the maximum value of the Z score, $Z_{\max}$, as an index representing the quality of the energy function. We performed simulated annealing runs in the parameter space $\epsilon$ with $Z_{\max}$ as the scoring function. Namely, assuming some initial set of parameters $\epsilon$, we compute the Z scores of the training proteins and take their maximal value, $Z_{\max}$. We then make a small change in $\epsilon$ and recompute $Z_{\max}$. The Metropolis criterion is applied to $Z_{\max}$ to decide whether the change is accepted or rejected. The procedure is repeated with decreasing temperature until an annealed parameter set is obtained. Figure 1 shows a trajectory of an annealing run, where, in addition to $Z_{\max}$, the average of the Z scores of the 39 proteins and the Z scores of the first three proteins are plotted as a function of Monte Carlo step.

Fig. 1. Z score optimization procedure as a function of Monte Carlo steps. The top curve is for $Z_{\max}$, the dashed curve is for the average of the 39 Z scores, and the other three are the Z scores of the first three of the 39 proteins. During the first 50000 steps, only the hydrophobic interaction parameters are optimized; the remaining parameters are then optimized with the hydrophobic ones held fixed.

With the optimized parameter set, we computed the Z scores of several small proteins that are not in the training set. The Z values are as follows: 3.48 for 1bdd (the pdb code), 2.64 for 1r69, 0.75 for 1coa, 1.20 for 2gb1, 1.39 for 1srl, and 0.82 for 1nmg, where the first two are all-α proteins, while the others include β sheets. Namely, for all-α proteins, the current energy function seems to be useful even if they are not in the training set. Unfortunately, this is not the case for proteins with β sheets.

3. Simulating protein folding with 20 letter code: Albumin binding domain

Since the current energy function is supposed to be good for helical proteins, we performed a folding simulation of a three-helix-bundle protein, the 47-residue albumin binding domain (6 residues are cleaved out from the sequence in the pdb file 1prb).
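Before turning to the folding runs, the Z-score optimization of Section 2 can be illustrated with a minimal Python sketch. It evaluates Eqs. (2.2) and (2.3) for a linear potential and anneals the parameters with a Metropolis test on $Z_{\max}$. The sketch rests on several assumptions that are not taken from the paper: the function names (`z_score`, `z_max`, `anneal`, `toy_proteins`) are hypothetical, the denatured ensemble is represented by random decoy feature vectors, the cooling schedule is geometric, and each move perturbs a single parameter with Gaussian noise.

```python
import numpy as np

def z_score(eps, u_nat, u_den):
    """Eq. (2.3) with the linear potential of Eq. (2.2), V = sum_i eps_i * u_i."""
    v_nat = u_nat @ eps              # energy of the native structure
    v_den = u_den @ eps              # energies of the denatured (decoy) ensemble
    return (v_nat - v_den.mean()) / v_den.std()

def z_max(eps, proteins):
    """Largest (worst) Z score over the training set."""
    return max(z_score(eps, u_nat, u_den) for u_nat, u_den in proteins)

def anneal(proteins, eps0, n_steps=2000, t_start=1.0, t_end=0.01, step=0.05, seed=0):
    """Simulated annealing in parameter space with Z_max as the scoring function."""
    rng = np.random.default_rng(seed)
    eps = np.array(eps0, dtype=float)
    current = z_max(eps, proteins)
    best_eps, best = eps.copy(), current
    for k in range(n_steps):
        # Assumed geometric cooling from t_start to t_end.
        temp = t_start * (t_end / t_start) ** (k / max(n_steps - 1, 1))
        trial = eps.copy()
        trial[rng.integers(eps.size)] += rng.normal(scale=step)  # small change in eps
        z_trial = z_max(trial, proteins)
        # Metropolis criterion applied to Z_max (lower is better).
        if z_trial <= current or rng.random() < np.exp(-(z_trial - current) / temp):
            eps, current = trial, z_trial
            if current < best:
                best_eps, best = eps.copy(), current
    return best_eps, best

def toy_proteins(n_proteins=3, n_terms=4, n_decoys=200, seed=1):
    """Hypothetical training data: (native features, decoy features) per protein."""
    rng = np.random.default_rng(seed)
    data = []
    for _ in range(n_proteins):
        u_den = rng.normal(size=(n_decoys, n_terms))
        u_nat = rng.normal(size=n_terms) - 1.0
        data.append((u_nat, u_den))
    return data

if __name__ == "__main__":
    proteins = toy_proteins()
    eps0 = np.ones(4)
    eps_opt, z_opt = anneal(proteins, eps0)
    print("Z_max before:", z_max(eps0, proteins))
    print("Z_max after: ", z_opt)
```

Scoring on $Z_{\max}$ rather than on the average Z score means that a move is accepted only if it helps (or does not hurt too much) the worst-scoring training protein, which matches the stated goal of a parameter set usable for any protein.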
We tuned one parameter, which is responsible for the strength of overall collapse. After tuning this one parameter, we found that almost all trajectories (13/14 at this moment) reach the native-like structure within 1 µs, starting from random conformations. The simulated native structure, after quenching, has a root mean square deviation (RMSD) of about 3 Å from the experimental structure. Figure 2 shows snapshots of a typical folding trajectory: after several tens of nanoseconds, collapse and helix formation occurred simultaneously. After an approximately correct topology has formed, it takes a rather long time to reach the native structure. In contrast to the simulation result for PRO54, the protein-like peptide with three types of amino acids, we found that a quasi-mirror image is seldom reached in the folding runs. Energetic analysis suggests that the quasi-mirror image has a total energy about 7 kcal/mol higher than the correct topology. This difference arises from the vdw interaction, the HB interaction, and the HP interaction.

Fig. 2. Snapshots of a typical folding trajectory for the albumin binding domain (drawn with Molscript 7) ).

4. Conclusions

An automated optimization of potential parameters is proposed and tested. We found that training the parameters on a database of 39 proteins leads to better modeling not only for the proteins in the database but also for many other proteins. In particular, all-α proteins have better scores with the current energy function. Langevin dynamics simulation of a three-helix-bundle protein was performed, and we found that folding runs from random conformations always reach a native-like structure within about 3 Å RMSD. A quasi-mirror image, in which the helix alignment is opposite to that of the native structure, is discriminated by about 7 kcal/mol, and almost all trajectories fall into the correct conformation.

Acknowledgments

I would like to thank Peter G. Wolynes and Zaida Luthey-Schulten for useful discussions. This work has been supported by the JSPS Research for the Future Program "Photo Science" and by the Grant-in-Aid on Priority Areas "Molecular Physical Chemistry".
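As a supplementary illustration of the position Langevin dynamics summarized in Section 2 and used for the folding runs, the following Python sketch shows one possible integration step. It is a sketch under stated assumptions, not the integrator used in the paper: it uses a simple Euler-Maruyama update of the overdamped equation $\gamma \dot{r} = F + \xi$ with $\langle \xi \xi \rangle = 2 \gamma k_B T$, works in arbitrary reduced units, omits the LINCS bond and angle constraints entirely, and the names `stokes_friction` and `langevin_step` are hypothetical.

```python
import numpy as np

def stokes_friction(radius, viscosity):
    """Stokes' law friction coefficient, gamma = 6 * pi * eta * a."""
    return 6.0 * np.pi * viscosity * radius

def langevin_step(pos, force_fn, gamma, kT, dt, rng):
    """One Euler-Maruyama step of the position (overdamped) Langevin equation.

    pos      : (n_atoms, 3) array of coordinates
    force_fn : callable returning the systematic force, same shape as pos
    gamma    : (n_atoms,) array of friction coefficients
    kT       : thermal energy in the same (assumed reduced) units
    """
    drift = force_fn(pos) / gamma[:, None] * dt
    noise = np.sqrt(2.0 * kT * dt / gamma)[:, None] * rng.standard_normal(pos.shape)
    return pos + drift + noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    radii = np.array([1.0, 1.5, 2.0])            # hypothetical bead radii
    gamma = stokes_friction(radii, viscosity=1.0)
    pos = rng.standard_normal((3, 3))
    harmonic = lambda r: -r                      # toy force, F = -grad(|r|^2 / 2)
    for _ in range(1000):
        pos = langevin_step(pos, harmonic, gamma, kT=0.5, dt=0.01, rng=rng)
    print(pos)
```

Because the friction coefficient scales with bead radius, larger beads diffuse more slowly at the same temperature, which is the only role Stokes' law plays in this sketch.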

References

1) A. R. Fersht, Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding (W. H. Freeman and Co., New York, 1999).
2) J. D. Bryngelson, J. N. Onuchic, N. D. Socci and P. G. Wolynes, Proteins: Struct. Funct. Genet. 21 (1995), 167.
3) J. J. Portman, S. Takada and P. G. Wolynes, Phys. Rev. Lett. 81 (1998), 5237.
4) S. Takada, Z. Luthey-Schulten and P. G. Wolynes, J. Chem. Phys. 110 (1999), 11616.
5) R. Goldstein, Z. Luthey-Schulten and P. G. Wolynes, Proc. Natl. Acad. Sci. USA 89 (1992), 4918.
6) B. Hess, H. Bekker, H. J. C. Berendsen and J. G. E. M. Fraaije, J. Comp. Chem. 18 (1997), 1463.
7) P. J. Kraulis, Molscript, J. Appl. Crystallogr. 24 (1991), 946.