Natural proteins have the algorithm of finding the global

Size: px
Start display at page:

Download "Natural proteins have the algorithm of finding the global"

Transcription

1 Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study George Chikenji*, Yoshimi Fujitsuka, and Shoji Takada* *Department of Chemistry, Faculty of Science, and Graduate School of Science and Technology, Kobe University, Nada, Kobe , Japan; Department of Computational Science and Engineering, Graduate School of Engineering, Nagoya University, Nagoya , Japan; and Core Research for Evolutionary Science and Technology, Japan Science and Technology Agency, Nada, Kobe , Japan Edited by R. Stephen Berry, University of Chicago, Chicago, IL, and approved December 30, 2005 (received for review September 22, 2005) Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of chimera proteins. In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape. computational protein design energy landscape fragment assembly Go model SIMFOLD Natural proteins have the algorithm of finding the global minimum of their free energy surface within a biologically relevant timescale (1). One of the most stringent tests of how much we understand the algorithm may be to predict protein tertiary structures by simulating processes that are analogous to folding, which is often called de novo structure prediction. Recently, significant progress has been made in de novo structure prediction, in which the most successful method is the fragment assembly (FA) method developed by Baker and coworkers (2) and others (3 6). The FA method shows considerable promise for new fold targets of recent Critical Assessments of Techniques for Protein Structure Prediction (CASPs), the community-wide blind tests of structure prediction (7 11). In the FA method, the protocol is separated into two stages: First, we collect structural candidates for every short segment of the target sequence, retrieving them from the structural database. The second stage is to assemble fold these fragments for constructing tertiary structures that have low energies. Simple questions arose as to why the FA method is so successful and what we can learn about protein folding from the success of the FA method. According to Baker and coworkers (12, 13), the FA method is based on the experimental observation that local sequence of a protein biases but does not uniquely decide its local structure. To what extent does the modest local bias influence tertiary structures generated? How is the FA method related to recently developed protein folding theory (14, 15)? In this work, we address these questions. For this purpose, we need structure prediction software that uses the FA method. Here, we use the in-house-developed software, SIMFOLD. SIMFOLD employs a coarse-grained protein representation and has its own energy function derived from physicochemical considerations (16, 17). Conformational sampling is performed by the FA method (18, 19). In the recent CASP6, our group Rokko (as well as the server Rokky) used the SIMFOLD software as a primary prediction tool, demonstrating one of the top-level prediction performances in the new fold category (elite club membership by the assessor, B.K. Lee). In this work, we start with comparison of the structures predicted in the past. We then design a structure prediction test of chimera proteins, where the local structural preference is that for the wild-type (WT) sequence and nonlocal interactions are the only sequence-independent compaction force. We show for some small proteins that the chimera proteins can find the native-like topology of the intact proteins with high probability, indicating that local structural biases and nonspecific compaction forces together lead proteins to their native folds. We further explore this idea by performing exact calculation of the HP lattice model of proteins. The result supports the above suggestion and further indicates that compact structures that are fully compatible with weak bias in local structure are few, one of which is the native fold. These studies elucidate that local structural biases of modest strength make the energy landscape of proteins funnel-like. Results and Discussion Lesson from Structure Prediction in the Past. We start with two observations about comparisons of structures predicted in the past, which provide important insight. First, by using the same energy function (SIMFOLD), we compared the prediction performance by molecular dynamics (MD) simulation with that by the FA method for some small proteins (17). We found that the performance is strikingly different, albeit use of the same energy function. In the case of protein G, for example, both methods found roughly correct secondary structure elements at the right residues, whereas their packing in three dimensions was quite different. MD simulations generated many misfolds with the wrong topology, but not the native fold (four structures that are not marked X in Fig. 1a are snapshots from MD). In contrast, FA could successfully generate the native fold with quite high population. (The structure labeled with Native Fold in Fig. 1b is from FA simulation.) Why did it happen, and what does it tell us? The lowest energy of the misfold found in MD was slightly lower than that around the native basin, indicating that the SIMFOLD energy was not good at identifying the native fold. So, why could FA generate the native fold with high population? Because the FA method restricts local structure to that found in the database, local structure is not affected by the energy function used, as is the case for MD. Indeed, the lowest-energy structures found in MD contained some local structures, especially in the loops, that never appear in the structural database, implying inaccuracy of local interac- Conflict of interest statement: No conflicts declared. This paper was submitted directly (Track II) to the PNAS office. Abbreviations: CASP, Critical Assessment of Techniques for Protein Structure Prediction; FA, fragment assembly; MD, molecular dynamics; PDB, Protein Data Bank; rmsd, rms deviation. To whom correspondence should be addressed. stakada@kobe-u.ac.jp by The National Academy of Sciences of the USA BIOPHYSICS cgi doi pnas PNAS February 28, 2006 vol. 103 no

2 Fig. 1. Comparison of structures generated by MD (a) and FA (b) with the same energy function and corresponding schematic energy landscapes. (a) MD simulations generated four misfold structures, suggesting a highly rugged energy landscape. (b) The FA method generated the native fold with high probability, suggesting a funnel-like landscape. Structures without red X were obtained by the corresponding simulation, whereas those with red X were not observed. The cartoons of protein structures were generated by MOLSCRIPT (59). tion of SIMFOLD, e.g., because of ignored local hydrogen bonding between main chain and side chain. The FA method is free from this inaccuracy in local interactions, which may be the primary reason that the FA method works so well. Success of FA, in turn, suggests importance of local interactions for encoding the native fold. The second observation is from the recent CASPs. In the new fold category of CASP6, most prediction groups of high score used FA-based methods. Although each group used its own scoring (i.e., energy) function, predicted structures were sometimes similar to each other. Obviously, well predicted structures were similar. Interestingly, sometimes predicted structures with the wrong fold were somewhat similar. A nice example is the target T0201 from CASP6. Shown in Fig. 2 are prediction models of three independent groups, Rokko, Baker ROBETTA, and Jones UCL, by the FA methods. Apparently, they have the wrong fold, but are very similar to each other. This observation suggests that the prediction of the FA method is robust against difference in scoring functions. Chimera Experiment. Motivated by these two observations, we designed a chimera experiment for addressing robustness of the FA method with respect to the energy function and the role of local interactions for protein folding. Here, we note that local and nonlocal interactions are treated explicitly in separate stages in the FA method; local structural preference is taken into Fig. 2. Examples of structure predictions from the new fold category in CASP6. (a) The native structure of target T0201 (PDB ID code 1s12). (b) Model 1 from group Rokko. (c) Model 6 from group Baker Robetta. (d) Model 1 from group Jones UCL. account when fragment candidates are listed up in the first stage, whereas nonlocal interactions play central roles in assembling folding these fragments in the second stage. Taking advantage of this separation, we designed the following chimera experiment. We use protein G (56-residue protein) as an example. First, we prepared fragment candidates of every overlapping eight residues, as usual, for protein G. Second, in the assembling stage, we used the energy function for the 56-residue polyvaline (polyval) sequence instead of the sequence of protein G and performed multicanonical Monte Carlo simulations of assembling (18). Thus, the simulation was conducted for the chimera interactions; local structural preference is for protein G, and nonlocal interaction is for polyval. For the latter, we chose Val nearly by chance as a representative of a hydrophobic residue. Hydrophobic residues tend to attract each other, thus introducing nonspecific force of compaction (collapse). Note that we are not interested in polyval folding itself. We tested the same experiment using polyisoleucin and polyleucine and found essentially the same results (data not shown), confirming that the choice of Val is not essential. As a control experiment, we also performed structure prediction of protein G using the WT sequence in the folding assembling stage, as usual. The chimera and control experiments are denoted as [polyval WT] and [WT WT], respectively. (Here, [A B] means that fragment candidates are prepared for B and the assembling simulation uses interactions of sequence A.) The fragment candidates prepared are identical between chimera and control experiments. We now show the results. Fig. 3 a and b plot energies and rms deviations (rmsds) of structures generated by assembling simulations of [WT WT] (control) and [polyval WT] (chimera), respectively. Here, the rmsd is measured from the native. We clearly see a stronger correlation between rmsd and a lower envelope of energies in the [WT WT] case than that in the [polyval WT] case, suggesting that the WT sequence has a more funnel-like energy landscape, and thus the prediction performance of the WT sequence is better than that of the chimera protein. Surprisingly enough, there is weak, but significant, correlation between the rmsd and the lower envelope of energies in the [polyval WT] case. Next, as usual in structure prediction, we performed cluster analysis of low energy structures. Fig. 3 d and e show the structures of the largest cluster centers for [WT WT] and [polyval WT], respectively. (The native structure of protein G is depicted in Fig. 3c as a reference.) As expected, the predicted structure of the [WT WT] case is quite similar to the native (rmsd is 4.0 Å). Surprisingly, the predicted structure of the [polyval WT] case also has the correct fold (rmsd of 5.1 Å), although secondary structure elements are slightly perturbed; the second -strand is too short. The size of the largest cluster for [polyval WT] is reasonably large ( 20% in population) by the standards of de novo structure prediction, although it is smaller than the control case. The structure prediction performance by the FA method is unexpectedly robust with respect to the energy function used for assembling. This robustness is not limited to protein G. We performed the same chimera and control experiments as above for an all -protein, 434 C1 repressor [Protein Data Bank (PDB) ID code 1r69), and an all -protein Hex1 (PDB ID code 1khi). rmsd energy plots of 434 C1 repressor for the [WT WT] and that for the [polyval WT] cases are shown in Fig. 3 f and g, respectively. The two figures look quite different, reflecting the difference in energy functions used in assembling: For the [WT WT] case, weak energetic bias is found toward smaller rmsd, whereas it is not seen in the chimera case. As was the case for protein G, performing cluster analysis for low-energy structures, we identified that the structures of the largest cluster center have the native fold for both cases (see Fig. 3 i and j), although the predicted structure for [polyval WT] (rmsd 5.5 Å) is worse than cgi doi pnas Chikenji et al.

3 Fig. 3. Results of control and chimera experiments of protein G (a e), 434 C1 repressor (f j), and Hex1 (k o). (a) rmsd energy plots of protein G for [WT WT]. (b) rmsd energy plots for [polyval WT]. (c) The native structure of protein G (PDB ID code 1pgb). (d) The structure of the largest cluster center of protein G for [WT WT]. (e) The structure of the largest cluster center for [polyval WT]. (f) rmsd energy plots of 434 C1 repressor for [WT WT]. (g) rmsd energy plots for [polyval WT]. (h) The native structure of 434 C1 repressor (PDB ID code 1r69). (i) The structure of the largest cluster center of 434 C1 repressor for [WT WT]. (j) The structure of the largest cluster center for [polyval WT]. (k) rmsd energy plots of Hex1 for [WT WT]. (l) rmsd energy plots for [polyval WT]. (m) The native structure of Hex1 (PDB ID code 1khi). (n) The structure of the second largest cluster center of Hex1 for [WT WT]. (o) The structure of the largest cluster center for [polyval WT]. that for [WT WT] (rmsd 3.5 Å). Results for Hex1 are shown in Fig. 3 k and l. In this case, there is no correlation between rmsds and energies for both [WT WT] and [polyval WT] cases, and thus it is expected that predictions are worse than the previous cases of protein G and 434 C1 repressor. Fig. 3 n and o shows the structure of the second largest cluster center from the [WT WT] simulations and the structure of the largest cluster center from the [polyval WT], respectively. Interestingly, the structure for [polyval WT] has the correct fold with rmsd of 5.4 Å, whereas that for the [WT WT] simulations has a pair of -strands (fourth strand in yellow and fifth strand in red) flipped, resulting in a larger rmsd of 5.9 Å. (The largest cluster of the [WT WT] simulations is even worse.) The above chimera experiment showed that local structural preferences specific to the WT sequence and the nonspecific compaction force together drive the protein into its native fold with reasonably high probability for small proteins. In other words, given the fragment structure ensemble that reflects local structure preferences of the target sequence, compact structures that are fully compatible with these fragments are few, one of which is the native fold. We also performed the same experiments for some mediumto-large size proteins, finding that the performance of chimera experiments, on average, becomes worse as the size increases (data not shown). Thus, the conclusion above is limited to small proteins. Nature of Short Fragments Taken from Proteins. Because the above results highlight the importance of local structural bias, we here briefly discuss physical properties of short fragments taken from proteins. First, a class of peptides excised from proteins can fold to the native structure by themselves, independently from other parts of the proteins (20 23). For these peptides, local interactions are so strong and specific that the peptide conformation is constrained into basically one structure. Another class of short sequences, often called chameleon sequences, is known to fold to entirely different conformations when they are embedded in different sequence contexts, demonstrating possible multiplicity of local structure bias (24). Some other peptides, once cleaved, do not possess any well defined structure (25). For these peptides, some local structures are prohibited by, for example, steric clash between side-chain atoms or by proximity of two atoms with the same charge. Thus, local bias does not pin down their conformations but reduces available conformational space. From these observations, it is reasonable to consider that short peptides excised from proteins have local biases and restriction in conformational space that are strongly sequence dependent. One of the biased conformations corresponds to the native. All atom MD simulations for short peptides extracted from proteins support this idea (26, 27). In the FA method, it is assumed that such biased conformations of short segments can be approximated by fragment conformations retrieved from the structure database by sequence-comparison methods. Impact of Weak Local Constraint: Lattice HP Model. Then, how and to what extent do local structural biases alter the global shape of the energy landscape? For addressing this question with least ambiguity, we conducted an exact calculation using a simple two-dimensional lattice (2D) HP model of proteins (28), to which we weakly added local structural biases, mimicking characteristics of short peptides of real proteins. In the lattice HP model, a protein is represented as a selfavoiding chain on a 2D square lattice with two types of amino acids: H (hydrophobic) and P (polar). The energy, E, of a chain conformation is determined only by the number of H H contacts, h, ase h, where is a positive constant (used as the energy unit). With this simplicity, complete enumeration in structural and sequence space is feasible for short chains, and thus the sequence structure relationship of this model has been quite thoroughly uncovered (28, 29). Here, we randomly choose a sequence that has a unique ground state-structure with the chain length n 16. The sequence chosen here is HHHPHH- PHHHHPHHP. By exact enumeration of all possible conformations, we easily BIOPHYSICS Chikenji et al. PNAS February 28, 2006 vol. 103 no

4 Fig. 5. Schematic energy landscapes of the pure HP model (a) and the HP model with local structure constraints (b). Fig. 4. The HP model studied here. (a) Native structure of the HP model. (b) Example of a maximally compact structure of the HP model. (c) All four-residue segments of the HP sequence and all possible structures. Structures surrounded by green rectangles are prohibited by the local structure constraint we impose here. (d) All structures of the first excited states (E 8) of the HP model. Structures are classified by the number of native contacts Q. Only two gray shaded structures are not prohibited by imposing the local conformational constraints. find that this sequence has a unique ground state with energy E 9 at the structure depicted in Fig. 4a and 20 misfolded structures in the first excited states (E 8). All of the latter structures are shown in Fig. 4d, as classified by the nativeness score, the Q score, which is used as a measure of structural similarity to the native and is defined by the number of nonbonded contacts common to those at the native structure (Q in the native state of this model is 9). The presence of many low-energy misfolded structures with low Q scores indicates that the energy landscape of this sequence is highly rugged (see Fig. 5a). It has been known in general that the HP model has a highly rugged energy landscape, which is non-protein-like (30). The HP model simply captures the hydrophobic force, which is widely believed to be the driving force for the protein to fold (31). As is clear from argument above, the key feature missing in this HP model is biases in local conformation. Thus, we introduce sequence-dependent weak biases in local structures of the HP model and consider the effect of them on the global shape of the energy landscape. Local bias of the HP model were investigated before (32, 33), where biases were sequence independent. Based on the argument in the previous section, the local structural bias we introduce should be weak and sequence dependent. Namely, the bias should not decisively pin down local structure of a segment, but should modestly reduce diversity of local structures. Physically, given a sequence of a short segment, some local structures are prohibited by, for example, steric clash. Obviously, these prohibited fragment conformations should not contain those that appear in the native structure. In the HP lattice model, we here impose structure biases on a four-residue fragment. There are six types of four-residue sequences in the sequence we use here, which are HHHH, PHHH, HPHH, HHPH, HHHP, and PHHP. The number of possible conformations of four residues in two dimensions is five, as listed in Fig. 4c. For each four-residue sequence, one of the five structures is chosen as a prohibited conformation (marked by green rectangles in Fig. 4c), where the choice is at random from all structures that do not appear in the native structure. These structures should be considered as conformations with high free energy. We first investigate how the first-excited structures (E 8) are affected by the introduced constraint in local structures, finding that 18 of 20 first-excited structures contain at least one prohibited local structure, and thus they are eliminated. (See Fig. 4d where all nonshaded structures contain prohibited local structures.) In particular, structures with small Q score, i.e., highly dissimilar to the native, are completely eliminated by the weak local constraint, where the Q score is the number of native contacts. Only 2 of the 20 structures remain, and they have relatively high Q score of 7 (shaded ones in Fig. 4d; Q score at the native is 9), which means that these first-excited structures are near native. This result suggests that the energy landscape becomes more funnel-like by adding a weak constraint in local structure (see Fig. 5). Next, we see how characteristics of global conformational space are altered by the weak constraint in local structure. Fig. 6a plots numbers of conformations (in logarithmic scale) as a function of total number of nonbonded contacts C, which reflects compactness, where data for the pure HP model without local constraint and the locally constrained HP model are shown in white and black bars, respectively. For all C, the number of conformations is significantly reduced by adding a local constraint. Also important is that the number of conformations decreases with C, indicating that structures with higher compactness are fewer. Led by these two tendencies, the striking result is that the number of maximally compact structures (C 9) is 69 without the local constraint, but only 2 of 69 remain after imposing the local constraint. One of the two remaining structures is of course the native structure, by design. The other structure (shown in Fig. 4b) has the energy E 5 and Q 1 and thus is highly excited. Suppose we try to predict the ground state of this locally cgi doi pnas Chikenji et al.

5 Fig. 6. The effect of the local structure constraint on the conformational space of lattice protein models. (a) The number of conformations (in logarithmic scale) as a function of total number of contacts C. The white bar indicates log (C) of the pure HP model. The dotted bar indicates log (C) of the HP model with local constraints. (b) Temperature dependence of the number of nonnative contacts. Dotted, solid, and broken lines correspond to the pure HP model, the locally constrained HP model, and Go model, respectively. constrained HP model. Assuming that the ground state is a maximally compact structure, we generate maximally compact structures that satisfy the local conformational rule. There are only two of these structures, and one of them is the native. Thus, we can predict the native structure with probability 0.5 regardless of the energy function. Because two maximally compact structures have the energies of 9 and 5, an energy function that has typical error of 2 energy units is enough to identify the ground state. This finding is perfectly parallel to the fact that, in the chimera experiments, just generating compact structures using fragment libraries produces the native fold with high probability. What Makes the Protein Energy Landscape Funnel-Like? What do these results tell us about protein folding? Recently it was established that the transition state ensemble of folding pathways can often be well described by the so-called Go -like models (34 41), which correspond to the perfect-funnel. The basic assumption in the model is that only native contacts are favorable, and nonnative contacts are neutral or repulsive (42). These studies suggested that the energy landscape of proteins is funnel-like. It is unclear, however, how the smooth and funnelshaped energy landscape is organized from numerous atomic interactions. Important and interesting questions related to this point are what is the physicochemical reason of the funnelshaped energy landscape and why nonnative interactions play only minor roles. In the previous section, we showed that constraints in local structure prohibit many stable misfolded structures, resulting in a funnel-like energy landscape. To see this effect more quantitatively under thermodynamic conditions, we calculated the temperature dependence of the average number of nonnative contacts for the above lattice HP model for the pure and locally constrained HP models (Fig. 6b). For comparison, we also show results for a Go model, in which the native structure is set as the same as that of the HP model (Fig. 4a). As shown in Fig. 6b, there is an apparent peak in the number of nonnative contacts curve for the pure HP model, indicating that there is a nonnative intermediate state in the thermodynamic transition. Thus, the pure HP model does not show a two-state transition. In contrast, the thermodynamic behavior of the number of nonnative contacts of the locally constrained HP model shows a sigmoidal transition, which is quite similar to that of the Go model. Indeed nonnative interactions in the locally constrained HP model are significantly reduced, relative to the pure HP model. Namely, although nonnative interactions exist in the locally constrained HP model, they do not contribute very much, suggesting local structural biases make the energy landscape more funnel-like (see Fig. 5b). We note that the current studies do not suggest that the nonlocal interactions are unimportant. Hydrophobicity and van der Waals packing, which are predominantly nonlocal, play major roles in the thermodynamic stability of proteins. The present work suggests the primary code that specifies the native fold, but not the stability, is in the local structure preferences. Implication to Computational Protein Design. The perspective that there are only limited numbers of compact structures fully compatible with local constraints may have implications to protein design as well. Recently, considerable progress has been made in de novo design of foldable proteins (43 45), where the primary strategy is to seek the sequence that optimizes sidechain interactions at a target structure. From the perspective of folding theory, however, it has been anticipated that, instead of optimizing the interactions at the target structure, the (normalized) gap between target energy and the average energy of the denatured ensemble should be optimized. Lattice model simulations support this theory (46). However, using this idea directly for designing real protein sequences is limited (47). More importantly, major successes of protein design so far come from optimization of the interaction energy only at the target (43 45). How can the practice and the theory be reconciled? Obviously by now, the perspective that there are only limited numbers of compact structures fully compatible with local constraints indeed fills the gap: If this is the case and if the target structure is one of the possible compact structures, destabilizing alternative conformations may not be necessary. Discussion on Denatured States of Proteins. The finding that local structural preference and compaction forces together lead the protein into the native fold may have some links to characteristics of denatured states of proteins, which recently attract considerable attention (48 54). In simulations of a -hairpin confined in a pore, Thirumalai and coworkers (52) found that, under highly denatured conditions, the confined -hairpin possesses partially native-like order. This result is in contrast to the denatured ensemble of a bulk -hairpin, which is extended and has little native-like order. Thus, compaction force alone could induce native-like order as in ref. 55. The present work suggests that this hypothesis is quite probable by the help of local structural preferences. Conclusions This study addressed what microscopic interactions make the protein energy landscape funnel-like, motivated the by recent success of the FA method in structure prediction. We reached the following principles of protein folding: (i) There are only limited numbers of compact conformations that are fully compatible with local structural preferences, and the native fold is one of them, and thus (ii) local structure preferences strongly shape up the protein folding funnel. Materials and Methods SIMFOLD energy employs a coarse-grained protein representation. It includes explicit backbone structure and a simplified side chain, which is represented by the centroid of side-chain heavy atoms. The SIMFOLD energy function consists of various interactions: hydrogen bonding, hydrophobic interaction, van der Waals interaction, and so on. Their functional forms are well based on physicochemical considerations, whereas parameters are optimized based on the energy landscape theory and structural database. The explicit expression is described in ref. 17. In the first stage of structure prediction, we prepared fragment candidates for every segment of a target sequence, as usual, using profile profile comparison (56) against a nonredundant set of the structural database. Sequences homologous to the target BIOPHYSICS Chikenji et al. PNAS February 28, 2006 vol. 103 no

6 protein [PSI-BLAST (57) E value 0.01] were rigorously excluded from the fragment libraries, because the focus here is not on homology modeling. We prepared 20 fragment candidates for every eight-residue-length segment of the target protein. Prepared fragment structures would reflect the structural preference of the corresponding segments of the target protein. Second, tertiary structures were generated by assembling the fragment candidates prepared above. Monte Carlo simulation was carried out for finding low-energy conformations with the newly developed reversible fragment replacement algorithm (G.C., Y.F., and S.T., unpublished data). This algorithm has some attractive characteristics over conventional fragment insertion. First, in contrast to conventional algorithms, the current one satisfies the detailed balance condition, thus enabling us to estimate thermodynamic quantities and use modern sampling techniques in computational physics. Here, we use one of these techniques, the multicanonical ensemble method that realizes uniform sampling in energy space, avoiding too-long trapping by misfolded structures (18). Second, in conventional fragment insertion algorithms, insertion of a fragment overwrites part of old fragments and thus short portions of fragments (for example, one-residue-length remains of fragment candidates) remain in any stage of the simulation. Frequent use of short remains of fragments diminishes the advantage of the FA method, that is, correlated, angle information in consecutive residues. The new algorithm rigorously prohibits use of remains of less than four residues, and thus any structure generated is made of portions of fragments with the length between four and eight. Finally, we performed cluster analysis of the resulting lowenergy structures using the pairwise C rmsd as a measure of dissimilarity. Cluster centers of largest cluster sizes were selected as prediction models (58). The cluster cutoff was set to 6 Å here. We thank Satoshi Takahashi and Sotaro Fuchigami for fruitful discussions. We also thank David Leitner for useful discussions and careful reading of the manuscript. G.C. and Y.F. were supported by the Japan Society for the Promotion of Science Research Fellowship for Young Scientists. This work was supported by Grant-in-Aid for scientific research on priority areas Water and Biomolecules from the Ministry of Education, Science, Sports, and Culture of Japan. 1. Fersht, A. R. (1999) Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding (Freeman, New York). 2. Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. (1997) J. Mol. Biol. 268, Jones, T. A. & Thirup, S. (1986) EMBO J. 5, Levitt, M. (1992) J. Mol. Biol. 226, Bowie, J. U. & Eisenberg, D. (1994) Proc. Natl. Acad. Sci. USA 91, Jones, D. T. (1997) Proteins 29, Bradley, P., Chivian, D., Meiler, J., Misura, K. M. S., Rohl, C. A., Schief, W. R., Wedemeyer, W. J., Schueler-Furman, O., Murphy, P., Strauss, J., et al. (2003) Proteins 53, Jones, D. T. & McGuffin, L. J. (2003) Proteins 53, Fang, Q. & Shortle, D. (2003) Proteins 53, Karplus, K., Karchin, R., Draper, J., Casper, J., Mandel-Gutfreund, Y., Diekhans, M. & Hughey, R. (2003) Proteins 53, Shao, Y. & Bystroff, C. (2003) Proteins 53, Bonneau, R. & Baker, D. (2001) Annu. Rev. Biophys. Biomol. Struct. 30, Rohl, C., C. E. M. Strauss, K. M. S. Misura & Baker, D. (2004) Methods Enzymol. 383, Bryngelson, J. D., Onuchic, J. N., Socci, N. D. & Wolynes, P. G. (1995) Proteins 21, Onuchic, J. N. & Wolynes, P. G. (2004) Curr. Opin. Struct. Biol. 14, Takada, S., Luthey-Schulten, Z. A. & Wolynes, P. G. (1999) J. Chem. Phys. 110, Fujitsuka, Y., Takada, S., Luthey-Schulten, Z. A. & Wolynes, P. G. (2004) Proteins 54, Chikenji, G., Fujitsuka, Y. & Takada, S. (2003) J. Chem. Phys. 119, Chikenji, G., Fujitsuka, Y. & Takada, S. (2004) Chem. Phys. 307, Blanco, F. J., Rivas, G. & Serrano, L. (1994) Nat. Struct. Biol. 1, Yi, Q., Bystroff, C., Rajagopal, P., Klevit, R. E. & Baker, D. (1998) Protein Sci. 283, Rotondi, K. S. & Gierasch, L. M. (2003) Biochemistry 42, Honda, S., Yamasaki, K., Sawada, Y. & Morii, H. (2004) Structure (London) 12, Zhou, X., Aleber, F., Folkers, G., Gonnet, G. H. & Chelvanayagam, G. (2000) Proteins 41, Wang, Y. & Shortle, D. (1997) Folding Des. 2, Nakajima, N., Higo, J., Kidera, A. & Nakamura, H. (2000) J. Mol. Biol. 296, Higo, J., Ito, N., Kuroda, M., Ono, S., Nakajima, N. & Nakamura, H. (2001) Protein Sci. 10, Lau, K. F. & Dill, K. A. (1989) Macromolecules 22, Li, H., Helling, R., Tang, C. & Wingreen, N. (1996) Science 273, Wolynes, P. G. (1997) Nat. Struct. Biol. 4, Dill, K. A. (1990) Biochemistry 29, Thomas, P. D. & Dill, K. A. (1993) Protein Sci. 2, Irbäck, A. & Sandelin, E. (1998) J. Chem. Phys. 108, Shoemaker, B. A., Wang, J. & Wolynes, P. G. (1997) Proc. Natl. Acad. Sci. USA 94, Portman, J. J., Takada, S. & Wolynes, P. G. (1998) Phys. Rev. Lett. 81, Galzitskaya, O. V. & Finkelstein, A. V. (1999) Proc. Natl. Acad. Sci. USA 96, Alm, E. & Baker, D. (1999) Proc. Natl. Acad. Sci. USA 96, Munoz, V. & Eaton, W. A. (1999) Proc. Natl. Acad. Sci. USA 96, Clementi, C., Nymeyer, H. & Onuchic, J. N. (2000) J. Mol. Biol. 298, Koga, N. & Takada, S. (2001) J. Mol. Biol. 313, Karaniclas, J. & Brooks, C. L., III (2003) J. Mol. Biol. 334, Go, N. (1983) Annu. Rev. Biophys. Bioeng. 12, Dahiyat, B. I. & Mayo, S. L. (1997) Science 278, Walsh, S. T. R., Cheng, H., Bryson, J. W., Roder, H. & DeGrado, W. F. (1999) Proc. Natl. Acad. Sci. USA 96, Kuhlman, B., Dantas, G., Ireton, G. C., Varani, G., Stoddard, B. L. & Baker, D. (2003) Science 302, Shakhnovich, E. I. (1998) Folding Des. 3, R45 R Jin, W., Kambara, O., Sasakawa, H., Tamura, A. & Takada, S. (2003) Structure (London) 11, Smith, L. J., Fiebig, K. M., Schwalbe H. & Dobson, C. M. (1996) Folding Des. 1, R95 R Hammarström, P. & Carlsson, U. (2000) Biochem. Biophys. Res. Commun. 276, Shortle, D. & Ackerman, M. S. (2001) Science 293, Mohana-Borges, R., Goto, N. K., Kroon, G. J., Dyson, H. J. & Wright, P. E. (2004) J. Mol. Biol. 304, Klimov, D. K., Newfield, D. & Thirumalai, D. (2002) Proc. Natl. Acad. Sci. USA 99, Zagrovic, B. & Pande, V. S. (2003) Nat. Struct. Biol. 10, Jha, A. K., Colubri, A., Freed, K. F. & Sosnick, T. R. (2005) Proc. Natl. Acad. Sci. USA 102, Chan, H. S. & Dill, K. A. (1990) Proc. Natl. Acad. Sci. USA 87, Pietrokovski, S. (1996) Nucleic Acids Res. 24, Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997) Nucleic Acids Res. 25, Shortle, D., Simons, K. T. & Baker, D. (1998) Proc. Natl. Acad. Sci. USA 95, Kraulis, P. (1991) J. Appl. Crystallogr. 24, cgi doi pnas Chikenji et al.

Supporting Online Material for

Supporting Online Material for www.sciencemag.org/cgi/content/full/309/5742/1868/dc1 Supporting Online Material for Toward High-Resolution de Novo Structure Prediction for Small Proteins Philip Bradley, Kira M. S. Misura, David Baker*

More information

Folding of small proteins using a single continuous potential

Folding of small proteins using a single continuous potential JOURNAL OF CHEMICAL PHYSICS VOLUME 120, NUMBER 17 1 MAY 2004 Folding of small proteins using a single continuous potential Seung-Yeon Kim School of Computational Sciences, Korea Institute for Advanced

More information

arxiv:cond-mat/ v1 [cond-mat.soft] 19 Mar 2001

arxiv:cond-mat/ v1 [cond-mat.soft] 19 Mar 2001 Modeling two-state cooperativity in protein folding Ke Fan, Jun Wang, and Wei Wang arxiv:cond-mat/0103385v1 [cond-mat.soft] 19 Mar 2001 National Laboratory of Solid State Microstructure and Department

More information

Simulating Folding of Helical Proteins with Coarse Grained Models

Simulating Folding of Helical Proteins with Coarse Grained Models 366 Progress of Theoretical Physics Supplement No. 138, 2000 Simulating Folding of Helical Proteins with Coarse Grained Models Shoji Takada Department of Chemistry, Kobe University, Kobe 657-8501, Japan

More information

Clustering of low-energy conformations near the native structures of small proteins

Clustering of low-energy conformations near the native structures of small proteins Proc. Natl. Acad. Sci. USA Vol. 95, pp. 11158 11162, September 1998 Biophysics Clustering of low-energy conformations near the native structures of small proteins DAVID SHORTLE*, KIM T. SIMONS, AND DAVID

More information

The sequences of naturally occurring proteins are defined by

The sequences of naturally occurring proteins are defined by Protein topology and stability define the space of allowed sequences Patrice Koehl* and Michael Levitt Department of Structural Biology, Fairchild Building, D109, Stanford University, Stanford, CA 94305

More information

arxiv: v1 [cond-mat.soft] 22 Oct 2007

arxiv: v1 [cond-mat.soft] 22 Oct 2007 Conformational Transitions of Heteropolymers arxiv:0710.4095v1 [cond-mat.soft] 22 Oct 2007 Michael Bachmann and Wolfhard Janke Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11,

More information

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy Design of a Novel Globular Protein Fold with Atomic-Level Accuracy Brian Kuhlman, Gautam Dantas, Gregory C. Ireton, Gabriele Varani, Barry L. Stoddard, David Baker Presented by Kate Stafford 4 May 05 Protein

More information

Many proteins spontaneously refold into native form in vitro with high fidelity and high speed.

Many proteins spontaneously refold into native form in vitro with high fidelity and high speed. Macromolecular Processes 20. Protein Folding Composed of 50 500 amino acids linked in 1D sequence by the polypeptide backbone The amino acid physical and chemical properties of the 20 amino acids dictate

More information

Outline. The ensemble folding kinetics of protein G from an all-atom Monte Carlo simulation. Unfolded Folded. What is protein folding?

Outline. The ensemble folding kinetics of protein G from an all-atom Monte Carlo simulation. Unfolded Folded. What is protein folding? The ensemble folding kinetics of protein G from an all-atom Monte Carlo simulation By Jun Shimada and Eugine Shaknovich Bill Hawse Dr. Bahar Elisa Sandvik and Mehrdad Safavian Outline Background on protein

More information

Local Interactions Dominate Folding in a Simple Protein Model

Local Interactions Dominate Folding in a Simple Protein Model J. Mol. Biol. (1996) 259, 988 994 Local Interactions Dominate Folding in a Simple Protein Model Ron Unger 1,2 * and John Moult 2 1 Department of Life Sciences Bar-Ilan University Ramat-Gan, 52900, Israel

More information

Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures

Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures Proc. Natl. Acad. Sci. USA Vol. 96, pp. 11305 11310, September 1999 Biophysics Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures E. ALM AND D. BAKER* Department

More information

It is now well established that proteins are minimally frustrated

It is now well established that proteins are minimally frustrated Domain swapping is a consequence of minimal frustration Sichun Yang*, Samuel S. Cho*, Yaakov Levy*, Margaret S. Cheung*, Herbert Levine*, Peter G. Wolynes*, and José N. Onuchic* *Center for Theoretical

More information

Protein Structure Prediction, Engineering & Design CHEM 430

Protein Structure Prediction, Engineering & Design CHEM 430 Protein Structure Prediction, Engineering & Design CHEM 430 Eero Saarinen The free energy surface of a protein Protein Structure Prediction & Design Full Protein Structure from Sequence - High Alignment

More information

Master equation approach to finding the rate-limiting steps in biopolymer folding

Master equation approach to finding the rate-limiting steps in biopolymer folding JOURNAL OF CHEMICAL PHYSICS VOLUME 118, NUMBER 7 15 FEBRUARY 2003 Master equation approach to finding the rate-limiting steps in biopolymer folding Wenbing Zhang and Shi-Jie Chen a) Department of Physics

More information

arxiv:cond-mat/ v1 2 Feb 94

arxiv:cond-mat/ v1 2 Feb 94 cond-mat/9402010 Properties and Origins of Protein Secondary Structure Nicholas D. Socci (1), William S. Bialek (2), and José Nelson Onuchic (1) (1) Department of Physics, University of California at San

More information

Protein Structure Prediction

Protein Structure Prediction Page 1 Protein Structure Prediction Russ B. Altman BMI 214 CS 274 Protein Folding is different from structure prediction --Folding is concerned with the process of taking the 3D shape, usually based on

More information

Generalized ensemble methods for de novo structure prediction. 1 To whom correspondence may be addressed.

Generalized ensemble methods for de novo structure prediction. 1 To whom correspondence may be addressed. Generalized ensemble methods for de novo structure prediction Alena Shmygelska 1 and Michael Levitt 1 Department of Structural Biology, Stanford University, Stanford, CA 94305-5126 Contributed by Michael

More information

It is not yet possible to simulate the formation of proteins

It is not yet possible to simulate the formation of proteins Three-helix-bundle protein in a Ramachandran model Anders Irbäck*, Fredrik Sjunnesson, and Stefan Wallin Complex Systems Division, Department of Theoretical Physics, Lund University, Sölvegatan 14A, S-223

More information

arxiv:cond-mat/ v1 [cond-mat.soft] 23 Mar 2007

arxiv:cond-mat/ v1 [cond-mat.soft] 23 Mar 2007 The structure of the free energy surface of coarse-grained off-lattice protein models arxiv:cond-mat/0703606v1 [cond-mat.soft] 23 Mar 2007 Ethem Aktürk and Handan Arkin Hacettepe University, Department

More information

The kinetics of protein folding is often remarkably simple. For

The kinetics of protein folding is often remarkably simple. For Fast protein folding kinetics Jack Schonbrun* and Ken A. Dill *Graduate Group in Biophysics and Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94118 Communicated by

More information

Simulating disorder order transitions in molecular recognition of unstructured proteins: Where folding meets binding

Simulating disorder order transitions in molecular recognition of unstructured proteins: Where folding meets binding Simulating disorder order transitions in molecular recognition of unstructured proteins: Where folding meets binding Gennady M. Verkhivker*, Djamal Bouzida, Daniel K. Gehlhaar, Paul A. Rejto, Stephan T.

More information

Presenter: She Zhang

Presenter: She Zhang Presenter: She Zhang Introduction Dr. David Baker Introduction Why design proteins de novo? It is not clear how non-covalent interactions favor one specific native structure over many other non-native

More information

arxiv:cond-mat/ v1 [cond-mat.soft] 5 May 1998

arxiv:cond-mat/ v1 [cond-mat.soft] 5 May 1998 Linking Rates of Folding in Lattice Models of Proteins with Underlying Thermodynamic Characteristics arxiv:cond-mat/9805061v1 [cond-mat.soft] 5 May 1998 D.K.Klimov and D.Thirumalai Institute for Physical

More information

CMPS 3110: Bioinformatics. Tertiary Structure Prediction

CMPS 3110: Bioinformatics. Tertiary Structure Prediction CMPS 3110: Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the laws of physics! Conformation space is finite

More information

Effect of Sequences on the Shape of Protein Energy Landscapes Yue Li Department of Computer Science Florida State University Tallahassee, FL 32306

Effect of Sequences on the Shape of Protein Energy Landscapes Yue Li Department of Computer Science Florida State University Tallahassee, FL 32306 Effect of Sequences on the Shape of Protein Energy Landscapes Yue Li Department of Computer Science Florida State University Tallahassee, FL 32306 yli@cs.fsu.edu Gary Tyson Department of Computer Science

More information

Improved Beta-Protein Structure Prediction by Multilevel Optimization of NonLocal Strand Pairings and Local Backbone Conformation

Improved Beta-Protein Structure Prediction by Multilevel Optimization of NonLocal Strand Pairings and Local Backbone Conformation 65:922 929 (2006) Improved Beta-Protein Structure Prediction by Multilevel Optimization of NonLocal Strand Pairings and Local Backbone Conformation Philip Bradley and David Baker* University of Washington,

More information

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction CMPS 6630: Introduction to Computational Biology and Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the

More information

Elucidation of the RNA-folding mechanism at the level of both

Elucidation of the RNA-folding mechanism at the level of both RNA hairpin-folding kinetics Wenbing Zhang and Shi-Jie Chen* Department of Physics and Astronomy and Department of Biochemistry, University of Missouri, Columbia, MO 65211 Edited by Peter G. Wolynes, University

More information

Protein Folding Prof. Eugene Shakhnovich

Protein Folding Prof. Eugene Shakhnovich Protein Folding Eugene Shakhnovich Department of Chemistry and Chemical Biology Harvard University 1 Proteins are folded on various scales As of now we know hundreds of thousands of sequences (Swissprot)

More information

How protein thermodynamics and folding mechanisms are altered by the chaperonin cage: Molecular simulations

How protein thermodynamics and folding mechanisms are altered by the chaperonin cage: Molecular simulations How protein thermodynamics and folding mechanisms are altered by the chaperonin cage: Molecular simulations Fumiko Takagi*, Nobuyasu Koga, and Shoji Takada* *PRESTO, Japan Science and Technology Corporation,

More information

The role of secondary structure in protein structure selection

The role of secondary structure in protein structure selection Eur. Phys. J. E 32, 103 107 (2010) DOI 10.1140/epje/i2010-10591-5 Regular Article THE EUROPEAN PHYSICAL JOURNAL E The role of secondary structure in protein structure selection Yong-Yun Ji 1,a and You-Quan

More information

arxiv:cond-mat/ v1 [cond-mat.soft] 16 Nov 2002

arxiv:cond-mat/ v1 [cond-mat.soft] 16 Nov 2002 Dependence of folding rates on protein length Mai Suan Li 1, D. K. Klimov 2 and D. Thirumalai 2 1 Institute of Physics, Polish Academy of Sciences, Al. Lotnikow 32/46, 02-668 Warsaw, Poland 2 Institute

More information

Simulation of mutation: Influence of a side group on global minimum structure and dynamics of a protein model

Simulation of mutation: Influence of a side group on global minimum structure and dynamics of a protein model JOURNAL OF CHEMICAL PHYSICS VOLUME 111, NUMBER 8 22 AUGUST 1999 Simulation of mutation: Influence of a side group on global minimum structure and dynamics of a protein model Benjamin Vekhter and R. Stephen

More information

Effects of Crowding and Confinement on the Structures of the Transition State Ensemble in Proteins

Effects of Crowding and Confinement on the Structures of the Transition State Ensemble in Proteins 8250 J. Phys. Chem. B 2007, 111, 8250-8257 Effects of Crowding and Confinement on the Structures of the Transition State Ensemble in Proteins Margaret S. Cheung and D. Thirumalai*,, Department of Physics,

More information

Protein quality assessment

Protein quality assessment Protein quality assessment Speaker: Renzhi Cao Advisor: Dr. Jianlin Cheng Major: Computer Science May 17 th, 2013 1 Outline Introduction Paper1 Paper2 Paper3 Discussion and research plan Acknowledgement

More information

Identifying the Protein Folding Nucleus Using Molecular Dynamics

Identifying the Protein Folding Nucleus Using Molecular Dynamics doi:10.1006/jmbi.1999.3534 available online at http://www.idealibrary.com on J. Mol. Biol. (2000) 296, 1183±1188 COMMUNICATION Identifying the Protein Folding Nucleus Using Molecular Dynamics Nikolay V.

More information

Computer simulation of polypeptides in a confinement

Computer simulation of polypeptides in a confinement J Mol Model (27) 13:327 333 DOI 1.17/s894-6-147-6 ORIGINAL PAPER Computer simulation of polypeptides in a confinement Andrzej Sikorski & Piotr Romiszowski Received: 3 November 25 / Accepted: 27 June 26

More information

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Introduction to Comparative Protein Modeling. Chapter 4 Part I Introduction to Comparative Protein Modeling Chapter 4 Part I 1 Information on Proteins Each modeling study depends on the quality of the known experimental data. Basis of the model Search in the literature

More information

To understand pathways of protein folding, experimentalists

To understand pathways of protein folding, experimentalists Transition-state structure as a unifying basis in protein-folding mechanisms: Contact order, chain topology, stability, and the extended nucleus mechanism Alan R. Fersht* Cambridge University Chemical

More information

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror Protein structure prediction CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror 1 Outline Why predict protein structure? Can we use (pure) physics-based methods? Knowledge-based methods Two major

More information

DATE A DAtabase of TIM Barrel Enzymes

DATE A DAtabase of TIM Barrel Enzymes DATE A DAtabase of TIM Barrel Enzymes 2 2.1 Introduction.. 2.2 Objective and salient features of the database 2.2.1 Choice of the dataset.. 2.3 Statistical information on the database.. 2.4 Features....

More information

Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins Scott R. McAllister Christodoulos A. Floudas Princeton University

Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins Scott R. McAllister Christodoulos A. Floudas Princeton University Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins Scott R. McAllister Christodoulos A. Floudas Princeton University Department of Chemical Engineering Program of Applied and

More information

Template-Based Modeling of Protein Structure

Template-Based Modeling of Protein Structure Template-Based Modeling of Protein Structure David Constant Biochemistry 218 December 11, 2011 Introduction. Much can be learned about the biology of a protein from its structure. Simply put, structure

More information

Protein Folding In Vitro*

Protein Folding In Vitro* Protein Folding In Vitro* Biochemistry 412 February 29, 2008 [*Note: includes computational (in silico) studies] Fersht & Daggett (2002) Cell 108, 573. Some folding-related facts about proteins: Many small,

More information

Solvation in protein folding analysis: Combination of theoretical and experimental approaches

Solvation in protein folding analysis: Combination of theoretical and experimental approaches Solvation in protein folding analysis: Combination of theoretical and experimental approaches A. M. Fernández-Escamilla*, M. S. Cheung, M. C. Vega, M. Wilmanns, J. N. Onuchic, and L. Serrano* *European

More information

The typical end scenario for those who try to predict protein

The typical end scenario for those who try to predict protein A method for evaluating the structural quality of protein models by using higher-order pairs scoring Gregory E. Sims and Sung-Hou Kim Berkeley Structural Genomics Center, Lawrence Berkeley National Laboratory,

More information

Abstract. Introduction

Abstract. Introduction In silico protein design: the implementation of Dead-End Elimination algorithm CS 273 Spring 2005: Project Report Tyrone Anderson 2, Yu Bai1 3, and Caroline E. Moore-Kochlacs 2 1 Biophysics program, 2

More information

All-atom ab initio folding of a diverse set of proteins

All-atom ab initio folding of a diverse set of proteins All-atom ab initio folding of a diverse set of proteins Jae Shick Yang 1, William W. Chen 2,1, Jeffrey Skolnick 3, and Eugene I. Shakhnovich 1, * 1 Department of Chemistry and Chemical Biology 2 Department

More information

Ab initio protein structure prediction Corey Hardin*, Taras V Pogorelov and Zaida Luthey-Schulten*

Ab initio protein structure prediction Corey Hardin*, Taras V Pogorelov and Zaida Luthey-Schulten* 176 Ab initio protein structure prediction Corey Hardin*, Taras V Pogorelov and Zaida Luthey-Schulten* Steady progress has been made in the field of ab initio protein folding. A variety of methods now

More information

A simple model for calculating the kinetics of protein folding from three-dimensional structures

A simple model for calculating the kinetics of protein folding from three-dimensional structures Proc. Natl. Acad. Sci. USA Vol. 96, pp. 11311 11316, September 1999 Biophysics, Chemistry A simple model for calculating the kinetics of protein folding from three-dimensional structures VICTOR MUÑOZ*

More information

A Path Planning-Based Study of Protein Folding with a Case Study of Hairpin Formation in Protein G and L

A Path Planning-Based Study of Protein Folding with a Case Study of Hairpin Formation in Protein G and L A Path Planning-Based Study of Protein Folding with a Case Study of Hairpin Formation in Protein G and L G. Song, S. Thomas, K.A. Dill, J.M. Scholtz, N.M. Amato Pacific Symposium on Biocomputing 8:240-251(2003)

More information

Visualizing folding of proteins (1 3) and RNA (2) in terms of

Visualizing folding of proteins (1 3) and RNA (2) in terms of Can energy landscape roughness of proteins and RNA be measured by using mechanical unfolding experiments? Changbong Hyeon and D. Thirumalai Chemical Physics Program, Institute for Physical Science and

More information

DesignabilityofProteinStructures:ALattice-ModelStudy usingthemiyazawa-jerniganmatrix

DesignabilityofProteinStructures:ALattice-ModelStudy usingthemiyazawa-jerniganmatrix PROTEINS: Structure, Function, and Genetics 49:403 412 (2002) DesignabilityofProteinStructures:ALattice-ModelStudy usingthemiyazawa-jerniganmatrix HaoLi,ChaoTang,* andneds.wingreen NECResearchInstitute,Princeton,NewJersey

More information

Effective stochastic dynamics on a protein folding energy landscape

Effective stochastic dynamics on a protein folding energy landscape THE JOURNAL OF CHEMICAL PHYSICS 125, 054910 2006 Effective stochastic dynamics on a protein folding energy landscape Sichun Yang, a José N. Onuchic, b and Herbert Levine c Center for Theoretical Biological

More information

Temperature dependence of reactions with multiple pathways

Temperature dependence of reactions with multiple pathways PCCP Temperature dependence of reactions with multiple pathways Muhammad H. Zaman, ac Tobin R. Sosnick bc and R. Stephen Berry* ad a Department of Chemistry, The University of Chicago, Chicago, IL 60637,

More information

Hidden symmetries in primary sequences of small α proteins

Hidden symmetries in primary sequences of small α proteins Hidden symmetries in primary sequences of small α proteins Ruizhen Xu, Yanzhao Huang, Mingfen Li, Hanlin Chen, and Yi Xiao * Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University

More information

PROTEIN FOLDING THEORY: From Lattice to All-Atom Models

PROTEIN FOLDING THEORY: From Lattice to All-Atom Models Annu. Rev. Biophys. Biomol. Struct. 2001. 30:361 96 Copyright c 2001 by Annual Reviews. All rights reserved PROTEIN FOLDING THEORY: From Lattice to All-Atom Models Leonid Mirny and Eugene Shakhnovich Department

More information

Why Do Protein Structures Recur?

Why Do Protein Structures Recur? Why Do Protein Structures Recur? Dartmouth Computer Science Technical Report TR2015-775 Rebecca Leong, Gevorg Grigoryan, PhD May 28, 2015 Abstract Protein tertiary structures exhibit an observable degeneracy

More information

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror Protein structure prediction CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror 1 Outline Why predict protein structure? Can we use (pure) physics-based methods? Knowledge-based methods Two major

More information

Evolutionary design of energy functions for protein structure prediction

Evolutionary design of energy functions for protein structure prediction Evolutionary design of energy functions for protein structure prediction Natalio Krasnogor nxk@ cs. nott. ac. uk Paweł Widera, Jonathan Garibaldi 7th Annual HUMIES Awards 2010-07-09 Protein structure prediction

More information

Homology Modeling. Roberto Lins EPFL - summer semester 2005

Homology Modeling. Roberto Lins EPFL - summer semester 2005 Homology Modeling Roberto Lins EPFL - summer semester 2005 Disclaimer: course material is mainly taken from: P.E. Bourne & H Weissig, Structural Bioinformatics; C.A. Orengo, D.T. Jones & J.M. Thornton,

More information

CHRIS J. BOND*, KAM-BO WONG*, JANE CLARKE, ALAN R. FERSHT, AND VALERIE DAGGETT* METHODS

CHRIS J. BOND*, KAM-BO WONG*, JANE CLARKE, ALAN R. FERSHT, AND VALERIE DAGGETT* METHODS Proc. Natl. Acad. Sci. USA Vol. 94, pp. 13409 13413, December 1997 Biochemistry Characterization of residual structure in the thermally denatured state of barnase by simulation and experiment: Description

More information

Monte Carlo simulations of polyalanine using a reduced model and statistics-based interaction potentials

Monte Carlo simulations of polyalanine using a reduced model and statistics-based interaction potentials THE JOURNAL OF CHEMICAL PHYSICS 122, 024904 2005 Monte Carlo simulations of polyalanine using a reduced model and statistics-based interaction potentials Alan E. van Giessen and John E. Straub Department

More information

Building native protein conformation from highly approximate backbone torsion angles

Building native protein conformation from highly approximate backbone torsion angles Building native protein conformation from highly approximate backbone torsion angles Haipeng Gong, Patrick J. Fleming, and George D. Rose* T. C. Jenkins Department of Biophysics, The Johns Hopkins University,

More information

Protein Folding Pathways and Kinetics: Molecular Dynamics Simulations of -Strand Motifs

Protein Folding Pathways and Kinetics: Molecular Dynamics Simulations of -Strand Motifs Biophysical Journal Volume 83 August 2002 819 835 819 Protein Folding Pathways and Kinetics: Molecular Dynamics Simulations of -Strand Motifs Hyunbum Jang,* Carol K. Hall,* and Yaoqi Zhou *Department of

More information

Molecular dynamics simulations of anti-aggregation effect of ibuprofen. Wenling E. Chang, Takako Takeda, E. Prabhu Raman, and Dmitri Klimov

Molecular dynamics simulations of anti-aggregation effect of ibuprofen. Wenling E. Chang, Takako Takeda, E. Prabhu Raman, and Dmitri Klimov Biophysical Journal, Volume 98 Supporting Material Molecular dynamics simulations of anti-aggregation effect of ibuprofen Wenling E. Chang, Takako Takeda, E. Prabhu Raman, and Dmitri Klimov Supplemental

More information

Computer simulations of protein folding with a small number of distance restraints

Computer simulations of protein folding with a small number of distance restraints Vol. 49 No. 3/2002 683 692 QUARTERLY Computer simulations of protein folding with a small number of distance restraints Andrzej Sikorski 1, Andrzej Kolinski 1,2 and Jeffrey Skolnick 2 1 Department of Chemistry,

More information

Protein Structure Determination from Pseudocontact Shifts Using ROSETTA

Protein Structure Determination from Pseudocontact Shifts Using ROSETTA Supporting Information Protein Structure Determination from Pseudocontact Shifts Using ROSETTA Christophe Schmitz, Robert Vernon, Gottfried Otting, David Baker and Thomas Huber Table S0. Biological Magnetic

More information

Grouping of amino acids and recognition of protein structurally conserved regions by reduced alphabets of amino acids

Grouping of amino acids and recognition of protein structurally conserved regions by reduced alphabets of amino acids Science in China Series C: Life Sciences 2007 Science in China Press Springer-Verlag Grouping of amino acids and recognition of protein structurally conserved regions by reduced alphabets of amino acids

More information

Single-molecule spectroscopy has recently emerged as a powerful. Effects of surface tethering on protein folding mechanisms

Single-molecule spectroscopy has recently emerged as a powerful. Effects of surface tethering on protein folding mechanisms Effects of surface tethering on protein folding mechanisms Miriam Friedel*, Andrij Baumketner, and Joan-Emma Shea Departments of *Physics and Chemistry and Biochemistry, University of California, Santa

More information

Protein Folding. I. Characteristics of proteins. C α

Protein Folding. I. Characteristics of proteins. C α I. Characteristics of proteins Protein Folding 1. Proteins are one of the most important molecules of life. They perform numerous functions, from storing oxygen in tissues or transporting it in a blood

More information

Polypeptide Folding Using Monte Carlo Sampling, Concerted Rotation, and Continuum Solvation

Polypeptide Folding Using Monte Carlo Sampling, Concerted Rotation, and Continuum Solvation Polypeptide Folding Using Monte Carlo Sampling, Concerted Rotation, and Continuum Solvation Jakob P. Ulmschneider and William L. Jorgensen J.A.C.S. 2004, 126, 1849-1857 Presented by Laura L. Thomas and

More information

Universal correlation between energy gap and foldability for the random energy model and lattice proteins

Universal correlation between energy gap and foldability for the random energy model and lattice proteins JOURNAL OF CHEMICAL PHYSICS VOLUME 111, NUMBER 14 8 OCTOBER 1999 Universal correlation between energy gap and foldability for the random energy model and lattice proteins Nicolas E. G. Buchler Biophysics

More information

Short Announcements. 1 st Quiz today: 15 minutes. Homework 3: Due next Wednesday.

Short Announcements. 1 st Quiz today: 15 minutes. Homework 3: Due next Wednesday. Short Announcements 1 st Quiz today: 15 minutes Homework 3: Due next Wednesday. Next Lecture, on Visualizing Molecular Dynamics (VMD) by Klaus Schulten Today s Lecture: Protein Folding, Misfolding, Aggregation

More information

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Template Free Protein Structure Modeling Jianlin Cheng, PhD Template Free Protein Structure Modeling Jianlin Cheng, PhD Associate Professor Computer Science Department Informatics Institute University of Missouri, Columbia 2013 Protein Energy Landscape & Free Sampling

More information

ALL LECTURES IN SB Introduction

ALL LECTURES IN SB Introduction 1. Introduction 2. Molecular Architecture I 3. Molecular Architecture II 4. Molecular Simulation I 5. Molecular Simulation II 6. Bioinformatics I 7. Bioinformatics II 8. Prediction I 9. Prediction II ALL

More information

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES Protein Structure W. M. Grogan, Ph.D. OBJECTIVES 1. Describe the structure and characteristic properties of typical proteins. 2. List and describe the four levels of structure found in proteins. 3. Relate

More information

Protein folding is a fundamental problem in modern structural

Protein folding is a fundamental problem in modern structural Protein folding pathways from replica exchange simulations and a kinetic network model Michael Andrec, Anthony K. Felts, Emilio Gallicchio, and Ronald M. Levy* Department of Chemistry and Chemical Biology

More information

Protein Dynamics. The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron.

Protein Dynamics. The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron. Protein Dynamics The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron. Below is myoglobin hydrated with 350 water molecules. Only a small

More information

Protein Structures. 11/19/2002 Lecture 24 1

Protein Structures. 11/19/2002 Lecture 24 1 Protein Structures 11/19/2002 Lecture 24 1 All 3 figures are cartoons of an amino acid residue. 11/19/2002 Lecture 24 2 Peptide bonds in chains of residues 11/19/2002 Lecture 24 3 Angles φ and ψ in the

More information

07/30/2012 Rose=a Conference. Principle for designing! ideal protein structures! Nobuyasu & Rie Koga University of Washington

07/30/2012 Rose=a Conference. Principle for designing! ideal protein structures! Nobuyasu & Rie Koga University of Washington 07/30/2012 Rose=a Conference Principle for designing! ideal protein structures! Nobuyasu & Rie Koga University of Washington Naturally occurring protein structures are complicated Functional site or junks

More information

Aggregation of the Amyloid-β Protein: Monte Carlo Optimization Study

Aggregation of the Amyloid-β Protein: Monte Carlo Optimization Study John von Neumann Institute for Computing Aggregation of the Amyloid-β Protein: Monte Carlo Optimization Study S. M. Gopal, K. V. Klenin, W. Wenzel published in From Computational Biophysics to Systems

More information

Scattered Hammond plots reveal second level of site-specific information in protein folding: ( )

Scattered Hammond plots reveal second level of site-specific information in protein folding: ( ) Scattered Hammond plots reveal second level of site-specific information in protein folding: ( ) Linda Hedberg and Mikael Oliveberg* Department of Biochemistry, Umeå University, S-901 87 Umeå, Sweden Edited

More information

arxiv:q-bio/ v1 [q-bio.bm] 11 Jan 2004

arxiv:q-bio/ v1 [q-bio.bm] 11 Jan 2004 Cooperativity and Contact Order in Protein Folding Marek Cieplak Institute of Physics, Polish Academy of Sciences, Al. Lotników 32/46, 02-668 Warsaw, Poland arxiv:q-bio/0401017v1 [q-bio.bm] 11 Jan 2004

More information

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche The molecular structure of a protein can be broken down hierarchically. The primary structure of a protein is simply its

More information

Modeling protein folding: the beauty and power of simplicity Eugene I Shakhnovich

Modeling protein folding: the beauty and power of simplicity Eugene I Shakhnovich R50 Review Modeling protein folding: the beauty and power of simplicity Eugene I Shakhnovich It is argued that simplified models capture key features of protein stability and folding, whereas more detailed

More information

Protein Folding by Robotics

Protein Folding by Robotics Protein Folding by Robotics 1 TBI Winterseminar 2006 February 21, 2006 Protein Folding by Robotics 1 TBI Winterseminar 2006 February 21, 2006 Protein Folding by Robotics Probabilistic Roadmap Planning

More information

A new combination of replica exchange Monte Carlo and histogram analysis for protein folding and thermodynamics

A new combination of replica exchange Monte Carlo and histogram analysis for protein folding and thermodynamics JOURNAL OF CHEMICAL PHYSICS VOLUME 115, NUMBER 3 15 JULY 2001 A new combination of replica exchange Monte Carlo and histogram analysis for protein folding and thermodynamics Dominik Gront Department of

More information

09/06/25. Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Non-uniform distribution of folds. Scheme of protein structure predicition

09/06/25. Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Non-uniform distribution of folds. Scheme of protein structure predicition Sequence identity Structural similarity Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Fold recognition Sommersemester 2009 Peter Güntert Structural similarity X Sequence identity Non-uniform

More information

Protein folding in the cell occurs in a crowded, heterogeneous

Protein folding in the cell occurs in a crowded, heterogeneous Thermodynamics and kinetics of protein folding under confinement Jeetain Mittal a and Robert B. Best b,1 a Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases,

More information

114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009

114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009 114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009 9 Protein tertiary structure Sources for this chapter, which are all recommended reading: D.W. Mount. Bioinformatics: Sequences and Genome

More information

The starting point for folding proteins on a computer is to

The starting point for folding proteins on a computer is to A statistical mechanical method to optimize energy functions for protein folding Ugo Bastolla*, Michele Vendruscolo, and Ernst-Walter Knapp* *Freie Universität Berlin, Department of Biology, Chemistry

More information

Prediction and refinement of NMR structures from sparse experimental data

Prediction and refinement of NMR structures from sparse experimental data Prediction and refinement of NMR structures from sparse experimental data Jeff Skolnick Director Center for the Study of Systems Biology School of Biology Georgia Institute of Technology Overview of talk

More information

The protein folding problem consists of two parts:

The protein folding problem consists of two parts: Energetics and kinetics of protein folding The protein folding problem consists of two parts: 1)Creating a stable, well-defined structure that is significantly more stable than all other possible structures.

More information

Programme Last week s quiz results + Summary Fold recognition Break Exercise: Modelling remote homologues

Programme Last week s quiz results + Summary Fold recognition Break Exercise: Modelling remote homologues Programme 8.00-8.20 Last week s quiz results + Summary 8.20-9.00 Fold recognition 9.00-9.15 Break 9.15-11.20 Exercise: Modelling remote homologues 11.20-11.40 Summary & discussion 11.40-12.00 Quiz 1 Feedback

More information

Evolution of functionality in lattice proteins

Evolution of functionality in lattice proteins Evolution of functionality in lattice proteins Paul D. Williams,* David D. Pollock, and Richard A. Goldstein* *Department of Chemistry, University of Michigan, Ann Arbor, MI, USA Department of Biological

More information

Hydrophobic Aided Replica Exchange: an Efficient Algorithm for Protein Folding in Explicit Solvent

Hydrophobic Aided Replica Exchange: an Efficient Algorithm for Protein Folding in Explicit Solvent 19018 J. Phys. Chem. B 2006, 110, 19018-19022 Hydrophobic Aided Replica Exchange: an Efficient Algorithm for Protein Folding in Explicit Solvent Pu Liu, Xuhui Huang, Ruhong Zhou,, and B. J. Berne*,, Department

More information

Small organic molecules in aqueous solution can have profound

Small organic molecules in aqueous solution can have profound The molecular basis for the chemical denaturation of proteins by urea Brian J. Bennion and Valerie Daggett* Department of Medicinal Chemistry, University of Washington, Seattle, WA 98195-7610 Edited by

More information

Procheck output. Bond angles (Procheck) Structure verification and validation Bond lengths (Procheck) Introduction to Bioinformatics.

Procheck output. Bond angles (Procheck) Structure verification and validation Bond lengths (Procheck) Introduction to Bioinformatics. Structure verification and validation Bond lengths (Procheck) Introduction to Bioinformatics Iosif Vaisman Email: ivaisman@gmu.edu ----------------------------------------------------------------- Bond

More information