Protein Folding Prof. Eugene Shakhnovich

Slide 1. Protein Folding. Eugene Shakhnovich, Department of Chemistry and Chemical Biology, Harvard University.

Slide 2. Proteins are folded on various scales. As of now we know hundreds of thousands of sequences (Swiss-Prot) and a few thousand structures (Protein Data Bank).

Slide 3. Proteins are tightly packed.

The screen versions of these slides have full details of copyright and acknowledgements.

Slide 4. Proteins can fold in vivo and in vitro (Anfinsen 59): the protein folding problem.

Slide 5. Protein physical properties. [Figure: the sequence LDAPSQIEVKDVTDTTLAI... folds in ~1 ms to 1 s into a structure ~10 Å in scale.]
1. The protein sequence uniquely defines the protein's native (ground state) structure.
2. The native state is thermodynamically stable.
3. The native state is kinetically accessible: reachable in a biologically reasonable time.

Slide 6. Calorimetry: an important experimental test of cooperativity. Calorimetric study of the heat denaturation of lysozyme at various pH. The position of the heat capacity (Cp) peak determines the transition temperature T0, the peak width gives the transition width ΔT, and the area under the peak determines the heat ΔH absorbed per gram of protein. The values of ΔT, ΔH × (protein's m.w.), and T0 satisfy the van 't Hoff equation, indicating that the denaturation occurs as an all-or-none (first-order) transition. The increased heat capacity of the denatured protein (ΔCp) originates from the enlarged interface between its hydrophobic groups and water after denaturation. Adapted from P.L. Privalov & N.N. Khechinashvili, J. Mol. Biol. (1974) 86: 665-684.
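The all-or-none test on the calorimetry slide can be illustrated numerically. The sketch below is a toy under stated assumptions, not the analysis of Privalov and Khechinashvili: it generates the excess heat capacity of an ideal two-state transition (ΔCp taken as zero, with made-up values ΔH = 400 kJ/mol and T0 = 330 K) and then recovers the van 't Hoff enthalpy from the peak height and the peak area, using ΔH_vH = 4 R T0² Cp,max / ΔH_cal. A ratio ΔH_vH/ΔH_cal close to 1 is the signature of a two-state transition. The function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def excess_cp(T, dH, Tm):
    """Excess molar heat capacity of an ideal two-state transition.

    Assumes dCp = 0, so dG(T) = dH * (1 - T / Tm).
    """
    K = math.exp(-dH * (1.0 - T / Tm) / (R * T))  # unfolding equilibrium constant
    theta = K / (1.0 + K)                          # fraction unfolded
    return dH ** 2 * theta * (1.0 - theta) / (R * T ** 2)


def vant_hoff_ratio(temps, cps):
    """Return (T0, calorimetric dH, dH_vH / dH_cal) from a Cp(T) curve."""
    # Calorimetric enthalpy: trapezoidal area under the excess Cp peak.
    dH_cal = sum(
        0.5 * (cps[i] + cps[i + 1]) * (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    )
    i_max = max(range(len(cps)), key=cps.__getitem__)
    T0, cp_max = temps[i_max], cps[i_max]
    dH_vh = 4.0 * R * T0 ** 2 * cp_max / dH_cal  # van 't Hoff enthalpy
    return T0, dH_cal, dH_vh / dH_cal


# Synthetic curve with illustrative (not measured) parameters.
temps = [280.0 + 0.05 * i for i in range(2001)]
cps = [excess_cp(T, 400e3, 330.0) for T in temps]
T0, dH_cal, ratio = vant_hoff_ratio(temps, cps)
print(f"T0 = {T0:.2f} K, dH_cal = {dH_cal / 1e3:.1f} kJ/mol, ratio = {ratio:.3f}")
```

For real data the ratio would be computed from a baseline-subtracted curve; a value well below 1 would suggest populated intermediates.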

Slide 7. Small proteins are cooperative two-state systems. [Figure: free-energy profile with a transition state between the folded and unfolded states.] Free energy: F = E - TS. Folded: low energy E. Unfolded: high entropy S.

Slide 8. Theoretical analysis identified a single thermodynamic parameter, the energy gap, as a universal predictor of folding thermodynamics and kinetics. Major insights from theoretical studies: only evolutionarily selected sequences that have a large energy gap can fold cooperatively. In kinetics, theory and simulations identified nucleation as a major kinetic event in folding, consistent with the first-order, cooperative character of its thermodynamics. Folding nuclei were found and characterized. Review: Chemical Reviews, 106, pp. 1559-88 (2006).

Slide 9. Why is the energy gap important? [Figure: random and evolutionarily selected sequences.]
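The two-state picture, with F = E - TS for a low-energy folded state and a high-entropy unfolded state, can be made concrete with a few lines of arithmetic. This is a minimal sketch, assuming temperature-independent energy and entropy differences; the numerical values (60 kcal/mol and 0.18 kcal/(mol K)) are invented for illustration and the function names are hypothetical.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def folded_fraction(T, dE, dS):
    """Equilibrium folded population of a two-state protein.

    dE = E_unfolded - E_folded (> 0: the folded state has lower energy)
    dS = S_unfolded - S_folded (> 0: the unfolded state has higher entropy)
    """
    dF = dE - T * dS  # F_unfolded - F_folded, from F = E - TS
    return 1.0 / (1.0 + math.exp(-dF / (K_B * T)))


def midpoint_temperature(dE, dS):
    """Temperature at which both states are equally populated (dF = 0)."""
    return dE / dS


dE, dS = 60.0, 0.18  # illustrative values only
Tm = midpoint_temperature(dE, dS)
for T in (Tm - 20.0, Tm, Tm + 20.0):
    print(f"T = {T:6.1f} K  folded fraction = {folded_fraction(T, dE, dS):.4f}")
```

Because dE and dS are both large, the population switches from almost fully folded to almost fully unfolded over a narrow temperature range, which is the cooperativity the slide refers to.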

Slide 10. Test of protein folding theory: the importance of the energy gap. [Figure: large-gap design; energy and Q versus Monte Carlo steps.]

Slide 11. Finding the folding nucleus in simulations. [Figure: Q and free energy along a trajectory, with the time axis marked at 1x10^6 and 4x10^6 MC steps.] Abkevich et al., Biochemistry 33, 10026-10036 (1994); Shakhnovich et al., Nature 379, 96-98 (1996).

Slide 12. Protein engineering: Φ-value analysis. Method: engineer a protein with an altered amino acid at a target position and test to what extent the transition state is affected compared to the native state. [Figure: free-energy levels of the unfolded states (G_U), transition states (G_T), and native state (G_N) for the wild type and the mutant.] Φ = 1: the residue is kinetically important. Φ = 0: the residue is kinetically unimportant. Fersht, Curr. Opin. Struct. Biol. 7, 3-9 (1997).
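The Φ-value described above is a ratio of two free-energy changes caused by the mutation: the change in the transition state relative to the unfolded state, and the change in the native state relative to the unfolded state. The sketch below assumes the common convention in which the transition-state term is obtained from folding rates as RT ln(k_wt/k_mut); the function name, argument names, and the example numbers are hypothetical.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol K)


def phi_value(kf_wt, kf_mut, ddG_native, T=298.15):
    """Phi = ddG(transition state - unfolded) / ddG(native - unfolded).

    kf_wt, kf_mut : folding rate constants of wild type and mutant (same units)
    ddG_native    : destabilization of the native state by the mutation,
                    kJ/mol, positive when the mutant is less stable
    """
    if ddG_native == 0:
        raise ValueError("Phi is undefined when the mutation does not change stability")
    ddG_ts = R * T * math.log(kf_wt / kf_mut)  # change in folding barrier
    return ddG_ts / ddG_native


# Illustrative numbers: a mutation that destabilizes the native state by 6 kJ/mol.
ddG = 6.0
rt = R * 298.15
print(phi_value(100.0, 100.0 * math.exp(-ddG / rt), ddG))  # barrier rises by the full ddG
print(phi_value(100.0, 100.0, ddG))                         # folding rate unchanged
```

In practice Φ-values are considered unreliable when the stability change is small, so a threshold on ddG_native is usually applied before interpreting the ratio.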

Slide 13. Folding nucleus in SH3 domains.

Slide 14. Evolutionary control of folding rates and stability. The idea: nucleus residues may determine the folding rate. Therefore, if evolution cared about folding kinetics, it could have exerted extra pressure on nucleus residues. Nucleus residues can be found from the analysis of conservation in sequences of structurally aligned proteins (Mirny and EIS, J. Mol. Biol., 291, p. 177 (1999)).

Slide 15. Evolutionary analysis correctly predicts the folding nucleus in Ig-fold proteins. Prediction: Mirny and EIS, JMB, 1999. Experiment: J. Clarke and coauthors (2001).

Slide 16. An all-atom Monte Carlo folding simulation. [Figure: unfolded (random coil) and folded (native state) structures.]

Slide 17. Protein G: folding of a small protein in all-atom detail (Go model). Color key: black, first beta-hairpin; red, alpha helix; green, second beta-hairpin. J. Shimada and EIS, PNAS, 99, p. 11175 (2002).

Slide 18. Protein G folding pathways: summary. [Diagram of states between Unfolded and Native: Helix-β1 or β2; Helix-β1; Helix-β2; Helix-hairpin 1 (accumulates); Helix-hairpin 2 (does not accumulate); β1-β4 sheet (does not accumulate). A green circle or box means native-like structure.]

Slide 19. What about the transition state ensemble (TSE) in protein G? A protocol using Pfold identifies conformations that are committed to fold very fast, downhill, in fewer than 10^7 steps.

Slide 20. A structure belonging to the transition state ensemble. Green = important in WT. Red = important also in the mutant.

Slide 21. From sequence to structure (i.e. non-Go). Wishlist: all-atom, low RMSD. How to fold a protein? Approach: all-atom statistical potentials (2-body + hydrogen bonds). Kussell, ES, PNAS 02; Hubner, Deeds, ES, PNAS 05.
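The Pfold protocol mentioned for the transition state ensemble can be sketched generically: launch many short, independent runs from one conformation and record the fraction that reach the folded state before the unfolded state. Conformations with Pfold near 0.5 are candidates for the TSE. The code below is not the all-atom protocol from the slides; it is a minimal illustration on a toy one-dimensional random walk, and every name in it (estimate_pfold, step, is_folded, is_unfolded) is hypothetical.

```python
import random


def estimate_pfold(start, step, is_folded, is_unfolded,
                   n_runs=200, max_steps=10_000, rng=None):
    """Fraction of committed runs from `start` that fold before unfolding.

    step(state, rng) -> next state; is_folded / is_unfolded are predicates.
    Runs that hit neither boundary within max_steps are not counted.
    """
    rng = rng or random.Random()
    folded = committed = 0
    for _ in range(n_runs):
        state = start
        for _ in range(max_steps):
            if is_folded(state):
                folded += 1
                committed += 1
                break
            if is_unfolded(state):
                committed += 1
                break
            state = step(state, rng)
    return folded / committed if committed else float("nan")


# Toy model: unbiased walk on 0..20, with 0 = unfolded and 20 = folded.
def walk(x, rng):
    return x + rng.choice((-1, 1))


rng = random.Random(0)
for x0 in (5, 10, 15):
    p = estimate_pfold(x0, walk, lambda x: x >= 20, lambda x: x <= 0,
                       n_runs=2000, rng=rng)
    print(f"start = {x0:2d}  Pfold ~ {p:.2f}")
```

For this toy walk the exact answer is x0/20, so the midpoint x0 = 10 plays the role of a transition state; in a real simulation the folded and unfolded predicates would be defined from order parameters such as RMSD.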

Slide 22. The potential. Energy function: E_tot = E_contact + a E_h-bond. Hydrogen bonding potential working for α proteins.

Slide 23. Contact term: µ-potential. Considers only side-chain to side-chain interactions; 79 different atom types in total. Symbols in the formula: A_i, the atom type of atom i (79 different types); the number of contacts between types A and B in the database; the number of A-B pairs not in contact; and µ, chosen to make the net interaction zero. [Figure: E_ij versus d_ij/(r_i + r_j), with the axis marked at 0.75 and 1.35; pairs in contact receive the energy E_AB.] E. Kussell, PNAS 02; Hubner, Deeds, ES, PNAS 05.

Slide 24. Methods. 4000 folding runs from a fully unfolded chain at constant T. Graph analysis of massive data. Clustering in multiple order parameters: a multidimensional comprehensive view. Analysis of the transition state ensemble.
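The slide text does not preserve the µ-potential formula itself, so the following is only a sketch under an explicit assumption: that the contact energy has the form E_AB = (-µ N_AB + (1 - µ) Ñ_AB) / (µ N_AB + (1 - µ) Ñ_AB), where N_AB counts A-B pairs in contact in the database and Ñ_AB counts A-B pairs not in contact. It is further assumed that "net interaction zero" means the energies summed over all type pairs vanish. The symbols N_AB and Ñ_AB, the function names, and the toy counts are all hypothetical.

```python
def mu_potential(n_contact, n_no_contact, mu):
    """Assumed form of E_AB; ranges from +1 (mu = 0) to -1 (mu = 1)."""
    numerator = -mu * n_contact + (1.0 - mu) * n_no_contact
    denominator = mu * n_contact + (1.0 - mu) * n_no_contact
    return numerator / denominator


def solve_mu(counts, tol=1e-10):
    """Bisect for the mu at which the energies summed over all pairs are zero.

    counts maps (type_a, type_b) -> (n_contact, n_no_contact), all positive.
    Each E_AB decreases monotonically with mu, so bisection on [0, 1] works.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        total = sum(mu_potential(n, m, mid) for n, m in counts.values())
        if total > 0.0:
            lo = mid  # net repulsive: increase mu
        else:
            hi = mid  # net attractive: decrease mu
    return 0.5 * (lo + hi)


# Toy database counts for three invented atom-type pairs.
counts = {("C", "C"): (300, 700), ("C", "O"): (120, 880), ("O", "N"): (40, 960)}
mu = solve_mu(counts)
for pair, (n, m) in counts.items():
    print(pair, round(mu_potential(n, m, mu), 3))
```

With µ fixed this way, pairs that are in contact more often than average come out attractive (negative) and pairs in contact less often come out repulsive (positive).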

Slide 25. Folding at physiological T ~25 °C.

Slide 26. Identifying the native state. The lowest-E prediction is 2.44 Å (best of 4000). Of the 4000 trajectories, the numbers that sampled each RMSD range are:

RMSD range    Trajectories
2 Å           44
3 Å           523
4 Å           1685
5 Å           2700
6 Å           3331

This is consistent with the usual exponential distributions of first-passage times (FPT).

Slide 27. A network ensemble view of folding. Construct a structural graph by clustering conformations observed in all trajectories. This allows multiple trajectories to be combined. Multidimensional view: cluster conformations based on various properties: RMSD, Rg, drms. This allows the folding mechanism to be fully characterized, whereas any single order parameter may be misleading. How to introduce ensemble kinetics into the graph description: the idea of flux!
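The graph-and-flux idea can be sketched in a few lines: assign every conformation to a cluster using several order parameters at once, treat clusters as nodes, and count transitions between nodes over all trajectories. The example below is a simplified stand-in, assuming that clustering is done by binning (RMSD, Rg) pairs on a regular grid; the slides do not specify the clustering method, and the function names, bin widths, and data are hypothetical.

```python
from collections import Counter


def assign_cluster(features, bin_widths):
    """Map a tuple of order parameters to a grid-cell label."""
    return tuple(int(x // w) for x, w in zip(features, bin_widths))


def transition_flux(trajectories, bin_widths):
    """Count cluster-to-cluster transitions summed over all trajectories."""
    flux = Counter()
    for traj in trajectories:
        labels = [assign_cluster(frame, bin_widths) for frame in traj]
        for a, b in zip(labels, labels[1:]):
            if a != b:  # ignore frames that stay in the same cluster
                flux[(a, b)] += 1
    return flux


def net_flux(flux):
    """Keep only edges with a positive net flow (forward minus reverse)."""
    return {
        (a, b): n - flux.get((b, a), 0)
        for (a, b), n in flux.items()
        if n > flux.get((b, a), 0)
    }


# Two invented trajectories of (RMSD in Å, Rg in Å) frames.
traj1 = [(9.0, 14.0), (7.5, 13.0), (3.0, 11.0), (1.5, 10.5)]
traj2 = [(9.5, 14.5), (7.0, 12.5), (9.1, 14.2), (7.9, 13.9), (2.5, 10.1)]
flux = transition_flux([traj1, traj2], bin_widths=(2.0, 2.0))
for edge, n in sorted(net_flux(flux).items(), reverse=True):
    print(edge, n)
```

Combining runs this way keeps back-and-forth excursions from dominating the picture, since only the net flow along each edge contributes to the ensemble pathway.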

Slide 28. Example: RMSD graph. Each node represents a protein conformation, colored by RMSD to the native state: from blue (closest) to red (most distant).

Slide 29. Flux: putting all runs together.

Slide 30. Folding scenario: summary.

Slide 31. Atomistically resolved structural intermediate. Conformations of the drms intermediary cluster perfectly with NMR-derived structures of the L16A model of the intermediate (25 structures, 1UZC; A. Fersht and coworkers). The average RMSD between the drms cluster and the 25 1UZC structures is 4.6 Å, with some conformations as low as 1.5 Å.

Slide 32. Conclusions. Successful ab initio ensemble folding of a small alpha-helical domain at constant physiological T < Tm. Ensemble folding pathway at atomic detail. Various earlier proposed mechanisms are at work: collapse, framework intermediate, and (late) nucleation in a fully resolved pathway. A graph is useful to conceptualize and organize protein structural space.