Protein Dynamics

The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O₂ to reach the heme iron. Below is myoglobin hydrated with 350 water molecules. Only a small edge of the heme ring appears at the surface in green. The rest is buried. Early X-ray diffraction structures showed a possible conformational change that could allow access to the heme iron. The crystal structure below shows that Arg-45, a sulphate, His-64, Val-68, and the propionic acid of the heme block O₂ access to the Fe. A crystal structure of phenylhydrazine bound to Mb shows a conformation with open access to the Fe.
Based on a rigid structure, the energy barrier to O₂ binding is about 400 kJ/mol. But we know that Mb does not hydrolyse ATP in order to bind O₂. How is the energy barrier lowered?

Conformational changes enable proteins to carry out a wide range of activities, e.g. antigen binding, enzyme catalysis, muscle protein movement, O₂ binding to globins, and receptor binding and activation. What are conformational changes, and where does the energy come from to enable them?

Experimental methods for studying protein dynamics include infra-red, Raman, NMR, CD, and fluorescence spectroscopy, and X-ray diffraction. Theoretical methods include molecular dynamics simulations. It is important to remember that computer simulations are only useful if their results can be tested experimentally.

For a rigorous analysis of protein dynamics, the rules of quantum mechanics (QM) must be applied. QM calculations are very computationally expensive, and usually the dynamics of only 20 to 30 atoms are calculated. QM methods are required to simulate enzyme catalysis of bond breakage and formation. Hybrid methods have been successful in simulating enzyme dynamics by combining QM simulation of the active-site atoms with a classical treatment of the rest of the protein. Classical molecular dynamics appears to simulate protein conformational changes successfully. Classical simulations consider atoms to be balls on springs.
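The "balls on springs" picture can be made concrete with a very small sketch: one ball attached to a harmonic spring, moved forward in time by repeatedly applying Newton's second law, F = ma. The velocity Verlet scheme used here is a common integrator chosen for illustration; the notes do not say which integrator any particular program uses. The function names, reduced units, and parameter values are all assumptions.

```python
def harmonic_force(x, k, x0):
    """Restoring force of a spring with stiffness k and rest position x0."""
    return -k * (x - x0)


def total_energy(x, v, m, k, x0):
    """Kinetic plus potential energy of the ball on the spring."""
    return 0.5 * m * v * v + 0.5 * k * (x - x0) ** 2


def velocity_verlet(x, v, m, k, x0, dt, n_steps):
    """Advance one ball on a spring by n_steps time steps of size dt.

    Returns the trajectory as a list of (position, velocity) pairs,
    including the starting point.
    """
    trajectory = [(x, v)]
    f = harmonic_force(x, k, x0)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / m      # half-step velocity update (a = F/m)
        x = x + dt * v_half                # full-step position update
        f = harmonic_force(x, k, x0)       # force at the new position
        v = v_half + 0.5 * dt * f / m      # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical reduced units: m = 1, k = 1, spring stretched to x = 1 at rest.
traj = velocity_verlet(x=1.0, v=0.0, m=1.0, k=1.0, x0=0.0, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0], m=1.0, k=1.0, x0=0.0)
e_end = total_energy(*traj[-1], m=1.0, k=1.0, x0=0.0)
print(f"energy drift over {len(traj) - 1} steps: {abs(e_end - e_start):.2e}")
```

With a small enough time step the total energy stays nearly constant, which is one simple sanity check on any such integration. A protein simulation applies the same kind of update to every atom, with forces taken from all the interactions rather than a single spring.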
The simulation of dynamics is done by solving Newton's equation of motion for all the atoms, where

$$F = ma = m\frac{dv}{dt}$$

If we know F and m, in principle we can solve for a and the trajectory of each atom in the protein. In classical calculations, X-ray co-ordinates are used to set the initial position, $r_i(t)$, of each atom. The forces on the atoms, $F_i(t)$, due to covalent bonds, bond angles, torsion angles, van der Waals interactions, and electrostatic interactions are calculated using data from small molecules. These are called empirical potentials or force fields:

$$E = \sum_{\text{bonds}} \frac{a_i}{2}\,(l_i - l_{i0})^2 + \sum_{\text{angles}} \frac{b_i}{2}\,(\theta_i - \theta_{i0})^2 + \sum_{\text{torsions}} \frac{V_n}{2}\,\bigl(1 + \cos(n\omega - \gamma)\bigr) + \sum_{i=1}^{N}\sum_{j>i}^{N} 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \frac{1}{2}\sum_{i=1}^{N}\sum_{j \neq i}^{N} \frac{q_i q_j}{r_{ij}}$$

Thermal energy available at ambient temperature causes random Brownian motion of the atoms in proteins on the $10^{-15}$ s timescale. RT = 0.6 kcal/mol = 2.5 kJ/mol at 25 °C. H-bond strength is 5 to 25 kJ/mol. The atoms are all given thermal energy that causes them to move with a velocity, $v_i(t)$, in a random direction. Then $F_i(t+\Delta t)$, $v_i(t+\Delta t)$, and $r_i(t+\Delta t)$ are calculated for $\Delta t \sim 10^{-15}$ s. The calculations are repeated millions or billions or trillions of times. The longest calculations have simulated atomic motion in a small protein for about 1 microsecond.

In 2007, a group at Stanford University folded a protein using a distributed network of personal computers: Folding@home. They were able to fold a fast-folding mutant of the villin headpiece, an 8.5 kDa, three-helix bundle that folds in about 700 ns.
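Returning to the empirical potential and the thermal energy quoted above, the individual terms are simple enough to evaluate directly. The sketch below is illustrative only: the function names are invented, a single ε and σ are used for every atom pair, and the electrostatic prefactor `k_e` defaults to 1 (reduced units), all of which are simplifying assumptions rather than features of any real force field.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3   # gas constant in kJ/(mol K)
KJ_PER_KCAL = 4.184               # thermochemical calorie conversion


def thermal_energy_kj(temperature_k):
    """RT in kJ/mol at the given absolute temperature."""
    return R_KJ_PER_MOL_K * temperature_k


def bond_energy(l, l0, a):
    """Harmonic bond-stretch term (a/2)(l - l0)^2; the angle term has the same form."""
    return 0.5 * a * (l - l0) ** 2


def torsion_energy(omega, v_n, n, gamma):
    """Periodic torsion term (V_n/2)(1 + cos(n*omega - gamma))."""
    return 0.5 * v_n * (1.0 + math.cos(n * omega - gamma))


def lennard_jones(r, epsilon, sigma):
    """van der Waals term 4*eps*[(sigma/r)^12 - (sigma/r)^6] for one pair."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj, k_e=1.0):
    """Electrostatic term q_i q_j / r for one pair (k_e = 1 is an assumed unit choice)."""
    return k_e * qi * qj / r


def nonbonded_energy(coords, charges, epsilon, sigma, k_e=1.0):
    """Sum the pair terms over each pair once (i < j).

    Counting each pair once is equivalent to half of the sum over all j != i.
    """
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            total += lennard_jones(r, epsilon, sigma)
            total += coulomb(r, charges[i], charges[j], k_e)
    return total


rt_kj = thermal_energy_kj(298.15)   # 25 degrees C
print(f"RT = {rt_kj:.2f} kJ/mol = {rt_kj / KJ_PER_KCAL:.2f} kcal/mol")
```

The Lennard-Jones term is zero at r = σ and has its minimum, of depth ε, at r = 2^(1/6) σ, which is a convenient check on any implementation.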
http://www.stanford.edu/group/pandegroup/folding/villin/

What have we learned from classical simulations? Down to the $10^{-10}$ s scale, atomic motions are incoherent and interrupted. At about $5 \times 10^{-9}$ s, collective motions appear and begin to dominate, since atoms cannot move unless others get out of the way. At the surface of a folded protein the coherent movements are up to 2 Å, and in the interior on the order of 0.5 Å. In myoglobin, picosecond fluctuations in side-chain positions lower the conformational energy barrier to ~20 kJ/mol and determine the O₂ delivery rate. See the dynamics movies on the course web site.

Conclusions:
1. Thermal fluctuations lubricate the large-scale conformational changes.
2. Proteins are soft, i.e. structures can absorb small conformational perturbations in the neighborhood of the perturbant.

Biologically important conformational changes range from 0.5 to 50 Å and occur on the $10^{-9}$ to $10^{3}$ s scale. A well-known conformational change is the rotation of the antigen-binding domains in IgG upon binding antigen. The domains rotate about a hinge on the $10^{-8}$ to $10^{-7}$ s scale. This has been too slow to simulate, so far.
See the Figure. Motion can only take place if all the atoms in the hinge are able to move.

A different approach is a kind of flip-book animation. Based on two or more X-ray diffraction structures, plausible intermediate conformations between the structures are constructed and then a movie is made of the motion. This is how the morph server at http://www2.molmovdb.org/ works. Some aspects of these dynamics have been confirmed by detailed measurements of ms and µs dynamics using FRET and NMR.

Our current picture of proteins and enzymes is that they exist in many different conformations in equilibrium, e.g. calmodulin exists in at least 10 different conformations. Calmodulin-binding proteins select the conformation of calmodulin that matches their binding site and thereby shift the equilibrium. In addition to thermal energy, proteins may also use interaction energy available upon binding to other proteins and molecules, e.g. H-bond formation, electrostatic interactions etc.

Protein Structure Prediction and Design

ROSETTA is a program for calculating protein structures from amino acid sequences and has been very successful in prediction competitions (CASP). CASP stands for Critical Assessment of protein Structure Prediction. Every two years, protein predictors compete to see whose program produces the best predictions of new protein structures.
The actual structures determined by X-ray diffraction or NMR are not released until the predictions have been submitted. ROSETTA has also been used for new protein and enzyme design and for protein docking with other proteins and with DNA and RNA.

For prediction, ROSETTA begins by searching the PDB for segments homologous to segments of the target protein and applying their backbone conformations to the sequence. This is called threading and is the basis of many prediction programs. Figure 2A shows 5 segment conformations applied to a sequence. To simplify the calculations, the side-chains are represented by hydrophobic, polar, positively-charged or negatively-charged spheres.

Figure 2: Das & Baker, Ann. Rev. Biochem. (2008), 363.

In stage 2 (Figure 2B), different local backbone conformations are explored that permit the development of stabilizing tertiary interactions. Exploration means randomly changing the backbone conformation to generate thousands of different structures, which are then analysed to determine which have the lowest energy. In stage 3 (Figure 2C), an all-atom representation of the protein is refined by simulated annealing and other procedures. Simulated annealing is a form of molecular dynamics simulation. The protein is heated and then slowly cooled, allowing the atoms to explore conformational space.
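The heat-then-cool idea can be illustrated on a toy problem. Note that the sketch below is not molecular dynamics and is not ROSETTA's procedure: it swaps in a Metropolis Monte Carlo form of simulated annealing on an invented one-dimensional "energy landscape", simply because that version fits in a few lines. The landscape, the cooling schedule, and all parameter values are assumptions.

```python
import math
import random


def rugged_energy(x):
    """Hypothetical 1D landscape: a shallow bowl with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def simulated_annealing(energy, x0, t_start=5.0, t_end=0.01,
                        n_steps=20000, step=0.5, seed=0):
    """Metropolis Monte Carlo with a geometrically decreasing temperature.

    Returns the lowest-energy position seen and its energy.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(n_steps):
        # Temperature falls smoothly from t_start to t_end ("slow cooling").
        t = t_start * (t_end / t_start) ** (i / max(n_steps - 1, 1))
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / T), which shrinks as T drops.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = simulated_annealing(rugged_energy, x0=4.0)
print(f"start energy {rugged_energy(4.0):.3f}, best energy found {best_e:.3f}")
```

At high temperature the walker crosses barriers between minima freely; as the temperature falls it settles into whichever low-energy basin it occupies at the time.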
Repeated heating and cooling can permit the protein to find a global free energy minimum. This approach works particularly well for proteins with low contact order. It does not work nearly so well for proteins with high contact order. Contact order is the average separation in sequence between residues that are close in space in the protein structure. Usually amino acids (AA) are said to be in contact if any of their atoms are within 6 Å. The sequence separations of all contacting pairs of AA are added up, averaged over the number of contacts, and divided by the number of AA in the protein. Interestingly, proteins with low contact order fold quickly, so ROSETTA works well only for fast-folding proteins.

Protein Design

So far, α-helical proteins have proven easier to design. This is because α-helix formation involves local structure (i → i+4 H-bond), whereas β-sheet formation involves more long-range structure. Particularly successful is the design of 4-helix bundles based on the heptad repeat. Below is a picture of a 4-helix heme-binding protein tetramer. Small β-sheets have been designed on the basis of the β-hairpin.
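Returning briefly to contact order as defined earlier (before the Protein Design heading), the calculation can be sketched in a few lines. This is a minimal sketch under stated assumptions: each residue is given as a list of atom coordinates in Å, two residues are in contact if any pair of their atoms lies within the 6 Å cutoff, and sequence neighbours are counted as contacts (the `min_separation` parameter is an assumption, since conventions differ). The function names and the toy coordinates are invented for illustration.

```python
import math


def in_contact(res_a, res_b, cutoff=6.0):
    """True if any atom of res_a is within cutoff (in Å) of any atom of res_b."""
    return any(math.dist(a, b) <= cutoff for a in res_a for b in res_b)


def relative_contact_order(residues, cutoff=6.0, min_separation=1):
    """Average sequence separation of contacting residues, divided by chain length.

    residues: list of residues, each a list of (x, y, z) atom coordinates.
    """
    n_res = len(residues)
    n_contacts = 0
    total_separation = 0
    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            if in_contact(residues[i], residues[j], cutoff):
                n_contacts += 1
                total_separation += j - i
    if n_contacts == 0:
        return 0.0
    return total_separation / (n_contacts * n_res)


# Toy 4-residue chains with one atom per residue (hypothetical coordinates).
extended = [[(0.0, 0.0, 0.0)], [(3.8, 0.0, 0.0)], [(7.6, 0.0, 0.0)], [(11.4, 0.0, 0.0)]]
compact = [[(0.0, 0.0, 0.0)], [(3.8, 0.0, 0.0)], [(3.8, 3.8, 0.0)], [(0.0, 3.8, 0.0)]]
print(relative_contact_order(extended))   # only sequence neighbours touch
print(relative_contact_order(compact))    # residues far apart in sequence also touch
```

The extended chain, where only neighbours in sequence touch, gives a lower value than the compact one, where residues distant in sequence are also in contact. This mirrors the distinction between local (helix-like) and long-range (sheet-like) contacts.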
New zinc finger proteins have also been designed. One group has designed a protein that can switch between a trimeric coiled-coil and a zinc finger upon addition of zinc.

In 2003, ROSETTA was used to design a new α/β protein not observed in the PDB. The structure consists of a β-α-β-α-β motif with an antiparallel β-strand added at each of the N- and C-termini. Figure 1 shows the secondary structure: squares = helix; circles = loops; hexagons = β-strand; purple arrows = H-bonds.

It is relatively easy to design segments of secondary structure that collapse to form a hydrophobic globular structure. However, many designed proteins form molten globules: collapsed proteins with a high content of secondary structure but no stable tertiary structure.

Method:
1. From the 2D picture above, ROSETTA was used to search the PDB for 3- and 9-residue fragments that have a backbone conformation that agrees with the model. 172 backbone-only 3D models were generated that agreed with each other to within an RMSD of 2-3 Å.
2. Next, one sequence was added to each model. All AA were allowed at 71 positions, and 110 rotamers (side-chain conformations) were searched for each position. 22 β-strand side-chains were restricted to polar residues, and only 75 rotamers were searched. Thus $110^{71} \times 75^{22} > 10^{186}$ structures were searched for each model. This takes about 10 min on a Pentium III per model. From all the searched structures, the one with the lowest energy is selected.
3. Next, backbone and side-chain conformations were optimized by cycling through thousands of rounds of backbone, side-chain, backbone optimization. Following backbone optimization, step 2 is repeated so that a more nearly optimal sequence is generated. 15 cycles of sequence-backbone optimization were carried out 5 times for each model, giving 860 final models.

A number of the early models were expressed in E. coli and formed molten globules. Only methods that further optimized steric packing in the interior of the protein produced low-energy, tightly packed proteins. The optimized sequence is similar to no known sequence and the structure is unique. Figure 3C shows the predicted structure and Figure 3D the 2.4 Å X-ray diffraction structure, rotated by 90°.
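As a quick check on the search-space size quoted in step 2 of the method, the product is far too large to compare by eye, but its base-10 logarithm is easy to compute. The helper name below is invented for illustration; the numbers are those given in the text.

```python
import math


def log10_search_space(position_classes):
    """Base-10 log of the number of combinations.

    position_classes: list of (n_positions, n_choices_per_position) pairs.
    """
    return sum(n_pos * math.log10(n_choices) for n_pos, n_choices in position_classes)


# 71 positions with 110 rotamers each, 22 positions with 75 rotamers each.
exponent = log10_search_space([(71, 110), (22, 75)])
print(f"search space is about 10^{exponent:.2f}")
```

The exponent comes out a little above 186, consistent with the bound of more than 10^186 structures per model. A space of that size cannot be enumerated in 10 minutes, so the quoted run time implies that the search prunes or samples the combinations rather than visiting each one.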
The agreement between the measured electron density of the protein and the model structure (blue) and the real protein structure (red) is shown in Figures 3A and 3B. Figure 4A shows the similarity in the backbone Cα positions, where the RMSD between the computer model (blue) and the X-ray structure (red) is 1.17 Å. Figure 4C shows that the side-chains in the hydrophobic core superimpose very closely. Figure 2C shows that the protein is quite resistant to denaturation with guanidine HCl. It does not melt below 98 °C. The free energy of unfolding is 13.2 kcal/mol at 25 °C, indicating that the protein is more stable than most in the PDB.
Does this success mean that we understand protein structures?