CS 882 Course Project Protein Planes

Robert Fraser SN

Abstract

This study is a review of the properties of protein planes. The basis of the research is the covalent bonds along the backbone of the protein, since the plane is defined by these atoms. We measure the bond lengths and the angles between the bonds to determine how consistent these values are in the PDB. We then measure the lengths of the planes and omega (ω), the dihedral angle that defines the plane. Finally, we look at secondary structures to see whether there are preferential angles between the peptide planes and the axes of the secondary structure. The results show that the first two properties are consistent and are reliably used for the purposes of refinement and structure prediction. The angles related to the secondary structures were less consistent, but the average value found for the alpha helix matched what was claimed in the literature. The other secondary structures have similar accuracy, and the preferred angles discovered are novel results based upon the literature review.

2 Introduction

The purpose of this research has been to study the properties of protein planar regions, particularly with respect to their physical configurations found in known protein structures. A significant number of properties were selected for study. We begin by looking at properties such as covalent bond lengths and angles, we progress to properties such as the plane length and shape, and we conclude with a look at how planes are configured in secondary structures. Let's begin by motivating this research with past work and potential applications.

The primary motivation for this work is an open problem known as the Cα-trace problem. To understand this problem, we must first take a quick look at protein structure determination. Traditionally, protein structures have been determined by X-ray crystallography.
Although this method is highly time-consuming due to the need to crystallize the purified protein, it is used extensively and has been the structure determination method of choice since the first high resolution structure was published in 1960 [KDS+60]. Once the crystal is formed, an electron density map is produced using X-ray diffraction. The crystallographer can determine the approximate positions of the constituent heavy atoms (i.e., those other than hydrogen) from this map, but the positions are only as good as the resolution of the map. On small molecules this resolution can be better than 0.03Å, but on proteins 1.5 or 2Å is typical [WD93]. Another common technique is nuclear magnetic resonance (NMR) spectroscopy. NMR produces a graph with peaks corresponding to the shift due to each nucleus in the molecule. The result is higher resolution structures, but it is still subject to inaccuracies and is limited to small molecules [PR04].

The Cα-trace problem arises when we are provided with only approximate positions of the alpha carbon atoms for a given protein and we would like to determine the remainder of the structure given only that information. This problem finds a number of applications. One is that you may wish to work with the approximate alpha carbon atom positions determined by X-ray crystallography. Since these positions are subject to inaccuracies, we would like to use some techniques to determine their precise coordinates, as well as those of the other atoms in the protein. There are a number of protein structures where the only information known is the alpha carbon coordinates, possibly because that was the only information found or because the researcher wished to obfuscate their results. It would be useful to have an accurate method to determine the remainder of the coordinates. Also, some protein structure prediction techniques produce a Cα-trace as a first step (see [GKD06], for example), so a good solution to this problem would improve their prediction accuracy.

Refinement in this context is the practice of using heuristics to improve the quality of an inaccurate structure. Many approaches have been used, and they can be roughly grouped into the following categories:

De novo - This method is based purely on the energy of the protein, and adjustments are made to reduce the total energy of the molecule.
Anfinsen's thermodynamic hypothesis [Anf73] is that the native state is at or very close to the global minimum energy of the protein. Therefore, force field approximations such as the CHARMM [Cor90] or ECEPP/3 [KLS02] fields may be used to refine the structure so that the energy of the protein is minimized.

Fragment matching - This approach searches for fragments of proteins with known structure that have similar alpha carbon positions and sequence information to fragments in the unknown structure. Levitt [Lev92] uses fragment matching in his program SEGMOD, which first fits the discovered fragments into the Cα-trace model, and then uses energy minimization to further refine the model.

Maximize hydrogen bonding - This approach is similar to a de novo approach, but the first step of the algorithm is to determine which orientations of the peptide planes will maximize hydrogen bonding in the protein [LPW+93]. Once this structure is determined, it is further refined using energy minimization.

Idealized covalent geometry - This approach is the inspiration for this study. Used by Engh and Huber [EH91] for X-ray crystallography refinement, the idea is that even small variations from the ideal geometry for the protein backbone result in significant increases in the energy of the protein. Therefore, the protein structure is refined by constraining all of these values to be as close to ideal as possible. This approach has also been supplemented by including dihedral information [Pay93,DdBSB03].

All of these methods achieve under 1Å root mean square deviation (rmsd)¹ from native structures, while under 0.5Å rmsd is considered good. Perhaps including more information about the plane could further improve results, particularly in the latter approaches. That is the purpose of this research: we wish to determine whether there are additional properties associated with protein planes that could be used. We will approach this problem by first examining those properties that are traditionally used, and we will determine how closely these values are adhered to. Finally, we will look at some additional properties, such as the configuration of the peptide plane with respect to secondary structures, to see if the accuracy of these properties is similar to that of the traditional metrics.

3 Protein Plane Properties

The protein plane is an area of special interest because of the structural stability that it provides and the inherent reduction in the complexity of the protein model that a planar region entails. The peptide plane was identified in some of the earliest models of proteins. Pauling, Corey and Branson [PCB51] developed models which identified the planes, based solely on theoretical chemistry and crude X-ray crystallographic structures of amino acids. There was little uncertainty: they state that "there can be no doubt whatever about its (existence)". They were also able to identify the covalent bond lengths and angles with remarkable accuracy.

The planarity arises because the bond between the nitrogen and carbonyl carbon atoms forms a partial double bond. The chemical explanation is that the carbonyl carbon, nitrogen, and oxygen atoms have adjacent p orbitals, and these interact to produce three π orbitals.
The result is electron sharing between these three orbitals, and the lowest energy configuration for this sharing is a planar one. For more details, refer to [Kyt95].

An important thing to note at this point is that peptide planes do not map one-to-one with amino acid residues. It is conventional when dealing with proteins to think of the amino acid as the basic building block, which is a useful model. The atom in the middle of the amino acid is the alpha carbon, and it forms the junction between the protein backbone and the side chain for each residue. The alpha carbon is also the corner of each protein plane, so each plane spans two residues. This planar region is sometimes called a peptide unit, since we can use it as a building block as well [BT99]. This concept is illustrated in Figure 1. Now that the nature of the protein plane is clear, we will look at the properties of interest in turn.

¹ rmsd is a standard metric for comparing two structures. It is used when we have two sets of points where each point can be identified as having a unique counterpart in the other set. The rmsd is essentially the average distance between each point in one set and its counterpart in the other (more precisely, the square root of the mean squared distance).
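As an illustration of the rmsd metric described in the footnote, the following is a minimal sketch rather than the code used in this study. It assumes the two structures have already been superimposed and that atoms are paired by index; the function name `rmsd` and the tuple-based coordinate format are choices made for this example only.

```python
import math


def rmsd(points_a, points_b):
    """Root mean square deviation between two paired lists of (x, y, z) points.

    Assumption: points_a[i] is the counterpart of points_b[i], and the two
    structures are already superimposed (no fitting is done here).
    """
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point sets must be non-empty and of equal length")
    total_sq = 0.0
    for p, q in zip(points_a, points_b):
        # Squared Euclidean distance between a point and its counterpart.
        total_sq += sum((pc - qc) ** 2 for pc, qc in zip(p, q))
    return math.sqrt(total_sq / len(points_a))
```

Because the squared distances are averaged before the square root is taken, a few badly placed atoms raise the rmsd more than they would raise a plain average of distances.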

Fig. 1. A section of polypeptide chain is illustrated to show the planar regions. Note that the alpha carbon is at the corner of the plane, while one amino acid residue is of the sequence N–Cα–C along the backbone. Some important properties such as covalent bond lengths and angles are shown. The two dihedral angles φ and ψ are represented as well. ω, the dihedral angle in the peptide plane, is not labelled. The side chains are represented by spheres, where the sphere labelled R_i is meant to represent the side chain for residue i. This image is reproduced with permission from [PR04].

3.1 Bond Lengths and Angles

The lengths of the covalent bonds along the backbone of the protein chain have been determined accurately by X-ray diffraction to hundredths of an Ångström (Å). As mentioned earlier, we are interested to see how closely the bond lengths in protein structure files adhere to these values. The known values for the bonds we are interested in are summarized in Table 1. The standard angles between these atoms are shown in Table 2. The values that are generally considered standard are those of Engh and Huber [EH91]. We include the values published in [PCB51] over 50 years earlier to demonstrate the impressive accuracy of this early work.

Although the planar hydrogen atom is shown in Figure 1, it is not included in this study because hydrogen atoms are not resolved by X-ray diffraction methods. Thus, the majority of protein structure files do not include coordinate data for hydrogen atoms. The bond lengths and angles studied are all those on the backbone that do not involve hydrogen.

Table 1. The standard values for the covalent bond lengths among protein backbone atoms, as published by Engh and Huber [EH91]. The standard values from a recent textbook [PR04] are included as well, demonstrating that the Engh and Huber values are considered standard. The classic results of Pauling et al. [PCB51] are included for interest.

Bond      Bond Length (Å)    Textbook    Pauling et al.
N–Cα
Cα–C
C–O
C–N

Table 2. The standard values for the angles between protein backbone atoms are given [EH91]. Again, the classic results of Pauling et al. [PCB51] are included for interest.

Bonds      Bond Angle (°)    Pauling et al.
N–Cα–C
Cα–C–O
Cα–C–N
O–C–N
C–N–Cα

3.2 Plane Length

When we refer to the length of the plane in this context, we are referring to the distance from one alpha carbon to the next. This could be considered the length of the plane from corner to corner, for the two corners of the plane defined by alpha carbon atoms. This distance is also referred to as the Cα–Cα bond, as it can be treated as a sort of bond in models that consist only of alpha carbon atoms. Ideally, this distance should also be fairly consistent, since all of the intermediary lengths and angles are considered virtually fixed. The purpose of this section is to put that idea to the test.

Making this slightly more interesting, however, is that there are two stable configurations for the plane to adopt. Vastly more common (on the order of 1000:1 [Ham92]) is the trans configuration, shown in Figure 1. The other planar configuration is the cis conformation; the difference between the two is shown in Figure 2. It is clear from the figure that we should expect the trans configuration to be longer than the cis configuration. For any non-planar instances, we would expect values that lie between these two distances.

In the analysis, we will determine the average lengths and standard deviation of the planes for each of the cis and trans configurations. We will also count the number of instances of each to determine their relative prevalence. Finally, it is known that there is a strong preference for the second residue of the plane to be proline when in the cis conformation [Kyt95]; we will measure this frequency as well. We will also determine the percentage of instances of proline that are involved in cis configurations; it has been measured as 25% [Kyt95] or in the range of 10-30% [Ham92] when proline is the second residue in the peptide unit.

Fig. 2. The trans and cis configurations of the peptide plane both lie in the plane. The difference is that the Cα atoms lie on opposite sides of the line formed by the C–N bond in the trans configuration, while they are on the same side in the cis configuration. It is clear that the distance between alpha carbon atoms is less in the cis conformation.
The terminology is from Latin, where trans means "across", and cis translates as "on the same side".

3.3 Omega

Omega (ω) is the measure of the planarity of the plane. It is the dihedral angle defined by the Cα–C–N–Cα atoms. The method of calculating ω is outlined in Figure 3. We will use ω to classify the planes into the three conformations discussed in the previous section. If the angle is 180°, we have a clear instance of the trans configuration. Since we are dealing with a natural system, we will allow for a 15° deviation from this value within the trans class. Similarly, anything within 15° of 0° will be considered cis. Everything else will be placed in an "other" class. It has been documented that such values do not occur [Kyt95], and structures containing values of ω as much as 10° away from trans (e.g., [OS86]) were considered noteworthy. More recent works have noted a tendency for angles to vary from 180°, however, and a standard deviation of 6° has been observed [MT96,Edi01], as shown in Figure 4.

Fig. 3. The vectors used to calculate omega. A and B are the cross products of vectors corresponding to backbone bonds, and omega is given by the angle between them. The example shown here is in the trans configuration, so the vectors point in opposite directions and the angle between them is near 180°.

The "other" class will be an interesting one to note. For one, it is expected that the occurrence will be fairly low, because it is a higher energy configuration than the two planar ones. The reason that there is a planar tendency in the backbone is that the N–C bond becomes a partial double bond, as mentioned earlier. If we have a non-planar configuration, this partial double bond is not being adopted and the energy of the bond is higher. Correspondingly, since double bonds are shorter than single bonds, we would expect the N–C bond to be longer in these non-planar instances. We will check the data to see if this is reflected in known protein structures.

3.4 Secondary Structures

Finally, we will examine the relationship between the protein plane and the most common secondary structures: alpha helices, beta strands, and 3-10 helices.
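Looking back at Section 3.3, the ω calculation and the three-way classification can be sketched as follows. This is an illustrative sketch under stated assumptions, not the Matlab code used in the study: the atoms are taken in the order Cα(i), C(i), N(i+1), Cα(i+1), the angle is unsigned (between 0° and 180°), and the names `omega_abs` and `classify_omega` are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def omega_abs(ca1, c, n, ca2):
    """Unsigned omega in degrees for atoms Ca(i), C(i), N(i+1), Ca(i+1)."""
    b1 = _sub(c, ca1)   # Ca(i) -> C(i)
    b2 = _sub(n, c)     # C(i)  -> N(i+1)
    b3 = _sub(ca2, n)   # N(i+1) -> Ca(i+1)
    vec_a = _cross(b1, b2)  # normal to the Ca-C-N half of the unit
    vec_b = _cross(b2, b3)  # normal to the C-N-Ca half of the unit
    cos_w = _dot(vec_a, vec_b) / math.sqrt(_dot(vec_a, vec_a) * _dot(vec_b, vec_b))
    cos_w = max(-1.0, min(1.0, cos_w))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_w))


def classify_omega(angle_deg, tolerance=15.0):
    """Classify an unsigned omega using the 15 degree windows from the text."""
    if angle_deg >= 180.0 - tolerance:
        return "trans"
    if angle_deg <= tolerance:
        return "cis"
    return "other"
```

With an unsigned angle the trans class can only deviate below 180° and the cis class only above 0°, so the class averages will sit a few degrees inside the ideal values.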
The inspiration for this section is the claim that protein planes are generally parallel to the axis of an alpha helix [Ham92,GP98]. This claim is worth investigating to determine whether it is reliable enough for predictive purposes and for refinement of structures. In addition, it is worthwhile to investigate whether protein planes exhibit any preferential orientations to the other secondary structures.

Fig. 4. This graph shows the results of a previous survey of ω dihedral angles of peptide planes in the trans configuration. The histogram contains values of omega from proteins with high accuracy. The dots correspond to the Maxwell-Boltzmann relation with the mapping black-proteins, grey-peptides, and red-current ω value. The line is a classic energy function derived by Corey and Pauling. This image is reproduced from [Edi01].

Obviously, the first requirement for this analysis will be to determine the axis of the secondary structure. We will focus the discussion on the alpha helix here, and then the application of this method to the other structures will be covered briefly. There is no conventional method for determining the axis of an alpha helix; the method used should be driven by the particular application. The simplest method is to employ some sort of linear regression so that the distance from the axis to each of the alpha carbon atoms in the helix is minimized. However, secondary structures are natural structures, and as such there are deformations due to their surroundings inside and outside of the protein. Since we would like to know the angle between the protein plane and the axis, this model would not serve us as well as a curved line or segmented axis that follows the bends of the alpha helix.

Chothia et al. [CLR81] developed a model, later refined by Walther et al. [WEA96], that serves this purpose well. It is often referred to as the cross product of triad bisectors method [CSB96]. The method begins by finding the vector B_i perpendicular to the axis at the i-th alpha carbon, and then using the cross product of two consecutive such vectors to obtain the axis of the helix u_i which is local to these alpha carbons.
This method is particularly well suited to our study because two alpha carbons are associated with each protein plane, so we get one axis vector corresponding to one plane. These vectors are shown in Figure 5.
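The first pass of this construction can be sketched in a few lines. The sketch below is illustrative only and is not the Matlab implementation used in the study. It assumes the triad bisector is computed as B_i = r_i + r_{i+2} - 2 r_{i+1} (the form given with Figure 5) and rescales each axis vector to 1.5Å, the rise per residue the text uses for an ideal alpha helix. The function names are hypothetical, and the special handling of the helix ends and the smoothing passes are omitted.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triad_bisector(r0, r1, r2):
    """B = r0 + r2 - 2*r1 for three consecutive alpha carbon positions."""
    return tuple(a + c - 2.0 * b for a, b, c in zip(r0, r1, r2))


def local_axes(ca_coords, length=1.5):
    """Unsmoothed local axis vectors u_i = B_i x B_(i+1), rescaled to `length`.

    Returns len(ca_coords) - 3 vectors; end-of-helix handling is left out.
    """
    bisectors = [triad_bisector(ca_coords[i], ca_coords[i + 1], ca_coords[i + 2])
                 for i in range(len(ca_coords) - 2)]
    axes = []
    for b0, b1 in zip(bisectors, bisectors[1:]):
        u = _cross(b0, b1)
        norm = math.sqrt(sum(comp * comp for comp in u))
        if norm == 0.0:
            raise ValueError("consecutive bisectors are parallel; axis undefined")
        axes.append(tuple(length * comp / norm for comp in u))
    return axes
```

For an idealized right-handed helix (for example, alpha carbons placed every 100° around the z axis with a 1.5Å rise), every local axis returned by this sketch points along +z, which is a quick sanity check on the sign convention.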

Fig. 5. The method described by Walther et al. for determining local helix axis vectors. The local axis vectors are calculated in the first iteration by taking the cross product of two of the consecutive normals.

$$B_i = r_i + r_{i+2} - 2r_{i+1}$$

$$u_i = B_i \times B_{i+1}$$

For the helix axis for residue $r_n$ (the one at the C-terminus of the helix), the axis vector of $r_{n-1}$ is used again. For the purposes of smoothing, the axes must be fit to the helix. In order to assign positions for the axis vectors, the geometric center of four consecutive residues around the present one is calculated:

$$A_i = \frac{r_{i-1} + r_i + r_{i+1} + r_{i+2}}{4}$$

The formula needs to be adjusted for the ends of the helices. The axis vectors ($u_i$) are adjusted so that their lengths are all 1.5Å, which is the average rise per residue along the axis for an ideal alpha helix. The axis of the helix is now described by a series of line segments. The endpoints, $b_i$ and $e_i$, of the local helix axis for residue $r_i$ are given by:

$$b_i = A_i$$

$$e_i = A_i + u_i$$

The axes are smoothed using an iterative approach. The first step is to take the average of three consecutive helix axis vectors (two at the ends):

$$u_{i,\mathrm{smoothed}} = \frac{u_{i-1} + u_i + u_{i+1}}{3}$$

Now the average point coordinates $A_i$ are adjusted by finding the midpoint between the beginning point of the current helix axis and the endpoint of the previous one:

$$A_{i,\mathrm{smoothed}} = \frac{b_i + e_{i-1}}{2}$$

This smoothing process is repeated three times; the result is a series of local helix axis line segments which approximate the curve of the helix. For our analysis, we will analyze the positions of the plane relative to both the smoothed and unsmoothed axis vectors.

The axis calculation technique lends itself well to any other regular structure containing an axis, since nothing in the method is specific to alpha helices (except for the rise per residue in the smoothing step). The vectors perpendicular to the axis are in the direction of the bisector of the external angle of the lines formed between three consecutive alpha carbons. Figure 6 illustrates some beta strands and alpha helices; it is clear from this figure that the same principles would apply to both. See [Rot95] for another example of treating a beta strand as a helix for the purpose of determining its axis. A review of the literature did not reveal any preferences for the configuration between beta strands and protein planes, nor for 3-10 helices, so this study will elucidate whether or not such properties exist.

Fig. 6. This figure illustrates the configuration of beta strands. The image on the left is a ball and stick model where each ball corresponds to an alpha carbon, and the image on the right is the cartoon representation of the secondary structures of the same protein. The latter image is included to assist in locating beta strands; they are the arrow-shaped objects. Notice that the cross product of consecutive triad bisectors would give a vector that would form a local axis for the beta strand.

For the plane, we will find the plane normal by taking the cross product of the C–Cα vector with the Cα(i+1)–Cα(i) vector so that the normal will be on the same side of the plane regardless of the isomerism of the plane. Since nearly all helices are right-handed in proteins, this normal vector will consistently point towards the interior of the helix. Therefore we can have a signed angle, and we will define it so that planes tilted inwards have a positive angle. With beta strands, however, the axis will alternate between the two sides of the planes, so we will only measure the angle in the range of 0° to 90°, with 0° being parallel.

4 Implementation Details

In this section, we will begin by covering the data that was used in this study, and then we will briefly discuss how the data was analyzed. We will conclude by mentioning a few of the technical issues encountered during this research.

The standard knowledge base used for this study is the Protein Data Bank (PDB) [BKW+77]. The PDB is the standard repository for protein files once their structures have been determined. The scope of this survey was to analyze every PDB file in the repository as of There is a question of whether surveying the entire PDB would introduce bias or poor data, since many of the structures have poor resolution and there are groups of highly homologous proteins. However, since the purpose of this study was to be exhaustive, these compromises were accepted. Future studies could easily be run on data sets with only high resolution proteins and low homology to determine whether these factors make a difference.

Matlab was chosen to perform the analysis so that the pdbread function in the Bioinformatics Toolbox could be used. Because there are so many idiosyncrasies in PDB files, the robustness of the Matlab function was desirable. There are many files with unusual indexing; sometimes this is because there are loop regions that could not be resolved by the crystallographer. Sometimes there are errors in the data.
In most cases, pdbread is able to make sense of this data, and it returns a struct containing most of the data in the file in a usable format. When Matlab identified a file as unreadable, often because the file contained only alpha carbon atoms, the file was left aside for the purposes of this study. Of the files in the snapshot of the PDB, were able to be used in the study.

Once the struct is obtained for a particular protein, we next need to extract the information relevant to our study. The struct contains an Atom field, which contains the coordinate information for each atom in the protein, along with its other attributes. We begin by extracting only the backbone atoms for each residue by parsing through the struct. The secondary structure of each protein was obtained by running the DSSP program [KS83] on each PDB file. To save computation at runtime, all of the DSSP files were precomputed and stored; this took about 12 hours on a standard 1GHz PC. For each amino acid residue, we have four atom structs (N, Cα, C, O), each containing the following (in each case a character refers to an alphanumeric character):

PDBID - the 4 character PDB identifier for the protein;
atomname - the name of the atom: N for nitrogen, CA for alpha carbon, etc.;
resname - the 3 character name of the amino acid;
resseq - the number associated with the amino acid containing the atom, incrementing from the N terminal on each chain;
chainid - the character corresponding to the current chain;
coords - the X,Y,Z coordinates of the atom;
ss - the secondary structure. For this survey, we were interested in three types of secondary structure: the alpha helix (identified by an "H" in DSSP), the beta strand ("E"), and the 3-10 helix ("G"). Everything else was considered as being other; for the most part this consisted of loop regions.

Once we have all of these atom structs for the current protein, we can run the battery of tests to get all of the desired results. These results are then merged with all of the results previously obtained from other PDB files.

4.1 Technical Issues

A number of technical issues were encountered during this research that may be of interest to others hoping to do similar work. Because of the extremely large volume of data, the program had to be run in batches to avoid overflow errors. Thus, the source files were divided into 69 groups of 500 files each, and the results of each of these runs were compiled together once all the files had been analyzed.

Another obstacle is the computation time required by the pdbread function. It takes roughly 30 seconds on average to parse a PDB file with a standard 1GHz PC. When parsing such a large number of files, the computation time becomes cumbersome. Because the files had already been divided into batches, however, it became possible to run the analysis in parallel by having different computers analyze different batches of files. Using this approach, the analysis was performed using three PCs, and the computation took 5 days.

5 Results

This study is composed of a large number of tests, and the results are presented in this section in a series of tables and graphs.
We begin by looking at the covalent bond lengths (Table 3) and angles (Table 4). In each case, we again present the standard values derived by Engh and Huber [EH91] for reference. We then present the average values found in our study. The standard deviation (σ) values presented are the average standard deviation in each file, rather than the standard deviation over the entire data set, to give an impression of the variance in each file. For interest, we have also included the absolute maximum and minimum values found for each attribute, as well as the averages of the maximum and minimum values found for each of the 69 batches. These show that there are physically impossible values present, and that the results would likely be cleaner if some sanity checking were performed before the analysis to eliminate such cases. It is possible that there were some instances where multiple chains were labelled as a single chain in the PDB file, since there are examples of the C–N bond that are very long.

Table 3. The standard values for the covalent bond lengths among protein backbone atoms.

Bond              Textbook    Average    σ    Instances    Maximum    Avg Max    Minimum    Avg Min
N–Cα
Cα–C
C–O
C–N
Non-planar C–N    Variable

Notice that there is a small difference between the length of the C–N bond in the overall and non-planar cases, but this difference is not significant. It is probably not considered in most refinement approaches.

Table 4. The standard values for the angles between protein backbone atoms.

Bonds      Textbook    Average    σ    Instances    Maximum    Avg Max    Minimum    Avg Min
N–Cα–C
Cα–C–O
Cα–C–N
O–C–N
C–N–Cα

These results are what we expected: the values are very close to the textbook values, and there is very little deviation. Thus, we should expect the lengths of the planes to have similar properties. The results are shown in Table 5. The overall average is presented first, and then we present the results for each isomer. Once again, extreme values are presented for interest.

Table 5. The results for the length of the peptide plane. The combined results are shown first, followed by the results for each of the trans and cis isomers.

Isomer    Instances    Average    σ    Maximum    Avg Max    Minimum    Avg Min
All
trans
cis

There were proline residues as the second residue in cis conformation peptide units, while there were a total of prolines encountered in the survey.

Thus 3.7% of prolines are associated with cis conformations, which is much lower than the values of 10-30% claimed earlier. The proportion of cis isomers with proline is 86.5%, however, so we do have strong evidence for proline being preferred. We found that the trans isomer outnumbers its counterpart by a ratio of approximately 500:1, which is within a factor of 2 of the value of 1000:1 cited earlier. In both cases, the standard deviation with respect to the length of the plane is low, so these values are reliable enough for refinement purposes. The values for omega for each of the isomers are shown in Table 6.

Table 6. The average values of omega for each isomer class are presented.

Isomer    Instances    Average    Observed Standard Deviation
trans
cis
Other

In each case, the average is several degrees away from the ideal, which would be expected since we were measuring the absolute value of the angle. In the other case, the mean is surprisingly close to the trans threshold. In order to present the nature of these other values, we present two histograms of the data, as shown in Figure 7.

We can now examine the results with respect to secondary structures. The first property examined was the number of instances of cis configurations and others in the secondary structures, to see whether the occurrence rates differ from the average, as shown in Table 7.

Table 7. The number of cis isomers and others in each of the secondary structures chosen for study is presented.

Isomer         trans Instances    cis Instances    Other Instances
Alpha Helix    (26.7%)            (20.7%)          (28.5%)
Beta Strand    (10.9%)            (25.9%)          (20.6%)
3-10 Helix     (0.2%)             (4.3%)           (3.2%)
Other          (62.1%)            (49%)            (42.9%)

These results reveal some interesting tendencies. The number of non-trans isomers in beta strands is high, as it is for 3-10 helices. The number of cis isomers found in alpha helices is lower than for the other structures, since the cis configuration is detrimental to alpha helices. Considering this, the number of instances is still quite high.
It is also surprising to see that the other class of secondary structures, composed mostly of loop regions, has a greater tendency to trans configurations than otherwise.

Fig. 7. These histograms show the distribution of omega dihedral angles for peptide units that are not in either the trans or cis classes. The graphs show the same data; the latter shows the logarithm (base 10) of the number of instances to elucidate the distribution. It is quite even below 135° except for a slight rise below 30°. Above 135°, there is a consistent increase in the number of instances in each bin with increasing angle. The bins are 5° wide, and are labelled according to their lower bounds.

The final results of the study are shown in Table 8. For each type of secondary structure, both the axis found in the initial iteration of the Walther et al. [WEA96] method and that after smoothing were analyzed. The results for the unsmoothed alpha helix axes were very close to the hypothesized parallel configuration, although the standard deviation was high. The distribution of the angles is shown in the histogram in Figure 8. The smoothed axes were not quite as close to parallel, and the standard deviation was no lower, so based on this result it could be concluded that the raw axis found using the cross product of consecutive bisectors method is accurate for the corresponding peptide unit. We also looked at helices that were longer than three turns to see if the results would be different, but the difference was not significant. The results for the beta strands are shown in Figure 9. The standard deviation for the beta strands is lower than that for the two helices, but the range of possible values was half that of the helices. The distribution of the values for the 3-10 helices is very similar to that for the alpha helix; this histogram is shown in Figure 10.

Table 8. This table summarizes the results of the survey for the angles between peptide planes and the axes of secondary structures. In total there were alpha helices, of which were more than three turns long. There were beta strands and helices. The instances in the table refer to the number of peptide planes of each type.

Secondary Structure      Instances    Average    Observed Standard Deviation
α-helix
Long α-helix
Smoothed α-helix
Smoothed Long α-helix
β-strand
Smoothed β-strand
3-10-Helix
Smoothed 3-10-Helix

6 Conclusions

The study reviewed several properties associated with peptide planes. The lengths of the covalent bonds on the backbone of the proteins and the bond angles between these bonds were all found to be close to the accepted values, and had very low standard deviation. The lengths of the planes were found to have low deviation values as well. Once the secondary structure of the proteins was considered, the values predictably became less consistent.
The average angle between the axis of an alpha helix and the plane of the peptide unit was found to be very close to 0°, as was claimed in the literature. Similar results were obtained for the 3-10 helix, and the angle between the beta strand axis and the peptide plane was found to be Both of these results had standard deviations at least as low as that for the alpha helix, so the claim that these values are valid is equally strong. These claims were not found in the literature review conducted, so they may be novel results. Finally, it should be considered that the protein structures surveyed have been refined. In a sense, what is being measured in this review is how the refinement has been conducted in the past, so we should expect the values used as constraints to have low deviations. Thus, these results demonstrate that the secondary structure constraints are not being used in refinement, possibly to the detriment of the final model.

Fig. 8. Histogram of the angles between the local axis of each peptide unit in an alpha helix and the peptide plane. Notice that the main peak is slightly greater than 0°. Each bin is 5° wide and is labelled by its upper threshold.

Fig. 9. Histogram of the angles between the local axis of each peptide unit in a beta strand and the peptide plane. The distribution is fairly even below 35°. Each bin is 5° wide and is labelled by its upper threshold.

Fig. 10. Histogram of the angles between the local axis of each peptide unit in a 3-10 helix and the peptide plane. Notice that the shape of the distribution is very similar to that of the alpha helix. Each bin is 5° wide and is labelled by its upper threshold.

7 Future Work

A more extensive literature review should be conducted to determine whether the results of this research are indeed novel. Even if they are not novel, they are not widely recognized, and thus publication of this work may be of interest to the community. In addition, it would likely be useful to repeat this survey on a data set consisting only of high-resolution protein structures to determine whether the results for secondary structures can be improved. Many researchers also like to see such surveys conducted on data sets with low homology, so this constraint could be added in a future review. Finally, the secondary structure constraints could be implemented in a refinement program to determine whether any improvement can be obtained.
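The quantity surveyed in Table 8 and Figures 8 to 10, the angle between a peptide plane and the local axis of a secondary structure, can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study. The bisector is assumed to be the sum of the unit vectors from a Cα to its two neighbouring Cα atoms, the plane is assumed to be given by any three non-collinear atoms of the peptide unit, and the helix coordinates come from a hypothetical idealized helix (radius 2.3 Å, rise 1.5 Å, 100° per residue) rather than from the PDB.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def bisector(prev_ca, ca, next_ca):
    """Bisector of the angle at a C-alpha formed with its two neighbours (assumed definition)."""
    return add(unit(sub(prev_ca, ca)), unit(sub(next_ca, ca)))


def local_axis(ca0, ca1, ca2, ca3):
    """Unit local axis from the cross product of two consecutive bisectors."""
    return unit(cross(bisector(ca0, ca1, ca2), bisector(ca1, ca2, ca3)))


def plane_axis_angle(p, q, r, axis):
    """Angle in degrees (0 to 90) between the plane through p, q, r and the axis."""
    normal = unit(cross(sub(q, p), sub(r, p)))
    a = unit(axis)
    along = abs(dot(normal, a))        # part of the axis along the plane normal
    c = cross(normal, a)
    within = math.sqrt(dot(c, c))      # part of the axis lying in the plane
    return math.degrees(math.atan2(along, within))


def ideal_helix_ca(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Hypothetical C-alpha positions on an idealized helix along the z axis."""
    t = math.radians(turn_deg)
    return [(radius * math.cos(i * t), radius * math.sin(i * t), i * rise)
            for i in range(n)]


axis = local_axis(*ideal_helix_ca(4))
# Two hypothetical planes: one containing the z direction, one perpendicular to it.
parallel = plane_axis_angle((0, 0, 0), (1, 0, 0), (0, 0, 1), axis)
perpendicular = plane_axis_angle((0, 0, 0), (1, 0, 0), (0, 1, 0), axis)
print(axis, parallel, perpendicular)
```

An angle of 0° means the plane is parallel to the axis. The absolute value makes the result independent of the direction chosen for the axis or the plane normal. The sketch does not guard against consecutive bisectors that are parallel, in which case the cross product vanishes and no axis is defined.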


More information

Supersecondary Structures (structural motifs)

Supersecondary Structures (structural motifs) Supersecondary Structures (structural motifs) Various Sources Slide 1 Supersecondary Structures (Motifs) Supersecondary Structures (Motifs): : Combinations of secondary structures in specific geometric

More information

Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins Scott R. McAllister Christodoulos A. Floudas Princeton University

Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins Scott R. McAllister Christodoulos A. Floudas Princeton University Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins Scott R. McAllister Christodoulos A. Floudas Princeton University Department of Chemical Engineering Program of Applied and

More information

Protein Bioinformatics Computer lab #1 Friday, April 11, 2008 Sean Prigge and Ingo Ruczinski

Protein Bioinformatics Computer lab #1 Friday, April 11, 2008 Sean Prigge and Ingo Ruczinski Protein Bioinformatics 260.655 Computer lab #1 Friday, April 11, 2008 Sean Prigge and Ingo Ruczinski Goals: Approx. Time [1] Use the Protein Data Bank PDB website. 10 minutes [2] Use the WebMol Viewer.

More information

Homology models of the tetramerization domain of six eukaryotic voltage-gated potassium channels Kv1.1-Kv1.6

Homology models of the tetramerization domain of six eukaryotic voltage-gated potassium channels Kv1.1-Kv1.6 Homology models of the tetramerization domain of six eukaryotic voltage-gated potassium channels Kv1.1-Kv1.6 Hsuan-Liang Liu* and Chin-Wen Chen Department of Chemical Engineering and Graduate Institute

More information

1. What is an ångstrom unit, and why is it used to describe molecular structures?

1. What is an ångstrom unit, and why is it used to describe molecular structures? 1. What is an ångstrom unit, and why is it used to describe molecular structures? The ångstrom unit is a unit of distance suitable for measuring atomic scale objects. 1 ångstrom (Å) = 1 10-10 m. The diameter

More information

Pymol Practial Guide

Pymol Practial Guide Pymol Practial Guide Pymol is a powerful visualizor very convenient to work with protein molecules. Its interface may seem complex at first, but you will see that with a little practice is simple and powerful

More information

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Template Free Protein Structure Modeling Jianlin Cheng, PhD Template Free Protein Structure Modeling Jianlin Cheng, PhD Associate Professor Computer Science Department Informatics Institute University of Missouri, Columbia 2013 Protein Energy Landscape & Free Sampling

More information

Computer simulations of protein folding with a small number of distance restraints

Computer simulations of protein folding with a small number of distance restraints Vol. 49 No. 3/2002 683 692 QUARTERLY Computer simulations of protein folding with a small number of distance restraints Andrzej Sikorski 1, Andrzej Kolinski 1,2 and Jeffrey Skolnick 2 1 Department of Chemistry,

More information

BIOCHEMISTRY Course Outline (Fall, 2011)

BIOCHEMISTRY Course Outline (Fall, 2011) BIOCHEMISTRY 402 - Course Outline (Fall, 2011) Number OVERVIEW OF LECTURE TOPICS: of Lectures INSTRUCTOR 1. Structural Components of Proteins G. Brayer (a) Amino Acids and the Polypeptide Chain Backbone...2

More information

Outline. Levels of Protein Structure. Primary (1 ) Structure. Lecture 6:Protein Architecture II: Secondary Structure or From peptides to proteins

Outline. Levels of Protein Structure. Primary (1 ) Structure. Lecture 6:Protein Architecture II: Secondary Structure or From peptides to proteins Lecture 6:Protein Architecture II: Secondary Structure or From peptides to proteins Margaret Daugherty Fall 2004 Outline Four levels of structure are used to describe proteins; Alpha helices and beta sheets

More information

Protein Structures: Experiments and Modeling. Patrice Koehl

Protein Structures: Experiments and Modeling. Patrice Koehl Protein Structures: Experiments and Modeling Patrice Koehl Structural Bioinformatics: Proteins Proteins: Sources of Structure Information Proteins: Homology Modeling Proteins: Ab initio prediction Proteins:

More information

AN AB INITIO STUDY OF INTERMOLECULAR INTERACTIONS OF GLYCINE, ALANINE AND VALINE DIPEPTIDE-FORMALDEHYDE DIMERS

AN AB INITIO STUDY OF INTERMOLECULAR INTERACTIONS OF GLYCINE, ALANINE AND VALINE DIPEPTIDE-FORMALDEHYDE DIMERS Journal of Undergraduate Chemistry Research, 2004, 1, 15 AN AB INITIO STUDY OF INTERMOLECULAR INTERACTIONS OF GLYCINE, ALANINE AND VALINE DIPEPTIDE-FORMALDEHYDE DIMERS J.R. Foley* and R.D. Parra Chemistry

More information

Biomolecules: lecture 9

Biomolecules: lecture 9 Biomolecules: lecture 9 - understanding further why amino acids are the building block for proteins - understanding the chemical properties amino acids bring to proteins - realizing that many proteins

More information

Biomolecules: lecture 10

Biomolecules: lecture 10 Biomolecules: lecture 10 - understanding in detail how protein 3D structures form - realize that protein molecules are not static wire models but instead dynamic, where in principle every atom moves (yet

More information

Supplementary Information

Supplementary Information 1 Supplementary Information Figure S1 The V=0.5 Harker section of an anomalous difference Patterson map calculated using diffraction data from the NNQQNY crystal at 1.3 Å resolution. The position of the

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Table of Contents Page Supplementary Table 1. Diffraction data collection statistics 2 Supplementary Table 2. Crystallographic refinement statistics 3 Supplementary Fig. 1. casic1mfc packing in the R3

More information

Protein Secondary Structure Prediction using Pattern Recognition Neural Network

Protein Secondary Structure Prediction using Pattern Recognition Neural Network Protein Secondary Structure Prediction using Pattern Recognition Neural Network P.V. Nageswara Rao 1 (nagesh@gitam.edu), T. Uma Devi 1, DSVGK Kaladhar 1, G.R. Sridhar 2, Allam Appa Rao 3 1 GITAM University,

More information

Details of Protein Structure

Details of Protein Structure Details of Protein Structure Function, evolution & experimental methods Thomas Blicher, Center for Biological Sequence Analysis Anne Mølgaard, Kemisk Institut, Københavns Universitet Learning Objectives

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary Results DNA binding property of the SRA domain was examined by an electrophoresis mobility shift assay (EMSA) using synthesized 12-bp oligonucleotide duplexes containing unmodified, hemi-methylated,

More information

Bioinformatics III Structural Bioinformatics and Genome Analysis Part Protein Secondary Structure Prediction. Sepp Hochreiter

Bioinformatics III Structural Bioinformatics and Genome Analysis Part Protein Secondary Structure Prediction. Sepp Hochreiter Bioinformatics III Structural Bioinformatics and Genome Analysis Part Protein Secondary Structure Prediction Institute of Bioinformatics Johannes Kepler University, Linz, Austria Chapter 4 Protein Secondary

More information

1) NMR is a method of chemical analysis. (Who uses NMR in this way?) 2) NMR is used as a method for medical imaging. (called MRI )

1) NMR is a method of chemical analysis. (Who uses NMR in this way?) 2) NMR is used as a method for medical imaging. (called MRI ) Uses of NMR: 1) NMR is a method of chemical analysis. (Who uses NMR in this way?) 2) NMR is used as a method for medical imaging. (called MRI ) 3) NMR is used as a method for determining of protein, DNA,

More information

Summary of Experimental Protein Structure Determination. Key Elements

Summary of Experimental Protein Structure Determination. Key Elements Programme 8.00-8.20 Summary of last week s lecture and quiz 8.20-9.00 Structure validation 9.00-9.15 Break 9.15-11.00 Exercise: Structure validation tutorial 11.00-11.10 Break 11.10-11.40 Summary & discussion

More information

I690/B680 Structural Bioinformatics Spring Protein Structure Determination by NMR Spectroscopy

I690/B680 Structural Bioinformatics Spring Protein Structure Determination by NMR Spectroscopy I690/B680 Structural Bioinformatics Spring 2006 Protein Structure Determination by NMR Spectroscopy Suggested Reading (1) Van Holde, Johnson, Ho. Principles of Physical Biochemistry, 2 nd Ed., Prentice

More information

Physiochemical Properties of Residues

Physiochemical Properties of Residues Physiochemical Properties of Residues Various Sources C N Cα R Slide 1 Conformational Propensities Conformational Propensity is the frequency in which a residue adopts a given conformation (in a polypeptide)

More information

Protein Structure Determination

Protein Structure Determination Protein Structure Determination Given a protein sequence, determine its 3D structure 1 MIKLGIVMDP IANINIKKDS SFAMLLEAQR RGYELHYMEM GDLYLINGEA 51 RAHTRTLNVK QNYEEWFSFV GEQDLPLADL DVILMRKDPP FDTEFIYATY 101

More information

Molecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007

Molecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007 Molecular Modeling Prediction of Protein 3D Structure from Sequence Vimalkumar Velayudhan Jain Institute of Vocational and Advanced Studies May 21, 2007 Vimalkumar Velayudhan Molecular Modeling 1/23 Outline

More information

LS1a Fall 2014 Problem Set #2 Due Monday 10/6 at 6 pm in the drop boxes on the Science Center 2 nd Floor

LS1a Fall 2014 Problem Set #2 Due Monday 10/6 at 6 pm in the drop boxes on the Science Center 2 nd Floor LS1a Fall 2014 Problem Set #2 Due Monday 10/6 at 6 pm in the drop boxes on the Science Center 2 nd Floor Note: Adequate space is given for each answer. Questions that require a brief explanation should

More information