CS483 Assignment #2
Due date: Mar. 1 at the start of class.

Protein Geometry
Bedbug spit? Just say NO!

Purpose of this assignment

    Get familiar with PDBsum and the PDB
    Extract atomic coordinates from protein data files
    Compute bond angles and dihedral angles

This assignment uses coordinate data for two proteins, both obtainable from the PDB:

    1bz0    Human Hemoglobin A
    1yjh    Crystal Structure of Cimex Nitrophorin Ferrous NO Complex

Some background information

Human hemoglobin A is a protein composed predominantly of alpha helices. Figure 1 contains a view of 1bz0 with its four heme groups visible within the tetramer.

The next protein, 1yjh, also contains some helices, but we will pay more attention to the 11 beta strands that are also in this protein. See Figure 2 for a view of 1yjh. It is quite remarkable that this protein also contains a heme group, but not for oxygen transport. The protein is found in the saliva of Cimex lectularius, a bloodsucking bedbug that injects small amounts of its saliva into the host when it is taking a blood meal. The heme group is used to store and transport nitric oxide (NO), a highly reactive free-radical molecule with a half-life of less than one second in the presence of biological tissue. If you look carefully at Figure 2 you can see the blue/red NO molecule hovering above the heme group. When saliva is injected into the wound, NO from the heme group induces a local vasodilation and also inhibits blood coagulation. Both of these effects help the bug get its meal.

It is not just bedbugs that profit from nitric oxide. Pfizer has a revenue stream of over 1 billion dollars per year selling sildenafil, which also promotes vasodilation by enhancing the effect of locally produced nitric oxide.
Figure 1: Ribbon diagram of 1bz0 showing four heme groups
Figure 2: Ribbon diagram of 1yjh showing heme group with NO
Overview of the assignment

The first part of the assignment deals with the calculation of bond angles and bond lengths for atoms in the backbone of the proteins. In the second part of the assignment you will calculate dihedral angles that will be used to generate PHI/PSI plots. You will generate two plots: one built from dihedral angles taken only from alpha helices and another built from dihedral angles taken only from beta strands.

Preparatory Steps

A) Download the two protein files from the PDB.

B) Go to PDBsum and find the 1bz0 entry. Click on the Protein tab that is next to the Top page tab of the first page for 1bz0. You will then get a page showing a cartoon of the secondary structure for the protein. Click on the Domain 1 icon and you should get a pop-up window that again shows a cartoon of the secondary structure, but with a set of numbers that tell you where the helices begin and end. You will need these numbers to know where the helices start and stop in the PDB listing of the backbone atoms.

C) Repeat the previous step for 1yjh, but this time record the start and stop information for the beta strands.

Work for the Assignment

Part A Steps

1. Write a program that will extract the coordinates of all backbone atoms for the 8 helices that are present in the first chain of 1bz0. You should end up with 8 helix data sets. Each data set should contain the coordinates of atoms CA, C, O, N for each residue in the helix. See Figure 3 for an illustration of the backbone atoms. In Part B you will be using this data to generate dihedral angles, so be sure that you have enough information to do these computations. For example, the PHI dihedral of the first CA at the beginning of the first helix (say residue 3 of 1bz0) will require the coordinates of the C atom in the backbone of residue 2.
Also, the PSI dihedral of the last CA in a helix (say residue 18 of 1bz0) will need the coordinates of the N atom in the backbone of the next residue (number 19). These last two sentences should tell you exactly where to make the cuts when creating your data sets.
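The cuts described above can be sketched in Python. This is only a minimal sketch under stated assumptions, not a required design: the function names read_backbone and cut_segment are hypothetical, the column positions follow the fixed-width ATOM record layout of the PDB file format, insertion codes are ignored, and alternate locations other than blank or A are skipped as a simplification.

```python
BACKBONE = ("N", "CA", "C", "O")

def read_backbone(pdb_lines, chain_id):
    """Return {residue_number: {atom_name: (x, y, z)}} for one chain.

    Only ATOM records are read, so HETATM entries (heme, NO, water)
    are skipped. Column slices follow the fixed-width PDB layout.
    """
    residues = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atom = line[12:16].strip()      # atom name, columns 13-16
        alt_loc = line[16]              # alternate location, column 17
        if line[21] != chain_id or atom not in BACKBONE:
            continue
        if alt_loc not in (" ", "A"):   # assumption: keep first conformer only
            continue
        res_seq = int(line[22:26])      # residue number, columns 23-26
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.setdefault(res_seq, {})[atom] = xyz
    return residues

def cut_segment(residues, start, end):
    """Return (prev_c, segment, next_n) for residues start..end.

    prev_c is the C atom of residue start-1 (needed for the first PHI),
    next_n is the N atom of residue end+1 (needed for the last PSI).
    Either is None when the flanking residue is absent from the chain.
    """
    segment = {i: residues[i] for i in range(start, end + 1)}
    prev_c = residues.get(start - 1, {}).get("C")
    next_n = residues.get(end + 1, {}).get("N")
    return prev_c, segment, next_n
```

With the illustrative numbers used in the text, a call such as cut_segment(residues, 3, 18) would return the C atom of residue 2, the backbone atoms of residues 3 to 18, and the N atom of residue 19. The actual start and stop values must come from PDBsum.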
Do the same for the 11 beta strands that are in 1yjh. In this case, you should end up with 11 beta strand data sets.

2. If you inspect Figure 3, you will see that there are 4 non-hydrogen bond angles and 4 bond lengths. Write a function that computes these bond angles and bond lengths when it is given the coordinates of CA, C, O, N, CA as in the figure.

3. Apply this function to all the peptide bonds in all the helix data sets. For each of the 4 bond angles compute the mean µ and variance σ². Use these values to define the probability density function (PDF):

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

For each of the 4 bond angles you will have a PDF. Repeat these calculations for the 4 bond lengths to get their PDFs.

4. Repeat step 3 using the beta strand data sets. You should now have 16 PDFs.

5. Generate 8 diagrams that compare the PDFs. For example, a diagram would show a plot of the PDF for a particular bond angle in the helix data sets and the overlapping PDF for that same bond angle in the beta data sets. So you will hand in 4 diagrams for the bond angles, each having the two overlapping PDF plots, and 4 diagrams for the bond lengths, each having the two overlapping PDF plots. Be sure to properly label each plot, identifying the atoms in the bond angle or bond length and also specifying for each PDF its data source (alpha helix or beta strand).

Figure 3: Typical bond angles and bond lengths for the peptide bond
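Steps 2 and 3 of Part A can be sketched as follows. This is a hedged illustration only: all function names are hypothetical, coordinates are assumed to be (x, y, z) tuples in the order CA, C, O, N, CA, and the choice of the four angles (CA-C-O, CA-C-N, O-C-N, C-N-CA) and four lengths (CA-C, C-O, C-N, N-CA) is an assumption that should be checked against Figure 3. The handout does not say whether to use the sample or the population variance; the sketch uses the sample variance.

```python
import math
from statistics import fmean, variance

def bond_length(a, b):
    """Distance between two atoms given as (x, y, z) tuples."""
    return math.dist(a, b)

def bond_angle(a, b, c):
    """Angle a-b-c in degrees, measured at the middle atom b."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    cos_t = sum(p * q for p, q in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def peptide_geometry(ca1, c, o, n, ca2):
    """Return (lengths, angles) for one peptide bond.

    The atom groupings below are assumed, not taken from Figure 3.
    """
    lengths = {
        "CA-C": bond_length(ca1, c),
        "C-O": bond_length(c, o),
        "C-N": bond_length(c, n),
        "N-CA": bond_length(n, ca2),
    }
    angles = {
        "CA-C-O": bond_angle(ca1, c, o),
        "CA-C-N": bond_angle(ca1, c, n),
        "O-C-N": bond_angle(o, c, n),
        "C-N-CA": bond_angle(c, n, ca2),
    }
    return lengths, angles

def fit_normal(samples):
    """Return (mean, variance); the sample variance (n - 1) is an assumption."""
    return fmean(samples), variance(samples)

def normal_pdf(x, mu, var):
    """Normal density with mean mu and variance var, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
```

Collecting one value per peptide bond for, say, the CA-C-N angle across all helix data sets and passing the list to fit_normal gives the (µ, σ²) pair for that PDF; evaluating normal_pdf over a range of x values gives the curve to plot.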
Part B Steps

1. Write a function that computes the PHI, PSI dihedral angles for an alpha carbon.

2. Using this function, compute PHI and PSI dihedral angles for all alpha carbons in all the helix data sets. Generate a Ramachandran PHI/PSI scatter plot where each point in the plot corresponds to a (PHI, PSI) pair of some alpha carbon. The axes should be labeled in the same way as the Procheck plot that you can access on the Top page of a protein when looking at a PDBsum entry. The plots that you hand in should be similar to the blue point clusters that you see for Procheck (you will not need to do the multiple colourings: red, brown, yellow, etc. seen in these figures). You should see a cluster of points in the plot. Compute the centroid of this cluster and place it in the plot as a special point. Its coordinates should also be stated.

3. Repeat this last step for all alpha carbons in all the beta strand data sets.

What to Hand In [marks]

[15] Well documented source code listings for:
     Data extraction in step 1 of Part A.
     Bond angle and bond length calculations.
     PHI, PSI dihedral angle calculations.

[10] The 8 PDF comparisons of step 5. Each PDF should be accompanied by a pair that indicates the mean µ and variance σ² values.

[8] The two Ramachandran plots as described in steps 2 and 3 of Part B.

[7] Your conclusions. Discuss what you observe in your output data.

[5] BONUS question. As part of your conclusions, use Student's t-test to compare the normal distributions of Part A. If you cannot recall Student's t-test you can find a review at: http://en.wikipedia.org/wiki/student's_t-test.

The assignment will be marked out of 40.
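For the PHI/PSI calculation in Part B, one common way to compute a signed dihedral angle from four points uses cross products and atan2. The sketch below is one possible approach, not a prescribed one; the function names are hypothetical. It assumes PHI is the dihedral over C(i-1), N(i), CA(i), C(i) and PSI is the dihedral over N(i), CA(i), C(i), N(i+1), with angles reported in degrees in the range -180 to 180.

```python
import math

def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal to the plane through p0, p1, p2
    n2 = _cross(b2, b3)   # normal to the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(prev_c, n, ca, c, next_n):
    """Return (PHI, PSI) for the alpha carbon ca."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)

def centroid(points):
    """Arithmetic mean of a list of (PHI, PSI) pairs."""
    phis, psis = zip(*points)
    return sum(phis) / len(phis), sum(psis) / len(psis)
```

A plain arithmetic mean treats -179 and +179 degrees as far apart, so the centroid is only meaningful when the cluster does not straddle the edge of the plot; this is worth checking for the beta strand data.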