Programme
8.00-8.20 Summary of last week's lecture and quiz
8.20-9.00 Structure validation
9.00-9.15 Break
9.15-11.00 Exercise: Structure validation tutorial
11.00-11.10 Break
11.10-11.40 Summary & discussion
11.40-12.00 Quiz
Feedback Persons 2
Summary of Experimental Protein Structure Determination Key Elements
Learning Objectives After last week you should be able to: Give an outline of the most important steps (and obstacles) in protein structure determination by X-ray crystallography and NMR spectroscopy. Identify relevant parameters for evaluating the quality of protein structures determined by X-ray crystallography and NMR spectroscopy.
X-Ray Crystallography in Brief
General method for structure determination
Very accurate/well-defined structures
Only little information on dynamics (B-factors)
For quality check, remember the three Rs:
  Resolution
  R/Rfree
  Ramachandran plot
The Experiment(s)
X-Ray Diffraction and Resolution
Bragg's equation: $n\lambda = 2d\sin\theta$, so $\theta = \sin^{-1}\left(\frac{n\lambda}{2d}\right)$
(Figure: diffraction image)
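As a small worked illustration of Bragg's equation, the sketch below computes the diffraction angle for a given lattice spacing, and the smallest spacing (the resolution limit) reachable at a given maximum angle. The function names and the Cu K-alpha wavelength of 1.5418 Å are assumptions chosen for the example, not values from the lecture.

```python
import math


def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Return theta in degrees from n*lambda = 2*d*sin(theta)."""
    s = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        # sin(theta) cannot exceed 1, so this reflection cannot be observed
        raise ValueError("n*lambda/(2d) must lie in (0, 1]")
    return math.degrees(math.asin(s))


def resolution_limit(wavelength, theta_max_deg):
    """Smallest d-spacing (first order) recorded at the largest angle."""
    return wavelength / (2.0 * math.sin(math.radians(theta_max_deg)))


# Assumed example: Cu K-alpha radiation and a 2.0 Å lattice spacing.
CU_K_ALPHA = 1.5418  # Å
theta = bragg_angle_deg(d_spacing=2.0, wavelength=CU_K_ALPHA)
print(f"theta = {theta:.2f} degrees")
```

Smaller spacings diffract to larger angles, which is why high resolution data sit at the edge of the diffraction image.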
Other Parameters
Non-crystallographic symmetry (NCS)
  Subunit symmetry not coinciding with crystal symmetry.
  Assume (= force) subunits to have identical conformation (strict/non-strict).
  Reduces the number of free parameters in refinement.
  Used when the amount of data is insufficient (low resolution).
B-factors
  Not refined
  Overall (from data)
  Grouped (per amino acid, or backbone/side chain)
  Individual (atomic)
  Anisotropic (Res. < 1.5 Å)
Geometry
  Side chain rotamers
  Bond lengths
  Angles
  G-factor (overall geometrical quality)
Bottlenecks Getting the protein in sufficient quantity and purity Crystallisation (trial and error) Diffraction/resolution limit (both high and low)
NMR Basics
NMR is nuclear magnetic resonance
NMR spectroscopy is done on proteins IN SOLUTION
Only the nuclei 1H, 13C, 15N (and 31P) can be detected in NMR experiments
Proteins up to 30 kDa
Proteins stable at high concentration (0.5-1 mM), preferably at room temperature
Distance Restraints Distance restraints are derived from spectra
Deposited NMR structure
Finally, an ensemble of the 10-25 structures with the lowest total energy and the fewest violations is deposited in the Protein Data Bank
The restraints are deposited in the BioMagResBank (BMRB)
The ensemble shows possible structures within the space defined by the constraints
(Figure: the ensemble of 20 NMR structures of the U-box domain from AtPUB14 deposited in the PDB (1T1H). Andersen et al., JBC, 2004)
Evaluation of NMR Structures Homo-/hetero- nuclear NMR? RMSD (be careful)? Types of constraints? Numbers of constraints? Violations? Specific regions?
Related Courses @ DTU 26325 Protein Crystallography Spring course, 13 weeks Contact: Pernille Harris, build. 206, room 212, (+45) 4525 2024, ph@kemi.dtu.dk
Structure Validation Originally by Anne Mølgaard, University of Copenhagen
Structure Validation
A homology model will never be better than the template it is based on, so make sure the template is good!
Learning objective
After today you will be able to tell a good crystal structure from a bad one!
(Figure: 1hgf viewed in Jmol and in PyMOL)
The Protein Data Bank February, 2012
X-ray Crystallography vs. NMR
X-ray crystallography
  Proteins of any size
  Proteins in crystal
  Complete data/total map of structure
  Many details, one model
  Resolution, R-values, Ramachandran plot
NMR spectroscopy
  Proteins below 50 kDa
  Proteins in solution
  Incomplete data
  Fewer details, many models
  Restraint violations, RMSD, Ramachandran plot
Ramachandran plot
Ramachandran Plot I
Procheck (Laskowski)
4 categories:
  Most favored
  Additionally allowed
  Generously allowed
  Disallowed
Good structures: more than 90% of residues in most favored regions and none or only a few in disallowed regions
PDBSum: http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/
Ramachandran Plot II
Uppsala Ramachandran server (Kleywegt & Jones): newer analysis of the PDB
Two categories:
  Allowed
  Disallowed
More than 95% of residues in allowed regions
http://eds.bmc.uu.se/ramachan.html
Ramachandran Plot III
MolProbity: current PDB standard
Three categories:
  Favoured (>98%)
  Allowed (>99.8%)
  Disallowed
Also outputs plots for:
  Glycine
  Proline
  Pre-proline
http://molprobity.biochem.duke.edu/index.php
Ramachandran plot Good structures should have the majority of the residues in the most favored regions: Procheck: >90% Uppsala: >95% MolProbity: >98% and none or only a few in the disallowed regions. Residues in disallowed regions are either wrong or potentially very interesting.
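The three thresholds above can be turned into a simple check. The sketch below is only an illustration: the function name, the argument names and the default of tolerating zero disallowed residues are assumptions, since the slides say "none or only a few" without giving a number.

```python
# Minimum fraction of residues in the best region, as quoted on the slides.
FAVOURED_THRESHOLDS = {
    "Procheck": 0.90,
    "Uppsala": 0.95,
    "MolProbity": 0.98,
}


def ramachandran_ok(tool, n_favoured, n_total, n_disallowed=0, max_disallowed=0):
    """Return True if the residue counts pass the rule of thumb for `tool`.

    max_disallowed is an assumed cut-off for "only a few" outliers.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    fraction = n_favoured / n_total
    enough_favoured = fraction > FAVOURED_THRESHOLDS[tool]
    few_outliers = n_disallowed <= max_disallowed
    return enough_favoured and few_outliers


# Hypothetical counts: 92 of 100 residues in the best region, no outliers.
for tool in FAVOURED_THRESHOLDS:
    print(tool, ramachandran_ok(tool, n_favoured=92, n_total=100))
```

The same structure can pass one tool and fail another, so always note which server produced the numbers. A residue that fails the check is not automatically wrong; it may be one of the interesting cases.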
Strange But Correct Mølgaard & Larsen (2002) Acta Cryst. D58, 111-119.
R-factor, Rfree
Structure factor: $F(hkl) = \sum_{j=1}^{N} f_j \exp\left[2\pi i\,(hx_j + ky_j + lz_j)\right]$
Individual reflections: $I(hkl) \propto |F_{obs}(hkl)|^2$
Take out 5-10% of the data and use it to calculate Rfree
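To make the idea of a held-out test set concrete, here is a minimal sketch. It assumes the standard definition R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), which is not written out on the slide, and it assumes the observed and calculated amplitudes are already on the same scale. The function names and the random split are illustrative only; real refinement programs handle scaling and test-set selection themselves.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Pick a random test set (5-10% of the data) that refinement never sees."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(n_reflections * free_fraction))
    free = set(indices[:n_free])
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Hypothetical amplitudes for four reflections.
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [9.0, 22.0, 29.0, 41.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

R is computed over the working reflections and Rfree over the test reflections with the same formula. Because the test set is never used in refinement, Rfree shows whether the model has been over-fitted.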
R-factor, Rfree
R-factors vary from < 10% (high resolution structures) to 59% (theoretical random structure).
Rule of thumb for good quality structures: R ≈ Resolution/10
Rfree should be < 30% and not much more than 5% (points) higher than R: R + 0.05 > Rfree > R
Key Parameters
Resolution
R values
  Agreement between data and model.
  Usually between 0.15 and 0.25, should not exceed 0.30.
  R ~ Resolution / 10
  R + 0.05 > Rfree > R
Ramachandran plot
  The majority of residues in most favoured regions.
B factors
  Contributions from static and dynamic disorder
  Well determined ~10-20 Å², intermediate ~20-30 Å², flexible 30-50 Å², invisible >60 Å².
1p7q R, Rfree: 21.8%, 30.9% Resolution: 3.4 Å Ramachandran plot B.E. Willcox et al. Nat Immunol. 2003 Sep;4(9):913-9.
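The rules of thumb for R and Rfree can be applied mechanically to the numbers reported for 1p7q. The sketch below is an illustration only: the function names and warning texts are assumptions, and the thresholds are the ones quoted in this lecture (R at most 0.30, Rfree below 0.30, Rfree above R by no more than 0.05).

```python
def expected_r(resolution):
    """Rule of thumb: R is roughly Resolution / 10 (resolution in Å)."""
    return resolution / 10.0


def check_r_values(r, r_free):
    """Return a list of warnings; r and r_free are fractions, e.g. 0.218."""
    warnings = []
    if r > 0.30:
        warnings.append("R exceeds 0.30")
    if r_free >= 0.30:
        warnings.append("Rfree is not below 0.30")
    if r_free <= r:
        warnings.append("Rfree is not higher than R")
    if r_free - r > 0.05:
        warnings.append("Rfree is more than 0.05 above R")
    return warnings


# Values reported for 1p7q: R = 21.8%, Rfree = 30.9%, resolution 3.4 Å.
for message in check_r_values(r=0.218, r_free=0.309):
    print("1p7q:", message)
print(f"Rule-of-thumb R at 3.4 Å: {expected_r(3.4):.2f}")
```

For 1p7q the check flags both the absolute Rfree value and the large gap between R and Rfree, which suggests over-fitting at low resolution.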
Other parameters
B-factors
  Contributions from static and dynamic disorder
  Well determined ~10-20 Å²
  Intermediate ~20-30 Å²
  Flexible 30-50 Å²
  Invisible >60 Å²
  Should be (mostly) > 5 Å² and < 50 Å²
Occupancies
  Between 0 and 1 (0-100%)
  Rarely below 0.4 (40%)
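The B-factor ranges and occupancy limits above can be sketched as two small helper functions. This is a rough illustration with assumed names and labels. The slide leaves values between 50 and 60 Å² and values between 5 and 10 Å² uncategorised, so the handling of those ranges here is an assumption.

```python
def classify_b_factor(b):
    """Map an atomic B-factor (Å^2) to the rough categories from the slide."""
    if b < 5.0:
        return "suspiciously low"      # slide: should be (mostly) > 5
    if b > 60.0:
        return "invisible"
    if b > 50.0:
        return "unclassified (50-60)"  # assumption: gap left open on the slide
    if b > 30.0:
        return "flexible"
    if b > 20.0:
        return "intermediate"
    return "well determined"           # assumption: also covers 5-10


def occupancy_note(occ):
    """Flag occupancies outside 0-1, or below the rarely seen 0.4."""
    if not 0.0 <= occ <= 1.0:
        return "invalid"
    if occ < 0.4:
        return "unusually low"
    return "ok"


# Hypothetical atoms given as (B-factor, occupancy) pairs.
for b, occ in [(12.5, 1.0), (27.0, 0.5), (72.3, 0.3)]:
    print(b, classify_b_factor(b), occ, occupancy_note(occ))
```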
B-factors: 1xr9, 1.79 Å
(Figure: PDB ATOM records with the columns atom #, chain ID, resi #, x, y, z, occ, B-factor)
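The columns labelled on this slide sit at fixed character positions in the PDB file format, so occupancy and B-factor can be read with plain string slicing. The sketch below shows one way to do this. The sample record is built piece by piece to make the column layout visible; its values are illustrative and are not taken from 1xr9, and the function and key names are assumptions.

```python
# An illustrative ATOM record, assembled field by field (1-based columns).
SAMPLE = (
    "ATOM  "      # cols  1-6   record name
    "    1"       # cols  7-11  atom serial number (atom #)
    " "           # col  12     blank
    " N  "        # cols 13-16  atom name
    " "           # col  17     alternate location indicator
    "MET"         # cols 18-20  residue name
    " "           # col  21     blank
    "A"           # col  22     chain ID
    "   1"        # cols 23-26  residue number (resi #)
    "    "        # cols 27-30  insertion code and blanks
    "  27.340"    # cols 31-38  x
    "  24.430"    # cols 39-46  y
    "   2.614"    # cols 47-54  z
    "  1.00"      # cols 55-60  occupancy
    "  9.67"      # cols 61-66  B-factor
)


def parse_atom_line(line):
    """Extract the fields shown on the slide from one ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


atom = parse_atom_line(SAMPLE)
print(atom["chain"], atom["res_seq"], atom["occupancy"], atom["b_factor"])
```

Slicing by column is safer than splitting on whitespace, because neighbouring fields in a PDB file may touch each other without a separating space.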
B-factors 1p7q, 3.4Å
1hhh, 3.0Å Madden et al. Cell. 1993 Nov 19;75(4):693-708.
1au4, 2.3 Å, R = 26.6%, Rfree = 37.0%
Yamashita et al. J. Am. Chem. Soc. 119, 11351 (1997)
1a0h, 3.2Å Martin et al. Structure. 1997 Dec 15;5(12):1681-93
Occupancies, 1ea3, 2.3Å
The Electron Density Server
Something's Fishy in 2hr0
No correlation between B-factors and solvent accessibility.
Large layers of solvent in the c-direction (30-40 Å thick) that span the entire unit cell (i.e. nothing to hold the crystal together).
Absence of bulk solvent in diffraction data.
80% solvent, yet a resolution of 2.3 Å.
Perfect electron density around impossible geometries.
Impossible merging statistics for data from four crystals (Rmerge = 0.11 in the last resolution shell, with I/sig(I) = 1.32).
Diffraction Precision Index DPI (with Rfree): DPI (no Rfree): Scaled by B-factor Cruickshank 1999, Acta Cryst D
DPI vs. Resolution 2BSU
2bsu, 1.60 Å, R = 31.9%, Rfree = 33.6%
Has now been replaced by 2v2w
Take Home Message
It is easy to find bad structures in the PDB, so validate!
  Resolution
  Ramachandran
  Rfree
  RMSD (NMR)
  Restraint violations (NMR)
High impact journal ≠ high quality structure!
Break!