BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher, Summer 2014
5. Structure Elucidation

Overview
- Protein structure elucidation
- X-ray diffraction (XRD)
  - Crystallization
  - Physics of electron diffraction
  - Phasing, modeling, refinement
- Nuclear magnetic resonance spectroscopy (NMR)
  - Physical foundations of NMR spectroscopy
  - 1D and 2D spectra and their interpretation
- Comparison NMR and XRD
- Structural databases: the PDB

X-Ray Crystallography
[Figure: X-ray source, protein crystal, detector, analysis]

X-Rays
[Figure]

Protein Crystals
- Proteins are difficult to crystallize
  - Irregular shape leads to large holes in the crystal
  - Rather large crystals required (0.1 to 0.5 mm)
  - Large amounts of protein necessary
  - Protein needs to be very pure
  - Crystal growth is very slow (weeks to months)
  - Some proteins do not crystallize at all (membrane proteins!)
(Branden, Tooze, p. 376)

Crystallization
- Hanging drop
[Figure] (Nölting, p. 70; Branden/Tooze, p. 376)

Protein Crystals
- Regular arrangement of protein molecules in a three-dimensional lattice
- Irregular shape of proteins causes water-filled holes in the crystal, hence the high water content (20 to 90%)
- Unit cell: smallest subunit of the crystal from which the whole crystal can be created by translation
(Branden, Tooze, p. 375)

Protein Crystal
- Example: Fab
  - The unit cell contains two copies of Fab
  - The crystal is formed by translation of this unit cell along a regular lattice
[Figure]

X-ray Diffraction of Proteins
- Bernal and Crowfoot observed in 1934 that pepsin crystals create a well-defined diffraction pattern
- Nearly three decades and the invention of computers were necessary until Kendrew and Perutz could solve the first structures in 1960 (myoglobin, hemoglobin)
[Photos: Max Perutz, John Kendrew]

Wave Equations
- Euler's formula: $e^{i\varphi} = \cos\varphi + i \sin\varphi$
- Any periodic sine or cosine function can be represented as a complex exponential function.
- Example: [equation not recovered from slide]

Wave Equations
- The wave at time t and position r is described by: [equation not recovered from slide]
  with unit vector s pointing along the direction of the wave front, frequency ω, wavelength λ, phase φ, and i² = -1
[Figure: plane wave with direction s, wavelength λ, and phase φ]

Interference
- Constructive: amplitude increases
- Destructive: amplitude decreases
- Depends on the phase difference
- Interference of two coherent waves E_1 and E_2 of equal amplitude E_0: [equation not recovered from slide]

Interference
[Equation: the sum of E_1 and E_2, written as a constant factor times a phase factor]
- The resulting wave has the same frequency as the original waves E_1 and E_2
- Its amplitude depends on the phase factor, i.e., on the phase difference φ
- The amplitude is easily observable

Diffraction at Two Centers
[Figure: incoming wave with direction s_0 diffracted at the origin and at position r into direction s; diffraction angle 2θ; λS = s - s_0]
- The retardation of the interfering waves is r·s - r·s_0, and thus the phase difference is
  $\varphi = 2\pi\lambda^{-1}\, \mathbf{r}\cdot(\mathbf{s} - \mathbf{s}_0) = 2\pi\, \mathbf{r}\cdot\mathbf{S}$ with $\mathbf{S} = (\mathbf{s} - \mathbf{s}_0)/\lambda$
- Considering the ratio of the wave E diffracted at r and the wave E_0(r, t) diffracted at the origin yields:
  $E / E_0 = e^{i\varphi} = e^{2\pi i\, \mathbf{r}\cdot\mathbf{S}}$
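
The interference and two-center relations above can be checked numerically. The following is a minimal sketch, assuming the two coherent waves have the common form E_1 = E_0 exp(iωt) and E_2 = E_0 exp(i(ωt + φ)); this form, the function names, and the example geometry are illustrative assumptions, not taken from the slides.

```python
import cmath
import math


def interference_amplitude(phi, e0=1.0):
    """Amplitude of E1 + E2 for two coherent waves with phase difference phi."""
    # Assumed form: E1 = e0*exp(i*w*t), E2 = e0*exp(i*(w*t + phi)).
    # The shared factor e0*exp(i*w*t) has modulus e0, so only the
    # phase factor (1 + exp(i*phi)) changes the amplitude.
    return e0 * abs(1 + cmath.exp(1j * phi))


def scattering_vector(s, s0, wavelength):
    """S = (s - s0) / lambda for unit vectors s0 (incoming) and s (diffracted)."""
    return tuple((a - b) / wavelength for a, b in zip(s, s0))


def phase_difference(r, s, s0, wavelength):
    """phi = 2*pi * r.S for a second diffracting center at position r."""
    S = scattering_vector(s, s0, wavelength)
    return 2 * math.pi * sum(ri * Si for ri, Si in zip(r, S))


if __name__ == "__main__":
    for phi in (0.0, math.pi / 2, math.pi):
        amp = interference_amplitude(phi)
        print(f"phi = {phi:.3f} rad -> amplitude = {amp:.3f} * E0")
    # Hypothetical geometry: beam along z, diffracted along x,
    # r and wavelength given in the same length unit.
    phi = phase_difference((1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0)
    print(f"phase difference = {phi:.4f} rad")
```

A phase difference of 0 doubles the amplitude (constructive interference), while a phase difference of π cancels it (destructive interference). In the forward direction (s = s_0) the phase difference vanishes for every r.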

Structure Factor
- Apart from the phase difference, the diffraction probability f_i of atom i is important: each atom contributes $f_i\, e^{2\pi i\, \mathbf{r}_i\cdot\mathbf{S}}$
- The whole diffraction pattern is then the sum of all diffracted waves originating from atoms i at unit cell positions r_i:
  $F(\mathbf{S}) = \sum_i f_i\, e^{2\pi i\, \mathbf{r}_i\cdot\mathbf{S}}$
- F is also called the structure factor
- The structure factor depends on atom positions and diffraction probabilities

Structure Factor
- The structure factor corresponds to the Fourier transform of the atom coordinates of the diffracting protein
- Diffraction occurs by interaction with the atoms' electron shells, not with the nuclei
- Thus, we introduce a continuous electron density ρ(r)
- F then becomes
  $F(\mathbf{S}) = \int \rho(\mathbf{r})\, e^{2\pi i\, \mathbf{r}\cdot\mathbf{S}}\, d\mathbf{r}$

Diffraction Pattern of a Protein
[Figure] (Nölting)
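
The structure factor sum can be evaluated directly for a handful of point atoms. This is a minimal sketch, assuming the discrete form F(S) = Σ f_i exp(2πi r_i·S) with constant, real f_i; the atom list and function names are hypothetical and chosen only for illustration.

```python
import cmath
import math


def structure_factor(atoms, S):
    """Sum f_i * exp(2*pi*i * r_i.S) over all atoms.

    atoms: list of (f_i, (x, y, z)) pairs; S: scattering vector (Sx, Sy, Sz).
    """
    total = 0j
    for f, r in atoms:
        phase = 2 * math.pi * sum(ri * Si for ri, Si in zip(r, S))
        total += f * cmath.exp(1j * phase)
    return total


def intensity(F):
    """Measured intensity I = F * conj(F) = |F|^2 (a real number)."""
    return (F * F.conjugate()).real


if __name__ == "__main__":
    # Hypothetical toy "unit cell": two identical atoms half a cell apart along x.
    atoms = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.0, 0.0))]
    for S in ((1.0, 0.0, 0.0), (2.0, 0.0, 0.0)):
        F = structure_factor(atoms, S)
        print(S, f"|F| = {abs(F):.3f}", f"phase = {cmath.phase(F):.3f} rad",
              f"I = {intensity(F):.3f}")
```

For S = (1, 0, 0) the two contributions are exactly out of phase and cancel; for S = (2, 0, 0) they add up. This is the interference behavior from the two-center case carried over to a sum over atoms.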

Fourier Analysis
[Three figures from Nölting; the last one shows the complex plane with axes Re and Im]

Phasing Problem
- The diffraction pattern corresponds to the Fourier transform of the electron density
- The inverse Fourier transform yields the electron density from it
- Problem: the detector measures intensity only, not the phase!
  $I = |F(\mathbf{S})|^2 = F(\mathbf{S})\, F^*(\mathbf{S})$
- Phase information, however, is required to compute the electron density!
- Phasing problem: reconstruction of the phase information
- Common way to solve this: heavy atom replacement
[Cartoon: John O'Brien, The New Yorker Collection 1991; Rhodes, p. 18]

Overview X-Ray Diffraction
[Figure] (Nölting, p. 68)

Electron Density Maps
[Figure]
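
The phasing problem can be made concrete with a one-dimensional toy density. The sketch below assumes a discrete Fourier transform with the same sign convention as the structure factor, F(h) = Σ ρ(x) exp(2πi hx/N); the grid, the density values, and the function names are hypothetical. It illustrates that the full complex F recovers ρ, that |F| alone does not, and that two different densities can share the same intensities.

```python
import cmath
import math


def dft(rho):
    """F(h) = sum_x rho(x) * exp(2*pi*i*h*x/N) on a grid of N points."""
    n = len(rho)
    return [sum(rho[x] * cmath.exp(2j * math.pi * h * x / n) for x in range(n))
            for h in range(n)]


def inverse_dft(F):
    """rho(x) = (1/N) * sum_h F(h) * exp(-2*pi*i*h*x/N), real part only."""
    n = len(F)
    return [sum(F[h] * cmath.exp(-2j * math.pi * h * x / n) for h in range(n)).real / n
            for x in range(n)]


def intensities(F):
    """What the detector records: |F(h)|^2, with the phase discarded."""
    return [abs(f) ** 2 for f in F]


if __name__ == "__main__":
    rho = [0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   # toy electron density
    shifted = rho[-2:] + rho[:-2]                     # same density, moved by 2 grid points
    F = dft(rho)

    print("with phases:   ", [round(v, 3) for v in inverse_dft(F)])
    print("amplitudes only:", [round(v, 3) for v in inverse_dft([abs(f) for f in F])])
    print("same intensities for shifted density:",
          all(math.isclose(a, b, abs_tol=1e-9)
              for a, b in zip(intensities(F), intensities(dft(shifted)))))
```

The amplitude-only reconstruction is not the original density, which is why the phases have to be recovered by other means such as heavy atom replacement.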

Electron Density Maps
[Figure]

Resolution
- The resolution of a structure determines its information content
- Determined by the quality of the crystal:
  - Purity
  - Inclusions
  - Water content
  - Stability under irradiation
- Resolution can be estimated from the diffraction pattern
[Figure] (Nölting)

Resolution
- Resolution determines which atomic details are recognizable
- Poor resolution (large value) blurs the details of the structure
- Resolution is measured in Å
- A resolution of 2 Å does not mean that the error of the atom coordinates is about 2 Å!
- The error in the atom coordinates would be about 0.3 Å in that case

Resolution
  Resolution [Å]   Information obtainable                                                       Rating
  4.0              Fold class, some secondary structures
  3.5              Helices and strands become distinguishable                                  poor
  3.0              Most side chains recognizable
  2.5              All side chains well defined, φ and ψ of the backbone partially
                   well defined, water can be seen                                              typical
  1.5              All backbone torsions well defined, first hydrogen atoms visible            very good
  1.0              Hydrogen atoms become visible                                               possible

Nuclear Magnetic Resonance
- 1H nuclei possess a nuclear magnetic moment
- In an external magnetic field B_0, every nucleus assumes one of two possible states (spins): α or β
- The two states differ in energy; spin state α (parallel to B_0) is energetically more favorable
- Addition of energy can invert the spin state
[Figure: energy levels of the α and β states in the field B_0, separated by ΔE = hν]

Nuclear Magnetic Resonance
- ΔE depends on
  - The magnitude of the external magnetic field
  - The electronic environment of the nucleus

Nuclear Magnetic Resonance
- ΔE
  - Can be measured (absorption)
  - Has a different magnitude for different types of atoms
- For a system of atoms we thus obtain an NMR spectrum

Angular Momentum
- Nuclei have a nuclear angular momentum P
- P can be considered the quantum mechanical analog of a classical angular momentum (which does not suggest that the nuclei are rotating in any way!)
- P depends on the spin quantum number I:
  $P = \sqrt{I(I+1)}\, \hbar$ ($\hbar = h/2\pi$, with Planck's constant h)
- I is a function of the nuclide, i.e., of the number of neutrons and protons in the nucleus
- For even-even (g,g) nuclei (even number of protons and neutrons) I becomes zero: invisible for NMR!

Magnetic Moment
- The magnetic moment µ = γP is proportional to the angular momentum
- The proportionality constant γ is called the magnetogyric ratio
- γ determines the sensitivity of the measurement: high γ = high sensitivity
- γ differs for each nuclide

Properties of Important Nuclides
  Nuclide   Nat. abundance [%]   I     γ [10^7 T^-1 s^-1]   Rel. sensitivity
  1H        99.985               1/2   26.7519              1.00
  2D        0.015                1     4.1066               0.01
  12C       98.9                 0     -                    -
  13C       1.1                  1/2   6.7283               0.01
  14N       99.63                1     1.9338               0.001
  15N       0.37                 1/2   -2.7126              0.001
  16O       99.96                0     -                    -
  17O       0.0037               5/2   -3.6280              0.03
  31P       100                  1/2   10.8394              0.07

Quantization of P
- In an external magnetic field with magnetic flux density B_0, the component of P (and of µ) along the direction of B_0 (z-axis) is quantized
- Possible states of the nuclear spin are described by the magnetic quantum number m:
  $P_z = m\hbar$ with $m = -I, -I+1, \ldots, +I$
  $\mu_z = \gamma m \hbar$
- For a nucleus with I = 1/2 (e.g., 1H) we obtain m = +1/2, -1/2, and thus there are two possible spin states:
  $P_{z,+1/2} = +\tfrac{1}{2}\hbar$ and $P_{z,-1/2} = -\tfrac{1}{2}\hbar$
[Figure: orientations P_{z,+1/2} and P_{z,-1/2} relative to the z-axis]

Energy in a Magnetic Field
- For simplicity, we will consider only nuclei with I = 1/2. Similar statements hold for other nuclei.
- Every atom with µ ≠ 0 is a magnetic dipole in an external magnetic field
- Classically, the energy E of a dipole is
  $E = -\mu_z B_0 = -m \gamma \hbar B_0$
- The energy difference between the two spin states is thus
  $\Delta E = E_{-1/2} - E_{+1/2} = \gamma \hbar B_0$
- This energy difference corresponds to a resonance frequency ν with $h\nu = \Delta E$

Energy in a Magnetic Field
- The resonance frequency depends on γ and B_0
- Stronger magnetic fields correspond to larger energy differences, which in turn correspond to higher resonance frequencies
[Figure: energies of the states m = +1/2 and m = -1/2 as a function of the field; the splitting grows from ΔE_1 at B_1 to ΔE_2 at B_2]
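
The relations ΔE = γħB_0 and hν = ΔE give the resonance frequency ν = γB_0/(2π). The sketch below evaluates it for a few nuclides, using the γ values from the table of important nuclides. Taking the magnitude of γ for nuclides with negative γ and the choice of field strengths are assumptions made for this illustration.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck's constant in J*s

# Magnetogyric ratios in T^-1 s^-1, from the nuclide table (values there are in 10^7).
GAMMA = {
    "1H": 26.7519e7,
    "13C": 6.7283e7,
    "15N": -2.7126e7,
    "31P": 10.8394e7,
}


def energy_difference(nuclide, b0):
    """Delta E = |gamma| * hbar * B0 in joules (spin-1/2 nuclei)."""
    hbar = PLANCK_H / (2 * math.pi)
    # Assumption: only the magnitude of gamma matters for the size of the splitting.
    return abs(GAMMA[nuclide]) * hbar * b0


def resonance_frequency(nuclide, b0):
    """nu = Delta E / h = |gamma| * B0 / (2*pi) in Hz."""
    return energy_difference(nuclide, b0) / PLANCK_H


if __name__ == "__main__":
    for b0 in (7.05, 14.1):  # field strengths in tesla, chosen for illustration
        for nuclide in GAMMA:
            nu = resonance_frequency(nuclide, b0)
            print(f"B0 = {b0:5.2f} T  {nuclide:>3}: {nu / 1e6:8.2f} MHz")
```

At B_0 = 7.05 T the 1H resonance lies at about 300 MHz, and doubling the field doubles every resonance frequency, in line with the statement that stronger fields give higher frequencies.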

Spin Populations
- Each atom can assume one of two states; all atoms are thus split into two populations of size N_α and N_β
- The majority of nuclei assume the ground state, i.e., the state of lowest energy (N_β < N_α)
- The occupancy of the states follows the Boltzmann distribution:
  $N_\beta / N_\alpha = e^{-\Delta E / (k_B T)}$
- Example: T = 300 K, B_0 = 7.05 T gives N_β = 0.99995 N_α
- Differences in the occupancy of the states are very small, since the energy differences are small compared to k_B T!

NMR Hardware
[Figure: 800 MHz NMR spectrometer, Pacific Northwest National Laboratory]
http://en.wikipedia.org/wiki/image:pacific_northwest_national_laboratory_800_mhz_nmr_spectrometer.jpg

NMR Hardware
[Figure]
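
The spin population example (T = 300 K, B_0 = 7.05 T) can be reproduced from the Boltzmann distribution. This is a minimal sketch for 1H, assuming N_β/N_α = exp(-ΔE/(k_B T)) with ΔE = γħB_0; the function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34   # Planck's constant in J*s
BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K
GAMMA_1H = 26.7519e7        # magnetogyric ratio of 1H in T^-1 s^-1


def population_ratio(gamma, b0, temperature):
    """N_beta / N_alpha = exp(-Delta E / (k_B * T)) with Delta E = gamma * hbar * B0."""
    delta_e = abs(gamma) * PLANCK_H / (2 * math.pi) * b0
    return math.exp(-delta_e / (BOLTZMANN_K * temperature))


def excess_fraction(ratio):
    """(N_alpha - N_beta) / (N_alpha + N_beta): the share of spins that yields net signal."""
    return (1 - ratio) / (1 + ratio)


if __name__ == "__main__":
    ratio = population_ratio(GAMMA_1H, b0=7.05, temperature=300.0)
    print(f"N_beta / N_alpha = {ratio:.6f}")
    print(f"excess of alpha spins: {excess_fraction(ratio) * 1e6:.1f} per million")
```

The ratio comes out at about 0.99995, so only a few tens of nuclei per million contribute a net population difference. This is one way to see why NMR is a comparatively insensitive method and why stronger fields help.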

NMR Spectra of CH_nBr_(4-n)
- In principle, resonance frequencies should depend on the nuclide only, so they should be identical for all 1H nuclei
- Counterexample: bromomethanes; the resonance frequency depends on the chemical environment of the nuclei
[Figure]

Shielding
- The chemical environment influences ν, so NMR contains structural information!
- Electrons create a field B' that shields the nucleus from B_0
- The nucleus thus is exposed to an effective field $B_\mathrm{eff} = B_0 - B'$
- B' is proportional to B_0:
  $B_\mathrm{eff} = B_0 - B' = B_0 - \sigma B_0 = (1 - \sigma) B_0$ with shielding σ

Chemical Shift
- ν depends on the nuclide and on B_0
- Instruments differ in their B_0
- To simplify comparison between instruments, introduce a scale independent of B_0: the chemical shift δ
  $\delta = (\nu - \nu_\mathrm{ref}) / \nu_\mathrm{ref}$
- δ is usually given in ppm (10^-6) relative to the resonance frequency ν_ref of a reference substance
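
A short numerical sketch shows why the chemical shift is independent of B_0. It assumes the usual definition δ = (ν - ν_ref)/ν_ref expressed in ppm and the shielding relation B_eff = (1 - σ)B_0; the shielding constants used here are hypothetical values chosen only to give a shift of about 1 ppm.

```python
import math

GAMMA_1H = 26.7519e7  # magnetogyric ratio of 1H in T^-1 s^-1


def resonance_frequency(b0, sigma, gamma=GAMMA_1H):
    """nu = gamma * B_eff / (2*pi) with B_eff = (1 - sigma) * B0."""
    return gamma * (1 - sigma) * b0 / (2 * math.pi)


def chemical_shift_ppm(nu, nu_ref):
    """delta = (nu - nu_ref) / nu_ref, expressed in ppm (10^-6)."""
    return (nu - nu_ref) / nu_ref * 1e6


if __name__ == "__main__":
    sigma_ref = 30.0e-6     # hypothetical shielding of the reference nuclei
    sigma_sample = 29.0e-6  # hypothetical, slightly less shielded sample nucleus
    for b0 in (7.05, 14.1):
        nu_ref = resonance_frequency(b0, sigma_ref)
        nu = resonance_frequency(b0, sigma_sample)
        print(f"B0 = {b0:5.2f} T: nu - nu_ref = {nu - nu_ref:7.1f} Hz, "
              f"delta = {chemical_shift_ppm(nu, nu_ref):.3f} ppm")
```

The frequency difference in Hz doubles when the field doubles, while δ stays the same, which is what makes spectra from different instruments comparable.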

Reference Substances
- Reference substances should
  - Be chemically inert
  - Be easy to handle
  - Yield clear, intense signals
- The reference substance for 1H and 13C is often tetramethylsilane (TMS), Si(CH3)4
- All 1H and 13C nuclei in TMS are chemically equivalent, i.e., they result in only one peak each
[Structure: TMS, a silicon atom carrying four CH3 groups]

Structural Information in Spectra
- The chemical shift depends on
  - Topology (constitution)
  - Geometry (conformation)
- Certain experiments yield
  - Topological information (neighborhood)
  - Distance information (e.g., NOE constraints)
- In combination, this data can be used to deduce the structure of the protein

1H-NMR Spectrum of EtOH
[Figure]

Scalar Coupling
- Spins interact with each other
- The energy of one nucleus depends on another nucleus' spin state
  - Energy levels are shifted
  - The resonance frequency is shifted
- Coupling is mediated across bonds
  - 1J: coupling across one bond
  - 2J: coupling across two bonds (geminal)
  - 3J: coupling across three bonds (vicinal)
[Figure: examples of 2J and 3J couplings]

Scalar Coupling
- Example: spin system AX
[Figure: the transitions hν_A1 and hν_A2 split the signal at ν_A into two lines at ν_A1 and ν_A2, separated by J]

Influence of Structure on δ
- The chemical shift is caused by changes in electron density
- The electron density is influenced by:
  - Topology
    - Directly neighboring atoms (+I, -I effect, ...)
    - Implicitly given by the type of the amino acid
  - Geometry
    - Charges in the vicinity (electrostatics)
    - Aromatic systems (ring current effect)

Random Coil Shift
- Nuclei in a similar environment, i.e., with identical neighboring atoms, have similar chemical shifts
- Differences in conformation can cause differences in the shift
- Random coil shifts are the shifts of amino acid atoms in a random coil, a peptide without explicit secondary structure

Chemical Shifts of Amino Acids
[Figure]

1H-NMR Spectrum of a Protein
[Figure]

2D NMR Spectrum
- Peaks on the diagonal correspond to the shifts in the 1D spectrum
- Cross peaks (off-diagonal) are caused by transfer of magnetization between two nuclei, i.e., by interaction between these nuclei
- A cross peak usually implies "closeness" of these nuclei
[Figure: 2D spectrum with axes δ_1 and δ_2, diagonal peaks at δ_A and δ_B, and cross peaks between them]

(H,H)-COSY
- COrrelated SpectroscopY: magnetization is transferred along bonds
- Cross peaks occur between nuclei separated by two or three bonds
[Figure]

(H,H)-COSY
- COSY shows characteristic patterns for certain amino acids
- This allows the assignment of peaks to certain amino acids and thus their identification in a spectrum
[Figure]

(H,H)-COSY
[Figure]

Structure Elucidation with NMR
1. Multiple NMR experiments
2. Determination of
   - Coupling constants (yields backbone torsion angles)
   - NOE distances (yields interatomic distances)
   - Hydrogen bond patterns
3. Modeling of the structure consistent with these structural constraints

Comparison XRD - NMR
  XRD                          NMR
  Also for large proteins      < 30 kDa
  Requires crystals            From solution
  Hydrogen atoms invisible     Hydrogen atoms are essential
  Unlabeled protein            Isotope-labeled protein required
  Higher spatial resolution    Information on flexibility

Databases: PDB
- PDB (Protein Data Bank), http://www.rcsb.org
- Database for biomolecular structures
- Maintained by the RCSB (Research Collaboratory for Structural Bioinformatics)
- Deposition of structures in the PDB is a prerequisite for the publication of the structure in a journal
- Each structure is given a unique identifier (PDB ID)
  - 4 characters
  - 1st character: version
  - 2nd to 4th character: structure ID
  - Example: 2PTI, 3PTI, 4PTI are different structures of the protein BPTI (2PTI: 1973, 3PTI: 1976, 4PTI: 1983)

PDB Growth
[Figure: bar chart of yearly growth and total number of structures in the PDB]
Data from: http://www.rcsb.org/pdb/statistics/contentgrowthchart.do?content=total&seqid=100
Data as of 11.04.2012

PDB Statistics
          Proteins   Protein-NA complexes   Nucleic acids   Total
  XRD     81,972     4,263                  1,516           87,751
  NMR      9,093       205                  1,079           10,377
  Total   91,065     4,468                  2,595           98,128
http://www.rcsb.org
Data as of 01.04.2014

PDB: The First Entry!
[Figure]

PDB: The First Entry!
  HEADER  OXYGEN STORAGE                          05-APR-73  1MBN     1MBNH   1
  COMPND  MYOGLOBIN (FERRIC IRON - METMYOGLOBIN)                      1MBN    4
  SOURCE  SPERM WHALE (PHYSETER CATODON)                              1MBNM   1
  AUTHOR  H.C.WATSON,J.C.KENDREW                                      1MBNG   1
  [...]
  REVDAT  20  27-OCT-83  1MBNS  1  REMARK                             1MBNS   1
  JRNL    AUTH  H.C.WATSON                                            1MBNG   2
  JRNL    TITL  THE STEREOCHEMISTRY OF THE PROTEIN MYOGLOBIN          1MBNG   3
  JRNL    REF   PROG.STEREOCHEM.  V. 4  299  1969                     1MBNG   4
  JRNL    REFN  ASTM PRSTAP  US ISSN 0079-6808  419                   1MBNG   5
  [...]
  SEQRES  1  153  VAL LEU SER GLU GLY GLU TRP GLN LEU VAL LEU HIS VAL 1MBN   39
  [...]
  HET     HEM  1  44  PROTOPORPHYRIN IX WITH FE(OH), FERRIC           1MBND  10
  FORMUL  2  HEM  C34 H32 N4 O4 FE1 +++ .                             1MBNG  25
  FORMUL  2  HEM  H1 O1                                               1MBNG  26
  HELIX   1  A  SER 3  GLU 18  1  N=3.63,PHI=1.73,H=1.50              1MBN   52
  [...]
  TURN    1  CD1  PHE 43  PHE 46  BETW C/D HELICES IMM PREC CD2       1MBN   60
  [...]
  ATOM     1  N    VAL  1   -2.900  17.600  15.500  1.00  0.00  2     1MBN   72
  ATOM     2  CA   VAL  1   -3.600  16.400  15.300  1.00  0.00  2     1MBN   73
  ATOM     3  C    VAL  1   -3.000  15.300  16.200  1.00  0.00  2     1MBN   74
  ATOM     4  O    VAL  1   -3.700  14.700  17.000  1.00  0.00  2     1MBN   75
  ATOM     5  CB   VAL  1   -3.500  16.000  13.800  1.00  0.00  2     1MBN   76
  ATOM     6  CG1  VAL  1   -2.100  15.700  13.300  1.00  0.00  2     1MBNP   4
  ATOM     7  CG2  VAL  1   -4.600  14.900  13.400  1.00  0.00  2     1MBNL   8
  ATOM     8  N    LEU  2   -1.700  15.100  16.000  1.00  0.00  1     1MBN   79
  ATOM     9  CA   LEU  2    -.900  14.100  16.700  1.00  0.00        1MBN   80
  ATOM    10  C    LEU  2   -1.000  13.900  18.300  1.00  0.00        1MBN   81
  ATOM    11  O    LEU  2    -.900  14.900  19.000  1.00  0.00        1MBN   82
  ATOM    12  CB   LEU  2     .600  14.200  16.500  1.00  0.00        1MBN   83
  ATOM    13  CG   LEU  2    1.100  14.300  15.100  1.00  0.00  1     1MBN   84
  ATOM    14  CD1  LEU  2     .400  15.500  14.400  1.00  0.00  1     1MBNL   9
  [...]

References + Materials
- Structure elucidation in general
  - B. Nölting, Methods in Modern Biophysics, Springer, Berlin
  - Branden, Tooze, Introduction to Protein Structure, Garland, New York, 1999
  - R. Cotterill, Biophysics: An Introduction, Wiley, West Sussex, 2002
  - T. Creighton, Proteins: Structures and Molecular Properties, Freeman, 2nd ed., 1992
- X-ray diffraction
  - T. L. Blundell and L. N. Johnson, Protein Crystallography, Academic Press, New York, 1976
  - G. Rhodes, Crystallography Made Crystal Clear, Elsevier, 1999
- NMR
  - H. Günther, NMR-Spektroskopie, Thieme, Stuttgart
  - H. Friebolin, Basic One- and Two-Dimensional NMR Spectroscopy, VCH, Weinheim
  - K. Wüthrich, NMR of Proteins and Nucleic Acids, John Wiley and Sons, 1986
  - J. Cavanagh, W. J. Fairbrother, A. G. Palmer, and N. J. Skelton, Protein NMR Spectroscopy: Principles and Practice, Academic Press Inc., San Diego, 1996