BCM 6200 - Protein Crystallography II: Isomorphous Replacement, Anomalous Scattering and Molecular Replacement, Model Building and Refinement

Changing practice in de novo structure determination
Hendrickson W. Quarterly Reviews of Biophysics 47, 1 (2014), pp. 49-93

MAD Phasing
Multiwavelength Anomalous Diffraction: collection of anomalous scattering data at specific wavelengths where heavy atoms scatter strongly.
Advantage: perfect isomorphism (single crystal).
Requires a suitable K- or L-edge absorption. The element may be natural (e.g. Zn2+, Fe3+).
Data treatment is similar to SIR.
Fewer errors in phases (one crystal), but anomalous signals are small and require accurate intensity measurements.
High-quality experimental ρ(x,y,z) maps.
Most synchrotrons are set up to conduct MAD/SAD. SAD can be done on home sources.

X-ray scattering
X-ray scattering by atoms involves the following process:
1. The electromagnetic field of the incident X-ray beam exerts a force on the electrons within the atoms.
2. To a first approximation the electrons can be regarded as free electrons, which oscillate at the same frequency as the incident X-rays.
3. These electrons then emit X-rays of the same wavelength as the incident X-rays.
4. The phase of the scattered X-rays differs by 180° from that of the incident X-rays.
This is called the free-electron approximation to X-ray scattering.

Anomalous X-ray scattering
The free-electron approximation breaks down for "high" Z atoms.
1. The inner electrons are more tightly bound to the nucleus than the outer electrons, and at certain X-ray wavelengths these electrons may be ejected from the inner shell (say K) into the continuous energy region. This happens if the incident wavelength is close to what is termed the absorption edge.
2. The atom then re-emits an X-ray photon as an electron falls back into a lower energy shell (say L).
3. The emitted photon does not necessarily differ in phase by 180° from the incident photon.
The existence of anomalous dispersion always implies that the total scattered radiation diminishes, because a fraction of the radiation is absorbed to produce the electronic transition.

Anomalous scattering
Experimentally, there is a sharp discontinuity in the dependence of the absorption coefficient on energy (wavelength) at the energy required to eject an inner-shell electron. The discontinuity is known as an absorption edge.
For Cu, the K absorption edge is at 1.380 Å, compared to Cu Kα at 1.5418 Å.
Near an absorption edge, electrons in an atom can no longer be considered free electrons. The consequence is anomalous scattering.
Heavy atoms (Hg, Pt, Se) are common anomalous scatterers because their absorption edges fall within usable X-ray energy ranges.
Figure: the atomic absorption coefficient for Cu as a function of λ, showing the K absorption edge.
The energy of a photon is E = hc/λ.
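The relation E = hc/λ on this slide can be checked numerically. The sketch below is only an illustration: the function names are made up for this example, and the constant hc = 12.398 keV·Å is the usual rounded value. It converts the two Cu wavelengths quoted above into photon energies.

```python
# hc expressed in keV * Angstrom (rounded physical constant).
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength in Angstrom, E = hc / lambda."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def energy_to_wavelength_angstrom(energy_kev: float) -> float:
    """Wavelength in Angstrom for a photon energy in keV."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


# Wavelengths quoted on the slide.
cu_k_edge_kev = wavelength_to_energy_kev(1.380)
cu_k_alpha_kev = wavelength_to_energy_kev(1.5418)

print(f"Cu K edge : {cu_k_edge_kev:.3f} keV")
print(f"Cu K-alpha: {cu_k_alpha_kev:.3f} keV")
```

The Cu Kα photon energy comes out below the Cu K edge energy, which is why a copper anode does not excite the K edge of copper itself.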

When the incident photon has relatively low energy:
The photon is scattered, as it has insufficient energy to excite any of the available electronic transitions. The scattering effect of the atom may be adequately described by using the normal atomic scattering coefficient f0 only.
When the incident photon has high enough energy:
Some photons are scattered. Some photons are absorbed and re-emitted at lower energy (fluorescence). The scattered photon gains an imaginary component to its phase (the f'' scattering coefficient becomes non-zero); i.e. it is retarded compared to a normally scattered photon.
http://skuld.bmsc.washington.edu/scatter/as_tutorial.html

Anomalous scattering factor
When these electron transitions occur, the corresponding atomic scattering factor (f) behaves as a complex number, with two correction terms: a real one (f') and an imaginary one (f'').
To interpret how these correction factors modify the scattering factor, note that the real component (f') is 180 degrees out of phase with the normally scattered radiation, and that the imaginary component (f'') is 90 degrees out of phase with the normally scattered radiation.
Figure labels: free electron scattering; bound electron scattering; -Δf' (negative for heavy atoms!).
Therefore, the existence of anomalous dispersion always implies that the total scattered radiation diminishes, because a fraction of the radiation is absorbed to produce the electronic transition.

Physical interpretation of the real and imaginary correction factors of f
$f = f_0 + \Delta f' + i\,\Delta f''$
Real component, Δf': a small component of the scattered radiation is 180° out of phase with the normally scattered radiation given by f0. It always diminishes f0.
Imaginary component, Δf'' (absorption of X-rays): a small component of the scattered radiation is 90° out of phase with the normally scattered radiation given by f0. This phase shift is always oriented counterclockwise relative to the phase of the free-electron scatter f0 in the Argand diagram.

Anomalous scattering
Figure: anomalous scattering factor for different atom types.
The regular scattering factor decreases as resolution increases, but the anomalous component is nearly independent of resolution because it is mainly caused by inner electrons.
The anomalous component is a function of both Z and wavelength, and tables of Δf' and Δf'' exist for each wavelength and Z.

Elements with absorption edges accessible for anomalous scattering in the ~5-22 keV range
5-22 keV is the energy range of most home sources and synchrotron beamlines.
X-ray Absorption Edges @ http://skuld.bmsc.washington.edu/scatter/as_periodic.html

Measuring f'' to determine f'
X-ray anomalous scattering factor corrections are influenced by the chemical environment of the absorbing atom. Changes in scattering factors are pronounced near the absorption edge, where detailed spectral features appear.
Scattering corrections cannot be assumed to be the same as free-atom anomalous scattering factors. Free-atom values are accurate only at energies distant from the absorption edge.
Figure labels: cryostat, fluorescence detector, fluorescence, incident beam, crystal.
The anomalous scattering f'' is experimentally accessible: it is proportional to the atomic absorption coefficient, which is proportional to the fluorescence, which in turn allows f' to be calculated (from the Kramers-Kronig transformation).

Tunable synchrotron radiation maximizes the effect of anomalous scattering
Figure (curves labelled f' and f''): fluorescence scan of the Se element in a selenomethionine protein sample. Experimental values of the scattering factors f' and f'' as a function of X-ray energy, obtained from the fluorescence scan using the program CHOOCH.
http://www.cat.ernet.in/technology/accel/srul/beamlines/prot_cryst.html

X-ray Absorption Edges (theory vs reality)
The exact edge position depends on the oxidation state of the atom, as the inner-shell electron can be more tightly bound when valence outer-shell electrons are removed.
Finally, the X-ray wavelength bandpass needs to be considered, because inherent effects can be masked if the bandpass is broader than the linewidths of the features naturally present.
Theory isn't good enough: measure it yourself.
The actual absorption edge is shifted relative to the idealized edge for an isolated Cu atom. The largest part of this shift is due to the oxidation state of the Cu atom in the protein.
The local chemical environment introduces "ripples" (EXAFS) into the scattering spectrum.
The maximum achievable f'' is actually larger than the edge jump from theory. (The effect isn't very large in this particular example, but sometimes it is substantial.)
The maximum achievable f', however, is smaller than the theoretical value. This is largely limited by the energy bandwidth of the X-ray source.
http://skuld.bmsc.washington.edu/scatter/

Anomalous Scattering
Thus, the total structure factor $F_{PH}$ contains: the normal scattering of all atoms, $F_P + {}^{0}F_H$; the dispersive component of the anomalously scattering atoms, $\Delta F'_H$; and the anomalous component of the anomalously scattering atoms, $\Delta F''_H$.
Note: the symbols $F_H$ and $F_A$ will be used interchangeably to describe the anomalous scatterers, with
$${}^{\lambda}F_A(\mathbf{h}) = {}^{0}F_A(\mathbf{h}) + {}^{\lambda}\Delta F'_A(\mathbf{h}) + i\,{}^{\lambda}\Delta F''_A(\mathbf{h})$$
where the contribution of anomalous scatterers to ${}^{\lambda}F_A(\mathbf{h})$ is broken down into its λ-independent component, ${}^{0}F_A(\mathbf{h})$, and its λ-dependent components: ${}^{\lambda}\Delta F'_A(\mathbf{h})$, the dispersive part arising from $\Delta f'(\lambda)$, and ${}^{\lambda}\Delta F''_A(\mathbf{h})$, the anomalous part arising from $\Delta f''(\lambda)$.

Anomalous Scattering breaks Friedel's Law
Figure labels: $F_{PH}^{+}$, $F_{PH}^{-}$, $\alpha_{hkl}$, $\alpha_{-h-k-l}$ (equivalent representations).
Friedel's law does not hold under conditions of anomalous diffraction.
The correction to the scattering factor, $\Delta F''_H$, is always positive in the imaginary plane.
If all atoms are the same type of anomalous scatterer, the amplitude stays the same, but the phase is shifted.
When there are mixed non-anomalous and anomalous scatterers, the amplitude and phase of $F_{hkl}$ shift relative to $F_{-h-k-l}$, with an accompanying breakdown of Friedel's law.
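The two cases on this slide can be illustrated with a toy one-dimensional structure. This is a minimal sketch, not taken from the lecture: the coordinates and scattering factors are hypothetical numbers chosen only to make the effect visible, and `structure_factor` is an illustrative helper.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for a 1D toy structure.

    `atoms` is a list of (fractional coordinate, complex scattering factor).
    """
    return sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)


# Hypothetical scattering factors (electrons), for illustration only.
F_LIGHT = complex(6.0, 0.0)          # normal scatterer: f0 only
F_HEAVY = complex(30.0 - 5.0, 8.0)   # anomalous scatterer: f0 + f' + i f''

# Case 1: mixed normal and anomalous scatterers.
mixed = [(0.10, F_LIGHT), (0.37, F_LIGHT), (0.62, F_HEAVY)]
# Case 2: every atom is the same anomalous scatterer.
all_heavy = [(x, F_HEAVY) for x, _ in mixed]

h = 3
mixed_plus = abs(structure_factor(h, mixed))
mixed_minus = abs(structure_factor(-h, mixed))
heavy_plus = abs(structure_factor(h, all_heavy))
heavy_minus = abs(structure_factor(-h, all_heavy))

print(f"mixed:     |F(+h)| = {mixed_plus:.2f}, |F(-h)| = {mixed_minus:.2f}")
print(f"all heavy: |F(+h)| = {heavy_plus:.2f}, |F(-h)| = {heavy_minus:.2f}")
```

In the mixed case the two amplitudes differ, while in the all-anomalous case they agree to rounding error even though the phases of F(+h) and F(-h) are no longer simple negatives of each other.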

Bijvoet Pairs are Related by Space Group Symmetry
Bijvoet pairs are similar to Friedel mates, except that they are equivalent by space group symmetry.
For example, in the case of a 2-fold along b: $|F^{+}_{hkl}| = |F_{hkl}| = |F_{-hk-l}|$ and $|F^{-}_{hkl}| = |F_{-h-k-l}| = |F_{h-kl}|$.
Bijvoet pair equivalence may break down with anomalous diffraction: $|F_{hkl}| = |F_{-hk-l}|$ and $|F_{-h-k-l}| = |F_{h-kl}|$ still hold, but $|F^{+}_{hkl}| \neq |F^{-}_{hkl}|$.
The Bijvoet difference is defined as: $\Delta F_{hkl} = |F^{+}_{hkl}| - |F^{-}_{hkl}|$
The difference can be used to locate the heavy atoms that cause anomalous scattering, in a Bijvoet difference Patterson map with coefficients $(\Delta F_{hkl})^2$.
Like the anomalous difference Fourier, the Bijvoet differences can be used for phasing as long as the phase is adjusted by 90°.

SIRAS - Single Isomorphous Replacement with Anomalous Scattering
Remember: $F_{PH} = F_P + F_H$, so $F_P = -F_H + F_{PH}$.
Figure: Harker construction.

Centric and acentric reflections
Space group symmetry sometimes constrains the phases of certain reflections to a limited, finite number of possibilities. For example, in orthorhombic space groups, any reflection of the type (hk0), (h0l) and (0kl) may only have a phase of 0° or 180°.
The axial planes in reciprocal space are thus referred to as centric zones, and the reflections within these planes are referred to as centric reflections. Reflections with unrestricted phases are referred to as acentric reflections.
The probability distributions associated with these two classes of reflections are different, and in any phasing program, statistics are often reported separately for the two classes.
It is in general "easier" to determine the phase angle of centric reflections.

MAD Phasing Data Sets to Collect
X-ray diffraction data are usually collected at 3 or more wavelengths to provide observations to solve simultaneous equations:
The edge or inflection (λ1), at the minimum of f', where the real component of the scattering correction is largest in magnitude.
The peak or white line (λ2), at the maximum of f'', where the imaginary component of the correction is maximum.
The remote (λ3), where f' and f'' are small. This can be located on either side of the absorption edge.
A 2nd remote (λ4) may be chosen at low energy.
Wavelengths are chosen to maximize the differences in intensity between Bijvoet pairs (or Friedel mates).
Figure: fluorescence scan fit to the Kramers-Kronig relation, with f' and f'' curves and the positions of λ1, λ2, λ3 and λ4 marked.

The MAD Method

Goal of MAD Phasing Experiment
Locate the anomalously diffracting atoms, $F_A$, in the unit cell. $F_A$ is redefined as ${}^{0}F_A(\mathbf{h}) + {}^{\lambda}\Delta F'_A(\mathbf{h})$. Note: the sum of ${}^{0}F_A(\mathbf{h})$ and ${}^{\lambda}\Delta F'_A(\mathbf{h})$ does not change the phase angle of $F_A$. This allows calculation of the corresponding phase $\varphi_A$.
The MAD equations then estimate $\Delta\varphi$ and $|F_T|$, where $\Delta\varphi$ is the difference in phase angle between the normally scattering and the anomalously scattering components.
In the simplest case, the phase of $F_T$ is $\Delta\varphi + \varphi_A$.
The Fourier transform with amplitudes $|F_T|$ and phases $\Delta\varphi + \varphi_A$ gives an electron density map of all atoms in the structure.
The wavelength-dependent scattering for each atom type is: $f = f_0 + f'(\lambda) + i f''(\lambda)$

Recapitulation
$\Delta\varphi = \varphi_T - \varphi_A$
Only $|F_\lambda|$ is measured.
$F_\lambda$ is the total structure factor of the unknown structure, whose phase is needed for the electron density. It is wavelength dependent.
$F_T$ is the normal scattering from all atoms.
$F_A$ is the scattering due to the partial structure of the anomalously diffracting atoms.
$\Delta\varphi$ is the phase angle difference between the normal scattering of all atoms ($\varphi_T$) and the anomalously diffracting atoms ($\varphi_A$).
If we can locate the anomalously scattering atoms in the unit cell, we know their phase angle, $\varphi_A$. We can then generate an estimate for $\Delta\varphi$ and $|F_T|$.
http://www.bmsc.washington.edu/scatter/mad_4.html

MAD Phasing Scattering Correction
Karle (1980) and Hendrickson (1985) formulated algebraic expressions that separate the non-anomalous (λ-independent) from the anomalous (λ-dependent) scattering of the atoms contributing to $F_{hkl}$.
The form of the corrected scattering factor is: $f = f_0 + f'(\lambda) + i f''(\lambda)$
The contributions of the non-anomalously scattering atoms and of the anomalous scatterers can be depicted in separate groups. The structure factor for a given $F_{hkl}$ at wavelength λ can then be recast as follows:
$$|{}^{\lambda}F(hkl)|^2 = |{}^{0}F_T(hkl)|^2 + a(\lambda)\,|{}^{0}F_A(hkl)|^2 + b(\lambda)\,|{}^{0}F_T(hkl)|\,|{}^{0}F_A(hkl)|\cos[{}^{0}\varphi_T(hkl) - {}^{0}\varphi_A(hkl)] + c(\lambda)\,|{}^{0}F_T(hkl)|\,|{}^{0}F_A(hkl)|\sin[{}^{0}\varphi_T(hkl) - {}^{0}\varphi_A(hkl)]$$
$$a(\lambda) = \frac{f'^2_\lambda + f''^2_\lambda}{f_0^2}; \quad b(\lambda) = \frac{2 f'_\lambda}{f_0}; \quad c(\lambda) = \frac{2 f''_\lambda}{f_0}$$
Figure: Argand diagram showing $\Delta\varphi$; portraits of Jerome Karle and Wayne Hendrickson.
W.A. Hendrickson, J.L. Smith & S. Sheriff (1985). "Direct Phase Determination Based on Anomalous Scattering". Methods Enzymol. 115, 41-55.

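Solving the MAD Equations
$$|{}^{\lambda}F(\mathbf{h})|^2 = |F_T|^2 + a(\lambda)\,|F_A|^2 + b(\lambda)\,|F_T|\,|F_A|\cos\Delta\varphi + c(\lambda)\,|F_T|\,|F_A|\sin\Delta\varphi, \qquad \Delta\varphi = \varphi_T - \varphi_A$$
Each measurement of $|{}^{\lambda}F(\mathbf{h})|$ at some wavelength gives one instance of the equation above, and the separate instances may be treated as a system of simultaneous equations from which we want to obtain the quantities $|F_A|$, $|F_T|$ and $\Delta\varphi$.
There is one equation for ${}^{\lambda}F(\mathbf{h})$ and one for ${}^{\lambda}F(-\mathbf{h})$ at each wavelength. With three unknowns, at least two wavelengths are therefore needed to solve the equations.
Rewrite $F_A(\mathbf{h})$ as ${}^{\lambda}F_A(\mathbf{h}) = {}^{0}F_A(\mathbf{h}) + {}^{\lambda}\Delta F'_A(\mathbf{h}) + i\,{}^{\lambda}\Delta F''_A(\mathbf{h})$, where as before the contribution of anomalous scatterers to ${}^{\lambda}F_A(\mathbf{h})$ is broken down into its λ-independent component, ${}^{0}F_A(\mathbf{h})$, and its λ-dependent components, ${}^{\lambda}\Delta F'_A(\mathbf{h})$ and ${}^{\lambda}\Delta F''_A(\mathbf{h})$.
Then, defining $\Delta F_{ano} = |{}^{\lambda}F(\mathbf{h})| - |{}^{\lambda}F(-\mathbf{h})|$ and using ${}^{\lambda}F_A(\mathbf{h})$ as defined above, we have
$$\Delta F_{ano} \approx 2\,|{}^{\lambda}\Delta F''_A(\mathbf{h})|\,\sin\Delta\varphi$$
$\Delta F_{ano}$ is thus determined by the properties of the anomalously scattering atoms.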
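The MAD equation can be checked against direct complex arithmetic for a single kind of anomalous scatterer. The sketch below assumes the total structure factor is F_T + ((f' + i f'')/f0) F_A, with F(-h) built from the complex conjugates of F_T and F_A; under that assumption the sign of the c(λ) term flips for the Friedel mate. All numerical values are hypothetical and the function names are illustrative.

```python
import cmath
import math


def mad_intensity(ft, fa, dphi, f0, fp, fpp, friedel_sign=1):
    """|F_lambda(+-h)|^2 from the MAD equation.

    ft, fa : amplitudes |F_T| and |F_A|; dphi = phi_T - phi_A (radians)
    f0, fp, fpp : f0, f'(lambda), f''(lambda) of the anomalous scatterer
    friedel_sign : +1 for h, -1 for -h (flips the sine term)
    """
    a = (fp ** 2 + fpp ** 2) / f0 ** 2
    b = 2.0 * fp / f0
    c = 2.0 * fpp / f0
    cross = ft * fa
    return (ft ** 2 + a * fa ** 2
            + b * cross * math.cos(dphi)
            + friedel_sign * c * cross * math.sin(dphi))


def direct_intensity(ft, fa, phi_t, phi_a, f0, fp, fpp, friedel_sign=1):
    """Same quantity from explicit complex structure factors."""
    f_t = cmath.rect(ft, friedel_sign * phi_t)
    f_a = cmath.rect(fa, friedel_sign * phi_a)
    return abs(f_t + (complex(fp, fpp) / f0) * f_a) ** 2


# Hypothetical values for one reflection (Se-like scatterer, f0 = 34 e).
FT, FA, PHI_T, PHI_A = 100.0, 20.0, 1.1, 0.4
F0, FP, FPP = 34.0, -8.0, 4.5
DPHI = PHI_T - PHI_A

i_plus = mad_intensity(FT, FA, DPHI, F0, FP, FPP, +1)
i_minus = mad_intensity(FT, FA, DPHI, F0, FP, FPP, -1)

# The Bijvoet intensity difference isolates the sine term: I+ - I- = 2 c |F_T||F_A| sin(dphi).
sin_term = (i_plus - i_minus) / (2.0 * (2.0 * FPP / F0))
print(f"I(+h) = {i_plus:.2f}, I(-h) = {i_minus:.2f}")
print(f"|F_T||F_A| sin(dphi) recovered = {sin_term:.3f}")
```

Measurements at a second wavelength supply the extra equations needed to separate the remaining terms.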

Locating the Anomalous Scattering Sites
In practice, we obtain estimates for $|F_T|$, $|F_A|$ and $\Delta\varphi$ from programs such as Phenix or the CCP4 suite.
The goal is to find the protein phase, $\varphi_T$. Recall that $\varphi_T = \Delta\varphi + \varphi_A$, so if one knew $\varphi_A$ the problem would be solved.
As in MIR methods, the anomalous scatterer locations are found by solving a difference Patterson map (an n-atom structure, where n = number of anomalous scatterers).
Calculate an anomalous difference Patterson with coefficients $\Delta F_{ano}^2 = [\,|{}^{\lambda}F(\mathbf{h})| - |{}^{\lambda}F(-\mathbf{h})|\,]^2$ using Bijvoet or Friedel pairs.
Since $[\,|{}^{\lambda}F(\mathbf{h})| - |{}^{\lambda}F(-\mathbf{h})|\,]^2 \approx 4\,|{}^{\lambda}\Delta F''_A(\mathbf{h})|^2 \sin^2\Delta\varphi$, $\Delta F_{ano}$ will be maximal if the phase angle $\varphi_A$ is perpendicular to $\varphi_T$ and zero if both vectors are collinear, which is the opposite of the MIR case.
The anomalous difference Patterson will have peaks of anomalous scatterers with heights proportional to $2\,|{}^{\lambda}\Delta F''_A(\mathbf{h})|^2$, as $\langle \sin^2\Delta\varphi \rangle = 1/2$.
To maximize signal-to-noise, choose the peak wavelength.
The anomalous scattering can be used to "generate" multiple derivatives when the incident wavelength λ is changed (MAD).

Bijvoet Difference Patterson Map to Locate Anomalous Atoms
Anomalous differences $\Delta F_{ano}^2$ can be used to locate the positions of the anomalously scattering atoms.
Figure: u = 1/2 Harker section calculated with $\Delta F_{ano}^2$ coefficients from λ peak (λ = 1.3771 Å, f'' = 4.17 e), using the anomalous signal from a single Cu atom in a 96-residue metalloprotein, CBP [Guss, et al. 1989].
This map is rather noisy. Why? At the maximum f'', there are only 4.2 e. Instead, use the estimate of $F_A$ from the MAD equations.
The Patterson map with coefficients $\Delta F_{ano}^2$ requires only a home source of fixed wavelength, accurate intensity measurements, and good software to solve the Patterson map.
http://www.bmsc.washington.edu/scatter/mad_4.html

Use of the $|F_A|^2$ Patterson Map
Using the solution for $F_A$ greatly improves the signal-to-noise, since it is derived from all data sets.
Once you have $\varphi_A$ and the estimate for $\Delta\varphi$, it is possible to solve for $\varphi_T = \Delta\varphi + \varphi_A$ (recall $\Delta\varphi = \varphi_T - \varphi_A$).
Figure: Cu metalloprotein CBP, u = 1/2 Harker section, $|F_A|^2$ coefficients derived from four wavelengths.
http://www.bmsc.washington.edu/scatter/mad_4.html

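Signal-to-Noise of a MAD Experiment
Theoretical calculations will indicate whether there is a sufficient signal.
Dispersive signal:
$$\frac{\langle\,|F_{\lambda min}| - |F_{\lambda max}|\,\rangle}{\langle |F_T| \rangle} \approx \sqrt{\frac{N}{2}}\;\frac{|f'_{\lambda min} - f'_{\lambda max}|}{\langle |F_T| \rangle}$$
Anomalous (Bijvoet) signal:
$$\frac{\langle\,|F^{+}| - |F^{-}|\,\rangle}{\langle |F_T| \rangle} \approx \sqrt{\frac{N}{2}}\;\frac{2 f''_{\lambda peak}}{\langle |F_T| \rangle}$$
N = number of anomalous scatterers; $\langle |F_T| \rangle$ = scattering power of the macromolecule. The values at λmin and λmax are measured from experiment, but estimated for feasibility calculations.
Isomorphous signal, for comparison:
$$\frac{\langle\,|F_{PH}| - |F_P|\,\rangle}{\langle |F_P| \rangle} \approx \sqrt{\frac{N}{2}}\;\frac{f_0}{\langle |F_P| \rangle}$$
where $f_0$ is the uncorrected scattering factor for the heavy atom.
The anomalous diffraction signal does not diminish as a function of resolution. However, the signal-to-noise of the data, I/σ(I), decreases with increasing S = 2 sinθ/λ.
Generally, the signal to be detected must be greater than the noise level, i.e. I > σ(I). This is indicated by the I/σ(I) value plotted versus resolution.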
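A feasibility estimate of this kind can be sketched as follows. The code assumes the widely used approximation that the scattering power of the macromolecule is about sqrt(N_T) x Z_eff, with N_T the number of non-hydrogen atoms and Z_eff about 6.7 e per atom; it also assumes roughly 8 non-hydrogen atoms per residue. Both numbers, the function names and the f', f'' values are assumptions for illustration, not values from the slide.

```python
import math

Z_EFF = 6.7  # assumed effective scattering per non-hydrogen protein atom (electrons)


def bijvoet_ratio(n_anom, n_total_atoms, f_double_prime, z_eff=Z_EFF):
    """Expected <| |F+| - |F-| |> / <|F_T|> from sqrt(N/2) * 2f'' / <|F_T|>."""
    return math.sqrt(n_anom / (2.0 * n_total_atoms)) * 2.0 * f_double_prime / z_eff


def dispersive_ratio(n_anom, n_total_atoms, f_prime_1, f_prime_2, z_eff=Z_EFF):
    """Expected dispersive difference between two wavelengths, relative to <|F_T|>."""
    return (math.sqrt(n_anom / (2.0 * n_total_atoms))
            * abs(f_prime_1 - f_prime_2) / z_eff)


# Hypothetical case: 1 Se in a 100-residue protein (~800 non-hydrogen atoms),
# with illustrative values f'' = 6 e at the peak and f' = -10 e / -3 e at two wavelengths.
ano = bijvoet_ratio(n_anom=1, n_total_atoms=800, f_double_prime=6.0)
disp = dispersive_ratio(n_anom=1, n_total_atoms=800, f_prime_1=-10.0, f_prime_2=-3.0)

print(f"expected Bijvoet ratio:    {100 * ano:.1f} %")
print(f"expected dispersive ratio: {100 * disp:.1f} %")
```

Signals of a few percent are what make accurate intensity measurement and high redundancy necessary.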

Use of Direct Methods or Automation
Often the use of Se-Met methods introduces multiple anomalously diffracting sites.
In general, there will be N² total vectors in the Patterson, where N is the number of atoms. There will be N self-vectors at the origin. This leaves (N² - N) non-self vectors to sort out in the Patterson.
For 10-20 Se atoms, there are still 90 to 380 vectors, which can make the Harker sections complicated (also see cross-vectors).
In general, one needs about 1 Se per 80-100 amino acids for phasing.
For these reasons, crystallographers often turn to automated methods to solve difference Patterson maps. These include direct methods (Shake-n-Bake, SHELXD). Also see Patterson methods such as SOLVE/RESOLVE (heavy atom search and superposition, HASSP).
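The vector counts quoted above follow directly from N² and N² - N. A minimal sketch (the function name is illustrative):

```python
def patterson_vector_counts(n_atoms: int) -> dict:
    """Count interatomic vectors in a Patterson map of n_atoms atoms."""
    if n_atoms < 0:
        raise ValueError("n_atoms must be non-negative")
    total = n_atoms ** 2            # every ordered pair (i, j), including i == j
    origin = n_atoms                # self-vectors (i == j) pile up at the origin
    return {"total": total, "origin": origin, "non_origin": total - origin}


for n in (10, 20):
    counts = patterson_vector_counts(n)
    print(f"N = {n}: {counts['non_origin']} non-origin vectors")
```

For N = 10 and N = 20 this reproduces the 90 and 380 non-origin vectors mentioned on the slide.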

Finding the Heavy Atom Substructure
Finding the heavy atom substructure in MAD thus uses exactly the same methods as in MIR, except that we primarily use the anomalous (f'') signal rather than the isomorphous ($F_{PH} - F_P$) signal between two datasets.
The simplest, though tedious, way is to calculate an anomalous difference Patterson, find the peaks on the Harker sections and calculate the relationship between the Harker peaks and the real-space locations of the heavy atoms (Se, in this case). This is in fact how most MIR structures were solved before more reliable programs automated the task.
More recently, the programs SOLVE and SHELX have supplied powerful tools that can quite predictably locate heavy atoms from anomalous or isomorphous data.
Direct methods are routinely used to solve small-molecule crystal structures (at very high resolution) and have been adapted to solve heavy atom substructures at much more modest resolutions (the numbers of reflections per atom are not dissimilar in these two situations).
Direct methods are based on a simple set of probabilistic relationships, where $\varphi_{-H} + \varphi_{K} + \varphi_{H-K} \approx 0$ for a set of three strong reflections.
http://xray0.princeton.edu/~phil/facility/guides/mad_example1.html

Increasing S/N with Redundancy
Figure: 3.2 Å Bijvoet difference Patterson maps calculated at 90° intervals of crystal rotation for ALR (augmenter of liver regeneration) crystals, and the corresponding 3.2 Å Fcalc Patterson map calculated from the cadmium positions. All maps are contoured starting at 1σ (map) with a 1σ step between contours.
The marked improvement in the Patterson map is due to the increase in Bijvoet redundancy from 0.8 (90° sweep) to 9.0 (720° sweep).
Patterson coefficients are based on the cadmium (Δf'' = 4.7) anomalous signal, for intensity data measured with a Cu rotating anode.
www.rigaku.com/downloads/journal/vol18.2.2001/rose.pdf

Se-Met: selenomethionine
Se-Met MAD phasing: the most popular method currently in use for proteins.
Methionine is a relatively rare amino acid: 2.4% (vs. an average of 5%).
For Se-Met phasing, one requires 1 Met per ~100 amino acids. This really depends on the quality of the data (signal-to-noise).
Se can be introduced by growth of Met-auxotrophic bacteria in the presence of Se-Met.
For non-auxotrophic strains, conditions that inhibit Met synthesis can be used, with the media supplemented with Se-Met.
Other added ions or endogenous metals can be used too.
Use of the method is relatively restricted to E. coli. Grayhack has developed a strain of Se-Met tolerant yeast.
Air oxidation of the protein must be vigorously avoided (DTT); Se=O can abolish the MAD signal.
Figure: methionine and selenomethionine.
Ref: Van Duyne et al. & Clardy (1993) J. Mol. Biol. 229, 105-124; Doublie (1997) Methods in Enzymol. 276, 523-530.

Se-Met MAD Phasing
Substitution in E. coli can reach 100%.
Replacing Met with Se-Met has the advantage that the side chain often packs in a hydrophobic environment and is well ordered. However, the N-terminal Met (from the AUG start codon) is often disordered.
The natural frequency of Met in a protein is ~1 in 75 amino acids. This represents about 2.5% of the $F_P$ signal, which is detectable with a good crystal.
It has also been possible to introduce Met sites for MAD phasing by site-directed mutagenesis.
Se-Met is often isomorphous with Met and has good anomalous signal (as much as 18 e).
Se-Met can help determine the sequence, or find NCS.
Use of 5-Br-U is better for DNA than RNA. Recent studies showed that 5-Br-U in RNA can change the global fold.
Anomalous data are collected from 1 crystal at the Se K-edge (12.658 keV).
MAD data are collected at peak, inflection (edge) and remote wavelengths.

Sulfur S-SAS: experimental realities
The ultimate goal of the SAS method is the use of S-SAS to phase protein data, since most proteins contain sulfur in cysteines and methionines.
However, sulfur has a very weak anomalous scattering signal, with Δf'' = 0.56 e for Cu X-rays.
The use of soft X-rays such as Cr Kα (λ = 2.2909 Å) doubles the sulfur signal (Δf'' = 1.14 e).
The S-SAS method requires careful data collection and large data redundancies in order to maximize S/N, and crystals that diffract to high resolution.
A high-symmetry space group (more symmetry equivalents) increases the chance of success.
There is an increasing number of S-SAS structures in the Protein Data Bank.

Pushing the limits of sulfur SAD phasing
Single-wavelength anomalous dispersion of S atoms (S-SAD) is an elegant phasing method to determine crystal structures that does not require heavy-atom incorporation or selenomethionine derivatization. Nevertheless, this technique has been limited by the paucity of the signal at the usual X-ray wavelengths, requiring very accurate measurement of the anomalous differences.
Here the structure of the N-terminal domain of the ectodomain of HCV E1 was solved successfully from crystals that diffracted very weakly. By combining data from 32 crystals, it was possible to solve the sulfur substructure and calculate initial maps at 7 Å resolution; after density modification and phase extension to 3.5 Å resolution using a higher resolution native data set, model building was achievable.
Figure: d''/sig(d''), i.e. [F(+) - F(-)]/σ, as a function of resolution. The graph shows the signal-to-noise of the anomalous differences. In the red part of the graph the anomalous signal is considered to be nonexistent.
El Omari K, Iourin O, Kadlec J, Fearn R, Hall DR, Harlos K, Grimes JM, Stuart DI. Pushing the limits of sulfur SAD phasing: de novo structure solution of the N-terminal domain of the ectodomain of HCV E1. Acta Crystallogr D Biol Crystallogr. 2014 Aug;70(Pt 8):2197-203.

Improvement of SAS electron-density maps
The blue meshes show the electron density contoured at 1σ.
(a) Electron-density maps at 7 Å resolution after density modification by phenix.autosol using a solvent content of 75%.
(b) Electron-density maps at 3.5 Å resolution after density modification by phenix.autobuild using sixfold NCS.
(c) Final 2Fo - Fc electron-density maps at 3.5 Å resolution after refinement with autobuster.
(d) Structure of HCV ne1 fitted into the electron-density maps described in (c). The six monomers composing the asymmetric unit are colored differently.

Resolve the SIR or SAS phase ambiguity (Handedness) by Solvent Flattening
The ISAS process is carried out twice, once with the heavy atom site(s) at their refined locations (+++), and once at their inverted locations (---).

Data              FOM (1)  Handedness  FOM (2)  R-Factor  Corr. coeff
RHE               0.54     Correct     0.82     0.26      0.958
                  0.54     Incorrect   0.8      0.30      0.94
NP with I (3)     0.54     Correct     0.8      0.27      0.955
                  0.54     Incorrect   0.76     0.36      0.919
NP with I & S (4) 0.56     Correct     0.82     0.24      0.964
                  0.56     Incorrect   0.78     0.35      0.926

(1) Figure of merit before solvent flattening
(2) Figure of merit after one filter and four cycles of solvent flattening
(3) Four iodine atoms were used for phasing
(4) Four iodine and 56 sulfur atoms were used for phasing
Heavy Atom Handedness and Protein Structure Determination using Single-wavelength Anomalous Scattering Data, ACA Annual Meeting, Montreal, July 25, 1995.

Calculation of phases: R-factors
The isomorphous and anomalous lack of closures ($\varepsilon_{iso}$ and $\varepsilon_{ano}$) are minimized for each derivative using least-squares or maximum likelihood methods.
During the refinement, the contribution that each data set is making to the phasing may be judged by inspecting the values of several R-factors, which relate the average Bijvoet or dispersive differences to their respective lack of closures.
For isomorphous (dispersive) differences in acentric reflections, a Cullis-type R-factor may be quoted:
$$R^{acentric}_{C,iso} = \frac{\langle |\varepsilon_{iso}| \rangle}{\langle |\Delta_{iso}| \rangle}$$
For centric reflections we may use:
$$R^{centric}_{C,iso} = \frac{\sum \big|\,|F_{PH} \pm F_P| - |F_{calc}|\,\big|}{\sum |F_{PH} \pm F_P|}$$
For anomalous differences, we equivalently define:
$$R_{C,ano} = \frac{\langle |\varepsilon_{ano}| \rangle}{\langle |\Delta_{Bijvoet}| \rangle}$$
All Cullis R-factor values should be less than unity.
http://www.gwyndafevans.co.uk/thesis-html/node40.html

Signal Strength and Data Quality
Typical values:
SIR: $R = 100 \sum_{hkl} \big|\,|F_{PH}| - |F_P|\,\big| \,/\, \sum_{hkl} |F_P|$, 15-30%
SAD: $R_{anom} = 200 \sum_{hkl} |I^{+} - I^{-}| \,/\, \sum_{hkl} (I^{+} + I^{-})$, 5%
S-SAD: $R_{anom}$, 1%. Requires high quality photon counters (Pilatus).

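Signal Strength and Data Quality
$$R_{merge} = \frac{\sum_{hkl} \sum_i |I_{i,hkl} - \langle I_{hkl} \rangle|}{\sum_{hkl} \sum_i I_{i,hkl}}$$
$$R_{r.i.m.} = R_{meas} = \frac{\sum_{hkl} \left(\frac{N}{N-1}\right)^{1/2} \sum_i |I_{i,hkl} - \langle I_{hkl} \rangle|}{\sum_{hkl} \sum_i I_{i,hkl}}$$
$$R_{p.i.m.} = \frac{\sum_{hkl} \left(\frac{1}{N-1}\right)^{1/2} \sum_i |I_{i,hkl} - \langle I_{hkl} \rangle|}{\sum_{hkl} \sum_i I_{i,hkl}}$$
$$R_{anom} = \frac{\sum_{hkl} |I_{hkl} - I_{-h-k-l}|}{\frac{1}{2} \sum_{hkl} (I_{hkl} + I_{-h-k-l})}$$
where N is the number of observations of reflection hkl.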
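The three merging statistics differ only in the multiplicity weight applied to each reflection, which a short calculation makes concrete. This is a minimal sketch: the intensities are made-up toy numbers, the function name is illustrative, and reflections observed only once are skipped (an assumption, since N - 1 would be zero).

```python
import math


def merging_statistics(observations):
    """Return R_merge, R_meas (R_r.i.m.) and R_p.i.m. as fractions.

    `observations` maps a reflection index (h, k, l) to the list of its
    repeated intensity measurements.
    """
    num_merge = num_meas = num_pim = denom = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single observation carries no agreement information
        mean = sum(intensities) / n
        abs_dev = sum(abs(i - mean) for i in intensities)
        num_merge += abs_dev
        num_meas += math.sqrt(n / (n - 1)) * abs_dev
        num_pim += math.sqrt(1 / (n - 1)) * abs_dev
        denom += sum(intensities)
    return {
        "r_merge": num_merge / denom,
        "r_meas": num_meas / denom,
        "r_pim": num_pim / denom,
    }


# Toy data: two reflections with multiplicities 2 and 4.
toy = {
    (1, 0, 0): [100.0, 110.0],
    (0, 2, 0): [50.0, 40.0, 60.0, 50.0],
}
stats = merging_statistics(toy)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

R_meas is never smaller than R_merge because it corrects for multiplicity, while R_p.i.m. falls as multiplicity rises because it estimates the precision of the averaged intensity.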

Molecular Replacement

Molecular Replacement
Molecular replacement has proven effective for solving macromolecular crystal structures based upon the knowledge of homologous structures.
The method is straightforward and reduces the time and effort required for structure determination, because there is no need to prepare heavy atom derivatives and collect their data.
However, S-SAS phasing also obviates the need to prepare derivatives and can be used, if only as a last resort, to solve the crystal structure.
Model building is also simplified, since little or no chain tracing is required.

Molecular Replacement: Practical Considerations
The 3-dimensional structure of the search model must be very close (< 2 Å rmsd) to that of the unknown structure for the technique to work efficiently.
Sequence homology between the model and the unknown protein is helpful but not strictly required. Success has been observed using search models with as little as 17% sequence similarity.
Several computer programs, such as AmoRe, X-PLOR/CNS and PHASER, are available for MR calculations.

How Molecular Replacement works
Use a model of the protein to estimate phases.
The model must be a structural homologue (RMSD < 2 Å).
Two-step process: rotation and translation.
Find the orientation of the model (red to black in the figure).
Find the location of the oriented model (black to blue in the figure).
px.cryst.bbk.ac.uk/03/sample/molrep.htm

Using a protein model to estimate phases: the rotation function
We need to determine the orientation of the model in the crystal unit cell.
We use a Patterson search approach in (α, β, γ), the Euler angles that span rotational space.

Euler angles for rotation function
The coordinate system is rotated by an angle α around the original z axis, then by an angle β around the new y axis, and then by an angle γ around the final z axis (zyz convention).
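The z, y, z sequence of rotations described on this slide can be written as a product of three elementary matrices. The sketch below assumes an active rotation acting on column vectors, for which successive rotations about the new axes compose as R = Rz(α) Ry(β) Rz(γ); the matrix for rotating the coordinate system instead is the transpose. Individual programs differ in such conventions, so this is an illustration rather than the definition used by any particular package.

```python
import math


def rot_z(angle):
    """Rotation by `angle` (radians) about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    """Rotation by `angle` (radians) about the y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_zyz(alpha, beta, gamma):
    """Rotation matrix for z (alpha), new y (beta), final z (gamma)."""
    return matmul(matmul(rot_z(alpha), rot_y(beta)), rot_z(gamma))


def apply(matrix, vector):
    """Rotate a 3-vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]


R = euler_zyz(math.radians(30.0), math.radians(60.0), math.radians(45.0))
# With alpha = gamma = 0 and beta = 90 degrees, the z axis is carried onto x.
z_rotated = apply(euler_zyz(0.0, math.pi / 2.0, 0.0), [0.0, 0.0, 1.0])
print([round(v, 6) for v in z_rotated])
```

A rotation function search evaluates the overlap of model and experimental Pattersons over a grid of such (α, β, γ) triples.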

Using a protein model to estimate phases: translation function
We need to determine the oriented model's location in the crystal unit cell.
We do this with an R-factor search, where
$$R = \frac{\sum_{\mathbf{h}} \big|\,|F_{\mathbf{h}}(obs)| - k\,|F_{\mathbf{h}}(calc)|\,\big|}{\sum_{\mathbf{h}} |F_{\mathbf{h}}(obs)|}$$

Translation functions
The oriented model is stepped through the crystal unit cell using small increments in x, y and z (e.g. x → x + step).
The point where R is lowest represents the correct location.
There exists an alternative method that uses maximum likelihood to find the translation peak; this notion is embodied in the software package PHASER by Randy Read.
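The stepping procedure can be illustrated with a one-dimensional toy crystal. Everything here is an assumption made for the example: the three-atom "model", the inversion centre at the origin (without some symmetry element a translation would not change the amplitudes at all), the scale factor k taken as the ratio of summed amplitudes, and the restriction of the search to half the cell because a shift of 1/2 is an equivalent origin in this toy case.

```python
import math

# Hypothetical oriented model: fractional coordinates of three equal atoms (1D).
MODEL = [0.00, 0.07, 0.19]
H_RANGE = range(1, 11)


def amplitudes(shift):
    """|F(h)| for the model placed at `shift` plus its inversion-related copy."""
    return [
        abs(sum(2.0 * math.cos(2.0 * math.pi * h * (x + shift)) for x in MODEL))
        for h in H_RANGE
    ]


def r_factor(f_obs, f_calc):
    """R = sum| |Fobs| - k|Fcalc| | / sum|Fobs| with k = sum|Fobs| / sum|Fcalc|."""
    k = sum(f_obs) / sum(f_calc)
    return sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


# "Observed" data simulated from the model at a known position.
TRUE_SHIFT = 13 / 100
f_obs = amplitudes(TRUE_SHIFT)

# Step the model through half the cell in increments of 0.01.
trial_shifts = [i / 100 for i in range(50)]
scores = [(r_factor(f_obs, amplitudes(t)), t) for t in trial_shifts]
best_r, best_shift = min(scores)

print(f"lowest R = {best_r:.3f} at shift = {best_shift:.2f}")
```

With error-free simulated data the minimum is R = 0 at the true position; with real data the correct solution shows up only as the lowest R among many trial positions.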

Molecular Replacement
Figure: flow diagram of the two searches. Rotation function RF(αβγ): model Patterson map compared with the experimental Patterson. Translation search TF(xyz): oriented model Patterson compared with the experimental Patterson map.

Acknowledgements
John Rose, ACA Summer School 2006; reorganized by Andy Howard, Spring 2008
Mike Lawrence, Walter & Eliza Hall Institute of Medical Research, Melbourne, Australia
Joseph Wedekind, University of Rochester Medical Center, Rochester, NY
Manfred S. Weiss, Helmholtz-Zentrum Berlin, Macromolecular Crystallography (HZB-MX), D-12489 Berlin, Germany
Yizhi Jane Tao, Rice University, Houston, Texas
