Overview of the Phase Problem
John Rose, ACA Summer School 2006. Reorganized by Andy Howard, Spring 2008.

[Figure: workflow Protein → Crystal → Data → Phases → Structure]

Remember
- We can measure reflection intensities.
- We can calculate structure-factor amplitudes from the intensities.
- We can calculate the structure factors from atomic positions.
- We need phase information to generate the image.

What is the Phase Problem?
[Figure: x, y, z (real space) → X-ray diffraction experiment → reciprocal space; all phase information is lost.]
In the X-ray diffraction experiment, photons are reflected from the crystal lattice (planes in different directions), giving rise to the diffraction pattern. Using a variety of detectors (film, image plates, CCD area detectors) we can estimate intensities, but we lose any information about the relative phases of the different reflections.

Phases
Let's define a phase $\phi_j$ associated with a specific plane (hkl) for an individual atom j:
$\phi_j = 2\pi(hx_j + ky_j + lz_j)$
Atom at x = 0.40, y = 0.05, z = 0.10 for plane (213):
$\phi = 2\pi(2 \cdot 0.40 + 1 \cdot 0.05 + 3 \cdot 0.10) = 2\pi(1.15)$
If we examine a two-dimensional case like k = 0, then $\phi = 2\pi(hx + lz)$. Thus for (201):
$\phi = 2\pi(2 \cdot 0.40 + 0 \cdot 0.05 + 1 \cdot 0.10) = 2\pi(0.90)$
Now, to understand what this means:

(201) Phases
[Figure: the (201) planes drawn across the a-c face of the unit cell, with the atom at (0.40, y, 0.10); the phase scale along the planes is marked 0, 360° (2π), 720° (4π), 1080° (6π).]
$\phi = 2\pi[2(0.40) + 1(0.10)] = 2\pi(0.90)$

In General for Any Atom (x, y, z)
[Figure: atom j at (x, y, z) between successive hkl planes of spacing d, which carry phases 0, 2π, 4π, 6π.]
Remember:
(1) We express any position in the cell in fractional coordinates: $\mathbf{p}_{xyz} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}$
(2) We express the plane hkl as the sum of integral multiples of the reciprocal axes: $\boldsymbol{\sigma}_{hkl} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$

Diffraction vector for a Bragg spot
We set up the diffraction vector $\boldsymbol{\sigma}_{hkl}$ associated with a specific diffraction direction hkl:
$\boldsymbol{\sigma}_{hkl} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$
The magnitude of this diffraction vector is the reciprocal of our Bragg-law plane spacing $d_{hkl}$:
$|\boldsymbol{\sigma}_{hkl}| = 1/d_{hkl}$
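The worked phase values for the atom at (0.40, 0.05, 0.10) can be checked with a few lines of Python. This is only a sketch of the definition $\phi_j = 2\pi(hx_j + ky_j + lz_j)$; the function names `phase_cycles` and `phase_radians` are illustrative choices, not part of the lecture. Evaluating the sum for (213) gives 2(0.40) + 1(0.05) + 3(0.10) = 1.15 cycles, and for (201) it gives 0.90 cycles.

```python
import math

def phase_cycles(hkl, xyz):
    """Phase of one atom for reflection hkl, in cycles (fractions of 2*pi)."""
    h, k, l = hkl
    x, y, z = xyz
    return h * x + k * y + l * z

def phase_radians(hkl, xyz):
    """The same phase expressed in radians."""
    return 2.0 * math.pi * phase_cycles(hkl, xyz)

atom = (0.40, 0.05, 0.10)  # fractional coordinates from the slide
for hkl in [(2, 1, 3), (2, 0, 1)]:
    cycles = phase_cycles(hkl, atom)
    # Only the fractional part matters physically: 1.15 cycles is the
    # same point on the wave as 0.15 cycles.
    print(hkl, round(cycles, 4), round(cycles % 1.0, 4))
```

Working in cycles rather than radians keeps the arithmetic readable and makes the "whole turns do not matter" property explicit.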

Phase angle for a spot
The phase angle $\phi$ associated with our atom is 2π times the projection of the displacement vector $\mathbf{p}$ onto the direction of $\boldsymbol{\sigma}$, measured in units of the plane spacing d. That is the dot product:
$\phi = 2\pi\,\boldsymbol{\sigma} \cdot \mathbf{p}$
But that displacement vector $\mathbf{p}$ is related to the real-space coordinates of the atom:
$\mathbf{p} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}$
where the fractional coordinates of our atom within the unit cell are (x, y, z). Thus
$\phi = 2\pi\,(h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*) \cdot (x\mathbf{a} + y\mathbf{b} + z\mathbf{c})$

Real-space and reciprocal space
But these real-space and reciprocal-space unit cell vectors (a, b, c) and (a*, b*, c*) are duals of one another; that is, they obey:
$\mathbf{a} \cdot \mathbf{a}^* = 1, \quad \mathbf{a} \cdot \mathbf{b}^* = 0, \quad \mathbf{a} \cdot \mathbf{c}^* = 0$
$\mathbf{b} \cdot \mathbf{a}^* = 0, \quad \mathbf{b} \cdot \mathbf{b}^* = 1, \quad \mathbf{b} \cdot \mathbf{c}^* = 0$
$\mathbf{c} \cdot \mathbf{a}^* = 0, \quad \mathbf{c} \cdot \mathbf{b}^* = 0, \quad \mathbf{c} \cdot \mathbf{c}^* = 1$
even when the unit cell isn't all full of 90-degree angles!

Matrix formulation of this duality
If we construct the 3x3 reciprocal-space unit cell matrix A = (a* b* c*), with a*, b*, c* as its rows, and the 3x3 real-space unit cell matrix R = (a b c), with a, b, c as its columns, for a specific position of the sample, then A and R obey the simple relationship
$A = R^{-1}$, i.e. $AR = I$
where I is the 3x3 identity matrix.

How to use this in getting phases
$\phi = 2\pi\,(h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*) \cdot (x\mathbf{a} + y\mathbf{b} + z\mathbf{c})$
But using those dual relationships, e.g. $\mathbf{a}^* \cdot \mathbf{a} = 1$, $\mathbf{b}^* \cdot \mathbf{c} = 0$, all the cross terms vanish and we get
$\phi = 2\pi(hx + ky + lz)$
Note that this is true even if our unit cell angles aren't 90°!

Why Do We Need the Phase?
[Figure: structure factor and electron density are linked by a Fourier transform and its inverse.]
In order to reconstruct the molecular image (electron density) from its diffraction pattern, both the intensity and the phase of each of the thousands of measured reflections must be known. The phase can assume any value from 0 to 2π.

Importance of Phases
[Figure, four panels: Karle amplitudes with Karle phases; Hauptman amplitudes with Hauptman phases; Karle amplitudes with Hauptman phases; Hauptman amplitudes with Karle phases.]
Phases dominate the image! Phase estimates need to be accurate.
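The duality between the real-space and reciprocal-space cell vectors can be verified numerically for a deliberately non-orthogonal cell. The sketch below is an illustration under stated assumptions: the triclinic cell parameters (10, 12, 15 Å; 80°, 95°, 105°) are made up, the helper names are hypothetical, and the reciprocal vectors are built from the standard cross-product construction a* = (b × c)/V rather than from anything shown in the slides.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian a, b, c (a along x, b in the xy plane); angles in degrees."""
    al, be, ga = (math.radians(t) for t in (alpha, beta, gamma))
    cx = c * math.cos(be)
    cy = c * (math.cos(al) - math.cos(be) * math.cos(ga)) / math.sin(ga)
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return (a, 0.0, 0.0), (b * math.cos(ga), b * math.sin(ga), 0.0), (cx, cy, cz)

def reciprocal_vectors(va, vb, vc):
    """a* = (b x c)/V, b* = (c x a)/V, c* = (a x b)/V."""
    volume = dot(va, cross(vb, vc))
    pairs = ((vb, vc), (vc, va), (va, vb))
    return tuple(tuple(x / volume for x in cross(u, v)) for u, v in pairs)

def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i]."""
    return tuple(sum(k * v[i] for k, v in zip(coeffs, vectors)) for i in range(3))

real = cell_vectors(10.0, 12.0, 15.0, 80.0, 95.0, 105.0)  # assumed cell
recip = reciprocal_vectors(*real)

sigma = combine((2, 1, 3), recip)        # diffraction vector for (213)
p = combine((0.40, 0.05, 0.10), real)    # atom position in Cartesian space
cycles = dot(sigma, p)                   # should equal hx + ky + lz
d_spacing = 1.0 / math.sqrt(dot(sigma, sigma))

print(round(cycles, 6), round(d_spacing, 4))
```

The dot product of the two Cartesian vectors reproduces hx + ky + lz even though none of the cell angles is 90°, which is the point the slides make.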

Understanding the Phase Problem
The phase problem can be best understood from a simple mathematical construct. The structure factors ($F_{hkl}$) are treated in diffraction theory as complex quantities, i.e., they consist of a real part ($A_{hkl}$) and an imaginary part ($B_{hkl}$). If the phases, $\Phi_{hkl}$, were available, the values of $A_{hkl}$ and $B_{hkl}$ could be calculated from very simple trigonometry:
$A_{hkl} = |F_{hkl}| \cos(\Phi_{hkl})$
$B_{hkl} = |F_{hkl}| \sin(\Phi_{hkl})$
This leads to the relationship:
$(A_{hkl})^2 + (B_{hkl})^2 = |F_{hkl}|^2 = I_{hkl}$

Argand Diagram
The above relationships are often illustrated using an Argand diagram. From the Argand diagram, it is obvious that $A_{hkl}$ and $B_{hkl}$ may be either positive or negative, depending on the value of the phase angle, $\Phi_{hkl}$. Note: the units of $A_{hkl}$, $B_{hkl}$ and $|F_{hkl}|$ are electrons.
[Figure: Argand diagram with real axis A and imaginary axis B, showing $F = A + iB$ and $\Phi = \tan^{-1}(B/A)$.]

The Structure Factor
$F_{hkl} = \sum_{j=1}^{N} f_j\, e^{2\pi i(hx_j + ky_j + lz_j)}$
Here $f_j$ is the atomic scattering factor.
[Figure: atomic scattering factors plotted against sinθ/λ, starting from $f_0$.]
The scattering factor for each atom type in the structure is evaluated at the correct sinθ/λ. That value is the scattering ability for that atom. Remember $\sin\theta/\lambda = 1/(2d_{hkl})$. We now have an atomic scattering factor with magnitude $f_0$ and direction $\phi_j$.

The Structure Factor: sum of all individual atom contributions
$\phi_j = 2\pi(hx_j + ky_j + lz_j)$
$F_{hkl} = \sum_{j=1}^{N} f_j\, e^{i\phi_j} = \sum_{j=1}^{N} f_j\, e^{2\pi i(hx_j + ky_j + lz_j)}$
[Figure: Argand diagram in which the individual atom vectors $f_j$ add head to tail to give the resultant $F_{hkl}$, with real component A and imaginary component B.]

Electron Density
Remember that the electron density (the image of the molecule) is the Fourier transform of the structure factor $F_{hkl}$. Thus
$\rho_{xyz} = \frac{1}{V} \sum_{hkl} F_{hkl}\, e^{-2\pi i(hx + ky + lz)}$
Writing $\alpha = 2\pi(hx + ky + lz)$, we have $e^{-i\alpha} = \cos\alpha - i\sin\alpha$ and $F_{hkl} = A_{hkl} + iB_{hkl}$. The imaginary terms cancel between each reflection hkl and its Friedel mate, leaving
$\rho_{xyz} = \frac{1}{V} \left[ \sum_{hkl} A_{hkl} \cos\alpha + \sum_{hkl} B_{hkl} \sin\alpha \right]$
$\rho_{xyz} = \frac{1}{V} \left[ \sum_{hkl} A_{hkl} \cos[2\pi(hx + ky + lz)] + \sum_{hkl} B_{hkl} \sin[2\pi(hx + ky + lz)] \right]$
Here V is the volume of the unit cell.

How to calculate ρ(x, y, z)
In practice, the electron density for one three-dimensional unit cell is calculated by starting at x, y, z = (0, 0, 0) and stepping incrementally along each axis, summing the terms shown in the equation above over all hkl (as limited by the resolution of the data) at each point in space.
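The structure-factor sum and the electron-density synthesis can be tried out on a toy one-dimensional "crystal". The sketch below is an assumption-laden illustration, not the lecture's procedure: it uses a unit cell of length 1, two point atoms with constant scattering factors (6 and 8 electrons, so no sinθ/λ fall-off), and reflections h = -10 … 10; all names are hypothetical.

```python
import cmath
import math

def structure_factor(h, atoms):
    """F_h = sum_j f_j exp(2*pi*i*h*x_j); atoms is a list of (f_j, x_j)."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)

def density(x, factors, length=1.0):
    """rho(x) = (1/L) sum_h F_h exp(-2*pi*i*h*x); factors maps h -> F_h."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    # F_{-h} is the complex conjugate of F_h here, so the imaginary
    # parts cancel and only the real part survives.
    return total.real / length

atoms = [(6.0, 0.20), (8.0, 0.55)]            # assumed toy structure
factors = {h: structure_factor(h, atoms) for h in range(-10, 11)}

# Step along the cell and evaluate the density at each grid point.
grid = [i / 100 for i in range(100)]
rho = [density(x, factors) for x in grid]
best_index = max(range(len(grid)), key=rho.__getitem__)
print("highest density at x =", grid[best_index])

# Amplitude and phase of one reflection, and the A^2 + B^2 = |F|^2 identity.
F3 = factors[3]
A, B = F3.real, F3.imag
print(round(abs(F3), 4), round(cmath.phase(F3), 4), round(A * A + B * B, 4))
```

With the true phases the map peaks at the atom positions. Replacing each `F` by `abs(F)` before calling `density` discards the phases and moves the largest peak to the origin, which is one way to see why the amplitudes alone do not give the image.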

Solving the Phase Problem
Small molecules
- Direct Methods
- Patterson Methods
- Molecular Replacement
Macromolecules
- Multiple Isomorphous Replacement (MIR)
- Multi-Wavelength Anomalous Dispersion (MAD)
- Single Isomorphous Replacement (SIR)
- Single-Wavelength Anomalous Scattering (SAS)
- Molecular Replacement
- Direct Methods (special cases)

Solving the Phase Problem
SMALL MOLECULES: The use of Direct Methods has essentially solved the phase problem for well-diffracting small-molecule crystals.
MACROMOLECULES: Today, anomalous scattering techniques such as MAD or SAS are the most common techniques used for de novo structure determination of macromolecules. Both techniques require the presence of one or more anomalous scatterers in the crystal.

Direct methods
Karle, Hauptman, David Sayre, and others determined algebraic relationships among the phase angles of groups of reflections. The simplest are triplet relationships. For three reflections $\mathbf{h}_1 = (h_1, k_1, l_1)$, $\mathbf{h}_2 = (h_2, k_2, l_2)$, $\mathbf{h}_3 = (h_3, k_3, l_3)$, they showed that if $\mathbf{h}_3 = -\mathbf{h}_1 - \mathbf{h}_2$, then
$\Phi_1 + \Phi_2 + \Phi_3 \approx 0$
Thus if $\Phi_1$ and $\Phi_2$ are known, we can estimate that
$\Phi_3 \approx -\Phi_1 - \Phi_2$
[Photo: David Sayre]

When do triplet relations hold?
Note the approximately-zero value in the relationship $\Phi_1 + \Phi_2 + \Phi_3 \approx 0$. The stronger the Bragg reflections are, the closer this condition is to being exact.
- For very strong Bragg reflections that sum will be very close to zero.
- For weaker ones it may differ significantly from zero.

Phase probabilities
This notion of relationships among phases obliges us to think of phases probabilistically rather than deterministically. This is a key to the direct-methods approach and has a huge influence on how we think about phase determination. I'm introducing all of this mostly to get you accustomed to the notion of phase probability distributions!

Phase probabilities
Any phase has a value between 0 and 2π (or 0 and 360, if we're using degrees). If we know it's close to 2π*0.42, then:
- If it's 2π*(0.42 ± 0.01), it's a sharp phase probability distribution.
- If it's 2π*(0.42 ± 0.32), it's a much broader phase probability distribution.
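The triplet relationship is simple enough to express directly. The following sketch assumes phases are stored in cycles (fractions of 2π) and wrapped into [0, 1); the function names and the example indices and phases are made up for illustration, and the result is only an estimate, as good as the reflections are strong.

```python
def third_index(h1, h2):
    """Index of the reflection that closes the triplet: h3 = -h1 - h2."""
    return tuple(-(a + b) for a, b in zip(h1, h2))

def estimate_third_phase(phi1, phi2):
    """Triplet estimate phi3 ~ -(phi1 + phi2), in cycles, wrapped to [0, 1)."""
    return (-(phi1 + phi2)) % 1.0

h1, h2 = (1, 2, 0), (2, -1, 3)      # hypothetical strong reflections
phi1, phi2 = 0.10, 0.25             # hypothetical known phases, in cycles

h3 = third_index(h1, h2)
phi3 = estimate_third_phase(phi1, phi2)
print(h3, round(phi3, 4))

# By construction the three phases sum to a whole number of cycles,
# i.e. to zero modulo 2*pi.
residual = (phi1 + phi2 + phi3) % 1.0
print(round(min(residual, 1.0 - residual), 12))
```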

Plots of phase probability
[Figure: P(Φ) plotted against Φ for a sharp distribution and a broad distribution.]
The integral of the probability must be 1, since every phase has to have some value.

How can we use this?
Obviously if we don't know $\Phi_1$ and $\Phi_2$, we can't use this to calculate $\Phi_3$, even if the intensities of all three reflections are large. But we could guess what $\Phi_1$ and $\Phi_2$ are and use this to compute $\Phi_3$. Then we guess $\Phi_4$ and use the triplet relationship to compute $\Phi_5$ and $\Phi_6$, where $\mathbf{h}_5 = -\mathbf{h}_1 - \mathbf{h}_4$ and $\mathbf{h}_6 = -\mathbf{h}_2 - \mathbf{h}_4$, assuming that reflections 5 and 6 are strong, too!

Can we make this work?
- We start with guessed phases for 10-100 strong reflections and use the triplet relationships to determine the phases for another 1000 reflections.
- Any particular calculated phase can be determined by several different triplet relationships, so if they're self-consistent, the initial 10-100 guesses are likely correct; if they aren't self-consistent, the guess was wrong!
- In the latter case, we try a different set of guesses for our 10-100 starting phases and keep going.

This actually works, provided:
- The data are correctly measured.
- The data are strong enough that we can pick 1000 strong reflections to use in this process.
- The data extend to high enough resolution that atomicity (separable atoms) is really found.
There are ways to do direct methods without assuming atomicity, but they're more complicated.

Is this relevant to macromolecules?
Not directly:
- Atomicity is rarely present.
- There are systematic errors in the data.
Indirectly yes, because it can be used in conjunction with other methods for locating heavy atoms in the SIR, MIR, and SAS methods. It also helps introduce the notion of phase probability distributions (sneaky!).

SIR and SAS Methods
1. Need a heavy atom (lots of electrons) or an anomalous scatterer (large anomalous scattering signal) in the crystal.
   SIR: heavy atoms are usually soaked in.
   SAS: anomalous scatterers are usually engineered in as selenomethionine labels. They can also be soaked in.
2. SIR: collect a native and a derivative data set (2 sets total).
   SAS: collect one highly redundant data set and keep anomalous pairs separate during processing. You may want to choose a scatterer or wavelength that enhances the anomalous signal.
3. Must find the heavy atoms or anomalous scatterers: can use Patterson analysis or direct methods.
4. Must resolve the bimodal ambiguity: use solvent flattening or a similar technique.
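The self-consistency test described for direct methods, where one phase is estimated by several different triplets, can be illustrated with a circular average. This is a hedged sketch, not the weighting scheme of any real direct-methods program: each estimate is treated as a unit vector on the Argand diagram, the direction of the average gives a combined phase, and the length of the average (between 0 and 1) is used here as an assumed, illustrative consistency score. The name `combine_estimates` and the sample values are hypothetical.

```python
import cmath
import math

def combine_estimates(estimates_cycles):
    """Return (mean phase in cycles, consistency in [0, 1]) for a list of
    independent estimates of the same phase, each given in cycles."""
    vectors = [cmath.exp(2j * math.pi * p) for p in estimates_cycles]
    resultant = sum(vectors) / len(vectors)
    mean = (cmath.phase(resultant) / (2.0 * math.pi)) % 1.0
    return mean, abs(resultant)

# Three triplets that agree well: the starting guesses look plausible.
good_mean, good_score = combine_estimates([0.40, 0.42, 0.44])

# Three triplets that disagree: the starting guesses should be discarded.
bad_mean, bad_score = combine_estimates([0.10, 0.45, 0.80])

print(round(good_mean, 4), round(good_score, 4))
print(round(bad_mean, 4), round(bad_score, 4))
```

Averaging unit vectors rather than raw numbers avoids the wrap-around problem: estimates of 0.98 and 0.02 cycles agree closely, although their arithmetic mean (0.50) would suggest otherwise.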

What's the bimodal ambiguity?
- As we'll show next time, a single isomorphous derivative or anomalous scatterer enables us to measure each phase apart from a twofold ambiguity.
- That is, for each phase we get two answers (e.g. 2π*0.12 and 2π*0.55), and we can't pick one out.
- A second scatterer will resolve that.

Phase probabilities with no error
A single derivative with no error gives a phase probability with two equally likely peaks.
[Figure: P(Φ) against Φ, two sharp peaks.]

2 derivatives, no error
[Figure: P(Φ) against Φ, with peaks labeled "Correct phase", "Wrong estimate derived from derivative 1" and "Wrong estimate derived from derivative 2".]
The two distributions overlap at the correct answer, not at the wrong answers.

Errors spread this out
- Each phase estimate is not really that sharp.
- Lack of isomorphism (see below) makes each distribution spread out.
- The joint probability distribution from 2 or more experiments is the product of the probability distributions of the individual experiments.

Realistic probability distributions
[Figure: P(Φ) against Φ for broadened distributions.]
Joint probability distribution = product of the individual ones.

Joint probability distribution
[Figure: phase probability (0 to 0.35) plotted against Phase/2π (0 to 1) for three curves: norm(p1), norm(p2) and norm(p1*p2). P1(Φ) for the first derivative has peaks at 0.32 and 0.558; P2(Φ) for the second derivative has peaks at 0.315 and 0.815; the joint probability distribution is P1(Φ)*P2(Φ).]
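The joint-probability plot can be reproduced approximately with a short calculation. The peak positions (0.32 and 0.558 for the first derivative, 0.315 and 0.815 for the second, in units of Phase/2π) come from the slide; everything else in this sketch is an assumption: each peak is modeled as a von Mises-style bump exp(κ(cos Δ − 1)), the sharpness `KAPPA = 20` is an arbitrary choice, and the names are illustrative.

```python
import math

def bimodal(phase, peaks, kappa):
    """Unnormalized two-peak distribution; phase and peaks are in cycles."""
    return sum(math.exp(kappa * (math.cos(2.0 * math.pi * (phase - mu)) - 1.0))
               for mu in peaks)

def normalize(values, step):
    """Scale values so that their integral over one full cycle is 1."""
    area = sum(values) * step
    return [v / area for v in values]

N = 1000
STEP = 1.0 / N
KAPPA = 20.0                                   # assumed peak sharpness
grid = [i * STEP for i in range(N)]

p1 = normalize([bimodal(x, (0.32, 0.558), KAPPA) for x in grid], STEP)
p2 = normalize([bimodal(x, (0.315, 0.815), KAPPA) for x in grid], STEP)

# Joint distribution = product of the individual ones, renormalized.
joint = normalize([u * v for u, v in zip(p1, p2)], STEP)

best_phase = grid[max(range(N), key=joint.__getitem__)]
print("most probable phase / 2*pi:", round(best_phase, 3))
```

Each derivative alone leaves two equally likely answers; the product keeps only the region where both distributions are large, near 0.32, and suppresses the two unmatched peaks at 0.558 and 0.815.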

Heavy Atom Derivatives
- Heavy atom derivatives MUST be isomorphous.
- Heavy atom derivatives are generally prepared by soaking crystals in dilute (2-20 mM) solutions of heavy atom salts (see Table II below for some examples).
- Crystal cracking is generally a good indication that the heavy atom is interacting with the crystal lattice, and suggests that a good derivative can be obtained by soaking the crystal in a more dilute solution.

Is the derivative worth using?
Once derivative data have been collected, the merging R factor ($R_{merge}$) between the native and derivative data sets can be used to check for heavy atom incorporation and isomorphism.
- $R_{merge}$ values for isomorphous derivatives range from 0.05 to 0.15.
- Values below 0.05 indicate that there is little heavy atom incorporation.
- Values above 0.15 indicate a lack of isomorphism between the two crystals.

What is isomorphism?
Isomorphism for derivatives means that the structure of the derivatized macromolecule is identical to the structure of the underivatized molecule except at the site where the derivative compound has been introduced.

What is lack of isomorphism?
A derivative may be nonisomorphous if:
- It alters the unit cell lengths or angles significantly (> 0.2%?).
- It rotates or translates the entire macromolecule within the unit cell.
- It alters significantly the conformation of a large segment (> 8 amino acids or 4 nucleotides?) of the macromolecule.

Derivative compounds
Table II. Protein Residues and Their Affinities for Heavy Metals

Residue                     Affinity for                       Conditions
Histidine                   K2PtCl4, NaAuCl4, EtHgPO4H2        pH > 6
Tryptophan                  Hg(OAc)2, EtHgPO4H2
Glutamic, Aspartic Acids    UO2(NO3)2, rare earth cations      pH > 5
Cysteine                    Hg, Ir, Pt, Pd, Au cations         pH > 7
Methionine                  PtCl4(2-) anion

Finding the Heavy Atoms or Anomalous Scatterers
The Patterson function
- a Fourier transform of $|F|^2$ with all phases set to 0
- a vector map (u, v, w instead of x, y, z)
- maps all inter-atomic vectors
- gives $N^2$ vectors!! (where N = number of atoms)
$P_{uvw} = \frac{1}{V} \sum_{hkl} |F_{hkl}|^2 \cos 2\pi(hu + kv + lw)$

From Glusker, Lewis and Rossi.
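A one-dimensional toy calculation shows how the Patterson function recovers an inter-atomic vector from intensities alone. This is a sketch under assumptions that are not in the slides: a unit cell of length 1, two point "heavy atoms" with constant scattering factors 30 and 20 at x = 0.10 and x = 0.35, and reflections h = -10 … 10. The function and variable names are illustrative.

```python
import cmath
import math

def structure_factor(h, atoms):
    """F_h = sum_j f_j exp(2*pi*i*h*x_j); atoms is a list of (f_j, x_j)."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)

def patterson(u, intensities, length=1.0):
    """P(u) = (1/L) sum_h |F_h|^2 cos(2*pi*h*u); no phases are needed."""
    return sum(I * math.cos(2.0 * math.pi * h * u)
               for h, I in intensities.items()) / length

atoms = [(30.0, 0.10), (20.0, 0.35)]           # assumed toy structure
intensities = {h: abs(structure_factor(h, atoms)) ** 2 for h in range(-10, 11)}

grid = [i / 100 for i in range(100)]
pmap = [patterson(u, intensities) for u in grid]

# The largest peak is always the origin peak (every atom to itself).
origin_index = max(range(100), key=pmap.__getitem__)

# Look for the strongest peak away from the origin.
candidates = [i for i in range(100) if 10 <= i <= 90]
best_u = grid[max(candidates, key=pmap.__getitem__)]
print("origin peak at u =", grid[origin_index])
print("strongest non-origin peak near u =", best_u)
```

Two atoms give N² = 4 vectors: two self-vectors piled up at the origin and the pair ±0.25 (the latter appearing at u = 0.75 by periodicity). With this few reflections, series-termination ripples can shift the apparent maximum by about one grid step, so the peak should be read as "close to 0.25", not as an exact value.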