Likelihood and SAD phasing in Phaser. R J Read, Department of Haematology Cambridge Institute for Medical Research

Similar documents
Phaser: Experimental phasing

Molecular replacement. New structures from old

Structure solution from weak anomalous data

Charles Ballard (original GáborBunkóczi) CCP4 Workshop 7 December 2011

Experimental phasing in Crank2

The Crystallographic Process

SHELXC/D/E. Andrea Thorn

Experimental phasing in Crank2

Scattering by two Electrons

Acta Crystallographica Section F

CCP4 Diamond 2014 SHELXC/D/E. Andrea Thorn

Direct Methods and Many Site Se-Met MAD Problems using BnP. W. Furey

Direct Method. Very few protein diffraction data meet the 2nd condition

Experimental Phasing with SHELX C/D/E

research papers Reduction of density-modification bias by b correction 1. Introduction Pavol Skubák* and Navraj S. Pannu

Web-based Auto-Rickshaw for validation of the X-ray experiment at the synchrotron beamline

PHENIX Wizards and Tools

Electronic Supplementary Information (ESI) for Chem. Commun. Unveiling the three- dimensional structure of the green pigment of nitrite- cured meat

Maximum Likelihood. Maximum Likelihood in X-ray Crystallography. Kevin Cowtan Kevin Cowtan,

Practical aspects of SAD/MAD. Judit É Debreczeni

Determination of the Substructure

Experimental phasing, Pattersons and SHELX Andrea Thorn

Macromolecular Phasing with shelxc/d/e

Direct-method SAD phasing with partial-structure iteration: towards automation

PAN-modular Structure of Parasite Sarcocystis muris Microneme Protein SML-2 at 1.95 Å Resolution and the Complex with 1-Thio-β-D-Galactose

Protein Crystallography

Protein crystallography. Garry Taylor

X-ray Crystallography. Kalyan Das

SOLVE and RESOLVE: automated structure solution, density modification and model building

Anomalous dispersion

X-ray Crystallography

research papers Detecting outliers in non-redundant diffraction data 1. Introduction Randy J. Read

Recent developments in Crank. Leiden University, The Netherlands

Biology III: Crystallographic phases

Pipelining Ligands in PHENIX: elbow and REEL

ANODE: ANOmalous and heavy atom DEnsity

Supporting Information

ACORN - a flexible and efficient ab initio procedure to solve a protein structure when atomic resolution data is available

MR model selection, preparation and assessing the solution

The Crystallographic Process

research papers Iterative-build OMIT maps: map improvement by iterative model building and refinement without model bias 1.

research papers HKL-3000: the integration of data reduction and structure solution from diffraction images to an initial model in minutes

Joana Pereira Lamzin Group EMBL Hamburg, Germany. Small molecules How to identify and build them (with ARP/wARP)

Crystal lattice Real Space. Reflections Reciprocal Space. I. Solving Phases II. Model Building for CHEM 645. Purified Protein. Build model.

The SHELX approach to the experimental phasing of macromolecules. George M. Sheldrick, Göttingen University

Crystals, X-rays and Proteins

General theory of diffraction

Molecular Replacement (Alexei Vagin s lecture)

Image definition evaluation functions for X-ray crystallography: A new perspective on the phase. problem. Hui LI*, Meng HE* and Ze ZHANG

ANODE: ANOmalous and heavy atom DEnsity

Molecular Biology Course 2006 Protein Crystallography Part I

Supporting Information

The application of multivariate statistical techniques improves single-wavelength anomalous diffraction phasing

SOLID STATE 9. Determination of Crystal Structures

Phase problem: Determining an initial phase angle α hkl for each recorded reflection. 1 ρ(x,y,z) = F hkl cos 2π (hx+ky+ lz - α hkl ) V h k l

research papers Liking likelihood 1. Introduction Airlie J. McCoy

Garib N Murshudov MRC-LMB, Cambridge

Fan, Hai-fu Institute of Physics, Chinese Academy of Sciences, Beijing , China

PSD '17 -- Xray Lecture 5, 6. Patterson Space, Molecular Replacement and Heavy Atom Isomorphous Replacement

Molecular Biology Course 2006 Protein Crystallography Part II

Probability Distributions: Continuous

Protein Crystallography Part II

Kevin Cowtan, York The Patterson Function. Kevin Cowtan

Statistical density modification with non-crystallographic symmetry

GC376 (compound 28). Compound 23 (GC373) (0.50 g, 1.24 mmol), sodium bisulfite (0.119 g,

Molecular Replacement. Airlie McCoy

Data Collection. Overview. Methods. Counter Methods. Crystal Quality with -Scans

5.067 Crystal Structure Refinement Fall 2007

The Phase Problem of X-ray Crystallography

Charge density refinement at ultra high resolution with MoPro software. Christian Jelsch CNRS Université de Lorraine

BCM Protein crystallography - II Isomorphous Replacement Anomalous Scattering and Molecular Replacement Model Building and Refinement

Schematic representation of relation between disorder and scattering

4. Constraints and Hydrogen Atoms

Table 1. Crystallographic data collection, phasing and refinement statistics. Native Hg soaked Mn soaked 1 Mn soaked 2

Removing bias from solvent atoms in electron density maps

Oxidation of cobalt(ii) bispidine complexes with dioxygen

Electron Density at various resolutions, and fitting a model as accurately as possible.

research papers An introduction to molecular replacement 1. Introduction Philip Evans a * and Airlie McCoy b

Computational aspects of high-throughput crystallographic macromolecular structure determination

Scattering Lecture. February 24, 2014

S-SAD and Fe-SAD Phasing using X8 PROTEUM

Ensemble refinement of protein crystal structures in PHENIX. Tom Burnley Piet Gros

Determining Protein Structure BIBC 100

Rietveld Structure Refinement of Protein Powder Diffraction Data using GSAS

Substructure search procedures for macromolecular structures

Structure factors again

MRSAD: using anomalous dispersion from S atoms collected at CuKffwavelength in molecular-replacement structure determination

Supplementary Material for. Herapathite

Summary of Experimental Protein Structure Determination. Key Elements

Professor G N Ramachandran's Contributions to X-ray Crystallography

White Phosphorus is Air-Stable Within a Self-Assembled Tetrahedral Capsule

Overview - Macromolecular Crystallography

Macromolecular Crystallography Part II

Cover Page. The handle holds various files of this Leiden University dissertation.

New Delhi Metallo-β-Lactamase: Structural Insights into β- Lactam Recognition and Inhibition

Iron Complexes of a Bidentate Picolyl NHC Ligand: Synthesis, Structure and Reactivity

Supplementary Information

SUPPLEMENTARY INFORMATION

What is the Phase Problem? Overview of the Phase Problem. Phases. 201 Phases. Diffraction vector for a Bragg spot. In General for Any Atom (x, y, z)

Transcription:

Likelihood and SAD phasing in Phaser R J Read, Department of Haematology Cambridge Institute for Medical Research

Concept of likelihood Likelihood with dice 4 6 8 10 Roll a seven. Which die?? p(4)=p(6)=0 p(8)=1/8 p(10)=1/10

Principle of maximum likelihood Best model is most consistent with data Measure consistency by probabilities Optimise model by adjusting parameters in probability distribution

Least squares and likelihood Most experiments have multiple sources of error: Gaussian error in observations Central Limit Theorem Likelihood for Gaussians = least squares

Least-squares line fitting A [S] ( ) ( ) ( ) ( ) + = j calc j obs j j obs j calc j obs j obs j A A c A p A A A p 2 2,,, 2 2,, 2, ln 2 exp 2 1 σ σ πσ

Why not least squares in crystallography? Gaussian error for observations Error in predicting observation generally includes difference between structure factors this is Gaussian in phased difference e.g. F vs. F C from model, F P vs. F PH Phased error usually dominates elimination of unknown phase changes probabilities

Applying likelihood to crystallography Find probability distribution for observations start from structure factor probabilities eliminate unknown phase angles Adjust parameters to optimise likelihood Applications: calculating model phase probabilities structure refinement experimental phasing (isomorphous/anomalous) likelihood-based molecular replacement

The Central Limit Theorem Probability distribution of a sum of independent random variables tends to be Gaussian regardless of distributions of variables in sum Conditions: sufficient number of independent random variables none may dominate the distribution Centroid (mean) of Gaussian is sum of centroids Variance of Gaussian is sum of variances

Effect of atomic errors (or differences) Atomic errors give boomerang distribution of possible atomic contributions Portion of atomic contribution is correct df f Bragg Planes

Structure factor with coordinate errors Same direction as the sum of the atomic f but shorter by 0< D <1 D=f(resolution) Central Limit Theorem Many small atoms F σ Δ Gaussian distribution for the total summed F σ Δ =f(resolution) DF

Probability distribution for related structure factors Fraction of calculated structure factor correlated to true structure factor: D 1 p ( FF ; C ) = exp πεσ F DF C 2 2 Δ εσ Δ (Sim, Luzzati, Srinivasan ) Takes form of complex normal distribution 2 DF C F C σ Δ

Amplitude probability distribution Integrate over unknown phase angle to get Rice (Luzzati, Sim, Srinivasan) distribution F C

SAD: single-wavelength anomalous diffraction Most popular way to solve structures by experimental phasing (over 70% and rising) Can be done with intrinsic S and CuKα X-rays SAD phasing theory is very good Easy to automate Can be very fast Can be done from single dataset May need multiple crystals And careful data processing

Anomalous scattering A nom a lous adj. Deviating from the normal or common order, form or rule Anomalous scattering is absolutely normal while normal scattering occurs only as an ideal, over simplified model, which can be used as a first approximation when studying scattering problems IUCR Pamphlet Anomalous Dispersion of X-rays in Crystallography S. Caticha-Ellis (1998)

Anomalous scattering Anomalous scattering is due to the electrons being tightly bound (particularly in K & L shells) In classical terms, the electrons scatter as though they have resonant frequencies

https://www.youtube.com/watch?v=aznnwq8hjhu

Diffraction with anomalous scatterers SAD: single-wavelength anomalous diffraction

Diffraction with anomalous scatterers SAD: single-wavelength anomalous diffraction

Harker construction for SAD phasing F O F + " H F " H F H + F H ' F P F O +

Collections of structure factors Distribution for one is complex normal 2D Gaussian centered on expected value Im one variance describes size of uncertainty Distribution for several is multivariate complex normal single variance is replaced by covariance matrix covariance matrix describes uncertainties and correlations among structure factors SAD: 2 observed structure factors, and 2 calculated structure factors F F Re

SAD likelihood function Factor joint probability into two parts ( ) = p( F + O ;F O,H +,H ) p( F O ;H ) p F O +,F O ;H +,H Integrate out unknown phases, α + and α -

Intuitive understanding of SAD phasing Expected value of F -* (H -* ) Expected difference between F + and F -* Re ( * ) F F O P * ( F, α ;H ) O O * ( F;F, H, H ) P + + O O

Intuitive understanding of SAD phasing Total likelihood is integral of the product of the two distributions under the black circle Expected value of F -* (H -* ) Expected difference between F + and F -* F O

SAD log-likelihood gradient (LLG) map Compute derivative of log-likelihood with respect to heavy atom structure factor Fourier transform gives map of where likelihood target would like to see changes in anomalous scatterer model Very sensitive to minor sites picks up sites identified as water molecules in refined structures determined by halide soaks http://www.phaser.cimr.cam.ac.uk/index.php/tutorials tutorial with data for lysozyme iodide soak

Locating anomalous scatterers in model solved by MR Structure of thyroxine-binding globulin thyroxine doesn t bind where most people expected Thyroxine contains 4 iodine atoms f 3e for λ=0.979å Compare conventional model-phased anomalous difference map with Phaser LLG map

mol 1 Δ ano, 3.5σ LLG, 5.5σ mol 2

Combining MR and SAD CuKα data to 1.9Å on hen egg-white lysozyme can t find sulfurs with HySS or SHELXD Solve by MR with goat alpha-lactalbumin (40% identical) Use MR model as substructure for SAD look for S atoms in LLG map (finds all 10 S, 5-9 Cl - ) phases automatically combine MR and SAD Automated fitting with density-modified map tutorial with these data is available

Breakdown of Friedel s law Friedel s law breaks down for mixture of scatterers differing in real:anomalous ratio

Nitrate reductase (Natalie Strynadka) Integral membrane protein, 1976 residues contains 21 Fe atoms, 1 Mo, 118 S, 5 P (146 total) solved using combination of Fe-MAD, MIRAS

SAD phasing of nitrate reductase Fe peak SAD data only find 11 Fe sites with phenix.hyss several are super-sites of Fe 4 S 4 clusters phase and complete adding Fe, Mo, S with Phaser total of 57 sites: 20 Fe, 6 Mo, 31 S superatoms are resolved, 51 of 57 are identified correctly identity based initially on height of peak in LLG map, revised based on refined occupancy correct hand indicated by number of sites, LLG score

Iterative model-building and phasing Improve phases by density modification Build with ARP/wARP (or Resolve) 1607 residues, 1368 docked in sequence LLG completion from ARP/wARP model 105 sites, 92 correctly identified Repeat DM and ARP/wARP 1813 residues, 1775 docked in sequence

Automation of SAD phasing Functions are all available from Python used for SAD in PHENIX AutoSol GUI used as optional substructure completion method in phenix.hyss Log-likelihood-gradient completion look for one or several types of scatterer start from MR model (atoms or density) or partial substructure analyse map to add sites, make atoms anisotropic delete atoms that fade away repeat to convergence

Practical aspects of SAD phasing in Phaser Provide information about cell content sequence, molecular weight, percent solvent used to put data on absolute scale occupancies are reasonably accurate Provide information about f values wavelength (table lookup) or measured refined by default if near edge Try both hands if uncertain separate completion if mixture of atom types

SAD phasing in Phenix AutoSol GUI finds sites with HySS sensitivity of HySS improved by using Phaser to complete and score substructures automatically uses Phaser for phasing if SAD data tests both hands, chooses best hand carries out Resolve density modification and modelbuilding

SAD phasing in CCP4 ccp4i GUI modes for SAD phasing or MR+SAD SAD phasing pipeline find substructure with Hyss or SHELXD phase both hands density modification with parrot quick model-building with buccaneer MR+SAD provide MR model only phase in hand of MR solution

Background information Phaser crystallographic software, McCoy, Grosse- Kunstleve, Adams, Winn, Storoni & Read (2007), J. Appl. Cryst. 40, 658-674. plus papers cited here Liking likelihood, Airlie J. McCoy (2004), Acta Cryst. D60, 2169-2183. http://www.phaser.cimr.cam.ac.uk/index.php http://www.phaser.cimr.cam.ac.uk/index.php/tutorials http://www-structmed.cimr.cam.ac.uk/course

Contributors Experimental phasing Airlie McCoy, Laurent Storoni PHENIX collaboration Ralf Grosse-Kunstleve, Nigel Moriarty, Paul Adams Tom Terwilliger CCP4 SAD pipeline Kevin Cowtan