Structure solution from weak anomalous data

Similar documents
Phaser: Experimental phasing

The Crystallographic Process

Likelihood and SAD phasing in Phaser. R J Read, Department of Haematology Cambridge Institute for Medical Research

Charles Ballard (original GáborBunkóczi) CCP4 Workshop 7 December 2011

PHENIX Wizards and Tools

Molecular replacement. New structures from old

SOLVE and RESOLVE: automated structure solution, density modification and model building

SHELXC/D/E. Andrea Thorn

The Crystallographic Process

Web-based Auto-Rickshaw for validation of the X-ray experiment at the synchrotron beamline

Experimental phasing, Pattersons and SHELX Andrea Thorn

Determination of the Substructure

Practical aspects of SAD/MAD. Judit É Debreczeni

Molecular Replacement. Airlie McCoy

Experimental Phasing with SHELX C/D/E

Pipelining Ligands in PHENIX: elbow and REEL

ANODE: ANOmalous and heavy atom DEnsity

Experimental phasing in Crank2

PAN-modular Structure of Parasite Sarcocystis muris Microneme Protein SML-2 at 1.95 Å Resolution and the Complex with 1-Thio-β-D-Galactose

Anomalous dispersion

ANODE: ANOmalous and heavy atom DEnsity

The SHELX approach to the experimental phasing of macromolecules. George M. Sheldrick, Göttingen University

Experimental phasing in Crank2

research papers HKL-3000: the integration of data reduction and structure solution from diffraction images to an initial model in minutes

CCP4 Diamond 2014 SHELXC/D/E. Andrea Thorn

research papers Iterative-build OMIT maps: map improvement by iterative model building and refinement without model bias 1.

Direct Method. Very few protein diffraction data meet the 2nd condition

research papers Reduction of density-modification bias by b correction 1. Introduction Pavol Skubák* and Navraj S. Pannu

Direct Methods and Many Site Se-Met MAD Problems using BnP. W. Furey

BCM Protein crystallography - II Isomorphous Replacement Anomalous Scattering and Molecular Replacement Model Building and Refinement

Macromolecular Phasing with shelxc/d/e

Substructure search procedures for macromolecular structures

Axel T. Brunger, Debanu Das, Ashley M. Deacon, Joanna Grant, Thomas C. Terwilliger, Randy J. Read, Paul D. Adams, Michael Levitt and Gunnar F.

The application of multivariate statistical techniques improves single-wavelength anomalous diffraction phasing

research papers 1. Introduction Thomas C. Terwilliger a * and Joel Berendzen b

Non-merohedral Twinning in Protein Crystallography

Biology III: Crystallographic phases

Acta Crystallographica Section F

ACORN - a flexible and efficient ab initio procedure to solve a protein structure when atomic resolution data is available

Scattering by two Electrons

electronic reprint PHENIX: a comprehensive Python-based system for macromolecular structure solution

Electronic Supplementary Information (ESI) for Chem. Commun. Unveiling the three- dimensional structure of the green pigment of nitrite- cured meat

Image definition evaluation functions for X-ray crystallography: A new perspective on the phase. problem. Hui LI*, Meng HE* and Ze ZHANG

research papers Detecting outliers in non-redundant diffraction data 1. Introduction Randy J. Read

Experimental Phasing of SFX Data. Thomas Barends MPI for Medical Research, Heidelberg. Conclusions

MRSAD: using anomalous dispersion from S atoms collected at CuKffwavelength in molecular-replacement structure determination

Computational aspects of high-throughput crystallographic macromolecular structure determination

MR model selection, preparation and assessing the solution

Protein Crystallography

Supporting Information

Quantifying instrument errors in macromolecular X-ray data sets

for Automated Macromolecular Crystal Structure Solution

ACORN in CCP4 and its applications

Supplementary Information

Supporting Information

Direct-method SAD phasing with partial-structure iteration: towards automation

Statistical density modification with non-crystallographic symmetry

Protein crystallography. Garry Taylor

Recent developments in Crank. Leiden University, The Netherlands

Linking data and model quality in macromolecular crystallography. Kay Diederichs

Maximum Likelihood. Maximum Likelihood in X-ray Crystallography. Kevin Cowtan Kevin Cowtan,

Pathogenic C9ORF72 Antisense Repeat RNA Forms a Double Helix with Tandem C:C Mismatches

Supplemental Data. Structure of the Rb C-Terminal Domain. Bound to E2F1-DP1: A Mechanism. for Phosphorylation-Induced E2F Release

Resolution Dependence of an Ab Initio Phasing Method in Protein X-ray Crystallography

Data quality indicators. Kay Diederichs

Molecular Replacement (Alexei Vagin s lecture)

research papers Liking likelihood 1. Introduction Airlie J. McCoy

Tutorial on how to solve a Se-substructure using

Cover Page. The handle holds various files of this Leiden University dissertation.

research papers Single-wavelength anomalous diffraction phasing revisited 1. Introduction Luke M. Rice, a Thomas N. Earnest b and Axel T.

Twinning in PDB and REFMAC. Garib N Murshudov Chemistry Department, University of York, UK

Full wwpdb X-ray Structure Validation Report i

research papers An introduction to molecular replacement 1. Introduction Philip Evans a * and Airlie McCoy b

X-ray Crystallography. Kalyan Das

TLS and all that. Ethan A Merritt. CCP4 Summer School 2011 (Argonne, IL) Abstract

Principles of Protein X-Ray Crystallography

Data Reduction. Space groups, scaling and data quality. MRC Laboratory of Molecular Biology Cambridge UK. Phil Evans Diamond December 2017

X-ray Data Collection. Bio5325 Spring 2006

Twinning. Andrea Thorn

Likelihood-enhanced fast translation functions

Structural Basis for Methyl Transfer by a Radical SAM Enzyme

Patterson Methods

Full wwpdb X-ray Structure Validation Report i

Resolution and data formats. Andrea Thorn

Full wwpdb X-ray Structure Validation Report i

Full wwpdb X-ray Structure Validation Report i

The Phase Problem of X-ray Crystallography

Supplementary materials. Crystal structure of the carboxyltransferase domain. of acetyl coenzyme A carboxylase. Department of Biological Sciences

GC376 (compound 28). Compound 23 (GC373) (0.50 g, 1.24 mmol), sodium bisulfite (0.119 g,

Full wwpdb X-ray Structure Validation Report i

Ab initio molecular-replacement phasing for symmetric helical membrane proteins

research papers Ab initio structure determination using dispersive differences from multiple-wavelength synchrotronradiation powder diffraction data

Nitrogenase MoFe protein from Clostridium pasteurianum at 1.08 Å resolution: comparison with the Azotobacter vinelandii MoFe protein

Full wwpdb X-ray Structure Validation Report i

5.067 Crystal Structure Refinement Fall 2007

Space Group & Structure Solution

S-SAD and Fe-SAD Phasing using X8 PROTEUM

Full wwpdb X-ray Structure Validation Report i

Full wwpdb X-ray Structure Validation Report i

Transcription:

Structure solution from weak anomalous data Phenix Workshop SBGrid-NE-CAT Computing School Harvard Medical School, Boston June 7, 2014 Gábor Bunkóczi, Airlie McCoy, Randy Read (Cambridge University) Nat Echols, Ralf Grosse-Kunstleve, Paul Adams (Lawrence Berkeley National Laboratory) Tom Terwilliger (Los Alamos National Laboratory)

Structure solution from weak anomalous data The problems with weak signal Quantifying the anomalous signal Solving the anomalous sub-structure with weak signal Solving structures with weak signal Estimating the anomalous signal from the data

Structure solution from weak anomalous data The problem: low anomalous signal-to-noise Reasons: few anomalous scatterers, sulfur SAD, weak diffraction, wavelength far from peak

Structure solution from weak anomalous data Consequences of low anomalous signal-to-noise Substructure identification is difficult Phasing is poor Iterative density modification, model-building and refinement works poorly

Measures of anomalous signal-to-noise Ratio of anomalous signal (<ΔI 2 >-<σ 2 >( to uncertainty in anomalous signal (<σ 2 >) (0 for no anomalous signal) Ratio of anomalous differences to differences among equivalent centric reflections (1 for no anomalous signal) Anomalous correlation (between wavelengths, between half-datasets) Measurability (fraction of anomalous differences measured with difference/sigma > 3)

Measures of anomalous signal-to-noise Problems with these measures: Most require estimates of uncertainties in the data No obvious relationship between the signal-to-noise and whether a structure can be solved End up with rules of thumb: data with anomalous CC > 0.3 will be useful )

Another measure of anomalous signal-to-noise Make a measure that is related to what we want to do with the anomalous signal How about: The signal-to-noise in an anomalous difference Fourier at the positions of the anomalous scatterers Related to: the signal-to-noise in the anomalous differences, the number of reflections, and to the number of sites

Signal-to-noise in an anomalous difference Fourier at the positions of the anomalous scatterers Advantage: Closely related to what we need to do with the anomalous differences Disadvantage: Can only be calculated directly if the structure is solved (Can be estimated )

Example of anomalous signal-to-noise Holton Challenge data http://bl831.als.lbl.gov/~jamesh/challenge/ Simulated diffraction data from 3dk0 to 1.8 Å (useful to 2.3 Å) 0% to 100% occupancy of Se in selenomethionine Impossible.mtz" is difficulty (fraction S) = 0.79

Difficulty: 0.78 22% SeMet incorporation http://bl831.als.lbl.gov/~jamesh/powerpoint/ anomalous_challenge.pptx

Difficulty: 0.79 21% SeMet incorporation http://bl831.als.lbl.gov/~jamesh/powerpoint/ anomalous_challenge.pptx

Example of anomalous signal-to-noise Holton Challenge data Anomalous signal (mean density at sites in anomalous different Fourier) 16 14 12 10 8 6 4 2 0 Anomalous signal 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Fraction Se

Finding the anomalous sub-structure with weak anomalous signal Current approaches Anomalous Difference Patterson seeding Direct methods (Rantan) Dual-space methods (Shelxd, HySS, Crunch2) Difference Fourier (Solve)

Finding the anomalous sub-structure with weak anomalous signal Most powerful source of information about substructure before phases are known is the SAD likelihood function: The likelihood of measuring the observed anomalous data given a partial model

Using the SAD likelihood function to find the anomalous sub-structure Start with guess about the anomalous sub-structure From anomalous difference Patterson Random Any other source Find additional sites that increase the likelihood LLG completion based on log-likelihood gradient maps* Iterative addition of sites Related to using a difference Fourier but much better *La Fortelle, E. de & Bricogne, G. (1997). Methods Enzymol. 276, 472-494 McCoy, A. J. & Read, R. J. (2010). Acta Cryst. D66, 458-469.

Using LLG completion in HySS Guess 2- site solutions Peaks from Patterson Extrapolation Direct methods Phaser LLG completion Scoring Correlation Phaser LLG Range of resolution Variable number of Patterson solutions Adjustable LLGC_SIGMA (cut-off for peak height) Use LLG score to compare solutions Terminate early if same solution found several times Run quick direct methods first

Using LLG completion in HySS Test cases 164 SAD datasets from PDB (largely JCSG MAD data) Using peak, remotes, inflection as available to include data with low anomalous signal

Setting up test data on 165 datasets phenix.fetch_pdb 2o7t phenix.python $PHENIX/phenix/phenix/ autosol/sad_data_from_pdb.py 2o7t Splits out each wavelength (peak, edge, remote etc) for MAD and run separately Run HySS with direct methods, with LLG completion

Direct methods vs LLG completion 164 SAD datasets from PDB

Direct methods vs LLG completion 164 SAD datasets from PDB

Holton Challenge data Choosing a high-resolution cutoff Anomalous signal Mean anomalous peak height at coordinates of Se atoms 10 8 6 4 2 0 1 2 3 4 5 6 7 High-resolution cutoff (Å )

Holton Challenge data Correct sites found vs anomalous signal-to-noise Correct sites 14 12 10 8 6 4 2 0 Anomalous signal needed to find sites HySS-LLG-brute-force HySS-LLG Shelxd (100000 tries) Shelxd (1000 tries) Crunch2 SOLVE HySS (direct methods) 0 2 4 6 8 10 12 14 16 Anomalous signal

CysZ multi-crystal sulfur-sad data Qun Liu, Tassadite Dahmane, Zhen Zhang, Zahra Assur, Julia Brasch, Lawrence Shapiro, Filippo Mancia, Wayne Hendrickson (2012). Science 336,1033-1037 Data from 7 crystals collected at 1.74 Å Only merged data could be solved What is the minimum number of crystals that could have been used?

CysZ multi-crystal sulfur-sad data Datasets Anomalous signal 5 5.22 1 5.66 4 5.81 2 5.87 6 6.23 7 6.63 3 6.77 56 7.13 561 7.94 67 8.22 273 9.02 2734 9.03 27345 9.07 27346 9.28 273456 9.41 2734561 9.63

CysZ multi-crystal sulfur-sad data 25 20 LLG (brute-force) Shelxd (100000 tries) Correct sites 15 10 5 0 5 5.5 6 6.5 7 7.5 8 8.5 9 9.5 10 Anomalous signal

CysZ multi-crystal sulfur-sad data 25 20 Correct sites 15 10 5 0 0 1 2 3 4 5 6 7 Number of crystals included LLG (bruteforce) Shelxd (100000 tries)

CysZ multi-crystal sulfur-sad data 25 Merge 6-7 20 Correct sites 15 10 5 0 0 1 2 3 4 5 6 7 Number of crystals included LLG (bruteforce) Shelxd (100000 tries)

CysZ multi-crystal sulfur-sad data Merge of crystals 6, 7 AutoSol R/Rfree=0.22/0.26

CysZ multi-crystal sulfur-sad data Merge of crystals 6, 7 AutoSol R/Rfree=0.22/0.26

HySS: summary of new features LLG completion of anomalous substructure Initiation of search with Patterson solutions, input sites, or randomized input sites LLG completion from Patterson solutions or direct methods solutions Parallel execution of searches Automation of search over resolution, direct methods, and Phaser completion Termination if same solution is found from different Patterson seeds at same resolution

Structure determination with weak anomalous signal AutoSol: substructure solution, phasing, density modification, preliminary model-building AutoBuild Iterative model-building, refinement, density modification Parallel AutoBuild Parallel runs of AutoBuild with map averaging and picking best models

Structure solution with phenix.autosol Experimental data, sequence, anomalously-scattering atom, wavelength(s) Find heavy-atom sites with direct methods (HYSS dual-space) Calculate phases (Phaser) Improve phases, find NCS, build model

Structure solution with phenix.autosol: enhancements for weak SAD data Experimental data, sequence, anomalously-scattering atom, wavelength(s) Find heavy-atom sites with direct methods (HYSS LLG completion) Use map and model in LLG completion Calculate phases (Phaser) Improve phases, find NCS, build model

AutoSol structure solution 164 SAD datasets from PDB (including inflection/remote datasets not previously used as SAD data) 1.0 0.8 Map correlation 0.6 0.4 0.2 0.0-0.2 0 10 20 30 40 Anomalous signal

1.0 AutoSol structure solution 164 SAD datasets from PDB AutoSol (new) 0.8 Map correlation 0.6 0.4 0.2 0.0-0.2 0 10 20 30 40 Anomalous signal

1.0 AutoBuild model-building 164 SAD datasets from PDB AutoBuild 0.8 Map correlation 0.6 0.4 0.2 0.0-0.2 0 5 10 15 20 25 30 35 40 45 Anomalous signal

1.0 AutoBuild model-building 164 SAD datasets from PDB Parallel AutoBuild 0.8 Map correlation 0.6 0.4 0.2 0.0-0.2 0 5 10 15 20 25 30 35 40 45 Anomalous signal

Holton Challenge data Final map correlation vs anomalous signal-to-noise 1.0 0.8 Map correlation 0.6 0.4 0.2 0.0 AutoSol/Parallel AutoBuild AutoSol/AutoBuild Crank2 6 7 8 9 10 11 12 Anomalous signal

Estimating the anomalous signal from the data Gold standard: Mean peak height of anomalous difference Fourier at positions of anomalously-scattering atoms Estimates of anomalous signal: Signal-to-noise in anomalous differences Xtriage recommended resolution CC of half-dataset anomalous differences Skew of anomalous difference Patterson

Signal-to-noise estimated from anomalous differences and uncertainties (r 2 =0.13) Signal-to-noise in anomalous difference Fourier 45 40 35 30 25 20 15 10 5 0 0.00 0.50 1.00 1.50 2.00 Solve estimate of anomalous signal-to-noise

Signal-to-noise estimated from Xtriage recommended resolution (r 2 =-0.10) Signal-to-noise in anomalous difference Fourier 45 40 35 30 25 20 15 10 5 0 0 5 10 15 20 25 30 Xtriage recommended resolution (Å)

Signal-to-noise estimated from halfdataset anomalous correlation (r 2 =0.28) Signal-to-noise in anomalous difference Fourier 45 40 35 30 25 20 15 10 5 0 0 0.2 0.4 0.6 0.8 1 Half-dataset Anomalous CC

Signal-to-noise estimated from halfdataset anomalous correlation (corrected for numbers of reflections and sites) (r 2 =0.50) Signal-to-noise in anomalous difference Fourier 45 40 35 30 25 20 15 10 5 0-20 -10 0 10 20 30 40 50 60 Half-dataset Anomalous CC * (Nrefl/sites) 1/2

Signal-to-noise estimated from skew of origin-removed truncated anomalous difference Fourier (r 2 =0.39) Signal-to-noise in anomalous difference Fourier 45 40 35 30 25 20 15 10 5 0-0.05 0.00 0.05 0.10 0.15 0.20 0.25 Skew of anomalous difference Patterson

Signal-to-noise estimated from skew of origin-removed truncated anomalous difference Fourier (r 2 =0.60) Signal-to-noise in anomalous difference Fourier 45 40 35 30 25 20 15 10 5 0-20 0 20 40 60 Skew * Nrefl 1/2

Success of direct methods substructure solution as function of true signal-to-noise HySS Direct Methods Fraction of sites found 1.0 0.8 0.6 0.4 0.2 0.0 0 5 10 15 20 25 30 35 40 45 Anomalous signal

Success of direct methods substructure solution as function of estimated signal-to-noise HySS Direct Methods Fraction of sites found 1.0 0.8 0.6 0.4 0.2 0.0 0 5 10 15 20 25 30 35 40 45 Signal estimated from skew * Nrefl 1/2

Structure solution from weak anomalous data: Perspectives Signal-to-noise in at sites in anomalous difference Fourier is a useful measure of quality Signal-to-noise can be estimated Likelihood-based methods for finding the anomalous substructure are powerful even with weak signal Structures can be solved with weak signal

Lawrence Berkeley Laboratory The PHENIX Project Los Alamos National Laboratory Duke University Cambridge University An NIH/NIGMS funded Program Project