Direct Methods and Many Site Se-Met MAD Problems using BnP. W. Furey

Classical Direct Methods
Main method for small-molecule structure determination.
Highly automated (almost totally "black box").
Solves structures containing up to a few hundred non-hydrogen atoms in the asymmetric unit.

Direct Methods Assumptions and Requirements
Non-negativity of electron density.
Atoms are resolved, i.e., atomic-resolution data are available.
Unit cell, symmetry and contents are known.

Important Concepts - 1
Normalized structure factors $E_H$ are given by $|E_H| = |F_H| / \langle |F_H|^2 \rangle^{1/2}$, with averaging in resolution shells.
The phase $\varphi_H$ of $E_H$ is the same as for $F_H$.
$\langle |E_H|^2 \rangle = 1$, hence "normalized".
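The shell-wise normalization can be sketched as follows. This is a minimal illustration, not the procedure used by SnB or BnP: the function name `normalize_in_shells`, the equal-count shells and the neglect of symmetry (epsilon) factors are all assumptions made for the example.

```python
import numpy as np

def normalize_in_shells(f_amp, d_spacing, n_shells=10):
    """Return |E| = |F| / sqrt(<|F|^2>), averaging within resolution shells.

    Assumptions (not from the slides): shells hold roughly equal numbers
    of reflections, and symmetry (epsilon) factors are ignored.
    """
    f_amp = np.asarray(f_amp, dtype=float)
    d_spacing = np.asarray(d_spacing, dtype=float)
    order = np.argsort(d_spacing)           # sort reflections by resolution
    e_amp = np.empty_like(f_amp)
    for shell in np.array_split(order, n_shells):
        mean_f2 = np.mean(f_amp[shell] ** 2)   # <|F|^2> in this shell
        e_amp[shell] = f_amp[shell] / np.sqrt(mean_f2)
    return e_amp
```

By construction the mean of |E|^2 is 1 in every shell, and therefore over the whole data set.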

Important Concepts - 2
Structure invariant: a structural quantity independent of the choice of unit cell origin.
Probabilistic estimates can be made for the values of structure invariants, given the associated $|E|$ magnitudes and cell contents.

Fundamental formulas involving individual triplets
$P(\psi_{HK}) = [2\pi I_0(A_{HK})]^{-1} \exp(A_{HK} \cos\psi_{HK})$
where $P(\psi_{HK})$ is the probability of the structure invariant having the value $\psi_{HK}$, and
$A_{HK} = 2\,|E_H E_K E_{-H-K}| / N^{1/2}$
where $N$ is the number of atoms in the cell and the $E$'s are normalized structure factors.

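Note that the distribution $P(\psi_{HK})$ becomes more sharply peaked about $\psi_{HK} = 0$ as $A_{HK}$ increases, and that $A_{HK}$ is proportional to the product of the $|E|$'s and inversely proportional to $N^{1/2}$.
The expected value of $\cos\psi_{HK}$ is given by $\langle \cos\psi_{HK} \rangle = I_1(A_{HK}) / I_0(A_{HK})$.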
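The two formulas above can be checked numerically. The sketch below evaluates the triplet distribution with NumPy's `np.i0` and obtains the expected cosine by integrating over the distribution rather than calling a Bessel routine for $I_1$. The function names are hypothetical.

```python
import numpy as np

def triplet_a(e_h, e_k, e_hk, n_atoms):
    """A_HK = 2 |E_H E_K E_-H-K| / sqrt(N)."""
    return 2.0 * abs(e_h * e_k * e_hk) / np.sqrt(n_atoms)

def cochran_pdf(psi, a_hk):
    """P(psi) = exp(A cos psi) / (2 pi I0(A))."""
    return np.exp(a_hk * np.cos(psi)) / (2.0 * np.pi * np.i0(a_hk))

def expected_cos(a_hk, n_grid=20000):
    """<cos psi> by numerical integration; should equal I1(A) / I0(A)."""
    psi = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    step = 2.0 * np.pi / n_grid
    return float(np.sum(np.cos(psi) * cochran_pdf(psi, a_hk)) * step)
```

Because $A_{HK}$ scales as $N^{-1/2}$, the same $|E|$ magnitudes give a much flatter distribution for a protein-sized cell than for a small molecule, which is why individual triplet estimates become unreliable for large structures.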

[Figure: Cochran distribution of the triplet invariant $\Phi_3 = \psi_{HK}$ for various values of $K = A_{HK}$, and a plot of σ versus $K$.]

Classical Direct Methods Applications for Proteins
Used for phase extension to very high resolution.
Used with moderate success to locate heavy-atom sites in isomorphous derivatives.
E values used in molecular replacement calculations.

Current Direct Methods Applications for Proteins
Shake-and-Bake (based on the minimum function) has been used to solve complete protein structures with over 1,000 atoms (rubredoxin, lysozyme, calmodulin, etc.), provided data to 1.1 Å or better are available.
Used to locate anomalous scatterer sites from MAD or SAS data.

General Shake-and-Bake Concept
Use a multi-solution method, starting with random phases (or randomly positioned atoms) in each trial.
For each trial phase set, use a dual-space procedure that iterates between real-space and reciprocal-space optimization/constraints.

Reciprocal-space optimization is based on shifting phases to reduce the minimum function $R(\psi)$.
Real-space optimization and constraints are based on computing new phases only from the largest peaks in a map calculated with the previous cycle's phases.
Each trial phase set is ranked by its value of $R(\psi)$.
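The slides do not write out $R(\psi)$. A minimal sketch follows, assuming the commonly published triplet-only form $R = \sum A_{HK} [\cos\psi_{HK} - I_1(A_{HK})/I_0(A_{HK})]^2 / \sum A_{HK}$ with $\psi_{HK} = \varphi_H + \varphi_K + \varphi_{-H-K}$. The data layout (an index triple per triplet) and both function names are assumptions made for the example.

```python
import numpy as np

def bessel_ratio(a, n_grid=4096):
    """I1(a)/I0(a) from I_n(a) = (1/pi) * integral_0^pi exp(a cos t) cos(n t) dt."""
    t = (np.arange(n_grid) + 0.5) * np.pi / n_grid       # midpoint rule
    # exp(a (cos t - 1)) avoids overflow; the common factor cancels in the ratio
    w = np.exp(np.outer(np.atleast_1d(a), np.cos(t) - 1.0))
    return (w * np.cos(t)).sum(axis=1) / w.sum(axis=1)

def minimum_function(phases, triplets, a_values):
    """Triplet-only minimum function (assumed form, see lead-in).

    phases   : phase in radians for each reflection
    triplets : (n, 3) integer indices into `phases` for H, K and -H-K
    a_values : A_HK for each triplet
    """
    phases = np.asarray(phases, dtype=float)
    triplets = np.asarray(triplets, dtype=int)
    a_values = np.asarray(a_values, dtype=float)
    psi = phases[triplets].sum(axis=1)                   # invariant per triplet
    resid = np.cos(psi) - bessel_ratio(a_values)
    return float(np.sum(a_values * resid ** 2) / np.sum(a_values))
```

A phase set whose invariant cosines match their expected values gives a value of zero, so lower values rank better.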

SnB inner loop for a trial structure
1. Generate random trial structure.
2. Compute phases from structure.
3. Shift phases to reduce $R(\psi)$.
4. Compute map from new phases.
5. Select structure from largest peaks, then return to step 2.
6. Stop after N iterations.

Choice of data for Se determination
Use the $|F_H^{+}| - |F_H^{-}|$ (anomalous) difference at a single λ.
Use the $|F_H^{\lambda_i}| - |F_H^{\lambda_j}|$ (dispersive) difference between two λ's.
Use $F_A$ values (derived from data at all λ's).
Use $F_{HLE}$ values based on the maximum anomalous and maximum dispersive differences.
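The first two choices are simple amplitude differences and can be sketched directly. The function names are hypothetical, the amplitudes are assumed to be already scaled to a common scale, and $F_A$ and $F_{HLE}$ are not implemented because their derivation is not given here.

```python
import numpy as np

def anomalous_difference(f_plus, f_minus):
    """|F+| - |F-| for Bijvoet pairs measured at one wavelength."""
    return np.asarray(f_plus, dtype=float) - np.asarray(f_minus, dtype=float)

def bijvoet_mean(f_plus, f_minus):
    """Average of a Bijvoet pair, used before taking dispersive differences."""
    return 0.5 * (np.asarray(f_plus, dtype=float) + np.asarray(f_minus, dtype=float))

def dispersive_difference(f_mean_i, f_mean_j):
    """|F(lambda_i)| - |F(lambda_j)| from Bijvoet-averaged amplitudes."""
    return np.asarray(f_mean_i, dtype=float) - np.asarray(f_mean_j, dtype=float)
```

For example, `anomalous_difference([105.0, 98.0], [100.0, 101.0])` returns the signed differences 5 and -3. Substructure searches normally work with the magnitudes of such differences.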

MAD Phasing
For data collected at λ1, λ2, etc., choose one wavelength λn as "native" data and reduce that data set by averaging Bijvoet pairs.
For the other "derivative" wavelengths λd, reduce each data set twice: with Bijvoet pairs averaged, to form "isomorphous" data sets, and without averaging, to form "anomalous" data sets.

MAD Phasing
For "isomorphous" and derivative "anomalous" data sets, scale derivative to native and use scattering factors of $f_0 = 0$, $f' = f'(\lambda_d) - f'(\lambda_n)$, $f'' = f''(\lambda_d)$.
For "native" anomalous data, use the original native Bijvoet pairs and scattering factors of $f_0 = 0$, $f' = 0$, $f'' = f''(\lambda_n)$.
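The bookkeeping for these effective scattering factors can be sketched as below. The class and function names are hypothetical, and the numbers in the usage note are illustrative placeholders, not measured Se values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScatteringFactors:
    f0: float              # normal scattering term (zero in this treatment)
    f_prime: float         # dispersive term f'
    f_double_prime: float  # anomalous term f''

def derivative_factors(f_prime_d, f_double_prime_d, f_prime_n):
    """Factors for a 'derivative' wavelength treated against the 'native' one."""
    return ScatteringFactors(0.0, f_prime_d - f_prime_n, f_double_prime_d)

def native_anomalous_factors(f_double_prime_n):
    """Factors for the native wavelength's own Bijvoet pairs."""
    return ScatteringFactors(0.0, 0.0, f_double_prime_n)
```

With placeholder values f'(λd) = -8.0, f''(λd) = 6.0 and f'(λn) = -3.0, `derivative_factors(-8.0, 6.0, -3.0)` gives f0 = 0, f' = -5.0 and f'' = 6.0. Only the difference in f' between the two wavelengths contributes to the "isomorphous" signal.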

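Phase Refinement
Minimizing
$\sum_h W_h \sum_{\varphi_P} P_{\varphi_P} \left[ |F_{PH}^{obs}|_h - |F_{PH}^{calc}(\varphi_P)|_h \right]^2$
where
$|F_{PH}^{calc}(\varphi_P)|_h^2 = |F_P^{obs}|_h^2 + |F_H^{calc}|_h^2 + 2\,|F_P^{obs}|_h\,|F_H^{calc}|_h \cos(\varphi_P - \varphi_H)$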

Phase Refinement Options
$\sum_h W_h \sum_{\varphi_P} P_{\varphi_P} \left[ |F_{PH}^{obs}|_h - |F_{PH}^{calc}(\varphi_P)|_h \right]^2$
Classical: $\varphi_P$ = centroid; $W_h = 1/E^2$, $1/\langle E^2 \rangle$ or unity; $P_{\varphi_P} = 1$; use reflections with FOM > 0.4-0.6.
Maximum likelihood: $\varphi_P$ stepped over allowed phases; $P_{\varphi_P}$ = corresponding probability; $W_h = 1/E^2$, $1/\langle E^2 \rangle$ or unity; use reflections with FOM > 0.2.
$\varphi_P$ and $P_{\varphi_P}$ can also come from an external source, i.e., solvent-flattened or NCS-averaged maps.
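The refinement target can be written compactly with array broadcasting. This is a sketch of the residual only, not of the PHASES refinement itself, and the array layout is an assumption: each reflection has a row of trial protein phases, with a single column for the classical option and several columns for the maximum-likelihood option.

```python
import numpy as np

def fph_calc(fp_obs, fh_calc, phi_p, phi_h):
    """|FPH_calc| from the cosine rule; phases in radians."""
    sq = fp_obs ** 2 + fh_calc ** 2 + 2.0 * fp_obs * fh_calc * np.cos(phi_p - phi_h)
    return np.sqrt(np.maximum(sq, 0.0))   # guard against rounding below zero

def refinement_residual(w, p_phi, fph_obs, fp_obs, fh_calc, phi_p, phi_h):
    """Sum_h W_h Sum_phiP P(phiP) (|FPH_obs| - |FPH_calc(phiP)|)^2.

    w, fph_obs, fp_obs, fh_calc, phi_h : shape (n_refl,)
    phi_p, p_phi                       : shape (n_refl, n_phase)
    """
    col = lambda x: np.asarray(x, dtype=float)[:, None]
    calc = fph_calc(col(fp_obs), col(fh_calc), np.asarray(phi_p, dtype=float), col(phi_h))
    closure = col(fph_obs) - calc          # lack of closure per trial phase
    return float(np.sum(col(w) * np.asarray(p_phi, dtype=float) * closure ** 2))
```

Setting every probability to 1 with one centroid phase per reflection reproduces the classical option; supplying a grid of phases with their probabilities reproduces the maximum-likelihood weighting.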

Projection of peaks down NC twofold

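[Flowchart: MAD phasing and NCS averaging. Input: MAD λ1, λ2, λ3 data (Scalepack files), with all λ3 data treated as native. Programs shown: CMBISO, CMBANO, PHASIT, FSFOUR, EXTRMP, MAPAVG, MISSNG, BNDRY, BLDCEL, MAPINV. Intermediate files: iso and ano scaled files, phase file, submap file, averaging mask, extension file. Output: final map.]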

MAD Phasing/Averaging Statistics

Wavelength   Type  dmin (Å)  No. refl  Rano   Riso   dmin (Å), phasing  Rc    Phasing power  <FOM>
λ1, edge     ano   2.3       72,632    0.063  -      2.6                -     3.47           0.380
λ2, peak     ano   2.3       72,996    0.060  -      2.6                -     3.45           0.447
λ3, remote   ano   2.3       72,650    0.048  -      2.6                -     2.09           0.389
λ1-λ3        iso   2.3       74,407    -      0.039  2.6                0.55  1.89           0.393
λ2-λ3        iso   2.3       74,774    -      0.035  2.6                0.61  1.59           0.357

Mean FOM (combined) = 0.759 for 48,632 reflections (2.6 Å).
Correlation coefficient between monomer density prior to NCS averaging = 0.764.
Correlation coefficient between monomer density after NCS averaging/phase combination = 0.906.

Peak anomalous (λ2) difference Patterson

With SnB it's possible to automatically locate the anomalous scatterer substructure with data from any one of the dispersive combinations or anomalous pair sets.
As expected, sets with the maximum dispersive or anomalous signal typically yield a greater frequency of success.

Automated Applications of BnP: Methodology
W. Furey (1), L. Pasupulati (1), S. Potter (2), H. Xu (2), R. Miller (3) & C. Weeks (2)
(1) University of Pittsburgh School of Medicine and VA Medical Center
(2) Hauptman-Woodward Medical Research Institute
(3) Center for Computational Research, SUNY at Buffalo

Goal: Provide user-friendly software for automatic determination of protein crystal structures.
SnB Strengths
1. Powerful, state-of-the-art direct methods for automatically locating heavy-atom sites
2. Friendly graphical user interface
SnB Weaknesses
1. Stops after finding sites, i.e., no protein phasing
2. No software interface
PHASES Strengths
1. Proven protein phasing (MAD, MIRAS, etc.), solvent flattening, NCS averaging, external program interfacing
2. Interactive graphics
PHASES Weaknesses
1. Doesn't automatically find heavy-atom sites
2. Script based, i.e., no GUI

Adopted Strategy
Combine the SnB program with the PHASES package, putting everything under GUI control.
Establish default parameters and procedures allowing all aspects of the structure determination to be fully automated.
Also provide a manual mode, allowing experienced users more control and facilitating development.
Provide graphical feedback when possible.
Facilitate coupling with popular external software.

Main Developments Required for Automated Structure Determination
Automatic substructure solution detection.
Automatic substructure validation.
Automatic hand determination (including space group changes, when needed).

Automatic Substructure Solution Detection
Original method: based on histogram (manual, time consuming, requires user interaction).
Current method: based on R min and R cryst statistics (automatic, fast, no user interaction).

Automatic Substructure Validation
Original method: left up to the user to decide which peaks correspond to true sites (manual).
Current method (auto mode): based on occupancy refinement against Bijvoet differences (automatic, fast, requires no coordinate refinement, hand insensitive).
Current method (manual mode): as in auto mode, but peaks from different solutions can also be compared (manual).

Automatic Substructure Validation

Automatic Hand Determination
Original method: visual inspection of map projections (manual, requires user interaction).
Current method (MAD, SIRAS or MIRAS): based on variance differences in protein and solvent regions (automatic; fast, since it requires no refinement; requires no user interaction).
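The slide does not give the exact statistic BnP uses, so the following is only a sketch of the idea: a map phased with the correct hand should show more contrast between the protein and solvent regions than a map phased with the wrong hand. The ratio used here, the function names, and the assumption that a protein mask is already available for each map are all hypothetical.

```python
import numpy as np

def region_variance_contrast(density, protein_mask):
    """Ratio of density variance inside the protein mask to that in solvent."""
    density = np.asarray(density, dtype=float)
    protein_mask = np.asarray(protein_mask, dtype=bool)
    return float(density[protein_mask].var() / density[~protein_mask].var())

def pick_hand(map_a, mask_a, map_b, mask_b):
    """Return the label ('a' or 'b') of the map with the larger contrast."""
    contrast_a = region_variance_contrast(map_a, mask_a)
    contrast_b = region_variance_contrast(map_b, mask_b)
    return "a" if contrast_a >= contrast_b else "b"
```

A map with a featureless solvent and strongly varying protein density gives a large ratio, while a map that is uniformly noisy gives a ratio near 1.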

Automatic Hand Determination
Current method (SAS data only): comparative analysis of R, FOM and CC after solvent flattening/phase combination (automatic, fast, requires no refinement).
Current method (SIR, MIR data only): both hands tried, map examination needed (requires user interaction).

No man (or program) is an island
Importing data files: Scalepack files, D*Trek files, MTZ files ($), free-format files.
Exporting data files: free-format files, CNS files, MTZ files ($), O files, CHAIN files, PDB files.
Exporting control files: O, RESOLVE 2.08, Arp/wARP 6.1.1.
Job submission from GUI: RESOLVE ($) 2.08, Arp/wARP ($) 6.1.1.
($) RESOLVE, Arp/wARP and/or CCP4 must be obtained from their respective authors/distributors for these options to work.

Results for 1jc4
Cell: a = 43.6, b = 78.6, c = 89.4 Å, β = 91.95°, space group $P2_1$.
4 molecules (592 residues) in the asymmetric unit.
2.1 Å data, 3λ MAD data.
Substructure: found 24 of 24 Se.
Phasing: mean PP = 2.95; mean FOM = 0.661.
Time to map: ~41 min on G4 (1.5 GHz) PowerBook; ~13 min on G5 (2.7 GHz) desktop.
Auto traceability: RESOLVE, 87% main chain, 68% side chain; Arp/wARP, 82% main chain, 73% side chain.

SeMet ASU Size & Data Resolution

PDB code  No. sites  No. residues  NCS  d (Å)
1QC2      4          169           1    1.5
1BX4      7          345           1    2.25
1CB0      8          283           1    2.2
1T5H      10         504           1    2.5
2JXH      12         576           2    3.1
1GSO      13         431           1    2.22
2TPS      15         454           2    2.7
1DBT      19         717           3    2.49
1JEN      22         668           2    2.25
1JC4      24         592           4    2.1
1CLI      28         1380          4    3.0
1A7A      30         864           2    2.8
1L8A      40         1772          2    2.6
1E3M      45         1600          2    3.0
1HI8      50         1328          2    2.8
1GKP      54         2748          6    2.5
1DQ8      60         1868          4    2.3
1E2Y      60         1880          10   3.2
1M32      66         2196          6    2.5
1EQ2      70         3100          10   2.9

Phasing Flexibility (Manual Mode)

Conclusion
BnP is a user-friendly, efficient package for the automated determination of protein structures from X-ray diffraction data.
BnP downloads for Linux, Apple G4, G5 and Intel, and SGI are available (academic and non-profit institutions) at http://www.hwi.buffalo.edu/bnp/