Proteins. Central Dogma: DNA → RNA → protein. Amino acid polymers - defined composition & order. Perform nearly all cellular functions. Drug targets.

Proteins (cont.)
- Fold into discrete shapes
- Specific shapes, specific functions
- How do we determine the shape of a protein?
- How does shape define function and influence drug action?

Revolutions in X-ray crystallography
- Getting faster - Hemoglobin: 30 years! Your Favorite Protein: hours to weeks
- Not size limited - ribosome, viruses (> 2.5 MDa)
- Atomic resolution detail

The PDB surpassed 60,000 structures in 2009 ... and is still growing! (Chart: yearly and total structure counts.) http://www.rcsb.org/pdb/statistics/contentgrowthchart.do?content=total&seqid=100

Genome sequencing discoveries
Organism        Genes
Human           >35,000
Fly             ~13,600
Flat worm       >19,000
Plant           25,498
Yeast           6,400
>50 microbes    500-5,000
Viruses         <10-100
Genomic data is growing even faster!

X-ray crystallography can produce molecular images
(Figure: workflow) Protein → Crystal → Data (from X-ray source) → Electron density → Model building → Structure

Crystal growth - general requirements
Need:
- Pure protein (>98%)
- Chemically, conformationally homogeneous sample
Add:
- Precipitating agents (mild organics such as PEG or salts)
- Buffers, inorganic or organic salts
- Cofactors, ligands, chemical additives
Perturb:
- Hydration state, temperature, solubility response
Get:
- Random aggregate: amorphous precipitate (common)
- Ordered phase transition: crystals (hard, rare)

Crystallization methods
- Direct mixing of protein and precipitant, then wait (microbatch)
- Direct mixing, then dehydrate and wait (vapor diffusion)
- Free interface diffusion between protein and precipitant
Movie courtesy Fluidigm, Inc.

Crystals 101
Crystals are:
- Ordered arrays of ~10^14 molecules
- ~25-80% water - similar to cells
- Native protein structure/activity retained in crystalline state

Data collection - experimental setup
- Source: synchrotron or rotating anode
- Beam optics (mirrors)
- Crystal (cryo-preserved) in an N2 stream (100 K)
- Diffracted X-rays
- Detector: CCD, image plate, or film

Rotate the crystal to record all data - oscillation, 1°/frame
- Each reflection (spot) arises from a set of Bragg planes
- Figure labels: hexagonal lattice, beam stop shadow, water ring

X-rays & crystals 101
Why use X-rays?
- X-rays are periodic waves (~1 Å wavelength)
- Electrons (from protein atoms) scatter X-rays
- Scattering measurably perturbs incident X-ray properties
Why use crystals?
- Crystals are periodic arrays of proteins
- Act as a micro-diffraction grating to constructively amplify scattering signal
Scattered X-rays carry information about electron density distribution in a crystal

Crystal (Bravais) lattice types: 14 lattice geometries can pack into repeating 3D arrays

How do molecules pack?
- Unit cell (U.C.) - fundamental crystal repeat
- Asymmetric unit (A.U.) - minimal element within unit cell acted upon by symmetry operators
Squiggles per (four example packings):
A.U.   1   1   1   1
U.C.   1   2   4   6

Symmetry and lattice type define space groups
- 230 groups; 65 accessible to biomolecules (no mirror planes!)
- Can also have symmetry within an AU!
- 3 orthogonal 2-fold axes: 222 point group - four AUs
http://neon.mems.cmu.edu/degraef/pointgroups/
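As a minimal sketch of how symmetry operators turn one asymmetric unit into several copies, the block below applies the four operations of point group 222 (identity plus three orthogonal 2-fold rotations) to a single point. The function name `apply_222` and the sample coordinates are illustrative assumptions, not taken from the slides.

```python
def apply_222(x, y, z):
    """Return the four positions generated from (x, y, z) by point group 222."""
    # Each operator is a sign pattern applied to (x, y, z).
    operators = [
        (1, 1, 1),    # identity
        (-1, -1, 1),  # 2-fold rotation about z
        (-1, 1, -1),  # 2-fold rotation about y
        (1, -1, -1),  # 2-fold rotation about x
    ]
    return [(sx * x, sy * y, sz * z) for sx, sy, sz in operators]


# Hypothetical position of one atom in the asymmetric unit
copies = apply_222(0.1, 0.2, 0.3)
for position in copies:
    print(position)
```

One general position yields four distinct copies, which matches the "four AUs" noted for the 222 point group.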

X-rays are electromagnetic waves
A simple wave (figure: α = 0) can be described by:
$f(x) = F\cos 2\pi(hx + \alpha)$
where $F$ = amplitude, $\lambda$ = wavelength, $\alpha$ = phase, and $1/\lambda = h$

Fourier syntheses reconstruct electron density from diffraction data
$\rho(x) = \sum_h F(h)\cos 2\pi(hx - \alpha(h))$ = sum of cosine terms
(Figure: target function and its cosine components)
h    F(h)    α(h)/360
1    1       0
3    -1/3    0.5
5    1/5     0
- Sum approximates the target function
- More terms, better approximation
- Concept of RESOLUTION
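A one-dimensional Fourier synthesis can be sketched in a few lines. The terms below are an assumption chosen for illustration: amplitudes 1, 1/3, 1/5 for h = 1, 3, 5, with phases 0, 0.5, 0 expressed as fractions of a cycle (α/360), so the h = 3 term enters with a negative sign and the sum approaches a square wave. The function name `fourier_sum` is hypothetical.

```python
import math


def fourier_sum(x, terms):
    """Sum F * cos(2*pi*(h*x - alpha)) over (h, F, alpha) terms.

    alpha is a phase given as a fraction of a full cycle (degrees / 360).
    """
    return sum(F * math.cos(2 * math.pi * (h * x - alpha)) for h, F, alpha in terms)


# Assumed illustrative terms (h, amplitude, phase in cycles)
terms = [(1, 1.0, 0.0), (3, 1 / 3, 0.5), (5, 1 / 5, 0.0)]

# Compare a one-term and a three-term approximation across one period
for i in range(9):
    x = i / 8
    print(f"x={x:.3f}  1 term: {fourier_sum(x, terms[:1]):+.3f}  "
          f"3 terms: {fourier_sum(x, terms):+.3f}")
```

Adding higher-h terms flattens the plateau and sharpens the edges of the reconstructed function, which is the one-dimensional picture behind the concept of resolution.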

Onward to structure: diffraction data
(Figure: indexed spots, e.g. hkl = 18,17,0 and hkl = 17,12,0)
1) Collect data, index reflections (spots) - hkl terms ("addresses")
2) Integrate spot intensities; calculate amplitudes ($F \propto \sqrt{I}$)
3) Calculate scattered wave phases:
   Experimentally (deliberately modulate spot intensities):
   - Heavy atom substitution (multiple isomorphous replacement, MIR)
   - Multiwavelength anomalous dispersion (MAD)
   Computationally (use prior model):
   - Molecular replacement (MR)

Waves can be represented as vectors
(Figure: a wave with α = 0 and the equivalent vector of length F at angle α; 1/λ = h)

Atomic scattering is additive
$F_{PH} = F_P + F_H$, so $F_{PH} - F_H = F_P$
(Figure: vector diagram of $F_P$, $F_H$, $-F_H$ and $F_{PH}$)
- Identify heavy atom positions (HA xyz), can calculate $F_H$
- But, two possibilities for $F_{PH}$ in solving for $F_P$!

Get third derivative!
$F_{PH} = F_P + F_H$, so $F_{PH} - F_H = F_P$
$F_{PH2} = F_P + F_{H2}$, so $F_{PH2} - F_{H2} = F_P$
(Figure: vector diagram combining both derivatives)
- With HA1 xyz, calculate $F_{H1}$
- With HA2 xyz, calculate $F_{H2}$
- Leaves only one possibility for $F_P$!
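The way a second derivative removes the two-fold ambiguity can be sketched numerically. The block assumes error-free synthetic data: from the measured amplitudes |F_P| and |F_PH| and a calculated complex F_H, the law of cosines gives two candidate protein phases per derivative, and the candidate shared by both derivatives is kept. All function names and numeric values are illustrative assumptions, not from the slides.

```python
import cmath
import math


def candidate_phases(fp_amp, fph_amp, fh):
    """Two protein phases (radians) consistent with |F_P|, |F_PH| and complex F_H."""
    fh_amp, fh_phase = abs(fh), cmath.phase(fh)
    # |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(alpha_P - alpha_H)
    c = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2 * fp_amp * fh_amp)
    delta = math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error
    return (fh_phase + delta, fh_phase - delta)


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))


def resolve_phase(fp_amp, derivative1, derivative2):
    """Pick the candidate phase on which two derivatives agree best."""
    c1 = candidate_phases(fp_amp, *derivative1)
    c2 = candidate_phases(fp_amp, *derivative2)
    best = min(((a, b) for a in c1 for b in c2), key=lambda p: angular_distance(*p))
    return best[0]


# Synthetic example: the "true" protein structure factor is known only to the test
true_fp = cmath.rect(100.0, math.radians(40.0))
fh1 = cmath.rect(20.0, math.radians(110.0))
fh2 = cmath.rect(15.0, math.radians(250.0))
phase = resolve_phase(abs(true_fp), (abs(true_fp + fh1), fh1), (abs(true_fp + fh2), fh2))
print(round(math.degrees(phase) % 360, 3))
```

With real data the candidates never coincide exactly, so practical programs weight phases by probability rather than picking a single intersection.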

Multiwavelength Anomalous Dispersion -- MAD
1. Derivatize YFP (your favorite protein) with heavy metal(s) (commonly SeMet)
2. Change wavelength to change X-ray absorption by metals (anomalous dispersion); synchrotron needed
3. When X-rays are absorbed, $|F(hkl)| \neq |F(-h-k-l)|$
4. Use anomalous differences, $F(hkl) - F(-h-k-l)$, to locate metals
5. Calculate amplitude and phase of scattering from metals
6. Calculate probability of $\alpha_P(hkl)$
7. Each wavelength limits protein phases to 2 most probable values
8. Resolve phase ambiguity with:
   - multiple wavelengths (MAD)
   - solvent flattening (SAD)
   - noncrystallographic symmetry averaging (model)
   - ...
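Step 4 can be illustrated with a small helper that pairs each reflection with its Friedel mate and takes the amplitude difference. This is a sketch under stated assumptions: the input is a plain dictionary of amplitudes keyed by (h, k, l), and the function name and values are hypothetical.

```python
def anomalous_differences(amplitudes):
    """Return |F(hkl)| - |F(-h,-k,-l)| for each Friedel pair present in the input."""
    differences = {}
    for (h, k, l), f_plus in amplitudes.items():
        mate = (-h, -k, -l)
        # Report each pair once, keyed by the larger of the two index tuples
        if mate in amplitudes and (h, k, l) > mate:
            differences[(h, k, l)] = f_plus - amplitudes[mate]
    return differences


# Hypothetical measured amplitudes; (2, 0, 0) has no measured mate
measured = {(1, 2, 3): 105.0, (-1, -2, -3): 98.0, (2, 0, 0): 50.0}
diffs = anomalous_differences(measured)
print(diffs)
```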

Onward to structure
Diffraction data → (FT) → electron density
4) Apply Fourier synthesis to reconstruct electron density: structure factor equation

Structure factor equation
$\rho(x,y,z) = \dfrac{1}{V}\sum_h \sum_k \sum_l F(hkl)\cos 2\pi(hx + ky + lz - \alpha(hkl))$
- ρ = electron density
- x, y, z = positions in crystalline repeat (fractional coordinates)
- V = unit cell volume
- F(hkl) = amplitude for reflection hkl
- h, k, l = integers, coordinates of each spot
- hx + ky + lz = counter through the unit cell
- α(hkl) = phase angle (°), α, of spot hkl divided by 360
(Figure labels: hkl's: 0 h, 0 k, 0 l; axes x, y, z; ρ = 0 @ 2,1,5; ρ = 6.7 @ 4,4,1)
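The summation above translates directly into code. The sketch below evaluates the density at one fractional coordinate from a toy reflection list; the dictionary layout, the function name `electron_density`, and the two reflections are assumptions for illustration, and a real calculation would sum over the full measured set of hkl terms.

```python
import math


def electron_density(x, y, z, reflections, volume):
    """Evaluate rho(x, y, z) = (1/V) * sum of F(hkl) cos 2*pi*(hx + ky + lz - alpha).

    reflections maps (h, k, l) to (amplitude, alpha), where alpha is the
    phase angle divided by 360 (a fraction of a cycle).
    """
    total = 0.0
    for (h, k, l), (amplitude, alpha) in reflections.items():
        total += amplitude * math.cos(2 * math.pi * (h * x + k * y + l * z - alpha))
    return total / volume


# Toy data: two reflections and a unit cell volume of 2.0 (arbitrary units)
toy_reflections = {(0, 0, 0): (10.0, 0.0), (1, 0, 0): (4.0, 0.25)}
rho_origin = electron_density(0.0, 0.0, 0.0, toy_reflections, volume=2.0)
rho_quarter = electron_density(0.25, 0.0, 0.0, toy_reflections, volume=2.0)
print(rho_origin, rho_quarter)
```

The phase of the (1, 0, 0) term shifts its peak from x = 0 to x = 0.25, which is why amplitudes alone cannot place the density.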

Resolution affects electron density interpretability
- Higher scattering angles add more spots (Fourier terms)
- Resolution determines information content
- Figure panels: 6 Å resolution, 3 Å resolution
- Side chains evident at resolutions better than 3.5 Å
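The link between scattering angle and resolution follows from Bragg's law, d = λ / (2 sin θ), which the slides rely on without stating. The sketch below uses it to convert a scattering angle 2θ into a d-spacing; the function name and the sample wavelength and angles are illustrative assumptions.

```python
import math


def resolution_from_angle(wavelength, two_theta_deg):
    """Return the Bragg spacing d (same units as wavelength) for scattering angle 2*theta."""
    theta = math.radians(two_theta_deg / 2)
    return wavelength / (2 * math.sin(theta))


# With 1.0 Å X-rays, spots farther from the beam correspond to finer detail
for two_theta in (10.0, 20.0, 40.0, 60.0):
    print(f"2θ = {two_theta:4.1f}°  d = {resolution_from_angle(1.0, two_theta):.2f} Å")
```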

From maps to model
Electron density → interpretation/model building
6) Thread amino acid sequence through electron density (manually or automatically)
7) Use amino acid shape and sequence as a guide (3D jigsaw)
8) Refine model computationally to find best match to data ($F_{calc}$ vs. $F_{obs}$) and optimize stereochemistry

Refinement
(Figure: refinement cycle)
- Model gives $F_{calc}$, $\alpha_{calc}$
- Data give $F_{obs}$
- Compare using $F_{obs} - F_{calc}$; $\alpha_{calc}$
- Apply covalent geometry (molecular dynamics)
- Shifts -- Δx,y,z and B
- Manual rebuilding
- Iterate until convergence
B ("temperature") factor = disorder relative to a point atom

R and Rfree values -- the gold standards
$R = \dfrac{\sum |F_{obs} - F_{calc}|}{\sum F_{obs}}$
- = 0.59 for random model
- = 0.4-0.55 for starting model
- > 0.25, good fit, errors still possible
- < 0.20, excellent fit
$R_{free}$ = R value of a small, random subset never used in refinement. Ideally, $R_{free} < 0.30$ and $R_{free} \le R_{work} + 0.05$ (this scales with resolution, however)
Model is complete when:
- No interpretable difference electron density ($F_{obs} - F_{calc}$)
- Geometry close to ideal
- No clashes, optimal rotamer stereochemistry
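The R value is simple enough to compute by hand, and the sketch below does so for a toy data set, splitting reflections into a working set and a free set with a boolean flag per reflection. The function names, the flag layout, and the amplitudes are assumptions for illustration; refinement programs also apply scaling between observed and calculated amplitudes, which is omitted here.

```python
def r_factor(f_obs, f_calc):
    """R = sum |F_obs - F_calc| / sum F_obs over the given reflections."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def r_work_free(f_obs, f_calc, free_flags):
    """Return (R_work, R_free); free_flags marks reflections excluded from refinement."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    return r_factor(*zip(*work)), r_factor(*zip(*test))


# Toy amplitudes; the last reflection is held out as the free set
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 25.0, 30.0]
free_flags = [False, False, False, True]

r_all = r_factor(f_obs, f_calc)
r_work, r_free = r_work_free(f_obs, f_calc, free_flags)
print(f"R = {r_all:.3f}, R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```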

Interpreting the data - the structure table
Data collection
Data set                        Remote               Peak                 HgCl2 derivative
Space group                     P4₃2₁2               P4₃2₁2               P4₃2₁2
Unit cell a, b, c (Å)           52.8, 52.8, 160.1    52.8, 52.8, 160.1    52.1, 52.1, 162.9
Unit cell α, β, γ (°)           90, 90, 90           90, 90, 90           90, 90, 90
Wavelength (Å)                  1.1000               0.9791               1.0093
Resolution range (Å)            45-2.4               45-2.4               43-3.1
Total reflections               206,536              72,214               41,453
Unique reflections              10,986               10,171               7,537
Redundancy                      18.8 (15.5) a        7.1 (6.8)            5.5 (3.8)
Completeness (%)                99.9 (100)           99.5 (100)           98.6 (99.2)
I/σ (signal to noise)           43.3 (7.3)           41.7 (9.6)           34.9 (4.2)
R sym (%) b (data agreement)    5.5 (30.8)           5.0 (20.0)           5.0 (37.4)
a Values in parentheses are for highest-resolution shells.
b $R_{sym} = \sum_j |I(h)_j - \langle I(h) \rangle| / \sum_j I(h)_j$, where $I(h)_j$ is the scaled observed intensity of the jth observation of reflection h, and $\langle I(h) \rangle$ is the mean value of corresponding symmetry-related reflections.
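Two rows of the table can be reproduced with short calculations. The sketch assumes redundancy is total reflections divided by unique reflections (which matches the tabulated values) and computes R sym with the conventional sum over all reflections h as well as observations j. The function names and the toy intensities are illustrative assumptions.

```python
def redundancy(total_reflections, unique_reflections):
    """Average number of observations per unique reflection."""
    return total_reflections / unique_reflections


def r_sym(observations):
    """R_sym = sum |I_j - <I>| / sum I_j, summed over reflections and observations.

    observations maps each unique reflection to its list of measured intensities.
    """
    numerator = denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Redundancy for the three data sets in the table
print(round(redundancy(206536, 10986), 1),
      round(redundancy(72214, 10171), 1),
      round(redundancy(41453, 7537), 1))

# Toy repeated measurements for two reflections
toy = {(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 0): [50.0, 50.0]}
toy_r_sym = r_sym(toy)
print(f"R_sym = {100 * toy_r_sym:.1f}%")
```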

Interpreting the data - the structure table
Refinement parameters
Resolution (Å)                    45-2.3
No. of nonhydrogen atoms          29,845
No. of waters                     243
No. of ions                       3
Rmsd
  Bond lengths (Å)                0.013        (should be <0.02 Å)
  Bond angles (°)                 1.4          (should be <2.0°)
B factors
  Overall                         30.1
  Protein                         29.2
  Ligand/ion                      39.5
  Water                           34.6
Ramachandran (stereochemistry)
  Favored                         90.4%
  Allowed                         7.6%
  Generous                        2%
  Disallowed                      0
R work / R free e                 19.4/22.9    (model/data agreement)
e $R_{work} = \sum |F_{obs} - F_{calc}| / \sum F_{obs}$, where $F_{obs}$ and $F_{calc}$ are observed and model structure factors, respectively. $R_{free}$ was calculated by using a randomly selected set (5%) of reflections.

Making sense of the PDB file - header info
HEADER    ISOMERASE    02-JUN-05   1ZVU
TITLE     STRUCTURE OF THE FULL-LENGTH E. COLI PARC SUBUNIT
COMPND   2 MOLECULE: TOPOISOMERASE IV SUBUNIT A;
SOURCE   2 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI;
KEYWDS    BETA-PINWHEEL, ATPASE, SUPERCOILING, DECATENATION, DNA
EXPDTA    X-RAY DIFFRACTION
AUTHOR    K.D.CORBETT,A.J.SCHOEFFLER,N.D.THOMSEN,J.M.BERGER
JRNL        TITL   THE STRUCTURAL BASIS FOR SUBSTRATE SPECIFICITY IN
JRNL        TITL 2 DNA TOPOISOMERASE IV.
JRNL        REF    J.MOL.BIOL.   V. 351   545 2005
REMARK   1
REMARK   2
REMARK   2 RESOLUTION. 3.00 ANGSTROMS.
REMARK   3
REMARK   3 REFINEMENT.
REMARK   3   PROGRAM     : REFMAC 5.2.0005
REMARK   3   AUTHORS     : MURSHUDOV,VAGIN,DODSON
REMARK   3
REMARK   3   REFINEMENT TARGET : MAXIMUM LIKELIHOOD
REMARK   3
REMARK   3 DATA USED IN REFINEMENT.
REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) : 3.00
REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) : 20.00
REMARK   3   DATA CUTOFF            (SIGMA(F)) : 1.000
REMARK   3   COMPLETENESS FOR RANGE        (%) : 89.6
REMARK   3   NUMBER OF REFLECTIONS             : 18167

Making sense of the PDB file - the guts
SEQRES   1 A  716  MET ASP ARG ALA LEU PRO PHE ILE GLY ASP GLY LEU LYS
SEQRES   2 A  716  PRO VAL GLN ARG ARG ILE VAL TYR ALA MET SER GLU LEU
SEQRES   3 A  716  GLY LEU ASN ALA SER ALA LYS PHE LYS LYS SER ALA ARG
HELIX    1   1 LYS A   39  SER A   50  1  12
HELIX    2   2 THR A   66  GLY A   72  1   7
HELIX    3   3 ASP A   79  ALA A   90  1  12
SHEET    1   A 2 VAL A 100  GLY A 102  0
SHEET    2   A 2 SER A 123  LEU A 125 -1  O   ARG A 124   N   ASP A 101
CRYST1  257.990   62.141   63.998  90.00  90.00  90.00 P 21 21 2     4
ORIGX1      1.000000  0.000000  0.000000        0.00000
ORIGX2      0.000000  1.000000  0.000000        0.00000
ORIGX3      0.000000  0.000000  1.000000        0.00000
SCALE1      0.003876  0.000000  0.000000        0.00000
SCALE2      0.000000  0.016092  0.000000        0.00000
SCALE3      0.000000  0.000000  0.015625        0.00000
ATOM      1  N   ASP A  28      -7.840   5.599  -4.925  1.00 35.20           N
ATOM      2  CA  ASP A  28      -7.889   6.594  -3.807  1.00 35.29           C
ATOM      3  C   ASP A  28      -8.269   8.003  -4.275  1.00 34.79           C
ATOM      4  O   ASP A  28      -9.238   8.576  -3.783  1.00 34.99           O
ATOM      5  CB  ASP A  28      -6.550   6.628  -3.059  1.00 35.74           C
ATOM      6  CG  ASP A  28      -6.250   7.999  -2.449  1.00 37.37           C
ATOM      7  OD1 ASP A  28      -6.857   8.351  -1.406  1.00 38.79           O
ATOM      8  OD2 ASP A  28      -5.402   8.722  -3.024  1.00 38.60           O
ATOM      9  N   ARG A  29      -7.495   8.558  -5.207  1.00 34.04           N
TER
END
Annotated fields (ATOM records, left to right): atom number, atom identifier, amino acid type, protein chain ID, residue number, atomic position (x, y, z), occupancy, B-factor. CRYST1: cell dimensions and space group.
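ATOM records are fixed-width, so fields are best read by column position rather than by splitting on whitespace (adjacent numbers can run together when values are wide). The sketch below slices the standard column ranges for the fields annotated on the slide; the function name `parse_atom` and the dictionary keys are illustrative assumptions, and a maintained parser would be the safer choice for real work.

```python
def parse_atom(line):
    """Parse one PDB ATOM record using the format's fixed column ranges."""
    return {
        "serial": int(line[6:11]),         # atom number
        "name": line[12:16].strip(),       # atom identifier
        "res_name": line[17:20],           # amino acid type
        "chain": line[21],                 # protein chain ID
        "res_seq": int(line[22:26]),       # residue number
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


# Two records from the slide, laid out in standard PDB columns
records = [
    "ATOM      1  N   ASP A  28      -7.840   5.599  -4.925  1.00 35.20           N",
    "ATOM      2  CA  ASP A  28      -7.889   6.594  -3.807  1.00 35.29           C",
]
atoms = [parse_atom(record) for record in records]
for atom in atoms:
    print(atom["serial"], atom["name"], atom["res_name"], atom["xyz"], atom["b_factor"])
```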

Some things to keep in mind
PDB file oddities:
- Occ < 1 - partial occupancy, sometimes seen for ligands
- B >> B avg - disordered region, interpret with caution
- Missing side chain or sequence gap - region not modeled, likely disordered
- Two copies of same amino acid - multiple conformations modeled
- Waters/ligands often at end of file
The model is still a model:
- Best fit to data, doesn't mean everything is perfect or right
- Higher resolution models typically more accurate - use for homology modeling, molecular replacement, analysis of active site geometry, etc.
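A quick screen for the first two oddities can be sketched as follows. It assumes atoms are already available as dictionaries with `serial`, `occupancy`, and `b_factor` keys, and it uses an arbitrary cutoff of twice the mean B-factor to stand in for "B >> B avg"; both the data layout and the cutoff are illustrative assumptions, not a community standard.

```python
def flag_atoms(atoms, b_ratio=2.0):
    """Return (serial, reasons) for atoms with partial occupancy or unusually high B."""
    mean_b = sum(atom["b_factor"] for atom in atoms) / len(atoms)
    flagged = []
    for atom in atoms:
        reasons = []
        if atom["occupancy"] < 1.0:
            reasons.append("partial occupancy")
        if atom["b_factor"] > b_ratio * mean_b:  # assumed cutoff for "B >> B avg"
            reasons.append("high B-factor")
        if reasons:
            flagged.append((atom["serial"], reasons))
    return flagged


# Hypothetical atoms: one half-occupied, one with a very high B-factor
example_atoms = [
    {"serial": 1, "occupancy": 1.0, "b_factor": 20.0},
    {"serial": 2, "occupancy": 0.5, "b_factor": 22.0},
    {"serial": 3, "occupancy": 1.0, "b_factor": 90.0},
    {"serial": 4, "occupancy": 1.0, "b_factor": 24.0},
]
flags = flag_atoms(example_atoms)
print(flags)
```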

Representations of protein structure
- Ribbon representation: traces path of protein chain through space
- Surface representation: shows solid features of protein exterior
- Spheres and sticks: show atomic connections
Remember - a model is still a model!

Where is crystallography headed?
- Dissect mechanism and catalysis: structure/function studies; time resolved reactions
- Harder problems: dynamic, metastable complexes and assemblies; membrane proteins
- Rational ligand/inhibitor design
- Define cellular proteome

Where do we need physics?
- Detectors: increase sensitivity, dynamic range, speed
- Sources: benchtop synchrotrons; overcome radiation damage
- Crystallization: develop rational guidelines & novel approaches; use of non-diffracting/poorly-diffracting crystals
- Functional prediction: extracting/simulating dynamics from models and data; docking/modeling interactions
- Single protein diffraction: free electron lasers; data analyses