Chemical Shift Restraints Tools and Methods. Andrea Cavalli



Overview
- Methods
- Details
- Results/Discussion

Methods
- Cheshire: base, solid-state
- CamShift: new predictor, Monte Carlo/Molecular Dynamics
- CamDock: protein-protein docking

About CHESHIRE: CHEmical SHifts REstraints
3D structure determination from NMR chemical shifts.
- Chemical shifts are easy to measure
- They can be measured with great accuracy
- They contain a lot of structural information (CSI, TALOS, ...)
- In some cases they are the only available data
but...

NOE-NMR vs CHESHIRE
NOE-NMR | CHESHIRE
NOEs have a direct structural interpretation as distances | Chemical shift values are indirectly related to geometry (SHIFTX, CamShift)
NOEs have long-range information | Chemical shifts are local
NOEs are redundant | There is only one chemical shift per atom
Clear quality control (number of assigned NOEs, NOE violations) | Weak Q-factor

Idea
[Figure: three panels plotted against Cα-RMSD (0 to 10): "Force field" (free energy), "Chemical shifts" (chemical shift score), and "Combined score".]
Structures have to be very close to the native one in order to feel the chemical shift score.

CHESHIRE
Determination or prediction?
Scale from Experiment to Theory: X-ray, NMR, Homology modeling > 50 %, CHESHIRE, Homology modeling < 50 %, ab initio.
Jigsaw puzzle

Steps
- Inputs: chemical shifts; database (SCOP domains)
- Tools: SHIFTX; energy function
- Stages: prediction of local geometry, fragment selection, fragment assembly, refinement

Local structure 1
Chemical shifts + database: prediction of local geometry
Secondary structure prediction
Probabilities: $P_3(S_1, S_2, S_3 \mid AA_1, AA_2, AA_3)$ and $P_{cs}(S \mid H\alpha, N, C\alpha, C\beta, AA)$
$$E = -\sum_{i=1}^{N} \log P_3(i) - K_{cs} \sum_{i=1}^{N} \log P_{cs}(i)$$
Secondary structure propensity: $P(S \mid A) = N_S / N$
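As a rough illustration of the secondary structure energy on this slide, the sketch below sums log-probabilities over residues. It assumes both sums enter with a negative sign (a negative log-likelihood), since the signs are not legible in the transcript. The function name, the weight `k_cs`, and the probability values are made up for the example.

```python
import math

def secondary_structure_energy(p3, pcs, k_cs=1.0):
    """E = -sum_i log P3(i) - k_cs * sum_i log Pcs(i).

    p3  : per-residue probabilities from the sequence triplet term (assumed input)
    pcs : per-residue probabilities from the chemical shift term (assumed input)
    """
    if len(p3) != len(pcs):
        raise ValueError("p3 and pcs must have one entry per residue")
    seq_term = -sum(math.log(p) for p in p3)
    cs_term = -sum(math.log(p) for p in pcs)
    return seq_term + k_cs * cs_term

# Hypothetical probabilities for a likely and an unlikely assignment.
likely = secondary_structure_energy([0.8, 0.7, 0.9], [0.9, 0.8, 0.9])
unlikely = secondary_structure_energy([0.3, 0.2, 0.4], [0.2, 0.3, 0.1])
```

Under this sign convention the more probable assignment gets the lower energy, so minimizing E favors it.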

Local structure 2
Chemical shifts + database: prediction of local geometry
Torsion angle prediction
$$S(\phi_i, \psi_i \mid A, CS) = \mathrm{Sym}(B, A) + \mathrm{Sym}(CS_A, CS_B) + \mathrm{Sym}(S_A, S_B)$$
The three best-scoring cluster centers are taken as the prediction.

Fragment selection
Chemical shifts + database: fragment selection
Fragments of length 3 and 9 aa
$$E = \sum_{i=1}^{N} E_{cs}(A_i, CS_A, B_i, CS_B) + K_{tor} \sum_{i=1}^{N} E_{tor}(\phi_i, \psi_i, b)$$
Performance
Protein | 3Pred | TOPOS
Ubiquitin | 0.75 | 0.93
FF domain | 0.90 | 0.86
Calbindin | 0.85 | 0.95
HPR | 0.87 | 0.86
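The fragment selection score can be pictured as a ranking problem: each database fragment gets a chemical shift term plus a weighted torsion term, and the lowest-scoring fragments are kept. The slide does not give the form of the individual terms, so the sketch below uses assumed stand-ins (squared shift differences and absolute angle differences). The weight `k_tor`, the dictionary layout, and all numbers are hypothetical.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def fragment_energy(query, candidate, k_tor=0.01):
    """Chemical shift term plus k_tor times a torsion term (both assumed forms)."""
    cs_term = sum((q - c) ** 2 for q, c in zip(query["cs"], candidate["cs"]))
    tor_term = sum(
        angle_diff(q_phi, c_phi) + angle_diff(q_psi, c_psi)
        for (q_phi, q_psi), (c_phi, c_psi) in zip(query["phi_psi"], candidate["phi_psi"])
    )
    return cs_term + k_tor * tor_term

def select_fragments(query, database, n_best=2):
    """Return the n_best database fragments with the lowest energy."""
    return sorted(database, key=lambda frag: fragment_energy(query, frag))[:n_best]

# Toy 3-residue query: one shift per residue and predicted (phi, psi) angles.
query = {"cs": [58.1, 57.9, 58.4], "phi_psi": [(-60, -45)] * 3}
database = [
    {"name": "strand", "cs": [55.0, 54.8, 55.2], "phi_psi": [(-120, 130)] * 3},
    {"name": "helix", "cs": [58.0, 58.0, 58.5],
     "phi_psi": [(-62, -41), (-60, -45), (-65, -40)]},
    {"name": "coil", "cs": [56.5, 57.0, 56.0],
     "phi_psi": [(-80, 150), (-70, -30), (60, 40)]},
]
selected = [frag["name"] for frag in select_fragments(query, database)]
```

With these toy values the helical fragment matches both the shifts and the predicted angles best, so it ranks first.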

Fold Fragment assembly Energy function


Refinement 1
Chemical shifts + energy function: refinement
Energy function
$$E_{ref} = E_{ff} / \log(1 + C_{cs})$$
where $C_{cs} = \sum_{\chi} K_{\chi} (1 - C_{\chi})$, $C_{\chi}$ is the correlation of CS type $\chi$, and $\chi \in \{H\alpha, N, C\alpha, C\beta\}$.
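A small numerical sketch may help show how the refinement energy combines the force field with the chemical shift correlations. The operators inside the expression are not legible in the transcript, so the code assumes E_ref = E_ff / log(1 + C_cs) with C_cs = sum over types of K * (1 - C). The weights, correlations, and force-field value are hypothetical.

```python
import math

CS_TYPES = ("HA", "N", "CA", "CB")

def refinement_energy(e_ff, correlations, weights):
    """E_ref = E_ff / log(1 + C_cs), with C_cs = sum_chi K_chi * (1 - C_chi).

    The '+' and '-' signs are assumptions; the slide text lost them.
    """
    c_cs = sum(weights[t] * (1.0 - correlations[t]) for t in CS_TYPES)
    if c_cs <= 0.0:
        # Perfect correlation makes log(1 + C_cs) zero; report it explicitly.
        raise ValueError("C_cs must be positive for the ratio to be defined")
    return e_ff / math.log(1.0 + c_cs)

weights = {t: 1.0 for t in CS_TYPES}            # hypothetical K_chi
good_fit = {t: 0.95 for t in CS_TYPES}          # predicted shifts track experiment
poor_fit = {t: 0.70 for t in CS_TYPES}

e_good = refinement_energy(-900.0, good_fit, weights)
e_poor = refinement_energy(-900.0, poor_fit, weights)
```

With a negative force-field energy, this assumed form rewards better agreement with the measured shifts by making the combined energy more negative.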

Refinement 2
- Structures with a large Rg are discarded
- Side chains are added
- Initial ranking
- A list of the 100 best structures is kept
- One structure is taken at random from the best list
- A new structure is generated from it by simulated annealing
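The best-list procedure on this slide can be sketched as a short loop. This is only an illustration under assumptions: structures are represented by single numbers, `toy_energy` stands in for the refinement energy, and `toy_perturb` stands in for a simulated annealing run. The iteration count and the seed are arbitrary, and the Rg filter and side-chain building are not shown.

```python
import random

def refine(initial, energy, perturb, n_best=100, n_iter=1000, seed=0):
    """Keep a list of the n_best lowest-energy structures and grow it iteratively."""
    rng = random.Random(seed)
    best = sorted(initial, key=energy)[:n_best]    # initial ranking
    for _ in range(n_iter):
        parent = rng.choice(best)                  # random pick from the best list
        child = perturb(parent, rng)               # stand-in for simulated annealing
        best.append(child)
        best.sort(key=energy)
        del best[n_best:]                          # drop anything beyond n_best
    return best

def toy_energy(x):
    # Hypothetical one-dimensional energy with its minimum at x = 0.
    return x * x

def toy_perturb(x, rng):
    # Hypothetical move: small Gaussian step.
    return x + rng.gauss(0.0, 0.1)

start = [3.0, -2.5, 1.7, 4.2, -0.9, 2.2, -3.3, 0.8, 1.1, -1.6]
final = refine(start, toy_energy, toy_perturb, n_best=5, n_iter=200)
```

Because the lowest-energy member is never removed, the best energy in the list cannot get worse from one iteration to the next.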

Results


The largest
2GW6, 123 aa, 1.72 Å backbone RMSD

The smallest
1PV0, 46 aa, 1.37 Å backbone RMSD

Solid-State NMR of protein G
Structure | RMSD | N (5.5 Å) | Q (RDC)
1P7F | 0.40 | 0 | 0.03
3GB1 | 0.59 | 0 | 0.16
2GB1 | 0.97 | 1 | 0.37
2JU6 | 1.86 | 5 | 0.48
2K0P | 1.04 | 3 | 0.40

Failures
[Figure: score versus number of amino acids (60 to 200), with linear fit S = 48.09 - 4.2458*NA, R = -0.9794.]
[Figure: 1ZGG, score versus Cα-RMSD, showing refined structures, the refined native structure, and the expected score.]
Why? Usually because the assembly stage does not generate low-RMSD models.
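The linear fit on this slide suggests a simple self-consistency check: compare the score of a refined structure with the score expected for a protein of that length. The sketch below uses the fitted coefficients from the slide. The decision rule and the `tolerance` value are assumptions for illustration, not part of the original method, and the fit was shown only for roughly 60 to 200 amino acids.

```python
def expected_score(n_residues):
    """Expected score from the fit on the slide: S = 48.09 - 4.2458 * NA."""
    return 48.09 - 4.2458 * n_residues

def looks_like_failure(score, n_residues, tolerance=50.0):
    """Flag a run whose best score lies well above the expected score.

    Lower (more negative) scores are better; `tolerance` is a hypothetical margin.
    """
    return score > expected_score(n_residues) + tolerance

# Hypothetical refined scores for a 100-residue protein.
expected_100 = expected_score(100)
suspicious = looks_like_failure(-200.0, 100)
plausible = looks_like_failure(-380.0, 100)
```

A refined score far above the line would hint that the assembly stage never reached low-RMSD models, which is the failure mode named on the slide.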

CamShift
- Chemical shifts are predicted using distances to neighboring atoms
- As accurate as ShiftX or Sparta and orders of magnitude faster
- CamShift with a physical force field and ReX molecular dynamics
- ~1 Å from unfolded for small proteins (1uzc, 1ubq, ...)
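To make the idea of predicting a shift from distances to neighboring atoms concrete, here is a toy sketch. It assumes each neighbor within a cutoff adds a power-law term in the distance to a baseline value. The coefficient, exponent, cutoff, and coordinates are invented for illustration and are not CamShift parameters.

```python
import math

def predict_shift(atom, neighbors, baseline, coeff=1.0, power=-3.0, cutoff=5.0):
    """Toy distance-based predictor: baseline + sum of coeff * d**power.

    atom      : (x, y, z) of the nucleus of interest
    neighbors : list of (x, y, z) for surrounding atoms
    All parameters are hypothetical placeholders.
    """
    shift = baseline
    for other in neighbors:
        d = math.dist(atom, other)
        if 0.0 < d <= cutoff:          # ignore the atom itself and distant atoms
            shift += coeff * d ** power
    return shift

origin = (0.0, 0.0, 0.0)
isolated = predict_shift(origin, [], baseline=4.5)
near = predict_shift(origin, [(2.0, 0.0, 0.0)], baseline=4.5)
far = predict_shift(origin, [(4.0, 0.0, 0.0)], baseline=4.5)
beyond = predict_shift(origin, [(10.0, 0.0, 0.0)], baseline=4.5)
```

A function of this kind is a smooth function of the coordinates, which is one reason a distance-based predictor is convenient to combine with molecular dynamics.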

CamShift-MD
- 2jvw: 61 residues, lowest energy structure 1.41 Å RMSD
- 2jva: 108 residues, lowest energy structure 1.98 Å RMSD

CamShift
Nucleus | CamShift (full) | CamShift (no long range) | Sparta
HN | 0.53 | 0.61 | 0.57
HA | 0.29 | 0.37 | 0.27
N | 3.10 | 3.18 | 2.52
CA | 1.18 | 1.20 | 0.98
CB | 1.43 | 1.48 | 1.07
CO | 1.16 | 1.27 | 1.08

Conclusions
- Protein structure determination with chemical shifts is possible... but difficult... very difficult...
- CHESHIRE works (at the moment) for proteins up to ~100 aa.
- Results are stable at ~1.0-2.0 Å Cα RMSD.
- A self-consistent criterion can (maybe) detect failures of the method.
- It can be used for complexes and with solid-state CS.
http://www.open-almost.org

Acknowledgments
Michele Vendruscolo, Chris Dobson, Xavier Salvatella, Kai Kohlhof, Paul Robustelli, Danny Hsu, Rinaldo Wander Montalvao