Chemical Shift Restraints Tools and Methods
Andrea Cavalli

Overview
- Methods
- Details
- Results/Discussion

Methods
- CHESHIRE: base, solid-state
- CamShift: new predictor, Monte Carlo/Molecular Dynamics
- CamDock: protein-protein docking

About CHESHIRE: CHEmical SHifts REstraints
3D structure determination from NMR chemical shifts.
- Chemical shifts are easy to measure
- They can be measured with great accuracy
- They contain a lot of structural information (CSI, TALOS, ...)
- In some cases they are the only available data
but...

NOE-NMR vs CHESHIRE
- NOEs have a direct structural interpretation as distances; chemical shift values are only indirectly related to geometry (SHIFTX, CamShift).
- NOEs carry long-range information; chemical shifts are local.
- NOEs are redundant; there is only one chemical shift per atom.
- NOEs offer clear quality control (number of assigned NOEs, NOE violations); chemical shifts offer only a weak Q-factor.

Idea
[Figure: three panels plotting score against Cα-RMSD: force field (free energy), chemical shifts (chemical shift score), and combined score.]
Structures have to be very close to the native one in order to feel the chemical shift score.

CHESHIRE: determination or prediction?
[Figure: methods placed on a scale from experiment to theory: X-ray, NMR, homology modeling > 50 %, CHESHIRE, homology modeling < 50 %, ab initio.]
Jigsaw puzzle

Steps
Inputs and tools: chemical shifts, database (SCOP domains), SHIFTX, energy function
1. Prediction of local geometry
2. Fragment selection
3. Fragment assembly
4. Refinement
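
Local structure 1
[Step: prediction of local geometry, from chemical shifts and database]

Secondary structure prediction
$$P_3(S_1, S_2, S_3 \mid AA_1, AA_2, AA_3), \qquad P_{cs}(S \mid \mathrm{H}\alpha, \mathrm{N}, \mathrm{C}\alpha, \mathrm{C}\beta, AA)$$
$$E = -\sum_{i=1}^{N} \log P_3(i) - K_{cs} \sum_{i=1}^{N} \log P_{cs}(i)$$

Secondary structure propensity
$$P(S \mid A) = \frac{N_S}{N}$$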
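
The secondary structure energy can be evaluated directly once the two probabilities are known for every residue. The following is a minimal sketch, assuming the reading E = -sum_i log P_3(i) - K_cs sum_i log P_cs(i) of the formula on this slide. The function name, the argument layout, and the default value of K_cs are hypothetical and are not taken from CHESHIRE.

```python
import math


def local_structure_energy(p3, p_cs, k_cs=1.0):
    """Return E = -sum(log P_3(i)) - K_cs * sum(log P_cs(i)).

    p3   : per-residue probabilities P_3(i) of the assigned state
    p_cs : per-residue probabilities P_cs(i) of the assigned state
    k_cs : weight of the chemical shift term (hypothetical default)
    """
    if len(p3) != len(p_cs):
        raise ValueError("p3 and p_cs must have one entry per residue")
    # Sequence-based term: negative log-likelihood of the triplet model.
    e_seq = -sum(math.log(p) for p in p3)
    # Chemical-shift-based term, weighted by K_cs.
    e_shift = -k_cs * sum(math.log(p) for p in p_cs)
    return e_seq + e_shift
```

Under this reading, a lower energy corresponds to a more probable assignment of secondary structure states.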

Local structure 2
[Step: prediction of local geometry, from chemical shifts and database]

Torsion angle prediction
$$S(\Phi_i, \psi_i \mid A, CS) = \mathrm{Sym}(B, A) + \mathrm{Sym}(CS_A, CS_B) + \mathrm{Sym}(S_A, S_B)$$
The three best-scoring cluster centers are taken as the prediction.

Fragment selection
[Step: fragment selection, from chemical shifts and database]

Fragments of length 3 and 9 aa
$$E = \sum_{i=1}^{N} E_{cs}(A_i, CS_A, B_i, CS_B) + K_{tor} \sum_{i=1}^{N} E_{tor}(\Phi_i, \psi_i, B)$$

Performance
Protein      3Pred   TOPOS
Ubiquitin    0.75    0.93
FF domain    0.90    0.86
Calbindin    0.85    0.95
HPR          0.87    0.86

Fold
[Step: fragment assembly, using the energy function]
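
Refinement 1
[Step: refinement, using chemical shifts and the energy function]

Energy function
$$E_{ref} = \frac{E_{ff}}{\log(1 - C_{cs})}$$
where
$$C_{cs} = \sum_{\chi} K_{\chi} (1 - C_{\chi}), \qquad \chi \in \{\mathrm{H}\alpha, \mathrm{N}, \mathrm{C}\alpha, \mathrm{C}\beta\}$$
and $C_{\chi}$ is the correlation for chemical shifts of type $\chi$.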
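
The chemical shift term can be illustrated with a short sketch. It assumes the reading C_cs = sum over chi of K_chi (1 - C_chi), and it further assumes that C_chi is the Pearson correlation between predicted and observed shifts of type chi; the slide only says "correlation". The function names, the dictionary layout, and any weights passed in are hypothetical. How C_cs is then combined with the force field energy is left out on purpose.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def chemical_shift_term(predicted, observed, weights):
    """Return C_cs = sum_chi K_chi * (1 - C_chi).

    predicted, observed : dicts mapping a shift type (e.g. "HA", "N",
                          "CA", "CB") to a list of shifts in the same order
    weights             : dict mapping the same types to K_chi (hypothetical)
    """
    return sum(
        k_chi * (1.0 - pearson(predicted[chi], observed[chi]))
        for chi, k_chi in weights.items()
    )
```

With this definition C_cs is zero when every shift type is perfectly correlated and grows as the agreement gets worse.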

Refinement 2
- Structures with a large radius of gyration (Rg) are discarded
- Side chains are added
- Initial ranking
- Keeps a list of the 100 best structures
- Takes one structure at random from the best list
- A new structure is generated by simulated annealing
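
The best-list procedure on this slide can be sketched as a short loop. This is an illustration only, assuming that lower scores are better and that `score` and `anneal` are caller-supplied functions; both names, the number of rounds, and the seed are hypothetical. The radius-of-gyration filter and the side-chain building step are not modelled here.

```python
import random


def refine(initial, score, anneal, n_rounds=1000, best_size=100, seed=0):
    """Keep a list of the best structures and refine it iteratively.

    initial   : starting structures (any objects accepted by score/anneal)
    score     : function structure -> float, lower is better (assumption)
    anneal    : function (structure, rng) -> new structure; stands in for
                the simulated annealing run
    best_size : length of the best list (100 on the slide)
    """
    rng = random.Random(seed)
    # Initial ranking: keep only the best-scoring structures.
    best = sorted(initial, key=score)[:best_size]
    for _ in range(n_rounds):
        # Take one structure at random from the best list.
        parent = rng.choice(best)
        # Generate a new structure from it.
        child = anneal(parent, rng)
        # Insert the new structure and truncate the list again.
        best.append(child)
        best.sort(key=score)
        del best[best_size:]
    return best
```

Because the current best structure is never removed, the best score in the list cannot get worse from one round to the next.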
Results

The largest
2GW6, 123 aa: 1.72 Å backbone RMSD

The smallest
1PV0, 46 aa: 1.37 Å backbone RMSD

Solid-State NMR of protein G
Structure   RMSD (Å)   N (5.5 Å)   Q (RDC)
1P7F        0.40       0           0.03
3GB1        0.59       0           0.16
2GB1        0.97       1           0.37
2JU6        1.86       5           0.48
2K0P        1.04       3           0.40

Failures
[Figure: score versus number of amino acids, with linear fit S = 48.09 - 4.2458*NA, R = -0.9794.]
[Figure: 1ZGG, score versus Cα-RMSD for the refined structures and the refined native structure, compared with the expected score.]
Why? Usually because the assembly stage does not generate low-RMSD models.
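
The linear fit on this slide gives an expected score for a protein of a given length, which suggests a simple consistency check. The sketch below uses the fitted coefficients from the slide (S = 48.09 - 4.2458*NA); the function names, the assumption that lower scores are better, and the tolerance value are hypothetical and are not part of CHESHIRE.

```python
def expected_score(n_residues):
    """Expected score from the linear fit S = 48.09 - 4.2458 * NA."""
    return 48.09 - 4.2458 * n_residues


def looks_like_failure(best_score, n_residues, tolerance=50.0):
    """Flag a run whose best score is well above the expected score.

    tolerance is a hypothetical margin, in the same units as the score;
    a real threshold would have to be calibrated on known structures.
    """
    return best_score > expected_score(n_residues) + tolerance
```

For a 100-residue protein the fit gives an expected score of about -376.5, so with the margin above a best score of -300 would be flagged and a best score of -380 would not.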

CamShift
[Figure: peptide backbone fragments.]
- Chemical shifts are predicted using distances to neighboring atoms
- As accurate as SHIFTX or SPARTA and orders of magnitude faster
- CamShift with a physical force field and ReX molecular dynamics
- ~1 Å RMSD starting from the unfolded state for small proteins (1uzc, 1ubq, ...)

CamShift-MD
- 2jvw: 61 residues, lowest energy structure 1.41 Å RMSD
- 2jva: 108 residues, lowest energy structure 1.98 Å RMSD

CamShift
Atom   CamShift (full)   CamShift (no long range)   Sparta
HN     0.53              0.61                       0.57
HA     0.29              0.37                       0.27
N      3.10              3.18                       2.52
CA     1.18              1.20                       0.98
CB     1.43              1.48                       1.07
CO     1.16              1.27                       1.08

Conclusions
- Protein structure determination with chemical shifts is possible... but difficult... very difficult...
- CHESHIRE works (at the moment) for proteins up to ~100 aa.
- Results are stable: ~1.0-2.0 Å Cα RMSD.
- A self-consistent criterion can (maybe) detect failures of the method.
- It can be used for complexes and with solid-state CS.
- http://www.open-almost.org

Acknowledgments
Michele Vendruscolo, Chris Dobson, Xavier Salvatella, Kai Kohlhof, Paul Robustelli, Danny Hsu, Rinaldo Wander Montalvao