Protein Structure Prediction

Similar documents
Protein Structure Prediction

Computer simulations of protein folding with a small number of distance restraints

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Programme Last week s quiz results + Summary Fold recognition Break Exercise: Modelling remote homologues

Protein Structures: Experiments and Modeling. Patrice Koehl

MMTSB Toolset. Michael Feig MMTSB/CTBP 2006 Summer Workshop. MMTSB/CTBP Summer Workshop Michael Feig, 2006.

Prediction and refinement of NMR structures from sparse experimental data

Protein Structure Prediction, Engineering & Design CHEM 430

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Protein Structure Prediction and Display

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy

Physiochemical Properties of Residues

Protein Structure Prediction

114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009

Protein Structure Determination

Bioinformatics: Secondary Structure Prediction

AbInitioProteinStructurePredictionviaaCombinationof Threading,LatticeFolding,Clustering,andStructure Refinement

Contact map guided ab initio structure prediction

PROTEIN SECONDARY STRUCTURE PREDICTION: AN APPLICATION OF CHOU-FASMAN ALGORITHM IN A HYPOTHETICAL PROTEIN OF SARS VIRUS

Protein structure. Protein structure. Amino acid residue. Cell communication channel. Bioinformatics Methods

BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer Protein Structure Prediction I

Presenter: She Zhang

Chemical Shift Restraints Tools and Methods. Andrea Cavalli

Homology Modeling (Comparative Structure Modeling) GBCB 5874: Problem Solving in GBCB

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Protein modeling and structure prediction with a reduced representation

CMPS 3110: Bioinformatics. Tertiary Structure Prediction

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction

TOUCHSTONE II: A New Approach to Ab Initio Protein Structure Prediction

Ab-initio protein structure prediction

3D Structure. Prediction & Assessment Pt. 2. David Wishart 3-41 Athabasca Hall

Supporting Online Material for

A Method for the Improvement of Threading-Based Protein Models

Bioinformatics: Secondary Structure Prediction

Protein Structure Determination from Pseudocontact Shifts Using ROSETTA

ALL LECTURES IN SB Introduction

Steps in protein modelling. Structure prediction, fold recognition and homology modelling. Basic principles of protein structure

Protein Secondary Structure Prediction

Bioinformatics III Structural Bioinformatics and Genome Analysis Part Protein Secondary Structure Prediction. Sepp Hochreiter

Secondary Structure. Bioch/BIMS 503 Lecture 2. Structure and Function of Proteins. Further Reading. Φ, Ψ angles alone determine protein structure

Protein Structure Prediction

Monte Carlo simulation of proteins through a random walk in energy space

Computational Protein Design

Homology Modeling I. Growth of the Protein Data Bank PDB. Basel, September 30, EMBnet course: Introduction to Protein Structure Bioinformatics

Copyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years.

Course Notes: Topics in Computational. Structural Biology.

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748

proteins Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field

Protein Secondary Structure Prediction using Feed-Forward Neural Network

Template-Based Modeling of Protein Structure

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Structural Alignment of Proteins

Neural Networks for Protein Structure Prediction Brown, JMB CS 466 Saurabh Sinha

Molecular Modeling lecture 2

TASSER: An Automated Method for the Prediction of Protein Tertiary Structures in CASP6

Central Dogma. modifications genome transcriptome proteome

Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program)

Protein Secondary Structure Prediction

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Protein Dynamics. The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron.

Protein Modeling. Generating, Evaluating and Refining Protein Homology Models

Packing of Secondary Structures

Computational Molecular Biology. Protein Structure and Homology Modeling

CAP 5510 Lecture 3 Protein Structures

Improved Protein Secondary Structure Prediction

April, The energy functions include:

Protein Structure Analysis with Sequential Monte Carlo Method. Jinfeng Zhang Computational Biology Lab Department of Statistics Harvard University

What makes a good graphene-binding peptide? Adsorption of amino acids and peptides at aqueous graphene interfaces: Electronic Supplementary

Molecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007

Details of Protein Structure

HIV protease inhibitor. Certain level of function can be found without structure. But a structure is a key to understand the detailed mechanism.

Conformational Sampling in Template-Free Protein Loop Structure Modeling: An Overview

Supplementary figure 1. Comparison of unbound ogm-csf and ogm-csf as captured in the GIF:GM-CSF complex. Alignment of two copies of unbound ovine

COMBINED MULTIPLE SEQUENCE REDUCED PROTEIN MODEL APPROACH TO PREDICT THE TERTIARY STRUCTURE OF SMALL PROTEINS

Basics of protein structure

Tools for Cryo-EM Map Fitting. Paul Emsley MRC Laboratory of Molecular Biology

Table 1. Crystallographic data collection, phasing and refinement statistics. Native Hg soaked Mn soaked 1 Mn soaked 2

Can protein model accuracy be. identified? NO! CBS, BioCentrum, Morten Nielsen, DTU

Molecular simulation and structure prediction using CHARMM and the MMTSB Tool Set Free Energy Methods

3DRobot: automated generation of diverse and well-packed protein structure decoys

Introduction to" Protein Structure

Can a continuum solvent model reproduce the free energy landscape of a β-hairpin folding in water?

Application of Statistical Potentials to Protein Structure Refinement from Low Resolution Ab Initio Models

Figure 1. Molecules geometries of 5021 and Each neutral group in CHARMM topology was grouped in dash circle.

Orientational degeneracy in the presence of one alignment tensor.

7.91 Amy Keating. Solving structures using X-ray crystallography & NMR spectroscopy

FlexPepDock In a nutshell

Section Week 3. Junaid Malek, M.D.

Major Types of Association of Proteins with Cell Membranes. From Alberts et al

) P = 1 if exp # " s. + 0 otherwise

Helix-coil and beta sheet-coil transitions in a simplified, yet realistic protein model

Supplementary Information Intrinsic Localized Modes in Proteins

Chapter 9. Loop Simulations. Maxim Totrov. Abstract. 1. Introduction

Multi-Scale Hierarchical Structure Prediction of Helical Transmembrane Proteins

Building 3D models of proteins

Procheck output. Bond angles (Procheck) Structure verification and validation Bond lengths (Procheck) Introduction to Bioinformatics.

EBBA: Efficient Branch and Bound Algorithm for Protein Decoy Generation

Energy Minimization of Protein Tertiary Structure by Parallel Simulated Annealing using Genetic Crossover

Transcription:

Protein Structure Prediction Michael Feig MMTSB/CTBP 2006 Summer Workshop

From Sequence to Structure SEALGDTIVKNA

Ab initio Structure Prediction Protocol Amino Acid Sequence Conformational Sampling to generate native-like structures Scoring & Clustering to identify most native-like structures 3D Structure

Folding with All-Atom Models AAQAAAAQAAAAQAA CHARMM force field Implicit solvent replica exchange simulations 8 replicas, 10 ns/replica

Folding with Low-Resolution Model EQQNAFYEILHLPNLNEEQRNGFIQSLKDDPSQSANLLAEAKKLNDAQA SICHO model, MONSSTER simulated annealing run

SICHO Lattice Model Monte Carlo simulations: Thr Leu > Attempt move > Compute ΔE Phe > Accept with probability p: Asp %1 "E # 0 p =& 'exp($"e / k BT ) "E > 0 Simulated annealing! MMTSB/CTBP Summer Workshop Constant Temperature Replica Exchange Sampling Kolinski & Skolnick: Proteins 32, 475 (1998) Michael Feig, 2006.

Excluded volume SICHO Energy Function Knowledge-Based Terms Side chain burial propensity follows Kyte-Doolittle scale Centrosymmetric bias r g = 2.2 N 0.38 res Ala Arg Asp Ile 1.8-4.5-3.5 4.5

SICHO Energy Function Statistical Terms Potential of mean force (PMF): p " #E ij i kbt = e p j energy 2.5 2.0 1.5 1.0 0.5 0.0-0.5-1.0-1.5 "E = #kt ln(p) extended 0 1 2 3 4 5 6 7 8 9 10 11 GLU-GLU r(i,i+4) in Å helix

Conformational Sampling with SICHO All-Atom Energy in kcal/mol Protein A -3200-3250 -3300-3350 -3400 1 2 3 4 5 6 7 8 9 10 11 12 RMSD from native in Å MMTSB/CTBP Summer Workshop Michael Feig, 2006.

Ab initio Structure Prediction Protocol Amino Acid Sequence Secondary Structure Prediction PSIPRED et al. Efficient Sampling e.g. MONSSTER/SICHO All-Atom Reconstruction Scoring & Clustering e.g. MMGB/SA, DFIRE 3D Structure

Secondary Structure Prediction GDPIVKNAKLDSRLANKEALRLL?

Secondary Structure Propensities α-helix β-sheet turn Chou & Fasman (1974)

Secondary Structure Prediction Methods SABLE PSIPRED PSSP SAM-T99-sec PHD C+F 77.6% 76.2% 75.1% 76.1% 72.3% 50-60% C+F: Chou & Fasman GOR: Garnier, Osguthorpe, Robson

Secondary structure prediction Per-Residue Accuracy

Realistic ab initio Structure Prediction -2400-2450 CHARMM22/GBMV. -2500-2550 -2600-2650 -2700-2750 6 7 8 9 10 11 12 13 14 C! RMSD in Å (62 residues)

Sampling, Scoring, Clustering -2400-2450 CHARMM22/GBMV. -2500-2550 -2600-2650 -2700-2750 6 7 8 9 10 11 12 13 14 C! RMSD in Å (62 residues)

Ab initio Predictions DNase fragmentation factor NMR structure 1KOY Best-scoring prediction 7.4 Å RMSD

Scoring Functions Knowledge-based/statistical derived from known protein structures limited by training data usually fast e.g. DFIRE, RAPDF, prosaii Force field based model physical energy landscape more robust and transferable often expensive (require minimization) e.g. MMPB(GB)/SA, UNRES

-3500 Scoring Function Comparison MMGB/SA vs. DFIRE -8000 MMGB/SA score -3600-3700 -3800-3900 -4000-4100 DFIRE score -8500-9000 -9500-10000 -4200-10500 5.5 6 6.5 7 7.5 8 8.5 5.5 6 6.5 7 7.5 8 8.5 C! RMSD in Å 15.8 15.6 15.4 15.2 15 14.8 14.6 radius of gyration C! RMSD in Å 14.4 14.2 5.5 6 6.5 7 7.5 8 8.5 C! RMSD in Å

Sampling with Restraints Secondary structure bias Secondary structure prediction NMR shift data Distance restraints Experimental restraints (disulfides, NMR, EPR) Side chain contacts from analogous structures Shape restraints cryoem data, small-angle X-ray scattering

but the solution may lie elsewhere.

Sequence Homology Human thioredoxin (1AUC) SDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTV A : :....:..::: : :::::::: :.....:.... MVKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEKYSNVIFL - KLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDAN---LA...:..:....::..::.:. :::.: : :: :.:. :. EVDVDDCQDVASECEVKCMPTFQFFKKGQ----KVGEFS-GANKEKLEATINELV E. Coli thioredoxin (1THO)

Comparative Modeling Assumption: Proteins with similar sequence have similar structure Human thioredoxin (1AUC) E. Coli thioredoxin (1THO)

Structural Templates from Homology Challenges: Correct alignment Loop modeling Side chain rebuilding PGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDAN---LA.:....::..::.:. :::.: : :: :.:. :. QDVASECEVKCMPTFQFFKKGQ----KVGEFS-GANKEKLEATINELV

Accuracy of Predictions by Homology 100 % predictions 80 60 40 20 0 20 30 40 50 60 70 80 90 100 % sequence identity <2Å RMSD <5Å RMSD

Prediction through Fold Recognition Assumption: Proteins with similar secondary structure share fold 1N91 1JRM

Templates through Fold Recognition Challenges: Wrong templates Alignment uncertain Fragment modeling Refinement needed

Ab initio Sampling in Template-based Structure Prediction Template provides known protein structure Ab initio sampling of unknown fragments in the context of template

Template Restraints Near Flexible Part 0.0 0.1 0.4 0.7 0.2 1.0 Restraint potential: U = f " k (r # r 0 ) 2

Loop Sampling Methods # Residues 1-2 1-3 2-12 5-30 2-100 Sampling All-Atom Reconstruction Exhaustive Search Torsional Space MC/MD Multi-Scale Fragment-based Program MMTSB Tool Set Modeller (Sali) MMTSB Tool Set Rosetta (Baker)

Structural Genomics Efforts

Structure Refinement? predicted native (NMR)