Protein Structure Prediction
- Reginald Harrell
Russ B. Altman
BMI 214 / CS 274

Protein folding is different from structure prediction
-- Folding is concerned with the process of taking on the 3D shape, usually based on physical principles.
-- Prediction uses any statistical, theoretical or empirical data to try to get at the end result.

Protein Structure Prediction
1. A bit of history: Asilomar, 1994, 1996, 1998 &
2. Four approaches to structure prediction:
   a. Homology Modeling
   b. Ab initio prediction
   c. Sequence-Structure Threading
   d. Docking
3. Two ways of threading
   - Dynamic programming
   - Knowledge-based potentials

Asilomar, 1994, 1996, 1998 &
1. Asilomar is a state conference ground near Carmel and Monterey.
2. December 1994: Meeting on Critical Assessment of Techniques for Protein Structure Prediction (CASP)
3. December 1996 & 1998: Second and third meetings, etc.
4. A competition was held to compare and contrast methods.

Asilomar
4. The competition worked like this:
   - Experimentalists who had a structure that would be solved before the date of the CASP meeting submitted the sequence of the unknown to a central repository.
   - Predictors could download the sequence and minimal information about the protein (name), and could enter one of three categories.
   - Assessors used automatic programs for analysis, in addition to their own expertise, to evaluate the quality of predictions.

Asilomar Categories
1. Homology Modeling (sequences with high homology to sequences of known structure)
   - Given a sequence with homology > 25-30% to a known structure in the PDB, use the known structure as the starting point to create a model of the 3D structure of the sequence.
   - Takes advantage of knowledge of a closely related protein.
   - Use sequence alignment techniques to establish correspondences between the known template and the unknown.
Asilomar Categories
2. Ab initio prediction (no known homology with any sequence of known structure)
   - Given only the sequence, predict the 3D structure from first principles, based on energetic or statistical principles.
   - Secondary structure prediction and multiple alignment techniques are used to predict features of these molecules.
   - Then some method is necessary for assembling the 3D structure.

Ab initio prediction
New sequence:
MLDTNMKTQLKAYLEKLTKPVELIATLDDSAKSAEIKELL
Predict secondary structure:
MLDTNMKTQLKAYLEKLTKPVELIATLDDSAKSAEIKELL
HHHHHCCCCCHHHHHHHHHHCCCCBBBBBBBCCBBBB
Predict 3D structure entirely:

[Figure] Comparison of calculated (red) and experimental (blue) structures for the protein myoglobin using the refined potential function. The calculated structure is the lowest-energy structure obtained from 3 different jobs with clustering and energy selection. The total simulation time on a 16-node partition of a CM-5 massively parallel computer was 60 hours, in which about 5 billion structures were generated. The RMS deviation between the two structures is 6.2 Å.

Asilomar Categories
3. Fold recognition (sequences with no significant sequence identity (<= 30%) to sequences of known structure)
   - Given the sequence and a set of folds observed in the PDB, see whether the sequence could adopt one of the known folds.
   - Takes advantage of knowledge of existing structures, and of the principles by which they are stabilized (favorable interactions).
Fold Recognition
New sequence:
MLDTNMKTQLKAYLEKLTKPVELIATLDDSAKSAEIKELL
Library of known folds:
[Figure: candidate folds from the library]

Asilomar Categories
4. Docking two proteins ('96 only)
   - Given two separate (known) protein structures, predict the geometry of their physical association.
   - Use information about surface properties to find the best hand/glove or lock/key fit between two known structures.
   - Can be done by rigid body docking or flexible docking (harder).

[Figure: Protein Docking]

How to evaluate predictions?
- RMSD
- Overall identification and topology of secondary structures
- Energy considerations (contacts, H-bonds)
- Similarity of hydrophobic core
- Sequence alignment quality (and systematic shift)
See review of CASP4 at

Homology Modeling
- When sequence homology is > 70%, high resolution models are possible (< 3 Å RMSD).
- Sophisticated energy minimization techniques do not dramatically improve upon the initial guess.

Sample Homology Modeling
MODELLER (Sali et al, see course web page)
1. Find homologous proteins with known structure and align.
2. Collect distance distributions between atoms in known protein structures.
3. Use these distributions to compute positions for equivalent atoms in the alignment.
4. Refine using energetics.
Rigorous criteria are applied, such as torsion angles, van der Waals violations, and RMSD.
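RMSD appears above both as an evaluation criterion and as a quality bound for homology models, so a small worked example may help. The following is a minimal sketch, not the assessors' actual procedure: it assumes the two structures are already superimposed and that atom i in one list corresponds to atom i in the other. The function name `rmsd` and the toy coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)
    points. Assumes the structures are already superimposed and that the
    atoms are listed in corresponding order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy coordinates in angstroms (hypothetical, not from any real structure):
# the "experimental" atoms are the model atoms shifted by 1 angstrom in z.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experiment = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(model, experiment))
```

In practice the optimal superposition has to be found first; this sketch leaves that step out on purpose.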
[Figure] Homology modeling sample. Thick backbone shows the known structure. Thin lines show modeled structures. Some sidechains are not positioned correctly, but the backbone and the other sidechains look quite good.

[Figure panels]
a. Sidechain mistakes
b. Shifts with correct alignment
c. No template
d. Misalignment
e. Incorrect template

- Use of sensitive multiple alignment techniques (e.g. PSI-BLAST) helped get the best alignments.
- Sidechain modeling uses libraries of known amino acid conformations. Success ranged from 45% to 80% correct (= angles within 30° of the experimental structure).
- Energy-based refinement is still not improving the structures.

PSI-BLAST
Extension of BLAST with extra features:
1. Multiple blocks aligned (not just 1)
2. Profile used iteratively to increase sensitivity in picking up distant sequences
   - build a profile based on the initial hits
   - use the profile to conduct another search
   - rebuild the profile
   - repeat
3. Be careful about repeating too many times

[Figure: PSI-BLAST drift]
[Figure: PSI-BLAST overview]

(Skip fold recognition and come back to it.)

Ab Initio Predictions 1° to 2°: (Secondary structure prediction)
- Range of accuracy from 66% to 77% (3-state labeling: helix, coil or beta).
- Human hand editing improves the accuracy.
- Multiple sequence alignments improve the performance of secondary structure prediction.

Ab Initio Predictions 2° to 3°: (Assemble secondary structures into 3D)
- Sensitive to errors in secondary structure.
- Predictors were more likely to predict previously known structures.

Ab Initio Predictions 1° to 3°: (Predict 3D from sequence only)
- Predict interresidue contacts and then compute the structure (mild success).
- Simplified energy term + reduced search space (phi/psi or lattice) (moderate success).
- Creative ways to memorize sequence <-> structure correlations in short segments from the PDB, and use these to model new structures: the ROSETTA method.

Ab Initio Predictions 1° to 3°: Good progress (3 models better than fold recognition results in CASP III)
1. Associate the sequence of the unknown with a library of known 3D structures, and then optimize the contact frequency of amino acids, as measured in the PDB (Baker et al).
2. Generate all folds on a lattice and then filter the bad ones out (Samudrala et al).
3. Combine multiple sequence alignment, secondary structure prediction and lattice (Skolnick et al).
[Figure: Lattice search]

Rosetta Method for ab initio
1. Break the target into fragments of 9 amino acids.
2. Create a profile, X, for the target.
3. Create a profile, S, for similar PDB sequences.
4. Align profiles X and S to get a rank-ordered list of the best-matching fragments in the PDB.
(REF: Simons Baker, JMB 306: )

Rosetta Method for ab initio
5. Start with an extended chain, and evaluate the effect of introducing the fragments into the chain.
6. Use a Metropolis-type algorithm for optimization, using the following terms:
   - hydrophobic burial
   - polar side-chain interactions
   - hydrogen bonding between beta-strands
   - hard sphere repulsion (van der Waals)
7. Create 1000 structures and cluster them.
8. Choose one representative from each cluster as a possible prediction.

[Figure] Use an ellipsoid to be sure that hydrophobic residues are central.
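The Metropolis-type optimization step named in the Rosetta outline above can be illustrated with a toy problem. This is a minimal sketch of the generic Metropolis acceptance rule, not the Rosetta implementation: the energy function, the "fragment insertion" move (overwriting a window of three values), the temperature, and all names are assumptions made up for illustration.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept moves that do not raise the energy;
    accept uphill moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def minimize(energy, propose, start, steps, temperature, seed=0):
    """Run a Metropolis search and return the lowest-energy state seen."""
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    for _ in range(steps):
        candidate = propose(current, rng)
        candidate_e = energy(candidate)
        if metropolis_accept(candidate_e - current_e, temperature, rng):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e

def toy_energy(angles):
    # Hypothetical stand-in for the real scoring terms: lowest at all zeros.
    return sum(a * a for a in angles)

def toy_propose(angles, rng):
    # Hypothetical stand-in for fragment insertion: overwrite a window of
    # three consecutive positions with new values.
    new = list(angles)
    start = rng.randrange(len(new) - 2)
    for i in range(start, start + 3):
        new[i] = rng.uniform(-1.0, 1.0)
    return new

start = [1.0] * 9
best, best_e = minimize(toy_energy, toy_propose, start, steps=2000, temperature=0.1)
print(best_e)
```

Occasionally accepting uphill moves is what lets the search escape local minima; a purely greedy search would accept only improvements.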
CASP IV Performance
[Figure: Performance of the Rosetta method]

Alexey Murzin (Proteins, Volume 45, Issue S5, Pages 76-85)
"In 1996, in CASP2, we presented a semimanual approach to the prediction of protein structure that was aimed at the recognition of probable distant homology, where it existed, between a given target protein and a protein of known structure (Murzin and Bateman, [Proteins 1997; Suppl 1: ]). Central to our method was the knowledge of all known structural and probable evolutionary relationships among proteins of known structure classified in the SCOP database (Murzin et al., J Mol Biol 1995;247: ). It was demonstrated that a knowledge-based approach could compete successfully with the best computational methods of the time in the correct recognition of the target protein fold."

Murzin prediction, CASP IV
[Figure panels: Experimental, Predicted]

The computational community responds: Alexey can't play!
Fold Recognition (check if sequence matches known 3D fold)
- CASP1: Of 21 target proteins, 11 wound up having folds that were previously known.
- CASP2: Of 22 targets, 15 with available folds
- CASP3: Of 43 targets, 36 with available folds
- CASP4: Of 56 target domains, hard to say
- Every predictor does well on something.
- Common folds (more examples) are easier to recognize.
- Fold recognition was the surprise performer at the first competition. Incremental progress at the second, third and fourth.

Fold Recognition
- Not all or none. The list of top N hits is much better than the top hit alone.
- Common folds are easier to recognize.
- The quality of the alignments that result is NOT good.
- Potentials include: residue pair contact terms, hydrophobicity, polarity, H-bonds, local structure terms.
- Simple dynamic programming with environmental matching sometimes performs as well as sophisticated 3D potentials.

Fold Recognition
New sequence:
MLDTNMKTQLKAYLEKLTKPVELIATLDDSAKSAEIKELL
Library of known folds:
[Figure: candidate folds from the library]

[Three figures, each captioned: N-1 = target, N-2 = fold in PDB]
Fold Recognition ~ Threading ~ Inverse Folding
- Fold Recognition: given a sequence and a library of backbones, find the backbone that accommodates the sequence best.
- Threading: given a backbone, find the best way to mount the sequence on the backbone (with gaps) to maximize good interactions.
- Inverse Folding: (Folding = sequence to 3D.) Start with 3D and find a good sequence.

[Figure] Predictors for CASP I are along the top row. Target sequences are along the first column. Dark gray means bad prediction, light gray pretty good, white very good. Hatched means no prediction. The upper left corner of each cell shows the rank of the best answer among the list submitted by the predictor (also shows the fold used to make the prediction, the shift error and the general protein class).

Elements of a fold recognition algorithm
1. Library of protein structures, suitably processed
   - All structures
   - Representative subset
   - Structures with loops removed
2. Scoring function
   - contact potential
   - environmental evaluation function
3. Method for generating initial alignments and/or searching for better alignments.

Dynamic Programming with Environmental Strings
(The subject of one of the homeworks)
IDEA: Instead of aligning a sequence to a sequence, align a sequence to a string of descriptors that describe the 3D environment of the target structure.
- Usual DP: the score matrix relates two amino acids (A R N D C Q against A R N D C Q).
- Threading DP: the score matrix relates amino acids to environments in the 3D structure (A R N D C Q against E1 E2 E3 E4 E5 ...).
What are environments? How do you compute them?
Conceptually, superimpose multiple structures and look at the statistically conserved features around each 3D xyz position. This may include:
- Is the amino acid buried, partially buried or exposed?
- If buried, how polar is the environment?
- If partially buried, how polar?
- What kind of secondary structure?
(Buried status, polarity and secondary structure)

1. Align proteins with similar 3D structure.
2. Align homologous proteins by sequence alone.
3. For each position in the protein, identify what environment it is in by computing the local properties of interest (e.g. secondary structure, buried status, polarity).
4. Count the frequencies of the different amino acids (within the multiple alignment) in the different environments. This creates a MATCH MATRIX.

Bowie et al define 18 environments. This is another example of position-specific scores.

DP threading Match Matrix
[Figure] Sample matrix showing the alignment of amino acids and environments for globins. Entries indicate the possible score for each amino acid at each environmental position, taken from the match matrix.
[Figure] Z-scores of DP threading for myoglobins, globins and non-globins.

How do you thread a new sequence?
- Using standard dynamic programming, use the new score matrix to align the sequence of environments from the structure of interest to the sequence of amino acids from the unknown sequence.
- The highest scoring alignment is the best superposition of the sequence onto the structure.
- Using knowledge of the scores of sequences with known structure, one can see whether the score is high enough to put the new sequence in the family.
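The threading step described above, standard dynamic programming with an environment-versus-amino-acid score matrix, can be sketched as follows. This is a minimal global alignment with a linear gap penalty, offered as an illustration rather than the method of Bowie et al: the two-letter environment alphabet ("B" for buried, "E" for exposed), the match-matrix values and the gap penalty are all made up.

```python
def thread_score(sequence, environments, match, gap=-1.0):
    """Global DP alignment of an amino acid sequence to a string of structural
    environments. match[(env, aa)] is the match-matrix score (missing pairs
    score 0). Returns the best total alignment score."""
    n, m = len(environments), len(sequence)
    # dp[i][j] = best score aligning the first i environments
    # with the first j residues.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match.get((environments[i - 1], sequence[j - 1]), 0.0)
            dp[i][j] = max(dp[i - 1][j - 1] + pair,  # residue j sits in environment i
                           dp[i - 1][j] + gap,       # environment i left unfilled
                           dp[i][j - 1] + gap)       # residue j not placed
    return dp[n][m]

# Hypothetical match matrix: hydrophobic residues prefer buried positions,
# charged residues prefer exposed positions.
toy_match = {
    ("B", "V"): 2.0, ("B", "I"): 2.0, ("B", "K"): -2.0, ("B", "D"): -2.0,
    ("E", "K"): 1.0, ("E", "D"): 1.0, ("E", "V"): -1.0, ("E", "I"): -1.0,
}
envs = ["E", "B", "B", "E"]
good = thread_score("KVID", envs, toy_match)  # charged outside, hydrophobic inside
bad = thread_score("VKDI", envs, toy_match)   # the reverse arrangement
print(good, bad)
```

A real match matrix would have one row per environment class (18 in the scheme cited above) and scores derived from observed frequencies; recovering the alignment itself would additionally need a traceback, which is omitted here.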
Advantages: DP Threading
1. Environmental proclivities may be more accurate than simple amino acid similarity:
   - structural information
   - local context
   - potentially, many other features
2. Fast.
3. Pretty good performance (at Asilomar even).

Net Result: Sample alignment
B1 E2α B2α B2α E2α B2β P2β Eα Eβ Eα ..
His Asp Val Ile Lys Ile Tyr Ser ..

Disadvantages: DP Threading
- Requires previous examples to work.
- The resulting match usually needs refinement.
- May share some problems of DP in general (independence assumption from column to column, gap penalty choice, etc.).

Disadvantages: DP Threading
- Assumes average amino acid preferences over all similar protein-family environments.
- Doesn't compute the actual environment created by mounting the sequence on the structure.
- Assumes that the environment is relatively constant, and that only amino acid details change. But there could be different types of interactions.

Contact Potential Threading
IDEA: Instead of modeling energies from first physical principles, simplify the problem by positioning only amino acids, and compute empirical energies from the observed associations of amino acids.
GLU is attracted to LYS = E(glu, lys)

Contact potential threading
Create energy terms between amino acids:
E(interaction) = -KT ln[frequency of interaction]
where K is a constant, T is temperature (constant), and the frequency of interaction is measured in a database of known structures.
More frequent -> more favorable.
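The relation E(interaction) = -KT ln[frequency of interaction] can be tried out on a handful of numbers. The sketch below is an illustration only: the contact counts are invented, KT is set to 1 in arbitrary units, and the function name is hypothetical. Pairs with a zero count are skipped, since the logarithm is undefined there.

```python
import math

def contact_energies(contact_counts, kt=1.0):
    """Convert observed contact counts into empirical energies using
    E = -kT * ln(frequency), where frequency = count / total count."""
    total = sum(contact_counts.values())
    energies = {}
    for pair, count in contact_counts.items():
        if count > 0:  # ln(0) is undefined, so unseen pairs get no energy
            energies[pair] = -kt * math.log(count / total)
    return energies

# Hypothetical counts of residue-residue contacts in a structure database.
counts = {("GLU", "LYS"): 60, ("LYS", "LYS"): 30, ("GLU", "GLU"): 10}
energies = contact_energies(counts)
for pair, energy in sorted(energies.items(), key=lambda item: item[1]):
    print(pair, round(energy, 3))
```

With these made-up counts the GLU-LYS pair is the most frequent and therefore gets the lowest (most favorable) energy, matching the rule that more frequent means more favorable.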
Contact potential (After Sippl et al.)
More specifically:
a = amino acid type a (ALA, VAL, etc.)
b = amino acid type b
s = separation in sequence
$\Delta E_{ab}^{s}(r) = E_{ab}^{s}(r) - E^{s}(r)$
The energy of interaction between a and b, minus the average energy at that separation, equals the energy difference that contributes to stability.

Contact Potential
$\Delta E_{ab}^{s}(r) = -KT \ln \left[ f_{ab}^{s}(r) / f^{s}(r) \right]$
For any given sequence in 3D, compute the distances between all pairs of amino acids (usually up to r = 10-15 Å), and sum:
$E_{tot} = \sum_{\text{all } a,b \text{ pairs}} \Delta E_{ab}^{s}(r)$

Using contact potential
1. Given a 3D structure, need to mount the sequence on the structure:
   - simple dynamic programming (misses the point)
   - other dynamic programming (better)
   - exhaustive enumeration (too expensive); a recent paper shows that this is NP-hard
   - heuristic enumeration
   - limit on gap lengths, loop lengths (heuristic)
2. Evaluate the contact potential for the alignment.
3. {Optional} Locally optimize the potential score.
4. Compare the potential with random shuffles of the sequence, and with other sequences, to approximate a z-score.

Z-score: number of standard deviations away from the mean. Most meaningful for normal distributions.
[Figure: score distribution with the mean and 2 SD marked]
[Figure: Sample threading]

Other uses of contact potentials
- Fold recognition (as discussed here)
- Incorrect fold recognition: detect unlikely or wrong structures, bad predictions, bad contacts, etc.
- Measure protein stability
- Use for ab initio prediction
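The z-score comparison against random shuffles of the sequence, described in the steps above, can be sketched in a few lines. This is an illustration under stated assumptions, not a real threading potential: the scoring function is a made-up stand-in that rewards hydrophobic residues at two "buried" positions, and the function names, shuffle count and seed are hypothetical.

```python
import random
import statistics

def shuffle_z_score(sequence, score_fn, n_shuffles=200, seed=0):
    """Z-score of a sequence's score relative to scores of shuffled versions:
    (score - mean of shuffled scores) / standard deviation of shuffled scores."""
    rng = random.Random(seed)
    native = score_fn(sequence)
    residues = list(sequence)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        shuffled_scores.append(score_fn("".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero spread; z-score undefined")
    return (native - mean) / sd

BURIED_POSITIONS = (1, 2)  # hypothetical buried sites in a toy structure

def toy_energy(sequence):
    # Made-up stand-in for a contact potential: -1 for a hydrophobic residue
    # at a buried position, +1 for any other residue there.
    energy = 0.0
    for pos in BURIED_POSITIONS:
        energy += -1.0 if sequence[pos] in "VIL" else 1.0
    return energy

z = shuffle_z_score("KVIDKDKD", toy_energy)
print(round(z, 2))
```

Because lower energy is better here, a strongly negative z-score means the sequence fits the structure much better than its shuffled versions do.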
Conclusions
1. Protein fold recognition will get asymptotically better as we get more folds.
2. The best ab initio methods use knowledge of the database, and will thus also improve.
3. Estimates are that we now have between 30% and 50% of the folds that occur.
4. Given a fold, we need to improve refinement with homology modeling techniques.

Other information
1. points to CASP results and targets.
2. Special journal issues devoted to CASP:
   - CASP1: Proteins 23(3), 1995
   - CASP2: Proteins Supplement 1, 1997
   - CASP3: Nature Structural Biology, Vol 6, No. 2, Feb 1999, page 108
   - CASP4: Proteins Vol 45 (S5), 2001
Docking Benzamidine Docking to Trypsin Relationship to Drug Design Ligand-based design QSAR Pharmacophore modeling Can be done without 3-D structure of protein Receptor/Structure-based design Molecular
More informationSupplementary Figure 3 a. Structural comparison between the two determined structures for the IL 23:MA12 complex. The overall RMSD between the two
Supplementary Figure 1. Biopanningg and clone enrichment of Alphabody binders against human IL 23. Positive clones in i phage ELISA with optical density (OD) 3 times higher than background are shown for
More informationProtein Structure Prediction
Protein Structure Prediction Michael Feig MMTSB/CTBP 2009 Summer Workshop From Sequence to Structure SEALGDTIVKNA Folding with All-Atom Models AAQAAAAQAAAAQAA All-atom MD in general not succesful for real
More informationProtein Folding Prof. Eugene Shakhnovich
Protein Folding Eugene Shakhnovich Department of Chemistry and Chemical Biology Harvard University 1 Proteins are folded on various scales As of now we know hundreds of thousands of sequences (Swissprot)
More informationLecture 2-3: Review of forces (ctd.) and elementary statistical mechanics. Contributions to protein stability
Lecture 2-3: Review of forces (ctd.) and elementary statistical mechanics. Contributions to protein stability Part I. Review of forces Covalent bonds Non-covalent Interactions Van der Waals Interactions
More informationComputational Protein Design
11 Computational Protein Design This chapter introduces the automated protein design and experimental validation of a novel designed sequence, as described in Dahiyat and Mayo [1]. 11.1 Introduction Given
More informationSequence Alignments. Dynamic programming approaches, scoring, and significance. Lucy Skrabanek ICB, WMC January 31, 2013
Sequence Alignments Dynamic programming approaches, scoring, and significance Lucy Skrabanek ICB, WMC January 31, 213 Sequence alignment Compare two (or more) sequences to: Find regions of conservation
More informationConformational Geometry of Peptides and Proteins:
Conformational Geometry of Peptides and Proteins: Before discussing secondary structure, it is important to appreciate the conformational plasticity of proteins. Each residue in a polypeptide has three
More informationNMR, X-ray Diffraction, Protein Structure, and RasMol
NMR, X-ray Diffraction, Protein Structure, and RasMol Introduction So far we have been mostly concerned with the proteins themselves. The techniques (NMR or X-ray diffraction) used to determine a structure
More informationSUPPLEMENTARY INFORMATION
Supplementary Results DNA binding property of the SRA domain was examined by an electrophoresis mobility shift assay (EMSA) using synthesized 12-bp oligonucleotide duplexes containing unmodified, hemi-methylated,
More information7.91 Amy Keating. Solving structures using X-ray crystallography & NMR spectroscopy
7.91 Amy Keating Solving structures using X-ray crystallography & NMR spectroscopy How are X-ray crystal structures determined? 1. Grow crystals - structure determination by X-ray crystallography relies
More informationTemplate-Based 3D Structure Prediction
Template-Based 3D Structure Prediction Sequence and Structure-based Template Detection and Alignment Issues The rate of new sequences is growing exponentially relative to the rate of protein structures
More informationProtein Structure Prediction and Display
Protein Structure Prediction and Display Goal Take primary structure (sequence) and, using rules derived from known structures, predict the secondary structure that is most likely to be adopted by each
More informationProtein Structure Prediction and Protein-Ligand Docking
Protein Structure Prediction and Protein-Ligand Docking Björn Wallner bjornw@ifm.liu.se Jan. 24, 2014 Todays topics Protein Folding Intro Protein structure prediction How can we predict the structure of
More informationProtein Modeling. Generating, Evaluating and Refining Protein Homology Models
Protein Modeling Generating, Evaluating and Refining Protein Homology Models Troy Wymore and Kristen Messinger Biomedical Initiatives Group Pittsburgh Supercomputing Center Homology Modeling of Proteins
More informationProtein Threading. BMI/CS 776 Colin Dewey Spring 2015
Protein Threading BMI/CS 776 www.biostat.wisc.edu/bmi776/ Colin Dewey cdewey@biostat.wisc.edu Spring 2015 Goals for Lecture the key concepts to understand are the following the threading prediction task
More informationProtein Structure. W. M. Grogan, Ph.D. OBJECTIVES
Protein Structure W. M. Grogan, Ph.D. OBJECTIVES 1. Describe the structure and characteristic properties of typical proteins. 2. List and describe the four levels of structure found in proteins. 3. Relate
More informationPresenter: She Zhang
Presenter: She Zhang Introduction Dr. David Baker Introduction Why design proteins de novo? It is not clear how non-covalent interactions favor one specific native structure over many other non-native
More informationChemical Shift Restraints Tools and Methods. Andrea Cavalli
Chemical Shift Restraints Tools and Methods Andrea Cavalli Overview Methods Overview Methods Details Overview Methods Details Results/Discussion Overview Methods Methods Cheshire base solid-state Methods
More informationSCOP. all-β class. all-α class, 3 different folds. T4 endonuclease V. 4-helical cytokines. Globin-like
SCOP all-β class 4-helical cytokines T4 endonuclease V all-α class, 3 different folds Globin-like TIM-barrel fold α/β class Profilin-like fold α+β class http://scop.mrc-lmb.cam.ac.uk/scop CATH Class, Architecture,
More informationProtein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror
Protein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror Please interrupt if you have questions, and especially if you re confused! Assignment
More informationThe protein folding problem consists of two parts:
Energetics and kinetics of protein folding The protein folding problem consists of two parts: 1)Creating a stable, well-defined structure that is significantly more stable than all other possible structures.
More informationBiochemistry Prof. S. DasGupta Department of Chemistry Indian Institute of Technology Kharagpur. Lecture - 06 Protein Structure IV
Biochemistry Prof. S. DasGupta Department of Chemistry Indian Institute of Technology Kharagpur Lecture - 06 Protein Structure IV We complete our discussion on Protein Structures today. And just to recap
More informationHomework 9: Protein Folding & Simulated Annealing : Programming for Scientists Due: Thursday, April 14, 2016 at 11:59 PM
Homework 9: Protein Folding & Simulated Annealing 02-201: Programming for Scientists Due: Thursday, April 14, 2016 at 11:59 PM 1. Set up We re back to Go for this assignment. 1. Inside of your src directory,
More informationDATE A DAtabase of TIM Barrel Enzymes
DATE A DAtabase of TIM Barrel Enzymes 2 2.1 Introduction.. 2.2 Objective and salient features of the database 2.2.1 Choice of the dataset.. 2.3 Statistical information on the database.. 2.4 Features....
More information2 Dean C. Adams and Gavin J. P. Naylor the best three-dimensional ordination of the structure space is found through an eigen-decomposition (correspon
A Comparison of Methods for Assessing the Structural Similarity of Proteins Dean C. Adams and Gavin J. P. Naylor? Dept. Zoology and Genetics, Iowa State University, Ames, IA 50011, U.S.A. 1 Introduction
More informationCOMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University
COMP 598 Advanced Computational Biology Methods & Research Introduction Jérôme Waldispühl School of Computer Science McGill University General informations (1) Office hours: by appointment Office: TR3018
More informationProtein structure alignments
Protein structure alignments Proteins that fold in the same way, i.e. have the same fold are often homologs. Structure evolves slower than sequence Sequence is less conserved than structure If BLAST gives
More informationSection Week 3. Junaid Malek, M.D.
Section Week 3 Junaid Malek, M.D. Biological Polymers DA 4 monomers (building blocks), limited structure (double-helix) RA 4 monomers, greater flexibility, multiple structures Proteins 20 Amino Acids,
More informationBioinformatics 2 -- lecture 6
Bioinformatics 2 -- lecture 6 Loop modeling Energy minimization Steps in homology modeling Identify a sequence of interest. Search database for homologs of known structure. Align homologs with each other
More informationSequence analysis and comparison
The aim with sequence identification: Sequence analysis and comparison Marjolein Thunnissen Lund September 2012 Is there any known protein sequence that is homologous to mine? Are there any other species
More informationLarge-Scale Genomic Surveys
Bioinformatics Subtopics Fold Recognition Secondary Structure Prediction Docking & Drug Design Protein Geometry Protein Flexibility Homology Modeling Sequence Alignment Structure Classification Gene Prediction
More informationSteps in protein modelling. Structure prediction, fold recognition and homology modelling. Basic principles of protein structure
Structure prediction, fold recognition and homology modelling Marjolein Thunnissen Lund September 2012 Steps in protein modelling 3-D structure known Comparative Modelling Sequence of interest Similarity
More informationA profile-based protein sequence alignment algorithm for a domain clustering database
A profile-based protein sequence alignment algorithm for a domain clustering database Lin Xu,2 Fa Zhang and Zhiyong Liu 3, Key Laboratory of Computer System and architecture, the Institute of Computing
More informationProtein Folding by Robotics
Protein Folding by Robotics 1 TBI Winterseminar 2006 February 21, 2006 Protein Folding by Robotics 1 TBI Winterseminar 2006 February 21, 2006 Protein Folding by Robotics Probabilistic Roadmap Planning
More informationFrom Amino Acids to Proteins - in 4 Easy Steps
From Amino Acids to Proteins - in 4 Easy Steps Although protein structure appears to be overwhelmingly complex, you can provide your students with a basic understanding of how proteins fold by focusing
More informationTools for Cryo-EM Map Fitting. Paul Emsley MRC Laboratory of Molecular Biology
Tools for Cryo-EM Map Fitting Paul Emsley MRC Laboratory of Molecular Biology April 2017 Cryo-EM model-building typically need to move more atoms that one does for crystallography the maps are lower resolution
More informationProtein Science (1997), 6: Cambridge University Press. Printed in the USA. Copyright 1997 The Protein Society
1 of 5 1/30/00 8:08 PM Protein Science (1997), 6: 246-248. Cambridge University Press. Printed in the USA. Copyright 1997 The Protein Society FOR THE RECORD LPFC: An Internet library of protein family
More informationComputational Molecular Modeling
Computational Molecular Modeling Lecture 1: Structure Models, Properties Chandrajit Bajaj Today s Outline Intro to atoms, bonds, structure, biomolecules, Geometry of Proteins, Nucleic Acids, Ribosomes,
More information