Bioinformatics: Secondary Structure Prediction


1 Bioinformatics: Secondary Structure Prediction Prof. David Jones

2 LMLSTQNPALLKRNIIYWNNVALLWEAGSD The greatest unsolved problem in molecular biology: the Protein Folding Problem?

3 Why predict structure? [Figure: growth of sequence and structure databases, plotted as entries per year, with series labelled Metagenomics, Sequences and Structures]

4 Biological Information Flow Gene (DNA) Protein 3D structure Function The unique 3D structure of a protein determines its biochemical function.

5 Protein Secondary Structure HELIX STRAND COIL

6 Secondary Structure Prediction LMLSTQNPALLKRNIIYWNNVALLWEAGSD -> ?
INPUT (20-letter Amino Acid Alphabet): LMLSTQNPALL
OUTPUT (3-letter Secondary Structure Alphabet): HHHEEEECCCC

7 Goal Take primary structure (sequence) and, using rules derived from known structures, predict the secondary structure that is most likely to be adopted by each residue

8 1st Generation Methods Based on statistical analysis of single amino acid properties Examples: Chou & Fasman (1974) Lim (1974) Garnier, Osguthorpe & Robson (1978)

9 Structural Propensities Due to the size, shape and charge of its side chain, each amino acid may fit better in one type of secondary structure than another. Classic example: the rigidity and side chain angle of proline cannot be accommodated in an α-helical structure.

10 Structural Propensities Two ways to view the significance of this preference (or propensity) It may control or affect the folding of the protein in its immediate vicinity (amino acid determines structure) It may constitute selective pressure to use particular amino acids in regions that must have a particular structure (structure determines amino acid)

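11 Chou-Fasman method Uses a table of conformational parameters (propensities) determined primarily from measurements of secondary structure by CD spectroscopy. The table consists of one likelihood for each structure for each amino acid. For amino acid type A (e.g. leucine) and structure type S (e.g. α helix), a propensity score is calculated as follows:

$$P_S(A) = \frac{p(A \mid S)}{p(A)} = \frac{n_{A,S} / n_S}{n_A / n}$$

where n_{A,S} is the number of residues of type A found in structure S, n_A the number of residues of type A, n_S the number of residues in structure S, and n the total number of residues.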
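The propensity formula can be sketched as a small function. This is only an illustration: the function name and the counts below are invented, and no real data set is implied.

```python
def propensity(n_as, n_a, n_s, n_total):
    """Propensity P_S(A) = p(A|S) / p(A), computed from raw counts."""
    p_a_given_s = n_as / n_s   # fraction of residues in structure S that are type A
    p_a = n_a / n_total        # overall fraction of residues that are type A
    return p_a_given_s / p_a


# Toy counts (invented): 1000 residues in total, 300 of them helical,
# 100 leucines, 45 of which sit in helices.
leu_helix = propensity(n_as=45, n_a=100, n_s=300, n_total=1000)
print(leu_helix)
```

A value above 1 means the amino acid is found in that structure more often than its overall frequency would suggest; a value below 1 means it is under-represented.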

12 Chou-Fasman propensities (partial table) Columns: Amino Acid, P(α), P(β), P(t). Rows: Glu, Met, Ala, Val, Ile, Tyr, Pro, Gly. [Numeric values not preserved]

13 Chou-Fasman method A prediction is made for each type of structure for each amino acid Can result in ambiguity if a region has high propensities for both helix and sheet (higher value usually chosen, with exceptions)

14 Chou-Fasman method Calculation rules are somewhat ad hoc. Example: method for helix
- Search for a nucleating region where 4 out of 6 a.a. have P(α) > 1.03
- Extend until 4 consecutive a.a. have an average P(α) < 1.00
- If the region is at least 6 a.a. long, has an average P(α) > 1.03, and average P(α) > average P(β), consider the region to be helix
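The helix rule can be sketched roughly as follows. This is one possible reading of the rule, not the original authors' program: extension in both directions, the half-open spans, and all propensity values in the toy tables are assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def find_helices(seq, p_alpha, p_beta, hi=1.03, lo=1.00, min_len=6):
    """Return half-open (start, end) spans predicted as helix."""
    pa = [p_alpha[aa] for aa in seq]
    pb = [p_beta[aa] for aa in seq]
    spans, i, n = [], 0, len(seq)
    while i + 6 <= n:
        # Nucleation: at least 4 of 6 residues with P(alpha) > 1.03.
        if sum(p > hi for p in pa[i:i + 6]) < 4:
            i += 1
            continue
        start, end = i, i + 6
        # Extend right until the 4 most recent residues average below 1.00.
        while end < n and mean(pa[end - 3:end + 1]) >= lo:
            end += 1
        # Extend left in the same way (assumed; the slide does not say).
        while start > 0 and mean(pa[start - 1:start + 3]) >= lo:
            start -= 1
        # Acceptance test on the whole region.
        region_a, region_b = mean(pa[start:end]), mean(pb[start:end])
        if end - start >= min_len and region_a > hi and region_a > region_b:
            spans.append((start, end))
        i = end
    return spans


# Toy propensity tables (invented values, for illustration only).
TOY_ALPHA = {"A": 1.4, "E": 1.5, "G": 0.6, "P": 0.6, "V": 1.0}
TOY_BETA = {"A": 0.8, "E": 0.4, "G": 0.8, "P": 0.6, "V": 1.7}
print(find_helices("GPGAEEAAEAGPG", TOY_ALPHA, TOY_BETA))
```

Note that the extension step can pull a few weak flanking residues into the helix, because only the four-residue average is tested.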

15 2nd Generation Methods Based on peptide segments / residue pairs Examples: GOR III (1987) The BIG NEWS, however, was the appearance of the first examples of MACHINE LEARNING in secondary structure prediction Neural Networks: Qian & Sejnowski (1988), Bohr et al. (1988), Holley & Karplus (1989)

16 Neural Networks Originally, neural networks were developed as simple models of brain function i.e. they were intended to be simulations of real networks of neurons. Hence the term Artificial Neural Network. Today these simple models are obsolete in neuroscience research but instead have become very useful tools for finding patterns in data.

17 Artificial Neural Networks [Figure: network diagram from Inputs to Output] An artificial neural network (ANN) is made up of many switching units (artificial neurons) that are connected together according to a specific network architecture. The objective of an artificial neural network is to learn how to transform inputs into meaningful outputs.

18 Training and Using ANNs
- We start with completely random connection weights
- Then we present an input pattern to the network and compare the output we get to what it should be
- If the output is correct then we don't need to change any weights
- If the output is WRONG then we make small changes to the connection weights so as to reduce the ERROR
- We then repeat this for all the examples we have
- And keep repeating on all the data until we see no further improvement
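The training loop above can be sketched with a single threshold unit. This is a deliberate simplification: real secondary structure networks have hidden layers and are trained by gradient methods, whereas this sketch uses the classic perceptron update. The function names, learning rate and toy task (logical OR) are all assumptions.

```python
import random


def predict(weights, x):
    """Single threshold unit; the last weight acts as a bias."""
    total = sum(w * v for w, v in zip(weights, x + [1.0]))
    return 1 if total > 0 else 0


def train(examples, n_inputs, rate=0.1, max_epochs=100, seed=0):
    rng = random.Random(seed)
    # Start from random connection weights.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):
        errors = 0
        for x, target in examples:
            out = predict(weights, x)
            if out != target:
                # Wrong output: nudge each weight to reduce the error.
                errors += 1
                for j, v in enumerate(x + [1.0]):
                    weights[j] += rate * (target - out) * v
        if errors == 0:
            break  # no further improvement possible on this data
    return weights


# Toy task (invented): learn logical OR of two inputs.
OR_EXAMPLES = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
or_weights = train(OR_EXAMPLES, n_inputs=2)
print([predict(or_weights, x) for x, _ in OR_EXAMPLES])
```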

19 Training and Testing Sets The data we use to adjust the network weights is called the TRAINING SET. To make sure we are not over-fitting our network to our training set, we should test the network on a completely separate TESTING SET. This splitting of training and testing data is called CROSS-VALIDATION and is an important concept in statistics and machine learning.

20 Some real data... [Table: each row gives one input value per amino acid column (A R N D C Q E G H I L K M F P S T W Y V) together with its desired output state, e.g. Coil, Strand, Coil, Strand, Strand, Helix, Helix, Coil, Strand, Helix, ... and so on]

21 A Better Scheme for Predicting Secondary Structure by Machine Learning Window of 15 residues Classifier (neural network) Helix Strand Coil MLSPQAMSDFHEELKWLLCNIPGQKLASLANREYT We are predicting the secondary Structure for this central residue

22 Representation Usual representation is to use 21 inputs per amino acid, with a separate input pattern for each residue type (Ala, Val, ...). [Input patterns not preserved]
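The window and the 21-inputs-per-residue representation can be sketched together. The amino acid order is taken from the column headings of the "real data" slide. The use of the 21st input to flag window positions that fall off either end of the chain is an assumption here, as are the function and constant names.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def encode_window(seq, centre, width=15):
    """One-hot encode the window of residues centred on position `centre`."""
    half = width // 2
    inputs = []
    for pos in range(centre - half, centre + half + 1):
        unit = [0] * 21
        if 0 <= pos < len(seq):
            unit[AMINO_ACIDS.index(seq[pos])] = 1
        else:
            unit[20] = 1  # assumed: 21st input marks "beyond the chain end"
        inputs.extend(unit)
    return inputs


seq = "MLSPQAMSDFHEELKWLLCNIPGQKLASLANREYT"
x = encode_window(seq, centre=0)
print(len(x), sum(x))
```

Sliding `centre` along the sequence yields one input vector per residue, each paired with the known state (H, E or C) of the central residue during training.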

23 3rd Generation Neural Network Methods
PHD: Rost, B., Sander, C. (1993) Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232.
PSIPRED: Jones, D. T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292.

24 3rd Generation Methods Exploit evolutionary information Based on conservation analysis of multiple sequence alignments (HMMs or profiles) Extract some long-range information via accessibility patterns Conserved Hydrophobic Residues -> BURIED Variable Polar Residues -> EXPOSED

25 Variability and Hydrophobicity of Amino Acids Water soluble: Variable & Polar; Conserved & Hydrophobic. Transmembrane: Variable & Very Hydrophobic; Conserved & Hydrophobic.

26 Pros & Cons of 3rd Generation Methods
PROs
- High residue accuracy
- Less underprediction of strands
- Good quality segment predictions
CONs
- Provides prediction for FAMILY CONSENSUS structure, NOT THE STRUCTURE OF THE TARGET SEQUENCE

27 PSIPRED
- Works directly on PSI-BLAST profiles (PSSMs)
- Uses 2 separate stages of neural networks
- First network predicts secondary structure
- Second network cleans outputs from 1st net
- Trained on profiles from ~3000 different proteins of known structure (taken from PDB)

28 PSIPRED Neural Networks
4 x 1st Networks: 15x21 input units, 75 hidden units, 3 output units (H/E/C)
2nd Network: 15x3 input units, 60 hidden units, 3 output units (H/E/C)

29 PSIPRED Method
- Raw profile from PSI-BLAST log file: position-based scoring matrix used, with columns A R N D C Q E G H I L K M F P S T W Y V
- Window of 15 rows x 20 scaled inputs fed to 1st network
- 1st Network: 315 inputs, 75 hidden units, 3 outputs
- Window of 15 x 3 outputs fed to 2nd network
- 2nd Network: 60 inputs, 60 hidden units, 3 outputs
- Final 3-state prediction
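The data flow through the two stages can be sketched with untrained, randomly weighted networks. This is not the PSIPRED code: the weights are random, the logistic activation is an assumption, and only one first-stage network is modelled. The slides give both "15x3" and "60 inputs" for the second network; the sketch assumes 4 inputs per window position (three first-stage outputs plus an off-end flag), which is one reading that yields 60.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights for n_out units, each with n_in inputs plus a bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def run_net(net, x):
    """Feed x through each layer of logistic units in turn."""
    for layer in net:
        x = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(unit, x)) + unit[-1])))
             for unit in layer]
    return x


def second_stage_inputs(stage1, centre, width=15):
    """Window of first-stage outputs; 4th value flags off-end positions (assumed)."""
    half = width // 2
    x = []
    for pos in range(centre - half, centre + half + 1):
        if 0 <= pos < len(stage1):
            x.extend(stage1[pos] + [0.0])
        else:
            x.extend([0.0, 0.0, 0.0, 1.0])
    return x


rng = random.Random(0)
net1 = [make_layer(315, 75, rng), make_layer(75, 3, rng)]  # 315 -> 75 -> 3
net2 = [make_layer(60, 60, rng), make_layer(60, 3, rng)]   # 60 -> 60 -> 3

# Stand-in for 20 encoded profile windows (random numbers, not a real profile).
windows = [[rng.random() for _ in range(315)] for _ in range(20)]
stage1 = [run_net(net1, w) for w in windows]
stage2 = [run_net(net2, second_stage_inputs(stage1, i)) for i in range(len(stage1))]
states = "".join("HEC"[max(range(3), key=out.__getitem__)] for out in stage2)
print(states)
```

With random weights the printed string is meaningless; the point is only the shapes: one 3-value output per residue from stage 1, re-windowed and passed to stage 2.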

30 2nd Network: Filtering Raw Predictions from 1st Network

Pos  Res  Stage 1  Stage 2  Actual
49   T    E        E        E
50   L    E        E        E
51   V    E        E        E
52   R    E        E        E
53   V    E        E        E
54   P    C        C        C
55   G    C        C        C
56   S    C        H        H
57   W    E        H        H
58   E    H        H        H
59   I    H        H        H
60   P    H        H        H
61   V    H        H        H
62   A    H        H        H

31 Common Measures of Secondary Structure Prediction Accuracy Q3 scores give the percentage of correctly predicted residues across 3 states (H, E, C). This is the most commonly used measure. Other scores such as the Matthews Correlation Coefficient (MCC) try to identify accuracy for individual states (Coil, Strand, Helix) and are more sensitive to over-prediction, e.g. if you predict all residues to be random coil you will get a Q3 score of around 50% just because around 50% of residues in proteins are in random coils. However, the MCC scores will be close to zero!

32 Matthews Correlation Coefficient
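For one state, with TP, TN, FP and FN the counts of true positives, true negatives, false positives and false negatives, the standard definition is MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below computes Q3 and the per-state MCC and applies them to the all-coil case. The toy sequence is invented, and returning 0 when the denominator vanishes is a common convention rather than part of the definition.

```python
import math


def q3(pred, actual):
    """Percentage of residues whose predicted state matches the actual state."""
    return 100.0 * sum(p == a for p, a in zip(pred, actual)) / len(actual)


def mcc(pred, actual, state):
    """Matthews correlation coefficient for one state (H, E or C)."""
    pairs = list(zip(pred, actual))
    tp = sum(p == state and a == state for p, a in pairs)
    tn = sum(p != state and a != state for p, a in pairs)
    fp = sum(p == state and a != state for p, a in pairs)
    fn = sum(p != state and a == state for p, a in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


actual = "HHHHEEECCCCCCC"          # toy example: half the residues are coil
all_coil = "C" * len(actual)       # trivial predictor: everything is coil
print(q3(all_coil, actual), mcc(all_coil, actual, "C"))
```

The trivial predictor scores well on Q3 purely because of the coil fraction, while its MCC carries no signal.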

33 PSIPRED Benchmark Results Mean Q3 score: 77.8% (80.6% now) [Figure: histogram of number of proteins vs. Q3 score (%)]

34 Comparison of Generations by Average Q3 Scores [Figure: bar chart of average Q3 by generation (GEN 1, GEN 2, GEN 3) for Chou & Fasman, GOR I, Qian & Sejnowski, PHD (1994) and PSIPRED, annotated "OBSOLETE!!", "First use of neural network" and "Current best method"]

35 Protein Fold Recognition

36 Sequence Comparison > 30% identity between two protein sequences implies probable common structure and possibly common function. However, there are many exceptions to this rule of thumb.

1PLC  IDVLLGADDGSLAFVPSEFSISPGEKIVFKNNAGFPHNIVFDEDSIPS-GVDASKISM
1NIN  ETYTVKLGSDKGLLVFEPAKLTIKPGDTVEFLNNKVPPHNVVFDAALNPAKSADLAK-SL

1PLC  SEEDLLNAKGETFEVAL---SNKGEYSFYCSPHQGAGMVGKVTVN-
1NIN  SHKQLLMSPGQSTSTTFPADAPAGEYTFYCEPHRGAGMVGKITVAG
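Percentage identity over an alignment can be computed as sketched below. Conventions differ on the denominator; this sketch counts only columns where neither sequence has a gap, which is an assumption. The two strings are the second alignment block from the slide.

```python
def percent_identity(a, b, gap="-"):
    """Identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


plc = "SEEDLLNAKGETFEVAL---SNKGEYSFYCSPHQGAGMVGKVTVN-"
nin = "SHKQLLMSPGQSTSTTFPADAPAGEYTFYCEPHRGAGMVGKITVAG"
print(round(percent_identity(plc, nin), 1))
```

Identity over a single well-conserved block will generally be higher than over the full-length alignment, so the 30% rule of thumb should be applied to the whole alignment.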

37 Similar Sequence, Similar 3-D Structure (RMSD = 2.1 Å, Seq. ID = 30%): Ribonuclease MC1 and Ribonuclease Rh

38 Structure similarity

39 Prediction Methods (listed in order of increasing difficulty and decreasing accuracy/reliability)
- Comparative modelling. Requires: known fold + clear homology
- Fold recognition. Requires: known fold
- Ab initio / new fold methods. Requires: only target sequence

40 Tertiary Structure Prediction BASIS: native fold is expected to be the conformation of lowest energy True for small molecules IMPLICATION: native fold can be found by defining a potential energy function and searching all conformations for the one with lowest energy

41 The Levinthal Paradox For a protein sequence of length l, the total number of possible chain conformations N is given by:

$$N \approx 10^{l}$$

>> Even if a protein were able to rearrange itself at the speed of light, it would take ~10^75 years to locate the global energy minimum for a 100 residue protein.
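The arithmetic behind the estimate can be sketched as follows. The slide does not state how many conformations per second the "speed of light" rearrangement corresponds to, so the sampling rate used here (3 × 10^17 per second) is purely an assumption, picked because it gives the same order of magnitude.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def search_time_years(n_residues, states_per_residue=10, rate_per_second=3e17):
    """Years needed to visit every conformation once at a fixed sampling rate."""
    n_conformations = float(states_per_residue) ** n_residues  # N ~ 10^l
    return n_conformations / rate_per_second / SECONDS_PER_YEAR


print(f"{search_time_years(100):.1e} years")
```

Changing the assumed rate by several orders of magnitude barely matters: the exponential growth of N with chain length dominates.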

42 Fold recognition: a short cut to predicting protein tertiary structure Although there is a vast number of possible protein structures, only a few have been observed in Nature. The chance of a newly solved structure having a previously unknown fold is only ~20%. >> We might be able to predict protein structure by selecting from already observed folds (a Multiple Choice version of the protein folding problem)

43 Protein Structure Prediction by Threading

44 An objective function for protein folding As yet there is no accurate physical model for protein folding. Physics-based force fields are not able to properly handle entropic solvent effects. We cannot rely on classical physics. Can we define a statistical model? Can we estimate Prob(Structure | Sequence)?

45 Original structure (fragment) [Figure: residues T, G, P, A, S, K, D, I, Q] Native threading: opposite charges stabilise the fold.

46 Threading alignment 1 [Figure: residues W, R, T, M, E, D, Y, S] Ouch! Equal sign charges repel!!

47 Threading alignment 2 [Figure: residues W, T, M, E, R, Y, S] Much better: opposite charges again!

48 A Scoring Function for Threading We want a way of assessing the energy of a model, i.e. a model based on an alignment of a sequence with a structure. Energy functions based on physics do not work. Let's use a KNOWLEDGE-BASED APPROACH: look at known structures and see how often particular features (e.g. contacts between amino acids) occur. In other words, we calculate probabilities.

49 Converting Probabilities to Potentials Probabilities are inconvenient for computer algorithms due to the required multiplications Additive quantities (e.g. energies) are easier to handle >> For many applications it is common to transform probabilities into energy-like quantities

50 The Inverse Boltzmann Principle The basic assumption in generating empirical potentials from probabilities is the so-called Inverse Boltzmann principle. According to the Boltzmann principle, the probability of occurrence of a given conformational state of energy E scales with the Boltzmann factor $e^{-E/RT}$, where R is the gas constant (1.987 x 10^-3 kcal mol^-1 K^-1) and T is the absolute temperature (e.g. room temperature).

51 Potentials of Mean Force
- Count interactions of a given type (e.g. alanine->serine beta-carbon to alpha-carbon) in real protein structures
- Count interactions of the same type in randomly generated protein structures or randomly selected sites (the reference state)
- The ratio of probabilities provides an estimate of the free energy change according to the inverse Boltzmann equation:

$$\Delta E = -RT \ln \frac{p(\text{interaction in real proteins})}{p(\text{interaction in decoy structures})}$$

[Figure: energy axis E, from low energy state to high energy state]
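The conversion from counts to an energy-like score can be sketched as below. The counts are invented, and 298 K is assumed for "room temperature"; the gas constant is the value quoted on the previous slide.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def potential(n_real, total_real, n_decoy, total_decoy, temperature=298.0):
    """Inverse Boltzmann estimate: dE = -RT ln(p_real / p_decoy)."""
    p_real = n_real / total_real
    p_decoy = n_decoy / total_decoy
    return -R * temperature * math.log(p_real / p_decoy)


# Toy counts (invented): the interaction is twice as common in real
# structures as in the reference state, so its energy is favourable.
print(round(potential(200, 1000, 100, 1000), 3))
```

Interactions seen more often in real proteins than in the reference state get negative (favourable) energies, rarer ones get positive energies, and the scores for independent features can simply be added.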

52 Potentials of Mean Force [Figure: residues A and S at sequence separation k = 4 and spatial distance s = 14 Å] How common is this configuration in real proteins? What about in randomly folded protein chains?

53 Fold Recognition Potentials SR terms (n <11) MR terms (10 < n < 30) LR terms (n >= 30) Solvation terms (Rel. Acc.)

54 Short-range C Pair Potential (Separation 4) [Figure: potential vs. distance (Å)]

55 We can estimate the stability of a given protein fold by summing potentials of mean force for all residue pairs Loops are sometimes ignored Single residue (solvation) terms usually included Can scale terms according to protein size

56 Partially correct Immunoglobulin Heavy Chain Fab Fragment Model Coloured by Threading Potential [Figure: colour scale from HIGH to LOW]

57 Can statistical potentials find the correct fold amongst a large set of incorrect decoy structures?

58 Search Problem: Threading Searching for the optimum mapping of sequence to structure while optimizing the sum of pair interactions is NP-complete (proven by R. Lathrop in 1994), i.e. an exhaustive search is needed to guarantee an optimal solution. This search process is called THREADING, i.e. what is the best way of threading this sequence through this structure? In practice, due to the short range of pair potentials, heuristic solutions work fairly well:
- Exhaustive search (not practical)
- Dynamic programming
- Double dynamic programming
- Branch and bound
- Monte Carlo, simulated annealing, Gibbs sampling, genetic algorithm
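The dynamic programming option can be sketched under a strong simplification: if only single-residue terms are scored (how well each residue suits the structural position it lands on) and pair interactions are dropped, the optimal threading is found by ordinary global alignment. The buried/exposed environment labels, the toy score and the gap penalty below are all invented for illustration; this is not any published threading method.

```python
HYDROPHOBIC = set("AVILMFWC")


def toy_score(residue, environment):
    """-1 if a hydrophobic residue is buried ('B') or a polar one exposed ('E')."""
    return -1.0 if (residue in HYDROPHOBIC) == (environment == "B") else 1.0


def thread(seq, env, score=toy_score, gap=1.0):
    """Lowest-energy global alignment of seq onto structural positions env."""
    n, m = len(seq), len(env)
    # best[i][j]: lowest energy for aligning seq[:i] with env[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = min(best[i - 1][j - 1] + score(seq[i - 1], env[j - 1]),
                             best[i - 1][j] + gap,   # residue left unaligned
                             best[i][j - 1] + gap)   # position left empty
    # Trace back to recover which residue sits on which position.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score(seq[i - 1], env[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best[n][m], pairs[::-1]


energy, pairs = thread("LKVE", "BEBE")
print(energy, pairs)
```

Once pair terms are included, the score of one residue depends on where other residues are placed, which breaks the independence this recurrence relies on; that is where double dynamic programming and the stochastic methods in the list come in.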

59 Threading Methods in Practice Compared to comparative modelling, threading methods can produce models where there is no detectable homologue to be found in PDB The simplifying assumption that the backbone of the structure does not change when the sequence changes also results in poor recognition of folds (~30% reliability) and inaccurate models (i.e. inaccurate alignments) For distant homologues, however, there are weak clues in the sequences that can help

60 Learning to recognise protein folds Ideally we would like to define a single value which denotes the compatibility of a structure with a sequence, i.e. P(Structure | Seq). In practice this is not straightforward, as each feature is estimated differently and the features are not independent. >> We need to combine different features to decide whether or not a given fold recognition match is correct

61 GenTHREADER Neural Net Inputs: pair energy, solvation energy, alignment score, alignment length, Nres (struct), Nres (seq). Outputs: proteins related, proteins unrelated.

62 GenTHREADER4 Network Output Correlation with Structural Similarity [Figure: FSSP Z-score vs. network output]

63 Calibrating the scores In theory, the network outputs should be good estimates of posterior probabilities. In practice, they are not accurate estimates of p-values when benchmarked. However, we can empirically estimate p-values by fitting to a suitable distribution.

64 Estimating GenTHREADER p-values [Figure: histogram of network score with frequency and cumulative % (0% to 100%); fitted exponential in the network score x, with printed coefficients 26537 and 25.33]

65 p-value dependence on Pair Energy and Solvation Energy [Figure: p-value vs. pair energy, with curves for different solvation energies, one labelled Solv = 10]

66 p-value dependence on Pair Energy and Profile Alignment Score [Figure: p-value vs. profile alignment score, with curves for E = +250 and E = -250]

67 Comparison of Search Methods Run time to search SWISSPROT Release 37 (77977 sequences) for matches to Adenylate Kinase

Method                     Hits     Time taken
BLAST                      61       8 sec
PSIBLAST                   [lost]   [lost] min
GenTHREADER                [lost]   [lost] min
Perfect threading method   2000?    [lost] days

68 Error Rate/Coverage Plot for Sequence-based Profiles (i.e. HMMs) vs GenTHREADER (1% error rate thresholds marked) [Figure: error rate (%) vs. coverage (sensitivity) for Profile and GenTHREADER; one threshold is labelled 52%, the other label is not preserved]

69 An Example of Fold Recognition

70 An Example ORF HI0073 from H. influenzae, 114 a.a. long. Function UNKNOWN:
E64000 hypothetical protein HI Haemophilus influenzae (str e-46
A71149 hypothetical protein PH Pyrococcus horikoshii 101 7e-22
H70345 conserved hypothetical protein aq_507 - Aquifex aeolicus 99 2e-21
F72600 hypothetical protein APE Aeropyrum pernix (strain K1) 50 1e-06
C75046 hypothetical protein PAB Pyrococcus abyssi (strain e-04
C64354 hypothetical protein MJ Methanococcus jannaschii
D64375 hypothetical protein MJ Methanococcus jannaschii
H90346 conserved hypothetical protein [imported] - Sulfolobus so
S62544 hypothetical protein SPAC12G12.13c - fission yeast (Schiz
C90279 conserved hypothetical protein [imported] - Sulfolobus so
H64462 hypothetical protein MJ Methanococcus jannaschii
C69282 conserved hypothetical protein AF Archaeoglobus ful


75 Conclusions
- HI0073 is a probable nucleotidyl transferase (now confirmed)
- Consistent fold recognition results
- Match to 1FA0 Chain B
- Secondary structure in reasonable agreement
- Functionally Important Residues are CONSERVED


More information

Lecture 7. Protein Secondary Structure Prediction. Secondary Structure DSSP. Master Course DNA/Protein Structurefunction.

Lecture 7. Protein Secondary Structure Prediction. Secondary Structure DSSP. Master Course DNA/Protein Structurefunction. C N T R F O R N T G R A T V B O N F O R M A T C S V U Master Course DNA/Protein Structurefunction Analysis and Prediction Lecture 7 Protein Secondary Structure Prediction Protein primary structure 20 amino

More information

Getting To Know Your Protein

Getting To Know Your Protein Getting To Know Your Protein Comparative Protein Analysis: Part III. Protein Structure Prediction and Comparison Robert Latek, PhD Sr. Bioinformatics Scientist Whitehead Institute for Biomedical Research

More information

Protein Secondary Structure Prediction using Pattern Recognition Neural Network

Protein Secondary Structure Prediction using Pattern Recognition Neural Network Protein Secondary Structure Prediction using Pattern Recognition Neural Network P.V. Nageswara Rao 1 (nagesh@gitam.edu), T. Uma Devi 1, DSVGK Kaladhar 1, G.R. Sridhar 2, Allam Appa Rao 3 1 GITAM University,

More information

Protein structure alignments

Protein structure alignments Protein structure alignments Proteins that fold in the same way, i.e. have the same fold are often homologs. Structure evolves slower than sequence Sequence is less conserved than structure If BLAST gives

More information

Protein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki.

Protein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki. Protein Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet rickard.sandberg@ki.se sandberg.cmb.ki.se Outline Protein features motifs patterns profiles signals 2 Protein

More information

09/06/25. Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Non-uniform distribution of folds. Scheme of protein structure predicition

09/06/25. Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Non-uniform distribution of folds. Scheme of protein structure predicition Sequence identity Structural similarity Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Fold recognition Sommersemester 2009 Peter Güntert Structural similarity X Sequence identity Non-uniform

More information
