Protein Secondary Structure Prediction using Pattern Recognition Neural Network
P.V. Nageswara Rao (1) (nagesh@gitam.edu), T. Uma Devi (1), DSVGK Kaladhar (1), G.R. Sridhar (2), Allam Appa Rao (3)
(1) GITAM University, (2) Endocrine and Diabetes Centre, Visakhapatnam, (3) JNTUK, Kakinada, India

ABSTRACT

Proteins are key biological molecules with diverse functions. With newer technologies (genomics, proteomics) producing more data than can be annotated manually, in silico prediction of protein structure, and thereafter function, has been christened the Holy Grail of structural bioinformatics. Successful secondary structure prediction provides a starting point for direct tertiary structure modeling; in addition, it improves sequence analysis and sequence-structure binding for structure and function determination. Using machine learning and data mining, we developed a statistically based pattern recognition technique for predicting protein secondary structure from the component amino acid sequence. By applying this technique, a performance score of Q8 = 72.3% was achieved. This compares well with other established techniques, such as NN-I and GOR IV, which achieved Q3 scores of 64.05% and 63.19% respectively when predictions are made on single sequences alone.

Key words: Secondary Structure, Pattern Recognition, Neural Network.

1. INTRODUCTION

The prediction of protein structure from amino acid sequence has been a target of scientists since Anfinsen (1973) [1] showed that the information necessary for protein folding resides completely within the primary structure. The emergence of rapid methods of DNA sequencing and the translation of the genetic code into protein sequences has boosted the need for automated methods of interpreting these linear sequences in terms of three-dimensional structure [2].
Although advanced molecular biology laboratory techniques have reduced the time necessary to determine a protein structure by X-ray crystallography, a crystal structure determination may still require many months. NMR techniques have helped in determining protein structure, but NMR is also costly and time-consuming, requires large amounts of highly soluble protein, and is severely limited by protein size [2]. Current experimental methods therefore cannot meet present and future needs for protein structure determination.

2. RELATED WORKS

There are two main approaches to determining protein structure theoretically. The first is a molecular mechanics approach based on the assumption that a correctly folded protein occupies a minimum energy conformation, most likely a conformation near the global minimum of free energy. Potential energy is obtained by summing the terms due to bonded and non-bonded components estimated from force field parameters, and can then be minimized as a function of atomic coordinates in order to reach the nearest local minimum [3,4]. This approach is very sensitive to the conformation of the molecule at the beginning of the simulation. One way to address this problem is to use molecular dynamics to simulate the way the molecule would move away from that initial state. Newton's laws and Monte Carlo methods have been used to try to reach a global energy minimum. The molecular mechanics approach suffers from inaccurate force field parameters and a spectrum of multiple minima [2].

The second approach, predicting protein structures from sequence alone, is based on data sets of known protein structures and sequences. It attempts to find common features in these data sets that can be generalized to provide structural models of other proteins.
Many statistical methods use the different frequencies with which amino acid types occur in helices, strands, and loops to predict where these conformations are located in a sequence. The main idea is that a segment or motif of a target protein that has a sequence similar to a segment or motif with known structure is assumed to have the same structure.
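The frequency idea above can be illustrated with a small sketch. It is not taken from the paper: the function name `propensities` and the toy sequences are hypothetical, and the propensity is defined here, as an assumption, as P(state | amino acid) / P(state), which is one common way to turn raw frequencies into a preference score.

```python
from collections import Counter


def propensities(sequences, structures):
    """Return {(amino_acid, state): propensity} from aligned strings.

    Assumed definition: propensity = P(state | amino acid) / P(state).
    Values above 1 mean the residue type is over-represented in that state.
    """
    pair_counts = Counter()
    aa_counts = Counter()
    state_counts = Counter()
    total = 0
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings must align")
        for aa, state in zip(seq, ss):
            pair_counts[(aa, state)] += 1
            aa_counts[aa] += 1
            state_counts[state] += 1
            total += 1
    return {
        (aa, state): (n / aa_counts[aa]) / (state_counts[state] / total)
        for (aa, state), n in pair_counts.items()
    }


# Toy data (hypothetical): alanine is seen only in helix, glycine only in coil.
table = propensities(["AAGG"], ["HHCC"])
print(table[("A", "H")])
```

With real data the table would be estimated from many proteins of known structure, and pairs that never occur are simply absent from the returned dictionary.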
Protein secondary structure prediction means predicting the formation of regular local structures such as α helices, β strands, and coils. Solving the protein folding problem will pave the way to rapid progress in the fields of protein engineering and drug design. As the number of protein sequences is growing much faster than our ability to solve their structures experimentally in molecular biology laboratories, in silico prediction methods will narrow the gap between available sequences and structures.

Previous research showed that it is promising to derive general rules for predicting protein structure from existing data and then apply them to unknown structures. Several methods have utilized this approach [5]. Many statistically based methods use the different frequencies of amino acid types in sequences to predict their location in the secondary structure conformations: helices, strands, and coils. The basic idea is that a segment or motif of a target protein that has a sequence similar to a segment or motif with known structure is assumed to have the same structure. Unfortunately, for many proteins there is not enough homology to any protein sequence of known structure to allow application of this technique.

The GOR method, first proposed in [15], is named after its authors Garnier, Osguthorpe and Robson. It attempts to include information about a slightly longer segment of the polypeptide chain. Instead of considering the tendency of a single residue, position-dependent tendencies are calculated for all residue types. The prediction is therefore influenced not only by the actual residue at that position but also, to some extent, by neighbouring residues [16]. The propensity tables to some extent reflect the fact that positively charged residues are more often found at the C-terminal end of helices and negatively charged residues at the N-terminal end.

3. PROPOSED METHOD

The dssp database ( /gv/dssp/) is an archive of protein sequences with their secondary structures. Each file describes the primary structure of the protein and the secondary structure of each amino acid in a columnar fashion. A set of 625 non redundant proteins with more than 25% sequence similarity was extracted. A sniffer was written to extract the sequence and its secondary structure from each .dssp file. A sample .dssp file is presented in Fig. 1.

[Fig. 1 excerpt: DSSP file (CMBI version, April 1, 2000; reference W. Kabsch and C. Sander, Biopolymers 22 (1983)) for entry 3GUP, HYDROLASE, molecule LYSOZYME, organism ENTEROBACTERIA PHAGE T4, authors L.Liu, B.W.Matthews. The hydrogen-bond statistics, histograms and numeric columns (BP1, BP2, ACC, N-H-->O, O-->H-N, TCO, KAPPA, ALPHA, PHI, PSI, X-CA, Y-CA, Z-CA) did not survive extraction. The AA and STRUCTURE columns for chain A, residues 1 to 30, read as follows, with "-" standing for a blank structure field:]

    AA:        MNIFEMLRIDEGLRLKIYKDaEGYYTIGIG
    STRUCTURE: --HHHHHHHHH--EEEEEE-TTS-EEEETT

Fig. 1. The dssp file showing the primary structure and secondary structure of a protein (shown up to 30 residues only).

Methodology: To predict the secondary structure of a protein, a Pattern Recognition Neural Network is designed. The neural network is defined with one input layer, one hidden layer and one output layer. The protein sequence is presented through a sliding window of size W (varied from 15 to 29), and the prediction is made for the structural state of the central residue of the window. A protein segment of window size W is represented as a 20 x W matrix, so the input layer R consists of 20 x W input units, i.e., W groups of 20 inputs, one group for each window position. All the proteins used to train the neural network are encoded in this way and stored in a vector. Each target is represented as a boolean array of size 8, which encodes the secondary structural state of the amino acid at that position in the protein sequence. The secondary structural states defined according to dssp are H, I, G, E, B, T, S and C.
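The sniffer and the window encoding described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, the column offsets (amino acid at character 13, structure at character 16, counted from 0) follow the classic fixed-width DSSP layout, a blank structure field is mapped to C, lowercase letters (bridged cysteines) are mapped to C, and window positions that fall outside the sequence are encoded as all zeros.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (order is arbitrary)
STATES = "HIGEBTSC"                   # state order as listed in the text


def read_dssp(text):
    """Return (sequence, structure) strings from the text of a DSSP file."""
    seq, ss = [], []
    in_table = False
    for line in text.splitlines():
        if line.lstrip().startswith("#  RESIDUE"):
            in_table = True           # residue records start after this header
            continue
        if not in_table or len(line) < 17 or line[13] == "!":
            continue                  # skip header lines and chain breaks
        aa = line[13]
        seq.append("C" if aa.islower() else aa)   # assumption: lowercase = Cys
        state = line[16]
        ss.append(state if state in STATES else "C")  # assumption: blank = coil
    return "".join(seq), "".join(ss)


def encode_window(seq, center, w):
    """One-hot encode the window of odd size w centred on seq[center].

    Returns a flat list of 20 * w values: w groups of 20 inputs.
    """
    half = w // 2
    vec = []
    for pos in range(center - half, center + half + 1):
        group = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(seq) and seq[pos] in AMINO_ACIDS:
            group[AMINO_ACIDS.index(seq[pos])] = 1
        vec.extend(group)             # off-sequence positions stay all zero
    return vec


def encode_target(state):
    """Boolean array of size 8 with a single 1 at the position of `state`."""
    return [1 if s == state else 0 for s in STATES]
```

A training set would then be built by calling `encode_window(seq, i, W)` and `encode_target(ss[i])` for every residue index `i` of every protein. How the original work treated residues near the chain ends is not stated, so the zero padding here is only one possible choice.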
Following the order H, I, G, E, B, T, S, C, each state is represented by an array with a single 1: H is represented as 1 0 0 0 0 0 0 0, I as 0 1 0 0 0 0 0 0, and finally C as 0 0 0 0 0 0 0 1. The output layer of the neural network thus consists of eight units, one for each of the considered structural states (or classes). The target matrix is prepared in the same way. The size of the hidden layer is taken as 2 x W + 1. The pattern recognition network is trained with the Scaled Conjugate Gradient algorithm. At each training cycle, the training sequences are presented to the network through the sliding window defined above, one residue at a time. Each hidden unit transforms the signals received from the input layer with a log-sigmoid transfer function to produce an output signal that lies between 0 and 1 and is close to either 0 or 1. Weights are adjusted so that the error between the observed output of each unit and the desired output specified by the target matrix is minimized.

Overfitting, a common problem when training a neural network, is addressed by dividing the data into three subsets: (i) the training set, which is used for computing the gradient and updating the network weights and biases; (ii) the validation set, whose error is monitored during the training process because it tends to increase when the data is overfitted; and (iii) the test set (not seen earlier by the neural network), whose error can be used to assess the quality of the division of the data set. Training stops automatically when any one of several conditions is met, such as the number of epochs, the performance goal, or the validation errors.
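As a rough illustration of the architecture and the validation-based stopping rule just described, the sketch below builds a 20W : 2W+1 : 8 network with log-sigmoid units and shows one way a validation check could end training. Several things here are assumptions rather than details from the paper: the helper names, the random weight range, the use of a log-sigmoid on the output layer, and the `patience` value. The Scaled Conjugate Gradient weight update itself is not implemented.

```python
import math
import random


def logsig(x):
    """Log-sigmoid transfer function; maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(w, seed=0):
    """Layers sized 20*w -> 2*w+1 -> 8, as described in the text."""
    rng = random.Random(seed)
    n_in, n_hidden, n_out = 20 * w, 2 * w + 1, 8
    return [init_layer(n_in, n_hidden, rng), init_layer(n_hidden, n_out, rng)]


def forward(x, layers):
    """Propagate one encoded window through every layer."""
    for weights, biases in layers:
        x = [logsig(sum(wt * v for wt, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def early_stop_epoch(val_errors, patience=6):
    """Epoch index at which training stops: after `patience` consecutive
    epochs without a new best validation error (hypothetical rule), or at
    the last epoch if that never happens."""
    best, fails = float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, fails = err, 0
        else:
            fails += 1
            if fails >= patience:
                return epoch
    return len(val_errors) - 1


net = build_network(15)
scores = forward([0] * 300, net)   # eight values, one per structural state
```

The predicted state for a window would be the one whose output unit has the largest value. In practice the weights would come from training on the training subset, with the validation errors passed to a rule such as `early_stop_epoch`.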
P.V. Nageswara Rao et al. / International Journal of Engineering Science and Technology

4. RESULTS AND DISCUSSION

To analyze the network response, a confusion matrix is computed by comparing the outputs of the trained network with the expected results (targets), as shown in Fig. 2.

Fig. 2. Confusion matrix showing the performance of the classifier.

The diagonal cells show the number of residue positions that were correctly classified for each structural class. The off-diagonal cells show the number of residue positions that were misclassified (e.g. helical predicted as coil). The rightmost cell in the last row shows the total percentage of correctly predicted residues (upper number) and the total percentage of incorrectly predicted residues (lower number). By applying this technique, a performance score of Q8 = 72.3% is achieved. This compares well with state-of-the-art techniques, such as NN-I and GOR IV, which achieved Q3 scores of 64.05% and 63.19% respectively when predictions are made on single sequences alone. The Receiver Operating Characteristic (ROC) curve, a plot of the true positive rate (sensitivity) versus the false positive rate (1 - specificity), is also drawn and shown in Fig. 3.
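The quantities discussed above can be computed from per-residue predictions as in the following sketch. The function names and the toy strings are hypothetical, and the orientation (rows are target states, columns are predicted states) is a choice made here, not something the paper specifies. `roc_point` gives a single (false positive rate, true positive rate) pair for one class at a fixed decision; a full ROC curve would require sweeping a threshold over the network outputs.

```python
STATES = "HIGEBTSC"


def confusion_matrix(targets, predictions, states=STATES):
    """matrix[i][j] counts residues with target state i predicted as state j."""
    index = {s: i for i, s in enumerate(states)}
    matrix = [[0] * len(states) for _ in states]
    for t, p in zip(targets, predictions):
        matrix[index[t]][index[p]] += 1
    return matrix


def q_score(matrix):
    """Percentage of correctly predicted residues (Q8 for eight states)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total if total else 0.0


def roc_point(matrix, k):
    """(false positive rate, true positive rate) for class k, one-vs-rest."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = total - tp - fn - fp
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr


# Toy example (hypothetical data): one helix residue is mistaken for strand.
m = confusion_matrix("HHEC", "HEEC")
print(q_score(m), roc_point(m, STATES.index("H")))
```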
Fig. 3. ROC curve showing the performance of the classifier.

5. CONCLUSION

The prediction accuracy can be improved by: (i) increasing the number of training vectors, with an appropriate distribution of all the classes; (ii) increasing the window size or adding more relevant information, such as biochemical properties of the amino acids; and (iii) increasing the number of hidden layers and neurons.

ACKNOWLEDGEMENTS

The authors would like to thank Acharya Nagarjuna University and GITAM University for providing computational facilities and access to e-journals to carry out this research.

REFERENCES

1. Anfinsen, C.B. (1973). Principles That Govern The Folding Of Protein Chains. Science. 181.
2. Holbrook, S.R., Muskal, S.M. and Kim, S.-H. (1990). Predicting Protein Structural Features With Artificial Neural Networks. Artificial Intelligence and Molecular Biology.
3. Weiner, P.K. and Kollman, P.A. (1981). AMBER: Assisted Model Building With Energy Refinement. A General Program For Modeling Molecules and Their Interactions. Journal of Computational Chemistry. 2:287.
4. Weiner, S.J., Kollman, P.A., Case, D.A., Singh, U.C., Ghio, C., Alagona, G., Profeta, S. and Weiner, P.K. (1984). A New Force Field For Molecular Mechanical Simulation Of Nucleic Acids and Proteins. Journal of the American Chemical Society. 106.
5. Chou, P.Y. and Fasman, G.D. (1974). Prediction of Protein Conformation. Biochemistry. 13.
6. Garnier, J., Osguthorpe, D.J. and Robson, B. (1978). Analysis Of The Accuracy and Implications Of Simple Methods For Predicting The Secondary Structure Of Globular Proteins. Journal of Molecular Biology. 120.
7. Lim, V.I. (1974). Algorithms For the Prediction of Alpha-Helical and Beta-Structural Regions in Globular Proteins. Journal of Molecular Biology. 88.
8. Blundell, T., Sibanda, B.L. and Pearl, L. (1983). Three-Dimensional Structure, Specificity and Catalytic Mechanism Of Renin. Nature. 304.
9. Greer, J. (1981). Comparative Model-Building Of The Mammalian Serine Proteases. Journal of Molecular Biology. 153.
10. Warme, P.K., Momany, F.A., Rumball, S.V., Tuttle, R.W. and Scheraga, H.A. (1974). Computation Of Structures Of Homologous Proteins. Alpha-Lactalbumin From Lysozyme. Biochemistry. 13.
11. Richardson, J.S. (1981). The Anatomy and Taxonomy of Protein Structures. Advances in Protein Chemistry. 34.
12. Krigbaum, W.R. and Knutton, S.P. (1973). Prediction of The Amount Of Secondary Structure in A Globular Protein From Its Amino Acid Composition. Proceedings of the National Academy of Sciences USA. 70(10).
13. Qian, N. and Sejnowski, T.J. (1988). Predicting The Secondary Structure Of Globular Proteins Using Neural Network Models. Journal of Molecular Biology. 202(4).
14. Crick, F. (1989). The Recent Excitement About Neural Networks. Nature. 337.
15. Garnier, J. and Robson, B. (1989). The GOR Method For Predicting Secondary Structure in Proteins. Prediction Of Protein Structure and The Principles Of Protein Conformation. New York: Plenum Press.
16. Garnier, J. and Robson, B. (1989). The GOR Method For Predicting Secondary Structures in Proteins. Prediction of Protein Structure and The Principles of Protein Conformation. New York: Plenum Press.
More informationPrediction of protein secondary structure by mining structural fragment database
Polymer 46 (2005) 4314 4321 www.elsevier.com/locate/polymer Prediction of protein secondary structure by mining structural fragment database Haitao Cheng a, Taner Z. Sen a, Andrzej Kloczkowski a, Dimitris
More informationCOMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University
COMP 598 Advanced Computational Biology Methods & Research Introduction Jérôme Waldispühl School of Computer Science McGill University General informations (1) Office hours: by appointment Office: TR3018
More information1-D Predictions. Prediction of local features: Secondary structure & surface exposure
1-D Predictions Prediction of local features: Secondary structure & surface exposure 1 Learning Objectives After today s session you should be able to: Explain the meaning and usage of the following local
More informationGetting To Know Your Protein
Getting To Know Your Protein Comparative Protein Analysis: Part III. Protein Structure Prediction and Comparison Robert Latek, PhD Sr. Bioinformatics Scientist Whitehead Institute for Biomedical Research
More informationCMPS 3110: Bioinformatics. Tertiary Structure Prediction
CMPS 3110: Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the laws of physics! Conformation space is finite
More informationBayesian Models and Algorithms for Protein Beta-Sheet Prediction
0 Bayesian Models and Algorithms for Protein Beta-Sheet Prediction Zafer Aydin, Student Member, IEEE, Yucel Altunbasak, Senior Member, IEEE, and Hakan Erdogan, Member, IEEE Abstract Prediction of the three-dimensional
More informationBIOCHEMISTRY Course Outline (Fall, 2011)
BIOCHEMISTRY 402 - Course Outline (Fall, 2011) Number OVERVIEW OF LECTURE TOPICS: of Lectures INSTRUCTOR 1. Structural Components of Proteins G. Brayer (a) Amino Acids and the Polypeptide Chain Backbone...2
More informationSyllabus of BIOINF 528 (2017 Fall, Bioinformatics Program)
Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program) Course Name: Structural Bioinformatics Course Description: Instructor: This course introduces fundamental concepts and methods for structural
More informationArtifical Neural Networks
Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................
More informationBIRKBECK COLLEGE (University of London)
BIRKBECK COLLEGE (University of London) SCHOOL OF BIOLOGICAL SCIENCES M.Sc. EXAMINATION FOR INTERNAL STUDENTS ON: Postgraduate Certificate in Principles of Protein Structure MSc Structural Molecular Biology
More informationDATE A DAtabase of TIM Barrel Enzymes
DATE A DAtabase of TIM Barrel Enzymes 2 2.1 Introduction.. 2.2 Objective and salient features of the database 2.2.1 Choice of the dataset.. 2.3 Statistical information on the database.. 2.4 Features....
More informationPacking of Secondary Structures
7.88 Lecture Notes - 4 7.24/7.88J/5.48J The Protein Folding and Human Disease Professor Gossard Retrieving, Viewing Protein Structures from the Protein Data Base Helix helix packing Packing of Secondary
More informationAccelerating Biomolecular Nuclear Magnetic Resonance Assignment with A*
Accelerating Biomolecular Nuclear Magnetic Resonance Assignment with A* Joel Venzke, Paxten Johnson, Rachel Davis, John Emmons, Katherine Roth, David Mascharka, Leah Robison, Timothy Urness and Adina Kilpatrick
More informationMolecular Modelling. part of Bioinformatik von RNA- und Proteinstrukturen. Sonja Prohaska. Leipzig, SS Computational EvoDevo University Leipzig
part of Bioinformatik von RNA- und Proteinstrukturen Computational EvoDevo University Leipzig Leipzig, SS 2011 Protein Structure levels or organization Primary structure: sequence of amino acids (from
More informationBIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer Protein Structure Prediction I
BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer 2013 9. Protein Structure Prediction I Structure Prediction Overview Overview of problem variants Secondary structure prediction
More informationProtein Secondary Structure Assignment and Prediction
1 Protein Secondary Structure Assignment and Prediction Defining SS features - Dihedral angles, alpha helix, beta stand (Hydrogen bonds) Assigned manually by crystallographers or Automatic DSSP (Kabsch
More informationProtein quality assessment
Protein quality assessment Speaker: Renzhi Cao Advisor: Dr. Jianlin Cheng Major: Computer Science May 17 th, 2013 1 Outline Introduction Paper1 Paper2 Paper3 Discussion and research plan Acknowledgement
More informationLecture 7. Protein Secondary Structure Prediction. Secondary Structure DSSP. Master Course DNA/Protein Structurefunction.
C N T R F O R N T G R A T V B O N F O R M A T C S V U Master Course DNA/Protein Structurefunction Analysis and Prediction Lecture 7 Protein Secondary Structure Prediction Protein primary structure 20 amino
More informationCHAPTER 29 HW: AMINO ACIDS + PROTEINS
CAPTER 29 W: AMI ACIDS + PRTEIS For all problems, consult the table of 20 Amino Acids provided in lecture if an amino acid structure is needed; these will be given on exams. Use natural amino acids (L)
More informationProteins: Structure & Function. Ulf Leser
Proteins: Structure & Function Ulf Leser This Lecture Proteins Structure Function Databases Predicting Protein Secondary Structure Many figures from Zvelebil, M. and Baum, J. O. (2008). "Understanding
More informationFrom Amino Acids to Proteins - in 4 Easy Steps
From Amino Acids to Proteins - in 4 Easy Steps Although protein structure appears to be overwhelmingly complex, you can provide your students with a basic understanding of how proteins fold by focusing
More informationRNA and Protein Structure Prediction
RNA and Protein Structure Prediction Bioinformatics: Issues and Algorithms CSE 308-408 Spring 2007 Lecture 18-1- Outline Multi-Dimensional Nature of Life RNA Secondary Structure Prediction Protein Structure
More informationProtein Structure Prediction
Protein Structure Prediction Michael Feig MMTSB/CTBP 2006 Summer Workshop From Sequence to Structure SEALGDTIVKNA Ab initio Structure Prediction Protocol Amino Acid Sequence Conformational Sampling to
More informationProtein Secondary Structure Prediction using Logical Analysis of Data.
Protein Secondary Structure Prediction using Logical Analysis of Data. JACEK B A EWICZ 1, PETER L. HAMMER 2, PIOTR UKASIAK 1 1 Institute of Computing Sciences, Poznan University of Technology, ul. Piotrowo
More informationSCOP. all-β class. all-α class, 3 different folds. T4 endonuclease V. 4-helical cytokines. Globin-like
SCOP all-β class 4-helical cytokines T4 endonuclease V all-α class, 3 different folds Globin-like TIM-barrel fold α/β class Profilin-like fold α+β class http://scop.mrc-lmb.cam.ac.uk/scop CATH Class, Architecture,
More informationRanjit P. Bahadur Assistant Professor Department of Biotechnology Indian Institute of Technology Kharagpur, India. 1 st November, 2013
Hydration of protein-rna recognition sites Ranjit P. Bahadur Assistant Professor Department of Biotechnology Indian Institute of Technology Kharagpur, India 1 st November, 2013 Central Dogma of life DNA
More informationReconstructing Amino Acid Interaction Networks by an Ant Colony Approach
Author manuscript, published in "Journal of Computational Intelligence in Bioinformatics 2, 2 (2009) 131-146" Reconstructing Amino Acid Interaction Networks by an Ant Colony Approach Omar GACI and Stefan
More informationProtein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki.
Protein Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet rickard.sandberg@ki.se sandberg.cmb.ki.se Outline Protein features motifs patterns profiles signals 2 Protein
More informationAdvanced Certificate in Principles in Protein Structure. You will be given a start time with your exam instructions
BIRKBECK COLLEGE (University of London) Advanced Certificate in Principles in Protein Structure MSc Structural Molecular Biology Date: Thursday, 1st September 2011 Time: 3 hours You will be given a start
More informationUsing Knowledge-Based Neural Networks to Improve Algorithms: Refining the Chou-Fasman Algorithm for Protein Folding
Using Knowledge-Based Neural Networks to Improve Algorithms: Refining the Chou-Fasman Algorithm for Protein Folding Richard Maclin Jude W. Shavlik Computer Sciences Dept. University of Wisconsin 1210 W.
More information7 Protein secondary structure
78 Grundlagen der Bioinformatik, SS 1, D. Huson, June 17, 21 7 Protein secondary structure Sources for this chapter, which are all recommended reading: Introduction to Protein Structure, Branden & Tooze,
More informationIntroducing Hippy: A visualization tool for understanding the α-helix pair interface
Introducing Hippy: A visualization tool for understanding the α-helix pair interface Robert Fraser and Janice Glasgow School of Computing, Queen s University, Kingston ON, Canada, K7L3N6 {robert,janice}@cs.queensu.ca
More informationExamples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE
Examples of Protein Modeling Protein Modeling Visualization Examination of an experimental structure to gain insight about a research question Dynamics To examine the dynamics of protein structures To
More informationBSc and MSc Degree Examinations
Examination Candidate Number: Desk Number: BSc and MSc Degree Examinations 2018-9 Department : BIOLOGY Title of Exam: Molecular Biology and Biochemistry Part I Time Allowed: 1 hour and 30 minutes Marking
More informationCopyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years.
Structure Determination and Sequence Analysis The vast majority of the experimentally determined three-dimensional protein structures have been solved by one of two methods: X-ray diffraction and Nuclear
More informationProtein folding. α-helix. Lecture 21. An α-helix is a simple helix having on average 10 residues (3 turns of the helix)
Computat onal Biology Lecture 21 Protein folding The goal is to determine the three-dimensional structure of a protein based on its amino acid sequence Assumption: amino acid sequence completely and uniquely
More informationHOMOLOGY MODELING. The sequence alignment and template structure are then used to produce a structural model of the target.
HOMOLOGY MODELING Homology modeling, also known as comparative modeling of protein refers to constructing an atomic-resolution model of the "target" protein from its amino acid sequence and an experimental
More information8 Protein secondary structure
Grundlagen der Bioinformatik, SoSe 11, D. Huson, June 6, 211 13 8 Protein secondary structure Sources for this chapter, which are all recommended reading: Introduction to Protein Structure, Branden & Tooze,
More informationMolecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007
Molecular Modeling Prediction of Protein 3D Structure from Sequence Vimalkumar Velayudhan Jain Institute of Vocational and Advanced Studies May 21, 2007 Vimalkumar Velayudhan Molecular Modeling 1/23 Outline
More informationOutline. Levels of Protein Structure. Primary (1 ) Structure. Lecture 6:Protein Architecture II: Secondary Structure or From peptides to proteins
Lecture 6:Protein Architecture II: Secondary Structure or From peptides to proteins Margaret Daugherty Fall 2004 Outline Four levels of structure are used to describe proteins; Alpha helices and beta sheets
More information