DATE: A DAtabase of TIM Barrel Enzymes


2 DATE: A DAtabase of TIM Barrel Enzymes

2.1 Introduction
2.2 Objective and salient features of the database
2.2.1 Choice of the dataset
2.3 Statistical information on the database
2.4 Features
2.4.1 Structure and the Topology diagram of the protein
2.4.2 Other information
2.4.3 Conformational information
2.4.4 BLAST output for the particular entry
2.5 Availability, update and the structure of the website
2.6 References

2.1 Introduction

The eight-stranded alpha/beta barrel (TIM barrel) is by far the most common tertiary fold observed in high-resolution protein crystal structures. It is estimated that 10% of all known enzymes have this domain. The members of this large family of proteins catalyze very different reactions, and this functional diversity has made the family an attractive target for protein engineering. The evolutionary history of the family has also been the subject of considerable debate, with arguments made in favor of both convergent and divergent evolution. Because of the lack of sequence homology, the ancestry of this fold is still a mystery.

In this study, proteins found to have the alpha/beta structural motif were analyzed for their structural features: conformational preferences, functional significance, and topological features such as solvent accessibility, residue preference, salt bridges and sequence similarity. The database is a collection of sequence, structural, functional and conformational information about all enzymes that have the TIM barrel (alpha8/beta8) topology, in which the beta strands are parallel and occur in the order 1, 2, 3, 4, 5, 6, 7, 8. The TIM database currently contains 84 different enzymes.

A great deal of work has been done on TIM barrel proteins. Attempts have been made to classify them on the basis of evolutionary studies, with some interesting results: the backward evolution model is not favored, and modern proteins appear to have arisen from common ancestors that bound key metabolites. Copley and Bork (2000) studied this extensively by mapping the known TIM barrel folds onto the pathways of central metabolism.

Another observation worth mentioning is that all the TIM barrel proteins are made up of approximately 250 residues, with a deviation of about 10 residues. Although some entries in this dataset have as many as 500 residues, the number falls to approximately 250 when only the curated protein is considered.

Fig. 1: Snapshot of the DATE website available at http://144.16.74.43/~skumar/date

2.2 Objective and salient features of the database

DATE is an acronym for Database of TIM barrel Enzymes. The database was created to help people working with TIM barrel proteins get quick and comprehensive information about a protein of interest. It differs from TIM-DB, maintained by Pujadas (Pujadas & Palau, 1999), which provides links to the PDB website, SCOP and CATH and classifies the known TIM barrel proteins by E.C. number. Both our analysis and theirs found no examples in the class of ligases.

DATE is not a general database but a speciality database in the sense described by Berman (1999): a database curated by experts working in the field that provides information beyond the structures themselves, such as derived structural data, sequence information and other biochemical information.

2.2.1 Choice of the dataset

The information available in the database is briefly discussed in this section. The set of proteins used for the study was selected by two criteria. First, the homology between any two proteins is less than 25%. Second, only structures determined within a resolution cut-off of 2.5 Å were used. The dataset was then characterized by examining the phi-psi values of the selected proteins.

Some of the observations from the analysis were striking. The frequency with which each residue occurs in helix regions and in strand regions was calculated: Ala and Leu had the highest frequency in helices, and Val and Leu had the highest frequency in strands. Most of the helices terminated with a residue in the left-handed helical conformation, most often glycine, followed by asparagine.

Fig. 2: Choice of the dataset
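The two selection criteria of Section 2.2.1 can be illustrated with a minimal sketch. This is not the procedure actually used to build DATE: the function name `select_dataset`, the input layout, the greedy best-resolution-first ordering, and treating 2.5 Å as an inclusive limit are all assumptions made for illustration. Pairwise identities are assumed to have been computed beforehand by some alignment tool.

```python
def select_dataset(entries, identity, max_resolution=2.5, max_identity=25.0):
    """Pick a non-redundant set of structures (illustrative sketch only).

    entries  : list of (entry_id, resolution_in_angstrom) tuples (hypothetical layout)
    identity : dict mapping frozenset({id_a, id_b}) -> percent sequence identity;
               pairs missing from the dict are treated as 0% identical
    """
    selected = []
    # Assumption: visit the best-resolved structures first, so that when two
    # entries are too similar the better-resolved one is the one kept.
    for entry_id, resolution in sorted(entries, key=lambda e: e[1]):
        if resolution > max_resolution:
            continue  # fails the resolution cut-off
        too_similar = any(
            identity.get(frozenset((entry_id, kept)), 0.0) >= max_identity
            for kept in selected
        )
        if not too_similar:
            selected.append(entry_id)
    return selected


# Hypothetical entries and identities, not taken from DATE.
example_entries = [("A", 1.8), ("B", 2.0), ("C", 3.0), ("D", 2.4)]
example_identity = {
    frozenset(("A", "B")): 40.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 12.0,
}
print(select_dataset(example_entries, example_identity))
```

In this toy input, entry C is dropped for poor resolution and entry B for being too similar to the better-resolved entry A.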

2.3 Statistical information on the database

There are 85 enzymes in the database, with 22,062 residues in total. The propensity for a helix to terminate in the left-handed helical conformation was highest for glycine, followed by asparagine and histidine. Some statistics of the database, such as residue preference in helices and strands and helix termination with positive phi and psi, are shown in the figures below.

Fig. 3: Conformational characterization of the residues (non-Gly) in the dataset and the composition of the dataset.

Fig. 4: Residue frequency of occurrence in the helix region and the strand region for the residues in the database.
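The text does not state how propensity was computed. One common definition, assumed here purely for illustration, is the ratio of a residue's share within a conformational state to its share in the whole dataset, so that values above 1 indicate a preference for that state. The function name `propensities` and the single-letter state labels are hypothetical.

```python
from collections import Counter


def propensities(assignments, state):
    """Return {residue: propensity} for one conformational state.

    assignments : iterable of (residue_name, state_label) pairs, for example
                  ("ALA", "H"); the labels are an assumed convention
    state       : the label of interest, e.g. "H" for helix or "E" for strand

    Propensity is taken to be (fraction of the state made up of the residue)
    divided by (fraction of the whole dataset made up of the residue).
    """
    assignments = list(assignments)
    overall = Counter(res for res, _ in assignments)
    in_state = Counter(res for res, s in assignments if s == state)
    n_all = sum(overall.values())
    n_state = sum(in_state.values())
    if n_state == 0:
        return {}  # the state never occurs, so no propensity is defined
    return {
        res: (in_state[res] / n_state) / (overall[res] / n_all)
        for res in overall
    }


# Toy data, not taken from DATE.
toy = [("ALA", "H"), ("ALA", "H"), ("ALA", "E"),
       ("VAL", "E"), ("VAL", "E"), ("GLY", "C")]
helix = propensities(toy, "H")
strand = propensities(toy, "E")
```

The same function could be applied to helix C-terminal positions with positive phi and psi, given a suitable state label, to rank residues such as Gly, Asn and His.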

2.4 Features

Some of the salient features of the database are discussed below.

2.4.1 Structure and Topology diagram of the Protein

The structure of each protein in the database is shown as a ribbon diagram generated with Molscript (Kraulis, 1991). The topology of each enzyme is also depicted as a cartoon, with cylinders representing helices and arrows representing strands. The arrowhead points towards the C-terminus of the protein. Crescents represent the loops that connect a helix and a strand. Where a protein has large extra domains, these are not drawn as cylinders and arrows but are simply marked as extra domains.

Fig. 5: One half of the webpage for the entry Triosephosphate Isomerase

2.4.2 Other information

In addition to the structure and the topology diagram, each entry gives the function the enzyme performs, the multimeric unit of the protein, the number of residues in the protein and the pathway in which it is involved. A link to the PDB entry is also given. All of these are shown in Fig. 5.

2.4.3 Conformational information

A link named Ramachandran Phi Psi Values is provided for each entry in the database. It leads to a file containing the complete phi and psi values of the protein. The format of the file and the conventions used are described on each page (see Fig. 6).

Fig. 6: The file reached through the link labelled Ramachandran phi psi values. The file gives the backbone dihedral values for the residues in the protein. Its format and the conventions used are described in the header of each file.
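For readers unfamiliar with how backbone dihedral values are obtained, the sketch below computes a signed torsion angle from four atom positions using the standard atan2 formulation. It is not the program used to generate the DATE files, whose format is described only in the file headers. The helper names are hypothetical, and the check `has_positive_phi_psi` merely encodes the "positive phi and psi" description used in this chapter, since exact region boundaries are not given.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def has_positive_phi_psi(phi, psi):
    """True when both angles are positive (left-handed region, as assumed here)."""
    return phi > 0.0 and psi > 0.0


# phi of residue i uses C(i-1), N(i), CA(i), C(i);
# psi of residue i uses N(i), CA(i), C(i), N(i+1).
# Toy coordinates chosen so that the result is easy to verify by hand.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

With real coordinates, the four points would be the atom positions read from the ATOM records of a PDB file for the residue in question and its neighbours.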

2.4.4 BLAST Output for the particular entry

In addition to this structural information, sequence information is provided. For each entry, the BLAST (Basic Local Alignment Search Tool) result is displayed. This gives the sequence alignments of the protein with related proteins from different organisms. Note that BLAST reports local rather than global alignments (see Fig. 7).

Fig. 7: The page containing the BLAST output for a particular entry

2.5 Availability, update and the structure of the website

DATE is currently updated once every 3 months. Researchers who wish to use DATE locally may download it as a zip file at http://144.16.74.43/~skumar/date/date.zip or as a tar file at http://144.16.74.43/~skumar/date/date.tar. Those who wish to browse the contents of the database can visit DATE at http://www.144.16.74.43/~skumar/date. Mirror sites are hosted at http://www.geocities.com/date and at http://www.geocities.com/madanm2/date. To report any mistakes, please email M. Madan Babu (mmadanbabu@rediffmail.com) or S. Kumar Singh (skumar@mbu.iisc.ernet.in).

Fig. 8: Structure of the website

2.6 References

Berman, H. (1999). The past and the future of structure databases. Curr. Opin. Biotechnol. 10, 76-80.

Copley, R. R. & Bork, P. (2000). Homology among (βα) barrels: implications for the evolution of metabolic pathways. J. Mol. Biol. 303, 627-640.

Farber, G. K. (1993). An α/β barrel full of evolutionary trouble. Curr. Opin. Struc. Biol. 3, 409-412.

Farber, G. K. & Petsko, G. A. (1990). The evolution of α/β barrel enzymes. TIBS 15, 228-234.

Kraulis, P. J. (1991). MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 24, 946-950.

Pujadas, G. & Palau, J. (1999). TIM barrel fold: structural, functional and evolutionary characteristics in natural and designed molecules. Biologia, Bratislava 54 (3), 231-254.

Raine, A. R. C. et al. (1994). On the evolution of alternate core packing in eightfold α/β barrels. Protein Science 3, 1889-1892.

Ramachandran, G. N. & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Adv. Protein Chem. 23, 283-438.

Reardon, D. & Farber, G. K. (1995). The structure and evolution of α/β barrel proteins. FASEB J. 9, 497-503.