Sequence Based Bioinformatics


Structural and Functional Analysis of Inosine Monophosphate Dehydrogenase using Sequence-Based Bioinformatics

Barry Sexton (1,2) and Troy Wymore (3)
1 Bioengineering and Bioinformatics Summer Institute, Department of Computational Biology, University of Pittsburgh, Pittsburgh, PA 15261
2 Department of Biology, Bucknell University, Lewisburg, PA 17837
3 Pittsburgh Supercomputing Center, Biomedical Initiative Group, Pittsburgh, PA 15213

Sequence Based Bioinformatics
- Multiple sequence alignments between homologous proteins
- Identification of relatedness between proteins in the same family or in different families, based on phylogenetic tree analysis
- Construction of 3-dimensional models of proteins which have unknown structure
- Offers insights into conserved features within a protein class or family
- Analysis of conserved residues or regions of the sequence
- Interactions within the active site or elsewhere that give the protein its distinct functionality
- Design of novel inhibitors through computational docking (drug design)
- Facilitates the construction of a more powerful sequence-structure-function relationship

Why Study IMPDH?
- Inosine monophosphate dehydrogenase (IMPDH) catalyzes the first step unique to guanine nucleotide synthesis
- The reaction depends on an NAD cofactor and converts inosine monophosphate (IMP) to xanthosine monophosphate (XMP)
- The biosynthesis of guanine nucleotides is essential for cell proliferation
(Image: www.angelfire.com/dc/apgenetics/replication.jpg)

Sequence Selection
- Used 1NFB (solved structure of human IMPDH from the PDB) as the query sequence
- Used BLAST (Basic Local Alignment Search Tool) to find proteins with the highest sequence identity to human IMPDH
- Cut-off E-value was 1 x 10^-20
- Returned 159 sequences
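The E-value cut-off under Sequence Selection can be applied programmatically. The sketch below assumes the BLAST hits were saved in the 12-column tabular format (BLAST+ -outfmt 6), where column 2 is the subject ID and column 11 is the E-value. The function name filter_hits, the constant SAMPLE_HITS, and every row in it are made up for illustration; they are not results from this study, and treating the cut-off as inclusive is also an assumption.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-20  # cut-off quoted on the Sequence Selection slide

# Hypothetical tabular BLAST output (qseqid sseqid pident length mismatch
# gapopen qstart qend sstart send evalue bitscore). Not real search results.
SAMPLE_HITS = "\n".join([
    "1NFB_A\tsubject_1\t84.0\t514\t82\t0\t1\t514\t1\t514\t0.0\t900",
    "1NFB_A\tsubject_2\t38.5\t480\t280\t6\t20\t499\t10\t480\t3e-85\t270",
    "1NFB_A\tsubject_3\t34.0\t330\t200\t5\t60\t389\t5\t330\t2e-45\t160",
    "1NFB_A\tsubject_3\t30.0\t100\t70\t0\t400\t499\t340\t439\t1e-08\t50",
    "1NFB_A\tsubject_4\t27.0\t120\t85\t2\t250\t369\t40\t159\t4e-06\t45",
])


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, best_evalue) pairs with E-value <= cutoff.

    A subject can appear on several rows (one per local alignment); only
    its lowest E-value is kept. Results are sorted from best to worst.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject, evalue = row[1], float(row[10])
        if evalue <= cutoff and evalue < best.get(subject, float("inf")):
            best[subject] = evalue
    return sorted(best.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for subject, evalue in filter_hits(SAMPLE_HITS):
        print(subject, evalue)
```

A lower cut-off keeps fewer, more closely related sequences; the subject IDs that survive would then be fetched and passed to the alignment step.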

Multiple Sequence Alignment
- Used the program T-Coffee
- T-Coffee computes the best global alignment and the top ten local alignments for each possible pair of sequences
- It then determines the multiple sequence alignment that has maximum consistency with all of the previously determined pairwise alignments

MEME Motif Patterns
- MEME = Multiple Expectation Maximization for Motif Elicitation
- The MEME algorithm is a combination of:
  - Expectation maximization (EM)
  - EM-based heuristic for choosing the EM starting point
  - Maximum likelihood ratio
  - Multistart for searching over possible motif widths
  - Search for finding multiple motifs
- Discovered motifs are highly conserved regions of the sequence and are presumed to give the protein its distinct functionality
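MEME fits a probabilistic motif model by expectation maximization, which is too involved to reproduce here. As a much simpler stand-in for what "highly conserved region" means, the sketch below scores each column of an existing alignment by the fraction of sequences that share its most common residue and reports runs of well-conserved columns. This is not the MEME algorithm; the names column_conservation and conserved_runs, the 0.8 threshold, the minimum run length, and the toy alignment are all assumptions for illustration.

```python
from collections import Counter

# Toy alignment ("-" marks a gap). Hypothetical sequences, not study data.
TOY_ALIGNMENT = [
    "MKGSGSICA",
    "MRGSGSICT",
    "LKGSGSICA",
    "M-GSGSICS",
]


def column_conservation(alignment):
    """Return one (residue, fraction) pair per alignment column.

    residue is the most common non-gap character in the column and
    fraction is the share of all sequences carrying it, so gaps lower
    the score.
    """
    n_sequences = len(alignment)
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        if counts:
            residue, count = counts.most_common(1)[0]
        else:
            residue, count = "-", 0  # column made only of gaps
        scores.append((residue, count / n_sequences))
    return scores


def conserved_runs(alignment, threshold=0.8, min_length=3):
    """Return (start, end, consensus) for runs of conserved columns.

    start and end are 1-based, inclusive column numbers.
    """
    scores = column_conservation(alignment)
    runs, start = [], None
    # The sentinel entry closes a run that reaches the last column.
    for i, (_, fraction) in enumerate(scores + [("-", 0.0)]):
        if fraction >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                consensus = "".join(res for res, _ in scores[start:i])
                runs.append((start + 1, i, consensus))
            start = None
    return runs


if __name__ == "__main__":
    print(conserved_runs(TOY_ALIGNMENT))
```

Unlike MEME, this approach needs the sequences to be aligned already and has no statistical model behind it, so it is only useful as a quick visual check.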

GeneDoc to View Alignment

Manual Alignment Adjustment

Phylogenetic Tree Construction
- SeqBoot: reads in a data set (multiple sequence alignment) and produces multiple data sets from it by bootstrap resampling
- PROTDIST: computes a distance measure for protein sequences using maximum likelihood estimates based on the Dayhoff PAM matrix
  - Based on genetic code plus a constraint on changing to a different category of amino acid
  - Also computes percentage similarity between sequences
- NEIGHBOR: Neighbor Joining method
  - Neighbor Joining is a distance matrix method producing an unrooted tree
  - The method is very fast and can handle very large data sets
- CONSENSE: combines the multiple trees into one consensus tree

Phylogenetic Tree Analysis
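Two of the tree-construction steps listed above are easy to illustrate in a few lines: the column resampling that SeqBoot performs, and the criterion Neighbor Joining uses to choose which pair of taxa to join first. The sketch below is a simplified illustration, not PHYLIP's code. The function names, the toy alignment, and the five-taxon distance table are assumptions, and the PROTDIST distance calculation is not reproduced (the distances are simply given).

```python
import random
from itertools import combinations

# Toy alignment and toy pairwise distances. Hypothetical values only.
TOY_ALIGNMENT = {"a": "ACDEFG", "b": "ACDQFG", "c": "TCDQFA"}
TOY_DISTANCES = {
    frozenset("ab"): 5, frozenset("ac"): 9, frozenset("ad"): 9,
    frozenset("ae"): 8, frozenset("bc"): 10, frozenset("bd"): 10,
    frozenset("be"): 9, frozenset("cd"): 8, frozenset("ce"): 7,
    frozenset("de"): 3,
}


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate of an alignment.

    Columns are drawn with replacement until the replicate has the same
    length as the original. The same columns are used for every sequence,
    so each replicate column is an intact column of the original.
    """
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


def nj_first_join(dist):
    """Return the pair of taxa that Neighbor Joining would join first.

    dist maps frozenset({x, y}) to the distance between x and y. The pair
    minimizing Q(x, y) = (n - 2) * d(x, y) - sum_k d(x, k) - sum_k d(y, k)
    is chosen, which is not necessarily the pair with the smallest distance.
    """
    taxa = sorted({taxon for pair in dist for taxon in pair})
    n = len(taxa)
    row_sum = {t: sum(dist[frozenset((t, u))] for u in taxa if u != t)
               for t in taxa}

    def q(pair):
        x, y = pair
        return (n - 2) * dist[frozenset(pair)] - row_sum[x] - row_sum[y]

    return min(combinations(taxa, 2), key=q)


if __name__ == "__main__":
    print(bootstrap_columns(TOY_ALIGNMENT, random.Random(0)))
    print(nj_first_join(TOY_DISTANCES))
```

In the toy distance table the closest pair is d and e (distance 3), yet the Q criterion selects a and b, because it corrects each distance for how far both taxa are from everything else. A full run would repeat the join step until the tree is resolved, once per bootstrap replicate, and then summarize the replicate trees as CONSENSE does.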

Refined Sequence List
- Based on phylogenetic tree outgroup analysis
- Based on viewing the optimal alignment in GeneDoc
- Eliminated sequences that lacked a significant number of motifs or appeared unrelated
- Also eliminated sequences from undetermined organisms and sequences annotated as hypothetical

Discovered GMP Reductase Relationship
- GMP reductase (GMPR) has 34% sequence identity with IMPDH
- GMPR and prokaryotic IMPDH contained the displayed motif, while eukaryotic IMPDH did not
- Otherwise IMPDH and GMPR have almost identical motif patterns
- Both participate in the same enzymatic pathway
- GMPR converts guanosine monophosphate (GMP) to inosine monophosphate (IMP)
(Figure panels: GMPR & Prokaryotic IMPDH; Eukaryotic IMPDH)
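A figure such as the 34% identity between GMPR and IMPDH comes from counting matching positions in a pairwise alignment. The slides do not say which denominator was used, so the sketch below makes an explicit assumption: identity is the number of matching positions divided by the number of aligned positions where neither sequence has a gap. The function name percent_identity and the toy sequences are illustrative only.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two rows of a pairwise alignment.

    Positions where either sequence has a gap ("-") are ignored. Other
    conventions (dividing by alignment length or by the shorter sequence)
    give different numbers for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy example: 10 ungapped positions, 9 of them identical.
    print(percent_identity("AC-DEFGHIKL", "ACWDEFGHIKV"))
```

Because the denominator changes the result, identity values are only comparable when they were computed with the same convention.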

Reran MEME with Reduced Number of Sequences

Re-evaluated Sequences in GeneDoc and Readjusted Alignment

Phylogenetic Tree Analysis Part II

Simplified Phylogenetic Tree (figure labels)
- Prokaryote I (gram-positive bacteria, pathogenic, obligate aerobes)
- Prokaryote III (proteobacteria, pathogenic, gram-negative)
- Prokaryote II (rod-shaped bacteria, gram-positive, thermophilic)
- Fungi (budding yeasts, fission yeasts, bread molds, etc.)
- Protists (pathogenic protozoa)
- Insects (honey bee, mosquito, fruit fly, etc.)
- Plants (flowering plants, rice, tobacco, etc.)
- Mammal/Amphibian IMPDH II (human, mouse, chicken, frog, etc.)
- Mammal/Amphibian IMPDH I (human, mouse, chicken, frog, etc.)

Superposed Motifs onto 3-D Structure of IMPDH

Motifs Highlighted on 3-D Structure

Motifs Highlighted on the Monomer Only

Specific Interactions of Motif Residues with IMP
Motif 6 pictured. Figure labels: Cysteine 331; Inosine Monophosphate (IMP) substrate

Specific Interactions of Motif Residues with NAD
Motif 5 pictured. Figure labels: Serine 275; Serine 276; Phenylalanine 282; NAD cofactor

Specific Interactions of Motif Residues with both NAD & IMP
Figure labels: Arginine 322; Asparagine 303; Isoleucine 332; NAD; IMP

Structural Analysis: Roles of Each Motif

Motif 1 (residues 29 to 57)
  Consensus sequence: GLTYNDFLILPGYIDFTADQVDLTSALTKKITLKTPLY
  Crucial residues: Thr45, Ala46
  Possible function: NAD cofactor binding

Motif 2 (residues 64 to 113)
  Consensus sequence: PLVSSPMDTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQG
  Crucial residues: Met70, Asp71, Thr72, Val73, His93
  Possible function: NAD cofactor binding

Motif 3 (residues 114 to 163)
  Consensus sequence: FITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDI
  Crucial residues: Asp117, Pro118, Pro123, Gly148
  Possible function: Structural

Motif 4 (residues 194 to 234)
  Consensus sequence: LKEANEILQRSKKGKLPIVNEDDELVAIIARTDLKKNRDYP
  Crucial residues: Leu218, Asp226, Lys228, Pro234
  Possible function: Structural

Motif 5 (residues 263 to 291)
  Consensus sequence: LAQAGVDVVVLDSSQGNSIFQINMIKYIK
  Crucial residues: Asp274, Ser275, Ser276, Ser280, Phe282
  Possible function: NAD cofactor binding and chemistry

Motif 6 (residues 294 to 343)
  Consensus sequence: YPNLQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQ
  Crucial residues: Asn303, Arg322, Ile330, Cys331, Ile332, Glu335
  Possible function: NAD cofactor and inosine substrate chemistry

Motif 7 (residues 359 to 399)
  Consensus sequence: VPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGE
  Crucial residues: Asp364, Gly365, Gly366, Ile367, Ser388, Leu389
  Possible function: Inosine substrate binding and chemistry

Motif 8 (residues 401 to 429)
  Consensus sequence: FFSDGIRLKKYRGMGSLDAMDKHLSSQNR
  Crucial residues: Tyr411, Met414, Arg429
  Possible function: NAD cofactor binding

Motif 9 (residues 436 to 466)
  Consensus sequence: KIKVAQGVSGAVQDKGSIHKFVPYLIAGIQH
  Crucial residues: Gly451, Leu460, Gly463, Hsd466 (histidine)
  Possible function: NAD cofactor binding

Conclusions
- The phylogenetic tree reveals relatedness among the analyzed sequences and indicates a probable evolutionary progression
  (Figure: simplified phylogenetic tree, showing the same groups as in Phylogenetic Tree Analysis Part II)
- Visualization and analysis of conserved motifs and residues
- Interactions with the substrate and cofactor provide insight into the catalytic activity of IMPDH
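The pairing of crucial residues with motifs in the Structural Analysis slide can be cross-checked from the residue numbers alone, since each crucial residue should fall inside exactly one motif range. The sketch below uses the motif ranges and residue labels given in that slide; the function names residue_number and group_by_motif are illustrative, and the check assumes the motif ranges and the residue labels share one numbering scheme.

```python
import re

# Motif number -> (first residue, last residue), as listed in the slide.
MOTIF_RANGES = {
    1: (29, 57), 2: (64, 113), 3: (114, 163), 4: (194, 234), 5: (263, 291),
    6: (294, 343), 7: (359, 399), 8: (401, 429), 9: (436, 466),
}

# Crucial residues as listed in the slide. "Hsd" is the CHARMM residue name
# for one protonation state of histidine.
CRUCIAL_RESIDUES = (
    "Thr45 Ala46 Met70 Asp71 Thr72 Val73 His93 Asp117 Pro118 Pro123 "
    "Gly148 Leu218 Asp226 Lys228 Pro234 Asp274 Ser275 Ser276 Ser280 "
    "Phe282 Asn303 Arg322 Ile330 Cys331 Ile332 Glu335 Asp364 Gly365 "
    "Gly366 Ile367 Ser388 Leu389 Tyr411 Met414 Arg429 Gly451 Leu460 "
    "Gly463 Hsd466"
).split()


def residue_number(label):
    """Extract the position from a label such as 'Cys331'."""
    match = re.fullmatch(r"[A-Za-z]{3}(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return int(match.group(1))


def group_by_motif(labels, ranges=MOTIF_RANGES):
    """Assign each residue label to the motif whose range contains it.

    Returns (groups, unassigned): groups maps motif number to a list of
    labels in input order; unassigned holds labels outside every range.
    """
    groups = {motif: [] for motif in ranges}
    unassigned = []
    for label in labels:
        number = residue_number(label)
        for motif, (first, last) in ranges.items():
            if first <= number <= last:
                groups[motif].append(label)
                break
        else:
            unassigned.append(label)
    return groups, unassigned


if __name__ == "__main__":
    groups, unassigned = group_by_motif(CRUCIAL_RESIDUES)
    for motif, labels in groups.items():
        print(f"Motif {motif}: {', '.join(labels)}")
    print("Outside all motifs:", unassigned)
```

With the values from the slide, every crucial residue lands in a motif and none is left unassigned, which is consistent with the residues having been picked from the conserved regions.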

Future Applications
- IMPDH is upregulated in rapidly dividing tumor cells and has therefore been identified as an excellent target for pharmacological intervention
- Information about key interactions at the active site can help biochemists and other researchers design drugs that inhibit IMPDH
- Inhibiting the enzyme could also have antiparasitic, antimicrobial, or even antiviral applications

References
- Bailey T., and Elkan C. The Value of Prior Knowledge in Discovering Motifs with MEME. AAAI Press (2003), NIH HG00005.
- Colby T., Vanderveen K., Strickler M., Markham G., Goldstein B. Crystal Structure of Human Type II Inosine Monophosphate Dehydrogenase: Implications for Ligand Binding and Drug Design. Proceedings of the National Academy of Sciences (1999), 96: 3531-3536.
- Poirot O., O'Toole E., Notredame C. Tcoffee@igs: A Web Server for Computing, Evaluating, and Combining Multiple Sequence Alignments. Nucleic Acids Research (2003), 31(13): 3503-3506.
- Sintchak M., and Nimmesgern E. The Structure of Inosine 5'-Monophosphate Dehydrogenase and the Design of Novel Inhibitors. Immunopharmacology (2000), 47: 163-184.

Acknowledgments
- Bioengineering and Bioinformatics Summer Institute, Department of Computational Biology, University of Pittsburgh
- Rajan Munshi, BBSI Coordinator
- National Institutes of Health (NIH) and National Science Foundation (NSF)
- Department of Biology, Bucknell University
- Troy Wymore, Biomedical Initiative Group, Pittsburgh Supercomputing Center
- Adam Marko, University of Pittsburgh