Structural and Functional Analysis of Inosine Monophosphate Dehydrogenase using Sequence-Based Bioinformatics

Barry Sexton (1,2) and Troy Wymore (3)
1 Bioengineering and Bioinformatics Summer Institute, Department of Computational Biology, University of Pittsburgh, Pittsburgh, PA 15261
2 Department of Biology, Bucknell University, Lewisburg, PA 17837
3 Pittsburgh Supercomputing Center, Biomedical Initiative Group, Pittsburgh, PA 15213

Sequence-Based Bioinformatics
- Multiple sequence alignments between homologous proteins
- Identification of relatedness between proteins in the same family or in different families based on phylogenetic tree analysis
- Construction of 3-dimensional models of proteins which have unknown structure
- Offers insights into conserved features within a protein class or family
- Analysis of conserved residues or regions of the sequence
- Interactions within the active site or elsewhere that give the protein its distinct functionality
- Design of novel inhibitors through computational docking (drug design)
- Facilitates the construction of a more powerful sequence-structure-function relationship

Why Study IMPDH?
- Inosine monophosphate dehydrogenase (IMPDH) catalyzes the first step unique to guanine nucleotide synthesis
- The reaction depends on an NAD cofactor and converts IMP to XMP
- The biosynthesis of guanine nucleotides is essential for cell proliferation
Image source: www.angelfire.com/dc/apgenetics/replication.jpg

Sequence Selection
- Used 1NFB (solved structure of human IMPDH from the PDB) as the query sequence
- Used BLAST (Basic Local Alignment Search Tool) to find the proteins with the highest sequence identity to human IMPDH
- Cut-off E-value was 1 x 10^-20
- Returned 159 sequences
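
The E-value cut-off used in Sequence Selection can be illustrated with a short filter. This is only a sketch: it assumes the BLAST hits were exported in the 12-column tabular format (`-outfmt 6`), which the slides do not state, and the hit names and scores in `SAMPLE_HITS` are hypothetical placeholders rather than results from this study.

```python
# Minimal sketch: keep BLAST hits at or below an E-value cut-off.
# Assumes 12-column tabular output (qseqid, sseqid, pident, length, mismatch,
# gapopen, qstart, qend, sstart, send, evalue, bitscore).

def filter_hits(tabular_text, max_evalue=1e-20):
    """Return (subject_id, percent_identity, evalue) for hits passing the cut-off."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject = fields[1]
        percent_identity = float(fields[2])
        evalue = float(fields[10])
        if evalue <= max_evalue:
            hits.append((subject, percent_identity, evalue))
    return hits

# Hypothetical hits, for illustration only.
SAMPLE_HITS = "\n".join([
    "\t".join(["1NFB", "hit_A", "98.5", "514", "8", "0", "1", "514", "1", "514", "0.0", "1040"]),
    "\t".join(["1NFB", "hit_B", "41.2", "480", "270", "6", "20", "499", "5", "478", "3e-45", "170"]),
    "\t".join(["1NFB", "hit_C", "29.0", "200", "130", "9", "60", "259", "44", "240", "2e-10", "55"]),
])
```

Calling `filter_hits(SAMPLE_HITS)` keeps `hit_A` and `hit_B` and discards `hit_C`, whose E-value is above the 1 x 10^-20 threshold.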

Multiple Sequence Alignment
- Used the program T-Coffee
- T-Coffee computes the best global alignment and the top ten local alignments for each possible pair of sequences
- It then determines the multiple sequence alignment that has maximum consistency with all of the previously determined pairwise alignments

MEME Motif Patterns
- MEME = Multiple Expectation Maximization for Motif Elicitation
- The MEME algorithm is a combination of:
  - Expectation maximization (EM)
  - EM-based heuristic for choosing the EM starting point
  - Maximum likelihood ratio
  - Multistart for searching over possible motif widths
  - Search for finding multiple motifs
- Discovered motifs are highly conserved regions of the sequence and are presumed to give the protein its distinct functionality
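
The expectation maximization step at the core of MEME can be sketched in a few lines. The code below is a simplified illustration, not MEME itself: it assumes exactly one motif occurrence per sequence, takes the starting point as a user-supplied seed word instead of using MEME's starting-point heuristic, and skips the width search and the multiple-motif search. The function name `em_motif` and its parameters are hypothetical.

```python
from collections import Counter

def em_motif(seqs, seed, n_iter=20, pseudo=0.1):
    """Toy EM motif finder: one occurrence per sequence, fixed width len(seed).

    Assumes every sequence is at least as long as the seed, the alphabet has
    at least two letters, and n_iter >= 1.
    Returns (theta, starts): the position-specific letter probabilities and
    the most probable motif start in each sequence.
    """
    w = len(seed)
    alphabet = sorted(set("".join(seqs)) | set(seed))
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    # Background letter frequencies, smoothed so that none is zero.
    bg = {a: (counts[a] + pseudo) / (total + pseudo * len(alphabet)) for a in alphabet}
    # Starting point: each motif column favours the corresponding seed letter.
    rest = 0.5 / (len(alphabet) - 1)
    theta = [{a: (0.5 if a == seed[k] else rest) for a in alphabet} for k in range(w)]
    for _ in range(n_iter):
        # E-step: posterior probability of each start position in each sequence,
        # from the motif-versus-background likelihood ratio of each window.
        z = []
        for s in seqs:
            ratios = []
            for j in range(len(s) - w + 1):
                r = 1.0
                for k in range(w):
                    r *= theta[k][s[j + k]] / bg[s[j + k]]
                ratios.append(r)
            norm = sum(ratios)
            z.append([r / norm for r in ratios])
        # M-step: re-estimate each motif column from the expected letter counts.
        theta = []
        for k in range(w):
            col = {a: pseudo for a in alphabet}
            for s, zs in zip(seqs, z):
                for j, p in enumerate(zs):
                    col[s[j + k]] += p
            col_total = sum(col.values())
            theta.append({a: c / col_total for a, c in col.items()})
    starts = [max(range(len(zs)), key=zs.__getitem__) for zs in z]
    return theta, starts
```

For example, `em_motif(["AAACIGSAAA", "ACIGSAAAAA", "AAAAAACIGS"], seed="CIGS")` places the motif at the positions where `CIGS` occurs in the three toy sequences.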

GeneDoc to View Alignment

Manual Alignment Adjustment

Phylogenetic Tree Construction
- SeqBoot: reads in a data set (multiple sequence alignment) and produces multiple data sets from it by bootstrap resampling
- PROTDIST: computes a distance measure for protein sequences using maximum likelihood estimates based on the Dayhoff PAM matrix
  - Based on the genetic code plus a constraint on changing to a different category of amino acid
  - Also computes percentage similarity between sequences
- NEIGHBOR: Neighbor Joining method
  - Neighbor Joining is a distance matrix method producing an unrooted tree
  - The method is very fast and can handle very large data sets
- CONSENSE: combines the multiple trees into one consensus tree

Phylogenetic Tree Analysis
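
The bootstrap resampling step described for SeqBoot amounts to drawing alignment columns with replacement. The following is a minimal sketch of that idea, not a reimplementation of SeqBoot: the function name, the dictionary-based alignment format, and the fixed random seed are assumptions made for illustration.

```python
import random

def bootstrap_alignments(alignment, n_replicates, seed=0):
    """Resample alignment columns with replacement.

    `alignment` maps sequence names to equal-length aligned strings.
    Returns a list of resampled alignments in the same format.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        # Draw `length` column indices with replacement; the same columns are
        # used for every sequence so each column stays intact.
        columns = [rng.randrange(length) for _ in range(length)]
        replicates.append(
            {name: "".join(alignment[name][c] for c in columns) for name in names}
        )
    return replicates
```

Each replicate has the same number of sequences and columns as the input; a distance matrix and tree would then be computed per replicate and the trees combined into a consensus.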

Refined Sequence List
- Based on phylogenetic tree outgroup analysis
- Based on viewing the optimal alignment in GeneDoc
- Eliminated sequences that lacked a significant number of motifs or appeared unrelated
- Sequences from undetermined organisms and hypothetical sequences were also eliminated

Discovered GMP Reductase Relationship
- GMP reductase (GMPR) has 34% sequence identity with IMPDH
- GMPR and prokaryotic IMPDH contained the displayed motif, while eukaryotic IMPDH did not
- Otherwise IMPDH and GMPR have almost identical motif patterns
- Both participate in the same enzymatic pathway
- GMPR converts guanosine monophosphate (GMP) to inosine monophosphate (IMP)
Figure labels: GMPR & Prokaryotic IMPDH; Eukaryotic IMPDH
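
A figure such as the 34% sequence identity between GMPR and IMPDH is computed from a pairwise alignment. The sketch below shows one common convention, identical positions divided by the number of columns where neither sequence has a gap. The slides do not say which denominator was used, so this choice and the function name are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

For the toy pair `"ACDE-F"` and `"ACQEGF"`, five columns are compared and four match, giving 80.0.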

Reran MEME with Reduced Number of Sequences

Re-evaluated Sequences in GeneDoc and Readjusted Alignment

Phylogenetic Tree Analysis Part II

Simplified Phylogenetic Tree
- Prokaryote I (gram positive bacteria, pathogenic, obligate aerobes)
- Prokaryote III (proteobacteria, pathogenic, gram negative)
- Prokaryote II (rod shaped bacteria, gram positive, thermophilic)
- Fungi (budding yeasts, fission yeasts, bread molds, etc.)
- Protists (pathogenic protozoa)
- Insects (honey bee, mosquito, fruit fly, etc.)
- Plants (flowering plants, rice, tobacco, etc.)
- Mammal/Amphibian IMPDH II (human, mouse, chicken, frog, etc.)
- Mammal/Amphibian IMPDH I (human, mouse, chicken, frog, etc.)

Superposed Motifs onto 3-D Structure of IMPDH

Motifs Highlighted on 3-D Structure

Motifs Highlighted on the Monomer Only

Specific Interactions of Motif Residues with IMP
Motif 6 pictured
Figure labels: Cysteine 331; inosine monophosphate (IMP) substrate

Specific Interactions of Motif Residues with NAD
Motif 5 pictured
Figure labels: Serine 275; Serine 276; Phenylalanine 282; NAD cofactor

Specific Interactions of Motif Residues with both NAD & IMP
Figure labels: Arginine 322; Asparagine 303; Isoleucine 332; NAD; IMP

Structural Analysis: Roles of Each Motif

Motif 1 (residues 29 to 57)
  Consensus sequence: GLTYNDFLILPGYIDFTADQVDLTSALTKKITLKTPLY
  Crucial residues: Thr45, Ala46
  Possible function: NAD cofactor binding

Motif 2 (residues 64 to 113)
  Consensus sequence: PLVSSPMDTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQG
  Crucial residues: Met70, Asp71, Thr72, Val73, His93
  Possible function: NAD cofactor binding

Motif 3 (residues 114 to 163)
  Consensus sequence: FITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDI
  Crucial residues: Asp117, Pro118, Pro123, Gly148
  Possible function: Structural

Motif 4 (residues 194 to 234)
  Consensus sequence: LKEANEILQRSKKGKLPIVNEDDELVAIIARTDLKKNRDYP
  Crucial residues: Leu218, Asp226, Lys228, Pro234
  Possible function: Structural

Motif 5 (residues 263 to 291)
  Consensus sequence: LAQAGVDVVVLDSSQGNSIFQINMIKYIK
  Crucial residues: Asp274, Ser275, Ser276, Ser280, Phe282
  Possible function: NAD cofactor binding and chemistry

Motif 6 (residues 294 to 343)
  Consensus sequence: YPNLQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQ
  Crucial residues: Asn303, Arg322, Ile330, Cys331, Ile332, Glu335
  Possible function: NAD cofactor and inosine substrate chemistry

Motif 7 (residues 359 to 399)
  Consensus sequence: VPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGE
  Crucial residues: Asp364, Gly365, Gly366, Ile367, Ser388, Leu389
  Possible function: Inosine substrate binding and chemistry

Motif 8 (residues 401 to 429)
  Consensus sequence: FFSDGIRLKKYRGMGSLDAMDKHLSSQNR
  Crucial residues: Tyr411, Met414, Arg429
  Possible function: NAD cofactor binding

Motif 9 (residues 436 to 466)
  Consensus sequence: KIKVAQGVSGAVQDKGSIHKFVPYLIAGIQH
  Crucial residues: Gly451, Leu460, Gly463, Hsd466
  Possible function: NAD cofactor binding

Conclusions
- Phylogenetic tree reveals relatedness among analyzed sequences
- Indicates probable evolutionary progression
- Visualization and analysis of conserved motifs and residues
- Interactions with the substrate and cofactor provide insight into the catalytic activity of IMPDH
Tree labels:
- Prokaryote I (gram positive bacteria, pathogenic, obligate aerobes)
- Fungi (budding yeasts, fission yeasts, bread molds, etc.)
- Protists (pathogenic protozoa)
- Prokaryote III (proteobacteria, pathogenic, gram negative)
- Prokaryote II (rod shaped bacteria, gram positive, thermophilic)
- Plants (flowering plants, rice, tobacco, etc.)
- Insects (honey bee, mosquito, fruit fly, etc.)
- Mammal/Amphibian IMPDH II (human, mouse, chicken, frog, etc.)
- Mammal/Amphibian IMPDH I (human, mouse, chicken, frog, etc.)
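
The motif residue ranges listed under Structural Analysis make it straightforward to check which motif a given residue falls in. The sketch below uses the ranges exactly as given on that slide; the helper names are hypothetical, and residue labels are assumed to be a three-letter code followed by a number (for example `Cys331`).

```python
import re

# Motif number -> (first residue, last residue), as listed on the slide.
MOTIF_RANGES = {
    1: (29, 57),
    2: (64, 113),
    3: (114, 163),
    4: (194, 234),
    5: (263, 291),
    6: (294, 343),
    7: (359, 399),
    8: (401, 429),
    9: (436, 466),
}

def residue_number(label):
    """Extract the residue number from a label such as 'Cys331'."""
    match = re.fullmatch(r"[A-Za-z]{3}(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return int(match.group(1))

def motif_for(label):
    """Return the motif number containing the residue, or None if outside all motifs."""
    number = residue_number(label)
    for motif, (start, end) in MOTIF_RANGES.items():
        if start <= number <= end:
            return motif
    return None
```

With these ranges, `motif_for("Cys331")` returns 6 and `motif_for("Ser275")` returns 5, matching the motifs pictured with IMP and NAD respectively.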

Future Applications
- IMPDH is upregulated in rapidly dividing tumor cells and has therefore been identified as an excellent target for pharmacological intervention
- Information about key interactions at the active site can provide biochemists and other interested researchers with useful knowledge for designing drugs to inhibit IMPDH
- Inhibiting the enzyme could also have antiparasitic, antimicrobial, or even antiviral applications

References
- Bailey T., and Elkan C. The Value of Prior Knowledge in Discovering Motifs with MEME. AAAI Press (2003), NIH HG00005.
- Colby T., Vanderveen K., Strickler M., Markham G., Goldstein B. Crystal Structure of Human Type II Inosine Monophosphate Dehydrogenase: Implications for Ligand Binding and Drug Design. Biochemistry (1999), 96: 3531-3536.
- Poirot O., O'Toole E., Notredame C. Tcoffee@igs: A Web Server for Computing, Evaluating, and Combining Multiple Sequence Alignments. Nucleic Acids Research (2003), 13: 3503-6.
- Sintchak M., and Nimmesgern E. The Structure of Inosine 5'-Monophosphate Dehydrogenase and the Design of Novel Inhibitors. Immunopharmacology (2000), 47: 163-184.

Acknowledgments
- Bioengineering and Bioinformatics Summer Institute, Department of Computational Biology, University of Pittsburgh
- Rajan Munshi, BBSI Coordinator
- National Institutes of Health (NIH) and National Science Foundation (NSF)
- Department of Biology, Bucknell University
- Troy Wymore, Biomedical Initiative Group, Pittsburgh Supercomputing Center
- Adam Marko, University of Pittsburgh