BIOSYNTHESIS AND PHYSIOLOGICAL ROLE OF ARCHAEOSINE IN THE EXTREME HALOPHILIC ARCHAEON Haloferax volcanii

By

GABRIELA PHILLIPS

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

© 2011 Gabriela Phillips

To my husband for his love, understanding, patience

ACKNOWLEDGMENTS

Abundant gratitude belongs to Dr. Valerie de Crécy-Lagard for her supervision, support, and encouragement throughout all the years we worked together in the field. Her knowledgeable and valuable input stimulated this dissertation from preliminary levels to actualization. I would sincerely like to thank my committee members, James Preston, Nemat Keyhani, Claudio Gonzalez, and Nigel Richards, for their support, time, and helpful insights that helped me become a better prepared scholar in the field. I would like to express my deep and sincere gratitude to Basma el Yacoubi for her helpful teachings, discussions, and understanding; her precious support helped me enormously to cope with the difficulties of my doctoral studies. I am grateful to Marc Bailly for insightful discussions and for developing a better procedure for bulk tRNA extraction and purification, as well as for setting up the protocol for extraction and purification of E. coli tRNA(Asp). I am especially indebted to Sophie Alvarez (Danforth Plant Science Center, Proteomics and Mass Spectrometry Facility, St. Louis, MO) for her LC-MS/MS analysis of bulk tRNA. I also want to thank Kirk Gaston (Pat A. Limbach Research Group, University of Cincinnati) for his prompt E. coli tRNA(Asp) sequencing and analysis. I am grateful to Dr. Julie Maupin-Furlow (MCB, UF) for the H. volcanii H26 and H. salinarum NRC-1 strains. I also thank her for the H. volcanii expression plasmid pJAM202; without it, I would not have been able to perform all the H. volcanii phenotype complementation tests. I will miss my coworkers Crysten Haas, Ian Blaby, and Patrick Thiaville for helpful discussions. My undergraduate studies were directed by the advice of Dr. Madeline Rasche, who introduced me to my first serious scientific experiments and believed in my scholastic abilities. Finally, I need to thank my family for unceasing support and patience. I would not have completed this task without their love and understanding.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 tRNA BIOGENESIS IN ARCHAEA
  tRNA Role in Translation
  tRNA Structure
  tRNA Processing
  tRNA Processing in Archaea
    Overview of Archaea Domain
    Maturation of tRNA 5'-end
    Maturation of tRNA 3'-end
    Introns in Archaeal tRNA Transcripts
    M. kandleri C-to-U tRNA Editing
    Posttranscriptional Modifications of tRNA Nucleosides
      Agmatidine, a recently discovered tRNA modification essential for decoding
      Wyosine derivatives biosynthesis pathways in Archaea
      Modification of Adenosine to N1-methyladenosine to N1-methylinosine, an archaeal site specific modification
      Guide RNA dependent modifications of tRNAs
      Archaeosine, an archaeal tRNA specific modification

MATERIAL AND METHODS
  Materials
  Bioinformatics Tools
  Three Dimensional (3D) Structure Superimposition and Visualization
  Strains, Media, Growth and Transformation
  H. volcanii Competent Cells And Transformation Protocols
    Competent cells
    Transformation
  Polymerase Chain Reaction
  DNA Electrophoresis
  Plasmid Isolation and Transformation
  Site-Directed Mutagenesis
  General Cloning
  Plasmids and Strains Construction
    Plasmids construction for bacterial complementation assays
    Plasmids construction for archaeal complementation assays
    Chromosomal gene deletions
  Southern Blot
  Functional Complementation Assays
    Thymidine auxotrophy phenotype complementation
    Queuosine deficient phenotype complementation
    Archaeosine deficient phenotype complementation
  tRNA Work
    Bulk tRNA extraction
    tRNA(Asp) purification
    Bulk tRNA digestion for LC-MS/MS analysis
    tRNA(Asp) digestion

ARCHAEOSINE BIOSYNTHESIS IN H. volcanii
  Background
  Results
    In the Extreme Halophilic Archaeon H. volcanii, Archaeosine Is Not Essential for Growth
    HVO_2348, Encoding FolE2 Homolog, Is Involved in Both Folate and Archaeosine Biosynthesis
    HVO_1718, Encoding QueD Homolog, Is Involved in Archaeosine Biosynthesis
    HVO_1717, Encoding QueE Homolog, and HVO_1716, Encoding QueC Homolog, Are Involved in Archaeosine Biosynthesis
    ArcS Is the Last Step in Archaeosine Biosynthesis in H. volcanii
  Discussion

ALTERNATIVE ARCHAEOSINE BIOSYNTHESIS ROUTES
  Background
  Results
    In Some Crenarchaea, QueF-like Protein Catalyzes the Last Step in Archaeosine Biosynthesis
    In Other Crenarchaea, GATII-QueC Protein Catalyzes the Last Step in Archaeosine Biosynthesis
    Bacterial Tgt Charges Archaeosine at Position 34 of tRNA(Asp)
  Discussion

FUNCTIONAL DIVERSITY OF THE COG0720 PROTEIN FAMILY
  Background
  Results
    Separation of six COG0720 Protein Subfamilies by Comparative Genomics
      Cluster analysis
      Phylogeny and motif derivation
    Structural Analysis of the COG0720 Family
    PTPS-I/III Protein Functions in Both Folate and Queuosine Pathway
    Role of COG0720 Proteins in Archaea
    Flexibility of the PTPS Catalytic Site
  Discussion

PHENOTYPIC ANALYSIS OF H. volcanii ARCHAEOSINE DEFICIENT MUTANTS
  Background
  Results
    Other Extreme Halophilic Archaea Have Lost Archaeosine
    H. volcanii Archaeosine Deficient Mutants Are Sensitive to High Mg2+ Concentrations
    H. volcanii Archaeosine Deficient Mutants Show a Cold Sensitive Phenotype
  Discussion

SUMMARY AND FUTURE DIRECTIONS
  Summary of Findings
  Future Directions

APPENDIX

A LIST OF PRIMERS
B LIST OF PLASMIDS
C LIST OF STRAINS
D NAMES AND ABBREVIATIONS OF tRNA MODIFICATIONS FOUND IN ARCHAEA
E LIST OF COG0720 PROTEINS SEQUENCES USED TO BUILD THE MULTIPLE ALIGNMENTS AND THE PHYLOGENETIC TREE

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table
5-1 Testing the in vivo activity of different COG0720 protein derivatives
Organisms predicted to contain COG0720 enzymes with dual PTPS-I/III activities
A-1 List of Primers
B-1 List of Plasmids
C-1 List of Strains
D-1 Names and Abbreviations of tRNA Modifications Found in Archaea

LIST OF FIGURES

Figure
1-1 tRNA role in translation
tRNA secondary and tertiary structures
Maturation of tRNA in Archaea
Phylogenetic distribution of Archaea
Representatives of archaeal RNase P RNAs
Types of introns found in Archaea
Representation of a bulge helix bulge (BHB) and relaxed bulge helix loop (BHL)
Selected tRNA posttranscriptional modifications
Biosynthesis of Wyosine derivatives in Archaea
aTGT role in G+ biosynthesis
Crystal structure of aTGT from P. horikoshii (PDB IQ8)
Chemical structure of Archaeosine and Queuosine
The biosynthetic pathway of queuosine in Bacteria and Eukarya
Bacterial preQ0 biosynthetic steps used as a model to determine the preQ0 (G+) biosynthesis in H. volcanii
PCR and Southern blot verifications of the HVO_2001 chromosomal gene deletion
LC-MS/MS analysis of bulk tRNA extract from H. volcanii Δatgt derivative strains
Growth curve analysis of H. volcanii Δatgt (VDC3241) compared to H26 WT
PCR and Southern blot verifications of the HVO_2348 chromosomal gene deletion
dT auxotrophy phenotype of H. volcanii ΔfolE2 strain
LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔfolE2 strain
3-10 Growth curve analysis of H. volcanii ΔfolE2 (VDC3235)
Chromosomal topology of HVO_2348 in H. volcanii
Chromosomal topology of the H. volcanii preQ0 genes
PCR verifications for the chromosomal deletion of HVO_
LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔHVO_1718 and H26 WT strains
Complementation of G+ deficient phenotype by QueD homolog
PCR verifications for the chromosomal deletion of HVO_
PCR verification for the chromosomal deletion of HVO_
LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔHVO_1717 and H26 WT strains
Complementation of G+ deficient phenotype by QueE homolog
LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔHVO_1716 and H26 WT strains
Complementation of G+ deficient phenotype by QueC homolog
Comparison of aTGT and ArcS domains
PCR and Southern blot verifications for the HVO_2008 gene deletion
LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔatgtA2 (ArcS) derivatives
Phylogenetic distribution of ArcS, GAT-QueC, and QueF-like in Archaea
Structure based alignments of QueF and QueF-like
Proposed models for the last step in G+ biosynthesis
Alignments of representative aTGTs from Euryarchaea and Crenarchaea
LC-MS/MS analysis of bulk tRNA extracted from P. calidifontis
Construction of the E. coli heterologous systems
LC-MS/MS analysis of tRNA extracted from E. coli ΔqueF derivatives
4-8 LC-MS/MS analysis of bulk tRNA extract from E. coli ΔqueCΔqueF derivatives
Analysis of RNase T1 digest of tRNA(Asp)
Known or predicted roles of COG0720 (PTPS) proteins in GTP-derived metabolic pathways
Physical clustering of the four PTPS protein sub-families (I-IV)
Signature motifs obtained for COG0720 proteins
Evolutionary relationships of COG0720 family of proteins in 48 taxa
Spatial comparisons of PTPS crystal structures
Distribution of dual PTPS-I/III proteins in both Q and THF in specific organisms
Complementation of the E. coli ΔfolB dT auxotrophy phenotype by PTPS-I/III and PTPS-I from C. botulinum (Cb)
LC-MS/MS analysis of Q content in bulk tRNA extracted from E. coli ΔqueD derivative strains
Role of COG0720 proteins in Archaea
Mg2+ bound to tRNA
tRNA tertiary interactions
Phylogenetic distribution of tRNA modifications genes in archaeal extreme halophiles
LC-MS/MS analysis of tRNA extracted from H. walsbyi and H. volcanii
High Mg2+ concentration sensitive phenotype of G+ mutants
Cold sensitive phenotype of H. volcanii Δatgt
Cold sensitive phenotype of G+ deficient mutants

LIST OF ABBREVIATIONS

°C    Degrees Celsius
5-FOA    5-Fluoroorotic acid
A. baylyi    Acinetobacter baylyi sp ADP1
A. fulgidus    Archaeoglobus fulgidus
A. pyrophilus    Aquifex pyrophilus
A. pernix    Aeropyrum pernix
aaRS    Aminoacyl-tRNA synthetase
AMP    Adenosine 5'-monophosphate
Ampr    Ampicillin resistance
ATP    Adenosine 5'-triphosphate
BLAST    Basic Local Alignment Search Tool
BH4    Tetrahydrobiopterin
C-    Carboxyl
C. botulinum    Clostridium botulinum
C. jejuni    Campylobacter jejuni
C. maquilingensis    Caldivirga maquilingensis
C. elegans    Caenorhabditis elegans
CTP    Cytidine 5'-triphosphate
DMSO    Dimethyl sulfoxide
DNA    Deoxyribonucleic acid
DTT    Dithiothreitol
E. coli    Escherichia coli
EDTA    Ethylenediaminetetraacetic acid
H. cutirubrum    Halobacterium cutirubrum
H. lacusprofundi    Halorubrum lacusprofundi
H. marismortui    Haloarcula marismortui
H. pylori    Helicobacter pylori
H. salinarum    Halobacterium salinarum
H. volcanii    Haloferax volcanii
HPLC    High pressure liquid chromatography
Hv-Ca    H. volcanii minimal medium enhanced with casamino acids
Hv-Mm    H. volcanii minimal medium
I. hospitalis    Ignicoccus hospitalis
IPTG    Isopropyl-β-D-thiogalactopyranoside
K. aerogenes    Klebsiella aerogenes
Kanr    Kanamycin resistance
L. interrogans    Leptospira interrogans
LB    Luria-Bertani broth
LC    Liquid chromatography
M. acetivorans    Methanosarcina acetivorans
M. jannaschii    Methanocaldococcus jannaschii
M. kandleri    Methanopyrus kandleri
M. sedula    Metallosphaera sedula
mRNA    Messenger RNA
MS    Mass spectrometry
MS/MS    Tandem mass spectrometry
N-    Amino
N. equitans    Nanoarchaeum equitans
nm    Nanometer
Novr    Novobiocin resistance
OD    Optical density
OH-    Hydroxyl group
P. aerophilum    Pyrobaculum aerophilum
P. aeruginosa    Pseudomonas aeruginosa
P. calidifontis    Pyrobaculum calidifontis
P. falciparum    Plasmodium falciparum
P. furiosus    Pyrococcus furiosus
P. abyssi    Pyrococcus abyssi
P. horikoshii    Pyrococcus horikoshii
PCR    Polymerase chain reaction
pre-tRNA    Primary transcript of tRNA
R. norvegicus    Rattus norvegicus
RNA    Ribonucleic acid
rmsd    Root-mean-square deviation
rpm    Rotations per minute
S. aciditrophicus    Syntrophus aciditrophicus
S. acidocaldarius    Sulfolobus acidocaldarius
S. cerevisiae    Saccharomyces cerevisiae
S. coelicolor    Streptomyces coelicolor
S. enterica    Salmonella enterica
S. shibatae    Sulfolobus shibatae
S. solfataricus    Sulfolobus solfataricus
S. fumaroxidans    Syntrophobacter fumaroxidans
S. tokodaii    Sulfolobus tokodaii
SAM    S-adenosyl methionine
T. acidophilum    Thermoplasma acidophilum
T. kodakaraensis    Thermococcus kodakaraensis
T. neutrophilus    Thermoproteus neutrophilus
T. pallidum    Treponema pallidum
T. pendens    Thermofilum pendens
THF    Tetrahydrofolate
tRNA    Transfer RNA
UV    Ultraviolet
V. distributa    Vulcanisaeta distributa
XIC    Extracted ion chromatogram
Z. mobilis    Zymomonas mobilis
YPC    Yeast-Peptone-Casamino acids
βME    Beta-mercaptoethanol

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

BIOSYNTHESIS AND PHYSIOLOGICAL ROLE OF ARCHAEOSINE IN THE EXTREME HALOPHILIC ARCHAEON Haloferax volcanii

By

GABRIELA PHILLIPS

August 2011

Chair: Valerie de Crécy-Lagard
Major: Microbiology and Cell Science

Transfer RNA (tRNA) is one of the critical molecules in protein translation, as it is the adaptor molecule between the mRNA and the growing peptide. The primary tRNA transcript undergoes multiple processing steps to mature into a fully functional tRNA. One of the remarkable processing steps is the modification of the canonical encoded nucleotides (U, G, A, and C). These post-transcriptional modifications range from simple to complex. One of the complex modifications is archaeosine (G+). Archaeosine is found at position 15 of all archaeal tRNAs that bear a guanine at this position. The signature enzyme for G+ biosynthesis is tRNA guanine transglycosylase (aTGT). Despite the large number of biochemical studies on aTGT, most of the G+ biosynthesis steps and the physiological role of G+ remained unidentified. This study focuses on the identification of the genes involved in G+ biosynthesis and the phenotypic characterization of G+ deficient strains.

Archaeosine structurally resembles another complex tRNA posttranscriptional modification, queuosine (Q). Queuosine is found at position 34 of bacterial and eukaryotic tRNA(Asp), tRNA(His), tRNA(Asn), and tRNA(Tyr). The structural similarity of G+ and Q suggested similar early biosynthetic steps. The biosynthetic steps of Q are fairly well documented. We showed, using bioinformatic and genetic analyses, that archaeal homologs of the Q biosynthesis genes folE, queD, queC, and queE are involved in G+ biosynthesis. We constructed H. volcanii deletion strains for each of these four genes. The tRNAs extracted from these mutants were devoid of G+. The last biosynthetic step, the formation of the formamidino group, is specific to Archaea. To catalyze this last step in G+ formation, Archaea employ different enzymes: ArcS, GATII-QueC, or QueF-like. ArcS is present in Euryarchaea; GATII-QueC and QueF-like are present in Crenarchaea. We also showed that the COG0720 (QueD) superfamily contains promiscuous enzyme sub-families. Finally, we showed that the H. volcanii G+ deficient mutants exhibit cold-sensitive and high Mg2+ concentration-sensitive phenotypes, indicating possible roles of G+ in folding tRNA and protecting it from Mg2+-induced cleavage.

CHAPTER 1
tRNA BIOGENESIS IN ARCHAEA

Transfer RNA (tRNA) is one of the critical molecules in protein translation, as it is the adaptor between the mRNA and the polypeptide being synthesized. The tRNA molecule is both a physical and an informational link between the mRNA and the elongating peptide. tRNA binds the mRNA through the codon-anticodon interaction, and carries the elongating peptide or the incoming amino acid at its acceptor end. The specificity of the codon-anticodon interaction and the correct charging of the tRNA are the driving forces behind the genetic code (Ramakrishnan, 2002; Blanchard et al., 2004).

tRNA Role in Translation

tRNA has two functional sites. At one site, an activating enzyme covalently adds a specific amino acid, while the other functional site carries the anticodon specific for that amino acid (Figure 1-1). Each tRNA isoacceptor transfers one specific amino acid to a growing polypeptide chain as specified by the nucleotide sequence of the messenger RNA being translated. Accurate translation requires two essential steps: 1) the covalent attachment of the correct amino acid to the -CCA end of the tRNA, and 2) the correct selection of the amino acid-charged tRNA specified by the mRNA sequence (Ling et al., 2009).

The aminoacyl-tRNA synthetase (aaRS) adds the specific amino acid to the 2'- or 3'-OH of the terminal ribose of its cognate tRNA. The aaRS binds ATP and its corresponding amino acid to form an aminoacyl-adenylate and release inorganic pyrophosphate (PPi) (Ibba and Söll, 2000). Then, the adenylate-aaRS complex binds the appropriate tRNA molecule, and the amino acid is transferred from the aa-AMP to either the 2'- or the 3'-OH of the last tRNA nucleotide (A76) at the 3'-end. Although the chemical reactions are similar for the 20 aminoacyl-tRNA synthetases, they are classified in two groups: Class I and Class II. The main difference between the two classes is that Class I enzymes attach the amino acid to the 2'-OH group of the terminal nucleotide of the tRNA, and Class II enzymes attach the amino acid to the 3'-OH group of the terminal nucleotide of the tRNA (Garrett and Grisham, 1995; Ling et al., 2009). Some aaRSs also catalyze the pre-transfer editing of misactivated aminoacyl-adenylates (hydrolysis of the misactivated amino acid) and/or the post-transfer editing of misacylated tRNA (hydrolysis of the mischarged tRNA) (Ling et al., 2009; Lue and Kelley, 2005).

Protein synthesis takes place on ribosomes (Figure 1-1). Ribosomes are large macromolecular assemblies composed of approximately 60 percent ribosomal RNA (rRNA) and 40 percent protein. The main role of the ribosome is to place the mRNA, the aminoacyl-tRNA, and the appropriate protein factors in their correct positions relative to one another. Ribosomes contain three adjacent tRNA binding sites: 1) the aminoacyl binding site (A site) for the tRNA carrying the incoming amino acid, 2) the peptidyl binding site (P site) for the tRNA carrying the growing peptide chain, and 3) the exit site (E site), from which used tRNA molecules are discharged from the ribosome (Figure 1-1). Components of ribosomes, including the rRNA, catalyze at least some of the reactions involved in peptide bond formation (Garrett and Grisham, 1995; Ramakrishnan, 2002; Voet and Voet, 2004).

EF-Tu (or eEF-1 in eukaryotes and archaea) binds to the charged tRNA to form a ternary complex. This complex transiently enters the ribosome, with the tRNA anticodon domain pairing with the mRNA codon in the ribosomal A site (Figure 1-1). If the codon-anticodon pairing is correct, EF-Tu hydrolyzes GTP to GDP and inorganic phosphate, and changes its conformation to dissociate from the tRNA molecule. Then, the aminoacyl-tRNA fully enters the A site, where its amino acid is brought near the P-site polypeptide, and the ribosome catalyzes the covalent transfer of the polypeptide onto the amino acid (Figure 1-1) (Ibba and Söll, 2000; Valle et al., 2002).

The accuracy of translation of the genetic code depends on the attachment of each amino acid to the appropriate tRNA. The specificity of aminoacylation ensures that the tRNA carries the amino acid encoded by the codon with which it pairs; the ribosome controls the topology of the interaction such that only a single triplet of nucleotides is available for pairing (Ibba and Söll, 2000). Base-paired polynucleotides are always antiparallel, and mRNA is read in the 5' to 3' direction. Thus, the first nucleotide of the codon pairs with nucleotide 36 of the tRNA, the second with nucleotide 35, and the third with nucleotide 34 (Yarian et al., 2002).

tRNA Structure

tRNAs are small RNAs ranging from 73 to 93 nucleotides in length (Garrett and Grisham, 1995; Rich and RajBhandary, 1976; Voet and Voet, 2004). tRNA molecules assume a secondary structure composed of four base-paired stems in a cloverleaf arrangement (Voet and Voet, 2004) (Figure 1-2A). The four leaves are the acceptor arm, the D arm, the anticodon arm, and the TψC arm. An additional variable arm is sometimes present between the TψC arm and the anticodon arm.

The acceptor arm ends in the 3'-terminal sequence -CCA, to which the amino acid is appended by aaRSs to form the amino acid-charged tRNA. The CCA may be genetically encoded or enzymatically added to the immature tRNA. At the 5' end, tRNAs carry a 5'-terminal monophosphate group. The 7-bp acceptor stem includes the 5'-terminal nucleotides and may contain non-Watson-Crick (WC) base pairs such as G-U. The anticodon arm consists of a 5-base-pair stem ending in an anticodon loop that carries the anticodon, which is complementary to the codon specifying the tRNA's corresponding amino acid. The anticodon loop lies opposite the acceptor stem. The D arm consists of a 3- or 4-base-pair stem ending in a loop that commonly contains the modified base dihydrouridine (D) (Figure 1-2A). The TψC arm consists of a 5-base-pair stem ending in a loop that often contains the sequence T-pseudouridine (ψ)-C (TψC) (Figure 1-2A). The modified nucleoside pseudouridine, as well as ribothymidine (T), inosine (I), and hypermethylated purines, are also found in this loop. The variable arm, which consists of 3 to 21 nucleotides, shows the greatest variability among tRNAs. All tRNAs have 15 invariant positions (with strictly conserved nucleosides) and 8 semi-invariant positions (with only purines or only pyrimidines) that occur mostly in the loop regions. The purine on the 3' side of the anticodon (position 37) is invariably modified (Rich and RajBhandary, 1976; Voet and Voet, 2004).

Each cloverleaf further assumes a tertiary, L-shaped conformation (Figure 1-2B). One end of the L-shaped tRNA is formed by the acceptor and T stems folded into a continuous double helix; the other end consists of the D and anticodon stems (Rich and RajBhandary, 1976; Voet and Voet, 2004) (Figure 1-2B). The tRNA tertiary structure is stabilized through hydrogen bonding between bases. Helical regions are stabilized by Watson-Crick (WC) and non-WC base pairing. Non-helical regions are stabilized by hydrogen bonding interactions between two or three bases that are not usually complementary to each other, and by hydrogen bonds between bases and either phosphate groups or the 2'-OH groups of the ribose residues (Rich and RajBhandary, 1976).
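As a minimal illustration of the antiparallel codon-anticodon pairing described under tRNA Role in Translation, the following Python sketch derives the anticodon for a codon and lists which tRNA position (36, 35, 34) pairs with each codon nucleotide. The function names anticodon_for and pairing_table are hypothetical and introduced here only for illustration. The sketch assumes strict Watson-Crick pairing and deliberately ignores wobble pairing and modified nucleosides (such as queuosine at position 34), so it should be read as a simplification rather than a model of decoding.

```python
# Watson-Crick complements for RNA nucleotides (assumption: no wobble pairs,
# no modified nucleosides).
WC_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the Watson-Crick anticodon, written 5'->3', for a codon written 5'->3'."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in WC_COMPLEMENT for base in codon):
        raise ValueError("codon must be three RNA nucleotides (A, C, G, U)")
    # The strands are antiparallel, so the anticodon read 5'->3' is the
    # reverse complement of the codon.
    return "".join(WC_COMPLEMENT[base] for base in reversed(codon))


def pairing_table(codon):
    """List (codon position, codon base, tRNA position, tRNA base) for each pair."""
    codon = codon.upper()
    anticodon = anticodon_for(codon)
    # anticodon[0] sits at tRNA position 34, anticodon[1] at 35, anticodon[2] at 36.
    return [(i + 1, codon[i], 36 - i, anticodon[2 - i]) for i in range(3)]


if __name__ == "__main__":
    for row in pairing_table("GAC"):  # GAC is an aspartate codon
        print("codon position %d (%s) pairs with tRNA position %d (%s)" % row)
```

Running the script on the aspartate codon GAC shows the first codon nucleotide pairing with position 36 and the third with position 34, the position at which queuosine occurs in bacterial and eukaryotic tRNA(Asp).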

23 trna Processing In E. coli, trnas are encoded in the chromosome by 60 genes; some of them are components of rrna operons, and others are dispersed (frequently in operons) all over the chromosome (Voet and Voet, 2004). The trna primary transcripts are different from the physiologically active trna molecules. Mature trnas have a 5 -monophosphate end, are smaller than the primary transcripts, and contain unusual bases that are not present in the primary transcripts (Garrett and Grisham, 1995). Thus, to be converted into its physiologically active form, the primary trna transcript undergoes a series of transformations: the removal of 5 and 3 leader sequences, the addition of a universally conserved -CCA sequence in some species, the removal of introns, and the covalent modification of nucleosides (Phizicky and Hopper, 2010) (Figure 1-3). With the exception of the 5 end processing that is carried out by a ribonucleoprotein particle, all other processes are carried out by protein enzymes. trna Processing in Archaea Archaea is a group of organisms unique in its intriguing ability to thrive in extreme environmental conditions: very high salinity, temperature, pressure, and ph variations. This ability is ensured by the increased stability of its nucleic acids - among other adaptations. A critical molecule in translation is trna. Translation accuracy depends on trna stability. The stability of trna correlates with its processing features. It is the purpose of this section to review and asses the present understanding of archaeal trna processing. Overview of Archaea Domain Initially, Archaea were inappropriately classified as bacteria due to their prokaryotic morphology but were later reclassified. The archaeal domain was 23

24 discovered more than 30 years ago by Carl Woese who used small-subunit (SSU) ribosomal-rna sequences as a universal molecular clock to show the differences between Archaea and Bacteria (Woese, 1987; Woese and Fox, 1977). Archaea share similarities with both Bacteria (metabolic functions) and Eukarya (informationprocessing functions). Archaea, however, have unique and distinct features. For example, Archaea have cell membranes that contain isoprene side chains that are ether-linked to glycerol 1-phosphate while in bacterial and eukaryal membrane fatty acids are ester linked to the stereoisomer glycerol-3-phosphate (G3P) (Kates, 1993). The SSU rrna tree revealed that within the Archaea domain there are several biologically different phyla (Figure 1-4) (Gribaldo and Brochier-Armanet, 2006). The archaeal domain comprises two main phyla: Euryarchaea and Crenarchaea. Euryarchaea is the most inclusive as it encompasses the greatest phenotypic diversity among identified cultivable species; examples include many halophiles, methanogens and thermoacidophiles (Figure 1-4) (Forterre et al., 2002). Halophiles, including the genus Halobacterium, live in extremely saline environments (20-25% w/v salt). They are responsible for the red color of the salt lakes due to the C-50 carotenoid pigments present in the cell walls (Oren, 2002a). Methanogens are microorganisms that produce methane as a metabolic byproduct in anoxic conditions. They are common in wetlands, in the guts of animals such as ruminants and humans, and in marine sediments. Euryarchaea also contain some thermoacidophiles (Forterre et al., 2002) that live mostly in hot springs and/or within deep ocean vent communities. The other main phylum of Archaea is Crenarchaea. At first, Crenarchaea have been considered thermophilic or hyperthermophilic organisms with some able to grow at 24

25 up to 113 C. Recent PCR detection methods have detected Crenarchaea in temperate and cold habitats (Forterre et al., 2002). Lately, three other phyla have been tentatively created: Nanoarchaea, which contains N. equitans (Huber et al., 2002), Korarchaea, which contains a small group of unusual thermophilic species, and Thaumarchaea, which contains organisms that are chemolithoautotrophic ammonia-oxidizers (Brochier- Armanet et al., 2008). Most Archaea contain between 40 and 50 different trna molecules that decode one or more of the 60 different sense codons. The archaeal genes that encode the different trnas are either individually transcribed, cotranscribed with other trna genes, or cotranscribed with other types of genes (Cavicchioli, 2006). As in Bacteria and Eukarya, the archaeal trna nascent transcripts (pre-trna) are longer than the mature trna and require 5 and 3 ends trimming and processing (Figure 1-3). Maturation of trna 5 -end Ribonuclease P (RNase P) is a ubiquitous endoribonuclease found in all domains of life including chloroplasts and mitochondria. Its main activity is the formation of mature 5'-ends of trnas by cleaving the 5'-leader elements of precursor-trnas leaving a 5 -terminal monophosphate (Kirsebom, 2007). RNase P functions as a RNA-protein complex which is comprised of a conserved RNA plus a varying number of proteins - depending of the domain of life. Bacterial RNase P contains one protein whereas the eukaryotic RNase P contains nine or ten proteins (Kirsebom, 2007). In Archaea, the RNase P ribonucleoprotein complex contains four protein subunits (POP5, RPP30, RPP21, and RPP29) that are associated with one RNA (Kirsebom, 2007). The trans-acting catalytic function of the RNase P is retained by the RNA component which has two major domains, the specificity domain (S) that recognizes the 25

26 T-arm and acceptor stem of the pre-trna, and the catalytic domain (C) (Kirsebom, 2007); (Ellis and Brown, 2009) (Figure 1-5). Archaeal RNase P RNAs (RPR) are of two types: type A (the ancestral type) that has homology with the bacterial ancestral RPR, and type M (the M. jannaschii type) that lacks some of the structural elements (P8, L15 and P16-P17 region) involved in substrate binding in bacterial RPR (Kirsebom, 2007; Kirsebom and Trobro, 2009) (Figure 1-5). Only the type A RPR has catalytic activity in the absence of proteins in vitro; in vivo, it needs the protein complex for catalysis (Kirsebom, 2007). The type M RPR was shown to have no activity by its own; this might be due to the missing regions that prevent the RPR to bind the substrate (Kirsebom, 2007). Recently, a new type of RNase P RNA (Type T) was found in Pyrobaculum sp. and in the related C. maqulingensis and V. distributa (Figure 1-5) (Ellis and Brown, 2009; Lai et al., 2010). The new type of RPR retains the conventional catalytic domain but lacks the recognizable specificity domain. In vitro biochemical assays showed that, indeed, the new RNase P cleaves the 5 -leader pre-trna (Ellis and Brown, 2009; Lai et al., 2010). The cleavage activity of the RNase P depends on the presence of high ionic strength. High concentrations of Mg 2+ (300 mm) increase the activity of RPR. Mg 2+ can be replaced by Mn 2+ even though Mn 2+ increases nonspecific cleavage (Liu et al., 2010). The RNase P proteins (RPP) form two binary complexes (POP5 RPP30 and RPP21 RPP29) (Pulukkunat and Gopalan, 2008; Xu et al., 2009). The complex formed between POP5 RPP30 was shown to increase the rate of pre-trna cleavage (by 60- fold) while the other binary complex increased the substrate affinity (by 16-fold) (Chen et al., 2010). NMR, X-ray, and enzymatic footprint analysis of the two binary complexes 26

27 showed the interactions between the two complexes, and between RNA and RPP21 RPP29. The RPP21 RPP29 complex interacts with the C-domain of RPR. The RPP30 POP5 interacts with S-domain of RPR and with the pre-trna substrate on two distinct sites (Liu et al., 2010; Liu et al., 2010; Xu et al., 2009). Lately, the Gopalan laboratory showed that the ribosomal protein L7Ae is the fifth subunit of the archaeal RNAse P complex (Cho et al., 2010). The addition of the L7Ae protein to the Archaeal RNase P complex increased both optimal reaction temperature and k cat /K m (by about 360-fold) for pre-trna cleavage (Cho et al., 2010). It has recently been argued that the archaebacterium N. equitans does not possess RNase P (Kirsebom, 2007; Randau et al., 2008). Computational and experimental studies did not find evidence of its existence (Randau et al., 2008). In this organism, the trna promoter is close to the trna gene, and it is thought that transcription starts at the first base of the trna thus removing the requirement for RNase P (Randau et al., 2008). Another case of unusual maturation of 5 -end of trna is the presence in Methanosarcinales of G-1 adding enzyme called trna His -guanylyl-transferase (Thg1). Thg1 adds an extra guanine at the -1 position of trna after the removal of the 5 leader by RNase P. In most organisms, RNase P removes the 5 leader at position 1 of pretrna. However, in S. cerevisiae, RNase P removes the 5 -leader of pre-trna His at position -1 (Jackman and Phizicky, 2006). To allow recognition by histidyl-trna synthase, G-1 is added, by Thg1, to trna His after RNase P cleavage (Heinemann et al., 2010). Homologs of Thg1 were found in Bacteria and Archaea which genetically encode trna His G-1 (Rao et al., 2011). M. acetivorans encodes the G-1 in its trnas. It was 27

28 shown, in vitro, that M. acetivorans Thg1 homolog has the guanine adding activity on trna His transcript lacking G-1 (Rao et al., 2011). However, the physiological role of M. acetivorans Thg1 remains to be unraveled. Maturation of trna 3 -end While trna 5'-processing by RNase P is similar in all kingdoms, trna 3'-end maturation differs from one domain of life to another. Bacteria use a multistep process involving endo- and exonucleases (Blum, 2008). Eukarya and Archaea use mainly one endonuclease (trnase Z) and one transferase enzyme (trna nucleotidyltransferase) (Cavicchioli, 2006). trnase Z cleaves the 3 -end trna trailer immediately after the first unpaired base extending on the 3 -end from the trna acceptor stem and leaves a 3- hydroxyl group to allow the addition of the CCA trinucleotide by the trna terminal transferase enzyme (Vogel et al., 2005). trnase Z belongs to the family of zincdependent metallo-hydrolases of the β-lactamase superfamily with a Zn-coordination signature motif HXHXDH where X represents any amino acids (Ishii et al., 2005). trnase Z occurs in two forms. The long form, trnase Z L ( amino acids long), is found mainly in eukaryotes. The short form, trnase Z S ( amino acids long), is found in all three domains of life (Hartmann et al., 2009; Vogel et al., 2005) Archaea possess the short form, trnase Z S. The enzymes from P. furiosus, M. jannaschii, H. volcanii and P. aerophilum were heterologously expressed and purified. All four enzymes have trna-processing activity in vitro (Hartmann et al., 2009; Holzle et al., 2008; Schierling et al., 2002; Späth et al., 2008). trnase Z from H. volcanii is a homodimer that requires Mn 2+ and Zn 2+ for activity. It is a trna specific enzyme inhibited by high salt (KCl) concentrations, in vitro, (Schierling et al., 2002) although H. volcanii is an extreme halophile. trnase Z from P. furiosus is similar to the H. volcanii 28

protein, but it is able to cleave alternative substrates such as introns in pre-tRNA and pre-tRNA 5'-leaders (Späth et al., 2008; Späth et al., 2007). There are no archaeal tRNase Z crystal structures available, but the structure of a bacterial tRNase Z has been solved. The enzyme is a dimer of metallo-β-lactamase domains. Each domain has a protruding flexible arm that has a role in substrate binding (Redko et al., 2007; Vogel et al., 2005). The catalytic site of each domain binds one or two Zn2+ ions, depending on whether substrate is bound. The crystal structure shows that when the enzyme is bound to tRNA, two Zn2+ ions occupy the catalytic pocket; when it is not bound to tRNA, only one Zn2+ is bound in the catalytic site (Ishii et al., 2005). Each dimer accommodates two tRNAs. Solving the structure of an archaeal tRNase Z would reveal how the archaeal enzyme binds and cleaves the 3'-trailer of pre-tRNA.

Mature tRNA contains a -CCA sequence at the 3'-terminus. The -CCA terminal sequence plays an important role in translation. On one hand, it is the aminoacylation site. On the other hand, it provides key interactions between the tRNA molecule and the A and P sites of the large-subunit rRNA (Betat et al., 2010; Cavicchioli, 2006; Voet and Voet, 2004). The -CCA sequence is not encoded in many bacterial and archaeal tRNA genes, nor in nearly all eukaryotic tRNA genes (Betat et al., 2010). Thus, tRNA maturation requires an essential polymerase to catalyze the posttranscriptional addition of the -CCA end. This enzyme is the tRNA nucleotidyltransferase, which uses ATP and CTP as substrates but does not require a nucleic acid template (Cavicchioli, 2006; Minagawa et al., 2004; Voet and Voet, 2004). Two types of tRNA nucleotidyltransferase have been found (Class I and Class II). The two classes exhibit strong core

homology (the nucleotidyltransferase motif); however, there is no homology outside the core (Yue et al., 1996). Archaeal tRNA nucleotidyltransferase is of the Class I type (Xiong and Steitz, 2004). A single active site incorporates both CTP and ATP, as shown by mutational analysis of the S. shibatae enzyme (Cho and Weiner, 2004). The addition of C and A requires two Mg2+ ions per molecule, which specifically promote synthesis of the correct -CCA (Hou et al., 2005). Crystal structures of the A. fulgidus enzyme co-crystallized with different substrates (tRNA-C, tRNA-CC, and tRNA-CCA) gave insights into the mechanism of -CCA addition. The tRNA does not translocate or rotate during the addition of C75 and A76 (Cho et al., 2005; Cho and Weiner, 2004; Xiong et al., 2003). The archaeal tRNA nucleotidyltransferase binds CTP and ATP specifically, excluding GTP and UTP, through hydrogen-bonding interactions of the nucleotides with Arg224 and the backbone phosphates of the tRNA (Martin et al., 2008; Xiong et al., 2003; Xiong and Steitz, 2004). The tRNA 3'-end interacts with the nucleotide to be incorporated. Here, the backbone phosphates interact with the bound CTP or ATP and additionally help to position Arg224 in the correct orientation (Martin and Keller, 2007; Pan et al., 2010; Tomita et al., 2006). These specific interactions occur in conjunction with a sequential rearrangement of the binding pocket to accommodate the growing 3' end. Hence, Class I enzymes recognize and select the correct nucleotides not as pure protein-based enzymes but as ribonucleoproteins, in which the tRNA is not just a substrate molecule (primer) but an active part of the nucleotide-binding pocket.

Introns in Archaeal tRNA Transcripts

Intron removal is another important step in tRNA maturation. Introns are intervening sequences that disrupt the exon-coding regions of genes. Introns are transcribed by RNA

polymerase and removed from the initial transcript by endonucleolytic excision (Cavicchioli, 2006). They are found in all domains of life. Introns are found in bacterial tRNA (Class I introns) (Garrett and Grisham, 1995). These Class I introns are self-splicing catalytic RNAs that carry out both the phosphodiester cleavage and the ligation reactions to remove the non-coding intervening sequences and connect the mature tRNA sequences (Garrett and Grisham, 1995). At least 20% of eukaryotic tRNA transcripts contain one intron, found specifically between positions 37 and 38 (Marck and Grosjean, 2002; Randau and Soll, 2008). The eukaryotic splicing endonuclease recognizes position 37 and removes the intron (Abelson et al., 1998). After the intron is removed, the exons are ligated by the eukaryotic tRNA splicing ligase (Abelson et al., 1998; Greer et al., 1983). Finally, a 2'-phosphotransferase removes the 2'-phosphate left at the ligation junction (Calvin and Li, 2008).

Approximately 15% of archaeal tRNA genes contain introns, with the highest frequency (up to 70%) in Thermoproteales (Sugahara et al., 2008). These introns vary from 11 to 129 nucleotides in length (Heinemann et al., 2010). Four types of archaeal tRNA genes have been identified. Type one, the nonintronic tRNA, is encoded by a single gene with no introns (Figure 1-6). Type two, the intron-containing tRNA, is encoded by a single gene with a maximum of three or four introns (Figure 1-6). Most introns in single-intron tRNA genes are found between positions 37 and 38 (Kaine et al., 1983; Marck and Grosjean, 2003). tRNA genes containing multiple introns, present at various positions, are found exclusively in Crenarchaea (Heinemann et al., 2010; Randau et al., 2005). Type three, the trans-split tRNA (split-tRNA) (Figure 1-6), was

initially found only in N. equitans, with the 5' and 3' halves encoded by two separate genes (Randau et al., 2005). Recently, it was shown that some Crenarchaea (C. maquilingensis and Pyrobaculum sp.) also have split-tRNAs. These organisms were also found to contain tri-split-tRNAs, in which the tRNA is composed of three individual transcripts (Fujishima et al., 2009). Each section of the split tRNA contains flanking leader sequences at the 5' and 3' ends that are complementary to each other. These flanking sequences are subsequently cleaved by the same endonuclease, and the sections are ligated (trans-spliced) to form the mature tRNA (Sugahara et al., 2009). The type four pre-tRNA, found in the crenarchaeon T. pendens, is the intron-containing permuted tRNA, in which the 3' half of the tRNA occurs upstream of the 5' half and the tRNA gene contains an endogenous intron (Chan et al., 2011; Fujishima et al., 2009; Sugahara et al., 2009) (Figure 1-6).

Intron-containing, split, and permuted archaeal pre-tRNAs share a common bulge-helix-bulge (BHB) consensus motif around the intron/leader-exon boundaries that can be cleaved by the same tRNA splicing endonuclease (Chan et al., 2011; Marck and Grosjean, 2003; Randau et al., 2005). The BHB motif consists of two three-nucleotide bulges separated by a four-base-pair helix (Figure 1-7A). Several archaeal pre-tRNAs contain a relaxed form of the BHB motif, called bulge-helix-loop (BHL), which comprises a single three-nucleotide bulge and an internal loop separated by a four-base-pair helix (Figure 1-7B) and is found mostly in Crenarchaea (Marck and Grosjean, 2003). These motifs are necessary and sufficient for the archaeal endonucleases to recognize and cleave most splicing sites (Calvin and Li, 2008).
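As a purely illustrative sketch of the split-tRNA assembly described above, the following toy model checks that the flanking sequences of two halves are complementary and then joins the exons. The function names, the fixed flank length, and the sequences are hypothetical and invented for illustration; the model ignores the actual BHB geometry that the splicing endonuclease recognizes.

```python
# Toy model of split-tRNA trans-splicing (illustrative assumptions only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick only)."""
    return rna.translate(COMPLEMENT)[::-1]


def trans_splice(five_half: str, three_half: str, flank_len: int) -> str:
    """Join two split-tRNA transcripts after removing their flanks.

    five_half  = 5' exon followed by a 3' flanking sequence
    three_half = 5' flanking sequence followed by the 3' exon
    The flanks must pair with each other, as described for split tRNAs.
    """
    if flank_len <= 0:
        raise ValueError("flank_len must be positive")
    flank_a = five_half[-flank_len:]
    flank_b = three_half[:flank_len]
    if flank_b != reverse_complement(flank_a):
        raise ValueError("flanking sequences are not complementary")
    # Cleavage removes both flanks; ligation joins the exons.
    return five_half[:-flank_len] + three_half[flank_len:]


# Hypothetical sequences, not taken from any real tRNA gene.
five_half = "GCGGAUUUAG" + "GGCAUC"
three_half = "GAUGCC" + "CUCAGUUGGG"
print(trans_splice(five_half, three_half, flank_len=6))
```

A real analysis would locate the cleavage sites from the predicted secondary structure rather than from a fixed flank length.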

There are three forms of archaeal tRNA splicing endonuclease, as shown by the solved structures: a heterotetramer (α2β2) found in Crenarchaea, a homodimer (α2), and a homotetramer (α4). The last two forms are found in Euryarchaea (Calvin and Li, 2008; Randau et al., 2005; Tocchini-Valentini et al., 2005). The homotetrameric endonuclease (α4) from M. jannaschii is organized as a dimer of dimers, with one subunit from each dimer participating in catalysis. The other subunit acts to stabilize the dimer (Li et al., 1998). The homodimeric splicing endonuclease (α2) from A. fulgidus has the same overall shape as the homotetrameric endonuclease, but the subunit organization is different. Each subunit contains two similar repeating domains; the N-terminal domain acts to stabilize the dimer, and the C-terminal domain, homologous to the subunit of the homotetrameric enzyme, catalyzes the cleavage reaction (Li and Abelson, 2000). The heterotetrameric (α2β2) endonuclease from N. equitans is a dimer of two heterodimers. The catalytic subunits (α2) are arranged diagonally to the structural subunits (β2). The enzyme is functional only when the two heterodimers come together (Mitchell et al., 2009). The euryarchaeal splicing endonuclease is more stringent in recognizing the canonical BHB motif. Substrate recognition by the crenarchaeal splicing endonuclease is more relaxed, so that it also recognizes the alternative form (BHL) of the BHB motif (Calvin and Li, 2008).

Cleavage of introns by the archaeal tRNA splicing endonuclease leaves a 3'-half beginning with a 5'-hydroxyl and a 5'-half ending in a 2',3'-cyclic phosphate. These tRNA halves are ligated together by a 3'-P RNA splicing ligase (RNL) (Calvin and Li, 2008). Although this reaction has been known for more than 30 years, the enzyme responsible for the ligation was identified only recently in M. kandleri and P.

aerophilum. The recombinant enzyme was purified and biochemically characterized. The protein belongs to the RtcB (RNA-splicing ligase) enzyme family. It was shown in vitro that the enzyme joins two spliced tRNA halves. The joining phosphodiester linkage contains the phosphate originally present in the 2',3'-cyclic phosphate. The crystal structure of the RtcB homolog from P. horikoshii shows a new protein fold with a conserved putative Zn2+-binding cleft. Indeed, in vitro studies showed that Zn2+ is required for catalysis, with no ATP or GTP requirement (Englert et al., 2011). A phylogenetic distribution analysis of the RtcB family showed that homologs of RtcB are present in all three domains of life with the exception, in Eukarya, of fungi and vascular plants (Englert et al., 2011). Yeast uses the Class I 5'-P RNL exclusively to ligate tRNA halves, and vascular plants use the Class II 5'-P RNL (Englert and Beier, 2005).

M. kandleri C-to-U tRNA Editing

RNA editing has been defined as a programmed alteration of RNA primary structure that generates a sequence that could have been directly encoded at the DNA level (Grosjean and Benne, 1998). M. kandleri encodes a cytidine at position 8 in about 30 (out of 34) tRNA genes, whereas the mature tRNAs possess a uridine at this position. The uridine at position 8 forms a reverse Hoogsteen interaction with A14 that is critical to maintain the stability of the sharp kink between the acceptor stem and the A9 base of tRNA (Westhof et al., 1985). Thus, C8 must be converted to U to allow the interaction with A14, which is present in all tRNAs. The M. kandleri enzyme responsible for this editing reaction (CDAT8) was recently identified and characterized (Randau et al., 2009). The recombinant protein was purified and shown to have C-to-U editing activity in vitro on M. kandleri tRNA(His) transcripts. The crystal structure revealed that the

enzyme is a dimer and that each monomer consists of three domains. The N-terminal domain is a cytidine deaminase domain with the cytidine deaminase signature motif ({HAEX(n)PCX(2)C}), in which the His and the two Cys residues form the Zn2+-binding site. The second is a central ferredoxin-like domain. The C-terminal domain is a THUMP domain (a tRNA-binding domain) (Randau et al., 2009). The cytidine deaminase domain together with the THUMP domain recognizes and binds tRNA to deaminate cytidine into uridine at position 8; thus, the enzyme introduces the C8-to-U8 change in M. kandleri tRNAs (Heinemann et al., 2010). Thus far, C8 has been found only in M. kandleri tRNA genes. As more archaeal genomes are sequenced, it will be interesting to observe whether this mutation occurs in other Archaea or is specific to M. kandleri.

Posttranscriptional Modifications of tRNA Nucleosides

tRNA transcripts contain only the canonical RNA nucleosides adenosine (A), uridine (U), cytidine (C), and guanosine (G), whereas mature tRNAs contain modifications of the canonical nucleosides. These modifications can be simple (methylation of the base or the ribose) or complex (addition of an entire functional group) (Grosjean et al., 2008) (Figure 1-8). The proportion of such modified nucleotides in tRNA can approach 50% (Grosjean et al., 2008). To date, more than 80 modified nucleosides have been identified at about 60 different tRNA positions. A few of them, together with the corresponding standard abbreviations, are shown in Figure 1-8. The physiological role of most tRNA posttranscriptional nucleoside modifications is not completely elucidated. However, for a few of them, in vivo, in vitro, and in silico studies showed that tRNA modifications located within or around the anticodon loop fine-tune the interactions of tRNA molecules with other partners of the translation apparatus. When located outside the anticodon region, modifications in tRNA confer

important mechanisms for tRNA stabilization (Grosjean et al., 2008). Some modifications, such as m1G37, t6A37, and ψ55, are universally distributed; others are specific to a given domain, such as G+15, Cm56, and m1ψ54, which are found only in Archaea (Appendix D). In Archaea, 47 tRNA modifications have been identified, but most of the information about the exact locations of these modifications in the tRNA is known only for H. volcanii, which contains a total of only 15 modifications at 19 positions (Gupta, 1984; Gupta, 1986). For the remaining 32 modifications found in other archaeal species, the information is scarce.

The field of posttranscriptional modifications of archaeal tRNAs was pioneered by the analytical work of W. McCloskey and P.F. Crain, who identified tRNA modifications in phylogenetically diverse Archaea (Edmonds et al., 1991; Kowalak et al., 1994; McCloskey et al., 2001), and by R. Gupta, who sequenced all tRNA molecules of the extreme halophile H. volcanii (Gupta, 1984; Gupta, 1986). The analysis of tRNA extracted from Archaea living at different temperatures showed an increased number of modifications in hyperthermophile tRNA (Edmonds et al., 1991; Kowalak et al., 1994). McCloskey postulated that posttranscriptional modifications in archaeal thermophiles play major roles in tRNA stabilization under extreme conditions (Kowalak et al., 1994). Later, M. Helm and Y. Motorin extensively reviewed the importance of posttranscriptional modifications in tRNA folding and stability (Motorin and Helm, 2010). Because of the large number of modifications found in archaeal tRNA, I focus here on the function and synthesis of tRNA modifications specific to Archaea.

Agmatidine, a recently discovered tRNA modification essential for decoding

Decoding Ile AUA codons in Bacteria requires the essential modification lysidine (k2C), which is derived from lysine. Almost all Bacteria possess tRNA(Ile)CAU to decode AUA codons. In these tRNAs, C34 is modified to lysidine by lysidine synthetase (TilS) (Suzuki and Miyauchi, 2010). The k2CAU anticodon decodes the AUA codon, not the methionine-encoding AUG codon (Suzuki and Miyauchi, 2010). Lysidine is also an anti-determinant for charging by methionyl-tRNA synthetase (Suzuki and Miyauchi, 2010). For this reason, the tilS gene is essential in Bacteria (Suzuki and Miyauchi, 2010). Recently, the RajBhandary group discovered that while most Archaea use tRNA(Ile)CAU to decode AUA codons, N. equitans uses tRNA(Ile)UAU (Kohrer et al., 2008). Archaea also require a cytosine modification of their tRNA(Ile)CAU (Gupta, 1984; Kohrer et al., 2008). The tRNA(Ile) extracted from H. marismortui showed the presence of agmatidine (C+ or agm2C) (Figure 1-8). Agmatidine, a newly discovered modification derived from agmatine, is present at position 34 instead of lysidine. tRNA extracted from other archaeal organisms (M. maripaludis and S. solfataricus) also showed the presence of agmatidine, suggesting that agm2C is present in both Euryarchaea and Crenarchaea (Mandal et al., 2010). COG1571 was identified as a candidate gene family for agm2C formation because it contains a putative OB-fold DNA/RNA-binding domain (Grosjean et al., 2008). Biochemical assays showed that the COG1571-encoded enzyme (TiaS) was indeed responsible for agmatidine formation. TiaS uses agmatine and ATP as substrates (Ikeuchi et al., 2010). In the hyperthermophile T. kodakaraensis, agmatine is essential for polyamine biosynthesis (Fukuda et al., 2008; Grosjean et al., 2008). Because the H. volcanii COG1571-encoding gene HVO_0339 (tiaS) could be disrupted only when an additional copy was present in trans (Blaby et al., 2010), agm2C is likely to be essential

for survival. Nevertheless, structural and mechanistic studies are needed to understand the catalytic mechanism behind the formation of agm2C.

Wyosine derivatives biosynthesis pathways in Archaea

Wyosine derivatives are some of the most structurally complex tRNA ribonucleoside modifications. Most archaeal tRNAs analyzed to date contain yW derivatives, some of which, such as imG2 and mimG, are specific to Archaea (de Crécy-Lagard et al., 2010) (Figure 1-9). It has long been known that Trm5, the methyltransferase responsible for the m1G modification at position 37, is also the first enzyme in the yW pathway (Björk et al., 2001; Droogmans and Grosjean, 1987). The remainder of the yW pathway was elucidated in yeast through the efforts of several groups (Suzuki et al., 2009; Urbonavicius et al., 2009). As discussed below, recent studies revealed that yW biosynthesis in archaeal organisms is complex.

The enzyme that catalyzes the formation of m1G37 in Archaea is a member of the Trm5 family found in Eukarya, but it is distinct from the TrmD family that catalyzes the identical reaction in Bacteria (Björk et al., 2001). TrmD belongs to the Class I family of AdoMet-dependent methyltransferases, which do not require the L shape of tRNA, while Trm5 belongs to the Class II family of methyltransferases, which do require the L shape (Brulé et al., 2004; Christian and Hou, 2007; Goto-Ito et al., 2008). In keeping with their structural differences, enzymological studies have shown that the Trm5 and TrmD enzyme families have distinct kinetic profiles (Christian et al., 2010). Structural analysis of the M. jannaschii Trm5 enzyme in complex with tRNA allowed the identification of the elements involved in G37 recognition and tRNA binding (Goto-Ito et al., 2009). This led to the hypothesis that recognition of tRNA by Trm5 might provide a checkpoint for a mature tRNA (Goto-Ito et al., 2009). While an understanding of these

two analogous enzyme families is undoubtedly emerging, the situation might be more complicated in Archaea, as comparative genomic analysis suggests (de Crécy-Lagard et al., 2010). Phylogenetic analysis identified three Trm5 subfamilies in Archaea (Trm5a, Trm5b and Trm5c) (de Crécy-Lagard et al., 2010). Several archaeal species contain two subfamilies, and the structurally characterized archaeal Trm5 is a member of the Trm5b subfamily (de Crécy-Lagard et al., 2010). A combination of observations led to the proposal that Trm5a methylates the C7 position of imG-14, not the N-1 position of G37 (de Crécy-Lagard et al., 2010). First, the distribution of Trm5a correlates with the presence of yW bases containing methyl groups at the C7 position of imG-14, such as imG2 and mimG (Figure 1-9) (de Crécy-Lagard et al., 2010). Second, differences can be observed between the Trm5a and Trm5b primary sequences, such as the absence of the D1 domain and of key G37 recognition residues (de Crécy-Lagard et al., 2010). Finally, methylation experiments using Trm5a and Trm5b proteins from P. abyssi showed that the two enzymes did not catalyze the same reactions (de Crécy-Lagard et al., 2010).

Tyw1, the second enzyme of the pathway, is a member of the radical-SAM superfamily. These enzymes use iron-sulfur clusters and S-adenosylmethionine (SAM) to generate substrate-based radicals. The structures of Tyw1 from the Archaea P. horikoshii and M. jannaschii (Goto-Ito et al., 2007; Suzuki et al., 2007) provided insight into the binding of iron and SAM and predicted the tRNA-binding surface, but the catalytic mechanism has yet to be unraveled.

It was recently shown that yW-86 and its methylated derivative yW-72 (Figure 1-9), previously thought to be specific to eukaryotes, are also found in a variety of archaeal species (de Crécy-Lagard et al., 2010; Umitsu et al., 2009). The aminocarboxypropyl

side-chain is inserted by Tyw2, which is homologous to Trm5a, thus revealing an intriguing example of a change in catalytic activity within the same enzyme family (Umitsu et al., 2009). The fourth and last enzyme shared between the archaeal and eukaryotic pathways, Tyw3, remains to be characterized biochemically and structurally.

The complexity of the yW pathway in Archaea, where at least six different variants have been identified, with the most complex found in hyperthermophiles (de Crécy-Lagard et al., 2010; Figure 1-9), is perhaps unprecedented in any other known metabolic pathway. yW derivatives have been shown to limit frameshifting in yeast (Waas et al., 2007), but no in vivo data are available yet for Archaea that could elucidate both the function and the diversity of these modifications.

Modification of Adenosine to N1-methyladenosine to N1-methylinosine, an archaeal site specific modification

Inosine (I; 6-deaminated adenosine) is a modified nucleoside found in eukaryotes and bacteria at position 34 of tRNA (the wobble position) (Grosjean et al., 1996). The I derivative N1-methylinosine (m1I) is found only at position 37 of eukaryotic tRNA(Ala) and at position 57 (TψC loop) of several tRNAs in some archaeal halophiles and hyperthermophiles (Grosjean et al., 1995). The formation of I34 and I37 in bacteria and eukaryotes is catalyzed by distinct tRNA:adenosine deaminases that hydrolytically deaminate adenosine. In S. cerevisiae, Tad1p catalyzes the formation of I37, and Tad2p/Tad3p catalyzes the formation of I34. TadA catalyzes the formation of the specific I34 of tRNA(Arg) (AGC) (Grosjean et al., 2008; Rubio et al., 2007). The N1-methylinosine at position 37 (m1I37) of eukaryal tRNA is formed by the addition of a methyl group by a specific SAM-dependent methylase (Grosjean et al., 1996). In Archaea,

the first step is the methylation of A57 by the SAM-dependent tRNA:m1A methyltransferase (TrmI), followed by deamination of the 6-amino group of the adenosine moiety, catalyzed by a tRNA:m1A-specific deaminase (Grosjean et al., 1995; Roovers et al., 2004). In most cases, tRNA methylation is catalyzed by a site-specific methyltransferase that recognizes both sequence and structure within the pre-tRNA substrate. However, TrmI from P. abyssi is a region-specific enzyme that catalyzes the methylation of both A57 and A58 (Grosjean et al., 2008; Roovers et al., 2004). The crystal structure of TrmI from P. abyssi was solved (Roovers et al., 2004). The enzyme is a tetramer. Intersubunit disulfide bridges and hydrophobic interactions between monomers stabilize the structure at high temperatures (80 °C). The catalytic domain of each subunit (residues ) is a modified Rossmann fold composed of a central seven-stranded β-sheet flanked by α-helices on both sides (Roovers et al., 2004). The next enzyme, a tRNA:m1A-specific deaminase that forms m1I57, is yet to be identified. As with other modifications outside the anticodon region, m1I57 is involved in tertiary interactions across the D- and T-loops that maintain the integrity of the tRNA L-shape.

Guide RNA dependent modifications of tRNAs

Archaea have one feature in common with eukaryotes: the use of guide RNAs and their associated protein complexes to introduce ψ and 2'-O-methylations in both rRNAs and tRNAs (Grosjean et al., 2008). In tRNA(Trp) molecules of several Archaea, including H. volcanii, the Cm34 and Um39 methylations are introduced by the box C/D ribonucleoprotein (RNP) complex containing L7p, fibrillarin and Nop5 (Grosjean et al., 2008). It was also shown that in Sulfolobus sp., ψ35 is introduced in tRNA by a Cbf5

dependent H/ACA machinery (Muller et al., 2009). Interestingly, in P. abyssi the enzyme Pus7 modifies this position as well as position 13 (Muller et al., 2009). A Pus7 homolog is also found in S. solfataricus, but it does not modify position 35, only position 13 (Muller et al., 2009). This is the first example in Archaea of a ψ residue introduced by a guide RNA in some species and directly by an enzyme in others, although a similar phenomenon had already been observed for the methylation of Cm56 (Renalier et al., 2005).

Cm56 is a site-specific modification found only in Archaea (Grosjean and Benne, 1998; Grosjean et al., 2008). The gene responsible for the formation of Cm56 (aTrm56) was identified in P. abyssi (PAB1040), and the enzyme was biochemically and structurally characterized (Kuratani et al., 2008; Renalier et al., 2005). aTrm56 catalyzes the SAM-dependent 2'-O-methylation of the cytidine residue at position 56 of pre-tRNA (Renalier et al., 2005). The aTrm56 enzyme forms a spherical dimer with a flat surface and no deep active-site cavity. SAM is located near the surface, with its methyl group exposed to the solvent, so that the cytidine at position 56 of tRNA is readily accessible to the active site with no large induced-fit conformational change of aTrm56 (Kuratani et al., 2008). Homologs of aTrm56 are found in all archaeal genomes sequenced to date, with the exception of P. aerophilum, which uses a C/D guide RNA-directed tRNA 2'-O-methylation complex for methylation of C56 of tRNA (Renalier et al., 2005).

ψ55 is a universal modification of tRNA introduced by the TruB/Pus4 families in bacteria and yeast, respectively, whereas ψ54 is found only in Archaea and a few higher eukaryotes. In Archaea, the Pus10 (or PusX) family of proteins forms both ψ54 and ψ55 in vitro (Gurha and Gupta, 2008; Roovers et al., 2006), whereas Cbf5 can

modify only position 55, in a guide-independent manner (Gurha et al., 2007). Recent RNA analysis of an H. volcanii Δcbf5 strain suggests that Cbf5 is responsible for the modification of rRNA in vivo, but that ψ54 and ψ55 are still present in the tRNA of that mutant (Blaby et al., 2011). This suggests that either both Pus10 and Cbf5 or only Pus10 modify tRNA in vivo, although this prediction could not be tested because pus10 is essential in H. volcanii (Blaby et al., 2010).

Archaeosine, an archaeal tRNA specific modification

Archaeosine (G+) is a 7-deazaguanosine derivative found at position 15 (D-loop) in tRNAs of almost all Archaea analyzed to date (Gregson et al., 1993). Bacterial and eukaryal tRNAs are not modified at this position and do not contain archaeosine at any other position. Almost all Archaea synthesize G+ de novo. The enzyme archaeosine tRNA-guanine transglycosylase (aTGT) catalyzes the critical step in G+ biosynthesis (Watanabe et al., 1997), and it has been extensively characterized biochemically and structurally. The recombinant aTGT enzymes from H. volcanii, P. horikoshii, and M. jannaschii were expressed and purified, and their enzymatic properties were investigated (Iwata-Reuyl, 2003). Archaeal TGT takes the free base 7-cyano-7-deazaguanine (preQ0) and exchanges it with the guanine at position 15, forming preQ0-tRNA (Figure 1-10). The subsequent steps take place at the tRNA level: preQ0-tRNA is further modified to G+. The crystal structure of P. horikoshii aTGT in complex with its substrates guanine, preQ0, and tRNA(Val) has been determined (Figure 1-11) and reveals details of the transglycosylation mechanism. When bound to the substrate, aTGT forms a dimer that involves both the N-terminal domain and the C-terminal domain. The N-terminal domain contains the catalytic domain, which folds into an (α/β)8 barrel with a characteristic Zn2+-binding site formed

by the {CXCX(2)CX(22)H} motif (Figure 1-11 inset) (Ishitani et al., 2002) and includes Asp95, which was shown to be critical for catalysis (Bai et al., 2000). The C-terminal region contains three domains, C1, C2, and C3, that have no sequence similarity with any other protein of known structure (Figure 1-11). The C1 domain is involved in dimerization. The C3 domain adopts an OB-fold characteristic of the RNA-binding domain found in RNA modification enzymes and ribonucleoproteins (PUA domain) (Perez-Arellano et al., 2007). To expose the buried G15 to aTGT, the tRNA undergoes a drastic conformational change in which the L-form is disrupted to form the previously unknown λ-shaped tRNA (Ishitani et al., 2003). The C3 domain together with the C2 domain synergistically recognizes, binds, and stabilizes the new tRNA λ form (Ishitani et al., 2003).

The next chapters will focus on the identification and phenotypical characterization of archaeosine biosynthetic steps in the extreme halophilic archaeon H. volcanii.
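Signature motifs written in the X(n) notation used in this chapter, such as the aTGT Zn2+-binding motif CXCX(2)CX(22)H or the tRNase Z motif HXHXDH, can be searched for in a protein sequence with a regular expression. The sketch below is a minimal illustration only: the helper names are hypothetical, and the test sequence is a synthetic poly-alanine string, not the P. horikoshii sequence. The motif is placed so that its first Cys is residue 279, which reproduces the spacing of the residues labeled in Figure 1-11 (Cys279, Cys281, Cys284, His307).

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as 'CXCX(2)CX(22)H' into a regular expression.

    X means any residue; X(n) means exactly n arbitrary residues.
    """
    pattern = re.sub(r"X\((\d+)\)", r".{\1}", motif)  # X(n) -> .{n}
    return pattern.replace("X", ".")                  # single X -> .


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 1-based start position of every match, overlaps included."""
    regex = re.compile(f"(?={motif_to_regex(motif)})")
    return [m.start() + 1 for m in regex.finditer(sequence)]


# Synthetic sequence (an assumption for illustration, not a real protein).
site = "C" + "A" + "C" + "AA" + "C" + "A" * 22 + "H"
toy = "A" * 278 + site + "A" * 20

start = find_motif(toy, "CXCX(2)CX(22)H")[0]
# Offsets of the three Cys and the His within the motif are 0, 2, 5, and 28.
print(start, start + 2, start + 5, start + 28)  # 279 281 284 307
```

The lookahead form is used so that overlapping occurrences are not skipped; for a motif with a variable gap such as HAEX(n)PCX(2)C, the translation step would need an explicit range instead of a fixed count.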

[Figure 1-1 diagram: ribosome with E, P, and A sites on the messenger RNA; labels: growing peptide chain, outgoing empty tRNA, incoming tRNA bound to amino acid and EF-Tu]

Figure 1-1. tRNA role in translation. At one end, tRNA carries a three-nucleotide sequence called the anticodon. The anticodon forms three base pairs with the codon in mRNA. The mRNA encodes a protein as a series of contiguous codons, each of which is recognized by a particular tRNA. At the other end, each tRNA is covalently attached to the amino acid that corresponds to the anticodon sequence. During protein synthesis, tRNAs are delivered to the ribosome by elongation factors (EF-Tu in bacteria, eEF-1 in eukaryotes and archaea). Once delivered, a tRNA already bound to the ribosome transfers the growing polypeptide chain from its 3' end to the amino acid attached to the 3' end of the newly delivered tRNA. The peptide formation reaction is catalyzed by the ribosome.

Figure 1-2. tRNA secondary and tertiary structures. A) Typical tRNA cloverleaf; the important features are labeled. B) The tertiary conformation of tRNA (PDB 6TNA) showing the typical L-shape; the cloverleaf features are labeled.

[Figure 1-3 diagram: 5'-end processing; 3'-end processing; tRNA nucleotidyltransferase; tRNA splicing ligase; posttranscriptional modifications]

Figure 1-3. Maturation of tRNA in Archaea. The tRNA transcript has its 5'-leader cleaved by RNase P and its 3'-trailer cleaved by tRNase Z. The nucleotidyltransferase adds the CCA to the 3'-end. The tRNA splicing endonuclease cleaves the intron, and the RNA splicing ligase ligates the ends. The canonical ribonucleosides are chemically modified.
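The order of processing steps summarized in Figure 1-3 can be mimicked with a toy string model. This is only a sketch under strong simplifying assumptions: the function names are hypothetical, the lengths of the leader, trailer, and intron are given rather than recognized from RNA structure, and the sequences are invented.

```python
def rnase_p(transcript: str, leader_len: int) -> str:
    """Remove the 5' leader (stand-in for RNase P cleavage)."""
    return transcript[leader_len:]


def trnase_z(transcript: str, trailer_len: int) -> str:
    """Remove the 3' trailer (stand-in for tRNase Z cleavage)."""
    return transcript[: len(transcript) - trailer_len]


def add_cca(transcript: str) -> str:
    """Append CCA unless the sequence already ends with it (naive check)."""
    return transcript if transcript.endswith("CCA") else transcript + "CCA"


def splice(transcript: str, intron_start: int, intron_len: int) -> str:
    """Excise the intron and join the exons (endonuclease plus ligase)."""
    return transcript[:intron_start] + transcript[intron_start + intron_len:]


# Invented sequences, not from any real tRNA gene.
leader, exon5, intron, exon3, trailer = (
    "AUGCA", "GCGGAUUUAG", "UUUAAACC", "CUCAGUUGGG", "UUUUCG")
pre_trna = leader + exon5 + intron + exon3 + trailer

step = rnase_p(pre_trna, len(leader))
step = trnase_z(step, len(trailer))
step = add_cca(step)
mature = splice(step, len(exon5), len(intron))
print(mature)  # GCGGAUUUAGCUCAGUUGGGCCA
```

Nucleoside modifications, the last step in the figure, are not represented because they change base chemistry rather than sequence length.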

[Figure 1-4 tree labels:
Crenarchaea: Thermoproteales, Sulfolobales, Desulfurococcales
Thaumarchaea: Nitrososphaera gargensis, Nitrosopumilus maritimus
Korarchaea: Candidatus Korarchaeum cryptofilum
Nanoarchaea: Nanoarchaeum equitans
Euryarchaea: Thermococcales, Methanopyrales, Methanobacteriales, Methanococcales, Thermoplasmatales, Archaeoglobales, Methanosarcinales, Methanocellales, Methanomicrobiales, Halobacteriales]

Figure 1-4. Phylogenetic distribution of Archaea. Archaea contains two main phyla, Crenarchaea and Euryarchaea, and three tentatively created phyla, Nanoarchaea, Korarchaea, and Thaumarchaea.

[Figure 1-5 diagram labels: specificity domain; catalytic domain]

Figure 1-5. Representatives of archaeal RNase P RNAs. Archaeal type A RNA, exemplified by that of M. thermoautotrophicus, is similar to bacterial type A RNAs but lacks P13, P14 and P18. Type M RNAs (lacking P6, P8, P16 and P17) are found in Methanococci and Archaeoglobi. Type T RNAs (lacking the S-domain) are found in the Thermoproteaceae. Adapted from the RNase P Database (Ellis and Brown, 2009).

Figure 1-6. Types of introns found in Archaea. A) Common tRNA with no disruption. B) tRNA containing a single intron at the canonical position 37/38. C) Example of a tRNA containing a single intron at a noncanonical position. D) Example of a tRNA containing multiple introns (up to three introns) at various positions. E) Split tRNA, in which the 5' and 3' halves of the tRNA are encoded on separate genes. F) Tri-split tRNA, in which the tRNA is composed of three individual transcripts. G) Permuted tRNA, in which the 3' half of the tRNA occurs upstream of the 5' half. H) Intron-containing permuted tRNAs have been reported to contain an endogenous intron (adapted from Randau and Söll, 2008).

Figure 1-7. Representation of a bulge-helix-bulge (BHB) and a relaxed bulge-helix-loop (BHL) motif. A) Schematic representation of a BHB. B) Schematic representation of a BHL. The conventional cis-splicing endonuclease recognizes and cleaves (arrows) the BHB or BHL RNA motif in pre-tRNA, leading to the excision of the intron (adapted from Cavicchioli, 2006).

Figure 1-8. Selected tRNA posttranscriptional modifications. A) Simple modifications. B) Complex modifications.

Figure 1-9. Biosynthesis of wyosine derivatives in Archaea. Red circles represent the moieties added at each step of synthesis. Labels in the scheme: enzymes Trm5a, Trm5b, Taw1, Taw2, and Taw3; compounds guanine, m1G, imG-14, imG, imG2, mimG, yW-86, and yW-72.

Figure. Role of aTGT in G+ biosynthesis. aTGT exchanges the encoded guanine at position 15 of tRNA with the free base preQ0, forming preQ0-tRNA. Labels in the scheme: aTGT, preQ0, preQ0-tRNA, G+-tRNA, and a step marked "?".

Figure. Crystal structure of aTGT from P. horikoshii (PDB IQ8). aTGT is a modular enzyme whose N-terminal region contains the Zn2+ binding motif and the catalytic domain. The C-terminal region comprises domains C1, C2, and C3. C3 adopts an OB fold characteristic of RNA-binding domains (PUA domain). The inset shows the Zn2+ binding residues. Labels in the structure: catalytic domain, C2 domain, PUA domain, Zn2+ binding site (Cys279, Cys281, Cys284, His307).

CHAPTER 2
MATERIAL AND METHODS

Materials
The materials used in this study were acquired from the following suppliers or colleagues:
All organic and inorganic analytical grade chemicals: Fisher Scientific (Atlanta, GA) or Sigma Chemical Co. (St. Louis, MO).
Restriction endonucleases and Taq DNA polymerase: New England BioLabs (Beverly, MA).
Phusion DNA polymerase: Finnzymes (Espoo, Finland).
Nuclease P1, RNase A, RNase T2, Phosphodiesterase 2, and alkaline phosphatase: Sigma Chemical Co. (St. Louis, MO).
Desalted oligonucleotides (Appendix A): Integrated DNA Technologies (Coralville, IA).
bp DNA molecular weight standards: New England BioLabs (Beverly, MA).
Genomic DNA from H. volcanii and H. salinarum NRC-1: prepared as described by Dyall-Smith (Dyall-Smith, 2009).
E. coli genomic DNA: prepared as described in Sambrook & Russell (Sambrook and Russell, 2001).
P. calidifontis genomic DNA: prepared from cell paste, a gift from Todd Lowe (UCSC), using Nucleobond AXR-400 columns from Clontech Laboratories (Mountain View, CA) according to the manufacturer's protocol.
S. solfataricus genomic DNA: a kind gift from Dr. Dirk Iwata-Reuyl (Portland State University, OR).

Bioinformatics Tools
Analysis of the phylogenetic distribution and physical clustering was performed in the SEED database (Overbeek et al., 2005). BLAST tools and resources at NCBI (Altschul et al., 1990) were used to search for DNA and protein homologs in the NCBI database. Multiple alignments were built using the ClustalW tool (Chenna et al., 2003).

Structure-based alignments were performed using the ESPript platform (Gouet et al., 1999). The PRATT tool (Jonassen et al., 1995) from the Prosite website was used to derive the specific protein motifs. ScanProsite (de Castro et al., 2006), run against proteins in the Prosite database, and PHI-BLAST at NCBI (Schaffer et al., 2001) were used to scan for the presence of a specific motif in other proteins from the NCBI database. WebLogo (Crooks et al., 2004) was used to create sequence logos. The phylogenetic trees were constructed using the neighbor-joining method (Saitou and Nei, 1987) and the parsimony method (Day, 1987) embedded in MEGA 4.0 software (Tamura et al., 2007) and the SATCHMO algorithm (Edgar and Sjolander, 2003) embedded in the PhyloFacts suite (Glanville et al., 2007). The H. volcanii and H. salinarum genome sequences were accessed through the UCSC archaeal genome browser (Schneider et al., 2006).

Three Dimensional (3D) Structure Superimposition and Visualization
The released protein structures were downloaded from the Protein Data Bank (PDB), then visualized and analyzed using DS Visualizer (Marti-Renom et al., 2004), Protein Explorer (Martz, 2002), and Cn3D (Wang et al., 2000). The structure alignment was performed using the superimposition tool of Discovery Studio 2.5 (Marti-Renom et al., 2004) and Cn3D VAST at NCBI (Hogue, 1997; Kann et al., 2005; Wang et al., 2000).

Strains, Media, Growth and Transformation
Strains used in these studies are listed in Appendix C.

E. coli derivatives were routinely grown at 37 °C in LB (BD Diagnostic System) or minimal M9 medium (Sambrook and Russell, 2001) supplemented with 0.2% glycerol as a carbon source. Growth media were solidified with 15 g/l agar (BD Diagnostic System) for the preparation of plates. Transformations of E. coli were performed following standard procedures (Sambrook and Russell, 2001). Ampicillin (AmpR, 100 μg/ml), thymidine (dT, 300 µM), kanamycin (KanR, 50 µg/ml), isopropyl-β-D-thiogalactopyranoside (IPTG, 1 mM), and L-arabinose (0.2%) were added when needed.

H. walsbyi C23 was grown statically at 37 °C in defined medium (DBCM2) (Dyall-Smith, 2009) containing 200 g NaCl, 29.1 g MgSO4·7H2O, 25 g MgCl2·6H2O, 5.8 g KCl, 5 mM, 5.0 mM NH4Cl, 1 mM K2HPO4 pH 7.5, 0.25% HCl, g FeCl2·4H2O, 0.19 mg CoCl2·6H2O, 0.1 mg MnCl2·4H2O, 0.07 mg ZnCl2, mg H3BO3, Na2MoO4·2H2O, mg NiCl2·6H2O, mg CuCl2·2H2O, 0.04 mg 4-aminobenzoate, mg biotin, 0.09 mg nicotinic acid, 0.05 mg calcium pantothenate, 0.15 mg pyridoxamine hydrochloride, 0.09 mg thiamine chloride hydrochloride, 0.05 cyanocobalamine, 0.03 mg lipoic acid, 0.03 mg riboflavin, mg folic acid, and 10 mM pyruvate.

H. volcanii derivatives were grown at 45 °C and 200 rpm in: 1) Hv-YPC rich medium (Allers et al., 2004) containing 144 g NaCl, 21 g MgSO4·7H2O, 18 g MgCl2·6H2O, 4.2 g KCl, 10 mM Tris-HCl (pH 7.5), 0.5% yeast extract, 0.1% peptone, and 0.1% casamino acids (w/v); 2) Hv-min minimal medium (Dyall-Smith, 2009) containing 144 g NaCl, 21 g MgSO4·7H2O, 18 g MgCl2·6H2O, 4.2 g KCl, 10 mM Tris-HCl (pH 7.5), 0.5% Na lactate (v/v), 0.5% Na succinate (w/v), 0.02% glycerol (w/v), 5 mM NaHCO3, 0.5 mM K2HPO4 pH 7.5, 0.36 mg MnCl2·4H2O, 0.44 mg ZnSO4·7H2O, 2.3 mg

FeSO4·7H2O, and 0.05 mg CuSO4·5H2O. Riboflavin (20 µg/ml), uracil (50 or 10 μg/ml), novobiocin (NovR, 0.2 μg/ml), and 5-fluoroorotic acid (5-FOA, 50 μg/ml) were added when needed. H. volcanii growth media were solidified with 20 g/l agar (BD Diagnostic System) for the preparation of plates. To enhance transformation efficiency, H. volcanii strain H26 and the different mutants were transformed with plasmid DNA isolated from E. coli GM2163 or E. coli INV110 (dcm-, dam-) according to Cline et al. (Cline et al., 1989).

H. volcanii Competent Cells And Transformation Protocols

Competent cells
Ten ml of YPC were inoculated with one colony of the H. volcanii strain and incubated at 37 °C overnight (shaking at 180 rpm). Five ml of this culture were used to inoculate a 100 ml culture in a 250 ml flask, which was then incubated overnight at 37 °C. When the absorbance (λ = 600 nm) reached 0.8, the cells were spun down in 50 ml centrifuge tubes at 6,000 rpm for 15 min at room temperature and resuspended in 20 ml of buffered spheroplasting solution (per liter: 58.5 g NaCl, 2.01 g KCl, 50 ml 1.0 M Tris-HCl pH 8.2, 150 g sucrose) to wash the cells of residual Mg2+ ions (Dyall-Smith, 2009). The cells were then centrifuged at 5,000 rpm for 10 minutes at room temperature. The supernatant was carefully removed, and the pellet was resuspended in a final volume of 5 ml of buffered spheroplasting solution with 15% glycerol (per liter: 58.5 g NaCl, 2.01 g KCl, 50 ml 1.0 M Tris-HCl pH 8.2, 150 g sucrose, 150 ml glycerol). The cells were rapidly frozen on dry ice and stored at -70 °C.

Transformation
One hundred µl of 0.5 M EDTA (pH 8.0) was added to 1 ml of freshly thawed concentrated cell suspension and mixed gently. The mixture was incubated at room

temperature for 10 minutes. While the cells were converted to spheroplasts, the DNA (1-2 µg) was added to the bottom of sterile 1.5 ml plastic microfuge tubes. One hundred µl of spheroplasts was added to the tube, mixed gently, and incubated for another 5 minutes at room temperature. After 5 minutes, an equal volume (i.e., 100 µl) of 60% PEG600 solution (600 µl of pure PEG600 and 400 µl of unbuffered spheroplasting solution containing 58.5 g NaCl, 2.01 g KCl, and 150 g sucrose per liter) was added, mixed, and incubated for 20 minutes at room temperature. After 20 minutes, 1 ml of recovery medium (18% salt, 15% sucrose) was added, and the cells were centrifuged for 5 minutes at 6,500 rpm. The cells were resuspended in 1 ml of recovery medium and allowed to recover by incubating for 2-4 hours at 37 °C before plating on selective media.

Polymerase Chain Reaction
Polymerase chain reactions (PCRs) were performed using Phusion Hot Start (New England Biolabs, Beverly, MA), Taq DNA polymerase (New England Biolabs, Beverly, MA), or Pfu Turbo (Stratagene, Santa Clara, CA) with the primers listed in Appendix A. Each PCR reaction contained 100 ng of template DNA, 0.2 µM forward primer, 0.2 µM reverse primer, reaction buffer at 1X concentration, 200 µM dNTPs, nuclease-free water, and 1-2 units of DNA polymerase per 100 µl reaction. The thermocycling conditions for routine PCR were: 1 cycle of initial denaturation at 95 °C for 1 minute; 30 cycles of denaturation at 95 °C for 15 seconds, annealing at °C (depending on the Tm of the primers, calculated as 2 × the number of A and T bases + 4 × the number of G and C bases), and extension at 72 °C for 1 minute per kb (30 seconds per kb for Phusion); and a final extension at 72 °C for 10 minutes.
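The primer Tm rule and the per-kilobase extension times quoted above lend themselves to a quick calculation. What follows is a minimal sketch, assuming the rule intended is the Wallace rule (2 °C per A or T, 4 °C per G or C); the function names and the example primer are hypothetical and are not taken from Appendix A.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer Tm in deg C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def extension_seconds(product_bp: int, polymerase: str = "taq") -> float:
    """Extension time: 60 s per kb for routine PCR, 30 s per kb for Phusion."""
    per_kb = 30.0 if polymerase.lower() == "phusion" else 60.0
    return per_kb * product_bp / 1000.0


if __name__ == "__main__":
    example_primer = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer, for illustration only
    print(wallace_tm(example_primer))
    print(extension_seconds(5500, "phusion"))
```

The Wallace rule is only a rough guide for short oligonucleotides, so the annealing temperature would normally still be adjusted empirically.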

DNA Electrophoresis
Sizes of PCR products and plasmid fragments were analyzed by electrophoresis using % (w/v) agarose gels in TAE buffer (40 mM Tris acetate, 2 mM EDTA, ethidium bromide 0.1% (v/v), pH 8.5) with 1 kb DNA ladder molecular weight markers as standards (New England Biolabs, Beverly, MA). Gels were photographed using a KODAK Gel Logic 200 Imaging System (Carestream Health, Rochester, NY).

Plasmid Isolation and Transformation
Plasmids were isolated with the Qiagen Miniprep kit according to the manufacturer's protocols (Qiagen Inc., Valencia, CA). When applicable, linearized plasmids or inserts were purified from agarose slices with the QIAquick gel extraction kit (Qiagen).

Site-Directed Mutagenesis
Site-directed mutagenesis was performed according to the QuikChange Site-Directed Mutagenesis protocol (Stratagene, Santa Clara, CA) as per the manufacturer's instructions, with the following modifications. Phusion DNA polymerase (New England Biolabs) was used for the generation of all mutations. An elongation time of three minutes was used for the generation of 5.5 kb products. Following the PCR, the parental DNA template was digested with DpnI (New England Biolabs). The nicked vector DNA containing the desired mutations was then transformed into DH5α chemically competent cells. The resulting plasmids were verified by Sanger sequencing at the University of Florida core facility.

General Cloning
Genes were amplified by PCR using DNA polymerases including Pfu Turbo (Stratagene, Santa Clara, CA), Taq (New England Biolabs, Beverly, MA), or Phusion (New England Biolabs, Beverly, MA). The fidelity of all cloned PCR amplified products

was confirmed by DNA sequencing using the dideoxy termination method with Perkin-Elmer/Applied Biosystems and LICOR automated DNA sequencers (DNA Sequencing Facilities, Interdisciplinary Center for Biotechnology Research and Department of Microbiology and Cell Science, University of Florida). Products and vectors were cut with restriction enzymes NdeI, BlpI, KpnI, BamHI, EcoRI, XbaI, or SphI according to the manufacturer's specifications (New England Biolabs) as needed. The DNA fragments were ligated into vectors using T4 DNA ligase (New England Biolabs). Ligation reactions were performed at room temperature for 30 min.

Plasmids and Strains Construction
Plasmids used in these studies are listed in Appendix B.

Plasmid construction for bacterial complementation assays
The ygcM gene (NP_ ) was amplified from E. coli genomic DNA using primers ygcm_fw and ygcm_rev bearing EcoRI sites and cloned into pBAD24 (Guzman et al., 1995). The SSO2412 (NP_ ) and Pcal_1063 (YP_ ) genes were amplified from genomic DNA of S. solfataricus and P. calidifontis, respectively. To amplify SSO2412, we used primers SsQueD2QHGH_Fw and SsQueD2QHGH_Rev bearing NcoI and SphI restriction sites. To amplify Pcal_1063, we used primers PcQueD2WHGH_Fw and PcQueD2WHGH_Rev bearing NcoI and SphI restriction sites. The resulting PCR fragments were cloned into pBAD24 after digestion with the appropriate enzymes. The SSO2412 fragment was cloned directly into pBAD24 previously digested with SmaI, whereas the Pcal_1063 fragment was cloned into pBAD24 using the restriction sites NcoI and SphI. The P. calidifontis Pcal_0221 gene (YP_ ) was amplified from P. calidifontis genomic DNA using primers

QueFLikepbad24_Fw and QueFLikepbad24_Rev bearing NcoI and SphI restriction sites, respectively, and cloned into pBAD24 after digestion with the appropriate enzymes.

Plasmid construction for archaeal complementation assays
The Vng6306 (NP_ ), Vng6305 (NP_ ), and Vng6303 (NP_ ) genes were amplified from H. salinarum NRC-1 genomic DNA using primers HsQueD_NdeI_Fw and HsQueD_BlpI_Rev, HsQueE_NdeI_Fw and HsQueE_BlpI_Rev, and HsQueC_NdeI_Fw and HsQueC_BlpI_Rev, respectively, bearing NdeI and BlpI restriction sites, and cloned into pJAM202 (Kaczowka and Maupin-Furlow, 2003) after digestion with the appropriate enzymes. The Vng1957G gene was amplified from H. salinarum NRC-1 genomic DNA using primers HstgtA2_Fw bearing NdeI and HstgtA2_Rev bearing BlpI, then cloned into pJAM202 after digestion with the appropriate enzymes. The HVO_1282 (YP_ ) gene was amplified from H. volcanii DS70 genomic DNA using primers HvPTPSIV_NdeI_Fw and HvPTPSIV_BlpI_Rev bearing NdeI and BlpI sites and cloned into pJAM202 after digestion with the appropriate enzymes. The HVO_2001 gene was amplified from H. volcanii DS70 genomic DNA using primers HvtgtA1_Fw and HVtgtA1_Rev and cloned into pJAM202 after digestion of both the PCR product and the vector with NdeI and BlpI.

Chromosomal gene deletions
The ΔfolB::KanR deletion was transferred by P1 transduction (Miller, 1972) from the E. coli JW strain from the Keio collection (Baba et al., 2006) into E. coli K12 MG1655 to create the MG1655 ΔfolB::KanR strain (VDC3276). The deletion of the folB gene was verified by PCR. The ΔqueF::KanR deletion was transferred by P1 transduction (Miller, 1972) from the E. coli JW strain from the Keio collection (Baba et al., 2006) into E. coli K12 MG1655 ΔqueC (VDC2047) to create the MG1655 ΔqueC

ΔqueF::KanR strain (VDC3274). The KanR marker was then excised as described by Datsenko and Wanner (2000) to create the MG1655 ΔqueC ΔqueF strain (VDC3280). The deletion of the queF gene was verified by PCR.

The H. volcanii ΔHVO_1716, ΔHVO_2348, ΔHVO_2001, and ΔHVO_2008 deletion strains were constructed as described by El Yacoubi et al. (El Yacoubi et al., 2009). In summary, a region of the chromosome containing the gene to be deleted, with an additional 1000 bp upstream and downstream, was amplified and cloned into a pENTR plasmid such as pCR8/GW/TOPO (Invitrogen) using TA technology (Holton and Graham, 1991). The fragment corresponding to the target gene was then deleted by performing reverse PCR using 5′-phosphorylated oligonucleotides (Zhou et al., 2008). The 5′ and 3′ termini of the PCR product (the linearized plasmid without the gene) were ligated using T4 DNA ligase (New England Biolabs) before transformation into TOP10 cells (Invitrogen). The resulting circularized plasmid was recombined using LR technology (Invitrogen) into pBY158 (El Yacoubi et al., 2009). The pBY158 derivatives containing the deletion cassette were passaged through INV110 (dcm-, dam-) (Holmes et al., 1991) and then transformed into H. volcanii strain H26 (DS70 ΔpyrE2). The H. volcanii deletion strains were obtained using the two-step protocol described by Allers and Ngo (Allers and Ngo, 2003). In short, the first crossover was selected using the ura+ phenotype, and the double crossover was selected using fluoroorotic acid resistance. The deletion strains were checked by PCR using oligonucleotides annealing to the deleted gene and oligonucleotides annealing outside the cassette containing the deletion, as described by El Yacoubi et al. (El Yacoubi et al., 2009).

Southern Blot
Southern hybridization was performed using the DIG Easy Hyb kit (Roche Molecular Biochemicals, Indianapolis, IN). Genomic DNA preparations were performed according to The Halohandbook (Dyall-Smith, 2009). After digestion with the appropriate restriction enzymes, the DNA was transferred by capillary transfer to a positively charged nylon membrane using the alkaline transfer method according to Sambrook and Russell (Sambrook and Russell, 2001). Ten µg of digested DNA were loaded on a 0.7% agarose gel cast in 1X TAE (40 mM Tris acetate, 1.0 mM EDTA) containing 0.5 µg/ml ethidium bromide. The gel was run in 1X TAE at 25 V for 3 hours. After separation, the DNA was denatured in denaturing alkaline solution (1.0 M NaCl, 0.5 M NaOH) and then transferred to a positively charged nylon membrane (Roche) using alkaline transfer buffer (0.4 N NaOH, 1.0 M NaCl). After transfer, the DNA was fixed on the membrane by soaking it in Neutralization Buffer II (0.5 M Tris-HCl pH 7.2 and 1.0 M NaCl) for 15 minutes at room temperature. The hybridization of immobilized DNA to a probe was carried out according to the Roche Biosciences DIG protocols (Roche). The probes were designed to hybridize outside the 5′ and 3′ flanking regions. Stringency washes were performed according to the Roche Biosciences DIG Application Manual for Filter Hybridization, except for the second stringency wash, which was performed in 0.5X SSC (20X SSC contains 3.0 M NaCl and 0.3 M sodium citrate) at 65 °C. The image was developed on Kodak film using a QX60A Processor Konic (Diagnostic Imaging, Norwalk, CT).
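The second stringency wash is specified relative to a 20X SSC stock, so its molar composition follows from a simple dilution. The sketch below works through that arithmetic; only the 20X SSC composition comes from the text, while the helper names and the 500 ml wash volume are illustrative assumptions.

```python
# Composition of 20X SSC in mol/l, as given in the text.
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}


def dilute(stock: dict, stock_x: float, target_x: float) -> dict:
    """Return molar concentrations after diluting a stock_x stock to target_x."""
    factor = target_x / stock_x
    return {name: conc * factor for name, conc in stock.items()}


def stock_volume_ml(final_ml: float, stock_x: float, target_x: float) -> float:
    """Volume of stock needed to prepare final_ml of the diluted buffer."""
    return final_ml * target_x / stock_x


if __name__ == "__main__":
    wash = dilute(SSC_20X, stock_x=20, target_x=0.5)
    for name, conc in wash.items():
        print(f"{name}: {conc * 1000:.1f} mM")
    # Hypothetical 500 ml wash volume, for illustration only.
    print(stock_volume_ml(500, stock_x=20, target_x=0.5), "ml of 20X SSC")
```

Under these numbers, 0.5X SSC corresponds to 75 mM NaCl and 7.5 mM sodium citrate.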

Functional Complementation Assays

Thymidine auxotrophy phenotype complementation
E. coli folB deletion cells were transformed with pBAD24 alone (negative control), with pBAD24::folB Ec (positive control), or with various PTPS genes. Complementation tests were performed by streaking transformed cells on LB plates that contained the appropriate antibiotics and L-arabinose, with or without dT. Two independent clones were used for each construct. Plates were incubated for 2 days at 37 °C.

Queuosine deficient phenotype complementation
E. coli queD deletion cells were transformed with pBAD24 alone (negative control) or with pBAD24 containing E. coli queD (positive control), various PTPS genes, or queuosine or archaeosine genes. Complementation tests were performed by growing the transformed cells in M9 minimal medium that contained the appropriate antibiotics and 0.2% L-arabinose. Bulk tRNA was extracted, purified, digested into ribonucleosides, and analyzed by LC-MS/MS (Kowalak et al., 1994) for the presence of Q. Two independent clones were used for each construct.

Archaeosine deficient phenotype complementation
H. volcanii HVO_2001 deletion strains were transformed with pJAM202c (Zhou et al., 2008) (negative control) or with pJAM202 containing HVO_2001 under a constitutive ribosomal promoter. The H. volcanii HVO_1716, HVO_1717, HVO_1718, and HVO_2008 deletion cells were transformed with pJAM202c (Zhou et al., 2008) (negative control) or with pJAM202 containing the H. salinarum homologs Vng6303, Vng6305, Vng6306, and Vng1957G, respectively, under a constitutive ribosomal promoter. Complementation tests were performed by growing the transformed cells in H. volcanii minimal medium. Then, bulk tRNA was extracted, purified, and digested into

ribonucleosides. The resulting ribonucleosides were analyzed by LC-MS/MS (Kowalak et al., 1994) for the presence of G+. Two independent clones were used for each construct.

tRNA Work

Bulk tRNA extraction
E. coli derivatives, H. volcanii derivatives, and H. walsbyi were grown in rich or defined media. The cells were collected by centrifugation (5,000 rpm for 5 min at 4 °C) and stored at -20 °C for further use. To extract tRNA, the frozen cells were thawed and resuspended in 50 mM Na acetate buffer pH 5.8 (3 ml of buffer per 1 g of cells). An equal volume of phenol saturated with mildly acidic buffer (50 mM NaOAc pH 5.8) was immediately added to the cell suspension, and the mixture was shaken overnight at room temperature. The aqueous phase was recovered by centrifugation (20 min at 5,000 rpm), and another volume of buffer-saturated phenol was added. The phenol:buffer mixture was again shaken vigorously for 2 minutes at room temperature. After centrifugation, as above, one volume of chloroform was added and mixed vigorously for 2 minutes at room temperature. The supernatant was recovered by centrifugation and adjusted to 20% isopropanol, followed by a 1 hour incubation at -20 °C. The pellet containing genomic DNA and long RNA (mRNA and rRNA) was spun down, and the isopropanol in the supernatant was adjusted to a final concentration of 60%. After standing overnight at -20 °C, the precipitated small RNAs (mostly soluble RNA, i.e., tRNA) were recovered by centrifugation at 4 °C, washed twice with cold 70% ethanol (to remove the salts from the cellular extract) and then once with cold 80% ethanol, dried, and finally resuspended in 5000 μl water. Further purification steps were achieved on a DEAE-cellulose (Fisher

Scientific cartridge/5 ml) column or on Nucleobond AXR-400 (Clontech Laboratories); both of these final chromatography steps were performed according to the manufacturer's protocols. All tRNA extractions and analyses were performed at least twice, independently.

tRNA(Asp) purification
A single tRNA species, tRNA(Asp) (GUC), was purified from bulk tRNA using biotinylated primers on Streptavidin Sepharose resin (GE Healthcare, Pittsburgh, PA) according to Rinehart et al. (Rinehart et al., 2005). Four hundred µg of 5′-biotinylated specific primer (5′-biotin-ccctgcgtgacaggcagg-3′) in 6X NTE solution (20X NTE solution is 4.0 M NaCl, 0.1 M Tris-HCl pH 7.5, 50 mM EDTA, 5.0 mM 2-BME) were added to the HiTrap Streptavidin Sepharose HP R-10 1 ml column (GE Healthcare). Then, 4.0 mg of total tRNA (10 mg/ml in 6X NTE) were added and incubated at 65 °C for 30 min. After incubation, the temperature of the mixture was decreased slowly to 30 °C. Then, the bound tRNA was washed three times, with 3X NTE, 1X NTE, and 0.1X NTE, until the absorbance (λ = 260 nm) of the wash was zero. The tRNA(Asp) retained on the beads was eluted with 1 ml of 0.1X NTE at 65 °C. NaCl (1.0 M) and 80% isopropanol were added to precipitate the tRNA. The pellet was washed with 85% ethanol and dried. The tRNA was resuspended in 50 µl sterile water.

Bulk tRNA digestion for LC-MS/MS analysis
Four hundred µg of bulk tRNA was resuspended in 100 µl water. To this solution were added 0.1 volume of 0.01 M ammonium acetate (pH 5.3) and 0.2 units of Nuclease P1. The solution was incubated at 45 °C for 2 hours and then briefly cooled on ice. Then, 0.1 volume of ammonium bicarbonate (1.0 M at pH 7.0) was added along with 0.02 units of Phosphodiesterase I and 5.0 units of E. coli alkaline phosphatase. The

resulting solution was incubated for 2 hours at 37 °C. LC-MS/MS analysis was done on a high performance liquid chromatography (HPLC) system coupled to a hybrid triple quadrupole ion trap MS (4000 Q-TRAP; Applied Biosystems, Foster City, CA) equipped with a TurboIonSpray (TIS) interface operated in the positive ion mode at the Donald Danforth Plant Science Center, Mass Spectrometry and Proteomics Facility (St. Louis, MO).

tRNA(Asp) digestion
In order to map the G+ modification on tRNA(Asp), the tRNA was digested with RNase T1 (Harada et al., 1972) and then analyzed by LC-MS/MS (Mandal et al., 2010).

RNase T1 digestion. RNase T1 is a fungal endonuclease that cleaves single-stranded RNA on the 3′ side of guanine residues (Pace et al., 1991). Ten µg of pure tRNA was mixed with 200 mM Tris-HCl pH 7.5, 1.0 M NaCl, and 2 units of RNase T1 and incubated at 37 °C for 15 min. Then, the tRNA was precipitated with 80% isopropanol and washed with ethanol. After drying, the pellet was resuspended in 50 µl water and analyzed by LC-MS/MS.
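Because RNase T1 cuts on the 3′ side of every guanine, the set of fragments expected from a given sequence can be predicted before the LC-MS/MS run. The following is a minimal in silico sketch of that cleavage rule; the function name and the short example sequence are hypothetical, and the sketch treats every G as cleavable, leaving aside how the enzyme handles modified guanosines.

```python
from typing import List


def rnase_t1_fragments(rna: str) -> List[str]:
    """Split an RNA sequence after every G, mimicking complete RNase T1 digestion."""
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    fragments = []
    start = 0
    for i, base in enumerate(seq):
        if base == "G":
            fragments.append(seq[start:i + 1])  # fragment ends with the G
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # 3'-terminal fragment without a G
    return fragments


if __name__ == "__main__":
    example = "AUGCCGUAGA"  # hypothetical sequence, not a real tRNA
    print(rnase_t1_fragments(example))
```

Each predicted fragment ends in G except possibly the last one, which makes it straightforward to match observed oligonucleotide masses back to positions in the tRNA.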

CHAPTER 3
ARCHAEOSINE BIOSYNTHESIS IN H. volcanii

Background
Archaeosine (G+) is one of the most complex ribonucleoside modifications found in tRNA. It has been identified at position 15 of almost all archaeal tRNAs sequenced to date (Jühling et al., 2009). The structure of G+ consists of a ribose, a 7-deazaguanine base, and a formamidine group attached to C7 of the base (Gregson et al., 1993) (Figure 3-1). Although it was discovered more than 20 years ago (Edmonds et al., 1991; Gregson et al., 1993; Gupta, 1984), the biosynthetic steps leading to G+ have not yet been elucidated. Almost all Archaea synthesize G+ de novo. The well-characterized archaeal tRNA guanine transglycosylase (aTGT) was the only known G+ synthesis enzyme when we started this work.

Archaeosine (G+) is structurally related to queuosine (Q) (Figure 3-1), another complex tRNA modification. Queuosine also contains a ribose and a deazaguanine base but harbors an aminomethyl-cyclopentadiol attached to C7 of the base (Yokoyama et al., 1979; Yokoyama et al., 1979). In Bacteria and Eukarya, queuosine is found at position 34 of tRNA(Asp), tRNA(Asn), tRNA(Tyr), and tRNA(His) (Morris et al., 1999; Yokoyama et al., 1979). Eukaryotes salvage queuine (q), the Q base, from the gut flora (Okada et al., 1979). Eukaryotic Tgt (eTGT) takes queuine as a substrate and exchanges it with the guanine at position 34 of tRNA, forming Q-tRNA (Okada et al., 1979) (Figure 3-2). Many bacteria synthesize Q de novo (Nishimura, 1983; Reader et al., 2004) (Figure 3-2). The first established intermediate in the queuosine pathway was preQ0 (Iwata-Reuyl, 2003). The precursor of 7-cyano-7-deazaguanine (preQ0) is GTP which is

modified to dihydroneopterin triphosphate (H2NTP) by GTP cyclohydrolase I, encoded by the folE gene (El Yacoubi et al., 2006; Phillips et al., 2008). H2NTP is the substrate for QueD, which yields 6-carboxy-5,6,7,8-tetrahydropterin (CPH4). CPH4 is the substrate for the next enzyme, QueE, which catalyzes the formation of 7-carboxy-7-deazaguanine (CDG); CDG is converted into preQ0 by the QueC enzyme (McCarty et al., 2009; McCarty et al., 2009; Reader et al., 2004). QueF is the NADPH-dependent oxidoreductase that reduces the nitrile group of preQ0 to the aminomethyl group of 7-aminomethyl-7-deazaguanine (preQ1) (Lee et al., 2007; Van Lanen et al., 2005). Bacterial Tgt (bTGT) catalyzes the exchange of preQ1 with guanine at position 34 of tRNA (Nakanishi et al., 1994; Nishimura, 1983; Okada et al., 1979; Yokoyama et al., 1979). The remaining reactions take place at the tRNA level. S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA) catalyzes the formation of epoxyQ (Q0-tRNA) from preQ1-tRNA (Mueller and Slany, 1995); Q0-tRNA is further reduced to Q-tRNA by epoxyqueuosine reductase (QueG) (Miles et al., 2011) (Figure 3-2).

Bacterial and archaeal Tgts are structurally similar. They share about 25% sequence identity (Stengl et al., 2005) and belong to a common fold that is unique to the Tgt family, forming a homologous superfamily within the TIM/(αβ)8-barrel fold. They both catalyze the same reaction: the incorporation of a 7-substituted-7-deazaguanine into tRNA (Ishitani et al., 2002; Jänel et al., 1984; Reuter and Ficner, 1995; Stengl et al., 2005). Bacterial and archaeal Tgts recognize guanine and replace it with 7-deazaguanine derivatives at completely different positions; the mode of tRNA recognition differs between the two Tgt enzymes. The tRNA U33G34U35 sequence is present in all Q-specific tRNAs and is recognized by the active site of bTGT (Stengl et al.,

2005). In contrast, aTGT has an extra domain, a PseudoUridine synthase and Archaeosine tRNA binding domain (PUA), which recognizes the A-form of RNA and is widespread among RNA-modifying enzymes (Ferre-D'Amare, 2003).

Because of the common pathway intermediate (preQ0), the identical core structures of the Q and G+ modifications, and the similar structures of the bacterial and archaeal Tgt enzymes, it was predicted that the early biosynthesis steps are similar for the G+ and Q modifications (Iwata-Reuyl, 2008; Iwata-Reuyl, 2003) (Figure 3-3). Hence, the bacterial preQ0 biosynthetic pathway was used as a model to predict and experimentally validate that archaeal homologs of folE, queD, queE, and queC are involved in G+ biosynthesis (Figure 3-3).

H. volcanii was used as a model organism for the genetic elucidation of the G+ biosynthesis pathway. H. volcanii is an aerobe, an extreme halophile, a moderate thermophile, and one of the few Archaea easily grown under laboratory conditions in both rich and defined media. H. volcanii is also among the few genetically tractable Archaea. The H. volcanii genome is sequenced (Hartman et al., 2010), and several genetic tools such as shuttle and expression vectors (Holmes et al., 1991; Kaczowka and Maupin-Furlow, 2003), chromosomal gene deletion techniques (Allers and Ngo, 2003), and a variety of selectable markers (Allers et al., 2004) have been developed.

Results

In the Extreme Halophilic Archaeon H. volcanii, Archaeosine Is Not Essential for Growth
Archaeosine is found at position 15 of almost all archaeal tRNAs sequenced to date. Position 15 sits at the elbow of the tertiary structure of tRNA (Jovine et al., 2000) and has been shown to be involved in tertiary interactions across the D and T loops (Jovine

et al., 2000). Because the positively charged formamidine group might interact with the negatively charged phosphates of the tRNA backbone, G+ was assumed to increase tRNA stability (Iwata-Reuyl, 2003; Stengl et al., 2005). Since Tgt, the G+ signature gene, is well conserved throughout the archaeal domain, G+ was assumed to be essential for growth in these organisms. Although aTGT was well characterized biochemically and structurally, genetic studies were lacking. Consequently, a deletion of the gene in H. volcanii was attempted to assess the essentiality of aTGT in this organism.

HVO_2001 encodes aTGT in H. volcanii. Using the double recombination protocol developed by the Allers laboratory (Allers and Ngo, 2003), HVO_2001 was deleted without difficulty, yielding strain VDC3241. H. volcanii is one of the halophilic Archaea shown to be polyploid (Breuert et al., 2006; Delmas et al., 2009). To ensure that no wild-type allele remained in the mutant strain (H. volcanii can harbor as many as 20 chromosomal copies (Breuert et al., 2006)), the deletion of the HVO_2001 gene was verified by both PCR and Southern blot (Figure 3-4). The HVO_2001 gene from H. volcanii was cloned behind a constitutive ribosomal promoter, the P2 promoter from the H. cutirubrum rRNA operon, in pJAM202 (Kaczowka and Maupin-Furlow, 2003). The resulting plasmid (pGP109) was transformed into VDC3241, yielding strain VDC3266. As controls, both VDC3241 and the WT parent H26 were transformed with the empty plasmid pJAM202c, yielding VDC3259 and VDC3226, respectively. To verify that the deletion of aTGT led to the loss of G+, as implied by the biochemical studies, these H. volcanii derivatives were grown in Hv-YPC, and tRNA was extracted, purified, and digested with nuclease P1, phosphodiesterase I, and alkaline phosphatase to ensure proper hydrolysis of the tRNA to ribonucleosides (Pomerantz and McCloskey, 1990). The resulting

ribonucleosides were analyzed by LC-MS/MS (Pomerantz and McCloskey, 1990). In the positive control, VDC3226, G+ eluted at 25.3 minutes (m/z 325), whereas the peak was not detectable in the deletion mutant strain, VDC3259 (Figure 3-5). The presence of G+ was restored in the strain expressing HVO_2001 in trans (Figure 3-5), confirming that the phenotype was not due to a polar effect on a downstream gene. N2,N2-dimethylguanosine (m22G) (m/z 311) served as an internal standard: the ratio of its amount in the mutant to that in the wild type was used to estimate the variation caused by differences in the amount of tRNA loaded.

The growth of the H. volcanii ΔaTGT strain (VDC3241) was then compared to that of the isogenic wild-type strain (H26) to test whether the absence of G+ led to any growth defect. When grown under optimal conditions (YPC, 45 °C, 200 rpm), the H. volcanii aTGT mutant showed no growth defect (Figure 3-6). This is the first time the aTGT gene has been deleted in any archaeon. This result showed that, at least in H. volcanii, neither aTGT nor G+ is essential for growth under optimal conditions. The dispensability of G+ in this organism allowed the use of genetic approaches to identify the remaining steps of the G+ biosynthetic pathway.

HVO_2348, Encoding a FolE2 Homolog, Is Involved in Both Folate and Archaeosine Biosynthesis
In Q biosynthesis, GTP is the precursor that undergoes a series of reactions catalyzed by GTP cyclohydrolase I (FolE) to form H2NTP (Phillips et al., 2008). This enzyme is also involved in tetrahydrofolate (THF) synthesis in bacteria and plants (Phillips et al., 2008; Pribat et al., 2010; Ravanel et al., 2001). A bioinformatics analysis of the Q genes (folE and queDCEF) performed by Dr. de Crécy-Lagard revealed that homologs of queD, queE, and queC, but not of queF, are found in almost all Archaea sequenced to date

(Reader et al., 2004) with a few exceptions (N. equitans and H. walsbyi); yet, not all Archaea that had queD, queE, and queC homologs had homologs of folE. This discrepancy was clarified with the discovery of a new type of GTP cyclohydrolase I, type B (FolE2) (El Yacoubi et al., 2006). Archaea that have homologs of bacterial queD, queE, and queC have either a folE1 or a folE2 homolog. If Bacteria and Archaea share similar steps in early Q biosynthesis, then the archaeal homologs of folE1 or folE2 should be involved in G+ synthesis, since both the bacterial folE1 and folE2 are involved in Q synthesis (Phillips et al., 2008).

H. volcanii is one of the few archaeal organisms that synthesize and use THF as a methyl (CH3) donor (Levin et al., 2004; Ortenberg et al., 2000). A mutant lacking any of the enzymes involved in THF biosynthesis is auxotrophic for the metabolites that require THF derivatives as CH3 donors, such as thymidine (dT), hypoxanthine, pantothenate, or methionine (Little and Haynes, 1979). H. volcanii has one folE2 homolog, HVO_2348. The HVO_2348 gene was deleted from H. volcanii H26, as described above, yielding the H. volcanii ΔfolE2 strain, VDC3235. The deletion was confirmed by PCR and Southern blot (Figure 3-7).

If HVO_2348 is involved in THF biosynthesis as predicted, then VDC3235 should be a dT, hypoxanthine, and pantothenate (B5) auxotroph when grown in Hv-Ca medium (Allers et al., 2004). The folE2 mutant strain was grown on agar plates containing Hv-Ca or Hv-Ca supplemented with dT, hypoxanthine, and pantothenate. As predicted, H. volcanii ΔfolE2 (VDC3235) did not grow on Hv-Ca, but the addition of dT, hypoxanthine, and pantothenate restored the growth of the mutant strain, while the isogenic wild type (H26 WT) grew well with or without the addition of dT,

hypoxanthine, and pantothenate (Figure 3-8). These results suggested that the H. volcanii folE2 homolog, HVO_2348, might function in the THF pathway.

To verify the prediction that HVO_2348 is also involved in G+ biosynthesis, the VDC3235 strain and the isogenic wild type (H26 WT) were grown in YPC rich medium until late exponential phase (OD λ=600 nm = ). Bulk tRNA was extracted, purified, and hydrolyzed. The resulting ribonucleosides were analyzed by LC-MS/MS. The peak observed at 25.7 min in the UV trace of the H26 WT was the protonated G+ (325 m/z). The same peak in VDC3235 was reduced more than 50 fold (Figure 3-9). This result suggested that HVO_2348, the folE2 homolog, is involved in G+ biosynthesis. The small amount of G+ observed in the UV trace might be due to preQ0 present as a contaminant in the rich medium (Watanabe et al., 1997). Therefore, to minimize the contamination of tRNA with preQ0, all other preQ0 auxotrophic strains were grown in defined medium.

To further investigate the effects of the folE2 deletion in H. volcanii, the growth rates of the H. volcanii folE2 mutant and the isogenic wild type were compared. Both strains were grown in rich medium supplemented with dT (40 µg/l) at 45 °C and 200 rpm for 36 hours. Growth was monitored by reading the optical density (OD λ = 600 nm) every four hours. A slight growth defect was observed for the deletion strain, VDC3235, although the cell yield was approximately the same for the mutant and the isogenic wild type (Figure 3-10). Complementation studies were not necessary because the downstream gene is in the opposite orientation to HVO_2348 (Figure 3-11). Thus, the expression of the downstream genes would not be affected by the deletion of HVO_

HVO_1718, Encoding QueD Homolog, Is Involved in Archaeosine Biosynthesis

QueD is the next enzyme in the Q biosynthetic pathway. QueD was first identified by a Korean group in E. coli (ygcM) as a sepiapterin reductase, but sepiapterin is not present in Bacteria (Woo et al., 2002). Later, QueD (or PTPS-I) was reported in Synechocystis sp. PCC 6803 as a pyruvoyl tetrahydropterin synthase (PtpS) homolog with merely 10% of PtpS activity (Jin Sun et al., 2006). PtpS is involved in tetrahydrobiopterin (BH4) biosynthesis in eukaryotes. More recently, QueD (ygcM in E. coli) was shown to be involved in Q biosynthesis in A. baylyi ADP1 in vivo (Reader et al., 2004) and in E. coli preQ0 biosynthesis in vitro (McCarty et al., 2009).

H. volcanii has two queD homologs: HVO_1718 and HVO_1284. HVO_1718 clusters with the homologs of the other Q biosynthesis genes, queE and queC (Figure 3-12). Thus, we predicted that the H. volcanii queD homolog HVO_1718 is involved in G+ biosynthesis. To verify this prediction, an H. volcanii HVO_1718 deletion strain was constructed, yielding strain VDC3290. To ensure that no wild-type allele remained in the mutant strain, the deletion of HVO_1718 was verified by PCR using primers that anneal within the gene or 100 bp upstream and downstream of the gene (Figure 3-13).

The deletion strain was then transformed with a plasmid (pgp426) containing the queD homolog from H. salinarum, Vng6306, under the constitutive ribosomal promoter mentioned above. In addition, the H. volcanii deletion strain (VDC3290) and the isogenic wild-type strain (H26) were transformed with pJAM202c (empty plasmid) as negative and positive controls, respectively. The H. volcanii derivative strains were grown in defined medium (Hv-Mm) until late exponential phase. Bulk tRNA was extracted, purified, and hydrolyzed. The resulting ribonucleosides were analyzed by LC-MS/MS. The peak eluting at 25.8 min in the H26 WT corresponds to the protonated G+ (325

m/z); the same peak was reduced more than 35 times in the mutant lacking HVO_1718 (Figure 3-14). When the homologous gene from H. salinarum, Vng6306, was expressed in trans in the H. volcanii ΔHVO_1718 strain, the G+ peak in the UV trace of the complemented mutant was restored (Figure 3-15); thus, the archaeosine phenotype was not due to a polar effect. These results suggest that HVO_1718 is involved in G+ biosynthesis.

HVO_1717, Encoding QueE Homolog, and HVO_1716, Encoding QueC Homolog, Are Involved in Archaeosine Biosynthesis

QueE and QueC are the enzymes that catalyze the reactions following QueD in the Q biosynthetic pathway (McCarty et al., 2009). Moreover, the H. volcanii homologs of queE and queC, HVO_1717 and HVO_1716, respectively, physically cluster with HVO_1718 in a potential preQ0 operon (Figure 3-12); therefore, it was reasonable to propose that HVO_1717 and HVO_1716 are involved in G+ biosynthesis.

The genes HVO_1717 and HVO_1716 were deleted in the H. volcanii wild-type H26 strain background, yielding VDC3347 and VDC3352, respectively. The chromosomal deletions of HVO_1717 (VDC3347) (Figure 3-16) and HVO_1716 (VDC3352) (Figure 3-18) were verified by PCR using primers that anneal within the gene and primers that anneal upstream and downstream of the gene. The H. volcanii ΔHVO_1717 strain was transformed with a plasmid containing the H. salinarum queE homolog (Vng6303). The H. volcanii ΔHVO_1716 strain was transformed with the H. salinarum queC homolog (Vng6305). Both genes, Vng6303 and Vng6305, were cloned behind the constitutive ribosomal P2 promoter. As negative controls, the H. volcanii ΔHVO_1717 (VDC3347) and ΔHVO_1716 (VDC3352) deletion strains were each transformed with pJAM202c (empty plasmid). As a positive control, the isogenic wild type (H26) was transformed with

pJAM202c (VDC3226). The H. volcanii deletion strain derivatives and the isogenic wild-type derivative (VDC3226) were grown in defined medium until late exponential phase. The tRNA was extracted, purified, and enzymatically hydrolyzed. The resulting ribonucleosides were analyzed for the presence of G+ by LC-MS/MS. The UV trace of the H26 isogenic wild type (VDC3226) showed the G+ peak at 25.3 min (325 m/z), whereas the same peak in the H. volcanii deletion mutants (ΔHVO_1717 and ΔHVO_1716) was reduced more than 30 fold (Figure 3-18 and Figure 3-20). The G+ peak was fully restored when the H. salinarum queE and queC homologs, Vng6305 and Vng6303, respectively, were expressed in trans (Figure 3-19 and Figure 3-21). Therefore, HVO_1717 and HVO_1716, the homologs of queE and queC, respectively, are both involved in G+ biosynthesis in H. volcanii. The small amount of G+ that accumulated in the tRNA extracted from the G+ deletion strains might be due to preQ0 contamination. preQ0 was present in the cells because the inocula, which represented 10% of the total growth medium, were grown in rich medium known to contain preQ0 (Watanabe et al., 1997).

ArcS Is the Last Step in Archaeosine Biosynthesis in H. volcanii

Because the modifications of the deazaguanine base differ between Q and G+, it was hypothesized that the last step in G+ biosynthesis is specific to Archaea (Iwata-Reuyl, 2003). While aTgt was discovered more than a decade ago, the remaining late step in G+ biosynthesis was yet to be revealed. As described above, aTgt introduces preQ0 at position 15 of tRNA. The resulting preQ0-tRNA is then transformed into G+-tRNA. Using comparative genomics, Dr. de Crécy-Lagard identified aTgtA2 as a strong candidate for the last step in G+ biosynthesis. The criteria used to identify the candidate enzyme responsible for the last step in G+ biosynthesis were: the gene family had to be

distributed only in Archaea, the protein had to bind tRNA, and the corresponding gene had to cluster with aTgt genes in some organisms. Indeed, genes of the aTgtA2 family cluster with aTgt genes in phylogenetically distinct Archaea, encode a tRNA binding domain (PUA), and are found only in Archaea. This enzyme was often annotated as an aTgt (the canonical transglycosylase) because of its high similarity to the canonical aTgt. However, the aTgt active site residues are not conserved in aTgtA2. aTgtA2 contains a conserved motif ({PCX(3)KPYX(2)SX(2)H}) that is specific to aTgtA2 and is not present in aTgt (Figure 3-22) (Phillips et al., 2010).

If aTgtA2 catalyzes the last step in G+ biosynthesis, then an aTgtA2 deletion mutant should accumulate preQ0-tRNA. To verify this prediction experimentally, a deletion of HVO_2008, the aTgtA2 homolog, was constructed (VDC5203). The deletion of HVO_2008 was verified by both PCR and Southern blot (Figure 3-23). To ensure that the G+-deficient phenotype was not due to polar effects, the H. salinarum aTgtA2 homolog, Vng1957, was cloned into pJAM202 under the control of a constitutive ribosomal promoter. The resulting plasmid was transformed into the H. volcanii HVO_2008 deletion strain. As a positive control, the H. volcanii isogenic wild-type strain (H26) was transformed with the empty plasmid (pJAM202c). The strains were grown in rich medium until late exponential phase (OD λ=600 nm = 2.5). Bulk tRNA was extracted, purified, enzymatically hydrolyzed to ribonucleosides, and analyzed by LC-MS/MS. The peak at 25.1 min corresponding to G+ (325 m/z), detected in the UV trace of the positive control, disappeared in the H. volcanii HVO_2008 mutant strain, and a new peak corresponding to the preQ0 nucleoside (308 m/z) appeared at 25.4 min (Figure 3-24).
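
The conserved motif quoted above, PCX(3)KPYX(2)SX(2)H, can be searched for in a protein sequence with a regular expression. The following is only a minimal sketch of that idea: the function name and both test sequences are hypothetical placeholders invented for illustration (they are not real aTgt or ArcS sequences), and this is not the analysis pipeline used in this study.

```python
import re

# Motif from the text: PCX(3)KPYX(2)SX(2)H, where X(n) means n arbitrary residues.
ARCS_MOTIF = re.compile(r"PC.{3}KPY.{2}S.{2}H")


def find_arcs_motif(protein_seq):
    """Return (0-based start, matched text) for each motif occurrence."""
    return [(m.start(), m.group()) for m in ARCS_MOTIF.finditer(protein_seq.upper())]


# Hypothetical sequences for illustration only.
arcs_like = "MKTLLPCAGDKPYRTSLEHVVA"   # carries the motif
atgt_like = "MKTLLPCAGDKAYRTSLEHVVA"   # KPY replaced by KAY, so no match

print(find_arcs_motif(arcs_like))
print(find_arcs_motif(atgt_like))
```

A pattern of this kind only flags candidate sequences; family membership would still need to be supported by alignment and genomic context, as described in the text.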

The WT G+ profile (presence of G+, absence of preQ0) was restored when the aTgtA2 homolog from H. salinarum was expressed in trans in the H. volcanii ΔtgtA2 strain (Figure 3-24). The absence of G+ in the H. volcanii ΔHVO_2008 deletion strain and the appearance of a new peak corresponding to preQ0-tRNA suggest that aTgtA2 is involved in the last step of G+ biosynthesis and that preQ0-tRNA is the substrate of aTgtA2. Further biochemical studies performed by our collaborator, Dr. Dirk Iwata-Reuyl at Portland State University, showed that the aTgtA2 homolog from M. jannaschii (MJ1022) is an ATP-independent, glutamine-dependent amidotransferase that catalyzes the formation of G+-tRNA from preQ0-tRNA. The enzyme was renamed archaeosine synthase (glutamine:preQ0-tRNA amidinotransferase), and aTgtA2 was reannotated as arcS.

Discussion

Thus far, 41 tRNAs have been sequenced in H. volcanii (Gupta, 1984; Juhling et al., 2009); of these, 25 are modified at position 15. aTgt is the critical enzyme that exchanges the guanine base at position 15 of almost all archaeal tRNAs with the free base preQ0. aTgt and G+ are well conserved across Archaea (El Yacoubi et al., 2009). Consequently, it might be concluded that the G+ modification is necessary for tRNA folding and stabilization in Archaea. However, the deletion of aTgt in H. volcanii and the resulting absence of G+ from H. volcanii tRNA suggested that neither aTgt nor G+ is essential for cell growth under optimal conditions. This is the first in vivo study showing that, at least in the mesophilic extreme halophile H. volcanii, G+ and aTgt are not essential; the study also made possible the identification of the G+ biosynthesis steps in the extreme halophilic archaeon H. volcanii.
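
The two diagnostic ions used above, 325 m/z for G+ and 308 m/z for the preQ0 nucleoside, differ by the mass of one ammonia, which is consistent with ArcS adding ammonia to the nitrile of preQ0. The sketch below checks this arithmetic. The elemental formulas (C12H16N6O5 for the archaeosine nucleoside, C12H13N5O5 for the preQ0 nucleoside) are not given in the text and are assumptions taken from the standard structures of these nucleosides; the function name is likewise illustrative.

```python
# Monoisotopic masses (u) of the elements involved, and of a proton.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+ for an elemental formula dict."""
    neutral = sum(MONO[element] * count for element, count in formula.items())
    return neutral + PROTON


# Assumed formulas (not stated in the text).
G_PLUS_NUCLEOSIDE = {"C": 12, "H": 16, "N": 6, "O": 5}   # archaeosine
PREQ0_NUCLEOSIDE = {"C": 12, "H": 13, "N": 5, "O": 5}    # 7-cyano-7-deazaguanosine
AMMONIA_MASS = MONO["N"] + 3 * MONO["H"]

mz_g_plus = protonated_mz(G_PLUS_NUCLEOSIDE)
mz_preq0 = protonated_mz(PREQ0_NUCLEOSIDE)

print(round(mz_g_plus, 3), round(mz_preq0, 3))
print(round(mz_g_plus - mz_preq0, 4), round(AMMONIA_MASS, 4))
```

Under these assumed formulas, the nominal masses match the 325 and 308 m/z values reported in the text, and their difference equals the mass of NH3.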

The bacterial preQ0 biosynthetic pathway was used as a model to predict the equivalent archaeal pathway. Using a combination of comparative genomics and genetics, we showed that the archaeal homologs of the Q genes folE2, queD, queE, and queC are involved in G+ biosynthesis in H. volcanii. LC-MS/MS analysis of the hydrolyzed bulk tRNA extracted from the H. volcanii deletion strains revealed that the amount of G+ in tRNA had decreased more than 35 fold. The small residual amount of G+ in tRNA might be due to preQ0 contamination from the media.

GTP cyclohydrolase I (FolE) is involved in both archaeosine and THF biosynthesis. The product of the FolE2 reaction, H2NTP (Nar et al., 1995; Nar et al., 1994), is shared between the archaeosine and THF biosynthesis pathways. This is not the first example of primary metabolism sharing a biosynthetic intermediate with tRNA modification biosynthesis. In S. cerevisiae, dimethylallyl pyrophosphate is a substrate for both Mod5 and Erg20p (farnesyl diphosphate synthase). Mod5 is the enzyme that catalyzes the isopentenylation of A to i6A in tRNA. Erg20p catalyzes the formation of farnesyl diphosphate, an essential step in sterol biosynthesis (Benko et al., 2000).

H. volcanii has two queD homologs, HVO_1718 and HVO_1284. Only HVO_1718 is involved in G+ biosynthesis. QueD shares high homology with pyruvoyl tetrahydropterin synthase (PtpS), which is involved in biopterin (BH4) synthesis in eukaryotes, and with the PTPS-III enzyme, which is involved in THF biosynthesis in some bacteria and apicomplexans (Dittrich et al., 2008; Hyde et al., 2008; Pribat et al., 2009). The QueD, PtpS, and PTPS-III enzymes belong to the COG0720 family. A detailed study of this superfamily of enzymes is presented in Chapter 5.

The last step in archaeosine biosynthesis is catalyzed by ArcS, as shown by the in vivo and in vitro data. ArcS catalyzes the one-step formation of G+ through an ATP-independent addition of ammonia to the nitrile moiety of preQ0. With the exception of a few halophiles (H. walsbyi and H. lacusprofundi), ArcS is present in all Euryarchaea. However, most Crenarchaea, except S. tokodaii, S. solfataricus, I. hospitalis, and H. butylicus, do not possess ArcS homologs, although the presence of G+ in the tRNA of a number of Crenarchaea has been demonstrated (Edmonds et al., 1991; Kowalak et al., 1994; McCloskey et al., 2001). Thus, a different enzyme might be responsible for archaeosine formation in these organisms. This case is discussed in Chapter 4.

In conclusion, this study illustrates the practical benefits of employing comparative genomics approaches to discover new enzymes, to decipher novel biosynthetic pathways, and to discern the interplay between primary metabolism and tRNA modification pathways.

[Structures shown: Archaeosine (G+); Queuosine (Q)]
Figure 3-1. Chemical structure of archaeosine and queuosine. Archaeosine and queuosine are structurally similar, sharing the same 7-deazaguanine base. The differences reside in the appended moieties: G+ has a formamidino group attached to C7 of the base, while Q has an amino-methyl-cyclopentadiol attached to C7 of the base.

[Figure labels: Bacteria; Eukarya; Archaeosine (G+); Queuosine (Q); Queuine; folE1/folE2; queDEC; queF; queG; eTgt]
Figure 3-2. The biosynthetic pathway of queuosine in Bacteria and Eukarya.

[Figure labels: Bacteria (FolE1/FolE2, QueD, QueE, QueC): GTP, dihydroneopterin triphosphate, 6-carboxytetrahydropterin, 5-carboxydeazaguanine, preQ0. Archaea (?, ?, ?, ?): GTP, dihydroneopterin triphosphate, 6-carboxytetrahydropterin, 5-carboxydeazaguanine, preQ0]
Figure 3-3. Bacterial preQ0 biosynthetic steps used as a model to determine the preQ0 (G+) biosynthesis in H. volcanii.

Figure 3-4. PCR and Southern blot verifications of the HVO_2001 chromosomal gene deletion. A) PCR verification using primers annealing upstream and downstream of HVO_2001. B) PCR verification using primers annealing within HVO_2001. C) Southern blot verification. The asterisk represents H26 WT.

[Chromatogram peak labels: m2,2G; t6A]
Figure 3-5. LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔaTgt derivative strains. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak present at 25.3 minutes in the profile of the isogenic wild-type strain (H26 pJAM202c) is absent in the mutant strain, ΔaTgt pJAM202c. The wild-type profile was restored when aTgt (HVO_2001) was expressed in trans in the mutant strain (ΔaTgt paTgt Hv).

[Curve labels: VDC3241; H26 WT]
Figure 3-6. Growth curve analysis of H. volcanii ΔaTgt (VDC3241) compared to H26 WT. The cells were grown in YPC at 45 °C and 200 rpm for 39 hours.

Figure 3-7. PCR and Southern blot verifications of the HVO_2348 chromosomal gene deletion. A) PCR verification using primers annealing upstream and downstream of HVO_2348. B) PCR verification using primers annealing within HVO_2348. C) Southern blot verification. Lane 1 represents the ladder; lanes 2, 3, and 4 represent independent clones of H26 ΔHVO_2348 strains; lane 4 represents the isogenic wild type.

[Plate labels: Hv-Ca + Hyp + dT; Hv-Ca + dT; Hv-Ca + dT. 1, H26 WT; 2, H26 ΔfolE2; 3, H26 ΔfolE2]
Figure 3-8. dT auxotrophy phenotype of the H. volcanii ΔfolE2 strain. H26 WT and H. volcanii ΔfolE2 were grown on defined medium supplemented with casamino acids, hypoxanthine (50 µg/mL), and thymidine (80 µg/mL) as shown. The plates were incubated for 10 days at 45 °C.

[Chromatogram labels: H26 WT; H26 ΔfolE2]
Figure 3-9. LC-MS/MS analysis of bulk tRNA extracted from the H. volcanii ΔfolE2 strain. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak present at 25.7 minutes in H26 WT (isogenic wild type) was reduced more than 35 times in the mutant strain, H. volcanii ΔfolE2.

Figure Growth curve analysis of H. volcanii ΔfolE2 (VDC3235). VDC3235 growth was compared to the WT growth. The strains were grown in YPC supplemented with 80 µg/mL dT at 45 °C and 200 rpm for 76 hours.
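
Growth curves such as the one described above are often summarized as a doubling time estimated from the exponential phase. The sketch below shows one common way to do this, a least-squares fit of ln(OD) against time. It is offered only as an illustration: the OD readings are hypothetical values invented for the example, not data from this study, and the dissertation does not state that doubling times were computed this way.

```python
import math


def doubling_time(times_h, od_values):
    """Estimate doubling time (hours) from exponential-phase OD readings.

    Fits ln(OD) = a + mu * t by ordinary least squares and returns ln(2) / mu.
    """
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs)) / sum(
        (t - mean_t) ** 2 for t in times_h
    )
    return math.log(2) / slope


# Hypothetical readings taken every four hours (illustration only).
times = [0, 4, 8, 12]
wild_type_od = [0.05, 0.10, 0.20, 0.40]

print(doubling_time(times, wild_type_od))
```

Only readings from the exponential phase should be passed in; including lag or stationary phase points would bias the estimated rate.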

[Gene map labels: HVO_2347, hypothetical protein; HVO_2348 (folE2); HVO_2349, transcription regulator]
Figure Chromosomal topology of HVO_2348 in H. volcanii. The downstream genes are in reverse orientation to HVO_2348.

[Gene map labels: HVO_1716 (queC); HVO_1717 (queE); HVO_1718 (queD)]
Figure Chromosomal topology of the H. volcanii preQ0 genes. HVO_1718, HVO_1717, and HVO_1716, homologs of bacterial queD, queE, and queC, respectively, are positioned in the same putative operon.

Figure PCR verifications for the chromosomal deletion of HVO_1718. A) PCR verification using primers annealing upstream and downstream of HVO_1718. B) PCR verification using primers annealing within HVO_1718. The asterisks represent H26 WT.

[Chromatogram labels: H26 WT; H26 ΔHVO_1718. Peak labels: m2,2G; G+; t6A]
Figure LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔHVO_1718 and H26 WT strains. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak present at 26.8 minutes in the H26 WT (isogenic wild type) was diminished more than 35 fold in the mutant strain, H26 ΔHVO_

[Chart labels: y-axis, Ratios; series, m2,2G/m2,2G WT and G+/m2,2G; strains, VDC3226, VDC3455, and VDC3453 (H26 and H26 ΔHVO_1718 carrying pJAM202c or pVng6306)]
Figure Complementation of the G+-deficient phenotype by the QueD homolog. The G+-deficient phenotype was complemented by the in trans expression of H. salinarum Vng6306, the HVO_1718 homolog, in the mutant strain. To control for the amount of tRNA, the m2,2G content in the complemented strains was compared with the m2,2G content in the WT control (blue bars). The ratios of G+/m2,2G in tRNA extracted from the H. volcanii ΔHVO_1718 derivative strains are shown by the red bars.

Figure PCR verifications for the chromosomal deletion of HVO_1717. A) PCR verification using primers annealing upstream and downstream of HVO_1717. B) PCR verification using primers annealing within HVO_1717. The asterisks represent H26 WT.
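
The normalization described in this caption can be sketched as follows: the m2,2G peak serves as an internal standard, so the G+ signal of each strain is expressed relative to its own m2,2G signal, and the m2,2G signals are compared against the wild type to judge tRNA loading. All peak areas below are hypothetical numbers chosen for illustration, and the function names are assumptions; none of this is data or code from the study.

```python
def g_plus_ratio(g_plus_area, m22g_area):
    """G+ signal normalized to the m2,2G internal standard of the same sample."""
    return g_plus_area / m22g_area


def loading_ratio(m22g_area, m22g_area_wt):
    """m2,2G content of a sample relative to the wild-type control."""
    return m22g_area / m22g_area_wt


def fold_reduction(wt_ratio, mutant_ratio):
    """How many times lower the normalized G+ signal is in the mutant."""
    return wt_ratio / mutant_ratio


# Hypothetical peak areas (arbitrary units), for illustration only.
samples = {
    "WT + empty plasmid": {"g_plus": 7000.0, "m22g": 10000.0},
    "mutant + empty plasmid": {"g_plus": 180.0, "m22g": 9000.0},
    "mutant + complementing plasmid": {"g_plus": 6600.0, "m22g": 11000.0},
}

wt = samples["WT + empty plasmid"]
for name, areas in samples.items():
    ratio = g_plus_ratio(areas["g_plus"], areas["m22g"])
    loading = loading_ratio(areas["m22g"], wt["m22g"])
    print(name, round(loading, 2), round(ratio, 3))

mutant = samples["mutant + empty plasmid"]
print(fold_reduction(g_plus_ratio(wt["g_plus"], wt["m22g"]),
                     g_plus_ratio(mutant["g_plus"], mutant["m22g"])))
```

Normalizing to an unaffected modified nucleoside keeps differences in the amount of tRNA injected from being mistaken for changes in G+ content.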

Figure PCR verification for the chromosomal deletion of HVO_1716. A) PCR verification using primers annealing upstream and downstream of HVO_1716. B) PCR verification using primers annealing within HVO_1716. The asterisks represent H26 WT.

[Chromatogram labels: H26; H26 ΔHVO_1717. Peak labels: m2,2G; t6A; G+]
Figure LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔHVO_1717 and H26 WT strains. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak present at minutes in the H26 WT (isogenic wild type) was reduced more than 35 fold in the H26 ΔHVO_1717 strain.

[Chart labels: y-axis, Ratios; series, m2,2G/m2,2G WT and G+/m2,2G; strains, VDC3226, VDC3460, and VDC3458 (H26 and H26 ΔHVO_1717 carrying pJAM202c or pVng6305)]
Figure Complementation of the G+-deficient phenotype by the QueE homolog. The G+-deficient phenotype was complemented by the in trans expression of H. salinarum Vng6305, the HVO_1717 homolog, in the mutant. To control for the amount of tRNA, the m2,2G content in the complemented strains was compared with the m2,2G content in the WT control (blue bars). The ratios of G+/m2,2G in tRNA extracted from the H. volcanii ΔHVO_1717 derivative strains are shown by the red bars.

[Chromatogram labels: H26 WT; H26 ΔHVO_1716. Peak labels: m2,2G; t6A; G+]
Figure LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔHVO_1716 and H26 WT strains. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak present at minutes in the H26 WT was reduced more than 35 fold in the H26 ΔHVO_1716 strain.

[Chart labels: y-axis, Ratios; series, m2,2G/m2,2G WT and G+/m2,2G; strains, VDC3226, VDC3464, and VDC3462 (H26 and H26 ΔHVO_1716 carrying pJAM202c or pVNG6303)]
Figure Complementation of the G+-deficient phenotype by the QueC homolog. The G+-deficient phenotype was complemented by the in trans expression of H. salinarum Vng6303, the HVO_1716 homolog, in the mutant. To control for the amount of tRNA, the m2,2G content in the complemented strains was compared with the m2,2G content in the WT control (blue bars). The ratios of G+/m2,2G in tRNA extracted from the H. volcanii ΔHVO_1716 derivative strains are shown by the red bars.

Figure Comparison of aTgt and ArcS domains. aTgt and ArcS have similar C-terminal organization; however, the C1 domain of ArcS contains a conserved motif specific to ArcS (*). In addition, the N-terminal domain is not present in ArcS.

Figure PCR and Southern blot verifications for the HVO_2008 gene deletion. A) PCR verification using primers annealing upstream and downstream of HVO_2008. B) PCR verification using primers annealing within HVO_2008. C) Southern blot verification. The asterisks represent the H26 WT.

[Chromatogram labels: WT empty plasmid; ΔtgtA2 empty plasmid; ΔtgtA2 ptgtA2 Hs]
Figure LC-MS/MS analysis of bulk tRNA extracted from H. volcanii ΔaTgtA2 (arcS) derivatives. The UV chromatogram showed the presence of a new peak corresponding to preQ0 in the H. volcanii ΔaTgtA2 strain. When the H. salinarum aTgtA2 homolog was expressed in trans in the mutant strain (ΔtgtA2 ptgtA2 Hs), the WT profile of G+ was restored. The extracted ion chromatograms are also shown.

CHAPTER 4
ALTERNATIVE ARCHAEOSINE BIOSYNTHESIS ROUTES

Background

The preQ0 molecule is an intermediate in both archaeosine and queuosine synthesis. Even though G+ and Q share similar early steps, the last steps of archaeosine biosynthesis are specific to Archaea (Iwata-Reuyl, 2003). The last step of archaeosine biosynthesis in Euryarchaea is performed by the gene product of arcS, which transfers ammonia from glutamine or asparagine directly to preQ0 to form archaeosine (Phillips et al., 2010). Although ArcS is an amidotransferase, it shares little or no similarity with other amidotransferases. Most amidotransferases have a glutamine amide transfer (GAT) domain that exhibits glutaminase activity and a substrate binding domain called the synthase domain. GAT domains are found in many amidotransferases and are classified into two classes (Massiere and Badet-Denisot, 1998). Class I is characterized by a catalytic triad formed by Cys, His, and Glu or Asp. Class II is characterized mainly by a conserved Cys at the amino terminus of the protein; no conserved catalytic triad was observed (Massiere and Badet-Denisot, 1998). Neither GATI nor GATII characteristics are present in ArcS, which raises the possibility of alternative routes for the late step of G+ biosynthesis.

Results

An analysis of the phylogenetic distribution of ArcS across the archaeal domain, performed by Dr. de Crécy-Lagard, showed that although almost all Archaea have aTgt and G+, not all have ArcS. ArcS is missing in many Crenarchaea (Figure 4-1). Some of these organisms contain G+, as shown by the McCloskey group (Dalluge et al., 1997; Edmonds et al., 1991). Several of these Crenarchaea have longer QueC protein than

other QueC found in Archaea (470 instead of 270 residues). Further analysis revealed the presence of a glutamine-dependent amidotransferase type II domain (GATII) fused with QueC (GATII-QueC). The GATII-QueC protein has a conserved N-terminal Cys, which is typical of type II amidotransferases. The GATII domain catalyzes the transfer of the amide nitrogen from glutamine to the appropriate substrate. As demonstrated in Chapter 3, ArcS is a glutamine-dependent amidotransferase.

Even though the GATII-QueC fused family was the obvious candidate for the enzyme that would transfer the amido group to the nitrile of preQ0 to form G+, it was not present in all Crenarchaea that lack ArcS (Figure 4-1). Hence, another candidate gene family was proposed: a queF-like gene that physically clusters with the queC gene in A. pernix and encodes a protein family with high similarity to QueF. The presence of the QueF enzyme in Archaea was unexpected because QueF is an NADPH-dependent oxidoreductase that reduces preQ0 to preQ1 in bacterial Q biosynthesis (Lee et al., 2007; Swairjo et al., 2005; Van Lanen et al., 2005). Hence, a structure-based alignment was performed using as input the structure of B. subtilis YkvM (Swairjo et al., 2005; Van Lanen et al., 2005) and the available archaeal QueF-like sequences. The analysis revealed that the QueF motif ({E78[SL]K[SA]hK[LY][YFW]85}) is not present in the QueF-like protein; however, the catalytic cysteine (Cys56, B. subtilis numbering) and the substrate binding C-terminal glutamate are highly conserved (Lee et al., 2007) (Figure 4-2). Accordingly, we predicted that QueF-like catalyzes the amido transfer to preQ0, thus forming archaeosine, and that G+ is formed before being charged onto tRNA. The sequence analysis revealed that neither GATII-QueC nor QueF-like proteins possess any known tRNA binding domain. If

G+ were formed before being charged onto tRNA, then the binding pocket of the crenarchaeal aTgt should be slightly different from that of the euryarchaeal aTgt, which takes preQ0 as a substrate. An alignment of representative aTgt sequences from both Euryarchaea and Crenarchaea showed that the residues of the substrate binding pocket are indeed different; Val197/Val198/Pro199 have been replaced with Pro/Thr/Thr (Figure 4-4).

Because genetic manipulation of Crenarchaea that have GATII-queC or queF-like homologs would be almost impossible, we decided to use an E. coli system to validate the above predictions experimentally. The advantages are that E. coli genetic tools are well developed and tested, the organism grows fast, and, most importantly, bTgt is promiscuous. It is known that bTgt catalyzes the transglycosylase reaction in vitro using multiple substrates (preQ0, preQ1, guanine, and Q base) (Stengl et al., 2005). In addition, the analysis of the bTgt crystal structure from Z. mobilis (PDB 1WKD) suggested that the bTgt binding pocket is large enough to accommodate G+ (Grosjean and Benne, 1998).

In Some Crenarchaea, QueF-like Protein Catalyzes the Last Step in Archaeosine Biosynthesis

P. calidifontis is a crenarchaeon that contains aTgt, QueE, and QueC but has no ArcS homolog. To verify the presence of G+ in an ArcS-deficient organism, the tRNA extracted from P. calidifontis was analyzed by LC-MS/MS. The P. calidifontis cell paste (a kind gift from Todd Lowe) was thawed and prepared for tRNA extraction. Bulk tRNA was extracted as described in Materials and Methods (Chapter 2). The digested ribonucleosides were analyzed by LC-MS/MS. The UV chromatogram showed the G+ peak (325 m/z) eluting at 25.5 min (Figure 4-5). Thus, the presence of G+ does not

depend on the presence of ArcS. To test whether queF-like is involved in G+ biosynthesis, a heterologous system was constructed (Figure 4-6A). The queF gene was deleted in the E. coli MG1655 strain to ensure that no preQ1 or Q was formed. The resulting strain (VDC2041) was transformed with the pBAD24 vector containing queF-like (Pcal_0221) from P. calidifontis (pgp358) to yield strain VDC3368. As a negative control, E. coli ΔqueF was transformed with pBAD24 (VDC3367). If QueF-like is involved in G+ biosynthesis, then tRNA from the E. coli ΔqueF strain with QueF-like expressed in trans should contain G+ instead of Q.

The deletion strain derivatives were grown in LB until late exponential phase. The expression of QueF-like was induced by adding 0.2% arabinose to the growth medium. Bulk tRNA was extracted, purified, and hydrolyzed. The resulting ribonucleosides were analyzed by LC-MS/MS. The UV profile of the deletion strain derivative expressing queF-like in trans showed the G+ peak at 25.7 min (325 m/z) and preQ0 at 26.1 min (308 m/z) (Figure 4-7). The identity of the G+ peak was confirmed by MS/MS (Figure 4-7 inset). The UV profile of the negative control (VDC3367) revealed only the presence of preQ0 at 26.1 min (308 m/z), as previously verified experimentally (Reader et al., 2004; Van Lanen et al., 2005). The presence of the preQ0 peak in E. coli ΔqueF pqueF-like Pc was expected because the cell still produces preQ0. Reuter et al. reported that bTgt binds preQ0 with low affinity (Reuter and Ficner, 1995) and charges it onto specific tRNAs; however, the binding affinity of bTgt for G+ has not yet been determined.

In Other Crenarchaea, GATII-QueC Protein Catalyzes the Last Step in Archaeosine Biosynthesis

S. acidocaldarius is another crenarchaeon that lacks ArcS; instead, it possesses the GATII-QueC homolog. The presence of G+ in S. acidocaldarius was confirmed by

the McCloskey group (Edmonds et al., 1991; Kowalak et al., 1994). Thus, the presence of G+ does not depend on the presence of ArcS. To test whether GATII-QueC is involved in G+ synthesis, a heterologous system was constructed (Figure 4-6B). The queF and queC genes were deleted in the E. coli MG1655 strain to ensure that no preQ1 or Q would be produced. The resulting strain (VDC3280) was transformed with a plasmid containing the GATII-queC homolog from S. solfataricus (SSO0016) under the control of the pBAD (araC) promoter (a kind gift from JSCG), yielding VDC3282. As a negative control, E. coli ΔqueC ΔqueF was transformed with pBAD24 (VDC3281). If GATII-QueC is involved in G+ biosynthesis, then tRNA extracted from the E. coli ΔqueC ΔqueF strain with GATII-QueC expressed in trans should contain G+ instead of Q.

The deletion strain derivatives were grown in LB until late exponential phase. The expression of GATII-QueC was induced by adding 0.2% arabinose to the growth medium. Bulk tRNA was extracted, purified, and enzymatically hydrolyzed. The resulting ribonucleosides were analyzed by LC-MS/MS. The UV profile of the deletion strain derivative expressing GATII-queC Ss in trans showed the G+ peak at 25.5 min (325 m/z) and preQ0 (308 m/z) at 26.1 min (Figure 4-8). The identity of the G+ peak was confirmed by MS/MS (Figure 4-8 inset). The UV profile of the tRNA extracted from the negative control showed no preQ0 or Q. The presence of preQ0 in E. coli ΔqueC ΔqueF pGATII-queC Ss might be due to the C-terminal domain of GATII-QueC, QueC, which perhaps catalyzes the formation of preQ0.

Bacterial Tgt Charges Archaeosine at Position 34 of tRNA Asp

Archaeal Tgt recognizes the D loop of tRNA and catalyzes the exchange of the guanine at position 15 with the preQ0 free base. Bacterial Tgt, on the other hand, recognizes the U33 G34 U35 sequence of specific tRNAs (His, Asp, Asn, and Tyr) (Mueller and

Slany, 1995) and catalyzes the exchange of guanine at position 34 with the preQ1 free base. preQ0 is also a substrate for bTGT in vitro (Reuter and Ficner, 1995). Bacterial cells contain both free preQ1 and preQ0, but the preferred substrate of bTGT is preQ1. The preference is due to the higher affinity of the enzyme for preQ1 (0.4 µM) than for preQ0 (2.4 µM) (Tidten et al., 2007). In vitro studies showed that, besides modifying tRNA, bTGT also modifies mRNA (Hurt et al., 2007) if the appropriate recognition elements are provided. These recognition elements are a hairpin structure with the UGU sequence at a position analogous to that in tRNA (Hurt et al., 2007). We predicted that bTGT introduces archaeosine at position 34 of tRNA(Asp), tRNA(Asn), tRNA(His), and tRNA(Tyr). To validate the prediction, position 34 was analyzed in tRNA extracted from the heterologous strain E. coli ΔqueF pqueF-like Pc (VDC3368). Because tRNA(Asp) makes up about 2% of total tRNA in a cell (Bailly et al., 2006), it was chosen to be purified and sequenced to map the position of G+. E. coli ΔqueF pqueF-like Pc (VDC3368) and E. coli ΔqueF pBAD24 (VDC3367) were grown in LB until late exponential phase. Bulk tRNA was extracted and purified. tRNA(Asp) was isolated from the bulk tRNA using biotinylated primers bound to streptavidin sepharose resin (Harada et al., 1972; Rinehart et al., 2005). Ten µg of tRNA(Asp) were digested with RNase T1 (Harada et al., 1972), which cleaves single-stranded RNA after guanine residues. The theoretical RNase T1 digestion profile of E. coli tRNA(Asp) is shown in Figure 4-9A. The fragment that contains the anticodon loop is specific for tRNA(Asp). We predicted that if bTGT introduces G+ at position 34 of tRNA(Asp), then the tRNA(Asp) digestion fragment C-C-U-Q-U-C-m2A-C-Gp should have G+ in VDC3368 or preQ0 in VDC

instead of Q. Indeed, LC-MS/MS analysis and sequencing of the tRNA(Asp) digested with RNase T1, performed by our collaborator Kirk Gaston (Pat A. Limbach laboratory at the University of Cincinnati), confirmed the presence of G+ at position 34 in VDC3368 (Figure 4-9B). The same fragment of tRNA(Asp) purified from VDC3367 contained preQ0 at position 34 (Figure 4-9B). Other posttranscriptional modifications were not affected. Thus, bacterial Tgt charges G+ at position 34 of tRNA(Asp) in the heterologous system expressing QueF-like in trans. These results strongly suggest that GATII-QueC or QueF-like catalyzes the last step of G+ biosynthesis in Crenarchaea.

Discussion

ArcS, which catalyzes the last step in G+ biosynthesis in Euryarchaea, is not present in all Crenarchaea. Using a combination of bioinformatics and genetics tools, we predicted that the last step of G+ biosynthesis in Crenarchaea is performed through alternative pathways. Indeed, in some Crenarchaea this step is most likely catalyzed by GATII-QueC, and in others by QueF-like enzymes. The bioinformatics analysis performed on representative aTGTs from both Crenarchaea and Euryarchaea showed that the binding residues in Crenarchaea differ from those in Euryarchaea: Pro/Thr/Thr and Val197/Val198/Pro199, respectively. The Val197Pro mutation might enlarge the binding pocket enough to accommodate G+ because of the shorter side chain of Pro. The polarity of threonine might also stabilize G+. Thus, only a few amino acid changes would alter the substrate binding specificity of the enzyme while the reaction specificity remains the same. Changes in substrate binding residues were also observed in eukaryal Tgt. Bacterial and eukaryal Tgts catalyze the same reaction: the exchange of guanine at position 34 of certain tRNAs with free 7-deazaguanine derivative bases. Bacterial Tgt binds preQ1 and charges it into tRNA

(Reuter and Ficner, 1995). Eukaryal Tgt binds queuine and charges it into tRNA (Chen et al., 2010). The substrate specificity residues of the bacterial enzymes are Leu231/Ala232/Val233/Glu235 (Z. mobilis numbering). There are not enough studies on eukaryotic Tgt to determine the residues of eTGT that bind its substrate, queuine. However, homology models based on the C. elegans sequence suggest that the Val233Gly change, specific to eukaryotic Tgt, significantly enlarges the binding pocket, thus allowing the binding of extended preQ1-like substrates such as queuine (Stengl et al., 2005). Biochemical studies will be needed to verify whether G+ is formed before being charged to tRNA. Biochemical studies are underway to gain insight into the catalytic mechanisms of both QueF-like and GATII-QueC.

[Figure: phylogenetic tree of Euryarchaea and Crenarchaea with presence/absence columns for aTGT, ArcS, GAT-QueC, and QueF-like]

Figure 4-1. Phylogenetic distribution of ArcS, GAT-QueC, and QueF-like in Archaea. The filled rectangles show the presence of genes. The empty rectangles show the absence of genes. The tree was constructed from aTGT sequences of 36 representative archaeal organisms using the Neighbor-Joining method implemented in MEGA

Figure 4-2. Structure-based alignment of bacterial QueF and crenarchaeal QueF-like protein sequences. The structure of B. subtilis YkvM was used as the secondary structure reference. Blue highlight marks the bacterial QueF NADPH binding site (QueF motif). Red highlight marks the corresponding region of crenarchaeal QueF-like proteins (no NADPH binding residues). Asterisks mark substrate binding residues. The filled black circle marks the conserved catalytic nucleophile.

[Pathway diagram: GTP, preQ0, preQ0-tRNA, G+ base, and G+-tRNA, with enzyme labels FolE1/FolE2, QueDEC, QueDE, GATII-QueC, QueF-like, arcTGT, and ArcS]

Figure 4-3. Proposed models for the last step in G+ biosynthesis. The ArcS path: preQ0 is charged to tRNA by aTGT, then ArcS catalyzes the formation of G+. In the proposed model (red arrows), G+ is formed by the gene products of GATII-queC or queF-like before being charged to tRNA; aTGT then charges G+ into tRNA.

[Alignment; boxed columns are labeled "Substrate binding residues"]

Euryarchaea
Natronomonas pharaonis      DVFPVGAVVPLMNSYRYGDMIEAILGAKRGLGADAPVHLFGAGHPMMFAL
Haloarcula marismortui      DVFPLGAVVPLMNEYRYADLADVVAACKRGLGEVGPVHLFGAGHPMMFAM
Methanosaeta thermophila    DLYPIGAVVPLMESYRFRELVDVVVASKTGLGPGVPVHLFGAGHPMVFAL
Thermoplasma acidophilum    GYHPIGGVVPLLETYDYSTLVDIIINSKINLSFNKPVHLFGGGHPMFFAF
Picrophilus torridus        LYLPIGGVVPLLESYRYSDLVKIIFNSKVSSDFSRPVHLFGGGHPMFFAF
Pyrococcus horikoshii       EIHPIGGVVPLLESYRFRDVVDIVISSKMALRPDRPVHLFGAGHPIVFAL
Thermococcus kodakarensis   EIHPIGAVVPLMESYRYRDLVDVVIASKVGLRPDRPVHLFGAGHPMIFAL

Crenarchaea
Metallosphaera sedula       KMLALGSPTVFMEKYKYDTLVDMIYTAKSSVSRGVPFHLFGGGVPHIIPF
Sulfolobus acidocaldarius   KMLALGSPTVLMQRYEYAPLIDMIYKSKSNVSRGKPFHLFGGGHPHIFAF
Pyrobaculum arsenaticum     PILAIGSPTTLLEEYRFDVLLEAVLHVKANITREAPLHLFGAGHPLILPF
Pyrobaculum aerophilum      HIFAVGSPTTLLEEYRFDLLLEVILHVKANILREAPLHLFGAGHPLVLPF
Pyrobaculum islandicum      HIYAIGSPTTLLEEYKFDLILKIVLDVKLNMMREAPLHLFGAGHPLVLPF
Caldivirga maquilingensis   DIYAIGSPTTLLQAYNFTGIIKMILTVKSIIPPGKPVHLFGVGHPLILPL

Figure 4-4. Alignments of representative aTGTs from Euryarchaea and Crenarchaea. The alignments were performed using ClustalW2. The blue box represents the binding residues of euryarchaeal Tgts. The red box represents the binding residues of crenarchaeal Tgts.

[Chromatogram: UV trace with peaks labeled m2,2G, G+, and t6A; inset: XIC at 325 m/z]

Figure 4-5. LC-MS/MS analysis of bulk tRNA extracted from P. calidifontis. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak elutes at 25.5 minutes. The m2,2G and t6A peaks are shown as internal standards.

[Pathway diagrams, panels A and B: the E. coli Q pathway (intermediates GTP, H2NTP, CPH4, CDG, preQ0, preQ1, preQ1-tRNA, epoxyQ-tRNA, Q-tRNA; genes folE, queD, queE, queC, queF, bTGT, queA, queG) with the added route from preQ0 to archaeosine and G+-tRNA. Panel A: E. coli ΔqueF pqueF-like Pc. Panel B: E. coli ΔqueC ΔqueF pGATII-queC Ss.]

Figure 4-6. Construction of the E. coli heterologous systems. A) queF was deleted so that no Q was made, and queF-like Pc was added in trans to form G+ (dark red arrows). B) queC and queF were deleted, and GATII-queC was added in trans to form G+ (dark red arrows). Black represents the Q pathway (genes and intermediates) in E. coli.

[Chromatogram panels: E. coli ΔqueF pBAD24; E. coli ΔqueF pqueF-like]

Figure 4-7. LC-MS/MS analysis of tRNA extracted from E. coli ΔqueF derivatives. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak (325 m/z, 25.7 minutes) is present in the mutant strain expressing QueF-like Pc in trans.

[Chromatogram panels: E. coli ΔqueC ΔqueF pBAD24; E. coli ΔqueC ΔqueF pGAT-queC]

Figure 4-8. LC-MS/MS analysis of bulk tRNA extracted from E. coli ΔqueC ΔqueF derivatives. The UV traces at 254 nm and the extracted ion chromatograms (insets) for 325 m/z are shown. The G+ peak (325 m/z, 25.5 minutes) is present in the mutant strain expressing GATII-QueC Ss in trans.

[Panel A: table of the theoretical RNase T1 fragments of E. coli tRNA(Asp) (position, mass, 3'-phosphate sequence), from Gp at position 1 to the 3'-terminal CCA, including the anticodon fragment. Ribonuclease T1: endonucleolytic cleavage of RNA to yield nucleoside 3'-phosphates ending mainly in Gp. Panel B: CID spectra (relative abundance versus m/z) for VDC3367 (C-C-U-preQ0-U-C-m2A-C-Gp) and VDC3368 (C-C-U-G+-U-C-m2A-C-Gp), with c, y, w, and a-B fragment ions labeled]

Figure 4-9. Analysis of the RNase T1 digest of tRNA(Asp). A) tRNA(Asp) RNase T1 theoretical digest. For sequencing, we looked at the anticodon fragment (C-C-U-G+-U-C-m2A-C-Gp) as it is specific for tRNA(Asp) and contains position 34. B) The collision induced dissociation (CID) of the preQ0 fragment (VDC3367) and the G+ fragment (VDC3368) with the fragment ions mapped onto the sequence.
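The theoretical digest in panel A follows from one rule stated in the text: RNase T1 cleaves after guanine residues. A minimal Python sketch of that rule is given below. The function name and the toy sequence are hypothetical and are not the real E. coli tRNA(Asp) sequence. The sketch also assumes that a modified base at position 34 (written here with the single letter Q) is not a cleavage site, which is consistent with the intact anticodon fragment reported above, and it ignores fragment masses and all other modifications.

```python
def rnase_t1_digest(rna: str) -> list[str]:
    """Split an RNA string after every unmodified G, as RNase T1 would.

    Internal fragments end in a 3'-phosphate, written here as a trailing "p";
    the 3'-terminal fragment is returned without it.
    """
    fragments = []
    current = ""
    for base in rna:
        current += base
        if base == "G":  # cleavage site: 3' side of guanine
            fragments.append(current + "p")
            current = ""
    if current:  # whatever follows the last G
        fragments.append(current)
    return fragments


# Hypothetical toy sequence (assumption for illustration only).
# "Q" stands for a modified base at the wobble position and is not cleaved.
toy = "AGCCUQUCACGCCA"
print(rnase_t1_digest(toy))  # ['AGp', 'CCUQUCACGp', 'CCA']
```

In this toy case the modified base stays inside a single fragment, which is why one digestion product can report on the identity of the base at position 34.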

CHAPTER 5
FUNCTIONAL DIVERSITY OF THE COG0720 PROTEIN FAMILY

Background

GTP is a molecule essential to energy conservation and signaling. It is also one of the building blocks of RNA and DNA as well as the precursor of a number of primary and secondary metabolites. Among these metabolites are deazaflavin derivatives, pterin-related coenzymes (tetrahydropterin, tetrahydrofolate, methanopterin, and molybdopterin), and 7-deazaguanine derivatives such as queuosine and archaeosine found in tRNA. Many of the enzymes involved in the synthesis of the GTP-derived metabolites are members of the same structural superfamily, the Tunnel-fold (T-fold) superfamily (Colloc'h et al., 2000). This superfamily comprises a functionally diverse group of enzymes that assemble through oligomerization of a core domain, made up of a pair of two-stranded antiparallel β-sheets and two α-helices, to form a β2nαn barrel (Colloc'h et al., 2000). Two barrels associate in a head-to-head fashion and bind planar substrates such as purines or pterins at the interface, using a conserved Glu/Gln residue to anchor the substrate. T-fold enzymes catalyze diverse reactions. The tetrahydrobiopterin (BH4) biosynthesis pathway comprises three enzymes that belong to the T-fold superfamily (Auerbach and Nar, 1997; Nar et al., 1995; Yim and Brown, 1976). GTP cyclohydrolase IA (GCYH-IA) catalyzes the first step of the pathway, producing 7,8-dihydroneopterin triphosphate (H2NTP) from GTP (Nar et al., 1995; Yim and Brown, 1976). H2NTP is then converted to 6-pyruvoyl-tetrahydropterin (PTP) by 6-pyruvoyl-tetrahydropterin synthase (PTPS-II), encoded in rat by ptps (Milstien and Kaufman, 1989; Park et al., 1990) (Figure 5-1). PTP is then reduced to BH4 by sepiapterin reductase (SR encoded

by the spr gene) (Milstien and Kaufman, 1989; Smith, 1987). GTP cyclohydrolase IA is also the first enzyme of the THF synthesis pathway (Yim and Brown, 1976). In some organisms, GTP cyclohydrolase IA can be replaced with GTP cyclohydrolase IB (GCYH-IB) (El Yacoubi et al., 2006), another T-fold enzyme (Sankaran et al., 2009). The THF pathway contains a second T-fold enzyme, dihydroneopterin aldolase (DHNA, encoded in E. coli by folB) (Garçon et al., 2006) (Figure 5-1). Recently, it was shown that in P. falciparum, as well as in several bacteria, the DHNA step is bypassed by yet another T-fold enzyme, PTPS-III, a homolog of PTPS-II, that directly cleaves H2NTP to 6-hydroxymethyl-7,8-dihydropterin (6HMDP) (Hyde et al., 2008; Pribat et al., 2009) (Figure 5-1). All of the Q biosynthesis steps have been elucidated, and three enzymes of the Q pathway are T-fold enzymes. First, GCYH-IA or GCYH-IB catalyzes the first step not only of folate but also of Q and G+ synthesis (Phillips et al., 2008). The second enzyme of the pathway, PTPS-I or QueD, homologous to PTPS-II, was shown to catalyze the formation of 6-carboxy-5,6,7,8-tetrahydropterin from H2NTP in vitro (McCarty et al., 2009) (Figure 5-1). Strains of A. baylyi sp. ADP1 or E. coli carrying a deletion of the corresponding gene lack Q (El Yacoubi et al., 2006; Reader et al., 2004). QueF, the oxidoreductase that reduces the nitrile side chain of preQ0 to the aminomethyl side chain of 7-aminomethyl-7-deazaguanine (preQ1), the next intermediate in the Q pathway, is also a T-fold enzyme (Van Lanen et al., 2005). Bacterial tRNA guanine transglycosylase charges preQ1 into specific tRNAs (Watanabe et al., 1997). The SAM-dependent tRNA ribosyltransferase (QueA) catalyzes the formation of epoxyQ (Kinzie et al., 2000). Finally, epoxyQ is reduced to queuosine by epoxyqueuosine reductase (QueG) (Miles et al., 2011).
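The order of the Q pathway steps described above can be written down as a small data structure, which makes the logic of the deletion strains easier to follow. The Python sketch below is an illustration only: the names `Q_PATHWAY` and `last_intermediate` are hypothetical, the QueE and QueC steps (CPH4 to CDG to preQ0) are taken from Figure 4-6 rather than from this page, and the model is strictly linear. It ignores the alternative first enzyme (GCYH-IB) and the observation that bacterial Tgt can also charge preQ0 into tRNA.

```python
# Each step: (gene, substrate, product), in pathway order.
Q_PATHWAY = [
    ("folE", "GTP", "H2NTP"),
    ("queD", "H2NTP", "CPH4"),
    ("queE", "CPH4", "CDG"),
    ("queC", "CDG", "preQ0"),
    ("queF", "preQ0", "preQ1"),
    ("tgt", "preQ1", "preQ1-tRNA"),
    ("queA", "preQ1-tRNA", "epoxyQ-tRNA"),
    ("queG", "epoxyQ-tRNA", "Q-tRNA"),
]


def last_intermediate(deleted_genes):
    """Return the furthest compound reached when the given genes are deleted.

    Simplifying assumption: the pathway stops at the first missing step.
    """
    current = "GTP"
    for gene, _substrate, product in Q_PATHWAY:
        if gene in deleted_genes:
            break
        current = product
    return current


print(last_intermediate(set()))             # Q-tRNA
print(last_intermediate({"queF"}))          # preQ0
print(last_intermediate({"queC", "queF"}))  # CDG
print(last_intermediate({"queD"}))          # H2NTP
```

Under these assumptions a ΔqueF strain stops at preQ0 and a ΔqueD strain stops at H2NTP, so neither strain can make Q.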

Functional diversity resides not only among the different T-fold subfamilies but also within a given subfamily. To date, three members of the COG0720 subfamily, PTPS-I, PTPS-II, and PTPS-III, have been shown to catalyze different reactions in different pathways (Figure 5-1), and a fourth COG0720 member, PTPS-IV, whose structure was recently determined, has an as yet unknown function (Spoonamore et al., 2008). The PTPS-II family is the best characterized mechanistically and structurally. The 3D structure of rat liver PTPS-II is a homohexamer formed by a dimer of trimers with 3-fold symmetry (Nar et al., 1994). Based on molecular modeling, site-directed mutagenesis, and refined crystal structures of the enzyme alone and in complex with its natural substrate, it was shown that substrate binding and catalysis occur at the interface of the two trimers (Ploom et al., 1999). The active site of PTPS-II consists of Cys42 (R. norvegicus numbering) from one trimer and Asp88 and His89 from the adjacent trimer (Ploom et al., 1999). Zn(II) plays an important role in catalysis. The Zn(II) binding site comprises three histidine residues from the same monomer: His23, His48, and His50 (Burgisser et al., 1995; Ploom et al., 1999). The proposed role of Zn(II) is to activate the substrate proton, stabilize the intermediates, and disfavor the breaking of the C1-C2 bond in the pyruvoyl side chain. The proposed reaction proceeds through a complex mechanism involving base-catalyzed redox transfer and triphosphate elimination (Le Van et al., 1988). Employing a combination of bioinformatics and genetics tools, we showed that the COG0720 protein superfamily contains different members involved in different biosynthetic pathways. We also showed that the members of the COG0720 superfamily

retained similar catalytic motifs. This similarity leads to promiscuous catalytic activity and relaxed substrate specificity among the members of the COG0720 superfamily.

Results

Separation of Six COG0720 Protein Subfamilies by Comparative Genomics

Members of the COG0720 family are difficult to annotate. The first well-characterized member of the COG0720 family was 6-pyruvoyl tetrahydropterin synthase (PTPS-II), involved in BH4 biosynthesis (Le Van et al., 1988; Milstien and Kaufman, 1989; Ploom et al., 1999). Most PTPS-II homologs were therefore annotated as 6-pyruvoyl tetrahydropterin synthase. In Bacteria, many members of the COG0720 family are currently annotated as PTPS-IIs even though Bacteria, with the exception of the cyanobacteria, do not generally produce BH4 (Jin Sun et al., 2006). Of 810 bacterial COG0720 sequences in RefSeq (Pruitt et al., 2007), 516 are annotated as 6-pyruvoyl tetrahydropterin synthases. Jin Sun et al. have shown that the COG0720 family has multiple members (Jin Sun et al., 2006). They showed that Synechococcus sp. PCC7942 has two COG0720 homologs: one that has canonical PTPS-II activity in vitro (YP_ ) and another that has only 10% of the canonical PTPS-II activity (YP_ ). A BLAST search of the Synechococcus sp. genome using the rat PTPS-II enzyme (NP_ ) as the query retrieved only two COG0720 proteins. The one with lower similarity (YP_ E-value: 5e-20) was involved in BH4 biosynthesis because its deletion affected the levels of BH4. The other one (PTPS-I), with higher similarity (YP_ E-value: 6e-31), was not involved in BH4 biosynthesis; later, it was shown to be involved in Q biosynthesis (Jin Sun et al., 2006; McCarty et al., 2009; Reader et al., 2004).

Using the SEED database (Overbeek et al., 2005), a comparative genomic analysis was performed on the six COG0720 (PTPS) subfamilies. Dr. de Crécy-Lagard built a SEED subsystem, "Experimental - PTPS". In this subsystem, 918 genomes were analyzed; of these, 114 genomes have more than one copy of COG0720, showing that the risk of misannotation is very high.

Cluster analysis

Physical clustering analysis revealed that members of the subfamilies could be efficiently separated by analyzing the identity of their neighbors. As shown in Figure 5-2, ptps genes (encoding PTPS-II enzymes) cluster with other genes of the BH4 pathway (such as folE and spr). Out of 99 organisms containing PTPS-II genes, 14 cluster with folE and/or spr. Similarly, out of 563 sequenced organisms containing queD genes encoding PTPS-I, 283 cluster with other queuosine genes (queCEF) (Reader et al., 2004) and folE or folE2 (Phillips et al., 2008) (Figure 5-2). Finally, out of the 64 genes encoding PTPS-III that can functionally replace folB (Hyde et al., 2008; Pribat et al., 2009), 16 cluster with folate biosynthesis genes such as folE, folK, and folP (Figure 5-2). The PTPS-IV family is still of unknown function, but physical clustering suggests a link with riboflavin. PTPS-IV genes are found only in a few halophilic Archaea and actinomycetes (a total of 14 organisms). In both groups, they cluster with GTP cyclohydrolase III (GCYH-III) genes (arfA) (Graham et al., 2002) in Archaea or GTP cyclohydrolase II genes (ribA2) in Bacteria (Spoonamore and Bandarian, 2008), and with a formamide hydrolase gene (arfB) that encodes the subsequent enzyme in the GCYH-III dependent riboflavin pathway (Grochowski et al., 2009).
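The counts reported above are easier to compare as percentages. The short Python sketch below only restates the numbers given in the text; the dictionary labels and the helper name `percent` are illustrative choices, not part of the original analysis.

```python
# (count with the feature, total examined), as reported in the text.
counts = {
    "genomes with more than one COG0720 copy": (114, 918),
    "PTPS-II genes clustered with folE and/or spr": (14, 99),
    "PTPS-I (queD) genes clustered with Q genes": (283, 563),
    "PTPS-III genes clustered with folate genes": (16, 64),
}


def percent(part, total):
    """Return part/total as a percentage."""
    return 100.0 * part / total


for label, (part, total) in counts.items():
    print(f"{label}: {part}/{total} = {percent(part, total):.1f}%")
```

With these numbers, roughly 12% of the analyzed genomes carry more than one COG0720 copy, and physical clustering supports a pathway assignment for about 14% (PTPS-II), 50% (PTPS-I), and 25% (PTPS-III) of the respective genes, so clustering alone leaves a large share of family members unassigned.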

Phylogeny and motif derivation

Previous sequence and structural analysis of the PTPS-III family showed that the PTPS-II and PTPS-III families can be distinguished by specific motifs surrounding the catalytic residues: {CX(5)HGH} for PTPS-II enzymes (R. norvegicus numbering) and {EX(2)HGH} for PTPS-III enzymes (P. falciparum numbering) (Dittrich et al., 2008; Hyde et al., 2008). We extended this motif analysis to the other PTPS families. Using the PRATT tool (Jonassen et al., 1995) from the PROSITE suite, we derived PROSITE motifs for all members of the PTPS family (Figure 5-3). The PTPS-I group has members in almost all bacterial organisms that synthesize Q de novo; the derived specific motif was {CX(3)HGH}. The PTPS-II group is found mainly in mammals and a few bacteria such as Cyanobacteria and Chlorobia. The derived characteristic motif was {CX(5)HGHX[FY]X}. In some bacteria, the folate biosynthesis FolX and FolB steps are bypassed and replaced by the PTPS-III enzyme (Hyde et al., 2008; Pribat et al., 2009). The derived characteristic motif was {EX[IL]HGHX(3,5)V} (Figure 5-3). The PTPS-IV group, with members in some Archaea and actinomycetes, has the derived characteristic motif {FX(0,1)GX[ANTV]}. Two groups of COG0720 proteins that did not contain any of the motifs identified in the PTPS-I/II/III/IV families were found in Crenarchaea. The PTPS-V group had members in all sequenced Pyrobaculum sp., in Vulcanisaeta sp., and in T. neutrophilus and contained a {SX(2)WX(3)HGH} motif. The PTPS-VI group had members in M. sedula DSM 5348 and in all sequenced Sulfolobus sp.; the derived motif for PTPS-VI was {SSX(4)QXHGH} (Figure 5-3). The specificity of the derived PTPS motifs was tested using the motif search tool PHI-BLAST from NCBI as well as ScanProsite (de Castro et al., 2006) from the PROSITE database (Hulo et al., 2006). In addition to the PROSITE-derived motifs, we also

generated weblogos for each of the PTPS subfamilies (Figure 5-3) using WebLogo 3.0 (Crooks et al., 2004). To generate the logos, at least 40 sequences were used for each PTPS subfamily, with the exception of PTPS-IV (only 14 sequences, owing to the lack of sequenced organisms that contain members of this family), PTPS-V (7 sequences), and PTPS-VI (6 sequences). The six motifs were used as signature motifs to annotate the COG0720 subfamilies. We found these signature motifs more reliable than phylogenetic trees for separating the subfamilies. Indeed, both the neighbor-joining and parsimony methods of the MEGA 4 package (Tamura et al., 2007), using as input a multiple alignment of 48 COG0720 proteins (Appendix E) performed with CLUSTALW2 (Chenna et al., 2003) or the SATCHMO-JS tool (Edgar and Sjolander, 2003) from the PhyloFacts suite (Glanville et al., 2007), failed to separate the PTPS-I/III and PTPS-III as well as PTPS-V and PTPS-VI subfamilies with higher than 50% confidence (Figure 5-4). Members of the PTPS-II and PTPS-IV families did, however, group in separate branches (Figure 5-4).

Structural Analysis of the COG0720 Family

To gain insight into the structural basis for the plasticity of the active site, the structure of the extensively biochemically and structurally characterized rat PTPS-II enzyme (PDB 1B66) was compared with the structure of P. aeruginosa PTPS-I (PDB 2OBA). Structural superimposition was performed on monomers of each protein structure. The best global fit was obtained using the PTPS-II structure from R. norvegicus as the reference, with the following three homology regions used as anchoring sites: the first from Thr66 to Gly67, the second from Thr105 to Glu107, and the third from Glu133 to Tyr134 (R. norvegicus numbering). Overall, the two structures superimpose with rmsd of (Figure 5-5A). The spatial locations of the canonical

T-fold Glu residues, which interact with the exocyclic pterin amine, are conserved in both structures (Glu133/Glu107) (Ploom et al., 1999). The catalytic residues Asp88 and His89 (R. norvegicus numbering) occupy the same coordinates in both proteins. The three histidines known to be involved in Zn(II) binding in PTPS-II (His23, His48, and His50) (Ploom et al., 1999) occupy the same coordinates as the Zn(II) binding histidines (His13, His38, and His30) of PTPS-I. The distances from Cys to Zn(II) are conserved in both enzymes (Figure 5-5). Thus, Cys24 occupies the same spatial position in the P. aeruginosa PTPS-I structure as Cys42 does in the R. norvegicus PTPS-II structure. The main difference between the two proteins was found around the active site. R. norvegicus PTPS-II has two extra helix domains, one on the N-terminal side adjacent to Cys42 and the other around Asp88/His89 (Figure 5-5A). In addition, the structures of PTPS-I from P. aeruginosa (PDB 2OBA) and PTPS-III from P. falciparum (PDB 1Y13) were superimposed using the rigid FATCAT tool (Ye and Godzik, 2003) from the PDB database (Wilson et al., 2009). The two structures superimpose with an rmsd of 2.21; the catalytic site formed by Asp67 and His68 (PDB 2OBA) occupies the same coordinates as Asp79 and His80 in PDB 1Y13; and the spatial locations of the canonical T-fold Glu residues, which interact with the exocyclic pterin amine, are conserved in both structures (Glu107/Glu161) (Ploom et al., 1999) (Figure 5-5C). The spatial coordinates of the three histidines that coordinate Zn(II) are also conserved in both structures (Figure 5-5D). The key catalytic residue Cys24 of PTPS-I (PDB 2OBA) is missing in PTPS-III (PDB 1Y13). Glu38 in 1Y13 (Hyde et al., 2008) might replace Cys24. The distances to Zn(II) from the nucleophiles Cys24 of PDB 2OBA and Glu38 of PDB 1Y13 are similar; thus the spatial location of the two proposed

nucleophile residues with respect to Zn(II) are also similar in the two enzymes (Figure 5-5D). There are differences between the two structures. The loops containing the active site do not overlap completely. PTPS-III (PDB 1Y13) has an extra loop close to the pocket containing the active site (Figure 5-5C).

PTPS-I/III Protein Functions in Both Folate and Queuosine Pathways

Genomic analysis revealed that a group of bacteria (Figure 5-6) contained only one PTPS-encoding gene but were predicted to require both PTPS-III and PTPS-I activities, as they possess the preQ1 biosynthesis genes (queCEF) as well as the signature folate genes folK and folP (Figure 5-6) but lack queD (encoding PTPS-I) and folB (encoding DHNA). Closer analysis of the PTPS protein encoded in these organisms revealed that it contained a signature motif {CEX[ILPV]HGH} (Table 5-2 and Figure 5-6) that can be considered a hybrid of the PTPS-I and PTPS-III motifs. Pribat et al. demonstrated that a predicted PTPS-I/III enzyme from S. aciditrophicus (YP_ ) exhibited PTPS-III activity, as the corresponding gene complemented the dT auxotrophy of an E. coli ΔfolB strain (Pribat et al., 2009). However, physical clustering linked the corresponding gene to the Q biosynthesis pathway. More generally, out of 38 organisms containing the dual motif, 7 cluster with folate biosynthesis genes and 14 cluster with Q biosynthesis genes. To test whether the PTPS-III proteins containing hybrid motifs also exhibited PTPS-I activity, we examined the nucleoside constituents of bulk tRNA extracted from WT E. coli (MG1655 pBAD24, VDC3339) and from the ΔqueD strain transformed with pBAD24, a Q-deficient strain (VDC3321), or with plasmid derivatives expressing the PTPS-I/III Sa gene (VDC3335) or the PTPS-I/III Sa Cys26Ala gene (VDC3365). Bulk tRNAs were enzymatically hydrolyzed and dephosphorylated, and the ribonucleosides were analyzed by LC-MS/MS. The 410 m/z ion

that corresponds to the protonated molecular weight (MH+) of Q was detected by UV at min for the WT background, while no 410 m/z ion was detected in the ΔqueD pBAD24 strain (Figure 5-7). Expression of the PTPS-I/III Sa gene complemented the Q-deficient phenotype of the E. coli ΔqueD mutant (Figure 5-7). Mutating Cys26 of the {CEX[ILPV]HGH} motif to alanine in the S. aciditrophicus protein abolished complementation of the Q-deficient phenotype by the corresponding gene but not of the dT auxotrophy phenotype (Table 5-1). Similarly, expressing a canonical PTPS-III gene from L. interrogans did not complement the Q-deficient phenotype, whereas the same clone was effective in complementing the dT auxotrophy phenotype of the ΔfolB strain (Pribat et al., 2009) (Table 5-1). In addition, we tested the PTPS-I (YP_ ) and PTPS-I/III Cb (YP_ ) genes from a C. botulinum strain. In this organism, the PTPS-I Cb gene clusters with Q biosynthesis genes, and the PTPS-I/III Cb gene clusters with folate biosynthesis genes (Figure 5-8). Both complemented the Q-deficient phenotype and thus were active as PTPS-I enzymes (Table 5-1). Only PTPS-I/III complemented the dT auxotrophy phenotype and thus exhibited PTPS-III activity (Figure 5-8). These results show that PTPS enzymes that contain hybrid PTPS-III/I motifs are active in both the folate and Q biosynthesis pathways and that the conserved cysteine in that motif is critical for PTPS-I activity but not PTPS-III activity.

Role of COG0720 Proteins in Archaea

Archaeosine and queuosine share a common intermediate, preQ0 (Iwata-Reuyl, 2003). PTPS-I catalyzes a step in preQ0 biosynthesis. Archaea that produce G+ would therefore be expected to encode PTPS-I homologs. Almost all Euryarchaea that have a tgt gene encode a PTPS-I homolog. One exception is the symbiont N. equitans, which probably

salvages the G+ precursor, preQ0, since its genome encodes only the tgtA and arcS genes ("Queuosine and Archaeosine" subsystem in the SEED database). Unexpectedly, PTPS-I genes are absent in many Crenarchaea, quite a few of which are known to produce G+ (Edmonds et al., 1991). This suggests that another enzyme family might catalyze the same reaction. Several archaeal genomes encode COG0720 paralogs. For example, H. volcanii contains both a PTPS-I gene (HVO_1718) and a PTPS-IV gene (HVO_1282), P. furiosus contains both a PTPS-I gene (PF0219) and a PTPS-III gene (PF1278), S. solfataricus contains a PTPS-VI gene (SSO2412), and P. calidifontis contains a PTPS-V gene (Pcal_1063). H. volcanii contains G+ in its tRNAs (Gupta, 1984; Watanabe et al., 1997). In Chapter 3, we showed that HVO_1718, a homolog of bacterial queD (PTPS-I), is involved in G+ biosynthesis. The function of the PTPS-IV protein is less clear. H. volcanii is among the rare Archaea that have a full folate pathway (Levin et al., 2004; Ortenberg et al., 2000), but the folB gene is yet to be identified in these organisms (Falb et al., 2008). One possibility was that even though PTPS-IV does not have the signature PTPS-III motif, it could functionally replace FolB. To test this prediction, we constructed a ΔHVO_1282 strain of H. volcanii and showed that, unlike the H. volcanii ΔfolE2 mutant we had previously constructed (El Yacoubi et al., 2009), it did not require dT for growth (Figure 5-9A). These results suggest that PTPS-IV is not involved in folate biosynthesis. Because physical clustering suggests that PTPS-IV might be involved in riboflavin synthesis, and because specific riboflavin biosynthesis genes are still missing in Archaea (Grochowski et al., 2009), we tested whether HVO_1282 was involved in riboflavin synthesis. As a control, we constructed an H. volcanii strain deleted for the ribA gene

(HVO_1284). As shown in Figure 5-9B, no growth defect was observed in the absence of riboflavin in the ΔHVO_1282 strain whereas, as expected, the ΔHVO_1284 strain required riboflavin to grow. Thus PTPS-IV is involved in neither folate nor riboflavin biosynthesis. Its physiological role is yet to be investigated. We also explored whether the deviant COG0720 members found in S. solfataricus or in P. calidifontis had QueD or FolB activity in E. coli complementation tests. SSO2412, which carries the PTPS-VI motif, was cloned in pBAD24. The resulting plasmid was transformed into E. coli ΔfolB to test whether SSO2412 complements the dT auxotrophy phenotype, and into E. coli ΔqueD to test whether it complements the Q-deficient phenotype. Pcal_1063 was also cloned in pBAD24 and transformed into both E. coli ΔfolB and E. coli ΔqueD. The complementation tests showed that only the expression of SSO2412 (PTPS-VI) complemented the dT auxotrophy of E. coli ΔfolB (Figure 5-9C and Table 5-1), but it did not complement the Q-deficient phenotype of E. coli ΔqueD. Pcal_1063 expression complemented neither the FolB-deficient nor the Q-deficient phenotype (Table 5-1). Hence, the role of PTPS-V is yet to be investigated.

Flexibility of the PTPS Catalytic Site

An exhaustive comparative analysis of the Zur regulon by Haas et al. found that certain bacteria contain two copies of the queD/PTPS-I gene (Haas et al., 2009). We named these two copies queD and queD2, with queD2 predicted to be under the control of the negative regulator Zur, a repressor that senses zinc levels and derepresses the genes under its control when zinc is low (Patzer and Hantke, 1998). The motif derived for this QueD2 subfamily is {CX(4)HGH}. A. baylyi ADP1 contains only a queD2 gene and no queD gene. QueD2 has been shown to be involved in Q biosynthesis by Reader et al. (Reader et al., 2004). We further

tested whether QueD2 proteins could functionally replace PTPS-I enzymes by expressing the A. baylyi sp. ADP1 queD2 gene (YP_ ) in the E. coli ΔqueD strain (VDC4660). As shown in Table 5-1, complementation of the Q deficient phenotype was observed, thus confirming that QueD2 has PTPS-I activity. Interestingly, introducing the Lys23Cys and Cys24Ser mutations in the E. coli QueD protein, thereby changing the {CX(3)HGH} motif to a {CX(4)HGH} motif, also allowed functional complementation (Table 5-1). These results suggested that the PTPS-I catalytic pocket is plastic. To further probe this idea, we tested whether PTPS-II proteins, which contain {CX(5)HGH} motifs, could also function as PTPS-I enzymes. Previous studies (Jin Sun et al., 2006) had shown that PTPS-I from Synechococcus sp. PCC7942 did possess PTPS-II activity in vitro (albeit only 10% of the activity of the canonical PTPS-II from the same organism), but the reverse scenario had never been tested. As shown in Figure 5-8 and Table 5-1, expressing the rat ptps gene in the E. coli ΔqueD strain (VDC3331) did restore the production of Q, thus demonstrating that PTPS-II exhibits enough PTPS-I activity to functionally replace the chromosomally encoded queD, at least when expressed from a multicopy plasmid. Finally, we tested whether shortening the spacer region by mutating the motif to {CX(2)HGH} still yielded a functional PTPS-I enzyme. The E. coli ΔqueD and ΔfolB strains were transformed with a plasmid expressing PTPS-I/III Sa carrying the Cys26Ala and Glu27Cys mutations, which create a {CX(2)HGH} motif. This clone failed to complement either the Q deficient or the dT auxotrophy phenotype (Table 5-1).

Discussion

Our in vivo results in both archaeal and bacterial model organisms confirmed the in vitro studies performed by the Bandarian laboratory (McCarty et al., 2009): PTPS-I/QueD is required to synthesize preQ0. preQ0 is a common intermediate in Q and G+

biosynthesis. PTPS-I/QueD is a member of the COG0720 family, which comprises at least six sub-families of enzymes; members of each of these sub-families are involved in specific biosynthetic pathways. PTPS-I is involved in Q biosynthesis. PTPS-II is involved in BH4 biosynthesis. PTPS-III is involved in folate biosynthesis. PTPS-I/III is involved in both Q and folate biosynthesis. The PTPS-IV and PTPS-V enzymes have no assigned function yet. PTPS-VI might be involved in the synthesis of folate derivatives in Sulfolobus sp. Members of the COG0720 protein superfamily show relaxed substrate and reaction specificities. PTPS-II ({CX5HGH}) not only catalyzes the formation of pyruvoyltetrahydropterin (PPH4) from H2NTP in BH4 biosynthesis in mammals (Milstien and Kaufman, 1989) but also complements the Q deficient phenotype of the E. coli ΔqueD strain. In vitro, QueD/PTPS-I ({CX3HGH}) produces carboxy-tetrahydropterin (CPH4) from H2NTP, its own biological substrate, as well as from sepiapterin, PPH4, and HNTP (McCarty et al., 2009). One of the substrates of PTPS-I, PPH4, is the product of PTPS-II catalysis. PTPS-I takes H2NTP and produces CPH4, while PTPS-II takes H2NTP and forms PPH4. Hence, the two enzymes, PTPS-I and PTPS-II, have evolved to share the main active site features while catalyzing the formation of different products. The dual PTPS-I/III ({CEX(2)HGH}) functions in both Q biosynthesis and folate biosynthesis, catalyzing different reactions in the two pathways. An example of a promiscuous enzyme family is the alkaline phosphatase (AP) superfamily. The AP enzyme and its evolutionarily related member nucleotide pyrophosphatase/phosphodiesterase (NPP) belong to this superfamily. AP hydrolyzes phosphate monoesters and has a low activity

for phosphate diester hydrolysis. Conversely, NPP preferentially hydrolyzes phosphate diesters but also hydrolyzes phosphate monoesters with a much lower activity (Auerbach and Nar, 1997; Zalatan et al., 2008). Another example is a member of the enolase superfamily, o-succinylbenzoate synthase (OSBS), which functions both as succinylbenzoate synthase in the menaquinone biosynthetic pathway and as N-acylamino acid racemase (NAAAR) in the racemization of N-acetylmethionine (Gerlt et al., 2005; Hult and Berglund, 2007). Although the OSBS and NAAAR subfamilies do not share a high sequence similarity, the catalytic residues are very well conserved between the subfamilies. One exception is the cyanobacterial OSBS subfamily, which replaces lysine with tyrosine or arginine. This replacement, in E. coli, appears to stabilize the enediolate intermediate rather than act as a general acid/base catalyst (Glasner et al., 2006). The functions of the PTPS-IV, V, and VI families remain elusive. The PTPS-IV family has retained its T-fold structure (Spoonamore et al., 2008), but bioinformatic, biochemical, and genetic analyses suggest that this enzyme family is not involved in Q, folate, or biopterin synthesis; its role is yet to be determined. PTPS-V ({SX(2)WX(3)HGH}) and PTPS-VI ({SSX(4)QXHGH}) share a similar catalytic motif ({SX(6)HGH}), but their functions are still unknown. PTPS-V complemented neither the Q deficient nor the dT auxotrophy phenotype. PTPS-VI partially complemented the dT auxotrophy of E. coli ΔfolB, suggesting that it might be a folate enzyme. However, the physiological role of the PTPS-VI enzyme remains elusive. The synthesis of the modified folate found in S. solfataricus (Zhou and White, 1992) still awaits clarification. The functions of PTPS-V remain to be elucidated.

To confirm the promiscuity of the COG0720 family, we further analyzed the PTPS-I, PTPS-II, and PTPS-III crystal structures and found that they exhibit topologically identical catalytic residues (Cys/Glu, Asp, and His) and metal (Zn(II)) coordinating residues. The small structural differences among the PTPS-I, PTPS-II, and PTPS-III proteins must play a role in accommodating different substrates in the active site by adopting slightly different conformations and driving different chemistries. Nevertheless, further biochemical and structural characterization is required to understand these differences.

Table 5-1. Testing the in vivo activity of different COG0720 protein derivatives

Variant tested | Motif | PTPS-I activity (a): m1G/m1Gc, Q/m1G/m1Gc | PTPS-III activity (b)
PTPS-I Ec | CX3HGH | E+07 | -
PTPS-I/III Sa | CEX2HGH | E+08 | +
PTPS-I/III Sa Cys26Ala | AEX2HGH
PTPS-III Li | EX2HGH
PTPS-I Cb | CX3HGH | E+08 | -
PTPS-I/III Cb | CEX2HGH | E+08 | +
PTPS-I Ab | CX4HGH | E+08 | -
PTPS-I Ec Lys23Cys and Cys24Ser | CX4HGH | E+08 | -
PTPS-II Rn | CX5HGH | E+08 | -
PTPS-I/III Sa Cys26Ala and Glu27Cys | CX2HGH
PTPS-II Rn Cys42Ala and Asn44Cys | CX3HGH | E+07 | -
SSO2412 | SSX4QXHGH
Pcal_1063 | WX3HGH

a) m1G/m1Gc is the ratio of m1G in tRNA analyzed after transformation of a ΔqueD strain with the test plasmids, compared with tRNA extracted from the control ΔqueD pBAD24. Q levels are then divided by the m1G ratios to correct for variations in tRNA levels. These analyses are semi-quantitative and were conducted at least twice independently.
b) Growth on LB plates in the absence of dT at 37 °C for 48 h after transformation of an E. coli ΔfolB strain.
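The normalization described in footnote (a) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the peak areas below are hypothetical and are not values from Table 5-1.

```python
def normalized_q(q_signal: float, m1g_sample: float, m1g_control: float) -> float:
    """Correct a Q signal for tRNA loading, using m1G as the internal reference.

    m1g_sample / m1g_control corresponds to the m1G/m1Gc ratio of footnote (a);
    the Q signal is divided by that ratio.
    """
    if m1g_sample <= 0 or m1g_control <= 0:
        raise ValueError("m1G signals must be positive")
    m1g_ratio = m1g_sample / m1g_control
    return q_signal / m1g_ratio


# Hypothetical peak areas, for illustration only.
example = normalized_q(q_signal=2.0e8, m1g_sample=1.5e7, m1g_control=1.0e7)
print(f"normalized Q signal: {example:.2e}")
```

A sample that carries 1.5 times more tRNA than the control (as judged by m1G) has its Q signal scaled down by the same factor, so strains can be compared on an equal-loading basis.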

Table 5-2. Organisms predicted to contain COG0720 enzymes with dual PTPS-I/III activities.

Organism | Accession number | Motif
Desulfuromonas acetoxidans | ZP_ | GDCENLHGHNWK
Geobacter sulfurreducens | NP_ | GDCENLHGHNWR
Pirellula sp. | NP_ | DICERIHGHNYGV
Thermotoga maritima | NP_ | GKCERLHGHTYR
Geobacter metallireducens | YP_ | GDCENLHGHNWK
Thermoanaerobacter tengcongensis | NP_ | GKCEELHGHTYRL
Desulfovibrio desulfuricans | YP_ | GKCEALHGHNFG
Dehalococcoides ethenogenes | YP_ | GKCENLHGHRYE
Blastopirellula marina | ZP_ | GTCERVHGHNYR
Solibacter usitatus | YP_ | GKCENVHGHNYR
Syntrophobacter fumaroxidans | YP_ | GKCENLHGHNWK
Syntrophus aciditrophicus | YP_ | GNCEHLHGHNWA
Clostridium botulinum | YP_ | GKCERLHGHTYG
Desulfovibrio vulgaris | YP_ | GKCENLHGHNFA
Bacteroides vulgatus | YP_ | SKCENLHGHNWI
Anaeromyxobacter sp. | YP_ | GKCERLHGHNW
Anaeromyxobacter dehalogenans | YP_ | GKCERLHGHNWRV
Thermotoga petrophila | YP_ | GKCEKLHGHTYR
Herpetosiphon aurantiacus | YP_ | GKCERLHGHNYR
Fervidobacterium nodosum | YP_ | GKCEKLHGHTYK
Pelobacter propionicus | YP_ | GDCENLHGHNWK
Caldicellulosiruptor saccharolyticus | YP_ | GKCERLHGHTYK
Thermoanaerobacter pseudethanolicus | YP_ | GKCEELHGHTYK
Desulfococcus oleovorans | YP_ | HKCENLHGHNWK
Dethiosulfovibrio peptidovorans | ZP_ | GKCEALHGHTYR
Planctomyces limnophilus | YP_ | NICERLHGHNWR
Denitrovibrio acetiphilus | YP_ | GKCENLHGHNWK
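One way to inspect the fragments in Table 5-2 is to check each one for the CEX(2)HGH core and tally the residue that sits immediately before HGH. The sketch below does this for six of the listed fragments; the regular expression and the helper name are illustrative choices, not part of the original analysis.

```python
import re
from collections import Counter

# A subset of the fragments listed in Table 5-2 (organism -> motif fragment).
FRAGMENTS = {
    "Geobacter sulfurreducens": "GDCENLHGHNWR",
    "Pirellula sp.": "DICERIHGHNYGV",
    "Thermotoga maritima": "GKCERLHGHTYR",
    "Blastopirellula marina": "GTCERVHGHNYR",
    "Syntrophus aciditrophicus": "GNCEHLHGHNWA",
    "Clostridium botulinum": "GKCERLHGHTYG",
}

# Cys, Glu, two variable residues (captured separately), then the conserved HGH.
DUAL_CORE = re.compile(r"CE(.)(.)HGH")


def tally_residue_before_hgh(fragments: dict) -> Counter:
    """Count which residue directly precedes HGH in each CEX(2)HGH core."""
    counts = Counter()
    for organism, fragment in fragments.items():
        match = DUAL_CORE.search(fragment)
        if match is None:
            raise ValueError(f"{organism}: no CEX(2)HGH core in {fragment}")
        counts[match.group(2)] += 1
    return counts


print(tally_residue_before_hgh(FRAGMENTS))
```

In this subset the position before HGH is occupied only by Leu, Ile, or Val, which matches the hydrophobic residue class given for that position in the PTPS-I/III signature.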

[Pathway diagram. Labeled enzymes: FolE, FolQ, FolB, phosphatase, PTPS-I, PTPS-II (pyruvoyltetrahydropterin synthase), PTPS-III, PTPS-IV, ArfAB, QueC, QueE. Labeled metabolites: GTP, dihydroneopterin triphosphate, dihydroneopterin monophosphate, dihydroneopterin, hydroxymethyl-dihydroneopterin, 6-pyruvoyltetrahydropterin, 6-carboxytetrahydropterin, 6-carboxydeazaguanine, 2,5-diamino-ribofuranosylamino-pyrimidinone triphosphate, preQ0, BH4, THF, Q, G+.]

Figure 5-1. Known or predicted roles of COG0720 (PTPS) proteins in GTP-derived metabolic pathways. PTPS-II is involved in BH4 synthesis. PTPS-I is involved in Q and G+ biosynthesis. PTPS-III is involved in THF biosynthesis. PTPS-IV might be involved in the synthesis of riboflavin derivatives.

[Gene cluster diagrams. PTPS-III gene clusters (64/16): Dehalococcoides ethenogenes, Thermotoga maritima, Clostridium botulinum; genes shown: folP, PTPS-I/III, folE2, folK, PTPS-III. PTPS-II gene clusters (99/24): Rhodothermus marinus, Cytophaga hutchinsonii; genes shown: PTPS-II, folE, SR. PTPS-I gene clusters (563/283): Clostridium botulinum, Desulfovibrio vulgaris, Syntrophobacter aciditrophus; genes shown: PTPS-I, PTPS-I/III, queE, folE, queC. PTPS-IV gene clusters (14/10): Halobacterium NRC1, Streptomyces avertimilis; genes shown: ArfA, ArfB, PTPS-IV.]

Figure 5-2. Physical clustering of the four PTPS protein sub-families (I-IV). PTPS-I, found in 563 organisms, clusters with Q biosynthesis genes in 283 organisms. PTPS-II genes cluster with BH4 biosynthesis genes in 24 genomes out of 99 containing PTPS-II. PTPS-III genes cluster with folate biosynthesis genes in 16 genomes out of 64 containing PTPS-III. PTPS-IV genes cluster with GTPCH-III (ArfA) in Archaea and with GTPCH-II (RibA) in some actinomycetes.
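The clustering counts in Figure 5-2 are easier to compare as fractions. The short Python sketch below computes them; the PTPS-IV pair is read from the "(14/10)" panel label and is assumed to follow the same total/clustered order as the other three panels.

```python
# sub-family -> (genomes containing it, genomes where it clusters with pathway genes)
CLUSTERING = {
    "PTPS-I": (563, 283),
    "PTPS-II": (99, 24),
    "PTPS-III": (64, 16),
    "PTPS-IV": (14, 10),  # assumption: panel label "(14/10)" means total/clustered
}


def clustered_fraction(total: int, clustered: int) -> float:
    """Fraction of genomes in which the PTPS gene is physically clustered."""
    if total <= 0 or not 0 <= clustered <= total:
        raise ValueError("expected 0 <= clustered <= total and total > 0")
    return clustered / total


for name, (total, clustered) in CLUSTERING.items():
    fraction = clustered_fraction(total, clustered)
    print(f"{name}: {clustered}/{total} genomes = {fraction:.1%}")
```

Expressed this way, roughly half of the PTPS-I genes and about a quarter of the PTPS-II and PTPS-III genes sit next to genes of the pathway they are assigned to.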

[Signature motifs shown in the figure:
PTPS-I: {C-X(3)-H-G-H}
PTPS-II: {C-X(5)-H-G-H-X-[FY]-X-[LV]-X-[IV]}
PTPS-III: {E-X-[IL]-H-G-H-X(3,5)-V-X-[AILV]-X-[GIL]}
PTPS-I/III: {C-X-E-X-[IL]-H-G-H-X(3,5)-V-X-[AILV]-X-[GIL]}
PTPS-IV: {F-X(0,1)-G-X-[ANTV]-[NPQST]}
PTPS-V: {S-X(2)-(W,Y)-X(3)-H-G-H}
PTPS-VI: {SS-X(4)-Q-X-H-G-H}]

Figure 5-3. Signature motifs obtained for COG0720 proteins. The {CX(3)HGH} motif is found in the PTPS-I member involved in Q biosynthesis, encoded by the queD gene. The {CX(5)HGHX[FY]X[LV]X[IV]} motif is present in the PTPS-II protein involved in BH4 biosynthesis. The {EX[IL]HGHX(3,5)VX[AILV]X[GIL]} motif is present in the PTPS-III protein involved in folate biosynthesis. The {CEX[ILPV]HGHX[FWY]X(3)[AILV]} motif is present in the PTPS-I/III protein involved in both queuosine and folate biosynthesis. The {FX(0,1)GX[ANTV][NPQST]} motif is present in PTPS-IV sequences. The {SX(2)(W,Y)X(3)HGH} motif is found in PTPS-V, which is present in a few Pyrobaculum sp. The motif of PTPS-VI is {SSX(4)QXHGH} and occurs in a few Sulfolobus sp.
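Signature motifs written in this PROSITE-like notation translate directly into regular expressions, which makes it straightforward to screen a sequence against several sub-family signatures. The sketch below is a minimal illustration in Python, assuming only the X, X(n), and X(m,n) conventions used in the caption; the helper names are hypothetical, and the PTPS-IV and PTPS-V motifs are left out because they use additional notation.

```python
import re

# Motifs as quoted in the Figure 5-3 caption.
MOTIFS = {
    "PTPS-I": "CX(3)HGH",
    "PTPS-II": "CX(5)HGHX[FY]X[LV]X[IV]",
    "PTPS-III": "EX[IL]HGHX(3,5)VX[AILV]X[GIL]",
    "PTPS-VI": "SSX(4)QXHGH",
}


def motif_to_regex(motif: str) -> str:
    """Translate X -> '.', X(n) -> '.{n}', X(m,n) -> '.{m,n}'; brackets pass through."""
    def spacer(match: re.Match) -> str:
        low, high = match.group(1), match.group(2)
        return ".{%s}" % low if high is None else ".{%s,%s}" % (low, high)

    with_spacers = re.sub(r"X\((\d+)(?:,(\d+))?\)", spacer, motif)
    return with_spacers.replace("X", ".")


def matching_subfamilies(sequence: str) -> list:
    """Names of all signatures in MOTIFS found anywhere in the sequence."""
    return [name for name, motif in MOTIFS.items()
            if re.search(motif_to_regex(motif), sequence)]


# Fragment listed for Desulfuromonas acetoxidans in Table 5-2.
print(matching_subfamilies("GDCENLHGHNWK"))
```

Note that a PTPS-I/III fragment such as the one above also satisfies the PTPS-I pattern, because CEX(2)HGH is a special case of CX(3)HGH; distinguishing the two sub-families requires checking the residue after the cysteine as well.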

[Phylogenetic tree, scale bar 0.2. Leaves, in the order shown: Leptospira interrogans PTPS-III; Anopheles gambiae PTPS-II; Mus musculus PTPS-II; Drosophila melanogaster PTPS-II; Caenorhabditis elegans PTPS-II; Geobacillus kaustophilus PTPS-II; Chlorobium tepidum PTPS-II; Pelodictyon luteolum PTPS-II; Prochlorococcus marinus PTPS-II; Synechococcus sp. PTPS-II; Synechococcus elongatus PTPS-II; Gloeobacter violaceus PTPS-II; Thermococcus kodakarensis PTPS-III; Pyrococcus abyssi PTPS-III; Pyrococcus furiosus PTPS-III; Desulfotalea psychrophila PTPS-I; Archaeoglobus fulgidus PTPS-I; Bordetella bronchiseptica PTPS-I; Bordetella avium PTPS-I; Coxiella burnetii PTPS-I; Bacteroides thetaiotaomicron PTPS-I; Campylobacter jejuni PTPS-I; Rhizobium leguminosarum PTPS-I; Bacillus anthracis PTPS-I; Leptospira interrogans PTPS-I; Xanthomonas campestris PTPS-I; Geobacter sulfurreducens PTPS-I/III; Syntrophobacter fumaroxidans PTPS-III; Aquifex aeolicus PTPS-III; Halobacterium sp. PTPS-IV; Halorhabdus utahensis PTPS-IV; Natronomonas pharaonis PTPS-IV; Sorangium cellulosum PTPS-IV; Nocardioides sp. PTPS-IV; Salinispora tropica PTPS-IV; Nakamurella multipartita PTPS-IV; Thermomonospora curvata PTPS-IV; Streptomyces avermitilis PTPS-IV; Pyrobaculum calidifontis PTPS-V; Vulcanisaeta distributa PTPS-V; Sulfolobus solfataricus P2 PTPS-VI; Sulfolobus islandicus PTPS-VI; Pyrococcus furiosus PTPS-I; Methanococcus maripaludis PTPS-I; Bdellovibrio bacteriovorus PTPS-III; Legionella pneumophila PTPS-III; gamma proteobacterium PTPS-III; Lactococcus lactis FolB.]

Figure 5-4. Evolutionary relationships of the COG0720 family of proteins in 48 taxa. The numbers represent the percentage confidence calculated by the bootstrap method (B = 1000 bootstrap replications).

[Panels A-D: superimposed active sites. Labeled residues include Cys42, Cys24, Glu38, His13, His23, His28, His29, His30, His41, His43, His48, His50, and Zn. Inset tables list, for each PTPS family (PTPS-II, Cys42; PTPS-I, Cys24; PTPS-III, Glu38), the nucleophile and its distances (Å) to C1OH, C2OH, and Zn.]

Figure 5-5. Spatial comparisons of PTPS crystal structures. A) Using Accelrys DS Visualizer 2.5, the R. norvegicus PTPS-II (black, PDB 1B66) and P. aeruginosa PTPS-I (grey, PDB 2OBA) structures were superimposed. The three His residues coordinating the essential Zn2+ ion were used as reference points to show the relative occupation in space of the active-site nucleophile Cys42 in R. norvegicus PTPS-II and the proposed nucleophile Cys24 in P. aeruginosa PTPS-I. Distances of the respective nucleophilic centers (the S atoms of Cys42 and Cys24) from the Zn2+ ion were measured as shown in the inset table. The distances from the O atoms of C1OH and C2OH of the biopterin side chain were also measured and are shown in the table. B) The relative positions of the three His residues and the nucleophiles Cys24 and Cys42 from PDB 2OBA (PTPS-I, grey) and PDB 1B66 (PTPS-II, black), respectively, are conserved in both structures. C) The superimposition of the structures of PTPS-I (PDB 2OBA, grey) and PTPS-III (PDB 1Y13, black) was performed using the FATCAT tool embedded in the PDB server. The structure alignment has 116 equivalent positions with an optimum rmsd of 2.21 Å without twists. The three His residues coordinating the essential Zn2+ ion were used as reference points to show the relative occupation in space of the active-site nucleophile Glu38 in P. falciparum PTPS-III and the nucleophile Cys24 in P. aeruginosa PTPS-I. Distances of the respective nucleophilic centers (the O atom of Glu38 and the S atom of Cys24) from the Zn2+ ion were measured as shown in the inset table. The distances from the O atoms of C1OH and C2OH of the biopterin side chain were also measured and are shown in the table. D) The relative positions of the three His residues and the nucleophiles Cys24 and Glu38 from PDB 2OBA (PTPS-I, grey) and PDB 1Y13 (PTPS-III, black), respectively, are conserved in both structures.

[Presence/absence chart (legend: gene present, gene absent; THF pathway, Q pathway). Organisms shown: Hyperthermus butylicus DSM 5456; Methanosarcina barkeri str. fusaro; Bacteroides vulgatus ATCC 8482; Dehalococcoides ethenogenes 195; Denitrovibrio acetiphilus DSM; Dictyoglomus thermophilum H-6-12; Acidobacteria bacterium Ellin345; Solibacter usitatus Ellin6076; Clostridium botulinum; Blastopirellula marina 3645; Planctomyces limnophilus 3776; Desulfatibacillum alkenivorans AK-01; Desulfococcus oleovorans Hxd3; Desulfotalea psychrophila LSv54; Desulfovibrio vulgaris; Desulfuromonas acetoxidans; Geobacter sulfurreducens PCA; Pelobacter carbinolicus DSM 2380; Anaeromyxobacter dehalogenans; Syntrophus aciditrophicus; Syntrophobacter fumaroxidans; Campylobacter hominis BAA-381; Caldicellulosiruptor saccharolyticus; Dethiosulfovibrio peptidovorans; Elusimicrobium minutum Pei191.]

Figure 5-6. Distribution of dual PTPS-I/III proteins between the Q and THF pathways in specific organisms. In some organisms, PTPS-I/III clusters with Q genes; in others, it clusters with THF genes.

[Plate image: complementation of E. coli ΔfolB. 1, pPTPS-I/III Cb; 2, pPTPS-I Cb; 3, pfolB Ec; 4, pBAD24.]

Figure 5-7. Complementation of the E. coli ΔfolB dT auxotrophy phenotype by PTPS-I/III and PTPS-I from C. botulinum (Cb). Growth was monitored after 48 hours on LB plates containing 100 μg/ml ampicillin and supplemented when noted with 0.2% Ara or 80 μg/ml dT.

[Chromatogram panels, each labeled XIC 410 m/z: A) MG1655 pBAD24, Q; B) ΔqueD pBAD24, no Q; C) ΔqueD pPTPS-I/III Sa, Q; D) ΔqueD pPTPS-II Rn, Q.]

Figure 5-8. LC-MS/MS analysis of Q content in bulk tRNA extracted from E. coli ΔqueD derivative strains. The E. coli ΔqueD Q deficient phenotype was complemented by in trans expression of PTPS-I/III Sa and PTPS-II Rn. A) The UV chromatogram of MG1655 pBAD24. B) The UV chromatogram of MG1655 ΔqueD pBAD24. C) The UV chromatogram of MG1655 ΔqueD pPTPS-I/III Sa. D) The UV chromatogram of MG1655 ΔqueD pPTPS-II Rn (Rn: Rattus norvegicus, Sa: Syntrophus aciditrophicus).

[Plate images. Panel A: YPC+dT and YPC plates; 1, H26 WT; 2, H26 ΔPTPS-IV; 3, H26 ΔfolE2. Panel B: Hv-Mm and Hv-Mm+Rib plates; 1, H26 WT; 2, H26 ΔPTPS-IV; 3, H26 ΔHVO_ . Panel C: Ara + dT; 1, E. coli ΔfolB pSSO2412; 2, E. coli ΔfolB pBAD24; 3, E. coli ΔfolB pfolB Ec.]

Figure 5-9. Role of COG0720 proteins in Archaea. A) Genetic evidence that the HVO_1282 (PTPS-IV) gene is not involved in folate biosynthesis. Growth of H. volcanii derivatives on Hv-YPC plates with or without 80 μg/ml dT was monitored after 10 days. B) Genetic evidence that the PTPS-IV gene is not involved in riboflavin biosynthesis. Growth of H. volcanii derivatives on Hv-Mm+Riboflavin and Hv-Mm was monitored after 10 days. C) Genetic evidence that the SSO2412 gene (PTPS-VI) has FolB activity. Complementation of the dT auxotrophy phenotype of E. coli ΔfolB with SSO2412 cloned in pBAD24. Growth was monitored after 48 hours on LB plates containing 100 μg/ml ampicillin and supplemented when noted with 0.2% Ara or 80 μg/ml dT.

CHAPTER 6
PHENOTYPIC ANALYSIS OF H. volcanii ARCHAEOSINE DEFICIENT MUTANTS

Background

Archaea inhabit some of the most forbidding places on Earth: deep hydrothermal vents, permanently cold areas such as the sea and dry lakes of Antarctica, very salty waters such as the Dead Sea, and hot springs such as those in Yellowstone National Park. Archaea are adapted to grow in extreme conditions of high salt concentration (halophiles), high temperature (hyperthermophiles), and low (acidophiles) or high (alkaliphiles) pH. Little is known about the adaptation strategies of Archaea to these environments; even less is known about the adaptability of nucleic acids to such extreme conditions. When nucleic acids are exposed to a hostile environment, two types of degradation have been observed: an overall structural denaturation and a chemical degradation of their building blocks (Grosjean and Oshima, 2007). tRNA structural stability can be enhanced by: 1) increased GC content, 2) monovalent and divalent cations, and 3) non-cyclic polyamines. The structural stability of tRNA molecules under high temperature has been thoroughly studied, and several adaptation strategies have been observed. There is a direct relationship between tRNA stability and GC content in the base pairing region; a 5% increase in the GC content of tRNA increases the thermal denaturation temperature by 1.5 °C (Grosjean and Oshima, 2007). The extreme thermophilic and hyperthermophilic organisms have cloverleaf stems made almost entirely of G:C base pairs (Marck and Grosjean, 2002). tRNA stability is also increased by the presence of small ligands such as monovalent cations (Na+, K+) or divalent cations (Mg2+ and Mn2+). Magnesium ions act as counter ions to shield the highly

negative phosphate backbone of nucleic acids, thus promoting tRNA folding and increasing the stability of the tRNA tertiary conformation (Serebrov et al., 2001; Serebrov et al., 1998; Serebrov et al., 1997). In the tertiary conformation of tRNA, Mg2+ ions bind strongly to specific binding pockets (Jovine et al., 2000; Maglott et al., 1998; Nobles et al., 2002; Serebrov et al., 2001; Serebrov et al., 1998) (Figure 6-1). Aliphatic non-cyclic compounds containing two or more protonated amino nitrogens, such as linear or branched polyamines, are also known to stabilize tRNA. Tetrakis(3-aminopropyl)ammonium (Taa) plays important roles in stabilizing RNAs in thermophiles (Terui et al., 2005). At concentrations in the micromolar range (200 µM), branched Taa increases the melting temperature of yeast tRNA(Phe) transcripts by more than 20 °C (Hayrapetyan et al., 2009). Moreover, branched quaternary polyamines are the major cellular polyamines in some hyperthermophiles such as A. pyrophilus, M. jannaschii, and other archaeal genera (Hamana et al., 1994; Hamana et al., 1985; Hamana et al., 2003). Furthermore, the correct folding and rigidity of the tRNA tertiary structure are improved by posttranscriptional modifications (Grosjean and Oshima, 2007) (Figure 6-2). The folding interactions of the conserved L-shaped structure of cytoplasmic tRNAs are characterized by hydrogen bonding of G19 (D domain) with C56 (TΨC domain), and by a base pair between G18 (D domain) and Ψ55 (TΨC domain) (Grosjean and Benne, 1998). In hyperthermophilic Archaea, an inter-strand stacking interaction between G18 and G19 was observed; m1I57 and C56 similarly interact both with each other and with s2T54. All of the above form a Mg2+ specific coordination site (Grosjean and Oshima, 2007). The highly conserved nucleoside modifications rT54, Ψ55, and m5C49 in the

TΨC domain together with Mg2+ ions have been shown to increase the affinity between the TΨC and D domains (Nobles et al., 2002) (Figure 6-1). The rigidity of the tRNA tertiary structure is also enhanced by interactions between the nucleotide at position 15 and the nucleotide at position 48 (Nobles et al., 2002). In more than 70% of tRNAs, the interaction between positions 15 and 48 is a reverse Watson-Crick interaction of the G-C type (Jühling et al., 2009). In the yeast tRNA(Phe) crystal structure, the site around position 15 has an increased negative electrostatic potential due to the backbone phosphate groups. Two Mg2+ ions are bound to this site, increasing tRNA stability (Jovine et al., 2000; Maglott et al., 1998). Archaeal Tgt is conserved across the Archaea sequenced to date. Since atgt is the critical enzyme in G+ biosynthesis, G+ must also be present in the known Archaea. The presence of G+ at position 15 was assumed to help maintain the integrity of the tertiary structure of tRNA through the interaction of the positive charges of the amidino group with the phosphate groups of the tRNA (Iwata-Reuyl, 2003). The corollary of these assumptions was that G+ should be essential for the survival of many Archaea in extreme environments. Since atgt is not essential for optimal growth in the extreme halophilic archaeon H. volcanii, further phenotypic characterization of the G+ deficient mutants could uncover possible physiological roles of this modification.

Results

Other Extreme Halophilic Archaea Have Lost Archaeosine

High salinity environments are characterized by a high concentration of divalent cations such as Mg2+ and Ca2+, and of monovalent cations such as K+ and Na+. It has been established that both monovalent and divalent cations increase the stability of tRNA (Tan and Chen, 2010). Since G+ is not essential in the mesophilic, extreme

halophile H. volcanii, we wondered whether a high salinity environment could naturally compensate for the loss of this modification by providing enough cations to increase the folding and stability of tRNA. To test this hypothesis, a phylogenetic distribution analysis of atgt in archaeal extreme halophiles was performed by Dr. de Crécy-Lagard; atgt was not found in H. walsbyi (Figure 6-3). To verify that the loss of atgt indeed leads to the loss of G+, and that atgt had not been replaced by a non-orthologous enzyme in that organism, H. walsbyi (a kind gift from Dr. Mike Dyall-Smith, Max-Planck Institute, Germany) was grown in defined media for four weeks without agitation at 37 °C. Bulk tRNA was extracted, purified, and hydrolyzed to ribonucleosides for subsequent LC-MS/MS analysis. The UV trace chromatogram of the tRNA showed that the peak representing G+ (325 m/z) in H. volcanii was missing in H. walsbyi (Figure 6-4); other modifications such as t6A or m22G were present in both samples. These results suggest that growth in a high salt environment might allow G+ to become dispensable.

H. volcanii Archaeosine Deficient Mutants Are Sensitive to High Mg2+ Concentrations

tRNA evolved to maintain structural integrity in extreme environments using different strategies. As mentioned above, divalent or monovalent cations maintain the structural integrity of tRNA (Grosjean and Oshima, 2007; Helm, 2006; Motorin and Helm, 2010). The most common cation involved in the promotion and maintenance of the correct tertiary folding of tRNAs is Mg2+ (Oliva and Cavallo, 2009; Oliva et al., 2007; Serebrov et al., 2001; Serebrov et al., 1998). Oliva et al. showed, in silico, that Mg2+ bound to N7 of guanine or G+ could increase the stabilization of the reverse Watson-Crick bond in tRNA (Oliva et al., 2007). If G+ can be replaced by Mg2+, we predicted that

G+ deficient mutants could show poor growth at low Mg2+ concentrations. Thus, the H. volcanii Δatgt (VDC3241), H. volcanii ΔarcS (VDC5203), ΔqueD (VDC3290), and isogenic wild type (H26) strains were grown on solid rich medium (YPC) with varying Mg2+ concentrations (0.0, 0.05, 0.15, 0.2, 0.3, 0.4, 0.6, 0.8, and 1.0 M) while the concentrations of the other salt components remained constant (2.45 M NaCl, M KCl, and M CaCl2). The plates were incubated at 45 °C for 5 days. The isogenic wild type grew at all Mg2+ concentrations except 0.0 M, thus confirming that H. volcanii requires Mg2+ ions for growth (Rodriguez-Valera et al., 1981). Growth of the H. volcanii Δatgt strain was retarded at 0.3 M Mg2+ (0.2 M is the optimal concentration); growth of the H. volcanii ΔarcS and ΔqueD strains was retarded at 0.4 M Mg2+. The growth defects of the H. volcanii ΔarcS and ΔqueD strains were slightly different from that of the Δatgt mutant because: 1) in the case of the ΔarcS mutant, the presence of preQ0 in tRNA could slightly compensate for the loss of G+; 2) in the case of the ΔqueD mutant, preQ0 salvage could lead to low concentrations of G+ in tRNA. For all G+ deficient mutants, a growth defect was observed above 0.4 M Mg2+ (Figure 6-5). Contrary to our predictions, the G+ deficient mutants are sensitive to high but not to low Mg2+ concentrations.

H. volcanii Archaeosine Deficient Mutants Show a Cold Sensitive Phenotype

Life in a high salt environment is harsh. Extreme halophiles have to adapt not only to the high osmotic pressure caused by high salt concentrations, but also to variations in temperature and to changes in salt concentration caused by precipitation. For example, the salinity of the Great Salt Lake has been fluctuating around 20% due to seasonal changes in temperature and precipitation for the last century (Van den Bergh and Roulin, 2010). Also, the Dead Sea has up to a 35% salt concentration variation between

hot summers and heavily rainy winters (Oren, 2002C). To find out whether archaeosine plays a role in adaptation to variations in temperature and total salt concentration, the H. volcanii atgt mutant (VDC3241) was grown at different temperatures (28, 30, 37, 45, 50, and 55 °C) on different total salt concentrations (12, 14, 16, 18, 23, and 25% w/v). As shown in Figure 6-6, a cold sensitivity phenotype was observed for the H. volcanii atgt mutant. This phenotype was rescued by expressing atgt (HVO_2001) in trans. To verify whether the cold sensitivity phenotype is a consequence of G+ deficiency, other G+ deficient mutants, carrying the ΔqueD and ΔarcS deletions, were grown on rich (YPC) or defined (Hv-Mm) solid media at 28 °C and 45 °C. In YPC, a growth defect at low temperature (28 °C) was observed in the other G+ deficient mutants (H. volcanii ΔqueD and H. volcanii ΔarcS) (Figure 6-7). However, H. volcanii ΔarcS exhibited a less pronounced growth sensitivity than the atgt mutant. This difference might be due to the presence of preQ0 and/or to the incidence of G+ in the tRNA of the ΔarcS mutant. Similarly, H. volcanii ΔqueD exhibited a less pronounced growth sensitivity than the atgt mutant; this difference might be associated with the presence of G+ in the tRNA of the ΔqueD mutant, which could result from the salvage of preQ0 (Chapter 3). In Hv-Mm, a comparable cold-sensitive phenotype was observed for all G+ deficient mutants. From these observations, we suggest that the cold-sensitive phenotype is due to the lack of G+ rather than to the lack of atgt itself.

Discussion

Because atgt is not essential for growth in H. volcanii, we searched for possible phenotypes of the G+ mutants. First, we revealed that other extreme halophiles have naturally lost G+. Second, an unexpected sensitivity to high Mg2+ concentrations

phenotype was observed with some of the G+ mutants. Finally, some of the G+ mutants exhibited a cold sensitivity phenotype. The loss of G+ might have occurred as an adaptation to a high salt environment containing high concentrations of cations (Mg2+, K+, Na+, and Ca2+) that perhaps are transported into the cell (Oren, 2002). It has been shown that cations, especially Mg2+ and high concentrations of K+, increase the stability of the tRNA structure (Leroy et al., 1977; Oliva and Cavallo, 2009; Tan and Chen, 2010; Tinoco and Bustamante, 1999). It has been reported that H. salinarum and H. marismortui, which are phylogenetically closely related to H. walsbyi and H. volcanii, maintain an intracellular K+ concentration of about 4.0 M (Ng et al., 2000). Hence, high concentrations of cations in the cell would promote the correct folding and rigidity of the tRNA tertiary structure, thus compensating for the loss of archaeosine. The G+ deficient mutants exhibited unexpected growth sensitivity at high Mg2+ concentrations. In the crystal structure of yeast tRNA(Phe), hydrated Mg2+ binds to well-determined locations (Figure 6-1). In vitro data showed that at moderate concentrations (around 0.01 M), Mg2+ ions promote folding and maintain the tertiary tRNA structure (see above and the Introduction to this Chapter) (Serebrov et al., 1997); however, at higher concentrations (0.03 M), Mg2+ ions cleave the D-loop and T-loop of intact tRNA. It was observed that in both E. coli and yeast tRNA(Phe), high Mg2+ concentration promotes strong cleavage at positions G16, G20, and C60, leaving behind 3′-phosphates. Also, site specific cleavage was observed with Pb2+, Mn2+, Ca2+, and Eu2+ in the tRNA D-loop and T-loop (C60) (Marciniec et al., 1989; Matsuo et al., 1995). Furthermore, a single point mutation in the cleavage site changed the specificity of the metal promoted

cleavage; for example, mutating C60 to U60 lowered the efficiency of Mg2+ cleavage (Marciniec et al., 1989; Wrzesinski et al., 1995). Thus, the growth sensitivity to high Mg2+ concentration in the H. volcanii G+ deficient mutants might be related to tRNA degradation caused by high Mg2+ concentration. Therefore, we propose that G+ could protect the tRNA D-loop from cleavage caused by an increased Mg2+ concentration in the cell. Nevertheless, biochemical and biophysical studies on the effect of Mg2+ on tRNA with or without G+ will be necessary to verify this proposition. The lack of G+ in tRNA is associated with the observed cold sensitivity phenotype of the H. volcanii G+ deficient mutants. Although many deletions of genes involved in posttranscriptional modifications found outside the tRNA anticodon loop exhibit no phenotype, deletion of a few of them leads to a temperature-sensitive growth phenotype (Blaby et al., 2010; Ishida et al., 2010; Phizicky and Hopper, 2010). Ishida et al. showed that deletion of truB, the gene encoding the enzyme responsible for pseudouridylation of position 55 (ψ55), in T. thermophilus caused severe growth retardation at low temperature (Ishida et al., 2010). The nucleosides at position 55 and position 18 are known to be involved in tertiary interactions across the D-loop and T-loop, increasing the rigidity of the tRNA molecule (Helm, 2006; Grosjean and Benne, 1998) (Figure 6-2). Also, position 55 and position 18 participate in the binding site of the two Mg2+ ions. Studies performed on the folding and stability of tRNA suggested that Mg2+ and modified nucleosides promote the correct folding of tRNA by lowering the energy of the cloverleaf to L-shape transition (Helm, 2006; Brion and Westhof, 1997; Draper, 2008). The transition energy can also be lowered by heat (Garrett and Grisham, 1995). In addition, bound Mg2+ increases the rigidity of tRNA (Bolton and Kearns, 1977; Leroy et al., 1977;

147 Serebrov et al., 2001; Serebrov et al., 1998). The ψ55 would bind Mg 2+ to help promote the correct folding and to maintain the integrity of the L-shaped trna. The cold sensitivity phenotype exhibited by the trub mutants could therefore be associated with the absence of ψ55 in trna. Then, again, Oliva et al showed, in silico, that Mg 2+ and G + are interchangeable (Oliva et al., 2007). Position 15 is also part of the two Mg 2+ ions binding site (Figure 6-1). Thus, we suggest that the presence of G + in trna might promote the correct folding and the increased stability of the trna tertiary structure. 147

Figure 6-1. Mg²⁺ bound to tRNA. Mg²⁺ ions are depicted in cyan. In the tRNA backbone representation, the acceptor arm is depicted in yellow, the TψC arm in blue, the variable loop in orange, the anticodon arm in red, and the D arm in green. Position G15 is labeled.

Figure 6-2. tRNA tertiary interactions. Only the interactions between the D-loop and T-loop are shown. Adapted from Grosjean et al. (Grosjean et al., 2008).

Figure 6-3. Phylogenetic distribution of tRNA modification genes in archaeal extreme halophiles. H. walsbyi has lost some of the tRNA modification genes; atgt is one of these genes. Organisms shown: Archaeoglobus fulgidus, Halobacterium sp., Haloarcula marismortui, Haloferax volcanii, Natronomonas pharaonis, Haloquadratum walsbyi, Methanosarcina.

Figure 6-4. LC-MS/MS analysis of tRNA extracted from H. walsbyi and H. volcanii. The UV (λ = 254 nm) and extracted ion chromatograms are shown. G⁺ peak (325 m/z) eluted at minutes in the H. volcanii sample. The G⁺ peak is not present in the UV chromatogram of tRNA extracted from H. walsbyi. The internal standards m²₂G and t⁶A are also shown. Y-axis: intensity (cps). Panel labels: H. volcanii (m²₂G, t⁶A, G⁺); H. walsbyi (m²₂G, no G⁺, t⁶A).

Figure 6-5. High Mg²⁺ concentration sensitive phenotype of G⁺ mutants. The H. volcanii ΔqueD, Δatgt, and ΔarcS strains were grown in rich medium at different Mg²⁺ concentrations while the concentrations of the other salts remained constant. Growth sensitivity of the G⁺ mutants was observed at 0.40 M [Mg²⁺].

Figure 6-6. Cold sensitive phenotype of H. volcanii Δatgt. The cells were grown on YPC at both 45 °C and 28 °C. The cold sensitive phenotype was rescued by atgt expressed in trans. Strains: 1- H26 pjamc; 2- H26 Δatgt patgt; 3- H26 Δatgt pjamc.

Figure 6-7. Cold sensitive phenotype of G⁺ deficient mutants. The mutant strains as well as the isogenic wild type were grown on both rich medium (YPC) and defined medium (Hv-Mm) at 45 °C and 28 °C for 10 days. Strains: 1- H26; 2- H26 ΔarcS; 3- H26 Δatgt; 4- H26 ΔqueD.


Supplementary Information. Structural basis for precursor protein-directed ribosomal peptide macrocyclization Supplementary Information Structural basis for precursor protein-directed ribosomal peptide macrocyclization Kunhua Li 1,3, Heather L. Condurso 1,3, Gengnan Li 1, Yousong Ding 2 and Steven D. Bruner 1*

More information

Chemical Principles and Biomolecules (Chapter 2) Lecture Materials for Amy Warenda Czura, Ph.D. Suffolk County Community College Eastern Campus

Chemical Principles and Biomolecules (Chapter 2) Lecture Materials for Amy Warenda Czura, Ph.D. Suffolk County Community College Eastern Campus Chemical Principles and Biomolecules (Chapter 2) Lecture Materials for Amy Warenda Czura, Ph.D. Suffolk County Community College Eastern Campus Primary Source for figures and content: Tortora, G.J. Microbiology

More information

Chapters 12&13 Notes: DNA, RNA & Protein Synthesis

Chapters 12&13 Notes: DNA, RNA & Protein Synthesis Chapters 12&13 Notes: DNA, RNA & Protein Synthesis Name Period Words to Know: nucleotides, DNA, complementary base pairing, replication, genes, proteins, mrna, rrna, trna, transcription, translation, codon,

More information

Conceptofcolinearity: a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein

Conceptofcolinearity: a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein Translation Conceptofcolinearity: a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein Para além do fenómeno do wobble, há que considerar Desvios ao código

More information

Initiation of translation in eukaryotic cells:connecting the head and tail

Initiation of translation in eukaryotic cells:connecting the head and tail Initiation of translation in eukaryotic cells:connecting the head and tail GCCRCCAUGG 1: Multiple initiation factors with distinct biochemical roles (linking, tethering, recruiting, and scanning) 2: 5

More information

Gene regulation II Biochemistry 302. Bob Kelm February 28, 2005

Gene regulation II Biochemistry 302. Bob Kelm February 28, 2005 Gene regulation II Biochemistry 302 Bob Kelm February 28, 2005 Catabolic operons: Regulation by multiple signals targeting different TFs Catabolite repression: Activity of lac operon is restricted when

More information

MICROBIOLOGIA GENERALE. The Archaea

MICROBIOLOGIA GENERALE. The Archaea MICROBIOLOGIA GENERALE The Archaea The Archaea s traits 1. The cell wall of Archaea: pseudopeptidoglycan, polysaccharide, glycoprotein 2. The cytoplasmic membrane of Archaea: ether linkage, glycerol

More information

Genetic code redundancy and its influence on the encoded polypeptides

Genetic code redundancy and its influence on the encoded polypeptides , http://dx.doi.org/10.5936/csbj.201204006 Genetic code redundancy and its influence on the encoded polypeptides Paige S. Spencer a, José M. Barral a,b,c* CSBJ Abstract: The genetic code is said to be

More information

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution Taxonomy Content Why Taxonomy? How to determine & classify a species Domains versus Kingdoms Phylogeny and evolution Why Taxonomy? Classification Arrangement in groups or taxa (taxon = group) Nomenclature

More information