CHARACTERIZATION OF THE metK AND yitJ LEADER RNAs FROM THE Bacillus subtilis S BOX REGULON


CHARACTERIZATION OF THE metK AND yitJ LEADER RNAs FROM THE Bacillus subtilis S BOX REGULON

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Vineeta A. Pradhan, B.S.
Graduate Program in Microbiology
The Ohio State University
2012

Dissertation Committee:
Professor Tina M. Henkin, Advisor
Professor Charles J. Daniels
Professor Kurt L. Fredrick
Professor Joseph A. Krzycki

Copyright by Vineeta A. Pradhan 2012

ABSTRACT

A variety of mechanisms that regulate gene expression have been uncovered in bacteria. Riboswitches are cis-acting regulatory sequences that typically reside in the untranslated regions of bacterial mRNAs. Riboswitches serve as genetic regulatory switches that sense and respond specifically to environmental signals to regulate expression of the downstream gene, typically in the absence of any protein factor. The S box riboswitch is a transcription termination control system found mostly in Gram-positive bacteria that regulates the expression of many genes involved in sulfur metabolism. The S box genes are characterized by the presence of a set of highly conserved primary sequence and secondary structural elements in the untranslated leader region upstream of the regulated coding sequence. SAM, the molecular effector of the S box riboswitch, is synthesized from methionine and ATP. Expression of the majority of the S box genes is induced during methionine starvation (when SAM pools are low) and is repressed in the presence of methionine (when SAM pools are high). In spite of high sequence and structural conservation, a few S box leader RNAs from Bacillus subtilis fail to exhibit typical S box gene regulation, and variation is seen in the response to SAM both in vivo and in vitro. This work examines the leader RNA elements that contribute to the observed S box variability, with a special focus on the metK leader RNA.

Investigation of the metK leader RNA was performed using biochemical and genetic techniques. We modulated in vivo SAM pools without removing methionine from the growth medium and provided evidence for a SAM-dependent change in metK gene expression in vivo. Phylogenetic analyses revealed the presence of unique sequence elements, the Upstream (US) and Downstream (DS) boxes, that are highly conserved in the metK leader RNAs in several Firmicutes. Using RNase H assays, we showed that these regions are involved in a base-pairing interaction that is stabilized in the absence of SAM. Extensive mutagenic analysis of the US and DS box sequences confirmed the need for an intact US-DS base-pairing interaction for response to SAM in vivo. Transcript stability and abundance studies showed that the US-DS pairing is disrupted in the presence of SAM and that any alteration in the US box sequence reduces transcript stability significantly. A model for metK regulation was proposed in which the metK gene is regulated at the level of mRNA stability, in addition to being under the control of the S box regulon.

In vitro investigation of the B. subtilis yitJ SAM-binding pocket was conducted to identify RNA determinants for ligand affinity and specificity. We attempted to generate yitJ variants that exhibit higher SAM affinity than wild-type yitJ or a change in ligand specificity. Extensive mutational analysis was conducted as part of the crystal structure study of the B. subtilis yitJ RNA in complex with SAM, and mutants were tested for effects on in vitro transcription and SAM binding. As expected, most of the mutants exhibited loss of SAM binding. However, some mutants resulted in constitutive high termination, indicating that the RNA was locked in a SAM-bound-like conformation in the absence of SAM. Selected mutants were also tested for their response to a series of SAM analogs and compared to the wild-type yitJ RNA.

Individual RNA elements critical for S box riboswitch function were examined using the metE and yusC leader RNAs. Our data suggest that both the SAM-binding domain and the terminator/antiterminator structures play a crucial role in the calibration of the S box regulatory system. The effect of the metK promoter and US box sequence on expression of yusC was also examined. We predict that the metK promoter, along with the US box sequence, is responsible for reduced transcription initiation or reduced RNA polymerase processivity in vitro. These studies suggest possible effects of the metK promoter and US box sequence on transcription and therefore on metK regulation.

This work is dedicated to my parents, Neela and Prakash Kurlekar.

ACKNOWLEDGEMENTS

I wish to express my sincere gratitude to my advisor, Dr. Tina Henkin, for her guidance, patience, and support throughout the years. It has been a privilege to work with such a gifted scientist and excellent mentor.

I am grateful to Dr. Frank Grundy for his guidance and support throughout the years. Dr. Grundy played an instrumental role in this work, particularly in the metK project.

I would like to extend special thanks to my committee members, Dr. Charles Daniels, Dr. Kurt Fredrick and Dr. Joseph Krzycki, for their guidance, support and time over the years.

I also wish to thank my colleague and dear friend, Dr. Sharnise N. Mitchell, for being a great support on both a scientific and a personal level throughout the years. I really appreciate and value her friendship, guidance, and constant encouragement.

I would like to thank the present and past members of the lab, particularly Dr. Brooke A. McDaniel, Dr. Jerneja Tomšič and Dr. Enrico Caserta, for their helpful discussions and encouragement during our time together. I would also like to thank Susan Tigert and Chris Woltjen for their technical assistance. I would like to thank Mike Zianni from the Plant Microbe and Genome Facility for his help with the qRT-PCR assays.

I wish to express my special thanks to Dr. Madhura Pradhan for her constant encouragement and support throughout the years.

Most of all, I am truly grateful to my family, especially my husband Ashish and my parents Neela and Prakash Kurlekar. Without their encouragement, patience and support this would not have been possible.

VITA

January 19, Born New Brunswick, New Jersey
B.S., Microbiology, University of Pune
2004-present...Graduate Teaching and Research Associate, Department of Microbiology, The Ohio State University

PUBLICATIONS

1. McDaniel BA, Grundy FJ, Kurlekar VP, Tomsic J, Henkin TM Identification of a mutation in the Bacillus subtilis S-adenosylmethionine synthetase gene that results in derepression of S box gene expression. J Bacteriol 188:

2. Lu C, Ding F, Chowdhury A, Pradhan V, Tomsic J, Holmes WM, Henkin TM, Ke A SAM recognition and conformational switching mechanism in the Bacillus subtilis yitJ S box/SAM-I riboswitch. J Mol Biol 404:

FIELDS OF STUDY

Major Field: Microbiology

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
VITA
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER 1 REGULATION OF GENE EXPRESSION BY RIBOSWITCHES
  1.1 Types of riboswitch classes
  1.2 Riboswitch classes
  RNA Thermosensors
  T box riboswitch
  Amino acid binding riboswitches
  L box riboswitch
  Glycine riboswitch
  Glutamine riboswitch
  Purine-sensing riboswitches
  Guanine riboswitch
  Adenine riboswitch
  Deoxyguanosine riboswitch
  PreQ1 riboswitch
  c-di-GMP
  glmS ribozyme
  M box riboswitch
  Fluoride riboswitch
  B12 riboswitch
  TPP riboswitch
  FMN riboswitch
  THF riboswitch
  Moco and Tuco RNA elements
  SAM-sensing riboswitches
  S box/SAM-I riboswitch
  SAM-II riboswitch
  SMK box/SAM-III riboswitch
  SAM-IV riboswitch
  SAM-V riboswitch
  SAH riboswitch
  Research goals

CHAPTER 2 CHARACTERIZATION OF THE metK LEADER RNA: AN ATYPICAL MEMBER OF THE Bacillus subtilis S BOX REGULON
  Introduction
  Materials and Methods
  Bacterial strains and growth conditions
  2.2.2 Genetic techniques
  β-galactosidase measurements
  Measurement of SAM pools in vivo
  In vitro transcription termination assays for determination of SAM pools
  In vitro transcription termination assays
  RNase H cleavage assay
  Total RNA extraction for primer extension analysis
  Primer extension analysis of the metK leader RNA
  Quantitative reverse transcriptase PCR (qRT-PCR) assay
  Results
  Response to varying SAM pools in vivo by the wild-type B. subtilis metK leader RNA
  Deletion mapping of the metK leader RNA
  Unique sequences are located on the 5' and 3' sides of the metK S box element
  The metK US and DS box regions are involved in a base-pairing interaction which is dependent on a functional SAM-binding domain
  Mutagenic analysis of the conserved metK US box element
  Physiological context of the G5U mutant
  Mutagenesis of the US box sequence
  Effect of the US box mutations on metK transcript stability and abundance
  In vitro analysis of the metK US and DS box mutants
  Conditions to generate a metK halted-complex during in vitro transcription
  Discussion

CHAPTER 3 IN VITRO INVESTIGATION OF THE SAM BINDING POCKET OF THE Bacillus subtilis yitJ RIBOSWITCH
  Introduction
  Materials and methods
  Construction of DNA templates for in vitro selection
  Site-directed mutagenesis
  In vitro transcription assays
  RNase H cleavage assay
  3.3 Results
  SAM-dependent structural transition of the wild-type yitJ leader RNAs with distinct 3' end-points
  Mutagenesis of a region in helix P3 to generate a pool of yitJ variants
  Effect of leader region mutations on the SAM-dependent structural transition of the yitJ RNA
  Identification of yitJ leader RNA determinants for SAM affinity and recognition
  Disruption of G11 causes loss of SAM binding, yet high constitutive transcription termination in vitro
  Mutating residues in the P3 helix of the yitJ SAM-binding pocket results in a surprising stabilization of the aptamer domain
  Substitution of the U85-A109 base-pair within the pseudoknot weakens the SAM-binding ability without affecting the termination efficiency
  Termination efficiency of wild-type and U85-A109 variant yitJ constructs in response to SAM analogs
  Discussion

CHAPTER 4 INVESTIGATION OF THE FUNCTION OF S BOX RIBOSWITCH STRUCTURAL ELEMENTS: INSIGHTS INTO FACTORS CONTRIBUTING TO S BOX RIBOSWITCH VARIABILITY
  Introduction
  Materials and methods
  Bacterial strains
  Genetic techniques
  Construction of hybrid leader RNAs
  metE and yusC hybrid leader RNAs
  metK and yusC hybrid constructs
  β-galactosidase measurements
  In vitro transcription termination assay
  Conditions for the metE and yusC wild-type and hybrid constructs
  Conditions for the metK-yusC hybrid leader RNA constructs
  4.3 Results
  Repression of S box gene expression in the presence of methionine is dependent on the SAM-binding domain of the S box leader RNA
  The termination efficiency of S box genes is dictated by the terminator/antiterminator domains
  The length of the metK upstream (US) box sequence affects expression of the metK-yusC hybrid leader RNAs
  The metK US box sequence contributes to the termination efficiency of the metK-yusC hybrid constructs
  Discussion

CHAPTER 5 SUMMARY AND DISCUSSION

LIST OF REFERENCES

APPENDIX A EFFECTS OF THE RelA MUTANT ALLELE ON B. subtilis S BOX GENE EXPRESSION
  A.1 Introduction
  A.2 Hypothesis
  A.3 Aim of study
  A.4 Materials and methods
  A.4.1 Bacterial strains and growth conditions
  A.4.2 Genetic techniques
  A.4.3 β-galactosidase measurements
  A.4.4 Determination of SAM pools in vivo
  A.5 Results
  A.5.1 Methionine prototrophic strains containing a wild-type or relA1 allele exhibit distinct S box-lacZ gene expression profiles, despite similar in vivo SAM pools
  A.5.2 The relA1 mutant allele does not affect S box-lacZ expression in a methionine auxotroph
  A.5.3 A relA null strain fails to exhibit high SAM pools during methionine starvation
  A.6 Discussion

LIST OF TABLES

Table 1.1 Known classes of riboswitches based on ligand type
Expression of wild-type metK-lacZ transcriptional fusion in vivo
Expression of the metK-lacZ deletion mutants during IPTG assay
Oligonucleotide primers for the B. subtilis yitJ leader RNA
Mutational analysis of the B. subtilis yitJ S box riboswitch
In vitro transcription termination of wild-type B. subtilis yitJ in the presence of SAM or SAM analogs
In vitro transcription termination of B. subtilis yitJ variants in the presence of SAM or SAM analogs
DNA oligonucleotides for metE and yusC leader RNA constructs
DNA oligonucleotides for metK and yusC leader RNA constructs
In vivo expression analysis of the wild-type and hybrid constructs
In vitro analysis of the wild-type and hybrid metE and yusC leader RNAs
Sequences of the metK and yusC hybrid leader RNA constructs

LIST OF FIGURES

Figure 1.1 Models for sensing of regulatory signals by leader RNAs
RNA thermosensors
The T box mechanism
Glycine riboswitch
Purine riboswitch variants and their ligand specificities
Proposed mechanism for allosteric ribozyme-mediated gene control
Model for regulation of S box gene expression in response to SAM
Crystal structure of the B. subtilis yitJ leader RNA bound to SAM
Crystal structure of the E. faecalis metK SMK box riboswitch
Plasmid used to generate strain BR151 Pspac-metK
Construction of the BR151 Pspac-metK strain
Measurement of in vivo SAM pools and β-galactosidase activity during IPTG limitation
In vivo expression of the wild-type metK-lacZ transcriptional fusion
2.5 The predicted secondary structure of the B. subtilis metK leader RNA
Alignment of S box sequences from metK genes in Firmicutes
The alignment of the metK upstream (US) and downstream (DS) box sequences from Firmicutes
Primer extension analysis of the B. subtilis metK leader RNA to map the 5′ transcriptional start-site
Oligonucleotide-directed RNase H cleavage mapping of the B. subtilis metK leader RNA
In vivo expression analysis of wild-type metK compared to the metK US box mutant constructs using the IPTG assay
In vivo expression analysis of wild-type metK compared to the metK DS box mutant constructs and metK US-DS box double mutants using the IPTG assay
Measurement of RNA abundance of metK transcripts
Expression profiles of the wild-type and mutant metK-lacZ transcripts
In vitro analysis of the metK US box mutants
In vitro analysis of the metK DS box mutants
Proposed model for regulation of B. subtilis metK gene expression
B. subtilis yitJ leader RNA structural model
The B. subtilis yitJ leader RNA
SAM-dependent structural transition of the B. subtilis yitJ S box RNA using RNase H cleavage assay
A close-up view of the B. subtilis yitJ SAM-binding pocket
3.5 SAM-dependent structural transition of the pools of wild-type and mutant yitJ transcripts
SAM-dependent structural transition of wild-type and yitJ variant RNAs
A SAM titration of the yitJ-AAT template in the RNase H cleavage assay
The SELEX scheme
Nucleotides within the SAM-binding pocket of the B. subtilis yitJ RNA
Nucleotides within the SAM-binding pocket of the B. subtilis yitJ RNA
In vitro transcription termination assay
Chemical structures of SAM and SAM analogs
Methionine biosynthesis pathways in B. subtilis
Predicted secondary structures of the yusC and metE leader RNAs
In vivo expression assay of the wild-type and hybrid leader RNA constructs during methionine starvation
SAM-dependent transcription termination of an S box riboswitch
Direct comparison of the terminator/antiterminator competition
Direct comparisons of the binding domains
In vivo expression analysis of the wild-type metK and yusC leader RNAs
In vivo analyses of the metK-yusC hybrid leader RNA fusions
In vivo expression of metK-yusC hybrid leader constructs
In vitro transcription termination analysis
A.1 Measurement of in vivo SAM pools and β-galactosidase activity
A.2 In vivo expression assay during methionine starvation
A.3 Measurement of in vivo SAM pools and β-galactosidase activity

LIST OF ABBREVIATIONS

Å           angstrom
aa          amino acid
AASD        anti-anti-Shine-Dalgarno
AdoCbl      5′-deoxy-5′-adenosylcobalamin
aa-tRNA     aminoacyl-tRNA
AEC         aminoethylcysteine
ASD         anti-Shine-Dalgarno
ATP         adenosine triphosphate
bp          base-pair
cDNA        complementary DNA
CTP         cytidine triphosphate
DAP         diaminopimelate
DNA         deoxyribonucleic acid
FMN         flavin mononucleotide
GlcN6P      glucosamine-6-phosphate
GTP         guanosine triphosphate
IPTG        isopropyl β-D-1-thiogalactopyranoside
ITC         isothermal titration calorimetry
Kd          dissociation constant
LB          Luria Bertani
LysRS       lysyl tRNA synthetase
mRNA        messenger RNA
MTA         methylthioadenosine
MTR         methylthioribose
NAIM        nucleotide analog interference mapping
nt          nucleotide
NTP         nucleotide triphosphate
NMR         nuclear magnetic resonance
O.D.        optical density
ORF         open reading frame
ppGpp       guanosine tetraphosphate
PAGE        polyacrylamide gel electrophoresis
PCR         polymerase chain reaction
preQ1       7-aminomethyl-7-deazaguanine
qRT-PCR     quantitative real time-polymerase chain reaction
RBS         ribosome binding site
RNA         ribonucleic acid
RNAP        RNA polymerase
RNase       ribonuclease
ROSE        repression of heat-shock gene expression
SAC         S-adenosylcysteine
SAH         S-adenosylhomocysteine
SAM         S-adenosylmethionine
SAXS        small angle X-ray scattering
SD          Shine-Dalgarno
T0          time zero
t1/2        half-life
TBAB        tryptose blood agar base
Term1/2     half-maximal termination
THF         tetrahydrofolate
TIC         translation initiation complex
TMP         thiamin monophosphate
TPP         thiamin pyrophosphate
Tris-HCl    tris(hydroxymethyl)aminomethane hydrochloride
tRNA        transfer RNA
uORF        upstream open reading frame
UTP         uridine triphosphate
UTR         untranslated region
X-gal       5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside

CHAPTER 1

REGULATION OF GENE EXPRESSION BY RIBOSWITCHES

Bacteria constantly modulate gene expression in response to fluctuating physical and chemical parameters, which ensures that the cell uses valuable resources without wasteful energy expenditure. The central dogma states that genetic information is transferred from deoxyribonucleic acid (DNA) via messenger ribonucleic acid (mRNA) to protein (Crick 1970). Regulation can occur at any stage during this process. A variety of mechanisms that regulate gene expression in bacteria have been uncovered to date. The versatility of RNA as a regulatory molecule allows RNA elements within a transcript to modulate gene expression at a multitude of levels, such as transcription, translation and mRNA stability.

Over the past decade, studies have identified regulatory sequences that typically reside in the 5′-untranslated regions (5′ UTRs) of bacterial mRNAs, also known as leader regions. The majority of these RNA elements, termed riboswitches (Nahvi et al. 2002), have been identified in Gram-positive bacteria. Riboswitches serve as genetic regulatory switches that sense and respond specifically to environmental signals to regulate expression of the downstream gene in the absence of any protein factor. The environmental signals include coenzymes, nucleotide derivatives, amino acids, sugars, metal ions, small RNAs and physical parameters such as temperature. From a physiological standpoint, bacterial cells need to sense these environmental signals and respond to their dynamic nature. Thus, the response of a riboswitch to its environmental signal either upregulates or downregulates gene expression, depending on whether the system is an "on" or an "off" switch.

Riboswitches generate complex structures to perform two important functions, signal recognition and conformational switching (Breaker 2010). In the case of simple metabolite-binding riboswitches, the RNA consists of one aptamer domain and one regulatory expression platform. The aptamer domain forms a selective ligand-binding pocket that confers specific recognition of the cognate metabolite. Metabolite binding leads to a conformational shift in the RNA structure. This results in formation of one of two mutually exclusive structures that either permit or prevent expression of the downstream coding region. The key feature that connects the ligand-binding domain and the regulatory expression platform is sequence complementarity, which leads to the existence of the alternate RNA conformations.
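The role of sequence complementarity in creating mutually exclusive structures can be illustrated with a minimal sketch. The sequences and the function name below are invented for illustration and are not taken from any leader RNA described in this work; only strict Watson-Crick pairs are considered, and real RNA folding involves wobble pairs, loops and thermodynamics that this toy check ignores.

```python
# Toy illustration only: all sequences here are hypothetical.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_form_helix(five_prime: str, three_prime: str) -> bool:
    """Return True if two equal-length RNA segments can pair
    antiparallel using only Watson-Crick base pairs."""
    if len(five_prime) != len(three_prime):
        return False
    return all(
        (x, y) in WATSON_CRICK
        for x, y in zip(five_prime, reversed(three_prime))
    )


# A hypothetical "shared" segment that is complementary to both an
# upstream and a downstream partner. Because each nucleotide can pair
# with only one partner at a time, the two helices cannot coexist.
upstream_partner = "GGAGCC"
shared_segment = "GGCUCC"
downstream_partner = "GGAGCC"

helix_one = can_form_helix(upstream_partner, shared_segment)
helix_two = can_form_helix(shared_segment, downstream_partner)
print("upstream helix possible:", helix_one)
print("downstream helix possible:", helix_two)
```

When both checks succeed, the shared segment is a candidate switching element: whichever partner captures it determines which of the two alternate conformations forms.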

Figure 1.1 Models for sensing of regulatory signals by leader RNAs. A. Transcription termination control. Interaction of a regulatory molecule (a small RNA or a small metabolite) modulates the structure of the nascent RNA as it emerges from RNAP. This structural modulation determines whether the transcript folds into the helix of an intrinsic terminator, resulting in premature termination of transcription, or a competing antiterminator that sequesters sequences necessary for formation of the terminator, resulting in continued transcription and expression of the downstream coding sequence. B. Translational control. Interaction of a regulatory molecule, or changes in temperature, determine whether the RNA transcript folds into a structure that sequesters the Shine-Dalgarno (SD) sequence of the downstream coding region. Sequestration of the SD results in inhibition of translation, while release of the SD, either directly or by folding of the RNA into a competing structure that allows access of the translational machinery to the SD, allows expression of the downstream coding sequence. Adapted from (Grundy and Henkin 2006).

Riboswitch-mediated gene control in bacteria takes place mostly at the level of premature transcription termination (i.e., transcription attenuation) or translation initiation (Figure 1.1). Typically, transcription termination in bacteria is directed by intrinsic or factor-independent terminators. Intrinsic transcription termination involves folding of the nascent RNA transcript into a G-C-rich helix followed by a U-rich tract. While the stable G-C-rich hairpin induces pausing of the transcribing RNAP complex, the relatively weak binding between the poly U residues in the nascent RNA transcript and the corresponding poly A sequence in the DNA facilitates dissociation of the RNAP from the DNA template, releasing the nascent mRNA and terminating transcription. The poly U residues may also induce RNAP stalling, thereby providing time for the RNA hairpin to form (Nudler and Gottesman 2002, Peters et al. 2011).

The majority of the riboswitches that regulate at the transcriptional level are "off" switches, in that the presence of a regulatory signal represses expression of the downstream gene. Ligand binding to the aptamer domain results in folding of the leader RNA into an intrinsic terminator (T) hairpin (Figure 1.1A). In the absence of the signal, the RNA forms an alternate structure termed the antiterminator (AT). The AT is more stable than the T and serves as the default state in the absence of a regulatory signal. Formation of the AT structure prevents formation of the T helix, because the AT sequesters a sequence that participates in the formation of the T. When the RNAP encounters the AT, the RNAP continues transcription into the downstream coding region, resulting in expression of the downstream gene. As the AT structure is the default state, the T helix can form only when a third structure termed the anti-antiterminator (AAT) forms. The AAT is stabilized in the presence of the cognate ligand and, as the name suggests, it prevents formation of the AT structure by sequestering a sequence involved in formation of the AT. The presence or absence of the signal therefore dictates the conformation of the RNA, which in turn controls expression of the downstream coding region.

Unlike "off" switches, some transcriptional riboswitches function as "on" switches (Mandal and Breaker 2004, Mandal et al. 2004). In this case, the default state of the riboswitch RNA is the T structure, which forms in the absence of ligand, resulting in low gene expression. Signal recognition by the RNA promotes formation of the AT structure and prevents formation of the T helix. This type of riboswitch results in upregulation of gene expression.

In contrast to the intrinsic termination described above, a second general mechanism of transcription termination employed by bacteria involves specific protein factors. Factor-dependent termination of transcription requires binding of a protein (designated Rho) to a Rho utilization (rut) site in the nascent transcript. The rut site is ~70 nt long and consists of a cytosine (C)-rich sequence that is relatively unstructured. The Rho protein functions as an RNA helicase and travels in an ATP-dependent manner towards the 3′ end of the RNA. The Rho protein contacts the paused RNAP and terminates transcription by dissociating the transcription elongation complex (TEC) (Boudvillain et al. 2010, Peters et al. 2011, Platt 1994). A recently identified riboswitch employs Rho-dependent termination of transcription (instead of the typical intrinsic termination) and represents the newest mechanism of riboswitch-mediated control (Hollands et al. 2012) (see section 1.2.7).

Riboswitch-mediated regulation at the level of translation initiation proceeds by occlusion of the Shine-Dalgarno (SD) sequence (Figure 1.1B). The SD sequence is located within the ribosome-binding site (RBS), and availability of the RBS for binding of the 30S ribosomal subunit is crucial for initiation of translation. For most of the riboswitches that regulate at the level of translation initiation, the presence of a regulatory signal downregulates gene expression. In these "off" switches, when gene expression is downregulated, the SD sequence is occluded through base-pairing interactions with a complementary upstream sequence, the anti-Shine-Dalgarno (ASD) sequence. In the absence of the signal, the ASD sequence pairs with a third sequence located upstream from the ASD called the anti-anti-Shine-Dalgarno (AASD). When the ASD-AASD pairing is favored, the SD sequence becomes free and available to interact with the translation initiation complex (TIC), which results in high expression of the downstream gene.

Some translational switches function as "on" switches, such that the default state of the RNA favors occlusion of the SD region, resulting in low gene expression. For translational riboswitches that function as "on" switches, the presence of a regulatory signal causes the RNA to fold into a conformation that promotes ASD-AASD pairing. This frees the SD region for binding of the translation initiation complex, leading to upregulation of gene expression.

In addition to the above-described regulatory mechanisms, riboswitch-mediated gene control has also been identified at the level of ribozyme self-destruction, mRNA

splicing and mRNA stability (Cheah et al. 2007, Croft et al. 2007, Winkler et al. 2004). This chapter will describe the known mechanisms of riboswitch-mediated gene control based on the specific ligands recognized by each class.

1.1 Types of riboswitch classes

Riboswitches modulate gene expression in response to environmental signals that range from metabolites and inorganic ions to temperature and small RNAs. A number of structures (solution as well as X-ray crystal) have revealed the architecture of the compact binding pockets that result in high affinity and specificity for the various ligands. Table 1.1 categorizes each riboswitch class based on the molecular recognition signal and highlights the structures that are currently available.

Table 1.1 Known classes of riboswitches based on ligand type

Riboswitch | Molecular signal | High-resolution structure
RNA thermosensors | Temperature | -
T box | Amino acids via uncharged transfer RNA (tRNA) | (Gerdeman et al. 2003, Wang et al. 2010)

Amino Acids
Glycine | Glycine | (Butler et al. 2011)
L box | Lysine | (Garst et al. 2008; Serganov et al. 2008)
Glutamine | Glutamine | -

Nucleotide derivatives
G box | Guanine | (Batey et al. 2004; Noeske et al. 2005; Serganov et al. 2004)
Adenine | Adenine | (Batey et al. 2004; Noeske et al. 2005; Serganov et al. 2004)
2′-dG | 2′-deoxyguanosine | (Pikovskaya et al. 2011)
PreQ1 | 7-aminomethyl-7-deazaguanine | (Klein et al. 2009)
c-di-GMP-I | Cyclic-di-GMP | (Kulshina et al. 2009; Smith et al. 2009; Smith et al. 2010a)
c-di-GMP-II | Cyclic-di-GMP | (Smith et al. 2011)

Sugars
glmS ribozyme | Glucosamine-6-phosphate | (Klein and Ferre-D'Amare 2006; Cochrane et al. 2007)

Ions
M box | Magnesium | (Dann et al. 2007; Wakemann et al. 2009; Ramesh et al. 2011)
Fluoride | Fluoride | (Ren et al. 2012)

Coenzymes
B12 | 5′-deoxy-5′-adenosylcobalamin | -
THI box | Thiamin pyrophosphate | (Edwards and Ferre-D'Amare 2006; Noeske et al. 2006; Serganov et al. 2006; Thore et al. 2006; Thore et al. 2008)
FMN | Flavin mononucleotide | (Serganov et al. 2009; Vicens et al. 2011)
THF | Tetrahydrofolate | (Huang et al. 2011; Trausch et al. 2011)
MoCo | Molybdenum | -
WCo | Tungsten | -
S box (SAM-I) | S-adenosylmethionine | (Montange and Batey 2006; Lu et al. 2010; Stoddard et al. 2010)
SAM-II | S-adenosylmethionine | (Gilbert et al. 2008)
SMK box (SAM-III) | S-adenosylmethionine | (Lu et al. 2008)
SAM-IV | S-adenosylmethionine | -
SAM-V | S-adenosylmethionine | -
SAH | S-adenosylhomocysteine | (Edwards et al. 2010)

One of the earliest and most prevalent riboswitches to be identified is the THI box, which recognizes thiamin pyrophosphate (TPP). The TPP-responsive riboswitch has been found in all three domains of life (Miranda-Rios et al. 2001, Sudarsan et al. 2003, Winkler et al. 2002a). A few early-discovered riboswitches also sense vitamin-derived coenzymes such as adenosylcobalamin (vitamin B12 or AdoCbl) and flavin mononucleotide (FMN) (Mironov et al. 2002, Nahvi et al. 2002, Winkler et al. 2002b). Subsequently reported riboswitch classes include RNAs that recognize S-adenosylmethionine (SAM), lysine, guanine/adenine, glycine and the bacterial second messenger c-di-GMP (Grundy and Henkin 1998, Grundy et al. 2003, Mandal et al. 2003, Mandal et al. 2004, Sudarsan et al. 2008, Weinberg et al. 2007). Together, these represent the ten most common riboswitch classes currently known (Breaker 2011).

Unlike most of the characterized riboswitches, RNA thermosensors do not bind a regulatory signal directly. Rather, this class responds to a change in temperature to form a simple thermoresponsive base-paired structure (Altuvia et al. 1989, Morita et al. 1999b, Narberhaus et al. 1998). However, similar to most other riboswitch classes, RNA thermosensors also employ Watson-Crick base-pairing to control gene expression.

Early discoveries of riboswitches were made by manually examining sequence alignments to generate predicted secondary structures (Grundy and Henkin 1993, Grundy and Henkin 1998). A few riboswitch classes are known to be highly conserved in sequence and structure and therefore can be relatively easy to identify. However, identification of several putative novel riboswitches has been complicated by low sequence conservation, making discovery much more tedious. For example, the T box leader RNAs display only a 14 nt T box consensus sequence and a few other small conserved sequence elements, but exhibit extensive secondary structural conservation (Grundy and Henkin 1993, Henkin et al. 1992). Recent advances in bioinformatics search tools have identified several putative structured RNA motifs within noncoding regions of mRNAs, resulting in an expansion of the riboswitch field (Bengert and Dandekar 2004, Chang et al. 2009, Clote et al. 2012, Singh et al. 2009).

1.2 Riboswitch classes

RNA Thermosensors

RNA thermosensors, also termed RNA thermometers, have one of the simplest architectures among riboswitches (Narberhaus et al. 2006). In most cases, these elements are located in the 5′ UTRs of bacterial heat- and cold-shock genes as well as virulence genes (Giuliodori et al. 2010, Johansson et al. 2002, Morita et al. 1999a, Waldminghaus et al. 2007). RNA thermosensors differ from other riboswitches in that there is no need for an effector molecule-binding domain. To date, all known RNA thermosensors regulate gene expression at the level of translation initiation in response to a change in temperature. Typically, at low temperatures, the SD region is sequestered in a hairpin structure. Increasing temperatures destabilize the RNA structure, making the RBS accessible for translation to be initiated (Figure 1.2). Translational control results in a rapid response and ensures that the mRNA transcript is available for translation initiation as soon as the requirement for the gene products arises.
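The temperature dependence described above can be sketched with a simple two-state melting model, in which the hairpin is either folded (SD sequestered) or unfolded (SD accessible). This is an illustrative approximation only: the enthalpy and entropy values below are invented so that the midpoint falls near 37°C, they are not measurements for any thermosensor discussed here, and real RNA thermometers may not unfold in a strictly two-state manner.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)


def fraction_unfolded(temp_celsius: float,
                      delta_h: float = -40.0,
                      delta_s: float = -0.129) -> float:
    """Fraction of RNA with the hairpin open in a two-state model.

    delta_h (kcal/mol) and delta_s (kcal/(mol*K)) are hypothetical
    folding parameters chosen for illustration.
    """
    temp_kelvin = temp_celsius + 273.15
    # Free energy of folding at this temperature.
    delta_g = delta_h - temp_kelvin * delta_s
    # Equilibrium constant for folding (folded / unfolded).
    k_fold = math.exp(-delta_g / (GAS_CONSTANT * temp_kelvin))
    return 1.0 / (1.0 + k_fold)


for temp in (25, 30, 37, 42):
    print(f"{temp} C: fraction with SD accessible = "
          f"{fraction_unfolded(temp):.2f}")
```

With these assumed parameters the hairpin is mostly closed at 30°C and mostly open at 42°C, which mirrors the qualitative behavior expected of a heat-responsive thermosensor.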

Figure 1.2 RNA thermosensors. A. At low temperature, the SD-ASD pairing results in inhibition of translation initiation. B. At higher temperature, the ASD-SD helix is disrupted and the SD sequence is available for translation initiation. Adapted from (Henkin 2008).

The first RNA thermosensor was identified in the Escherichia coli rpoH gene, which encodes the alternate sigma factor σ32 or RpoH (Morita et al. 1999a, Morita et al. 1999b). Two regions within the rpoH mRNA form an extensive structure that blocks access of the ribosome to the SD sequence. An increase in temperature disrupts the RNA structure, thereby exposing the RBS for translation initiation. This enhances translation of the transcript for the alternate sigma factor, resulting in rapid induction of the heat-shock response.

Another heat-shock RNA thermosensor that uses RNA secondary structure to sense temperature is the repression of heat-shock gene expression (ROSE) element. This is the most prevalent RNA thermosensor, discovered initially in rhizobia (Narberhaus et al. 1998, Nocker et al. 2001a, Nocker et al. 2001b) and later identified in numerous alpha (α)- and gamma (γ)-proteobacteria such as E. coli and Salmonella (Waldminghaus et al. 2005). All known ROSE elements control the expression of small heat shock genes, such as the E. coli inclusion body-binding protein A (ibpA) gene. The ROSE-like element in the 5′ UTR of the ibpA gene is termed ROSE ibpA (Waldminghaus et al. 2009). The IbpA protein is induced dramatically (>100-fold) by heat shock conditions during growth of an E. coli biofilm (Kuczynska-Wisnik et al. 2010). RNA secondary structure predictions strongly suggest that the SD sequence in the ROSE ibpA RNA is masked at low temperatures and that translation of this transcript is enhanced upon exposure to higher temperatures (Waldminghaus et al. 2005, Waldminghaus et al. 2009).

RNA thermometers also play a crucial role in the induction of translation of virulence gene transcripts. For pathogenic bacteria, an increase in temperature to 37°C indicates successful invasion of a mammalian host and triggers the expression of genes encoding virulence factors. The prfA gene, which encodes the key transcriptional activator of Listeria monocytogenes virulence genes, is transcribed at 30°C and 37°C, but is translated only at 37°C. The 5′ UTR of the prfA gene from L. monocytogenes is folded in a structure that occludes the SD sequence at 30°C (Johansson et al. 2002). When the pathogen invades the mammalian host, the prfA transcript structure becomes destabilized at 37°C, which enables efficient translation of the prfA mRNA, resulting in activation of virulence gene expression. A similar mechanism has been proposed for the virulence gene transcriptional activator encoded by the lcrF gene in Yersinia pestis (Hoe and Goguen 1993).

RNA thermosensors have also been documented in the lytic-lysogenic decision of bacteriophage lambda (λ). The λ cIII gene product regulates the lysogenic pathway by stabilizing the λ CII regulatory protein. Two alternate RNA structures (A and B) control translation of the cIII gene in a temperature-dependent manner (Altuvia et al. 1989). At optimal growth conditions (37°C), the RNA is in conformation B, in which the RBS of the cIII mRNA is accessible. The 30S ribosomal subunit binds the RNA in structure B, resulting in translation initiation. The lysogenic cycle is favored as the concentration of the CIII protein rises. During severe heat stress (45°C), the cIII RNA undergoes a conformational change in which tertiary interactions of structure B are altered, resulting in a structure in which the translational initiation region of the cIII mRNA is blocked (structure A) (Altuvia et al. 1989). Occlusion of the RBS in structure A inhibits the 30S ribosomal subunit from binding, thereby preventing translation initiation of the cIII mRNA. As a result, cIII gene expression is downregulated at high temperatures, which leads to degradation of the phage λ CII protein and entry into the lytic pathway. Thus, the temperature-dependent equilibrium between cIII mRNA structures A and B is dependent on the binding of the 30S ribosomal subunit, which preferentially binds structure B and initiates translation. Entering the lytic cycle allows the phage to escape from the host during severe conditions (such as heat stress). The cIII RNA regulatory mechanism provides a unique example of a thermosensor in which gene expression is turned off in response to elevated temperatures.

It has been predicted that expression of the phage λ cIII gene is dependent on a sequence upstream of the cIII structural gene, which is recognized by the host RNase III enzyme. RNase III functions to expose the occluded cIII RBS, resulting in high expression of cIII (Altuvia et al. 1989, Altuvia et al. 1991). Binding (but not processing) by RNase III stabilizes the RNA in structure B at optimal growth conditions (37°C) and promotes translation initiation of the cIII mRNA. It is therefore possible that RNase III also controls the equilibrium between structures A and B. However, the exact mechanism by which RNase III regulates this equilibrium is not yet understood (Oppenheim et al. 1991).

Along with heat-shock RNA thermosensors, cold-induced genetic RNA switches also exist. The first such example was identified in the mRNA of the cspA gene in E. coli (Goldstein et al. 1990). The CspA protein functions as an RNA chaperone that binds and stabilizes single-stranded RNAs (Jiang et al. 1997). The cspA mRNA adopts one of two mutually exclusive conformations in a temperature-dependent manner. At optimal growth conditions (37°C), the 5′ UTR of the cspA transcript forms a highly folded conformation that sequesters the RBS in a helical structure and prevents translation initiation. However, when the mRNA is exposed to lower temperatures (around 10°C), a structural reorganization of the mRNA exposes the SD sequence in the loop of an alternate stem, which allows ribosome binding and translation initiation (Giuliodori et al. 2010). Thus, increased expression of the CspA protein prevents single-stranded RNA molecules from folding into detrimental conformations that might be stabilized upon exposure to lower environmental temperatures (Giuliodori et al. 2010, Jiang et al. 1997).

1.2.2 T box riboswitch

The T box mechanism is widely used among Gram-positive bacteria to regulate expression of aminoacyl-tRNA synthetase (aaRS) genes and amino acid biosynthesis and uptake genes (Grundy and Henkin 1993, Grundy et al. 2002, Henkin et al. 1992, Henkin and Grundy 2006). More than 1000 T box genes have been identified through bioinformatics analyses (Gutierrez-Preciado et al. 2009). T box riboswitches most commonly utilize a mechanism of transcription termination control. However, some T box RNAs in Gram-negative bacteria and members of the Actinomycetes are predicted to regulate at the level of translation initiation (Gutierrez-Preciado et al. 2009).

The T box nascent transcript includes an element that serves as an intrinsic transcriptional terminator. Sequences on the 5′ side of the terminator can also participate in formation of an alternate, less stable antiterminator structure. Binding of a specific uncharged tRNA stabilizes the antiterminator and therefore prevents formation of the terminator helix. This leads to increased synthesis of the full-length mRNA (Figure 1.3B). Binding of a charged cognate tRNA promotes termination indirectly, by preventing binding of the uncharged tRNA (Figure 1.3A). Thus, the T box riboswitch monitors the ratio of charged vs. uncharged tRNA species. Regulation by the T box mechanism maintains appropriate pools of aminoacylated tRNAs (aa-tRNAs) that are essential for cell viability (Grundy and Henkin 1993, Grundy et al. 1994, Grundy et al. 2002).

Figure 1.3 The T box mechanism. Regulation by the T box riboswitch occurs primarily at the level of transcription termination in response to the charging state of the cognate tRNA. A. Aminoacylated tRNA binds only to the Specifier Loop. When the cognate tRNA is charged with the correct amino acid (aa), the leader RNA contacts the tRNA at its anticodon stem-loop but fails to make key contacts with the antiterminator element via the acceptor stem of the tRNA. Under these conditions, a terminator helix is favored (blue-black) and transcription ceases. B. Uncharged tRNA interacts at both the Specifier Loop and the antiterminator and results in structural changes throughout the leader RNA. Under these conditions, an antiterminator element (red-blue) is stabilized, which sequesters sequences (blue) that otherwise participate in formation of the terminator helix, and transcription continues into the downstream coding region. The tRNA is shown in cyan; the amino acid (aa) is shown as a yellow circle attached to the 3′ end of the charged tRNA. Positions of base-pairing between the leader RNA and the tRNA (Specifier Loop-tRNA anticodon, antiterminator bulge-tRNA acceptor end) are shown as green lines. Adapted from (Green et al. 2010).
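The two leader RNA-tRNA contacts shown in Figure 1.3 are both antiparallel Watson-Crick pairings, so their sequence logic can be checked mechanically. The following sketch is illustrative only: the helper name is hypothetical, wobble and modified-base pairing are ignored, and the anticodons GUA (tyrosine) and GAA (phenylalanine) are supplied here as standard examples rather than taken from the figure.

```python
# Canonical Watson-Crick RNA base pairs (wobble pairs deliberately excluded)
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pairs_antiparallel(strand_5to3, partner_5to3):
    """Return True if two equal-length RNA strands, both written 5' to 3',
    form Watson-Crick pairs at every position when aligned antiparallel."""
    if len(strand_5to3) != len(partner_5to3):
        return False
    return all(
        (a, b) in WC_PAIRS
        for a, b in zip(strand_5to3, reversed(partner_5to3))
    )


# Specifier Sequence (codon in the leader RNA) against a tRNA anticodon
print(pairs_antiparallel("UAC", "GUA"))  # tyrosine codon, tyrosine anticodon
print(pairs_antiparallel("UUC", "GUA"))  # phenylalanine codon, tyrosine anticodon
print(pairs_antiparallel("UUC", "GAA"))  # phenylalanine codon, phenylalanine anticodon

# Antiterminator bulge (5'-UGGN-3') against the tRNA acceptor end (5'-NCCA-3')
print(pairs_antiparallel("UGGU", "ACCA"))
```

Only the matched codon-anticodon combinations and the covarying bulge-acceptor combination return True under this simplified rule.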

The T box system was initially uncovered based on the analysis of the tyrS gene from B. subtilis (Henkin et al. 1992). This study showed that a long leader region preceding the tyrS coding region contains an intrinsic transcriptional terminator. Immediately upstream of this terminator is a 14 nt sequence, termed the T box sequence (Henkin et al. 1992). Further studies involved manual examination of 10 aaRS leader sequences (Grundy and Henkin 1993). The predicted secondary structure of a T box riboswitch consists of three helical domains (Stems I, II, III) and a pseudoknot element (Stem IIA/B) that are present within the leader RNA and precede the T box sequence and intrinsic terminator helix (Grundy and Henkin 1993).

The identity of a single codon, termed the Specifier Sequence, within a specific internal loop of Stem I directs amino acid specificity for the T box mechanism. The leader RNA-tRNA interaction is facilitated by pairing of the cognate tRNA anticodon with the Specifier Sequence. Genetic analysis showed that mutating the tyrosine UAC codon to a phenylalanine UUC codon switches the response to phenylalanine limitation rather than tyrosine (Grundy and Henkin 1993). Replacement of the tyrosine UAC codon with a nonsense codon resulted in transcription termination, and a nonsense suppressor tRNA containing a compensatory anticodon mutation was able to restore expression (Grundy and Henkin 1993). These studies established that base-pairing between the Specifier Sequence of the leader RNA and the anticodon of the corresponding tRNA is essential for antitermination.

The competing antiterminator helix includes a portion of the T box sequence within a 7-nt bulge region on the 5′ side (5′-UGGNACC-3′, where N is variable) (Grundy and Henkin 1993). A crucial base-pairing interaction between the acceptor end of the uncharged tRNA (5′-NCCA-3′) and residues in the antiterminator bulge (5′-UGGN-3′, where the N residues covary) stabilizes the antiterminator and prevents formation of the competing terminator helix (Grundy et al. 1994). This second tRNA-leader RNA interaction is blocked by charged tRNA (Figure 1.3A) (Grundy and Henkin 1993, Grundy et al. 1994, Yousef et al. 2003).

Extensive mutational analysis of both the tyrS leader RNA sequence and tRNATyr demonstrated that the sequence and structural elements conserved in T box leader RNAs are important for antitermination in vivo (Grundy et al. 2002, Rollins et al. 1997). As tRNA interactions with the tyrS leader RNA could not be shown in vitro, a different T box leader RNA was chosen for these analyses. The B. subtilis glyQS leader RNA is a natural variant that lacks two large structural elements common to most T box leader RNAs. Uncharged tRNAGly promotes readthrough of the B. subtilis glyQS leader RNA in an in vitro transcription assay in the absence of any additional factors, indicating that tRNA alone is capable of generating a regulatory response in vitro (Grundy et al. 2002). Additional mutational analysis showed that an intact tRNA is required for antitermination in vitro (Yousef et al. 2003), and addition of an extra nucleotide at the 3′ end of the tRNA inhibits antitermination (Grundy et al. 2005). RNase H structural studies were conducted in which an antisense DNA oligonucleotide was used to probe the 3′ side of the terminator helix. These studies demonstrated that complexes generated in the absence of tRNA or in the presence of a charged tRNA mimic are in the terminator configuration, whereas complexes generated in the presence of uncharged tRNA are in the antiterminator configuration (Yousef et al. 2005).

The nuclear magnetic resonance (NMR) solution structure of the tyrS T box antiterminator helix revealed extensive stacking of the upper helix and the 3′ portion of the internal bulge onto the bottom helix (Gerdeman et al. 2003). It was suggested that the stacking interaction allows the tRNA to sample a set of conformations during binding (Gerdeman et al. 2003). Phylogenetic studies and genetic analyses had revealed the presence and importance of highly conserved structural elements, such as the GA (or kink-turn) and S-turn (or E loop) motifs in T box leader RNAs (Rollins et al. 1997, Winkler et al. 2001). Recent NMR studies have confirmed the presence of these motifs in the tyrS T box leader RNA. The structures revealed that the S-turn loop is located adjacent to the Specifier Sequence, while the GA motif is conformationally independent from the Specifier Sequence and is not affected by the presence of the Specifier Sequence (Wang et al. 2010, Wang and Nikonowicz 2011).

Amino acid binding riboswitches

L box riboswitch

The lysine biosynthesis pathway is essential in bacteria for the production of the amino acid lysine for protein synthesis and cell wall biosynthesis, and for generation of diaminopimelate (DAP), a key cell wall component. The lysine biosynthetic pathway also generates dipicolinate, which is required for endospore formation. Phylogenetic analysis of lysine biosynthesis genes revealed a complex leader RNA structural array, conserved in low G+C Gram-positive bacteria (e.g., Firmicutes) and γ-proteobacteria (Grundy et al. 2003, Sudarsan et al. 2003). Typically, the lysine (or L box) riboswitches from Gram-positive bacteria utilize a mechanism of transcriptional control. Lysine specifically promotes a structural shift in the B. subtilis lysC leader RNA that favors the terminator structure (Grundy et al. 2003, Rodionov et al. 2003, Sudarsan et al. 2003). Mutants with sequence changes that disrupt the highly conserved regions in the B. subtilis lysC leader RNA exhibit loss of lysine repression in vivo and loss of lysine-dependent transcription termination in vitro (Grundy et al. 2003). Regulation of L box genes from Gram-negative bacteria appears to take place at the level of translation initiation.

The high-resolution crystallographic structures of the lysine regulatory mRNA element (Garst et al. 2008, Serganov et al. 2008) reveal a complex architecture in which the binding pocket completely encapsulates the lysine molecule. These results indicate that lysine is located within a five-helical junctional core. Lysine recognition is stabilized by potassium (K+)-mediated hydrogen bonds to the lysine carboxyl oxygen atoms (Serganov et al. 2008). A recent study showed that K+ plays a key role in lysine-dependent termination by increasing the affinity of the lysC leader RNA for lysine, rather than having an impact on specificity (Wilson-Mitchell et al. 2012). The L box aptamer domain binds L-lysine with an apparent dissociation constant (Kd) of ~1 µM and exhibits a high level of molecular discrimination against closely related analogs such as D-lysine, DAP and ornithine (Sudarsan et al. 2003). A recent mutagenic characterization of the lysC leader RNA revealed variants with altered ligand specificities. Ligand recognition was attributed to the specific molecular interactions of lysine or lysine analogs with the nucleotides within the ligand-binding pocket (Wilson-Mitchell et al. 2012).

Mutations that confer resistance to the lysine analog aminoethylcysteine (AEC) have been mapped to the 5′ UTRs of the lysC genes of both B. subtilis and E. coli (Di Girolamo et al. 1988, Lu et al. 1991, Lu et al. 1992, Patte et al. 1998). However, AEC has 10-fold lower efficiency than lysine in promoting transcription termination (Grundy et al. 2003). Hence, AEC is unlikely to act as a repressor of L box expression in vivo. It has been shown that the lysyl-tRNA synthetase (LysRS) plays a role in misincorporating the lysine analog during translation, thereby acting as the primary cellular target for growth inhibition by AEC (Ataide et al. 2007). Resistance in L box mutants is likely due to derepression of expression of the downstream coding region, resulting in increased production of lysine, which outcompetes AEC during tRNA aminoacylation (Ataide et al. 2007).

Glycine Riboswitch

The glycine riboswitch elements usually reside in the 5′ UTRs of glycine degradation genes. The RNA acts as a transcriptional "on" switch to regulate expression of glycine transport and cleavage genes in response to an increase in glycine concentration. The glycine-binding riboswitch forms a unique structure in that two aptamer domains are present in tandem and gene expression is regulated through cooperative binding of the amino acid (Mandal et al. 2004). The ability of the RNA to bind glycine in a cooperative manner results in a transition from a fully off-state to a fully on-state across a narrow concentration gradient, thereby improving the sensitivity of regulation (Kwon and Strobel 2008, Mandal et al. 2004). Regulation by the glycine riboswitch results in greater flexibility of carbon metabolism within the cell (Mandal et al. 2004).

The glycine riboswitch from B. subtilis is part of the gcvT operon, which encodes proteins that form the glycine cleavage system (Mandal et al. 2004). The gcvT RNA motif exists in two forms, type I and type II, separated by a linker sequence (Figure 1.4). The glycine riboswitch is also present in the 5′ UTR of a gene from Vibrio cholerae (VC1422) that encodes a putative amino acid transporter (Mandal et al. 2004). In-line probing, which reveals metabolite-induced changes in aptamer structure through spontaneous RNA cleavage, along with equilibrium dialysis, indicated specificity toward glycine over alanine or serine.
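The sharpening effect of cooperative binding can be illustrated with the Hill equation. This is a generic sketch, not a model fitted to the glycine riboswitch: the Hill coefficients of 1 and 2 and the function names are assumptions used only to compare a non-cooperative aptamer with an idealized two-site cooperative one.

```python
def fraction_bound(conc, k_half, n):
    """Hill equation: fractional occupancy at ligand concentration `conc`.

    k_half : concentration giving half-maximal occupancy (same units as conc)
    n      : Hill coefficient (n = 1 means no cooperativity; assumed values)
    """
    x = (conc / k_half) ** n
    return x / (1.0 + x)


def fold_range_10_to_90(n):
    """Fold increase in ligand needed to move from 10% to 90% occupancy.

    At 10% occupancy (conc/k_half)**n = 1/9 and at 90% it equals 9,
    so the concentration ratio is 81 ** (1/n).
    """
    return 81.0 ** (1.0 / n)


for n in (1, 2):
    print(n, fold_range_10_to_90(n))
```

Under these assumptions, moving from 10% to 90% occupancy requires an 81-fold increase in ligand for n = 1 but only a 9-fold increase for n = 2, which is the sense in which cooperativity narrows the concentration range of the switch.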

Figure 1.4 Glycine riboswitch. Secondary structure model of the glycine aptamer domains. Shown are the consensus sequences of the RNA motifs I and II separated by a linker. Each aptamer has two conserved paired regions (P1 and P3) and two stem-loops (P2 and P4) that are not conserved. Adapted from (Kwon and Strobel 2008).

Nucleotide analog interference mapping (NAIM) and mutagenesis studies explored the chemical basis for cooperativity by the glycine riboswitch from the glycine permease operon of Fusobacterium nucleatum (gene FN0328) (Kwon and Strobel 2008). These studies revealed that the minor groove of the P1 helix from aptamer 1 and the major groove of the P3a helix from both aptamers facilitate a cooperative tertiary interaction (Kwon and Strobel 2008). A recent 3.6 Å crystal structure of the glycine riboswitch from F. nucleatum revealed the ligand binding sites and confirmed an extensive network of tertiary interactions mediated largely by A-minor contacts (Butler et al. 2011). It was predicted that the interaptamer stacking interactions between the P1 helix from aptamer 1 and the P3 helix from aptamer 2 play a role in cooperativity.

Small-angle X-ray scattering (SAXS), which provides information about the global size and shape of the RNA in solution, indicated a two-state transition of the glycine aptamer in response to Mg2+ and glycine (Lipfert et al. 2009). In the absence of glycine, the presence of Mg2+ alone leads to a significant conformational change in the unbound RNA. However, it is only in the presence of glycine that the RNA structure is further compacted by tertiary packing to form the biologically relevant ligand-bound state (Lipfert et al. 2009).

Glutamine aptamer

The glutamine riboswitch is the latest addition to the amino acid-recognizing class of aptamers. A structured non-coding RNA motif, glnA, was identified in the 5′ UTRs of cyanobacterial genes encoding ammonium transporters and glutamine and glutamate synthetases (Ames and Breaker 2011). This RNA is approximately 60 nt in length and is predicted to function in nitrogen metabolism by specifically recognizing the amino acid L-glutamine. The 67 nt glnA motif from Synechococcus elongatus (67glnA) was subjected to in-line probing in the presence of L-glutamine, which revealed an apparent Kd of ~575 µM. With the exception of D-glutamine, which binds to the 67glnA motif with 10-fold lower affinity, all other analogs fail to bind to the aptamer (Ames and Breaker 2011). Disruptive mutations in conserved stem regions of the glnA RNA motif prevent the RNA from exhibiting a conformational change in the presence of L-glutamine, but compensatory mutations restore the response (Ames and Breaker 2011). These data suggest the importance of structure over the precise sequence.

The possibility of glutamate as the natural metabolite (rather than glutamine) was tested based on the physiological concentrations of the two metabolites. Even though the intracellular level of glutamate is ~100 mM as opposed to ~4 mM for glutamine, the in-line probing assay fails to show any structural modulation even when 100 mM glutamate is added to the reaction. The glnA RNA occurs in a tandem arrangement (similar to the glycine riboswitch) with two or three aptamers separated by linker regions, suggesting cooperative binding of the amino acid. However, in-line probing assays have failed to confirm the suggested cooperativity for the glnA RNA aptamers (Ames and Breaker 2011).

A structural variant of the novel glutamine-responsive glnA RNA motif, termed the Downstream-peptide (DP), was identified in marine metagenomic sequences. As marine environments often experience nitrogen limitation, it is predicted that glutamine sensing by the DP RNA regulates nitrogen metabolism in cyanobacteria (Ames and Breaker 2011). The 83 DP RNA from Synechococcus sp. CC9902 shows an apparent Kd value of ~5 mM for L-glutamine, which is roughly 10-fold higher than the Kd value of the glnA motif, and the 83 DP RNA displays no affinity toward D-glutamine. The DP RNA motif with disruptive mutations in a predicted pseudoknot structure fails to exhibit structural modulation of the RNA in the presence of L-glutamine, and compensatory mutations restore the response of the RNA (Ames and Breaker 2011).
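To put the reported affinities in context, the fractional occupancy of an aptamer can be estimated from its apparent Kd under a simple 1:1 binding model with ligand in excess. The sketch below uses the Kd values (~575 µM for 67glnA, ~5 mM for the 83 DP RNA) and the intracellular glutamine level (~4 mM) quoted in the text; the single-site model and the function name are assumptions, and the calculation ignores any cooperativity or in vivo effects.

```python
def occupancy(ligand_mM, kd_mM):
    """Fraction of aptamer bound for single-site binding with ligand in excess:
    occupancy = [L] / ([L] + Kd)."""
    return ligand_mM / (ligand_mM + kd_mM)


GLUTAMINE_mM = 4.0  # approximate intracellular glutamine level from the text

# Apparent Kd values from the text, converted to mM
aptamers = {"67glnA": 0.575, "83 DP": 5.0}

for name, kd in aptamers.items():
    print(name, round(occupancy(GLUTAMINE_mM, kd), 3))
```

Under this simplified model, roughly 87% of the 67glnA aptamer and 44% of the 83 DP RNA would be ligand-bound at 4 mM glutamine, so both apparent Kd values lie within a physiologically responsive range.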

The precise mechanism of gene regulation for nitrogen metabolism by the naturally occurring glnA or DP RNA aptamers is unclear, due to the failure to identify an obvious expression platform downstream of either of the glutamine-binding aptamers (Ames and Breaker 2011). However, the discovery of the third amino acid-binding RNA expands the scope of metabolites recognized by natural aptamers.

Purine-sensing riboswitches

Members of the purine-sensing riboswitch class specifically recognize purines or modified purines. Although three riboswitches from this class are similar in sequence and secondary structure, they each specifically recognize a distinct ligand: guanine (G), adenine (A) or 2′-deoxyguanosine (2′-dG) (Kim et al. 2007, Mandal et al. 2003, Mandal and Breaker 2004). The specificity towards each ligand is dependent on the identities of nucleotides within the ligand-binding pocket (Figure 1.5). The fourth member of this riboswitch class adopts a distinct secondary structure to selectively bind the guanine analog prequeuosine 1 (preQ1) or 7-aminomethyl-7-deazaguanine (Roth et al. 2007). These purine-sensing riboswitches share a common mechanism by which they recognize their respective ligands, namely by Watson-Crick base-pairing.

Figure 1.5 Purine riboswitch variants and their ligand specificities. Depicted are the consensus sequence and secondary structure models for riboswitch aptamers that selectively respond to A. guanine, B. adenine or C. 2′-deoxyguanosine. Red nucleotides are present in greater than 90% of the guanine riboswitch representatives. Blue nucleotides in the adenine and 2′-deoxyguanosine aptamers differ from the guanine consensus. P, pairing; J, junction; L, loop region. Adapted from (Breaker 2011).

Guanine riboswitch

The guanine riboswitch was the first purine riboswitch to be identified (Mandal et al. 2003). The B. subtilis xpt-pbuX operon encodes a xanthine phosphoribosyltransferase and a xanthine-specific purine permease, and guanine was shown to repress the expression of this operon in vivo (Christiansen et al. 1997). It was proposed that a protein factor is responsible for the regulation of the xpt-pbuX operon. Failure to identify such a protein factor led to the investigation of a possible alternate regulatory mechanism. Phylogenetic analysis of the xpt-pbuX 5′ UTR revealed an RNA motif with a conserved sequence and secondary structure, termed the G box (Mandal et al. 2003). The secondary structure of the G box is characterized by three stems (P1, P2 and P3) arranged in a tuning fork-like orientation. P1 serves as the anchor for the two parallel P2-L2 and P3-L3 hairpins (Figure 1.5A).

In-line probing and equilibrium dialysis indicated that guanine binding by the riboswitch is specific and promotes formation of an intrinsic transcription terminator. Although the purine analogs xanthine and hypoxanthine induce modulation of spontaneous cleavage during in-line probing analysis, they fail to outcompete guanine during equilibrium dialysis, i.e., only an excess of unlabeled guanine (and not unlabeled analogs) can redistribute the tritiated guanine that associates with the RNA. Adenine and several other guanine analogs show very low or no affinity towards the G box. The apparent Kd of the guanine-binding aptamer for guanine is ~5 nM, and the Kd for xanthine and hypoxanthine is 10-fold higher (Mandal et al. 2003). Together, these results clearly indicate that the 5′ UTR of the xpt-pbuX RNA preferentially binds guanine over other purines and purine analogs (Mandal et al. 2003).

Adenine riboswitch

The adenine riboswitch was identified as a variant of the G box motif (Mandal and Breaker 2004). Analysis of the leader RNA of the B. subtilis ydhL (now known as pbuE) gene, which encodes a purine efflux pump, revealed several notable differences relative to the guanine-binding domain of the xpt RNA. The most important difference was observed at a strictly conserved nucleotide in the P1/P3 junction in the xpt sequence. The ydhL sequence carries a U at position 74 instead of the C observed in the xpt RNA (Figure 1.5B). Two additional adenine-responsive riboswitches were identified in the 5′ UTRs of the add (adenine deaminase) genes from Clostridium perfringens and Vibrio vulnificus (Mandal and Breaker 2004). In-line probing analysis indicated that the three variant RNAs are specific for adenine. The ydhL RNA binds adenine with high affinity (apparent Kd ~300 nM) and is selective against most adenine analogs (Mandal and Breaker 2004). The ydhL RNA, similar to the xpt RNA from the guanine riboswitch, binds the ligand by Watson-Crick base-pairing. Hence, a simple change from a G-C base-pair in the xpt RNA to an A-U base-pair in the ydhL RNA switches the molecular recognition of the RNA from guanine to adenine (Mandal and Breaker 2004).

In contrast to the guanine riboswitch, the default state of the adenine riboswitch is off, as the absence of ligand promotes formation of the intrinsic terminator within the ydhL leader RNA. The ydhL gene is predicted to encode a purine efflux pump that maintains the in vivo concentration of purines. Therefore, when adenine levels are high, expression of the ydhL gene is upregulated and purines are pumped out of the cell at a higher rate. The enzyme adenine deaminase breaks down adenine into hypoxanthine and ammonia during purine metabolism, reducing the levels of excess purine in the cell. Upregulation of the add gene in response to high concentrations of adenine is therefore consistent with its physiological role.
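The specificity switch described above reduces to a one-nucleotide rule, which can be written as a small lookup. This is a deliberately simplified sketch: the function name is hypothetical, only the discriminator position (74 in the xpt numbering) is considered, and any other residues that may contribute to specificity are ignored.

```python
# Purine ligand that forms a Watson-Crick pair with the pyrimidine
# found at the discriminator position of the aptamer.
DISCRIMINATOR_TO_LIGAND = {"C": "guanine", "U": "adenine"}


def predicted_ligand(nt74):
    """Return the purine expected to pair with the nucleotide at position 74."""
    nt = nt74.upper()
    if nt not in DISCRIMINATOR_TO_LIGAND:
        raise ValueError(f"no Watson-Crick purine partner modelled for {nt74!r}")
    return DISCRIMINATOR_TO_LIGAND[nt]


print(predicted_ligand("C"))  # xpt-like aptamer
print(predicted_ligand("U"))  # ydhL-like aptamer
```

The lookup makes explicit why a single C-to-U change at this position is sufficient, in this simplified view, to convert a guanine-responsive aptamer into an adenine-responsive one.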

54 The precise interactions of the two purine aptamers with their respective ligands and details of purine binding by the RNAs were revealed by high-resolution X-ray crystal structures and NMR analysis (Batey et al. 2004, Noeske et al. 2005, Serganov et al. 2004). These studies showed nearly identical three-dimensional (3-D) folds for the guanine and adenine riboswitches and suggested that the purines are enveloped completely within the ligand-binding pockets. The high-resolution analysis showed that the regulatory helix P1 is connected to the P2-L2 and P3-L3 hairpins and is stabilized by tertiary loop-loop interactions. The structural studies confirmed that a cytidine residue makes a Watson-Crick base-pair with the guanine ligand, and a uridine residue in the equivalent position creates a canonical base-pair with the adenine ligand (Batey et al. 2004, Noeske et al. 2005, Serganov et al. 2004). Although these interactions are critical for ligand recognition, they are not the only determinants for specificity. Additional residues within the aptamer junctions make up the conserved core, which results in tight binding of the ligand to the RNA Deoxyguanosine riboswitch The third class of purine riboswitches was identified as a variant of the guaninesensing aptamers only in a single organism Mesoplasma florum, a nonparasitic member of the class Mollicutes (Kim et al. 2007). 12 putative riboswitch variants are classified into type I-V, based on the sequence variations relative to the guanine consensus (Kim et al. 2007). Although the overall architecture of this riboswitch class is similar to that of the characterized purine aptamers, several sequence deviations have been observed in the 30

55 critical core region and the L2 and L3 loops, suggesting that the M. florum RNA has an altered ligand-binding pocket resulting in recognition of a metabolite other than guanine (Figure 1.5C). In-line probing in the presence of a series of guanine and guanosine derivatives revealed the ligand specificity of the variant RNAs. One RNA subclass, I-A, exhibits substantial structural modulation in the presence of 100 µm 2 -dg and reveals a pattern consistent with the formation of a three-way junction similar to the previously characterized purine riboswitches (Kim et al. 2007). The I-A RNA binds 2 -dg with high affinity (apparent K d ~80 nm) and specificity, such that it discriminates against guanine as well as guanosine by approximately two orders of magnitude (Kim et al. 2007). Type I-A RNA binds 2 -dg using a core structure which is similar to that of the purine aptamers. Additional biochemical and structural studies revealed that mutating the uridine at position 51 to a cytidine (U51C) in a guanine riboswitch results in a switch in selectivity from guanine to 2 -dg (Edwards and Batey 2009). Thus, the 2 -dg riboswitch achieves its specificity through modification of key interactions involving the nucleobase. Reorganization of the ligand-binding pocket accommodates the additional sugar moiety. A recent 2.3 Å crystal structure of the M. florum 2 -dg riboswitch in complex with 2 -dg confirmed these findings (Pikovskaya et al. 2011). Bound 2 -dg is positioned in the center of the core where it forms a Watson-Crick base-pair with a cytidine equivalent to the discriminatory C74 in the guanine riboswitch (Batey et al. 2004, Noeske et al. 2005, Pikovskaya et al. 2011, Serganov et al. 2004). The J2-3 junction is predicted to be the 31

specificity determinant for the 2'-dG riboswitch as it encapsulates 2'-dG and discriminates against related compounds.

PreQ1 riboswitch

The fourth class of purine riboswitches includes RNAs that specifically recognize the purine analog prequeuosine 1 (preQ1) (Meyer et al. 2008, Roth et al. 2007). PreQ1 is a precursor molecule for biosynthesis of queuosine, a hypermodified 7-deazaguanosine nucleoside found in the anticodon wobble position of certain tRNAs (Harada and Nishimura 1972). The majority of preQ1-binding riboswitches function at the level of transcription termination. Two distinct classes of preQ1-binding riboswitches have been identified. The preQ1-I class (including subtypes 1 and 2) from B. subtilis was identified upstream of genes involved in biosynthesis of queuosine and was shown to specifically bind preQ1 with an affinity in the low nanomolar range (Roth et al. 2007). The most striking feature of this riboswitch is its small size (34 nt in length). This riboswitch consists of a simple stem-loop and a short, 3' A-rich tail (Roth et al. 2007). The preQ1-II class, identified upstream of genes that encode hypothetical membrane proteins in the Streptococcaceae family, shows distinct structural features (Meyer et al. 2008). The aptamer domain of preQ1-II lacks primary sequence conservation with the preQ1-I class. In addition, the preQ1-II RNA is twice as long as the preQ1-I riboswitch and is predicted to have four helices (Meyer et al. 2008). Crystal structure analysis of the preQ1-I queC RNA riboswitch (Class preQ1-I,

subtype 2) from B. subtilis revealed the presence of an H-type pseudoknot structure (Klein et al. 2009). Comparative studies of crystal and solution structure data suggested identical conformations of the P1/L3 region in the lower part of the binding pocket. However, significant structural differences were observed in regions above the preQ1 binding pocket. The L1-P2 region is more compact in the crystal structure than in the solution structure, whereas the base-pairing interactions in the L2 region are well-defined in the solution structure. Together, these results indicate conformational heterogeneity in the preQ1 RNA (Klein et al. 2009, Zhang et al. 2011). In-line probing shows structural modulation of the queC RNA only in the presence of preQ1, and not in the presence of varying concentrations of a series of purine analogs (Roth et al. 2007). Loss of preQ1-dependent structural modulation is observed for a mutant that contains a uridine in place of the highly conserved cytidine (C34). This indicates that the C34 position is involved in canonical Watson-Crick pairing with the ligand, analogous to the discriminator cytidine at position 74 of the G box (Batey et al. 2004, Mandal et al. 2003, Roth et al. 2007, Serganov et al. 2004). It is predicted that discrimination between the closely related ligands preQ1 and guanine occurs through the 7-aminomethyl group, unique to preQ1 (Kim and Breaker 2008, Roth et al. 2007).

c-di-GMP

Bis-(3'-5')-cyclic dimeric guanosine monophosphate (c-di-GMP) is a circular RNA dinucleotide that functions as a second messenger to trigger wide-ranging processes, such as the switch between motile and biofilm lifestyles, pilus and flagellum

formation, and virulence gene expression (Cotter and Stibitz 2007, Hengge 2009, Tamayo et al. 2007). c-di-GMP is generated from two guanosine-5'-triphosphates by the diguanylate cyclase (DGC) enzyme and is degraded by the phosphodiesterase (PDE) enzyme. A c-di-GMP-specific riboswitch element (c-di-GMP-I) was identified based on highly conserved RNA domains, such as the Genes for Environment, for Membrane and for Motility (GEMM) motif (Weinberg et al. 2007), upstream of the DGC and PDE sequences (Sudarsan et al. 2008). The V. cholerae riboswitch element (Vc2) is predicted to be an "on" switch with high expression in the presence of c-di-GMP to upregulate virulence genes. The Clostridium difficile (Cd1) riboswitch, however, works as an "off" switch to regulate the transcription of genes encoding flagellar proteins (Sudarsan et al. 2008). In-line probing assays conducted on the V. cholerae riboswitch element (Vc2) show a 1:1 RNA:c-di-GMP stoichiometry, with a tight affinity towards the ligand (apparent Kd ~1 nM). This RNA-ligand interaction is of particular importance as it shows the highest affinity reported for a c-di-GMP receptor and is one of the tightest RNA-small molecule interactions (Smith et al. 2010a). The cellular pools of c-di-GMP are predicted to range from nanomolar to low micromolar concentrations (Hengge 2009). The Kd value of 1 nM would then suggest that this RNA is always in the bound or "on" state, resulting in no regulation of gene expression. However, it is predicted that regulation of the c-di-GMP riboswitch is kinetically controlled based on the on- and off-rates of ligand interaction with the RNA. The off-rate is so slow that the ligand is effectively bound irreversibly to the RNA within the time-frame required for the switch to be triggered. It is therefore

suggested that the activity of this riboswitch is modulated primarily by the on-rate of ligand binding and that high intracellular concentrations of c-di-GMP facilitate rapid binding (Smith et al. 2009, Smith et al. 2010a). Two independent, simultaneously reported high-resolution crystal structures of the V. cholerae riboswitch element established that the ligand is bound within a 3-helix junction and is recognized by canonical Watson-Crick and Hoogsteen base-pairing interactions (Kulshina et al. 2009, Smith et al. 2009). The crystal structures identified a new helix formed by flanking nucleotides, including a G-C base-pair that interacts with the c-di-GMP molecule. Stacking interactions were identified as critical determinants for ligand recognition and high affinity (Kulshina et al. 2009, Smith et al. 2009). A separate crystal structure revealed the presence of metal ions, which were absent from both of the previous crystal structures (Smith et al. 2010a). A second class of c-di-GMP riboswitch (c-di-GMP-II) from C. difficile (84 Cd aptamer) shows distinct structural characteristics (Lee et al. 2010). A high-resolution X-ray crystal structure analysis suggested that the class II riboswitch recognizes c-di-GMP using a pseudoknot element that is closely involved in molecular recognition of the ligand through stacking interactions (Smith et al. 2011). The c-di-GMP-II riboswitch functions as a tandem RNA sensory system in which c-di-GMP binding by the RNA induces folding changes at atypical splice-site junctions to modulate alternative RNA processing (Lee et al. 2010). The 84 Cd aptamer is located at an unusually long distance (~600 bp) from its corresponding coding region. The long sequence between the aptamer and the coding region exhibits characteristics of a typical

group I intron (Lee et al. 2010). Group I introns are a class of self-splicing ribozymes that catalyze their own excision from RNA precursors (Cech 1990). Splicing by the group I intron follows a two-step sequential process. The first phosphodiester cleavage is at the 5' splice-site (ss). An exogenous guanosine docks in the active site of the ribozyme and releases the 5' exon. The precursor molecule undergoes a conformational change, followed by a second attack on a phosphodiester linkage at the 3' ss. The second cleavage is catalyzed by a different guanosine located in the terminal region of the intron. The two exons are ligated as the intron is released (Cech 1990). While group I introns typically function as selfish elements, it has been speculated that the c-di-GMP-II RNA and the group I intron collaborate to function as an allosteric ribozyme, wherein splicing is controlled by c-di-GMP (Lee et al. 2010). The aptamer and ribozyme domains incubated in the presence of GTP yield characteristic group I ribozyme products and splicing occurs only at the 3' ss. However, c-di-GMP, when present in addition to GTP, significantly increases production of the spliced exons by increasing the rate of attack (Lee et al. 2010). Based on the sequences and structures of the precursor mRNA and processed RNAs, it is proposed that the c-di-GMP-II riboswitch functions at the translational level (Lee et al. 2010). In the precursor RNA, the start codon resides in a helical structure that restricts ribosome access and precludes translation (Figure 1.6A). In the presence of both c-di-GMP and GTP, the ribozyme action yields a processed mRNA in which the 5' and 3' exons are ligated such that an intact RBS is located at an optimal distance upstream of

the exposed start codon (Figure 1.6B). Thus, allosteric activation of ribozyme self-splicing by c-di-GMP promotes translation. In the absence of c-di-GMP, the ribozyme action favors GTP attack only 4 nt upstream of the start codon in the 3' exon, and cleaves a sequence that serves as an RBS (Figure 1.6C). This alternate condition inhibits translation initiation and gene expression is off.

Figure 1.6 Proposed mechanism for allosteric ribozyme-mediated gene control. A. Precursor mRNA with the start codon sequestered by the ribozyme stem (P10). B. RNA processed in the presence of GTP and c-di-GMP unmasks the start codon and creates an RBS. C. RNA processed in the presence of GTP alone lacks an RBS. Adapted from Lee et al. (2010).
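The equilibrium argument made earlier in this section for the Vc2 aptamer (apparent Kd ~1 nM, cellular c-di-GMP in the nanomolar to low micromolar range) can be illustrated with a short calculation. The sketch below assumes simple 1:1 binding with ligand in large excess over RNA; the function name and the sample concentrations are hypothetical and chosen only to span the range quoted in the text.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Equilibrium occupancy for simple 1:1 binding.

    Assumes free ligand is approximately equal to total ligand
    (ligand in large excess over RNA); this is an assumption of the sketch.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


KD_VC2_NM = 1.0  # apparent Kd of the Vc2 aptamer quoted in the text

# Hypothetical sample concentrations spanning nanomolar to low micromolar.
SAMPLE_CONCENTRATIONS_NM = (1.0, 10.0, 100.0, 1000.0)

if __name__ == "__main__":
    for conc in SAMPLE_CONCENTRATIONS_NM:
        occupancy = fraction_bound(conc, KD_VC2_NM)
        print(f"[c-di-GMP] = {conc:7.1f} nM -> fraction bound = {occupancy:.3f}")
```

Whenever the ligand concentration is at or above the Kd, the equilibrium occupancy is at least one half, which is why a purely thermodynamic model predicts little regulatory range and a kinetic explanation was proposed instead.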

1.2.6 glmS ribozyme

The B. subtilis glmS gene encodes glucosamine-fructose-6-phosphate amidotransferase to generate glucosamine-6-phosphate (GlcN6P). GlcN6P is an essential component of sugar metabolism and cell wall biosynthesis, and is recognized by the glmS ribozyme. The glmS ribozyme from several Gram-positive organisms represses expression of the glmS gene in response to increasing concentrations of GlcN6P (Winkler et al. 2004). The glmS RNA functions as a catalytic ribozyme. Phylogenetic analysis revealed high sequence conservation in the glmS RNA secondary structural element that consists of four paired domains, P1-P4. The domains P1 and P2, in addition to a critical pseudoknot structure, form the essential components for ligand recognition. The dispensable P3 and P4 domains are involved mostly in catalytic rate-enhancement (Roth et al. 2006, Soukup 2006, Wilkinson and Been 2005, Winkler et al. 2004). The pseudoknot organizes the glmS core architecture by bringing the paired domains and the catalytic site in close proximity (Soukup 2006). Effector binding by the glmS element does not result in structural rearrangement, suggesting that the glmS binding pocket is pre-organized in the absence of ligand (Hampel and Tinsley 2006). Comparison of the high-resolution crystal structures of the apo and activator-bound forms confirmed the rigidity of the RNA element (Cochrane et al. 2007, Klein and Ferre-D'Amare 2006). The glmS cleavage reaction proceeds through a transesterification step and the

cleavage products possess 5'-hydroxyl and 2',3'-cyclic phosphate termini (McCarthy et al. 2005, Winkler et al. 2004). GlcN6P specifically increases the cleavage rate by 1,000-fold (Winkler et al. 2004). Additional studies showed that spontaneous cleavage in the absence of GlcN6P takes place only in Tris buffers and that GlcN6P enhances the rate of cleavage ~100,000-fold in a HEPES buffer (McCarthy et al. 2005). The cleavage due to Tris was attributed to a dependence on amine analogs for RNA self-cleavage. Ligand-activated catalysis was shown to be a function of the acid dissociation constant (pKa) of the amine group. It was concluded that the RNA lacks catalytic function on its own and that GlcN6P, specifically the amine group, acts as a coenzyme in this system rather than an effector of structural change in the RNA (McCarthy et al. 2005). A highly conserved, catalytically important guanosine residue is located in the active site (Cochrane et al. 2007). There is a strong interdependence between GlcN6P and the G residue, as catalytic rate enhancement occurs only when both GlcN6P and G are present. The N1 of the G residue acts as a general base upon GlcN6P binding (Klein et al. 2007). Mechanistic studies of the glmS leader RNA indicated that the metabolite-induced self-cleavage of the glmS ribozyme results in intracellular degradation of the downstream transcript by RNase J1, which targets transcripts with 5'-OH termini, and ultimately lowers production of the GlmS enzyme (Collins et al. 2007). The glmS ribozyme does not function as a metalloenzyme, as it lacks any active site metal ions (Cochrane et al. 2007, Klein and Ferre-D'Amare 2006). Characterization of the glmS ribozyme activity showed that divalent metal ions (such as Mg2+) play only a structural role, and that the lack of domains P3 and P4 results in an increased demand for

Mg2+ (Roth et al. 2006, Soukup 2006).

M box riboswitch

Magnesium ion (Mg2+) is an essential divalent metal ion in cellular systems and is critical for the function of all physiological processes. The first cation-responsive RNA sensor was identified in Salmonella enterica (Cromie et al. 2006). In vivo genetic analyses, mutagenesis and RNA structural probing assays indicate that expression of the mgtA gene, which encodes the Mg2+ transporter (MgtA), is controlled by its 5' UTR. It was hypothesized that the mgtA leader RNA, which is present in one of two alternate conformations depending on the intracellular concentration of the divalent metal ion, controls transcription of the downstream coding region (Cromie et al. 2006). Subsequent studies showed that mgtA expression is regulated at the level of transcription initiation by the PhoP/PhoQ two-component system, which responds to periplasmic Mg2+, and that the Mg2+-responsive mgtA 5' UTR exhibits control at the level of transcription elongation (Cromie and Groisman 2010). It is therefore suggested that two independent mechanisms are involved in Mg2+-dependent transcriptional regulation of mgtA in Gram-negative bacteria. The PhoP/PhoQ two-component system activates transcription initiation in response to micromolar levels of environmental Mg2+ and represses transcription initiation in response to millimolar concentrations of Mg2+ (Garcia Vescovi et al. 1996). Once transcription is initiated, the 5' leader region of the nascent mgtA transcript functions as an alternative Mg2+-sensing system. If the Mg2+ concentration increases in the bacterial cytoplasm, mgtA transcription is interrupted before reaching the

downstream coding region (Cromie et al. 2006). A recent study identified a novel component that controls the regulatory function of the Salmonella mgtA riboswitch (Zhao et al. 2011). A 17-residue proline-rich peptide, MgtL, is translated specifically under high Mg2+. The open reading frame (ORF) for MgtL is embedded in the Mg2+ riboswitch sequence. Data from structural probing and mutational studies indicate that the presence of high Mg2+ alters the base-pairing interaction in one riboswitch loop region to favor an alternate stem-loop by a stem-switching mechanism. Formation of this alternate structure subsequently reveals the RBS for mgtL translation. This prevents transcription from continuing into the downstream coding region (Zhao et al. 2011). Inhibition of mgtL translation (due to start codon mutations) under high Mg2+ conditions prevents premature termination of transcription, but leader peptide amino acid limitation does not prevent premature termination of transcription. Together, these observations suggest that the RNA conformational changes are independent of stalling of the translating ribosome (Zhao et al. 2011). Although regulation is predicted at the level of transcription termination, the mgtA leader RNA lacks an intrinsic transcription terminator consisting of a GC-rich hairpin followed by a run of Us (Cromie et al. 2006, Peters et al. 2011). Results from a recent study provide evidence that the mgtA riboswitch is regulated by a unique mechanism, which employs Rho-dependent transcription termination (instead of intrinsic termination) under high Mg2+ concentrations (Hollands et al. 2012). A sequence required for binding of Rho (R1) is located in the mgtA leader region. When Mg2+ levels are high, the RNA undergoes a conformational change that allows Rho to interact with the R1 site and

promote ATP hydrolysis resulting in Rho-dependent termination (Hollands et al. 2012). The recent data suggest that the truncated product observed during the study conducted by Zhao and coworkers is an artifact of pausing rather than true termination. Ribosome stalling is predicted to favor a conformation in which the R1 site is sequestered, thereby preventing it from interacting with Rho. When Mg2+ concentration is low, ribosome stalling on the MgtL peptide promotes mgtA transcription elongation, which inhibits Rho-dependent termination. When Mg2+ concentration is high, a ribosome translating the full mgtL ORF favors formation of an alternate conformation in which the R1 site is free to interact with Rho. Thus, complete translation of mgtL promotes Rho-dependent transcription termination (Hollands et al. 2012). A distinct Mg2+-responsive riboswitch, termed the M box riboswitch, was characterized in the Gram-positive bacterium B. subtilis (Dann et al. 2007). This metalloregulatory RNA functions to maintain Mg2+ homeostasis in the cell. The M box RNAs, identified originally in a bioinformatics search, regulate expression of three major families of Mg2+ transporters: CorA, MgtE and MgtA/MgtB P-type ATPases (Barrick et al. 2004). A detailed study involving genetic, biochemical and biophysical analyses showed that the M box RNA couples metal ion-induced RNA folding with genetic control (Dann et al. 2007). The 5' UTR of the B. subtilis mgtE gene contains a putative transcriptional terminator structure (Dann et al. 2007). Expression of the wild-type mgtE-lacZ reporter fusion is repressed selectively by Mg2+ in vivo, and mutations that disrupt the terminator/antiterminator sequences result in loss of Mg2+-responsive regulation (Dann et

al. 2007). Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) and in-line probing analyses showed a lowered reactivity of the M box RNA in the presence of Mg2+, suggesting that the aptamer domain is rearranged substantially upon association with Mg2+ to create a more compact architecture (Dann et al. 2007). Structural models of the M box riboswitch reveal the presence of 6 individual Mg2+ ions associated within 3 closely packed, nearly parallel helices, and formation of the tertiary structure is cooperative (Dann et al. 2007, Wakeman et al. 2009). A recent high-resolution crystal structure of the M box RNA was generated by replacing Mg2+ ions with Mn2+ ions (Ramesh et al. 2011). The manganese-chelated structure reveals metal ion binding by the RNA, which is important to facilitate long-range tertiary interactions. It further confirms the metal-dependent conformational change in the RNA structure (Ramesh et al. 2011).

Fluoride riboswitch

The fluoride riboswitch is the second metal-responsive genetic switch (Baker et al. 2012). However, it should be noted that fluoride is not considered an active metal, but rather belongs to the group of halides. Regulatory elements, such as the crcB RNA motif from the organism Pseudomonas syringae, were identified upstream of genes predicted to be involved in ion transport. The P. syringae crcB element is predicted to regulate at the level of translation initiation and activate expression of fluoride export genes. The activation of transport genes can reduce the cellular concentrations of fluoride that can otherwise be toxic to the cell (Baker et al. 2012).

The fluoride riboswitch consists of two helical regions separated by an asymmetrical central loop. The 5' terminal region of the RNA is predicted to base-pair with a sequence in the loop region through pseudoknot interactions. In-line probing experiments show that the crcB RNA motif binds free fluoride ions with an apparent Kd of ~60 µM and has the ability to discriminate against other halogen ions (such as chloride, bromide and iodide) (Baker et al. 2012). The minimum inhibitory concentration (MIC) of fluoride ions in E. coli containing the crcB gene is ~200 mM, but a crcB deletion strain is highly sensitive to lower concentrations of fluoride (MIC ~1 mM) (Baker et al. 2012). A subsequent study analyzed the crcB RNA from Thermotoga petrophila (Ren et al. 2012). Isothermal titration calorimetry (ITC) showed an apparent Kd value of ~140 µM. This study solved the crystal structure of the crcB RNA in complex with a fluoride ion at a resolution of 2.3 Å and revealed the ligand-RNA interactions. This was of particular interest as both the ligand and RNA are negatively charged molecules. Other examples of metabolite-responsive riboswitches that bind to negatively charged ligands are the thiamine pyrophosphate (see the TPP riboswitch section) and flavin mononucleotide (see the FMN riboswitch section) riboswitches. The fluoride ion is in a central core and is coordinated specifically to an inner shell of three Mg2+ ions. The metal ions are surrounded by, and coordinated to, an outer shell of five backbone phosphates and water molecules. Previous studies have examined the ability of nucleic acids to bind anions such as chloride. Results obtained from these studies reveal that electropositive groups within the RNA molecule (mainly amino, imino, and hydroxyl groups) create specific anion binding pockets (Auffinger et al.

2004). Along with metal ion coordination, fluoride-dependent pseudoknot formation, stacking and long-range interactions are critical in ligand recognition (Ren et al. 2012).

B12 riboswitch

The B12 riboswitch is a coenzyme-recognizing regulatory element that was identified upstream of cobalamin transport (btuB) and biosynthetic (cob operon) genes in E. coli and Salmonella (Nahvi et al. 2002). The B12 riboswitch recognizes 5'-deoxy-5'-adenosylcobalamin (AdoCbl or vitamin B12). Earlier studies showed that AdoCbl downregulates synthesis of the cobalamin transport protein BtuB by inhibiting ribosome binding (Nou and Kadner 2000). The B12 riboswitch (B12 box) has also been identified in several Gram-positive organisms, expanding the scope of this genetic element (Nahvi et al. 2004). Regulation by the B12 box is associated with transcriptional control of the metE gene from Streptomyces and Mycobacterium tuberculosis and of the ribonucleotide reductases from Streptomyces and Enterococcus faecalis (Baker and Perego 2011, Borovok et al. 2006, Warner et al. 2007). The B12 riboswitch element from E. coli binds AdoCbl with an apparent Kd of ~300 nM, and related analogs such as cyanocobalamin and methylcobalamin do not exhibit any measurable binding (Nahvi et al. 2002). Ligand binding sequesters the RBS, thereby controlling gene expression at the level of translation initiation (Nahvi et al. 2002). Studies conducted with AdoCbl analogs showed that stereochemical modification of the corrin ring renders the ligand inactive, both in vivo and in vitro (Gallo et al. 2008, Nahvi et al. 2002).

The B12 element is present in two different tandem arrangements, such that two riboswitch elements are located adjacent to each other, upstream of one ORF (Sudarsan et al. 2006). The tandem arrangement carries two B12 elements in conjunction with each other (as seen in Desulfitobacterium hafniense) or a B12 element adjacent to a distinct riboswitch class, such as the S-adenosylmethionine (SAM)-responsive S box riboswitch (seen in B. clausii; see the S box riboswitch section). The former example is similar to the glycine riboswitch with respect to structural architecture but does not employ cooperative ligand binding (refer to the glycine riboswitch section). The second type of arrangement yields composite gene control by having two distinct riboswitches arranged in tandem (Sudarsan et al. 2006). The B. clausii MetE enzyme is involved in methionine metabolism and generates methionine from homocysteine in the absence of AdoCbl. The alternate, more efficient isoenzyme, MetH, requires AdoCbl for the synthesis of methionine. Both metE and metH are regulated by S box riboswitches. In addition, metE is also regulated by a B12-responsive element. Therefore, under high AdoCbl levels, the metE gene is turned off, which results in the preferential use of MetH for methionine synthesis. When SAM levels are high, both genes are downregulated by the S box mechanism. The metE 5' UTR contains the S box riboswitch element upstream of the B12 element. Each structural element within the tandem riboswitch contains a terminator helix and generates an independent response to the respective ligands (Sudarsan et al. 2006). The presence of either molecular effector (SAM or AdoCbl) is sufficient to confer repression, and the binding affinity for one ligand is unaffected by the presence of the

other. Mutagenic analysis showed that disruption of the consensus sequence or secondary structure of either aptamer element affects the sensitivity only to the corresponding ligand, without affecting the function of the adjacent riboswitch (Borovok et al. 2006). This type of arrangement consequently achieves a greater degree of gene control.

TPP riboswitch

The thiamine pyrophosphate (TPP) riboswitch (or THI box) was among the earliest riboswitches to be discovered (Miranda-Rios et al. 2001, Winkler et al. 2002a). This riboswitch downregulates expression of genes involved in the biosynthesis and transport of the essential cofactor thiamine (vitamin B1) and its pyrophosphate derivative TPP. The THI box sequence was identified in a variety of Gram-positive and Gram-negative bacteria, as well as in archaea, fungi and plants (Miranda-Rios et al. 2001, Sudarsan et al. 2003, Winkler et al. 2002a). TPP-mediated riboswitch control has been observed at a variety of regulatory levels. In E. coli, the THI box is located in the 5' UTRs of the thiM and thiC genes that are downregulated by sequestration of the SD region. The B. subtilis thiM gene that contains a THI box in the 5' UTR is regulated only at the level of transcription termination (Mironov et al. 2002). The B. anthracis tenA gene contains two TPP riboswitches in tandem. Both elements respond independently to TPP, yet appear to function in concert with each other (Welz and Breaker 2007). TPP-mediated transcriptional regulation of a thiamine transporter was identified recently in an oral spirochete (Bian et al. 2011).

The TPP riboswitch also regulates at the level of mRNA splicing in certain filamentous fungi such as Neurospora crassa and Aspergillus oryzae (Cheah et al. 2007, Kubodera et al. 2003). TPP binding to the THI box riboswitch located in the NMT1 gene in N. crassa leads to increased production of an alternatively spliced product that contains upstream ORFs (uORFs). The uORFs compete for translation initiation and downregulate expression of the main ORF (Cheah et al. 2007). Arabidopsis thaliana and other plant species, as well as some photosynthetic algae, carry the TPP riboswitch in the 3' UTR of the thiC gene near the 3' poly(A) tail, suggesting control at the level of mRNA processing and mRNA stability (Croft et al. 2007, Sudarsan et al. 2003, Wachter et al. 2007). TPP binding to a THI box element located in an intron downstream of the thiC gene in A. thaliana results in increased splicing and production of a transcript with decreased stability relative to the unprocessed transcript. An extensive in vivo analysis of the E. coli thiM riboswitch showed that mutants can be categorized into two classes based on their expression profiles (Ontiveros-Palacios et al. 2008). Results from this investigation suggested that the mutations lock the RNA in one of two conformations that either inhibit or activate expression. RNase H structural mapping and 30S ribosomal subunit toeprinting assays suggested that TPP actively inhibits accessibility of the SD region (Ontiveros-Palacios et al. 2008). The crystal structures of the TPP riboswitch from various phylogenetic backgrounds have revealed the intricate ligand-RNA interactions (Edwards and Ferre-D'Amare 2006, Noeske et al. 2006, Serganov et al. 2006, Thore et al. 2006, Thore et al. 2008). The structures with plant and bacterial origin are highly similar, indicating

evolutionary conservation. The RNA employs a Y-shaped architecture, also seen in the purine riboswitches (refer to section 1.2.4). The structural studies examined the high specificity of the RNA towards its natural ligand TPP, in comparison to various analogs that lack either one or both of the phosphate moieties, such as thiamine monophosphate (TMP) or thiamine. The binding affinities of the RNA for these ligands are ~0.1 µM for TPP, 100 µM for TMP and 600 µM for thiamine (Winkler et al. 2002a). The RNA makes direct contacts with two surfaces of TPP, the pyrimidine ring and the pyrophosphate moiety (Sudarsan et al. 2003). Recognition of the phosphate moiety contributes to ~100 to 1000-fold higher affinity for TPP compared to its analogs (Serganov et al. 2006, Thore et al. 2006, Winkler et al. 2002a). However, the central thiazole ring of the ligand is not recognized directly by the RNA (Serganov et al. 2006). The thiazole component of TPP bridges the two RNA domains together and stabilizes the overall RNA fold. This provides an explanation for the ability of the antimicrobial agent pyrithiamine pyrophosphate (PTPP), which contains a pyridine ring in place of the central thiazole moiety (Thore et al. 2008), to bind the TPP riboswitch and downregulate gene expression (Sudarsan et al. 2005). The ability of PTPP to bind to THI box elements is considered a major cause of PTPP-dependent toxicity in bacterial cells. The high-resolution crystal structures of the THI box RNA confirmed the presence of at least two divalent metal ions. These cations are required to counteract the negative charges from the RNA phosphate backbone as well as from the pyrophosphate moiety of the ligand, resulting in a tight interaction (Edwards and Ferre-D'Amare 2006, Noeske et al. 2006, Thore et al. 2006). Results obtained from an ITC analysis supported

the divalent Mg2+-dependent folding of RNA and provide evidence for the role of metal ions in ligand affinity (Kulshina et al. 2010).

FMN riboswitch

The FMN riboswitch (also termed the RFN element) directs expression of the biosynthetic and transport genes of riboflavin (vitamin B2) and its derivative, flavin mononucleotide (FMN) (Mironov et al. 2002). At least two FMN elements have been identified in B. subtilis that bind the cofactor FMN and result in downregulation of the downstream genes. The two FMN elements control expression of essential downstream genes in distinct manners. The FMN biosynthetic gene ribD is regulated at the level of transcription termination, and regulation of the transport gene ribU (also known as ypaA) occurs by translation inhibition (Winkler et al. 2002b). In-line probing and fluorescence studies show that FMN binds the RNA with high affinity (apparent Kd ~5-10 nM), but the analogs flavin adenine dinucleotide (FAD) and riboflavin bind with much lower affinities (~300 nM and 3 µM, respectively) (Wickiser et al. 2005, Winkler et al. 2002b). RNA structure probing studies showed that the crucial phosphate moiety is necessary to discriminate between FMN and riboflavin. Riboflavin lacks the phosphate moiety and therefore is unable to generate a regulatory response (Winkler et al. 2002b). The negative charges contributed by the phosphates of the RNA and FMN are counteracted by divalent cations (such as Mg2+) (Wickiser et al. 2005, Winkler et al. 2002b), but the identity of these divalent cations is not crucial (Serganov et al. 2009).
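The apparent Kd values quoted above for FMN, FAD and riboflavin can be converted into fold discrimination and an approximate difference in binding free energy using ΔΔG = RT ln(Kd,analog / Kd,FMN). The sketch below is illustrative only: it takes the upper end of the reported FMN range (10 nM), and the temperature of 298.15 K is an assumption, since the assay temperature is not stated in this section. The function and constant names are hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)
TEMPERATURE_K = 298.15        # assumed; assay temperature is not given in the text

# Apparent Kd values from the text, in nM (FMN uses the upper end of ~5-10 nM).
KD_NM = {"FMN": 10.0, "FAD": 300.0, "riboflavin": 3000.0}


def fold_discrimination(kd_analog_nM: float, kd_ligand_nM: float) -> float:
    """Ratio of dissociation constants (analog relative to cognate ligand)."""
    if kd_analog_nM <= 0 or kd_ligand_nM <= 0:
        raise ValueError("Kd values must be positive")
    return kd_analog_nM / kd_ligand_nM


def delta_delta_g_kcal(kd_analog_nM: float, kd_ligand_nM: float,
                       temperature_K: float = TEMPERATURE_K) -> float:
    """Binding free energy penalty of the analog relative to the ligand."""
    ratio = fold_discrimination(kd_analog_nM, kd_ligand_nM)
    return GAS_CONSTANT_KCAL * temperature_K * math.log(ratio)


if __name__ == "__main__":
    for analog in ("FAD", "riboflavin"):
        fold = fold_discrimination(KD_NM[analog], KD_NM["FMN"])
        ddg = delta_delta_g_kcal(KD_NM[analog], KD_NM["FMN"])
        print(f"{analog}: {fold:.0f}-fold weaker, ddG = {ddg:.2f} kcal/mol")
```

Using the lower end of the FMN range (5 nM) instead would double each fold value and add RT ln 2 to each free energy difference.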

The high-resolution crystal structures from F. nucleatum and B. subtilis revealed FMN to be enveloped completely by the RNA (Serganov et al. 2009, Vicens et al. 2011). The RNA is arranged in a butterfly-like scaffold, similar to one of the several architectural modules found in 23S ribosomal RNA (rRNA). These structures, from phylogenetically distinct organisms, show the direct recognition (through hydrogen bonding) of the phosphate moiety by the RNA and confirm the involvement of the chromophoric isoalloxazine ring. The uracil-like edge of the FMN ring is involved in Watson-Crick-like hydrogen bonding (Serganov et al. 2009, Vicens et al. 2011). The phosphate and ring structures of FMN are directed towards different domains of the RNA. A similar ligand-RNA interaction has been observed in the TPP riboswitch (see the TPP riboswitch section). SAXS and evaluation of the free and bound crystal structures revealed little global switching of the FMN riboswitch, suggesting that there is no obvious ligand-dependent conformational change in the RNA structure (Baird and Ferre-D'Amare 2010, Vicens et al. 2011). An investigation of the ribD FMN-responsive riboswitch revealed that regulation is dictated by the rate constant for FMN association, along with the rate at which the RNAP completes transcription of the terminator helix (Wickiser et al. 2005). It was suggested that during transcription, the RNA-ligand complex is unable to reach thermodynamic equilibrium before the RNAP commits to a regulatory decision. The discrepancy observed between the apparent Kd values for binding and the ligand concentrations required to induce transcription termination in vitro suggested that the FMN riboswitch operates at a kinetic level (Wickiser et al. 2005).
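The kinetic argument above can be made concrete with a minimal pseudo-first-order model: if ligand must associate within a fixed time window before the RNAP reaches the terminator, and dissociation is neglected within that window, the fraction of transcripts that capture ligand is 1 - exp(-k_on [L] t). The rate constant and time window below are hypothetical placeholders, not values taken from this section, and the function names are illustrative.

```python
import math


def fraction_bound_in_window(ligand_M: float, k_on_per_M_s: float,
                             window_s: float) -> float:
    """Fraction of nascent transcripts that bind ligand within the window.

    Pseudo-first-order association with ligand in excess; dissociation
    during the window is neglected (an assumption of this sketch).
    """
    if ligand_M < 0 or k_on_per_M_s <= 0 or window_s <= 0:
        raise ValueError("inputs must be non-negative (ligand) or positive")
    return 1.0 - math.exp(-k_on_per_M_s * ligand_M * window_s)


def ligand_for_half_binding(k_on_per_M_s: float, window_s: float) -> float:
    """Ligand concentration (M) at which half the transcripts bind in time."""
    return math.log(2.0) / (k_on_per_M_s * window_s)


K_ON = 1.0e5    # hypothetical association rate constant, M^-1 s^-1
WINDOW_S = 2.0  # hypothetical time available before the decision point, s

if __name__ == "__main__":
    half = ligand_for_half_binding(K_ON, WINDOW_S)
    print(f"Half-maximal kinetic response at {half * 1e6:.2f} uM ligand")
    for conc_nM in (10.0, 100.0, 1000.0, 10000.0):
        frac = fraction_bound_in_window(conc_nM * 1e-9, K_ON, WINDOW_S)
        print(f"[FMN] = {conc_nM:8.1f} nM -> fraction bound in window = {frac:.3f}")
```

With these placeholder values, the ligand concentration needed for a half-maximal response falls in the micromolar range, well above a nanomolar Kd, which is the kind of discrepancy that points to kinetic rather than thermodynamic control.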

Additional biochemical and genetic studies have confirmed the ability of the FMN riboswitch to bind the chemical analog roseoflavin and downregulate FMN-dependent expression (Lee et al. 2009, Mansjo and Johansson 2011, Ott et al. 2009). Roseoflavin, a natural pigment synthesized by Streptomyces, is the only known FMN/riboflavin analog with antibacterial properties (Otani et al. 1974). Mutations in the ligand-binding pocket of the FMN riboswitch were identified in roseoflavin-resistant bacteria that exhibited derepression of reporter gene expression (Lee et al. 2009). Roseoflavin binds the FMN riboswitch with affinity lower than that of FMN (apparent Kd ~100 nM) but higher than that of riboflavin due to the presence of the dimethylamino group on the flavin ring structure of roseoflavin (Lee et al. 2009).

THF riboswitch

The THF regulatory system is a recent addition to the riboswitch family and expands the number of coenzymes that are sensed directly by RNA elements. The THF RNA motif is found primarily in Firmicutes and resides upstream of folate transport genes (folT) as well as biosynthesis genes (folC, folE and folQPBK) (Ames et al. 2010). The THF receptor consists of a ~100 nt element predicted to form four helices (P1-P4) and an additional pseudoknot structure around a three-way junction. High nucleotide conservation is observed within the P2 helix and around the single-stranded junctions between the paired regions (Ames et al. 2010). Long-range tertiary interactions facilitated by the pseudoknot structure were confirmed by a high-resolution crystal structure (Huang et al. 2011). This study suggested that the pseudoknot interactions are crucial for the

regulatory response of the THF riboswitch. The THF riboswitch selectively binds derivatives of the vitamin folate, including tetrahydrofolate (THF) and dihydrofolate (DHF) (Ames et al. 2010). THF, the active form of folate, is an essential cofactor involved in one-carbon transfer reactions. It is predicted that the THF riboswitch monitors only the active fraction of the total intracellular folate pool (Trausch et al. 2011). In-line probing assays showed that the apparent Kd of the RNA element for THF is ~70 nM, and ~300 nM for the THF analogs. Mutating key base-pairs in the P2 helix increases spontaneous cleavage, making the apparent Kd ~1000-fold poorer than wild-type, and mutants with compensatory substitutions exhibit restored affinity towards THF (Ames et al. 2010). Additional assays revealed that the RNA binds to various 5- and 10-modified forms of THF but rejects folic acid from the binding pocket, indicating a preference towards reduced forms of the vitamin (Ames et al. 2010). A recent high-resolution structure of a Streptococcus mutans THF element, bound to the THF analog folinic acid, sheds light on the recognition of ligand by the RNA (Trausch et al. 2011). This crystal structure revealed the presence of two separate ligand-binding pockets within a single structured domain. Despite different RNA motifs, the mode of ligand recognition by both binding sites is strikingly similar (Trausch et al. 2011). Under physiological Mg2+ concentrations, the riboswitch exhibits strong cooperative binding to the two ligands, although only one of these sites is required for a regulatory response (Trausch et al. 2011). A Lactobacillus casei THF element shows a potential hairpin structure that

functions as a translational off switch, such that in the presence of the ligand the anti-RBS sequence pairs with the RBS, preventing translation initiation (Ames et al. 2010). However, this mechanism has not been verified experimentally. A THF riboswitch identified from the human gut metagenome shows a distinct stem-loop structure that is predicted to function as an intrinsic transcriptional terminator (Ames et al. 2010).

Moco and Tuco RNA elements

A comparative genomics approach using computational analysis revealed several highly conserved RNA motifs (Weinberg et al. 2007). One such RNA motif was identified upstream of genes involved in molybdate transport, molybdenum cofactor (Moco) biosynthesis as well as proteins that employ Moco as a coenzyme (Weinberg et al. 2007). A variety of organisms, like γ- and δ-proteobacteria, Clostridia, Actinobacteria and Deinococcus, contain one or multiple copies of the Moco element. At least 8 Moco motifs have been identified in D. hafniense. The Moco elements are also present in a tandem arrangement. A few organisms that do not require molybdenum instead utilize tungsten and its cofactor (Tuco), and a few variants of the Moco element can be triggered by Tuco (Weinberg et al. 2007). The E. coli moa operon is under the control of two promoters and expression is upregulated by two transcriptional factors (Regulski et al. 2008). In addition to the multileveled regulation by transcriptional factors, the Moco element likely functions as an off switch in response to Moco and discriminates against Moco analogs. Regulation of gene expression is either at the transcriptional or translational level, and some organisms

that contain multiple Moco elements show evidence of both types of expression platforms (Regulski et al. 2008). In-line probing of the Moco element upstream of the moa operon showed that the RNA forms a highly structured metabolite-sensing regulatory element (containing paired regions P1-P5) (Regulski et al. 2008). The Moco elements can be divided into two overall architectures (structures with or without P3), which likely correlates with the ability to utilize Moco, Tuco or both. Signature motifs (such as the GNRA tetraloop and a tetraloop receptor, where N is A, C, G or U and R is A or G) that stabilize the overall RNA fold have been identified in the Moco RNA (Regulski et al. 2008).

SAM-sensing riboswitches

S-adenosylmethionine (SAM) is an essential cellular metabolite and is intricately involved in physiological processes. Most importantly, it is used as a methyl-group donor in a variety of chemical reactions. As SAM is synthesized from methionine and ATP by SAM synthetase (encoded by the metK gene), growth in the presence of methionine leads to a high concentration of SAM in vivo and growth in the absence of methionine results in low in vivo SAM pools (Grundy and Henkin 1998, Tomsic et al. 2008). The SAM-binding riboswitches represent the most diverse collection of regulatory elements that recognize the same effector molecule. Six different riboswitch classes (S box/SAM-I, SAM-II, SMK box/SAM-III, SAM-IV, SAM-V and SAM-I/IV) that recognize SAM as the molecular effector have been identified, with the S box being the most prevalent (Corbino et al. 2005, Fuchs et al. 2006, McDaniel et al. 2003, Poiata et

al. 2009, Weinberg et al. 2008, Weinberg et al. 2010). These riboswitches bind SAM with high affinity and selectivity and discriminate against near-cognate derivatives, such as S-adenosylhomocysteine (SAH) and S-adenosylcysteine (SAC). Regulation of the SAM-binding riboswitches is most commonly seen at the level of premature transcription termination. Riboswitches that recognize SAH and discriminate against SAM have also been characterized (Wang and Breaker 2008). The X-ray crystal structures of many SAM-binding riboswitches have revealed their distinct architectures and ligand-recognition properties. The following sections will describe the current literature regarding these riboswitch elements.

S box/SAM-I riboswitch

A highly conserved RNA motif termed the S box was originally identified upstream of 11 methionine, SAM and cysteine biosynthetic genes in B. subtilis and subsequently in other low G+C, Gram-positive bacteria (Grundy and Henkin 1998). Increased S box gene expression is observed during methionine limitation (when SAM pools are low), and growth in the presence of methionine results in repression of gene expression (when SAM pools are high) (Grundy and Henkin 1998). The S box leader RNAs show high conservation in primary sequence and secondary structure. The predicted secondary structure consists of helices P1-P4, with highly conserved residues in unpaired regions. Phylogenetic analysis revealed that a transcription terminator helix is located at the 3′ end of the leader region, suggesting regulation at the level of premature transcription termination. Mutational analysis showed

that disruption of helix P1 leads to constitutive expression, suggesting that the RNA is unable to form the terminator conformation (Grundy and Henkin 1998). SAM was identified as the effector molecule that binds directly to S box leader RNAs and promotes premature transcription termination in vitro, in the absence of any auxiliary protein factor (McDaniel et al. 2003).

Figure 1.7 Model for regulation of S box gene expression in response to SAM. The antiterminator structure (AT, red-blue) forms in the absence of SAM, allowing expression of the downstream coding region(s). Binding of SAM (represented by the asterisk) stabilizes the anti-antiterminator structure (AAT), which sequesters sequences (red) required for formation of the antiterminator and frees sequences (blue) required for formation of the terminator helix (T), resulting in premature termination of transcription. Numbers indicate helices 1-4. SAM binding also promotes a tertiary interaction (dashed line) between residues in the terminal loop of helix 2 and the unpaired region between helices 3 and 4. Adapted from (Tomsic et al. 2008).

The S box model proposes that when the cells are starved for methionine, the SAM pools drop and the antiterminator structure forms (Figure 1.7, left panel).

Formation of the antiterminator allows transcription to continue into the downstream coding region. In the presence of high SAM pools, the anti-antiterminator sequesters sequences required for formation of the antiterminator. This promotes formation of the terminator helix and therefore gene expression is turned off (Figure 1.7, right panel). The S box model was further supported by two additional studies that confirmed direct sensing of SAM by the S box RNA (Epshtein et al. 2003, Winkler et al. 2003). Extensive biochemical and genetic analyses further revealed that a highly conserved kink-turn motif (GA motif) and an essential pseudoknot element facilitate the appropriate folding of the S box RNAs and the subsequent SAM-dependent transcription termination (McDaniel et al. 2005, Winkler et al. 2001). The yitJ S box leader RNA binds SAM with high affinity (Kd ~20 nM) (Tomsic et al. 2008) and binding of SAM by the yitJ RNA is highly specific. The yitJ RNA exhibits 100- and 10,000-fold lower affinities for SAH and SAC, respectively, as compared to the affinity for SAM (McDaniel et al. 2003, Winkler et al. 2003). The tertiary structure of the S box RNA in complex with SAM was revealed by two independent high-resolution structures, one from Thermoanaerobacter tengcongensis (Montange and Batey 2006) and one of the B. subtilis yitJ RNA (Lu et al. 2010) (Figure 1.8). (Chapter 3 will describe the mutagenic analysis of the B. subtilis yitJ RNA conducted as part of the crystal structure study in collaboration with Dr. Ailong Ke, Cornell University). The two nearly superimposable crystal structures revealed that the ligand is buried deep within the SAM-binding pocket, which is formed by helices P1-P4. A crystal structure study of the apo-form of the aptamer domain suggested that a sampling of a variety of intermediate RNA conformations results

in the selection of the appropriate structure, based on the ligand concentration (Stoddard et al. 2010).

Figure 1.8 Crystal structure of the B. subtilis yitJ leader RNA bound to SAM. A. Helices P1 through P4 are colored in gray, green, cyan, and yellow, respectively. Red dashes denote pseudoknot base-pairs. SAM-contacting bases are labeled in magenta. The sheared A46 U78 base-pair that recognizes the adenosine base of SAM is labeled with a red dot. B. Ribbon representation of the B. subtilis yitJ RNA structure. The color scheme is the same as that in A. SAM is shown as a ball-and-stick model overlaid with surface representations in gray. The assigned magnesium metal ion is shown in pink. Adapted from (Lu et al. 2010).
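The 100- and 10,000-fold discrimination reported for the yitJ RNA against SAH and SAC can be restated as a difference in binding free energy, using the standard relation ΔΔG = RT ln(fold difference in Kd). The sketch below is illustrative only; the temperature (37 °C) and the function name are assumptions introduced here, and no such calculation is reported in the cited studies.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold(fold: float, temp_K: float = 310.15) -> float:
    """Binding free-energy penalty (kcal/mol) for a given fold loss in affinity.

    Uses ddG = R * T * ln(Kd_analog / Kd_cognate); the default temperature
    of 310.15 K (37 degrees C) is an assumption made for illustration.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return R_KCAL_PER_MOL_K * temp_K * math.log(fold)


if __name__ == "__main__":
    for analog, fold in (("SAH", 100.0), ("SAC", 10_000.0)):
        print(f"{analog}: {fold:,.0f}-fold weaker binding "
              f"~ {ddg_from_fold(fold):.1f} kcal/mol")
```

Because the relation is logarithmic, the 10,000-fold penalty for SAC corresponds to exactly twice the free-energy cost of the 100-fold penalty for SAH.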

A comprehensive genetic and biochemical study comparing the S box riboswitches from B. subtilis revealed variability in the response to SAM both in vivo and in vitro (Tomsic et al. 2008). The S box gene-lacZ fusions show a 250-fold range in induction ratios after 4 h of methionine starvation (Tomsic et al. 2008). Variability is also observed both in the termination efficiency in the absence of SAM and in the concentration of SAM required for half-maximal termination in vitro. This study showed that genes involved in methionine biosynthesis are tightly repressed in the presence of SAM and show high induction of expression in the absence of SAM as compared to genes involved in methionine transport. Overall, it was concluded that the S box gene expression is finely tuned based on the physiological function of the genes (Tomsic et al. 2008). Chapter 4 will describe the importance of the S box leader RNA structural elements that play a role in the observed variability and Chapter 2 will describe the detailed characterization of the atypical metK S box leader RNA. The metK gene encodes SAM synthetase, the enzyme responsible for synthesizing the S box molecular effector, SAM. Although most S box riboswitches regulate at the level of premature transcription termination, S box-mediated regulation has been predicted at the level of translation initiation in certain Gram-negative bacteria and Actinomycetes (Rodionov et al. 2004). Antisense RNA interference has been documented for S box riboswitches in pathogenic organisms, where the riboswitch transcript is predicted to interact with the mRNA of a virulence factor in trans (Loh et al. 2009) or a sulfur operon in cis (Andre et al. 2008), to regulate gene expression. An S box riboswitch has also been identified upstream of the B.

clausii metE gene, in a tandem arrangement adjacent to a B12-binding riboswitch element (refer to section 1.2.9) (Sudarsan et al. 2006). Both riboswitch elements (S box and B12 elements) function independently in response to their respective ligands and regulate gene expression at the level of premature transcription termination.

SAM-II riboswitch

The SAM-II riboswitch was first identified using comparative sequence and structural probing analyses (Corbino et al. 2005). The SAM-II element, found predominantly in α-proteobacteria such as Agrobacterium tumefaciens in the metA leader RNA, is the smallest of the SAM-binding riboswitches and is distinct from the S box riboswitch in sequence and structure. The initial characterization of this regulatory RNA showed lower affinity for SAM compared to S box riboswitch elements; the RNA was shown to bind SAM with an apparent Kd of ~1 μM. In-line probing and equilibrium dialysis showed strong discrimination against SAM-related compounds (>1,000-fold for SAH). The simple architecture of the SAM-II riboswitch consists of a single stem-loop structure with an H-type pseudoknot (Corbino et al. 2005). The high-resolution structure of a SAM-II element obtained from an environmental sample revealed the first structure of an entire riboswitch element, inclusive of both the aptamer domain and expression platform (Gilbert et al. 2008). This crystal structure predicted that repression by the SAM-II riboswitch takes place by blocking the translation initiation site, rather than structural switching with a complementary downstream sequence. Ligand-dependent

stabilization of the pseudoknot structure sequesters the SD sequence at the 3′ end of the riboswitch element. Although the SAM-II RNA is globally different from the S box motif, functional group recognition takes place in an analogous manner, which forms the basis for efficient discrimination against near-cognate analogs such as SAH (Gilbert et al. 2008). A variety of biophysical techniques (NMR, single-molecule fluorescence resonance energy transfer [smFRET], SAXS and molecular dynamics simulations) have shed light on the ligand-dependent conformational switching of the SAM-II riboswitch (Chen et al. 2011, Doshi et al. 2012, Haller et al. 2011, Kelley and Hamelberg 2010). Overall, these studies revealed that the essential divalent Mg2+ ions promote a compact RNA conformation, resulting in a structural preorganization that enables pseudoknot formation. However, it is only in the presence of the ligand that this crucial pseudoknot structure is formed fully (Haller et al. 2011).

SMK box/SAM-III riboswitch

The third class of SAM-binding riboswitches has a relatively simple architecture (unlike S box, but similar to SAM-II) such that both ligand binding and regulatory control encompass a single module. The SMK box riboswitch was identified upstream of metK genes from members of the Lactobacillales (Fuchs et al. 2006). The SMK RNA element from E. faecalis binds SAM directly, resulting in a structural rearrangement that regulates gene expression at the level of translation initiation, in the absence of auxiliary protein factors. Mutational and structural probing analyses showed that pairing between

SD and anti-SD (ASD) regions is required for SAM binding (Fuchs et al. 2006). Ribosomal toeprinting assays subsequently revealed that the SD-ASD pairing blocks access of the ribosome to the SD region, inhibiting gene expression (Fuchs et al. 2007). Thus, the SMK box is a unique SAM-binding riboswitch as the SAM-binding domain, which participates directly in metabolite recognition, is not separable from the regulatory target.

Figure 1.9 Crystal structure of the E. faecalis metK SMK box riboswitch. A. Secondary structure of the SMK riboswitch RNA. Helices P1 through P4 are colored in cyan, green, silver and yellow, respectively. Gray shading, SD sequence; solid magenta lines, direct contacts between the RNA and the SAM molecule; dashed magenta lines, tertiary interactions between J3/2 and P2 and J2/4. B. Cartoon representation of the crystal structure of the SMK riboswitch. SAM is shown in magenta. The coloring scheme for the crystal structure is consistent with A. Adapted from (Lu et al. 2008).
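The SD-ASD pairing described above is an antiparallel base-pairing interaction between two RNA segments. As a rough illustration of what such complementarity means at the sequence level, the sketch below tests whether two short segments can form a continuous antiparallel duplex. The sequences used are hypothetical placeholders (a generic SD-like core and its complement), not the E. faecalis metK sequences, and the function name is introduced here only for illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair commonly found in RNA helices.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_form_duplex(strand_5to3: str, partner_5to3: str,
                    allow_wobble: bool = True) -> bool:
    """Return True if two equal-length RNA segments pair along their full length.

    Both strands are written 5' to 3'; the partner is read in reverse so
    that the pairing is antiparallel, as in an RNA helix.
    """
    a, b = strand_5to3.upper(), partner_5to3.upper()
    if not a or len(a) != len(b):
        return False
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    return all((x, y) in allowed for x, y in zip(a, reversed(b)))


if __name__ == "__main__":
    sd_like = "GGAGG"    # hypothetical SD-like core, for illustration only
    asd_like = "CCUCC"   # hypothetical anti-SD segment, for illustration only
    print(can_form_duplex(sd_like, asd_like))
```

A check of this kind captures only sequence complementarity. It does not model the SAM-dependent stabilization of the pairing that underlies the regulatory response.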

Although initial genetic analyses for the E. faecalis SMK box RNA were conducted in B. subtilis, translational repression was observed during elevated SAM pools (during growth in the presence of methionine), supporting the model for SMK box regulation (Fuchs et al. 2006). SMK-mediated regulation was further confirmed in the native background during which in vivo SAM pools were modulated. Regulation at the translational level was inferred, as no significant change was observed in the overall transcript abundance (Smith et al. 2010b). As the RNA-SAM complex in vitro shows a half-life (7.8 s) that is ~20-fold shorter than the transcript half-life (3 min) in vivo, it was suggested that the SMK box functions as a reversible riboswitch (Smith et al. 2010b). In addition, this reversibility (or conformational switching) was shown using fluorescence spectroscopy (Smith et al. 2010b). This type of regulatory mechanism ensures that the cell is poised to respond rapidly to changing SAM pools. X-ray crystallographic studies revealed a Y-shaped RNA with SAM intercalated within a 3-way helical junction (Lu et al. 2008) (Figure 1.9). The crystal structure confirmed the direct involvement of the SD sequence in SAM recognition. Similar to SAM recognition by the S box riboswitch, the adenine moiety of SAM intercalates within the RNA resulting in continuous base stacking. The positive charge of the sulfonium ion is crucial for specific recognition of SAM over a near-cognate derivative such as SAH, which lacks the overall positive charge. Crystal structure studies showed that the selenium-derivative of SAM (Se-SAM) binds to the RNA in an identical manner, thereby confirming the location of the sulfonium ion. However, the SAH-bound SMK RNA

exhibited only minimal contacts. Competition binding assays showed that the RNA binds SAM with high affinity (apparent Kd ~0.85 µM) and specificity (~100-fold preference over SAH). However, unlike the S box and SAM-II RNAs, the SMK RNA does not interact with the methionine side chain of SAM. These SAM-binding riboswitches recognize the same biological ligand, yet they exhibit distinct RNA folds suggesting independent evolution (Lu et al. 2008). Chemical probing analyses (using SHAPE) in combination with mutational studies showed that the SMK box RNA samples one of three conformations depending on the concentration of the ligand sensed (Lu et al. 2011). Similar to the conformational switching seen for the S box RNA, the majority of the SMK RNA structures are present in the apo state (in the absence of ligand) that represents the on conformation. However, a subset of the RNA structures pre-organize into a SAM-bound-like or ready state. It is only upon exposure to SAM that the RNA is stabilized into the off conformation. Similar results were obtained using NMR and SAXS analyses (Wilson et al. 2011). ITC analyses of SMK box variant RNAs showed that the reversible, translational riboswitch is controlled thermodynamically (Wilson et al. 2011).

SAM-IV riboswitch

The SAM-IV element, uncovered during a search for novel riboswitches, contains a set of elements similar to the S box riboswitch (Weinberg et al. 2008). This class is found primarily in Actinomycetales (such as M. tuberculosis) upstream of genes involved in sulfur metabolism. The SAM-IV riboswitch core shares five of the six key

nucleotides that make direct contacts with the ligand in S box RNA, suggesting a similar mode of molecular recognition. However, the sulfonium ion binding site appears to be different (Weinberg et al. 2008). The SAM-IV and S box RNAs show distinct scaffolds with significant differences in the peripheral tertiary architecture. The P4 helix in SAM-IV is located outside the core, the KT motif is absent in the P2 helix and an additional pseudoknot structure is predicted at the top of the P3 helix. Binding analyses of the Streptomyces coelicolor SAM-IV riboswitch showed analogous affinity for SAM and comparable discrimination against the near-cognate ligand SAH, relative to the yitJ S box riboswitch (Weinberg et al. 2008). A precise regulatory mechanism for SAM-IV was not identified, although SAM-IV elements are observed upstream of a few rho-independent transcriptional terminators as well as many translational start-sites. Recently, an S box/SAM-IV riboswitch class was predicted that is similar to the S box RNAs (Weinberg et al. 2010). Like SAM-IV, the S box/SAM-IV class contains conserved nucleotides within the ligand-binding core but it differs from S box in the global scaffold. The S box/SAM-IV riboswitch does not contain helix P1 or the KT motif and it appears to have only one pseudoknot structure located at the top of helix P3. The overall similarities between the S box, SAM-IV and S box/SAM-IV riboswitches suggest that these RNAs have evolved from a common ancestor to recognize the same effector molecule. It has been suggested that these SAM-binding variants together with the S box riboswitch constitute a SAM superfamily (Weinberg et al. 2008).

SAM-V riboswitch

The SAM-V element represents the fifth class of SAM-binding riboswitches (Poiata et al. 2009). This motif was originally identified during a comparative sequence analysis of GC-rich intergenic regions in the marine α-proteobacterium Candidatus Pelagibacter ubique HTCC 1062 (Meyer et al. 2009). The SAM-V elements have a consensus sequence and secondary structure similar to the SAM-II riboswitch. SAM-V is predicted to form an H-type pseudoknot structure with two stems and two loops. Most SAM-V elements have been identified upstream of SD sequences, suggesting regulation at the level of translational inhibition (by RBS occlusion). In-line probing and equilibrium dialysis have independently confirmed SAM binding by the aptamer, with an apparent Kd value of ~150 µM, and have also shown discrimination against SAH (Poiata et al. 2009). The SAM-V element has also been identified immediately downstream from a SAM-II riboswitch, in a tandem arrangement (Poiata et al. 2009). This is the first example of a tandem architecture in which two different riboswitch classes bind the same effector molecule. The two tandem riboswitches do not bind SAM cooperatively and appear to be regulated differently. It is predicted that SAM-II controls gene expression at the level of transcription termination, whereas the SAM-V element is regulated at the translation initiation level (Poiata et al. 2009). The SAM-II element fails to show a ligand-dependent structural modulation in the presence of the downstream SAM-V element, implying that SAM-II binds SAM before the SAM-V transcript is synthesized (Poiata et al. 2009). This double regulation can be advantageous for cells having slow

mRNA turnover rates (such as marine organisms) (Poiata et al. 2009). If the cell detects high SAM concentrations, regulation can take place by premature transcription termination at the SAM-II element. However, if the full-length mRNA is synthesized, after which SAM concentration in the cell increases, then expression of the downstream coding region can be prevented by inhibition of translation (Poiata et al. 2009).

SAH riboswitch

A putative SAH regulatory motif was predicted originally during a comparative sequence analysis (Weinberg et al. 2007). The SAH-binding riboswitch was subsequently confirmed upstream of genes involved in SAH catabolism in a number of α-proteobacteria and actinobacteria (Wang et al. 2008). SAH is a natural by-product of the SAM demethylation reaction and it differs from SAM by the absence of a single methyl group. SAH can thus act as a competitive inhibitor for reactions that involve SAM. In addition, it is imperative that the SAH pools in the cell are regulated tightly, as increased SAH levels lead to toxicity. Standard experimental procedures such as in-line probing and equilibrium dialysis show that the metH regulatory element from Dechloromonas aromatica binds SAH with high affinity (apparent Kd ~20 nM) and selectivity against SAM (>1,000-fold) (Wang et al. 2008). In-line probing analyses showed that almost all functional groups present on SAH are essential for molecular recognition by the RNA and the inability to bind SAM is due to the steric clash of the methyl side-chain. These findings were corroborated by the X-ray crystal structure of the SAH-bound riboswitch (Edwards et al. 2010). Modeling of

SAM into the crystal structure revealed the steric interference, confirming the 1,000-fold discrimination against SAM. This discrimination suggests that the α-proteobacteria and actinobacteria control catabolic gene expression of SAH only when high concentrations of SAH are attained (Wang et al. 2008). The central core of the RNA is made up of highly conserved nucleotides, flanked by helices P1, P2 and P4; helix P3 is seen only in a few organisms. An unusual LL-type pseudoknot structure, which creates a shallow cleft on the RNA surface, is stabilized in the presence of SAH (Edwards et al. 2010, Wang et al. 2008). The SAH-stabilized conformation either prevents formation of the intrinsic transcription terminator or reveals the SD sequence resulting in ribosome binding. Regulation is predicted at the level of transcription antitermination for metH, and at the level of translation initiation for the ahcY gene from P. syringae. In vivo studies of the SAH hydrolase ahcY from P. syringae showed that in the presence of SAH the riboswitch is on. Any disruption of the aptamer domain results in downregulation of expression (Wang et al. 2008). Biophysical analysis using ITC showed that the SAH riboswitch, like many other riboswitches, samples an ensemble of conformations (Edwards et al. 2010). In the ligand-free state, the RNA adopts a bound-like conformation that shifts easily to the fully bound conformation in the presence of SAH.

1.3 Research goals

The focus of this dissertation has been to investigate the S box (SAM-I) riboswitch from B. subtilis. The S box regulatory system was identified in the Henkin

laboratory (Grundy and Henkin 1998). S box genes involved in sulfur metabolism, as well as methionine, cysteine and SAM biosynthesis and transport pathways, show a high degree of conservation in primary sequence and secondary structure in the 5′ UTRs (Grundy and Henkin 1998). Extensive biochemical and genetic analyses showed that the S box leader RNAs undergo a conformational change specifically in response to SAM (McDaniel et al. 2003). SAM directs increased termination of transcription in a purified in vitro system, in the absence of any auxiliary protein factor. MetK synthesizes SAM from methionine and ATP. Growth in the presence of methionine results in high SAM pools, while growth in the absence of methionine results in low SAM pools. Expression of the majority of the S box genes is induced during methionine starvation (when SAM pools are low) and is repressed in the presence of methionine (when SAM pools are high) (Grundy and Henkin 1998, McDaniel et al. 2003, Tomsic et al. 2008). A mutagenic study confirmed the direct involvement of SAM in S box regulation (McDaniel et al. 2006). A single trans-acting mutation in the metK gene resulted in derepression of S box gene expression in vivo. This mutation (SBD1; S box-derepressed mutation 1) resulted in reduced SAM synthetase activity and decreased SAM pools in the cell (McDaniel et al. 2006). I tested the effect of this mutation on the growth of methionine auxotrophic (BR151; metB10) and prototrophic (BR151MA; Met+) strains, as well as on yitJ-lacZ expression (McDaniel et al. 2006). The SBD1 allele was sufficient to confer loss of repression of S box gene expression during growth in the presence of methionine. This mutation also resulted in a modest growth rate defect in the prototrophic

background, suggesting that the metB10 allele present in the methionine auxotroph was able to suppress the growth rate reduction caused by the SBD1 mutation (McDaniel et al. 2006). Although metK is a part of the S box regulon, a metK-lacZ transcriptional fusion fails to exhibit increased expression during methionine limitation (when SAM pools are low). SAM also fails to stimulate increased transcription termination in vitro at the metK leader region terminator. Northern blot and quantitative real-time polymerase chain reaction (qRT-PCR) have revealed only a transient increase in the amount of metK readthrough transcripts during methionine limitation in vivo (Tomsic et al. 2008). From a physiological standpoint, MetK utilizes methionine to synthesize SAM, while the rest of the S box gene products function to synthesize and transport methionine. Thus, the functional role of metK suggests the need for an additional level of regulation, which functions in conjunction with the S box system. The major focus of my research has been to characterize the B. subtilis metK leader RNA, in order to gain further insight regarding its regulatory mechanism. Chapter 2 will describe the characterization of the metK leader RNA using genetic and biochemical techniques. As previous studies failed to show an increase in metK-lacZ expression under low SAM pools during methionine starvation conditions, we modulated in vivo SAM pools without removing methionine from the growth medium. For this purpose, we constructed a strain in which the chromosomal metK gene was placed under the control of an inducible promoter and total SAM pools were measured. The key result obtained from our studies provided evidence for a SAM-dependent change in metK gene

expression in vivo. Increased metK-lacZ expression was observed when the SAM pools were low, and metK-lacZ expression was repressed when the in vivo SAM pools were elevated. Chapter 2 will further discuss the phylogenetic analyses that revealed unique sequences located upstream (the US box) and downstream (the DS box) of the metK S box element. Using primer extension analysis, we mapped the metK transcriptional start-site (+1) and established that the 5′ end of the US box sequence is located precisely at the +1 position of the metK transcript. The US and DS box sequences display significant complementarity, suggesting a base-pair interaction, which was confirmed using an RNase H cleavage assay. The base-pairing interaction was stabilized only in the absence of SAM, and the US-DS pairing was dependent on a functional S box element. Extensive mutagenic analysis of the US and DS box sequences confirmed the need for an intact US-DS base-pairing interaction for a wild-type response to SAM. Chapter 2 will further discuss qRT-PCR analyses performed to study the metK transcript half-life and abundance. Significant reductions in transcript stability and abundance were observed when the US box sequence was altered. These results implied that the US box sequence plays a role in mRNA stability and that pairing of the US box region with the DS box sequence protects the 5′ end of the transcript from degradation. Chapter 2 will also describe studies of the metK leader RNAs using in vitro transcription termination assays. Multiple-round transcription assays were performed to compare the termination efficiencies of wild-type and mutant metK constructs in response to SAM. Consistent with previous results, SAM did not stimulate increased termination at

the wild-type metK leader region terminator. Our results showed a variation in the total amount of transcript for mutant constructs relative to the wild-type control. Transcription monitored using time-course analyses may be helpful to show if the US box sequence plays a role in the recruitment of RNAP, thereby affecting transcription initiation. Attempts to generate a halted complex in order to perform single-round transcription using the metK DNA template have been technically challenging. However, we have identified the conditions necessary to generate such a halted complex. Results obtained from the in vitro studies will be discussed in this chapter. Based on the results from Chapter 2, we proposed a model for the regulation of metK gene expression from B. subtilis. We hypothesized that the metK gene is regulated at the level of mRNA stability, in addition to being under the control of the S box regulon. It is also possible that regulation occurs at the level of transcription initiation, or involves a combination of both RNA stability and transcription initiation. Chapter 3 will focus on the in vitro investigation of the B. subtilis yitJ S box riboswitch, a well-studied leader RNA from the S box regulon. yitJ encodes methylenetetrahydrofolate reductase and is closely involved in the methionine biosynthetic pathway (Grundy and Henkin 1998, Murphy et al. 2002). The yitJ leader RNA shows high affinity for SAM and discriminates strongly against closely related natural analogs such as SAH (100-fold lower than SAM) and SAC (nearly 10,000-fold lower than SAM) (McDaniel et al. 2003, Winkler et al. 2003). This chapter will discuss the mutational analysis of the SAM binding pocket of the B. subtilis yitJ S box leader RNA. The first part of the chapter will describe our attempts to isolate aptamers with high

affinity or altered specificity using Systematic Evolution of Ligands by Exponential enrichment (SELEX). Experimental procedures and preliminary data will be explained. However, due to technical difficulties, yitJ variants with altered properties could not be obtained. A more direct approach was explored which targeted the yitJ SAM-binding pocket using site-directed mutagenesis. This extensive mutational study was conducted (by V. A. Pradhan and J. Tomšič) as part of the crystal structure analysis of the B. subtilis yitJ S box riboswitch in complex with ligand (Lu et al. 2010; collaboration with Dr. Ailong Ke, Cornell University). Thirty-two mutants that targeted key residues in the yitJ sequence were generated. We analyzed the effects of these mutations on in vitro transcription termination (V. A. Pradhan) and SAM binding (J. Tomšič). Most of the mutations disrupted SAM binding (consistent with their position in the crystal structure), as these residues either make important contacts with SAM or are crucial for stabilizing the structural domains within the SAM-binding core. A majority of these mutants exhibited high constitutive transcription termination in the absence of SAM. These data suggest that the mutations lock the RNA into a conformation that resembles the SAM-bound form even in the absence of the ligand (Lu et al. 2010). Selected mutants were analyzed further in response to a series of SAM analogs and compared to the response of the wild-type yitJ RNA. These results will also be described in Chapter 3. Characterization of individual RNA elements within the S box leader region critical for riboswitch function will be described in Chapter 4. These results provide insight into factors responsible for S box riboswitch variability. This chapter will focus

on the design and analysis of hybrid leader RNAs generated with metE and yusC. These leader RNAs served as good candidates for this study as they exhibit different expression profiles in response to limiting SAM levels. metE, which encodes methionine synthase, is associated with the methionine biosynthetic pathway and shows high affinity towards SAM (Grundy and Henkin 1998, Murphy et al. 2002, Tomsic et al. 2008). metE-lacZ expression is highly induced during growth in the absence of methionine and tightly repressed in the presence of methionine (Tomsic et al. 2008). yusC, which encodes an ABC-type methionine transporter, exhibits low induction during growth in the absence of methionine and expression is not repressed completely during growth in the presence of methionine (Grundy and Henkin 1998, Hullo et al. 2004, Murphy et al. 2002, Tomsic et al. 2008). Hybrid constructs of metE and yusC were analyzed in vivo and in vitro under low and high SAM conditions. Overall, the data suggest that even though the ability to respond to changing SAM levels is a function of the SAM-binding domain, the ability to promote transcription termination efficiently in the presence of SAM is dictated by both the SAM-binding domain and the terminator/antiterminator structures. We therefore conclude that both structural domains play a crucial role in the calibration of the S box regulatory system. In the second half of Chapter 4, a distinct hybrid leader RNA will be discussed which involves metK and yusC. We investigated the effect of the metK promoter on transcription efficiency. The metK-lacZ expression was induced when SAM pools were low and methionine levels were high (Chapter 2), but high levels of SAM failed to promote transcription termination of metK in vitro (Chapter 2). Based on these results and

the US box mutagenesis (Chapter 2), we hypothesized that the metK promoter and the US box sequence are responsible for reduced transcription initiation or reduced RNAP processivity in vitro, as well as for reduced transcript stability. The effects of the metK promoter and US box sequence on expression of the yusC S box leader RNA have been examined both in vivo and in vitro. These studies suggest possible effects of the metK promoter and US box sequence on transcription and therefore on metK regulation.

CHAPTER 2

CHARACTERIZATION OF THE metK LEADER RNA: AN ATYPICAL MEMBER OF THE Bacillus subtilis S BOX REGULON

2.1 Introduction

The S box regulon, originally identified in the Gram-positive bacterium Bacillus subtilis, is characterized by high primary sequence and secondary structural conservation. These conserved sequences are located upstream of genes involved in sulfur metabolism and methionine and SAM biosynthesis pathways (Grundy and Henkin 1998). SAM, the molecular effector of the B. subtilis S box regulon, is synthesized from methionine and ATP by SAM synthetase, encoded by the metK gene. The S box genes are regulated at the level of premature transcription termination. SAM binds to the majority of S box leader RNAs from B. subtilis and promotes formation of an intrinsic terminator helix, leading to downregulation of gene expression (Grundy and Henkin 1998, McDaniel et al. 2003, McDaniel et al. 2005). As SAM is synthesized from methionine, growth in the presence of methionine

results in high SAM pools, while growth in the absence of methionine results in low SAM pools. The physiological concentration of SAM in a methionine auxotrophic strain grown in the presence of methionine is ~300 µM (Tomsic et al. 2008). Depleting methionine from the culture medium leads to a rapid drop in SAM pools to ~50 µM and then below the limit of detection (25 µM) after 1 h of methionine limitation (Tomsic et al. 2008). Expression of an S box gene-lacZ transcriptional fusion is induced when cells are starved for methionine, and growth in the presence of methionine results in repression of S box gene expression. The ratio of gene expression during growth in the absence of methionine to that in the presence of methionine is termed the induction ratio. An extensive in vivo and in vitro characterization of the 11 S box-regulated transcriptional units from B. subtilis revealed a high degree of variation in response to SAM limitation (Tomsic et al. 2008). In spite of high sequence and structural conservation, a few S box leader RNAs fail to exhibit typical S box gene regulation. Induction ratios of S box-lacZ transcriptional fusions exhibit a 250-fold range after 4 h of growth in the presence or absence of methionine. Variability is also observed both in the termination efficiency in the absence of SAM and in the concentration of SAM required for half-maximal termination in vitro. Overall, these studies concluded that S box gene expression is tuned physiologically to the functional roles of the S box genes and that expression is turned on only when necessary (Tomsic et al. 2008). Although the B. subtilis metK leader RNA contains sequence and structural elements that have been observed in other S box leader RNAs, metK-lacZ expression is not induced in the presence of low SAM pools during methionine starvation and SAM

fails to stimulate transcription termination at the metK leader region terminator in vitro (Tomsic et al. 2008). Northern blot analysis and quantitative real-time PCR (qRT-PCR) revealed only a transient increase in the metK readthrough transcript during methionine limitation in vivo (Tomsic et al. 2008). Although studies from other laboratories have demonstrated a decrease in metK expression during growth in the presence of methionine, these experiments were conducted under steady-state growth conditions (rather than during methionine starvation conditions) in a methionine prototroph (Auger et al. 2002, Yocum et al. 1996). It is possible that as a methionine prototrophic strain synthesizes methionine on its own, addition of exogenous methionine increases the in vivo SAM levels further, which results in the observed reduction of metK expression. Starvation for methionine indirectly depletes the in vivo SAM pools (Grundy and Henkin 1998, McDaniel et al. 2003, Murphy et al. 2002) and the reduction in SAM pools correlates with the increase in expression of most S box gene-lacZ transcriptional fusions (Tomsic et al. 2008). However, methionine starvation in a methionine auxotroph results in inhibition of growth. The metK gene product specifically synthesizes SAM from methionine, while the rest of the S box gene products synthesize and transport methionine. The functional role of metK might explain why it does not show typical S box regulation during methionine starvation. We hypothesize that the B. subtilis metK S box gene is regulated by a mechanism that works in conjunction with the S box regulon. In the current study, in vivo and in vitro assays were performed to examine why metK is unique compared to the other S box genes and how metK expression is regulated in response to SAM.

2.2 Materials and Methods

2.2.1 Bacterial strains and growth conditions

The B. subtilis strains used in this study were BR151 (lys-3 metB10 trpC2); BR151MA (lys-3 trpC2); BR151 Pspac-metK (lys-3 metB10 trpC2 Pspac-metK) (Pspac promoter; Yansura and Henner 1984); ZB307A (SPβc2del2::Tn917::pSK10Δ6) (Zuber and Losick 1987) and ZB449 (trpC2 pheA1 abrB703 SPβ-cured) (Nakano and Zuber 1989). B. subtilis strains were grown on tryptose blood agar base medium (TBAB; Difco, Franklin Lakes, NJ), Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) and 2XYT broth (Miller 1972). Antibiotics were added as indicated at the following concentrations: chloramphenicol, 5 μg/ml; neomycin, 5 μg/ml. IPTG (isopropyl β-D-1-thiogalactopyranoside; Gold Biotechnologies, St. Louis, MO) was used at 0.2 mM and 1.0 mM. X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold Biotechnologies) was used at 40 μg/ml as an indicator of β-galactosidase activity. All growth was at 37ºC.

2.2.2 Genetic techniques

Transformation of B. subtilis was carried out as described previously (Henkin et al. 1990). Chromosomal DNA was prepared using the DNeasy tissue kit (Qiagen, Chatsworth, CA). Wizard columns (Promega, Madison, WI) were used for plasmid preparations. Oligonucleotide primers were purchased from Integrated DNA Technologies (Coralville, IA). Restriction endonucleases and DNA-modifying enzymes

were purchased from New England Biolabs (Beverly, MA) and used as described by the manufacturer. Mutations were identified by DNA sequencing (Genewiz Inc., North Brunswick, NJ). Transcriptional fusions to lacZ were generated in plasmid pFG328 (Grundy et al. 1993), which contains a cat gene that confers resistance to chloramphenicol. The metK-lacZ fusion constructs were introduced in single copy into the B. subtilis chromosome by recombination into the SPβ prophage carried in strain ZB307A and purified by passage of the phage through strain ZB449 (Nakano and Zuber 1989, Zuber and Losick 1987). The phage carrying the fusion was then introduced into a B. subtilis host strain. Strains containing lacZ fusions were grown in the presence of chloramphenicol. Construction of the strain BR151 Pspac-metK was performed as follows (done by F. Grundy). An integration vector pspacint (6.2 kilobase [kb]), which served to integrate the Pspac promoter (Yansura and Henner 1984) upstream of the B. subtilis chromosomal metK gene, was generated by ligating DNA fragments from three vectors (pMUTIN4, pBEST502 and pDG148) that are used commonly in Gram-positive bacteria (Figure 2.1). A 600 bp fragment from pMUTIN4 carried the Pspac promoter sequence, a 2.0 kb fragment from pBEST502 carried the resistance marker for neomycin and a 3.6 kb fragment from pDG148 carried the sequence for the lacI gene (LacI repressor) and the resistance marker for ampicillin. A B. subtilis DNA fragment (~400 bp) containing part of the metK sequence, with restriction sites for HindIII and BamHI, was cloned into the pspacint vector using HindIII-BamHI restriction endonucleases. The Pspac promoter sequence was located immediately upstream of the metK fragment. The Pspac-metK

sequence was then integrated at the BR151 chromosomal metK site by double crossover recombination. Thus, the native metK promoter was located upstream of a truncated metK gene, while the IPTG-dependent Pspac promoter was located upstream of the intact metK gene (Figure 2.2). The LacI repressor protein is constitutively expressed in the BR151 Pspac-metK strain. Growth of this strain in the absence of IPTG resulted in inhibition of expression of the intact metK gene. The addition of IPTG inhibited the DNA binding activity of the LacI repressor, resulting in expression of the full-length MetK protein. Strains containing the metK gene under the control of the Pspac promoter were grown in the presence of neomycin and IPTG. The metK-lacZ transcriptional fusions were integrated into BR151 Pspac-metK via transduction, as described above.

Figure 2.1 Plasmid used to generate strain BR151 Pspac-metK. The pspacint vector was generated using fragments from three vectors as indicated. Restriction sites are shown as vertical lines. Locations of antibiotic resistance cassettes and the lacI gene are shown. Pspac, spac promoter.

Figure 2.2 Construction of the BR151 Pspac-metK strain. The metK fragment was cloned into the pspacint vector and integrated into BR151 at the chromosomal metK site by double crossover recombination. This resulted in the intact metK gene under the control of the IPTG-dependent Pspac promoter.

2.2.3 β-galactosidase measurements

Strains containing lacZ fusions were grown in Spizizen minimal medium containing the required amino acids at a concentration of 50 μg/ml until early exponential growth phase and were then harvested by centrifugation. Cells were resuspended in fresh Spizizen minimal medium in the presence or absence of methionine. Samples were collected at 1-h intervals and assayed for β-galactosidase as described previously (Miller 1972) using toluene permeabilization. Strains containing the metK gene under the control of the IPTG-dependent Pspac promoter were grown in 10 ml of 2XYT broth containing IPTG (0.2 mM). Cells were grown until mid-log phase, harvested by centrifugation and resuspended in an equal volume (10 ml) of 2XYT broth in the absence of IPTG. The cells were then diluted 100-fold into fresh 2XYT broth, in the presence or absence of 1.0 mM IPTG. Samples were collected at 1-h intervals and assayed for β-galactosidase activity. All starvation experiments and assays were conducted at least twice, and variation was <10%.

2.2.4 Measurement of SAM pools in vivo

BR151 Pspac-metK containing a metK-lacZ fusion was grown in 2XYT broth in the presence of 0.2 mM IPTG until mid-log phase. Cells were harvested by centrifugation and resuspended in fresh 2XYT broth in the absence of IPTG. Cell samples were collected by filtration at the indicated time points and extracted with 1.0 ml 0.5 M formic acid; the formic acid was removed by lyophilization as described by Ochi and coworkers (Ochi et al. 1981). Cell extracts were tested in an in vitro transcription termination assay

using a yitJ template that included the glyQS promoter sequence and compared to a SAM standard curve, as previously described (McDaniel et al. 2006). Samples were also harvested at each time point and assayed for β-galactosidase activity, as described above.

2.2.5 In vitro transcription termination assays for determination of SAM pools

Templates for in vitro transcription by B. subtilis RNAP were generated by PCR using oligonucleotide primers that contained the glyQS promoter sequence upstream of the leader region of the yitJ gene, to generate a transcription start-site 17 nt upstream of the start of helix 1 (McDaniel et al. 2003). The promoter sequences were designed to allow initiation with a dinucleotide (ApC) corresponding to the +1/+2 positions of the transcript and a halt in transcription at position +16, by omission of GTP (McDaniel et al. 2003). The PCR fragment was ~400 bp in length and included 92 bp downstream from the transcription terminator to allow resolution of terminated and readthrough products. The PCR product was purified with a Qiagen PCR cleanup kit and sequenced by Genewiz. Single-round transcription reactions were carried out as described (Grundy et al. 2002, McDaniel et al. 2003). A reaction mixture of 20 mM Tris-HCl, pH 8, 20 mM NaCl, 10 mM MgCl2, 100 mM EDTA, 150 µM ApC (Sigma, St. Louis, MO), 2.5 µM GTP and ATP, 0.75 µM UTP, 0.25 µM [α-32P]-UTP (800 Ci/mmol [30 TBq/mmol]; GE Healthcare, Piscataway, NJ), DNA template (10 nM) and His-tagged purified B. subtilis RNAP (6 nM) was incubated for 15 min at 37ºC and placed on ice. Heparin (20 µg/ml; Sigma) was added to block subsequent reinitiation. Elongation was resumed by the addition of 10 µM rNTPs and 40 mM MgCl2 and reactions were incubated at 37ºC for an

additional 15 min. Templates were transcribed in the presence of various concentrations of SAM or cellular extract, as indicated. The final volume of the reaction was 35 µl. Transcription was terminated by extraction with phenol-chloroform. Transcription products were resolved by denaturing polyacrylamide gel electrophoresis (PAGE) and visualized by PhosphorImager (Molecular Dynamics) analysis. Efficiency of termination was calculated as the amount of termination product divided by the sum of the readthrough and termination products. Percent termination was plotted as a function of ligand concentration (GraphPad Prism). The reactions were performed in duplicate and reproducibility was ±5%. An average internal cell volume of 0.54 ± 0.13 µl A595⁻¹ (Wabiko et al. 1988) was used to calculate the intracellular SAM concentrations from the cell equivalents and A595 of cell cultures used in preparation of the extracts, as described previously (McDaniel et al. 2006).

2.2.6 In vitro transcription termination assays

The wild-type and mutant metK DNA templates containing the native metK promoter were generated by PCR (KOD DNA polymerase, EMD Biosciences, San Diego, CA). PCR products were purified with a Qiagen PCR cleanup kit and sequenced by Genewiz. The PCR fragments were ~400 bp in length and included 121 bp downstream from the metK transcription terminator, to allow resolution of terminated and readthrough products. Multiple-round transcription reactions were carried out in the presence of a high concentration of GTP, which corresponds to the +1 position of the metK transcript. The reaction mixture was as described above, except that GTP was

added at a concentration of 0.29 mM. The reaction mixture also contained 7 mM KCl. ApC was omitted from the reaction mixture. Transcription was conducted in the presence of 10 µM rNTPs. Templates were transcribed in the presence or absence of SAM, as indicated. The reaction volume was 35 µl. Transcription reactions were incubated at 37ºC for 30 min and were terminated by extraction with phenol-chloroform. Transcription products were analyzed as described above. The reactions were performed in duplicate and reproducibility was ±5%.

2.2.7 RNase H cleavage assay

The transcription reaction contained 20 mM MgCl2, 40 mM Tris pH 8.0, 50 µg/ml BSA (Sigma), 5.0 mM each of ATP, CTP and GTP, 0.50 mM UTP, and 7.5 mM GMP. Transcription was carried out in the presence of 0.5 mM [α-32P]-UTP (800 Ci/mmol [30 TBq/mmol], GE Healthcare) for radiolabeling. DNA template was added at a concentration of 10 ng/µl. The DNA template was generated by PCR (KOD polymerase) from a plasmid containing the T7 promoter sequence attached upstream of the metK sequence. RNAs were transcribed in the presence or absence of 2.5 mM SAM, or ligand, as indicated, for 25 min at 37ºC. After transcription, a DNA oligonucleotide (5 µM) complementary to a region in the DS box sequence of the B. subtilis metK leader RNA (positions ) was added and hybridized for 5 min at 37ºC. RNase H (1 µl, 10 U/µl; Ambion, Austin, TX) was then added to the reaction and incubated for 10 min at 37ºC. The final volume of the reaction was 20 µl. The reactions were stopped by phenol-chloroform extraction. The resulting RNA products were resolved by denaturing PAGE

and visualized by PhosphorImager analysis. The percentage of RNAs protected from cleavage was calculated as the amount of full-length RNAs relative to the total amount of RNA in each reaction. The reactions were performed in duplicate and reproducibility was ±10%.

2.2.8 Total RNA extraction for primer extension analysis

BR151 cells were grown in 100 ml 2XYT broth until mid-exponential growth phase, harvested by centrifugation and resuspended in 6 ml LETS buffer (0.1 M LiCl, 10 mM EDTA, 10 mM Tris-HCl [pH 7.4], 1% sodium dodecyl sulfate). 6 ml phenol:chloroform:isoamyl alcohol (25:24:1) and 3 ml washed glass beads ( µ; Sigma) were added to the resuspended cells and vortexed for 2 min. RNA extraction was performed as described by Wu and coworkers (Wu et al. 1989). The RNA concentration was measured using an ND-1000 spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE).

2.2.9 Primer extension analysis of the metK leader RNA

Primer extension analysis was conducted using three separate oligonucleotide primers, metK RC 6 (5′-GCACCTTGGTTGTCTCACTCAGTTG-3′), metK RC 6-1 (5′-ACCTTGGTTGTCTCACTCAGTTGAAC-3′) and metK RC 9-1 (5′-GAATCATACAACCTTGCAACAGGTTAGC-3′). The oligonucleotide primers were 5′ end-labeled by incubation at 37ºC for 1 h with T4 polynucleotide kinase (New England Biolabs) in the presence of [γ-32P]-ATP (3000 Ci/mmol, [259 TBq/mmol]; MP

Biomedicals, Solon, OH). 50 μg of total RNA was subjected to reverse transcription using 600 pmol 5′ end-labeled primers with Superscript III reverse transcriptase (Invitrogen, Carlsbad, CA). The primer extension products were heated at 95ºC for 3 min and 3 μl samples were loaded onto an 8 M urea-6.0% polyacrylamide sequencing gel. A DNA sequencing ladder (Sequenase 2.0 Kit, USB, Santa Clara, CA) was generated as a size standard by using the same oligonucleotide primers and wild-type metK plasmid DNA as template. All products were visualized using PhosphorImager analysis.

2.2.10 Quantitative reverse transcriptase PCR (qRT-PCR) assay

BR151 Pspac-metK cells containing a metK-lacZ transcriptional fusion were grown in 10 ml 2XYT broth in the presence of 0.2 mM IPTG. Cells were grown until mid-log phase, harvested by centrifugation and resuspended in an equal volume (10 ml) of 2XYT broth in the absence of IPTG. The cells were then diluted 100-fold into fresh 2XYT broth, in the presence or absence of 1.0 mM IPTG, and growth was monitored. Rifampicin (150 µg/ml; Sigma) was added 3 h after the cells reached an OD of 0.10 at 595 nm. Duplicate samples (5 ml) of the cultures with and without IPTG were collected at 0, 2.5, 5, 10 and 20 min after rifampicin addition. Cells were collected on 0.45 µm pore size nitrocellulose filters (Nalgene, Rochester, NY) using a vacuum manifold. Filters were frozen immediately on dry ice. Frozen cell culture samples were scraped from the filter and resuspended in 330 µl LETS buffer. 300 µl phenol:chloroform:isoamyl alcohol (25:24:1) and 150 µl washed glass beads were added to the resuspended cells and RNA extraction was performed as described above. 30 µl of

the RNA samples generated from duplicate cultures at each time point were treated with 3 µl RNase-free DNase I (Turbo DNA-free kit; Ambion), in a final reaction volume of 100 µl, to minimize genomic DNA contamination. This step was followed by acid phenol:chloroform extractions. Reverse transcription reactions were carried out using a Thermoscript reverse transcriptase PCR (RT-PCR) system (Invitrogen). Each DNase-treated RNA sample (1 µl) was used to generate duplicate cDNA templates, resulting in 4 cDNA samples per RNA sample (final reaction volume 25 µl). Oligonucleotide primers for cDNA synthesis, lacZ 290(a) RC (5′-CGTAACCGTGCATCTGCCAGTT-3′) and 5S rRNA RC (5′-TCCTACTCTCACAGGGGGAAAC-3′), were used at a final concentration of 50 µM. Quantitative PCRs (25 µl) were performed in duplicate using iQ SYBR Green supermix (Bio-Rad, Hercules, CA). Data sets were collected on a Bio-Rad CFX96 real-time PCR system. The cycling conditions were as follows: 3 min at 95ºC (activation of Taq polymerase and well factor collection), followed by 40 cycles consisting of 30 s at 95ºC, 20 s at 56ºC, and 30 s at 72ºC. Fluorescence signal data were collected during the 72ºC phase of each cycle. Melt curves from 55ºC to 95ºC (in 0.5ºC increments, measuring fluorescence at each temperature) were collected for all samples following the last cycle and showed the presence of only one product in each reaction. DNA obtained by PCR, using the same oligonucleotide primers used for RT-PCR, was used as the standard DNA. Each cDNA sample was generated in duplicate, resulting in a total of 8 replicates. Efficiencies of amplification of each gene were similar based on the slopes of the standard curves. The standard curves were used to derive the copy number of each transcript in each RNA sample, which was an average of 8 replicates. Starting

quantities, abundance and half-life measurements of the transcripts were calculated using the qRT-PCR software (Bio-Rad CFX Manager) and GraphPad Prism.

2.3 Results

2.3.1 Response to varying SAM pools in vivo by the wild-type B. subtilis metK leader RNA

Previous data from our laboratory have shown that expression of the wild-type metK-lacZ transcriptional fusion is not induced during methionine starvation in a methionine auxotroph and the presence of SAM does not promote metK transcription termination in vitro (Grundy and Henkin 1998, Tomsic et al. 2008). Northern blot and qRT-PCR analyses show only a transient increase in metK readthrough transcripts when cells are starved for methionine (Tomsic et al. 2008). The majority of our earlier studies have focused on gene expression in a methionine auxotroph under limiting methionine conditions. The effects of methionine starvation on a methionine auxotrophic strain are two-fold. Starvation for methionine indirectly depletes the in vivo SAM pools (Grundy and Henkin 1998, McDaniel et al. 2003, Murphy et al. 2002) and results in inhibition of growth. We designed a system in which the in vivo SAM levels were modulated without changing the methionine levels, thereby not affecting the growth rate as severely as during methionine starvation. We constructed a BR151 Pspac-metK strain in which the chromosomal metK gene was under the control of an IPTG-dependent Pspac promoter (Yansura and Henner 1984). As this system was designed to deplete the cellular SAM pools without the need to

starve for methionine, cells were grown in rich medium. We predicted that growth in the presence of IPTG would result in high in vivo SAM pools (indicating an IPTG-dependent induction of metK expression), while growth in the absence of IPTG would result in low intracellular SAM pools (suggesting a reduction in metK gene expression). The intracellular SAM pools of BR151 Pspac-metK were measured during growth in rich medium to validate the function of the IPTG-dependent Pspac promoter (Figure 2.3). Formic acid extracts were prepared from cells harvested at different time points during IPTG limitation. To calculate the concentration of SAM present in the cellular extracts, we measured the termination efficiency at the yitJ leader region terminator promoted by addition of extracts compared to the termination efficiency promoted by known concentrations of SAM (McDaniel et al. 2006, Tomsic et al. 2008).
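The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the band intensities and the standard-curve points are hypothetical, and simple linear interpolation between neighboring standards stands in for whatever curve fit was actually applied in GraphPad Prism.

```python
from bisect import bisect_left


def termination_efficiency(terminated, readthrough):
    """Percent termination = terminated / (terminated + readthrough) * 100."""
    total = terminated + readthrough
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * terminated / total


def sam_from_standard_curve(percent_term, standards):
    """Estimate SAM (uM) for a sample from (sam_uM, percent_termination) standards.

    Assumption: linear interpolation between the two flanking standards.
    Samples outside the range of the standards are rejected, not extrapolated.
    """
    points = sorted((term, sam) for sam, term in standards)
    terms = [term for term, _ in points]
    sams = [sam for _, sam in points]
    if any(b <= a for a, b in zip(terms, terms[1:])):
        raise ValueError("standards need distinct % termination values")
    if any(b < a for a, b in zip(sams, sams[1:])):
        raise ValueError("% termination must rise with SAM concentration")
    if not terms[0] <= percent_term <= terms[-1]:
        raise ValueError("sample lies outside the standard curve")
    i = bisect_left(terms, percent_term)
    if terms[i] == percent_term:
        return float(sams[i])
    t0, t1 = terms[i - 1], terms[i]
    s0, s1 = sams[i - 1], sams[i]
    return s0 + (s1 - s0) * (percent_term - t0) / (t1 - t0)


# Hypothetical standards: (SAM in uM, % termination). Not data from this study.
standards = [(0, 20.0), (1, 40.0), (10, 60.0), (100, 80.0)]
sample_percent = termination_efficiency(terminated=300, readthrough=300)
print(sample_percent, sam_from_standard_curve(sample_percent, standards))
```

Rejecting samples outside the standard curve mirrors the idea of a limit of detection: an extract that gives less termination than the lowest standard cannot be assigned a concentration by this approach.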

Figure 2.3 Measurement of in vivo SAM pools and β-galactosidase activity during IPTG limitation. BR151 Pspac-metK cells containing the wild-type metK-lacZ transcriptional fusion were grown until mid-exponential phase in the presence of IPTG (0.2 mM), harvested and resuspended in 2XYT in the absence of IPTG. Samples were collected at the indicated times. Cell extracts were neutralized by the addition of 1 N KOH, and samples were added to a B. subtilis RNAP in vitro transcription termination reaction mixture containing a yitJ DNA template. Termination efficiency was compared to a standard curve generated using known concentrations of SAM, which was used to calculate the SAM concentration present in the cell extracts. The in vivo SAM levels are plotted on the left Y-axis (µM) and β-galactosidase activity is shown on the right Y-axis (Miller units). Dashed line with open squares, β-galactosidase activity; solid line with filled circles, in vivo SAM pools.

At the T0 time point, the SAM pools in the cell extracts were at ~150 µM (Figure 2.3, left Y-axis, solid line). The SAM levels dropped rapidly to ~40 µM within the first hour of IPTG limitation and eventually dropped below the limit of detection (~25 µM), ~2 h after removal of IPTG from the growth medium. Expression of the metK-lacZ

transcriptional fusion was monitored concurrently for samples collected in parallel, to corroborate the effect of SAM pool modulation (using IPTG limitation) on metK-lacZ derepression in BR151 Pspac-metK. β-galactosidase activity started at ~40 Miller units and reached 90 Miller units after 4 h of IPTG removal (Figure 2.3, right Y-axis, dashed line). These data support a correlation between lowered SAM pools and increased metK-lacZ expression in this strain.

Figure 2.4 In vivo expression of the wild-type metK-lacZ transcriptional fusion. The metK-lacZ fusion was integrated in single copy in strain BR151 Pspac-metK (lys-3 metB10 trpC2 Pspac-metK). Cells were grown in 2XYT broth containing IPTG (0.2 mM) and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of IPTG (1 mM). Samples were taken at 1-h intervals until 4 h after resuspension. Open squares, in the absence of IPTG; filled squares, in the presence of IPTG. MU, Miller units.

The expression of the wild-type metK-lacZ transcriptional fusion was measured during growth in the presence of IPTG and compared to expression during growth in the

absence of IPTG (Figure 2.4). The wild-type metK-lacZ fusion construct exhibited repression of gene expression when SAM pools were high (in the presence of IPTG). The β-galactosidase activity for cells grown in the presence of IPTG was in the range of Miller units. The metK-lacZ fusion construct exhibited a 3.5-fold increase in β-galactosidase activity (to ~110 Miller units) after 4 h of growth in the absence of IPTG, when SAM pools were low. Table 2.1 lists the values of the wild-type metK-lacZ β-galactosidase activity and induction ratios during methionine starvation and IPTG limitation.

Table 2.1 Expression of wild-type metK-lacZ transcriptional fusion in vivo

Assay | Strain | β-galactosidase activity (Miller units at T4) a | Induction ratio (T4) b
Methionine starvation | BR | +Met: ± ; -Met: ± |
IPTG assay | BR151 Pspac-metK | +IPTG: 31 ± ; -IPTG: ± |

a T4 indicates the value after 4 h of methionine starvation or IPTG limitation. Values are reported as the means ± the standard deviations for three assays.
b Induction ratio at T4 indicates the ratio of values in the absence of IPTG to values in the presence of IPTG.
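The induction ratio defined in the table footnote is a simple quotient, which can be sketched in Python as follows. The standard deviation attached to the ratio uses first-order error propagation for independent measurements; that treatment is an assumption added here for illustration and is not described in the source, and the input values are hypothetical.

```python
import math


def induction_ratio(induced, repressed, induced_sd=0.0, repressed_sd=0.0):
    """Return (ratio, sd) for induced / repressed beta-galactosidase activity.

    Assumption: the two means are independent, so their relative standard
    deviations add in quadrature (first-order error propagation).
    """
    if induced <= 0 or repressed <= 0:
        raise ValueError("activities must be positive")
    ratio = induced / repressed
    relative_sd = math.sqrt((induced_sd / induced) ** 2
                            + (repressed_sd / repressed) ** 2)
    return ratio, ratio * relative_sd


# Hypothetical Miller units at T4 (-IPTG vs +IPTG); not values from Table 2.1.
ratio, sd = induction_ratio(induced=100.0, repressed=25.0,
                            induced_sd=10.0, repressed_sd=5.0)
print(f"induction ratio = {ratio:.1f} +/- {sd:.1f}")
```

The same function applies to the methionine starvation assay by passing the -Met activity as `induced` and the +Met activity as `repressed`.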

2.3.2 Deletion mapping of the metK leader RNA

The predicted secondary structure of the metK leader RNA is made up of elements that resemble a typical S box RNA (Figure 2.5). Helices 1 through 4 are arranged around a conserved core. The kink-turn (or GA) motif in helix 2, as well as the potential tertiary interaction between the loop of helix 2 (L2) and the junction region between helices 3 and 4 (J3/4), are conserved in the metK leader sequence. The highly conserved sequences within the SAM-binding pocket in helix 3, as well as the AU-rich base pairs at the top of helix 1, are also evident.

Figure 2.5 The predicted secondary structure of the B. subtilis metK leader RNA. Numbering is relative to the transcription start site (+1). The sequence is shown in the terminator conformation; red and blue residues illustrate the alternate pairing required for formation of the antiterminator, shown above the terminator. Helices 1 to 5 are identified by boxed numbers; T, terminator; AT, antiterminator; AAT, anti-antiterminator. Bold dashed line (black) indicates sequence downstream of the terminator. Arrows indicate endpoints of the deletion mutants; red, 5′ deletion; green, 3′ deletion; purple, S box deletion. The triangle and the purple dashed line indicate the sequence that has been deleted for the S box deletion mutant.


Table 2.2 Expression of the metK-lacZ deletion mutants during IPTG assay

Construct       β-galactosidase activity (Miller units)a              Induction ratio (T4)d
                T0b (+/-IPTG)    T4c (+IPTG)    T4c (-IPTG)
wild-type       43 ±             ±              ±
5′ deletion     7.0 ±            ±              ±
3′ deletion     32 ±             ±              ±
ΔS box          150 ± 51         98 ±           ±

a Values are reported as the means ± the standard deviations for two assays.
b T0 indicates the value at the start of the IPTG assay.
c T4 indicates the value after 4 h of growth in the presence or absence of IPTG.
d Induction ratio at T4 indicates the ratio of values in the absence of IPTG to values in the presence of IPTG.

To identify features within the metK leader RNA that are critical for metK gene expression in response to SAM limitation (during IPTG limitation), we performed a systematic deletion mapping of the metK leader RNA. The 5′ deletion construct (in which positions +1 to +7 were deleted) failed to exhibit an induction of expression during growth in the absence of IPTG, and regulation in response to SAM limitation was lost. Deletion of residues at the 5′ end of the metK leader sequence resulted in a 6.0-fold reduction in β-galactosidase activity at time T0, relative to the wild-type metK-lacZ fusion construct (Table 2.2). The β-galactosidase activity of the 5′ deletion fusion construct was reduced ~12-fold after 4 h (T4) of growth in the absence of IPTG, while β-galactosidase activity was reduced ~3.5-fold after 4 h of growth in the presence of IPTG, as compared to the wild-type metK-lacZ fusion construct under the same conditions.

The 3′ deletion construct (in which the sequence from position +219 downstream of the terminator was deleted) also failed to exhibit an induction of expression after growth in the absence of IPTG. Deleting the 3′ end of the metK leader RNA resulted in a small decrease in β-galactosidase activity at T0 (1.3-fold) and T4 (1.7-fold) during growth in the presence of IPTG as compared to wild-type metK-lacZ. The expression of the 3′ deletion mutant at T4 in the absence of IPTG was reduced ~4.0-fold as compared to wild-type metK-lacZ (Table 2.2). These results showed that deleting either end of the metK leader RNA resulted in a loss of response to SAM limitation, as the constructs failed to show an induction of metK-lacZ expression upon IPTG removal. These results indicate that the first 7 nt at the 5′ end and the sequence downstream of the terminator region at the 3′ end of the metK leader RNA are important for metK gene expression.

Deletion of the S box riboswitch element (ΔS box, which deletes helices 1-4, as well as the terminator element, from positions +24 to +247) from the metK leader RNA resulted in a 3-fold induction of expression after 4 h of growth in the absence of IPTG compared to expression in the presence of IPTG. β-galactosidase activity of the ΔS box mutant at T0 was 4.0-fold higher than that of the wild-type metK construct, and activity at T4 was ~3.0-fold higher than wild-type both in the presence and absence of IPTG (Table 2.2).

Together, results obtained from the deletion mapping indicated that deleting the sequence at the 5′ and 3′ ends of the leader region had a negative effect on metK expression, while deleting the metK S box element increased overall expression without affecting the response to the changing SAM pools in vivo. These results suggest that the metK S box element does not participate directly in the primary mode of regulation. This prompted us to look more closely at the 5′ and 3′ regions of the metK leader RNA sequence.

2.3.3 Unique sequences are located on the 5′ and 3′ sides of the metK S box element

A phylogenetic analysis of the metK leader RNAs from several Firmicutes was conducted by F. Grundy (Figure 2.6). Comparative sequence analysis revealed two highly conserved regions in the metK leader RNAs in Firmicutes in addition to the S box element (Figure 2.7). These sequences were unique to metK, as they were absent from the other S box leader RNAs from B. subtilis. Several positions within the sequence elements were invariant, leaving little possibility for covariation. Significant sequence complementarity was observed between the two sequence elements, which suggested a base-pairing interaction. The base-pairing interaction resulted in a stem-loop structure located upstream of the SD region, designated the pre-SD structure (Figure 2.7C).

Figure 2.6 Alignment of S box sequences from metK genes in Firmicutes. The sequences in green are the predicted -35 and -10 promoter regions. The sequence shown in red is the highly conserved US box region located near the 5′ end of the metK leader RNA. The nucleotides shown in blue indicate the conserved core region. Abbreviations are as follows: Afla, Anoxybacillus flavithermus; Bcoa, Bacillus coagulans; Bhal, B. halodurans; Blic, B. licheniformis; Bpum, B. pumilus; Bsel, B. selenitireducens; Bsp14911, B. species strain 14911; Bsp SG1, B. species strain SG1; Bste, B. stearothermophilus; Bsub, B. subtilis; Esib, Exiguobacterium sibiricum; Gsp WCH70, Geobacillus species strain WCH70; Lsph, Lysinibacillus sphaericus; Mcas, Macrococcus caseolyticus; Oihe, Oceanobacillus iheyensis; Saur, Staphylococcus aureus; Scar, S. carnosus.


Figure 2.7 The alignment of the metK upstream (US) and downstream (DS) box sequences from Firmicutes. A. The metK US box sequence alignment (red sequences) with the conserved core sequence highlighted in blue. Underlined nucleotides overlap helix 1 of the metK S box leader RNA. The US box consensus sequence is shown at the bottom. Upper case letters, 100% conservation. B. The metK DS box sequence alignment (green sequences) with the conserved core shown in blue. The DS box consensus sequence is at the top of the alignment. Upper case letters, 100% conservation. Abbreviations are as in Figure 2.6. Additional abbreviations are as follows: Bant, B. anthracis; Sepi, S. epidermidis. C. Pre-SD, Pre-Shine-Dalgarno.

The sequence element located to the 5′ side of the metK S box riboswitch element was termed the Upstream (US) box sequence (Figure 2.7A). It overlapped helix 1 of the metK S box riboswitch and extended into the junction region between helices 1 and 2. The 19-nucleotide consensus sequence contained 10 invariant positions. Primer extension analysis was conducted to map the metK transcriptional start site (+1) in order to identify the location of the metK US box sequence relative to the 5′ end of the transcript. A primer extension product (~110 bases in length) was observed only in the reaction to which reverse transcriptase was added and corresponded to a guanine (G) in the sequencing ladder (Figure 2.8, lanes 2 and 7). Several complementary DNA oligonucleotides designed to hybridize to different regions in the RNA were employed to verify this result (data not shown). The primer extension analysis indicated that the 5′ end of the metK US box sequence is located precisely at the +1 position of the metK transcript.

Figure 2.8 Primer extension analysis of the B. subtilis metK leader RNA to map the 5′ transcriptional start site. A DNA oligonucleotide (complementary to nucleotides ) was hybridized to total RNA isolated from strain BR151 grown in 2XYT broth. Two concentrations of total RNA (10 µg and 50 µg) were tested in a reverse transcription (RT) reaction. The same primer was used for generating the sequencing ladder. The bold arrows indicate the band corresponding to the metK transcription start site; the first transcribed nucleotide of the metK mRNA is indicated by the bold letter G with an asterisk. M, size marker.


Phylogenetic analysis revealed a second sequence element, designated the Downstream (DS) box sequence, which was located ~60 nt downstream of the metK S box terminator sequence and was significantly complementary to the US box sequence (Figure 2.7B). Of the 16 nucleotides in the DS box consensus sequence, 5 nucleotides were 100% conserved. The B. subtilis metK DS box sequence was located ~30 nt upstream of the metK translational start site. This distance appeared to vary, such that a few organisms showed longer insertions between the end of the DS box sequence and the metK start codon (data not shown). Deleting a portion of this region (positions ) from the B. subtilis metK leader RNA resulted in expression similar to wild-type metK-lacZ under limiting SAM conditions (data not shown).

2.3.4 The metK US and DS box regions are involved in a base-pairing interaction and the SAM response for pairing is dependent on a functional SAM-binding domain

We performed an RNase H probing assay to investigate the formation of the pre-SD structure. The RNase H enzyme targets only DNA-RNA hybrids. Accessibility of the DS box sequence was probed using a DNA oligonucleotide complementary to a region in the DS box sequence (positions ). The antisense oligonucleotide was designed such that it did not interfere with nucleotides in the US box sequence that overlapped helix 1. T7 RNA transcripts were generated in the presence or absence of SAM. The wild-type metK transcript control reaction, in the absence of antisense oligonucleotide and SAM, showed protection from RNase H cleavage (Figure 2.9A, lane 1). Addition of the oligonucleotide in the absence of SAM resulted in an increase (~14-fold) in the cleavage product, while transcripts generated in the presence of SAM were highly sensitive to cleavage, such that cleavage increased 40-fold relative to lane 1 and 3.0-fold relative to lane 2 (Figure 2.9A, lanes 1, 2 and 3). These data indicated that the addition of SAM to the wild-type metK RNA resulted in increased accessibility of the DS box sequence.

The specificity of the metK leader RNA for SAM was confirmed by testing the wild-type metK leader RNA in response to a closely related SAM analog or SAM precursors using the RNase H cleavage assay. The wild-type metK RNA exhibited cleavage only in response to SAM. Millimolar concentrations of S-adenosylhomocysteine (SAH), methionine or dATP (either alone or in combination) failed to stimulate increased cleavage of the wild-type leader RNA by RNase H (data not shown). We also tested the effect of the stringent response alarmone (ppGpp) and failed to observe an increase in cleavage of the wild-type leader RNA (data not shown). Titration of the wild-type metK leader RNA in the presence of varying concentrations of SAM indicated that a minimum concentration of 50 µM was required for ~50% cleavage (Figure 2.9B).


Figure 2.9 Oligonucleotide-directed RNase H cleavage mapping of the B. subtilis metK leader RNA. A. RNase H cleavage of the wild-type and mutant metK leader RNAs in response to SAM. The US1 mutant contains a CGG→AUU change and the S box mutant contains a single U→C substitution in the SAM binding pocket (position +105). Denaturing PAGE analysis of RNase H cleavage products. Oligonucleotide was added to radiolabeled B. subtilis metK RNA generated in the presence (+) or absence (-) of SAM (1 mM). RNA-DNA hybrids were cleaved with RNase H and the products were visualized by autoradiography. B. SAM titration of the wild-type metK leader RNA. Increasing concentrations of SAM (0, 0.10, 1.0, 10, 50, 100, 500, 1000 and 5000 µM) were added to each reaction. P, protected transcript; C, cleavage transcript; % C, percentage of cleavage transcript relative to the total amount of transcript.
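The % C value defined in the legend is the cleaved fraction of the total transcript, and the fold changes quoted in the text are ratios of such values between lanes. A minimal sketch of both calculations is given below. The band intensities are hypothetical numbers chosen for illustration (they are not measurements from Figure 2.9), and the function names are assumptions introduced here.

```python
def percent_cleavage(protected, cleaved):
    """Return % C: cleaved transcript as a percentage of total transcript
    (protected + cleaved), using band intensities in arbitrary units."""
    total = protected + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * cleaved / total


def fold_change(value, reference):
    """Return how many times larger `value` is than `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


# Hypothetical band intensities (arbitrary units) for two lanes.
no_sam = percent_cleavage(protected=80.0, cleaved=20.0)    # 20.0 % C
with_sam = percent_cleavage(protected=40.0, cleaved=60.0)  # 60.0 % C

print(no_sam, with_sam, fold_change(with_sam, no_sam))
```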

Three positions in the metK US box sequence that are predicted to be 100% conserved throughout all the Firmicutes were mutated simultaneously (C4A, G5U and G6U) to generate the metK US1 mutant. The resulting triple mutant exhibited constitutive high cleavage in both the presence and absence of SAM (Figure 2.9A, lanes 5 and 6) relative to wild-type metK. Thus, disrupting the invariant US box nucleotides resulted in a loss of response to SAM in the RNase H cleavage assay.

The metK S box mutant, containing a U105C substitution in the SAM binding domain, resulted in a nearly equal ratio of cleavage to protection, regardless of whether SAM was added to the reaction (Figure 2.9A, lanes 7, 8, 9). A similar U→C substitution in the SAM binding pocket of the yitJ leader RNA abolishes SAM binding (Lu et al. 2010). It is important to note that, compared to the wild-type RNA under the same conditions, the metK S box mutant exhibited higher cleavage of the DS box sequence in the absence of SAM but increased protection in the presence of SAM. This result indicated that a point mutation in the SAM binding pocket (independent of the US and DS box sequences) led to a general increase in accessibility of the DS box sequence.

The metK ΔS box RNA exhibited high protection from RNase H cleavage (~98%, data not shown) regardless of the presence of SAM. This result indicated a loss of response to SAM due to the deletion of the S box element. The metK ΔS box RNA containing the US1 mutation showed constitutive cleavage (~95%, data not shown). The metK ΔS box US1 RNA exhibited a cleavage pattern (data not shown) similar to that shown by the metK US1 RNA alone (Figure 2.9A, lanes 5 and 6), indicating that the US1 phenotype is consistent in the RNase H cleavage assay.

2.3.5 Mutagenic analysis of the conserved metK US box element

Extensive site-directed mutagenesis was performed to study the unique sequences in the metK leader RNA that are predicted to be important for metK regulation. Guanine at the metK +1 site was left intact so as not to affect transcription initiation by RNAP. A G1A-lacZ fusion construct exhibited loss of induction during IPTG limitation with an overall reduced β-galactosidase activity, compared to wild-type metK-lacZ (Woltjen and Grundy, data not shown). Nucleotides beyond position U8 in the metK US box sequence overlap helix 1 of the S box element (Figure 2.7A) and hence were not altered. We therefore focused our attention on the US box positions A2-U8, which were mutated individually to every other nucleotide or, in the case of U8, also deleted. A total of 22 US box point mutants were examined in response to SAM limitation in the BR151 Pspac-metK strain using in vivo expression assays. Figure 2.10 illustrates the results obtained from a comprehensive in vivo analysis of the US box point mutants.

Figure 2.10 In vivo expression analysis of wild-type metK compared to the metK US box mutant constructs using the IPTG assay. BR151 Pspac-metK cells, containing the appropriate transcriptional fusion construct, were grown in the presence of IPTG (0.2 mM) up to mid-exponential phase, harvested and resuspended in the absence or presence of IPTG (1.0 mM). Samples were taken every 1 h until 4 h and tested for β-galactosidase activity. The graph shows data for the 4 h time point. MU, Miller units.

Mutant constructs with sequence changes in the US box region that disrupted the base-pairing between the US and DS boxes in the pre-SD structure exhibited an overall reduced gene expression with no response to the changing SAM levels, as seen for the A2C/U, G3C/U, A7C/U and U8C mutants (Figure 2.10). Each nucleotide substitution weakened the base-pairing interaction and resulted in a loss of response to changing concentrations of SAM (Figure 2.5, right panel). However, mutants in which the US-DS pairing was maintained, either by a newly formed Watson-Crick or a G•U wobble base pair, exhibited a response to SAM limitation with increased readthrough in the absence of IPTG, consistent with expression of the wild-type metK-lacZ fusion construct.

Of the 8 selected US box sequence positions, 5 positions (C4, G5, G6, A7 and U8) were predicted to be 100% conserved. However, the A7 and U8 positions tolerated a sequence change. The A7G substitution was tolerated as it formed a wobble base pair with a U in the DS box sequence (Figure 2.10). Although the U8 position is located in a loop region and no obvious nucleotide in the DS box sequence is predicted to base-pair with it, mutation of this invariant position to any of the other three nucleotides (A, C or G) or even deleting it (U8Δ) was tolerated (Figure 2.10).

Positions C4-G6 did not tolerate any sequence change, even though a potential to base-pair was maintained. Mutations C4A and C4G resulted in constitutive high readthrough. Substitutions at the G5 and G6 positions, as well as the mutation C4U, led to overall reduced β-galactosidase activity, with loss of response to SAM limitation relative to the wild-type metK-lacZ construct (Figure 2.10). The G5U mutant exhibited the lowest β-galactosidase activity among the 22 US box point mutants. These results imply that the C4, G5 and G6 residues are particularly crucial for the metK response in vivo, and indicate the importance of sequence over base-pairing interaction within this region.

The three positions C4, G5 and G6, which are 100% conserved in the metK alignments, are part of a designated conserved core region (Figures 2.7A and 2.7B, residues highlighted in blue). The results from the in vivo expression analysis of the C4-G6 mutants are consistent with the in vitro results from the RNase H cleavage assay, in which the severe metK US1 triple mutant (C4G5G6→A4U5U6) exhibited constitutive high cleavage. Together, these results suggest that disrupting the US-DS pairing interaction results in a loss of response to SAM.

2.3.6 Physiological context of the G5U mutant

The G5U metK-lacZ transcriptional fusion construct exhibited the lowest β-galactosidase activity in BR151 Pspac-metK (~5 Miller units in the absence of IPTG, 22-fold lower than wild-type metK-lacZ expression, Figure 2.10) compared to the other metK US box mutants. The G→U substitution at position G5 was predicted to disrupt a 100% conserved G-C base pair with the DS box sequence. We introduced this mutation into the chromosomal metK copy of a methionine auxotroph (BR151; metB10) to generate the strain BR151-G5U. Mutating the native metK gene resulted in a growth defect on TBAB plates, such that colonies appeared tiny compared to a strain with an intact metK gene (Grundy, data not shown). The difference in growth seen on nutrient-rich plates suggested that the mutation G5U resulted in decreased SAM synthetase activity.

We generated the strain BR151-G6A, in which the US box point mutation G6A was introduced into the chromosomal metK sequence. The G6A metK point mutant had exhibited an intermediate β-galactosidase activity in the BR151 Pspac-metK background, which was lower than wild-type (~4.4-fold lower in the absence of IPTG) but ~5-fold higher than G5U in the absence of IPTG (~22 Miller units; Figure 2.10). BR151-G6A accordingly showed a colony growth phenotype that was intermediate between wild-type and BR151-G5U (Grundy, data not shown).

We tested the ability of the mutant strains to grow in the presence of ethionine, the S-ethyl analog of methionine. Growth of the mutant strains was compared to that of an isogenic strain containing the wild-type metK gene, BR151-ZKO. Methionine was included in the minimal medium to support growth of the auxotrophic strains. Growth was analyzed qualitatively (by observing the colony size) on minimal medium plates, with or without ethionine (25 µg/ml). In the absence of ethionine but in the presence of methionine, the wild-type BR151-ZKO strain showed larger colonies compared to the tiny colonies of BR151-G5U, while BR151-G6A showed an intermediate growth phenotype (data not shown). In the presence of both ethionine and methionine, the difference in colony sizes for the three strains was unchanged: the wild-type showed larger colonies, BR151-G5U showed tiny colonies and BR151-G6A showed medium-sized colonies (data not shown). This result suggested that the ethionine present in the medium was outcompeted by methionine, leading to preferential uptake of methionine over ethionine.

We also tested the effect of the metK mutations in a methionine prototrophic (BR151MA; Met+) background, in which exogenous methionine was omitted from the minimal medium and growth of the three strains was compared only in response to ethionine. In the absence of ethionine, BR151MA-ZKO (containing the wild-type metK allele) displayed larger colonies compared to BR151MA-G5U, while BR151MA-G6A showed an intermediate colony size. As expected, BR151MA-ZKO showed no growth on minimal medium plates containing ethionine, indicating the toxic effect of ethionine.

These results suggested that the toxicity of ethionine for the prototrophic wild-type strain could not be overcome, as methionine was not included in the medium. However, both BR151MA-G5U and BR151MA-G6A were able to grow in the presence of ethionine, unlike the wild-type control (data not shown). We predicted that mutations G5U and G6A in the metK sequence resulted in significantly lower SAM synthetase activity compared to the wild-type strain, which led to very low amounts of ethionine incorporation into the mutant strains. The difference in SAM synthetase activity of the wild-type and mutant strains can explain the preferential toxicity of ethionine for the wild-type strain.

To corroborate the results obtained during the ethionine analysis, we compared the SAM pools at a single time point for the strains BR151-ZKO and BR151-G5U. Cells grown in rich medium or minimal medium containing methionine were harvested at mid-log phase, and the SAM pools were measured. Under both growth conditions, BR151-ZKO showed ~80 µM SAM, while SAM pools for the strain BR151-G5U were below the level of detection (data not shown). These results validated our prediction that the G5U mutant resulted in lower SAM synthetase activity.

2.3.7 Mutagenesis of the metK DS box sequence

A subsequent mutagenic analysis was conducted to gain insight into the functional role of the DS box sequence. Unlike the US box mutagenesis, we did not target the entire DS box sequence for mutagenesis, as a few substitutions on the 5′ side of the DS box sequence would have disrupted pairing with the 3′ portion of the US box sequence, which overlaps helix 1 of the S box element. We therefore targeted specific positions within the DS box sequence based on the results obtained from the US box mutagenesis. Figure 2.11 illustrates the results from the comprehensive in vivo analysis of the DS box mutants.

The majority of the DS box point mutant constructs failed to exhibit a response to SAM limitation, and showed 3- to 4-fold reduced β-galactosidase activity in the absence of IPTG compared to wild-type metK-lacZ (Figure 2.11). The DS box mutants showed less variation in gene expression at each position compared to the US box mutants, although not every alternate nucleotide was tested in the DS box mutagenic analysis. Only two DS box mutants (U3pC and G16pA, corresponding to U265C and G252A) showed a response to changing SAM levels. As seen for the US box point mutants, an intact US-DS base-pairing interaction appeared to maintain a response to limiting levels of SAM, consistent with the wild-type phenotype.

The wild-type nucleotide U265 in the DS box sequence makes a wobble base pair with position G3 in the US box sequence. The U265C mutation in the DS box sequence formed a Watson-Crick interaction by pairing with position G3 in the US box. The single nucleotide substitution in the DS box sequence maintained the response to SAM, as it resulted in increased expression during SAM limitation. The DS box point mutation G252A also showed a response to SAM, with expression in the absence of IPTG ~2-fold higher than in the presence of IPTG. In the absence of IPTG, the β-galactosidase activity of the G16pA-lacZ construct was ~2-fold lower than the activity of the wild-type metK-lacZ fusion construct.

[Figure 2.11 x-axis constructs: U265A, U265C, G264A, G264C, C263A, C263G, C263U, C262A, C262G, C262U, G259U, G252U, C4U-G264A, G5A-C263U, G6A-C262U, A7U-U261A]

Figure 2.11 In vivo expression analysis of wild-type metK compared to the metK DS box mutant constructs and metK US-DS box double mutants using the IPTG assay. BR151 Pspac-metK cells, containing the appropriate transcriptional fusion construct, were grown in the presence of IPTG (0.2 mM) up to mid-exponential phase, harvested and resuspended in the absence or presence of IPTG (1.0 mM). Samples were taken every hour until 4 h and tested for β-galactosidase activity. The graph shows data for the 4 h time point. MU, Miller units.

Three positions within the DS box sequence, G264, C263 and C262 (C263 and C262 are 100% conserved in the DS box consensus sequence), were predicted to base-pair with three positions, C4, G5 and G6, from the US box sequence. The six GC-rich positions together make up the conserved core region based on the phylogenetic analysis of the metK genes from Firmicutes. Similar to the mutations in the US box conserved core, substitutions within the DS box conserved core region did not result in increased expression during SAM limitation, in spite of the potential to base-pair. This result strongly indicated sequence preference over base-pairing, as five out of the six conserved core nucleotides are invariant throughout the entire phylogeny.

Compensatory mutations were generated by simultaneously changing the US and DS box positions within and outside of the conserved core region to maintain the base-pairing interaction between the two regions. We tested three double mutants within the conserved core, of which only one (C4U-G264A) showed increased gene expression upon limitation for SAM, suggesting that the compensatory mutant rescued the phenotype of the individual point mutants (Figure 2.11). However, wild-type expression levels were not achieved under these conditions, as β-galactosidase activity of the C4U-G264A double mutant was 1.6-fold lower than activity of the wild-type construct.

We also tested mutants with compensatory changes outside the conserved core region (G3A-U265C, A7U-U261A and A7G-U261C). The single nucleotide substitutions G3A and U265C, in the US and DS box sequences respectively, maintained a response to changing concentrations of SAM (Figure 2.10). However, the double mutation G3A-U265C resulted in a loss of response to SAM limitation (Woltjen and Grundy, data not shown). The newly formed A-C mismatch disrupted the US-DS pairing, and the mutant was unable to exhibit high expression under low SAM conditions. This result further validated the prediction that the US-DS base-pairing interaction is required for a response to SAM.
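The pairing logic used to interpret these mutants (Watson-Crick pair, G•U wobble, or mismatch) can be made explicit with a short sketch. The function name `classify_pair` and the lookup tables are assumptions introduced for illustration. The example pairs are the ones named in the text. Note that this only classifies pairing potential: as discussed for the conserved core, the ability to pair was not always sufficient for a SAM response.

```python
# Canonical RNA pairs, stored as unordered sets so that G-C and C-G match.
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(us_base, ds_base):
    """Classify the interaction between a US box base and a DS box base."""
    pair = frozenset((us_base.upper(), ds_base.upper()))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# US/DS base combinations taken from the mutants described in the text.
examples = {
    "wild type (G3/U265)": ("G", "U"),
    "U265C (G3/C265)": ("G", "C"),
    "G3A-U265C (A3/C265)": ("A", "C"),
    "C4U-G264A (U4/A264)": ("U", "A"),
}

for label, (us, ds) in examples.items():
    print(f"{label}: {classify_pair(us, ds)}")
```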

The single nucleotide substitution A7U in the US box sequence resulted in a disruption of the US-DS base-pairing. This mutant exhibited low expression during SAM limitation (Figure 2.10). Introducing the compensatory change in the DS box sequence (U261A) failed to rescue the US box mutant phenotype to wild-type expression levels (Figure 2.11). A different double mutation, A7G-U261C, resulted in high expression under low SAM concentrations (Woltjen and Grundy, data not shown). This result suggested a preference for a purine at position 7, as only a G•U wobble (formed by the A7G point mutant) or a G-C Watson-Crick base pair (formed by the A7G-U261C double mutant) was tolerated at this position.

Based on the above results, we can conclude that the metK US and DS box sequences are involved in a pairing interaction and that the US-DS pairing is required to obtain a response to SAM. At the same time, the mutagenesis results for the conserved core indicate that base-pairing is not always sufficient to achieve increased metK-lacZ expression during SAM limitation. These results suggest that the conserved core region may act as a recruiting site for a potential factor that binds to the paired US-DS region and further stabilizes the interaction. These results clearly indicate a sequence requirement for the conserved core region.

2.3.8 Effect of the US box mutations on metK transcript stability and abundance

Based on the results from the in vivo reporter assays, we wanted to analyze whether mutating the 5′ US box sequence had an effect on metK transcript stability. Addition of rifampicin followed by qRT-PCR was performed to measure the half-life (t1/2) of metK-lacZ RNAs. We also measured the abundance of the transcripts from the same samples. Abundance of the 5S rRNA, which was measured for each RNA sample, served as the control. As expected, no significant decrease in 5S rRNA abundance was observed during growth.

Figure 2.12 Measurement of RNA abundance of metK transcripts. Cells were grown in rich medium either in the presence or in the absence of IPTG (0.2 mM). Growth was monitored until mid-exponential phase and cells were harvested at T3. Duplicate samples were filtered through a vacuum manifold at 0, 2.5, 5, 10 and 20 min after rifampicin (150 µg/ml) treatment and instantly frozen on dry ice. The transcript half-life (t1/2) was determined by measuring the decrease in transcript abundance over time. A. Wild-type metK-lacZ; B. G5U metK-lacZ; C. US1 metK-lacZ. Open symbols indicate transcripts in the absence of IPTG, filled symbols indicate transcripts in the presence of IPTG. SQ denotes the starting quantity of the RNA as determined by the qRT-PCR software (Bio-Rad CFX Manager).
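The legend states that t1/2 was determined from the decrease in transcript abundance over time but does not describe the fitting procedure. One common approach, assumed here and not confirmed by the source, is to model the decay as first order, fit a straight line to ln(abundance) versus time, and convert the slope to a half-life with t1/2 = ln(2)/k. The sketch below uses synthetic abundances generated with a 5-min half-life at the sampling times given in the legend. The function name `half_life` is hypothetical.

```python
import math


def half_life(times_min, abundances):
    """Estimate a half-life (in minutes) by least-squares regression of
    ln(abundance) on time, assuming first-order (exponential) decay."""
    if len(times_min) != len(abundances) or len(times_min) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k for a decaying transcript
    if slope >= 0:
        raise ValueError("abundance does not decrease over time")
    return math.log(2) / -slope


# Sampling times from the legend; abundances are synthetic, not measured.
times = [0.0, 2.5, 5.0, 10.0, 20.0]
synthetic = [100.0 * 0.5 ** (t / 5.0) for t in times]

print(round(half_life(times, synthetic), 2))
```

With real data, a transcript that decays faster than the first sampling interval can only be assigned an upper bound, which is consistent with the "less than" half-life reported for the US1 mutant.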


149 The RNA abundance for the wild-type metk-lacz fusion construct was compared to two mutant metk-lacz fusion constructs, the US box G5U point mutant, and the more severe triple mutant, US1. In the absence of IPTG (when SAM levels are low), the wildtype metk-lacz RNA abundance dropped ~4.0-fold after 20 min of IPTG limitation, relative to the 0 min time point (Figure 2.12A), while in the presence of IPTG (under high SAM levels), wild-type metk-lacz RNA abundance dropped ~10-fold relative to the 0 min time point. The US1 mutant exhibited a drastic (~340-fold) reduction in transcript abundance after 20 min of IPTG limitation, relative to the 0 min time point (Figure 2.12C), while RNA abundance in the presence of IPTG dropped 12-fold over a period of 20 min. These results indicated the severe effect of the triple mutant on metk RNA abundance. The G5U construct showed an intermediate phenotype relative to the wildtype and US1 metk constructs. The G5U-lacZ transcript levels dropped ~24-fold after 20 min of IPTG limitation, while the RNA abundance in the presence of IPTG dropped ~7- fold over a period of 20 min (Figure 2.12B). The t 1/2 for the wild-type metk-lacz RNA in the absence of IPTG was ~15 min, suggesting that the wild-type transcript is extremely stable when SAM levels are low. In the presence of IPTG (under high SAM levels), the t 1/2 for the wild-type metk-lacz RNA dropped ~12-fold to 1.2 min (Figure 2.12A). These results indicated that the wild-type transcript was stabilized when SAM levels were low. The US1 mutant exhibited a reduction in the transcript half-life when cells were grown in the absence of IPTG compared to wild-type metk-lacz. The t 1/2 value in the absence of IPTG was less than 0.82 min, ~18-fold lower than the wild-type t 1/2 value, suggesting that the transcript 125

stability is reduced significantly due to the severe triple mutation. The transcript stability of the US1 mutant in the presence of IPTG was comparable to wild-type, with a t1/2 value of 1.9 min (Figure 2.12C). The transcript stability of the G5U mutant was intermediate relative to the wild-type and US1 metK constructs, with a t1/2 value of ~7.4 min (Figure 2.12B). This value is ~2-fold lower compared to the wild-type t1/2 under the same conditions, indicating a less severe disruption of the US-DS pairing interaction as compared to the US1 mutant. The G5U-lacZ transcript showed a t1/2 of 1.2 min in the presence of IPTG after 20 min, equivalent to that of wild-type metK-lacZ RNA. These results suggest that pairing of the US-DS boxes is important for transcript stability and that the presence of SAM reduces transcript stability.

We also measured the change in gene expression over time for the same rifampicin-treated samples, grown in the presence or in the absence of IPTG. Expression at each time point was normalized to the stable 5S rRNA control, which showed no change in expression under these conditions (Figure 2.13). All values are relative to the 0 min time point under each condition. Wild-type metK-lacZ exhibited a ~3.0-fold reduction in gene expression over a 20-min period in the absence of IPTG, and an 11-fold drop in expression over the same time period in the presence of IPTG (Figure 2.13, green squares). The overall expression of wild-type metK-lacZ during growth in the absence of IPTG (when SAM pools are low) was ~4.0-fold higher than for cells grown in the presence of IPTG (when SAM pools are high). This further corroborated the observation that wild-type metK-lacZ expression increases when SAM pools are low during IPTG limitation. The US1-lacZ transcript showed a ~400-fold decrease in expression after

20 min during growth in the absence of IPTG, while expression dropped ~17-fold when cells were grown in the presence of IPTG (Figure 2.13, blue triangles). The G5U-lacZ RNA exhibited a 23-fold reduction in gene expression in the absence of IPTG, and a 10-fold drop in expression in the presence of IPTG, over a period of 20 min (Figure 2.13, orange circles). The G5U mutant exhibited an expression profile that was intermediate between that of the wild-type metK-lacZ and US1-lacZ constructs, suggesting that the single point mutation was less severe in its effect on gene expression. Expression for all three constructs in the presence of IPTG was nearly superimposable (Figure 2.13, filled symbols).
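The normalization described above (expression at each time point divided by the 5S rRNA control, then expressed relative to the 0 min time point) can be sketched as follows. This is an illustrative calculation under the assumption that starting quantities (SQ) for the target and the reference are available at every time point. The function name and the numbers are hypothetical, not values from this study.

```python
def normalized_fold_expression(target_sq, reference_sq):
    """Return target/reference ratios scaled so the first (0 min) point is 1.0."""
    if len(target_sq) != len(reference_sq) or not target_sq:
        raise ValueError("target and reference series must have equal, nonzero length")
    # Step 1: normalize the target to the stable reference at each time point.
    ratios = [t / r for t, r in zip(target_sq, reference_sq)]
    # Step 2: express every ratio relative to the 0 min time point.
    baseline = ratios[0]
    return [ratio / baseline for ratio in ratios]


# Hypothetical SQ values at 0, 2.5, 5, 10 and 20 min after rifampicin addition.
target = [800.0, 400.0, 200.0, 80.0, 20.0]
reference = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0]  # stable control

fold = normalized_fold_expression(target, reference)
# Fold reduction over the time course is the reciprocal of the final value.
fold_reduction_20_min = 1.0 / fold[-1]
```

A fold reduction reported in the text (for example, ~400-fold for US1-lacZ without IPTG) corresponds to the reciprocal of the normalized value at the 20 min time point.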

[Figure 2.13 plot: normalized fold expression (y-axis; tick labels 1, 0.1, 0.01) versus time in min (x-axis); series: WT metK, G5U metK, US1 metK.]

Figure 2.13 Expression profiles of the wild-type and mutant metK-lacZ transcripts. Gene expression for the same rifampicin-treated samples (as in Figure 2.12), grown in the presence or in the absence of IPTG. Expression at each time point was normalized to the stable 5S rRNA control. Squares, wild-type metK-lacZ; circles, G5U metK-lacZ; triangles, US1 metK-lacZ. Open symbols, cells grown in the absence of IPTG; filled symbols, cells grown in the presence of IPTG.

In vitro analysis of the metK US and DS box mutants

Transcription termination assays were performed to determine if SAM could promote premature termination of transcription of the wild-type and mutant metK constructs. Wild-type metK failed to show a response to SAM in vitro, i.e., there was no

increase in transcription termination in the presence of SAM (Figures 2.14 and 2.15). (Note: The in vitro data have been expressed as percent readthrough instead of termination, for easier comparison with the in vivo readthrough expression analysis.) The majority of the metK US box mutants did not show a response to SAM in vitro (Figure 2.14). Only two US box mutants, A2C and G3A, showed a 1.5-fold reduction in percent readthrough (or increase in percent termination) in the presence of SAM. Not all substitutions at the same position showed the same amount of readthrough. For example, the A2 mutants showed a mixed phenotype, such that A2G exhibited high readthrough, A2U showed very low readthrough, while A2C responded to changing concentrations of SAM. Positions G3 and A7 exhibited different phenotypes for each nucleotide substitution. Out of the 3 mutants at position G5, 2 showed low readthrough, while mutants at position C4 exhibited constitutive high readthrough (Figure 2.14).


Figure 2.14 In vitro analysis of the metK US box mutants. Transcription termination assays were conducted in the absence or presence of SAM (2.5 mM). Constructs have been grouped according to nucleotide position. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. Here, the Y-axis denotes the percent readthrough, which was calculated by subtracting the termination value from 100.

The metK DS box mutants were also analyzed in response to SAM using the in vitro transcription termination assay (Figure 2.15). The metK DS box mutants showed a relatively consistent percent readthrough compared to the metK US box mutants (comparing Figures 2.14 and 2.15). However, the DS box mutants failed to show any increase in percent termination in the presence of SAM (Figure 2.15).
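The percent readthrough defined in the Figure 2.14 legend can be written out as a short calculation. The sketch below assumes that band intensities for the terminated and readthrough products have already been quantified. The function name and the intensity values are hypothetical.

```python
def percent_readthrough(terminated, readthrough):
    """Percent readthrough = 100 - percent termination.

    Percent termination is the terminated product relative to the sum of the
    terminated and readthrough products, as defined in the figure legend.
    """
    total = terminated + readthrough
    if total <= 0:
        raise ValueError("total product signal must be positive")
    percent_termination = 100.0 * terminated / total
    return 100.0 - percent_termination


# Hypothetical band intensities (arbitrary units) without and with SAM.
minus_sam = percent_readthrough(terminated=40.0, readthrough=60.0)
plus_sam = percent_readthrough(terminated=60.0, readthrough=40.0)

# A SAM-responsive construct shows lower readthrough when SAM is present.
fold_reduction = minus_sam / plus_sam
```

In this hypothetical example readthrough falls from 60% to 40%, a 1.5-fold reduction, which is the size of the response reported for the A2C and G3A mutants.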

[Figure 2.15 plot. Constructs on the x-axis: U265A, U265C, G264A, G264C, C263A, C263G, C263U, C262A, C262G, C262U, G259U, G252U, C4U-G264A, G5A-C263U, G6A-C262U.]

Figure 2.15 In vitro analysis of the metK DS box mutants. Transcription termination assays were conducted in the absence or presence of SAM (2.5 mM). Constructs have been grouped according to nucleotide position. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. Here, the Y-axis denotes the percent readthrough, which was calculated by subtracting the termination value from 100.

Three US-DS box double mutants exhibited a ~ fold increase in the total transcription compared to wild-type metK and the DS box point mutants (Figure 2.15). However, none of the double mutants responded to the presence of SAM in vitro. Overall, unlike results from the in vivo reporter assays, the in vitro transcription

termination analysis remained inconclusive. Additional studies will be required to understand metK regulation in vitro.

Conditions to generate a metK halted-complex during in vitro transcription

While testing the metK mutants using the multiple-round in vitro transcription termination assays, we simultaneously identified experimental conditions that would allow us to perform a single-round transcription assay using the metK template and were successful in generating a metK halted-complex. As the +1 site was identified using primer extension analysis, we used high GTP as the initiating nucleotide. Attempts to initiate transcription using the GpA dinucleotide were also successful. Most of the previous in vitro assays were performed using [α-32P]-UTP as the radionucleotide and by leaving out GTP to generate a halt during transcription. As the metK leader RNA sequence contains 2 Us within the first 12 nt, we omitted UTP (instead of GTP) and performed the experiment in the presence of an alternate radionucleotide, [α-32P]-CTP. This was predicted to halt transcription upon reaching position U8. Transcription was resumed by addition of all 4 nucleotides. We observed transcripts for the wild-type metK RNA and the metK US box mutation C4A (data not shown), but SAM failed to stimulate termination of transcription at the leader region terminator for either construct. Consistent with previous results (Figure 2.14), the wild-type RNA exhibited an equal ratio of terminated to readthrough products in the presence and absence of SAM, while the metK C4A RNA showed constitutive high readthrough (data not shown). Further optimization will be necessary to identify the precise conditions required to characterize

the metK leader RNA in a purified in vitro system and to obtain a SAM-dependent response.

2.4 Discussion

The B. subtilis S box regulon controls genes involved in sulfur metabolism pathways (Grundy and Henkin 1998). The S box genes exhibit significant variation in the amounts of induction and repression of expression in vivo as well as in the concentrations of SAM required to promote termination in vitro (Tomsic et al. 2008). The metK leader RNA is an exception. The metK-lacZ fusion does not exhibit induction of expression during methionine starvation and SAM fails to promote increased termination at the metK leader region terminator (Tomsic et al. 2008). Although the metK leader RNA shows high conservation in the primary sequence and secondary structure compared to the other S box leader RNAs, the mechanism by which metK is regulated has been elusive. The objective of the current study was to characterize the B. subtilis metK leader RNA and identify the regulatory mechanism of the metK gene.

We constructed a strain, BR151 Pspac-metK, in which the native metK gene was under the control of an IPTG-dependent Pspac promoter. We measured the SAM pools during growth in the presence of IPTG and under IPTG-limiting conditions. It was predicted that growth in the presence of IPTG would result in high SAM pools while growth in the absence of IPTG would result in low in vivo SAM pools. Total SAM pools from the current study were at 150 µM concentration during growth in the presence of IPTG. This concentration is 2-fold lower than the previously measured B. subtilis SAM pools

(McDaniel et al. 2006, Tomsic et al. 2008, Wabiko et al. 1988). Earlier studies of a methionine auxotroph grown in minimal medium in the presence of methionine revealed in vivo SAM pools to be ~300 µM (Tomsic et al. 2008). Depleting methionine from the culture medium leads to a rapid drop in SAM pools to ~50 µM and then below the limit of detection (25 µM) after 1 h of methionine limitation (Tomsic et al. 2008). We had expected the overall SAM pools to be equal or even higher during growth in the nutrient-rich medium, as there would be a constant supply of methionine for SAM production. The apparent difference in total SAM pools may be due to the different growth conditions employed during the two studies. In the current study, cells were grown in nutrient-rich medium, whereas previous SAM pool measurements were performed during methionine starvation using defined minimal medium. It is possible that actively growing cells in rich medium have a higher demand for SAM as compared to cells grown in minimal medium, resulting in a 2-fold lower in vivo SAM concentration during growth in rich medium.

Along with measuring the SAM pools, the β-galactosidase activity of the metK-lacZ fusion was measured. The wild-type metK-lacZ construct exhibited induction of expression after SAM levels in the cell dropped. However, we report that the increase in metK-lacZ expression does not coincide precisely with the drop in SAM levels in BR151 Pspac-metK. A delay seen before the maximum β-galactosidase activity was reached suggests that metK-lacZ expression is induced only after in vivo SAM drops to very low levels. This result is different from the yitJ-lacZ expression profile seen previously (Tomsic et al. 2008), which shows a simultaneous increase of yitJ-lacZ expression as

SAM pools continue to drop. The peculiar expression profile of the metK gene sheds light on its physiological role. The B. subtilis metK gene differs from the rest of the S box genes in that it utilizes methionine to synthesize SAM, while the other S box genes (like yitJ) are closely involved in synthesis of methionine. It is possible that the metK gene is controlled by a global regulatory mechanism, in addition to being controlled by the existing S box mechanism.

We compared the expression of the wild-type metK-lacZ fusion during growth in the presence and absence of IPTG in the BR151 Pspac-metK strain. More than 3-fold induction of metK-lacZ expression was observed in the absence of IPTG (during low SAM pools), whereas the presence of IPTG (when SAM pools are high) resulted in repression of metK expression. Unlike a typical S box gene, expression in the presence of high SAM was not completely off. As metK is an essential gene, the constitutive low expression (~30-40 Miller units) seen in the presence of IPTG ensures that SAM is always being synthesized. Modulation of the in vivo SAM pools using IPTG was different from our previous method, in which SAM was indirectly depleted by removal of methionine. Starvation for methionine results in inhibition of cell growth. Our current results suggest that induction of the wild-type metK-lacZ fusion occurs when the SAM pools are limiting (by control of the chromosomal metK gene expression), and methionine levels are high. We speculate that along with sensing SAM pools, metK also senses in vivo methionine levels to ensure steady growth. This makes sense as SAM is synthesized from methionine by SAM synthetase, encoded by the metK gene.

Analysis of the metK leader RNA deletion mutants revealed the RNA elements necessary to achieve a wild-type response during growth in the presence and absence of IPTG. Deleting either end of the metK leader RNA was not tolerated, as the 5′ and 3′ deletion constructs exhibited a complete loss of metK induction under low SAM conditions (Table 2.2). These results indicate that intact 5′ and 3′ regions are important and play a direct role in the unique regulation of the metK gene. Deleting the metK S box riboswitch element (helices 1 through 4, as well as the terminator element) was tolerated, as the ΔS box mutant showed a response to changing levels of SAM in vivo (Table 2.2). Removal of the terminator element may explain the increased expression shown by this mutant compared to wild-type metK. These preliminary results using the in vivo reporter assays suggested that the metK S box element does not participate directly in the primary mode of metK regulation and that regions at the 5′ and 3′ ends of the leader RNA are crucial.

Phylogenetic analysis of metK leader RNAs from several Firmicutes revealed two highly conserved sequence motifs, the US and DS boxes. These sequences were located on the 5′ and 3′ sides of the metK S box riboswitch element. These elements make the metK RNA unique among the S box leader RNAs. The importance of these sequences in the regulation of the metK gene was first highlighted using lacZ reporter assays of the deletion mutants. The close proximity of the US box sequence to the 5′ end of the RNA prompted us to map the metK RNA transcriptional start-site. Primer extension analysis demonstrated that the US box sequence started precisely at the metK +1, suggesting a role for the US box region in metK regulation, at the level of mRNA stability, transcription

initiation, and/or RNAP recognition. Analysis of the metK leader RNAs revealed significant sequence complementarity of the US box region with a region located ~60 nt downstream of the terminator helix. The DS box sequence was located ~30 nt upstream of the SD region and was predicted to form a pre-SD structure by base-pairing with the US box sequence. The consensus sequences for each of these regions showed 10 out of the 19 US box nucleotides and 5 out of the 16 DS box nucleotides to be 100% conserved. Out of the 6 nucleotides that are part of the GC-rich conserved core, 5 were invariant, suggesting a regulatory role for this sequence.

Accessibility of the DS box sequence to a DNA oligonucleotide, complementary to a region within the DS box sequence, was monitored by cleavage using RNase H. The T7-transcribed wild-type metK RNA exhibited cleavage in the presence of SAM. This result suggested that SAM reduced the US-DS box pairing, which made the DS box sequence available for the antisense oligonucleotide to bind. This makes sense as the US box is involved in helix 1 formation, which is stabilized typically under high SAM conditions. However, the absence of SAM resulted in protection of the DS box sequence from RNase H cleavage. In the absence of SAM, the US box sequence is not involved in helix 1 formation. The result indicates that in the absence of SAM, the US box sequence base-pairs with the DS box region and protects it from RNase H cleavage. We can therefore conclude that the pairing of the US box sequence with the DS box sequence occurs under low SAM conditions. The metK US1 triple mutant exhibited constitutive cleavage regardless of whether

SAM was present. This result indicated that the US1 mutation disrupted the base-pairing interaction and the DS box sequence was no longer protected from RNase H cleavage. It is possible that the conserved core region within the US box sequence recruits a protein factor that further stabilizes the US-DS box pairing under low SAM conditions. However, in the case of the metK US1 mutant, the recruiting site is disrupted, which reduces the US-DS base-pairing interaction.

The U105C substitution in the metK SAM binding pocket was independent of the US and DS box sequences, and the resulting metK S box mutant exhibited equal protection and cleavage in the presence and absence of SAM. This result suggested that the SAM response for pairing of the US and DS box sequences is dependent on a functional S box riboswitch and requires an intact SAM binding pocket. We speculate that the point mutation in the SAM binding pocket created a floppy RNA molecule, which underwent a rapid transition between the paired and unpaired US-DS box interaction, leading to a constant access of the antisense oligonucleotide to the DS box sequence. A more likely explanation is that the RNA was stuck in a conformation which resulted in constant cleavage and protection regardless of the presence of SAM. This mutation is similar to that observed for yitJ SAM binding pocket mutants in which SAM binding is abolished, yet these mutants exhibit high constitutive termination in vitro (see Chapter 3) (Lu et al. 2010). Based on the above RNase H probing results we conclude that the US box sequence base-pairs with the complementary DS box sequence in the absence of SAM.

The ΔS box mutant RNA was protected in the RNase H cleavage assay in the

presence and absence of SAM, suggesting that the US and DS box sequences were always paired in this construct. The ΔS box mutant fails to respond to changing SAM concentrations in vitro. The US1 ΔS box mutant RNA on the other hand showed constitutive high cleavage, suggesting that the US-DS pairing was disrupted due to the sequence change in the conserved core region (data not shown). The unrestricted pairing of the US-DS boxes in the ΔS box mutant would suggest that gene expression is always on. However, the ΔS box mutant responds to changing concentrations of SAM as it exhibited increased expression only under low SAM conditions, using the IPTG limitation assay (Table 2.2). We can reason that the increased expression of the ΔS box mutant was only in part due to the US-DS pairing. It is possible that pairing alone is not sufficient for increased expression, and that an additional stabilizing factor is needed that is present in vivo under conditions when SAM concentration is low, but methionine levels are high.

The extensive in vivo analysis of the US and DS box mutants further validated that the US-DS pairing interaction is important for metK gene expression. Overall, changes in the US box sequence showed a greater variation in expression of the metK-lacZ fusion as compared to changes in the DS box sequence. Several US box mutants (for example, A2G, G3A, A7G, U8A, U8G and U8Δ) exhibited increased readthrough in the absence of IPTG, consistent with wild-type metK-lacZ. Some mutants exhibited constitutive high readthrough (C4A and C4G) while others showed constitutive low readthrough (A2C, A2U, G3U and mutations of G5 and G6 positions). The results indicated that maintenance of (or a change to) a purine residue was tolerated (with the

exception of the U8 position located within a loop region). The US box mutational analysis supported our observation that maintenance of the US-DS base-pairing results in increased metK expression when SAM pools are low.

Compared to sequence changes in the US box sequence, the majority of the DS box mutants exhibited 3- to 4-fold reduced β-galactosidase activity in the absence of IPTG compared to wild-type metK-lacZ. Only two mutants (U265C) exhibited increased β-galactosidase activity during growth in the absence of IPTG that was comparable to wild-type metK-lacZ. U265C-lacZ showed increased readthrough during growth in the absence of IPTG. U265C forms a Watson-Crick base-pair with the G3 position in the US box sequence. The US box mutation G3A, which formed a Watson-Crick base-pair with the position U265 in the DS box sequence, also exhibited increased readthrough during growth in the absence of IPTG. However, the double mutation G3A-U265C, which disrupts the base-pairing interaction between the US and DS boxes, failed to show increased readthrough under low SAM pools (Woltjen and Grundy, data not shown).

The underlying conclusion from the mutagenesis was that maintenance of the pre-SD structure (either through Watson-Crick or wobble base-pairing) results in increased expression during growth in the absence of IPTG. These data support the model that under low SAM conditions, when helix 1 of the S box element is not formed, the US box sequence base-pairs with the DS box sequence, leading to upregulation of gene expression. There were two exceptions to this model. We failed to observe an increase in readthrough of the A7U-U261A double mutant under low SAM conditions. This result

was unexpected because the compensatory mutation was able to form a Watson-Crick base-pair. However, it is possible that a purine is required at position +7 in the US box sequence, as the only point mutation tolerated at this position was A7G. We also have evidence that shows that an A7G-U261C double mutant responded to IPTG limitation, and exhibited wild-type expression values (Woltjen and Grundy, data not shown). The second exception was that any disruption of the conserved core region was not tolerated. Although the potential to base-pair was maintained, the majority of substitutions in the conserved core disrupted regulation of metK. Only one double mutant (C4U-G264A) responded to IPTG limitation, but showed lower β-galactosidase activity than wild-type metK-lacZ (Figure 2.11). We speculate that the strict sequence preference shown in this region reflects a recognition site for an alternate trans-acting factor that binds the US box sequence only in the presence of SAM. However, in the absence of SAM the pairing of the US-DS box sequences acts as a protector and prevents access to such a trans-acting factor(s). It is possible that SAM synthetase itself binds to this region to control gene expression as a feedback inhibition mechanism for metK gene regulation. Additional experiments will be required to test these possibilities.

Earlier studies have shown that ethionine, the S-ethyl analog of methionine, can be incorporated into proteins in place of methionine (Cheng et al. 1968, Gross and Tarver 1955). Only a few organisms, including B. subtilis, can further metabolize ethionine into S-adenosylethionine (SAE) (Allen et al. 1986). SAE differs from SAM only in that the methyl group of SAM is replaced with an ethyl group. SAE can

therefore be toxic if synthesized by the cell. Prototrophic strains BR151MA-G5U and BR151MA-G6A exhibited a growth defect but were able to survive in the presence of ethionine as compared to the isogenic wild-type control strain BR151MA-ZKO, which exhibited no growth. The absence of methionine from the growth medium suggested that ethionine could not be outcompeted, resulting in death of the wild-type strain. The preferential toxicity of ethionine for the wild-type strain suggested that normal SAM synthetase activity resulted in incorporation of ethionine leading to production of SAE, which eventually killed the cells. We predicted that the survival of the G5U and G6U mutants in the presence of ethionine was due to defective SAM synthetase activity, which in turn resulted in significantly lower levels of SAE in vivo. The in vivo SAM pools from the auxotrophic mutant strain BR151-G5U were indeed below the level of detection when compared to the SAM pools from the isogenic wild-type strain. This suggested that the single nucleotide substitution was sufficient to lower the SAM synthetase activity. These results provided physiological evidence of the effect of changing the US box sequence on in vivo SAM pools and provided further evidence that the US box element is important for metK regulation.

It is important to note that the single nucleotide substitution in the leader RNA sequence of the essential metK gene was tolerated by the cell and resulted in a reduction in SAM synthetase activity (as reflected by the lowered SAM pools). Mutants with sequence changes in the metK coding region have been isolated which also exhibit a growth defect due to low SAM synthetase activity, subsequently resulting in low SAM pools in vivo (McDaniel et al. 2006, Wabiko et al. 1988). One such mutant showed

derepression of S box gene expression (McDaniel et al. 2006). Another study implied that a minimum SAM pool concentration of ~25 µM was optimal for growth (Wabiko et al. 1988). Although the G5U SAM pools were below the detection limit using our method, we can infer that the SAM pools are in a similar range (~25 µM) based on the study conducted by Wabiko and coworkers.

In vitro transcription assays were performed to test the effect of the US and DS box mutations on SAM-directed transcription termination. The in vitro results were not consistent with the results obtained from the in vivo reporter assays. Addition of SAM failed to stimulate termination at the wild-type metK leader region terminator. The metK mutants that showed a response to changing concentrations of SAM in vivo failed to exhibit a response in vitro. These results suggested that a potential factor(s) that has not been identified yet was missing from the purified in vitro system and hence we were not able to see an increase in SAM-directed termination. Overall, the in vitro analyses were inconclusive. Further studies are necessary to observe a SAM-dependent response in vitro. In vitro transcription can be performed in the presence of cellular extracts generated from exponentially growing cells to see if addition of the extract improves the response of metK in vitro.

Upon examining the in vitro data closely, we found significant variation in the transcription yields of some mutants compared to the wild-type construct. Results from multiple repeats of the in vitro transcription experiments revealed that some metK mutants showed very high band intensities while others showed bands that were barely visible (data not shown). This pattern remained consistent despite the same concentration

of template DNA added to the reaction. A similar result was obtained with hybrid leader RNA constructs of metK and yusC (see Chapter 4). These results suggest that transcription initiation at the metK transcriptional start-site by the RNAP is inconsistent and that mutating the US box sequence interferes with the proper docking or recognition of the DNA sequence for efficient transcription by the RNAP. Another explanation is that altering the metK US box sequence affects the processivity of the RNAP and in turn affects transcription elongation. Time-course analyses could provide information regarding the rate of RNAP elongation. We have identified the experimental conditions to generate a halted complex in order to perform single-round transcription reactions using the metK template. However, additional studies are necessary to improve this in vitro transcription system.

Recent advances in the field of mRNA decay and turnover have revealed a number of important RNases in B. subtilis (Bechhofer 2011). The essential RNase J1 is a broad-specificity endonuclease that also possesses the unique 5′-to-3′ exonuclease activity, previously thought to be lacking in bacteria (Even et al. 2005). RNase J1 is involved in turnover of mRNA intermediates and is predicted to participate directly in the initiation of mRNA decay. The non-essential RNase J2 is also known to contribute to some of these events (Even et al. 2005). The recently identified RNase Y is an essential enzyme that exerts an effect on global mRNA stability in B. subtilis (Shahbabian et al. 2009). Together, these studies have shown that the 5′ end of a transcript plays a crucial role in the overall RNA stability, such that 5′ monophosphates are more susceptible to degradation by RNases than 5′ triphosphates.

In the second part of our study, we examined the effect of the US box sequence (which is located precisely at the 5′ end of the transcript) on metK leader RNA stability. The wild-type RNA, when compared to the two US box mutant transcripts, had the longest half-life (t1/2 ~15 min) in the absence of IPTG, while the half-life dropped to only ~1.2 min after 20 min in the presence of IPTG. This suggests that the US-DS pairing in the wild-type transcript is disrupted in the presence of SAM, leading to a significant decrease in transcript stability. These data are in good agreement with the results for the wild-type metK construct obtained using the RNase H cleavage assay. The RNase H results revealed that the DS box sequence was highly accessible in the presence of SAM (Figure 2.9, left panel), suggesting that the US-DS box interaction is disrupted under these conditions.

The ~18-fold reduction in t1/2 of the US1-lacZ RNA during growth in the absence of IPTG compared to wild-type metK-lacZ RNA indicated that the mutation disrupted the US-DS pairing, resulting in significantly lowered RNA stability in the absence of SAM. These results are in agreement with the results from the RNase H assays during which the US1 mutant exhibited cleavage even in the absence of SAM (Figure 2.9, middle panel). The t1/2 of the G5U mutant was down only 2-fold during low SAM conditions relative to wild-type. This suggests that the single nucleotide substitution did not alter the base-pairing interaction as much as the triple base substitution of the US1 mutant and hence resulted in a less severe impact on transcript half-life. All three constructs exhibited similar half-lives in the presence of high SAM pools (ranging from 1.2 to 2.0 min). These results suggested that the presence of SAM

destabilizes the US-DS pairing in the case of the wild-type transcript, while sequence changes in the US box region destabilize the pairing regardless of the presence of SAM.

Earlier studies have determined the half-life of the B. subtilis metK transcript as ~2.0 min (Smith et al. 2010b). This value is lower than the transcript half-life measured in the current study (15 min). The difference can be attributed to the different growth conditions used during these two studies. The previous analysis was conducted during methionine starvation using minimal medium monitored over much longer periods of growth. Our analysis, on the other hand, has been performed in rich growth medium. It is possible that growth in a defined medium resulted in an overall shorter mRNA half-life. Additional RNA stability assays performed during methionine starvation conditions may be helpful to explain the differences observed in metK-lacZ gene expression during methionine starvation and IPTG-limitation assays.

The RNA abundance for all three constructs was nearly identical when grown in the presence of IPTG, suggesting minimal change in the total RNA during growth in the presence of high in vivo SAM pools. These results indicate that a steady growth rate leads to constant RNA abundance. The stability and abundance results together validate the importance of the US-DS pairing interaction and suggest that the US box is protected from degradation by pairing with the DS box sequence. The results support our model that the US-DS pairing interaction is stabilized under low SAM conditions. In addition, we predict that the US-DS pairing in the absence of SAM, but in the presence of methionine, is stabilized further by a separate protein factor, which binds to the potential recruiting site within the conserved core region.

Based on the data obtained from the current study, we propose a model for the regulation of metK gene expression in B. subtilis, in which the metK gene is subjected to regulation at the level of mRNA stability, in addition to being under the control of the S box regulon (Figure 2.16). It is also possible that regulation occurs at the level of transcription initiation, or involves a combination of both RNA stability and transcription initiation.

Figure 2.16 Proposed model for regulation of B. subtilis metK gene expression. The 5′ end of the metK leader RNA with the US box sequence (red line) is transcribed by RNAP (panel A). The RNA undergoes a conformational change depending on the concentration of SAM, such that in the presence of SAM, the RNA forms the terminator (T) (panel D) and gene expression is off. The RNA forms the antiterminator (AT) (blue-orange) under low SAM conditions. This leads to transcription of the DS box sequence (green line) with the potential for the US-DS pairing to occur (panel B). The next checkpoint is methionine dependent, such that under low SAM and low methionine (Met) conditions, the US-DS pairing is unstable and the metK RNA is susceptible to degradation by an RNase (brown pie), so that gene expression is off (panel C). Under conditions with low SAM and high methionine, a methionine-dependent factor (blue oval) stabilizes the US-DS pairing, resulting in stabilization of the RNA, so that gene expression is on (panel E). Under high SAM and high methionine concentrations, the RNA refolds into the terminator (T) conformation and unzips the US-DS pairing interaction. This leaves the RNA susceptible to a potential degradation event, resulting in reduced RNA stability, and gene expression is off. SD, Shine-Dalgarno region; AUG, start codon.

Figure 2.16 (image, panels A-F). Condition labels: - SAM, + SAM, - MET, + MET. Structure labels: AT, T, US box, DS box, SD, AUG, RNase, Factor. Outcome labels: TERMINATE, DEGRADED, STABLE; Gene expression ON or OFF.

The fact that the wild-type metK-lacZ construct exhibits induction of expression only under conditions in which SAM pools are low but methionine levels are high suggests that growth rate is important for metK regulation. It is possible that the metK gene is under the control of a superimposed global regulatory mechanism such as the stringent response, which occurs under high stress conditions such as amino acid starvation. Further investigation using ppGpp, the stringent response alarmone, or ppGpp synthetase (RelA, encoded by the relA gene) could shed light on such a mechanism. We have evidence to suggest such a global regulatory mechanism might exist, as sequences resembling the metK US and DS boxes have been observed in a distinct regulatory element that is not part of the S box regulon (Grundy and Henkin, unpublished data). Additional studies will be necessary to verify whether these elements are regulated by a common mechanism that also regulates expression of the metK gene.

175 CHAPTER 3 IN VITRO INVESTIGATION OF THE SAM-BINDING POCKET OF THE Bacillus subtilis yitj S BOX RIBOSWITCH 3.1 Introduction The S-adenosylmethionine (SAM)-binding S box riboswitch is a widespread class of riboswitches, originally identified upstream of 11 transcriptional units from the Grampositive bacterium B. subtilis (Grundy and Henkin 1998). The S box regulon controls the expression of 26 genes involved in import, synthesis and recycling pathways for methionine, cysteine and SAM (Grundy and Henkin 1998). The B. subtilis S box riboswitch consists of a SAM-sensing aptamer domain followed by an expression platform which functions to regulate the downstream genes by transcription attenuation (Epshtein et al. 2003, McDaniel et al. 2003, Winkler et al. 2003). The intracellular concentration of SAM dictates whether the RNA folds into one of two mutually exclusive conformations, resulting in either continuation of transcription into the downstream coding region (through formation of the antiterminator) or termination at the leader region terminator (through formation of the terminator). Termination is dependent on 151

176 formation of a third structure, the anti-antiterminator, which competes with the antiterminator. When SAM levels are low, the anti-antiterminator structure is destabilized, resulting in formation of the antiterminator structure and expression of the downstream coding regions. When SAM levels are high, the anti-antiterminator structure is stabilized, preventing formation of the antiterminator and allowing formation of the terminator and premature termination of transcription. 152

Figure 3.1. B. subtilis yitJ leader RNA structural model. The structural model is based on phylogenetic analyses (Grundy and Henkin 1998) and is shown in the terminator conformation. Red and blue residues indicate the alternate pairing for formation of the antiterminator, shown above the terminator. Boxed numbers indicate helices (or paired regions) 1-4; T, terminator; AT, antiterminator; AAT, anti-antiterminator. Numbering of residues is relative to the predicted transcription start-site. Adapted from (Grundy and Henkin 1998).

178 Phylogenetic analyses revealed that the S box aptamer domain consists of four helical segments (P1-P4) organized around a four-way junction (Figure 3.1). Support for the secondary structural model of the S box leader RNA was provided by data from extensive genetic studies, which confirmed a pseudoknot structure between the loop of helix P2 and junction region between helices P3 and P4 (J3/4) (McDaniel et al. 2005). This tertiary structural element was validated by two independent high-resolution crystal structures from Thermoanaerobacter tengcongensis (Montange and Batey 2006) and B. subtilis (Lu et al. 2010). The crystal structures are nearly superimposable and establish the global architecture of the S box riboswitch aptamer domain. The structural analyses revealed that P1/P4 and P2/P3 form two sets of coaxially stacked helices that are packed together at a ~70º angle. The ligand-binding pocket is situated between the minor grooves of helices P1 and P3, enveloping SAM into a tightly packed interface (Lu et al. 2010, Montange and Batey 2006). SAM, the molecular effector of the S box riboswitch, is synthesized from methionine and ATP by SAM synthetase, encoded by the metk gene. Growth in the presence of methionine results in high SAM pools, while growth in the absence of methionine results in low SAM pools. The physiological concentration of SAM in a methionine auxotrophic strain grown in the presence of methionine is ~300 µm (Tomsic et al. 2008). Depleting methionine from the culture medium leads to a rapid drop in SAM pools to <50 μm and then below the limit of detection (25 µm) after 1 h of methionine limitation (Tomsic et al. 2008). The yitj gene encodes methylenetetrahydrofolate reductase, an enzyme in 154

179 methionine biosynthesis (Grundy and Henkin 1998). The B. subtilis yitj S box RNA has been the most carefully characterized S box riboswitch and many of the early studies on the S box regulon have been conducted using the yitj leader RNA (Grundy and Henkin 1998, McDaniel et al. 2003, McDaniel et al. 2005, Winkler et al. 2001). The yitj leader RNA is highly sensitive to changes in SAM concentrations and gene expression of a yitjlacz transcriptional fusion is induced only when intracellular SAM pools are low (Tomsic et al. 2008). yitj variants containing mutations in highly conserved regions in the leader RNA exhibit loss of repression during growth in the presence of methionine (McDaniel et al. 2003, McDaniel et al. 2005). Several studies of the wild-type yitj RNA have examined the response of the yitj riboswitch to SAM using an in vitro transcription termination assay (McDaniel et al. 2003, McDaniel et al. 2005, Tomsic et al. 2008). yitj exhibits half-maximal termination in vitro at a concentration of 0.35 µm SAM (Tomsic et al. 2008). Size-exclusion filtration assays were used to determine the affinity of the yitj leader RNA for SAM. The wildtype yitj RNA exhibits high affinity for SAM with an apparent K d ~20 nm. This is consistent with data obtained from previous analyses that reported K d values between 4 and 10 nm (Lim et al. 2006, Winkler et al. 2003). The yitj leader RNA is highly specific for SAM and discriminates strongly against closely related natural analogs such as S- adenosyl- L -homocysteine (SAH) and S-adenosyl- L -cysteine (SAC). The yitj RNA exhibits 100- and 10,000-fold lower affinities for SAH and SAC, respectively, as compared to the affinity for SAM (McDaniel et al. 2003, Winkler et al. 2003). yitj variants containing mutations in highly conserved regions of the leader RNA exhibit loss 155

180 of SAM binding and SAM-directed transcription termination in vitro (McDaniel et al. 2003, McDaniel et al. 2005). Based on these results, we conducted a study to generate B. subtilis yitj variants that exhibit higher SAM affinity compared to wild-type yitj or a change in ligand specificity. We targeted residues in the SAM-binding pocket of the yitj leader RNA that make direct and indirect contacts with the SAM molecule. First, we employed Systematic Evolution of Ligands by EXponential enrichment (SELEX) on the B. subtilis yitj RNA. The goal of this study was to subject a pool of mutant yitj transcripts to iterative rounds of selection in the presence of lower concentrations of SAM in order to obtain aptamers with either higher affinity or modified specificity. However, due to technical difficulties using SELEX, our attempts to obtain such aptamers were unsuccessful. The first part of this chapter will describe the methods as well as the preliminary results obtained during the SELEX study. In a subsequent study, we used a more direct approach in which we mutated individual residues in the SAM-binding pocket. This targeted mutational analysis of the yitj leader RNA was conducted as part of the crystal structure study of the B. subtilis yitj S box riboswitch (in collaboration with the laboratory of Dr. Ailong Ke, Cornell University) (Lu et al. 2010). In this study, 32 mutants were tested for effects on in vitro transcription termination (V. A. Pradhan) and SAM binding (J. Tomšič) (Lu et al. 2010). The second part of this chapter will focus on the results obtained during these experiments. A separate study was conducted that investigated the response of certain yitj 156

181 variants to a series of SAM analogs using in vitro transcription termination assays. We found that some of these mutants retained a response to SAM and a limited number of SAM analogs. However, we failed to obtain variants that exhibited specificity only toward a SAM analog. These results will be discussed in the last part of this chapter. 3.2 Materials and methods Construction of DNA templates for in vitro selection DNA templates used for T7 RNA polymerase (RNAP) transcription during the in vitro selection were generated using complementary pairs of overlapping DNA oligonucleotides (Integrated DNA Technologies, Coralville, IA). Briefly, the 5 pair contained the phage T7 RNAP promoter sequence fused to the +14 position of the yitj leader region. The remaining complementary pairs contained the wild-type yitj sequence, each with a 4-nt 3 overhang complementary to the 5 region of the adjacent pair. The terminal pair contained the 3 leader RNA sequence that ended at +169, near the 5 side of the transcription terminator. This position was selected so that helix P1 formation could be monitored without the competing antiterminator structure. Each internal oligonucleotide pair was phosphorylated using T4 polynucleotide kinase (New England Biolabs, Beverly, MA). The pairs were then mixed, incubated at 95ºC and slow-cooled to room temperature to permit annealing. The paired oligonucleotides were ligated using T4 DNA ligase (New England Biolabs) per the manufacturer s instructions. The resulting DNA template was amplified using the flanking 5 and 3 DNA oligonucleotides as primers for PCR using Pfu DNA polymerase (Stratagene, La Jolla, CA). The final 157

template DNA was purified with a Qiagen PCR cleanup kit (Qiagen, Chatsworth, CA) and sequenced by Genewiz (South Plainfield, NJ). For construction of the pool of mutant yitJ DNA templates, the same procedure was followed except that oligonucleotides with the desired sequence changes were used in place of the wild-type oligonucleotides. The spiked oligonucleotides used to generate the mutant DNA templates are listed in Table 3.1. The percentage of spiking and the nucleotides that were mutated are indicated in the same table. The percentage of spiking was calculated based on the total number of positions that were targeted for mutagenesis, as well as the maximum number of mutations expected per output molecule. The following formula was used to determine the percentage of spiking per oligonucleotide:

$$P(x) = \frac{n!}{x!\,(n-x)!}\; C_m^{\,x}\,(1 - C_m)^{\,n-x}$$

where
P(x) = probability of x mutations per oligonucleotide
x = number of mutations per oligonucleotide
n = number of bases in oligonucleotide to be mutagenized
C_m = total fraction concentration of mutant nucleotides

We defined the value of x as 4 mutations per molecule and n as 7. A C_m value of 0.50 resulted in a percentage of ~17% for each of the 3 contaminating nucleotides. Therefore, the ratio of wild-type to mutant nucleotides was set to 49:51. DNA fragments (180 bp) containing wild-type or mutant sequences were cloned into plasmid pGEM7Zf+ (Promega). The resulting constructs were introduced into E. coli

strain DH5α (φ80dlacZΔM15 endA1 recA1 hsdR17 (rK- mK+) thi-1 gyrA96 relA1 Δ(lacZYA-argF)U169) by transformation and grown on Luria-Bertani (LB) medium (Miller 1972). Transformants were screened for formation of white colonies on LB medium containing X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold Biotechnologies, St. Louis, MO), indicating disruption of the multiple cloning site within the vector. Plasmid DNA was isolated from the transformants using a Promega Wizard prep kit (Madison, WI) and sequenced to confirm the mutations.

Table 3.1 Oligonucleotide primers for the B. subtilis yitJ leader RNA

Primer             Sequence (5′ to 3′) a, b, c
yitJ (14-50)       AAAATTTCATATCCGTTCTTATCAAGAGAAGCAGAGG
yitJ (14-46)RC     TGCTTCTCTTGATAAGAACGGATATGAAATTTT
yitJ (51-89)       GACTGGCCCGACGAAGCTTCAG[CAAC]CGGTGTAATGGCG
yitJ (47-85)RC     ATTACACCG[GTTG]CTGAAGCTTCGTCGGGCCAGTCCCTC
yitJ (90-126)      ATCAGCCATGACCAAG[GTG]CTAAATCCAGCAAGCTCG
yitJ (86-122)RC    CTTGCTGGATTTAG[CAC]CTTGGTCATGGCTGATCGCC
yitJ ( )           AACAGCTTGGAAGATAAGAAGAGACAAAATCACTGACAAAGTC
yitJ ( )RC         GACTTTGTCAGTGATTTTGTCTCTTCTTATCTTCCAAGCTGTTCGAG

a Bold positions within brackets indicate variation from the wild-type sequence. Each position contained 49% wild-type sequence and 17% for each mutant nucleotide.
b The underlined sequence indicates the 4-nt overhang for each oligonucleotide pair designed for cassette ligation.
c RC, reverse complement

Site-directed mutagenesis

The transcriptional fusion vector pFG328 (Grundy et al. 1993) containing wild-type yitJ DNA was used as a template for oligonucleotide-directed mutagenesis as described previously (McDaniel et al. 2003). For the purpose of the in vitro transcription termination assay, the strong B. subtilis glyQS promoter replaced the native yitJ promoter as described previously (McDaniel et al. 2003). The glyQS promoter was fused to the +14 position of the yitJ leader sequence. The yitJ sequence continued into the coding region and had an endpoint that was 92 bp downstream of the transcription terminator. DNA oligonucleotides containing the desired mutations were designed as primers for PCR amplification of pFG328-yitJ DNA using Pfu DNA polymerase. The resulting products were subjected to digestion with DpnI (New England Biolabs) to remove the starting wild-type template and introduced into XL2-Blue ultracompetent cells by transformation as per the manufacturer's instructions (Stratagene). The plasmid DNA was isolated using a Promega Wizard prep kit and sequenced to confirm the mutations. DNA templates for in vitro transcription termination were generated from the plasmid DNA by PCR.

In vitro transcription assays

Templates for in vitro transcription reactions were constructed to generate a transcription start-site 17 nt upstream of the start of P1 as described above (section 3.2.2) (Figure 3.1). ApC was used as the initiating dinucleotide and GTP was omitted to halt the transcription at position +16 (McDaniel et al. 2003). Single-round transcription reactions were carried out (as in Chapter 2) as described previously (Grundy et al. 2002, McDaniel et al. 2003). Templates were transcribed in the presence or absence of SAM (140 µM) or SAM analogs in a reaction volume of 35 µl. Transcription products were resolved by

denaturing PAGE and visualized by PhosphorImager (Molecular Dynamics) analysis. Percent termination was plotted as a function of ligand concentration (GraphPad Prism). The reactions were performed in duplicate and reproducibility was ±5%.

RNase H cleavage assay

RNase H cleavage was carried out as described previously (McDaniel et al. 2003). The DNA template for T7 RNAP transcription was generated using complementary pairs of oligonucleotides as described above (section 3.2.1). Transcription was carried out using a MEGAshortscript T7 RNAP transcription kit (Ambion, Austin, TX) in the presence of 0.5 mM [α-32P]-UTP (800 Ci/mmol [30 TBq/mmol], GE Healthcare, Piscataway, NJ) for radiolabeling. RNAs were transcribed in the presence or absence of 2.5 mM SAM, as indicated, for 30 min at 37ºC. After transcription, a DNA oligonucleotide (5 µM) complementary to positions or of the B. subtilis yitJ leader RNA was added and hybridized for 5 min at 37ºC. RNase H (1 µl, 10 U/µl; Ambion) was then added to the reaction and incubated for 10 min at 37ºC. The final volume of the reaction was 30 µl. The reactions were stopped by phenol-chloroform extraction. The resulting RNA products were resolved by denaturing PAGE and visualized by PhosphorImager analysis. The percentage of RNAs protected from cleavage was calculated as the amount of full-length RNAs relative to the total amount of RNA in each reaction. The reactions were performed in duplicate and reproducibility was ±10%.
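The spiking formula given in the template-construction part of the methods is a binomial probability, and it can be evaluated directly. The following is a minimal sketch using the values stated in the text (n = 7 targeted positions, x = 4 mutations, C_m = 0.50); the function and variable names are illustrative and not part of the original protocol.

```python
from math import comb

def p_mutations(x, n, c_m):
    """Binomial probability of exactly x mutations among n spiked positions."""
    return comb(n, x) * c_m**x * (1 - c_m) ** (n - x)

n = 7        # positions targeted for mutagenesis (from the text)
c_m = 0.50   # total fraction of mutant nucleotides per position (from the text)

# Full distribution over 0..n mutations per molecule
dist = [p_mutations(x, n, c_m) for x in range(n + 1)]

p_four = dist[4]                                       # probability of exactly 4 mutations
mean_mutations = sum(x * p for x, p in enumerate(dist))  # equals n * c_m
per_nucleotide = c_m / 3                               # share of each of the 3 non-wild-type bases
```

With these inputs the mean number of mutations per molecule is n × C_m = 3.5, and each of the three non-wild-type nucleotides contributes C_m / 3, or about 17%, at a spiked position.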

186 3.3 Results SAM-dependent structural transition of wild-type yitj leader RNAs with distinct 3 endpoints The model for S box gene regulation predicts that the anti-antiterminator (helix P1) is stabilized in the presence of SAM, while the absence of SAM results in formation of the antiterminator structure. We probed the structure of helix P1 by using an oligonucleotide that is complementary to the 3 side of helix P1 and detected annealing by sensitivity to cleavage with RNase H. The RNase H enzyme cleaves only RNA-DNA hybrids. Formation of helix P1 in the presence of SAM prevents the complementary oligonucleotide from binding to the 3 side of helix P1. This leads to protection from RNase H cleavage. In the absence of SAM, the 3 side of helix P1 is single-stranded and available for the oligonucleotide to bind, resulting in cleavage by RNase H. Thus, the yitj leader RNA exhibits structural transition in a SAM-dependent manner. We also probed the structure of the antiterminator by using a separate oligonucleotide that is complementary to the 3 side of the antiterminator. As the antiterminator is formed in the absence of SAM, the 3 side of the antiterminator is not available for the oligonucleotide to bind. Hence, protection of the RNA from RNase H cleavage is expected in the absence of SAM. In the presence of SAM, the 5 side of the antiterminator is sequestered in formation of the alternate helix P1. This results in the 3 side of the antiterminator to be available for the RNase H oligonucleotide to bind, resulting in RNase H cleavage. Thus, sensitivity to RNase H cleavage was used as a measure of the SAM-dependent structural transition shown by the yitj leader RNA. 162

We tested the RNase H cleavage sensitivity of two wild-type yitJ templates with distinct 3′ endpoints. Figure 3.2 shows the endpoints of two yitJ templates and the positions of two complementary oligonucleotide primers used to probe for SAM-dependent structural transitions. The shorter yitJ template (159 nt), used to test the SAM-dependent structural transition of helix P1, should form helix P1 in the presence of SAM and be protected from oligonucleotide 1-directed RNase H cleavage. The longer template (187 nt), used to detect the SAM-dependent structural transition of the antiterminator, should form the antiterminator in the absence of SAM and be protected from oligonucleotide 2-directed RNase H cleavage.
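The predictions in the preceding paragraphs can be summarized as a small lookup. This is only a sketch of the expected outcomes under the S box model as described in the text, not experimental data; the template keys "AAT" and "AT" are shorthand chosen here for the 159-nt and 187-nt templates.

```python
def expected_outcome(template, sam_present):
    """Predicted RNase H result under the S box model described in the text.

    "AAT": 159-nt template probed with oligonucleotide 1 (3' side of helix P1).
    "AT":  187-nt template probed with oligonucleotide 2 (3' side of the antiterminator).
    """
    if template == "AAT":
        # Helix P1 forms with SAM and blocks oligonucleotide 1 binding.
        return "protected" if sam_present else "cleaved"
    if template == "AT":
        # The antiterminator forms without SAM and blocks oligonucleotide 2 binding.
        return "protected" if not sam_present else "cleaved"
    raise ValueError(f"unknown template: {template!r}")

# Enumerate all four predicted combinations
predictions = {
    (template, sam): expected_outcome(template, sam)
    for template in ("AAT", "AT")
    for sam in (False, True)
}
```

The two templates give opposite readouts for the same SAM condition, which is what allows them to serve as complementary probes of the structural transition.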

Figure 3.2 The B. subtilis yitJ leader RNA. The boxed nucleotides depict the 7 positions within the SAM-binding pocket targeted for mutagenesis; SAM is shown as an asterisk in magenta. Endpoints of the two templates (yitJ-AT, 187 nt; yitJ-AAT, 159 nt) are designated by arrows. RNase H oligonucleotides are shown by black lines, oligonucleotide 1 is complementary to sequence ; oligonucleotide 2 is complementary to sequence

Figure 3.3A lane layout (template yitJ-AAT; P, 159 nt; C, ~145 nt):

Lane       1   2   3   4
Oligo 1    -   -   +   +
RNase H    +   -   +   +
SAM        -   +   -   +


Figure 3.3 SAM-dependent structural transition of the B. subtilis yitJ S box RNA using RNase H cleavage assay. A. RNase H sensitivity of the yitJ-AAT transcript (159 nt; cleavage product ~145 nt). B. RNase H sensitivity of the yitJ-AT transcript (187 nt; cleavage product ~170 nt). SAM was added at a final concentration of 2.5 mM. The predicted sizes of the RNA products are shown to the left. P, Protection; C, Cleavage.

190 The addition of oligonucleotide 1 to the RNase H cleavage reaction triggered cleavage only in the absence of SAM, consistent with the S box model in which the region targeted by oligonucleotide 1 is sequestered only when SAM is present (Figure 3.3A, lane 3). The yitj-aat transcript is available for oligonucleotide 1 to hybridize with the 3 side of helix P1, resulting in cleavage. However, the yitj-aat transcript was protected in the presence of SAM, suggesting that the oligonucleotide was unable to bind to the 3 side of helix P1, preventing RNase H cleavage (Figure 3.3A, lane 4). Figure 3.3B illustrates the structural transition of the longer T7 RNAP-transcribed yitj transcript, which includes the antiterminator sequence of the yitj RNA (yitj-at). Addition of oligonucleotide 2 directed cleavage of the transcripts only in the presence of SAM. This indicates that the 5 side of the antiterminator sequence is sequestered in formation of the terminator structure in the presence of SAM. As a result, the 3 side of the antiterminator is available for oligonucleotide 2 to hybridize, resulting in RNase H- directed cleavage. These results are consistent with the S box model in which the antiterminator structure is formed only in the absence of SAM Mutagenesis of a region in helix P3 to generate a pool of yitj variants We developed an experimental system to identify S box leader sequence determinants important for ligand affinity and specificity. Direct contacts between the T. tengcongensis yitj S box RNA and SAM were revealed in helix P3 in the crystal structure (Montange and Batey 2006). These ligand-rna interactions were confirmed in a separate crystal structure study of the B. subtilis yitj RNA in complex with SAM (Lu et 166

191 al. 2010). We targeted nucleotides in the SAM-binding pocket of the B. subtilis yitj RNA for mutagenesis, in an effort to obtain variants with either higher affinity for the natural ligand or altered specificity towards an analog (Figure 3.4). Positions C45, A46, A47 and C48 on the 5 side and G77, U78 and G79 on the 3 side of helix P3 were targeted for mutagenesis (Figure 3.4). Positions C45, A46, U78 and G79 are predicted to interact directly with the SAM molecule, while A47, C48 and G77 are predicted to be in close proximity to SAM. P1 SAM P3 Nucleotides targeted for mutagenesis Figure 3.4 A close-up view of the B. subtilis yitj SAM-binding pocket. Helix P1 is depicted in grey; helix P3 is in cyan; SAM is in magenta; the sequence targeted for mutagenesis is highlighted in dark blue. This figure was generated using the PyMol software; Protein Data Bank (PDB) accession code 3NPB (Lu et al. 2010). 167

192 Spiked DNA oligonucleotides were used to generate a pool of mutant DNA templates (Table 3.1). DNA sequencing of 27 representatives from the pool of mutant DNA templates revealed the presence of 1-6 changes, with an average of 4 changes per molecule (data not shown). These data confirmed the efficiency of mutagenesis. We simultaneously generated a wild-type yitj template as a positive control. A T7 RNAP promoter sequence was attached to the 5 end of the wild-type and mutant DNA templates to allow for T7 RNAP-mediated transcription Effect of leader region mutations on the SAM-dependent structural transition of the yitj RNA The RNase H assay served as one of the experimental tools to characterize the various yitj mutants. Based on the preliminary results for wild-type RNA (section 3.3.1), we tested the sensitivity of the wild-type and pool of mutant yitj transcripts towards RNase H cleavage. The wild-type yitj transcript (with the yitj-aat endpoint) exhibited cleavage in the absence of SAM (Figure 3.5, lane 3). Addition of 40 µm SAM also resulted in cleavage similar to that seen in the absence of SAM (Figure 3.5, lane 4). This was surprising considering that yitj exhibits half-maximal termination at a SAM concentration of 0.35 µm and an apparent K d of ~20 nm (Tomsic et al. 2008). It is possible that the RNase H assay is not as sensitive as the in vitro transcription termination or the SAM binding assays and hence requires a much higher concentration of SAM to detect a conformational change in the RNA. As expected, addition of 2.5 mm SAM resulted in protection of the wild-type yitj transcript from RNase H cleavage (Figure 3.5, 168


lane 4). However, the pool containing mutant yitJ leader RNAs exhibited cleavage regardless of the presence of SAM (Figure 3.5, lanes 6-8). Even the addition of a high concentration of SAM (2.5 mM) did not change the response of the pool of yitJ mutants in the RNase H assay. This result was not surprising as the pool of yitJ mutants contains alterations within a highly conserved region of the leader sequence.

Figure 3.5 SAM-dependent structural transition of the pools of wild-type and mutant yitJ transcripts. SAM was added at the indicated concentrations. P, protected transcript; C, cleaved transcript.
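To put the comparison between the reported K_d (~20 nM) and the SAM concentrations used in the RNase H assay in perspective, the expected equilibrium occupancy can be estimated. The sketch below assumes simple one-site equilibrium binding with ligand in large excess over RNA; this is an idealization introduced here, and it does not model the conditions of the RNase H assay itself (for example, competition with the oligonucleotide or folding kinetics).

```python
def fraction_bound(ligand_nm, kd_nm):
    """Fractional occupancy for one-site equilibrium binding (ligand in excess)."""
    return ligand_nm / (ligand_nm + kd_nm)

KD_NM = 20.0  # apparent K_d for wild-type yitJ quoted in the text (nM)

# SAM concentrations mentioned in the text, converted to nM
concentrations_nm = {
    "0.35 uM": 350.0,
    "40 uM": 40_000.0,
    "2.5 mM": 2_500_000.0,
}

occupancy = {label: fraction_bound(c, KD_NM) for label, c in concentrations_nm.items()}
```

Under this idealized model the RNA would be almost fully occupied at 40 µM SAM, which is consistent with the text's observation that the lack of protection at that concentration was unexpected and may reflect lower sensitivity of the RNase H assay.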

We also tested a few individual mutants that were sequenced from the variant template pool (Figure 3.6). Lanes 1 and 2 depict wild-type yitJ RNA in the absence and presence of SAM, respectively (Figure 3.6). Mutant M1 showed constitutive cleavage regardless of the presence of SAM (Figure 3.6, lanes 3 and 4). Sequencing of mutant M1 revealed the presence of 4 nucleotide substitutions within the SAM-binding pocket: C45A, A46G, C48A and G77U. A different mutant (M3) with a point mutation, C45U, also resulted in high constitutive cleavage regardless of the presence of SAM, indicating that helix P1 cannot be formed (data not shown). These results suggest the importance of position C45, which is part of a conserved base-triple (G79-C45•G11) and makes direct contact with the methionine moiety of SAM. It is possible that the newly formed wobble base-pair (U45•G79) at the bottom of helix P3 can no longer interact with SAM or with G11 to form the base-triple. Together, these results conclusively show that a change at position C45 abolishes response to SAM in vitro.

Figure 3.6 lane layout:

Lane       1   2   3   4   5   6   7   8
RNA        Wild-type   M1      M2      M7
Oligo      +   +   +   +   +   +   +   +
RNase H    +   +   +   +   +   +   +   +
SAM        -   +   -   +   -   +   -   +

Figure 3.6 SAM-dependent structural transition of wild-type and yitJ variant RNAs. SAM was added at a final concentration of 2.5 mM. M, mutant from pool of variant yitJ templates; P, protected transcript; C, cleaved transcript.

Point mutants M2 (G77C) and M7 (A47G) showed wild-type-like responses to RNase H in the presence of SAM (Figure 3.6). Unlike mutant M2, a separate mutant, M9, that contained G77C along with G79C, exhibited a complete loss of response to SAM (data not shown). The results for M2 and M9 suggested that the G77C mutation is tolerated on its own but that its combination with a second substitution (G79C) eliminated the response entirely. Positions G77 and A47 do not interact directly with the SAM molecule but are in close proximity. It is therefore possible that these positions tolerate a sequence change. As G79 is part of a base-triple that is involved in hydrogen bond interactions with the methionine moiety of

SAM, the above results indicate that any disruption of the base-triple leads to a loss of SAM-dependent structural transition. This preliminary analysis provided insight into the varied responses that could be obtained from the pool of yitJ mutants.

Identification of yitJ leader RNA determinants for SAM affinity and recognition

We performed SELEX (Ellington and Szostak 1990, Tuerk and Gold 1990), or in vitro evolution, to obtain yitJ variants from the pool of mutant DNA templates using the RNase H assay as the primary selection step. SELEX allows simultaneous screening of diverse pools of DNA (or RNA) molecules for a specific functionality. As a complex pool of mutant templates is expected to contain only a small fraction of functional molecules, we hypothesized that by passing mutant sequences through iterative rounds of selection, only yitJ variants with a specific functionality would eventually be obtained. To estimate the starting concentration of SAM to be used for the first round of in vitro selection, we performed a SAM titration on the wild-type yitJ leader RNA (Figure 3.7). A gradual increase in protection from RNase H cleavage was evident as the concentration of SAM increased from 5 µM to 2.5 mM (Figure 3.7). This result showed that SAM concentrations higher than 40 µM prevent the RNase H oligonucleotide from binding to the 3′ side of helix P1. This indicates that the RNA undergoes a structural transition that favors formation of helix P1 in the presence of SAM, which is in agreement with the S box model.

Figure 3.7 A SAM titration of the yitJ-AAT template in the RNase H cleavage assay. SAM was added in increasing concentrations from lanes 3 to 8. P, protected transcript; C, cleaved transcript; %C, percentage of cleavage. Percent cleavage is the amount of the cleaved transcript relative to the sum of the cleaved and protected transcripts.

We observed ~50% cleavage at a SAM concentration of 160 µM (Figure 3.7, lane 8). To ensure that the yitJ variants obtained during SELEX maintain the response to SAM, we selected a less stringent concentration of SAM that would result in at least 50% protection from RNase H cleavage. The first round of selection was performed with a starting concentration of SAM that was 2-fold higher (320 µM), as it would exhibit >50% protection. Subsequent assays in the presence of lower concentrations of SAM were used to detect higher affinity variant RNA molecules.
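The percent cleavage defined in the Figure 3.7 legend is a simple ratio of band intensities. A minimal sketch follows; the intensity values are hypothetical placeholders for illustration and are not data from this study, and the function names are likewise illustrative.

```python
def percent_cleavage(cleaved, protected):
    """Cleaved signal as a percentage of cleaved plus protected signal."""
    total = cleaved + protected
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * cleaved / total

def percent_protection(cleaved, protected):
    """Complement of percent cleavage for the same lane."""
    return 100.0 - percent_cleavage(cleaved, protected)

# Hypothetical band intensities (arbitrary units), for illustration only
example_lane = {"cleaved": 1200.0, "protected": 1200.0}

pc = percent_cleavage(example_lane["cleaved"], example_lane["protected"])
pp = percent_protection(example_lane["cleaved"], example_lane["protected"])
```

A lane with equal cleaved and protected signal, as in this placeholder, corresponds to the ~50% cleavage point used in the text to choose the starting SAM concentration.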

Figure 3.8 The SELEX scheme. The selection cycle begins with the DNA pool (template 1, top right corner). The RNase H oligonucleotides are denoted by bold red (oligo 1) and blue (oligo 2) lines.

We employed a two-stage SELEX scheme as illustrated in Figure 3.8. In the first stage, the pool of mutant DNA templates (consisting of the 159 bp yitj-aat templates) was transcribed using T7 RNAP in the presence of 320 µM SAM. The RNA pool thus generated was used as the starting point for our in vitro selection. Transcripts with SAM-dependent helix P1 formation were selected using antisense DNA oligonucleotide 1 in the RNase H cleavage assay. RNase H cleavage eliminates the primer-binding site required for the subsequent reverse-transcription step. Reverse transcription was therefore a key step in this scheme, as it specifically selected uncleaved transcripts from the population. The cDNA products were amplified to generate a sub-pool of molecules with SAM-dependent helix P1 formation.

In the second stage of SELEX, we tested whether the yitj leader RNAs could form the antiterminator structure in the absence of SAM. For this purpose, the cDNA products generated during stage 1 were extended to include the antiterminator region downstream of helix P1 (187 bp yitj-at template). Transcripts with the ability to form the antiterminator structure in the absence of SAM were protected from RNase H cleavage in the presence of oligonucleotide 2 and were selected specifically during the ensuing reverse-transcription step. The amplified cDNA products served as the template pool for the next selection round. The two RNase H assays, together with both reverse-transcription reactions, formed one round of in vitro selection. We aimed to conduct iterative rounds of selection to obtain a pool of yitj variants with specific functionality.

The selections were performed on pools of the wild-type and mutant yitj templates, where wild-type yitj served as the control in which no sequence changes were expected after multiple rounds of selection. Four selections for each starting template were performed in parallel in order to have multiple pools of molecules. Although the experimental conditions were optimized before the actual selection was started, the reverse-transcription step and the PCR amplification and extension steps were technically challenging. We modified several variables individually or in combination, including the source of DNA polymerase, annealing temperature, template dilution, concentration of oligonucleotide primers, and concentration of Mg2+. Overall, we found that amplification with Pfu DNA polymerase gave a better yield than amplification with Taq DNA polymerase (data not shown). These efforts to improve the experimental conditions were eventually halted after a year of unsuccessful attempts, and the yitj SELEX project was terminated.

We next studied the B. subtilis yitj leader RNA using an alternative approach in which the yitj SAM-binding pocket was targeted directly rather than using spiked oligonucleotide primers. Site-directed mutagenesis was performed on the wild-type yitj construct to generate single and double mutants. These mutants were analyzed as part of a crystal structure study of the B. subtilis yitj leader RNA (in collaboration with Dr. Ailong Ke, Cornell University) (Lu et al. 2010). The following sections describe the results obtained using in vitro transcription termination assays in the presence of SAM and SAM analogs.

3.3.5 Disruption of G11 causes loss of SAM binding, yet high constitutive transcription termination in vitro

The G11 nucleotide is involved in a direct interaction with the methionine moiety of SAM and participates in a G79-C45•G11 base-triple that ties together J1/2 and P3 within the yitj RNA (Figure 3.9) (Lu et al. 2010). Mutating G11 to any of the other three nucleotides (A, C, or U) resulted in a complete loss of detectable SAM binding (SAM binding assays performed by J. Tomšič; Lu et al. 2010). These results were consistent with the position of G11 in the crystal structure. However, each of the G11 mutants exhibited increased termination in the absence of SAM (see Table 3.2) (Lu et al. 2010).

Figure 3.9 Nucleotides within the SAM-binding pocket of the B. subtilis yitj RNA. Shown is the interaction of SAM with the G79-C45•G11 base-triple. Distances are given in angstroms in magenta. Carbon, oxygen, nitrogen, sulfur, and phosphorus atoms are colored gray, red, blue, gold, and orange, respectively. Adapted from (Lu et al. 2010).

In particular, G11C exhibited the most dramatic effect, with >80% termination in the presence or absence of high SAM (140 µM). The G11U mutant also exhibited 70% termination regardless of the presence of SAM. The wild-type yitj construct showed a maximal response (99% termination) at 0.2 µM SAM, a concentration 700-fold lower than the maximum concentration of SAM used for the G11C mutant. These results indicate that mutating the highly conserved position (G11) in the yitj leader RNA facilitates formation of an RNA structure that resembles the SAM-bound conformation, even when no SAM is present (Lu et al. 2010). The G11A mutant exhibited only a modest effect on termination in the absence of SAM, despite complete loss of SAM binding, indicating that the structural change that stabilizes the RNA in a form similar to the ligand-bound form is separable from the effects on SAM binding (Lu et al. 2010).
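The 700-fold figure quoted for the G11C comparison follows directly from the two SAM concentrations given in the text (140 µM and 0.2 µM). A minimal sketch of that arithmetic is shown here; the helper name `fold_difference` is a hypothetical label introduced only for this illustration.

```python
def fold_difference(high: float, low: float) -> float:
    """Return how many times larger `high` is than `low` (same units)."""
    if low <= 0:
        raise ValueError("the lower value must be positive")
    return high / low


# Concentrations taken from the text, both in micromolar.
G11C_MAX_SAM_UM = 140.0  # highest SAM concentration used for the G11C mutant
WT_SAM_UM = 0.2          # SAM concentration giving maximal wild-type response

ratio = fold_difference(G11C_MAX_SAM_UM, WT_SAM_UM)
print(f"{ratio:.0f}-fold")
```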

Table 3.2 Mutational analysis of the B. subtilis yitj S box riboswitch.
a Values indicate the efficiency of termination in vitro performed in the absence (−SAM) or presence (+SAM) of SAM (140 µM).
b ND, not detectable (Kd >200 µM).

3.3.6 Mutating residues in the P3 helix of the yitj SAM-binding pocket results in a surprising stabilization of the aptamer domain

Substitutions of G79 and C45, the residues that participate in the base-triple with G11 (Figure 3.9), resulted in high constitutive transcription termination, regardless of the presence or absence of SAM (Table 3.2). These results were similar to those obtained for the G11 mutants (section 3.3.5). These data suggest that disrupting any nucleotide within the G79-C45•G11 base-triple results in a loss of response to SAM, similar to the results obtained during RNase H assays. It is possible that these sequence changes stabilize the yitj SAM-binding domain in a structure that resembles the SAM-bound conformation (Lu et al. 2010).

Figure 3.10 Nucleotides within the SAM-binding pocket of the B. subtilis yitj RNA. A sheared A46•U78 pair recognizes the Watson-Crick and Hoogsteen faces of the adenosine moiety of SAM. Distances are given in angstroms in magenta. Carbon, oxygen, nitrogen, sulfur, and phosphorus atoms are colored gray, red, blue, gold, and orange, respectively. Adapted from (Lu et al. 2010).

Like the G79-C45•G11 base-triple, the A46•U78 base-pair is also highly critical. The A46 and U78 nucleotides interact directly with the adenine ring of SAM (Figure 3.10). This base-pair is highly sensitive to substitution, consistent with its position in the crystal structure. Mutants of A46•U78 exhibited high constitutive termination similar to the mutants of the G79-C45•G11 base-triple (Table 3.2) (Lu et al. 2010).

The adenine ring of SAM is further stabilized by a second base-triple, A47•C48-G77, which is stacked above the A46•U78 base-pair. Changing most of the positions within the A47•C48-G77 base-triple resulted in high constitutive termination regardless of the presence of SAM (Table 3.2). However, several notable substitutions, such as C48A and C48G, retained a partial response in the presence of high SAM, indicating that SAM binding was not affected severely (Lu et al. 2010). It is possible that C48 is more tolerant of sequence changes than the other nucleotides within the base-triple because it does not participate in direct interactions with the SAM molecule.

3.3.7 Substitution of the U85-A109 base-pair within the pseudoknot weakens the SAM-binding ability without affecting the termination efficiency

The U85-A109 base-pair supports the crucial pseudoknot and creates a compact RNA structure (Lu et al. 2010, Montange and Batey 2006). A109 interacts with position A24 to form a base-triple that stabilizes the pseudoknot interaction (Lu et al. 2010). In contrast to the point mutants in helix P3 and the J1/2-P3 base-triple, mutants with a substitution of either nucleotide in the U85-A109 base-pair responded to SAM, with an increase in transcription termination only in the presence of SAM (Table 3.2) (Lu et al. 2010). Percent termination in the presence of SAM was similar to wild-type yitj, and percent termination in the absence of SAM was ~1.6-fold higher for the U85 position and ~1.5-fold lower for the A109 position compared to wild-type (Figure 3.11). The U85C-A109G compensatory mutant also exhibited a wild-type-like response during transcription in both the presence and absence of SAM (Table 3.2). This indicates that altering the U85-A109 base-pair does not affect termination in the absence of SAM, unlike the other point mutations (Table 3.2).

Figure 3.11 In vitro transcription termination assay. Comparison of in vitro transcription termination efficiencies of wild-type and variant yitj constructs in response to SAM. SAM concentration (µM) is denoted on the X-axis and termination efficiency is denoted as percentage values on the Y-axis.

Although the termination efficiency of the U85-A109 base-pair mutants did not change significantly, the binding affinity for SAM was affected strongly (Lu et al. 2010). The U85C point mutant showed a ~220-fold increase in Kd value, and A109G resulted in a ~110-fold increase in Kd (Table 3.2; SAM binding, J. Tomšič). The U85C-A109G compensatory mutant partially restored SAM-binding affinity relative to the point mutants alone, but its affinity was still 58-fold lower than that of the wild-type yitj RNA. Changing the base-pair from U-A to C-G maintained the response to SAM in an in vitro transcription termination assay but reduced the affinity for SAM. The fact that the compensatory mutant was unable to rescue the phenotype of the individual point mutants in the binding assay suggests that these residues exhibit sequence specificity in addition to their ability to base-pair. This is consistent with the critical role that A109 plays in formation of the pseudoknot structure (Lu et al. 2010).

3.3.8 Termination efficiency of wild-type and U85-A109 variant yitj constructs in response to SAM analogs

In order to examine the specificity of the yitj S box riboswitch RNA, we tested the wild-type yitj leader RNA in response to a series of SAM analogs using an in vitro transcription termination assay. Each SAM analog tested during this study differed from SAM at a single functional group, and most of the modifications were located at the sulfur group or within the amino acid side chain (Figure 3.12). The concentration for half-maximal termination (Trm1/2) at the wild-type yitj leader region terminator was determined for each SAM analog (Table 3.3). Several yitj variants of the SAM-binding pocket were tested initially at one high concentration of each SAM analog. Mutants that showed a change in percent termination were further titrated using a range of analog concentrations and compared to the percent termination of wild-type yitj. The concentrations necessary to reach Trm1/2 for the U85-A109 base-pair variants in the presence of the SAM analogs are listed in Table 3.4.

Figure 3.12 Chemical structures of SAM and SAM analogs. Modifications for each analog are depicted as pink spheres relative to SAM (top left corner). Abbreviations for each compound are as follows: SAM, S-adenosyl-L-methionine; SeSAM, Se-adenosyl-L-selenomethionine; TeSAM, Te-adenosyl-L-telluromethionine; SAH, S-adenosylhomocysteine; SAEt, S-adenosyl-ethionine; OH-SAM, hydroxyl-SAM; ME-SAM, methyl-ethyl-SAM.

Table 3.3 In vitro transcription termination of wild-type B. subtilis yitj in the presence of SAM or SAM analogs. Max% denotes the maximum percentage of termination observed in the presence of ligand. Trm1/2 denotes half-maximal termination and Trm1/2 (µM) indicates the concentration of ligand required for half-maximal termination. Maximum concentrations of SAM or SAM analogs used are as follows: SAM, 2.5 µM; SAH, 500 µM; Sinefungin, 5.0 mM; SAEt, 2.5 µM; SeSAM, 2.5 µM; TeSAM, 2.5 µM; 3′-deoxy SAM, 2.5 µM; OH-SAM, 2.5 µM; ME-SAM, 125 µM.
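The Trm1/2 quantity defined in the table legend can be estimated from a titration series by interpolation. The sketch below is one possible way to do this and is not the procedure used in this study: it assumes that "half-maximal" means the midpoint between termination without ligand and the maximum observed termination, it interpolates linearly on a log-concentration scale between the two bracketing points, and the function name and titration values are hypothetical.

```python
import math
from typing import Optional, Sequence


def estimate_trm_half(concs_um: Sequence[float],
                      pct_term: Sequence[float],
                      baseline_pct: float) -> Optional[float]:
    """Estimate the ligand concentration (µM) giving half-maximal termination.

    concs_um: ligand concentrations, ascending and positive.
    pct_term: percent termination measured at each concentration.
    baseline_pct: percent termination without ligand (assumed definition).
    Returns None when no pair of neighboring points brackets the target.
    """
    if len(concs_um) != len(pct_term) or len(concs_um) < 2:
        raise ValueError("need two or more paired measurements")
    target = baseline_pct + (max(pct_term) - baseline_pct) / 2.0
    for i in range(1, len(concs_um)):
        lo, hi = pct_term[i - 1], pct_term[i]
        if lo < target <= hi:
            frac = (target - lo) / (hi - lo)
            log_lo = math.log10(concs_um[i - 1])
            log_hi = math.log10(concs_um[i])
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None


# Hypothetical titration (illustration only, not data from Table 3.3).
concs = [0.01, 0.1, 1.0, 10.0]
terms = [20.0, 40.0, 80.0, 100.0]
print(estimate_trm_half(concs, terms, baseline_pct=20.0))
```

A curve fit to a binding model would normally be preferred when enough titration points are available; the interpolation above is only meant to make the definition concrete.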

Table 3.4 In vitro transcription termination of B. subtilis yitj variants in the presence of SAM or SAM analogs. Max% denotes the maximum percentage of termination in the presence of ligand. Trm1/2 denotes half-maximal termination and Trm1/2 (µM) indicates the concentration of ligand required for half-maximal termination. Maximum concentrations of SAM or SAM analogs used are as follows: SAM, 2.5 µM; SAH, 500 µM; Sinefungin, 5.0 mM; SAEt, 2.5 µM; SeSAM, 2.5 µM; TeSAM, 2.5 µM; 3′-deoxy SAM, 2.5 µM; OH-SAM, 2.5 µM; ME-SAM, 125 µM.

SAH differs from SAM by the absence of a single methyl group and a positive charge. Wild-type yitj exhibited the lowest maximum percent termination (Max%T) in the presence of SAH as compared to that in the presence of SAM and the other analogs (Table 3.3). SAH resulted in 78% termination even when added at a concentration of 500 µM, while 1.2 µM SAM was sufficient to stimulate 100% termination for wild-type yitj (Table 3.3). The A109G and U85C mutants exhibited a maximum termination of 50% and required ~3000 µM SAH (data not shown). The C85-G109 compensatory mutant rescued the phenotype of each single mutant. The compensatory mutant exhibited a maximum termination value similar to wild-type yitj, but required a ~6-fold higher concentration of SAH (Table 3.4). These results are in agreement with the model that states the requirement for an overall positive charge. SAH lacks the positive charge, which results in reduced affinity of the wild-type and variant yitj RNAs for this analog.

Sinefungin, a natural antibiotic produced by Streptomyces sp., is a SAM analog in which methionine is replaced by ornithine. Sinefungin lacks the sulfonium ion and contains an amine substitution in place of the methyl group (Figure 3.12). We found that this analog induced transcription termination at the wild-type yitj terminator. A concentration of 500 µM was necessary to reach ~90% termination (Table 3.3). It is possible that the decreased affinity for this analog is due to the reduced net positive charge caused by the absence of the central sulfonium ion. Previous studies have demonstrated that a high concentration (1.5 mM) of sinefungin reduces the percent termination by increasing readthrough (~4-fold) at the ykrw leader region terminator during in vitro transcription termination (McDaniel et al. 2003). In the current study, a 10-fold higher concentration of sinefungin (~15 mM) was required to reach ~50% termination by the yitj variants A109G and U85C (Table 3.4). The C85-G109 compensatory mutant rescued the phenotype of the individual mutants marginally. A ~3.0-fold lower concentration of sinefungin was required to achieve Trm1/2 for the compensatory mutant compared to the U85C mutant (Table 3.4).

Se-SAM and Te-SAM differ from SAM in the identity of the element at the position of the sulfur atom. Se-SAM contains selenium in place of sulfur, while Te-SAM contains tellurium (Figure 3.12). The positive charge at the central position of SAM is crucial for specific ligand recognition (Lim et al. 2006, Lu et al. 2010, Montange and Batey 2006). Both Se-SAM and Te-SAM retain this positive charge. Similar to the termination efficiency in the presence of SAM, wild-type yitj showed termination efficiencies reaching 100% and 97% in the presence of ~2.5 µM Se-SAM and Te-SAM, respectively (Table 3.3). Sulfur, selenium and tellurium belong to the same group of elements, termed chalcogens, and exhibit highly similar chemical properties. It is therefore not surprising that wild-type yitj responds to these two analogs much as it does to SAM. The difference lies in the concentration required for Trm1/2. Wild-type yitj needed a 2-fold lower concentration of Se-SAM and Te-SAM as compared to the concentration of SAM (Table 3.3). Altering the U85-A109 base-pair had a minor effect on termination efficiency in the presence of Se-SAM. Both point mutants exhibited similar Trm1/2 values in the presence of 2-fold and 7-fold higher concentrations of the analog (Table 3.4). However, the compensatory mutant restored the phenotype of the individual mutants. The double mutant achieved a Trm1/2 at 0.1 µM Se-SAM, similar to the value for SAM. The A109G and U85C mutants showed a maximum termination of ~50% in the presence of Te-SAM. U85C required a 22-fold higher concentration of Te-SAM compared to wild-type yitj (Table 3.4). Again, the compensatory mutant rescued the phenotype exhibited by the point mutants. These results indicate that Se-SAM and Te-SAM resemble SAM closely, and are therefore able to generate a SAM-like response for wild-type yitj. These results further validate the model that a positive charge, and not necessarily the identity of the charged moiety, is essential for ligand recognition (Lu et al. 2010, Montange and Batey 2006).

The response of the wild-type yitj leader RNA to the SAM analog SAEt was nearly identical to its response to SAM. SAEt substitutes the methyl group in SAM with an ethyl group (Figure 3.12). High-resolution analyses have revealed that the methyl group of SAM points towards the solvent cavity within the interior of the RNA and does not participate directly in ligand recognition (Lu et al. 2010, Montange and Batey 2006). Our results appear to be in agreement with this model. Relative to wild-type yitj RNA, the A109G mutant needed a 10-fold higher concentration of ligand to reach Trm1/2 of <40% (Table 3.4), and the U85C mutant required an 8-fold higher concentration. The compensatory mutation C85-G109 rescued these phenotypes partially. These results, yet again, suggest that disruption of the U85-A109 base-pair leads to a severe defect in the response of the leader RNA to the ligand, indicating the importance of sequence over base-pairing.

3′-deoxy SAM differs from SAM by the absence of the 3′ hydroxyl group of the sugar moiety (Figure 3.12). Alteration of the sugar moiety of SAM did not affect the termination efficiency of wild-type yitj (Table 3.3). Wild-type yitj reached ~100% termination efficiency in the presence of 3′-deoxy SAM with a Trm1/2 concentration of 0.15 µM. These values were similar to those seen in the presence of SAM (Table 3.3). However, the yitj point mutants exhibited barely 50% termination in the presence of the analog. Although the compensatory mutant rescued the phenotype of each point mutant, it required a 2-fold higher concentration of 3′-deoxy SAM compared to the wild-type yitj RNA (Table 3.4).

Substituting the amine group in the methionine tail with a hydroxyl group did not alter the response of wild-type yitj in vitro. This was surprising, as previous studies have indicated the importance of the amine group in correct recognition of SAM (Lim et al. 2006). Changing the NH3+ to an H resulted in a Kd value of >1000 µM for wild-type yitj (Lim et al. 2006). However, the carboxylate group of SAM involved in Watson-Crick base-pairing with the yitj RNA is left intact in OH-SAM. OH-SAM promoted termination at the wild-type yitj leader region terminator, and a 2-fold higher concentration of OH-SAM was necessary to reach a Trm1/2 of ~60% (Table 3.3). The yitj variants tested in the presence of this analog exhibited very low termination yield, and the compensatory mutant did not rescue the phenotypes of the point mutants (Table 3.4). This suggests a direct interaction of the amine group with the U85-A109 base-pair within the SAM-binding pocket and that any alteration at this sequence is detrimental to the termination efficiency.

Lastly, the SAM analog ME-SAM differs from SAM in that the α-carboxylate, α-amine and the α-carbon of methionine are missing (Figure 3.12). ME-SAM promoted transcription termination at the wild-type yitj leader region terminator. However, a 50-fold higher concentration (5 µM) was required for maximum termination of 96%, compared to SAM (Table 3.3). The individual mutants did not exhibit high termination values even with analog concentrations of 125 µM. In addition, the compensatory mutant exhibited a Trm1/2 concentration of 12.5 µM ME-SAM for 80% maximum termination (Table 3.4). These results are consistent with the crystal structure, which highlights the importance of these functional groups in ligand recognition.

3.4 Discussion

The B. subtilis yitj S box leader RNA is one of the most well-studied S box riboswitches (Grundy and Henkin 1998). This riboswitch is regulated such that expression of the yitj gene is upregulated when the cells are starved for methionine (when SAM pools are low) and is turned off in the presence of methionine (when SAM pools are high). Several genetic, biochemical and biophysical studies together provided the basis for the S box model and the ligand-dependent structural transitions of the RNA (McDaniel et al. 2003, McDaniel et al. 2005). Crystal structure analyses of the yitj RNA have revealed the nucleotides that participate in SAM recognition (Lu et al. 2010, Montange and Batey 2006, Stoddard et al. 2010).

In this study, we examined the SAM-binding pocket of the yitj leader RNA in an attempt to obtain variant aptamers with altered affinities or ligand specificities. We initially examined a pool of yitj mutants with an altered sequence in helix P3 using SELEX. RNase H cleavage assays followed by reverse-transcription PCR served as the selection steps in the experiment. Wild-type yitj exhibited a SAM-dependent structural transition with increased protection in the presence of SAM during the RNase H cleavage assay (Figure 3.5). However, the pool of mutant yitj RNAs failed to show increased protection even in the presence of a high (2.5 mM) SAM concentration. This result was not surprising, as a highly conserved region closely associated with SAM was targeted for mutagenesis. This result indicated that an alteration of the SAM-binding pocket affects the ability of the RNA to undergo a ligand-dependent structural transition. By testing a few yitj mutants using the RNase H assay, we were able to explore the possible responses of representatives from a partially randomized pool of yitj templates.

These preliminary results established the design for the two-stage SELEX scheme. The first half of the SELEX scheme selected for RNA molecules with SAM-dependent helix P1 formation, while the second half selected for RNAs able to form the antiterminator in the absence of SAM. Every step in the SELEX scheme was optimized before proceeding with the in vitro selection experiment. However, technical difficulties hampered our attempts and the investigation had to be terminated. An alternate experimental design might minimize the PCR-related technical problems and result in a more efficient selection system. Sites for restriction enzymes could be included in the two template sequences to bypass the PCR-based methods of modifying the template end-points at each step within the selection scheme. Treating the amplified cDNA products with the appropriate restriction enzyme could reduce the loss of DNA that occurred during the PCR amplification and extension steps.

A subsequent in vitro characterization of the yitj riboswitch was performed as part of a larger study, which crystallized the SAM-bound yitj RNA from B. subtilis (Lu et al. 2010). We characterized a set of binding domain mutants using an in vitro transcription termination assay. A few selected mutants were also tested for their ability to bind SAM using a filter exclusion assay. Most of the mutants exhibited loss of SAM binding (consistent with their position in the crystal structure), as these residues either make important contacts with SAM or are important for stabilization of crucial structural domains that form the SAM-binding pocket (Lu et al. 2010). The more surprising mutations were those that resulted in high transcription termination in the absence of SAM, suggesting that they stabilize a structural arrangement in the aptamer domain similar to the SAM-bound form. The fact that many mutants retained the ability to respond to SAM suggested that formation of the SAM-bound conformation in the absence of SAM is separable from the ability to bind SAM. Mutational studies conducted with the THI-box riboswitch revealed similar results (Ontiveros-Palacios et al. 2008). These observations together suggest that a riboswitch undergoes significant evolutionary selection to regulate gene expression efficiently, only in response to the appropriate ligand (Lu et al. 2010).

The above results also suggest that it is advantageous for the aptamer domain to fold into a pre-bound conformation in the absence of the ligand. This can result in a quick ligand-dependent structural transition that can be coupled to gene regulation only in response to the correct ligand. Data obtained from additional analyses of the yitj variants can provide a better understanding of how certain mutations preferentially stabilize the bound-like state of the RNA.

It is important to note that the in vitro transcription termination assay detects the efficiency of termination at the leader region terminator in response to a ligand, while the RNase H assay detects the ligand-dependent structural transition of the RNA. It is therefore possible that the RNase H assay is not as sensitive as the in vitro transcription assay and hence requires a much higher concentration of SAM to detect a SAM-dependent response in vitro. For example, the SAM concentration required to achieve half-maximal termination at the yitj leader region terminator is ~0.35 µM, whereas the concentration of SAM required for 50% cleavage in the RNase H cleavage assay was ~160 µM. Discrepancies were also observed in some results for the yitj variants. For example, the C45U mutant exhibited high constitutive cleavage in the RNase H assay, suggesting an inability to form helix P1 (section 3.3.2). Based on the RNase H result, we would predict the C45U mutant to show high readthrough activity in an in vitro transcription termination assay. However, this mutant exhibited high constitutive termination in the in vitro transcription termination assay (Table 3.2). It is possible that single-nucleotide substitutions that trap the RNA in the SAM-bound-like conformation fail to exhibit a structural transition in the RNase H cleavage assay, leading to inconsistent results.

Lastly, we characterized the wild-type yitj RNA and a few selected mutants in response to SAM analogs using in vitro transcription termination assays. Our results indicated that the yitj leader RNA is highly specific for SAM, its natural ligand. In some cases, we found that the wild-type yitj transcript exhibited increased termination in the presence of a few analogs. However, this was achievable only at high ligand concentrations that are not physiologically relevant. The analysis of U85C, A109G and C85-G109 indicated that the U85 position was highly critical and that alteration of this nucleotide severely affected the ability of the RNA to respond to SAM or SAM analogs in vitro. The compensatory mutations showed the importance of sequence over base-pairing. The U85-A109 base-pair supports the crucial pseudoknot and creates a compact RNA structure (Lu et al. 2010, Montange and Batey 2006). A109 interacts with the N1 position of A24 to further stabilize the pseudoknot interaction (Lu et al. 2010). In addition to stabilizing the pseudoknot structure, these tertiary interactions also stabilize the SAM-binding pocket through ribose-mediated hydrogen bonds in a SAM-independent manner.

Based on the above results, we can conclude that changing the ligand specificity of the B. subtilis yitj leader RNA is challenging. Because the sequence of the yitj SAM-binding pocket is highly conserved, generation of variants with altered affinities or specificities can be difficult. However, as seen in the case of the B. subtilis lysc leader RNA, nucleotide changes within the lysine-binding pocket are enough to modify the ligand specificity from lysine to a lysine analog (Wilson-Mitchell et al. 2012). We can speculate that the high conservation seen among the S box leader RNAs confers an evolutionary advantage by preventing the RNA from recognizing closely related ligands and inappropriately triggering gene regulation.

CHAPTER 4

INVESTIGATION OF THE FUNCTION OF S BOX RIBOSWITCH RNA STRUCTURAL ELEMENTS: INSIGHTS INTO FACTORS CONTRIBUTING TO S BOX RIBOSWITCH VARIABILITY

4.1 Introduction

The gene products regulated by the S box transcription termination control system in B. subtilis are involved in different steps of sulfur metabolism (Grundy and Henkin 2004, Grundy and Henkin 2006, Hullo et al. 2004, Murphy et al. 2002) (Figure 4.1). Expression of these transcriptional units is induced during starvation for methionine, in response to a drop in the intracellular concentration of the molecular effector, SAM (Grundy and Henkin 1998). SAM is synthesized from methionine and ATP by SAM synthetase, encoded by the metk gene. Growth in the presence of methionine results in high SAM pools, while growth in the absence of methionine results in low SAM pools. The 11 S box genes from B. subtilis exhibit differential regulation. An extensive biochemical and genetic analysis revealed the correlation between the physiological roles of the S box genes and their sensitivity to SAM in vivo and in vitro (Tomsic et al. 2008).

The S box gene-lacz fusions show a 250-fold range in induction ratios after 4 h of methionine starvation (Tomsic et al. 2008). This study concluded that the genes directly involved in methionine biosynthesis (e.g., mete, yitj and yjci) (Figure 4.1) exhibit the tightest regulation in vivo, i.e., these genes exhibit the lowest expression level during growth in the presence of methionine and the greatest increase in expression during starvation for methionine. The mete and yitj genes, in particular, exhibit a long delay before gene expression is induced. This pattern of gene expression suggests that the cell turns on methionine biosynthesis only after SAM levels become very low (Tomsic et al. 2008). In contrast, the yuscba operon (which encodes an ABC-type methionine transporter, Figure 4.1) exhibits a higher level of expression during growth in the presence of methionine and a lower magnitude of induction during methionine starvation. The yusc gene exhibits rapid induction of expression (Tomsic et al. 2008). This pattern of gene expression suggests that expression of the methionine transporter is turned on even before the SAM levels in the cell become very low. These properties are consistent with a role in methionine transport, as it is more efficient to take up exogenous methionine than to synthesize it in vivo. In contrast to most other S box genes, the metk gene does not exhibit induction during methionine starvation (Tomsic et al. 2008).

Figure 4.1 Methionine biosynthesis pathways in B. subtilis. Enzymatic steps catalyzed by the S box gene products are shown. Genes with unknown function (yoaDCB, yxjG and yxjH) are positioned based on sequence similarity. SAM, S-adenosylmethionine; SAH, S-adenosylhomocysteine; MT, methylthio; THF, tetrahydrofolate. Adapted from Murphy et al. (2002).

Northern blot analysis revealed the relative levels of the terminated and readthrough transcripts in B. subtilis during methionine starvation (Tomsic et al. 2008). An RNA probe complementary to the 5′ UTR of each S box transcript was used to detect both the terminated and readthrough transcripts. As the primary sequence of S box leader RNAs is highly conserved, RNA probes that hybridize in the coding region of the S box gene were employed to ensure specific detection of the readthrough transcript. These experiments showed that for most S box genes (with the exception of cysH and metK) the terminated transcript is the major product during growth in the presence of high methionine, while starvation for methionine results in a decrease in the amount of terminated transcript and an increase in the readthrough transcript (Tomsic et al. 2008). The major product at the start of methionine starvation for both metE and yusC is the terminated product. However, a 72-fold increase in readthrough transcript is observed for metE, while yusC exhibits a 4.0-fold increase in the readthrough product (Tomsic et al. 2008). These results corroborate the in vivo expression data and are consistent with the physiological roles of these two S box genes (Tomsic et al. 2008).

Variability is also seen in the in vitro SAM-dependent transcription termination efficiency for the S box transcriptional units (except metK) (Tomsic et al. 2008). This analysis showed that genes closely associated with methionine biosynthesis (metE, yitJ, yjcI and ykrT) exhibit the highest sensitivity to SAM. These genes require very low concentrations (in the µM range) of SAM to achieve a half-maximal termination response. These biosynthetic genes generally correspond to the genes that are most tightly repressed during growth in the presence of methionine (when SAM pools are high). They also exhibit similar termination efficiencies in the absence of SAM (10-15%) and similar maximal efficiencies of termination, which is consistent with tight regulation in vivo (Tomsic et al. 2008). The yusC gene requires a high concentration of SAM (15 µM) to reach a half-maximal response in vitro and shows high termination (~50%) in the absence of SAM (Tomsic et al. 2008). The high termination of yusC in the absence of SAM suggests that the antiterminator does not compete effectively with the terminator, resulting in low expression even under inducing conditions. These results are consistent with yusC expression in vivo. Overall, the S box genes exhibit a 100-fold range in sensitivity to SAM in vitro. Together, these results show that genes that are tightly repressed in vivo during growth under conditions with high SAM pools are also highly sensitive to SAM during transcription in vitro (Tomsic et al. 2008).

A 250-fold range in affinity for SAM has been observed for the S box transcripts (Tomsic et al. 2008). The apparent Kd values of metE, yitJ and leader RNAs of other methionine biosynthetic genes are in the range of 14 to 25 nM, while the yusC leader RNA has the weakest affinity for SAM, with an apparent Kd value of 3.5 µM (Tomsic et al. 2008). These experiments showed that the binding affinity of the leader RNAs for SAM correlates with the sensitivity to SAM in vitro (Tomsic et al. 2008). Genes with a lower apparent Kd value respond to low concentrations of SAM in the in vitro transcription termination assay. Genes with the greatest sensitivity to SAM in vitro also exhibited the highest induction ratio in vivo (Tomsic et al. 2008). Based on these results, it was proposed that the affinity of the SAM binding domain for SAM is a critical factor for regulation, and that other parameters, such as the mutually exclusive competing elements and the efficiency of termination, play a major role in the calibration of the system (Tomsic et al. 2008).
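The relationship between apparent Kd and occupancy at a given SAM concentration can be illustrated with a short calculation. The sketch below is not part of the cited study: it assumes a simple 1:1 binding model with SAM in excess over RNA, and the function names are illustrative. Only the Kd values (14 nM, 25 nM and 3.5 µM) come from the text.

```python
# Apparent Kd values quoted in the text (Tomsic et al. 2008), in molar units.
KD_BIOSYNTHETIC_LOW = 14e-9   # 14 nM, tightest value in the quoted range
KD_BIOSYNTHETIC_HIGH = 25e-9  # 25 nM
KD_YUSC = 3.5e-6              # 3.5 µM


def fraction_bound(sam_molar: float, kd_molar: float) -> float:
    """Fraction of leader RNA bound to SAM under a simple 1:1 model.

    Assumption: SAM is in large excess, so free SAM ~ total SAM.
    """
    if sam_molar < 0 or kd_molar <= 0:
        raise ValueError("SAM must be non-negative and Kd must be positive")
    return sam_molar / (kd_molar + sam_molar)


def fold_range(weakest_kd: float, tightest_kd: float) -> float:
    """Ratio of the weakest to the tightest apparent Kd."""
    return weakest_kd / tightest_kd


print(f"Kd fold range: {fold_range(KD_YUSC, KD_BIOSYNTHETIC_LOW):.0f}")
for sam in (1e-8, 1e-7, 1e-6, 1e-5):
    tight = fraction_bound(sam, KD_BIOSYNTHETIC_LOW)
    weak = fraction_bound(sam, KD_YUSC)
    print(f"SAM {sam:.0e} M: 14 nM leader {tight:.2f}, yusC leader {weak:.2f}")
```

Under this simplified model, a leader RNA with a 14 nM Kd is largely occupied at SAM concentrations where the yusC leader RNA is still mostly free, which is the qualitative trend described above.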

Based on the information gleaned from the above studies, we investigated the RNA elements that are responsible for S box riboswitch variability. The first part of this chapter focuses on in vivo and in vitro analyses of hybrid leader RNAs generated with metE and yusC sequences. These leader RNAs are good candidates for this study because they exhibit different expression profiles in response to limiting SAM levels. metE shows the greatest increase in expression when compared to other S box genes and exhibits high affinity for SAM, while yusC is less sensitive to changes in SAM in vivo and exhibits low affinity for SAM (Grundy and Henkin 1998, Hullo et al. 2004, Tomsic et al. 2008).

We also investigated a possible promoter effect on the efficiency of S box gene transcription. For this study, we chose the metK and yusC leader RNAs. The metK-lacZ expression was induced when SAM pools were low and methionine levels were high (Chapter 2). High levels of SAM failed to promote transcription termination of wild-type metK in vitro. Phylogenetic analysis revealed highly conserved sequences upstream (US box) and downstream (DS box) of the metK S box riboswitch element (Chapter 2). The metK US box is located at the transcriptional start-site of the metK leader RNA. The majority of sequence changes within the US box region resulted in a loss of response to changing SAM pools in vivo. Alteration of the US box sequence also disrupted the predicted US-DS pairing interaction and reduced transcript stability in vivo (Chapter 2). The in vitro analysis of the metK US box mutants revealed variation in the total amount of transcript generated relative to the wild-type metK RNA (Chapter 2). Based on these results and the US box mutagenesis (Chapter 2), we hypothesized that the metK promoter, along with the US box sequence, is responsible for reduced transcription initiation or reduced RNAP processivity in vitro. The effects of the metK promoter and US box sequence on expression of the yusC S box leader RNA were tested in vivo and in vitro. The results obtained from these studies are discussed in the second half of this chapter.

4.2 Materials and methods

Bacterial strains

The B. subtilis strains used in this study were BR151 (lys-3 metB10 trpC2); ZB307A (SPβc2del2::Tn917::pSK10Δ6) (Zuber and Losick 1987); and ZB449 (trpC2 pheA1 abrB703 SPβ-cured) (Nakano and Zuber 1989). B. subtilis strains were grown on tryptose blood agar base medium (TBAB; Difco, Franklin Lakes, NJ), Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) and 2XYT broth (Miller 1972). Chloramphenicol was added at a concentration of 5 μg/ml. X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold Biotechnologies, St. Louis, MO) was used at 40 μg/ml as an indicator of β-galactosidase activity. All growth was at 37°C.

Genetic techniques

Transformation of B. subtilis was carried out as described previously (Henkin et al. 1990). Chromosomal DNA was prepared using the DNeasy tissue kit (Qiagen, Chatsworth, CA). Wizard columns (Promega, Madison, WI) were used for plasmid preparations. Oligonucleotide primers were purchased from Integrated DNA Technologies (Coralville, IA). Restriction endonucleases and DNA-modifying enzymes were purchased from New England Biolabs (Beverly, MA) and used as described by the manufacturer. Mutations were identified by DNA sequencing (Genewiz Inc., North Brunswick, NJ). Transcriptional fusions were generated in plasmid pFG328 (Grundy et al. 1993), which contains a cat gene conferring resistance to chloramphenicol. The lacZ fusion constructs were introduced in single copy into the B. subtilis chromosome by recombination into the SPβ prophage carried in strain ZB307A and purified by passage of the phage through strain ZB449 (Nakano and Zuber 1989, Zuber and Losick 1987). The phage carrying the fusion was then introduced into strain BR151. Strains containing lacZ fusions were grown in the presence of chloramphenicol.

Construction of hybrid leader RNAs

metE and yusC hybrid leader RNAs

The metE and yusC leader RNA hybrids were constructed such that the binding domain of one leader RNA was fused upstream of the terminator domain of the second leader RNA. For example, the metE-yusC leader RNA hybrid contained the binding domain of the metE leader RNA and the terminator domain of the yusC leader RNA. Likewise, the yusC-metE leader RNA hybrid construct contained the binding domain of the yusC leader RNA and the terminator domain of the metE leader RNA. The metE and yusC leader RNAs share a sequence (5′-AGAUGAGAGA-3′) between the J4/1 region and the 3′ side of helix 1 (anti-antiterminator) (Figure 4.2, sequence in green box). This common sequence was used to fuse the binding domain of the first leader RNA upstream of the terminator domain of the second leader RNA. The 3′ side of the antiterminator (Figure 4.2, red sequence) is complementary to the 5′ side of the terminator (Figure 4.2, blue sequence), and these sequences form the alternate antiterminator structure. For sequences that form such alternate structures, an alteration in one part of the RNA usually requires a compensatory change in the complementary region to maintain base-pairing. However, as the common sequence between metE and yusC is located on the 3′ side of the anti-antiterminator, corresponding sequence changes in the terminator or antiterminator sequences were not required. Complementary pairs of DNA oligonucleotides (Table 4.1) were employed to generate the hybrid sequence for each construct. Template end-points for each pair of wild-type and hybrid constructs were identical.

Templates for in vitro transcription by B. subtilis RNAP were generated by PCR (Pfu DNA Polymerase, Stratagene, La Jolla, CA) using oligonucleotide primers that contained the glyQS promoter sequence (McDaniel et al. 2003). The B. subtilis glyQS promoter is a strong, constitutive promoter that allows efficient transcription in vitro. Use of this promoter does not affect the response to SAM (McDaniel et al. 2003) and results in consistent transcription initiation for each construct.
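The fusion of one binding domain to the other terminator domain at the shared sequence can be sketched as a simple string splice. This is an illustration only, not the cloning procedure: the two "wild-type" fragments below are short stretches inferred from the hybrid oligonucleotides in Table 4.1 (an assumption made for this example), and the function name is hypothetical.

```python
# DNA form of the shared 5'-AGAUGAGAGA-3' sequence described in the text.
COMMON = "AGATGAGAGA"

# Assumed short wild-type fragments flanking the common sequence, inferred
# from the hybrid oligonucleotides in Table 4.1 (illustrative, not measured).
METE_FRAGMENT = "CTAATTCCATCAGATTGTGTCTGAG" + COMMON + "GGCAGTG"
YUSC_FRAGMENT = "CCAATTCACACGAAGCGTTCAGCTTTGAA" + COMMON + "AAGGC"


def fuse_at_common(binding_source: str, terminator_source: str,
                   common: str = COMMON) -> str:
    """Join the upstream part of one sequence to the downstream part of another.

    Everything up to and including the common sequence comes from
    `binding_source`; everything after it comes from `terminator_source`.
    """
    upstream_end = binding_source.index(common) + len(common)
    downstream_start = terminator_source.index(common) + len(common)
    return binding_source[:upstream_end] + terminator_source[downstream_start:]


mete_yusc = fuse_at_common(METE_FRAGMENT, YUSC_FRAGMENT)
yusc_mete = fuse_at_common(YUSC_FRAGMENT, METE_FRAGMENT)
print("metE-yusC junction:", mete_yusc)
print("yusC-metE junction:", yusc_mete)
```

Because the splice point lies inside a sequence that both leaders share, the junction introduces no new nucleotides on either side, which is why no compensatory changes were needed.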

Table 4.1 DNA oligonucleotides for metE and yusC leader RNA constructs

Oligonucleotide (a, b, c, d)   Sequence (5′→3′)
metE-yusC                      CTAATTCCATCAGATTGTGTCTGAGAGATGAGAGAAAGGC
metE-yusC RC                   GCCTTTCTCTCATCTCTCAGACACAATCTGATGGAATTAG
yusC-metE                      CCAATTCACACGAAGCGTTCAGCTTTGAAAGATGAGAGAGGCAGTG
yusC-metE RC                   CACTGCCTCTCTCATCTTTCAAAGCTGAACGCTTCGTGTGAATTGG
metE DSXba                     ATTAATTTCTAGATGTAAAACACTCTCTTTC
yusC DSXba                     ATTAATTTCTAGAAATCACCTGCCTTAACAC
metEgly US                     GGTCCATCTTTTTATATGATCATTTACAAAAAATTAATAACATTT
yusCgly US                     GGTCCATCTTTTTATATGATCATTTACTATATATTTCTCTTATCA
glyQ USKpnBam                  TAAGGATCCGGTACCACGAAGAATATTCGGGATTGTA

a Sequences in bold indicate the common sequence between metE and yusC.
b Underlined sequences indicate the transition into the second leader RNA.
c RC indicates reverse complement.
d US and DS indicate upstream and downstream, respectively.
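As a quick consistency check on the complementary oligonucleotide pairs listed in Table 4.1, each RC entry can be compared with the computed reverse complement of its partner. The following is a minimal sketch; the helper name and dictionary layout are illustrative, and only the four sequences come from the table.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Forward and RC oligonucleotides copied from Table 4.1.
PAIRS = {
    "metE-yusC": (
        "CTAATTCCATCAGATTGTGTCTGAGAGATGAGAGAAAGGC",
        "GCCTTTCTCTCATCTCTCAGACACAATCTGATGGAATTAG",
    ),
    "yusC-metE": (
        "CCAATTCACACGAAGCGTTCAGCTTTGAAAGATGAGAGAGGCAGTG",
        "CACTGCCTCTCTCATCTTTCAAAGCTGAACGCTTCGTGTGAATTGG",
    ),
}

for name, (forward, listed_rc) in PAIRS.items():
    matches = reverse_complement(forward) == listed_rc
    print(f"{name}: RC oligonucleotide matches forward strand: {matches}")
```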

Figure 4.2 Predicted secondary structures of the yusC and metE leader RNAs. A. Wild-type yusC leader RNA. B. Wild-type metE leader RNA. Both RNAs are shown in the terminator conformation. Red and blue residues indicate the alternate pairing for formation of the antiterminator, shown above the terminator. T, terminator; AT, antiterminator; AAT, anti-antiterminator. The sequence highlighted by the green box served as the common sequence for the leader RNA hybrid constructs. (continued)

Figure 4.2 (continued) B. metE. Labeled elements: SAM-binding domain, AAT, AT, T.

metK and yusC hybrid constructs

The metK and yusC hybrid leader RNAs were constructed such that the native yusC promoter sequence was replaced by the metK promoter sequence and the sequence upstream of the yusC helix 1 region was replaced by the metK US box sequence. Complementary pairs of DNA oligonucleotides (Table 4.2) were employed to generate the fusion sequence for each hybrid construct. Templates for in vitro transcription were generated as described above.

Table 4.2 DNA oligonucleotides for metK and yusC leader RNA constructs

Oligonucleotide    Sequence (5′→3′) (a, b)
metK-yusC 1 FD     GATATTTCATTGAGCGGATATTTCTCTTATCAAGAGAGG
metK-yusC 1 RC     CCTCTCTTGATAAGAGAAATATCCGCTCAATGAAATATC
metK-yusC 2 FD     GATATTTCATTGAGCGGATACTCTTATCAAGAGAGG
metK-yusC 2 RC     CCTCTCTTGATAAGAGTATCCGCTCAATGAAATATC
metK-yusC 3 FD     GATAAGATATTTCATTGTTTCTCTTATCAAGAGAGG
metK-yusC 3 RC     CCTCTCTTGATAAGAGAAACAATGAAATATCTTATC
metK-yusC 4 FD     GATATTTCATTGAGATTATATTTCTCTTATCAAGAGAGG
metK-yusC 4 RC     CCTCTCTTGATAAGAGAAATATAATCTCAATGAAATATC
metK-yusC 5 FD     GATATTTCATTGAGATTATACTCTTATCAAGAGAGG
metK-yusC 5 RC     CCTCTCTTGATAAGAGTATAATCTCAATGAAATATC
metK-yusC 6 FD     GATAAGATATTTCATTGAGTTTCTCTTATCAAGAGAGG
metK-yusC 6 RC     CCTCTCTTGATAAGAGAAACACAATGAAATATCTTATC
metK-yusC 7 FD     GATAAGATATTTCATTGAGTCTCTTATCAAGAGAGG
metK-yusC 7 RC     CCTCTCTTGATAAGAGACACAATGAAATATCTTATC
metK-yusC 8 FD     CGATAAGATATTTCATTGAGCGGTTTCTCTTATCAAGAGAGGT
metK-yusC 8 RC     ACCTCTCTTGATAAGAGAAACCGCTCAATGAAATATCTTATCG

a Underlined sequences denote the metK US1 mutation (C4G5G6 → A4U5U6).
b FD, forward; RC, reverse complement.
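The US1 mutation noted in Table 4.2 (C4G5G6 → A4U5U6, i.e. CGG → ATT at the DNA level) relates the metK-yusC 1 and 2 forward oligonucleotides to the metK-yusC 4 and 5 forward oligonucleotides. The sketch below illustrates this; the function name is hypothetical, and the assumption that the CGG triplet occurs exactly once in the oligonucleotide is checked explicitly rather than taken for granted.

```python
def apply_us1(oligo: str) -> str:
    """Replace the single CGG triplet with ATT (DNA form of the US1 mutation).

    Raises ValueError if CGG does not occur exactly once, because the
    substitution would then be ambiguous.
    """
    if oligo.count("CGG") != 1:
        raise ValueError("expected exactly one CGG triplet in the oligonucleotide")
    return oligo.replace("CGG", "ATT")


# Forward oligonucleotides copied from Table 4.2.
MY1_FD = "GATATTTCATTGAGCGGATATTTCTCTTATCAAGAGAGG"
MY2_FD = "GATATTTCATTGAGCGGATACTCTTATCAAGAGAGG"
MY4_FD = "GATATTTCATTGAGATTATATTTCTCTTATCAAGAGAGG"
MY5_FD = "GATATTTCATTGAGATTATACTCTTATCAAGAGAGG"

print("my1 + US1 equals my4:", apply_us1(MY1_FD) == MY4_FD)
print("my2 + US1 equals my5:", apply_us1(MY2_FD) == MY5_FD)
```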

4.2.4 β-galactosidase measurements

Strains containing lacZ fusions were grown in Spizizen minimal medium containing the required amino acids at a concentration of 50 μg/ml until early exponential phase and then harvested by centrifugation. Cells were resuspended in fresh Spizizen minimal medium in the presence or absence of methionine. Samples were collected at 1-h intervals and assayed for β-galactosidase as described previously (Miller 1972) using toluene permeabilization. All starvation experiments and assays were conducted at least twice, and variation was <10%.

In vitro transcription termination assay

Conditions for the metE and yusC wild-type and hybrid constructs

Templates for in vitro transcription by the B. subtilis RNAP were generated by PCR using oligonucleotide primers that contained the glyQS promoter sequence and hybridized within the leader region of the target gene, to generate a transcription start-site 16 nt upstream of the start of the metE helix 1 or 8 nt upstream of the start of the yusC helix 1. The promoter sequences were designed to allow initiation with a dinucleotide (ApC) corresponding to the +1/+2 positions of the transcript and a halt in transcription at position +29 for metE and +21 for yusC by omission of GTP (McDaniel et al. 2003). The PCR fragments were ~400 bp in length and included 67 and 63 bp downstream from the transcription terminator of metE and yusC, respectively, to allow resolution of terminated and readthrough products. PCR products were purified with a Qiagen cleanup kit and sequenced by Genewiz. Single-round transcription reactions were carried out as described in Chapter 2. Templates were transcribed in the presence or absence of SAM (as indicated). Transcription products were resolved by denaturing PAGE and visualized by PhosphorImager (Molecular Dynamics) analysis. Percent termination was plotted as a function of ligand concentration (GraphPad Prism). The reactions were performed in duplicate and reproducibility was ±5%.

Conditions for the metK-yusC hybrid leader RNA constructs

Templates for in vitro transcription by the B. subtilis RNAP were generated by PCR using oligonucleotide primers that contained the metK promoter sequence, with or without the metK US box sequence, fused to the helix 1 or the J1/2 region of the yusC leader RNA. The metK transcription start-site was located 1-17 nt upstream of the start of the yusC helix 1, depending on the construct (refer to Table 4.5 for sequence details). The PCR fragments were ~400 bp in length and included 63 bp downstream from the yusC transcription terminator, to allow resolution of terminated and readthrough products. PCR products were purified and sequenced as described above. Multiple-round transcription reactions were carried out as described in Chapter 2. Templates were transcribed in the presence or absence of SAM (as indicated). Transcription products were analyzed and percent termination was determined as described above. The reactions were performed in duplicate and reproducibility was ±5%.

4.3 Results

Repression of S box gene expression in the presence of methionine is dependent on the SAM-binding domain of the S box leader RNA

The wild-type and hybrid constructs were tested using in vivo β-galactosidase expression assays during methionine starvation. Cells were grown in minimal medium in the presence of methionine until mid-exponential phase, after which the cells were harvested and resuspended in the presence or absence of methionine. The wild-type metE-lacZ fusion exhibited high β-galactosidase activity in the absence of methionine compared to the other three constructs (Figure 4.3, open squares), and very low expression in the presence of methionine (Figure 4.3, filled squares). The ratio of expression during growth in the absence of methionine to that in the presence of methionine is termed the repression ratio. metE exhibited a high repression ratio of 590, which suggests that the metE gene is tightly regulated in response to SAM (Table 4.3). The metE repression ratio is consistent with the functional role of metE in methionine biosynthesis.

The wild-type yusC-lacZ fusion exhibited ~2.3-fold lower β-galactosidase activity after 4 h of growth in the absence of methionine compared to wild-type metE-lacZ under the same conditions (Figure 4.3, open inverted triangles), while wild-type yusC-lacZ failed to show complete repression of β-galactosidase activity after 4 h of growth in the presence of methionine (Figure 4.3, filled inverted triangles). The repression ratio of wild-type yusC was 31, ~20-fold lower than the repression ratio of wild-type metE (Table 4.3). A low repression ratio indicates that the gene is less tightly regulated in response to SAM, which is consistent with the functional role of yusC as a methionine transport gene. These results were in agreement with previously reported data (Tomsic et al. 2008).

Figure 4.3 In vivo expression assay of the wild-type and hybrid leader RNA constructs during methionine starvation. S box leader gene-lacZ fusions were integrated in single copy in strain BR151 (lys-3 metB10 trpC2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 4 h after resuspension. Squares, wild-type metE-lacZ; inverted triangles, wild-type yusC-lacZ; diamonds, metE-yusC-lacZ; circles, yusC-metE-lacZ. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.
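The repression ratio defined above is a simple quotient of two β-galactosidase activities. The sketch below shows the calculation and the comparison between the two reported wild-type ratios; the function name is illustrative, the Miller unit inputs in the example call are hypothetical placeholders rather than values from Table 4.3, and only the ratios 590 and 31 come from the text.

```python
def repression_ratio(minus_met_mu: float, plus_met_mu: float) -> float:
    """Expression without methionine divided by expression with methionine.

    Both inputs are beta-galactosidase activities in Miller units (MU).
    """
    if plus_met_mu <= 0:
        raise ValueError("activity in the presence of methionine must be positive")
    return minus_met_mu / plus_met_mu


# Repression ratios reported in the text for the wild-type fusions.
REPORTED_RATIOS = {"metE": 590, "yusC": 31}

# Fold difference between the two reported ratios.
fold_difference = REPORTED_RATIOS["metE"] / REPORTED_RATIOS["yusC"]
print(f"metE ratio is {fold_difference:.1f}-fold higher than the yusC ratio")

# Hypothetical Miller unit values, used only to show the function call.
example = repression_ratio(minus_met_mu=100.0, plus_met_mu=4.0)
print(f"example repression ratio: {example:.1f}")
```

Because the denominator is the activity in the presence of methionine, fusions whose repressed activity is close to the detection limit yield ratios that are sensitive to small measurement errors.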

Table 4.3 In vivo expression analysis of the wild-type and hybrid constructs.

a β-galactosidase activity is expressed as Miller Units (MU) and is reported for samples taken 4 h after growth in the presence of methionine (+Met) or absence of methionine (-Met). Values are reported as the means ± the standard deviations for two assays. *, value is too low to measure with high accuracy.
b Repression ratio indicates the ratio of values in the absence of methionine to the values in the presence of methionine after 4 h of growth.

The metE-yusC hybrid leader construct consists of the SAM binding domain of the metE leader RNA and the terminator domain of the yusC leader RNA. Expression of the metE-yusC-lacZ hybrid construct was 4.5-fold lower than that of wild-type metE-lacZ after 4 h of growth in the absence of methionine, and ~2.0-fold lower than that of wild-type yusC-lacZ under the same conditions (Figure 4.3, open diamonds). Expression of metE-yusC-lacZ was repressed completely in the presence of methionine, similar to wild-type metE-lacZ (Figure 4.3, filled diamonds). The metE-yusC-lacZ hybrid also showed a repression ratio of ~220, which was ~2.7-fold lower than that of wild-type metE-lacZ and 7.0-fold higher than that of wild-type yusC-lacZ (Table 4.3). These results suggest that sensitivity of the metE-yusC RNA to the changing concentration of SAM in vivo is a function of the metE binding domain. It is possible that expression of the metE-yusC hybrid is relatively low because of ineffective competition between the yusC antiterminator and terminator domains, which is consistent with wild-type yusC expression.

The yusC-metE-lacZ hybrid construct exhibited a delayed induction. β-Galactosidase activity of this hybrid construct was only 1.3-fold lower than that of wild-type yusC-lacZ after 4 h of growth in the absence of methionine (Figure 4.3, open circles). In the presence of methionine, expression was not completely repressed (Figure 4.3, filled circles). These results indicate that the expression profile for the yusC-metE-lacZ construct is very similar to that of wild-type yusC-lacZ. The yusC-metE-lacZ construct exhibited a repression ratio that was 1.6-fold lower than that of wild-type yusC-lacZ, suggesting that this hybrid construct is also less tightly regulated, similar to wild-type yusC. Again, these results suggest that the binding domain plays a critical role in sensing SAM. These results indicate that the binding domain contributes to the degree of repression in the presence of methionine (when SAM pools are high), while the terminator/antiterminator domain dictates the level of expression.

The termination efficiency of S box genes is dictated by the terminator/antiterminator domains

In vitro transcription termination assays were employed to test the efficiency of termination for each wild-type and hybrid construct in response to varying SAM concentrations. A representative image of a polyacrylamide gel and the corresponding calculation of the percent termination are shown (Figure 4.4).


Figure 4.4 SAM-dependent transcription termination of an S box riboswitch. A. In vitro transcription of a representative B. subtilis S box template, yitJ, in the presence or absence of SAM. SAM was added ranging from µM final concentration. Terminated (T, filled circles) and readthrough (RT, open circles) RNAs are labeled. B. Quantitation of the SAM response. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products.
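The termination efficiency defined in the Figure 4.4 legend can be written as T / (T + RT), expressed as a percentage. A minimal sketch of that calculation for duplicate reactions follows; the function names are illustrative and the band intensities are hypothetical placeholders, not measurements from this work.

```python
def percent_termination(terminated: float, readthrough: float) -> float:
    """Terminated product as a percentage of terminated plus readthrough."""
    total = terminated + readthrough
    if terminated < 0 or readthrough < 0 or total == 0:
        raise ValueError("band intensities must be non-negative and not both zero")
    return 100.0 * terminated / total


def mean(values: list) -> float:
    """Arithmetic mean of replicate values."""
    return sum(values) / len(values)


# Hypothetical (terminated, readthrough) band intensities in arbitrary units
# for one template at one SAM concentration, measured in duplicate.
duplicate_lanes = [(30.0, 70.0), (32.0, 68.0)]

replicates = [percent_termination(t, rt) for t, rt in duplicate_lanes]
print(f"replicate values: {replicates}")
print(f"mean percent termination: {mean(replicates):.1f}")
```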

Analysis of the binding domains versus the terminator/antiterminator domains for a wild-type and hybrid construct pair was performed using in vitro transcription termination assays. In Figure 4.5, each wild-type and hybrid pair has the same binding domain. Wild-type yusC and the yusC-metE hybrid both needed a high concentration of SAM (150 µM) to achieve maximum termination of ~80%. Wild-type yusC exhibited a high percent termination of ~40% in the absence of SAM (Figure 4.5A, filled circles). This suggests that the antiterminator of the yusC leader RNA cannot compete effectively with the terminator. However, when the yusC terminator domain was changed to the metE terminator domain, the yusC-metE hybrid construct exhibited a termination efficiency of ~15% in the absence of SAM (Figure 4.5A, open circles). The ~2.5-fold lower percent termination compared to wild-type yusC indicated that the metE terminator domain contributes significantly to the reduced termination efficiency in the absence of SAM. The downshift in the curve of the hybrid leader construct compared to the curve of the wild-type construct illustrates the effect of the terminator domain.

Wild-type metE exhibited ~80% termination in the presence of ~1 µM SAM (Figure 4.5B, filled squares). The metE-yusC hybrid exhibited a higher percent termination (98%) at the same concentration of SAM (Figure 4.5B, open squares), indicating that the yusC terminator domain contributes to the overall increased termination compared to wild-type metE. However, as both constructs have the same binding domain, high termination in the presence of a low concentration of SAM reflects the sensitivity of the metE binding domain. In the absence of SAM, wild-type metE exhibited a low percent termination of 7.0%, consistent with the physiological role of metE (Figure 4.5B, filled squares). When the metE terminator domain was changed to the yusC terminator domain, we observed an increase in termination efficiency to 66% in the absence of SAM (Figure 4.5B, open squares). The change in terminator domains thus resulted in a ~10-fold increase in termination efficiency in the absence of SAM compared to wild-type metE, which is seen as an upshift in the hybrid curve compared to the wild-type curve. Overall, these data indicate that the terminator/antiterminator domains contribute to the efficiency of termination in response to SAM, while the sensitivity to SAM is maintained by the binding domain.

Figure 4.5 Direct comparison of the terminator/antiterminator competition. In vitro transcription termination analyses for the wild-type and hybrid constructs. A. Wild-type yusC, filled circles; yusC-metE hybrid, open circles. B. Wild-type metE, filled squares; metE-yusC hybrid, open squares. Each wild-type and hybrid pair has the same binding domain but different terminator domains. Titration of the leader RNAs was performed in the presence of increasing concentrations of SAM (X-axis). Termination efficiency (Y-axis) is the amount of the terminated product relative to the sum of the terminated and readthrough products.

The effect of the binding domain on termination efficiency was analyzed by comparing the wild-type and hybrid leader RNA pairs in which the terminator domains were constant (Figure 4.6). Wild-type yusC exhibited a termination efficiency of ~40% in the absence of SAM (Figure 4.6A, filled circles). A change in the binding domain from yusC to metE resulted in higher termination (~60%) by the metE-yusC hybrid compared to wild-type yusC under the same conditions (Figure 4.6A, open squares). The key observation was that the metE-yusC hybrid approached ~100% termination in the presence of low SAM (~1 µM), while wild-type yusC required at least 150 µM SAM to reach ~90% termination. The higher termination of the metE-yusC hybrid in the absence of SAM may be due to ineffective competition of the yusC antiterminator domain with the terminator domain, while the increased termination efficiency in the presence of lower concentrations of SAM is due to the high sensitivity of the metE binding domain.

Wild-type metE exhibited low termination in the absence of SAM (~7%) and high termination (~100%) in the presence of very low concentrations of SAM (~1 µM) (Figure 4.6B, filled squares). A change in the binding domain from metE to yusC resulted in a modest 2-fold increase in termination (~14%) by the yusC-metE hybrid in the absence of SAM (Figure 4.6B, open circles). However, yusC-metE exhibited a termination efficiency of ~80% only in the presence of a much higher concentration of SAM (150 µM). These results suggest that the change in the binding domain resulted in a loss of SAM sensitivity, consistent with that of wild-type yusC, while the presence of the metE terminator resulted in low termination efficiency in the absence of SAM. Together, the in vitro data suggest that although the termination efficiency is dictated by the terminator/antiterminator domains, the sensitivity to SAM is a function of the binding domain. Table 4.4 summarizes the in vitro analyses for the wild-type and hybrid leader RNAs.
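The logic of the domain-swap comparison can be made explicit by grouping the four constructs first by binding domain and then by terminator domain. The sketch below uses the approximate values quoted in the text for Figure 4.6 (termination without SAM, and the SAM concentration at which high termination was reported); the tuple layout and helper name are illustrative assumptions, and this is not a reproduction of Table 4.4.

```python
from collections import defaultdict

# (construct, binding domain, terminator domain,
#  approx. % termination without SAM,
#  approx. SAM concentration in µM at which high termination was reported)
CONSTRUCTS = [
    ("metE", "metE", "metE", 7, 1),
    ("yusC", "yusC", "yusC", 40, 150),
    ("metE-yusC", "metE", "yusC", 60, 1),
    ("yusC-metE", "yusC", "metE", 14, 150),
]


def group(key_index: int, value_index: int) -> dict:
    """Collect one column of CONSTRUCTS, keyed by another column."""
    grouped = defaultdict(list)
    for row in CONSTRUCTS:
        grouped[row[key_index]].append(row[value_index])
    return dict(grouped)


# SAM requirement tracks the binding domain.
sam_by_binding = group(key_index=1, value_index=4)
# Termination in the absence of SAM tracks the terminator domain.
baseline_by_terminator = group(key_index=2, value_index=3)

print("SAM (µM) for high termination, by binding domain:", sam_by_binding)
print("% termination without SAM, by terminator domain:", baseline_by_terminator)
```

Grouped this way, constructs sharing a binding domain share a SAM requirement, and constructs sharing a terminator domain fall in the same range of termination without SAM, which mirrors the conclusion drawn above.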

Figure 4.6 Direct comparisons of the binding domains. In vitro transcription termination analyses for the wild-type and hybrid constructs. A. Wild-type yusC, filled circles; metE-yusC hybrid, open squares. B. Wild-type metE, filled squares; yusC-metE hybrid, open circles. Each wild-type and hybrid pair has different binding domains but the same terminator domains. Titration of the leader RNAs was performed in the presence of increasing concentrations of SAM (X-axis). Termination efficiency (Y-axis) is the amount of the terminated product relative to the sum of the terminated and readthrough products.

Table 4.4 In vitro analysis of the wild-type and hybrid metE and yusC leader RNAs

a Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. Values are indicated as means ± the standard deviations for two assays.
b Termination efficiency in the absence of SAM.
c Termination efficiency in the presence of 1 µM SAM.

The length of the metK upstream (US) box sequence affects expression of the metK-yusC hybrid leader RNAs

The metK US box sequence was fused upstream of the helix 1 region of the yusC leader RNA. The yusC promoter and transcriptional start-site (+1) were replaced by the metK promoter and transcriptional start-site. A total of 8 metK-yusC-lacZ fusion constructs (my1-my8, Table 4.5) were analyzed using in vivo expression assays and compared to the wild-type metK-lacZ and yusC-lacZ fusion constructs (Figures ). In agreement with previous results, expression of the wild-type metK-lacZ fusion was not repressed in minimal medium during growth in the presence of methionine (when SAM pools are high) (Figure 4.7, filled squares), and did not increase during methionine starvation (when SAM pools are low) (Figure 4.7, open squares; refer to section 2.3.1, Chapter 2). Expression of the wild-type yusC-lacZ fusion construct was induced during growth in the absence of methionine (Figure 4.7, open triangles), and was relatively high during growth in the presence of methionine (Figure 4.7, filled triangles). These results indicate that yusC is relatively less tightly repressed in the presence of SAM.

Figure 4.7 In vivo expression analysis of the wild-type metK and yusC leader RNAs. S box gene-lacZ fusions were integrated in single copy in strain BR151 (lys-3 metB10 trpC2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 5 h after resuspension. Squares, metK-lacZ; circles, yusC-lacZ. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

Table 4.5 Sequences of the metK and yusC hybrid leader RNA constructs (a, b, c)

metK-yusC 1 (my1; with partial US box)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagcgga
tatttctcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgtt

metK-yusC 2 (my2; with longer US box)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagcgga
tactcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaa

metK-yusC 3 (my3; without US box)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgtttctc
ttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaaatggt

metK-yusC 4 (my4; my1 with US1)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagatta
tatttctcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgtt

metK-yusC 5 (my5; my2 with US1)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagatta
tactcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaa

metK-yusC 6 (my6; my3 with AG upstream of TTT at start of yusC helix 1)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagtttc
tcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaaatg

metK-yusC 7 (my7; my3 with AG upstream of T at start of yusC helix 1)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagtctc
ttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaaatggt

metK-yusC 8 (my8; my6 + CGG)
taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagcggt
ttctcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaa

a +1 indicates the metK transcriptional start-site, G.
b Underlined sequence indicates the metK US box region fused to the yusC sequence.
c my, abbreviation for metK-yusC.
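Because the eight constructs in Table 4.5 share the same upstream sequence and the same downstream yusC-derived sequence, the segment that differs between them can be isolated programmatically. The sketch below works on excerpts of the Table 4.5 sequences (from "gatatttcattg" through "caagagagg"); the choice of excerpt boundaries and of the downstream anchor string is an assumption made for this illustration, and the extracted segments are not claimed to correspond exactly to the US box residues described in the text.

```python
import os

# Downstream anchor shared by all eight constructs (assumed for illustration).
ANCHOR = "ctcttatcaagagagg"

# Excerpts of the Table 4.5 sequences spanning the variable junction.
EXCERPTS = {
    "my1": "gatatttcattgagcggatatttctcttatcaagagagg",
    "my2": "gatatttcattgagcggatactcttatcaagagagg",
    "my3": "gatatttcattgtttctcttatcaagagagg",
    "my4": "gatatttcattgagattatatttctcttatcaagagagg",
    "my5": "gatatttcattgagattatactcttatcaagagagg",
    "my6": "gatatttcattgagtttctcttatcaagagagg",
    "my7": "gatatttcattgagtctcttatcaagagagg",
    "my8": "gatatttcattgagcggtttctcttatcaagagagg",
}

# Longest prefix common to every excerpt (character-wise comparison).
shared_upstream = os.path.commonprefix(list(EXCERPTS.values()))

# Variable segment: everything between the shared prefix and the anchor.
junctions = {
    name: seq[len(shared_upstream):seq.index(ANCHOR)]
    for name, seq in EXCERPTS.items()
}

for name, segment in sorted(junctions.items(), key=lambda item: len(item[1])):
    print(f"{name}: {segment} ({len(segment)} nt)")
```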

The metk-yusc1 construct contained the first 9 residues of the metk US box sequence (including the metk +1) (Table 4.5, my1) and showed an expression profile similar to that of the wild-type metk-lacz fusion during methionine starvation (compare Figure 4.7, squares, to Figure 4.8, triangles). β-galactosidase activity for the metk-yusc1-lacz fusion was ~2.0-fold higher than that of wild-type metk-lacz after 5 h of growth in both the presence and the absence of methionine. The metk-yusc2 construct contained the first 17 residues of the metk US box sequence (Table 4.5, my2) and showed a ~1.2-fold increase in β-galactosidase activity during growth in the presence of methionine and a ~1.5-fold increase during growth in the absence of methionine, compared to metk-yusc1-lacz (Figure 4.8, squares). Both hybrid constructs failed to exhibit an increase in expression during growth in the absence of methionine, consistent with the expression of wild-type metk-lacz. These results indicate that including the metk US box sequence upstream of the yusc leader RNA and changing the native yusc promoter to the metk promoter sequence resulted in a loss of response to methionine starvation in vivo.
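The fold differences quoted here are simple ratios of β-galactosidase activities. As a minimal sketch of that arithmetic (the function name and the Miller-unit values are hypothetical and are not data from Figure 4.7 or Figure 4.8):

```python
def fold_change(sample_mu: float, reference_mu: float) -> float:
    """Return sample activity divided by reference activity (both in Miller units)."""
    if reference_mu <= 0:
        raise ValueError("reference activity must be positive")
    return sample_mu / reference_mu


# Hypothetical 5-h activities in Miller units; placeholders, not measured values.
wild_type_mu = 10.0
hybrid_mu = 20.0

print(f"{fold_change(hybrid_mu, wild_type_mu):.1f}-fold relative to wild type")
```

The same ratio, taken between the -M and +M samples of a single fusion, gives the induction observed during methionine starvation.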

Figure 4.8 In vivo analyses of the metk-yusc hybrid leader RNA fusions. S box leader gene-lacz fusions were integrated in single copy in strain BR151 (lys-3 metb10 trpc2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 5 h after resuspension. Triangles, metk-yusc1-lacz; squares, metk-yusc2-lacz; circles, metk-yusc3-lacz; diamonds, metk-yusc4-lacz; inverted triangles, metk-yusc5-lacz. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

The metk-yusc3 construct contained only the metk promoter sequence and the metk transcriptional start-site (G+1) upstream of the yusc leader RNA helix 1, while the metk US box sequence was omitted (Table 4.5, my3). The metk-yusc3-lacz fusion exhibited very low β-galactosidase activity (1-2 Miller units) (Figure 4.8, circles). This result indicated that the metk US box sequence is important for expression in vivo, and that the metk promoter sequence, along with the metk transcriptional start-site, was not sufficient for metk-yusc3 expression in vivo. In the metk-yusc3 construct, the +1 position is fused directly to three consecutive uridines of the yusc helix 1 sequence. We speculated that this sequence caused the RNAP to dissociate from the template (due to an insufficient number of purines near the transcriptional start-site), subsequently resulting in no expression (Figure 4.8, circles). It is possible that the RNAP requires the metk A+2 and G+3 positions for proper transcription, in addition to the transcriptional start-site (G+1). We therefore generated hybrid constructs in which the first 3 positions of the metk US box sequence were fused upstream of either 3 or 1 uridines in the helix 1 region of yusc, generating the metk-yusc6 and metk-yusc7 constructs, respectively (Table 4.5). In vivo expression assays showed that adding the +2 and +3 positions (A and G) increased the β-galactosidase activity by ~ fold for the two hybrid constructs, relative to the metk-yusc3-lacz fusion construct (Figure 4.9, inverted triangles and circles).

Figure 4.9 In vivo expression of metk-yusc hybrid leader constructs. S box leader gene-lacz fusions were integrated in single copy in strain BR151 (lys-3 metb10 trpc2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 5 h after resuspension. Inverted triangles, metk-yusc6-lacz; circles, metk-yusc7-lacz; diamonds, metk-yusc8-lacz. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

The metk-yusc8 construct was generated by adding 3 more nucleotides (to include the first 6 positions, 5′-GAGCGG-3′) of the metk US box sequence upstream of the 3 uridines of yusc. During growth in the absence of methionine, expression of metk-yusc8-lacz was ~20-fold higher than that of metk-yusc3-lacz and ~5.0-fold higher than that of the metk-yusc6 and metk-yusc7 fusion constructs (Figure 4.9, diamonds). These results suggest that the in vivo activity of the metk-yusc hybrid construct was restored by adding 3 additional metk US box positions. The metk phylogenetic analysis revealed that the invariant C4, G5 and G6 positions are part of a conserved core region and are important for metk expression in vivo (see section 2.3.3, Chapter 2). It is interesting to note that the expression of the metk-yusc8-lacz construct in the absence of methionine was 1.5-fold higher than in the presence of methionine after 5 h of growth (Figure 4.9, comparing open diamonds to filled diamonds). This result suggests that, in contrast to wild-type metk, the metk-yusc8 construct exhibits some level of regulation during methionine starvation.

Mutating the conserved core region within the US box sequence (US1 mutation C4G5G6 → A4U5U6; see Chapter 2) resulted in very low expression of the metk-yusc4-lacz (3-4 Miller units) and metk-yusc5-lacz (0 Miller units) fusion constructs during growth in the presence and absence of methionine (Figure 4.8, diamonds and inverted triangles). We have previously predicted that these positions from the metk US box conserved core region play a crucial role in metk regulation, most likely at the level of transcript stability (Chapter 2). Overall, the metk-yusc results indicate that the US box sequence is required for efficient expression in vivo.

The metk US box sequence contributes to the termination efficiency of the metk-yusc hybrid constructs

In vitro transcription termination assays were performed to analyze the efficiency of termination at the yusc leader region terminator for each hybrid construct in response to SAM. Consistent with the results described in Chapter 2, wild-type metk failed to respond to SAM in vitro (section 2.3.9, Chapter 2). Percent termination did not increase upon addition of SAM (Figure 4.10, lanes 1 and 2, bands RT1 and T1). metk-yusc1 and metk-yusc2 exhibited high constitutive readthrough in the absence and presence of SAM (Figure 4.10, lanes 3, 4 and 5, 6, respectively; RT2, T2). Like wild-type metk, metk-yusc1 and metk-yusc2 failed to respond to changing levels of SAM in vitro and showed very low termination efficiency (1%). A strong pause band (~112 b) was observed for the metk-yusc1 and metk-yusc2 hybrid constructs but not for the wild-type metk control. Previous studies of wild-type yusc also showed no evidence of such a pause band (data not shown).
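The legend of Figure 4.10 defines termination efficiency as the amount of terminated product relative to the sum of the terminated and readthrough products. A minimal Python sketch of that calculation follows; the function name and the band intensities are hypothetical placeholders, not values quantified from the gel.

```python
def percent_termination(terminated: float, readthrough: float) -> float:
    """Return 100 * T / (T + RT) for one lane, given the two band intensities."""
    total = terminated + readthrough
    if total <= 0:
        raise ValueError("no transcript signal in this lane")
    return 100.0 * terminated / total


# Hypothetical band intensities (arbitrary units) for a lane without and with SAM.
minus_sam = percent_termination(terminated=1.0, readthrough=99.0)
plus_sam = percent_termination(terminated=1.0, readthrough=99.0)

print(f"-SAM: {minus_sam:.0f}%  +SAM: {plus_sam:.0f}%")
```

A construct that responds to SAM would show a higher value in the +SAM lane; equal values in both lanes, as in this placeholder example, correspond to no response.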


Figure 4.10 In vitro transcription termination analysis. In vitro transcription termination assays of the wild-type metk and metk-yusc hybrid constructs in response to SAM. SAM was added at a final concentration of 2.5 mM. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. RT1, readthrough band of wild-type metk; T1, terminated band of wild-type metk; RT2, readthrough band of the metk-yusc hybrid transcripts; T2, terminated band of the metk-yusc hybrid transcripts; P, pause band.

In contrast to metk-yusc1 and metk-yusc2, metk-yusc3 (which contained the G+1 position fused to three consecutive uridines of the yusc helix 1 sequence) failed to show any transcripts (Figure 4.10, lanes 7 and 8). Attempts to improve the transcription yield (by changing the ratio of the nucleotides added to the reaction) were unsuccessful. Similarly, the metk-yusc6 and metk-yusc7 constructs, which contained two additional positions of
