Escherichia coli Lactose Operon

Agnes Ullmann, Pasteur Institute, Paris, France

The study of the Escherichia coli lactose operon laid the foundation of modern molecular biology. It contributed to the elaboration of the concept of genetic regulation, proposed by Jacob and Monod almost 50 years ago, a model that survives essentially unchanged. The structure of the operon, consisting of structural and regulatory genes, has been worked out, and the regulatory response of these genes to small molecules such as inducer, glucose and cyclic AMP has been elucidated. Study of the regulation of the lactose operon led to the discovery of messenger ribonucleic acid (mRNA), to the identification of the Lac repressor and to the development of the theory of allostery. The lactose genes and their derivatives provide tools for a wide range of current applications in many fields of biology; they are the most commonly used reporter genes in the analysis of developmentally regulated systems, in the study of mutation frequencies and in functional genomics.

Introductory article

Article Contents: Historical Development of the lac Paradigm; The lac Structural Genes; The Repressor; The Operator; The Inducer; Catabolite Repression Control of lac; The lac Operon Components as Tools

Online posting date: 15th March 2009

Historical Development of the lac Paradigm

Exploration of the lactose (lac) system was started in the late 1940s at the Pasteur Institute in Paris by Jacques Monod and his collaborators. From the beginning of the twentieth century it had been known that certain microbial enzymes were formed only in the presence of their specific substrate. This phenomenon, called enzymatic adaptation, was later redefined and named enzyme induction. It should be emphasized that at the time the structure of deoxyribonucleic acid (DNA) was not known, little was known about the structure of proteins and nothing was known of their biosynthesis.
Finding a lactose-fermenting (Lac⁺) mutant of an Escherichia coli Lac⁻ strain allowed Monod to conclude that the ability of the Lac⁺ mutant to utilize lactose was due to the formation of an adaptive enzyme, β-galactosidase (called lactase at that time). These results raised the fundamental question of the relation between gene, enzyme and inductive substrate. The very nature of enzyme induction was elucidated in the early 1950s, when it was shown that the inducing effect of the substrate (lactose) entailed total biosynthesis of β-galactosidase from amino acids.

See also: Monod, Jacques Lucien; Escherichia coli as an Experimental Organism

ELS subject area: Genetics and Molecular Biology

How to cite: Ullmann, Agnes (March 2009) Escherichia coli Lactose Operon. In: Encyclopedia of Life Sciences (ELS). John Wiley & Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0000849.pub2

Most of the major concepts of gene expression and regulation were established using the lactose system of E. coli. β-galactosidase, which splits lactose into glucose and galactose, could easily be assayed quantitatively thanks to a chromogenic substrate, o-nitrophenyl-β-D-galactoside (ONPG). Furthermore, the development of galactoside analogues, mainly thiogalactosides, which served to dissociate enzyme activity from enzyme induction (gratuitous inducers), permitted a straightforward analysis of the system. The discovery of bacterial conjugation in 1946 by Lederberg, and the refinement of this technique in 1956 by Wollman, Jacob and Hayes, permitted an early genetic analysis of the genes determining lactose metabolism.

See also: Escherichia coli and the Development of Bacterial Genetics; Lederberg, Joshua

By the late 1950s a series of lac⁻ mutations had been obtained: most of these mutations mapped to a single locus of the E.
coli chromosome, defining the region now known as the lac locus, which contains three structural genes, lacZ, lacY and lacA, coding for β-galactosidase, lactose permease and thiogalactoside transacetylase, respectively (see later). The discovery of a new class of genetic elements, the regulatory genes, the demonstration that the expression of both the lacZ and lacY genes is negatively regulated by a repressor and, finally, the hypothesis of an obligatory intermediate, the messenger ribonucleic acid (mRNA), in the transfer of structural information from gene to protein, culminated in the concept of the operon. The work on the lac system became the foundation of the new field of molecular biology. By the end of the 1960s, the main problems posed by the expression of lac genes in E. coli had been solved. These bold new ideas were subsequently applied to a large number of different systems. Our present molecular understanding of gene expression and regulation can be unambiguously traced back to the lactose system.

See also: History of Molecular Biology

ENCYCLOPEDIA OF LIFE SCIENCES © 2009, John Wiley & Sons, Ltd. www.els.net

The lac Structural Genes

The three structural genes of the lac locus, lacZ, lacY and lacA, code for β-galactosidase, lactose permease and thiogalactoside transacetylase, respectively, as shown in Figure 1.

β-galactosidase

Since the early studies of enzyme induction, the E. coli β-galactosidase has been associated with a wide range of applications in various fields of molecular biology. It is a hydrolytic transglycosidase, specific for the β-D-galactopyranoside configuration, and can act as a hydrolase as well as a transferase. The enzyme is active only as a tetramer. It consists of four identical subunits, each having a molecular weight of approximately 120 000. Owing to its high turnover number (about 6000 mol ONPG hydrolysed s⁻¹ mol⁻¹ enzyme), as little as one molecule of β-galactosidase per bacterial cell can be accurately determined. The amino acid sequence of β-galactosidase was determined before DNA sequencing techniques had been developed; it was then the largest protein sequence ever elucidated. According to this sequence, the protein contains 1021 amino acid residues, corresponding to a subunit molecular weight of 116 248. When the nucleotide sequence of the lacZ gene was determined in 1983, the amino acid sequence deduced from the DNA sequence differed at only 10 residues, and the β-galactosidase subunit was predicted to consist of 1023 residues.

See also: Peptide Sequencing by Edman Degradation

An interesting feature of the lacZ gene, called α- and ω-complementation, was revealed when deletion mutants involving quite large segments of the gene were isolated: an inactive deletion mutant lacking either the N- or the C-terminal region of β-galactosidase (the α- and ω-peptides, respectively) can complement another inactive deletion mutant that contains the region missing in the first.
This complementation represents noncovalent reassociation of complementary fragments, which then reassemble into an enzymatically active tetrameric structure.

Figure 1 The lac structural genes: lacZ (β-galactosidase), lacY (lactose permease) and lacA (transacetylase).

Three-dimensional structure of β-galactosidase

β-galactosidase had been obtained in crystalline form in the early 1950s, but the resolution of its X-ray structure had to wait 40 years. The crystal structure at 2.5-Å resolution represents the longest polypeptide chain for which an atomic structure has been determined. The crystal structure shows β-galactosidase to be a tetramer with dimensions of roughly 175 × 135 × 90 Å along the respective 2-fold axes. The monomer consists of five compact domains and a relatively extended 50-residue N-terminal segment, corresponding to the α-peptide (Figure 2). The X-ray structure of β-galactosidase provides a structural rationale for both α- and ω-complementation. The N-terminal segment, representing the α-peptide, participates in the formation of a subunit interface, which in turn allows the formation of the active site, made up of elements from two different subunits. The C-terminal third of the polypeptide chain forms a well-defined compact domain, corresponding to the ω-peptide, and represents an independent folding unit, as had been suggested earlier from complementation studies.

See also: Macromolecular Structure Determination by X-ray Crystallography; Proteins: Fundamental Chemical Properties

Lactose permease

Among the mutants isolated by Monod in the late 1940s, the so-called cryptic mutants were able to synthesize β-galactosidase but unable to metabolize lactose. When, in the mid-1950s, gratuitous inducers became available, the mystery of the cryptic mutants was solved.
Using a radioactive gratuitous inducer, methyl-β-D-thiogalactoside (TMG), Monod and coworkers showed, in 1956, that wild-type lac⁺ cells accumulate labelled TMG about 100-fold against a concentration gradient, the accumulated amount representing about 2% of the bacterial dry weight. They also showed that accumulation of TMG was energy-dependent and reversible. Labelled TMG accumulated in induced, but not in uninduced, wild-type bacteria. In addition, no accumulation was observed in the cryptic mutants. The conclusion was clear: the factor responsible for TMG accumulation could only be a specific protein, controlled by a gene, lacY, distinct from lacZ. The synthesis of this protein, named permease, was induced by β-galactosides simultaneously with that of β-galactosidase; the discovery of permease therefore played an important role in the development of the operon model (see later).

The isolation of the lacY gene product was achieved 10 years after the discovery of permease. It is a highly lipophilic protein with an apparent subunit molecular weight of approximately 30 000. The mechanism of transport has been elucidated: Lac permease transports various α- or β-galactosides into the E. coli cell against a concentration gradient by cotransporting one proton together with one galactoside molecule. The DNA of the lacY gene was sequenced in 1980 and, according to this sequence, Lac permease contains 417 amino acid residues. Topological analysis suggests that Lac permease consists of 12 lipophilic α-helices, and that both the N- and C-termini lie on the inside of the E. coli cell. Lac permease appears to be active as a monomer.

See also: Bacterial Cytoplasmic Membrane;

Bacterial Membrane Transport: Organization of Membrane Activities

Thiogalactoside transacetylase

The third gene of the lac locus, lacA, codes for an enzyme that acetylates some thiogalactosides and is named thiogalactoside transacetylase. The lacA gene has been sequenced: it includes 203 codons. Biochemical studies showed that the enzyme is a homotrimer and a member of a large family of acetyltransferases that O-acetylate dissimilar substrates but share limited sequence homology. The in vivo function of thiogalactoside transacetylase is still unclear.

Figure 2 Three-dimensional structure of β-galactosidase. Labels in the figure: α-peptide, residues 1–51; domain 1, residues 51–217; domain 2, residues 220–334; domain 3, residues 334–627; domain 4, residues 627–736; domain 5, extending to residue 1023. (a) Ribbon representation of the β-galactosidase tetramer showing the largest face of the molecule. Contacts between the red/green and blue/yellow dimers form the long interface. Contacts between the red/yellow and blue/green dimers form the activating interface. Formation of the tetrameric particle results in two deep clefts that run across opposite faces of the molecule. Each contains two active sites. (b) Ribbon diagram of the blue/green dimer viewed down the molecular 2-fold axis, showing the composition of the activating interface. Residues 1–50 from each chain, which form the α-complementation region (see text), are shown in red.
The interface includes contacts between the respective complementation peptides, between two helices from the respective monomers that pack together to form a four-helix bundle, and between an extended loop (residues 272–288) from each monomer that reaches across the interface and extends into the active site region of the neighbouring monomer, stabilizing the active site structure. (c) Stereo ribbon diagram of the β-galactosidase monomer showing the domain organization of the chain. Residues corresponding to successive domains are coloured in successive spectral colours. Reprinted from Jacobson RH, Zhang XJ, DuBose RF and Matthews BW (1994) Three-dimensional structure of β-galactosidase of E. coli. Nature 369: 761–766, with permission.

The Repressor

By the end of the 1950s, knowledge of the steps involved in bacterial conjugation (transfer of a part of the chromosome of one bacterium to another) allowed a new approach to the study of the regulation of gene expression. It was already known that in E. coli the synthesis of β-galactosidase depends on the lacZ gene (z) and on an additional genetic factor, known to exist in two forms: the wild-type form, called i⁺, corresponding to the inducible phenotype (i.e. enzyme synthesis occurs only if inducer is added to the bacteria), and the constitutive i⁻ form (i.e. the enzyme is synthesized in the absence of inducer). Previous genetic analysis had revealed that the z and i genes are closely linked on the chromosome. All this provided the basis for the epoch-making experiment of Pardee, Jacob and Monod (1959), known as the PaJaMo experiment, which led to three major concepts of gene expression: the repressor, negative control of enzyme induction and mRNA. In the PaJaMo experiment, synthesis of β-galactosidase was measured during conjugation of male bacteria (Hfr) with females (F⁻).
When a LacZ⁺ Hfr strain of the genotype i⁺ z⁺ was crossed with an F⁻ strain of the genotype i⁻ z⁻, β-galactosidase was immediately synthesized, but within a few minutes enzyme synthesis stopped. If at this time inducer was added, enzyme synthesis resumed (Figure 3). The interpretation of the experiment was straightforward: following chromosomal transfer, the z⁺ gene can be expressed in an i⁻ cytoplasm, but the concomitant transfer of the i⁺ gene results in the arrest of enzyme synthesis. In a parallel experiment, when the direction of the cross was reversed, i.e. an Hfr strain i⁻ z⁻ was crossed with an F⁻ i⁺ z⁺, no enzyme synthesis occurred. Thus the i⁺ gene in the recipient was dominant over the i⁻ gene of the donor. This led to the hypothesis that the i⁺ gene produces a substance that shuts off enzyme synthesis. It was named repressor.

See also: Bacterial Chromosome; Bacterial Genetic Exchange; Jacob, François; Monod, Jacques Lucien

During conjugation the lacZ gene of the Hfr was immediately expressed in the diploid. At the time, it was generally believed that genes produce stable structures, which accumulate in the cytoplasm, and, as ribosomal RNA

(rRNA) was the only known RNA, it was assumed to act as a template for protein synthesis. In the PaJaMo experiment a stable template could not explain the rapid kinetics of enzyme synthesis. A new hypothesis was therefore proposed: the structural gene produces a metabolically unstable RNA, named mRNA, which in turn is decoded on the ribosomes to yield the corresponding protein. The model of gene regulation was born: it stated that in bacteria, transcription, i.e. mRNA production, is regulated in a negative manner by repressors. In the lac system inducer is needed to counteract the action of the repressor.

See also: Messenger RNA in Prokaryotes

Figure 3 The PaJaMo experiment. A male (Hfr) lac i⁺ z⁺ strain was conjugated with a female (F⁻) lac i⁻ z⁻ strain in the absence of inducer. At the time indicated, inducer was added to one of the cultures, whereas the other one received no addition. β-galactosidase activity (units ml⁻¹) was measured as a function of time (hours); the two curves are labelled 'Inducer added' and 'No inducer'. Adapted from Pardee AB, Jacob F and Monod J (1959) The genetic control and cytoplasmic expression of inducibility in the synthesis of β-galactosidase by E. coli. Journal of Molecular Biology 1: 165–178, with permission.

Two important questions still awaited an answer: what is the nature of the repressor and where is its site of action? The PaJaMo experiments suggested that E. coli does not contain more than 10 molecules of Lac repressor per cell, and its isolation was therefore a great challenge; success provided evidence that the repressor is a protein. The isolation of mutants that overproduce Lac repressor several hundred-fold opened the way for further biochemical studies. Both the amino acid and the DNA sequences of the Lac repressor were determined in the 1970s. Lac repressor is a tetramer, containing 360 amino acid residues per monomer, and each monomer binds one molecule of inducer.
See also: Repression Mechanism

A wide-ranging genetic analysis of the lacI gene yielded over 4000 single amino acid replacements of known phenotype between codons 2 and 329. The determination of the three-dimensional structure of the Lac repressor (Figure 4), both in the presence and in the absence of inducer, fully confirmed earlier biochemical studies. It also made it possible to combine structural and genetic data, to map all the amino acid substitution mutations on the three-dimensional structure of the repressor and to provide a clear picture of repressor function.

Figure 4 The crystal structure of the Lac repressor–DNA complex constructed from the available Protein Data Bank structures (Lewis et al., 1996) by modelling procedures (Balaeff et al., 2004). In the V-shaped tetrameric Lac repressor each of the two dimers (drawn as purple protein cartoon) binds with high specificity to a 21-base pair operator DNA fragment (drawn as blue tubes and red spheres) through the N-terminal 62-residue-long headpiece (drawn in green). Dimer–dimer assembly occurs via a compact four-helical bundle formed by 18 C-terminal residues from each subunit (drawn as orange tubes). Adapted from Balaeff A, Mahadevan L and Schulten K (2004) Structural basis for cooperative DNA binding by CAP and Lac repressor. Structure 12: 123–132, with permission from Elsevier.

The Operator

In the late 1950s it was established that the lac proteins (β-galactosidase, permease and acetylase) were synthesized at the same rates after addition of inducer, and that in i⁻ mutants the constitutive synthesis was also coordinate. These observations led Jacob and Monod, in 1961, to develop the concept of two kinds of genes: structural genes, which code for the synthesis of proteins, and regulatory genes, which were postulated to control the expression of proteins through the intermediary of repressors.
They also postulated that there was a unique genetic structure associated with the group of structural genes, the site of interaction with the repressor, which they called the operator. Mutations in the operator site would lead to a loss of sensitivity to the repressor, thus yielding constitutive mutants. In a diploid strain (containing two copies of the lac locus) the operator constitutive mutants (oᶜ) were cis-dominant, i.e. acting only on the adjacent lac genes. A new concept was born, the operon, defined as a unit of coordinate transcription,

composed of structural genes connected by an operator, the target of the action of a repressor, produced by a regulator gene.

See also: Genes: Definition and Structure

Several years later it became necessary to postulate an additional genetic element, a specific DNA region, distinct from the operator, where transcription of mRNA starts. This region was named the promoter. Figure 5 illustrates the lactose operon in the repressed and induced states.

Figure 5 Diagram of the lactose operon (gene order: P lacI, P, O, lacZ, lacY, lacA) in the repressed (a) and induced (b) states. (a) Binding of the LacI repressor to the operator prevents transcription of lacZ, lacY and lacA. (b) Binding of the inducer inactivates the repressor, allowing transcription of lacZ, lacY and lacA and synthesis of β-galactosidase, permease and transacetylase. Synthesis of the lactose operon proteins, genetically determined by the structural genes (lacZ, lacY and lacA), is blocked by the LacI repressor synthesized by the regulator gene, lacI. The operator (O) is the site of specific interaction with the repressor. The repressor can be inactivated by the inducer, thus allowing transcription to take place at the promoter. Inherent to the operon model is the assumption that transfer of genetic information from gene to protein involves a short-lived mRNA. Reprinted with permission from Jacob F and Monod J (1961) Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology 3: 318–356. Copyright © 1961 Academic Press. Drawing courtesy of Jean-Marc Ghigo.

The operator was isolated as a 27-bp oligonucleotide protected from DNase digestion by bound Lac repressor. The specificity of binding of Lac repressor to the operator is high, the equilibrium constant being approximately 10⁻¹³ mol L⁻¹. The three-dimensional structure of the Lac repressor complexed with a 21-bp synthetic operator DNA has also been determined.
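To put the binding constant quoted above in perspective, the following minimal Python sketch estimates how much of the time a single operator would be occupied. It combines the equilibrium constant given here (about 10⁻¹³ mol L⁻¹) with the estimate of about 10 repressor molecules per cell mentioned in the section on the repressor. The cell volume of 10⁻¹⁵ L, the simple one-site binding formula and the function names are illustrative assumptions, not values or methods taken from this article; competition from nonspecific DNA binding inside the cell is ignored.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration(molecules_per_cell, cell_volume_litres):
    """Convert a copy number per cell into a molar concentration."""
    return molecules_per_cell / (AVOGADRO * cell_volume_litres)


def fraction_bound(free_repressor_molar, kd_molar):
    """Occupancy of one site for simple 1:1 binding: [R] / ([R] + Kd)."""
    return free_repressor_molar / (free_repressor_molar + kd_molar)


KD_OPERATOR = 1e-13        # mol/L, equilibrium constant quoted in the text
REPRESSORS_PER_CELL = 10   # upper estimate quoted in the text
CELL_VOLUME_L = 1e-15      # ASSUMPTION: roughly 1 femtolitre per E. coli cell

repressor_molar = molar_concentration(REPRESSORS_PER_CELL, CELL_VOLUME_L)
occupancy = fraction_bound(repressor_molar, KD_OPERATOR)

if __name__ == "__main__":
    print(f"Repressor concentration: {repressor_molar:.2e} mol/L")
    print(f"Estimated operator occupancy: {occupancy:.6f}")
```

Under these assumptions the repressor concentration is several orders of magnitude above the equilibrium constant, which is one way to see why so few repressor molecules per cell are enough to keep the operon shut off in the absence of inducer.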
In the crystal structure of the repressor–DNA complex, the four DNA-binding domains of the Lac repressor tetramer are bound to two independent, symmetric operator double helices (Figure 4).

See also: Binding Constants: Measurement and Biological Range

The Inducer

Until the late 1940s only the substrates of enzymes were known to serve as inducers of those enzymes. The synthesis of artificial β-galactosides made it possible to separate their role as enzyme substrates from their role as inducers. Monod's finding, in the early 1950s, that a β-galactoside could be an inducer but not a substrate led to the use of the term gratuitous inducer. The existence of nonsubstrate inducers ruled out any direct connection between the role of β-galactosides as substrates and their role as inducers. It was this discovery that led to the abandonment of the term enzymatic adaptation and the adoption of induced enzyme synthesis.

Another important discovery emerged from the study of induction: the concept of allostery. From the very beginning of the 1960s, one of Monod's major interests was how proteins recognize chemical signals, and in particular how the repressor recognizes both inducer and DNA. The interaction between inducer and repressor was extremely rapid, specific and entirely reversible. The study of induction suggested to Monod the theory of allostery: the repressor was postulated to possess two stereospecifically different sites, one for the inducer, the other for the operator DNA. Induction would be entirely due to reversible conformational alterations induced in the DNA-bound protein when it interacts with the inducer. Subsequently, the concept of allostery became a major biological generalization and one of the most important ideas to emerge from the study of the operon model. We know today that most mechanisms of cell signalling involve allosteric interactions.
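The regulatory logic of the operon model, including the two-site repressor postulated above, can be summarized as a small truth table. The Python sketch below is only an illustration under simplifying assumptions: it treats every component as either functional or not, it considers negative control alone (catabolite repression is ignored), and the function names are hypothetical. The second function models a partial diploid in order to reproduce the two dominance relationships described in the text: the diffusible repressor acts in trans, whereas an operator acts only on the adjacent genes (cis).

```python
def lac_genes_transcribed(repressor_functional, operator_functional, inducer_present):
    """Return True if lacZ, lacY and lacA are transcribed (negative control only)."""
    # Allostery: inducer bound at one site abolishes binding at the DNA site.
    repressor_can_bind_dna = repressor_functional and not inducer_present
    # An operator-constitutive (o^c) site is no longer recognized by the repressor.
    operator_occupied = repressor_can_bind_dna and operator_functional
    return not operator_occupied


def diploid_expression(copies, inducer_present):
    """Expression of each lac copy in a partial diploid.

    copies: list of (i_functional, o_functional) pairs, one per lac copy.
    The repressor is diffusible, so a single i+ allele serves every copy (trans);
    each operator controls only the structural genes next to it (cis).
    """
    repressor_functional = any(i_ok for i_ok, _ in copies)
    return [
        lac_genes_transcribed(repressor_functional, o_ok, inducer_present)
        for _, o_ok in copies
    ]


if __name__ == "__main__":
    for i_ok in (True, False):
        for o_ok in (True, False):
            for inducer in (False, True):
                on = lac_genes_transcribed(i_ok, o_ok, inducer)
                print(f"i{'+' if i_ok else '-'} o{'+' if o_ok else 'c'} "
                      f"inducer={inducer!s:5} -> transcribed={on}")
```

For example, `diploid_expression([(True, True), (False, True)], False)` models an i⁺/i⁻ diploid without inducer, and `diploid_expression([(True, False), (True, True)], False)` models an oᶜ/o⁺ diploid in which only the copy carrying the mutant operator is expressed.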
A great surprise came from the finding that in a lacZ⁻ lacY⁺ strain the permease was not induced by lactose,

suggesting that β-galactosidase is needed for the permease (and acetylase) to be induced. Since β-galactosidase can act both as a hydrolase and as a transferase, it can transfer the galactosyl moiety from the C4 position to the C6 position of glucose to form 1-6-β-D-galactoside-glucose, a lactose isomer called allolactose, which is the natural inducer of the lac operon.

Catabolite Repression Control of lac

Catabolite repression is a relatively recent term given to a phenomenon discovered almost a century ago by Dienert, and successively designated as adaptation, diauxie and glucose effect. An important observation was made by Karström in 1938, who showed that bacteria grown on glucose are unable to ferment other sugars, such as galactose, maltose or lactose, whereas bacteria grown on these sugars are able to grow on glucose. The first quantitative analysis of this phenomenon was performed by Monod in 1945. He grew bacterial cultures in the presence of two carbohydrates instead of one and found that, depending on the nature of the two carbohydrates, the growth curve exhibited two successive growth cycles, separated by a lag. Monod named this phenomenon diauxie. When glucose was combined with carbohydrates that are metabolized by adaptive enzymes, such as maltose or lactose, diauxic growth occurred invariably. Monod concluded that diauxic growth was the result of an inhibitory action of glucose on the formation of the specific adaptive enzyme. He later showed that glucose inhibits the synthesis of β-galactosidase.

See also: Group Translocation PEP:PTS

In 1961, it was observed that the products of the catabolism of glucose, rather than the glucose molecule itself, are responsible for the inhibitory effect of glucose on enzyme synthesis; hence a new name for the glucose effect, catabolite repression, was coined. The identification of 3′,5′-cyclic adenosine monophosphate (cAMP) in glucose-starved E.
coli cells, in 1965, suggested a possible role of this nucleotide in catabolite repression. Indeed, addition of cAMP to a glucose-grown culture could overcome catabolite repression of β-galactosidase synthesis. Further study of the mechanism of cAMP action led to the discovery of the cAMP receptor protein, also called catabolite activator protein (CAP), and of the corresponding gene, crp, as well as to the identification of the gene cya, which encodes adenylate cyclase, the enzyme responsible for cAMP synthesis. It was subsequently established that the cAMP–CAP complex binds to sites near the transcription starts of several catabolic genes or operons and activates their expression. The mechanism of interaction of CAP with RNA polymerase at these sites has also been elucidated.

See also: Bacterial Transcription Regulation

Figure 6 Schematic structure of the CAP–DNA complex. For the DNA, sharply bent by CAP, the bases and backbone are shown in red. The protein is represented as a ribbon diagram (blue) with two cAMP molecules (red) placed to indicate the binding sites on each subunit of CAP. Reprinted from Parkinson G, Gunasekera A, Vojtechovsky J et al. (1996) Aromatic hydrogen bond in sequence-specific protein DNA recognition. Nature Structural Biology 3: 837–841, with permission from Nature Publishing Group. Figure courtesy of Richard H. Ebright.

The sequence of the crp gene has been determined. It codes for a protein of 209 residues. CAP is a dimer and has one binding site for cAMP per monomer. The crystal structure of the CAP dimer complexed with two molecules of cAMP was solved by Steitz and coworkers. CAP has a modular structure: the N-terminal domain binds cAMP and the C-terminal domain binds DNA. The crystal structure of the CAP–DNA complex has also been determined, and one of the most salient features of the binding of CAP to the lac promoter region is the creation of a bend in the DNA of approximately 80°.
Recent in vitro structural studies suggest that the binding of CAP stabilizes the binding of the Lac repressor to the operator. Therefore, CAP acts not simply by activating transcription, but also by enhancing repression of the lac genes (Figure 6).

The positive control mediated by cAMP–CAP has often been associated with the effect of cAMP in relieving catabolite repression. This association is so widely assumed that often no distinction is made between the involvement of cAMP–CAP in efficient gene expression and its involvement in the modulation of catabolite repression. This model has, however, been challenged in the light of a number of results showing that catabolite repression of the lac operon (and also of other systems) can be modulated independently of cAMP. The mechanism(s) responsible for catabolite repression therefore still await clarification.

The lac Operon Components as Tools

β-galactosidase fusions

The observation that the N-terminal 23 residues of β-galactosidase can be replaced with other amino acids without affecting enzymatic activity enabled the production of fusions between the lacI and lacZ genes, which yielded a hybrid protein with both Lac repressor and β-galactosidase

activities. This opened the way for the construction of a variety of lac fusions, resulting in the formation of hybrid genes that specify hybrid proteins. Nowadays, a number of vectors for constructing lacZ fusions in vitro are commercially available. The sensitive detection of β-galactosidase activity with a colourless, soluble substrate, 5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-Gal), which forms an insoluble blue dye upon hydrolysis, has made the lacZ gene one of the most commonly used in vivo reporter genes. It provides a sensitive method of detecting genes that are subject to specific regulatory signals, of studying the localization of a protein to a given cellular compartment and of analyzing developmentally regulated genes, transgene expression, tissue-specific expression and other aspects of embryogenesis or developmental biology. The regulation, identification and localization of many proteins of various origins have been uncovered by using the β-galactosidase activity of fusion proteins as a marker.

During the past few decades, transgenic mice bearing a lacZ reporter gene have been used extensively to analyze gene expression patterns in different tissues through the developmental stages of the life cycle. To obtain transgenic mice, the lacZ gene is fused to the promoter of a targeted gene, then DNA is prepared and microinjected into fertilized mouse eggs. The expression of the transgene can easily be visualized in situ in tissue sections through its β-galactosidase activity, by using the chromogenic substrate X-Gal or the more recently introduced fluorogenic substrates, which allow sensitive and precise localization.

The explosion of genome sequence data has revealed that a large fraction of the open reading frames encoded by these genomes have no known biological function.
A major challenge remains to find efficient new techniques for investigating the function of these genes, an approach known as functional genomics, which includes the analysis of protein–protein interactions, central to most biological processes. At present, the most powerful approach used to select or screen for protein–protein interactions is based on two-hybrid methods. The method was originally developed by Fields and Song in 1989, and several yeast and bacterial two-hybrid systems have since been described; in general the two putative protein partners are fused to two interacting polypeptides, such as transcription factors or complementary domains of a specific protein. In most systems, the readout of hybrid protein association is coupled to screenable or selectable phenotypes, one of the most sensitive being LacZ expression.

See also: Genetic Engineering: Reporter Genes; Protein Domain Fusion

α-complementation

Perhaps the most widely exploited property of β-galactosidase is the phenomenon called α-complementation. This complementation involves noncovalent reassociation of complementary fragments of the β-galactosidase subunit polypeptide chain, which then reassemble into an enzymatically active tetrameric structure. In the case of α-complementation, restoration of enzyme activity occurs when the N-terminal 60 residues of β-galactosidase interact in vivo or in vitro with an inactive lacZ mutant protein (α-acceptor) that lacks the residues encoded by codons 11–41. This phenomenon was used in 1977 to develop new cloning vectors for identifying bacterial colonies containing recombinant DNA; the lac regulatory region and the first 60 codons of the lacZ gene were inserted into the DNA of phage M13. Bacteria producing the α-acceptor protein, when infected with the phage, yield active β-galactosidase by complementation and form blue plaques in the presence of X-Gal. Insertion of foreign DNA into the α-region of M13 interferes with α-complementation, giving rise to recombinants that form colourless plaques.
This simple test has made cloning in M13 a routine procedure. Intracistronic complementation of the lacZ gene has recently been adapted for use in eukaryotic cells. Complementation of relevant lacZ mutants in mammalian cells has been shown to permit analysis of cell fusion and detection of co-localized interacting proteins within single intact cells. Intracistronic complementation of β-galactosidase has also been used for direct assessment of specific protein dimerization interactions in a biologically relevant context. See also: Cell Hybrids

Lac repressor as a tool

The capacity of the Lac repressor to bind efficiently to the operator and to be released from it by inducer has frequently been used to modulate gene expression, or to select homozygous negative mutants in eukaryotic cells. The modular structure of the Lac repressor allows the creation of fusion proteins in which other domains, including eukaryotic nuclear localization domains and transcriptional activation domains, are fused to the N- and C-termini of the repressor without significant disruption of specific DNA binding. This property of the Lac repressor has been used to screen complex peptide libraries for direct interaction with a given receptor. A Lac repressor carrying a peptide fused to its C-terminus can still bind to the operator with high efficiency. This linkage allows enrichment for specific peptide ligands in the random population of peptides by affinity purification of the peptide-repressor-operator complexes with an immobilized receptor.

The commercially available Big Blue transgenic mouse mutation detection system provides a powerful approach for direct analysis of spontaneous and chemically induced mutations in vivo. The Big Blue mouse is transgenic for three genetic elements of the lactose operon: the lacI gene, the operator and the lacZ gene. The lacI gene is the target of mutagenesis. Cells with a mutated lacI gene produce defective Lac repressor; β-galactosidase is therefore synthesized and can be easily detected.
Cells with an unmutated lacI gene produce active repressor, and no β-galactosidase is produced. This system has been widely used to study the effects of chemical carcinogens on mutation frequencies. See also: Experimental Organisms Used in Genetics; Proteins: Affinity Tags

Given the wide range of applications with which the lac operon has been associated, one can anticipate that further developments will contribute to our understanding of many features of cellular regulation and organization.