Vital Statistics Derived from Complete Genome Sequencing (for E. coli MG1655)

We still consider the E. coli genome a fairly typical bacterial genome. Given the extensive information available about this organism and its lifestyle, the E. coli genome is a useful point of departure for subsequent discussion of prokaryotic genome diversity.

Comparative Size Distribution of Prokaryotic Genomes

Physical Form of the Genome vs. the Genetic Map

The circularity of the E. coli genetic map reflects the physical form of the DNA genome. All of the genes in the "core" genome of E. coli are coded in a single double-stranded DNA molecule that has no ends. (Contrast this with the linear genetic maps typical of eukaryotic chromosomes.) Although we sometimes refer to the DNA as being "circular", this distorts physical reality. If the genome were laid out in a perfect circle, the circumference would be about 1.6 mm, several hundred times the length of an E. coli cell. Clearly the DNA is not in a true circle while inside a cell, but is highly folded and compacted. So, while the sophisticated E. coli geneticist refers to the genetic map as circular, they describe the genomic DNA as being "covalently continuous".

Vital Statistics Derived from Complete Genome Sequencing (for E. coli MG1655)

The classic E. coli genetic map is arbitrarily divided into 100 "minutes", corresponding to 4,639 kbp of DNA containing 4,288 ORFs. The average size of an ORF is close to 1 kb. Therefore, approximately 90% of the DNA in the genome is coding sequence. Most genes are "single copy". Only a few very highly expressed genes are present in multiple, functionally equivalent copies. Fewer than half of these genes had been discovered by the techniques of "classical" bacterial genetics when the complete genome sequence was published (ref. 3).
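
The figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the original handout. The genome size, ORF count, map length and average ORF size come from the text. The rise of 0.34 nm per base pair (B-form DNA) and the 2 micrometer cell length are assumptions added here, and the function name is hypothetical.

```python
# Back-of-the-envelope genome arithmetic using the figures quoted in the text.
GENOME_BP = 4_639_000      # genome size from the text (4,639 kbp)
ORF_COUNT = 4_288          # number of ORFs from the text
MAP_MINUTES = 100          # length of the genetic map in "minutes"
AVG_ORF_BP = 1_000         # text: average ORF is "close to 1 kb"
RISE_NM_PER_BP = 0.34      # ASSUMPTION: B-form DNA rise per base pair
CELL_LENGTH_UM = 2.0       # ASSUMPTION: typical E. coli cell length


def genome_stats():
    """Return contour length, map scale and coding fraction as a dict."""
    contour_mm = GENOME_BP * RISE_NM_PER_BP * 1e-6  # nm -> mm
    return {
        "contour_length_mm": contour_mm,
        "fold_over_cell_length": contour_mm * 1000 / CELL_LENGTH_UM,
        "kbp_per_map_minute": GENOME_BP / 1000 / MAP_MINUTES,
        "coding_fraction": ORF_COUNT * AVG_ORF_BP / GENOME_BP,
    }


if __name__ == "__main__":
    for name, value in genome_stats().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the contour length comes out near 1.6 mm, each map minute spans about 46 kbp, and the coding fraction is about 0.92. The last value is an upper estimate because it rounds the average ORF up to a full 1 kb.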

Functional Distribution of E. coli ORFs

FUNCTIONAL CATEGORY                          # Proteins       %
DNA Replication, Recombination, Repair              115     2.7
Regulatory Proteins                                 133     3.1
Enzymes for Cell Structural Components              182     4.2
Translation                                         182     4.2
Physiological Responses to Environment              188     4.4
Energy Metabolism                                   243     5.7
Other Enzymes of Intermediary Metabolism            318     7.4
Biosynthetic Enzymes                                340     7.9
Transport Proteins                                  427    10.0
Other known                                         528    12.3
Unknown                                           1,632    38.1
TOTAL                                             4,288   100.0

Core Genome vs. Pan Genome

Lysogenic viruses, plasmids, and transposable elements are variably present among strains, and all are capable of horizontal gene transfer into other strain lineages. This has led to the concept that each individual strain of a bacterial species shares a common or "core" array of genes with all other strains of that species. However, each strain will also have a unique and variable set of adjunct genetic sequences, either in the bacterial chromosome itself or in extrachromosomal elements such as plasmids. These "adjunct" genetic sequences are referred to collectively as the "pan" genome. (Elsewhere the term "pan genome" often denotes the complete gene repertoire of a species, core plus adjunct; here it refers to the adjunct portion only.)

As a generalization, core genomic sequences are not subject to promiscuous horizontal gene transfer, and are frequently essential for the basic function of the cell. The genes for DNA Polymerase III and for ribosomal RNA would be considered part of the core genome of an E. coli strain.

The pan genome consists of sequences that a strain may have acquired by horizontal gene transfer. These sequences may be essential for the cell under certain special circumstances, but are not involved with basic cell functions. A plasmid carrying antibiotic resistance genes would be considered part of the pan genome.

"Chromosomal Islands" are relatively large, contiguous tracts of chromosomal genes that have evidently been acquired from another organism by lateral (horizontal) gene transfer. They are also considered part of the pan genome. The virulence of several pathogenic E. coli strains is associated with genes in chromosomal islands, in which case they are sometimes referred to as "pathogenicity islands". Characteristics that allow recognition of a chromosomal island include:
i.) the set of contiguous genes is not uniformly present in closely related strains.
ii.) the %GC of the island differs from that of the genome as a whole.
iii.) the pattern of synonymous codon preference is different from that of the majority of genes in the genome.
iv.) the DNA sequences at the island boundaries are often recognizably similar to sequences found in lysogenic virus genomes or in transposable elements.

Operons and Regulons

Operons are contiguous clusters of several related genes whose expression is coordinately regulated through transcription of a single polycistronic mRNA. The classic example is the Lac operon.

Regulons are sets of non-contiguous related genes whose expression is coordinately regulated even though they are transcribed from independent promoters onto multiple mRNAs. The classic example is the heat shock regulon.

Replication Strategy

Bidirectional replication is initiated from a unique origin (oriC at 84') with the terminus directly opposite the origin. Under conditions of rapid growth, bacteria use a strategy called dichotomous replication. This means that new cycles of DNA replication begin before the previous cycle is complete, allowing for a minimum cell doubling time (15 min.) that is less than the minimum time required to replicate the genome (60 min.). At rapid growth rates (doubling time less than 30 min.), dichotomous replication means that the copy number of genes located near the origin of replication is enriched relative to genes near the terminus. This is thought to explain why highly expressed genes are often located near the origin of replication: the position increases their relative copy number.

[Figure: Bacterial genome exhibiting dichotomous replication. Four origins of replication are shown as green dots and six replication forks as red arrows; the replication terminus is labeled.]

Another interesting aspect of genome organization observed in E. coli is that highly expressed genes are usually transcribed in the same direction that they are replicated. This is thought to mitigate conflicts between RNA polymerase (transcription) and replication complexes for access to the same DNA, allowing faster replication.
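
The gene-dosage effect of dichotomous replication described under Replication Strategy can be put into numbers. The sketch below uses the standard Cooper-Helmstetter relation, which the handout does not name: in a steady-state, exponentially growing culture, the average origin-to-terminus copy-number ratio is 2^(C/tau), where C is the time needed to replicate the genome and tau is the doubling time. The function names and the map_fraction parameter are hypothetical, and the default C of 60 min is the replication time quoted in the text. Treat this as a rough illustration, not a description of any particular strain.

```python
def ori_ter_ratio(doubling_min: float, c_min: float = 60.0) -> float:
    """Average origin:terminus copy-number ratio, 2 ** (C / tau).

    c_min defaults to the 60 min replication time quoted in the text.
    """
    if doubling_min <= 0 or c_min <= 0:
        raise ValueError("times must be positive")
    return 2 ** (c_min / doubling_min)


def relative_dosage(map_fraction: float, doubling_min: float,
                    c_min: float = 60.0) -> float:
    """Average copy number of a locus relative to the terminus.

    map_fraction is a hypothetical coordinate along one replication arm:
    0.0 at the origin, 1.0 at the terminus.
    """
    if not 0.0 <= map_fraction <= 1.0:
        raise ValueError("map_fraction must be between 0 and 1")
    return 2 ** (c_min * (1.0 - map_fraction) / doubling_min)


if __name__ == "__main__":
    for tau in (60.0, 30.0, 15.0):
        print(f"doubling time {tau:.0f} min: ori/ter = {ori_ter_ratio(tau):.1f}")
```

With the text's replication time of 60 min, a 30 min doubling time gives an average of four origin copies per terminus, and a locus halfway along a replication arm is present at twice the terminus dosage.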

Transcription-Translation Coupling

The lack of a nuclear compartment in prokaryotes leads to one of the most important molecular differences between prokaryotes and eukaryotes: in prokaryotes, transcription and translation of protein-coding genes are coupled (simultaneous). In other words, ribosomes are translating the 5' end of an mRNA before the 3' end of the mRNA has been transcribed.

Coupling of transcription to translation reduces the opportunity for mRNA processing (intron splicing, etc.). Introns are exceedingly rare in the bacterial genomes sequenced so far. There are no introns in the E. coli genome. The only transcripts subject to significant processing are those leading to functional rRNAs and tRNAs.

Horizontal Gene Transfer

Bacteria do not have a sexual life cycle; neither meiosis nor fertilization has ever been documented. In fact, they don't even carry out mitotic cell divisions, in the sense that they lack the centrioles, centrosomes, and spindle apparatus used by most eukaryotes for chromosome segregation. Genetic diversity leading to rapid genomic evolution relies instead on 3 primary mechanisms of horizontal gene transfer:
1. Plasmid-Mediated Conjugation
2. Specialized Transduction and Generalized Transduction, mediated by bacterial viruses (bacteriophage)
3. Transformation: the acquisition of DNA through uptake of free DNA from solution

Transposable Elements

Transposable elements are DNA sequences that are capable of mediating their own movement (transposition) to new locations within the genome they inhabit, or to other genetic elements present in the same cell. Barbara McClintock was awarded a Nobel Prize for her pioneering discovery of transposable elements in the genome of maize. Transposable elements of various types are widespread in genomes of eukaryotes and bacteria.

In bacteria, transposable elements can generally be assigned to one of two major types, "Insertion Sequences (IS)" and "Composite Transposons". In practice, composite transposons are typically referred to simply as "transposons".

Insertion sequences (ISs) are smaller (1-2 kb) transposable elements whose only genes are directly related to promotion and regulation of their transposition, typically the gene for the so-called transposase enzyme. IS elements are characterized by short, terminal, inverted repeat sequences with the ORF or ORFs in between. They are normal constituents of many bacterial chromosomes and plasmids.
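
The "terminal inverted repeat" structure of an IS element can be made concrete with a short check: the right end of the element reads as the reverse complement of the left end. The sketch below measures the longest perfect terminal inverted repeat of a sequence. It is a minimal illustration only. The function name and the toy sequence are hypothetical, and real IS ends are often imperfect repeats, which this simple version does not allow for.

```python
# Watson-Crick base pairing used to compare the two ends of the element.
_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminal_inverted_repeat_length(seq: str) -> int:
    """Length of the longest perfect terminal inverted repeat in seq.

    Walks inward from both ends and counts positions where the left-end
    base pairs with the right-end base, stopping at the first mismatch.
    """
    seq = seq.upper()
    length = 0
    for left, right in zip(seq[: len(seq) // 2], reversed(seq)):
        if _PAIRS.get(left) != right:
            break
        length += 1
    return length


if __name__ == "__main__":
    # Hypothetical toy element: 6 bp inverted repeats flanking a tiny "ORF".
    toy_element = "GGTTCA" + "ATGAAATAA" + "TGAACC"
    print(terminal_inverted_repeat_length(toy_element))
```

For the toy element the function reports a 6 bp terminal inverted repeat, because TGAACC is the reverse complement of GGTTCA.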

A PRIMER of E. coli GENETICS and GENOMICS 3/17/11

Composite transposons generally consist of two copies of the same IS element flanking variable amounts of other DNA sequences coding for one or several genes with diverse functions. The entire transposon moves as a single unit. The best known transposons are those which were discovered as parts of antibiotic resistance plasmids. The diagram below compares the typical structure of an IS element with the transposon Tn5. Tn5 carries 3 antibiotic resistance genes sandwiched between 2 copies of IS50. Only IS elements will be discussed below.

Transposable elements are a game changer in bacterial genomes. They participate in a bewildering array of molecular events that alter the genomes which they inhabit. The most important of these are:

Transposition: IS movement and insertion at a different location in the same DNA molecule, or in a different molecule in the cell. The transposition process is often accompanied by replication of the IS (replicative transposition), leading to an increase in the copy number of the IS. The diagram below shows replicative transposition of an IS.

Insertional Inactivation: Insertion of an IS within a coding sequence generally leads to the loss of gene function (null mutation).

Homologous Recombination: Multiple copies of the same IS in the same cell are substrates for homologous recombination events that may lead to DNA deletions, sequence inversions, or fusion of separate DNA molecules. For example, homologous recombination between copies of the same IS element in a conjugal plasmid and the bacterial chromosome leads to formation of Hfr strains, as shown below.

[Figure: formation of an Hfr strain by recombination between IS elements (labels: IS, F+, Hfr).]

The promiscuity of transpositional and recombinational events associated with transposable elements unlocks the Pandora's Box of genome plasticity for the bacterial chromosomes and plasmids in which they are found. In fact, the K-12 laboratory strains of E. coli show considerable variability in the number and location of transposable elements in their genomes, due to transposition events that have occurred since the parent strain was first isolated in 1922.

History of E. coli K-12 Laboratory Strains

If we go directly to nature (i.e. the wastewater plant, people with genitourinary tract infections, cattle feedlots, etc.) as a source of E. coli strains, we have no difficulty isolating a genetically diverse array of genotypes that fall under the technical definition of E. coli. This can be interesting and productive from the perspective of clinical microbiology, ecology and evolution. If, on the other hand, we are using E. coli as a model organism, then it is customary to use the same well-documented strain as other investigators, so that results from different labs can be readily compared. Genetic studies with E. coli have traditionally been conducted largely with descendants of a specific E. coli isolate designated "K-12" (ref. 1).

The strain Escherichia coli K-12 was isolated in 1922 from the stool of a convalescent diphtheria patient in a clinical bacteriology lab in Palo Alto, California. In 1925, the culture was deposited in the strain collection of the Department of Bacteriology at Stanford University, where it was given the designation K-12. Strain K-12 gave typical results in the standard tests used for the identification of E. coli and was therefore used for many years in bacteriology lab classes as a typical example of E. coli. This original culture is still maintained in the department collection.

In the early 1940s, E. L. Tatum, then at Stanford, asked the bacteriology department for some bacteria to test for possible use in his studies of biochemical genetics. By great good luck he was given, along with cultures of other species of bacteria, E. coli K-12, which proved to be ideally suited to his studies because it is prototrophic, easy to cultivate in a defined medium, and grows rapidly. The use of this bacterium permitted easy study of very large populations and thus the accurate analysis of very rare events, such as spontaneous mutations, presenting a great advantage in this respect over the plants, animals, and fungi previously used in genetic studies.

In 1944, Tatum and Gray reported the isolation of the first auxotrophic mutants of strain K-12. In 1946, Lederberg and Tatum demonstrated genetic recombination in strain K-12, further opening the door to genetic research. Since that time, many thousands of derivatives of strain K-12 have been created in laboratories around the world. All the strains used in this course are in the K-12 family.

Another interesting property of K-12 lab strains is that, during many years of laboratory cultivation, they have lost the ability to colonize the human GI tract. This makes them potentially safer to use than "wild" E. coli strains (ref. 2).

The original lab strain of E. coli K-12, isolated in 1922, came from nature with a pan genome containing a large conjugal plasmid (the F plasmid) and a lysogenic bacteriophage genome (Lambda). These elements are not found uniformly in all strains of E. coli in nature, and they have been lost even in many descendants of K-12.

The first E. coli strain that was subject to whole genome sequencing is a derivative of K-12 designated MG1655. MG1655 differs from K-12 by the removal of the F plasmid and the Lambda genome, and is the closest thing we have to a "generic" E. coli.

[Figure: from Chart 8 in Bachmann (1996)]

REFERENCES (Available online through links on the course website References page.)

1. Bachmann, B. J. (1996) Derivations and Genotypes of Some Mutant Derivatives of Escherichia coli K-12. Chapter 133 in Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd ed., Vol. 2.
   This is a description of the E. coli K-12 family tree. It would be a waste of time to slog through the whole paper at this point. However, you might read just the little information it provides on the derivation of strain MG1655 from the original wild-type.

2. Smith, H.K (1975) Survival of orally administered E. coli K12 in alimentary tract of human. Nature 255:500-502.

3. Blattner, et al. (1997) The Complete Genome Sequence of Escherichia coli K-12. Science 277: 1453.