The minimal prokaryotic genome

Dr. Dirk Gevers (1,2)
1 Laboratorium voor Microbiologie
2 Bioinformatics & Evolutionary Genomics

The bacterial species in the genomic era

The minimal prokaryotic genome

Mycoplasma genitalium:
~108-121 genes not required for growth in the laboratory
~265-350 genes required for growth in the laboratory

[Figure: diagram of the genome of Mycoplasma genitalium, 480 proteins]

Using transposon mutagenesis (single-gene disruptions):
130 genes not required for growth in the laboratory
350 genes required for growth in the laboratory

[Figure: Haemophilus influenzae (1703) and Mycoplasma genitalium (468); figure label: 240. C.M. Fraser et al., Science 1995]

The minimal gene set

A synthetic minimal genome: the only way to better understand the minimal components of cellular life and the evolution of life (Koonin, E.V., NRM 2003)

The prokaryotic core genome

How big? How stable? Common history?
Core genome is NOT the same as minimal genome!

Why so small?
Non-orthologous gene displacement.
Maybe, at deep divergences, many true orthologs fall below the radar screen of BLAST.
Maybe a few radically reduced parasite genomes are skewing the analysis.
But just maybe, there really are only about 100 genes, mostly translational/transcriptional, in the true core; all else is mix and match.
So what?!
(Doolittle, F., Genomes 2005, Halifax)

The phylogenetic problem resolved?

Phylogenetic incongruence (HGT, hidden paralogies)
Patchy distribution (different gene content within species)
Loss of phylogenetic signal for deep branches
More data = more phylogenetic signal?
Resist both loss and HGT?
(Daubin et al., GR 2002)

All possible comparisons between gene phylogenies, using principal component analysis: 120 genes with a common phylogenetic history.
[Figure labels: phylogenetic artifact, lack of signal, HGT. Daubin et al., GR 2002]

Among these, 205 contain exactly one gene per species. We consider these 205 genes to represent likely orthologs and, consequently, to be good candidates for use in inferring the organismal phylogeny and the extent of LGT.
Test used: the Shimodaira-Hasegawa (SH) test.
(Lerat et al., PLOS 2004)
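The selection of families with exactly one gene per species can be sketched as follows. This is only an illustration under stated assumptions: the function name, the family names, and the toy data are hypothetical, and it is not the pipeline used by Lerat et al.

```python
from collections import Counter


def single_copy_families(families, species):
    """Return the names of gene families that contain exactly one gene
    in every species of the given set (candidate orthologs)."""
    wanted = set(species)
    selected = []
    for name, members in families.items():
        # members is a list of (species, gene_id) pairs
        counts = Counter(sp for sp, _gene in members)
        present_everywhere = set(counts) == wanted
        single_copy = all(n == 1 for n in counts.values())
        if present_everywhere and single_copy:
            selected.append(name)
    return sorted(selected)


# Hypothetical toy input: three species, three gene families.
species = ["sp1", "sp2", "sp3"]
families = {
    "famA": [("sp1", "a1"), ("sp2", "a2"), ("sp3", "a3")],
    # famB has two copies in sp1 (possible hidden paralogy)
    "famB": [("sp1", "b1"), ("sp1", "b1x"), ("sp2", "b2"), ("sp3", "b3")],
    # famC is absent from sp2 (patchy distribution)
    "famC": [("sp1", "c1"), ("sp3", "c3")],
}
result = single_copy_families(families, species)
print(result)
```

Families with extra copies or missing species are dropped, which is the reason such a set stays small as more genomes are added.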

Failure of rejection is not the same as support (Ford W. Doolittle)

Genes with little signal may fail to reject many or even all topologies, but they cannot be said to support a certain topology.
It is possible that a robust tree based on concatenated sequences is well supported because different constituent genes contribute strong support to different individual nodes of the tree, without any supporting that tree overall.

Approach (Susko et al., MBE 2006):
a statistical test for each gene against many topologies
more nuanced than rejection or failure of rejection
visualize the compatibility of all genes with all trees simultaneously

Heat map = simultaneous display of all combinations of genes and test topologies, together with simultaneous clustering of both genes and topologies according to p-values.
Clustering of genes identifies the core set of genes with a similar evolutionary history.
Clustering of topologies identifies which trees are (nearly) equally supported (= number of best trees).
(Susko et al., MBE 2006)

Can we prove Darwin's theory of evolution?
Suitable evidence would be: a molecular phylogeny similar to the organism phylogeny.

Our phylogenetic analyses do not support tree thinking.... Representations other than a tree should be investigated because a non-critical concatenation of markers could be highly misleading. (Bapteste et al., 2005)

But for prokaryotes we don't have anything besides the molecular phylogeny to determine the organism phylogeny, so using concatenated genes as the organism phylogeny is CIRCULAR REASONING.
We have to live with that!
We could believe in the concatenated phylogeny IF most genes supported the same phylogeny.
We do live in an era in which we have an enormous amount of data (> 350 genomes).

Problems with this study according to Ford W. Doolittle:
Genes were concatenated without evaluating congruence among them (failure of rejection is not support).
Individual genes are compared with the concatenated genes => acceptance or rejection.
Even if they were congruent, it is still no proof of Darwin's theory, as ONLY 31 genes were considered -> they can't be representative of the phylogenetic history of the organisms.

HGT is shown to make a substantial contribution to genome evolution (up to 25-30%)... therefore no tree can, in principle, fully reflect the course of evolution of species. (Eugene V. Koonin)

What does it mean... to speak of an organismal genealogy when nearly all of the genes in the cell - genes that give its general character - do not share a common history? (C.R. Woese, 2002)

Microbial pan-genome

Intra-species comparisons

A strain is only a single representative of a species, the members of which can be genotypically and phenotypically much more diverse.
How many genomes are needed to fully describe a bacterial species?
Most analyses have revealed large differences in gene content between closely related strains.
(Lawrence, COiM 2005; Medini et al., COiGD 2005)

The microbial pan-genome

This question was addressed by sequencing the genomes of 8 Streptococcus agalactiae (group B Streptococcus, GBS) strains.

From this study:
A bacterial species can be described by its pan-genome (pan, Greek for "whole").
Each strain has on average 1806 genes present in every strain (core genome) plus 439 genes that are absent in one or more strains (dispensable genome).
The present GBS pan-genome contains 2713 genes.
Unique genes will continue to emerge even after hundreds or thousands of genomes: on average 33 new genes with every new strain sequenced.
core = essence of the species
dispensable = diversity of the species (conserved + strain-specific)
The bacterial species will never be fully described, i.e. it has an open pan-genome.
(Claire Fraser, TIGR)

[Figure labels: GBS pan-genome, GBS core genome; open, closed]

Core genome = essence: the basic aspects of the biology of a species and its major phenotypic traits.
Dispensable genome = supplementary biochemical pathways and functions that are not essential for bacterial growth but confer selective advantages, such as adaptation to different niches, virulence, capsular serotype, antibiotic resistance, or colonization of a new host (conserved / unique).
(Medini, COiGD 2005)
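The core, dispensable, and pan-genome definitions above can be illustrated with plain set operations. This is a minimal sketch, assuming each strain has already been reduced to a set of gene-family identifiers; the function name and the toy strains are hypothetical and unrelated to the GBS data.

```python
def pan_genome_summary(strains):
    """Summarise a pan-genome from a dict mapping strain name to a set
    of gene-family identifiers (dict order = order of sequencing)."""
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in every strain
    dispensable = pan - core                 # absent from at least one strain

    # Genes contributed by each strain that no earlier strain had.
    seen = set()
    new_per_strain = []
    for name, genes in strains.items():
        new_per_strain.append((name, len(genes - seen)))
        seen |= genes

    return {
        "pan": pan,
        "core": core,
        "dispensable": dispensable,
        "new_per_strain": new_per_strain,
    }


# Hypothetical toy data: three strains, five gene families.
strains = {
    "s1": {"g1", "g2", "g3"},
    "s2": {"g1", "g2", "g4"},
    "s3": {"g1", "g2", "g3", "g5"},
}
summary = pan_genome_summary(strains)
print(len(summary["pan"]), len(summary["core"]), summary["new_per_strain"])
```

If the count of new genes per added strain keeps settling at a value above zero, as with the 33 genes per strain reported above, the pan-genome behaves as open; a count that drops to zero suggests a closed one.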

Open pan-genome: typical for species that colonize multiple environments and have multiple ways of exchanging genetic material, e.g. Streptococci, Meningococci, H. pylori, Salmonellae and E. coli.
Each new genome of Streptococci: + 50 genes (Streptococci is an ecologically and phenotypically uniform species).
Each new genome of E. coli: + 300 genes (E. coli might be too heterogeneous to be one species).

Closed pan-genome: more conserved species that live in isolated niches with limited access to the global microbial gene pool, e.g. B. anthracis, Mycobacterium tuberculosis, Buchnera aphidicola and Chlamydia trachomatis.

Species genome concept

Comparison between DDH and genome similarity

What have we learned from almost a decade of extensive genome sequencing with respect to currently named bacterial species?
Can we improve the species definition/concept?

[Figure: DDH vs. ANI. Konstantinidis, PNAS 2005]

Unpublished study of 28 sequenced strains:
[Figure: % DNA-DNA hybridization vs. % conserved DNA; linear fit y = 0.785x + 16.197, R^2 = 0.9486]
% conserved DNA: blastn of 1020 nt fragments with a 90% sequence identity cut-off (Goris, IJSEM, in press)

Comparison between 16S and genome similarity

[Figure: 16S sequence identity vs. ANI. Konstantinidis, PNAS 2005]
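As a rough illustration of the two quantities above, the sketch below computes a % conserved DNA value from per-fragment best-hit identities and applies the reported linear fit. Several things here are assumptions: the function names are hypothetical, the sketch counts fragments rather than aligned bases, and it takes x in the fit to be % conserved DNA and y to be % DDH, as the axis labels suggest.

```python
def percent_conserved_dna(fragment_identities, cutoff=90.0):
    """Percentage of query fragments whose best hit reaches the identity
    cut-off. Each entry is a % identity, or None if the fragment had no hit."""
    if not fragment_identities:
        raise ValueError("at least one fragment is required")
    conserved = sum(
        1 for ident in fragment_identities
        if ident is not None and ident >= cutoff
    )
    return 100.0 * conserved / len(fragment_identities)


def predicted_ddh(conserved_pct, slope=0.785, intercept=16.197):
    """Apply the linear fit y = 0.785x + 16.197 quoted on the slide
    (assumed here: x = % conserved DNA, y = % DDH)."""
    return slope * conserved_pct + intercept


# Hypothetical best-hit identities for five fragments of one genome pair.
identities = [99.1, 95.0, 88.0, None, 92.5]
conserved = percent_conserved_dna(identities)
ddh = predicted_ddh(conserved)
print(conserved, round(ddh, 3))
```

Under that reading of the fit, the classical 70% DDH species threshold corresponds to roughly 68.5% conserved DNA, since (70 - 16.197) / 0.785 is about 68.5.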

Gene content diversity within species

[Figure: % conserved genes vs. ANI. Konstantinidis, PNAS 2005]

Strains of the same species may differ in up to 35% of their gene content, or 20% when hypothetical genes and mobile elements are excluded.
70% DDH = at least 80% of gene content shared.
20% = on average 1000 genes!
(Konstantinidis, PNAS 2005)

Evolutionary relatedness should be coupled to ecological relatedness.
This will give a more predictive species definition than a purely evolutionary one (what is the genetic basis for ecological distinctiveness?).

Clusters or continuum of diversity?

Do bacteria exhibit a genetic continuum in nature, or are there coherent sequence/genomic clusters on which a species definition/concept could be based?

[Figure: frequency vs. % sequence identity for 333 Hsp60 sequences from Vibrio strains isolated from coastal bacterioplankton]

Current datasets in taxonomy consist of only a few representative strains per species; consequently, borders based on these datasets might not hold when more intraspecies diversity is included!
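A frequency distribution of pairwise sequence identity, as in the Hsp60 figure, can be sketched as below. It assumes the sequences are already aligned to equal length; the function names and the four toy sequences are hypothetical and are not the Vibrio data.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity of two aligned sequences, ignoring columns
    where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def identity_histogram(seqs, bin_width=10):
    """Count sequence pairs per identity bin; keys are lower bin edges."""
    hist = {}
    for a, b in combinations(seqs, 2):
        ident = percent_identity(a, b)
        # Put 100% identity into the top bin instead of a bin of its own.
        lower = min(int(ident // bin_width) * bin_width, 100 - bin_width)
        hist[lower] = hist.get(lower, 0) + 1
    return dict(sorted(hist.items()))


# Hypothetical aligned sequences forming two tight groups.
seqs = ["ACGTACGTAC", "ACGTACGTAT", "TGCATGCATG", "TGCATGCATA"]
hist = identity_histogram(seqs)
print(hist)
```

Separate peaks at high and low identity with an empty range between them would point to coherent clusters, whereas an evenly filled histogram would point to a continuum.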

Unsolved questions:
How is selection acting on bacterial populations?
What is the recombination rate at the whole-genome level within a bacterial population?

Answers to these questions will advance our knowledge of the two basic processes on which our current species concepts are based.

One thing is certain: reconciling eukaryotic and bacterial species under the same biological species concept is NOT possible, because these organisms are too different in terms of evolutionary processes.