Interpreting the Molecular Tree of Life: What Happened in Early Evolution? Norm Pace MCD Biology University of Colorado-Boulder nrpace@colorado.edu
Outline What is the Tree of Life? -- Historical. Conceptually a tree of organisms, but -- Molecular trees, constraints, controversies and the Next-Gen stall in expanding the Tree -- It's not a simple tree of organisms: Pangenome How do we know where LUCA is on the molecular map? When was that? What was the nature of earliest life, the early lines of descent? -- How to predict? How to go deeper than LUCA? -- Paradoxes may be interesting
Haeckel, 1866
Whittaker, 1969 Stanier, 1960s
Carl Woese, early 1980s RNase T1 fingerprint
Woese, 1977 Eukaryotes Bacteria Archaebacteria (Archaea)
Woese, 1977 Eukaryotes Bacteria Archaebacteria (Archaea) 1987
Why rRNA sequences for the backbone of the universal tree? Universally present. The most conservative sequence in biology, even into pre-cellular life. No lateral transfer; reflects the genetic machinery.
Expanding the Tree: Into the Natural Microbial World Sample DNA rDNA PCR library clone sequence next-gen sequence
Expansion of the Bacterial Tree
Doolittle Confusogram
Swithers and Katz, Microbe 2013
And then came genome sequences.
Pangenome: the collection of genes accessible to a phylotype. Tenaillon et al., Nature Rev. Microbiol. 8:207 (2010) (Genes). Lukjancenko et al., Mic. Ecol. 6-:708 (2010) (Gene families).
E.g. Gene contents of different strains of Escherichia coli: Pangenome Strain A Strain B Strain C
What's with all the lateral transfer?
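A minimal sketch of the pangenome idea, assuming each strain is reduced to a set of gene identifiers. The strain contents and gene names (`core1`, `acc1`, ...) are hypothetical placeholders, not real E. coli data. The pangenome is taken as the union of all strain gene sets, and the core genes as their intersection.

```python
# Hypothetical gene contents for three strains; names are placeholders only.
strains = {
    "Strain A": {"core1", "core2", "acc1", "acc2"},
    "Strain B": {"core1", "core2", "acc2", "acc3"},
    "Strain C": {"core1", "core2", "acc4"},
}

# Pangenome: every gene seen in at least one strain.
pangenome = set().union(*strains.values())

# Core genes: genes present in every strain.
core = set.intersection(*strains.values())

# Accessory genes: in the pangenome but not shared by all strains.
accessory = pangenome - core

# Strain-specific genes: found in exactly one strain.
strain_specific = {
    name: genes - set().union(*(g for n, g in strains.items() if n != name))
    for name, genes in strains.items()
}
```

In this toy input the core is small relative to the pangenome, which is the pattern the strain comparison on the slide points at.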
5µm Jed Fuhrman
Pangenome: the world of Jean-Baptiste Lamarck (1744-1829)
Tree of Life? Tree of what??
What gene(s) to use? Core genes (rRNA, others): cellular line of descent. Concatenated core genes: with care only. Concatenated genomes: ugh. There is no such thing as a tree of organisms. Note that no single gene or sequence is uniformly useful in phylogenetic analyses throughout the ToL.
Making Sense of Sequences: Molecular Phylogeny 1. Align sequences so that homologous residues are juxtaposed. 2. Count the number of differences between pairs of sequences -- this is some measure of evolutionary distance that separates the organisms. 3. Calculate the tree, the relatedness map, that most accurately represents all the pairwise differences.
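The three steps above can be sketched in a few lines, assuming the alignment (step 1) has already been done. The sequences are hypothetical, and UPGMA is used here only as a simple stand-in clustering method for step 3; the slides do not say which tree-building method was used.

```python
from itertools import combinations

def p_distance(a, b):
    """Step 2: fraction of aligned, ungapped positions that differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

def upgma(names, dist):
    """Step 3 (stand-in method): join the closest clusters repeatedly.

    dist maps frozenset({a, b}) to a distance. Returns nested tuples.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        for other in clusters:
            # Size-weighted average of the distances to the two merged clusters.
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

# Hypothetical pre-aligned sequences (step 1 assumed done).
aligned = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TCGAACCTAC",
    "D": "TCGAACCTGC",
}
distances = {
    frozenset((m, n)): p_distance(aligned[m], aligned[n])
    for m, n in combinations(aligned, 2)
}
tree = upgma(list(aligned), distances)
```

The distances here are raw difference counts; for deep branches they underestimate the true change and would need a correction before tree building.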
Experimental tree, late 1990s
Baldauf et al., 2000
Problems in resolving Deep-branching topology: Representation Uncertainty
Now. BUT: Next-Gen Problems!! [Figure: cumulative number of sequences by year, 1989 to 2007; series: Total, Bacteria, Eucarya, Archaea.] Next-gen sequences are short: you may get a (low-level) taxon call, but only ~70% of the time with environmental seqs, and you can't do phylogeny with unclassified seqs. Pipelines, in dealing with pyro-babble, toss novel seqs that don't fit the training set and throw out the new stuff!
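[Figure: inferred sequence change* plotted against observed sequence change, with ranges marked for species-level, phylum-level and domain-level variation; the plot separates observed change from unseen change.] * Inferred sequence change: $K_{nuc} = -\frac{3}{4}\ln\left(1 - \frac{4}{3}d\right)$, where $d$ is the observed sequence change.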
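The correction on this slide, K_nuc = -3/4 ln(1 - (4/3)d), has the form of the Jukes-Cantor estimate. A minimal sketch, assuming d is the observed fraction of differing positions between two aligned sequences (the function name is illustrative):

```python
import math

def inferred_change(d):
    """Inferred change per site from observed difference d (Jukes-Cantor form)."""
    if not 0.0 <= d < 0.75:
        # At d >= 0.75 the logarithm is undefined: the sequences are saturated.
        raise ValueError("d must be in [0, 0.75)")
    if d == 0.0:
        return 0.0
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * d)

# Inferred change outpaces observed change as d grows: the "unseen change".
observed = [0.05, 0.1, 0.3, 0.5]
unseen = [inferred_change(d) - d for d in observed]
```

For example, an observed difference of 0.5 corresponds to an inferred change of about 0.82 per site, so the unseen fraction dominates at domain-level divergence.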
Where is the root LUCA? Woese 1987
Rooting a tree requires an outgroup: not available with a universal tree. Solution (Dayhoff, 1970s): paralogous rooting -- use trees based on in-group paralogs. (Recall that homologs are of three kinds: Orthologs, Paralogs and Xenologs.)
Paralogs you can still recognize include: Elongation Factors Tu and G; Membrane ATP Synthase α and β; tRNAs metf and met. Each pair gives the 3-D (three-domain) tree, and the members of each pair are homologs.
EF-Tu/EF-G alignment, residues 1-70 Tu G
Rooting the Big Tree Bacteria Archaea EF-G Eucarya Bacteria Eucarya EF-Tu Archaea
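A minimal sketch of the logic of paralogous rooting: in a combined tree of two paralogs, each paralog's subtree acts as the outgroup for the other, so both halves should show the same rooted branching order among the domains. The tree below is a hypothetical input written as nested tuples, and the topology in it (Archaea and Eucarya as sisters) is an assumption made for illustration, not a reading of the slide's figure.

```python
def strip_gene(tree, gene):
    """Drop the gene prefix from leaf labels: 'EF-Tu|Bacteria' -> 'Bacteria'."""
    if isinstance(tree, str):
        prefix = gene + "|"
        if not tree.startswith(prefix):
            raise ValueError(f"leaf {tree!r} is not a {gene} sequence")
        return tree[len(prefix):]
    return tuple(strip_gene(child, gene) for child in tree)

def canonical(tree):
    """Order-independent form of a rooted tree, so topologies can be compared."""
    if isinstance(tree, str):
        return tree
    return frozenset(canonical(child) for child in tree)

# Hypothetical combined tree, split on the branch joining the two paralogs.
combined = (
    ("EF-Tu|Bacteria", ("EF-Tu|Archaea", "EF-Tu|Eucarya")),
    (("EF-G|Eucarya", "EF-G|Archaea"), "EF-G|Bacteria"),
)
tu_half, g_half = combined

tu_topology = canonical(strip_gene(tu_half, "EF-Tu"))
g_topology = canonical(strip_gene(g_half, "EF-G"))

# Reciprocal rooting is consistent if both paralogs give the same rooted topology.
rooting_agrees = tu_topology == g_topology
```

If the two halves disagreed, the paralog pair would not support a single root position.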
Woese 1990
When was LUCA? >3.5 billion years ago
What was LUCA? Not a genetic cell. More likely a state: communal, interdependent, replicating foci. Early phylogenetic lines would have differentiated with acquisition of intermolecular specificity. Radiations at the base of the domains could occur only after development of the sophistication necessary for independent vertical lines of descent.
Paradox: How is it that chemiosmosis was in place before the biochemical/genetic membrane? Maybe the first membrane was abiological.
CLASH: The Big Tree vs. the Common Wisdom. The eukaryote nuclear line of descent is not a late arrival; rather, it is as old as cellular life. The prokaryote-eukaryote model of evolution is wrong and needs to be banished from the lexicon of biology.
Where did the eukaryotic cell come from? Mitochondria and chloroplasts came from specific bacterial phyla, Proteobacteria and Cyanobacteria respectively. But the nuclear line is primordial and older than cyanobacteria.
The modern kind of eucaryotic cell, complete with chloroplast (and probably mitochondrion), was in place by >3 billion years ago!
Models of Biological Organization and Evolution: Procaryote-Eucaryote (the textbook tale) vs. Three Domains
Procaryote/Eucaryote: The Test (Woese, 1977) 1. All eucaryotes are specifically related to one another. True 2. All procaryotes are related to the exclusion of eucaryotes. False 3. Procaryotes gave rise to (more advanced) eucaryotes. False
The End Thank you!