Indian Academy of Sciences HYPOTHESIS The indefinable term prokaryote and the polyphyletic origin of genes MASSIMO DI GIULIO Early Evolution of Life Laboratory, Institute of Biosciences and Bioresources, CNR, Via P. Castellino, 111, 80131 Naples, Italy Abstract An analysis has been performed on implications existing between the presence/absence of the evolutionary stage of the prokaryote, that is to say, the presence/absence of common characteristics between archaea and bacteria, and the monophyletic/polyphyletic origin of genes of proteins. Thereafter, a theorem stating that: the polyphyletic origin of proteins would imply the absence of common characteristics between bacteria and archaea and therefore the lack of the evolutionary stage of the prokaryote, and vice versa that the indefinable prokaryote stage implies a polyphyletic origin of proteins, has been made and validated. The conclusion is that given the absence of truth in common characteristics between archaea and bacteria, the origin of genes of proteins should have been polyphyletic. [Di Giulio M. 2017 The indefinable term prokaryote and the polyphyletic origin of genes. J. Genet. 96, 393 397] The nonbiological meaning of the prokaryote term and the polyphyletic origin of genes: an introduction Monophyletic origin of genes seems to be the only sensible explanation for their existence. This would derive, e.g., from the observation that we can build plausible multiple alignments of orthologous proteins taken from the three domains of life. In a more explicit way, the fact that it is possible to align orthologous proteins from the three domains of life would seem to imply that these proteins were already present at the last universal common ancestor (LUCA) stage. Therefore, they were simply inherited in the three domains of life, thus, defining their monophyletic origin. This reasoning seems to be sensible and therefore, would discredit considerably the polyphyletic hypothesis as an explanation for the origin of genes of proteins. However, it is possible, at least in principle, that at the LUCA stage, protein modules were present, active and codified by means of pieces of RNA that were assembled by transsplicing to form messenger RNAs (mrnas). The translation of these mrnas resulted in proteins whose genes (DNA) actually evolved only later, that is to say, only after the formation of the main phyletic lines (Di Giulio 2008a). In other words, at the LUCA stage, protein modules might have existed and codified by means of pieces of RNA. The one that we observe now in the multiple *E-mail: massimo.digiulio@ibbr.cnr.it. alignments of orthologous proteins is substantially and primarily an alignment of modules. These modules might have been different in the different phyletic lines of divergence (also if homologues), because assembled by trans splicing only later on, during or after the formation of the three domains of life (Di Giulio 2006b, 2008a,b,d), to form proteins that we align today. In this way, we would be able to explain why we can obtain multiple alignments of the same protein taken from organisms from the three domains of life, also under the hypothesis that the model for their evolution has been the polyphyletic one and not only monophyletic (Di Giulio 2006b, 2008a,b,d). There are several observations that makes us to consider that the origin of genes could have been polyphyletic. Firstly, it has been suggested that the origin of DNA might have been relatively recent, that is to say, only after that the main phyletic lines were established (Mushegian and Koonin 1996; Aravind et al. 1999; Leipe et al. 1999; Forterre 2000, 2005; Poole and Logan 2005). If this would have been true, then the origin of genes must have been necessarily polyphyletic, having DNA, for a hypothesis, evolved later on and independently in the main phyletic lines (Di Giulio 2008a). Further, there are strong observations that makes us to consider that the origin of genes of transfer RNAs (trnas) has been polyphyletic (Di Giulio 1999, 2004, 2006a,b, 2008a,b,c,d,e, 2009b, 2012a,b, 2013a,b). Indeed, given that the hypothesis of ancestrality of split trna genes of Nanoarchaeum Keywords. progenote; last universal common ancestor; evolutionary stages; early evolution of life. Journal of Genetics, DOI 10.1007/s12041-017-0775-x, Vol. 96, No. 2, June 2017 393
Massimo Di Giulio equitans has been strongly corroborated (Di Giulio 2004, 2005, 2006a,b, 2008a,b,c,d,e, 2009a,b, 2011, 2012a,b, 2014; Branciamore and Di Giulio 2011). The presence of split trna genes both at the LUCA stage and in actual organisms should then prove a late assembly of the genes of trnas in the main phyletic lines. In other words, their origin should have been polyphyletic (Di Giulio 1999, 2004, 2006b, 2012a, 2013a,b). There are also other observations that consider that the origin of genes of proteins might have been polyphyletic (Di Giulio 1999, 2004, 2006b, 2012a, 2013a,b; Lupas et al. 2001). In conclusion, although the polyphyletic hypothesis for the origin of genes has never been taken into account, there are several observations and explanations that bring us to consider that such origin is less absurd than commonly thought (Di Giulio 1999, 2004, 2006b, 2012a,b, 2013a,b). There is a debate on the term prokaryote, that is to say, if it has, or does not have, an actual biological meaning (Pace 2006, 2009; Martin and Koonin 2006; Sapp 2006; Dolan and Margulis 2007; Whitman 2009; Doolittle and Zhaxybayeva 2009, 2013; Fox-Keller 2010; Di Giulio 2015). Indeed, although analyses have reconstructed a complex LUCA, e.g., composed of genes and proteins comparable to that of current organisms (Mushegian and Koonin 1996; Ouzounis and Kyrpides 1996; Ouzounis et al. 2006; Mat et al. 2008; Tuller et al. 2010). We are, nevertheless, unable to define in a likewise clear and immediate way, what in reality is a prokaryote i.e., what characteristics it had as, in contrast, these analyses would easily seem to imply (Di Giulio 2015). That is to say, we are not essentially able to associate to the prokaryote word, a single characteristic, for instance, of the cellular level, that is really in common between bacteria and archaea. This would seem to deny the very existence of the prokaryote stage because not appreciable and not supported by the presence of at least a character that was common between archaea and bacteria (Di Giulio 2015; figure1b). It has been maintained that hundreds of genes, in particular, the operational genes, are commonly present in archaea and bacteria and they are generally absent in eukaryotes (Doolittle and Zhaxybayeva 2013). These genes would identify a common ancestor between them (Doolittle and Zhaxybayeva 2013). But this would not clarify the meaning of the prokaryote term because the existence of a common ancestor between bacteria and archaea indicated from the presence of common proteins between them would not imply that the evolutionary stage of the prokaryote has existed. This is because these proteins might have been present as seen above under different forms, e.g., in the form of modules and assembled only later on, in the main phyletic lines. This implies a polyphyletic origin for proteins. Therefore, the nonexistence of the evolutionary stage of the prokaryote (figure 1a), but only that of a progenote (figure 1b), given the rapid dynamism that would seem involved in the evolution of these proteins (Di Giulio 2000, 2001, 2011, 2015; see below). It would result therefore, that proteins would not be a good indicator of an evolutionary stage as, for instance, the cellular membrane or the organization of a chromosome might instead be it. This is because proteins may not represent a clear cellular level. They would be a kind of Figure 1. Three possible evolutionary branching from the LUCA are shown. (a) The universal ancestor is a prokaryote because the characters (geometrical forms) are identical in the bacterial and archaeal domains. (b) LUCA is a progenote because the characters are completely different between the bacterial and archaeal domains; and (c) only a few characters are identical in the bacterial and archaeal domains, while the most part of characters is different in the two domains of life. This would define a pseudoprokaryote condition or better pseudoprogenote condition of LUCA that is indicated as a possibility, but not discussed in the text. 394 Journal of Genetics, Vol. 96, No. 2, June 2017
Prokaryote term and gene polyphyletic origin character that might reflect both the presence of an evolutionary stage and its absence, given the evolutionary dynamism that would seem to be associated with their structure. Then, the set of problems would seem to understand if the multiple alignments of orthologous proteins define complete proteins in the common ancestor, that is to say, they define that the evolutionary stage of the prokaryote really existed (figure 1a). Or if, on the contrary, these multiple alignments would imply incomplete proteins, i.e. still in rapid evolution, that is to say, with a mode and tempo of evolution completely different from the current one. The latter scenario would imply therefore, the absence of the evolutionary stage of the prokaryote, and instead the presence of the one of the progenote (figure 1b). Multiple alignments of orthologous proteins have been analysed and it has been concluded that these might support, or at least to be compatible, with a polyphyletic origin of genes, and thus, the absence of the evolutionary stage of the prokaryote (Di Giulio 2008a). It is being understood that multiple alignments of orthologous proteins from the domains of life should be read also in the light of a polyphyletic origin of genes. Here, I analyse this kind of problems, considering our incapacity to define with accuracy and immediacy, the prokaryote term and its relationships with the polyphyletic origin of proteins. I assume that the domains of life are only two bacteria and archaea and not three, that is to say, eukaryotes are derived from some archaea as now there is the tendency to consider (Williams et al. 2012, 2013; Spang et al. 2015). However, the conclusions seem true also if eukaryotes would represent the third domain of life. Further, all crucial terms used in this work have been previously defined (Woese and Fox 1977; Woese 1987; Doolittle and Brown 1994; Pace 2006, 2009; Di Giulio 2000, 2001, 2006b, 2011, 2015). Nonbiological meaning of the prokaryote term does imply a polyphyletic origin of genes and vice versa: a theorem We start to reason under the hypothesis that the origin of proteins has been polyphyletic. This should imply, by definition, an immaturity of proteins at the stage of LUCA. The absence of common characteristics between bacteria and archaea (figure 1b) would be a simple consequence of that. Indeed, the immaturity of proteins should imply, that is to say, would be equivalent to, an immaturity of all cellular structures that, therefore, might not have been in common between bacteria and archaea as still rapidly evolving (figure 1b). Therefore, suchnot todefinetheevolutionary stage of the prokaryote (genote) (figure 1a), but only that of a progenote (figure 1b).Wecanseethata polyphyletic origin of proteins should imply the absence of common structures between archaea and bacteria, and therefore, our inability to define the prokaryote term in a biological sense. Does exist the possibility that immature proteins, that is to say still in rapid evolution, can instead go to define complete structures, i.e., for instance, belonging to and defining a determined cellular level and thus an evolutionary stage? It seems to me that this is not possible because if proteins are immature, that is in rapid evolution, then also the structures of corresponding cellular level should be rapidly evolving. Consequently, it seems to me remote the possibility that instead a structure of this cellular level can have been passed to bacteria and archaea, as still rapidly evolving, that is to say, with a tempo and mode typical of a progenote and not of a genote (Woese and Fox 1977; Woese 1987; Doolittle and Brown 1994; Di Giulio 2000, 2001, 2006b, 2011, 2015). Is the opposite also true? That is to say, the absence of common characteristics between bacteria and archaea would imply an immaturity of proteins and therefore, a polyphyletic origin of these or not? The answer seems to be positive. If proteins would have been in an advanced stage of maturation, this would have imposed evolved cellular structures, i.e., complete. These structures would have been common between bacteria and archaea, and this, instead, is not observable. Indeed, the absence of common characteristics between archaea and bacteria is first of all the absence of common cellular structures. This can only reflect the presence of proteins still in rapid evolution, i.e., premature, at the LUCA s stage, and thus, a polyphyletic origin of the latter. Whereas the absence of common characteristics between archaea and bacteria cannot reflect the presence of proteins completely evolved, i.e., mature. These proteins would have implied the very contrary. That is to say, the presence at the LUCA s stage of at least some complete characters that would have become common between bacteria and archaea, and this is not true for the hypothesis. In conclusion, it seems that all this can be included in the following theorem: the polyphyletic origin of proteins does imply the absence of common characteristics between bacteria and archaea, and therefore the absence of the evolutionary stage of prokaryote and vice versa, the indefinable evolutionary stage of the prokaryote does imply a polyphyletic origin of proteins. That is to say, the theorem would make equivalent the following points: (i) the absence of common characteristics between archaea and bacteria, (ii) the absence of the evolutionary stage of prokaryote i.e., the presence of that of progenote, and (iii) the polyphyletic origin of genes of proteins. The highly significant biological consequence would be that the origin of proteins would have followed the polyphyletic model and not the usually accepted monophyletic one. Journal of Genetics, Vol. 96, No. 2, June 2017 395
Massimo Di Giulio Acknowledgements This work was carried out in the Department of Philosophy and History of Science, Faculty of Science, Charles University, Prague, Czech Republic, during my stay in this university, financed by the short-term mobility programme of the National Research Council, Italy. I would like to thank Prof Jaroslav Flegr for discussions and cordial hospitality. References Aravind I., Walkers D. R. and Koonin E. K. 1999 Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res. 27, 1223 1242. Branciamore S. and Di Giulio M. 2011 The presence in trna molecule sequences of the double hairpin, an evolutionary stage through which the origin of this molecule is thought to have passed. J. Mol. Evol. 72, 352 363. Di Giulio M. 1999 The non-monophyletic origin of trna molecule. J. Theor. Biol. 197, 403 414. Di Giulio M. 2000 The universal ancestor lived in a thermophilic or hyperthermophilic environment. J. Theor. Biol. 203, 203 213. Di Giulio M. 2001 The non-universality of the genetic code: the universal ancestor was a progenote. J. Theor. Biol. 209, 345 349. Di Giulio M. 2004 The origin of the trna molecule: implications for the origin of protein synthesis. J. Theor. Biol. 226, 89 93. Di Giulio M. 2005 The ocean abysses witnessed the origin of the genetic code. Gene 346, 7 12. Di Giulio M. 2006a Nanoarchaeum equitans is a living fossil. J. Theor. Biol. 242, 257 260. Di Giulio M. 2006b The non-monophyletic origin of the trna molecule and the origin of genes only after the evolutionary stage of the last universal common ancestor (LUCA). J. Theor. Biol. 240, 343 352. Di Giulio M. 2008a The origin of genes could be polyphyletic. Gene 426, 39 46. Di Giulio M. 2008b The split genes of Nanoarchaeum equitans are an ancestral character. Gene 421, 20 26. Di Giulio M. 2008c Permuted trna genes of Cyanidioschyzon merolae, the origin of the trna molecule and the root of the Eukarya domain. J. Theor. Biol. 253, 587 592. Di Giulio M. 2008d Split genes, ancestral genes. In Prebiotic evolution and astrobiology (ed. J. Tze-Fei Wong and A. Lazcano), chapter 13. Landes Bioscience Publisher, Texas, USA. Di Giulio M. 2008e Transfer RNA genes in pieces are an ancestral character. EMBO Rep. 9, 820. Di Giulio M. 2009a A comparison among the models proposed to explain the origin of the trna molecule: a synthesis. J. Mol. Evol. 69, 1 9. Di Giulio M. 2009b Formal proof that the split genes of trnas of Nanoarchaeum equitans are an ancestral character. J. Mol. Evol. 69, 505 511. Di Giulio M. 2011 The last universal common ancestor (LUCA) and the ancestors of archaea and bacteria were progenotes. J. Mol. Evol. 72, 119 126. Di Giulio M. 2012a The origin of the trna molecule: independent data favor a specific model of its evolution. Biochimie 94, 1464 1466. Di Giulio M. 2012b The recently split transfer RNA genes may be close to merging the two halves of the trna rather than having just separated them. J. Theor. Biol. 310, 1 2. Di Giulio M. 2013a A polyphyletic model for the origin of trnas has more support than a monophyletic model. J. Theor. Biol. 318, 124 128. Di Giulio M. 2013b Is Nanoarchaeum equitans a paleokaryote? J. Biol. Res. 19, 83 88. Di Giulio M. 2014 The split genes of Nanoarchaeum equitans have not originated in its lineage and have been merged in another Nanoarchaeota: a reply to Podar et al. J. Theor. Biol. 349, 167 169. Di Giulio M. 2015 The non-biological meaning of the term prokaryote and its implications. J. Mol. Evol. 80, 98 101. Dolan M. F. and Margulis L. 2007 Advances in biology reveal truth about prokaryotes. Nature 445, 21. Doolittle W. F. and Brown J. R. 1994 Tempo mode the progenote and the universal root. Proc. Natl. Acad. Sci. USA 91, 6721 6728. Doolittle W. F. and Zhaxybayeva O. 2009 On the origin of the prokaryotic species. Genome Res. 19, 744 756. Doolittle W. F. and Zhaxybayeva O. 2013 What is a prokaryote? In The prokaryotes prokaryotic biology and symbiotic associations (ed. E. Rosenberg et al.), pp. 21 37. Springer, Heidelberg, Germany. Forterre P. 2000 Three RNA cells for ribosomal lineages and three DNA virus to replicate their genomes: a hypothesis for the origin of cellular domain. Proc. Natl. Acad. Sci. USA 103, 3669 3674. Forterre P. 2005 The two ages of the RNA world, and the transition to the DNA world: a story of viruses and cells. Biochimie 87, 793 803. Fox-Keller E. 2010 The mirage of a space between nature and nurture. Duke Univesity Press, Durham, USA. Leipe D. D., Aravind L. and Koonin E. V. 1999 Did DNA replication evolve twice independently? Nucleic Acids Res. 27, 3389 3401. Lupas A. N., Ponting C. P. and Russell R. B. 2001 On the evolution of protein folds: are similar motifs in different protein folds the result of convergence, insertion, or relics of an ancient peptide world? J. Struct. Biol. 134, 191 203. Martin W. and Koonin E. V. 2006 A positive definition of prokaryotes. Nature 442, 868. Mat W. K., Xue H. and Wong J. T. 2008 The genomics of LUCA. Front Biosci. 3, 5605 5613. Mushegian A. R. and Koonin E. V. 1996 A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. USA 93, 10268 10273. Ouzounis C. and Kyrpides N. 1996 The emergence of major cellular processes in evolution. FEBS Lett. 390, 119 123. Ouzounis C. A., Kunin V., Darzentas N. and Goldovsky L. 2006 A minimal estimate for the gene content of the last universal common ancestor. Res. Microbiol. 157, 57 68. Pace N. R. 2006 Time for a change. Nature 441, 289. Pace N. R. 2009 Rebuttal: the modern concept of prokaryote. J. Bacteriol. 191, 2006 2007. Poole A. M. and Logan D. T. 2005 Modern mrna proofreading and repair: clues that the last universal ancestor possessed an RNA genome? Mol. Biol. Evol. 22, 1444 1455. Sapp R. 2006 Two faces of the prokaryote concept. Intern. Microbiol. 9, 163 172. Spang A., Saw J. H., Jørgensen S. L., Zaremba-Niedzwiedzka K., Martijn J. et al. 2015 Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173 179. Tuller T., Birin H., Gophna U., Kupiec M. and Ruppin E. 2010 Reconstruction ancestral gene content by coevolution. Genome Res. 20, 122 132. Whitman W. B. 2009 The modern concept of the procaryote. J. Bacteriol. 191, 2000 2005. 396 Journal of Genetics, Vol. 96, No. 2, June 2017
Williams T. A., Foster P. G., Nye T. M., Cox C. J. and Embley T. M. 2012 A congruent phylogenomic signal places eukaryotes within the Archaea. Proc. R. Soc. London B 279, 4870 4879. Williams T. A., Foster P. G., Cox C. J. and Embley T. M. 2013 An archaeal origin of, eukaryotes supports only two primary domains of life. Nature 504, 231 236. Prokaryote term and gene polyphyletic origin Woese C. R. 1987 Bacterial evolution. Microbiol. Res. 51, 221 271. Woese C. R. and Fox G. E. 1977 The concept of cellular evolution. J. Mol. Evol. 10, 1 6. Received 9 May 2016, in final revised form 6 July 2016; accepted 23 August 2016 Unedited version published online: 25 August 2016 Final version published online: 17 June 2017 Corresponding editor: Rajiva Raman Journal of Genetics, Vol. 96, No. 2, June 2017 397