Journal of Botany. Genome Size 2011



Copyright 2012 Hindawi Publishing Corporation. All rights reserved. This is a focus issue published in Journal of Botany. All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Editorial Board

Stephen W. Adkins, Australia
Iftikhar Ahmad, Pakistan
Mohammad Anis, India
Tadao Asami, Japan
Hiroshi Ashihara, Japan
Muhammad Y. Ashraf, Pakistan
Prem L. Bhalla, Australia
Hikmet Budak, Turkey
Bhagirath S. Chauhan, Philippines
Kang Chong, China
Curtis C. Daehler, USA
Peter J. De Lange, New Zealand
Maria R. Ercolano, Italy
Muhammad Farooq, Pakistan
Urs Feller, Switzerland
M. Fernandez-Aparicio, Spain
Muhammad Hamayun, Pakistan
Takashi Hashimoto, Japan
Shamsul Hayat, Saudi Arabia
Simon Hiscock, UK
Muhammad Iqbal, Pakistan
Gaoming Jiang, China
Bai-Lian Larry Li, USA
Jutta Ludwig-Mueller, Germany
Tariq Mahmood, Pakistan
Minami Matsui, Japan
Frederick Meins, Switzerland
Marcelo Menossi, Brazil
Michael Moeller, UK
Akira Nagatani, Japan
Karl Joseph Niklas, USA
Masashi Ohara, Japan
Claude Penel, Switzerland
Lorenzo Peruzzi, Italy
Bala Rathinasabapathi, USA
S. G. Razafimandimbison, Sweden
Zed Rengel, Australia
Marko Sabovljevic, Serbia
Bernd Schneider, Germany
Harald Schneider, UK
William K. Smith, USA
Kenneth J. Sytsma, USA
Hiroshi Tobe, Japan
An Vanden Broeck, Belgium
Philip J. White, UK
Andrew Wood, USA
Guang Sheng Zhou, China

Contents

Polyploidy and Speciation in Pteris (Pteridaceae), Yi-Shan Chao, Ho-Yih Liu, Yu-Chung Chiang, and Wen-Liang Chiou
Volume 2012, Article ID 87920, 7 pages

Exploring Diversification and Genome Size Evolution in Extant Gymnosperms through Phylogenetic Synthesis, J. Gordon Burleigh, W. Brad Barbazuk, John M. Davis, Alison M. Morse, and Pamela S. Soltis
Volume 2012, Article ID , 6 pages

On the Coevolution of Transposable Elements and Plant Genomes, Peter Civáň, Miroslav Švec, and Pavol Hauptvogel
Volume 2011, Article ID , 9 pages

The Genomes of All Angiosperms: A Call for a Coordinated Global Census, David W. Galbraith, Jeffrey L. Bennetzen, Elizabeth A. Kellogg, J. Chris Pires, and Pamela S. Soltis
Volume 2011, Article ID 64698, 0 pages

Does Large Genome Size Limit Speciation in Endemic Island Floras?, Maxim V. Kapralov and Dmitry A. Filatov
Volume 2011, Article ID , 6 pages

Genome Diversity in Maize, Victor Llaca, Matthew A. Campbell, and Stéphane Deschamps
Volume 2011, Article ID 0472, 0 pages

Evolution of Genome Size in Duckweeds (Lemnaceae), Wenqin Wang, Randall A. Kerstetter, and Todd P. Michael
Volume 2011, Article ID 57039, 9 pages

Hindawi Publishing Corporation
Journal of Botany
Volume 2012, Article ID 87920, 7 pages
doi:0.55/202/87920

Review Article

Polyploidy and Speciation in Pteris (Pteridaceae)

Yi-Shan Chao,1,2 Ho-Yih Liu,1 Yu-Chung Chiang,1 and Wen-Liang Chiou2

1 Department of Biological Sciences, National Sun Yat-Sen University, Kaohsiung 804, Taiwan
2 Division of Botanical Garden, Taiwan Forestry Research Institute, Taipei 100, Taiwan

Correspondence should be addressed to Yu-Chung Chiang, yuchung@mail.nsysu.edu.tw and Wen-Liang Chiou, chiou@tfri.gov.tw

Received 6 June 2011; Revised 9 December 2011; Accepted 6 January 2012

Academic Editor: Kang Chong

Copyright 2012 Yi-Shan Chao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The highest frequency of polyploidy among plants is considered to occur in the Pteridophytes. In this study, we focused on polyploidy displayed by a specific fern taxon, the genus Pteris L. (Pteridaceae), comprising over 250 species. Cytological data from 106 Pteris species were reviewed. The base number of chromosomes in Pteris is 29. Polyploids are frequently found in Pteris, including triploids, tetraploids, pentaploids, hexaploids, and octoploids. In addition, an aneuploid species, P. deltodon Bak., has been recorded. Furthermore, the relationship between polyploidy and reproductive biology is reviewed. Among these 106 Pteris species, 60% exhibit polyploidy: 22% show intraspecific polyploidy and 38% result from polyploid speciation. Apogamous species are common in Pteris. Diploids are the most frequent among Pteris species, and they can be sexual or apogamous. Triploids are apogamous; tetraploids are sexual or apogamous. Most Pteris species have one to two ploidy levels.
The diverse ploidy levels suggest that these species have a complex evolutionary history and that their taxonomic problems require further clarification.

1. Introduction

Polyploidy provides a rapid route for species evolution and adaptation [1, 2]. Taxa arising from polyploidy are usually characterized by diverse gene expression [3]. This variation in gene expression also has effects on ecological traits, which play an important role in speciation because a specialised niche is a key factor in the formation of new taxa [4–8]. For example, ecological isolation can allow taxa with genetic variation to become segregated [9].

It is estimated that the highest frequency of polyploidy is exhibited in ferns. The frequency of polyploid speciation in ferns is 31%, which is much higher than the 15% in angiosperms [10]. In ferns, a special form of asexual reproduction known as apogamy is common [11, 12]. Apogamy provides a bypass to crossover mispairing of chromosomes and stabilises the reproduction of polyploids [13–15]. During metaphase I of meiosis, these polyploids present multivalents, which may have difficulty separating equally. Apogamous species are clonal hybrid genotypes, and, as a result, apogamy creates reproductive barriers that prevent gene flow among closely related taxa, thereby facilitating sympatric speciation [16]. Each taxon maintains an independent genetic lineage, leading eventually to a new species.

Pteris L. (Pteridaceae) is a cosmopolitan fern genus with over 250 species. Some Pteris species, such as P. cretica and P. vittata, have several different ploidy levels and are found in several geographical areas [11, 17, 18], which likely reflects ecological differentiation within species. For example, different niche preferences have been found in P. fauriei [19]. In addition, polyploidy can also cause morphological novelty. Species complexes in Pteris have been frequently reported [20, 21].
Those species complexes usually comprise a group of taxa with similar morphologies and involve several polyploids. This paper describes the cytotypes, breeding systems, character variations, and their relationships in the genus Pteris.

2. Cytotypes of Pteris Species

The first studies of polyploidy in Pteris focused mainly on ploidy differences and apogamy of P. cretica [22]. Walker [12] provided the first comprehensive cytological study of the genus Pteris, which included 82 species, and reported that the base number of chromosomes in Pteris is 29. In the current study, data from previous cytological studies of 106 Pteris species were integrated (Table 1). The data

Table 1: Ploidy levels and breeding systems of 106 Pteris species.

Columns: Species; Sexual 2X; Apogamous 2X; ?2X; Apogamous 3X; Sexual 4X; Apogamous 4X; ?4X; Others; Reference

P. acanthoneura Alston V [2]
P. actiniopteroides Christ V [2]
P. albersii Hieron. V V [2, 25]
P. altissima Poir. V [2]
P. arguta Aiton V [2]
P. argyraea Moore V V V [2, 24]
P. aspericaulis Wall. V V [2, 26]
P. atrovirens Willd. V [2]
P. bella Tagawa V V V [27, 28]
P. berteroana Ag. V [27]
P. biaurita L. V V 6X [2, 29, 30]
P. bifurcata Ching V [24]
P. boninensis H. Ohba V [3]
P. buchananii Bak. ap. Sim. V [2]
P. burtonii Bak. V [2]
P. cadieri Christ V V [32]
P. camerooniana Kuhn V [2, 33]
P. catoptera Kze. V [2]
P. comans Forst. V [2, 24]
P. confusa T. G. Walker V [2]
P. cretica L. V V V V V [, 34 39]
P. dactylina Hook. V [2, 24, 38]
P. delchampsii W. H. Wagner & Nauman V [40]
P. deltodon Bak. V V Sexual n = 53, 55 [8, 23, 27]
P. dentata Forsskal V [2]
P. dispar Kze. V V V [2, 23, 24, 27]
P. ensiformis Burm. V V V V 5X; 2n = 84, 68 [2, 8, 23, 27, 4 44]
P. esquirolii Christ V [24]
P. excelsa Gaud. V V V V [8, 24, 45]
P. fauriei Hieron. V V [2, 27, 39, 44, 46]
P. formosana Bak. V [33]
P. friesii Hieron. V [2]
P. gallinopes Ching V [8]
P. gongalensis T. G. Walker V [2]
P. grandifolia L. V [2]
P. grevilleana Wall. ex Agardh V V [2, 32, 47]
P. haenkeana Presl V [2]
P. hamulosa Christ V [2]
P. henryi Christ V [39]
P. hexagona (L.) Proctor V [2]
P. holttumii C. Chr. V [2]
P. hookeriana Ag. V [33]
P. incompleta Cav. V [24]
P. insignis Mett. ex Kuhn V [2]
P. intricata Wright V [22]
P. kidoi Kurata V [23, 48, 49]

P. kingiana Endl. V [24]
P. kiuschiuensis Hieron. V V [2, 50, 5]
P. laurisilvicola Kurata V V [24]
P. ligulata Gaud. V [2]
P. linearis Poir. V V V [27, 28]
P. lineata Poir. et Lam. V [39]
P. longifolia L. V V [2, 24]
P. longipes D. Don V [24]
P. longipinnula Wall. V [24]
P. macilenta A. Rich. V [2]
P. marginata Bory V [2]
P. multiaurita Ag. V [2]
P. multifida Poir. V V [2, 8, 23, 27, 45, 52]
P. nakasimae Tagawa V [2, 23]
P. namegatae Kurata V [23]
P. natiensis Tagawa V [5]
P. nemoralis Willd. V [24]
P. nipponica Shieh V [33, 53]
P. orientalis Alderw. [54]
P. oshimensis Hieron. V V [8, 50, 55]
P. otaria Bedd. V [2, 29]
P. pacifica Hieron. V V [2, 24]
P. palustris Poir. V [24]
P. papuana Ces. V [2]
P. pellucida Presl V V V [2, 3]
P. pellucidifolia Hayata V [36]
P. plumula Desv. V [2]
P. podophylla Sw. V [24]
P. praetermissa T. G. Walker V [2]
P. pseudoquadriaurita Khullar V [24]
P. quadriaurita Retz. V V V [2, 20]
P. reptans T. G. Walker V [2]
P. roseo-lilacina Hieron. V [2]
P. ryukyuensis Tagawa V [55, 56]
P. saxatilis Carse V [2]
P. scabripes Wall. V V [2, 24]
P. scabristipes Tagawa V [24]
P. sefuricola Sa. Kurata V [34]
P. semipinnata L. V V [2, 23, 27]
P. setuloso-costulata Hayata V V [23, 27, 53]
P. silentvalliensis N. C. Nair V [24]
P. similis Kuhn V [2]
P. spinescens C. Presl V [24, 57]
P. stenophylla Wall. ex Hook. & Grev. V [2]
P. subquinata (Wall. ex Bedd.) Agardh V [24]
P. togoensis Hieron. V [2]

P. tokioi Masam. V V [27, 50, 55]
P. trachyphylla Kunze. V [2]
P. tremula R. Br. Sexual 8X [2]
P. tripartita Sw. V V [2, 35]
P. umbrosa R. Br. V [2]
P. vellucida V [24]
P. vittata L. V V V V 5X, 6X [, 2, 7, 8, 27, 35, 39, 44, 58]
P. wallichiana Agardh V [2, 8, 27, 44]
P. wangiana Ching V [24]
P. warburgii Christ V [54]
P. werneri (Rosenstock) Holtt. V [25]
P. wulaiensis C. M. Kuo V [36]
P. xiaoyingae H. He & L. B. Zhang V [59]
P. yamatensis (Tagawa) Tagawa V [2, 23]

"?" means unknown. P. vellucida is listed as given in the literature; it could be a misspelling of P. pellucida.

of some varieties were combined into species. The degree of polyploidy varies in Pteris, including triploid, tetraploid, pentaploid, hexaploid, and octoploid species. In addition, an aneuploid species, P. deltodon Bak., with 53, 55, and 82 chromosomes per gamete has been reported [18, 23, 24].

Among the 106 Pteris species examined, 43 (40%) are diploid, the most common cytotype in this genus (Figure 1). Among the remaining 40 polyploid species (38%), there are 13 triploids, 26 tetraploids, and 1 octoploid. Octoploidy in P. tremula R. Br. is the highest ploidy level recorded. Compared to the 34% level of polyploidy in all leptosporangiate ferns [10], polyploid speciation is apparently much more common in the genus Pteris. The highest ploidy levels found among all fern genera, such as dodecaploids and hexadecaploids in Asplenium [2], were not found in the genus Pteris.

In most of the 106 Pteris species examined, all individuals share a single ploidy level. In contrast, intraspecific polyploids are found in the other 23 species, 22% of the species reviewed. Nine species include both diploid and triploid individuals, and four species include both diploid and tetraploid plants. No species comprised only triploids and tetraploids.
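The species counts in this section can be cross-checked with a short tally. The following is a minimal sketch, not part of the original review: the category names and the helper `summarize` are hypothetical, and where the transcription has lost digits (for example the number of triploid-only species) the values are assumptions chosen so that the categories sum to the 106 species reviewed.

```python
# Species per cytotype category (Section 2 and Figure 1).
# Assumption: counts whose digits were lost in transcription were chosen so
# that all categories together sum to the 106 species reviewed.
CYTOTYPE_COUNTS = {
    "only 2X": 43,
    "only 3X": 13,
    "only 4X": 26,
    "only 8X": 1,
    "2X + 3X": 9,
    "2X + 4X": 4,
    "2X + 3X + 4X": 6,
    "others": 4,
}

# Species with a single, polyploid cytotype (polyploid speciation).
SINGLE_POLYPLOID = ("only 3X", "only 4X", "only 8X")
# Species whose individuals differ in ploidy level (intraspecific polyploidy).
INTRASPECIFIC = ("2X + 3X", "2X + 4X", "2X + 3X + 4X", "others")


def summarize(counts):
    """Group the categories and express each group as a share of all species."""
    total = sum(counts.values())
    groups = {
        "diploid only": counts["only 2X"],
        "single polyploid level": sum(counts[k] for k in SINGLE_POLYPLOID),
        "intraspecific polyploidy": sum(counts[k] for k in INTRASPECIFIC),
    }
    shares = {name: round(100 * n / total, 1) for name, n in groups.items()}
    return total, groups, shares


if __name__ == "__main__":
    total, groups, shares = summarize(CYTOTYPE_COUNTS)
    print(f"species reviewed: {total}")
    for name, n in groups.items():
        print(f"{name}: {n} species ({shares[name]}%)")
```

Under these assumptions the unrounded shares are 40.6%, 37.7%, and 21.7%, close to the 40%, 38%, and 22% quoted in the text.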
Six species have individuals with three ploidy levels: diploids, triploids, and tetraploids (Figure 1). Four species with other intraspecific ploidy combinations are grouped as "Others" in Figure 1. Specifically, Pteris biaurita L. and P. ensiformis Burm. include individuals with four ploidy levels, that is, diploids, triploids, tetraploids, and hexaploids. Pteris vittata L. also exhibits pentaploidy in addition to these four ploidy levels. Finally, Pteris deltodon is an aneuploid species.

Some of the triploids and tetraploids in the genus clearly arose from autopolyploidy. For example, 29 trivalent chromosomes were found in the triploid P. fauriei Hieron. (P. fauriei var. fauriei) [27], and 29 tetravalent chromosomes were found in Pteris tokioi Masam. [27]. Overall, the frequency of this intraspecific variation in ploidy level in Pteris is lower than the 33% reported for all leptosporangiate ferns [10].

Figure 1: The cytotypes of the 106 Pteris species in this review (categories: only 2X, only 3X, only 4X, only 8X, 2X + 3X, 2X + 4X, 3X + 4X, 2X + 3X + 4X, others).

3. Polyploids and Breeding Systems

The most common breeding system in the genus Pteris is sexual (48 species) (Figure 2), followed by apogamous (33 species), then species with both sexual and apogamous reproduction (13 species). Sexual reproduction is more frequent than apogamous reproduction in both diploids (28% > 11%) and tetraploids (15% > 7%) (Figure 3). Some other cytotypes could be sexual. For example, P. tremula is a sexual octoploid. Although P. deltodon is aneuploid, sexual diploids (n = 53, 55) are also found in this species [18]. Apogamy is found in 39% of all cytotypes and 28% of polyploids (Figure 3; diploids 11%, triploids 21%, tetraploids 7%).

Figure 2: The breeding systems of the 106 Pteris species in this review (categories: sexual, apogamous, sexual and apogamous, unknown).

Figure 3: The breeding systems and cytotypes of the 106 Pteris species in this paper (categories, in %: sexual 2X, apogamous 2X, ?2X, apogamous 3X, sexual 4X, apogamous 4X, ?4X, others).

Apogamous diploids are considered to originate from the hybridisation of two sexual diploid species followed by acquired apogamy or from genetic change of a sexual diploid species [11]. Apogamous triploids could derive from a cross between sexual diploid and tetraploid species or between apogamous diploids (unreduced, diploid gametes, functionally male) and sexual diploids [5, 4, 48, 60]. Diploids appear to be the ancestors of the triploids; however, ploidy reduction is also possible: triploids produced the diploid apogamous Dryopteris pacifica. Such diploids may be derived from partial synapsis and segregation [61].

All triploids were found to be apogamous. In both autotriploids and allotriploids, disordered chromosome separation occurs; trivalents in autotriploids and a bivalent plus a univalent in allotriploids cannot be resolved into balanced products. Without apogamous reproduction, spores from autotriploids do not have balanced chromosome complements and, thus, are not viable [62, 63].

The number of sexual tetraploids is greater than that of apogamous ones. Tetraploids could arise via chromosome doubling in a diploid and maintain sexual reproduction thereafter. Tetraploids can also arise from another mechanism, the so-called triploid bridge [2]. In this case, a triploid arises from the fusion of a reduced (haploid) and an unreduced (diploid) gamete. Furthermore, crossing of the unreduced gametes of that apogamous triploid with the haploid gametes of a sexual diploid could produce apogamous tetraploids [64].

4. Polyploidy and the Variation of Pteris Species

The variation among infraspecific polyploids may reveal the contribution of polyploidy to speciation. Such variation could include differences in morphology, ecology, cytology, and reproduction. These variations might not be easily distinguished in infraspecific polyploids, which are usually referred to as a species complex. Cryptic species may also be hidden within such complexes. Below are some examples of these phenomena in Pteris species.

Pteris fauriei includes sexual diploids and apogamous triploids, P. fauriei var. minor and P. fauriei var. fauriei, respectively [46]. Because the triploids have 29 trivalents at meiosis [27], it is likely that they arose via autopolyploidy. Diploids of this species often grow in exposed sites and grasslands and prefer warmer habitats than the triploids [19]. Although only genome dosage distinguishes the diploid and triploid taxa, polyploidy has caused ecological differentiation between them.

Pteris cretica is widely distributed in warm-temperate and tropical parts of the Old World [65]. It may be one of the most attractive subjects for Pteris studies. Early studies revealed apogamy and different ploidy levels in the species [11], while subsequent studies reported variable morphology and ploidy levels, including diploids, triploids, and tetraploids [12, 34–36] (Table 1). Although the sexual diploid P. cretica has been reported, apogamous diploids, triploids, and tetraploids of this species suggest possible hybridisation events. The presence of a bivalent plus a univalent during meiosis provides distinct evidence of such hybridisation [37, 66]. Based on allozyme banding patterns, the triploid P. cretica may derive from the diploid apogamous P. cretica and the diploid sexual P. kidoi [48]. Furthermore, an apogamous intermediate form between the sexual tetraploid P. multifida and the apogamous diploid P. cretica has been reported [34].
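The chromosome arithmetic behind these statements can be sketched in a few lines. This is a minimal illustration rather than a method from the review: it assumes the base number x = 29 reported for Pteris and the usual cytological notation in which I, II, III, and IV denote univalents, bivalents, trivalents, and quadrivalents. The function names `chromosome_count` and `ploidy_level` are hypothetical.

```python
import re

BASE_NUMBER = 29  # base chromosome number x reported for Pteris

# Chromosomes contributed by one association of each type at meiosis.
VALENCY = {"I": 1, "II": 2, "III": 3, "IV": 4}


def chromosome_count(configuration: str) -> int:
    """Total chromosomes in a meiotic configuration such as '29II + 29I'."""
    total = 0
    for term in configuration.split("+"):
        match = re.fullmatch(r"\s*(\d+)\s*(IV|III|II|I)\s*", term)
        if match is None:
            raise ValueError(f"unrecognised term: {term!r}")
        total += int(match.group(1)) * VALENCY[match.group(2)]
    return total


def ploidy_level(chromosomes: int, base: int = BASE_NUMBER):
    """Ploidy as a multiple of the base number, or None for an aneuploid count."""
    multiple, remainder = divmod(chromosomes, base)
    return multiple if remainder == 0 else None


if __name__ == "__main__":
    # 29 trivalents, as reported for the triploid P. fauriei var. fauriei.
    print(ploidy_level(chromosome_count("29III")))
    # A bivalent plus a univalent per chromosome set, as in an allotriploid.
    print(ploidy_level(chromosome_count("29II + 29I")))
    # Triploid bridge: reduced (x) plus unreduced (2x) gamete, then an
    # unreduced triploid gamete (3x) plus a haploid gamete (x).
    triploid = BASE_NUMBER + 2 * BASE_NUMBER
    tetraploid = triploid + BASE_NUMBER
    print(ploidy_level(triploid), ploidy_level(tetraploid))
    # A gametic number of n = 53 gives 2n = 106, not a multiple of 29.
    print(ploidy_level(2 * 53))
```

Returning `None` for counts that are not multiples of 29 is a simplification: it flags aneuploidy without attempting to say which chromosomes were gained or lost.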
Pteris ensiformis occurs in India, Sri Lanka, SE Asia, and Polynesia, although now it is also widely naturalised elsewhere in the tropics [65]. Its cytotypes include diploid, triploid, tetraploid, and pentaploid. The triploid may have arisen from a cross between sexual diploids and sexual tetraploids. The origin of the pentaploid is more complicated to infer; however, hybridisation is suggested by the failure of chromosome pairing at meiosis [4]. Pteris ensiformis var. victoriae Ba. was reported to be aneuploid (2n = 84, 68) [42]. Given the various morphologies of P. ensiformis, this species likely underwent multiple hybridisations and possibly contains cryptic species.

Pteris quadriaurita is a well-known species complex that includes P. quadriaurita sensu stricto, P. multiaurita, P. confusa, and P. otaria [20, 21]. Their radically different morphologies make these taxa difficult to identify. Field

hybridisation experiments have provided evidence that this species complex arose from recurrent hybridisation events, that is, a hybrid swarm. The coexistence of sexual and apogamous reproductive systems and diverse ploidy levels, including diploid, triploid, and tetraploid, indicates that this complex is undergoing speciation.

Pteris deltodon has diploids and triploids. Furthermore, its aneuploid cytotypes, n = 53 and 55, indicate that it is also hypotetraploid (Table 1). Aneuploids are rarely reported, with only two records in China and Japan [18, 23]. The species likely arose from allopolyploidy, but it is now a stable species because of its sexual reproduction and 64 spores per sporangium [18].

The Chinese ladder brake P. vittata shows considerable morphological variation and a wide geographical distribution throughout the world. Diverse ploidy levels and reproductive modes have been recorded, including sexual diploids, triploids, sexual and apogamous tetraploids, pentaploids, and hexaploids (Table 1). Furthermore, the spore mother cells show a variety of multivalents, such as 20I + 26II + 5III, 9I + 45II + 3III + 2IV, 29II + 29I, and 29II + 87I [17, 18], indicating the occurrence of allopolyploidy. The species involved are not clear, and further taxonomic study of this species complex is needed.

5. Conclusion

We reviewed the cytological data of 106 Pteris species; 60% of them exhibit polyploidy, and polyploid speciation accounts for 38%. This ratio, however, could be an underestimate. Since the taxonomy of Pteris remains unclear, some cryptic species may exist among the 22% of species with intraspecific polyploidy. Integration of further cytological data with reproductive and morphological studies should clarify Pteris systematics, including species delimitations, and also the evolutionary history of its taxa.

Authors' Contribution

Y. S. Chao and H. Y. Liu contributed equally to this work.

Acknowledgments

The authors thank Dr.
Craig Martin for English editorial assistance and an anonymous reviewer for valuable comments. This study was supported by grants from Taiwan Forestry Research Institute (96AS-..2-FI-G, 97AS-.2.2-EI-W4, and 98AS-8.2.-F-G) awarded to W.-L. Chiou, and the National Science Council of Taiwan awarded to Y.-C. Chiang.

References

[1] V. Grant, Plant Speciation, Columbia University Press, New York, NY, USA, 1981.
[2] L. H. Rieseberg and J. H. Willis, Plant speciation, Science, vol. 37, no. 5840, pp ,
[3] P. S. Soltis and D. E. Soltis, The role of genetic and genomic attributes in the success of polyploids, Proceedings of the National Academy of Sciences of the United States of America, vol. 97, no. 3, pp ,
[4] C. N. Page, Ecological strategies in fern evolution: a neopteridological overview, Review of Palaeobotany and Palynology, vol. 9, no. -2, pp. 33,
[5] C. H. Haufler, M. D. Windham, D. M. Britton, and S. J. Robinson, Triploidy and its evolutionary significance in Cystopteris protrusa, Canadian Journal of Botany, vol. 63, no. 0, pp , 1985.
[6] J. Ramsey and D. W. Schemske, Neopolyploidy in flowering plants, Annual Review of Ecology and Systematics, vol. 33, pp ,
[7] D. M. Rosenthal, A. E. Schwarzbach, L. A. Donovan, O. Raymond, and L. H. Rieseberg, Phenotypic differentiation between three ancient hybrid taxa and their parental species, International Journal of Plant Sciences, vol. 63, no. 3, pp ,
[8] B. L. Gross and L. H. Rieseberg, The ecological genetics of homoploid hybrid speciation, Journal of Heredity, vol. 96, no. 3, pp ,
[9] J. C. Vogel, F. J. Rumsey, S. J. Russell et al., Genetic structure, reproductive biology and ecology of isolated populations of Asplenium csikii (Aspleniaceae, Pteridophyta), Heredity, vol. 83, no. 5, pp , 1999.
[10] T. E. Wood, N. Takebayashi, M. S. Barker, I. Mayrose, P. B. Greenspoon, and L. H. Rieseberg, The frequency of polyploid speciation in vascular plants, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 33, pp ,
[11] I. Manton, Problems of Cytology and Evolution in the Pteridophyta, Columbia University Press, New York, NY, USA, 1950.
[12] T. G. Walker, Cytology and evolution in the fern genus Pteris L., Evolution, vol. 6, no., pp. 7 43, 1962.
[13] J. D. Lovis, Evolutionary patterns and processes in ferns, Advances in Botanical Research, vol. 4, no. C, pp , 1978.
[14] T. G. Walker, The Experimental Biology of Ferns, Academic Press, London, UK, 1979, Edited by A. F. Dyer.
[15] L. G. Hickok and E. J. Klekowski Jr., Inchoate speciation in Ceratopteris: an analysis of the synthesized hybrid C. richardii x C. pteridoides, Evolution, vol. 28, pp , 1974.
[16] C. R. Werth and M. D. Windham, A model for divergent, allopatric speciation of polyploid pteridophytes resulting from silencing of duplicate-gene expression, The American Naturalist, vol. 37, no. 4, pp , 1991.
[17] P. Khare and S. Kaur, Intraspecific polyploidy in Pteris vittata Linn., Cytologia, vol. 48, no., pp. 2 25, 1983 (Japanese).
[18] Z. R. Wang, A preliminary study on cytology of Chinese Pteris, Acta Phytotaxonomica Sinica, vol. 27, no. 6, pp , 1989.
[19] Y.-M. Huang, H.-M. Chou, J.-C. Wang, and W.-L. Chiou, The distribution and habitats of the Pteris fauriei complex in Taiwan, Taiwania, vol. 52, no., pp ,
[20] T. G. Walker, The Pteris quadriaurita complex in Ceylon, Kew Bulletin, vol. 4, no. 3, pp , 1954.
[21] T. G. Walker, Hybridization in some species of Pteris L., Evolution, vol. 2, no., pp , 1958.
[22] I. Manton, Chromosomes and fern phylogeny with special reference to Pteridaceae, Journal of the Linnean Society of London, vol. 56, no. 365, pp , 1958.
[23] K. Iwatsuki, Flora of Japan-Pteridophyta and Gymnospermae, Kodansha, Tokyo, Japan, 1995, Edited by K. Iwatsuki, T. Yamazaki, D. E. Boufford and H. Ohba.
[24] Index to Plant Chromosome Numbers (IPCN), Missouri Botanical Garden, St. Louis, Mo, USA, 1979, Edited by P. Goldblatt and D. E. Johnson.
[25] T. G. Walker, Additional cytogenetic notes on the pteridophytes of Jamaica, Transactions of the Royal Society of Edinburgh, vol. 69, pp , 1973.
[26] L. S. Ammal and K. V. Bhavanandan, Cytological studies on some members of Pteridaceae (sensu Copeland) from south India, Indian Fern Journal, vol. 8, no. -2, pp , 1991.
[27] J.-L. Tsai and W.-C. Shieh, A cytotaxonomic survey of the pteridophytes in Taiwan (2) chromosome and spore characteristics, Journal of Science & Engineering, vol. 2, pp , 1984.
[28] Y. J. Chang, W. C. Shieh, and J. L. Tsai, Studies on the karyotypes of the fern genus Pteris in Taiwan, in Proceedings of the 2nd Seminar on Asian Pteridology, Taipei, Taiwan, 1992.
[29] I. Manton and W. A. Sledge, Observations on the cytology and taxonomy of the Pteridophyte flora of Ceylon, Philosophical Transactions of the Royal Society B, vol. 238, pp , 1954.
[30] S. K. Roy and J. B. Singh, A note on the chromosome numbers in some ferns from Pachmarhi Hills, Central India, vol. 4, pp. 8 83, 1975.
[31] P. N. Mehra, Chromosome number in Himalayan ferns, Research Bulletin (n.s.) of Panjab University, vol. 2, pp , 1961.
[32] Y. S. Chao, H. Y. Liu, Y. M. Huang, and W. L. Chiou, Reproductive traits of Pteris cadieri and P. grevilleana in Taiwan: implications for their hybrid origin, Botanical Studies, vol. 5, no. 2, pp , 200.
[33] Á. Löve, D. Löve, and R. E. G. Sermolli, Cytotaxonomical Atlas of the Pteridophyta, Vaduz, Liechtenstein, 1977, Edited by J. Cramer.
[34] N. Nakatô, A cytological study on an intermediate form between Pteris multifida and P. cretica, Journal of Japanese Botany, vol. 50, no. 4, pp. 0 25, 1975.
[35] R. P. Roy, B. M. B. Sinha, and A. R. Sakya, Cytology of some ferns of Kathmandu valley, Fern Gazette, vol. 0, pp , 1971.
[36] Y.-M. Huang, S.-Y. Hsu, T.-H. Hsieh, H.-M. Chou, and W.-L. Chiou, Three Pteris species (Pteridaceae: Pteridophyta) reproduce by apogamy, Botanical Studies, vol. 52, p. 79, 2011.
[37] S. C. Verma and S. P. Khullar, Cytogenetics of the Western Himalayan Pteris cretica complex, Annals of Botany, vol. 29, no. 4, pp , 1965.
[38] S.-J. Lin, K. Iwatsuki, and M. Kato, Cytotaxonomic study of ferns from China I. Species of Yunnan, Journal of Japanese Botany, vol. 7, no. 4, pp , 1996.
[39] M. Kato, N. Nakato, X. Cheng, and K. Iwatsuki, Cytotaxonomic study of ferns of Yunnan, southwestern China, The Botanical Magazine Tokyo, vol. 05, no., pp , 1992.
[40] W. H. J. Wagner and C. E. Nauman, Pteris × delchampsii, a spontaneous fern hybrid from southern Florida, American Fern Journal, vol. 72, pp , 1982.
[41] A. Abraham, Studies on the cytology and phylogeny of the Pteridophytes VII. Observation on one hundred species of South Indian ferns, Journal of the Indian Botanical Society, vol. 4, pp , 1962.
[42] P. I. Kuriachan and C. A. Ninan, Aspects of Plant Sciences, Today & Tomorrow's Printers, New Delhi, India, 1976, Edited by P. K. K. Nair.
[43] W.-L. Chiou, The gametophytes of Pteris ensiformis Burm, in Proceedings of the Seminar of Asian Pteridology II, Taipei, Taiwan, 1992.
[44] K. Mitui, Chromosome numbers of some ferns in the Ryukyu Islands, Journal of Japanese Botany, vol. 5, pp. 33 4, 1976.
[45] N. Nakatô, A cytogeographic study on the Japanese Pteris excelsa complex, Journal of Japanese Botany, vol. 5, pp , 1976.
[46] Y. M. Huang, H. M. Chou, T. H. Hsieh, J. C. Wang, and W. L. Chiou, Cryptic characteristics distinguish diploid and triploid varieties of Pteris fauriei (Pteridaceae), Canadian Journal of Botany, vol. 84, no. 2, pp ,
[47] N. Nakato, Notes on chromosomes of Japanese pteridophytes, Journal of Japanese Botany, vol. 65, pp , 1990.
[48] T. Suzuki and K. Iwatsuki, Genetic variation in apogamous fern Pteris cretica L. in Japan, Heredity, vol. 65, pp , 1990.
[49] N. Nakato, Notes on chromosomes of Japanese pteridophytes, Journal of Japanese Botany, vol. 63, pp , 1988.
[50] K. Mitui, Chromosomes and speciation in ferns, Science Reports of the Tokyo Kyoiku Daigaku B, vol. 3, pp , 1968.
[51] S. Kurita, Chromosome number of some Japanese ferns III, Journal of the College of Arts and Sciences, Chiba University, Natural Science Series, vol. 8, pp , 1962.
[52] S. M. Kawakami, M. Ito, and S. Kawakami, Apogamous sporophyte formation in a fern Pteris multifida and its characteristics, Journal of Plant Research, vol. 08, no. 2, pp. 8 84, 1995.
[53] K. Mitui, Chromosome studies on Japanese ferns, Journal of Japanese Botany, vol. 4, pp. 6 64, 1966.
[54] R. E. Holttum and S. K. Roy, Cytological observations on ferns from New Guinea with descriptions of new species, Blumea, vol. 3, pp , 1965.
[55] K. Mitui, Chromosome studies on Japanese ferns, Journal of Japanese Botany, vol. 42, pp. 05 0, 1967.
[56] T. Nakaike, New Flora of Japan Pteridophyta, Shibundo Company, Tokyo, Japan, 1992.
[57] N. Nakato and M. Kato, Chromosome numbers of ten species of ferns from Guam, S. Mariana Isls., Acta Phytotaxonomica et Geobotanica, vol. 52, pp , 200.
[58] J. Srivastava, S. A. Ranade, and P. B. Khare, Distribution and threat status of the cytotypes of Pteris vittata L. (Pteridaceae) species complex in India, Current Science, vol. 93, no., pp. 8 85,
[59] H. He and L. B. Zhang, Pteris xiaoyingae, sp. nov. (sect. Pteris) from a karst cave in China based on morphological and palynological evidence, Systematic Botany, vol. 35, no. 4, pp , 200.
[60] S.-J. Lin, M. Kato, and K. Iwatsuki, Electrophoretic variation of the apogamous Dryopteris varia group (Dryopteridaceae), Journal of Plant Research, vol. 08, no. 4, pp , 1995.
[61] S.-J. Lin, M. Kato, and K. Iwatsuki, Diploid and triploid offspring of triploid agamosporous fern Dryopteris pacifica, The Botanical Magazine Tokyo, vol. 05, no. 3, pp , 1992.
[62] L. Comai, The advantages and disadvantages of being polyploid, Nature Reviews Genetics, vol. 6, no., pp ,
[63] R. J. Singh, Plant Cytogenetics,
[64] Y. Watano and K. Iwatsuki, Genetic variation in the Japanese apogamous form of the fern Asplenium unilaterale Lam, The Botanical Magazine Tokyo, vol. 0, no. 3, pp , 1988.
[65] K. U. Kramer and P. M. McCarthy, Flora of Australia: Ferns, Gymnosperms and Allied Genera, Collingwood, 1998.
[66] J. Jha and B. M. Sinha, Cytomorphological variability in apogamous populations of Pteris cretica L., Caryologia, vol. 40, pp. 7 78, 1987.

14 Hindawi Publishing Corporation Journal of Botany Volume 202, Article ID , 6 pages doi:0.55/202/ Research Article Exploring Diversification and Genome Size Evolution in Extant Gymnosperms through Phylogenetic Synthesis J. Gordon Burleigh, W. Brad Barbazuk, John M. Davis, 2 Alison M. Morse, 2 and Pamela S. Soltis 3 Department of Biology, University of Florida, Gainesville, FL 326, USA 2 School of Forest Resources and Conservation, University of Florida, Gainesville, FL 326, USA 3 Florida Museum of Natural History, University of Florida, Gainesville, FL 326, USA Correspondence should be addressed to J. Gordon Burleigh, gburleigh@ufl.edu Received 2 June 20; Accepted 20 September 20 Academic Editor: Hiroyoshi Takano Copyright 202 J. Gordon Burleigh et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Gymnosperms, comprising cycads, Ginkgo, Gnetales, and conifers, represent one of the major groups of extant seed plants. Yet compared to angiosperms, little is known about the patterns of diversification and genome evolution in gymnosperms. We assembled a phylogenetic supermatrix containing over 4.5 million nucleotides from 739 gymnosperm taxa. Although 93.6% of the cells in the supermatrix are empty, the data reveal many strongly supported nodes that are generally consistent with previous phylogenetic analyses, including weak support for Gnetales sister to Pinaceae. A lineage through time plot suggests elevated rates of diversification within the last 00 million years, and there is evidence of shifts in diversification rates in several clades within cycads and conifers. A likelihood-based analysis of the evolution of genome size in 65 gymnosperms finds evidence for heterogeneous ratesofgenomesizeevolutionduetoanelevatedrateinpinus.. 
1. Introduction

Recent advances in sequencing technology offer the possibility of identifying the genetic mechanisms that influence evolutionarily important characters and ultimately drive diversification. Within angiosperms, large-scale phylogenetic analyses have identified complex patterns of diversification (e.g., [1–3]), and numerous genomes are at least partially sequenced. Yet the other major clade of seed plants, the gymnosperms, has received far less attention, with few comprehensive studies of diversification and no sequenced genomes. Note that throughout this paper "gymnosperms" refers only to the approximately 1000 extant species within cycads, Ginkgo, Gnetales, and conifers. These comprise the Acrogymnospermae clade described by Cantino et al. [4].

Many gymnosperms have exceptionally large genomes (e.g., [5–7]), and this has hindered whole-genome sequencing projects, especially among economically important Pinus species. This large genome size is interesting because one suggested mechanism for rapid increases in genome size, polyploidy, is rare among gymnosperms [8]. Recent sequencing efforts have elucidated some of the genomic characteristics associated with the large genome size in Pinus. Morse et al. [9] identified a large retrotransposon family in Pinus that, with other retrotransposon families, accounts for much of the genomic complexity. Similarly, recent sequencing of 10 BAC (bacterial artificial chromosome) clones from Pinus taeda identified many conifer-specific LTR (long terminal repeat) retroelements [10]. These studies suggest that the large genome size may be caused by rapid expansion of retrotransposons and may be limited to conifers, Pinaceae, or Pinus. Other studies have quantified patterns of genome size among gymnosperms, especially within Pinus and the other Pinaceae [6, 7, 11–14].
These studies have largely focused on finding morphological, biogeographic, or life history correlates of genome size, but the rates and patterns of genome size evolution in gymnosperms are largely unknown. This study first synthesizes the available phylogenetically informative sequences to build a phylogenetic hypothesis of

gymnosperms that reflects the recent advances in sequencing and computational phylogenetics. The resulting tree provides a starting point for large-scale evolutionary and ecological analyses of gymnosperms and will hopefully be a resource to promote and guide future phylogenetic and comparative studies. We use the tree to examine large-scale patterns of diversification of the extant gymnosperm lineages and also to examine rates of genome size evolution.

2. Methods and Materials

2.1. Supermatrix Phylogenetic Inference. We constructed a phylogenetic hypothesis of gymnosperms from the phylogenetically informative sequence data available in GenBank on June 30. We first downloaded from GenBank all core nucleotide sequence data from gymnosperms (Coniferophyta, Cycadophyta, Ginkgophyta, and Gnetophyta). Additionally, we downloaded sequences from the basal angiosperm lineages (e.g., Amborella, Nymphaeales, Chloranthaceae, and Austrobaileyales) to represent the angiosperms and a diverse sampling of Moniliformopses taxa (including species from Equisetum, Psilotum, Ophioglossum, Botrychium, Angiopteris, and Adiantum) to use as outgroups. To identify sets of homologous sequences from the GenBank data, we clustered sequences less than 10,000 bp in length based on results from an all-by-all pairwise BLAST analysis. The all-by-all blastn search was done with blastall using the default parameters [15]. Significant BLAST hits had a maximum e-value of 1.0e-10 and at least 50% overlap of both the target and query sequences. A Perl script identified the largest clusters of sequences in which each sequence has a significant BLAST hit against at least one other sequence in the cluster. We only considered clusters containing loci that had been used previously for phylogenetic analyses. These included plastid and mitochondrial loci as well as nuclear 18S rDNA, 26S rDNA, and internal transcribed spacer (ITS) sequences.
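The clustering step described above amounts to single-linkage clustering over significant BLAST hits. The following is a minimal sketch of that idea in Python, not the authors' Perl script. The function name `cluster_hits`, the tuple layout of a hit, and the default thresholds (an e-value cutoff of 1e-10 and 50% overlap on both sequences) are assumptions made for illustration.

```python
from collections import defaultdict


def cluster_hits(hits, max_evalue=1e-10, min_overlap=0.5):
    """Single-linkage clustering of sequences joined by significant hits.

    `hits` is an iterable of (query, target, evalue, query_overlap,
    target_overlap) tuples; this layout is hypothetical, not a BLAST format.
    Returns clusters (sets of sequence ids), largest first. A sequence is
    included only if it has a significant hit to at least one other sequence.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, target, evalue, q_overlap, t_overlap in hits:
        significant = (
            query != target
            and evalue <= max_evalue
            and q_overlap >= min_overlap
            and t_overlap >= min_overlap
        )
        if significant:
            parent[find(query)] = find(target)  # merge the two clusters

    clusters = defaultdict(set)
    for seq in list(parent):
        clusters[find(seq)].add(seq)
    return sorted(clusters.values(), key=len, reverse=True)


example_hits = [
    ("a", "b", 1e-20, 0.9, 0.8),
    ("b", "c", 1e-15, 0.6, 0.7),
    ("d", "e", 1e-30, 0.9, 0.9),
    ("a", "d", 1e-3, 0.9, 0.9),   # rejected: e-value too high
    ("c", "e", 1e-40, 0.4, 0.9),  # rejected: query overlap below 50%
]
print(cluster_hits(example_hits))
```

Single linkage means that two sequences can share a cluster without hitting each other directly, as long as a chain of significant hits connects them. That matches the stated criterion that each sequence needs a significant hit against at least one other member of its cluster.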
Among these clusters, those containing sequences from at least 5 taxa were aligned using Muscle [16], and the resulting alignments were manually checked and adjusted. The resulting alignments were edited for inclusion in the supermatrix by removing hybrid taxa and those that lacked a specific epithet and also keeping only a single sequence per species. The final cluster alignments were then concatenated to make a single phylogenetic supermatrix (e.g., [17]).

2.2. Phylogenetic and Dating Analysis. To estimate the optimal topology and molecular branch lengths for the gymnosperms, we performed maximum likelihood (ML) phylogenetic analysis on the full supermatrix alignment using RAxML-VI-HPC version [18]. All ML analyses used the general time reversible (GTR) nucleotide substitution model with the default settings for the optimization of individual per-site substitution rates and classification of these rates into rate categories. To assess uncertainty in the topology and branch length estimates, we ran 100 nonparametric bootstrap replicates on the original data set [19]. We transformed the optimal and bootstrap trees to chronograms, ultrametric trees in which the branch lengths represent time, using penalized likelihood [20] implemented in r8s version.7 [21]. We used a smoothing parameter of 0000, which was chosen based on cross-validation of the fossil constraints. For the r8s analysis, we used the same time constraints on seed plant clades used by Won and Renner [22]. The most recent common ancestor of seed plants was constrained to a maximum age of 385 million years ago (mya). The most recent common ancestor of the extant gymnosperms was fixed at 35 mya and Gnetum at 0 mya. The following clades were given minimum age constraints: angiosperms: 25 mya, cycads: 270 mya, Cupressaceae: 90 mya, Araucariaceae: 60 mya, Gnetales + Pinaceae: 225 mya, Pinaceae: 90 mya, and Gnetales: 25 mya.

2.3. Diversification Analysis.
To examine the general patterns of diversification through time among the extant gymnosperm lineages, we first made lineage through time plots using the R package APE [23]. To account for uncertainty in the dating estimates, we plotted each bootstrap tree after it had been transformed into a chronogram and all nongymnosperm taxa were removed. Since there appears to be much variance in the divergence time estimates among trees, and branch length estimates are often unreliable, especially when estimated from such a sparse, heterogeneous sequence matrix, we used a test for changes in diversification rate that relies on tree shape, not branch lengths. Specifically, we used the whole-tree, topology-based test described by Moore et al. [24] to detect nodes associated with significant shifts in diversification rate based on the Δ statistic. The analyses were performed using the apTreeshape R package [25]. We used only the optimal tree estimate and again pruned all nongymnosperm taxa from the tree prior to analysis.

2.4. Rates of Genome Size Evolution. We first assembled a set of mean genome size data (in pg DNA) for all gymnosperms present in the phylogenetic tree from the Kew C-value database [26]. This includes data from the studies of Murray [6] and Grotkopp et al. [14]. When there were multiple estimates available from a single species, we used the mean of the estimates. We tested for shifts in rates of genome size evolution using Brownie v [27]. We used the censored rate test, which tests for differences in rates of evolution of a continuous character (genome size) in one clade versus another clade or paraphyletic group based on a Brownian motion model.
We made the following comparisons: conifers + Gnetales versus cycads + Ginkgo, Pinaceae versus other conifers + Gnetales, non-Pinaceae conifers + Gnetales versus cycads + Ginkgo, Pinus versus other Pinaceae, the non-Pinus Pinaceae versus the other conifers + Gnetales, and Pinus subgenus Strobus versus Pinus subgenus Pinus. To account for topological and branch length uncertainty, we performed all hypothesis tests in Brownie on each bootstrap tree and weighted the results across replicates. The penalized likelihood analysis in r8s collapsed some branch lengths to 0, and Brownie does not work with 0 branch lengths in the tree. Thus, prior to the Brownie analysis, all 0 branch lengths were changed to 0.1.
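The logic of comparing a one-rate and a two-rate Brownian motion model can be illustrated with a simplified likelihood ratio test. This is a sketch under stated assumptions, not Brownie's implementation: it assumes that standardized independent contrasts have already been computed separately within each of the two groups, so that under Brownian motion they are independent draws from a normal distribution with mean zero and variance equal to the rate. The function names are hypothetical.

```python
import math


def bm_rate(contrasts):
    """Rate estimate: mean squared standardized contrast."""
    if not contrasts:
        raise ValueError("need at least one contrast")
    return sum(c * c for c in contrasts) / len(contrasts)


def bm_loglik(contrasts, rate):
    """Log-likelihood of contrasts that are i.i.d. Normal(0, rate)."""
    n = len(contrasts)
    sum_sq = sum(c * c for c in contrasts)
    return -0.5 * n * math.log(2 * math.pi * rate) - sum_sq / (2 * rate)


def two_rate_test(contrasts_a, contrasts_b):
    """Likelihood ratio test of one shared rate versus one rate per group.

    Returns (rate_a, rate_b, lrt, p_value). The p-value uses a chi-squared
    distribution with 1 degree of freedom, whose survival function is
    erfc(sqrt(x / 2)). Assumes each group has a nonzero rate estimate.
    """
    a, b = list(contrasts_a), list(contrasts_b)
    rate_a, rate_b = bm_rate(a), bm_rate(b)
    pooled = a + b
    loglik_two = bm_loglik(a, rate_a) + bm_loglik(b, rate_b)
    loglik_one = bm_loglik(pooled, bm_rate(pooled))
    lrt = max(0.0, 2 * (loglik_two - loglik_one))  # guard against rounding
    p_value = math.erfc(math.sqrt(lrt / 2))
    return rate_a, rate_b, lrt, p_value


# Made-up contrasts for illustration only; these are not data from the study.
slow_group = [1.0, -1.0, 1.0, -1.0]
fast_group = [2.0, -2.0, 2.0, -2.0]
print(two_rate_test(slow_group, fast_group))
```

With so few contrasts the fourfold difference in estimated rate is not significant, which shows why the number of sampled taxa in each group matters for this kind of test.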

3. Results

3.1. Phylogenetic Data. The alignment from the complete supermatrix contains sequences from 950 taxa (739 gymnosperms, 108 angiosperms, and 103 nonseed plant outgroups) and is 74,05 characters in length. The 739 gymnosperm taxa include at least one representative from every family as well as from 88 genera. In total, the matrix contains 4,5,44 nucleotides and 93.6% missing data. The number of nucleotides per taxon ranges from 252 to 33,38 (average = 4,749; median = 3,355).

3.2. Phylogenetic Inference. In the 950-taxon trees, 63.3% (601) of the nodes have at least 50% bootstrap support, 41.7% (396) have at least 70% support, 25.8% (245) have at least 90% support, and 9.7% (92) have 100% support. The seed plants have 100% support, and the angiosperms are sister to a clade of all gymnosperms (Figure 1). Within gymnosperms, a clade of Ginkgo + cycads (bootstrap support (BS) = 66%) is sister to a clade consisting of conifers + Gnetales (BS = 96%). Gnetales are sister to Pinaceae within the conifers, although the Gne-Pine clade has only 57% support. Within the major groups of gymnosperms (conifers, Gnetales, and cycads), family-level and generic relationships generally are congruent with those inferred in other analyses (Figure 1). Of the 54 gymnosperm genera represented by more than one species in the tree, 47 have at least 50% bootstrap support, 36 have at least 90% bootstrap support, and 26 have 100% bootstrap support. A full version of the bootstrap consensus tree is available as Supplemental Material.

3.3. Diversification. Although the lineage through time plots display much variation among bootstrap replicates (Figure 2), the general trend among the bootstrap trees is similar, with what appears to be high and possibly increasing diversification over the last 100 million years. Still, lineage through time plots are imprecise and difficult to interpret.
If this trend of high recent diversification were true, we would expect to find evidence of increased rates of diversification in some relatively young clades. The Δ statistic indicated a significant shift in the rates of diversification at 10 nodes in the tree. Several are within the cycads. These include the node dividing Cycas and Epicycas species from the other cycads (P = ) and its daughter node separating Cycas, Epicycas, and Dioon from the other cycads (P = 0.57). Also, the two basal-most nodes of Zamia show significant shifts in diversification rates (P = 0.04 and 0.36). Within conifers, there is a significant shift in diversification at the most recent common ancestor of Podocarpus (P = 0.07). Also, there are significant shifts in diversification at the two basal nodes of Cupressaceae (P = and ) and within Cupressaceae, at the most recent common ancestor of Callitris, Neocallitropsis, Actinostrobus, Widdringtonia, Fitzroya, Diselma, and Austrocedrus (P = ). Finally, there is a significant shift in two of the basal nodes of Picea (P = 0.066, P = ).

Figure 1: Overview of the ML tree of 739 gymnosperm taxa; angiosperms and outgroups have been removed. Colors represent the different families of conifers (Pinaceae, Araucariaceae, Cephalotaxaceae, Cupressaceae, Podocarpaceae, Sciadopityaceae, and Taxaceae), Gnetales, Ginkgo, and the families of cycads (Cycadaceae, Stangeriaceae, and Zamiaceae). A full bootstrap consensus tree is available as Supplementary Material available online at doi:10.1155/2012/

Figure 2: Lineage through time plot for the gymnosperm species (axes: N (lineages) versus time (MYA)). All bootstrap trees, with ultrametric branch lengths from r8s, were pruned to include only the gymnosperm taxa. Each line represents a single ML bootstrap tree. The graph shows the pattern of diversification of the gymnosperm taxa in the tree through time, as the tree grew from a single lineage at the root to the current sampling of 739 species.
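The curve drawn for each tree in a lineage through time plot follows directly from the ages of the internal nodes of a chronogram. The sketch below shows that bookkeeping in Python. It is an illustration only, not the APE code used in the study, and it assumes a fully bifurcating ultrametric tree whose internal node ages (time before present) have already been extracted. The function names and the example ages are hypothetical.

```python
def lineages_through_time(node_ages):
    """Return (age, lineage_count) steps for a bifurcating chronogram.

    `node_ages` holds the age of every internal node, root included.
    Going forward in time from the root, each node splits one lineage
    into two, so the count after the i-th oldest node is i + 1.
    """
    ordered = sorted(node_ages, reverse=True)  # oldest (root) first
    return [(age, index + 2) for index, age in enumerate(ordered)]


def lineages_at(node_ages, time_before_present):
    """Number of lineages alive at a given time before the present."""
    return 1 + sum(1 for age in node_ages if age > time_before_present)


# Hypothetical node ages for a four-tip tree, in million years.
example_ages = [10.0, 4.0, 6.0]
print(lineages_through_time(example_ages))
print(lineages_at(example_ages, 5.0))
```

Because the count depends only on node ages, uncertainty in the dating of each bootstrap tree translates directly into the spread of curves seen across replicates.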

3.4. Genome Size Evolution. Based on the large size of genomes of Pinus species, we hypothesized that there may be an increase in the rate of genome size evolution (Figure 3). We performed a series of likelihood ratio tests to examine the patterns of rate variation throughout gymnosperms, with a focus on testing for rate variation associated with conifers, Pinaceae, and Pinus (Table 1). In all comparisons in which Pinus (or a group containing Pinus) was compared to another group, the group with Pinus showed significantly elevated rates of genome size evolution (Table 1). We detected no significant shifts in rates of evolution between any groups that did not contain Pinus, and there was no significant difference in rates of evolution between the two Pinus subgenera (Pinus and Strobus; Table 1).

4. Discussion

The analyses of gymnosperm diversification and genome size evolution demonstrate the dynamic evolutionary processes of the extant gymnosperms, which sharply contrast with their reputation as ancient, relictual species. The lineage through time plots are consistent with high, and possibly growing, rates of diversification within the last 100 million years, concurrent with major radiations of angiosperms (e.g., [1, 2, 28]) and extant ferns [29]. There is evidence of numerous significant shifts in diversification within both cycads and conifers, and there is strong evidence for a recent, large increase in the rate of genome size evolution in Pinus. Although Pinus is a species-rich genus, we find no links between increased rates of diversification and shifts in rates of genome size evolution.

Advances in sequencing technology and computational biology over the past decade enable phylogenetic estimates comprising large sections of plant diversity. This study demonstrates that it is possible to construct credible phylogenetic hypotheses including nearly three quarters of the extant gymnosperm species.
Unlike supertree approaches (e.g., [14]), the supermatrix methods easily incorporate branch length estimates and estimates of topological and branch length uncertainty. Still, until there is far more data per taxon, estimates of the gymnosperm phylogeny will continue to improve, and thus, it is important to consider error and uncertainty in phylogenetic estimates when using these trees to infer evolutionary processes.

There are other reasons to interpret this gymnosperm tree with caution. For example, both heterogeneity in the patterns of molecular evolution and missing data can lead to erroneous estimates of trees and branch lengths in ML phylogenetic analyses (e.g., [30, 31]). Furthermore, our analysis does not attempt to incorporate evolutionary processes, such as incomplete lineage sorting, gene duplication and loss, or reticulation, that may cause incongruence between the gene trees and the species phylogeny (e.g., [32]). Although this study used thousands of sequences, it does not incorporate the evolutionary perspective of low-copy nuclear genes. Still, in many cases, evolutionary or ecological analyses that use phylogenetic trees may be robust to topological and branch length error (e.g., [33]), and the large tree of gymnosperms enables sophisticated and comprehensive tests of evolutionary and ecological hypotheses. We demonstrate this with our diversification analysis, the results of which emphasize numerous, independent shifts in diversification rate throughout gymnosperms and apparently recent, high rates of diversification (Figure 2). Estimates of diversification may be affected by taxonomic sampling and inaccurate branch length estimates. However, we might expect that adding the remaining species, which would likely fit near the tips of the tree, would result in increased estimates of recent diversification. Thus, our analyses suggest the intriguing perspective that the extant gymnosperms are a vibrant, growing clade, and not simply the sole survivors of ancient diversity.

Figure 3: Ancestral state reconstruction of genome size (in pg DNA) on a chronogram of 165 gymnosperm taxa (time axis in MYA; labeled groups: non-Pinaceae conifers, conifers + Gnetales, Pinaceae, Gnetales, cycads + Ginkgo, Pinus subgenus Pinus, and Pinus subgenus Strobus). Different genome sizes are represented by different colors, with the ancestral genome sizes estimated with squared change parsimony.

Table 1: Rate estimates from the two rate parameter models from Brownie. Symbols indicate that the single rate model was rejected based on the Chi-squared P value (P < 0.005; P < 0.0005). Significance was also assessed using AIC.

Comparison                      Rates of Genome Evolution
Conifers + Gnetales             .878
Cycads + Ginkgo

Pinaceae                        2.75
Other conifers + Gnetales       0.78

Other conifers + Gnetales       0.78
Cycads + Ginkgo

Pinus
Other Pinaceae                  0.43

Other Pinaceae                  0.43
Other conifers + Gnetales       0.78

Pinus Strobus subgenus          2.66
Pinus Pinus subgenus            3.56

Greater sampling and a more robust tree will provide a more complete view of gymnosperm diversification. With better branch length estimates, it will be possible to use more powerful likelihood-based approaches to identify clades with increasing and decreasing diversification rates [34]. With more complete taxon sampling, it may be possible to identify characters associated with changing speciation and extinction rates ([35], but see [36]).

One of the great challenges of evolutionary genomics is to identify the mechanisms of genome evolution that drive diversification.
Some of the mechanisms that cause changes in genome size, such as whole-genome duplications or activity of retrotransposons, can have implications for diversification rates. Our analysis of the rates of genome size evolution demonstrates that Pinus is unique among gymnosperms. That is, the highly elevated rates of change in genome size appear to be limited to Pinus. However, in gymnosperms, we find no evidence of increases in diversification associated with Pinus, which displays a significantly elevated rate of genome size evolution. Furthermore, we find no obvious evidence for an increase in rates of genome size evolution in clades associated with shifts in diversification. While our analysis failed to link genome size and diversification, this comparative approach for identifying shifts in genome size can inform our search for the specific drivers of the increased genomic complexity in Pinus, and this ultimately can help inform strategies for sequencing and assembling the first Pinus genomes.

Supplementary Materials

The nucleotide and C-value data matrices along with all trees are available on the Dryad data repository.

Acknowledgment

This work was funded by a University of Florida Research Opportunity Fund Seed Grant.

References

[1] S. Magallón and A. Castillo, Angiosperm diversification through time, American Journal of Botany, vol. 96.
[2] C. D. Bell, D. E. Soltis, and P. S. Soltis, The age and diversification of the angiosperms re-revisited, American Journal of Botany, vol. 97, no. 8, 2010.
[3] S. A. Smith, J. M. Beaulieu, A. Stamatakis, and M. J. Donoghue, Understanding angiosperm diversification using small and large phylogenetic trees, American Journal of Botany, vol. 98, no. 3, 2011.
[4] P. D. Cantino, J. A. Doyle, S. W. Graham et al., Towards a phylogenetic nomenclature of Tracheophyta, Taxon, vol. 56, no. 3.
[5] D. Ohri and T. N. Khoshoo, Genome size in gymnosperms, Plant Systematics and Evolution, vol. 53, no. -2, pp. 9 32, 1986.
[6] B. G. Murray, Nuclear DNA amounts in gymnosperms, Annals of Botany, vol. 82, pp. 3 5, 1998.
[7] M. R. Ahuja and D. B. Neale, Evolution of genome size in conifers, Silvae Genetica, vol. 54, no. 3.
[8] T. N. Khoshoo, Polyploidy in gymnosperms, Evolution, vol. 3, 1958.
[9] A. M. Morse, D. G. Peterson, M. N. Islam-Faridi et al., Evolution of genome size and complexity in Pinus, PLoS ONE, vol. 4, no. 2, Article ID e4332.
[10] A. Kovach, J. L. Wegrzyn, G. Parra et al., The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences, BMC Genomics, article 420, 2010.
[11] K. L. Joyner, X.-R. Wang, J. S. Johnston, H. J. Price, and C. G. Williams, DNA content for Asian pines parallels new world relatives, Canadian Journal of Botany, vol. 79, no. 2, 2001.
[12] S. E. Hall, W. S. Dvorak, J. S. Johnston, H. J. Price, and C. G. Williams, Flow cytometric analysis of DNA content for tropical and temperate new world pines, Annals of Botany, vol. 86, no. 6.
[13] I. Wakamiya, R. J. Newton, J. S. Johnston, and H. J. Price, Genome size and environmental factors in the genus Pinus, American Journal of Botany, vol. 80, 1993.
[14] E. Grotkopp, M. Rejmánek, M. J. Sanderson, and T. L. Rost, Evolution of genome size in pines (Pinus) and its life-history correlates: supertree analyses, Evolution, vol. 58, no. 8, 2004.
[15] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, Basic local alignment search tool, Journal of Molecular Biology, vol. 25, no. 3, 1990.
[16] J. D. Thompson, D. G. Higgins, and T. J. Gibson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol. 22, no. 22, 1994.
[17] A. de Queiroz and J. Gatesy, The supermatrix approach to systematics, Trends in Ecology and Evolution, vol. 22, pp. 34 4.
[18] A. Stamatakis, RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models, Bioinformatics, vol. 22, no. 2.
[19] J. Felsenstein, Confidence limits on phylogenies: an approach using the bootstrap, Evolution, vol. 39, no. 4, 1985.
[20] M. J. Sanderson, Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach, Molecular Biology and Evolution, vol. 9, pp. 0 09.
[21] M. J. Sanderson, R8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock, Bioinformatics, vol. 9, no. 2.
[22] H. Won and S. S. Renner, Dating dispersal and radiation in the gymnosperm Gnetum (Gnetales): clock calibration when outgroup relationships are uncertain, Systematic Biology, vol. 55, no. 4.
[23] E. Paradis, J. Claude, and K. Strimmer, APE: analyses of phylogenetics and evolution in R language, Bioinformatics, vol. 20, no. 2.
[24] B. R. Moore, K. M. A. Chan, and M. J. Donoghue, Detecting diversification rate variation in supertrees, in Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, O. R. P. Bininda-Emonds, Ed., Kluwer Academic, Dordrecht, The Netherlands.
[25] N. Bortolussi, E. Durand, M. G. B. Blum, and O. François, apTreeshape: statistical analysis of phylogenetic treeshape, Bioinformatics, vol. 22, no. 3.
[26] M. D. Bennett and I. J. Leitch, Plant DNA C-values database, 2005.
[27] B. C. O'Meara, C. Ané, M. J. Sanderson, and P. C. Wainwright, Testing for different rates of continuous trait evolution using likelihood, Evolution, vol. 60, no. 5.
[28] H. Wang, M. J. Moore, P. S. Soltis et al., Rosid radiation and the rapid rise of angiosperm-dominated forests, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 0.
[29] H. Schneider, E. Schuettpelz, K. M. Pryer, R. Cranfill, S. Magallón, and R. Lupia, Ferns diversified in the shadow of angiosperms, Nature, vol. 428, no. 6982.
[30] B. Kolaczkowski and J. W. Thornton, Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous, Nature, vol. 43, no. 70.
[31] A. R. Lemmon, J. M. Brown, K. Stanger-Hall, and E. M. Lemmon, The effect of ambiguous data on phylogenetic estimates obtained by maximum likelihood and Bayesian inference, Systematic Biology, vol. 58.
[32] W. P. Maddison, Gene trees in species trees, Systematic Biology, vol. 46, no. 3, 1997.
[33] E. A. Stone, Why the phylogenetic regression appears robust to tree misspecification, Systematic Biology, vol. 60, no. 3, 2011.
[34] D. L. Rabosky, LASER: a maximum likelihood toolkit for detecting temporal shifts in diversification rates, Evolutionary Bioinformatics, vol. 2.
[35] W. P. Maddison, P. E. Midford, and S. P. Otto, Estimating a binary character's effect on speciation and extinction, Systematic Biology, vol. 56, no. 5.
[36] D. L. Rabosky, Extinction rates should not be estimated from molecular phylogenies, Evolution, vol. 64, no. 6, 2010.

Hindawi Publishing Corporation
Journal of Botany
Volume 2011, Article ID , 9 pages
doi:10.1155/2011/

Review Article

On the Coevolution of Transposable Elements and Plant Genomes

Peter Civáň,1 Miroslav Švec,1 and Pavol Hauptvogel2

1 Department of Genetics, Faculty of Natural Sciences, Comenius University, Mlynská dolina B, Bratislava 4, Slovakia
2 Plant Production Research Center, Gene Bank of Slovak Republik, Bratislavská cesta 22, Piešťany, Slovakia

Correspondence should be addressed to Peter Civáň, civan@fns.uniba.sk

Received June 2011; Revised 23 August 2011; Accepted 6 September 2011

Academic Editor: Curtis C. Daehler

Copyright 2011 Peter Civáň et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Plant genomes are unique in an intriguing feature: the range of their size variation is unprecedented among living organisms. Although polyploidization contributes to this variability, transposable elements (TEs) seem to play the pivotal role. TEs, often considered intragenomic parasites, not only affect the genome size of the host, but also interact with other genes, disrupting and creating new functions and regulatory networks. Coevolution of plant genomes and TEs has led to tight regulation of TE activity, and growing evidence suggests their relationship became mutualistic. Although the expansions of TEs represent certain costs for the host genomes, they may also bring profits for populations, helping to overcome challenging environmental (biotic/abiotic stress) or genomic (hybridization and allopolyploidization) conditions. In this paper, we discuss the possibility that the possession of inducible TEs may provide a selective advantage for various plant populations.
1. Transpositional Strategies, Distribution, and Regulation of TEs

Transposable elements (TEs) comprise a palette of immensely diverse DNA structures that can be unified by the following definition: they all are (or have been) able to insert themselves (or new copies of themselves) into new locations within the genome. According to their mechanism of transposition, TEs can be classified [1] into class I elements (retroelements), which transpose through an RNA intermediate, and class II elements (DNA transposons), which move only via DNA. The major superfamilies of class I are Ty1-copia and Ty3-gypsy retrotransposons, while class II is represented by TIR (terminal inverted repeat) elements and Helitrons, which are sometimes classified separately [2]. Among both retrotransposons and transposons, nonautonomous forms (e.g., MITEs, SINEs, and LARDs) are quite prevalent [3], utilizing the transpositional machinery of autonomous TEs.

A significant portion of plant genomes is constituted by class I elements (specifically LTR retrotransposons, with direct long terminal repeats at both ends), which replicate in a copy-and-paste manner. In brief (according to [3, 4]), the genomic DNA copy of a retroelement is transcribed into mRNA that enters the cytosol, similarly to standard DNA transcripts. The information of the mRNA is translated, typically creating a structural protein GAG and a polyprotein POL. These protein products associate with other retroelement mRNA copies and pack them into virus-like particles. Within these structures, dimerized mRNA copies are reverse transcribed into cDNA and the whole complex enters the nucleus, where the new cDNA copy integrates at a new site. This mode of transposition, if not suppressed, allows retroelements to massively increase their copy numbers, resulting in a rapid expansion of genome size. Compared to retrotransposons, class II elements are anticipated to have a smaller potential of increasing their copy number.
Owing to their cut-and-paste insertional mechanism mediated by transposases, multiplication arises only when a transposon from a recently replicated genomic region is transposed to a region about to undergo replication [5] or in cases when the excision site is repaired by gene conversion, using the sister chromatid as a template [4]. Nevertheless, short DNA transposons like MITEs (miniature inverted-repeat TEs) can be remarkably effective in increasing their copy numbers each generation [6, 7]. In the genome of Lotus japonicus, the copy number of detected MITEs is similar to, or higher than, the copy numbers of major LTR retrotransposon superfamilies [8], and the overall

contribution to the genome size is smaller only because of the short length of MITEs ( bp). For most of the TEs in plant genomes, a certain balance between TE proliferation and minimal damage to the host has evolved. This balance is widely achieved by epigenetic silencing, an important feature of which is its reversibility [9]. Epigenetic suppression of TEs is realized on different levels, from general transcriptional control by TE-unspecific histone modification and siRNA-directed DNA methylation [0 2] to posttranslational processes, such as the species-specific control of nuclear localization of transposase [3, 4]. Silencing of TEs can be immensely effective. For example, LTR retrotransposons constitute a large part of the Gossypium genome, but almost no transcripts were found in the Gossypium EST database [5]. Interestingly, a variety of LINE-like transcripts have been found in the same EST libraries, suggesting different levels of suppression for particular TE classes. But attenuation of TEs is not irreversible. Disruption of epigenetic silencing patterns and the consequent derepression and proliferation of TEs are thought to be associated with two natural phenomena: interspecific hybridization [6 8] and biotic/abiotic stress [9 23]. TEs are not evenly distributed along chromosomes. DNA transposons are overrepresented in gene-rich or euchromatic regions [7, 24], avoiding exons [7], while class I elements concentrate in gene-poor, heterochromatic regions around centromeres [24 28]. Exceptions to this general pattern are, for example, copia retroelements of maize, which are overrepresented in euchromatin [29], or the FIDEL retrotransposon, which is absent from heterochromatin in Arachis [30]. Authors frequently suppose the existence of as yet unknown mechanisms of region-specific TE targeting.
However, the uneven distributions of class I and class II elements across genomes might be satisfactorily explained by probabilistic principles (as follows), without the need to hypothesize any targeting mechanism. DNA transposons in heterochromatic regions may be less likely to be transposed because the DNA topology is inaccessible to transposases. In contrast, DNA transposons in euchromatic or genic regions are transposed frequently, and owing to the transposition strategy, the excised element is prone to reinsert at a nearby genomic location, which is likely to be euchromatic too (the shorter the distance between the original and the novel position, the higher the probability that the two loci are in the same condition). The passive copies in the heterochromatin are eventually removed from the genome, while only the active copies have a chance to multiply and thus colonize the euchromatin. Transpositions into exons are mostly filtered out by natural selection, so the majority of DNA transposons are observed in introns or in the 5′ and 3′ adjacent regions. The less severe consequences of short TE insertions for splicing might be the reason why MITEs predominate in introns over other elements. The situation is different for LTR retrotransposons. New copies of class I elements are generated outside the nucleus, in the cytosol; thus the probability that a new copy reintegrates in the proximity of the maternal element is extremely low. Without any targeting mechanism, retroelements integrate randomly across the whole genome, and the observed uneven distribution can be attributed to the following factors: (i) the majority of transpositions within or near genes are filtered out by selection because of their usually deleterious effects; and (ii) LTR retrotransposons (or TEs in general) are removed more efficiently from highly recombining regions because the mechanisms of removal are recombination-dependent [3 33].
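The probabilistic argument for retroelements can be illustrated with a toy simulation. The sketch below is not taken from the cited studies: the bin layout, insertion rate, and removal probabilities are hypothetical values chosen only to show the principle. Copies integrate uniformly at random, and the only asymmetry is factor (ii), a higher per-generation removal probability in highly recombining (euchromatic) bins. Factor (i), selection against insertions near genes, is deliberately left out.

```python
import random

def simulate_retroelement_distribution(generations=500, inserts_per_gen=20,
                                       n_bins=100, hetero_bins=range(40, 60),
                                       removal_eu=0.05, removal_het=0.005,
                                       seed=0):
    """Toy model: untargeted insertion plus recombination-dependent removal.

    All parameter values are illustrative assumptions, not measured rates.
    Returns the fraction of surviving copies located in heterochromatic bins.
    """
    rng = random.Random(seed)
    hetero = set(hetero_bins)
    copies = []  # one entry (bin index) per retroelement copy
    for _ in range(generations):
        # Untargeted integration: every bin is equally likely to be hit.
        copies.extend(rng.randrange(n_bins) for _ in range(inserts_per_gen))
        # Removal is assumed to be rarer in poorly recombining heterochromatin.
        copies = [b for b in copies
                  if rng.random() >= (removal_het if b in hetero else removal_eu)]
    in_het = sum(1 for b in copies if b in hetero)
    return in_het / len(copies)

fraction = simulate_retroelement_distribution()
print(f"Heterochromatin holds 20% of bins but {fraction:.0%} of surviving copies")
```

In this sketch the heterochromatic bins end up holding far more than their 20% share of copies even though insertion is uniform, so an uneven distribution alone does not require a targeting mechanism.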
This hypothesis is supported by the finding that Sorghum LTR retrotransposons younger than 0,000 years appear to be randomly distributed along chromosomes [34]. Distribution patterns of TEs are therefore likely to be a function of the transpositional strategy and age of the individual TE family, affected by methylation [35] and some genomic particularities of the host species (e.g., gene density and recombination landscapes).

2. Effects of Transposable Elements on Plant Genomes

2.1. Genome Size. The most obvious effect of TEs in plants is their impact on genome size. The TE fraction can be as low as 5% in small plant genomes and as high as >70% in large plant genomes (Table ), while the number of genes remains roughly constant [36] (of course, the latter does not hold true for polyploids). The correlation between the proportion of TEs (specifically LTR retrotransposons) and the physical length of the genome is so evident in the examined plant species [36, 37] that genome size can generally be regarded as a linear function of TE content, and the dynamics of LTR retrotransposons as the major contributor to C-value differences among plants. To estimate the typical TE contribution to the length of a plant genome, one first needs a factual idea of a typical plant C-value. For this purpose, Tenaillon et al. [37] use an arithmetic mean of the C-values provided by Bennett and Leitch [38]. Their Plant DNA C-value database currently comprises 6,287 and 204 entries for angiosperms and gymnosperms, respectively, mostly based on Feulgen microdensitometry and flow cytometry data. Although this dataset probably cannot be considered a representative sample of all land plants (e.g., monocots are overrepresented), as the most comprehensive one it still provides informative insights into plant genome size variation. The average genome size (C) of angiosperms (flowering plants) and gymnosperms sampled in this database is Gbp and 8.57 Gbp, respectively.
However, the distribution of C-values, especially in angiosperms, shows strong positive skewness (Figure ), with the data unevenly distributed around the mean (72.3% of the examined angiosperm species have genomes smaller than the mean value of the dataset). For such distributions, the median is a better indication of central tendency than the arithmetic mean. The median values of the angiosperm and gymnosperm genome size data sampled in the Plant DNA C-value database are 2.40 Gbp (.46 Gbp for eudicots; Gbp for monocots) and Gbp, respectively. In other words, half of the examined flowering plants have genome sizes below 2.4 Gbp. Surprisingly, the interval Gbp represents here the modal category for angiosperm C-values, comprising 8.38% of examined species (Figure (b)). Hence, Oryza sativa, Lotus japonicus, Medicago truncatula, Vitis

Table : Known TE proportions in plant genomes. When various authors provide different values, the range is given. Inconsistent estimates of TE content result from incomplete genomic representation and/or varying bioinformatical approaches. Examination of complete genomes and application of the same TE discovery pipeline is therefore essential for comparative analyses on the intra- and interspecies level. No data on TE content are available for gymnosperms, and angiosperms with large genomes (>2.4 Gbp) are underrepresented. In most genomes, Helitrons were not surveyed appropriately.

Species Predominant fertilization Genome size (Gbp) TE content (%) DNA transposon content (%) Retrotransposon content (%)
Arabidopsis thaliana S Arabidopsis lyrata C ns ns Brachypodium distachyon S Carica papaya C ns ns Cucumis melo C Oryza sativa S ns 7 Lotus japonicus S
Genomic fraction examined Complete/draft genome Complete/draft genome Complete/draft genome Complete/draft genome BAC clones (6.7 Mbp) Complete chr. 0; complete/draft genome TAC clones (32.4 Mbp); 2/3 genome BAC clones Medicago truncatula S ns 9.6 (0.233 Mbp) Complete/draft Populus trichocarpa C genome Vitis vinifera S Complete/draft genome Brassica oleracea C Complete/draft genome Sorghum bicolor S /7 55/42 Complete/draft genome Complete/draft Glycine max S genome (3.42 Mbp) Small insert Gossypium herbaceum S <0. 52 genomic library (3.42 Mbp) Zea mays C Complete/draft genome Small insert Aegilops tauschii S genomic library (2.9 Mbp) Hordeum vulgare S ns ns BAC clones (0.44 Mbp) Secale cereale C BAC clones (2.033 Mbp)
References [2, 25, 32] [2, 73] [74] [75] [76] [24, 39] [8, 77] [40, 78] [79] [4, 80, 8] [32] [34, 82] [38, 83] [84] [29] [38, 85] [86] [38, 87]

S: self-pollination; C: cross-pollination; ns: not specified; only full-length LTR retrotransposons analysed; only TEs with homology to known repeat elements considered.
vinifera, Brassica oleracea, and Populus trichocarpa have typical genome lengths (Table ); it can therefore be inferred that transposable elements typically constitute one-third of angiosperm genomes. Genome size changes are not unidirectional. TE fragments and solo LTRs have been found to constitute a significant fraction of repetitive elements (e.g., [39 43]) and are believed to be remnants of removed TEs [3]. There is evidence that transpositional bursts can be followed by DNA loss [44], and it has been reported that the removal of LTR retrotransposons can proceed with different efficiency in distinct species [33, 40, 42]. However, it should be noted that the estimates of a retroelement's half-life assume constant removal rates for repetitive sequences and rely on the

molecular clock principle; therefore, the revealed interspecies differences in TE survival should be interpreted cautiously. Disregarding the mutational effects of TEs at this point, a dramatic increase of the noncoding or repetitive fraction of the genome theoretically raises the nutritional and time requirements for DNA replication and maintenance in each cell (leading to putative costs formulated as the large genome constraint hypothesis [45]).

Figure : Frequency distribution for spermatophyta C-values. The dataset consists of all angiosperm (6,287) and gymnosperm (204) entries in the Plant DNA C-value database [38] (retrieved 27/05/20). (a) Comparison of angiosperm and gymnosperm C-value frequency distributions. Genomes of spermatophyta were split into categories with Gbp intervals (X-axis: genome size in Gbp); the proportion (percent) of each category (vertical axis: frequency in %) is shown separately for the two groups. Interestingly, angiosperms (mean = Gbp, s x = , median = 2.40 Gbp) and gymnosperms (mean = 8.57 Gbp, s x = , median = Gbp) do not display similar distributions. Despite the extreme range of C-values, 95% of flowering plants fall within 0 22 Gbp. The range of gymnosperm C-values is smaller; however, the overall genome size is higher, with 95% of species falling within 7 33 Gbp. (b) Higher resolution of the angiosperm C-value frequency distribution. The X-axis shows genome size categories with 0.2 Gbp increments and the vertical axis indicates the proportion of each category (percent). Mean and median values are indicated by the red and green line, respectively. The mean of the first half of the histogram (left of the median) is .049 Gbp; the mean of the second half is 0x larger (not indicated). Hence, the average small (<2.4 Gbp) and large (>2.4 Gbp) angiosperm genomes are Gbp and 0 Gbp long, respectively. The modal interval ( Gbp) includes 8.38% of flowering plants.
The higher nutritional and time demands may lead to decreased fecundity and prolonged generation time, the two main constituents affecting selective advantage [46]. While such changes are clearly disadvantageous in populations of unicellular organisms, growing genome size does not seem to impose an evident and unambiguous selective disadvantage in spermatophyta (seed plants) [45, 47]. One intriguing example: phosphorus (P) is known to be the limiting nutrient in most soils (reviewed in [48]). As it is one of the basic components of nucleic acids, plants with significantly larger genomes have higher P demands for DNA synthesis, leaving smaller amounts for other essential cell processes (e.g., those dependent on ATP or phospholipids). Therefore, it seems natural to anticipate that plants with large C-values should lose the ecological competition against plants whose genomes are severalfold smaller. Despite that, genome sizes in examined land plants range 2056-fold [49], with Genlisea margaretae (C = 63.4 Mbp = pg) [50] and Trillium hagae (C = 32.5 pg) [5] at the opposite poles. Relatively large, almost 22-fold differences in C-values have also been reported for members of a single genus (Eleocharis) at the same ploidy level [52]. If we reject the unlikely possibility that such enormous genome size variation in plants is attributable to stochastic processes only, the question arises as to why some plants maintain relatively small genomes (Figure ) while others sink into genomic obesity [53]. In theory, there are two possible causes that may allow transposable elements, and thus genomes, to expand extensively: (i) a deficiency in the mechanisms of suppression and/or removal of TEs from the genome and (ii) a selective advantage that favours individuals and/or populations with high TE activity.
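Because the C-values above are quoted partly in Mbp and partly in picograms, a small conversion helper makes the scale of the range easier to follow. The sketch below assumes the commonly used factor of 1 pg of DNA ≈ 978 Mbp, which is not stated in the text, and takes the smallest genome (63.4 Mbp) and the fold range (2056) as printed; the function names are hypothetical.

```python
MBP_PER_PG = 978.0  # assumed conversion factor: 1 pg of DNA ~ 978 Mbp

def pg_to_mbp(picograms):
    """Convert a DNA mass in picograms to a length in megabase pairs."""
    return picograms * MBP_PER_PG

def mbp_to_pg(megabase_pairs):
    """Convert a length in megabase pairs to a DNA mass in picograms."""
    return megabase_pairs / MBP_PER_PG

smallest_mbp = 63.4   # Genlisea margaretae, as quoted in the text
fold_range = 2056     # reported range of land plant genome sizes

# Size of the largest genome implied by the smallest value and the fold range.
largest_mbp = smallest_mbp * fold_range
print(f"smallest: {mbp_to_pg(smallest_mbp):.4f} pg")
print(f"largest:  {largest_mbp / 1000:.1f} Gbp = {mbp_to_pg(largest_mbp):.1f} pg")
```

Under these assumptions the upper end of the range works out to roughly 130 Gbp, which gives a sense of how much additional DNA, and hence phosphorus, the largest genomes require per cell.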
In relation to the former possibility (i), Weil and Martienssen [9] compare the interaction of TEs and host genomes to resistance to pathogens, and hypothesize that as transposons evolve ways around host silencing, host organisms evolve new genes for silencing, perhaps through duplication and subfunctionalization. This idea is supported by the discovery of positive selection acting on some LTR retrotransposons in the rice genome [54]. On rare occasions, mutant variants of retroelements can escape host recognition and rapidly amplify, leading to what is commonly observed as bursts of amplification. Consequently, the detected variability in TE activity in time and taxa can be attributed to different phases of the host-parasite interaction. The occasional escape of TEs from the host suppression via random mutations and positive selection seems plausible; however, such events should exhibit roughly the same periodicity in all genomes, assuming similar substitutional rates of repetitive DNA among species. Therefore, this red queen race in itself can hardly explain the enormous differences in plant genome sizes. Moreover, the host-pathogen analogy also implies that the species with below-average performance in TE

suppression are evolutionarily disadvantaged relative to those species that have successfully prevented transpositional bursts of parasitic retroelements. Comparing plant species with small and large genomes, we lack any direct evidence for such a generalization. The above-mentioned conflicts direct attention to the latter alternative (ii), which suggests that plant genome size variation is caused by a differential selective advantage of TE possession acting in distinct species or populations (i.e., the presence and activity of transposable elements might be beneficial in some species/populations while detrimental in others). Testability of this hypothesis depends on identifying the nature of such an advantage; therefore, other effects of TEs on plant genomes need to be carefully considered (see below).

Mutability. The discovery of transposable elements was accompanied by the observation of their mutability [55], which, unlike growing genome size, can often have a phenotypic manifestation. Transpositions into the coding regions of genes are usually deleterious; however, those transpositional events that pass through the sieve of selection can induce a variety of genetic changes, including interrupting host genes, creating different expression forms, changing intron length, and affecting expression levels of adjacent genes [56]. Downregulation of genes may be caused not only by TE insertions disrupting promoters but, more likely, by siRNA-guided DNA methylation, which is primarily directed at suppressing TE activity but affects the expression of nearby genes too [35]. Whole-genome differences in TE-siRNA interactions have such dramatic effects on expression patterns that they may contribute to speciation [2].
Among recently reported TE-induced mutations are cluster-shaped somatic variation in grapevine caused by the insertion of the Hatvine-rrm DNA transposon in the VvTFLA gene promoter [57]; a flower color gene mutation caused by TgmExpress transposition into intron 2 of the F3H gene in soybean [58]; and transposon-induced DNA methylation of the CmWIP promoter leading to sex determination in melon [59]. Comprehensive and genome-wide analyses of TE mutability have been accomplished on the fully sequenced genome of rice. According to [39], while LTR retrotransposons constitute 7% of the rice genome, 22% of these sequences lie within putative or established rice genes. Within the genic regions, predominantly fragmented elements have been identified, and full-length elements are rare [39]. Available genomic sequences of the two rice subspecies japonica and indica provide a powerful resource for comparative and functional genomic analyses, which has been utilized by Huang et al. [56] to study transposon insertion polymorphisms (TIPs). Interestingly, more than 0% of TIPs between the Nipponbare and 93- rice cultivars were located in expressed gene regions. Roughly half of those TIPs occurred in introns, often resulting in alternative splicing, and more than a third were found in [, 250] regions, relative to the transcription start site. Effects of TE insertions within promoters are particularly impressive in the case of two genes, causing 8-fold upregulation and 23-fold downregulation of gene expression [56]. In another study [7], high-throughput sequencing was utilized to determine ,664 insertion sites of the mPing transposon in a population of 24 rice plants. Subsequent comparative microarray analysis concluded that the vast majority of TE insertions either have no impact or preferentially enhance transcription under normal conditions. However, seven out of ten loci that were unaffected by mPing insertion under normal conditions were inducible by salt and cold stress.
Scanning mPing sequences for cis-acting plant regulatory elements resulted in the identification of 96 putative regulatory motifs, one-third of which were stress responsive. These experiments demonstrate that the mPing transposon, resembling a mobile gene enhancer, provides new binding sites for transcription factors or other regulatory proteins, and may actually benefit the host by creating potentially useful allelic variants and novel, stress-inducible regulatory networks [7]. Among the most intriguing features of transposable elements is the ability of certain classes to capture gene fragments. The potential to contribute to gene evolution by combining genes, exons, and introns into novel functional units is most apparent in Helitrons. Although these elements were initially challenging to identify due to the absence of typical TE structural features [60, 6], an effective structure-based program has been developed recently, leading to the detection of thousands of Helitrons in several plant genomes [6]. The frequency of gene capture is particularly striking in the genome of maize [62, 63]. For example, Morgante et al. [62] randomly selected nine genic insertions polymorphic in maize inbred lines and demonstrated that eight of them are nonautonomous Helitrons, each containing between one and seven different fragments of host genes. Yang and Bennetzen [63] have shown on a genome-wide scale that 60% of maize Helitrons contain captured fragments of nuclear genes, 4% of which are under purifying selection, while another 4% exhibit apparent adaptive selection, which suggests beneficial effects for the host or for Helitron transposition/retention.
Although the vast majority of the genes captured by Helitrons are incomplete, defective copies of conserved functional genes (including exons and also introns), a fraction of those gene fragments may serve as templates for interfering RNAs or for new gene functions via exon shuffling, expanding the repertoire of mutational changes provided by TEs. The last significant mutational effect of TEs to be mentioned is macrotransposition, that is, a transposition involving two physically close, interacting elements and an intervening chromosomal segment. Such transposon pairs may produce other complex rearrangements, including deletions, inversions, and reshuffling of the intertransposon segment [64]; thus macrotransposition can be another contributor to genome divergence and speciation.

3. From Individuals to Populations: From Junk to Treasure

In the study of the coevolution of TEs and plant genomes, two different populations affected by inconsistent selection forces should be recognized: the population of transposable

elements within the genomic niche of an individual, and the population of diverse individuals within an ecological niche. In this section, the term population refers to the latter. Since the early work of Barbara McClintock, the presence and action of transposable elements in host genomes has been studied from the perspective of an individual. Because the mutational consequences of TE activity on genes and their expression patterns (see above) are undirected by the host and principally random, it has become apparent that uncontrolled TE transposition or expansion is, in the vast majority of cases, neutral, detrimental, or lethal for an individual. The absence of evident benefits of the possession of transposable elements for an individual led to TEs being regarded as an archetype of selfish or parasitic DNA, whose only functional aim is to reproduce itself, regardless of the effect on the host genome. The large genomic regions occupied by ancient, suppressed, or still active transposable elements have acquired the label "junk DNA", an unnecessary burden for the cell and the organism. The individual-based viewpoint in TE research was needed, because a fundamental description of TE diversity, prevalence, and modus operandi was required, and it was also understandable, because the technical means to study TE dynamics on the population level were unavailable until recently. However, evolution acts on populations, not on individuals. Some recent studies have drawn attention to population processes related to TEs [37, 65 67], and it is becoming clearer that to answer the questions about the origin, evolution, function, and importance of transposable elements in plant genomes, it is necessary to move the research focus from individuals to populations.
Environmental stresses are known initiators of TE activity [20 23, 68], and diverse effects of transpositional events on the expression of adjacent genes have been reported [7, 35, 56, 57, 59, 68]. Although the mutational impact of TE bursts is likely to be detrimental for an individual, TE activity creates new variability in the population, providing raw material for selection. An illustrative example of how effective TEs can be in generating genetic diversity is provided by the activity of the mPing DNA transposon in some rice strains. With roughly 40 new transpositions of mPing per plant per generation, even small populations contain thousands of new insertions, a large portion of which upregulate genes in their vicinity under stress conditions [6, 7]. Hence, TE activity may actually help the population to overcome changing environmental conditions and adapt to new ecological settings. Under diversifying selection, this capacity for quick adaptation is likely to outweigh the costs of decreased fitness of some individuals, or the possible large genome constraint. From this perspective, the escape of TEs from the silenced state resembles a regulated response to cope with stress on the population level rather than an undesired side effect of stress exposure, an idea initially hypothesized by McClintock [55] as the response to genomic shock. Possession of a mechanism that can boost evolutionary change and be switched on and off depending on the situation might be the decisive factor for the survival or extinction of a population in changing environments. It is tempting to hypothesize that transposable elements might represent such a tool and were actually invented, or at least modified, by eukaryotes to fulfil this function. And if plants possess and use an autonomous mechanism to control TE proliferation, it means they also have basic control over their own genome size.
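The mPing arithmetic above is easy to make explicit. The following sketch uses the quoted rate of roughly 40 new transpositions per plant per generation; the population sizes are hypothetical, and the count is of insertion events only (it ignores loss of insertions and repeated hits at the same site).

```python
def expected_new_insertions(plants, per_plant_per_generation=40, generations=1):
    """Expected number of new insertion events in a population.

    The default rate of 40 follows the figure quoted in the text; the
    population sizes passed in are illustrative assumptions.
    """
    return plants * per_plant_per_generation * generations

for plants in (25, 50, 100):
    print(plants, "plants:", expected_new_insertions(plants),
          "new insertions per generation")
```

Even a population of 50 plants would, on these assumptions, carry about 2000 new insertions after a single generation, which is the sense in which small populations "contain thousands of new insertions".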
The hypothesis of transposable elements as intrinsic tools for increasing genetic variability has some testable implications (i)–(iii). For example, (i) TE-driven stimulation of variability could be especially beneficial for species, populations, or genomic regions exposed to strong diversifying selection (e.g., host-pathogen systems). Such entities would therefore be expected to possess more transposable elements, and their TE dynamics would be expected to be more responsive to stress conditions. Interestingly, Nielen et al. [30] have found the LTR retrotransposon FIDEL associated with conserved Arachis genes less frequently than expected by chance, but its presence close to fast-evolving NBS genes (resistance gene analogues) was in agreement with a random distribution. (ii) Asexually reproducing organisms and self-pollinating plants, which lack the opportunity to recombine their genetic material, might profit from the enhanced diversity sustained by TEs, as suggested for rice by Naito et al. [7]. However, the outcrosser Arabidopsis lyrata shows 2-3 times higher TE content than the selfer A. thaliana (Table ). The relationship between the mating system and TE dynamics for these two relatives has been studied [68, 69], and among the possible causes are differences in effective population size and related stochastic processes, or more deleterious consequences of the accumulation of recessive mutations in self-pollinating plants. Self-pollinators might also be more efficient in suppressing TEs owing to more rapid fixation of epigenetic silencing patterns. On the other hand, Bestor [70] assumes that the aggressiveness of transposons in self-fertilizing sexuals is self-limited compared to outcrossing sexuals, where transposon fixation is nearly certain provided that the coefficient of selection imposed by the transposon is less than 0.5 when there is one or more transposition events per generation.
This implies that self-pollinators are not expected to have higher transposon content than related outcrossers, unless the TEs provide some net benefit to the host. While Triticeae seems to be an interesting tribe for such comparisons (Table ), more comprehensive data on TE content are required. (iii) Taxa adapted to environmentally stable niches, such as ocean depths or high altitudes, would be expected to contain significantly fewer TEs compared to organisms from more unstable environments, which have supposedly gone through multiple bursts of TE activity. An excellent opportunity to study these expectations is provided by three diploid sunflower species. Helianthus anomalus, H. deserticola, and H. paradoxus are independently derived via hybridization events between the same two parental taxa, H. annuus and H. petiolaris. The three hybrid taxa underwent a rapid, retrotransposon-mediated genome expansion [7, 72], and all of them occupy habitats considered abiotically extreme relative to either parental species. H. anomalus and H. deserticola inhabit arid desert-like environments whereas H. paradoxus occurs exclusively in saline environments.

Interestingly, the scale of copy number increase for copia LTR retrotransposons differs considerably among the three sunflower hybrid species, with a 3.7-fold increase in copy number in the genome of H. paradoxus (relative to the average parental species value) versus a lower .7-fold and 2.2-fold increase for H. anomalus and H. deserticola, respectively [72]. The copy number increase of gypsy LTR retrotransposons is even more stunning, with 5.6- to fold multiplication in the hybrid taxa, compared to parental populations (which did not differ significantly) [7]. Ungerer et al. [7] suggest that hybridization, abiotic stress, or both may have been involved in this extensive retrotransposon proliferation. In either case, it is tempting to hypothesize that this is an example of TE proliferation being switched on by the host regulatory mechanisms (possibly coevolving with the TEs) as an action to elicit mutational consequences potentially helpful in adapting to new environments.

4. Conclusion

Genome-wide and population-based examinations of similar projections have the potential to illuminate the role of transposable elements in speciation, adaptation, and the evolution of plant genomes in general. In addition, a detailed knowledge of the regulation of TE activity is required to understand the coevolution of TEs and plant genomes, and the extent of the consequent benefits that plants can exploit. During the past decades, the notion of TEs has shifted from autotelic junk to a valued tool of evolutionary response. In the light of new evidence, terms like junk or selfish DNA, and even host genome and defence mechanisms for TE suppression, are becoming more misleading than ever. At present, comparative and functional genomic studies targeted at TE population dynamics and TE-cell interactions, supported by high-throughput technologies, are on the way to completing this paradigm shift.
Acknowledgments

The authors would like to thank two anonymous reviewers for helpful suggestions and constructive critique. This paper was supported by the Slovak Research and Development Agency under the Contracts no. APVV and no. APVV

References

[] T. Wicker, F. Sabot, A. Hua-Van et al., A unified classification system for eukaryotic transposable elements, Nature Reviews Genetics, vol. 8, no. 2, pp ,
[2] V. V. Kapitonov and J. Jurka, Helitrons on a roll: eukaryotic rolling-circle transposons, Trends in Genetics, vol. 23, no. 0, pp ,
[3] F. Sabot and A. H. Schulman, Genomics of transposable elements in the Triticeae, in Genetics and Genomics of the Triticeae, C. Feuillet and G. J. Muehlbauer, Eds., pp ,
[4] M. Labrador and V. G. Corces, Transposable element-host interactions: regulation of insertion and excision, Annual Review of Genetics, vol. 3, pp , 997.
[5] P. J. Russell, iGenetics, Benjamin/Cummings, San Francisco, Calif, USA, st edition,
[6] K. Naito, E. Cho, G. Yang et al., Dramatic amplification of a rice transposable element during recent domestication, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 47, pp ,
[7] K. Naito, F. Zhang, T. Tsukiyama et al., Unexpected consequences of a sudden and massive transposon amplification on rice gene expression, Nature, vol. 46, no. 7267, pp ,
[8] S. Sato, Y. Nakamura, T. Kaneko et al., Genome structure of the legume, Lotus japonicus, DNA Research, vol. 5, no. 4, pp ,
[9] C. Weil and R. Martienssen, Epigenetic interactions between transposons and genes: lessons from plants, Current Opinion in Genetics and Development, vol. 8, no. 2, pp ,
[0] R. K. Tran, D. Zilberman, C. de Bustos et al., Chromatin and siRNA pathways cooperate to maintain DNA methylation of small transposable elements in Arabidopsis, Genome Biology, vol. 6, no., article R90,
[] F. J. Qin, Q. W. Sun, L. M. Huang, X. S. Chen, and D. X. Zhou, Rice SUVH histone methyltransferase genes display specific functions in chromatin modification and retrotransposon repression, Molecular Plant, vol. 3, no. 4, pp , 200.
[2] J. D. Hollister, L. M. Smith, Y. L. Guo, F. Ott, D. Weigel, and B. S. Gaut, Transposable elements and small RNAs contribute to gene expression divergence between Arabidopsis thaliana and Arabidopsis lyrata, Proceedings of the National Academy of Sciences of the United States of America, vol. 08, no. 6, pp , 20.
[3] T. Uchiyama, Y. Saito, H. Kuwabara et al., Multiple regulatory mechanisms influence the activity of the transposon, Tam3, of Antirrhinum, New Phytologist, vol. 79, no. 2, pp ,
[4] K. Fujino, S. N. Hashida, T. Ogawa et al., Temperature controls nuclear import of Tam3 transposase in Antirrhinum, Plant Journal, vol. 65, no., pp , 20.
[5] G. Hu, J. S. Hawkins, C. E. Grover, and J. F. Wendel, The history and disposition of transposable elements in polyploid Gossypium, Genome, vol. 53, no. 8, pp , 200.
[6] B. Liu and J. F. Wendel, Retrotransposon activation followed by rapid repression in introgressed rice plants, Genome, vol. 43, no. 5, pp ,
[7] M. Petit, C. Guidat, J. Daniel et al., Mobilization of retrotransposons in synthetic allotetraploid tobacco, New Phytologist, vol. 86, no., pp , 200.
[8] X. Shan, Z. Liu, Z. Dong et al., Mobilization of the active MITE transposons mPing and Pong in rice by introgression from wild rice (Zizania latifolia Griseb.), Molecular Biology and Evolution, vol. 22, no. 4, pp ,
[9] H. Ito, H. Gaubert, E. Bucher, M. Mirouze, I. Vaillant, and J. Paszkowski, An siRNA pathway prevents transgenerational retrotransposition in plants subjected to stress, Nature, vol. 472, no. 734, pp. 5 20, 20.
[20] S. Pouteau, M. A. Grandbastien, and M. Boccar, Microbial elicitors of plant defense responses activate transcription of a retrotransposon, Plant Journal, vol. 5, no. 4, pp , 994.
[2] S. R. Wessler, Plant retrotransposons: turned on by stress, Current Biology, vol. 6, no. 8, pp , 996.
[22] M. A. Grandbastien, Activation of plant retrotransposons under stress conditions, Trends in Plant Science, vol. 3, no. 5, pp. 8 87, 998.
[23] P. Capy, G. Gasperi, C. Biémont, and C. Bazin, Stress and transposable elements: co-evolution or useful parasites? Heredity, vol. 85, no. 2, pp. 0 06,
[24] T. Sasaki, The map-based sequence of the rice genome, Nature, vol. 436, no. 7052, pp ,
[25] S. Kaul, H. L. Koo, J. Jenkins et al., Analysis of the genome sequence of the flowering plant Arabidopsis thaliana, Nature, vol. 408, no. 684, pp ,
[26] B. D. Peterson-Burch, D. Nettleton, and D. F. Voytas, Genomic neighborhoods for Arabidopsis retrotransposons: a role for targeted integration in the distribution of the Metaviridae, Genome Biology, vol. 5, no. 0, article R78,
[27] J. Du, Z. Tian, N. J. Bowen, J. Schmutz, R. C. Shoemaker, and J. Ma, Bifurcation and enhancement of autonomous-nonautonomous retrotransposon partnership through LTR swapping in soybean, Plant Cell, vol. 22, no., pp. 48 6, 200.
[28] C. Staginnus, C. Desel, T. Schmidt, and G. Kahl, Assembling a puzzle of dispersed retrotransposable sequences in the genome of chickpea (Cicer arietinum L.), Genome, vol. 53, no. 2, pp , 200.
[29] P. S. Schnable, D. Ware, R. S. Fulton et al., The B73 maize genome: complexity, diversity, and dynamics, Science, vol. 326, no. 5956, pp. 2 5,
[30] S. Nielen, F. Campos-Fonseca, S. Leal-Bertioli et al., FIDEL – a retrovirus-like retrotransposon and its distinct evolutionary histories in the A- and B-genome components of cultivated peanut, Chromosome Research, vol. 8, no. 2, pp , 200.
[3] K. M. Devos, J. K. M. Brown, and J. L. Bennetzen, Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis, Genome Research, vol. 2, no. 7, pp ,
[32] X. Zhang and S. R. Wessler, Genome-wide comparative analysis of the transposable elements in the related species Arabidopsis thaliana and Brassica oleracea, Proceedings of the National Academy of Sciences of the United States of America, vol. 0, no. 5, pp ,
[33] C. Vitte and J. L.
Bennetzen, Analysis of retrotransposon structural diversity uncovers properties and propensities in angiosperm genome evolution, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 47, pp , [34] A. H. Paterson, J. E. Bowers, R. Bruggmann et al., The Sorghum bicolor genome and the diversification of grasses, Nature, vol. 457, no. 7229, pp , [35] J. D. Hollister and B. S. Gaut, Epigenetic silencing of transposable elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression, Genome Research, vol. 9, no. 8, pp , [36] K. M. Devos, Grass genome organization and evolution, Current Opinion in Plant Biology, vol. 3, no. 2, pp , 200. [37] M. I. Tenaillon, J. D. Hollister, and B. S. Gaut, A triptych of the evolution of plant transposable elements, Trends in Plant Science, vol. 5, no. 8, pp , 200. [38] M. D. Bennett and I. J. Leitch, Plant DNA C-values Database, release 5.0, December 200, [39] L. Gao, E. M. McCarthy, E. W. Ganko, and J. F. McDonald, Evolutionary history of Oryza sativa LTR retrotransposons: a preliminary survey of the rice genome sequences, BMC Genomics, vol. 5, article 8, [40] H. Wang and J.-S. Liu, LTR retrotransposon landscape in Medicago trunculata: more rapid removal than in rice, BMC Genomics, vol. 9, article 382, [4] A. Benjak, A. Forneck, and J. M. Casacuberta, Genome-wide analysis of the cut-and-paste transposons of grapevine, PLoS ONE, vol. 3, no. 9, Article ID e307, [42] T. Wicker and B. Keller, Genome-wide comparative analysis of copia retrotransposons in Triticeae, rice, and Arabidopsis reveals conserved ancient evolutionary lineages and distinct dynamics of individual copia families, Genome Research, vol. 7, no. 7, pp , [43] M. Charles, H. Belcram, J. Just et al., Dynamics and differential proliferation of transposable elements during the evolution of the B and A genomes of wheat, Genetics, vol. 80, no. 2, pp , [44] C. Vitte, O. Panaud, and H. 
Quesneville, LTR retrotransposons in rice (Oryza sativa, L.): recent burst amplifications followed by rapid DNA loss, BMC Genomics, vol. 8, article 28, [45] C. A. Knight, N. A. Molinari, and D. A. Petrov, The large genome constraint hypothesis: evolution, ecology and phenotype, Annals of Botany, vol. 95, no., pp , [46] L. M. Wahl and C. S. DeHaan, Fixation probability favors increased fecundity over reduced generation time, Genetics, vol. 68, no. 2, pp , [47] C. A. Knight and J. M. Beaulieu, Genome size scaling through phenotype space, Annals of Botany, vol. 0, no. 6, pp , [48] C. P. Vance, C. Uhde-Stone, and D. L. Allan, Phosphorus acquisition and use: critical adaptations by plants for securing a nonrenewable resource, New Phytologist, vol. 57, no. 3, pp , [49] I.J.Leitch,J.M.Beaulieu,M.W.Chase,A.R.Leitch,andM. F. Fay, Genome size dynamics and evolution in monocots, Journal of Botany, vol. 200, Article ID 86256, 200. [50] J. Greilhuber, T. Borsch, K. Müller, A. Worberg, S. Porembski, and W. Barthlott, Smallest angiosperm genomes found in Lentibulariaceae, with chromosomes of bacterial size, Plant Biology, vol. 8, no. 6, pp , [5] B. J. M. Zonneveld, New record holders for maximum genome size in eudicots and monocots, Journal of Botany, vol. 200, Article ID , 200. [52] F. Zedek, J. Šmerda, P. Šmarda, and P. Bureš, Correlated evolution of LTR retrotransposons and genome size in the genus eleocharis, BMC Plant Biology, vol. 0, article 265, 200. [53] J. L. Bennetzen and E. A. Kellogg, Do plants have a one-way ticket to genomic obesity? Plant Cell, vol. 9, no. 9, pp , 997. [54] R. S. Baucom, J. C. Estill, J. Leebens-Mack, and J. L. Bennetzen, Natural selection on gene function drives the evolution of LTR retrotransposon families in the rice genome, Genome Research, vol. 9, no. 2, pp , [55] B. McClintock, The significance of responses of the genome to challenge, Science, vol. 226, no. 4676, pp , 984. [56] X. Huang, G. Lu, Q. Zhao, X. Liu, and B. 
Han, Genomewide analysis of transposon insertion polymorphisms reveals intraspecific variation in cultivated rice, Plant Physiology, vol. 48, no., pp , [57] L. Fernandez, L. Torregrosa, V. Segura, A. Bouquet, and J. M. Martinez-Zapater, Transposon-induced gene activation as a mechanism generating cluster shape somatic variation in grapevine, Plant Journal, vol. 6, no. 4, pp , 200.

28 Journal of Botany 9 [58] G. Zabala and L. Vodkin, Novel exon combinations generated by alternative splicing of gene fragments mobilized by a CACTA transposon in Glycine max, BMC Plant Biology, vol. 7, article 38, [59] A. Martin, C. Troadec, A. Boualem et al., A transposoninduced epigenetic change leads to sex determination in melon, Nature, vol. 46, no. 7267, pp , [60] C.Du,J.Caronna,L.He,andH.K.Dooner, Computational prediction and molecular confirmation of Helitron transposons in the maize genome, BMC Genomics, vol. 9, article 5, [6] L. Yang and J. L. Bennetzen, Structure-based discovery and description of plant and animal Helitrons, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 3, pp , [62] M. Morgante, S. Brunner, G. Pea, K. Fengler, A. Zuccolo, and A. Rafalski, Gene duplication and exon shuffling by helitronlike transposons generate intraspecies diversity in maize, Nature Genetics, vol. 37, no. 9, pp , [63] L. Yang and J. L. Bennetzen, Distribution, diversity, evolution, and survival of Helitrons in the maize genome, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 47, pp , [64] J. T. Huang and H. K. Dooner, Macrotransposition and other complex chromosomal restructuring in maize by closely linked transposons in direct orientation, Plant Cell, vol. 20, no. 8, pp , [65] C. E. Grover and J. F. Wendel, Recent insights into mechanisms of genome size change in plants, Journal of Botany, vol. 200, Article ID , 200. [66] A. Le Rouzic and G. Deceliere, Models of the population genetics of transposable elements, Genetical Research, vol. 85, no. 3, pp. 7 8, [67] A. Le Rouzic, T. S. Boutin, and P. Capy, Long-term evolution of transposable elements, Proceedings of the National Academy of Sciences of the United States of America, vol. 04, no. 49, pp , [68] S. I. Wright, Q. H. Le, D. J. Schoen, and T. E. 
Bureau, Population dynamics of an Ac-like transposable element in self- and cross-pollinating arabidopsis, Genetics, vol. 58, no. 3, pp , 200. [69] S. Lockton and B. S. Gaut, The evolution of transposable elements in natural populations of self-fertilizing Arabidopsis thaliana and its outcrossing relative Arabidopsis lyrata, BMC Evolutionary Biology, vol. 0, no., article 0, 200. [70] T. H. Bestor, Sex brings transposons and genomes into conflict, Genetica, vol. 07, no. 3, pp , 999. [7] M. C. Ungerer, S. C. Strakosh, and Y. Zhen, Genome expansion in three hybrid sunflower species is associated with retrotransposon proliferation, Current Biology, vol. 6, no. 20, pp. R872 R873, [72] T. Kawakami, S. C. Strakosh, Y. Zhen, and M. C. Ungerer, Different scales of Ty/copia-like retrotransposon proliferation in the genomes of three diploid hybrid sunflower species, Heredity, vol. 04, no. 4, pp , 200. [73] T. T. Hu, P. Pattyn, E. G. Bakker et al., The Arabidopsis lyrata genome sequence and the basis of rapid genome size change, Nature Genetics, vol. 43, no. 5, pp , 20. [74] J. P. Vogel, D. F. Garvin, T. C. Mockler et al., Genome sequencing and analysis of the model grass Brachypodium distachyon, Nature, vol. 463, no. 7282, pp , 200. [75] R. Ming, S. Hou, Y. Feng et al., The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus), Nature, vol. 452, no. 790, pp , [76] V. M. González, A. Benjak, E. M. Hénaff et al., Sequencing of 6.7 Mb of the melon genome using a BAC pooling strategy, BMC Plant Biology, vol. 0, article 246, 200. [77] D. Holligan, X. Zhang, N. Jiang, E. J. Pritham, and S. R. Wessler, The transposable element landscape of the model legume Lotus japonicus, Genetics, vol. 74, no. 4, pp , [78] S. B. Cannon, L. Sterck, S. Rombauts et al., Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 40, pp , [79] G. A. 
Tuskan, S. DiFazio, S. Jansson et al., The genome of black cottonwood, Populus trichocarpa (Torr. & Gray), Science, vol. 33, no. 5793, pp , [80] R. Velasco, A. Zharkikh, M. Troggio et al., A high quality draft consensus sequence of the genome of a heterozygous grapevine variety, PLoS ONE, vol. 2, no. 2, Article ID e326, [8] O. Jaillon, J. M. Aury, B. Noel et al., The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla, Nature, vol. 449, no. 76, pp , [82] J. Schmutz, S. B. Cannon, J. Schlueter et al., Genome sequence of the palaeopolyploid soybean, Nature, vol. 463, no. 7278, pp , 200. [83] J. Du, D. Grant, Z. Tian et al., SoyTEdb: a comprehensive database of transposable elements in the soybean genome, BMC Genomics, vol., no., article 3, 200. [84] J. S. Hawkins, H. Kim, J. D. Nason, R. A. Wing, and J. F. Wendel, Differential lineage-specific amplification of transposable elements is responsible for genome size variation in Gossypium, Genome Research, vol. 6, no. 0, pp , [85] W. Li, P. Zhang, J. P. Fellers, B. Friebe, and B. S. Gill, Sequence composition, organization, and evolution of the core Triticeae genome, Plant Journal, vol. 40, no. 4, pp , [86] T. Wicker, W. Zimmermann, D. Perovic et al., A detailed look at 7 million years of genome evolution in a 439 kb contiguous sequence at the barley Hv-elF4E locus: recombination, rearrangements and repeats, Plant Journal, vol. 4, no. 2, pp , [87] J. Bartoš, E. Paux, R. Kofler et al., A first survey of the rye (Secale cereale) genome composition through BAC end sequencing of the short arm of chromosome R, BMC Plant Biology, vol. 8, article 95, 2008.

Hindawi Publishing Corporation
Journal of Botany
Volume 2011, Article ID 64698, 10 pages
doi:0.55/20/64698

Review Article

The Genomes of All Angiosperms: A Call for a Coordinated Global Census

David W. Galbraith,1 Jeffrey L. Bennetzen,2 Elizabeth A. Kellogg,3 J. Chris Pires,4 and Pamela S. Soltis5

1 School of Plant Sciences and Bio5 Institute, The University of Arizona, Tucson, AZ 8572, USA
2 Department of Genetics, University of Georgia, Athens, GA 30602, USA
3 Department of Biology, University of Missouri-St Louis, St Louis, MO, USA
4 Division of Biological Sciences, University of Missouri, Columbia, MO, USA
5 Florida Museum of Natural History and The Genetics Institute, University of Florida, Gainesville, FL 326, USA

Correspondence should be addressed to David W. Galbraith, galbraith@arizona.edu

Received June 2011; Accepted 22 August 2011

Academic Editor: Andrew Wood

Copyright © 2011 David W. Galbraith et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Recent advances in biological instrumentation and associated experimental technologies now permit an unprecedented efficiency and scale for the acquisition of genomic data, at ever-decreasing costs. Further advances, with accompanying decreases in cost, are expected in the very near term. It now becomes appropriate to discuss the best uses of these technologies in the context of the angiosperms. This white paper proposes a complete genomic census of the approximately 500,000 species of flowering plants, outlines the goals of this census and their value, and provides a road map towards achieving these goals in a timely manner.

1. Introduction

The angiosperms (flowering plants) are believed to comprise somewhere around 360,000 species [1, 2] in 45 families [3].
An exact enumeration of the total number of existing species is impossible, because this includes a guess at the number of species that have not yet been discovered. If these were all found, a final total of somewhere between 400,000 and 500,000 species seems likely [4]. The enumeration and description of all angiosperm species takes on particular urgency because thousands of species become extinct every year, largely due to the impact of human activities. The risks accompanying the depletion of angiosperm biodiversity resources relate to ecosystem maintenance; to food production, for example, through the loss of wild relatives of crops that may have desirable alleles; and to drug discovery, given that plant secondary products represent not simply the primary source of drugs and drug leads, but also the chemical inspiration for organic syntheses. These observations lead us to propose the timely establishment of a global network of coordinated research activities across the angiosperms with the objectives listed below.

2. Objectives

The main objective is to establish vibrant and efficient communication among scientific experts in the disparate disciplines of taxonomy, systematics, cytometry, genomics, and bioinformatics, with the aim of planning a comprehensive molecular census of the angiosperms. This census is envisaged to start with measurement of nuclear genome sizes (C-value) for all angiosperms and would progress, over a five-year period, through sample survey sequencing to complete genome sequencing for a subset of these species.

The census would start with planning activities preparative to coordinated research. Such planning would involve face-to-face meetings, the establishment of virtual communications, training courses and workshops, support for student internships, and support for postdoctoral fellows, graduate students, and junior faculty for cross-disciplinary visits to other participating laboratories and institutions.
It would also include the development and publication of white papers covering technical aspects of the planned research as well as identifying Grand Challenge questions that require coordinated funding. The planning process needs to be sensitive to the international aspects necessarily associated with a global census, including issues of ownership, of specimen import/export, of ethics, and of cultural differences. The planning process should provide intelligent and workable solutions to issues that are identified, and distribute these freely to the general scientific community for adoption.

The census would then proceed to implementation, in which funding would be obtained for the three data-intensive stages of the project.

3. Rationale and Justification

3.1. What Is the Value of a Complete Molecular Census of the Angiosperms?

The rationale underlying a complete global molecular census is primarily and simply that it would provide an accounting of existing plant biodiversity generated by millions of years of evolution, using the most advanced and cost-effective means available. The angiosperms, or flowering plants, emerged approximately 140 million years ago, their sudden appearance and rapid diversification being described by Darwin as an "abominable mystery" [5]. The ease with which angiosperms became the dominant group, displacing ferns, cycadophytes, and conifers, excites considerable interest, as does the dramatic evolution of angiosperm-derived characteristics critical for animal evolution and diversification. As the primary autotrophs on land, angiosperms represent an invaluable resource of genetic and genomic information.

It is therefore surprising, and somewhat alarming, how little genetic and genomic information has been collected for the angiosperms. In part, this has been due to cost. Historically, information of the type that we propose be gathered (cytological data and DNA sequences) has been obtained using expensive and delicate equipment, with high costs of operation.
At present, biases in genomic information for angiosperms reflect the interests of society: Dirzo and Raven [6] noted that the 2,500 species of crop plants (0.5% of all angiosperms, based on Bramwell's estimate [4]) are distributed in only about a third of the 45 families of flowering plants. The grass and legume families are most represented, making up about one-third of all crops. Ten additional families (Amaranthaceae, Apiaceae, Arecaceae, Asteraceae, Brassicaceae, Lamiaceae, Rosaceae, Rutaceae, Solanaceae, and Zingiberaceae) are represented by a few to many dozens of crop species, whereas the numerous other families contribute only one or a few crop species. Of the 2,500 crop species, 103 supply over 90% of the calories for human consumption, with rice, wheat, and maize (all Poaceae) making up over 60% of this total. For this reason alone, collection of cytological, molecular, and genomic data has naturally focused on crops. It should be noted that additional species are cultivated as sources of fiber, as ornamentals, or as sources of medicines. Dirzo and Raven [6] estimate that a further 25,000 plant species are, or have been, used as sources of medicines, but most of these are not cultivated, with materials being collected directly from nature. Therefore, this group of plants, despite its importance, is not well described at the molecular or cytological level.

The justification for a comprehensive molecular census of the angiosperms is easy to articulate. Firstly, because we lack most molecular information for the great majority of plant species, any attempt to acquire this information will provide value. Secondly, it is clear that increasing pressure on the environment from the largely uncontrolled growth of human activity is increasingly resulting in loss of biodiversity [7].
The rate of extinction of species has been likened in severity to the five mass extinction events over the last 600 million years in which 65-95% of all species disappeared [8]. In the present case, this is due to a combination of anthropogenic climate change, superimposed on the clearing of lands for agricultural and industrial purposes. The rates of species extinction are debatable, given caveats, amongst other things, as to the definition of a species, and as to the method of estimating extinction. Figures of 7,500 species extinctions per year are accepted by some, as are estimates for the time to extinction of 50% of birds (,500 years) or mammals (6,500 years; all data from Stork, 2010). Barnosky et al. [8] compared extinction rates of mammals over the last 500 years to those of mammals in the fossil record. They found that the current rate exceeds that associated with all five previous mass extinctions and, coupled with data from other orders of life forms, predicts extinction of 75% of amphibian, mammalian, and avian species within years. Hence, there is broad consensus amongst scientists that an extinction comparable in magnitude to the previous five mass extinctions is indeed currently underway, in agreement with the prediction of a 1998 survey by the American Museum of Natural History [9].

Putting these two observations together, it is obvious that we risk extinction of many (perhaps most) plant species before they are described in any detail. At the molecular and genomic level, this means we risk the nondiscovery of DNA sequences that encode gene products having useful properties, ranging, for example, from genes that might encode resistance to biotic or abiotic stresses, to metabolic pathways that produce compounds of potential pharmacological value.
At the level of basic biology, we risk missing the observation of interesting and novel forms of eukaryotic plant lifestyles, given that our current understanding of plants is predominantly shaped by study of a very limited number of agricultural crops and model species.

If we accept that it would be valuable to derive a comprehensive molecular census of the angiosperms, it is reasonable to move on to examine the value of each of the census components. Within the context of this article and this volume, these components are the C-value, sample survey sequencing, and complete genome sequencing.

3.2. What is the Value of the Information Represented by the Individual Components of the Census?

C-Value

Rationale. The C-value of eukaryotic species is a fundamental parameter that can be measured, given the presence of a nucleus as a defining characteristic of eukaryotic cells. Differences in C-values have impacts at multiple biological levels within eukaryotes, with effects on genome replication time and mechanism, effects on scaling associated with nuclear volume and surface area relative to other cellular components and to overall cell size, effects on numbers of chromosomes and the occurrence and regulation of endo- and polyploidy [10, 11], effects on the copy numbers of genes, transposable elements, and other structural components of chromosomes, and effects relating to genetic and epigenetic transmission of information. Direct measurement of genome size by flow cytometry provides a useful reality check on genome sizes estimated from full sequence assemblies; as a general rule, because repetitive elements are misassembled in or absent from these assemblies, the true genome size is larger than the assembled size [12].

Many research groups have an ongoing interest in studying the processes responsible for genome structure variation in multicellular plants, and the influence of this variation on gene and chromosome function. It is clear that nuclear genomes in angiosperms are highly dynamic, particularly as reflected in the amplification of transposable elements (TEs), in the removal of selectively neutral DNA sequences, in the occurrence of polyploidy, in recombinational rearrangement by ectopic events, and in the resolution of double-strand DNA breaks [13-20]. Dramatic differences have been observed in the rates of these processes in different lineages [21], but too few lineages have been analyzed to define when these changes occur, and to associate these changes with alterations in the environment or in the evolved fitness for that environment. Other workers have examined correlations between nuclear DNA content and histone H3 methylation, between rDNA copy number and genome size, and between genome size and recombination rate (reviewed in [2]).
At the cellular and organismal levels, analysis of C-values has been used for a number of large comparative studies [2]. These recently have included the relationship between genome size and B-chromosomes, duration of the cell division cycle, seed size and mass, photosynthetic rate, plant growth form and distribution, leaf cell size and stomatal density, and patterns of genome size evolution (see [2, 22, 23] for a full discussion).

Differences in C-value have practical implications as well, notably for identification of species that can be most efficiently sequenced (i.e., have the smallest genomes), identification of unsuspected polyploid series, identification of rare hybrid species, and identification of the best plant lineages to investigate recent changes in DNA removal rate, TE amplification, or ploidy. Genome size determinations also indicate the best lineages for sequence analysis that can provide the broadest range of information about plant gene diversity (content and function), by allowing choices of appropriately sized genomes from a phylogenetically informative set of species. In agriculture, nuclear DNA content measurements allow rapid identification of novel hybrids, detection of desired or undesired ploidy classes within certified seed, efficient screening for haploids and doubled haploids, and rogueing of aneuploids emerging from genetic engineering and tissue culture [24].

Justification. As of now, genome sizes have been measured and compiled for 8,959 species [2], of which 6,287 have been awarded "prime" status, meaning that their values are strongly supported by curated experimental results. The latter number represents 1-2% of angiosperms, depending on whether one includes as yet undiscovered species. Leadership in creating a repository of this information has been taken by Drs. Bennett and Leitch at the Royal Botanic Gardens, Kew, in the form of the Plant DNA C-values Database (release 5.0, December 2010; [25]).
This database provides a number of searchable functions and modes, returning output fields chosen from Family, Genus, Species, Authority, Chromosome number (2n), Ploidy level (x), Estimation method, Voucher, C-value, and Original Reference. One can also search for species having genome sizes falling within specific defined ranges.

Most C-value measurements included in the database, and essentially all that are done now, employ flow cytometry to estimate C-value, following methods pioneered some years ago [26]. This involves simple chopping of plant samples, filtering the released nuclei to remove large debris, and adding a DNA-specific fluorochrome. The fluorescence emission of the nuclei is then measured by flow cytometry, and compared to that of standards [26-28].

Although flow cytometry is convenient and rapid (up to 0 samples per hour can be readily processed on one instrument), and the required chemicals are inexpensive, the low level of overall sampling of angiosperm C-values achieved to date is, in part, due to the historically high cost of the flow cytometers themselves. The instruments have also typically been physically large, difficult and expensive to maintain, and have required highly trained technicians as operators. Some examples of low-cost cytometers have been available over the last 25 years (Partec), having a small laboratory footprint and minimal energy and fluidic requirements. Other manufacturers have recently developed low-cost instruments, including Accuri, Millipore, and Life Technologies [29]. Three further features of these instruments are relevant as follows.

Dynamic Range and Linearity. The Accuri C6, costing around $35,000, has an extraordinarily large dynamic range of measurement. This dynamic range easily accommodates DNA content values spanning pg in a single instrument run, and analysis of noise and upper limits indicates the C6 should be able to handle the largest and smallest angiosperm values yet reported [30].
For all samples, optical linearity can also be validated from the accumulated data, such as appropriate positioning of 4C versus 2C nuclei at both ends of the scale. Further, working with fixed settings offers significant gains in robust processing of data between days, and across users and laboratories.

Automated Throughput. Low-cost plate samplers are now available. A general criticism of much current C-value data from botanical studies is their inadequate sampling: few replicates, and few populations, in particular. The published value needs to convey a context, implying more numbers and more discriminating sampling, which is evidently facilitated by automation.
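
The measurement steps described above (comparison of a sample peak to a standard, a 4C versus 2C linearity check, and replicate sampling) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular laboratory or instrument: the function names and all input numbers are hypothetical, and the conversion factor of roughly 978 Mbp per picogram is a commonly used approximation that is not given in this article.

```python
from statistics import mean, stdev

# Assumption (not from the article): 1 pg of DNA is taken as ~978 Mbp.
MBP_PER_PG = 978.0


def estimate_2c_pg(sample_peak, standard_peak, standard_2c_pg):
    """Estimate the sample 2C DNA content (pg) from the ratio of the
    sample and standard G0/G1 fluorescence peak positions."""
    if sample_peak <= 0 or standard_peak <= 0 or standard_2c_pg <= 0:
        raise ValueError("peak positions and standard value must be positive")
    return standard_2c_pg * sample_peak / standard_peak


def linearity_ratio(peak_4c, peak_2c):
    """Ratio of the 4C to the 2C peak position; a value close to 2.0
    suggests the instrument response is linear in that range."""
    if peak_2c <= 0:
        raise ValueError("2C peak position must be positive")
    return peak_4c / peak_2c


def summarize_replicates(values_pg):
    """Return (mean, coefficient of variation in %) across replicate
    2C estimates; the CV is NaN when only one replicate is given."""
    m = mean(values_pg)
    cv = 100.0 * stdev(values_pg) / m if len(values_pg) > 1 else float("nan")
    return m, cv


def pg_to_mbp(pg):
    """Convert a DNA amount in picograms to megabase pairs."""
    return pg * MBP_PER_PG


if __name__ == "__main__":
    # Hypothetical peak positions (arbitrary fluorescence units) for three
    # replicate runs against a standard with an assumed 2C value of 2.0 pg.
    runs = [(251.0, 100.0), (248.0, 99.0), (253.0, 101.0)]
    estimates = [estimate_2c_pg(s, std, 2.0) for s, std in runs]
    avg, cv = summarize_replicates(estimates)
    print(f"mean 2C = {avg:.3f} pg, CV = {cv:.2f}%")
    print(f"approx. {pg_to_mbp(avg):.0f} Mbp per 2C nucleus")
    print(f"4C/2C ratio = {linearity_ratio(502.0, 251.0):.3f}")
```

Reporting the mean together with the coefficient of variation, rather than a single estimate, is one simple way to convey the sampling context that the text asks for; sampling across individuals and populations would add further grouping that this sketch omits.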

Portability. A follow-on is that part of these analyses would be more effectively done in the field, such that immediate results guide subsequent sampling. The Partec CyFlow miniPOC instrument, designed for CD4 and CD4% testing in developing countries but, based on its technical specifications, clearly adaptable for genome size measurements, is particularly notable, since it weighs less than 5 kg, operates on a 2-volt battery supply, costs less than $20,000, and, being equipped with USB ports, can be readily interfaced with wireless communication devices.

An additional impediment to large-scale C-value measurement has been the typical requirement of fresh tissue for flow cytometric analysis, as originally described [26]. However, modifications to standard techniques now allow the use of dried leaf tissue [31, 32], an innovation that may dramatically increase the rate of C-value measurement, particularly if the approach can be altered to accommodate the vast botanical resources housed in the world's herbaria.

Based on these advances, it is now conceivable, in terms of affordability, methodology, and practicality, to consider extending C-value measurements to all angiosperms. The general concept would be to distribute significant numbers (20-30) of these instruments across the globe, placing them in appropriate locations, such as botanical gardens, seed repositories, and academic institutions, and then operate them continuously to analyze samples collected at those particular locations. Our aim, therefore, is to develop an optimal design to achieve this goal; this will require identifying and addressing issues associated with sample collection and identification, setting up standard practices for sample processing, and devising schemas for solving any technical problems that might be encountered. An ancillary, yet crucial, aim is to facilitate the process of obtaining funds to support data collection.
For both aims, a key is the establishment of collaborations, as has been done for animal genomics [33]. In recent years, the number of groups undertaking C-value measurements around the world has steadily increased, and scientists are beginning to recognize the potential of medium- and large-scale surveys. A number of reports are emerging that address C-values of species within specific geographical areas (see [2] for a discussion), and participants in these projects are already highly collaborative.

It is, finally, worth emphasizing quality control of C-value estimates. The concept of a gold standard has already been mentioned in terms of the values provided as "prime" estimates in the Kew C-value database. However, in the scientific community beyond cytologists, evolutionary biologists, population biologists, and systematists, there has been a naïve acceptance of very narrow data generation and analysis in the C-value field. For instance, analyses that look at only one individual within a species, with only one C-value determination for that sample or species, are often judged acceptable. Analysis of multiple individuals and information on ploidy, populations, vouchers, and cytogenetics should be provided and required for definitive assessments. In considering a census of largely unexplored species, it must be underlined that a 2C-value gains its pertinence from its context and representativeness. While technical quality control is simple to develop and ensure, the conceptual setting requires scientific maturation. We need to avoid the situation in which we are attempting to integrate often dubious lists of C-values (albeit easy to use and therefore readily appreciated) with laborious geobotanical and evolutionary studies. We also need to develop means to continue to identify and, to the greatest extent possible, eliminate inaccurate or incorrect values from the C-value literature.

Sample Survey Sequencing

Rationale.
The angiosperms are unusual, compared to most eukaryotic clades, in that their genome sizes span an extraordinary range. The smallest reported angiosperm genome size, that of Genlisea margaretae (Lentibulariaceae), corresponds to a 2C-value of 0.129 pg [34]. The largest currently measured is that of Paris japonica (Melanthiaceae) [12], representing a genome of 150 billion base pairs. Because angiosperms contain somewhere between 25,000 and 75,000 genes, the immediate question is why some angiosperms carry 2,400-fold more nuclear DNA than others while remaining evolutionarily viable. Further questions concern the relationship between chromosome number and C-value; an extreme example is that of Sedum suaveolens with 2n = 640 and yet only 8.3 pg of DNA per nucleus [35]. In part, the first question is answered by the established occurrence, dynamic amplification, and loss of repetitive DNA sequences within the genome (see, for example, [15, 36, 37]). However, too few samples exist to draw meaningful conclusions, and the same is certainly true for the second question. This aspect of the census, therefore, aims to provide an in-depth analysis of the correlates of C-value diversity and an understanding of its biological consequences. This would be achieved by obtaining more information, across carefully selected species, on the patterns of repetitive sequences within the genomes. For this type of survey, only very low sequence coverage is needed, and the sequencing effort can be guided by C-value information obtained as previously described.

Justification. Dramatic decreases in costs of sequencing have accompanied the emergence of new sequencing technologies [38].
Colloquially termed next-generation (NextGen) sequencing, these technologies rely on highly parallel implementation of sequencing reactions, at specific locations on slides or in microwells, which greatly reduces the cost per reaction as compared to capillary-based (first-generation) sequencing. A penalty is paid in terms of read length, although incremental improvements in this parameter continue to be achieved. However, the disadvantages of short read lengths are greatly offset by dramatic increases in sequence output. For example, a single sequencing center, BGI Shenzhen, now has the capacity to produce about 75 terabytes of sequence data per day, approximately equal to 1,500 human genomes. Indeed, the extraordinary increase in global sequencing capacity, coupled with plunging costs, has spawned a number of large-scale collaborative projects, including the 1,000 human genomes project, the 10,000 vertebrate genomes project, the 1,001 Arabidopsis project, the 1,000 plant transcriptomes project, and various metagenomics surveys (see, for example, the International Human Microbiome Consortium).

Massively parallel sequencing has revolutionized molecular biology by making genomic sequencing possible for many more organisms than previously attainable. Low redundancy and shallow coverage genome survey sequences from this type of sequencing have the potential to rapidly provide large, cost-effective datasets for phylogenetic inference, to replace single gene or spacer regions as DNA barcodes, and to provide a plethora of data for other comparative molecular evolution studies [39–41]. Coupling low coverage ( 5X) sequencing to methods of DNA indexing, to allow sample multiplexing, provides low-cost surveys of genomes, assuming one has an idea of the genome size. Repetitive sequences predominate in the results, and tools have been provided for their specific analysis in this type of data [39–42], so this provides a rapid means to chart the occurrence, amplification and contraction, and phylogenetic distributions of these interesting genomic elements. The role of coordination in these activities is to provide an international framework that can most efficiently define those angiosperm species to be sequenced.

Complete Genome Sequencing

Rationale. The angiosperms are also unusual in terms of the widespread occurrence of ancestral genome duplication events [43]. For example, even a genome as small as that of Arabidopsis thaliana contains clear evidence of several large-scale duplication events, although their precise timing remains contentious [44]. Angiosperms readily accommodate polyploidy, and this mechanism provides the greatest proportion of duplicated genes [45].
At the same time, tandem duplications are an important contributor to the evolution of agriculturally and biomedically important traits, such as disease resistance and the biosynthesis of secondary products. Full genomic sequencing, hand-in-hand with transcriptome analysis, would greatly augment the molecular analyses at the heart of this census, since it would enumerate all sequences that might have evolutionary relevance to plant performance and productivity. Clearly, such a goal cannot be achieved by relying solely on NextGen sequencing, due to cost and throughput considerations at this large scale. However, radically different sequencing (Gen3) technologies are near commercialization and promise quantum leaps in throughput and correspondingly decreased costs (see, e.g., Pacific Biosciences [46]). Further, no limitations on continued decreases in costs of sequencing, and associated increases in sequence output, are envisaged over the next decade or so [38]. Planning for the rational use of Gen3 (and GenN and GenN+1) sequencers therefore represents an important component of coordination activities, given that they should be fully functional in the research setting in the upcoming year.

Justification. As indicated above, rapid advances in technology mean that our capabilities to acquire sequence data run the danger of outstripping our abilities to select appropriate questions to be addressed. In effect, we will be in the position of the amateur photographer a few years back, encountering the replacement of film by digital technologies. Without the limitation imposed by the time and cost of buying and developing film, the art of photography has been profoundly changed. Care in selection of subject, exposure, and so on, has been replaced by taking many more pictures in the expectation that at least some will be outstanding.
The danger is the inevitable deluge of data, and the lack of both software to adequately curate these data and storage to archive them. It will be critical to design appropriate experiments and strategies, to assemble teams, and to generate funding resources, such that a complete molecular census of the angiosperms can be achieved.

What Is the Value of Coordinated Planning in This Area?

Performing a complete molecular census of the angiosperms will require coordinated planning. First and foremost, we need to establish a network of scientists who display the ability to work interactively and without egotism, who also are aware of the technical, technological, and infrastructural issues that relate to the collection of census data, and who are sensitive to the cultural, social, and geographical issues that necessarily accompany collection activities. Coordination will also lead to the identification of taxonomic and collection deficiencies that must be remedied for census implementation. This will impact the training programs recognized as important in the plant sciences, and is certainly anticipated to put much more emphasis on the classical training of plant systematists, morphologists, and taxonomists. The scientists wishing to participate in coordination activities should have a common interest in genome structure, function, and evolution, and, since they use many different genetic, molecular, and computational strategies to investigate related questions, a greater level of interaction would promote synergistic activities. Analyzing genome sizes across all angiosperms would provide a solid foundation for initiating collaborations.
For instance, if some lineage of plants were found to have unusually rapid rates of genome size change, this would provide an impetus to those investigating transposable elements to look for de novo element activity, to DNA repair experts to see how this process can change over short evolutionary timeframes, and to ecologists and evolutionary biologists to investigate the possible mechanistic origins and functional outcomes of these changes. As in all comprehensive research projects, the great majority of the samples will be simple to collect and analyze, but the last few samples are likely to be recalcitrant to analysis or (more likely) collection. One of the first tasks of the coordinating group will be to set initial priorities for sample analysis. Although these priorities cannot be established in advance of assembly of a coordinating group with broad disciplinary and international representation, it is likely that the first studies would focus on full sampling across the angiosperm phylogeny. One can imagine that the secondary priorities will be deeper sampling of specific families that have interesting properties discovered in the first broad sampling and/or families that have active research communities. A first glance at genome size data can also identify lineages that have recently changed their ploidy, where 1X, 2X, 4X, and so forth, genome size variation is observed among close relatives, since measurement of C-values is less cumbersome than measurement of chromosome numbers. Also, analyzing a full range of angiosperms for genome size would indicate patterns in genome change that may have been missed as a consequence of observing an unevenly sampled and tiny subset of plant genomes. It could identify, for example, which TEs are the most likely to amplify dramatically in one lineage compared to another. The data so far show no patterns beyond shared events in closely related species. However, with each genome comprising only a single data point, we only have a couple dozen (mostly wildly scattered) data points. More lineages need to be investigated, and genome size determinations would be a first step toward rationally selecting sample sequence analyses that can determine genome composition with a small amount of shotgun sequence data [42]. Finally, research coordination activities including participants who have been conducting field work in many countries over the years would make use of current best practices and existing contacts for collaboration.
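Two of the uses of genome size data mentioned above, spotting ploidy-like multiples among close relatives and sizing a sparse shotgun survey, lend themselves to a short illustration. The following is a minimal sketch, not a validated method: the function names, the 10% tolerance, the set of multiples, and the example taxa and their 2C-values are all hypothetical. The conversion of roughly 978 Mbp per picogram is the commonly used factor and is treated here as an assumption. A ratio near a whole multiple only flags a pair for follow-up; it does not by itself demonstrate a ploidy change.

```python
from itertools import combinations

MBP_PER_PG = 978.0  # assumed conversion: 1 pg of DNA is roughly 978 Mbp


def pg_to_bp(picograms):
    """Convert a DNA mass in picograms to base pairs."""
    return picograms * MBP_PER_PG * 1e6


def survey_bases(c_value_2c_pg, coverage):
    """Bases to sequence for a target average coverage of the 1C genome."""
    genome_bp = pg_to_bp(c_value_2c_pg / 2.0)  # 1C is half of 2C
    return genome_bp * coverage


def ploidy_like_pairs(c_values_2c_pg, multiples=(2, 3, 4), tolerance=0.10):
    """Flag pairs of relatives whose 2C ratio lies near a whole multiple.

    Returns (smaller_taxon, larger_taxon, multiple) tuples.
    """
    if any(value <= 0 for value in c_values_2c_pg.values()):
        raise ValueError("C-values must be positive")
    ordered = sorted(c_values_2c_pg.items(), key=lambda item: item[1])
    hits = []
    for (small_name, small), (large_name, large) in combinations(ordered, 2):
        ratio = large / small
        for multiple in multiples:
            if abs(ratio - multiple) / multiple <= tolerance:
                hits.append((small_name, large_name, multiple))
    return hits


# Hypothetical 2C-values (pg) for four close relatives
example_2c_pg = {"sp_A": 2.0, "sp_B": 4.1, "sp_C": 8.3, "sp_D": 5.0}

print(ploidy_like_pairs(example_2c_pg))
# [('sp_A', 'sp_B', 2), ('sp_A', 'sp_C', 4), ('sp_B', 'sp_C', 2)]

# Sequence needed for 0.1X coverage of a genome with 2C = 2.0 pg
print(f"{survey_bases(2.0, 0.1) / 1e6:.1f} Mb")
# 97.8 Mb
```

In this toy data set sp_D is not flagged against any relative, which is the kind of case where the size difference would more plausibly reflect repeat amplification or loss than a whole-genome event, and where sample sequencing would be the natural next step.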
The characterization, preservation, and utilization of angiosperm genetic diversity is a vital issue worldwide, so full buy-in to this proposed approach is expected from the international community of plant scientists. In summary, the proposed coordination activities would aim to bring together a range of scientists with similar interests who have not previously worked together, but whose collective interests span multiple relevant, interconnected disciplines and technologies. We expect that the proposed collaboration will generate genome size data leading to myriad and diverse spinoffs, as the data will implicate particularly interesting lineages for further investigation, including cases where genome change appears to be rapid or in an unexpected direction. Once genome sizes are known broadly, future choices of genomes that deserve full shotgun (or complete, pending technological advances and funding) sequence analysis can be based on a much more complete knowledge foundation.

How Does This Census Relate to Crops and Food Production?

The census relates to crops at multiple direct levels. First, the data generated by grants that will be written or otherwise facilitated by the coordination and planning activities should uncover unusual and unexpected properties of angiosperm genome size, leading to discoveries regarding repeat structure and, finally, genome sequence. The identification of wild diploid relatives of major and minor crops will provide a road map for complete sequencing, and will focus the development of other molecular tools for those crops. The relevance of orphan crops, particularly in developing countries, should also be mentioned. In general, these have not benefited from the application of modern and intense breeding programs, and therefore dramatic yield improvements may be feasible.
Finally, through the many investigations stimulated by the interactions enabled through the census, we fully anticipate the discovery of novel genomic features and sequence types, whose evolutionary properties and mechanistic underpinnings are presently unknown but which might have direct relevance to crop improvement.

4. Implementation of Research Coordination Activities

What type of research coordination might be considered? We propose the following.

4.1. An Annual Workshop. This workshop would be designed to provide theoretical underpinning and practical training in census activities. Ideally, workshop participants would include graduate students and postdoctoral research associates, in order to provide them with direct experience in technical methods, which are envisaged as being based on either wet laboratory or in silico approaches. An embedded theme in our call is the establishment of distributed technologies of reasonable cost to address the acquisition of a complete angiosperm census. Workshop topics could include (i) measurement of plant genome C-values using small footprint (essentially portable) flow cytometric instrumentation, (ii) the use of small footprint sequencers in a distributed environment (e.g., the Life Technologies Ion Torrent and the Illumina MiSeq), (iii) data storage and manipulation (particularly using cloud-based solutions), (iv) GPS-based methods for plant imaging, taxonomic assignment, and archiving, and (v) current issues relating to plant classification. Issues of resource protection, sovereignty and ownership, and ethics will also be integrated into these workshops. Finally, outreach activities to schools would be designed and implemented to extend awareness of these emerging issues.

4.2. An Annual Conference. An annual conference centered around census goals and activities would provide a natural environment to establish collaborative activities.

4.3. A Website.
This would be set up to allow registration of network participants, to promote interactions among these participants, to archive methods and technologies, and to provide hyperlinks to resources, for example, to tools for resolving conflicts in the application of scientific names.

4.4. A Program of Exchange Visits. These would be between participating laboratories, to allow efficient transfer of information, particularly for cross-training students, postdoctoral research associates, and beginning faculty.

4.5. A Program of White Papers. These would outline future research needs and directions and Grand Challenge questions, and would provide written support and resources for proposals to acquire the specific data types, starting with C-value measurements, moving on to sample sequencing, and then to complete genome sequencing. Specific topics might include (i) establishing a C-value analysis pipeline for seed storage repositories, including those at botanical gardens, government facilities, and international agencies (e.g., CGIAR centers), (ii) issues and problems of sample collection and taxonomic identification, (iii) integration of disparate information types, and (iv) ongoing maintenance and curation of databases.

4.6. A Research Coordination Committee. The plant C-value community would be polled to elect a coordination committee that would discuss issues related to implementation of the proposed research program. The election, tenure, and activities of this committee would be comparable to those of the Maize Genetics Executive Committee and other organism-specific research organization groups, like those developed for Arabidopsis, sorghum, soybean, and wheat. The primary purpose of such a group would be to foster communication within the plant C-value research community and between the researchers and external interested parties, like governments, funding agencies, industry, and NGOs.

4.7. Coordinated Searches for Funding. This will be a critical part of any proposed census, and the search for funds for implementation of the collection, measurement, and archiving activities will be challenging. Review panels in national funding agencies tend not to favor projects that propose surveys without also addressing an underlying biological question.
Part of this bias comes from the predominance of hypothesis testing as the core of the scientific method within biological research, and it contrasts dramatically with the situation in other observational sciences, such as astronomy, where large-scale collection of data in the absence of hypotheses is the norm [38]. Nonetheless, construction of biological databases (e.g., GenBank and, before it, the Atlas of Protein Sequence and Structure) has been integral to the development of modern biology; surveys and their resulting data are incorporated into more experimental realms of biology almost automatically when they are found to be useful [47]. It is likely that progressive recognition, by society, of anthropogenic change to the environment will alter the perception of how biological research should be done, and will lend urgency to completing the proposed census. If so, funding will naturally follow. Nonetheless, coordination of funding activities on an international scale will present challenges of its own.

5. Available Resources

Notable resources in terms of both infrastructure and research personnel already exist, and these would provide an excellent foundation for the proposed coordination activities.

5.1. Natural Repositories. Botanical gardens represent the historical locations for plant collections, both living and in archival vouchered forms. Some important examples are in the developed countries, including the Royal Botanic Gardens, Kew (RBG Kew), the Arnold Arboretum, the New York Botanical Garden, and the Missouri Botanical Garden. Others, in developing countries, have access to notable biodiversity (for example, Xishuangbanna, China, and Bogor, Indonesia). RBG Kew has established a program of measurement of C-values and maintains important web resources relating to plant biodiversity. All of these gardens have implemented, or are implementing, programs for extraction and archiving of genomic DNA.
Some have active federal support for assessing biodiversity; for example, the Center for Tropical Forest Science, a joint venture of the Arnold Arboretum and the Smithsonian Tropical Research Institute, has funding that supports Biodiversity Workshops in collaboration with scientists in China and the developing world. A second type of collection is exemplified by seed storage programs at the Svalbard Global Seed Vault, at the USDA facility in Fort Collins, Colorado, and by RBG Kew at Wakehurst Place. The first and second are repositories of seed accessions primarily of crop species of importance to the world and to US agriculture, respectively. The third has a comprehensive mandate, having already banked dry seed of 10% of the world's seed plant diversity since 2000, and aiming to reach 25% by 2020. For the two latter locations, a regular cycle of seed germination, to verify viability in storage over time, is an integral part of the program. The seedling samples generated in this way represent an obvious and available resource for genome size measurements and for genomic DNA extraction.

Global resources in terms of research personnel include cytologists active in genome size measurements, scientists involved in large-scale genome sequencing efforts, taxonomists, systematists, evolutionary biologists, and bioinformaticians. These interested and trained scientists are the most vital resource of all, and the proposed pan-angiosperm C-value census will add tremendous impetus for growth of this community.

5.2. Cytologists. A number of cytologists are active in flow cytometric genome measurement, particularly in Europe.
Work by cytologists and collaborators has defined the means to efficiently obtain C-values from plants (e.g., [26, 28, 48]), to correlate C-values with environmental population distributions within species [49], to examine genome size within the context of angiosperm evolution [22, 50], to employ unexpected sources of plant samples for C-value measurement [51], to sample geographic locations [52–54], and to organize and provide this information in searchable form to the community ([25] and references therein). Researchers in Europe are particularly active in medium-scale projects to define angiosperm C-values within specific geographic regions. Coordinating activities within this community would serve primarily to prioritize activities to be covered by grant applications, to avoid duplication, and to efficiently translate research findings into practical advances across the network.

5.3. Large-Scale Genomics Analysis. The rate of production of genomic sequences is increasing nearly exponentially, driven by considerable innovation in instrumentation and sequencing technologies [38]. Coordination of sequencing activities is largely driven by funding concerns: many crop species that have been sequenced (e.g., maize), and that are being sequenced (wheat), have such large genomes that whole genome sequencing can only be achieved by consortia. Other consortia are sampling many individuals within single species, or are sampling representative individuals within a specific genus. Many of these consortia span the globe, and some employ combinations of expertise in cytology and sequencing to address previously intractable issues (e.g., flow sorting of chromosome arms of wheat prior to the use of NextGen sequencing). Some consortia center around a single, large-scale sequencing facility, the Beijing Genomics Institute in Shenzhen and the JGI in the US being prime examples. One advantage of large sequencing centers is the speed with which they can incorporate advanced sequencing technologies into their pipelines, and the consequent commoditization of sequencing costs.

5.4. Taxonomists, Systematists, and Evolutionary Biologists. A number of collaborative mechanisms currently exist in the US that link personnel and activities within this group. For example, an NSF Research Coordination Network already exists on the topic of Microevolutionary Molecular and Organismic Research in Plant History (microMORPH), as well as two NSF Centers: the National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina, and the National Center for Ecological Analysis and Synthesis (NCEAS) at UC Santa Barbara.
Efficient collaboration with these centers, and others, such as the NSF iPlant Collaborative, which is charged with development of cyberinfrastructure to address, for example, issues of evolutionary patterns and mechanisms and appropriate phylogenetic sampling, should facilitate attaining the goals of the angiosperm census.

5.5. Bioinformaticians. A key program currently driving collaboration in plant bioinformatics in the US is the iPlant Collaborative. Close interaction with this program and its scientists will be important for developing and using informatics tools to link international efforts in angiosperm genomics. iPlant has links with the 1KP project, and via that to NESCent, further extending the potential for collaborative links.

Overall, the proposed pan-angiosperm census should develop and strengthen communication, knowledge, and scientific training between individuals, and their laboratory members, who are acknowledged experts in one, or at most two, of the five research focus areas identified above. Understanding the gaps in information between experts in these areas will allow us to design workshops that address and fill them. Through serving as a bridge between different international groups that are addressing similar census goals, we should be able to facilitate the set-up and funding of formal activities that efficiently divide census tasks across geographical locations, and that successfully engage the support of the various governments. We envisage the process of writing white papers as an important aspect of our activities, since this should provide local funding applications with an international imprimatur signifying approval from the global scientific community. We emphasize that although a number of local activities are in place to address the goal of a molecular census, none are comprehensive and all would benefit from a coordination mechanism such as that suggested here.

6. Summary and Conclusions

We are currently at an unprecedented moment for the investigation of life on Earth. The scale of generation of DNA sequence information dwarfs any previous type of biological information gathering. Costs of these analyses continue to fall, and the power of tools for extracting biologically meaningful insights from these data continues to advance, at extraordinary rates. All plants harbor genetic novelty with potential agricultural, biomedical, environmental, and industrial value, and the small number of species with any molecular analysis at all indicates that only a tiny portion of this value has been identified. Still, with 500,000 species, full genome sequence analysis of all angiosperms is not on the near-term horizon. Priorities need to be set, with genome size and ploidy as key criteria. Moreover, genome size is itself an important biological feature that can help tell us how genomes evolve and function. A broad characterization of C-values across the angiosperms would identify biological and/or environmental correlates with nuclear genome size, would uncover the general rules of genome size variation, would discover lineages where unusual genomic alterations occurred or are occurring, and would indicate which species are most appropriate for sample sequence or full genome sequence analysis. The technologies for angiosperm C-value analysis have themselves dramatically decreased in cost and increased in robustness and availability of late. Hence, the time is right to undertake study of the nuclear genome DNA content for the entire range of angiosperms. We propose a coordinated process to set priorities and foster communications between laboratories worldwide that will pursue C-value analysis of the angiosperms. This coordination would include an annual workshop and an annual meeting, web-based tools, an elected coordination committee, exchange visits, white papers, and organized searches for funding.
The community to undertake the research exists, although we expect it to grow dramatically over the next few years, but it is currently composed of a large number of scattered and uncoordinated research laboratories. Our proposal seeks to maintain the creative independence of these groups and new entrants to the field, but to provide them with the communication, coordination, and data analysis tools that will make their work most productive.

References

[1] R. F. Thorne, How many species of seed plants are there? Taxon, vol. 5, no. 3, pp. 5 52.
[2] A. J. Paton, N. Brummitt, R. Govaerts et al., Towards target 1 of the global strategy for plant conservation: a working list of all known plant species progress and prospects, Taxon, vol. 57, no. 2.
[3] B. Bremer, K. Bremer, M. W. Chase et al., An update of the angiosperm phylogeny group classification for the orders and families of flowering plants: APG III, Botanical Journal of the Linnean Society, vol. 6, no. 2, pp. 05 2.
[4] D. Bramwell, How many plant species are there? Plant Talk, vol. 28.
[5] T. J. Davies, T. G. Barraclough, M. W. Chase, P. S. Soltis, D. E. Soltis, and V. Savolainen, Darwin's abominable mystery: insights from a supertree of the angiosperms, Proceedings of the National Academy of Sciences of the United States of America, vol. 0, no. 7.
[6] R. Dirzo and P. H. Raven, Global state of biodiversity and loss, Annual Review of Environment and Resources, vol. 28.
[7] N. E. Stork, Re-assessing current extinction rates, Biodiversity and Conservation, vol. 9, no. 2, 2010.
[8] A. D. Barnosky, N. Matzke, S. Tomiya et al., Has the Earth's sixth mass extinction already arrived? Nature, vol. 47, no. 7336, pp. 5 57, 2011.
[9] American Museum of Natural History, National survey reveals biodiversity crisis scientific experts believe we are in midst of fastest mass extinction in Earth's history, 1998, press/feature/biofact.html/.
[10] D. W. Galbraith, K. R. Harkins, and S. Knapp, Systemic endopolyploidy in Arabidopsis thaliana, Plant Physiology, vol. 96, no. 3, 1991.
[11] M. Barow and A. Meister, Endopolyploidy in seed plants is differently correlated to systematics, organ, life strategy and genome size, Plant, Cell and Environment, vol. 26, no. 4.
[12] M. D. Bennett and I. J. Leitch, Nuclear DNA amounts in angiosperms: targets, trends and tomorrow, Annals of Botany, vol. 07, no. 3, 2011.
[13] J. L. Bennetzen, J. Ma, and K. M. Devos, Mechanisms of recent genome size variation in flowering plants, Annals of Botany, vol. 95.
[14] N. Chantret, J. Salse, F. Sabot et al., Molecular basis of evolutionary events that shaped the Hardness locus in diploid and polyploid wheat species (Triticum and Aegilops), Plant Cell, vol. 7, no. 4.
[15] K. M. Devos, J. K. M. Brown, and J. L. Bennetzen, Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis, Genome Research, vol. 2, no. 7.
[16] K. Ilic, P. J. SanMiguel, and J. L. Bennetzen, A complex history of rearrangement in an orthologous region of the maize, sorghum, and rice genomes, Proceedings of the National Academy of Sciences of the United States of America, vol. 00, no. 2.
[17] E. Isidore, B. Scherrer, B. Chalhoub, C. Feuillet, and B. Keller, Ancient haplotypes resulting from extensive molecular rearrangements in the wheat A genome have been maintained in species of three different ploidy levels, Genome Research, vol. 5, no. 4.
[18] J. Ma and J. L. Bennetzen, Rapid recent growth and divergence of rice nuclear genomes, Proceedings of the National Academy of Sciences of the United States of America, vol. 0, no. 34.
[19] J. Ma and J. L. Bennetzen, Recombination, rearrangement, reshuffling, and divergence in a centromeric region of rice, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 2.
[20] R. T. Gaeta and J. C. Pires, Homoeologous recombination in allopolyploids: the polyploid ratchet, New Phytologist, vol. 86, pp. 8 28, 2010.
[21] C. Vitte and J. L. Bennetzen, Analysis of retrotransposon structural diversity uncovers properties and propensities in angiosperm genome evolution, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 47.
[22] I. J. Leitch, D. E. Soltis, P. S. Soltis, and M. D. Bennett, Evolution of genome size in the angiosperms, American Journal of Botany, vol. 90.
[23] I. J. Leitch, D. E. Soltis, P. S. Soltis, and M. D. Bennett, Evolution of DNA amounts across land plants (embryophyta), Annals of Botany, vol. 95.
[24] D. W. Galbraith, J. Bartos, and J. Dolezel, Flow cytometry and cell sorting in plant biotechnology, in Flow Cytometry in Biotechnology, L. A. Sklar, Ed., Oxford University Press, New York, NY, USA.
[25] M. D. Bennett and I. J. Leitch, Plant DNA C-values database, 2010.
[26] D. W. Galbraith, K. R. Harkins, J. M. Maddox et al., Rapid flow cytometric analysis of the cell cycle in intact plant tissues, Science, vol. 220, no. 460, 1983.
[27] J. S. Johnston, M. D. Bennett, A. L. Rayburn, D. W. Galbraith, and H. J. Price, Reference standards for determination of DNA content of plant nuclei, American Journal of Botany, vol. 86, no. 5, 1999.
[28] J. Doležel, J. Greilhuber, and J. Suda, Estimation of nuclear DNA content in plants using flow cytometry, Nature Protocols, vol. 2, no. 9.
[29] K. R. Chi, Going with the flow, The Scientist, vol. 25, no. 5, p. 57, 2011.
[30] D. W. Galbraith, Simultaneous flow cytometric quantification of plant nuclear DNA contents over the full range of described angiosperm 2C values, Cytometry A, vol. 75, no. 8.
[31] J. Suda and P. Travnicek, Reliable DNA ploidy determination in dehydrated tissues of vascular plants by DAPI flow cytometry new prospects for plant research, Cytometry A, vol. 69, no. 4.
[32] A. V. Roberts, The use of bead beating to prepare suspensions of nuclei for flow cytometry from fresh leaves, herbarium leaves, petals and pollen, Cytometry A, vol. 7, no. 2.
[33] D. Haussler, S. J. O'Brien, O. A. Ryder et al., Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species, Journal of Heredity, vol. 6, pp , 00.
[34] J. Greilhuber, T. Borsch, K. Muller, A. Worberg, S. Porembski, and W. Barthlott, Smallest angiosperm genomes found in Lentibulariaceae, with chromosomes of bacterial size, Plant Biology, vol. 8, no. 6.
[35] B. J. M. Zonneveld, New record holders for maximum genome size in eudicots and monocots, Journal of Botany, 4 pages, 2010.
[36] P. SanMiguel, A. Tikhonov, Y.-K. Jin et al., Nested retrotransposons in the intergenic regions of the maize genome, Science, vol. 274, no. 5288, 1996.

38 0 Journal of Botany [37] C. Vitte, O. Panaud, and H. Quesneville, LTR retrotransposons in rice (Oryza sativa, L.): recent burst amplifications followed by rapid DNA loss, BMC Genomics, vol. 8, article 28, [38] D. W. Galbraith, The grand challenges in enabling dataintensive biological research, Frontiers in Genomic Assay Technology, vol. 2, no. 26, 20. [39] P. Green, 2x genomes does depth matter? Genome Research, vol. 7, no., pp , [40] D. A. Rasmussen and M. A. F. Noor, What can you do with 0. genome coverage? A case study based on a genome survey of the scuttle fly Megaselia scalaris (Phoridae), BMC Genomics, vol. 0, article 382, [4] P. R. Steele and J. C. Pires, Biodiversity assessment: state-ofthe-art techniques in phylogenomics and species identification, American Journal of Botany, vol. 98, no. 3, pp , 20. [42] J. DeBarry, R. Liu, and J. L. Bennetzen, Discovery and assembly of repeat family pseudomolecules from sparse genomic sequence data using the assisted automated assembler of repeat families (AAARF) algorithm, BMC Bioinformatics, vol. 9, article 235, [43] Y. Jiao, N. J. Wickett, S. Ayyampalayam et al., Ancestral polyploidy in seed plants and angiosperms, Nature, vol. 473, no. 7345, pp , 20. [44] M. D. Ermolaeva, M. Wu, J. A. Eisen, and S. L. Salzberg, The age of the Arabidopsis thaliana genome duplication, Plant Molecular Biology, vol. 5, no. 6, pp , [45] L. E. Flagel and J. F. Wendel, Gene duplication and evolutionary novelty in plants, New Phytologist, vol. 83, no. 3, pp , [46] J. Eid, A. Fehr, J. Gray et al., Real-time DNA sequencing from single polymerase molecules, Science, vol. 323, no. 590, pp , [47] B. J. Strasser, Collecting, comparing, and computing sequences: the making of margaret O. Dayhoff s Atlas of Protein Sequence and Structure, , Journal of the History of Biology, vol. 43, no. 4, pp , 200. [48] J.Loureiro,E.Rodriguez,J.Doležel, and C. 
Santos, Two new nuclear isolation buffers for plant DNA flow cytometry: a test with 37 species, Annals of Botany, vol. 00, no. 4, pp , [49] K. H. Keeler, B. Kwankin, P. W. Barnes, and D. W. Galbraith, Polyploid polymorphism in big bluestem (Andropogon gerardii Vitman), Genome, vol. 29, no. 2, pp , 987. [50] G. Bharathan, G. M. Lambert, and D. W. Galbraith, Nuclear DNA content of monocotyledons and related taxa, American Journal of Botany, vol. 8, no. 3, pp , 994. [5] E. Sliwinska, I. Pisarczyk, A. Pawlik, and D. W. Galbraith, Measuring genome size of desert plants using dry seeds, Botany, vol. 87, no. 2, pp , [52] J. Suda, J. Loureiro, P. Travnicek et al., Flow cytometry and its applications in plant population biology, ecology and biosystematics: new prospects for the cape flora, South African Journal of Botany, vol. 75, no. 2, pp , [53] J. Loureiro, D. Kopecky, S. Castro, C. Santos, and P. Silveira, Flow cytometric and cytogenetic analyses of Iberian Peninsula Festuca spp, Plant Systematics and Evolution, vol. 269, no. -2, pp , [54] S. Siljak-Yakovlev, F. Pustahija, E. M. Solic et al., Towards a genome size and chromosome number database of balkan flora: C-values in 343 taxa with novel values for 242, Advanced Science Letters, vol. 3, no. 2, pp , 200.

Hindawi Publishing Corporation
Journal of Botany
Volume 2011, Article ID 458684, 6 pages
doi:10.1155/2011/458684

Research Article

Does Large Genome Size Limit Speciation in Endemic Island Floras?

Maxim V. Kapralov and Dmitry A. Filatov

Department of Plant Sciences, University of Oxford, South Parks Road, Oxford OX1 3RB, UK

Correspondence should be addressed to Maxim V. Kapralov, maxim.kapralov@plants.ox.ac.uk

Received 27 May 2011; Accepted 28 August 2011

Academic Editor: Andrea Polle

Copyright 2011 M. V. Kapralov and D. A. Filatov. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Genome sizes in plants vary by several orders of magnitude, and this diversity may have evolutionary consequences. Large genomes contain mainly noncoding DNA that may impose high energy and metabolic costs on their bearers. Here we test the large genome constraint hypothesis, which assumes that plant lineages with large genomes diversify more slowly (Knight et al., 2005), using the endemic floras of the oceanic archipelagos of the Canaries, Hawaii, and the Marquesas Islands. In line with this hypothesis, the number of endemic species per genus is negatively correlated with genus-average genome size for island radiations on the Hawaiian and Marquesas archipelagos. However, we do not find this correlation on the Canaries, which are close to the continent and therefore have a higher immigration rate and lower endemism than Hawaii. Further work on a larger number of floras is required to test the generality of the large genome constraint hypothesis.

1. Introduction

The DNA content of one nonreplicated holoploid genome with the chromosome number n, referred to as the C-value [1], varies nearly 2400-fold across angiosperms [2], from the smallest value, in Genlisea margaretae (Lentibulariaceae) [3], to the largest, in Paris japonica (Melanthiaceae) [4].
However, gene numbers per genome in angiosperms do not vary so greatly [5] despite the enormous variation in DNA content. For example, the genome of Zea mays is nearly 8-fold larger than that of another grass species, Brachypodium distachyon, but the difference in gene numbers between these species is only 22%, a discrepancy that is mainly explained by the different retrotransposon content of the two genomes [5]. The number of eukaryotic genes is relatively stable and makes up a small fraction of total DNA, while much of the variation in genome size is due to noncoding DNA [6], which may be energetically and metabolically costly for its bearers [7]. Vinogradov [8] and Knight et al. [9] found a negative correlation between genus-level diversity and genus-average genome size in plants, suggesting a genome size constraint on the capacity for diversification. Knight et al. [9] further proposed the large genome constraint hypothesis (LGCH), which suggests that species with large genomes are less likely to generate progenitor species. The LGCH agrees with the general observation that most angiosperm species have small genomes, with a mode, median, and mean genome size (C) of just 0.6, 2.6, and 6.2 pg, respectively [10]. The LGCH echoes the view that larger genomes are maladaptive, as they may constrain growth [11], and that they evolved in populations with smaller effective population size and hence lower efficacy of natural selection [6]. However, no relationship between effective population size and genome size was found in seed plants [12]. Some support for the LGCH comes from Suda et al., who hypothesized that a rapid insular burst of speciation is more likely to happen in angiosperms with minute nuclear DNA amounts ([13], page 234) after finding that many island lineages of Macaronesian angiosperms that underwent adaptive radiations have very small genome sizes. Analyses by Vinogradov [8] and Knight et al. [9] do tentatively support the LGCH, but the negative correlations they found between genus-level diversity and genus-average genome size were quite weak, 0. and 0.065, respectively, and the methods they used might be

subject to a phylogenetic bias: more closely related species are expected to have more similar genome sizes, which was not taken into account in the previous analyses. There is a call for phylogenetic comparative analyses of genome size [14] with a complete genus-level phylogeny of plants [9]. While a complete genus-level phylogeny of flowering plants is yet to be achieved despite significant recent progress in the field [15], some regional floras have been studied well enough for the task. Oceanic archipelagos have been regarded as nature's laboratories since Darwin's and Wallace's seminal works [16, 17] and may offer a particularly good opportunity to test the LGCH. Oceanic archipelagos are groups of islands of exclusively volcanic origin that have never been connected to continents [16]. The biota of oceanic islands is composed of species that arrived via long-distance dispersal or evolved through in situ speciation, often via bursts of speciation that form multiple closely related species adapted to a broad spectrum of ecological niches [18]. Usually a significant proportion of the species on oceanic archipelagos are endemic [19]. If we assume that a large genome is an evolutionary handicap, then island endemic lineages with larger genomes should generate fewer progenitor species than their relatives with smaller genomes. Here we use a phylogenetic framework to test this prediction of the LGCH in the endemic floras of the oceanic archipelagos of the Canaries, Hawaii, and the Marquesas Islands. All three archipelagos possess highly diverse and intensively studied floras that are well suited to testing our hypothesis [18, 20, 21].

The Canary archipelago is formed by volcanically active islands and islets that are 7 24 My old and about 7447 km² in area [22]. The Canaries are located just off the northwest coast of mainland Africa, 00 km west of the border between Morocco and the Western Sahara.
The archipelago possesses high ecosystem diversity, including dry semidesert vegetation of the coastal lowlands, woodlands, the laurel forest zone, pine forest, and the summit scrub. There are about 680 endemic vascular plant taxa in the Canaries, accounting for over 50% of the native flora [20]. C-values of 40% of the plant species endemic to the Canaries were estimated by Suda et al. [13, 23], which makes the Canary flora the best covered regional flora from the genome size perspective.

The Hawaiian archipelago includes 8 major islands with a total land area of about 6636 km² located in the middle of the Pacific Ocean. The origin of the archipelago dates back to about 70 My ago, although most of the extant islands are younger than 5 My [24]. The high elevation of the islands, up to 3000 m, creates steep climatic gradients and diverse ecosystems ranging from dry exposed coastal cliffs, through dry, mesic, and wet forests, to alpine summit scrubs. As an isolated archipelago, Hawaii is relatively poor in species, with 009 native angiosperm species, but rich in endemic taxa, which constitute about 90% of the native flora [25].

The Marquesas archipelago, located in the Eastern Pacific Ocean, comprises 9 main islands with a total area of 049 km². These tropical islands are subject to frequent drought conditions due to the prevailing easterly winds formed from the dry air masses above the Humboldt Current. The Marquesas archipelago is characterized by relatively homogeneous conditions and hence by an impoverished native flora (ca. 360 species) with a high proportion of endemics (42%; [21]).

Here we use the endemic floras of these archipelagos to test the large genome constraint hypothesis [9]. Unlike the previous studies [8, 9], we employ a phylogenetic framework to take the relatedness of species into account.
The negative correlation between the number of endemic species per genus and genome size remains significant under this framework for the Pacific archipelagos, providing additional support for the LGCH.

2. Methods

2.1. Data Collection. Data on the endemic angiosperm flora of the Canary Islands were obtained from the checklist [26]. The lists of endemic angiosperm species for the floras of the Hawaiian and Marquesas Islands were obtained from the websites developed by the Smithsonian Institution [27, 28]. Only species-level taxa were included in the analysis. We then obtained the average genome size for genera with endemic species from the Plant DNA C-values database at the Royal Botanic Gardens, Kew [29], which contains C-values for .8% of all angiosperm species and 58% of angiosperm families [2]. The incompleteness of the C-value database, and cases in which large genera are represented by just one or a few species, contributed random gaps and noise to our analyses, making them more conservative. For the Canaries only data from Suda et al. [13, 23] were used. Genome size values for the Hawaiian endemic genus Schiedea were obtained from [30]. This resulted in a dataset with information on the number of endemic species per genus and the genus-average genome size for 26, 67, and 7 genera from the Canary Islands, the Hawaiian Islands, and the Marquesas Islands, respectively (see Table S1 of the Supplementary Material available online at doi:10.1155/2011/458684).

2.2. Data Analyses. We tested correlations between the number of endemic species per genus and genus-average genome size using phylogenetically independent contrasts (PIC; [31]). The PIC approach is more conservative than conventional statistics; the difference in trait values is calculated at each node of the phylogeny, resulting in n − 1 contrasts, where n is the number of species (tips) in a fully resolved tree. We calculated Felsenstein's independent contrasts in Mesquite (version 2.72) [32] using the PDAP:PDTREE module [33].
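The contrast calculation described above can be illustrated with a minimal sketch. It is not the Mesquite/PDAP implementation used in the study; it simply follows Felsenstein's published algorithm on a hypothetical four-genus tree. The tree shape, branch lengths, trait values, and the names `contrasts` and `correlation_through_origin` are all assumptions made for illustration.

```python
import math

def contrasts(node, trait):
    """Return (node value, extra branch length, list of contrasts).

    A tip is a genus name (str). An internal node is a pair
    ((child, branch_length), (child, branch_length)).
    """
    if isinstance(node, str):
        return trait[node], 0.0, []
    (left, v_left), (right, v_right) = node
    x_left, extra_left, c_left = contrasts(left, trait)
    x_right, extra_right, c_right = contrasts(right, trait)
    # Lengthen each branch to reflect uncertainty in the estimated node value.
    v_left += extra_left
    v_right += extra_right
    # Standardized contrast: trait difference divided by its expected standard deviation.
    contrast = (x_left - x_right) / math.sqrt(v_left + v_right)
    # Node value: average of the two children weighted by inverse branch length.
    value = (x_left * v_right + x_right * v_left) / (v_left + v_right)
    extra = v_left * v_right / (v_left + v_right)
    return value, extra, c_left + c_right + [contrast]

def correlation_through_origin(a, b):
    """Correlation of two sets of contrasts, computed through the origin."""
    cross = sum(x * y for x, y in zip(a, b))
    return cross / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

# Hypothetical tree ((A, B), (C, D)) with unit branch lengths.
ab = (("A", 1.0), ("B", 1.0))
cd = (("C", 1.0), ("D", 1.0))
tree = ((ab, 1.0), (cd, 1.0))

# Hypothetical trait values, not data from the study.
genome_size = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 6.0}
endemic_species = {"A": 10.0, "B": 6.0, "C": 3.0, "D": 1.0}

_, _, size_contrasts = contrasts(tree, genome_size)
_, _, species_contrasts = contrasts(tree, endemic_species)
r = correlation_through_origin(size_contrasts, species_contrasts)
print(len(size_contrasts), round(r, 3))
```

With four tips the sketch yields three contrasts per trait, matching the n − 1 rule. The correlation is computed through the origin because the sign of each contrast depends on an arbitrary choice of which child is subtracted from which.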
For the analyses of the endemic floras of Hawaii and the Marquesas Islands, as well as for the combined dataset of Hawaii and the Marquesas Islands, which share many genera with endemic species and belong to the same biogeographic area, we used phylogenetic trees built from rbcL sequences obtained from GenBank [34]. Phylogenetic trees were reconstructed with Bayesian inference using MrBayes 3.1.2 [35, 36]. Alignments were partitioned by codon position, and the general time-reversible nucleotide substitution model with a gamma shape parameter was used. All model parameters were optimized independently for each codon position. Two independent analyses, each with four parallel chains, were run for generations, sampling trees every 00

generations after a burn-in period of generations. However, for about half of the genera in the Canary Islands dataset, rbcL sequences were not available. Thus, to build a phylogeny of Canary endemics we used the program Phylomatic 2 [37], which utilizes current knowledge of phylogenetic relations between angiosperm taxa, in combination with published phylogenies [38–41] and our own tree for available genera (Supplementary Material). All resulting phylogenies (Figure 1; Figures S1–S3) matched generally accepted phylogenetic relationships.

Table 1: Genus-average genome size and endemic diversity statistics of studied archipelagos. Columns: archipelago; N genera included; N endemic species; N species per genus; genus-average C-value (C, pg), mean and SD; Felsenstein's contrasts correlation (d.f., R, P value). Rows: the Canaries; Hawaii; Marquesas Islands; Hawaii and Marquesas Islands.

3. Results and Discussion

The sampled 26, 67, and 7 genera from the Canary Islands, the Hawaiian Islands, and the Marquesas Islands contained 480, 334, and 52 endemic species, respectively. The joint Hawaii-Marquesas dataset contained 73 genera with 386 endemic species and had the highest number of endemic species per genus (5.29). The Canaries had fewer endemic species per genus (3.8). Average C-values were 2.05 and 2.35 pg for the Canaries and the joint Hawaii-Marquesas dataset, respectively. Differences between the Hawaii-Marquesas dataset and the Canary Islands were marginally nonsignificant for the number of endemic species per genus (P value = 0.07; t-test) and not significant for genus-average C-values (P value = 0.24; t-test). Mean C-values for all archipelagos are nearly threefold lower than the mean calculated for all available angiosperms [10], in accordance with the conclusions of Suda et al. for Macaronesian angiosperms [13]. Thus, the relatively small genome size of island endemics compared to the mainland biota is confirmed for three oceanic archipelagos and seems to be a general rule.
The smaller genomes of island endemics could be explained either by genome miniaturization during or after island speciation events or by the predominance of colonizers with small genomes. Island populations often have small effective population sizes due to limited resources and bottlenecks during island-hopping speciation. Hence, given the increased activity of transposable elements in small populations [42], genome miniaturization might not be very common on islands, and the opposite trend may prevail. Indeed, a nearly threefold genome increase in younger species without a change in ploidy level was reported for the Hawaiian endemic genus Schiedea (Caryophyllaceae), presumably due to the accumulation of transposons [30]. Despite this increase, Schiedea also illustrates the smaller genome size of island taxa compared to their mainland counterparts, given that this Hawaiian endemic genus has a genome over fourfold smaller than that of its mainland sister genus, Honckenya [30]. Thus the predominance of colonizers with small genomes and/or the higher naturalization potential of species with small genomes is a more likely explanation for the smaller genomes of island endemics. This agrees with recent findings that invasive plant species have smaller genomes than their noninvasive relatives [43–45].

PIC analyses showed a negative correlation between the number of endemic species per genus and genus-average genome size for all archipelagos analyzed separately as well as for the joint Hawaii-Marquesas dataset (Table 1). However, the correlation was significant only in the Hawaiian and joint Hawaii-Marquesas datasets (Table 1). The Marquesas Islands alone do not show a significant correlation, presumably because of the small sample size; however, combining them with biogeographically similar Hawaii made the negative correlation stronger (Table 1).
While smaller genomes of island endemics hold for all studied archipelagos, there is a striking difference between Hawaii and the Canaries in the negative correlation between the number of endemic species per genus and genus-average genome size. This difference is not explained by the age and size of the islands, the total number of species sampled, or C-values, which are relatively similar. However, the average number of species used to calculate mean C-values per genus is about threefold lower for the Canaries than for Hawaii (Table S1), and together with the less resolved phylogeny this could make the PIC analysis for the Canary Islands more conservative. Also, from a biogeographical point of view, the Canaries are close to Africa, while Hawaii is a much more isolated archipelago with a higher proportion of endemic species. Geographical isolation and its consequences for colonization potential perhaps explain the higher number of endemic species per genus in Hawaii and might influence the relationship between genus-average genome size and endemic diversity statistics. Thus, a significant negative correlation between the number of endemic species per genus and genus-average genome size is not a universal feature of the studied oceanic archipelagos and may depend on factors such as proximity to the mainland or island size, with more local endemics on larger islands [46]. Hence, further work on a larger number of floras is required to test the generality of the LGCH.

Figure 1: Bayesian phylogeny of the joint dataset of Hawaii and the Marquesas Islands based on rbcL sequences. Posterior probabilities are shown above branches; the number of endemic species per genus and the genus-average genome size (C, pg) are shown after each genus name, before and after the slash, respectively. Felsenstein's contrasts correlation R between numbers of endemic species per genus and the genus-average genome size is (P value = 0.0).

Acknowledgments

The authors thank Mark Chapman for comments on the paper and the Natural Environment Research Council, UK, for funding.

References

[1] J. Greilhuber, J. Doležel, M. A. Lysák, and M. D. Bennett, The origin, evolution and proposed stabilization of the terms genome size and C-value to describe nuclear DNA contents, Annals of Botany, vol. 95, no., pp ,
[2] M. D. Bennett and I. J. Leitch, Nuclear DNA amounts in angiosperms: targets, trends and tomorrow, Annals of Botany, vol. 07, no. 3, pp , 20.
[3] J. Greilhuber, T. Borsch, K. Müller, A. Worberg, S. Porembski, and W. Barthlott, Smallest angiosperm genomes found in Lentibulariaceae, with chromosomes of bacterial size, Plant Biology, vol. 8, no. 6, pp ,
[4] J. Pellicer, M. F. Fay, and I. J. Leitch, The largest eukaryotic genome of them all? Botanical Journal of the Linnean Society, vol. 64, no., pp. 0 5, 200.
[5] K. M. Devos, Grass genome organization and evolution, Current Opinion in Plant Biology, vol. 3, no. 2, pp , 200.
[6] M. Lynch and J. S. Conery, The origins of genome complexity, Science, vol. 302, no. 5649, pp ,
[7] P. SanMiguel, A. Tikhonov, Y. K. Jin et al., Nested retrotransposons in the intergenic regions of the maize genome, Science, vol. 274, no. 5288, pp , 996.
[8] A. E. Vinogradov, Selfish DNA is maladaptive: evidence from the plant Red List, Trends in Genetics, vol. 9, no., pp ,
[9] C. A. Knight, N. A. Molinari, and D. A. Petrov, The large genome constraint hypothesis: evolution, ecology and phenotype, Annals of Botany, vol. 95, no., pp ,
[10] I. J. Leitch, J. M. Beaulieu, M. W. Chase, A. R. Leitch, and M. F. Fay, Genome size dynamics and evolution in monocots, Journal of Botany, vol. 200, Article ID 86256, 8 pages, 200.
[11] M. D. Bennett and I. J. Leitch, Genome size evolution in plants, in The Evolution of the Genome, T. R. Gregory, Ed., pp , Elsevier, Amsterdam, The Netherlands,
[12] K. D. Whitney, E. J. Baack, J. L. Hamrick et al., A role for nonadaptive processes in plant genome size evolution?
Evolution, vol. 64, no. 7, pp , 200.
[13] J. Suda, T. Kyncl, and V. Jarolímová, Genome size variation in Macaronesian angiosperms: forty percent of the Canarian endemic flora completed, Plant Systematics and Evolution, vol. 252, no. 3-4, pp ,
[14] B. Charlesworth and N. Barton, Genome size: does bigger mean worse? Current Biology, vol. 4, no. 6, pp. R233 R235,
[15] B. Bremer, K. Bremer, M. W. Chase et al., An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III, Botanical Journal of the Linnean Society, vol. 6, no. 2, pp. 05 2,
[16] A. R. Wallace, Island Life, Or, The Phenomena and Causes of Insular Faunas and Floras, Including a Revision and Attempted Solution of the Problem of Geological Climates, Harper Collins College, London, UK, 88.
[17] C. R. Darwin, Journal of Researches into the Natural History and Geology of the Countries Visited During the Voyage of H.M.S. Beagle Round the World, under the Command of Capt. Fitz Roy, R.N., John Murray, London, UK, 845.
[18] W. L. Wagner and V. A. Funk, Eds., Hawaiian Biogeography. Evolution on a Hot Spot Archipelago, Smithsonian Institution Press, Washington, DC, USA, 995.
[19] T. F. Stuessy, G. Jakubowsky, R. S. Gómez et al., Anagenetic evolution in island plants, Journal of Biogeography, vol. 33, no. 7, pp ,
[20] J. A. Reyes-Betancort, A. Santos Guerra, I. R. Guma, C. J. Humphries, and M. A. Carine, Diversity, rarity and the evolution and conservation of the Canary Islands endemic flora, Anales del Jardin Botanico de Madrid, vol. 65, no., pp ,
[21] J. Florence and D. H. Lorence, Introduction to the flora and vegetation of Marquesas Islands, Allertonia, vol. 7, pp , 997.
[22] G. Féraud, G. Giannérini, R. Campredon, and C. J. Stillman, Geochronology of some Canarian dike swarms: contribution to the volcano-tectonic evolution of the archipelago, Journal of Volcanology and Geothermal Research, vol. 25, no. -2, pp , 985.
[23] J. Suda, T. Kyncl, and R.
Freiová, Nuclear DNA amounts in Macaronesian angiosperms, Annals of Botany, vol. 92, no., pp ,
[24] J. P. Price and D. A. Clague, How old is the Hawaiian biota? Geology and phylogeny suggest recent divergence, Proceedings of the Royal Society B, vol. 269, no. 508, pp ,
[25] J. P. Price, Floristic biogeography of the Hawaiian Islands: influences of area, environment and paleogeography, Journal of Biogeography, vol. 3, no. 3, pp ,
[26] M. Arechavaleta, S. Rodriguez, N. Zurita, and A. Garcia, Eds., Lista de Especies Silvestres de Canarias. Hongos, Plantas y Animales Terrestres, Gobierno de Canarias, 200.
[27] W. L. Wagner, D. R. Herbst, and D. H. Lorence, Flora of the Hawaiian Islands website, 2005,
[28] W. L. Wagner and D. H. Lorence, Flora of the Marquesas Islands website, 2002,
[29] M. D. Bennett and I. J. Leitch, Plant DNA C-values Database, release 5.0, 200,
[30] M. V. Kapralov, M. Stift, and D. A. Filatov, Evolution of genome size in Hawaiian endemic genus Schiedea (Caryophyllaceae), Tropical Plant Biology, vol. 2, no. 2, pp ,
[31] J. Felsenstein, Phylogenies and the comparative method, American Naturalist, vol. 25, no., pp. 5, 985.
[32] W. P. Maddison and D. R. Maddison, Mesquite: a modular system for evolutionary analysis. Version 2.72,
[33] P. E. Midford, T. Garland Jr., and W. Maddison, PDAP:PDTREE package for Mesquite, version .2,
[34] National Center for Biotechnology Information, ncbi.nlm.nih.gov.
[35] J. P. Huelsenbeck and F. Ronquist, MRBAYES: Bayesian inference of phylogenetic trees, Bioinformatics, vol. 7, no. 8, pp , 200.
[36] F. Ronquist and J. P. Huelsenbeck, MrBayes 3: Bayesian phylogenetic inference under mixed models, Bioinformatics, vol. 9, no. 2, pp ,
[37] C. O. Webb and M. J. Donoghue, Phylomatic: tree assembly for applied phylogenetics, Molecular Ecology Notes, vol. 5, no., pp. 8 83,
[38] S. R. Downie, M. F. Watson, K. Spalik, and D. S. Katz-Downie, Molecular systematics of Old World Apioideae

(Apiaceae): relationships among some members of tribe Peucedaneae sensu lato, the placement of several island-endemic species, and resolution within the apioid superclade, Canadian Journal of Botany, vol. 78, no. 4, pp ,
[39] S. C. Kim, M. R. McGowen, P. Lubinsky, J. C. Barber, M. E. Mort, and A. Santos-Guerra, Timing and tempo of early and successive adaptive radiations in Macaronesia, PLoS One, vol. 3, no. 5, Article ID e239,
[40] M. E. Mort, D. E. Soltis, P. S. Soltis, J. Francisco-Ortega, and A. Santos-Guerra, Phylogenetics and evolution of the Macaronesian clade of Crassulaceae inferred from nuclear and chloroplast sequence data, Systematic Botany, vol. 27, no. 2, pp ,
[41] T. L. P. Couvreur, A. Franzke, I. A. Al-Shehbaz, F. T. Bakker, M. A. Koch, and K. Mummenhoff, Molecular phylogenetics, temporal diversification, and principles of evolution in the mustard family (Brassicaceae), Molecular Biology and Evolution, vol. 27, no., pp. 55 7, 200.
[42] J. F. Y. Brookfield and R. M. Badge, Population genetics models of transposable elements, Genetica, vol. 00, no. 3, pp , 997.
[43] M. Kubešová, L. Moravcová, J. Suda, V. Jarošík, and P. Pyšek, Naturalized plants have smaller genomes than their noninvading relatives: a flow cytometric analysis of the Czech alien flora, Preslia, vol. 82, no., pp. 8 96, 200.
[44] S. Lavergne, N. J. Muenke, and J. Molofsky, Genome size reduction can trigger rapid phenotypic evolution in invasive plants, Annals of Botany, vol. 05, no., pp. 09 6, 200.
[45] E. Grotkopp, M. Rejmánek, M. J. Sanderson, and T. L. Rost, Evolution of genome size in pines (Pinus) and its life-history correlates: supertree analyses, Evolution, vol. 58, no. 8, pp ,
[46] A. C. Algar and J. B. Losos, Evolutionary assembly of island faunas reverses the classic island-mainland richness difference in Anolis lizards, Journal of Biogeography, vol. 38, no. 6, pp , 20.

Hindawi Publishing Corporation
Journal of Botany
Volume 2011, Article ID 0472, 0 pages
doi:0.55/20/0472

Review Article

Genome Diversity in Maize

Victor Llaca (1), Matthew A. Campbell (2), and Stéphane Deschamps (1)

(1) DuPont Agricultural Biotechnology, Experimental Station, P.O. Box 80353, Wilmington, DE , USA
(2) Pioneer Hi-Bred International Inc., A DuPont Company, 7300 NW 62nd Avenue, P.O. Box 004, Johnston, IA , USA

Correspondence should be addressed to Victor Llaca, victor.llaca@usa.dupont.com

Received 27 April 2011; Accepted 7 July 2011

Academic Editor: Simon Hiscock

Copyright 2011 Victor Llaca et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Zea mays (maize) has historically been used as a model species for genetics, development, physiology, and, more recently, genome structure. The maize genome is complex, with striking intraspecific variation in gene order, repetitive DNA content, and allelic content that exceeds the levels observed between primate species. Maize genome complexity is primarily driven by polyploidization and explosive amplification of LTR retrotransposons, with the counteracting effect of unequal and illegitimate crossover. Transposable elements have been shown to capture genic content, create chimeras, and amplify those sequences via transposition. New sequencing platforms and hybridization-based strategies that have appeared over the past decade are being applied to maize. They are providing the first comprehensive genome-wide view of structural variation and will provide the basis for investigating the interplay between repeats and genes as well as the amount of species-level diversity within maize.

1. Introduction

Maize is among the most extensively studied plant species in the history of genetics.
Beyond its considerable agricultural and economic value as a crop for food, feed, and fuel, maize presents unparalleled biological attributes as a research model for genetic diversity and genome evolution [1, 2]. The maize genetic pool includes a large natural diversity among both wild and cultivated relatives. Well-established breeding strategies, inbred lines, mutant collections, easy-to-follow phenotypes, and large distinctive chromosomes are just some of the characteristics that allowed the construction of the first plant genetic map [3], proof that meiotic recombination is linked with the recombination of genetic traits, and support for the chromosomal theory of inheritance [4, 5]. Subsequent research using maize as a genetic model led to the discovery of epigenetic modifications (in the form of paramutation) as well as transposable elements (TEs), later found to be common to most, if not all, eukaryotic and prokaryotic organisms [6–9]. The accumulated cytogenetic and genetic data and, more recently, vast sequence information derived from genome project initiatives in grasses have provided a wealth of information on the structure and evolution of the maize genome. A BAC-by-BAC, mapped draft of the genome of the maize inbred line B73 is currently available ([10]; latest release, AGPv2). Additional structural and sequence data for diverse maize genotypes are accumulating thanks to the use of improved comparative genomic hybridization (CGH) techniques and deep resequencing using next-generation sequencing platforms [11–14]. Furthermore, an increasing number of finished genome sequences within the Poaceae (i.e., the grass family) provide an important resource for comparative genomics [15].
Currently, there are relatively complete physical assemblies for four nonmaize grass species: rice (Oryza sativa [16]), sorghum (Sorghum bicolor [17]), purple false brome (Brachypodium distachyon [18]), and foxtail millet (Setaria italica [19]; Bennetzen et al., unpublished results). In this review, we focus on the diversity of the nuclear genome of maize. As we will describe in this paper, this is a dynamic genome, with multiple processes playing a role in its expansion, contraction, and sequence diversification.

2. General Characteristics of the Maize Genome

The maize genome is genetically diploid and consists of 10 chromosomes with an estimated size ranging from 2.3

to 2.7 Gb [9, 10, 20, 21]. As is the case with other large plant genomes, the maize genome consists mostly of a nongenic, repetitive fraction punctuated by islands of unique or low-copy DNA that harbor single genes or small groups of genes. The repetitive elements contribute significantly to the wide range of diversity within the species and include transposable elements (TEs), ribosomal DNA (rDNA), and high-copy short tandem repeats mostly present at the telomeres, centromeres, and heterochromatic knobs [5, 22–24]. Transposable elements are mobile, selfish DNA sequences that can move from their original location to other parts of the genome. They are classified as Class I (retrotransposons) or Class II (DNA transposons), according to whether the transposition intermediate is RNA or DNA, respectively [25]. Retrotransposons are copied via reverse transcription of an RNA intermediate, and the new copy inserts at a new location in the genome, producing a net gain of one element. Most Class II transposable elements, on the other hand, transpose by a cut-and-paste process. Members of one family of DNA transposons in plants are proposed to undergo replicative transposition using a rolling-circle DNA replication mechanism [26]. Class I and Class II transposons are either autonomous, meaning that they contain all the components necessary for transposition, or nonautonomous, meaning that their transposition depends on the presence of the cognate autonomous element [25]. The most abundant class of TEs in plants is the long terminal repeat (LTR) retrotransposons, retrovirus-like mobile genetic elements flanked by long terminal repeats. Most LTR retrotransposons are bounded by target site duplications (TSDs), show extensive CpG and CHG methylation, and are generally organized into large clusters, commonly nested inside other LTR retrotransposons [27, 28].
Over 1 million LTR retrotransposon fragments, corresponding to more than 75% of the total nuclear genome sequence, have been identified in the maize genome. The actual number of elements is hard to define because of the nested nature of retrotransposition in maize and the fragmentation of the data [29], although some estimates have ranged from 50,000 to 250,000 elements [27]. To date, 44 families of LTR retrotransposons have been annotated, most of them present at relatively low copy numbers (less than 10 copies) and constituting a small proportion of the genome. However, two families (Copia and Gypsy) are highly abundant and constitute about 80% of the total retroelements. Retrotransposons are not randomly distributed: Copia-like elements are usually found in gene-rich regions, whereas Gypsy-like elements are overrepresented in pericentromeric and other heterochromatic regions [10]. Finally, LINEs (Long Interspersed Nuclear Elements) and their short derivatives, SINEs (Short Interspersed Nuclear Elements), are less well-defined, non-LTR retroelements. LINEs are commonly identified by the presence of TSDs (typically generated during insertion) and one end terminated with a homopolymer, usually poly(T), while SINEs have an internal RNA polymerase III promoter sequence and one homopolymer end, and their transposition depends on the presence of an autonomous LINE [29]. LINEs and SINEs are relatively infrequent in maize and constitute approximately 1% of the genome. Class II elements, DNA transposons, constitute a smaller proportion of the maize genome than retrotransposons, about 8.6%. The first mobile elements to be described were Class II maize transposons, discovered by Barbara McClintock during her study of the chromosomal duplications, inversions, and translocations produced by the Ac/Ds elements [6, 30].
Since then, additional families of DNA transposons have been identified and classified, including Tc1/Mariner, hAT, Mutator, PIF/Harbinger, and their nonautonomous derivatives, termed MITEs (Miniature Inverted-repeat Transposable Elements) [25, 31]. These Class II elements are usually bounded by terminal inverted repeats (TIRs) and are flanked by short TSDs. Autonomous elements encode transposase and/or additional genes that are necessary for their transposition, while nonautonomous elements tend to be short, have nonconserved internal sequences, and/or carry captured DNA elements [25, 32]. Like retrotransposons, Class II elements are not randomly distributed in the genome, and most families, with the exception of elements from the CACTA family, preferentially insert into genic regions [10]. Methylation seems to have a role in the regulation and silencing of Class I and Class II transposable elements, and activation is correlated with demethylation of TIRs [33–35]. High-copy tandem repeats are present in different parts of the genome, including centromeres, telomeres, knobs, and rDNA. The centromeric regions include a combination of repeats and retrotransposons in or near sequences that participate in the formation of the kinetochore and in the attachment of microtubules to chromosomes during mitosis and meiosis. Maize centromeres consist of thousands of copies of a 156-bp unit called CentC [23] (reviewed in [36]). Centromeres evolve very quickly, and centromere repeats have little or no homology between species. However, the same repeats are found in all maize chromosomes [37]. Because large regions of tandem repeats are difficult to sequence, the number of copies in most centromeres has not been determined. The only centromeres that have been fully assembled are those of chromosomes 2 and 5, which are thought to be the shortest [38]. The maize ZmB73v1 reference genome assembly contains an estimated 54% of the genome's total CentC content [38].
Analysis of stretched DNA fibers suggests that the total length of CentC arrays varies from less than 100 kb to several megabases [37]. Four types of retrotransposons, CRM1, CRM2, CRM3/CentA, and CRM4, have been described as centromeric and are interspersed among CentC tandem repeat sequences [10]. One additional repeat sequence, Cent4, is at or near the primary constriction of chromosome 4 [39]. Heterochromatic knobs, cytological features that can be observed as dark round structures, consist of megabase-sized tandem arrays derived from one of two repeats (180 and 350 bp long) that comprise 0.6% to 6% of the total genome. They can be found in more than 20 specific locations in pachytene chromosomes [40], and their structural differences among maize varieties suggested significant intraspecific diversity of

the maize genome, as we will discuss later. Knobs can also have retroelements inserted [22, 41–43]. The rDNA regions consist of thousands of tandem repeats encoding rRNA. It has been estimated that the maize genome has between ,600 and 23,000 9-kb tandem copies of genes encoding the 45S RNA precursor for the 18S, 5.8S, and 28S ribosomal RNAs on the short arm of chromosome 6. This arrangement constitutes the nucleolus organizer region (NOR), another early observable cytogenetic feature of the maize genome [44]. Each ribosomal gene in the repeat is separated from the next by a nontranscribed spacer. Genes encoding the 5S ribosomal RNA are clustered as 342-bp tandem repeats in an additional, smaller cluster on the long arm of chromosome 2 [45, 46]. Telomeres, first named by Muller in 1938 but defined by McClintock in maize several years earlier [5], include tandemly repeated telomeric and subtelomeric sequences that protect against the frequent rearrangements that naturally occur at the ends of DNA molecules [47]. Finally, the maize genome includes thousands of simple sequence satellites and a few megatracts of trinucleotide repeats, namely, AGT and AGC [48]. Nontransposon-related genes, pseudogenes, and miRNAs constitute the rest of the maize genome, approximately 5% of the total [49]. While it is difficult to estimate the total number of genes accurately because the current B73 physical assembly is incomplete, it has recently been estimated at approximately 32,000 genes, classified in 11,892 families, plus a total of 150 loci encoding miRNA [10]. However, syntenic arrangements of genes are not necessarily conserved across individuals within the Zea genus, as we will discuss below.

3. Intraspecific Diversity of Maize

Early cytogenetic studies showed considerable line-specific differences in heterochromatin (C) banding and in heterochromatic knob distribution.
Supernumerary chromosomes, or B chromosomes, were also found in some maize and teosinte lines [50–54]. These cytogenetic differences have been positively correlated with differences in DNA content [55–57]. Using Southern hybridization, Rivin et al. [41] found that copy numbers of tandemly repeated sequences such as ribosomal DNA and knob repeats varied among North American genotypes by as much as two- to threefold. Intraspecific variations of as much as 38.8% from the average of 5.5 pg/2n nucleus have been reported in Zea mays [55, 56, 58–60]. More recently, sequencing data have demonstrated that the maize genome exhibits variable levels of naturally occurring genetic diversity, depending on the lines involved in the comparison [49, 61]. On average, the frequency of single nucleotide polymorphisms between two maize inbreds is approximately 1 substitution per 100 bases [62, 63]. This level of intraspecies polymorphism is striking when compared to mammals: it is 10 times higher than that observed between humans and also higher than that observed between humans and chimpanzees [64]. Maize seems to tolerate large increases in DNA content per nucleus without a noticeable effect on plant phenotype. In maize, the most significant recent contributions to genome size have come from LTR retrotransposons, and their number and distribution have been shown to vary considerably among haplotypes [65, 66]. Copy number variation has been found in tandem repeats at centromeres, knobs, and rDNA loci [43, 67, 68]. Major attention has recently been given to regions with copy number variation (CNV) and presence-absence variation (PAV). With the improvement of genome-wide hybridization technologies and the increasing amount of sequence information from multiple maize lines generated by next-generation technologies, it is becoming clear that CNVs and PAVs have a major role in the diversity of the maize genome and, potentially, in its heterosis. Springer et al.
[69] analyzed the structural variation present between the genomes of the inbred lines B73 and Mo17 using comparative genomic hybridization (CGH). This study revealed megabase-size B73 regions that were absent in Mo17. Using PCR analysis in 22 additional lines, the authors identified a 2 Mb region on chromosome 6 that was present or absent as a single haplotype block. Beló et al. [70] used an expression array to perform CGH analysis on 3 North American maize inbreds against the reference inbred B73 and found a total of 2,09 potential CNVs; the authors screened a subset of 15 CNV loci via PCR and confirmed that 12 loci (80%) were true insertion/deletion events. Two of the CNV regions were shown to be at least hundreds of kilobases long, with the remaining validated CNVs being shorter than 10 kb. Swanson-Wagner et al. [71] used array-based CGH to compare the content and copy number variation of 32,500 genes among 9 diverse maize inbred lines and 4 teosinte accessions, relative to the B73 reference genome. They found variation in about 10% of the targets, with 479 genes showing higher copy number and 3,40 genes showing lower copy number, or absence, relative to B73. Most genes in the lower group were single copy in B73 and were therefore considered PAVs. A number of genes showed higher copy number in some lines and lower copy number or absence in others. Interestingly, the majority of these polymorphisms predate the origin of maize from teosinte.

4. Mechanisms of Maize Genome Evolution and Diversity

Five major mechanisms affect the evolution of the maize genome and the generation of intraspecific genome diversity: (1) whole genome duplications (polyploidization) and segmental duplications, (2) DNA transposition and retrotransposition, (3) capture and translocation of genes or gene segments by transposons, (4) recombination and gene conversion events, and (5) single base mutations and expansion/contraction of simple sequence repeats (SSRs).
These mechanisms are described below; they add to the genome diversity generated by gene flow between maize populations and by introgression between maize and related species (teosinte) [49, 72].
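As a side calculation, the flow-cytometry figure quoted in Section 3 (an average of 5.5 pg per 2n nucleus) can be checked against the 2.3 to 2.7 Gb size range given in Section 2. The sketch below assumes the commonly used conversion of 1 pg of DNA to roughly 978 Mbp, which is not stated in this review; the function name `pg_2c_to_gb_1c` is illustrative.

```python
# Assumption (not from the review): 1 pg of double-stranded DNA ~ 978 Mbp.
MBP_PER_PG = 978.0


def pg_2c_to_gb_1c(pg_per_2n_nucleus: float) -> float:
    """Convert DNA content of a 2n nucleus (pg) to haploid genome size (Gb)."""
    pg_per_haploid_genome = pg_per_2n_nucleus / 2.0
    mbp = pg_per_haploid_genome * MBP_PER_PG
    return mbp / 1000.0  # Mbp -> Gbp


if __name__ == "__main__":
    # 5.5 pg/2n is the average value quoted in the text.
    print(f"{pg_2c_to_gb_1c(5.5):.2f} Gb per haploid genome")
```

Under these assumptions, 5.5 pg per 2n nucleus corresponds to a haploid genome of about 2.69 Gb, at the upper end of the range quoted in Section 2.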

4.1. Duplication and Polyploidization. As in other cereals and plants in general, polyploidization has played an important role in the evolution of the maize genome [73, 74]. Evidence for both segmental duplications and whole genome duplication by wide crosses was initially found in linkage and comparative genetic analyses, which showed extensive chromosome duplications in maize [75]. More recently, comparative sequence information has supported the idea that the maize genome has undergone at least two polyploidization events. In the first event, approximately 70 to 80 Myr ago, a common ancestor of the cereals underwent whole genome duplication, followed by gene loss. More than 68% of the duplicated genes from this event that are currently collinear between rice and sorghum retain only one copy. However, 99% of these genes are orthologous between the two species, suggesting that early gene loss predated the divergence among the cereals [76]. Genes have been preferentially removed from one of the homologs, a process called biased fractionation [77]. The second polyploidization event in maize occurred from 5 to 12 Myr ago, after the divergence from the last common ancestor with sorghum. Two progenitors of maize hybridized at some point between 4.8 and 11.9 Myr ago [78–80], giving rise to a tetraploid; large-scale loss and movement of duplicated genes (up to 50%) and chromosomal rearrangements followed and eventually returned the genome to diploid behavior [81, 82]. The number of maize B73 high-confidence protein-coding genes predicted under high stringency is 32,540, higher than similar estimates for Brachypodium (25,532), rice (29,77), or sorghum (27,640) [10, 17, 18, 83]. This number is likely to be an underestimate because of the genic content missing from the current physical assembly.
While stable tetraploid maize varieties have been reported [68], the most important effect of polyploidization in modern maize is the redundancy that the early polyploidization generated, with the subsequent relaxation of selective constraints.

4.2. Retrotransposition and the Expansion of the Maize Genome. While maize and sorghum, close relatives within the Andropogoneae tribe, share the same number of chromosomes, the maize genome is approximately 3 times the size of the sorghum genome (800 Mbp). The secondary polyploidization described above accounts for only part of this difference. The overall size of the maize genome, and its intergenic distances, have expanded dramatically because of LTR retrotransposition within the last 10 Myr. In grasses, the proportion of LTR retrotransposons is correlated with genome size, while the proportion of Class II transposons remains constant (see Table 1). The small genomes of Brachypodium and rice have retrotransposon contents of 23.3% and 25.8%, respectively, compared to 54.5% in sorghum and 75.9% in maize [84]. The high abundance and nonrandom distribution of LTR retroelements in maize was one of the early observations made as sequence information started accumulating [85–88]. Because the two long terminal repeats of an element are identical at the time of transposition, the mutations that have since accumulated between them allow the insertion to be dated [89]. Initial studies of nested retroelements found within the adh1 region indicated that a massive retrotransposition event had occurred in the last 3 Myr [65, 66, 89]. Liu et al. [90] investigated the insertion dynamics of LTR retrotransposons in gene-free and gene-containing BACs and identified two peaks of amplification in gene-free areas, the first around 1.5–2 Mya and a more recent one within the last 500,000 years. They found only one peak of amplification in gene-containing regions, within the last 1 Myr.
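The dating approach mentioned above can be sketched as follows. Because the two LTRs of an element are identical at insertion, their divergence K (substitutions per site) gives an approximate age T = K / (2r), where r is the substitution rate per site per year. This is a minimal sketch, not the method of the cited studies: the Jukes-Cantor correction, the function names, and the default rate of 1.3e-8 substitutions per site per year are assumptions chosen for illustration, and the two LTRs are assumed to be already aligned.

```python
import math


def p_distance(ltr_5prime: str, ltr_3prime: str) -> float:
    """Fraction of mismatched sites between two aligned LTRs (gap columns skipped)."""
    if len(ltr_5prime) != len(ltr_3prime):
        raise ValueError("LTR sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr_5prime.upper(), ltr_3prime.upper()):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K (substitutions per site)."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(ltr_5prime: str, ltr_3prime: str,
                        rate_per_site_per_year: float = 1.3e-8) -> float:
    """Estimate insertion age as T = K / (2r); the default rate is an assumption."""
    k = jukes_cantor(p_distance(ltr_5prime, ltr_3prime))
    return k / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Hypothetical 1,000-bp LTR pair differing at 26 sites.
    left = "A" * 1000
    right = "C" * 26 + "A" * 974
    print(f"{insertion_age_years(left, right):.3g} years")
```

Under these assumptions, two 1,000-bp LTRs differing at 26 sites give an age of roughly 1 Myr; a different rate assumption rescales the estimate proportionally.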
Because LTR retrotransposition proceeds via an RNA intermediate and leaves the original element in place, each transposition adds a copy, which explains how this selfish DNA has colonized and expanded the maize genome.

4.3. Mechanisms for Genome Decrease. Both unequal and illegitimate recombination are important mechanisms that may counteract the expanding effects of LTR retrotransposition [91]. Unequal homologous recombination within a chromosome (i.e., intrastrand) between larger (>50 bp) direct repeats, in this case adjacent LTRs, is proposed to generate a solo LTR, with net deletion of the internal sequence plus one LTR sequence [92, 93]. Unequal crossover between homologous LTR sequences at distinct chromosomal locations can have more striking results, including reciprocal deletion and duplication events, inversions, and reciprocal translocations [94, 95]. By comparison, illegitimate recombination can occur between shorter stretches of homology than unequal homologous recombination and is proposed to be responsible for the creation of numerous internal deletions and truncated LTR retrotransposons [96]. This form of recombination is presumed to occur via nonhomologous end joining or slip-strand mispairing, which, in turn, leads to DNA loss [96, 97]. All three of these mechanisms are proposed to counteract genome expansion in plants, which is primarily driven by increases in ploidy or amplification of repetitive DNA [94].

4.4. Transposons and Genetic Colinearity. Intraspecific genome variation has long been attributed to changes in the size of heterochromatic DNA outside coding sequences that contracted or expanded the chromosomes [98]. However, violations of gene microcolinearity have been found in multiple locations since the first report by Fu and Dooner [99].
These authors sequenced 230-kb and 0-kb BAC contigs flanking the bz locus in the North American inbred lines B73 and McC, respectively, and found extensive differences in the content and position of intergenic retrotransposons. More remarkably, out of 10 genes clustered in the McC sequence, 4 were absent in B73. Further sequence analysis of the bz locus showed considerable variation in other maize lines, with only 25% to 84% of sequences shared [100]. Similar polymorphisms for the presence/absence of genic sequences have been found in different chromosome locations [101, 102]. Helitrons have been associated with intraspecific violation of genetic colinearity in maize. The role of Helitrons in genome variation in maize was first reported by Lai et al. [103], using comparative bioinformatics analysis of the bz

Table 1: Distribution and proportion of features in the annotated genomes of 4 grasses. The table compares major genomic element types between 4 published references. N/A indicates that these data were not available.

Columns (copies and % of genome for each reference sequence): Brachypodium (Bd21), rice (indica + japonica), sorghum (BTx623), maize (B73).
Rows: Class I (retroelements): LTR Ty1/copia, LTR Ty3/gypsy, LTR other, non-LTR (LINE), non-LTR (SINE), other Class I. Class II (DNA transposons): CACTA, hAT, Mutator, Tc1/mariner, PIF/Harbinger, MULE, MITE (Stowaway), MITE (Tourist), Helitron, other Class II. Protein-encoding genes.
[Table body: cell values are not recoverable from the extracted text.]

region in McC and B73 to reveal the presence/absence of two Helitrons, HelA and HelB, which account for all of the allelic variation at this locus. Unlike other Class II TEs, Helitrons are not flanked by terminal inverted repeats and do not generate TSDs. They have an 8–25 bp sequence able to form a hairpin near the 3′ end and preferentially insert into AT dinucleotides [104, 105]. A more extensive study comparing the inbred lines B73 and Mo17 [106] suggested that a large proportion of the differential insertions between the two genomes could be attributed to Helitron sequences. While most genes reported within Helitrons seem to be truncated versions of their progenitor genes [107], the maize CYP72A27-Zm gene represents a full cytochrome P450 monooxygenase (P450) gene recently captured by a Helitron and transposed into an Opie-2 retrotransposon [108]. Complete Helitron elements are widespread in the genome. One study identified 1,930 intact Helitrons belonging to 8 families and more than 20,000 Helitron fragments [109, 110]. Another study identified 2,791 nonautonomous Helitron elements [111]. The majority of the elements identified thus far are nonautonomous Helitrons containing chimeric segments derived from multiple genes, although the complete sequence of a single maize inbred line provides only a biased view of the extent and diversity of gene capture, transposition, and amplification by Helitrons [112]. In addition to Helitrons, the Mutator superfamily includes nonautonomous elements, called Pack-MULEs, that can capture segments of nuclear genes, which can be arranged in chimeras [113, 114]. Further, molecular evidence has revealed that these novel chimeras can be both transcribed and translated, which suggests that gene fragment capture by nonautonomous elements can produce, and evolve into, novel protein-coding sequences [113, 114].
Much like Helitrons, Pack-MULE transposition and amplification can lead to deviations in intraspecific synteny, and recent research has shown that Pack-MULEs preferentially capture GC-rich genomic segments and show biased insertion into the 5′ end of coding regions [115]. Pack-MULE elements possess sequences that are associated with small RNAs and can influence the expression profile of the captured genic sequences. Given their insertional bias towards the beginning of the transcribed region, these elements can also have significant effects on the expression of the endogenous genes into which they insert [115].

5. Future Prospects

The discovery of the colinearity of the maize and other grass genomes, dating back to million years ago, was a major breakthrough in comparative genomics within the Poaceae and helped in the identification of genes, gene families, and duplication events and in the characterization of the structural variation of the genomes [24, 116, 117]. The accumulation of evidence pointing to high structural polymorphism and significant sequence diversity among maize inbreds, however, highlights a serious limitation in the use of a single reference genome as a sufficient representative of a species. Thus, in order to capture the range of sequence diversity (i.e., single nucleotide polymorphisms and small indels) as well as larger structural variations (i.e., CNVs, PAVs, and large indels), deep resequencing and assembly of the gene space across a range of diverse inbreds is necessary to create a pan-genome [118]. While the B73 reference assembly in its current state has proven invaluable for annotation and basic research into the organization and diversity of gene (and repeat) content, there are estimates that upward of 10% of the genic content is missing from the current version [10, 29].
The complexity of the maize genome, combined with the available resequencing strategies (either Sanger or next-generation sequencing), prevents the creation of a completed physical assembly similar to those of Arabidopsis and rice, whose genomes are roughly an order of magnitude smaller. With the advent of next-generation sequencing platforms, it is now feasible to rapidly generate the full sequence content of an inbred at high coverage, but de novo assembly of these short reads will not create additional finished assemblies of quality comparable to B73 within maize. Comparative genomic hybridization (CGH) offers a rapid and inexpensive way to look at structural variation, and alignment of WGS reads to the B73 reference assembly will produce an equivalent digital CGH at a reasonable cost. However, both of these strategies will have a strong B73-centric bias for the foreseeable future. Thus, until sequencing technology significantly increases read lengths to support robust genome assembly from whole genome shotgun (WGS) projects, a large amount of diverse sequence from the Zea genus will remain as small, anonymous assemblies that will have to be positioned by laborious or inexact methods including, among others, BAC library screening, syntenic comparisons to sorghum, oat-maize addition line screening, and genetic mapping via GBS on segregating populations. However, these small, genic assemblies from deep resequencing projects can be annotated structurally and functionally in a manner similar to the EST and FL-cDNA projects that were initiated before significant reference assemblies were created in the past two decades. Once large contigs of physical sequence are available, these small genic assemblies can be easily integrated.
Therefore, genic diversity in maize can be collected and analyzed in detail for individual inbreds using deep resequencing and assembly, and, once upcoming sequencing platforms allow rapid de novo assembly, these assemblies can be easily organized on the physical map.

References

[1] E. H. Coe, The origins of maize genetics, Nature Reviews Genetics, vol. 2, no., pp , 2001.
[2] J. L. Bennetzen, The future of Maize, in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany,
[3] R. A. Emerson, G. W. Beadle, and A. C. Fraser, A summary of linkage studies in maize, Cornell University Agricultural Experimental Station Memoir, vol. 80, pp. 83, 1935.
[4] H. B. Creighton and B. McClintock, A correlation of cytological and genetical crossing-over in Zea mays, Proceedings

of the National Academy of Sciences of the United States of America, vol. 7, no. 8, pp , 1931.
[5] B. McClintock, The order of genes C, Sh, and Wx in Zea mays with reference to a cytological known point on the chromosome, Proceedings of the National Academy of Sciences of the United States of America, vol. 7, no. 8, pp , 1931.
[6] B. McClintock, The origin and behavior of mutable loci in maize, Proceedings of the National Academy of Sciences of the United States of America, vol. 36, no. 6, pp , 1950.
[7] R. A. Brink, A genetic change associated with the R locus in maize which is directed and potentially reversible, Genetics, vol. 4, no. 6, pp , 1956.
[8] V. L. Chandler, W. B. Eggleston, and J. E. Dorweiler, Paramutation in maize, Plant Molecular Biology, vol. 43, no. 2-3, pp. 2 45,
[9] J. L. Bennetzen, Maize genome structure and evolution, in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany,
[10] P. S. Schnable, D. Ware, R. S. Fulton et al., The B73 maize genome: complexity, diversity, and dynamics, Science, vol. 326, no. 5956, pp. 2 5,
[11] S. J. Emrich, W. B. Barbazuk, L. Li, and P. S. Schnable, Gene discovery and annotation using LCM-454 transcriptome sequencing, Genome Research, vol. 7, no., pp ,
[12] M. A. Gore, J. M. Chia, R. J. Elshire et al., A first-generation haplotype map of maize, Science, vol. 326, no. 5956, pp. 5 7,
[13] J. P. Vielle-Calzada, O. M. De La Vega, G. Hernández-Guzmán et al., The palomero genome suggests metal effects on domestication, Science, vol. 326, no. 5956, p. 078,
[14] J. Lai, R. Li, X. Xu et al., Genome-wide patterns of genetic variation among elite maize inbred lines, Nature Genetics, vol. 42, no., pp , 2010.
[15] X. Wang, H. Tang, and A. H. Paterson, Seventy million years of concerted evolution of a homoeologous chromosome pair, in parallel, in major poaceae lineages, Plant Cell, vol. 23, no., pp , 2011.
[16] T. Sasaki, The map-based sequence of the rice genome, Nature, vol. 436, no. 7052, pp ,
[17] A. H. Paterson, J. E. Bowers, R. Bruggmann et al., The Sorghum bicolor genome and the diversification of grasses, Nature, vol. 457, no. 7229, pp ,
[18] J. P. Vogel, D. F. Garvin, T. C. Mockler et al., Genome sequencing and analysis of the model grass Brachypodium distachyon, Nature, vol. 463, no. 7282, pp , 2010.
[19] A. N. Doust, E. A. Kellogg, K. M. Devos, and J. L. Bennetzen, Foxtail millet: a sequence-driven grass model system, Plant Physiology, vol. 49, no., pp. 37 4,
[20] A. L. Rayburn, D. B. Biradar, D. G. Bullock, and L. M. McMurphy, Nuclear DNA content in F1 hybrids of maize, Heredity, vol. 70, pp , 1993.
[21] S. Zhou, F. Wei, J. Nguyen et al., A single molecule scaffold for the maize genome, PLoS Genetics, vol. 5, no., Article ID e0007,
[22] W. J. Peacock, E. S. Dennis, M. M. Rhoades, and A. J. Pryor, Highly repeated DNA sequence limited to knob heterochromatin in maize, Proceedings of the National Academy of Sciences of the United States of America, vol. 78, no. 7, pp , 1981.
[23] E. V. Ananiev, R. L. Phillips, and H. W. Rines, Chromosome-specific molecular organization of maize (Zea mays L.) centromeric regions, Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 22, pp , 1998.
[24] M. Morgante, Plant genome organisation and diversity: the year of the junk!, Current Opinion in Biotechnology, vol. 7, no. 2, pp ,
[25] C. Feschotte, N. Jiang, and S. R. Wessler, Plant transposable elements: where genetics meets genomics, Nature Reviews Genetics, vol. 3, no. 5, pp ,
[26] V. V. Kapitonov and J. Jurka, Rolling-circle transposons in eukaryotes, Proceedings of the National Academy of Sciences of the United States of America, vol. 98, no. 5, pp , 2001.
[27] P. J. SanMiguel and C. Vitte, The LTR-retrotransposons of maize, in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany,
[28] J. B. Hollick and N.
Springer, Epigenetic phenomena and epigenomics in maize, in Epigenomics, A. C. Ferguson-Smith, J. M. Greally, and R. A. Martienssen, Eds., pp. 9 47, Springer, Berlin, Germany,
[29] R. S. Baucom, J. C. Estill, C. Chaparro et al., Exceptional diversity, non-random distribution, and rapid evolution of retroelements in the B73 maize genome, PLoS Genetics, vol. 5, no., Article ID e000732,
[30] B. McClintock, Mutation in maize, Carnegie Institution of Washington Yearbook, vol. 52, pp , 1953.
[31] T. Wicker, F. Sabot, A. Hua-Van et al., A unified classification system for eukaryotic transposable elements, Nature Reviews Genetics, vol. 8, no. 2, pp ,
[32] D. Lisch and N. Jiang, Mutator and MULE transposons, in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany,
[33] V. L. Chandler and V. Walbot, DNA modification of a maize transposable element correlates with loss of activity, Proceedings of the National Academy of Sciences of the United States of America, vol. 83, no. 6, pp , 1986.
[34] R. A. Martienssen and V. Colot, DNA methylation and epigenetic inheritance in plants and filamentous fungi, Science, vol. 293, no. 5532, pp , 2001.
[35] Y. Jia, D. R. Lisch, K. Ohtsu, M. J. Scanlon, D. Nettleton, and P. S. Schnable, Loss of RNA-dependent RNA polymerase 2 (RDR2) function causes widespread and unexpected changes in the expression of transposons, genes, and 24-nt small RNAs, PLoS Genetics, vol. 5, no., Article ID e000737,
[36] J. A. Birchler and F. Han, Maize centromeres: structure, function, epigenetics, Annual Review of Genetics, vol. 43, pp ,
[37] R. K. Dawe, Maize centromeres and knobs (neocentromeres), in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany,
[38] T. K. Wolfgruber, A. Sharma, K. L.
Schneider et al., Maize centromere structure and evolution: sequence analysis of centromeres 2 and 5 reveals dynamic loci shaped primarily by retrotransposons, PLoS Genetics, vol. 5, no., Article ID e000743, [39]B.T.Page,M.K.Wanous,andJ.A.Birchler, Characterization of a maize chromosome 4 centromeric sequence: evidence for an evolutionary relationship with the B chromosome centromere, Genetics, vol. 59, no., pp , 200.

52 8 Journal of Botany [40] B. McClintock, Genetic and cytological studies of maize, Carnegie Institution of Washington Yearbook, vol. 58, pp , 959. [4] C. J. Rivin, C. A. Cullis, and V. Walbot, Evaluating quantitative variation in the genome of Zea mays, Genetics, vol. 3, no. 4, pp , 986. [42] E. V. Ananiev, R. L. Phillips, and H. W. Rines, Complex structure of knob DNA on maize chromosome 9: retrotransposon invasion into heterochromatin, Genetics, vol. 49, no. 4, pp , 998. [43] E. V. Ananiev, R. L. Phillips, and H. W. Rines, A knobassociated tandem repeat in maize capable of forming foldback DNA segments: are chromosome knobs megatransposons? Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 8, pp , 998. [44] M. D. McMullen, B. Hunter, R. L. Phillips, and I. Rubenstein, The structure of the maize ribosomal DNA spacer region, Nucleic Acids Research, vol. 4, no. 2, pp , 986. [45] P. J. Buescher, R. L. Phillips, and R. Brambi, Ribosomal RNA contents of maize genotypes with different ribosomal RNA gene numbers, Biochemical Genetics, vol. 22, no. 9-0, pp , 984. [46] L. Li and K. Arumuganathan, Physical mapping of 45S and 5S rdna on maize metaphase and sorted chromosomes by FISH, Hereditas, vol. 34, no. 2, pp. 4 45, 200. [47] J. Li, F. Yang, J. Zhu, S. He, and L. Li, Characterization of a tandemly repeated subtelomeric sequence with inverted telomere repeats in maize, Genome, vol. 52, no. 3, pp , [48] E. V. Ananiev, M. A. Chamberlin, J. Klaiber, and S. Svitashev, Microsatellite megatracts in the maize (Zea mays L.) genome, Genome, vol. 48, no. 6, pp , [49] A. Rafalski and E. Ananiev, Genetic diversity, linkage disequilibrium and association mapping, in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany, [50] W. L. Brown, Numbers and distribution of chromosome knobs in United States maize, Genetics, vol. 34, no. 5, pp , 949. [5] B. McClintock, A. Kato, and A. 
Blumenschein, Chromosome Constitution of Races of Maize. Its Significance in the Interpretation of Relationships between Races and Varieties in the Americas, Colegio de Postgraduados, Chapingo, Mexico, 98. [52] B. McClintock, Mechanisms that rapidly reorganize the genome, Stadler Genetics Symposium, vol. 0, pp , 978. [53] B. McClintock, Significance of chromosome constitutions in tracing the origin and migration of races of maize in the Americas, in Maize Breeding and Genetics,D.B.Walden, Ed., pp , John Wiley & Sons, New York, NY, USA, 978. [54] W. R. Carlson, The B chromosome in maize, in Handbook of Maize: Genetics and Genomics,J.L.BennetzenandS.Hake, Eds., pp , Springer, Berlin, Germany, [55] A. L. Rayburn, H. J. Price, J. D. Smith, and J. R. Gold, C-band heterochromatin and DNA content in Zea mays, American Journal of Botany, vol. 72, no. 0, pp , 985. [56] A. L. Rayburn, Flow cytometric assessment of nucleotide variability and its evolutionary implications, in Classical and Molecular Cytogenetic Analysis, W.J. Raup and B.S. Gill, Eds., pp. 0 5, Kansas Agricultural Experimental Station, Manhattan, Kan, USA, 994. [57] C. M. Tito, L. Poggio, and C. A. Naranjo, Cytogenetic studies in the genus Zea. 3. DNA content and heterochromatin in species and hybrids, Theoretical and Applied Genetics, vol. 83, no., pp , 99. [58] J. H. Lee, K. Arumuganathan, S. M. Kaeppler et al., Variability of chromosomal DNA contents in maize (Zea mays L.) inbred and hybrid lines, Planta, vol. 25, no. 4, pp , [59] D. A. Laurie and M. D. Bennet, Nuclear DNA content in the genera Zea and Sorghum. Intergeneric, interspecific and intraspecific variation, Heredity, vol. 55, no. 3, pp , 985. [60] D. P. Biradar and A. L. Rayburn, Heterosis and nuclear DNA content in maize, Heredity, vol. 7, no. 3, pp , 993. [6] A. Rafalski and M. Morgante, Corn and humans: recombination and linkage disequilibrium in two genomes of similar size, Trends in Genetics, vol. 20, no. 2, pp. 03, [62] M. I. Tenaillon, M. C. 
Sawkins, A. D. Long, R. L. Gaut, J. F. Doebley, and B. S. Gaut, Patterns of DNA sequence polymorphism along chromosome of maize (Zea mays ssp. mays L.), Proceedings of the National Academy of Sciences of the United States of America, vol. 98, no. 6, pp , 200. [63] A. Ching, K. S. Caldwell, M. Jung et al., SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines, BMC Genetics, vol. 3, article no. 9, [64] E. S. Buckler, B. S. Gaut, and M. D. McMullen, Molecular and functional diversity of maize, Current Opinion in Plant Biology, vol. 9, no. 2, pp , [65] P. SanMiguel, A. Tikhonov, Y. K. Jin et al., Nested retrotransposons in the intergenic regions of the maize genome, Science, vol. 274, no. 5288, pp , 996. [66] P. SanMiguel and J. L. Bennetzen, Evidence that a recent increase in maize genome size was caused by the massive amplification of intergene retrotransposons, Annals of Botany, vol. 82, pp , 998. [67] A. Kato, J. C. Lamb, and J. A. Birchler, Chromosome painting using repetitive DNA sequences as probes for somatic chromosome identification in maize, Proceedings of the National Academy of Sciences of the United States of America, vol. 0, no. 37, pp , [68] J. A. Birchler and H. W. Bass, Cytogenetics and chromosomal structural diversity, in Handbook of Maize: Genetics and Genomics, J. L. Bennetzen and S. Hake, Eds., pp , Springer, Berlin, Germany, [69] N. M. Springer, K. Ying, Y. Fu et al., Maize inbreds exhibit high levels of copy number variation (CNV) and presence/absence variation (PAV) in genome content, PLoS Genetics, vol. 5, no., Article ID e000734, [70] A. Beló, M. K. Beatty, D. Hondred, K. A. Fengler, B. Li, and A. Rafalski, Allelic genome structural variations in maize detected by array comparative genome hybridization, Theoretical and Applied Genetics, vol. 20, no. 2, pp , 200. [7] R. A. Swanson-Wagner, S. R. Eichten, S. 
Kumari et al., Pervasive gene content variation and copy number variation in maize and its undomesticated progenitor, Genome Research, vol. 20, no. 2, pp , 200. [72] J. Doebley, Molecular evidence for gene flow among Zea species, Bioscience, vol. 40, no. 6, pp , 990. [73] J. F. Wendel, Genome evolution in polyploids, Plant Molecular Biology, vol. 42, no., pp , [74] J. Messing, The polyploid origin of maize, in Handbook of Maize: Genetics and Genomics, J.L.BennetzenandS.Hake, Eds., pp , Springer, Berlin, Germany, 2009.

53 Journal of Botany 9 [75] S. Ahn and S. D. Tanksley, Comparative linkage maps of the rice and maize genomes, Proceedings of the National Academy of Sciences of the United States of America, vol. 90, no. 7, pp , 993. [76] A.H.Paterson,J.E.Bowers,andB.A.Chapman, Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics, Proceedings of the National Academy of Sciences of the United States of America, vol. 0, no. 26, pp , [77] M. R. Woodhouse, J. C. Schnable, B. S. Pedersen et al., Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homologs, PLoS Biology, vol. 8, no. 6, Article ID e000409, 200. [78] J. Messing, A. K. Bharti, W. M. Karlowski et al., Sequence composition and genome organization of maize, Proceedings of the National Academy of Sciences of the United States of America, vol. 0, no. 40, pp , [79] Z. Swigonova, J. Lai, J. Ma et al., On the tetraploid origin of the maize genome, Comparative and Functional Genomics, vol. 5, no. 3, pp , [80] Z. Swigoňová, J. Lai, J. Ma et al., Close split of sorghum and maize genome progenitors, Genome Research, vol. 4, no. 0A, pp , [8] R. Song, V. Llaca, and J. Messing, Mosaic organization of orthologous sequences in grass genomes, Genome Research, vol. 2, no. 0, pp , [82] R. Bruggmann, A. K. Bharti, H. Gundlach et al., Uneven chromosome contraction and expansion in the maize genome, Genome Research, vol. 6, no. 0, pp , [83] T. Tanaka, B. A. Antonio, S. Kikuchi et al., The Rice Annotation Project Database (RAP-DB): 2008 update, Nucleic Acids Research, vol. 36, supplement, pp. D028 D033, [84] K. M. Devos, Grass genome organization and evolution, Current Opinion in Plant Biology, vol. 3, no. 2, pp , 200. [85] B. C. Meyers, S. V. Tingey, and M. Morgante, Abundance, distribution, and transcriptional activity of repetitive elements in the maize genome, Genome Research, vol., no. 0, pp , 200. [86] G. Haberer, S. Young, A. K. 
Bharti et al., Structure and architecture of the maize genome, Plant Physiology, vol. 39, no. 4, pp , [87] F. Wei, E. Coe, W. Nelson et al., Physical and genetic structure of the maize genome reflects its complex evolutionary history., PLoS Genetics, vol. 3, no. 7, article e23, [88] F. Wei, J. Zhang, S. Zhou et al., The physical and genetic framework of the maize B73 genome, PLoS Genetics, vol. 5, no., Article ID e00075, [89] P. SanMiguel, B. S. Gaut, A. Tikhonov, Y. Nakajima, and J. L. Bennetzen, The paleontology of intergene retrotransposons of maize, Nature Genetics, vol. 20, no., pp , 998. [90] R. Liu, C. Vitte, J. Ma et al., A GeneTrek analysis of the maize genome, Proceedings of the National Academy of Sciences of the United States of America, vol. 04, no. 28, pp , [9] J. L. Bennetzen, J. Ma, and K. M. Devos, Mechanisms of recent genome size variation in flowering plants, Annals of Botany, vol. 95, no., pp , [92] Z. Tian, C. Rizzon, J. Du et al., Do genetic recombination and gene density shape the pattern of DNA elimination in rice long terminal repeat retrotransposons? Genome Research, vol. 9, no. 2, pp , [93] J. Ma and J. L. Bennetzen, Recombination, rearrangement, reshuffling, and divergence in a centromeric region of rice, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 2, pp , [94] C. Vitte and J. L. Bennetzen, Analysis of retrotransposon structural diversity uncovers properties and propensities in angiosperm genome evolution, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 47, pp , [95] D. J. Garfinkel, Genome evolution mediated by Ty elements in Saccharomyces, Cytogenetic and Genome Research, vol. 0, no. 4, pp , [96] J. Ma, K. M. Devos, and J. L. Bennetzen, Analyses of LTRretrotransposon structures reveal recent and rapid genomic DNA loss in rice, Genome Research, vol. 4, no. 5, pp , [97] K. M. Devos, J. K. M. Brown, and J. L. 
Bennetzen, Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis, Genome Research, vol. 2, no. 7, pp , [98] N. Jiang, A. A. Ferguson, R. K. Slotkin, and D. Lisch, Pack- Mutator-like transposable elements (Pack-MULEs) induce directional modification of genes through biased insertion and DNA acquisition, Proceedings of the National Academy of Sciences of the United States of America, vol. 08, no. 4, pp , 20. [99] V. Llaca and J. Messing, Amplicons of maize zein genes are conserved within genic but expanded and constricted in intergenic regions, Plant Journal, vol. 5, no. 2, pp , 998. [00] H. Fu and H. K. Dooner, Intraspecific violation of genetic colinearity and its implications in maize, Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 4, pp , [0] Q. Wang and H. K. Dooner, Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus, Proceedings of the National Academy of Sciences of the United States of America, vol. 03, no. 47, pp , [02] R. Song and J. Messing, Contiguous genomic DNA sequence comprising the 9-kD zein gene family from maize, Plant Physiology, vol. 30, no. 4, pp , [03] S. Brunner, K. Fengler, M. Morgante, S. Tingey, and A. Rafalski, Evolution of DNA sequence nonhomologies among maize inbreds, Plant Cell, vol. 7, no. 2, pp , [04] J. Lai, Y. Li, J. Messing, and H. K. Dooner, Gene movement by Helitron transposons contributes to the haplotype variability of maize, Proceedings of the National Academy of Sciences of the United States of America, vol. 02, no. 25, pp , [05] V. V. Kapitonov and J. Jurka, Rolling-circle transposons in eukaryotes, Proceedings of the National Academy of Sciences of the United States of America, vol. 98, no. 5, pp , 200. [06] S. Brunner, G. Pea, and A. Rafalski, Origins, genetic organization and transcription of a family of non-autonomous helitron elements in maize, Plant Journal, vol. 43, no. 
6, pp , [07] M. Morgante, S. Brunner, G. Pea, K. Fengler, A. Zuccolo, and A. Rafalski, Gene duplication and exon shuffling by helitron-like transposons generate intraspecies diversity in maize, Nature Genetics, vol. 37, no. 9, pp , [08] S. Gupta, A. Gallavotti, G. A. Stryker, R. J. Schmidt, and S. K. Lal, A novel class of Helitron-related transposable elements

54 0 Journal of Botany in maize contain portions of multiple pseudogenes, Plant Molecular Biology, vol. 57, no., pp. 5 27, [09] N. Jameson, N. Georgelis, E. Fouladbash, S. Martens, L. C. Hannah, and S. Lal, Helitron mediated amplification of cytochrome P450 monooxygenase gene in maize, Plant Molecular Biology, vol. 67, no. 3, pp , [0] L. Yang and J. L. Bennetzen, Distribution, diversity, evolution, and survival of Helitrons in the maize genome, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 47, pp , [] L. Yang and J. L. Bennetzen, Structure-based discovery and description of plant and animal Helitrons, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 3, pp , [2] C. Du, N. Fefelova, J. Caronna, L. He, and H. K. Dooner, The polychromatic Helitron landscape of the maize genome, Proceedings of the National Academy of Sciences of the United States of America, vol. 06, no. 47, pp , [3] S. K. Lal, N. Georgelis, and L. C. Hannah, Helitrons: their impact on maize genome evolution and diversity, in Handbook of Maize: Genetics and Genomics, J.L.BennetzenandS. Hake, Eds., pp , Springer, Berlin, Germany, [4] N. Jiang, Z. Bao, X. Zhang, S. R. Eddy, and S. R. Wessler, Pack-MULE transposable elements mediate gene evolution in plants, Nature, vol. 43, no. 7008, pp , [5] K. Hanada, V. Vallejo, K. Nobuta et al., The functional role of pack-mules in rice inferred from purifying selection and expression profile, Plant Cell, vol. 2, no., pp , [6] J. Messing and V. Llaca, Importance of anchor genomes for any plant genome project, Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 5, pp , 998. [7] G. Moore, K. M. Devos, Z. Wang, and M. D. Gale, Grasses, line up and form a circle, Current Biology, vol. 5, no. 7, pp , 995. [8] M. Morgante, E. De Paoli, and S. Radovic, Transposable elements and the plant pan-genomes, Current Opinion in Plant Biology, vol. 
0, no. 2, pp , 2007.

Hindawi Publishing Corporation
Journal of Botany
Volume 2011, Article ID 570319, 9 pages
doi:10.1155/2011/570319

Research Article

Evolution of Genome Size in Duckweeds (Lemnaceae)

Wenqin Wang,1 Randall A. Kerstetter,1,2 and Todd P. Michael1,2

1 Rutgers, Department of Plant Biology and Pathology, The State University of New Jersey, The Waksman Institute of Microbiology, Piscataway, NJ 08854, USA
2 Monsanto Company, 800 North Lindbergh Boulevard, Creve Coeur, MO 63167, USA

Correspondence should be addressed to Todd P. Michael, todd.p.michael@monsanto.com

Received 4 February 2011; Revised 5 April 2011; Accepted 9 May 2011

Academic Editor: Johann Greilhuber

Copyright 2011 Wenqin Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

To estimate DNA content broadly across the family and to provide a basic reference for duckweed genome sequencing research, the nuclear DNA content of 115 different accessions of 23 duckweed species was measured by flow cytometry (FCM) with propidium iodide as the DNA stain. The C-value of DNA content in the duckweed family varied nearly thirteen-fold, ranging from 150 megabases (Mbp) in Spirodela polyrhiza to 1,881 Mbp in Wolffia arrhiza. DNA content increases continuously in the order Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia, which parallels a morphological reduction in size. There is significant intraspecific variation in the genus Lemna. However, no such variation was found in the other species studied with multiple accessions, from the genera Spirodela, Landoltia, Wolffiella, and Wolffia.

1. Introduction

The Lemnaceae, commonly known as duckweeds, are the smallest, fastest-growing, and simplest of flowering plants. In this globally distributed aquatic monocot family (Figure 1(a)), there are 33 species representing five genera: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia.
Among them, Spirodela is the most ancestral, while Wolffia is the most derived [1]. The individual plants range in size from 1.5 cm long (Spirodela polyrhiza) to less than one millimeter (Wolffia globosa). There is therefore a successive reduction of morphological structures in parallel with evolutionary advancement within the family (Figure 1(b)). Duckweeds are not simply miniature versions of larger angiosperms; they represent a highly modified structural organization that resulted from the alteration, simplification, or loss of many morphological and anatomical features [2]. The biomass doubling time of the fastest-growing duckweeds under optimal growth conditions is less than 30 hours, nearly twice as fast as other fast-growing flowering plants and more than double that of conventional crops [3].

Before the days of Arabidopsis, duckweeds, and more specifically Lemna, were an important model system for plant biology [4]. Because duckweeds are small, morphologically reduced (although with root and leaf-like structures), fast growing, easily cultivated under aseptic conditions (Figure 1(c)), transformable, crossable, and particularly suited to biochemical studies (through direct contact with the media), they are an ideal system for biological research [5]. Much of what we know about photoperiodic flowering responses comes from fundamental research conducted on Lemna by the preeminent plant biologist Dr. William Hillman at the Brookhaven National Laboratories [6]. Some of the current uses of Lemnaceae are a testimony to their scientific, commercial, and biomass utility: basic research and evolutionary model system [7], toxicity testing organism [8], biotech protein factories [9], wastewater remediation [10], high-protein animal feed, carbon cycling [5], and potential biofuel candidates [11]. The advent of high-throughput sequencing technologies has enabled a new generation of model plant systems [12].
In an effort to initiate duckweed genomic research, we endeavoured to identify species with small genomes that would be ideal for sequencing. First, we queried the Kew plant genome database and found that only 6 duckweed accessions had been measured, all by the Feulgen method [13, 14]. The DNA content of a single species from each genus had been determined, and the values differed markedly. Because it is laborious and time consuming, the Feulgen technique has waned in popularity. It has been largely replaced by flow cytometry (FCM) [15], a faster, easier, and more accurate method that is currently the preferred technique for genome size estimation and DNA ploidy analysis in plants [16]. In order to find the smallest duckweed genome for sequencing and also to explore previous observations about genome complexity in duckweeds, we estimated genome sizes in all five duckweed genera using FCM. These genome size measurements will form the foundation for future work on sequencing duckweed genomes and on establishing duckweeds as a model and applied system.

Figure 1: Duckweeds are small aquatic plants that are widely distributed in nature and amenable to culturing in the lab. (a) Duckweeds growing in the Raritan Canal River, Piscataway, NJ, USA. This population of duckweed includes Wolffia, Spirodela, and Lemna. (b) The relative size of Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia in the order of phylogeny as compared to an American quarter. (c) Sterile Spirodela polyrhiza grown in Schenk and Hildebrandt basal salt medium.

2. Materials and Methods

2.1. Plant Materials. 115 accessions of 23 duckweed species representing all 5 genera were measured in this study (Table S1 in Supplementary Materials available online at doi:10.1155/2011/570319, which provides details of sample collection and results of nuclear DNA content measurements for Lemnaceae). Elias Landolt collected most of the duckweed accessions described in this work over the past 50 years (Landolt Duckweed Collection) [2].
Accessions were obtained directly from Elias Landolt, from BIOLEX (NC, USA), or from The University of Toronto Culture Collection of Algae and Cyanobacteria (UTCC). The Landolt Duckweed Collection has since been moved to Rutgers University. Additional lines were collected from lakes and wastewater ponds by TPM and WW (NJ, USA). Plants were grown aseptically for 2 weeks in liquid culture medium containing Schenk and Hildebrandt Basal Salt mixture (Sigma, USA) at 1/2 of full concentration, under short-day growth conditions (8 h light and 16 h darkness at a constant temperature of 23°C). We bar-coded all the determined and undetermined species by identifying polymorphisms in the chloroplast atpF-atpH noncoding spacer [17].

2.2. Isolation and Staining of Nuclei. To estimate nuclear DNA content by flow cytometry (FCM), sample tissue nuclei were stained with propidium iodide (PI) [18]. Briefly, 10 mg of fresh duckweed tissue and the same amount of the internal standard were chopped simultaneously with new razor blades in isolation buffer in a plastic Petri dish [19]. Isolates were filtered through a 30-µm nylon mesh into an Eppendorf tube. The suspensions of nuclei were stained with 50 µg/mL PI mixed with 50 µg/mL RNase (R4875, Sigma). The samples were incubated on ice for a few minutes before measurement by FCM.

2.3. Analysis of Nuclear DNA Content by FCM. PI-stained nuclei were analyzed for DNA content with a Coulter Cytomics FC500 Flow Cytometer (Beckman Coulter, Inc., Miami, Florida, USA). In all experiments, the fluorescence of at least 3000 G1-phase nuclei was measured. The DNA content of each target sample was calculated by comparing its mean nuclear fluorescence with that of an internal standard (Figure 2(a)). To ensure accuracy, we used internal standards whose genome sizes closely match those of the duckweeds being measured. The internal standards were a Brachypodium distachyon line (Bd21, 300 Mbp) [6], Arabidopsis thaliana Columbia (At, 147 Mbp) [20], and Physcomitrella patens ssp. patens (Pp, 480 Mbp) [21]. The genome sizes in parentheses were determined with our own flow cytometry equipment and methods; these validated values are therefore not identical to, but very close to, those in the cited references.
Both duckweed and the internal standards contain very few secondary compounds, which can interfere with quantitative DNA staining. The absolute DNA content of a sample is calculated from the G1 peak means:

$$\text{Sample 1C DNA content} = \frac{\text{sample G1 peak mean}}{\text{standard G1 peak mean}} \times \text{standard 1C DNA content (Mbp)}. \tag{1}$$

At least three independent biological replicates of each sample were analyzed on different days to estimate the mean DNA content. The conversion factor from pg to Mbp is 1 pg = 978 Mbp [22].

2.4. Statistical Analysis. Data on intraspecific variation in genome size were analysed by ANOVA (single factor test). To test whether genome size variation was correlated with the geographic location or altitude of populations, the Spearman correlation coefficient (r) was used.

3. Results

3.1. Intra- and Interspecies Variations of Genome Sizes. The genome sizes of 115 accessions from 23 species representing 5 genera were estimated by FCM (Table S1). The DNA content estimates varied nearly thirteen-fold, ranging from 150 Mbp in Spirodela polyrhiza to 1,881 Mbp in Wolffia arrhiza. We superimposed the estimated C-value on a phylogenetic tree for Lemnaceae based on a combination of morphological, flavonoid, allozyme, and DNA sequence analyses [1] and found a continuous increase in DNA content in the order Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia, which correlates well with the morphological reduction within the family (Figures 3(a) and 3(b)).

In the genus Spirodela, we measured genome size for 34 accessions and found that the 1C DNA content varies only from 150 to 167 Mbp (Figure 3(a); Table S1). The analysis of variance (ANOVA: single factor test) revealed no significant difference among Spirodela polyrhiza genome sizes (P > 0.05). Similarly, the 1C DNA content of 9 accessions of Landoltia punctata, ranging from 372 to 397 Mbp, did not show significant variation (Figure 3(a); Table S1).
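The internal-standard calculation described in the methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the peak means in the usage lines are hypothetical, and only the 300 Mbp value for the Brachypodium standard and the factor of 978 Mbp per pg are taken from the text.

```python
from statistics import mean

PG_TO_MBP = 978.0  # 1 pg of DNA = 978 Mbp, the conversion factor cited in the text


def sample_dna_content_mbp(sample_g1_mean, standard_g1_mean, standard_mbp):
    """Scale the standard's 1C DNA content by the ratio of G1 peak means."""
    if sample_g1_mean <= 0 or standard_g1_mean <= 0 or standard_mbp <= 0:
        raise ValueError("peak means and standard DNA content must be positive")
    return sample_g1_mean / standard_g1_mean * standard_mbp


def mean_dna_content_mbp(peak_pairs, standard_mbp):
    """Average the estimate over replicate (sample mean, standard mean) pairs."""
    if not peak_pairs:
        raise ValueError("at least one replicate is required")
    return mean(
        sample_dna_content_mbp(sample, standard, standard_mbp)
        for sample, standard in peak_pairs
    )


def mbp_to_pg(mbp):
    """Convert a DNA content in Mbp to picograms."""
    return mbp / PG_TO_MBP


# Hypothetical G1 peak means (channel numbers) from three replicate runs,
# each measured against a 300 Mbp internal standard.
replicates = [(100.0, 200.0), (102.0, 200.0), (98.0, 200.0)]
estimate = mean_dna_content_mbp(replicates, standard_mbp=300.0)
print(f"{estimate:.1f} Mbp = {mbp_to_pg(estimate):.3f} pg")
```

Because the sample and the standard are chopped and stained together, only the ratio of the two peak means enters the estimate, so run-to-run differences in staining or instrument gain largely cancel.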
In the genus Wolffiella, the genome sizes range from 623 Mbp to 973 Mbp (Figure 3(a)), roughly 4 to 6 times as large as that of Spirodela polyrhiza. As in Spirodela polyrhiza and Landoltia punctata, there is no obvious intraspecific genome size variation in Wolffiella hyalina and Wolffiella lingulata. In the genus Wolffia, we measured 11 species and found that they have, on average, the largest genome sizes in the duckweed family (Figure 3(a)). A 5.3-fold difference was observed between Wolffia australiana (357 Mbp) and Wolffia arrhiza (1,881 Mbp).

In the genus Lemna, 7 species were investigated. There is a large amount of genome size variation in this genus. Lemna valdiviana has the smallest genome size (323 Mbp), while Lemna aequinoctialis has the largest (760 Mbp). Surprisingly, intraspecific genome size fluctuations are also substantial. For Lemna minor, 26 accessions have genome sizes ranging from 356 to 604 Mbp, an intraspecific difference in DNA content of up to 69.6%. We confirmed this intraspecific difference by randomly choosing 2 Lemna minor accessions and measuring both simultaneously (26.0% difference between Lm659 and Lm7436, Figure 2(b)). Statistical analyses revealed significant differences among the Lemna minor accessions (P < 0.01). Lemna aequinoctialis (79.2%) (Figure 2(c)), Lemna trisulca (59.0%), and Lemna japonica (40.8%) also show intraspecific differences, indicating drastically uneven intraspecific genome expansion in Lemna.

Figure 2: Flow cytometry (FCM) histograms (number of particles versus relative fluorescence) showing relative nuclear DNA content of duckweed. (a) Histogram showing relative DNA content of Spirodela polyrhiza (1, 151 Mbp) and internal standard Brachypodium distachyon Bd21 (2, 300 Mbp) based on relative PI fluorescence intensity (channel number). Linear PI fluorescence intensity of G1 nuclei was used for the calculation of DNA content (3500 particles were counted). (b) Difference in relative DNA content of two simultaneously measured Lemna minor accessions (2, Lm659, 444 Mbp; 3, Lm7436, 560 Mbp) with internal standard Bd21 (1); 5000 particles were counted. (c) Difference in relative DNA content of two simultaneously measured Lemna aequinoctialis accessions (2, La662, 40 Mbp; 3, La726, 748 Mbp) with internal standard Bd21 (1); 5000 particles were counted. (d) Summary of panels (a), (b), and (c) with the relative fluorescence, CV%, and genome size corresponding to each peak.

3.2. C-Value and Latitude, Longitude, and Altitude. To investigate whether genome size variation is correlated with geographic distribution in the duckweeds, we compared genome size estimates with the latitude, longitude, and altitude recorded at collection. However, genome size variation was not correlated with latitude by Pearson coefficient (r-value: Spirodela = 0.05, Landoltia = 0.7, Lemna = 0.07, Wolffiella = 0.7, Wolffia = 0.34) (Figure 4(b)), nor with longitude (r-value: Spirodela = 0.7, Landoltia = 0.04, Lemna = 0.26, and Wolffia = 0.4) (Figure 4(c)), except in Wolffiella, whose high r-value of 0.86 is possibly due to the limited number of accessions (n = 8). No correlation was found between C-values and altitude either (r-value: Spirodela = 0.3, Landoltia = 0.25, Lemna = 0.33, Wolffiella = 0.4, and Wolffia = 0.3) (Figure 4(d)). Interestingly, most Spirodela, Landoltia, Wolffiella, and Wolffia accessions were collected from a similar geographic range, between 0° and 45° latitude, and at altitudes from above 600 m to 1200 m. In contrast, most Lemna accessions were collected between 30° and 60° latitude and below 600 m.
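The correlation test between genome size and a geographic variable can be sketched as follows. The methods name the Spearman coefficient while the results report a Pearson coefficient, so this minimal sketch computes both; Spearman is simply Pearson applied to ranks. The function names and the latitude and genome size values are hypothetical and are not taken from the supplementary table.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical collection latitudes (degrees) and genome sizes (Mbp).
latitude = [5.0, 12.0, 23.0, 31.0, 40.0, 47.0]
genome_mbp = [410.0, 395.0, 560.0, 372.0, 604.0, 356.0]
print(f"Pearson r = {pearson_r(latitude, genome_mbp):.2f}")
print(f"Spearman r = {spearman_r(latitude, genome_mbp):.2f}")
```

With as few as eight accessions in a genus, a single outlier can dominate either coefficient, which is one reason to read an isolated high r-value with caution.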
However, this geographic clustering of collection sites most likely reflects a sampling bias, which could also explain the absence of a relationship between genome size and the environment in duckweed.

4. Discussion

4.1. Genome Evolution in Duckweeds. In the phylogeny of Lemnaceae, there is a strong relationship between genome size evolution and morphological progression. We found that the ancestral genus Spirodela has the smallest genome size, while the most advanced genus Wolffia has the largest (Figure 3; Table S1), which correlates with morphological reduction rather than organism complexity within the family.

Figure 3: Genome size variation across the duckweeds. (a) Estimated C-value (1C DNA content, Mbp) superimposed on a phylogenetic tree for Lemnaceae based on a combination of morphological, flavonoid, allozyme, and DNA sequence analyses [1]. The species in black are those we tested, and the species in grey are those we did not examine in this experiment. The number of accessions tested is given in brackets. (b) Average genome sizes (y-axis) of duckweed species negatively parallel the degree of primitivity (x-axis). Duckweed species are arranged on the x-axis from lower to higher evolutionary status, as deduced from primitive and derived morphological traits [3].
This result is consistent with Geber's finding, which showed a relationship between DNA content and degree of primitivity [23]. Genome doubling has been a pervasive force in plant evolution and has occurred repeatedly [24]. Even the small genome of Arabidopsis thaliana has been shaped by genome duplication [25]. Cytological variation in duckweeds has been investigated extensively by chromosome counting, and polyploidy (2n = 20, 30, 40, 50, 60, and 80) was concluded to be the main intrapopulational variation [2], which suggests that polyploidization was very active and occurred in multiple rounds during the history of the duckweeds.

After polyploidization, transposable element mobility, insertions, deletions, and epigenome restructuring contribute to the successful development of a new species and also to genome size changes [26]. Changes in genome structure could lead to differential gene loss and extensive changes in gene expression [27], and could have immediate effects on the phenotype and fitness of an individual [28]. It is therefore likely that polyploidy drove divergence during duckweed evolution.

Figure 4: The relation of C DNA content with geographical coordinates and altitude. (a) Geographical origin of the duckweed accessions analyzed; (b) latitude and C DNA content; (c) longitude and C DNA content; (d) altitude and C DNA content.

Geographic Distribution and Genome Size Variation. It has been suggested that variation in DNA content has adaptive significance and is correlated with the environmental traits of species [29]. The environmental conditions of plants are determined to a large extent by latitude, longitude, and altitude. Previous studies have indicated a positive correlation among plant species between genome size and latitude (associated with the duration of sunlight in the growing season and with temperature) and between genome size and altitude (associated with temperature). For example, DNA content increased with increasing latitude in the Pinaceae family [30] and with increasing altitude in Zea mays [3]. Duckweeds are distributed broadly around the world (Figure 4(a)). Our results show no significant overall correlation of genome size with latitude, longitude, or altitude (Figure 4). The same result was found in Vicia faba [32], Sesleria albicans [33], and Asteraceae [34].
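A claim of "no significant overall correlation" can be checked with a standard correlation test. The following is a minimal sketch only, assuming a Pearson correlation with a permutation p-value; the text does not state which test was actually used. The function names and the sample numbers are hypothetical and are not taken from the study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_perm=10000, seed=0):
    """Two-sided p-value for r: shuffle ys and count |r| >= observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical illustration values, NOT data from the paper.
    latitude_deg = [5.0, 12.5, 23.0, 31.0, 40.5, 47.0, 52.0, 60.0]
    genome_mbp = [480.0, 160.0, 900.0, 370.0, 1500.0, 420.0, 610.0, 380.0]
    r = pearson_r(latitude_deg, genome_mbp)
    p = permutation_p_value(latitude_deg, genome_mbp)
    print(f"r = {r:.3f}, permutation p = {p:.3f}")
```

The permutation approach avoids any distributional assumption about genome sizes, which span roughly an order of magnitude across genera; a rank-based test would be another reasonable choice.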
A summary of earlier studies revealed that these relationships are neither straightforward nor clear. Five studies (Picea sitchensis, Berberis, Poaceae and Fabaceae, tropical versus temperate grasses, 329 tropical versus 527 temperate plants) found positive, seven (Arachis duranensis, Festuca arundinacea, North American cultivars of Zea mays, 62 British plants, 23 Arctic plants, 22 North American Zea mays, and North American Zea mays) found negative, and five (Allium cepa, Dactylis glomerata, and Helianthus) found nonsignificant correlations between genome size and latitude. For genome size and altitude, nine correlations were positive, eight were negative, and six were not statistically significant [35]. But

Review Article Polyploidy and Speciation in Pteris (Pteridaceae)

Review Article Polyploidy and Speciation in Pteris (Pteridaceae) Journal of Botany Volume 2012, Article ID 817920, 7 pages doi:10.1155/2012/817920 Review Article Polyploidy and Speciation in Pteris (Pteridaceae) Yi-Shan Chao, 1, 2 Ho-Yih Liu, 1 Yu-Chung Chiang, 1 and

More information

The Origin of Species

The Origin of Species The Origin of Species Introduction A species can be defined as a group of organisms whose members can breed and produce fertile offspring, but who do not produce fertile offspring with members of other

More information

BDB 2014 Picea study day, an introduction. Paul Goetghebeur, BG Ghent University

BDB 2014 Picea study day, an introduction. Paul Goetghebeur, BG Ghent University BDB 2014 Picea study day, an introduction Paul Goetghebeur, BG Ghent University From ferns to Gymnosperms : from sporangia to seeds Seed ferns (fossil) : Medullosaceae (Kalkman 1972) Seed ferns (fossil)

More information

SPECIATION. REPRODUCTIVE BARRIERS PREZYGOTIC: Barriers that prevent fertilization. Habitat isolation Populations can t get together

SPECIATION. REPRODUCTIVE BARRIERS PREZYGOTIC: Barriers that prevent fertilization. Habitat isolation Populations can t get together SPECIATION Origin of new species=speciation -Process by which one species splits into two or more species, accounts for both the unity and diversity of life SPECIES BIOLOGICAL CONCEPT Population or groups

More information

Case Study. Who s the daddy? TEACHER S GUIDE. James Clarkson. Dean Madden [Ed.] Polyploidy in plant evolution. Version 1.1. Royal Botanic Gardens, Kew

Case Study. Who s the daddy? TEACHER S GUIDE. James Clarkson. Dean Madden [Ed.] Polyploidy in plant evolution. Version 1.1. Royal Botanic Gardens, Kew TEACHER S GUIDE Case Study Who s the daddy? Polyploidy in plant evolution James Clarkson Royal Botanic Gardens, Kew Dean Madden [Ed.] NCBE, University of Reading Version 1.1 Polypoidy in plant evolution

More information

UON, CAS, DBSC, General Biology II (BIOL102) Dr. Mustafa. A. Mansi. The Origin of Species

UON, CAS, DBSC, General Biology II (BIOL102) Dr. Mustafa. A. Mansi. The Origin of Species The Origin of Species Galápagos Islands, landforms newly emerged from the sea, despite their geologic youth, are filled with plants and animals known no-where else in the world, Speciation: The origin

More information

NOTES CH 24: The Origin of Species

NOTES CH 24: The Origin of Species NOTES CH 24: The Origin of Species Species Hummingbirds of Costa Rica SPECIES: a group of individuals that mate with one another and produce fertile offspring; typically members of a species appear similar

More information

The Origin of Species

The Origin of Species Chapter 24 The Origin of Species PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from Joan Sharp

More information

interest in Reproductive Biology, and Ecotypic Variation Metasequoia: An Overview of Its Phylogeny, jianhua Li he discovery of the dawn

interest in Reproductive Biology, and Ecotypic Variation Metasequoia: An Overview of Its Phylogeny, jianhua Li he discovery of the dawn Metasequoia: An Overview of Its Phylogeny, Reproductive Biology, and Ecotypic Variation jianhua Li interest in he discovery of the dawn t redwood sparked renewed the relationship between two important

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

4/4/2017. Extrinsic Isolating Barriers. 1. Biological species concept: 2. Phylogenetic species concept:

4/4/2017. Extrinsic Isolating Barriers. 1. Biological species concept: 2. Phylogenetic species concept: Chapter 13 The origin of species 13.1 What Is a Species? p. 414 Ways to identify species 1. Biological species concept: 1. There are many different concepts of species 2. Species are important taxonomic

More information

CYTO-TAXONOMIC STUDIES ON NEW ZEALAND PTERIDACEAE

CYTO-TAXONOMIC STUDIES ON NEW ZEALAND PTERIDACEAE CYTO-TAXONOMIC STUDIES ON NEW ZEALAND PTERIDACEAE BY G. BROWNLIE Botany Department, Canterbury U?iiversity College {Received 15 Jttne 1956) (With Plates 4 and 5) These studies are being undertaken in the

More information

PLANT VARIATION AND EVOLUTION

PLANT VARIATION AND EVOLUTION PLANT VARIATION AND EVOLUTION D. BRIGGS Department of Plant Sciences, University of Cambridge S. M. WALTERS Former Director of the University Botanic Garden, Cambridge 3rd EDITION CAMBRIDGE UNIVERSITY

More information

Chemotaxonomic significance of leaf wax n-alkanes in the Pinales (Coniferales)

Chemotaxonomic significance of leaf wax n-alkanes in the Pinales (Coniferales) Journal of Biological Research 1: 3 19, 2004 J. Biol. Res. is available online at http://www.jbr.gr Chemotaxonomic significance of leaf wax n-alkanes in the Pinales (Coniferales) MASSIMO MAFFEI*, SILVIA

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

Outline for today s lecture (Ch. 13)

Outline for today s lecture (Ch. 13) Outline for today s lecture (Ch. 13) Sexual and asexual life cycles Meiosis Origins of Genetic Variation Independent assortment Crossing over ( recombination ) Heredity Transmission of traits between generations

More information

PTERIS REPTANS (PTERIDACEAE) - A NEW RECORD FOR INDIA

PTERIS REPTANS (PTERIDACEAE) - A NEW RECORD FOR INDIA FERN GAZ. 19(1):25-29. 2012 25 PTERIS REPTANS (PTERIDACEAE) - A NEW RECORD FOR INDIA V.K. SREENIVAS 1 & P.V. MADHUSOODANAN 2 1 Department of Botany, University of Calicut, Kerala, India - 673635 (Email:

More information

Life history, diversity and distribution: a study of Japanese pteridophytes

Life history, diversity and distribution: a study of Japanese pteridophytes ECOGRAPHY 26: 129 138, 2003 Life history, diversity and distribution: a study of Japanese pteridophytes Qinfeng Guo, Masahiro Kato and Robert E. Ricklefs Guo, Q., Kato, M. and Ricklefs, R. E. 2003. Life

More information

Reproductive Morphology

Reproductive Morphology Week 3; Wednesday Announcements: 1 st lab quiz TODAY Reproductive Morphology Reproductive morphology - any portion of a plant that is involved with or a direct product of sexual reproduction Example: cones,

More information

Intraspecific gene genealogies: trees grafting into networks

Intraspecific gene genealogies: trees grafting into networks Intraspecific gene genealogies: trees grafting into networks by David Posada & Keith A. Crandall Kessy Abarenkov Tartu, 2004 Article describes: Population genetics principles Intraspecific genetic variation

More information

Meiosis and Sexual Life Cycles

Meiosis and Sexual Life Cycles CAMPBELL BIOLOGY IN FOCUS URRY CAIN WASSERMAN MINORSKY REECE 10 Meiosis and Sexual Life Cycles Lecture Presentations by Kathleen Fitzpatrick and Nicole Tunbridge, Simon Fraser University SECOND EDITION

More information

Chapter 13: Meiosis and Sexual Life Cycles Overview: Hereditary Similarity and Variation

Chapter 13: Meiosis and Sexual Life Cycles Overview: Hereditary Similarity and Variation Chapter 13: Meiosis and Sexual Life Cycles Overview: Hereditary Similarity and Variation Living organisms Are distinguished by their ability to reproduce their own kind Biology, 7 th Edition Neil Campbell

More information

Introduction to Botany. Lecture 36

Introduction to Botany. Lecture 36 Introduction to Botany. Lecture 36 Alexey Shipunov Minot State University December 6, 2013 Shipunov (MSU) Introduction to Botany. Lecture 36 December 6, 2013 1 / 47 Outline 1 Questions and answers 2 Diversity

More information

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

More information

Gnetophyta - Taxonomy

Gnetophyta - Taxonomy Gnetophyta Gnetophyta - Taxonomy 3 very distinct lineages: Gnetopsida Ephedrales Ephedraceae Ephedra Gnetales Gnetaceae Gnetum Welwitschiales Welwitschiaceae Welwitschia Gnetophyta - Shared Characters

More information

Sexual Reproduction and Genetics

Sexual Reproduction and Genetics Chapter Test A CHAPTER 10 Sexual Reproduction and Genetics Part A: Multiple Choice In the space at the left, write the letter of the term, number, or phrase that best answers each question. 1. How many

More information

Cytogenetical Studies of East Himalayan Hamamelidaceae, Combre# taceae and Myrtaceae

Cytogenetical Studies of East Himalayan Hamamelidaceae, Combre# taceae and Myrtaceae conifers. Evolution, Lawrence, Kans., 21, 720-724 (1967). - MIKSCHE, (1961). - SUNDERLAND, N., and MCLEISH, J.: Nucleic acid content and J. P.: Variation in DNA content of several gymnosperms. Canad. concentration

More information

Unit 7: Evolution Guided Reading Questions (80 pts total)

Unit 7: Evolution Guided Reading Questions (80 pts total) AP Biology Biology, Campbell and Reece, 10th Edition Adapted from chapter reading guides originally created by Lynn Miriello Name: Unit 7: Evolution Guided Reading Questions (80 pts total) Chapter 22 Descent

More information

ASSESSING AMONG-LOCUS VARIATION IN THE INFERENCE OF SEED PLANT PHYLOGENY

ASSESSING AMONG-LOCUS VARIATION IN THE INFERENCE OF SEED PLANT PHYLOGENY Int. J. Plant Sci. 168(2):111 124. 2007. Ó 2007 by The University of Chicago. All rights reserved. 1058-5893/2007/16802-0001$15.00 ASSESSING AMONG-LOCUS VARIATION IN THE INFERENCE OF SEED PLANT PHYLOGENY

More information

Introduction to Botany. Lecture 31

Introduction to Botany. Lecture 31 Introduction to Botany. Lecture 31 Alexey Shipunov Minot State University November 17th, 2010 Outline Spermatophyta: seed plants 1 Spermatophyta: seed plants Pinopsida Spermatophyta: seed plants Three

More information

Lecture 9: Readings: Chapter 20, pp ;

Lecture 9: Readings: Chapter 20, pp ; Lecture 9: Meiosis i and heredity Readings: Chapter 20, pp 659-686; skim through pp 682-3 & p685 (but just for fun) Chromosome number: haploid, diploid, id polyploid l Talking about the number of chromosome

More information

Chapter 16: Reconstructing and Using Phylogenies

Chapter 16: Reconstructing and Using Phylogenies Chapter Review 1. Use the phylogenetic tree shown at the right to complete the following. a. Explain how many clades are indicated: Three: (1) chimpanzee/human, (2) chimpanzee/ human/gorilla, and (3)chimpanzee/human/

More information

Meiosis and Sexual Life Cycles

Meiosis and Sexual Life Cycles Chapter 13 Meiosis and Sexual Life Cycles PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

The process by which the genetic structure of populations changes over time.

The process by which the genetic structure of populations changes over time. Evolution The process by which the genetic structure of populations changes over time. Divergent evolution Goldfields and Ahinahina (silversword) a highly evolved member of the composite family. Evolution

More information

Integrative Biology 200A "PRINCIPLES OF PHYLOGENETICS" Spring 2012 University of California, Berkeley

Integrative Biology 200A PRINCIPLES OF PHYLOGENETICS Spring 2012 University of California, Berkeley Integrative Biology 200A "PRINCIPLES OF PHYLOGENETICS" Spring 2012 University of California, Berkeley B.D. Mishler Feb. 7, 2012. Morphological data IV -- ontogeny & structure of plants The last frontier

More information

Big Questions. Is polyploidy an evolutionary dead-end? If so, why are all plants the products of multiple polyploidization events?

Big Questions. Is polyploidy an evolutionary dead-end? If so, why are all plants the products of multiple polyploidization events? Plant of the Day Cyperus esculentus - Cyperaceae Chufa (tigernut) 8,000 kg/ha, 720 kcal/sq m per month Top Crop for kcal productivity! One of the world s worst weeds Big Questions Is polyploidy an evolutionary

More information

The Origin of Species

The Origin of Species The Origin of Species What you need to know The difference between microevolution and macroevolution. The biological concept of species. Prezygotic and postzygotic barriers that maintain reproductive isolation

More information

Meiosis and Sexual Life Cycles

Meiosis and Sexual Life Cycles 13 Meiosis and Sexual Life Cycles Lecture Presentation by Nicole Tunbridge and Kathleen Fitzpatrick CAMPBELL BIOLOGY TENTH EDITION Reece Urry Cain Wasserman Minorsky Jackson Variations on a Theme Living

More information

The Origin of Species

The Origin of Species LECTURE PRESENTATIONS For CAMPBELL BIOLOGY, NINTH EDITION Jane B. Reece, Lisa A. Urry, Michael L. Cain, Steven A. Wasserman, Peter V. Minorsky, Robert B. Jackson Chapter 24 The Origin of Species Lectures

More information

The process by which the genetic structure of populations changes over time.

The process by which the genetic structure of populations changes over time. Evolution The process by which the genetic structure of populations changes over time. Divergent evolution is the accumulation of differences between groups which can lead to the formation of new species.

More information

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics. Evolutionary Genetics (for Encyclopedia of Biodiversity) Sergey Gavrilets Departments of Ecology and Evolutionary Biology and Mathematics, University of Tennessee, Knoxville, TN 37996-6 USA Evolutionary

More information

Evolution - Unifying Theme of Biology Microevolution Chapters 13 &14

Evolution - Unifying Theme of Biology Microevolution Chapters 13 &14 Evolution - Unifying Theme of Biology Microevolution Chapters 13 &14 New Synthesis Natural Selection Unequal Reproductive Success Examples and Selective Forces Types of Natural Selection Speciation http://www.biology-online.org/2/11_natural_selection.htm

More information

J. MITCHELL MCGRATH, LESLIE G. HICKOK, and ERAN PICHERSKY

J. MITCHELL MCGRATH, LESLIE G. HICKOK, and ERAN PICHERSKY P1. Syst. Evol. 189:203-210 (1994) --Plant Systematics and Evolution Springer-Verlag 1994 Printed in Austria Assessment of gene copy number in the homosporous ferns Ceratopteris thalictroides and C. richardii

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

MEIOSIS C H A P T E R 1 3

MEIOSIS C H A P T E R 1 3 MEIOSIS CHAPTER 13 CENTRAL DOGMA OF BIOLOGY DNA RNA Protein OFFSPRING ACQUIRE GENES FROM PARENTS Genes are segments of DNA that program specific traits. Genetic info is transmitted as specific sequences

More information

Minor Research Project

Minor Research Project Executive Summary Minor Research Project DNA BARCODING OF MURDANNIA (COMMELINACEAE) IN WESTERN GHATS MRP (S)-1409/11-12/KLMG002/UGC-SWRO By Rogimon P. Thomas Assistant Professor Department of Botany CMS

More information

Chapter 13 Meiosis and Sexual Life Cycles

Chapter 13 Meiosis and Sexual Life Cycles Chapter 13 Meiosis and Sexual Life Cycles Question? Does Like really beget Like? The offspring will resemble the parents, but they may not be exactly like them. This chapter deals with reproduction of

More information

ESS 345 Ichthyology. Systematic Ichthyology Part II Not in Book

ESS 345 Ichthyology. Systematic Ichthyology Part II Not in Book ESS 345 Ichthyology Systematic Ichthyology Part II Not in Book Thought for today: Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else,

More information

PHYLOGENY AND SYSTEMATICS

PHYLOGENY AND SYSTEMATICS AP BIOLOGY EVOLUTION/HEREDITY UNIT Unit 1 Part 11 Chapter 26 Activity #15 NAME DATE PERIOD PHYLOGENY AND SYSTEMATICS PHYLOGENY Evolutionary history of species or group of related species SYSTEMATICS Study

More information

Chapter 14 The Origin of Species

Chapter 14 The Origin of Species Chapter 14 The Origin of Species PowerPoint Lectures for Biology: Concepts & Connections, Sixth Edition Campbell, Reece, Taylor, Simon, and Dickey Copyright 2009 Pearson Education, Inc. Lecture by Joan

More information

Meiosis and Sexual Life Cycles

Meiosis and Sexual Life Cycles Chapter 13 Meiosis and Sexual Life Cycles PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF EVOLUTION Evolution is a process through which variation in individuals makes it more likely for them to survive and reproduce There are principles to the theory

More information

Big Idea #1: The process of evolution drives the diversity and unity of life

Big Idea #1: The process of evolution drives the diversity and unity of life BIG IDEA! Big Idea #1: The process of evolution drives the diversity and unity of life Key Terms for this section: emigration phenotype adaptation evolution phylogenetic tree adaptive radiation fertility

More information

Essential Questions. Meiosis. Copyright McGraw-Hill Education

Essential Questions. Meiosis. Copyright McGraw-Hill Education Essential Questions How does the reduction in chromosome number occur during meiosis? What are the stages of meiosis? What is the importance of meiosis in providing genetic variation? Meiosis Vocabulary

More information

Microevolutionary changes show us how populations change over time. When do we know that distinctly new species have evolved?

Microevolutionary changes show us how populations change over time. When do we know that distinctly new species have evolved? Microevolutionary changes show us how populations change over time. When do we know that distinctly new species have evolved? Critical to determining the limits of a species is understanding if two populations

More information

BIOLOGY. Meiosis and Sexual Life Cycles CAMPBELL. Reece Urry Cain Wasserman Minorsky Jackson

BIOLOGY. Meiosis and Sexual Life Cycles CAMPBELL. Reece Urry Cain Wasserman Minorsky Jackson CAMPBELL BIOLOGY TENTH EDITION Reece Urry Cain Wasserman Minorsky Jackson 13 Meiosis and Sexual Life Cycles Lecture Presentation by Nicole Tunbridge and Kathleen Fitzpatrick Variations on a Theme Living

More information

Biology, 7e (Campbell) Chapter 13: Meiosis and Sexual Life Cycles

Biology, 7e (Campbell) Chapter 13: Meiosis and Sexual Life Cycles Biology, 7e (Campbell) Chapter 13: Meiosis and Sexual Life Cycles Chapter Questions 1) What is a genome? A) the complete complement of an organism's genes B) a specific sequence of polypeptides within

More information

Lecture 11 Friday, October 21, 2011

Lecture 11 Friday, October 21, 2011 Lecture 11 Friday, October 21, 2011 Phylogenetic tree (phylogeny) Darwin and classification: In the Origin, Darwin said that descent from a common ancestral species could explain why the Linnaean system

More information

Chapter 13- Reproduction, Meiosis, and Life Cycles. Many plants and other organisms depend on sexual reproduction.

Chapter 13- Reproduction, Meiosis, and Life Cycles. Many plants and other organisms depend on sexual reproduction. Chapter 13- Reproduction, Meiosis, and Life Cycles Many plants and other organisms depend on sexual reproduction. Flowers are the sexual reproductive organ systems of angiosperms. Sexual reproduction gametes

More information

Meiosis and Sexual Life Cycles

Meiosis and Sexual Life Cycles LECTURE PRESENTATIONS For CAMPBELL BIOLOGY, NINTH EDITION Jane B. Reece, Lisa A. Urry, Michael L. Cain, Steven A. Wasserman, Peter V. Minorsky, Robert B. Jackson Chapter 13 Meiosis and Sexual Life Cycles

More information

Conceptually, we define species as evolutionary units :

Conceptually, we define species as evolutionary units : Bio 1M: Speciation 1 How are species defined? S24.1 (2ndEd S26.1) Conceptually, we define species as evolutionary units : Individuals within a species are evolving together Individuals of different species

More information

Unit 6 : Meiosis & Sexual Reproduction

Unit 6 : Meiosis & Sexual Reproduction Unit 6 : Meiosis & Sexual Reproduction 2006-2007 Cell division / Asexual reproduction Mitosis produce cells with same information identical daughter cells exact copies clones same number of chromosomes

More information

Speciation Plant Sciences, 2001Updated: June 1, 2012 Gale Document Number: GALE CV

Speciation Plant Sciences, 2001Updated: June 1, 2012 Gale Document Number: GALE CV is the process of evolution by which new species arise. The key factor causing speciation is the appearance of genetic differences between two populations, which result from evolution by natural selection.

More information

Biology Slide 1 of 28

Biology Slide 1 of 28 Biology 1 of 28 2 of 28 22-4 Seed Plants Seed plants are the most dominant group of photosynthetic organisms on land. 3 of 28 22-4 Seed Plants Seed plants are divided into two groups: Gymnosperms bear

More information

Processes of Evolution

Processes of Evolution 15 Processes of Evolution Forces of Evolution Concept 15.4 Selection Can Be Stabilizing, Directional, or Disruptive Natural selection can act on quantitative traits in three ways: Stabilizing selection

More information

Chapter 27: Evolutionary Genetics

Chapter 27: Evolutionary Genetics Chapter 27: Evolutionary Genetics Student Learning Objectives Upon completion of this chapter you should be able to: 1. Understand what the term species means to biology. 2. Recognize the various patterns

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Meiosis and Life Cycles - 1

Meiosis and Life Cycles - 1 Meiosis and Life Cycles - 1 We have just finished looking at the process of mitosis, a process that produces cells genetically identical to the original cell. Mitosis ensures that each cell of an organism

More information

Unfortunately, there are many definitions Biological Species: species defined by Morphological Species (Morphospecies): characterizes species by

Unfortunately, there are many definitions Biological Species: species defined by Morphological Species (Morphospecies): characterizes species by 1 2 3 4 5 6 Lecture 3: Chapter 27 -- Speciation Macroevolution Macroevolution and Speciation Microevolution Changes in the gene pool over successive generations; deals with alleles and genes Macroevolution

More information

For a species to survive, it must REPRODUCE! Ch 13 NOTES Meiosis. Genetics Terminology: Homologous chromosomes

For a species to survive, it must REPRODUCE! Ch 13 NOTES Meiosis. Genetics Terminology: Homologous chromosomes For a species to survive, it must REPRODUCE! Ch 13 NOTES Meiosis Genetics Terminology: Autosomes Somatic cell Gamete Karyotype Homologous chromosomes Meiosis Sex chromosomes Diploid Haploid Zygote Synapsis

More information

Biology Eighth Edition Neil Campbell and Jane Reece

Biology Eighth Edition Neil Campbell and Jane Reece BIG IDEA I The process of evolution drives the diversity and unity of life. Enduring Understanding 1.C Life continues to evolve within a changing environment. Essential Knowledge 1.C.1 Speciation and extinction

More information

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics: Homework Assignment, Evolutionary Systems Biology, Spring 2009. Homework Part I: Phylogenetics: Introduction. The objective of this assignment is to understand the basics of phylogenetic relationships

More information

Review of Mitosis and Meiosis

Review of Mitosis and Meiosis Review of Mitosis and Meiosis NOTE: Since you will have already had an introduction to both mitosis and meiosis in Biol 204 & 205, this lecture is structured as a review. See the text for a more thorough

More information

How Biological Diversity Evolves

How Biological Diversity Evolves CHAPTER 14 How Biological Diversity Evolves PowerPoint Lectures for Essential Biology, Third Edition Neil Campbell, Jane Reece, and Eric Simon Essential Biology with Physiology, Second Edition Neil Campbell,

More information

EVOLUTION Unit 1 Part 9 (Chapter 24) Activity #13

EVOLUTION Unit 1 Part 9 (Chapter 24) Activity #13 AP BIOLOGY EVOLUTION Unit 1 Part 9 (Chapter 24) Activity #13 NAME DATE PERIOD SPECIATION SPECIATION Origin of new species SPECIES BIOLOGICAL CONCEPT Population or groups of populations whose members have

More information

Plant evolution and speciation BY2204 EVOLUTION

Plant evolution and speciation BY2204 EVOLUTION Plant evolution and speciation BY2204 EVOLUTION Trevor Hodkinson Plant Sciences Moderatorship Some evol. processes shared with other organisms (natural selection; allopatric speciation). Some more common

More information

Meiosis and Sexual Reproduction. Chapter 10. Halving the Chromosome Number. Homologous Pairs

Meiosis and Sexual Reproduction. Chapter 10. Halving the Chromosome Number. Homologous Pairs Meiosis and Sexual Reproduction Chapter 10 Outline Reduction in Chromosome Number Homologous Pairs Meiosis Overview Genetic Recombination Crossing-Over Independent Assortment Fertilization Meiosis I Meiosis

More information

Overview. Overview: Variations on a Theme. Offspring acquire genes from parents by inheriting chromosomes. Inheritance of Genes

Overview. Overview: Variations on a Theme. Offspring acquire genes from parents by inheriting chromosomes. Inheritance of Genes Chapter 13 Meiosis and Sexual Life Cycles Overview I. Cell Types II. Meiosis I. Meiosis I II. Meiosis II III. Genetic Variation IV. Reproduction Overview: Variations on a Theme Figure 13.1 Living organisms

More information

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

More information

Sexual Reproduction and Meiosis. Chapter 11

Sexual Reproduction and Meiosis. Chapter 11 Sexual Reproduction and Meiosis Chapter 11 1 Sexual life cycle Made up of meiosis and fertilization Diploid cells Somatic cells of adults have 2 sets of chromosomes Haploid cells Gametes (egg and sperm)

More information

Ladies and Gentlemen.. The King of Rock and Roll

Ladies and Gentlemen.. The King of Rock and Roll Ladies and Gentlemen.. The King of Rock and Roll Learning Objectives: The student is able to construct an explanation, using visual representations or narratives, as to how DNA in chromosomes is transmitted

More information

X-Sheet 3 Cell Division: Mitosis and Meiosis

X-Sheet 3 Cell Division: Mitosis and Meiosis X-Sheet 3 Cell Division: Mitosis and Meiosis 13 Key Concepts In this session we will focus on summarising what you need to know about: Revise Mitosis (Grade 11), the process of meiosis, First Meiotic division,

More information

Meiosis & Sexual Reproduction

Meiosis & Sexual Reproduction Meiosis & Sexual Reproduction 2007-2008 Cell division / Asexual reproduction Mitosis produce cells with same information identical daughter cells exact copies clones same amount of DNA same number of chromosomes

More information
















