By Yu He. Thesis submitted in fulfillment of the requirements for the degree of Doctor in Applied Biological Sciences


FACULTEIT LANDBOUWKUNDIGE EN TOEGEPASTE BIOLOGISCHE WETENSCHAPPEN

MOLECULAR APPROACH TO LONGIDORIDAE (NEMATODA: DORYLAIMIDA): ORGANELLE GENOMICS, PHYLOGENY, POPULATION DIVERSITY AND DIAGNOSTICS

By Yu He

Thesis submitted in fulfillment of the requirements for the degree of Doctor in Applied Biological Sciences

By authority of
Rector: Prof. Dr. A. De Leenheer
Dean: Prof. Dr. ir. H. Van Langenhove
Supervisors: Prof. Dr. ir. M. Moens, Prof. Dr. ir. L. Tirry

Acknowledgements

After four years of bench work, my Ph.D. programme is reaching its end. During these four years many people supported my work and made its completion possible. I gained plentiful knowledge of science as well as of humanity.

I am grateful to Prof. Dr. ir. Luc Tirry, who kindly granted me the opportunity to start and complete this work as his student. I wish to express my cordial gratitude to Prof. Dr. ir. Maurice Moens. He is always friendly and patient. Thanks to his persistent encouragement, I was able to search freely for solutions whenever I encountered difficulties. He is open to innovative proposals and gives helpful advice. He always kept an eye on my work but never limited my thoughts. It is a pleasure to work under his supervision.

I wish to acknowledge Prof. F. Lamberti, Dr. T.C. Vrain and Dr. D.J.F. Brown for their samples and precise identifications. Without their contribution I could not have completed this work.

I wish to express my gratitude to Mr. Yunliang Peng for his introduction, and to Ms. Nic Smol and the professors teaching in the PINC course, who guided me through that course with a university grant; it was the starting point of my scientific career.

I also thank Mr. L. Wayenberge, Mr. S. Bayen, Dr. S. Subbotin and Dr. T. Maes for their laboratory assistance and for sharing their scientific experience. I would also like to thank many colleagues at CLO-DGB: Shulong Chen, Phan Ke Long, Dr. Nicole Viaene, ir. Wim Wesemael, Nancy de Sutter and T. Rubtsova, and my Chinese friends Peiyin Shen and Hongmei Li. With them I spent a pleasant period of my life.

Finally, I want to express my special thanks to my parents and my wife for their love and tireless help.

Table of Contents

CHAPTER 1 Introduction
CHAPTER 2 Mining the genome resources of plant-parasitic nematodes
CHAPTER 3 General materials and methods
CHAPTER 4 Mitochondrial genome of Xiphinema americanum
CHAPTER 5 Ribosomal genes of longidorids
CHAPTER 6 Diversity of Internal Transcribed Spacer
CHAPTER 7 Isolation and characterization of microsatellites
CHAPTER 8 General concluding remarks and perspectives
Summary
References
Appendix A
Appendix B
Appendix C
Curriculum vitae

1 Introduction

Longidorids (Nematoda, Dorylaimida, Longidoridae) form a large group comprising hundreds of species. Some of these species are vectors of nepoviruses that damage a wide range of crops, and are listed as quarantine pests. Nematologists have therefore become increasingly interested in their geographical distribution, pathogenicity associated with virus transmission, phylogeny, population diversity and diagnostics.

The genus Xiphinema Cobb, 1913 (Dorylaimida: Longidoridae) includes 296 nominal taxa (234 valid species, 49 junior synonyms and 13 species inquirendae). It is the largest genus in the family Longidoridae and also in the order Dorylaimida (Coomans et al., 2001). Nematodes belonging to this genus have a worldwide geographical distribution. They feed ectoparasitically on plant roots, on which they cause galling. Some Xiphinema species (X. index, X. diversicaudatum, X. italiae, X. americanum, X. bricolensis, X. rivesi and X. californicum) are vectors of nepoviruses such as cherry rasp leaf nepovirus, peach rosette mosaic nepovirus, tobacco ringspot nepovirus potato calico strain, and tomato ringspot nepovirus (all listed in the annex I/A1) (Taylor & Brown, 1997). The latter four species belong to the X. americanum-group, of which the non-European populations have been listed as quarantine pathogens in the European Union. The cited species are economically important on crops such as cherry, peach, tobacco, tomato, strawberry and grape (Taylor & Brown, 1997).

The X. americanum-group includes 49 putative species and 2 species inquirendae (Lamberti et al., 2000). Because the morphological characters of the species overlap, this group of nematodes regularly causes identification problems, even for specialists. Misidentification of species belonging to this group slows down further studies on genetics, population diversity, phylogeny and virus transmission, and it will certainly cause difficulties for quarantine strategies. Even though Lamberti et al. (2000) have provided an updated polytomous key for species identification in this group, the accurate and fast identification required for quarantine purposes remains a difficult task.

In nematology, species definition relies solely on morphological characters. The limitations of such methods have led to ambiguous species definitions, with consequent problems for diagnostics, geographical subdivision of populations, pathogenicity determination, and reconstruction of the history of the organism. Adams (1998) described nematode species concepts and the evolutionary paradigm in detail. He pointed out the serious problems associated with species concepts: over- or underestimation of the number of species and misrepresentation of phylogenetic relationships. Correct species definition and identification are nevertheless the cornerstone of modern diversity and systematics studies. Coomans (2002) reviewed the current status of nematode systematics and recommended combining morphology-based methods with modern molecular techniques for species definition and identification.

Molecular techniques have been widely applied in various research disciplines of nematology. PCR-RFLP and sequencing analyses have been used successfully for: (i) species identification of entomopathogenic nematodes (Nasmith et al., 1996), root lesion nematodes (Orui, 1996; Waeyenberge et al., 2000), root-knot nematodes (Zijlstra et al., 1997) and cyst-forming nematodes (Subbotin et al., 1999); (ii) studies on population diversity (Hiatt et al., 1995); and (iii) phylogenetic analyses of Heterorhabditis (Adams et al., 1998) and burrowing nematodes (Kaplan et al., 2000). Powers et al. (1997) made a detailed evaluation of the diagnostic potential of the ITS1 region for nematodes. A more detailed review of these applications is provided in chapter 2.

A phylogenetic study of the longidorids using molecular data may help to re-evaluate species validity and thus facilitate precise species definition and identification. To achieve these goals, the family Longidoridae (Nematoda: Dorylaimida) was characterised using molecular techniques. This work was done within the framework of an EU-funded project: The Xiphinema americanum-group virus vector nematodes: Development of a diagnostic protocol (SMT4-CT ). The results presented here used the mitochondrial genome, ribosomal RNA genes or spacer regions, and microsatellites to address the following goals:

1. To investigate the features of the mitochondrial genome of Xiphinema americanum (a representative of the longidorids) and the phylogenetic implications deduced from the comparison between X. americanum and other groups of the Nematoda;
2. To infer the phylogenetic relationships of the main groups of longidorids;
3. To work out fast and efficient identification procedures for Xiphinema species;
4. To study the phylogenetic and population relationships of X. americanum-group species and to seek an efficient diagnostic tool for species identification in this group;
5. To develop a fast and efficient diagnostic procedure to identify virus-vector species.

Once these objectives are accomplished, systematists should be able to reconstruct the phylogeny of longidorids, taxonomists to synonymise species rationally, and statutory laboratories to identify some virus-vector species rapidly and design an efficient pest-control strategy.


2. Literature review

2 Mining the genome resources of plant-parasitic nematodes

The genome sequencing project of Caenorhabditis elegans was recently completed. The 97-megabase genome sequence has provided abundant information (The C. elegans sequencing consortium, 1998). In the past decade, molecular biological techniques have been widely applied to studies on nematodes. The techniques have proved to be very efficient tools for addressing a wide range of problems in diagnostics, systematics, phylogeny, population studies and ecology. The number of recently published articles using molecular techniques is a good indicator of this trend. An excellent review of the techniques used in nematology was published by Jones et al. (1997). In this article I introduce powerful DNA-related techniques, including WGA (whole genome amplification), and summarise applications of studies on different genomic regions. I expect that the techniques and applications illustrated here will be helpful to nematologists, especially researchers working on the systematics, phylogeny, taxonomy, population diversity and ecology of plant-parasitic nematodes.

2.1 Overcome the limitation of DNA resources

Most of the molecular studies published recently have used genomic DNA. Many research programmes focused on nematode taxa for which relatively abundant material is available through field sampling or culturing in the laboratory. Root-knot nematodes (Meloidogyne spp.), cyst-forming nematodes (Heterodera spp. and Globodera spp.), migratory endoparasitic nematodes (Pratylenchus spp. and Radopholus spp.) and the pinewood nematode (Bursaphelenchus xylophilus) have been the subject of multiple and diverse studies. Molecular research on other nematode groups has been restricted to PCR-amplified single genes or gene clusters because of the shortage of material. Many applications require the construction of a genomic library (e.g. screening for microsatellite markers, genome cross hybridisation) or large quantities of genomic DNA as starting material (RAPD, AFLP and other fingerprinting methods). Obtaining abundant DNA is therefore the first and most critical step towards efficient use of the genomic resources of plant-parasitic nematodes.

Obtaining enough DNA for molecular studies is not always easy. Fortunately, recent developments, particularly the whole genome amplification (WGA) techniques used in medical science, may change this. WGA was originally designed to produce enough DNA for clinical analyses from biopsies or dissected single cells. It is appropriate for studies on complex diseases such as cancer, since it can be used to detect genomic aberrations in cells and thus facilitate diagnosis and treatment in the early phase of the disease. Several WGA methods are commonly used: PEP (primer extension pre-amplification) (Zhang et al., 1992), DOP-PCR (degenerate oligonucleotide primed polymerase chain reaction) (Telenius et al., 1992), T-PCR (tagged PCR) (Grothues et al., 1993), Alu-PCR (Lengauer et al., 1993), the recently developed MDA (multiple displacement amplification) method (Dean et al., 2002) and SCOMP (single cell comparative genomic hybridisation) (Klein et al., 1999).

PEP, DOP-PCR and T-PCR all use degenerate oligonucleotides. PEP uses a pool of 15 bp oligonucleotides with 4-fold degeneracy at each site. DOP-PCR uses primers that incorporate a restriction site at the 5' end, 6 nucleotides with 4-fold degeneracy at each site, and a 6 bp specific sequence at the 3' end. T-PCR, considered to be a variant of DOP-PCR, uses a pool of primers with a specific sequence at the 5' end and several degenerate nucleotides at the 3' end. Alu-PCR makes use of the approximately 900,000 intermediate Alu repeat sequences in the human genome to amplify the whole genome.
Because the Alu sequence is specific to the human genome, it cannot be applied to nematode genomes. SCOMP is a linker-adapter mediated PCR technique in which the genome is digested with the restriction enzyme Sau3AI, ligated to adapters and amplified with adapter-specific primers.

The aim of WGA is to provide sufficient DNA that gives an unbiased representation of the original genome. This is not an easy task. Wells et al. (1999) compared four methods, PEP, DOP-PCR, Alu-PCR and T-PCR, for CGH (comparative genome hybridisation) analysis. They selectively amplified 10 loci from the WGA products produced with the four methods to compare genomic coverage. The results showed that the DOP-PCR and PEP methods produced the best coverage of the whole genome (89% and 91%); DOP-PCR, Alu-PCR and T-PCR gave higher yields than PEP. The accuracy revealed by SSCP (single strand conformation polymorphism) or ARMS (amplification refractory mutation system) results was 95% for the PEP and DOP-PCR methods. CGH results showed that T-PCR and Alu-PCR resulted in highly biased amplification. DOP-PCR gave much less biased results, whereas PEP did not produce enough DNA for the analyses. Cheung and Nelson (1996) confirmed that DOP-PCR is a good method for microsatellite-based genotyping. DOP-PCR products were also shown to be reliable sources for SNP (single nucleotide polymorphism) genotyping (Jordan et al., 2002) and for the detection of genetic aberrations (Hirose et al., 2001). PEP products were shown to be reliable sources for genotyping (Kuivaniemi et al., 2002) and for multiple mutation analyses of single cells (Dietmaier et al., 1999).

The methods cited above, however, do not produce fragments longer than 1.5 Kb. The most efficiently amplified fragments are about bp, which limits the usage of the amplified DNA. Modified DOP-PCR methods have been published recently (Buchanan et al., 2000; Kittler et al., 2001). With these methods, products of up to 10 Kb can be amplified.

Unlike the methods mentioned above, the recently developed MDA method does not use the thermostable DNA polymerase required by traditional PCR. It takes advantage of the RCA (rolling circle amplification) feature of the phage φ29 DNA polymerase to obtain 100,000-fold amplification from as little as 300 pg of starting DNA. The size of the amplified products is greater than 10 Kb. The designers used TaqMan quantitative PCR to determine the gene representation bias. The results showed only fold bias in comparison with fold bias produced by DOP-PCR or PEP methods.
The high fidelity and uniformity were also confirmed by the CGH and SNP genotyping results. Additionally, MDA products can be used for Southern blot analysis of restriction fragments and for chromosome painting. MDA is therefore probably a good choice for WGA.

SCOMP usually gives high yields. Its high fidelity was confirmed by sequencing, comparative genome hybridisation and loss of heterozygosity analysis. Stoecklein et al. (2002) showed the superiority of SCOMP over DOP-PCR in their work on microdissected archival tissue samples.
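The degenerate primer pools described above for PEP and DOP-PCR can be made concrete with a small calculation. The following Python sketch counts how many distinct sequences a degenerate primer represents. The function names and the DOP-PCR-style primer string are assumptions made for this illustration only; the string merely follows the layout given in the text (restriction site, six degenerate positions, six specific bases) and is not taken from the cited protocols.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each code can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pool_size(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    size = 1
    for code in primer.upper():
        size *= len(IUPAC[code])
    return size


def expand(primer):
    """Yield every concrete sequence in the pool (sensible for small pools only)."""
    for combo in product(*(IUPAC[code] for code in primer.upper())):
        yield "".join(combo)


# PEP-style primer: 15 positions, each 4-fold degenerate.
pep_primer = "N" * 15

# Hypothetical DOP-PCR-style layout (illustrative sequence):
# 5' restriction site + six degenerate positions + six specific 3' bases.
dop_primer = "CTCGAG" + "NNNNNN" + "ATGTGG"

print("PEP pool size:", pool_size(pep_primer))      # 4**15 sequences
print("DOP-PCR pool size:", pool_size(dop_primer))  # 4**6 sequences
```

Under these assumptions the fully degenerate 15-mer pool is several orders of magnitude more complex than the partially degenerate primer, which is one way to see why the two approaches prime the genome so differently.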

WGA is probably a good strategy for plant nematologists who have to deal with a shortage of DNA resources. It allows the production of large quantities of high-quality DNA from a single juvenile, which greatly facilitates further analyses. However, attention must be paid to the pitfalls and problems associated with this technique. Care should be taken to avoid DNA contamination. More DNA has to be used when working with repeat sequences such as microsatellites. Wells et al. (1999) observed that artefacts were produced when they worked with microsatellites starting from a single cell. Cheung and Nelson (1996) also observed artefacts in amplified microsatellites when the quantity of starting DNA was too low. Over-representation of one of the two alleles was observed with some WGA methods and interfered with attempts to perform quantitative PCR. The extreme form of this phenomenon is ADO (allele dropout) (Wells & Sherlock, 1998), which causes false homozygotes. However, this phenomenon usually appears only in a small proportion of the amplification products (approximately 10%) if the starting DNA quantity is very low.

Maier et al. (2001) reported the successful development of a species-specific microsatellite marker for coral based on a genomic library constructed from DOP-PCR products. This is a good example of the application of WGA to organisms other than humans. Using a genomic library constructed from DOP-PCR products, He et al. (submitted) successfully developed a species-specific microsatellite marker for Xiphinema index, an ectoparasitic nematode that transmits grapevine fanleaf virus and is listed as a quarantine organism in the European Union.

2.2 Which part of the genome should we use?

The genome consists of junk DNA and coding sequences. Recent genome sequencing has shown that 90% of the vertebrate genome does not code for recognizable genes (Walkup, 2000).
The C-value paradox observed early on between different species of frog and fly can be ascribed to differences in non-coding DNA. Investigation of the C. elegans genome revealed 73% non-coding sequences and 27% coding sequences, including 19,099 predicted protein-coding genes and non-coding RNA genes (The C. elegans sequencing consortium, 1998). The non-coding regions were classified into two major groups according to their location relative to the functional genes: (i) introns located between exons, or spacers located between genes, which are transcribed and removed during splicing; and (ii) intergenic regions or the transcribed 5' and 3' untranslated regions of genes. Three major types of sequences were observed in non-coding regions: (i) pseudogenes, which are remnants of ancient gene duplication events in which one copy became inactivated by mutation; (ii) variable number tandem repeats (VNTR), satellites, mini-satellites and simple sequence repeats (SSRs or microsatellites); and (iii) interspersed repeats, which are usually derived from mobile elements (retrotransposons and DNA transposons). All of the non-coding regions have been used for diagnosis, phylogeny, population genetics, linkage mapping and other disciplines.

Twenty-six percent of the C. elegans genome is predicted to be introns (The C. elegans sequencing consortium, 1998). Introns or transcribed spacers of some genes have been used for diagnosis and for the study of population diversity and phylogeny of nematodes. The ITS (internal transcribed spacer) of the ribosomal RNA gene cluster has been the most popular choice among nematologists in recent years. Strictly speaking, this is not an intron that separates the exons of a protein-encoding gene. The ITS separates linked rRNA genes and is removed from the mature transcripts. The function of the ITS region is still unknown, but yeast mutation/deletion experiments have shown that the integrity of several structural features in the ITS region is crucial for accurate spacer removal and biogenesis of the functional ribosome (Van Nues et al., 1995).

Many published articles deal with the application of ITS to different groups of nematodes. Cherry et al. (1997) used restriction analysis of PCR-amplified ITS1 to study Belonolaimus. A summary of the power of ITS1 as a taxonomic marker was published by Powers et al. (1997). Ibrahim et al. (1994) applied ITS-RFLP to differentiate species and populations of Aphelenchoides and Ditylenchus angustus. Iwahori et al. (1998) used restriction mapping and sequencing analysis of the amplified ITS region to construct the phylogeny of Bursaphelenchus species (pinewood nematodes); more phylogenetic work on Bursaphelenchus species was reported by Beckenbach et al. (1999). PCR-RFLP of ITS was also applied to the diagnosis of Pratylenchus species (Orui, 1996; Uehara et al., 1998; Waeyenberge et al., 2000). Intraspecific ITS-RFLP was used to analyse the X. americanum-group (Vrain et al., 1992). Several species of the Steinernematidae and Heterorhabditidae (entomopathogenic nematodes) were separated by PCR-RFLP of ITS (Nasmith et al., 1996). Phylogenetic work based on ITS1 sequences was conducted with Heterorhabditis (Adams et al., 1998). Ibrahim et al. (1997) studied the genetic variation in Nacobbus aberrans using other techniques in combination with PCR-RFLP of ITS. Ferris et al. (1993) studied the variation among cyst-forming nematode species based on ITS sequences. Szalanski et al. (1997) used PCR-RFLP of ITS to identify cyst-forming nematodes (five species of Heteroderidae). Subbotin et al. (1999) used ITS-RFLP and morphometrics to identify species of the Heterodera avenae group. Subbotin et al. (2001) constructed a phylogeny of cyst-forming nematodes based on ITS sequences. Interspecific ITS-RFLP was assayed by Thiéry and Mugniéry (1996) for Globodera species parasitic on solanaceous plants. Zijlstra et al. (1997) developed a precise method based on ITS-RFLP to separate root-knot nematode species (Meloidogyne spp.).

In view of the works cited above, the diagnostic power of ITS is strong at species level in many nematode groups. However, the use of the ITS region for studies at population level is rarely reported. A recent article described microsatellites in the ITS regions of freshwater crayfish that can obscure phylogenetic relationships at population level because of their intra-genomic variation (Harris & Crandall, 2000). As far as we know, no publication describes the same phenomenon in nematodes. However, intra-genomic variation in nematodes was indicated by Powers et al. (1997), and abundant microsatellite loci were observed by He et al. (unpublished) in the ITS region of longidorids. Intra-genomic variation should therefore be evaluated carefully before ITS is used to study population diversity and phylogeny.

The intergenic spacer (IGS) region of the ribosomal RNA genes was chosen for the identification of root-knot nematode species (Petersen & Vrain, 1996; Petersen et al., 1997), yielding efficient separation of Meloidogyne chitwoodi, M. hapla and M. fallax. De Georgi and Abbott (1998) reported variation in the IGS region of M. arenaria. The location of the 5S rRNA gene in the IGS region of M. arenaria is an interesting discovery (Vahidi et al., 1991); this arrangement is common in fungi and protozoa but not in metazoa. A tandem repeat with a 129 bp core was found in the IGS region of M. arenaria (Vahidi & Honda, 1991). No article describes the application of IGS to phylogenetic or ecological studies on nematodes, although one report used a spacer region between the 5S rRNA and spliced leader RNA genes to identify Globodera pallida and G. rostochiensis successfully (Shields et al., 1996).
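The principle behind the PCR-RFLP diagnostics reviewed above can be sketched in a few lines of Python: an amplified spacer is cut at every recognition site of a restriction enzyme, and species are compared by their fragment-length patterns. This is a minimal in-silico sketch, not the procedure of any cited study. The two amplicon sequences are invented for illustration, and the function names are hypothetical; only the recognition sites of AluI (AG^CT) and RsaI (GT^AC) are real.

```python
import re

# Recognition site and top-strand cut offset within the site.
# AluI cuts AG^CT and RsaI cuts GT^AC (both leave blunt ends).
ENZYMES = {
    "AluI": ("AGCT", 2),
    "RsaI": ("GTAC", 2),
}


def digest(seq, enzyme):
    """Fragment lengths after complete digestion of a linear amplicon."""
    site, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites would all be reported.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def rflp_profile(seq, enzymes=("AluI", "RsaI")):
    """Fragment lengths per enzyme, longest first, like bands on a gel."""
    return {name: sorted(digest(seq, name), reverse=True) for name in enzymes}


# Invented amplicons standing in for the ITS products of two species;
# they differ by a single substitution that destroys one AluI site.
species_a = "ATGCAGCTTTGACCGTACGGATTAGCTAACC"
species_b = "ATGCAGGTTTGACCGTACGGATTAGCTAACC"

print(rflp_profile(species_a))
print(rflp_profile(species_b))
```

In this toy case the AluI patterns separate the two sequences while the RsaI patterns are identical, which mirrors why diagnostic protocols usually screen several enzymes before one is chosen.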

Tandem repeats account for 2.7% of the C. elegans genome, including variable number tandem repeats (VNTR), satellites, mini-satellites and simple sequence repeats (SSRs or microsatellites) (The C. elegans sequencing consortium, 1998). VNTRs have been widely used for DNA fingerprinting in diagnostics. Stanton et al. (1997) used the VNTR located in the intergenic region of the mitochondrial genome of root-knot nematodes to identify cultured isolates. Hyman and Whipple (1996) evaluated the VNTR for genetic variation between individuals or populations and for species differentiation of root-knot nematodes. Whipple et al. (1998) re-evaluated the power of the VNTR locus among and within populations of M. incognita and indicated its limitation for population differentiation because of the high intra-genomic polymorphism.

Two satellites were cloned and characterised from the root-knot nematode species M. hapla and M. incognita (Piotte et al., 1994). Both were highly reiterated StyI satellites and shared no similarity with known repetitive elements. They were also distinct from each other. Castagnone-Sereno et al. (1992) examined the relationships between root-knot nematode species with different reproductive behaviour and used cloned repetitive sequences as probes in Southern blot analysis of restriction-digested genomic DNA. Applications of satellite probes are not restricted to root-knot nematodes. Stratford et al. (1992) isolated a repetitive element and used it as a diagnostic probe for cyst-forming nematodes. Tarès et al. (1993) cloned and characterised a conserved satellite sequence of B. xylophilus. Recently, Stack et al. (2000) applied the species-specific satellite probe designed by Grenier et al. (1996) to characterise Heterorhabditis indica isolates, while ITS-RFLP and IGS-RFLP were used to profile the isolates. Satellites, however, are not abundant in every nematode taxon. The genome of C. elegans contains a low amount (less than 1%) of satellite sequences (La Volpe et al., 1988; Uitterlinden et al., 1989).

Microsatellites or SSRs are widely distributed in the genomes of eukaryotic organisms and display high levels of polymorphism (Litt & Luty, 1989; Weber & May, 1989; Tautz, 1989). Because of this high polymorphism and their ease of interpretation, uninterrupted repeats have been applied for different purposes (Weber, 1990). However, compound microsatellite loci with mixed repeat motifs, or loci interrupted by other sequences, complicate the interpretation of length patterns at the population level (Freimer & Slatkin, 1996); sequence analysis is thus essential.
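The distinction between uninterrupted and interrupted repeats can be illustrated with a simple sequence scan. The Python sketch below reports perfect (uninterrupted) tandem repeats using a regular expression with a back-reference. The thresholds (motifs of 1 to 6 bp, at least five copies), the function name and the test sequences are assumptions chosen for this example; published SSR screens use a variety of criteria.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Find perfect tandem repeats; returns a list of (start, motif, copies).

    The lazy quantifier makes the regex try the shortest motif first, so a
    run such as GAGAGAGAGA is reported with motif 'GA' rather than 'GAGA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Invented sequence containing (GA)8 and (ATC)5 separated by unique bases.
example = "TTCG" + "GA" * 8 + "CCT" + "ATC" * 5 + "GG"
print(find_ssrs(example))

# An interrupted locus, (GA)3 T (GA)3, is not reported as a single repeat.
print(find_ssrs("GAGAGATGAGAGA"))
```

The second call returns no hit because the single inserted base breaks the run into two short arrays, a small-scale picture of why length alone is ambiguous for compound or interrupted loci.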

High-resolution maps for mouse and human have revealed an even distribution of microsatellites in these genomes (Dib et al., 1996; Dietrich et al., 1996). The human genome sequencing project revealed approximately 3% SSRs in the human genome (International Human Genome Sequencing Consortium, 2001). The mouse genome (Mouse genome sequencing consortium, 2002) revealed a higher proportion of SSRs. In addition, the distribution of SSRs differs between the two genomes. Katti et al. (2001) confirmed the differential distribution of SSRs in eukaryotic genomes by identifying and characterising SSRs from the human, fly, nematode, plant and yeast genome sequences. Microsatellites are rarely located in coding sequences (Hancock, 1995a) because of the instability caused mainly by slip-strand mispairing during DNA replication (Fresco & Alberts, 1960; Sia et al., 1997a), resulting in a lethal frame shift mutation (motif with three nucleotides such as (ATC)n). Because of their abundance and high polymorphism, microsatellite markers are widely used in population genetic studies, genome typing systems and genome mapping. The analysis of microsatellites can be fully automated (Hancock, 1999). A review of microsatellite isolation procedures was published by Zane et al. (2002).

There are a few applications of microsatellites to studies of the population diversity of animal-parasitic nematodes (Hoekstra et al., 1997; Otsen et al., 2000; Fisher & Viney, 1996; Barker & Bundy, 2000; Underwood et al., 2000). Microsatellites have rarely been used by plant-parasitic nematologists. Recently, Thiéry & Mugniéry (2000) studied microsatellite specificity and diversity among Globodera species and populations. De Luca et al. (2002) characterised a (GAAA)n microsatellite that was linked to direct repeats in root-knot nematodes. He et al. (submitted) characterised seven microsatellites from X. index.
The microsatellites were found to be species-specific and linked to retrotransposon-like elements.

Interspersed repeats are abundant in the genomes sequenced so far. Thirty-eight dispersed repeat families were identified in the C. elegans genome (The C. elegans sequencing consortium, 1998). Interspersed repeats are classified into four types: (i) short interspersed elements (SINEs), (ii) long interspersed elements (LINEs), (iii) LTR (long terminal repeat) retrotransposons and (iv) DNA transposons. SINEs are non-autonomous retrotransposons. It is speculated that they rely on LINEs to achieve retrotransposition (Okada et al., 1997). SINEs consist of the A and B boxes of the RNA polymerase III promoter that are derived from a tRNA or 7SL (a component of the signal recognition particle). LINEs are autonomous retrotransposons with two open reading frames coding for the enzymes necessary for transposition. LINEs and tRNA-derived SINEs both have a poly-A tail. LTR retrotransposons are flanked by long terminal direct repeats that contain the transcriptional regulatory elements. The three genes present (gag, pol and env) code for the proteins required for reverse transcription and integration (Malik et al., 2000). DNA transposons are similar to bacterial transposons. They encode a transposase and have terminal inverted repeats (Smit et al., 1996).

Retrotransposons can be developed into several types of molecular markers for plant genetic studies (Kumar & Hirochika, 2001). Retrotransposons are also considered to be very useful for biodiversity and phylogeny studies. Shedlock and Okada (2000) discussed the great potential of SINEs for systematics and phylogeny in eukaryotes. Several publications deal with the phylogeny and biodiversity of legumes and cereals (Gribbon et al., 1999; Peare et al., 2000). Hoekstra et al. (1999; 2000) described the genetic variation of a Tc1-like transposon (a DNA transposon), and its linkage to chemotherapeutic resistance, among populations of Haemonchus contortus, a sheep intestinal parasitic nematode. The Tc1 transposon was reported from C. elegans. Korswagen et al. (1996) made use of the abundant sites of Tc1 transposons as sequence-tagged sites for mapping mutations. There are no reports on interspersed elements in plant-parasitic nematodes.

For plant nematologists the most attractive coding regions for diagnostic studies appear to be the nuclear ribosomal RNA genes (18S and 28S) or mitochondrial genes, because of their widespread use in other organisms and their confirmed power for phylogeny, population genetics, ecology and diagnostics. Blouin et al. (1998) studied substitution in mitochondrial genes and concluded that conserved genes (COI, Cytb) are useful for phylogenetic studies of closely related species, whereas fast-evolving genes such as ND4 are good tools for population genetic studies. Blouin et al. (1999) used the mitochondrial ND4 gene to analyse the genetic structure of Heterorhabditis marelatus and the effects of life cycle variation on genetic structure. Courtright et al. (2000) studied the population diversity of Scottnema lindsayae using partial 28S nuclear rRNA (D2 and D3 expansion regions) and 16S mitochondrial rRNA gene sequences. Powers and Harris (1993) designed a PCR-RFLP analysis to differentiate five root-knot nematodes based on the fragment amplified between the COII and 16S rRNA genes of the mitochondria. Stanton et al. (1997) improved the identification method of Powers and Harris (1993) by designing a multiplexed protocol for simultaneous amplification of two regions of the mitochondrial genome to separate all haplotypes reported by Hugall et al. (1994). Joyce et al. (1994) used combined PCR-RFLP analyses of the COII-16S rRNA fragment and ITS to differentiate Heterorhabditis isolates. A phylogenetic analysis of burrowing nematodes was based on the combined analysis of partial 28S nuclear rRNA (D2 and D3 expansion regions) and the ITS region (Kaplan et al., 2000). De Ley et al. (1999) used the D2 and D3 sequences to assist their characterisation of Cephalobidae species. Al-Banna et al. (1997) used the D3 sequences to construct the phylogeny of Pratylenchus. The most impressive work was published by Blaxter et al. (1998), who established a molecular evolutionary framework for the phylum Nematoda based on the sequences of the nuclear 18S rRNA gene. Floyd et al. (2002) tried to construct a molecular barcode for the identification of soil nematodes based on the 18S rRNA gene. Stetterquist et al. (1996) reported the application of probes made from major sperm protein sequences to nematode species identification. Some Pratylenchus species were successfully identified using this method.

2.3 Whole genome methods

The section above summarised the applications of different genomic regions to the phylogeny, population genetics and diagnosis of nematodes, with special focus on research concerning plant-parasitic nematodes. Another important way to mine the genome is to analyse the whole genome directly. Genome sequencing is clearly the best approach to unravel the mystery of the whole genome. However, such giant projects require large inputs of labour and financial investment.
Different genome analysis methods have been designed to collect sufficient information for a previously specified research goal. Early research tended to use Southern blot analysis of genomic DNA, especially of the small mitochondrial genome, digested with restriction enzymes. Radice et al. (1988) compared two species, Heterodera glycines and Heterodera schachtii, using restriction patterns of the mitochondrial genome. Powers and Sandall (1988) compared root-knot nematode species with the aid of restriction patterns of the mitochondrial genome. Hiatt et al. (1995) analysed intra- and inter-population variation in M. arenaria using cDNA, interspersed element and IGS repeat probes for Southern analyses. The results confirmed that selected probe sequences can be used to study population structure and genome evolution.

Since the introduction of PCR, many rapid methods have been designed to take random signatures from the genome, such as AFLP (amplified fragment length polymorphism), DAF (DNA amplification fingerprinting) and RAPD (random amplified polymorphic DNA). They provide scientists with fast and efficient ways to extract the required information from the targeted genomes.

AFLP is a powerful DNA marker technique based on the selective amplification of restriction fragments by PCR (Vos et al., 1995; Zabeau & Vos, 1993). It combines the advantages of RFLP and PCR to achieve a fast and reliable system for developing dense markers. A common AFLP protocol includes several steps: restriction digestion of the whole genome with two enzymes (a six-base cutter such as EcoRI and a four-base cutter such as MseI); adapter ligation to the restriction fragments; pre-amplification with adapter-specific primers carrying one selective base at the 3' end; amplification with adapter-specific primers carrying three selective bases at the 3' end (the complexity of the targeted genome determines the selective bases used); and PAGE and scoring. AFLP has been applied to biodiversity studies (Paul et al., 1997; Yee et al., 1999; Krauss, 2000), population genetics (Yan et al., 1999), positional cloning (Liscum & Oeller, 1999), sex determination (Griffiths & Orr, 1999), phylogeny (Sharma et al., 1996) and transcriptome analysis (Bachem et al., 1998). Plant nematologists have noticed the power of this novel technique. Van Der Beek et al. (1998) studied genetic variation among parthenogenetic root-knot nematode species by AFLP and 2D-protein analyses in comparison with morphological characters. Wang et al. (2001) combined AFLP analysis and host preference to group Heterodera trifolii into two putative races.
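The selection logic of the AFLP steps described above can be illustrated in silico. The following Python sketch is only an illustration under stated assumptions: the toy sequence and the function names are hypothetical, adapters and PCR are not modelled, and a fragment is simply kept when it lies between one EcoRI site (GAATTC) and one MseI site (TTAA) and its internal bases match the selective bases of both primers.

```python
import re

ECORI = "GAATTC"  # recognition site of the six-base cutter
MSEI = "TTAA"     # recognition site of the four-base cutter

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq):
    """Sorted (start, end, enzyme) tuples for both recognition sites."""
    sites = [(m.start(), m.end(), "E") for m in re.finditer(ECORI, seq)]
    sites += [(m.start(), m.end(), "M") for m in re.finditer(MSEI, seq)]
    return sorted(sites)

def aflp_fragments(seq, eco_sel, mse_sel):
    """Inserts between adjacent EcoRI/MseI sites that match the selective bases.

    eco_sel and mse_sel are the selective bases at the 3' end of the
    EcoRI-side and MseI-side primers (hypothetical parameter names).
    """
    kept = []
    sites = find_sites(seq)
    for (_, end1, enz1), (start2, _, enz2) in zip(sites, sites[1:]):
        if enz1 == enz2:
            continue  # only fragments with two different ends are considered
        insert = seq[end1:start2]
        if enz1 == "M":
            insert = revcomp(insert)  # orient the EcoRI end to the left
        # The MseI primer reads into the fragment from the opposite strand.
        if insert.startswith(eco_sel) and revcomp(insert).startswith(mse_sel):
            kept.append(insert)
    return kept

# Toy sequence (invented for illustration only).
toy = ("CC" + ECORI + "ACGGGTTTCA" + MSEI + "GG"
       + ECORI + "CATGCATG" + MSEI + "AC")
print(aflp_fragments(toy, "", ""))        # no selective bases: every E/M fragment
print(aflp_fragments(toy, "A", "T"))      # one selective base per primer
print(aflp_fragments(toy, "ACG", "TGA"))  # three selective bases per primer
```

Each added selective base reduces the number of amplified fragments, which is why the number of selective bases is matched to the complexity of the genome.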
However, because of the high cost and the relatively complicated manipulation skills required for AFLP analysis, many nematologists still prefer RAPD, a technique introduced by Williams et al. (1990). Compared to AFLP, RAPD is cheap and easy to perform, and is therefore more accessible to researchers who are not experts in molecular techniques but who wish to use powerful molecular tools to solve problems. RAPD analysis uses a short oligonucleotide (10 bp) to obtain genome signatures, which are observed by agarose electrophoresis. A large number of articles published in recent years used the RAPD technique to approach a wide range of problems.
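The principle of RAPD, a single short primer that yields a product wherever it anneals to opposite strands within an amplifiable distance, can be sketched as follows. This is a simplified illustration, not a description of any published protocol: the primer sequence, the toy template and the 3000 bp size limit are assumptions, only perfect matches are considered, and every compatible pair of sites is reported.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def rapd_products(template, primer, max_len=3000):
    """Sizes of products expected from one arbitrary primer (perfect matches only).

    max_len is an assumed upper limit for an amplifiable product.
    """
    k = len(primer)
    rc = revcomp(primer)
    last = len(template) - k + 1
    # Positions where the primer matches the top strand (forward priming).
    fwd = [i for i in range(last) if template.startswith(primer, i)]
    # Positions where the primer anneals to the bottom strand (reverse priming).
    rev = [i for i in range(last) if template.startswith(rc, i)]
    sizes = []
    for i in fwd:
        for j in rev:
            size = j + k - i
            if j > i and size <= max_len:
                sizes.append(size)
    return sorted(sizes)

# Hypothetical 10-mer and toy template, for illustration only.
primer = "GGTGCGGGAA"
template = primer + "A" * 50 + revcomp(primer) + "C" * 20
print(rapd_products(template, primer))
```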

Genetic variation of Nacobbus aberrans was studied by a combination of RAPD and ITS-RFLP analyses (Ibrahim et al., 1997). Diversity of Radopholus spp. isolates was analysed by isozyme, ITS-RFLP and RAPD analyses; hierarchical analysis based on the RAPD scoring matrix separated the isolates into two groups (Fallas et al., 1996). Genetic variation of tropical root-knot nematode species was assessed by RAPD, and virulent lines of M. arenaria could be differentiated from avirulent lines and other species (Blok et al., 1997). The genomic diversity of R. similis populations was assessed by RAPD (Hahn et al., 1994; Hahn et al., 1996; Elbadri et al., 2002). Kaplan et al. (1996) developed STS (sequence tagged sites) derived from RAPD analysis for burrowing nematodes. Combined analysis based on the STS and host range assays revealed that R. similis and R. citrophilus are not reproductively separated (Kaplan et al., 1997). Another report, from Kaplan and Opperman (1997), implies that R. citrophilus does not represent a unique species according to the results of RAPD, STS and isozyme analyses. Nadler (1996) applied RAPD and isozyme analyses to study the genetics of geographic variation in Ascaris suum. Williamson et al. (1997) developed species-specific fragments for M. hapla and M. chitwoodi based on the results of RAPD analysis. Population variability of M. javanica was evaluated by RAPD (Carneiro et al., 1997). RAPD was also applied to study the inter- and intraspecific diversity of Globodera species and populations (Thiéry et al., 1997; Bendezu et al., 1998). Fullaondo et al. (1999) developed specific primers from RAPD analysis to identify Globodera rostochiensis and G. pallida. Zijlstra et al. (2000) obtained SCARs (sequence characterised amplified regions) from RAPD analysis for fast and accurate identification of M. incognita, M. javanica and M. arenaria. The same SCARs were applied to identify the root-knot nematode species occurring in South Africa (Fourie et al., 2001). Despite these wide applications, RAPD results usually suffer from instability and poor reproducibility between labs. Careful optimisation of the reaction conditions may help establish a stable system.

DAF is very similar to RAPD (Caetano-Anollés et al., 1991). It uses a short oligonucleotide primer (7-8 bp) to obtain a random signature from the genome. The primer can be linear or can have a mini-hairpin structure (Hirao et al., 1994; Caetano-Anollés & Gresshoff, 1994). DAF uses a much higher primer concentration than other random genome sampling techniques: primer-template ratio for DAF is compared to less than one for other techniques (Caetano-Anollés, 1996). Taking advantage of the sensitivity of silver staining and the high resolution offered by PAGE, DAF produces much more complex fingerprints than RAPD does. Several techniques were developed from DAF: ASAP (arbitrary signatures from amplification profiles) (Caetano-Anollés & Gresshoff, 1996) and MAAP (multiple arbitrary amplicon profiling) (Caetano-Anollés, 1994). DAF and ASAP were used to assess the genetic instability of Bermudagrass (Cynodon) cultivars (Caetano-Anollés, 1998). Sen et al. (1997) applied DAF to develop a DNA marker for bread wheat. Baum et al. (1993) used this technique for species identification and quantification of genotypic diversity of root-knot nematodes. Yet this technique has not found wide application in Nematology.

2.4 Conclusion and perspectives

Mining genomic resources helps nematologists to solve many problems concerning species diagnosis, systematics, phylogeny and population genetics. It sheds new light on the development of different research fields in Nematology. In this review, the applications of different non-coding and coding regions of the genome were summarised. Each of them is powerful in some studies but limited in others. A careful selection should be made before the research work starts. If necessary, a combined data set can be used to give better solutions. At present, most researchers select ribosomal RNA genes and mitochondrial genes for diagnosis, population genetics and phylogeny studies. The great potential of interspersed repeats and SSRs has not been given much consideration because little information about them is available for nematode taxa. This can be ascribed to the scarcity of DNA resources for most plant-parasitic nematode taxa. Whole genome amplification (WGA), a very useful technique developed for clinical medical research, was reviewed. It may be a good solution for plant nematologists who are constrained by limited DNA sources.
Applications of three random genome-sampling techniques were reviewed: AFLP, RAPD and DAF. AFLP is apparently the best choice if the lab is not limited in terms of financial resources and can rely on advice from molecular experts. However, considering the present situation of nematology labs, RAPD will remain the main choice for studies on diversity, identification and molecular marker development. DAF is an inexpensive alternative to RAPD; it yields more complex and more reproducible fingerprinting patterns than RAPD.

Looking back at the applications of molecular techniques to Nematology during the past decade, it was clearly a fruitful period. In the coming years we can expect more and richer molecular data to be collected. Scientists will gain many new insights into new research fields.

3 General materials and methods

3.1 Taxa sampling

The nematode species and populations used in this work are listed in Table 3.1.

3.2 Preservation of specimens

Nematode samples were preserved in 1 M NaCl and stored at -70 °C. In this way, both morphological characters and nucleic acids can be maintained in good condition for up to five years.

3.3 Total DNA extraction from single nematodes

One juvenile or adult nematode was transferred into 13 µl ddH2O and cut into 2-5 pieces with a sterilised scalpel. Ten µl of 2× worm lysis buffer (20 mM Tris-HCl (pH 8.0), 100 mM KCl, 3.0 mM MgCl2, 2.0 mM DTT, 0.9% Tween 20) and 0.1 µl of proteinase K stock solution (20 mg/ml) were added to a 200 or 500 µl Eppendorf tube. The nematode fragments were pipetted in 9.9 µl ddH2O and added to the tube, which was then briefly centrifuged and stored at -70 °C for at least 10 minutes. Subsequently, the tube was incubated at 65 °C for 1-2 hours and the proteinase K was denatured at 95 °C for 10 minutes. Finally, the DNA suspension was cooled to 4 °C and stored at -20 °C until use. No additional purification was required for the subsequent PCR procedure.

23 3. Materials and methods 3.4 Gel staining Samples were run on 1-2% agarose gels in TBE or TAE buffer, stained in Ethidium Bromide (0.5 µg /ml) for minutes and visualized under UV. Table 3.1. Taxon sampling for longidorids used in the this study Nematode species Code Locality of sample Host Source L. africanus CA46 CA, US Chenopodium sp F. Lamberti L. apulus CAN23 Mola di Bari, Italy T.C. Vrain L. arthensis CAN115 Suter, Switzerland T.C. Vrain L. athesinus EU105 Type locality, Italy D.J.F. Brown L. attenuatus CAN17 Germany T.C. Vrain L. breviannulatus CAN268 Nebraska T.C. Vrain L. caespiticola EU20 Scotland Potato D.J.F. Brown L. cameniae EU130 Hangzhou, China Camellia J. Zheng L. carpathicus Carpa Kirchbichel, Germany T. Rubtsova L. diadecturus CAN31 Elkins, White river Elm T.C. Vrain L. edmundsi VE275 Carribean sea beach F. Lamberti L. elongatus EU1 Ingraston, peebles, grass D.J.F. Brown L. euonymous EU124 Zabagr, Hungary grape D.J.F. Brown L. goodeyi EU26 Peebles, Scotland D.J.F. Brown L. helveticus SV46 Camenzuid, Switzerland F. Lamberti L. intermedius Inter Planegg, Germany T. Rubtsova L. juvenilis CAN196 Moca, Slovakia T.C. Vrain L. latocephalus BLUE Nylstrom, South Africa F. Lamberti L. latocephalus CAN114 Greece T.C. Vrain L. leptocephalus EU8 Scotland Potato D.J.F. Brown L. macrosoma LM1 Research Station, Switzerl. Cherry L. piceicola EU112 Branisko, Slovakia D.J.F. Brown L. profundrum Prof Gandesbergen, Germany T. Rubtsova L. sp GG6 Byron, Georgia, US Pecan F. Lamberti L. sturhani Vise348 Auggen, Germany T. Rubtsova Paralongidorus maximus Max592 Harrier Sand, Germany T. Rubtsova Paralongidorus sp CAN201 Bosaka H1 SL Nut trees T.C. Vrain Paratrichodorus PMA F. Decraemer macrostylus Pratylenchus penetrans PPE Lab culture, Belgium Xiphidorus minor VE269 Amazon forest, Venezuela Woods F. Lamberti Xiphidorus sp CAN248 Argentina T.C. Vrain X. abrantinum CAN223 Portugal T.C. Vrain X. americanum PE40 Pennsyvalia, US Apple F. Lamberti X. 
americanum PEAX Pennsylvania, US Single female F. Lamberti culture X. americanum CA22 Davis, CA, US Grapes F. Lamberti X. americanum PE24 Pennsylvania, US Peach F. Lamberti 20

24 3. Materials and methods X. americanum MS Scubey Forest, Mississipi, Grass F. Lamberti US X. americanum XA1 US D.J.F. Brown X. americanum XA2 Clarksuille, AR. Apple D.J.F. Brown X. americanum CA54 Kearney, CA, US Grape F. Lamberti X. americanum group MD2 Mariland, US Peach F. Lamberti X. americanum group CA62 Twin Cities, CA, US Grapes F. Lamberti X. americanum group CA55 Kearney, CA, US Apple F. Lamberti X. americanum group PE5 Pennsylvania, US Peach F. Lamberti X. americanum group PE6 Pennsylvania, US Pear F. Lamberti X. americanum group PE7 Pennsylvania, US Apple F. Lamberti X. americanum group PE9 Pennsylvania, US Cherry F. Lamberti X. americanum group PE26 Pennsylvania, US White Rose F. Lamberti X. americanum group PE29 Pennsylvania, US Cherry F. Lamberti X. americanum group PE32 Pennsylvania, US Apple F. Lamberti X. americanum group PE39 Pennsyvalia, US Cherry F. Lamberti X. americanum group PE12 Pennsylvania, US Apple F. Lamberti X. americanum group PE15 Pennsylvania, US Cherry F. Lamberti X. americanum group AB7 Auburn Univ Campus, AB, F. Lamberti US X. americanum group AB9 Auburn Univ Campus, AB, F. Lamberti US X. americanum group CA4 Barsoom, CA, US Apple F. Lamberti X. americanum group CA6 Colombine, CA, US Citrus F. Lamberti X. americanum group CA7 Colombine, CA, US Grape F. Lamberti X. americanum group CA8 Colombine, CA, US Grapes F. Lamberti X. americanum group CA10 Caratane, CA, US Almond F. Lamberti X. americanum group CA11 Caratane, CA, US Grape F. Lamberti X. americanum group CA12 McKewry, CA, US Grape F. Lamberti X. americanum group CA13 CA, US Grape F. Lamberti X. americanum group CA14 Wescot, CA, US Grape F. Lamberti X. americanum group CA15 Wescot, CA, US Nectarin sp F. Lamberti X. americanum group CA16 Marthedahl, CA, US Grape F. Lamberti X. americanum group CA18 Asadorian, CA, US Grape F. Lamberti X. americanum group CA24 Winters, CA, US Grape F. Lamberti X. americanum group CA31 Plenada, CA, US Plum F. Lamberti X. 
americanum group CA40 CA, US Grape F. Lamberti X. americanum group CA43 CA, US F. Lamberti X. americanum group CA45 Kearney proj 30, CA, US, F. Lamberti X. bakeri CAN27 Fayetteville Honeysuckle T.C. Vrain X. basiri EU125 San Jose, Cuba Banana D.J.F. Brown X. basiri group EU126 San Jose Cuba 9 Banana D.J.F. Brown X. brasiliense EU41 Para State, Brazil Eliaeis sp D.J.F. Brown X. brevicollum Xb1 South Africa A. Coomans X. brevicollum EU37 Piricicaba, Brazil F. Lamberti X. brevicollum BB1 Brazil A. Coomans X. brevicollum EU132 Beijing, China J. Zheng X. brevicollum EU29 Brazil Grape F. Lamberti X.brevisicum EU5 Braga, Protugal Quercus sp. D.J.F. Brown 21

25 3. Materials and methods X. bricolensis CAN39 Winfield, B. C., Canada Grapes T. C.Vrain X. bricolensis PE18 Pennsylvania, US Corn, cherry F. Lamberti X. californicum CA3 Type locality, CA, US Grape F. Lamberti X. californicum CA33 Arena, CA, US Grape F. Lamberti X. californicum CA50 Riverside, CA, US Olive F. Lamberti X. chambersi AB3 Lee county, Alabama, US F. Lamberti X. coxi GG10 Jenkil Tsloud, GG, US F. Lamberti X. coxi GG11 Jenkil Tsloud, GG, US Juniperus sp F. Lamberti X. dentatum EU111 Branisko, Slovakia D.J.F. Brown X. diversicaudatum EU7 Forest, Braga, Portugal Eucalyptus sp D.J.F. Brown X. diffusum CAN162 South Africa T.C. Vrain X. diffusum GG7 Jenkil Tsloud, GG, US Oak F. Lamberti X. elongatum CAN24 Israel T.C. Vrain X. georgianum GG14 Jenkil Tsloud, GG, US Juniperus sp F. Lamberti X. incognitum PE42 Pennsylvania, US Corn F. Lamberti X. index EU25 Argentina, FL Grape D.J.F. Brown X. index T31 Italy Grape F. Lamberti X. index T8 Italy Grape F. Lamberti X. index MC21 Morocco Grape F. Lamberti X. index SPA1 Spain Grape F. Lamberti X. index CA28 Plenada, CA, US Grape F. Lamberti X. insigne EU131 Hangzhou, China D.J.F. Brown X. italiae BAR1 Italy Olive F. Lamberti X. krugi EU36 Piricicaba, Brazil D.J.F. Brown X. madeirense EU14 Coimbra, Portugal Potato D.J.F. Brown X. madeirense EU103 Coimbra, Portugal Peach D.J.F. Brown X. pachtaicum CA56 Winters Pistachio, CA, US Peach F. Lamberti X. pachtaicum EU3 Pentamodi, Crete Grape D.J.F. Brown X. pachtaicum EU4 North Greece Garden D.J.F. Brown X. pachtaicum EU115 Castelnuovo Berarjengo Olive D.J.F. Brown Italy X. pachtaicum EU2 Heraklion, Crete Olive and grass D.J.F. Brown X. pachtaicum T48 Castelnuovo Berarjengo Grape F. Lamberti Italy X. pachtaicum EU118 Crete D.J.F. Brown X. pachtaicum ML8 Kausani, Moldova Grape D.J.F. Brown X. pachtaicum ML21 Albata, Moldova Grape D.J.F. Brown X. pacificum GG15 Pike, GG, US Peach F. Lamberti X. pacificum MD1 Mariland, US Yellow poplar F. Lamberti X. 
pachydermum EU109 Portugal, type locality D.J.F. Brown X. peruvianum GG3 Byron, Georgia, US Blueberry F. Lamberti X. pyrenaicum EU121 Cyprus Grape D.J.F. Brown X. radicicola V1273 Chu momray, Vietnam X. rivesi PE20 Pennsylvania, US Merdle (wheat) F. Lamberti X. rivesi PE33 Pennsylvania, US Yellow Poplar F. Lamberti X. rivesi PE23 Pennsylvania, US Apple F. Lamberti X. rivesi PE1 Pennsylvania, US White Pine F. Lamberti X. rivesi PE2 Pennsylvania, US Apple F. Lamberti X. santos CAN224 Portugal T. C. Vrain 22

26 3. Materials and methods X. santos UT14 Moab, UT, US Juniperus sp F. Lamberti X. savanicola CAN72 Dakar, Senegal T.C. Vrain X. setariae EU27 Brazil. J. Furlan sample 1 Grape D.J.F. Brown X. simile ML5 Anenii Nou, Moldova Grape D.J.F. Brown X. simile ML35 Mikauti, Moldova Apple D.J.F. Brown X. sp. EU123 Zabagr, Hungary Grape D.J.F. Brown X. sp EU110 Portugal type locality D.J.F. Brown X. taylori EU117 Spa, Slovakia D.J.F. Brown X. taylori TN1 Treuna, Italy Grape X. thornei CO3 Colorado, US Apple F. Lamberti X. thornei CO5 Colorado, US Apple F. Lamberti X. thornei OR4 Molella, Oregon, US Grape F. Lamberti X. utahense NV2 Mesquite, Nevada, US Poplar F. Lamberti X. utahense NV3 Wells, NV, US Aspen F. Lamberti X. utahense NV5 Wells, NV, US Juniperus sp F. Lamberti X. utahense UT4 Coal Creek, Utah, US Ulmus sp F. Lamberti X. utahense UT6 Cove Fort, UT, US Apple F. Lamberti X. utahense UT8 Burneville, UT, US Shrub F. Lamberti X. utahense UT24 Holleday, UT, US Poplar F. Lamberti X. utahense UT25 Holleday, UT, US Cotton F. Lamberti 23


4 Mitochondrial genome of Xiphinema americanum sensu stricto (Nematoda: Enoplea): the most compacted mitochondrial genome of any Metazoan

4.1 Introduction

The animal mitochondrial genome is usually a small circular molecule (14-20 kb) containing 37 genes: 13 protein genes coding for subunits of the enzymes forming the respiratory chain complexes, 2 rRNA genes and 22 tRNA genes (Wolstenholme, 1992). As more animal species are investigated, more and more exceptional cases are being reported. Linear mitochondrial genomes were reported to exist in cnidarian classes (Bridge et al., 1992). Multipartite mitochondrial genomes were found in Globodera pallida (Nematoda: Chromadorea) (Armstrong et al., 2000) and the primitive mesozoan animal Dicyema (Watanabe et al., 1999). The ATP8 gene is absent from all fully sequenced mitochondrial genomes of nematodes, including Meloidogyne javanica (Okimoto et al., 1991), Caenorhabditis elegans and Ascaris suum (Okimoto et al., 1992) and Onchocerca volvulus (Keddie et al., 1998), with the exception of Trichinella spiralis (Lavrov & Brown, 2001). It is also absent in Mytilus edulis (Bivalvia) (Hoffmann et al., 1992). The mitochondrial genome of cnidarians has lost almost all tRNA genes and contains one or two other genes, such as mutS, that are not found in other metazoans (Beagley et al., 1995, 1996, 1998; Beaton et al., 1998; Pont-Kingdon et al., 1998). Repeated genes were discovered in the mitochondrial genome of Romanomermis culicivorax (Nematoda: Adenophorea) (Hyman & Azevedo, 1996). In summary, a large proportion of the deviations from the general structure of animal mitochondrial genomes has been reported from the very few completely sequenced nematodes.

In this article, we report another completely sequenced mitochondrial genome of a nematode, Xiphinema americanum sensu stricto (Nematoda: Adenophorea). This is a plant ecto-parasitic nematode that reproduces by parthenogenesis and is capable of transmitting plant viruses. Some features of this mitochondrial genome are unique among metazoan mitochondrial genomes. As in the published mitochondrial genome of the Enoplean nematode Trichinella spiralis (Lavrov & Brown, 2001), its genes are transcribed from both strands, whereas the genes of the mitochondrial genomes of Chromadorean nematodes are transcribed from the same strand. However, the ATPase subunit 8 gene that is present in Trichinella spiralis is absent from X. americanum sensu stricto, as reported for the mitochondrial genomes of several Chromadorean nematodes. Translation of all protein genes is initiated with the ATA codon. The TTG and GTT codons reported to be initiation codons for Caenorhabditis elegans and Ascaris suum do not play the same role in the Xiphinema americanum mitochondrial genome. Translation usually terminates, or has the potential to terminate, with the complete termination codon TAA/TAG. The predicted large rRNA gene is interrupted by tRNA-Met (CAU), which has never been found in other metazoan mitochondrial genomes. Interrupted or fragmented rRNA genes have been discovered in eubacteria, non-metazoan mitochondria, chloroplasts and nucleocytosolic compartments of eukaryotes (Gray & Schnare, 1996). Introns were also reported for archaeal rRNA genes (Kjems & Garrett, 1991; Takai & Horikoshi, 1999). However, the exact 5' end of the 16S rRNA gene needs to be confirmed by analysis of cDNA in order to accept or reject the prediction of the interrupted gene.

4.2 Material and Methods

Total DNA extraction

DNA extracted from a single juvenile or adult was used for the PCR. The details of the extraction procedure are described in Chapter 3.
DNA for Southern blotting analysis was prepared from around 1000 juveniles and adults using a DNeasy tissue kit (Qiagen GmbH, Postfach, Germany).

Mitochondrial genome amplification and cloning

Because no sequence information was available for the mitochondrial genome of X. americanum, I aligned the published sequence of C. elegans (Accession: NC_001328) with all deposited sequences by BLASTN searching of the GenBank database. Two primers, COI-F (5'-GATTTTTTGGKCATCCWGARG-3') and COI-R (5'-CWACATAATAAGTATCATG-3'), were designed. A 414 bp fragment was successfully amplified using an AmpliTaq Gold PCR kit (ABI) and cloned into the pGEM-T vector (Promega, Leiden, The Netherlands). The PCR was performed in a PTC-100/200 thermocycler (MJ Research, Biozyme, San Diego, USA). The cycling conditions were 95 °C for 10 min; 5 cycles of 94 °C for 30 sec, 45 °C for 40 sec and 72 °C for 1 min; another 35 cycles of 94 °C for 30 sec, 37 °C for 30 sec and 72 °C for 1 min; and a final extension at 72 °C for 10 min. The sequence was determined as described in the section on sequencing below and was confirmed as part of the COI gene by BLASTN searching of the GenBank database.

Another two primers were designed within the partial COI sequence for the long PCR procedure: COIL-1 (5'-TGGTACCCCAAATGGAGAAGGCT-3') and COIR-1 (5'-GAGCACATCACATGTTTAGAGTGG-3'). The rest of the mitochondrial genome was successfully amplified by long and accurate PCR (Cheng & Stoneking, 1996) using an Elongase Kit (Invitrogen, Merelbeke, Belgium). The PCR consisted of 22 cycles of denaturing at 94 °C for 30 sec, annealing for 30 sec and extension at 72 °C for 22 min; another 13 cycles of denaturing at 94 °C for 30 sec, annealing at 57 °C for 30 sec and extension at 72 °C for 22 min; and a final extension at 72 °C for 10 min. A touchdown annealing profile was used for the first 22 cycles, with the annealing temperature decreasing by 0.5 °C each cycle until it reached 57 °C. We also added 15 sec of extension time per cycle during the last 13 cycles. Each PCR produced a single fragment of approximately 12.5 kb when visualised under UV after separation in a 1% agarose gel and staining with ethidium bromide. The fragment was recovered from the gel by excision and purified with a gel-purification kit (Qiagen GmbH, Postfach, Germany).
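The long PCR programme described above can be written out cycle by cycle. The sketch below is only a reading of the text under two labelled assumptions: the starting annealing temperature is not stated in the text and is back-calculated here so that 57 °C is reached at the first fixed-temperature cycle, and the 15 sec extension increment is taken to start with the first of the last 13 cycles. The function and field names are hypothetical.

```python
def long_pcr_program(touchdown_cycles=22, fixed_cycles=13,
                     final_anneal_c=57.0, step_c=0.5,
                     extension_min=22.0, increment_s=15):
    """Per-cycle annealing temperature and extension time for the long PCR."""
    # ASSUMPTION: start temperature chosen so the touchdown ends at final_anneal_c.
    start_c = final_anneal_c + step_c * touchdown_cycles
    base_ext_s = extension_min * 60
    program = []
    for c in range(touchdown_cycles):
        program.append({"cycle": c + 1,
                        "anneal_c": start_c - step_c * c,
                        "extension_s": base_ext_s})
    for c in range(fixed_cycles):
        # ASSUMPTION: the increment already applies to the first fixed cycle.
        program.append({"cycle": touchdown_cycles + c + 1,
                        "anneal_c": final_anneal_c,
                        "extension_s": base_ext_s + increment_s * (c + 1)})
    return program

for step in long_pcr_program():
    print(step["cycle"], step["anneal_c"], step["extension_s"])
```

Writing the schedule out in this way makes it easy to check the total number of cycles and the longest extension step before programming a thermocycler.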
The fragment was cloned into the pGEM-T vector (Promega, Leiden, The Netherlands) following the manufacturer's instructions.

Molecule circularity confirmation

According to the sequences (see the section below), two enzymes predicted to have a single site in the mitochondrial genome, BamHI and EcoRV, were selected to digest the mitochondrial DNA. Restriction analysis consisted of a one-enzyme digestion with BamHI and a two-enzyme digestion with BamHI and EcoRV. The digested products, together with undigested genomic DNA, were separated on a 1% agarose gel in 1× TBE buffer at 125 V for 2 hours. The gel was stained with EtBr and photographed under UV with a ruler positioned beside the ladder. The ruler was used to determine DNA sizes in the subsequent blotting results. DNA was depurinated with 0.2 M HCl and blotted onto a nylon membrane (Hybond N+, Amersham-Pharmacia, Roosendaal, The Netherlands) by alkaline transfer using 0.4 M NaOH according to Sambrook et al. (1989). Probes were prepared from the PCR-amplified mitochondrial genome by random priming with [α-32P]dCTP using Ready-To-Go DNA Labelling Beads (-dCTP) (Amersham-Pharmacia, Roosendaal, The Netherlands). Hybridization was performed with the ULTRAhyb kit (Ambion, Huntingdon, UK).

Sequencing and sequence analyses

We used the GPS-1 genome priming system (New England Biolabs, Leusden, The Netherlands) to insert the universal primer island into the cloned fragment. XhoI and NotI were used to map the position of the primer island. According to the mapping results, appropriate clones were selected for sequencing. The sequencing reactions were carried out using a BigDye terminator cycle sequencing kit (ABI, Lennik, Belgium), and the final sequences were determined on an ABI Prism 377 genetic analyser (ABI). Sequences were assembled and edited using BioEdit (Hall, 1999). Most of the protein genes were identified using ORF Finder with the invertebrate mitochondrial codon table, or by BLASTX searching of the GenBank database to isolate the conserved protein domain, followed by determination of the initiation and termination codons. The ND4L and ND6 genes were determined by comparing their hydrophobicity profiles (Kyte & Doolittle, 1982) with those of other nematodes, as they lacked an identifiable conserved domain. We used the online tRNAscan-SE service (Lowe & Eddy, 1997) to search for tRNA genes with a standard cloverleaf structure or containing a TV-replacement loop (Wolstenholme et al., 1994). Seventeen genes were obtained using this method.
The other five tRNA genes were obtained by recognizing the probable secondary structure motifs and anticodons with the aid of Mfold (Zuker et al., 1999). All five genes have a loop structure replacing the D arm. The two rRNA genes were first detected by the presence of short conserved sequences. We determined all the genes in the neighbourhood and then used Mfold to construct the secondary structure of both rRNA genes, based on the primary sequence alignment with genes of other nematodes and with reference to the predicted secondary structures of other nematodes and Escherichia coli, as well as the conservation deduced from the observation of many organisms (Gutell et al., 1993; Gutell, 1994). The exact ends of the rRNA genes should be determined by sequencing the mature rRNAs or cDNA, since no identifiable similarity with other determined rRNA genes was found at the 5' or 3' ends of either rRNA gene. Secondary structures were edited with RnaViz (De Rijk & Wachter, 1997). Primary sequence alignment was performed using ClustalX (Thompson et al., 1997). Codon usage and amino acid composition were analysed by contingency tables. For 2 × 2 contingency tables, Yates' correction (Yates, 1934) was used to calculate the chi-square. The two-proportion test (Fisher's exact test; Fisher, 1934) was performed to compare the usage of AC-rich and GT-rich codons on both strands.

4.3 Results and Discussion

Genome composition and gene organisation

The genome size is only bp, which is the smallest size of all metazoan mitochondrial genomes reported thus far. It is 1129 bp smaller than the Onchocerca volvulus mitochondrial genome, which was previously reported as the smallest metazoan mitochondrial genome (Keddie et al., 1998). Compared to other metazoan mitochondrial genomes, the genome economization of X. americanum sensu stricto is the result of smaller protein, tRNA and rRNA genes; the absence of lengthy non-coding regions; and gene overlapping. Armstrong et al. (2000) stressed the importance of confirming the circular form of the mitochondrial genome when the molecule is obtained by PCR, because artifacts can be produced from integrated tandem arrays of the mitochondrial genome. The circularity of the molecule was confirmed in this study: Southern blot analysis showed that the size of the mitochondrial genome is about 13 kb, excluding the possibility of an integrated tandem form of the molecule, and the success of the long PCR excluded the possibility of a linear form of the molecule.
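The reasoning behind the one-enzyme and two-enzyme digestions can be illustrated by comparing the fragment patterns expected from a circular and a linear molecule. In the sketch below the molecule length (13 000 bp, roughly the size estimated by Southern blotting) and the site positions are hypothetical; only the relationship between the number of cuts and the number of fragments matters.

```python
def digest_fragments(length, cut_positions, circular):
    """Fragment sizes after cutting a molecule at the given positions."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [length]
    if circular:
        # n cuts in a circle give n fragments; the last one wraps around.
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(length - cuts[-1] + cuts[0])
    else:
        # n cuts in a linear molecule give n + 1 fragments.
        bounds = [0] + cuts + [length]
        frags = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(frags, reverse=True)

GENOME = 13000            # hypothetical length in bp
BAMHI, ECORV = 2000, 7000  # hypothetical single-site positions

print(digest_fragments(GENOME, [BAMHI], circular=True))          # one full-length band
print(digest_fragments(GENOME, [BAMHI, ECORV], circular=True))   # two bands
print(digest_fragments(GENOME, [BAMHI], circular=False))         # two bands
print(digest_fragments(GENOME, [BAMHI, ECORV], circular=False))  # three bands
```

Under these assumptions, a single full-length band after the one-enzyme digestion and two bands summing to the genome size after the two-enzyme digestion are the pattern expected from a circular monomer.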
Except for one 37 bp non-coding region located between the putative tRNA-Asn (GUU) and ND4L genes, which was assumed to be the control region, the other five non-coding regions range in length from 1 bp to 11 bp. This feature is rare in metazoan mitochondrial genomes, which usually have at least one long non-coding region functioning as the replication origin or the transcription origin (Wolstenholme, 1992). It was also reported for the gastropod Pupa strigosa, which has a highly compacted genome (Kurabayashi & Ueshima, 2000).

Gene overlaps are spread widely over the genome (Table 4.1). An abundance of overlapping genes was also reported for gastropods and pulmonate land snails (Hatzoglou et al., 1995; Yamazaki et al., 1997). However, no genes have been reported to overlap as extensively as in the X. americanum sensu stricto mitochondrial genome, where the overlap between two genes located on the same coding strand (tRNA-Pro (UGG) and the small rRNA gene) reaches 36 bp. The overlaps between the tRNA-Cys (GCA), tRNA-His (AUG), tRNA-Ser (UGA) and tRNA-Phe (GAA) genes range from 1 to 31 bp (Fig. 4.1). These overlaps pose a big challenge for the transcript processing system. This is discussed below.

Fig. 4.1. Gene map of the X. americanum mitochondrial genome. Protein genes and rRNA genes are abbreviated as in the text. tRNA genes follow the one-letter amino acid code; S1 indicates tRNA-Ser (UCU) and S2 indicates tRNA-Ser (UGA); L1 and L2 represent tRNA-Leu (UAG) and tRNA-Leu (UAA), respectively. The inner circular bar indicates the transcription direction of the genes: the black bar indicates the clockwise direction of transcription and the white bar the counterclockwise direction. At the gene boundaries, positive numbers indicate the number of intergenic nucleotides and negative numbers indicate the number of overlapping nucleotides. The region filled in blue indicates the interrupted L-rRNA gene. Regions filled in green indicate the highly overlapping genes.
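The boundary numbers used in the gene map (positive for intergenic nucleotides, negative for overlapping nucleotides) follow directly from the gene coordinates. The sketch below shows the arithmetic; the gene names and coordinates are invented for illustration and are not the annotated positions, and the wrap-around boundary of the circular map is not handled.

```python
def boundary_gaps(genes):
    """Spacing between neighbouring genes on a linearised map.

    genes: list of (name, start, end) with 1-based inclusive coordinates.
    Returns (left_gene, right_gene, gap); gap > 0 counts intergenic
    nucleotides and gap < 0 counts overlapping nucleotides.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    return [(a[0], b[0], b[1] - a[2] - 1) for a, b in zip(ordered, ordered[1:])]

# Hypothetical coordinates chosen to give a 36-nt overlap and an 11-nt spacer.
example = [("trnP", 100, 160), ("rrnS", 125, 800), ("nad1", 812, 1700)]
for left, right, gap in boundary_gaps(example):
    print(left, right, gap)
```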

34 4. Characterisation of the mitochondrial genome Protein gene Table 4.1 Comparison of mitochondrial protein and rrna genes with those of other nematodes and an arthropod No. of amino acids / nucleotides % amino acid / nucleotide identity and similarity X. americanum Ascaris. suum 1 Caenorhabditis. elegans 2 Onchocerca. volvulus 3 Trichinella. spiralis 4 Anopheles. gambiae 5 1 vs 2 1 vs 3 1 vs 4 1 vs 5 1 vs 6 Predicted initiation and termination codons in X. americanum COI ATA T(AA) COII ATA TA(A) COIII ATA TAA ND ATA TAA ND ATA ND ATA T(AG) ND ATA TAA ND4L ATA TAA ND ATA TAA ND ATA TAA Cytb ATA TAA ATP ATA T(AA) ATP8 NF NF NF NF NA NA NA NA NA NA NA rrna gene 6 L-rRNA S-rRNA T(ATTAA ) Note: NF indicates not found. NA indicates not available. In the columns of amino acid identity and similarity, the first number is the calculated identity and the second is the percentage of similarity. Nucleotides included in the quote are part of the complete termination codon overlapping with the downstream trna or protein genes. 1 2 Data was obtained from the published (Okimoto et al., 1992) (genbank ACCESSION:X54253 and NC_001328). 3 referred to (Keddie et al., 1998) (genbank ACCESSION NC_001861). 4 referred to the data provided in (Lavrov & Brown, 2001) (ACCESSION: NC_002681). 5 was from the published data (Beard et al., 1993) (Accession: NC_002084). 6 We considered only the primary sequence identity of rrna genes between organisms. 31

Table 4.2 Comparison of codon usage and amino acid composition between predicted peptides coded by the AC and GT rich strands

Entries are percentages, with the number of codons in parentheses. The first four codon columns refer to protein genes located in the AC rich strand, the last four to protein genes located in the GT rich strand. The final column gives the significance marks of the χ2 test (a), χ2 test (b) and proportion test (c).

AA Codon NNC        NNA        NNG        NNT        NNC        NNA        NNG        NNT         Tests a, b, c
Nonpolar
A  GCN   18.3 (9)   28.6 (14)  6.1 (3)    46.9 (23)  12.2 (10)  21.9 (18)  9.8 (8)    56.1 (46)
I  ATY   21.1 (19)  -          -          78.9 (71)  21.2 (31)  -          -          78.8 (115)
L  CTN   5.8 (5)    40.2 (35)  10.3 (9)   43.7 (38)  15.5 (23)  30.2 (45)  13.4 (20)  40.9 (61)
L  TTR   -          77.1 (91)  22.9 (27)  -          -          68.5 (163) 31.5 (75)  -
M  ATR   -          73.6 (53)  26.4 (19)  -          -          63.7 (79)  36.3 (45)  -
F  TTY   22.6 (26)  -          -          77.4 (89)  20.5 (54)  -          -          79.5 (210)  *
P  CCN   16.2 (6)   27.0 (10)  16.2 (6)   40.6 (15)  10.5 (8)   25.0 (19)  21.1 (16)  43.4 (33)
W  TGR   -          66.7 (24)  33.3 (12)  -          -          65.7 (46)  34.3 (24)  -
V  GTN   11.0 (9)   40.2 (33)  18.3 (15)  30.5 (25)  7.9 (15)   33.3 (63)  20.7 (39)  38.1 (72)
Polar
N  AAY   43.9 (18)  -          -          56.1 (23)  22.0 (11)  -          -          78.0 (39)   * *
C  TGY   42.1 (8)   -          -          57.9 (11)  28.1 (9)   -          -          71.9 (23)
Q  CAR   -          45.0 (9)   55.0 (11)  -          -          68.6 (24)  31.4 (11)  -
G  GGN   4.3 (3)    50.7 (35)  14.5 (10)  30.5 (21)  10.0 (12)  35.8 (43)  25.8 (31)  28.4 (34)
S  AGN   11.6 (8)   42.0 (29)  30.4 (21)  15.9 (11)  8.6 (11)   37.5 (48)  26.6 (34)  27.3 (35)
S  TCN   20.8 (15)  27.8 (20)  12.5 (9)   38.9 (28)  18.1 (27)  23.4 (35)  10.7 (16)  47.8 (71)
T  ACN   14.7 (11)  21.3 (16)  10.7 (8)   53.3 (40)  24.6 (15)  23.0 (14)  14.7 (9)   37.7 (23)   *****
Y  TAY   27.3 (9)   -          -          72.7 (24)  25.4 (16)  -          -          74.6 (47)
Acidic
D  GAY   44.4 (8)   -          -          55.6 (10)  21.9 (7)   -          -          78.1 (25)
E  GAR   -          66.7 (10)  33.3 (5)   -          -          62.7 (32)  37.3 (19)  -           *
Basic
R  CGN   15.4 (2)   69.2 (9)   0 (0)      15.4 (2)   11.1 (3)   33.3 (9)   18.5 (5)   37.1 (10)   *
H  CAY   36.4 (8)   -          -          63.6 (14)  31.4 (11)  -          -          68.6 (24)
K  AAR   -          63.0 (17)  37.0 (10)  -          -          56.9 (29)  43.1 (22)  -
Total                                                                                             ** ***

Summary rows (columns: codons in genes coded by the AC rich strand; codons in genes coded by the GT rich strand): Other codons; AC rich codons ***; GT rich codons **; Total.

a χ2 test of the difference in the frequency of amino acids encoded in the two strands, by 2 × 2 contingency table. *, **, *** represent P<0.05, P<0.01, P<0.001; no symbol means P > 0.05.
b χ2 test of the difference in codon usage of each amino acid or group of amino acids encoded in the two strands, by 4 × 2 contingency table with D.F. = 3. Symbols follow the same rules as in a.
c Proportion test of the difference in AC rich codons encoded in the two strands, by Fisher exact test. Symbols follow the same rules as in a.
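The χ2 tests described in footnotes a and b of Table 4.2 can be illustrated with a short calculation. The Python sketch below is a minimal illustration only. It assumes a plain Pearson χ2 test without continuity correction; the text does not state whether a correction or a particular statistics package was used, so the procedure and the function names (chi_square_sf, chi_square_test) are assumptions. The counts are the codon numbers given in parentheses in Table 4.2 for Asn (AAY) and Arg (CGN).

```python
import math


def chi_square_sf(stat, df):
    """Upper-tail probability of a chi-square statistic for integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-like terms.
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite correction sum.
    tail = math.erfc(math.sqrt(half))
    correction = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                     for k in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * correction


def chi_square_test(table):
    """Pearson chi-square test on a table of counts (one row per strand)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi_square_sf(stat, df)


# Codon counts from Table 4.2: first row AC rich strand, second row GT rich strand.
asn_aay = [[18, 23], [11, 39]]              # columns: NNC, NNT
arg_cgn = [[2, 9, 0, 2], [3, 9, 5, 10]]     # columns: NNC, NNA, NNG, NNT

for name, counts in (("Asn (AAY)", asn_aay), ("Arg (CGN)", arg_cgn)):
    stat, df, p = chi_square_test(counts)
    print(f"{name}: chi2 = {stat:.2f}, df = {df}, P = {p:.3f}")
```

Two-codon families give a 2 × 2 table (one degree of freedom) and four-codon families a 4 × 2 table (three degrees of freedom). With expected counts as small as those of Arg, the χ2 approximation is rough, so the printed P values need not agree with the significance marks in the table.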

Table 4.3 Comparison of the amino acid composition of the predicted Cytb and ND1 proteins coded by the AC and GT rich strands

Columns, for each of Cytb and ND1: X. americanum ss (GT rich strand), No. and %; T. spiralis (note a; AC rich strand), No. and %; χ2 test.

Amino acid   Codon   χ2 test (Cytb)   χ2 test (ND1)
Nonpolar
A            GCN
I            ATY
L            Total
L            CTN
L            TTR     *                *
M            ATR     *
F            TTY     *                *
P            CCN
W            TGR
V            GTN     *                *
Total
Polar
N            AAY
C            TGY
Q            CAR
G            GGN
S            Total
S            AGN
S            TCN
T            ACN     *                *
Y            TAY
Total
Acidic
D            GAY
E            GAR
Total
Basic
R            CGN
H            CAY
K            AAR
Total
Grand total

χ2 test of the difference in the frequency of amino acids encoded by the GT rich strand of X. americanum and the AC rich strand of T. spiralis, by 2 × 2 contingency table. *, **, *** represent P<0.05, P<0.01, P<0.001; no symbol means P > 0.05.
a Data obtained from GenBank (accession NC_002681) (Lavrov & Brown, 2001).

The protein-encoding genes and rRNA genes of X. americanum are smaller than those of other metazoans. The two rRNA genes are also smaller than those of other nematodes (Fig. 4.1). In three cases, two protein genes adjoin each other with only 0-2 bp of intergenic sequence (ND5 and ND6, COII and ND2, ND4 and ATP6). Such short spacers cannot form the hairpin or tRNA-like structures proposed as processing recognition sites in the traditional polycistron processing model (Ojala et al., 1980, 1981). This indicates that other transcriptional and post-transcriptional mechanisms may exist in the mitochondria.

The A+T content of the X. americanum mtDNA is 66.51%. This figure is very similar to that of T. spiralis (66.99%) (Lavrov & Brown, 2001) and lower than that of other nematodes: O. volvulus (73%) (Keddie et al., 1998), A. suum (70.4%), C. elegans (75.5%) (Okimoto et al., 1992) and R. culicivorax (80%) (Hyman & Azevedo, 1996).

Genes are transcribed from both strands. The GT rich strand contains the sense sequences of 9 protein, 14 tRNA and 2 rRNA genes. The AC rich strand contains the sense sequences of 3 protein and 8 tRNA genes. Most of the sense sequences are therefore located in the GT rich strand. This arrangement is similar to that of Chromadorean nematodes (all genes located in the GT rich strand) and different from that of T. spiralis, vertebrates and arthropods (most genes located in the AC rich strand). The corresponding GC and AT skews (GC skew = (G-C)/(G+C) and AT skew = (A-T)/(A+T)) (Perna & Kocher, 1995) were calculated for the AC rich strand. The skews may imply that the asymmetrical replication observed in vertebrate mitochondria also exists in these nematode mitochondria. However, in contrast to vertebrate mitochondria, most of the sense sequences are present in the GT rich strand, that is, the leading strand in replication, which is assumed to accumulate, during its single-stranded state, the mutations that cause the AC bias in the other strand (Reyes et al., 1998). The effects of the AC/GT bias on amino acid composition and codon usage are discussed in a following section.
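The skew definitions of Perna & Kocher (1995) quoted above, together with the A+T content, translate directly into code. The following Python sketch is illustrative only: the function names and the toy sequence are hypothetical and are not taken from the X. americanum sequence.

```python
from collections import Counter

# Translation table used to build the complementary strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_counts(seq):
    """Count A, C, G and T in a DNA sequence (other symbols are ignored)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_skew(seq):
    """GC skew = (G - C) / (G + C); undefined if the sequence has no G or C."""
    n = base_counts(seq)
    return (n["G"] - n["C"]) / (n["G"] + n["C"])


def at_skew(seq):
    """AT skew = (A - T) / (A + T); undefined if the sequence has no A or T."""
    n = base_counts(seq)
    return (n["A"] - n["T"]) / (n["A"] + n["T"])


def at_content(seq):
    """Fraction of A and T among the counted bases."""
    n = base_counts(seq)
    return (n["A"] + n["T"]) / sum(n.values())


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Hypothetical toy sequence standing in for one strand.
strand = "AACCCGT"
other = reverse_complement(strand)
for label, s in (("given strand", strand), ("opposite strand", other)):
    print(f"{label}: GC skew = {gc_skew(s):+.2f}, AT skew = {at_skew(s):+.2f}, "
          f"A+T = {at_content(s):.2%}")
```

Because the reverse complement exchanges G with C and A with T, the two strands of a molecule always have skews of equal size and opposite sign, while the A+T content is the same for both. This is why one strand can be described as AC rich and the other as GT rich.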

The gene arrangement of the X. americanum ss mitochondrial genome is unique. The gene map is shown in Fig. 4.1. The gene order and its phylogenetic significance are discussed below.

Gene arrangement and phylogeny inference

The gene content of mtDNA is usually very stable within the main groups of metazoans such as chordates, hemichordates, arthropods, echinoderms and annelids. Gene arrangements have been observed to be relatively conserved within these groups and divergent between them. It was therefore thought that mitochondrial gene arrangements have great potential to resolve the deepest branches of metazoan phylogeny (Boore, 1999; Boore & Brown, 1998). This proved to be the case when gene arrangement data resolved the phylogeny of arthropods (Boore et al., 1995, 1998) and of echinoderms, for which all other data gave equivocal results (Smith et al., 1993).

However, gene content and arrangement diverge within nematodes. Seven species have a completely sequenced mtDNA or a known gene content and arrangement. Six are Chromadoreans: Caenorhabditis elegans (Nematoda: Chromadorea: Rhabditida: Rhabditoidea: Rhabditidae) and Ascaris suum (Nematoda: Chromadorea: Ascaridida: Ascaridoidea: Ascarididae) (Okimoto et al., 1992), Onchocerca volvulus (Nematoda: Chromadorea: Spirurida: Filarioidea: Onchocercidae) (Keddie et al., 1998), two hookworms, Ancylostoma duodenale and Necator americanus (Nematoda: Chromadorea: Rhabditida: Strongylina: Ancylostomatoidea: Ancylostomatidae) (Hu et al., 2002), and Meloidogyne javanica (Nematoda: Chromadorea: Tylenchida: Tylenchina: Heteroderidae) (Okimoto et al., 1991). They have the same gene content but very different gene arrangements, although they share a few gene boundaries. The only Enoplean with a sequenced mtDNA, Trichinella spiralis (Nematoda: Enoplea: Trichocephalida: Trichinellidae), has an ATP8 gene that is absent from the mtDNA of other nematodes and has a gene arrangement more similar to that of the horseshoe crab (Limulus polyphemus) (Lavrov et al., 2000a; Lavrov & Brown, 2001). A multipartite mtDNA was found in Globodera pallida

(Nematoda: Chromadorea: Tylenchida: Tylenchina: Heteroderidae). Within the completely sequenced 9428 bp of sc I, seven protein genes were identified (Armstrong et al., 2000). The co-existence of multiple mitochondrial genomes is often reported for plant mitochondria (Mackenzie & McIntosh, 1999). All descriptions to date indicate that gene arrangements will not be a useful tool to resolve the lineages of nematodes. For the mtDNA of mollusks, gene arrangements were reported to be very divergent among the three major classes but conserved within each class (Kurabayashi & Ueshima, 2000; Wilding et al., 1999). That is not the case for nematodes.

The gene arrangements in the mtDNA of six nematodes are shown in Fig. 4.2. Gene arrangements shared between X. americanum ss and other nematodes are indicated. The two Enoplean nematodes do not share any gene boundaries. O. volvulus shares more gene boundaries with X. americanum than the other nematodes: tRNA-Leu (UAG) and COIII, and the tRNA cluster tRNA-Cys (GCA)-tRNA-Ser (UGA)-tRNA-Pro (UGG). A further shared gene boundary could be obtained by a single translocation of the ATP6 or tRNA-Phe (GAA) gene. A. suum/C. elegans also share the gene order of tRNA-Leu (UAG) and COIII with X. americanum. When protein genes and RNA genes are considered alone, A. suum/C. elegans also share the ND5 and ND6 arrangement with X. americanum, O. volvulus shares the ND2 and ND4 arrangement with X. americanum, and T. spiralis could share the Cytb and Mt-S-rRNA arrangement after one translocation of either gene. M. javanica would have the same ND4L-ND3-COII arrangement after one translocation of the ND3 gene.

All three Chromadorean arrangements share the COII-H-Mt-L-rRNA-ND3, Y-ND1 and Cytb-L1 arrangements. Beyond these common arrangements, A. suum/C. elegans share the ND2-I and ND6-ND4L arrangements with M. javanica, and the T-ND4-COI, V-Cytb, L1-COIII, S1-ND2 and G-COII arrangements with O. volvulus. Additionally, one tRNA gene arrangement, E-S1, is shared between O. volvulus and M. javanica. T. spiralis has only one arrangement in common with A. suum/C. elegans, S2-N, and shares no arrangements with the other nematodes. The gene arrangement

of T. spiralis is more similar to that of the horseshoe crab (Limulus polyphemus) (Lavrov & Brown, 2001).

Despite the shared gene arrangements listed above, and the additional shared gene boundaries that can be generated by minimal re-organization of the gene orders, so few arrangements are shared that they do not allow phylogenetic inference for deep nematode lineages such as Enoplea and Chromadorea. It is possible that the maintenance mechanisms of mtDNA differ among early-branching nematode groups, leading to different rates of gene re-organization that blur the true signal of lineage evolution. For instance, the maintenance mechanism of mtDNA in Globodera pallida (Armstrong et al., 2000) is likely to be similar to that of plants, in which inter-molecular or intra-molecular recombination results in molecules of different sizes being maintained in the same mitochondria. These are functional molecules maintained with stoichiometric shifts (Janska et al., 1998). Another example is the mtDNA of R. culicivorax (Nematoda: Enoplea) (Hyman & Azevedo, 1996), which maintains both single and repeated copies of ND3 and ND6 under the same selective pressure.

The problems of using gene arrangement for phylogenetic analysis in Nematoda are underlined by a comparison with the phylogeny constructed from the 18S rRNA gene by Blaxter et al. (1998). Caenorhabditis elegans, Ascaris suum and the two hookworms share a gene arrangement (C. elegans and the two hookworms have the same gene order; only the position of the AT rich control region of A. suum differs from that of these three species), whereas the arrangements of A. suum and O. volvulus are more divergent. This contradicts the phylogenetic groups based on the 18S rRNA gene (Blaxter et al., 1998), in which A. suum and O. volvulus are the more closely related species, and implies that gene arrangement data are not a useful tool for inferring the phylogeny of nematode subgroups. The large difference in mtDNA between G. pallida and M. javanica, two species belonging to the same family, also provides strong evidence that gene order is not useful for phylogenetic analysis of nematodes.
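The comparisons above rest on counting gene boundaries (pairs of adjacent genes) that two genomes have in common. A minimal Python sketch of such a count is given below. It assumes circular genomes and ignores the direction of transcription, which is a simplification; the function names and the two gene orders are hypothetical toy examples, not the real maps of Fig. 4.2.

```python
def gene_boundaries(order, circular=True):
    """Return the set of unordered pairs of adjacent genes in a gene order."""
    n = len(order)
    last = n if circular else n - 1
    # frozenset makes A-B and B-A the same boundary.
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(last)}


def shared_boundaries(order_a, order_b, circular=True):
    """Gene boundaries present in both gene orders."""
    return gene_boundaries(order_a, circular) & gene_boundaries(order_b, circular)


# Hypothetical toy gene orders for illustration only.
genome_a = ["L1", "COIII", "C", "S2", "P", "ND5", "ND6"]
genome_b = ["ND6", "L1", "COIII", "ND5", "C", "S2", "P"]

for pair in sorted(sorted(p) for p in shared_boundaries(genome_a, genome_b)):
    print("-".join(pair))
```

A boundary that could be restored by a single translocation, as proposed in the text for some gene pairs, would not be found by this simple set intersection; detecting it would require comparing the orders after removing one gene at a time.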

The two fully sequenced Enoplean nematodes do not share any gene arrangements. However, one feature common to the three Enoplean nematodes (including R. culicivorax), and distinct from all Chromadorean nematodes studied to date, is that genes are transcribed from both strands instead of one.

Fig. 4.2 Gene arrangements of nematode mitochondrial genomes. Arrows under each gene arrangement map indicate the direction of gene transcription. The black lines above the maps of Chromadorean nematodes indicate the gene arrangements shared among all of them; the blue, green and ochre lines indicate arrangements shared between two Chromadorean nematodes in addition to the commonly shared ones. The black and purple lines under the maps of Chromadorean nematodes mark the arrangements shared with Enoplean nematodes: purple lines indicate arrangements shared with T. spiralis; black lines indicate arrangements shared with X. americanum. The circular arrow between the maps of O. volvulus and X. americanum means that the shared arrangement could be obtained by a single translocation of one of the two genes.

Protein genes and codon usage

All protein genes start with an ATA initiation codon. With the exception of ND3, which ends with a TAG stop codon, the other 11 protein genes terminate with a TAA stop codon (Table 4.1). In light of the gene overlaps, the COI, COII, ATP6, ND2 and ND3 genes probably terminate with an incomplete T or TA codon. It has been shown that post-transcriptional polyadenylation can complete the termination codon of mitochondrial genes (Ojala et al., 1980, 1981). The ND1 and COIII genes overlap with tRNA-Leu (UAG) by 26 bp and 9 bp, respectively (Fig. 4.1). Taking account of the polycistronic processing model

(Ojala et al., 1980, 1981), the COIII gene could initiate at an ATA codon located 27 bp downstream of the predicted ATA initiation codon, so that tRNA-Leu (UAG) is produced from the primary transcript and the functional ND1 mRNA is truncated by 23 bp to end with a single T. The excised tRNA-Leu (UAG) would then lack three T nucleotides at its 5' end. These would have to be repaired, perhaps by an RNA editing system similar to that of Acanthamoeba mitochondria (Price & Gray, 1999), in which a nucleotidyltransferase adds nucleotides in a 3' to 5' direction to repair mismatches located in the 5' part of the acceptor stem. However, the tRNA editing systems reported so far in metazoan mitochondria act only on the 3' side of the acceptor stem (Yokobori & Pääbo, 1995a; Reichert & Mörl, 2000; Lavrov et al., 2000b). Alternative explanations of these extensive gene overlaps are that tRNA-Leu (UAG) is a pseudogene, or that the three mature genes are processed from different primary transcripts. If tRNA-Leu (UAG) is a pseudogene, the functional tRNA would need to be imported from the nucleus. The only report of this phenomenon in metazoan mitochondria is the import of tRNA-Lys (UUU) into marsupial mitochondria (Döner et al., 2001).

The immediately adjacent protein genes (ND4 and ATP6, ND5 and ND6, COII and ND2) pose another challenge to the transcript processing system of X. americanum mitochondria. Because there is no intergenic region, a tRNA-like stem-loop structure cannot form between the genes, so the genes could not be separated by the mechanism proposed in the polycistronic processing model (Ojala et al., 1980, 1981). The two protein genes may instead be translated sequentially from the dicistronic transcript without further transcript processing, in a manner similar to RNA phages, in which two peptides are translated from the same mRNA. The second protein initiation site is usually hidden in a stable stem region until the ribosome disrupts the structure and exposes the initiation codon. Then the second protein is translated. Co-transcriptional translation of both proteins simultaneously is also possible, as most of the

transcription and translation factors of mitochondria isolated so far are predicted to be highly similar to those of bacterial genetic systems (Shadel & Clayton, 1993). The fact that replication, transcription and translation all occur in the mitochondrial matrix would facilitate such simultaneous events. Additionally, mitochondrial mRNA does not need the 5' 7-methylguanylate cap structure, which is usually added after transcription, for the initiation of protein synthesis. It has been proposed that, because leader sequences are absent (the 7-methylguanylate cap structure of eukaryotes or the Shine-Dalgarno sequence of bacteria), the initiation of translation in mitochondria is relatively inefficient, but that this is compensated by abundant transcripts. Larger mRNAs (>400 nucleotides) can stabilize the translation complex (Taanman, 1999). The dicistronic mRNA may therefore be more stable than a monocistronic mRNA. This may be an adaptive mechanism that maintains the mitochondrial genetic system while allowing the genome to become more economized.

Protein genes are transcribed from both strands. The sense sequences of the COI, ND5 and ND6 genes are located in the AC rich strand. The other 9 protein genes are located in the GT rich strand. Despite the difference in nucleotide composition of the two strands (A>T>C>G for the AC rich strand; T>A>G>C for the GT rich strand), T is the most abundant nucleotide in all protein genes, and A is the second most abundant in all except the ND4L gene (T>G>A>C). G is more abundant than C except in the ATP6 and ND5 genes (T>A>C>G). Thus, as in the mtDNA of Chromadorean nematodes, T is the most favored and C the least favored nucleotide. The non-coding region of the AC rich strand is more AC rich than the strand average. The coding region of the AC rich strand uses a higher proportion of AC rich codons than that of the GT rich strand (P < 0.001) (Table 4.2). The overall difference in codon usage between the two strands can be ascribed to the different proportions of AC rich and GT rich codons. No significant differences in the usage of other codons between the two strands are seen. For the

synonymous sites of codons, only the codons for arginine and asparagine show a significant difference between the two strands (P<0.05). The overall composition of synonymous sites differs significantly (P<0.01), and AC is much more abundant at synonymous sites in the AC rich strand than in the GT rich strand (P<0.001). Both functional constraints and directional mutation pressure in the AC/GT rich strands propel the selection of AC/GT rich codons while maintaining the exact functions of the proteins.

The composition of most amino acids is not significantly different between the two strands. Three amino acids show differences: Thr (P< ), Met (P<0.05) and Glu (P<0.05). Their codon usage does not differ significantly, implying that the differences may derive from the different amino acid compositions of the different proteins encoded on the two strands. The mtDNA of T. spiralis, another Enoplean nematode, has much stronger GC and AT skews in the coding region of the AC rich strand (GC skew = -0.59, AT skew = 0.48) (Lavrov & Brown, 2001) and shows significant differences in all codon usage and in the composition of 12 amino acids. By comparison, the less biased nucleotide composition of the X. americanum mtDNA is accompanied by much less biased codon usage and amino acid composition between the two coding strands. This indicates a close correlation between nucleotide composition and codon usage/amino acid composition, and that functional constraints act on the amino acid composition.

A comparison of the amino acid composition of ND1 and Cytb between X. americanum and T. spiralis (Table 4.3) also indicates that the same types of protein have similar amino acid compositions in the two organisms. These two proteins are coded on the AC rich strand of the T. spiralis mtDNA and on the GT rich strand of the X. americanum mtDNA. The favored GT or AC rich codons in the two nematodes resulted in different proportions of the amino acids F (TTY), V (GTN) and T (ACN), which are coded by GT or AC rich codons, for both Cytb and ND1. The TTR codons for leucine differed significantly in both proteins. Cytb of T. spiralis contains more M than that of X. americanum, without significant difference

between codon usages. This may be the consequence of different functional constraints existing in the two nematodes.

rRNA genes

Both rRNA genes are transcribed from the GT rich strand. The predicted size of both genes is much smaller than that of other nematodes (Table 4.1). The rRNA genes were found to overlap extensively with upstream or downstream genes. The exact ends of both genes could not be determined since they lack similarity to those of other reported rRNA genes. The secondary structures of both genes were predicted with reference to those of other nematodes and Escherichia coli, as well as the conservation inferred from observations of many organisms (Gutell et al., 1993; Gutell, 1994). The Mt-L-rRNA gene of X. americanum appears to be interrupted by tRNA-Met (CAU). This gene arrangement has never been reported from any other metazoan mtDNA.

Mt-S-rRNA gene

The Mt-S-rRNA gene is predicted to be about 625 bp in length. The nucleotide composition of the gene is A>T>G>C. A is the most abundant nucleotide although the strand is GT rich, which may be caused by functional constraints. As in the general S-rRNA secondary structure of Escherichia coli, the structure can be divided into four domains (Fig. 4.3). Bold-faced letters indicate sequence motifs conserved with other nematodes or Escherichia coli. Most of the conservation is found in functionally important loop regions (Fig. 4.3), namely the loops of helices 19, 21, 25, 27 and 50. Helix regions can undergo compensatory mutations, which may lead to higher mutation rates there than in the loop regions, which are under strong functional constraints. The pseudoknot structure located in loops 21 and 19 is highly conserved because of the functional importance and high structural constraints of this region (Gutell, 1994). Covariation of nucleotides in the two interacting loops was found by comparison with other nematodes and Escherichia coli. The reduced size of the Mt-S-rRNA gene

was ascribed to the reduced size of both loop and helix regions in several helix-loop regions (Fig. 4.3) compared to those of other nematodes: the loop regions between helices 23 and 25, 22 and 27, and 33 and 47, and the loop regions at helices 35, 36, 47 and 48 are much reduced; helix 45 and its loop region are absent; and helices 35, 39 and 40 are also reduced in size. In addition to the reduction of several helix-loop regions, two economized helix-loop regions, helices 41 and 42, are maintained in the secondary structure of this Mt-S-rRNA; they have not been found in the secondary structures of other nematodes reported thus far. Another feature of the secondary structure is that non-canonical base pairs are widely used in the helix regions. The non-canonical pairs U:G, G:A, A:C, C:U, G:G, U:U and A:A are found sandwiched between canonical pairs. A:C and A:G pairs are much more frequent than the other non-canonical pairs except U:G. They were found in helices 5, 22, 27 and 32.

The 5' end of the gene extends far into the upstream tRNA-Pro (UGG) gene. How the transcripts of both genes are processed is unknown. It may be that the genes are processed by shifting the secondary structure of the primary transcript at the 5' end of the Mt-S-rRNA gene. If the normal secondary structure of tRNA-Pro (UGG) is formed, the transcript will be processed into a functional tRNA-Pro (UGG) and an unstable Mt-S-rRNA that lacks helices 2 and 3. These helices stabilize domains 1 and 2 of the molecule, and the shortened molecule will be degraded before it is packaged into a mature ribosome. If the normal secondary structure of the Mt-S-rRNA gene is formed, a hairpin structure (Fig. 4.3) will form beside the start of helix 1 of the Mt-S-rRNA gene, adjoining the upstream tRNA-Ser (UGA). Excision of tRNA-Ser (UGA) will leave the hairpin structure as the start of the Mt-S-rRNA. After the corresponding ribosomal proteins bind to each domain of the Mt-S-rRNA, the hairpin structure will be processed to the functional 5' end required by the ribosomal structure. Some trans-acting factors will be required for this processing.
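The distinction drawn above between canonical pairs, U:G pairs and other non-canonical pairs in helix regions can be made explicit with a small classifier. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the two helix strands are invented toy sequences rather than sequences from Fig. 4.3, and both strands are written 5' to 3' so that the first base of one pairs with the last base of the other.

```python
from collections import Counter

# Watson-Crick pairs and the G:U wobble pair, stored without orientation.
CANONICAL = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def _rna(base):
    """Upper-case a base and read T as U so DNA and RNA input both work."""
    return base.upper().replace("T", "U")


def classify_pair(base_5p, base_3p):
    """Classify one base pair as canonical, wobble or non-canonical."""
    pair = frozenset((_rna(base_5p), _rna(base_3p)))
    if pair in CANONICAL:
        return "canonical"
    if pair in WOBBLE:
        return "wobble"
    return "non-canonical"


def helix_summary(strand_5p, strand_3p):
    """Count pair types in a helix formed by two equally long strands."""
    if len(strand_5p) != len(strand_3p):
        raise ValueError("both helix strands must have the same length")
    pairs = zip(strand_5p, reversed(strand_3p))
    return Counter(classify_pair(a, b) for a, b in pairs)


# Hypothetical five-pair helix containing one A:A and one U:G pair.
summary = helix_summary("GAUGC", "GCGAC")
for kind in ("canonical", "wobble", "non-canonical"):
    print(kind, summary[kind])
```

Applied to the paired positions of a predicted secondary structure, such a count would give the relative frequency of each non-canonical pair type per helix.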

Fig. 4.3 Secondary structure model of the Mt-S-rRNA gene of X. americanum. The sequence is numbered every 50 bp. Helices are numbered according to Gutell (1994). Insets A and B show the hypothesized processing mechanism of the 5' end of the Mt-S-rRNA gene (see text). Co-varying nucleotides in loop 21 are circled and marked in red or blue.

Fig. 4.4 Secondary structure model of the Mt-L-rRNA gene of the X. americanum mitochondrial genome. The sequence is numbered every 50 bp; helix numbers are given according to Gutell et al. The insertion site of tRNA-Met (CAU) is indicated by an arrow. Proposed tertiary interactions (Khaitovich & Mankin, 1999; Xiong et al., 1999) are marked with thin connecting lines. Conserved sequences are indicated by bold-faced letters. Helix D10_1 does not exist in the secondary structure of E. coli (Gutell et al., 1993).

Mt-L-rRNA gene

The large rRNA gene has a nucleotide composition similar to that of the small rRNA gene. The predicted length of the gene is around 879 bp, shorter than the Mt-L-rRNA genes of other nematodes (Table 4.1). In the predicted structure, the 5' and 3' ends of the molecule are closed by a 6 bp stem (Fig. 4.4). This structure is usually found in Eubacteria and most Archaea but rarely reported for eukaryotes (De Rijk et al., 1999), with the exception of the Mt-L-rRNA of the sea anemone Metridium senile, the most E. coli-like (Raue et al., 1998) Mt-L-rRNA gene found in metazoan mitochondria (Beagley et al., 1998). A putative paired-end structure was also proposed for the Mt-L-rRNA of T. spiralis (Lavrov & Brown, 2001). The fact that both Enoplean nematodes have this primitive structure may support the more primitive position of these nematodes within Nematoda, as proposed by nematode systematists and indicated in the nematode phylogeny constructed by Blaxter et al. (1998).

The overall structure of the X. americanum Mt-L-rRNA gene is similar to that of other metazoans. The 5' half of the molecule is much reduced compared to the L-rRNA of E. coli. The 3' half has a more conserved structure because of the important function of domain V as the main component of the peptidyl transferase center (Gutell, 1996; Khaitovich & Mankin, 1999). In comparison with the Mt-L-rRNA genes of other nematodes, the X. americanum Mt-L-rRNA gene has lost the helix-loop structures D2 and D21, and has maintained the D20 and E1-E3 helix-loop regions that are absent from the Mt-L-rRNA genes of the others. The D10_1 region is present in the Mt-L-rRNA genes of both Enoplean nematodes (Fig. 4.4) but absent from those of other nematodes and from the L-rRNA gene of E. coli. The loop regions between helices D1 and D11, and D11 and D17, are reduced in size. The central loop is also smaller than that of other nematodes. As in the Mt-S-rRNA gene, besides the U:G pair, other non-canonical pairs (A:C, U:U, U:C, A:G, C:C, G:G, A:A) are observed between the canonical pairs in helix regions. The peptidyltransferase pseudoknot is indicated in Fig. 4.4. Compared to the typical structure,

the nucleotides involved in the interaction from the loop region of helix 89 carry a mutation at nucleotide 659 (G to A) and an insertion of A at position 661. However, the pseudoknot structure is difficult to identify in Mt-L-rRNA because of the low pressure of natural selection (Ivanov et al., 1999). The tertiary interaction between helices D9 and G1 is also indicated. The co-variation analysis by Gutell (1996) predicted this triple interaction, though it has not been confirmed. However, recent experiments have demonstrated the proximity of D9 to the peptidyltransferase center (Xiong et al., 1999).

Another interesting feature of the X. americanum Mt-L-rRNA gene is the insertion of tRNA-Met (CAU) in helix D7. The exact insertion site is indicated by an arrow in Fig. 4.4. Interrupted rRNA has been discovered in mitochondria, chloroplasts, eubacteria and the nucleocytosolic compartments of eukaryotes (Gray & Schnare, 1996). In protist mitochondria, several cases of interrupted rRNA genes have been reported. An extremely scrambled rRNA gene was reported from the mitochondrial genomes of green algae (Nedelcu, 1997), in which the rRNA was not only fragmented into several coding modules but also distributed over different coding strands. An intron belonging to subgroup B1 of group II was observed in the mitochondrial genome of a red alga (Burger et al., 1999) and of the brown alga Pylaiella littoralis (Fontaine et al., 1995). Although plant mitochondria experienced an explosive invasion of group I introns (Cho & Palmer, 1999; Palmer, Adam et al., 2000), introns have not been observed in their rRNA genes (Kubo et al., 2000). Introns have also been found in the rRNA genes of fungal mitochondrial genomes (Paquin et al., 1997). The coding modules of interrupted rRNA genes can be separated by introns, protein-coding genes, internal transcribed spacers or tRNA genes. The last case is observed in the X. americanum Mt-L-rRNA gene, but an interrupted rRNA gene has never been reported for a metazoan mitochondrial genome. Processing these two genes from the primary transcript would require an excision (endonuclease such as RNase P) and ligation (ligase) system operating on the well-formed secondary structures of both tRNA-Met (CAU) and the D7 region of the Mt-L-rRNA.

52 4. Characterisation of the mitochondrial genome trna gene Twenty-two trna genes were identified for the X. americanum mitochondrial genome, which is typical for metazoans. Most of the trna genes are smaller than the corresponding genes in of other nematodes because of the reduced TV replacement loop, Tψ C stem-loop region, or DHU stem-loop region. We found three types of secondary structures of trna genes: the classic clover leaf structure (V, T, M, N), typical nematode trna structure with Tψ C stem-loop region replaced by a 6-14 bp TV loop (L1, L2, V, T, H, A, R, K, D, W, E, Q, G, P, F) (75), and trna with DHU arm replaced by a D loop (S1, S2, C, Y, I) (Fig. 4.5). The bizarre TV replacement structure of nematode trnas has gained more support from recent experiments (the potential tertiary interaction of D-stem (Watanabe et al., 1994); isolated EF factor interacting with it (Ohtsuki et al., 2001)) for its functional role in the translation system of nematode mitochondria. We noticed that trna Thr (UGU) and trna Val (UAC) could be folded into both the cloverleaf with 36 bp overlaps and TV loop structure without overlaps. Because the two genes are transcribed in different strands, it is possible that they will have the standard cloverleaf structure. When they are folded into the cloverleaf structure, no mismatched pairs are found in the acceptor stem. However, when they are folded into the TV loop structure, mismatches in acceptor stem were found between nucleotides 7 and 66 for trna Thr (UGU), between 7 and 66, 6 and 67 for trna Val (UAC). Both cloverleaf and TV replacement structure are presented in Fig Mismatches are found in the acceptor stem, Tψ C stem, DHR stem and anticodon stem. 
Twenty-three mismatches are located in the acceptor stem: nine are observed between positions 7 and 66 (tRNAs L1, K, D, Q, F, G, Y, L2, P, N), a mismatch that is also found in the tRNA genes of other nematodes (6, 7, 8, 46); the remaining mismatches are found between positions 3 and 70 (tRNAs H, N), 1 and 72 (tRNAs E, N, S2, Q), 2 and 71 (tRNAs S2, N), 6 and 67 (tRNAs A, W), and 5 and 68 (tRNA Q). Two mismatches are found in the D-stem, between positions 10 and 25 for K and between 11 and 24 for Q. One mismatch is found for N, located

between 52 and 62. Thirteen mismatches are found in the anticodon stem: seven are located between positions 27 and 43 (tRNAs M, N, K, S1, S2, P, and F); the other six are observed between 29 and 41 for L1 and N, between 31 and 39 for L2, Q and H, and between 28 and 42 for C. So many mismatches imply that repair of the stems is needed for the tRNAs to mature, especially for tRNAs with more than one mismatch in the acceptor stem. A tRNA editing system may perform this repair, as has been reported for other metazoan mitochondria. Most reported editing systems act on the 3' half of the acceptor stem by polyadenylation followed by addition of CCA, as in land snail (Yokobori & Pääbo, 1995b), squid (Tomita et al., 1996), platypus (Yokobori & Pääbo, 1995) and bird (Yokobori & Pääbo, 1997); others change the anticodon by base conversion, as in marsupial mitochondria (Mörl et al., 1995), or use an RNA-dependent RNA polymerase to repair the missing 3' half, as in centipede (Lavrov et al., 2000b). However, RNA editing was not found in the mitochondria of C. elegans (Orr et al., 1997).

We found that the tRNAs of the X. americanum mitochondrial genome show both conserved nucleotides and variations at the positions involved in tertiary interactions, compared with the mitochondrial genomes of other nematodes (Wolstenholme et al., 1994). The tertiary structure of tRNA molecules is maintained by hydrogen bonds between bases, or between a base and a ribose or phosphate. A few invariable or semi-invariable nucleotide positions are involved in the tertiary interactions (Dirheimer et al., 1995). We describe six major interactions that maintain the stability of the L-shaped tRNA structure: 8-14-21, 13-22-46 (L3), 10-25-45 (L2), 9-23-12, 15-48 (L4), 26-44 (L1) (Fig. 4.5).
For the interaction 8-14-21, the conserved nucleotide composition of the three positions is T8 A14-A21 for the standard cloverleaf structure and T8 A14-R21 for the TV-loop structure of nematodes. The exceptions are: tRNA H, with the TV-loop structure, has A in position 8 instead of U; eight tRNAs have U or G instead of A in position 14 (U for Q, D, E, H, N; G for A, L2, R); and five tRNAs have U, G or C in position 21 instead

of R (U for A, W; G for Q, L2; and C for K). For the tRNAs with the standard cloverleaf structure, the T8 A14-A21 format is replaced by T8 U14-G21 for N and by T8 A14-C21 for V. For the interaction 13-22-46 (L3), the conserved nucleotide compositions are C13-G22 G46 for the standard structure and Y13-R22 R(L3) for the TV-loop structure. Three tRNA genes with the cloverleaf structure (V, T, M) have the T13-R22 R46 composition; one tRNA gene has the G13-T22 A46 composition, which differs from C13-G22 G46. Six tRNA genes with the TV-loop structure (Q, E, H, L2, R) have the R13-N22 D(L3) composition, which differs from the general composition. Three tRNAs (M, N and V) show variation in the positions of interaction 10-25-45 (L2), with R10-Y25 R46 differing from the conserved G10-C25 G45 of the standard structure, and three tRNA genes (L2, K, H) have the K10-H25 T(L2) composition, which differs from the general composition R10-Y25 R(L2) for the TV-loop structure. For the interaction 9-23-12, A9 A23-T12 and A9 W23-W12 are the general compositions for the cloverleaf structure and the TV-loop structure, respectively. Variation from the standard composition was observed for eight tRNA genes (Q, D, E, L1, L2, K, P, and R) with the TV-loop structure; the composition of those genes in the three positions is H9 N23-D12. tRNAs N, T and V have W15 W(L4) in place of the conserved G15 C48 of tRNAs with the standard cloverleaf structure. For the hydrogen bonding of 26-44 (L1), the conserved type is G26 A44 for the cloverleaf structure and R26 N44 for the TV-loop structure. We observed A26 A44 for tRNA genes M and T, C26 A44 for V, and U26 U44 for N, all with the standard cloverleaf structure. Four tRNA genes (Q, G, H, and L1) with the TV-loop structure have Y26 N44 in those positions.

Another feature of the tRNA genes of the X. americanum mitochondrial genome is the extensive overlap found in the tRNA cluster H-C-S2. The extensive overlap of up to 30

bp (Fig. 4.1) implies that those three genes cannot be processed from one primary transcript. We assume that the tRNAs of the cluster are processed from different primary transcripts to obtain the mature molecules. Another possibility is that some genes in the cluster are pseudogenes and that the functional tRNAs need to be imported from the nucleus. However, until now, import of tRNA from the nucleus into mitochondria has been reported for marsupials only (Döne et al., 2001), although it has been clearly demonstrated for a protozoan (Trypanosoma brucei) (Tan et al., 2002). The absence of most tRNA genes in cnidarians (Beagley et al., 1998) also implies the possibility of import from the nucleus.

Additionally, one pseudo-tRNA-Phe (GAA) gene was predicted by a tRNAscan-SE (Lowe & Eddy, 1997) search. The predicted structure includes a 26 bp intron inserted at the usual intron position of eukaryotic tRNA genes (Abelson et al., 1998). The splicing sites are indicated in Fig. We do not know whether this structure was found merely by chance; however, given the proven accuracy of the software (Lowe & Eddy, 1997), that possibility is low. Based on the hypothesis that mitochondria originated from α-proteobacteria (Lang et al., 1999), the intron could not have originated from the ancestral mitochondrion; alternatively, it could be a trace of gene import from the nucleus. I do not know whether a tRNA intron splicing system has played a role in mitochondrial systems. As the trend towards genome economization became irreversible, the intron-containing gene was replaced by the gene with the TV-loop structure. However, the trace of the intron-containing gene may imply the adoption of a nuclear tRNA gene during the evolutionary history of the mitochondrial genome.

Fig. Secondary structure model for tRNA genes: the general models for the three structures of predicted tRNA genes are presented. Tertiary interactions are indicated by thin lines connecting the bases involved. Black circles mark the general composition of nucleotides in those positions. Grey circles

imply that the nucleotides in those positions are not universally present in each tRNA gene. The arrows in the structure of pseudo-tRNA-Phe (GAA) point to the splicing sites of the intron.

Non-coding region

The lengthy AT-rich non-coding region found in all completely sequenced mitochondrial genomes of nematodes is absent from the X. americanum mitochondrial genome. There are only two significant non-coding regions: one of 23 bp located between tRNA-F and tRNA-H, the other of 37 bp located between tRNA-N and ND4L. This absence of a lengthy non-coding region is similar to the situation in gastropods such as Pupa strigosa (Kurabayashi & Ueshima, 2000). I observed a sequence motif (5'-GAGACCTGAGCCCAAGATA-3') in the 37 bp non-coding region that is similar to the conserved promoter element sequence (5'-CAGACCGCCAAAAGATA-3') around the

transcription start site within the D-loop region of the human mitochondrial genome (Taanman, 1999). This sequence may serve as the promoter for light-strand transcription. Another sequence (5'-AACUACCAUAAAACUACCAAAA-3'), located between two predicted stem-loop regions (Fig. 4.6), includes a repeated 11 bp motif and a poly-A tract that could be the binding site for mitochondrial transcription factors. Stem-loop structure 1 predicted for the H-strand contains many T nucleotides in the loop region. It very likely acts as the initiation site of light-strand replication, as described for the corresponding site in human (Hixton et al., 1986). Stem-loop structure 1 of the light strand and stem-loop structure 2 of the H-strand could form a cruciform structure similar to that of the transcription initiation site of chicken, with a symmetrical arrangement of the consensus sequence of the promoter region (L'Abbé et al., 1991). The cruciform structure has almost perfectly symmetrical bases in both stem regions, which may indicate that this is a bi-directional promoter. Thus, there are two putative promoters for transcription of the light strand. The one located at the 3' end of the light strand of the non-coding region may also function as the initiation site for replication of the H-strand, as in the human mitochondrial genome (Schadel & Clayton, 1997).

4.4 Concluding remarks

The unique features of the mitochondrial genome of X. americanum, together with the diverse mitochondrial genomes of nematodes previously reported, indicate that nematode lineages probably differ greatly in their rates of gene rearrangement, which causes large differences in the genetic systems maintaining the mitochondrial genomes.
The contradiction between gene arrangements and the phylogeny constructed from the nuclear small rRNA gene points to diversified rates of gene-order evolution among lineages. This devalues gene order as a tool for resolving the lineages of this group, since the true phylogenetic information is difficult to recover correctly for each lineage under such diversified rates. The high economization of the

mitochondrial genome follows the common trend of metazoan mitochondrial genome evolution. The absence of the ATP8 gene and the use of tRNA genes with both the D/TV-replacement loop structure and the cloverleaf structure indicate that the mitochondrial genome of X. americanum has features of both Enoplean and Chromadorean nematodes. These features fit the position of X. americanum, which is more primitive than the Chromadorean nematodes and more derived than Trichinella spiralis. However, the probably unique features, namely the interrupted mt-l-rRNA gene (a feature found in protist mitochondrial genomes) and the nuclear tRNA gene-like pseudo-tRNA-Phe (never reported for metazoans), point to the unique evolutionary state of the genetic system of X. americanum.

5 Ribosomal genes of longidorids (Nematoda: Enoplea: Dorylaimida: Longidoridae): sequences, secondary structures, and phylogeny

5.1 Introduction

Longidorids belong to the family Longidoridae (Dorylaimida). The family is subdivided into two subfamilies: Longidorinae and Xiphinematinae. Within the Longidorinae, five genera, Longidorus (107 valid species), Paralongidorus (42 valid species), Longidoroides (19 valid species), Xiphidorus (8 valid species) (Coomans, 1984/85) and Paraxiphidorus (Coomans & Chaves, 1995), are classified into two tribes: Xiphidorini, with the genera Xiphidorus and Paraxiphidorus, and Longidorini, with the remaining three genera. One genus, Xiphinema, is classified in the Xiphinematinae, with 296 nominal taxa corresponding to 234 valid taxa, 49 junior synonyms and 13 species inquirendae (Coomans et al., 2001). All of the species are ectoparasites. Some of them can transmit plant viruses (nepoviruses) and are considered economically important quarantine organisms (Taylor & Brown, 1997). Except for the genera Longidoroides and Paraxiphidorus, species from each genus were sampled and used in the phylogenetic study.

Within the longidorids, one group of species, the Xiphinema americanum group, is worth mentioning here. X. americanum was the type and only species when the genus Xiphinema was established by Cobb (1913). According to the polytomous key of Loof & Luc (1990), the common morphological characters of this group are: small body (1-3 mm), c-shaped or spiral; two well-developed genital branches; no uterine differentiation; tail short conical to broadly convex-conoid; vulva at 40-60% of the body length from the head. As more populations of X. americanum from different geographical localities were sampled and investigated, taxonomists reported considerable morphological variation among the populations studied. Lima (1965) and Tarjan (1969) then suggested that X. americanum is a complex of several species. To date, fifty-one species have been recorded in this group (Lamberti et al., 2000). Identifying those species by traditional morphological observation and morphometrics is very difficult because the characters used overlap. Several questions arise: What is the true phylogeny behind such a complicated group? How can all species within this group be correctly positioned?

Phylogenetic interpretation of the Longidoridae is very limited. Recently, Rubtsova et al. (2001) used a molecular approach to investigate the phylogeny of a few species of this family. Valuable work has been done by Coomans et al. (2001), who studied the phylogeny of the whole genus Xiphinema based on selected morphological characters. Molecular systematic approaches are considered useful for addressing issues where morphological characters lead to ambiguous interpretations. In recent years, ribosomal RNA genes have attracted the attention of many systematists and evolutionists because of their functional importance and their assumed ability to record the evolutionary history of organisms faithfully (Lydeard, 2000).
Many nematologists have also used rRNA gene sequences to infer the phylogeny of nematodes (Subbotin et al., 2000; Blaxter et al., 1998; Kaplan et al., 2000; Al-Banna et al., 1997). When they are assembled into the ribosome together with proteins, ribosomal RNAs are

usually folded into complicated secondary and tertiary structures. Although some secondary and tertiary structures have been determined by X-ray crystallography (Cate et al., 1999) or cryo-electron microscopic (EM) reconstruction (Mueller et al., 2000), most of the structures deposited in public databases, such as the Antwerp databases of large (Van de Peer et al., 1996) and small rRNA sequences (De Rijk et al., 1999) and the comparative RNA web site (Cannon et al., 2002), are derived from comparative analysis, which infers the folding from compensatory substitutions and pairing patterns shared by many sequences. The secondary structure of rRNAs is very useful for improving sequence alignments, which are critical for phylogeny construction (Kier, 1995; Hickson et al., 1996). Researchers have achieved some success by using alignments refined with the aid of secondary structure and an optimized computer algorithm (Titus & Frost, 1996). Additionally, secondary structure can provide useful information for weighting sites in weighted parsimony or other weighted methods. Structural motifs themselves may also retain useful information for phylogenetic inference (Lydeard, 2000).

With this research I tried to answer the aforementioned questions concerning the phylogenetic position of the species within the longidorids. My work is the first attempt to address the phylogeny of the whole family based on molecular data from the rRNA genes. I also tried to provide systematists with persuasive molecular information to aid in the reconstruction of those taxa.

5.2 Materials and Methods

Taxon sampling

Nematode samples collected for this study are listed in Table 3.1, including 23 species from the genus Longidorus, two species from the genus Paralongidorus, two species from the genus Xiphidorus, and 35 species from the genus Xiphinema.

Total DNA extraction

DNA extracted from a single juvenile or adult was used for the PCR. The details of the extraction procedure are described in Chapter

Amplification and cloning

Ribosomal genes 26S, 18S and the IGS (intergenic spacer) were amplified by LA-PCR (Barnes, 1994; Cheng et al., 1994) using primers 18VNS and 28VNS (Table 5.1) and the Elongase kit (Invitrogen, Merelbeke, Belgium). The PCR consisted of 22 cycles of denaturation at 94 °C for 30 sec, annealing for 30 sec and extension at 72 °C for 15 min, then another 13 cycles of denaturation at 94 °C for 30 sec, annealing at 50 °C for 30 sec and extension at 72 °C for 15 min, followed by a final extension at 72 °C for 10 min. A touchdown annealing profile was applied to the first 22 cycles, with the annealing temperature decreasing by 0.5 °C each cycle until it reached 50 °C; the extension time was increased by 15 sec each cycle during the last 13 cycles. The ITS (internal transcribed spacer) and 5.8S gene were amplified using primers VN18 and VN28 (Table 5.1) and a PCR kit from Qiagen (Qiagen GmbH, Postfach, Germany). The PCR was performed in a PTC-100/200 thermocycler (MJ Research, Biozyme, San Diego, USA). The cycling conditions were 94 °C for 3 min; 35 cycles of 94 °C for 30 sec, 54 °C for 40 sec and 72 °C for 2 min; followed by an extension at 72 °C for 10 min. The D2 and D3 expansion regions were amplified with primers D2A and D3B (Table 5.1). The cycling profile was 94 °C for 3 min; 35 cycles of 94 °C for 30 sec, 54 °C for 40 sec and 72 °C for 1 min; followed by an extension at 72 °C for 10 min. PCR products were separated in a 1% agarose gel, stained with ethidium bromide and visualised under UV. The fragments were recovered from the gel by excision and purified with a gel-purification kit (Qiagen). The fragment was then cloned into the pGEM-T vector (Promega, Leiden, The Netherlands).
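The touchdown LA-PCR programme described above can be written out cycle by cycle. The following is only a minimal sketch under stated assumptions: the function name and its parameters are illustrative, the starting annealing temperature is not given in the text and has to be supplied by the user, and it is assumed that the 15 sec extension increment is first added in cycle 23.

```python
def touchdown_profile(start_anneal_c, final_anneal_c=50.0, step_c=0.5,
                      touchdown_cycles=22, plateau_cycles=13,
                      extension_s=15 * 60, extra_extension_s=15):
    """Return a list of (cycle number, annealing temperature in C, extension time in s).

    start_anneal_c is an input chosen by the user; it is NOT stated in the protocol.
    """
    profile = []
    # Touchdown phase: annealing temperature drops by step_c per cycle,
    # but never below the final annealing temperature.
    for i in range(touchdown_cycles):
        anneal = max(final_anneal_c, start_anneal_c - step_c * i)
        profile.append((i + 1, anneal, extension_s))
    # Plateau phase: fixed annealing temperature, extension grows each cycle
    # (assumption: the first plateau cycle already carries one increment).
    for j in range(plateau_cycles):
        cycle = touchdown_cycles + j + 1
        profile.append((cycle, final_anneal_c,
                        extension_s + extra_extension_s * (j + 1)))
    return profile


if __name__ == "__main__":
    # 61.0 is a hypothetical starting temperature used only for illustration.
    for cycle, anneal, ext in touchdown_profile(61.0):
        print(cycle, anneal, ext)
```

Writing the profile as data makes it easy to check the total number of cycles and the total extension time before programming the thermocycler.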

Table 5.1. Primers used in our study.

Primer code   Primer sequence (5'-3')      Reference
18VNS         TGTACAAAGGGCAGGGACG          Vrain et al. (1992)
28VNS         TTCCTTAGTAACGGCGAGTG         Vrain et al. (1992)
VN18          TTGATTACGTCCCTGCCCTTT        Petersen et al. (1997)
VN28          TTTCACTCGCCGTTACTAAGG        Petersen et al. (1997)
D2A           ACAAGTACCGTGAGGGAAAGTTG
D3B           TCGGAAGGAACCAGCTACTA

Sequencing and sequence analyses

A direct sequencing strategy was used for the D2 and D3 amplification products. For the ITS fragment, a primer-walking strategy was used to obtain the complete sequence. For the 26S and 18S fragment, the GPS-1 genome priming system (New England Biolabs) was used to insert the universal primer island into the cloned fragment. SpeI and NotI were used to map the position of the primer island and, according to the mapping results, appropriate clones were selected for sequencing. The sequencing reactions were carried out using a BigDye terminator cycle sequencing kit (ABI). The final sequences were determined on an ABI PRISM 377 genetic analyser (ABI). Sequences were assembled and edited with BioEdit (Hall, 1999). Primary sequence alignments were performed using ClustalX (Thompson et al., 1997).

Secondary structure construction and analyses

Mfold (Zuker et al., 1999) was used to aid the construction of the secondary structure of both rRNA genes, based upon the primary sequence alignment with the genes of Caenorhabditis elegans (Ellis et al., 1986), Drosophila melanogaster (Tautz et al., 1988) and Xenopus laevis (Clark et al., 1984) and with reference to their predicted secondary structures. The V4 region of the 18S gene was constructed with reference to the work of Wuyts et al. (2000). The secondary structure model of the D2 region was inferred with the aid of

MWM algorithm implemented in circle (Tabaska et al., 1998) and Mfold (Zuker et al., 1999), and was used to edit the sequence alignment. Subsequent work was based on the optimised alignment. The exact ends of the rRNA genes should be determined by sequencing the mature rRNAs or cDNA. Secondary structures were edited using RNAviz (De Rijk et al., 1997). All nucleotide polymorphisms among three species, L. macrosoma (code LM1), X. americanum (code XA1) and X. brevicollum (code XB1) (codes refer to Table 3.1), were mapped onto the secondary structure models.

Nucleotide diversity

Nucleotide diversity (π) was calculated for each region of the ribosomal RNA gene cluster (28S gene, IGS (intergenic spacer), ETS (external transcribed spacer), 18S gene, ITS1 (internal transcribed spacer 1), 5.8S gene, ITS2) using the software DnaSP (Rozas & Rozas, 1999). π is given by the following formula:

$$\pi = \frac{\Pi}{L}, \qquad \Pi = \frac{\sum_{i<j} \Pi_{ij}}{n(n-1)/2}$$

where L is the length of the sequence; Π is the average number of nucleotide differences between two sequences randomly chosen from the sample; n is the number of sequences in the sample; Π_ij is the number of nucleotide differences between the ith and jth sequences; and n(n-1)/2 is the number of possible pairs.
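The definition of π given above can be illustrated with a few lines of Python. This is a minimal sketch and not the DnaSP implementation: the function names are illustrative, the toy sequences are invented, and the handling of gaps (a pair is simply not counted at a site where either sequence has a non-ACGT character, while L remains the full alignment length) is a simplifying assumption.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Number of aligned sites at which two sequences carry different bases.

    Sites where either sequence has a gap or ambiguity code are skipped
    (a simplification; other programs may treat such sites differently).
    """
    return sum(1 for x, y in zip(a, b)
               if x != y and x in "ACGT" and y in "ACGT")


def nucleotide_diversity(seqs):
    """pi = Pi / L, with Pi the mean number of differences over all n(n-1)/2 pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total = sum(pairwise_differences(a, b) for a, b in combinations(seqs, 2))
    big_pi = total / (n * (n - 1) / 2)  # average number of pairwise differences
    return big_pi / length              # per-site nucleotide diversity


if __name__ == "__main__":
    # Invented toy alignment of three sequences, 10 sites each.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(nucleotide_diversity(toy))
```

For the toy alignment the three pairs differ at 1, 1 and 2 sites, so Π = 4/3 and π = Π/10.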

Phylogeny inferences

Alignment

The alignment is the first critical step in phylogeny inference. A good alignment usually leads to the correct phylogeny; an improper alignment will give a misleading phylogeny. Sequence alignments made by ClustalX with default parameters (gap open penalty score and extension score 6.66) were manually edited using BioEdit (Hall, 1999) according to the secondary structure information. Most of the analyses were based on this alignment (coded A10; see Table 5.2). Eight alignments were also made using the arbitrary gap open and extension penalty scores that were used for the dataset derived from D1-D3 of the 28S gene (Karen, 2002): 20:5, 13:5, 12:7, 8:3, 7:2, 5:1, 3:2 and 3:0.05 (coded A1-A8 in Table 5.2). Gap columns and obviously improper alignments were manually edited. Phylogeny was inferred from each alignment independently. The eight alignments were also merged into a single large dataset for elision analysis (Wheeler et al., 1995), on the assumption that disagreement among alignments is naturally down-weighted and agreement receives higher weight, so that regions with an extensive history of insertion and deletion events are automatically down-weighted.

Phylogenetic analyses

The separate and combined datasets of D2 and D3 sequences were analysed. The D2 and D3 regions may evolve at different rates and carry different historical records of evolution. Therefore, a homogeneity test (Farris et al., 1994) was used to measure the incongruence between the two regions in order to decide whether to analyse the combined dataset. Each phylogenetic method has its own Achilles heel, and the use of multiple methods increases confidence in the inferred phylogeny if they produce the same results. Therefore, trees were established using the criteria MP (Maximum parsimony),

ML (Maximum likelihood), and the distance method with ME (Minimum evolution), as implemented in PAUP 4.0 (Swofford, 2002). Weighted parsimony was also performed on the D2 dataset and the combined D2 and D3 dataset (with a weight of 1 assigned to the ambiguously aligned regions and 3 to the well-aligned regions). The weighting strategy was based on the secondary structure model of the D2 and D3 regions. Branch or topology support was assessed by non-parametric or parametric bootstrap. For the maximum parsimony method, decay indices were also calculated for each branch. Homogeneity of nucleotide composition was tested with the χ² statistic implemented in PAUP.

The MP method was used for the D2 dataset, the D3 dataset, the combined D2 and D3 dataset and the elision matrix. The search was heuristic, with the TBR (tree bisection and reconnection) swapping algorithm and 10 random additions of sequences. One hundred non-parametric bootstrap replicates were analysed with the heuristic search algorithm. Decay indices (Bremer, 1994) were calculated with Autodecay (Ericsson, 2001).

The ML method was used for the D2 dataset, the D3 dataset, and the combined D2 and D3 dataset. The maximum likelihood model was selected by LRT (log likelihood ratio test) as implemented in the software Modeltest (Posda, 1998): nested models were evaluated by LRT and the best model was selected for the maximum likelihood analysis. The starting tree for the LRT was obtained by the NJ (neighbor joining) method or was the best maximum parsimony tree. The heuristic search algorithm was used for tree searching with the TBR swapping algorithm. The starting tree was obtained by stepwise addition. The number of TBR rearrangements was limited to 10,000 because of the heavy computation time. Eleven searches were performed; in each search, parameters were modified or the heuristic search method was adjusted. The eleven ML trees were compared using the KH test with RELL approximation (Kishino & Hasegawa, 1989).
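The elision matrix analysed under MP above is, in essence, the end-to-end concatenation of the alternative alignments of the same taxa, so that a column recovered identically under every gap-penalty setting is counted once per alignment. The sketch below illustrates only that merging step; the function name and the dictionary representation of an alignment are assumptions made for illustration, and the toy data are invented.

```python
def elision_matrix(alignments):
    """Concatenate several alignments of the same taxa into one matrix.

    alignments: list of dicts mapping taxon name -> aligned sequence.
    Returns a dict mapping taxon name -> concatenated sequence.
    """
    if not alignments:
        raise ValueError("no alignments given")
    taxa = sorted(alignments[0])
    for aln in alignments:
        if set(aln) != set(taxa):
            raise ValueError("all alignments must contain the same taxa")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences within an alignment must have equal length")
    # Join the rows of each taxon in the order the alignments were given.
    return {taxon: "".join(aln[taxon] for aln in alignments) for taxon in taxa}


if __name__ == "__main__":
    # Two invented alignments of the same two taxa, differing in gap placement.
    a1 = {"taxon1": "AC-G", "taxon2": "ACTG"}
    a2 = {"taxon1": "ACG-", "taxon2": "ACTG"}
    for name, row in elision_matrix([a1, a2]).items():
        print(name, row)
```

Because the merged matrix is an ordinary alignment, it can be exported and analysed with the same parsimony settings as the individual alignments.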
The best tree topology selected (the one with the highest log likelihood value) was tested by the SOWH test (Goldman et al., 2000) using parametric bootstrap methods. Parametric bootstrap replicates were produced with Seq-Gen, which implements Monte Carlo simulation (Rambaut & Grassly, 1997), based on the

ML estimated parameters of the given topology. Non-parametric bootstrap methods were used to calculate branch support. One hundred replicates were used for the heuristic algorithm with the ML criterion or the distance criterion with the minimum evolution (ME) model. Distance methods used the minimum evolution model and uncorrected p-distance. One thousand non-parametric bootstrap replicates were calculated under the same criterion to estimate branch support.

The phylogeny of Longidorus species was also inferred from the morphological characters selected in the most recent polytomous key (Chen et al., 1997). The methods used were MP and NJ. Non-parametric bootstrap was performed for the dataset. Both phylogeny inferences were compared to estimate the most accurate phylogeny on the basis of the datasets sampled in our study. The phylogeny of the genus Xiphinema was compared with the phylogeny based on morphological characters inferred by Coomans et al. (2001).

5.3 Results and discussions

Nucleotide diversity and secondary structure of ribosomal RNAs

Nucleotide diversity

The nucleotide diversity (π) was analysed using a sliding window (Rozas & Rozas, 1999) (Fig. 5.1), in which the average pairwise number of nucleotide differences in a window of 50 sites was plotted at each nucleotide position. The value of parameter π for each region is: for the 28S gene, for the 18S gene, for the 5.8S gene, for the ETS region, for the ITS1, for the ITS2, and for the IGS region. It can be concluded that the 18S and 5.8S genes are highly conserved among the three species. The 28S gene has higher nucleotide diversity than the 18S and 5.8S genes; however, compared with the non-coding regions, it is still very conserved. The sliding window graph shows that diversity is distributed discretely in the gene regions and more continuously in the non-coding regions. This suggests that the diversity of gene regions

is usually centred in regions with fewer functional constraints, whereas the non-coding regions are usually considered to evolve under weaker selection, resulting in more homogeneously distributed polymorphic sites.
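The sliding-window analysis of nucleotide diversity described above can be sketched as follows. This is an illustration under stated assumptions rather than the DnaSP procedure: the function name is illustrative, a step of one site is assumed because the step size is not given in the text, each window is reported at its midpoint, and a gap is treated as just another character state.

```python
from itertools import combinations


def sliding_window_diversity(seqs, window=50, step=1):
    """Per-site nucleotide diversity in overlapping windows of an alignment.

    Returns a list of (window midpoint, pi) tuples; midpoints are 0-based.
    """
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two aligned sequences of equal length")
    n_pairs = n * (n - 1) / 2
    # Number of sequence pairs that differ at each alignment column.
    per_site = [sum(1 for a, b in combinations(column, 2) if a != b)
                for column in zip(*seqs)]
    result = []
    for start in range(0, length - window + 1, step):
        differences = sum(per_site[start:start + window])
        # Mean pairwise differences in the window, divided by window length.
        result.append((start + window // 2, differences / n_pairs / window))
    return result


if __name__ == "__main__":
    # Invented toy alignment; a small window is used so the output is short.
    toy = ["AAAAAA", "AAATAA", "AAAAAA"]
    for midpoint, pi in sliding_window_diversity(toy, window=3):
        print(midpoint, round(pi, 4))
```

Plotting the returned values against the midpoints gives a graph of the same kind as the sliding window figures, in which conserved gene regions appear as stretches of zero or near-zero diversity.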


L-rRNA variation

The length of the complete L-rRNA of longidorids is 3755 bp (X. americanum and X. brevicollum) or 3811 bp (L. macrosoma), which is smaller than the L-rRNA of mammals and birds (about 4000 bp). Most of the numbered stem-loops common in other organisms (De Rijk et al., 1999) are also present in the L-rRNA of longidorids (X. americanum, X. brevicollum and L. macrosoma). The secondary structure derived from the sequences of the three species is shown in Fig. Variable nucleotide sites are indicated, and the alternative nucleotides observed at each site are marked. Most of the polymorphic sites are observed in the D2 region (C1 stem-loop region), stem-loop structures D2-D5_1 (D3 region), D20, E9_1, E20_1-E20_6, H1_2 and H1_3. This can also be seen in the sliding window graph (Fig. 5.1). Polymorphism is due to point mutation, insertion (stem-loops E9_1, G4, H1_2, H1_3) and deletion (stem-loop H4). Co-variation or compensatory substitution in the stem structures contributes greatly to the polymorphism of those regions. Compared with the L-rRNA of Caenorhabditis elegans (Ellis et al., 1986), the major difference in secondary motifs is the presence of stem-loops H1_2 and E11_2; compared with Xenopus laevis (frog) (Clark et al., 1984), the difference is the presence of E11_2.

The secondary structure model of the D2 expansion region was inferred from the comparative analysis of 62 species of longidorids (Fig. 5.3). It consists of three stem-loop regions. The polymorphic sites are evenly distributed in each stem-loop branch, as can be observed from the sliding window graph (Fig. 5.4). The value of parameter π is , which is much higher than that of the 28S gene ( ). This indicates that the D2 region is a mutation hotspot. However, the π value for the D3 region is , which is even lower than the diversity in the 28S gene.
Stem-loop structure D4_1 in the D3 expansion region is absent in L. latocephalus, L. profundorum, L. piceicola, L. intermedius, L. carpathicus, L. elongatus, L. juvenilis, and L. leptocephalus. However, the

nucleotides located in this stem-loop are not parsimony-informative, and L. latocephalus and L. profundorum did not group closely with the other six species. These results indicate that this secondary structure motif does not contain useful phylogenetic information. A secondary structure model for the expansion part of stem-loop E20 was also constructed (Fig. 5.3). Polymorphism in this region is lower than in the D2 expansion region and the stem-loop H1 region. The polymorphic sites are aggregated in the E20_6 stem-loop region. The polymorphism is caused by point mutations and insertions in the loop region and by compensatory mutations and co-variation in the stem region. Sequence data from the D2 and D3 expansion regions were used to construct the phylogeny of longidorids; the details are discussed later in this chapter.


Fig. The secondary structure model of the large rRNA gene of X. brevicollum: stem-loop numbers are given according to De Rijk et al. (1999); secondary structure models for two variable regions are presented in Figure 2; two red arrows drawn in B21 and D10 indicate the sequenced D2 and D3 regions; non-canonical UG pairs are connected by black circles; other non-canonical pairs (AC, UU, CC, etc.) are connected by hollow circles; characters in ochre indicate the variable nucleotides at the corresponding site in the large rRNA gene of L. macrosoma; characters in blue indicate the alternative nucleotide for that site in the large rRNA gene of X. americanum; green characters in the D3 region are parsimony-informative sites obtained from the analyses of sequences of 62 species; characters marked by arrows are insertion sequences.

Fig. The secondary structure models for the D2 expansion region (A) and the E20 stem-loop expansion region (B): non-canonical UG pairs are connected by black circles; other non-canonical pairs (AC, UU, CC, etc.) are connected by hollow circles. (A): characters in black indicate the highly variable sites; characters in blue indicate the conserved sites with only compensatory substitution or co-variation; characters in red indicate the sites conserved among the species analysed; stem-loop numbering follows (ref). (B): stem-loop numbers follow the same rule as for the D2 model; characters in ochre indicate the variable nucleotides for that site in the large rRNA gene of L. macrosoma; characters marked by arrows are insertion sequences.

Fig. Sliding window graphs for the D2 and D3 expansion regions, calculated from analyses of 62 longidorids.
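A sliding window graph of this kind can be sketched as follows. This is a minimal illustration, not the procedure used for the figure: the variability measure (proportion of variable alignment columns per window), the window size and the toy alignment are all assumptions made for the example.

```python
from typing import List, Tuple

def variable_site_mask(alignment: List[str]) -> List[bool]:
    """True for every alignment column with more than one nucleotide state.

    Gap ('-') and ambiguity ('N') characters are ignored (an assumption).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    mask = []
    for col in range(length):
        states = {seq[col].upper() for seq in alignment} - {"-", "N"}
        mask.append(len(states) > 1)
    return mask

def sliding_window_variability(
    alignment: List[str], window: int = 5, step: int = 5
) -> List[Tuple[float, float]]:
    """Return (window midpoint, fraction of variable sites) for each window."""
    mask = variable_site_mask(alignment)
    points = []
    for start in range(0, len(mask) - window + 1, step):
        fraction = sum(mask[start:start + window]) / window
        points.append((start + window / 2, fraction))
    return points

# Hypothetical toy alignment, not thesis data.
toy_alignment = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTCC"]
for midpoint, fraction in sliding_window_variability(toy_alignment):
    print(midpoint, fraction)
```

Plotting the midpoints against the fractions gives the kind of curve shown in a sliding window graph; peaks mark the most variable stretches of the region.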

S-rRNA variation

The S-rRNA gene was 1826 bp long in X. americanum and X. brevicollum, and 1830 bp in L. macrosoma. It is longer than that of C. elegans (1763 bp) (Ellis et al., 1986) and shorter than that of D. melanogaster (1995 bp) (Tautz et al., 1988). The length variation is due to differences in the V4 and V7 regions. Among the three longidorids, most of the polymorphic sites were located in V4 and V2. The remaining polymorphic sites were distributed in V5, V7, V8 and V9 (Fig. 5.5). Polymorphic sites between the two closely related species X. americanum and X. brevicollum were found in V2 (1 site), V5 (1 site) and V7 (1 site). All polymorphic sites are indicated in the figure. The observed polymorphic sites are due to compensatory mutation or co-variation in the stem regions, or to point mutation, insertion and deletion in the loop regions. Because of their conservation, 18S gene sequences have been used to construct a phylogeny for Nematoda (Blaxter et al., 1998) and to establish molecular barcodes for identification (Floyd et al., 2001).


Fig. The secondary structure model for the small rRNA. The stem-loop numbering follows Wuyts et al. (2002); V1-V9 represent the variable regions. Non-canonical UG pairs are connected by black circles; other non-canonical pairs (AC, UU, CC, etc.) are connected by hollow circles. Characters in red indicate the variable nucleotides at that site in the small rRNA gene of L. macrosoma; characters in blue indicate the alternative nucleotide at that site in the small rRNA gene of X. americanum.

Phylogeny

The base composition of the D2 and D3 expansion regions did not show high heterogeneity. A homogeneity test of base composition was performed on the sequences of both regions. No significant difference in base composition was observed for the D2 region (χ² = , df = 219, P = ) or for the D3 region (χ² = , df = 219, P = ). The partition homogeneity test (Farris et al., 1994) resulted in P = 0.22, which supported the combined analysis of the D2 and D3 regions. The phylogenetic analyses based on the D2 sequences resulted in the same tree topology as that obtained from the combined analyses (Fig. ). The separate analyses of the D3 sequences indicated that this region does not contain enough phylogenetic information to resolve all taxa included in the analyses. However, the combined analyses increased the support for the monophyly of both Longidorus and Xiphinema (Fig. ). The details of the phylogenetic analyses and the results are summarized in Table 5.2.

Positions of genera

The genus Paralongidorus was grouped with species of the genus Longidorus (Fig. ). The phylogenetic analysis by Rubtsova et al. (2001) resulted in the same position. The genus Xiphidorus was grouped with the Xiphinema americanum-group species within the genus Xiphinema. The same phylogenetic groups were found in the ML tree (Fig. 5.6), the MP tree (Fig. 5.7) and the NJ tree (Fig. 5.8) (with the minimum evolution criterion). However, the bootstrap support for the monophyly of Xiphinema (67% for ML and 68% for MP analyses on the combined dataset) and of Longidorus (59% for ML and 61% for MP analyses on the combined dataset) was not strong. The decay indices calculated from the combined dataset are 5 for Longidorus and 2 for Xiphinema, which are not high either. Parsimony analysis of the elision-merged data matrix resulted

in strong support for the monophyly of Xiphinema; the monophyly of Longidorus was not supported (Fig. 5.10). The two Xiphidorus species were grouped together with a bootstrap value of 100% for the combined dataset in both the MP and ML trees. The Xiphidorus group was grouped with the X. americanum group with strong branch support (bootstrap value 99% and decay index 10 for the MP method; bootstrap value 99% for the ML method on the combined dataset). The clade Paralongidorus was strongly supported (100% bootstrap for both the ML and MP methods on the combined dataset). It clustered close to L. africanus, but the support for this position is very low (less than 50% bootstrap, decay index 4). The MP analysis performed on the elision dataset resulted in the same positions for Xiphidorus and Paralongidorus. Weighted parsimony analyses did not improve the support for the monophyly of Xiphinema or Longidorus. The tree topology obtained from the weighted MP method is the same as that from unweighted MP (data not shown).

Details of the phylogenetic analyses

The phylogenetic analyses of the D2 dataset and the combined dataset produced similar tree topologies with the ML, MP and distance (ME) methods. However, analyses of the D3 dataset did not resolve all lineages, although they recovered the strongly supported groups found in the trees produced from the D2 and combined datasets. I performed MP and ML analyses on nine different alignments: eight alignments with the parameters mentioned in the Material and Methods section (coded A1-A8), and one alignment made with the default parameters of ClustalX (gap open penalty score and extension penalty 6.66) and edited according to the D2 and D3 secondary structure models (coded A9 in Table 5.2). The best tree topology was inferred from alignment A9, which was refined with the secondary structure model. Very similar tree topologies were generated from this alignment by the different methods. MP analyses were performed on the elision-culled data

matrix (the eight alignments mentioned above) for the D2 and the combined datasets. Similar tree topologies were obtained from both analyses (Fig. 5.10). MP analysis of the D2 dataset produced 3324 most parsimonious trees with tree length TL = . The KH-test (Kishino & Hasegawa, 1989), the non-parametric Templeton (Wilcoxon signed-rank) test and the winning-sites (sign) test (Templeton, 1983) showed no significant difference between these topologies. A consensus tree was made from the 3324 most parsimonious trees (Fig. 5.11A). MP analysis of the D3 dataset produced a set of equally parsimonious trees with TL = 489. The consensus tree is presented in Fig. 5.11B. The tree length distributions for the D2 and D3 datasets are left-skewed (g1 = and g1 = , respectively). This indicates that the datasets contain good phylogenetic signal for parsimony analysis (Hillis & Huelsenbeck, 1992). The g1 statistic was calculated by evaluating the tree length distribution of random trees; the combined dataset had g1 = . MP analysis of the combined dataset resulted in 200 most parsimonious trees with TL = 2751. A consensus tree of the 200 equally parsimonious trees is presented in Figure 5.7. Bootstrap values and decay indices were calculated for the consensus trees and added to the corresponding nodes. MP analyses of the D2 and the combined datasets resulted in very similar tree topologies. Slight differences were found in the positions of L. elongatus and L. carpathicus: in the MP tree based on the D2 dataset, L. carpathicus is closer to the internal group L. intermedius and L. piceicola, with strong bootstrap support (81%) but a low decay index (1). In both trees (Fig. 5.7 and Fig. 5.11A), the branch support (decay index) for the positions of L. elongatus and L. carpathicus was only 1. Their positions are therefore not stable in either tree. X. chambersi and L. apulus also slightly changed their positions.
The branch support for their positions is very low in the MP tree based on the D2 dataset: the decay index is only 1 for X. chambersi and 0 for L. apulus. Bootstrap support is less than 50% for both. However, in the MP tree based on

the combined dataset, the decay index increased to 3 for L. apulus and to 4 for X. chambersi, with bootstrap support still below 50%. Apparently, the phylogenetic signal contained in the D3 region contributed to these improved branches. Including characters from the D3 region also improved the support for several other branches. The monophyly of Xiphinema and Longidorus was supported by higher bootstrap values and decay indices: 61% bootstrap and decay index 5 versus 53% bootstrap and decay index 1 for Longidorus; 68% bootstrap and decay index 2 versus 55% bootstrap and decay index 1 for Xiphinema. The support for some internal branches also increased: the internal node connecting to X. chambersi (decay index increased from 1 to 4) and the internal node connecting to X. italiae (bootstrap increased from 73% to 85%). However, the support for the X. americanum group decreased (bootstrap from 85% to 62% and decay index from 3 to 2). MP analyses of the elision-culled matrix of the D2 or the combined dataset resulted in the same topology (Fig. 5.10). Maximum likelihood analyses were performed on the separate and combined datasets. The tree topology obtained from the D2 or the combined dataset is similar to that of the MP trees. The selected model is GTR+Γ+I (general time reversible plus gamma-distributed rates and a proportion of invariable sites). Non-parametric bootstrapping was run with minimum evolution and ML distances or with the ML criterion, with TBR restricted to 100 per replicate. Under both conditions, the calculated bootstrap values agree closely. Eleven heuristic searches resulted in 11 ML trees, which were compared by the KH-test. There is no significant difference between the eleven topologies obtained. The tree with the highest likelihood score ( for the combined dataset and for the D2 dataset) was selected as the default best ML tree. The tree topology was evaluated by the SOWH-test (a parametric-bootstrap-based test for the best tree topology).
One hundred bootstrap replicates were simulated using the parameters estimated from the tested topology with the original dataset. Heuristic searches were performed for each simulated replicate with the estimated ML parameters. From the likelihood of the ML tree obtained from each replicate, I subtracted the likelihood estimated for the default topology

with the same replicate (δ = L_ML - L_default, where L represents the maximum likelihood). A one-tailed test was performed (H0: δ = 0, HA: δ > 0), resulting in P = . The trees were therefore statistically accepted as the best topology. Comparing the topologies obtained from the D2 and the combined datasets, I did not find any significant topological difference. However, branch support is much better with the combined dataset than with the D2 dataset (Fig. 5.9). The ML tree generated from the D2 dataset alone did not give strong support to the monophyly of either Xiphinema or Longidorus (bootstrap less than 50%). The ML tree based on the combined dataset had 67% bootstrap support for Xiphinema and 59% for Longidorus. I conclude that the D3 characters significantly improved the phylogeny construction although few of them are parsimony-informative (29 parsimony-informative sites contributed to the internal node leading to the bifurcation of Xiphinema and Longidorus). Poe and Swofford (1999) discussed the positive effect of adding characters to break long-branch attraction in the Felsenstein zone (Felsenstein, 1978). In my analyses, I found that the addition of characters could improve branch stability for the correct phylogeny. The distance method using uncorrected p-distances with the ME criterion produced a topology similar to the MP and ML trees. The bootstrap support for Longidorus is high (81%), whereas the support for Xiphinema is low (53%) (Fig. 5.8). The distinct difference in topology is the disruption of the group X. setariae and X. radicicola, which exists in the MP and ML trees with moderate bootstrap support (54%-70%).
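The one-tailed comparison used in the SOWH-test can be sketched as follows. This is only an illustration of the final counting step under the usual convention for parametric bootstrap tests, in which the δ observed on the original data is compared with the δ values obtained from the simulated replicates; the simulation and tree searches themselves are not reproduced, and all log-likelihood values below are hypothetical.

```python
from typing import Sequence

def delta(lnl_ml: float, lnl_default: float) -> float:
    """delta = L_ML - L_default, both given as log-likelihoods of one dataset."""
    return lnl_ml - lnl_default

def one_tailed_p(delta_observed: float, delta_simulated: Sequence[float]) -> float:
    """Fraction of simulated deltas that are at least as large as the observed one."""
    if not delta_simulated:
        raise ValueError("need at least one simulated replicate")
    exceed = sum(1 for d in delta_simulated if d >= delta_observed)
    return exceed / len(delta_simulated)

# Hypothetical values for illustration only (ten replicates instead of one hundred).
observed = delta(-17250.0, -17251.5)
simulated = [0.0, 0.4, 0.9, 1.5, 2.1, 0.3, 3.0, 0.7, 1.2, 0.1]
print(observed, one_tailed_p(observed, simulated))
```

A large P value means that the observed difference lies within the range expected when the tested topology is true, so the topology is not rejected.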

Fig. The tree with the highest likelihood score (-ln L = ) from the combined dataset, based on the alignment refined with the aid of the secondary structure model. Bootstrap values are given above the corresponding branches. The monophyly of Xiphinema and of Longidorus is moderately supported. Xiphidorus and Paralongidorus are not supported as independent genera. Species names are followed by the sampling codes (Table 3.1).

Fig. A consensus of the 200 most parsimonious trees (length 2751) from the combined dataset, based on the alignment refined with the aid of the secondary structure model. Bootstrap values are given above the corresponding branches and decay indices below them. The monophyly of Xiphinema and of Longidorus is moderately supported. Xiphidorus and Paralongidorus are not supported as independent genera. Species names are followed by the sampling codes (Table 3.1).

Fig. The NJ (neighbor joining) tree resulting from the combined dataset, based on the alignment refined with the aid of the secondary structure model. Bootstrap values are given above the corresponding branches. The monophyly of Xiphinema and of Longidorus is moderately supported. Xiphidorus and Paralongidorus are not supported as independent genera. Species names are followed by the sampling codes (Table 3.1).

Fig. The tree with the highest likelihood score (-ln L = ) from the D2 dataset, based on the alignment refined with the aid of the secondary structure model. Bootstrap values are given above the corresponding branches. The monophyly of Xiphinema and of Longidorus is not well supported by the bootstrap analyses. Xiphidorus and Paralongidorus are not supported as independent genera. Species names are followed by the sampling codes (Table 3.1).

Table 5.2. Summary of phylogenetic analyses on each data set.

Dataset | Method of analysis | Settings | Support for clades | Tree scores (b), number of best trees (b) | Monophyly of Longidorus (b) | Monophyly of Xiphinema (b)
D2 A1 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 2846, , 1003 | No (c) | No (a)
D2 A2 | MP | | | 2831, , 1814 | Yes (60%), No (c) | No (a), Yes (59%)
D2 A3 | MP | | | 2839, , 1896 | Yes (63%), No (c) | No (a), Yes (59%)
D2 A4 | MP | | | 2815, , | No (c) | Yes (51%), No (a)
D2 A5 | MP | | | 2776, , 9423 | No (c) | Yes (54%), Yes (58%)
D2 A6 | MP | | | 2730, , | No (c) | No (a)
D2 A7 | MP | | | 2724, , | No (c) | No (a)
D2 A8 | MP | | | 2672, , | Yes (57%), Yes (67%) | No (a)
D2 A9 | MP | Gaps=missing, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences; decay analysis with 10 random additions of sequences | 2889, , 3324 | No (c), Yes (53%, 1) | Yes (96%), Yes (55%, 1)
 | ML | GTR+Γ+I, 11 searches | 100 bootstrap reps with 1 random addition of sequences | | No (c) | No (a)
 | ME distance | GTR+Γ+I | 1000 bootstrap reps | | No (c) | No (a)
D2 elision matrix | MP | Gaps=missing, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | | No (c) | Yes
D2+D3 A1 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3547, , 3687 | No (c) | Yes (89%), Yes (83%)
D2+D3 A2 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3535, , 124 | No (c) | Yes (91%), Yes (91%)
D2+D3 A3 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3535, , 147 | No (c), Yes (52%) | Yes (91%), Yes (82%)
D2+D3 A4 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3533, , 250 | No (c) | Yes (97%), Yes (97%)
D2+D3 A5 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3543, , 408 | No (c) | Yes (92%), Yes (76%)
D2+D3 A6 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3442, , 512 | No (c) | Yes (71%), Yes (80%)
D2+D3 A7 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3428, , 1421 | No (c) | Yes (86%), Yes (79%)
D2+D3 A8 | MP | Gaps=missing or 5th base, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences | 3381, , 2411 | No (c) | Yes (67%), Yes (65%)
D2+D3 A9 | MP | Gaps=missing, equal weights, search with 10 random additions of sequences | 100 bootstrap reps with 10 random additions of sequences; decay analysis with 10 random additions of sequences | 3552, , 200 | Yes (94%), Yes (61%, 5) | Yes (51%), Yes (68%, 2)
 | ML | GTR+Γ+I, 11 searches | 100 bootstrap reps with 1 random addition of sequences | | Yes (59%) | Yes (67%)
 | ME distance | Uncorrected p-distance | 1000 bootstrap reps | | Yes (81%) | Yes (53%)
D2+D3 elision matrix | MP | Gaps=missing, equal weights, search with 10 random additions of sequences | | | No | Yes
D3 A1 | MP | | | | No | No
 | ML | GTR+Γ+I | | | No | No
D3 elision matrix | MP | Gaps=missing, equal weights, search with 10 random additions of sequences | | | No | No

(a) Although the monophyly of Xiphinema was not supported, the monophyly of the Xiphinema americanum lineage and of the non-Xiphinema americanum group species was well supported, with average bootstrap values (averaged over all datasets) of 81% and 93% with gaps treated as missing and 90% and 100% with gaps treated as a new state in MP analyses of the D2 dataset; 80% and 95% with gaps treated as missing and 93% and 98% with gaps treated as a new state in MP analyses of the combined dataset. Xiphidorus clustered with the X. americanum lineage with an average bootstrap of 79% (gap mode set to missing) and 98% (gap mode set to new state) based on the D2 dataset, and 83% (gap mode set to missing) and 99% (gap mode set to new state) based on the combined dataset.

(b) The first value in the column was calculated with the gap mode set to new state; the second value, separated from the first by a comma, was calculated with the gap mode set to missing. Percentages in parentheses are bootstrap values and integers are decay indices (branch support).

(c) The bilobed amphids group (L. arthensis, etc.) is well supported, with an average bootstrap value of 97% under both conditions (gap mode set to missing and to new state) in MP analyses of the combined dataset, and 95% (gap mode set to missing) and 97% (gap mode set to new state) based on the D2 dataset. The funnel-shaped amphids group (L. macrosoma, etc.) is supported with a 100% average bootstrap value in all the above analyses. Please refer to Figure 12B for the amphid groups mentioned.

Fig. (A) The consensus tree of the 431 most parsimonious trees resulting from the MP analysis of the elision-merged data matrix for the D2 dataset; (B) the consensus tree of the 648 most parsimonious trees resulting from the MP analysis of the elision-merged data matrix for the combined dataset. Species names are followed by the sampling codes (Table 3.1). The monophyly of Xiphinema is supported in both analyses; the monophyly of Longidorus is not. Xiphidorus and Paralongidorus are not supported as independent genera.

Fig. (A) The consensus tree of the 3324 most parsimonious trees resulting from the MP analysis of the D2 dataset; (B) the consensus tree of the most parsimonious trees resulting from the MP analysis of the D3 dataset. Species names are followed by the sampling codes (Table 3.1). The monophyly of Xiphinema and of Longidorus is supported by the analyses of the D2 dataset but not by those of the D3 dataset. Bootstrap values and decay indices were calculated for the tree based on the D2 dataset and are shown above or below the supported branches.


Fig. (A) The consensus tree of the most parsimonious trees (TL = 43) resulting from the MP analysis of the morphological characters used in the polytomous key for Longidorus (Chen et al., 1997), and the NJ tree based on the same morphological characters. Bootstrap values were calculated and are shown above the corresponding branches. (B) Several morphological characters of Longidorus mapped onto the ML tree based on the combined dataset. The correspondence between the shape of the amphids and the nodes of the tree can be seen clearly: different shapes were assigned to different morphological characters, and different colors to the different character states of the same character.

Correlation with morphological characters and groups

Longidorus

To construct a phylogeny for Longidorus from morphological characters, I used the characters selected for the recent polytomous key (Chen et al., 1997). Both the MP and NJ methods produced trees without strong non-parametric bootstrap support for the branches (Fig. 5.12A). Only the L. macrosoma and L. helveticus group received 74% bootstrap support in both the NJ and MP consensus trees, which agrees well with the results of our molecular analyses. The remaining branches have bootstrap values below 50%. These results indicate that the morphological characters used in the polytomous keys are not sufficient for constructing the phylogeny of Longidorus. The only interesting correspondence between morphological characters and the phylogenetic trees is that the grouping of Longidorus species coincides with the similarity of their amphids (Fig. 5.12B). Two groups were observed in the tree. One group includes L. macrosoma, L. helveticus and L. caespiticola, with funnel-shaped amphids; the other group includes L. africanus, L. arthensis, L. sturhani, L. apulus, L. euonymus, L. athesinus, L. edmundsi, L. attenuatus, L. profundorum, L. breviannulatus, L. goodeyi, L. leptocephalus, L. juvenilis, L. carpathicus, L. elongatus, L. piceicola and L. intermedius, with symmetrically or asymmetrically lobed amphids. The remaining species, L. camelliae, L. latocephalus and L. diadecturus, do not form one group, although L. latocephalus and L. camelliae share the pouch type of amphids. This correspondence implies that the evolution of some molecules may be synchronous with the evolution of some morphological characters. This synchronous evolution facilitates the recovery of the correct

phylogeny based on both molecular and morphological analyses, especially for extant taxa without a fossil record of their ancestors, such as most nematode taxa. Evolution is a hypothesis-based research field. Without a fossil record, we cannot confirm the true historical phylogeny. However, a phylogeny supported by different types of data is more convincing, or at least fits better with some widely accepted hypothetical models.

Xiphinema

Xiphinema americanum-group species were strongly clustered with Xiphidorus. The branch support for this clade was high (93% to 100% bootstrap in trees generated by the NJ, MP or ML methods using the D2 or the combined dataset). The non-Xiphinema americanum-group species were clustered into one clade with strong branch support (89% to 99% bootstrap support in trees generated by the NJ, MP or ML methods using the combined dataset). These two groups were clustered into one clade with low bootstrap support (53% to 68% for topologies produced from the D2 or the combined dataset using different methods). No morphological character corresponded strictly with the clades in the molecular phylogeny. Species with uterine differentiation in the reproductive system were grouped in one clade with 91%-100% bootstrap support; however, this clade also includes species without uterine differentiation (Fig. 5.12B). Therefore, this morphological character is not strictly correlated with the molecular data. The phylogeny of the genus Xiphinema was constructed by Coomans et al. (2001) based on 44 morphological characters. Xiphinema americanum-group species were subgrouped into non-digitate conoid-tailed species for analysis. To facilitate the analyses, the authors subdivided the sampled species arbitrarily into several groups according to tail shape. The tree topology obtained from our analyses is very close to the topology obtained by Coomans et al. (2001). I made a tree derived from the phylogeny inferred by Coomans et al. (Fig. 5.13A). The groups that agreed best between the two analyses (morphology and our analysis based on D2 and D3 sequence data) were the group X.

dentatum and X. pyrenaicum, the group X. dentatum, X. pyrenaicum, X. index and X. diversicaudatum, the group including X. coxi and X. basiri, and the large group including all the above species and X. bakeri. The tree positions of the remaining species also correspond moderately well. However, there was one distinct difference between the two topologies: the position of the X. americanum group. A reasonable explanation is that the evolutionary rates of the D2 and D3 expansion regions of this group are not synchronized with the evolution of the selected morphological characters and also differ from those of the species outside this group. Considering all Xiphinema species except the Xiphinema americanum group, I performed a KH-test (with RELL approximation) between two derived trees containing only Xiphinema species (one tree was derived from the ML tree based on the combined D2 and D3 dataset (Fig. 5.13B); the other was derived from the tree shown in Figure 13A). The test result does not show significant difference between the two topologies (P < 0.05, P=0.000) except that the ML tree has a higher likelihood score than the tree based on morphology: vs .

Xiphinema americanum lineage

In our analyses, the X. americanum lineage appears as a clade closely related to Xiphidorus, with strong bootstrap support (Table 5.2). Within the lineage, I observed two groups that are also well supported in the analyses. One group includes X. americanum, X. brevicollum and several virus-vector species. The other group includes X. pachtaicum, X. pachydermum and X. brevisicum (an amphimictic species) (Lamberti et al., 2000). I propose naming these two groups the X. americanum subgroup and the X. pachtaicum subgroup. In comparison with the cluster analysis based on morphological characters (Lamberti & Ciancio, 1993), in which the X. americanum group was subdivided into the X. brevicollum, X. americanum, X. taylori, X. pachtaicum and X. lamberti subgroups, our results merged the X. taylori subgroup into the X. brevicollum subgroup that is part of the

X. americanum subgroup (Fig. 5.14). Because I did not obtain a sample from the X. lamberti subgroup, the position of this subgroup remains unclear.

Fig. (A) This tree was derived from the phylogeny of Xiphinema constructed by Coomans et al. (2001). I removed all species that were not included in our analyses to obtain this topology. The tree facilitates the comparison between the phylogenies based on morphological characters and on molecular data. (B) This tree takes the topology for the Xiphinema species (non-Xiphinema americanum group) from our ML and MP trees.

Fig. (A) This tree was derived from the phylogeny of the Xiphinema americanum group constructed by Lamberti and Ciancio (1993). (B) This tree takes the topology for the Xiphinema americanum group from our ML and MP trees.

5.4 Conclusion and perspective

This is the first extensive study using molecular data to infer the phylogeny of Longidoridae. The results support the monophyly of the genera Xiphinema and Longidorus. However, Xiphidorus and Paralongidorus are not supported as independent genera: Paralongidorus clustered within the Longidorus clade and Xiphidorus within the Xiphinema clade. Although the phylogeny of Xiphinema constructed from molecular data is somewhat different from the phylogeny based on morphological characters, the two share most of their clades. The molecular phylogeny can therefore serve as a convincing supplement for systematists trying to uncover the real phylogeny of this group of nematodes. The position of the X. americanum group is still questionable. Deciding whether it is a sub-taxon of the genus Xiphinema requires more data, including morphology, biology, genetics, biochemistry and molecular biology. The tree construction results showed that refining the alignment with secondary structures is very effective for improving phylogenetic inference. By analysing the rRNA genes, I found that the D20-E9_1 region and the H1_1-H1_4 region of the large rRNA gene contain many polymorphic sites. It is possible to infer the phylogeny from data derived from those regions. The non-coding regions (IGS) of longidorids are highly diversified. This may be caused by very fast point mutation, insertion or deletion. The large differences between the sequences, and in their lengths, make them very difficult to align. The fast mutation rate leads to saturation of sites, and correct phylogenetic signal is difficult to recover from sequences with high homoplasy. Therefore, I conclude that the non-coding (IGS) regions of longidorids are not a good choice for constructing the phylogeny.


6 Diversity of Internal Transcribed Spacer in Xiphinema species and populations: diagnostic, population genetic and phylogenetic potential

6.1 Introduction

The genus Xiphinema (Dorylaimida: Longidoridae) (Cobb, 1913) includes 296 nominal taxa (234 valid species, 49 junior synonyms, and 13 species inquirendae). It is the largest genus in the order Dorylaimida and the family Longidoridae (Coomans et al., 2001). Nematodes belonging to the genus Xiphinema are distributed worldwide and live ectoparasitically on a large number of plants. Because they feed on the root tips of plants, they usually cause galled roots. The ability of some species of the genus to transmit nepoviruses makes them even more damaging to economically important crops such as cherry, peach, tobacco, tomato, strawberry, and grape. More knowledge of their population dynamics, pathogenicity, and phylogenetic relationships can only be obtained if fast and accurate identification methods are available. The same methods should assist in the design of efficient strategies for control and quarantine purposes.

Within the genus Xiphinema, the Xiphinema americanum group occupies a particular position. Several members of the X. americanum group are of phytosanitary significance as they are efficient vectors of four important plant viruses of quarantine significance: cherry rasp leaf nepovirus, peach rosette mosaic nepovirus, tobacco ringspot nepovirus potato calico strain, and tomato ringspot nepovirus (all listed in the annex I/A1). The group includes 49 putative species and 2 species inquirendae (Lamberti et al.,

2000). Because their morphological characters overlap, the identification of members of this group has always been an obstacle, even for specialists. In laboratories without a well-trained nematode taxonomist, misidentification of species belonging to this group is a common problem. Even though Lamberti et al. (2000) have provided an updated polytomous key for the identification of species of this group, the accurate and fast identification required for quarantine purposes is still a difficult task for agricultural diagnostic labs. PCR-RFLP analyses and sequencing analyses based on the internal transcribed spacer (ITS) region of the rDNA have been successfully used for: (i) species identification of entomopathogenic nematodes (Nasmith et al., 1996), root lesion nematodes (Orui, 1996; Waeyenberge, 1999), root-knot nematodes (Zijlstra et al., 1997) and cyst-forming nematodes (Subbotin et al., 1999); (ii) studies on population diversity (Hiatt et al., 1995); and (iii) phylogenetic analyses of the genus Heterorhabditis (Adams et al., 1998a) and the burrowing nematodes (Kaplan et al., 2000). Powers et al. (1997) made a detailed evaluation of the diagnostic potential of the ITS1 region for nematodes. In this chapter I report on the evaluation of the ITS region of Xiphinema species for: (i) its diagnostic potential, (ii) population genetic studies and (iii) its phylogenetic potential. To obtain this information, I used PCR-RFLP and sequencing analyses. The confirmed virus-vector species (Taylor and Brown, 1997) were included in the study: X. index, X. diversicaudatum, X. italiae, X. americanum, X. bricolensis, X. rivesi and X. californicum (the latter four species belong to the X. americanum lineage). One Longidorus species and one Paratrichodorus species were also included in the analyses, which allows an extrapolation to other longidorids and trichodorids.
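The principle behind PCR-RFLP diagnostics can be sketched as an in-silico digest: two amplicons that differ at a restriction site give different fragment patterns. The example below is a minimal illustration only. The enzyme (EcoRI, recognition site GAATTC, cutting after the first G) and the two toy amplicons are assumptions chosen for the example; they are not the enzymes or sequences used in this study.

```python
from typing import List

def digest(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Fragment lengths after cutting a linear sequence at every occurrence of site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts G^AATTC). Only the given strand is searched,
    which is sufficient for palindromic sites.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Hypothetical amplicons: population B has lost the site through one substitution.
amplicon_a = "ATGC" * 3 + "GAATTC" + "ATGC" * 5
amplicon_b = "ATGC" * 3 + "GAGTTC" + "ATGC" * 5

print(digest(amplicon_a, "GAATTC", 1))  # two fragments
print(digest(amplicon_b, "GAATTC", 1))  # uncut amplicon
```

On a gel, the first amplicon would show two bands and the second a single band, which is the kind of difference a diagnostic RFLP pattern relies on.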

6.2 Material and Methods

Taxon sampling

Nematode samples used in this study (Table 3.1) include eighteen species (85 populations) from the X. americanum group, eleven non-X. americanum group Xiphinema species, two Longidorus (Dorylaimida: Longidoridae) species and one Paratrichodorus (Triplonchida: Trichodoridae) species.

DNA extraction

DNA was extracted from a single juvenile or adult for PCR as described in chapter 3.

Amplification and cloning

The ITS and 5.8S genes were amplified using primers VN18 and VN28 or Cur18 and Cur28 (Table 6.1) with a PCR kit from Qiagen (Qiagen GmbH, Postfach, Germany). The PCR was performed in a PTC-100/200 thermocycler (MJ research, Biozyme, San Diego, USA). The cycling conditions were 94 °C for 3 min; 35 cycles of 94 °C for 30 s, 54 °C for 40 s, and 72 °C for 2 min; followed by a final extension at 72 °C for 10 min. PCR products were separated in a 1% agarose gel, stained with ethidium bromide and visualised under UV. The fragments were excised from the gel and purified with a Gel-purification kit (Qiagen). Finally, the fragments were cloned into the pGEM-T vector (Promega, Leiden, The Netherlands).

Table 6.1. Primers used in this study.
Primer code   Primer sequence (5'-3')    Reference
VN18          TTGATTACGTCCCTGCCCTTT      Vrain et al. (1992)
VN28          TTTCACTCGCCGTTACTAAGG      Vrain et al. (1992)
Cur18         GTTTCCGTAGGTGAACCTGC
Cur28         ATATGCTTAAGTTCAGCGGGT

Sequencing and sequence analyses

A primer walking sequencing strategy was used for the ITS amplification product. The sequencing reactions were carried out using a BigDye terminator cycle sequencing kit (Lennik, Belgium). The final sequences were obtained with an ABI Prism 377 genetic analyser (ABI). Sequences were assembled and edited with BioEdit (Hall, 1999). Primary sequence alignments were performed using ClustalX (Thompson et al., 1997).

Nucleotide diversity

Nucleotide diversity (π) was calculated for the ITS1 and ITS2 using the software DnaSP (Rozas and Rozas, 1999). π is given by the following formula:

\[ \pi = \frac{\Pi}{L}, \qquad \Pi = \frac{\sum_{i<j} \Pi_{ij}}{n(n-1)/2} \]

in which L = the length of the sequence; Π = the average number of nucleotide differences between two sequences randomly chosen from the sample; n = the number of sequences in the sample; Π_{ij} = the number of nucleotide differences between the ith and jth sequences; and n(n-1)/2 = the number of possible pairs.

Inter-species nucleotide diversity was calculated for Xiphinema species, and the inter-population diversity was inferred for X. pachtaicum from partial ITS sequences.
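The formula above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: the function name `nucleotide_diversity` is hypothetical, and the sketch assumes that alignment columns containing a gap or an ambiguous base in any sequence are dropped before counting, so that L is the number of remaining columns.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def nucleotide_diversity(alignment):
    """Return (Pi, pi) for a list of aligned sequences of equal length.

    Pi is the average number of pairwise nucleotide differences and
    pi = Pi / L. Assumption of this sketch: columns with a gap or an
    ambiguous base in any sequence are excluded, and L counts only the
    columns that remain.
    """
    seqs = [s.upper() for s in alignment]
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where every sequence carries an unambiguous base.
    columns = [col for col in zip(*seqs) if all(b in VALID_BASES for b in col)]
    length = len(columns)
    if length == 0:
        raise ValueError("no comparable columns in the alignment")

    # Sum of Pi_ij over all n(n-1)/2 pairs of sequences.
    total = 0
    for i, j in combinations(range(n), 2):
        total += sum(1 for col in columns if col[i] != col[j])

    n_pairs = n * (n - 1) // 2
    mean_differences = total / n_pairs      # Pi
    return mean_differences, mean_differences / length  # (Pi, pi)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACTA"]  # made-up sequences
    print(nucleotide_diversity(toy_alignment))
```

The toy alignment is invented for illustration only; real ITS alignments would be read from the ClustalX output.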

RFLP analyses

The DNA ladders used for RFLP analyses were the 1 kb and 100 bp ladders (Promega).

RFLP analyses for inter-species polymorphism

PCR products from 17 species, including 6 species of the X. americanum group, were digested with the following restriction enzymes: MboI, DraI, HaeIII, BsaAI, AluI, AflIII, RsaI, MspI, SfuI and KpnI. The primers used for the PCR amplification were Cur18 and Cur28 (Table 6.1). Digested products were separated in a 2-3% agarose gel stained with ethidium bromide and visualised under UV. RFLP profiles were compared by eye.

RFLP analyses for polymorphism among X. americanum lineage

The restriction enzymes DdeI, HaeIII, AluI, RsaI, MspI, MvaI and CfoI were used to digest the PCR products from 17 species (70 populations) of the X. americanum lineage, X. coxi (a non-X. americanum lineage species) and one undescribed Longidorus species. The latter two species were used as outgroup. The primers used for this amplification were VN18 and VN28 (Table 6.1). The digested products were separated in a 2-3% agarose gel stained with ethidium bromide and visualised under UV. RFLP profiles were compared by eye and tabulated.

Genetic distance using restriction sites for X. americanum lineage

The genetic distance was calculated with PAUP 4.0 (Swofford, 2002) using UPGMA (unweighted pair-group method using arithmetic averages) or the neighbour-joining method based on the ME (minimum evolution) model with the Nei and Li (1979) distance. Bootstrap values were calculated for the NJ (neighbour joining) tree.

Phylogeny

Sequence alignment was made using ClustalX (Thompson et al., 1997) using the

default parameters (gap open penalty and extension penalty 6.66). Sequences of the 18S, 5.8S and 28S genes were removed and the sequence matrix was analysed by the neighbour-joining method (ME model and uncorrected p-distance), MP (maximum parsimony) and ML (maximum likelihood) methods implemented in PAUP 4.0 (Swofford, 2002). Bootstrap and decay analyses (Bremer, 1994) were performed using Autodecay (Eriksson, 2001) for the MP tree to evaluate the branch support. Bootstrap support was also calculated under the distance and ML criteria. The software Modeltest (Posada & Crandall, 1998) was used to infer the best model for the ML analysis. A series of nested models was evaluated by LRT (log-likelihood ratio test) (Huelsenbeck & Rannala, 1997). The best model was selected for the final analysis. To avoid constructing the tree on background noise, the permutation (PTP) test implemented in PAUP 4.0 was used to check the hierarchical signal in the data matrix. I also calculated the g1 statistic (Hillis & Huelsenbeck, 1992) using randomised tree sampling to evaluate the phylogenetic signal contained in the data matrix. The homogeneity test was performed to check the base composition.

6.3 Results and Discussion

ITS size variation

Using the VN18 and VN28 primers, the sizes of the amplified ITS fragment ranged from 1375 bp (X. brasiliense) to 2010 bp (X. index) for the Xiphinema species studied. Within the X. americanum lineage, the product sizes for X. pachtaicum, X. pachydermum, X. brevisicum, X. madeirense and X. simile were estimated at 1.8 kb, 1.7 kb, 2.0 kb, 1.3 kb and 1.8 kb, respectively. The product size of the remaining species was about 1.45 kb (Fig. 6.1). The actual size of ITS1 and ITS2, excluding 193 bp from the 18S gene, 104 bp from the 28S gene, and 156 bp from the 5.8S gene, varied from 992 bp (X. brasiliense) to 1557 bp (X. index). The PCR products amplified from the two outgroup species, L. macrosoma and P. macrostylus, were 1968 bp and 1253 bp, respectively. The

amplified products obtained with Cur18 and Cur28 were 215 bp smaller than those obtained with VN18 and VN28 because this primer pair is located downstream of VN18 and VN28.

The large size variation in ITS fragments is not unique to Xiphinema. It is a common feature within the phylum Nematoda (Powers et al., 1997) and even among eukaryotes (Joseph et al., 1999). This might be the outcome of a rapid accumulation of insertions or deletions in the absence of strong functional constraints on this region. The function of the ITS region is still speculative, although yeast mutation/deletion experiments have shown that the integrity of several structural features in the ITS region is crucial for accurate spacer removal and biogenesis of the functional ribosome (Van Nues et al., 1995). There are therefore some structural constraints on the ITS: the sequences are highly divergent, yet they maintain similar structural domains derived from the central core (Lalev and Nazar, 1999). Indeed, a common secondary structure model has been constructed for the ITS2 region of vertebrates and yeast (Joseph et al., 1999).

Some simple sequence repeats (SSRs or microsatellite motifs) were also observed in the ITS sequences (Table 6.2). The expansion or retraction of the simple repeat regions contributes to the length polymorphism. The repeat regions are usually not long and mostly consist of several repeat motifs that form a compound microsatellite. It has been reported that microsatellites in the ITS region cause intra-genomic variation in crayfish and obscure the phylogenetic relationships at population level (Harris & Crandall, 2000).

RFLP analyses and the diagnostic potential

Diagnostic power of inter-species polymorphisms revealed by RFLP analyses

Each of the restriction enzymes used separated the Xiphinema species from each other, except for five species of the X. americanum lineage (X. rivesi, X. americanum, X. thornei, X. brevicollum and X. taylori). X. brevicollum and X. taylori were separated from the other three X. americanum lineage species by MspI, RsaI and HaeIII. Additionally,

DraI, SfuI, BsaAI, and AflIII separated X. brevicollum and X. taylori from the other three species: they produced slightly longer bands although they shared the same restriction sites. Xiphinema brevicollum and X. taylori could, however, not be separated from each other. X. thornei was separated from the rest of the X. americanum lineage species by HaeIII. Xiphinema americanum was separated from the other X. americanum lineage species by DraI, SfuI and BsaAI, which produced longer bands while the restriction sites were the same (Fig. 6.2).
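The observation that two species can share restriction sites and still differ in band length can be illustrated with a small in silico digest. The sketch below is only an illustration and was not part of the analyses: the function name `digest` and the two toy amplicons are hypothetical, only a few of the enzymes with simple palindromic recognition sequences are included, and the amplicon is treated as linear, fully digested DNA.

```python
# Recognition sequence and cut offset on the top strand (0 = cut before the site).
# Only enzymes with non-degenerate palindromic sites are listed in this sketch.
RECOGNITION_SITES = {
    "HaeIII": ("GGCC", 2),    # GG^CC
    "AluI": ("AGCT", 2),      # AG^CT
    "RsaI": ("GTAC", 2),      # GT^AC
    "MspI": ("CCGG", 1),      # C^CGG
    "DraI": ("TTTAAA", 3),    # TTT^AAA
    "MboI": ("GATC", 0),      # ^GATC
    "KpnI": ("GGTACC", 5),    # GGTAC^C
}


def digest(sequence, enzyme):
    """Return the fragment lengths, in order, after a complete digest."""
    site, offset = RECOGNITION_SITES[enzyme]
    seq = sequence.upper()

    # Collect every cut position, allowing overlapping site occurrences.
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)

    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (a cut exactly at either end of the amplicon).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    # Two made-up amplicons with the same two HaeIII sites; the second one
    # carries a 10 bp insertion between the sites.
    short_amplicon = "A" * 10 + "GGCC" + "T" * 20 + "GGCC" + "A" * 10
    long_amplicon = "A" * 10 + "GGCC" + "T" * 30 + "GGCC" + "A" * 10
    print(digest(short_amplicon, "HaeIII"))
    print(digest(long_amplicon, "HaeIII"))
```

Both toy amplicons are cut at the same sites, but the middle fragment of the second is 10 bp longer, which is the kind of pattern described above for bands that are longer despite identical restriction sites.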

Table 6.2. Microsatellites found in ITS region.
Code   Species   Microsatellite loci in ITS1 region
EU7    X. diversicaudatum   (AT) 5
EU25   X. index   (AT) 4 (AT) 6 (TA) 2 (TG) 3 (TA) 3 (TA)(TG)(TA) 4
LM1    L. macrosoma   (AT) 6 A(CG)(AT) A(CG) 2 C(AT) 3 AT(CT) 4 (AT) 3 (AAT) 2 AT(CT) 2 (AAT) 2 (AT) 2
PE1    X. rivesi   (TA) 2 (CG) 5 (TA) 4 (TC) 3 GTCGAATAAA (GA) 3 GC(GA)
XA1    X. americanum   (AT)3(AGAA)(AT) 2 (AGAA) (TA) 5 (CG) 5 (TG) 2
CO3    X. thornei   (TA) 2 (CG) 5 (TA) 4 (TC) 3 GTCGAATAAA (GA) 3 GC(GA) (TG) 4 (AAGA) 3
EU41   X. brasiliense   (TG) 2 CG(TG) 3
EU27   X. setariae   (TA) 2 (CG) 2 (TA) 2 (TC)G(TC) 4
EU36   X. krugi   (CTG) 5 (CAC) 5
EU2    X. pachtaicum   ((TC) 3 (TA) 2 AA * (AG) 3 N 57 ) 2 A 8 TGCGA(AT) 5 (AT) 5 CTG(TA) 2 CATCC(AAGG) 2 CAACA(TA) 2 G AAAG(TC) 4 (AG) 4 C(TAA) 2 (AG) 2 ACTA(TC) 3 (A) 6 (TA) 4 AAGC TAGCGATAGC (TA) 3 (TC) 2 (A) 7 (TA) 2 (AT) 2 (TA) 2 (AT) 3 A 4 (TA) 2 A(AT) 3 A AA
EU37   X. brevicollum   (GAC) 4 TGG(GAC) TAC(GAC) (AC) 3 (TC)(GC) 2 (AC)GA(TC) 2 (TTC) 3 (TG) 3

Table 6.2 (continued).
Code   Species   Microsatellites in ITS region
EU7    X. diversicaudatum   (TA) 4 (AT) 3 (CG) 3 AAT(CG)
EU25   X. index   (TAC) 3 (AT) 3 (TA) 3 (TA) 4 TCGAAAAG TT(TA) 6 (CG) 3 AA(CG) 2 (CGAA) 3 (TA) 2 (TA) 3 (CG) 2 TG (CG)AAAGAG C(TA) 3 AA(TA)
LM1    L. macrosoma   (AT) 3 TC(TA) 2 CG(TA) 3 (TG)(TA)(TG)(TA) 3 T(TA) (GT) 2 C(GT) 2
PE1    X. rivesi   (CT) 4 (CGAA) 4 (AG) 3 ACTCGAT (AG) 2
XA1    X. americanum   (TC) 3 (GA) 3 (AG) 4 ACTCGA (AG) 2 (TCGA) 3 G(CG AA) 3 (CGT) 2
CO3    X. thornei   (CT) 4 (CGAA) 4 (AG) 3 ACTCGAT (AG) 2
EU41   X. brasiliense   (TA) 3
EU27   X. setariae   (AT) 3 (CCG)(CCT)(CCG) 2 (AT) 5 (AG) 2
EU36   X. krugi
EU2    X. pachtaicum   (AG) 4 AA(AG) (AG) 2 TA(AG)
EU37   X. brevicollum   (AAAGGG) 2

Fig. 6.1. Amplified ITS region of Xiphinema species: 1. X. basri (EU125), 2. X. pyrenaicum (EU121), 3. X. index (EU25), 4. X. italiae (EU123), 5. X. insigne (EU131), 6. X. vulgare (EU27), 7. X. dentatum (EU111), 8. X. diversicaudatum (EU7), 9. X. brasiliense (EU41), 10. X. krugi (EU36), 11. X. pachtaicum (EU2), 12. X. pachydermum (EU109), 13. X. taylori (EU117), 14. X. brevicollum (XB1), 15. X. americanum (XA1), 16. X. rivesi (PE2), 17. X. rivesi (PE1) and 18. X. thornei (Co3). The codes used in the text are given in parentheses. The populations and corresponding codes are listed in Table 3.1.

Any of the enzymes tested can be applied to assist Xiphinema species identification. As reported for other nematode groups (see introduction), PCR-RFLP on ITS is a powerful diagnostic tool for Xiphinema species, with the exception of some putative species of the X. americanum lineage. The RFLP analyses on X. americanum lineage species are discussed in the following section.

Fig. 6.2A. RFLP profiles obtained by DraI digestion for Xiphinema species: 1.EU36, 2.EU111, 3.EU123, 4.EU131, 5.EU125, 6.EU27, 7.EU41, 8.EU121, 9.EU25, 10.EU7, 11.EU117, 12.EU109, 13.EU2, 14.XB1, 15.XA1, 16.PE2, 17.PE1, 18.Co3. The populations and corresponding codes can be found in Table 3.1.

Fig. 6.2B. RFLP profiles obtained from SfuI digestion for Xiphinema species: 1.EU117, 2.EU2, 3.XB1, 4.XA1, 5.PE2, 6.PE1, 7.Co3, 8.EU109, 9.EU131, 10.EU111, 11.EU125, 12.EU27, 13.EU25, 14.EU36, 15.EU123, 16.EU7, 17.EU41, 18.EU121.

Fig. 6.2C. RFLP profiles obtained by HaeIII digestion for Xiphinema species: 1.EU125, 2.EU121, 3.EU25, 4.EU131, 5.EU27, 6.EU111, 7.EU7, 8.EU41, 9.EU36, 10.EU2, 11.EU109, 12.EU117, 13.XB1, 14.XA1, 15.PE2, 16.PE1, 17.Co3, 18.EU

Fig. 6.2D. RFLP profiles obtained from BsaAI digestion for Xiphinema species: 1.EU41, 2.EU36, 3.EU27, 4.EU131, 5.EU125, 6.EU123, 7.EU25, 8.EU121, 9.EU111, 10.EU7, 11.EU2, 12.EU109, 13.EU117, 14.XB1, 15.XA1, 16.PE2, 17.PE1, 18.Co3.

Fig. 6.2E. RFLP profiles obtained by AluI digestion for Xiphinema species: 1.EU27, 2.EU41, 3.EU25, 4.XB1, 5.EU2, 6.EU109, 7.EU131, 8.EU125, 9.EU7, 10.EU123, 11.EU121, 12.EU117, 13.EU36, 14.PE2, 15.PE1, 16.EU111, 17.CO3, and 18.XA1.

Fig. 6.2F. RFLP profiles obtained by KpnI digestion for Xiphinema species: 1.EU125, 2.EU121, 3.EU25, 4.EU123, 5.EU131, 6.EU27, 7.EU111, 8.EU7, 9.EU41, 10.EU36, 11.EU2, 12.EU109, 13.EU117, 14.XB1, 15.XA1, 16.PE2, 17.PE1, 18.Co3.

Fig. 6.2G. RFLP profiles obtained by MboI digestion for Xiphinema species: 1.EU125, 2.EU121, 3.EU25, 4.EU123, 5.EU131, 6.EU27, 7.EU111, 8.EU7, 9.EU41, 10.EU36, 11.EU2, 12.EU109, 13.EU117, 14.XB1, 15.XA1, 16.PE2, 17.PE1, 18.Co3.

Fig. 6.2H. RFLP profiles obtained by AflIII digestion for Xiphinema species: 1.EU111, 2.EU25, 3.EU36, 4.EU121, 5.EU7, 6.EU123, 7.EU131, 8.EU125, 9.EU41, 10.EU27, 11.EU117, 12.EU109, 13.EU2, 14.XB1, 15.XA1, 16.PE2, 17.PE1, 18.Co3.

Fig. 6.2I. RFLP profiles obtained by RsaI digestion for Xiphinema species: 1.EU25, 2.XB1, 3.EU27, 4.EU41, 5.EU131, 6.XA1, 7.EU7, 8.EU123, 9.EU125, 10.EU117, 11.EU121, 12.PE2, 13.EU36, 14.EU111, 15.PE1, 16.EU109, 17.CO3, 18.EU2.

Fig. 6.2J. RFLP profiles obtained by MspI digestion for Xiphinema species: 1.EU111, 2.PE1, 3.EU36, 4.CO3, 5.EU2, 6.EU25, 7.XB1, 8.EU131, 9.XA1, 10.EU7, 11.EU125, 12.EU123, 13.EU121, 14.EU117, 15.EU109, 16.EU27, 17.EU41, 18. PE

RFLP analyses for polymorphism within X. americanum lineage species

As mentioned above, the sizes of the ITS fragments of X. pachtaicum, X. pachydermum, X. brevisicum, X. madeirense, and X. simile are estimated at 1.8 kb, 1.7 kb, 2.0 kb, 1.3 kb and 1.8 kb, respectively. These sizes are very different from that of the remaining species in my study (1.45 kb). RFLP analyses also confirmed differences in the restriction sites. Any of the selected enzymes (DdeI, HaeIII, AluI, RsaI, MspI, MvaI and CfoI) separated these five X. americanum lineage species from the rest of the species and also from each other (Fig. 6.3). The results for the other species are presented in Tables 6.4 and 6.5. All profiles for each restriction enzyme are shown in Fig. 6.3.

CfoI: The second restriction site of profile 2 shifts position, so that X. taylori can be separated from X. diffusum; a compound profile 1+3 was produced for populations AB9, CA15, UT4, UT8 and CA24; CA3, CA33 and CA50 had profile 3 (Table 6.3 and 6.4). This restriction enzyme therefore does more than separate X. brevicollum, X. taylori, and X. diffusum from the rest of the species.

AluI: All populations shared the same restriction sites. However, X. brevicollum, X. taylori and X. diffusum produced a longer band (ca. 400 bp) whilst the other species produced a shorter band (ca. 360 bp).

DdeI: Five restriction profiles were observed with this enzyme. Populations of the same species did not share profiles. However, populations from different species sometimes shared profiles. Xiphinema taylori was separated from X. diffusum and X. brevicollum by this enzyme because of its compound restriction profile.

HaeIII: Restriction profiles obtained with this enzyme varied among populations of the same species. Populations of different species sometimes shared profiles. This enzyme does not have enough diagnostic power.

MspI: This enzyme separated X. brevicollum, X. taylori and X. diffusum from the remaining species. The restriction profiles varied between populations of the same species. Populations of different species sometimes shared profiles.

MvaI: This enzyme separated X. brevicollum from the remaining species. However, these remaining species or populations were not separated from each other. Most of these species shared profile 2. Some of the species yielded a compound profile (1 + 3, or ).

RsaI: This enzyme separated X. brevicollum, X. taylori and X. diffusum from the remaining species. All remaining populations shared profile 2.

According to these results, X. brevicollum, X. taylori and X. diffusum can be separated from the other species. However, these other species cannot be separated from each other.

Fig. 6.3A. RFLP profiles produced by AluI for X. americanum group species: 1.M35, 2.CA54, 3.CA55, 4.M5, 5.PE9, 6.PE26, 7.M8, 8.CA33, 9.EU109, 10.PE42, bp ladder, 12.CA31, 13.EU109, 14.PE6, 15.CA45, 16.CA16, 17.CA54, 18.UT14, 19.PE40, 20.CA6.

Fig. 6.3B. RFLP profiles produced by DdeI for X. americanum group species: 1.none, 2.CA10, 3.PE33, 4.CA50, 5.PE24, 6.UT8, 7.CA13, 8.CA14, 9.CA40, 10.GG15, bp ladder, 12.MS, 13.UT24, 14.CA12, 15.GG14, 16.UT6, 17.CO5, 18.CA3, 19.CA11, 20.CA8. CA50 shows profile 3; CA14 shows profile 4; CO5 shows profile 1; CA40 shows profile 5. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3C. RFLP profiles produced by DdeI for X. americanum group species: 1.EU5, 2.EU109, 3.EU109, 4.NV5, 5.PE2, 6.NV2, 7.NV3, 8.PE1, 9.CA18, 10.EU14, 11.MD2, bp ladder, 13.none, 14.CO3. PE2 shows profile 3; CA14 shows profile 4; CA18 shows profile 2. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3D. RFLP profiles produced by CfoI for X. americanum group species: 1.PE33, 2.CA40, 3.AB9, 4.CA62, 5.MS, 6.TN1, 7.CA7, 8.CA15, 9.UT4, 10.UT25, bp ladder, 12.GG6, 13.AB7, 14.CA50, 15.CAN162, 16.CAN224, 17.UT8, 18.CA10, 19.CA8, 20.CAN39. TN1 and CAN162 show profile 2; CA10 shows profile 1; CA50 shows profile 3. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3E. RFLP profiles produced by CfoI for X. americanum group species: bp ladder, 2.none, 3.CO3, 4.NV2, 5.CA22, 6.NV3, 7.NV5, 8.CA56, 9.PE2, 10.PE1. PE2 shows profile 4. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3F. RFLP profiles produced by MspI for X. americanum group species: 1.UT6, 2.CAN162, 3.UT25, 4.CAN224, 5.PE20, 6.CA15, 7.GG7, 8.UT4, 9.OR4, 10.GG6, bp ladder, 12.CA62, 13.AB9, 14.CA7, 15.TN1, 16.AB7, 17.UT8, 18.CA3, 19.CA11, 20.MS. CAN162 and GG7 show profile 1; CA7 shows profile 3; MS shows profile 4; PE20 shows profile 6; UT4 shows profile 2. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3G. RFLP profiles produced by MspI for X. americanum group species: 1.CA56, 2.CA22, 3.PE33, 4.PE2, 5.NV2, 6.NV5, 7.NV3, 8.CO3, 9.PE1, bp ladder. CAN162 and GG7 show profile 1; CA7 shows profile 3; PE2 shows profile 5. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3H. RFLP profiles produced by HaeIII for X. americanum group species: 1.PE20, 2.OR4, 3.CAN162, 4.CAN224, 5.GG6, 6.TN1, 7.CA7, 8.AB7, 9.CA62, bp ladder, 11.GG7, 12.UT4, 13.AB9, 14.CA43, 15.CA4, 16.CA8, 17.UT24, 18.CA13, 19.CA15, 20.UT25. PE20 shows profile 5; OR4 shows profile 2; CAN162 shows profile 3; CAN224 shows profile 1. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3I. RFLP profiles produced by HaeIII for X. americanum group species: 1.NV5, 2.PE2, 3.NV3, 4.NV2, 5.PE1, 6.CO3, 7.CA22, bp ladder, 9.PE23. PE2 shows profile 4. See Tables 6.4 and 6.5 for details of the profiles and the corresponding populations and species.

Fig. 6.3J. RFLP profiles produced by MvaI for X. americanum group species: 1.CA3, 2.UT25, 3.CA15, 4.CA62, 5.TN1, 6.AB7, 7.AB9, 8.CAN224, 9.CAN162, 10.CAN39, bp ladder, 12.OR4, 13.GG7, 14.PE20, 15.GG6, 16.UT6, 17.CA50, 18.CA40, 19.PE33, 20.GG14. GG7 shows profile 1; PE20 shows profile 3; CA40 shows profile 2. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

Fig. 6.3K. RFLP profiles produced by RsaI for X. americanum group species: 1.UT4, 2.UT6, 3.CA40, 4.CA50, 5.PE33, 6.PE23, 7.CA15, 8.CA3, 9.AB7, bp ladder, bp ladder, 12.GG14, 13.OR4, 14.CAN39, 15.GG6, 16.CAN162, 17.AB9, 18.CA62, 19.CAN224, 20.TN1. CAN162 shows profile 1; GG14 shows profile 2. See Tables 6.3 and 6.4 for details of the profiles and the corresponding populations and species.

More discussion on the X. americanum lineage species

Genetic distance using restriction sites

The genetic distance was calculated for all species and populations using the restriction sites. Similar results were obtained by the UPGMA, NJ and parsimony methods (Fig. 6.4). Populations of the same species were not grouped together but were distributed over several groups composed of different species. This is not accidental: the morphological characters of these species also overlap, which is considered to be the main obstacle for species

identification in this group. Even the latest polytomous key (Lamberti et al., 2000) does not solve this problem. The correspondence between the morphological characters and the restriction analyses strengthens the suggestion that some of the putative species of this group are just morphotypes of the same species. The cluster analysis using morphological characters of Xiphinema species (Lamberti and Ciancio, 1993) also corresponded with the results of the restriction analyses. The restriction analysis by Vrain et al. (1992) resulted in a distribution of the populations in the tree similar to the one I obtained. Additionally, X. brevicollum, X. taylori and X. diffusum formed one group, supporting the one-taxon conclusion of Luc et al. (1998). A recent cluster analysis of the X. americanum group by Lamberti et al. (2002) also supported the one-taxon proposal of Luc et al. (1998). More sampling and different types of data are needed before it can be proposed that the populations and species in the other cluster compose one taxon. In their comment on the X. americanum group, Luc and Baujard (2001) suggested the possibility of synonymising some species. In the next section, this discussion is continued on the basis of the sequence data.
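To make the comparison of restriction profiles concrete, the similarity index of Nei and Li (1979), F = 2 n_xy / (n_x + n_y), can be sketched in a few lines of Python. This is an illustration only and not the PAUP computation used for the trees: the function names and the site labels are hypothetical, and the simple dissimilarity 1 - F shown here is an assumption of the sketch, not the transformed distance that PAUP reports.

```python
def nei_li_similarity(sites_x, sites_y):
    """Nei and Li (1979) index F = 2 * n_xy / (n_x + n_y).

    sites_x and sites_y are collections of the restriction sites (or bands)
    scored as present in two populations; n_xy is the number they share.
    """
    x, y = set(sites_x), set(sites_y)
    if not x and not y:
        raise ValueError("at least one profile must contain a site")
    return 2 * len(x & y) / (len(x) + len(y))


def simple_dissimilarity(sites_x, sites_y):
    """1 - F, a plain dissimilarity used here for illustration only."""
    return 1.0 - nei_li_similarity(sites_x, sites_y)


if __name__ == "__main__":
    # Hypothetical site labels: enzyme name plus an arbitrary site number.
    population_a = {"HaeIII_1", "HaeIII_2", "MspI_1", "RsaI_1"}
    population_b = {"HaeIII_1", "MspI_1", "RsaI_1", "RsaI_2"}
    print(nei_li_similarity(population_a, population_b))
    print(simple_dissimilarity(population_a, population_b))
```

Two populations with identical profiles give F = 1 and a dissimilarity of zero, which corresponds to the zero-length terminal branches seen when populations of different putative species share all scored sites.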

Fig. 6.4A. The UPGMA tree based on the restriction site analysis: codes refer to Table 3.1. The confirmed species identity is given after each code. The branch length scale is shown below the tree. A striking feature of this tree is that most of the branches leading to the external nodes have zero length.

Fig. 6.4B. The NJ tree based on the restriction site analysis using the Nei and Li distance (Nei & Li, 1979) with the ME (minimum evolution) model: codes refer to Table 3.1. The confirmed species identity is given after each code. The branch length scale is shown below the tree. A striking feature of this tree is that most of the branches leading to the external nodes have almost zero length.

Nucleotide polymorphism

Complete ITS sequences were obtained for ten Xiphinema species (X. index, X. setariae, X. diversicaudatum, X. krugi, X. brasiliense, X. pachtaicum, X. brevicollum (3 populations), X. americanum (2 populations), X. thornei and X. rivesi (2 populations)), one Longidorus species (L. macrosoma) and one trichodorid species (P. macrostylus). Partial ITS1 and ITS2 sequences were obtained for X. pachtaicum (6 populations), X. simile, X. brevisicum and X. madeirense (X. americanum group species).

Nucleotide diversity π was calculated from the sequence alignment. The value of π equalled for ITS1 and for ITS2 of Xiphinema species (X. brevicollum, X. rivesi, and X. thornei were not included because their high similarity would lead to an underestimation of the true nucleotide diversity). Only one sequence of X. americanum was included in the calculation. The value of π was for ITS1 and for ITS2 between X. americanum, X. brevicollum, X. thornei and X. rivesi. I obtained a π value of for ITS1 and for ITS2 of X. pachtaicum populations based on the partial sequences. The nucleotide diversity between X. americanum, X. brevicollum, X. thornei and X. rivesi was much lower than the common level between Xiphinema species ( vs for ITS1 and vs for ITS2), whereas it was very similar to that among X. pachtaicum populations ( vs for ITS1 and vs for ITS2). When X. brevicollum was removed from the analysis, the nucleotide diversity for X. americanum, X. thornei and X. rivesi became for ITS1 and for ITS2, which closely matched the diversity among X. pachtaicum populations (X. pachtaicum is a member of the X. americanum group). However, the ITS1 diversity between X. pachtaicum and X. brevicollum, X. rivesi, X. thornei, X. americanum had a π value of , , , respectively; the ITS2 diversity in the pairwise comparison of X. pachtaicum and the remaining species (X. brevicollum, X. rivesi, X. thornei, X. americanum) had a π value equal to , , , , respectively. This is very close to the values calculated for Xiphinema species. The high nucleotide diversity of ITS1 and ITS2 among Xiphinema species therefore also applies to species of the X. americanum group. Indeed, I obtained a π value of for ITS1 and of for ITS2 based on the partial ITS sequences of five X. americanum group species (X. pachtaicum, X. brevicollum, X. brevisicum, X. madeirense, and X. simile).

The low diversity between X. americanum, X. brevicollum, X. thornei and X. rivesi, which is very close to the diversity level among populations of X. pachtaicum, can be explained by the assumption that these species are morphotypes of the same species. Lamberti et al. (2002) also concluded that most of the species classified in the X. americanum group showed intraspecific variation in morphometric parameters.

Phylogeny potential of ITS region

According to the homogeneity test, the base composition of the sequences was heterogeneous (P<0.01). The average base composition among sequences was estimated at A: , C: , G: , T: 

Each method used to infer phylogeny is sensitive to different properties of the data that can mislead the results. The use of multiple methods therefore increases the reliability of the inferred phylogeny if they produce the same result. I constructed trees under the MP (maximum parsimony) and ML (maximum likelihood) criteria and with the distance method with ME (minimum evolution), as implemented in PAUP 4.0 (Swofford, 2002). The PTP test resulted in P=0.01 and the g1 statistic was estimated at g1= These results indicate that tree construction is based on real hierarchical signal in the data matrix.

The distance method with the ME model, the MP method and the ML method produced the same tree topology (Fig. 6.5). For the MP method, a single most parsimonious tree was obtained with tree length (TL = 3461), rescaled consistency index (RC = ). Most of the nodes had a bootstrap value of more than 80% and a decay index of more than 4. The monophyly of the X. americanum lineage was not strongly supported (53% bootstrap value and a decay index of 1). The distance method with the ME model resulted in higher bootstrap support for the monophyly of the X. americanum lineage (80% bootstrap). For the ML method, the GTR+Γ+I model (general time reversible plus gamma rates plus proportion of invariable sites) was selected by Modeltest and the parameters were calculated accordingly. A single maximum likelihood tree was obtained with likelihood score equalling Bootstrap support for the monophyly of the X. americanum lineage was 60%, which is close to that of the MP method.

Compared to the phylogeny of Xiphinema inferred from morphological characters (Coomans et al., 2001), the positions of the X. americanum lineage and X. krugi are different (Fig. 6.6), whereas the position of X. americanum fits the phylogeny based on the D2 and D3 expansion region of the 28S gene (chapter 5). Compared to the cluster analysis results of

Lamberti and Ciancio (1993), the inner groups of the X. americanum lineage correspond to each other (Fig. 6.7). Lamberti et al. (2002) obtained four subgroups (X. americanum, X. brevicollum, X. pachtaicum and X. brevisicum) in their cluster analysis. This is similar to my analysis based on ITS sequences, except for the X. brevisicum subgroup, which was positioned within the X. pachtaicum subgroup in the tree based on partial ITS sequences (Fig. 6.8) and in the tree based on the sequences of the D2 and D3 expansion region of the 28S gene (chapter 5).

I conclude that the ITS region of Xiphinema species has great potential for phylogeny inference. Although the nucleotide diversity between species is high, interpretable phylogenetic information is still maintained for closely related taxa, which keeps open the possibility of recovering the lineage history from extant organisms. Structural constraints acting on the ITS region probably play a major role in the maintenance of the phylogenetic signal.
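The uncorrected p-distance used for the distance analysis in this chapter is simply the proportion of compared sites at which two aligned sequences differ. A minimal sketch follows; the function names are hypothetical, and the treatment of gaps and ambiguous bases (they are skipped pairwise) is an assumption of the sketch that may differ from the settings used in PAUP.

```python
VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment):
    """Symmetric matrix of pairwise p-distances for a dict name -> sequence."""
    names = list(alignment)
    return {
        x: {y: 0.0 if x == y else p_distance(alignment[x], alignment[y]) for y in names}
        for x in names
    }


if __name__ == "__main__":
    toy = {"taxon1": "ACGTAC", "taxon2": "ACGAAC", "taxon3": "AC-TAA"}  # made-up data
    for name, row in distance_matrix(toy).items():
        print(name, row)
```

A matrix of this kind is the input a neighbour-joining or minimum evolution search would work from; the tree search itself is not sketched here.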

Fig. 6.5. The phylogenetic trees inferred from the ITS sequences: A. The single ML tree: maximum likelihood score is ; bootstrap values are indicated near the corresponding nodes; the branch length scale is indicated at the left corner under the tree; B. The single MP tree: tree length (TL = 3461); bootstrap values (the integers beside the nodes) and decay indices (the integers prefixed with d beside the corresponding nodes) are indicated; the branch length scale is indicated at the left corner under the tree; C. The NJ tree based on ME model and uncorrected

Fig. 6.6. Comparison between the tree (Tree A) based on morphological characters and the tree (Tree B) based on the ITS sequence data: Tree A was modified from the tree published by Coomans et al. (2001). Tree B is the maximum parsimony tree calculated from the ITS sequence matrix, which is shown in Fig.

6.5.

Fig. 6.7. Comparison between the tree (Tree A) based on morphological characters and the tree (Tree B) based on the ITS sequence data for the groups of X. americanum group species: Tree A was modified from the tree published by Lamberti & Ciancio (1993). Tree B is the maximum parsimony tree calculated from the ITS sequence matrix, which is shown in Fig

A potential tool for population study

We obtained partial ITS sequences for seven populations of X. pachtaicum and complete sequences for four populations of X. brevicollum, two populations of X. americanum, two populations of X. simile and two populations of X. rivesi to check for inter-population polymorphism. To assay intra-population polymorphism, I sequenced several individuals from the same population. To test for intra-genomic variation, I used restriction analysis to select colonies containing inserts derived from a single individual for sequencing.

I observed inter-population polymorphism for X. pachtaicum, X. brevicollum, X. americanum, X. rivesi and X. simile by sequence alignment and tree inference (Fig. 6.8). Intra-population polymorphism was observed for X. pachtaicum. Intra-genomic variation was found in L. macrosoma (Fig. 6.8). I also observed intra-genomic polymorphism in a few populations analysed with restriction enzymes (Table 6.4): some populations have a compound RFLP profile. Because a single juvenile was used to produce each PCR product, the compound profile can be ascribed to intra-genomic variation.

The observed polymorphism is the outcome of point mutations, insertions and deletions, including the expansion or contraction of simple repeats. These polymorphisms can be used to study the relationships between populations, the substructure of populations or gene flow. The simple repeat loci located inside the ITS region may be a fruitful source of markers for population studies. We observed diversity of the microsatellites between populations of X. brevicollum and X. pachtaicum (Fig. 6.8).
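The expansion or contraction of simple repeats described above can be illustrated with a small script. What follows is only a minimal sketch, not the procedure used in this work: the function name `find_ssrs`, its parameters and the threshold of three repeat units are assumptions made for illustration. The two test strings are one row block of the BB1 and EU37 sequences as they appear in the partial alignment of X. brevicollum (Fig. 6.8B); gap characters and spaces are stripped before searching.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif."""
    return (unit + unit).find(unit, 1) == len(unit)


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Positions refer to the ungapped sequence. Thresholds are illustrative
    assumptions, not values taken from the thesis.
    """
    seq = re.sub(r"[^ACGT]", "", seq.upper())  # drop gaps and spaces
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if is_primitive(unit):  # skip e.g. "AA" reported for a run of A
            hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# One row block of Fig. 6.8B, copied as printed (gaps lost in EU37 are not restored).
bb1 = "AAGCGATATT GTAAGAGAAC GCTCTCGAAC GAACGAACGA GGCGTCGAA-"
eu37 = "AAGCGATATT GTAAGAGAAC GCTCTCGAAC GA GGCGTCGAAT"

print(find_ssrs(bb1))   # [(25, 'CGAA', 3)]
print(find_ssrs(eu37))  # []
```

In this block the BB1 sequence carries three tandem copies of a four-base unit where EU37 carries one, which is the kind of length difference that separates the populations in the alignment.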

Fig. 6.8A. Consensus tree of 9 most parsimonious trees (rescaled consistency index RC = , tree length TL = 2267) inferred from partial ITS, 18S and 28S sequences. Bootstrap values are indicated at the corresponding nodes. J1/J2 denote sequences obtained from different single juveniles.

EU37  TTGACGACGA C---GACTGG GACTACGACC GGGAAAATGG GCCGGGCGCC
XB1   TTGACGACGA C---GACTGG GACTACGACC GGGAAAATGG GCCGGGCGCC
EU29  TTGACGACGA C---GACTGG GACTACGACC GGGAAAATGG GCCGGGCGCC
BB1   TTGACGACGA CCATGACTGG GACTACGGCT GGGAAAATGG GCCGGGCGCC

EU37  CGAGTCGGTC -GTAAAGTAT GTGTTATATC CGGCTCGATT CTTCTTCGGG
XB1   CGAGTCGGTC -GTAAAGTAT GTGTTATATC CGGCTCGATT CTTCTTCGGG
EU29  CGAGTCGGTC -GTAAAGTAT GTGTTATATC CGGCTCGATT CTTCTTCGGG
BB1   CGAGTCAGTC TGTAATGTAT -----ATATC TGGCTCGATT CTTCTTCGGG

EU37  GTTAGTTTCG AGGACCCCTT CC-T TAGG GAAGGGGCAG
XB1   GTTAGTTTCG AGGACCCCTT CC-T TAGG GAAGGGGCAG
EU29  GTTAGTTTCG AGGACCCCTT CC-T TAGG GAAGGGGCAG
BB1   GTTAGTTTCG AGGGGCCCTT CCCTAGTCTC TAGTAGTAGG GAAGGGGCAG

EU37  ATTTAAAGAG TCGCTGGCTC TCGTCGAATA TCG--TAGAG ----CGATCT
XB1   ATTTAAAGAG TCGCTGGCTC TCGTCGAATA TCG--TAGAG ----CGATCT
EU29  ATTTAAAGAG TCGCTGGCTC TCGTCGAATA TCG--TAGAG ----CGATCT
BB1   ATTTAAAGAG TCGCTGGCTC TCGTCG---- TCGAATAGAG AGAGCGATCT

EU37  AAGCGATATT GTAAGAGAAC GCTCTCGAAC GA GGCGTCGAAT
XB1   AAGCGATATT GTAAGAGAAC GCTCTCGAAC GA GGCGTCGAAT
EU29  AAGCGATATT GTAAGAGAAC GCTCTCGAAC GA GGCGTCGAAT
BB1   AAGCGATATT GTAAGAGAAC GCTCTCGAAC GAACGAACGA GGCGTCGAA-

EU37  GAAAATTCAG TCGATCGTCG TCGTAGTCGT CGTTATTAGT AGTCGT-AAT
XB1   GAAAATTCAG TCGATCGTCG TCGTAGTCGT CGTTATTAGT AGTCGT-AAT
EU29  GTAAATTCAG TCGATCGTCG TCG TTATTAGT AGTCGT-AAT
BB1   GAAGATTCAG TCGATCGTCG TCG------T CGTTATTAGT AGTCGTTAAT

EU37  GCGACGA--- ACGTAACACG CGCG TAAT-CTCGA AATTTGACCT
XB1   GCGACGA--- ACGTAACACG CGCG TAAT-CTCGA AATTTGACCT
EU29  GCGACGACGA ACGTAACACG CGCG TAAT-CTCGA AATTTGACCT
BB1   GCGACGA--- ACGTAAGACG CGCGCGCGTA TAATACTCGA AATTTGACCT

Fig. 6.8B. Partial ITS sequence alignment of four populations of X. brevicollum. The sequence blocks enclosed in rectangles are the simple sequence repeats. The variation in these sequences contributes to the polymorphism among populations.

EU3    CAAGCTTAAA AAGAAG---G TAAATCCAAG CTAAG-AAAA GACTATGGTC TATAAAAGAG
EU4    CAAGCTTAAA AAGAAG---G TAAATCCAAG CTAAG-AAAA GACTATGGTC TATAAAAGAG
EU115  CAAGCTTAAA AAGAAG---G TAAATCCAAG CTAAG-AAAA GACTATGGTC TATAAAAGAG
EU118  CAAGCTTAAA AAGAAG---G TAAATCCAAG CTAAG-AAAA GACTATGGTC TATAAAAGAG
EU2    CAAGCTTAAA AAGAAG---G TAAATCCAAG CTAAG-AAAA GACTATGGTC TATAAAAGAG
T48    CAAGCTTAAA AAGAAG--GA AAAAGCCAAG CTAGAGAAAA GACTACGGTC TATAAAAGAG
ML8    CAAGCTTATA AAAAAATAAG AAAAGCGAAG CGATA-TAAA GACTACGGTC TATAAATAA-

EU3    AAGACTACGG TCTATAAAAA -CGCCAATCC AAGGAAGGCA T----ATATA AAAGAGCTAT
EU4    AAGACTACGG TCTATAAAAA -CGCCAATCC AAGGAAGGCA T----ATATA AAAGAGCTAT
EU115  AAGACTACGG TCTATAAAAA -CGCCAATCC AAGGAAGGCA T----ATATA AAAGAGCTAT
EU118  AAGACTACGG TCTATAAAAA -CGCCAATCC AAGGAAGGCA T----ATATA AAAGAGCTAT
EU2    AAGACTACGG TCTATAAAAA -CGCCAATCC AAGGAAGGCA T----ATATA AAAGAGCTAT
T48    AAGACTACGG TCTATAAAAA ACGCCAATCC GAGGAAGGCA TATATATATA AAAGAGCTAT
ML8    -AGACTACGG TCTATAAA T-- --AAAGACTA CG---GTCTA TAAAGACTAC

EU3    AAAAATAATC TCTCTATAAA AGAGAGAAGA CTACGGTCTA TAAAACAACG AAGCCGCCCG
EU4    AAAAATAATC TCTCTATAAA AGAGAGAAGA CTACGGTCTA TAAAACAACG AAGCCGCCCG
EU115  AAAAATAATC TCTCTATAAA AGAGAGAAGA CTACGGTCTA TAAAACAACG AAGCCGCCCG
EU118  AAAA-TAATC TCTCTATAAA AGAGAGAAGA CTACGGTCTA TAAAACAACG AAGCCGCCCG
EU2    AAAAATAATC TCTCTATAAA AGAGAGAAGA CTACGGTCTA TAAAACAACG AAGCCGCCTG
T48    AAAA-TAATC TCTCTATAAA AGAGAGAAGA CTACGGTCTA TAAAAGAAAA AGCCAAGCTT
ML8    GG------TC T----ATAAA ----TAAAGA CTACGGTCTA TAAATAAAGA CTACGGTCTA

EU3    CAGACCTAAG GTCTATAAAG CAATCTCTCT ATAAAAG-AG AGAAAGACTA CGGTCT
EU4    CAGACCTAAG GTCTATAAAG CAATCTCTCT ATAAAAG-AG AGAAAGACTA CGGTCT
EU115  CAGACCTAAG GTCTATAAAG TAATC--TCT ATAAAAG-AG AGAAAGACTA CGGTCT
EU118  CAGACCTAAG GTCTATAAAG GAATCTCTCT ATAAAAG-AG AGAAAGACTA CGGTCT
EU2    CAGACCTAAG GTCTATAAAG CAATCTCTCT ATAAAAG-AG AGAAAGACTA CGGTCT
T48    TA----TAAA GACTACG--G TCT ATAAAAAAAG AGAAAGACTA AGGTCT
ML8    TAAA--TAAA GACTACG--G TCT ATAAATA AAGACTA CGGTCT

Fig. 6.8C. Partial ITS sequence alignment of seven populations of X. pachtaicum. The sequence blocks enclosed in rectangles are the simple sequence repeats. The variation in these sequences contributes to the polymorphism among populations.

LM1ITS01  TTGATTACGT CCCTGCCCTT TGTACACACC GCCCGTCGCT ACTACCGATT GGATGACTTA
LM1ITS02  TTGATTACGT CCCTGCCCTT TGTACACACC GCCCGTCGCT ACTACCGATT GGATGACTTA

LM1ITS01  GTGAGGTCTT AGGACCGAAG TGAAGGAGCT TTCATTAGTT CTTTTACTTT GGAAATTTGA
LM1ITS02  GTGAGGTCTT AGGACCGAAG TGAAGGAGCT TTCATTAGTT CTTTTACTTT GGAAATTTGA

LM1ITS01  TCGAACTACG TTATCTAGAG GAAGTAAAAG TCGTAACAAG GTTTCCGTAG GTGAACCTGC
LM1ITS02  TCGAACTACG TTATCTAGAG GAAGTAAAAG TCGTAACAAG GTTTCCGTAG GTGAACCTGC

LM1ITS01  GGAAGGATCA TTAACGAGCT AATATAAAAA AGAAAACATC GTCGGGAAAA CTATAGAGAA
LM1ITS02  GGAAGGATCA TTAACGAGCT AATATAAAAA AGAAAACATC GTCGGGAAAA CTATAGAGAA

LM1ITS01  AAGGGGGAAA AATAATAGAC TTTTCTCTAT GACGATGAAA AAATGCCACG CTGACGGGAA
LM1ITS02  AAGGGGGAAA AATAATAGAC TTTTCTCTAT GACGATGAAA AAATGCCACG CTGACGGGAA

LM1ITS01  TAGCGGTAGG CGCGTAAAAA AGCGCAAAGT CCGTCGTCAG TAATAAGTGG TTGGAAAAAA
LM1ITS02  TAGCGGTAGG CGCGTAAAAA AGCGCAAAGT CCGTCGTCAG TAATAAGTGG TTGGAAAAAA

LM1ITS01  AGAGATATAC GCGCGATATA GTTCCCGCGT TGACGGTGAT TATCATATAT CGAAATAAGA
LM1ITS02  AGAGATATAC GCGCGATATA GTTCCCGCGT TGACGGTGAT TATCATATAT CGAAATAAGA

LM1ITS01  GGGAACCTAC AGATATATAT ATATACGATA CGCGCATATA TGCGGGGTTT TAGGTAACTG
LM1ITS02  GGGAACCTAC AGATATATAT ATATACGATA CGCGCATATA TGCGGGGTTT TAGGTAACTG

LM1ITS01  CCCACCGCAG TACCTGGTCT CAGCTATCTC TCGTATACGC TGGCGAGTGT TGTAAGGAAA
LM1ITS02  CCCACCGCAG TACCTGGTCT CAGCTATCTC TCGTATACGC TGGCGAGTGT TGTAAGGAAA

LM1ITS01  AAAATCCTAC ACGTCGCCTG GGACACGCGG ATGCGTTAGG TAGTGAACGC CCCTGGCGGT
LM1ITS02  AAAATCCTAC ACGTCGCCTG GGACACGCGG ATGCGTTAGG TAGTGAACGC CCCTGGCGGT

LM1ITS01  GTTCGCGGGA TCAGCGAAAT GCGGTAGACT GGTTCGTTGT AATCAGTCGC ATCGGTTACC
LM1ITS02  GTTCGCGGGA TCAGCGAAAT GCGGTAGACC GGTTCGTTGT AATCAGTCGC ATCGGTTACC

LM1ITS01  ACCGTATATT ATGCGCGTGC GTATACAAAA ATAATCGTCG GGAACTTTCT GATAATCGAT
LM1ITS02  ACCGTATATT ATGCGTGTGC GTATAT---A ATAATCGTCG GGAACTTTCT GATAATCGAT

LM1ITS01  ATATCCGTTA TCGCGGGGTG GAAATATTAT TTCGAACGTG TATCTCTATA TATACGAAAC
LM1ITS02  ATATCCGTTA TCGCGGGGTG GAAATATTAT TTCGAACGTG TATCTCTATA C

LM1ITS01  AGACGATCGG GACTAAGTTT GGCGAATCGG AAACAAAGTA GCTGACTTAG TTCTTTTTTG
LM1ITS02  AGACGATCGG GACTAAGTTT GGCGAATCGG AAACAAAGTA GCTGACTTAG TTCTTTTTTG

LM1ITS01  GGTATCTATC GAATTCTAAC GTTACAGTTC AAACGTCTGG CTGTAGTCGA AACGCGTTAG
LM1ITS02  GGTATCTATC GAATTCTAAC GTTACAGTTC CAACGTCTGG CTGTAGTCGA AACGCGTTAG

LM1ITS01  ATTCGGAATC CTCGGCGTAC GCGTCAAGGT GAGATTTAAA GAGTCGTTGG CTCCGAAGGA
LM1ITS02  ATTCGGAATC CTCGGCGTAC GCGTCAAGGT GAGATTTAAA GAGTCGTTGG CTCCGAAGGA

LM1ITS01  AACGGGAGTG ACTGTTAACG CGTCTAGGAA AGCGAAGCGC TAGGTTCACC GAGCGATATA
LM1ITS02  AACGGGAGTG ACTGTTAACG CGTCTAGGAA AGCGAAGCGC TAGGTTCACC GAGCGATATA

LM1ITS01  GAGAGTGGAA ATAATATCTC ---TAATATA TCGTTCGGTT GGAATTATAC CTTTCCGCAA
LM1ITS02  GAGAGTGGAA ATAATATCTC TAATAATATA TCGTTCGGTT GGAATTATAC CTTTGCGCAA

LM1ITS01  TGCGTTAGAT GCGTTTATCA GTTCGGGGTA TCCGGGAAAA ATGGATACCC GGGCGACCGC
LM1ITS02  TGCGTTAGAT GCGTTTATCA GTTCGGGGTA TCCGGGAAAA ACGGATACCC GGGCGACCGC

LM1ITS01  CCGAAAACGA AAATAACGAG TTAATATTAT ATACAAACTC TCCGGAGTAT AAAAAATATA
LM1ITS02  CCGAAAACGA AAATAACGAG TTAATATTAT ATACAAACTC TCCGGAGTAT AAAAAATATA

LM1ITS01  GAATCTAAGA GATTCC---- -ATTCTAAGC GGTGGATCAC TAGGCTCGCG GGTCGTTGAA
LM1ITS02  GAATCTAAGA GATTCCATGG AATTCTAAGC GGTGGATCAC TAGGCTCGCG GGTCGTTGAA

LM1ITS01  GAACGGGGCC AGTCCCGAGA ATAAGTGCGA ATTGCAGACA CAAAGAGCAT CGACTTTTCG
LM1ITS02  GAACGGGGCC AGTCCCGAGA ATAAGTGCGA ATTGCAGACA CAAAGAGCAT CGACTTTTCG

LM1ITS01  AACGCACATT GCGGTATCGG GCCTGCTCGA TACCACGCCT ATCTGAGGGA CGAATAAGAG
LM1ITS02  AACGCACATT GCGGTATCGG GCCTGCTCGA TACCACGCCT ATCTGAGGGA CGAATAAGAG

LM1ITS01  AACGAACTAA ATCGTTTGTT TGGCCGTTGG ATATTCCGAG GCGGAAAAAA TTTACTCGGA
LM1ITS02  AACGAACTAA ATCGTTTGTT TGGCCGTTGG ATATTCCGAG GCGGAAAAAA TTTACTCGGA

LM1ITS01  CGTCCAAGAA TAAAACTTGG CCAAATGGAG AAAATGACTT GAGATTGCGA CCAGCAGTCG
LM1ITS02  CGTCCAAGAA TAAAACTTGG CCAAATGGAG AAAATGACTT GAGATTGCGA CCAGCAGTCG

LM1ITS01  TTAAGTCTGA AGGTAAAAAA CGCGGGGAAA AAGTAAAGCG ATCGGGCGGA AACATATATA
LM1ITS02  TTAAGTCTGA AGGTAAAAAA CGCGGGGAAA AAGTAAAGCG ATCGGGCGGA AACATATATA

LM1ITS01  CGATCGCTTG AATGACGCGT GGTACCAGTT GGCGATTACA TATACAACGT AGAAAAATCT
LM1ITS02  CGATCGCTTG AATGACGCGT GGTACCAGTT GGCGATTACA TATACAACGT AGAAAAATCT

LM1ITS01  ATACGTATAT ACGCTGTTGG TTAGTCGTTA TTAGTCGTTG GTCTACGGAA AGGACTAACG
LM1ITS02  ATACGTATAT ACGCTGTTGG TTAGTCGTTA TTAGTCGTTG GTCTACGGAA AGGACTAACG

LM1ITS01  ACTCAATAGA ATCGACGCCT TACCGGCTGG TGTATGTATA TATTACGTAG AAAAGTACGT
LM1ITS02  ACTCAATAGA ATCGACGCCT TACCGGCTGG TGTATGTATA TATTACGTAG AAAAGTACGT

LM1ITS01  ATGTATTATC GCTCGAATAA AAATAACGCG ATGACAACCG CGTATTAGCG TTCGCGGCTT
LM1ITS02  ATGTATTATC GCTCGAATAA AAATAACGCG ATGACAACCG CGTATTAGCG TTCGCGGCTT

LM1ITS01  GACGTAATAA AGCGCAAGCG CGGGATAAAA CATATACGCT AAGAGTCGCG CGGATTTGTC
LM1ITS02  GACGTAATAA AGCGCAAGCG CGGGATAAAA CATATACGCT AAGAGTCGCG CGGATTTGTC

LM1ITS01  AGTGTCGTGT TCGACGTTTG ACCTCAGATT AGACGTGAAA ACCCGCCGAA TTTAAGCATA
LM1ITS02  AGTGTCGTGT TCGACGTTTG ACCTCAGATT AGACGTGAAA ACCCGCCGAA TTTAAGCATA

LM1ITS01  TAACTAGGCG GAGGAAAAGA AATTAACGAA GATTTCCTTA GTAACGGCGA GTGAAA
LM1ITS02  TAACTAGGCG GAGGAAAAGA AATTAACGAA GATTTCCTTA GTAACGGCGA GTGAAA

Fig. 6.8D. Complete ITS sequence alignment derived from two clones of PCR products from one individual of L. macrosoma. The sequence blocks enclosed in rectangles are the main polymorphic sites observed between the two clones. The variation of simple sequence repeats resulted in two polymorphic sites by expansion or contraction.
6.4 Conclusion

The ITS region of Xiphinema species is highly diversified. It is a good choice for PCR-RFLP analysis for species diagnosis. According to our results, each of the restriction enzymes selected can be a useful tool to separate species. If synonymisation of some species of the X. americanum group is considered, the same diagnostic power can also be applied to it. The phylogenetic information retained in the ITS region is also a good resource for phylogeny reconstruction of closely related taxa such as species of the genus Xiphinema. However, it may be less useful for the study of deeper lineages beyond the genus Xiphinema.

I observed polymorphisms among populations and within populations. Some intra-genomic polymorphisms were observed by restriction analysis or sequencing. A few simple sequence repeats (microsatellite loci) were found in the ITS region. Their expansion or contraction contributes to the polymorphism between populations, and they may be a useful tool to study the substructure of populations, gene flow or geographical subdivision.
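The logic behind a PCR-RFLP diagnosis can be sketched as an in-silico digestion: cut a sequence at every recognition site of an enzyme and compare the resulting fragment lengths. The example below is an illustrative sketch only. The function `digest`, the dictionary `ENZYMES` and the short test sequence are hypothetical and not part of this work; only four of the enzymes used here are included, with their standard recognition sites (AluI AG^CT, HaeIII GG^CC, RsaI GT^AC, MspI C^CGG).

```python
import re

# Recognition site and cut position within the site (top strand).
# All four sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "AluI": ("AGCT", 2),    # AG^CT
    "HaeIII": ("GGCC", 2),  # GG^CC
    "RsaI": ("GTAC", 2),    # GT^AC
    "MspI": ("CCGG", 1),    # C^CGG
}


def digest(seq, enzyme):
    """Return fragment lengths (longest first) after a complete digestion."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # Lookahead so that overlapping sites are all found.
    cuts = [m.start() + offset for m in re.finditer("(?=%s)" % site, seq)]
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Hypothetical 19 bp toy sequence with one AluI site and one HaeIII site.
toy = "AAAAAGCTTTTTGGCCAAA"
print(digest(toy, "AluI"))    # [13, 6]
print(digest(toy, "HaeIII"))  # [14, 5]
print(digest(toy, "RsaI"))    # [19]
```

Applied to full ITS sequences, two species give different fragment lists whenever a point mutation creates or destroys a site or an indel changes the distance between sites, which is what the gel profiles record.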

Table 6.3. ITS-RFLP profiles for Xiphinema americanum lineage species.

Restriction Enzymes AluI 750, 360, 160, 110, 80 CfoI 670, 310, 300, , 330, 250, 170, 130, 110, , 310, 300, 160, 150, ,300, 290, 80,60 DdeI 780, 340, 140, 75, 60, 10 HaeIII 610, 480, 277, 70 MspI 390, 320, 260, 245, , 330, 260, 150, 140, 75, 60, , 310, 300, 280, , 280, 250, , 340, 260, 140, 75, 60, , 330, 330, 270, , 270, 240, 240, 220 MvaI 760, , , 400, 360 RsaI 900, 240, 900, 520, 230, 70, , 370, 350, 140, 75, 60, , 290, 280, 190, 120, , 270, 220, 110, , 260, 350, 160, 140, 75, 60, , 450, 330, , 280, 250, 110, , 280, 240,160, 100

Note: The data recorded in the above table are only applicable to X. americanum, for which a 1.45 Kb PCR product is usually obtained using primers VN18 and VN28. The numbers at the top of the table indicate the RFLP profiles observed; the fragment sizes of each profile are listed directly below the numbers. ITS-RFLP patterns for a few species (X. pachtaicum, X. pachydermum, X. simile, X. brevisicum and X. madeirense), which were morphologically grouped into the Xiphinema pachtaicum subgroup by Lamberti and Ciancio (1993), are excluded from this table. They can easily be distinguished from the other species of this group by their distinct PCR products in the range between Kb and 1.3 Kb. They are not suspected virus vectors. Refer to Table 6.4 for the RFLP profile of each population or species.
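A simple sanity check on an RFLP profile is that its fragment sizes should add up to roughly the size of the undigested PCR product. The sketch below applies this to the three profiles in Table 6.3 that appear to be complete as printed here (the first AluI, DdeI and HaeIII entries). That reading of the table is an assumption, and the names `profiles` and `deviation` are illustrative only.

```python
PCR_PRODUCT_BP = 1450  # the 1.45 Kb product stated in the note to Table 6.3

# Fragment sizes (bp) as printed in Table 6.3; assumed to be single, complete profiles.
profiles = {
    "AluI": [750, 360, 160, 110, 80],
    "DdeI": [780, 340, 140, 75, 60, 10],
    "HaeIII": [610, 480, 277, 70],
}


def deviation(fragments, expected=PCR_PRODUCT_BP):
    """Difference between the summed fragment sizes and the expected product size."""
    return sum(fragments) - expected


for enzyme, fragments in profiles.items():
    print(enzyme, sum(fragments), deviation(fragments))
# AluI 1460 10
# DdeI 1405 -45
# HaeIII 1437 -13
```

Deviations of a few tens of base pairs are within what one would expect from sizes estimated on agarose gels, where small fragments are easily missed and co-migrating fragments are counted once.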

Table 6.4. ITS-RFLP profiles for X. americanum-group species.

Nematode species  AluI  CfoI  DdeI  MspI  MvaI  RsaI  HaeIII  Code
X. americanum  CA22
X. americanum  PE24
X. americanum  MS
X. americanum  XA1
X. americanum  XA2
X. americanum  PE40
X. americanum  CA54
X. americanum group sp.  CA4
X. americanum group sp.  CA6
X. americanum group sp.  CA7
X. americanum group sp.  CA8
X. americanum group sp.  CA10
X. americanum group sp.  CA11
X. americanum group sp.  CA12
X. americanum group sp.  CA13
X. americanum group sp.  CA14
X. americanum group sp.  CA15
X. americanum group sp.  CA16
X. americanum group sp.  CA18
X. americanum group sp.  CA24
X. americanum group sp.  CA31
X. americanum group sp.  CA40
X. americanum group sp.  CA43
X. americanum group sp.  CA45
X. americanum group sp.  CA55
X. americanum group sp.  PE26
X. americanum group sp.  PE29
X. americanum group sp.  PE32
X. americanum group sp.  PE39
X. americanum group sp.  AB7
X. americanum group sp.  AB9
X. americanum group sp.  PE5
X. americanum group sp.  PE6
X. americanum group sp.  PE7
X. americanum group sp.  PE9
X. americanum group sp.  PE12
X. americanum group sp.  PE15
X. americanum group sp.  MD2
X. americanum group sp.  CA62
X. brevicollum  Xb1
X. brevicollum  EU37
X. brevicollum  EU29
X. bricolensis  CAN39
X. bricolensis  PE18
X. californicum  CA3
X. californicum  CA50
X. californicum  CA33
X. diffusum  CAN162
X. diffusum  GG7
X. georgianum  GG14
X. incognitum  PE42
X. pacificum  GG15
X. pacificum  MD1
X. peruvianum  GG3
X. rivesi  PE23
X. rivesi  PE1
X. rivesi  PE2
X. rivesi  PE20
X. rivesi  PE33
X. santos  CAN224
X. santos  UT14
X. taylori  EU117
X. taylori  TN1
X. thornei  CO3
X. thornei  CO5
X. thornei  OR4
X. utahense  NV2
X. utahense  NV3
X. utahense  NV5
X. utahense  UT4
X. utahense  UT6
X. utahense  UT8
X. utahense  UT24
X. utahense  UT25

Numbers indicate the RFLP profile listed in Table

7 Isolation and characterisation of microsatellites for Xiphinema index using Degenerate Oligonucleotide Primed PCR

7.1 Introduction

Xiphinema index (Thorne and Allen, 1950) is a migratory ectoparasitic nematode and the natural vector of grapevine fanleaf virus (GFLV) (Hewitt et al., 1958). X. index reproduces by meiotic parthenogenesis. Its geographical centre of origin is considered to be the Middle East, where it has been found in natural woodlands in association with wild grapevines, and where GFLV is believed to have originated and co-evolved with its vector (Hewitt, 1985). From this region both X. index and GFLV have been spread to most grapevine-growing areas of the world through human intervention. X. index is widely distributed in the viticultural areas of the Americas (Robbins & Brown, 1991; Doucet et al., 1998), Europe and the former Soviet Union (Brown et al., 1990). X. index is found naturally associated with grapevine, and in the laboratory it can also multiply on other hosts like Vitis spp., Citrus aurantium, fig and rose (Radewald & Raski, 1962; Coiro & Brown, 1984). Because of its damaging association with GFLV, X. index is listed as a quarantine nematode in several countries (Brown et al., 1990).

Accurate identification methods would not only avoid misidentification but also allow the occurrence of the species to be monitored and support breeding for host resistance. DNA markers with good discriminating power allow unequivocal and rapid species identification and are particularly useful for those with little taxonomic expertise.

Microsatellites are widely distributed in the genomes of eukaryotic organisms and show high levels of polymorphism (Litt & Luty, 1989; Weber & May, 1989; Tautz, 1989). Because of this high polymorphism and their ease of interpretation, uninterrupted repeats have been applied for different purposes (Weber, 1990). However, compound microsatellite loci with mixed repeat motifs, or loci interrupted by other sequences, complicate the interpretation of length patterns at the population level (Freimer & Slatkin, 1996), so sequence analysis is essential. High-resolution maps for the mouse and the human sequencing project have revealed an even distribution of microsatellites in these genomes (Dib et al., 1996; Dietrich et al., 1996). Microsatellites are rarely located in coding sequences (Hancock, 1995a) because their instability, caused mainly by slipped-strand mispairing during DNA replication (Fresco & Alberts, 1960; Sia et al., 1997a), can result in lethal frameshift mutations. Microsatellite markers are widely used in population genetic studies, DNA typing systems and genome mapping because of their abundance and high polymorphism, and because their analysis can be fully automated (Hancock, 1999).

The objectives of this study were: 1) to screen and analyse microsatellites from one population of X. index and to test the species specificity of the selected microsatellites by confirming their presence in populations of X. index from diverse geographical origins and their absence in other Xiphinema species and in distantly related genera of Nematoda; 2) to evaluate the genetic diversity of the microsatellites within and between populations.

7.2 Materials and Methods

Nematode samples used in this study

The Italian X. index population coded T31 (Table 7.1) was used for microsatellite screening. Other nematode species and populations used in this study are summarized in Table 7.1. Nematodes were extracted from soil by decanting and sieving methods (Cobb, 1918; Brown & Boag, 1988). The specimens were identified morphologically and stored at -70 °C in 1 M NaCl.

DNA extraction

DNA was extracted from a single juvenile or adult for PCR as described in chapter

PCR amplification with degenerate oligonucleotide primer (DOP)

10 µl of crude DNA extract from a single juvenile was included in a PCR with a total volume of 50 µl containing 2.5 units of AmpliTaq Gold polymerase (ABI, Lennik, Belgium), 2.0 µM DOP primer (5'-CCGACTCGAGNNNNNNATGTGG-3') (Telenius et al., 1992), 200 µM of each dNTP (Qiagen), 10 mM Tris-HCl (pH 8.3), 50 mM KCl and 1.5 or 2.0 mM MgCl2. PCR was performed in a PTC-100/200 thermocycler (MJ Research, Biozyme, San Diego, USA). The cycling conditions, modified from Cheung and Stanley (1996), were: 95 °C for 10 min; 8 cycles of 93 °C for 1 min, 30 °C for 1 min and 72 °C for 3 min; another 28 cycles of 93 °C for 1 min, 60 °C for 1 min and 72 °C for 3 min; and a final extension at 72 °C for 10 min. After PCR, samples were loaded onto a 2% agarose gel. Electrophoresis was performed in 1× TAE buffer at 100 V for 2 hours. The gel was stained with ethidium bromide (0.5 µg/mL). Pictures were taken with a Kodak digital camera (Kodak, NJ, US) over a UV transilluminator.
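The DOP primer contains six fully degenerate positions (N), so the reagent is in fact a mixture of many distinct oligonucleotides. The short sketch below counts and enumerates them. It is an illustration under the standard IUPAC meaning of N (any of A, C, G, T); the helper names `degeneracy` and `expand` are hypothetical.

```python
from itertools import product

# IUPAC codes needed for this primer; N stands for any of the four bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}

DOP = "CCGACTCGAGNNNNNNATGTGG"  # primer sequence given in the text, 5' to 3'


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """Lazily generate every concrete sequence encoded by the primer."""
    return ("".join(p) for p in product(*(IUPAC[b] for b in primer)))


print(len(DOP))           # 22
print(degeneracy(DOP))    # 4096
print(next(expand(DOP)))  # CCGACTCGAGAAAAAAATGTGG
```

With 4^6 = 4096 variants sharing the same defined 3' end (ATGTGG), the primer can anneal at many sites across the genome during the low-temperature cycles, which is consistent with the smear of products expected from this kind of amplification.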

Table 7.1. The nematode samples used in the study.

Family            Genus            Species         Code of population  Origin
Longidoridae      Xiphinema        X. index        T8      Italy
Longidoridae      Xiphinema        X. index        CA28    Plenada, CA, US
Longidoridae      Xiphinema        X. index        MC21    Morocco
Longidoridae      Xiphinema        X. index        EU25    Argentina
Longidoridae      Xiphinema        X. index        SPA1    Spain
Longidoridae      Xiphinema        X. index        T31     Italy
Longidoridae      Xiphinema        X. americanum   PEAX    Florida, US
Longidoridae      Xiphinema        X. chambersi    AB3     Lee county, AB, US
Longidoridae      Xiphinema        X. italiae      BAR1    Italy
Longidoridae      Xiphinema        X. elongatum    CAN24   Israel
Longidoridae      Xiphinema        X. coxi         GG10    Jenkil Tsloud, GG, US
Longidoridae      Xiphinema        X. vuittenezi   XV1     Austria
Longidoridae      Xiphinema        X. dentatum     EU113   Branisko, Slovakia
Longidoridae      Xiphinema        X. brasiliense  EU41    Para State, Brazil
Longidoridae      Paralongidorus   P. maximus      CAN201  St. Martins, Germany
Longidoridae      Longidorus       L. minor        SV46    Switzerland
Longidoridae      Longidorus       L. juvenilis    CAN196  Moca, Slovakia
Longidoridae      Longidoroides    L. sp.          CAN74   Dakar, Senegal
Longidoridae      Xiphidorus       X. sp.          VE269   Caribbean
Heteroderidae     Heterodera       H. schachtii    -       Belgium
Heteroderidae     Globodera        G. pallida      -       Belgium
Aphelenchoididae  Bursaphelenchus  B. xylophilus   B3      Vietnam
Pratylenchidae    Radopholus       R. similis      -       CLO-laboratory culture
Pratylenchidae    Pratylenchus     P. penetrans    -       CLO-laboratory culture
Pratylenchidae    Hirschmanniella  H. sp.          H1      Vietnam

Note: Longidoridae belong to the class Enoplea of the phylum Nematoda. The other families presented in the table belong to the class Chromadorea.

Genomic library construction and microsatellite screening

DOP products from different individuals were pooled and purified from primers using a PCR purification kit (Qiagen GmbH, Germany). The DNA concentration was measured with a spectrophotometer, and the DNA was ligated into the pGEM-T vector (Promega) using T4 DNA ligase (Promega, Leiden, The Netherlands). Ligation products were transformed into XL1-Blue supercompetent cells (Stratagene). Bacterial clones were screened by the classical blue/white procedure on LB plates containing ampicillin, X-Gal and IPTG (Sambrook et al., 1989). White colonies were picked and inoculated onto a fresh LB plate with ampicillin (100 µg/ml). After overnight culture, they were transferred to nylon membranes (Hybond N+, Amersham-Pharmacia, Roosendaal, The Netherlands). Hybridization was performed with (CA)9 probes tailed with DIG-dUTP (Roche Biochemical, Vilvoorde, Belgium) by terminal transferase (Roche Biochemical). Membranes were prehybridized in 5× SSC, 1× blocking solution (Roche Biochemical), 0.1% N-lauroylsarcosine and 0.02% SDS at 65 °C for 2 hours, and then hybridized at 45 °C for hours. After stringent washing in a solution containing 0.5× SSC and 0.1% SDS, the membranes were incubated with anti-DIG alkaline phosphatase (Roche Biochemical) and processed with the chemiluminescent substrate CSPD (Roche Biochemical). The membranes were exposed to Fuji X-ray film (Fuji Photo Film Company Ltd., Japan) for hours.

Sequencing and primer design

Plasmids were purified from the positive clones using a Qiagen Miniprep kit (Qiagen). Dot-blot analysis was carried out to confirm that the inserts contained microsatellites. The hybridization procedure followed the protocol described in the Genomic library construction and screening section. ng recombinant plasmid DNA was used as sequencing template. DNA sequencing was performed on an ABI 377 automated sequencer using a BigDye terminator cycle sequencing kit (ABI, Lennik, Belgium). DNA sequences were aligned with ClustalX 1.8 (Thompson et al., 1997). Primers were designed in the flanking regions of each microsatellite, and their uniqueness was confirmed by blastn searches against the NCBI nucleotide database (Table 7.2).

PCR amplification using a unique primer pair for each microsatellite

5 µl of crude DNA extract from a single juvenile was included in a PCR with a total volume of 50 µl containing 1.5 units of AmpliTaq Gold polymerase (ABI, Lennik, Belgium), 0.5 µM of each primer, 200 µM of each dNTP (Qiagen), 10 mM Tris-HCl (pH 8.3), 50 mM KCl and 2.0 mM MgCl2. The PCR was performed in a PTC-100/200 thermocycler (MJ Research, Biozyme). The cycling conditions comprised an initial denaturation and polymerase activation step at 95 °C for 7 min; 35 cycles of 94 °C for 30 s, 54 °C for 40 s and 72 °C for 1 min; and a final extension step at 72 °C for 10 min. For DNA detection, PCR samples were loaded onto a 2% agarose gel. Electrophoresis was performed in 1× TAE buffer at 100 V for 1 hour. The gel was stained with ethidium bromide (0.5 µg/mL). Pictures were taken with a Kodak digital camera (Kodak, NY, USA) over a UV transilluminator.

Microsatellite analysis

Microsatellites were amplified with primers labelled with the 6-FAM or HEX fluorophore (Invitrogen, Merelbeke, Belgium). The cycling conditions were as described above. PCR products were prepared and analysed on an ABI 310 Genetic Analyzer following the instructions of the GeneScan user manual (ABI). Fragments of different sizes were cloned and sequenced. Sequences were compared with the NCBI GenBank database by blastn or blastx searches. The secondary structure of the MIRs-like part of the XIMSS1 microsatellite was predicted using Mfold (Zuker & Turner, 1999) and improved by hand using RNAviz (De Rijk & De Wachter, 1997).
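The two thermocycling programs used in this chapter differ considerably in length. The sketch below totals the programmed hold times of each, using only the step durations stated in the text. Ramp times between temperatures are ignored, so real run times would be longer; the function name `program_seconds` is hypothetical.

```python
def program_seconds(initial, stages, final):
    """Total programmed hold time of a PCR program, in seconds.

    initial, final: durations of the first and last hold steps.
    stages: list of (number_of_cycles, [step durations in seconds]).
    Ramp times between steps are not included.
    """
    return initial + sum(n * sum(steps) for n, steps in stages) + final


# DOP-PCR: 95 C 10 min; 8 x (1 min, 1 min, 3 min); 28 x (1 min, 1 min, 3 min); 72 C 10 min
dop = program_seconds(600, [(8, [60, 60, 180]), (28, [60, 60, 180])], 600)

# Locus-specific PCR: 95 C 7 min; 35 x (30 s, 40 s, 1 min); 72 C 10 min
specific = program_seconds(420, [(35, [30, 40, 60])], 600)

print(dop, dop / 60)                      # 12000 200.0
print(specific, round(specific / 60, 1))  # 5570 92.8
```

The DOP program holds for 200 minutes in total against about 93 minutes for the locus-specific program, mainly because of its 3-minute extension steps.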

7.3 Results

DOP-PCR results

Amplification with the degenerate oligonucleotide primer resulted in a smear on the agarose gel, which was characterized by distinct band patterns. Much higher yields were achieved with 2.0 mM Mg2+ than with 1.5 mM Mg2+ (Fig. 7.1a and 7.1b). Amplification products were never observed in the negative controls.

Fig. 7.1a. DOP-PCR with 1.5 mM Mg2+. Lane 1: low DNA mass ladder (GibcoBRL); lanes 2-11: DOP-PCR from Xiphinema index individuals; lane 12: 100 bp ladder (Promega, Leiden, The Netherlands).

Fig. 7.1b. DOP-PCR with 2.0 mM Mg2+. Lane 1: 100 bp ladder (Promega); lanes 2-5: DOP-PCR from Xiphinema index individuals.

Microsatellite screening results

Seven microsatellites were obtained by screening 6200 colonies. Of the 26 sequenced positive clones, 2 carried the XIMSL1 microsatellite, 1 carried XIMSL2, 2 carried XIMSL3, 8 carried XIMSL4, 6 carried XIMSL5 and 3 carried XIMSL6; 2 clones contained the microsatellite XIMSS1, and another 2 clones harboured only the common interspersed element of the microsatellites with the prefix XIMSL (Table 7.2).

PCR test results derived from the specific primer pairs

The specific primer pairs for the seven microsatellites produced PCR products for six X. index populations from different geographical origins. The XIMSL-prefixed microsatellites shared a common reverse primer located in a conserved region of the flanking L1-like element. Amplification profiles for each microsatellite are shown in Table 7.3. No amplification products were produced for the other species listed in Table 7.1 (Fig. 7.2).

Fig PCR with specific primer. Lane 20: 100 bp ladder (Promega); lane 1: Xiphinema index, T31; lane 2: CAN196; lane 3: VE269; lane 4: CAN74; lane 5: SV46; lane 6: CAN201; lane 7: EU41; lane 8: XV1; lane 9: EU113; lane 10: GG10; lane 11: CAN24; lane 12: BAR1; lane 13: AB3; lane 14: Hirschmanniella sp. (H1); lane 15: Pratylenchus penetrans; lane 16: Radopholus similis; lane 17: Bursaphelenchus xylophilus (B3); lane 18: Globodera pallida; lane 19: Heterodera schachtii. Codes refer to Table
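The clone counts reported in the screening results can be tallied to confirm that they account for all 26 sequenced positive clones, and to express the yield of the library screen as a fraction of the colonies screened. This is a simple arithmetic check on the figures given in the text; the variable names are illustrative.

```python
COLONIES_SCREENED = 6200

# Number of sequenced positive clones per category, as reported in the text.
clones = {
    "XIMSL1": 2,
    "XIMSL2": 1,
    "XIMSL3": 2,
    "XIMSL4": 8,
    "XIMSL5": 6,
    "XIMSL6": 3,
    "XIMSS1": 2,
    "interspersed element only": 2,
}

total = sum(clones.values())
loci = [name for name in clones if name.startswith("XIMS")]
percent_positive = 100 * total / COLONIES_SCREENED

print(total)                       # 26
print(len(loci))                   # 7
print(round(percent_positive, 2))  # 0.42
```

The sequenced positive clones therefore correspond to about 0.42% of the colonies screened, and they collapse to seven distinct loci because several clones carry the same microsatellite (eight for XIMSL4 alone).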


PLNT2530 (2018) Unit 5 Genomes: Organization and Comparisons PLNT2530 (2018) Unit 5 Genomes: Organization and Comparisons Unless otherwise cited or referenced, all content of this presenataion is licensed under the Creative Commons License Attribution Share-Alike

More information

CRISPR-SeroSeq: A Developing Technique for Salmonella Subtyping

CRISPR-SeroSeq: A Developing Technique for Salmonella Subtyping Department of Biological Sciences Seminar Blog Seminar Date: 3/23/18 Speaker: Dr. Nikki Shariat, Gettysburg College Title: Probing Salmonella population diversity using CRISPRs CRISPR-SeroSeq: A Developing

More information

PHYLOGENY AND SYSTEMATICS

PHYLOGENY AND SYSTEMATICS AP BIOLOGY EVOLUTION/HEREDITY UNIT Unit 1 Part 11 Chapter 26 Activity #15 NAME DATE PERIOD PHYLOGENY AND SYSTEMATICS PHYLOGENY Evolutionary history of species or group of related species SYSTEMATICS Study

More information

Principles of Genetics

Principles of Genetics Principles of Genetics Snustad, D ISBN-13: 9780470903599 Table of Contents C H A P T E R 1 The Science of Genetics 1 An Invitation 2 Three Great Milestones in Genetics 2 DNA as the Genetic Material 6 Genetics

More information

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17.

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17. Genetic Variation: The genetic substrate for natural selection What about organisms that do not have sexual reproduction? Horizontal Gene Transfer Dr. Carol E. Lee, University of Wisconsin In prokaryotes:

More information

Amy Driskell. Laboratories of Analytical Biology National Museum of Natural History Smithsonian Institution, Wash. DC

Amy Driskell. Laboratories of Analytical Biology National Museum of Natural History Smithsonian Institution, Wash. DC DNA Barcoding Amy Driskell Laboratories of Analytical Biology National Museum of Natural History Smithsonian Institution, Wash. DC 1 Outline 1. Barcoding in general 2. Uses & Examples 3. Barcoding Bocas

More information

Eukaryotic vs. Prokaryotic genes

Eukaryotic vs. Prokaryotic genes BIO 5099: Molecular Biology for Computer Scientists (et al) Lecture 18: Eukaryotic genes http://compbio.uchsc.edu/hunter/bio5099 Larry.Hunter@uchsc.edu Eukaryotic vs. Prokaryotic genes Like in prokaryotes,

More information

Interactive comment on Nematode taxonomy: from morphology to metabarcoding by M. Ahmed et al.

Interactive comment on Nematode taxonomy: from morphology to metabarcoding by M. Ahmed et al. SOIL Discuss., 2, C733 C741, 2016 www.soil-discuss.net/2/c733/2016/ Author(s) 2016. This work is distributed under the Creative Commons Attribute 3.0 License. Interactive comment on Nematode taxonomy:

More information

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Bio 1B Lecture Outline (please print and bring along) Fall, 2007 Bio 1B Lecture Outline (please print and bring along) Fall, 2007 B.D. Mishler, Dept. of Integrative Biology 2-6810, bmishler@berkeley.edu Evolution lecture #5 -- Molecular genetics and molecular evolution

More information

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/1/18

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/1/18 Genome Evolution Outline 1. What: Patterns of Genome Evolution Carol Eunmi Lee Evolution 410 University of Wisconsin 2. Why? Evolution of Genome Complexity and the interaction between Natural Selection

More information

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11 The Eukaryotic Genome and Its Expression Lecture Series 11 The Eukaryotic Genome and Its Expression A. The Eukaryotic Genome B. Repetitive Sequences (rem: teleomeres) C. The Structures of Protein-Coding

More information

Conservation Genetics. Outline

Conservation Genetics. Outline Conservation Genetics The basis for an evolutionary conservation Outline Introduction to conservation genetics Genetic diversity and measurement Genetic consequences of small population size and extinction.

More information

Distance Learning course Plant pathology and entomology Covered topics

Distance Learning course Plant pathology and entomology Covered topics Distance Learning course Plant pathology and entomology Covered topics The distance learning course Plant pathology and entomology consist of four online modules that treat with the main groups of plant

More information

Introduction to Biosystematics - Zool 575

Introduction to Biosystematics - Zool 575 Introduction to Biosystematics Lecture 8 - Modern Taxonomy Outline - 1. Tools - digital imaging, databases 2. Dissemination - WWW 3. Tools - Molecular data, species demarcation, phylogeography 1 2 Prognosis

More information

Report of the Research Coordination Meeting Genetics of Root-Knot Nematode Resistance in Cotton Dallas, Texas, October 24, 2007

Report of the Research Coordination Meeting Genetics of Root-Knot Nematode Resistance in Cotton Dallas, Texas, October 24, 2007 Report of the Research Coordination Meeting Genetics of Root-Knot Nematode Resistance in Cotton Dallas, Texas, October 24, 2007 Participants: Frank Callahan, Peng Chee, Richard Davis, Mamadou Diop, Osman

More information

RNA Synthesis and Processing

RNA Synthesis and Processing RNA Synthesis and Processing Introduction Regulation of gene expression allows cells to adapt to environmental changes and is responsible for the distinct activities of the differentiated cell types that

More information

Microbial Taxonomy. Microbes usually have few distinguishing properties that relate them, so a hierarchical taxonomy mainly has not been possible.

Microbial Taxonomy. Microbes usually have few distinguishing properties that relate them, so a hierarchical taxonomy mainly has not been possible. Microbial Taxonomy Traditional taxonomy or the classification through identification and nomenclature of microbes, both "prokaryote" and eukaryote, has been in a mess we were stuck with it for traditional

More information

The Gene The gene; Genes Genes Allele;

The Gene The gene; Genes Genes Allele; Gene, genetic code and regulation of the gene expression, Regulating the Metabolism, The Lac- Operon system,catabolic repression, The Trp Operon system: regulating the biosynthesis of the tryptophan. Mitesh

More information

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p.110-114 Arrangement of information in DNA----- requirements for RNA Common arrangement of protein-coding genes in prokaryotes=

More information

Model plants and their Role in genetic manipulation. Mitesh Shrestha

Model plants and their Role in genetic manipulation. Mitesh Shrestha Model plants and their Role in genetic manipulation Mitesh Shrestha Definition of Model Organism Specific species or organism Extensively studied in research laboratories Advance our understanding of Cellular

More information

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern

More information

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution Taxonomy Content Why Taxonomy? How to determine & classify a species Domains versus Kingdoms Phylogeny and evolution Why Taxonomy? Classification Arrangement in groups or taxa (taxon = group) Nomenclature

More information

GCD3033:Cell Biology. Transcription

GCD3033:Cell Biology. Transcription Transcription Transcription: DNA to RNA A) production of complementary strand of DNA B) RNA types C) transcription start/stop signals D) Initiation of eukaryotic gene expression E) transcription factors

More information

Science Unit Learning Summary

Science Unit Learning Summary Learning Summary Inheritance, variation and evolution Content Sexual and asexual reproduction. Meiosis leads to non-identical cells being formed while mitosis leads to identical cells being formed. In

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

Bio 119 Bacterial Genomics 6/26/10

Bio 119 Bacterial Genomics 6/26/10 BACTERIAL GENOMICS Reading in BOM-12: Sec. 11.1 Genetic Map of the E. coli Chromosome p. 279 Sec. 13.2 Prokaryotic Genomes: Sizes and ORF Contents p. 344 Sec. 13.3 Prokaryotic Genomes: Bioinformatic Analysis

More information

The science behind the conservation of endangered species: My experiences with Australian and Japanese bats. Kyle N. Armstrong

The science behind the conservation of endangered species: My experiences with Australian and Japanese bats. Kyle N. Armstrong The science behind the conservation of endangered species: My experiences with ustralian and Japanese bats Kyle N rmstrong The Kyoto University Museum This talk Why I am a Zoologist / Conservation Biologist

More information

Map of AP-Aligned Bio-Rad Kits with Learning Objectives

Map of AP-Aligned Bio-Rad Kits with Learning Objectives Map of AP-Aligned Bio-Rad Kits with Learning Objectives Cover more than one AP Biology Big Idea with these AP-aligned Bio-Rad kits. Big Idea 1 Big Idea 2 Big Idea 3 Big Idea 4 ThINQ! pglo Transformation

More information

Small RNA in rice genome

Small RNA in rice genome Vol. 45 No. 5 SCIENCE IN CHINA (Series C) October 2002 Small RNA in rice genome WANG Kai ( 1, ZHU Xiaopeng ( 2, ZHONG Lan ( 1,3 & CHEN Runsheng ( 1,2 1. Beijing Genomics Institute/Center of Genomics and

More information

Microbial Taxonomy and the Evolution of Diversity

Microbial Taxonomy and the Evolution of Diversity 19 Microbial Taxonomy and the Evolution of Diversity Copyright McGraw-Hill Global Education Holdings, LLC. Permission required for reproduction or display. 1 Taxonomy Introduction to Microbial Taxonomy

More information

Algorithms in Computational Biology (236522) spring 2008 Lecture #1

Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: 15:30-16:30/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office hours:??

More information

Mole_Oce Lecture # 24: Introduction to genomics

Mole_Oce Lecture # 24: Introduction to genomics Mole_Oce Lecture # 24: Introduction to genomics DEFINITION: Genomics: the study of genomes or he study of genes and their function. Genomics (1980s):The systematic generation of information about genes

More information

Introduction to molecular biology. Mitesh Shrestha

Introduction to molecular biology. Mitesh Shrestha Introduction to molecular biology Mitesh Shrestha Molecular biology: definition Molecular biology is the study of molecular underpinnings of the process of replication, transcription and translation of

More information

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology 2012 Univ. 1301 Aguilera Lecture Introduction to Molecular and Cell Biology Molecular biology seeks to understand the physical and chemical basis of life. and helps us answer the following? What is the

More information

Full file at CHAPTER 2 Genetics

Full file at   CHAPTER 2 Genetics CHAPTER 2 Genetics MULTIPLE CHOICE 1. Chromosomes are a. small linear bodies. b. contained in cells. c. replicated during cell division. 2. A cross between true-breeding plants bearing yellow seeds produces

More information

Microbiome: 16S rrna Sequencing 3/30/2018

Microbiome: 16S rrna Sequencing 3/30/2018 Microbiome: 16S rrna Sequencing 3/30/2018 Skills from Previous Lectures Central Dogma of Biology Lecture 3: Genetics and Genomics Lecture 4: Microarrays Lecture 12: ChIP-Seq Phylogenetics Lecture 13: Phylogenetics

More information

Prokaryotes & Viruses. Practice Questions. Slide 1 / 71. Slide 2 / 71. Slide 3 / 71. Slide 4 / 71. Slide 6 / 71. Slide 5 / 71

Prokaryotes & Viruses. Practice Questions. Slide 1 / 71. Slide 2 / 71. Slide 3 / 71. Slide 4 / 71. Slide 6 / 71. Slide 5 / 71 Slide 1 / 71 Slide 2 / 71 New Jersey Center for Teaching and Learning Progressive Science Initiative This material is made freely available at www.njctl.org and is intended for the non-commercial use of

More information

Designer Genes C Test

Designer Genes C Test Northern Regional: January 19 th, 2019 Designer Genes C Test Name(s): Team Name: School Name: Team Number: Rank: Score: Directions: You will have 50 minutes to complete the test. You may not write on the

More information

Chapter 18 Active Reading Guide Genomes and Their Evolution

Chapter 18 Active Reading Guide Genomes and Their Evolution Name: AP Biology Mr. Croft Chapter 18 Active Reading Guide Genomes and Their Evolution Most AP Biology teachers think this chapter involves an advanced topic. The questions posed here will help you understand

More information

Ch 10. Classification of Microorganisms

Ch 10. Classification of Microorganisms Ch 10 Classification of Microorganisms Student Learning Outcomes Define taxonomy, taxon, and phylogeny. List the characteristics of the Bacteria, Archaea, and Eukarya domains. Differentiate among eukaryotic,

More information

VCE BIOLOGY Relationship between the key knowledge and key skills of the Study Design and the Study Design

VCE BIOLOGY Relationship between the key knowledge and key skills of the Study Design and the Study Design VCE BIOLOGY 2006 2014 Relationship between the key knowledge and key skills of the 2000 2005 Study Design and the 2006 2014 Study Design The following table provides a comparison of the key knowledge (and

More information

Diagnostics and genetic variation of an invasive microsporidium (Nosema ceranae) in honey bees (Apis mellifera)

Diagnostics and genetic variation of an invasive microsporidium (Nosema ceranae) in honey bees (Apis mellifera) Diagnostics and genetic variation of an invasive microsporidium (Nosema ceranae) in honey bees (Apis mellifera) Dr. M. M. Hamiduzzaman School of Environmental Sciences University of Guelph, Canada Importance

More information

Introduction to Molecular and Cell Biology

Introduction to Molecular and Cell Biology Introduction to Molecular and Cell Biology Molecular biology seeks to understand the physical and chemical basis of life. and helps us answer the following? What is the molecular basis of disease? What

More information

Organelle genome evolution

Organelle genome evolution Organelle genome evolution Plant of the day! Rafflesia arnoldii -- largest individual flower (~ 1m) -- no true leafs, shoots or roots -- holoparasitic -- non-photosynthetic Big questions What is the origin

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

MRC-Holland MLPA. Description version 14; 21 January 2015

MRC-Holland MLPA. Description version 14; 21 January 2015 SALSA MLPA probemix P229-B2 OPA1 Lot B2-0412. As compared to version B1-0809, two reference probes and the 88 and 96 nt control fragments have been replaced (QDX2). The OPA1 gene product is a nuclear-encoded

More information

The rdna Internal Transcribed Spacer Region as a Taxonomic Marker for Nematodes

The rdna Internal Transcribed Spacer Region as a Taxonomic Marker for Nematodes University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Papers in Plant Pathology Plant Pathology Department 1997 The rdna Internal Transcribed Spacer Region as a Taxonomic Marker

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

HORIZONTAL TRANSFER IN EUKARYOTES KIMBERLEY MC GRAIL FERNÁNDEZ GENOMICS

HORIZONTAL TRANSFER IN EUKARYOTES KIMBERLEY MC GRAIL FERNÁNDEZ GENOMICS HORIZONTAL TRANSFER IN EUKARYOTES KIMBERLEY MC GRAIL FERNÁNDEZ GENOMICS OVERVIEW INTRODUCTION MECHANISMS OF HGT IDENTIFICATION TECHNIQUES EXAMPLES - Wolbachia pipientis - Fungus - Plants - Drosophila ananassae

More information

SPECIATION. REPRODUCTIVE BARRIERS PREZYGOTIC: Barriers that prevent fertilization. Habitat isolation Populations can t get together

SPECIATION. REPRODUCTIVE BARRIERS PREZYGOTIC: Barriers that prevent fertilization. Habitat isolation Populations can t get together SPECIATION Origin of new species=speciation -Process by which one species splits into two or more species, accounts for both the unity and diversity of life SPECIES BIOLOGICAL CONCEPT Population or groups

More information

DNA Technology, Bacteria, Virus and Meiosis Test REVIEW

DNA Technology, Bacteria, Virus and Meiosis Test REVIEW Be prepared to turn in a completed test review before your test. In addition to the questions below you should be able to make and analyze a plasmid map. Prokaryotic Gene Regulation 1. What is meant by

More information

UE Praktikum Bioinformatik

UE Praktikum Bioinformatik UE Praktikum Bioinformatik WS 08/09 University of Vienna 7SK snrna 7SK was discovered as an abundant small nuclear RNA in the mid 70s but a possible function has only recently been suggested. Two independent

More information

MiGA: The Microbial Genome Atlas

MiGA: The Microbial Genome Atlas December 12 th 2017 MiGA: The Microbial Genome Atlas Jim Cole Center for Microbial Ecology Dept. of Plant, Soil & Microbial Sciences Michigan State University East Lansing, Michigan U.S.A. Where I m From

More information

1 In 2006, the scientific journal, Nature, reported the discovery of a fossil from around 380 million

1 In 2006, the scientific journal, Nature, reported the discovery of a fossil from around 380 million 1 In 2006, the scientific journal, Nature, reported the discovery of a fossil from around 380 million years ago. It was given the name Tiktaalik roseae. This fossil has some features in common with fish

More information

UON, CAS, DBSC, General Biology II (BIOL102) Dr. Mustafa. A. Mansi. The Origin of Species

UON, CAS, DBSC, General Biology II (BIOL102) Dr. Mustafa. A. Mansi. The Origin of Species The Origin of Species Galápagos Islands, landforms newly emerged from the sea, despite their geologic youth, are filled with plants and animals known no-where else in the world, Speciation: The origin

More information

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

More information

Chapter Chemical Uniqueness 1/23/2009. The Uses of Principles. Zoology: the Study of Animal Life. Fig. 1.1

Chapter Chemical Uniqueness 1/23/2009. The Uses of Principles. Zoology: the Study of Animal Life. Fig. 1.1 Fig. 1.1 Chapter 1 Life: Biological Principles and the Science of Zoology BIO 2402 General Zoology Copyright The McGraw Hill Companies, Inc. Permission required for reproduction or display. The Uses of

More information

Wheat Genetics and Molecular Genetics: Past and Future. Graham Moore

Wheat Genetics and Molecular Genetics: Past and Future. Graham Moore Wheat Genetics and Molecular Genetics: Past and Future Graham Moore 1960s onwards Wheat traits genetically dissected Chromosome pairing and exchange (Ph1) Height (Rht) Vernalisation (Vrn1) Photoperiodism

More information

MRC-Holland MLPA. Description version 09; 25 April 2017

MRC-Holland MLPA. Description version 09; 25 April 2017 SALSA MLPA probemix P143-C2 MFN2-MPZ Lot C2-0317. As compared to version C1-0813, one reference probe has been removed and two replaced, in addition several probe lengths have been adjusted. This P143

More information

Chapter 15: Darwin and Evolution

Chapter 15: Darwin and Evolution Chapter 15: Darwin and Evolution AP Curriculum Alignment Big Idea 1 is about evolution. Charles Darwin is called the father of evolution because his theory of natural selection explains how evolution occurs.

More information

Gene expression in prokaryotic and eukaryotic cells, Plasmids: types, maintenance and functions. Mitesh Shrestha

Gene expression in prokaryotic and eukaryotic cells, Plasmids: types, maintenance and functions. Mitesh Shrestha Gene expression in prokaryotic and eukaryotic cells, Plasmids: types, maintenance and functions. Mitesh Shrestha Plasmids 1. Extrachromosomal DNA, usually circular-parasite 2. Usually encode ancillary

More information

Biology 105/Summer Bacterial Genetics 8/12/ Bacterial Genomes p Gene Transfer Mechanisms in Bacteria p.

Biology 105/Summer Bacterial Genetics 8/12/ Bacterial Genomes p Gene Transfer Mechanisms in Bacteria p. READING: 14.2 Bacterial Genomes p. 481 14.3 Gene Transfer Mechanisms in Bacteria p. 486 Suggested Problems: 1, 7, 13, 14, 15, 20, 22 BACTERIAL GENETICS AND GENOMICS We still consider the E. coli genome

More information

Journal Club Kairi Raime

Journal Club Kairi Raime Journal Club 21.01.15 Kairi Raime Articles: Zeros, E. (2013). Biparental Inheritance Through Uniparental Transmission: The Doubly Inheritance (DUI) of Mitochondrial DNA. Evolutionary Biology, 40:1-31.

More information

MOLECULAR ANALYSIS OF JAPANESE ANISAKIS SIMPLEX WORMS

MOLECULAR ANALYSIS OF JAPANESE ANISAKIS SIMPLEX WORMS MOLECULAR ANALYSIS OF JAPANESE ANISAKIS SIMPLEX WORMS Azusa Umehara 1, 2, Yasushi Kawakami 2, Jun Araki 3, Akihiko Uchida 2 and Hiromu Sugiyama 1 1 Department of Parasitology, National Institute of Infectious

More information

Darwin's theory of natural selection, its rivals, and cells. Week 3 (finish ch 2 and start ch 3)

Darwin's theory of natural selection, its rivals, and cells. Week 3 (finish ch 2 and start ch 3) Darwin's theory of natural selection, its rivals, and cells Week 3 (finish ch 2 and start ch 3) 1 Historical context Discovery of the new world -new observations challenged long-held views -exposure to

More information

Biology II : Embedded Inquiry

Biology II : Embedded Inquiry Biology II : Embedded Inquiry Conceptual Strand Understandings about scientific inquiry and the ability to conduct inquiry are essential for living in the 21 st century. Guiding Question What tools, skills,

More information

Big Idea 1: The process of evolution drives the diversity and unity of life.

Big Idea 1: The process of evolution drives the diversity and unity of life. Big Idea 1: The process of evolution drives the diversity and unity of life. understanding 1.A: Change in the genetic makeup of a population over time is evolution. 1.A.1: Natural selection is a major

More information

Characteristics of Life

Characteristics of Life UNIT 2 BIODIVERSITY Chapter 4- Patterns of Life Biology 2201 Characteristics of Life All living things share some basic characteristics: 1) living things are organized systems made up of one or more cells

More information

Levels of genetic variation for a single gene, multiple genes or an entire genome

Levels of genetic variation for a single gene, multiple genes or an entire genome From previous lectures: binomial and multinomial probabilities Hardy-Weinberg equilibrium and testing HW proportions (statistical tests) estimation of genotype & allele frequencies within population maximum

More information

GENETICS - CLUTCH CH.1 INTRODUCTION TO GENETICS.

GENETICS - CLUTCH CH.1 INTRODUCTION TO GENETICS. !! www.clutchprep.com CONCEPT: HISTORY OF GENETICS The earliest use of genetics was through of plants and animals (8000-1000 B.C.) Selective breeding (artificial selection) is the process of breeding organisms

More information

Inheritance part 1 AnswerIT

Inheritance part 1 AnswerIT Inheritance part 1 AnswerIT 1. What is a gamete? A cell with half the number of chromosomes of the parent cell. 2. Name the male and female gametes in a) a human b) a daisy plant a) Male = sperm Female

More information

The Impact of NGS in New Zealand Biosecurity

The Impact of NGS in New Zealand Biosecurity NGS Workshop, Bari, Italy, 22 November 2017 The Impact of NGS in New Zealand Biosecurity Bénédicte Lebas Senior Scientist, Post-entry Quarantine Team, Plant Health and Environment Laboratory, Diagnostic

More information

AP Curriculum Framework with Learning Objectives

AP Curriculum Framework with Learning Objectives Big Ideas Big Idea 1: The process of evolution drives the diversity and unity of life. AP Curriculum Framework with Learning Objectives Understanding 1.A: Change in the genetic makeup of a population over

More information

Systematics - BIO 615

Systematics - BIO 615 ICZN UPDATE Several issues now confronting the zoological community make desirable the development of a 5th edition of the International Code of Zoological Nomenclature (Code). Prime among them are: 1)

More information

Texas Biology Standards Review. Houghton Mifflin Harcourt Publishing Company 26 A T

Texas Biology Standards Review. Houghton Mifflin Harcourt Publishing Company 26 A T 2.B.6. 1 Which of the following statements best describes the structure of DN? wo strands of proteins are held together by sugar molecules, nitrogen bases, and phosphate groups. B wo strands composed of

More information

CHAPTER : Prokaryotic Genetics

CHAPTER : Prokaryotic Genetics CHAPTER 13.3 13.5: Prokaryotic Genetics 1. Most bacteria are not pathogenic. Identify several important roles they play in the ecosystem and human culture. 2. How do variations arise in bacteria considering

More information

AQA Biology A-level. relationships between organisms. Notes.

AQA Biology A-level. relationships between organisms. Notes. AQA Biology A-level Topic 4: Genetic information, variation and relationships between organisms Notes DNA, genes and chromosomes Both DNA and RNA carry information, for instance DNA holds genetic information

More information

What Organelle Makes Proteins According To The Instructions Given By Dna

What Organelle Makes Proteins According To The Instructions Given By Dna What Organelle Makes Proteins According To The Instructions Given By Dna This is because it contains the information needed to make proteins. assemble enzymes and other proteins according to the directions

More information

BME 5742 Biosystems Modeling and Control

BME 5742 Biosystems Modeling and Control BME 5742 Biosystems Modeling and Control Lecture 24 Unregulated Gene Expression Model Dr. Zvi Roth (FAU) 1 The genetic material inside a cell, encoded in its DNA, governs the response of a cell to various

More information

BIO 111: Biological Diversity and Evolution

BIO 111: Biological Diversity and Evolution BIO 111: Biological Diversity and Evolution Varsha 2017 Ullasa Kodandaramaiah & Hema Somanathan School of Biology MODULE: BIODIVERSITY AND CONSERVATION BIOLOGY Part I - FUNDAMENTAL CONCEPTS OF BIODIVERSITY

More information

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: m Eukaryotic mrna processing Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: Cap structure a modified guanine base is added to the 5 end. Poly-A tail

More information

Graduate Funding Information Center

Graduate Funding Information Center Graduate Funding Information Center UNC-Chapel Hill, The Graduate School Graduate Student Proposal Sponsor: Program Title: NESCent Graduate Fellowship Department: Biology Funding Type: Fellowship Year:

More information

GSBHSRSBRSRRk IZTI/^Q. LlML. I Iv^O IV I I I FROM GENES TO GENOMES ^^^H*" ^^^^J*^ ill! BQPIP. illt. goidbkc. itip31. li4»twlil FIFTH EDITION

GSBHSRSBRSRRk IZTI/^Q. LlML. I Iv^O IV I I I FROM GENES TO GENOMES ^^^H* ^^^^J*^ ill! BQPIP. illt. goidbkc. itip31. li4»twlil FIFTH EDITION FIFTH EDITION IV I ^HHk ^ttm IZTI/^Q i I II MPHBBMWBBIHB '-llwmpbi^hbwm^^pfc ' GSBHSRSBRSRRk LlML I I \l 1MB ^HP'^^MMMP" jflp^^^^^^^^st I Iv^O FROM GENES TO GENOMES %^MiM^PM^^MWi99Mi$9i0^^ ^^^^^^^^^^^^^V^^^fii^^t^i^^^^^

More information

1. In most cases, genes code for and it is that

1. In most cases, genes code for and it is that Name Chapter 10 Reading Guide From DNA to Protein: Gene Expression Concept 10.1 Genetics Shows That Genes Code for Proteins 1. In most cases, genes code for and it is that determine. 2. Describe what Garrod

More information