Phylogeny and evolution of papillomaviruses based on the E1 and E2 proteins


Virus Genes (2007) 34: DOI /s

Ignacio G. Bravo · Ángel Alonso

Received: 14 March 2006 / Accepted: 9 June 2006 / Published online: 22 August 2006
© Springer Science+Business Media, LLC 2006

I. G. Bravo (✉) · Á. Alonso
Deutsches Krebsforschungszentrum (F050), Im Neuenheimer Feld-242, Heidelberg, Germany
i.bravo@dkfz.de

Abstract

Papillomaviridae are a family of small double-stranded DNA viruses that infect stratified squamous epithelia in vertebrates. Some members of this family are causative agents of malignant tumours, such as cervical cancer, while others are associated with benign proliferative lesions. To date, papillomaviruses (PVs) have been classified according to sequence identity in the capsid gene L1. However, evidence has accumulated indicating a discontinuity in the evolutionary history of the L1 and L2 genes of many PVs, giving rise to differences between the phylogenetic reconstructions of the early and of the late genes. Neither the oncogenes E5, E6 and E7 nor the upstream regulatory region are suitable for phylogenetic inference, owing to their poor conservation across the Papillomaviridae family. We have analysed here the evolutionary relationships of the PVs with respect to the E1 and E2 proteins, and the results reflect both the phylogeny and the biological behaviour of the viruses. The hierarchical taxonomic relationships can be structured as an alternative classification system in which mucosal high-risk viruses, mucosal low-risk viruses and viruses associated with cutaneous lesions are grouped separately and do not appear intermingled. Some important trends are also observed: first, the evolution of the PVs has not been homogeneous, even among viruses that infect the same host; and second, mucosal human PVs have evolved faster than their cutaneous counterparts. The evolutionary analysis based on the E1 and E2 proteins will allow us to better understand the generation of the diversity of the PVs and the development of malignancy associated with these viruses.

Keywords Papillomavirus · Evolution · Phylogeny · Classification · Virus-host coevolution

Introduction

PVs are small, non-enveloped viruses with circular dsDNA that infect stratified squamous epithelia in warm-blooded vertebrates [62]. The genome is ca. 8 kb in length, and comprises an upstream regulatory region (URR) that harbours transcription factor-binding sites and controls gene expression, a cluster of genes involved in the initial destabilisation of the host cell (e.g. E4, E5, E6 and E7) and in genome replication (e.g. E1 and E2), and a cluster of late genes that encode the capsid proteins (e.g. L1 and L2) [14, 32]. The conserved blocks present in all PVs are the URR, the replicative proteins E1 and E2 (and possibly the E4 gene nested into E2), and the capsid proteins L1 and L2 [18]. The evolutionary reconstruction of Papillomaviridae can only be addressed with regard to the conserved elements. This excludes the oncogenes E5, E6 and E7, which have high divergence rates and are not present in all PVs [4, 9, 18]. These proteins might, however, be useful for determining phylogenetic relationships between closely related viruses [58]. Not all of the conserved elements in the PV genome are suitable for phylogenetic inference. This is the case for the URR, which is present in all PVs but highly heterogeneous. It is believed that most URRs contain similar factor binding sites [53] although particular differences

might account for differential tropism and/or malignancy [22, 48, 60]. Furthermore, the structure of the DNA sequence in the URR does not allow a proper phylogenetic reconstruction, because the relative positions of the conserved transcription factor binding sites are shuffled and because the URR has a higher evolutionary rate than the rest of the genome [18, 19, 38].

The sequence of the L1 ORF has been chosen as the yardstick for establishing the presently accepted PV classification [12]. This choice has historical roots, since the only information available for many PV isolates was (and in many cases still is) a short sequence stretch within the L1 ORF, amplified with broad-spectrum primers such as SPF [29], GP5+/6+ [11, 25], MY09/11 [33] or FAP59/64 [17]. However, it has become evident in the last few years that the capsid genes are not suitable markers for the evolution of Papillomaviridae. The topology of the phylogenetic trees for the capsid proteins differs from that of the trees for the early genes [4, 12, 18, 23, 24, 36, 50]. Thus mucosal high-risk and low-risk viruses are not separately resolved on the basis of the capsid genes [4, 12, 24, 36], whereas on the basis of the early genes they clearly arise from different ancestors [4, 18, 23, 36, 50]. We have suggested that a discontinuity took place during the evolutionary history of the alpha PVs, possibly in the form of a recombination event in an ancestor of this taxon [18]. Although the exact nature of this event deserves further investigation, it prevents the use of the capsid genes as the sole reference for inferring phylogenies and establishing classifications within Papillomaviridae [4, 18, 36].

We have addressed the evolutionary reconstruction of Papillomaviridae on the basis of the E1 and E2 proteins. We show here that this approach renders a sharp categorisation of the PVs, grouping together viruses with similar biology.
The parallelism between phylogenetic categories and the characteristics of the viruses, together with the natural appearance of evolutionary categories, allows us to propose a complementary classification of the PVs. Our data present quantitative divergence rates comparable to those of the hosts. Our results also provide information about differential evolutionary rates within the family, with certain branches having evolved faster than others.

Material and methods

Protein sequences. Taxonomic diversity

All available PV full-genome sequences were retrieved either from the Los Alamos HPV Sequence Database or from the public databases at EMBL. For all viruses, the amino acid sequences of the E1 and E2 proteins were retrieved and concatenated for further analysis. At present, most of the complete PV sequences are human PVs belonging to the current alpha, beta and gamma genera [12]. To avoid overrepresentation of these taxa, the phylogenetic analysis was performed in three steps. An initial phylogenetic reconstruction was performed with the whole set of sequences. All PVs that clustered confidently within the current alpha, beta and gamma genera (98 sequences) were then identified and extracted. A detailed phylogenetic analysis was then performed with this sequence subset to measure evolutionary distances. The results of this analysis allowed us to choose a representative selection of 60 genomes that covered the sequence diversity present in the original subset. These selected genomes were then combined with the rest of the original genomes not included in the intermediate step, generating a set of 83 PV genomes. A final detailed phylogenetic analysis was again performed on these sequences to measure evolutionary distances and to reconstruct evolutionary relationships. Thus, the final sequence set comprised all the non-human PVs and a phylogenetically representative selection of the human PVs.
Phylogenetic analysis

All algorithms were run in the HUSAR environment of the bioinformatics facility of the Deutsches Krebsforschungszentrum. Concatenated E1-E2 protein sequences were used for phylogenetic inference. Three alignment algorithms were used: T-COFFEE [37], which combines information from both global and local homologies; CLUSTALW [21], a progressive alignment algorithm; and DIALIGN [34], a local segment alignment algorithm. For distance estimation, the raw outputs of the alignments were fed into the PHYLIP programme package (freely distributed by Dr. Felsenstein) [16], and distances were measured with PROTDIST under the PAM250 amino acid substitution matrix. The analysis rendered three values for each paired distance, one for each alignment algorithm. The median of these three values was chosen as the central estimate and used for further calculations. The reconstruction of the phylogenetic relationships can only be accurately performed using conserved positions in the alignments [8]. Conserved positions in the three original alignments were defined with the GBLOCKS software using non-stringent conditions (freely distributed by Dr. Castresana at molevol.ibmb.csic.es) [8]. The output of GBLOCKS

was a refined alignment that included 42–45% of the original positions, depending on the input alignment used. Phylogeny was estimated by the parsimony method with PROTPARS (PARS) and by distance matrices with PROTDIST. Distance matrices were then analysed with FITCH (FM), which estimates phylogenies from distance matrix data under the additive tree model, according to which the distances are expected to equal the sums of branch lengths between the species, using the Fitch-Margoliash criterion. Additionally, the distance matrix was analysed with NEIGHBOR, under both the Neighbor-Joining (NJ) and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) methods of clustering. The statistical support was assessed by 1,000 cycles of bootstrapping with SEQBOOT. The trees obtained from the different alignments with the same phylogenetic algorithm (PARS, FM, NJ or UPGMA) were combined, and a consensus tree was computed for each algorithm with CONSENSE. Thus, 3 different alignments were analysed with 4 different phylogenetic methods, yielding 12 different estimates of the phylogeny of the E1-E2 PV sequences, which were combined into 4 final consensus trees.

Statistical analysis

Differences between data distributions were tested for significance with the Kolmogorov-Smirnov test. Composite frequency distributions were approximated by deconvolution into Gaussian distributions. Deconvolution was performed with the SIMFIT software (freely distributed by Dr. Bardsley).

Results

The E1-E2 protein sequences allow the papillomaviruses to be classified into seven high-level taxa

We have reconstructed the phylogeny of Papillomaviridae according to the concatenated E1 and E2 protein sequences.
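The distance estimate described in the methods (one PROTDIST value per alignment algorithm for each pair of sequences, with the median of the three taken as the central estimate) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names `median_distances` and `pair` are hypothetical, and the distance values are invented for the example.

```python
from statistics import median


def pair(a, b):
    """Order-independent key for a pair of sequence names."""
    return frozenset((a, b))


def median_distances(tables):
    """Combine several pairwise-distance tables into one.

    Each table maps a pair key to a distance; the result holds, for every
    pair, the median of the values found in the individual tables.
    """
    pairs = set(tables[0])
    for table in tables[1:]:
        if set(table) != pairs:
            raise ValueError("all tables must cover the same sequence pairs")
    return {p: median(t[p] for t in tables) for p in pairs}


# Invented distances (substitutions per site), one table per alignment method.
tcoffee = {pair("HPV16", "HPV31"): 0.30,
           pair("HPV16", "HPV18"): 0.62,
           pair("HPV31", "HPV18"): 0.65}
clustalw = {pair("HPV16", "HPV31"): 0.34,
            pair("HPV16", "HPV18"): 0.60,
            pair("HPV31", "HPV18"): 0.66}
dialign = {pair("HPV16", "HPV31"): 0.31,
           pair("HPV16", "HPV18"): 0.70,
           pair("HPV31", "HPV18"): 0.61}

combined = median_distances([tcoffee, clustalw, dialign])
```

With three input tables the median is simply the middle value, so a single outlying alignment cannot shift the estimate for a pair.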
We have chosen to work with protein sequences instead of DNA sequences because of the large evolutionary distances between distant members of the family (see below); these large distances suggest that the third position of the codons is well saturated and less informative [39]. In addition, PVs show a marked codon usage bias that differs from the codon usage preferences of the host's genes and is not homogeneous along the viral genome [5, 51, 61]. Finally, the sequence set included PVs that infect different hosts, and different PVs might target different populations of keratinocytes within the same host [19]; both factors might lead to different codon usage preferences among the PV sequences. The most important step for a proper phylogenetic reconstruction is to generate a good alignment [13, 56]. In order to minimise the initial bias, we have used in parallel three different multiple alignment algorithms (T-COFFEE, CLUSTALW and DIALIGN), which have different strengths and weaknesses [30]. Since the phylogeny can only be properly reconstructed when informative positions in the alignment are used, poorly aligned sequences and divergent regions were removed from each of the three alignments using the GBLOCKS software. This algorithm identifies segments that may not be homologous or may have been saturated by multiple substitutions and that should be eliminated prior to phylogenetic analysis [8]. The three filtered alignments were subsequently used for phylogenetic reconstruction with the PHYLIP package, using four different bootstrapped methods. The consensus dendrogram highlighting the main taxa is depicted in Fig. 1. The dendrogram with the bootstrap support for the different nodes is depicted in Fig. 2, and the derived taxonomic classification is given in Table 1. According to the E1 and E2 sequences, 97% of the PVs can be classified into seven well-defined high-order taxa with deep branches (Fig. 1), designated here as supergenera.
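The idea of discarding poorly aligned columns before tree building can be illustrated with a deliberately simplified filter. It is not the GBLOCKS algorithm, which applies further criteria to blocks of contiguous positions; here a column is kept only if it has no gaps and its most frequent residue reaches a minimum fraction. The function names, the threshold and the toy alignment are assumptions made for the example.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=0.5, gap="-"):
    """Return indices of gap-free columns whose most common residue
    occurs in at least `min_fraction` of the sequences."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    kept = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        if gap in column:
            continue  # simplified rule: drop every column containing a gap
        top_count = Counter(column).most_common(1)[0][1]
        if top_count / len(column) >= min_fraction:
            kept.append(i)
    return kept


def filter_alignment(alignment, columns):
    """Keep only the selected columns of every aligned sequence."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Toy alignment, invented for illustration only.
toy = ["MKT-LLA", "MKTALLG", "MRTALIS", "MKS-LLT"]
kept = conserved_columns(toy)
filtered = filter_alignment(toy, kept)
```

In this toy case the gapped fourth column and the fully variable last column are removed, leaving five of the seven positions.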
The large evolutionary distances between supergenera prevent us from establishing further relationships among them, or even from discerning whether they are monophyletic or paraphyletic. Only in the case of supergenus B and supergenus C is it possible to infer that they might descend from a common ancestor (Fig. 2) [18]. Three sequences (EcPV, infecting horse; MnPV, infecting African soft-furred rat; and TmPV, infecting manatee) did not show a consistent close relationship to any of the seven defined supergenera (Figs. 1, 2). Since it is now evident that a thorough search for PVs in different vertebrates will lead to an exponential increase in the number of PV sequences [20, 43–46, 54, 55], we have decided not to define an individual supergenus for each of these sequences. It would be reasonable to wait for the publication of new sequences that might either confidently broaden the supergenera defined here or define new high-order taxa. Two of the seven supergenera include only two members: HPV41 and EdPV, infecting porcupine, on the one hand, and FcPV, infecting chaffinch, and

PePV, infecting grey parrot, on the other hand.

Fig. 1 Consensus dendrogram of Papillomaviridae based on the concatenated E1-E2 proteins. The dendrogram gathers the consensus phylogenetic reconstruction after combining three alignment algorithms and four different phylogenetic inference algorithms, and bootstrapping each combination for 1,000 cycles. Branches with high bootstrap support (above 750/1000) are depicted as continuous lines, and those with low bootstrap support (below 750/1000) as discontinuous lines. Papillomaviridae can be confidently classified into two subfamilies (Mammalian papillomavirinae and Avian papillomavirinae) and seven supergenera. Human PVs are identified by their corresponding numbers. PV types belonging to the same species are depicted together at the same tree tip, and PV subtypes are indicated in parentheses. As an example, HPV69 is a subtype of HPV26 and HPV82 is a subtype of HPV51; the four of them belong to the same PV species, and these four lineages are represented by a single branch in the tree.
[Figure: dendrogram with clades labelled Mammalian PV A, B, C, D, E and F, and Avian PV; tip labels not reproduced]
These four viruses are the most divergent with respect to the rest of the family members, but are clearly separated from each other: the distances between HPV41 and EdPV and the rest of the PVs (mean 1.43 substitutions per site, 95% confidence interval of the mean through 1.449) are statistically significantly shorter than the distances between PePV and FcPV and the rest of the PVs (mean 1.80 substitutions per site, 95% confidence interval of the mean, through 1.821; P < ). We have therefore defined a high-level split of Papillomaviridae into two subfamilies: (i) Mammalian papillomavirinae (Mammalian PV), encompassing six supergenera and the three as yet orphan viruses, and (ii) Avian papillomavirinae (Avian PV), encompassing one supergenus (Fig. 2).

Supergenera within Mammalian papillomavirinae include viruses that infect distant hosts

It has recently been suggested that some of the high-level taxa of the present PVs belong to paralogous lineages, whereas others might have shared a common ancestor [18, 24]. The classification of the PVs according to the L1 ORF, however, does not resolve deep nodes within the family [12]. As shown in Fig. 2, the phylogenetic reconstruction of Papillomaviridae based on the E1 and E2 proteins provided a finer resolution of the intermediate-deep nodes, defining viral supergenera that encompass evolutionarily related viral genera, thus clustering together viruses that infect distant hosts (Table 1).
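The methods state that differences between distance distributions were assessed with the Kolmogorov-Smirnov test. A minimal sketch of the two-sample version is given below, assuming the standard asymptotic approximation for the P value; the source does not say which implementation was used, and the two samples are invented stand-ins for sets of paired distances. The function names are hypothetical.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: the largest vertical gap between the
    empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic P value for D (series approximation with a small-sample
    correction factor); treat it as approximate for small samples."""
    effective = n * m / (n + m)
    lam = (math.sqrt(effective) + 0.12 + 0.11 / math.sqrt(effective)) * d
    if lam < 0.3:
        return 1.0  # the series converges poorly here and P is ~1 anyway
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Invented samples of paired distances (substitutions per site).
group_a = [1.40 + 0.005 * k for k in range(20)]
group_b = [1.78 + 0.005 * k for k in range(20)]
d_value = ks_statistic(group_a, group_b)
p_value = ks_pvalue(d_value, len(group_a), len(group_b))
```

Because the two invented samples do not overlap, D takes its maximum value of 1 and the approximate P value falls far below 0.001.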

Mammalian PV supergenus A (Mammalian PV A) comprises viruses infecting Primates and Cetartiodactyla, and is discussed below. Mammalian PV B comprises two genera. Genus 1 contains viruses causing cutaneous lesions in humans. Some of them are associated with epidermodysplasia verruciformis, e.g. human PV types 5, 8, 9, 12, 14, 15, 17 and 19–25, and some of them are possibly involved in non-melanoma skin cancer, e.g. human PV types 5, 8, 14, 17, 20 or 47 [41]. Genus 2 includes three formerly orphan viruses that infect cattle, namely BPV3, BPV4 and BPV6. Mammalian PV C comprises a single genus with four subgenera, infecting humans (subgenera a and b), dogs (CaPV2, subgenus c) and hamsters (HaOPV, subgenus d). Mammalian PV D consists of a single genus and defines three subgenera, all of them causing fibropapillomas in ruminants. Mammalian PV E includes three genera, with viruses causing lesions in humans (genus 1), carnivores (genus 2), and rabbits (genus 3). Finally, Mammalian PV F includes two genera with viruses infecting humans (HPV41, genus 1) or porcupine (EdPV, genus 2) (Table 1).

The fine classification of Mammalian PV A correlates phylogenetic patterns with differences in biological behaviour

Besides the seven high-level taxa described above, the taxonomic reconstruction of Papillomaviridae allowed us to define different conserved low-level nodes (Fig. 2, Table 1). For well-represented supergenera it was possible to define taxa at the level of genus, subgenus and species. Mammalian PV A confidently encompasses two distantly related genera: genus 1, which corresponds to the alpha papillomaviruses defined according to the L1 ORF sequence [12], and genus 2, with two virus species, PsPV [6] and TtPV2 [47], which cause genital lesions in porpoise and dolphin, respectively. Viruses in genus 1 infect primates and can be divided into three well-defined subgenera that correspond to three clusters of viruses with different biology.
Viruses grouped under Mammalian PV supergenus A, genus 1, subgenus a (A1a) infect Hominidae and Cercopithecidae. This subgenus includes all high-risk human PVs (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 68, 73 and 82), which are causative agents of cervical cancer [10, 35], all putative high-risk human PVs (26, 53 and 66), and other PVs closely related to them (26, 30, 34, 67, 69, 70 and 85). These PVs can be classified according to their phylogeny and the relative evolutionary distances into six species, as described in Table 1. Additionally, a PV that causes mucosal genital lesions in rhesus monkey, RHPV1, also belongs to this subgenus, although it is evolutionarily far from the other members and therefore constitutes a species by itself. The distances between members included in the same category are given in Table 2. Mammalian PV A1b comprises viruses that infect Hominidae, exhibit mucosal tropism, and cause genital warts [35]. The E1-E2 sequences in this subgenus are highly conserved [19], especially in the species that comprises viruses infecting humans, chimpanzees and bonobos. This high degree of conservation is also evident in the E5, E6, E7, L1 and L2 ORFs [4, 12, 57, 58]. In this case, therefore, the barrier delineating the species specificity of the close relatives HPV13, CPV and PCPV, infecting humans, chimpanzees and bonobos, respectively, might be cultural rather than biological. Evolutionary distances between viruses infecting chimpanzees, and their closest human relative, HPV13, lie in the interval of , and the distance between PCPV and CPV is 0.13 substitutions per site. These figures are consistent with evolutionary distances between humans and chimpanzees according to cytochrome b with substitutions per site [7]. The agreement between both results validates the approach we have followed here. Mammalian PV A1c comprises PVs infecting humans and causing cutaneous lesions such as flat warts, some of them affecting mainly children.
Others appear in wart-like lesions in butchers and fishers [28, 49]. Finally, Mammalian PV A1d comprises a single virus, namely HPV54. This virus was isolated from a patient with condylomata acuminata, but has also been isolated from patients with pterygium, a degeneration of Bowman's membrane of the cornea [42]. HPV54 branched together with Mammalian PV A1c in 44% of the PARS and in 60% of the NJ consensus trees, and together with Mammalian PV A1a in 52% of the FM and in 53% of the UPGMA consensus trees. However, HPV54 does not encode any E5-like protein, either similar to the E5c and E5d present in Mammalian PV A1c or similar to the E5a present in Mammalian PV A1a [4]. Taking both facts into account, together with the consistency of its distances to the rest of the genus members, we have decided to define a subgenus with HPV54 as its sole member for the time being.

The evolutionary distances between papillomaviruses are distributed in natural taxonomic categories

We have measured the evolutionary distances for the concatenated E1 and E2 protein sequences in the final sequence set, comprising all animal PVs and a phylogenetically representative selection of human PVs. The distribution of the paired distances is displayed in Fig. 3. Distances cluster into five clearly defined groups, and the overall distribution can be deconvoluted into the sum of five Gaussian distributions. The mean values and the variance of the approximated Gaussian distributions are given in Fig. 3 in substitutions per site. We have addressed the analysis of the distance

distributions within the different categories below supergenus, as defined with the help of the phylogenetic analysis.

Fig. 2 Phylogenetic consensus phenogram of Papillomaviridae based on the concatenated E1-E2 proteins. Concatenated E1-E2 sequences were aligned with T-COFFEE, DIALIGN and CLUSTALW, and informative positions were filtered with GBLOCKS. Phylogenetic relationships were computed with the PHYLIP package, using the parsimony method PROTPARS (PARS) or one of three distance-matrix methods: Fitch-Margoliash (FM), Neighbor-Joining (NJ) and UPGMA. Processes were bootstrapped 1,000 times. The trees computed with the same phylogenetic method from the different alignments were combined and the consensus tree was calculated with CONSENSE. Numbers at the nodes refer to the percentage of times that the node appears in the PARS/FM/NJ/UPGMA consensus trees, respectively. An asterisk means that the corresponding node shows a bootstrap support above 95% in all four tested methods. High and low degrees of node reliability are indicated as full circles and open circles, respectively. PVs can be classified into two subfamilies (Mammalian papillomavirinae and Avian papillomavirinae) and seven supergenera. Only three individual PVs cannot be confidently assigned: MnPV, infecting African soft-furred rat; TmPV, infecting manatee; and EcPV, infecting horse. Within Mammalian PV A1, the phylogenetic analysis sharply differentiates between mucosal high-risk PVs (Mammalian PV A1a), mucosal low-risk viruses (Mammalian PV A1b) and closely related viruses causing cutaneous warts (Mammalian PV A1c).

The evolutionary distances between different taxa within Papillomaviridae are not homogeneous

PVs are at present classified according to the identity of their L1 ORFs [12]. According to the distribution of the paired DNA distances in this gene, a clear-cut classification criterion for Papillomaviridae has been proposed [12].
As described above, five natural clusters of paired distances appear according to the E1-E2 protein sequences (Fig. 3). However, it is known that in living organisms taxonomic ranks are not always equivalent with respect to genetic distance [27]. We have therefore analysed the distribution of the genetic distances in the different well-defined taxa within Papillomaviridae. The simplified questions that we want to address are: Is a genus in Mammalian PV A equivalent to a genus in Mammalian PV B? Are there universal values in the genetic distances between PVs that we can use to blindly discern between different taxonomic categories? The results are displayed in Figs. 4 and 5. Only Mammalian PV A and B are well represented in the genomic databases. It is therefore possible to compare intra-subfamily distances (Fig. 4a), but comparison of intra-genus distances is more restricted (Fig. 4b) and a thorough comparison of distances across the different taxonomic categories is only possible for Mammalian PV A and B (Fig. 4c). The distributions of the paired distances in Mammalian PV A and Mammalian PV B are shown in Figs. 5a and b, respectively. Paired distances cluster naturally into five categories, which correspond to genera, subgenera, species, types and subtypes. The calculated mean values and standard deviations of the deconvoluted Gaussian curves are included in Fig. 5. The observed values for the paired distances between members in the same category are given in Table 2. Distances between genera in the same supergenus are not homogeneous in Papillomaviridae (Fig. 4a). Distances within Mammalian PV B, C, D and E are equivalent, but distances within Mammalian PV A are significantly larger (P < 0.001). For Mammalian PV F and Avian PV A the available data do not suffice to carry out a statistical analysis.
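The description of the distance distribution as a sum of Gaussian curves can be made concrete with a small sketch. It uses the four (μ, σ) pairs that are fully legible in the Fig. 3 panel; the fifth component is left out because its σ is not legible in the source. Two assumptions are made here and are not taken from the paper: σ is treated as a standard deviation, and all components are given equal weight because the fitted amplitudes are not reported. The function names are hypothetical.

```python
import math

# (mean, sigma) in substitutions per site, as legible in the Fig. 3 panel.
# Assumption: sigma is a standard deviation. The fifth component
# (subfamilies within the same family) is omitted: its sigma is missing.
FIG3_COMPONENTS = {
    "types within the same species": (0.34, 0.20),
    "species within the same subgenus": (0.697, 0.074),
    "subgenera within the same genus": (0.935, 0.079),
    "genera within the same supergenus and supergenera within the same subfamily": (1.29, 0.19),
}


def gaussian(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(x, components, weights=None):
    """Weighted sum of Gaussian densities; equal weights are an assumption."""
    if weights is None:
        weights = {name: 1.0 / len(components) for name in components}
    return sum(weights[name] * gaussian(x, mu, sigma)
               for name, (mu, sigma) in components.items())


def most_likely_category(x, components):
    """Name of the component with the highest density at distance x."""
    return max(components, key=lambda name: gaussian(x, *components[name]))


category = most_likely_category(0.70, FIG3_COMPONENTS)
```

Under these assumptions a paired distance of 0.70 substitutions per site falls in the "species within the same subgenus" category; with the true fitted amplitudes the boundaries between categories would shift somewhat.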
Larger distances within Mammalian PV A are not artefacts resulting from the inclusion of the porpoise and dolphin PVs, PsPV and TtPV2, in this supergenus, since the same results are obtained when analysing other taxonomic categories (Fig. 4b, c). Thus, distances between species in the same subgenus are significantly shorter in Mammalian PV B1 than in Mammalian PV A1, C1 and D1 (Fig. 4b). Moreover, when measured across different taxonomic categories, distances in Mammalian PV B are significantly shorter than in Mammalian PV A (Fig. 4c). Distances coincide only for the very recent categories, i.e. variants, with distances below 0.1 substitutions per site for the E1-E2 protein sequences. It is evident from the distributions of the distances of Mammalian PV A and B, and from the corresponding deconvolution into Gaussian curves depicted in Fig. 5, that the pattern of taxonomic categories is the same in both supergenera, but that distances are systematically shifted to lower values in Mammalian PV B.

Discussion

The conceptual problems regarding viral taxonomy arise first, from an unlikely common origin from a single protoviral ancestor; second, from the asexual nature of virus reproduction; and third, from horizontal gene transmission, which might give rise to viral genomes containing non-monophyletic genes [31, 40, 59]. Describing viruses is therefore easier than classifying them. The description of the virus comprises the virion components, the genes and proteins encoded in the genetic material, and their functions. The functions of the viral components arise exclusively in the living context of a viral infection. Therefore, the International Committee on Taxonomy of Viruses (ICTV)

recommends inclusion of both genotypic and phenotypic information in the definition of viral genera and species, i.e. the ecological niche and the relational properties of viral and host components [59]. In this sense, the ICTV guidelines for viral species demarcation define viral species as a polythetic class of viruses that constitute a replicating lineage and occupy a particular ecological niche [59]. A polythetic class is defined considering a broad set of common properties or characteristics. None of these characteristics is strictly necessary, nor can any be used alone for defining the polythetic species, i.e. for describing its presence in all the members of the species and its absence in the members of any other species [59]. However, the present classification of Papillomaviridae is based exclusively on the nucleotide identity in the capsid L1 ORFs [12]. In addition, the application of the rolling-circle amplification technique for the cloning of new circular viruses [26] has greatly increased the number of fully sequenced PVs in different hosts [43–46]. The application of the PV clade definition to the newly described PVs has led to a growing number of genera that comprise single species, and it is not clear how the present nomenclature using the Greek alphabet can easily accommodate many potential new genera [3].

Table 1 Classification of papillomaviruses according to the phylogenetic relationships between the E1 and E2 protein sequences

Subfamily Mammalian Papillomavirinae, supergenus A

Genus  Subgenus  Species  Type / subtype / variant
1      a         1        HPV16 HPV31 HPV35
1      a         2        HPV18 HPV45 HPV39 HPV70 HPV59 HPV68 HPV85
1      a         3        HPV33 HPV58 HPV52 HPV67
1      a         4        HPV34 HPV73
1      a         5        HPV30 HPV53 HPV56 HPV66
1      a         6        HPV26 HPV69 HPV51 HPV82
1      a         7        RHPV1
1      b         1        HPV6 HPV11 HPV13 HPV74 PCPV CPV HPV44 HPV55
1      b         2        HPV7 HPV40
1      b         3        HPV32 HPV42
1      b         4        HPV43 HPV91
1      c         1        HPV2 HPV27 HPV57
1      c         2        HPV61 HPV72 HPV81 HPV83 HPV84 HPV86 HPV87 HPV89
1      c         3        HPV3 HPV10 HPV28 HPV29 HPV77 HPV94
1      c         4        HPV71 HPV90
1      d         1        HPV54
2      a         1        PsPV
2      a         2        TtPV2
The results presented here describe the phylogenetic relationships and evolution of Papillomaviridae, using the E1 and E2 proteins as markers (Fig. 1). Our results provide criteria for a fine categorisation of the PVs, and allow the definition of high-level taxa, i.e. subfamilies and supergenera, and of low-level taxa, i.e. subtypes and variants (Table 1). The classification proposed in this study complements the currently accepted one, also reflects differences in the biology of the infection caused by PVs, and parallels the epidemiologic

classification of PVs associated with cervical cancer [10, 35]. Moreover, we have shown that the different PV supergenera have different rates of molecular evolution, and that there is no simple universal yardstick for establishing standard evolutionary categories (Table 2, Figs. 3–5).

Table 1 continued

Subfamily Mammalian Papillomavirinae

Supergenus  Genus  Subgenus  Species  Type / subtype / variant
B           1      a         1        HPV5 HPV36 HPV8 HPV12 HPVRTR HPV14 HPV20 HPV21 HPV19 HPV25
B           1      a         2        HPV24 HPV93
B           1      b         1        HPV9 HPV15 HPV80 HPV17 HPV37
B           1      b         2        HPV22 HPV23 HPV38
B           1      b         3        HPV49 HPV75 HPV76
B           1      c         1        HPV92
B           1      c         2        HPV96
B           2      a         1        BPV3 BPV4 BPV6
B           2      b         1        ChPV1
C           1      a         1        HPV4 HPV65 HPV95
C           1      a         2        HPV48 HPV50
C           1      a         3        HPV60
C           1      b         1        CaPV2
C           1      c         1        HaOPV
D           1      a         1        BPV1 BPV2
D           1      b         1        OPV1 OPV2
D           1      b         2        EEPV DPV RPV
D           1      c         1        BPV5
E           1      a         1        HPV1
E           1      a         2        HPV63
E           2      a         1        FdPV
E           2      a         2        COPV
E           2      a         3        PlPV
E           3      a         1        CRPV ROPVb
E           3      a         2        ROPV
F           1      a         1        HPV41
F           2      a         1        EdPV

Subfamily Avian Papillomavirinae

Supergenus  Genus  Subgenus  Species  Type / subtype / variant
A           1      a         1        FcPV
A           1      b         1        PePV

Not classifiable yet: MnPV TmPV EcPV

The widely accepted classification of the PVs is performed according to the nucleotide identity in the L1 ORF [12]. In the last few years, however, evidence has accumulated showing inconsistencies in the phylogenies inferred for the capsid genes L1 and L2. The topologies of the phylogenetic trees for the early

10 258 Virus Genes (2007) 34: Table 2 Paired evolutionary distances within Papillomaviruses belonging to the same category Distance between variants in the same type Distance between subtypes in the same type Distance between types in the same species Distance between species in the same subgenus Distance between subgenera in the same genus Distance between genera in the same supergenus Mammalian PV A 1.25 ± 0.05 (n = 58) ± (n = 1101) ± (n = 440) ± (n = 63) ± (n = 45) ± (n =4) Mammalian PV B ± (n = 79) ± (n = 191) ± (n = 61) ± (n = 57) ± (n = 8) ± (n =2) Mammalian PV C ± (n = 29) ± (n = 11) ± (n = 2) ± (n = 4) (n =1) Mammalian PV D ± (n = 16) ± (n = 6) ± (n = 3) ± (n =2) Mammalian PV E ± (n = 21) ± (n = 6) (n =1) Mammalian PV F 1.41 (n =1) Avian PV 1.08 (n =1) Values are given as median ± standard error of the median. The number of paired distances in a given group are indicated in parentheses proteins E6, E7, E1 and E2 are equivalent, but differ from the trees for the late proteins L1 and L2 [4, 18, 23, 24, 36, 50]. Additionally, the phylogenetic distribution of PVs that infect mucosal epithelia in humans does not correlate with the standard phylogeny according to L1 [4, 18, 36, 50]. Thus, mucosal high-risk viruses appear together with mucosal low-risk viruses and with viruses that cause cutaneous lesions. On the contrary, the phylogeny according to the early genes clusters together viruses with similar epidemiology [2, 4, 18, 36, 58]. Finally, alpha PVs can encode for four different conserved E5-like proteins. The differential presence of one of these E5-like proteins follows the phylogeny according to the rest of the early genes and also correlates with the differential association with cervical cancer [4]. We have proposed that these inconsistencies arise from a discontinuity in the evolutionary history of the L1 and L2 genes that does not allow their use as standards for phylogenetic studies [18]. 
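Table 2 summarises each group of paired distances as median ± standard error of the median. The transcription does not say which estimator was used for that standard error, so the following is only a minimal sketch under an assumed large-sample normal approximation, SE(median) ≈ sqrt(π/2) · s / sqrt(n). The function name and the example distances are hypothetical and are not taken from the paper.

```python
import math
from statistics import median, stdev


def median_with_standard_error(distances):
    """Return (median, approximate standard error of the median).

    Assumption: the normal-approximation formula below is one common
    estimator; the paper's own estimator is not stated.
    """
    if len(distances) < 2:
        raise ValueError("at least two paired distances are needed")
    # SE(median) ~ sqrt(pi / 2) * sample standard deviation / sqrt(n)
    se = math.sqrt(math.pi / 2) * stdev(distances) / math.sqrt(len(distances))
    return median(distances), se


# Hypothetical paired distances (substitutions per site), illustration only.
example = [0.41, 0.43, 0.40, 0.45, 0.42]
med, se = median_with_standard_error(example)
print(f"{med:.3f} ± {se:.3f} (n = {len(example)})")
```

Groups with a single paired distance have no standard error under this approximation, which is why the sketch rejects them.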
For these reasons, an additional marker is necessary for studying the phylogeny and the evolution of PVs. The early proteins E5, E6 and E7 are not appropriate markers since they are not present in all PVs and have very high evolutionary rates [18]. The URR shows only limited local similarities in short sequence stretches, and has diverged even more rapidly than the early oncogenes [18, 38]. Our results here show that the concatenated E1-E2 protein sequences can also be incorporated as a suitable evolutionary standard for Papillomaviridae. Using three different alignments and four different phylogenetic algorithms, our results show that the paired distances in PVs can be naturally grouped into eight categories: subfamily, supergenus, genus, subgenus, species, type, subtype and variant (Fig. 3). We have classified PVs into two subfamilies, Mammalian PV and Avian PV, according to both phylogenetic congruence and distribution of evolutionary distances. Mammalian PV encompasses seven categories designated as supergenera, following common taxonomic nomenclature in other organisms.

The ICTV recognises the following viral taxa: Order, Family, Subfamily, Genus and Species [15]. Other categories from clade to superfamily may communicate useful descriptive information [...] but have no formally recognised taxonomic meaning, [...] such as the extensively used quasi-species [15]. The use of the terms supergenus and subgenus in the proposed operational taxonomy based on E1 and E2 is therefore justified, since they communicate useful descriptive information. In this case they refer to a monophyletic cluster of genera and to a monophyletic cluster of species, respectively. Besides, it allows us to maintain the extensively used term papillomavirus type in an appropriate taxonomic context, following the example of the presently accepted classification based on L1 [12]. The introduction of additional taxonomic categories is also found in the presently accepted classification of the PVs, which uses the term species after having justified it with similar arguments [12]. Due to the importance of taxonomy and nomenclature [3], we have tried to maintain a balance between the risks of splitting in excess and of amalgamating viruses that are different [27]. Therefore we have used quantitative distance criteria to avoid subjectivity, and have followed the natural distribution of evolutionary distances (Figs. 3–5).

[Fig. 3 plot: number of paired distances against evolutionary distance (substitutions per site). Fitted curves: types within the same species, μ=0.34, σ=0.20; species within the same subgenus, μ=0.697, σ=7.4e-2; subgenera within the same genus, μ=0.935, σ=7.9e-2; genera within the same supergenus and supergenera within the same subfamily, μ=1.29, σ=0.19; subfamilies within the same family, μ=1.81, σ= ]

Fig. 3 Distribution of the evolutionary distances within Papillomaviridae, and deconvolution into Gaussian curves. Paired evolutionary distances expressed as substitutions per site (white circles) were calculated as the median of the three distances of the concatenated E1-E2 processed with PROTDIST after alignment with the algorithms T-COFFEE, DIALIGN and CLUSTALW. The distances distribute into natural categories, depicted as Gaussian curves (continuous lines). The mean values (μ) and the variances (σ) of the calculated curves are provided. Data distribution can be approximated as the sum of the calculated Gaussian curves (discontinuous line)
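As an illustration of how quantitative distance criteria of this kind could be applied, the sketch below assigns a paired distance to the Gaussian component of Fig. 3 under which it is most likely. It rests on several assumptions that are not in the paper: the σ values are treated as standard deviations (the caption calls them variances), all components are given equal weight because the fitted amplitudes are not available, and the component at μ=1.81 is left out because its σ is missing from the transcription. The function and dictionary names are hypothetical.

```python
import math

# (mean, sigma) pairs as transcribed from Fig. 3.
# Assumption: sigma is used here as a standard deviation.
# The "subfamilies within the same family" component (mu = 1.81) is
# omitted because its sigma is not available.
COMPONENTS = {
    "types within the same species": (0.34, 0.20),
    "species within the same subgenus": (0.697, 0.074),
    "subgenera within the same genus": (0.935, 0.079),
    "genera within the same supergenus / supergenera within the same subfamily": (1.29, 0.19),
}


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def most_likely_category(distance):
    """Return the component with the highest density at this distance.

    Assumption: equal component weights.
    """
    return max(
        COMPONENTS,
        key=lambda name: gaussian_pdf(distance, *COMPONENTS[name]),
    )


for d in (0.20, 0.70, 0.93, 1.30):
    print(d, "->", most_likely_category(d))
```

With fitted amplitudes available, each density would be multiplied by its component weight before taking the maximum.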
We have addressed not only the reconstruction of the phylogenetic relationships within PVs, but have also compared the evolutionary rates in different viral supergenera. We have previously shown that different genes within the same virus have different evolutionary rates, with E5, E6 and E7 evolving in general faster than E1 and E2, and much faster than L1 and L2 [4, 18, 19]. Our results here also demonstrate that different PVs have evolved at different rates. Evolutionary distances in Mammalian PV B are consistently shorter than in Mammalian PV A for all taxonomic categories studied, except for the very recent virus variants (Fig. 3). Most of the members in both PV supergenera infect humans, and the taxonomic levels have been defined homogeneously. Therefore, the differences in evolutionary distances indicate that either Mammalian PV B are younger than Mammalian PV A or that both supergenera differ in their molecular evolutionary rate [27]. Our results, however, do not support the hypothesis of Mammalian PV B being younger than Mammalian PV A.

Fig. 4 Comparative analysis of the evolutionary distances within Papillomaviridae, based on the E1 and E2 proteins. The paired evolutionary distances between PVs, expressed as substitutions per site, were analysed with the phylogenetic relationship as a guideline. Numbers in parentheses show the number of paired distances in a given group. (a) Paired distances between viruses that belong to different genera within the same supergenus, e.g. Mammalian PV B1a HPV8 and Mammalian PV B2a BPV4. Median values are indicated with bars encompassing the standard error of the median. (b) Paired distances between viruses that belong to different species within the same subgenus, e.g. Mammalian PV A1a1 HPV16 and Mammalian PV A1a2 HPV18. Median values are indicated with bars encompassing the standard error of the median. Dash-dotted lines in Mammalian PV A and B mark the corresponding expected values after deconvolution of the distance distribution, and 0.502, respectively. The distances within a given genus are homogeneous, but different in different genera. (c) Comparison of the evolutionary distances across taxonomic categories for Mammalian PV A and Mammalian PV B. Median values are indicated with bars encompassing the 95% confidence interval of the median. Mammalian PV B show systematically lower evolutionary distances than the same category in Mammalian PV A above the intravariant variation level (P < 0.01, **), although many of the members of both viral supergenera have the same host, i.e. humans

[Fig. 5 plots: number of paired distances against evolutionary distance (substitutions per site). (a) Distances within supergenus A: subtypes and variants within the same type, μ=0.228, σ=4.8e-2; types within the same species, μ=0.425, σ=5.5e-2; species within the same subgenus, μ=0.698, σ=6.7e-2; subgenera within the same genus, μ=0.891, σ=6.3e-2; genera within the same supergenus, μ=1.38, σ=8.7e. (b) Distances within supergenus B: subtypes and variants within the same type, μ=0.170, σ=4.6e-2; types within the same species, μ=0.304, σ=4.8e-2; species within the same subgenus, μ=0.502, σ=5.0e-2; subgenera within the same genus, μ=0.665, σ=3.8e-2; genera within the same supergenus, μ=1.01, σ=5.6e]

Fig. 5 Distribution of the evolutionary distances within Mammalian PV A and B, and deconvolution into Gaussian curves. Paired evolutionary distances expressed as substitutions per site (white circles) were calculated as the median of the three distances of the concatenated E1-E2 processed with PROTDIST after alignment with the algorithms T-COFFEE, DIALIGN and CLUSTALW. The distances distribute into natural categories, depicted as Gaussian curves (continuous lines). The mean values (μ) and the variances (σ) of the calculated curves are provided. Data distribution can be approximated as the sum of the calculated Gaussian curves (discontinuous line). The trends in both supergenera are the same, but evolutionary distances are systematically shifted to lower values in Mammalian PV B
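The captions of Figs. 3 and 5 describe each paired distance as the median of three distances, one per alignment (T-COFFEE, DIALIGN and CLUSTALW), each computed with PROTDIST. A minimal sketch of that combination step is given below. It assumes the three distance matrices have already been parsed into square lists of floats with the same taxon order; the function name and the numbers are hypothetical and only illustrate the element-wise median.

```python
from statistics import median


def median_distance_matrix(matrices):
    """Element-wise median of several square distance matrices.

    Assumption: all matrices share the same size and taxon order.
    """
    n = len(matrices[0])
    if any(len(m) != n or any(len(row) != n for row in m) for m in matrices):
        raise ValueError("all matrices must be square and of equal size")
    return [
        [median(m[i][j] for m in matrices) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical distances (substitutions per site) for two taxa,
# one matrix per alignment method.
tcoffee = [[0.0, 0.30], [0.30, 0.0]]
dialign = [[0.0, 0.34], [0.34, 0.0]]
clustalw = [[0.0, 0.32], [0.32, 0.0]]

combined = median_distance_matrix([tcoffee, dialign, clustalw])
print(combined)
```

Taking the median rather than the mean keeps a single outlying alignment from dominating the combined distance.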
Mammalian PV B contains viruses that infect humans, dogs and hamsters. A similar taxonomic diversity in the host range is also observed in other supergenera, i.e. Mammalian PV A, E and F. The topology of the tree according to E1 and E2 (Fig. 1) therefore supports the hypothesis of a primordial split event that generated the ancestors of Avian PV and Mammalian PV, and a subsequent radiative event that gave rise to the ancestors of supergenera A, B (and probably C), D, E and F [18]. There is therefore no evidence of a more recent appearance of supergenus B compared to the rest of the supergenera in Mammalian PV. Regarding putative differences in molecular evolutionary rate, we have proved that the evolutionary distances for E1 and E2 within subspecies in Mammalian PV A1b1 that infect different hosts (HPV13 and HPV74 in human, CPV in chimpanzee and PCPV in bonobo) are consistent with the evolutionary distances for the cytochrome b sequences in the host species [7]. Since PVs use the replicative machinery of the host, genetic drift is expected to occur at the same rate in both viruses and hosts [52]. The congruence between the evolutionary distances also reflects this recent co-speciation of viruses and hosts, after the putative initial radiative event in Papillomaviridae [18]. The genetic drift in the recent branches also seems to be the same in Mammalian PV A and B, as seen in Fig. 4c for the distances between PV variants, ca. 0.1 substitutions per site in both cases. It seems therefore that additional factors control the evolution of Mammalian PV B as compared to Mammalian PV A. A possible explanation could be an enhanced evolutionary rate in supergenus A driven by a stricter immune surveillance. Evidence in this sense is the apparent commensal relationship of Mammalian PV B and their hosts, these viruses being always a part of the microbial epidermal flora [1]. 
Also in concordance with this hypothesis, the capsid proteins L1 and L2 and the oncoproteins E5, E6 and E7 have diverged faster in the mucosal high-risk viruses of Mammalian PV A1a than in the mucosal low-risk viruses of Mammalian PV A1b [4], and this tendency is also observed for the E1 and E2 proteins (Fig. 4b). Additional explanations might include differences in the biology of the proteins between different taxa, and subsequent differences in the evolutionary pressures on these genes [27].

The present sample of fully sequenced PV genomes is biased in the sense that more than two-thirds of them have been isolated from humans. However, it is still evident that some deep branches in Papillomaviridae are not homologous but paralogous [18, 24]. This means, for instance, that PVs infecting closely related or even the same hosts do not seem to share a recent last common ancestor, if any. This is clearly shown in the dendrogram in Fig. 1 and in the phenogram in Fig. 2 for PVs infecting humans, such as HPV16, HPV5, HPV4, HPV1 or HPV41; for PVs infecting cattle, such as BPV1 or BPV3; for PVs infecting dogs, such as COPV and CaPV2; or for PVs infecting rodents, such as MnPV, HaOPV or EdPV. Since the rolling circle amplification technique seems to be a promising tool for expanding our knowledge of circular dsDNA viruses [26, 45], the quest for new PVs should aim at maximising the phylogenetic information in order to clarify some of the many still unclear points regarding PV evolution. In this sense, open questions arise about the position of EcPV, MnPV and TmPV within the whole Papillomaviridae family. Furthermore, large phylogenetic gaps remain to be filled between Mammalian PV A1 and A2; between Mammalian PV B1 and B2; between Mammalian PV C1a, C1b and C1c; and between Mammalian PV E1, E2 and E3. An integrative question to be addressed is whether the putative relationship between Mammalian PV B and C can be proved to be true. If so, we could directly compare not only sequences but also biological activities between orthologous proteins from distant viruses that infect the same host, e.g. HPV5 and HPV4. So far, we cannot be sure whether proteins with the same name in distant viruses such as HPV16 and HPV5 shared a common ancestor. Most of our inferences and extrapolations about protein function may therefore be misleading and need to be substantiated [18].

Finally, the discussed discontinuity in the evolution of the capsid genes in Mammalian PV A1 might be extremely informative [4, 24, 36]. The event that led to the anomalous phylogeny of L1 and L2 predated the divergence of Mammalian PV A1 into the four subgenera a, b, c and d. This radiative event occurred in a short period of time, as evidenced by the short evolutionary distances between the last common ancestor of the genus and each of the last common ancestors of the four subgenera (Fig. 1). At present, viruses in these subgenera differ dramatically in their tropism and malignancy potential [10, 35]. 
Identifying the event that led to the discontinuity in the history of the L1 and L2 genes in this particular genus might help us understand the historical emergence of malignant transformations.

Conclusion

The concatenated E1 and E2 protein sequences are suitable markers for reconstructing the phylogenetic relationships in Papillomaviridae. They render a complementary taxonomy that groups together viruses with similar biology, while making clear the evolutionary gaps to be filled and the open questions still to be answered. The genetic distances based on E1 and E2 are consistent with the hypothesis of a two-step evolution process of the PVs, and are congruent with the evolutionary distances in the hosts. Finally, different PVs have evolved at different rates, with Mammalian PV B having diverged less than Mammalian PV A. It is not our aim to establish any sort of nomenclatural precedent concerning a change in the presently accepted classification of PVs, but rather to further stimulate a taxonomic debate. The findings on the incongruence of the evolutionary histories of early and late genes in PVs [4, 24, 36] highlight the limitations and the risks of choosing individual genes for supporting a viral classification, especially in the case of the small PV genomes. The results communicated here also exemplify that it is possible to gain phylogenetic resolution in the deep nodes by incorporating different regions of the PV genome in the analysis, i.e. the E1 and E2 genes. Additional resolution close to the tips of the branches could be achieved by including the highly divergent genes E6 and E7 and eventually the URR within the analysed region. Understanding the forces that drove the differential evolution of the PVs might prove to be of clinical importance. 
If we can explain how PVs have evolved, then we could also explain the differences between mucosal and cutaneous PVs, and we may be able to answer what makes a high-risk PV different from a low-risk PV.

Acknowledgements

The authors wish to thank Lutz Gissmann, Kerstin Leykauf, Michael Pawlita and Tim Waterboer for their comments and improvements to an initial draft of the manuscript.

References

1. A. Antonsson, B.G. Hansson, J. Virol. 76, (2002)
2. P.C. Babbit, J.A. Gerlt, J. Biol. Chem. 272, (1997)
3. H.U. Bernard, J. Clin. Virol. 32S, S1–S6 (2005)
4. I.G. Bravo, A. Alonso, J. Virol. 78, (2004)
5. I.G. Bravo, M. Müller, Papillomavirus Rep. 16, 1–9 (2005)
6. P. Cassonet, M. van Bressem, C. Desaintes, G. Orth, Papillomaviruses cause genital warts in small cetaceans from Peru (1998) GenBank Sequence #AJ
7. J. Castresana, Mol. Biol. Evol. 18, (2001)
8. J. Castresana, Mol. Biol. Evol. 17, (2000)
9. Z. Chen, M. Terai, L. Fu, R. Herrero, R. DeSalle, R.D. Burk, J. Virol. 79, (2005)
10. G.M. Clifford, J.S. Smith, M. Plummer, N. Munoz, S. Franceschi, Br. J. Cancer 88, (2003)
11. A.M. de Roda Husman, J.M. Walboomers, A.J. van den Brule, C.J. Meijer, P.J. Snijders, J. Gen. Virol. 76(Pt 4), (1995)


More information

Using phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)

Using phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression) Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression) "A comparison of the structures

More information

Macroevolution Part I: Phylogenies

Macroevolution Part I: Phylogenies Macroevolution Part I: Phylogenies Taxonomy Classification originated with Carolus Linnaeus in the 18 th century. Based on structural (outward and inward) similarities Hierarchal scheme, the largest most

More information

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF EVOLUTION Evolution is a process through which variation in individuals makes it more likely for them to survive and reproduce There are principles to the theory

More information

Homology and Information Gathering and Domain Annotation for Proteins

Homology and Information Gathering and Domain Annotation for Proteins Homology and Information Gathering and Domain Annotation for Proteins Outline Homology Information Gathering for Proteins Domain Annotation for Proteins Examples and exercises The concept of homology The

More information

Phylogeny and the Tree of Life

Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

a-dB. Code assigned:

a-dB. Code assigned: This form should be used for all taxonomic proposals. Please complete all those modules that are applicable (and then delete the unwanted sections). For guidance, see the notes written in blue and the

More information

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms

More information

Homology. and. Information Gathering and Domain Annotation for Proteins

Homology. and. Information Gathering and Domain Annotation for Proteins Homology and Information Gathering and Domain Annotation for Proteins Outline WHAT IS HOMOLOGY? HOW TO GATHER KNOWN PROTEIN INFORMATION? HOW TO ANNOTATE PROTEIN DOMAINS? EXAMPLES AND EXERCISES Homology

More information

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution Taxonomy Content Why Taxonomy? How to determine & classify a species Domains versus Kingdoms Phylogeny and evolution Why Taxonomy? Classification Arrangement in groups or taxa (taxon = group) Nomenclature

More information

Understanding relationship between homologous sequences

Understanding relationship between homologous sequences Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

More information

AP Biology Exam #7 (PRACTICE) Subunit #7: Diversity of Life

AP Biology Exam #7 (PRACTICE) Subunit #7: Diversity of Life AP Biology Exam #7 (PRACTICE) Subunit #7: Diversity of Life Multiple Choice Questions: Choose the best answer then bubble your answer on your scantron sheet. 1. Armadillos and spiny anteaters are not related.

More information

Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B

Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B Microbial Diversity and Assessment (II) Spring, 007 Guangyi Wang, Ph.D. POST03B guangyi@hawaii.edu http://www.soest.hawaii.edu/marinefungi/ocn403webpage.htm General introduction and overview Taxonomy [Greek

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

MiGA: The Microbial Genome Atlas

MiGA: The Microbial Genome Atlas December 12 th 2017 MiGA: The Microbial Genome Atlas Jim Cole Center for Microbial Ecology Dept. of Plant, Soil & Microbial Sciences Michigan State University East Lansing, Michigan U.S.A. Where I m From

More information

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics: Homework Assignment, Evolutionary Systems Biology, Spring 2009. Homework Part I: Phylogenetics: Introduction. The objective of this assignment is to understand the basics of phylogenetic relationships

More information

Phylogeny and the Tree of Life

Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

Outline. Classification of Living Things

Outline. Classification of Living Things Outline Classification of Living Things Chapter 20 Mader: Biology 8th Ed. Taxonomy Binomial System Species Identification Classification Categories Phylogenetic Trees Tracing Phylogeny Cladistic Systematics

More information

Quantifying sequence similarity

Quantifying sequence similarity Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity

More information

Symmetric Tree, ClustalW. Divergence x 0.5 Divergence x 1 Divergence x 2. Alignment length

Symmetric Tree, ClustalW. Divergence x 0.5 Divergence x 1 Divergence x 2. Alignment length ONLINE APPENDIX Talavera, G., and Castresana, J. (). Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Systematic Biology, -. Symmetric

More information

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

More information

PHYLOGENY & THE TREE OF LIFE

PHYLOGENY & THE TREE OF LIFE PHYLOGENY & THE TREE OF LIFE PREFACE In this powerpoint we learn how biologists distinguish and categorize the millions of species on earth. Early we looked at the process of evolution here we look at

More information

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern

More information

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata.

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Supplementary Note S2 Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Phylogenetic trees reconstructed by a variety of methods from either single-copy orthologous loci (Class

More information

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis 10 December 2012 - Corrections - Exercise 1 Non-vertebrate chordates generally possess 2 homologs, vertebrates 3 or more gene copies; a Drosophila

More information

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

More information

SPECIATION. REPRODUCTIVE BARRIERS PREZYGOTIC: Barriers that prevent fertilization. Habitat isolation Populations can t get together

SPECIATION. REPRODUCTIVE BARRIERS PREZYGOTIC: Barriers that prevent fertilization. Habitat isolation Populations can t get together SPECIATION Origin of new species=speciation -Process by which one species splits into two or more species, accounts for both the unity and diversity of life SPECIES BIOLOGICAL CONCEPT Population or groups

More information

Non-independence in Statistical Tests for Discrete Cross-species Data

Non-independence in Statistical Tests for Discrete Cross-species Data J. theor. Biol. (1997) 188, 507514 Non-independence in Statistical Tests for Discrete Cross-species Data ALAN GRAFEN* AND MARK RIDLEY * St. John s College, Oxford OX1 3JP, and the Department of Zoology,

More information

A (short) introduction to phylogenetics

A (short) introduction to phylogenetics A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field

More information

Phylogenetic Trees. How do the changes in gene sequences allow us to reconstruct the evolutionary relationships between related species?

Phylogenetic Trees. How do the changes in gene sequences allow us to reconstruct the evolutionary relationships between related species? Why? Phylogenetic Trees How do the changes in gene sequences allow us to reconstruct the evolutionary relationships between related species? The saying Don t judge a book by its cover. could be applied

More information

A. Incorrect! In the binomial naming convention the Kingdom is not part of the name.

A. Incorrect! In the binomial naming convention the Kingdom is not part of the name. Microbiology Problem Drill 08: Classification of Microorganisms No. 1 of 10 1. In the binomial system of naming which term is always written in lowercase? (A) Kingdom (B) Domain (C) Genus (D) Specific

More information

ESTIMATION OF CONSERVATISM OF CHARACTERS BY CONSTANCY WITHIN BIOLOGICAL POPULATIONS

ESTIMATION OF CONSERVATISM OF CHARACTERS BY CONSTANCY WITHIN BIOLOGICAL POPULATIONS ESTIMATION OF CONSERVATISM OF CHARACTERS BY CONSTANCY WITHIN BIOLOGICAL POPULATIONS JAMES S. FARRIS Museum of Zoology, The University of Michigan, Ann Arbor Accepted March 30, 1966 The concept of conservatism

More information

Consensus Methods. * You are only responsible for the first two

Consensus Methods. * You are only responsible for the first two Consensus Trees * consensus trees reconcile clades from different trees * consensus is a conservative estimate of phylogeny that emphasizes points of agreement * philosophy: agreement among data sets is

More information

Reconstructing the history of lineages

Reconstructing the history of lineages Reconstructing the history of lineages Class outline Systematics Phylogenetic systematics Phylogenetic trees and maps Class outline Definitions Systematics Phylogenetic systematics/cladistics Systematics

More information

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2011 University of California, Berkeley

PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION Integrative Biology 200B Spring 2011 University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2011 University of California, Berkeley B.D. Mishler March 31, 2011. Reticulation,"Phylogeography," and Population Biology:

More information

--Therefore, congruence among all postulated homologies provides a test of any single character in question [the central epistemological advance].

--Therefore, congruence among all postulated homologies provides a test of any single character in question [the central epistemological advance]. Integrative Biology 200A "PRINCIPLES OF PHYLOGENETICS" Spring 2008 University of California, Berkeley B.D. Mishler Jan. 29, 2008. The Hennig Principle: Homology, Synapomorphy, Rooting issues The fundamental

More information

Phylogenetics: Building Phylogenetic Trees

Phylogenetics: Building Phylogenetic Trees 1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

More information

Introduction to characters and parsimony analysis

Introduction to characters and parsimony analysis Introduction to characters and parsimony analysis Genetic Relationships Genetic relationships exist between individuals within populations These include ancestordescendent relationships and more indirect

More information

Modern Evolutionary Classification. Section 18-2 pgs

Modern Evolutionary Classification. Section 18-2 pgs Modern Evolutionary Classification Section 18-2 pgs 451-455 Modern Evolutionary Classification In a sense, organisms determine who belongs to their species by choosing with whom they will mate. Taxonomic

More information

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Bio 1B Lecture Outline (please print and bring along) Fall, 2007 Bio 1B Lecture Outline (please print and bring along) Fall, 2007 B.D. Mishler, Dept. of Integrative Biology 2-6810, bmishler@berkeley.edu Evolution lecture #5 -- Molecular genetics and molecular evolution

More information

BINF6201/8201. Molecular phylogenetic methods

BINF6201/8201. Molecular phylogenetic methods BINF60/80 Molecular phylogenetic methods 0-7-06 Phylogenetics Ø According to the evolutionary theory, all life forms on this planet are related to one another by descent. Ø Traditionally, phylogenetics

More information

A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family

A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family Jieming Shen 1,2 and Hugh B. Nicholas, Jr. 3 1 Bioengineering and Bioinformatics Summer

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S3 (box) Methods Methods Genome weighting The currently available collection of archaeal and bacterial genomes has a highly biased distribution of isolates across taxa. For example,

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

Biology 2. Lecture Material. For. Macroevolution. Systematics

Biology 2. Lecture Material. For. Macroevolution. Systematics Biology 2 Macroevolution & Systematics 1 Biology 2 Lecture Material For Macroevolution & Systematics Biology 2 Macroevolution & Systematics 2 Microevolution: Biological Species: Two Patterns of Evolutionary

More information

Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family

Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family Review: Gene Families Gene Families part 2 03 327/727 Lecture 8 What is a Case study: ian globin genes Gene trees and how they differ from species trees Homology, orthology, and paralogy Last tuesday 1

More information

Effects of Gap Open and Gap Extension Penalties

Effects of Gap Open and Gap Extension Penalties Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See

More information

Phylogenetic methods in molecular systematics

Phylogenetic methods in molecular systematics Phylogenetic methods in molecular systematics Niklas Wahlberg Stockholm University Acknowledgement Many of the slides in this lecture series modified from slides by others www.dbbm.fiocruz.br/james/lectures.html

More information

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods

More information

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley

PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION Integrative Biology 200B Spring 2009 University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley B.D. Mishler Jan. 22, 2009. Trees I. Summary of previous lecture: Hennigian

More information

Chapter 26 Phylogeny and the Tree of Life

Chapter 26 Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life Biologists estimate that there are about 5 to 100 million species of organisms living on Earth today. Evidence from morphological, biochemical, and gene sequence

More information