Introduction to Phylogenetic Networks


Daniel H. Huson
GCB 2006, Tübingen, September 19

Contents
1. Phylogenetic trees
2. Consensus networks and super networks
3. Hybridization and reticulate networks
4. Recombination networks
5. Other

Overview of Existing Concepts
Phylogenetic networks comprise:
  Phylogenetic trees
  Splits networks
    Median networks (from sequences)
    Consensus (super) networks (from trees)
    Split decomposition, Neighbor-net (from distances)
  Reticulate networks
    Hybridization networks
    Special case: galled trees
    Recombination networks
    Ancestor recombination graphs
  Other types of phylogenetic networks
    Augmented trees
    Any graph representing evolutionary data

Two Different Kinds of Networks
Implicit networks: the splits networks (median networks, consensus (super) networks, split decomposition, Neighbor-net).

Two Different Kinds of Networks
Explicit networks: the reticulate networks (hybridization networks, recombination networks, ancestor recombination graphs).

Part I: Phylogenetic trees

Phylogenetic Trees
The evolution of species is usually described by a phylogenetic tree.
[Figures: tree sketches by Charles Darwin and Ernst Haeckel (Tree of Life)]

Phylogenetic Trees
Let X = {x_1, ..., x_n} denote a set of taxa. A phylogenetic tree T (or X-tree) is given by labeling the leaves of a tree by the set X.
Example taxa: Cow, Fin Whale, Blue Whale, Harbor Seal, Rat, Mouse, Chimp, Human, Gorilla.
Taxa + tree → phylogenetic tree

Unrooted vs Rooted Trees
Unrooted tree: most popular methods produce unrooted trees.
Rooted tree (here rooted using Chicken as outgroup): biologically relevant, defines clades of related taxa.

Branch Lengths
Each branch e of a phylogenetic tree T may be scaled to represent r × t, the rate of evolution r times the time t along e.
[Figure: rooted tree with scaled branches on Blue Whale, Fin Whale, Cow, Seal, Mouse, Rat, Human, Chimp, Gorilla and Chicken; scale bar 0.01]

A Simple Model of Evolution
Sequences evolve along a pre-given tree T, called the evolutionary tree, model tree or true tree.
Two types of events: mutations and speciation events.

A Simple Model of Evolution
[Figure: sequences evolving over time along the evolutionary tree, starting from the sequence of the common ancestor, with mutations along branches and speciation events at nodes]

Tree Reconstruction Problem
[Figure: the sequences at the leaves are given; the evolutionary tree is unknown]

Tree of Life
Based on 16S rRNA (Doolittle, 2000).

Aligned Sequences
A set of taxa X = {x_1, ..., x_n} may be given as an alignment of molecular sequences, e.g.:

Human        fqtpmviilqaimgsatlamtliiftiiiiltvhdtnttvptmitpmllt
Chimp        fqtpmiiifqaimgsatlaltliiftiiviltvhdtntavpttitpmllt
Gorilla      lqtpmviifqaimgsatlamtliiftvimiltvhetnttvptmiapmllt
Harbor Seal  fqlpmviifqaiiggatlalafitftiiifltvhdtdtstlimilsmilt
Cow          fqtpmviifqaiiggatlalalitftiiifmtvhdtdtstltmilsmflt
Fin Whale    lqtfmviifqaimgettlalafitftiaifltvhdtdtsmlltilsmllt
Blue Whale   lqtfmviifqaimgettlvlaiitftiaifltvhdtdtstlltilsmllt
Rat          fqismiiifqaimggatlvlatitfiilvfltvhdtdtstfitiissmat
Mouse        fqismiiifqaimggatlvlatitfiilifltvhdtdtstfitiissmit

Usually obtained from some gene or locus that all taxa have in common.

Tree Reconstruction Problem
Given an alignment that evolved along some evolutionary tree T, can we reconstruct the tree?

Human        fqtpmviilqaimgsatlamtliift
Chimp        fqtpmiiifqaimgsatlaltliift
Gorilla      lqtpmviifqaimgsatlamtliift
Seal         fqlpmviifqaiiggatlalafitft
Cow          fqtpmviifqaiiggatlalalitft
Fin Whale    lqtfmviifqaimgettlalafitft
Blue Whale   lqtfmviifqaimgettlvlaiitft
Rat          fqismiiifqaimggatlvlatitfi
Mouse        fqismiiifqaimggatlvlatitfi
Chicken      pqismiaffqaimggatlfaatitfi

[Figure: rooted tree on Blue Whale, Fin Whale, Seal, Cow, Mouse, Rat, Chimp, Human, Gorilla and Chicken]

Challenges:
1. determine the unrooted topology of T,
2. estimate the branch lengths of T, and
3. infer the position of the root in T.

Tree Reconstruction Methods
Sequence-based methods search for a tree that optimally explains the given sequence data: Maximum Parsimony [10], Maximum Likelihood [11], and Bayesian Inference [26].
Distance-based methods infer a distance matrix and construct a tree from it: UPGMA [45], Neighbor-Joining [43] and its variants Bio-NJ [15] and Weighbor [4].
Tree-based methods infer a tree from a set of trees:
  Consensus tree (e.g. strict, majority, loose)
  Super tree (if trees are defined on overlapping subsets of taxa)

Software
Selection of programs that build phylogenetic trees:
  PAUP* [49], a program for performing phylogenetic analysis using parsimony, maximum likelihood and other methods,
  Phylip [12], a package for phylogenetic inference,
  MrBayes [25], a program for Bayesian inference of trees,
  Mesquite [38], a modular system for evolutionary analysis,
  PAL [9], an object-oriented programming library for molecular evolution and phylogenetics, and
  SplitsTree4 [27,28], an integrated program for estimating phylogenetic trees and networks.
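The first step of a distance-based method, inferring a distance matrix from an alignment, might be sketched as follows. This is only an illustration: it uses the uncorrected fraction of mismatching positions (often called the p-distance), which is an assumption made here and not a choice stated on the slides, and the names p_distance and distance_matrix are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def distance_matrix(alignment):
    """alignment: dict mapping taxon name -> aligned sequence.

    Returns a symmetric dict-of-dicts of pairwise p-distances.
    """
    taxa = list(alignment)
    dist = {t: {u: 0.0 for u in taxa} for t in taxa}
    for t, u in combinations(taxa, 2):
        d = p_distance(alignment[t], alignment[u])
        dist[t][u] = dist[u][t] = d
    return dist


# First ten columns of three rows of the alignment shown earlier.
example = {
    "Rat": "fqismiiifq",
    "Mouse": "fqismiiifq",
    "Human": "fqtpmviilq",
}
D = distance_matrix(example)
```

A tree-building method such as UPGMA or Neighbor-Joining would then take D as its input; real analyses usually apply a model-based correction to the raw mismatch fraction.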

Part II: Consensus networks and super networks

Overview
We will include additional evolutionary events that are not considered in simple tree models.
Fundamental observation: gene trees differ.
How to represent conflicting signals using a consensus network or super network.
Some other methods that use a network to represent conflicting signals.

Additional Evolutionary Events
Models as discussed above represent the evolution of a single gene. When studying more than one gene simultaneously, one must also consider that individual genes may be born, duplicated or lost. Moreover, biological mechanisms such as recombination, hybridization, or horizontal gene transfer may be involved.
But even when the data evolved on a tree, networks can help to understand problems due to sampling or model-specification error.

Gene Trees Can Differ
Consider a model in which the sequence of a gene evolves via mutations, but we also allow gene duplication and loss.
[Figure: a gene duplication on the species tree of x_1, x_2, x_3 yields copies A and B of gene G, so that the gene tree differs from the species tree]

Gene Trees vs Species Trees
Differing gene trees give rise to mosaic sequences.
[Figure: Gene A, Gene B, Gene C, Gene D]

The Consensus of Different Gene Trees
For a given set of species, we can build evolutionary trees based on different genes.
How to form a consensus of the trees?
  Consensus trees
  Consensus networks
  Consensus super networks

The Splits of a Tree
Every edge of a tree defines a split of the taxon set X.
Example: in a tree on x_1, ..., x_8, the edge e defines the split {x_1, x_3, x_4, x_6, x_7} vs {x_2, x_5, x_8}.

The Split Encoding of a Tree
Tree T on the taxa a, b, c, d, e.
Split encoding Σ(T): 5 trivial splits and 2 non-trivial splits.
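The split encoding can be computed by collecting, for every edge, the set of leaves on one side of that edge. The sketch below assumes a tree given as nested tuples of leaf names; this representation, the function names and the particular topology on a, b, c, d, e are illustrative assumptions, since the tree drawn on the slide is not recoverable here.

```python
def leaf_set(tree):
    """Leaves of a tree given as nested tuples of taxon names."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits_of_tree(tree):
    """Return Sigma(T) as a set of unordered pairs {A, B} of taxon sets."""
    taxa = leaf_set(tree)
    splits = set()

    def visit(node):
        below = leaf_set(node)
        # The edge above 'node' separates the leaves below it from the rest.
        if below != taxa:
            splits.add(frozenset([below, taxa - below]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return splits


# Hypothetical topology: cherries (a,b) and (c,d), with e at the centre.
T = (("a", "b"), "e", ("c", "d"))
sigma = splits_of_tree(T)
non_trivial = [s for s in sigma if min(len(side) for side in s) > 1]
```

For this example, sigma contains the five trivial splits (one per leaf) and the two non-trivial splits ab|cde and cd|abe.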

Compatibility
Two splits A_1|B_1 and A_2|B_2 of X are compatible, if ∅ ∈ {A_1 ∩ A_2, A_1 ∩ B_2, B_1 ∩ A_2, B_1 ∩ B_2}, i.e. if at least one of the four intersections is empty.
[Figure: two compatible splits A_1|B_1 and A_2|B_2 of X = {x_1, ..., x_9}]

Compatibility
[Figure: two incompatible splits A_1|B_1 and A_2|B_2 of X = {x_1, ..., x_7}; all four intersections are non-empty]

Compatibility Theorem
A set of splits Σ can be represented by a tree T, if and only if all pairs of splits are compatible.

Representing Incompatible Trees
Consider the following two trees T_1 and T_2, for which the splits are incompatible.
[Figure: T_1 and T_2 on the taxa a, b, c, d, e, with conflicting edges p and q, and the resulting splits network SN(Σ)]
The splits network SN(Σ) represents the incompatible set of splits Σ := Σ(T_1) ∪ Σ(T_2), using bands of parallel edges for incompatible splits.

Consensus of Trees
Given trees T_1, ..., T_k.
Define Σ(p) := {S ∈ Σ_all : |{i : S ∈ Σ(T_i)}| > pk}.
Strict consensus: Σ_strict = Σ*(1/1)
Majority consensus: Σ_maj = Σ(1/2)
In general, Σ(1/(d+1)) defines a set of consensus splits for d ≥ 0.

Consensus of Trees
Six gene trees:
  Σ(1/2): majority consensus, splits contained in more than 50% of trees
  Σ(1/6): splits contained in more than one tree
  Σ(0): splits contained in at least one tree
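The definition of Σ(p) amounts to counting, for each split, the number of input trees that contain it. A small sketch under the assumption that each tree is already given as a set of hashable split labels (the labels S1 to S4 and the six example trees are made up for illustration):

```python
from collections import Counter
from fractions import Fraction


def consensus_splits(split_sets, p):
    """Sigma(p): splits contained in more than p*k of the k input trees."""
    k = len(split_sets)
    counts = Counter(s for splits in split_sets for s in set(splits))
    return {s for s, c in counts.items() if c > p * k}


# Six hypothetical gene trees, each reduced to a set of split labels.
trees = [
    {"S1", "S2"}, {"S1", "S2"}, {"S1", "S3"},
    {"S1", "S2"}, {"S1", "S3"}, {"S1", "S4"},
]
majority = consensus_splits(trees, Fraction(1, 2))      # in more than 3 trees
more_than_one = consensus_splits(trees, Fraction(1, 6))  # in more than 1 tree
any_tree = consensus_splits(trees, 0)                    # in at least 1 tree
```

Fractions are used for p so that the strict inequality is evaluated exactly.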

Consensus Networks
A consensus network [22] is obtained by computing the consensus splits Σ(1/(d+1)) for some value d ≥ 0.
The parameter d determines the maximum dimensionality of the corresponding network: for d=1 the network will be 1-dimensional, i.e. a tree, for d=2 the network may contain parallelograms, and in general it may contain cubes of dimension d.

Consensus of Partial Gene Trees
For a given set of species, we may build evolutionary trees based on many different genes.
But: not every species has every gene, or some sequences may be unavailable.
How to deal with partial trees, i.e. trees that do not mention all species?
Answer: Compute a super-network.

Example of a Super Network (Plants)
[Figure: partial trees for five plant genes and the resulting super network]

Z-Closure Method [29]
Idea: Extend partial splits.
Z-rule: A_1|B_1, A_2|B_2 → A_1|(B_1 ∪ B_2), (A_1 ∪ A_2)|B_2
Repeatedly apply to completion.
Return all full splits.
[Figure: the Z-shaped pattern of overlaps between A_1, A_2, B_1 and B_2]
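A single application of the Z-rule might look as follows. The slide does not spell out when the rule applies; the sketch assumes the condition used in the Z-closure literature, namely that A_1 ∩ A_2, A_2 ∩ B_1 and B_1 ∩ B_2 are non-empty while A_1 ∩ B_2 is empty. The function name and the example splits are illustrative, and the repeated application "to completion" is not shown.

```python
def z_rule(s1, s2):
    """Apply the Z-rule to two partial splits, each a pair (A, B) of
    frozensets, trying every orientation of the two splits.

    Returns the two extended splits, or None if the rule does not apply.
    Assumed condition: A1∩A2, A2∩B1, B1∩B2 non-empty and A1∩B2 empty.
    """
    for a1, b1 in (s1, s1[::-1]):
        for a2, b2 in (s2, s2[::-1]):
            if a1 & a2 and a2 & b1 and b1 & b2 and not (a1 & b2):
                return (a1, b1 | b2), (a1 | a2, b2)
    return None


# Two partial splits: the first misses taxon e, the second misses taxon a.
s1 = (frozenset("ab"), frozenset("cd"))
s2 = (frozenset("bc"), frozenset("de"))
extended = z_rule(s1, s2)

# Two partial splits with no taxa in common cannot be extended.
unrelated = z_rule((frozenset("a"), frozenset("b")),
                   (frozenset("c"), frozenset("d")))
```

In this example both resulting splits, ab|cde and abc|de, are full splits on {a, b, c, d, e}.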

Example
Five fungal trees from (Pryor 2000) and (Pryor 2003).
Trees: ITS (two trees), SSU (two trees), Gpd (one tree).
Numbers of taxa differ: partial trees.

Individual Gene Trees
[Figure: ITS00, 46 taxa]

Individual Gene Trees
[Figure: ITS03, 40 taxa]

Individual Gene Trees
[Figure: SSU00, 29 taxa]

Individual Gene Trees
[Figure: SSU03, 40 taxa]

Individual Gene Trees
[Figure: Gpd03, 40 taxa]

Gene Trees as Super Network
Z-closure: a fast super-network method.

Gene Trees as Super Network
[Figure: ITS00 + ITS]

Gene Trees as Super Network
[Figure: ITS03 + SSU00]

Gene Trees as Super Network
[Figure: ITS00 + ITS00 + SSU]

Gene Trees as Super Network
[Figure: ITS00 + ITS03 + SSU03 + Gpd03]

Gene Trees as Super Network
[Figure: ITS00 + ITS03 + SSU00 + SSU03 + Gpd]

Sequence & Distance-Based Splits Networks
So, incompatible splits arise naturally in the context of multiple trees.
There also exist a number of methods that generate incompatible splits directly from characters (a multiple sequence alignment), or a distance matrix.

Sequences to Split Network
If characters have only 2 states and are not too conflicting: interpret columns as splits and draw the full splits network (median network).
[Figure from: I. Cassens et al., The phylogeography of dusky dolphins: a critical examination of network methods and rooting procedures, Molecular Ecology (2003) 12: 1792]

Split Decomposition
The Split Decomposition method [2] computes a set of weighted X-splits Σ_decomp such that the sum of weights of all splits that separate two taxa x, y ∈ X approximates the distance D(x,y).
It produces a tree whenever the distance matrix fits a tree, and else produces a network that displays different and incompatible signals.

Distances to Split Network
Split Decomposition or Neighbor-Net produces a network from distances.
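The distance represented by a set of weighted splits is the sum of the weights of all splits that separate the two taxa. The sketch below only evaluates this sum for given splits; it does not implement Split Decomposition itself, and the weights are made-up example values.

```python
def split_distance(weighted_splits, x, y):
    """Sum of the weights of all splits (A, B) that separate x and y."""
    return sum(w for (a, b), w in weighted_splits if (x in a) != (y in a))


# Hypothetical weighted splits of X = {a, b, c, d}: four trivial splits
# and the non-trivial split ab|cd.
weighted_splits = [
    (({"a"}, {"b", "c", "d"}), 1.0),
    (({"b"}, {"a", "c", "d"}), 2.0),
    (({"c"}, {"a", "b", "d"}), 1.0),
    (({"d"}, {"a", "b", "c"}), 3.0),
    (({"a", "b"}, {"c", "d"}), 0.5),
]
d_ab = split_distance(weighted_splits, "a", "b")  # 1.0 + 2.0
d_ac = split_distance(weighted_splits, "a", "c")  # 1.0 + 1.0 + 0.5
```

Comparing these values with the input distances D(x,y) shows how well a given split system fits the data.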

Neighbor-Net
Split Decomposition is useful for visualizing conflicting signals in a data set. However, it is sensitive to noise and only has good resolution for small or clean data sets.
The Neighbor-Net [5] method is a hybrid of Neighbor-Joining and Split Decomposition. It is applicable to data sets containing hundreds of taxa. However, it tends to produce spider-webs.

Neighbor-Net
[Figure: splits network computed via Neighbor-Net from distances between human mtDNA sequences]

Software
SplitsTree4 [28] provides implementations of all methods described in this chapter, including a number of different algorithms for constructing networks from splits.
SpectroNet [23] provides an algorithm for constructing a splits network (a special case, namely the median network) and some related methods.

Implicit vs Explicit Networks
Two fundamentally different types of phylogenetic networks:
  Implicit networks aim at displaying incompatible signals. Example: split networks.
  Explicit networks aim at providing an explicit model of reticulate evolution. Example: hybridization and recombination networks.

Part III: Hybridization and reticulate networks

Overview
Hybrid speciation.
A simple model of evolution that incorporates gene trees and reticulation events.
Reticulate networks and some approaches for inferring them from gene trees.
Software.

Hybridization
Occurs when two organisms from different species interbreed and combine their chromosomes.
[Figures: water hemp, hybrid, pigweed. Copyright 2003 University of Illinois]

Speciation by Hybridization 1
In allopolyploidization, two different lineages produce a new species that has the complete nuclear genomes of both parental species [41].

Speciation by Hybridization 1
Two parents X and Y each pass on their whole diploid genomes, with 2n_1 and 2n_2 chromosomes, respectively, to produce a polyploid offspring Z with (2n_1 + 2n_2) chromosomes.
Subsequently, it can happen that the genome reduces to half its size and is then a mosaic of genes from both ancestors.

Speciation by Hybridization 2
In diploid (or homoploid) hybrid speciation, each of the parents produces normal gametes (haploid) to produce a normal diploid hybrid [41].

Speciation by Hybridization 2
Although diploid hybridization is more common, the ability of the hybrid to backcross with the parent species usually prevents a new species from arising.
Although less common, allopolyploidization is believed to produce more new species.
Hybridization is usually restricted to plants, frogs and fish.

Horizontal Gene Transfer
There are a number of known mechanisms by which bacteria can exchange genes:
  Transformation
  Conjugation
  Transduction

A Simple Model of Reticulate Evolution
[Figure sequence: a reticulate network on the taxa b_1, a, h, c, b_3, descending from an ancestral genome, in which h arises from a reticulation with the two parental lineages P and Q]
Tree for gene g_1: the g_1-tree is the P-variant.
Tree for gene g_2: the g_2-tree is the Q-variant.

Reticulate Networks and Trees
The evolutionary history associated with any given gene is a tree.
A network N with k reticulations gives rise to 2^k different gene trees.
[Figure: network N on b_1, a, h, c, b_3 with its P-tree and Q-tree]

Reticulate Networks and Trees
Note, however, that the two choices P_i and Q_i can lead to the same tree topology.
[Figure: reticulation r_i with parents P_i and Q_i on the taxa a, h, b, c]
Here, both induced trees are of the form ((a,h),(b,c)).

Rooted Reticulate Network
Definition: Let X be a set of taxa. A rooted reticulate network N on X is a connected, directed acyclic graph with:
  precisely one node of indegree 0, the root,
  all other nodes are tree nodes of indegree 1, or reticulation nodes of indegree 2,
  every edge is a tree edge joining two tree nodes, or a reticulation edge from a tree node to a reticulation node, and
  the set of leaves consists of tree nodes and is labeled by X.

Rooted Reticulate Network
[Figure: rooted reticulate network on the taxa a, b, c, d, e, f, g, h with reticulation nodes r_1, r_2, r_3]
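Several conditions of this definition can be checked mechanically. The sketch below assumes a network given as a list of directed edges and tests only the following: a unique root, indegree at most 2, acyclicity, and that the leaves are tree nodes labeled exactly by X. It does not check the conditions on edge types, and the function name and example network are illustrative.

```python
from collections import defaultdict, deque


def check_reticulate_network(edges, taxa):
    """Check root, indegree, acyclicity and leaf conditions (see above)."""
    indeg = defaultdict(int)
    children = defaultdict(list)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        indeg[v] += 1
        children[u].append(v)

    roots = [n for n in nodes if indeg[n] == 0]
    if len(roots) != 1:
        return False
    if any(indeg[n] > 2 for n in nodes):
        return False

    # Kahn's algorithm: a directed graph is acyclic exactly when every
    # node can be removed in topological order.
    remaining = {n: indeg[n] for n in nodes}
    queue = deque(roots)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for c in children[n]:
            remaining[c] -= 1
            if remaining[c] == 0:
                queue.append(c)
    if removed != len(nodes):
        return False

    leaves = {n for n in nodes if not children[n]}
    return leaves == set(taxa) and all(indeg[n] == 1 for n in leaves)


# Hypothetical network: h hangs below the reticulation node r,
# whose two parents are u and v.
edges = [("root", "u"), ("root", "v"), ("u", "a"), ("u", "r"),
         ("v", "r"), ("v", "c"), ("r", "h")]
valid = check_reticulate_network(edges, {"a", "h", "c"})
cyclic = check_reticulate_network(edges + [("h", "u")], {"a", "h", "c"})
```

With a single node of indegree 0 in an acyclic graph, every node is reachable from the root, so connectivity does not need a separate test.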

Reconstruction of Reticulate Networks
Given a set of trees T = {T_1, ..., T_m}, we want to determine the reticulate network N from which the trees were sampled, with T = T(N).
This form of the problem is not always solvable, e.g. if some of the 2^k possible trees are missing. Thus we consider the following:

Reconstruction of Reticulate Networks
Most Parsimonious Network Problem: Given a set of trees T, determine a reticulate network N such that T ⊆ T(N) and N contains a minimum number of reticulation nodes.
In full generality, this is known to be a computationally hard problem [50].
We now discuss a special case that can be solved efficiently.

Independent Reticulations
Two reticulation nodes r_i, r_j in N are independent of each other, if they are not contained in any common simple cycle.
[Figure: network with reticulation nodes r_1, r_2, r_3]
Here, r_1 is independent of r_2 and r_3, whereas r_2 and r_3 are not independent of each other, as the highlighted cycle shows.

Galled Trees
A reticulation that is independent of all others is also called a gall [18].
A network N in which all reticulations are galls is also called a galled tree [18] or gt-network [41].

SPRs and Independent Reticulations
Observation [39]: If N contains only a single reticulation r, then it corresponds to a sub-tree prune and regraft (SPR) operation.
[Figure: reticulate network N with reticulation r and the corresponding SPR]

SPR-Based Algorithm [39]
Given two bifurcating trees, compute their SPR distance.
  If the distance is 0, return a tree.
  If the distance is 1, return a network.
  Else, return fail.
This approach has been generalized to networks with multiple independent reticulations [41].

Challenge
Unfortunately, on real data, such algorithms will often return "fail".
Please note: All current approaches aim at solving a combinatorial puzzle: does there exist a network that induces the given set of trees? (Below: does there exist a network that induces the given alignment of binary sequences?)
One challenge is to produce useful output in the case of imperfect data.

Splits-Based Approach
A new splits-based approach [29]:
gene tree 1 + gene tree 2 → splits network of all splits → reticulate network

Multiple Independent Reticulations
[Figure: two reticulations, four different gene trees → all splits → reticulate network that induces all input trees]

Overlapping Reticulations
Current splits-based methods can resolve components in which the reticulation cycles overlap along a common path [31].

Multiple and Overlapping Reticulations
[Figure: input trees → all splits → reticulate network that induces all input trees]

Decomposition Theorem
There exists a one-to-one correspondence between [17,29]:
  the connected components of the incompatibility graph,
  the "netted regions" of the splits network, and
  the "tangles" of dependent reticulations of the reticulate network.

Splits-Based Algorithm
This leads to the following approach:
  Determine the set of all input splits.
  Determine the connected components of the incompatibility graph or splits network.
  Analyze each component C separately: If C can be explained by a reticulate network N(C), then locally replace C by N(C).

Application to Real Data
New Zealand Ranunculus (buttercup) species:
  JSA region in chloroplast
  ITS region in nuclear genome
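The second step of this approach, finding the connected components of the incompatibility graph, can be sketched as follows. The splits are assumed to be pairs of sets over the same taxon set; the function names and the three example splits are illustrative, and the later step of explaining each component by a reticulate network is not attempted here.

```python
def incompatible(s1, s2):
    """Two splits are incompatible if all four intersections are non-empty."""
    (a1, b1), (a2, b2) = s1, s2
    return all(p & q for p in (a1, b1) for q in (a2, b2))


def incompatibility_components(splits):
    """Connected components of the graph whose nodes are split indices and
    whose edges join incompatible splits."""
    unvisited = set(range(len(splits)))
    components = []
    while unvisited:
        start = unvisited.pop()
        stack, component = [start], [start]
        while stack:
            i = stack.pop()
            neighbours = [j for j in unvisited
                          if incompatible(splits[i], splits[j])]
            for j in neighbours:
                unvisited.remove(j)
                stack.append(j)
                component.append(j)
        components.append(sorted(component))
    return sorted(components)


splits = [
    ({"a", "b"}, {"c", "d", "e"}),   # 0
    ({"b", "c"}, {"a", "d", "e"}),   # 1: conflicts with split 0
    ({"d", "e"}, {"a", "b", "c"}),   # 2: compatible with both
]
components = incompatibility_components(splits)
```

Components consisting of a single split are tree-like; only components with two or more splits need to be analyzed for reticulations.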

Application to Real Data
[Figure: JSA tree and ITS tree]
Split network representing both trees simultaneously. Not explicit.

Application to Real Data
Split network for ITS & JSA trees → filter splits [53] → hybridization network with two cases of hybridization. Explicit.

Details of Splits-Based Approach
A reticulation corresponds to a subtree that attaches at two places.
[Figure: backbone with subtrees A, B_1, B_2, B_3, B_4, C and a subtree X attached at two places]

Detecting a Reticulation
[Figure: the two trees, with X attached next to A and next to C, have edges u_1, ..., u_4 and d_1, ..., d_4 along the path through B_1, ..., B_4; in the associated splits network these appear as pairs of parallel edges]

Splits Network to Reticulate Network
The associated splits network and the reticulate network: delete all internal edges.
Note: The algorithm operates directly on the set of splits, not on the splits network.

Details of Splits-Based Approach
Multiple reticulations can overlap along a path.
[Figure: backbone with subtrees A, B_1, B_2, B_3, B_4, C and two subtrees X and Y, each attached at two places]

Software
SplitsTree4 [28] contains an implementation of the splits-based algorithm [31] that can handle overlapping reticulations.
In [41] a program SPNet is described for galled trees, but it is not available for download.

Part IV: Recombination networks

Overview
Consider an alignment of binary sequences that have evolved under a model of mutation, speciation and recombination events.
We will look at the problem of reconstructing the underlying reticulate network.
Software.

Recombination
(Sexual) recombination is studied in population genetics [24, 20, 16, 46, 47, 48], where ancestor recombination graphs (ARGs) are used for statistical purposes [41].

Chromosomal Recombination
We will study the combinatorial aspects of chromosomal (meiotic) recombination and thus consider recombination networks rather than ARGs.
Simplifying assumptions:
  all sequences have a common ancestor, and
  any position can mutate at most once.

Example of a Recombination Network
[Figure: alignment A of binary sequences for the taxa a, b, r, c, d and the outgroup o, and a recombination network on these taxa rooted at the outgroup]

Recombination Network
For an alignment A of binary sequences of length n, a recombination network R is a reticulate network N, together with [7]:
  a labeling of all nodes by binary sequences of length n, such that the leaves of R are labeled by A,
  a labeling of each tree edge e by the positions that mutate along e, and
  a labeling of each reticulation node r determining the recombination at r.

Non-Uniqueness of Mutations
The placement of mutations on edges is not uniquely defined. Here, the mutation at position 5 can happen along two different edges.
[Figure: two labelings (a) and (b) of a reticulation cycle on a, r, b]
Current algorithms [18, 30] place such ambiguous mutations outside of the reticulation cycle, as in (a).

Recombination Network
Tree-based approach [18, 17] for computing galled trees:
  Determine components of incompatibility graph.
  For each component:
    Determine restricted dataset.
    Determine whether removing one taxon produces a perfect phylogeny.
    If so, arrange taxa in gall.
  Return description of network.

Recombination Network
Splits-based approach [30] for computing overlapping networks:
  Determine a reticulate network as described earlier.
  Compute the labeling of nodes and edges.
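For binary sequences, the edges of the incompatibility graph can be found column by column. The sketch below uses the standard four-gamete test, under which two columns conflict when all four combinations 00, 01, 10 and 11 occur. That this is the exact test used in [18, 17] is an assumption made here, and the function names and the toy alignment are illustrative.

```python
def columns_incompatible(alignment, i, j):
    """Four-gamete test: columns i and j conflict if all four
    combinations of states occur among the sequences."""
    gametes = {(seq[i], seq[j]) for seq in alignment}
    return len(gametes) == 4


def conflict_pairs(alignment):
    """All pairs of incompatible columns, i.e. the edges of the
    incompatibility graph on the columns."""
    n = len(alignment[0])
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if columns_incompatible(alignment, i, j)]


# Toy alignment of four binary sequences of length 3.
alignment = ["000", "010", "110", "101"]
pairs = conflict_pairs(alignment)
```

Here only columns 0 and 1 show all four combinations, so they form the single edge of the incompatibility graph.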

Computing a Labelling
The labelling of the splits network is easy to compute.
Copy the labelling to the recombination network.
[Figure: labelled splits network and labelled recombination network for the taxa a, b, r, c, d and outgroup o]

Example 1, Data
Fungus Fusarium, 37 strains reported in [52].
Locus TRI101, known to have undergone intragenic recombination.

Example 1, Split Network
[Figure: implicit network]

Example 1, Recombination Network
[Figure: explicit network]

Example 2, Data
Input: Restriction maps of the rDNA cistron (length 10kb) of twelve species of mosquitoes using eight 6bp recognition restriction enzymes [35].
Taxa shown: Aedes albopictus, Aedes aegypti, Aedes seatoi, Aedes flavopictus, Aedes alcasidi, Aedes katherinensis, Aedes polynesiensis, Aedes triseriatus, Aedes atropalpus, Aedes epactius, Haemagogus equinus, Armigeres subalbatus, Culex pipiens, Tripteroides bambusa, Sabethes cyaneus, Anopheles albimanus.

Example 2, Split Network
This data set was analyzed using different tree-reconstruction methods with inconclusive results [35].
The associated splits network (or median network [3] in this context), with edges labeled by the corresponding mutations:
[Figure: splits network on all taxa, rooted at Anopheles albimanus]

Example 2, Subset
Recombination scenarios based on the complete data set look unconvincing. However, removal of the two taxa Aedes triseriatus and Armigeres subalbatus gives rise to a simpler splits network:
[Figure: splits network on the remaining taxa, rooted at Anopheles albimanus]

Example 2, Recombination Network
A possible recombination scenario is given by:
[Figure: recombination network on the remaining taxa, rooted at Anopheles albimanus]
Here, Haemagogus equinus appears to arise by a single-crossover recombination, and a second such recombination leads to A. albopictus and A. flavopictus.

Recombination Network
New branch-and-bound approach [Lyngsø, Song and Hein, WABI 2005]:
  Input: data and a limit on the number of recombinations.
  Branch: Starting from the original data, consider all possible steps backward in time.
  Bound: If the number of recombinations used plus a lower bound on the recombinations still needed exceeds the prescribed limit, do not pursue the current configuration.

Example 3
Haplotyped sites of the alcohol dehydrogenase locus from 11 chromosomes of D. melanogaster [Kreitman 1985].
[Figure: recombination network with 7 events on the sequences a, b, c, d, e, f, g, h, i, found using the branch-and-bound method]

Software
Software for computing a recombination network from binary sequences:
  Software implementing the approach of Dan Gusfield et al. [18, 17] for constructing galled trees is available from: wwwcsif.cs.ucdavis.edu/~gusfield.
  SplitsTree4 [28] contains a method RecombinationNetwork for constructing galled trees and more general recombination networks [31, 30].
  Beagle [Lyngsø, Song and Hein, 2005] uses branch-and-bound to compute a network.

Part V: Other

Augmented Trees: Reticulograms
A reticulogram is a tree with additional short-cut edges.
It is obtained from a distance matrix by first building a tree and then repeatedly adding new edges so as to optimize the least squares fit of the graph distances to the matrix.
Implemented in the program T-Rex [Makarenkov and Legendre 2000, 2004].

Augmented Trees: Reticulograms
Data: DNA sequences of length 677 for honey bees (A.mellifer, A.cerana, A.dorsata, A.andrenof, A.koschev, A.florea).
[Figures: reticulogram produced using T-Rex; bootstrap network [28], which displays competing signals]
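The least squares criterion behind a reticulogram can be evaluated as the sum of squared differences between the input distances and the shortest-path distances in the graph. The sketch below computes only this criterion for a fixed set of weighted edges; it does not search for the best short-cut edge or re-optimize edge lengths as a full method would, and all names and numbers are illustrative assumptions.

```python
from itertools import combinations


def shortest_paths(nodes, edges):
    """All-pairs shortest path lengths (Floyd-Warshall) in an undirected
    graph given as (u, v, length) triples."""
    inf = float("inf")
    d = {u: {v: (0.0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v, w in edges:
        d[u][v] = d[v][u] = min(d[u][v], w)
    for k in nodes:
        for u in nodes:
            for v in nodes:
                if d[u][k] + d[k][v] < d[u][v]:
                    d[u][v] = d[u][k] + d[k][v]
    return d


def least_squares(taxa, D, nodes, edges):
    """Sum over taxon pairs of (input distance - graph distance)^2."""
    d = shortest_paths(nodes, edges)
    return sum((D[x][y] - d[x][y]) ** 2 for x, y in combinations(taxa, 2))


taxa = ["a", "b", "c"]
nodes = taxa + ["m"]                      # m is the internal tree node
D = {"a": {"b": 2.0, "c": 2.0}, "b": {"c": 1.5}}
tree = [("a", "m", 1.0), ("b", "m", 1.0), ("c", "m", 1.0)]
fit_tree = least_squares(taxa, D, nodes, tree)
fit_reticulogram = least_squares(taxa, D, nodes, tree + [("b", "c", 1.5)])
```

In this toy example the tree alone overestimates the distance between b and c, and adding the short-cut edge between them brings the criterion down to zero.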

Augmenting Species Trees by Gene Trees
The goal here is to map a set of gene trees onto a given species tree, thus postulating a set of horizontal gene transfer events.
Implemented in the program lattrans [Hallett and Lagergren 2001] [Addario-Berry et al. 2003].

Augmenting Species Trees by Gene Trees
[Figure: a horizontal gene transfer scenario for the rbcL gene presented in [Hallett and Lagergren, 2001]]

Summary
Implicit phylogenetic networks, such as splits networks, robustly represent incompatible phylogenetic signals, while reticulate networks, such as hybridization networks and recombination networks, provide explicit models of reticulate evolution.
A wide range of tree and network construction methods are implemented in SplitsTree4.

Bibliography
1. V. Bafna and V. Bansal. The number of recombination events in a sample history: conflict graph and lower bounds. IEEE/ACM Transactions in Computational Biology and Bioinformatics, 1(2):78-90.
2. H.-J. Bandelt and A. W. M. Dress. A canonical decomposition theory for metrics on a finite set. Advances in Mathematics, 92:47-105.
3. H.-J. Bandelt, P. Forster, B. C. Sykes, and M. B. Richards. Mitochondrial portraits of human population using median networks. Genetics, 141:743-753.
4. W. J. Bruno, N. D. Socci, and A. L. Halpern. Weighted Neighbor Joining: A likelihood-based approach to distance-based phylogeny reconstruction. Molecular Biology and Evolution, 17(1): , 197.
5. D. Bryant and V. Moulton. NeighborNet: An agglomerative method for the construction of planar phylogenetic networks. In R. Guigo and D. Gusfield, editors, Algorithms in Bioinformatics, WABI 2002, volume LNCS 2452, pages 375-391.
6. P. Buneman. The recovery of trees from measures of dissimilarity. In F. R. Hodson, D. G. Kendall, and P. Tautu, editors, Mathematics in the Archaeological and Historical Sciences. Edinburgh University Press.
7. S. Eddhu, D. Gusfield, and C. Langley. The fine structure of galls in phylogenetic networks. To appear in: INFORMS J. of Computing Special Issue on Computational Biology.

8. A. W. M. Dress and D. H. Huson. Constructing splits graphs. IEEE/ACM Transactions in Computational Biology and Bioinformatics, 1(3).
9. A. Drummond and K. Strimmer. PAL: An object-oriented programming library for molecular evolution and phylogenetics. Bioinformatics, 17: , 663.
10. A.W.F. Edwards and L.L. Cavalli-Sforza. The reconstruction of evolution. Annals of Human Genetics, 27: , 106.
11. A.W.F. Edwards and L.L. Cavalli-Sforza. Reconstruction of evolutionary trees. In V.H. Heywood and J. McNeill, editors, Phenetic and Phylogenetic Classification, volume 6. Systematics Association, London.
12. J. Felsenstein. PHYLIP - phylogeny inference package (version 3.2). Cladistics, 5: , 166.
13. J. Felsenstein. Inferring Phylogenies. Sinauer Associates, Inc.
14. K. Forslund, D.H. Huson, and V. Moulton. VisRD - visual recombination detection. Bioinformatics, 20(18): , 3655.
15. O. Gascuel. BIONJ: An improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol., 14: , 695.
16. R. C. Griffiths and P. Marjoram. Ancestral inference from samples of DNA sequences with recombination. J. Computational Biology, 3.
17. D. Gusfield and V. Bansal. A fundamental decomposition theory for phylogenetic networks and incompatible characters. In Proceedings of the Ninth International Conference on Research in Computational Molecular Biology (RECOMB).
18. D. Gusfield, S. Eddhu, and C. Langley. Efficient reconstruction of phylogenetic networks with constrained recombination. In Proceedings of the IEEE CSB Bioinformatics Conference.
19. M. Hallett, J. Lagergren, and A. Tofigh. Simultaneous identification of duplications and lateral transfers. In Proceedings of the Eighth International Conference on Research in Computational Molecular Biology (RECOMB), pages , 356.
20. J. Hein. A heuristic method to reconstruct the history of sequences subject to recombination. J. Mol. Evol., 36.
21. B. Holland, K. Huber, V. Moulton, and P. J. Lockhart. Using consensus networks to visualize contradictory evidence for species phylogeny. Molecular Biology and Evolution, 21: , 1461.
22. B. Holland and V. Moulton. Consensus networks: A method for visualizing incompatibilities in collections of trees. In G. Benson and R. Page, editors, Proceedings of "Workshop on Algorithms in Bioinformatics", volume 2812 of LNBI. Springer.
23. K. T. Huber, M. Langton, D. Penny, V. Moulton, and M. Hendy. Spectronet: A package for computing spectra and median networks. Applied Bioinformatics, 1: , 161.
24. R. R. Hudson. Properties of the neutral allele model with intergenic recombination. Theoretical Population Biology, 23: , 201.
25. J.P. Huelsenbeck and F. Ronquist. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics, 17(8): , 755.

26. J.P. Huelsenbeck, F. Ronquist, R. Nielsen, and J.P. Bollback. Bayesian inference of phylogeny and its impact on evolutionary biology. Science, 294:2310-2314.
27. D. H. Huson. SplitsTree: A program for analyzing and visualizing evolutionary data. Bioinformatics, 14(10):68-73.
28. D. H. Huson and D. Bryant. Estimating phylogenetic trees and networks using SplitsTree 4. Manuscript in preparation.
29. D. H. Huson, T. Dezulian, T. Kloepper, and M. A. Steel. Phylogenetic super-networks from partial trees. IEEE/ACM Transactions in Computational Biology and Bioinformatics, 1(4): , 158.
30. D.H. Huson and T. Kloepper. Computing recombination networks from binary sequences. Submitted.
31. D.H. Huson, T. Kloepper, P.J. Lockhart, and M.A. Steel. Reconstruction of reticulate networks from gene trees. In Proceedings of the Ninth International Conference on Research in Computational Molecular Biology (RECOMB).
32. T. H. Jukes and C. R. Cantor. Evolution of protein molecules. In H. N. Munro, editor, Mammalian Protein Metabolism. Academic Press.
33. M. Kreitman. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Genetics, 11: , 164.
34. K. Kryukov and N. Saitou. Netview: Application software for constructing and visually exploring phylogenetic networks. Genome Informatics, 14: , 281.
35. A. Kumar, W.C. Black, and K.S. Rai. An estimate of phylogenetic relationships among culicine mosquitoes using a restriction map of the rDNA cistron. Insect Molecular Biology, 7(4): , 373.
36. C.R. Linder, B.M.E. Moret, L. Nakhleh, and T. Warnow. Network (reticulate) evolution: Biology, models, and algorithms. A tutorial presented at the Ninth Pacific Symposium on Biocomputing.
37. P. J. Lockhart, P. A. McLenachan, D. Havell, D. Glenny, D. H. Huson, and U. Jensen. Phylogeny, dispersal and radiation of New Zealand alpine buttercups: molecular evidence under split decomposition. Ann Missouri Bot Gard, 88.
38. W. Maddison and D. Maddison. Mesquite - a modular system for evolutionary analysis. mesquiteproject.org.
39. W. P. Maddison. Gene trees in species trees. Syst. Biol., 46(3): , 536.
40. V. Makarenkov. T-REX: Reconstructing and visualizing phylogenetic trees and reticulation networks. Bioinformatics, 17(7): , 668.
41. L. Nakhleh, T. Warnow, and C. R. Linder. Reconstructing reticulate evolution in species - theory and practice. In Proceedings of the Eighth International Conference on Research in Computational Molecular Biology (RECOMB), pages , 346.
42. D.E. Parfitt and M.L. Badenes. Phylogeny of the genus Pistacia as determined from analysis of the chloroplast genome. PNAS, 94: , 7992.
43. N. Saitou and M. Nei. The Neighbor-Joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4: , 425.

44. M. Salminen, J.K. Carr, D.S. Burke, and F.E. McCutchan. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res. Hum. Retroviruses, 11: , 1425.
45. R. R. Sokal and C. D. Michener. A statistical method for evaluating systematic relationships. University of Kansas Scientific Bulletin, 28: , 1438.
46. Y.S. Song and J. Hein. Parsimonious reconstruction of sequence evolution and haplotype blocks: Finding the minimum number of recombination events. Proceedings of the Workshop on Algorithms in Bioinformatics.
47. Y.S. Song and J. Hein. On the minimum number of recombination events in the evolutionary history of DNA sequences. J. Math. Biol., 48: , 186.
48. Y.S. Song and J. Hein. Constructing minimal ancestral recombination graphs. J. Comp. Biol., 12: , 169.
49. D. L. Swofford. PAUP: Phylogenetic analysis using parsimony (and other methods), version 4.2.
50. L. Wang, K. Zhang, and L. Zhang. Perfect phylogenetic networks with recombination. Journal of Computational Biology, 8(1):69-78.
51. M. Worobey. A novel approach to detecting and measuring recombination: new insights into evolution in viruses, bacteria and mitochondria. Mol. Biol. Evol., 18: , 1434.
52. K. O'Donnell, H. C. Kistler, B. K. Tacke, and H. H. Casper. Gene genealogies reveal global phylogeographic structure and reproductive isolation among lineages of Fusarium graminearum, the fungus causing wheat scab. PNAS, 97(14): , 7910.
53. D.H. Huson, M.A. Steel and J. Whitfield. Reducing distortion in phylogenetic networks. Proceedings of WABI.

Splits and Phylogenetic Networks. Daniel H. Huson

Splits and Phylogenetic Networks. Daniel H. Huson Splits and Phylogenetic Networks Daniel H. Huson aris, June 21, 2005 1 2 Contents 1. Phylogenetic trees 2. Splits networks 3. Consensus networks 4. Hybridization and reticulate networks 5. Recombination

More information

Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation

Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation Daniel H. Huson Stockholm, May 28, 2005 Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version

More information

ISMB-Tutorial: Introduction to Phylogenetic Networks. Daniel H. Huson

ISMB-Tutorial: Introduction to Phylogenetic Networks. Daniel H. Huson ISMB-Tutorial: Introduction to Phylogenetic Networks Daniel H. Huson Center for Bioinformatics, Tübingen University Sand 14, 72075 Tübingen, Germany www-ab.informatik.uni-tuebingen.de June 25, 2005 Contents

More information

Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation

Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published

More information

Phylogenetic Networks, Trees, and Clusters

Phylogenetic Networks, Trees, and Clusters Phylogenetic Networks, Trees, and Clusters Luay Nakhleh 1 and Li-San Wang 2 1 Department of Computer Science Rice University Houston, TX 77005, USA nakhleh@cs.rice.edu 2 Department of Biology University

More information

Beyond Galled Trees Decomposition and Computation of Galled Networks

Beyond Galled Trees Decomposition and Computation of Galled Networks Beyond Galled Trees Decomposition and Computation of Galled Networks Daniel H. Huson & Tobias H.Kloepper RECOMB 2007 1 Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or

More information

A new algorithm to construct phylogenetic networks from trees

A new algorithm to construct phylogenetic networks from trees A new algorithm to construct phylogenetic networks from trees J. Wang College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia, China Corresponding author: J. Wang E-mail: wangjuanangle@hit.edu.cn

More information

TheDisk-Covering MethodforTree Reconstruction

TheDisk-Covering MethodforTree Reconstruction TheDisk-Covering MethodforTree Reconstruction Daniel Huson PACM, Princeton University Bonn, 1998 1 Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document

More information

A Phylogenetic Network Construction due to Constrained Recombination

A Phylogenetic Network Construction due to Constrained Recombination A Phylogenetic Network Construction due to Constrained Recombination Mohd. Abdul Hai Zahid Research Scholar Research Supervisors: Dr. R.C. Joshi Dr. Ankush Mittal Department of Electronics and Computer

More information

Consistency Index (CI)

Consistency Index (CI) Consistency Index (CI) minimum number of changes divided by the number required on the tree. CI=1 if there is no homoplasy negatively correlated with the number of species sampled Retention Index (RI)

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

Intraspecific gene genealogies: trees grafting into networks

Intraspecific gene genealogies: trees grafting into networks Intraspecific gene genealogies: trees grafting into networks by David Posada & Keith A. Crandall Kessy Abarenkov Tartu, 2004 Article describes: Population genetics principles Intraspecific genetic variation

More information

Regular networks are determined by their trees

Regular networks are determined by their trees Regular networks are determined by their trees Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu February 17, 2009 Abstract. A rooted acyclic digraph

More information

NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees

NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana

More information

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

More information

NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS

NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS PETER J. HUMPHRIES AND CHARLES SEMPLE Abstract. For two rooted phylogenetic trees T and T, the rooted subtree prune and regraft distance

More information

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Phylogenetics. BIOL 7711 Computational Bioscience

Phylogenetics. BIOL 7711 Computational Bioscience Consortium for Comparative Genomics! University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

A new e±cient algorithm for inferring explicit hybridization networks following the Neighbor-Joining principle

A new e±cient algorithm for inferring explicit hybridization networks following the Neighbor-Joining principle Journal of Bioinformatics and Computational Biology Vol. 12, No. 5 (2014) 1450024 (27 pages) #.c Imperial College Press DOI: 10.1142/S0219720014500243 A new e±cient algorithm for inferring explicit hybridization

More information

Properties of normal phylogenetic networks

Properties of normal phylogenetic networks Properties of normal phylogenetic networks Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu August 13, 2009 Abstract. A phylogenetic network is

More information

Phylogenetic inference

Phylogenetic inference Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types

More information

Algorithms in Bioinformatics

Algorithms in Bioinformatics Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods

More information

Reconstructing Phylogenetic Networks Using Maximum Parsimony

Reconstructing Phylogenetic Networks Using Maximum Parsimony Reconstructing Phylogenetic Networks Using Maximum Parsimony Luay Nakhleh Guohua Jin Fengmei Zhao John Mellor-Crummey Department of Computer Science, Rice University Houston, TX 77005 {nakhleh,jin,fzhao,johnmc}@cs.rice.edu

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016 Molecular phylogeny - Using molecular sequences to infer evolutionary relationships Tore Samuelsson Feb 2016 Molecular phylogeny is being used in the identification and characterization of new pathogens,

More information

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels

More information

Distance Corrections on Recombinant Sequences

Distance Corrections on Recombinant Sequences Distance Corrections on Recombinant Sequences David Bryant 1, Daniel Huson 2, Tobias Kloepper 2, and Kay Nieselt-Struwe 2 1 McGill Centre for Bioinformatics 3775 University Montréal, Québec, H3A 2B4 Canada

More information

An introduction to phylogenetic networks

An introduction to phylogenetic networks An introduction to phylogenetic networks Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University Email: steven.kelk@maastrichtuniversity.nl Web: http://skelk.sdf-eu.org Genome sequence,

More information

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

More information

UNICYCLIC NETWORKS: COMPATIBILITY AND ENUMERATION

UNICYCLIC NETWORKS: COMPATIBILITY AND ENUMERATION UNICYCLIC NETWORKS: COMPATIBILITY AND ENUMERATION CHARLES SEMPLE AND MIKE STEEL Abstract. Graphs obtained from a binary leaf labelled ( phylogenetic ) tree by adding an edge so as to introduce a cycle

More information

ALGORITHMIC STRATEGIES FOR ESTIMATING THE AMOUNT OF RETICULATION FROM A COLLECTION OF GENE TREES

ALGORITHMIC STRATEGIES FOR ESTIMATING THE AMOUNT OF RETICULATION FROM A COLLECTION OF GENE TREES ALGORITHMIC STRATEGIES FOR ESTIMATING THE AMOUNT OF RETICULATION FROM A COLLECTION OF GENE TREES H. J. Park and G. Jin and L. Nakhleh Department of Computer Science, Rice University, 6 Main Street, Houston,

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

Lecture 11 Friday, October 21, 2011

Lecture 11 Friday, October 21, 2011 Lecture 11 Friday, October 21, 2011 Phylogenetic tree (phylogeny) Darwin and classification: In the Origin, Darwin said that descent from a common ancestral species could explain why the Linnaean system

More information

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #5 Database Searching & Molecular Phylogenetics Michael Yaffe B C D B C D (((,B)C)D) Outline Distance Matrix Methods Neighbor-Joining Method and Related Neighbor Methods Maximum Likelihood

More information

I. Short Answer Questions DO ALL QUESTIONS

I. Short Answer Questions DO ALL QUESTIONS EVOLUTION 313 FINAL EXAM Part 1 Saturday, 7 May 2005 page 1 I. Short Answer Questions DO ALL QUESTIONS SAQ #1. Please state and BRIEFLY explain the major objectives of this course in evolution. Recall

More information

Fast Phylogenetic Methods for the Analysis of Genome Rearrangement Data: An Empirical Study

Fast Phylogenetic Methods for the Analysis of Genome Rearrangement Data: An Empirical Study Fast Phylogenetic Methods for the Analysis of Genome Rearrangement Data: An Empirical Study Li-San Wang Robert K. Jansen Dept. of Computer Sciences Section of Integrative Biology University of Texas, Austin,

More information

Effects of Gap Open and Gap Extension Penalties

Effects of Gap Open and Gap Extension Penalties Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See

More information

Distances that Perfectly Mislead

Distances that Perfectly Mislead Syst. Biol. 53(2):327 332, 2004 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150490423809 Distances that Perfectly Mislead DANIEL H. HUSON 1 AND

More information

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT COMMUNICATIONS IN INFORMATION AND SYSTEMS c 2009 International Press Vol. 9, No. 4, pp. 295-302, 2009 001 THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT DAN GUSFIELD AND YUFENG WU Abstract.

More information

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

More information

Processes of Evolution

Processes of Evolution 15 Processes of Evolution Forces of Evolution Concept 15.4 Selection Can Be Stabilizing, Directional, or Disruptive Natural selection can act on quantitative traits in three ways: Stabilizing selection

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

Parsimony via Consensus

Parsimony via Consensus Syst. Biol. 57(2):251 256, 2008 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150802040597 Parsimony via Consensus TREVOR C. BRUEN 1 AND DAVID

More information

C.DARWIN ( )

C.DARWIN ( ) C.DARWIN (1809-1882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from a ancestor appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships

More information

X X (2) X Pr(X = x θ) (3)

X X (2) X Pr(X = x θ) (3) Notes for 848 lecture 6: A ML basis for compatibility and parsimony Notation θ Θ (1) Θ is the space of all possible trees (and model parameters) θ is a point in the parameter space = a particular tree

More information

PHYLOGENIES are the main tool for representing evolutionary

PHYLOGENIES are the main tool for representing evolutionary IEEE TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 1, NO. 1, JANUARY-MARCH 2004 13 Phylogenetic Networks: Modeling, Reconstructibility, and Accuracy Bernard M.E. Moret, Luay Nakhleh, Tandy

More information

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

More information

What is Phylogenetics

What is Phylogenetics What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)

More information

Reconstructing Trees from Subtree Weights

Reconstructing Trees from Subtree Weights Reconstructing Trees from Subtree Weights Lior Pachter David E Speyer October 7, 2003 Abstract The tree-metric theorem provides a necessary and sufficient condition for a dissimilarity matrix to be a tree

More information

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata.

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Supplementary Note S2 Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Phylogenetic trees reconstructed by a variety of methods from either single-copy orthologous loci (Class

More information

Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition

Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition David D. Pollock* and William J. Bruno* *Theoretical Biology and Biophysics, Los Alamos National

More information

Finding a gene tree in a phylogenetic network Philippe Gambette

Finding a gene tree in a phylogenetic network Philippe Gambette LRI-LIX BioInfo Seminar 19/01/2017 - Palaiseau Finding a gene tree in a phylogenetic network Philippe Gambette Outline Phylogenetic networks Classes of phylogenetic networks The Tree Containment Problem

More information

A 3-APPROXIMATION ALGORITHM FOR THE SUBTREE DISTANCE BETWEEN PHYLOGENIES. 1. Introduction

A 3-APPROXIMATION ALGORITHM FOR THE SUBTREE DISTANCE BETWEEN PHYLOGENIES. 1. Introduction A 3-APPROXIMATION ALGORITHM FOR THE SUBTREE DISTANCE BETWEEN PHYLOGENIES MAGNUS BORDEWICH 1, CATHERINE MCCARTIN 2, AND CHARLES SEMPLE 3 Abstract. In this paper, we give a (polynomial-time) 3-approximation

More information

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004, Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin- 1837

More information

DNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi

DNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi DNA Phylogeny Signals and Systems in Biology Kushal Shah @ EE, IIT Delhi Phylogenetics Grouping and Division of organisms Keeps changing with time Splitting, hybridization and termination Cladistics :

More information

Tree-average distances on certain phylogenetic networks have their weights uniquely determined

Tree-average distances on certain phylogenetic networks have their weights uniquely determined Tree-average distances on certain phylogenetic networks have their weights uniquely determined Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu

More information

Theory of Evolution Charles Darwin

Theory of Evolution Charles Darwin Theory of Evolution Charles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (83-36) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

More information

RECONSTRUCTING PATTERNS OF RETICULATE

RECONSTRUCTING PATTERNS OF RETICULATE American Journal of Botany 91(10): 1700 1708. 2004. RECONSTRUCTING PATTERNS OF RETICULATE EVOLUTION IN PLANTS 1 C. RANDAL LINDER 2,4 AND LOREN H. RIESEBERG 3 2 Section of Integrative Biology and the Center

More information

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9 Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic

More information

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis 10 December 2012 - Corrections - Exercise 1 Non-vertebrate chordates generally possess 2 homologs, vertebrates 3 or more gene copies; a Drosophila

More information

Letter to the Editor. Department of Biology, Arizona State University

Letter to the Editor. Department of Biology, Arizona State University Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona

More information

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

CHAPTERS 24-25: Evidence for Evolution and Phylogeny CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology

More information

An Investigation of Phylogenetic Likelihood Methods

An Investigation of Phylogenetic Likelihood Methods An Investigation of Phylogenetic Likelihood Methods Tiffani L. Williams and Bernard M.E. Moret Department of Computer Science University of New Mexico Albuquerque, NM 87131-1386 Email: tlw,moret @cs.unm.edu

More information

EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

More information

Chapter 27: Evolutionary Genetics

Chapter 27: Evolutionary Genetics Chapter 27: Evolutionary Genetics Student Learning Objectives Upon completion of this chapter you should be able to: 1. Understand what the term species means to biology. 2. Recognize the various patterns

More information

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

Phylogenetic Networks with Recombination

Phylogenetic Networks with Recombination Phylogenetic Networks with Recombination October 17 2012 Recombination All DNA is recombinant DNA... [The] natural process of recombination and mutation have acted throughout evolution... Genetic exchange

More information

A Fitness Distance Correlation Measure for Evolutionary Trees

A Fitness Distance Correlation Measure for Evolutionary Trees A Fitness Distance Correlation Measure for Evolutionary Trees Hyun Jung Park 1, and Tiffani L. Williams 2 1 Department of Computer Science, Rice University hp6@cs.rice.edu 2 Department of Computer Science

More information

Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057 Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Tree estimation strategies: Parsimony?no model, simply count minimum number

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

Phylogenetic analyses. Kirsi Kostamo

Phylogenetic analyses. Kirsi Kostamo Phylogenetic analyses Kirsi Kostamo The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,

More information

Inferring a level-1 phylogenetic network from a dense set of rooted triplets

Inferring a level-1 phylogenetic network from a dense set of rooted triplets Theoretical Computer Science 363 (2006) 60 68 www.elsevier.com/locate/tcs Inferring a level-1 phylogenetic network from a dense set of rooted triplets Jesper Jansson a,, Wing-Kin Sung a,b, a School of

More information

Consensus Methods. * You are only responsible for the first two

Consensus Methods. * You are only responsible for the first two Consensus Trees * consensus trees reconcile clades from different trees * consensus is a conservative estimate of phylogeny that emphasizes points of agreement * philosophy: agreement among data sets is

More information

Inferring phylogeny. Today s topics. Milestones of molecular evolution studies Contributions to molecular evolution

Inferring phylogeny. Today s topics. Milestones of molecular evolution studies Contributions to molecular evolution Today s topics Inferring phylogeny Introduction! Distance methods! Parsimony method!"#$%&'(!)* +,-.'/01!23454(6!7!2845*0&4'9#6!:&454(6 ;?@AB=C?DEF Overview of phylogenetic inferences Methodology Methods

More information

Bayesian Models for Phylogenetic Trees

Bayesian Models for Phylogenetic Trees Bayesian Models for Phylogenetic Trees Clarence Leung* 1 1 McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada ABSTRACT Introduction: Inferring genetic ancestry of different species

More information

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5. Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu v Distance Methods v Character Methods v Molecular Clock v UPGMA v Maximum Parsimony

More information

Bioinformatics Advance Access published August 23, 2006

Bioinformatics Advance Access published August 23, 2006 Bioinformatics Advance Access published August 23, 2006 BIOINFORMATICS Maximum Likelihood of Phylogenetic Networks Guohua Jin a Luay Nakhleh a Sagi Snir b Tamir Tuller c a Dept. of Computer Science Rice

More information
