ON THE UNIQUENESS OF BALANCED MINIMUM EVOLUTION


AARON KLEINMAN

December 3, 2011

Supported by an NSF Graduate Research Fellowship. The author thanks Lior Pachter for suggesting this problem.

Abstract. Minimum evolution is a class of parsimonious distance-based phylogenetic reconstruction methods. One noteworthy example is balanced minimum evolution (BME), which is the theoretical underpinning of the neighbor-joining algorithm. The robustness of a minimum evolution method can be captured in part by a statistic known as its $L_\infty$ radius. BME is known to have $L_\infty$ radius $\frac{1}{2}$, the best possible. We show that BME is in fact the only minimum evolution method with radius $\frac{1}{2}$.

Introduction

The minimum evolution (ME) approach to phylogenetic reconstruction is broadly based on the following idea: given a matrix $\delta$ of pairwise distances between a set of $n$ taxa, find the tree that explains $\delta$ with as little evolution as possible [11, 16]. More explicitly, ME employs the following algorithm:

(1) For each tree topology $T$, find the branch lengths of $T$ assuming $\delta$ comes from $T$;
(2) Use the branch lengths to compute the length $l_T$ of the tree $T$;
(3) Choose the tree $\hat{T} = \arg\min_T l_T$ with minimum length.

There is some ambiguity in how to use negative branch lengths to compute the length of the tree. Kidd and Sgaramella-Zonta [11] proposed summing the absolute values of the edge lengths, while Swofford et al. [20] suggested summing only positive edge lengths. Throughout, we will follow Rzhetsky and Nei [15] in calculating the length of the tree by summing all the edges, with sign.

The effectiveness of this method then depends crucially on how we select the branch lengths for a given tree $T$. One classic approach, first proposed by Cavalli-Sforza and Edwards [4] and Fitch and Margoliash [8], chooses the edge lengths that minimize the sum of squares $\sum_{ij} (\delta_{ij} - \delta^T_{ij})^2$, where $\delta^T_{ij}$ is the sum of the lengths of the edges in the path between $i$ and $j$. This method of assigning edge lengths is known as ordinary least squares (OLS), and we refer to the corresponding ME method as OLS+ME.

If we know the variances $V_{ij}$ of the $\delta_{ij}$, then the variance-minimizing estimate of the edge lengths is given by the $\delta^T$ that minimizes the weighted least squares (WLS) criterion

\[ \sum_{ij} \frac{1}{V_{ij}} (\delta_{ij} - \delta^T_{ij})^2. \tag{0.1} \]

WLS assumes the $\delta_{ij}$ are uncorrelated. Generalized least squares (GLS) gets rid of this assumption and seeks to minimize

\[ \sum_{ij,kl} V^{-1}_{ij,kl} (\delta_{ij} - \delta^T_{ij})(\delta_{kl} - \delta^T_{kl}), \]

where $V^{-1}$ is the inverse of the variance-covariance matrix $V$ of the $\delta_{ij}$.

As more data is gathered, we expect the observed pairwise distances $\delta$ to converge to the true distances $\delta^T$. OLS+ME satisfies the important property that it is statistically consistent [15]: that is, when $\delta$ is sufficiently close to $\delta^T$, OLS+ME will recover the correct tree topology $T$. While this property holds for WLS methods other than OLS [5], it does not hold for all WLS (and therefore all GLS) ME methods [9].

In GLS, computing the optimal edge weights is equivalent to minimizing a quadratic form, and the edge weights are then linear in the elements of $\delta$. More precisely, we have

\[ \hat{l}_T = (S_T^t V^{-1} S_T)^{-1} S_T^t V^{-1} \delta, \tag{0.2} \]

where $V$ is the $\binom{n}{2} \times \binom{n}{2}$ variance-covariance matrix and $S_T$ is the $\binom{n}{2} \times |E|$ matrix whose entries are given by

\[ (S_T)_{ij,e} = \begin{cases} 1 & \text{if } e \text{ is an edge on the path in } T \text{ between } i \text{ and } j, \\ 0 & \text{otherwise.} \end{cases} \]

We briefly mention that GLS has the following statistical interpretation. Suppose $\delta_{ij} = D_{ij} + \epsilon_{ij}$, where $\delta_{ij}$ is the observed distance, $D_{ij}$ is the true distance and the $\epsilon_{ij}$ are error terms that are normally distributed with mean zero and covariance matrix $V$. Then $\hat{l}_e$ is the linear unbiased estimator for the length of edge $e$ with minimal variance.

Under GLS, the total length of the tree is a linear form in the coefficients of $\delta$ given by $l_T = \mathbf{1}^t \hat{l}_T$, where $\mathbf{1}$ is the vector of ones of length $|E|$. If $\delta$ actually is $T$-additive, then $\delta = S_T E_T$ for some positive vector $E_T$, and by (0.2) a GLS+ME method on $T$ will estimate the length of $\delta$ to be $\mathbf{1}^t E_T$, which is the correct length. This suggests the following definition.

Let $\mathcal{T}$ be the set of phylogenetic $X$-trees. Consider the space of all dissimilarity maps on $X$ as a cone in $\mathbb{R}^{\binom{n}{2}}$, and let $\mathcal{L}$ be the space of linear forms on $\mathbb{R}^{\binom{n}{2}}$.

Definition 0.1. A minimum evolution (ME) method is a map $\phi : \mathcal{T} \to \mathcal{L}$ such that if $\delta$ is $T$-additive with length $l$,

\[ (\phi(T))(\delta) = l. \tag{0.3} \]

(For notational convenience, in what follows we frequently write $\phi(T, \delta)$ to mean $(\phi(T))(\delta)$.) We say $\phi$ is consistent if we further have

\[ T = \operatorname{argmin}_{T'} \phi(T', \delta). \tag{0.4} \]

Equation (0.3) is a kind of normalization requirement that allows us to recover the correct length of the tree. Equation (0.4) says that when our dissimilarity map is tree-like, our method recovers the correct tree. Note that a general ME method doesn't attempt to calculate the edge lengths; it cares only about the total length of the tree. Similarly, there is no statistical interpretation for a general ME method.
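As a rough illustration of equation (0.2) and the normalization (0.3), the following Python sketch fits edge lengths by ordinary least squares (taking $V$ to be the identity) on a four-taxon tree. The helper names (`path_matrix`, `solve`, `ols_edge_lengths`), the encoding of each edge as the set of taxa on one side of it, and the example edge lengths are all assumptions made for this sketch; none of them come from the paper.

```python
from fractions import Fraction
from itertools import combinations

def path_matrix(splits, n):
    """Build S_T: rows are unordered pairs (i, j), columns are edges.
    Each edge is given as the set of taxa on one side of it, and the
    entry is 1 exactly when the edge separates i from j."""
    pairs = list(combinations(range(n), 2))
    rows = [[1 if (i in side) != (j in side) else 0 for side in splits]
            for i, j in pairs]
    return pairs, rows

def solve(a, b):
    """Solve the square system a x = b exactly by Gauss-Jordan elimination."""
    m = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def ols_edge_lengths(splits, n, delta):
    """Equation (0.2) with V = identity: solve (S^t S) l = S^t delta."""
    pairs, s = path_matrix(splits, n)
    d = [Fraction(delta[p]) for p in pairs]
    k = len(splits)
    sts = [[sum(row[a] * row[b] for row in s) for b in range(k)] for a in range(k)]
    std = [sum(row[a] * x for row, x in zip(s, d)) for a in range(k)]
    return solve(sts, std)

# Quartet tree 01|23: four pendant edges followed by the internal edge.
QUARTET = [{0}, {1}, {2}, {3}, {0, 1}]
true_lengths = [1, 2, 3, 4, 5]            # made-up positive edge lengths
pairs, S = path_matrix(QUARTET, 4)
additive = {p: sum(w * x for w, x in zip(true_lengths, row))
            for p, row in zip(pairs, S)}

fitted = ols_edge_lengths(QUARTET, 4, additive)
tree_length = sum(fitted)                 # signed sum of all edge lengths

# The same distances fitted on a different topology, 02|13.
WRONG = [{0}, {1}, {2}, {3}, {0, 2}]
wrong_length = sum(ols_edge_lengths(WRONG, 4, additive))
```

On this additive input the fitted lengths reproduce the true ones, so the tree length is the true length, as (0.3) requires. Fitting the same distances on the topology 02|13 yields a longer tree, which is what consistency of OLS+ME predicts for this example.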

We have shown that all GLS+ME methods are ME methods. Conversely, we say $\phi$ is a GLS+ME method (respectively, WLS+ME method) if, for each tree $T$, there is a variance-covariance matrix (respectively, diagonal variance-covariance matrix) $V_T$ such that

\[ \phi(T, \delta) = \mathbf{1}^t (S_T^t V_T^{-1} S_T)^{-1} S_T^t V_T^{-1} \delta. \]

Balanced Minimum Evolution (BME), first introduced in [14], is a WLS+ME method that corresponds to taking $(V_T)_{ij} = 2^{|P_T(i,j)|}$, where $|P_T(i,j)|$ is the number of edges on the path $P_T(i,j)$ between $i$ and $j$ in $T$. Pauplin showed [14] that in this case the WLS-minimizing edge lengths have computationally simple expressions and sum to give the nice formula

\[ \phi_{BME}(T, \delta) := \sum_{ij} 2^{1 - |P_T(i,j)|} \delta_{ij}. \]

Like OLS+ME, this is consistent [6].

BME also has applications to the Neighbor Joining (NJ) algorithm. First introduced in [17], NJ has historically been very important in phylogenetic reconstruction. It constructs a tree agglomeratively as follows:

(1) Given a distance matrix $\delta : X \times X \to \mathbb{R}$, compute the Q-criterion
\[ Q_\delta(i,j) = (n-2)\,\delta(i,j) - \sum_{k \neq i} \delta(i,k) - \sum_{k \neq j} \delta(j,k). \]
(2) Select a pair $(a, b)$ of taxa that minimize $Q_\delta$. If there are more than three taxa, replace this pair by a leaf $ab$ and construct a new dissimilarity map $\delta'$ given by
\[ \delta'(i, ab) = \tfrac{1}{2} \left( \delta(i,a) + \delta(i,b) - \delta(a,b) \right). \]
(3) Repeat until there are three taxa remaining.

NJ is motivated by the result [17, 19] that if $\delta$ is a $T$-additive tree metric, then a pair $(a, b)$ minimizing $Q_\delta$ is a cherry in $T$. This shows that NJ is consistent. Consistency doesn't guarantee accuracy in practice, since actual data will rarely be tree-additive due to noise, but there are a number of results [1, 12] showing that NJ is maximally robust to small perturbations in the elements of the dissimilarity map. To make this more precise we use the following definition:

Definition 0.2. A tree reconstruction method has $L_\infty$ radius $\alpha$ if, for each $T$-additive dissimilarity $\delta_T$ with minimal branch length $w_{\min}$, and each dissimilarity $\delta$ with $\|\delta - \delta_T\|_\infty < \alpha\, w_{\min}$, the method returns $T$ when given $\delta$ as input.

Any reconstruction method with positive $L_\infty$ radius is necessarily consistent. Atteson showed [1] that NJ has $L_\infty$ radius $\frac{1}{2}$. For $i \in \{1, 2\}$ there exist distinct trees $T_i$ with minimum branch length $w_{\min}$, $T_i$-additive dissimilarities $\delta_{T_i}$, and a dissimilarity $\delta$ such that $\|\delta - \delta_{T_i}\|_\infty = \frac{1}{2} w_{\min}$, so this is the best $L_\infty$ radius possible.

While other reconstruction algorithms might be similarly robust, NJ is special in at least one way. Bryant showed [3] that the Q-criterion is the only criterion that is: linear in the coefficients of $\delta$; consistent (i.e., given tree-like data the criterion will select a cherry at each step); and indifferent to the order of the taxa. Thus, one can consider NJ as the unique algorithm satisfying a certain set of desirable properties.
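Steps (1) and (2) of NJ can be sketched in a few lines of Python. This is only an illustration: the function names `q_criterion` and `nj_step`, the storage of distances in a dictionary keyed by `frozenset` pairs, and the five-taxon example tree are assumptions chosen for the sketch. The example distances are additive on the tree with cherries {a, b} and {d, e}, pendant edge lengths a=1, b=4, c=2, d=2, e=3 and internal edge lengths 1 and 2.

```python
from itertools import combinations

def q_criterion(delta, taxa):
    """Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k)."""
    n = len(taxa)
    row_sum = {i: sum(delta[frozenset((i, k))] for k in taxa if k != i)
               for i in taxa}
    return {(i, j): (n - 2) * delta[frozenset((i, j))] - row_sum[i] - row_sum[j]
            for i, j in combinations(taxa, 2)}

def nj_step(delta, taxa):
    """One agglomeration step: join the pair minimizing Q and reduce delta."""
    q = q_criterion(delta, taxa)
    a, b = min(q, key=q.get)
    merged = a + b                      # label of the new leaf (illustrative)
    rest = [t for t in taxa if t not in (a, b)]
    reduced = {frozenset((i, j)): delta[frozenset((i, j))]
               for i, j in combinations(rest, 2)}
    for i in rest:
        reduced[frozenset((i, merged))] = 0.5 * (
            delta[frozenset((i, a))] + delta[frozenset((i, b))]
            - delta[frozenset((a, b))])
    return (a, b), rest + [merged], reduced

taxa = ["a", "b", "c", "d", "e"]
raw = {("a", "b"): 5, ("a", "c"): 4, ("a", "d"): 6, ("a", "e"): 7,
       ("b", "c"): 7, ("b", "d"): 9, ("b", "e"): 10,
       ("c", "d"): 6, ("c", "e"): 7, ("d", "e"): 5}
delta = {frozenset(pair): dist for pair, dist in raw.items()}

q_values = q_criterion(delta, taxa)
joined, new_taxa, new_delta = nj_step(delta, taxa)
closest_pair = min(raw, key=raw.get)    # smallest raw distance, for contrast
```

In this example the smallest raw distance is between a and c, which do not form a cherry, while the Q-criterion is minimized by the cherry (d, e). The reduced distances from the new leaf agree with the distances to the vertex where d and e meet.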

Although NJ was originally created as a way to approximate OLS+ME, Gascuel and Steel showed [10] that NJ is a greedy agglomerative implementation of BME. The relationship between NJ and BME is bolstered in [13], where it is shown that BME also has $L_\infty$ radius $\frac{1}{2}$, while OLS+ME has $L_\infty$ radius tending to $0$ as the number of taxa $n$ increases to infinity. These are the only $L_\infty$ radii that have been precisely computed for ME methods. Given the close ties between NJ and BME, and Bryant's result on the uniqueness of NJ, it is natural to wonder if BME is somehow unique in the class of ME methods. The main result of this paper is the following affirmative result:

Theorem 0.3. BME is the only ME method with $L_\infty$ radius $\frac{1}{2}$.

Proof

We begin with some standard definitions. Let $X$ be a set of taxa, $|X| = n$. A phylogenetic $X$-tree is an unrooted trivalent tree whose leaves are labelled bijectively with the elements of $X$. A split of $X$ is a partition of $X$ into two nonempty pieces. Each edge $e$ in a tree $T$ gives rise to a split obtained by removing $e$ and looking at the taxa in the two disconnected components of $T$. A clade of $T$ is a subset of $X$ which is a component of a split obtained in this way. If $A$, $B$ are disjoint clades in $X$, let $\sigma_{A,B}$ be the dissimilarity map given by

\[ \sigma_{A,B}(i,j) = \begin{cases} 1 & \text{if } |\{i,j\} \cap A| = |\{i,j\} \cap B| = 1, \\ 0 & \text{otherwise.} \end{cases} \]

If $e$ is an edge in $T$ that gives the $X$-split $A \mid B$, let $\sigma_e$ denote $\sigma_{A,B}$. A dissimilarity on a set $X$ is a symmetric matrix $\delta$ whose diagonal elements are $0$. If $T$ is an $X$-tree, we say $\delta$ is $T$-additive if there is a positive function $w : E(T) \to \mathbb{R}_+$ such that

\[ \delta_{ij} = \sum_{e \in P_T(i,j)} w_e, \]

where $P_T(i,j)$ is the path between $i$ and $j$.

We now begin the proof. Suppose $\phi$ is a consistent ME method and let $l_T = \phi(T)$. By a classic result [8] the map $\delta$ is $T$-additive if and only if it has the form $\delta = \sum_{e \in E(T)} w_e \sigma_e$ for some positive $w_e$. For such $\delta$,

\[ \sum_e w_e = l_T(\delta) = \sum_e w_e\, l_T(\sigma_e). \]

This must hold for arbitrary $w_e$, so $l_T(\sigma_e) = 1$ for all $e \in E(T)$. Now suppose $S$ is an $X$-split that is not in $T$, and let $T'$ be a tree that contains this split. If $l_{T'} = \phi(T')$ and $\delta' = \sigma_S$, then $\delta'$ is $T'$-additive and not $T$-additive, so by consistency $l_T(\delta') > l_{T'}(\delta')$, which implies $l_T(\sigma_S) > 1$. Following [3], let $U(T)$ denote the set of linear forms $l$ such that $l(\sigma_S) \geq 1$ for each $X$-split $S$, with equality if and only if $S$ lies in $T$. Then we have shown:

Lemma 0.4. $\phi$ is consistent if and only if $\phi(T) \in U(T)$ for all $T \in \mathcal{T}$.
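The membership condition in Lemma 0.4 can be checked by brute force on a small case. The Python sketch below does this for the BME linear form on one five-taxon tree, using coefficients $2^{1-k}$ for a pair joined by a path of $k$ edges. The function names, the encoding of edges as one side of their split, and the particular tree are assumptions of this sketch, and a check on one tree is of course not a proof.

```python
from fractions import Fraction
from itertools import combinations

def bme_coefficients(tree_splits, n):
    """c_ij = 2^(1 - k), where k counts the tree edges separating i and j."""
    coeff = {}
    for i, j in combinations(range(n), 2):
        path_len = sum(1 for side in tree_splits if (i in side) != (j in side))
        coeff[(i, j)] = Fraction(2) ** (1 - path_len)
    return coeff

def normalise(side, n):
    """Represent a split by the side that contains taxon 0."""
    side = frozenset(side)
    return side if 0 in side else frozenset(range(n)) - side

def all_splits(n):
    """Every bipartition of {0, ..., n-1} into two nonempty parts."""
    for r in range(n - 1):
        for extra in combinations(range(1, n), r):
            yield frozenset((0,) + extra)

def evaluate(coeff, side):
    """l(sigma_S): sum of the coefficients of pairs separated by the split."""
    return sum(c for (i, j), c in coeff.items() if (i in side) != (j in side))

N = 5
# Tree with cherries {0, 1} and {3, 4}: five pendant and two internal edges.
TREE = [{0}, {1}, {2}, {3}, {4}, {0, 1}, {3, 4}]

coeff = bme_coefficients(TREE, N)
in_tree = {normalise(s, N) for s in TREE}
values = {s: evaluate(coeff, s) for s in all_splits(N)}

# Lemma 0.4 condition: value 1 on the splits of the tree, larger elsewhere.
lemma_holds = all((v == 1) if s in in_tree else (v > 1)
                  for s, v in values.items())
```

Exact rational arithmetic is used so that the equality case $l(\sigma_S) = 1$ is tested without rounding. For instance, the split $\{0, 2\} \mid \{1, 3, 4\}$, which is not in the tree, evaluates to $\frac{3}{2}$.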

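Figure 0.1. Three trees $T$, $T'$, $T''$ that are nearest-neighbor interchanges of each other.

We must show that if $\phi$ has radius $\frac{1}{2}$ then it is BME. First, we assume only that $\phi$ is consistent. Fix $T \in \mathcal{T}$ and let $l = \phi(T)$. If $e_1, \dots, e_k \in E(T)$ form a path with ends determining disjoint clades $A$ and $B$, then $\sigma_{e_1 \dots e_k}$ represents $\sigma_{A,B}$.

Lemma 0.5. Let $e$ be an internal edge of $T$ and let $A$, $B$, $C$, $D$ be the disjoint clades obtained by removing the four edges adjacent to $e$, with $A \cup B \mid C \cup D$ the split corresponding to $e$, as in Figure 0.1. Then $l(\sigma_{A,C}) = l(\sigma_{B,D})$ and $l(\sigma_{A,D}) = l(\sigma_{B,C})$.

Proof. Let $v$ be an internal vertex with edges $e_1, e_2, e_3$. We have

\[ 1 = l(\sigma_{e_i}) = l(\sigma_{e_i e_j}) + l(\sigma_{e_i e_k}) \quad \text{for } \{i, j, k\} = \{1, 2, 3\}. \]

This is a system of three equations and three unknowns, and solving shows $l(\sigma_{e_i e_j}) = \frac{1}{2}$. Writing $e_A$ for the edge adjacent to $e$ that cuts off the clade $A$, and applying this to $e_A e$, gives

\[ \tfrac{1}{2} = l(\sigma_{A, C \cup D}) = l(\sigma_{A,C}) + l(\sigma_{A,D}). \]

Similarly,

\[ \tfrac{1}{2} = l(\sigma_{B, C \cup D}) = l(\sigma_{B,C}) + l(\sigma_{B,D}), \]
\[ \tfrac{1}{2} = l(\sigma_{C, A \cup B}) = l(\sigma_{A,C}) + l(\sigma_{B,C}), \]
\[ \tfrac{1}{2} = l(\sigma_{D, A \cup B}) = l(\sigma_{A,D}) + l(\sigma_{B,D}). \]

Simple manipulation then proves the lemma.

Two trees $T$, $T'$ are separated by a nearest neighbor interchange (NNI) if one can be obtained from the other by transposing two subtrees that are precisely three edges apart. Pick an interior edge $e \in E(T)$ and label the clades given by the edges adjacent to $e$ in this way, so $T$ has the split $A \cup B \mid C \cup D$. Let $T'$ be the tree obtained by a NNI so that $T'$ has the split $A \cup C \mid B \cup D$, as in Figure 0.1, and take $l' = \phi(T')$. Now suppose $\delta$ is a $T$-additive distance matrix $\delta = \sum_{e'} w_{e'} \sigma_{e'}$. Then $l(\delta) = \sum_{e'} w_{e'}$. Note that for each $e' \in E(T)$ with $e' \neq e$, the split induced by $e'$ in $T$ is also a split in $T'$. So $l'(\sigma_{e'}) = 1$ for all $e' \neq e$, and

\[
\begin{aligned}
l'(\delta) - l(\delta) &= \Big( w_e\, l'(\sigma_e) + \sum_{e' \neq e} w_{e'}\, l'(\sigma_{e'}) \Big) - \sum_{e'} w_{e'}\, l(\sigma_{e'}) \\
&= w_e \big( l'(\sigma_e) - 1 \big) \\
&= w_e \big( l'(\sigma_{A,C}) + l'(\sigma_{A,D}) + l'(\sigma_{B,C}) + l'(\sigma_{B,D}) - 1 \big) \\
&= 2 w_e\, l'(\sigma_{A,D}),
\end{aligned}
\]

by Lemma 0.5 applied to $T'$, together with $l'(\sigma_{A,C}) = l'(\sigma_{B,D}) = \frac{1}{2}$.

Write $c^T_{ij}$ for the coefficient of $\delta_{ij}$ in $\phi(T)$. Let $\tilde{\delta}$ be a dissimilarity map satisfying $\|\tilde{\delta} - \delta\|_\infty < \alpha\, w_{\min}$. Then

\[
\begin{aligned}
l'(\tilde{\delta}) - l(\tilde{\delta}) &= \big( l'(\tilde{\delta}) - l'(\delta) \big) - \big( l(\tilde{\delta}) - l(\delta) \big) + \big( l'(\delta) - l(\delta) \big) \\
&= \sum_{ij} c^{T'}_{ij} (\tilde{\delta} - \delta)_{ij} - \sum_{ij} c^{T}_{ij} (\tilde{\delta} - \delta)_{ij} + 2 w_e\, l'(\sigma_{A,D}) \\
&\geq - \sum_{ij} \big| c^{T'}_{ij} - c^{T}_{ij} \big|\, \alpha\, w_{\min} + 2 w_e\, l'(\sigma_{A,D}).
\end{aligned}
\]

Now assume $\phi$ has $L_\infty$ radius $\alpha$. Because equality can be achieved by taking $(\tilde{\delta} - \delta)_{ij} = -\alpha\, w_{\min} \operatorname{sgn}(c^{T'}_{ij} - c^{T}_{ij})$, the inequality $l'(\tilde{\delta}) - l(\tilde{\delta}) \geq 0$ gives

\[ 2 w_e\, l'(\sigma_{A,D}) \geq \alpha\, w_{\min} \sum_{ij} \big| c^{T'}_{ij} - c^{T}_{ij} \big|. \tag{0.5} \]

The sum in the right-hand side of (0.5) satisfies

\[ \sum_{ij} \big| c^{T'}_{ij} - c^{T}_{ij} \big| \geq \sum_{\substack{i \in A \\ j \in B}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| + \sum_{\substack{i \in A \\ j \in C}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| + \sum_{\substack{i \in B \\ j \in D}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| + \sum_{\substack{i \in C \\ j \in D}} \big| c^{T'}_{ij} - c^{T}_{ij} \big|. \tag{0.6} \]

Now

\[ \sum_{\substack{i \in A \\ j \in B}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| \geq \Big| \sum_{\substack{i \in A \\ j \in B}} \big( c^{T'}_{ij} - c^{T}_{ij} \big) \Big| = \big| l'(\sigma_{A,B}) - l(\sigma_{A,B}) \big| = \big| l'(\sigma_{A,B}) - \tfrac{1}{2} \big| = l'(\sigma_{A,D}). \]

Similar calculations show

\[ \sum_{\substack{i \in A \\ j \in C}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| \geq l(\sigma_{A,D}), \qquad \sum_{\substack{i \in B \\ j \in D}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| \geq l(\sigma_{A,D}), \qquad \sum_{\substack{i \in C \\ j \in D}} \big| c^{T'}_{ij} - c^{T}_{ij} \big| \geq l'(\sigma_{A,D}). \]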

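Substituting into (0.6) gives

\[ \sum_{ij} \big| c^{T'}_{ij} - c^{T}_{ij} \big| \geq 2\, l(\sigma_{A,D}) + 2\, l'(\sigma_{A,D}), \]

so (0.5) becomes

\[ 2 w_e\, l'(\sigma_{A,D}) \geq 2 \alpha\, w_{\min} \big( l(\sigma_{A,D}) + l'(\sigma_{A,D}) \big). \]

This must hold when $w_e = w_{\min}$, so we have $(1 - \alpha)\, l'(\sigma_{A,D}) \geq \alpha\, l(\sigma_{A,D})$. An identical argument gives $(1 - \alpha)\, l(\sigma_{A,D}) \geq \alpha\, l'(\sigma_{A,D})$. When $\alpha = \frac{1}{2}$ these two inequalities combine to give $l'(\sigma_{A,D}) = l(\sigma_{A,D})$. This implies equality holds in each of the inequalities above, so in particular

\[ c^{T'}_{ij} = c^{T}_{ij} \quad \text{for all } i, j \text{ such that } (\sigma_{A \cup D,\, B \cup C})_{ij} = 0. \]

Let $T''$ be the tree with split $A \cup D \mid B \cup C$ obtained by a single NNI from $T$ and let $l'' = \phi(T'')$. Since $T''$ can also be obtained from $T'$ by a NNI, arguments identical to the one above show $l''(\sigma_{A,B}) = l'(\sigma_{A,B})$ and $l''(\sigma_{A,C}) = l(\sigma_{A,C})$. Combining these three equations gives $l(\sigma_{A,C}) = l(\sigma_{A,D})$ and, since $l(\sigma_{A,C}) + l(\sigma_{A,D}) = \frac{1}{2}$, we have $l(\sigma_{A,C}) = l(\sigma_{A,D}) = \frac{1}{4}$.

We have shown that for $k = 1, 2, 3$, if $e_1, \dots, e_k \in E(T)$ form a path with ends determining disjoint clades $A$ and $B$, then $l(\sigma_{e_1 \dots e_k}) = 2^{1-k}$. We will prove this for all $k$ by induction. So let $T$, $T'$, $T''$ be the trees in Figure 0.2; note each can be obtained by an NNI from the other two. In $T$, the clades $F_1, \dots, F_{k-2}, E$ hang in this order off the path leaving $A$, and the far end of that path separates two clades $B$ and $C$; the trees $T'$ and $T''$ are obtained from $T$ by interchanging $E$ with $B$ and with $C$, respectively. By our inductive hypothesis

\[ l(\sigma_{A,\, B \cup C}) = l'(\sigma_{A,\, C \cup E}) = 2^{1-k}. \]

Since $c^{T'}_{ij} = c^{T}_{ij}$ for all $i \in A$, $j \in C$,

\[ l(\sigma_{A,C}) = l'(\sigma_{A,C}) \quad \text{and} \quad l(\sigma_{A,B}) = l'(\sigma_{A,E}). \]

Similar reasoning gives

\[ l''(\sigma_{A,E}) = l'(\sigma_{A,E}), \qquad l''(\sigma_{A,E}) = l(\sigma_{A,C}). \]

Combining these equalities shows $l(\sigma_{A,B}) = l(\sigma_{A,C})$, so $l(\sigma_{A,B}) = 2^{-k}$ and the induction is proved.

Finally, for any $i, j \in X$ let $e_1, \dots, e_k$ be the unique path between $i$ and $j$. Then we have shown $c^T_{ij} = l(\sigma_{e_1 \dots e_k}) = 2^{1-k}$, and the theorem is proved.

Concluding remarks

Finding the tree $T$ that minimizes $\phi_{BME}(T, \delta)$ is in general NP-hard to approximate [7], so ME methods are seldom used directly in practice. But BME has served as a theoretical guide to several distance-based algorithms [2, 10]. While the relationship between the robustness of an ME method and the robustness of an algorithm may be complex, Theorem 0.3 suggests that if such an algorithm is going to be based on a ME method, it's best to choose BME.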
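The role of the radius $\frac{1}{2}$ can be seen numerically on four taxa, where there are only three tree topologies. The Python sketch below is an illustration under stated assumptions rather than part of the argument: the function names, the edge lengths, the random seed and the sizes 0.499 and 0.51 are all choices made here. It evaluates the BME length $\sum_{ij} 2^{1-|P_T(i,j)|}\delta_{ij}$ of a perturbed tree metric on each topology and reports which topology is shortest.

```python
import random
from itertools import combinations

# The three quartet topologies; each edge is one side of its split.
QUARTETS = {
    "01|23": [{0}, {1}, {2}, {3}, {0, 1}],
    "02|13": [{0}, {1}, {2}, {3}, {0, 2}],
    "03|12": [{0}, {1}, {2}, {3}, {0, 3}],
}

def bme_length(splits, delta):
    """BME length: sum over pairs of 2^(1 - path length) * delta_ij."""
    total = 0.0
    for (i, j), d in delta.items():
        path_len = sum(1 for side in splits if (i in side) != (j in side))
        total += 2.0 ** (1 - path_len) * d
    return total

def best_tree(delta):
    """Topology with minimum BME length."""
    return min(QUARTETS, key=lambda name: bme_length(QUARTETS[name], delta))

def additive(splits, weights):
    """Tree metric: sum of the weights of the edges separating i and j."""
    return {(i, j): sum(w for side, w in zip(splits, weights)
                        if (i in side) != (j in side))
            for i, j in combinations(range(4), 2)}

weights = [3.0, 2.0, 4.0, 2.5, 1.0]      # the internal edge is the shortest
w_min = min(weights)
delta_t = additive(QUARTETS["01|23"], weights)

# Random perturbations strictly inside the radius w_min / 2.
rng = random.Random(0)
inside_ok = all(
    best_tree({p: d + rng.uniform(-0.499, 0.499) * w_min
               for p, d in delta_t.items()}) == "01|23"
    for _ in range(1000))

# A targeted perturbation of size 0.51 * w_min, just outside the radius.
r = 0.51 * w_min
push = {(0, 1): r, (2, 3): r, (0, 2): -r, (1, 3): -r}
outside = {p: d + push.get(p, 0.0) for p, d in delta_t.items()}
flipped = best_tree(outside)
```

Every perturbation inside the radius leaves the true topology 01|23 as the minimizer, while the targeted perturbation of size $0.51\, w_{\min}$ makes 02|13 strictly shorter. That perturbation lengthens the distances within the true cherries and shortens the distances within the cherries of the competing tree, which is the sign pattern used to obtain (0.5).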
Figure 0.2. Three trees $T$, $T'$, $T''$ used in the induction.

In Definition 0.1, we required that a ME method satisfy $\phi(T, \delta) = l$ when $\delta$ is a $T$-additive dissimilarity with length $l$. Although this normalization requirement holds for all GLS+ME methods, it seems unnatural in this broader context. We now drop this requirement, so in what follows a ME method is just a map $\phi : \mathcal{T} \to \mathcal{L}$.

Definition 0.6. Let $f$ be a function that assigns a real number to each $X$-split. For each $X$-tree $T$, let $U_f(T)$ be the set of linear forms $l$ such that $l(\sigma_S) \geq f(S)$ for each $X$-split $S$, with equality if and only if $S$ lies in $T$. We say a ME method $\phi$ is $f$-consistent if $\phi(T) \in U_f(T)$ for all $T \in \mathcal{T}$.

Our usual definition of consistency corresponds to the case $f \equiv 1$. An easy generalization of Lemma 0.4 shows

Lemma 0.7. If $\phi$ is statistically consistent, it is $f$-consistent for a unique function $f$.

Following the proof of Theorem 0.3, we obtain

Theorem 0.8. For each $f$, there is at most one $f$-consistent ME method with $L_\infty$ radius $\frac{1}{2}$.

The proof is constructive and the resulting ME methods are combinatorially interesting, but we do not have space to explore them here. We will note that there are some functions $f$ for which no $f$-consistent ME methods exist. For example, when $n = 4$, $X = \{A, B, C, D\}$, we have

\[ \sigma_{A \mid BCD} + \sigma_{B \mid ACD} + \sigma_{C \mid ABD} + \sigma_{D \mid ABC} = \sigma_{AB \mid CD} + \sigma_{AC \mid BD} + \sigma_{AD \mid BC}. \]

Every $X$-tree contains the four splits on the left-hand side and only one of the splits on the right-hand side. So if there exists an $f$-consistent ME method $\phi$, applying $\phi(T)$ to both sides for any $T$ gives

\[ f(A \mid BCD) + f(B \mid ACD) + f(C \mid ABD) + f(D \mid ABC) > f(AB \mid CD) + f(AC \mid BD) + f(AD \mid BC). \]

If this inequality is not satisfied then $U_f(T)$ is empty.

Let $P$ be the polytope in $\mathbb{R}^{\binom{n}{2}}$ given by

\[ P = \{ x \mid x \cdot \sigma_S \geq f(S) \ \ \forall \text{ splits } S \}. \]

These generalized ME methods can be thought of geometrically as selecting points from certain faces of $P$. Their combinatorics and geometry should be interesting objects of study.

References

1. Kevin Atteson, The performance of neighbor-joining methods of phylogenetic reconstruction, Algorithmica 25 (1999).
2. Magnus Bordewich and Radu Mihaescu, Accuracy guarantees for phylogeny reconstruction algorithms based on balanced minimum evolution, Algorithms in Bioinformatics (Vincent Moulton and Mona Singh, eds.), Lecture Notes in Computer Science, vol. 6293, Springer Berlin / Heidelberg.
3. David Bryant, On the uniqueness of the selection criterion in neighbor-joining, Journal of Classification 22 (2005), 3-15, 10.1007/s00357-005-0003-x.
4. L. L. Cavalli-Sforza and A. W. F. Edwards, Phylogenetic analysis: Models and estimation procedures, American Journal of Human Genetics 19 (1967).
5. François Denis and Olivier Gascuel, On the consistency of the minimum evolution principle of phylogenetic inference, Discrete Applied Mathematics 127 (2003), no. 1.
6. Richard Desper and Olivier Gascuel, Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting, Molecular Biology and Evolution 21 (2004).
7. Samuel Fiorini and Gwenaël Joret, Approximating the balanced minimum evolution problem, CoRR (2011).
8. Walter M. Fitch and Emanuel Margoliash, Construction of phylogenetic trees, Science 155 (1967), no. 3760.
9. Olivier Gascuel, David Bryant, and François Denis, Strengths and limitations of the minimum evolution principle, Systematic Biology 50 (2001), no. 5.
10. Olivier Gascuel and Mike Steel, Neighbor-Joining Revealed, Molecular Biology and Evolution 23 (2006), no. 11.
11. K. K. Kidd and L. A. Sgaramella-Zonta, Phylogenetic analysis: concepts and methods, Am J Hum Genet 23 (1971), no. 3.
12. Radu Mihaescu, Dan Levy, and Lior Pachter, Why neighbor-joining works, Algorithmica 54 (2009).
13. Fabio Pardi, Sylvain Guillemot, and Olivier Gascuel, Robustness of phylogenetic inference based on minimal evolution, Bulletin of Mathematical Biology 72 (2010).
14. Yves Pauplin, Direct calculation of a tree length using a distance matrix, Journal of Molecular Evolution 51 (2000), no. 1.
15. Andrey Rzhetsky and Masatoshi Nei, Theoretical foundation of the minimal-evolution method of phylogenetic inference, Molecular Biology and Evolution 10 (1993), no. 5.
16. Andrey Rzhetsky and Masatoshi Nei, A simple method for estimating and testing minimum-evolution trees, Molecular Biology and Evolution 9 (1992), no. 5.
17. Naruya Saitou and Masatoshi Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Molecular Biology and Evolution 4 (1987).
18. Charles Semple and Mike Steel, Cyclic permutations and evolutionary trees, Advances in Applied Mathematics 32 (2004).
19. James A. Studier and Karl J. Keppler, A note on the neighbor-joining method of Saitou and Nei, Molecular Biology and Evolution 5 (1988).
20. D. L. Swofford, G. J. Olsen, P. J. Waddell, and D. M. Hillis, Phylogenetic inference, Molecular Systematics (David M. Hillis, Craig Moritz, and Barbara K. Mable, eds.), Sinauer Associates, 1996.

Department of Mathematics, UC Berkeley
E-mail address: kleinman@math.berkeley.edu


A multi-neighbor-joining approach for phylogenetic tree reconstruction and visualization A multi-neighbor-joining approach for phylogenetic tree 525 A multi-neighbor-joining approach for phylogenetic tree reconstruction and visualization Ana Estela A. da Silva 3, Wilfredo J.P. Villanueva 1,

More information

A Geometric Approach to Graph Isomorphism

A Geometric Approach to Graph Isomorphism A Geometric Approach to Graph Isomorphism Pawan Aurora and Shashank K Mehta Indian Institute of Technology, Kanpur - 208016, India {paurora,skmehta}@cse.iitk.ac.in Abstract. We present an integer linear

More information

Coloring Vertices and Edges of a Path by Nonempty Subsets of a Set

Coloring Vertices and Edges of a Path by Nonempty Subsets of a Set Coloring Vertices and Edges of a Path by Nonempty Subsets of a Set P.N. Balister E. Győri R.H. Schelp April 28, 28 Abstract A graph G is strongly set colorable if V (G) E(G) can be assigned distinct nonempty

More information

Lecture 9 : Identifiability of Markov Models

Lecture 9 : Identifiability of Markov Models Lecture 9 : Identifiability of Markov Models MATH285K - Spring 2010 Lecturer: Sebastien Roch References: [SS03, Chapter 8]. Previous class THM 9.1 (Uniqueness of tree metric representation) Let δ be a

More information
