A CLASS OF GENERAL SUPERTREE METHODS FOR NESTED TAXA

Similar documents
1 PYTHAGORAS THEOREM 1. Given a right angled triangle, the square of the hypotenuse is equal to the sum of the squares of the other two sides.

A Lower Bound for the Length of a Partial Transversal in a Latin Square, Revised Version

Arrow s Impossibility Theorem

6.5 Improper integrals

Technische Universität München Winter term 2009/10 I7 Prof. J. Esparza / J. Křetínský / M. Luttenberger 11. Februar Solution

NON-DETERMINISTIC FSA

CS 491G Combinatorial Optimization Lecture Notes

Algorithms & Data Structures Homework 8 HS 18 Exercise Class (Room & TA): Submitted by: Peer Feedback by: Points:

Project 6: Minigoals Towards Simplifying and Rewriting Expressions

Section 1.3 Triangles

Matrices SCHOOL OF ENGINEERING & BUILT ENVIRONMENT. Mathematics (c) 1. Definition of a Matrix

Arrow s Impossibility Theorem

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER MACHINES AND THEIR LANGUAGES ANSWERS

Part 4. Integration (with Proofs)

Solutions for HW9. Bipartite: put the red vertices in V 1 and the black in V 2. Not bipartite!

p-adic Egyptian Fractions

Discrete Structures Lecture 11

CS 573 Automata Theory and Formal Languages

Chapter 3. Vector Spaces. 3.1 Images and Image Arithmetic

Introduction to Olympiad Inequalities

Counting Paths Between Vertices. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs

Computational Biology Lecture 18: Genome rearrangements, finding maximal matches Saad Mneimneh

A Study on the Properties of Rational Triangles

#A42 INTEGERS 11 (2011) ON THE CONDITIONED BINOMIAL COEFFICIENTS

Exercise sheet 6: Solutions

Lesson 2: The Pythagorean Theorem and Similar Triangles. A Brief Review of the Pythagorean Theorem.

Tutorial Worksheet. 1. Find all solutions to the linear system by following the given steps. x + 2y + 3z = 2 2x + 3y + z = 4.

Bisimulation, Games & Hennessy Milner logic

arxiv: v1 [math.gr] 11 Jan 2019

Lecture Notes No. 10

CS311 Computational Structures Regular Languages and Regular Grammars. Lecture 6

CIT 596 Theory of Computation 1. Graphs and Digraphs

Parse trees, ambiguity, and Chomsky normal form

The Word Problem in Quandles

12.4 Similarity in Right Triangles

Activities. 4.1 Pythagoras' Theorem 4.2 Spirals 4.3 Clinometers 4.4 Radar 4.5 Posting Parcels 4.6 Interlocking Pipes 4.7 Sine Rule Notes and Solutions

MAT 403 NOTES 4. f + f =

April 8, 2017 Math 9. Geometry. Solving vector problems. Problem. Prove that if vectors and satisfy, then.

Algorithm Design and Analysis

Hyers-Ulam stability of Pielou logistic difference equation

T b a(f) [f ] +. P b a(f) = Conclude that if f is in AC then it is the difference of two monotone absolutely continuous functions.

Graph Theory. Simple Graph G = (V, E). V={a,b,c,d,e,f,g,h,k} E={(a,b),(a,g),( a,h),(a,k),(b,c),(b,k),...,(h,k)}

AP Calculus BC Chapter 8: Integration Techniques, L Hopital s Rule and Improper Integrals

Bases for Vector Spaces

Finite State Automata and Determinisation

QUADRATIC EQUATION. Contents

Formal Languages and Automata

(a) A partition P of [a, b] is a finite subset of [a, b] containing a and b. If Q is another partition and P Q, then Q is a refinement of P.

Green s Theorem. (2x e y ) da. (2x e y ) dx dy. x 2 xe y. (1 e y ) dy. y=1. = y e y. y=0. = 2 e

System Validation (IN4387) November 2, 2012, 14:00-17:00

Algorithm Design and Analysis

Designing finite automata II

Intermediate Math Circles Wednesday 17 October 2012 Geometry II: Side Lengths

2.4 Linear Inequalities and Interval Notation

Free groups, Lecture 2, part 1

Discrete Structures, Test 2 Monday, March 28, 2016 SOLUTIONS, VERSION α

CHENG Chun Chor Litwin The Hong Kong Institute of Education

Proving the Pythagorean Theorem

Lecture 6: Coding theory

Global alignment. Genome Rearrangements Finding preserved genes. Lecture 18

THE PYTHAGOREAN THEOREM

INTEGRATION. 1 Integrals of Complex Valued functions of a REAL variable

Minimal DFA. minimal DFA for L starting from any other

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Pre-Lie algebras, rooted trees and related algebraic structures

= state, a = reading and q j

Lecture 3: Equivalence Relations

1 Nondeterministic Finite Automata

Vectors , (0,0). 5. A vector is commonly denoted by putting an arrow above its symbol, as in the picture above. Here are some 3-dimensional vectors:

CS241 Week 6 Tutorial Solutions

Probability. b a b. a b 32.

Logic Synthesis and Verification

Bravais lattices and crystal systems

TOPIC: LINEAR ALGEBRA MATRICES

Mid-Term Examination - Spring 2014 Mathematical Programming with Applications to Economics Total Score: 45; Time: 3 hours

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University

Chapter 4 State-Space Planning

Linear choosability of graphs

6.1 Definition of the Riemann Integral

Figure 1. The left-handed and right-handed trefoils

Symmetrical Components 1

Section 4.4. Green s Theorem

Coalgebra, Lecture 15: Equations for Deterministic Automata

CM10196 Topic 4: Functions and Relations

Electromagnetism Notes, NYU Spring 2018

Convert the NFA into DFA

Graph States EPIT Mehdi Mhalla (Calgary, Canada) Simon Perdrix (Grenoble, France)

CS 275 Automata and Formal Language Theory

Suffix Trays and Suffix Trists: Structures for Faster Text Indexing

For a, b, c, d positive if a b and. ac bd. Reciprocal relations for a and b positive. If a > b then a ab > b. then

ILLUSTRATING THE EXTENSION OF A SPECIAL PROPERTY OF CUBIC POLYNOMIALS TO NTH DEGREE POLYNOMIALS

Linear Algebra Introduction

Topologie en Meetkunde 2011 Lecturers: Marius Crainic and Ivan Struchiner

Quadratic Forms. Quadratic Forms

Alpha Algorithm: Limitations

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

Descriptional Complexity of Non-Unary Self-Verifying Symmetric Difference Automata

Exercise 3 Logic Control

arxiv: v2 [math.co] 31 Oct 2016

Lecture 08: Feb. 08, 2019

Transcription:

A CLASS OF GENERAL SUPERTREE METHODS FOR NESTED TAXA PHILIP DANIEL AND CHARLES SEMPLE Astrt. Amlgmting smller evolutionry trees into single prent tree is n importnt tsk in evolutionry iology. Trditionlly, the (supertree) methods used for this mlgmtion tke olletion of lef-lelled trees s their input. However, it hs een reently highlighted tht, in prtie, suh n input is somewht limiting nd tht one would like supertree methods for olletions of trees in whih some of the interior verties, s well s ll of the leves, re lelled [10]. In this pper, we desrie wht ppers to e the first pproh for onstruting suh methods nd show tht ny method using this pproh stisfies prtiulr desirle properties. 1. Introdution In evolutionry iology, supertree methods hve eome fundmentl proess for onstruting n evolutionry tree tht est represents the informtion exhiited y the originl input. These methods mlgmte n input olletion of smller evolutionry trees on overlpping sets of tx into single prent tree lled supertree. The inresing populrity of supertree methods is highlighted y reent survey [3] nd pulished ook [4]. If the input olletion of trees rries no onfliting informtion, then one would like the resulting supertree to preserve ll of the nestrl reltionships displyed y eh of the trees in this olletion. For olletions of rooted phylogeneti trees, there is polynomil-time lgorithm tht finds suh tree. In prtie, however, inomptiility is more ommon nd so one seeks method tht resolves these onflits in sensile wy, while still produing supertree tht hs numer of ttrtive properties. The following list of desirle properties for ny supertree method pplied to olletion P of rooted phylogeneti trees is given in [12]: (i) The method runs in polynomil time in the size of the input. (ii) The resulting supertree displys ll rooted inry sutrees shred y ll of the trees in P. Dte: 18 Jnury 2005. 1991 Mthemtis Sujet Clssifition. 05C05; 92D15. Key words nd phrses. Rooted phylogeneti tree; Rooted semi-lelled tree; Nested tx; Supertree method; Supertree. The first uthor ws supported y the New Zelnd Institute of Mthemtis nd its Applitions funded progrmme Phylogeneti Genomis nd the seond uthor ws supported y the New Zelnd Mrsden Fund (UOC310). 1

2 PHILIP DANIEL AND CHARLES SEMPLE (iii) If P is omptile, then the resulting supertree displys eh of the trees in P. (iv) The method stisfies the following two nturl symmetry properties of ordering nd renming: () The resulting supertree is independent of the order in whih the memers of P re listed. () If we renme ll the speies, nd then pply the method to this new olletion of input trees, the resulting supertree tree is the one otined y pplying the method to the originl olletion P, ut with the speies renmed s efore. (v) The method llows possile weighting of the trees in P. To dte, the lgorithms MinCutSupertree [12] nd its modified version [9] re the only two supertree methods for rooted phylogeneti trees tht hve een shown to stisfy ll of the ove properties. We remrk here tht (iv) my seem trivil to stisfy ut, for olletions of unrooted phylogeneti trees, it hs een shown tht no supertree method for suh olletions n simultneously stisfy (iii) nd oth prts of (iv) [14]. In this pper, we present generl supertree method for olletions of rooted semi-lelled trees; tht is, rooted trees in whih some (possily none) of the interior verties s well s ll of the leves re lelled. Mking the extension from rooted phylogeneti trees to rooted semi-lelled trees mens tht we llow nested tx in the input. In prtiulr, the interior lels represent tx t higher txonomi level thn ny of their desendnts. For exmple, fmilies versus gener or gener versus speies. One of the min fetures of this supertree method is tht it purposely llows for the possiility of vrints. Indeed, provided the input stisfies two nturl nestor-desendnt pirwise properties, ny suh vrint onstruted from it stisfies ll of the rooted semi-lelled tree nlogues of the desirle properties (i) (iii) ove. Moreover, lthough the rooted semi-lelled tree nlogues of (iv) nd (v) re dependent on the onstruted vrint, stisfying these dditionl properties is not diffiult. We highlight this with n exmple of suh n lgorithm. To the est of our knowledge, this is the first time suh supertree methods for rooted semi-lelled trees hve een onsidered. The next setion ontins some further kground nd neessry preliminries for the rest of the pper. 2. Bkground nd Preliminries Throughout this pper, we will ssume tht the reder hs some fmilirity with the sis of phylogenetis. Unless otherwise stted, the nottion nd terminology follows Semple nd Steel [13]. A rooted phylogeneti tree (on X) is n ordered pir (T; φ) onsisting of rooted tree T in whih ll interior verties hve degree t lest three exept the root whih hs degree t lest two nd ijetive mp φ from X to the lef set of T. Rooted phylogeneti trees on X re lso lled rooted phylogeneti X-trees. Loosely speking, rooted phylogeneti X-tree is rooted tree whose leves re ijetively

SUPERTREE METHODS FOR NESTED TAXA 3 e e d d Figure 1. A olletion P of rooted semi-lelled trees. lelled with the elements of X. The leftmost tree in Fig. 1 is n exmple of rooted phylogeneti tree, where X = {,,, d}. Let T e rooted phylogeneti tree on X, nd let X e suset of X. The restrition of T to X is the rooted phylogeneti tree tht is otined from the miniml rooted sutree of T indued y the elements of X y suppressing ll non-root verties of degree two. This restrition is denoted y T X. We sy tht T displys rooted phylogeneti X-tree T if, up to isomorphism, T X is refinement of T. Intuitively, T displys T if T preserves ll of the nestrl reltionships desried y T. The reson for llowing refinement is tht, from iologil viewpoint, verties of outdegree t lest three usully represent n unertinty to the ext order of speition s oppose to multiple speition event. A olletion P of rooted phylogeneti trees is sid to e omptile if there exist rooted phylogeneti tree tht displys eh of the trees in P. Agin, intuitively, P is omptile if it rries no onfliting informtion. Trditionlly, supertree methods hve een pplied to rooted phylogeneti trees. One of the first suh methods is Build [2]. This polynomil-time lgorithm tkes olletion of rooted phylogeneti trees nd determines if they re omptile, in whih se it outputs tree tht displys eh of the trees in the olletion. Algorithms like Build re ll-or-nothing lgorithms s they only return tree if the input dt meets some riteri. However, despite this limittion, suh lgorithms give vlule insight into more generl supertree methods. Indeed, the lgorithm MinCutSupertree nd its modified version is sed on Build. For nested tx, the nlogues of rooted phylogeneti trees nd omptiility re rooted semi-lelled trees nd nestrl omptiility. A rooted semi-lelled tree (on X) is n ordered pir (T; φ) onsisting of rooted tree T with vertex set V nd root vertex ρ, nd mp φ : X V with the properties tht, for ll v V {ρ} of degree t most two, v φ(x) nd, if ρ hs degree zero or one, then ρ φ(x). Rooted semi-lelled trees on X re lso lled rooted X-trees. Furthermore, if φ is one-to-one, then (T; φ) is sid to e singulrly lelled. Oserve tht the definition of rooted semi-lelled trees extends the definition of rooted phylogeneti trees y llowing (i) some of the interior (non-lef) verties s well s ll of the leves to e lelled y the elements of X nd (ii) verties my e llelled y more thn one element of X. Exmples of rooted semi-lelled trees tht re singulrly lelled re shown in Fig. 1.

4 PHILIP DANIEL AND CHARLES SEMPLE Let X X nd let, X. A rooted X -tree T nestrlly displys rooted X-tree T if T X refines T so tht, whenever is strit desendnt of in T, is strit desendnt of in T X. The forml definition of strit desendnt is given t the end of this setion, ut intuition should suffie for the moment. A olletion P of rooted semi-lelled trees is nestrlly omptile if there is rooted semi-lelled tree T tht nestrlly displys eh of the trees in P, in whih se we sy tht T nestrlly displys P. Oserve tht if P onsists of rooted phylogeneti trees nd is omptile, then P is nestrlly omptile s none of the trees in P ontins ny interior lels. Conversely, suppose tht P is nestrlly omptile nd onsists of rooted phylogeneti trees. Let T e rooted semi-lelled tree tht nestrlly displys P. Let T e the rooted phylogeneti tree tht is otined from T y repling eh interior lel x with pendnt edge joining the interior vertex previously leled y x nd lelling the other end-vertex x. It is now esily heked tht T displys P. Pge [10] reently motivted the prolem of developing supertree methods for nested tx nd initilly posed the prolem of onstruting polynomil-time lgorithm for determining the nestrl omptiility of n ritrry olletion of rooted semi-lelled trees. In nswer to this prolem, Dniel nd Semple [7] presented n lgorithm lled AnestrlBuild. Anlogous to Build, this polynomil-time lgorithm is n ll-or-nothing lgorithm nd determines if olletion P of rooted semi-lelled trees re nestrlly omptile, in whih se it outputs rooted semi-lelled tree tht nestrlly displys P. With AnestrlBuild in hnd, the next nturl step forwrd is to onstrut more generl supertree method for rooted semi-lelled trees. In Setion 3 of this pper, we present supertree method for olletions of rooted semi-lelled trees tht re singulrly lelled. Clled NestedSupertree, this method either outputs rooted semi-lelled tree, or sttement inditing tht either there is pir of tx tht re not pirwise onsistent or there is n nestor-desendnt ontrdition. Stritly speking, this is still n ll-or-nothing lgorithm. However, suh n inonsisteny or ontrdition is very prtiulr, nd one tht we elieve in prtie ould e resolved seprtely. Bsed on Anestrl- Build, one of the ttrtions of NestedSupertree is tht it is le to e esily refined to give rise to numer of possile vrints eh of whih is supertree method for rooted semi-lelled trees tht re singulrly lelled. Moreover, we show in Setions 3 nd 4 tht ny suh vrint stisfies ll of the rooted semilelled tree nlogues of properties (i) (iii) in the introdution. Furthermore, in Setion 5, we desrie one prtiulr vrint where the rooted semi-lelled trees in the input re weighted. In ddition to (i) (iii), the resulting lgorithm stisfies the rooted semi-lelled tree nlogues of (iv) nd (v). The restrition to olletions of rooted semi-lelled trees tht re singulrly lelled is for simpliity nd funtionlity (see remrks in Setion 3). Indeed, in prtie, this is not muh of restrition s rooted semi-lelled trees re typilly singulrly lelled. In the lst setion of the pper, Setion 6, we onsider wht hppens when NestedSupertree is pplied to olletion P of rooted phylogeneti trees. In this se, the minor onditions on P referred to ove re redundnt nd tht NestedSupertree pplied to P lwys returns rooted phylogeneti tree. We show tht if

SUPERTREE METHODS FOR NESTED TAXA 5 P is omptile, then the rooted phylogeneti tree returned y NestedSupertree is the sme s tht returned y Build. Thus NestedSupertree is generlistion of Build. In ft, s we will see, it lso generlises AnestrlBuild in orresponding wy. Before ending this setion with some preliminries we mke two omments. Firstly, in ddition to the properties listed in the introdution, one other property is given in [12]. This property sys tht the resulting supertree displys ll nestings shred y ll of the trees in P, where one suset of the lels in P nests in nother if the most reent ommon nestor of the former is strit desendnt of the most reent ommon nestor of the ltter. It hs een reently shown y Willson [15] tht the proof in [12] tht estlishes MinCutSupertree hs this property is inorret nd, in ft, tht MinCutSupertree does not hve this property. (We note tht if one dds the ondition tht A is suset of B in the sttement ssoited with this proof, then the proof is orret nd Min- CutSupertree is gurnteed to hve the nesting property provided the first set of lels is suset of the other). Whether displying ll shred nestings of the input olletion is desirle property is detle. We simply note here tht NestedSupertree lso does not hve this property. For the urious reder, there is generl supertree method for olletions P of rooted phylogeneti trees tht stisfies this nesting property s well s the properties listed in the introdution. In prtiulr, first use the Build lgorithm to either produe supertree tht displys P, in whih se the supertree method outputs this tree, or reognise tht P is not omptile. If the ltter hppens, onstrut the Adms onsensus tree T (see [1] or [13]) for the set P of rooted phylogeneti trees otined from P y restriting eh tree to the suset of lels of P tht re ommon to eh tree in P. This tree displys ll of the nestings shred y ll of the trees in P nd hene P. Now, for eh remining lel in P, djoin to the root of T with distint new edge. The supertree method outputs the resulting tree. The seond omment is tht the pproh tken y NestedSupertree nd the pproh of MinCutSupertree re very different. A omprison etween these two methods for rooted phylogeneti trees would mke n interesting projet. Finlly, some preliminries. Typilly, one views rooted tree s n undireted grph. However, it will often e onvenient in this pper to view rooted tree s direted grph where eh edge is repled with n r direted wy from the root. Now let T = (T; φ) e rooted semi-lelled tree on X. The set X is lled the lel set of T nd the elements of X re lled lels. We lso use L(T ) to denote the lel set of T. If v is vertex of T, we sy tht the elements of φ 1 (v) lel v. Furthermore, T is fully lelled if every vertex of T is lelled y n element of X. For olletion P of rooted semi-lelled trees, we denote the union of the lel sets of the trees in P y L(P). Moreover, we ll n element x of L(P) ommon if x T P L(T ). There is nturl nd useful prtil order on the lel set L(T ) of rooted semi-lelled tree T = (T; φ). This prtil order is otined y setting T if the pth from the root of T to φ() inludes φ(), in whih se we sy tht is desendnt of. If < T, then we sy tht is strit desendnt of. Furthermore,, L(T ) re not omprle under T if neither T nor

6 PHILIP DANIEL AND CHARLES SEMPLE T holds. Essentilly, nd re not omprle in T if is not desendnt of, nd is not desendnt of. In Fig. 1, e nd re not omprle in the middle tree, ut is (strit) desendnt of e in the rightmost tree. Lstly, rooted triple is rooted phylogeneti tree tht hs two interior verties nd whose lel set hs size three. We denote the rooted triple T with lel set {,, } y if the pth from to does not interset the pth from the root to. For olletion P of rooted semi-lelled trees, rooted triple whose lel set {,, } is suset of T P L(T ) is ommon reltive to P if, for ll T 1, T 2 P, T 1 {,, } is isomorphi to T 2 {,, }. Note tht none of,, need lel lef of T 1 or T 2. The rooted triple is ommon to the three rooted semi-lelled trees shown in Fig. 1. 3. The Algorithm NestedSupertree For olletion P of rooted semi-lelled trees tht re singulrly lelled, the lgorithm NestedSupertree pplied to P is sed on prtiulr onstrution nd two grphs. We desried the onstrution first nd then the two grphs. Let T = (T; φ) e rooted semi-lelled tree on X, where T hs vertex set V. We sy tht rooted fully-lelled tree T 1 = (T; φ 1 ) on X 1, where X X 1, hs een otined from T y dding distint new lels if, for ll distint u, v V, the following properties re stisfied: (1) If φ 1 (v) is non-empty, then φ 1 1 (v) = φ 1 (v). (2) If φ 1 (v) is empty, then φ 1 1 (v) = 1. (3) If φ 1 (u) nd φ 1 (v) re oth empty, then φ 1 1 (u) φ 1 1 (v). Intuitively, T 1 hs een otined from T y dding distint new lel to eh nonlelled vertex of T. For olletion P of rooted semi-lelled trees, we sy tht P 1 hs een otined from P y dding distint new lels if it hs een otined y dding distint new lels to eh tree in P so tht no pir of dded lels re the sme. Although NestedSupertree is pplied to P, ll of the work in the method goes into onstruting supertree for olletion of rooted fully-lelled trees tht hs een otined from P y dding distint new lels. We now desrie the two grphs eh of whih onsists of oth rs (direted edges) nd edges. For the purposes of this pper nd to void onfusion, we will ll grph tht ontins oth rs nd edges mixed grph. Let P e olletion of rooted semi-lelled trees nd let P e olletion of rooted fully-lelled trees otined from P y dding distint new lels. The desendny grph of P, denoted D(P ), is the mixed grph whose vertex set is L(P ), whose r set is nd whose edge set is {(, ) : < T for some T P }, {{, } : is not omprle to under T for some T P }.

SUPERTREE METHODS FOR NESTED TAXA 7 v 2 u 3 e e u 1 u 2 d v 1 d w 1 Figure 2. A olletion P of rooted fully-lelled trees. The desendny grph is sid to e yli if, ignoring edges, it hs no direted yles. The seond grph D (P ) is otined from the desendny grph D(P ) of P s follows. For eh ommon rooted triple 1 2 of P, dd new vertex lelled 1 2, new r from 1 2 to 1, nd new r from 1 2 to 2. Verties of the form 1 2 re lled rooted triple verties of D (P ); ll other verties of D (P ) re lled lel verties. We ll D (P ) the modified desendny grph of P. In generl, let G e mixed grph nd let G e the direted grph otined from G y deleting ll of the edges in the edge set of G. Thus the r set of G is equl to the r set of G. A vertex v of G hs indegree zero if v hs indegree zero in G. Similrly, suset of the vertex set of G is the vertex set of n r omponent of G if it is the vertex set of omponent of G. Furthermore, for suset V 1 of the vertex set of G, the restrition of G to V 1 is the sugrph of G tht is otined y deleting ll verties not in V 1 together with their inident edges nd rs. This restrition is denoted y G V 1. Exmple 3.1. To illustrte the ove onstrution nd mixed grphs, let P e the olletion of rooted semi-lelled trees shown in Fig. 1 nd let P e the olletion of rooted fully-lelled trees otined from P y dding distint new lels s shown in Fig. 2. The modified desendny grph of P is shown in Fig. 3 where, for simpliity, the edges s well s the rs (, ) where is not n immedite desendnt of re omitted. If these edges were inluded, there would, for exmple, e n edge joining the lel verties w 1 nd s they re not omprle in the rightmost tree of Fig. 2. Furthermore, to highlight the one rooted triple vertex, its outgoing rs re drwn s dshed rrows. This exmple will e referred to lter in this setion nd lso in Setion 5. Lstly, let P e olletion of rooted semi-lelled trees tht re singulrly lelled, nd let nd e elements of L(P). We sy tht nd re pirwise onsistent if, whenever is strit desendnt of in some tree in P, is lwys strit desendnt of in every tree of P whose lel set ontins oth nd. Furthermore, P is sid to e pirwise onsistent if ll pirs of lels in L(P) re pirwise onsistent.

8 PHILIP DANIEL AND CHARLES SEMPLE e u 1 w 1 d u 3 v 1 u 2 v 2 Figure 3. The modified desendny grph of P. We now desrie NestedSupertree nd its suroutine Desendnt. An illustrtive exmple nd some informtive remrks follow these desriptions. In rief, NestedSupertree onstruts rooted semi-lelled tree y strting t the root nd working downwrds towrds the leves. The min workings of the method is ontined within suroutine lled Desendnt. This suroutine uses suessive restritions of ertin modified desendny grph to determine how this rooted semi-lelled tree is onstruted. Algorithm: NestedSupertree(P) Input: A olletion P of rooted semi-lelled trees tht re singulrly lelled. Output: A rooted semi-lelled tree T with lel set L(P), the sttement P is not pirwise onsistent, or the sttement P hs n nestor-desendnt ontrdition. 1. For eh pir, P, hek tht nd re pirwise onsistent. If not, then hlt nd return P is not pirwise onsistent. 2. Construt olletion P of rooted fully-lelled trees from P y dding distint new lels. 3. Construt the desendny grph D(P ) of P. 4. If D(P ) hs direted yle, then hlt nd return P hs n nestor-desendnt ontrdition. 5. Construt the modified desendny grph D (P ) of P. 6. Cll the suroutine Desendnt(D (P ), v ). 7. Remove the dded lels from T (the rooted semi-lelled tree outputted y Desendnt), suppress ny resulting unlelled vertex tht hs indegree one nd outdegree one nd, if the root is unlelled nd hs degree one, relote the root to the nerest vertex tht is either lelled or hs outdegree t lest two. Return the resulting rooted semi-lelled tree T. Algorithm: Desendnt(D (P ), v ) Input: A grph D (P ). Output: A rooted fully-lelled tree T with root vertex v.

SUPERTREE METHODS FOR NESTED TAXA 9 v 2, u 3 e, u 1, u 2 e w 1 v 1 d d () () Figure 4. () One possile output of Desendnt when pplied to D (P ) nd () the orresponding output of NestedSupertree. 1. Let S 0 denote the set of lel verties of D (P ) tht hve indegree zero nd no inident edges. 2. If S 0 is empty, then hoose S 0 to e ny non-empty suset of lel verties of D (P ) tht hve indegree zero. 3. Delete the elements of S 0 (nd their inident rs nd their inident edges) from D (P ). Furthermore, for eh ommon rooted triple 1 2 of P, delete the rooted triple vertex 1 2 if, in the resulting mixed grph, the r omponent ontining 1 nd 2 does not ontin the lel vertex. 4. Let S 1, S 2,..., S k denote the vertex sets of the r omponents of the grph otined t the end of Step 3. 5. For eh element i {1, 2,..., k}, ll Desendnt(D (P ) S i, v i ). Assign the lels in S 0 to v nd tth T i to v vi the edge {v i, v }. Exmple 3.2. As n exmple of NestedSupertree pplied to olletion of rooted semi-lelled trees tht re singulrly lelled, let P nd P e the olletions desried in Exmple 3.1. On the first itertion of Desendnt, the lel verties v 2 nd u 3 in the modified desendny grph D (P ) hve indegree zero nd no inident edges, nd no other lel verties hve this property. Therefore, in this itertion, S 0 = {v 2, u 3 }. Furthermore, the grph otined from D (P ) y deleting the elements of S 0 hs extly one r omponent. In the seond itertion, the lel verties of the inputted grph tht hve indegree zero re e, u 1, nd u 2, nd eh of these hve n inident edge. Therefore, in this itertion, we n hoose ny non-empty suset of {e, u 1, u 2 } to e S 0. If we hoose S 0 to e the whole set, then, in ll susequent itertions of the lgorithm, there is lwys non-empty set of lel verties of the orresponding grph tht hve indegree zero nd no inident edges. By mking this hoie, Desendnt eventully returns the rooted fully-lelled tree shown in Fig. 4() nd Nested- Supertree returns the rooted semi-lelled tree shown in Fig. 4(). Remrks.

10 PHILIP DANIEL AND CHARLES SEMPLE 1. Oserve tht in Step 2 of Desendnt hoie n e mde on the mke-up of S 0. This is the prt of the lgorithm tht llows for vrints. One possile wy of mking this hoie is desried in Setion 5. Note tht, s we will soon see, if the suroutine is lled, ut Step 2 is never invoked, then the supertree returned y NestedSupertree nestrlly displys P. 2. One of the ttrtions of generl supertree lgorithm is tht onflits re resolved in some wy so tht one lwys outputs supertree whether or not the originl olletion of input trees is omptile. In the se the input is olletion of rooted phylogeneti trees, it is resonle tht ny supertree lgorithm resolves suh onflits. However, in the se the input is olletion of rooted semi-lelled trees, it ppers to us tht there re some fundmentl nestordesendnt onflits tht should e resolved seprtely. Two suh onflits re when P is not pirwise onsistent or hs n nestor-desendnt ontrdition. Finding suh onflits n e esily done in polynomil time. In the se of nestor-desendnt ontrditions, see the proof of Lemm 3.4. 3. The purpose of dding the rooted triple verties nd ssoited rs to the desendny grph of P is so tht ny tree outputted y NestedSupertree preserves ll of the ommon rooted triples of P. This property nd, in prtiulr, the rooted semi-lelled tree nlogue of desirle property (ii) is estlished in the next setion. 4. Proposition 3.6 shows tht, provided P is pirwise onsistent nd the desendny grph of P is yli, NestedSupertree returns rooted semi-lelled tree. Thus we n lwys find non-empty set S 0 s desried in Steps 1 nd 2 of the suroutine Desendnt. 5. Lstly, the hek for the pirwise onsisteny of P nd the restrition tht eh tree in the input olletion is singulrly lelled ould e removed from NestedSupertree. However, if either is done, then there is no gurntee tht the resulting supertree stisfies the rooted semi-lelled tree nlogue of (ii) in the introdution. The rest of this setion estlishes some si properties of NestedSupertree, in prtiulr, the rooted semi-lelled tree nlogues of (i) (Proposition 3.7) nd (iii) (Proposition 3.3). Further properties re estlished in the next setion. We egin y mking the following oservtion. Rell from the introdution tht AnestrlBuild is polynomil-time lgorithm tht determines if olletion of rooted semi-lelled trees is nestrlly omptile, in whih se suh tree is returned [7]. The desription of NestedSupertree losely resemles the desription of AnestrlBuild. Indeed, the ltter n e essentilly otined from the former s follows: remove Steps 1, 4, nd 5 in NestedSupertree; reple the modified desendny grph of P with the desendny grph of P in Desendnt; remove the seond sentene of Step 3 of Desendnt; nd, reple Step 2 of Desendnt If S 0 is empty, hlt nd return P is not nestrlly omptile, in whih se P is not nestrlly omptile. It follows tht NestedSupertree n e viewed s generlistion of AnestrlBuild. Indeed, we hve the following proposition.

SUPERTREE METHODS FOR NESTED TAXA 11 Proposition 3.3. Let P e olletion of rooted semi-lelled trees tht re singulrly lelled, nd suppose tht P is nestrlly omptile. Then NestedSupertree pplied to P returns rooted semi-lelled tree tht nestrlly displys P. Proof. Let P e olletion of rooted fully-lelled trees tht is otined from P y dding distint new lels. Sine P is nestrlly omptile, P is pirwise onsistent nd D(P ) hs no direted yles. It now follows from the desription of how AnestrlBuild n e otined from NestedSupertree tht NestedSupertree pplied to P returns rooted semi-lelled tree tht nestrlly displys P. The next two lemms re needed for the proofs of Propositions 3.6 nd 3.7. The first lemm is well-known nd n esy exerise. However, we inlude its proof s it indites how one n find ll of the nestor-desendnt ontrditions of olletion of rooted semi-lelled trees. Lemm 3.4. Let D e onneted digrph tht ontins no direted yle. Then there exists vertex of D whose indegree is zero. Proof. Assume no vertex of D hs indegree zero. Let D e the digrph otined from D y reversing the orienttion of the rs of D. By ssumption, every vertex of D hs outdegree t lest one. Let u e vertex of D. Strting t u, onstrut direted wlk. Sine eh vertex of D hs n outgoing r, we must eventully meet vertex on this wlk tht hs lredy een trversed. In prtiulr, this mens tht D ontins direted yle, whih in turn implies tht D ontins direted yle. This ontrdition ompletes the proof of the lemm. Lemm 3.5. Let P e olletion of rooted fully-lelled trees tht re singulrly lelled. Let T P L(T ), nd let x, y L(P). Suppose tht is pirwise onsistent with eh of the lels in L(P). Then the following hold. (i) If there is direted pth from to x in D(P), then there is n r from to x in D(P). Furthermore, if there is direted pth from x to in D(P), then there is n r from x to in D(P). (ii) Suppose tht (, x) is n r in D(P). If (y, x) is lso n r in D(P) nd y, then either there is n r from y to in D(P) or there is n r from to y in D(P). Proof. We first prove (i). Assume tht y 1 y 2 y k x is direted pth in D(P) from to x. As y 1 is strit desendnt of in some tree in P nd is pirwise onsistent with y 1, it follows tht whenever nd y 1 re lels of some tree in P, y 1 is strit desendnt of. Sine there is n r from y 1 to y 2, there is tree T 1 in P in whih y 2 is strit desendnt of y 1. Sine is lel of T 1, this implies tht y 2 is strit desendnt of in T 1, so, y definition, there is n r in D(P) from to y 2. Repeting this rgument for y 2 nd y 3, we dedue tht there is n r in D(P) from to y 3. Continuing in this wy, we eventully estlish tht there is

12 PHILIP DANIEL AND CHARLES SEMPLE n r in D(P) from to x. A similr rgument shows tht there is n r (x, ) in D(P) if there is direted pth in D(P) from x to. This estlishes (i). We now prove (ii). Sine (y, x) is n r of D(P), there is tree T in P for whih x is strit desendnt of y. But this mens tht, s (, x) is n r of D(P), is ommon lel nd is pirwise onsistent with x, nd ll trees in P re singulrly lelled, either is strit desendnt of y or y is strit desendnt of in T. In prtiulr, either (, y) or (y, ) is n r in D(P), respetively. Proposition 3.6. Let P e olletion of rooted semi-lelled trees tht re singulrly lelled nd let P e olletion of fully-lelled trees otined from P y dding distint new lels. If P is pirwise onsistent nd the desendny grph of P is yli, then NestedSupertree pplied to P returns rooted semi-lelled tree. Proof. Beuse of Lemm 3.4 nd the ft tht Desendnt suessively onsiders proper restritions of the modified desendny grph of P, it suffies to show tht one n lwys hoose non-empty set S 0 of lel verties in Steps 1 nd 2 t eh itertion of the suroutine Desendnt. To see this, suppose tht t some itertion of Desendnt the ssoited onneted restrition, D sy, of D (P ) hs no lel vertex of indegree zero. Let S e the set of lel verties of D in whih the only inoming rs re the ones oming from rooted triple verties. Sine ny restrition of the desendny grph of P is yli, it follows tht S is non-empty. Let 1 e n element of S. By the onstrution of the modified desendny grph, 1 is lel of ommon rooted triple, 1 2 sy, of P. Furthermore, 1 is not the only element of S; for otherwise, every lel vertex of D, inluding 2, would e strit desendnt of 1. It now follows tht, s D is onneted, there is lel vertex w tht lies in direted pth from 1 nd tht lso lies in direted pth from ommon lel vertex, x 1 sy, tht is distint from 1 nd is in S. By Lemm 3.5(i), this implies tht there exists tree T 1 in P in whih w is strit desendnt of 1 nd tree T 2 in P in whih w is strit desendnt of x 1. As 1 nd x 1 re not omprle in T 1, it follows tht w is not omprle to x 1 in T 1. But w is strit desendnt of x 1 in T 2, ontrditing the ssumption tht P is pirwise onsistent. We onlude tht t Steps 1 nd 2 of eh itertion of Desendnt, we n lwys find n pproprite non-empty set of lel verties. Proposition 3.7. Let P e olletion of rooted semi-lelled trees tht re singulrly lelled. Then the running time of NestedSupertree pplied to P is polynomil in L(P) P. Proof. Let P e olletion of rooted fully-lelled trees tht is otined from P y dding distint new lels. Sine the only possile unlelled verties of rooted semi-lelled tree re either the root vertex or vertex of degree t lest three, the numer of suh interior verties is t most one less thn the numer of leves. Therefore, to prove the proposition, it suffies to show tht the running time of NestedSupertree is polynomil in L(P ) P. It is ler tht heking for pirwise onsisteny is polynomil time in L(P ) P. Furthermore, the onstrution of the desendny grph of P n e lso e

SUPERTREE METHODS FOR NESTED TAXA 13 done in suh time. Now one n determine if direted grph hs no direted yles y suessively deleting verties (nd their inident rs) tht either hve indegree or outdegree zero. If this proess results in the empty grph, then the originl grph hs no direted yles; otherwise it hs direted yle. Sine the size of D(P ) is polynomil in the size of L(P ), determining whether or not D(P ) hs no direted yles is polynomil in the size of L(P ). The numer of triples of L(P) is polynomil in L(P) nd so finding the olletion of ommon rooted triples of P is lso polynomil in L(P ) P. It follows tht the onstrution of the modified desendny grph of P is polynomil time in L(P ) P. Lstly, s stted in the fourth remrk following Exmple 3.2, t eh itertion of the suroutine Desendnt there is lwys t lest one vertex with indegree zero. Consequently, t eh itertion, S 0 is non-empty nd so the mixed grph resulting from deleting the elements in S 0 is proper restrition of the mixed grph inputted t tht prtiulr itertion. Thus the numer of suh itertions is ounded y the size of L(P ). We dedue tht the running time of NestedSupertree is polynomil in L(P ) P. 4. Other Properties of NestedSupertree The min purpose of this setion is to estlish the rooted semi-lelled tree nlogue of desirle property (ii) in the introdution for NestedSupertree. A rooted semi-lelled tree T is inry if T is singulrly lelled nd every vertex hs degree t most three exept for the root whih hs degree t most two. The min result of this setion is the following theorem. Theorem 4.1. Let P e olletion of semi-lelled trees tht re singulrly lelled nd let T e rooted semi-lelled inry tree tht is nestrlly displyed y eh tree in P. Suppose tht NestedSupertree pplied to P returns rooted semilelled tree T. Then T nestrlly displys T. To prove Theorem 4.1, we first estlish severl results. The first result, Proposition 4.2, is well-known (for exmple, see [13]). Proposition 4.2. Let T e rooted phylogeneti X-tree. Let R(T ) = {T S : S X, S = 3, T S is rooted triple}. If T is rooted phylogeneti X -tree, where X X, nd R(T ) R(T ), then T displys T. For rooted semi-lelled trees tht re singulrly lelled, the nlogous result is Proposition 4.3. We will ll rooted semi-lelled tree tht is singulrly lelled nd hs lel set of size three triple. A rooted triple is prtiulr type of triple. Up to isomorphism, there re six triples nd these re shown in Fig. 5. For onveniene in this pper, we denote these triples s Types (I)() nd (), (II), (III), nd (IV)() nd (). We will ontinue to refer to triple of Type(I)() s rooted triple.

14 PHILIP DANIEL AND CHARLES SEMPLE (I)() (I)() (II) (III) (IV)() (IV)() Figure 5. The six triples. Proposition 4.3. Let T e rooted semi-lelled tree on X. Let B(T ) = {T S : S X, S = 3, T S is triple of Type (I)() or (IV)()}, D(T ) = { < T :, L(T )}, nd N(T ) = { is not omprle to under T :, L(T )}. If T is rooted semi-lelled tree on X, where X X, nd B(T ) B(T ), D(T ) D(T ), nd N(T ) N(T ), then T nestrlly displys T. Proof. To prove the proposition, it is ler tht we my ssume tht X nd X re the sme sets. Let T = (T; φ). The proof is y indution on the numer n of interior lels of T. If n = 0, then it is strightforwrd to dedue the result y Proposition 4.2 nd the ft tht N(T ) N(T ). Now ssume tht the result holds for ll rooted semi-lelled trees tht hve fewer thn n interior lels, where n 1. Sine T hs t lest one interior lel, there exists n interior vertex u of T tht is lelled y n element, d sy, of X suh tht ll elements of X tht re strit desendnts of d lel leves of T. Let T 1 e the rooted semi-lelled tree otined from T y repling the rooted sutree of T tht lies elow or equl to u with single lef lelled y the elements of φ 1 (u). Let T 2 e the rooted semi-lelled tree tht is the rooted sutree of T tht lies elow or equl to d nd in whih the elements in φ 1 (u) re removed. Note tht if u hs outdegree one, then u is deleted nd the root of T 2 is the vertex of T tht is immeditely elow u. Now onsider T. Sine D(T ) D(T ), eh element in φ 1 (u) lels n interior vertex of T. Moreover, s there is n element of X tht is strit desendnt of eh element in φ 1 (u), it follows tht, for ll pirs, φ 1 (u), either nd lel the sme vertex of T or one element, sy, is strit desendnt of in T. Let e lest element of φ 1 (u) under T nd let v e the interior vertex of T tht is lelled y. Agin, s D(T ) D(T ), the set of strit desendnts of in T is extly the lel set of T 2. Anlogous to the onstrutions of T 1

SUPERTREE METHODS FOR NESTED TAXA 15 nd T 2 in the previous prgrph, onstrut T 1 nd T 2 from T using the vertex v insted of u. Evidently, B(T 1 ) B(T 1), D(T 1 ) D(T 1), nd N(T 1 ) N(T 1), nd B(T 2 ) B(T 2 ), D(T 2) D(T 2 ), nd N(T 2) N(T 2 ). Furthermore, oth T 1 nd T 2 hve fewer thn n lelled interior verties. Therefore, y our indution ssumption, T 1 nd T 2 nestrlly disply T 1 nd T 2, respetively. By definition, it immeditely follows tht T nestrlly displys T unless u hs outdegree one nd the vertex of T lelled y hs outdegree t lest two. But then, in this se, there re elements, X suh tht T {,, } is of Type (IV)() nd T {,, } is of Type (IV)(), ontrditing the ssumptions in the sttement of the proposition. This ompletes the proof of Proposition 4.3. Lemm 4.4. Let P e olletion of rooted semi-lelled trees tht re singulrly lelled nd let, L(P). Suppose tht NestedSupertree pplied to P returns rooted semi-lelled tree T. (i) If is strit desendnt of in some tree in P, then is strit desendnt of in T. (ii) If, T P L(T ), nd is not omprle to in eh tree in P, then is not omprle to in T. Proof. Prt (i) immeditely follows from the desription of NestedSupertree. To prove (ii), let P e the olletion of rooted fully-lelled trees tht is otined from P y dding distint new lels in Step 2 of NestedSupertree. At some itertion of the running of the su-routine Desendnt, one of the lel verties or in some restrition of the modified desendny grph D (P ) of P hs indegree zero. Consider the first suh itertion nd let D denote the orresponding onneted mixed grph. Without loss of generlity, we my ssume tht hs indegree zero in this restrition. To estlish the lemm, it suffies to show y the onstrution of T tht is not vertex of D. Let V e the suset of verties of D tht re either lel verties lying on direted pth strting t or rooted triple verties where oth djent lel verties lie on direted pth strting t. Sine nd re not omprle in every tree in P nd hs indegree zero, it follows y the ontrpositive of Lemm 3.5(i) tht V. Thus to estlish tht is not vertex of D, it suffies to show tht V is the vertex set of D. To see this, suppose tht D ontins n r (z, x) where z V, ut x V. Clerly, x is lel vertex of D. Assume tht z is lso lel vertex of D. By Lemm 3.5(i), (, x) is n r of D. Therefore, s hs indegree zero, it follows y Lemm 3.5(ii) tht there is n r from to z in D. This implies tht z V ; ontrdition. In ft, y extending this rgument, it is esily seen tht, ignoring rooted triple verties, V ontins ll of the lel verties of D. It now follows y the definition of V tht V is the vertex set of D. This ompletes the proof of the lemm. We remrk here tht the ondition nd re ommon lels of P in the sttement of Lemm 4.4(ii) nnot e wekened.

16 PHILIP DANIEL AND CHARLES SEMPLE () () Figure 6. Two triples. Let P e olletion of rooted semi-lelled trees. A triple whose lel set {,, } is suset of T P L(T ) is ommon reltive to P if, for ll T 1, T 2 P, T 1 {,, } is isomorphi to T 2 {,, }. Lemm 4.5. Let P e olletion of rooted semi-lelled trees tht re singulrly lelled, nd let T e ommon triple of P of Type (I)() or (IV)(). Let {,, } e the lel set of T. Suppose tht NestedSupertree pplied to P returns rooted semi-lelled tree T. Then T {,, } is isomorphi to T. Proof. If T is triple of Type (IV)(), then it is esily seen, y interpreting Lemm 4.4 for olletion of rooted fully-lelled trees tht re singulrly lelled, tht T {,, } is isomorphi to T. Therefore suppose tht T is the rooted triple sy. Let P e the olletion of rooted fully-lelled trees tht is otined from P y dding distint new lels in Step 2 of NestedSupertree. Sine,, nd re ommon lels of P, it follows y Lemm 4.4 tht every pir of,, nd re not omprle in T. Furthermore, y Step 3 of Desendnt, nd re lwys in the sme r omponent of the restritions of the modified desendny grph of P tht re onsidered throughout the running of NestedSupertree provided is in the sme restrition. We now dedue from the desription of Desendnt tht T {,, } is isomorphi to. We now prove Theorem 4.1. Proof of Theorem 4.1. Sine eh lel of T is ommon lel of P, it immeditely follows y Lemm 4.4 tht D(T ) D(T ) nd N(T ) N(T ). Furthermore, y Lemm 4.5, B(T ) B(T ). Hene, y Proposition 4.3, T nestrlly displys T. Corollry 4.6. Let P e olletion of rooted semi-lelled trees tht re singulrly lelled. Suppose tht NestedSupertree pplied to P returns rooted semilelled tree T. Then (i) If T is ommon triple of P, then T nestrlly displys T. (ii) Let {,, } e suset of T P L(T ). Suppose tht, for ll T P, T {,, } is one of the two triples shown in Fig. 6. Then T nestrlly displys the triple shown in Fig. 6().

SUPERTREE METHODS FOR NESTED TAXA 17 Proof. If T is ommon triple of ny type exept Type (I)(), then (i) follows from Theorem 4.1. If T is ommon triple of Type (I)(), then (i) follows from Lemm 4.4(ii). For (ii), routine hek using oth prts of Lemm 4.4 estlishes this prt of the orollry. We end this setion with n oservtion regrding the lst orollry. Oserve tht for the two triples in (ii) of this orollry one is refinement of the other. Amongst the other triples only one other pir s this property, Types (I)() nd (). Despite prt (ii) of Corollry 4.6, it is strightforwrd to onstrut n exmple where the nlogue of (ii) for Types (I)() nd (I)() does not hold. This is not wekness of NestedSupertree, ut simply highlights the ft shown in [14] tht no generl supertree method for rooted phylogeneti trees (nd hene rooted semi-lelled trees) is le to stisfy this nlogue. 5. A Vrint of NestedSupertree In this setion, we present prtiulr vrint of NestedSupertree. This lgorithm, whih we ll MinEdgeWeightTree, llows the inputed trees to e weighted nd lso stisfies the symmetry properties of ordering nd renming. To desrie MinEdgeWeightTree, we simply note tht it is otined from NestedSupertree y repling Step 2 of Desendnt with the following: 2. If S 0 is empty, then hoose S 0 s follows: () Let C 0 denote the set of lel verties of D (P ) tht hve indegree zero. () For eh C 0, weight to e the sum of the weights of the trees in P tht indue t lest one inident edge with in D (P ). () Let S 0 onsist of the elements of C 0 with minimum weight. Note tht if the input trees re not weighted, hoose eh tree to hve weight one. Remrks 1. Clerly, t eh itertion of the suroutine of MinEdgeWeightTree nlogous to Desendnt, S 0 is non-empty either t the end of Step 1 or Step 2. Furthermore, the time tken to find S 0 is polynomil in L(P) P. It immeditely follows y the results estlished in Setions 3 nd 4 for NestedSupertree tht MinEdgeWeightTree pplied to olletion of rooted semi-lelled trees tht re singulrly lelled nd weighted stisfies the rooted semi-lelled tree nlogues of (i) (iii) nd (v) in the introdution. 2. In omprison with Desendnt, the set S 0 of lel verties is well-defined in the orresponding suroutine of MinEdgeWeightTree. Sine no ppel is mde to the speifi symols used s lels or to the order in whih the memers of P re listed in MinEdgeWeightTree, it follows tht MinEdgeWeight- Tree lso stisfies the rooted semi-lelled tree nlogues of (iv)() nd () in the introdution.

18 PHILIP DANIEL AND CHARLES SEMPLE 2 e w 1 d u 1 u 1 1 3 3 w 1 2 d 3 3 v 1 u 2 v 1 u 2 () () Figure 7. The ssoited grphs in the seond nd third itertion of DesendntSupertree in Exmple 5.1. Exmple 5.1. To illustrte MinEdgeWeightTree, onsider the olletion P of rooted semi-lelled trees desried in Exmple 3.1 nd the olletion P of rooted fully-lelled trees otined from P y dding distint new lels. For the purposes of the exmple, suppose tht the three trees in Fig. 1 re weighted so tht the leftmost tree is weighted 3, the middle tree is weighted 2, nd the rightmost tree is weighted 1. The modified desendny grph of P is the sme s tht given in Fig. 3. Applying MinEdgeWeightTree to P, the first itertion of its suroutine is the sme s tht in Exmple 3.2. In prtiulr, S 0 = {v 2, u 3 } t the end of Step 1 nd so, in this itertion, no lel verties of the inputted grph re weighted. In the seond itertion of its suroutine, S 0 is empty fter Step 1. At Step 2 (), the set C 0 of lel verties of the inputted grph with no inoming rs is {e, u 1, u 2 }. Sine the lel vertex e hs extly one inident edge nd this is indued y the tree with weight 2, we give e weight 2 t Step 2 () in this itertion. Similrly, u 1 nd u 2 re oth weighted 3. This weighting together with the ssoited mixed grph is shown in Fig. 7(), where the edges nd the rs (, ) in whih is not n immedite desendnt of re omitted. At Step 2 (), S 0 = {e} nd so, t this itertion, it is e nd its inident rs nd edges tht re deleted from the input grph. The grph resulting from these deletions is shown in Fig. 7(), where the weights of the lel verties with indegree zero re lso shown. Continuing in this wy, the suroutine of MinEdgeWeightTree eventully returns the rooted fully-lelled tree shown Fig. 8() nd MinEdgeWeightTree returns the rooted semi-lelled tree shown in Figure 8(). Remrks. Although we think MinEdgeWeightTree is resonle lgorithm, we expet there to e more elorte lgorithms for supertree onstrution sed on NestedSupertree. The point is tht it highlights how one n use NestedSupertree s sis for onstruting new supertree methods for rooted semi-lelled

SUPERTREE METHODS FOR NESTED TAXA 19 v 2, u 3 e w 1 e v 1 u 1 u 2 d d () () Figure 8. The trees returned y MinEdgeWeightTree nd its suroutine in Exmple 5.1. trees tht stisfy ll of the rooted semi-lelled tree nlogues of the properties listed in the introdution. 6. NestedSupertree Applied to Rooted Phylogeneti Trees Although not originlly intended for phylogenetis, the lgorithm Build [2] ws one of the first supertree methods for olletions P of rooted phylogeneti trees. Furthermore, s well s MinCutSupertree nd its modified version, the generl pproh tken y Build hs een used in numer of more reent supertree lgorithms, for exmple [5, 6, 8, 11]. In the setting of phylogenetis, Build is polynomil-time lgorithm for deiding if P is omptile. In this setion, we desrie how NestedSupertree n e pplied to P to determine the omptiility of P. In the se tht P is omptile, we lso show tht the rooted phylogeneti tree returned y NestedSupertree is the sme s tht returned y Build. Sine olletion P of rooted phylogeneti trees is omptile if nd only if it is nestrlly omptile, it follows y the disussion prior to Proposition 3.3 tht NestedSupertree n e suitly modified to determine the omptiility of P. Theorem 6.1 shows tht, when pplied to the sme olletion of omptile rooted phylogeneti trees, the supertrees returned y NestedSupertree with this modifition nd Build re identil up to isomorphism. Before stting Theorem 6.1, we first give desription of Build. Let P e olletion of rooted phylogeneti trees nd let S e suset of L(P). Let [P, S] e the grph tht hs vertex set S nd hs n edge joining two verties nd preisely if there exists S nd T P suh tht T {,, } =.

20 PHILIP DANIEL AND CHARLES SEMPLE Algorithm: Build(P, v) Input: A olletion P of rooted phylogeneti trees. Output: A rooted phylogeneti tree T tht displys P with root vertex v, or the sttement P is not omptile. 1. Set S to e the lel set of P. 2. If S = 1, then output the rooted phylogeneti tree onsisting of the single vertex v lelled y the element in S. 3. If S 2, onstrut [P, S]. 4. Let S 1, S 2,..., S k denote the vertex sets of the omponents of [P, S]. If k = 1, then hlt nd return P is not omptile. 5. For eh i {1, 2,..., k}, ll Build(P i, v i ), where P i is the olletion of rooted phylogeneti trees otined from P y restriting eh tree in P to S i. If Build(P i, v i ) returns tree, then tth T i to v vi the edge {v i, v}. Theorem 6.1. Let P e olletion of rooted phylogeneti trees, nd suppose tht P is omptile. Then, up to isomorphism, the rooted phylogeneti trees returned y NestedSupertree with the ove modifitions nd Build when pplied to P re identil. Proof. We egin the proof with two oservtions. Let S denote the lel set of P, nd let P e olletion of rooted fully-lelled trees tht is otined from P y dding distint new lels. The first oservtion is tht the vertex set of eh omponent of the grph [P, S] is union of mximl proper lusters of the trees in P. For the seond oservtion, onsider the desendny grph of P, nd let S 0 denote the set of verties of D(P ) tht hve indegree zero nd no inident edges. Then the vertex sets of eh r omponent of D(P )\S 0 is lso union of mximl proper lusters of the trees in P. From these two oservtions, it is esily dedued, for ll, S, tht nd re in the sme omponent of [P, S] if nd only if nd re in the sme r omponent of D(P )\S 0. Let S i e the vertex set of omponent of [P, S] nd let S i e the vertex set of the r omponent of D(P )\S 0 tht ontins S i. Let P i e the olletion of rooted phylogeneti trees otined from P y restriting eh tree in P to S i, nd let P i e the olletion of rooted semi-lelled trees otined from P y restriting eh tree in P to S i. It is esily seen tht ll of the trees in P i re fully lelled. Now the equivlene t the end of the lst prgrph implies tht P i ould hve een otined from P i y dding distint new lels. Furthermore, the r omponent of D(P ) ontining the elements of S i is equl to the desendny grph of D(P i ). Sine [P, S] ontins t lest two omponents, this implies tht the mximl proper lusters of the trees returned y NestedSupertree with the pproprite modifitions nd Build when pplied to P re the sme. Repetedly pplying this rgument to P i for ll i, we eventully dedue tht the two rooted phylogeneti trees returned y NestedSupertree with the pproprite modifitions nd Build re identil. We end this setion y remrking on wht hppens when NestedSupertree is pplied to n ritrry olletion P of rooted phylogeneti trees. Let P e

SUPERTREE METHODS FOR NESTED TAXA 21 olletion of rooted fully-lelled trees tht is otined from P y dding distint new lels. Sine eh of the trees in P re phylogeneti, P is pirwise onsistent nd the desendny grph of P is yli. It is now esily seen from the desription of Desendnt tht NestedSupertree pplied to P returns rooted semilelled tree nd tht this tree is phylogeneti. It now follows y Propositions 3.7 nd 3.3, nd Theorem 4.1 tht NestedSupertree is generl supertree method for rooted phylogeneti trees tht stisfies the desirle properties (i) (iii) in the introdution. Aknowledgments. We thnk the nonymous referees for their onstrutive nd vlule omments. Referenes [1] E. N. Adms III (1986), N-trees s nestings: omplexity, similrity nd onsensus, J. Clssifition, 3, pp. 299-317. [2] A. V. Aho, Y. Sgiv, T. G. Szymnski, nd J. D. Ullmn (1981), Inferring tree from lowest ommon nestors with n pplition to the optimiztion of reltionl expressions, SIAM J. Comput., 10, pp. 405-421. [3] O. R. P. Binind-Emonds, J. L. Gittlemn, nd M. A. Steel (2002), The (super)tree of life: proedures, prolems nd prospets, Annul Reviews of Eology nd Systemtis, 33, pp. 265-289. [4] O. R. P. Binind-Emonds, ed. (2004), Phylogeneti Supertrees: Comining Informtion to Revel the Tree of Life, Computtionl Biology Series, Kluwer. [5] D. Brynt, C. Semple, nd M. Steel (2004), Supertree methods for nestrl divergene dtes nd other pplitions, in Phylogeneti Supertrees: Comining Informtion to Revel the Tree of Life, O. Binind-Emonds, ed., Computtionl Biology Series, Kluwer, pp. 129-150. [6] M. Constntinesu nd D. Snkoff (1995), An effiient lgorithm for supertrees, J. Clssifition, 12, pp. 101-112. [7] P. Dniel nd C. Semple (2004), Supertree lgorithms for nested tx, in Phylogeneti Supertrees: Comining Informtion to Revel the Tree of Life, O. Binind-Emonds, ed., Computtionl Biology Series, Kluwer, pp. 151-171. [8] M. P. Ng nd N. C. Wormld (1996), Reonstrution of rooted trees from sutrees, Disrete Appl. Mth., 69, pp. 19-31. [9] R. D. M. Pge, Modified minut supertrees, in Proeedings of the Seond Interntionl Workshop on Algorithms in Bioinformtis (WABI 2002), R. Guig nd D. Gusfield, eds, Springer, 2002, pp. 537-552. [10] R. D. M. Pge (2004), Txonomy, supertrees, nd the Tree of Life, in Phylogeneti Supertrees: Comining Informtion to Revel the Tree of Life, O. Binind-Emonds, ed., Computtionl Biology Series, Kluwer, pp. 247-265. [11] C. Semple (2003), Reonstruting miniml rooted trees, Disrete Appl. Mth., 127, pp. 489-503. [12] C. Semple nd M. Steel (2000), A supertree method for rooted trees, Disrete Appl. Mth., 105, pp. 147-158. [13] C. Semple nd M. Steel (2003), Phylogenetis, Oxford University Press. [14] M. Steel, A. W. M. Dress, nd S. Böker (2000), Simple ut fundmentl limittions on supertree nd onsensus tree methods, Syst. Biol., 49, pp. 363-368. [15] S. Willson (2003), Privte ommunition. Biomthemtis Reserh Centre, Deprtment of Mthemtis nd Sttistis, University of Cnterury, Christhurh, New Zelnd E-mil ddress: pjd62@student.nterury..nz,.semple@mth.nterury..nz