On the existence of a cherry-picking sequence


Janosch Döcker, Simone Linz

Department of Computer Science, University of Tübingen, Germany
Department of Computer Science, University of Auckland, New Zealand

Email addresses: janosch.doecker@uni-tuebingen.de (Janosch Döcker), s.linz@auckland.ac.nz (Simone Linz)

Preprint submitted to XXXX, December 2017

Abstract

Recently, the minimum number of reticulation events that is required to simultaneously embed a collection $P$ of rooted binary phylogenetic trees into a so-called temporal network has been characterized in terms of cherry-picking sequences. Such a sequence is a particular ordering on the leaves of the trees in $P$. However, it is well-known that not all collections of phylogenetic trees have a cherry-picking sequence. In this paper, we show that the problem of deciding whether or not $P$ has a cherry-picking sequence is NP-complete for when $P$ contains at least eight rooted binary phylogenetic trees. Moreover, we use automata theory to show that the problem can be solved in polynomial time if the number of trees in $P$ and the number of cherries in each such tree are bounded by a constant.

Keywords: 2P2N-3-SAT, cherry, cherry-picking sequence, Intermezzo, phylogenetic tree, temporal phylogenetic network

1 Introduction

To represent evolutionary relationships among species, phylogenetic trees have long been a powerful tool. However, as we now not only acknowledge speciation but also non-tree-like processes such as hybridization and lateral gene transfer to be driving forces in the evolution of certain groups of organisms (e.g. bacteria, plants, and fish) [16, 20], phylogenetic networks become more widely used to represent ancestral histories. A phylogenetic network is a generalization of a rooted phylogenetic tree. More precisely, such a network is a rooted directed acyclic graph whose leaves are labeled [14].

The following optimization problem, which is biologically relevant and mathematically challenging, motivates much of the theoretical work that has been done in reconstructing phylogenetic networks from phylogenetic trees. Given a collection $P$ of rooted binary phylogenetic trees on a set of species such that $P$ correctly represents the tree-like evolution of different parts of the species' genomes, what is the smallest number of reticulation events that is required to simultaneously embed the trees in $P$ into a phylogenetic network?

Here, reticulation events are collectively referring to all non-tree-like events and they are represented by vertices in a phylogenetic network whose in-degree is at least two. Without any structural constraints on a phylogenetic network, it is well-known that $P$ can always be embedded into such a network [2, 19] and, hence, the optimization problem is well-defined. Moreover, despite the problem being NP-hard [4], even for when $|P| = 2$, several exact algorithms have been developed that, given two rooted phylogenetic trees, construct a phylogenetic network whose number of reticulation events is minimized over the space of all networks that embed both trees [1, 7, 18, 22].

Motivated by the introduction of temporal networks [3, 17], which are phylogenetic networks that satisfy several time constraints, Humphries et al. [12, 13] recently investigated the special case of the aforementioned optimization problem for when one is interested in minimizing the number of reticulation events over the smaller space of all
temporal networks that embed a given collection of rooted binary phylogenetic trees. More precisely, in the context of their two papers, the authors considered temporal networks to be phylogenetic networks that satisfy the following three constraints: (1) speciation events occur successively, (2) reticulation events occur instantaneously, and (3) each non-leaf vertex has a child whose in-degree is one. The second constraint implies that the three species that are involved in a reticulation event, i.e. the new species resulting from this event and its two distinct parents, must coexist in time. Moreover, a phylogenetic network that satisfies the third constraint (but not necessarily the first two constraints) is referred to as a tree-child network in the literature [6]. Intuitively, if a phylogenetic network $N$ is temporal, then one can assign a time stamp to each of its vertices such that the following holds for each edge $(u, v)$ in $N$. If $v$ is a reticulation, then the time stamp assigned to $u$ is the same as the time stamp assigned to $v$. Otherwise, the time stamp assigned to $v$ is strictly greater than that assigned to $u$. Baroni et al. [3] showed that it can be checked in polynomial time whether or not a given phylogenetic network satisfies the first two constraints.

Humphries et al. [12] have established a new characterization to compute the minimum number of reticulation events that is needed to simultaneously embed an arbitrarily large collection $P$ of rooted binary phylogenetic trees into a temporal network. This characterization, which is formally defined in Section 2, is in terms of cherries, and the existence of a particular type of sequence on the leaves of the trees, called a cherry-picking sequence. It was shown that such a sequence for $P$ exists if and only if the trees in $P$ can simultaneously be embedded into a temporal network [12, Theorem 1]. Moreover, a cherry-picking sequence for $P$ can be exploited further to compute the minimum number of reticulation events that is needed over all temporal networks. Importantly, not every collection $P$ is guaranteed to have a solution, i.e. there may be no cherry-picking sequence for $P$ and, hence, no temporal network that embeds all trees in $P$.

It was left as an open problem by Humphries et al. [12] to analyze the computational complexity of deciding whether or not $P$ has a cherry-picking sequence for when $|P| = 2$. In this paper, we make progress towards this question and show that it is NP-complete to decide if $P$ has a cherry-picking sequence for when $|P| \geq 8$. Translated into the language of phylogenetic networks, this result directly implies that it is computationally hard to decide if a collection of at least eight rooted binary phylogenetic trees can simultaneously be embedded into a temporal network. To establish our result, we use a reduction from a variant of the Intermezzo problem [9]. On a more positive note, we show that deciding if $P$ has a cherry-picking sequence can be done in polynomial time if the number of trees and the number of cherries in each such tree are bounded by a constant. To this end, we explore connections between phylogenetic trees and automata theory and show how the problem at hand can be solved by using a deterministic finite automaton.

The remainder of the paper is organized as follows. The next section contains notation and terminology that is used throughout the paper. Section 3 establishes NP-completeness of a variant of the Intermezzo problem which is then, in turn, used in Section 4 to show that it is NP-complete to decide if $P$ has a cherry-picking sequence for when $|P| \geq 8$. In Section 5, we show that deciding if $P$ has a cherry-picking sequence is polynomial-time solvable if the number of cherries in each tree and the size of $P$ are bounded by a constant. We finish the paper with some concluding remarks in Section 6.

2 Preliminaries

This section provides notation and terminology that is used in the subsequent sections. Throughout this paper, $X$ denotes a finite set.

Phylogenetic trees. A rooted binary phylogenetic $X$-tree $T$ is a rooted tree with leaf set $X$ and, apart from the root which has degree two, all interior vertices have degree three. Furthermore, a pair of leaves $\{a, b\}$ of $T$ is called a cherry if $a$ and $b$ are leaves that are adjacent to a common vertex. Note that every rooted binary phylogenetic tree has
at least one cherry. We denote by $c_T$ the number of cherries in $T$. We now turn to a rooted binary phylogenetic tree with exactly one cherry. More precisely, we call $T$ a caterpillar if $|X| = n$ and the elements in $X$ can be ordered, say $x_1, x_2, \dots, x_n$, so that $\{x_1, x_2\}$ is a cherry and, if $p_i$ denotes the parent of $x_i$, then, for all $i \in \{3, 4, \dots, n\}$, we have $(p_i, p_{i-1})$ as an edge in $T$, in which case we denote the caterpillar by $(x_1, x_2, \dots, x_n)$. To illustrate, Figure 1 shows the caterpillar $(D_1, D_2, \dots, D_{|A'|})$ with cherry $\{D_1, D_2\}$. Two rooted binary phylogenetic $X$-trees $T$ and $T'$ are said to be isomorphic if the identity map on $X$ induces a graph isomorphism on the underlying trees.

Subtrees. Now, let $T$ be a rooted binary phylogenetic $X$-tree, and let $X' = \{x_1, x_2, \dots, x_k\}$ be a subset of $X$. The minimal rooted subtree of $T$ that connects all vertices in $X'$ is denoted by $T(X')$. Furthermore, the rooted binary phylogenetic tree obtained from $T(X')$ by contracting all non-root degree-2 vertices is the restriction of $T$ to $X'$ and is denoted by $T|X'$. We also write $T[-x_1, -x_2, \dots, -x_k]$ or $T[-X']$ for short to denote $T|(X - X')$. For a set $P = \{T_1, T_2, \dots, T_m\}$ of rooted binary phylogenetic $X$-trees, we write $P|X'$ (resp. $P[-X']$) when referring to the set $\{T_1|X', T_2|X', \dots, T_m|X'\}$ (resp. $\{T_1[-X'], T_2[-X'], \dots, T_m[-X']\}$). Lastly, a rooted binary phylogenetic tree is pendant in $T$ if it can be detached from $T$ by deleting a single edge.

Cherry-picking sequences. Let $P$ be a set of rooted binary phylogenetic $X$-trees with $|X| = n$. We say that an ordering of the elements in $X$, say $(x_1, x_2, \dots, x_n)$, is a cherry-picking sequence for $P$ precisely if each $x_i$ with $i \in \{1, 2, \dots, n-1\}$ labels a leaf of a cherry in each tree that is contained in $P[-x_1, -x_2, \dots, -x_{i-1}]$. Clearly, if $|P| = 1$, then $P$ has a cherry-picking sequence. However, if $|P| > 1$, then $P$ may or may not have a cherry-picking sequence. We now formally state the decision problem that this paper is centered around.

CPS-Existence
Instance. A collection $P$ of rooted binary phylogenetic $X$-trees.
Question. Does there exist a cherry-picking sequence for $P$?
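The definition of a cherry-picking sequence translates directly into a checking procedure. The following is a minimal sketch, not taken from the paper: it assumes a tree is encoded as a nested pair of strings (a leaf is a `str`, an interior vertex is a 2-tuple of subtrees), and all function names are our own. The brute-force existence test enumerates every ordering of $X$ and is only meant for very small leaf sets.

```python
from itertools import permutations

# Assumed encoding: a leaf is a str, an interior vertex is a pair (left, right).

def leaves(tree):
    """Return the list of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def in_cherry(tree, x):
    """True if leaf x is one of the two leaves of some cherry of the tree."""
    if isinstance(tree, str):
        return False
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return x in (left, right)
    return in_cherry(left, x) or in_cherry(right, x)

def delete_leaf(tree, x):
    """Return T[-x]: delete leaf x and suppress the resulting degree-2 vertex."""
    if isinstance(tree, str):
        return None if tree == x else tree
    left, right = delete_leaf(tree[0], x), delete_leaf(tree[1], x)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)

def is_cherry_picking_sequence(trees, order):
    """Check the definition: each x_i (i < n) lies in a cherry of every tree
    of P[-x_1, ..., -x_{i-1}]."""
    order = list(order)
    if any(sorted(leaves(t)) != sorted(order) for t in trees):
        return False  # the ordering must list every element of X exactly once
    current = list(trees)
    for x in order[:-1]:
        if not all(in_cherry(t, x) for t in current):
            return False
        current = [delete_leaf(t, x) for t in current]
    return True

def has_cherry_picking_sequence(trees):
    """Brute force over all |X|! orderings; only sensible for tiny X."""
    return any(is_cherry_picking_sequence(trees, p)
               for p in permutations(leaves(trees[0])))
```

For instance, the three rooted triples on $\{a, b, c\}$, encoded as `(("a","b"),"c")`, `(("a","c"),"b")` and `(("b","c"),"a")`, have no common cherry-picking sequence, because no leaf is part of a cherry in all three trees, whereas any two of them do have one.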
The significance of CPS-Existence is the problem's equivalence to the question whether or not all trees in $P$ can simultaneously be embedded into a rooted phylogenetic network that satisfies the three temporal constraints as alluded to in the introduction.

Automata and languages. Let $\Sigma$ be an alphabet. A language $L$ is a subset of all possible strings (also called words) whose symbols are in $\Sigma$. More precisely, $L$ is a subset of $\Sigma^*$, where the operator $*$ is the Kleene star. A deterministic finite automaton (or short automaton) is a tuple $A = (Q, \Sigma, \delta, q_{ini}, F)$, where
(i) $Q$ is a finite set of states,
(ii) $\Sigma$ is a finite alphabet,
(iii) $\delta: Q \times \Sigma \to Q$ is a transition relation,
(iv) $q_{ini}$ is the initial state, and
(v) $F \subseteq Q$ are final states.
A given automaton $A$ accepts a word $w = a_1 a_2 \dots a_n$ if and only if $A$ is in a final state after having read all symbols from left to right, i.e. $\delta(\dots \delta(\delta(q_{ini}, a_1), a_2) \dots, a_n) \in F$. The language $L(A) \subseteq \Sigma^*$ that is recognized by $A$ is defined as the set of words that $A$ accepts. For the automata constructed in this paper, we have $|F| = 1$ and $\delta$ being a total function that maps each pair of a state in $Q$ and a symbol in $\Sigma$ to a state in $Q$. For a detailed introduction to automata theory and languages, see the book by Hopcroft and Ullman [11].

3 A variant of the Intermezzo problem

In this section, we establish NP-completeness of a variant of the ordering problem Intermezzo. Let $A$ be a finite set, and let $O$ be an ordering on the elements in $A$. For two elements $a$ and $b$ in $A$, we write $a < b$ precisely if $a$ precedes
$b$ in $O$. With this notation in hand, we now formally state Intermezzo which was shown to be NP-complete via reduction from 3-SAT [9, Lemma 1].

Intermezzo
Instance. A finite set $A$, a collection $B$ of pairs from $A$, and a collection $C$ of pairwise-disjoint triples of distinct elements in $A$.
Question. Does there exist a total linear ordering on the elements in $A$ such that $a_i < a_j$ for each $(a_i, a_j)$ in $B$, and $a_i < a_j < a_k$ or $a_j < a_k < a_i$ for each $(a_i, a_j, a_k)$ in $C$?

Example. Consider the following instance of Intermezzo with three pairs and two disjoint triples (when viewed as sets):
$A = \{a_1, a_2, a_3, a_4, a_5, a_6\}$, $B = \{(a_1, a_6), (a_4, a_1), (a_4, a_3)\}$, $C = \{(a_1, a_2, a_3), (a_4, a_5, a_6)\}$.
A total linear ordering on the elements in $A$ that satisfies all constraints defined by $B$ and $C$ is $O = (a_2, a_4, a_3, a_1, a_5, a_6)$.

While each element $a_i \in A$ can appear an unbounded number of times in the input of a given Intermezzo instance, this number is bounded from above by $N$ in the following Intermezzo variant.

N-Disjoint-Intermezzo
Instance. A finite set $A$, collections $B_1, B_2, \dots, B_N$ of pairs from $A$, and collections $C_1, C_2, \dots, C_N$ of triples of distinct elements in $A$ such that, for each $l \in \{1, 2, \dots, N\}$, the elements in $B_l \cup C_l$ are pairwise disjoint.
Question. Does there exist a total linear ordering on the elements in $A$ such that $a_i < a_j$ for each $(a_i, a_j) \in \bigcup_{1 \leq l \leq N} B_l$, and $a_i < a_j < a_k$ or $a_j < a_k < a_i$ for each $(a_i, a_j, a_k) \in \bigcup_{1 \leq l \leq N} C_l$?

Let $I$ be an instance of N-Disjoint-Intermezzo, and let $O$ be an ordering on the elements of $A$ that satisfies the two ordering constraints for each pair and triple in the statement of N-Disjoint-Intermezzo. We say that $O$ is an N-Disjoint-Intermezzo ordering for $I$.

We next show that 4-Disjoint-Intermezzo is NP-complete via reduction from the following restricted version of 3-SAT.

2P2N-3-SAT
Instance. A set $U$ of variables, and a set $C$ of clauses, where each clause is a disjunction of exactly three literals, such that each variable appears negated exactly twice and unnegated exactly twice in $C$.
Question. Does there exist a truth assignment for $U$ that satisfies each clause in $C$?
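Returning to the Intermezzo example above, the two ordering constraints are simple to verify mechanically. The sketch below is our own illustration (the function name and the encoding of elements as strings are assumptions); it checks the pair constraint $a_i < a_j$ and the triple constraint $a_i < a_j < a_k$ or $a_j < a_k < a_i$ against a given ordering.

```python
def is_intermezzo_ordering(order, pairs, triples):
    """Check the Intermezzo constraints for a total ordering of A.

    order   -- sequence listing every element of A exactly once
    pairs   -- iterable of (a_i, a_j), each requiring a_i < a_j
    triples -- iterable of (a_i, a_j, a_k), each requiring
               a_i < a_j < a_k  or  a_j < a_k < a_i
    """
    pos = {a: index for index, a in enumerate(order)}
    pairs_ok = all(pos[i] < pos[j] for i, j in pairs)
    triples_ok = all(
        pos[i] < pos[j] < pos[k] or pos[j] < pos[k] < pos[i]
        for i, j, k in triples
    )
    return pairs_ok and triples_ok


# The instance from the example.
B = [("a1", "a6"), ("a4", "a1"), ("a4", "a3")]
C = [("a1", "a2", "a3"), ("a4", "a5", "a6")]
O = ["a2", "a4", "a3", "a1", "a5", "a6"]
print(is_intermezzo_ordering(O, B, C))
```

For the ordering $O$ of the example the check succeeds, while the natural ordering $(a_1, a_2, \dots, a_6)$ fails because it violates the pair $(a_4, a_1)$.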
Berman et al. [5, Theorem 1] established NP-completeness for 2P2N-3-SAT.

Theorem 3.1. 4-Disjoint-Intermezzo is NP-complete.

Proof. We show that the construction by Guttmann and Maucher [9, Lemma 1], that was used to show that Intermezzo is NP-complete via reduction from 3-SAT, yields an instance of 4-Disjoint-Intermezzo if we reduce from 2P2N-3-SAT.

Using the same notation as Guttmann and Maucher [9, Lemma 1], their construction is as follows. Let $I$ be an instance of 2P2N-3-SAT that is given by a set of variables $U = \{u_1, \dots, u_n\}$ and a set of clauses $C = \{(c_{1,1} \vee c_{1,2} \vee c_{1,3}), \dots, (c_{m,1} \vee c_{m,2} \vee c_{m,3})\}$, where each $c_{i,j} \in \{u_1, \bar{u}_1, u_2, \bar{u}_2, \dots, u_n, \bar{u}_n\}$. Furthermore, for $a, b \in \mathbb{N}$, let $a \oplus b$ denote the number $c \in \{1, 2, 3\}$ such that $c \equiv a + b \pmod 3$. We define the following three sets:

$$A = \{u_{k,l}, \bar{u}_{k,l} \mid 1 \leq k \leq n \wedge 1 \leq l \leq 3\} \cup \{c^l_{i,j} \mid 1 \leq i \leq m \wedge 1 \leq j \leq 3 \wedge 1 \leq l \leq 3\},$$

$$B = \{(u_{k,1}, \bar{u}_{k,3}), (\bar{u}_{k,1}, u_{k,3}) \mid 1 \leq k \leq n\} \cup \{(c_{i,j,2}, c^1_{i,j}), (c^2_{i,j}, c_{i,j,1}) \mid 1 \leq i \leq m \wedge 1 \leq j \leq 3\} \cup \{(c^1_{i,j \oplus 1}, c^3_{i,j}) \mid 1 \leq i \leq m \wedge 1 \leq j \leq 3\},$$

$$C = \{(u_{k,1}, u_{k,2}, u_{k,3}), (\bar{u}_{k,1}, \bar{u}_{k,2}, \bar{u}_{k,3}) \mid 1 \leq k \leq n\} \cup \{(c^1_{i,j}, c^2_{i,j}, c^3_{i,j}) \mid 1 \leq i \leq m \wedge 1 \leq j \leq 3\},$$

where $c_{i,j,l}$ is an abbreviation of $u_{k,l}$ with $u_k = c_{i,j}$. By construction, the elements in $C$ are pairwise-disjoint triples of distinct elements in $A$ and, so, the three sets $A$, $B$, and $C$ form an instance of Intermezzo.

Now, we show how the pairs and triples in $B \cup C$ can be partitioned into sets $B_l \cup C_l$ with $B_l \subseteq B$, $C_l \subseteq C$, and $1 \leq l \leq 4$ such that the elements in $B_l \cup C_l$ are pairwise disjoint. Recalling that $C$ is a set of pairwise-disjoint triples, we start by setting $B_1 = \emptyset$ and $C_1 = C$. Furthermore, we set

$$B_2 = \{(u_{k,1}, \bar{u}_{k,3}), (\bar{u}_{k,1}, u_{k,3}) \mid 1 \leq k \leq n\} \cup \{(c^1_{i,j \oplus 1}, c^3_{i,j}) \mid 1 \leq i \leq m \wedge 1 \leq j \leq 3\}$$

and $C_2 = \emptyset$. By construction, it is easy to check that the pairs in $B_2$ are pairwise disjoint. Lastly, consider the remaining pairs

$$B \setminus B_2 = \{(c_{i,j,2}, c^1_{i,j}), (c^2_{i,j}, c_{i,j,1}) \mid 1 \leq i \leq m \wedge 1 \leq j \leq 3\}$$

and observe that the only possibility for two pairs in $B \setminus B_2$ to have a non-empty intersection is to have an element $c_{i,j,l}$ with $l \in \{1, 2\}$ in common. Now, since each $c_{i,j,l}$ is equal to an element in $U' = \{u_{k,l}, \bar{u}_{k,l} \mid 1 \leq k \leq n \wedge 1 \leq l \leq 3\}$, and each element $u_k$ appears exactly twice negated and twice unnegated in $C$, it follows that there is a partition of $B \setminus B_2$ into $B_3$ and $B_4$ so that all pairs in the resulting two sets are pairwise disjoint. Setting $C_3 = C_4 = \emptyset$ completes the construction of an instance of 4-Disjoint-Intermezzo. Noting that it is straightforward to compute the partition $B \cup C = \bigcup_{1 \leq l \leq 4} (B_l \cup C_l)$ in polynomial time and that we did not modify the construction described by Guttmann and Maucher [9, Lemma 1] itself, it follows from the same proof that $I$ has a satisfying truth assignment if and only if $\bigcup_{1 \leq l \leq 4} (B_l \cup C_l)$ has a 4-Disjoint-Intermezzo ordering.

Remark. By the construction of an instance of 4-Disjoint-Intermezzo in the proof of Theorem 3.1, we note that no pair or triple occurs twice and that, for each $l \in \{1, 2, 3, 4\}$, we have $B_l \cup C_l \neq \emptyset$. We will freely use these facts throughout the remainder of the paper.
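The defining condition of N-Disjoint-Intermezzo, namely that the pairs and triples within each class $B_l \cup C_l$ are pairwise disjoint when viewed as sets, can be tested with a few lines of code. The following sketch is our own illustration: the function names are hypothetical, and the three-class split of the earlier Intermezzo example is chosen by us for demonstration and is not the partition constructed in the proof.

```python
def pairwise_disjoint(tuples):
    """True if the given pairs/triples, viewed as sets, share no element
    and each consists of distinct elements."""
    seen = set()
    for t in tuples:
        members = set(t)
        if len(members) != len(t) or seen & members:
            return False
        seen |= members
    return True


def is_n_disjoint_instance(classes):
    """classes is a list of (B_l, C_l); check disjointness within each class."""
    return all(
        pairwise_disjoint(list(pairs) + list(triples))
        for pairs, triples in classes
    )


# Hypothetical split of the earlier example into three classes.
triples = [("a1", "a2", "a3"), ("a4", "a5", "a6")]
split = [
    ([], triples),                        # B_1 empty, C_1 = C
    ([("a1", "a6"), ("a4", "a3")], []),   # B_2, C_2 empty
    ([("a4", "a1")], []),                 # B_3, C_3 empty
]
unsplit = [([("a1", "a6"), ("a4", "a1"), ("a4", "a3")], triples)]
print(is_n_disjoint_instance(split), is_n_disjoint_instance(unsplit))
```

Keeping all pairs and triples in a single class fails the test, since for example $(a_1, a_6)$ and $(a_4, a_1)$ share $a_1$; distributing them over three classes as above passes it.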

Figure 1: A caterpillar on $|A'|$ leaves $D_1, D_2, D_3, D_4, \dots, D_{|A'|}$ and with cherry $\{D_1, D_2\}$.

4 Hardness of CPS-Existence

In this section, we show that the decision problem CPS-Existence is NP-complete for any collection of rooted binary phylogenetic trees on the same leaf set that consists of a constant number $m$ of trees with $m \geq 8$. To establish the result, we use a reduction from 4-Disjoint-Intermezzo.

Let $I$ be an instance of 4-Disjoint-Intermezzo. Using the same notation as in the definition of N-Disjoint-Intermezzo, let

$$A' = A \cup \bigcup_{1 \leq l \leq 4} \{b_r^1, b_r^2, b_r^3, b_r^4 \mid r \in C_l\},$$

and let $D = \{d_1, d_2, \dots, d_{|A'|}\}$. For each $l \in \{1, 2, 3, 4\}$, we next construct two rooted binary phylogenetic trees. Let $A_l$ be the subset of $A'$ that precisely contains each element of $A'$ that is neither contained in an element of $B_l$ nor contained in an element of $C_l$ or in $\{b_r^1, b_r^2, \dots, b_r^4 \mid r \in C_l\}$. Furthermore, let $S_l$ and $S'_l$ both be the caterpillar shown in Figure 1. Setting $q = 1$, let $T_l$ and $T'_l$ be the two rooted binary phylogenetic trees obtained from $S_l$ and $S'_l$ that result from the following four-step process.
(i) For each $(a_i, a_j) \in B_l$ in turn, replace the leaf $D_q$ in $S_l$ (resp. $S'_l$) with the 3-taxon tree on the top left (resp. bottom left) in Figure 2 and increment $q$ by one.
(ii) For each $r \in C_l$ with $r = (a_i, a_j, a_k)$ in turn, replace the leaf $D_q$ in $S_l$ (resp. $S'_l$) with the 8-taxon tree on the top right (resp. bottom right) in Figure 2 and increment $q$ by one.
(iii) For each $a_i \in A_l$ in turn, replace the leaf $D_q$ in $S_l$ and $S'_l$ with the cherry $\{a_i, d_q\}$ and increment $q$ by one.
(iv) For each element in $\{q, q+1, \dots, |A'|\}$, replace the leaf label $D_q$ in $S_l$ and $S'_l$ with $d_q$.
We call $P_I = \{T_l, T'_l \mid 1 \leq l \leq 4\}$ the set of intermezzo trees associated with $I$. The next observation is an immediate consequence from the above construction and the fact that, for each $1 \leq l \leq 4$, the elements in $B_l$ and $C_l$ are pairwise disjoint.

Observation 4.1. For an instance $I$ of 4-Disjoint-Intermezzo, the set of intermezzo trees associated with $I$ consists of eight pairwise non-isomorphic rooted binary phylogenetic trees whose set of leaves is $A' \cup D$.

We now establish the main result of this section.

Theorem 4.2. Let $P = \{T_1, T_2, \dots, T_m\}$ be a collection of rooted binary phylogenetic $X$-trees. CPS-Existence is NP-complete for $m = 8$.

Proof. Clearly, CPS-Existence for $m = 8$ is in NP because, given an ordering $O$ on the elements in $X$, we can decide in polynomial time if $O$ is a cherry-picking sequence for $P$. Let $I$ be an instance of 4-Disjoint-Intermezzo, and let $P_I = \{T_l, T'_l \mid 1 \leq l \leq 4\}$ be the set of eight intermezzo trees that are associated with $I$. Note that each tree in $P_I$ can be constructed in polynomial time and has a size that is polynomial in $|A|$. The remainder of the proof essentially consists of establishing the following claim.

Figure 2: Gadgets for a pair $(a_i, a_j)$ (left) and gadgets for a triple $(a_i, a_j, a_k)$ (right) that are used in the reduction from 4-Disjoint-Intermezzo to CPS-Existence.

Claim. $I$ is a "yes"-instance of 4-Disjoint-Intermezzo if and only if $P_I$ has a cherry-picking sequence.

First, suppose that $P_I$ has a cherry-picking sequence. Let $O$ be a cherry-picking sequence for $P_I$, and let $O'$ be the subsequence of $O$ of length $|A|$ that contains each element in $A$. We next show that $O'$ is a 4-Disjoint-Intermezzo ordering for $I$. Let $(a_i, a_j)$ be an element of some $B_l$ with $1 \leq l \leq 4$, and let $d_q$, with $q \in \{1, 2, \dots, |A'|\}$, be the unique leaf label of $T_l$ and $T'_l$ such that $\{a_i, a_j, d_q\}$ is the leaf set of a pendant subtree of $T_l$ and $T'_l$. By construction of $T_l$ and $T'_l$, it is easily seen that $d_q$ exists and $a_i < a_j$ in $O$. Hence, $a_i < a_j$ in $O'$. Turning to the triples, let $r = (a_i, a_j, a_k)$ be an element of some $C_l$ with $1 \leq l \leq 4$, and let $d_q$, with $q \in \{1, 2, \dots, |A'|\}$, be the unique leaf label of $T_l$ and $T'_l$ such that $\{a_i, a_j, a_k, b_r^1, b_r^2, b_r^3, b_r^4, d_q\}$ is the leaf set of a pendant subtree of $T_l$ and $T'_l$. Again, by construction, $d_q$ exists. Let $S_l = T_l|\{a_i, a_j, a_k, b_r^1, b_r^2, b_r^3, b_r^4, d_q\}$ and, similarly, let $S'_l = T'_l|\{a_i, a_j, a_k, b_r^1, b_r^2, b_r^3, b_r^4, d_q\}$. It is straightforward to check that each cherry-picking sequence for $S_l$ and $S'_l$ satisfies either $a_i < a_j < a_k$, or $a_j < a_k < a_i$. Hence, as $S_l$ and $S'_l$ are pendant in $T_l$ and $T'_l$, respectively, we have $a_i < a_j < a_k$, or $a_j < a_k < a_i$ in $O$ and, consequently, in $O'$. Since the above argument holds for each pair and each triple, it follows that $O'$ is a 4-Disjoint-Intermezzo ordering for $I$ and, so, $I$ is a "yes"-instance.

Conversely, suppose that $I$ is a "yes"-instance of 4-Disjoint-Intermezzo. Let $O$ be a 4-Disjoint-Intermezzo ordering on the elements of $A$. To ease reading, let $C = \bigcup_{1 \leq l \leq 4} C_l$. Modify $O$ as follows to obtain an ordering $O'$.
(1) Concatenate $O$ with the sequence $(d_1, d_2, \dots, d_{|A'|})$.
(2) For each $r = (a_i, a_j, a_k)$ in $C$, do one of the following two depending on the order of $a_i$, $a_j$, and $a_k$ in $O$. If $a_i < a_j < a_k$ in $O$, then replace $a_i$ with $a_i, b_r^2$ and replace $a_k$ with $a_k, b_r^3, b_r^1, b_r^4$. Otherwise, if $a_j < a_k < a_i$, replace $a_k$ with $a_k, b_r^3$ and replace $a_i$ with $a_i, b_r^2, b_r^1, b_r^4$.
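Steps (1) and (2) above can be sketched as a short routine. This is an illustration under our own assumptions rather than the authors' implementation: elements are strings, the triples of $C$ are numbered $r = 1, 2, \dots$ in the order given, and the string labels `b{s}_r{r}` and `d{q}` are hypothetical stand-ins for the auxiliary elements $b_r^s$ and $d_q$.

```python
def extend_ordering(order, triples):
    """Build O' from a 4-Disjoint-Intermezzo ordering O (steps (1) and (2)).

    order   -- the ordering O on A, as a list of labels
    triples -- the triples (a_i, a_j, a_k) of C, numbered r = 1, 2, ...
    """
    pos = {a: index for index, a in enumerate(order)}
    inserted_after = {a: [] for a in order}
    for r, (ai, aj, ak) in enumerate(triples, start=1):
        b1, b2, b3, b4 = (f"b{s}_r{r}" for s in (1, 2, 3, 4))
        if pos[ai] < pos[aj] < pos[ak]:
            inserted_after[ai] += [b2]          # a_i becomes a_i, b_r^2
            inserted_after[ak] += [b3, b1, b4]  # a_k becomes a_k, b_r^3, b_r^1, b_r^4
        elif pos[aj] < pos[ak] < pos[ai]:
            inserted_after[ak] += [b3]          # a_k becomes a_k, b_r^3
            inserted_after[ai] += [b2, b1, b4]  # a_i becomes a_i, b_r^2, b_r^1, b_r^4
        else:
            raise ValueError(f"ordering violates triple {(ai, aj, ak)}")
    extended = [x for a in order for x in [a] + inserted_after[a]]
    # |A'| = |A| + 4|C|, which equals len(extended) for distinct triples.
    suffix = [f"d{q}" for q in range(1, len(extended) + 1)]
    return extended + suffix


O = ["a2", "a4", "a3", "a1", "a5", "a6"]
C = [("a1", "a2", "a3"), ("a4", "a5", "a6")]
O_prime = extend_ordering(O, C)
```

For the Intermezzo example from Section 3, the first triple falls into the case $a_j < a_k < a_i$ and the second into the case $a_i < a_j < a_k$, so the resulting ordering has $14$ elements of $A'$ followed by $d_1, \dots, d_{14}$.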

Since $O$ is a 4-Disjoint-Intermezzo ordering with $a_i < a_j < a_k$ or $a_j < a_k < a_i$ for each $(a_i, a_j, a_k) \in C$, it follows from the construction of $O'$ from $O$ that $O'$ is an ordering on the elements in $A' \cup D$. It remains to show that $O'$ is a cherry-picking sequence for $P_I$. First, consider a pendant subtree with leaf set $\{a_i, a_j, d_q\}$ in $T_l$ and $T'_l$ for some $1 \leq l \leq 4$. By construction, $(a_i, a_j)$ is a pair in $B_l$ and, so, we have $a_i < a_j$ in $O$ and $a_i < a_j < d_q$ in $O'$. Second, consider a pendant subtree with leaf set $\{a_i, a_j, a_k, b_r^1, b_r^2, b_r^3, b_r^4, d_q\}$ in $T_l$ and $T'_l$ for some $1 \leq l \leq 4$. By construction, $(a_i, a_j, a_k)$ is a triple in $C_l$ and, so, we have either $a_i < a_j < a_k$ in $O$ and

$$a_i < b_r^2 < a_j < a_k < b_r^3 < b_r^1 < b_r^4 < d_q$$

in $O'$, or $a_j < a_k < a_i$ in $O$ and

$$a_j < a_k < b_r^3 < a_i < b_r^2 < b_r^1 < b_r^4 < d_q$$

in $O'$. Third, consider a pendant subtree with leaf set $\{a_i, d_q\}$ in $T_l$ and $T'_l$ for some $1 \leq l \leq 4$. By construction, we have $a_i < d_q$ in $O'$. Fourth, if $(a_i, a_j, a_k) \in C$, then, as $I$ has a 4-Disjoint-Intermezzo ordering, there does not exist a pair $(a_k, a_j)$ in $B_l$ for some $1 \leq l \leq 4$. Lastly, observe that $(d_1, d_2, \dots, d_{|A'|})$ is a suffix of $O'$ and that, for any two trees, say $S$ and $S'$ in $P_I$, we have that $S|D$ and $S'|D$ are isomorphic. Since $O$ is a 4-Disjoint-Intermezzo ordering, it is now straightforward to check that $O'$ is a cherry-picking sequence of $P_I$. This establishes the proof of the claim and, thereby, the theorem.

The next corollary shows that CPS-Existence is not only NP-complete for a collection of eight rooted binary phylogenetic trees on the same leaf set, but for any such collection with a fixed number $m$ of trees with $m \geq 8$.

Corollary 4.3. Let $P = \{T_1, T_2, \dots, T_m\}$ be a collection of rooted binary phylogenetic $X$-trees. CPS-Existence is NP-complete for any fixed $m$ with $m \geq 8$.

Proof. Clearly, CPS-Existence for $m = t + 8$ with $t \geq 0$ is in NP. To establish the corollary, we show how one can modify the reduction that is described prior to Theorem 4.2 to obtain a set of $t + 8$ rooted binary phylogenetic trees from an instance of 4-Disjoint-Intermezzo. Let $I$ be an instance of 4-Disjoint-Intermezzo. Throughout the remainder of the proof, we assume that there exists an $1 \leq l \leq 4$ such that $|B_l \cup C_l| > t$. Otherwise, since $t = m - 8$ and $m$ is fixed, it follows that $I$ has a constant number of pairs and triples, at most $4(m - 8)$ in total, and is solvable in polynomial time. Now, let $B_i$ and $C_i$ with $i \in \{1, 2, 3, 4\}$ be a collection of pairs and triples, respectively, such that $|B_i \cup C_i| > t$. Theorem 4.2 establishes the result for when $t = 0$. We may therefore assume that $t > 0$ and consider two cases.

First, suppose that $t$ is even. Replace $B_i$ and $C_i$ in $I$ with a partition of $B_i \cup C_i$ into $\frac{t}{2} + 1$ sets. Each of the resulting new sets can be split naturally into a collection of pairs and a collection of triples of which at most one is empty. This results in

$$\left(\frac{t}{2} + 1\right) + (4 - 1) = \frac{t}{2} + 4$$

collections of pairs and triples, respectively. Now, for each $B_l$ and $C_l$ with $l \in \{1, 2, \dots, \frac{t}{2} + 4\}$, construct two rooted binary phylogenetic trees as described in the definition of the set of intermezzo trees associated with $I$. This yields

$$2\left(\frac{t}{2} + 4\right) = t + 8 = m$$

pairwise non-isomorphic trees.

Second, suppose that $t$ is odd. Replace $B_i$ and $C_i$ in $I$ with a partition of $B_i \cup C_i$ into $\frac{t-1}{2} + 1$ sets. Additionally, add one further collection of pairs and one further collection of triples that are both empty. Analogous to the first case, this results in

$$\left(\frac{t-1}{2} + 1\right) + 1 + (4 - 1) = \frac{t-1}{2} + 5$$

collections of pairs and triples, respectively. Again, for each $B_l$ and $C_l$ with $l \in \{1, 2, \dots, \frac{t-1}{2} + 5\}$, construct two rooted binary phylogenetic trees as described in the definition of the set of intermezzo trees associated with $I$. Noting that the two trees for the two empty collections are isomorphic, it follows that the construction yields

$$2\left(\frac{t-1}{2} + 5\right) - 1 = t - 1 + 10 - 1 = t + 8 = m$$
pairwise non-isomorphic trees. Since the proof of Theorem 4.2 generalizes to a set of $m$ intermezzo trees, the corollary now follows for both cases.

5 Bounding the number of cherries

The main result of this section is the following theorem.

Theorem 5.1. Let $P = \{T_1, T_2, \dots, T_m\}$ be a collection of rooted binary phylogenetic $X$-trees. Let $c$ be the maximum element in $\{c_{T_1}, c_{T_2}, \dots, c_{T_m}\}$. Then solving CPS-Existence for $P$ takes time

$$O\left(|X|^{m(4c-2)+1} + \sum_{i=1}^{m} f_i(|X|, c_{T_i})\right),$$

where $f_i(|X|, c_{T_i}) \in |X|^{O(c_{T_i})}$. In particular, the running time is polynomial in $|X|$ if $c$ and $m$ are constant.

Let $T$ be a rooted binary phylogenetic $X$-tree. We denote by $C(T)$ the recursively defined set of trees that contains $T$ and that satisfies the following property.
(P) If a tree $T'$ is in $C(T)$ and $\{a, b\}$ is a cherry in $T'$, then $T'[-a]$ and $T'[-b]$ are also contained in $C(T)$.
We refer to $C(T)$ as the set of cherry-picked trees of $T$. Intuitively, $C(T)$ contains each tree that can be obtained from $T$ by repeatedly deleting a leaf of a cherry.

To establish Theorem 5.1, we consider the set $C(T)$ of cherry-picked trees of $T$. First, we develop a new vector representation for each tree in $C(T)$ and show that the size of $C(T)$ is at most $(|X| + 1)^{O(c_T)}$. We then construct an automaton whose number of states is $|C(T)| + 1$ and that recognizes whether or not a word that contains each element in $X$ precisely once is a cherry-picking sequence for $T$. Lastly, we show how to use a product automaton construction to solve CPS-Existence for a set of rooted binary phylogenetic $X$-trees in time that is polynomial if the number of cherries and the number of trees in $P$ is bounded by a constant.

We start with a simple lemma, which shows that deleting a leaf of a cherry never increases the number of cherries.

Lemma 5.2. Let $T$ be a rooted binary phylogenetic $X$-tree, and let $a$ be an element of a cherry in $T$. Then, $c_T - c_{T[-a]} \in \{0, 1\}$.

Proof. Let $b$ be the unique element in $X$ such that $\{a, b\}$ is a cherry in $T$. Observe that each cherry of $T$ other than $\{a, b\}$ is also a cherry of $T[-a]$. Now, let $p$ be the parent of the parent of $a$ in $T$, and let $c$ be the child of $p$ that is not the parent of $a$. If $c$ is a leaf, then it is easily checked that $\{b, c\}$ is a cherry in $T[-a]$ and, so, $c_T - c_{T[-a]} = 0$. On the other hand, if $c$ is not a leaf, then $b$ is not part of a cherry in $T[-a]$ and, so, $c_T - c_{T[-a]} = 1$.

We now define a labeled tree that will play an important role throughout the remainder of this section. Let $T$ be a rooted binary phylogenetic $X$-tree with cherries $\{\{a_1, b_1\}, \{a_2, b_2\}, \dots, \{a_{c_T}, b_{c_T}\}\}$. Obtain a tree $T_I$ from $T$ as follows.
Step (1) Set $T_I$ to be $T$.
Step (2) Delete all leaves of $T_I$ that are not part of a cherry.
Step (3) Suppress any resulting degree-2 vertex.
Step (4) If the root, say $\rho$, has degree one, delete $\rho$.
Step (5) For each cherry $\{a_i, b_i\}$ with $i \in \{1, 2, \dots, c_T\}$, label the parent of $a_i$ and $b_i$ with $i$, and delete the two leaves $a_i$ and $b_i$.
Step (6) Bijectively label the non-leaf vertices of $T_I$ with $c_T + 1, c_T + 2, \dots, 2c_T - 1$.
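Steps (1) to (6) can be sketched compactly on a nested-pair encoding of trees. The code below is a hedged illustration under our own assumptions: a leaf is a `str`, an interior vertex is a pair of subtrees, the input has at least two leaves, cherries are numbered in left-to-right order, and the output encodes a vertex of the index tree either as an integer (a leaf labeled $i$) or as a triple `(label, left, right)`. Deleting leaves, suppressing degree-2 vertices and removing a degree-one root are merged into a single recursive pass, which produces the same tree as applying Steps (2) to (4) one after another.

```python
from itertools import count


def index_tree(tree):
    """Return (T_I, cherries) for a nested-pair tree with at least two leaves.

    cherries[i - 1] is the cherry whose parent receives label i in Step (5).
    """
    if isinstance(tree, str):
        raise ValueError("the tree needs at least one cherry")
    cherries = []

    def prune(t):
        # Steps (2)-(5): drop leaves outside cherries, suppress the resulting
        # degree-2 vertices, and turn each cherry into a leaf labeled i.
        if isinstance(t, str):
            return None
        left, right = t
        if isinstance(left, str) and isinstance(right, str):
            cherries.append((left, right))
            return len(cherries)
        kept = [s for s in (prune(left), prune(right)) if s is not None]
        return kept[0] if len(kept) == 1 else tuple(kept)

    shape = prune(tree)
    labels = count(len(cherries) + 1)

    def label(t):
        # Step (6): label the non-leaf vertices with c_T + 1, ..., 2 c_T - 1.
        if isinstance(t, int):
            return t
        left, right = label(t[0]), label(t[1])
        return (next(labels), left, right)

    return label(shape), cherries


tree = ((("a1", "b1"), "x"), ((("a2", "b2"), "y"), ("a3", "b3")))
T_I, cherries = index_tree(tree)
```

For the example tree with three cherries, the index tree has leaves labeled 1, 2, 3 and two interior vertices labeled 4 and 5, so it has $2c_T - 1 = 5$ vertices in total; a caterpillar yields the single vertex labeled 1.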

Figure 3: An example of the construction of an index tree. Steps (1) to (6) refer to the corresponding steps in the definition of an index tree. For simplicity, in Step (1), we have only indicated the leaf labels of leaves that are part of a cherry.

We call $T_I$ the index tree of $T$. By construction, $T_I$ is a labeled rooted binary tree that is unique up to relabeling the internal vertices. To illustrate, an example of the construction of an index tree is shown in Figure 3. The next observation follows immediately from the construction of an index tree.

Observation 5.3. Let $T$ be a rooted binary phylogenetic tree, and let $T_I$ be the index tree associated with $T$. The size of $T_I$ is $O(c_T)$. In particular, if the number of cherries in $T$ is constant, the size of $T_I$ is $O(1)$.

We next define a particular vector relative to a given set. Let $S$ be a finite set, let $\epsilon$ be an element that is not in $S$, and let $n$ be a non-negative integer. We call

$$v = \left([\xi_1](x_1^1, x_1^2, \dots, x_1^{q_1}, \epsilon),\ [\xi_2](x_2^1, x_2^2, \dots, x_2^{q_2}, \epsilon),\ \dots,\ [\xi_n](x_n^1, x_n^2, \dots, x_n^{q_n}, \epsilon)\right)$$

an $S$-vector if each element in $S$ appears at most once in $v$, each $\xi_i$ is an element in $S \cup \{\epsilon\}$, and each $x_i^j$ is an element in $S$. Now consider the following two $S$-vectors:

$$v = \left([\xi_1](x_1^1, x_1^2, \dots, x_1^{q_1}, \epsilon),\ [\xi_2](x_2^1, x_2^2, \dots, x_2^{q_2}, \epsilon),\ \dots,\ [\xi_n](x_n^1, x_n^2, \dots, x_n^{q_n}, \epsilon)\right)$$

and

$$v' = \left([\psi_1](y_1^1, y_1^2, \dots, y_1^{r_1}, \epsilon),\ [\psi_2](y_2^1, y_2^2, \dots, y_2^{r_2}, \epsilon),\ \dots,\ [\psi_n](y_n^1, y_n^2, \dots, y_n^{r_n}, \epsilon)\right).$$

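We say that $v'$ has the suffix-property relative to $v$ if, for each $s \in \{1, 2, \ldots, n\}$, the vector component $[\psi_s](y_s^1, y_s^2, \ldots, y_s^{r_s}, \epsilon)$ is equal to $[\psi_s](\epsilon)$ or satisfies each of the following equations:
$$y_s^{r_s} = x_s^{q_s},\quad y_s^{r_s - 1} = x_s^{q_s - 1},\quad \ldots,\quad y_s^{1} = x_s^{q_s - r_s + 1}.$$
Lastly, if $v'$ has the property that $[\psi_i](y_i^1, y_i^2, \ldots, y_i^{r_i}, \epsilon) = [\epsilon](\epsilon)$ for each $i \in \{1, 2, \ldots, n\}$, we call $v'$ the empty vector. Note that the empty vector satisfies the suffix-property relative to every $S$-vector.

Building on the definition of an $S$-vector, we now describe a vector representation of a rooted binary phylogenetic tree that can be constructed by using its index tree as a guide. Roughly, the representation associates a caterpillar-type structure to each vertex in the index tree.

Let $\mathcal{T}$ be a rooted binary phylogenetic $X$-tree, let $X' \subseteq X$, and let $\epsilon \notin X$. For two vertices $u$ and $v$ in $\mathcal{T}$, we say that $u$ (resp. $v$) is an ancestor (resp. descendant) of $v$ (resp. $u$) if there is a directed path from $u$ to $v$ in $\mathcal{T}$. Throughout this section, we regard a vertex $v$ of $\mathcal{T}$ to be an ancestor and a descendant of itself. The most recent common ancestor of $X'$ is the vertex $v$ in $\mathcal{T}$ whose set of descendants contains $X'$ and no descendant of $v$, except $v$ itself, has this property. We denote $v$ by $\mathrm{mrca}_{\mathcal{T}}(X')$.

Now, let $\{\{a_1, b_1\}, \{a_2, b_2\}, \ldots, \{a_{c_{\mathcal{T}}}, b_{c_{\mathcal{T}}}\}\}$ be the set of all cherries in $\mathcal{T}$. First, for each leaf $i \in \{1, 2, \ldots, c_{\mathcal{T}}\}$ in $\mathcal{T}_I$, let $(a_i, x_i^1, x_i^2, \ldots, x_i^{q_i})$ be the maximal pendant caterpillar in $\mathcal{T}$ with cherry $\{a_i, b_i\}$. We denote this by $[\xi_i](x_i^1, x_i^2, \ldots, x_i^{q_i}, \epsilon)$, where $\xi_i = a_i$ and $x_i^1 = b_i$. Second, for each non-leaf vertex labeled $i$ in $\mathcal{T}_I$ with $i \in \{c_{\mathcal{T}} + 1, c_{\mathcal{T}} + 2, \ldots, 2c_{\mathcal{T}} - 1\}$, let $v_i$ be the vertex in $\mathcal{T}$ such that
$$v_i = \mathrm{mrca}_{\mathcal{T}}(\{a_j, b_j : j \text{ is a descendant of } i \text{ in } \mathcal{T}_I\}),$$
and let $\mathcal{T}_i$ be the rooted binary phylogenetic tree obtained from $\mathcal{T}$ by replacing the pendant subtree rooted at $v_i$ with a leaf labeled $v_i$. Now, if $v_i$ is a leaf of a cherry in $\mathcal{T}_i$, let $(v_i, x_i^1, x_i^2, \ldots, x_i^{q_i})$ be the maximal pendant caterpillar in $\mathcal{T}_i$ with cherry $\{v_i, x_i^1\}$. We denote this by $[\epsilon](x_i^1, x_i^2, \ldots, x_i^{q_i}, \epsilon)$. Otherwise, if $v_i$ is not a leaf of a cherry in $\mathcal{T}_i$, we denote this by $[\epsilon](\epsilon)$. Now, recall that $2c_{\mathcal{T}} - 1$ is the number of vertices in $\mathcal{T}_I$. Setting $n = 2c_{\mathcal{T}} - 1$, we call
$$v_{\mathcal{T}} = \big([\xi_1](x_1^1, x_1^2, \ldots, x_1^{q_1}, \epsilon),\ [\xi_2](x_2^1, x_2^2, \ldots, x_2^{q_2}, \epsilon),\ \ldots,\ [\xi_n](x_n^1, x_n^2, \ldots, x_n^{q_n}, \epsilon)\big)$$
the vector representation of $\mathcal{T}$ relative to $\mathcal{T}_I$, and note that
$$\xi_{c_{\mathcal{T}}+1} = \xi_{c_{\mathcal{T}}+2} = \cdots = \xi_{2c_{\mathcal{T}}-1} = \xi_n = \epsilon.$$
An example of a tree and its vector representation is shown in Figure 4.

Let $\mathcal{T}$ be a rooted binary phylogenetic $X$-tree with $\epsilon \notin X$, and let $\mathcal{T}_I$ be the index tree of $\mathcal{T}$. Let $v_{\mathcal{T}}$ be the vector representation relative to $\mathcal{T}_I$. Furthermore, let $\mathcal{T}'$ be an element in $\mathcal{C}(\mathcal{T})$, and let
$$v_{\mathcal{T}'} = \big([\psi_1](y_1^1, y_1^2, \ldots, y_1^{r_1}, \epsilon),\ [\psi_2](y_2^1, y_2^2, \ldots, y_2^{r_2}, \epsilon),\ \ldots,\ [\psi_n](y_n^1, y_n^2, \ldots, y_n^{r_n}, \epsilon)\big)$$
be an $X$-vector for $\mathcal{T}'$. We say that $v_{\mathcal{T}'}$ has the cherry-property relative to $v_{\mathcal{T}}$ if, for each cherry $\{a, b\}$ in $\mathcal{T}'$, exactly one of the following conditions holds: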

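(i) There is an index $s \in \{1, 2, \ldots, n\}$ such that $\{\psi_s, y_s^1\} = \{a, b\}$.
(ii) There are two distinct indices $s, t \in \{1, 2, \ldots, n\}$ such that $\{\psi_s, \psi_t\} = \{a, b\}$, the two corresponding vector components are $[\psi_s](\epsilon)$ and $[\psi_t](\epsilon)$, respectively, there is a vertex labeled $u$ in $\mathcal{T}_I$ whose two children are labeled $s$ and $t$, and $\psi_u = \epsilon$.

[Figure 4: A rooted binary phylogenetic tree $\mathcal{T}$ on the leaves $x_1, x_2, \ldots, x_{12}$ whose index tree $\mathcal{T}_I$ is shown in Step (6) of Figure 3. The vector representation of $\mathcal{T}$ relative to $\mathcal{T}_I$ is $([x_1](x_2, x_3, \epsilon), [x_6](x_7, \epsilon), [x_9](x_{10}, x_8, \epsilon), [\epsilon](x_5, x_4, \epsilon), [\epsilon](x_{11}, x_{12}, \epsilon))$.]

To establish Theorem 5.1, we next prove three lemmas.

Lemma 5.4. Let $\mathcal{T}$ be a rooted binary phylogenetic $X$-tree, and let $v_{\mathcal{T}}$ be the vector representation of $\mathcal{T}$ relative to an index tree of $\mathcal{T}$. Then each tree $\mathcal{T}'$ in $\mathcal{C}(\mathcal{T})$ can be mapped to an $X$-vector that satisfies the suffix-property and the cherry-property relative to $v_{\mathcal{T}}$. Moreover, the mapping is one-to-one.

Proof. Set $n = 2c_{\mathcal{T}} - 1$. We define a mapping $f$ from the elements in $\mathcal{C}(\mathcal{T})$ into the set of all $X$-vectors that satisfy the suffix-property and the cherry-property relative to $v_{\mathcal{T}}$. First, we map $\mathcal{T}$ to $v_{\mathcal{T}}$ and note that $v_{\mathcal{T}}$ satisfies the suffix-property and the cherry-property relative to $v_{\mathcal{T}}$. Second, we map the element $\emptyset$ in $\mathcal{C}(\mathcal{T})$ to the empty vector, say $v_{\emptyset}$, with $n$ vector components. Again, $v_{\emptyset}$ satisfies the suffix-property and the cherry-property relative to $v_{\mathcal{T}}$. Now, let $\mathcal{T}''$ be an element in $\mathcal{C}(\mathcal{T}) \setminus \{\mathcal{T}, \emptyset\}$. Recalling the recursive definition of $\mathcal{C}(\mathcal{T})$, there exists a tree $\mathcal{T}'$ in $\mathcal{C}(\mathcal{T})$ with cherry $\{a, b\}$ such that $\mathcal{T}'[-a]$ is isomorphic to $\mathcal{T}''$. Suppose that $f$ (defined below) has already mapped $\mathcal{T}'$ to the $X$-vector
$$v_{\mathcal{T}'} = \big([\psi_1](y_1^1, y_1^2, \ldots, y_1^{r_1}, \epsilon),\ [\psi_2](y_2^1, y_2^2, \ldots, y_2^{r_2}, \epsilon),\ \ldots,\ [\psi_n](y_n^1, y_n^2, \ldots, y_n^{r_n}, \epsilon)\big),$$
that satisfies the suffix-property as well as the cherry-property relative to $v_{\mathcal{T}}$. Then $f$ maps $\mathcal{T}''$ to a vector that can be obtained from $v_{\mathcal{T}'}$ in one of the following two cases.

(M1) If there is an index $s \in \{1, 2, \ldots, n\}$ such that $\{\psi_s, y_s^1\} = \{a, b\}$, then $f$ maps $\mathcal{T}''$ to a vector $v_{\mathcal{T}''}$ that is obtained from $v_{\mathcal{T}'}$ by replacing the vector component $[\psi_s](y_s^1, y_s^2, \ldots, y_s^{r_s}, \epsilon)$ with $[b](y_s^2, y_s^3, \ldots, y_s^{r_s}, \epsilon)$.

(M2) Otherwise, there are two indices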
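$s, t \in \{1, 2, \ldots, n\}$ with $s \neq t$ such that $\{\psi_s, \psi_t\} = \{a, b\}$, where the two corresponding components have the form $[\psi_s](\epsilon)$ and $[\psi_t](\epsilon)$, respectively. Furthermore, by construction, there is a vertex labeled $u$ in $\mathcal{T}_I$ whose two children are the vertices labeled $s$ and $t$ and $\psi_u = \epsilon$. Then, $f$ maps $\mathcal{T}''$ to a vector $v_{\mathcal{T}''}$ that is obtained from $v_{\mathcal{T}'}$ by replacing each of the two vector components $[\psi_s](\epsilon)$ and $[\psi_t](\epsilon)$ with $[\epsilon](\epsilon)$, and replacing the vector component $[\epsilon](y_u^1, y_u^2, \ldots, y_u^{r_u}, \epsilon)$ with $[b](y_u^1, y_u^2, \ldots, y_u^{r_u}, \epsilon)$.

For both cases, it is easily checked that $v_{\mathcal{T}''}$ is an $X$-vector that satisfies the suffix-property relative to $v_{\mathcal{T}}$. We next show that $v_{\mathcal{T}''}$ satisfies the cherry-property relative to $v_{\mathcal{T}}$. By Lemma 5.2, we have $c_{\mathcal{T}'} - c_{\mathcal{T}''} \in \{0, 1\}$. If $c_{\mathcal{T}'} - 1 = c_{\mathcal{T}''}$ then, by construction, each cherry in $\mathcal{T}''$ is a cherry in $\mathcal{T}'$. Hence, as $v_{\mathcal{T}'}$ satisfies the cherry-property relative to $v_{\mathcal{T}}$, we have that $v_{\mathcal{T}''}$ satisfies the cherry-property relative to $v_{\mathcal{T}}$. Otherwise, if $c_{\mathcal{T}'} = c_{\mathcal{T}''}$, then the cherry $\{a, b\}$ in $\mathcal{T}'$ is replaced with a new cherry that contains $b$, while all other cherries in $\mathcal{T}'$ are also cherries in $\mathcal{T}''$.

First, suppose that $\mathcal{T}''$ is obtained from $\mathcal{T}'$ according to mapping (M1). Observe that $r_s \geq 1$. If $r_s \geq 2$, then $\{b, y_s^2\}$ is the new cherry and, thus, $v_{\mathcal{T}''}$ satisfies the cherry-property relative to $v_{\mathcal{T}}$. On the other hand, if $r_s = 1$, let $[\psi_t](y_t^1, y_t^2, \ldots, y_t^{r_t}, \epsilon)$ be the vector component in $v_{\mathcal{T}'}$ such that the vertices labeled $s$ and $t$ in $\mathcal{T}_I$ have the same parent. Note that $t$ exists because, otherwise, $s$ is the root of $\mathcal{T}_I$ and so the existence of a cherry in $\mathcal{T}''$ that is not a cherry in $\mathcal{T}'$ implies that $r_s \geq 2$; a contradiction. If $r_t \geq 1$, then $\{\psi_t, y_t^1\}$ is a cherry in $\mathcal{T}'$ and $\mathcal{T}''$. Thus, the sibling of $b$ in $\mathcal{T}''$ is not a leaf, thereby contradicting that $b$ is a leaf of a cherry in $\mathcal{T}''$. Hence, $[\psi_t](y_t^1, y_t^2, \ldots, y_t^{r_t}, \epsilon) = [\psi_t](\epsilon)$. Now, as $[b](\epsilon)$ and $[\psi_t](\epsilon)$ are two vector components of $v_{\mathcal{T}''}$, it again follows that $v_{\mathcal{T}''}$ satisfies the cherry-property relative to $v_{\mathcal{T}}$.

Second, suppose that $\mathcal{T}''$ is obtained from $\mathcal{T}'$ according to mapping (M2). Noting that $b$ is an element of the vector component $[\psi_u](y_u^1, y_u^2, \ldots, y_u^{r_u}, \epsilon)$ with $\psi_u = b$ in $v_{\mathcal{T}''}$, the result can be established by using an argument that is similar to the previous case.

It remains to show that the mapping is one-to-one. Let $\mathcal{T}'$ and $\mathcal{T}''$ be two distinct elements in $\mathcal{C}(\mathcal{T}) \setminus \{\emptyset\}$. Since each element in $\mathcal{C}(\mathcal{T}) \setminus \{\mathcal{T}, \emptyset\}$ can be obtained from $\mathcal{T}$ by repeatedly deleting a leaf of a cherry and suppressing the resulting degree-2 vertex, there exists an element $\ell$ in $X$ that is a leaf in $\mathcal{T}'$ and not a leaf in $\mathcal{T}''$. Let $X'$ and $X''$ be the leaf set of $\mathcal{T}'$ and $\mathcal{T}''$, respectively. Noting that $v_{\mathcal{T}}$ is an $X$-vector that contains each element in $X$ exactly once, it follows from construction of the mapping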
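that $v_{\mathcal{T}'}$ is an $X'$-vector that contains each element in $X'$ exactly once and that $v_{\mathcal{T}''}$ is an $X''$-vector that contains each element in $X''$ exactly once. Hence $v_{\mathcal{T}'} \neq v_{\mathcal{T}''}$. Moreover, since no element in $\mathcal{C}(\mathcal{T}) \setminus \{\emptyset\}$ is mapped to $v_{\emptyset}$, the mapping is one-to-one. This completes the proof of the lemma.

Lemma 5.5. Let $\mathcal{T}$ be a rooted binary phylogenetic $X$-tree. Then $|\mathcal{C}(\mathcal{T})| \leq (|X| + 1)^{4c_{\mathcal{T}} - 2}$.

Proof. Let
$$v_{\mathcal{T}} = \big([\xi_1](x_1^1, x_1^2, \ldots, x_1^{q_1}, \epsilon),\ [\xi_2](x_2^1, x_2^2, \ldots, x_2^{q_2}, \epsilon),\ \ldots,\ [\xi_n](x_n^1, x_n^2, \ldots, x_n^{q_n}, \epsilon)\big)$$
be the vector representation of $\mathcal{T}$ relative to an index tree of $\mathcal{T}$, where $n = 2c_{\mathcal{T}} - 1$. We first derive an upper bound on the number of $X$-vectors that satisfy the suffix-property relative to $v_{\mathcal{T}}$. For each $i \in \{1, 2, \ldots, n\}$, consider the vector component $[\xi_i](x_i^1, x_i^2, \ldots, x_i^{q_i}, \epsilon)$. Then each $X$-vector that satisfies the suffix-property relative to $v_{\mathcal{T}}$ has an $i$th vector component, say $[\psi_i](y_i^1, y_i^2, \ldots, y_i^{r_i}, \epsilon)$, such that $\psi_i \in X \cup \{\epsilon\}$ and $(y_i^1, y_i^2, \ldots, y_i^{r_i}, \epsilon)$ is a suffix of $(x_i^1, x_i^2, \ldots, x_i^{q_i}, \epsilon)$. Since there are at most $|X| + 1$ such suffixes, it follows that there are at most $(|X| + 1)^2$ variations of $[\psi_i](y_i^1, y_i^2, \ldots, y_i^{r_i}, \epsilon)$. Hence, there are at most
$$\big((|X| + 1)^2\big)^n = (|X| + 1)^{4c_{\mathcal{T}} - 2}$$
$X$-vectors that satisfy the suffix-property relative to $v_{\mathcal{T}}$. By Lemma 5.4, each tree in $\mathcal{C}(\mathcal{T})$ can be mapped to one such vector and, as the map is one-to-one, it follows that $\mathcal{C}(\mathcal{T})$ contains at most $(|X| + 1)^{4c_{\mathcal{T}} - 2}$ trees.

For a rooted binary phylogenetic $X$-tree $\mathcal{T}$, the next lemma constructs an automaton that recognizes whether or not a word that contains each element in $X$ precisely once is a cherry-picking sequence for $\mathcal{T}$.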

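Lemma 5.6. Let $\mathcal{T}$ be a rooted binary phylogenetic $X$-tree. There is a deterministic finite automaton $\mathcal{A}_{\mathcal{T}}$ with $O(|X|^{4c_{\mathcal{T}} - 2})$ states that recognizes the language
$$L_X(\mathcal{T}) = \{x_1 x_2 \ldots x_{|X|} : (x_1, x_2, \ldots, x_{|X|}) \text{ is a cherry-picking sequence for } \mathcal{T}\}.$$
Moreover, the automaton $\mathcal{A}_{\mathcal{T}}$ can be constructed in time $f(|X|, c_{\mathcal{T}}) \in |X|^{O(c_{\mathcal{T}})}$.

Proof. Throughout this proof, we denote the tree without a vertex by $\emptyset$. Let $M$ and $M'$ be two sets. Setting $M = \{\mathcal{T}\}$ and $M' = \emptyset$, we construct $\mathcal{A}_{\mathcal{T}}$ as follows.

(1) Create the states $q_{\mathcal{T}}$, $q_{\emptyset}$, and $q_e$. For each $a \in X$, set $\delta(q_e, a) = \delta(q_{\emptyset}, a) = q_e$.
(2) For each $\mathcal{T}' \in M$ and each $a \in X$ do the following.
    (a) If $a$ is a leaf of a cherry in $\mathcal{T}'$ or $a$ is the only vertex of $\mathcal{T}'$, then
        (i) create the state $q_{\mathcal{T}'[-a]}$ if $\mathcal{T}'[-a]$ is not isomorphic to a tree in $M'$,
        (ii) set $M' = M' \cup \{\mathcal{T}'[-a]\}$, and
        (iii) set $\delta(q_{\mathcal{T}'}, a) = q_{\mathcal{T}'[-a]}$.
    (b) Otherwise, set $\delta(q_{\mathcal{T}'}, a) = q_e$.
(3) Set $M = M'$ and, subsequently, set $M' = \emptyset$. If $M \neq \{\emptyset\}$, continue with (2).

We set the initial state of $\mathcal{A}_{\mathcal{T}}$ to be $q_{\mathcal{T}}$ and the final state to be $q_{\emptyset}$. To illustrate, the construction of $\mathcal{A}_{\mathcal{T}}$ is shown in Figure 5 for a phylogenetic tree on four leaves.

By construction, we have $\mathcal{A}_{\mathcal{T}} = (\mathcal{C}(\mathcal{T}), X, \delta, q_{\mathcal{T}}, \{q_{\emptyset}\})$. As each cherry-picked tree in $\mathcal{C}(\mathcal{T})$ is mapped to a unique state, it follows from Lemma 5.5 that the number of states of $\mathcal{A}_{\mathcal{T}}$ is $O(|X|^{4c_{\mathcal{T}} - 2})$. Moreover, for each $a \in X$ and each pair of two distinct states $\mathcal{T}', \mathcal{T}'' \in \mathcal{C}(\mathcal{T})$, there is a transition $\delta(q_{\mathcal{T}'}, a) = q_{\mathcal{T}''}$ if and only if $\mathcal{T}'[-a] = \mathcal{T}''$ and $a$ is either a leaf of a cherry in $\mathcal{T}'$ or $\mathcal{T}'$ consists of the single vertex $a$. The state $q_e$ collects all inputs that do not correspond to the continuation of a cherry-picking sequence. More precisely, there is a transition $\delta(q_{\mathcal{T}'}, a) = q_e$ if and only if $a$ is not a leaf of a cherry in $\mathcal{T}'$ and $\mathcal{T}'$ does not consist of the single vertex $a$. It now follows that there is a one-to-one correspondence between the directed paths from $q_{\mathcal{T}}$ to $q_{\emptyset}$ in $\mathcal{A}_{\mathcal{T}}$ and the cherry-picking sequences of $\mathcal{T}$ and, hence, $\mathcal{A}_{\mathcal{T}}$ recognizes $L_X(\mathcal{T})$.

The time taken to construct $\mathcal{A}_{\mathcal{T}}$ is dominated by the number of iterations of the for-loop in Step (2). Since $|M| < |\mathcal{C}(\mathcal{T})|$ and $|\mathcal{C}(\mathcal{T})| \in O(|X|^{4c_{\mathcal{T}} - 2})$, the number of iterations in Step (2) is $O(|\mathcal{C}(\mathcal{T})| \cdot |X|) \subseteq O(|X|^{4c_{\mathcal{T}} - 1})$. Moreover, since Step (2) is executed $|X|$ times, each operation of the for-loop is executed $O(|X|^{4c_{\mathcal{T}}})$ times in total. While the complexity of these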
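operations depends on the implementation and data structure, they can clearly be implemented such that $\mathcal{A}_{\mathcal{T}}$ can be constructed in time $|X|^{O(c_{\mathcal{T}})}$. This establishes the lemma.

Generalizing the language that is described in the statement of Lemma 5.6, the next straightforward observation describes a language for the decision problem CPS-Existence.

Observation 5.7. Let $\mathcal{P} = \{\mathcal{T}_1, \mathcal{T}_2, \ldots, \mathcal{T}_m\}$ be a collection of rooted binary phylogenetic $X$-trees. Then, solving CPS-Existence for $\mathcal{P}$ is equivalent to deciding if
$$\bigcap_{1 \leq i \leq m} L_X(\mathcal{T}_i) \neq \emptyset.$$

We are now in a position to establish Theorem 5.1.

Proof of Theorem 5.1. By Observation 5.7, it follows that there is a cherry-picking sequence for $\mathcal{P}$ if and only if
$$\bigcap_{\mathcal{T}_i \in \mathcal{P}} L_X(\mathcal{T}_i) \neq \emptyset,$$
where $L_X(\mathcal{T}) = \{x_1 x_2 \ldots x_{|X|} : (x_1, x_2, \ldots, x_{|X|}) \text{ is a cherry-picking sequence for } \mathcal{T}\}$.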
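On very small instances, the equivalence in Observation 5.7 can be checked by brute force. The sketch below is only an illustration of the language-intersection view and not the automaton-based algorithm of this section: it enumerates all $|X|!$ orderings. It assumes that trees are given as nested 2-tuples of string labels, and the helper names (`canon`, `cherry_leaves`, `delete`, `is_cps`, `common_cps`) are hypothetical.

```python
from itertools import permutations


def canon(t):
    """Convert nested 2-tuples to nested frozensets (child order is irrelevant)."""
    return t if isinstance(t, str) else frozenset(canon(c) for c in t)


def leaves(t):
    return {t} if isinstance(t, str) else set().union(*(leaves(c) for c in t))


def cherry_leaves(t):
    """All leaves of t that belong to a cherry."""
    if isinstance(t, str):
        return set()
    if all(isinstance(c, str) for c in t):
        return set(t)
    return set().union(*(cherry_leaves(c) for c in t))


def delete(t, a):
    """T[-a]: delete leaf a and suppress the degree-2 vertex; None is the empty tree."""
    if isinstance(t, str):
        return None if t == a else t
    kids = [d for d in (delete(c, a) for c in t) if d is not None]
    return kids[0] if len(kids) == 1 else frozenset(kids)


def is_cps(t, seq):
    """True if seq is a cherry-picking sequence for the tree t."""
    for x in seq:
        if t is None:
            return False
        # x must be a leaf of a cherry, or the only remaining vertex.
        if x not in cherry_leaves(t) and t != x:
            return False
        t = delete(t, x)
    return t is None


def common_cps(trees):
    """Return a word in the intersection of all L_X(T_i), or None if it is empty."""
    for seq in permutations(sorted(leaves(trees[0]))):
        if all(is_cps(t, seq) for t in trees):
            return seq
    return None


t1 = canon((("a", "b"), ("c", "d")))
t2 = canon((("a", "c"), ("b", "d")))
print(common_cps([t1, t2]))  # ('a', 'd', 'b', 'c')

# Two caterpillars whose unique cherries {a, b} and {c, d} are disjoint:
# no first element can be picked in both, so the intersection is empty.
t3 = canon(((("a", "b"), "c"), "d"))
t4 = canon(((("c", "d"), "a"), "b"))
print(common_cps([t3, t4]))  # None
```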

[Figure 5: Construction of an automaton that recognizes the language $L_X(\mathcal{T}_1)$ as described in the statement of Lemma 5.6 and with $\mathcal{T}_1$ shown in the top left of this figure. Each vertex (resp. edge) represents a state (resp. transition). The vertex $q_{\mathcal{T}_1}$ indicates the initial state whereas the final state is $q_{\emptyset}$ as indicated by a double circle. To increase readability, most transitions to $q_e$ are omitted. In row $i$, the figure shows $M$, $M'$, and the automaton after the $i$th execution of the for-loop as described in Step (2) in the proof of Lemma 5.6.]
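The layer-by-layer construction from the proof of Lemma 5.6 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: trees are stored as nested frozensets so that isomorphic trees coincide, the empty tree $\emptyset$ is represented by `None`, and transitions to the error state $q_e$ are left implicit as missing keys of `delta`. All function names are hypothetical, and the four-leaf caterpillar is an example chosen for illustration.

```python
def canon(t):
    """Convert nested 2-tuples to nested frozensets (child order is irrelevant)."""
    return t if isinstance(t, str) else frozenset(canon(c) for c in t)


def cherry_leaves(t):
    """All leaves of t that belong to a cherry."""
    if isinstance(t, str):
        return set()
    if all(isinstance(c, str) for c in t):
        return set(t)
    return set().union(*(cherry_leaves(c) for c in t))


def delete(t, a):
    """T[-a]: delete leaf a and suppress the degree-2 vertex; None is the empty tree."""
    if isinstance(t, str):
        return None if t == a else t
    kids = [d for d in (delete(c, a) for c in t) if d is not None]
    return kids[0] if len(kids) == 1 else frozenset(kids)


def build_automaton(tree, alphabet):
    """Return (delta, states); states is C(T), with None standing for the empty tree."""
    delta = {}
    states = {tree}
    layer = [tree]                      # plays the role of M
    while layer:
        next_layer = []                 # plays the role of M'
        for t in layer:
            pickable = cherry_leaves(t)
            for a in alphabet:
                if a in pickable or t == a:
                    s = delete(t, a)
                    delta[(t, a)] = s
                    if s is not None and s not in states:
                        states.add(s)
                        next_layer.append(s)
                # otherwise: implicit transition to the error state q_e
        layer = next_layer
    states.add(None)
    return delta, states


def accepts(delta, start, word):
    """Run the automaton; the only accepting state is the empty tree."""
    state = start
    for a in word:
        if (state, a) not in delta:     # would move to q_e
            return False
        state = delta[(state, a)]
    return state is None


T = canon(((("a", "b"), "c"), "d"))     # caterpillar with the single cherry {a, b}
delta, states = build_automaton(T, "abcd")
print(len(states))                      # 11 cherry-picked trees, including the empty tree
print(accepts(delta, T, "abcd"))        # True
print(accepts(delta, T, "cabd"))        # False: c is not a leaf of a cherry in T
```

For this example, $c_{\mathcal{T}} = 1$ and $|X| = 4$, so the bound of Lemma 5.5 evaluates to $(4 + 1)^{4 \cdot 1 - 2} = 25$, which is consistent with the 11 trees found.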

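For each $\mathcal{T}_i \in \mathcal{P}$ with $1 \leq i \leq m$, we follow the notation and construction that is described in the proof of Lemma 5.6 to obtain an automaton $\mathcal{A}_{\mathcal{T}_i}$ with $O(|X|^{4c_{\mathcal{T}_i} - 2})$ states that recognizes the language $L_X(\mathcal{T}_i)$. To solve the question whether or not the intersection of these $m$ languages is empty, we use the well-known construction of a product automaton [15] as follows. For each $\mathcal{T}_i \in \mathcal{P}$, let $Q_{\mathcal{T}_i}$ be the set of states, and let $\delta_{\mathcal{T}_i}$ be the transition relation of $\mathcal{A}_{\mathcal{T}_i}$. We construct a new automaton $\mathcal{A}_{\mathcal{P}}$, where the set of states $Q_{\mathcal{P}}$ is the cartesian product $Q_{\mathcal{T}_1} \times Q_{\mathcal{T}_2} \times \cdots \times Q_{\mathcal{T}_m}$. Furthermore, the alphabet of $\mathcal{A}_{\mathcal{P}}$ is $X$ and the transition relation $\delta_{\mathcal{P}} : Q_{\mathcal{P}} \times X \to Q_{\mathcal{P}}$ is defined as
$$\delta_{\mathcal{P}}((q_1, \ldots, q_m), a) = (\delta_{\mathcal{T}_1}(q_1, a), \ldots, \delta_{\mathcal{T}_m}(q_m, a)).$$
Lastly, the initial (resp. final) state of $\mathcal{A}_{\mathcal{P}}$ is $(q_1, \ldots, q_m)$ where, for all $i \in \{1, 2, \ldots, m\}$, $q_i$ is the initial (resp. final) state of $\mathcal{A}_{\mathcal{T}_i}$. Intuitively, $\mathcal{A}_{\mathcal{P}}$ simulates the parallel execution of the automata $\mathcal{A}_{\mathcal{T}_1}, \mathcal{A}_{\mathcal{T}_2}, \ldots, \mathcal{A}_{\mathcal{T}_m}$. By construction, an input sequence is accepted by $\mathcal{A}_{\mathcal{P}}$ if and only if it is accepted by each automaton $\mathcal{A}_{\mathcal{T}_i}$. It now follows that there is a cherry-picking sequence for $\mathcal{P}$ if and only if the final state of $\mathcal{A}_{\mathcal{P}}$ can be reached from the initial state of $\mathcal{A}_{\mathcal{P}}$ and, hence, $L(\mathcal{A}_{\mathcal{P}}) \neq \emptyset$.

It remains to show that the computational complexity is as claimed in the statement of the theorem. Viewing $\mathcal{A}_{\mathcal{P}}$ as a directed graph, each directed path from the initial to the final state of $\mathcal{A}_{\mathcal{P}}$ has length $|X|$. We can therefore decide whether the final state from $\mathcal{A}_{\mathcal{P}}$ is reachable by using breadth-first search [8] in time $O(|Q_{\mathcal{P}}| + |X||Q_{\mathcal{P}}|)$, where $|X||Q_{\mathcal{P}}|$ is the number of transitions in $\mathcal{A}_{\mathcal{P}}$. By construction and Lemma 5.6, it follows that $\mathcal{A}_{\mathcal{P}}$ has
$$O\Big(\prod_{i=1}^{m} |X|^{4c_{\mathcal{T}_i} - 2}\Big) \subseteq O\big(|X|^{m(4c-2)}\big)$$
states, i.e. $|Q_{\mathcal{P}}| \in O(|X|^{m(4c-2)})$. Hence, we can decide in time $O(|X|^{m(4c-2)+1})$ whether $L(\mathcal{A}_{\mathcal{P}}) \neq \emptyset$. By Lemma 5.6, it takes time $f_i(|X|, c_{\mathcal{T}_i}) \in |X|^{O(c_{\mathcal{T}_i})}$ to construct each automaton $\mathcal{A}_{\mathcal{T}_i}$ and, thus, it follows that deciding if there is a cherry-picking sequence for $\mathcal{P}$ can be done in time
$$O\Big(|X|^{m(4c-2)+1} + \sum_{i=1}^{m} f_i(|X|, c_{\mathcal{T}_i})\Big),$$
which is polynomial in $|X|$ if $c$ and $m$ are constant.

6 Concluding remarks

In this paper, we have shown that CPS-Existence, a problem of relevance to the construction of phylogenetic networks from a set of phylogenetic trees, is NP-complete for all sets $\mathcal{P}$ of rooted binary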
phylogenetic trees with $|\mathcal{P}| \geq 8$. This result partially answers a question posed by Humphries et al. [12]. They asked if CPS-Existence is computationally hard for $|\mathcal{P}| = 2$. To establish our result, we first showed that 4-Disjoint-Intermezzo, which is a variant of the Intermezzo problem that is new to this paper, is NP-complete. Subsequently, we established a reduction from an instance $I$ of 4-Disjoint-Intermezzo to an instance $I'$ of CPS-Existence with $|\mathcal{P}| = 8$. Since each of the four collections of pairs and triples in $I$ reduces to two trees in $I'$, a possible approach to obtain a stronger hardness result for CPS-Existence with $|\mathcal{P}| < 8$ is to show that $N$-Disjoint-Intermezzo is NP-complete for $N < 4$. However, it seems likely that such a result can only be achieved by following a strategy that is different from the one that we used in this paper. In particular, there is no obvious reduction from 2P2N-3-SAT to 3-Disjoint-Intermezzo. Moreover, 1-Disjoint-Intermezzo is solvable in polynomial time since all pairs and triples are pairwise disjoint and, so, it cannot be used for a reduction even if CPS-Existence turns out to be NP-complete for $|\mathcal{P}| = 2$.

In the second part of the paper, we have translated CPS-Existence into an equivalent problem on languages and used automata theory to show that CPS-Existence can be solved in polynomial time if the number of trees in $\mathcal{P}$ and the number of cherries in each such tree are bounded by a constant. There are currently only a small number of other problems in phylogenetics that have been solved with the help of automata theory (e.g. [10, 21]) and it is to be hoped that the results presented in this paper will stimulate further research to explore connections between combinatorial problems in phylogenetics and automata theory.

Acknowledgements

We thank Britta Dorn for insightful comments on a draft version of this paper. The second author was supported by the New Zealand Marsden Fund.

References

[1] B. Albrecht, C. Scornavacca, A. Cenci, and D. H. Huson (2012). Fast computation of minimum hybridization networks. Bioinformatics, 28, 191–197.
[2] M. Baroni, S. Grünewald, V. Moulton, and C. Semple (2005). Bounding the number of hybridization events for a consistent evolutionary history. Journal of Mathematical Biology, 51, 171–182.
[3] M. Baroni, C. Semple, and M. Steel (2006). Hybrids in real time. Systematic Biology, 55, 46–56.
[4] M. Bordewich and C. Semple (2007). Computing the minimum number of hybridization events for a consistent evolutionary history. Discrete Applied Mathematics, 155, 914–928.
[5] P. Berman, M. Karpinski, and A. D. Scott (2003). Approximation hardness of short symmetric instances of MAX-3SAT. Electronic Colloquium on Computational Complexity, Report No. 49.
[6] G. Cardona, F. Rosselló, and G. Valiente (2009). Comparison of tree-child phylogenetic networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6, 552–569.
[7] Z. Z. Chen and L. Wang (2013). An ultrafast tool for minimum reticulate networks. Journal of Computational Biology, 20, 38–41.
[8] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein (2009). Introduction to Algorithms. The MIT Press.
[9] W. Guttmann and M. Maucher (2006). Variations on an ordering theme with constraints. In Fourth IFIP International Conference on Theoretical Computer Science, Springer, pp. 77–90.
[10] D. Hall and D. Klein (2010). Finding cognate groups using phylogenies. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1030–1039.
[11] J. E. Hopcroft and J. D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.
[12] P. J. Humphries, S. Linz, and C. Semple (2013). Cherry
picking: a characterization of the temporal hybridization number for a set of phylogenies. Bulletin of Mathematical Biology, 75, 1879–1890.
[13] P. J. Humphries, S. Linz, and C. Semple (2013). On the complexity of computing the temporal hybridization number for two phylogenies. Discrete Applied Mathematics, 161, 871–880.
[14] D. H. Huson, R. Rupp, and C. Scornavacca (2010). Phylogenetic networks: concepts, algorithms and applications. Cambridge University Press.
[15] D. C. Kozen (1997). Automata and Computability. Undergraduate Texts in Computer Science. Springer.
[16] J. Mallet, N. Besansky, and M. W. Hahn (2016). How reticulated are species? BioEssays, 38, 140–149.
[17] B. M. E. Moret, L. Nakhleh, T. Warnow, C. R. Linder, A. Tholse, A. Padolina, J. Sun, and R. Timme (2004). Phylogenetic networks: modeling, reconstructibility, and accuracy. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1, 13–23.
[18] T. Piovesan and S. Kelk. A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10, 18–25.
[19] C. Semple (2007). Hybridization networks, pages 277–314. In Gascuel, O. and Steel, M., eds, Reconstructing Evolution: New Mathematical and Computational Advances, Oxford University Press.
[20] S. M. Soucy, J. Huang, and J. P. Gogarten (2015). Horizontal gene transfer: building the web of life. Nature Reviews Genetics, 16, 472–482.
[21] O. Westesson, G. Lunter, B. Paten, and Ian Holmes (2012). Accurate reconstruction of insertion-deletion histories by statistical phylogenetics. PLoS One, 7(4): e34572.
[22] Y. Wu and J. Wang (2010). Fast computation of the exact hybridization number of two phylogenetic trees. In International Symposium on Bioinformatics Research and Applications, Springer, pp. 203–214.