The Complexity of the uspr Distance

Similar documents
Walks in Phylogenetic Treespace

A 3-APPROXIMATION ALGORITHM FOR THE SUBTREE DISTANCE BETWEEN PHYLOGENIES. 1. Introduction

Exploring Treespace. Katherine St. John. Lehman College & the Graduate Center. City University of New York. 20 June 2011

SPR Distance Computation for Unrooted Trees

Solving the Maximum Agreement Subtree and Maximum Comp. Tree problems on bounded degree trees. Sylvain Guillemot, François Nicolas.

A 3-approximation algorithm for the subtree distance between phylogenies

A CLUSTER REDUCTION FOR COMPUTING THE SUBTREE DISTANCE BETWEEN PHYLOGENIES

NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS

Subtree Transfer Operations and Their Induced Metrics on Evolutionary Trees

Department of Mathematics and Statistics University of Canterbury Private Bag 4800 Christchurch, New Zealand

Molecular Evolution & Phylogenetics

Tree Space: Algorithms & Applications Part I. Megan Owen University of Waterloo

Analysis of Uninformed Search Methods

On the Subnet Prune and Regraft distance

Polynomial kernels for constant-factor approximable problems

Finding the best tree by heuristic search

Notes 3 : Maximum Parsimony

ALGORITHMIC STRATEGIES FOR ESTIMATING THE AMOUNT OF RETICULATION FROM A COLLECTION OF GENE TREES

Phylogeny. November 7, 2017

Charles Semple, Philip Daniel, Wim Hordijk, Roderic D M Page, and Mike Steel

Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems

A Cubic-Vertex Kernel for Flip Consensus Tree

(Stevens 1991) 1. morphological characters should be assumed to be quantitative unless demonstrated otherwise

/ / MET Day 000 NC1^ INRTL MNVR I E E PRE SLEEP K PRE SLEEP R E

FPT is Characterized by Useful Obstruction Sets

arxiv: v1 [q-bio.pe] 16 Aug 2007

Physical Mapping. Restriction Mapping. Lecture 12. A physical map of a DNA tells the location of precisely defined sequences along the molecule.

Parameterized Algorithms and Kernels for 3-Hitting Set with Parity Constraints

Parsimony via Consensus

Supertree Algorithms for Ancestral Divergence Dates and Nested Taxa

Cutting Plane Methods II

Reconstructing Trees from Subtree Weights

Exact Algorithms for Dominating Induced Matching Based on Graph Partition

Parameterized Domination in Circle Graphs

Modern Phylogenetics. An Introduction to Phylogenetics. Phylogenetics and Systematics. Phylogenetic Tree of Whales

Chapter 3: Phylogenetics


CS 581 Paper Presentation

arxiv: v1 [cs.cc] 9 Oct 2014

Lecture 2 September 4, 2014

Inference in Graphical Models Variable Elimination and Message Passing Algorithm

arxiv: v2 [q-bio.pe] 4 Feb 2016

Lecture 6 January 15, 2014

Join Ordering. 3. Join Ordering

Let S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely

Variable Elimination: Basic Ideas

Efficiency of Markov Chain Monte Carlo Tree Proposals in Bayesian Phylogenetics

arxiv: v1 [q-bio.pe] 1 Jun 2014

Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies

Classification of Join Ordering Problems

A Tight Upper Bound on the Second-Order Coding Rate of Parallel Gaussian Channels with Feedback

Phylogenetic Networks, Trees, and Clusters

Tree-width and algorithms

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT

ON THE UNIQUENESS OF BALANCED MINIMUM EVOLUTION

The Budgeted Unique Coverage Problem and Color-Coding (Extended Abstract)

Properties of normal phylogenetic networks

Maximum Agreement Subtrees

The essential numerical range and the Olsen problem

UNICYCLIC NETWORKS: COMPATIBILITY AND ENUMERATION

Intractable Problems Part Two

B x1. Voronoi Tessellation, Decision Surfaces

Search and Lookahead. Bernhard Nebel, Julien Hué, and Stefan Wölfl. June 4/6, 2012

Notes for Lecture 27

Phylogeny. Properties of Trees. Properties of Trees. Trees represent the order of branching only. Phylogeny: Taxon: a unit of classification

Efficient discovery of the top-k optimal dependency rules with Fisher s exact test of significance

Workshop III: Evolutionary Genomics

COMPSCI 276 Fall 2007

CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1. Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003

B x1. Voronoi Tessellation, Decision Surfaces

RECOVERING A PHYLOGENETIC TREE USING PAIRWISE CLOSURE OPERATIONS

A Phylogenetic Network Construction due to Constrained Recombination

Search Algorithms. Analysis of Algorithms. John Reif, Ph.D. Prepared by

Consistency algorithms

Computational Logic. Davide Martinenghi. Spring Free University of Bozen-Bolzano. Computational Logic Davide Martinenghi (1/30)

I can use properties of similar triangles to find segment lengths. I can apply proportionality and triangle angle bisector theorems.

A fast algorithm to generate necklaces with xed content

Tractable Approximations of Graph Isomorphism.

STABILITY ANALYSIS AND CONTROL DESIGN

Kernelization Lower Bounds: A Brief History

Phylogenetics. BIOL 7711 Computational Bioscience

TheDisk-Covering MethodforTree Reconstruction

Lecture 9 : Identifiability of Markov Models

Lecture 20: conp and Friends, Oracles in Complexity Theory

Computational Complexity and Intractability: An Introduction to the Theory of NP. Chapter 9

Clustering Perturbation Resilient

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

Exact Algorithms and Experiments for Hierarchical Tree Clustering

Notes for Lecture 21

Pitfalls of Heterogeneous Processes for Phylogenetic Reconstruction

C4.5 - pruning decision trees

The maximum agreement subtree problem

Cycles with consecutive odd lengths

NP-Complete problems

Consistency Index (CI)

Parametrizing Above Guaranteed Values: MaxSat. and MaxCut. Meena Mahajan and Venkatesh Raman. The Institute of Mathematical Sciences,

CS Lecture 28 P, NP, and NP-Completeness. Fall 2008

Component structure of the vacant set induced by a random walk on a random graph. Colin Cooper (KCL) Alan Frieze (CMU)

Chordal structure in computer algebra: Permanents

Algorithms for efficient phylogenetic tree construction

Transcription:

The omplexity of the uspr istance Maria Luisa onet Universtidad Politecnica de atalunya arcelona, Spain bonet@lsi.upc.edu Published in I/M T 2010 Joint work with Katherine St. John Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 1

When are two trees similar? Nicotiana ampanula Scaevola Stokesia imorphotheca Senecio erbera azania chinops elicia Tagetes hromolaena lennosperma oreopsis Vernonia acosmia ichorium chillea arthamnus laveria Piptocarpa Helianthus Tragopogon hrysanthemum upatorium Lactuca arnadesia asyphyllum Nicotiana ampanula Scaevola Stokesia imorphotheca Senecio azania erbera chinops elicia Tagetes hromolaena lennosperma oreopsis Vernonia acosmia ichorium chillea arthamnus laveria Piptocarpa Helianthus Tragopogon hrysanthemum upatorium Lactuca arnadesia asyphyllum Phylogenies for sunflowers. ob Jansen (UT ustin). Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 2

istances etween Trees Robinson-oulds distance: # of branches that occur in only one tree. alculate in O(n) time using ay s lgorithm (1985). xtends naturally to weighted trees. Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 3

TR istance Tree-isection-Reconnect (TR) Move: The TR distance between two trees is the minimal number of TR moves needed to transform the first tree into the second tree. Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 4

TR istance Tree-isection-Reconnect (TR): TR is NP-hard. (llen & Steel 01) Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 5

SPR istance Subtree-prune-regraft (SPR): The SPR distance between two trees is the minimal number of SPR moves needed to transform the first tree into the second tree. d T R (T 1, T 2 ) d SP R (T 1, T 2 ) Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 6

SPR istance Subtree-prune-regraft (SPR): SPR for rooted trees is NP-hard. (ordewich & Semple 05) pproximation algorithm for SPR on rooted trees (onet, St. John, menta, & Mahindru 05) (orderwich, Mcartin, Semple). Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 7

Reduction Rules: Subtree Rule a a pplying Subtree Rule preserves TR distance (llen & Steel 01), SPR distance (llen &Steel 01, orderwich & Semple 04) and Hybrid number (orderwich & Semple 07). Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 8

Reduction Rules: hain Rule pplying the chain Rule preserves TR distance (llen & Steel 01), rspr distance (orderwich & Semple 04) and Hybrid number (orderwich & Semple 07). Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 9

$100 Problem Mike Steel posed the following questions: oes shrinking common chains preserve SPR distance? alculating SPR distance is fixed parameter tractable? alculating TR (llen & Steel 01), rspr (orderwich & Semple 04), or Hybrid Number (orderwich & Semple 07) is fixed parameter tractable. Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 10

Lower ounds for Subchain Reduction Hickey et al. showed that: alculating SPR distance is NP-hard. d usp R (T1 n, T2 n ) 2 d usp R (T1 3, T2 3 ). We improve this to: d usp R (T1 n, T2 n ) 1 d usp R (T1 3, T2 3 ) Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 11

SPR distance is PT We show that uspr is fixed parameter tractable with respect to parameter k: That is, for distance k, it can be decided in p(n)f(k) time that two trees are distance k, where p is a polynomial in n, the number of leaves in the tree, and f(k) does not depend on n. Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 12

Sketch: TR distance is PT Input: Pair of trees (T 1, T 2 ) and distance parameter k. Obtain (T 1, T 2) applying subtree and clain reduction. d T R (T 1, T 2 ) = d T R (T 1, T 2) unknown for SPR T 1 c.d T R (T 1, T 2 ) (lemma in llen-steel 2001) if T 1 > c.k then d T R (T 1, T 2 ) > k and return no. else check all posible k TR-moves from T 1 to T 2 and return yes or no. This can be done in time O(( T 1 2 ) k ) = O((d k) 2k ) Total time is O(k 2k p(n)). Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 13

New Reduction Rule: l-hain Rule... 1 2 3 l 1 2 3 ck... 1 2 3 l 1 2 3 Lemma: Let T 1 and T 2 be X-trees with SPR distance k. pplying the c k-chain Rule preserves SPR distance. ck Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 14

Size of Reduced Trees Lemma: Let T 1 and T 2 be X-trees with SPR distance k. Let T 1 and T 2 be the trees obtained from T 1 and T 2 applying subtree and c k-chain Rule. Then, T 1 dk 2. Proof: if d SP R (T 1, T 2 ) k then d T R (T 1, T 2 ) k. Show that if d T R (T 1, T 2 ) k, then T 1 dk 2 as in llen-steel 2001 but with c.k-chains instead of 3-chains. Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 15

Sketch: SPR distance is PT Input: Pair of trees (T 1, T 2 ) and distance parameter k. Obtain (T 1, T 2) applying subtree and ck-clain reduction. If d SP R (T 1, T 2 ) k, T 1 d k 2 and d SP R (T 1, T 2 ) = d SP R (T 1, T 2) if T 1 > d k 2 then d SP R (T 1, T 2 ) > k and return no. else check all posible k SPR-moves from T 1 to T 2 and return yes or no. This can be done in time O(( T 1 2 ) k ) = O((d k 2 ) 2k ) Total time is O(k 4k p(n)). Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 16

Idea for Improving Lower ound d usp R (T n 1, T n 2 ) 1 d usp R (T 3 1, T 3 2 ) Idea inspired from Hickey et al: treat common chains as subtrees to get 2 bound. o carefully by cases on a minimal set of moves to improve the bound to 1. Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 17

Idea for Improving Lower ound Let T 1 and T 2 X-trees with common chain 1 l and SPR dist k. We show d SP R (T 3 1, T 3 2 ) k 1 by cases. ase sketch: the minimal number of moves breaks 2 or 3 pendant edges from the 3-chain. 1 2 3 T 1 l+2... l+2 m 0 T 1 1 2 3 4 m, m,..., m 1 2 k+1 k +1 T 2 3 4 1 2 m... 1 2 3 T l+2 2 l+2 l+2 l+1 l+2 l+1 Maria Luisa onet, Phylogenetics follow-up Workshop, INI, June 2011 18