Recent Advances in Phylogeny Reconstruction

Recent Advances in Phylogeny Reconstruction from Gene-Order Data
Bernard M.E. Moret
Department of Computer Science, University of New Mexico, Albuquerque, NM 87131
Department Colloquium, p.1/41

Collaborators and Support
Collaborators:
- University of Texas, Austin: Tandy Warnow (Computer Science); David Hillis, Robert Jansen, Randy Linder (Biology)
- University of New Mexico: David Bader (Electrical & Comp. Eng.)
Funding (National Science Foundation):
- at UNM: 6 grants for $2 million over 5 years
- with UT Austin: 10 grants for $8 million

Overview
- Phylogenies
- Gene-order data: mitochondrion and chloroplast genomes
- Inversion and other genomic distance measures
- Estimating the true evolutionary distance
- Fast convergence for reconstruction methods
- GRAPPA news

Phylogenies
A phylogeny is a reconstruction of the evolutionary history of a collection of organisms; it usually takes the form of a tree. Modern organisms are placed at the leaves and ancestral organisms occupy internal nodes. The edges of the tree denote evolutionary relationships.

12 Species of Campanulaceae
[Figure: phylogenetic tree with edge lengths on Wahlenbergia, Merciera, Trachelium, Symphyandra, Campanula, Adenophora, Legousia, Asyneuma, Triodanus, Codonopsis, Cyananthus, and Platycodon, together with Tobacco.]

Herpes Viruses that Affect Humans
[Figure: phylogenetic tree with leaves HVS, EHV2, KHSV, EBV, HSV1, HSV2, PRV, EHV1, HHV6, VZV, HHV7, and HCMV.]

A Large Phylogeny: 500 Green Plants

Reconstructing Phylogenies
Reconstructing phylogenies is a major component of modern research programs in many areas of biology and medicine:
- pharmaceutical research for drug discovery (most famous is the herbicide Roundup(TM))
- understanding rapidly mutating viruses (HIV)
- designing enhanced organisms (rice, wheat)
- explaining and predicting gene expression
- explaining and predicting ligands
- most centrally, understanding genomic evolution

Reconstructing Phylogenies (cont'd)
- Requires a model of tree evolution (e.g., random or birth-death).
- Requires a model of DNA/RNA/codon/gene-order/etc. evolution (e.g., Markov models with weight matrices such as Jukes-Cantor and Kimura).
- Requires an optimization criterion that relates to the previous two models (e.g., likelihood or parsimony).
- Requires data with sufficient signal (to recover defining information).

Computational Phylogenetics
- Is extremely computation-intensive.
- Is viewed very differently by biologists (one dataset only, accuracy first) and by computer scientists (efficiency first).
- Sequence data (RNA, DNA, and amino acid) have been used for over 20 years and are fairly well understood, but the methods do not scale up.
- Genomic data (gene order and content of whole genomes) provide new information, but are much harder to analyze than sequence data.

Gene-Order Data
Certain genomes evolve mostly through rearrangement of the order of genes, with occasional gene duplication or gene loss.
- A chloroplast is a semi-independent organism that lives within plant cells and allows them to photosynthesize. Chloroplasts have one circular chromosome with 120 genes.
- A mitochondrion is a semi-independent organism that lives within animal and some plant cells and supplies them with energy. Mitochondria have one circular chromosome with 40 genes in animals, more in plants.

Mitochondria
[Figure: mitochondrial genome maps of Homo sapiens, Felis catus, Lumbricus terrestris, and Saccharomyces cerevisiae.]

Chloroplasts
[Figure: chloroplast genome maps of Cyanidium caldarium and Zea mays.]

Phylogenies from Gene-Order Data
Optimization target: reconstruct the phylogeny with the least total number of genomic changes. This is an application of Occam's razor; biologists call it the principle of parsimony.

True Evolutionary Distances
- True Evolutionary Distance (T.E.D.): the actual number of events along an edge of the tree.
- Edit Distance: the minimum number of events from one end of a tree edge to the other.
- We obtain better topological accuracy with T.E.D.s than with Edit Distances.
- T.E.D. can only be estimated.

True Evolutionary Distance
[Figure: an unrooted tree on leaves A, B, C, D with edge lengths 2, 1, 4, 6, 3, and the leaf-to-leaf distance matrix obtained from it in polynomial time.]

        A    B    C    D
   A    0    3   12    9
   B         0   11    8
   C              0    9
   D                   0

The tree and, a fortiori, its edge lengths are not known.
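The matrix on this slide can be recomputed by summing edge lengths along leaf-to-leaf paths. The sketch below is illustrative only: it assumes the topology ((A,B),(C,D)) with pendant edges A = 2, B = 1, C = 6, D = 3 and an internal edge of length 4, which is the assignment of the five edge lengths that reproduces the six distances shown. The internal node names `x` and `y` are hypothetical.

```python
from itertools import combinations

# Assumed tree; "x" and "y" are hypothetical names for the two internal nodes.
EDGES = {("A", "x"): 2, ("B", "x"): 1, ("x", "y"): 4, ("C", "y"): 6, ("D", "y"): 3}

def adjacency(edges):
    """Build an undirected weighted adjacency map from an edge dictionary."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def path_length(adj, src, dst):
    """Sum of edge weights on the unique src-to-dst path of a tree (iterative DFS)."""
    stack = [(src, None, 0)]
    while stack:
        node, parent, total = stack.pop()
        if node == dst:
            return total
        for nxt, w in adj[node].items():
            if nxt != parent:
                stack.append((nxt, node, total + w))
    raise ValueError("no path between the two nodes")

def leaf_distances(edges, leaves):
    """Additive distance for every unordered pair of leaves."""
    adj = adjacency(edges)
    return {(a, b): path_length(adj, a, b) for a, b in combinations(leaves, 2)}

D = leaf_distances(EDGES, "ABCD")
```

Going from tree to matrix is the easy direction; the reconstruction problem runs the other way, from an estimated matrix to an unknown tree.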

Rearrangement Events
[Figure: a circular genome on genes 1 to 8 and the genomes obtained from it by an inversion, a transposition, and an inverted transposition.]

Generalized Nadeau-Taylor Model
- Events are inversions, transpositions, and inverted transpositions.
- All events of the same type are equiprobable.
- Probabilities are assigned to the different event types:
  - Transposition: $\alpha$
  - Inverted transposition: $\beta$
  - Inversion: $1 - \alpha - \beta$
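As a rough illustration of the model, the following sketch applies the three event types to a signed gene order stored as a Python list. It makes simplifying assumptions that are not in the slides: the genome is treated as linear rather than circular, and an event is chosen by drawing cut points uniformly at random. All function names are hypothetical.

```python
import random

def inversion(g, i, j):
    """Reverse the segment g[i:j] and flip the sign of every gene in it."""
    return g[:i] + [-x for x in reversed(g[i:j])] + g[j:]

def transposition(g, i, j, k):
    """Move the segment g[i:j] so that it follows the segment g[j:k] (i < j < k)."""
    return g[:i] + g[j:k] + g[i:j] + g[k:]

def inverted_transposition(g, i, j, k):
    """Like transposition, but the moved segment is also reversed and sign-flipped."""
    return g[:i] + g[j:k] + [-x for x in reversed(g[i:j])] + g[k:]

def random_event(g, alpha, beta, rng):
    """Apply one event: transposition with probability alpha, inverted
    transposition with probability beta, inversion with probability 1 - alpha - beta."""
    r = rng.random()
    n = len(g)
    if r < alpha + beta:
        i, j, k = sorted(rng.sample(range(n + 1), 3))  # three distinct cut points
        op = transposition if r < alpha else inverted_transposition
        return op(g, i, j, k)
    i, j = sorted(rng.sample(range(n + 1), 2))         # two distinct cut points
    return inversion(g, i, j)

genome = list(range(1, 9))
evolved = genome
rng = random.Random(0)
for _ in range(5):
    evolved = random_event(evolved, alpha=1/3, beta=1/3, rng=rng)
```

Setting `alpha = beta = 0` gives the inversion-only scenario, and `alpha = beta = 1/3` the scenario in which all three classes are equiprobable.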

Breakpoint Distance
$D_{BP}(G, G')$ = number of breakpoints in $G$ with respect to $G'$.
Example: $G = (1\ 2\ 3\ 4\ 5\ 6\ 7\ 8)$, $G' = (1\ 2\ 5\ 4\ 3\ 6\ 7\ 8)$.
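A minimal sketch of the breakpoint count for signed gene orders follows. It assumes that an adjacency (a, b) counts as preserved when the other genome contains either a followed by b or -b followed by -a, and it assumes the example's second genome originally carried signs, i.e. (1 2 -5 -4 -3 6 7 8), which is what a single inversion of genes 3 to 5 would produce. The function name and the `circular` flag are hypothetical.

```python
def breakpoint_distance(g, h, circular=True):
    """Number of adjacencies of signed genome g that do not occur in h."""
    def adjacencies(p):
        pairs = list(zip(p, p[1:]))
        if circular:
            pairs.append((p[-1], p[0]))  # close the circle
        return pairs

    present = set()
    for a, b in adjacencies(h):
        present.add((a, b))
        present.add((-b, -a))  # the same adjacency read on the opposite strand
    return sum(1 for pair in adjacencies(g) if pair not in present)

G = [1, 2, 3, 4, 5, 6, 7, 8]
G_prime = [1, 2, -5, -4, -3, 6, 7, 8]
d = breakpoint_distance(G, G_prime)
```

Under these assumptions the two breakpoints are the adjacencies (2, 3) and (5, 6), the two ends of the inverted segment.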

Genomic Distances
- BP: breakpoint distance.
- INV [Moret, Bader, Yan WADS 2001]: minimum number of inversions required to transform one genome into another.
- IEBP [Wang, Warnow STOC 01]: approximate the expected breakpoint distance with provable error.
- Exact IEBP [Wang WABI 01]: invert the expected breakpoint distance.
- EDE [Moret, Wang, Warnow, Wyman ISMB 01]: estimate the expected inversion distance using simulation data.

Exact IEBP: Basic Idea
- Let $G_0$ be the starting genome and $G_k$ be the genome after $k$ events.
- For every $k > 0$, compute $E[D_{BP}(G_k, G_0)]$, the expected number of breakpoints after $k$ events.
- Return the $k$ that minimizes $\left| E[D_{BP}(G_k, G_0)] - D_{BP}(G, G') \right|$.
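The last step, inverting the expectation curve, can be sketched independently of how the expectation itself is computed. In the sketch below the expected breakpoint count is passed in as a function. `toy_expected_bp` is a hypothetical stand-in, not the Exact-IEBP expectation: it assumes inversions only, lets each event break a given adjacency with probability 2/n, and ignores adjacencies that are re-formed later. The search also includes k = 0 so that identical genomes map to zero events, which is a choice made here.

```python
def estimate_events(observed_bp, expected_bp, k_max):
    """Return the k in 0..k_max whose expected breakpoint count is closest
    to the observed breakpoint distance (ties go to the smaller k)."""
    return min(range(k_max + 1), key=lambda k: abs(expected_bp(k) - observed_bp))

def toy_expected_bp(k, n=120):
    """Hypothetical expectation curve for illustration only: each of the n
    adjacencies survives one event with probability 1 - 2/n."""
    return n * (1 - (1 - 2 / n) ** k)

k_hat = estimate_events(observed_bp=2, expected_bp=toy_expected_bp, k_max=300)
```

Because the expected breakpoint count flattens out as k grows, a small change in the observed distance near saturation moves the estimate by many events; this is where an estimator of this kind is least reliable.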

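The Counting Lemma

$$\iota_n(u,v) = \begin{cases} \min\{|u|-1,\ |v|-1,\ n+1-|u|,\ n+1-|v|\} & \text{if } uv < 0 \\ 0 & \text{if } u \neq v,\ uv > 0 \\ \binom{|u|-1}{2} + \binom{n+1-|u|}{2} & \text{if } u = v \end{cases}$$

$$\tau_n(u,v) = \begin{cases} 0 & \text{if } uv < 0 \\ (\min\{|u|,|v|\}-1)\,(n+1-\max\{|u|,|v|\}) & \text{if } u \neq v,\ uv > 0 \\ \binom{n+1-|u|}{3} + \binom{|u|-1}{3} & \text{if } u = v \end{cases}$$

$$\nu_n(u,v) = \begin{cases} (n-2)\,\iota_n(u,v) & \text{if } uv < 0 \\ \tau_n(u,v) & \text{if } u \neq v,\ uv > 0 \\ 3\,\tau_n(u,v) & \text{if } u = v \end{cases}$$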

Goodness of Fit of Distance Estimators (inversion only on 120 genes)
[Figure: four scatter plots of the actual number of events (0 to 300) against the measured distance (0 to 300), for the inversion distance, the breakpoint distance, the Exact-IEBP distance, and an ideal estimator.]

Goodness of Fit of Distance Estimators (inversion only on 120 genes, continued)
[Figure: four scatter plots of the actual number of events (0 to 300) against the measured distance (0 to 300), for the IEBP distance, the EDE distance, the Exact-IEBP distance, and an ideal estimator.]

Absolute Error of Distance Estimators (inversion only)
[Figure: absolute difference (0 to 300) against the actual number of events (0 to 300) for BP, INV, IEBP, EDE, and Exact-IEBP.]

Absolute Error of Distance Estimators (transpositions only)
[Figure: absolute difference (0 to 300) against the actual number of events (0 to 300) for BP, INV, IEBP, EDE, and Exact-IEBP.]

Absolute Error of Distance Estimators (all three classes equiprobable)
[Figure: absolute difference (0 to 300) against the actual number of events (0 to 300) for BP, INV, IEBP, EDE, and Exact-IEBP.]

Accuracy of Neighbor Joining (120 genes, inversion only, 10/20/40/80/160 genomes)
[Figure: false negative rate (%) against the normalized maximum pairwise inversion distance (0 to 1) for NJ(BP), NJ(INV), NJ(IEBP), NJ(EDE), and NJ(Exact-IEBP).]

Accuracy of Neighbor Joining (120 genes, equiprobable events, 10/20/40/80/160 genomes)
[Figure: false negative rate (%) against the normalized maximum pairwise inversion distance (0 to 1) for NJ(BP), NJ(INV), NJ(IEBP), NJ(EDE), and NJ(Exact-IEBP).]

Robustness of Exact-IEBP (120 genes, inversion only, 10/20/40/80/160 genomes)
[Figure: false negative rate (%) against the normalized maximum pairwise inversion distance (0 to 1) for NJ(Exact-IEBP(0,0)), NJ(Exact-IEBP(1,0)), and NJ(Exact-IEBP(1/3,1/3)).]

Robustness of Exact-IEBP (120 genes, equiprobable events, 10/20/40/80/160 genomes)
[Figure: false negative rate (%) against the normalized maximum pairwise inversion distance (0 to 1) for NJ(Exact-IEBP(0,0)), NJ(Exact-IEBP(1,0)), and NJ(Exact-IEBP(1/3,1/3)).]

Convergence Rate
- A method is statistically consistent for a given model if, given long enough data sequences, it recovers the true tree with high probability.
- Problem: "long enough" sequences may not exist in nature.
- Solution: a method is fast-converging for a given model if, given sequences of polynomial length, it recovers the true tree with high probability.
- Problem: the model conditions may not hold.
- Solution: a method is absolute fast-converging if, given sequences of polynomial length, it recovers the true tree with high probability.

Known Fast-Converging Methods
- The short-quartet methods [Warnow et al.]: absolute fast-converging
- The disk-covering methods (DCM) [Warnow et al.]: absolute fast-converging
- The harmonic greedy triplet method [Kao et al.]
- The method of Cryan, Goldberg, and Goldberg
- DCM-boosted neighbor-joining [Warnow et al.]

New Results [Warnow, Moret, St. John SODA 01]
- New absolute fast-converging method: weighted witness-antiwitness method (WIGWAM)
- Decision procedure to turn fast-converging methods into absolute fast-converging methods: short-quartet support (SQS)
- Boosting method (DCM plus SQS) to turn many methods with exponential convergence (e.g., neighbor-joining) into absolute fast-converging ones
- Generalizations to families of boosting methods with same properties, but experimental behavior

What is a Quartet?
A quartet is an unrooted binary tree on four taxa, the smallest tree that induces a nontrivial bipartition.
[Figure: the three possible quartets on taxa a, b, c, d: {ab|cd}, {ac|bd}, and {ad|bc}.]
A quartet {ab|cd} agrees with a tree T if the subtree induced in T by the four taxa is the quartet itself.
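One way to test agreement, sketched here under an assumed representation, is to store T as the list of leaf bipartitions defined by its internal edges: the quartet {ab|cd} agrees with T exactly when some edge of T has a and b on one side and c and d on the other. The function name and the example tree are hypothetical.

```python
def agrees(quartet, splits):
    """quartet: ((a, b), (c, d)). splits: iterable of (side1, side2) leaf sets,
    one pair per internal edge of the tree. Returns True if some edge
    separates {a, b} from {c, d}."""
    (a, b), (c, d) = quartet
    for left, right in splits:
        for s, t in ((left, right), (right, left)):
            if a in s and b in s and c in t and d in t:
                return True
    return False

# Hypothetical five-taxon tree ((a,b),c,(d,e)): two internal edges, two splits.
TREE_SPLITS = [
    ({"a", "b"}, {"c", "d", "e"}),
    ({"a", "b", "c"}, {"d", "e"}),
]
```

For instance, `agrees((("a", "b"), ("c", "d")), TREE_SPLITS)` holds because the first split separates a and b from c and d, whereas no split of this tree separates a and c from b and d.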

Fast Convergence: Decision Problem
TRUE TREE SELECTION PROBLEM
- Input: a set $S$ of sequences over {A, C, T, G} generated on an unknown tree $(T, M)$, and a collection $\mathcal{T} = \{T_1, T_2, \ldots, T_p\}$ of phylogenies on $S$.
- Output: the true tree $T$, if $T$ is in $\mathcal{T}$.

Quartet Support
- Let $T$ be a fixed tree leaf-labelled by the set $S$.
- Let $Q$ be a fixed set of quartets on $S$.
- Let $D$ be the distance matrix on $S$.
The support of $T$ with respect to $Q$ is
$$\max\{\, l : (q \in Q \text{ and } \mathrm{diam}_D(q) \le l) \implies q \in Q(T) \,\}$$

Short Quartet Support
PROCEDURE SQS($\mathcal{T}$, $S$)
- For each set of four taxa from $S$, compute the neighbor-joining quartet $q$; let $Q$ be the set of all such quartets.
- Return the $T_i$ such that $s(T_i, Q)$ is maximum; if more than one such tree exists, return the one with the smallest index $i$.
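The procedure can be sketched as follows, under several assumptions that are not in the slides. Each candidate tree is represented by its own additive distance matrix, so that its induced quartets can be read off with the same rule used for the data. Quartets are chosen with the four-point rule (the pairing with the smallest distance sum), which should coincide with the first join neighbor-joining makes on four taxa. The threshold l ranges only over observed quartet diameters, and the support is 0.0 when even the shortest quartet disagrees. All names are hypothetical.

```python
from itertools import combinations

def matrix(pairs):
    """Build a symmetric distance lookup from entries such as {"AB": 3}."""
    return {frozenset(k): v for k, v in pairs.items()}

def dist(D, x, y):
    return D[frozenset((x, y))]

def quartet(D, four):
    """Four-point rule: the pairing of the four taxa with the smallest sum."""
    a, b, c, d = four
    splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    left, right = min(splits, key=lambda s: dist(D, *s[0]) + dist(D, *s[1]))
    return frozenset((frozenset(left), frozenset(right)))

def diameter(D, four):
    """Largest pairwise distance among the four taxa."""
    return max(dist(D, x, y) for x, y in combinations(four, 2))

def support(tree_D, D, taxa):
    """Largest observed diameter l such that every data quartet with
    diameter <= l is also the quartet induced by the candidate tree."""
    fours = list(combinations(taxa, 4))
    diams = {f: diameter(D, f) for f in fours}
    bad = [diams[f] for f in fours if quartet(D, f) != quartet(tree_D, f)]
    if not bad:
        return max(diams.values(), default=0.0)
    return max((x for x in diams.values() if x < min(bad)), default=0.0)

def sqs(candidates, D, taxa):
    """Index of the candidate with maximum support (smallest index on ties)."""
    scores = [support(t, D, taxa) for t in candidates]
    return scores.index(max(scores))

# Toy data: T1 is the tree AB|CD, T2 is AC|BD, D is a noisy version of T1.
T1 = matrix({"AB": 3, "AC": 12, "AD": 9, "BC": 11, "BD": 8, "CD": 9})
T2 = matrix({"AC": 3, "AB": 12, "AD": 9, "BC": 11, "CD": 8, "BD": 9})
D = matrix({"AB": 3.5, "AC": 11, "AD": 9, "BC": 10.5, "BD": 8, "CD": 9.5})
chosen = sqs([T2, T1], D, "ABCD")
```

With only four taxa there is a single quartet, so the toy run merely checks which candidate matches it; on larger inputs the support rewards trees that agree with the short, and therefore more reliably estimated, quartets.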

SQS Theorem
For all $\varepsilon > 0$, there is a polynomial $p$ such that, for all $(T, M)$ in the model, on a set $S$ of $n$ sequences generated at random on $T$ with length at least $p(n)$, we have
$$\Pr[\mathrm{SQS}(\mathcal{T}, S) = T] > 1 - \varepsilon$$
whenever $T$ is in $\mathcal{T}$.

GRAPPA News: More Speed!
- Current release (1.03) runs from 2,000 to 10,000 times faster than the original tool, while also giving more capabilities.
- Research version (1.1) runs from 10,000 to 500,000 times faster than the original tool, thanks to much better bounding.
- The 13-genome Campanulaceae now takes a few hours on a laptop instead of a few centuries on a large workstation.
- Speedup on Los Lobos is over 200,000,000!

Other Recent Results
- New sequence encodings for gene orders to enable classical parsimony searches.
- Combinations of fast-converging boosters with new encodings (i.e., use a new encoding and run a DCM+SQS booster on a classical parsimony optimizer): best accuracy to date.
- Combinations of fast-converging boosters with new encodings and fast heuristics (e.g., neighbor-joining): best speed/accuracy tradeoff to date.
- New results on computing inversion distances, inversion medians, etc.