Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations


Jacob Herlin, Anna Nelson, and Dr. Marion Scheepers
Department of Mathematical Sciences, University of Northern Colorado; Department of Mathematics, Boise State University
July 29, 2011

What is a phylogenetic relationship?
Phylogenetics is the study of evolutionary relationships between groups of organisms. For this research, we focused on several species in the genus Drosophila. We want to explore whether a distance between two species, measured in ciliate operations, can be related to their known phylogeny.
Orthologs are genes in different species that descend from a common ancestral gene.
Relationships are illustrated using phylogenetic trees.

Phylogenetic trees
[Figure: phylogenetic tree. Image courtesy of DroSpeGe: Drosophila Species Genomes, http://insects.eugenes.org/drospege]

Using reversals to determine phylogeny
Genetic operations that scramble DNA result from breaking and rejoining the chromosome.
In 1938, Dobzhansky and Sturtevant discovered that the gene rearrangements that occurred in D. pseudoobscura were reversals. Since then, reversals have been used as the main genetic operation.
Using Drosophila melanogaster as the canonical reference species, one can use the number of reversals as a measure of evolutionary distance.
From Hannenhalli and Pevzner, it is known that a shortest reversal path between signed permutations can be found in polynomial time.
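A single reversal on a signed permutation can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `reverse_segment` and the 0-based, inclusive indices are assumptions made for the example.

```python
def reverse_segment(perm, start, end):
    """Reverse perm[start..end] (0-based, inclusive) and flip every sign.

    A reversal turns a block of genes around, so their order is
    reversed and each gene ends up in the opposite orientation.
    """
    block = [-gene for gene in reversed(perm[start:end + 1])]
    return perm[:start] + block + perm[end + 1:]


# One reversal applied to an example signed permutation.
print(reverse_segment([1, 4, -3, -2, 6, -5], 1, 3))  # [1, 2, 3, -4, 6, -5]
```

Counting how many such steps are needed to reach the identity permutation gives a reversal-based distance; the sketch performs one step only and makes no claim about the shortest path.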

Micronucleus and macronucleus of ciliates
Ciliates are multinuclear protozoans found in aqueous environments.
The micronucleus contains very long strands of DNA that are encrypted versions of macronuclear DNA.
The macronucleus is larger than the micronucleus and contains short strands of DNA that are present in many copies.
The micronuclear DNA is decrypted to form macronuclear DNA using three ciliate operations (hi, ld, dlad).

Macronuclear vs. Micronuclear DNA
Micronuclear DNA has three elements:
1. Macronuclear destined sequences (MDSs)
2. Internal eliminated sequences (IESs)
3. Pointers, which occur on the flanks of the MDSs

ld operation
[Figures: Steps 1 to 4 of the ld operation]

hi operation
[Figures: Steps 1, 2, 3a, 3b and 4 of the hi operation, shown with the original arrangement]

dlad operation
[Figures: Steps 1 to 4 of the dlad operation, shown with the original arrangement]

Data Collection
We collected data from Flybase.org, a database of Drosophila genes and genomes. The data came as a precomputed text file listing all the genes in our reference species (D. melanogaster) and their orthologs in the genomes of various other species. We used the relative location and orientation of the genes to produce signed permutations.
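The step from ortholog records to a signed permutation might look like the sketch below. The record layout (gene identifier, start coordinate, relative strand), the gene names and the function name `signed_permutation` are all hypothetical; the actual layout of the FlyBase file is not described in the slides.

```python
def signed_permutation(ref_order, other_genes):
    """Build a signed permutation for one species.

    ref_order   -- gene ids in their order along the reference genome
    other_genes -- (gene_id, start, strand) records for the other species,
                   where strand is +1 if the ortholog has the same
                   orientation as in the reference and -1 otherwise
    """
    shared = {gene for gene, _, _ in other_genes}
    # Number only the genes present in both species, in reference order.
    rank = {}
    for gene in ref_order:
        if gene in shared:
            rank[gene] = len(rank) + 1
    # Read the other genome left to right and record rank with orientation.
    ordered = sorted(
        (rec for rec in other_genes if rec[0] in rank),
        key=lambda rec: rec[1],
    )
    return [rank[gene] * strand for gene, _, strand in ordered]


# Hypothetical toy data: geneB has no ortholog and is skipped.
reference = ["geneA", "geneB", "geneC", "geneD"]
other = [("geneC", 100, -1), ("geneA", 250, 1), ("geneD", 400, 1)]
print(signed_permutation(reference, other))  # [-2, 1, 3]
```

Numbering only the shared genes keeps the result a permutation of 1..n even when some reference genes have no ortholog in the other species.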


Data Analysis
We used the following species to create permutations based on their genomes.
1. D. simulans
2. D. sechellia
3. D. yakuba
4. D. erecta
5. D. virilis
6. D. grimshawi
7. D. mojavensis
8. D. melanogaster

Pointer Lists
We created an algorithm to find a path from a signed permutation back to the canonical one in terms of ciliate operations.
We start by mapping a signed permutation, where each element represents a section of genome:
[1, 4, -3, -2, 6, -5]
onto a list of pairs of pointers:
[(1, 2), (4, 5), (4, 3), (3, 2), (6, 7), (6, 5)]
where each pair (a, b) represents a section of genome spanning from a pointer a to a pointer b. In our algorithm, we give the pointers signs to represent the orientation of each section, instead of keeping them in pairs:
[1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
We call this representation a pointer list.
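The mapping from a signed permutation to a signed pointer list can be sketched as below, taking [1, 4, -3, -2, 6, -5] as the running example. The function name `to_pointer_list` is an assumption; the rule itself (element k spans pointers k and k+1, and a reversed element lists them in the opposite order with negative signs) is read off the example pairs.

```python
def to_pointer_list(perm):
    """Map a signed permutation to a signed pointer list.

    Element k spans from pointer k to pointer k + 1. A negative
    element is a reversed section, so its pointers appear in the
    opposite order and carry a minus sign.
    """
    pointers = []
    for element in perm:
        k = abs(element)
        if element > 0:
            pointers.extend([k, k + 1])
        else:
            pointers.extend([-(k + 1), -k])
    return pointers


print(to_pointer_list([1, 4, -3, -2, 6, -5]))
# [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
```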

Pointer Lists
We define a pointer list formally as a list $L = [x_1, x_2, \ldots, x_n]$ that satisfies the following six conditions:
1. $n$ is even.
2. There is a unique $i$ with $\mu = |x_i| = \min\{|x_j| : 1 \le j \le n\}$.
3. There is a unique $j$ with $\lambda = |x_j| = \max\{|x_k| : 1 \le k \le n\}$.
4. For each $i \in \{1, \ldots, n\}$ with $\mu < |x_i| < \lambda$, there is a unique $j \in \{1, \ldots, n\} \setminus \{i\}$ with $|x_i| = |x_j|$.
5. For each odd $i \in \{1, \ldots, n\}$, $|x_i| \ne |x_{i+1}|$ and $x_i x_{i+1} > 0$.
6. For each odd $i$, there is no odd $j$ such that $x_i < x_j < x_{i+1} < x_{j+1}$.
Example: [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
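The six conditions can be checked mechanically, as in the sketch below. Two readings are assumptions here: conditions 2 to 5 are taken to compare absolute values, and condition 6 is taken to forbid an odd j with x_i < x_j < x_{i+1} < x_{j+1} (compared as signed values). The function name `is_pointer_list` is also an assumption.

```python
def is_pointer_list(xs):
    """Check the six pointer-list conditions on a list of nonzero ints."""
    n = len(xs)
    if n == 0 or n % 2 != 0 or 0 in xs:            # condition 1
        return False
    mags = [abs(x) for x in xs]
    mu, lam = min(mags), max(mags)
    if mags.count(mu) != 1 or mags.count(lam) != 1:  # conditions 2 and 3
        return False
    for m in mags:                                   # condition 4
        if mu < m < lam and mags.count(m) != 2:
            return False
    # Pairs (x_i, x_{i+1}) for odd i in the 1-based numbering of the slides.
    pairs = [(xs[i], xs[i + 1]) for i in range(0, n, 2)]
    for a, b in pairs:                               # condition 5
        if abs(a) == abs(b) or a * b <= 0:
            return False
    for a, b in pairs:                               # condition 6 (assumed reading)
        for c, d in pairs:
            if a < c < b < d:
                return False
    return True


print(is_pointer_list([1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]))  # True
print(is_pointer_list([1, 2, -2, 3]))                               # False
```

The second list fails condition 5 because the pair (-2, 3) mixes orientations.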

ld
The ld operation is represented by the removal of a pair of adjacent occurrences of the same pointer. This is equivalent to joining two sections that are correctly adjacent.
[1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
→ [1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
Formally, ld is a function that maps a pointer list of length $n$ with $x_i = x_{i+1}$ to a pointer list of length $n - 2$ as follows:
$[x_1, x_2, \ldots, x_i, x_{i+1}, \ldots, x_n] \mapsto [x_1, x_2, \ldots, x_{i-1}, x_{i+2}, \ldots, x_n]$
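A minimal sketch of ld on a signed pointer list follows. The helper names `find_ld` and `apply_ld` and the 0-based indexing are assumptions made for the example.

```python
def find_ld(xs):
    """Return the 0-based index of the first adjacent equal pair, or None."""
    for i in range(len(xs) - 1):
        if xs[i] == xs[i + 1]:
            return i
    return None


def apply_ld(xs, i):
    """Remove the adjacent equal pointers at positions i and i + 1."""
    if xs[i] != xs[i + 1]:
        raise ValueError("ld needs two adjacent occurrences of the same pointer")
    return xs[:i] + xs[i + 2:]


pointers = [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
i = find_ld(pointers)
print(i)                      # 5
print(apply_ld(pointers, i))  # [1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
```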

hi
The hi operation is represented by moving together two pointers of opposite orientation with a reversal, setting up an ld move. Note that reversing a section changes the sign of each of its elements.
[1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
→ [1, 2, 4, 4, -5, -2, 6, 7, -6, -5]
Formally, hi is a function that maps a pointer list of length $n$ with $x_i = -x_j$ to a pointer list of the same length as follows:
$[x_1, x_2, \ldots, x_i, x_{i+1}, \ldots, x_j, x_{j+1}, \ldots, x_n] \mapsto [x_1, x_2, \ldots, x_i, -x_j, \ldots, -x_{i+1}, x_{j+1}, \ldots, x_n]$
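The hi move can be sketched as a reversal with sign change of the block that follows the first pointer and ends at its oppositely oriented partner. The name `apply_hi` and the 0-based indices are assumptions.

```python
def apply_hi(xs, i, j):
    """Apply hi to the pointers at 0-based positions i < j with xs[i] == -xs[j].

    The block xs[i+1 .. j] is reversed and every sign in it is flipped,
    which places a copy of xs[i] directly after position i.
    """
    if not (0 <= i < j < len(xs)) or xs[i] != -xs[j]:
        raise ValueError("hi needs two occurrences of a pointer with opposite signs")
    block = [-x for x in reversed(xs[i + 1:j + 1])]
    return xs[:i + 1] + block + xs[j + 1:]


print(apply_hi([1, 2, 4, 5, -4, -2, 6, 7, -6, -5], 2, 4))
# [1, 2, 4, 4, -5, -2, 6, 7, -6, -5]
```

After the move the two copies of pointer 4 are adjacent and equal, so an ld move can remove them.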

dlad
The dlad operation is represented by finding a 4-tuple of pointers $(x_i, x_j, x_k, x_l)$ where $x_i = x_j$, $x_k = x_l$ and $i < k < j < l$. The section from $x_j$ to $x_l$, including both pointers, is then swapped with the section strictly between the pointers $x_i$ and $x_k$, setting up two ld moves.
[2, 8, 10, 11, 9, 2, 1, 10, 8, 9, 12, 11]
→ [2, 8, 10, 10, 8, 9, 9, 2, 1, 11, 12, 11]
Formally, this maps a list of length $n$ to a list of the same length:
$[x_1, \ldots, x_i, \ldots, x_k, \ldots, x_j, \ldots, x_l, \ldots, x_n] \mapsto [x_1, \ldots, x_i, x_j, \ldots, x_l, x_k, \ldots, x_{j-1}, x_{i+1}, \ldots, x_{k-1}, x_{l+1}, \ldots, x_n]$
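The formal dlad mapping translates directly into list slicing. In the sketch below, the name `apply_dlad` and the 0-based indices are assumptions; the example list is the one from the slide, used exactly as printed.

```python
def apply_dlad(xs, i, k, j, l):
    """Apply dlad at 0-based positions i < k < j < l.

    Requires xs[i] == xs[j] and xs[k] == xs[l]. The block xs[j .. l]
    moves directly after xs[i], and the block strictly between xs[i]
    and xs[k] moves to where xs[j .. l] used to end.
    """
    if not (0 <= i < k < j < l < len(xs)):
        raise ValueError("positions must satisfy i < k < j < l")
    if xs[i] != xs[j] or xs[k] != xs[l]:
        raise ValueError("dlad needs two interleaved pairs of equal pointers")
    return xs[:i + 1] + xs[j:l + 1] + xs[k:j] + xs[i + 1:k] + xs[l + 1:]


print(apply_dlad([2, 8, 10, 11, 9, 2, 1, 10, 8, 9, 12, 11], 2, 4, 7, 9))
# [2, 8, 10, 10, 8, 9, 9, 2, 1, 11, 12, 11]
```

Both pairs (10, 10) and (9, 9) are adjacent in the result, which is what sets up the two ld moves.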

boundary-ld
The boundary ld move (b-ld) maps lists of length 4 onto lists of length 2.
It only operates on lists of the form $[x, m, m', x]$, where $m, m' \in \{\pm\mu, \pm\lambda\}$ and $x \notin \{\pm\mu, \pm\lambda\}$ is some pointer.
It maps as follows: $[x, m, m', x] \mapsto [m', m]$
For example, $[-2, -1, -3, -2] \mapsto [-3, -1]$.
A list is considered sorted if it is of the form $[\mu, \lambda]$ or $[-\lambda, -\mu]$. Thus, if a boundary-ld move is done, it will always be the final move in the sorting.
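A sketch of the boundary-ld move and the sorted test follows. It assumes the signed reading of the example, [-2, -1, -3, -2] to [-3, -1], and the output order [m', m] implied by it; the names `apply_boundary_ld` and `is_sorted` are assumptions.

```python
def is_sorted(xs):
    """True for [mu, lambda] or [-lambda, -mu]: two same-sign, increasing entries."""
    return len(xs) == 2 and xs[0] < xs[1] and xs[0] * xs[1] > 0


def apply_boundary_ld(xs):
    """Map [x, m, m2, x] to [m2, m], where m and m2 are the boundary pointers."""
    if len(xs) != 4 or xs[0] != xs[3]:
        raise ValueError("boundary-ld needs a list of the form [x, m, m', x]")
    x, m, m2, _ = xs
    mags = [abs(v) for v in xs]
    boundary = {min(mags), max(mags)}  # the magnitudes mu and lambda
    if abs(m) not in boundary or abs(m2) not in boundary or abs(x) in boundary:
        raise ValueError("m and m' must be boundary pointers and x must not be")
    return [m2, m]


result = apply_boundary_ld([-2, -1, -3, -2])
print(result, is_sorted(result))  # [-3, -1] True
```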

The Algorithm
(1) Map the signed permutation onto a list of signed pointers.
(2) Search for and do the first possible ld. If one is done, go to (2). If no ld is found, go to (3).
(3) Check if a boundary-ld can be done. If it can, do it.
(4) Check if the list is sorted. If it is, end the program. Otherwise, go to (5).
(5) Search through the list, and keep a record of the pairs of oppositely-oriented pointers and of the pairs of equally-oriented pointers.
(6) Search through the list of equally-oriented pairs for a possible dlad move. If one is found, do it and go to (2). If none is found, go to (7).
(7) Do the hi represented by the first element of the list of oppositely-oriented pairs, then go to (2).
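The seven steps can be put together as in the sketch below. It is an illustration under stated assumptions, not the authors' program: pairs are ordered by the position of their first pointer, hi is taken from the oppositely oriented pairs (as the hi definition requires), boundary-ld is detected as a length-4 list whose first and last entries are equal, and all names are hypothetical.

```python
def to_pointer_list(perm):
    out = []
    for e in perm:
        k = abs(e)
        out += [k, k + 1] if e > 0 else [-(k + 1), -k]
    return out


def is_sorted(xs):
    return len(xs) == 2 and xs[0] < xs[1] and xs[0] * xs[1] > 0


def pointer_pairs(xs):
    """Step (5): positions of equally and oppositely oriented pointer pairs."""
    first_seen, same, opposite = {}, [], []
    for pos, x in enumerate(xs):
        if abs(x) in first_seen:
            start = first_seen[abs(x)]
            (same if xs[start] == x else opposite).append((start, pos))
        else:
            first_seen[abs(x)] = pos
    return sorted(same), sorted(opposite)


def sort_with_ciliate_ops(perm):
    xs = to_pointer_list(perm)                      # step (1)
    counts = {"ld": 0, "b-ld": 0, "dlad": 0, "hi": 0}
    while True:
        # Step (2): do the first possible ld and start over.
        i = next((p for p in range(len(xs) - 1) if xs[p] == xs[p + 1]), None)
        if i is not None:
            xs = xs[:i] + xs[i + 2:]
            counts["ld"] += 1
            continue
        # Step (3): boundary-ld on [x, m, m', x].
        if len(xs) == 4 and xs[0] == xs[3]:
            xs = [xs[2], xs[1]]
            counts["b-ld"] += 1
        # Step (4): stop when sorted.
        if is_sorted(xs):
            return xs, counts
        same, opposite = pointer_pairs(xs)          # step (5)
        # Step (6): look for two interleaved equally oriented pairs.
        dlad = next(((i, k, j, l) for i, j in same for k, l in same
                     if i < k < j < l), None)
        if dlad is not None:
            i, k, j, l = dlad
            xs = xs[:i + 1] + xs[j:l + 1] + xs[k:j] + xs[i + 1:k] + xs[l + 1:]
            counts["dlad"] += 1
        elif opposite:                              # step (7): hi
            i, j = opposite[0]
            xs = xs[:i + 1] + [-x for x in reversed(xs[i + 1:j + 1])] + xs[j + 1:]
            counts["hi"] += 1
        else:
            raise ValueError("no move applies; input is not a valid pointer list")


print(sort_with_ciliate_ops([1, 4, -3, -2, 6, -5]))
# ([1, 7], {'ld': 5, 'b-ld': 0, 'dlad': 0, 'hi': 4})
```

Every hi or dlad leaves at least one adjacent equal pair, so the next pass through step (2) shortens the list; this is why the loop ends on any input, raising an error when the input is not a valid pointer list.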

Some Theorems
Theorem. The algorithm runs in polynomial time. Specifically, the worst-case complexity is $O(n^3)$.
Theorem. A correctly-formed pointer list of length $n > 4$ is always in the domain of an hi, dlad, ld or boundary-ld move.
Theorem. The algorithm will always find a path to either $[\mu, \lambda]$ or $[-\lambda, -\mu]$, and thus will always terminate.

The Algorithm

We proved by contradiction that a pointer list is always in the domain of a ciliate operation. We assumed the following conditions for a list $x_1, \dots, x_n$:
(1) For every $i$ where $1 \le i < n$, $x_i \ne x_{i+1}$. (ld moves cannot happen)
(2) For every $|x_i| = |x_j|$, $x_i = x_j$. (hi moves cannot happen)
(3) For every $x_i = x_j$ and $x_k = x_l$ where $i < j$ and $k < l$, the intervals $(x_i, x_j)$ and $(x_k, x_l)$ are either disjoint, or one is a proper subset of the other. (dlad moves cannot happen)

We then showed that it is impossible for a list to satisfy these conditions and still be a pointer list. This, together with the fact that the ciliate operations all produce pointer lists, proves that the algorithm halts.
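The three conditions can be checked mechanically on small cases. The sketch below is a sanity check, not a proof, and rests on assumptions that are not from the talk: a pointer list is modeled as a list of nonzero integers in which every label occurs exactly twice, and the boundary markers and boundary-ld move are left out (so the $n > 4$ restriction does not arise). All names are hypothetical. It enumerates every such list with up to three labels and confirms that none satisfies conditions (1) to (3) at once.

```python
from itertools import permutations, product


def pair_positions(xs):
    """Map each label |x| to the two positions where it occurs."""
    pos = {}
    for i, x in enumerate(xs):
        pos.setdefault(abs(x), []).append(i)
    return pos


def no_ld(xs):
    """Condition (1): no two adjacent entries are equal."""
    return all(xs[i] != xs[i + 1] for i in range(len(xs) - 1))


def no_hi(xs):
    """Condition (2): both occurrences of a label carry the same sign."""
    return all(xs[i] == xs[j] for i, j in pair_positions(xs).values())


def no_dlad(xs):
    """Condition (3): equally-oriented pairs never interleave."""
    spans = [(i, j) for i, j in pair_positions(xs).values() if xs[i] == xs[j]]
    return not any(i < k < j < l for i, j in spans for k, l in spans)


def legal_lists(m):
    """Yield every signed list over labels 1..m with each label used twice."""
    base = [p for p in range(1, m + 1) for _ in range(2)]
    for order in set(permutations(base)):
        for signs in product((1, -1), repeat=2 * m):
            yield [s * p for s, p in zip(signs, order)]


# Lists on which none of ld, hi or dlad would apply.
stuck = [
    xs
    for m in (1, 2, 3)
    for xs in legal_lists(m)
    if no_ld(xs) and no_hi(xs) and no_dlad(xs)
]
print(len(stuck))
```

The intuition matches the contradiction argument: if all pairs are equally oriented and no two pairs interleave, the innermost pair has nothing between its two occurrences, so an ld move applies after all.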

Data Analysis

The algorithm produced these numbers of each move for the Muller A element, shown with the known time since each species diverged from D. melanogaster.

species         hi   dlad   b-ld   known divergence time
D. sechellia     7    488      0    5.4 mya
D. simulans      8    268      0    5.4 mya
D. erecta        3    170      0   12.6 mya
D. yakuba       11    105      1   12.6 mya
D. mojavensis   41    407      0   62 mya
D. virilis      33    403      0   62 mya
D. grimshawi    36    381      1   62 mya

Data Analysis

The algorithm produced these totals when the counts for every Muller element are added together:

species         hi   dlad   b-ld   known divergence time
D. sechellia    15   1460      3    5.4 mya
D. simulans     24    508      0    5.4 mya
D. erecta       30    658      0   12.6 mya
D. yakuba       47    438      1   12.6 mya
D. mojavensis  189    986      2   62 mya
D. virilis     194   1278      2   62 mya
D. grimshawi   197   1459      3   62 mya

While this algorithm does not necessarily produce the shortest overall path in the combined number of hi, dlad and ld moves, we conjecture that it uses the smallest possible number of hi moves.
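One simple way to relate these totals to divergence time is a Pearson correlation per move type. The sketch below copies the totals listed above. The choice of Pearson correlation and the helper name `pearson` are assumptions made for illustration, not the authors' analysis. With seven species and only three distinct divergence times, the result is at best a rough indicator.

```python
from math import sqrt

# Totals over all Muller elements, copied from the table above.
# Columns: species, hi, dlad, b-ld, known divergence time (mya).
TOTALS = [
    ("D. sechellia", 15, 1460, 3, 5.4),
    ("D. simulans", 24, 508, 0, 5.4),
    ("D. erecta", 30, 658, 0, 12.6),
    ("D. yakuba", 47, 438, 1, 12.6),
    ("D. mojavensis", 189, 986, 2, 62.0),
    ("D. virilis", 194, 1278, 2, 62.0),
    ("D. grimshawi", 197, 1459, 3, 62.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


times = [row[4] for row in TOTALS]
r_hi = pearson([row[1] for row in TOTALS], times)
r_dlad = pearson([row[2] for row in TOTALS], times)

print(f"hi vs divergence time:   r = {r_hi:.3f}")
print(f"dlad vs divergence time: r = {r_dlad:.3f}")
```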

The Future

Here are some further areas we have to explore:
1. We conjecture that our algorithm minimizes the number of hi moves in its reversal path. We have yet to prove this.
2. Develop an algorithm to find the shortest possible ciliate operation path.
3. Look at more species in the Drosophila genus and see if the correlation between ciliate operation path length and divergence time holds.
4. Explore the possibility of using ciliate operations to solve mathematical problems, such as the word and conjugacy problems in groups.
