Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations Jacob Herlin, Anna Nelson, and Dr. Marion Scheepers Department of Mathematical Sciences, University of Northern Colorado, Department of Mathematics, Boise State University July 29, 2011 J. Herlin, A. Nelson and M. Scheepers () Phylogenetic Relationships July 29, 2011 1 / 29
What is a phylogenetic relationship?
Phylogenetics is the study of evolutionary relationships between groups of organisms.
For this research, we focused on several species in the genus Drosophila.
Using ciliate operations, we want to explore whether a distance between two species computed from those operations can be related to their known phylogeny.
Orthologs are genes in different species that descend from a common ancestral gene.
Relationships are illustrated using phylogenetic trees.
Phylogenetic trees
Image courtesy of DroSpeGe: Drosophila Species Genomes. http://insects.eugenes.org/drospege
Using reversals to determine phylogeny
Genetic operations that scramble DNA result from breaking and rejoining the chromosome.
In 1938, Dobzhansky and Sturtevant discovered that the gene rearrangements that occurred in D. pseudoobscura were reversals. Since then, reversals have been used as the main genetic operation.
Using Drosophila melanogaster as the canonical reference species, one can use the number of reversals as a measure of evolutionary distance.
From Hannenhalli and Pevzner, it is known that a shortest reversal path for a signed permutation can be found in polynomial time.
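To make the reversal operation concrete, here is a minimal Python sketch of a single reversal on a signed permutation. The function name `reversal` and the 0-based, inclusive indices are illustrative choices, not taken from the slides.

```python
def reversal(perm, i, j):
    """Reverse the segment perm[i..j] (0-based, inclusive) and flip each sign.

    A reversed chromosome segment reads in the opposite direction, so every
    gene in it changes orientation as well as position.
    """
    return perm[:i] + [-g for g in reversed(perm[i:j + 1])] + perm[j + 1:]


# One reversal undoes an inverted block of two genes.
print(reversal([1, -3, -2, 4], 1, 2))  # [1, 2, 3, 4]
```

The reversal distance is then the smallest number of such calls needed to reach the reference order; the sketch only applies one reversal and does not search for a shortest path.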
Micronucleus and macronucleus of ciliates
Ciliates are multinuclear protozoans found in aqueous environments.
The micronucleus contains very long strands of DNA that are encrypted versions of macronuclear DNA.
The macronucleus is larger than the micronucleus and contains short strands of DNA that have been multiplied.
The micronuclear DNA is decrypted to form macronuclear DNA using three ciliate operations (hi, ld, dlad).
Macronuclear vs. Micronuclear DNA
Micronuclear DNA has three elements:
1. Macronuclear destined sequences (MDSs)
2. Internal eliminated sequences (IESs)
3. Pointers, which occur on the flanks of the MDSs
ld operation
[Figures: Steps 1 to 4 of the ld operation.]
hi operation
[Figures: Steps 1, 2, 3a, 3b and 4 of the hi operation, and the original strand.]
dlad operation
[Figures: Steps 1 to 4 of the dlad operation, and the original strand.]
Data Collection
We collected data from FlyBase.org, a database of Drosophila genes and genomes.
The data came as a precomputed text file listing every gene in our reference species (D. melanogaster) together with its orthologs in the genomes of the other species.
We used the relative location and orientation of the genes to produce signed permutations.
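As a rough illustration of how relative location and orientation can yield a signed permutation, the sketch below assumes each ortholog is given as a hypothetical tuple `(ref_start, ref_strand, other_start, other_strand)`. This record layout is invented for the example and is not the FlyBase file format, which the slides do not describe.

```python
def signed_permutation(orthologs):
    """Build a signed permutation from hypothetical ortholog records.

    Each record is (ref_start, ref_strand, other_start, other_strand).
    Genes are numbered 1..n by their order in the reference species, then
    listed in the order they occur in the other species. A gene is negative
    when its strand differs between the two species.
    """
    by_ref = sorted(orthologs, key=lambda rec: rec[0])
    rank = {rec: k + 1 for k, rec in enumerate(by_ref)}
    by_other = sorted(orthologs, key=lambda rec: rec[2])
    return [rank[rec] if rec[1] == rec[3] else -rank[rec] for rec in by_other]


# Made-up coordinates for four genes, for illustration only.
records = [
    (100, "+", 900, "+"),
    (200, "+", 300, "-"),
    (300, "-", 500, "-"),
    (400, "+", 100, "+"),
]
print(signed_permutation(records))  # [4, -2, 3, 1]
```

Real data would need one such permutation per chromosome arm, and genes without an ortholog in the other species would have to be dropped before numbering.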
Data Analysis
We used the following species to create permutations based on their genomes:
1. D. simulans
2. D. sechellia
3. D. yakuba
4. D. erecta
5. D. virilis
6. D. grimshawi
7. D. mojavensis
8. D. melanogaster
Pointer Lists
We created an algorithm to find a path, in terms of ciliate operations, from a signed permutation back to the canonical permutation.
We start by mapping a signed permutation, where each element represents a section of the genome,
[1, 4, -3, -2, 6, -5]
onto a list of pairs of pointers:
[(1, 2), (4, 5), (4, 3), (3, 2), (6, 7), (6, 5)]
where each pair (a, b) represents a section of genome spanning from a pointer a to a pointer b. In our algorithm, instead of keeping the pointers in pairs, we give them signs to represent the orientation of each section:
[1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
We call this representation a pointer list.
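The mapping from a signed permutation to a pointer list can be sketched in a few lines of Python. The sketch assumes that the minus signs dropped from the slide text are as follows: the example permutation is [1, 4, -3, -2, 6, -5], and a reversed section g contributes the pointers -(|g|+1), -|g|. The function name `to_pointer_list` is illustrative.

```python
def to_pointer_list(perm):
    """Map a signed permutation to a list of signed pointers.

    Section g > 0 spans pointers g to g+1 and contributes [g, g+1].
    Section g < 0 is read backwards and contributes [-(|g|+1), -|g|],
    which equals [g-1, g] for negative g.
    """
    pointers = []
    for g in perm:
        pointers += [g, g + 1] if g > 0 else [g - 1, g]
    return pointers


print(to_pointer_list([1, 4, -3, -2, 6, -5]))
# [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
```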
Pointer Lists
We define a pointer list formally as a list $L = [x_1, x_2, \dots, x_n]$ that satisfies the following six conditions:
1. $n$ is even.
2. There is a unique $i$ with $\mu = |x_i| = \min\{|x_j| : 1 \le j \le n\}$.
3. There is a unique $i$ with $\lambda = |x_i| = \max\{|x_j| : 1 \le j \le n\}$.
4. For each $i \in \{1, \dots, n\}$ with $\mu < |x_i| < \lambda$, there is a unique $j \in \{1, \dots, n\} \setminus \{i\}$ with $|x_i| = |x_j|$.
5. For each odd $i \in \{1, \dots, n\}$, $|x_i| \ne |x_{i+1}|$ and $x_i x_{i+1} > 0$.
6. For each odd $i$, there is no odd $j$ such that $x_i < x_j < x_{i+1} < x_{j+1}$.
Example: [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
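The first five conditions translate directly into a checker. The sketch below is an assumption-laden reading of the slide: it takes the comparisons in conditions 2 to 5 to be on absolute values (the extracted text has lost its bars and signs), and it leaves out condition 6 because its exact statement is not fully recoverable from the slide. The name `is_pointer_list` is illustrative.

```python
from collections import Counter


def is_pointer_list(xs):
    """Check conditions 1-5 of the pointer-list definition (6 is not checked)."""
    n = len(xs)
    if n == 0 or n % 2 != 0:                       # condition 1: n is even
        return False
    counts = Counter(abs(x) for x in xs)
    mu, lam = min(counts), max(counts)
    if counts[mu] != 1 or counts[lam] != 1:        # conditions 2 and 3
        return False
    if any(c != 2 for p, c in counts.items() if mu < p < lam):
        return False                               # condition 4
    # Condition 5: 0-based even index k corresponds to odd i = k + 1.
    return all(
        abs(xs[k]) != abs(xs[k + 1]) and xs[k] * xs[k + 1] > 0
        for k in range(0, n, 2)
    )


print(is_pointer_list([1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]))  # True
print(is_pointer_list([1, 2, -2, 3]))                               # False
```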
ld
The ld operation is represented by the removal of two adjacent copies of the same pointer. This is equivalent to joining two sections that are correctly adjacent.
[1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
[1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
Formally, ld is a function that maps a pointer list of length $n$ with $x_i = x_{i+1}$ to a pointer list of length $n-2$:
$[x_1, \dots, x_{i-1}, x_i, x_{i+1}, x_{i+2}, \dots, x_n] \mapsto [x_1, \dots, x_{i-1}, x_{i+2}, \dots, x_n]$
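A minimal sketch of ld in Python, assuming signed pointers (the slide example is read here as [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5], with the minus signs restored). Returning `None` when no move applies is a design choice for the sketch, not something the slides specify.

```python
def ld(xs):
    """Remove the first adjacent pair of identical signed pointers.

    Returns the shortened list, or None when no ld move is possible.
    """
    for i in range(len(xs) - 1):
        if xs[i] == xs[i + 1]:
            return xs[:i] + xs[i + 2:]
    return None


print(ld([1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]))
# [1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
```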
hi
The hi operation is represented by bringing together two copies of a pointer that have opposite orientation by means of a reversal, setting up an ld move. Note that reversing a section changes the sign of each of its elements.
[1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
[1, 2, 4, 4, -5, -2, 6, 7, -6, -5]
Formally, hi is a function that maps a pointer list of length $n$ with $x_i = -x_j$ to a list of the same length:
$[x_1, \dots, x_i, x_{i+1}, \dots, x_j, x_{j+1}, \dots, x_n] \mapsto [x_1, \dots, x_i, -x_j, \dots, -x_{i+1}, x_{j+1}, \dots, x_n]$
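The hi move can be sketched as a reversal with sign change of the span that ends at the second copy of the pointer. The code assumes signed pointers, with the slide example read as [1, 2, 4, 5, -4, -2, 6, 7, -6, -5], and uses 0-based indices; the function name and the error handling are illustrative.

```python
def hi(xs, i, j):
    """Apply hi to positions i < j (0-based) where xs[i] == -xs[j].

    The span xs[i+1..j] is reversed and every element in it is negated,
    which places a copy of xs[i] directly after xs[i] and so sets up an ld.
    """
    if not (0 <= i < j < len(xs)) or xs[i] != -xs[j]:
        raise ValueError("hi needs two oppositely oriented copies of a pointer")
    return xs[:i + 1] + [-x for x in reversed(xs[i + 1:j + 1])] + xs[j + 1:]


print(hi([1, 2, 4, 5, -4, -2, 6, 7, -6, -5], 2, 4))
# [1, 2, 4, 4, -5, -2, 6, 7, -6, -5]
```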
dlad
The dlad operation is represented by finding a 4-tuple of pointers $(x_i, x_j, x_k, x_l)$ where $x_i = x_j$, $x_k = x_l$ and $i < k < j < l$. The section from $x_j$ to $x_l$, including both pointers, is then swapped with the section strictly between $x_i$ and $x_k$, setting up two ld moves.
[2, 8, 10, 11, 9, 2, 1, 10, 8, 9, 12, 11]
[2, 8, 10, 10, 8, 9, 9, 2, 1, 11, 12, 11]
Formally, this maps a list of length $n$ to a list of the same length:
$[x_1, \dots, x_i, \dots, x_k, \dots, x_j, \dots, x_l, \dots, x_n] \mapsto [x_1, \dots, x_i, x_j, \dots, x_l, x_k, \dots, x_{j-1}, x_{i+1}, \dots, x_{k-1}, x_{l+1}, \dots, x_n]$
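The formal dlad map is a rearrangement of five slices, which the following Python sketch spells out with 0-based indices. The function name and the validation are illustrative; the example list is copied from the slide as it appears there.

```python
def dlad(xs, i, k, j, l):
    """Apply dlad at 0-based positions i < k < j < l.

    Requires xs[i] == xs[j] and xs[k] == xs[l]. The block xs[j..l] is moved
    to just after xs[i] and the block xs[i+1..k-1] to just after xs[j-1],
    leaving xs[i], xs[j] adjacent and xs[l], xs[k] adjacent.
    """
    if not (0 <= i < k < j < l < len(xs)):
        raise ValueError("positions must satisfy i < k < j < l")
    if xs[i] != xs[j] or xs[k] != xs[l]:
        raise ValueError("dlad needs two interleaved pairs of equal pointers")
    return xs[:i + 1] + xs[j:l + 1] + xs[k:j] + xs[i + 1:k] + xs[l + 1:]


before = [2, 8, 10, 11, 9, 2, 1, 10, 8, 9, 12, 11]
print(dlad(before, 2, 4, 7, 9))
# [2, 8, 10, 10, 8, 9, 9, 2, 1, 11, 12, 11]
```

In the result the two copies of 10 and the two copies of 9 are adjacent, so two ld moves can follow.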
boundary-ld
The boundary ld move (b-ld) maps lists of length 4 onto lists of length 2.
It only operates on lists of the form $[x, m, m', x]$, where $m, m' \in \{\pm\mu, \pm\lambda\}$ and $x \notin \{\pm\mu, \pm\lambda\}$ is some pointer.
It maps as such: $[x, m, m', x] \mapsto [m', m]$
For example, [-2, -1, -3, -2] maps to [-3, -1].
A list is considered sorted if it is in the form $[\mu, \lambda]$ or $[-\lambda, -\mu]$. Thus, if a boundary-ld move is done, it will always be the final move in the sorting.
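The boundary move and the sortedness test are small enough to sketch together. Both rest on assumptions about symbols lost from the slide text: the map is taken to be [x, m, m', x] to [m', m], the example is read as [-2, -1, -3, -2] to [-3, -1], and the sorted forms are taken to be [µ, λ] and [-λ, -µ]. The function names are illustrative.

```python
def boundary_ld(xs):
    """Apply b-ld to a list [x, m, m2, x]; return None if it does not apply."""
    if len(xs) != 4 or xs[0] != xs[3]:
        return None
    x, m, m2, _ = xs
    lo, hi_ = sorted((abs(m), abs(m2)))
    if not lo < abs(x) < hi_:      # x must lie strictly between mu and lambda
        return None
    return [m2, m]


def is_sorted(xs):
    """True for [mu, lambda] with both positive, or [-lambda, -mu]."""
    return len(xs) == 2 and (0 < xs[0] < xs[1] or xs[0] < xs[1] < 0)


result = boundary_ld([-2, -1, -3, -2])
print(result, is_sorted(result))  # [-3, -1] True
```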
The Algorithm
(1) Map the signed permutation onto a list of signed pointers.
(2) Search for and do the first possible ld. If one is done, go to (2). If no ld is found, go to (3).
(3) Check if a boundary-ld can be done. If it can, do it.
(4) Check if the list is sorted. If it is, end the program. Otherwise, go to (5).
(5) Search through the list, and keep memory of pairs of oppositely-oriented pointers and of pairs of equally-oriented pointers.
(6) Search through the list of equally-oriented pairs for a possible dlad move. If one is found, do it and go to (2). If none is found, go to (7).
(7) Do the hi represented by the first element of the list of oppositely-oriented pairs, then go to (2).
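Steps (1) to (7) can be assembled into one loop. The Python below is a sketch under several assumptions rather than the authors' program: pointers carry signs as in a list such as [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5], the hi in step (7) is taken from the oppositely-oriented pairs, b-ld maps [x, m, m', x] to [m', m], and "first" means first in order of the second occurrence of a pointer. All names are illustrative.

```python
def sort_by_ciliate_ops(perm):
    """Run the greedy ld / b-ld / dlad / hi loop; return (final list, move counts)."""
    # (1) Signed permutation -> signed pointer list.
    xs = []
    for g in perm:
        xs += [g, g + 1] if g > 0 else [g - 1, g]
    counts = {"ld": 0, "b-ld": 0, "dlad": 0, "hi": 0}
    while True:
        # (2) Do the first possible ld, and keep doing so while one exists.
        i = next((p for p in range(len(xs) - 1) if xs[p] == xs[p + 1]), None)
        if i is not None:
            xs = xs[:i] + xs[i + 2:]
            counts["ld"] += 1
            continue
        # (3) Boundary ld on a list of the form [x, m, m', x].
        if len(xs) == 4 and xs[0] == xs[3]:
            xs = [xs[2], xs[1]]
            counts["b-ld"] += 1
        # (4) Sorted means [mu, lambda] or [-lambda, -mu].
        if len(xs) == 2 and (0 < xs[0] < xs[1] or xs[0] < xs[1] < 0):
            return xs, counts
        # (5) Record the positions of equally and oppositely oriented pairs.
        first, same, opposite = {}, [], []
        for j, x in enumerate(xs):
            if abs(x) in first:
                i = first[abs(x)]
                (same if xs[i] == x else opposite).append((i, j))
            else:
                first[abs(x)] = j
        # (6) dlad needs two interleaved equally oriented pairs, i < k < j < l.
        move = next(((i, k, j, l) for (i, j) in same for (k, l) in same
                     if i < k < j < l), None)
        if move is not None:
            i, k, j, l = move
            xs = xs[:i + 1] + xs[j:l + 1] + xs[k:j] + xs[i + 1:k] + xs[l + 1:]
            counts["dlad"] += 1
        elif opposite:
            # (7) hi on the first oppositely oriented pair.
            i, j = opposite[0]
            xs = xs[:i + 1] + [-x for x in reversed(xs[i + 1:j + 1])] + xs[j + 1:]
            counts["hi"] += 1
        else:
            raise ValueError("no ciliate operation applies; input is not a pointer list")


final, counts = sort_by_ciliate_ops([1, 4, -3, -2, 6, -5])
print(final, counts)
# [1, 7] {'ld': 5, 'b-ld': 0, 'dlad': 0, 'hi': 4}
```

Every hi or dlad move leaves two equal pointers adjacent, so the next pass through step (2) shortens the list; that is why the loop cannot run forever on its own and either reaches a sorted list or raises.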
Some Theorems
Theorem. The algorithm runs in polynomial time. Specifically, the worst-case complexity is $O(n^3)$.
Theorem. A correctly-formed pointer list of length n > 4 is always in the domain of an hi, dlad, ld or boundary-ld move.
Theorem. The algorithm will always find a path to either $[\mu, \lambda]$ or $[-\lambda, -\mu]$, and thus will always terminate.
The Algorithm
We proved by contradiction that a pointer list is always in the domain of a ciliate operation. We assumed the following conditions:
(1) For every $i$ with $1 \le i < n$, $x_i \ne x_{i+1}$ (ld moves can't happen).
(2) Whenever $|x_i| = |x_j|$, $x_i = x_j$ (hi moves can't happen).
(3) For every $x_i = x_j$ and $x_k = x_l$ with $i < j$ and $k < l$, the intervals $(x_i, x_j)$ and $(x_k, x_l)$ are either disjoint, or one is a proper subset of the other (dlad moves can't happen).
We then showed that it is impossible for a list to satisfy these conditions and still be a pointer list.
This, together with the facts that the ciliate operations all produce pointer lists and that every hi or dlad move sets up an ld move, which shortens the list, proves that the algorithm halts.
Data Analysis
The algorithm produced these numbers of each move for the Muller A element, shown with the known time since each species diverged from D. melanogaster (mya = million years ago).
species         hi   dlad   b-ld   known divergence time
D. sechellia     7    488      0    5.4 mya
D. simulans      8    268      0    5.4 mya
D. erecta        3    170      0   12.6 mya
D. yakuba       11    105      1   12.6 mya
D. mojavensis   41    407      0   62 mya
D. virilis      33    403      0   62 mya
D. grimshawi    36    381      1   62 mya
Data Analysis
The algorithm produced these total numbers for every Muller element added together:
species          hi   dlad   b-ld   known divergence time
D. sechellia     15   1460      3    5.4 mya
D. simulans      24    508      0    5.4 mya
D. erecta        30    658      0   12.6 mya
D. yakuba        47    438      1   12.6 mya
D. mojavensis   189    986      2   62 mya
D. virilis      194   1278      2   62 mya
D. grimshawi    197   1459      3   62 mya
While this algorithm does not necessarily produce the shortest overall path in the combined number of hi, dlad and ld moves, we conjecture that it uses the smallest possible number of hi moves.
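One simple way to quantify how the move counts relate to divergence time is a Pearson correlation over the totals in the table. This is an illustrative calculation added here, not an analysis reported in the slides, and with only seven species and three distinct divergence times it should be read with caution.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Totals over all Muller elements, in the row order of the table.
divergence_mya = [5.4, 5.4, 12.6, 12.6, 62, 62, 62]
hi_total = [15, 24, 30, 47, 189, 194, 197]
dlad_total = [1460, 508, 658, 438, 986, 1278, 1459]

print("hi vs. divergence time:  ", round(pearson(hi_total, divergence_mya), 3))
print("dlad vs. divergence time:", round(pearson(dlad_total, divergence_mya), 3))
```

On these seven data points the hi totals track divergence time far more closely than the dlad totals do.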
The Future
Here are some further areas we have yet to explore:
1. We conjecture that our algorithm minimizes the number of hi moves in its reversal path. We have yet to prove this.
2. Develop an algorithm to find the shortest possible ciliate operation path.
3. Look at more species in the genus Drosophila and see whether the correlation between ciliate operation path length and divergence time holds.
4. Explore the possibility of using ciliate operations to solve mathematical problems, such as the word and conjugacy problems in groups.
Bibliography
Sridhar Hannenhalli, Pavel A. Pevzner. Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals. Journal of the ACM, Vol. 46, No. 1, 1999.
Pavel Pevzner, Glenn Tesler. Genome Rearrangements in Mammalian Evolution: Lessons from Human and Mouse Genomes. Genome Research, Vol. 13, 2003.
Arjun Bhutkar, Stephen W. Schaeffer, Susan M. Russo, Mu Xu, Temple F. Smith, William M. Gelbart. Chromosomal Rearrangement Inferred From Comparisons of 12 Drosophila Genomes. Genetics, Vol. 179, 2008.
Jose M. Ranz, Damien Maurin, Yuk S. Chan, Marcin von Grotthuss, LaDeana W. Hillier, John Roote, Michael Ashburner, Casey M. Bergman. Principles of Genome Evolution in the Drosophila melanogaster Species Group. PLoS Biology, Vol. 5, Issue 6, 2007.
Andrzej Ehrenfeucht, Tero Harju, Ion Petre, David M. Prescott, Grzegorz Rozenberg. Computation in Living Cells. Springer-Verlag Berlin Heidelberg, 2004.
S. Tweedie, M. Ashburner, K. Falls, P. Leyland, P. McQuilton, S. Marygold, G. Millburn, D. Osumi-Sutherland, A. Schroeder, R. Seal, H. Zhang and The FlyBase Consortium. FlyBase: enhancing Drosophila Gene Ontology annotations. Nucleic Acids Research, Vol. 37, 2009.