Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations

1 Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations Jacob Herlin, Anna Nelson, and Dr. Marion Scheepers Department of Mathematical Sciences, University of Northern Colorado, Department of Mathematics, Boise State University July 29, 2011 J. Herlin, A. Nelson and M. Scheepers () Phylogenetic Relationships July 29, / 29

4 What is a phylogenetic relationship?
Phylogenetics is the study of evolutionary relationships between groups of organisms.
For this research, we focused on several species in the genus Drosophila.
Using ciliate operations, we want to explore whether a distance between two species, computed via those operations, can be related to their known phylogeny.
Orthologs are genes in different species that descend from a single gene in their common ancestor.
Relationships are illustrated using phylogenetic trees.

5 Phylogenetic trees
[Figure: phylogenetic tree. Image courtesy of DroSpeGe: Drosophila Species Genomes.]

9 Using reversals to determine phylogeny
Genetic operations that scramble DNA result from breaking and rejoining the chromosome.
In 1938, Dobzhansky and Sturtevant discovered that the gene rearrangements occurring in D. pseudoobscura were reversals. Since then, reversals have been used as the main genetic operation.
Using Drosophila melanogaster as the canonical reference species, one can use the number of reversals as a measure of evolutionary distance.
Hannenhalli and Pevzner showed that a shortest reversal path can be found in polynomial time.
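A reversal on a signed permutation can be sketched in a few lines of Python. The function name `reversal` and the 0-based inclusive indices are choices made for this illustration, not something taken from the slides.

```python
def reversal(perm, i, j):
    """Reverse perm[i..j] (0-based, inclusive) and flip the sign of each entry."""
    flipped = [-g for g in reversed(perm[i:j + 1])]
    return perm[:i] + flipped + perm[j + 1:]


# One reversal undoes an inverted block of two genes.
print(reversal([1, -3, -2, 4], 1, 2))  # [1, 2, 3, 4]
```

Counting how many such steps are needed to reach the identity permutation gives the reversal distance mentioned above.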

13 Micronucleus and macronucleus of ciliates
Ciliates are multinuclear protozoans found in aqueous environments.
The micronucleus contains very long strands of DNA that are encrypted versions of macronuclear DNA.
The macronucleus is larger than the micronucleus and contains short strands of DNA that have been multiplied.
The micronuclear DNA is decrypted to form macronuclear DNA using three ciliate operations (hi, ld, dlad).

16 Macronuclear vs. Micronuclear DNA
Micronuclear DNA has three elements:
1. Macronuclear destined sequences (MDSs)
2. Internal eliminated sequences (IESs)
3. Pointers, which occur on the flanks of the MDSs

21 ld operation
[Figure: Steps 1 to 4 of the ld operation.]

26 hi operation
[Figure: Steps 1, 2, 3a, 3b and 4 of the hi operation, shown with the original sequence.]

30 dlad operation
[Figure: Steps 1 to 4 of the dlad operation, shown with the original sequence.]

31 Data Collection
We collected data from Flybase.org, a database of Drosophila genes and genomes.
The data came as a precomputed text file listing all the genes in our reference species (D. melanogaster) and their orthologs in the genomes of various other species.
We used the relative location and orientation of the genes to produce signed permutations.
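The step from ortholog positions to a signed permutation might look roughly like the sketch below. The record layout `(reference_rank, start, same_strand)` is hypothetical and is not the FlyBase file format; it assumes each gene has already been given its rank along the D. melanogaster chromosome.

```python
def signed_permutation(orthologs):
    """Build a signed permutation from ortholog records.

    Each record is a hypothetical (reference_rank, start, same_strand) tuple:
    the gene's rank in the reference species, its start coordinate in the
    other species, and whether it lies on the same strand as in the reference.
    """
    ordered = sorted(orthologs, key=lambda record: record[1])
    return [rank if same_strand else -rank for rank, _, same_strand in ordered]


example = [(1, 100, True), (2, 900, False), (3, 500, False), (4, 1500, True)]
print(signed_permutation(example))  # [1, -3, -2, 4]
```

Sorting by position gives the relative order of the genes, and the strand flag gives the sign.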

33 Data Analysis
We used the following species to create permutations based on their genomes:
1. D. simulans
2. D. sechellia
3. D. yakuba
4. D. erecta
5. D. virilis
6. D. grimshawi
7. D. mojavensis
8. D. melanogaster

37 Pointer Lists
We created an algorithm to find a path from a signed permutation back to the canonical permutation in terms of ciliate operations.
We start by mapping a signed permutation, where each element represents a section of genome,
[1, 4, -3, -2, 6, -5]
onto a list of pairs of pointers:
[(1, 2), (4, 5), (4, 3), (3, 2), (6, 7), (6, 5)]
where each pair (a, b) represents a section of genome spanning from a pointer a to a pointer b. In our algorithm, we give the pointers signs to represent the orientation of each section, instead of keeping them in pairs:
[1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
We call this representation a pointer list.
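The mapping from a signed permutation to a pointer list can be sketched as follows. This assumes the signed reading of the slide's example, in which a reversed section k is written -k and contributes the pointers -(k+1), -k. The function name `to_pointer_list` is made up for this illustration.

```python
def to_pointer_list(perm):
    """Map a signed permutation onto a flat list of signed pointers.

    Section k spans pointers k and k+1; a reversed section -k is assumed
    to contribute -(k+1), -k.
    """
    pointers = []
    for g in perm:
        if g > 0:
            pointers += [g, g + 1]
        else:
            pointers += [g - 1, g]
    return pointers


print(to_pointer_list([1, 4, -3, -2, 6, -5]))
# [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
```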

44 Pointer Lists
We define a pointer list formally as a list $L = [x_1, x_2, \ldots, x_n]$ that satisfies the following six conditions:
1. $n$ is even.
2. There is a unique $i$ with $\mu = |x_i| = \min\{|x_j| : 1 \le j \le n\}$.
3. There is a unique $j$ with $\lambda = |x_j| = \max\{|x_i| : 1 \le i \le n\}$.
4. For each $i \in \{1, \ldots, n\}$ with $\mu < |x_i| < \lambda$, there is a unique $j \in \{1, \ldots, n\} \setminus \{i\}$ with $|x_i| = |x_j|$.
5. For each odd $i \in \{1, \ldots, n\}$, $|x_i| \neq |x_{i+1}|$ and $x_i x_{i+1} > 0$.
6. For each odd $i$, there is no odd $j$ such that $x_i < x_j < x_{i+1} < x_{j+1}$.
Example: [1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]
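A partial validity check for the first five conditions could be written as below. It reads the comparisons in conditions 2 to 5 as comparisons of absolute values, which is an assumption about the garbled transcription. Condition 6 is deliberately left out, and the name `is_pointer_list` is hypothetical.

```python
from collections import Counter


def is_pointer_list(xs):
    """Check conditions 1-5 of the pointer-list definition (condition 6 is not checked)."""
    n = len(xs)
    if n == 0 or n % 2:                       # condition 1: even length
        return False
    counts = Counter(abs(x) for x in xs)
    smallest, largest = min(counts), max(counts)
    if counts[smallest] != 1 or counts[largest] != 1:   # conditions 2 and 3
        return False
    for value, count in counts.items():       # condition 4: interior pointers occur twice
        if smallest < value < largest and count != 2:
            return False
    # condition 5: each aligned pair has distinct pointers with the same sign
    return all(
        abs(xs[i]) != abs(xs[i + 1]) and xs[i] * xs[i + 1] > 0
        for i in range(0, n, 2)
    )


print(is_pointer_list([1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]))  # True
```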

48 ld
The ld operation is represented by removing an adjacent pair of identical pointers. This is equivalent to joining two sections that are correctly adjacent.
[1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5] → [1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
Formally, ld is a function that maps a pointer list of length $n$ to a pointer list of length $n-2$: if $x_i = x_{i+1}$, then
$[x_1, x_2, \ldots, x_i, x_{i+1}, \ldots, x_n] \mapsto [x_1, x_2, \ldots, x_{i-1}, x_{i+2}, \ldots, x_n]$
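As a minimal sketch, the ld move on a list of signed pointers is a single scan for two equal neighbours. The example assumes the signed reading of the slide's list; `apply_ld` is a name chosen here.

```python
def apply_ld(xs):
    """Remove the first adjacent pair of identical pointers; return None if there is none."""
    for i in range(len(xs) - 1):
        if xs[i] == xs[i + 1]:
            return xs[:i] + xs[i + 2:]
    return None


print(apply_ld([1, 2, 4, 5, -4, -3, -3, -2, 6, 7, -6, -5]))
# [1, 2, 4, 5, -4, -2, 6, 7, -6, -5]
```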

52 hi
The hi operation is represented by using a reversal to bring together two pointers of opposite orientation, setting up an ld move. Note that reversing a section changes the sign of each of its elements.
[1, 2, 4, 5, -4, -2, 6, 7, -6, -5] → [1, 2, 4, 4, -5, -2, 6, 7, -6, -5]
Formally, hi is a function that maps a pointer list of length $n$ to a pointer list of the same length: if $x_i = -x_j$ with $i < j$, then
$[x_1, \ldots, x_i, x_{i+1}, \ldots, x_j, x_{j+1}, \ldots, x_n] \mapsto [x_1, \ldots, x_i, -x_j, \ldots, -x_{i+1}, x_{j+1}, \ldots, x_n]$
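The hi move can be sketched as a signed reversal of the stretch from just after the first pointer up to and including its oppositely signed partner. The indices are 0-based here, and the example assumes the signed reading of the slide's list. `apply_hi` is a hypothetical name.

```python
def apply_hi(xs, i, j):
    """Reverse and negate xs[i+1..j] (0-based), where xs[i] == -xs[j] and i < j."""
    if not (0 <= i < j < len(xs)) or xs[i] != -xs[j]:
        raise ValueError("hi needs two oppositely oriented copies of one pointer")
    flipped = [-x for x in reversed(xs[i + 1:j + 1])]
    return xs[:i + 1] + flipped + xs[j + 1:]


print(apply_hi([1, 2, 4, 5, -4, -2, 6, 7, -6, -5], 2, 4))
# [1, 2, 4, 4, -5, -2, 6, 7, -6, -5]
```

After the move the two copies of pointer 4 are adjacent and equal, so an ld move can remove them.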

56 dlad
The dlad operation is represented by finding a 4-tuple of pointers $(x_i, x_j, x_k, x_l)$ where $x_i = x_j$, $x_k = x_l$ and $i < k < j < l$. The section from $x_j$ to $x_l$, including both pointers, is then swapped with the section strictly between $x_i$ and $x_k$, setting up two ld moves.
[2, 8, 10, 11, 9, 2, 1, 10, 8, 9, 12, 11] → [2, 8, 10, 10, 8, 9, 9, 2, 1, 11, 12, 11]
Formally, this maps a list of length $n$ to a list of the same length:
$[x_1, \ldots, x_i, \ldots, x_k, \ldots, x_j, \ldots, x_l, \ldots, x_n] \mapsto [x_1, \ldots, x_i, x_j, \ldots, x_l, x_k, \ldots, x_{j-1}, x_{i+1}, \ldots, x_{k-1}, x_{l+1}, \ldots, x_n]$
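The formal rearrangement translates directly into list slicing. The sketch below uses 0-based indices and takes the four positions as arguments; `apply_dlad` is a name invented for the example, and the list is the one from the slide.

```python
def apply_dlad(xs, i, k, j, l):
    """Apply dlad at 0-based positions i < k < j < l with xs[i] == xs[j], xs[k] == xs[l]."""
    if not (0 <= i < k < j < l < len(xs)) or xs[i] != xs[j] or xs[k] != xs[l]:
        raise ValueError("dlad needs two interleaved pairs of equal pointers")
    return (
        xs[:i + 1]      # x_1 .. x_i
        + xs[j:l + 1]   # x_j .. x_l
        + xs[k:j]       # x_k .. x_{j-1}
        + xs[i + 1:k]   # x_{i+1} .. x_{k-1}
        + xs[l + 1:]    # x_{l+1} .. x_n
    )


before = [2, 8, 10, 11, 9, 2, 1, 10, 8, 9, 12, 11]
print(apply_dlad(before, 2, 4, 7, 9))
# [2, 8, 10, 10, 8, 9, 9, 2, 1, 11, 12, 11]
```

In the result, the two copies of 10 and the two copies of 9 sit next to each other, which is what sets up the two ld moves.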

60 boundary-ld
The boundary-ld move (b-ld) maps lists of length 4 onto lists of length 2.
It only operates on lists of the form $[x, m, m', x]$, where $m, m' \in \{\pm\mu, \pm\lambda\}$ and $x \notin \{\pm\mu, \pm\lambda\}$ is some pointer.
It maps $[x, m, m', x] \mapsto [m', m]$. For example, [-2, -1, -3, -2] → [-3, -1].
A list is considered sorted if it is of the form $[\mu, \lambda]$ or $[-\lambda, -\mu]$. Thus, if a boundary-ld move is done, it will always be the final move in the sorting.
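A sketch of the boundary-ld move, reading the map as [x, m, m', x] → [m', m]. That reading is an assumption: the primes and minus signs are not visible in the transcription, but it is the one that sends the example to a sorted list. The function name is hypothetical.

```python
def apply_boundary_ld(xs):
    """Map [x, m, m', x] to [m', m]; return None if xs is not of that form."""
    if len(xs) != 4 or xs[0] != xs[3]:
        return None
    x, m, m_prime, _ = xs
    # The extreme pointers mu and lambda, compared by absolute value.
    ends = {min(abs(v) for v in xs), max(abs(v) for v in xs)}
    if abs(x) in ends or {abs(m), abs(m_prime)} != ends:
        return None
    return [m_prime, m]


print(apply_boundary_ld([-2, -1, -3, -2]))  # [-3, -1]
print(apply_boundary_ld([2, 3, 1, 2]))      # [1, 3]
```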

68 The Algorithm
(1) Map the signed permutation onto a list of signed pointers.
(2) Search for and do the first possible ld. If one is done, go to (2). If no ld is found, go to (3).
(3) Check if a boundary-ld can be done. If it can, do it.
(4) Check if the list is sorted. If it is, end the program. Otherwise, go to (5).
(5) Search through the list, recording the pairs of oppositely-oriented pointers and the pairs of equally-oriented pointers.
(6) Search through the list of equally-oriented pairs for a possible dlad move. If one is found, do it and go to (2). If none is found, go to (7).
(7) Do the hi represented by the first element of the list of oppositely-oriented pairs, then go to (2).
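The seven steps can be put together in a short Python sketch. This is not the authors' program. It assumes the signed reading of the pointer lists, reads boundary-ld as [x, m, m', x] → [m', m], and takes the hi move in step (7) from the oppositely-oriented pairs. All function names and the `max_steps` guard are additions for this illustration.

```python
def ld(xs):
    for i in range(len(xs) - 1):
        if xs[i] == xs[i + 1]:
            return xs[:i] + xs[i + 2:]
    return None

def boundary_ld(xs):
    if len(xs) == 4 and xs[0] == xs[3]:
        ends = {min(map(abs, xs)), max(map(abs, xs))}
        if abs(xs[0]) not in ends and {abs(xs[1]), abs(xs[2])} == ends:
            return [xs[2], xs[1]]
    return None

def is_sorted(xs):
    # [mu, lambda] or [-lambda, -mu]
    return len(xs) == 2 and (0 < xs[0] < xs[1] or xs[0] < xs[1] < 0)

def pointer_pairs(xs):
    """Positions (i, j), i < j, of repeated pointers, split by relative orientation."""
    first, same, opposite = {}, [], []
    for j, x in enumerate(xs):
        if abs(x) in first:
            i = first[abs(x)]
            (same if xs[i] == x else opposite).append((i, j))
        else:
            first[abs(x)] = j
    return same, opposite

def dlad(xs, same):
    for i, j in same:
        for k, l in same:
            if i < k < j < l:
                return xs[:i + 1] + xs[j:l + 1] + xs[k:j] + xs[i + 1:k] + xs[l + 1:]
    return None

def hi(xs, i, j):
    return xs[:i + 1] + [-x for x in reversed(xs[i + 1:j + 1])] + xs[j + 1:]

def sort_by_ciliate_ops(perm, max_steps=10_000):
    # (1) signed permutation -> signed pointer list
    xs = [p for g in perm for p in ((g, g + 1) if g > 0 else (g - 1, g))]
    counts = {"ld": 0, "b-ld": 0, "dlad": 0, "hi": 0}
    for _ in range(max_steps):
        step = ld(xs)                        # (2)
        if step is not None:
            xs, counts["ld"] = step, counts["ld"] + 1
            continue
        step = boundary_ld(xs)               # (3)
        if step is not None:
            xs, counts["b-ld"] = step, counts["b-ld"] + 1
        if is_sorted(xs):                    # (4)
            return xs, counts
        same, opposite = pointer_pairs(xs)   # (5)
        step = dlad(xs, same)                # (6)
        if step is not None:
            xs, counts["dlad"] = step, counts["dlad"] + 1
            continue
        if not opposite:
            raise ValueError("no applicable move")
        i, j = opposite[0]                   # (7)
        xs, counts["hi"] = hi(xs, i, j), counts["hi"] + 1
    raise RuntimeError("step limit reached")

print(sort_by_ciliate_ops([1, 4, -3, -2, 6, -5]))
# ([1, 7], {'ld': 5, 'b-ld': 0, 'dlad': 0, 'hi': 4})
```

The move counts returned in the dictionary correspond to the hi, dlad and b-ld columns reported in the data analysis.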

71 Some Theorems
Theorem: The algorithm runs in polynomial time. Specifically, the worst-case complexity is $O(n^3)$.
Theorem: A correctly-formed pointer list of length $n > 4$ is always in the domain of an hi, dlad, ld or boundary-ld move.
Theorem: The algorithm will always find a path to either $[\mu, \lambda]$ or $[-\lambda, -\mu]$, and thus will always terminate.

77 The Algorithm
We proved by contradiction that a pointer list is always in the domain of a ciliate operation. We assumed the following conditions:
(1) For every $i$ where $1 \le i < n$, $x_i \neq x_{i+1}$. (ld moves can't happen)
(2) For every $|x_i| = |x_j|$, $x_i = x_j$. (hi moves can't happen)
(3) For every $x_i = x_j$ and $x_k = x_l$ where $i < j$ and $k < l$, the intervals $(x_i, x_j)$ and $(x_k, x_l)$ are either disjoint, or one is a proper subset of the other. (dlad moves can't happen)
We then showed that it is impossible for a list to fit these conditions and still be a pointer list.
This, together with the fact that the ciliate operations all produce pointer lists, proves that the algorithm halts.

78 Data Analysis
The algorithm produced these numbers of each move for the Muller A element, shown with the known time since divergence from D. melanogaster.
[Table: columns are species, hi, dlad, b-ld and known divergence time (mya); rows are D. sechellia, D. simulans, D. erecta, D. yakuba, D. mojavensis, D. virilis and D. grimshawi. The numeric entries are not preserved.]

80 Data Analysis
The algorithm produced these total numbers for every Muller element added together:
[Table: columns are species, hi, dlad, b-ld and known divergence time (mya); rows are D. sechellia, D. simulans, D. erecta, D. yakuba, D. mojavensis, D. virilis and D. grimshawi. The numeric entries are not preserved.]
While this algorithm does not necessarily produce the shortest overall path in the combined number of hi, dlad and ld moves, we conjecture that it uses the smallest possible number of hi moves.

84 The Future
Here are some further areas we have yet to explore:
1. We conjecture that our algorithm minimizes the number of hi moves in its reversal path. We have yet to prove this.
2. Develop an algorithm to find the shortest possible ciliate operation path.
3. Look at more species in the genus Drosophila and see whether the correlation between ciliate operation path length and divergence time holds.
4. Explore the possibility of using ciliate operations to solve mathematical problems, such as the word and conjugacy problems in groups.

85 Bibliography
Sridhar Hannenhalli, Pavel A. Pevzner. Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals. Journal of the ACM, Vol. 46, No. 1.
Pavel Pevzner, Glenn Tesler. Genome Rearrangements in Mammalian Evolution: Lessons from Human and Mouse Genomes. Genome Research, Vol. 13.
Arjun Bhutkar, Stephen W. Schaeffer, Susan M. Russo, Mu Xu, Temple F. Smith, William M. Gelbart. Chromosomal Rearrangement Inferred From Comparisons of 12 Drosophila Genomes. Genetics, Vol. 197.

86 Bibliography (cont.)
Jose M. Ranz, Damien Maurin, Yuk S. Chan, Marchin Von Grotthuss, LeDeana W. Hillier, John Roote, Michael Ashburner, Casey M. Bergman. Principles of Genome Evolution in the Drosophila melanogaster Species Group. PLoS Biology, Vol. 5, Issue 6.
Andrzej Ehrenfeucht, Tero Harju, Ion Petre, David M. Prescott, Grzegorz Rozenberg. Computation in Living Cells. Springer-Verlag Berlin Heidelberg.
S. Tweedie, M. Ashburner, K. Falls, P. Leyland, P. McQuilton, S. Marygold, G. Millburn, D. Osumi-Sutherland, A. Schroeder, R. Seal, H. Zhang and The FlyBase Consortium. FlyBase: enhancing Drosophila Gene Ontology annotations. Nucleic Acids Research, Vol. 37.


More information

Combinatorial models for DNA rearrangements in ciliates

Combinatorial models for DNA rearrangements in ciliates University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School 2009 Combinatorial models for DNA rearrangements in ciliates Angela Angeleska University of South Florida Follow

More information

Models of Natural Computation: Gene Assembly and Membrane Systems. Robert Brijder

Models of Natural Computation: Gene Assembly and Membrane Systems. Robert Brijder Models of Natural Computation: Gene Assembly and Membrane Systems Robert Brijder The work in this thesis has been carried out under the auspices of the research school IPA (Institute for Programming research

More information

Greedy Algorithms. CS 498 SS Saurabh Sinha

Greedy Algorithms. CS 498 SS Saurabh Sinha Greedy Algorithms CS 498 SS Saurabh Sinha Chapter 5.5 A greedy approach to the motif finding problem Given t sequences of length n each, to find a profile matrix of length l. Enumerative approach O(l n

More information

On the complexity of unsigned translocation distance

On the complexity of unsigned translocation distance Theoretical Computer Science 352 (2006) 322 328 Note On the complexity of unsigned translocation distance Daming Zhu a, Lusheng Wang b, a School of Computer Science and Technology, Shandong University,

More information

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution Taxonomy Content Why Taxonomy? How to determine & classify a species Domains versus Kingdoms Phylogeny and evolution Why Taxonomy? Classification Arrangement in groups or taxa (taxon = group) Nomenclature

More information

A Complex Suite of Forces Drives Gene Traffic from Drosophila X Chromosomes

A Complex Suite of Forces Drives Gene Traffic from Drosophila X Chromosomes Complex Suite of Forces Drives Gene Traffic from Drosophila Chromosomes Richard P. Meisel,* Mira V. Han,à and Matthew W. Hahnà *Department of Biology and Graduate Program in Genetics, The Pennsylvania

More information

CONTENTS. P A R T I Genomes 1. P A R T II Gene Transcription and Regulation 109

CONTENTS. P A R T I Genomes 1. P A R T II Gene Transcription and Regulation 109 CONTENTS ix Preface xv Acknowledgments xxi Editors and contributors xxiv A computational micro primer xxvi P A R T I Genomes 1 1 Identifying the genetic basis of disease 3 Vineet Bafna 2 Pattern identification

More information

Computational processes

Computational processes Spring 2010 Computational processes in living cells Lecture 6: Gene assembly as a pointer reduction in MDS descriptors Vladimir Rogojin Department of IT, Abo Akademi http://www.abo.fi/~ipetre/compproc/

More information

Isolating - A New Resampling Method for Gene Order Data

Isolating - A New Resampling Method for Gene Order Data Isolating - A New Resampling Method for Gene Order Data Jian Shi, William Arndt, Fei Hu and Jijun Tang Abstract The purpose of using resampling methods on phylogenetic data is to estimate the confidence

More information

Discrete Applied Mathematics

Discrete Applied Mathematics Discrete Applied Mathematics 159 (2011) 1641 1645 Contents lists available at ScienceDirect Discrete Applied Mathematics journal homepage: www.elsevier.com/locate/dam Note Girth of pancake graphs Phillip

More information

Inferring Phylogenies from RAD Sequence Data

Inferring Phylogenies from RAD Sequence Data Inferring Phylogenies from RAD Sequence Data Benjamin E. R. Rubin 1,2 *, Richard H. Ree 3, Corrie S. Moreau 2 1 Committee on Evolutionary Biology, University of Chicago, Chicago, Illinois, United States

More information

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB I519 Introduction to Bioinformatics, 2015 Genome Comparison Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Whole genome comparison/alignment Build better phylogenies Identify polymorphism

More information

The combinatorics and algorithmics of genomic rearrangements have been the subject of much

The combinatorics and algorithmics of genomic rearrangements have been the subject of much JOURNAL OF COMPUTATIONAL BIOLOGY Volume 22, Number 5, 2015 # Mary Ann Liebert, Inc. Pp. 425 435 DOI: 10.1089/cmb.2014.0096 An Exact Algorithm to Compute the Double-Cutand-Join Distance for Genomes with

More information

COMPUTATIONAL PROCESSES IN LIVING CELLS

COMPUTATIONAL PROCESSES IN LIVING CELLS COMPUTATIONAL PROCESSES IN LIVING CELLS Lecture 7: Formal Systems for Gene Assembly in Ciliates: the String Pointer Reduction System March 31, 2010 MDS-descriptors MDS descriptors: strings over the following

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics Adapted from slides by Alexandru Tomescu, Leena Salmela, Veli Mäkinen, Esa Pitkänen 582670 Algorithms for Bioinformatics Lecture 5: Combinatorial Algorithms and Genomic Rearrangements 1.10.2015 Background

More information

Evolution of genes and genomes on the Drosophila phylogeny

Evolution of genes and genomes on the Drosophila phylogeny Vol 450 8 November 2007 doi:10.1038/nature06341 ARTICLES Evolution of genes and genomes on the Drosophila phylogeny Drosophila 12 Genomes Consortium* Comparative analysis of multiple genomes in a phylogenetic

More information

Results and Problems in Cell Differentiation

Results and Problems in Cell Differentiation Results and Problems in Cell Differentiation A Series of Topical Volumes in Developmental Biology 13 Editors W. Hennig, Nijmegen and 1. Reinert, Berlin Germ Line - Soma Differentiation Edited by W. Hennig

More information

Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement ABSTRACT

Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement ABSTRACT JOURNAL OF COMPUTATIONAL BIOLOGY Volume 12, Number 6, 2005 Mary Ann Liebert, Inc. Pp. 812 821 Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement DAVID SANKOFF 1 and PHIL TRINH 2 ABSTRACT In

More information

计算机科学中的问题求解初探. Great Theoretical Ideas in Computer Science

计算机科学中的问题求解初探. Great Theoretical Ideas in Computer Science 计算机科学中的问题求解初探 Great Theoretical Ideas in Computer Science Great Theoretical Ideas in Computer Science Spring 201: Lecture 4 Pancake Sorting Feel free to ask questions The chef in our place is sloppy; when

More information

Theoretical Computer Science. Rewriting rule chains modeling DNA rearrangement pathways

Theoretical Computer Science. Rewriting rule chains modeling DNA rearrangement pathways Theoretical Computer Science 454 (2012) 5 22 Contents lists available at SciVerse ScienceDirect Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs Rewriting rule chains modeling

More information

Computational Power of Gene Rearrangement. Lila Kari and Laura F. Landweber

Computational Power of Gene Rearrangement. Lila Kari and Laura F. Landweber DIMACS Series in Discrete Mathematics and Theoretical Computer Science Computational Power of Gene Rearrangement Lila Kari and Laura F. Landweber Abstract. In [8] we proposed a model to describe the homologous

More information

Graphs, permutations and sets in genome rearrangement

Graphs, permutations and sets in genome rearrangement ntroduction Graphs, permutations and sets in genome rearrangement 1 alabarre@ulb.ac.be Universite Libre de Bruxelles February 6, 2006 Computers in Scientic Discovery 1 Funded by the \Fonds pour la Formation

More information

A Framework for Orthology Assignment from Gene Rearrangement Data

A Framework for Orthology Assignment from Gene Rearrangement Data A Framework for Orthology Assignment from Gene Rearrangement Data Krister M. Swenson, Nicholas D. Pattengale, and B.M.E. Moret Department of Computer Science University of New Mexico Albuquerque, NM 87131,

More information

Phylogenomic Resources at the UCSC Genome Browser

Phylogenomic Resources at the UCSC Genome Browser 9 Phylogenomic Resources at the UCSC Genome Browser Kate Rosenbloom, James Taylor, Stephen Schaeffer, Jim Kent, David Haussler, and Webb Miller Summary The UC Santa Cruz Genome Browser provides a number

More information

Graph Polynomials motivated by Gene Assembly

Graph Polynomials motivated by Gene Assembly Colloquium USF Tampa Jan Graph Polynomials motivated by Gene Assembly Hendrik Jan Hoogeboom, Leiden NL with Robert Brijder, Hasselt B transition polynomials assembly polynomial of G w for doc-word w S(G

More information

Multiple Sequence Alignment. Sequences

Multiple Sequence Alignment. Sequences Multiple Sequence Alignment Sequences > YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd > crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe

More information

Abstract. comment reviews reports deposited research refereed research interactions information

Abstract.  comment reviews reports deposited research refereed research interactions information http://genomebiology.com/2002/3/12/research/0086.1 Research Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome Casey M Bergman*, Barret D Pfeiffer*,

More information

Genome Rearrangements In Man and Mouse. Abhinav Tiwari Department of Bioengineering

Genome Rearrangements In Man and Mouse. Abhinav Tiwari Department of Bioengineering Genome Rearrangements In Man and Mouse Abhinav Tiwari Department of Bioengineering Genome Rearrangement Scrambling of the order of the genome during evolution Operations on chromosomes Reversal Translocation

More information

The Algebra of Gene Assembly in Ciliates

The Algebra of Gene Assembly in Ciliates The Algebra of Gene Assembly in Ciliates Robert Brijder and Hendrik Jan Hoogeboom Abstract The formal theory of intramolecular gene assembly in ciliates is fitted into the well-established theories of

More information

Graduate Funding Information Center

Graduate Funding Information Center Graduate Funding Information Center UNC-Chapel Hill, The Graduate School Graduate Student Proposal Sponsor: Program Title: NESCent Graduate Fellowship Department: Biology Funding Type: Fellowship Year:

More information

CSC 421: Algorithm Design & Analysis. Spring 2018

CSC 421: Algorithm Design & Analysis. Spring 2018 CSC 421: Algorithm Design & Analysis Spring 2018 Complexity & Computability complexity theory tractability, decidability P vs. NP, Turing machines NP-complete, reductions approximation algorithms, genetic

More information

Recombination Faults in Gene Assembly in Ciliates Modeled Using Multimatroids

Recombination Faults in Gene Assembly in Ciliates Modeled Using Multimatroids Recombination Faults in Gene Assembly in Ciliates Modeled Using Multimatroids Robert Brijder 1 Hasselt University and Transnational University of Limburg, Belgium Abstract We formally model the process

More information

Origins and Evolution of MicroRNA Genes in Drosophila Species

Origins and Evolution of MicroRNA Genes in Drosophila Species Origins and Evolution of MicroRNA Genes in Drosophila Species Masafumi Nozawa*, Sayaka Miura, and Masatoshi Nei Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

EARLY ONLINE RELEASE. Daniel A. Pollard, Venky N. Iyer, Alan M. Moses, Michael B. Eisen

EARLY ONLINE RELEASE. Daniel A. Pollard, Venky N. Iyer, Alan M. Moses, Michael B. Eisen EARLY ONLINE RELEASE This is a provisional PDF of the author-produced electronic version of a manuscript that has been accepted for publication. Although this article has been peer-reviewed, it was posted

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies

Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies RESEARCH Open Access Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies Magali Semeria 1, Eric Tannier 1,2, Laurent Guéguen 1* From 13th Annual Research in Computational

More information

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven)

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven) BMI/CS 776 Lecture #20 Alignment of whole genomes Colin Dewey (with slides adapted from those by Mark Craven) 2007.03.29 1 Multiple whole genome alignment Input set of whole genome sequences genomes diverged

More information

Assembly improvement: based on Ragout approach. student: Anna Lioznova scientific advisor: Son Pham

Assembly improvement: based on Ragout approach. student: Anna Lioznova scientific advisor: Son Pham Assembly improvement: based on Ragout approach student: Anna Lioznova scientific advisor: Son Pham Plan Ragout overview Datasets Assembly improvements Quality overlap graph paired-end reads Coverage Plan

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

A Nonlinear Lower Bound on Linear Search Tree Programs for Solving Knapsack Problems*

A Nonlinear Lower Bound on Linear Search Tree Programs for Solving Knapsack Problems* JOURNAL OF COMPUTER AND SYSTEM SCIENCES 13, 69--73 (1976) A Nonlinear Lower Bound on Linear Search Tree Programs for Solving Knapsack Problems* DAVID DOBKIN Department of Computer Science, Yale University,

More information

Annotation of Drosophila grimashawi Contig12

Annotation of Drosophila grimashawi Contig12 Annotation of Drosophila grimashawi Contig12 Marshall Strother April 27, 2009 Contents 1 Overview 3 2 Genes 3 2.1 Genscan Feature 12.4............................................. 3 2.1.1 Genome Browser:

More information

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB I519 Introduction to Bioinformatics, 2011 Genome Comparison Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Whole genome comparison/alignment Build better phylogenies Identify polymorphism

More information

Genome Rearrangement. 1 Inversions. Rick Durrett 1

Genome Rearrangement. 1 Inversions. Rick Durrett 1 Genome Rearrangement Rick Durrett 1 Dept. of Math., Cornell U., Ithaca NY, 14853 rtd1@cornell.edu Genomes evolve by chromosomal fissions and fusions, reciprocal translocations between chromosomes, and

More information

Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are:

Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are: Comparative genomics and proteomics Species available Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are: Vertebrates: human, chimpanzee, mouse, rat,

More information

species Nicolas Bargues and Emmanuelle Lerat *

species Nicolas Bargues and Emmanuelle Lerat * Bargues and Lerat Mobile DNA (2017) 8:7 DOI 10.1186/s13100-017-0090-3 RESEARCH Open Access Evolutionary history of LTRretrotransposons among 20 Drosophila species Nicolas Bargues and Emmanuelle Lerat *

More information

Chapter 13 Meiosis and Sexual Reproduction

Chapter 13 Meiosis and Sexual Reproduction Biology 110 Sec. 11 J. Greg Doheny Chapter 13 Meiosis and Sexual Reproduction Quiz Questions: 1. What word do you use to describe a chromosome or gene allele that we inherit from our Mother? From our Father?

More information

Gene Family Evolution across 12 Drosophila Genomes

Gene Family Evolution across 12 Drosophila Genomes Gene Family Evolution across 12 Drosophila Genomes Matthew W. Hahn 1,2*, Mira V. Han 2, Sang-Gook Han 2 1 Department of Biology, Indiana University, Bloomington, Indiana, United States of America, 2 School

More information

Analysis of Gene Order Evolution beyond Single-Copy Genes

Analysis of Gene Order Evolution beyond Single-Copy Genes Analysis of Gene Order Evolution beyond Single-Copy Genes Nadia El-Mabrouk Département d Informatique et de Recherche Opérationnelle Université de Montréal mabrouk@iro.umontreal.ca David Sankoff Department

More information

Discrete Applied Mathematics. Maximal pivots on graphs with an application to gene assembly

Discrete Applied Mathematics. Maximal pivots on graphs with an application to gene assembly Discrete Applied Mathematics 158 (2010) 1977 1985 Contents lists available at ScienceDirect Discrete Applied Mathematics journal homepage: www.elsevier.com/locate/dam Maximal pivots on graphs with an application

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity CS 350 Algorithms and Complexity Winter 2019 Lecture 15: Limitations of Algorithmic Power Introduction to complexity theory Andrew P. Black Department of Computer Science Portland State University Lower

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity 1 CS 350 Algorithms and Complexity Fall 2015 Lecture 15: Limitations of Algorithmic Power Introduction to complexity theory Andrew P. Black Department of Computer Science Portland State University Lower

More information

Techniques for Multi-Genome Synteny Analysis to Overcome Assembly Limitations

Techniques for Multi-Genome Synteny Analysis to Overcome Assembly Limitations 152 Genome Informatics 17(2): 152{161 (2006) Techniques for Multi-Genome Synteny Analysis to Overcome Assemly Limitations Arjun Bhutkar 1;2 Susan Russo 1 arjun@morgan.harvard.edu russo@morgan.harvard.edu

More information

A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS

A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS CRYSTAL L. KAHN and BENJAMIN J. RAPHAEL Box 1910, Brown University Department of Computer Science & Center for Computational Molecular Biology

More information

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly Comparative Genomics: Human versus chimpanzee 1. Introduction The chimpanzee is the closest living relative to humans. The two species are nearly identical in DNA sequence (>98% identity), yet vastly different

More information

Characteristics of Life

Characteristics of Life UNIT 2 BIODIVERSITY Chapter 4- Patterns of Life Biology 2201 Characteristics of Life All living things share some basic characteristics: 1) living things are organized systems made up of one or more cells

More information

AP Biology. Read college-level text for understanding and be able to summarize main concepts

AP Biology. Read college-level text for understanding and be able to summarize main concepts St. Mary's College AP Biology Continuity and Change Consider how specific changes to an ecosystem (geological, climatic, introduction of new organisms, etc.) can affect the organisms that live within it.

More information

Evolution of Tandemly Arrayed Genes in Multiple Species

Evolution of Tandemly Arrayed Genes in Multiple Species Evolution of Tandemly Arrayed Genes in Multiple Species Mathieu Lajoie 1, Denis Bertrand 1, and Nadia El-Mabrouk 1 DIRO - Université de Montréal - H3C 3J7 - Canada {bertrden,lajoimat,mabrouk}@iro.umontreal.ca

More information

Edit Distance with Move Operations

Edit Distance with Move Operations Edit Distance with Move Operations Dana Shapira and James A. Storer Computer Science Department shapird/storer@cs.brandeis.edu Brandeis University, Waltham, MA 02254 Abstract. The traditional edit-distance

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT COMMUNICATIONS IN INFORMATION AND SYSTEMS c 2009 International Press Vol. 9, No. 4, pp. 295-302, 2009 001 THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT DAN GUSFIELD AND YUFENG WU Abstract.

More information

Overdispersion of the Molecular Clock: Temporal Variation of Gene-Specific Substitution Rates in Drosophila

Overdispersion of the Molecular Clock: Temporal Variation of Gene-Specific Substitution Rates in Drosophila Overdispersion of the Molecular Clock: Temporal Variation of Gene-Specific Substitution Rates in Drosophila Trevor Bedford and Daniel L. Hartl Department of Organismic and Evolutionary Biology, Harvard

More information

ECOL/MCB 320 and 320H Genetics

ECOL/MCB 320 and 320H Genetics ECOL/MCB 320 and 320H Genetics Instructors Dr. C. William Birky, Jr. Dept. of Ecology and Evolutionary Biology Lecturing on Molecular genetics Transmission genetics Population and evolutionary genetics

More information

Single alignment: Substitution Matrix. 16 march 2017

Single alignment: Substitution Matrix. 16 march 2017 Single alignment: Substitution Matrix 16 march 2017 BLOSUM Matrix BLOSUM Matrix [2] (Blocks Amino Acid Substitution Matrices ) It is based on the amino acids substitutions observed in ~2000 conserved block

More information

MOLECULAR PHYLOGENY AND GENETIC DIVERSITY ANALYSIS. Masatoshi Nei"

MOLECULAR PHYLOGENY AND GENETIC DIVERSITY ANALYSIS. Masatoshi Nei MOLECULAR PHYLOGENY AND GENETIC DIVERSITY ANALYSIS Masatoshi Nei" Abstract: Phylogenetic trees: Recent advances in statistical methods for phylogenetic reconstruction and genetic diversity analysis were

More information

New Algorithms for Statistical Analysis of Interval Data

New Algorithms for Statistical Analysis of Interval Data New Algorithms for Statistical Analysis of Interval Data Gang Xiang, Scott A. Starks, Vladik Kreinovich, and Luc Longpré NASA Pan-American Center for Earth and Environmental Studies PACES) University of

More information

Comparative genomics: Overview & Tools + MUMmer algorithm

Comparative genomics: Overview & Tools + MUMmer algorithm Comparative genomics: Overview & Tools + MUMmer algorithm Urmila Kulkarni-Kale Bioinformatics Centre University of Pune, Pune 411 007. urmila@bioinfo.ernet.in Genome sequence: Fact file 1995: The first

More information

Lesson 4: Understanding Genetics

Lesson 4: Understanding Genetics Lesson 4: Understanding Genetics 1 Terms Alleles Chromosome Co dominance Crossover Deoxyribonucleic acid DNA Dominant Genetic code Genome Genotype Heredity Heritability Heritability estimate Heterozygous

More information

Probability & Combinatorics Test and Solutions February 18, 2012

Probability & Combinatorics Test and Solutions February 18, 2012 1. A standard 12-hour clock has hour, minute, and second hands. How many times do two hands cross between 1:00 and 2:00 (not including 1:00 and 2:00 themselves)? Answer: 119 Solution: We know that the

More information

Lesson 10 Study Guide

Lesson 10 Study Guide URI CMB 190 Issues in Biotechnology Lesson 10 Study Guide 15. By far most of the species that have ever existed are now extinct. Many of those extinct species were the precursors of the species that are

More information

Phylogenetic Networks, Trees, and Clusters

Phylogenetic Networks, Trees, and Clusters Phylogenetic Networks, Trees, and Clusters Luay Nakhleh 1 and Li-San Wang 2 1 Department of Computer Science Rice University Houston, TX 77005, USA nakhleh@cs.rice.edu 2 Department of Biology University

More information

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

More information

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata.

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Supplementary Note S2 Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Phylogenetic trees reconstructed by a variety of methods from either single-copy orthologous loci (Class

More information

SUMS PROBLEM COMPETITION, 2000

SUMS PROBLEM COMPETITION, 2000 SUMS ROBLEM COMETITION, 2000 SOLUTIONS 1 The result is well known, and called Morley s Theorem Many proofs are known See for example HSM Coxeter, Introduction to Geometry, page 23 2 If the number of vertices,

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

BIOLOGY 432 Midterm I - 30 April PART I. Multiple choice questions (3 points each, 42 points total). Single best answer.

BIOLOGY 432 Midterm I - 30 April PART I. Multiple choice questions (3 points each, 42 points total). Single best answer. BIOLOGY 432 Midterm I - 30 April 2012 Name PART I. Multiple choice questions (3 points each, 42 points total). Single best answer. 1. Over time even the most highly conserved gene sequence will fix mutations.

More information

Advanced practical course in genome bioinformatics DAY 6: Functional annotation. Petri Törönen Earlier version Patrik Koskinen

Advanced practical course in genome bioinformatics DAY 6: Functional annotation. Petri Törönen Earlier version Patrik Koskinen Advanced practical course in genome bioinformatics DAY 6: Functional annotation Petri Törönen Earlier version Patrik Koskinen Genome project roadmap After experimental design and preparations a genome

More information

arxiv: v1 [math.co] 11 Jul 2016

arxiv: v1 [math.co] 11 Jul 2016 Characterization and recognition of proper tagged probe interval graphs Sourav Chakraborty, Shamik Ghosh, Sanchita Paul and Malay Sen arxiv:1607.02922v1 [math.co] 11 Jul 2016 October 29, 2018 Abstract

More information

Phylogenetic inference

Phylogenetic inference Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types

More information

Evolutionary genetics of the Drosophila melanogaster subgroup I. Phylogenetic relationships based on coatings, hybrids and proteins

Evolutionary genetics of the Drosophila melanogaster subgroup I. Phylogenetic relationships based on coatings, hybrids and proteins Jpn. J. Genet. (1987) 62, pp. 225-239 Evolutionary genetics of the Drosophila melanogaster subgroup I. Phylogenetic relationships based on coatings, hybrids and proteins BY Won Ho LEE* and Takao K. WATANABE**

More information

F1 Parent Cell R R. Name Period. Concept 15.1 Mendelian inheritance has its physical basis in the behavior of chromosomes

F1 Parent Cell R R. Name Period. Concept 15.1 Mendelian inheritance has its physical basis in the behavior of chromosomes Name Period Concept 15.1 Mendelian inheritance has its physical basis in the behavior of chromosomes 1. What is the chromosome theory of inheritance? 2. Explain the law of segregation. Use two different

More information

CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES

CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES INTRODUCTION CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES This worksheet complements the Click and Learn developed in conjunction with the 2011 Holiday Lectures on Science, Bones, Stones, and Genes:

More information

Lecture 11 Friday, October 21, 2011

Lecture 11 Friday, October 21, 2011 Lecture 11 Friday, October 21, 2011 Phylogenetic tree (phylogeny) Darwin and classification: In the Origin, Darwin said that descent from a common ancestral species could explain why the Linnaean system

More information

Biol478/ August

Biol478/ August Biol478/595 29 August # Day Inst. Topic Hwk Reading August 1 M 25 MG Introduction 2 W 27 MG Sequences and Evolution Handouts 3 F 29 MG Sequences and Evolution September M 1 Labor Day 4 W 3 MG Database

More information
