A Framework for Orthology Assignment from Gene Rearrangement Data

Size: px
Start display at page:

Download "A Framework for Orthology Assignment from Gene Rearrangement Data"

Transcription

1 A Framework for Orthology Assignment from Gene Rearrangement Data Krister M. Swenson, Nicholas D. Pattengale, and B.M.E. Moret Department of Computer Science University of New Mexico Albuquerque, NM 87131, USA Abstract. Gene rearrangements have successfully been used in phylogenetic reconstruction and comparative genomics, but usually under the assumption that all genomes have the same gene content and that no gene is duplicated. While these assumptions allow one to work with organellar genomes, they are too restrictive when comparing nuclear genomes. The main challenge is how to deal with gene families, specifically, how to identify orthologs. While searching for orthologies is a common task in computational biology, it is usually done using sequence data. We approach that problem using gene rearrangement data, provide an optimization framework in which to phrase the problem, and present some preliminary theoretical results. 1 Introduction Gene rearrangements have successfully been used in phylogenetic reconstruction and comparative genomics (see the survey of [13] and the monograph of [15]), but usually under the assumption that all genomes have the same gene content and that no gene is duplicated. While these assumptions allow one to work with organellar genomes [2 7, 11, 18], they are too restrictive when comparing nuclear genomes [8], where the main challenge is how to deal with gene families, specifically, how to identify orthologs. While searching for orthologies is a common task in computational biology, it is usually done using sequence data. We approach that problem using gene rearrangement data. Sankoff [14] first addressed this problem with his introduction of exemplars, in which he suggested identifying a single gene within each family (the exemplar) on the basis of a parsimonious criterion (using the fewest rearrangements) and discarding all others. Our group provided an alternate approach in which a correspondence is established between gene families on the basis of conserved segments [12, 17]; our results suggested that considering all members of a gene family yields better results than keeping only exemplars, but were limited in that the assignment of orthologs did not take into account any rearrangement structure beyond conserved segments. In this paper, we remedy this problem by providing an

2 optimization framework derived from the breakpoint graph (the basic structure behind the last decade of work in gene rearrangements [10]) in which to phrase the problem; we give preliminary theoretical results in support of our framework. We are not suggesting that orthology assignment based on rearrangement data should replace that based on sequence data, but that the two complement each other indeed, that we should aim for methods that will eventually take both types of data into account in the same analysis. 2 Preliminaries The problem we consider can be phrased as follows: given a set of genes S and two genomes, G 1 and G 2, where each genome is represented as a (linear or circular) sequence of elements of S (an element may occur zero, one, or many times within the sequence), each with an associated sign (which basically denotes which strand the gene lies on), find the shortest edit sequence, that is, the shortest sequence of evolutionary events that transforms one genome into the other. Permitted evolutionary events are inversions, which take a subsequence of genes and reverse it in place (in both order and signs), deletions (each of which removes a consecutive subsequence of genes), and insertions (including duplications). A parsimony constraint is also imposed on any editing scenario: if G 1 has a family of k 1 genes and G 2 a corresponding family of k 2 genes, for k 1 k 2, then none of the k 2 genes in G 2 s family may be deleted in the edit sequence instead, we must identify within G 1 s family of k 1 genes a distinct ortholog for each of the k 2 genes in G 2 s family. (In absence of this constraint, of course, the most parsimonious edit sequence is almost always that which deletes the entire genome G 1 as a single operation, then insert the entire genome G 2, an absurd scenario.) Once that identification has been made, the algorithms of El- Mabrouk [9] and of our group [8, 12, 17, 18] can complete the work of finding one ore more parsimonious edit sequences. In this paper, we assume that genome G 2 contains no duplicate genes, although we illustrate how our framework describes the more general case. Moreover, we assume that gene families present in one genome but not the other have been removed these families do not affect orthology assignment, which is simply a mapping between the elements of gene families present in both genomes. Finally, we assume that the remaining genes have been indexed from 1 to n so as to turn G 2 into the identity permutation n; as is customary, we will prepend a marker gene 0 and append another marker gene n + 1 to both genomes. The edit distance can be affected in two ways by the orthology assignment. A good choice of orthologies can reduce the required number of deletions (or duplications) this is the main focus of the cover-based method [12, 17]. It can also reduce the number of required inversions by grouping them properly: this

3 is the focus of this paper. We rely on the fact that every gene in a one-member family of G 1 must be assigned to its corresponding gene in G 2 and that these singleton genes must all be sorted through inversions: because we know how to sort by inversions [1, 10], the presence of singleton genes creates a structural contex in which to study orthology assignment. 3 Background and Definitions 3.1 The Breakpoint Graph The basic structure describing a pair of genomes with no duplicates and equal gene content is the breakpoint graph (really a multigraph) for a careful and readable description of its construction, see [16]. In our case, however, gene PSfrag replacements families in G 1 need not be singletons, so we need to extend the construction. Let B(G 1 ) = (V,E) denote the breakpoint graph for G 1 and G 2 (because of our conventions, G 2 is known once G 1 is). As in the regular breakpoint graph, each singleton gene g in G 1 becomes a pair of vertices, g and g + (the negative and positive terminals), joined by an edge; we leave out the gene families with multiple members, since only the singletons have a well-defined structure, but we now need to accommodate gaps left in the sequence where duplicate genes exist in G 1. We add a desire edge (in the charming terminology of [16] also known elsewhere as a gray edge) (a i,b + j ), for each member i and j of gene families a and b, respectively, whenever a and b differ by one in the indexing (i.e., are neighbors in G 2 ). We add a reality edge (a p,b q ) if a is the element to the left of b in G 1 and either p = q if a and b have different parities (in G 1 naturally) or p q if a and b have the same parity. Figure 1 illustrates the construction. Note that the figure shows a circular ordering, in which the added sentinels are considered adjacent to each other. a) G 1 = Cycle A Component 1 Cycle B Component 2 Cycle C Component 3 Cycle D G 1 = Fig. 1. (a) The original genome G 1. ( Its associated graph B(G 1 ) after removing gene families with duplicates (3 and 9). Desire edges are shown in gray, reality edges in black.

4 The inversion distance equals the number of genes minus the number of cycles in the breakpoint graph, plus some corrective factors (hurdles and a fortress) [10]. Researchers have found that hurdles are very rare in real data, so we focus on selecting an orthology assignment that maximizes the number of cycles. 3.2 The Consequences of An Assignment We call each gene in a multigene family of G 1 a candidate, since it is one of the choices for an orthology assignment to the unique corresponding gene in G 2. For each candidate d, denote by β + (d) the positive terminal of the next smaller element in B(G 1 ) and by β (d) the negative terminal of the next larger element; we call these nodes the bookends of d. Choosing a candidate imposes an additional constraint on B(G 1 ), which we now proceed to examine. The following simple lemma underlies many of our results. Lemma 1. When a candidate d is chosen, the only edges affected are the reality edge that spans it and the edge between its bookends. Proof. As shown in Figure 2, when d is added to the breakpoint graph, the reality edge that spans it gets split, creating two new endpoints d β + and d β. The desire edge that links the bookends of d also gets split, to meet each of d β + and d β. PSfrag replacements a) add d β + β d β + d β β + β cycle d β + d β β β + Fig. 2. (a) A subgraph of B(G 1 ) before adding d. ( The two choices after adding d. Dotted lines indicate a path (and therefore a cycle). 3.3 The Cycle Splitting Problem As observed above, we can formulate the orthology assignment problem as an optimization problem within the context of the breakpoint graph B(G 1 ): choose an assignment of orthologs (one from each multigene family in G 1 ) such that the number of cycles in the augmented breakpoint graph (B(G 1 ) to which the chosen candidates have been added) has the largest possible number of cycles.

5 Fig. 3. The breakpoint graphs for the two candidates for gene 9. Consider component 2 from Figure 1, namely (6,9,8, 10, 7,9,11). There are two occurrences of gene 9 and we must choose which one to call orthologous with gene 9 in G 2. Figure 3 shows the two breakpoint graphs. Note that the graph on left, where the candidate lies between 6 and 8, has one more cycle than the graph on the right, where the candidate lies between 7 and 11; thus the first candidate is a better choice. The choice of candidate is advantageously viewed on breakpoint graph inscribed in a circle as shown in Figure 4. Now overlay the two choices into a single graph, as shown in Figure 4(. Two curved lines meet on the perimeter between 10 and 8 +, denoting the two choices. The solid line indicates that choosing the candidate between 6 and 8 gives rise to desire edges that do not cross in the inscribed representation. The dashed line indicates that the other candidate gives rise to crossing desire edges. Each line meets the perimeter at one end between the two terminals of the candidate and at the other end between its bookends. Figure 5(a) illustrates a more general instance with three multigene families, while Figure 5( shows how this representation can be used for the general many-to-many case. Note that the solid and dashed lines we have introduced really represent orthology assignments, or operations; we will call an operation represented by a solid line a straight operation (because it does not introduce crossings) and one represented by a dashed line a cross operation. The collection of all lines a) Fig. 4. (a) The graphs of Figure 3 inscribed in a circle. ( The result of overlaying the two graphs from part (a).

6 a) Fig. 5. (a) An instance of the many-to-one cycle splitting problem. ( An instance of the manyto-many cycle splitting problem. that share an endpoint represents all members of the gene family in G 1, so we also call it a family and call its common endpoint (between the bookends) the family home. We can now state the constraints for the optimization problem: (i) each family home is a distinct point on the circle; (ii) the family home is not the endpoint of any operation not in that family; and (iii) the other endpoint (on the circle) of each operation is unique to that operation. The many-to-many variation gives rise to multiple homes per family. Each home in the same family must connect to all of the same endpoints. The problem thus becomes picking as many operations as there are homes per family such that the cycle count is maximized. The only additional complication is that applying an operation removes that operation from consideration in all other homes for its family. See Figure 5( for an example instance as well as the effect of applying an operation. Straight and cross operations display a form of duality, one that allows us to restrict our attention to just straight operations. Theorem 1. Applying a cross operation c converts all operations that intersect c (call the set of such operations I) to their complement crosses are replaced by straights and straights by crosses. Furthermore, for any two operations in I, if they intersected before applying c, then they no longer do after applying c, and vice versa. Figure 6 sketches the main elements of the omitted proof. 4 Theoretical Results 4.1 Buried Operations An operation makes no contribution to the cycle count of a a complete assignment whenever the two new desire edges it creates lie on the same cycle. Because of their appearance in our inscribed graph representation, we call such operations buried see Figure 7. Since the two desire edges occur on the same cycle, they bury the chosen operation in the cycle.

7 a) c Fig. 6. The visual argument for Theorem 1. 6 a) 5 c) Fig. 7. (a) A new configuration; ( a buried edge for that configuration; and (c) the visual appearance of the buried edge, inside two segments of the same cycle. Theorem 2. If an orthology assignment creates a total of k buried edges, then the number of cycles is bounded by n b + 1. Proof. The number of cycles cannot exceed n + 1, since each operation can give rise to at most one new cycle. Consider the effect on the breakpoint graph of choosing a buried operation. A single desire edge d is replaced with two desire edges d 1 and d 2, and a single reality edge r is replaced with two reality edges r 1 and r 2. Without loss of generality assume that d 1 connects to r 1 and d 2 connects to r 2. d 1 and d 2 each inherit one of the original endpoints of d. Similarly, r 1 and r 2 each inherit one of the original endpoints of r. By assumption d 1 and d 2 lie on the same cycle, so that so do all of the original endpoints of d and r. Thus all of the newly created edges must lie on a cycle that already existed. 4.2 Chains and Stars We have discovered two operation patterns that, while they need not contain buried operations, nevertherless impose sharp bounds on the number of cycles. A k-chain is an assignment in which k operations form a chain, that is, each chosen operation overlaps two of the other k, its predecessor and successor around the circle. Figure 8(a, illustrates a k-chain.

8 a) c) d) Fig. 8. (a) A 4-chain; ( a 5-chain; (c) a 3-star; (d) a 4-star. Proposition 1. A k-chain has no buried operations. In a k-chain with k odd, the cycle count is 2. In a k-chain with k even, the cycle count is 3. A k-star is an assignment in which k operations form a clique (each overlaps every other). Figure 8(c,d) illustrates a k-star. Proposition 2. In a k-star with k even, every operation is buried and the cycle count is 1. In a k-star with k odd, no operation is buried and the cycle count is 2. We conjecture that these two patterns, along with buried operations, describe all operations that reduce the upper bounds on the number of cycles. Figure 9 shows examples of combining k-stars and k-chains. a) Fig. 9. (a) A 3-star and two 4-chains. ( Four 3-stars. 4.3 Reduced Forms Clearly, a successive assignment procedure could reach a state in which no operation remains that could split a cycle. We call such a state a reduced form of the instance. In a reduced form, an instance is composed of multiple cycles linked by the operations from the remaining families. This structure lends itself naturally to a graph representation; an analysis of this graph reveals conditions under which optimality can be verified.

9 a) Fig. 10. (a) An instance of the cycle splitting problem. The operations without a family home (bold) are those that we apply to obtain a reduced form. ( The resulting reduced instance. Black edges are those chosen to produce an optimal solution to the reduced instance. Theorem 3. After applying a maximal nonoverlapping set of operations M, remaining operations can only (by themselves) join two cycles. Proof. The application of a set of k nonoverlapping operations always yields k new cycles, each defined either by two contiguous operations or by a single operation and the original circle. Since M is maximal, every remaining operation from every family overlaps an element of M. Application of any m M, therefore, must span two of the new cycles, joining them into one. Figure 10( shows the reduced instance induced by applying each of the operations shown in bold in Figure 10(a). We are left with a reduced form that can be viewed as a graph on the cycles created so far and in which each adjacency list is strictly ordered (because exactly one operation from a family will remain). We can now take advantage of graph properties such as planarity, circuits, and connected components in our analysis. Because of the ordered nature of the edges on the vertices planarity is somewhat specialized in our case. Nonplanar edges can occur in simpler situations than in general graphs, as shown in Figure 11(. Circuits play a vital role on these graphs. In particular, we are interested in circuits that are subcircuits of no other, i.e., minimal circuits. Provided a) c) Fig. 11. (a) The effect of applying an operation on a reduced form. ( A reduced form. The dotted and dashed lines trace the cycle created by applying the operations shown. (c) Adding a nonplanar edge to the reduced form from ( joins the cycles.

10 a) Fig. 12. An optimal solution to the reduced instance in Figure 10 embedded as (a) the original instance and ( its reduced form. there is no nonplanar edge in a reduced instance, the properties of a solution directly dictate the number of cycles produced. As shown in Figure 11, each connected component produces a cycle around its outer hull. Each minimal circuit yields another cycle to its inside. Figure 11(c) shows how nonplanar edges can join these two cycles. Theorem 4. The number of cycles produced by a solution S to a planar reduced instance with m minimal circuits and c connected components is R(S) = m + c. Proof. This certainly holds for a reduced instance with no operations. Assume R(S) = m+c for a solution where S = k. We look at the effect of adding another edge. 1. If that edge links two previously disconnected components, then the cycles around the hulls of these components will get merged, removing a cycle and a connected component. 2. If that edge links two connected components, then a minimal circuit will be created. Since the edge added is planar, we know that the same cycle runs past both endpoints of the operations and thus the operation will split it. It remains to relate results on reduced forms back to the original inscribed breakpoint graph formulation, a process illustrated in Figure Conclusion We have described a graph-theoretical framework in which to represent and reason about orthology assignments and their effect on the number of cycles present in the resulting breakpoint graph. We have given some foundational results about this framework, including several that point us directly to to algorithmic strategies for optimizing this assignment. We believe that this framework will lead to a characterization of the orthology assignment problem as well as to the development of practical algorithm solutions.

11 References 1. D.A. Bader, B.M.E. Moret, and M. Yan. A fast linear-time algorithm for inversion distance with an experimental comparison. J. Comput. Biol., 8(5): , M. Blanchette, T. Kunisawa, and D. Sankoff. Gene order breakpoint evidence in animal mitochondrial phylogeny. J. Mol. Evol., 49: , J.L. Boore. Phylogenies derived from rearrangements of the mitochondrial genome. In N. Saitou, editor, Proc. Int l Inst. for Advanced Studies Symp. on Biodiversity, pages 9 20, Kyoto, Japan, J.L. Boore and W.M. Brown. Big trees from little genomes: Mitochondrial gene order as a phylogenetic tool. Curr. Opinion Genet. Dev., 8(6): , J.L. Boore, T. Collins, D. Stanton, L. Daehler, and W.M. Brown. Deducing the pattern of arthropod phylogeny from mitochondrial DNA rearrangements. Nature, 376: , M.E. Cosner, R.K. Jansen, B.M.E. Moret, L.A. Raubeson, L. Wang, T. Warnow, and S.K. Wyman. An empirical comparison of phylogenetic methods on chloroplast gene order data in Campanulaceae. In D. Sankoff and J.H. Nadeau, editors, Comparative Genomics, pages Kluwer Academic Publishers, S. R. Downie and J. D. Palmer. Use of chloroplast DNA rearrangements in reconstructing plant phylogeny. In D.E. Soltis, P.S. Soltis, and J.J. Doyle, editors, Molecular Systematics of Plants, pages Chapman and Hall, New York, J. Earnest-DeYoung, E. Lerat, and B.M.E. Moret. Reversing gene erosion: reconstructing ancestral bacterial genomes from gene-content and gene-order data. In Proc. 4th Int l Workshop Algs. in Bioinformatics (WABI 04), volume 3240 of Lecture Notes in Computer Science, pages Springer Verlag, N. El-Mabrouk. Genome rearrangement by reversals and insertions/deletions of contiguous segments. In Proc. 11th Ann. Symp. Combin. Pattern Matching (CPM 00), volume 1848 of Lecture Notes in Computer Science, pages Springer Verlag, S. Hannenhalli and P.A. Pevzner. Transforming cabbage into turnip (polynomial algorithm for sorting signed permutations by reversals). In Proc. 27th Ann. ACM Symp. Theory of Comput. (STOC 95), pages ACM Press, New York, B. Larget, D.L. Simon, and J.B. Kadane. Bayesian phylogenetic inference from animal mitochondrial genome arrangements. J. Royal Stat. Soc. B, 64(4): , M. Marron, K.M. Swenson, and B.M.E. Moret. Genomic distances under deletions and insertions. Theor. Computer Science, 325(3): , B.M.E. Moret, J. Tang, and T. Warnow. Reconstructing phylogenies from gene-content and gene-order data. In O. Gascuel, editor, Mathematics of Evolution and Phylogeny, pages Oxford University Press, D. Sankoff. Genome rearrangement with gene families. Bioinformatics, 15(11): , D. Sankoff and J. Nadeau, editors. Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, Map Alignment, and the Evolution of Gene Families. Kluwer Academic Pubs., Dordrecht, Netherlands, J.C. Setubal and J. Meidanis. Introduction to Computational Molecular Biology. PWS Publishers, Boston, MA, K.M. Swenson, M. Marron, J.V. Earnest-DeYoung, and B.M.E. Moret. Approximating the true evolutionary distance between two genomes. In Proc. 7th SIAM Workshop on Algorithm Engineering & Experiments (ALENEX 05). SIAM Press, Philadelphia, J. Tang, B.M.E. Moret, L. Cui, and C.W. depamphilis. Phylogenetic reconstruction from arbitrary gene-order data. In Proc. 4th IEEE Symp. on Bioinformatics and Bioengineering BIBE 04, pages IEEE Press, Piscataway, NJ, 2004.

AN EXACT SOLVER FOR THE DCJ MEDIAN PROBLEM

AN EXACT SOLVER FOR THE DCJ MEDIAN PROBLEM AN EXACT SOLVER FOR THE DCJ MEDIAN PROBLEM MENG ZHANG College of Computer Science and Technology, Jilin University, China Email: zhangmeng@jlueducn WILLIAM ARNDT AND JIJUN TANG Dept of Computer Science

More information

Fast Phylogenetic Methods for the Analysis of Genome Rearrangement Data: An Empirical Study

Fast Phylogenetic Methods for the Analysis of Genome Rearrangement Data: An Empirical Study Fast Phylogenetic Methods for the Analysis of Genome Rearrangement Data: An Empirical Study Li-San Wang Robert K. Jansen Dept. of Computer Sciences Section of Integrative Biology University of Texas, Austin,

More information

Reversing Gene Erosion Reconstructing Ancestral Bacterial Genomes from Gene-Content and Order Data

Reversing Gene Erosion Reconstructing Ancestral Bacterial Genomes from Gene-Content and Order Data Reversing Gene Erosion Reconstructing Ancestral Bacterial Genomes from Gene-Content and Order Data Joel V. Earnest-DeYoung 1, Emmanuelle Lerat 2, and Bernard M.E. Moret 1,3 Abstract In the last few years,

More information

Steps Toward Accurate Reconstructions of Phylogenies from Gene-Order Data 1

Steps Toward Accurate Reconstructions of Phylogenies from Gene-Order Data 1 Steps Toward Accurate Reconstructions of Phylogenies from Gene-Order Data Bernard M.E. Moret, Jijun Tang, Li-San Wang, and Tandy Warnow Department of Computer Science, University of New Mexico Albuquerque,

More information

BIOINFORMATICS. Scaling Up Accurate Phylogenetic Reconstruction from Gene-Order Data. Jijun Tang 1 and Bernard M.E. Moret 1

BIOINFORMATICS. Scaling Up Accurate Phylogenetic Reconstruction from Gene-Order Data. Jijun Tang 1 and Bernard M.E. Moret 1 BIOINFORMATICS Vol. 1 no. 1 2003 Pages 1 8 Scaling Up Accurate Phylogenetic Reconstruction from Gene-Order Data Jijun Tang 1 and Bernard M.E. Moret 1 1 Department of Computer Science, University of New

More information

Perfect Sorting by Reversals and Deletions/Insertions

Perfect Sorting by Reversals and Deletions/Insertions The Ninth International Symposium on Operations Research and Its Applications (ISORA 10) Chengdu-Jiuzhaigou, China, August 19 23, 2010 Copyright 2010 ORSC & APORC, pp. 512 518 Perfect Sorting by Reversals

More information

Isolating - A New Resampling Method for Gene Order Data

Isolating - A New Resampling Method for Gene Order Data Isolating - A New Resampling Method for Gene Order Data Jian Shi, William Arndt, Fei Hu and Jijun Tang Abstract The purpose of using resampling methods on phylogenetic data is to estimate the confidence

More information

Phylogenetic Reconstruction from Arbitrary Gene-Order Data

Phylogenetic Reconstruction from Arbitrary Gene-Order Data Phylogenetic Reconstruction from Arbitrary Gene-Order Data Jijun Tang and Bernard M.E. Moret University of New Mexico Department of Computer Science Albuquerque, NM 87131, USA jtang,moret@cs.unm.edu Liying

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics Adapted from slides by Alexandru Tomescu, Leena Salmela, Veli Mäkinen, Esa Pitkänen 582670 Algorithms for Bioinformatics Lecture 5: Combinatorial Algorithms and Genomic Rearrangements 1.10.2015 Background

More information

Advances in Phylogeny Reconstruction from Gene Order and Content Data

Advances in Phylogeny Reconstruction from Gene Order and Content Data Advances in Phylogeny Reconstruction from Gene Order and Content Data Bernard M.E. Moret Department of Computer Science, University of New Mexico, Albuquerque NM 87131 Tandy Warnow Department of Computer

More information

On the complexity of unsigned translocation distance

On the complexity of unsigned translocation distance Theoretical Computer Science 352 (2006) 322 328 Note On the complexity of unsigned translocation distance Daming Zhu a, Lusheng Wang b, a School of Computer Science and Technology, Shandong University,

More information

BIOINFORMATICS. New approaches for reconstructing phylogenies from gene order data. Bernard M.E. Moret, Li-San Wang, Tandy Warnow and Stacia K.

BIOINFORMATICS. New approaches for reconstructing phylogenies from gene order data. Bernard M.E. Moret, Li-San Wang, Tandy Warnow and Stacia K. BIOINFORMATICS Vol. 17 Suppl. 1 21 Pages S165 S173 New approaches for reconstructing phylogenies from gene order data Bernard M.E. Moret, Li-San Wang, Tandy Warnow and Stacia K. Wyman Department of Computer

More information

New Approaches for Reconstructing Phylogenies from Gene Order Data

New Approaches for Reconstructing Phylogenies from Gene Order Data New Approaches for Reconstructing Phylogenies from Gene Order Data Bernard M.E. Moret Li-San Wang Tandy Warnow Stacia K. Wyman Abstract We report on new techniques we have developed for reconstructing

More information

Phylogenetic Networks, Trees, and Clusters

Phylogenetic Networks, Trees, and Clusters Phylogenetic Networks, Trees, and Clusters Luay Nakhleh 1 and Li-San Wang 2 1 Department of Computer Science Rice University Houston, TX 77005, USA nakhleh@cs.rice.edu 2 Department of Biology University

More information

The combinatorics and algorithmics of genomic rearrangements have been the subject of much

The combinatorics and algorithmics of genomic rearrangements have been the subject of much JOURNAL OF COMPUTATIONAL BIOLOGY Volume 22, Number 5, 2015 # Mary Ann Liebert, Inc. Pp. 425 435 DOI: 10.1089/cmb.2014.0096 An Exact Algorithm to Compute the Double-Cutand-Join Distance for Genomes with

More information

Improving Tree Search in Phylogenetic Reconstruction from Genome Rearrangement Data

Improving Tree Search in Phylogenetic Reconstruction from Genome Rearrangement Data Improving Tree Search in Phylogenetic Reconstruction from Genome Rearrangement Data Fei Ye 1,YanGuo, Andrew Lawson 1, and Jijun Tang, 1 Department of Epidemiology and Biostatistics University of South

More information

Genes order and phylogenetic reconstruction: application to γ-proteobacteria

Genes order and phylogenetic reconstruction: application to γ-proteobacteria Genes order and phylogenetic reconstruction: application to γ-proteobacteria Guillaume Blin 1, Cedric Chauve 2 and Guillaume Fertin 1 1 LINA FRE CNRS 2729, Université de Nantes 2 rue de la Houssinière,

More information

On Reversal and Transposition Medians

On Reversal and Transposition Medians On Reversal and Transposition Medians Martin Bader International Science Index, Computer and Information Engineering waset.org/publication/7246 Abstract During the last years, the genomes of more and more

More information

Genome Rearrangements In Man and Mouse. Abhinav Tiwari Department of Bioengineering

Genome Rearrangements In Man and Mouse. Abhinav Tiwari Department of Bioengineering Genome Rearrangements In Man and Mouse Abhinav Tiwari Department of Bioengineering Genome Rearrangement Scrambling of the order of the genome during evolution Operations on chromosomes Reversal Translocation

More information

Recent Advances in Phylogeny Reconstruction

Recent Advances in Phylogeny Reconstruction Recent Advances in Phylogeny Reconstruction from Gene-Order Data Bernard M.E. Moret Department of Computer Science University of New Mexico Albuquerque, NM 87131 Department Colloqium p.1/41 Collaborators

More information

High-Performance Algorithm Engineering for Large-Scale Graph Problems and Computational Biology

High-Performance Algorithm Engineering for Large-Scale Graph Problems and Computational Biology High-Performance Algorithm Engineering for Large-Scale Graph Problems and Computational Biology David A. Bader Electrical and Computer Engineering Department, University of New Mexico, Albuquerque, NM

More information

Non-independence in Statistical Tests for Discrete Cross-species Data

Non-independence in Statistical Tests for Discrete Cross-species Data J. theor. Biol. (1997) 188, 507514 Non-independence in Statistical Tests for Discrete Cross-species Data ALAN GRAFEN* AND MARK RIDLEY * St. John s College, Oxford OX1 3JP, and the Department of Zoology,

More information

A Practical Algorithm for Ancestral Rearrangement Reconstruction

A Practical Algorithm for Ancestral Rearrangement Reconstruction A Practical Algorithm for Ancestral Rearrangement Reconstruction Jakub Kováč, Broňa Brejová, and Tomáš Vinař 2 Department of Computer Science, Faculty of Mathematics, Physics, and Informatics, Comenius

More information

Let S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely

Let S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely JOURNAL OF COMPUTATIONAL BIOLOGY Volume 8, Number 1, 2001 Mary Ann Liebert, Inc. Pp. 69 78 Perfect Phylogenetic Networks with Recombination LUSHENG WANG, 1 KAIZHONG ZHANG, 2 and LOUXIN ZHANG 3 ABSTRACT

More information

On the tandem duplication-random loss model of genome rearrangement

On the tandem duplication-random loss model of genome rearrangement On the tandem duplication-random loss model of genome rearrangement Kamalika Chaudhuri Kevin Chen Radu Mihaescu Satish Rao November 1, 2005 Abstract We initiate the algorithmic study of a new model of

More information

Jijun Tang and Bernard M.E. Moret. Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA

Jijun Tang and Bernard M.E. Moret. Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA BIOINFORMATICS Vol. 19 Suppl. 1 2003, pages i30 i312 DOI:.93/bioinformatics/btg42 Scaling up accurate phylogenetic reconstruction from gene-order data Jijun Tang and Bernard M.E. Moret Department of Computer

More information

Consecutive ones Block for Symmetric Matrices

Consecutive ones Block for Symmetric Matrices Consecutive ones Block for Symmetric Matrices Rui Wang, FrancisCM Lau Department of Computer Science and Information Systems The University of Hong Kong, Hong Kong, PR China Abstract We show that a cubic

More information

A Bayesian Analysis of Metazoan Mitochondrial Genome Arrangements

A Bayesian Analysis of Metazoan Mitochondrial Genome Arrangements A Bayesian Analysis of Metazoan Mitochondrial Genome Arrangements Bret Larget,* Donald L. Simon, Joseph B. Kadane,à and Deborah Sweet 1 *Departments of Botany and of Statistics, University of Wisconsin

More information

A New Fast Heuristic for Computing the Breakpoint Phylogeny and Experimental Phylogenetic Analyses of Real and Synthetic Data

A New Fast Heuristic for Computing the Breakpoint Phylogeny and Experimental Phylogenetic Analyses of Real and Synthetic Data A New Fast Heuristic for Computing the Breakpoint Phylogeny and Experimental Phylogenetic Analyses of Real and Synthetic Data Mary E. Cosner Dept. of Plant Biology Ohio State University Li-San Wang Dept.

More information

4 CONNECTED PROJECTIVE-PLANAR GRAPHS ARE HAMILTONIAN. Robin Thomas* Xingxing Yu**

4 CONNECTED PROJECTIVE-PLANAR GRAPHS ARE HAMILTONIAN. Robin Thomas* Xingxing Yu** 4 CONNECTED PROJECTIVE-PLANAR GRAPHS ARE HAMILTONIAN Robin Thomas* Xingxing Yu** School of Mathematics Georgia Institute of Technology Atlanta, Georgia 30332, USA May 1991, revised 23 October 1993. Published

More information

RECOVERY OF NON-LINEAR CONDUCTIVITIES FOR CIRCULAR PLANAR GRAPHS

RECOVERY OF NON-LINEAR CONDUCTIVITIES FOR CIRCULAR PLANAR GRAPHS RECOVERY OF NON-LINEAR CONDUCTIVITIES FOR CIRCULAR PLANAR GRAPHS WILL JOHNSON Abstract. We consider the problem of recovering nonlinear conductances in a circular planar graph. If the graph is critical

More information

Discrete Applied Mathematics

Discrete Applied Mathematics Discrete Applied Mathematics 159 (2011) 1641 1645 Contents lists available at ScienceDirect Discrete Applied Mathematics journal homepage: www.elsevier.com/locate/dam Note Girth of pancake graphs Phillip

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Analysis of Gene Order Evolution beyond Single-Copy Genes

Analysis of Gene Order Evolution beyond Single-Copy Genes Analysis of Gene Order Evolution beyond Single-Copy Genes Nadia El-Mabrouk Département d Informatique et de Recherche Opérationnelle Université de Montréal mabrouk@iro.umontreal.ca David Sankoff Department

More information

An Improved Algorithm for Ancestral Gene Order Reconstruction

An Improved Algorithm for Ancestral Gene Order Reconstruction V. Kůrková et al. (Eds.): ITAT 2014 with selected papers from Znalosti 2014, CEUR Workshop Proceedings Vol. 1214, pp. 46 53 http://ceur-ws.org/vol-1214, Series ISSN 1613-0073, c 2014 A. Herencsár B. Brejová

More information

Maximal Independent Sets In Graphs With At Most r Cycles

Maximal Independent Sets In Graphs With At Most r Cycles Maximal Independent Sets In Graphs With At Most r Cycles Goh Chee Ying Department of Mathematics National University of Singapore Singapore goh chee ying@moe.edu.sg Koh Khee Meng Department of Mathematics

More information

Theoretical Computer Science

Theoretical Computer Science Theoretical Computer Science 410 (2009) 2759 2766 Contents lists available at ScienceDirect Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs Note Computing the longest topological

More information

Phylogenetic Reconstruction from Gene-Order Data

Phylogenetic Reconstruction from Gene-Order Data p.1/7 Phylogenetic Reconstruction from Gene-Order Data Bernard M.E. Moret compbio.unm.edu Department of Computer Science University of New Mexico p.2/7 Acknowledgments Close Collaborators: at UNM: David

More information

Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement ABSTRACT

Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement ABSTRACT JOURNAL OF COMPUTATIONAL BIOLOGY Volume 12, Number 6, 2005 Mary Ann Liebert, Inc. Pp. 812 821 Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement DAVID SANKOFF 1 and PHIL TRINH 2 ABSTRACT In

More information

Generating p-extremal graphs

Generating p-extremal graphs Generating p-extremal graphs Derrick Stolee Department of Mathematics Department of Computer Science University of Nebraska Lincoln s-dstolee1@math.unl.edu August 2, 2011 Abstract Let f(n, p be the maximum

More information

Genome Rearrangement

Genome Rearrangement Genome Rearrangement David Sankoff Nadia El-Mabrouk 1 Introduction The difference between genome rearrangement theory and other approaches to comparative genomics, and indeed most other topics in computational

More information

Multiple Whole Genome Alignment

Multiple Whole Genome Alignment Multiple Whole Genome Alignment BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 206 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

Maximising the number of induced cycles in a graph

Maximising the number of induced cycles in a graph Maximising the number of induced cycles in a graph Natasha Morrison Alex Scott April 12, 2017 Abstract We determine the maximum number of induced cycles that can be contained in a graph on n n 0 vertices,

More information

Evolution of Tandemly Arrayed Genes in Multiple Species

Evolution of Tandemly Arrayed Genes in Multiple Species Evolution of Tandemly Arrayed Genes in Multiple Species Mathieu Lajoie 1, Denis Bertrand 1, and Nadia El-Mabrouk 1 DIRO - Université de Montréal - H3C 3J7 - Canada {bertrden,lajoimat,mabrouk}@iro.umontreal.ca

More information

Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations

Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations Exploring Phylogenetic Relationships in Drosophila with Ciliate Operations Jacob Herlin, Anna Nelson, and Dr. Marion Scheepers Department of Mathematical Sciences, University of Northern Colorado, Department

More information

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels

More information

Evolutionary Tree Analysis. Overview

Evolutionary Tree Analysis. Overview CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based

More information

GASTS: Parsimony Scoring under Rearrangements

GASTS: Parsimony Scoring under Rearrangements GASTS: Parsimony Scoring under Rearrangements Andrew Wei Xu and Bernard M.E. Moret Laboratory for Computational Biology and Bioinformatics, EPFL, EPFL-IC-LCBB INJ230, Station 14, CH-1015 Lausanne, Switzerland

More information

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT

THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT COMMUNICATIONS IN INFORMATION AND SYSTEMS c 2009 International Press Vol. 9, No. 4, pp. 295-302, 2009 001 THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT DAN GUSFIELD AND YUFENG WU Abstract.

More information

A Separator Theorem for Graphs with an Excluded Minor and its Applications

A Separator Theorem for Graphs with an Excluded Minor and its Applications A Separator Theorem for Graphs with an Excluded Minor and its Applications Noga Alon IBM Almaden Research Center, San Jose, CA 95120,USA and Sackler Faculty of Exact Sciences, Tel Aviv University, Tel

More information

CSCI1950 Z Computa4onal Methods for Biology Lecture 5

CSCI1950 Z Computa4onal Methods for Biology Lecture 5 CSCI1950 Z Computa4onal Methods for Biology Lecture 5 Ben Raphael February 6, 2009 hip://cs.brown.edu/courses/csci1950 z/ Alignment vs. Distance Matrix Mouse: ACAGTGACGCCACACACGT Gorilla: CCTGCGACGTAACAAACGC

More information

FRACTIONAL PACKING OF T-JOINS. 1. Introduction

FRACTIONAL PACKING OF T-JOINS. 1. Introduction FRACTIONAL PACKING OF T-JOINS FRANCISCO BARAHONA Abstract Given a graph with nonnegative capacities on its edges, it is well known that the capacity of a minimum T -cut is equal to the value of a maximum

More information

Phylogenetic Reconstruction

Phylogenetic Reconstruction Phylogenetic Reconstruction from Gene-Order Data Bernard M.E. Moret compbio.unm.edu Department of Computer Science University of New Mexico p. 1/71 Acknowledgments Close Collaborators: at UNM: David Bader

More information

Multiple genome rearrangement

Multiple genome rearrangement Multiple genome rearrangement David Sankoff Mathieu Blanchette 1 Introduction Multiple alignment of macromolecular sequences, an important topic of algorithmic research for at least 25 years [13, 10],

More information

Genome Rearrangement. 1 Inversions. Rick Durrett 1

Genome Rearrangement. 1 Inversions. Rick Durrett 1 Genome Rearrangement Rick Durrett 1 Dept. of Math., Cornell U., Ithaca NY, 14853 rtd1@cornell.edu Genomes evolve by chromosomal fissions and fusions, reciprocal translocations between chromosomes, and

More information

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB I519 Introduction to Bioinformatics, 2015 Genome Comparison Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Whole genome comparison/alignment Build better phylogenies Identify polymorphism

More information

On the intersection of infinite matroids

On the intersection of infinite matroids On the intersection of infinite matroids Elad Aigner-Horev Johannes Carmesin Jan-Oliver Fröhlich University of Hamburg 9 July 2012 Abstract We show that the infinite matroid intersection conjecture of

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS

NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS NOTE ON THE HYBRIDIZATION NUMBER AND SUBTREE DISTANCE IN PHYLOGENETICS PETER J. HUMPHRIES AND CHARLES SEMPLE Abstract. For two rooted phylogenetic trees T and T, the rooted subtree prune and regraft distance

More information

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies 1 What is phylogeny? Essay written for the course in Markov Chains 2004 Torbjörn Karfunkel Phylogeny is the evolutionary development

More information

HW Graph Theory SOLUTIONS (hbovik) - Q

HW Graph Theory SOLUTIONS (hbovik) - Q 1, Diestel 3.5: Deduce the k = 2 case of Menger s theorem (3.3.1) from Proposition 3.1.1. Let G be 2-connected, and let A and B be 2-sets. We handle some special cases (thus later in the induction if these

More information

The Mixed Chinese Postman Problem Parameterized by Pathwidth and Treedepth

The Mixed Chinese Postman Problem Parameterized by Pathwidth and Treedepth The Mixed Chinese Postman Problem Parameterized by Pathwidth and Treedepth Gregory Gutin, Mark Jones, and Magnus Wahlström Royal Holloway, University of London Egham, Surrey TW20 0EX, UK Abstract In the

More information

EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

More information

Generalized Pigeonhole Properties of Graphs and Oriented Graphs

Generalized Pigeonhole Properties of Graphs and Oriented Graphs Europ. J. Combinatorics (2002) 23, 257 274 doi:10.1006/eujc.2002.0574 Available online at http://www.idealibrary.com on Generalized Pigeonhole Properties of Graphs and Oriented Graphs ANTHONY BONATO, PETER

More information

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9 Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic

More information

Algorithms for pattern involvement in permutations

Algorithms for pattern involvement in permutations Algorithms for pattern involvement in permutations M. H. Albert Department of Computer Science R. E. L. Aldred Department of Mathematics and Statistics M. D. Atkinson Department of Computer Science D.

More information

Genome Rearrangement Problems with Single and Multiple Gene Copies: A Review

Genome Rearrangement Problems with Single and Multiple Gene Copies: A Review Genome Rearrangement Problems with Single and Multiple Gene Copies: A Review Ron Zeira and Ron Shamir August 9, 2018 Dedicated to Bernard Moret upon his retirement. Abstract Genome rearrangement problems

More information

arxiv: v1 [cs.ds] 21 May 2013

arxiv: v1 [cs.ds] 21 May 2013 Easy identification of generalized common nested intervals Fabien de Montgolfier 1, Mathieu Raffinot 1, and Irena Rusu 2 arxiv:1305.4747v1 [cs.ds] 21 May 2013 1 LIAFA, Univ. Paris Diderot - Paris 7, 75205

More information

Network Augmentation and the Multigraph Conjecture

Network Augmentation and the Multigraph Conjecture Network Augmentation and the Multigraph Conjecture Nathan Kahl Department of Mathematical Sciences Stevens Institute of Technology Hoboken, NJ 07030 e-mail: nkahl@stevens-tech.edu Abstract Let Γ(n, m)

More information

Finding and Counting Given Length Cycles 1. multiplication. Even cycles in undirected graphs can be found even faster. A C 4k 2 in an undirected

Finding and Counting Given Length Cycles 1. multiplication. Even cycles in undirected graphs can be found even faster. A C 4k 2 in an undirected Algorithmica (1997 17: 09 3 Algorithmica 1997 Springer-Verlag New York Inc. Finding and Counting Given Length Cycles 1 N. Alon, R. Yuster, and U. Zwick Abstract. We present an assortment of methods for

More information

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

More information

Even Cycles in Hypergraphs.

Even Cycles in Hypergraphs. Even Cycles in Hypergraphs. Alexandr Kostochka Jacques Verstraëte Abstract A cycle in a hypergraph A is an alternating cyclic sequence A 0, v 0, A 1, v 1,..., A k 1, v k 1, A 0 of distinct edges A i and

More information

Acyclic Digraphs arising from Complete Intersections

Acyclic Digraphs arising from Complete Intersections Acyclic Digraphs arising from Complete Intersections Walter D. Morris, Jr. George Mason University wmorris@gmu.edu July 8, 2016 Abstract We call a directed acyclic graph a CI-digraph if a certain affine

More information

A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS

A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS CRYSTAL L. KAHN and BENJAMIN J. RAPHAEL Box 1910, Brown University Department of Computer Science & Center for Computational Molecular Biology

More information

Multiple genome rearrangement: a general approach via the evolutionary genome graph

Multiple genome rearrangement: a general approach via the evolutionary genome graph BIOINFORMATICS Vol. 18 Suppl. 1 2002 Pages S303 S311 Multiple genome rearrangement: a general approach via the evolutionary genome graph Dmitry Korkin and Lev Goldfarb Faculty of Computer Science, University

More information

arxiv: v1 [math.co] 11 Jul 2016

arxiv: v1 [math.co] 11 Jul 2016 Characterization and recognition of proper tagged probe interval graphs Sourav Chakraborty, Shamik Ghosh, Sanchita Paul and Malay Sen arxiv:1607.02922v1 [math.co] 11 Jul 2016 October 29, 2018 Abstract

More information

ACYCLIC DIGRAPHS GIVING RISE TO COMPLETE INTERSECTIONS

ACYCLIC DIGRAPHS GIVING RISE TO COMPLETE INTERSECTIONS ACYCLIC DIGRAPHS GIVING RISE TO COMPLETE INTERSECTIONS WALTER D. MORRIS, JR. ABSTRACT. We call a directed acyclic graph a CIdigraph if a certain affine semigroup ring defined by it is a complete intersection.

More information

A New Genomic Evolutionary Model for Rearrangements, Duplications, and Losses that Applies across Eukaryotes and Prokaryotes

A New Genomic Evolutionary Model for Rearrangements, Duplications, and Losses that Applies across Eukaryotes and Prokaryotes A New Genomic Evolutionary Model for Rearrangements, Duplications, and Losses that Applies across Eukaryotes and Prokaryotes Yu Lin and Bernard M.E. Moret Laboratory for Computational Biology and Bioinformatics,

More information

Faster Algorithms for some Optimization Problems on Collinear Points

Faster Algorithms for some Optimization Problems on Collinear Points Faster Algorithms for some Optimization Problems on Collinear Points Ahmad Biniaz Prosenjit Bose Paz Carmi Anil Maheshwari J. Ian Munro Michiel Smid June 29, 2018 Abstract We propose faster algorithms

More information

The Intractability of Computing the Hamming Distance

The Intractability of Computing the Hamming Distance The Intractability of Computing the Hamming Distance Bodo Manthey and Rüdiger Reischuk Universität zu Lübeck, Institut für Theoretische Informatik Wallstraße 40, 23560 Lübeck, Germany manthey/reischuk@tcs.uni-luebeck.de

More information

Graphs, permutations and sets in genome rearrangement

Graphs, permutations and sets in genome rearrangement ntroduction Graphs, permutations and sets in genome rearrangement 1 alabarre@ulb.ac.be Universite Libre de Bruxelles February 6, 2006 Computers in Scientic Discovery 1 Funded by the \Fonds pour la Formation

More information

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven)

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven) BMI/CS 776 Lecture #20 Alignment of whole genomes Colin Dewey (with slides adapted from those by Mark Craven) 2007.03.29 1 Multiple whole genome alignment Input set of whole genome sequences genomes diverged

More information

A new algorithm to construct phylogenetic networks from trees

A new algorithm to construct phylogenetic networks from trees A new algorithm to construct phylogenetic networks from trees J. Wang College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia, China Corresponding author: J. Wang E-mail: wangjuanangle@hit.edu.cn

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Mathematics of Evolution and Phylogeny. Edited by Olivier Gascuel

Mathematics of Evolution and Phylogeny. Edited by Olivier Gascuel Mathematics of Evolution and Phylogeny Edited by Olivier Gascuel CLARENDON PRESS. OXFORD 2004 iv CONTENTS 12 Reconstructing Phylogenies from Gene-Content and Gene-Order Data 1 12.1 Introduction: Phylogenies

More information

Presentation by Julie Hudson MAT5313

Presentation by Julie Hudson MAT5313 Proc. Natl. Acad. Sci. USA Vol. 89, pp. 6575-6579, July 1992 Evolution Gene order comparisons for phylogenetic inference: Evolution of the mitochondrial genome (genomics/algorithm/inversions/edit distance/conserved

More information

Approximation Algorithms for the Consecutive Ones Submatrix Problem on Sparse Matrices

Approximation Algorithms for the Consecutive Ones Submatrix Problem on Sparse Matrices Approximation Algorithms for the Consecutive Ones Submatrix Problem on Sparse Matrices Jinsong Tan and Louxin Zhang Department of Mathematics, National University of Singaproe 2 Science Drive 2, Singapore

More information

CONTENTS. P A R T I Genomes 1. P A R T II Gene Transcription and Regulation 109

CONTENTS. P A R T I Genomes 1. P A R T II Gene Transcription and Regulation 109 CONTENTS ix Preface xv Acknowledgments xxi Editors and contributors xxiv A computational micro primer xxvi P A R T I Genomes 1 1 Identifying the genetic basis of disease 3 Vineet Bafna 2 Pattern identification

More information

Maximal and Maximum Independent Sets In Graphs With At Most r Cycles

Maximal and Maximum Independent Sets In Graphs With At Most r Cycles Maximal and Maximum Independent Sets In Graphs With At Most r Cycles Bruce E. Sagan Department of Mathematics Michigan State University East Lansing, MI sagan@math.msu.edu Vincent R. Vatter Department

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

On the mean connected induced subgraph order of cographs

On the mean connected induced subgraph order of cographs AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 71(1) (018), Pages 161 183 On the mean connected induced subgraph order of cographs Matthew E Kroeker Lucas Mol Ortrud R Oellermann University of Winnipeg Winnipeg,

More information

Evolution of Tandemly Repeated Sequences through Duplication and Inversion

Evolution of Tandemly Repeated Sequences through Duplication and Inversion Evolution of Tandemly Repeated Sequences through Duplication and Inversion Denis Bertrand 1, Mathieu Lajoie 2, Nadia El-Mabrouk 2, and Olivier Gascuel 1 1 LIRMM, UMR 5506, CNRS et Univ. Montpellier 2,

More information

The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs

The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs Maria Axenovich and Jonathan Rollin and Torsten Ueckerdt September 3, 016 Abstract An ordered graph G is a graph whose vertex set

More information

NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees

NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana

More information

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Massimo Franceschet Angelo Montanari Dipartimento di Matematica e Informatica, Università di Udine Via delle

More information

Part V. Matchings. Matching. 19 Augmenting Paths for Matchings. 18 Bipartite Matching via Flows

Part V. Matchings. Matching. 19 Augmenting Paths for Matchings. 18 Bipartite Matching via Flows Matching Input: undirected graph G = (V, E). M E is a matching if each node appears in at most one Part V edge in M. Maximum Matching: find a matching of maximum cardinality Matchings Ernst Mayr, Harald

More information

MAXIMUM LIKELIHOOD PHYLOGENETIC RECONSTRUCTION FROM HIGH-RESOLUTION WHOLE-GENOME DATA AND A TREE OF 68 EUKARYOTES

MAXIMUM LIKELIHOOD PHYLOGENETIC RECONSTRUCTION FROM HIGH-RESOLUTION WHOLE-GENOME DATA AND A TREE OF 68 EUKARYOTES MAXIMUM LIKELIHOOD PHYLOGENETIC RECONSTRUCTION FROM HIGH-RESOLUTION WHOLE-GENOME DATA AND A TREE OF 68 EUKARYOTES YU LIN Laboratory for Computational Biology and Bioinformatics, EPFL, Lausanne VD, CH-115,

More information

arxiv: v1 [cs.dm] 12 Jun 2016

arxiv: v1 [cs.dm] 12 Jun 2016 A Simple Extension of Dirac s Theorem on Hamiltonicity Yasemin Büyükçolak a,, Didem Gözüpek b, Sibel Özkana, Mordechai Shalom c,d,1 a Department of Mathematics, Gebze Technical University, Kocaeli, Turkey

More information

DEGREE SEQUENCES OF INFINITE GRAPHS

DEGREE SEQUENCES OF INFINITE GRAPHS DEGREE SEQUENCES OF INFINITE GRAPHS ANDREAS BLASS AND FRANK HARARY ABSTRACT The degree sequences of finite graphs, finite connected graphs, finite trees and finite forests have all been characterized.

More information